[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413535#comment-13413535
 ] 

stack commented on HBASE-4050:
--

I suppose you don't need to set test size annotation on below because 
annotations are not a dependency when this is built:

{code}
+public class ReplicationMetricsSourceFactoryTest {
{code}

Does BaseMetricsSource not implement MetricsSource?

{code}
+public class BaseMetricsSourceImpl implements BaseMetricsSource, MetricsSource {
{code}

These need to be this accessible:

{code}
+  public ConcurrentMap
+  gauges = new ConcurrentHashMap();
+  public ConcurrentMap counters =
+  new ConcurrentHashMap();
+
+  protected String metricsContext;
+  protected String metricsName;
+  protected String metricsDescription;
{code}

(I see above twice)

The stuff below where we have a static boolean and in constructor we test 
something already created could be a PITA in minihbase setups?  Does it have to 
be static?  Aren't we slinging singletons here anyways?  (The singletons are ok 
in minihbasecontext too?):

{code}
+if (!hasInited) {
+  //Not too worried about multi-threaded here as all it does is spam the logs.
+  hasInited = true;
+  DefaultMetricsSystem.initialize(HBASE_METRICS_SYSTEM_NAME);
+}
{code}

'hasInited' is name of a method that tests 'inited' variable... suggest 
changing its name.
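
For illustration, a minimal sketch of a once-per-JVM guard that also closes the multi-threading gap mentioned in the code comment. The class name MetricsInitGuard and the counter are hypothetical and not from the patch; the real code would call DefaultMetricsSystem.initialize where the counter is incremented.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the guarded initialization; not the patch's code.
final class MetricsInitGuard {
  // Flag renamed away from 'hasInited'; it flips exactly once per JVM.
  private static final AtomicBoolean defaultMetricsInited = new AtomicBoolean(false);
  // Counts how often the stubbed initialize step ran (illustration only).
  static final AtomicInteger initCalls = new AtomicInteger();

  static void ensureInitialized(String systemName) {
    // compareAndSet lets only the first caller through, even under contention.
    if (defaultMetricsInited.compareAndSet(false, true)) {
      // Real code would call DefaultMetricsSystem.initialize(systemName) here.
      initCalls.incrementAndGet();
    }
  }
}
```

Because the flag is still static, several mini-cluster daemons in one JVM would share it, which is the concern raised above.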

What about that jmx mess registering metrics in tests?  The exception saying 
metrics already registered because we have more than one daemon in the one jvm. 
 We still have that issue here?

You wanted to complete this: "+/** BaseClass for */"

Another class has no class comments though has the comment delimiters.

Do we have to have metrics2 package?  Can this new stuff be in the metrics 
package?

I thought I saw a patch where you'd renamed the properties file to what LarsG 
suggested?

You seem to have made it so we do not need to have a metrics2 in hbase... that's 
great... but in the properties file I see:

{code}
+# See package.html for org.apache.hadoop.metrics2 for details
+
+*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
{code}

Is that just old stuff?

Good stuff Elliott.  I'd be up for committing this and then doing other stuff 
in other issues.



> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
> HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
> HBASE-4050-7.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6380:
-

Status: Patch Available  (was: Open)

> bulkload should update the store.storeSize
> --
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Jie Huang
>Assignee: Jie Huang
>Priority: Critical
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
>
> After bulkloading some HFiles into the table, we found that force-split didn't 
> work because MidKey == NULL. Force-split works normally only after we 
> restart the HBase service. 





[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6380:
-

Status: Open  (was: Patch Available)






[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6380:
-

Attachment: 6380-trunk.txt

Retry






[jira] [Updated] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable

2012-07-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6370:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the patch ShiXing.

> Add compression codec test at HMaster when 
> createTable/modifyColumn/modifyTable
> ---
>
> Key: HBASE-6370
> URL: https://issues.apache.org/jira/browse/HBASE-6370
> Project: HBase
>  Issue Type: Improvement
>Reporter: ShiXing
>Assignee: ShiXing
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 6370v3.txt, HBASE-6370-trunk-V1.patch, 
> HBASE-6370-trunk-V2.patch, runAllTests.out
>
>
> We deployed a cluster in which none of the regionservers supports a compression 
> codec such as "lzo", but the cluster user/client does not know this and 
> specifies the family's compression codec with 
> HColumnDescriptor.setCompressionType(Compression.Algorithm.LZO);
> Because HBaseAdmin's createTable is async, the client waits forever for all 
> the regions of the table to come online. The client does not know why 
> the regions are not online until the HBase administrator finds the problem.
> Indeed, all of the regions are being assigned by the master, but the regionserver's 
> openHRegion always fails.
> In my opinion, we can suppose that the environment is the same across the cluster, 
> meaning that if a lib is deployed on the master, it should also be deployed on the 
> regionservers. Of course this is just an assumption; in a real deployment, the 
> HBase DBA may deploy the lib only on the regionservers or only on the master.
> So I think this failure can be detected earlier, before the master creates the 
> CreateTableHandler thread, and we can quickly tell the client that this 
> compression codec type is not supported.
> I will upload the patch later.
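
A rough sketch of the fail-fast idea described above, assuming the master can tell which codecs are usable locally. The names CodecPrecheck, Codec and usableOnMaster are hypothetical and are not HBase APIs; the committed patch may do this differently.

```java
import java.util.Set;

// Hypothetical illustration of rejecting an unusable codec before any
// table-creation work is scheduled; not the code from the patch.
final class CodecPrecheck {
  enum Codec { NONE, GZ, LZO, SNAPPY }

  // Throws immediately so the client gets an error instead of waiting forever.
  static void checkCodec(Codec requested, Set<Codec> usableOnMaster) {
    if (!usableOnMaster.contains(requested)) {
      throw new IllegalArgumentException(
          "Compression codec " + requested + " is not supported");
    }
  }
}
```

As the description notes, this only helps if the master and the regionservers have the same libraries deployed.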





[jira] [Updated] (HBASE-6338) Cache Method in RPC handler

2012-07-13 Thread binlijin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

binlijin updated HBASE-6338:


Attachment: HBASE-6338-94-2.patch
HBASE-6338-92-2.patch
HBASE-6338-90-2.patch

> Cache Method in RPC handler
> ---
>
> Key: HBASE-6338
> URL: https://issues.apache.org/jira/browse/HBASE-6338
> Project: HBase
>  Issue Type: Improvement
>Reporter: binlijin
> Attachments: HBASE-6338-90-2.patch, HBASE-6338-90.patch, 
> HBASE-6338-92-2.patch, HBASE-6338-92.patch, HBASE-6338-94-2.patch, 
> HBASE-6338-94.patch, HBASE-6338-trunk-2.patch, HBASE-6338-trunk.patch
>
>
> For every call in the RPC handler a Method is created; caching the Method 
> improves this a little.
> I tested with 0.90: on average, Class.getMethod(String name, Class... 
> parameterTypes) costs 4780 ns; with caching it costs 2620 ns.
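
A minimal sketch of the caching idea, assuming a cache keyed by class, method name and parameter types. MethodCache is a hypothetical name; the attached patches may key and store the methods differently.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: remembers reflective lookups so repeated calls skip getMethod.
final class MethodCache {
  private final ConcurrentMap<String, Method> cache = new ConcurrentHashMap<>();

  Method get(Class<?> protocol, String name, Class<?>... parameterTypes) {
    // The key includes the parameter types so overloads do not collide.
    String key = protocol.getName() + "#" + name + Arrays.toString(parameterTypes);
    Method method = cache.get(key);
    if (method == null) {
      try {
        method = protocol.getMethod(name, parameterTypes);
      } catch (NoSuchMethodException e) {
        throw new IllegalArgumentException("No such method: " + key, e);
      }
      // If another thread won the race, keep its instance.
      Method previous = cache.putIfAbsent(key, method);
      if (previous != null) {
        method = previous;
      }
    }
    return method;
  }
}
```

The cache grows with the number of distinct method signatures, which should stay small for a fixed set of RPC protocols.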





[jira] [Updated] (HBASE-6338) Cache Method in RPC handler

2012-07-13 Thread binlijin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

binlijin updated HBASE-6338:


Attachment: HBASE-6338-trunk-2.patch






[jira] [Commented] (HBASE-6387) Cache DNS lookups in HServerAddress

2012-07-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413552#comment-13413552
 ] 

stack commented on HBASE-6387:
--

HServerAddress is deprecated in trunk and has been replaced; on deserialization 
it was doing a DNS lookup.  So this is 0.89fb only, Mikhail?

> Cache DNS lookups in HServerAddress
> ---
>
> Key: HBASE-6387
> URL: https://issues.apache.org/jira/browse/HBASE-6387
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> We have noticed that we rely on DNS lookups in some critical paths by using 
> HServerAddress, and Java only seems to be caching DNS data for 30 seconds by 
> default. Also, if DNS is down, Java's negative cache of DNS will ensure that 
> many successive attempts fail. However, we cannot just increase 
> networkaddress.cache.ttl to a large value, because e.g. namenode failover may 
> require resolving the same DNS name differently. Therefore I propose that we 
> add a DNS lookup cache in HServerAddress.
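
One possible shape for such a cache, as a sketch only: entries expire after a configurable TTL, so a name can still be re-resolved after a failover. TtlLookupCache and its injected resolver are hypothetical and not from any patch on this issue; a real resolver would wrap something like InetAddress.getByName and handle its checked exception.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper, not part of HBase: caches lookups for a bounded time.
final class TtlLookupCache<V> {
  private static final class Entry<V> {
    final V value;
    final long expiresAtMillis;

    Entry(V value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentMap<String, Entry<V>> cache = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final Function<String, V> resolver;

  TtlLookupCache(long ttlMillis, Function<String, V> resolver) {
    this.ttlMillis = ttlMillis;
    this.resolver = resolver;
  }

  // Returns the cached value for host, resolving again once the entry has expired.
  V lookup(String host, long nowMillis) {
    Entry<V> entry = cache.get(host);
    if (entry == null || nowMillis >= entry.expiresAtMillis) {
      entry = new Entry<>(resolver.apply(host), nowMillis + ttlMillis);
      cache.put(host, entry);
    }
    return entry.value;
  }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; production code would more likely read a clock internally.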





[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice

2012-07-13 Thread zhou wenjian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413550#comment-13413550
 ] 

zhou wenjian commented on HBASE-6391:
-

that is different, i think

> Master restart when enabling table will lead to region assignned twice
> --
>
> Key: HBASE-6391
> URL: https://issues.apache.org/jira/browse/HBASE-6391
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0
>Reporter: zhou wenjian
> Fix For: 0.94.1
>
>
> The scenario can be reproduced as follows.
> Enable a table; some regions are online on a regionserver, some are still being 
> processed.
> Then restart the master.
> When the master fails over:
> // Region is being served and on an active server
> // add only if region not in disabled and enabling table
> if (false == checkIfRegionBelongsToDisabled(regionInfo)
> && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
>   regions.put(regionInfo, regionLocation);
>   addToServers(regionLocation, regionInfo);
> }
> The opened regions are not added to the regions map in the master,
> and in the following recoverTableInEnablingState the regions are assigned 
> again.
> That leaves the cluster inconsistent.





[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice

2012-07-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413545#comment-13413545
 ] 

stack commented on HBASE-6391:
--

Is this the same as HBASE-6317 "Master clean start up and Partially enabled 
tables make region assignment inconsistent."?






[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent

2012-07-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413558#comment-13413558
 ] 

stack commented on HBASE-6272:
--

At a high level, Jimmy, how should we proceed with this patch?  If we apply it, I 
think it means that any fixes on stuff like hbase-6060 will be for trunk only; 
they won't be backportable, at least not w/o a bunch of work.  Maybe that's 
fine.  Raising the question.

> In-memory region state is inconsistent
> --
>
> Key: HBASE-6272
> URL: https://issues.apache.org/jira/browse/HBASE-6272
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> AssignmentManger stores region state related information in several places: 
> regionsInTransition, regions (region info to server name map), and servers 
> (server name to region info set map).  However the access to these places is 
> not coordinated properly.  It leads to inconsistent in-memory region state 
> information.  Sometimes, some region could even be offline, and not in 
> transition.
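
To illustrate what coordinated access could look like, here is a small sketch that updates the region-to-server map and the server-to-regions map under one lock. RegionStateSketch is hypothetical, uses plain strings in place of HBase types, and is not the approach taken in the patch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: both views are changed inside the same monitor,
// so readers never see a region listed under two servers.
final class RegionStateSketch {
  private final Map<String, String> regions = new HashMap<>();      // region -> server
  private final Map<String, Set<String>> servers = new HashMap<>(); // server -> regions

  synchronized void assign(String region, String server) {
    String oldServer = regions.put(region, server);
    if (oldServer != null) {
      Set<String> oldRegions = servers.get(oldServer);
      if (oldRegions != null) {
        oldRegions.remove(region);
      }
    }
    servers.computeIfAbsent(server, k -> new HashSet<>()).add(region);
  }

  synchronized String serverOf(String region) {
    return regions.get(region);
  }

  // Returns a copy so callers cannot mutate internal state.
  synchronized Set<String> regionsOn(String server) {
    Set<String> current = servers.get(server);
    return current == null ? new HashSet<>() : new HashSet<>(current);
  }
}
```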





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413559#comment-13413559
 ] 

nkeywal commented on HBASE-6389:


We could remove the timeout? That would make things a little simpler.
Or we could keep it as an error case, and throw an exception if the timeout is 
reached. The intent would be to stop the master.

> Modify the conditions to ensure that Master waits for sufficient number of 
> Region Servers before starting region assignments
> 
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of 
> "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from 
> default of 1) can help prevent assignment of all regions to one (or a small 
> number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
> 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has 
> lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not 
> been reached.
> Reading the current conditions of waitForRegionServers() clarifies it
> {code:title=ServerManager.java (trunk rev:1360470)}
> 
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
> ..
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
> 
> {code}
> So with the current conditions, the wait will end as soon as the timeout is 
> reached, even if fewer RSes than required have checked in with the Master, and the 
> master will proceed with the region assignment among these RSes alone.
> As mentioned in 
> -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
>  and I concur, this could have disastrous effect in large cluster especially 
> now that MSLAB is turned on.
> To enforce the required quorum as specified by 
> "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, 
> these conditions need to be modified as following
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>       getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     if (maxToStart < minToStart) {
>       maxToStart = minToStart;
>     }
> ..
> ..
>     while (
>       !this.master.isStopped() &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || timeout > slept || count < minToStart)
>       ){
> ..
> {code}
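
The difference between the two loop conditions quoted above can be checked in isolation. The sketch below lifts both conditions into pure functions; the class WaitConditions and its method names are hypothetical, while the boolean expressions are copied from the two code blocks in the description.

```java
// Hypothetical harness around the two loop conditions from the issue description.
final class WaitConditions {
  // Condition as quoted from trunk: the timeout alone ends the wait.
  static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  // Proposed condition: keep waiting until minToStart is met, the count has
  // been stable for 'interval', and the timeout has passed.
  static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now || timeout > slept || count < minToStart);
  }
}
```

With one region server checked in, minToStart set to 3 and the timeout already exceeded, the current condition ends the wait while the proposed one keeps waiting.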


[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-13 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413569#comment-13413569
 ] 

Elliott Clark commented on HBASE-4050:
--

bq.I suppose you don't need to set test size annotation on below because 
annotations are not a dependency when this is built:

Correct.  The hbase-hadoop-compat module has no hadoop dependency.  In addition 
hbase-hadoop1-compat and hbase-hadoop2-compat currently only have unit tests, 
so they have the second test pass completely turned off.

bq.Does BaseMetricsSource not implement MetricsSource?
It does.  I guess it's just a little too explicit. I'll fix it in the patch 
first thing tomorrow morning.

bq.These need to be this accessible:
Kind of, but not 100%; I'm open to either way.  In hadoop 1, metrics are pretty 
hard to test.  Opening the maps up will make testing any classes that extend 
BaseMetricsSourceImpl easier.  Those classes that add functionality will need 
those maps to be public for testing.  That said, this patch doesn't 
have those classes in it, so if you prefer I could make them protected and 
change that when needed.

bq.The stuff below where we have a static boolean and in constructor we test 
something already created could be a PITA in minihbase setups? Does it have to 
be static? Aren't we slinging singletons here anyways? (The singletons are ok 
in minihbasecontext too?):

We are currently slinging a singleton.  However when we add in more than just 
replication metrics we'll have more than one BaseMetricsSourceImpl.  The 
DefaultMetricsSystem.initialize call can be done multiple times as long as it's 
inited with the same string, however it complains quite loudly in logs.

bq.'hasInited' is name of a method that tests 'inited' variable... suggest 
changing its name.
Sure.  Something like defaultMetricsInited

bq.What about that jmx mess registering metrics in tests? The exception saying 
metrics already registered because we have more than one daemon in the one jvm. 
We still have that issue here?

We'll still have that.  A little bit less spam but not completely gone.  
Basically when all metrics are moved to metrics2 we'll see 4 or 5 log messages 
(one per dupe of ReplicationMetricsSource et al.) rather than the massive 
amount we see now.
Maybe in tests we should silence the junit messages from those classes?  
Probably a good issue to file for the metrics clean up.

bq.Do we have to have metrics2 package? Can this new stuff be in the metrics 
package?
Nope.  Earlier you were asking to remove it.  So everything is in the metrics 
namespace.  That should make things a little nicer if we go the DI route, 
that's being discussed on the mailing list, and someone wants to go back to the 
old hadoop metrics.

bq.I thought I saw a patch where you'd renamed the properties file to what 
LarsG suggested?
Nope just replied that we could.  That file needs some examples and other love 
(ganglia examples and examples for regionserver/rest).  Seems like a good issue 
for me to file after this.

I'll clean up the two javadocs tomorrow morning.







[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice

2012-07-13 Thread zhou wenjian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413571#comment-13413571
 ] 

zhou wenjian commented on HBASE-6391:
-

In HBASE-6317, rajeshbabu comments:
As per the current code, two scenarios may cause inconsistent assignment.
1) In EnableTableHandler we don't assign regions if they are present in the regions 
map.
final List onlineRegions = this.assignmentManager.getRegionsOfTable(tableName);
regionsInMeta.removeAll(onlineRegions);
But for the regions of an enabling table, during master startup we do not add 
them to the regions map in rebuildUserRegions, even if the regions are in transition 
or already on online servers.
if (false == checkIfRegionBelongsToDisabled(regionInfo) && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
  synchronized (this.regions) {
    regions.put(regionInfo, regionLocation);
    addToServers(regionLocation, regionInfo);
  }
}
So we will call assign for all the regions even if they are in transition or already 
assigned to online servers, which may cause double assignment.
2) If all the tables are in ENABLING we may treat it as a clean cluster 
startup (because the regions map is empty) and again call assignment for all the 
regions (which may again cause double assignment).


If we remove the check for RegionsBelongsToEnabling, the first scenario will 
not happen again.
For the other scenario we only need to worry about one case: 
all tables are enabling and none of the regions' locations are 
registered in the meta.
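
The first scenario can be modelled in a few lines. This is a simplified sketch with strings standing in for regions; EnablingRebuildSketch and its parameters are hypothetical and only mirror the two steps quoted above (the rebuild during master startup, then the removeAll in the enable handler).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of scenario 1; not HBase code.
final class EnablingRebuildSketch {
  // Returns the regions the enable step would (re)assign.
  static List<String> regionsToAssign(List<String> regionsInMeta,
      Set<String> servedRegions, boolean skipRegionsOfEnablingTable) {
    // Rebuild step: served regions are recorded as online unless the
    // ENABLING check skips them.
    List<String> onlineRegions = new ArrayList<>();
    for (String region : servedRegions) {
      if (!skipRegionsOfEnablingTable) {
        onlineRegions.add(region);
      }
    }
    // Enable step: assign everything in meta that is not recorded as online.
    List<String> toAssign = new ArrayList<>(regionsInMeta);
    toAssign.removeAll(onlineRegions);
    return toAssign;
  }
}
```

With the check in place, regions that are already served end up in the list again, which is the double assignment; without it only the regions still offline remain.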







[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice

2012-07-13 Thread zhou wenjian (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413576#comment-13413576
 ] 

zhou wenjian commented on HBASE-6391:
-

in my opinion, we could treat the case as failover rather than clean start.

 






[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413583#comment-13413583
 ] 

Hadoop QA commented on HBASE-6380:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536342/6380-trunk.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestFromClientSide
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2381//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2381//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2381//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2381//console

This message is automatically generated.

> bulkload should update the store.storeSize
> --
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Jie Huang
>Assignee: Jie Huang
>Priority: Critical
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
>
> After bulk loading some HFiles into the table, we found that force-split did not 
> work because MidKey == NULL. Force-split only worked normally again after we 
> restarted the HBase service.





[jira] [Commented] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413587#comment-13413587
 ] 

Hudson commented on HBASE-6370:
---

Integrated in HBase-TRUNK #3124 (See 
[https://builds.apache.org/job/HBase-TRUNK/3124/])
HBASE-6370 Add compression codec test at HMaster when 
createTable/modifyColumn/modifyTable (Revision 1361058)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java


> Add compression codec test at HMaster when 
> createTable/modifyColumn/modifyTable
> ---
>
> Key: HBASE-6370
> URL: https://issues.apache.org/jira/browse/HBASE-6370
> Project: HBase
>  Issue Type: Improvement
>Reporter: ShiXing
>Assignee: ShiXing
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 6370v3.txt, HBASE-6370-trunk-V1.patch, 
> HBASE-6370-trunk-V2.patch, runAllTests.out
>
>
> We deployed a cluster in which none of the regionservers supports a compression 
> codec such as "lzo", but the cluster user/client does not know this and 
> specifies the family's compression codec with 
> HColumnDescriptor.setCompressionType(Compression.Algorithm.LZO);
> Because HBaseAdmin's createTable is asynchronous, the client waits forever for all 
> the regions of the table to come online, and does not know why 
> the regions are not online until the HBase administrator finds the problem.
> In fact, the master does assign all of the regions, but the regionserver's 
> openHRegion always fails.
> In my opinion, we can assume that every node in the cluster has the same environment, 
> meaning that if a library is deployed on the master, it is also deployed on the regionservers. 
> Of course this is only an assumption; in a real deployment, the HBase DBA may 
> deploy the library on only the regionservers or only the master.
> So I think this failure can be detected earlier, before the master creates the 
> CreateTableHandler thread, and we can quickly tell the client that 
> this compression codec type is not supported.
> I will upload the patch later.
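
The fail-fast idea in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions: CodecPrecheckSketch, checkCompression and the string codec names are hypothetical stand-ins, not HBase classes or the contents of the attached patch, and the "available" set simply models which codec libraries happen to be installed on the master.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; none of these names are HBase classes. */
final class CodecPrecheckSketch {
  /** Codecs whose libraries are assumed to be installed on this (master) node. */
  private final Set<String> availableCodecs;

  CodecPrecheckSketch(String... codecs) {
    this.availableCodecs = new HashSet<String>(Arrays.asList(codecs));
  }

  /** Throws immediately if the requested codec is not available locally. */
  void checkCompression(String codec) {
    if (!availableCodecs.contains(codec)) {
      throw new IllegalArgumentException(
          "Compression codec " + codec + " is not supported by this cluster");
    }
  }

  /** Stand-in for the synchronous part of createTable. */
  String createTable(String table, String codec) {
    // Fail fast, before any asynchronous handler is started.
    checkCompression(codec);
    // A real master would hand the work to an asynchronous handler here.
    return "scheduled creation of " + table;
  }
}
```

With a check like this the client gets an exception from the createTable call itself instead of waiting for regions that can never open. It only helps if the master and the regionservers really do have the same libraries installed, which is the assumption the description makes.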





[jira] [Commented] (HBASE-6391) Master restart when enabling table will lead to region assignned twice

2012-07-13 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413588#comment-13413588
 ] 

rajeshbabu commented on HBASE-6391:
---

I feel this is the same as HBASE-6317, and we are trying to address these concerns 
there.
To answer your questions:
bq.may anyone tell me why not to add region in enabling state to regions in master
Consider a case where I had disabled a table and then try to ENABLE it again, but 
the master restarts in the middle.  If we now add the regions to the this.regions map, 
the EnableTableHandler will see that the regions are present in 
this.regions and won't call assign, so those regions will remain closed on the 
RS.
bq.in my opinion, we could treat the case as failover rather than clean start.
In HBASE-6317 we treat it as a failover only.
{code}
  // store all the enabling state table names and corresponding online servers' regions.
  // This may be needed to avoid calling assign twice for the regions of the ENABLING table
  // that could have been assigned through processRIT.
  Map> enablingTables = new HashMap>(1);
{code}
In the patch available in HBASE-6317 we try to avoid double assignment 
by building a map of the enabling tables' regions, so that regions 
already assigned by processRIT are not assigned again.
Also, even if round-robin assignment is set to true, on master restart, if we 
find partially enabled tables, we go with single assignment.  Please review 
the patch over in HBASE-6317 and let us know if you have any more open points.
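
A minimal sketch of the bookkeeping described above, assuming regions and tables can be modelled as plain strings. EnablingTableRecoverySketch, recordOnlineRegion and assignCalls are hypothetical names chosen for illustration; they are not taken from the HBASE-6317 patch, and the real map's key and value types are not shown in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; the names do not come from the HBASE-6317 patch. */
final class EnablingTableRecoverySketch {
  /** Table name -> regions of that ENABLING table already found online during failover. */
  private final Map<String, Set<String>> enablingTables = new HashMap<String, Set<String>>();

  /** Regions for which assign was requested, in call order. */
  final List<String> assignCalls = new ArrayList<String>();

  /** Called while rebuilding state: remember a region that is already being served. */
  void recordOnlineRegion(String table, String region) {
    Set<String> online = enablingTables.get(table);
    if (online == null) {
      online = new HashSet<String>();
      enablingTables.put(table, online);
    }
    online.add(region);
  }

  /** Finish enabling the table: assign only the regions that are not online yet. */
  void recoverTableInEnablingState(String table, List<String> allRegions) {
    Set<String> online = enablingTables.remove(table);
    for (String region : allRegions) {
      if (online == null || !online.contains(region)) {
        assignCalls.add(region);
      }
    }
  }
}
```

The point of the sketch is only that the already-online regions are remembered separately from the main regions map, so the enable recovery can skip them without the handler concluding that nothing needs assigning.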






[jira] [Commented] (HBASE-4364) Filters applied to columns not in the selected column list are ignored

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413590#comment-13413590
 ] 

Hadoop QA commented on HBASE-4364:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12536340/hbase-4364_trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 10 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestSplitLogManager
  org.apache.hadoop.hbase.catalog.TestMetaReaderEditor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2382//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2382//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2382//console

This message is automatically generated.

> Filters applied to columns not in the selected column list are ignored
> --
>
> Key: HBASE-4364
> URL: https://issues.apache.org/jira/browse/HBASE-4364
> Project: HBase
>  Issue Type: Bug
>  Components: filters
>Affects Versions: 0.90.4, 0.92.0, 0.94.0
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: hbase-4364_trunk.patch
>
>
> For a scan, if you select some set of columns using addColumns(), and then 
> apply a SingleColumnValueFilter that restricts the results based on some 
> other columns which aren't selected, then those filter conditions are ignored.
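
The reported behaviour can be modelled with plain maps. This is a toy model, not HBase scanner code: FilterProjectionSketch and its methods are hypothetical, a row is a column-to-value map, and the filter is assumed to let a row through when the tested column is absent (which is how a single-column value filter behaves by default).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model of the ordering problem; not HBase code. */
final class FilterProjectionSketch {
  /** A row passes if the tested column is absent or holds the expected value. */
  static boolean passes(Map<String, String> row, String column, String expected) {
    String value = row.get(column);
    return value == null || value.equals(expected);
  }

  /** Keep only the selected columns of a row. */
  static Map<String, String> project(Map<String, String> row, Set<String> selected) {
    Map<String, String> out = new HashMap<String, String>(row);
    out.keySet().retainAll(selected);
    return out;
  }

  /** Reported behaviour: the filter only sees selected columns. Returns null if filtered out. */
  static Map<String, String> projectThenFilter(
      Map<String, String> row, Set<String> selected, String column, String expected) {
    Map<String, String> projected = project(row, selected);
    return passes(projected, column, expected) ? projected : null;
  }

  /** Expected behaviour: filter on the full row, then project. Returns null if filtered out. */
  static Map<String, String> filterThenProject(
      Map<String, String> row, Set<String> selected, String column, String expected) {
    return passes(row, column, expected) ? project(row, selected) : null;
  }
}
```

For a row with columns a and b where only a is selected and the filter tests b, projectThenFilter never sees b and so returns the row regardless of b's value, which matches the "filter conditions are ignored" symptom.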





[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread Jie Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413592#comment-13413592
 ] 

Jie Huang commented on HBASE-6380:
--

Re-ran those 2 test cases locally (on a 64-bit Linux server); both passed.
{noformat}
---
 T E S T S
---
Running org.apache.hadoop.hbase.client.TestFromClientSide
Tests run: 56, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 172.105 sec

Results :

Tests run: 56, Failures: 0, Errors: 0, Skipped: 3

---
 T E S T S
---
Running org.apache.hadoop.hbase.master.TestSplitLogManager
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.548 sec

Results :

Tests run: 12, Failures: 0, Errors: 0, Skipped: 0


{noformat}





[jira] [Commented] (HBASE-6370) Add compression codec test at HMaster when createTable/modifyColumn/modifyTable

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413675#comment-13413675
 ] 

Hudson commented on HBASE-6370:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/])
HBASE-6370 Add compression codec test at HMaster when 
createTable/modifyColumn/modifyTable (Revision 1361058)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java






[jira] [Commented] (HBASE-5533) Add more metrics to HBase

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413673#comment-13413673
 ] 

Hudson commented on HBASE-5533:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/])
HBASE-6377. HBASE-5533 metrics miss all operations submitted via MultiAction

Committed 6377-trunk-remove-get-put-delete-histograms.patch (Revision 1361026)

 Result = FAILURE
apurtell : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java


> Add more metrics to HBase
> -
>
> Key: HBASE-5533
> URL: https://issues.apache.org/jira/browse/HBASE-5533
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.2, 0.94.0
>Reporter: Shaneal Manek
>Assignee: Shaneal Manek
>Priority: Minor
> Fix For: 0.92.2, 0.94.0, 0.96.0
>
> Attachments: BlockingQueueContention.java, HBASE-5533-0.92-v4.patch, 
> HBASE-5533-TRUNK-v6.patch, HBASE-5533-TRUNK-v6.patch, 
> HBASE-5533-v7-0.92.patch, TimingOverhead.java, hbase-5533-0.92.patch, 
> hbase5533-0.92-v2.patch, hbase5533-0.92-v3.patch, hbase5533-0.92-v5.patch, 
> histogram_web_ui.png
>
>
> To debug/monitor production clusters, there are some more metrics I wish I 
> had available.
> In particular:
> - Although the average FS latencies are useful, a 'histogram' of recent 
> latencies (90% of reads completed in under 100ms, 99% in under 200ms, etc) 
> would be more useful
> - Similar histograms of latencies on common operations (GET, PUT, DELETE) 
> would be useful
> - Counting the number of accesses to each region to detect hotspotting
> - Exposing the current number of HLog files
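
As a rough illustration of the kind of latency summary requested above ("90% of reads completed in under 100ms"), here is a nearest-rank percentile over a window of recent samples. LatencyPercentileSketch is a hypothetical name, and this says nothing about how the HBASE-5533 patches actually sample or store latencies.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the histogram implementation from the patch. */
final class LatencyPercentileSketch {
  /**
   * Nearest-rank percentile: the smallest sample such that at least
   * p percent of all samples are less than or equal to it.
   */
  static long percentile(long[] latenciesMs, int p) {
    if (latenciesMs.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    if (p < 1 || p > 100) {
      throw new IllegalArgumentException("p must be between 1 and 100");
    }
    long[] sorted = latenciesMs.clone();
    Arrays.sort(sorted);
    // ceil(p * n / 100) computed with integer arithmetic to avoid rounding surprises.
    int rank = (p * sorted.length + 99) / 100;
    return sorted[rank - 1];
  }
}
```

Reporting a few such percentiles (for example the 90th and 99th) exposes slow outliers that an average hides.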





[jira] [Commented] (HBASE-6377) HBASE-5533 metrics miss all operations submitted via MultiAction

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413674#comment-13413674
 ] 

Hudson commented on HBASE-6377:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/])
HBASE-6377. HBASE-5533 metrics miss all operations submitted via MultiAction

Committed 6377-trunk-remove-get-put-delete-histograms.patch (Revision 1361026)

 Result = FAILURE
apurtell : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/jamon/org/apache/hadoop/hbase/tmpl/regionserver/ServerMetricsTmpl.jamon
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/RegionServerMetrics.java


> HBASE-5533 metrics miss all operations submitted via MultiAction
> 
>
> Key: HBASE-6377
> URL: https://issues.apache.org/jira/browse/HBASE-6377
> Project: HBase
>  Issue Type: Bug
>  Components: metrics, regionserver
>Affects Versions: 0.96.0, 0.94.1
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 6377-0.94-remove-get-put-delete-histograms.patch, 
> 6377-0.94.patch, 6377-trunk-remove-get-put-delete-histograms.patch, 
> 6377-trunk-simple.patch, 6377.patch
>
>
> A client application (LoadTestTool) calls put() on HTables. Internally to the 
> HBase client those puts are batched into MultiActions. The total number of 
> put operations shown in the RegionServer's put metrics histogram never 
> increases from 0 even though millions of such operations are made. Needless 
> to say the latency for those operations is not measured either. The value of 
> the HBASE-5533 metrics is suspect given the client will batch put and delete ops 
> like this.
> I had a fix in progress but HBASE-6284 messed it up. Before, MultiAction 
> processing in HRegionServer would distinguish between puts and deletes and 
> dispatch them separately. It was easy to account for the time for them. Now 
> dispatch them separately. It was easy to account for the time for them. Now 
> both puts and deletes are submitted in batch together as mutations.
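
To make the gap concrete, here is a small sketch of per-operation counters that are also updated from a batched path. It is an illustration only: BatchMetricsSketch, Kind and batchMutate are hypothetical names, it covers counts and not latencies, and it is not what was committed (the committed patch name, 6377-trunk-remove-get-put-delete-histograms.patch, suggests the histograms were removed instead).

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of counting operations that arrive inside a batch. */
final class BatchMetricsSketch {
  enum Kind { PUT, DELETE }

  final AtomicLong puts = new AtomicLong();
  final AtomicLong deletes = new AtomicLong();

  /** Single-operation path: the only place the put counter would otherwise be updated. */
  void put() {
    puts.incrementAndGet();
  }

  /** Batched path: count every mutation by type instead of bypassing the metrics. */
  void batchMutate(List<Kind> mutations) {
    for (Kind kind : mutations) {
      if (kind == Kind.PUT) {
        puts.incrementAndGet();
      } else {
        deletes.incrementAndGet();
      }
    }
  }
}
```

Counting is the easy half. Attributing latency to individual puts and deletes is harder once both are submitted together as one batch of mutations, which is the difficulty the description points at.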





[jira] [Commented] (HBASE-6384) hbck should group together those sidelined regions need to be bulk loaded later

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413676#comment-13413676
 ] 

Hudson commented on HBASE-6384:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #92 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/92/])
HBASE-6384 hbck should group together those sidelined regions need to be 
bulk loaded later (Revision 1361034)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java


> hbck should group together those sidelined regions need to be bulk loaded 
> later
> ---
>
> Key: HBASE-6384
> URL: https://issues.apache.org/jira/browse/HBASE-6384
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
> Attachments: 6384-trunk.patch
>
>
> Currently, hbck sidelines some regions to break big overlap groups to avoid 
> possible compaction and region split.  These sidelined regions should be
> bulk loaded back later.  Information about these regions is in the output.
> It will be much easier to group them together under the same sideline rootdir,
> for example, /hbase/.hbck/to_be_loaded/.  If so, even if we lose the output
> file, we still know what regions to load back.





[jira] [Updated] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6380:
-

   Resolution: Fixed
Fix Version/s: 0.94.1
               0.96.0
 Hadoop Flags: Reviewed
       Status: Resolved  (was: Patch Available)

Committed to 0.94 and trunk.  Thanks for the patch, Jie.

> bulkload should update the store.storeSize
> --
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Jie Huang
>Assignee: Jie Huang
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
>
> After bulk loading some HFiles into the table, we found that force-split did not 
> work because MidKey == NULL. Force-split only worked normally again after we 
> restarted the HBase service.





[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent

2012-07-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413752#comment-13413752
 ] 

stack commented on HBASE-6272:
--

@Ram What do you think?  You think we should commit this to 0.96 and build 
fixes like 6060 on top of this or Maryann's issue on OFFLINE?  Or you want to 
hold off?  At the moment I'm thinking that fixes for 6060 will be big changes, 
not easily backported.

@Jimmy I added a review over on RB.  It's looking good.

> In-memory region state is inconsistent
> --
>
> Key: HBASE-6272
> URL: https://issues.apache.org/jira/browse/HBASE-6272
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> AssignmentManager stores region state related information in several places: 
> regionsInTransition, regions (region info to server name map), and servers 
> (server name to region info set map).  However the access to these places is 
> not coordinated properly.  It leads to inconsistent in-memory region state 
> information.  Sometimes, some region could even be offline, and not in 
> transition.
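
One way to picture "coordinated access" to the three structures named above is to update them together under a single lock. The sketch below is only an assumption-laden illustration: RegionStatesSketch and its methods are hypothetical, regions and servers are plain strings, and it is not the design of the patch under review.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: all three structures change together under one lock. */
final class RegionStatesSketch {
  private final Set<String> regionsInTransition = new HashSet<String>();
  private final Map<String, String> regions = new HashMap<String, String>();           // region -> server
  private final Map<String, Set<String>> servers = new HashMap<String, Set<String>>(); // server -> regions

  /** Move a region into transition and drop it from both location maps. */
  synchronized void startTransition(String region) {
    String old = regions.remove(region);
    if (old != null) {
      servers.get(old).remove(region);
    }
    regionsInTransition.add(region);
  }

  /** Mark a region online on a server, updating every structure in one step. */
  synchronized void regionOnline(String region, String server) {
    regionsInTransition.remove(region);
    String old = regions.put(region, server);
    if (old != null && !old.equals(server)) {
      servers.get(old).remove(region);
    }
    Set<String> hosted = servers.get(server);
    if (hosted == null) {
      hosted = new HashSet<String>();
      servers.put(server, hosted);
    }
    hosted.add(region);
  }

  synchronized String serverOf(String region) {
    return regions.get(region);
  }

  /** True if the two location maps agree and no online region is also in transition. */
  synchronized boolean isConsistent() {
    for (Map.Entry<String, String> e : regions.entrySet()) {
      Set<String> hosted = servers.get(e.getValue());
      if (hosted == null || !hosted.contains(e.getKey())) return false;
      if (regionsInTransition.contains(e.getKey())) return false;
    }
    for (Map.Entry<String, Set<String>> e : servers.entrySet()) {
      for (String region : e.getValue()) {
        if (!e.getKey().equals(regions.get(region))) return false;
      }
    }
    return true;
  }
}
```

Because every mutation goes through one synchronized method, a reader can never observe a region that has left one map but not yet the other.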





[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

2012-07-13 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413772#comment-13413772
 ] 

stack commented on HBASE-6299:
--

bq. Is it possible that we can do something in an earlier stage to prevent 
double assignment? like in forceRegionStateToOffline()?

Yes.  Let's try.  I was going to try to write up a reproduction of the bugs you 
describe above in a harness so we can play with them in isolation rather than have 
to blow up someone's world.

> RS starts region open while fails ack to HMaster.sendRegionOpen() causes 
> inconsistency in HMaster's region state and a series of successive problems.
> -
>
> Key: HBASE-6299
> URL: https://issues.apache.org/jira/browse/HBASE-6299
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.94.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Critical
> Attachments: HBASE-6299-v2.patch, HBASE-6299.patch
>
>
> 1. HMaster tries to assign a region to an RS.
> 2. HMaster creates a RegionState for this region and puts it into 
> regionsInTransition.
> 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS 
> receives the open region request and starts to proceed, with success 
> eventually. However, due to network problems, HMaster fails to receive the 
> response for the openRegion() call, and the call times out.
> 4. HMaster attempts to assign for a second time, choosing another RS. 
> 5. But since the HMaster's OpenedRegionHandler has been triggered by the 
> region open of the previous RS, and the RegionState has already been removed 
> from regionsInTransition, HMaster treats as invalid, and ignores, the unassigned ZK 
> node "RS_ZK_REGION_OPENING" updated by the second attempt.
> 6. The unassigned ZK node stays and a later unassign fails because 
> RS_ZK_REGION_CLOSING cannot be created.
> {code}
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
> region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.;
>  
> plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.,
>  src=swbss-hadoop-004,60020,1340890123243, 
> dest=swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  to swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:28,882 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,291 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,299 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,299 DEBUG 
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
> event for 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, 
> regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node
> 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x2377fee2ae80007 Deleting existing unassigned node for 
> b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED
> 2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x2377fee2ae80007 Successfully deleted unassigned node for 
> region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED
> 2012-06-29 07:06:32,301 DEBUG 
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has 
> opened the region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  that was online on serverName=swbss-hadoop-006,60020,1340890678078, 
> load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301)
> 2012-06-29 07:07:41,140 WARN 
> org.apache.hadoop.hb

[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413785#comment-13413785
 ] 

Hudson commented on HBASE-6380:
---

Integrated in HBase-TRUNK #3125 (See 
[https://builds.apache.org/job/HBase-TRUNK/3125/])
HBASE-6380 bulkload should update the store.storeSize (Revision 1361203)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java






[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413801#comment-13413801
 ] 

Hudson commented on HBASE-6380:
---

Integrated in HBase-0.94 #315 (See 
[https://builds.apache.org/job/HBase-0.94/315/])
HBASE-6380 bulkload should update the store.storeSize (Revision 1361204)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java






[jira] [Updated] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-13 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-4050:
-

Attachment: HBASE-4050-8.patch

Addressed stack's comments.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
> HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
> HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-6272) In-memory region state is inconsistent

2012-07-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413856#comment-13413856
 ] 

Jimmy Xiang commented on HBASE-6272:


@Stack, thanks a lot for the review. I will respond on RB.
I will backport this patch to 0.92 and 0.94 after it is applied to trunk.

> In-memory region state is inconsistent
> --
>
> Key: HBASE-6272
> URL: https://issues.apache.org/jira/browse/HBASE-6272
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> AssignmentManager stores region state related information in several places: 
> regionsInTransition, regions (region info to server name map), and servers 
> (server name to region info set map).  However the access to these places is 
> not coordinated properly.  It leads to inconsistent in-memory region state 
> information.  Sometimes, some region could even be offline, and not in 
> transition.





[jira] [Updated] (HBASE-5376) Add more logging to triage HBASE-5312: Closed parent region present in Hlog.lastSeqWritten

2012-07-13 Thread Jonathan Hsieh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5376:
--

Affects Version/s: 0.90.7
Fix Version/s: (was: 0.90.7)

> Add more logging to triage HBASE-5312: Closed parent region present in 
> Hlog.lastSeqWritten
> --
>
> Key: HBASE-5376
> URL: https://issues.apache.org/jira/browse/HBASE-5376
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 0.90.7
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: hbase-5376.txt
>
>
> It is hard to find out what exactly caused HBASE-5312.  Some logging will 
> help shed some light.





[jira] [Commented] (HBASE-6390) append() and increment() may result in inconsistent result on retries.

2012-07-13 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413870#comment-13413870
 ] 

Andrew Purtell commented on HBASE-6390:
---

So what you are looking for here is a way for a user to, perhaps optionally, 
make idempotent requests out of Append and Increment, correct?

Let me volunteer a couple of strawmen:

1) Could overload the timestamp of the Append and Increment requests. If the 
request is "out of date" relative to another request already applied, throw 
back a DoNotRetryException (or just a DNRE for that op if submitted as a 
MultiAction). This is roughly how ZooKeeper handles this class of distributed 
synchronization issue. Timestamp becomes a global sequence number. Not a 
logical sequence number so clocks must be closely synchronized. Each memstore 
would track the (server side) time of the most recent in-place update mutation. 
Could go further and keep a soft cache of in-place update times by row or even 
KV for use by append/increment/ICV. If more specific information gets evicted 
from the cache due to pressure, then falling back to the per-memstore global 
timestamp would still ensure correctness, but potentially at the cost of more 
resubmission work for the client/app.

2) A more generic option could be:

  * Extend the API where the user can set an optional cookie (a long). 

  * Keep a ring buffer of recent cookies up on the server.

  * Check the buffer first if a request with given cookie has already been 
applied and throw an exception back to the client if so.

Wouldn't guarantee correctness outside of some time bound. Also I worry about 
state management on the server. How large would that buffer need to be to 
capture all cookies submitted within ~(2 * time bound)?
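
A minimal sketch of strawman 2, assuming a fixed-capacity buffer and a caller that turns a duplicate into an exception for the client. The names RecentCookieBuffer and markIfNew are hypothetical and not part of HBase; the sketch only illustrates why correctness is limited to a time bound, since the oldest cookie is forgotten once the buffer is full.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: remembers the most recent request cookies. */
class RecentCookieBuffer {
  private final long[] ring;                              // cookies in arrival order
  private final Set<Long> present = new HashSet<Long>();  // fast membership check
  private int next = 0;                                   // slot overwritten next
  private int size = 0;                                   // occupied slots

  RecentCookieBuffer(int capacity) {
    if (capacity <= 0) {
      throw new IllegalArgumentException("capacity must be positive");
    }
    this.ring = new long[capacity];
  }

  /**
   * Records the cookie if it has not been seen recently.
   *
   * @return true if the request is new and may be applied, false if the
   *         cookie is still in the buffer (the caller would reject the retry)
   */
  synchronized boolean markIfNew(long cookie) {
    if (present.contains(cookie)) {
      return false;
    }
    if (size == ring.length) {
      // Buffer is full: the slot at 'next' holds the oldest cookie, forget it.
      present.remove(ring[next]);
    } else {
      size++;
    }
    ring[next] = cookie;
    present.add(cookie);
    next = (next + 1) % ring.length;
    return true;
  }
}
```

A buffer of capacity N remembers only the last N distinct cookies, so N has to cover every request submitted within the retry window, which is the sizing concern raised above.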

> append() and increment() may result in inconsistent result on retries.
> --
>
> Key: HBASE-6390
> URL: https://issues.apache.org/jira/browse/HBASE-6390
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Ashutosh Jindal
>
> The append() and increment() APIs can give inconsistent results in the 
> following scenarios:
> 1- If the client does not receive the response within the specified time, 
> it retries.  The first call to increment/append has already been applied, and 
> the retry applies the operation a second time.  
> 2- If the sync() to the WAL fails, we get an IOException; on getting the 
> exception, a retry is done, which again results in the increment/append 
> being applied a second time.  
> We may need some sort of rollback for the second problem.
> For the first one we need to see how to handle this.





[jira] [Created] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions

2012-07-13 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6392:
--

 Summary: UnknownRegionException blocks hbck from sideline big 
overlap regions
 Key: HBASE-6392
 URL: https://issues.apache.org/jira/browse/HBASE-6392
 Project: HBase
  Issue Type: Bug
Reporter: Jimmy Xiang


Before sidelining a big overlap region, hbck first tries to close it and take 
it offline.  However, this sometimes throws NotServingRegion or 
UnknownRegionException.
It could be because the region is not open/assigned at all, or some other issue.
We should figure out why and fix it.

By the way, it would be better to print in the log the command line for bulk 
loading sidelined regions back, if any. 





[jira] [Commented] (HBASE-6384) hbck should group together those sidelined regions need to be bulk loaded later

2012-07-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413871#comment-13413871
 ] 

Jimmy Xiang commented on HBASE-6384:


@Jon, as to the actual bulk load command line, it is a good idea.  It will be 
addressed in HBASE-6392.

> hbck should group together those sidelined regions need to be bulk loaded 
> later
> ---
>
> Key: HBASE-6384
> URL: https://issues.apache.org/jira/browse/HBASE-6384
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
> Attachments: 6384-trunk.patch
>
>
> Currently, hbck sidelines some regions to break big overlap groups to avoid 
> possible compaction and region split.  These sidelined regions should be
> bulk loaded back later.  Information about these regions is in the output.
> It will be much easier to group them together under the same sideline rootdir,
> for example, /hbase/.hbck/to_be_loaded/.  If so, even if we lose the output
> file, we still know what regions to load back.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413872#comment-13413872
 ] 

Hadoop QA commented on HBASE-4050:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536404/HBASE-4050-8.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 11 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 10 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.catalog.TestMetaReaderEditor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2383//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2383//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2383//console

This message is automatically generated.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
> HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
> HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-6378) the javadoc of setEnabledTable maybe not describe accurately

2012-07-13 Thread David S. Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413883#comment-13413883
 ] 

David S. Wang commented on HBASE-6378:
--

From the patch:

+   * Sets the ENABLED state in the cache and Creates or force updates an node to
+   * the ENABLED state for the specified table.

I'd modify the above to be:

+   * Sets the ENABLED state in the cache and creates or force updates a node to
+   * ENABLED state for the specified table.

> the javadoc of  setEnabledTable maybe not describe accurately 
> --
>
> Key: HBASE-6378
> URL: https://issues.apache.org/jira/browse/HBASE-6378
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0
>Reporter: zhou wenjian
> Fix For: 0.94.2
>
> Attachments: 6378.patch
>
>
>   /**
>* Sets the ENABLED state in the cache and deletes the zookeeper node. Fails
>* silently if the node is not in enabled in zookeeper
>* 
>* @param tableName
>* @throws KeeperException
>*/
>   public void setEnabledTable(final String tableName) throws KeeperException {
> setTableState(tableName, TableState.ENABLED);
>   }
> When setEnabledTable is called, it updates the cache and the zookeeper 
> node rather than deleting the zk node.





[jira] [Commented] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader

2012-07-13 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413884#comment-13413884
 ] 

Anoop Sam John commented on HBASE-5997:
---

bq. On the second item when we do the compare, are the offsets to where the key 
bytes start or to where the key starts (with its length preamble)? For sure, we 
are comparing the row portions of keys?

The offset will be to the key (with its length preamble). KeyComparator will be 
used, and we can see there how rowLength is taken into account. We compare the 
full key (rowKey and then CF, qualifier... )
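
As a rough illustration of comparing keys whose offset points at the length preamble, here is a simplified sketch. It assumes the key starts with a 2-byte big-endian row length followed by the row bytes, and it compares everything after the row bytewise. That second step is a simplification for illustration only; it is not how HBase's KeyComparator treats the trailing fields, and the class and method names below are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Simplified, hypothetical sketch; not HBase's KeyComparator. */
class LengthPrefixedKeyCompare {

  /** Reads the 2-byte big-endian row length stored at the start of the key. */
  static int rowLength(byte[] buf, int keyOffset) {
    return ((buf[keyOffset] & 0xff) << 8) | (buf[keyOffset + 1] & 0xff);
  }

  /** Unsigned lexicographic comparison of two byte ranges. */
  static int compareBytes(byte[] a, int aOff, int aLen,
                          byte[] b, int bOff, int bLen) {
    int n = Math.min(aLen, bLen);
    for (int i = 0; i < n; i++) {
      int x = a[aOff + i] & 0xff;
      int y = b[bOff + i] & 0xff;
      if (x != y) {
        return x - y;
      }
    }
    return aLen - bLen;
  }

  /** Compares the row portions first, then the bytes that follow the row. */
  static int compareKeys(byte[] a, int aOff, int aLen,
                         byte[] b, int bOff, int bLen) {
    int aRow = rowLength(a, aOff);
    int bRow = rowLength(b, bOff);
    // Skip the 2-byte preamble so the length bytes never take part in ordering.
    int c = compareBytes(a, aOff + 2, aRow, b, bOff + 2, bRow);
    if (c != 0) {
      return c;
    }
    return compareBytes(a, aOff + 2 + aRow, aLen - 2 - aRow,
                        b, bOff + 2 + bRow, bLen - 2 - bRow);
  }

  /** Test helper: builds a key as [2-byte row length][row][rest]. */
  static byte[] key(String row, String rest) {
    byte[] r = row.getBytes(StandardCharsets.UTF_8);
    byte[] t = rest.getBytes(StandardCharsets.UTF_8);
    byte[] out = new byte[2 + r.length + t.length];
    out[0] = (byte) (r.length >> 8);
    out[1] = (byte) r.length;
    System.arraycopy(r, 0, out, 2, r.length);
    System.arraycopy(t, 0, out, 2 + r.length, t.length);
    return out;
  }
}
```

The point of reading the length first is that a raw byte comparison starting at the preamble would order keys by row length before row content.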



> Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
> 
>
> Key: HBASE-5997
> URL: https://issues.apache.org/jira/browse/HBASE-5997
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: Anoop Sam John
> Fix For: 0.94.2
>
> Attachments: HBASE-5997_0.94.patch, HBASE-5997_94 V2.patch, 
> Testcase.patch.txt
>
>
> Pls refer to the comment
> https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346.
> Raised this issue to address that comment, just so we don't forget it.





[jira] [Updated] (HBASE-5997) Fix concerns raised in HBASE-5922 related to HalfStoreFileReader

2012-07-13 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-5997:
--

Attachment: HBASE-5997_94 V3.patch

Patch addressing Stack's comment

> Fix concerns raised in HBASE-5922 related to HalfStoreFileReader
> 
>
> Key: HBASE-5997
> URL: https://issues.apache.org/jira/browse/HBASE-5997
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.6, 0.92.1, 0.94.0, 0.96.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: Anoop Sam John
> Fix For: 0.94.2
>
> Attachments: HBASE-5997_0.94.patch, HBASE-5997_94 V2.patch, 
> HBASE-5997_94 V3.patch, Testcase.patch.txt
>
>
> Pls refer to the comment
> https://issues.apache.org/jira/browse/HBASE-5922?focusedCommentId=13269346&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269346.
> Raised this issue to address that comment, just so we don't forget it.





[jira] [Updated] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-07-13 Thread Benjamin Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Kim updated HBASE-6288:


Attachment: HBASE-6288-trunk.patch
HBASE-6288-94.patch
HBASE-6288-92-1.patch
HBASE-6288-92.patch

> In hbase-daemons.sh, description of the default backup-master file path is 
> wrong
> 
>
> Key: HBASE-6288
> URL: https://issues.apache.org/jira/browse/HBASE-6288
> Project: HBase
>  Issue Type: Task
>  Components: master, scripts, shell
>Affects Versions: 0.92.0, 0.92.1, 0.94.0
>Reporter: Benjamin Kim
> Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
> HBASE-6288-94.patch, HBASE-6288-trunk.patch
>
>
> In hbase-daemons.sh, description of the default backup-master file path is 
> wrong
> {code}
> #   HBASE_BACKUP_MASTERS File naming remote hosts.
> # Default is ${HADOOP_CONF_DIR}/backup-masters
> {code}
> It says the default backup-masters file path is in HADOOP_CONF_DIR, but 
> shouldn't this be HBASE_CONF_DIR?
> Also, adding the following lines to conf/hbase-env.sh would be helpful:
> {code}
> # File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
> export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
> {code}





[jira] [Commented] (HBASE-6288) In hbase-daemons.sh, description of the default backup-master file path is wrong

2012-07-13 Thread Benjamin Kim (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413921#comment-13413921
 ] 

Benjamin Kim commented on HBASE-6288:
-

Sorry this took a while; I was away on vacation. Here are the patches =)

> In hbase-daemons.sh, description of the default backup-master file path is 
> wrong
> 
>
> Key: HBASE-6288
> URL: https://issues.apache.org/jira/browse/HBASE-6288
> Project: HBase
>  Issue Type: Task
>  Components: master, scripts, shell
>Affects Versions: 0.92.0, 0.92.1, 0.94.0
>Reporter: Benjamin Kim
> Attachments: HBASE-6288-92-1.patch, HBASE-6288-92.patch, 
> HBASE-6288-94.patch, HBASE-6288-trunk.patch
>
>
> In hbase-daemons.sh, description of the default backup-master file path is 
> wrong
> {code}
> #   HBASE_BACKUP_MASTERS File naming remote hosts.
> # Default is ${HADOOP_CONF_DIR}/backup-masters
> {code}
> It says the default backup-masters file path is in HADOOP_CONF_DIR, but 
> shouldn't this be HBASE_CONF_DIR?
> Also, adding the following lines to conf/hbase-env.sh would be helpful:
> {code}
> # File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.
> export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters
> {code}





[jira] [Created] (HBASE-6393) Decouple audit event creation from storage in AccessController

2012-07-13 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created HBASE-6393:
-

 Summary: Decouple audit event creation from storage in 
AccessController
 Key: HBASE-6393
 URL: https://issues.apache.org/jira/browse/HBASE-6393
 Project: HBase
  Issue Type: Brainstorming
  Components: security
Reporter: Marcelo Vanzin


Currently, AccessController takes care of both generating audit events (by 
performing access checks) and storing them (by creating a log message and 
writing it to the AUDITLOG logger).

This makes the logging system the only way to catch audit events. It means that 
if someone wants to do something fancier (like writing these records to a 
database somewhere), they need to hack through the logging system, and parse 
the messages generated by AccessController, which is not optimal.

The attached patch decouples generation and storage by introducing a new 
interface, used by AccessController, to log the audit events. The current, 
log-based storage is kept in place so that current users won't be affected by 
the change.

I'm filing this as an RFC at this point, so the patch is not totally clean; 
it's on top of HBase 0.92 (which is easier for me to test) and doesn't have any 
unit tests, for starters. But the changes should be very similar on trunk - I 
don't remember changes in this particular area of the code between those 
versions.






[jira] [Updated] (HBASE-6393) Decouple audit event creation from storage in AccessController

2012-07-13 Thread Marcelo Vanzin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcelo Vanzin updated HBASE-6393:
--

Attachment: accesslogger-v1.patch

Current version of my code, tested with a custom implementation of the new 
AccessLogger interface.
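
For readers without the patch at hand, the general shape of such a split might look like the sketch below. Only the interface name AccessLogger comes from this thread; AuditEvent, the log method, the two implementations and AccessChecker are hypothetical names chosen for illustration and are not taken from accesslogger-v1.patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical value object describing one access decision. */
class AuditEvent {
  final String user;
  final String request;
  final boolean allowed;

  AuditEvent(String user, String request, boolean allowed) {
    this.user = user;
    this.request = request;
    this.allowed = allowed;
  }
}

/** Storage side: the method name and parameter are assumptions. */
interface AccessLogger {
  void log(AuditEvent event);
}

/** Keeps the log-message style of storage: formats a line of text. */
class TextAccessLogger implements AccessLogger {
  private final StringBuilder out;

  TextAccessLogger(StringBuilder out) {
    this.out = out;
  }

  static String format(AuditEvent e) {
    return "Access " + (e.allowed ? "allowed" : "denied")
        + " for user " + e.user + "; request: " + e.request;
  }

  @Override
  public void log(AuditEvent e) {
    out.append(format(e)).append('\n');
  }
}

/** Alternative sink that keeps structured events, e.g. to write to a database. */
class CollectingAccessLogger implements AccessLogger {
  final List<AuditEvent> events = new ArrayList<AuditEvent>();

  @Override
  public void log(AuditEvent e) {
    events.add(e);
  }
}

/** Generation side: performs the check and hands the event to whatever sink is plugged in. */
class AccessChecker {
  private final AccessLogger logger;

  AccessChecker(AccessLogger logger) {
    this.logger = logger;
  }

  boolean check(String user, String request, boolean permitted) {
    logger.log(new AuditEvent(user, request, permitted));
    return permitted;
  }
}
```

With this shape, a consumer that wants structured records no longer has to parse formatted log lines; it receives the event fields directly.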

> Decouple audit event creation from storage in AccessController
> --
>
> Key: HBASE-6393
> URL: https://issues.apache.org/jira/browse/HBASE-6393
> Project: HBase
>  Issue Type: Brainstorming
>  Components: security
>Reporter: Marcelo Vanzin
> Attachments: accesslogger-v1.patch
>
>
> Currently, AccessController takes care of both generating audit events (by 
> performing access checks) and storing them (by creating a log message and 
> writing it to the AUDITLOG logger).
> This makes the logging system the only way to catch audit events. It means 
> that if someone wants to do something fancier (like writing these records to 
> a database somewhere), they need to hack through the logging system, and 
> parse the messages generated by AccessController, which is not optimal.
> The attached patch decouples generation and storage by introducing a new 
> interface, used by AccessController, to log the audit events. The current, 
> log-based storage is kept in place so that current users won't be affected by 
> the change.
> I'm filing this as an RFC at this point, so the patch is not totally clean; 
> it's on top of HBase 0.92 (which is easier for me to test) and doesn't have 
> any unit tests, for starters. But the changes should be very similar on trunk 
> - I don't remember changes in this particular area of the code between those 
> versions.





[jira] [Commented] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-13 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413945#comment-13413945
 ] 

Elliott Clark commented on HBASE-4050:
--

Test failure looks unrelated.  Works on my machine.

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Alex Baranau
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
> HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
> HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Assigned] (HBASE-4050) Update HBase metrics framework to metrics2 framework

2012-07-13 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark reassigned HBASE-4050:


Assignee: Elliott Clark  (was: Alex Baranau)

> Update HBase metrics framework to metrics2 framework
> 
>
> Key: HBASE-4050
> URL: https://issues.apache.org/jira/browse/HBASE-4050
> Project: HBase
>  Issue Type: New Feature
>  Components: metrics
>Affects Versions: 0.90.4
> Environment: Java 6
>Reporter: Eric Yang
>Assignee: Elliott Clark
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 4050-metrics-v2.patch, 4050-metrics-v3.patch, 
> HBASE-4050-0.patch, HBASE-4050-1.patch, HBASE-4050-2.patch, 
> HBASE-4050-3.patch, HBASE-4050-5.patch, HBASE-4050-6.patch, 
> HBASE-4050-7.patch, HBASE-4050-8.patch, HBASE-4050.patch
>
>
> Metrics Framework has been marked deprecated in Hadoop 0.20.203+ and 0.22+, 
> and it might get removed in future Hadoop release.  Hence, HBase needs to 
> revise the dependency of MetricsContext to use Metrics2 framework.





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Aditya Kishore (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414031#comment-13414031
 ] 

Aditya Kishore commented on HBASE-6389:
---

I like the idea of treating timeout as error case and if we do decide on that, 
two things need to be taken care of.

# The current default timeout of 4.5 sec may not be appropriate and may require 
upward revision (to the tune of a few minutes), and
# The master would need to do a cluster shutdown including other standby 
masters, otherwise each standby master may continue after the previous one has 
given up. In the worst case, if somehow 'minToStart' number of RSes join the 
last master, the cluster may be left with no standby master.

For this JIRA, I would like to revert to the original behavior (until 0.92), 
where the Master waits for 'minToStart' number of RSes.
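
To make the difference concrete, the two wait conditions can be written as pure predicates. The first mirrors the while condition quoted in the issue description; the second is derived from the Javadoc proposed there (timeout only counts once 'mintostart' is reached and no new RS has reported for 'interval'). The class and method names are hypothetical, and the second predicate is my reading of that Javadoc rather than the code in the attached patch.

```java
/** Hypothetical sketch comparing the two loop conditions. */
class RegionServerWaitPolicy {

  /** Condition as quoted from trunk: an elapsed timeout alone ends the wait. */
  static boolean keepWaitingCurrent(boolean masterStopped, long slept, long timeout,
                                    int count, int minToStart, int maxToStart,
                                    long lastCountChange, long interval, long now) {
    return !masterStopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /**
   * Condition derived from the proposed Javadoc: the wait ends when the master
   * is stopped, maxToStart is reached, or minToStart is reached AND no new
   * region server has reported for 'interval' AND the timeout has elapsed.
   */
  static boolean keepWaitingProposed(boolean masterStopped, long slept, long timeout,
                                     int count, int minToStart, int maxToStart,
                                     long lastCountChange, long interval, long now) {
    return !masterStopped
        && count < maxToStart
        && (count < minToStart
            || lastCountChange + interval > now
            || slept < timeout);
  }
}
```

With one RS checked in, a 'mintostart' of 3 and the timeout already elapsed, the first predicate stops waiting while the second keeps waiting, which is the behavior change this issue is about.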

> Modify the conditions to ensure that Master waits for sufficient number of 
> Region Servers before starting region assignments
> 
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of 
> "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from 
> default of 1) can help prevent assignment of all regions to one (or a small 
> number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
> 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has 
> lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not been 
> reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
> 
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *   there have been no new region server in for
>    *  'hbase.master.wait.on.regionservers.interval' time
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
>   throws InterruptedException {
> ..
>     while (
>       !this.master.isStopped() &&
>         slept < timeout &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || count < minToStart)
>       ){
> 
> {code}
> So with the current conditions, the wait will end as soon as the timeout is 
> reached, even if fewer RSes than required have checked in with the Master, 
> and the master will proceed with region assignment among these RSes alone.
> As mentioned in 
> -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
>  and I concur, this could have a disastrous effect in a large cluster, 
> especially now that MSLAB is turned on.
> To enforce the required quorum as specified by 
> "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, 
> these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>* Wait for the region servers to report in.
>* We will wait until one of this condition is met:
>*  - the master is stopped
>*  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>*region servers is reached
>*  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>*   there have been no new region server in for
>*  'hbase.master.wait.on.regionservers.interval' time AND
>*   the 'hbase.master.wait.on.regionservers.timeout' is reached
>*
>* @throws InterruptedException
>*/
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
> int minToStart = this.master.getConfiguration().
> getInt("hbase.master.wait.on.regionservers.mintostart", 1);
> int maxToStart = this.master.getConfiguration().
> getInt("hbase.master.wait.on.regionservers.maxtostart", 
> Integer.MAX_VALUE);
> if (maxToStart < minToStart) {
>   maxTo

[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-6389:
--

Attachment: HBASE-6389_trunk.patch

The test failures were the result of masked errors in the test code, which 
this change brought out.

There were two such errors.

# The function 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster() was 
overriding the value of 'mintostart' and 'maxtostart' with a single value, even 
if the caller has set them explicitly.
# org.apache.hadoop.hbase.regionserver.TestRSKilledWhenMasterInitializing did 
not set these values even though it kills one RS during master initialization.

The attached patch fixes these two.
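
A minimal sketch of how the first error could be avoided, assuming the test helper only fills in values the caller left unset. It is not taken from the attached patch: the class name, the method name and the `defaultCount` parameter are hypothetical, and a plain `Map` stands in for the HBase configuration object.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only; not the HBaseTestingUtility code. */
final class MiniClusterDefaultsSketch {
  static final String MIN_TO_START = "hbase.master.wait.on.regionservers.mintostart";
  static final String MAX_TO_START = "hbase.master.wait.on.regionservers.maxtostart";

  /** Hypothetical helper: apply a default only to keys the caller has not set. */
  static void applyDefaults(Map<String, String> conf, int defaultCount) {
    conf.putIfAbsent(MIN_TO_START, Integer.toString(defaultCount));
    conf.putIfAbsent(MAX_TO_START, Integer.toString(defaultCount));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(MIN_TO_START, "1");   // the caller sets 'mintostart' explicitly
    applyDefaults(conf, 3);        // the helper must not overwrite it
    System.out.println(MIN_TO_START + " = " + conf.get(MIN_TO_START));
    System.out.println(MAX_TO_START + " = " + conf.get(MAX_TO_START));
  }
}
```

With this shape, a test that kills a region server during master initialization can lower 'mintostart' before starting the cluster and still see its own value afterwards.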

> Modify the conditions to ensure that Master waits for sufficient number of 
> Region Servers before starting region assignments
> 
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of 
> "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from 
> default of 1) can help prevent assignment of all regions to one (or a small 
> number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
> 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has 
> lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not 
> been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
> 
> 581 /**
> 582  * Wait for the region servers to report in.
> 583  * We will wait until one of this condition is met:
> 584  *  - the master is stopped
> 585  *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
> 586  *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
> 587  *region servers is reached
> 588  *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached 
> AND
> 589  *   there have been no new region server in for
> 590  *  'hbase.master.wait.on.regionservers.interval' time
> 591  *
> 592  * @throws InterruptedException
> 593  */
> 594 public void waitForRegionServers(MonitoredTask status)
> 595 throws InterruptedException {
> 
> 
> 612   while (
> 613 !this.master.isStopped() &&
> 614   slept < timeout &&
> 615   count < maxToStart &&
> 616   (lastCountChange+interval > now || count < minToStart)
> 617 ){
> 
> {code}
> So with the current conditions, the wait will end as soon as the timeout is 
> reached, even if fewer RSes than required have checked in with the Master, 
> and the master will proceed with the region assignment among these RSes alone.
> As mentioned in 
> -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
>  and I concur, this could have a disastrous effect in a large cluster, 
> especially now that MSLAB is turned on.
> To enforce the required quorum as specified by 
> "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, 
> these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>* Wait for the region servers to report in.
>* We will wait until one of this condition is met:
>*  - the master is stopped
>*  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>*region servers is reached
>*  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>*   there have been no new region server in for
>*  'hbase.master.wait.on.regionservers.interval' time AND
>*   the 'hbase.master.wait.on.regionservers.timeout' is reached
>*
>* @throws InterruptedException
>*/
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
> int minToStart = this.master.getConfiguration().
> getInt("hbase.master.wait.on.regionservers.mintostart", 1);
> int maxToStart = this.master.getConfiguration().
> getInt("hbase.master.wait.on.regionservers.maxtostart", 
> Integer.MAX_VALUE);
> if (maxToStart < minToStart) {
>   maxToStart = minToStart;
> }
> ..
> ..
> while (
>   !this.master.isStopped() &&
> count < maxToStart &&
> (lastCountChange+interval > now || t

[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414106#comment-13414106
 ] 

Hadoop QA commented on HBASE-6389:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12536453/HBASE-6389_trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.mapreduce.TestHLogRecordReader

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2384//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2384//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2384//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2384//console

This message is automatically generated.

> Modify the conditions to ensure that Master waits for sufficient number of 
> Region Servers before starting region assignments
> 
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of 
> "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from 
> default of 1) can help prevent assignment of all regions to one (or a small 
> number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
> 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has 
> lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not 
> been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
> 
> 581 /**
> 582  * Wait for the region servers to report in.
> 583  * We will wait until one of this condition is met:
> 584  *  - the master is stopped
> 585  *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
> 586  *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
> 587  *region servers is reached
> 588  *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached 
> AND
> 589  *   there have been no new region server in for
> 590  *  'hbase.master.wait.on.regionservers.interval' time
> 591  *
> 592  * @throws InterruptedException
> 593  */
> 594 public void waitForRegionServers(MonitoredTask status)
> 595 throws InterruptedException {
> 
> 
> 612   while (
> 613 !this.master.isStopped() &&
> 614   slept < timeout &&
> 615   count < maxToStart &&
> 616   (lastCountChange+interval > now || count < minToStart)
> 617 ){
> 
> {code}
> So with the current conditions, the wait will end as soon as the timeout is 
> reached, even if fewer RSes than required have checked in with the Master, 
> and the master will proceed with the region assignment among these RSes alone.
> As mentioned in 
> -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
>  and I concur, this could have a disastrous effect in a large cluster, 
> especially now that MSLAB is turned on.
> To enforce the required quorum as specified by 
> "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, 
> these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>* Wait for the region servers to report in.
>* We will wait until one of this condition is met:
>*  - the master is stopped
>*  - the 'hbase.master.wait.on.regionservers.maxtostart' number 

[jira] [Assigned] (HBASE-6392) UnknownRegionException blocks hbck from sideline big overlap regions

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HBASE-6392:
--

Assignee: Jimmy Xiang

> UnknownRegionException blocks hbck from sideline big overlap regions
> 
>
> Key: HBASE-6392
> URL: https://issues.apache.org/jira/browse/HBASE-6392
> Project: HBase
>  Issue Type: Bug
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> Before sidelining a big overlap region, hbck tries to close it and offline it 
> at first.  However, sometimes, it throws NotServingRegion or 
> UnknownRegionException.
> It could be because the region is not open/assigned at all, or some other 
> issue.
> We should figure out why and fix it.
> By the way, it's better to print out in the log the command line to bulk load 
> back sidelined regions, if any. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-6381) AssignmentManager should use the same logic for clean startup and failover

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang reassigned HBASE-6381:
--

Assignee: Jimmy Xiang

> AssignmentManager should use the same logic for clean startup and failover
> --
>
> Key: HBASE-6381
> URL: https://issues.apache.org/jira/browse/HBASE-6381
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>
> Currently AssignmentManager handles clean startup and failover very 
> differently.
> Different logic is mingled together so it is hard to find out which is for 
> which.
> We should clean it up and share the same logic so that AssignmentManager 
> handles
> both cases the same way.  This way, the code will be much easier to understand 
> and
> maintain.





[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414154#comment-13414154
 ] 

Hudson commented on HBASE-6380:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #93 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/93/])
HBASE-6380 bulkload should update the store.storeSize (Revision 1361203)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java


> bulkload should update the store.storeSize
> --
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Jie Huang
>Assignee: Jie Huang
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
>
> After bulkloading some HFiles into the table, we found that force-split did 
> not work because the MidKey was NULL. Force-split worked normally only after 
> we restarted the HBase service.





[jira] [Created] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)
Jimmy Xiang created HBASE-6394:
--

 Summary: verifyrep MR job map tasks throws NullPointerException 
 Key: HBASE-6394
 URL: https://issues.apache.org/jira/browse/HBASE-6394
 Project: HBase
  Issue Type: Bug
  Components: replication
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: 6394-trunk.patch

{noformat}
2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.NullPointerException
at 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
for the task
{noformat}






[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6394:
---

Attachment: 6394-trunk.patch

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
>  Issue Type: Bug
>  Components: replication
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 6394-trunk.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}





[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6394:
---

Status: Patch Available  (was: Open)

The log is from a previous version of HBase, so it is a little bit off from 
trunk.

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
>  Issue Type: Bug
>  Components: replication
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 6394-trunk.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}





[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414161#comment-13414161
 ] 

Zhihong Ted Yu commented on HBASE-6394:
---

{code}
+replicatedScanner.close();
{code}
I was expecting 'replicatedScanner = null' following the above call.
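
A minimal sketch of the close-then-null pattern suggested here, assuming the NullPointerException in cleanup() comes from a scanner that was never opened. It is not the VerifyReplication code: the class, the `ScannerLike` interface and the `open`/`hasOpenScanner` methods are hypothetical stand-ins.

```java
/** Illustration only; not the VerifyReplication.Verifier code. */
final class VerifierCleanupSketch {

  /** Hypothetical stand-in for the scanner on the replicated table. */
  interface ScannerLike {
    void close();
  }

  private ScannerLike replicatedScanner;

  /** Called when the first row arrives; a task that sees no rows never calls it. */
  void open(ScannerLike scanner) {
    this.replicatedScanner = scanner;
  }

  boolean hasOpenScanner() {
    return replicatedScanner != null;
  }

  /** Safe when nothing was opened, and safe to call more than once. */
  void cleanup() {
    if (replicatedScanner != null) {
      replicatedScanner.close();
      replicatedScanner = null; // drop the reference so a later call is a no-op
    }
  }

  public static void main(String[] args) {
    VerifierCleanupSketch verifier = new VerifierCleanupSketch();
    verifier.cleanup(); // no scanner was opened: must not throw
    verifier.open(() -> System.out.println("scanner closed"));
    verifier.cleanup(); // closes the scanner
    verifier.cleanup(); // second call does nothing
  }
}
```

Setting the field to null after close() keeps a repeated cleanup from closing the same scanner twice.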

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
>  Issue Type: Bug
>  Components: replication
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 6394-trunk.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}





[jira] [Created] (HBASE-6395) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)
Zhihong Ted Yu created HBASE-6395:
-

 Summary: TestFSSchedulerApp should be in scheduler.fair package
 Key: HBASE-6395
 URL: https://issues.apache.org/jira/browse/HBASE-6395
 Project: HBase
  Issue Type: Bug
Reporter: Zhihong Ted Yu


MAPREDUCE-3451 added Fair Scheduler to MRv2

TestFSSchedulerApp was added under 
src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair but 
its package was declared to be 
org.apache.hadoop.yarn.server.resourcemanager.scheduler





[jira] [Resolved] (HBASE-6395) TestFSSchedulerApp should be in scheduler.fair package

2012-07-13 Thread Zhihong Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Ted Yu resolved HBASE-6395.
---

Resolution: Won't Fix

This should have been a MAPREDUCE JIRA.

> TestFSSchedulerApp should be in scheduler.fair package
> --
>
> Key: HBASE-6395
> URL: https://issues.apache.org/jira/browse/HBASE-6395
> Project: HBase
>  Issue Type: Bug
>Reporter: Zhihong Ted Yu
>
> MAPREDUCE-3451 added Fair Scheduler to MRv2
> TestFSSchedulerApp was added under 
> src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair 
> but its package was declared to be 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler





[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414179#comment-13414179
 ] 

Jimmy Xiang commented on HBASE-6394:


Sure, I will add that.

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
>  Issue Type: Bug
>  Components: replication
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 6394-trunk.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}





[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6394:
---

Status: Open  (was: Patch Available)

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
>  Issue Type: Bug
>  Components: replication
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 6394-trunk.patch, 6394-trunk_v2.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}





[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6394:
---

Attachment: 6394-trunk_v2.patch

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
>  Issue Type: Bug
>  Components: replication
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 6394-trunk.patch, 6394-trunk_v2.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414185#comment-13414185
 ] 

Lars Hofhansl commented on HBASE-6389:
--

+1 on last patch.
If there are no objections I'll commit this to 0.94 and 0.96.

Let's discuss the failure after timeout idea in a different jira.

> Modify the conditions to ensure that Master waits for sufficient number of 
> Region Servers before starting region assignments
> 
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of 
> "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from 
> default of 1) can help prevent assignment of all regions to one (or a small 
> number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior has changed in 
> 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, Master will proceed immediately after the timeout has 
> lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not 
> been reached.
> Reading the current conditions of waitForRegionServers() clarifies it:
> {code:title=ServerManager.java (trunk rev:1360470)}
> 
> 581 /**
> 582  * Wait for the region servers to report in.
> 583  * We will wait until one of this condition is met:
> 584  *  - the master is stopped
> 585  *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
> 586  *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
> 587  *region servers is reached
> 588  *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached 
> AND
> 589  *   there have been no new region server in for
> 590  *  'hbase.master.wait.on.regionservers.interval' time
> 591  *
> 592  * @throws InterruptedException
> 593  */
> 594 public void waitForRegionServers(MonitoredTask status)
> 595 throws InterruptedException {
> 
> 
> 612   while (
> 613 !this.master.isStopped() &&
> 614   slept < timeout &&
> 615   count < maxToStart &&
> 616   (lastCountChange+interval > now || count < minToStart)
> 617 ){
> 
> {code}
> So with the current conditions, the wait will end as soon as the timeout is 
> reached, even if fewer RSes than required have checked in with the Master, 
> and the master will proceed with the region assignment among these RSes alone.
> As mentioned in 
> -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
>  and I concur, this could have a disastrous effect in a large cluster, 
> especially now that MSLAB is turned on.
> To enforce the required quorum as specified by 
> "hbase.master.wait.on.regionservers.mintostart" irrespective of timeout, 
> these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>         getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>         getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     if (maxToStart < minToStart) {
>       maxToStart = minToStart;
>     }
> ..
> ..
>     while (
>       !this.master.isStopped() &&
>         count < maxToStart &&
>         (lastCountChange+interval > now || timeout > slept || count < minToStart)
>       ){
> ..
> {code}
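
To make the difference between the two loop conditions quoted above easier to
see, here is a minimal standalone sketch. It is not HBase code: the class name,
the method names and the numbers in main() are made up for illustration, and
the master-stopped check is reduced to a boolean parameter. Only the boolean
expressions are copied from the two snippets in the description.

```java
/** Standalone model of the two waitForRegionServers() loop conditions (illustrative only). */
public class WaitConditionSketch {

  /** Condition quoted from trunk rev 1360470: reaching the timeout alone ends the wait. */
  public static boolean oldKeepWaiting(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /** Condition proposed in the description: the timeout no longer overrides minToStart. */
  public static boolean newKeepWaiting(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now || timeout > slept || count < minToStart);
  }

  public static void main(String[] args) {
    // Hypothetical situation: the timeout has already elapsed, no region server
    // has checked in recently, and only 1 of the 3 required servers is present.
    long timeout = 4500, slept = 6000, interval = 1500;
    long now = 10000, lastCountChange = 5000;
    int count = 1, minToStart = 3, maxToStart = Integer.MAX_VALUE;

    System.out.println("old condition keeps waiting: " + oldKeepWaiting(
        false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
    System.out.println("new condition keeps waiting: " + newKeepWaiting(
        false, slept, timeout, count, minToStart, maxToStart, lastCountChange, interval, now));
  }
}
```

With these made-up inputs the old condition evaluates to false (the master
stops waiting with one region server), while the proposed condition evaluates
to true because count is still below minToStart.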

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6394:
---

Status: Patch Available  (was: Open)

Addressed Ted's comment.

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
>  Issue Type: Bug
>  Components: replication
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Attachments: 6394-trunk.patch, 6394-trunk_v2.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}
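
The trace above shows the NullPointerException surfacing in Verifier.cleanup(),
which Mapper.run() calls at the end of the map task. The patch itself is not
quoted in this thread, so the following is only a generic sketch of one common
cause and guard: a resource that is created lazily inside map() and is still
null when cleanup() runs because no record was ever mapped. Every name below
is hypothetical and not taken from VerifyReplication.

```java
import java.io.Closeable;
import java.io.IOException;

/** Illustrative model of a mapper-like class with a lazily created resource. */
public class LazyResourceCleanupSketch {

  // Hypothetical resource; stays null until the first record is mapped.
  private Closeable scanner;
  private int closeCount;

  /** Creates the resource on first use, as a lazily initialised scanner would be. */
  public void map(String record) {
    if (scanner == null) {
      scanner = () -> closeCount++;
    }
  }

  /** Closes the resource only if map() ever created it, so an empty task does not throw. */
  public void cleanup() throws IOException {
    if (scanner != null) {
      scanner.close();
      scanner = null;
    }
  }

  public int getCloseCount() {
    return closeCount;
  }

  public static void main(String[] args) throws IOException {
    LazyResourceCleanupSketch emptyTask = new LazyResourceCleanupSketch();
    emptyTask.cleanup(); // no records were mapped; the guard avoids a NullPointerException
    System.out.println("closes after empty task: " + emptyTask.getCloseCount());
  }
}
```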





[jira] [Updated] (HBASE-6391) Master restart when enabling table will lead to region assignned twice

2012-07-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6391:
-

Fix Version/s: (was: 0.94.1)
   0.94.2

I think this could be closed to DUP as well.
Moving to 0.94.2 for now.

> Master restart when enabling table will lead to region assignned twice
> --
>
> Key: HBASE-6391
> URL: https://issues.apache.org/jira/browse/HBASE-6391
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0
>Reporter: zhou wenjian
> Fix For: 0.94.2
>
>
> The scenario can be reproduced as follows.
> Enable a table; some regions are online on a regionserver while others are 
> still being processed.
> Then restart the master.
> When the master fails over:
> // Region is being served and on an active server
> // add only if region not in disabled and enabling table
> if (false == checkIfRegionBelongsToDisabled(regionInfo)
> && false == checkIfRegionsBelongsToEnabling(regionInfo)) {
>   regions.put(regionInfo, regionLocation);
>   addToServers(regionLocation, regionInfo);
> }
> the already-opened regions are not added to the regions in the master.
> Then, in the following recoverTableInEnablingState, these regions are assigned 
> again.
> That leaves the cluster inconsistent.





[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Zhihong Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414189#comment-13414189
 ] 

Zhihong Ted Yu commented on HBASE-6394:
---

+1 on patch v2.






[jira] [Comment Edited] (HBASE-6391) Master restart when enabling table will lead to region assignned twice

2012-07-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414186#comment-13414186
 ] 

Lars Hofhansl edited comment on HBASE-6391 at 7/14/12 12:24 AM:


I think this could be closed as DUP as well.
Moving to 0.94.2 for now.

  was (Author: lhofhansl):
I think this could be closed to DUP as well.
Moving to 0.94.2 for now.
  





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414201#comment-13414201
 ] 

Lars Hofhansl commented on HBASE-6389:
--

Ran TestHLogRecordReader locally. Passes fine (I did not expect that to be 
related to this patch).







[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414211#comment-13414211
 ] 

Hadoop QA commented on HBASE-6394:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536479/6394-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2385//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2385//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2385//console

This message is automatically generated.






[jira] [Updated] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6389:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.94 and 0.96






[jira] [Commented] (HBASE-6380) bulkload should update the store.storeSize

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414242#comment-13414242
 ] 

Hudson commented on HBASE-6380:
---

Integrated in HBase-0.94-security #41 (See 
[https://builds.apache.org/job/HBase-0.94-security/41/])
HBASE-6380 bulkload should update the store.storeSize (Revision 1361204)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java


> bulkload should update the store.storeSize
> --
>
> Key: HBASE-6380
> URL: https://issues.apache.org/jira/browse/HBASE-6380
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Jie Huang
>Assignee: Jie Huang
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: 6380-trunk.txt, 6380-trunk.txt, hbase-6380_0_94_0.patch
>
>
> After bulk loading some HFiles into the table, we found that force-split did 
> not work because MidKey == NULL. Force-split worked normally only after we 
> restarted the HBase service.





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414241#comment-13414241
 ] 

Hudson commented on HBASE-6389:
---

Integrated in HBase-0.94-security #41 (See 
[https://builds.apache.org/job/HBase-0.94-security/41/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient 
number of Region Servers before starting region assignments (Aditya Kishore) 
(Revision 1361458)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java



[jira] [Commented] (HBASE-6384) hbck should group together those sidelined regions need to be bulk loaded later

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414243#comment-13414243
 ] 

Hudson commented on HBASE-6384:
---

Integrated in HBase-0.94-security #41 (See 
[https://builds.apache.org/job/HBase-0.94-security/41/])
HBASE-6384 hbck should group together those sidelined regions need to be 
bulk loaded later (Revision 1361036)

 Result = FAILURE
jxiang : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java


> hbck should group together those sidelined regions need to be bulk loaded 
> later
> ---
>
> Key: HBASE-6384
> URL: https://issues.apache.org/jira/browse/HBASE-6384
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
> Fix For: 0.90.7, 0.92.2, 0.96.0, 0.94.1
>
> Attachments: 6384-trunk.patch
>
>
> Currently, hbck sidelines some regions to break up big overlap groups and avoid 
> possible compaction and region split.  These sidelined regions should be
> bulk loaded back later.  Information about these regions is in the output.
> It would be much easier to group them together under the same sideline rootdir,
> for example, /hbase/.hbck/to_be_loaded/.  That way, even if we lose the output
> file, we still know which regions to load back.





[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414250#comment-13414250
 ] 

Hudson commented on HBASE-6389:
---

Integrated in HBase-0.94 #316 (See 
[https://builds.apache.org/job/HBase-0.94/316/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient 
number of Region Servers before starting region assignments (Aditya Kishore) 
(Revision 1361458)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java


> Modify the conditions to ensure that Master waits for sufficient number of 
> Region Servers before starting region assignments
> 
>
> Key: HBASE-6389
> URL: https://issues.apache.org/jira/browse/HBASE-6389
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.96.0
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
>Priority: Critical
> Fix For: 0.96.0, 0.94.1
>
> Attachments: HBASE-6389_trunk.patch, HBASE-6389_trunk.patch
>
>
> Continuing from HBASE-6375.
> It seems I was mistaken in my assumption that changing the value of 
> "hbase.master.wait.on.regionservers.mintostart" to a sufficient number (from 
> default of 1) can help prevent assignment of all regions to one (or a small 
> number of) region server(s).
> While this was the case in 0.90.x and 0.92.x, the behavior changed from 
> 0.94.0 onwards to address HBASE-4993.
> From 0.94.0 onwards, the Master proceeds as soon as the timeout has 
> lapsed, even if "hbase.master.wait.on.regionservers.mintostart" has not 
> been reached.
> The current conditions of waitForRegionServers() make this clear:
> {code:title=ServerManager.java (trunk rev:1360470)}
> /**
>  * Wait for the region servers to report in.
>  * We will wait until one of this condition is met:
>  *  - the master is stopped
>  *  - the 'hbase.master.wait.on.regionservers.timeout' is reached
>  *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>  *    region servers is reached
>  *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>  *    there have been no new region server in for
>  *    'hbase.master.wait.on.regionservers.interval' time
>  *
>  * @throws InterruptedException
>  */
> public void waitForRegionServers(MonitoredTask status)
>     throws InterruptedException {
> ..
>   while (
>     !this.master.isStopped() &&
>       slept < timeout &&
>       count < maxToStart &&
>       (lastCountChange+interval > now || count < minToStart)
>     ){
> {code}
> So with the current conditions, the wait ends as soon as the timeout is 
> reached, even if fewer RSes than required have checked in with the Master, 
> and the Master proceeds with region assignment among those RSes alone.
> As mentioned in 
> -[HBASE-4993|https://issues.apache.org/jira/browse/HBASE-4993?focusedCommentId=13237196#comment-13237196]-,
>  and I concur, this could have a disastrous effect in a large cluster, 
> especially now that MSLAB is turned on.
> To enforce the required quorum specified by 
> "hbase.master.wait.on.regionservers.mintostart" irrespective of the timeout, 
> these conditions need to be modified as follows:
> {code:title=ServerManager.java}
> ..
>   /**
>    * Wait for the region servers to report in.
>    * We will wait until one of this condition is met:
>    *  - the master is stopped
>    *  - the 'hbase.master.wait.on.regionservers.maxtostart' number of
>    *    region servers is reached
>    *  - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
>    *    there have been no new region server in for
>    *    'hbase.master.wait.on.regionservers.interval' time AND
>    *    the 'hbase.master.wait.on.regionservers.timeout' is reached
>    *
>    * @throws InterruptedException
>    */
>   public void waitForRegionServers(MonitoredTask status)
> ..
> ..
>     int minToStart = this.master.getConfiguration().
>         getInt("hbase.master.wait.on.regionservers.mintostart", 1);
>     int maxToStart = this.master.getConfiguration().
>         getInt("hbase.master.wait.on.regionservers.maxtostart", Integer.MAX_VALUE);
>     if (maxToStart < minToStart) {
>       maxToStart = minToStart;
>     }
> ..
> ..
> while (
>   !this.master.isStopped() &&
> 

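The difference between the two wait conditions in the HBASE-6389 description can be checked with a small standalone sketch. The first predicate mirrors the loop condition quoted from ServerManager.java. The second is an assumption reconstructed from the proposed Javadoc only, since the patched loop is cut off in this message. The class name, method names and sample values are hypothetical.

```java
public class WaitConditionSketch {

  /** Loop condition as quoted from ServerManager.java (trunk rev 1360470). */
  static boolean keepWaitingCurrent(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && slept < timeout
        && count < maxToStart
        && (lastCountChange + interval > now || count < minToStart);
  }

  /**
   * Assumed form of the proposed condition, derived from the Javadoc only:
   * stop waiting once mintostart is reached AND no new server has reported
   * within the interval AND the timeout has elapsed.
   */
  static boolean keepWaitingProposed(boolean stopped, long slept, long timeout,
      int count, int minToStart, int maxToStart,
      long lastCountChange, long interval, long now) {
    return !stopped
        && count < maxToStart
        && (lastCountChange + interval > now
            || slept < timeout
            || count < minToStart);
  }

  public static void main(String[] args) {
    // Hypothetical scenario: only 1 of 3 required region servers has
    // checked in, and the timeout (4500 ms) has already elapsed.
    long timeout = 4500, slept = 5000, interval = 1500;
    long lastCountChange = 0, now = 5000;
    int count = 1, minToStart = 3, maxToStart = Integer.MAX_VALUE;

    System.out.println("current keeps waiting:  " + keepWaitingCurrent(
        false, slept, timeout, count, minToStart, maxToStart,
        lastCountChange, interval, now));
    System.out.println("proposed keeps waiting: " + keepWaitingProposed(
        false, slept, timeout, count, minToStart, maxToStart,
        lastCountChange, interval, now));
  }
}
```

In this scenario the quoted condition stops waiting because the timeout has elapsed, while the reconstructed condition keeps waiting because mintostart has not been reached.
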
[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414265#comment-13414265
 ] 

Hudson commented on HBASE-6389:
---

Integrated in HBase-TRUNK #3126 (See 
[https://builds.apache.org/job/HBase-TRUNK/3126/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient 
number of Region Servers before starting region assignments (Aditya Kishore) 
(Revision 1361456)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java



[jira] [Updated] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6394:
---

       Resolution: Fixed
    Fix Version/s: 0.94.1
                   0.96.0
                   0.92.2
           Status: Resolved  (was: Patch Available)

Integrated to 0.92, 0.94 and 0.96. Thanks Ted for the review.

> verifyrep MR job map tasks throws NullPointerException 
> ---
>
> Key: HBASE-6394
> URL: https://issues.apache.org/jira/browse/HBASE-6394
> Project: HBase
> Issue Type: Bug
> Components: replication
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Fix For: 0.92.2, 0.96.0, 0.94.1
>
> Attachments: 6394-trunk.patch, 6394-trunk_v2.patch
>
>
> {noformat}
> 2012-07-02 16:23:34,871 INFO org.apache.hadoop.mapred.TaskLogsTruncater: 
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
> 2012-07-02 16:23:34,876 WARN org.apache.hadoop.mapred.Child: Error running 
> child
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier.cleanup(VerifyReplication.java:140)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:645)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:396)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at org.apache.hadoop.mapred.Child.main(Child.java:264)
> 2012-07-02 16:23:34,882 INFO org.apache.hadoop.mapred.Task: Runnning cleanup 
> for the task
> {noformat}

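The trace places the NullPointerException in Verifier.cleanup, called from Mapper.run, which invokes cleanup after the map loop whether or not map was ever called. The following is a minimal sketch of one way such an NPE can arise and be guarded against. It is an assumption for illustration, not the content of the attached patches, and the class, field and method names are hypothetical stand-ins that do not use the Hadoop API.

```java
public class CleanupGuardSketch {

  /** Hypothetical stand-in for a resource that map() opens lazily. */
  static class Resource implements AutoCloseable {
    boolean closed = false;

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Hypothetical mapper-like class; this is not the real Verifier. */
  static class LazyMapper {
    Resource resource; // stays null until the first record is mapped

    void map(String record) {
      if (resource == null) {
        resource = new Resource();
      }
    }

    /** Unguarded cleanup: throws NullPointerException if map() never ran. */
    void cleanupUnguarded() {
      resource.close();
    }

    /** Guarded cleanup: safe for a task that received no input records. */
    void cleanupGuarded() {
      if (resource != null) {
        resource.close();
      }
    }
  }

  public static void main(String[] args) {
    LazyMapper empty = new LazyMapper(); // simulates a task with no input
    empty.cleanupGuarded();              // completes without an exception
    try {
      empty.cleanupUnguarded();
    } catch (NullPointerException e) {
      System.out.println("unguarded cleanup failed on empty input");
    }
  }
}
```

The null check in cleanup is the design choice being illustrated: any state that only map() initializes has to be treated as optional by the time cleanup runs.
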
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414274#comment-13414274
 ] 

Hadoop QA commented on HBASE-6394:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12536484/6394-trunk_v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 5 javac compiler warnings (more than 
the trunk's current 4 warnings).

-1 findbugs.  The patch appears to introduce 8 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests:
    org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
    org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2386//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2386//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2386//console

This message is automatically generated.





[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414282#comment-13414282
 ] 

Hudson commented on HBASE-6394:
---

Integrated in HBase-0.94 #318 (See 
[https://builds.apache.org/job/HBase-0.94/318/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 
1361470)

 Result = ABORTED
jxiang : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java






[jira] [Commented] (HBASE-6389) Modify the conditions to ensure that Master waits for sufficient number of Region Servers before starting region assignments

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414283#comment-13414283
 ] 

Hudson commented on HBASE-6389:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #94 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/94/])
HBASE-6389 Modify the conditions to ensure that Master waits for sufficient 
number of Region Servers before starting region assignments (Aditya Kishore) 
(Revision 1361456)

 Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRSKilledWhenMasterInitializing.java



[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414284#comment-13414284
 ] 

Hudson commented on HBASE-6394:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #94 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/94/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 
1361469)

 Result = FAILURE
jxiang : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java






[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414288#comment-13414288
 ] 

Hudson commented on HBASE-6394:
---

Integrated in HBase-0.94-security #42 (See 
[https://builds.apache.org/job/HBase-0.94-security/42/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 
1361470)

 Result = FAILURE
jxiang : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java






[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414289#comment-13414289
 ] 

Hudson commented on HBASE-6394:
---

Integrated in HBase-TRUNK #3127 (See 
[https://builds.apache.org/job/HBase-TRUNK/3127/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 
1361469)

 Result = FAILURE
jxiang : 
Files : 
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java






[jira] [Commented] (HBASE-6394) verifyrep MR job map tasks throws NullPointerException

2012-07-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13414302#comment-13414302
 ] 

Hudson commented on HBASE-6394:
---

Integrated in HBase-0.92 #476 (See 
[https://builds.apache.org/job/HBase-0.92/476/])
HBASE-6394 verifyrep MR job map tasks throws NullPointerException (Revision 
1361471)

 Result = FAILURE
jxiang : 
Files : 
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java

