[ 
https://issues.apache.org/jira/browse/HBASE-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635271#comment-13635271
 ] 

Jean-Marc Spaggiari commented on HBASE-8374:
--------------------------------------------

Hi Ted,

I'm not sure about your patch.
{code}
+            if (loc.size() > 0) {
+              regionLocations[regionIndex] = new int[loc.size()];
+              for (int i=0; i < loc.size(); i++) {
+                regionLocations[regionIndex][i] = serversToIndex.get(loc.get(i));
+              }
             }
{code}

If loc.size() == 0, the for loop never runs and loc.get(i) is never called, 
no? And we know that loc can't be null, otherwise the NPE would have been 
thrown on loc.size(). So there are 2 options:
1) loc.get(i) returns null and serversToIndex.get(null) gives the NPE
2) serversToIndex is null.

I think we are facing #1 here.
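
To make the two options concrete, here is a minimal standalone sketch of the lookup. It is not the HBase code: String stands in for ServerName, the class and method names are made up, and it assumes serversToIndex is a HashMap with boxed Integer values (the assignment into an int[] suggests boxed values, but the map type is my assumption). Under those assumptions, get() on a null or unknown key just returns null, and the NPE comes from unboxing that null into the int slot. So a non-null server that is simply missing from the map would fail the same way as #1.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UnboxingNpeSketch {

    // Same shape as the loop in the patch. String is a stand-in for ServerName.
    static int[] toIndexes(List<String> loc, Map<String, Integer> serversToIndex) {
        int[] result = new int[loc.size()];
        for (int i = 0; i < loc.size(); i++) {
            // HashMap.get returns null for a null or unknown key; assigning that
            // null Integer to an int element unboxes it and throws the NPE.
            result[i] = serversToIndex.get(loc.get(i));
        }
        return result;
    }

    // Hypothetical guarded variant: unknown servers get -1 instead of an NPE.
    static int[] toIndexesGuarded(List<String> loc, Map<String, Integer> serversToIndex) {
        int[] result = new int[loc.size()];
        for (int i = 0; i < loc.size(); i++) {
            Integer index = serversToIndex.get(loc.get(i));
            result[i] = (index == null) ? -1 : index;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> serversToIndex = new HashMap<>();
        serversToIndex.put("server-a", 0);
        serversToIndex.put("server-b", 1);

        // "server-c" is non-null but was never registered in the map.
        List<String> loc = Arrays.asList("server-a", "server-c");

        System.out.println(Arrays.toString(toIndexesGuarded(loc, serversToIndex)));
        try {
            toIndexes(loc, serversToIndex);
        } catch (NullPointerException e) {
            System.out.println("NPE while unboxing the missing index");
        }
    }
}
```

toIndexesGuarded is only one hypothetical way to avoid the exception. Whether -1 (or skipping the entry) is acceptable for the balancer is a separate question.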
                
> NPE when launching the balance
> ------------------------------
>
>                 Key: HBASE-8374
>                 URL: https://issues.apache.org/jira/browse/HBASE-8374
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 0.95.0
>         Environment: AWS / real cluster with 3 nodes + master
>            Reporter: Nicolas Liochon
>            Assignee: Ted Yu
>             Fix For: 0.98.0, 0.95.1
>
>         Attachments: 8374-trunk.txt
>
>
> I can't reproduce this every time, but I hit it on a fairly clean env.
> It occurs every 5 minutes (i.e. the balancer period). Impact is severe: the 
> balancer does not run. Once it starts to occur, it occurs every time. I 
> haven't tried restarting the master, but I think that should be enough.
> Now, looking at the code, the NPE is strange. 
> {noformat}
> 2013-04-18 08:09:52,079 ERROR [box,60000,1366281581983-BalancerChore] org.apache.hadoop.hbase.master.balancer.BalancerChore: Caught exception
> java.lang.NullPointerException
>       at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:145)
>       at org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer.balanceCluster(StochasticLoadBalancer.java:194)
>       at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1295)
>       at org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:48)
>       at org.apache.hadoop.hbase.Chore.run(Chore.java:81)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-04-18 08:09:52,103 DEBUG [box,60000,1366281581983-CatalogJanitor] org.apache.hadoop.hbase.client.ClientScanner: Creating scanner over .META. starting at key ''
> {noformat}
> {code}
>           if (regionFinder != null) {
>             //region location
>             List<ServerName> loc = regionFinder.getTopBlockLocations(region);
>             regionLocations[regionIndex] = new int[loc.size()];
>             for (int i=0; i < loc.size(); i++) {
>               regionLocations[regionIndex][i] = serversToIndex.get(loc.get(i));  // <========= NPE here
>             }
>           }
> {code}
> pinging [~enis], just in case.
