[ 
https://issues.apache.org/jira/browse/HBASE-7513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-7513:
---------------------------------

    Fix Version/s: 0.94.5
    
> HDFSBlocksDistribution shouldn't send NPEs when something goes wrong
> --------------------------------------------------------------------
>
>                 Key: HBASE-7513
>                 URL: https://issues.apache.org/jira/browse/HBASE-7513
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.96.0, 0.94.4
>            Reporter: Jean-Daniel Cryans
>            Assignee: Elliott Clark
>            Priority: Minor
>             Fix For: 0.96.0, 0.94.5
>
>         Attachments: HBASE-7513-0.patch
>
>
> I saw a pretty weird failure on a cluster with corrupted files and this 
> particular exception really threw me off:
> {noformat}
> 2013-01-07 09:58:59,054 ERROR org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed open of region=redacted., starting to roll back the global memstore size.
> java.io.IOException: java.io.IOException: java.lang.NullPointerException: empty hosts
>       at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:548)
>       at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:461)
>       at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3814)
>       at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3762)
>       at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:332)
>       at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:108)
>       at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>       at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.lang.NullPointerException: empty hosts
>       at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:403)
>       at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:256)
>       at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2995)
>       at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:523)
>       at org.apache.hadoop.hbase.regionserver.HRegion$1.call(HRegion.java:521)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>       at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>       ... 3 more
> Caused by: java.lang.NullPointerException: empty hosts
>       at org.apache.hadoop.hbase.HDFSBlocksDistribution.addHostsAndBlockWeight(HDFSBlocksDistribution.java:123)
>       at org.apache.hadoop.hbase.util.FSUtils.computeHDFSBlocksDistribution(FSUtils.java:597)
>       at org.apache.hadoop.hbase.regionserver.StoreFile.computeHDFSBlockDistribution(StoreFile.java:492)
>       at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:521)
>       at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:602)
>       at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:380)
>       at org.apache.hadoop.hbase.regionserver.Store$1.call(Store.java:375)
>       ... 8 more
> 2013-01-07 09:58:59,059 INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Opening of region "redacted" failed, marking as FAILED_OPEN in ZK
> {noformat}
> This is what the code looks like:
> {code}
> if (hosts == null || hosts.length == 0) {
>  throw new NullPointerException("empty hosts");
> }
> {code}
> So {{hosts}} can be non-null but empty and we still throw an NPE? The 
> exception is then wrapped in {{Store}} by:
> {code}
> } catch (ExecutionException e) {
>   throw new IOException(e.getCause());
> {code}
> FWIW there's another NPE thrown in 
> {{HDFSBlocksDistribution.addHostAndBlockWeight}} and it looks wrong.
> We should change the code to just skip computing the locality if it's missing 
> and not throw big ugly exceptions. In this case the region would fail to open 
> later anyway, but at least the error message would be clearer.
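
To illustrate the change proposed in the description, here is a minimal sketch of "skip instead of throw". It is not the HBase class or the attached patch: `BlockDistributionSketch`, its fields, and the boolean return value are hypothetical names chosen for this example. Only the `hosts == null || hosts.length == 0` check comes from the quoted code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a block-locality tracker; not the HBase class. */
class BlockDistributionSketch {
    // Accumulated block weight (for example, bytes) per host name.
    private final Map<String, Long> hostWeights = new HashMap<>();
    // Sum of the weights of all blocks that were actually recorded.
    private long totalWeight = 0L;

    /**
     * Records a block of the given weight against each host.
     * Returns false and leaves the state unchanged when no host
     * information is available, instead of throwing.
     */
    boolean addHostsAndBlockWeight(String[] hosts, long weight) {
        if (hosts == null || hosts.length == 0) {
            // Missing locality data is treated as non-fatal: skip this block.
            return false;
        }
        totalWeight += weight;
        for (String host : hosts) {
            hostWeights.merge(host, weight, Long::sum);
        }
        return true;
    }

    /** Weight recorded for a host, or 0 if the host was never seen. */
    long getWeight(String host) {
        return hostWeights.getOrDefault(host, 0L);
    }

    long getTotalWeight() {
        return totalWeight;
    }
}
```

With this shape a caller can log a warning when the method returns false and carry on, so a file with missing block locations fails later with whatever error actually describes the corruption.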

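On the doubled message in the log: `new IOException(cause)` takes its detail message from `cause.toString()`, so each level of wrapping repeats the NPE text. The sketch below shows that behaviour and one possible alternative for the `ExecutionException` handler quoted above. `WrapSketch`, `describeWrapped`, and `runAndUnwrap` are hypothetical names for this example, not code from `Store` or from the patch.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper illustrating exception wrapping; not HBase code. */
class WrapSketch {
    /** Shows what a cause-only wrapper prints: the cause's toString() becomes the message. */
    static String describeWrapped(Throwable cause) {
        return new IOException(cause).toString();
    }

    /**
     * Runs the task on a worker thread. An IOException thrown by the task is
     * rethrown as-is; any other failure is wrapped with a descriptive message
     * and the original exception kept as the cause.
     */
    static <T> T runAndUnwrap(Callable<T> task, String what) throws IOException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = pool.submit(task);
            return future.get();
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof IOException) {
                throw (IOException) cause;
            }
            throw new IOException("Failed to " + what, cause);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure.
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while trying to " + what, e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `WrapSketch.describeWrapped(new NullPointerException("empty hosts"))` yields the same `java.io.IOException: java.lang.NullPointerException: empty hosts` text seen in the trace, whereas `runAndUnwrap` puts the caller's own description in the message and leaves the NPE in the cause chain.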