[ https://issues.apache.org/jira/browse/HBASE-3524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993653#comment-12993653 ]

James Kennedy commented on HBASE-3524:
--------------------------------------

So that .meta file with DATA LOSS is definitely old (2010-05-20).
Looking back over old logs I realized that the DATA LOSS WARN has been there for
a while, so that is probably a separate issue from this CompactionChecker
problem. I'll just delete the file in HDFS.

So, it looks like my data is stable now after the forced compactions. I didn't 
have to apply the patch in production code to stop the NPEs.

I'm still concerned about how this happened to some regions and not others,
since all were left up long enough to reach the NPE point, which only
prevented the first post-0.90.0-upgrade full compactions for 8 out of 50
tables. Maybe the other 42 were updated as part of the initial startup
process...

> NPE from CompactionChecker
> --------------------------
>
>                 Key: HBASE-3524
>                 URL: https://issues.apache.org/jira/browse/HBASE-3524
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.90.0
>            Reporter: James Kennedy
>            Assignee: James Kennedy
>            Priority: Blocker
>             Fix For: 0.90.1, 0.90.2
>
>
> I recently updated production data to use HBase 0.90.0.
> Now I'm periodically seeing:
> [10/02/11 17:23:27] 30076066 [mpactionChecker] ERROR nServer$MajorCompactionChecker  - Caught exception
> java.lang.NullPointerException
>       at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:832)
>       at org.apache.hadoop.hbase.regionserver.Store.isMajorCompaction(Store.java:810)
>       at org.apache.hadoop.hbase.regionserver.HRegion.isMajorCompaction(HRegion.java:2800)
>       at org.apache.hadoop.hbase.regionserver.HRegionServer$MajorCompactionChecker.chore(HRegionServer.java:1047)
>       at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
> The only negative effect is that this prevents compactions from happening.
> But that is pretty serious, and it might be a sign of data corruption.
> Maybe it's just my data, but this task should at least involve improving the
> error handling to catch the NPE and still iterate through the other
> onlineRegions that might compact without error. The
> MajorCompactionChecker.chore() method only catches IOExceptions, so this NPE
> breaks out of that loop.
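The hardening suggested above, catching the exception per region so the remaining onlineRegions still get checked, can be sketched roughly as follows. This is a standalone illustration, not the actual HBase source: the class name, `checkRegions`, and the use of `BooleanSupplier` to stand in for a region's `isMajorCompaction()` call are all hypothetical.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch (names are illustrative, not HBase code): iterate
// regions and catch RuntimeException per region, so one broken region
// cannot abort the whole compaction-check loop.
public class CompactionCheckSketch {

    // Returns how many regions were checked without error.
    static int checkRegions(List<BooleanSupplier> regions) {
        int ok = 0;
        for (BooleanSupplier isMajorCompaction : regions) {
            try {
                isMajorCompaction.getAsBoolean(); // may throw NPE for a bad region
                ok++;
            } catch (RuntimeException e) {
                // If only IOException were caught here, an NPE would
                // propagate and skip all remaining regions in the loop.
                System.err.println("Skipping region: " + e);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        List<BooleanSupplier> regions = List.of(
            () -> true,                                  // healthy region
            () -> { throw new NullPointerException(); }, // broken region
            () -> false                                  // healthy region
        );
        // The two healthy regions are still checked despite the NPE.
        System.out.println(checkRegions(regions));
    }
}
```

With the original IOException-only catch, the second region's NPE would end the chore iteration; catching RuntimeException as well lets the checker log and move on.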

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
