[ 
https://issues.apache.org/jira/browse/HBASE-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600499#comment-13600499
 ] 

Lars Hofhansl commented on HBASE-8055:
--------------------------------------

OK... I tracked down our problem here. In one of our tests, written by the 
platform team, we are creating HFiles directly via HFileWriter 
({{HFile.getWriterFactory(conf, new CacheConfig(conf))...create()}}).
That does not write the StoreFile metadata; we should have used 
StoreFile.Writer (via {{StoreFile.WriterBuilder}}).

Now, with that in mind, should I:
* Remove the null check added here (and all the other null checks for 
timeRangeTracker)? After all, without the NPE we would not have encountered the 
problem. ([~stack], I think that is what you were getting at, right?)
* Check the HFile during bulk load? (LoadIncrementalHFiles.groupOrSplit would be 
a reasonable spot.) It is not entirely clear, though, whether there are other 
situations in which we have HFiles without metadata.
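
To make the two options concrete, here is a minimal, self-contained sketch. It does not use the HBase classes: {{BulkLoadSketch}}, {{TimeRange}}, {{Reader}}, {{validateForBulkLoad}} and the {{"TIMERANGE"}} key are hypothetical stand-ins, and the file metadata is modelled as a plain map. It only illustrates the difference between tolerating a missing tracker at read time and rejecting the file up front.

```java
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the HBase classes.
final class BulkLoadSketch {
    // Assumed metadata key, standing in for whatever the store file writer records.
    static final String TIMERANGE_KEY = "TIMERANGE";

    /** Minimal model of a time range tracker: only the maximum timestamp. */
    static final class TimeRange {
        final long maximumTimestamp;

        TimeRange(long maximumTimestamp) {
            this.maximumTimestamp = maximumTimestamp;
        }
    }

    /** Minimal model of a reader whose tracker is null when the metadata is absent. */
    static final class Reader {
        final TimeRange timeRangeTracker;

        Reader(Map<String, Long> fileInfo) {
            Long max = fileInfo.get(TIMERANGE_KEY);
            this.timeRangeTracker = (max == null) ? null : new TimeRange(max);
        }

        // Lenient option: keep the null check and fall back to "no upper bound",
        // so a file without metadata is readable but never skipped by timestamp.
        long getMaxTimestamp() {
            return timeRangeTracker == null
                ? Long.MAX_VALUE
                : timeRangeTracker.maximumTimestamp;
        }
    }

    // Strict option: refuse the file at bulk-load time if the metadata is missing,
    // so the problem surfaces at load time rather than when a scanner is opened.
    static void validateForBulkLoad(Map<String, Long> fileInfo) {
        if (!fileInfo.containsKey(TIMERANGE_KEY)) {
            throw new IllegalArgumentException(
                "file is missing " + TIMERANGE_KEY + " metadata");
        }
    }
}
```

The two options are not mutually exclusive: a check at load time would catch files written without the metadata, while the null check would still cover any other path by which such a file ends up in a store.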

Lastly, how can we discourage the direct use of HFileWriter? In trunk, HFile and 
HFileWriterV* are tagged with {{@InterfaceAudience.Private}}; maybe that is good 
enough...?
                
> Null check missing in StoreFile.Reader.getMaxTimestamp()
> --------------------------------------------------------
>
>                 Key: HBASE-8055
>                 URL: https://issues.apache.org/jira/browse/HBASE-8055
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.95.0, 0.98.0, 0.94.7
>
>         Attachments: 8055-0.94.txt
>
>
> We just ran into a scenario where we got the following NPE:
> {code}
> 13/03/08 11:52:13 INFO regionserver.Store: Successfully loaded store file file:/tmp/hfile-import-00Dxx0000001lmJ-09Cxx00000000Jm/COLFAM/file09Cxx00000000Jm into store COLFAM (new location: file:/tmp/localhbase/data/SFDC.ENTITY_HISTORY_ARCHIVE/aeacee43aaf1748c6e60b9cc12bcac3d/COLFAM/120d683414e44478984b50ddd79b6826)
> 13/03/08 11:52:13 ERROR regionserver.HRegionServer: Failed openScanner
> java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.getMaxTimestamp(StoreFile.java:1702)
>     at org.apache.hadoop.hbase.regionserver.StoreFileScanner.requestSeek(StoreFileScanner.java:301)
>     at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:127)
>     at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:2070)
>     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.<init>(HRegion.java:3383)
>     at org.apache.hadoop.hbase.regionserver.HRegion.instantiateRegionScanner(HRegion.java:1628)
>     at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1620)
>     at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1596)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:2342)
>     at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1400)
> 13/03/08 11:52:14 ERROR regionserver.HRegionServer: Failed openScanner
> {code}
> It's not clear, yet, how we got into this situation (we are generating HFiles 
> via HFileOutputFormat and bulk load those). It seems that can only happen 
> when the HFile itself is corrupted.
> Looking at the code, though, I see this is the only place where we access 
> StoreFile.reader.timeRangeTracker without a null check. So it appears we are 
> expecting scenarios in which it can be null.
> A simple fix would be:
> {code}
>     public long getMaxTimestamp() {
>       return timeRangeTracker == null
>           ? Long.MAX_VALUE
>           : timeRangeTracker.maximumTimestamp;
>     }
> {code}
