By looking at the HFile content, I can see that the information displayed on
the WebUI is not correct.
The last key printed by HFilePrettyPrinter is K:
\xFF\xFF\xFF\xFE\x00\x00\x00\x00....

The region after this one is reported by the same tool as having:
    firstKey=\xF5\x9BB\xF4\x00\x00\x00\x00...
    lastKey=\xFF\xFF\xFF`\x00\x00\x00\x00...

And the concerned region:
    firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
    lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...

Which means I have an overlap between the 2.
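
To make the overlap concrete, here is a rough Python sketch (not HBase code) that parses the escaped key prefixes shown above and compares them as unsigned bytes, which is how HBase orders row keys. The helper names `parse_binary` and `ranges_overlap` are made up for this illustration, and only the visible key prefixes are used, since the real keys are truncated in the tool output.

```python
def parse_binary(s: str) -> bytes:
    """Turn an escaped key such as r'\\xF5\\x9BB\\xF4' into raw bytes.

    Assumes the escaping seen in the output above: \\xNN for a raw byte,
    any other character stands for its own ASCII code.
    """
    out = bytearray()
    i = 0
    while i < len(s):
        if s[i] == "\\" and i + 3 < len(s) and s[i + 1] == "x":
            out.append(int(s[i + 2:i + 4], 16))
            i += 4
        else:
            out.append(ord(s[i]))
            i += 1
    return bytes(out)


def ranges_overlap(a, b):
    """True if the closed key ranges a=(first, last) and b=(first, last) intersect.

    Python compares bytes lexicographically as unsigned values.
    """
    return a[0] <= b[1] and b[0] <= a[1]


# Key prefixes copied from the HFile output above (truncated keys, so prefixes only).
region_a = (parse_binary(r"\xF5\x9A\xEA&\x00\x00\x00\x00"),
            parse_binary(r"\xFF\xFF\xFF\xFE\x00\x00\x00\x00"))
region_b = (parse_binary(r"\xF5\x9BB\xF4\x00\x00\x00\x00"),
            parse_binary(r"\xFF\xFF\xFF`\x00\x00\x00\x00"))

print(ranges_overlap(region_a, region_b))
```

With these prefixes the second region's first key sorts below the first region's last key, which is exactly the overlap described above.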

So now, what are the options? The facts so far:

1) HBCK doesn't report any issue.
2) HFile reports the right key information.
3) WebUI doesn't report the right information.

Since the WebUI displays the information based on META, my best guess
is that the META content is not correct. So I can "simply" remove the
region's entry and let HBCK repair that. Another option might be to copy
the files from the 2nd region to the 1st one as another store and
re-compact the 2 together?

Should we have something to detect such region overlaps, or a disconnect
between META and the HFiles? I will not do anything for now because I
want to know your opinion, but I think we should at least have something to
detect that in HBCK, and most probably something to fix it too.
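
As a rough idea of what such a check could look like, here is a small Python sketch. It is not how HBCK works today and it does not use any HBase API. It only tests whether the first and last keys of a store file fall inside the [startKey, endKey) range that META has for the region. The function names are hypothetical, and the keys are the truncated prefixes from the output quoted below, so this is an illustration under those assumptions only.

```python
def row_in_region(row: bytes, start_key: bytes, end_key: bytes) -> bool:
    """True if row is inside [start_key, end_key).

    An empty end_key means the region is the last one and has no upper bound.
    """
    return row >= start_key and (end_key == b"" or row < end_key)


def hfile_fits_region(first_key, last_key, start_key, end_key) -> bool:
    """True if both ends of the HFile key range are inside the region boundaries."""
    return (row_in_region(first_key, start_key, end_key)
            and row_in_region(last_key, start_key, end_key))


# Region boundaries as shown by the WebUI / META (prefixes only).
meta_start = b"\xF5\x9A\xEA&\x00\x00\x00\x00"
meta_end = b"\xF5\x9B@}\x00\x00\x00\x00"

# Keys reported by the HFile tool for the single store file (prefixes only).
hfile_first = b"\xF5\x9A\xEA&\x00\x00\x00\x00"
hfile_last = b"\xFF\xFF\xFF\xFE\x00\x00\x00\x00"

# Row returned as the split point in the exception (prefix only).
mid_key = b"\xFA\xCDH?\x00\x00\x00\x00"

print(hfile_fits_region(hfile_first, hfile_last, meta_start, meta_end))
print(row_in_region(mid_key, meta_start, meta_end))
```

A real check would have to handle cases this sketch ignores, but even this simple comparison flags the region above.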

JM



2013/8/24 Jean-Marc Spaggiari <jean-m...@spaggiari.org>

> (I have added line feeds to make it easier to read)
> org.apache.hadoop.hbase.regionserver.WrongRegionException: Requested row
> out of range for calculated split on HRegion
> work_proposed,\xF5\x9A\xEA&\x00\x00\x00\x00
> http://video.mindentimes.ca/search/all/source/qmi-agency/kanye-west-spending-100000-on-private-flights-to-see-pregnant-kim-kardashian/2319156767001/page/16,1376139517597.b39bf00b980b632901859761caafb9d0.,
>
> startKey   ='\xF5\x9A\xEA&\x00\x00\x00\x00
> http://video.mindentimes.ca/search/all/source/qmi-agency/kanye-west-spending-100000-on-private-flights-to-see-pregnant-kim-kardashian/2319156767001/page/16',
>
> getEndKey()='\xF5\x9B@}\x00\x00\x00\x00
> http://fr.video.sympatico.ca/accueil/les-plus-populaires/watch/kim-kardashian-rit-des-rumeurs-dinfidelite/2477090497001?sort=date&filter=Splash&page=5',
>
> row='\xFA\xCDH?\x00\x00\x00\x00http://www.futur.....
>
> Start key is xF5 x9A xEA
> End key is xF5 x9B x40
>
> But I'm getting xFA xCD as the mid key... Which is not in the range.
>
> MidKey definition:
>
>      * An approximation to the {@link HFile}'s mid-key. Operates on block
>      * boundaries, and does not go inside blocks. In other words, returns
>      * the first key of the middle block of the file
>
> Does it mean that the blocks in my HFile are not correctly ordered? I
> have just one store file for this region.
>
> If I run  bin/hbase org.apache.hadoop.hbase.io.hfile.HFile on this region,
> I get this:
>
>     firstKey=\xF5\x9A\xEA&\x00\x00\x00\x00...
>     lastKey=\xFF\xFF\xFF\xFE\x00\x00\x00\x00...
>
> But from the WebUI, I have those 2 regions at the end:
> work_proposed,\xF5\x9A\xEA&\x00\x00\x00\x00... buldo:60030
> \xF5\x9A\xEA&\x00\x00\x00\x00h... \xF5\x9B@}\x00\x00\x00\x00... 0
> work_proposed,\xF5\x9B@}\x00\x00\x00\x00...
> Which is the same as what I got in the logs, but not the same as what the
> HFilePrettyPrinter is giving me. The provided midkey is fine if we consider
> the output of the HFilePrettyPrinter, but wrong if we consider the WebUI.
>
>
> http://pastebin.com/dmtAnQtF
> Version: 0.94.12-SNAPSHOT, but I have been facing this for weeks now, so it is not new.
>
> I will continue to investigate. I will most probably try to print the 58M
> keys in the HFile to see who's right and who's wrong, and why the two
> sources disagree. I might also drop the entry in META to let
> HBCK rebuild it based on the HDFS file and see...
>
> All the ideas are welcome.
>
> JM
>
