On 3/6/11 10:08 AM, Marc Limotte wrote:
    1. split started on region A
    2. region A was offlined
    3. The daughter regions were created in HDFS with the reference files
    4. .META. was updated for region A
    5. **** server crashed

So, the new daughter entries were never added to .META.

We first tried to online region A with the shell command "assign", figuring
that HBase would just find and split region A again.  This seemed to have no
effect... not sure why, maybe because region A already had splitA/B
entries?  Region A remained offline.  We also tried to force it to split
region A, using the shell command "split".  Again, no effect.

Finally, we tried to manually complete the split that had started.  Peter
manually inserted the two daughter regions into .META.  We then tried to
force a compaction from the shell; this failed with an NSRE
(NotServingRegionException).  So we onlined region A with the "assign"
command, and it worked this time.  And now we seem to be up again: compact
works, data loads work, hbck checks out!
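
The stuck state described above can be illustrated with a toy model of .META. as a plain dictionary. This is only a sketch: `find_missing_daughters` and the row layout are hypothetical, nothing here calls the HBase client API, and it only mirrors the check that a parent marked SPLIT has catalog rows for the daughters it references.

```python
# Toy model of .META.: row key -> dict of "info:" columns. Not the HBase API.

def find_missing_daughters(meta):
    """Return (parent, daughter) pairs where a split parent references a
    daughter region that has no row of its own in the catalog."""
    missing = []
    for row_key, cols in meta.items():
        info = cols.get("regioninfo", {})
        if not info.get("SPLIT"):
            continue  # only split parents carry splitA/splitB references
        for ref in ("splitA", "splitB"):
            daughter = cols.get(ref)
            if daughter is not None and daughter not in meta:
                missing.append((row_key, daughter))
    return missing


# State after the crash at step 5: parent updated, daughters never added.
meta = {
    "regionA": {
        "regioninfo": {"OFFLINE": True, "SPLIT": True},
        "splitA": "daughterA",
        "splitB": "daughterB",
    },
}
stuck = find_missing_daughters(meta)

# Completing the split by hand corresponds to adding the daughter rows.
for _, daughter in stuck:
    meta[daughter] = {"regioninfo": {"OFFLINE": False, "SPLIT": False}}
repaired = find_missing_daughters(meta)
```

In this model, `stuck` lists both daughters of region A, and `repaired` is empty once their rows exist. On a real cluster the equivalent consistency check is what hbck reports.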

We've come up against a problem that looks identical to the one you described. How did you go about manually inserting the two child regions?

Our current thought on fixing it is to use the hbase shell to remove the entries for the child regions and rewrite the region's entry so that OFFLINE => false and SPLIT => false (i.e. both are currently true), but we're not sure if that's a good solution.

*** .META. info for the problem region ***

ROW: domains,1932334:2011/02/18/03:com.photobucket.i654,1299792156289.3824e8b8310176b6f3c2a1d3f3e708dc.

column=info:regioninfo, timestamp=1300387322414, value=REGION => {NAME => 'domains,1932334:2011/02/18/03:com.photobucket.i654,1299792156289.3824e8b8310176b6f3c2a1d3f3e708dc.', STARTKEY => '1932334:2011/02/18/03:com.photobucket.i654', ENDKEY => '1933201:2011/03/02/09:org.wikipedia.af', ENCODED => 3824e8b8310176b6f3c2a1d3f3e708dc, OFFLINE => true, SPLIT => true, TABLE => {{NAME => 'domains', FAMILIES => [{NAME => 'handling', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '1', TTL => '1000000000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}

column=info:server, timestamp=1300757176451, value=s8.sjc.opendns.com:60020

column=info:serverstartcode, timestamp=1300757176451, value=1300752817197

column=info:splitA, timestamp=1300387322414, value=REGION => {NAME => 'domains,1932334:2011/02/18/03:com.photobucket.i654,1300387311068.3fbd783ab2a3de505fd5607748d82ec7.', STARTKEY => '1932334:2011/02/18/03:com.photobucket.i654', ENDKEY => '1932968:2010/11/10/12:com.twitter', ENCODED => 3fbd783ab2a3de505fd5607748d82ec7, TABLE => {{NAME => 'domains', FAMILIES => [{NAME => 'handling', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '1', TTL => '1000000000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}

column=info:splitB, timestamp=1300387322414, value=REGION => {NAME => 'domains,1932968:2010/11/10/12:com.twitter,1300387311068.6e95a3361da531a57b5883014c047cdc.', STARTKEY => '1932968:2010/11/10/12:com.twitter', ENDKEY => '1933201:2011/03/02/09:org.wikipedia.af', ENCODED => 6e95a3361da531a57b5883014c047cdc, TABLE => {{NAME => 'domains', FAMILIES => [{NAME => 'handling', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '1', TTL => '1000000000', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
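
For reading the region names in the output above, here is a minimal sketch that splits a name into its parts. It assumes the layout `<table>,<start key>,<region id>.<encoded name>.` as it appears in this dump; `RegionName` and `parse_region_name` are hypothetical helpers, not part of HBase.

```python
from typing import NamedTuple


class RegionName(NamedTuple):
    table: str
    start_key: str
    region_id: int
    encoded: str


def parse_region_name(name: str) -> RegionName:
    # Assumed layout: <table>,<start key>,<region id>.<encoded name>.
    # The start key may itself contain commas, so split from both ends.
    table, rest = name.split(",", 1)
    start_key, tail = rest.rsplit(",", 1)
    region_id, encoded = tail.rstrip(".").split(".", 1)
    return RegionName(table, start_key, int(region_id), encoded)


# The parent region from the .META. row above.
parent = parse_region_name(
    "domains,1932334:2011/02/18/03:com.photobucket.i654,"
    "1299792156289.3824e8b8310176b6f3c2a1d3f3e708dc."
)
```

The encoded part should match the ENCODED field of the same regioninfo value, which gives a quick sanity check before hand-editing any catalog row.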


- Adam
