On 3/6/11 10:08 AM, Marc Limotte wrote:
1. split started on region A
2. region A was offlined
3. The daughter regions were created in HDFS with the reference files
4. .META. was updated for region A
5. **** server crashed
So, the new daughter entries were never added to .META.
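To make the resulting state concrete, here is a toy Python model of the sequence above. It is only an illustration under stated assumptions: the dictionary layout, the step names, the daughter names A1/A2, and the functions `run_split` and `serving_regions` are all made up for this sketch and are not HBase code or the real .META. schema.

```python
# Toy model of the split sequence described above; NOT HBase code.
# Step names and the state layout are hypothetical, for illustration only.
STEPS = [
    "offline_parent",         # step 2: region A taken offline
    "create_daughters_fs",    # step 3: daughter dirs + reference files in HDFS
    "mark_parent_split",      # step 4: parent row in .META. updated
    "add_daughters_to_meta",  # never reached when the server crashed
]


def run_split(crash_after=None):
    """Apply the steps in order, stopping after `crash_after` if given."""
    state = {
        "meta": {"A": {"offline": False, "split": False}},
        "hdfs": {"A"},
    }
    for step in STEPS:
        if step == "offline_parent":
            state["meta"]["A"]["offline"] = True
        elif step == "create_daughters_fs":
            state["hdfs"] |= {"A1", "A2"}
        elif step == "mark_parent_split":
            state["meta"]["A"]["split"] = True
        elif step == "add_daughters_to_meta":
            state["meta"]["A1"] = {"offline": False, "split": False}
            state["meta"]["A2"] = {"offline": False, "split": False}
        if step == crash_after:
            break
    return state


def serving_regions(state):
    """Regions in the toy .META. that are neither offline nor split."""
    return sorted(
        name for name, info in state["meta"].items()
        if not info["offline"] and not info["split"]
    )


crashed = run_split(crash_after="mark_parent_split")
print(serving_regions(crashed))                         # regions able to serve
print(sorted(crashed["hdfs"] - set(crashed["meta"])))   # in HDFS, not in .META.
```

In this model a crash right after step 4 leaves no region serving A's key range, while the daughters exist only on the filesystem side, which matches what we saw.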
We first tried to online region A with the shell command "assign", figuring
that HBase would just find and split region A again. This seemed to have no
effect. We're not sure why; maybe because region A already had splitA/B
entries? Region A remained offline. We also tried to force a split of
region A using the shell command "split". Again, no effect.
Finally, we tried to manually complete the split that had started. Peter
manually inserted the two daughter regions into .META. We then tried to
force a compaction from the shell, which failed with a
NotServingRegionException (NSRE). So we onlined region A with the "assign"
command, and it worked this time. Now we seem to be up again: compaction
works, data loads work, and hbck checks out!

It looks like we've hit a problem that is identical to the one you
described. How did you go about manually inserting the two child regions?
Our current thought on fixing it is to use the hbase shell to remove the
entries for the child regions and rewrite the region's entry so that
OFFLINE => false and SPLIT => false (i.e. both are currently true), but
we're not sure if that's a good solution.
*** .META. info for the problem region ***
domains,1932334:2011/02/18/03:com.photobucket.i654,1299792156289.3824e8b8310176b6f3c2a1d3f3e708dc.

  column=info:regioninfo, timestamp=1300387322414, value=REGION => {NAME =>
  'domains,1932334:2011/02/18/03:com.photobucket.i654,1299792156289.3824e8b8310176b6f3c2a1d3f3e708dc.',
  STARTKEY => '1932334:2011/02/18/03:com.photobucket.i654', ENDKEY =>
  '1933201:2011/03/02/09:org.wikipedia.af', ENCODED =>
  3824e8b8310176b6f3c2a1d3f3e708dc, OFFLINE => true, SPLIT => true, TABLE =>
  {{NAME => 'domains', FAMILIES => [{NAME => 'handling', BLOOMFILTER =>
  'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'LZO', VERSIONS => '1',
  TTL => '1000000000', BLOCKSIZE => '65536', IN_MEMORY => 'false',
  BLOCKCACHE => 'true'}]}}

  column=info:server, timestamp=1300757176451,
  value=s8.sjc.opendns.com:60020

  column=info:serverstartcode, timestamp=1300757176451,
  value=1300752817197

  column=info:splitA, timestamp=1300387322414, value=REGION => {NAME =>
  'domains,1932334:2011/02/18/03:com.photobucket.i654,1300387311068.3fbd783ab2a3de505fd5607748d82ec7.',
  STARTKEY => '1932334:2011/02/18/03:com.photobucket.i654', ENDKEY =>
  '1932968:2010/11/10/12:com.twitter', ENCODED =>
  3fbd783ab2a3de505fd5607748d82ec7, TABLE => {{NAME => 'domains', FAMILIES =>
  [{NAME => 'handling', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
  COMPRESSION => 'LZO', VERSIONS => '1', TTL => '1000000000', BLOCKSIZE =>
  '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}

  column=info:splitB, timestamp=1300387322414, value=REGION => {NAME =>
  'domains,1932968:2010/11/10/12:com.twitter,1300387311068.6e95a3361da531a57b5883014c047cdc.',
  STARTKEY => '1932968:2010/11/10/12:com.twitter', ENDKEY =>
  '1933201:2011/03/02/09:org.wikipedia.af', ENCODED =>
  6e95a3361da531a57b5883014c047cdc, TABLE => {{NAME => 'domains', FAMILIES =>
  [{NAME => 'handling', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
  COMPRESSION => 'LZO', VERSIONS => '1', TTL => '1000000000', BLOCKSIZE =>
  '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
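
As a sanity check on the dump above, the splitA and splitB key ranges do appear to tile the parent's range exactly. A minimal Python sketch of that check follows, with the keys copied from the dump. The helper name `daughters_cover_parent` is just for illustration, and the sketch assumes plain byte-wise key ordering and non-empty end keys (it does not handle the empty end key of a table's last region).

```python
# Keys copied from the .META. row above, as (STARTKEY, ENDKEY) byte strings.
PARENT = (b"1932334:2011/02/18/03:com.photobucket.i654",
          b"1933201:2011/03/02/09:org.wikipedia.af")
SPLIT_A = (b"1932334:2011/02/18/03:com.photobucket.i654",
           b"1932968:2010/11/10/12:com.twitter")
SPLIT_B = (b"1932968:2010/11/10/12:com.twitter",
           b"1933201:2011/03/02/09:org.wikipedia.af")


def daughters_cover_parent(parent, a, b):
    """True if a and b are adjacent, non-empty, and together span parent.

    Hypothetical helper: compares keys byte-wise and assumes every end key
    is non-empty.
    """
    same_start = a[0] == parent[0]
    adjacent = a[1] == b[0]
    same_end = b[1] == parent[1]
    ordered = a[0] < a[1] < b[1]
    return same_start and adjacent and same_end and ordered


print(daughters_cover_parent(PARENT, SPLIT_A, SPLIT_B))
```

So the daughter entries recorded on the parent row look internally consistent; what is missing is their own rows in .META.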
- Adam