We recently had some issues with our namenodes in an HA
arrangement and had to recreate the namenode metadata from a backup, while some
new data had been pushed to the region servers in the meantime. We're on
HBase 0.98.6.

After launching the cluster again, we realised that we're missing
roughly 8,000 of 190,000 blocks. The fsck output shows the following for what
looks like a continuous stream of regions:

/hbase/data/default/table/ffa95306f599dbff99497e71841724fe/processed/35186fe43fed47989ddb4ace3648b109:
MISSING 1 blocks of total size 929610 B...
/hbase/data/default/table/ffa95306f599dbff99497e71841724fe/processed/bd41ca895f3749188c08dd2e540bc127:
CORRUPT blockpool BP-2037521063-<IP>-1418127576413 block blk_1076077966
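
To get an overview of which regions are hit, the fsck output can be grouped by region directory. Below is a minimal sketch in Python, assuming the path layout shown above (/hbase/data/<namespace>/<table>/<region>/<family>/<file>, with the path followed by a colon). The function name and the regular expression are my own illustration, not part of any Hadoop or HBase tooling.

```python
import re
from collections import defaultdict

# Assumed layout: /hbase/data/<namespace>/<table>/<region>/<family>/<file>:
# The region directory is taken to be a 32-character hex name, as in the
# fsck excerpt above.
HFILE_PATH = re.compile(
    r"^(/hbase/data/[^/\s]+/[^/\s]+/(?P<region>[0-9a-f]{32})/[^/\s]+/[^/\s:]+):"
)


def affected_files_by_region(fsck_text):
    """Group the file paths reported by fsck by their region directory name."""
    regions = defaultdict(list)
    for line in fsck_text.splitlines():
        match = HFILE_PATH.match(line.strip())
        if match is None:
            continue  # status lines (MISSING/CORRUPT) and anything else
        path = match.group(1)
        files = regions[match.group("region")]
        if path not in files:  # fsck may mention the same file more than once
            files.append(path)
    return dict(regions)


# The two entries quoted above, used as sample input.
sample = """\
/hbase/data/default/table/ffa95306f599dbff99497e71841724fe/processed/35186fe43fed47989ddb4ace3648b109:
MISSING 1 blocks of total size 929610 B...
/hbase/data/default/table/ffa95306f599dbff99497e71841724fe/processed/bd41ca895f3749188c08dd2e540bc127:
CORRUPT blockpool BP-2037521063-<IP>-1418127576413 block blk_1076077966
"""

for region, files in affected_files_by_region(sample).items():
    print(region, len(files))
```

For the two entries above this reports a single region with two affected files; run over the full fsck output it would show how many regions and store files are involved.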

I did not want to run fsck -delete, and hbck complains, reporting missing
blocks, because the affected files cannot be assigned to region servers.

The total size of this table is about 22 TB on HDFS, and recreating it would
be quite a drag (pushing it from our previous HBase cluster took about a
month). Is there any known way of dealing with such a situation?

Mateusz Kaczyński
