I have a table that blew up in the number of regions, I fixed the configuration and now need to bring the regions back down to our desired number of 16. I wrote a program to do exactly that and it worked when tested on a single node cluster, but now in production I get perpetually stuck on a merge with the log statement in the HMaster log below.
Typically this is the sequence of events: Table has the following - RegionA RegionB RegionC RegionMergedA = merge(RegionA, RegionB) RegionMergeB = merge(RegionMergedA, RegionC) Then the logs just get stuck repeating the following, where RegionMergeB = 190d50047a820d9b3ef588429c9065ea The perpetual statement on the master node. I waited 40 minutes to see if it cleared to no avail. 9:50:58.059 AM INFO org.apache.hadoop.hbase.master.handler.DispatchMergingRegionHandler Skip merging regions registry.medical.records,,1403961571625.190d50047a820d9b3ef588429c9065ea., registry.medical.records,\x0D\xFC\xE0\xD3\xFA\xF5FJ\x9D\xC6F\xD3\xE19*\xEAWLAB115\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x16%\xB6,1400656218257.11a7c474dd162ae4d7fc4a5573dda858., because region 190d50047a820d9b3ef588429c9065ea has merge qualifier The logs on the regionserver show a successful merge. Regions merged, hbase:meta updated, and report to master. region_a=registry.medical.records,,1403959070153.fa5121800ef7841875a153b3fef48d95., region_b=registry.medical.records,\x0C\xEE\xA7\x0CbNE\xA3\xAA\xC1g\x05\x86\xFA\xF3\xFEWLAB169\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x94_(,1400656255740.e750ecca47b5dd64060d7bc40e0c0a57.,merged region=registry.medical.records,,1403961571625.190d50047a820d9b3ef588429c9065ea.. Region merge took 0sec How do I clear the merge qualifier on the region that has been merged? I tried: *admin.runCatalogScan(); *table.clearRegionCache(); This also happens when I try to merge on the shell: merge_region '190d50047a820d9b3ef588429c9065ea', '11a7c474dd162ae4d7fc4a5573dda858'