Don't CLOSE region if message is not from server that opened it or is opening it
--------------------------------------------------------------------------------
Key: HBASE-549
URL: https://issues.apache.org/jira/browse/HBASE-549
Project: Hadoop HBase
Issue Type: Bug
Affects Versions: 0.16.0, 0.2.0, 0.1.1, 0.1.0
Reporter: stack
We assign a region to a server. It takes too long to open (HBASE-505), so the
region gets assigned to another server. Meanwhile the original host returns a
MSG_REPORT_CLOSE (because the other server opening the region messes it up by
moving files on disk out from under it). We queue a shutdown, which marks the
region as needing reassignment. The second server then reports that it
successfully opened the region. Master tells it that it should not have opened
it. Churn ensues.
The fix is to ignore the CLOSE if its reported server/startcode does not match
that of the server currently trying to open the region. The fix is not easy
because currently we don't keep a list of server info in unassigned regions.
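The check described above could look roughly like the following. This is only a sketch of the idea, not a patch against HMaster: the class PendingOpenTracker, its ServerId type, and the methods assign and shouldProcessClose are all hypothetical names, and the sketch assumes the master records which server/startcode a region was last handed to. It also assumes a CLOSE for a region with no recorded server is ignored.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch; none of these names come from the HBase source. */
class PendingOpenTracker {

    /** Identifies one region server instance: address plus start code. */
    static final class ServerId {
        final String address;
        final long startCode;

        ServerId(String address, long startCode) {
            this.address = address;
            this.startCode = startCode;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ServerId)) return false;
            ServerId other = (ServerId) o;
            return startCode == other.startCode && address.equals(other.address);
        }

        @Override
        public int hashCode() {
            return Objects.hash(address, startCode);
        }
    }

    /** Region name -> server most recently told to open it (assumed bookkeeping). */
    private final Map<String, ServerId> pendingOpens = new HashMap<>();

    /** Record that the region was just assigned to the given server. */
    void assign(String regionName, ServerId server) {
        pendingOpens.put(regionName, server);
    }

    /**
     * Returns true only if the CLOSE comes from the server currently
     * recorded for the region. A CLOSE from any other server/startcode
     * is treated as stale and should be ignored.
     */
    boolean shouldProcessClose(String regionName, ServerId reporter) {
        ServerId current = pendingOpens.get(regionName);
        return current != null && current.equals(reporter);
    }
}
```

With bookkeeping like this, the MSG_REPORT_CLOSE from XX.XX.XX.220 in the log below would be dropped, since by then the region had been assigned to XX.XX.XX.221.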
Here's a master log snippet showing the problem:
{code}
...
2008-03-25 19:16:43,711 INFO org.apache.hadoop.hbase.HMaster: assigning region
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server
XX.XX.XX.220:60020
2008-03-25 19:16:46,725 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
from XX.XX.XX.220:60020
2008-03-25 19:18:06,411 DEBUG org.apache.hadoop.hbase.HMaster: shutdown scanner
looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
2008-03-25 19:18:06,811 DEBUG org.apache.hadoop.hbase.HMaster: shutdown scanner
looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
2008-03-25 19:19:46,841 INFO org.apache.hadoop.hbase.HMaster: assigning region
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server
XX.XX.XX.221:60020
2008-03-25 19:19:49,849 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
from XX.XX.XX.221:60020
2008-03-25 19:19:56,883 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_CLOSE : enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 from
XX.XX.XX.220:60020
2008-03-25 19:19:56,883 INFO org.apache.hadoop.hbase.HMaster:
XX.XX.XX.220:60020 no longer serving regionname:
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482, startKey:
<iLStZ0yTnfVUziYcNVVxWV==>, endKey: <jLB27Q4hKls4tSvp64rJfF==>, encodedName:
1857033608, tableDesc: {name: enwiki_080103, families: {alternate_title:={name:
alternate_title, max versions: 3, compression: NONE, in memory: false, max
length: 2147483647, bloom filter: none}, alternate_url:={name: alternate_url,
max versions: 3, compression: NONE, in memory: false, max length: 2147483647,
bloom filter: none}, anchor:={name: anchor, max versions: 3, compression: NONE,
in memory: false, max length: 2147483647, bloom filter: none}, misc:={name:
misc, max versions: 3, compression: NONE, in memory: false, max length:
2147483647, bloom filter: none}, page:={name: page, max versions: 3,
compression: NONE, in memory: false, max length: 2147483647, bloom filter:
none}, redirect:={name: redirect, max versions: 3, compression: NONE, in memory:
false, max length: 2147483647, bloom filter: none}}}
2008-03-25 19:19:56,885 DEBUG org.apache.hadoop.hbase.HMaster: Main processing
loop: ProcessRegionClose of
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482, true, false
2008-03-25 19:19:56,885 INFO org.apache.hadoop.hbase.HMaster: region closed:
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
2008-03-25 19:19:56,887 INFO org.apache.hadoop.hbase.HMaster: reassign region:
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
2008-03-25 19:19:57,288 INFO org.apache.hadoop.hbase.HMaster: assigning region
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server
XX.XX.XX.189:60020
2008-03-25 19:20:00,296 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_PROCESS_OPEN : enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
from XX.XX.XX.189:60020
2008-03-25 19:20:16,885 DEBUG org.apache.hadoop.hbase.HMaster: Received
MSG_REPORT_OPEN : enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 from
XX.XX.XX.221:60020
2008-03-25 19:20:16,885 DEBUG org.apache.hadoop.hbase.HMaster: region server
XX.XX.XX.221:60020 should not have opened region
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
2008-03-25 19:23:51,707 DEBUG org.apache.hadoop.hbase.HMaster: shutdown scanner
looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
2008-03-25 19:23:51,834 DEBUG org.apache.hadoop.hbase.HMaster: shutdown scanner
looking at enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482
2008-03-25 19:23:53,947 INFO org.apache.hadoop.hbase.HMaster: assigning region
enwiki_080103,iLStZ0yTnfVUziYcNVVxWV==,1205393076482 to server XX.XX.XX.97:60020
...
{code}