Josh Elser created HBASE-20792:
----------------------------------

             Summary: info:servername and info:sn inconsistent for OPEN region
                 Key: HBASE-20792
                 URL: https://issues.apache.org/jira/browse/HBASE-20792
             Project: HBase
          Issue Type: Bug
          Components: Region Assignment
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 2.0.2

Next problem we've run into after HBASE-20752 and HBASE-20708.

After a rolling restart of a cluster, we'll see situations where a collection of regions will simply not be assigned out to the RS. I was able to reproduce this by mimicking the restart patterns our tests do internally (ignore whether this is the best way to restart nodes for now :)). The general pattern is this:

{code:java}
for rs in regionservers:
  stop(server, rs, RS)
for master in masters:
  stop(server, master, MASTER)
sleep(15)
for master in masters:
  start(server, master, MASTER)
for rs in regionservers:
  start(server, rs, RS)
{code}

Looking at meta, we can see why the Master is ignoring some regions:

{noformat}
 test column=table:state, timestamp=1529871718998, value=\x08\x00
 test,,1529871718122.0297f680df6dc0166a44f9536346268e. column=info:regioninfo, timestamp=1529967103390, value={ENCODED => 0297f680df6dc0166a44f9536346268e, NAME => 'test,,1529871718122.0297f680df6dc0166a44f9536346268e.', STARTKEY => '', ENDKEY => ''}
 test,,1529871718122.0297f680df6dc0166a44f9536346268e. column=info:seqnumDuringOpen, timestamp=1529967103390, value=\x00\x00\x00\x00\x00\x00\x00*
 test,,1529871718122.0297f680df6dc0166a44f9536346268e. column=info:server, timestamp=1529967103390, value=ctr-e138-1518143905142-378097-02-000012.hwx.site:16020
 test,,1529871718122.0297f680df6dc0166a44f9536346268e. column=info:serverstartcode, timestamp=1529967103390, value=1529966776248
 test,,1529871718122.0297f680df6dc0166a44f9536346268e. column=info:sn, timestamp=1529967096482, value=ctr-e138-1518143905142-378097-02-000006.hwx.site,16020,1529966755170
 test,,1529871718122.0297f680df6dc0166a44f9536346268e. column=info:state, timestamp=1529967103390, value=OPEN
{noformat}

The region is marked as {{OPEN}}. The master doesn't know any better.

However, the interesting bit is that {{info:server}} and {{info:sn}} are inconsistent (which, according to the javadoc, should not be possible for an {{OPEN}} region).

This doesn't happen every time, but I caught it yesterday on the 2nd or 3rd attempt, so I'm hopeful it's not a bear to repro.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
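
For anyone trying to spot affected rows in a meta dump, here is a minimal Python sketch of the comparison between the location columns reported in this issue. It assumes that {{info:sn}} uses the {{host,port,startcode}} form seen in the output in this report, and that {{info:server}} plus {{info:serverstartcode}} are expected to describe the same server for an {{OPEN}} region. The function names and the dict representation of a row are hypothetical, for illustration only; they are not HBase APIs.

```python
from typing import Mapping, Optional


def server_name_from_location(row: Mapping[str, str]) -> Optional[str]:
    """Build 'host,port,startcode' from info:server and info:serverstartcode.

    Hypothetical helper: returns None if either column is missing.
    """
    server = row.get("info:server")
    startcode = row.get("info:serverstartcode")
    if not server or not startcode:
        return None
    # info:server is 'host:port'; split on the last colon.
    host, _, port = server.rpartition(":")
    return f"{host},{port},{startcode}"


def is_location_consistent(row: Mapping[str, str]) -> bool:
    """Return False only when an OPEN row's two location records disagree."""
    if row.get("info:state") != "OPEN":
        return True  # assumption: this sketch only checks OPEN regions
    return server_name_from_location(row) == row.get("info:sn")


# Column values copied from the meta output in this report.
row = {
    "info:server": "ctr-e138-1518143905142-378097-02-000012.hwx.site:16020",
    "info:serverstartcode": "1529966776248",
    "info:sn": "ctr-e138-1518143905142-378097-02-000006.hwx.site,16020,1529966755170",
    "info:state": "OPEN",
}

print(is_location_consistent(row))  # prints False for the row above
```

Running the check over every region row of a meta scan would list the regions stuck in this state; how the rows are fetched is left out on purpose.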