[
https://issues.apache.org/jira/browse/HBASE-471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12575555#action_12575555
]
stack commented on HBASE-471:
-----------------------------
So, the working theory is that it's the order in which we update .META. on
split. We first edit the parent, marking it offline, etc. Then we add the
lower daughter and then the upper daughter. If a client comes in while the
parent is offline but before the lower daughter has been added, it gets an
IllegalStateException (ISE, "Region is offline"). If the client comes in after
the parent has been offlined and after the lower daughter has been added BUT
before the upper daughter has been added, requesting a row from the top half
of the split, it gets a WrongRegionException (WRE).
Working on a test to prove my theory.
If the theory is correct, the fix is to reverse the order in which we add
regions. If we add the top half first and there is a request for a key from
the top half, the client gets the newly added region. If the request is for a
key from the lower half, the client gets the parent. On trying to get the key
from the parent, it will get a NotServingRegionException (NSRE) or find the
region closed, and will retry. In the meantime the lower half should have been
added. Then it's OK to offline the parent.
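
To make the two orderings concrete, here is a toy model of the theory. It is
only a sketch under stated assumptions, not the test mentioned above and not
HBase code: the class MetaSplitOrder, the Region type, the single-letter keys
and the "startKey,regionId" catalog keys are all invented for illustration.
A TreeMap stands in for .META., and floorEntry() stands in for the
"closest row at or before" catalog lookup.

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaSplitOrder {
    /** Simplified stand-in for one catalog row (hypothetical, not HRegionInfo). */
    static final class Region {
        final String start;    // inclusive start key
        final String end;      // exclusive end key
        final boolean offline; // offline flag as stored in the catalog
        final boolean serving; // whether a region server still serves it

        Region(String start, String end, boolean offline, boolean serving) {
            this.start = start;
            this.end = end;
            this.offline = offline;
            this.serving = serving;
        }
    }

    /** Finds the closest catalog row at or before the probe, then classifies it. */
    static String lookup(TreeMap<String, Region> meta, String row) {
        // '~' sorts after digits, so the probe lands on the newest id for a start key.
        Map.Entry<String, Region> e = meta.floorEntry(row + ",~");
        if (e == null) return "NOT_FOUND";
        Region r = e.getValue();
        if (r.offline) return "ISE";
        if (row.compareTo(r.start) < 0 || row.compareTo(r.end) >= 0) return "WRE";
        if (!r.serving) return "NSRE";
        return "OK";
    }

    /** Looks up one row from the lower half ("c") and one from the upper half ("r"). */
    static String probe(TreeMap<String, Region> meta) {
        return lookup(meta, "c") + "," + lookup(meta, "r");
    }

    /** Order described in the comment: offline parent, add lower, add upper. */
    static String currentOrder() {
        TreeMap<String, Region> meta = new TreeMap<>();
        meta.put("a,1", new Region("a", "z", true, false));  // parent marked offline
        String s1 = probe(meta);
        meta.put("a,2", new Region("a", "m", false, true));  // lower daughter
        String s2 = probe(meta);
        meta.put("m,2", new Region("m", "z", false, true));  // upper daughter
        String s3 = probe(meta);
        return s1 + "|" + s2 + "|" + s3;
    }

    /** Proposed order: add upper, add lower, then offline the parent. */
    static String reversedOrder() {
        TreeMap<String, Region> meta = new TreeMap<>();
        meta.put("a,1", new Region("a", "z", false, false)); // parent closed, not yet offlined
        meta.put("m,2", new Region("m", "z", false, true));  // upper daughter first
        String s1 = probe(meta);
        meta.put("a,2", new Region("a", "m", false, true));  // then lower daughter
        String s2 = probe(meta);
        meta.put("a,1", new Region("a", "z", true, false));  // finally offline the parent
        String s3 = probe(meta);
        return s1 + "|" + s2 + "|" + s3;
    }

    public static void main(String[] args) {
        System.out.println("current:  " + currentOrder());
        System.out.println("reversed: " + reversedOrder());
    }
}
```

In this model the current order yields ISE for both rows in the first window
and WRE for the upper-half row in the second, while the reversed order yields
at worst an NSRE for the lower-half row, which is the retriable case.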
> IllegalStateException thrown in client after region was split and deleted
> -------------------------------------------------------------------------
>
> Key: HBASE-471
> URL: https://issues.apache.org/jira/browse/HBASE-471
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.1.0
> Environment: Linux Debian, HBase 0.16.0
> Reporter: Lars George
> Attachments: FixHBase428.java, hbase-master-log.tar.gz, logs.tar.gz
>
>
> For some reason a client sometimes fails to locate a row, throwing an
> IllegalStateException, after the region was split and deleted.
> > [2008-02-25 16:12:39,171] ERROR [http-80-Processor20] archive.MultilingualArchive - getDocument: An error occurred.
> > java.lang.IllegalStateException: region offline: pdc-docs,US7039976_20060509,1203981958556
> > at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:432)
> > at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:343)
> > at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:306)
> > at org.apache.hadoop.hbase.HTable.getRegionLocation(HTable.java:102)
> > at org.apache.hadoop.hbase.HTable.get(HTable.java:280)
> Tracing the region on the master shows this:
> > 2008-02-25 16:09:38,761 DEBUG org.apache.hadoop.hbase.HMaster: Received MSG_REGION_SPLIT : pdc-docs,US7039976_20060509,1203981958556 from 192.168.105.21:60020
> > 2008-02-25 16:09:38,761 INFO org.apache.hadoop.hbase.HMaster: region pdc-docs,US7039976_20060509,1203981958556 split. New regions are: pdc-docs,US7039976_20060509,1203984578345, pdc-docs,US7046359_20060516,1203984578345
> > 2008-02-25 16:10:02,470 DEBUG org.apache.hadoop.hbase.HMaster: HMaster.metaScanner regioninfo: {regionname: pdc-docs,US7039976_20060509,1203981958556, startKey: <US7039976_20060509>, endKey: <US7053021_20060530>, encodedName: 1260314009, offline: true, split: true, tableDesc: {name: pdc-docs, families: {contents:={name: contents, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}, language:={name: language, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}, mimetype:={name: mimetype, max versions: 3, compression: NONE, in memory: false, max length: 2147483647, bloom filter: none}}}}, server: 192.168.105.21:60020, startCode: 1203949130468
> > 2008-02-25 16:10:02,513 DEBUG org.apache.hadoop.hbase.HMaster: pdc-docs,US7039976_20060509,1203984578345 no longer has references to pdc-docs,US7039976_20060509,1203981958556
> > 2008-02-25 16:10:02,516 DEBUG org.apache.hadoop.hbase.HMaster: pdc-docs,US7046359_20060516,1203984578345 no longer has references to pdc-docs,US7039976_20060509,1203981958556
> > 2008-02-25 16:10:02,516 INFO org.apache.hadoop.hbase.HMaster: Deleting region pdc-docs,US7039976_20060509,1203981958556 because daughter splits no longer hold references
> After discussion with st^ack, it seems that the server simply does not retry
> on IllegalStateException, only on IOExceptions. Also see HBASE-452, which
> should be addressed at the same time.
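
The retry behaviour described in the report can be illustrated with a small,
generic sketch. This is not the HConnectionManager code; RetrySketch,
withRetries and the two demo methods are hypothetical names, and the attempt
limit of 5 is an arbitrary assumption. The point is only that a loop catching
IOException lets an unchecked IllegalStateException escape on the first
attempt.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetrySketch {
    /** Calls op up to maxAttempts times, retrying only when it throws IOException. */
    static <T> T withRetries(Callable<T> op, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e; // treated as transient: try again
            }
            // Any other exception, including IllegalStateException, propagates at once.
        }
        throw last;
    }

    /** Counts calls when the operation fails twice with IOException, then succeeds. */
    static int attemptsWithIoFailures() {
        int[] calls = {0};
        try {
            withRetries(() -> {
                if (++calls[0] < 3) throw new IOException("region not serving");
                return "row";
            }, 5);
        } catch (Exception e) {
            throw new AssertionError(e);
        }
        return calls[0];
    }

    /** Counts calls when the operation throws IllegalStateException every time. */
    static int attemptsWithIllegalState() {
        int[] calls = {0};
        try {
            withRetries(() -> {
                calls[0]++;
                throw new IllegalStateException("region offline");
            }, 5);
        } catch (IllegalStateException expected) {
            // Not retried: the exception surfaced after a single call.
        } catch (Exception e) {
            throw new AssertionError(e);
        }
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("IOException case, calls made: " + attemptsWithIoFailures());
        System.out.println("IllegalStateException case, calls made: " + attemptsWithIllegalState());
    }
}
```

In this sketch the IOException case recovers on the third call, while the
IllegalStateException case gives up after one.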