[ 
https://issues.apache.org/jira/browse/HBASE-13709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550951#comment-14550951
 ] 

Enis Soztutar commented on HBASE-13709:
---------------------------------------

bq. The system time of backup master may be less than that of active master. So 
the master system time may decrease during master failover. In the extreme 
case, the updates to meta table also will be eclipsed.
Yes, this is true. However, this is still an improvement over the current 
approach. We are using the max of the master's time and the local RS's time, 
so the probability is smaller. I would rather do a proper fix for these 
issues, but it is hard to do without rewriting the whole assignment subsystem. 
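
As a rough illustration of the "max of the master's time and the local RS's 
time" idea, here is a minimal sketch in plain Java. The class and method names 
are hypothetical and are not taken from the patch; it only shows that a 
lagging RS clock can no longer produce a timestamp older than the one the 
master supplied.

```java
/** Illustrative only: hypothetical helper, not code from hbase-13709_v1.patch. */
final class MetaTimestamps {
    private MetaTimestamps() {}

    /**
     * Picks the timestamp for the server-column update in meta.
     * masterTs is assumed to arrive with the open region request;
     * localRsTs is the region server's own clock reading.
     */
    static long serverColumnTimestamp(long masterTs, long localRsTs) {
        // Taking the larger value means a lagging RS clock cannot
        // yield a timestamp older than the master's.
        return Math.max(masterTs, localRsTs);
    }

    public static void main(String[] args) {
        // RS clock lags the master by 5 seconds: the master's value is used.
        System.out.println(serverColumnTimestamp(1_000_000L, 995_000L));
        // RS clock is ahead of the master: the RS's value is used.
        System.out.println(serverColumnTimestamp(1_000_000L, 1_002_000L));
    }
}
```

This does not cover the failover case raised above, where the new active 
master's clock is itself behind the previous one.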
bq. The other option that I've always thought would solve some of the races in 
meta is to always do a check and put.
Agreed. I was looking into whether we can do checkAndPut() comparing the 
openSeqIds, but I think openSeqId is not sufficient, since in a double 
assignment case both assignments might have the same open seqId. We need an 
assignment version per region persisted in meta so that we can do 
checkAndPut() comparing the assignment version. 
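
To make the assignment-version idea concrete, here is a minimal sketch that 
models a meta row in memory with plain Java rather than the HBase client. 
Everything in it (class, field and method names, and the choice to bump the 
version on every assignment) is an assumption for illustration. The 
compare-then-write in reportOpened() stands in for what a checkAndPut() on 
the assignment version would do.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for meta rows; illustrative only, not HBase code. */
final class AssignmentVersionSketch {
    /** Hypothetical per-region state persisted in meta. */
    private static final class Row {
        long assignmentVersion;
        String server;
    }

    private final Map<String, Row> meta = new HashMap<>();

    /** Master side: start a new assignment and return its version. */
    synchronized long startAssignment(String region) {
        Row row = meta.computeIfAbsent(region, r -> new Row());
        row.assignmentVersion++;
        return row.assignmentVersion;
    }

    /**
     * Update the server column only if the stored assignment version
     * still equals the version this open was issued under.
     * Returns false when the update was rejected as stale.
     */
    synchronized boolean reportOpened(String region, long expectedVersion, String server) {
        Row row = meta.get(region);
        if (row == null || row.assignmentVersion != expectedVersion) {
            return false;
        }
        row.server = server;
        return true;
    }

    synchronized String serverOf(String region) {
        Row row = meta.get(region);
        return row == null ? null : row.server;
    }

    public static void main(String[] args) {
        AssignmentVersionSketch sketch = new AssignmentVersionSketch();
        long first = sketch.startAssignment("region-a");   // version 1
        long second = sketch.startAssignment("region-a");  // version 2 (double assignment)
        System.out.println(sketch.reportOpened("region-a", second, "rs2")); // true
        System.out.println(sketch.reportOpened("region-a", first, "rs1"));  // false
        System.out.println(sketch.serverOf("region-a"));                    // rs2
    }
}
```

Unlike openSeqId, the two assignments in this model can never carry the same 
version, so the late report from the older assignment is rejected regardless 
of clock skew.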

> Updates to meta table server columns may be eclipsed
> ----------------------------------------------------
>
>                 Key: HBASE-13709
>                 URL: https://issues.apache.org/jira/browse/HBASE-13709
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.0.2, 1.2.0, 1.1.1
>
>         Attachments: hbase-13709_v1.patch
>
>
> HBASE-11536 fixes a case where on a very rare occasion, the meta updates may 
> be processed out of order. The fix is to use the RS's timestamp for the 
> server column in meta update, but that actually opens up a vulnerability for 
> clock skew (see the discussions in the jira). 
> For the region replicas case, we can reproduce a problem where the server 
> name field is eclipsed by the master's earlier update because the RS is 
> lagging behind. However, this is not specific to replicas; it just occurs 
> more frequently with them. 
> One option that was discussed was to send the master's ts with open region 
> RPC and use it. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
