[ 
https://issues.apache.org/jira/browse/HBASE-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640089#comment-13640089
 ] 

Jeffrey Zhong commented on HBASE-8365:
--------------------------------------

{quote}
But taking up the third option will have an impact on performance on the 
assignments?
{quote}
The 0.96 AM ZK notification handling does a very good job. It basically handles the 
same region synchronously while still keeping parallelism across different regions. 
The main difference between option 3 and option 2 is that option 3 guarantees that 
ZK notifications are handled in the order they are received.
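
To make that concrete, here is a minimal sketch (illustrative only; the class and 
method names are made up and this is not the actual 0.96 AM code) of how events for 
the same region can be kept in arrival order while different regions are still 
processed in parallel:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical per-region dispatcher (not an HBase class). A single-threaded
// executor per region preserves the submission order of that region's events,
// while different regions run on different executors and therefore in parallel.
public class PerRegionDispatcher {
  private final ConcurrentMap<String, ExecutorService> executors =
      new ConcurrentHashMap<String, ExecutorService>();

  public void submit(String encodedRegionName, Runnable handler) {
    ExecutorService exec = executors.get(encodedRegionName);
    if (exec == null) {
      ExecutorService created = Executors.newSingleThreadExecutor();
      ExecutorService existing = executors.putIfAbsent(encodedRegionName, created);
      if (existing == null) {
        exec = created;
      } else {
        exec = existing;
        created.shutdown(); // another thread registered one first; discard ours
      }
    }
    exec.execute(handler); // tasks for the same region run in FIFO order
  }
}
{code}
A real implementation would cap the number of threads (for example by hashing regions 
onto a fixed set of single-threaded queues), but the per-region ordering property is 
the same.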

Please take it over; otherwise I can pick it up next week. Thanks.
                
> Duplicated ZK notifications cause Master abort (or other unknown issues)
> ------------------------------------------------------------------------
>
>                 Key: HBASE-8365
>                 URL: https://issues.apache.org/jira/browse/HBASE-8365
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.6
>            Reporter: Jeffrey Zhong
>         Attachments: TestResult.txt, TestZookeeper.txt
>
>
> The duplicated ZK notifications should happen in trunk as well; we just don't see 
> the issue there because the way we handle ZK notifications in trunk is different. 
> I'll explain later.
> The issue is causing TestMetaReaderEditor.testRetrying to be flaky with the error 
> message {code}reader: count=2, t=null{code} A related link is at 
> https://builds.apache.org/job/HBase-0.94/941/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/
> The test case fails due to an IllegalStateException, and the master is aborted, so 
> the remaining test cases after testRetrying fail as well.
> Below are the steps showing why the issue happens (region 
> fa0e7a5590feb69bd065fbc99c228b36 is the region of interest):
> 1) Got first notification event RS_ZK_REGION_FAILED_OPEN at 2013-04-04 
> 17:39:01,197
> {code} DEBUG [pool-1-thread-1-EventThread] master.AssignmentManager(744): 
> Handling transition=RS_ZK_REGION_FAILED_OPEN, 
> server=janus.apache.org,42093,1365097126155, 
> region=fa0e7a5590feb69bd065fbc99c228b36{code}
> In this step, the AM tries to open the region on another RS in a separate thread
> 2) Got second notification event RS_ZK_REGION_FAILED_OPEN at 2013-04-04 
> 17:39:01,200 
> {code}DEBUG [pool-1-thread-1-EventThread] master.AssignmentManager(744): 
> Handling transition=RS_ZK_REGION_FAILED_OPEN, 
> server=janus.apache.org,42093,1365097126155, 
> region=fa0e7a5590feb69bd065fbc99c228b36{code}
> 3) Later, got the OPENING notification event resulting from step 1, at 2013-04-04 
> 17:39:01,288 
> {code} DEBUG [pool-1-thread-1-EventThread] master.AssignmentManager(744): 
> Handling transition=RS_ZK_REGION_OPENING, 
> server=janus.apache.org,54833,1365097126175, 
> region=fa0e7a5590feb69bd065fbc99c228b36{code}
> Step 2's ClosedRegionHandler then throws an IllegalStateException ("Cannot 
> transit it to OFFLINE", since the state is already OPENING from notification 3) and 
> aborts the Master. This can happen in 0.94 because we handle notifications using an 
> executorService, which opens the door to handling events out of the order in which 
> we receive them, even though the updates themselves arrive in order.
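> To illustrate the reordering hazard (a toy example, not the actual AssignmentManager 
> code), consider handlers submitted to a shared multi-threaded pool in the order the 
> events arrived; nothing ties their execution to that order:
> {code}
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> 
> // Illustration only: three handlers submitted in arrival order to a shared
> // pool; the pool gives no ordering guarantee across its worker threads, so
> // the second FAILED_OPEN handler may run after the OPENING handler and find
> // the region already in OPENING state.
> public class OutOfOrderDemo {
>   public static void main(String[] args) {
>     ExecutorService pool = Executors.newFixedThreadPool(4);
>     submit(pool, "handle RS_ZK_REGION_FAILED_OPEN (event 1)");
>     submit(pool, "handle RS_ZK_REGION_FAILED_OPEN (event 2, duplicate)");
>     submit(pool, "handle RS_ZK_REGION_OPENING (event 3)");
>     pool.shutdown();
>   }
> 
>   private static void submit(ExecutorService pool, final String what) {
>     pool.execute(new Runnable() {
>       public void run() {
>         // execution order across worker threads != submission order
>         System.out.println(Thread.currentThread().getName() + ": " + what);
>       }
>     });
>   }
> }
> {code}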
> I've confirmed that we don't have duplicated AM listeners and that both events were 
> triggered by the same ZK data of the exact same version. The issue can be reproduced 
> roughly once by running the testRetrying test case 20 times in a loop.
> There are several issues behind the failure:
> 1) Duplicated ZK notifications. Since a ZK watcher is a one-time trigger, duplicated 
> notifications from the same data of the same version should not happen in the 
> first place.
> 2) ZooKeeper watcher handling is wrong in both 0.94 and trunk, as follows:
> a) 0.94 handles notifications asynchronously, which may lead to handling 
> notifications out of the order in which the events happened.
> b) In trunk, we handle ZK notifications synchronously, which slows down other 
> components such as SSH, log splitting, etc. because we have a single 
> notification queue.
> c) In trunk & 0.94, we could act on stale event data because we have a long 
> listener list; the ZK node state could have changed by the time the event is 
> handled. If a listener needs to act upon the latest state, it should re-fetch the 
> data to verify that the data which triggered the handler hasn't changed.
> Suggestions:
> For 0.94, we can band-aid the CloseRegionHandler to pass in the expected ZK 
> data version and skip event handling on stale data, with minimal impact (a rough 
> sketch follows below).
> For trunk, I'll open an improvement JIRA on ZK notification handling to 
> provide more parallelism for handling unrelated notifications.
> For the duplicated ZK notifications, we need to bring in some ZK experts to take 
> a look at this.
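> A rough sketch of that band-aid (hypothetical class and names; the real 0.94 handler 
> constructor is different), which also captures the re-fetch-and-verify idea from 
> point 2c above:
> {code}
> import org.apache.zookeeper.ZooKeeper;
> import org.apache.zookeeper.data.Stat;
> 
> // Hypothetical version-gated handler: the expected znode version is captured
> // when the notification is received and checked again right before the
> // handler mutates region state, so a handler fired from stale data becomes
> // a no-op instead of aborting the master.
> public class VersionGatedRegionHandler {
>   private final ZooKeeper zk;
>   private final String regionZnode;
>   private final int expectedVersion;
> 
>   public VersionGatedRegionHandler(ZooKeeper zk, String regionZnode,
>       int expectedVersion) {
>     this.zk = zk;
>     this.regionZnode = regionZnode;
>     this.expectedVersion = expectedVersion;
>   }
> 
>   public void process() throws Exception {
>     Stat stat = zk.exists(regionZnode, false);
>     if (stat == null || stat.getVersion() != expectedVersion) {
>       return; // znode changed (or was deleted) since the event fired; skip
>     }
>     // ... proceed with the normal region state transition ...
>   }
> }
> {code}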
> Please let me know what you think or any better idea.
> Thanks!
