[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454069#comment-13454069
 ] 

Sijie Guo commented on BOOKKEEPER-400:
--------------------------------------

Did an investigate the attached log, I found the issue caused by race condition 
between LedgerHandler#sendAddSuccessCallbacks and 
LedgerHandler#unsetSuccessAndSendWriteRequest on pendingAddOps and 
lastAddConfirmed.

in sendAddSuccessCallbacks, it is peeking the succeed ops from pendingAddOps 
and change the state; while unsetSuccessAndSendWriteRequest is unsetting bookie 
index for all pendingAddOps. the race happens as below:

1) thread 1 changes ensemble starting with entry 2928 (from the attached log).
2) thread 2 sendAddSuccessCallbacks, which remove those succeed add ops. and it 
happens to that add ops before (include) 2935 has succeed and removed.
3) thread 1 finished changed ensemble and started to unsetting bookie index to 
reissue write requests. but thread 2 has been removed those succeed add ops. 
2935 would not be reissued.
4) ensemble continue changed to replace those bookies owned entry 2935.

Let me explain the details using the log as below:

first, all the ensemble changes for ledger 338547, its corresponding unsetting 
entry 2935 and its corresponding responses of add entry 2935 requests are 
listed as below:
{code}
2012-09-06 00:14:28,054 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host12:3181, /host4:3181, /host6:3181] 
to: [/host1:3181, /host12:3181, /host11:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:14:32,340 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host12:3181, /host11:3181, /host6:3181] 
to: [/host1:3181, /host4:3181, /host11:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:14:38,972 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host4:3181, /host11:3181, /host6:3181] 
to: [/host1:3181, /host12:3181, /host11:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:14:46,028 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host12:3181, /host11:3181, /host6:3181] 
to: [/host1:3181, /host12:3181, /host2:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:14:46,209 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host12:3181, /host2:3181, /host6:3181] 
to: [/host9:3181, /host12:3181, /host2:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:14:46,562 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host12:3181, /host2:3181, /host6:3181] 
to: [/host9:3181, /host12:3181, /host2:3181, /host13:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:15:02,114 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host12:3181, /host2:3181, /host13:3181] 
to: [/host9:3181, /host15:3181, /host2:3181, /host13:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:15:39,033 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host15:3181, /host2:3181, /host13:3181] 
to: [/host9:3181, /host15:3181, /host2:3181, /host14:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:16:48,536 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host15:3181, /host2:3181, /host14:3181] 
to: [/host9:3181, /host15:3181, /host12:3181, /host14:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:17:53,879 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host15:3181, /host12:3181, /host14:3181] 
to: [/host9:3181, /host6:3181, /host12:3181, /host14:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:18:16,414 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host6:3181, /host12:3181, /host14:3181] 
to: [/host9:3181, /host6:3181, /host12:3181, /host1:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:19:29,124 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host6:3181, /host12:3181, /host1:3181] 
to: [/host9:3181, /host6:3181, /host13:3181, /host1:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:19:46,677 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host6:3181, /host13:3181, /host1:3181] 
to: [/host9:3181, /host10:3181, /host13:3181, /host1:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:20:00,353 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host10:3181, /host13:3181, /host1:3181] 
to: [/host9:3181, /host10:3181, /host13:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:21:34,421 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host10:3181, /host13:3181, /host12:3181] 
to: [/host9:3181, /host10:3181, /host2:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:21:57,198 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host10:3181, /host2:3181, /host12:3181] 
to: [/host9:3181, /host15:3181, /host2:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:22:05,976 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host9:3181, /host15:3181, /host2:3181, /host12:3181] 
to: [/host7:3181, /host15:3181, /host2:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:23:08,017 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host7:3181, /host15:3181, /host2:3181, /host12:3181] 
to: [/host7:3181, /host15:3181, /host3:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:23:14,031 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host7:3181, /host15:3181, /host3:3181, /host12:3181] 
to: [/host7:3181, /host9:3181, /host3:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:23:54,151 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host7:3181, /host9:3181, /host3:3181, /host12:3181] 
to: [/host2:3181, /host9:3181, /host3:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:25:29,636 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host2:3181, /host9:3181, /host3:3181, /host12:3181] 
to: [/host1:3181, /host9:3181, /host3:3181, /host12:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:25:52,955 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host9:3181, /host3:3181, /host12:3181] 
to: [/host1:3181, /host9:3181, /host3:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:25:57,846 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host9:3181, /host3:3181, /host6:3181] 
to: [/host1:3181, /host7:3181, /host3:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:26:16,733 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host7:3181, /host3:3181, /host6:3181] 
to: [/host1:3181, /host7:3181, /host11:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:27:37,912 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host7:3181, /host11:3181, /host6:3181] 
to: [/host5:3181, /host7:3181, /host11:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:27:59,248 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host5:3181, /host7:3181, /host11:3181, /host6:3181] 
to: [/host5:3181, /host7:3181, /host11:3181, /host4:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:28:14,282 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host5:3181, /host7:3181, /host11:3181, /host4:3181] 
to: [/host5:3181, /host14:3181, /host11:3181, /host4:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:28:17,900 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host5:3181, /host14:3181, /host11:3181, /host4:3181] 
to: [/host5:3181, /host14:3181, /host7:3181, /host4:3181] for ledger:338547 
starting at entry: 2946
{code}

{code}
2012-09-06 00:19:37,904 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host9:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:19:37,911 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host6:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:19:37,913 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host1:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:19:37,930 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host6:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:19:47,000 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host1:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:21:39,985 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host10:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:22:07,401 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host12:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:23:49,055 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host7:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:23:59,118 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host15:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:25:51,045 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host9:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:26:08,289 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host2:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:27:37,427 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host1:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:27:58,069 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host6:3181 for ledger: 338547 
entry: 2935 rc: 0
2012-09-06 00:28:01,622 - DEBUG - [pool-10-thread-1:PerChannelBookieClient@534] 
- Got response for add request from bookie: /host7:3181 for ledger: 338547 
entry: 2935 rc: 0
{code}

{code}
2012-09-06 00:19:01,530 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 1
2012-09-06 00:19:14,590 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 3
2012-09-06 00:20:34,342 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 1
2012-09-06 00:21:17,403 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 3
2012-09-06 00:22:34,389 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 1
2012-09-06 00:22:49,620 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 0
2012-09-06 00:24:51,208 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 1
2012-09-06 00:25:09,143 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 0
2012-09-06 00:26:38,864 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 0
2012-09-06 00:26:59,237 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 3
2012-09-06 00:27:12,239 - DEBUG - [pool-10-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338547 entry: 2935 bookie index: 1
{code}

the final ensemble contains entry 2935 are : [/host5:3181, /host14:3181, 
/host11:3181, /host4:3181]. but there is no write add entry to these three 
bookies. (all addEntry requests for 2935 are sent to [host1, host2, host6, 
host7, host9, host10, host12, host15]).

matching the changing actions and responses with addEntry, we can know that the 
last response (to /host7) is matched to the following ensemble change which 
changing host9 to host7 in bookie index 1. which is also matching with the last 
unsetting operation on entry 2935.
{quote}
2012-09-06 00:25:57,846 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host9:3181, /host3:3181, /host6:3181] 
to: [/host1:3181, /host7:3181, /host3:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
{quote}

after that there is no unsetting, requests and responses for entry 2935 when 
changing ensemble starting from 2928.
{quote}
2012-09-06 00:26:16,733 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host7:3181, /host3:3181, /host6:3181] 
to: [/host1:3181, /host7:3181, /host11:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:27:37,912 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host1:3181, /host7:3181, /host11:3181, /host6:3181] 
to: [/host5:3181, /host7:3181, /host11:3181, /host6:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:27:59,248 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host5:3181, /host7:3181, /host11:3181, /host6:3181] 
to: [/host5:3181, /host7:3181, /host11:3181, /host4:3181] for ledger:338547 
starting at entry: 2928
2012-09-06 00:28:14,282 - DEBUG - [pool-10-thread-1:LedgerHandle@680] - 
Changing ensemble from: [/host5:3181, /host7:3181, /host11:3181, /host4:3181] 
to: [/host5:3181, /host14:3181, /host11:3181, /host4:3181] for ledger:338547 
starting at entry: 2928
{quote}

changing host3 to host11 on bookie index 2 (no effects on entry 2935)
changing host1 to host5 on bookie index 0 (should unset entry 2935, but it 
doesn't)
changing host6 to host4 on bookie index 3 (should unset entry 2935, but it 
doesn't)
changing host7 to host14 on bookie index 1 (should unset entry 2935, but it 
doesn't).

so you could see all the replicas for entry 2935 has been replaced without 
sending any add requests.

-----

from the above analysis, a possible solution for this fix is that we need to 
make the shared variables like lastAddConfirmed to volatile to make them 
visibile to all threads. and we need some synchronization on pendingAddOps to 
avoid item removed during iteration.

                
> Ledger entry not found in any of the bookies in the ensemble responsible for 
> that entry.
> ----------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-400
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-400
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-client
>            Reporter: Aniruddha
>         Attachments: clean.log.gz
>
>
> Detailed discussion at 
> http://mail-archives.apache.org/mod_mbox/zookeeper-bookkeeper-dev/201209.mbox/%3cCAOLhyDQzrmeOHmTxzPikeAqJ7pZUn0=vHfd=gc1srmtuye5...@mail.gmail.com%3e
> We had an internal discussion about this. From BOOKKEEPER-337, it seems that 
> handleBookieFailure could be invoked in parallel by a thread other the one 
> that calls LedgerHandle#sendAddSuccessCallbacks. The values updated by 
> handleBookieFailure might not be visible to the thread running 
> sendAddSuccessCallbacks because the fields are not volatile and this might 
> have caused our bad state. 
> BK-337 synchronizes access to metadata.addEnsemble() and we believe this 
> would make this scenario very improbable. 
> A long term fix might be to make LedgerMetadata immutable since it is rarely 
> updated. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to