[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453883#comment-13453883
 ] 

Rakesh R commented on BOOKKEEPER-400:
-------------------------------------

Hi Stu,

- Could you please update the Affects Version field?

- Also, if you have them, please attach the ZooKeeper ledger metadata values;
they would help us understand the ensemble reformations.

- One observation from the log is that there are a lot of the following
messages. From this I suspect a large number of invalid write
responses, which result in handleFailures() running in parallel very often.
BOOKKEEPER-337 has guarded the concurrent modifications to the ledger metadata,
but I also feel we could still fall into an erroneous state due to concurrent
modifications of the successesSoFar and numResponsesPending variables.

{noformat} 
2012-09-06 00:44:16,099 - ERROR - [pool-25-thread-1:PerChannelBookieClient@564] 
- Unexpected add response received from bookie: /host2:3181 for ledger: 338682, 
entry: 2935 , ignoring
2012-09-06 00:44:17,242 - DEBUG - [pool-25-thread-1:PendingAddOp@98] - 
Unsetting success for ledger: 338682 entry: 2935 bookie index: 3
{noformat} 
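
To make the concern about successesSoFar and numResponsesPending concrete, here is a minimal sketch of guarding both with a single lock, so that a success path and an "unsetting success" path running on different threads cannot leave them inconsistent. AddResponseTracker and its methods are hypothetical names for illustration, and the field types are assumptions; this is not the actual PendingAddOp code.

```java
// Hypothetical stand-in for the per-entry add state; NOT the real PendingAddOp.
// Field names follow the comment above, but their types are assumptions.
class AddResponseTracker {
    private final boolean[] successesSoFar; // one flag per bookie index
    private int numResponsesPending;

    AddResponseTracker(int ensembleSize) {
        this.successesSoFar = new boolean[ensembleSize];
        this.numResponsesPending = ensembleSize;
    }

    // Called when a bookie acknowledges the write. A duplicate or late
    // response for an index that already succeeded is ignored.
    synchronized void markSuccess(int bookieIndex) {
        if (!successesSoFar[bookieIndex]) {
            successesSoFar[bookieIndex] = true;
            numResponsesPending--;
        }
    }

    // Called from the failure-handling path when a bookie is replaced.
    // Only undoes a success that was actually recorded.
    synchronized void unsetSuccess(int bookieIndex) {
        if (successesSoFar[bookieIndex]) {
            successesSoFar[bookieIndex] = false;
            numResponsesPending++;
        }
    }

    synchronized boolean isSuccess(int bookieIndex) {
        return successesSoFar[bookieIndex];
    }

    synchronized int getNumResponsesPending() {
        return numResponsesPending;
    }
}
```

Because every read and write goes through the same monitor, the flag array and the counter are always updated together and the updates are visible across threads.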
                
> Ledger entry not found in any of the bookies in the ensemble responsible for 
> that entry.
> ----------------------------------------------------------------------------------------
>
>                 Key: BOOKKEEPER-400
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-400
>             Project: Bookkeeper
>          Issue Type: Bug
>          Components: bookkeeper-client
>            Reporter: Aniruddha
>         Attachments: clean.log.gz
>
>
> Detailed discussion at 
> http://mail-archives.apache.org/mod_mbox/zookeeper-bookkeeper-dev/201209.mbox/%3cCAOLhyDQzrmeOHmTxzPikeAqJ7pZUn0=vHfd=gc1srmtuye5...@mail.gmail.com%3e
> We had an internal discussion about this. From BOOKKEEPER-337, it seems that 
> handleBookieFailure could be invoked in parallel by a thread other than the one 
> that calls LedgerHandle#sendAddSuccessCallbacks. The values updated by 
> handleBookieFailure might not be visible to the thread running 
> sendAddSuccessCallbacks because the fields are not volatile and this might 
> have caused our bad state. 
> BK-337 synchronizes access to metadata.addEnsemble() and we believe this 
> would make this scenario very improbable. 
> A long term fix might be to make LedgerMetadata immutable since it is rarely 
> updated. 
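
The long-term idea in the description (immutable metadata) could look roughly like the sketch below: each ensemble change builds a new immutable snapshot and publishes it through an AtomicReference, so a reader thread always sees a complete, consistent version. MetadataSnapshot and MetadataHolder are hypothetical names, and host1 and host3 in the usage note are made-up addresses; this is not the real LedgerMetadata API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable snapshot; NOT the real LedgerMetadata class.
final class MetadataSnapshot {
    private final List<String> ensemble;

    MetadataSnapshot(List<String> ensemble) {
        // Defensive copy so later changes to the caller's list cannot leak in.
        this.ensemble = Collections.unmodifiableList(new ArrayList<>(ensemble));
    }

    List<String> getEnsemble() {
        return ensemble;
    }

    // Returns a new snapshot with one bookie replaced; this one is unchanged.
    MetadataSnapshot withReplacedBookie(int index, String newBookie) {
        List<String> copy = new ArrayList<>(ensemble);
        copy.set(index, newBookie);
        return new MetadataSnapshot(copy);
    }
}

// Hypothetical holder that publishes the current snapshot safely.
final class MetadataHolder {
    private final AtomicReference<MetadataSnapshot> current;

    MetadataHolder(MetadataSnapshot initial) {
        this.current = new AtomicReference<>(initial);
    }

    MetadataSnapshot get() {
        return current.get();
    }

    // Atomically swaps in a snapshot with the bookie at index replaced.
    void replaceBookie(int index, String newBookie) {
        current.updateAndGet(m -> m.withReplacedBookie(index, newBookie));
    }
}
```

For example, after new MetadataHolder(new MetadataSnapshot(Arrays.asList("host1:3181", "host2:3181"))), calling replaceBookie(1, "host3:3181") leaves any snapshot a reader obtained earlier untouched, while later get() calls return the new ensemble.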
