[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402960#comment-13402960
 ] 

Uma Maheswara Rao G commented on BOOKKEEPER-247:
------------------------------------------------


I think we have to handle one special case in LedgerChecker.

Take the case of a ledger created with ensemble size 3 and quorum size 2.

Add a first entry:
 Now the ensemble should look like '0 A B C'.
The entry should have been added to A and B. Now kill bookie C.

Add one more entry. The writer will now get an exception when writing to C, 
which will lead to an ensemble change.
Now the new ensemble should look like '1 A B D'.


The writer can continue with this ensemble until there is another failure.

Now if you run the ledger checker on this ledger, it will consider '0 A B C' 
an under-replicated fragment. But here the first entry already met the quorum, 
so we need not re-replicate any entries.

I think we should skip such cases here.
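
The skip condition can be sketched roughly as below. This is only an illustration, not the LedgerChecker code: it assumes entries are striped round-robin over the ensemble (entry e goes to the quorum-size consecutive bookies starting at index e mod ensemble size), which is consistent with entry 0 landing on A and B in the scenario above. The class and method names (FragmentSkipSketch, writeSet, bookieHoldsEntries) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BookKeeper.
final class FragmentSkipSketch {

    // Ensemble indices that store entryId, ASSUMING round-robin striping.
    static List<Integer> writeSet(long entryId, int ensembleSize, int quorumSize) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < quorumSize; i++) {
            indices.add((int) ((entryId + i) % ensembleSize));
        }
        return indices;
    }

    // True if the bookie at bookieIndex stores at least one entry
    // in the fragment's range [firstEntry, lastEntry].
    static boolean bookieHoldsEntries(int bookieIndex, long firstEntry, long lastEntry,
                                      int ensembleSize, int quorumSize) {
        for (long e = firstEntry; e <= lastEntry; e++) {
            if (writeSet(e, ensembleSize, quorumSize).contains(bookieIndex)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Fragment '0 A B C' holds only entry 0; C is at index 2.
        System.out.println(writeSet(0, 3, 2));                    // [0, 1]
        System.out.println(bookieHoldsEntries(2, 0, 0, 3, 2));    // false
        // Fragment '1 A B D' starts at entry 1; D is at index 2.
        System.out.println(bookieHoldsEntries(2, 1, 1, 3, 2));    // true
    }
}
```

Under these assumptions, C holds no entry of the fragment '0 A B C', so a checker applying such a test would not report that fragment as under-replicated on account of C.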

Some grepped logs related to this issue:

{noformat}

First entry write:

2012-06-28 14:23:46,797 - INFO  - [main:BookKeeperClusterTestCase@336] - New 
bookie on port 5002 has been created.
2012-06-28 14:23:46,970 - INFO  - [New I/O client worker 
#1-1:PerChannelBookieClient$1@146] - Successfully connected to bookie: 
/XX.XX.XX.127:5000
2012-06-28 14:23:46,970 - INFO  - [New I/O client worker 
#1-2:PerChannelBookieClient$1@146] - Successfully connected to bookie: 
/XX.XX.XX.127:5001
2012-06-28 14:23:47,064 - INFO  - [main:TestLedgerChecker@137] - Killing 
/XX.XX.XX.127:5002 from ensemble=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001, 
/XX.XX.XX.127:5002]
Ensembles after first entry : {0=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001, 
/XX.XX.XX.127:5002]}
.......................
.......................


2012-06-28 14:23:47,549 - INFO  - [main:BookKeeperClusterTestCase@336] - New 
bookie on port 5003 has been created.


Second entry write:

2012-06-28 14:23:48,537 - ERROR - [New I/O client boss 
#1:PerChannelBookieClient$1@151] - Could not connect to bookie: 
/XX.XX.XX.127:5002
2012-06-28 14:23:48,537 - WARN  - [New I/O client boss #1:PendingAddOp@146] - 
Write did not succeed: 3, 1
2012-06-28 14:23:48,584 - INFO  - [New I/O client worker 
#1-4:PerChannelBookieClient$1@146] - Successfully connected to bookie: 
/XX.XX.XX.127:5003
Ensembles after second entry : {0=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001, 
/XX.XX.XX.127:5002], 1=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001, 
/XX.XX.XX.127:5003]}
2012-06-28 14:23:48,631 - ERROR - [pool-4-thread-1:PerChannelBookieClient@618] 
- Unexpected read response received from bookie: /XX.XX.XX.127:5000 for ledger: 
3, entry: 0 , ignoring
2012-06-28 14:23:49,633 - ERROR - [New I/O client boss 
#1:PerChannelBookieClient$1@151] - Could not connect to bookie: 
/XX.XX.XX.127:5002
2012-06-28 14:23:49,633 - INFO  - [main:TestLedgerChecker@160] - unreplicated 
fragment: Fragment(LedgerID: 3, FirstEntryID: 1[2], LastEntryID: 1[0], Host: 
/XX.XX.XX.127:5000)
2012-06-28 14:23:49,633 - INFO  - [main:TestLedgerChecker@160] - unreplicated 
fragment: Fragment(LedgerID: 3, FirstEntryID: 0[1], LastEntryID: 0[-1], Host: 
/XX.XX.XX.127:5002)
2012-06-28 14:23:49,633 - INFO  - [main:BookKeeperClusterTestCase@92] - 
TearDown{noformat}



                
> Detection of under replication
> ------------------------------
>
>                 Key: BOOKKEEPER-247
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-247
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: bookkeeper-client, bookkeeper-server
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>
> This JIRA discusses how the bookkeeper system will detect underreplication of 
> ledger entries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
