[
https://issues.apache.org/jira/browse/BOOKKEEPER-247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402960#comment-13402960
]
Uma Maheswara Rao G commented on BOOKKEEPER-247:
------------------------------------------------
I think we have to handle one special case in LedgerChecker.
Take the case of a ledger created with an ensemble size of 3 and a quorum size
of 2.
Add the first entry:
The ensemble should now look like '0 A B C'.
The entry should have been added to A and B. Now kill bookie C.
Add one more entry. The writer will now get an exception when writing to C,
which will lead to an ensemble update.
The new ensemble should look like '1 A B D'.
The writer can continue with this ensemble until there is another failure.
Now if you run the ledger checker on this ledger, it will consider '0 A B C' an
under-replicated fragment. But here the first entry has already met the quorum,
so we need not replicate any entries.
I think we should skip such cases here.
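To make the scenario concrete, here is a minimal sketch of the check being
suggested. It assumes entries are striped round-robin across the ensemble, so
entry e goes to the bookies at positions (e + i) % ensembleSize for
i = 0 .. quorumSize - 1. The class and method names are hypothetical and are
not taken from the LedgerChecker code.

```java
import java.util.ArrayList;
import java.util.List;

public class FragmentQuorumSketch {

    /**
     * Positions (within the ensemble) of the bookies expected to store the
     * given entry. ASSUMPTION: round-robin striping of entries.
     */
    static List<Integer> writeSet(long entryId, int ensembleSize, int quorumSize) {
        List<Integer> set = new ArrayList<>();
        for (int i = 0; i < quorumSize; i++) {
            set.add((int) ((entryId + i) % ensembleSize));
        }
        return set;
    }

    /**
     * True if the bookie at bookieIndex is expected to store at least one
     * entry in the inclusive range [firstEntry, lastEntry].
     */
    static boolean bookieStoresAnyEntry(int bookieIndex, long firstEntry, long lastEntry,
                                        int ensembleSize, int quorumSize) {
        for (long e = firstEntry; e <= lastEntry; e++) {
            if (writeSet(e, ensembleSize, quorumSize).contains(bookieIndex)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] ensemble = {"A", "B", "C"};
        // Fragment '0 A B C' contains only entry 0 in the scenario above.
        for (int i = 0; i < ensemble.length; i++) {
            System.out.println(ensemble[i] + " expected to hold an entry of fragment [0,0]: "
                    + bookieStoresAnyEntry(i, 0, 0, ensemble.length, 2));
        }
    }
}
```

Under these assumptions, A and B are expected to hold entry 0 and C is not, so
the failure of C leaves nothing in fragment '0 A B C' to replicate. A checker
could skip a (fragment, bookie) pair whenever this test returns false.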
Some grepped logs related to this issue:
{noformat}
First entry write:
2012-06-28 14:23:46,797 - INFO - [main:BookKeeperClusterTestCase@336] - New
bookie on port 5002 has been created.
2012-06-28 14:23:46,970 - INFO - [New I/O client worker
#1-1:PerChannelBookieClient$1@146] - Successfully connected to bookie:
/XX.XX.XX.127:5000
2012-06-28 14:23:46,970 - INFO - [New I/O client worker
#1-2:PerChannelBookieClient$1@146] - Successfully connected to bookie:
/XX.XX.XX.127:5001
2012-06-28 14:23:47,064 - INFO - [main:TestLedgerChecker@137] - Killing
/XX.XX.XX.127:5002 from ensemble=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001,
/XX.XX.XX.127:5002]
Ensembles after first entry : {0=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001,
/XX.XX.XX.127:5002]}
.......................
.......................
2012-06-28 14:23:47,549 - INFO - [main:BookKeeperClusterTestCase@336] - New
bookie on port 5003 has been created.
Second entry write:
2012-06-28 14:23:48,537 - ERROR - [New I/O client boss
#1:PerChannelBookieClient$1@151] - Could not connect to bookie:
/XX.XX.XX.127:5002
2012-06-28 14:23:48,537 - WARN - [New I/O client boss #1:PendingAddOp@146] -
Write did not succeed: 3, 1
2012-06-28 14:23:48,584 - INFO - [New I/O client worker
#1-4:PerChannelBookieClient$1@146] - Successfully connected to bookie:
/XX.XX.XX.127:5003
Ensembles after second entry : {0=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001,
/XX.XX.XX.127:5002], 1=[/XX.XX.XX.127:5000, /XX.XX.XX.127:5001,
/XX.XX.XX.127:5003]}
2012-06-28 14:23:48,631 - ERROR - [pool-4-thread-1:PerChannelBookieClient@618]
- Unexpected read response received from bookie: /XX.XX.XX.127:5000 for ledger:
3, entry: 0 , ignoring
2012-06-28 14:23:49,633 - ERROR - [New I/O client boss
#1:PerChannelBookieClient$1@151] - Could not connect to bookie:
/XX.XX.XX.127:5002
2012-06-28 14:23:49,633 - INFO - [main:TestLedgerChecker@160] - unreplicated
fragment: Fragment(LedgerID: 3, FirstEntryID: 1[2], LastEntryID: 1[0], Host:
/XX.XX.XX.127:5000)
2012-06-28 14:23:49,633 - INFO - [main:TestLedgerChecker@160] - unreplicated
fragment: Fragment(LedgerID: 3, FirstEntryID: 0[1], LastEntryID: 0[-1], Host:
/XX.XX.XX.127:5002)
2012-06-28 14:23:49,633 - INFO - [main:BookKeeperClusterTestCase@92] -
TearDown{noformat}
> Detection of under replication
> ------------------------------
>
> Key: BOOKKEEPER-247
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-247
> Project: Bookkeeper
> Issue Type: Sub-task
> Components: bookkeeper-client, bookkeeper-server
> Reporter: Ivan Kelly
> Assignee: Ivan Kelly
>
> This JIRA discusses how the bookkeeper system will detect underreplication of
> ledger entries.