shustsud opened a new pull request, #4194: URL: https://github.com/apache/bookkeeper/pull/4194
Related Issue: https://github.com/apache/bookkeeper/issues/4097

### Motivation

- If a write to the bookies succeeds while the ensemble is being replaced, the ensemble recorded in metadata may not match the bookies the entries were actually written to.
- This issue occurs in the following scenario.

```
STEP1: Write entry0,1,2,3 to the bookies. The entries are in the following state.
  - entry: 0(Bookie0, Bookie1) -> Waiting for successful write to Bookie0.
  - entry: 1(Bookie1, Bookie2) -> Writing to Bookie1,2 was successful, but its completion is pending.
  - entry: 2(Bookie2, Bookie3) -> Writing to Bookie2,3 was successful, but its completion is pending.
  - entry: 3(Bookie3, Bookie0) -> Waiting for successful write to Bookie0.

STEP2: Write entry4,5,6 to the bookies. The entries are in the following state.
  - entry: 0(Bookie0, Bookie1) -> Waiting for successful write to Bookie0.
  - entry: 1(Bookie1, Bookie2) -> Writing to Bookie1,2 was successful, but its completion is pending.
  - entry: 2(Bookie2, Bookie3) -> Writing to Bookie2,3 was successful, but its completion is pending.
  - entry: 3(Bookie3, Bookie0) -> Waiting for successful write to Bookie0.
  - entry: 4(Bookie0, Bookie1) -> Waiting for successful write to Bookie0.
  - entry: 5(Bookie1, Bookie2) -> Failed to write to Bookie2.
  - entry: 6(Bookie2, Bookie3) -> Failed to write to Bookie2,3.
  Writing of entry5,6 failed, so ensemble replacement is started.

STEP3: The writes of entry0,3,4 succeeded, but their completion is pending because the ensemble is being replaced.
  The entries are in the following state.
  - entry: 0(Bookie0, Bookie1) -> Writing to Bookie0,1 was successful, but its completion is pending.
  - entry: 1(Bookie1, Bookie2) -> Writing to Bookie1,2 was successful, but its completion is pending.
  - entry: 2(Bookie2, Bookie3) -> Writing to Bookie2,3 was successful, but its completion is pending.
  - entry: 3(Bookie3, Bookie0) -> Writing to Bookie3,0 was successful, but its completion is pending.
  - entry: 4(Bookie0, Bookie1) -> Writing to Bookie0,1 was successful, but its completion is pending.
  - entry: 5(Bookie1, Bookie2) -> Failed to write to Bookie2.
  - entry: 6(Bookie2, Bookie3) -> Failed to write to Bookie2,3.

STEP4: The ensemble replacement is completed and LedgerHandle#unsetSuccessAndSendWriteRequest is called.
  https://github.com/apache/bookkeeper/blob/13e7efaa971cd3613b065ac50836c5ee98985d13/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L2007-L2013

  Entry0 is processed first. Because entry0 is not written to Bookie2 or Bookie3, LedgerHandle#sendAddSuccessCallbacks is called.
  https://github.com/apache/bookkeeper/blob/234b817cdb4e054887ffd5e42eaed25dc02daf63/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingAddOp.java#L203-L206

  Entry1,2,3 must be written again to the bookies after the ensemble is replaced, but at this point the writes of entry0,1,2,3,4 are all completed.
  https://github.com/apache/bookkeeper/blob/13e7efaa971cd3613b065ac50836c5ee98985d13/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L1814-L1839

  As a result, for entry1,2,3 the ensemble in metadata does not match the bookies the entries were actually written to.
```

- Running the test added in this PR against the unfixed source reproduces the issue.

### Changes

- Change when `LedgerHandle#sendAddSuccessCallbacks` is called.
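
The ordering problem in STEP4 can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not BookKeeper code: `PendingAdd`, `drain`, and `replace` are hypothetical names, the write set of entry `i` is taken to be bookies `i % 4` and `(i + 1) % 4` as in the scenario above, and "deferring the callbacks until every pending entry has been processed" is one possible reading of the timing change, not a description of the actual patch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class EnsembleChangeSketch {
    static final int ENSEMBLE_SIZE = 4;

    /** Hypothetical stand-in for a pending add; not the real PendingAddOp. */
    static final class PendingAdd {
        final long entryId;
        final Set<Integer> writeSet = new HashSet<>();
        final Set<Integer> acked = new HashSet<>();

        PendingAdd(long entryId) {
            this.entryId = entryId;
            writeSet.add((int) (entryId % ENSEMBLE_SIZE));
            writeSet.add((int) ((entryId + 1) % ENSEMBLE_SIZE));
        }

        boolean completed() {
            return acked.containsAll(writeSet);
        }
    }

    /** Confirms completed entries from the head of the queue, in order. */
    static void drain(Deque<PendingAdd> pending, List<Long> confirmed) {
        while (!pending.isEmpty() && pending.peekFirst().completed()) {
            confirmed.add(pending.pollFirst().entryId);
        }
    }

    /** Replays STEP4 from the STEP3 state and returns the confirmed entry ids. */
    static List<Long> replace(Set<Integer> replaced, boolean deferCallbacks) {
        Deque<PendingAdd> pending = new ArrayDeque<>();
        for (long id = 0; id <= 6; id++) {
            PendingAdd op = new PendingAdd(id);
            if (id <= 4) {
                op.acked.addAll(op.writeSet); // entry0..4: all writes succeeded
            }                                 // entry5,6: failed, left incomplete
            pending.add(op);
        }
        List<Long> confirmed = new ArrayList<>();
        for (PendingAdd op : new ArrayList<>(pending)) {
            for (Integer idx : replaced) {
                if (op.writeSet.contains(idx)) {
                    op.acked.remove(idx);      // must be rewritten to the new bookie
                } else if (!deferCallbacks) {
                    drain(pending, confirmed); // early callback: later entries not yet unset
                }
            }
        }
        if (deferCallbacks) {
            drain(pending, confirmed);         // callback after every entry was processed
        }
        return confirmed;
    }

    public static void main(String[] args) {
        Set<Integer> replaced = new HashSet<>(List.of(2, 3));
        System.out.println(replace(replaced, false)); // prints [0, 1, 2, 3, 4]
        System.out.println(replace(replaced, true));  // prints [0]
    }
}
```

With the early callback, entry1,2,3 are confirmed before their acknowledgements from Bookie2 and Bookie3 are cleared, which is the mismatch described above. With the deferred callback, only entry0 is confirmed and the remaining entries wait for their rewrites.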
