oneby-wang opened a new issue, #4658:
URL: https://github.com/apache/bookkeeper/issues/4658

   Hi, I recently read the BookKeeper GC source code to figure out which
situations can cause the data loss addressed in pull request [Fix the data loss
issue that caused by the wrong entry log header
#4607](https://github.com/apache/bookkeeper/pull/4607). See [#4607
(comment)](https://github.com/apache/bookkeeper/pull/4607#issuecomment-3228330508).
   I think logFileHeader and ledgersMap are very important parts of the entryLog
file, and their integrity and consistency should be guaranteed by a digest.
   We should add a digest covering logFileHeader and ledgersMap in the
appendLedgersMap() method. I list the original code here to illustrate the idea.
   
https://github.com/apache/bookkeeper/blob/e80d0318cfdebbc79f37d9120de8df28c8c1c13a/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/DefaultEntryLogger.java#L139-L210
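
   To make the digest idea concrete, here is a minimal sketch. It is not the DefaultEntryLogger code: the class name, the method name and the byte layout (entry count, then ledgerId/size pairs, then a trailing CRC32) are all assumptions for illustration, and the real ledgersMap format is different.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical sketch, not BookKeeper's on-disk format.
final class LedgersMapDigestSketch {

    // Assumed layout: [count:int][(ledgerId:long, size:long) * count][crc32:long]
    static byte[] encodeWithDigest(Map<Long, Long> ledgersMap) {
        ByteBuffer buf = ByteBuffer.allocate(4 + ledgersMap.size() * 16 + 8);
        buf.putInt(ledgersMap.size());
        // Sort by ledger id so the same map always produces the same bytes.
        for (Map.Entry<Long, Long> e : new TreeMap<>(ledgersMap).entrySet()) {
            buf.putLong(e.getKey());
            buf.putLong(e.getValue());
        }
        // The digest covers everything written so far.
        CRC32 crc = new CRC32();
        crc.update(buf.array(), 0, buf.position());
        buf.putLong(crc.getValue());
        return buf.array();
    }
}
```

   A reader can then recompute the CRC32 over all bytes except the last eight and compare it with the stored value before trusting the map.
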
   In the GC thread, we should validate the digest in the
extractEntryLogMetadataFromIndex() method. If digest validation fails, we should
throw an IOException directly and fall back to extractEntryLogMetadataByScanning().
   
https://github.com/apache/bookkeeper/blob/e80d0318cfdebbc79f37d9120de8df28c8c1c13a/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/DefaultEntryLogger.java#L1072-L1159
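
   The validate-then-fall-back flow could look roughly like the sketch below. Everything in it is hypothetical: the MetadataSource interface stands in for the two extraction methods, and the block layout (payload followed by an 8-byte CRC32) is an assumption, not the existing format.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch of "verify digest, otherwise fall back to scanning".
final class DigestFallbackSketch {

    // Stand-in for an extraction strategy (index-based or scanning).
    interface MetadataSource {
        String extract() throws IOException;
    }

    // Helper for the sketch: append a CRC32 of the payload as 8 trailing bytes.
    static byte[] withDigest(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return ByteBuffer.allocate(payload.length + 8)
                .put(payload)
                .putLong(crc.getValue())
                .array();
    }

    // Throws IOException when the stored digest does not match the payload.
    static void verifyDigest(byte[] block) throws IOException {
        if (block.length < 8) {
            throw new IOException("block too short to carry a digest");
        }
        int payloadLen = block.length - 8;
        CRC32 crc = new CRC32();
        crc.update(block, 0, payloadLen);
        long stored = ByteBuffer.wrap(block, payloadLen, 8).getLong();
        if (stored != crc.getValue()) {
            throw new IOException("digest mismatch in ledgers map block");
        }
    }

    // Try the index first; any IOException sends us to the scanning path.
    static String extractMetadata(MetadataSource fromIndex, MetadataSource byScanning)
            throws IOException {
        try {
            return fromIndex.extract();
        } catch (IOException e) {
            return byScanning.extract();
        }
    }
}
```

   The point of the design is that a corrupted index is never silently trusted: it either verifies or the slower scan rebuilds the metadata.
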
   In addition, the following break should be replaced with an IOException. If
ledgersMapSize <= 0, it is an invalid value, so we should throw an IOException
and fall back to scanning rather than break out of the while loop, which may
lead to unpredictable results.
   
https://github.com/apache/bookkeeper/blob/e80d0318cfdebbc79f37d9120de8df28c8c1c13a/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/DefaultEntryLogger.java#L1104-L1106
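
   As a rough illustration of the proposed change, the loop below reads size-prefixed blocks and throws on a non-positive size instead of breaking. The class, the method and the simplified layout (a 4-byte size followed by that many bytes) are assumptions made for this sketch only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical sketch: treat an invalid block size as an error, not as "done".
final class LedgersMapSizeCheckSketch {

    // Counts size-prefixed blocks; any invalid size aborts with an IOException
    // so that the caller can fall back to scanning.
    static int countBlocks(ByteBuffer index) throws IOException {
        int blocks = 0;
        while (index.remaining() >= 4) {
            int ledgersMapSize = index.getInt();
            if (ledgersMapSize <= 0) {
                // Previously a 'break'; a non-positive size is never valid here.
                throw new IOException("invalid ledgers map size: " + ledgersMapSize);
            }
            if (ledgersMapSize > index.remaining()) {
                throw new IOException("truncated ledgers map block");
            }
            index.position(index.position() + ledgersMapSize); // skip the block body
            blocks++;
        }
        return blocks;
    }
}
```
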
   If we had digest validation, the data loss described in [Fix the data loss
issue that caused by the wrong entry log header
#4607](https://github.com/apache/bookkeeper/pull/4607) would probably not have
happened.

