nsnycde commented on PR #4607: URL: https://github.com/apache/bookkeeper/pull/4607#issuecomment-3233468998
> > Hi, I'm wondering what situation will cause data loss(caused by wrong file header)? I read the source code, it seems the file header is updated by BufferedLogChannel.appendLedgersMap() method. > > @oneby-wang @nsnycde I guess that the file might be read by GC before that happens. This PR addresses the issue that there's garbage in the initial header. The original assumption has been that the header would be initialized with zeros. However, that assumption didn't hold with Netty's optimized array allocation as explained in [#4607 (comment)](https://github.com/apache/bookkeeper/pull/4607#issuecomment-2918429558) . If there's an invalid header in the entry log file, it could get deleted in GC instead of skipped. There's also the chance that the header just happens to contain data the gets interpreted as valid data in GC. At least this is how I understood the problem and the fix. When bk is running normally, gc should not read unflushed logs. You can see that the loop in the code below is entryLogger.getFlushedLogIds() ``` //org.apache.bookkeeper.bookie.GarbageCollectorThread#extractMetaFromEntryLogs protected void extractMetaFromEntryLogs() throws EntryLogMetadataMapException { for (long entryLogId : entryLogger.getFlushedLogIds()) { // Comb the current entry log file if it has not already been extracted. if (entryLogMetaMap.containsKey(entryLogId)) { continue; } // check whether log file exists or not // if it doesn't exist, this log file might have been garbage collected. if (!entryLogger.logExists(entryLogId)) { continue; } LOG.info("Extracting entry log meta from entryLogId: {}", entryLogId); try { // Read through the entry log file and extract the entry log meta EntryLogMetadata entryLogMeta = entryLogger.getEntryLogMetadata(entryLogId, throttler); removeIfLedgerNotExists(entryLogMeta); if (entryLogMeta.isEmpty()) { // This means the entry log is not associated with any active // ledgers anymore. // We can remove this entry log file now. LOG.info("Deleting entryLogId {} as it has no active ledgers!", entryLogId); removeEntryLog(entryLogId); gcStats.getReclaimedSpaceViaDeletes().addCount(entryLogMeta.getTotalSize()); } else { entryLogMetaMap.put(entryLogId, entryLogMeta); } } catch (IOException | RuntimeException e) { LOG.warn("Premature exception when processing " + entryLogId + " recovery will take care of the problem", e); } catch (OutOfMemoryError oome) { LOG.warn("OutOfMemoryError when processing {} - skipping the entry log", entryLogId, oome); } } } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
