kowshik opened a new pull request #10280: URL: https://github.com/apache/kafka/pull/10280
**This PR is a work in progress. Please do not treat this as a finished product yet.** **TL;DR:** This PR implements the details of the Log layer refactor, as outlined in this document: https://docs.google.com/document/d/1dQJL4MCwqQJSPmZkVmVzshFZKuFy_bCPtubav4wBfHQ/edit. Few details maybe different from the doc, but it is more or less the same. **STRATEGY:** In this PR, I've extracted a new class called `LocalLog` out of `Log`. Currently `LocalLog` is purely an implementation detail thats not exposed outside `Log` class (except for tests). The object encapsulation is that each `Log` instance wraps around a `LocalLog` instance. This new `LocalLog` class attempts to encompass **most** of the responsibilities of local log surrounding the `segments` map, which otherwise were present in `Log` previously. Note that **not** all local log responsibilities have been moved over to this new class (yet). The criteria I used was to preserve (for now) in existing `Log` class, any logic that is mingled in a complex manner with the `logStartOffset` or the `LeaderEpochCache` or the `ProducerStateManager`. **WINS:** The main win is that the new `LocalLog` class is now agnostic of the `logStartOffset`, which continues to be managed mainly by `Log` class. Below is the local log functionality that has successfully moved over from `Log` to `LocalLog`: 1. The `ConcurrentNavigableMap` containing map of of offset -> segment, and the access logic surrounding it. 2. Read path logic to read records from the log. 3. Segment file deletion logic. 4. Segment recovery. 5. Segment truncation. 6. Segment roll. **BLOCKERS:** The API of `LocalLog` is probably not crisp at this point. Below is the main local log functionality that continues to remain in `Log` due to blockers. The reason is that the below logic is mingled with one or more of the following: `logStartOffset` or `LeaderEpochCache` or `ProducerStateManager`. This makes it hard to separate just the local logic out of it: 1. Segment append. 2. Last stable offset and logic surrounding it. 3. High watermark and logic surrounding it. 4. Logic to `fetchOffsetByTimestamp` and logic to `legacyFetchOffsetsBefore`. 5. Some of the retention logic thats related with the global view of the log. 6. All other logic related with handling `LeaderEpochCache` and `ProducerStateManager`. **PAINPOINTS:** 1. Producer state manager instance needed to be passed explicitly into the roll API to capture snapshots before roll. 2. Log locking semantics had to be changed in handful of areas, with lock taken at a coarse level (ex: recovery). 3. Few APIs needed re-ordering of logic in `Log` class to make migration feasible. 4. Certain APIs added to `LocalLog` are crude in nature or signature, examples: `def checkIfMemoryMappedBufferClosed`, `def markFlushed`, `def updateRecoveryPoint`, `def replaceSegments` etc. 5. Certain important APIs (such as `def append` logic) were hard to migrate because logic was mingled with Leader epoch cache, Producer state manager and log start offset. **TESTS:** Note that just the test `LogTest. testProducerSnapshotsRecoveryAfterUncleanShutdown*` is failing, and I'll need to look into this. Otherwise all other tests are expected to pass for this PR. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org