Apologies, it seems the images didn't attach... There were only two; I'm attaching them to this message. Sorry for the inconvenience!
- Ivan

On Thu, Oct 2, 2025, at 14:06, Ivan Yurchenko wrote:
> Hi dear Kafka community,
>
> In the initial Diskless proposal, we proposed a separate component, the
> batch/diskless coordinator, whose role would be to centrally manage the batch
> and WAL file metadata for diskless topics. This component drew many
> reasonable comments from the community about how it would support various
> Kafka features (transactions, queues) and about its scalability. While we believe
> we have good answers to all the expressed concerns, we took a step back and
> looked at the problem from a different perspective.
>
> We would like to propose an alternative Diskless design *without a
> centralized coordinator*. We believe this approach has potential and propose
> to discuss it, as it may be more appealing to the community.
>
> Let us explain the idea. Most of the complications with the original Diskless
> approach come from one necessary architecture change: globalizing the local
> state of the partition leader in the batch coordinator. This causes deviations
> from the established workflows in various features like produce idempotence and
> transactions, queues, retention, etc. These deviations need to be carefully
> considered, designed, and later implemented and tested. In the new approach
> we want to avoid this by making partition leaders once again responsible for
> managing their partitions, even in diskless topics.
>
> In classic Kafka topics, batch data and metadata are blended together in the
> partition log. The crux of the Diskless idea is to decouple them and move
> the data to remote storage while keeping the metadata somewhere else. Using the
> central batch coordinator to manage batch metadata is one way, but not the
> only one.
>
> Let's now think about managing metadata for each user partition
> independently. Generally, partitions are independent and don't share anything
> apart from the fact that their data are mixed in WAL files. If we figure out
> how to commit and later delete WAL files safely, we will achieve the necessary
> autonomy that allows us to get rid of the central batch coordinator. Instead,
> *each diskless user partition will be managed by its leader*, as in classic
> Kafka topics. Also like in classic topics, the leader uses the partition log
> to persist batch metadata, i.e. the regular batch header plus the
> information about how to find the batch on remote storage. In contrast to
> classic topics, the batch data is in remote storage.
>
> For clarity, let's compare the three designs:
> • Classic topics:
>   • Data and metadata are co-located in the partition log.
>   • Partition log content: [Batch header (metadata)|Batch data].
>   • The partition log is replicated to the followers.
>   • The replicas and leader have local state built from metadata.
> • Original Diskless:
>   • Metadata is in the batch coordinator, data is on remote storage.
>   • The partition state is global in the batch coordinator.
> • New Diskless:
>   • Metadata is in the partition log, data is on remote storage.
>   • Partition log content: [Batch header (metadata)|Batch coordinates on remote storage] (sketched below).
>   • The partition log is replicated to the followers.
>   • The replicas and leader have local state built from metadata.
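>
> To make this concrete, here is a minimal sketch of what such a per-batch
> metadata entry could look like. All class and field names are illustrative
> only; they are not taken from the KIP or the Kafka codebase:
>
>     // Hypothetical shape of the metadata a leader appends to the partition
>     // log for a diskless batch. The usual batch header fields stay as they
>     // are today; only the location of the payload differs from classic topics.
>     public record DisklessBatchCoordinates(
>         String walObjectKey,   // object name of the shared WAL file on remote storage
>         long byteOffset,       // where this batch starts inside the WAL file
>         int sizeInBytes,       // length of the batch inside the WAL file
>         long baseOffset,       // first Kafka offset assigned to the batch
>         int recordCount        // number of records, as in the regular batch header
>     ) {}
>
> The point being that everything except the payload location is metadata the
> leader already maintains for classic topics.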
> Let's consider the produce path. Here's a reminder of the original Diskless
> design:
>
>
> The new approach could be depicted as follows:
>
>
> As you can see, the main difference is that now, instead of a single commit
> request to the batch coordinator, we send multiple parallel commit requests
> to the leaders of all the partitions involved in the WAL file. Each of them
> will commit its batches independently, without coordinating with other
> leaders or any other components. Batch data is addressed by the WAL file
> name, byte offset, and size, so a partition needs to know nothing about
> other partitions to access its data in a shared WAL file.
>
> The number of partitions involved in a single WAL file may be quite large,
> e.g. a hundred. A hundred network requests to commit one WAL file is very
> impractical. However, there are ways to reduce this number:
> 1. Partition leaders are located on brokers. Requests to leaders on one
> broker could be grouped together into a single physical network request
> (resembling the normal Produce request, which may carry batches for many
> partitions inside). This caps the number of network requests at the
> number of brokers in the cluster.
> 2. If we craft the cluster metadata to make producers send their requests to
> the right brokers (with respect to AZs), we may achieve a higher
> concentration of logical commit requests per physical network request,
> reducing the number of the latter even further, ideally to one.
>
> Obviously, out of multiple commit requests, some may fail or time out for a
> variety of reasons. This is fine. Some producers will receive totally or
> partially failed responses to their Produce requests, similar to what they
> would receive when an append to a classic topic fails or times out. If
> a partition experiences problems, other partitions will not be affected
> (again, like in classic topics). Of course, the uncommitted data will be
> garbage in WAL files. But WAL files are short-lived (batches are constantly
> assembled into segments and offloaded to tiered storage), so this garbage
> will eventually be deleted.
>
> For safely deleting WAL files, we now need to manage them centrally, as this
> is the only state and logic that spans multiple partitions. On the diagram,
> you can see another commit request called "Commit file (best effort)" going
> to the WAL File Manager. This manager will be responsible for the following
> (a small sketch of the deletion check follows the list):
> 1. Collecting (via requests from brokers) and persisting information about
> committed WAL files.
> 2. To handle potential failures in file information delivery, periodically
> doing a prefix scan on the remote storage to find and register unknown
> files. The period of this scan will be configurable and ideally should be
> quite long.
> 3. Checking with the relevant partition leaders (after a grace period)
> whether they still have batches in a particular file.
> 4. Physically deleting files when they are no longer referred to by any
> partition.
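>
> A minimal sketch of the manager's deletion check (points 3 and 4 above).
> All interfaces and names below are hypothetical, purely to illustrate the
> idea; they are not part of the KIP or the Kafka codebase:
>
>     import java.time.Duration;
>     import java.time.Instant;
>     import java.util.List;
>
>     // Hypothetical view of a committed WAL file, as tracked by the manager.
>     record WalFile(String objectKey, Instant committedAt, List<String> partitions) {}
>
>     // Hypothetical callback to ask a partition's current leader a question.
>     interface LeaderClient {
>         // Does the partition's log still reference any batch in the given WAL file?
>         boolean stillReferences(String partition, String objectKey);
>     }
>
>     // Hypothetical handle to the remote object storage.
>     interface RemoteStorage {
>         void delete(String objectKey);
>     }
>
>     class WalFileDeletionCheck {
>         private final Duration gracePeriod;
>         private final LeaderClient leaders;
>         private final RemoteStorage storage;
>
>         WalFileDeletionCheck(Duration gracePeriod, LeaderClient leaders, RemoteStorage storage) {
>             this.gracePeriod = gracePeriod;
>             this.leaders = leaders;
>             this.storage = storage;
>         }
>
>         // Returns true if the file was physically deleted.
>         boolean maybeDelete(WalFile file) {
>             // Respect the grace period: commits for this file may still be in
>             // flight or not yet reported to the manager by all brokers.
>             if (Instant.now().isBefore(file.committedAt().plus(gracePeriod))) {
>                 return false;
>             }
>             // Any partition that still references the file keeps it alive.
>             for (String partition : file.partitions()) {
>                 if (leaders.stillReferences(partition, file.objectKey())) {
>                     return false;
>                 }
>             }
>             storage.delete(file.objectKey());
>             return true;
>         }
>     }
>
> Together with the periodic prefix scan, this means a file that was uploaded
> but never fully reported is still discovered and eventually cleaned up.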
>
> This new design offers the following advantages:
> 1. It simplifies the implementation of many Kafka features such as
> idempotence, transactions, queues, tiered storage, and retention. Now we
> don't need to abstract away and reuse the code from partition leaders in the
> batch coordinator. Instead, we will literally use the same code paths in the
> leaders, with little adaptation. Workflows from classic topics mostly remain
> unchanged.
> For example, it seems that
> ReplicaManager.maybeSendPartitionsToTransactionCoordinator and
> KafkaApis.handleWriteTxnMarkersRequest, used for transaction support on the
> partition leader side, could be used for diskless topics with little
> adaptation. ProducerStateManager, needed for both idempotent produce and
> transactions, would be reused.
> Another example is share group support, where the share partition leader,
> being co-located with the partition leader, would execute the same logic for
> both diskless and classic topics.
> 2. It returns to the familiar partition-based scaling model, where partitions
> are independent.
> 3. It makes the operation and failure patterns closer to the familiar ones
> from classic topics.
> 4. It opens a straightforward path to seamlessly switching topics between
> diskless and classic modes.
>
> Everything else remains unchanged compared to the previous Diskless design
> (after all the previous discussions): local segment materialization by
> replicas, the consume path, tiered storage integration, etc.
>
> If the community finds this design more suitable, we will update the KIP(s)
> accordingly and continue working on it. Please let us know what you think.
>
> Best regards,
> Ivan and Diskless team
>
> On Mon, Sep 29, 2025, at 15:06, Ivan Yurchenko wrote: > > Hi Justine, > > > > Yes, you're right. We need to track the aborted transactions for in the > > diskless coordinator for as long as the corresponding offsets are there. > > With the tiered storage unification Greg mentioned earlier, this will be > > finite time even for infinite data retention. > > > > Best, > > Ivan > > > > On Wed, Sep 17, 2025, at 19:41, Justine Olshan wrote: > > > Hey Ivan, > > > > > > Thanks for the response. I think most of what you said made sense, but I > > > did have some questions about this part: > > > > > > > As we understand this, the partition leader in classic topics forgets > > > about a transaction once it’s replicated (HWM overpasses it). The > > > transaction coordinator acts like the main guardian, allowing partition > > > leaders to do this safely. Please correct me if this is wrong. We think > > > about relying on this with the batch coordinator and delete the > > > information > > > about a transaction once it’s finished (as there’s no replication and HWM > > > advances immediately). > > > > > > I didn't quite understand this. In classic topics, we have maps for > > > ongoing > > > transactions which remove state when the transaction is completed and an > > > aborted transactions index which is retained for much longer. Once the > > > transaction is completed, the coordinator is no longer involved in > > > maintaining this partition side state, and it is subject to compaction > > > etc. > > > Looking back at the outline provided above, I didn't see much about the > > > fetch path, so maybe that could be expanded a bit further. I saw the > > > following in a response: > > > > When the broker constructs a fully valid local segment, all the > > > > necessary > > > control batches will be inserted and indices, including the transaction > > > index will be built to serve FetchRequests exactly as they are today. > > > > > > Based on this, it seems like we need to retain the information about > > > aborted txns for longer. > > > > > > Thanks, > > > Justine > > > > > > On Mon, Sep 15, 2025 at 9:43 AM Ivan Yurchenko <[email protected]> wrote: > > > > > > > Hi Justine and all, > > > > > > > > Thank you for your questions!
> > > > > > > > > JO 1. >Since a transaction could be uniquely identified with producer > > > > > ID > > > > > and epoch, the positive result of this check could be cached locally > > > > > Are we saying that only new transaction version 2 transactions can be > > > > used > > > > > here? If not, we can't uniquely identify transactions with producer > > > > > id + > > > > > epoch > > > > > > > > You’re right that we (probably unintentionally) focused only on version > > > > 2. > > > > We can either limit the support to version 2 or consider using some > > > > surrogates to support version 1. > > > > > > > > > JO 2. >The batch coordinator does the final transactional checks of > > > > > the > > > > > batches. This procedure would output the same errors like the > > > > > partition > > > > > leader in classic topics would do. > > > > > Can you expand on what these checks are? Would you be checking if the > > > > > transaction was still ongoing for example?* * > > > > > > > > Yes, the producer epoch, that the transaction is ongoing, and of course > > > > the normal idempotence checks. What the partition leader in the classic > > > > topics does before appending a batch to the local log (e.g. in > > > > UnifiedLog.maybeStartTransactionVerification and > > > > UnifiedLog.analyzeAndValidateProducerState). In Diskless, we > > > > unfortunately > > > > cannot do these checks before appending the data to the WAL segment and > > > > uploading it, but we can “tombstone” these batches in the batch > > > > coordinator > > > > during the final commit. > > > > > > > > > Is there state about ongoing > > > > > transactions in the batch coordinator? I see some other state > > > > > mentioned > > > > in > > > > > the End transaction section, but it's not super clear what state is > > > > stored > > > > > and when it is stored. > > > > > > > > Right, this should have been more explicit. As the partition leader > > > > tracks > > > > ongoing transactions for classic topics, the batch coordinator has to as > > > > well. So when a transaction starts and ends, the transaction coordinator > > > > must inform the batch coordinator about this. > > > > > > > > > JO 3. I didn't see anything about maintaining LSO -- perhaps that > > > > > would > > > > be > > > > > stored in the batch coordinator? > > > > > > > > Yes. This could be deduced from the committed batches and other > > > > information, but for the sake of performance we’d better store it > > > > explicitly. > > > > > > > > > JO 4. Are there any thoughts about how long transactional state is > > > > > maintained in the batch coordinator and how it will be cleaned up? > > > > > > > > As we understand this, the partition leader in classic topics forgets > > > > about a transaction once it’s replicated (HWM overpasses it). The > > > > transaction coordinator acts like the main guardian, allowing partition > > > > leaders to do this safely. Please correct me if this is wrong. We think > > > > about relying on this with the batch coordinator and delete the > > > > information > > > > about a transaction once it’s finished (as there’s no replication and > > > > HWM > > > > advances immediately). > > > > > > > > Best, > > > > Ivan > > > > > > > > On Tue, Sep 9, 2025, at 00:38, Justine Olshan wrote: > > > > > Hey folks, > > > > > > > > > > Excited to see some updates related to transactions! > > > > > > > > > > I had a few questions. > > > > > > > > > > JO 1. 
>Since a transaction could be uniquely identified with producer > > > > > ID > > > > > and epoch, the positive result of this check could be cached locally > > > > > Are we saying that only new transaction version 2 transactions can be > > > > used > > > > > here? If not, we can't uniquely identify transactions with producer > > > > > id + > > > > > epoch > > > > > > > > > > JO 2. >The batch coordinator does the final transactional checks of > > > > > the > > > > > batches. This procedure would output the same errors like the > > > > > partition > > > > > leader in classic topics would do. > > > > > Can you expand on what these checks are? Would you be checking if the > > > > > transaction was still ongoing for example? Is there state about > > > > > ongoing > > > > > transactions in the batch coordinator? I see some other state > > > > > mentioned > > > > in > > > > > the End transaction section, but it's not super clear what state is > > > > stored > > > > > and when it is stored. > > > > > > > > > > JO 3. I didn't see anything about maintaining LSO -- perhaps that > > > > > would > > > > be > > > > > stored in the batch coordinator? > > > > > > > > > > JO 4. Are there any thoughts about how long transactional state is > > > > > maintained in the batch coordinator and how it will be cleaned up? > > > > > > > > > > On Mon, Sep 8, 2025 at 10:38 AM Jun Rao <[email protected]> > > > > wrote: > > > > > > > > > > > Hi, Greg and Ivan, > > > > > > > > > > > > Thanks for the update. A few comments. > > > > > > > > > > > > JR 10. "Consumer fetches are now served from local segments, making > > > > use of > > > > > > the > > > > > > indexes, page cache, request purgatory, and zero-copy functionality > > > > already > > > > > > built into classic topics." > > > > > > JR 10.1 Does the broker build the producer state for each partition > > > > > > in > > > > > > diskless topics? > > > > > > JR 10.2 For transactional data, the consumer fetches need to know > > > > aborted > > > > > > records. How is that achieved? > > > > > > > > > > > > JR 11. "The batch coordinator saves that the transaction is finished > > > > and > > > > > > also inserts the control batches in the corresponding logs of the > > > > involved > > > > > > Diskless topics. This happens only on the metadata level, no actual > > > > control > > > > > > batches are written to any file. " > > > > > > A fetch response could include multiple transactional batches. How > > > > does the > > > > > > broker obtain the information about the ending control batch for > > > > > > each > > > > > > batch? Does that mean that a fetch response needs to be built by > > > > > > stitching record batches and generated control batches together? > > > > > > > > > > > > JR 12. Queues: Is there still a share partition leader that all > > > > consumers > > > > > > are routed to? > > > > > > > > > > > > JR 13. "Should the KIPs be modified to include this or it's too > > > > > > implementation-focused?" It would be useful to include enough > > > > > > details > > > > to > > > > > > understand correctness and performance impact. > > > > > > > > > > > > HC5. Henry has a valid point. Requests from a given producer > > > > > > contain a > > > > > > sequence number, which is ordered. If a producer sends every Produce > > > > > > request to an arbitrary broker, those requests could reach the batch > > > > > > coordinator in different order and lead to rejection of the produce > > > > > > requests. 
> > > > > > > > > > > > Jun > > > > > > > > > > > > On Thu, Sep 4, 2025 at 12:00 AM Ivan Yurchenko <[email protected]> > > > > > > wrote: > > > > > > > > > > > > > Hi all, > > > > > > > > > > > > > > We have also thought in a bit more details about transactions and > > > > queues, > > > > > > > here's the plan. > > > > > > > > > > > > > > *Transactions* > > > > > > > > > > > > > > The support for transactions in *classic topics* is based on > > > > > > > precise > > > > > > > interactions between three actors: clients (mostly producers, but > > > > also > > > > > > > consumers), brokers (ReplicaManager and other classes), and > > > > transaction > > > > > > > coordinators. Brokers also run partition leaders with their local > > > > state > > > > > > > (ProducerStateManager and others). > > > > > > > > > > > > > > The high level (some details skipped) workflow is the following. > > > > When a > > > > > > > transactional Produce request is received by the broker: > > > > > > > 1. For each partition, the partition leader checks if a non-empty > > > > > > > transaction is running for this partition. This is done using its > > > > local > > > > > > > state derived from the log metadata (ProducerStateManager, > > > > > > > VerificationStateEntry, VerificationGuard). > > > > > > > 2. The transaction coordinator is informed about all the > > > > > > > partitions > > > > that > > > > > > > aren’t part of the transaction to include them. > > > > > > > 3. The partition leaders do additional transactional checks. > > > > > > > 4. The partition leaders append the transactional data to their > > > > > > > logs > > > > and > > > > > > > update some of their state (for example, log the fact that the > > > > > > transaction > > > > > > > is running for the partition and its first offset). > > > > > > > > > > > > > > When the transaction is committed or aborted: > > > > > > > 1. The producer contacts the transaction coordinator directly with > > > > > > > EndTxnRequest. > > > > > > > 2. The transaction coordinator writes PREPARE_COMMIT or > > > > PREPARE_ABORT to > > > > > > > its log and responds to the producer. > > > > > > > 3. The transaction coordinator sends WriteTxnMarkersRequest to the > > > > > > leaders > > > > > > > of the involved partitions. > > > > > > > 4. The partition leaders write the transaction markers to their > > > > > > > logs > > > > and > > > > > > > respond to the coordinator. > > > > > > > 5. The coordinator writes the final transaction state > > > > COMPLETE_COMMIT or > > > > > > > COMPLETE_ABORT. > > > > > > > > > > > > > > In classic topics, partitions have leaders and lots of important > > > > state > > > > > > > necessary for supporting this workflow is local. The main > > > > > > > challenge > > > > in > > > > > > > mapping this to Diskless comes from the fact there are no > > > > > > > partition > > > > > > > leaders, so the corresponding pieces of state need to be > > > > > > > globalized > > > > in > > > > > > the > > > > > > > batch coordinator. We are already doing this to support idempotent > > > > > > produce. > > > > > > > > > > > > > > The high level workflow for *diskless topics* would look very > > > > similar: > > > > > > > 1. For each partition, the broker checks if a non-empty > > > > > > > transaction > > > > is > > > > > > > running for this partition. In contrast to classic topics, this is > > > > > > checked > > > > > > > against the batch coordinator with a single RPC. 
Since a > > > > > > > transaction > > > > > > could > > > > > > > be uniquely identified with producer ID and epoch, the positive > > > > result of > > > > > > > this check could be cached locally (for the double configured > > > > duration > > > > > > of a > > > > > > > transaction, for example). > > > > > > > 2. The same: The transaction coordinator is informed about all the > > > > > > > partitions that aren’t part of the transaction to include them. > > > > > > > 3. No transactional checks are done on the broker side. > > > > > > > 4. The broker appends the transactional data to the current shared > > > > WAL > > > > > > > segment. It doesn’t update any transaction-related state for > > > > > > > Diskless > > > > > > > topics, because it doesn’t have any. > > > > > > > 5. The WAL segment is committed to the batch coordinator like in > > > > > > > the > > > > > > > normal produce flow. > > > > > > > 6. The batch coordinator does the final transactional checks of > > > > > > > the > > > > > > > batches. This procedure would output the same errors like the > > > > partition > > > > > > > leader in classic topics would do. I.e. some batches could be > > > > rejected. > > > > > > > This means, there will potentially be garbage in the WAL segment > > > > file in > > > > > > > case of transactional errors. This is preferable to doing more > > > > network > > > > > > > round trips, especially considering the WAL segments will be > > > > relatively > > > > > > > short-living (see the Greg's update above). > > > > > > > > > > > > > > When the transaction is committed or aborted: > > > > > > > 1. The producer contacts the transaction coordinator directly with > > > > > > > EndTxnRequest. > > > > > > > 2. The transaction coordinator writes PREPARE_COMMIT or > > > > PREPARE_ABORT to > > > > > > > its log and responds to the producer. > > > > > > > 3. *[NEW]* The transaction coordinator informs the batch > > > > > > > coordinator > > > > that > > > > > > > the transaction is finished. > > > > > > > 4. *[NEW]* The batch coordinator saves that the transaction is > > > > finished > > > > > > > and also inserts the control batches in the corresponding logs of > > > > > > > the > > > > > > > involved Diskless topics. This happens only on the metadata > > > > > > > level, no > > > > > > > actual control batches are written to any file. They will be > > > > dynamically > > > > > > > created on Fetch and other read operations. We could technically > > > > write > > > > > > > these control batches for real, but this would mean extra produce > > > > > > latency, > > > > > > > so it's better just to mark them in the batch coordinator and save > > > > these > > > > > > > milliseconds. > > > > > > > 5. The transaction coordinator sends WriteTxnMarkersRequest to the > > > > > > leaders > > > > > > > of the involved partitions. – Now only to classic topics now. > > > > > > > 6. The partition leaders of classic topics write the transaction > > > > markers > > > > > > > to their logs and respond to the coordinator. > > > > > > > 7. The coordinator writes the final transaction state > > > > COMPLETE_COMMIT or > > > > > > > COMPLETE_ABORT. > > > > > > > > > > > > > > Compared to the non-transactional produce flow, we get: > > > > > > > 1. An extra network round trip between brokers and the batch > > > > coordinator > > > > > > > when a new partition appear in the transaction. To mitigate the > > > > impact of > > > > > > > them: > > > > > > > - The results will be cached. 
> > > > > > > - The calls for multiple partitions in one Produce request will > > > > > > > be > > > > > > > grouped. > > > > > > > - The batch coordinator should be optimized for fast response to > > > > these > > > > > > > RPCs. > > > > > > > - The fact that a single producer normally will communicate > > > > > > > with a > > > > > > > single broker for the duration of the transaction further reduces > > > > > > > the > > > > > > > expected number of round trips. > > > > > > > 2. An extra round trip between the transaction coordinator and > > > > > > > batch > > > > > > > coordinator when a transaction is finished. > > > > > > > > > > > > > > With this proposal, transactions will also be able to span both > > > > classic > > > > > > > and Diskless topics. > > > > > > > > > > > > > > *Queues* > > > > > > > > > > > > > > The share group coordination and management is a side job that > > > > doesn't > > > > > > > interfere with the topic itself (leadership, replicas, physical > > > > storage > > > > > > of > > > > > > > records, etc.) and non-queue producers and consumers (Fetch and > > > > Produce > > > > > > > RPCs, consumer group-related RPCs are not affected.) We don't see > > > > > > > any > > > > > > > reason why we can't make Diskless topics compatible with share > > > > groups the > > > > > > > same way as classic topics are. Even on the code level, we don't > > > > expect > > > > > > any > > > > > > > serious refactoring: the same reading routines are used that are > > > > used for > > > > > > > fetching (e.g. ReplicaManager.readFromLog). > > > > > > > > > > > > > > > > > > > > > Should the KIPs be modified to include this or it's too > > > > > > > implementation-focused? > > > > > > > > > > > > > > Best regards, > > > > > > > Ivan > > > > > > > > > > > > > > On Wed, Sep 3, 2025, at 21:59, Greg Harris wrote: > > > > > > > > Hi all, > > > > > > > > > > > > > > > > Thank you all for your questions and design input on KIP-1150. > > > > > > > > > > > > > > > > We have just updated KIP-1150 and KIP-1163 with a new design. To > > > > > > > summarize > > > > > > > > the changes: > > > > > > > > > > > > > > > > 1. The design prioritizes integrating with the existing KIP-405 > > > > Tiered > > > > > > > > Storage interfaces, permitting data produced to a Diskless topic > > > > to be > > > > > > > > moved to tiered storage. > > > > > > > > This lowers the scalability requirements for the Batch > > > > > > > > Coordinator > > > > > > > > component, and allows Diskless to compose with Tiered Storage > > > > plugin > > > > > > > > features such as encryption and alternative data formats. > > > > > > > > > > > > > > > > 2. Consumer fetches are now served from local segments, making > > > > > > > > use > > > > of > > > > > > the > > > > > > > > indexes, page cache, request purgatory, and zero-copy > > > > > > > > functionality > > > > > > > already > > > > > > > > built into classic topics. > > > > > > > > However, local segments are now considered cache elements, do > > > > > > > > not > > > > need > > > > > > to > > > > > > > > be durably stored, and can be built without contacting any other > > > > > > > replicas. > > > > > > > > > > > > > > > > 3. The design has been simplified substantially, by removing the > > > > > > previous > > > > > > > > Diskless consume flow, distributed cache component, and "object > > > > > > > > compaction/merging" step. 
> > > > > > > > > > > > > > > > The design maintains leaderless produces as enabled by the Batch > > > > > > > > Coordinator, and the same latency profiles as the earlier > > > > > > > > design, > > > > while > > > > > > > > being simpler and integrating better into the existing > > > > > > > > ecosystem. > > > > > > > > > > > > > > > > Thanks, and we are eager to hear your feedback on the new > > > > > > > > design. > > > > > > > > Greg Harris > > > > > > > > > > > > > > > > On Mon, Jul 21, 2025 at 3:30 PM Jun Rao > > > > > > > > <[email protected]> > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi, Jan, > > > > > > > > > > > > > > > > > > For me, the main gap of KIP-1150 is the support of all > > > > > > > > > existing > > > > > > client > > > > > > > > > APIs. Currently, there is no design for supporting APIs like > > > > > > > transactions > > > > > > > > > and queues. > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > Jun > > > > > > > > > > > > > > > > > > On Mon, Jul 21, 2025 at 3:53 AM Jan Siekierski > > > > > > > > > <[email protected]> wrote: > > > > > > > > > > > > > > > > > > > Would it be a good time to ask for the current status of > > > > > > > > > > this > > > > KIP? > > > > > > I > > > > > > > > > > haven't seen much activity here for the past 2 months, the > > > > vote got > > > > > > > > > vetoed > > > > > > > > > > but I think the pending questions have been answered since > > > > then. > > > > > > > KIP-1183 > > > > > > > > > > (AutoMQ's proposal) also didn't have any activity since May. > > > > > > > > > > > > > > > > > > > > In my eyes KIP-1150 and KIP-1183 are two real choices that > > > > > > > > > > can > > > > be > > > > > > > > > > made, with a coordinator-based approach being by far the > > > > dominant > > > > > > one > > > > > > > > > when > > > > > > > > > > it comes to market adoption - but all these are standalone > > > > > > products. > > > > > > > > > > > > > > > > > > > > I'm a big fan of both approaches, but would hate to see a > > > > stall. So > > > > > > > the > > > > > > > > > > question is: can we get an update? > > > > > > > > > > > > > > > > > > > > Maybe it's time to start another vote? Colin McCabe - have > > > > > > > > > > your > > > > > > > questions > > > > > > > > > > been answered? If not, is there anything I can do to help? > > > > > > > > > > I'm > > > > > > deeply > > > > > > > > > > familiar with both architectures and have written about > > > > > > > > > > both? > > > > > > > > > > > > > > > > > > > > Kind regards, > > > > > > > > > > Jan > > > > > > > > > > > > > > > > > > > > On Tue, Jun 24, 2025 at 10:42 AM Stanislav Kozlovski < > > > > > > > > > > [email protected]> wrote: > > > > > > > > > > > > > > > > > > > > > I have some nits - it may be useful to > > > > > > > > > > > > > > > > > > > > > > a) group all the KIP email threads in the main one (just a > > > > bunch > > > > > > of > > > > > > > > > links > > > > > > > > > > > to everything) > > > > > > > > > > > b) create the email threads > > > > > > > > > > > > > > > > > > > > > > It's a bit hard to track it all - for example, I was > > > > searching > > > > > > for > > > > > > > a > > > > > > > > > > > discuss thread for KIP-1165 for a while; As far as I can > > > > tell, it > > > > > > > > > doesn't > > > > > > > > > > > exist yet. 
> > > > > > > > > > > > > > > > > > > > > > Since the KIPs are published (by virtue of having the root > > > > KIP be > > > > > > > > > > > published, having a DISCUSS thread and links to sub-KIPs > > > > where > > > > > > were > > > > > > > > > aimed > > > > > > > > > > > to move the discussion towards), I think it would be good > > > > > > > > > > > to > > > > > > create > > > > > > > > > > DISCUSS > > > > > > > > > > > threads for them all. > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Stan > > > > > > > > > > > > > > > > > > > > > > On 2025/04/16 11:58:22 Josep Prat wrote: > > > > > > > > > > > > Hi Kafka Devs! > > > > > > > > > > > > > > > > > > > > > > > > We want to start a new KIP discussion about introducing > > > > > > > > > > > > a > > > > new > > > > > > > type of > > > > > > > > > > > > topics that would make use of Object Storage as the > > > > > > > > > > > > primary > > > > > > > source of > > > > > > > > > > > > storage. However, as this KIP is big we decided to > > > > > > > > > > > > split it > > > > > > into > > > > > > > > > > multiple > > > > > > > > > > > > related KIPs. > > > > > > > > > > > > We have the motivational KIP-1150 ( > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics > > > > > > > > > > > ) > > > > > > > > > > > > that aims to discuss if Apache Kafka should aim to have > > > > this > > > > > > > type of > > > > > > > > > > > > feature at all. This KIP doesn't go onto details on how > > > > > > > > > > > > to > > > > > > > implement > > > > > > > > > > it. > > > > > > > > > > > > This follows the same approach used when we discussed > > > > KRaft. > > > > > > > > > > > > > > > > > > > > > > > > But as we know that it is sometimes really hard to > > > > > > > > > > > > discuss > > > > on > > > > > > > that > > > > > > > > > meta > > > > > > > > > > > > level, we also created several sub-kips (linked in > > > > KIP-1150) > > > > > > that > > > > > > > > > offer > > > > > > > > > > > an > > > > > > > > > > > > implementation of this feature. > > > > > > > > > > > > > > > > > > > > > > > > We kindly ask you to use the proper DISCUSS threads for > > > > each > > > > > > > type of > > > > > > > > > > > > concern and keep this one to discuss whether Apache > > > > > > > > > > > > Kafka > > > > wants > > > > > > > to > > > > > > > > > have > > > > > > > > > > > > this feature or not. > > > > > > > > > > > > > > > > > > > > > > > > Thanks in advance on behalf of all the authors of this > > > > > > > > > > > > KIP. > > > > > > > > > > > > > > > > > > > > > > > > ------------------ > > > > > > > > > > > > Josep Prat > > > > > > > > > > > > Open Source Engineering Director, Aiven > > > > > > > > > > > > [email protected] | +491715557497 | aiven.io > > > > > > > > > > > > Aiven Deutschland GmbH > > > > > > > > > > > > Alexanderufer 3-7, 10117 Berlin > > > > > > > > > > > > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen, > > > > > > > > > > > > Anna Richardson, Kenneth Chen > > > > > > > > > > > > Amtsgericht Charlottenburg, HRB 209739 B > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
