Hi all,

Thanks for looking into the cost-saving KIPs holistically. KIP-1150 has changed substantially, so I'll copy the summary I posted in the other thread:

> We have just updated KIP-1150 and KIP-1163 with a new design. To summarize the changes:
>
> 1. The design prioritizes integrating with the existing KIP-405 Tiered Storage interfaces, permitting data produced to a Diskless topic to be moved to tiered storage.
> This lowers the scalability requirements for the Batch Coordinator component, and allows Diskless to compose with Tiered Storage plugin features such as encryption and alternative data formats.
>
> 2. Consumer fetches are now served from local segments, making use of the indexes, page cache, request purgatory, and zero-copy functionality already built into classic topics.
> However, local segments are now considered cache elements, do not need to be durably stored, and can be built without contacting any other replicas.
>
> 3. The design has been simplified substantially, by removing the previous Diskless consume flow, distributed cache component, and "object compaction/merging" step.
>
> The design maintains leaderless produces as enabled by the Batch Coordinator, and the same latency profiles as the earlier design, while being simpler and integrating better into the existing ecosystem.

We are eager to hear your feedback in the KIP-1150 thread.

Thanks,
Greg Harris

On Wed, Aug 27, 2025 at 8:08 PM Luke Chen <show...@gmail.com> wrote:

> Hi Tom,
>
> Thanks for adding these performance test results.
> So basically, for the acks=1 case, it's what we expected.
>
> I have some comments about it; I'll reply in the KIP-1176 discussion thread.
>
> Thank you.
> Luke
>
> On Thu, Aug 28, 2025 at 4:53 AM Thomas Thornton <tthorn...@salesforce.com.invalid> wrote:
>
> > Hi Luke,
> >
> > Thanks for creating this discussion on these KIPs.
> >
> > I'm collaborating with Henry on KIP-1176. We have deployed the Kafka fork to one of our Kafka clusters.
> > We collected performance data (full results here
> > <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=354454090#KIP1176:TieredStorageforActiveLogSegment-Appendix:PerformanceData>)
> > for the acks=1 case. Overall we see comparable performance (throughput & latency) for the producer path between a standard topic and one with KIP-1176 code. Consumer latency is slightly higher as data must travel an additional hop to/from S3E1Z. There are additional tunings we are aware of that could further boost performance (e.g., tuning the interval at which the leader/follower remote WAL tasks read/write data to/from cloud storage).
> >
> > Are there any other benchmarks the community would find useful? Any questions on these findings?
> >
> > Thanks,
> > Tom
> >
> > On Tue, Aug 5, 2025 at 1:31 AM Luke Chen <show...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > The Kafka community is currently seeing an unprecedented situation with three KIPs (KIP-1150, KIP-1176, KIP-1183) simultaneously addressing the same challenge of high replication costs when running Kafka across multiple cloud availability zones. Each KIP offers a different solution to this issue. While diversity of innovative ideas is a key strength of open-source projects, it creates a burden for reviewers and users who must compare and comment on multiple proposals simultaneously. Furthermore, discussion around the three KIPs has stalled for over two months now. This could be due to the authors being hesitant to proceed due to the existence of alternative, potentially conflicting, solutions. Addressing replication cost is a key concern of Kafka’s userbase and we should try to move the conversation forward if we can.
> > >
> > > From what I understand, these three KIPs are not mutually exclusive. But adopting all three KIPs in the community might not be what we expect.
> > > Thus, I would like to *start a discussion on how we could move the conversation forward*.
> > >
> > > To save time for the KIP readers/reviewers, I have created this document
> > > <https://cwiki.apache.org/confluence/display/KAFKA/The+Path+Forward+for+Saving+Cross-AZ+Replication+Costs+KIPs>[1]
> > > to help summarize each of the KIPs and describe their current status. *Hope to get some suggestions/feedback from the community*.
> > >
> > > [1] https://cwiki.apache.org/confluence/display/KAFKA/The+Path+Forward+for+Saving+Cross-AZ+Replication+Costs+KIPs
> > >
> > > KIP-1150: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> > > KIP-1176: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1176%3A+Tiered+Storage+for+Active+Log+Segment
> > > KIP-1183: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1183%3A+Unified+Shared+Storage
> > >
> > > Thank you.
> > > Luke