Hi all,

Thanks for the KIP. I've reviewed KIP-1150, KIP-1163, and KIP-1164, as well
as the relevant discussion threads. I may have granular comments about
KIP-1163 and KIP-1164, but the overall approach suggested in KIP-1150 looks
good to me. I
especially like that the approach covers two main pain points of operating
and paying for Kafka today: it allows cross-AZ traffic to be reduced (even
eliminated in some cases), and it also allows local disk usage by brokers
to be reduced (if operators opt for a small local cache on follower brokers
for non-tiered segments).

+1 (binding)

Cheers,

Chris

On Mon, Jan 26, 2026 at 3:36 PM vaquar khan <[email protected]> wrote:

> Hi Josep,
>
> Thank you for the detailed response. I appreciate the clarification
> regarding the distinction between the Inkless POC and the KIP design.
>
> However, my objection is not based on temporary bugs in the fork, but *on
> architectural gaps in the KIPs themselves* that these implementation issues
> highlighted. If we are voting to approve the design, the design documents
> must be structurally complete regarding data safety.
>
> *1. Regarding Storage Leaks (The Missing Design)*
>
> You mentioned that cleanup logic "can be defined later." However, KIP-1163
> explicitly delegates this responsibility to a separate process, and
> KIP-1165 (Object Compaction/GC) is currently marked as "Discarded" in the
> wiki.
>
> We cannot vote to approve a storage engine that has no specified mechanism
> for garbage collection. The "Upload-then-Commit" pattern described in
> KIP-1163 structurally creates orphaned segments during broker failures.
> Without an active KIP defining the reconciliation protocol (since KIP-1165
> was withdrawn), the proposal effectively describes a system with unbounded
> storage growth during failure modes. This is a blocking design gap, not an
> implementation detail.
>
> *2. Regarding EOS (The Coordinator Synchronization Gap)*
>
> This is not a misunderstanding of standard Kafka transactions; it is a
> critique of how KIP-1150 changes them. Standard EOS relies on the Partition
> Leader to sequence markers and calculate the LSO (Last Stable Offset) in
> memory. KIP-1150 removes the Leader.
>
> KIP-1164 (Batch Coordinator) must explicitly define the RPC flow between
> the Transaction Coordinator and the Batch Coordinator to replace the
> leader's role. Currently, the KIP does not specify how the system prevents
> a "Split Brain" scenario where a consumer reads ahead of a transaction
> marker that hasn't yet been sequenced by the Batch Coordinator. This is a
> protocol-level correctness issue that must be resolved in the text before
> adoption.
>
> Please note: I am maintaining my objection based on missing
> specifications, not code bugs.
>
> I respectfully request that we pause the vote until:
>
> - A valid design for Garbage Collection (replacing the discarded
>   KIP-1165) is added to the proposal.
>
> - The Transaction/LSO synchronization protocol is explicitly documented
>   in KIP-1164.
>
> Regards,
>
> Vaquar Khan
> Sr Data Architect
> https://www.linkedin.com/in/vaquar-khan-b695577/
>
