Re: [DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-07-08 Thread Colin McCabe
On Wed, Jun 7, 2023, at 07:07, Christo Lolov wrote:
> Hey Colin,
>
> I tried the following setup:
>
> * Create 3 EC2 machines.
> * EC2 machine named A acts as a KRaft Controller.
> * EC2 machine named B acts as a KRaft Broker. (The only configurations
> different to the default values:

Re: [DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-06-07 Thread Christo Lolov
Hey Colin,
I tried the following setup:
* Create 3 EC2 machines.
* EC2 machine named A acts as a KRaft Controller.
* EC2 machine named B acts as a KRaft Broker. (The only configurations different to the default values: log.retention.ms=3, log.segment.bytes=1048576,

Re: [DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-06-05 Thread Christo Lolov
Hey Colin, Thanks for the review! I am also skeptical that much space can be reclaimed via compaction as detailed in the limitations section of the KIP. In my head there are two ways to get out of the saturated state - configure more aggressive retention and delete topics. I wasn't aware that

Re: [DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-06-05 Thread Christo Lolov
Heya Igor, Thank you for reading through the KIP and providing feedback! 11. Good question. I will check whether a change is needed in the processing of the metadata records and come back. My hunch says no as long as the Kafka broker is still alive to process the metadata records. This being

Re: [DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-06-02 Thread Colin McCabe
Hi Christo, We're not adding new stuff to ZK at this point (it's deprecated), so it would be good to drop that from the design. With regard to the "saturated" state: I'm skeptical that compaction could really move the needle much in terms of freeing up space -- in most workloads I've seen, it

Re: [DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-06-02 Thread Igor Soarez
Hi Christo, Thank you for the KIP. Kafka is very sensitive to filesystem errors, and at the first IO error the whole log directory is permanently considered offline. It seems your proposal aims to increase the robustness of Kafka, and that's a positive improvement. I have some questions: 11.

[DISCUSS] KIP-928: Making Kafka resilient to log directories becoming full

2023-05-22 Thread Christo Lolov
Hello all! I would like to start a discussion on KIP-928: Making Kafka resilient to log directories becoming full which can be found at https://cwiki.apache.org/confluence/display/KAFKA/KIP-928%3A+Making+Kafka+resilient+to+log+directories+becoming+full . In summary, I frequently run into