On Wed, Jun 7, 2023, at 07:07, Christo Lolov wrote:
> Hey Colin,
>
> I tried the following setup:
>
> * Create 3 EC2 machines.
> * EC2 machine named A acts as a KRaft Controller.
> * EC2 machine named B acts as a KRaft Broker. (The only configurations
> different from the default values: log.retention.ms=30000,
> log.segment.bytes=1048576, log.retention.check.interval.ms=30000,
> leader.imbalance.check.interval.seconds=30)
> * EC2 machine named C acts as a Producer.
> * I attached a 1 GB EBS volume to EC2 machine B (the Broker) and
> configured log.dirs to point to it.
> * I filled 995 MB of that EBS volume using fallocate.
> * I created a topic with 6 partitions and a replication factor of 1.
> * From the Producer machine I used
> `~/kafka/bin/kafka-producer-perf-test.sh
> --producer.config ~/kafka/config/client.properties --topic batman
> --record-size 524288 --throughput 5 --num-records 150` (the full command
> sequence is sketched after this list). The disk on EC2 machine B filled up
> and the broker shut down. I stopped the producer.
> * I stopped the controller on EC2 machine A and restarted it as a combined
> controller and broker (I need this because I cannot communicate directly
> with a controller yet - see
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-919%3A+Allow+AdminClient+to+Talk+Directly+with+the+KRaft+Controller+Quorum
> ).
> * I deleted the topic to which I had been writing by using
> kafka-topics.sh.
> * I started the broker on EC2 machine B and it failed due to no space 
> left
> on disk during its recovery process. The topic was not deleted from the
> disk.
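>
> For reference, the steps on machines B, C, and A boil down to roughly the
> following (the broker/node addresses, the log directory path, and the
> filler file name are placeholders):
>
>   # on B: pre-fill the 1 GB EBS-backed log directory to ~995 MB
>   fallocate -l 995M /mnt/kafka-logs/filler
>
>   # create the topic, then produce from C until the disk on B fills up
>   ~/kafka/bin/kafka-topics.sh --bootstrap-server <broker-B>:9092 --create \
>     --topic batman --partitions 6 --replication-factor 1
>   ~/kafka/bin/kafka-producer-perf-test.sh \
>     --producer.config ~/kafka/config/client.properties --topic batman \
>     --record-size 524288 --throughput 5 --num-records 150
>
>   # once A is running as a combined controller and broker: delete the topic
>   ~/kafka/bin/kafka-topics.sh --bootstrap-server <node-A>:9092 --delete --topic batman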
>
> As such, I am not convinced that KRaft addresses the problem of deleting
> topics on startup if there is no space left on the disk - is there
> something wrong with my setup, or do you see it differently? I think this
> will continue to be the case even when JBOD + KRaft is implemented.

Thank you for trying this. You're right that it doesn't work today, but it very 
easily could with no architecture changes.

We have the initial KRaft metadata load available when log recovery starts. So 
if we wanted to, we could delete directories for topics that no longer exist 
during log recovery. (Obviously we'd want to check both the topic ID and the 
topic name, as always.)
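
Very roughly, the check could look something like the sketch below. This is 
not actual broker code -- the StrayLogCheck class and the way it would be 
wired into log recovery are made up -- but MetadataImage, TopicImage, Uuid, 
and Utils.delete are the existing pieces it would lean on:

    import java.io.File;
    import java.io.IOException;
    import org.apache.kafka.common.Uuid;
    import org.apache.kafka.common.utils.Utils;
    import org.apache.kafka.image.MetadataImage;
    import org.apache.kafka.image.TopicImage;

    public final class StrayLogCheck {
        // A partition directory is "stray" if the latest metadata image has no
        // topic with that ID, or has reused the name for a different topic ID.
        // The ID comes from partition.metadata, the name from the directory name.
        public static boolean isStray(MetadataImage image, Uuid dirTopicId,
                                      String dirTopicName) {
            TopicImage topic = image.topics().getTopic(dirTopicId);
            return topic == null || !topic.name().equals(dirTopicName);
        }

        // If the directory is stray, delete it up front rather than recovering
        // it first and then deleting it after the metadata load.
        public static void maybeDeleteInsteadOfRecovering(MetadataImage image,
                File dir, Uuid dirTopicId, String dirTopicName) throws IOException {
            if (isStray(image, dirTopicId, dirTopicName)) {
                Utils.delete(dir); // recursive delete of the partition directory
            }
        }
    }

The point is just that the metadata image is already in hand at that point, so 
a stray directory can be skipped and removed instead of being recovered first.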

This would be a good optimization in general. Spending time recovering a 
directory and then immediately deleting it during the initial metadata load is 
silly. I think nobody has bothered to optimize this yet since it's a bit of a 
rare case. But we very easily could.

I don't know if this would require a KIP or not. Arguably it's not user-visible 
behavior.

best,
Colin

>
> Let me know your thoughts!
>
> Best,
> Christo
>
> On Mon, 5 Jun 2023 at 11:03, Christo Lolov <christolo...@gmail.com> wrote:
>
>> Hey Colin,
>>
>> Thanks for the review!
>>
>> I am also skeptical that much space can be reclaimed via compaction as
>> detailed in the limitations section of the KIP.
>>
>> In my head there are two ways to get out of the saturated state -
>> configure more aggressive retention or delete topics. I wasn't aware that
>> KRaft deletes topics marked for deletion on startup even if the disks
>> occupied by those partitions are full - I will check it out, thank you for
>> the information! On the retention side, I believe there is still a benefit
>> in keeping the broker up and responsive - in my experience, people first
>> try to reduce the data they have, and only when that does not work are they
>> okay with sacrificing all of the data.
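>>
>> Concretely, I am thinking of the two escape hatches as something along the
>> lines of the following (broker address and topic name are placeholders):
>>
>>   # tighten retention on a topic so the broker can free space itself
>>   bin/kafka-configs.sh --bootstrap-server <broker>:9092 --alter \
>>     --entity-type topics --entity-name <topic> --add-config retention.ms=60000
>>
>>   # or drop the topic entirely
>>   bin/kafka-topics.sh --bootstrap-server <broker>:9092 --delete --topic <topic>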
>>
>> Let me know your thoughts!
>>
>> Best,
>> Christo
>>
>> On Fri, 2 Jun 2023 at 20:09, Colin McCabe <cmcc...@apache.org> wrote:
>>
>>> Hi Christo,
>>>
>>> We're not adding new stuff to ZK at this point (it's deprecated), so it
>>> would be good to drop that from the design.
>>>
>>> With regard to the "saturated" state: I'm skeptical that compaction could
>>> really move the needle much in terms of freeing up space -- in most
>>> workloads I've seen, it wouldn't. Compaction itself also requires free
>>> space to function.
>>>
>>> So the main benefit of the "saturated" state seems to be enabling deletion
>>> on full disks. But KRaft mode already has most of that benefit. Full disks
>>> (or, indeed, downed brokers) don't block deletion on KRaft. If you delete a
>>> topic and then bounce the broker that had the disk full, it will delete the
>>> topic directory on startup as part of its snapshot load process.
>>>
>>> So I'm not sure if we really need this. Maybe we should re-evaluate once
>>> we have JBOD + KRaft.
>>>
>>> best,
>>> Colin
>>>
>>>
>>> On Mon, May 22, 2023, at 02:23, Christo Lolov wrote:
>>> > Hello all!
>>> >
>>> > I would like to start a discussion on KIP-928: Making Kafka resilient to
>>> > log directories becoming full which can be found at
>>> >
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-928%3A+Making+Kafka+resilient+to+log+directories+becoming+full
>>> > .
>>> >
>>> > In summary, I frequently run into problems where Kafka becomes
>>> > unresponsive when the disks backing its log directories become full. Such
>>> > unresponsiveness generally requires intervention outside of Kafka. I have
>>> > found it to be a significantly nicer experience when Kafka maintains
>>> > control plane operations and allows you to free up space.
>>> >
>>> > I am interested in your thoughts and any suggestions for improving the
>>> > proposal!
>>> >
>>> > Best,
>>> > Christo
>>>
>>
