Thanks, Paul. I would be really curious to see the talk when you're done :)

BTW, David Arthur recently posted a KIP that, once it's done, should remove
the upper limit on the number of elements in a batch for CreateTopics or
CreatePartitions.
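
In the meantime, if you need lots of topics today, one workaround is to
batch the requests yourself. Here's a minimal sketch with the Java
AdminClient - the broker address, topic names, partition counts, and batch
size are all invented for illustration:

    // Minimal sketch: issue many small CreateTopics requests instead of
    // one huge one. All names and sizes here are illustrative assumptions.
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class BatchedTopicCreation {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                      "localhost:9092");
            try (Admin admin = Admin.create(props)) {
                final int batchSize = 50; // assumed; tune to your limit
                List<NewTopic> batch = new ArrayList<>();
                for (int i = 0; i < 1000; i++) { // 1000 topics x 1000 partitions
                    batch.add(new NewTopic("bench-" + i, 1000, (short) 1)); // RF=1
                    if (batch.size() == batchSize) {
                        admin.createTopics(batch).all().get(); // block per batch
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) {
                    admin.createTopics(batch).all().get();
                }
            }
        }
    }

Blocking on each batch also means a failure surfaces as a client-side
exception rather than anything worse.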

best,
Colin


On Fri, Sep 9, 2022, at 17:22, Paul Brebner wrote:
> Colin, hi, the current max partitions reached is about 600,000 - I had to
> increase Linux file descriptors and mmap limits, and tweak the JVM heap
> settings a bit (heap errors again).
> This is a bit of a hack though, as it's RF=1 on only a single EC2 instance
> - a proper 3 node cluster would in theory give >1M partitions, which is
> what I really wanted to test out. I think I was also hitting this error
> when attempting to create a single topic with lots of partitions:
> https://github.com/apache/kafka/pull/12595
> The current approach is to create multiple topics with 1000 partitions
> each, or a single topic and then increase its number of partitions.
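>
> The "increase the number of partitions" approach can be scripted with the
> AdminClient's createPartitions - a rough sketch, where the topic name,
> step size, and target count are invented for illustration:
>
>     // Rough sketch: grow one existing topic's total partition count in
>     // steps. Assumes "bench-topic" already exists with 1000 partitions;
>     // all names and numbers are illustrative.
>     import java.util.Map;
>     import java.util.Properties;
>     import org.apache.kafka.clients.admin.Admin;
>     import org.apache.kafka.clients.admin.AdminClientConfig;
>     import org.apache.kafka.clients.admin.NewPartitions;
>
>     public class GrowTopic {
>         public static void main(String[] args) throws Exception {
>             Properties props = new Properties();
>             props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
>                       "localhost:9092");
>             try (Admin admin = Admin.create(props)) {
>                 for (int total = 2000; total <= 100_000; total += 1000) {
>                     admin.createPartitions(Map.of("bench-topic",
>                             NewPartitions.increaseTo(total)))
>                          .all().get(); // throws at the first failed step
>                 }
>             }
>         }
>     }
>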
> I've also got some good numbers on the speed of metadata operations in
> Zookeeper vs. KRaft mode (KRaft is a lot faster - O(1) c.f. O(n) for ZK),
> etc.
> Anyway, I'm happy I've got some numbers to report for my talk now - thanks
> for the info.
>
> Regards, Paul
>
> On Sat, 10 Sept 2022 at 02:43, Colin McCabe <cmcc...@apache.org> wrote:
>
>> Hi Paul,
>>
>> As Keith wrote, it does sound like you are hitting a separate Linux limit
>> like the max mmap count.
>>
>> I'm curious how many partitions you can create if you change that config!
>>
>> best,
>> Colin
>>
>>
>> On Tue, Sep 6, 2022, at 14:02, Keith Paulson wrote:
>> > I've had similar errors caused by mmap counts; try with
>> > vm.max_map_count=262144
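>> >
>> > (On Linux that's typically applied with sysctl -w
>> > vm.max_map_count=262144 and persisted in /etc/sysctl.conf; errno=12
>> > from os::commit_memory below is consistent with running out of memory
>> > mappings rather than heap.)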
>> >
>> >
>> > On 2022/09/01 23:57:54 Paul Brebner wrote:
>> >> Hi all,
>> >>
>> >> I've been attempting to benchmark the Kafka KRaft version for an
>> >> ApacheCon talk and have identified 2 problems:
>> >>
>> >> 1 - it's still impossible to create a large number of
>> >> partitions/topics - I can create more than the comparable Zookeeper
>> >> version but still not "millions" - this is with RF=1 only (as anything
>> >> higher needs huge clusters to cope with the replication CPU overhead),
>> >> and with no load on the clusters yet (i.e. purely a topic/partition
>> >> creation experiment).
>> >>
>> >> 2 - eventually the topic/partition creation command causes the Kafka
>> >> process to fail - looks like a memory error -
>> >>
>> >> java.lang.OutOfMemoryError: Metaspace
>> >> OpenJDK 64-Bit Server VM warning: INFO:
>> >> os::commit_memory(0x00007f4f554f9000, 65536, 1) failed; error='Not
>> >> enough space' (errno=12)
>> >>
>> >> or similar error
>> >>
>> >> seems to happen consistently around 30,000+ partitions - this is on a
>> >> test EC2 instance with 32GB RAM, 500,000 file descriptors (increased
>> >> from the default) and a 64GB disk (plenty spare). I'm not an OS
>> >> expert, but the Kafka process and the OS both seem to have plenty of
>> >> RAM when this error occurs.
>> >>
>> >> So there are really 3 questions: What's going wrong exactly? How can I
>> >> achieve more partitions? And should the topic create command (just
>> >> using the CLI at present to create topics) really be capable of
>> >> killing the Kafka instance, or should it fail and throw an error, with
>> >> the Kafka instance still continuing to work...
>> >>
>> >> Regards, Paul Brebner
>> >>
>>
