Thanks, Paul. I would be really curious to see the talk when you're done :)
BTW, David Arthur posted a KIP recently that, once it's done, should remove the upper limit on the number of elements in a batch for CreateTopics or CreatePartitions.

best,
Colin

On Fri, Sep 9, 2022, at 17:22, Paul Brebner wrote:
> Colin, hi, the current max partitions reached is about 600,000 - I had to
> increase Linux file descriptors, mmap, and tweak the JVM heap settings a
> bit - heap error again.
> This is a bit of a hack too, as RF=1 and only a single EC2 instance - a
> proper 3 node cluster would in theory give >1M partitions, which was what I
> really wanted to test out. I think I was also hitting this error attempting
> to create a single topic with lots of partitions:
> https://github.com/apache/kafka/pull/12595
> Current approach is to create multiple topics with 1000 partitions each, or
> a single topic and then increase the number of partitions.
> I've also got some good numbers around the speed of metadata operations in
> ZooKeeper vs. KRaft mode (KRaft lots faster = O(1) c.f. O(n) for ZK) etc.
> Anyway, I'm happy I've got some numbers to report for my talk now, thanks
> for the info.
>
> Regards, Paul
>
> On Sat, 10 Sept 2022 at 02:43, Colin McCabe <[email protected]> wrote:
>
>> Hi Paul,
>>
>> As Keith wrote, it does sound like you are hitting a separate Linux limit
>> like the max mmap count.
>>
>> I'm curious how many partitions you can create if you change that config!
>>
>> best,
>> Colin
>>
>>
>> On Tue, Sep 6, 2022, at 14:02, Keith Paulson wrote:
>> > I've had similar errors caused by mmap counts; try with
>> > vm.max_map_count=262144
>> >
>> >
>> > On 2022/09/01 23:57:54 Paul Brebner wrote:
>> >> Hi all,
>> >>
>> >> I've been attempting to benchmark the Kafka KRaft version for an
>> >> ApacheCon talk and have identified 2 problems:
>> >>
>> >> 1 - it's still impossible to create a large number of partitions/topics
>> >> - I can create more than the comparable ZooKeeper version, but still not
>> >> "millions" - this is with RF=1 only (as anything higher needs huge
>> >> clusters to cope with the replication CPU overhead), and no load on the
>> >> clusters yet (i.e. purely a topic/partition creation experiment).
>> >>
>> >> 2 - eventually the topic/partition creation command causes the Kafka
>> >> process to fail - looks like a memory error -
>> >>
>> >> java.lang.OutOfMemoryError: Metaspace
>> >> OpenJDK 64-Bit Server VM warning: INFO:
>> >> os::commit_memory(0x00007f4f554f9000, 65536, 1) failed; error='Not
>> >> enough space' (errno=12)
>> >>
>> >> or a similar error.
>> >>
>> >> It seems to happen consistently around 30,000+ partitions - this is on a
>> >> test EC2 instance with 32GB RAM, 500,000 file descriptors (increased
>> >> from the default) and 64GB disk (plenty spare). I'm not an OS expert,
>> >> but the kafka process and the OS both seem to have plenty of RAM when
>> >> this error occurs.
>> >>
>> >> So there are really 3 questions: What's going wrong exactly? How do I
>> >> achieve more partitions? And should the topic create command (just using
>> >> the CLI at present to create topics) really be capable of killing the
>> >> Kafka instance, or should it fail and throw an error, with the Kafka
>> >> instance still continuing to work...
>> >>
>> >> Regards, Paul Brebner
>> >>
>>
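For anyone following along, the OS and JVM tuning mentioned in the thread (mmap count, file descriptors, broker heap) is usually applied along these lines; the values are just the ones quoted above and an assumed heap size, not recommendations:

  # Raise the mmap limit Keith suggested (takes effect immediately; put it in
  # /etc/sysctl.conf or a file under /etc/sysctl.d/ to persist across reboots)
  sudo sysctl -w vm.max_map_count=262144

  # Raise the open-file limit for the shell that starts the broker
  # (Paul's test used 500,000; a systemd unit would set LimitNOFILE instead)
  ulimit -n 500000

  # Broker heap, read by kafka-server-start.sh; 8g is an assumed placeholder.
  # Note the Metaspace error above is a separate pool, not capped by -Xmx.
  export KAFKA_HEAP_OPTS="-Xms8g -Xmx8g"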
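And a minimal sketch of the two topic-creation approaches Paul describes, using the stock CLI; topic names, counts and the bootstrap address are placeholders:

  # Approach 1: many topics with 1000 partitions each, RF=1
  for i in $(seq 1 600); do
    bin/kafka-topics.sh --bootstrap-server localhost:9092 \
      --create --topic "bench-$i" --partitions 1000 --replication-factor 1
  done

  # Approach 2: one topic, then grow its partition count
  # (partition counts can only be increased, never decreased)
  bin/kafka-topics.sh --bootstrap-server localhost:9092 \
    --create --topic bench --partitions 1000 --replication-factor 1
  bin/kafka-topics.sh --bootstrap-server localhost:9092 \
    --alter --topic bench --partitions 2000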
