A Least Effort Kafka cluster rebalancer

2023-11-02 Thread Fabio Pardi
scripting around it, it can also run unattended when upscaling. You can find the newborn tool here: https://github.com/dba-git/kafka_smart_rebalancer/ Feedback and contributions are welcome! Regards, Fabio Pardi Agileos Consulting LTD https://www.agileosconsulting.com/

Rebalancing partitions in the most efficient way

2023-05-03 Thread Fabio Pardi
a lot of data movement. Is there a tool around that generates a new partition distribution while minimizing the number of changes? Something written in bash would be ideal, suitable to run on vanilla K8s Kafka images too. regards, fabio pardi
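
To make the question concrete, here is a rough greedy sketch of a "fewest moves" redistribution. It is not taken from any existing tool: the function name and the input format (one line per partition, "<partition> <broker,broker,...>") are made up for illustration, and it ignores racks, leadership, and brokers that are being removed.

```shell
#!/usr/bin/env bash
# Greedy sketch: move one replica at a time from the most loaded broker to the
# least loaded one, until replica counts per broker differ by at most 1.
# stdin: "<partition> <b1>,<b2>,..." per line (hypothetical format)
# $1:    space-separated list of target broker ids
least_effort_rebalance() {
  awk -v brokers="$1" '
    BEGIN { nb = split(brokers, B, " "); for (i = 1; i <= nb; i++) cnt[B[i]] = 0 }
    {
      order[++np] = $1                 # remember input order of partitions
      replicas[$1] = $2
      n = split($2, r, ",")
      for (i = 1; i <= n; i++) cnt[r[i]]++
    }
    END {
      while (1) {
        hi = B[1]; lo = B[1]
        for (i = 1; i <= nb; i++) {
          if (cnt[B[i]] > cnt[hi]) hi = B[i]
          if (cnt[B[i]] < cnt[lo]) lo = B[i]
        }
        if (cnt[hi] - cnt[lo] <= 1) break          # balanced enough
        moved = 0
        for (p = 1; p <= np && !moved; p++) {
          cur = "," replicas[order[p]] ","
          # only move a replica if the partition is not already on the target
          if (index(cur, "," hi ",") && !index(cur, "," lo ",")) {
            sub("," hi ",", "," lo ",", cur)
            replicas[order[p]] = substr(cur, 2, length(cur) - 2)
            cnt[hi]--; cnt[lo]++; moved = 1
          }
        }
        if (!moved) break                          # no legal move left
      }
      for (p = 1; p <= np; p++) print order[p], replicas[order[p]]
    }'
}

# Usage (hypothetical input file):
#   least_effort_rebalance "1 2 3 4" < current_assignment.txt
```

The output keeps every replica that does not need to move, so the resulting plan touches only the replicas required to even out the counts. Turning it into a JSON file for kafka-reassign-partitions.sh is left out of this sketch.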

Re: "Recreating" Zookeeper with KRaft

2023-04-19 Thread Fabio Pardi
Is it possible to have a separate KRaft controller from the brokers? Is there any documentation on setting this up? Hi, you should be able to achieve such a setup by running Kafka in 'controller' mode. https://kafka.apache.org/documentation/#brokerconfigs_process.roles regards, fabio
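
A dedicated controller along the lines of the reply could be configured roughly as follows. This is a minimal sketch that only writes a controller-only properties file; the node id, host name, port, and log directory are placeholders, not values from the thread.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal properties file for a dedicated KRaft controller.
# node.id, the host name, the port and log.dirs are placeholder assumptions.
write_controller_config() {
  local target="$1"
  cat > "$target" <<'EOF'
# This node acts only as a controller, not as a broker
process.roles=controller
node.id=1
controller.quorum.voters=1@controller-1:9093
listeners=CONTROLLER://:9093
controller.listener.names=CONTROLLER
log.dirs=/tmp/kraft-controller-logs
EOF
}

# Usage:
#   write_controller_config ./controller.properties
```

The brokers would then run with process.roles=broker and point at the same controller.quorum.voters value.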

Re: Under-replicated-partitions

2021-07-27 Thread Fabio Pardi
replicated kafka-topics.sh --describe I think it might be a good starting point to understand what is going on. To blindly reassign partitions is in my experience not an ideal solution, because you will have data shuffling around unnecessarily. regards, fabio pardi
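
As a starting point for that kind of inspection, the sketch below filters describe output down to partitions whose in-sync replica list is shorter than the replica list. It assumes the usual "Topic: ... Partition: ... Replicas: ... Isr: ..." line layout; the function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print "<topic> <partition> <replicas> <isr>" for every partition
# line whose Isr list has fewer entries than its Replicas list.
# stdin: output of "kafka-topics.sh --describe" (line layout is an assumption)
under_replicated() {
  awk '
    /Partition:/ && /Replicas:/ && /Isr:/ {
      topic = part = reps = isr = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part  = $(i + 1)
        if ($i == "Replicas:")  reps  = $(i + 1)
        if ($i == "Isr:")       isr   = $(i + 1)
      }
      if (split(isr, a, ",") < split(reps, b, ",")) print topic, part, reps, isr
    }'
}

# Usage (bootstrap address is a placeholder):
#   kafka-topics.sh --bootstrap-server your_kafka:9092 --describe | under_replicated
```

Seeing which brokers are missing from the Isr column usually tells more than the bare count. kafka-topics.sh can also apply a similar filter itself through its --under-replicated-partitions option.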

Re: Extraction of Kafka metadata

2021-02-16 Thread Fabio Pardi
metadata is stored next to each message. you can see it by running kafka-console-consumer.sh with '--property'. e.g., to see the timestamp of a message: kafka-console-consumer --bootstrap-server your_kafka:9092 --from-beginning --topic topic_name --property print.timestamp=true regards, fabio pardi

kafka compression benchmarked using FHIR STU3 data

2021-01-25 Thread Fabio Pardi
hi, for those of you working with medical data, here is how FHIR STU3 data responds to different compression options in kafka: https://portavita.github.io/2021-01-25-Why_Kafka_compression_might_save_you_thousands_of_dollars/ ideas and comments are welcome regards, fabio pardi

kafka on k8s article

2021-01-18 Thread Fabio Pardi
are of course welcome. Regards, fabio pardi

Re: Semantics of acks=all

2020-12-11 Thread Fabio Pardi
s. > > If the produce request fails, what does the partition leader do with the > records it has written to the local log. Are they deleted, or will the > producer's retry cause duplication? > Hi, the record is not committed to the filesystem and an error is returned to the producer. regards, fabio pardi

Re: Perf on history reprocessing

2020-10-25 Thread Fabio Pardi
hi Mathieu, the best approach in my opinion is to try to understand where your bottleneck is, by analyzing the graphs produced during history reprocessing. my best bet is the disks, but indeed it might be anywhere. regards, fabio pardi On 23/10/2020 20:07, Mathieu D wrote: > He

Re: reliable way to count number of messages

2020-06-08 Thread Fabio Pardi
Solved. For our future selves: the reason why the offsets are 2 times the number of messages is to be found in how (our) producer works. The producer commits the message and the transaction, thus the offset is incremented by 2 for each sent message. regards, fabio pardi On 08/06/2020 13:45, Fabio Pardi wrote
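
For readers hitting the same mismatch: subtracting the earliest offset from the latest one counts every offset slot in the log, including the commit markers written by a transactional producer, so it can overestimate the number of messages. The sketch below only does the offset arithmetic. It assumes input lines in the topic:partition:offset form printed by GetOffsetShell; the function name and the two file arguments are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: sum (latest - earliest) offsets over all partitions.
# $1: file with earliest offsets, $2: file with latest offsets,
# both as "topic:partition:offset" lines (both files must be non-empty).
offset_span() {
  awk -F: '
    NR == FNR { earliest[$1 ":" $2] = $3; next }  # first file: earliest offsets
    { total += $3 - earliest[$1 ":" $2] }         # second file: latest offsets
    END { print total + 0 }
  ' "$1" "$2"
}

# Usage (address and topic are placeholders; newer Kafka versions take
# --bootstrap-server instead of --broker-list):
#   kafka-run-class kafka.tools.GetOffsetShell --broker-list your_kafka:9092 \
#     --topic topic_name --time -2 > earliest.txt
#   kafka-run-class kafka.tools.GetOffsetShell --broker-list your_kafka:9092 \
#     --topic topic_name --time -1 > latest.txt
#   offset_span earliest.txt latest.txt
```

With one message per transaction, as in the case described above, this span comes out at twice the message count. Consuming the topic and counting records is the way to get the number of actual messages.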

Re: reliable way to count number of messages

2020-06-08 Thread Fabio Pardi
suspicious besides the offset and the number of messages not being identical, is that the former is exactly 2 times the latter. regards, fabio pardi On 08/06/2020 12:26, Liam Clarke-Hutchinson wrote: > Hi Fabio, > > -1 is shorthand for latest when passed as --time to GetOffsetShel

reliable way to count number of messages

2020-06-08 Thread Fabio Pardi
5.4.1-ccs (Commit:fd1e543386b47352) kafka-run-class -version openjdk version "1.8.0_212" OpenJDK Runtime Environment (Zulu 8.38.0.13-CA-linux64) (build 1.8.0_212-b04) OpenJDK 64-Bit Server VM (Zulu 8.38.0.13-CA-linux64) (build 25.212-b04, mixed mode) regards, fabio pardi