Does having 9 partitions with a replication factor of 9 make sense here?
A replication factor of 9 sounds very high. For production, a replication
factor of 3 is recommended.
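To put rough numbers on that, here is a minimal Python sketch comparing a replication factor of 3 with 9 for the 9-partition md topic. The function name and the 1 GB per partition figure are hypothetical and chosen only for illustration. It also ignores settings such as acks and min.insync.replicas, so "failures survived" only means that at least one copy of the data is left.

```python
def replication_cost(replication_factor: int, partitions: int, gb_per_partition: float) -> dict:
    """Rough cost/benefit figures for a topic's replication factor."""
    return {
        # Every partition is stored once per replica.
        "partition_replicas": partitions * replication_factor,
        "total_storage_gb": partitions * replication_factor * gb_per_partition,
        # At least one copy survives until all replicas are gone.
        "broker_failures_survived": replication_factor - 1,
    }


# gb_per_partition=1.0 is an assumed figure, not one taken from this thread.
for rf in (3, 9):
    print(rf, replication_cost(rf, partitions=9, gb_per_partition=1.0))
```

Going from 3 to 9 triples the stored data and the replication traffic. Since the nine brokers in this setup run on only two physical hosts, losing one host takes out several brokers at once, so the extra replicas buy less than the broker count suggests.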
How many partitions you want/need is a different question, and cannot be
answered in a general way.
"Yes" to all other questions.
-Matthias
On 5/12/23 9:50 AM, Mich Talebzadeh wrote:
Hi,
I have used Apache Kafka in conjunction with Spark as a messaging
source. This rather dated diagram describes it.
I have two physical hosts, each with 64 GB and running RHES 7.6, called
rhes75 and rhes76 respectively. The ZooKeeper version is 3.7.1 and the Kafka
version is 3.4.0.
[Attachment: image.png]
I have a topic md -> MarketData that has been defined as below
kafka-topics.sh --create --bootstrap-server \
  rhes75:9092,rhes75:9093,rhes75:9094,rhes76:9092,rhes76:9093,rhes76:9094,rhes76:9095,rhes76:9096,rhes76:9097 \
  --replication-factor 9 --partitions 9 --topic md
kafka-topics.sh --describe --bootstrap-server \
  rhes75:9092,rhes75:9093,rhes75:9094,rhes76:9092,rhes76:9093,rhes76:9094,rhes76:9095,rhes76:9096,rhes76:9097 \
  --topic md
This is working fine
Topic: md TopicId: UfQly87bQPCbVKoH-PQheg PartitionCount: 9 ReplicationFactor: 9 Configs: segment.bytes=1073741824
  Topic: md Partition: 0 Leader: 12 Replicas: 12,10,8,2,9,11,1,7,3 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 1 Leader: 9 Replicas: 9,8,2,12,11,1,7,3,10 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 2 Leader: 11 Replicas: 11,2,12,9,1,7,3,10,8 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 3 Leader: 1 Replicas: 1,12,9,11,7,3,10,8,2 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 4 Leader: 7 Replicas: 7,9,11,1,3,10,8,2,12 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 5 Leader: 3 Replicas: 3,11,1,7,10,8,2,12,9 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 6 Leader: 10 Replicas: 10,1,7,3,8,2,12,9,11 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 7 Leader: 8 Replicas: 8,7,3,10,2,12,9,11,1 Isr: 10,1,9,2,12,7,3,11,8
  Topic: md Partition: 8 Leader: 2 Replicas: 2,3,10,8,12,9,11,1,7 Isr: 10,1,9,2,12,7,3,11,8
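One way to check "working fine" from this output is to confirm that every assigned replica also appears in the Isr list. A minimal Python sketch of that check follows. The helper name under_replicated is hypothetical, and the regular expression assumes the describe format shown above (it tolerates the line wrapping).

```python
import re
from typing import List

# Matches one partition entry; \s also spans line breaks in wrapped output.
PARTITION_RE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]+)"
)


def under_replicated(describe_output: str) -> List[int]:
    """Return the partitions whose ISR is missing at least one assigned replica."""
    lagging = []
    for partition, _leader, replicas, isr in PARTITION_RE.findall(describe_output):
        if set(replicas.split(",")) - set(isr.split(",")):
            lagging.append(int(partition))
    return lagging


# Two entries copied from the describe output above.
sample = """
Topic: md Partition: 0 Leader: 12 Replicas:
12,10,8,2,9,11,1,7,3 Isr: 10,1,9,2,12,7,3,11,8
Topic: md Partition: 1 Leader: 9 Replicas:
9,8,2,12,11,1,7,3,10 Isr: 10,1,9,2,12,7,3,11,8
"""
print(under_replicated(sample))  # []
```

An empty list means all replicas are in sync for the partitions that were parsed, which is the case for the output above.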
However, I have a number of questions:
1. Does having 9 partitions with a replication factor of 9 make sense here?
2. As I understand it, the parallelism is equal to the number of partitions
   for a topic.
3. Kafka only provides a total order over messages *within a
   partition*, not between different partitions in a topic, and in
   this case I have one topic.
4. Data within a partition will be stored in the order in which it is
   written; therefore, data read from a partition will be read in order
   for that partition?
5. Finally, if I want to get messages in order across all 9
   partitions, then I need to group messages with a key, so that
   messages with the same key go to the same partition, and within that
   partition the messages are ordered.
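The key-to-partition idea in the last question can be sketched in a few lines of Python. This is only a simulation under stated assumptions: the CRC32 hash is a stand-in (Kafka's default partitioner hashes the key bytes with murmur2), and the ticker keys and values are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 9  # matches the md topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash (CRC32); Kafka's default partitioner uses murmur2 instead.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages):
    """Append (key, value) pairs to per-partition logs in arrival order."""
    logs = defaultdict(list)
    for key, value in messages:
        logs[partition_for(key)].append((key, value))
    return logs


# Hypothetical market-data ticks keyed by ticker symbol.
ticks = [("IBM", 1), ("MSFT", 1), ("IBM", 2), ("MSFT", 2), ("IBM", 3)]
logs = route(ticks)
ibm = [value for key, value in logs[partition_for("IBM")] if key == "IBM"]
print(ibm)  # [1, 2, 3]
```

All records with the same key land in one partition and keep their write order there. This gives ordering per key, not a single order across all 9 partitions; a total order over every message would need a topic with one partition.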
Thanks