Hi,
In short, I don't see Kafka having problems with those numbers. Logstash
will have a harder time, I believe.
That said, it all depends on how you tune things and what kind of / how much
hardware you use.
2B or 200B events — yes, big numbers, but how quickly do you need to process
those? In 1 m
Yury,
Thanks for sharing the insight into Kafka partition distribution.
I am more concerned about the throughput that Kafka and Storm can
collectively deliver for event processing.
Currently I have a roughly 30 GB file with around 0.2 billion events, and
this number is soon going to rise 1
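As a rough back-of-the-envelope check on the numbers above (30 GB holding ~0.2 billion events), here is a small sketch; the one-hour processing window is a hypothetical target of mine, not something stated in the thread:

```java
public class ThroughputEstimate {
    public static void main(String[] args) {
        // Figures quoted in the thread: ~30 GB file, ~0.2 billion events.
        double bytes = 30e9;
        double events = 0.2e9;

        double bytesPerEvent = bytes / events;   // ~150 bytes per event
        // Hypothetical target: drain the whole backlog in one hour.
        double eventsPerSec = events / 3600.0;   // ~55,556 events/s
        double mbPerSec = bytes / 3600.0 / 1e6;  // ~8.3 MB/s sustained

        System.out.printf("%.0f bytes/event, %.0f events/s, %.1f MB/s%n",
                bytesPerEvent, eventsPerSec, mbPerSec);
    }
}
```

At ~8 MB/s sustained, Kafka itself is nowhere near its limits on modest hardware; the Logstash pipeline in front of it is the more likely bottleneck, as noted above.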
This is a quote from the Kafka documentation:

"The routing decision is influenced by the kafka.producer.Partitioner.

interface Partitioner {
    int partition(T key, int numPartitions);
}

The partition API uses the key and the number of available broker
partitions to return a partition id. This id is u
Hi,
I have a setup where I am sniffing some logs (of course, the big ones)
through Logstash Forwarder and forwarding them to Logstash, which in turn
publishes these events to Kafka.
I have created the Kafka topic with the required number of partitions
and replication factor, but I am not sure with Lo