Indeed, my message size varies between ~500 KB and ~5 MB per Avro file. I am using Kafka because I need a scalable pub-sub messaging architecture with multiple producers and consumers and a guarantee of delivery. Keeping the data on a filesystem or HDFS won't give me that.
Also, in the link below [1] there is LinkedIn's performance benchmark of Kafka with respect to message size, which shows that Kafka's throughput increases with messages of ~100 KB and up. Agreed that for Kafka a record is key+value; I'm wondering if Kafka can give us a way to sneak a peek at a record's metadata via its key.

[1] https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

On 3 Jan 2015 01:27, "Jayesh Thakrar" <j_thak...@yahoo.com.invalid> wrote:
> Just wondering, Mukesh - the reason you want this feature is because your
> value payload is not small (tens of KB). I don't know if that is the right
> usage of Kafka. It might be worthwhile to store the Avro files in a
> filesystem (regular, cluster FS, HDFS, or even HBase), and the value in your
> Kafka message can be the reference or URI for the Avro file.
>
> That way you make the best use of each system's features and strengths.
>
> Kafka does have an API to get metadata - the topics, partitions, primary
> for each partition, etc. If we consider a key-value pair as a "record", then
> what you are looking for is to get a part of the record (i.e. key only) and
> not the whole record - so I would still consider that a data query/API.
>
>
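Jayesh's suggestion in the quoted message - store the large Avro file externally and ship only a small reference through Kafka - can be sketched roughly as below. This is a minimal Python illustration, not a Kafka API: the envelope field names, the HDFS path, and the helper functions are all hypothetical, and the bytes returned would be what you hand to a producer as the message value.

```python
import json


def make_reference_message(avro_uri, schema_name, record_count, size_bytes):
    """Build a small Kafka message value that references a large Avro file
    stored externally (e.g. on HDFS), instead of embedding the ~5 MB payload.
    All field names here are hypothetical, chosen for illustration."""
    envelope = {
        "uri": avro_uri,        # where the consumer can fetch the Avro file
        "schema": schema_name,  # enough metadata to decide whether to fetch
        "records": record_count,
        "size_bytes": size_bytes,
    }
    # Kafka values are just bytes; this compact JSON envelope stays well
    # under 1 KB regardless of how large the referenced Avro file is.
    return json.dumps(envelope).encode("utf-8")


def read_reference_message(value_bytes):
    """Consumer side: inspect the metadata without downloading the Avro
    file itself; dereference envelope["uri"] only when actually needed."""
    return json.loads(value_bytes.decode("utf-8"))
```

A consumer receiving such a message can look at `schema` or `size_bytes` first and fetch the file from HDFS only when it is actually interested - which is close to the "peek at metadata without reading the whole record" behavior being asked about, just implemented at the application layer rather than inside Kafka.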