In my organisation I have a 3-machine Kafka cluster, and each topic is assigned to two machines for storing its data.
There is one topic that receives a lot of data from clients, and that data exceeds the disk space on one machine, because that machine is the leader for the topic. When I look into kafka-logs, only this one topic has consumed a lot of space. In one month I have 36 GB of data from that topic.

My understanding so far is that Kafka deletes log data after 24 hours, so only the most recent 24 hours of data is retained. My consumer consumes all data within 24 hours, so there should be no disk space problem. I have enabled every setting in server.properties that I thought would clean kafka-logs, but the data is still there.

Below are my settings in the server.properties file:

log.retention.minutes=1440
log.retention.hours=24
log.retention.ms=86400000
log.cleaner.delete.retention.ms=24
log.segment.bytes=1048576
log.retention.check.interval.ms=3000
log.cleaner.enable=true
zookeeper.connection.timeout.ms=30000
delete.topic.enable = true
auto.create.topics.enable = true
default.replication.factor=2
auto.leader.rebalance.enable=true
controlled.shutdown.enable=true
controller.socket.timeout.ms=120000

Please help me with this so that my 3-machine cluster can handle a large number of requests and a large volume of data.

Thanks,
Kunal
+91-9958189589
Data Analyst
First Paper Publication: http://dl.acm.org/citation.cfm?id=2790798
Blog: http://learnhardwithkunalgupta.blogspot.in
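
P.S. To make the expectation in this message concrete, here is a rough sketch of the arithmetic in Python. It is only an estimate under assumptions: that the 36 GB per month is the incoming volume for one replica, that traffic is spread evenly over a 30-day month, and that the broker honours log.retention.ms first, then log.retention.minutes, then log.retention.hours. The function names are made up for illustration and are not part of Kafka.

```python
# Rough arithmetic only. The inputs come from the message above; the
# even-traffic and 30-day-month assumptions are illustrative, not measured.

def effective_retention_ms(ms=None, minutes=None, hours=168):
    """Return the retention a broker would apply: ms, else minutes, else hours.

    168 hours is the default for log.retention.hours.
    """
    if ms is not None:
        return ms
    if minutes is not None:
        return minutes * 60 * 1000
    return hours * 60 * 60 * 1000


def expected_retained_gb(monthly_gb, retention_ms, days_in_month=30):
    """Estimate data kept on disk per replica if traffic is spread evenly."""
    daily_gb = monthly_gb / days_in_month
    retention_days = retention_ms / (24 * 60 * 60 * 1000)
    return daily_gb * retention_days


# Values from server.properties above; all three describe 24 hours.
retention_ms = effective_retention_ms(ms=86400000, minutes=1440, hours=24)
per_replica_gb = expected_retained_gb(monthly_gb=36, retention_ms=retention_ms)
both_replicas_gb = per_replica_gb * 2  # default.replication.factor=2

print(f"effective retention: {retention_ms / 3600000:.0f} hours")
print(f"expected per replica: {per_replica_gb:.2f} GB")
print(f"expected across both replicas: {both_replicas_gb:.2f} GB")
```

If the assumptions hold, a partition directory holding far more than this estimate would suggest that time-based deletion is not being applied to that topic, which is the behaviour I am seeing.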