Hi Eduardo,

1. "Why sometimes the applications prefer to connect to zookeeper instead brokers?"
I assume you are talking about the clients and some of our tools? These are parts of an older design, and we are actively working on fixing this. The consumer used ZooKeeper to store offsets; in 0.8.2 there's an option to use Kafka itself for that (by setting *offsets.storage = kafka*). We are planning on fixing the tools in 0.9, but obviously they are less performance-sensitive than the consumers.

2. Regarding your tests and disk usage: I'm not sure exactly what fills your disk. If it's Kafka's own log files (i.e. log.dir), then we expect to store the size of all messages sent times the replication factor configured for each topic. We keep messages for the amount of time specified in the *log.retention* parameters. If the disk fills within minutes, either set log.retention.minutes very low (at the risk of losing data if consumers need a restart), or make sure your disk capacity matches the rate at which producers send data.

Gwen

On Sat, Feb 7, 2015 at 3:01 AM, Eduardo Costa Alfaia <e.costaalf...@unibs.it> wrote:

> Hi Guys,
>
> I have some doubts about the Kafka, the first is Why sometimes the
> applications prefer to connect to zookeeper instead brokers? Connecting to
> zookeeper could create an overhead, because we are inserting other element
> between producer and consumer. Another question is about the information
> sent by producer, in my tests the producer send the messages to brokers and
> a few minutes my HardDisk is full (my harddisk has 250GB), is there
> something to do in the configuration to minimize this?
>
> Thanks
> --
> Privacy notice: http://www.unibs.it/node/8155
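
On the disk sizing in point 2, a rough back-of-the-envelope sketch in Python may help. The function names and the example numbers (message rate, message size) are hypothetical, chosen only for illustration and not taken from Eduardo's setup. It also ignores index files, compression, and the fact that retention is applied per log segment, so real usage will differ somewhat.

```python
def estimated_log_bytes(msgs_per_sec, avg_msg_bytes, retention_minutes,
                        replication_factor=1):
    """Approximate steady-state disk usage across the cluster for one topic.

    This is just: (bytes produced per second) x (seconds retained)
    x (replication factor). All inputs are assumptions you supply.
    """
    bytes_per_sec = msgs_per_sec * avg_msg_bytes
    retained_seconds = retention_minutes * 60
    return bytes_per_sec * retained_seconds * replication_factor


def minutes_until_full(disk_bytes, msgs_per_sec, avg_msg_bytes,
                       replication_factor=1):
    """Minutes before a disk of disk_bytes fills if nothing is deleted yet."""
    bytes_per_minute = msgs_per_sec * avg_msg_bytes * replication_factor * 60
    return disk_bytes / bytes_per_minute


if __name__ == "__main__":
    disk = 250 * 10**9  # the 250 GB disk mentioned in the question
    # Hypothetical load: 500,000 messages/s of 1,000 bytes each.
    print("minutes until full:", minutes_until_full(disk, 500_000, 1_000))
    # Disk needed to hold 10 minutes of that load with replication factor 1.
    print("bytes retained:", estimated_log_bytes(500_000, 1_000, 10))
```

If the retained size for your chosen retention period comes out larger than the disk, either the retention has to go down or the capacity has to go up, which is the trade-off described above.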