You are correct that a Kafka broker is not just writing to one file. Jay Kreps wrote a great blog post with lots of links to even greater detail on the topic of Kafka and disk write performance. Still a good read many years later.
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

-hans

> On Mar 15, 2017, at 7:51 AM, Nicolas MOTTE <nicolas.mo...@amadeus.com> wrote:
>
> Ok that makes sense, thanks!
>
> The next question I have regarding performance is about the way Kafka writes
> in the data files.
> I often hear Kafka is very performant because it writes in an append-only
> fashion.
> So even with hard disk (not SSD) we get a great performance because it writes
> in sequence.
>
> I could understand that if Kafka was only writing to one file.
> But in reality it's writing to N files, N being the number of partitions
> hosted by the broker.
> So even though it appends the data to each file, overall I assume it is not
> writing in sequence on the disk.
>
> Am I wrong?
>
> -----Original Message-----
> From: Tauzell, Dave [mailto:dave.tauz...@surescripts.com]
> Sent: 08 March 2017 22:09
> To: users@kafka.apache.org
> Subject: RE: Performance and Encryption
>
> I think because the product batches messages which could be for different
> topics.
>
> -Dave
>
> -----Original Message-----
> From: Nicolas MOTTE [mailto:nicolas.mo...@amadeus.com]
> Sent: Wednesday, March 8, 2017 2:41 PM
> To: users@kafka.apache.org
> Subject: Performance and Encryption
>
> Hi everyone,
>
> I understand one of the reasons why Kafka is performant is by using zero-copy.
>
> I often hear that when encryption is enabled, then Kafka has to copy the data
> in user space to decode the message, so it has a big impact on performance.
>
> If it is true, I don't get why the message has to be decoded by Kafka. I
> would assume that whether the message is encrypted or not, Kafka simply
> receives it, appends it to the file, and when a consumer wants to read it, it
> simply reads at the right offset...
>
> Also I'm wondering if it's the case if we don't use keys (pure queuing system
> with key=null).
>
> Cheers
> Nico
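
For illustration only, here is a minimal Python sketch of the append-then-read-at-offset path described in the original question. It is not Kafka code: the helper names (append_record, serve_range, demo) and the file name partition-0.log are made up for the example, and a local socket pair stands in for the consumer connection. Python's socket.sendfile uses the kernel's sendfile where the platform supports it and falls back to ordinary send() otherwise. The point is that the process serving the data never has to look at the bytes when the connection is plaintext. If the connection itself is encrypted, the bytes have to be read into user space to be encrypted before they go on the wire, so this shortcut would not apply.

```python
import os
import socket
import tempfile


def append_record(log_path, payload):
    """Append opaque bytes to the log file; return the byte position where they start."""
    with open(log_path, "ab") as log:
        position = log.seek(0, os.SEEK_END)  # appends always land at the current end
        log.write(payload)
    return position


def serve_range(log_path, position, length, conn):
    """Send `length` bytes starting at `position` to `conn` without inspecting them."""
    with open(log_path, "rb") as log:
        # socket.sendfile uses os.sendfile when available, else plain send().
        return conn.sendfile(log, offset=position, count=length)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "partition-0.log")  # hypothetical file name
        first = append_record(log_path, b"record-one|")
        second = append_record(log_path, b"record-two|")

        # A connected local socket pair stands in for broker and consumer.
        broker_side, consumer_side = socket.socketpair()
        with broker_side, consumer_side:
            sent = serve_range(log_path, second, len(b"record-two|"), broker_side)
            received = consumer_side.recv(1024)
    return first, second, sent, received


if __name__ == "__main__":
    print(demo())
```

Calling demo() appends two records, then serves only the second one by its byte position, which mirrors "reads at the right offset" from the question.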