Re: Performance and Encryption
You are correct that a Kafka broker is not just writing to one file. Jay Kreps wrote a great blog post, with links to even greater detail, on the topic of Kafka and disk write performance. Still a good read many years later: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines

-hans

> On Mar 15, 2017, at 7:51 AM, Nicolas MOTTE <nicolas.mo...@amadeus.com> wrote:
>
> Ok, that makes sense, thanks!
>
> The next question I have regarding performance is about the way Kafka writes to the data files. I often hear Kafka is very performant because it writes in an append-only fashion, so even with a hard disk (not SSD) we get great performance because it writes in sequence.
>
> I could understand that if Kafka were only writing to one file. But in reality it's writing to N files, N being the number of partitions hosted by the broker. So even though it appends the data to each file, overall I assume it is not writing in sequence on the disk.
>
> Am I wrong?
>
> [...]
RE: Performance and Encryption
Ok, that makes sense, thanks!

The next question I have regarding performance is about the way Kafka writes to the data files. I often hear Kafka is very performant because it writes in an append-only fashion, so even with a hard disk (not SSD) we get great performance because it writes in sequence.

I could understand that if Kafka were only writing to one file. But in reality it's writing to N files, N being the number of partitions hosted by the broker. So even though it appends the data to each file, overall I assume it is not writing in sequence on the disk.

Am I wrong?

-----Original Message-----
From: Tauzell, Dave [mailto:dave.tauz...@surescripts.com]
Sent: 08 March 2017 22:09
To: users@kafka.apache.org
Subject: RE: Performance and Encryption

I think because the product batches messages which could be for different topics.

-Dave

> [...]

This e-mail and any files transmitted with it are confidential, may contain sensitive information, and are intended solely for the use of the individual or entity to whom they are addressed. If you have received this e-mail in error, please notify the sender by reply e-mail immediately and destroy all copies of the e-mail and any attachments.
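The per-partition append pattern under discussion can be sketched as follows. This is a toy illustration, not Kafka's actual storage code: each partition gets its own log file, every write is an append at that file's tail, and the kernel's page cache and writeback absorb the interleaving across partitions into large sequential flushes per file.

```python
import os
import tempfile

class ToyPartitionLog:
    """Append-only log file for a single partition (illustration only)."""
    def __init__(self, directory, topic, partition):
        path = os.path.join(directory, f"{topic}-{partition}.log")
        self.f = open(path, "ab")  # append mode: writes always go to the end

    def append(self, record: bytes) -> int:
        offset = self.f.tell()  # byte offset where this record lands
        self.f.write(record)
        return offset

    def close(self):
        self.f.close()

# Even with N partitions, each individual file only ever grows at its tail;
# the random access is between files, never within one.
tmp = tempfile.mkdtemp()
logs = [ToyPartitionLog(tmp, "demo", p) for p in range(3)]
for i in range(6):
    logs[i % 3].append(f"record-{i}\n".encode())
for log in logs:
    log.close()
```

The interleaving across the three files is exactly the non-sequential pattern the question raises; the page cache is what keeps the physical disk writes mostly sequential anyway.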
Re: Performance and encryption
I believe these are defaults you can set at the broker level, so that if the topic doesn't have that setting, it will inherit them. But you can definitely override the configuration for an individual topic at the topic level.

On 9 March 2017 at 7:42:14 am, Nicolas Motte (lingusi...@gmail.com) wrote:

Hi everyone, I have another question. Is there any reason why retention and cleanup policy are defined at cluster level and not at topic level? I can't see why it would not be possible from a technical point of view...

> [...]
Re: Performance and encryption
They are defined at the broker level as a default for all topics that do not have an override for those configs. Both (and many other configs) can be overridden for individual topics using the command line tools.

-Todd

On Wed, Mar 8, 2017 at 12:36 PM, Nicolas Motte wrote:
> Hi everyone, I have another question.
> Is there any reason why retention and cleanup policy are defined at cluster level and not at topic level?
> I can't see why it would not be possible from a technical point of view...
>
> [...]

--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming
linkedin.com/in/toddpalino
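For example, a per-topic override with the stock tooling might look like this (a sketch using the 2017-era flags; `--zookeeper` has since been replaced by `--bootstrap-server` in newer Kafka releases, and the topic name here is illustrative):

```shell
# Set per-topic overrides; the broker-wide defaults stay untouched.
bin/kafka-configs.sh --zookeeper localhost:2181 \
  --alter --entity-type topics --entity-name my-topic \
  --add-config retention.ms=86400000,cleanup.policy=compact

# Inspect the overrides currently set on the topic:
bin/kafka-configs.sh --zookeeper localhost:2181 \
  --describe --entity-type topics --entity-name my-topic
```

These commands require a running cluster, so they are shown here only as a CLI fragment.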
Re: Performance and Encryption
Nicolas, this appears to be a duplicate of your question from 2 days ago. Please review that thread for the discussion of this question.

-Todd

On Wed, Mar 8, 2017 at 1:08 PM, Tauzell, Dave <dave.tauz...@surescripts.com> wrote:
> I think because the product batches messages which could be for different topics.
>
> -Dave
>
> [...]

--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming
linkedin.com/in/toddpalino
Re: Performance and encryption
Hi everyone, I have another question. Is there any reason why retention and cleanup policy are defined at cluster level and not at topic level? I can't see why it would not be possible from a technical point of view...

2017-03-06 14:38 GMT+01:00 Nicolas Motte:
> Hi everyone,
>
> I understand one of the reasons why Kafka is performant is by using zero-copy.
>
> [...]
Performance and Encryption
Hi everyone,

I understand one of the reasons why Kafka is performant is by using zero-copy.

I often hear that when encryption is enabled, Kafka has to copy the data into user space to decode the message, so it has a big impact on performance.

If that is true, I don't get why the message has to be decoded by Kafka. I would assume that whether the message is encrypted or not, Kafka simply receives it, appends it to the file, and when a consumer wants to read it, it simply reads at the right offset...

Also, I'm wondering if this is the case if we don't use keys (pure queuing system with key=null).

Cheers
Nico
Re: Performance and encryption
Hi Todd,

I agree that KAFKA-2561 would be good to have, for the reasons you state.

Ismael

On Mon, Mar 6, 2017 at 5:17 PM, Todd Palino wrote:
> Thanks for the link, Ismael. I had thought that the most recent kernels already implemented this, but I was probably confusing it with BSD. Most of my systems are stuck in the stone age right now anyway.
>
> It would be nice to get KAFKA-2561 in, either way. First off, if you can take advantage of it, it's a good performance boost. Second, especially with the security landscape getting worse and worse, it would be good to have options as far as the TLS implementation goes. A zero-day exploit in the Java TLS implementation would be devastating, and more difficult to react to, as it would require a new JRE (bringing with it who knows what problems). Swapping an underlying OpenSSL version would be much more palatable.
>
> -Todd
>
> [...]
Re: Performance and encryption
Hi Todd,

Can you please help me with notes or a document on how you achieved encryption? I have followed the material available on the official sites but failed, as I'm not good with TLS.

On Mar 6, 2017 19:55, "Todd Palino" wrote:
> It's not that Kafka has to decode it, it's that it has to send it across the network. This is specific to enabling TLS support (transport encryption), and won't affect any end-to-end encryption you do at the client level.
>
> [...]
>
> Oh, and using message keys (or not) won't matter here.
>
> -Todd
Re: Performance and encryption
Thanks for the link, Ismael. I had thought that the most recent kernels already implemented this, but I was probably confusing it with BSD. Most of my systems are stuck in the stone age right now anyway.

It would be nice to get KAFKA-2561 in, either way. First off, if you can take advantage of it, it's a good performance boost. Second, especially with the security landscape getting worse and worse, it would be good to have options as far as the TLS implementation goes. A zero-day exploit in the Java TLS implementation would be devastating, and more difficult to react to, as it would require a new JRE (bringing with it who knows what problems). Swapping an underlying OpenSSL version would be much more palatable.

-Todd

On Mon, Mar 6, 2017 at 9:01 AM, Ismael Juma wrote:
> Even though OpenSSL is much faster than the Java 8 TLS implementation (I haven't tested against Java 9, which is much faster than Java 8, but probably still slower than OpenSSL), all the tests were without zero copy in the sense that is being discussed here (i.e. sendfile). To benefit from sendfile with TLS, kernel-level changes/modules are required:
>
> https://github.com/ktls/af_ktls
> http://www.phoronix.com/scan.php?page=news_item=FreeBSD-Faster-Sendfile
>
> Ismael
>
> [...]

--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming
linkedin.com/in/toddpalino
Re: Performance and encryption
Even though OpenSSL is much faster than the Java 8 TLS implementation (I haven't tested against Java 9, which is much faster than Java 8, but probably still slower than OpenSSL), all the tests were without zero copy in the sense that is being discussed here (i.e. sendfile). To benefit from sendfile with TLS, kernel-level changes/modules are required:

https://github.com/ktls/af_ktls
http://www.phoronix.com/scan.php?page=news_item=FreeBSD-Faster-Sendfile

Ismael

On Mon, Mar 6, 2017 at 4:18 PM, Todd Palino wrote:
> So that's not quite true, Hans. First, the performance hit is not a small one (25% is huge), nor is it simply to be expected. Part of the problem is that the Java TLS implementation does not support zero copy. OpenSSL does, and in fact there's been a ticket open to allow Kafka to support using OpenSSL for a while now:
>
> https://issues.apache.org/jira/browse/KAFKA-2561
>
> [...]
Re: Performance and encryption
So that’s not quite true, Hans. First, the performance hit is not a small one (25% is huge), nor is it simply to be expected. Part of the problem is that the Java TLS implementation does not support zero copy. OpenSSL does, and in fact there’s been a ticket open to allow Kafka to support using OpenSSL for a while now:

https://issues.apache.org/jira/browse/KAFKA-2561

On Mon, Mar 6, 2017 at 6:30 AM, Hans Jespersen wrote:
> It's not a single message at a time that is encrypted with TLS, it's the entire network byte stream, so a Kafka broker can't even see the Kafka protocol tunneled inside TLS unless it's terminated at the broker.
>
> [...]
>
> -hans

--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming
linkedin.com/in/toddpalino
Re: Performance and encryption
It’s not a single message at a time that is encrypted with TLS, it’s the entire network byte stream, so a Kafka broker can’t even see the Kafka protocol tunneled inside TLS unless it’s terminated at the broker.

It is true that losing the zero-copy optimization impacts performance somewhat, but it’s not what I would call a “big impact”, because Kafka does a lot of other things to get its performance (like using page cache and doing lots of sequential disk I/O). The difference should be something on the order of 25-30% slower with TLS enabled, which is about what you would see with any other messaging protocol with TLS on vs off.

If you wanted to encrypt each message independently before sending it to Kafka, then zero copy would still be in effect and all the consumers would get the same encrypted message (and would have to understand how to decrypt it).

-hans

> On Mar 6, 2017, at 5:38 AM, Nicolas Motte wrote:
>
> Hi everyone,
>
> I understand one of the reasons why Kafka is performant is by using zero-copy.
>
> [...]
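The client-side, per-message encryption Hans mentions can be sketched as below. This is a toy, NOT real cryptography: a counter-mode SHA-256 keystream stands in for a vetted cipher (a real producer would use something like AES-GCM via a proper crypto library). The point it illustrates is that the broker only ever sees opaque bytes, so sendfile/zero-copy on the consumer path is unaffected, and every consumer receives the same ciphertext.

```python
import hashlib

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream from SHA-256. NOT secure; for illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    ks = _keystream(key, nonce, len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, ks))

decrypt = encrypt  # an XOR stream cipher is its own inverse

# The producer encrypts before the send; the broker appends the opaque
# bytes unchanged; each consumer decrypts with the shared key.
key, nonce = b"shared-secret", b"msg-00000001"
ciphertext = encrypt(key, nonce, b"payload for the topic")
```

Because the broker never touches the plaintext, key distribution to all consumers becomes the hard problem, which is the trade-off Hans is pointing at.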
Re: Performance and encryption
It’s not that Kafka has to decode it, it’s that it has to send it across the network. This is specific to enabling TLS support (transport encryption), and won’t affect any end-to-end encryption you do at the client level.

The operation in question is called “zero copy”. In order to send a message batch to a consumer, the Kafka broker must read it from disk (sometimes it’s cached in memory, but that’s irrelevant here) and send it across the network. The Linux kernel allows this to happen without having to copy the data in memory (to move it from the disk buffers to the network buffers). However, if TLS is enabled, the broker must first encrypt the data going across the network. This means that it can no longer take advantage of the zero-copy optimization, as it has to make a copy in the process of applying the TLS encryption.

Now, how much of an impact this has on broker operations is up for debate, I think. We originally ran into this problem when TLS support was added to Kafka and the zero-copy send for plaintext communications was accidentally removed as well. At the time, we saw a significant performance hit, and the code was patched to put it back. However, since then I’ve turned on inter-broker TLS in all of our clusters, and when we did that there was no performance hit. This is odd, because the replica fetchers should take advantage of the same zero-copy optimization.

It’s possible that this is because it’s just one consumer (the replica fetchers). We’re about to start testing additional consumers over TLS, so we’ll see what happens at that point. All I can suggest right now is that you test in your environment and see what the impact is.

Oh, and using message keys (or not) won’t matter here.

-Todd

On Mon, Mar 6, 2017 at 5:38 AM, Nicolas Motte wrote:
> Hi everyone,
>
> I understand one of the reasons why Kafka is performant is by using zero-copy.
>
> [...]

--
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming
linkedin.com/in/toddpalino
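The zero-copy send described above maps to the sendfile(2) system call: the kernel moves bytes from one file descriptor to another without a round trip through user-space buffers. A minimal Linux sketch using Python's `os.sendfile` wrapper (a real broker does the equivalent through Java NIO's `FileChannel.transferTo`, with a socket as the destination; a regular file serves as the destination here just for demonstration):

```python
import os
import tempfile

# Write a fake "log segment" to disk.
src_fd, src_path = tempfile.mkstemp()
os.write(src_fd, b"batch-1|batch-2|batch-3")
os.close(src_fd)

# sendfile(2) copies inside the kernel: no user-space buffer, no memcpy
# into the process. The offset/count arguments are how a broker can serve
# a consumer fetch starting at an arbitrary position in the segment.
dst_fd, dst_path = tempfile.mkstemp()
src = os.open(src_path, os.O_RDONLY)
sent = os.sendfile(dst_fd, src, 8, 7)  # serve 7 bytes starting at offset 8
os.close(src)
os.close(dst_fd)
```

With TLS in the picture, the bytes must pass through user space to be encrypted before hitting the socket, which is exactly the copy this call avoids.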
Performance and encryption
Hi everyone,

I understand one of the reasons why Kafka is performant is by using zero-copy.

I often hear that when encryption is enabled, Kafka has to copy the data into user space to decode the message, so it has a big impact on performance.

If that is true, I don't get why the message has to be decoded by Kafka. I would assume that whether the message is encrypted or not, Kafka simply receives it, appends it to the file, and when a consumer wants to read it, it simply reads at the right offset...

Also, I'm wondering if this is the case if we don't use keys (pure queuing system with key=null).

Cheers
Nico