[ https://issues.apache.org/jira/browse/KAFKA-13799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17517382#comment-17517382 ]
RivenSun edited comment on KAFKA-13799 at 4/5/22 12:37 PM:
-----------------------------------------------------------

Hi [~showuon] [~dengziming] and [~guozhang]

I think we can state in Kafka's documentation that when Kafka reads messages from files on disk, it combines the page cache and zero-copy to greatly improve message consumption efficiency, but that zero-copy only works in PlaintextTransportLayer.

Maybe add this statement to the sendfile section of the documentation below.
[https://kafka.apache.org/documentation/#maximizingefficiency]

This combination of pagecache and sendfile means that on a Kafka cluster where the consumers are mostly caught up you will see no read activity on the disks whatsoever as they will be serving data entirely from cache.
*When the transport layer uses the SSL protocol, sendfile will not be used, because the data read must be encrypted before it is sent.*

WDYT?
Thanks.


was (Author: rivensun):
Hi [~showuon] [~dengziming] and [~guozhang]

I think we can state in Kafka's documentation that when Kafka reads messages from files on disk, it combines the page cache and zero-copy to greatly improve message consumption efficiency, but that zero-copy only works in PlaintextTransportLayer.

Maybe add this statement to the sendfile section of the documentation below.
[https://kafka.apache.org/documentation/#maximizingefficiency]

WDYT?
Thanks.

> Improve documentation for Kafka zero-copy
> -----------------------------------------
>
>                 Key: KAFKA-13799
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13799
>             Project: Kafka
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: RivenSun
>            Priority: Major
>
> According to the documentation at https://kafka.apache.org/documentation/#maximizingefficiency
> and [https://kafka.apache.org/documentation/#networklayer],
> Kafka combines the page cache and zero-copy when reading messages
> from files on disk, which greatly improves the message consumption rate.
> But after browsing the source code, starting from the
> *FileRecords.writeTo(...)* method:
> 1. Only PlaintextTransportLayer.transferFrom() uses fileChannel.transferTo(),
> which relies on the underlying sendfile call to implement zero-copy data
> transfer.
> 2. The logic of the SslTransportLayer.transferFrom() method is:
> {code:java}
> fileChannel.read(fileChannelBuffer, pos)
> ->
> sslEngine.wrap(src, netWriteBuffer)
> ->
> flush(ByteBuffer buf) && socketChannel.write(buf){code}
> That is, the data is first read from disk or directly from the page cache,
> then encrypted, and finally the encrypted data is sent to the network.
> {*}FileChannel.transferTo() is not used anywhere in this process{*}.
>
> Conclusion:
> PlaintextTransportLayer and SslTransportLayer both use the page cache, but
> SslTransportLayer does not implement zero-copy.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
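
For illustration only, the two transfer paths described in the issue can be sketched with plain java.nio. This is not Kafka's code: the class `TransferPaths`, the methods `sendPlaintext` and `sendWithTransform`, and the constant `XOR_KEY` are hypothetical names, and the XOR step is only a stand-in for `sslEngine.wrap(...)`, since a real SSLEngine needs a completed handshake. The sketch also assumes a blocking destination channel. The point it tries to show is that once the bytes have to be transformed, they must pass through a user-space buffer, so `FileChannel.transferTo()` cannot be used.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical illustration class; not part of Kafka.
final class TransferPaths {
    // Stand-in "cipher" key. Real TLS would use SSLEngine.wrap(...) instead.
    static final byte XOR_KEY = 0x5A;

    // Plaintext-style path: hand the transfer to FileChannel.transferTo().
    // When dest is a socket channel and the OS supports it, the JDK may
    // move the bytes without copying them into user space.
    static long sendPlaintext(FileChannel file, long position, long count,
                              WritableByteChannel dest) throws IOException {
        long sent = 0;
        while (sent < count) {
            long n = file.transferTo(position + sent, count - sent, dest);
            if (n <= 0) {
                break; // end of file reached or nothing could be transferred
            }
            sent += n;
        }
        return sent;
    }

    // SSL-style path: read into a user-space buffer, transform, then write.
    // The bytes may still come from the page cache, but they are copied.
    static long sendWithTransform(FileChannel file, long position, long count,
                                  WritableByteChannel dest) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(8192);
        long sent = 0;
        while (sent < count) {
            buf.clear();
            buf.limit((int) Math.min(buf.capacity(), count - sent));
            int n = file.read(buf, position + sent); // positional read
            if (n <= 0) {
                break; // end of file
            }
            buf.flip();
            // Stand-in for encryption: every byte is touched in user space.
            for (int i = 0; i < buf.limit(); i++) {
                buf.put(i, (byte) (buf.get(i) ^ XOR_KEY));
            }
            while (buf.hasRemaining()) {
                dest.write(buf); // assumes a blocking destination channel
            }
            sent += n;
        }
        return sent;
    }
}
```

Both methods deliver the same number of bytes; only the first leaves room for the kernel to skip the user-space copy, which matches the conclusion in the issue description.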