[ https://issues.apache.org/jira/browse/KAFKA-15042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752503#comment-17752503 ]
Adrian Preston commented on KAFKA-15042:
----------------------------------------

[~smashingquasar], Looking at the hex dump of your APIVersions request, I think you are correctly encoding the empty tag field as a single 0x00 byte. It looks, however, like your compact string encoding (used for the 'client_software_name' and 'client_software_version' fields) is adding an unexpected 0x01 as the first byte. For comparison, here's the APIVersions request I get from Sarama:

{{00 00 00 21 (Length)}}
{{00 12 (API key)}}
{{00 03 (API version)}}
{{00 00 00 00 (Correlation ID)}}
{{00 07 77 65 62 2d 61 70 69 (Client ID)}}
{{00 (Tagged fields)}}
{{08 77 65 62 2d 61 70 69 (Client software name)}}
{{06 30 2e 30 2e 31 (Client software version)}}
{{00 (Tagged fields)}}

> Clarify documentation on Tagged fields
> --------------------------------------
>
>                 Key: KAFKA-15042
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15042
>             Project: Kafka
>          Issue Type: Wish
>          Components: docs
>        Environment: Using the Ubuntu/Kafka Docker image for testing purposes.
>            Reporter: Nicolas
>            Assignee: Adrian Preston
>            Priority: Major
>
> Hello,
> I am currently working on an implementation of the Kafka protocol.
> So far, all my code is working as intended, serialising requests and deserialising responses, as long as I am not using the flex requests system.
> I am now trying to implement the flex requests system, but the documentation is scarce on the subject of tagged fields.
> If we take the Request Header v2:
>
> {code:java}
> Request Header v2 => request_api_key request_api_version correlation_id client_id TAG_BUFFER
>   request_api_key => INT16
>   request_api_version => INT16
>   correlation_id => INT32
>   client_id => NULLABLE_STRING
> {code}
>
> Here, the BNF seems violated: TAG_BUFFER is not a value in this situation. It appears to be a type, yet it does not appear within the detailed description inside the BNF.
> TAG_BUFFER also does not refer to any declared type within the documentation.
> It seems to indicate tagged fields, though.
> Now, when looking at tagged fields, the only mention of them within the documentation is:
> {quote}Note that [KIP-482 tagged fields|https://cwiki.apache.org/confluence/display/KAFKA/KIP-482%3A+The+Kafka+Protocol+should+Support+Optional+Tagged+Fields] can be added to a request without incrementing the version number. This offers an additional way of evolving the message schema without breaking compatibility. Tagged fields do not take up any space when the field is not set. Therefore, if a field is rarely used, it is more efficient to make it a tagged field than to put it in the mandatory schema. However, tagged fields are ignored by recipients that don't know about them, which could pose a challenge if this is not the behavior that the sender wants. In such cases, a version bump may be more appropriate.
> {quote}
> This leads to KIP-482, which does not clearly and explicitly detail the process of writing and reading those tagged fields.
> I decided to look at existing clients to understand how they handle tagged fields. I notably looked at kafka-js (JavaScript client) and librdkafka (C client) and, to my surprise, they have not implemented tagged fields. In fact, they rely on a hack to skip them and ignore them completely.
> I also had a look at the [Java client bundled within Kafka, within the Tagged Fields section|https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/protocol/types/TaggedFields.java#L64].
> Now, I am not a Java developer, so I may not fully understand this code. I read the comment and implemented the logic following Google's Protobuf specification. The problem is that this leads to a request that produces a stack trace within Kafka (it would be appreciated to handle errors gracefully instead of just dumping stack traces, by the way).
>
> As a reference, I tried to send an APIVersions (key: 18) (version: 3) request.
> My request reads as follows when converted to hexadecimal:
>
> {code:java}
> Request header: 00 12 00 03 00 00 00 00 00 07 77 65 62 2d 61 70 69 00
> Request body: 01 08 77 65 62 2d 61 70 69 01 06 30 2e 30 2e 31 00
> Full request: 00 00 00 23 00 12 00 03 00 00 00 00 00 07 77 65 62 2d 61 70 69 00 01 08 77 65 62 2d 61 70 69 01 06 30 2e 30 2e 31 00
> {code}
>
> This creates a buffer underflow error within Kafka:
> {code:java}
> [2023-05-31 14:14:31,132] ERROR Exception while processing request from 172.21.0.5:9092-172.21.0.3:59228-21 (kafka.network.Processor)
> org.apache.kafka.common.errors.InvalidRequestException: Error getting request for apiKey: API_VERSIONS, apiVersion: 3, connectionId: 172.21.0.5:9092-172.21.0.3:59228-21, listenerName: ListenerName(PLAINTEXT), principal: User:ANONYMOUS
> Caused by: java.nio.BufferUnderflowException
>     at java.base/java.nio.HeapByteBuffer.get(HeapByteBuffer.java:182)
>     at java.base/java.nio.ByteBuffer.get(ByteBuffer.java:770)
>     at org.apache.kafka.common.protocol.ByteBufferAccessor.readArray(ByteBufferAccessor.java:58)
>     at org.apache.kafka.common.protocol.Readable.readUnknownTaggedField(Readable.java:53)
>     at org.apache.kafka.common.message.ApiVersionsRequestData.read(ApiVersionsRequestData.java:133)
>     at org.apache.kafka.common.message.ApiVersionsRequestData.<init>(ApiVersionsRequestData.java:74)
>     at org.apache.kafka.common.requests.ApiVersionsRequest.parse(ApiVersionsRequest.java:119)
>     at org.apache.kafka.common.requests.AbstractRequest.doParseRequest(AbstractRequest.java:207)
>     at org.apache.kafka.common.requests.AbstractRequest.parseRequest(AbstractRequest.java:165)
>     at org.apache.kafka.common.requests.RequestContext.parseRequest(RequestContext.java:95)
>     at kafka.network.RequestChannel$Request.<init>(RequestChannel.scala:101)
>     at kafka.network.Processor.$anonfun$processCompletedReceives$1(SocketServer.scala:1030)
>     at java.base/java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:608)
>     at kafka.network.Processor.processCompletedReceives(SocketServer.scala:1008)
>     at kafka.network.Processor.run(SocketServer.scala:893)
>     at java.base/java.lang.Thread.run(Thread.java:829){code}
>
> I attempted many different approaches, but I have to admit that I am making absolutely no progress.
> I am creating this issue for two reasons:
> * Obviously, I would very much appreciate an explanation of how to write and read tagged fields, with a detailed approach. What kind of bytes are expected? What format do the values take?
> * Since mainstream clients aren't implementing this feature, it is wasted on most users. Let's face it, very few people actually implement the binary protocol. It is sad that this feature is not widely available to everyone, especially since it can reduce request sizes. I think improving the documentation to explain tagged fields clearly, with at least one example, would greatly benefit the community.
> Thanks in advance for your answers!
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)