rhauch edited a comment on pull request #9950:
URL: https://github.com/apache/kafka/pull/9950#issuecomment-783634110


   Before this fix, the changes made in #4820 
([KAFKA-6684](https://issues.apache.org/jira/browse/KAFKA-6684)) resulted in 
`toString()` being called on `byte[]` and `ByteBuffer`. As highlighted in this 
PR and issue that `ByteBuffer.toString()` is not useful, but the `toString()` 
on `byte[]` still works. This PR seems to change that behavior, which would not 
be backward compatible.
   
   The discussion on PR #4820 also talked about making this compatible with 
`Values.convertToString(...)`, which for `byte[]` and `ByteBuffer` results in a 
base 64 encoded string (with `ISO-8859-1` encoding). See the [Values code for 
details](https://github.com/apache/kafka/blob/95f51539c8d0b88bd7f285011d42e2d1117107de/connect/api/src/main/java/org/apache/kafka/connect/data/Values.java#L673-L682).
   
   ~Because of this, it seems much more sensible to also use base64 here for 
`ByteBuffer` so it matches the existing behavior with `byte[]`.~
   
   ~I agree that it maybe doesn't suffice in all user situations, so if we also 
want to support other encodings we'd need other config changes that will 
require using the KIP mechanism to propose such enhancements.~
   
   Actually, the existing code simply outputs the `toString()` result of a 
`byte[]` object, which is of the form `[B@22ef9844`, which is the 
`Object.toString()` implementation that includes the class alias (e.g., `[B` is 
a byte array) and the hex representation of the object's `hashCode()`.
   
   However, given that none of the cast forms of `byte[]` or `ByteBuffer[]` are 
useful, then perhaps it's worth adding the previous discussion on #4820 
mentioning that:
   > As @ewencp suggested, for consistency perhaps we can use the same string 
formats as SimpleHeaderConverter: 
https://github.com/apache/kafka/blob/trunk/connect/api/src/main/java/org/apache/kafka/connect/data/Values.java,
   
   As mentioned above, the `Values.convertToString(...)` uses a base 64 
encoding. And this aligns with @kkonstantine's suggestion:
   > Why are we selecting hex here and not base64 for example? Hex of course is 
less efficient.
   
   WDYT?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to