[ 
https://issues.apache.org/jira/browse/KAFKA-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15300119#comment-15300119
 ] 

Ismael Juma commented on KAFKA-3744:
------------------------------------

It changes the message format so it needs a KIP. :) The KIP page even says: "We 
need to spend significantly more time on log format and protocol". Once those 
two bits are used for the purpose you propose, they cannot be used for anything 
else, so we take such changes very seriously (we don't have many free bits left 
as you can see).

As I said, it may be worth just asking for feedback on the mailing list before 
writing a complete KIP if you'd like to get some feedback before spending the 
time on it.

> Message format needs to identify serializer
> -------------------------------------------
>
>                 Key: KAFKA-3744
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3744
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: David Kay
>            Priority: Minor
>
> https://issues.apache.org/jira/browse/KAFKA-3698 was recently resolved with 
> https://github.com/apache/kafka/commit/27a19b964af35390d78e1b3b50bc03d23327f4d0.
> But Kafka documentation on message formats needs to be more explicit for new 
> users. Section 1.3 Step 4 says: "Send some messages" and takes lines of text 
> from the command line. Beginner's guide 
> (http://www.slideshare.net/miguno/apache-kafka-08-basic-training-verisign 
> Slide 104 says:
> {noformat}
>    Kafka does not care about data format of msg payload
>    Up to developer to handle serialization/deserialization
>       Common choices: Avro, JSON
> {noformat}
> If one producer sends lines of console text, another producer sends Avro, a 
> third producer sends JSON, and a fourth sends CBOR, how does the consumer 
> identify which deserializer to use for the payload?  The commit includes an 
> opaque K byte Key that could potentially include a codec identifier, but 
> provides no guidance on how to use it:
> {quote}
> "Leaving the key and value opaque is the right decision: there is a great 
> deal of progress being made on serialization libraries right now, and any 
> particular choice is unlikely to be right for all uses. Needless to say a 
> particular application using Kafka would likely mandate a particular 
> serialization type as part of its usage."
> {quote}
> Mandating any particular serialization is as unrealistic as mandating a 
> single mime-type for all web content.  There must be a way to signal the 
> serialization used to produce this message's V byte payload, and documenting 
> the existence of even a rudimentary codec registry with a few values (text, 
> Avro, JSON, CBOR) would establish the pattern to be used for future 
> serialization libraries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to