[ https://issues.apache.org/jira/browse/SAMZA-429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169960#comment-14169960 ]
Jay Kreps commented on SAMZA-429:
---------------------------------
I think another way to put this is that there are potentially two formats:
1. A set of Java interfaces/classes that the task code interacts with
2. A binary message format
Currently the framework specifies neither. You could specify (1) but not (2),
(2) but not (1), or both.
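The argument against specifying (1) is that each serialization type comes with
its own set of libraries that generate classes from a schema, so a
framework-mandated interface would have to wrap or replace classes that users
already get from their serialization library.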
One intermediate position would be to leave the framework pluggable for (1) but
provide a good implementation of one serialization type as the recommended
default for those who don't have a preference.
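
To make the distinction concrete, here is a rough sketch of what specifying (1) without (2) might look like. Everything in it (Message, MapMessage, PageViewLogic, MessageSketch) is a hypothetical name invented for illustration, not an existing Samza API: the task logic depends only on a small interface, and the bytes on the wire could come from any format that can be wrapped in it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical format-neutral view of a message (option 1).
// Not part of Samza; invented for this sketch.
interface Message {
    Object get(String field);          // scalar field, or null if absent
    Message getNested(String field);   // nested record, or null if absent or not a record
}

// One possible backing: a plain Map, such as a JSON deserializer might produce.
// An Avro-backed implementation could wrap a GenericRecord behind the same interface.
final class MapMessage implements Message {
    private final Map<String, Object> fields;

    MapMessage(Map<String, Object> fields) {
        this.fields = fields;
    }

    @Override
    public Object get(String field) {
        return fields.get(field);
    }

    @Override
    @SuppressWarnings("unchecked")
    public Message getNested(String field) {
        Object value = fields.get(field);
        return (value instanceof Map) ? new MapMessage((Map<String, Object>) value) : null;
    }
}

// Task-side logic sees only Message, never HashMap or GenericRecord.
// For brevity it assumes the "user" field is present.
final class PageViewLogic {
    static String describe(Message m) {
        Message user = m.getNested("user");
        return user.get("name") + " viewed " + m.get("page");
    }
}

class MessageSketch {
    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("name", "alice");
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("page", "/home");
        event.put("user", user);
        System.out.println(PageViewLogic.describe(new MapMessage(event)));
    }
}
```

The cost mentioned above is visible here: schema-generated classes would need an adapter like MapMessage to fit the interface, which is the main argument for keeping (1) pluggable.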
> Decouple Protocol from Task
> ---------------------------
>
> Key: SAMZA-429
> URL: https://issues.apache.org/jira/browse/SAMZA-429
> Project: Samza
> Issue Type: Improvement
> Reporter: Jonathan Herriott
>
> Maybe someone can point me in the right direction if this is wrong. One
> thing I've disliked about tasks is that the protocol has to be baked
> directly into the Task: if you want to process JSON, you have to treat the
> message contents as a HashMap, but if you want to use Avro, they need to be
> treated as a GenericRecord object, and so on. I think it would be very
> beneficial to fully abstract this from the Task and treat every message as
> a "Message" object. The advantage is that you could test with JSON and run
> with Avro in production, since debugging with JSON is a lot easier than
> with Avro.
> In the Task, I only care about the structure of the message, not which
> protocol it uses. Maybe this is a bit naive, but I don't think there would
> ever be a good situation in which you would pass just a string or an
> integer instead of some form of hierarchical message. In my opinion, every
> Serde should return a common Record interface on deserialization.
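
The proposal in the description could be sketched roughly as follows. All names here (MessageRecord, MapRecord, RecordSerde, TextSerde, BinarySerde, SerdeSwapSketch) are hypothetical and invented for illustration. RecordSerde is a local stand-in, not Samza's own Serde interface, and the two toy formats merely play the roles of "JSON for debugging" and "Avro for production". Records are flat string maps to keep the example short, so hierarchical messages are not covered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical common record type that every serde would return.
interface MessageRecord {
    String get(String field);
}

final class MapRecord implements MessageRecord {
    private final Map<String, String> fields;
    MapRecord(Map<String, String> fields) { this.fields = fields; }
    @Override public String get(String field) { return fields.get(field); }
}

// Local stand-in for a serde contract; an assumption, not Samza's interface.
interface RecordSerde {
    byte[] toBytes(Map<String, String> fields);
    MessageRecord fromBytes(byte[] bytes);
}

// Human-readable toy format: one "key=value" pair per line, no escaping.
final class TextSerde implements RecordSerde {
    @Override public byte[] toBytes(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }
    @Override public MessageRecord fromBytes(byte[] bytes) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String line : new String(bytes, StandardCharsets.UTF_8).split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) out.put(line.substring(0, eq), line.substring(eq + 1));
        }
        return new MapRecord(out);
    }
}

// Compact toy binary format: pair count, then length-prefixed UTF strings.
final class BinarySerde implements RecordSerde {
    @Override public byte[] toBytes(Map<String, String> fields) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(fields.size());
            for (Map.Entry<String, String> e : fields.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
    @Override public MessageRecord fromBytes(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            int count = in.readInt();
            Map<String, String> out = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                out.put(key, in.readUTF());
            }
            return new MapRecord(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class SerdeSwapSketch {
    // Task-side logic: identical regardless of which serde produced the record.
    static String greet(MessageRecord r) {
        return "hello " + r.get("name");
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "alice");
        for (RecordSerde serde : new RecordSerde[] { new TextSerde(), new BinarySerde() }) {
            System.out.println(greet(serde.fromBytes(serde.toBytes(fields))));
        }
    }
}
```

Swapping TextSerde for BinarySerde changes the bytes but not the task code, which is the property the description is asking for.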