[
https://issues.apache.org/jira/browse/SAMZA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279152#comment-14279152
]
Yi Pan (Data Infrastructure) commented on SAMZA-484:
----------------------------------------------------
+1 to the above proposal. I will work on an experimental implementation in that
direction.
For the generic data types we would support, I would like to quote this description of the types in
[Hive|http://blog.cloudera.com/blog/2012/11/analyzing-twitter-data-with-hadoop-part-3-querying-semi-structured-data-with-hive/]:
{quote}
There are all the usual players: integers, strings, floats, and the like, but
the interesting ones are the more exotic maps, arrays, and structs.
{quote}
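To make that type set concrete, here is a minimal sketch of how primitives plus the nested map/array/struct types could be described. Every name in it (Primitive, ArrayType, MapType, StructType, describe, the tweet example) is hypothetical and chosen only for illustration; none of it comes from Samza or Hive, and the real implementation may look quite different.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical type descriptors; none of these names come from Samza or Hive.


@dataclass(frozen=True)
class Primitive:
    name: str  # e.g. "int", "string", "float"


@dataclass(frozen=True)
class ArrayType:
    element: "FieldType"


@dataclass(frozen=True)
class MapType:
    key: "FieldType"
    value: "FieldType"


@dataclass(frozen=True)
class StructType:
    fields: tuple  # tuple of (field_name, FieldType) pairs, in declaration order


FieldType = Union[Primitive, ArrayType, MapType, StructType]

INT, STRING, FLOAT = Primitive("int"), Primitive("string"), Primitive("float")


def describe(t):
    """Render a type descriptor in a Hive-like notation, e.g. array<string>."""
    if isinstance(t, Primitive):
        return t.name
    if isinstance(t, ArrayType):
        return f"array<{describe(t.element)}>"
    if isinstance(t, MapType):
        return f"map<{describe(t.key)},{describe(t.value)}>"
    if isinstance(t, StructType):
        inner = ",".join(f"{name}:{describe(ft)}" for name, ft in t.fields)
        return f"struct<{inner}>"
    raise TypeError(f"unknown type descriptor: {t!r}")


# An illustrative tuple schema mixing primitives with the nested types.
tweet = StructType((
    ("id", INT),
    ("text", STRING),
    ("hashtags", ArrayType(STRING)),
    ("counts", MapType(STRING, INT)),
))
```

Calling describe(tweet) walks the nested descriptors recursively, so arbitrarily deep combinations of maps, arrays, and structs are handled by the same four cases.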
> Define the serialization/deserialization format for stream tuple
> ----------------------------------------------------------------
>
> Key: SAMZA-484
> URL: https://issues.apache.org/jira/browse/SAMZA-484
> Project: Samza
> Issue Type: Sub-task
> Components: sql
> Reporter: Yi Pan (Data Infrastructure)
> Priority: Minor
> Labels: project
>
> The streaming SQL discussion made it clear that we will need to define
> the serialization/deserialization format for stream tuples.
> The ideal serialization/deserialization format should be both forward and
> backward compatible when fields are added to or missing from the data.
> Several choices to be considered:
> 1) Avro
> 2) Protobuf
> 3) FlatBuffers
> It might also be interesting to consider a pluggable serialization interface
> that allows different serialization methods for different Samza jobs.
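
As a rough illustration of the two ideas in the issue description above (a pluggable interface, and tolerance of added or missing fields), here is a minimal sketch. It is an assumption-laden stand-in, not Samza's API: TupleSerde, JsonTupleSerde, SERDES, and serde_for_job are hypothetical names, and JSON is used only because it needs no extra libraries; Avro, Protobuf, or FlatBuffers would each handle schema evolution in their own way.

```python
import json
from abc import ABC, abstractmethod


class TupleSerde(ABC):
    """Hypothetical pluggable interface; not Samza's actual API."""

    @abstractmethod
    def to_bytes(self, record: dict) -> bytes: ...

    @abstractmethod
    def from_bytes(self, data: bytes) -> dict: ...


class JsonTupleSerde(TupleSerde):
    """JSON stand-in for a real format, used only to show field handling."""

    def __init__(self, defaults: dict):
        # Reader-side schema: field name -> default used when the field is absent.
        self.defaults = dict(defaults)

    def to_bytes(self, record: dict) -> bytes:
        return json.dumps(record, sort_keys=True).encode("utf-8")

    def from_bytes(self, data: bytes) -> dict:
        raw = json.loads(data.decode("utf-8"))
        # Missing fields take the reader's default; unknown fields are dropped.
        return {name: raw.get(name, default)
                for name, default in self.defaults.items()}


# Registry keyed by a per-job setting (the key "json" is made up for this sketch).
SERDES = {"json": JsonTupleSerde}


def serde_for_job(kind: str, defaults: dict) -> TupleSerde:
    """Pick a serde implementation for a job based on its configured kind."""
    return SERDES[kind](defaults)


# Two schema versions of the same stream: v2 adds a "lang" field.
v1 = serde_for_job("json", {"id": 0, "text": ""})
v2 = serde_for_job("json", {"id": 0, "text": "", "lang": "unknown"})

old_bytes = v1.to_bytes({"id": 1, "text": "hi"})
new_bytes = v2.to_bytes({"id": 2, "text": "yo", "lang": "en"})

upgraded = v2.from_bytes(old_bytes)    # newer reader, older data
downgraded = v1.from_bytes(new_bytes)  # older reader, newer data
```

Here the newer reader fills the missing "lang" field with its default (backward compatibility), while the older reader ignores the field it does not know about (forward compatibility). Each job could register a different TupleSerde implementation under its own key without changing the calling code.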