[jira] [Commented] (SAMZA-484) Define the serialization/deserialization format for stream tuple

Yi Pan (Data Infrastructure) (JIRA) Thu, 15 Jan 2015 11:22:29 -0800

    [ https://issues.apache.org/jira/browse/SAMZA-484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279152#comment-14279152 ]


Yi Pan (Data Infrastructure) commented on SAMZA-484:
----------------------------------------------------

+1 to the above proposal. I will work on an experimental implementation in that 
direction.

For the generic data types we support, I would like to quote this description 
of the types in 
[Hive|http://blog.cloudera.com/blog/2012/11/analyzing-twitter-data-with-hadoop-part-3-querying-semi-structured-data-with-hive/]:
{quote}
There are all the usual players: integers, strings, floats, and the like, but 
the interesting ones are the more exotic maps, arrays, and structs. 
{quote}
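
To make the quoted type list concrete, here is a minimal sketch of how primitives plus arrays, maps, and structs could be described and checked recursively. All names in it (PRIMITIVES, ArrayType, MapType, StructType, conforms, TWEET) are hypothetical and chosen only for illustration; they are not taken from Samza or Hive.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union

# Hypothetical primitive type names mapped to Python types (illustration only).
PRIMITIVES = {"int": int, "float": float, "string": str, "boolean": bool}


@dataclass
class ArrayType:
    element: "FieldType"


@dataclass
class MapType:
    key: "FieldType"
    value: "FieldType"


@dataclass
class StructType:
    fields: Dict[str, "FieldType"]


FieldType = Union[str, ArrayType, MapType, StructType]


def conforms(value: Any, ftype: FieldType) -> bool:
    """Return True if value matches ftype, recursing into nested types."""
    if isinstance(ftype, str):
        expected = PRIMITIVES[ftype]
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        if isinstance(value, bool) and expected is not bool:
            return False
        return isinstance(value, expected)
    if isinstance(ftype, ArrayType):
        return isinstance(value, list) and all(
            conforms(v, ftype.element) for v in value
        )
    if isinstance(ftype, MapType):
        return isinstance(value, dict) and all(
            conforms(k, ftype.key) and conforms(v, ftype.value)
            for k, v in value.items()
        )
    if isinstance(ftype, StructType):
        return (
            isinstance(value, dict)
            and value.keys() == ftype.fields.keys()
            and all(conforms(value[n], t) for n, t in ftype.fields.items())
        )
    return False


# A hypothetical tuple type that uses every kind of type from the quote.
TWEET = StructType({
    "id": "int",
    "text": "string",
    "hashtags": ArrayType("string"),
    "counts": MapType("string", "int"),
})
```

For example, conforms({"id": 1, "text": "hi", "hashtags": ["samza"], "counts": {"retweets": 3}}, TWEET) holds, while the same tuple with "id": "1" does not.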


> Define the serialization/deserialization format for stream tuple
> ----------------------------------------------------------------
>
>                 Key: SAMZA-484
>                 URL: https://issues.apache.org/jira/browse/SAMZA-484
>             Project: Samza
>          Issue Type: Sub-task
>          Components: sql
>            Reporter: Yi Pan (Data Infrastructure)
>            Priority: Minor
>              Labels: project
>
> The streaming SQL discussion showed that we need to define the 
> serialization/deserialization format for stream tuples.
> The ideal format should be both forward and backward compatible when fields 
> are added to or missing from the data.
> Several choices to consider:
> 1) Avro
> 2) Protobuf
> 3) FlatBuffers
> It might also be interesting to consider a pluggable serialization interface 
> that allows different serialization methods for different Samza jobs.
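
The two requirements in the description above, tolerance of additional/missing fields and a serialization method chosen per job, can be sketched as follows. This is only an illustration under stated assumptions: JSON stands in for the wire format (it is not one of the listed candidates), and READER_SCHEMA, SERDES, serde_for, and the "serde" config key are hypothetical names, not Samza APIs.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical reader schema: field name -> default used when the field is missing.
READER_SCHEMA: Dict[str, Any] = {"user": "", "clicks": 0}


def json_serialize(record: Dict[str, Any]) -> bytes:
    """Encode a tuple as UTF-8 JSON (a stand-in for a real wire format)."""
    return json.dumps(record).encode("utf-8")


def json_deserialize(data: bytes,
                     schema: Dict[str, Any] = READER_SCHEMA) -> Dict[str, Any]:
    """Decode a tuple against the reader schema.

    Missing fields take the schema default, so data from older writers is
    readable; unknown fields are dropped, so data from newer writers is too.
    """
    raw = json.loads(data.decode("utf-8"))
    return {name: raw.get(name, default) for name, default in schema.items()}


Serde = Tuple[Callable[[Dict[str, Any]], bytes], Callable[[bytes], Dict[str, Any]]]

# Hypothetical registry: each job picks a serde by name in its config.
SERDES: Dict[str, Serde] = {"json": (json_serialize, json_deserialize)}


def serde_for(job_config: Dict[str, str]) -> Serde:
    """Look up the (serialize, deserialize) pair named by the job config."""
    return SERDES[job_config.get("serde", "json")]


serialize, deserialize = serde_for({"serde": "json"})
old_writer = serialize({"user": "alice"})  # written before "clicks" existed
new_writer = serialize({"user": "bob", "clicks": 2, "referrer": "x"})  # extra field
```

Here deserialize(old_writer) fills in the default for "clicks", and deserialize(new_writer) ignores the unknown "referrer" field. A second format would be added by registering another pair of functions in SERDES.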



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
