[ 
https://issues.apache.org/jira/browse/STORM-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Kellogg updated STORM-138:
-------------------------------
    Component/s: storm-multilang

> Pluggable serialization for multilang
> -------------------------------------
>
>                 Key: STORM-138
>                 URL: https://issues.apache.org/jira/browse/STORM-138
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-multilang
>            Reporter: James Xu
>            Assignee: John Sebastian Gilmore
>            Priority: Minor
>             Fix For: 0.9.2-incubating
>
>
> https://github.com/nathanmarz/storm/issues/373
> Currently JSON is used to serialize tuples for multilang. It would be great 
> if the serialization mechanism were pluggable so that using richer types with 
> multilang would be possible.
> ---------
> francis-liberty: Hello, I am a newbie here, and I would like to pick up this 
> issue. I also noticed a recent PR, #697 by jsgilmore. Does it address this 
> issue, too?
> I looked around the source code, and I would like to talk about my opinions 
> on this issue here.
> For now, ShellProcess supports only JSON for reading from and writing to 
> the multilang process, and ShellSpout and ShellBolt talk to ShellProcess 
> through JSON, too. This is all because ShellProcess's interface uses 
> JSONObject only. Conceptually, ShellProcess should encapsulate the multilang 
> details and talk to Bolt and Spout using Tuple. (jsgilmore introduced two 
> new classes, Immission and Emission, but I think all the information Bolt 
> and Spout need is already in Tuple, so no new data structures are needed.) 
> So I think it would be much cleaner to do serialization in ShellProcess 
> only, so that ShellSpout and ShellBolt know nothing about how ShellProcess 
> converts between Tuple and strings.
> So, I propose to do the following:
> 1. Change the interface of ShellProcess to return and accept the Tuple data 
> structure instead of JSONObject.
> 2. Make ShellSpout and ShellBolt work on Tuple; all information such as 
> task_id, stream_id and tuples should be retrieved from and encapsulated in 
> this data structure.
> 3. What other serialization formats would you like to add? I think in the end 
> we need to add some example other than JSON to storm-starter storm.py/rb, 
> which I would also like to work on.
> ----------
> jsgilmore: Hi, all serialisation is done in the JSONSerialiser, so no 
> serialisation is done in ShellBolt, ShellProcess or ShellSpout. They just 
> send around the Emission and Immission classes. The point of the ISerializer 
> interface is to achieve the separation of serialisation.
> I come from the multilang side of Storm, so I'm not that familiar with the 
> internal Storm structures. If there is a class that the ISerializer interface 
> can use, instead of the Emission and Immission classes, I'm open to it.
> I would recommend that further discussion of PR #697 happen in the PR 
> thread itself, though.
> I created issue #654 to add protocol buffer serialisation for multilang to 
> Storm, but I hadn't seen this issue. The whole purpose of PR #697 is to 
> solve this issue.
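
The separation discussed in the comments above (components exchange plain
message objects, and only a pluggable serializer knows the wire format) can
be sketched roughly as follows. This is a minimal illustration, not the code
from PR #697: the names ShellMessage, MessageSerializer,
TabSeparatedSerializer and ShellChannel are hypothetical, and the toy
tab-separated format merely stands in for JSON or protocol buffers.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical message type: the fields a shell component needs from a tuple.
final class ShellMessage {
    final String streamId;
    final int taskId;
    final List<String> values;

    ShellMessage(String streamId, int taskId, List<String> values) {
        this.streamId = streamId;
        this.taskId = taskId;
        this.values = values;
    }
}

// Hypothetical plug-in point: one implementation per wire format.
interface MessageSerializer {
    byte[] serialize(ShellMessage message);
    ShellMessage deserialize(byte[] bytes);
}

// Toy tab-separated format standing in for JSON, protocol buffers, etc.
// Assumption: no field contains a tab character.
final class TabSeparatedSerializer implements MessageSerializer {
    @Override
    public byte[] serialize(ShellMessage message) {
        List<String> fields = new ArrayList<>();
        fields.add(message.streamId);
        fields.add(Integer.toString(message.taskId));
        fields.addAll(message.values);
        return String.join("\t", fields).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public ShellMessage deserialize(byte[] bytes) {
        // Limit -1 keeps trailing empty fields instead of dropping them.
        String[] fields = new String(bytes, StandardCharsets.UTF_8).split("\t", -1);
        List<String> values =
            new ArrayList<>(Arrays.asList(fields).subList(2, fields.length));
        return new ShellMessage(fields[0], Integer.parseInt(fields[1]), values);
    }
}

// Stand-in for the process wrapper: callers pass messages, never wire bytes.
final class ShellChannel {
    private final MessageSerializer serializer;

    ShellChannel(MessageSerializer serializer) {
        this.serializer = serializer;
    }

    // Round-trips a message through the wire format, as a subprocess echo would.
    ShellMessage echo(ShellMessage message) {
        return serializer.deserialize(serializer.serialize(message));
    }
}
```

Under these assumptions, switching formats means passing a different
MessageSerializer to ShellChannel; the calling code that builds and consumes
ShellMessage objects stays unchanged.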



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
