[ https://issues.apache.org/jira/browse/STORM-138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Kellogg updated STORM-138:
-------------------------------
    Component/s: storm-multilang

> Pluggable serialization for multilang
> -------------------------------------
>
>                 Key: STORM-138
>                 URL: https://issues.apache.org/jira/browse/STORM-138
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-multilang
>            Reporter: James Xu
>            Assignee: John Sebastian Gilmore
>            Priority: Minor
>             Fix For: 0.9.2-incubating
>
>
> https://github.com/nathanmarz/storm/issues/373
> Currently JSON is used to serialize tuples for multilang. It would be great
> if the serialization mechanism were pluggable so that using richer types with
> multilang would be possible.
> ---------
> francis-liberty: Hello, I am a newbie here, and I wanted to pick up this
> issue. I also noticed a recent PR here, #697 by jsgilmore; is it feasible for
> this issue, too?
> I looked around the source code, and I would like to share my opinions
> on this issue here.
> For now, ShellProcess supports only JSON for communicating (read, write) with
> the multilang process. ShellSpout and ShellBolt talk to ShellProcess
> through JSON, too. This is all because ShellProcess's interface uses
> JSONObject only. Conceptually, ShellProcess should encapsulate the multilang
> details and talk to Bolt and Spout using Tuple. (jsgilmore introduced two
> new classes, Immission and Emission. But I think all the information Bolt and
> Spout need is in Tuple already, so there is no need for new data structures.)
> So I think it would be much cleaner to do serialization in ShellProcess only,
> so that neither ShellSpout nor ShellBolt knows anything about how ShellProcess
> converts between Tuple and strings.
> So, I suppose I can do the following work:
> 1. Change the interface of ShellProcess to return and accept the Tuple data
> structure instead of JSONObject.
> 2. Make ShellSpout and ShellBolt work on Tuple; all information such as
> task_id, stream_id and tuples should be retrieved from or encapsulated in
> this data structure.
> 3. What other serialization format would you like to add? I think in the end
> we need to add some example other than JSON to storm-starter storm.py/rb,
> which I would also like to work on.
> ----------
> jsgilmore: Hi, all serialisation is done in the JSONSerialiser, so no
> serialisation is done in ShellBolt, ShellProcess or ShellSpout. They just
> pass around the Emission and Immission classes. The point of the ISerializer
> interface is to achieve the separation of serialisation.
> I come from the multilang side of Storm, so I'm not that familiar with the
> internal Storm structures. If there is a class that the ISerializer interface
> can use instead of the Emission and Immission classes, I'm open to it.
> I would recommend that further discussion of PR #697 happen in the PR
> thread itself, though.
> I created issue #654 to add protocol buffer serialisation for multilang to
> Storm, but I didn't see this issue. The whole purpose of PR #697 is to
> solve this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
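
The separation discussed in the comments above (one pluggable serializer that alone knows the wire format, while the shell components exchange neutral message objects) could be sketched roughly as below. This is only an illustration in Python, the language of the storm.py side of multilang, and not Storm's actual Java API: `ShellMessage`, `Serializer`, `JsonSerializer` and `ShellChannel` are hypothetical names, the field names are illustrative, and an in-memory list stands in for the subprocess pipes.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ShellMessage:
    """Hypothetical neutral message object; not a Storm class."""
    command: str
    stream: str = "default"
    task: int = 0
    values: List[Any] = field(default_factory=list)


class Serializer(ABC):
    """Hypothetical plug-in point: the only code that knows the wire format."""

    @abstractmethod
    def dumps(self, msg: ShellMessage) -> bytes:
        ...

    @abstractmethod
    def loads(self, data: bytes) -> ShellMessage:
        ...


class JsonSerializer(Serializer):
    """JSON implementation; another format would be a sibling class."""

    def dumps(self, msg: ShellMessage) -> bytes:
        payload = {
            "command": msg.command,
            "stream": msg.stream,
            "task": msg.task,
            "tuple": msg.values,
        }
        return json.dumps(payload).encode("utf-8")

    def loads(self, data: bytes) -> ShellMessage:
        obj = json.loads(data.decode("utf-8"))
        return ShellMessage(
            command=obj["command"],
            stream=obj.get("stream", "default"),
            task=obj.get("task", 0),
            values=obj.get("tuple", []),
        )


class ShellChannel:
    """Stand-in for the process wrapper: callers see messages, never bytes."""

    def __init__(self, serializer: Serializer) -> None:
        self._serializer = serializer
        self._wire: List[bytes] = []  # assumption: replaces real stdin/stdout pipes

    def write(self, msg: ShellMessage) -> None:
        self._wire.append(self._serializer.dumps(msg))

    def read(self) -> ShellMessage:
        return self._serializer.loads(self._wire.pop(0))
```

With this shape, swapping the format means passing a different `Serializer` to `ShellChannel`; the code that builds and consumes `ShellMessage` objects does not change.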