Fei Pan created STORM-2761:
------------------------------

             Summary: JoinBolt.java 's paradigm is new model of stream join?
                 Key: STORM-2761
                 URL: https://issues.apache.org/jira/browse/STORM-2761
             Project: Apache Storm
          Issue Type: Question
          Components: storm-client
            Reporter: Fei Pan
            Priority: Critical

Hi,

I am a researcher at the University of Toronto studying acceleration of stream processing platforms, and I have a question about the window-based stream join model used in JoinBolt.java.

From my understanding, when a new tuple arrives, it should be joined with all the tuples in the window of the opposite stream. In JoinBolt.java, however, not only the new tuple but every tuple in the local window is joined with the window of the opposite stream. This produces many duplicate results, since most of the old tuples in the local window have already been joined.

I don't know whether this is a new paradigm or whether the Storm team misunderstood the stream join model. Can someone help clarify this?
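
To make the difference concrete, here is a minimal sketch of the two models as I understand them. It is not taken from JoinBolt.java: the class JoinModelSketch, the Tuple type, and all method names are hypothetical, the windows never evict tuples, and only one stream receives new arrivals. It only illustrates where the duplicate results would come from.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class JoinModelSketch {
    /** Hypothetical tuple: a join key plus an id used only to label results. */
    public static final class Tuple {
        final String key;
        final int id;
        public Tuple(String key, int id) { this.key = key; this.id = id; }
    }

    /** Joins one tuple against every matching tuple in the opposite window. */
    public static List<String> probe(Tuple fresh, List<Tuple> oppositeWindow) {
        List<String> out = new ArrayList<>();
        for (Tuple t : oppositeWindow) {
            if (t.key.equals(fresh.key)) {
                out.add(fresh.id + "-" + t.id);
            }
        }
        return out;
    }

    /** Joins every tuple of the local window against the opposite window. */
    public static List<String> joinWindows(List<Tuple> local, List<Tuple> opposite) {
        List<String> out = new ArrayList<>();
        for (Tuple l : local) {
            out.addAll(probe(l, opposite));
        }
        return out;
    }

    /** Model 1 (assumed): each arrival probes the opposite window exactly once. */
    public static List<String> runIncremental(List<Tuple> arrivals, List<Tuple> opposite) {
        List<String> emitted = new ArrayList<>();
        for (Tuple t : arrivals) {
            emitted.addAll(probe(t, opposite));
        }
        return emitted;
    }

    /** Model 2 (assumed): after each arrival the whole local window is re-joined. */
    public static List<String> runFullRejoin(List<Tuple> arrivals, List<Tuple> opposite) {
        List<String> emitted = new ArrayList<>();
        List<Tuple> localWindow = new ArrayList<>();
        for (Tuple t : arrivals) {
            localWindow.add(t); // no eviction in this sketch
            emitted.addAll(joinWindows(localWindow, opposite));
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Tuple> right = List.of(new Tuple("a", 10), new Tuple("a", 20));
        List<Tuple> leftArrivals =
                List.of(new Tuple("a", 1), new Tuple("a", 2), new Tuple("a", 3));

        List<String> incremental = runIncremental(leftArrivals, right);
        List<String> full = runFullRejoin(leftArrivals, right);

        System.out.println("incremental results: " + incremental.size());
        System.out.println("full re-join results: " + full.size()
                + " (" + new HashSet<>(full).size() + " distinct)");
    }
}
```

With three arrivals probing a two-tuple opposite window, the incremental model emits 6 results, while the full re-join emits 12, of which only 6 are distinct. The extra 6 are the repeated pairs I am asking about.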