Makes sense. I noticed that too and I dropped the ControlMessage type in my code. I also moved taskName, taskCount to the parent ControlMessage class. Just updated the SEP-6. Please take a look again.
Thanks, Xinyu On Tue, May 30, 2017 at 9:12 AM, Chris Pettitt < cpett...@linkedin.com.invalid> wrote: > MessageType and ControlMessage.Type look redundant. You could either use > "ControlMessage" as the type in MessageType or drop ControlMessage.Type. > > On Fri, May 26, 2017 at 5:14 PM, xinyu liu <xinyuliu...@gmail.com> wrote: > > > Thanks a lot for the comments. I updated the SEP with more details and > > clarification. Please let me know if you have further questions. > > > > Thanks, > > Xinyu > > > > On Thu, May 25, 2017 at 11:19 AM, Prateek Maheshwari < > > pmaheshw...@linkedin.com.invalid> wrote: > > > > > Hi Xinyu, > > > > > > Thanks for the proposal. Some requests for clarifications. Let's update > > the > > > SEP directly instead of replying here. > > > > > > E.g., in "For any following intermediate stream whose input streams are > > all > > > end-of-stream, it will be marked as pending EOS" - Should clarify that > > > (IIUC) something is injecting EOS messages in all intermediate stream > > > partitions once it receives EOS from all input stream partitions it's > > > consuming. Should also clarify what is that something. > > > Same for "declare end of stream once all the EOS messages have been > > > received." - What does this declaration involve and who is doing this? > > > > > > In pro for approach 2: Not clear what this means - "The watermark can > > > conclude the input messages before this watermark have been complete." > > > > > > For the cons of approach 2: "Complicated failure scenario of the second > > > job. It needs to checkpoint all the watermark messages received, so > when > > it > > > recovered from failure, it can still count." - How is this related to > > EOS? > > > How is this related to the checkpoint watermark section? > > > Also, what is the "more messages required to write.. " referring to? > > > > > > "Samza needs to reconcile based on the task counts." - Please explain > > what > > > reconciliation means, why it needs to happen, and why we need to track > > the > > > producer task and total task count in the watermark message to do this. > > > > > > Checkpoint watermarks section is also unclear. What problem are we > trying > > > to solve here? > > > > > > Should also move the message format and the watermark message interface > > > sections to the bottom, since they depend on details in the event time > > and > > > checkpoint watermark sections. > > > > > > Thanks, > > > Prateek > > > > > > > > > On Wed, May 24, 2017 at 11:30 AM, xinyu liu <xinyuliu...@gmail.com> > > wrote: > > > > > > > Hi all, > > > > > > > > I created SEP-6 for SAMZA-1260 > > > > <https://issues.apache.org/jira/browse/SAMZA-1260>: Support > Watermark > > > > Across Intermediate Streams for Batch Processing. The link to the SEP > > is > > > > here: > > > > > > > > https://cwiki.apache.org/confluence/display/SAMZA/SEP- > > > > 6+Support+Watermark+Across+Intermediate+Streams+for+Batch+Processing > > > > > > > > Please review and comments are welcome! > > > > > > > > Thanks, > > > > Xinyu > > > > > > > > > >