Thomas, I was trying to refer to the input from previous operator.
Another thing when we extend the AbstractKafkaOutputOperator, do we need to specify <String, T> ? Since we are getting an object of class type from previous operator. Thanks Jaspal On Fri, Oct 7, 2016 at 10:12 AM, Thomas Weise <t...@apache.org> wrote: > Are you referring to the upstream operator in the DAG or the state of the > previous application after relaunch? Since the data is stored in MapR > streams, an operator that is a producer can also act as a consumer. Please > clarify your question. > > > On Fri, Oct 7, 2016 at 7:59 AM, Jaspal Singh <jaspal.singh1...@gmail.com> > wrote: > >> Hi Thomas, >> >> I have a question, so when we are using >> *KafkaSinglePortExactlyOnceOutputOperator* to write results into >> maprstream topic will it be able to read messgaes from the previous >> operator ? >> >> >> Thanks >> Jaspal >> >> On Thu, Oct 6, 2016 at 6:28 PM, Thomas Weise <t...@apache.org> wrote: >> >>> For recovery you need to set the window data manager like so: >>> >>> https://github.com/DataTorrent/examples/blob/master/tutorial >>> s/exactly-once/src/main/java/com/example/myapexapp/Application.java#L33 >>> >>> That will also apply to stateful restart of the entire application >>> (relaunch from previous instance's checkpointed state). >>> >>> For cold restart, you would need to consider the property you mention >>> and decide what is applicable to your use case. >>> >>> Thomas >>> >>> >>> On Thu, Oct 6, 2016 at 4:16 PM, Jaspal Singh <jaspal.singh1...@gmail.com >>> > wrote: >>> >>>> Ok now I get it. Thanks for the nice explaination !! >>>> >>>> One more thing, so you mentioned about checkpointing the offset ranges >>>> to replay in same order from kafka. >>>> >>>> Is there any property we need to configure to do that? like >>>> initialOffset set to APPLICATION_OR_LATEST. >>>> >>>> >>>> Thanks >>>> Jaspal >>>> >>>> >>>> On Thursday, October 6, 2016, Thomas Weise <thomas.we...@gmail.com> >>>> wrote: >>>> >>>>> What you want is the effect of exactly-once output (that's why we call >>>>> it also end-to-end exactly-once). There is no such thing as exactly-once >>>>> processing in a distributed system. In this case it would be rather >>>>> "produce exactly-once. Upstream operators, on failure, will recover to >>>>> checkpointed state and re-process the stream from there. This is >>>>> at-least-once, the default behavior. Because in the input operator you >>>>> have >>>>> configured to replay in the same order from Kafka (this is done by >>>>> checkpointing the offset ranges), the computation in the DAG is idempotent >>>>> and the output operator can discard the results that were already >>>>> published >>>>> instead of producing duplicates. >>>>> >>>>> On Thu, Oct 6, 2016 at 3:57 PM, Jaspal Singh < >>>>> jaspal.singh1...@gmail.com> wrote: >>>>> >>>>>> I think this is something called a customized operator implementation >>>>>> that is taking care of exactly once processing at output. >>>>>> >>>>>> What if any previous operators fail ? How we can make sure they also >>>>>> recover using EXACTLY_ONCE processing mode ? >>>>>> >>>>>> >>>>>> Thanks >>>>>> Jaspal >>>>>> >>>>>> >>>>>> On Thursday, October 6, 2016, Thomas Weise <thomas.we...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> In that case please have a look at: >>>>>>> >>>>>>> https://github.com/apache/apex-malhar/blob/master/kafka/src/ >>>>>>> main/java/org/apache/apex/malhar/kafka/KafkaSinglePortExactl >>>>>>> yOnceOutputOperator.java >>>>>>> >>>>>>> The operator will ensure that messages are not duplicated, under the >>>>>>> stated assumptions. >>>>>>> >>>>>>> >>>>>>> On Thu, Oct 6, 2016 at 3:37 PM, Jaspal Singh < >>>>>>> jaspal.singh1...@gmail.com> wrote: >>>>>>> >>>>>>>> Hi Thomas, >>>>>>>> >>>>>>>> In our case we are writing the results back to maprstreams topic based >>>>>>>> on some validations. >>>>>>>> >>>>>>>> >>>>>>>> Thanks >>>>>>>> Jaspal >>>>>>>> >>>>>>>> >>>>>>>> On Thursday, October 6, 2016, Thomas Weise <t...@apache.org> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> which operators in your application are writing to external >>>>>>>>> systems? >>>>>>>>> >>>>>>>>> When you look at the example from the blog ( >>>>>>>>> https://github.com/DataTorrent/examples/tree/master/tutoria >>>>>>>>> ls/exactly-once), there is Kafka input, which is configured to be >>>>>>>>> idempotent. The results are written to JDBC. That operator by itself >>>>>>>>> supports exactly-once through transactions (in conjunction with >>>>>>>>> idempotent >>>>>>>>> input), hence there is no need to configure the processing mode at >>>>>>>>> all. >>>>>>>>> >>>>>>>>> Thomas >>>>>>>>> >>>>>>>>> >>>>> >>> >> >