What are good examples of streaming sinks that support checkpointing (or transactions/rollbacks)? I don't Kafka supports a rollback.
On Mon, May 2, 2016 at 2:54 AM, Maximilian Michels <m...@apache.org> wrote: > Yes, I would expect sinks to provide similar additional interfaces > like sources, e.g. checkpointing. We could also use the > startBundle/processElement/finishBundle lifecycle methods to implement > checkpointing. I just wonder, if we want to make it more explicit. > Also, does it make sense that sinks can return a PCollection? You can > return PDone but you don't have to. > > Since sinks are fundamental in streaming pipelines, it just seemed odd > to me that there is not dedicated interface. I understand a bit > clearer now that it is not viewed as crucial because we can use > existing primitives to create sinks. In a way, that might be elegant > but also less explicit. > > On Fri, Apr 29, 2016 at 11:00 PM, Frances Perry <f...@google.com.invalid> > wrote: > >> > >> @Frances Sources are not simple DoFns. They add additional > >> functionality, e.g. checkpointing, watermark generation, creating > >> splits. If we want sinks to be portable, we should think about a > >> dedicated interface. At least for the checkpointing. > >> > > > > We might be mixing sources and sinks in this conversation. ;-) Sources > > definitely provide additional functionality as you mentioned. But at > least > > currently, sinks don't provide any new primitive functionality. Are you > > suggestion there needs to be a checkpointing interface for sinks beyond > > DoFn's bundle finalization? (Note that the existing Write for batch is > just > > a PTransform based around ParDo.) >