On Fri, May 5, 2017 at 1:53 PM, Robert Bradshaw <rober...@google.com.invalid > wrote:
> On Fri, May 5, 2017 at 1:33 PM, Kenneth Knowles <k...@google.com.invalid> > wrote: > > On Fri, May 5, 2017 at 12:43 PM, Robert Bradshaw < > > rober...@google.com.invalid> wrote: > > To > > maintain that, safely, we would add GloballyWindowedOutput or some such > > with the output() and outputTimestamp() method. This is routine work > quite > > analogous to what we already have. We would validate that the output > > windowing strategy is global windows if the parameter is present. > > So the current recommendation is to write > > c.output( > [value], > BoundedWindow.TIMESTAMP_MIN_VALUE, > GlobalWindow.INSTANCE); > > but we could recommend the user use a different context to not have to > specify BoundedWindow.TIMESTAMP_MIN_VALUE and GlobalWindow.INSTANCE so > that we could check this at compile time? > Yup. We are trying to take everything out of every context and get them into grokkable parameters. > >> We > >> > can also enforce that the correct subclass of window is output, though > >> this > >> > is not implemented as such right now. > >> > >> Are you thinking a type check based on the reflectively obtained > >> parameter of the WindowFn? > >> > > > > Yes. Though we don't do it today, it would be OutputWithWindow<W> or some > > such (which is actually the same as FinishBundleContext<W>) where W was > > some window class that would be checked against the window coder. We > cannot > > use the same inner class trick to avoid referencing the type, but that > > shouldn't be burdensome in this case. > > Is adding a type parameter backwards compatible? We'd just make a new type rather than change FinishBundleContext. I just wanted to mention that it is a tiny change. It is not important to get in for FSR because of this. Kenn I suppose with raw > types it is... Not sure how this would work with a proper BatchignDoFn > that may output to any window type, but maybe. > > Python's a bit different as it doesn't use contexts to emit values. > > >> But this behavior is all to support people who want to write transforms > >> > that work with essentially only one windowing strategy, basically > >> > pre-windowing stuff. Are there really any meaningful alternatives to > the > >> > global window? If so, I don't think that is realistically what this is > >> > about. Such transforms should be permitted, but not encouraged or > >> supported > >> > automatically, as they are antithetical to the unified model. So I > would > >> > also suggest that they should validate that their input is globally > >> > windowed and explicitly output to the global window. Trivial and they > >> > already should be doing that. They also have the capability of storing > >> the > >> > input's WindowFn and producing behavior identical to what you > describe. > >> > >> What it boils down to is that this is a common case that only works > >> easily in the Global window. Another example that I can think of like > >> this in the SDK, namely global combine. If you're in the global > >> window, you get a singleton PCollection, and we don't require users to > >> do extra for this behavior. If someone tries to use this transform in > >> a non-globally-windowed setting, only then do we give an error (this > >> time at pipeline construction) and force the user to do extra work. > >> It's a balance between having something that works correctly in the > >> unified model but requires extra (potentially tricky and error-prone) > >> effort only iff one needs it. > > The question above is the crux of the issue--are we striking the right > balance in requiring the user to specify a Window in all cases when we > can infer it in the by far most common case (like with global > combine's default value) and can detect and warn the user when they > don't fall into that case. >