I think we agree that emitting from FinalizeBundle without specifying
the Window explicitly should be an error for anything but the global
window (similar to global combine with defaults). The question is
whether it should be allowable in the global window case for
convenience (again, similar to global combine), as long as we can
detect its abuse.

I'd be interested in hearing if others have thoughts on this as well.

- Robert


On Fri, May 5, 2017 at 2:05 PM, Kenneth Knowles <k...@google.com.invalid> wrote:
> On Fri, May 5, 2017 at 1:53 PM, Robert Bradshaw <rober...@google.com.invalid
>> wrote:
>
>> On Fri, May 5, 2017 at 1:33 PM, Kenneth Knowles <k...@google.com.invalid>
>> wrote:
>> > On Fri, May 5, 2017 at 12:43 PM, Robert Bradshaw <
>> > rober...@google.com.invalid> wrote:
>> > To
>> > maintain that, safely, we would add GloballyWindowedOutput or some such
>> > with the output() and outputTimestamp() method. This is routine work
>> quite
>> > analogous to what we already have. We would validate that the output
>> > windowing strategy is global windows if the parameter is present.
>>
>> So the current recommendation is to write
>>
>>         c.output(
>>             [value],
>>             BoundedWindow.TIMESTAMP_MIN_VALUE,
>>             GlobalWindow.INSTANCE);
>>
>> but we could recommend the user use a different context to not have to
>> specify BoundedWindow.TIMESTAMP_MIN_VALUE and GlobalWindow.INSTANCE so
>> that we could check this at compile time?
>>
>
> Yup. We are trying to take everything out of every context and get them
> into grokkable parameters.
>
>
>
>> >> We
>> >> > can also enforce that the correct subclass of window is output, though
>> >> this
>> >> > is not implemented as such right now.
>> >>
>> >> Are you thinking a type check based on the reflectively obtained
>> >> parameter of the WindowFn?
>> >>
>> >
>> > Yes. Though we don't do it today, it would be OutputWithWindow<W> or some
>> > such (which is actually the same as FinishBundleContext<W>) where W was
>> > some window class that would be checked against the window coder. We
>> cannot
>> > use the same inner class trick to avoid referencing the type, but that
>> > shouldn't be burdensome in this case.
>>
>> Is adding a type parameter backwards compatible?
>
>
> We'd just make a new type rather than change FinishBundleContext. I just
> wanted to mention that it is a tiny change. It is not important to get in
> for FSR because of this.
>
> Kenn
>
>
> I suppose with raw
>> types it is... Not sure how this would work with a proper BatchignDoFn
>> that may output to any window type, but maybe.
>>
>> Python's a bit different as it doesn't use contexts to emit values.
>>
>> >> But this behavior is all to support people who want to write transforms
>> >> > that work with essentially only one windowing strategy, basically
>> >> > pre-windowing stuff. Are there really any meaningful alternatives to
>> the
>> >> > global window? If so, I don't think that is realistically what this is
>> >> > about. Such transforms should be permitted, but not encouraged or
>> >> supported
>> >> > automatically, as they are antithetical to the unified model. So I
>> would
>> >> > also suggest that they should validate that their input is globally
>> >> > windowed and explicitly output to the global window. Trivial and they
>> >> > already should be doing that. They also have the capability of storing
>> >> the
>> >> > input's WindowFn and producing behavior identical to what you
>> describe.
>> >>
>> >> What it boils down to is that this is a common case that only works
>> >> easily in the Global window. Another example that I can think of like
>> >> this in the SDK, namely global combine. If you're in the global
>> >> window, you get a singleton PCollection, and we don't require users to
>> >> do extra for this behavior. If someone tries to use this transform in
>> >> a non-globally-windowed setting, only then do we give an error (this
>> >> time at pipeline construction) and force the user to do extra work.
>> >> It's a balance between having something that works correctly in the
>> >> unified model but requires extra (potentially tricky and error-prone)
>> >> effort only iff one needs it.
>>
>> The question above is the crux of the issue--are we striking the right
>> balance in requiring the user to specify a Window in all cases when we
>> can infer it in the by far most common case (like with global
>> combine's default value) and can detect and warn the user when they
>> don't fall into that case.
>>

Reply via email to