You can also call Create and call setIsBoundedInternal(IsBounded.UNBOUNDED)
on the resulting PCollection, which will force the streaming runner to be
used.

On Thu, Apr 9, 2020 at 2:25 PM Steve Niemitz <sniem...@twitter.com> wrote:

> Ah yeah I forgot that you can force a pipeline into streaming mode with
> that flag.
>
> It sounds like the story here is there are tests for the streaming worker,
> but they run "on the side" in Google's environment?  My concern is it seems
> like (publically at least) there's no test coverage on the streaming
> worker.  ie, if I make changes, how do I know if I broke anything?
>
> On Thu, Apr 9, 2020 at 5:20 PM Luke Cwik <lc...@google.com> wrote:
>
>> You can use Create in streaming pipelines as well but you want to ensure
>> that --streaming is passed as a flag.
>> You could update the existing test target and force --streaming to be
>> inserted for example here:
>>
>> https://github.com/lukecwik/incubator-beam/blob/8097972b4d0ed759aa45f6710ac02b982c6e8deb/runners/google-cloud-dataflow-java/build.gradle#L156
>>
>> Create can't exercise a lot of complex windowing/triggering semantics
>> though or can not produce deterministic output which is why TestStream was
>> created which is like Create but allows you to control watermark and
>> processing time advancement. This allows for more complicated triggering
>> and execution but it only works with portable Dataflow pipelines that also
>> use this additional experiment ('use_unified_worker') which can be launched
>> with this target:
>>
>> https://github.com/lukecwik/incubator-beam/blob/8097972b4d0ed759aa45f6710ac02b982c6e8deb/runners/google-cloud-dataflow-java/build.gradle#L246
>> We are looking to have the first API stable version using the portability
>> framework with 2.21.0 which should mean that tests that run outside of
>> Google will be possible.
>>
>>
>> On Thu, Apr 9, 2020 at 7:17 AM Steve Niemitz <sniem...@apache.org> wrote:
>>
>>> I was trying to run a @ValidatesRunner test for the streaming dataflow
>>> runner, but I actually can't find any way to run them in streaming.  It
>>> looks like all the tests are set up using the Create transform,
>>> which generates a batch pipeline.
>>>
>>> Are there actually no @ValidatesRunner tests for the streaming dataflow
>>> runner?  That seems like a big gap in coverage.  Am I missing a setting
>>> somewhere?
>>>
>>

Reply via email to