You can also apply Create and then call setIsBoundedInternal(IsBounded.UNBOUNDED) on the resulting PCollection, which will force the streaming runner to be used.
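
A minimal sketch of that approach, assuming the Beam Java SDK is on the classpath. The class name, helper method, and step name are made up for illustration. The setter is marked as internal in the SDK, so treat this as a test-only workaround rather than a supported pattern:

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollection.IsBounded;

// Hypothetical example class; not part of the Beam code base.
class ForceUnboundedExample {

  /** Applies Create and then marks its output PCollection as unbounded. */
  static PCollection<Integer> unboundedCreate(Pipeline pipeline) {
    return pipeline
        .apply("CreateInput", Create.of(1, 2, 3))
        // Internal API: overrides the boundedness that Create would report.
        .setIsBoundedInternal(IsBounded.UNBOUNDED);
  }

  public static void main(String[] args) {
    // Pipeline construction only; no runner is selected or invoked here.
    Pipeline pipeline = Pipeline.create();
    PCollection<Integer> input = unboundedCreate(pipeline);
    System.out.println(input.isBounded());
  }
}
```

The sketch stops at pipeline construction. Whether a given runner then picks streaming execution is up to that runner and is not shown here.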

On Thu, Apr 9, 2020 at 2:25 PM Steve Niemitz <sniem...@twitter.com> wrote:

> Ah yeah I forgot that you can force a pipeline into streaming mode with
> that flag.
>
> It sounds like the story here is there are tests for the streaming worker,
> but they run "on the side" in Google's environment? My concern is it seems
> like (publicly at least) there's no test coverage on the streaming
> worker. i.e., if I make changes, how do I know if I broke anything?
>
> On Thu, Apr 9, 2020 at 5:20 PM Luke Cwik <lc...@google.com> wrote:
>
>> You can use Create in streaming pipelines as well but you want to ensure
>> that --streaming is passed as a flag.
>> You could update the existing test target and force --streaming to be
>> inserted for example here:
>>
>> https://github.com/lukecwik/incubator-beam/blob/8097972b4d0ed759aa45f6710ac02b982c6e8deb/runners/google-cloud-dataflow-java/build.gradle#L156
>>
>> Create can't exercise a lot of complex windowing/triggering semantics,
>> though, and cannot produce deterministic output, which is why TestStream
>> was created. It is like Create but allows you to control watermark and
>> processing time advancement. This allows for more complicated triggering
>> and execution but it only works with portable Dataflow pipelines that also
>> use this additional experiment ('use_unified_worker') which can be launched
>> with this target:
>>
>> https://github.com/lukecwik/incubator-beam/blob/8097972b4d0ed759aa45f6710ac02b982c6e8deb/runners/google-cloud-dataflow-java/build.gradle#L246
>>
>> We are looking to have the first API stable version using the portability
>> framework with 2.21.0 which should mean that tests that run outside of
>> Google will be possible.
>>
>> On Thu, Apr 9, 2020 at 7:17 AM Steve Niemitz <sniem...@apache.org> wrote:
>>
>>> I was trying to run a @ValidatesRunner test for the streaming dataflow
>>> runner, but I actually can't find any way to run them in streaming. It
>>> looks like all the tests are set up using the Create transform,
>>> which generates a batch pipeline.
>>>
>>> Are there actually no @ValidatesRunner tests for the streaming dataflow
>>> runner? That seems like a big gap in coverage. Am I missing a setting
>>> somewhere?