Yes, Dataflow depends on the GCP IOs, and this is expected, since Dataflow
uses parts of those implementations for its own execution. We definitely
don't want the GCP IOs to depend on Dataflow, because users of other runners
should still be able to use the GCP IOs without bringing in
Dataflow-specific dependencies.
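
To illustrate why the reverse edge causes the circular dependency error, here
is a toy sketch in Python. It is not how Gradle resolves projects; the module
names (dataflow-runner, gcp-io, sdk-core) are shortened, hypothetical labels,
and the graph is a simplified assumption of the real layout. It only shows
that adding a gcp-io -> dataflow-runner edge closes a loop.

```python
# Toy model: an edge A -> B means "module A depends on module B".
# Module names are hypothetical labels, not real Gradle project paths.
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of modules, or None if acyclic."""
    visiting, done = set(), set()
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        if node in done:
            return None
        if node in visiting:
            # Reached a module that is still on the current path: a cycle.
            return path[path.index(node):] + [node]
        visiting.add(node)
        path.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Expected direction: the runner depends on the IOs, never the reverse.
expected = {"dataflow-runner": ["gcp-io"], "gcp-io": ["sdk-core"], "sdk-core": []}

# Putting a test that needs the runner inside the IO module adds the reverse edge.
with_test_dep = {**expected, "gcp-io": ["sdk-core", "dataflow-runner"]}

print(find_cycle(expected))
print(find_cycle(with_test_dep))
```

Keeping the test task on the runner side, so that it consumes the IO tests
rather than the IO module consuming the runner, avoids the second case.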

The Dataflow runner package already defines a task,
googleCloudPlatformLegacyWorkerIntegrationTest [1], that is meant to run the
integration tests defined in the GCP IO package. Does this fit your needs?

1:
https://github.com/apache/beam/blob/0fce2b88660f52dae638697e1472aa108c982ae6/runners/google-cloud-dataflow-java/build.gradle#L318

On Fri, Jul 12, 2019 at 5:17 AM Michał Walenia <[email protected]>
wrote:

> Hi all,
> recently when I was trying to implement a performance test of BigQueryIO,
> I ran into an issue when trying to run the test on Dataflow.
> The problem was that I encountered a circular dependency when compiling
> the tests. I added the test in org.apache.beam.sdk.io.gcp.bigquery package,
> so I also needed to add DataflowRunner as a dependency in order to launch
> the test. The error was that the DataflowRunner package depends on the
> org.apache.beam.sdk.io.gcp.bigquery package (for example in [1]). Should
> it be like that?
> For now, in order to solve the problem, I intend to move the performance
> test to its own package in my PR [2]. I am wondering about the right
> approach to this - shouldn’t we decouple the DataflowRunner code from IOs?
> If not, what’s the reason behind the way the modules are organized?
> I noticed that 5 tests are excluded from the integrationTest task in
> io.google-cloud-platform.bigquery build.gradle file [3]. Are they
> launched on Dataflow anywhere? I couldn’t find their usage except for the
> exclusions.
>
> [1] PubSubIO translations section in DataflowRunner.java
> <https://github.com/apache/beam/blob/0fce2b88660f52dae638697e1472aa108c982ae6/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java#L1104>
> [2] My PR <https://github.com/apache/beam/pull/9041>
> [3] DefaultCoderCloudObjectTranslatorRegistrar
> <https://github.com/apache/beam/blob/0fce2b88660f52dae638697e1472aa108c982ae6/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DefaultCoderCloudObjectTranslatorRegistrar.java#L45>
>
> Best regards
> Michal
>
> --
>
> Michał Walenia
> Polidea <https://www.polidea.com/> | Software Engineer
>
> M: +48 791 432 002 <+48791432002>
> E: [email protected]
>
> Unique Tech
> Check out our projects! <https://www.polidea.com/our-work>
>
