Hi all,
Recently, while implementing a performance test for BigQueryIO, I ran into
an issue when trying to run the test on Dataflow.
The problem was a circular dependency when compiling the tests. I added the
test to the org.apache.beam.sdk.io.gcp.bigquery package, so I also needed to
add DataflowRunner as a dependency in order to launch the test. The build
failed because the DataflowRunner module itself depends on the
org.apache.beam.sdk.io.gcp.bigquery package (for example in [1]). Should it
be like that?
For now, to work around the problem, I intend to move the performance test
to its own package in my PR [2]. I am wondering about the right approach
here: shouldn't we decouple the DataflowRunner code from the IOs? If not,
what's the reason behind the way the modules are organized?
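
To make the cycle concrete, here is a minimal sketch in Python. It is only a
toy model of the module graph, not the actual Gradle setup: the module names
are shortened labels I made up for illustration, and "bigquery-perf-test" is
a hypothetical name for the separate module I have in mind.

```python
# Simplified module graph; each edge reads "depends on".
# Names are illustrative labels, not exact Gradle project paths.
deps = {
    "io-gcp": ["dataflow-runner"],   # edge added by the new performance test
    "dataflow-runner": ["io-gcp"],   # existing edge (e.g. PubSubIO translations)
}


def find_cycle(graph):
    """Return one dependency cycle as a list of modules, or None."""
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Back edge: slice out the loop and close it with the repeated node.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# Moving the test into its own (hypothetical) module removes the back edge.
fixed = {
    "io-gcp": [],
    "dataflow-runner": ["io-gcp"],
    "bigquery-perf-test": ["io-gcp", "dataflow-runner"],
}

print(find_cycle(deps))
print(find_cycle(fixed))
```

With the test living in the IO module the graph has a loop; with the test in
a separate module that depends on both, the graph is acyclic again.
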
I also noticed that 5 tests are excluded from the integrationTest task in
the io.google-cloud-platform.bigquery build.gradle file [3]. Are they run on
Dataflow anywhere? I couldn't find any use of them apart from the exclusions.

[1] PubSubIO translations section in DataflowRunner.java
<https://github.com/apache/beam/blob/0fce2b88660f52dae638697e1472aa108c982ae6/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowRunner.java#L1104>
[2] My PR <https://github.com/apache/beam/pull/9041>
[3] DefaultCoderCloudObjectTranslatorRegistrar
<https://github.com/apache/beam/blob/0fce2b88660f52dae638697e1472aa108c982ae6/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DefaultCoderCloudObjectTranslatorRegistrar.java#L45>

Best regards
Michal

-- 

Michał Walenia
Polidea <https://www.polidea.com/> | Software Engineer

M: +48 791 432 002 <+48791432002>
E: michal.wale...@polidea.com

