Sorry, this was my mistake when I reviewed the PR last week. I suggested renaming this new E2E test to *IT since it looks like an integration test. That means the test will run as part of the "Java Examples Dataflow" PreCommit. However, the test also uses a local fake of Pub/Sub, which won't work when running with a distributed runner like Dataflow.

We could just keep this as a *Test and make sure it runs with the DirectRunner in a Jenkins job.
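Concretely, something like this (an untested sketch; the class and test names here are placeholders, not necessarily what's in the PR):

```java
import org.apache.beam.runners.direct.DirectRunner;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.junit.Test;

// The *Test suffix keeps this out of the *IT-based "Java Examples Dataflow"
// PreCommit, so it only ever runs as a regular test.
public class KafkaToPubsubE2ETest {

  @Test
  public void runsAgainstLocalFakes() {
    PipelineOptions options = PipelineOptionsFactory.create();
    // Pin the runner explicitly: the localhost Testcontainers endpoints are
    // only reachable from the machine executing the test.
    options.setRunner(DirectRunner.class);
    // ... start the Kafka / Pub/Sub emulator containers, set their endpoints
    // on the example's options, then build and run the pipeline ...
  }
}
```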
Brian

On Mon, Jan 18, 2021 at 5:31 PM Boyuan Zhang <boyu...@google.com> wrote:

> It does seem like Dataflow does some validation of the Pub/Sub params
> before actually creating the pipeline. That's fair for Dataflow, because
> Dataflow swaps Beam's PubsubIO implementation for the Dataflow-native one.
>
> I think if you really want to run your virtual Pub/Sub with Dataflow, you
> need to try out --experiments=enable_custom_pubsub_sink to force Dataflow
> not to do the override.
>
> Would you like to share your job id so we can verify the failure? Also,
> I'm not sure about the motivation for testing this against Dataflow.
> Would you like to elaborate more on that?
>
> On Mon, Jan 18, 2021 at 3:51 AM Ramazan Yapparov <ramazan.yappa...@akvelon.com> wrote:
>
>> Hi Beam!
>> We've been writing an E2E test for the KafkaToPubsub example pipeline.
>> Instead of depending on real Cloud Pub/Sub and Kafka instances, we
>> decided to use Testcontainers: we launch Kafka and Pub/Sub emulator
>> containers, pass the container URLs into the pipeline options, and run
>> the pipeline.
>> During PR review we received a request to turn this test into an IT so
>> that it would run on the Dataflow runner instead of the Direct runner.
>> Trying to do so, we ran into some trouble:
>> 1. While running the test, all Docker containers start on the machine
>> where the test is running, so for this test to work properly the
>> Dataflow job would have to reach the test-runner machine by a public IP.
>> I certainly can't do that on my local machine, and I'm not sure how it
>> would behave in the CI environment.
>> 2. When we pass our fake Pub/Sub URL into the Dataflow job, we receive
>> the following error:
>>
>> ```json
>> {
>>   "code" : 400,
>>   "errors" : [ {
>>     "domain" : "global",
>>     "message" : "(f214233f9dbe6968): The workflow could not be created. Causes: (f214233f9dbe6719): http://localhost:49169 is not a valid Pub/Sub URL.",
>>     "reason" : "badRequest"
>>   } ],
>>   "message" : "(f214233f9dbe6968): The workflow could not be created. Causes: (f214233f9dbe6719): http://localhost:49169 is not a valid Pub/Sub URL.",
>>   "status" : "INVALID_ARGUMENT"
>> }
>> ```
>>
>> Not sure how this can be avoided; it looks like the job will only accept
>> a real Cloud Pub/Sub URL.
>> It would be great if you could share some thoughts or suggestions on how
>> this can be solved!
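For reference, the Testcontainers setup described above boils down to roughly the following. This is a sketch rather than the PR's actual code: the KafkaContainer and PubSubEmulatorContainer classes come from the Testcontainers kafka and gcloud modules, and the image tags are assumptions.

```java
import org.testcontainers.containers.KafkaContainer;
import org.testcontainers.containers.PubSubEmulatorContainer;
import org.testcontainers.utility.DockerImageName;

public class LocalFakes {
  public static void main(String[] args) {
    // A real Kafka broker and the Google Pub/Sub emulator, each in a local
    // Docker container managed by Testcontainers.
    try (KafkaContainer kafka =
            new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:6.2.1"));
        PubSubEmulatorContainer pubsub =
            new PubSubEmulatorContainer(
                DockerImageName.parse(
                    "gcr.io/google.com/cloudsdktool/cloud-sdk:316.0.0-emulators"))) {
      kafka.start();
      pubsub.start();
      // Both endpoints resolve to localhost:<mapped-port> on the machine
      // running the test, reachable from the DirectRunner but not from
      // remote Dataflow workers.
      System.out.println("Kafka bootstrap servers: " + kafka.getBootstrapServers());
      System.out.println("Pub/Sub emulator endpoint: " + pubsub.getEmulatorEndpoint());
    }
  }
}
```

That localhost scoping is problem 1 above, and it is also the URL that Dataflow's front-end validation rejects in problem 2.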
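And if running against Dataflow really is needed, Boyuan's --experiments=enable_custom_pubsub_sink suggestion would be wired up roughly like this (again a sketch; whether the experiment also gets past the front-end URL validation shown in the error above is exactly what would need verifying with a real job):

```java
import org.apache.beam.sdk.options.ExperimentalOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class CustomPubsubSinkFlag {
  public static void main(String[] args) {
    // Equivalent to passing --experiments=enable_custom_pubsub_sink on the
    // command line: it asks Dataflow not to swap Beam's PubsubIO write for
    // its native Pub/Sub sink, so the Beam implementation (which can honor
    // a custom endpoint) runs on the workers instead.
    ExperimentalOptions options =
        PipelineOptionsFactory.fromArgs("--experiments=enable_custom_pubsub_sink")
            .as(ExperimentalOptions.class);
    System.out.println(options.getExperiments());
    // ... set the runner, project, etc. and build the KafkaToPubsub
    // pipeline with these options ...
  }
}
```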