Re: Problems with E2E test
Sorry, this was my mistake when I reviewed the PR last week. I suggested renaming this new E2E test to *IT since it looks like an integration test. That means the test will run as part of the "Java Examples Dataflow" PreCommit. However, the test also uses a local fake of Pub/Sub, which won't work when running with a distributed runner like Dataflow.

We could just keep this as a *Test and make sure it runs with the DirectRunner in a Jenkins job.

Brian

On Mon, Jan 18, 2021 at 5:31 PM Boyuan Zhang wrote:
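For illustration, here is a minimal sketch of the kind of pipeline arguments that would pin the test to the DirectRunner so the Testcontainers fakes stay reachable on localhost. Only --runner is a known Beam option here; the Kafka and Pub/Sub option names are hypothetical placeholders for whatever the example actually exposes:

```java
import java.util.Arrays;
import java.util.List;

// Sketch: build args that keep the test in-process on the DirectRunner.
// Endpoint option names below are assumptions, not the example's real options.
public class DirectRunnerArgs {
    static List<String> buildArgs(String kafkaBootstrap, String pubsubEmulatorUrl) {
        return Arrays.asList(
            "--runner=DirectRunner",                  // run locally, not on Dataflow
            "--bootstrapServers=" + kafkaBootstrap,   // Testcontainers Kafka endpoint
            "--pubsubRootUrl=" + pubsubEmulatorUrl);  // Pub/Sub emulator endpoint
    }

    public static void main(String[] args) {
        buildArgs("localhost:9092", "http://localhost:8085")
            .forEach(System.out::println);
    }
}
```

Because everything runs in one JVM on the test machine, the containers' mapped localhost ports are reachable without any public IP.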
Re: Problems with E2E test
It does seem like Dataflow does some validation of the Pub/Sub parameters before actually creating the pipeline. That's fair for Dataflow, because Dataflow swaps the Beam PubsubIO implementation for its native one.

I think if you really want to run your virtual Pub/Sub with Dataflow, you need to try out --experiments=enable_custom_pubsub_sink to keep Dataflow from doing the override.

Would you like to share your job id so we can verify the failure? Also, I'm not sure about the motivation for testing against Dataflow; would you like to elaborate on that?

On Mon, Jan 18, 2021 at 3:51 AM Ramazan Yapparov <ramazan.yappa...@akvelon.com> wrote:
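The experiment flag is passed like any other pipeline argument. A rough sketch of such a launch, assuming a Dataflow run that keeps Beam's PubsubIO sink (whether this combination actually works against a fake endpoint is exactly what's in question; the project id is a placeholder):

```java
import java.util.Arrays;
import java.util.List;

// Sketch: args asking Dataflow not to override Beam's PubsubIO with its
// native Pub/Sub implementation, while pointing PubsubIO at an emulator.
public class DataflowEmulatorArgs {
    static List<String> buildArgs(String project, String emulatorUrl) {
        return Arrays.asList(
            "--runner=DataflowRunner",
            "--project=" + project,
            "--experiments=enable_custom_pubsub_sink",  // keep Beam's PubsubIO sink
            "--pubsubRootUrl=" + emulatorUrl);          // emulator endpoint
    }

    public static void main(String[] args) {
        buildArgs("my-gcp-project", "http://localhost:49169")
            .forEach(System.out::println);
    }
}
```

Note that even with the override disabled, the Dataflow workers would still need network access to the emulator's address, which is the public-IP problem described in the original message.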
Problems with E2E test
Hi Beam!
We've been writing an E2E test for the KafkaToPubsub example pipeline. Instead of depending on real Cloud Pub/Sub and Kafka instances, we decided to use Testcontainers: we launch Kafka and Pub/Sub emulator containers, pass the container URLs into the pipeline options, and run the pipeline.

During PR review we received a request to turn this test into an IT so it would run in the Dataflow runner instead of the direct runner. Trying to do so, we've run into some trouble:

1. While running the test, all Docker containers start on the machine where the test is running, so for this test to work properly the Dataflow job would have to reach the test-runner machine by a public IP. I certainly can't do that on my local machine, and I'm not sure how it will behave when running in a CI environment.

2. When we pass our fake Pub/Sub URL into the Dataflow job, we receive the following error:

   {
     "code" : 400,
     "errors" : [ {
       "domain" : "global",
       "message" : "(f214233f9dbe6968): The workflow could not be created. Causes: (f214233f9dbe6719): http://localhost:49169 is not a valid Pub/Sub URL.",
       "reason" : "badRequest"
     } ],
     "message" : "(f214233f9dbe6968): The workflow could not be created. Causes: (f214233f9dbe6719): http://localhost:49169 is not a valid Pub/Sub URL.",
     "status" : "INVALID_ARGUMENT"
   }

Not sure how this can be avoided; it looks like the job will only accept the real Cloud Pub/Sub URL. It would be great if you could share some thoughts or any suggestions on how this can be solved!
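The 400 above suggests the service validates the Pub/Sub endpoint before the job is even created. Purely as an illustration of the observed behavior (this is not Dataflow's actual code, and the exact rule it applies is an assumption), the rejection acts roughly like:

```java
import java.net.URI;

// Hypothetical stand-in for the service-side check: only an HTTPS
// googleapis.com endpoint passes, so any localhost emulator URL is rejected.
public class PubsubUrlCheck {
    static boolean looksLikeCloudPubsubUrl(String url) {
        URI u = URI.create(url);
        return "https".equals(u.getScheme())
            && "pubsub.googleapis.com".equals(u.getHost());
    }

    public static void main(String[] args) {
        System.out.println(looksLikeCloudPubsubUrl("https://pubsub.googleapis.com")); // true
        System.out.println(looksLikeCloudPubsubUrl("http://localhost:49169"));        // false
    }
}
```

If the check really happens at job-creation time in the service, no client-side configuration of the URL alone can get past it; that matches Boyuan's point that the override has to be disabled on the Dataflow side.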