Sorry, this was my mistake when I reviewed the PR last week.

I suggested renaming this new E2E test to *IT since it looks like an
integration test. This means that the test will run as part of the "Java
Examples Dataflow" PreCommit. However, the test also uses a local fake
of PubSub, which won't work when running with a distributed runner like
Dataflow.

We could just keep this as a *Test and make sure it's running with the
DirectRunner in a Jenkins job.
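
For example, a minimal (untested) sketch of pinning the runner inside the
test itself; the class name and pipeline wiring here are placeholders:

import org.apache.beam.runners.direct.DirectRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.junit.Test;

public class KafkaToPubsubE2ETest {
  @Test
  public void runsOnDirectRunner() {
    // Pin the runner so the test never ends up on Dataflow, even if the
    // surrounding Gradle/Jenkins invocation configures another runner.
    PipelineOptions options = PipelineOptionsFactory.create();
    options.setRunner(DirectRunner.class);
    Pipeline pipeline = Pipeline.create(options);
    // ... build the KafkaToPubsub pipeline against the local fakes ...
    pipeline.run().waitUntilFinish();
  }
}

That way any Jenkins job that picks up *Test classes would still exercise
the pipeline locally.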

Brian

On Mon, Jan 18, 2021 at 5:31 PM Boyuan Zhang <boyu...@google.com> wrote:

> It does seem like Dataflow does some validation of the Pub/Sub
> parameters before actually creating the pipeline. That's fair for
> Dataflow, because Dataflow swaps the Beam PubsubIO implementation for
> its native one.
>
> I think if you really want to run your virtual PubSub with Dataflow, you
> need to try out --experiments=enable_custom_pubsub_sink to force Dataflow
> not to do the override.
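>
> Programmatically, that would be something like this in the test's setup
> (an untested sketch; ExperimentalOptions.addExperiment is the SDK helper
> for appending experiments):
>
> import org.apache.beam.sdk.options.ExperimentalOptions;
> import org.apache.beam.sdk.options.PipelineOptions;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
>
> // Equivalent to passing --experiments=enable_custom_pubsub_sink on the
> // command line.
> PipelineOptions options = PipelineOptionsFactory.create();
> ExperimentalOptions.addExperiment(
>     options.as(ExperimentalOptions.class), "enable_custom_pubsub_sink");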
>
> Would you like to share your job ID so we can verify the failure? Also,
> I'm not sure about the motivation for testing this against Dataflow.
> Would you like to elaborate more on that?
>
>
> On Mon, Jan 18, 2021 at 3:51 AM Ramazan Yapparov <
> ramazan.yappa...@akvelon.com> wrote:
>
>> Hi Beam!
>> We've been writing an E2E test for the KafkaToPubsub example pipeline.
>> Instead of depending on real Cloud Pub/Sub and Kafka instances, we
>> decided to use Testcontainers: we launch Kafka and Pub/Sub emulator
>> containers, pass the container URLs into the pipeline options, and run
>> the pipeline.
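>>
>> The setup is roughly the following (a simplified sketch; the image tags
>> are examples and the setters on our options class are placeholders):
>>
>> import org.apache.beam.sdk.io.gcp.pubsub.PubsubOptions;
>> import org.testcontainers.containers.KafkaContainer;
>> import org.testcontainers.containers.PubSubEmulatorContainer;
>> import org.testcontainers.utility.DockerImageName;
>>
>> KafkaContainer kafka =
>>     new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:5.5.3"));
>> PubSubEmulatorContainer pubsub = new PubSubEmulatorContainer(
>>     DockerImageName.parse("gcr.io/google.com/cloudsdktool/cloud-sdk:emulators"));
>> kafka.start();
>> pubsub.start();
>>
>> // Wire the container endpoints into our pipeline options instance
>> // ("options" and its Kafka setter are placeholders here).
>> options.setBootstrapServers(kafka.getBootstrapServers());
>> options.as(PubsubOptions.class)
>>     .setPubsubRootUrl("http://" + pubsub.getEmulatorEndpoint());
>>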
>> During PR review we received a request to turn this test into an IT so
>> it would run on the Dataflow Runner instead of the Direct Runner.
>> Trying to do so, we ran into some trouble:
>> 1. While running the test, all Docker containers start on the machine
>>    where the test is running, so for this test to work properly the
>>    Dataflow job would have to be able to reach the test-runner machine
>>    via a public IP. I certainly can't do that on my local machine, and
>>    I'm not sure how it would behave in a CI environment.
>> 2. When we pass our fake Pub/Sub URL into the Dataflow job, we receive
>>    the following error:
>> {
>>   "code" : 400,
>>   "errors" : [ {
>>     "domain" : "global",
>>     "message" : "(f214233f9dbe6968): The workflow could not be created.
>> Causes: (f214233f9dbe6719): http://localhost:49169 is not a valid
>> Pub/Sub URL.",
>>     "reason" : "badRequest"
>>   } ],
>>   "message" : "(f214233f9dbe6968): The workflow could not be created.
>> Causes: (f214233f9dbe6719): http://localhost:49169 is not a valid
>> Pub/Sub URL.",
>>   "status" : "INVALID_ARGUMENT"
>> }
>>
>> Not sure how this can be avoided; it looks like the job will only
>> accept a real Cloud Pub/Sub URL.
>> It would be great if you could share any thoughts or suggestions on how
>> this can be solved!
>>
>>
