Are you able to run streaming word count on the same setup? On Tue, Oct 27, 2020 at 5:39 PM Sam Bourne <samb...@gmail.com> wrote:
> We updated from beam 2.18.0 to 2.24.0 and have been having issues using > the python ReadFromPubSub external transform in flink 1.10. It seems like > it starts up just fine, but it doesn’t consume any messages. > > I tried to reduce it to a simple example and tested back to beam 2.22.0 > but have gotten the same results (of no messages being read). > > I have a hard time believing that it’s been broken for so many versions > but I can’t seem to identify what I’ve done wrong. > > Steps to test: > > 1) Spin up the expansion service > > docker run -d --rm --network host apache/beam_flink1.10_job_server:2.24.0 > > 2) Create a simple pipeline using > apache_beam.io.external.gcp.pubsub.ReadFromPubSub > Here’s mine: > https://gist.github.com/sambvfx/a8582f4805e468a97331b0eb13911ebf > > 3) Run the pipeline > > python -m pubsub_example --runner=FlinkRunner --save_main_session > --flink_submit_uber_jar --environment_type=DOCKER > --environment_config=apache/beam_python3.7_sdk:2.24.0 > --checkpointing_interval=10000 --streaming > > 4) Emit a few pubsub messages > > python -m pubsub_example --msg hello > python -m pubsub_example --msg world > > What am I missing here? > -Sam > ------------------------------ > > Some debugging things I’ve tried: > > - I can run ReadFromPubSub (the DirectRunner version just fine). > - I confirmed that the gcloud credentials make it into the java sdk > container that spins up. Without these you get credential type errors. > - I modified the java DockerEnvironmentFactory to instead mount > GOOGLE_APPLICATION_CREDENTIALS service account .json and set the env > var. > - I’ve tried a variety of different flink flags. > >