[ https://issues.apache.org/jira/browse/BEAM-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bruce Arctor updated BEAM-7871: ------------------------------- Comment: was deleted (was: [~zdenulo], is this due to lack of there being an IO to firebase api (firestore in native mode)? I do see that there is a datastore connector, so was wondering if that works with firestore in datastore mode. Haven't tried yet, but this is a task I was hoping to accomplish (pubsub -> beam/dataflow -> firestore in native mode, meaning firebase api) – just starting to look into it which led me here. ) > Streaming from PubSub to Firestore doesn't work on Dataflow > ----------------------------------------------------------- > > Key: BEAM-7871 > URL: https://issues.apache.org/jira/browse/BEAM-7871 > Project: Beam > Issue Type: Bug > Components: io-py-gcp, runner-dataflow > Affects Versions: 2.13.0 > Reporter: Zdenko Hrcek > Priority: Major > > I came to the same error as here > [https://stackoverflow.com/questions/57059944/python-package-errors-while-running-gcp-dataflow] > but I don't see anywhere reported thus I am creating an issue just in case. > The pipeline is quite simple, reading from PubSub and writing to Firestore. > Beam version used is 2.13.0, Python 2.7 > With DirectRunner works ok, but on Dataflow it throws the following message: > > {code:java} > java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error > received from SDK harness for instruction -81: Traceback (most recent call > last): > File > "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 157, in _execute > response = task() > File > "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 190, in <lambda> > self._execute(lambda: worker.do_instruction(work), work) > File > "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 312, in do_instruction > request.instruction_id) > File > "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", > line 331, in process_bundle > bundle_processor.process_bundle(instruction_id)) > File > "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/bundle_processor.py", > line 538, in process_bundle > op.start() > File "apache_beam/runners/worker/operations.py", line 554, in > apache_beam.runners.worker.operations.DoOperation.start > def start(self): > File "apache_beam/runners/worker/operations.py", line 555, in > apache_beam.runners.worker.operations.DoOperation.start > with self.scoped_start_state: > File "apache_beam/runners/worker/operations.py", line 557, in > apache_beam.runners.worker.operations.DoOperation.start > self.dofn_runner.start() > File "apache_beam/runners/common.py", line 778, in > apache_beam.runners.common.DoFnRunner.start > self._invoke_bundle_method(self.do_fn_invoker.invoke_start_bundle) > File "apache_beam/runners/common.py", line 775, in > apache_beam.runners.common.DoFnRunner._invoke_bundle_method > self._reraise_augmented(exn) > File "apache_beam/runners/common.py", line 800, in > apache_beam.runners.common.DoFnRunner._reraise_augmented > raise_with_traceback(new_exn) > File "apache_beam/runners/common.py", line 773, in > apache_beam.runners.common.DoFnRunner._invoke_bundle_method > bundle_method() > File "apache_beam/runners/common.py", line 359, in > apache_beam.runners.common.DoFnInvoker.invoke_start_bundle > def invoke_start_bundle(self): > File "apache_beam/runners/common.py", line 363, in > apache_beam.runners.common.DoFnInvoker.invoke_start_bundle > self.signature.start_bundle_method.method_value()) > File > "/home/zdenulo/dev/gcp_stuff/df_firestore_stream/df_firestore_stream.py", > line 39, in start_bundle > NameError: global name 'firestore' is not defined [while running > 'generatedPtransform-64'] > > {code} > It's interesting that using Beam version 2.12.0 solves the problem on > Dataflow, it works as expected, not sure what could be the problem. > Here is a repository with the code which was used > [https://github.com/zdenulo/dataflow_firestore_stream] > -- This message was sent by Atlassian JIRA (v7.6.14#76016)