[ 
https://issues.apache.org/jira/browse/BEAM-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce Arctor updated BEAM-7871:
-------------------------------
    Comment: was deleted

(was: [~zdenulo], is this due to lack of there being an IO to firebase api 
(firestore in native mode)?  I do see that there is a datastore connector, so 
was wondering if that works with firestore in datastore mode.  Haven't tried 
yet, but this is a task I was hoping to accomplish (pubsub -> beam/dataflow -> 
firestore in native mode, meaning firebase api) – just starting to look into it 
which led me here.  )

> Streaming from PubSub to Firestore doesn't work on Dataflow
> -----------------------------------------------------------
>
>                 Key: BEAM-7871
>                 URL: https://issues.apache.org/jira/browse/BEAM-7871
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp, runner-dataflow
>    Affects Versions: 2.13.0
>            Reporter: Zdenko Hrcek
>            Priority: Major
>
> I came to the same error as here 
> [https://stackoverflow.com/questions/57059944/python-package-errors-while-running-gcp-dataflow]
>  but I don't see anywhere reported thus I am creating an issue just in case.
> The pipeline is quite simple, reading from PubSub and writing to Firestore.
> Beam version used is 2.13.0, Python 2.7
> With DirectRunner works ok, but on Dataflow it throws the following message:
>  
> {code:java}
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Error 
> received from SDK harness for instruction -81: Traceback (most recent call 
> last):
>  File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 157, in _execute
>  response = task()
>  File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 190, in <lambda>
>  self._execute(lambda: worker.do_instruction(work), work)
>  File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 312, in do_instruction
>  request.instruction_id)
>  File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py",
>  line 331, in process_bundle
>  bundle_processor.process_bundle(instruction_id))
>  File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/bundle_processor.py",
>  line 538, in process_bundle
>  op.start()
>  File "apache_beam/runners/worker/operations.py", line 554, in 
> apache_beam.runners.worker.operations.DoOperation.start
>  def start(self):
>  File "apache_beam/runners/worker/operations.py", line 555, in 
> apache_beam.runners.worker.operations.DoOperation.start
>  with self.scoped_start_state:
>  File "apache_beam/runners/worker/operations.py", line 557, in 
> apache_beam.runners.worker.operations.DoOperation.start
>  self.dofn_runner.start()
>  File "apache_beam/runners/common.py", line 778, in 
> apache_beam.runners.common.DoFnRunner.start
>  self._invoke_bundle_method(self.do_fn_invoker.invoke_start_bundle)
>  File "apache_beam/runners/common.py", line 775, in 
> apache_beam.runners.common.DoFnRunner._invoke_bundle_method
>  self._reraise_augmented(exn)
>  File "apache_beam/runners/common.py", line 800, in 
> apache_beam.runners.common.DoFnRunner._reraise_augmented
>  raise_with_traceback(new_exn)
>  File "apache_beam/runners/common.py", line 773, in 
> apache_beam.runners.common.DoFnRunner._invoke_bundle_method
>  bundle_method()
>  File "apache_beam/runners/common.py", line 359, in 
> apache_beam.runners.common.DoFnInvoker.invoke_start_bundle
>  def invoke_start_bundle(self):
>  File "apache_beam/runners/common.py", line 363, in 
> apache_beam.runners.common.DoFnInvoker.invoke_start_bundle
>  self.signature.start_bundle_method.method_value())
>  File 
> "/home/zdenulo/dev/gcp_stuff/df_firestore_stream/df_firestore_stream.py", 
> line 39, in start_bundle
> NameError: global name 'firestore' is not defined [while running 
> 'generatedPtransform-64']
>  
> {code}
> It's interesting that using Beam version 2.12.0 solves the problem on 
> Dataflow, it works as expected, not sure what could be the problem.
> Here is a repository with the code which was used 
> [https://github.com/zdenulo/dataflow_firestore_stream]
>  



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to