Any reason to use this?

RUN pip install avro-python3 pyarrow==0.15.1 apache-beam[gcp]==2.30.0 pandas-datareader==0.9.0

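The pins in that RUN line can be compared against the requirements file Beam publishes for each release and Python version. Below is a minimal sketch that builds the URL of that file, assuming the layout of the example link in this message (release-<version> branch, py<major><minor> directory) also holds for other releases; I have not checked every release. The helper name base_image_requirements_url is hypothetical, not part of Beam.

```python
import sys

# Assumption: URL layout inferred from the single example link in this thread.
_URL_TEMPLATE = (
    "https://github.com/apache/beam/blob/release-{beam}/"
    "sdks/python/container/py{major}{minor}/base_image_requirements.txt"
)


def base_image_requirements_url(beam_version, python_version=None):
    """Return the URL of Beam's pinned requirements file (hypothetical helper).

    beam_version: release string such as "2.56.0".
    python_version: (major, minor) tuple; defaults to the running interpreter,
    which in a flex template is the Python installed in the Docker image.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    major, minor = python_version
    return _URL_TEMPLATE.format(beam=beam_version, major=major, minor=minor)


if __name__ == "__main__":
    # Print the URL for Beam 2.56.0 on Python 3.11, the combination linked in this thread.
    print(base_image_requirements_url("2.56.0", (3, 11)))
```

Run inside the container, the default argument picks up the image's own Python version, so the URL matches whatever base image the flex template was built from.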
It is typically recommended to use the latest Beam and build the Docker image using the requirements released for each Beam version, for example:
https://github.com/apache/beam/blob/release-2.56.0/sdks/python/container/py311/base_image_requirements.txt

On Wed, Jun 12, 2024 at 1:31 AM Sofia’s World <mmistr...@gmail.com> wrote:

> Sure, apologies, it crossed my mind that it would have been useful to refer
> to it.
>
> So this is the Dockerfile:
>
> https://github.com/mmistroni/GCP_Experiments/edit/master/dataflow/shareloader/Dockerfile_tester
>
> I was using a setup.py as well, but then I commented out its usage in the
> Dockerfile after checking some flex templates which said it is not needed:
>
> https://github.com/mmistroni/GCP_Experiments/blob/master/dataflow/shareloader/setup_dftester.py
>
> Thanks in advance,
> Marco
>
> On Tue, Jun 11, 2024 at 10:54 PM XQ Hu <x...@google.com> wrote:
>
>> Can you share your Dockerfile?
>>
>> On Tue, Jun 11, 2024 at 4:43 PM Sofia’s World <mmistr...@gmail.com> wrote:
>>
>>> Thanks all, it seemed to work, but now I am getting a different problem:
>>> I am having issues building pyarrow...
>>>
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": <string>:36: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": WARNING setuptools_scm.pyproject_reading toml section missing 'pyproject.toml does not contain a tool.setuptools_scm section'
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": Traceback (most recent call last):
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": File "/tmp/pip-build-env-meihcxsp/overlay/lib/python3.11/site-packages/setuptools_scm/_integration/pyproject_reading.py", line 36, in read_pyproject
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": section = defn.get("tool", {})[tool_name]
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": KeyError: 'setuptools_scm'
>>> Step #0 - "build-shareloader-template": Step #4 - "dftester-image": running bdist_wheel
>>>
>>> Is it somehow getting messed up by a toml file?
>>>
>>> Could anyone advise?
>>>
>>> Thanks,
>>> Marco
>>>
>>> On Tue, Jun 11, 2024 at 1:00 AM XQ Hu via user <user@beam.apache.org> wrote:
>>>
>>>> https://github.com/GoogleCloudPlatform/python-docs-samples/tree/main/dataflow/flex-templates/pipeline_with_dependencies
>>>> is a great example.
>>>>
>>>> On Mon, Jun 10, 2024 at 4:28 PM Valentyn Tymofieiev via user <user@beam.apache.org> wrote:
>>>>
>>>>> In this case the Python version will be defined by the Python version
>>>>> installed in the Docker image of your flex template. So, you'd have to
>>>>> build your flex template from a base image with Python 3.11.
>>>>>
>>>>> On Mon, Jun 10, 2024 at 12:50 PM Sofia’s World <mmistr...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>> no, I am running my pipeline on GCP directly via a flex template,
>>>>>> configured using a Dockerfile.
>>>>>> Any chance to do something in the Dockerfile to force the version at
>>>>>> runtime?
>>>>>> Thanks
>>>>>>
>>>>>> On Mon, Jun 10, 2024 at 7:24 PM Anand Inguva via user <user@beam.apache.org> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Are you running your pipeline from the Python 3.11 environment? If
>>>>>>> you are running from a Python 3.11 environment and don't use a custom
>>>>>>> Docker container image, DataflowRunner (assuming Apache Beam on GCP means
>>>>>>> Apache Beam on DataflowRunner) will use Python 3.11.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Anand