Hi Valentyn,

Thank you for the information and details. It all makes sense! I think we can wait for the 2.53.0 release and apply the hotfix in the meantime.

Best

Wiśniowski Piotr

On 10.11.2023 20:27, Valentyn Tymofieiev via user wrote:
From https://pypi.org/project/pyarrow-hotfix/ :

pyarrow_hotfix must be imported in your application or library code for it to take effect.
Just installing the package is not sufficient.

For Beam users, this means the pipeline code must import this module on every worker, for example by adding the import to DoFn.setup, or in the main session (if the pipeline consists of a single file AND uses the dill pickler with the --save_main_session flag).
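
As a rough illustration of the DoFn.setup approach, a minimal sketch might look like the following. The names `apply_pyarrow_hotfix` and `HotfixedDoFn` are hypothetical and exist only for this example, and the fallback to a plain `object` base class is just there so the snippet can be read and run without Beam installed; a real pipeline would subclass `beam.DoFn` directly.

```python
import importlib
import logging

try:
    import apache_beam as beam
    _DoFnBase = beam.DoFn
except ImportError:
    # Assumption for this sketch only: fall back to a plain class so the
    # example still runs where Beam is not installed.
    _DoFnBase = object


def apply_pyarrow_hotfix() -> bool:
    """Import pyarrow_hotfix if present; the import itself activates it."""
    try:
        importlib.import_module("pyarrow_hotfix")
        return True
    except ImportError:
        logging.warning("pyarrow_hotfix is not installed on this worker")
        return False


class HotfixedDoFn(_DoFnBase):
    """Hypothetical DoFn that applies the hotfix before processing elements."""

    def setup(self):
        # setup() is called on the worker before elements are processed,
        # so the import happens in the worker process, not only at launch.
        self.hotfix_applied = apply_pyarrow_hotfix()

    def process(self, element):
        # Placeholder logic: pass each element through unchanged.
        yield element
```

In this sketch a missing package only produces a warning; a stricter variant could raise from setup() instead, so that a worker without the hotfix fails fast rather than processing data unprotected.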

We will continue addressing this in https://github.com/apache/beam/issues/29392.

On Fri, Nov 10, 2023 at 10:23 AM Valentyn Tymofieiev <valen...@google.com> wrote:

    Hi Piotr, thanks for bringing this to the list.

    There is an FR for pyarrow support:
    https://github.com/apache/beam/issues/28410 . I looked into it
    briefly in https://github.com/apache/beam/pull/28437 but saw some
    test failures, and it has been on the back burner. Given the news
    about the vulnerability, it would make sense to prioritize this.

    I think we could decouple this from 2.52.0 release since:
      1) there is a workaround
      2) new versions of pyarrow haven't been fully tested with Beam
      3) Beam 2.52.0 fixes some other issues that are known to
    affect users, e.g. https://github.com/apache/beam/issues/28246

    From
    https://securityonline.info/cve-2023-47248-pyarrow-arbitrary-code-execution-vulnerability-a-critical-threat-to-data-analysts/
    :
      > If you cannot upgrade to PyArrow 14.0.1, you can use the
      > pyarrow-hotfix package to disable the vulnerability on older
      > versions of PyArrow. However, this is not a permanent solution,
      > and you should upgrade to PyArrow 14.0.1 as soon as possible.

    We could consider adding pyarrow-hotfix to the containers for the
    2.52.0 release. CC: @Danny McCormick
    <mailto:dannymccorm...@google.com> (release manager).

    Beam users can also install this additional dependency via one of
    the ways described in
    https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/
    .



    On Fri, Nov 10, 2023 at 4:42 AM Wiśniowski Piotr
    <contact.wisniowskipi...@gmail.com> wrote:

        Hi,

        A few days ago this vulnerability was disclosed:
        https://securityonline.info/cve-2023-47248-pyarrow-arbitrary-code-execution-vulnerability-a-critical-threat-to-data-analysts/

        I see that Beam 2.51.0 has `pyarrow<=12.0.0` in its
        requirements.

        1. Is there a reason for not allowing newer versions of pyarrow?

        2. Is there any planned effort to update this to `14.0.1`? Is it
        possible to include the update in the `2.52.0` Beam release? I
        know the release is almost there.

        Best

        Wiśniowski Piotr
