Hi,

I'm trying to understand why PyFlink depends on Apache Beam; it's not only
a regular (install-time) dependency of PyFlink, but also a build system
dependency.
Searching through the code, it seems like Beam is only used by PyFlink, and
not by non-Python Flink. In my (limited) understanding, it looks like the
Beam dependency is there mostly to enable a Beam runner.

However, if that's the case, then there are a lot of users who may not want
to use PyFlink via Beam, in which case the Apache Beam dependency is
unnecessarily restrictive. For instance, the latest version of
`apache-beam` caps `numpy<1.25.0` and `pyarrow<12.0.0`, whereas NumPy
1.26.x and PyArrow 13.0.0 have already been out for some time.
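
To make the conflict concrete, here is a minimal sketch of the check a resolver effectively performs. The caps are the ones quoted above, hard-coded rather than read from the `apache-beam` metadata. The helper names are hypothetical, and 1.26.0 stands in for NumPy 1.26.x. It only handles plain `X.Y.Z` versions, so it is an illustration and not a replacement for a real version parser.

```python
# Hypothetical helpers for illustration; handles plain "X.Y.Z" versions only.
def parse_version(text):
    """Turn '1.26.0' into (1, 26, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies_upper_cap(installed, cap):
    """Return True if `installed` is strictly below the exclusive cap."""
    return parse_version(installed) < parse_version(cap)


# Upper bounds as quoted in this email (assumed, not read from package metadata).
CAPS = {"numpy": "1.25.0", "pyarrow": "12.0.0"}
# Versions a user might want; 1.26.0 is a representative of NumPy 1.26.x.
WANTED = {"numpy": "1.26.0", "pyarrow": "13.0.0"}

conflicts = {
    name: (WANTED[name], cap)
    for name, cap in CAPS.items()
    if not satisfies_upper_cap(WANTED[name], cap)
}

for name, (wanted, cap) in conflicts.items():
    print(f"{name} {wanted} violates the cap <{cap}")
```

Both packages end up in `conflicts`, so neither newer release can be installed alongside a Beam version with these caps.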

Am I correct in my understanding that Beam is only a dependency in order
to create a Beam runner/integration? If so, can Beam be an extra/optional
dependency of PyFlink, instead of being required for everybody?

Thank you!

Best regards,
Deepyaman