Hi Holden,

The note on Flink & Spark support sounds reasonable to me. I am optimistic
about getting Flink + TFX + Kubeflow working fairly soon, but I agree that
we don't want to over-promise.

I'm not so sure about the status of Dataflow here; perhaps someone else can
comment on that.

Looking forward to the book :)

Kyle

On Fri, Apr 17, 2020 at 1:14 PM Holden Karau <[email protected]> wrote:

> Hi Apache Beam Developers,
>
> I'm working on a book about Kubeflow, which naturally has a section on
> TFX. I want to set users' expectations correctly, so I wanted to know what
> y'all thought of this NOTE we were thinking of including in the early
> release:
>
> Apache Beam’s Python support outside of Google Cloud's Dataflow is
> relatively new. TFX is a Python tool, so scaling it depends on Apache
> Beam's Python support. You can scale your job by using the non-portable
> Dataflow component, but this requires changing your pipeline code and isn't
> supported by Kubeflow's current TFX components. As Apache Beam's support
> for Apache Flink & Spark improves, support may be added for scaling the TFX
> components in a portable manner.
>
> Does this sound reasonable to folks? I don't want to over-promise but I
> also don't want to scare people away given all of the progress that is
> being made in supporting the open-source runners with language portability.
>
> Cheers,
>
> Holden :)
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
