Dear Beam & Dask communities,

Together with Pablo and Charles, I've hacked together an initial prototype of a Dask runner for Beam. I'm happy to announce that I have a minimum viable working version in a fork here: https://github.com/alxmrs/beam/pull/1

There's definitely more work to do here – more operations to implement, tests to write, style guides to follow, etc. However, I'm pleased that there are enough operations implemented to run test pipelines with assertions.

From here, what are good next steps?

Best,
Alex

PS – Meeting / design notes are available in this doc: https://docs.google.com/document/d/1Awj_eNmH-WRSte3bKcCcUlQDiZ5mMKmCO_xV-mHWAak/edit#heading=h.y0pwg4polebc

On 2022/06/08 14:22:41 Ryan Abernathey wrote:
> Dear Beamer,
>
> Thank you for all of your work on this amazing project. I am new to Beam
> and am quite excited about its potential to help with some data processing
> challenges in my field of climate science.
>
> Our community is interested in running Beam on Dask Distributed clusters,
> which we already know how to deploy. This has been discussed at
> https://issues.apache.org/jira/browse/BEAM-5336 and
> https://github.com/apache/beam/issues/18962. It seems technically feasible.
>
> We are trying to organize a meeting next week to kickstart and coordinate
> this effort. It would be great if we could entrain some Beam maintainers
> into this meeting. If you have interest in this topic and are available
> next week, please share your availability here -
> https://www.when2meet.com/?15861604-jLnA4
>
> Alternatively, if you have any guidance or suggestions you wish to provide
> by email or GitHub discussion, we welcome your input.
>
> Thanks again for your open source work.
>
> Best,
> Ryan Abernathey
>