On Fri, Nov 8, 2019 at 9:46 AM Brian Hulette <[email protected]> wrote:
>
> > Does it make sense to do this?
> I think this makes a lot of sense. Plus it's a good opportunity to refresh
> the UX of [1].
>
> > what's a good way of doing it? Should we expand the existing Capability
> > Matrix to support SDKs as well? Or should we have a new one?
> To me there are two aspects to this: how we model the data, and how we
> present the data.
>
> For modelling the data:
> Do we need to maintain the full 3-dimensional <feature - SDK - runner>
> matrix? That seems untenable to me. With portability, I think the runner and
> SDK matrix should be completely independent, so it should be safe to just
> maintain <feature - SDK>, and <feature - runner> matrices and model the
> 3-dimensional matrix as the cross-product of the two.
> Maybe we should have a new capability matrix just for portable runners so we
> can exploit this property?

Yes, being able to do that is the crux of the portability work. We may
have to consider, say, "Portable Spark" and "Non-Portable Spark" to be
two separate runners and have the caveat that some runners (namely the
non-portable ones) do not work with all SDKs.

Another thing I'd really, really like to see is these matrices
automatically populated via validates runner test attributes. E.g. you
can pick a runner, run the validates runner test suite, and see what
is fully/partially/not at all supported. This is harder to do for
SDKs, but at least you could get some signal by looking for the
existence of (passing) tests.

> For presenting the data:
> I think there would be value in just presenting <feature - runner> (basically
> what we have now in [1]), and also presenting <feature - SDK> separately. The
> <feature - SDK> display could serve as documentation too, with examples of
> how to do Y in each SDK.
> Maybe there would also be value in presenting <feature - SDK - runner> in
> some fancy UI so an architect can quickly answer "what can I do with SDK Z on
> Runner X", but I'm not sure what that would look like.

I think two tables are fine. Note that with cross-language, the
restrictions of an SDK become less of an issue. One could imagine UIs
that would let you select a (set of?) SDKs and runners and
automatically populate the matrix according to the intersection.

> [1] https://beam.apache.org/documentation/runners/capability-matrix/
>
> On Thu, Nov 7, 2019 at 10:09 PM Thomas Weise <[email protected]> wrote:
>>
>> FWIW there are currently at least 2 instances of capability matrix [1] [2].
>>
>> [1] has been in need of a refresh for a while.
>>
>> [2] is more useful but only covers portable runners and is hard to find.
>>
>> Thomas
>>
>> [1] https://beam.apache.org/documentation/runners/capability-matrix/
>> [2] https://s.apache.org/apache-beam-portability-support-table
>>
>> On Thu, Nov 7, 2019 at 7:52 PM Pablo Estrada <[email protected]> wrote:
>>>
>>> Hi all,
>>> I think this is a relatively common question:
>>>
>>> - Can I do X with runner Y, and SDK Z?
>>>
>>> The answers vary significantly between SDK and Runner pairs. This makes it
>>> such that the current Capability Matrix falls somewhat short when potential
>>> users / solutions architects / etc are trying to decide to adopt Beam, and
>>> which Runner / SDK to use.
>>>
>>> I think we need to put some effort in building a capability matrix that
>>> expresses this information - and maintain it updated.
>>>
>>> I would like to discuss a few things:
>>> - Does it make sense to do this?
>>> - If it does, what's a good way of doing it? Should we expand the existing
>>> Capability Matrix to support SDKs as well? Or should we have a new one?
>>> - Any other thoughts you may have about the idea.
>>>
>>> Best
>>> -P.
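
For illustration, here is a rough sketch of the modelling idea discussed in
this thread: keep separate <feature - SDK> and <feature - runner> matrices and
derive each <feature - SDK - runner> cell on demand, then take the
intersection over a selected set of SDKs and runners. Everything in it is an
assumption made for the example: the feature, SDK and runner names are
placeholders rather than real Beam data, the three support levels mirror the
"fully/partially/not at all" wording above, and combining two levels by taking
the weaker one is just one plausible reading of "cross-product". It also
assumes portable runners only, since non-portable runners may not work with
every SDK.

```python
from enum import IntEnum
from itertools import product


class Support(IntEnum):
    """Support level, ordered so that min() picks the weaker of two levels."""
    NO = 0
    PARTIAL = 1
    FULL = 2


# Hypothetical <feature - SDK> matrix (placeholder values, not real Beam data).
SDK_MATRIX = {
    "feature_a": {"sdk_x": Support.FULL, "sdk_y": Support.PARTIAL},
    "feature_b": {"sdk_x": Support.FULL, "sdk_y": Support.NO},
}

# Hypothetical <feature - runner> matrix (placeholder values as well).
RUNNER_MATRIX = {
    "feature_a": {"runner_p": Support.FULL, "runner_q": Support.FULL},
    "feature_b": {"runner_p": Support.PARTIAL, "runner_q": Support.NO},
}


def combined_support(feature, sdk, runner):
    """Derive one <feature - SDK - runner> cell from the two 2-D matrices.

    Assumption: the combined level is the weaker of the two inputs, and a
    missing entry counts as unsupported.
    """
    sdk_level = SDK_MATRIX.get(feature, {}).get(sdk, Support.NO)
    runner_level = RUNNER_MATRIX.get(feature, {}).get(runner, Support.NO)
    return min(sdk_level, runner_level)


def intersection(sdks, runners):
    """Per feature, the weakest support across every selected SDK/runner pair."""
    features = sorted(set(SDK_MATRIX) | set(RUNNER_MATRIX))
    return {
        feature: min(
            (combined_support(feature, s, r) for s, r in product(sdks, runners)),
            default=Support.NO,  # an empty selection supports nothing
        )
        for feature in features
    }


if __name__ == "__main__":
    for feature, level in intersection(["sdk_x", "sdk_y"], ["runner_p"]).items():
        print(f"{feature}: {level.name}")
```

With this layout only the two 2-D tables need to be maintained by hand (or
populated from test results); the 3-dimensional view is computed and never
stored.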
