It'd be awesome to see these updated. I'd add two more:

   1. A plain English summary of the runner's support in Beam. People who
   are new to Beam won't understand the in-depth coverage and need a general
   idea of how it is supported.
   2. The production readiness of the runner. Does the maintainer think
   this runner is production ready?



On Sun, Aug 20, 2017 at 8:03 AM Kenneth Knowles <k...@google.com.invalid>
wrote:

> Hi all,
>
> I want to revamp
> https://beam.apache.org/documentation/runners/capability-matrix/
>
> When Beam first started, we didn't work on feature branches for the core
> runners, and they had a lot more gaps compared to what goes on `master`
> today, so this tracked our progress in a way that was easy for users to
> read. Now it is still our best/only comparison page for users, but I think
> we could improve its usefulness.
>
> For the benefit of the thread, let me inline all the capabilities fully
> here:
>
> ========================
>
> "What is being computed?"
>  - ParDo
>  - GroupByKey
>  - Flatten
>  - Combine
>  - Composite Transforms
>  - Side Inputs
>  - Source API
>  - Splittable DoFn
>  - Metrics
>  - Stateful Processing
>
> "Where in event time?"
>  - Global windows
>  - Fixed windows
>  - Sliding windows
>  - Session windows
>  - Custom windows
>  - Custom merging windows
>  - Timestamp control
>
> "When in processing time?"
>  - Configurable triggering
>  - Event-time triggers
>  - Processing-time triggers
>  - Count triggers
>  - [Meta]data driven triggers
>  - Composite triggers
>  - Allowed lateness
>  - Timers
>
> "How do refinements relate?"
>  - Discarding
>  - Accumulating
>  - Accumulating & Retracting
>
> ========================
>
> Here are some issues I'd like to improve:
>
>  - Rows that are impossible to not support (ParDo)
>  - Rows where "support" doesn't really make sense (Composite transforms)
>  - Rows that are actually the same model feature (non-merging windowfns)
>  - Rows that represent optimizations (Combine)
>  - Rows in the wrong place (Timers)
>  - Rows that have not been designed ([Meta]data driven triggers)
>  - Rows with names that appear nowhere else (Timestamp control)
>  - No place to compare non-model differences between runners
>
> I'm still pondering how to improve this, but I thought I'd send the notion
> out for discussion. Some imperfect ideas I've had:
>
> 1. Lump all the basic stuff (ParDo, GroupByKey, Read, Window) into one row
> 2. Make sections as users see them, like "ParDo" / "Side inputs", not
> "What?" / "Side inputs"
> 3. Add rows for non-model things, like portability framework support,
> metrics backends, etc.
> 4. Drop rows that are not informative, like Composite transforms, or that
> have not been designed
> 5. Reorganize the windowing section to be just support for merging /
> non-merging windowing.
> 6. Switch to a more distinct color scheme than the solid vs faded colors
> currently used.
> 7. Find a web design to get short descriptions into the foreground to make
> it easier to grok.
>
> These are just a few thoughts, and not necessarily compatible with each
> other. What do you think?
>
> Kenn
>
-- 
Thanks,

Jesse
