Hi all,

I want to revamp
https://beam.apache.org/documentation/runners/capability-matrix/

When Beam first started, we didn't work on feature branches for the core
runners, and they had a lot more gaps compared to what goes on `master`
today, so this tracked our progress in a way that was easy for users to
read. Now it is still our best/only comparison page for users, but I think
we could improve its usefulness.

For the benefit of the thread, let me inline all the capabilities fully
here:

========================

"What is being computed?"
 - ParDo
 - GroupByKey
 - Flatten
 - Combine
 - Composite Transforms
 - Side Inputs
 - Source API
 - Splittable DoFn
 - Metrics
 - Stateful Processing

"Where in event time?"
 - Global windows
 - Fixed windows
 - Sliding windows
 - Session windows
 - Custom windows
 - Custom merging windows
 - Timestamp control

"When in processing time?"
 - Configurable triggering
 - Event-time triggers
 - Processing-time triggers
 - Count triggers
 - [Meta]data driven triggers
 - Composite triggers
 - Allowed lateness
 - Timers

"How do refinements relate?"
 - Discarding
 - Accumulating
 - Accumulating & Retracting

========================

Here are some issues I'd like to improve:

 - Rows that are impossible to not support (ParDo)
 - Rows where "support" doesn't really make sense (Composite transforms)
 - Rows that are actually the same model feature (non-merging windowfns)
 - Rows that represent optimizations (Combine)
 - Rows in the wrong place (Timers)
 - Rows that have not been designed ([Meta]data driven triggers)
 - Rows with names that appear nowhere else (Timestamp control)
 - No place to compare non-model differences between runners

I'm still pondering how to improve this, but I thought I'd send the notion
out for discussion. Some imperfect ideas I've had:

1. Lump all the basic stuff (ParDo, GroupByKey, Read, Window) into one row
2. Name sections the way users see them, like "ParDo" / "Side Inputs", not
"What?" / "Side Inputs"
3. Add rows for non-model things, like portability framework support,
metrics backends, etc.
4. Drop rows that are not informative, like Composite transforms, or that
have not been designed
5. Reorganize the windowing section to be just support for merging /
non-merging windowing.
6. Switch to a more distinct color scheme than the solid vs faded colors
currently used.
7. Find a web design to get short descriptions into the foreground to make
it easier to grok.
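
To make ideas 1, 4, and 5 a bit more concrete, here is a rough sketch of how
the row lists above might collapse. Everything in it is an assumption for
illustration only: the names `DROPPED`, `RENAMED`, and `regroup` are
hypothetical, the new row labels are placeholders, and treating "Custom
windows" as non-merging is my guess. It is not how the website data is
actually stored.

```python
# Hypothetical sketch only: these names and labels are illustrative
# assumptions, not taken from the Beam website sources.

WHAT = ["ParDo", "GroupByKey", "Flatten", "Combine", "Composite Transforms",
        "Side Inputs", "Source API", "Splittable DoFn", "Metrics",
        "Stateful Processing"]
WHERE = ["Global windows", "Fixed windows", "Sliding windows",
         "Session windows", "Custom windows", "Custom merging windows",
         "Timestamp control"]

# Idea 4: rows that are not informative or not designed are dropped.
DROPPED = {"Composite Transforms", "[Meta]data driven triggers"}

# Ideas 1 and 5: several old rows map onto a single new row.
# (Read and Window from idea 1 have no matching row in the lists above,
# so only ParDo and GroupByKey are lumped here.)
RENAMED = {
    "ParDo": "Basic transforms",
    "GroupByKey": "Basic transforms",
    "Global windows": "Non-merging windowing",
    "Fixed windows": "Non-merging windowing",
    "Sliding windows": "Non-merging windowing",
    "Custom windows": "Non-merging windowing",  # assumption
    "Session windows": "Merging windowing",
    "Custom merging windows": "Merging windowing",
}


def regroup(rows):
    """Return the new row list, preserving first-seen order."""
    result = []
    for name in rows:
        if name in DROPPED:
            continue
        new_name = RENAMED.get(name, name)
        if new_name not in result:
            result.append(new_name)
    return result


if __name__ == "__main__":
    print(regroup(WHAT))
    print(regroup(WHERE))
```

Under these assumptions the ten "What" rows shrink to eight and the seven
"Where" rows shrink to three, which gives a feel for how much shorter the
page could get.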

These are just a few thoughts, and not necessarily compatible with each
other. What do you think?

Kenn
