With an eye towards cross-language (which includes cross-version)
pipelines and services (specifically looking at Dataflow) supporting
portable pipelines, there's been a desire to stabilize the portability
protos. There are currently many cleanups we'd like to do [1] (some
essential, others nice to have); are there others that people would
like to see?

Of course we would like it to be possible for the FnAPI and Beam
itself to continue to evolve. Most of this can be handled by runners
understanding various transform URNs, but not all. (An example that
comes to mind is support for large iterables [2], or the requirement
to observe and respect new fields on a PTransform or its payloads
[3]). One proposal for this is to add capabilities and/or
requirements. An environment (corresponding generally to an SDK) could
advertise various capabilities (as a list or map of URNs) which a
runner can take advantage of without requiring all SDKs to support all
features at the same time. For the other direction, we need a way of
marking something that a runner must reject if it does not understand
it. This could be a set of requirements (again, a list or map of URNs)
naming the capabilities that the runner must at least understand in
order to execute the pipeline faithfully. (These could be attached
to a transform or the pipeline itself.) Do these sound like reasonable
additions? Also, would they ever need to be parameterized (map), or
would a list suffice?
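
To make the two directions concrete, here is a rough Python sketch of
how a runner might treat them. Everything in it is hypothetical: the
class names, the helper functions, and the example URNs are invented
for illustration and are not part of the current protos. It models the
plain list variant; a map variant would attach a payload to each URN.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical URNs, invented for this sketch only.
LARGE_ITERABLES = "beam:example:capability:large_iterables:v1"
NEW_PAYLOAD_FIELD = "beam:example:requirement:new_payload_field:v1"


@dataclass
class Environment:
    """Stand-in for an SDK environment that advertises capabilities."""
    capabilities: FrozenSet[str] = frozenset()


@dataclass
class Transform:
    """Stand-in for a PTransform carrying its own requirements."""
    name: str
    requirements: FrozenSet[str] = frozenset()


@dataclass
class Pipeline:
    """Stand-in for a pipeline with pipeline-level requirements."""
    transforms: List[Transform]
    requirements: FrozenSet[str] = frozenset()


def unmet_requirements(pipeline: Pipeline, understood: FrozenSet[str]) -> List[str]:
    """Return the required URNs that the runner does not understand."""
    required = set(pipeline.requirements)
    for transform in pipeline.transforms:
        required |= transform.requirements
    return sorted(required - understood)


def validate(pipeline: Pipeline, understood: FrozenSet[str]) -> None:
    """Reject the pipeline if any requirement is unknown to the runner."""
    missing = unmet_requirements(pipeline, understood)
    if missing:
        raise ValueError("runner does not understand: " + ", ".join(missing))


def can_use(env: Environment, capability: str) -> bool:
    """Capabilities are optional: the runner may use them if advertised."""
    return capability in env.capabilities
```

The asymmetry is the point: an unknown capability is simply ignored by
the runner, while an unknown requirement causes the pipeline to be
rejected.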

[1] BEAM-2645, BEAM-2822, BEAM-3203, BEAM-3221, BEAM-3223, BEAM-3227,
BEAM-3576, BEAM-3577, BEAM-3595, BEAM-4150, BEAM-4180, BEAM-4374,
BEAM-5391, BEAM-5649, BEAM-8172, BEAM-8201, BEAM-8271, BEAM-8373,
BEAM-8539, BEAM-8804, BEAM-9229, BEAM-9262, BEAM-9266, and BEAM-9272
[2] https://lists.apache.org/thread.html/70cac361b659516933c505b513d43986c25c13da59eabfd28457f1f2@%3Cdev.beam.apache.org%3E
[3] https://lists.apache.org/thread.html/rdc57f240069c0807eae87ed2ff13d3ee503bc18e5f906d05624e6433%40%3Cdev.beam.apache.org%3E
