I have made some good progress here and have gotten to the following state
for non-portable runners:

DirectRunner[1]: Merged. Supports Read.Bounded and Read.Unbounded.
Twister2[2]: Ready for review. Supports Read.Bounded, the current runner
doesn't support unbounded pipelines.
Spark[3]: WIP. Supports Read.Bounded, Nexmark suite passes. Not certain
about level of unbounded pipeline support coverage since Spark uses its own
tiny suite of tests to get unbounded pipeline coverage instead of the
validates runner set.
Jet[4]: WIP. Supports Read.Bounded. Read.Unbounded definitely needs
additional work.
Sazma[5]: WIP. Supports Read.Bounded. Not certain about level of unbounded
pipeline support coverage since Spark uses its own tiny suite of tests to
get unbounded pipeline coverage instead of the validates runner set.
Flink: Unstarted.

@Pulasthi Supun Wickramasinghe <pulasthi...@gmail.com> , can you help me
with the Twister2 PR[2]?
@Ismaël Mejía <ieme...@gmail.com>, is PR[3] the expected level of support
for unbounded pipelines and hence ready for review?
@Jozsef Bartok <jo...@hazelcast.com>, can you help me out to get support
for unbounded splittable DoFn's into Jet[4]?
@Xinyu Liu <xinyuliu...@gmail.com>, is PR[5] the expected level of support
for unbounded pipelines and hence ready for review?

1: https://github.com/apache/beam/pull/12519
2: https://github.com/apache/beam/pull/12594
3: https://github.com/apache/beam/pull/12603
4: https://github.com/apache/beam/pull/12616
5: https://github.com/apache/beam/pull/12617

On Tue, Aug 11, 2020 at 10:55 AM Luke Cwik <lc...@google.com> wrote:

> There shouldn't be any changes required since the wrapper will smoothly
> transition the execution to be run as an SDF. New IOs should strongly
> prefer to use SDF since it should be simpler to write and will be more
> flexible but they can use the "*Source"-based APIs. Eventually we'll
> deprecate the APIs but we will never stop supporting them. Eventually they
> should all be migrated to use SDF and if there is another major Beam
> version, we'll finally be able to remove them.
>
> On Tue, Aug 11, 2020 at 8:40 AM Alexey Romanenko <aromanenko....@gmail.com>
> wrote:
>
>> Hi Luke,
>>
>> Great to hear about such progress on this!
>>
>> Talking about opt-out for all runners in the future, will it require any
>> code changes for current “*Source”-based IOs or the wrappers should
>> completely smooth this transition?
>> Do we need to require to create new IOs only based on SDF or again, the
>> wrappers should help to avoid this?
>>
>> On 10 Aug 2020, at 22:59, Luke Cwik <lc...@google.com> wrote:
>>
>> In the past couple of months wrappers[1, 2] have been added to the Beam
>> Java SDK which can execute BoundedSource and UnboundedSource as Splittable
>> DoFns. These have been opt-out for portable pipelines (e.g. Dataflow runner
>> v2, XLang pipelines on Flink/Spark) and opt-in using an experiment for all
>> other pipelines.
>>
>> I would like to start making the non-portable pipelines starting with the
>> DirectRunner[3] to be opt-out with the plan that eventually all runners
>> will only execute splittable DoFns and the BoundedSource/UnboundedSource
>> specific execution logic from the runners will be removed.
>>
>> Users will be able to opt-in any pipeline using the experiment
>> 'use_sdf_read' and opt-out with the experiment 'use_deprecated_read'. (For
>> portable pipelines these experiments were 'beam_fn_api' and
>> 'beam_fn_api_use_deprecated_read' respectively and I have added these two
>> additional aliases to make the experience less confusing).
>>
>> 1:
>> https://github.com/apache/beam/blob/af1ce643d8fde5352d4519a558de4a2dfd24721d/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L275
>> 2:
>> https://github.com/apache/beam/blob/af1ce643d8fde5352d4519a558de4a2dfd24721d/sdks/java/core/src/main/java/org/apache/beam/sdk/io/Read.java#L449
>> 3: https://github.com/apache/beam/pull/12519
>>
>>
>>

Reply via email to