[ 
https://issues.apache.org/jira/browse/BEAM-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963243#comment-15963243
 ] 

Chamikara Jayalath commented on BEAM-1925:
------------------------------------------

cc: [~sb2nov] [~robertwb] [~jkff]

> Make DoFn invocation logic of Python SDK more extensible
> --------------------------------------------------------
>
>                 Key: BEAM-1925
>                 URL: https://issues.apache.org/jira/browse/BEAM-1925
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py
>            Reporter: Chamikara Jayalath
>            Assignee: Chamikara Jayalath
>
> The DoFn invocation logic of the Python SDK currently lives in the DoFnRunner class:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L54
> When a DoFnRunner is initialized, we parse the DoFn and create local state. We use
> this state when invoking the DoFn methods process, start_bundle, and
> finish_bundle. For example, we store a list of ArgPlaceholder objects within
> the state of DoFnRunner to facilitate invocation of the process method.
> We will need to extend this functionality when adding new features to the DoFn
> class (for example, to support Splittable DoFn [1]), so I think it would be good to
> refactor this code to be more extensible.
> I think a good approach is to add DoFnInvoker and DoFnSignature
> classes similar to those in the Java SDK [2].
> In this approach:
> A DoFnSignature captures the signature of a DoFn including methods and 
> arguments.
> A DoFnInvoker implements a particular way DoFn methods will be executed 
> (initially we'll have simple and per-window invokers [3]).
> A runner uses DoFnRunner to execute the methods of a given DoFn. At
> initialization, DoFnRunner creates a DoFnSignature and a DoFnInvoker for the
> given DoFn.
> The DoFnSignature and DoFnInvoker methods will be used by the SplittableDoFn
> implementation as well.
> [1] https://docs.google.com/document/d/1h_zprJrOilivK2xfvl4L42vaX4DMYGfH1YDmi-s_ozM/edit#heading=h.e6patunrpiql
> [2] https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
> [3] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L200
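
For illustration only, the split proposed in the description could be sketched roughly as below. This is not the actual Beam code or the final design: the DoFn class here is a minimal stand-in rather than apache_beam.DoFn, and every method name and attribute on DoFnSignature, DoFnInvoker, SimpleInvoker, and DoFnRunner (invoke_process, run_bundle, methods, and so on) is an assumption made up for the example. Per-window invocation and side inputs are left out.

```python
import inspect


class DoFn:
    """Minimal stand-in for a DoFn (hypothetical, not apache_beam.DoFn)."""

    def start_bundle(self):
        pass

    def process(self, element):
        raise NotImplementedError

    def finish_bundle(self):
        pass


class DoFnSignature:
    """Captures the methods of a DoFn and the argument names of each."""

    METHOD_NAMES = ('start_bundle', 'process', 'finish_bundle')

    def __init__(self, do_fn):
        self.do_fn = do_fn
        self.methods = {}
        for name in self.METHOD_NAMES:
            method = getattr(do_fn, name)
            # These are bound methods, so 'self' is not among the parameters.
            self.methods[name] = list(inspect.signature(method).parameters)


class DoFnInvoker:
    """Implements one particular way of executing the DoFn methods."""

    def __init__(self, signature):
        self.signature = signature

    def invoke_start_bundle(self):
        self.signature.do_fn.start_bundle()

    def invoke_finish_bundle(self):
        self.signature.do_fn.finish_bundle()

    def invoke_process(self, element):
        raise NotImplementedError


class SimpleInvoker(DoFnInvoker):
    """Calls process(element) directly, with no extra arguments."""

    def invoke_process(self, element):
        # Treat a None return value as "no outputs".
        return list(self.signature.do_fn.process(element) or [])


class DoFnRunner:
    """Creates a signature and an invoker for the DoFn at initialization."""

    def __init__(self, do_fn):
        self.signature = DoFnSignature(do_fn)
        self.invoker = SimpleInvoker(self.signature)

    def run_bundle(self, elements):
        outputs = []
        self.invoker.invoke_start_bundle()
        for element in elements:
            outputs.extend(self.invoker.invoke_process(element))
        self.invoker.invoke_finish_bundle()
        return outputs


class SplitWords(DoFn):
    def process(self, element):
        return element.split()


runner = DoFnRunner(SplitWords())
result = runner.run_bundle(['a b', 'c'])
```

In this sketch a new execution strategy (for example, a per-window or splittable one) would be added as another DoFnInvoker subclass that reads the same DoFnSignature, so DoFnRunner itself would not need to change.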



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
