Chamikara Madhusanka Jayalath created BEAM-10012:
----------------------------------------------------

             Summary: Update Python SDK to construct Dataflow job requests from 
Beam runner API protos
                 Key: BEAM-10012
                 URL: https://issues.apache.org/jira/browse/BEAM-10012
             Project: Beam
          Issue Type: New Feature
          Components: sdk-py-core
            Reporter: Chamikara Madhusanka Jayalath


Currently, portable runners are expected to do the following when 
constructing a runner-specific job.

SDK specific job graph -> Beam runner API proto -> Runner specific job request

Portable Spark and Flink follow this model.

Dataflow does the following.

SDK specific job graph -> Runner specific job request

Beam runner API proto -> Upload to GCS -> Download at workers

 

We should update Dataflow to follow the former path, which all portable 
runners are expected to follow.

This will simplify the job construction logic for cross-language transforms 
on Dataflow.
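
As a rough illustration of the two-stage path (SDK specific job graph -> 
Beam runner API proto -> Runner specific job request), here is a minimal, 
standard-library-only sketch. It does not use the Beam SDK or the Dataflow 
API. SdkTransform, to_runner_api, to_job_request, the "example:..." URNs and 
the shape of the request dict are all hypothetical names chosen for this 
example; a plain dict stands in for the runner API proto.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SdkTransform:
    """Hypothetical stand-in for one node of an SDK-specific job graph."""
    label: str
    urn: str
    inputs: List[str] = field(default_factory=list)


def to_runner_api(graph: List[SdkTransform]) -> Dict[str, dict]:
    """Stage 1: SDK graph -> language-neutral form.

    The returned dict is only a stand-in for the Beam runner API proto.
    """
    return {t.label: {"urn": t.urn, "inputs": list(t.inputs)} for t in graph}


def to_job_request(pipeline_proto: Dict[str, dict]) -> dict:
    """Stage 2: language-neutral form -> runner-specific job request.

    Only the proto-like dict is consulted here, never the SDK objects, so
    the same code can handle transforms that came from another SDK.
    """
    steps = [
        {"name": label, "kind": spec["urn"], "inputs": spec["inputs"]}
        for label, spec in pipeline_proto.items()
    ]
    return {"steps": steps}


# Hypothetical two-step pipeline; the second step plays the role of an
# expanded external transform.
graph = [
    SdkTransform("Read", "example:read:v1"),
    SdkTransform("External", "example:external:v1", inputs=["Read"]),
]
request = to_job_request(to_runner_api(graph))
print([step["name"] for step in request["steps"]])
```

Because stage 2 depends only on the intermediate form, steps produced by 
expanding an external transform need no SDK-specific handling in this sketch.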

We can probably start by implementing this in the Python SDK, only for the 
portions of a pipeline that are received by expanding external transforms.

cc: [~lcwik] [~robertwb]

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
