The expansion service can be provided by the job server, as done in the
Flink runner. It needs to be available at pipeline construction time, but
there is no need to run a separate service.

Thomas

On Mon, Nov 4, 2019 at 12:03 PM Robert Bradshaw <rober...@google.com> wrote:

> On Mon, Nov 4, 2019 at 11:54 AM Chamikara Jayalath <chamik...@google.com>
> wrote:
> >
> > On Mon, Nov 4, 2019 at 11:01 AM Hai Lu <lhai...@apache.org> wrote:
> >>
> >> Hi,
> >>
> >> We're looking into leveraging the cross-language pipeline feature in
> >> our Beam pipelines on the Samza runner. While the feature seems to work
> >> well, running the PTransform expansion as a standalone service isn't
> >> very convenient, particularly because the Python pipeline needs to
> >> specify the address of the expansion service.
> >>
> >> I'm wondering why we couldn't embed the expansion service into the
> >> runner itself. I understand that the cross-language feature is meant to
> >> be runner independent, but does it make sense to at least provide the
> >> option to let the runner use the expansion service as a library and make
> >> it transparent to the portable pipeline?
> >
> >
> > Beam composite transforms are expanded before the portable job
> > definition is created (and before the job is submitted to the runner).
> > So naturally this has to be done on the Beam side. As an added benefit,
> > as you identified, this allows us to keep this logic runner independent.
> > I think there were discussions about automatically starting up a local
> > expansion service if one is not specified. Would this address your
> > concerns?
>
> Just to add to this, if you have a pipeline A -> B -> C, the expansion
> of B often needs to be evaluated before C can be applied (e.g. we're
> planning on exposing the SQL transforms cross language, and many
> cross-language IOs can query and supply their own schemas for
> downstream type checking), so one cannot construct the "whole"
> pipeline, pass it to the runner, and let the runner do the expansion.
>
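
To illustrate the A -> B -> C point quoted above, here is a minimal toy sketch in plain Python. It does not use the Beam API: PCollection, expansion_service, apply_external, apply_typed, and the URN "example:sql:select_total" are all hypothetical names made up for this illustration. The only thing it tries to show is that the output schema of B is unknown until B has been expanded, so C cannot be type checked (or even applied) before that happens.

```python
from dataclasses import dataclass
from typing import Dict

Schema = Dict[str, type]


@dataclass
class PCollection:
    """Hypothetical stand-in for a pipeline value: a name plus a schema."""
    name: str
    schema: Schema


def expansion_service(urn: str, input_schema: Schema) -> Schema:
    """Hypothetical expansion service.

    Expands the transform identified by `urn` and reports the schema of its
    output. In this toy model the caller cannot know that schema up front.
    """
    if urn == "example:sql:select_total":
        # Assumed SQL-like transform: keeps `user`, derives `total`.
        if "user" not in input_schema or "amount" not in input_schema:
            raise TypeError("input must have 'user' and 'amount' fields")
        return {"user": input_schema["user"], "total": float}
    raise KeyError(f"unknown transform URN: {urn}")


def apply_external(pcoll: PCollection, urn: str) -> PCollection:
    """Apply B: must call the expansion service at construction time."""
    return PCollection(name=urn, schema=expansion_service(urn, pcoll.schema))


def apply_typed(pcoll: PCollection, name: str, field: str, expected: type) -> PCollection:
    """Apply C: type checks against the schema produced by the previous step."""
    actual = pcoll.schema.get(field)
    if actual is not expected:
        raise TypeError(f"{name} needs field {field!r} of type {expected.__name__}")
    return PCollection(name=name, schema=dict(pcoll.schema))


# A -> B -> C, built step by step at pipeline construction time.
a = PCollection("A", {"user": str, "amount": float})
b = apply_external(a, "example:sql:select_total")  # expansion happens here
c = apply_typed(b, "C", "total", float)            # only checkable after B expanded
```

If the expansion were deferred to the runner, the call that builds C would have no schema to check against, which is why the service has to be reachable while the pipeline is being constructed.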
