Re: Environments for External Transforms

Thomas Weise Thu, 23 May 2019 07:47:41 -0700

On Thu, May 23, 2019 at 3:46 AM Maximilian Michels <[email protected]> wrote:


> > Writing a new transform involves updating the expansion service to
> > include their new transform.
>
> Would it be conceivable that the expansion is performed via the
> environment? That would solve the problem of updating the expansion
> service, although it adds additional complexity for bringing up the
> environment.
>
>
Which environment would be used to perform the expansion? I think this is
an interesting option, as long as it does not introduce a hard dependency
on docker.


> On 23.05.19 11:31, Robert Bradshaw wrote:
> > On Wed, May 22, 2019 at 6:17 PM Maximilian Michels <[email protected]> wrote:
> >
> >     Hi,
> >
> >     Robert and I were discussing user-specified environments for
> >     external transforms [1]. We couldn't decide whether users should
> >     have direct control over the environment when they use an
> >     external transform in their pipeline.
> >
> >     In my mind, it is quite natural that the Expansion Service is a
> >     long-running service that gets started with a list of available
> >     environments.
> >
> >
> > +1.
> >
> > IMHO, the expansion service should be expected to provide valid
> > environments for the transforms it vendors. Removing this expectation
> > seems wrong. Making it cheap to specify non-default dependencies without
> > building (publishing, etc.) a docker image is probably key to making
> > this work well (and also allowing more powerful environment
> > introspection).
> >
> >     Such a list can be outdated and users may write transforms
> >     for a new environment they want to use in their pipeline.
> >
> >
> > This is the part that I'm having trouble following. Writing a new
> > transform involves updating the expansion service to include their new
> > transform. The author of a transform (in other words, the one who
> > defines its expansion and implementation) is in the position to name its
> > dependencies, etc. and the user of the transform (the one invoking it)
> > is not in a generally good position to know what environments would be
> > valid.
> >
> >     The easiest
> >     way would be to allow passing the environment with the transform.
> >
> >
> > What this allows is using existing transforms in new environments. There
> > are possibly some use cases for this, e.g. the expansion of a given
> > transform may be compatible with either version X or version Y of a
> > library, left up to the discretion of the caller, but I think that this
> > is really just a deficiency in our environment specifications (e.g. one
> > should be able to express this flexibility in the returned environment).
> >
> >     Note
> >     that we already give users control over the "main" environment via
> >     the PortablePipelineOptions, so this wouldn't be an entirely new
> >     concept.
> >
> >
> > Yes, the author of a pipeline/transform chooses the environment in which
> > those transforms execute.
> >
> >     The contrary position is that the Expansion Service should have full
> >     control over which environment is chosen. Going back to the
> >     discussion about artifact staging [2], this could enable more
> >     optimizations, such as merging environments or detecting conflicts.
> >     However, this only works if this information has been provided
> >     upfront to the Expansion Service. It wouldn't be impossible to
> >     provide these hints alongside the environment, as suggested in
> >     the previous paragraph.
> >
> >     Any opinions? Should we allow users to optionally specify an
> >     environment
> >     for external transforms?
> >
> >     Thanks,
> >     Max
> >
> >     [1] https://github.com/apache/beam/pull/8639
> >     [2] https://lists.apache.org/thread.html/6fcee7047f53cf1c0636fb65367ef70842016d57effe2e5795c4137d@%3Cdev.beam.apache.org%3E
> >
>
