2(c) can also be "hacked" inside an SDK as an explicit environment override by the "user": the expansion service isn't involved in the choice, and the user/SDK rewrites the environment in the expansion service's response after the fact. As Chamikara pointed out, I believe the response from the expansion service should be "safe", rather than the service being allowed to return broken combinations.
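To make the SDK-side override concrete, here is a minimal sketch in Python. The types and names (`Transform`, `ExpansionResponse`, `override_environment`, the environment ids and image names) are hypothetical stand-ins made up for illustration, not Beam's actual protos or SDK API. It only shows the shape of the "hack": the expansion service answers as usual, and the SDK then repoints the expanded transforms at a user-chosen environment.

```python
from dataclasses import dataclass, replace
from typing import Dict


# Hypothetical stand-ins for illustration only; NOT Beam's real protos.
@dataclass(frozen=True)
class Transform:
    unique_name: str
    environment_id: str


@dataclass(frozen=True)
class ExpansionResponse:
    transforms: Dict[str, Transform]
    environments: Dict[str, str]  # environment id -> container image


def override_environment(
    response: ExpansionResponse, env_id: str, image: str
) -> ExpansionResponse:
    """Return a copy of `response` whose transforms all use `env_id`.

    The expansion service is not consulted, so nothing here checks that
    the expanded transforms can actually run in the new environment.
    """
    environments = dict(response.environments)
    environments[env_id] = image
    transforms = {
        tid: replace(t, environment_id=env_id)
        for tid, t in response.transforms.items()
    }
    return ExpansionResponse(transforms=transforms, environments=environments)


# What the (hypothetical) expansion service returned.
response = ExpansionResponse(
    transforms={"t1": Transform("ExternalRead", "java_default")},
    environments={"java_default": "example/java-sdk:latest"},
)

# The SDK-side override requested by the "user".
patched = override_environment(response, "user_env", "example/custom-java:latest")
```

Because the override happens after expansion, the service has no chance to reject or reconcile the user's environment, which is where the broken combinations would come from.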

On Wed, May 22, 2019 at 11:08 AM Chamikara Jayalath wrote:
>
> On Wed, May 22, 2019 at 9:17 AM Maximilian Michels wrote:
>
>> Hi,
>>
>> Robert and I were discussing the subject of user-specified
>> environments for external transforms [1]. We couldn't decide whether
>> users should have direct control over the environment when they use an
>> external transform in their pipeline.
>>
>> In my mind, it is quite natural that the Expansion Service is a
>> long-running service that gets started with a list of available
>> environments. Such a list can be outdated and users may write transforms
>> for a new environment they want to use in their pipeline. The easiest
>> way would be to allow passing the environment with the transform. Note
>> that we already give users control over the "main" environment via the
>> PortablePipelineOptions, so this wouldn't be an entirely new concept.
>>
>
> I think we are trying to generalize the expansion service along multiple
> axes.
>
> (1) dependencies
>     (a) dependencies embedded in an environment
>     (b) dependencies specific to a transform
>     (c) dependencies specified by the user expanding the transform
>
> (2) environments
>     (a) default environment
>     (b) environments specified at startup of the expansion service
>     (c) environments specified by the user expanding the transform
>         (this proposal)
>
> It's great if we can implement the most generic solution along all these
> axes, but I think we run the risk of ending up with broken combinations by
> trying to implement this before we have the other pieces necessary to
> support a long running expansion service. For example, support for
> dynamically registering transforms and support for discovering transforms.
>
> What is the need for implementing 2(c) now? If there's no real need now,
> I suggest we settle on 2(a) or 2(b) for now till we can truly support a
> long running expansion service. Also, we'll have a better idea of how this
> kind of feature should evolve when we have at least two runners supporting
> cross-language transforms (we are in the process of updating Dataflow to
> support this). Just my 2 cents though :)
>
>>
>> The contrary position is that the Expansion Service should have full
>> control over which environment is chosen. Going back to the discussion
>> about artifact staging [2], this could enable more optimizations, such
>> as merging environments or detecting conflicts. However, this only works
>> if this information has been provided upfront to the Expansion Service.
>> It wouldn't be impossible to provide these hints alongside the
>> environment, as suggested in the previous paragraph.
>>
>> Any opinions? Should we allow users to optionally specify an environment
>> for external transforms?
>>
>> Thanks,
>> Max
>>
>> [1] https://github.com/apache/beam/pull/8639
>> [2] https://lists.apache.org/thread.html/6fcee7047f53cf1c0636fb65367ef70842016d57effe2e5795c4137d@%3Cdev.beam.apache.org%3E
>>
