That depends on how you determine which view you're trying to use. All of the nodes that are visited in Pipeline#traverseTopologically are wired up in such a way that their outputs correspond one-to-one with the pipeline independent user graph. However, that means that the contents of any override don't necessarily match up with the graph that they're present in - that usually means that outputs have to be determined from the graph nodes, and shouldn't be obtained from the view itself.
I think you can swap from `PCollectionView.getView()` to `context.getOutput()` in FlinkStreamingTransformTranslators and everything should work out. PR #2035 is what I think should be sufficient (pending test runs, of course) On Fri, Feb 17, 2017 at 3:29 AM, Aljoscha Krettek <[email protected]> wrote: > Thomas, how were you planning to get the StreamingViewAs* PTransforms out > of the apply() in the Dataflow runner? > > I'm asking because the Flink Runner works basically the same way and I've > run into a problem. My problem is that the TupleTag of a PCollectionView is > generated when the user applies, for example, the View.AsIterable > PTransform. When the override is in the runner apply() this is not a > problem because the tag of the PCollectionView that the user has matches > the tag that we internally have in the Pipeline. Now, if I later change the > View.AsIterable using Pipeline surgery a new PCollectionView will be > created that will have a different internal tag from the one that the user > PCollectionView has. Then , we the user creates a ParDo with side inputs I > cannot properly match the replaced View against the user View and I can't > properly stitch together an execution graph. > > On Fri, 17 Feb 2017 at 00:45 Amit Sela <[email protected]> wrote: > > > Awesome! > > First thing I'm gonna do: > > > > 1. traverse the pipeline to determine if streaming. > > 2. If streaming, replace Read.Bounded with an adapted Read.Unbounded. > > > > Current implementation forces translating bounded reads by the unbounded > > translator and it feels awkward, this makes it right again. > > > > Thanks Thomas! > > > > On Thu, Feb 16, 2017 at 4:12 PM Aljoscha Krettek <[email protected]> > > wrote: > > > > > I might just try and do that. ;-) > > > > > > On Thu, 16 Feb 2017 at 03:55 Thomas Groh <[email protected]> > > wrote: > > > > > > > As of Github PR #1998 (https://github.com/apache/beam/pull/1998), > the > > > new > > > > Pipeline Surgery API is ready and available. There are a couple of > > > > refinements coming in PR #2006, but in general pipelines can now, > post > > > > construction, have PTransforms swapped out to whatever the runner > > desires > > > > (standard "behavior-maintaining" caveats apply). > > > > > > > > Moving forwards, this will enable pipelines to be run on multiple > > runners > > > > without having to reconstruct the graph via repeated applications of > > > > PTransforms to the pipeline (this also includes being able to, for > > > example, > > > > read a pipeline from a serialized representation, and executing the > > > result > > > > on an arbitrary runner). > > > > > > > > Those of you who are runner authors (at least, those who I can easily > > > > identify as such) should expect a Pull Request from me sometime next > > week > > > > porting you off of intercepting calls to apply and to the new surgery > > > API. > > > > You are, of course, welcome to beat me to the punch. > > > > > > > > Thanks, > > > > > > > > Thomas > > > > > > > > > >
