Can you elaborate a bit more? Maybe a specific code example? I'm a little bit concerned about this sort of global verification. If the PCollection gets passed around afterwards, new restrictions on what can be done with it are a pretty big deal.
Kenn

On Fri, Jan 11, 2019 at 12:58 PM Reuven Lax <[email protected]> wrote:

> My problem is exactly outputs. I want to verify schemas for any
> OutputReceiver parameters, and I don't think I can do this in expand.
>
> The best idea I have so far is to create a new PipelineVisitor to do this,
> and run that after the normal apply is done.
>
> Reuven
>
> On Fri, Jan 11, 2019 at 12:39 PM Kenneth Knowles <[email protected]> wrote:
>
>> I believe that today all coders must be fully defined for all arguments
>> to expand(). For the outputs, the ParDo outputting should be agnostic, no?
>> The constraints on setCoder(...) are hoped to be enough to make sure
>> nothing breaks.
>>
>> Kenn
>>
>> On Fri, Jan 11, 2019 at 10:41 AM Reuven Lax <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I want to be able to write a verification phase that asserts that input
>>> and output schemas for all ParDos match up properly. The only place I can
>>> see to do that today is in expand(), however this does not work as Coders
>>> may not be fully known when expand is called (remember Schemas are
>>> implemented as a special type of Coder today). For example:
>>>
>>> p.apply(ParDo.of(MyDoFn))
>>>     .setCoder(FooCoder());
>>>
>>> FooCoder is not known yet when expand is called for the ParDo.
>>>
>>> Is there any place in Beam today where I could set up such a
>>> verification pass?
>>>
>>> Reuven
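
For concreteness, a rough sketch of the post-construction pass being discussed: traverse the fully built pipeline with a PipelineVisitor and flag any ParDo output whose coder is not a SchemaCoder. This is only a sketch, not a proposal of the exact check Reuven has in mind -- SchemaVerificationVisitor is a made-up name, and it assumes the visitor runs after all apply() and setCoder(...) calls, so that coders are already resolved (e.g. immediately before run()).

  import java.util.ArrayList;
  import java.util.List;

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.runners.TransformHierarchy;
  import org.apache.beam.sdk.schemas.SchemaCoder;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.beam.sdk.values.PValue;

  /**
   * Hypothetical verification pass: walks the fully constructed pipeline and
   * records every PCollection produced by a ParDo whose coder is not a SchemaCoder.
   */
  class SchemaVerificationVisitor extends Pipeline.PipelineVisitor.Defaults {

    private final List<String> problems = new ArrayList<>();

    @Override
    public void visitValue(PValue value, TransformHierarchy.Node producer) {
      // Only inspect values produced by ParDo transforms.
      if (!(producer.getTransform() instanceof ParDo.MultiOutput)
          && !(producer.getTransform() instanceof ParDo.SingleOutput)) {
        return;
      }
      if (value instanceof PCollection) {
        PCollection<?> pc = (PCollection<?>) value;
        // Assumes the coder has been resolved by the time the visitor runs,
        // including any setCoder(...) calls made after the apply().
        if (!(pc.getCoder() instanceof SchemaCoder)) {
          problems.add(producer.getFullName() + " produces " + pc.getName()
              + " without a schema (coder: " + pc.getCoder().getClass().getSimpleName() + ")");
        }
      }
    }

    List<String> getProblems() {
      return problems;
    }
  }

Run as a separate pass once the graph is complete, which avoids the expand()-time ordering problem in the example above:

  // After all applies, just before p.run():
  SchemaVerificationVisitor visitor = new SchemaVerificationVisitor();
  p.traverseTopologically(visitor);
  if (!visitor.getProblems().isEmpty()) {
    throw new IllegalStateException("Schema verification failed: " + visitor.getProblems());
  }

The same traversal could also inspect DoFn signatures (e.g. OutputReceiver element types) rather than only output coders; the sketch just shows where such a check could hook in.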
