Can you elaborate a bit more? Maybe a specific code example? I'm a little
bit concerned about this sort of global verification. If the PCollection
gets passed around afterwards, new restrictions on what can be done with it
are a pretty big deal.

Kenn

On Fri, Jan 11, 2019 at 12:58 PM Reuven Lax <[email protected]> wrote:

> My problem is exactly outputs. I want to verify schemas for any
> OutputReceiver parameters, and I don't think I can do this in expand.
>
> The best idea I have so far is to create a new PipelineVisitor to do this,
> and run that after the normal apply is done.
>
> Reuven
>
> On Fri, Jan 11, 2019 at 12:39 PM Kenneth Knowles <[email protected]> wrote:
>
>> I believe that today all coders must be fully defined for all arguments
>> to expand(). For the outputs, the ParDo producing them should be agnostic
>> to the coder, no? The constraints on setCoder(...) are intended to be
>> enough to make sure nothing breaks.
>>
>> Kenn
>>
>> On Fri, Jan 11, 2019 at 10:41 AM Reuven Lax <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I want to be able to write a verification phase that asserts that input
>>> and output schemas for all ParDos match up properly. The only place I can
>>> see to do that today is in expand(); however, this does not work because
>>> Coders may not be fully known when expand() is called (remember that
>>> Schemas are implemented as a special type of Coder today). For example:
>>>
>>> p.apply(ParDo.of(new MyDoFn()))
>>>   .setCoder(new FooCoder());
>>>
>>> FooCoder is not known yet when expand is called for the ParDo.
>>>
>>> Is there any place in Beam today where I could set up such a
>>> verification pass?
>>>
>>> Reuven
>>>
>>
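
The ordering problem discussed in this thread can be illustrated with a small
toy model. This is only a sketch and does not use any Beam API: the classes
ToyPipeline, ToyCollection and Step, and the methods setSchema() and verify(),
are hypothetical stand-ins for Pipeline, PCollection, ParDo, setCoder() and a
post-construction visitor pass. It assumes a schema can be compared as a plain
string, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes are Beam APIs.
public class DeferredSchemaCheck {

  // Stand-in for a PCollection whose coder (schema) may be set after apply().
  static class ToyCollection {
    final String name;
    String schema; // stays null until setSchema() is called

    ToyCollection(String name) {
      this.name = name;
    }

    ToyCollection setSchema(String schema) {
      this.schema = schema;
      return this;
    }
  }

  // Stand-in for a ParDo: remembers the schema its DoFn expects to emit.
  static class Step {
    final String expectedOutputSchema;
    final ToyCollection output;

    Step(String expectedOutputSchema, ToyCollection output) {
      this.expectedOutputSchema = expectedOutputSchema;
      this.output = output;
    }
  }

  static class ToyPipeline {
    final List<Step> steps = new ArrayList<>();

    // Analogue of apply(): the output schema is still unknown here,
    // so nothing can be checked at this point.
    ToyCollection apply(String name, String expectedOutputSchema) {
      ToyCollection out = new ToyCollection(name);
      steps.add(new Step(expectedOutputSchema, out));
      return out;
    }

    // Analogue of a separate pass run after construction: every
    // setSchema() call has already happened, so mismatches show up.
    List<String> verify() {
      List<String> errors = new ArrayList<>();
      for (Step s : steps) {
        if (!s.expectedOutputSchema.equals(s.output.schema)) {
          errors.add(s.output.name + ": expected " + s.expectedOutputSchema
              + ", got " + s.output.schema);
        }
      }
      return errors;
    }
  }

  public static void main(String[] args) {
    ToyPipeline p = new ToyPipeline();
    p.apply("good", "FooSchema").setSchema("FooSchema");
    p.apply("bad", "FooSchema").setSchema("BarSchema");
    // Reports only the mismatched step.
    System.out.println(p.verify());
  }
}
```

The check lives in verify() rather than in apply() because, as in the example
quoted above, the schema is attached only after apply() has returned. The toy
model says nothing about the concern raised at the top of the thread, namely
what happens if a collection is modified after the pass has run.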
