Thanks for sharing, Ning!
Is this update valuable to users as well? If so, consider sending a
user-geared update to u...@beam.apache.org.

-P.

On Wed, Dec 4, 2019 at 11:14 AM Ning Kang <ni...@google.com> wrote:

> *If you are not an Interactive Beam user, you can ignore this email.*
>
> Hi everyone,
>
> Recently, we've been actively building more Interactive Beam features
> <https://docs.google.com/document/d/1DYWrT6GL_qDCXhRMoxpjinlVAfHeVilK5Mtf8gO6zxQ/edit?usp=sharing>
> on top of the existing InteractiveRunner.
>
> One of the things we've changed is which PCollections are cached and
> available through *get_result(pcoll)*.
>
> If your unit tests or code execute a pipeline with the
> InteractiveRunner and check the data of a PCollection through
> *get_result(pcoll)*, that code might now fail with an error from "raise
> ValueError('PCollection not available, please run the pipeline.')".
>
> This is because Interactive Beam now automatically figures out which
> PCollections have been assigned to variables in the user-defined pipelines
> in your code, tests, or notebooks by looking at a "watched" scope of
> variable definitions.
> By default, everything defined in "__main__" is watched.
>
> So if you've defined a pipeline in a local scope such as a function,
> Interactive Beam cannot "watch" it and therefore will not cache data for
> those PCollections.
> Only a one-line change is needed to fix this: watch your local
> scope.
>
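> Something like:
> import apache_beam as beam
> from apache_beam.runners.interactive import interactive_beam
> from apache_beam.runners.interactive.interactive_runner import InteractiveRunner
> ...
> def some_func(...):
>     p = beam.Pipeline(InteractiveRunner())
>     pcoll = p | 'SomeTransform' >> SomeTransform()
>     ...
>     # Watch this function's local scope so that pcoll can be cached.
>     interactive_beam.watch(locals())
>     result = p.run()
>     ...
> ...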
>
> Thanks for using Interactive Beam!
>
> Ning.
>
>
>
>
