On Thu, Feb 23, 2017 at 9:02 PM, Shen Li <cs.she...@gmail.com> wrote:

> Thanks a lot for explaining. As the SimpleDoFnRunner only takes one
> StepContext object in its arguments, do you mean the runner should create a
> new SimpleDoFnRunner for each key?
>

Yes, that is right. You have some flexibility in how you manage this. For
example, the DirectRunner creates a new DoFnRunner for each bundle and
separately keeps track of whether a bundle has one consistent key:
https://github.com/apache/beam/blob/master/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ParDoEvaluator.java#L76

Kenn
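
To make the reply above concrete, here is a minimal sketch of the idea in
plain Java. It is not Beam code: KeyState, KeyedContext, CountingRunner and
KeyedStateStore are hypothetical stand-ins for StateInternals, StepContext,
SimpleDoFnRunner and the runner's own bookkeeping, and the sketch assumes the
simplest arrangement, in which a new runner is built for every element's key.
The point it illustrates is only that the state is specialized per key
before the DoFn runner is constructed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: none of these types are Beam classes. */
public class PerKeyStateSketch {

  /** Hypothetical stand-in for the state that belongs to exactly one key. */
  static final class KeyState {
    private final Map<String, Object> cells = new HashMap<>();
    Object read(String id) { return cells.get(id); }
    void write(String id, Object value) { cells.put(id, value); }
  }

  /** Hypothetical stand-in for a step context already specialized to one key. */
  static final class KeyedContext {
    private final KeyState state;
    KeyedContext(KeyState state) { this.state = state; }
    KeyState stateInternals() { return state; }
  }

  /** Hypothetical stand-in for a DoFn runner built around a single context. */
  static final class CountingRunner {
    private final KeyedContext context;
    CountingRunner(KeyedContext context) { this.context = context; }

    /** Counts the elements seen for this runner's key; returns the running count. */
    int processElement(String value) {
      Object previous = context.stateInternals().read("count");
      int next = (previous == null ? 0 : (Integer) previous) + 1;
      context.stateInternals().write("count", next);
      return next;
    }
  }

  /** Runner-side bookkeeping: one KeyState per key, created on first use. */
  static final class KeyedStateStore {
    private final Map<String, KeyState> byKey = new HashMap<>();
    KeyState forKey(String key) {
      return byKey.computeIfAbsent(key, k -> new KeyState());
    }
  }

  /** Processes (key, value) pairs, building a fresh runner for each element's key. */
  static List<Integer> run(KeyedStateStore store, String[][] elements) {
    List<Integer> counts = new ArrayList<>();
    for (String[] kv : elements) {
      // Specialize the context to this key first, then create the runner.
      KeyedContext context = new KeyedContext(store.forKey(kv[0]));
      counts.add(new CountingRunner(context).processElement(kv[1]));
    }
    return counts;
  }

  public static void main(String[] args) {
    String[][] input = {{"a", "x"}, {"b", "y"}, {"a", "z"}};
    System.out.println(run(new KeyedStateStore(), input)); // prints [1, 1, 2]
  }
}
```

With a single shared KeyState instead of the per-key lookup, the same input
would yield 1, 2, 3, which is the "global state" behaviour described in the
original question.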



> Best,
>
> Shen
>
> On Thu, Feb 23, 2017 at 10:16 PM, Kenneth Knowles <k...@google.com.invalid>
> wrote:
>
> > Hi Shen,
> >
> > The way that this is done is that the StepContext.stateInternals() is
> > specialized to be per-key by the runner, before you create the
> > SimpleDoFnRunner. Does this help?
> >
> > Kenn
> >
> > On Thu, Feb 23, 2017 at 3:03 PM, Shen Li <cs.she...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I am trying to understand how a runner can support the Stateful ParDo.
> > > Currently our runner relies on the SimpleDoFnRunner (Beam-0.5.0). But it
> > > cannot pass ParDoTest#testValueStateSideOutput (
> > > https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java#L1607
> > > )
> > >
> > > The reason is that it maintains a global state instead of a *per-key*
> > > state as explained in the latest Beam blog. I scanned through the
> > > implementation of the SimpleDoFnRunner, and it does not seem to consider
> > > the *per-key* state either (
> > > https://github.com/apache/beam/blob/master/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java#L626
> > > ).
> > >
> > > May I ask what the recommended way to handle stateful ParDo is?
> > >
> > > Thanks,
> > >
> > > Shen
> > >
> >
>
