[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16068740#comment-16068740
 ] 

Eugene Kirpichov commented on BEAM-2140:
----------------------------------------

To elaborate on what Kenn said: we dug a bit deeper into this yesterday and 
came to the following conclusions.

1) This works in the Dataflow and Direct runners because, for running SDF, 
they use a code path that simply _does not drop late data/timers or GC state_. 
Dropping and GC happen in LateDataDroppingRunner, ReduceFnRunner and 
StatefulDoFnRunner, and the path for running ProcessFn involves none of these. 
Aljoscha, could you check why your current code paths for running ProcessFn in 
Flink drop late data / late timers, and change them so that they don't? :) 
(I'm not sure where this dropping happens in Flink.)
2) As a consequence, however, state doesn't get GC'd. This means that if you 
apply an SDF to input that is in many windows (e.g. input windowed by fixed or 
sliding windows), it will slowly leak state. In practice this is likely not a 
huge concern, because SDFs are expected to be used mostly when the amount of 
input is not very large (at least compared to the output) and the input is 
usually globally windowed, especially in streaming use cases. I.e. it can be 
treated as a "Known issue" rather than "SDF does not work at all". *I would 
recommend proceeding to implement it in Flink runner with this same known 
issue*, and then solving the issue uniformly across all runners.
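
To make the leak in (2) concrete, here is a toy model in plain Python. It is 
not Beam code and does not reflect how any runner is implemented. The 
function names, the window size, and the simplification that the watermark 
equals the largest timestamp seen so far are all assumptions made for 
illustration. It only counts how many per-window state cells are left behind 
with and without GC.

```python
from collections import defaultdict

WINDOW_SIZE = 10  # hypothetical fixed-window size, in arbitrary time units


def window_start(timestamp, size=WINDOW_SIZE):
    """Start of the fixed window containing `timestamp`."""
    return timestamp - (timestamp % size)


def run(timestamps, gc_state, globally_windowed=False):
    """Process elements in order; return the number of state cells left behind.

    Assumption: the watermark is the largest timestamp seen so far, and a
    window counts as expired once the watermark reaches its end.
    """
    state = defaultdict(int)  # window key -> per-window state cell
    watermark = 0
    for ts in timestamps:
        watermark = max(watermark, ts)
        # The global window is modeled as a single key (None).
        key = None if globally_windowed else window_start(ts)
        state[key] += 1
        if gc_state and not globally_windowed:
            # Drop state for every window that has already expired.
            expired = [w for w in state if w + WINDOW_SIZE <= watermark]
            for w in expired:
                del state[w]
    return len(state)


timestamps = list(range(0, 1000))  # 100 fixed windows of size 10
leaky = run(timestamps, gc_state=False)
collected = run(timestamps, gc_state=True)
global_only = run(timestamps, gc_state=False, globally_windowed=True)
```

In this model, the run without GC keeps one cell per window it has ever seen, 
the run with GC keeps only the window that is still open, and the globally 
windowed run keeps a single cell even with no GC. That last case is why the 
leak is tolerable for the common globally windowed usage.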

I'm posting this comment for now and will write another on how to do it 
without state leakage.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> -------------------------------------------------------
>
>                 Key: BEAM-2140
>                 URL: https://issues.apache.org/jira/browse/BEAM-2140
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
