[ https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16068740#comment-16068740 ]
Eugene Kirpichov commented on BEAM-2140:
----------------------------------------

So, to elaborate on what Kenn said. We dug a bit deeper into this yesterday and came up with the following conclusions.

1) The reason that this stuff works in Dataflow and Direct runner is that, for running SDF, they use a code path that simply _does not drop late data/timers or GC state_. These happen in LateDataDroppingRunner and ReduceFnRunner and StatefulDoFnRunner - and the path for running ProcessFn does not involve any of these. Aljoscha, maybe you can see why your current codepaths for running ProcessFn in Flink involve dropping of late data / late timers, and make them not involve it? :) (I'm not sure where this dropping happens in Flink)

2) As a consequence, however, state doesn't get GC'd. In practice this means that, if you apply an SDF to input that is in many windows (e.g. to input windowed by fixed or sliding windows), it will slowly leak state. However, in practice this is likely not a huge concern because SDFs are expected to mostly be used when the amount of input is not super large (at least compared to output), and it is usually globally windowed. Especially in streaming use cases.

I.e. it can be treated as a "Known issue" rather than "SDF does not work at all".

*I would recommend proceeding to implement it in Flink runner with this same known issue*, and then solving the issue uniformly across all runners.

Posting this comment for now and writing another on how to do it without state leakage.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> -------------------------------------------------------
>
>                 Key: BEAM-2140
>                 URL: https://issues.apache.org/jira/browse/BEAM-2140
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled
> the tests to unblock the open PR for BEAM-1763.