[ https://issues.apache.org/jira/browse/FLINK-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16591511#comment-16591511 ]
eugen yushin commented on FLINK-10050:
--------------------------------------

[~aljoscha]
No operator in Flink retains window information in its output.
Docs: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/windows.html#working-with-window-results
```
The result of a windowed operation is again a {{DataStream}}, no information about the windowed operations is retained in the result elements
```

At the same time, coGroup/join preserves element timestamps, so downstream operators can assign elements to their respective windows.
Docs: [https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/joining.html#window-join]
```
Those elements that do get joined will have as their timestamp the largest timestamp that still lies in the respective window. For example a window with {{[5, 10)}} as its boundaries would result in the joined elements having 9 as their timestamp.
```

Business case: two streams, one carrying business metrics and the other carrying similar metrics derived from microservice logs; the result is a reconciliation of the two. No operators other than a sink are needed for this particular business case.

> Support 'allowedLateness' in CoGroupedStreams
> ---------------------------------------------
>
>                 Key: FLINK-10050
>                 URL: https://issues.apache.org/jira/browse/FLINK-10050
>             Project: Flink
>          Issue Type: Improvement
>          Components: Streaming
>    Affects Versions: 1.5.1, 1.6.0
>            Reporter: eugen yushin
>            Priority: Major
>              Labels: ready-to-commit, windows
>
> WindowedStream supports the 'allowedLateness' feature, while
> CoGroupedStreams does not. At the same time, WindowedStream is an inner part
> of CoGroupedStreams and all main functionality (like evictor/trigger/...) is
> simply delegated to WindowedStream.
> There is no way to handle late-arriving data from previous steps in
> cogroups (and joins). Consider the following flow:
> a. read data from source1 -> aggregate data with allowed lateness
> b. read data from source2 -> aggregate data with allowed lateness
> c. cogroup/join streams a and b, and compare aggregated values
> Step c doesn't accept any late data from steps a/b due to the lack of an
> `allowedLateness` API call in CoGroupedStreams.java.
> Scope: add method `WithWindow.allowedLateness` to the Java API
> (flink-streaming-java) and extend the Scala API (flink-streaming-scala).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
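
For illustration, the timestamp rule quoted from the docs and the late-data problem in step c can be sketched in plain Java. This is only a rough model, not Flink code: the class and method names (`LatenessSketch`, `windowStart`, `maxTimestamp`, `isLate`) are hypothetical, and it assumes tumbling event-time windows with zero offset, non-negative timestamps, and that a window is treated as late once the watermark reaches its maximum timestamp plus the allowed lateness.

```java
// Hypothetical sketch; none of these names are part of the Flink API.
class LatenessSketch {

    // Start of the tumbling window that contains `timestamp`
    // (assumes zero offset and non-negative timestamps).
    static long windowStart(long timestamp, long size) {
        return timestamp - (timestamp % size);
    }

    // Largest timestamp that still lies in [start, start + size).
    static long maxTimestamp(long start, long size) {
        return start + size - 1;
    }

    // Assumed rule: the element's window is late once the watermark
    // has reached maxTimestamp + allowedLateness.
    static boolean isLate(long elementTs, long size, long allowedLateness, long watermark) {
        long start = windowStart(elementTs, size);
        return maxTimestamp(start, size) + allowedLateness <= watermark;
    }

    public static void main(String[] args) {
        long size = 5;

        // Window [5, 10): results carry timestamp 9, as in the docs quote.
        long resultTs = maxTimestamp(windowStart(7, size), size);
        System.out.println("result timestamp: " + resultTs);

        // A late firing of step a/b emits a result with timestamp 9
        // after the watermark has already advanced to 12.
        long watermark = 12;
        System.out.println("late without allowed lateness: "
                + isLate(resultTs, size, 0, watermark));
        System.out.println("late with allowed lateness 5: "
                + isLate(resultTs, size, 5, watermark));
    }
}
```

Under these assumptions, the result emitted late by step a or b is dropped in step c when the allowed lateness is zero, and accepted when step c is given a lateness of 5, which is the behaviour the requested `allowedLateness` call on CoGroupedStreams is meant to enable.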