Jun Zhang created FLINK-19896:
---------------------------------

             Summary: Support first-n-rows deduplication in the Deduplicate operator
                 Key: FLINK-19896
                 URL: https://issues.apache.org/jira/browse/FLINK-19896
             Project: Flink
          Issue Type: Improvement
          Components: Table SQL / Runtime
    Affects Versions: 1.12.0, 1.11.3
            Reporter: Jun Zhang
             Fix For: 1.11.2

Currently, the Deduplicate operator supports only first-row deduplication (ordered by proc-time). For first-n-rows deduplication, the planner has to fall back to the Rank operator. However, the Rank operator is less efficient than Deduplicate in terms of state consumption. This issue proposes extending DeduplicateKeepFirstRowFunction to support first-n-rows deduplication.
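
To illustrate the idea, here is a minimal sketch in plain Java of what first-n-rows deduplication under proc-time ordering could look like. It is not Flink's implementation and does not use Flink's state API: the class `FirstNRowsDeduplicator` and its `accept` method are hypothetical names, and the in-memory `HashMap` stands in for keyed state. The sketch assumes rows arrive in processing-time order, so arrival order is the sort order and a per-key counter is the only state needed.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical illustration only; not Flink's DeduplicateKeepFirstRowFunction.
 * Keeps the first n rows per key, assuming rows arrive in proc-time order.
 */
final class FirstNRowsDeduplicator<K> {
    private final int n;

    // Stand-in for keyed state: one counter per key, no buffered rows.
    private final Map<K, Integer> emittedCount = new HashMap<>();

    FirstNRowsDeduplicator(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        this.n = n;
    }

    /** Returns true if the incoming row is among the first n for its key. */
    boolean accept(K key) {
        int seen = emittedCount.getOrDefault(key, 0);
        if (seen >= n) {
            return false; // Key already has n rows; drop this one.
        }
        emittedCount.put(key, seen + 1);
        return true;
    }
}
```

With `n = 1` this reduces to first-row deduplication. A caller would invoke `accept(key)` for each incoming row and forward the row downstream only when it returns `true`.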