Jun Zhang created FLINK-19896:
---------------------------------
Summary: Support first-n-rows deduplication in the Deduplicate operator
Key: FLINK-19896
URL: https://issues.apache.org/jira/browse/FLINK-19896
Project: Flink
Issue Type: Improvement
Components: Table SQL / Runtime
Affects Versions: 1.12.0, 1.11.3
Reporter: Jun Zhang
Fix For: 1.11.2
Currently, the Deduplicate operator supports only first-row deduplication
(ordered by proc-time). For first-n-rows deduplication, the planner has to
fall back to the Rank operator. However, the Rank operator is less efficient
than Deduplicate in terms of state consumption.
This issue proposes to extend DeduplicateKeepFirstRowFunction to support
first-n-rows deduplication.
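
As a rough illustration of the idea (not Flink's actual implementation), the
sketch below keeps the first n rows per key using only one counter per key.
The class name FirstNRowsDeduplicator and the accept method are hypothetical,
and a plain HashMap stands in for Flink's keyed state. Rows are assumed to
arrive in proc-time order, so arrival order is the sort order.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: passes through only the first n rows seen per key. */
class FirstNRowsDeduplicator<K> {
    private final int n;

    // Assumption: a HashMap stands in for keyed state. The only state kept
    // per key is a single counter; no rows are buffered.
    private final Map<K, Integer> seenPerKey = new HashMap<>();

    FirstNRowsDeduplicator(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        this.n = n;
    }

    /** Returns true if the row with this key should be emitted downstream. */
    boolean accept(K key) {
        int seen = seenPerKey.getOrDefault(key, 0);
        if (seen >= n) {
            return false; // n rows already emitted for this key; drop the row
        }
        seenPerKey.put(key, seen + 1);
        return true;
    }
}
```

With n = 1 this reduces to first-row deduplication. Because rows are emitted
in arrival order and never retracted, a counter per key is sufficient in this
sketch.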
--
This message was sent by Atlassian Jira
(v8.3.4#803005)