[ https://issues.apache.org/jira/browse/FLINK-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379599#comment-15379599 ]
Gabor Gevay edited comment on FLINK-3279 at 7/15/16 4:00 PM:
-------------------------------------------------------------

I think no. And a Jira is also needed for the sum, max, etc. aggregations. (Maybe these two things can be in one Jira.)

was (Author: ggevay):
https://issues.apache.org/jira/browse/FLINK-3479?

> Optionally disable DistinctOperator combiner
> --------------------------------------------
>
>                 Key: FLINK-3279
>                 URL: https://issues.apache.org/jira/browse/FLINK-3279
>             Project: Flink
>          Issue Type: New Feature
>          Components: DataSet API
>    Affects Versions: 1.0.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>            Priority: Minor
>
> Calling {{DataSet.distinct()}} executes {{DistinctOperator.DistinctFunction}}
> which is a combinable {{RichGroupReduceFunction}}. Sometimes we know that
> there will be few duplicate records and disabling the combine would improve
> performance.
> I propose adding {{public DistinctOperator<T> setCombinable(boolean
> combinable)}} to {{DistinctOperator}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
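
For readers unfamiliar with the trade-off behind the proposal in the issue description, here is a minimal sketch of it in plain Java. It does not use Flink and is not Flink's implementation: the class DistinctSketch, its Result type, and the distinct(partitions, combinable) method are hypothetical names invented for illustration. The sketch assumes a simplified two-phase model in which each partition may deduplicate its records locally (the combine step) before they are sent to a final, global deduplication step.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a two-phase distinct. Not Flink code; all names are hypothetical. */
public class DistinctSketch {

    /** Outcome of one run: the distinct values and how many records were shuffled. */
    static final class Result<T> {
        final Set<T> distinct;
        final int shuffledRecords;

        Result(Set<T> distinct, int shuffledRecords) {
            this.distinct = distinct;
            this.shuffledRecords = shuffledRecords;
        }
    }

    /**
     * Computes the distinct elements of a partitioned input.
     * If combinable is true, each partition is deduplicated locally before
     * its records are passed on to the final reduce step.
     */
    static <T> Result<T> distinct(List<List<T>> partitions, boolean combinable) {
        List<T> shuffled = new ArrayList<T>();
        for (List<T> partition : partitions) {
            // Combine step: local deduplication costs extra work per record,
            // but shrinks what is sent onward when duplicates are common.
            Collection<T> toSend = combinable ? new LinkedHashSet<T>(partition) : partition;
            shuffled.addAll(toSend);
        }
        // Final reduce step: global deduplication, needed in either mode.
        Set<T> result = new LinkedHashSet<T>(shuffled);
        return new Result<T>(result, shuffled.size());
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 1, 2),
                Arrays.asList(2, 3, 3));

        Result<Integer> withCombine = distinct(partitions, true);
        Result<Integer> withoutCombine = distinct(partitions, false);

        System.out.println("combinable=true:  " + withCombine.distinct
                + ", shuffled " + withCombine.shuffledRecords);
        System.out.println("combinable=false: " + withoutCombine.distinct
                + ", shuffled " + withoutCombine.shuffledRecords);
    }
}
```

Both modes return the same distinct set. With this sample input the combine step reduces the shuffled records from 6 to 4. When a partition contains few duplicates, the combine step removes almost nothing and its per-record work is mostly overhead, which is the case the issue wants to make optional.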