Github user fhueske commented on the issue:

    https://github.com/apache/flink/pull/5555
  
    Hi @walterddr and @hequn8128, thanks for the PR, review, and discussions.
    
    The current implementation, with the `DistinctAggDelegateFunction` and its 
accumulators, takes the path of user-defined code, although `DISTINCT` is a 
query feature that does not require user code. 
    Wouldn't it be easier if we just added two parameters to the 
`AggregationCodeGenerator.generateAggregations()` methods:
    
    - distinctAggs: Array[Boolean]
    - stateBackedDistinct: Option[Boolean]
    
    and handle the distinct in the generated code? Given this information, we 
can configure the required MapViews (also reusing them across multiple 
aggregation functions). Also, we would not need to wrap an aggregation 
function; the generated code could access the MapView directly and check 
whether the input is distinct or not.
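    
    To make the idea a bit more concrete, here is a rough sketch in plain 
Scala of what the generated per-row logic could look like. It is not Flink 
code: `DistinctFilter` is a hypothetical stand-in for a MapView that maps a 
value to its count, `SumAgg` is a made-up minimal aggregate, and 
`DistinctSketch.run` only mimics what generated code might do. The 
`stateBackedDistinct` parameter is left out; it would only decide whether the 
map lives on the heap or in a state backend.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a MapView[value, count]. Whether the map is
// heap-backed or state-backed would be decided by stateBackedDistinct.
final class DistinctFilter[T] {
  private val counts = mutable.Map.empty[T, Long]

  // Returns true if the value was not seen before, i.e. it should be accumulated.
  def add(value: T): Boolean = {
    val c = counts.getOrElse(value, 0L)
    counts.update(value, c + 1)
    c == 0L
  }

  // Returns true if the last occurrence was removed, i.e. it should be retracted.
  def remove(value: T): Boolean = counts.get(value) match {
    case Some(1L) => counts.remove(value); true
    case Some(c)  => counts.update(value, c - 1); false
    case None     => false
  }
}

// Hypothetical, simplified aggregate: a running sum with an unwrapped accumulator.
final class SumAgg {
  var sum: Long = 0L
  def accumulate(v: Long): Unit = sum += v
  def retract(v: Long): Unit = sum -= v
}

object DistinctSketch {
  // distinctAggs(i) says whether the i-th aggregate is DISTINCT.
  def run(inputs: Seq[Long], distinctAggs: Array[Boolean]): Array[Long] = {
    val aggs = Array.fill(distinctAggs.length)(new SumAgg)
    // One filter is shared by all distinct aggregates over the same input field.
    val filter = new DistinctFilter[Long]
    for (v <- inputs) {
      val firstOccurrence = filter.add(v)
      var i = 0
      while (i < aggs.length) {
        // Non-distinct aggregates see every row, distinct ones only new values.
        if (!distinctAggs(i) || firstOccurrence) aggs(i).accumulate(v)
        i += 1
      }
    }
    aggs.map(_.sum)
  }
}
```

    The point of the sketch is that the aggregation functions and their 
accumulators stay untouched; only the call site decides whether `accumulate` 
(or `retract`, using `remove`) is invoked for a given row.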
    
    This would mean a bit more implementation effort for the code generation, 
but it would be the cleaner design because we would not need to wrap 
aggregation functions and accumulators. It would avoid all problems with 
nested map views and make the planning code simpler.
    
    What do you think @walterddr, @hequn8128?

