dawidwys commented on PR #23411:
URL: https://github.com/apache/flink/pull/23411#issuecomment-1892398690

   https://github.com/apache/flink/pull/23411#pullrequestreview-1770150649
   
   It took me a while to understand that myself, but I think it's actually ok 
to depend on the `Object#equals/hashcode` in `AggregateFunctions` @snuyanzin 
what you said is correct for `ArrayDistinctFunction`, because:
   1. it is a scalar function, so it can be chained with other functions 
without an exchange before it
   2. It gets data from two input(s) (previous operator and e.g. a literal) 
haystack and needle may come from different operators which may produce data in 
different formats e.g. `GenericRowData` and `BinaryRowData`
   
   In `AggregateFunctions` we will always have all records as `BinaryRowData` 
(and alike) and those `equals/hashcode` should work just fine. (It may be a 
different story when we support `STRUCTURED_TYPE#equals/hashcode`, but we will 
need to revisit most of the operators then, because all `MapView(s)` will work 
incorrectly). This is what we do in other aggregate functions already. Take a 
look at e.g. `JsonArrayAggFunction` or `FirstValueWithRetractAggFunction `
   
   I haven't seen the previous version @Jiabao-Sun , but I believe we can 
remove the `equalityHandler` and make the `ArrayAggFunction` look similar to 
`JsonArrayAggFunction`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to