[ https://issues.apache.org/jira/browse/FLINK-31060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leonard Xu reassigned FLINK-31060:
----------------------------------

    Assignee: dalongliu

> Release Testing: Verify FLINK-30542 Support adaptive local hash aggregate in runtime
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-31060
>                 URL: https://issues.apache.org/jira/browse/FLINK-31060
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / Runtime
>    Affects Versions: 1.17.0
>            Reporter: Yunhong Zheng
>            Assignee: dalongliu
>            Priority: Major
>             Fix For: 1.17.0
>
>
> This issue aims to verify FLINK-30542: Support adaptive local hash aggregate
> in runtime.
> Adaptive local hash aggregation is an optimization of local hash aggregation.
> It samples the input and uses the distinct value rate of the sampled data to
> decide whether local hash aggregation is still worthwhile. If the distinct
> value rate is greater than the configured threshold (see parameter:
> 'table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold'), the
> operator stops aggregating and sends the input data downstream after a
> simple projection. Otherwise, it continues to aggregate.
> We can verify it in the SQL client after building the flink-dist package.
> # First, create a source table. (Note: the source table needs to support
> different degrees of aggregation, meaning the distinct count can be
> controlled by the source connector. We recommend modifying the dataGen table
> source so that it produces data with different numbers of distinct rows.)
> # Verify the result with different distinct value rates. (See:
> table.exec.local-hash-agg.adaptive.distinct-value-rate-threshold)
> # Check the 'TM' log to see whether the adaptive local hash aggregate
> takes effect.
> If you run into any problems, feel free to ping me directly.
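
For anyone picking up this verification, the adaptive decision described in the issue can be sketched roughly as follows. This is only an illustrative model in Python, not the Flink runtime code: the function name `adaptive_local_sum`, the `sampling_threshold` parameter, and the use of a plain SUM are assumptions made for the example. Only the distinct value rate threshold comes from the issue text.

```python
from typing import Dict, Iterable, Iterator, Tuple


def adaptive_local_sum(
    rows: Iterable[Tuple[str, int]],
    sampling_threshold: int = 4,                 # hypothetical: rows to sample before deciding
    distinct_value_rate_threshold: float = 0.5,  # models the threshold named in the issue
) -> Iterator[Tuple[str, int]]:
    """Yield partial (key, sum) pairs, giving up on local aggregation
    when the sampled data has too many distinct keys."""
    buffer: Dict[str, int] = {}
    seen = 0
    bypass = False  # once True, rows are forwarded without aggregation

    for key, value in rows:
        if bypass:
            # Aggregation was abandoned: forward the row as its own partial result.
            yield key, value
            continue

        buffer[key] = buffer.get(key, 0) + value
        seen += 1

        if seen == sampling_threshold:
            distinct_value_rate = len(buffer) / seen
            if distinct_value_rate > distinct_value_rate_threshold:
                # Too many distinct keys: flush what was aggregated so far
                # and stop aggregating for the remaining input.
                yield from buffer.items()
                buffer.clear()
                bypass = True

    # Emit whatever is still buffered at end of input.
    yield from buffer.items()


if __name__ == "__main__":
    high_cardinality = [("a", 1), ("b", 1), ("c", 1), ("d", 1), ("a", 1), ("a", 1)]
    low_cardinality = [("a", 1), ("a", 1), ("b", 1), ("a", 1), ("b", 1)]
    print(list(adaptive_local_sum(high_cardinality)))
    print(list(adaptive_local_sum(low_cardinality)))
```

With the high-cardinality input the sampled rate is 4/4, so the sketch flushes and then forwards the remaining rows unaggregated. With the low-cardinality input the rate is 2/4, which does not exceed the threshold, so aggregation continues. The global aggregate downstream would produce the same final sums in both cases, which is what the verification steps in the issue are meant to confirm.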