Hi all: 我们有个streaming sql得到的结果不正确,现象是sink得到的数据一会大一会小,我们想确认下,这是否是个bug, 或者flink还不支持这种sql。 具体场景是:先group by A, B两个维度计算UV,然后再group by A 把维度B的UV sum起来,对应的SQL如下:(A -> dt, B -> pvareaid) SELECT dt, SUM(a.uv) AS uv FROM ( SELECT dt, pvareaid, COUNT(DISTINCT cuid) AS uv FROM streaming_log_event WHERE action IN ('action1') AND pvareaid NOT IN ('pv1', 'pv2') AND pvareaid IS NOT NULL GROUP BY dt, pvareaid ) a GROUP BY dt; sink接收到的数据对应日志为: 2020-04-17 22:28:38,727 INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(false,0,86,20200417) 2020-04-17 22:28:38,727 INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(true,0,130,20200417) 2020-04-17 22:28:39,327 INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(false,0,130,20200417) 2020-04-17 22:28:39,327 INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(true,0,86,20200417) 2020-04-17 22:28:39,327 INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(false,0,86,20200417) 2020-04-17 22:28:39,328 INFO groupBy xx -> to: Tuple2 -> Sink: Unnamed (1/1) (GeneralRedisSinkFunction.invoke:169) - receive data(true,0,131,20200417)
我们使用的是1.7.2, 测试作业的并行度为1。 这是对应的 issue: https://issues.apache.org/jira/browse/FLINK-17228 dixingxin...@163.com