Hi all, I need to aggregate a field of my events. At first I used keyBy(), but Flink ran very slowly (and eventually stopped producing results) because there are around half a million distinct keys per minute. So I switched to windowAll(), and Flink then worked as expected.
keyBy() on the field produces one key per distinct field value, so when the number of distinct values is huge, Flink struggles with both CPU and memory. Was this case considered in Flink's design? Since windowAll() does not let me set the parallelism, I tried a key selector that keys on a hash of the field rather than on its value, hoping to reduce the number of keys, but Flink throws a key out-of-range exception. What is the correct way to use a key selector here? In Storm this is easy to achieve: use fieldsGrouping to connect the spout and the bolt.
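
To make the idea concrete, here is a minimal sketch of the bucketing I have in mind, written as plain Java so it runs without Flink on the classpath. The class name `BucketKey`, the constant `NUM_BUCKETS`, and the value 128 are hypothetical and only for illustration; in a real job I assume the body of `bucketOf` would go inside the `getKey` method of a `KeySelector<Event, Integer>`, where `Event` stands for my own event type.

```java
// Hypothetical helper: maps a field value to one of a fixed number of buckets,
// so the number of distinct keys is bounded by NUM_BUCKETS instead of by the
// number of distinct field values.
final class BucketKey {
    // Assumed bucket count, chosen only for this sketch.
    static final int NUM_BUCKETS = 128;

    // String.hashCode() is specified by the language, so the same field value
    // always maps to the same bucket. Math.floorMod keeps the result in
    // [0, NUM_BUCKETS) even when hashCode() is negative.
    static int bucketOf(String fieldValue) {
        return Math.floorMod(fieldValue.hashCode(), NUM_BUCKETS);
    }

    public static void main(String[] args) {
        String[] samples = {"user-1", "user-2", "user-42"};
        for (String s : samples) {
            System.out.println(s + " -> bucket " + bucketOf(s));
        }
    }
}
```

With this approach each bucket receives many different field values, so I assume the window function would still have to keep a per-value map inside each bucket to produce the per-field aggregate.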