Hi, I was wondering whether it would be safe to use reinterpretAsKeyedStream on a Kafka source in order to get an "embarrassingly parallel" job without any .keyBy().
My Kafka topic is partitioned by the same id that I then send through a session window operator, so in theory no data needs to be transferred between subtasks (between the Kafka source and the windowing operator). Can I avoid that shuffle by using reinterpretAsKeyedStream on the source?

What worries me is this warning from the docs: "WARNING: The re-interpreted data stream MUST already be pre-partitioned in EXACTLY the same way Flink's keyBy would partition the data in a shuffle w.r.t. key-group assignment." Of course, the partitioning in Kafka will not be *exactly* the same... What problems might this cause? I tried it on a very small subset of data and didn't notice any issues.

-- Piotr Domagalski
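P.S. To make my worry concrete, here is a toy simulation (not actual Flink or Kafka internals; the hash functions, constants, and key names are stand-ins I made up) of how a topic "partitioned by key" can still route a key to a different slot than Flink's key-group assignment would:

```python
# Toy illustration: Kafka's partitioner and Flink's key-group assignment
# hash keys differently, so "partitioned by id in Kafka" does not imply
# the same subtask assignment that Flink's keyBy would produce.
import zlib

NUM_KAFKA_PARTITIONS = 4   # hypothetical topic setup
PARALLELISM = 4            # hypothetical job parallelism
MAX_PARALLELISM = 128      # stand-in for Flink's max parallelism

def kafka_partition(key: str) -> int:
    # Stand-in for Kafka's default (murmur2-based) partitioner.
    return zlib.crc32(key.encode()) % NUM_KAFKA_PARTITIONS

def flink_subtask(key: str) -> int:
    # Stand-in for Flink's scheme: a key maps to a key group
    # (hash modulo max parallelism), and key groups map to subtasks.
    key_group = zlib.adler32(key.encode()) % MAX_PARALLELISM
    return key_group * PARALLELISM // MAX_PARALLELISM

keys = [f"user-{i}" for i in range(100)]
mismatches = [k for k in keys if kafka_partition(k) != flink_subtask(k)]
print(f"{len(mismatches)} of {len(keys)} keys would land on a different subtask")
```

Even when both sides use the same number of slots, the two hash schemes disagree for many keys, which is the mismatch the docs' warning seems to be about.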