Ok, then we are on the same page, but I disagree with your conclusion. Flink has to do the deep copy because it doesn't state that inputs are immutable and must not be changed. In Beam, the user is not supposed to modify the input collection, and if they do, the behavior is undefined. That is why the DirectRunner checks for this: to make sure users are not relying on mutation.

It's not written anywhere that the input cannot be mutated. A DirectRunner test is not a proof. Any runner could add a test which proves the opposite. In fact we may have one that checks copying for Flink.

I prefer safety and correctness over performance because I've seen too many cases where users shoot themselves in the foot. We should make sure that, by default, the user cannot modify the input element. An option to disable that is fine.
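To make the proposal concrete, here is a minimal sketch of what I mean by "copy by default, with an option to disable". It uses plain Java serialization rather than Beam coders, and the names `DefensiveCopySketch`, `process`, and `copyInputs` are made up for illustration; they are not Beam or Flink API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.function.Consumer;

// Hypothetical sketch only; a real runner would copy through the element's coder.
public class DefensiveCopySketch {

    // Deep-copies a value by round-tripping it through Java serialization.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not copy element", e);
        }
    }

    // Hands the element to user code. The user sees a copy unless the
    // (hypothetical) copyInputs flag has been switched off.
    static <T extends Serializable> void process(
            T element, Consumer<T> userFn, boolean copyInputs) {
        userFn.accept(copyInputs ? deepCopy(element) : element);
    }

    public static void main(String[] args) {
        ArrayList<String> input = new ArrayList<>();
        input.add("a");

        // Default: the user function mutates a copy, the original is untouched.
        process(input, list -> list.add("mutated"), true);
        System.out.println(input); // prints [a]

        // Opt-out: the user function mutates the shared instance.
        process(input, list -> list.add("mutated"), false);
        System.out.println(input); // prints [a, mutated]
    }
}
```

With the flag on, a user who mutates the input only hurts their own copy. With it off, they get the performance back and take on the responsibility themselves.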

-Max
