The memory problem was caused by the last superstep sometimes not being empty. Its result was then written to the back channel after that channel had already been released.
When the iterative input came through a broadcast variable and the operator's main input came from the cache, a step could run with an empty broadcast variable but a full regular input. Depending on the user code, such a step may still produce results.
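
To illustrate why the outcome depends on the user code, here is a minimal, framework-free sketch in Python. All names (`tag_with_broadcast_size`, `tag_guarded`, `cached_input`) are hypothetical and not taken from the actual code base; the sketch only simulates a single step in which the broadcast variable is empty while the regular input is full.

```python
def tag_with_broadcast_size(records, broadcast):
    """Hypothetical user function that ignores whether the broadcast set is empty."""
    # Emits one result per regular-input record, even with an empty broadcast variable.
    return [(record, len(broadcast)) for record in records]


def tag_guarded(records, broadcast):
    """Hypothetical user function that emits nothing without broadcast data."""
    if not broadcast:
        return []
    return [(record, len(broadcast)) for record in records]


# Assumed scenario: the main input is served from the cache and is therefore full,
# while the iterative input (the broadcast variable) is already empty.
cached_input = ["a", "b", "c"]
empty_broadcast = []

unguarded_result = tag_with_broadcast_size(cached_input, empty_broadcast)
guarded_result = tag_guarded(cached_input, empty_broadcast)

print(len(unguarded_result))  # number of records the unguarded function still emits
print(len(guarded_result))    # number of records the guarded function emits
```

In this sketch only the unguarded function yields output in the supposedly empty step, which is the kind of result that would end up in the released back channel.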
