Hi all!
Now that we are coming to the next release, I wanted to make sure we
finalize the decision on that point, because it would be nice to not break
the behavior of the system afterwards.
Right now, when tasks are chained together, the system always copies the
elements between different tasks in t
Hey guys,
Have we disabled the default input copying after all? I don't remember
seeing a Jira or PR for this (maybe I just missed it).
And if not, do we want this in the 0.10 release?
Cheers,
Gyula
On Fri, Oct 2, 2015 at 7:57 PM, Till Rohrmann wrote:
> Do we know what kind of impact the non-
I don't recall that the default policy was changed.
If we change it, it would be a good idea to change it for 0.10, or at the
latest for 1.0.
One thing I realized is that to get predictable behavior with chaining, we
should not do the special case parallelism 1 chaining (meaning shuffle
operations get cha
+1 for disable copy by default
On 10/02/2015 05:53 PM, Stephan Ewen wrote:
> Hi all!
>
> Now that we are coming to the next release, I wanted to make sure we
> finalize the decision on that point, because it would be nice to not break
> the behavior of the system afterwards.
>
> Right now, when tas
It seems like I'm one of the few people that run into the mutable elements
trap on the Batch API from time to time. At the moment I always clone when
I'm not 100% sure to avoid hunting the bugs later. So far I was happy to
learn that this is not a problem in Streaming, but that's just me.
When wor
@Martin:
I think you were a user of the Batch API before we made the non-reuse mode
the default mode.
By now, when you use a GroupReduceFunction or a MapPartitionFunction or so,
you need not do any cloning or copying. All functions that receive groups
will always get fresh elements.
This chaining
+1 Good idea. I think we can save quite some CPU cycles by not copying
records.
> That is basically the behavior of the batch API, and there has so far never
> been an issue with that (people running into the trap of overwritten
> mutable elements).
As far as I know, this is only the case for chai
Do we know what kind of impact the non-reuse policy has? Maybe the
serialization overhead is subsumed by other effects.
But in general I'm ok with changing the default to non-copying. We just
have to document this feature properly.
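
For the documentation, a small illustration of the "mutable elements trap" discussed in this thread might help. The following is only a plain-Java sketch, not Flink code: the `Event` type, the `run` method, and the `copyBetweenStages` flag are hypothetical names made up for this example. It assumes an upstream stage that keeps references to the elements it emits and a chained downstream stage that mutates its input in place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Flink API: two "chained" stages in one thread.
class ChainingCopySketch {

    /** Hypothetical mutable record type. */
    static final class Event {
        int value;
        Event(int value) { this.value = value; }
        Event copy() { return new Event(value); }
    }

    /**
     * Emits three elements from an upstream stage that buffers references to
     * them, and hands each one to a downstream stage that mutates its input.
     * Returns the values the upstream stage finds in its buffer afterwards.
     */
    static List<Integer> run(boolean copyBetweenStages) {
        List<Event> buffered = new ArrayList<>();
        for (int i = 1; i <= 3; i++) {
            Event e = new Event(i);
            buffered.add(e); // upstream keeps a reference to the element

            // With copying, downstream gets its own instance;
            // without copying, both stages share the same object.
            Event handedOver = copyBetweenStages ? e.copy() : e;
            handedOver.value *= 10; // downstream mutates in place
        }

        List<Integer> seen = new ArrayList<>();
        for (Event e : buffered) {
            seen.add(e.value);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("with copy:    " + run(true));  // [1, 2, 3]
        System.out.println("without copy: " + run(false)); // [10, 20, 30]
    }
}
```

With copying, the upstream buffer is unaffected by the downstream mutation; without it, the buffered elements change underneath the upstream stage. The copy is what costs the CPU cycles mentioned above.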
On Oct 2, 2015 6:31 PM, "Maximilian Michels" wrote:
> +1 Good id