Patrick,
My team is using shuffle consolidation but not speculation. We are also
using persist(DISK_ONLY) for caching.
Here are some config changes that are in our work-in-progress.
We've been trying for 2 weeks to get our production flow (maybe around
50-70 stages, a few forks and joins with
On Wed, Jun 18, 2014 at 6:19 PM, Surendranauth Hiraman
suren.hira...@velos.io wrote:
Patrick,
My team is using shuffle consolidation but not speculation. We are also
using persist(DISK_ONLY) for caching.
Use of shuffle consolidation is probably what is causing the issue.
Would be good idea
Just wondering, do you get this particular exception if you are not
consolidating shuffle data?
On Wed, Jun 18, 2014 at 12:15 PM, Mridul Muralidharan mri...@gmail.com wrote:
On Wed, Jun 18, 2014 at 6:19 PM, Surendranauth Hiraman
suren.hira...@velos.io wrote:
Patrick,
My team is using shuffle
Good question. At this point, I'd have to re-run it to know for sure. We've
been trying various different things, so I'd have to reset the flow config
back to that state.
I can say that by removing persist(DISK_ONLY), the flows are running more
stably, probably due to removing disk contention. We
Matt/Ryan,
Did you make any headway on this? My team is running into this also.
Doesn't happen on smaller datasets. Our input set is about 10 GB but we
generate 100s of GBs in the flow itself.
-Suren
On Fri, Jun 6, 2014 at 5:19 PM, Ryan Compton compton.r...@gmail.com wrote:
Just ran into
Out of curiosity - are you guys using speculation, shuffle
consolidation, or any other non-default option? If so that would help
narrow down what's causing this corruption.
On Tue, Jun 17, 2014 at 10:40 AM, Surendranauth Hiraman
suren.hira...@velos.io wrote:
Matt/Ryan,
Did you make any headway