Re: YARN Shuffle service and its compatibility

Reynold Xin Mon, 18 Apr 2016 15:10:12 -0700

Got it. So Mark is pushing for "best-effort" support.

IIUC, the reason for that PR is that they found the string comparison
increased the size in large shuffles. Maybe we should add support for the
short name to Spark 1.6.2?



On Mon, Apr 18, 2016 at 3:05 PM, Marcelo Vanzin <van...@cloudera.com> wrote:

> On Mon, Apr 18, 2016 at 2:02 PM, Reynold Xin <r...@databricks.com> wrote:
> > The bigger problem is that it is much easier to maintain backward
> > compatibility rather than dictating forward compatibility. For example, as
> > Marcin said, if we come up with a slightly different shuffle layout to
> > improve shuffle performance, we wouldn't be able to do that if we want to
> > allow Spark 1.6 shuffle service to read something generated by Spark 2.1.
>
> And I think that's really what Mark is proposing. Basically, "don't
> intentionally break backwards compatibility unless it's really
> required" (e.g. SPARK-12130). That would allow option B to work.
>
> If a new shuffle manager is created, then neither option A nor option
> B would really work. Moving all the shuffle-related classes to a
> different package, to support option A, would be really messy. At that
> point, you're better off maintaining the new shuffle service outside
> of YARN, which is rather messy too.
>
> The best would be if the shuffle service didn't really need to
> understand the shuffle manager, and could find files regardless; I'm
> not sure how feasible that is, though.
>
> --
> Marcelo
>
