Re: HDFS as Shuffle Service

2016-04-28 Thread Andrew Ray
Yes, HDFS has serious problems with creating lots of files. But we can always just create a single merged file on HDFS per task. On Apr 28, 2016 11:17 AM, "Reynold Xin" wrote: Hm while this is an attractive idea in theory, in practice I think you are substantially overestimating HDFS' ability to
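To illustrate the "single merged file per task" idea above, here is a minimal sketch. It uses an in-memory byte buffer as a stand-in for an HDFS file, and the function names (`write_merged`, `read_partition`) are hypothetical, not Spark or Hadoop APIs. Each task concatenates its per-reducer segments into one blob and keeps an offset index, so a reducer fetches its slice by byte range instead of opening a separate small file.

```python
import io


def write_merged(segments):
    """Concatenate per-reducer byte segments into one blob.

    Returns (blob, offsets), where offsets[i]:offsets[i + 1] is the byte
    range of reducer i's segment. Hypothetical helper: a real version
    would write to one HDFS file instead of an in-memory buffer.
    """
    data = io.BytesIO()
    offsets = [0]
    for segment in segments:
        data.write(segment)
        offsets.append(data.tell())  # end of this segment = start of the next
    return data.getvalue(), offsets


def read_partition(blob, offsets, reducer_id):
    """Return the slice of the merged blob that belongs to one reducer."""
    start, end = offsets[reducer_id], offsets[reducer_id + 1]
    return blob[start:end]


# One map task producing output for three reducers (the second is empty).
blob, offsets = write_merged([b"aa", b"", b"cccc"])
```

With this layout the number of files grows with the number of tasks rather than with tasks times reducers, which is the property the reply is relying on. Whether HDFS handles even that file count well is the open question raised in the thread.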

Re: HDFS as Shuffle Service

2016-04-28 Thread Mark Hamstra
Ah, got it. While that would be useful, it doesn't address the more general (and potentially even more beneficial) case where the total number of worker nodes is fully elastic. That already starts to push you in the direction of splitting Spark worker and HDFS data nodes into disjoint sets, and

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
Not disjoint. Colocated. By "shrinking", I don't mean any nodes are going away. I mean executors are decreasing in number, which is the case with dynamic allocation. HDFS nodes aren't decreasing in number though, and we can still colocate on those nodes, as always. On Thu, Apr 28, 2016 at 11:1

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
Yea, it's an open question. I'm willing to create some benchmarks, but I'd first like to know that the feature would be accepted assuming the results are reasonable. Can a committer give me a thumbs up? On Thu, Apr 28, 2016 at 11:17 AM, Reynold Xin wrote: > Hm while this is an attractive idea

Re: HDFS as Shuffle Service

2016-04-28 Thread Mark Hamstra
So you are only considering the case where your set of HDFS nodes is disjoint from your dynamic set of Spark Worker nodes? That would seem to be a pretty significant sacrifice of data locality. On Thu, Apr 28, 2016 at 11:15 AM, Michael Gummelt wrote: > > if after a work-load burst your cluster

Re: HDFS as Shuffle Service

2016-04-28 Thread Reynold Xin
Hm while this is an attractive idea in theory, in practice I think you are substantially overestimating HDFS' ability to handle a lot of small, ephemeral files. It has never really been optimized for that use case. On Thu, Apr 28, 2016 at 11:15 AM, Michael Gummelt wrote: > > if after a work-load

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
> if after a work-load burst your cluster dynamically changes from 1 workers to 1000, will the typical HDFS replication factor be sufficient to retain access to the shuffle files in HDFS
HDFS isn't resizing. Spark is. HDFS files should be HA and durable. On Thu, Apr 28, 2016 at 11:08 AM, M

Re: HDFS as Shuffle Service

2016-04-28 Thread Mark Hamstra
Yes, replicated and distributed shuffle materializations are a key requirement to maintain performance in a fully elastic cluster where Executors aren't just reallocated across an essentially fixed number of Worker nodes, but rather the number of Workers itself is dynamic. Retaining the file interfac

Re: HDFS as Shuffle Service

2016-04-28 Thread Michael Gummelt
> Why would you run the shuffle service on 10K nodes but Spark executors on just 100 nodes? Wouldn't you also run that service just on the 100 nodes?
We have to start the service beforehand, out of band, and we don't know a priori where the Spark executors will land. Those 100 executors could lan

Re: HDFS as Shuffle Service

2016-04-28 Thread Sean Owen
Why would you run the shuffle service on 10K nodes but Spark executors on just 100 nodes? Wouldn't you also run that service just on the 100 nodes? What does plumbing it through HDFS buy you in comparison? There's some additional overhead and if anything you lose some control over locality, in a c

Re: HDFS as Shuffle Service

2016-04-27 Thread Michael Gummelt
> Are you suggesting to have shuffle service persist and fetch data with hdfs, or skip shuffle service altogether and just write to hdfs?
Skip shuffle service altogether. Write to HDFS. Mesos environments tend to be multi-tenant, and running the shuffle service on all nodes could be extremely wa

Re: HDFS as Shuffle Service

2016-04-27 Thread Steve Loughran
> On 27 Apr 2016, at 04:59, Takeshi Yamamuro wrote: > > Hi, all > > See SPARK-1529 for related discussion. > > // maropu I'd not seen that discussion. I'm actually curious about why the 15% diff in performance between Java NIO and Hadoop FS APIs, and, if it is the case (Hadoop still uses t

Re: HDFS as Shuffle Service

2016-04-26 Thread Takeshi Yamamuro
Hi, all See SPARK-1529 for related discussion. // maropu On Wed, Apr 27, 2016 at 12:27 PM, Saisai Shao wrote: > Quite curious about the benefits of using HDFS as shuffle service, also > what's the problem of using current shuffle service? > > > Thanks > Saisai > >

Re: HDFS as Shuffle Service

2016-04-26 Thread Saisai Shao
Quite curious about the benefits of using HDFS as shuffle service, and what's the problem with using the current shuffle service? Thanks Saisai On Wed, Apr 27, 2016 at 4:31 AM, Timothy Chen wrote: > Are you suggesting to have shuffle service persist and fetch data with > hdfs, or s

Re: HDFS as Shuffle Service

2016-04-26 Thread Timothy Chen
Are you suggesting to have shuffle service persist and fetch data with hdfs, or skip shuffle service altogether and just write to hdfs? Tim > On Apr 26, 2016, at 11:20 AM, Michael Gummelt wrote: > > Has there been any thought or work on this (or any other networked file > system)? It would

HDFS as Shuffle Service

2016-04-26 Thread Michael Gummelt
Has there been any thought or work on this (or any other networked file system)? It would be valuable to support dynamic allocation without depending on the shuffle service. -- Michael Gummelt Software Engineer Mesosphere