To reiterate what Tom was saying - the code that runs inside of Spark on
YARN is exactly the same code that runs in any deployment mode. There
shouldn't be any performance difference once your application starts
(assuming you are comparing apples-to-apples in terms of hardware).

The only difference is that, before your application runs, Spark has to
request resources from YARN. This will probably take longer than launching
an application against a standalone cluster, because YARN's container
launching mechanism is slower.
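
To illustrate the fragmentation case Tom describes in the quoted message
below, here is a toy sketch. It is not YARN's actual scheduler: the function
name, node sizes and container sizes are all made up for illustration, and
the only assumption is that a container has to fit entirely on one node.

```python
# Toy model only (hypothetical names and numbers, not YARN's real scheduler):
# each NodeManager reports some free memory in MB, and a container must fit
# entirely within a single node's free memory.

def placeable_containers(free_mb_per_node, container_mb):
    """Count how many containers of size container_mb fit right now."""
    return sum(free // container_mb for free in free_mb_per_node)

# Hypothetical cluster: 4 nodes with 3 GB free each, 12 GB free in total.
free_mb = [3072, 3072, 3072, 3072]

print(sum(free_mb))                         # total free memory in MB
print(placeable_containers(free_mb, 4096))  # 4 GB executors that fit
print(placeable_containers(free_mb, 2048))  # 2 GB executors that fit
```

With these made-up numbers the cluster has 12 GB free in total, yet no 4 GB
executor can be placed until another application releases memory, while four
2 GB executors fit immediately. That waiting time is startup latency, not a
slowdown of the job once it is running.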


On Fri, Apr 11, 2014 at 8:43 AM, Tom Graves <tgraves...@yahoo.com> wrote:

> I haven't run on Mesos before, but I do run on YARN. The performance
> differences are going to be in how long it takes you to get the executors
> allocated.  On YARN that is going to depend on the cluster setup. If you
> have resources dedicated to the queue where you are running your Spark job,
> the overhead is pretty minimal.  Now if your cluster is multi-tenant and
> really busy, and you allow other queues to use your capacity, it could
> take some time.  It is also possible to run into the situation where the
> memory of the NodeManagers gets fragmented and there are no slots big
> enough for you, so you have to wait for other applications to finish.  Again,
> this mostly depends on the setup, how big the containers you need for Spark
> are, etc.
>
> Tom
>   On Thursday, April 10, 2014 11:12 AM, Flavio Pompermaier <
> pomperma...@okkam.it> wrote:
>  Thank you for the reply Mayur, it would be nice to have a comparison
> about that.
> I hope one day it will be available, or to have the time to test it myself
> :)
> So you're using Mesos for the moment, right? What are the main
> differences in your experience? YARN seems to be more flexible and
> interoperable with other frameworks. Am I wrong?
>
> Best,
> Flavio
>
>
> On Thu, Apr 10, 2014 at 5:55 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:
>
> I've had better luck with standalone in terms of speed & latency. I think
> there is an impact, but not a very large one. The bigger impact is on being
> able to manage resources & share the cluster.
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
>
>
> On Wed, Apr 9, 2014 at 12:10 AM, Flavio Pompermaier
> <pomperma...@okkam.it> wrote:
>
> Hi to everybody,
> I'm new to Spark and I'd like to know if running Spark on top of YARN or
> Mesos could affect (and how much) its performance. Is there any doc about
> this?
>
> Best,
> Flavio
>
>
>
>
