This PR adds support for multiple executors per worker:
https://github.com/apache/spark/pull/731 and should be available in 1.4.
Thanks,
Nishkam
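For readers landing here later: once that PR is in, multiple executors per worker in standalone mode would presumably be driven by capping per-executor resources below the worker's totals. A hedged sketch, with illustrative values only (master URL, class, and jar are placeholders, not from this thread):

```shell
# Illustrative only: if each worker has, say, 16 cores and 64g, capping an
# executor at 4 cores / 8g lets the standalone master place several
# executors of the same app on one worker (Spark >= 1.4 per the PR above).
spark-submit \
  --master spark://master-host:7077 \
  --conf spark.executor.cores=4 \
  --conf spark.executor.memory=8g \
  --class org.example.MyApp \
  myapp.jar
```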
On Wed, Jun 10, 2015 at 1:35 PM, Evo Eftimov evo.efti...@isecc.com wrote:
We/I were discussing STANDALONE mode; besides, maxdml had already
Greg, if you look carefully, the code is enforcing that the memoryOverhead
be lower (and not higher) than spark.driver.memory.
Thanks,
Nishkam
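The constraint Nishkam describes can be sketched as a small validation step. This is not Spark's actual code, just an illustration of the check (overhead must be lower than the driver memory, not higher); the function name and units are hypothetical:

```python
# Hypothetical sketch of the check described above: reject a configuration
# where the memory overhead is not strictly lower than spark.driver.memory.
def validate_driver_memory(driver_memory_mb: int, memory_overhead_mb: int) -> None:
    if memory_overhead_mb >= driver_memory_mb:
        raise ValueError(
            f"memoryOverhead ({memory_overhead_mb} MB) must be lower than "
            f"spark.driver.memory ({driver_memory_mb} MB)"
        )

# A typical default-ish pairing passes the check; an inverted one does not.
validate_driver_memory(driver_memory_mb=1024, memory_overhead_mb=384)
```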
On Mon, Sep 22, 2014 at 1:26 PM, Greg Hill greg.h...@rackspace.com wrote:
I thought I had this all figured out, but I'm getting some weird errors
now. Is that a bug that's since fixed? I'm on 1.0.1 and using
'yarn-cluster' as the master. 'yarn-client' seems to pick up the values
and works fine.
Greg
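A common workaround for yarn-cluster mode (where the driver is launched inside a YARN container, so memory settings applied after launch can't take effect) is to pass the driver memory at submit time. A hedged sketch with placeholder values; the class and jar are illustrative, not from this thread:

```shell
# Illustrative only: in yarn-cluster mode, supply driver memory settings
# on the spark-submit command line rather than programmatically, so the
# container request is sized before the driver JVM starts.
spark-submit \
  --master yarn-cluster \
  --driver-memory 2g \
  --class org.example.MyApp \
  myapp.jar
```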
From: Nishkam Ravi nr...@cloudera.com
Date: Monday, September 22, 2014 3:30 PM
To: Greg greg.h...@rackspace.com
Cc: Andrew
Can you share more details about your job, cluster properties and
configuration parameters?
Thanks,
Nishkam
On Fri, Aug 29, 2014 at 11:33 AM, Chirag Aggarwal
chirag.aggar...@guavus.com wrote:
When I run SparkSql over yarn, it runs 2-4 times slower than when it's
run in local mode.
See if this helps:
https://github.com/nishkamravi2/SparkAutoConfig/
It's a very simple tool for auto-configuring default parameters in Spark.
It takes high-level parameters as input (number of nodes, cores per node,
memory per node, etc.) and spits out a default configuration, user advice and
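The kind of derivation such a tool performs might look like the sketch below. This is NOT the actual SparkAutoConfig logic, only an illustration of turning cluster shape into default settings; the rule-of-thumb constants (about 5 cores per executor, 1 GB reserved per node) are assumptions for the example:

```python
# Hypothetical sketch: derive default executor settings from cluster shape.
# The heuristics here are illustrative, not SparkAutoConfig's real rules.
def suggest_defaults(num_nodes: int, cores_per_node: int, mem_per_node_gb: int) -> dict:
    # Aim for roughly 5 cores per executor, at least one executor per node.
    executors_per_node = max(1, cores_per_node // 5)
    executor_cores = cores_per_node // executors_per_node
    # Reserve ~1 GB per node for the OS/daemons, split the rest evenly.
    executor_mem_gb = max(1, (mem_per_node_gb - 1) // executors_per_node)
    return {
        "spark.executor.instances": num_nodes * executors_per_node,
        "spark.executor.cores": executor_cores,
        "spark.executor.memory": f"{executor_mem_gb}g",
    }

# Example: 4 nodes with 16 cores / 64 GB each.
defaults = suggest_defaults(num_nodes=4, cores_per_node=16, mem_per_node_gb=64)
```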
Spark-on-YARN takes 10-30 seconds of setup time for workloads like
WordCount and PageRank on a small-sized cluster and thereafter performs as
well as Spark standalone, as Tom and Patrick have noted. However, a
certain amount of configuration/tuning effort is required to match peak