Re: Determining number of executors within RDD

2015-06-10 Thread Nishkam Ravi
This PR adds support for multiple executors per worker and should be available in 1.4: https://github.com/apache/spark/pull/731 Thanks, Nishkam On Wed, Jun 10, 2015 at 1:35 PM, Evo Eftimov evo.efti...@isecc.com wrote: We/I were discussing STANDALONE mode, besides maxdml had already
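
As a rough illustration of the standalone-mode point above, here is a minimal sketch of how many executors could fit on a single worker once more than one executor per worker is allowed. The function name and the simple "limited by cores or by memory, whichever runs out first" rule are assumptions made for illustration, not Spark's actual scheduling logic; the inputs correspond loosely to the per-executor settings `spark.executor.cores` and `spark.executor.memory`.

```python
def executors_per_worker(worker_cores, worker_mem_mb, executor_cores, executor_mem_mb):
    """Hypothetical helper: upper bound on executors one standalone worker can host."""
    if executor_cores <= 0 or executor_mem_mb <= 0:
        raise ValueError("executor cores and memory must be positive")
    # How many executors the worker's cores allow.
    by_cores = worker_cores // executor_cores
    # How many executors the worker's memory allows.
    by_memory = worker_mem_mb // executor_mem_mb
    # The scarcer resource decides.
    return min(by_cores, by_memory)


# A 16-core, 64 GB worker with 4-core, 8 GB executors is limited by cores.
print(executors_per_worker(16, 65536, 4, 8192))  # prints 4
```

The real scheduler also takes application-wide limits and other running applications into account, so this is only an upper bound per worker.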

Re: clarification for some spark on yarn configuration options

2014-09-22 Thread Nishkam Ravi
Greg, if you look carefully, the code is enforcing that the memoryOverhead be lower (and not higher) than spark.driver.memory. Thanks, Nishkam On Mon, Sep 22, 2014 at 1:26 PM, Greg Hill greg.h...@rackspace.com wrote: I thought I had this all figured out, but I'm getting some weird errors now
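
The constraint described above can be sketched as a simple validation. This is a hypothetical check written for illustration, not Spark's source code; the function name, the megabyte units and the error message are all assumptions.

```python
def validate_driver_memory(driver_memory_mb, memory_overhead_mb):
    """Hypothetical check: the memory overhead must be lower than the driver memory."""
    if memory_overhead_mb >= driver_memory_mb:
        raise ValueError(
            "memoryOverhead (%d MB) must be lower than driver memory (%d MB)"
            % (memory_overhead_mb, driver_memory_mb)
        )


# Passes: a 384 MB overhead is lower than a 1024 MB driver.
validate_driver_memory(1024, 384)

# Fails: an overhead larger than the driver memory is rejected.
try:
    validate_driver_memory(256, 384)
except ValueError as err:
    print("rejected:", err)
```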

Re: clarification for some spark on yarn configuration options

2014-09-22 Thread Nishkam Ravi
. Is that a bug that's since fixed? I'm on 1.0.1 and using 'yarn-cluster' as the master. 'yarn-client' seems to pick up the values and works fine. Greg From: Nishkam Ravi nr...@cloudera.com Date: Monday, September 22, 2014 3:30 PM To: Greg greg.h...@rackspace.com Cc: Andrew

Re: SparkSql is slow over yarn

2014-08-29 Thread Nishkam Ravi
Can you share more details about your job, cluster properties and configuration parameters? Thanks, Nishkam On Fri, Aug 29, 2014 at 11:33 AM, Chirag Aggarwal chirag.aggar...@guavus.com wrote: When I run SparkSql over yarn, it runs 2-4 times slower as compared to when it's run in local mode.

Re: Configuring Spark Memory

2014-07-23 Thread Nishkam Ravi
See if this helps: https://github.com/nishkamravi2/SparkAutoConfig/ It's a very simple tool for auto-configuring default parameters in Spark. It takes as input high-level parameters (like number of nodes, cores per node, memory per node, etc.) and spits out default configuration, user advice and
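
To give a feel for what such a tool might do, here is a minimal sketch that turns the high-level inputs mentioned above into a few default settings. It is not the logic of SparkAutoConfig; the function name, the choice of 4 cores per executor, and the one core and 1 GB reserved per node for the OS are assumptions chosen only for illustration. The property names `spark.executor.instances`, `spark.executor.cores` and `spark.executor.memory` are standard Spark settings.

```python
def suggest_spark_defaults(num_nodes, cores_per_node, mem_per_node_gb,
                           cores_per_executor=4, reserved_cores=1, reserved_mem_gb=1):
    """Hypothetical heuristic: derive executor settings from cluster-level inputs."""
    # Leave some headroom on each node for the OS and daemons (assumed values).
    usable_cores = cores_per_node - reserved_cores
    usable_mem_gb = mem_per_node_gb - reserved_mem_gb
    if usable_cores < cores_per_executor or usable_mem_gb < 1:
        raise ValueError("node is too small for the requested executor size")

    executors_per_node = usable_cores // cores_per_executor
    # Split the usable memory evenly among the executors on a node.
    mem_per_executor_gb = usable_mem_gb // executors_per_node
    if mem_per_executor_gb < 1:
        raise ValueError("not enough memory per executor")

    return {
        "spark.executor.instances": str(num_nodes * executors_per_node),
        "spark.executor.cores": str(cores_per_executor),
        "spark.executor.memory": "%dg" % mem_per_executor_gb,
    }


# Example: 10 nodes, each with 16 cores and 64 GB of memory.
for key, value in suggest_spark_defaults(10, 16, 64).items():
    print(key, value)
```

Values produced this way are only a starting point; they would normally be tuned per workload.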

Re: Spark on YARN performance

2014-04-18 Thread Nishkam Ravi
Spark-on-YARN takes 10-30 seconds of setup time for workloads like WordCount and PageRank on a small cluster and thereafter performs as well as Spark standalone, as has been noted by Tom and Patrick. However, a certain amount of configuration/tuning effort is required to match peak