RE: My first experience with Spark

2015-02-05 Thread java8964
using more time. We plan to make Spark coexist with the Hadoop cluster, so being able to control its memory usage is important for us. Does Spark need that much memory? Thanks, Yong  Date: Thu, 5 Feb 2015 15:36:48 -0800 Subject: Re: My first experience with Spark From: deborah.sie...@gmail.com To: java8
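For context, the knobs that bound a standalone deployment's memory footprint are SPARK_WORKER_MEMORY in conf/spark-env.sh (how much each worker daemon may hand out, leaving the rest for Hadoop daemons on the same node) and spark.executor.memory per application. A minimal Scala sketch (Spark 1.x), with purely illustrative sizes and a hypothetical master host:

    // Illustrative sketch only: values and host name are not the poster's actual settings.
    // In conf/spark-env.sh on each node, cap what the worker daemon can hand out, e.g.:
    //   SPARK_WORKER_MEMORY=8g
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("spark://master-host:7077")   // standalone master URL (hypothetical host)
      .setAppName("HiveQueryEvaluation")
      .set("spark.executor.memory", "4g")      // heap requested per executor
      .set("spark.cores.max", "8")             // cap on total cores this application may take
    val sc = new SparkContext(conf)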

Re: My first experience with Spark

2015-02-05 Thread Deborah Siegel
Hi Yong, Have you tried increasing your level of parallelism? How many tasks are you getting in the failing stage? 2-3 tasks per CPU core is recommended, though maybe you need more for your shuffle operation? You can configure spark.default.parallelism, or pass in a level of parallelism as the second parameter
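A rough sketch of both options (Spark 1.x RDD API), assuming an existing SparkContext sc; the input path and the partition count of 200 are illustrative only:

    // Option 1: raise the default used when no explicit partition count is given,
    // by setting spark.default.parallelism in SparkConf or spark-defaults.conf
    // before the SparkContext is created, e.g. spark.default.parallelism=200.

    // Option 2: pass the level of parallelism directly to the shuffle operation.
    import org.apache.spark.SparkContext._    // pair-RDD operations such as reduceByKey

    val pairs  = sc.textFile("hdfs:///some/input").map(line => (line, 1))
    val counts = pairs.reduceByKey(_ + _, 200)   // 200 reduce tasks for this shuffle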

RE: My first experience with Spark

2015-02-05 Thread java8964
Thanks, Yong  From: java8...@hotmail.com To: user@spark.apache.org Subject: My first experience with Spark Date: Thu, 5 Feb 2015 16:03:33 -0500 I am evaluating Spark for our production use. Our production cluster is Hadoop 2.2.0 without YARN, so I want to test Spark with a standalone deployment

My first experience with Spark

2015-02-05 Thread java8964
I am evaluating Spark for our production use. Our production cluster is Hadoop 2.2.0 without YARN, so I want to test Spark with a standalone deployment running alongside Hadoop. What I have in mind is to test a very complex Hive query, which joins 6 tables, with lots of nested structures and explode
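Such a HiveQL query can be submitted through Spark's HiveContext more or less verbatim. A simplified sketch, assuming an existing SparkContext sc; the table and column names are made up and stand in for the real six-table join with nested fields:

    // Simplified illustration only: table_a, table_b and nested_items are hypothetical names.
    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)
    val result = hiveContext.sql("""
      SELECT a.id, b.name, ex.item
      FROM table_a a
      JOIN table_b b ON a.id = b.id
      LATERAL VIEW explode(a.nested_items) ex AS item
    """)
    result.count()   // force execution to observe stage and task behaviour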