Re: How does one decide no of executors/cores/memory allocation?

2015-06-17 Thread nsalian
Hello Shreesh, That would be quite a challenge to estimate. A few things that I think should help estimate those numbers: 1) Understanding the cost of the individual transformations in the application, e.g. a flatMap can be more expensive in memory than a map 2) The communication
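A minimal sketch of the point above, in plain Python (no Spark required, and the values are illustrative assumptions, not from the thread): a map emits exactly one output record per input record, while a flatMap may emit many, so its intermediate results can take more memory.

```python
# Plain-Python analogy for Spark's map vs. flatMap memory cost.
records = ["a b c", "d e", "f"]

# map-style: one output element per input element -> size stays the same
mapped = [line.upper() for line in records]

# flatMap-style: each input can expand into several outputs -> size can grow
flat_mapped = [word for line in records for word in line.split()]

print(len(records), len(mapped), len(flat_mapped))  # -> 3 3 6
```

The same expansion happens per-partition inside a Spark executor, which is why a flatMap over wide records can need noticeably more executor memory than a map over the same data.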

Re: How does one decide no of executors/cores/memory allocation?

2015-06-16 Thread shreesh
I realize that there are a lot of ways to configure my application in Spark. The part that is not clear is how I decide, for example, into how many partitions I should divide my data, how much RAM I should have, or how many workers one should initialize?

RE: How does one decide no of executors/cores/memory allocation?

2015-06-16 Thread Evo Eftimov
Subject: Re: How does one decide no of executors/cores/memory allocation? I realize that there are a lot of ways to configure my application in Spark. The part that is not clear is how I decide, for example, into how many partitions I should divide my data, how much RAM I should have, or how

Re: How does one decide no of executors/cores/memory allocation?

2015-06-16 Thread Himanshu Mehra
Hi Shreesh, You can definitely decide how many partitions your data should break into by passing a 'minPartitions' argument in the method sc.textFile(input/path, minPartitions) and a 'numSlices' argument in the method sc.parallelize(localCollection, numSlices). In fact there is always an option to specify
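To make the numSlices idea concrete without needing a Spark cluster, here is a hypothetical plain-Python sketch of how an explicit slice count splits a local collection into roughly equal partitions, analogous to sc.parallelize(localCollection, numSlices):

```python
# Sketch of slice-based partitioning, similar in spirit to how Spark's
# parallelize() divides a local collection into numSlices partitions.
def partition(data, num_slices):
    """Split data into num_slices roughly equal contiguous chunks."""
    n = len(data)
    return [data[n * i // num_slices : n * (i + 1) // num_slices]
            for i in range(num_slices)]

parts = partition(list(range(10)), 4)
print([len(p) for p in parts])  # -> [2, 3, 2, 3]
```

Whichever mechanism you use (minPartitions, numSlices, or a later repartition), the partition count is what ultimately bounds how many tasks can run in parallel.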

Re: How does one decide no of executors/cores/memory allocation?

2015-06-15 Thread gaurav sharma
When you submit a job, Spark breaks it down into stages, as per the DAG. The stages run transformations or actions on the RDDs. Each RDD consists of N partitions. The number of tasks created by Spark to execute a stage is equal to the number of partitions. Every task is executed on the cores utilized
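The rule in this post (tasks per stage = number of partitions, one task per core at a time) gives a simple back-of-envelope sizing calculation. A hedged sketch with assumed example numbers, not values from the thread:

```python
import math

# Back-of-envelope sizing based on: tasks per stage == RDD partitions,
# and each task occupies one executor core while it runs.
num_partitions = 200        # assumed example value
num_executors = 5           # assumed example value
cores_per_executor = 4      # assumed example value

tasks_per_stage = num_partitions
parallel_slots = num_executors * cores_per_executor
# "waves" = how many rounds of tasks the stage needs to finish
waves = math.ceil(tasks_per_stage / parallel_slots)

print(tasks_per_stage, parallel_slots, waves)  # -> 200 20 10
```

If the number of waves is large, the stage is under-parallelized for the data size; if partitions are fewer than the available slots, some cores sit idle. This arithmetic is one practical way to start answering the original question about executors, cores, and partition counts.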