To run multiple workers with Spark’s standalone mode, set SPARK_WORKER_INSTANCES and SPARK_WORKER_CORES in conf/spark-env.sh. For example, if you have 16 cores and want 2 workers, you could add
export SPARK_WORKER_INSTANCES=2
export SPARK_WORKER_CORES=8

Matei

On Apr 3, 2014, at 12:38 PM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:

> Are your workers not utilizing all the cores?
> One worker will utilize multiple cores depending on resource allocation.
> Regards
> Mayur
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi
>
>
> On Wed, Apr 2, 2014 at 7:19 PM, Debasish Das <debasish.da...@gmail.com> wrote:
> Hi Matei,
>
> How can I run multiple Spark workers per node? I am running an 8-core,
> 10-node cluster, but I have 8 more cores on each node, so having 2 workers
> per node will definitely help my use case.
>
> Thanks,
> Deb
>
>
> On Wed, Apr 2, 2014 at 3:58 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> Hey Steve,
>
> This configuration sounds pretty good. The one thing I would consider is
> having more disks, for two reasons: Spark uses the disks for large shuffles
> and out-of-core operations, and it is often better to run HDFS or your
> storage system on the same nodes. Whether this is valuable will depend on
> whether you plan to do that in your deployment. You should determine that
> and go from there.
>
> The number of cores and amount of RAM are both good. With a lot more of
> either, you would probably want to run multiple Spark workers per node,
> which takes more work to configure. Your numbers are in line with other
> deployments.
>
> There's a provisioning overview with more details at
> https://spark.apache.org/docs/latest/hardware-provisioning.html but what
> you have sounds fine.
>
> Matei
>
> On Apr 2, 2014, at 2:58 PM, Stephen Watt <sw...@redhat.com> wrote:
>
> > Hi Folks
> >
> > I'm looking to buy some gear to run Spark. I'm quite well versed in
> > Hadoop server design, but there does not seem to be much Spark-related
> > collateral around infrastructure guidelines (or at least I haven't been
> > able to find any).
> > My current thinking for server design is something along these lines:
> >
> > - 2 x 10GbE NICs
> > - 128 GB RAM
> > - 6 x 1 TB small form factor disks (2 x RAID 1 mirror for O/S and
> >   runtimes, 4 x 1 TB for data drives)
> > - 1 disk controller
> > - 2 x 2.6 GHz 6-core processors
> >
> > If I stick with 1U servers then I lose disk capacity per rack, but I get
> > a lot more memory and CPU capacity per rack. This increases my total
> > cluster memory footprint, and it doesn't seem to make sense to have
> > super-dense storage servers because I can't fit all the data on disk in
> > memory anyway. So at present, my thinking is to go with 1U servers
> > instead of 2U servers. Is 128 GB RAM per server normal? Do you use more
> > or less than that?
> >
> > Any feedback would be appreciated.
> >
> > Regards
> > Steve Watt
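The per-node worker split Matei describes at the top of the thread can be sketched as a conf/spark-env.sh fragment. This is an illustrative sketch for Debasish's case (16 cores per node split across 2 workers); the SPARK_WORKER_MEMORY value of 60g is an assumption for a 128 GB node, not something stated in the thread, so adjust it to your hardware:

```shell
# Sketch of conf/spark-env.sh for one 16-core node running two workers.
# Each worker offers 8 cores; the 60g memory figure is an assumed value
# for a 128 GB node, leaving headroom for the OS and other daemons.
export SPARK_WORKER_INSTANCES=2   # worker processes started per node
export SPARK_WORKER_CORES=8       # cores each worker may hand to executors
export SPARK_WORKER_MEMORY=60g    # memory each worker may hand to executors

# Sanity check: total cores advertised per node = instances * cores
echo $(( SPARK_WORKER_INSTANCES * SPARK_WORKER_CORES ))   # prints 16
```

After changing these values, restart the standalone cluster (e.g. sbin/stop-all.sh then sbin/start-all.sh) so the master and workers pick up the new settings; the master web UI should then show two workers per node.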