RE: Spark Streaming: limit number of nodes

2015-06-24 Thread Evo Eftimov
OK, so you are running Spark in standalone mode then.

 

Then for every Worker process on every node (you can run more than one Worker
per node) you will have an Executor waiting for jobs.

 

As far as I can tell, there are only two ways to achieve what you
need:

 

1.   Simply shut down the Spark worker processes/daemons on the nodes you
want to keep free from Spark workloads, OR run two separate Spark clusters, one
with e.g. 2 workers and one with e.g. 5 workers – small jobs go to cluster 1
and big jobs to cluster 2

2.   Try to set spark.executor.cores, BUT that limits the number of cores
per Executor rather than the total cores for the job, and hence will probably
not yield the effect you need
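As a rough sketch, the two options might look like this on the command line. This is illustrative only: the master URL, class name, jar, and SPARK_HOME path are hypothetical, and the script name follows the Spark 1.x standalone layout.

```shell
# Option 1: free a node by stopping its worker daemon (run on each
# node you want to exclude from Spark workloads).
$SPARK_HOME/sbin/stop-slave.sh

# Option 2: cap the cores per executor at submit time.
# Note: this limits cores PER Executor, not the job's total cores,
# which is why it probably will not reduce the node count.
$SPARK_HOME/bin/spark-submit \
  --master spark://master-host:7077 \
  --conf spark.executor.cores=2 \
  --class com.example.StreamingApp \
  streaming-app.jar
```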

 

From: Wojciech Pituła [mailto:w.pit...@gmail.com] 
Sent: Wednesday, June 24, 2015 10:49 AM
To: Evo Eftimov; user@spark.apache.org
Subject: Re: Spark Streaming: limit number of nodes

 

Ok, thanks. I have 1 worker process on each machine but I would like to run my 
app on only 3 of them. Is it possible?

 

On Wed, 24 Jun 2015 at 11:44, Evo Eftimov wrote:

There is no direct one to one mapping between Executor and Node

 

An Executor is simply the Spark framework term for a JVM instance with some
Spark framework system code running in it.

 

A node is a physical server machine 

 

You can have more than one JVM per node 

 

And vice versa, you can have nodes without any JVM running on them. How? By
specifying the number of executors to be less than the number of nodes.

 

So if you specify the number of executors to be 1 and you have 5 nodes, ONE
executor will run on only one of them.

 

The above is valid for Spark on YARN 

 

For Spark in standalone mode, the number of executors is equal to the number of
Spark worker processes (daemons) running on each node.
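To illustrate the contrast, a hedged sketch of the two submit commands (master URL, class name, and jar are hypothetical, and the flags reflect Spark 1.x behaviour):

```shell
# On YARN: the executor count is set directly.
spark-submit --master yarn --num-executors 3 \
  --class com.example.StreamingApp streaming-app.jar

# Standalone (Spark 1.x): there is no --num-executors equivalent; the
# app gets one executor per alive worker, so the executor count is
# controlled by how many worker daemons are running.
spark-submit --master spark://master-host:7077 \
  --class com.example.StreamingApp streaming-app.jar
```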

 

From: Wojciech Pituła [mailto:w.pit...@gmail.com] 
Sent: Tuesday, June 23, 2015 12:38 PM
To: user@spark.apache.org
Subject: Spark Streaming: limit number of nodes

 

I have set up a small standalone cluster: 5 nodes, every node has 5GB of memory
and 8 cores. As you can see, the nodes don't have much RAM.

 

I have 2 streaming apps: the first one is configured to use 3GB of memory per
node and the second one uses 2GB per node.

 

My problem is that the smaller app could easily run on 2 or 3 nodes instead of
5, so I could launch a third app.

 

Is it possible to limit the number of nodes (executors) that an app will get
from the standalone cluster?



Re: Spark Streaming: limit number of nodes

2015-06-23 Thread Wojciech Pituła
I cannot. I've already limited the number of cores to 10, so it gets 5
executors with 2 cores each...
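This matches the standalone master's default behaviour: with spark.deploy.spreadOut left at its default of true, the master distributes an app's cores round-robin across all alive workers. A toy model of that allocation for illustration only, not Spark's actual scheduler code:

```python
# Toy model of the standalone master's default "spread out" core
# allocation (spark.deploy.spreadOut=true). A simplification for
# illustration; the real scheduler also handles memory and app limits.
def spread_out(cores_max, free_cores_per_worker):
    """Assign one core at a time, round-robin, across workers with free cores."""
    assigned = [0] * len(free_cores_per_worker)
    remaining = cores_max
    while remaining > 0:
        progressed = False
        for i, free in enumerate(free_cores_per_worker):
            if remaining == 0:
                break
            if assigned[i] < free:
                assigned[i] += 1
                remaining -= 1
                progressed = True
        if not progressed:  # no worker can accept more cores
            break
    return assigned

# 5 workers with 8 free cores each, spark.cores.max=10: every worker
# gets 2 cores, so the app still lands on all 5 nodes.
print(spread_out(10, [8, 8, 8, 8, 8]))  # [2, 2, 2, 2, 2]
```

The model shows why capping spark.cores.max alone does not shrink the node count: the cores are spread thinly across all workers rather than packed onto a few of them.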

On Tue, 23 Jun 2015 at 13:45, Akhil Das wrote:

> Use *spark.cores.max* to limit the CPU per job, then you can easily
> accommodate your third job also.
>
> Thanks
> Best Regards
>
> On Tue, Jun 23, 2015 at 5:07 PM, Wojciech Pituła wrote:
>
>> I have set up a small standalone cluster: 5 nodes, every node has 5GB of
>> memory and 8 cores. As you can see, the nodes don't have much RAM.
>>
>> I have 2 streaming apps, first one is configured to use 3GB of memory per
>> node and second one uses 2GB per node.
>>
>> My problem is that the smaller app could easily run on 2 or 3 nodes instead
>> of 5, so I could launch a third app.
>>
>> Is it possible to limit the number of nodes (executors) that an app will get
>> from the standalone cluster?
>>
>
>

