Silly question: have you added your workers to the conf/slaves file, and
have you started sbin/start-slaves.sh?

On the master node, when you type jps, what do you see?

The problem seems to be that the workers are being ignored and Spark is
essentially running in local mode.
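
As a sanity check, on a healthy standalone cluster the master and worker
daemons show up as separate JVMs in jps. A sketch of what you would expect
(the PIDs will differ, and extra entries appear while a job is running):

  # on the master node
  $ jps
  12345 Master
  12346 Jps

  # on each worker node
  $ jps
  23456 Worker
  23457 Jps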

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 4 July 2016 at 17:05, Jakub Stransky <stransky...@gmail.com> wrote:

> Hi Mich,
>
> I have set up the Spark default configuration in conf/spark-defaults.conf,
> where I specify the master, so there is no need to put it on the command
> line:
> spark.master   spark://spark.master:7077
>
> The same applies to the driver memory, which has been increased to 4GB,
> and to spark.executor.memory, which is set to 12GB, as the machines have
> 16GB.
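>
> For completeness, a minimal sketch of how those lines look together in
> conf/spark-defaults.conf (the property names are standard Spark settings;
> the values are the ones described above):
>
>   spark.master            spark://spark.master:7077
>   spark.driver.memory     4g
>   spark.executor.memory   12g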
>
> Jakub
>
>
>
>
> On 4 July 2016 at 17:44, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
>> Hi Jakub,
>>
>> In standalone mode, Spark does the resource management itself. Which
>> version of Spark are you running?
>>
>> How do you define your SparkConf() parameters, for example setMaster()?
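>>
>> For example, a minimal sketch in Scala (the app name and master URL are
>> the ones from this thread; everything else is illustrative). Note that a
>> master hard-coded in the application overrides spark-defaults.conf:
>>
>>   import org.apache.spark.{SparkConf, SparkContext}
>>
>>   // A setMaster() call in code takes precedence over spark-defaults.conf,
>>   // so a leftover setMaster("local[*]") would silently ignore the cluster.
>>   val conf = new SparkConf()
>>     .setAppName("DemoApp")
>>     .setMaster("spark://spark.master:7077")
>>   val sc = new SparkContext(conf)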
>>
>> From
>>
>> spark-submit --driver-class-path spark/sqljdbc4.jar --class DemoApp
>> SparkPOC.jar 10 4.3
>>
>> I do not see any executor or memory allocation, so I assume you are
>> allocating them somewhere else?
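>>
>> For instance, a sketch of the same submission with the resources spelled
>> out explicitly on the command line (the flag values are illustrative):
>>
>>   spark-submit --master spark://spark.master:7077 \
>>     --driver-memory 4g \
>>     --executor-memory 12g \
>>     --total-executor-cores 12 \
>>     --driver-class-path spark/sqljdbc4.jar \
>>     --class DemoApp SparkPOC.jar 10 4.3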
>>
>> HTH
>>
>>
>> On 4 July 2016 at 16:31, Jakub Stransky <stransky...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I have a Spark cluster consisting of 4 nodes in standalone mode: a
>>> master plus 3 worker nodes, with available memory, CPUs, etc. configured.
>>>
>>> I have a Spark application which is essentially an MLlib pipeline for
>>> training a classifier, in this case a RandomForest, but it could be a
>>> DecisionTree just for the sake of simplicity.
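>>>
>>> For context, a minimal sketch of the kind of pipeline I mean (the column
>>> names and trainingDF are placeholders, not my real code):
>>>
>>>   import org.apache.spark.ml.Pipeline
>>>   import org.apache.spark.ml.classification.RandomForestClassifier
>>>   import org.apache.spark.ml.feature.VectorAssembler
>>>
>>>   // Assemble the raw feature columns into the single vector column
>>>   // that MLlib estimators expect
>>>   val assembler = new VectorAssembler()
>>>     .setInputCols(Array("f1", "f2"))
>>>     .setOutputCol("features")
>>>   val rf = new RandomForestClassifier()
>>>     .setLabelCol("label")
>>>     .setFeaturesCol("features")
>>>   val pipeline = new Pipeline().setStages(Array(assembler, rf))
>>>   val model = pipeline.fit(trainingDF)  // trainingDF: DataFrame of training rows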
>>>
>>> But when I submit the Spark application to the cluster via spark-submit,
>>> it runs out of memory. Even though the executors are "taken"/created
>>> in the cluster, they are essentially doing nothing (barely any CPU or
>>> memory utilization), while the master seems to do all the work, which
>>> finally results in an OOM.
>>>
>>> My submission is following:
>>> spark-submit --driver-class-path spark/sqljdbc4.jar --class DemoApp
>>> SparkPOC.jar 10 4.3
>>>
>>> I am submitting from the master node.
>>>
>>> By default it is running in client mode, in which the driver process is
>>> attached to the spark-submit shell.
>>>
>>> Do I need to set anything to make the MLlib algorithms parallelized and
>>> distributed as well, or is it all driven by the parallelism (number of
>>> partitions) of the input DataFrame?
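>>>
>>> For example, I wonder whether something along these lines is needed (the
>>> DataFrame name and partition count here are just placeholders):
>>>
>>>   // Re-partition the input so the training work can spread across executors
>>>   val training = inputDF.repartition(24)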
>>>
>>> Essentially it seems that all the work is done on the master and the
>>> rest of the cluster is idle.
>>> Any hints on what to check?
>>>
>>> Thx
>>> Jakub
>>>
>>>
>>>
>>>
>>
>
>
> --
> Jakub Stransky
> cz.linkedin.com/in/jakubstransky
>
>
