Hi Aashish,

Do you have checkpointing enabled? If not, can you try enabling
checkpointing and observing the memory pattern?
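
In case it helps, here is a minimal Scala sketch of what enabling checkpointing looks like with the Spark 1.6 Streaming API. The checkpoint directory, app name, and object name below are made-up placeholders, and the stream-building code is elided; this is an illustration, not your actual job:

```scala
// Sketch of a checkpointed Spark Streaming driver (Spark 1.6 API).
// The path "/tmp/spark-checkpoints" and names here are placeholders.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CheckpointedApp {

  val checkpointDir = "/tmp/spark-checkpoints"  // assumption: any reliable path works

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("KafkaToMongo")  // hypothetical app name
    val ssc = new StreamingContext(conf, Seconds(60))      // 1-minute batch interval
    ssc.checkpoint(checkpointDir)                          // enable checkpointing
    // ... build the direct Kafka stream and aggregation here ...
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recover from an existing checkpoint if present, else create a fresh context.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```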

Thanks,
Sandeep

On Tue, Aug 9, 2016 at 4:25 PM, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi Aashish,
>
> You are running in standalone mode on a single node.
>
> As I read it, you start the master and 5 workers come up, from
> SPARK_WORKER_INSTANCES=5. I gather you use start-slaves.sh?
>
> Given that number of workers and the low memory on them, port 8080 should
> show practically no memory used while idle. Also, every worker has been
> allocated 1 core (SPARK_WORKER_CORE=1).
>
> Now it all depends on how you launch your spark-submit job and what
> parameters you pass to it.
>
> ${SPARK_HOME}/bin/spark-submit \
>                 --driver-memory 1G \
>                 --num-executors 2 \
>                 --executor-cores 1 \
>                 --executor-memory 1G \
>                 --master spark://<IP>:7077 \
>
> What are your parameters here? In my experience, standalone mode has a mind
> of its own and does not always follow what you have asked for.
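>
> As an illustration (the values below are assumptions, not a recommendation):
> in standalone mode the cap on cores is set with --total-executor-cores, while
> --num-executors applies under YARN only, so an explicit spark-submit might
> look like this:
>
> ```shell
> # Illustrative spark-submit for standalone mode; adjust values to your box.
> ${SPARK_HOME}/bin/spark-submit \
>     --master spark://<IP>:7077 \
>     --driver-memory 1G \
>     --executor-memory 1G \
>     --total-executor-cores 5 \
>     --class com.example.StreamingApp \
>     app.jar
> # --class and app.jar are hypothetical placeholders for your own job.
> ```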
>
> If you increase the number of cores per worker, you may reduce the memory
> issue, because effectively multiple tasks can run on subsets of your data.
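>
> For instance, a spark-env.sh along these lines (the numbers are illustrative
> assumptions for a 64GB/16-core box, not tuned values) gives each worker more
> cores:
>
> ```shell
> # spark-env.sh sketch: fewer workers, more cores each.
> SPARK_WORKER_INSTANCES=2   # two workers instead of five
> SPARK_WORKER_CORES=4       # note: the variable is SPARK_WORKER_CORES (plural)
> SPARK_WORKER_MEMORY=4g     # memory each worker can hand out to executors
> ```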
>
> HTH
>
> P.S. I don't use SPARK_MASTER_OPTS
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 9 August 2016 at 11:21, aasish.kumar <aasish.ku...@avekshaa.com> wrote:
>
>> Hi,
>>
>> I am running Spark v1.6.1 on a single machine in standalone mode, with
>> 64GB RAM and 16 cores.
>>
>> I have created five worker instances in order to get five executors, since
>> in standalone mode there cannot be more than one executor per worker
>> node.
>>
>> *Configuration*:
>>
>> SPARK_WORKER_INSTANCES 5
>> SPARK_WORKER_CORE 1
>> SPARK_MASTER_OPTS "-Dspark.deploy.default.Cores=5"
>>
>> All other configurations are left at their defaults in spark-env.sh.
>>
>> I am running a Spark Streaming direct Kafka job at an interval of 1 min,
>> which takes data from Kafka and, after some aggregation, writes the data
>> to Mongo.
>>
>> *Problems:*
>>
>> > When I start the master and the slaves, one master process and five
>> > worker processes start, each consuming only about 212 MB of RAM. When I
>> > submit the job, it creates 5 executor processes and 1 job process, and
>> > the total memory usage grows to 8GB and keeps growing slowly over time,
>> > even when there is no data to process.
>>
>> I am also unpersisting the cached RDDs at the end, and have set
>> spark.cleaner.ttl to 600, but the memory still keeps growing.
>>
>> > One more thing: I have seen that SPARK-1706 was merged, so why am I
>> > unable to create multiple executors within one worker? Also, in the
>> > spark-env.sh file, every setting related to executors appears to apply
>> > under YARN mode only.
>>
>> I have also tried running the example programs, but I see the same problem.
>>
>> Any help would be greatly appreciated,
>>
>> Thanks
>>
>>
>>
>>
>> --
>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-Job-Keeps-growing-memory-over-time-tp27498.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>
>


-- 
Regards,
Sandeep Nemuri
