Re: Spark on yarn - application hangs

2019-05-10 Thread Mich Talebzadeh
Sure, no problem.

I meant these topics

[image: image.png]

Have a look at this article of mine

https://www.linkedin.com/pulse/real-time-processing-trade-data-kafka-flume-spark-talebzadeh-ph-d-/


under the section

Understanding the Spark Application Through Visualization

See if it helps

HTH
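One note on that section: the DAG and event-timeline views depend on the Spark UI (or the history server, once the application finishes) having event logs to read. As a sketch, the standard properties would go in spark-defaults.conf; the HDFS path below is a placeholder assumption, adjust it for your cluster:

```
# spark-defaults.conf -- illustrative values only
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark-logs
spark.history.fs.logDirectory    hdfs:///spark-logs
```

With event logging enabled you can still inspect the DAG after the application ends, which helps when a job hangs and you have to kill it.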

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 10 May 2019 at 18:10, Mkal  wrote:

> How can I check what exactly is stagnant? Do you mean in the DAG
> visualization in the Spark UI?
>
> Sorry, I'm new to Spark.


Re: Spark on yarn - application hangs

2019-05-10 Thread Mkal
How can I check what exactly is stagnant? Do you mean in the DAG
visualization in the Spark UI?

Sorry, I'm new to Spark.



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on yarn - application hangs

2019-05-10 Thread Mich Talebzadeh
Hi,

Have you checked the metrics in the Spark UI by any chance? What exactly is stagnant?

HTH
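If the UI is awkward to reach through the YARN proxy, the same stage metrics are exposed by Spark's monitoring REST API under /api/v1 on the driver's UI port. A minimal sketch; the host, port, and application ID below are placeholder assumptions:

```python
import json
import urllib.request


def stages_url(host: str, port: int, app_id: str) -> str:
    """Build the Spark REST API endpoint that lists all stages of an application."""
    return f"http://{host}:{port}/api/v1/applications/{app_id}/stages"


def active_stages(host: str, port: int, app_id: str):
    """Fetch stage summaries and return those still in the ACTIVE state."""
    with urllib.request.urlopen(stages_url(host, port, app_id)) as resp:
        stages = json.load(resp)
    return [s for s in stages if s.get("status") == "ACTIVE"]


# Example (hypothetical driver host and YARN application ID):
# for s in active_stages("driver-host", 4040, "application_1557500000000_0001"):
#     print(s["stageId"], s["numActiveTasks"], s["numCompleteTasks"])
```

If the active-task counts stop changing between calls while the stage stays ACTIVE, the job is genuinely stalled rather than just slow.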

Dr Mich Talebzadeh







On Fri, 10 May 2019 at 17:51, Mkal  wrote:

> I've built a Spark job in which an external program is called through the
> use of pipe().
> The job runs correctly on the cluster when the input is a small sample
> dataset, but when the input is a real, large dataset it stays in the
> RUNNING state forever.
>
> I've tried different ways to tune executor memory, executor cores, and
> overhead memory, but I haven't found a solution so far.
> I've also tried forcing the external program to use only one thread, in
> case there is a problem due to it being a multithreaded application, but
> nothing changed.
>
> Any suggestion would be welcome.
>
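For what it's worth, one common way a pipe()-style job wedges only on large input, independent of Spark, is a stdio buffering deadlock: the parent writes all of its input before reading any output, both OS pipe buffers fill, and parent and child block on each other forever. Small sample datasets fit inside the buffers, so the bug never shows. A standalone Python sketch of the hazard and the usual fix; this illustrates the general mechanism, not Spark's internal pipe implementation:

```python
import subprocess
import sys

# Child process: reads lines from stdin, echoes them upper-cased to stdout.
# Stands in for the external program called via pipe().
CHILD = [sys.executable, "-c",
         "import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line.upper())\n"]


def run_piped(lines):
    """Feed lines to the child and collect its output.

    communicate() pumps stdin and stdout concurrently, so neither pipe
    buffer can fill up and stall the pair -- the deadlock you risk by
    writing all input first and only reading output afterwards.
    """
    proc = subprocess.Popen(CHILD, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    out, _ = proc.communicate("".join(line + "\n" for line in lines))
    return out.splitlines()


print(run_piped(["spark", "yarn"]))  # -> ['SPARK', 'YARN']
```

On the threading point: the subprocess analogue of what you tried is passing environment overrides (for example OMP_NUM_THREADS=1, assuming the program honors it) through Popen's env= argument; whether that helps depends entirely on the external program.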