YARN - Pyspark

2016-09-29 Thread ayan guha
Hi I just observed a little weird behavior: I ran a pyspark job, a very simple one. conf = SparkConf() conf.setAppName("Historical Meter Load") conf.set("spark.yarn.queue","root.Applications") conf.set("spark.executor.instances","50") conf.set("spark.executor.memory","10g") conf.set("spark.yarn.ex
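The message above is truncated mid-configuration, so the full setup is not recoverable. As a sketch, the settings it does show would typically be assembled like this before creating the context (everything past the settings quoted in the message is an assumption):

```python
from pyspark import SparkConf, SparkContext

conf = SparkConf()
conf.setAppName("Historical Meter Load")
conf.set("spark.yarn.queue", "root.Applications")  # YARN queue named in the message
conf.set("spark.executor.instances", "50")
conf.set("spark.executor.memory", "10g")
# ...the original message cuts off at a further "spark.yarn.*" setting...

# Hypothetical continuation: build the context from the conf above
sc = SparkContext(conf=conf)
```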

Re: YARN - Pyspark

2016-09-30 Thread Timur Shenkao
It's not weird behavior. Did you run the job in cluster mode? I suspect your driver died / finished / stopped after 12 hours but your job continued. It's possible, as you didn't output anything to the console on the driver node. Quite a long time ago, when I first tried Spark Streaming, I launched a PySpark Str
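The client/cluster distinction Timur raises comes down to where the driver process lives. A hypothetical pair of submission commands (script name and log path are illustrative, not from the thread):

```shell
# Client mode: the driver runs in the launching process on the submitting
# machine, here kept alive with nohup as the original poster describes.
nohup spark-submit --master yarn --deploy-mode client job.py > job.log 2>&1 &

# Cluster mode: the driver runs inside the YARN application master, so it
# survives the submitting shell but its stdout is not visible locally.
spark-submit --master yarn --deploy-mode cluster job.py
```

In client mode, if the driver process dies while executors keep running, YARN's view of the application can diverge from the job's actual progress, which matches the behavior discussed here.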

Re: YARN - Pyspark

2016-09-30 Thread ayan guha
I understand, thank you for the explanation. However, I ran it in yarn-client mode, submitted using nohup, and I could see the logs going into the log file throughout the life of the job. Everything worked well on the Spark side; YARN just reported success long before the job actually completed. I would love t