Also, I wanted to add that if I specify the conf on the command line, it seems to work.
For example, if I use:

spark-submit --master yarn --deploy-mode cluster --conf spark.yarn.queue=root.Application ayan_test.py 10

then it goes to the correct queue.

Any help would be great.

Best
Ayan

On Mon, Feb 27, 2017 at 11:52 AM, ayan guha <guha.a...@gmail.com> wrote:
> Hi
>
> I am facing an issue with cluster mode, with pyspark.
>
> Here is my code:
>
> conf = SparkConf()
> conf.setAppName("Spark Ingestion")
> conf.set("spark.yarn.queue","root.Applications")
> conf.set("spark.executor.instances","50")
> conf.set("spark.executor.memory","22g")
> conf.set("spark.yarn.executor.memoryOverhead","4096")
> conf.set("spark.executor.cores","4")
> conf.set("spark.sql.hive.convertMetastoreParquet", "false")
> sc = SparkContext(conf = conf)
> sqlContext = HiveContext(sc)
>
> r = sc.parallelize(xrange(1,10000))
> print r.count()
>
> sc.stop()
>
> The problem is that none of my config settings are passed on to YARN when I submit with:
>
> spark-submit --master yarn --deploy-mode cluster ayan_test.py
>
> I tried the same code with deploy-mode=client and all the configs are passed fine.
>
> Am I missing something? Will introducing --properties-file be of any help?
> Can anybody share some working example?
>
> Best
> Ayan
>
> --
> Best Regards,
> Ayan Guha

--
Best Regards,
Ayan Guha
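For reference, a minimal sketch of the --properties-file approach asked about above. The file name myjob.properties is an assumption, and the keys simply mirror the SparkConf settings from the quoted code. The likely cause of the behaviour is that in YARN cluster mode the driver itself runs inside the cluster, so submission-time settings such as spark.yarn.queue have to reach spark-submit via --conf or a properties file rather than being set in SparkConf inside the script.

# myjob.properties (hypothetical file name) -- same settings as the SparkConf calls in the quoted code
spark.yarn.queue                        root.Applications
spark.executor.instances                50
spark.executor.memory                   22g
spark.yarn.executor.memoryOverhead      4096
spark.executor.cores                    4
spark.sql.hive.convertMetastoreParquet  false

# Submit with the properties file instead of per-setting --conf flags:
spark-submit --master yarn --deploy-mode cluster --properties-file myjob.properties ayan_test.py 10

With this, the settings are already known when the YARN application is created, so they should behave the same in both client and cluster deploy modes.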