spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread jamborta
Hi all, I cannot figure out why this command is not setting the driver memory (the same call does set the executor memory): conf = (SparkConf().setMaster("yarn-client").setAppName("test").set("spark.driver.memory", "1G"))

Re: spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread Marcelo Vanzin
You can't set the driver memory programmatically in client mode. In that mode, the same JVM is running the driver, so you can't modify its command-line options anymore by the time the SparkContext is being initialized. (And you can't really start cluster-mode apps that way, so the only way to set this is through the command line.)
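A sketch of what Marcelo describes: in client mode the driver's heap must be fixed before its JVM starts, so the setting has to be passed to spark-submit rather than set in SparkConf. The script name app.py is a placeholder, not something from the thread:

```shell
# Set driver memory at launch time, before the driver JVM exists.
# (app.py is a placeholder for your PySpark script.)
spark-submit \
  --master yarn-client \
  --driver-memory 1g \
  --executor-memory 2g \
  app.py
```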

Re: spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread Tamas Jambor
Thanks Marcelo. What's the reason it is not possible in cluster mode either?

Re: spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread Marcelo Vanzin
Because that's not how you launch apps in cluster mode; you have to do it through the command line, or by calling the respective backend code directly to launch it. (That said, it would be nice to have a programmatic way of launching apps that handled all this; it has been brought up before.)

Re: spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread Tamas Jambor
When you say "calling the respective backend code directly to launch it" — I thought this was the way to do that. Thanks, Tamas

Re: spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread Marcelo Vanzin
No, you can't instantiate a SparkContext to start apps in cluster mode. For YARN, for example, you'd have to call directly into org.apache.spark.deploy.yarn.Client; that class tells the YARN cluster to launch the driver for you, and the driver then instantiates the SparkContext.
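For reference, the usual front end to the org.apache.spark.deploy.yarn.Client path Marcelo mentions is spark-submit in cluster mode, which asks YARN to start the driver inside the cluster with the requested heap. This is a sketch for a JVM application (com.example.MyApp and app.jar are placeholders; PySpark did not support yarn-cluster mode in 1.1.0):

```shell
# spark-submit calls into org.apache.spark.deploy.yarn.Client, which has
# YARN launch the driver remotely with --driver-memory applied.
spark-submit \
  --master yarn-cluster \
  --class com.example.MyApp \
  --driver-memory 1g \
  app.jar
```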

Re: spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread Andrew Or
Hi Tamas, Yes, Marcelo is right. The reason it doesn't make sense to set spark.driver.memory in your SparkConf is that your application code, by definition, *is* the driver. This means that by the time you get to the code that initializes your SparkConf, your driver JVM has already started with a fixed heap size.
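A sketch of the file-based alternative implied by Andrew's explanation: since the driver JVM's heap is fixed before any SparkConf code runs, the value has to come from somewhere the launcher reads before starting that JVM, such as conf/spark-defaults.conf (the values here are illustrative):

```properties
# conf/spark-defaults.conf -- read before the driver JVM is started
spark.driver.memory    1g
spark.executor.memory  2g
```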

Re: spark.driver.memory is not set (pyspark, 1.1.0)

2014-10-01 Thread Tamas Jambor
Thank you for the replies. It makes sense for Scala/Java, but in Python the JVM is launched when the SparkContext is initialized, so it should be able to set it, I assume.
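On the PySpark case Tamas raises: even though the Python process launches the JVM lazily, the launch scripts of that era built the JVM command line before any SparkConf code ran, so driver memory still had to be passed to the wrapper scripts. A sketch, assuming the 1.x wrappers that forward their flags to spark-submit (app.py is a placeholder):

```shell
# Pass driver memory to the launcher instead of setting it in SparkConf;
# the wrapper constructs the driver JVM's command line before Python runs
# any user code.
pyspark --driver-memory 1g

# Or, for a standalone script:
spark-submit --driver-memory 1g app.py
```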