Ah, great to know this is already being fixed. Thanks Patrick, I have marked my JIRA as a duplicate.
2014-08-07 21:42 GMT-07:00 Patrick Wendell <pwend...@gmail.com>:

Andrew - I think your JIRA may duplicate existing work:
https://github.com/apache/spark/pull/1513


On Thu, Aug 7, 2014 at 7:55 PM, Andrew Or <and...@databricks.com> wrote:

@Cody I took a quick glance at the Mesos code, and it appears that we currently do not even pass extra java options to executors except in coarse-grained mode, and even in this mode we do not pass them to executors correctly. I have filed a related JIRA here: https://issues.apache.org/jira/browse/SPARK-2921. This is a somewhat serious limitation, and we will try to fix it for 1.1.

-Andrew


2014-08-07 19:42 GMT-07:00 Andrew Or <and...@databricks.com>:

Thanks Marcelo, I have moved the changes to a new PR to describe the problems more clearly: https://github.com/apache/spark/pull/1845

@Gary Yeah, the goal is to get this into 1.1 as a bug fix.


2014-08-07 17:30 GMT-07:00 Gary Malouf <malouf.g...@gmail.com>:

Can this be cherry-picked for 1.1 if everything works out? In my opinion, it could be qualified as a bug fix.


On Thu, Aug 7, 2014 at 5:47 PM, Marcelo Vanzin <van...@cloudera.com> wrote:

Andrew has been working on a fix:
https://github.com/apache/spark/pull/1770

--
Marcelo


On Thu, Aug 7, 2014 at 2:35 PM, Cody Koeninger <c...@koeninger.org> wrote:

Just wanted to check in on this and see if I should file a bug report regarding the mesos argument propagation.


On Thu, Jul 31, 2014 at 8:35 AM, Cody Koeninger <c...@koeninger.org> wrote:

1. I've tried with and without escaping the equals sign; it doesn't affect the results.

2. Yeah, exporting SPARK_SUBMIT_OPTS from spark-env.sh works for getting system properties set in the local shell (although not for executors); see the snippet below this list.

3. We're using the default fine-grained mesos mode, not setting spark.mesos.coarse, so it doesn't seem immediately related to that ticket. Should I file a bug report?
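For concreteness, the workaround in item 2 amounts to a one-liner in conf/spark-env.sh; foo.bar.baz is just the illustrative property from earlier in the thread:

# conf/spark-env.sh
# bin/spark-class puts SPARK_SUBMIT_OPTS on the driver's java command line,
# so the property is visible in the local shell (but not on executors).
export SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23"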
On Thu, Jul 31, 2014 at 1:33 AM, Patrick Wendell <pwend...@gmail.com> wrote:

The third issue may be related to this:
https://issues.apache.org/jira/browse/SPARK-2022

We can take a look at this during the bug fix period for the 1.1 release next week. If we come up with a fix we can backport it into the 1.0 branch also.


On Wed, Jul 30, 2014 at 11:31 PM, Patrick Wendell <pwend...@gmail.com> wrote:

Thanks for digging around here. I think there are a few distinct issues:

1. Properties containing the '=' character need to be escaped. I was able to load properties fine as long as I escaped the '=' character. But maybe we should document this:

== spark-defaults.conf ==
spark.foo a\=B

== shell ==
scala> sc.getConf.get("spark.foo")
res2: String = a=B

2. spark.driver.extraJavaOptions, when set in the properties file, doesn't affect the driver when running in client mode (always the case for mesos), because the driver JVM has already started by the time the file is read. We should probably document this. In this case you need to either use --driver-java-options or set SPARK_SUBMIT_OPTS; see the example after this list.

3. Arguments aren't propagated on Mesos (this might be because of the other issues, or a separate bug).
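For item 2, both workarounds take effect at JVM launch time; something along these lines, with the same illustrative property as above:

# Passed straight through to the driver JVM by spark-submit:
$ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"

# Or via the environment, which bin/spark-class reads before starting the JVM:
$ SPARK_SUBMIT_OPTS="-Dfoo.bar.baz=23" ./bin/spark-shell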
- Patrick


On Wed, Jul 30, 2014 at 3:10 PM, Cody Koeninger <c...@koeninger.org> wrote:

In addition, spark.executor.extraJavaOptions does not seem to behave as I would expect; java arguments don't seem to be propagated to executors.

$ cat conf/spark-defaults.conf

spark.master mesos://zk://etl-01.mxstg:2181,etl-02.mxstg:2181,etl-03.mxstg:2181/masters
spark.executor.extraJavaOptions -Dfoo.bar.baz=23
spark.driver.extraJavaOptions -Dfoo.bar.baz=23

$ ./bin/spark-shell

scala> sc.getConf.get("spark.executor.extraJavaOptions")
res0: String = -Dfoo.bar.baz=23

scala> sc.parallelize(1 to 100).map{ i => (
     |   java.net.InetAddress.getLocalHost.getHostName,
     |   System.getProperty("foo.bar.baz")
     | )}.collect

res1: Array[(String, String)] = Array((dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-01.mxstg,null), (dn-02.mxstg,null), (dn-02.mxstg,null), ...

Note that this is a mesos deployment, although I wouldn't expect that to affect the availability of spark.driver.extraJavaOptions in a local spark shell.


On Wed, Jul 30, 2014 at 4:18 PM, Cody Koeninger <c...@koeninger.org> wrote:

Either whitespace or an equals sign is a valid properties-file separator. Here's an example:

$ cat conf/spark-defaults.conf
spark.driver.extraJavaOptions -Dfoo.bar.baz=23

$ ./bin/spark-shell -v
Using properties file: /opt/spark/conf/spark-defaults.conf
Adding default property: spark.driver.extraJavaOptions=-Dfoo.bar.baz=23

scala> System.getProperty("foo.bar.baz")
res0: String = null

If you add double quotes, the resulting string value will have double quotes:

$ cat conf/spark-defaults.conf
spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"

$ ./bin/spark-shell -v
Using properties file: /opt/spark/conf/spark-defaults.conf
Adding default property: spark.driver.extraJavaOptions="-Dfoo.bar.baz=23"

scala> System.getProperty("foo.bar.baz")
res0: String = null

Neither one of those affects the issue; the underlying problem in my case seems to be that bin/spark-class uses the SPARK_SUBMIT_OPTS and SPARK_JAVA_OPTS environment variables, but nothing parses spark-defaults.conf before the java process is started.

Here's the process running when only spark-defaults.conf is being used (note the absence of -Dfoo.bar.baz=23 on the java command line):

$ ps -ef | grep spark

514   5182  2058  0 21:05 pts/2  00:00:00 bash ./bin/spark-shell -v

514   5189  5182  4 21:05 pts/2  00:00:22 /usr/local/java/bin/java -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx -XX:MaxPermSize=128m -Djava.library.path= -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v --class org.apache.spark.repl.Main

And here it is when --driver-java-options is passed on the command line (and thus things work):

$ ps -ef | grep spark

514   5392  2058  0 21:15 pts/2  00:00:00 bash ./bin/spark-shell -v --driver-java-options -Dfoo.bar.baz=23

514   5399  5392 80 21:15 pts/2  00:00:06 /usr/local/java/bin/java -cp ::/opt/spark/conf:/opt/spark/lib/spark-assembly-1.0.1-hadoop2.3.0-mr1-cdh5.0.2.jar:/etc/hadoop/conf-mx -XX:MaxPermSize=128m -Dfoo.bar.baz=23 -Djava.library.path= -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit spark-shell -v --driver-java-options -Dfoo.bar.baz=23 --class org.apache.spark.repl.Main


On Wed, Jul 30, 2014 at 3:43 PM, Patrick Wendell <pwend...@gmail.com> wrote:

Cody - in your example you are using the '=' character, but in our documentation and tests we use whitespace to separate the key and value in the defaults file.

docs: http://spark.apache.org/docs/latest/configuration.html

spark.driver.extraJavaOptions -Dfoo.bar.baz=23

I'm not sure if the java properties file parser will try to interpret the equals sign. If so you might need to do this:

spark.driver.extraJavaOptions "-Dfoo.bar.baz=23"

Do those work for you?


On Wed, Jul 30, 2014 at 1:32 PM, Marcelo Vanzin <van...@cloudera.com> wrote:

Hi Cody,

Could you file a bug for this if there isn't one already?

For system properties, SparkSubmit should be able to read those settings and do the right thing, but that obviously won't work for other JVM options. The current code should work fine in cluster mode though, since the driver is a different process. :-)
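(The distinction being that a -D system property can still be set after the JVM is up, which is what SparkSubmit could do on the driver's behalf, whereas options like -Xmx only take effect on the java command line. A minimal illustration, using the same toy property as the rest of the thread:

$ ./bin/spark-shell

scala> System.getProperty("foo.bar.baz")
res0: String = null

scala> System.setProperty("foo.bar.baz", "23")
res1: String = null

scala> System.getProperty("foo.bar.baz")
res2: String = 23

There is no equivalent call for heap size or other JVM-level flags.)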
--
Marcelo


On Wed, Jul 30, 2014 at 1:12 PM, Cody Koeninger <c...@koeninger.org> wrote:

We were previously using SPARK_JAVA_OPTS to set java system properties via -D.

This was used for properties that varied on a per-deployment-environment basis, but needed to be available in the spark shell and workers.

On upgrading to 1.0, we saw that SPARK_JAVA_OPTS had been deprecated and replaced by spark-defaults.conf and command line arguments to spark-submit or spark-shell.

However, setting spark.driver.extraJavaOptions and spark.executor.extraJavaOptions in spark-defaults.conf is not a replacement for SPARK_JAVA_OPTS:

$ cat conf/spark-defaults.conf
spark.driver.extraJavaOptions=-Dfoo.bar.baz=23

$ ./bin/spark-shell

scala> System.getProperty("foo.bar.baz")
res0: String = null

$ ./bin/spark-shell --driver-java-options "-Dfoo.bar.baz=23"

scala> System.getProperty("foo.bar.baz")
res0: String = 23

Looking through the shell scripts for spark-submit and spark-class, I can see why this is; parsing spark-defaults.conf from bash could be brittle.

But from an ergonomic point of view, it's a step back to go from a set-it-and-forget-it configuration in spark-env.sh to requiring command line arguments.

I can solve this with an ad-hoc script to wrap spark-shell with the appropriate arguments (a rough sketch follows), but I wanted to bring the issue up to see if anyone else had run into it, or had any direction for a general solution (beyond parsing java properties files from bash).
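The kind of wrapper I mean is sketched below. It is illustrative only: it assumes the simple "key whitespace value" format, handles only the driver option, and ignores quoting and escaping edge cases (which is exactly why I'd rather not maintain it):

#!/usr/bin/env bash
# spark-shell-wrapper.sh (hypothetical): read spark.driver.extraJavaOptions
# out of spark-defaults.conf and pass it on the command line, since nothing
# in bin/spark-class does this before the JVM starts.
set -e
SPARK_HOME="${SPARK_HOME:-/opt/spark}"
CONF="$SPARK_HOME/conf/spark-defaults.conf"

# Take everything after the first run of whitespace as the value.
DRIVER_OPTS="$(awk '$1 == "spark.driver.extraJavaOptions" { $1 = ""; sub(/^ +/, ""); print }' "$CONF")"

if [ -n "$DRIVER_OPTS" ]; then
  exec "$SPARK_HOME/bin/spark-shell" --driver-java-options "$DRIVER_OPTS" "$@"
else
  exec "$SPARK_HOME/bin/spark-shell" "$@"
fi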