Patch here: https://github.com/apache/spark/pull/609
On Wed, Apr 30, 2014 at 2:26 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> Dean - our e-mails crossed, but thanks for the tip. Was independently
> arriving at your solution :)
>
> Okay I'll submit something.
>
> - Patrick
>
> On Wed, Apr 30, 2014 at 2:14 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>> Cool, that seems to work. Thanks!
>>
>> On Wed, Apr 30, 2014 at 2:09 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>> Marcelo - Mind trying the following diff locally? If it works I can
>>> send a patch:
>>>
>>> patrick@patrick-t430s:~/Documents/spark$ git diff bin/spark-submit
>>> diff --git a/bin/spark-submit b/bin/spark-submit
>>> index dd0d95d..49bc262 100755
>>> --- a/bin/spark-submit
>>> +++ b/bin/spark-submit
>>> @@ -18,7 +18,7 @@
>>>  #
>>>
>>>  export SPARK_HOME="$(cd `dirname $0`/..; pwd)"
>>> -ORIG_ARGS=$@
>>> +ORIG_ARGS=("$@")
>>>
>>>  while (($#)); do
>>>    if [ "$1" = "--deploy-mode" ]; then
>>> @@ -39,5 +39,5 @@ if [ ! -z $DRIVER_MEMORY ] && [ ! -z $DEPLOY_MODE ] && [ $DEPLOY_MODE = "client"
>>>    export SPARK_MEM=$DRIVER_MEMORY
>>>  fi
>>>
>>> -$SPARK_HOME/bin/spark-class org.apache.spark.deploy.SparkSubmit $ORIG_ARGS
>>> +$SPARK_HOME/bin/spark-class org.apache.spark.deploy.SparkSubmit "${ORIG_ARGS[@]}"
>>>
>>> On Wed, Apr 30, 2014 at 1:51 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>>> So I reproduced the problem here:
>>>>
>>>> == test.sh ==
>>>> #!/bin/bash
>>>> for x in "$@"; do
>>>>   echo "arg: $x"
>>>> done
>>>> ARGS_COPY=$@
>>>> for x in "$ARGS_COPY"; do
>>>>   echo "arg_copy: $x"
>>>> done
>>>> ==
>>>>
>>>> ./test.sh a b "c d e" f
>>>> arg: a
>>>> arg: b
>>>> arg: c d e
>>>> arg: f
>>>> arg_copy: a b c d e f
>>>>
>>>> I'll dig around a bit more and see if we can fix it. Pretty sure we
>>>> aren't passing these argument arrays around correctly in bash.
>>>>
>>>> On Wed, Apr 30, 2014 at 1:48 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>>>>> On Wed, Apr 30, 2014 at 1:41 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>>>>> Yeah I think the problem is that the spark-submit script doesn't pass
>>>>>> the argument array to spark-class in the right way, so any quoted
>>>>>> strings get flattened.
>>>>>>
>>>>>> I think we'll need to figure out how to do this correctly in the bash
>>>>>> script so that quoted strings get passed in the right way.
>>>>>
>>>>> I tried a few different approaches but finally ended up giving up; my
>>>>> bash-fu is apparently not strong enough. If you can make it work
>>>>> great, but I have "-J" working locally in case you give up like me.
>>>>> :-)
>>>>>
>>>>> --
>>>>> Marcelo
>>
>>
>>
>> --
>> Marcelo
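
For anyone reading this thread later, here is a minimal sketch of why the array copy in the diff keeps quoted arguments intact while the plain string copy does not. It is only an illustration, not code from the patch: count_args and demo are hypothetical helpers, and the string copy uses "$*", which under the default IFS gives the same space-joined string as the ORIG_ARGS=$@ assignment discussed above.

```shell
#!/bin/bash
# Illustration only: count_args and demo are hypothetical helpers,
# not part of bin/spark-submit.

# Print how many arguments the callee actually received.
count_args() {
  echo "$#"
}

demo() {
  local flat
  flat="$*"              # string copy: all arguments joined by spaces
  local -a kept=("$@")   # array copy: one element per original argument

  # Unquoted expansion re-splits the string on whitespace,
  # so "c d e" turns into three separate arguments.
  echo "flat_unquoted=$(count_args $flat)"
  # Quoted expansion passes the whole string as a single argument.
  echo "flat_quoted=$(count_args "$flat")"
  # Quoted array expansion restores the original argument boundaries.
  echo "array=$(count_args "${kept[@]}")"
}

demo a b "c d e" f
```

Run as written, this prints flat_unquoted=6, flat_quoted=1 and array=4. Only the array form hands the callee the same four arguments the script received, which is the behavior spark-class needs for quoted strings.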