Marcelo - Mind trying the following diff locally? If it works I can
send a patch:

patrick@patrick-t430s:~/Documents/spark$ git diff bin/spark-submit
diff --git a/bin/spark-submit b/bin/spark-submit
index dd0d95d..49bc262 100755
--- a/bin/spark-submit
+++ b/bin/spark-submit
@@ -18,7 +18,7 @@
 #

 export SPARK_HOME="$(cd `dirname $0`/..; pwd)"
-ORIG_ARGS=$@
+ORIG_ARGS=("$@")

 while (($#)); do
   if [ "$1" = "--deploy-mode" ]; then
@@ -39,5 +39,5 @@ if [ ! -z $DRIVER_MEMORY ] && [ ! -z $DEPLOY_MODE ] && [ $DEPLOY_MODE = "client"
   export SPARK_MEM=$DRIVER_MEMORY
 fi

-$SPARK_HOME/bin/spark-class org.apache.spark.deploy.SparkSubmit $ORIG_ARGS
+$SPARK_HOME/bin/spark-class org.apache.spark.deploy.SparkSubmit "${ORIG_ARGS[@]}"

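For reference, here is a minimal sketch of why the array form in the diff
should help. It is only an illustration: the function names (print_args,
copy_flat, copy_array) are made up for this example and are not part of
spark-submit. A scalar assignment from $@ joins the arguments into a single
string, while an array copy expanded as "${args_copy[@]}" keeps each
argument as its own word.

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; none of these exist in Spark.

# Print each argument on its own line, like the loop in test.sh.
print_args() {
  for x in "$@"; do
    echo "arg: $x"
  done
}

# Scalar copy: the assignment joins all arguments into one string,
# so the original word boundaries are lost.
copy_flat() {
  args_copy=$@
  print_args "$args_copy"
}

# Array copy: each argument stays a separate element, and the quoted
# "${args_copy[@]}" expansion passes them on unchanged.
copy_array() {
  args_copy=("$@")
  print_args "${args_copy[@]}"
}
```

With these definitions, copy_flat a b "c d e" f prints the single line
"arg: a b c d e f", while copy_array a b "c d e" f prints four lines and
keeps "c d e" together as one argument.
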
On Wed, Apr 30, 2014 at 1:51 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> So I reproduced the problem here:
>
> == test.sh ==
> #!/bin/bash
> for x in "$@"; do
>   echo "arg: $x"
> done
> ARGS_COPY=$@
> for x in "$ARGS_COPY"; do
>   echo "arg_copy: $x"
> done
> ==
>
> ./test.sh a b "c d e" f
> arg: a
> arg: b
> arg: c d e
> arg: f
> arg_copy: a b c d e f
>
> I'll dig around a bit more and see if we can fix it. Pretty sure we
> aren't passing these argument arrays around correctly in bash.
>
> On Wed, Apr 30, 2014 at 1:48 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>> On Wed, Apr 30, 2014 at 1:41 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>> Yeah I think the problem is that the spark-submit script doesn't pass
>>> the argument array to spark-class in the right way, so any quoted
>>> strings get flattened.
>>>
>>> I think we'll need to figure out how to do this correctly in the bash
>>> script so that quoted strings get passed in the right way.
>>
>> I tried a few different approaches but finally ended up giving up; my
>> bash-fu is apparently not strong enough. If you can make it work
>> great, but I have "-J" working locally in case you give up like me.
>> :-)
>>
>> --
>> Marcelo
