[ 
https://issues.apache.org/jira/browse/SPARK-40738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phil Walker updated SPARK-40738:
--------------------------------
    External issue ID: 38228  (was: 38167)

> spark-shell fails with "bad array subscript" in cygwin or msys bash session
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-40738
>                 URL: https://issues.apache.org/jira/browse/SPARK-40738
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, Windows
>    Affects Versions: 3.3.0
>         Environment: The problem occurs in Windows if *_spark-shell_* is 
> called from a bash session.
> NOTE: the fix also applies to _*spark-submit*_ and {_}*beeline*{_}, since 
> they also go through _*spark-class*_.
>            Reporter: Phil Walker
>            Priority: Major
>              Labels: bash, cygwin, mingw, msys2, windows
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> A Spark pull request [spark PR|https://github.com/apache/spark/pull/38167] 
> fixes this issue, along with a build error that affects *sbt* builds run from 
> _*cygwin*_ and *msys/mingw* bash sessions.
> If a Windows user tries to start a *_spark-shell_* session by calling the 
> bash script (rather than the *_spark-shell.cmd_* script), it fails with a 
> confusing error message. The _*spark-class*_ script calls 
> _*launcher/src/main/java/org/apache/spark/launcher/Main.java*_ to generate 
> the command line, but the launcher produces output in the format expected by 
> the *_.cmd_* version of the script rather than the _*bash*_ version.
> The launcher's Main method interleaves NULL characters between the command 
> line arguments when called on any OS other than Windows. It should also do so 
> on Windows when invoked from the bash script, but it incorrectly assumes that 
> if the OS is Windows it is being called by the .cmd version of the script.
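>
> For illustration, here is a simplified, self-contained sketch of that output 
> logic; the class name and argument values are illustrative approximations of 
> what _*launcher/Main*_ does, not a verbatim copy of the source:
>
> {code:java}
> import java.util.Arrays;
> import java.util.List;
>
> // Illustrative sketch only: the real logic lives in
> // org.apache.spark.launcher.Main. It shows why output meant for the .cmd
> // script cannot be parsed by the bash script, which expects NULL-delimited
> // arguments.
> public class LauncherOutputSketch {
>   public static void main(String[] args) {
>     List<String> cmd = Arrays.asList(
>         "java", "-cp", "<classpath>", "org.apache.spark.repl.Main");
>     boolean isWindows =
>         System.getProperty("os.name").toLowerCase().startsWith("windows");
>     if (isWindows) {
>       // .cmd scripts expect the whole command on a single line.
>       System.out.println(String.join(" ", cmd));
>     } else {
>       // bash scripts expect each argument followed by a NULL ('\0') byte,
>       // which spark-class reads back into an array.
>       for (String c : cmd) {
>         System.out.print(c);
>         System.out.print('\0');
>       }
>     }
>   }
> }
> {code}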
> The resulting error message is unhelpful (the bash script finds no NULL 
> separators, so its command array stays empty and the final array subscript is 
> negative):
>  
> {code:java}
> [lots of ugly stuff omitted]
> /opt/spark/bin/spark-class: line 100: CMD: bad array subscript
> {code}
> The key to _*launcher/Main*_ knowing that a request comes from a _*bash*_ 
> session is that the _*SHELL*_ environment variable is set. It is normally set 
> in any of the various Windows bash environments ({_}*cygwin*{_}, 
> {_}*mingw64*{_}, {_}*msys2*{_}, etc.) and is not normally set in a native 
> Windows shell such as cmd.exe or PowerShell. In the _*spark-class.cmd*_ 
> script, _*SHELL*_ is intentionally unset to avoid problems, and to permit 
> bash users to call the _*.cmd*_ scripts if they prefer (they still work as 
> before).
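>
> A minimal sketch of the SHELL-based detection described above (the helper 
> name _*isBashCaller*_ is hypothetical; the pull request may structure the 
> check differently):
>
> {code:java}
> // Hypothetical helper: treat the caller as a bash session whenever SHELL is
> // set, even on Windows. spark-class.cmd unsets SHELL, so cmd.exe callers
> // (including bash users who prefer the .cmd scripts) keep the old behaviour.
> static boolean isBashCaller() {
>   String shell = System.getenv("SHELL");
>   return shell != null && !shell.isEmpty();
> }
>
> // Usage inside the launcher's output branch:
> //   if (isWindows() && !isBashCaller()) { emit the single-line .cmd format }
> //   else                                { emit the NULL-separated bash format }
> {code}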
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
