This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 880d9bb3fcb [SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts
880d9bb3fcb is described below

commit 880d9bb3fcb69001512886496f2988ed17cc4c50
Author: Phil <philwa...@gmail.com>
AuthorDate: Mon Oct 24 08:28:54 2022 -0500

    [SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts
    
    This fixes two problems that affect development in a Windows shell environment, such as `cygwin` or `msys2`.
    
    ### The fixed build error
    Running `./build/sbt packageBin` from a Windows cygwin `bash` session fails.
    
    This occurs if `WSL` is installed, because `project\SparkBuild.scala` creates a `bash` process, but `WSL bash` is called even though `cygwin bash` appears earlier in the `PATH`. In addition, file path arguments to `bash` contain backslashes. The fix is to ensure that the correct `bash` is called, and that arguments passed to `bash` use forward slashes rather than backslashes, as sketched below.
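    
    A condensed sketch of that approach, lifted from the `project/SparkBuild.scala` change in the diff below (`where` is the Windows counterpart of `which`):
    
    ```scala
    import scala.sys.process.Process
    
    // `uname` reports e.g. "CYGWIN_NT-10.0" or "MINGW64_NT-10.0"; keep only the
    // leading alphanumeric run, lowercased: "cygwin", "mingw64", ...
    def buildenv: String =
      Process(Seq("uname")).!!.trim.replaceFirst("[^A-Za-z0-9].*", "").toLowerCase
    
    // First `bash` on the Windows PATH (so the cygwin/msys2 bash wins over the
    // WSL launcher), with backslashes normalized to forward slashes.
    def bashpath: String =
      Process(Seq("where", "bash")).!!.split("[\r\n]+").head.replace('\\', '/')
    ```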
    
    ### The build error message:
    ```bash
     ./build/sbt packageBin
    ```
    <pre>
    [info] compiling 9 Java sources to C:\Users\philwalk\workspace\spark\common\sketch\target\scala-2.12\classes ...
    /bin/bash: C:Usersphilwalkworkspacesparkcore/../build/spark-build-info: No such file or directory
    [info] compiling 1 Scala source to C:\Users\philwalk\workspace\spark\tools\target\scala-2.12\classes ...
    [info] compiling 5 Scala sources to C:\Users\philwalk\workspace\spark\mllib-local\target\scala-2.12\classes ...
    [info] Compiling 5 protobuf files to C:\Users\philwalk\workspace\spark\connector\connect\target\scala-2.12\src_managed\main
    [error] stack trace is suppressed; run last core / Compile / managedResources for the full output
    [error] (core / Compile / managedResources) Nonzero exit value: 127
    [error] Total time: 42 s, completed Oct 8, 2022, 4:49:12 PM
    sbt:spark-parent>
    sbt:spark-parent> last core /Compile /managedResources
    last core /Compile /managedResources
    [error] java.lang.RuntimeException: Nonzero exit value: 127
    [error]         at scala.sys.package$.error(package.scala:30)
    [error]         at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:138)
    [error]         at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:108)
    [error]         at Core$.$anonfun$settings$4(SparkBuild.scala:604)
    [error]         at scala.Function1.$anonfun$compose$1(Function1.scala:49)
    [error]         at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
    [error]         at sbt.std.Transform$$anon$4.work(Transform.scala:68)
    [error]         at sbt.Execute.$anonfun$submit$2(Execute.scala:282)
    [error]         at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:23)
    [error]         at sbt.Execute.work(Execute.scala:291)
    [error]         at sbt.Execute.$anonfun$submit$1(Execute.scala:282)
    [error]         at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
    [error]         at sbt.CompletionService$$anon$2.call(CompletionService.scala:64)
    [error]         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    [error]         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    [error]         at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    [error]         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    [error]         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    [error]         at java.base/java.lang.Thread.run(Thread.java:834)
    [error] (core / Compile / managedResources) Nonzero exit value: 127
    </pre>
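    
    (Exit value 127 is bash's "command not found" status, and the mangled path in the message, `C:Usersphilwalkworkspacesparkcore`, shows why: the backslashes of the Windows path were swallowed as escape characters on the way to the wrong `bash`.)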
    
    ### bash scripts fail when run from `cygwin` or `msys2`
    The other problem fixed by this PR prevented the `bash` scripts (`spark-shell`, `spark-submit`, etc.) from being used in Windows `SHELL` environments. The bash version of `spark-class` fails in a Windows shell environment because `launcher/src/main/java/org/apache/spark/launcher/Main.java` does not follow the output convention expected by `spark-class` and also appends CR to line endings. The resulting error message is not helpful.
    
    There are two parts to this fix (both sketched after this list):
    1. modify `Main.java` to treat a `SHELL` session on Windows as a `bash` session
    2. remove the appended CR character when parsing the output produced by `Main.java`
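    
    The gist of both parts, rendered here in Scala for illustration (the names are illustrative; the actual fixes are the Java and bash changes in the diff below):
    
    ```scala
    // Part 1: a non-empty SHELL on Windows marks a bash-like session, so the
    // launcher emits the NUL-separated bash command format instead of the
    // cmd.exe format (in Main.java: !isEmpty(System.getenv("SHELL"))).
    val isWindows = sys.props("os.name").toLowerCase.startsWith("windows")
    val shellFlag = sys.env.get("SHELL").exists(_.nonEmpty)
    val useBashFormat = !isWindows || shellFlag
    
    // Part 2: strip a trailing CR from each command token, mirroring
    // c.replaceFirst("\r$", "") in Main.java and ${_ARG//$'\r'} in bin/spark-class.
    def stripTrailingCr(token: String): String = token.replaceFirst("\r$", "")
    ```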
    
    ### Does this PR introduce _any_ user-facing change?
    
    These changes should NOT affect anyone who is not trying to build or run bash scripts from a Windows SHELL environment.
    
    ### How was this patch tested?
    Manual tests were performed to verify both changes.
    
    ### related JIRA issues
    The following two JIRA issues were created; both are fixed by and linked to this PR.
    
    - Bug SPARK-40739 "sbt packageBin" fails in cygwin or other windows bash session
    - Bug SPARK-40738 spark-shell fails with "bad array"
    
    Closes #38228 from philwalk/windows-shell-env-fixes.
    
    Authored-by: Phil <philwa...@gmail.com>
    Signed-off-by: Sean Owen <sro...@gmail.com>
---
 bin/spark-class                                            | 3 ++-
 bin/spark-class2.cmd                                       | 2 ++
 build/spark-build-info                                     | 2 +-
 launcher/src/main/java/org/apache/spark/launcher/Main.java | 6 ++++--
 project/SparkBuild.scala                                   | 9 ++++++++-
 5 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/bin/spark-class b/bin/spark-class
index c1461a77122..fc343ca29fd 100755
--- a/bin/spark-class
+++ b/bin/spark-class
@@ -77,7 +77,8 @@ set +o posix
 CMD=()
 DELIM=$'\n'
 CMD_START_FLAG="false"
-while IFS= read -d "$DELIM" -r ARG; do
+while IFS= read -d "$DELIM" -r _ARG; do
+  ARG=${_ARG//$'\r'}
   if [ "$CMD_START_FLAG" == "true" ]; then
     CMD+=("$ARG")
   else
diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 68b271d1d05..800ec0c02c2 100755
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -69,6 +69,8 @@ rem SPARK-28302: %RANDOM% would return the same number if we call it instantly a
 rem so we should make it sure to generate unique file to avoid process collision of writing into
 rem the same file concurrently.
 if exist %LAUNCHER_OUTPUT% goto :gen
+rem unset SHELL to indicate non-bash environment to launcher/Main
+set SHELL=
 "%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main %* 
> %LAUNCHER_OUTPUT%
 for /f "tokens=*" %%i in (%LAUNCHER_OUTPUT%) do (
   set SPARK_CMD=%%i
diff --git a/build/spark-build-info b/build/spark-build-info
index eb0e3d730e2..26157e8cf8c 100755
--- a/build/spark-build-info
+++ b/build/spark-build-info
@@ -24,7 +24,7 @@
 
 RESOURCE_DIR="$1"
 mkdir -p "$RESOURCE_DIR"
-SPARK_BUILD_INFO="${RESOURCE_DIR}"/spark-version-info.properties
+SPARK_BUILD_INFO="${RESOURCE_DIR%/}"/spark-version-info.properties
 
 echo_build_properties() {
   echo version=$1
diff --git a/launcher/src/main/java/org/apache/spark/launcher/Main.java b/launcher/src/main/java/org/apache/spark/launcher/Main.java
index e1054c7060f..6501fc1764c 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/Main.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/Main.java
@@ -87,7 +87,9 @@ class Main {
       cmd = buildCommand(builder, env, printLaunchCommand);
     }
 
-    if (isWindows()) {
+    // test for shell environments, to enable non-Windows treatment of command line prep
+    boolean shellflag = !isEmpty(System.getenv("SHELL"));
+    if (isWindows() && !shellflag) {
       System.out.println(prepareWindowsCommand(cmd, env));
     } else {
       // A sequence of NULL character and newline separates command-strings and others.
@@ -96,7 +98,7 @@ class Main {
       // In bash, use NULL as the arg separator since it cannot be used in an argument.
       List<String> bashCmd = prepareBashCommand(cmd, env);
       for (String c : bashCmd) {
-        System.out.print(c);
+        System.out.print(c.replaceFirst("\r$",""));
         System.out.print('\0');
       }
     }
diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala
index cc103e4ab00..33883a2efaa 100644
--- a/project/SparkBuild.scala
+++ b/project/SparkBuild.scala
@@ -599,11 +599,18 @@ object SparkParallelTestGrouping {
 
 object Core {
   import scala.sys.process.Process
+  def buildenv = Process(Seq("uname")).!!.trim.replaceFirst("[^A-Za-z0-9].*", "").toLowerCase
+  def bashpath = Process(Seq("where", "bash")).!!.split("[\r\n]+").head.replace('\\', '/')
   lazy val settings = Seq(
     (Compile / resourceGenerators) += Def.task {
       val buildScript = baseDirectory.value + "/../build/spark-build-info"
       val targetDir = baseDirectory.value + "/target/extra-resources/"
-      val command = Seq("bash", buildScript, targetDir, version.value)
+      // support Windows build under cygwin/mingw64, etc
+      val bash = buildenv match {
+        case "cygwin" | "msys2" | "mingw64" | "clang64" => bashpath
+        case _ => "bash"
+      }
+      val command = Seq(bash, buildScript, targetDir, version.value)
       Process(command).!!
       val propsFile = baseDirectory.value / "target" / "extra-resources" / "spark-version-info.properties"
       Seq(propsFile)

