This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 880d9bb3fcb  [SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts
880d9bb3fcb is described below

commit 880d9bb3fcb69001512886496f2988ed17cc4c50
Author: Phil <philwa...@gmail.com>
AuthorDate: Mon Oct 24 08:28:54 2022 -0500

[SPARK-40739][SPARK-40738] Fixes for cygwin/msys2/mingw sbt build and bash scripts

This fixes two problems that affect development in a Windows shell environment, such as `cygwin` or `msys2`.

### The fixed build error

Running `./build/sbt packageBin` from a Windows cygwin `bash` session fails. This occurs if `WSL` is installed, because `project\SparkBuild.scala` creates a `bash` process, but `WSL bash` is called even though `cygwin bash` appears earlier in the `PATH`. In addition, file path arguments passed to bash contain backslashes. The fix is to ensure that the correct `bash` is called, and that path arguments passed to `bash` use slashes rather than backslashes.

### The build error message:

```bash
./build/sbt packageBin
```
<pre>
[info] compiling 9 Java sources to C:\Users\philwalk\workspace\spark\common\sketch\target\scala-2.12\classes ...
/bin/bash: C:Usersphilwalkworkspacesparkcore/../build/spark-build-info: No such file or directory
[info] compiling 1 Scala source to C:\Users\philwalk\workspace\spark\tools\target\scala-2.12\classes ...
[info] compiling 5 Scala sources to C:\Users\philwalk\workspace\spark\mllib-local\target\scala-2.12\classes ...
[info] Compiling 5 protobuf files to C:\Users\philwalk\workspace\spark\connector\connect\target\scala-2.12\src_managed\main
[error] stack trace is suppressed; run last core / Compile / managedResources for the full output
[error] (core / Compile / managedResources) Nonzero exit value: 127
[error] Total time: 42 s, completed Oct 8, 2022, 4:49:12 PM
sbt:spark-parent>
sbt:spark-parent> last core /Compile /managedResources
last core /Compile /managedResources
[error] java.lang.RuntimeException: Nonzero exit value: 127
[error]   at scala.sys.package$.error(package.scala:30)
[error]   at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.slurp(ProcessBuilderImpl.scala:138)
[error]   at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.$bang$bang(ProcessBuilderImpl.scala:108)
[error]   at Core$.$anonfun$settings$4(SparkBuild.scala:604)
[error]   at scala.Function1.$anonfun$compose$1(Function1.scala:49)
[error]   at sbt.internal.util.$tilde$greater.$anonfun$$u2219$1(TypeFunctions.scala:62)
[error]   at sbt.std.Transform$$anon$4.work(Transform.scala:68)
[error]   at sbt.Execute.$anonfun$submit$2(Execute.scala:282)
[error]   at sbt.internal.util.ErrorHandling$.wideConvert(ErrorHandling.scala:23)
[error]   at sbt.Execute.work(Execute.scala:291)
[error]   at sbt.Execute.$anonfun$submit$1(Execute.scala:282)
[error]   at sbt.ConcurrentRestrictions$$anon$4.$anonfun$submitValid$1(ConcurrentRestrictions.scala:265)
[error]   at sbt.CompletionService$$anon$2.call(CompletionService.scala:64)
[error]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error]   at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
[error]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[error]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[error]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[error]   at java.base/java.lang.Thread.run(Thread.java:834)
[error] (core / Compile / managedResources) Nonzero exit value: 127
</pre>

### bash scripts fail when run from `cygwin` or `msys2`

The other problem fixed by this PR prevented the `bash` scripts (`spark-shell`, `spark-submit`, etc.) from being used in Windows `SHELL` environments. The bash version of `spark-class` fails in a Windows shell environment because `launcher/src/main/java/org/apache/spark/launcher/Main.java` does not follow the convention expected by `spark-class`, and also appends CR to line endings. The resulting error message is not helpful. There are two parts to this fix:
1. modify `Main.java` to treat a `SHELL` session on Windows as a `bash` session
2. remove the appended CR character when parsing the output produced by `Main.java`

### Does this PR introduce _any_ user-facing change?

These changes should NOT affect anyone who is not trying to build or run bash scripts from a Windows SHELL environment.

### How was this patch tested?

Manual tests were performed to verify both changes.

### Related JIRA issues

The following two JIRA issues were created. Both are fixed by this PR, and both are linked to it.
- Bug SPARK-40739: "sbt packageBin" fails in cygwin or other windows bash session
- Bug SPARK-40738: spark-shell fails with "bad array"

Closes #38228 from philwalk/windows-shell-env-fixes.
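As a standalone illustration of the bash-side fixes described above, the sketch below simulates a launcher that emits NUL-separated arguments with trailing CR characters, and parses them the way the patched `spark-class` does. This is not the actual `spark-class` code; the temp file and the sample arguments are invented for the example, and the delimiter handling is simplified.

```shell
#!/usr/bin/env bash
# Sketch only: parse NUL-separated launcher output that may carry stray
# CR characters when produced by a Windows-built JVM.

out="$(mktemp)"
# Simulated launcher output: two arguments, each ending in CR, NUL-separated.
printf 'java\r\0-Xmx1g\r\0' > "$out"

CMD=()
while IFS= read -d '' -r _ARG; do
  ARG=${_ARG//$'\r'}   # strip CR characters, as the spark-class fix does
  CMD+=("$ARG")
done < "$out"
rm -f "$out"

# The spark-build-info fix relies on a related parameter expansion:
# "${var%/}" removes one trailing slash, so the generated path
# contains no doubled "//" separator.
RESOURCE_DIR="target/extra-resources/"
SPARK_BUILD_INFO="${RESOURCE_DIR%/}"/spark-version-info.properties

printf '%s\n' "${CMD[@]}" "$SPARK_BUILD_INFO"
```

Reading from a file (rather than a pipeline) keeps the `CMD` array in the current shell, so it is usable after the loop.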
Authored-by: Phil <philwa...@gmail.com>
Signed-off-by: Sean Owen <sro...@gmail.com>
---
 bin/spark-class                                            | 3 ++-
 bin/spark-class2.cmd                                       | 2 ++
 build/spark-build-info                                     | 2 +-
 launcher/src/main/java/org/apache/spark/launcher/Main.java | 6 ++++--
 project/SparkBuild.scala                                   | 9 ++++++++-
 5 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/bin/spark-class b/bin/spark-class
index c1461a77122..fc343ca29fd 100755
--- a/bin/spark-class
+++ b/bin/spark-class
@@ -77,7 +77,8 @@ set +o posix
 CMD=()
 DELIM=$'\n'
 CMD_START_FLAG="false"
-while IFS= read -d "$DELIM" -r ARG; do
+while IFS= read -d "$DELIM" -r _ARG; do
+  ARG=${_ARG//$'\r'}
   if [ "$CMD_START_FLAG" == "true" ]; then
     CMD+=("$ARG")
   else
diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 68b271d1d05..800ec0c02c2 100755
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -69,6 +69,8 @@ rem SPARK-28302: %RANDOM% would return the same number if we call it instantly a
 rem so we should make it sure to generate unique file to avoid process collision of writing into
 rem the same file concurrently.
 if exist %LAUNCHER_OUTPUT% goto :gen
+rem unset SHELL to indicate non-bash environment to launcher/Main
+set SHELL=
 "%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main %* > %LAUNCHER_OUTPUT%
 for /f "tokens=*" %%i in (%LAUNCHER_OUTPUT%) do (
   set SPARK_CMD=%%i
diff --git a/build/spark-build-info b/build/spark-build-info
index eb0e3d730e2..26157e8cf8c 100755
--- a/build/spark-build-info
+++ b/build/spark-build-info
@@ -24,7 +24,7 @@ RESOURCE_DIR="$1"
 
 mkdir -p "$RESOURCE_DIR"
 
-SPARK_BUILD_INFO="${RESOURCE_DIR}"/spark-version-info.properties
+SPARK_BUILD_INFO="${RESOURCE_DIR%/}"/spark-version-info.properties
 
 echo_build_properties() {
   echo version=$1
diff --git a/launcher/src/main/java/org/apache/spark/launcher/Main.java b/launcher/src/main/java/org/apache/spark/launcher/Main.java
index e1054c7060f..6501fc1764c 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/Main.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/Main.java
@@ -87,7 +87,9 @@ class Main {
       cmd = buildCommand(builder, env, printLaunchCommand);
     }
 
-    if (isWindows()) {
+    // test for shell environments, to enable non-Windows treatment of command line prep
+    boolean shellflag = !isEmpty(System.getenv("SHELL"));
+    if (isWindows() && !shellflag) {
       System.out.println(prepareWindowsCommand(cmd, env));
     } else {
       // A sequence of NULL character and newline separates command-strings and others.
@@ -96,7 +98,7 @@ class Main {
       // In bash, use NULL as the arg separator since it cannot be used in an argument.
       List<String> bashCmd = prepareBashCommand(cmd, env);
       for (String c : bashCmd) {
-        System.out.print(c);
+        System.out.print(c.replaceFirst("\r$",""));
         System.out.print('\0');
       }
     }
diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala
index cc103e4ab00..33883a2efaa 100644
--- a/project/SparkBuild.scala
+++ b/project/SparkBuild.scala
@@ -599,11 +599,18 @@ object SparkParallelTestGrouping {
 
 object Core {
   import scala.sys.process.Process
+  def buildenv = Process(Seq("uname")).!!.trim.replaceFirst("[^A-Za-z0-9].*", "").toLowerCase
+  def bashpath = Process(Seq("where", "bash")).!!.split("[\r\n]+").head.replace('\\', '/')
   lazy val settings = Seq(
     (Compile / resourceGenerators) += Def.task {
       val buildScript = baseDirectory.value + "/../build/spark-build-info"
       val targetDir = baseDirectory.value + "/target/extra-resources/"
-      val command = Seq("bash", buildScript, targetDir, version.value)
+      // support Windows build under cygwin/mingw64, etc
+      val bash = buildenv match {
+        case "cygwin" | "msys2" | "mingw64" | "clang64" => bashpath
+        case _ => "bash"
+      }
+      val command = Seq(bash, buildScript, targetDir, version.value)
       Process(command).!!
       val propsFile = baseDirectory.value / "target" / "extra-resources" / "spark-version-info.properties"
       Seq(propsFile)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org