This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 55f92a3  [SPARK-28302][CORE] Make sure to generate unique output file for SparkLauncher on Windows
55f92a3 is described below

commit 55f92a31d7c1a6f02a9b0fc2ace6c5a5e0871ec4
Author: wuyi <ngone_5...@163.com>
AuthorDate: Tue Jul 9 15:49:31 2019 +0900

    [SPARK-28302][CORE] Make sure to generate unique output file for SparkLauncher on Windows
    
    ## What changes were proposed in this pull request?
    
    When using SparkLauncher to submit applications **concurrently** from multiple threads on **Windows**, some apps fail with "The process cannot access the file because it is being used by another process" and remain in the LOST state at the end. The issue can be reproduced with this [demo](https://issues.apache.org/jira/secure/attachment/12973920/Main.scala).
    
    After digging into the code, I found that the Windows cmd `%RANDOM%` variable returns the same number if it is read again shortly (e.g. < 500ms) after the previous call. As a result, SparkLauncher gets the same output file (`spark-class-launcher-output-%RANDOM%.txt`) for several apps. A later app then hits the issue when it tries to write to a file that has already been opened for writing by another app.
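
The collision described above can be illustrated with a small simulation. This is only a sketch under an assumption: it supposes that each freshly started shell seeds its generator from a coarse clock reading, which would explain the behaviour reported here but is not something this commit verifies. The function name `launcher_output_name` and the value of `same_tick` are hypothetical; 32767 is the upper bound that cmd documents for `%RANDOM%`.

```python
import random


def launcher_output_name(seed: int) -> str:
    """Mimic one freshly started shell expanding its first %RANDOM% value.

    Assumption: the shell seeds its generator from a coarse clock reading,
    so `seed` stands in for that reading.
    """
    rng = random.Random(seed)
    return f"spark-class-launcher-output-{rng.randint(0, 32767)}.txt"


# Hypothetical clock reading observed by two launches started close together.
same_tick = 1562654971

first = launcher_output_name(same_tick)
second = launcher_output_name(same_tick)

# Same seed, same "random" number: both launches target one output file.
print(first == second)
```

Under that assumption, two launches in the same clock tick always pick the same name, which matches the "being used by another process" failure when the second app opens the file for writing.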
    
    We should make sure to generate a unique output file for SparkLauncher on Windows to avoid this issue.
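
For comparison, here is a minimal sketch of one way to reserve a unique file name, written in Python rather than batch. It is not what this patch does: the patch re-generates the name in `spark-class2.cmd` while a file with that name already exists, whereas the sketch below swaps in exclusive file creation so that the existence check and the creation happen as one step. The function name `reserve_launcher_output` and the `attempts` parameter are hypothetical.

```python
import os
import random
import tempfile


def reserve_launcher_output(directory: str, attempts: int = 100) -> str:
    """Return a path that this call created, retrying on name collisions."""
    for _ in range(attempts):
        name = f"spark-class-launcher-output-{random.randint(0, 32767)}.txt"
        path = os.path.join(directory, name)
        try:
            # O_EXCL makes creation fail if the file already exists, so the
            # existence check and the creation are a single operation.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue  # another launcher owns this name; pick a new one
        os.close(fd)
        return path
    raise RuntimeError("could not reserve a unique launcher output file")


# Usage: reserve several names in a scratch directory; none may repeat.
with tempfile.TemporaryDirectory() as tmp:
    paths = [reserve_launcher_output(tmp) for _ in range(20)]
    assert len(set(paths)) == len(paths)
```

The design difference is that a name is only returned to the caller that actually created the file, so two concurrent callers cannot both believe they own the same name.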
    
    ## How was this patch tested?
    
    Tested manually on Windows.
    
    Closes #25076 from Ngone51/SPARK-28302.
    
    Authored-by: wuyi <ngone_5...@163.com>
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
    (cherry picked from commit 925f620570a022ff8229bfde076e7dde6bf242df)
    Signed-off-by: HyukjinKwon <gurwls...@apache.org>
---
 bin/spark-class2.cmd | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/bin/spark-class2.cmd b/bin/spark-class2.cmd
index 5da7d7a..34d04c9 100644
--- a/bin/spark-class2.cmd
+++ b/bin/spark-class2.cmd
@@ -63,7 +63,12 @@ if not "x%JAVA_HOME%"=="x" (
 
 rem The launcher library prints the command to be executed in a single line suitable for being
 rem executed by the batch interpreter. So read all the output of the launcher into a variable.
+:gen
 set LAUNCHER_OUTPUT=%temp%\spark-class-launcher-output-%RANDOM%.txt
+rem SPARK-28302: %RANDOM% would return the same number if we call it instantly after last call,
+rem so we should make it sure to generate unique file to avoid process collision of writing into
+rem the same file concurrently.
+if exist %LAUNCHER_OUTPUT% goto :gen
 "%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main %* > %LAUNCHER_OUTPUT%
 for /f "tokens=*" %%i in (%LAUNCHER_OUTPUT%) do (
   set SPARK_CMD=%%i


