[jira] [Commented] (SPARK-23015) spark-submit fails when submitting several jobs in parallel

Kevin Grealish (Jira) Mon, 16 Dec 2019 16:06:57 -0800


    [ 
https://issues.apache.org/jira/browse/SPARK-23015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997724#comment-16997724
 ]


Kevin Grealish commented on SPARK-23015:
----------------------------------------

Here is something that may help craft a complete solution that you can apply 
in the Spark scripts. It uses VBScript to create a GUID and assign it to an 
environment variable. It depends on cscript, which has been part of Windows 
since Windows 95. Change the two %%i to just %i to run it outside a batch 
program. Instead of writing a temporary .vbs file, just ship the file with the 
scripts that now use %RANDOM%.

echo WScript.StdOut.WriteLine Mid(CreateObject("Scriptlet.TypeLib").GUID, 2, 36) > %TEMP%\uuid.vbs
for /f %%i in ('cscript //NoLogo %TEMP%\uuid.vbs') do @set UUID=%%i
echo made a UUID: %UUID%

This code will itself collide when several processes write uuid.vbs at once, 
so instead a .vbs file (say, called makeuuid.vbs) should be added to the 
scripts.
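
As a rough way to gauge how often file names built from %RANDOM% would clash, 
the sketch below computes the birthday-problem probability that at least two 
of N simultaneous jobs draw the same value. It assumes each job draws 
independently and uniformly from the 32768 values (0 to 32767) mentioned in 
the issue description; that is a modelling assumption, not a statement about 
how cmd.exe actually seeds %RANDOM%. The function and constant names are made 
up for illustration.

```python
# Assumption: %RANDOM% behaves like an independent, uniform draw from 0..32767.
RANDOM_VALUES = 32768


def collision_probability(jobs: int, values: int = RANDOM_VALUES) -> float:
    """Probability that at least two of `jobs` uniform draws coincide."""
    if jobs > values:
        return 1.0  # pigeonhole: more jobs than distinct names
    p_unique = 1.0
    for k in range(jobs):
        # The (k+1)-th job must avoid the k names already taken.
        p_unique *= (values - k) / values
    return 1.0 - p_unique


if __name__ == "__main__":
    for jobs in (2, 10, 50, 200):
        print(f"{jobs:>4} jobs: {collision_probability(jobs):.4%}")
```

If collisions show up in practice more often than this model predicts, the 
independence assumption probably does not hold for processes started at 
nearly the same time, which would be one more argument for a GUID-based name.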

> spark-submit fails when submitting several jobs in parallel
> -----------------------------------------------------------
>
>                 Key: SPARK-23015
>                 URL: https://issues.apache.org/jira/browse/SPARK-23015
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1
>         Environment: Windows 10 (1709/16299.125)
> Spark 2.3.0
> Java 8, Update 151
>            Reporter: Hugh Zabriskie
>            Priority: Major
>
> Spark Submit's launching library prints the command to execute the launcher 
> (org.apache.spark.launcher.Main) to a temporary text file, reads the result 
> back into a variable, and then executes that command.
> {code}
> set LAUNCHER_OUTPUT=%temp%\spark-class-launcher-output-%RANDOM%.txt
> "%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main 
> %* > %LAUNCHER_OUTPUT%
> {code}
> [bin/spark-class2.cmd, 
> L67|https://github.com/apache/spark/blob/master/bin/spark-class2.cmd#L66]
> That temporary text file is given a pseudo-random name by the %RANDOM% env 
> variable generator, which generates a number between 0 and 32767.
> This appears to be the cause of an error occurring when several spark-submit 
> jobs are launched simultaneously. The following error is returned from stderr:
> {quote}The process cannot access the file because it is being used by another 
> process. The system cannot find the file
> USER/AppData/Local/Temp/spark-class-launcher-output-RANDOM.txt.
> The process cannot access the file because it is being used by another 
> process.{quote}
> My hypothesis is that %RANDOM% is returning the same value for multiple jobs, 
> causing the launcher library to attempt to write to the same file from 
> multiple processes. Another mechanism is needed for reliably generating the 
> names of the temporary files so that the concurrency issue is resolved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
