[jira] [Commented] (SPARK-10878) Race condition when resolving Maven coordinates via Ivy

Jeeyoung Kim (JIRA) Fri, 05 May 2017 11:29:33 -0700

    [ https://issues.apache.org/jira/browse/SPARK-10878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998731#comment-15998731 ]


Jeeyoung Kim commented on SPARK-10878:
--------------------------------------

[~joshrosen] Yes, I now see where the potential race conditions are (both 
inside Ivy and in how Spark uses Ivy). Regarding (1), even if Ivy becomes 
thread-safe, writing a temporary pom file with a fixed filename would still 
break things, so I think this is a valuable thing to do. I can attempt a 
patch for this.
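
To illustrate point (1), here is a minimal sketch of the idea, written in 
Python rather than Spark's Scala and not taken from Spark or Ivy. It assumes 
the fix is simply to give each invocation its own filename instead of a fixed 
one. The function names and the UUID-based naming scheme are hypothetical.

```python
import os
import tempfile
import uuid


def unique_descriptor_path(cache_dir: str) -> str:
    """Build a per-invocation path (hypothetical naming scheme).

    A fixed name such as org.apache.spark-spark-submit-parent-default.xml is
    shared by every concurrent process; adding a random UUID makes each
    process write to its own file.
    """
    name = f"org.apache.spark-spark-submit-parent-{uuid.uuid4()}-default.xml"
    return os.path.join(cache_dir, name)


def write_descriptor(cache_dir: str, contents: str) -> str:
    """Write contents to a fresh file and return its path."""
    path = unique_descriptor_path(cache_dir)
    # Mode "x" raises FileExistsError instead of overwriting an existing
    # file, so a collision would fail loudly rather than corrupt a file that
    # another process is still reading or writing.
    with open(path, "x", encoding="utf-8") as f:
        f.write(contents)
    return path


if __name__ == "__main__":
    # Two writers sharing one cache directory get two distinct files.
    with tempfile.TemporaryDirectory() as cache_dir:
        first = write_descriptor(cache_dir, "<ivy-module/>")
        second = write_descriptor(cache_dir, "<ivy-module/>")
        print(first != second)
```

With a scheme like this, concurrent submissions can still share one cache 
directory, because no two of them ever write to the same file.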

Regarding (2), I think having multiple resolution caches is a rather 
inefficient way to get around this. My cache directory is about half a 
gigabyte right now, and keeping a copy of that per Spark job seems wasteful.

> Race condition when resolving Maven coordinates via Ivy
> -------------------------------------------------------
>
>                 Key: SPARK-10878
>                 URL: https://issues.apache.org/jira/browse/SPARK-10878
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.0
>            Reporter: Ryan Williams
>            Priority: Minor
>
> I've recently been shell-scripting the creation of many concurrent 
> Spark-on-YARN apps and observing a fraction of them to fail with what I'm 
> guessing is a race condition in their Maven-coordinate resolution.
> For example, I might spawn an app for each path in file {{paths}} with the 
> following shell script:
> {code}
> cat paths | parallel "$SPARK_HOME/bin/spark-submit foo.jar {}"
> {code}
> When doing this, I observe some fraction of the spawned jobs to fail with 
> errors like:
> {code}
> :: retrieving :: org.apache.spark#spark-submit-parent
>         confs: [default]
> Exception in thread "main" java.lang.RuntimeException: problem during retrieve of org.apache.spark#spark-submit-parent: java.text.ParseException: failed to parse report: /hpc/users/willir31/.ivy2/cache/org.apache.spark-spark-submit-parent-default.xml: Premature end of file.
>         at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:249)
>         at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:83)
>         at org.apache.ivy.Ivy.retrieve(Ivy.java:551)
>         at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1006)
>         at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:286)
>         at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:153)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.text.ParseException: failed to parse report: /hpc/users/willir31/.ivy2/cache/org.apache.spark-spark-submit-parent-default.xml: Premature end of file.
>         at org.apache.ivy.plugins.report.XmlReportParser.parse(XmlReportParser.java:293)
>         at org.apache.ivy.core.retrieve.RetrieveEngine.determineArtifactsToCopy(RetrieveEngine.java:329)
>         at org.apache.ivy.core.retrieve.RetrieveEngine.retrieve(RetrieveEngine.java:118)
>         ... 7 more
> Caused by: org.xml.sax.SAXParseException; Premature end of file.
>         at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
>         at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
>         at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
> {code}
> The more apps I try to launch simultaneously, the greater the fraction of 
> them that fail with this or similar errors; a batch of ~10 will usually work 
> fine, a batch of 15 will see a few failures, and a batch of ~60 will have 
> dozens of failures.
> [This gist shows 11 recent failures I observed|https://gist.github.com/ryan-williams/648bff70e518de0c7c84].


