GitHub user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4215#discussion_r23878335
  
    --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
    @@ -431,6 +458,155 @@ object SparkSubmit {
       }
     }
     
    +/** Provides utility functions to be used inside SparkSubmit. */
    +private[spark] object SparkSubmitUtils extends Logging {
    +
    +  // Directories for caching downloads through ivy and storing the jars when maven coordinates are
    +  // supplied to spark-submit
    +  private var PACKAGES_DIRECTORY: File = null
    +
    +  /**
    +   * Represents a Maven Coordinate
    +   * @param groupId the groupId of the coordinate
    +   * @param artifactId the artifactId of the coordinate
    +   * @param version the version of the coordinate
    +   */
    +  private[spark] case class MavenCoordinate(groupId: String, artifactId: String, version: String)
    +
    +  /**
    +   * Resolves any dependencies that were supplied through maven coordinates
    +   * @param coordinates Comma-delimited string of maven coordinates
    +   * @param remoteRepos Comma-delimited string of remote repositories other than maven central
    +   * @param ivyPath The path to the local ivy repository
    +   * @return The comma-delimited path to the jars of the given maven artifacts including their
    +   *         transitive dependencies
    +   */
    +  private[spark] def resolveMavenCoordinates(
    +      coordinates: String,
    +      remoteRepos: String,
    +      ivyPath: String,
    +      isTest: Boolean = false): String = {
    --- End diff --
    
    This function is really long and has several different responsibilities,
so I wonder whether it makes sense to split it into a few smaller helper
functions (this could simplify testing as well). I'm not sure what the best
way to do this is, but one starting point might be to extract the
coordinates -> MavenCoordinate parsing into its own function.
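
    As a rough illustration of that starting point, the parsing step could
look something like the sketch below. The helper name
`extractMavenCoordinates`, the wrapping object `CoordinateParsing`, and the
`groupId:artifactId:version` input format are assumptions made for this
example; they are not taken from the pull request.

```scala
// Sketch only: `CoordinateParsing` and `extractMavenCoordinates` are
// hypothetical names, and the "groupId:artifactId:version" format with
// ':' separators is an assumption about the input.
case class MavenCoordinate(groupId: String, artifactId: String, version: String)

object CoordinateParsing {

  /**
   * Parses a comma-delimited string of coordinates into MavenCoordinate values.
   * Blank entries are skipped; a malformed entry fails with an
   * IllegalArgumentException (thrown by `require`).
   */
  def extractMavenCoordinates(coordinates: String): Seq[MavenCoordinate] = {
    coordinates.split(",").map(_.trim).filter(_.nonEmpty).map { entry =>
      val parts = entry.split(":")
      require(
        parts.length == 3 && parts.forall(_.trim.nonEmpty),
        s"Expected 'groupId:artifactId:version' but found: $entry")
      MavenCoordinate(parts(0).trim, parts(1).trim, parts(2).trim)
    }.toSeq
  }
}
```

    With the parsing isolated like this, it can be unit tested without
touching Ivy or the network, and resolveMavenCoordinates would only need to
deal with already-validated MavenCoordinate values.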

