[ https://issues.apache.org/jira/browse/SPARK-32385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17167504#comment-17167504 ]
DB Tsai commented on SPARK-32385:
---------------------------------

+1. This will be very useful for users who include Spark as a dependency.

[~hyukjin.kwon], from [https://www.baeldung.com/spring-maven-bom], the following is an example of how to write a BOM file:
{code:java}
<project ...>
    <modelVersion>4.0.0</modelVersion>
    <groupId>baeldung</groupId>
    <artifactId>Baeldung-BOM</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>pom</packaging>
    <name>BaelDung-BOM</name>
    <description>parent pom</description>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>test</groupId>
                <artifactId>a</artifactId>
                <version>1.2</version>
            </dependency>
            <dependency>
                <groupId>test</groupId>
                <artifactId>b</artifactId>
                <version>1.0</version>
                <scope>compile</scope>
            </dependency>
            <dependency>
                <groupId>test</groupId>
                <artifactId>c</artifactId>
                <version>1.0</version>
                <scope>compile</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
</project>
{code}
As we can see, the BOM is a normal POM file with a dependencyManagement section where we can include each artifact's information and version.

> Publish a "bill of materials" (BOM) descriptor for Spark with correct
> versions of various dependencies
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-32385
>                 URL: https://issues.apache.org/jira/browse/SPARK-32385
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 3.1.0
>            Reporter: Vladimir Matveev
>            Priority: Major
>
> Spark has a lot of dependencies, many of them very common (e.g. Guava,
> Jackson). Also, versions of these dependencies are not updated as frequently
> as they are released upstream, which is totally understandable and natural,
> but which also means that often Spark has a dependency on a lower version of
> a library, which is incompatible with a higher, more recent version of the
> same library.
> This incompatibility can manifest in different ways, e.g. as
> classpath errors or runtime check errors (like with Jackson), in certain
> cases.
>
> Spark does attempt to "fix" versions of its dependencies by declaring them
> explicitly in its {{pom.xml}} file. However, this approach, while somewhat
> workable if the Spark-using project itself uses Maven, breaks down if another
> build system is used, like Gradle. The reason is that Maven uses an
> unconventional "nearest first" version conflict resolution strategy, while
> many other tools like Gradle use the "highest first" strategy which resolves
> the highest possible version number inside the entire graph of dependencies.
> This means that other dependencies of the project can pull a higher version
> of some dependency, which is incompatible with Spark.
>
> One example would be an explicit or a transitive dependency on a higher
> version of Jackson in the project. Spark itself depends on several modules of
> Jackson; if only one of them gets a higher version, and others remain on the
> lower version, this will result in runtime exceptions due to an internal
> version check in Jackson.
>
> A widely used solution for this kind of version issue is publishing a
> "bill of materials" descriptor (see here:
> [https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html]
> and here:
> [https://docs.gradle.org/current/userguide/platforms.html#sub:bom_import]).
> This descriptor would contain all versions of all dependencies of Spark; then
> downstream projects will be able to use their build system's support for BOMs
> to enforce version constraints required for Spark to function correctly.
>
> One example of successful implementation of the BOM-based approach is Spring:
> [https://www.baeldung.com/spring-maven-bom#spring-bom]. For different Spring
> projects, e.g.
> Spring Boot, there are BOM descriptors published which can be
> used in downstream projects to fix the versions of Spring components and
> their dependencies, significantly reducing confusion around proper version
> numbers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
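
For the consuming side described in the issue, a downstream Maven project would import such a BOM roughly as sketched below. This is only an illustration under stated assumptions: the {{spark-bom}} artifactId is hypothetical (the issue only proposes publishing one), the version is a placeholder, and which artifacts the BOM would actually manage is assumed. The {{<type>pom</type>}} plus {{<scope>import</scope>}} combination is Maven's standard mechanism for importing a BOM.
{code:xml}
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <!-- Placeholder coordinates for the downstream project -->
    <groupId>com.example</groupId>
    <artifactId>spark-app</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <dependencyManagement>
        <dependencies>
            <!-- Hypothetical BOM artifact; Spark does not publish this per the issue -->
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-bom</artifactId>
                <version>3.1.0</version> <!-- placeholder version -->
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <!-- No <version> here: assumed to be supplied by the imported BOM -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
        </dependency>
    </dependencies>
</project>
{code}
With an import like this, the versions listed in the BOM's dependencyManagement section apply to matching dependencies in the project, so all Jackson modules would stay on the versions Spark was built against, provided the BOM lists them.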