[ https://issues.apache.org/jira/browse/SPARK-15526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16071552#comment-16071552 ]
Sean Owen commented on SPARK-15526:
-----------------------------------

I do think we should shade in 2.3.0 here. Yes, shading is the right answer, but I also think this causes enough headaches that it's worth making sure Spark provides no version of JPMML to the user classpath directly.

> Shade JPMML
> -----------
>
> Key: SPARK-15526
> URL: https://issues.apache.org/jira/browse/SPARK-15526
> Project: Spark
> Issue Type: Dependency upgrade
> Components: ML, MLlib
> Affects Versions: 2.0.0
> Reporter: Villu Ruusmann
> Priority: Minor
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> The Spark-MLlib module depends on the JPMML-Model library
> (org.jpmml:pmml-model:1.2.7) for its PMML export capabilities. The
> JPMML-Model library is included in the Apache Spark assembly, which makes it
> very difficult to build and deploy competing PMML exporters that may wish to
> depend on different versions (typically much newer) of the same library.
> JPMML-Model library classes are not part of Apache Spark public APIs, so it
> shouldn't be a problem if they are relocated by prepending a prefix
> "org.spark_project" to their package names using Maven Shade Plugin. The
> requested treatment is identical to how Google Guava and Jetty dependencies
> are shaded in the final assembly.
>
> This issue is raised in relation to the JPMML-SparkML project
> (https://github.com/jpmml/jpmml-sparkml), which provides PMML export
> capabilities for Spark ML Pipelines. Currently, application developers who
> wish to use it must tweak their application classpath, which assumes
> familiarity with build internals.
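
For readers unfamiliar with the relocation requested in the description, the following is a minimal sketch of what a Maven Shade Plugin configuration of that kind could look like. It is not taken from the Spark build. The package patterns `org.dmg.pmml` and `org.jpmml` are assumptions about which packages the pmml-model artifact contributes, and the placement of the plugin in a module's `pom.xml` is likewise illustrative. Only the `org.spark_project` prefix and the `org.jpmml:pmml-model` coordinates come from the issue text.

```xml
<!-- Illustrative only: a shade configuration that bundles pmml-model and
     relocates its classes under the org.spark_project prefix. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <includes>
            <!-- Coordinates from the issue description -->
            <include>org.jpmml:pmml-model</include>
          </includes>
        </artifactSet>
        <relocations>
          <!-- Assumed package names; verify against the actual jar contents -->
          <relocation>
            <pattern>org.dmg.pmml</pattern>
            <shadedPattern>org.spark_project.dmg.pmml</shadedPattern>
          </relocation>
          <relocation>
            <pattern>org.jpmml</pattern>
            <shadedPattern>org.spark_project.jpmml</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a relocation like this, the bundled classes would be rewritten to live under `org.spark_project.*`, so an application could place a newer JPMML-Model on its own classpath under the original package names without colliding with the copy that Spark uses internally.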