bq. depend on missing fastutil classes like Long2LongOpenHashMap

Looks like Long2LongOpenHashMap should be added to the shaded jar.

Cheers

On Tue, Feb 24, 2015 at 7:36 PM, Jim Kleckner <j...@cloudphysics.com> wrote:

> Spark includes the clearspring analytics package but intentionally excludes
> the dependencies of the fastutil package (see below).
>
> Spark includes parquet-column which includes fastutil and relocates it under
> parquet/ but creates a shaded jar file which is incomplete because it shades
> out some of the fastutil classes, notably Long2LongOpenHashMap, which is
> present in the fastutil jar file that parquet-column is referencing.
>
> We are using more of the clearspring classes (e.g. QDigest) and those do
> depend on missing fastutil classes like Long2LongOpenHashMap.
>
> Even though I add them to our assembly jar file, the class loader finds the
> spark assembly and we get runtime class loader errors when we try to use it.
>
> It is possible to put our jar file first, as described here:
>
>   https://issues.apache.org/jira/browse/SPARK-939
>   http://spark.apache.org/docs/1.2.0/configuration.html#runtime-environment
>
> which I tried with args to spark-submit:
>
>   --conf spark.driver.userClassPathFirst=true
>   --conf spark.executor.userClassPathFirst=true
>
> but we still get the class not found error.
>
> We have tried copying the source code for clearspring into our package and
> renaming the package and that makes it appear to work... Is this risky?
> It certainly is ugly.
>
> Can anyone recommend a way to deal with this "dependency **ll" ?
>
> === The spark/pom.xml file contains the following lines:
>
> <dependency>
>   <groupId>com.clearspring.analytics</groupId>
>   <artifactId>stream</artifactId>
>   <version>2.7.0</version>
>   <exclusions>
>     <exclusion>
>       <groupId>it.unimi.dsi</groupId>
>       <artifactId>fastutil</artifactId>
>     </exclusion>
>   </exclusions>
> </dependency>
>
> === The parquet-column/pom.xml file contains:
>
> <artifactId>maven-shade-plugin</artifactId>
> <executions>
>   <execution>
>     <phase>package</phase>
>     <goals>
>       <goal>shade</goal>
>     </goals>
>     <configuration>
>       <minimizeJar>true</minimizeJar>
>       <artifactSet>
>         <includes>
>           <include>it.unimi.dsi:fastutil</include>
>         </includes>
>       </artifactSet>
>       <relocations>
>         <relocation>
>           <pattern>it.unimi.dsi</pattern>
>           <shadedPattern>parquet.it.unimi.dsi</shadedPattern>
>         </relocation>
>       </relocations>
>     </configuration>
>   </execution>
> </executions>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-excludes-fastutil-dependencies-we-need-tp21794.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
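
As a possible alternative to copying and renaming the clearspring sources by hand, the same renaming can probably be done at build time by relocating both clearspring and fastutil inside the application's own assembly jar. The fragment below is only a sketch, under these assumptions: the application is built with Maven, it declares com.clearspring.analytics:stream and it.unimi.dsi:fastutil as ordinary (non-excluded) dependencies, and Spark itself is a provided-scope dependency so it is not bundled. The myapp.shaded prefix is a made-up name for illustration.

```xml
<!-- Sketch only: goes under <build><plugins> in the application's own pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- minimizeJar is deliberately left unset so that no fastutil
             classes are pruned from the bundled copy. -->
        <relocations>
          <!-- "myapp.shaded" is a hypothetical prefix; any package name
               that Spark's assembly does not contain should do. -->
          <relocation>
            <pattern>com.clearspring.analytics</pattern>
            <shadedPattern>myapp.shaded.com.clearspring.analytics</shadedPattern>
          </relocation>
          <relocation>
            <pattern>it.unimi.dsi</pattern>
            <shadedPattern>myapp.shaded.it.unimi.dsi</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Because the shade plugin rewrites references in the application's own classes as well, code that imports QDigest from its original package should end up pointing at the relocated copy, which in turn should resolve Long2LongOpenHashMap from the relocated fastutil rather than from anything in the Spark assembly. I have not tried this against Spark 1.2, so treat it as a starting point.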