Hi folks,

I'm trying to deploy the latest from the master branch and am having some
trouble with the assembly jar.

In the official spark-1.1 distribution (I use the CDH version), I see the
following jars, where spark-assembly-1.1.0-hadoop2.0.0-mr1-cdh4.2.0.jar
contains a ton of stuff:
datanucleus-api-jdo-3.2.1.jar
datanucleus-core-3.2.2.jar
datanucleus-rdbms-3.2.1.jar
spark-assembly-1.1.0-hadoop2.0.0-mr1-cdh4.2.0.jar
spark-examples-1.1.0-hadoop2.0.0-mr1-cdh4.2.0.jar
spark-hive-thriftserver_2.10-1.1.0.jar
spark-hive_2.10-1.1.0.jar
spark-sql_2.10-1.1.0.jar


I tried to create a similar distribution off of master by running
mvn -Phive -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DskipTests -Pbigtop-dist package
and
./make-distribution.sh -Pbigtop-dist -Phive -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DskipTests


but in either case the spark-assembly jar I get is nearly empty:

spark_official/dist/lib$ jar -tvf spark-assembly-1.2.0-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.0.jar

META-INF/
META-INF/MANIFEST.MF
org/
org/apache/
org/apache/spark/
org/apache/spark/unused/
org/apache/spark/unused/UnusedStubClass.class
META-INF/maven/
META-INF/maven/org.spark-project.spark/
META-INF/maven/org.spark-project.spark/unused/
META-INF/maven/org.spark-project.spark/unused/pom.xml
META-INF/maven/org.spark-project.spark/unused/pom.properties
META-INF/NOTICE
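
To check a rebuilt jar without reading the whole listing, something like
the sketch below might do. The function name is made up for illustration,
and it assumes that finding org/apache/spark/SparkContext.class is a
reasonable sign that spark-core was packaged into the assembly.

```shell
# Hypothetical helper (name invented for this example).
# Reads a plain jar listing on stdin, one entry per line, as printed by
# `jar tf`, and succeeds only if the SparkContext class is among the entries.
# Assumption: SparkContext.class is used as a marker for spark-core.
assembly_has_spark_core() {
  grep -q '^org/apache/spark/SparkContext\.class$'
}

# Example usage (left as a comment so the block only defines the function):
#   jar tf spark-assembly-1.2.0-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.0.jar \
#     | assembly_has_spark_core && echo "spark-core is in the assembly"
```

For the listing above, this check would fail, since the only class present
is org/apache/spark/unused/UnusedStubClass.class.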

Any advice on how to get spark-core and the rest packaged into the assembly
jar? I'd like to have fewer things to copy around.
