Hi folks, I'm trying to deploy the latest from the master branch and am having some trouble with the assembly jar.

In the official spark-1.1 distribution (I use the CDH version), I see the following jars, where spark-assembly-1.1.0-hadoop2.0.0-mr1-cdh4.2.0.jar contains a ton of stuff:

    datanucleus-api-jdo-3.2.1.jar
    datanucleus-core-3.2.2.jar
    datanucleus-rdbms-3.2.1.jar
    spark-assembly-1.1.0-hadoop2.0.0-mr1-cdh4.2.0.jar
    spark-examples-1.1.0-hadoop2.0.0-mr1-cdh4.2.0.jar
    spark-hive-thriftserver_2.10-1.1.0.jar
    spark-hive_2.10-1.1.0.jar
    spark-sql_2.10-1.1.0.jar

I tried to create a similar distribution off of master by running

    mvn -Phive -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DskipTests -Pbigtop-dist package

and

    ./make-distribution.sh -Pbigtop-dist -Phive -Dhadoop.version=2.0.0-mr1-cdh4.2.0 -DskipTests

but in either case the spark-assembly jar I get is nearly empty:

    spark_official/dist/lib$ jar -tvf spark-assembly-1.2.0-SNAPSHOT-hadoop2.0.0-mr1-cdh4.2.0.jar
    META-INF/
    META-INF/MANIFEST.MF
    org/
    org/apache/
    org/apache/spark/
    org/apache/spark/unused/
    org/apache/spark/unused/UnusedStubClass.class
    META-INF/maven/
    META-INF/maven/org.spark-project.spark/
    META-INF/maven/org.spark-project.spark/unused/
    META-INF/maven/org.spark-project.spark/unused/pom.xml
    META-INF/maven/org.spark-project.spark/unused/pom.properties
    META-INF/NOTICE

Any advice on how to get spark-core and the rest packaged into the assembly jar? I'd like to have fewer things to copy around.