Indeed, I see a lot of duplicate package warnings in the maven-shade assembly output, so I tried to eliminate them:
First I set the scope of the apache-spark dependency to 'provided', as suggested on this page: http://spark.apache.org/docs/latest/submitting-applications.html

But the Spark master gave me a blunt dependency-not-found error:

    Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/Seq
        at ... [my main object]

Then I reverted it back to 'compile' to see if things got better, but after that I again saw the duplicate-package warnings, followed by random errors at runtime (NoSuchFieldError, IllegalStateException, etc.).

Is setting scope = 'provided' mandatory for deployment? I merely remove this line when debugging locally.
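For reference, here is a minimal sketch of the setup I am describing (the coordinates, versions, and the 'local' profile are illustrative, not my exact pom): the Spark dependency's scope is driven by a Maven property that defaults to 'provided' for cluster builds, so local debugging can flip it to 'compile' via a profile instead of deleting the line by hand.

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.example</groupId>       <!-- illustrative coordinates -->
      <artifactId>spark-job</artifactId>
      <version>0.1.0</version>

      <properties>
        <!-- default for cluster deployment; overridden by the 'local' profile below -->
        <spark.scope>provided</spark.scope>
      </properties>

      <dependencies>
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-core_2.10</artifactId>
          <version>1.0.0</version>  <!-- should match the cluster's Spark version -->
          <scope>${spark.scope}</scope>
        </dependency>
      </dependencies>

      <build>
        <plugins>
          <!-- stock maven-shade-plugin setup that produces the assembly jar -->
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>2.3</version>
            <executions>
              <execution>
                <phase>package</phase>
                <goals>
                  <goal>shade</goal>
                </goals>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>

      <profiles>
        <!-- activate with 'mvn -Plocal ...' to keep Spark on the compile classpath -->
        <profile>
          <id>local</id>
          <properties>
            <spark.scope>compile</spark.scope>
          </properties>
        </profile>
      </profiles>
    </project>

With something like this, a plain "mvn package" would produce the 'provided'-scoped jar for spark-submit, while "mvn -Plocal package" (or activating the profile in the IDE) keeps Spark and Scala on the classpath for local runs.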