Hi Sean and Ted,
Thanks for your replies.

I don't have our current problems nicely written up as good questions yet.
I'm still sorting out classpath issues, etc.
In case it is of help, I'm seeing:
* "Exception in thread "Spark Context Cleaner"
java.lang.NoClassDefFoundError: 0
        at
org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:149)"
* We've been having clashing dependencies between a colleague and me because
of the aforementioned classpath issue
* The clashing dependencies also make it hard to work out which Jetty
libraries are available on the classpath from Spark and which clash with
libraries we already have (see the dependency:tree sketch below).
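
In case it helps the diagnosis, my plan is to compare what each build pulls
in with something like the following (a rough sketch; the org.eclipse.jetty
filter is an assumption, since some of our Jetty dependencies may still use
the older org.mortbay.jetty groupId):

    # show only the Jetty-related parts of the Maven dependency graph
    mvn dependency:tree -Dincludes=org.eclipse.jetty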

More anon,

Cheers,
Edward



-------- Original Message --------
Subject: Re: spark 1.3.1 jars in repo1.maven.org
Date: 2015-05-20 00:38
From: Sean Owen <so...@cloudera.com>
To: Edward Sargisson <esa...@pobox.com>
Cc: user <user@spark.apache.org>


Yes, the published artifacts can only refer to one version of anything
(OK, modulo publishing a large number of variants under classifiers).

You aren't intended to rely on Spark's transitive dependencies for
anything. Compiling against the Spark API has no relation to what
version of Hadoop it binds against because it's not part of any API.
You can even mark the Spark dependency as "provided" in your build and get
all the Spark/Hadoop bindings at runtime from your cluster.
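
Concretely, that looks roughly like this in a pom (a sketch only; the
spark-core_2.10 / 1.3.1 coordinates are just the ones from the question
below):

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.3.1</version>
      <!-- provided: compile against the Spark API, but pick up the actual
           Spark and Hadoop jars from the cluster at runtime -->
      <scope>provided</scope>
    </dependency>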

What problem are you experiencing?


On Wed, May 20, 2015 at 2:17 AM, Edward Sargisson <esa...@pobox.com> wrote:

Hi,
I'd like to confirm an observation I've just made. Specifically, that Spark
is only available in repo1.maven.org for one Hadoop variant.

The Spark source can be compiled against a number of different Hadoops using
profiles. Yay.
However, the Spark jars in repo1.maven.org appear to be compiled against one
specific Hadoop version, with no other variants published. (I can see the
difference: hadoop-client is 2.2.0 in the repo1.maven.org artifact and 1.0.4
in the version I compiled locally.)
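
For reference, my understanding from the building-spark docs is that the
Hadoop choice happens at build time, roughly along these lines (the exact
flags below are from my reading of the 1.3.x docs, so treat them as
approximate):

    # Hadoop 1.x (e.g. the 1.0.4 default I see locally)
    mvn -Dhadoop.version=1.0.4 -DskipTests clean package

    # Hadoop 2.2.x (what the repo1.maven.org artifacts appear to match)
    mvn -Phadoop-2.2 -DskipTests clean package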

The implication here is that if you have a pom file asking for
spark-core_2.10 version 1.3.1 then Maven will only ever give you a Hadoop 2
build. Maven assumes that non-snapshot artifacts never change, so trying to
get a Hadoop 1 build under the same coordinates will end in tears.

This then means that if you compile code against spark-core, you will
probably hit classpath NoClassDefFound issues unless the Hadoop 2 version it
pulls in is exactly the one you want.
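
One workaround we are considering (just a sketch, not something we have
verified; the versions are taken from the observation above) is to exclude
the transitive hadoop-client and declare the one we actually run against:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.3.1</version>
      <exclusions>
        <!-- drop the Hadoop 2.2.0 client that the published pom pulls in -->
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-client</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <!-- then pin the Hadoop client version we actually run against -->
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>1.0.4</version>
    </dependency>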

Have I gotten this correct?

It happens that our little app is using a Spark context directly from a
Jetty webapp, and the classpath differences were/are causing some confusion.
We are currently installing a Hadoop 1 Spark master and worker.

Thanks a lot!
Edward
