Hey Nathan,

I don't think this would be feasible: there are dozens, maybe hundreds,
of permutations of Hadoop versions (different vendor distros x
different versions x YARN vs. non-YARN, etc.), so publishing a separate
set of artifacts for each would be really difficult.

What is the exact problem you ran into? Maybe we need to improve the
documentation to make it clearer how to correctly link against
Spark/Hadoop in user applications. Basically, the model we have now is
that users link against Spark and then against the hadoop-client
artifact matching their version of Hadoop - something like the sketch
below.
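As a rough illustration, assuming an sbt build (the Spark and
hadoop-client version strings here are only examples - substitute
whatever release and distro you actually run), a user application's
build.sbt would look roughly like:

    // build.sbt -- minimal sketch; version numbers are placeholders,
    // not a recommendation for any particular distro.
    name := "my-spark-app"

    scalaVersion := "2.10.3"

    // Needed if you pull vendor builds of hadoop-client (e.g. CDH).
    resolvers += "Cloudera" at
      "https://repository.cloudera.com/artifactory/cloudera-repos/"

    libraryDependencies ++= Seq(
      // Link against Spark itself...
      "org.apache.spark" %% "spark-core" % "0.9.0-incubating",
      // ...and separately against the hadoop-client matching your
      // cluster's Hadoop version (this CDH4 coordinate is just an
      // example).
      "org.apache.hadoop" % "hadoop-client" % "2.0.0-mr1-cdh4.6.0"
    )

The point is that the Hadoop dependency lives in the application's
build, so the same Spark artifact works across Hadoop variants.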

- Patrick

On Mon, Mar 17, 2014 at 9:50 AM, Nathan Kronenfeld
<nkronenf...@oculusinfo.com> wrote:
> After just spending a couple of days fighting with a new Spark installation,
> getting Spark and Hadoop version numbers to match everywhere, I have a
> suggestion I'd like to put out there.
>
> Can we put the Hadoop version against which the Spark jars were built into
> the version number?
>
> I noticed that the Cloudera Maven repo has started to do this (
> https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/spark/spark-core_2.10/)
> - sadly, though, only for the cdh5.x versions, not for the 4.x versions
> for which they also have Spark parcels.  But I see no sign of it in the
> central Maven repo.
>
> Is this perhaps already done in some other repo that I don't know about?
>
> I know it would save us a lot of time and grief simply to be able to point
> a project we build at the right version, rather than having to rebuild and
> deploy Spark manually.
>
> --
> Nathan Kronenfeld
> Senior Visualization Developer
> Oculus Info Inc
> 2 Berkeley Street, Suite 600,
> Toronto, Ontario M5A 4J5
> Phone:  +1-416-203-3003 x 238
> Email:  nkronenf...@oculusinfo.com
