Re: repositories for spark jars

2014-03-19 Thread Evan Chan
The alternative is for Spark to not explicitly include hadoop_client, or perhaps to declare it only as a provided dependency, and to offer a facility for inserting the hadoop client jars of your choice at packaging time. Unfortunately, hadoop_client pulls in a ton of other deps, so it's not as simple as copying one extra jar into

repositories for spark jars

2014-03-17 Thread Nathan Kronenfeld
After spending a couple of days fighting with a new Spark installation, getting the Spark and Hadoop version numbers to match everywhere, I have a suggestion I'd like to put out there. Can we put the Hadoop version against which the Spark jars were built into the version number? I noticed that the

Re: repositories for spark jars

2014-03-17 Thread Patrick Wendell
Hey Nathan, I don't think this would be possible because there are at least dozens, and maybe hundreds, of permutations of Hadoop versions (different vendor distros x different versions x YARN vs. non-YARN, etc.). So publishing new artifacts for each would be really difficult. What is the exact