We probably want to revisit how we package binaries in general for
1.4+. IMO, that's worth forking into a separate thread.

I've been hesitant to add new binaries because people
(understandably) complain if we ever stop packaging older ones, but
on the other hand the ASF has complained that we already produce too
many binaries and need to pare them down because of the sheer volume
of files. Doubling the number of binaries we produce in order to cover
Scala 2.11 seemed like too much.

One potential solution is to package "Hadoop provided" binaries and
encourage users to run those against their own Hadoop by simply
setting HADOOP_HOME, or to provide instructions for specific distros.
I've heard that our existing packages don't work well on HDP, for
instance, since it has configuration quirks that differ from upstream
Hadoop.
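
A rough sketch of what that could look like (this assumes the
hadoop-provided Maven profile and a SPARK_DIST_CLASSPATH-style hook
for injecting the cluster's Hadoop jars; exact flags TBD):

    # Build a single "Hadoop free" distribution (assumes -Phadoop-provided
    # exists in the POM):
    ./make-distribution.sh --name hadoop-provided --tgz -Phadoop-provided -Phive

    # On the cluster, point Spark at the local Hadoop install, e.g. in
    # conf/spark-env.sh (assumes Spark picks up SPARK_DIST_CLASSPATH):
    export SPARK_DIST_CLASSPATH=$(hadoop classpath)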

If we cut down on cross building for Hadoop versions, then it becomes
more tenable to cross build for Scala versions without exploding the
number of binaries.
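
For reference, cross building for Scala 2.11 today is roughly the
following, if I recall the incantation correctly (profiles and flags
may differ by branch):

    # Switch the POMs over to Scala 2.11, then build as usual:
    ./dev/change-version-to-2.11.sh
    mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package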

- Patrick

On Sun, Mar 8, 2015 at 12:46 PM, Sean Owen <so...@cloudera.com> wrote:
> Yeah, interesting question of what is the better default for the
> single set of artifacts published to Maven. I think there's an
> argument for Hadoop 2 and perhaps Hive for the 2.10 build too. Pros
> and cons discussed more at
>
> https://issues.apache.org/jira/browse/SPARK-5134
> https://github.com/apache/spark/pull/3917
>
> On Sun, Mar 8, 2015 at 7:42 PM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
>> +1
>>
>> Tested it on Mac OS X.
>>
>> One small issue I noticed is that the Scala 2.11 build uses Hadoop 1
>> without Hive, which is a bit odd because people are more likely to want
>> Hadoop 2 with Hive. So it would be good to publish a build for that
>> configuration instead. We can do it if we cut a new RC, or it may be that
>> binary builds don't need to be voted on (I forget the details there).
>>
>> Matei
