Guys,

I am trying to help the Spark/Shark community (spark-project.org and now
http://incubator.apache.org/projects/spark) with a predicament. Shark, also
known as Hive on Spark, uses some parts of Hive, i.e. the HQL parser,
query optimizer, serdes, and codecs.

To address some known performance and concurrency issues, Shark developers
need to apply a couple of patches on top of stock Hive:
   https://issues.apache.org/jira/browse/HIVE-2891
   https://issues.apache.org/jira/browse/HIVE-3772 (just committed to trunk)
(as per https://github.com/amplab/shark/wiki/Hive-Patches)

The issue here is that the latest Shark works on top of Hive 0.9 (Hive 0.11
work is underway), and having developers apply the patches and build
their own version of Hive is an extra step that can be avoided.

One way to address this is to publish Shark-specific versions of the Hive
artifacts with all the needed patches applied to the stock release. This way
downstream projects can simply reference the org.apache.hive artifacts with
version 0.9.0-shark-0.7 instead of building Hive locally every time.

This approach may be a little overkill, though. Would the Hive community be
willing to consider a maintenance release of Hive 0.9.1, and perhaps 0.11.1,
that includes the fixes needed by the Shark project?

I am willing to step up and produce the Hive release bits if any of the
committers here can help with publishing.

-- 
Thanks in advance,
        Cos
