Travis Crawford created HCATALOG-520:
----------------------------------------

             Summary: Simplify HCatalog dependencies
                 Key: HCATALOG-520
                 URL: https://issues.apache.org/jira/browse/HCATALOG-520
             Project: HCatalog
          Issue Type: Improvement
            Reporter: Travis Crawford
            Assignee: Travis Crawford


Looking through the hcatalog-core dependencies I believe we have an opportunity 
to trim them down. A major goal of HCatalog is to be a dependency of other 
processing tools, and we can make that more attractive by invading their 
classpath as little as possible.

I believe the following look good (apart from hive-exec, which is a fat jar, 
but that's a separate issue):

{code}
    <dependency org="org.apache.hadoop" name="hadoop-tools" rev="${hadoop20.version}" conf="default->*"/>
    <dependency org="org.apache.hive" name="hive-builtins" rev="${hive.version}"/>
    <dependency org="org.apache.hive" name="hive-metastore" rev="${hive.version}"/>
    <dependency org="org.apache.hive" name="hive-common" rev="${hive.version}"/>
    <dependency org="org.apache.hive" name="hive-exec" rev="${hive.version}"/>
    <dependency org="org.apache.hive" name="hive-cli" rev="${hive.version}"/>
    <dependency org="org.apache.hive" name="hive-hbase-handler" rev="${hive.version}">
      <exclude org="org.apache.maven.plugins"/>
      <exclude org="org.jruby"/>
    </dependency>
{code}

The following are where I believe we can make improvements:

{code}
<dependency org="org.apache.pig" name="pig" rev="${pig.version}" conf="default->*"/>
{code}

The hcatalog-core tests still depend on Pig, but the dependency has not yet 
been moved to the test target. A major goal of switching to subprojects was to 
stop forcing processing frameworks as dependencies on people using HCat. This 
should move to the test target (some core tests use Pig for convenience).
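
As a rough sketch of that change, assuming ivy.xml declares a configuration 
named "test" (the name is an assumption here, not taken from the current build), 
the dependency might become:

{code}
<!-- Sketch only: "test" is an assumed configuration name. -->
<!-- Mapping to the test conf keeps Pig off the default classpath. -->
<dependency org="org.apache.pig" name="pig" rev="${pig.version}" conf="test->*"/>
{code}

With a mapping like this, consumers resolving the default configuration would 
no longer pull in Pig, while the core tests could still use it.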

{code}
<dependency org="javax.management.j2ee" name="management-api" rev="${javax-mgmt.version}"/>
{code}

Does anyone know why management-api is needed? I'm not familiar with this and 
don't see any usages from a quick grep. It's something JMS-related, and maybe 
was needed by hcatalog-server-extensions at some point? If tests pass without 
this I think we should remove it.

{code}
<dependency org="org.codehaus.jackson" name="jackson-mapper-asl" rev="${jackson.version}"/>
<dependency org="org.codehaus.jackson" name="jackson-core-asl" rev="${jackson.version}"/>
{code}

The HCatalog build requests jackson 1.7.3, while hive-exec depends on 1.8.8. 
Any objection to using the versions provided by Hive?

{code}
<dependency org="org.apache.thrift" name="libfb303" rev="${fb303.version}"/>
{code}

I don't believe this is required because hive-metastore depends on libfb303.

{code}
<dependency org="commons-dbcp" name="commons-dbcp" rev="${commons-dbcp.version}">
  <exclude module="commons-pool"/>
  <exclude org="org.apache.geronimo.specs" module="geronimo-jta_1.1_spec"/>
</dependency>
{code}

hive-metastore depends on commons-dbcp and I don't believe we need to 
explicitly depend on this.

{code}
<dependency org="com.google.guava" name="guava" rev="${guava.version}"/>
{code}

hive-exec also depends on guava 11.0.2, so I don't believe we need to depend on 
it explicitly.

