[
https://issues.apache.org/jira/browse/HCATALOG-520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467381#comment-13467381
]
Travis Crawford commented on HCATALOG-520:
------------------------------------------
Simply by removing the jars we obviously don't need, I was able to reduce our
deps by ~20 jars.
Currently {{hcatalog-core}} depends on {{hive-hbase-handler}}, which I don't
think it actually needs. Dropping it sheds a LOT of dependencies.
To gain visibility into what our transitive dependencies are, I ported over
something I do for a different project. The idea is you check in the list of
jars you expect to depend on, and at build time you fail if that list changes.
A tool is provided to easily update the list, and the error message is very
clear. The idea is your dependencies are a critical part of the project and you
should be aware of what they are and when they change.
Preview for anyone who's interested:
https://github.com/traviscrawford/hcatalog/commit/a2a4e085b1b528f9e00765255dadae550f735513
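
For anyone who wants the gist without reading the commit, here is a minimal
sketch of the check-in-the-list idea. It is not the code in the commit above.
It assumes the build copies the resolved jars into one directory and the
expected list is a text file with one jar name per line; the function names
({{read_expected}}, {{check_dependencies}}, {{update_list}}) are hypothetical.

```python
from pathlib import Path


def read_expected(list_file):
    """Return the jar names recorded in the checked-in list, one per line."""
    names = set()
    for line in Path(list_file).read_text().splitlines():
        line = line.strip()
        # Skip blank lines and '#' comments so the list can be annotated.
        if line and not line.startswith("#"):
            names.add(line)
    return names


def find_actual(lib_dir):
    """Return the jar file names the build actually resolved into lib_dir."""
    return {p.name for p in Path(lib_dir).glob("*.jar")}


def check_dependencies(list_file, lib_dir):
    """Return None if the jars match the list, else a readable error message."""
    expected = read_expected(list_file)
    actual = find_actual(lib_dir)
    added = sorted(actual - expected)
    removed = sorted(expected - actual)
    if not added and not removed:
        return None
    lines = ["Dependencies changed; review them and update %s:" % list_file]
    lines += ["  + " + name for name in added]
    lines += ["  - " + name for name in removed]
    return "\n".join(lines)


def update_list(list_file, lib_dir):
    """Rewrite the checked-in list from whatever is currently in lib_dir."""
    Path(list_file).write_text("\n".join(sorted(find_actual(lib_dir))) + "\n")
```

A build step would call {{check_dependencies}} and fail when it returns a
message; {{update_list}} is the "easily update the list" half, run by hand
after an intentional change and committed along with it.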
> Simplify HCatalog dependencies
> ------------------------------
>
> Key: HCATALOG-520
> URL: https://issues.apache.org/jira/browse/HCATALOG-520
> Project: HCatalog
> Issue Type: Improvement
> Reporter: Travis Crawford
> Assignee: Travis Crawford
>
> Looking through the hcatalog-core dependencies I believe we have an
> opportunity to trim them down. A major goal of HCatalog is to be a dependency
> of other processing tools, and we can make that more attractive by invading
> their classpath as little as possible.
> I believe the following look good (minus hive-exec which is a fat jar, but
> that's a separate issue):
> {code}
> <dependency org="org.apache.hadoop" name="hadoop-tools"
> rev="${hadoop20.version}" conf="default->*"/>
> <dependency org="org.apache.hive" name="hive-builtins"
> rev="${hive.version}"/>
> <dependency org="org.apache.hive" name="hive-metastore"
> rev="${hive.version}"/>
> <dependency org="org.apache.hive" name="hive-common"
> rev="${hive.version}"/>
> <dependency org="org.apache.hive" name="hive-exec" rev="${hive.version}"/>
> <dependency org="org.apache.hive" name="hive-cli" rev="${hive.version}"/>
> <dependency org="org.apache.hive" name="hive-hbase-handler"
> rev="${hive.version}">
> <exclude org="org.apache.maven.plugins"/>
> <exclude org="org.jruby"/>
> </dependency>
> {code}
> The following are where I believe we can make improvements:
> {code}
> <dependency org="org.apache.pig" name="pig" rev="${pig.version}"
> conf="default->*"/>
> {code}
> The hcatalog-core tests still depend on Pig, but the dependency has not yet
> been moved to the test target. A major goal of switching to subprojects was
> to stop forcing processing frameworks as dependencies on people using HCat.
> This should move to the test target (since some core tests use pig for
> convenience).
> {code}
> <dependency org="javax.management.j2ee" name="management-api"
> rev="${javax-mgmt.version}"/>
> {code}
> Does anyone know why management-api is needed? I'm not familiar with this and
> don't see any usages from a quick grep. It's something JMS-related, and maybe
> was needed by hcatalog-server-extensions at some point? If tests pass without
> this I think we should remove it.
> {code}
> <dependency org="org.codehaus.jackson" name="jackson-mapper-asl"
> rev="${jackson.version}"/>
> <dependency org="org.codehaus.jackson" name="jackson-core-asl"
> rev="${jackson.version}"/>
> {code}
> HCatalog build requests jackson 1.7.3, and hive-exec depends on 1.8.8. Any
> objection to using the versions provided by Hive?
> {code}
> <dependency org="org.apache.thrift" name="libfb303" rev="${fb303.version}"/>
> {code}
> I don't believe this is required because hive-metastore depends on libfb303.
> {code}
> <dependency org="commons-dbcp" name="commons-dbcp"
> rev="${commons-dbcp.version}">
> <exclude module="commons-pool"/>
> <exclude org="org.apache.geronimo.specs" module="geronimo-jta_1.1_spec"/>
> </dependency>
> {code}
> hive-metastore depends on commons-dbcp and I don't believe we need to
> explicitly depend on this.
> {code}
> <dependency org="com.google.guava" name="guava" rev="${guava.version}"/>
> {code}
> hive-exec depends on guava 11.0.2 too, so I don't believe we need to depend
> on this.