[ https://issues.apache.org/jira/browse/SPARK-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995815#comment-13995815 ]
Sean Owen commented on SPARK-1802:
----------------------------------

I looked further into just what might go wrong by including hive-exec in the assembly, since it includes its dependencies directly (i.e. Maven can't manage around it). Attached is a full dump of the conflicts.

The ones that are potential issues appear to be the following, and one looks like it could be a deal-breaker -- protobuf -- since it's neither forwards nor backwards compatible. That is, I recommend testing this assembly with an older Hadoop that needs protobuf 2.4.1 to see if it croaks.

The rest might be worked around, but they need some additional mojo to make sure the right version wins in the packaging. Certainly having hive-exec in the build is making me queasy!

[WARNING] hive-exec-0.12.0.jar, libthrift-0.9.0.jar define 153 overlappping classes:

HBase includes libthrift-0.8.0, but it's in examples, so I figure this is ignorable.

[WARNING] hive-exec-0.12.0.jar, commons-lang-2.4.jar define 2 overlappping classes:

Probably ignorable, but we have to make sure commons-lang-3.3.2 'wins' in the build.

[WARNING] hive-exec-0.12.0.jar, jackson-core-asl-1.9.11.jar define 117 overlappping classes:
[WARNING] hive-exec-0.12.0.jar, jackson-mapper-asl-1.8.8.jar define 432 overlappping classes:

I believe these are ignorable. (Not sure why the jackson versions are mismatched; another todo.)

[WARNING] hive-exec-0.12.0.jar, guava-14.0.1.jar define 1087 overlappping classes:

Should be OK. Hive uses 11.0.2 like Hadoop; the build is already taking that particular risk. We need 14.0.1 to win.

[WARNING] hive-exec-0.12.0.jar, protobuf-java-2.4.1.jar define 204 overlappping classes:

Oof. Hive has protobuf 2.5.0. This has got to be a problem for older Hadoop builds?

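One way to inspect any single overlap from the warnings above by hand is sketched below. It assumes the entry listings of the two jars have already been written to text files, one entry per line (for example with `jar tf`). The function name `overlapping_classes` and the listing file names are hypothetical and not part of the Spark build.

```shell
#!/bin/sh
# Hypothetical helper: print the .class entries present in both jar listings.
# $1 and $2 are text files with one jar entry per line (e.g. from `jar tf`).
overlapping_classes() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # comm needs sorted input; sort -u also drops duplicate entries.
    grep '\.class$' "$1" | LC_ALL=C sort -u > "$a"
    grep '\.class$' "$2" | LC_ALL=C sort -u > "$b"
    # -12 hides lines unique to either file, leaving only the intersection.
    LC_ALL=C comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

With listings produced by `jar tf hive-exec-0.12.0.jar > hive-exec.txt` and `jar tf protobuf-java-2.4.1.jar > protobuf.txt`, running `overlapping_classes hive-exec.txt protobuf.txt | wc -l` gives a count to compare against the number in the corresponding warning. This only shows which class names collide; it says nothing about which copy ends up in the assembly.
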
> Audit dependency graph when Spark is built with -Phive
> ------------------------------------------------------
>
>                 Key: SPARK-1802
>                 URL: https://issues.apache.org/jira/browse/SPARK-1802
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Patrick Wendell
>            Assignee: Sean Owen
>            Priority: Blocker
>             Fix For: 1.0.0
>
>
> I'd like to have binary release for 1.0 include Hive support. Since this
> isn't enabled by default in the build I don't think it's as well tested, so
> we should dig around a bit and decide if we need to e.g. add any excludes.
> {code}
> $ mvn install -Phive -DskipTests && mvn dependency:build-classpath -pl assembly | grep -v INFO | tr ":" "\n" | awk ' { FS="/"; print ( $(NF) ); }' | sort > without_hive.txt
> $ mvn install -Phive -DskipTests && mvn dependency:build-classpath -Phive -pl assembly | grep -v INFO | tr ":" "\n" | awk ' { FS="/"; print ( $(NF) ); }' | sort > with_hive.txt
> $ diff without_hive.txt with_hive.txt
> < antlr-2.7.7.jar
> < antlr-3.4.jar
> < antlr-runtime-3.4.jar
> 10,14d6
> < avro-1.7.4.jar
> < avro-ipc-1.7.4.jar
> < avro-ipc-1.7.4-tests.jar
> < avro-mapred-1.7.4.jar
> < bonecp-0.7.1.RELEASE.jar
> 22d13
> < commons-cli-1.2.jar
> 25d15
> < commons-compress-1.4.1.jar
> 33,34d22
> < commons-logging-1.1.1.jar
> < commons-logging-api-1.0.4.jar
> 38d25
> < commons-pool-1.5.4.jar
> 46,49d32
> < datanucleus-api-jdo-3.2.1.jar
> < datanucleus-core-3.2.2.jar
> < datanucleus-rdbms-3.2.1.jar
> < derby-10.4.2.0.jar
> 53,57d35
> < hive-common-0.12.0.jar
> < hive-exec-0.12.0.jar
> < hive-metastore-0.12.0.jar
> < hive-serde-0.12.0.jar
> < hive-shims-0.12.0.jar
> 60,61d37
> < httpclient-4.1.3.jar
> < httpcore-4.1.3.jar
> 68d43
> < JavaEWAH-0.3.2.jar
> 73d47
> < javolution-5.5.1.jar
> 76d49
> < jdo-api-3.0.1.jar
> 78d50
> < jetty-6.1.26.jar
> 87d58
> < jetty-util-6.1.26.jar
> 93d63
> < json-20090211.jar
> 98d67
> < jta-1.1.jar
> 103,104d71
> < libfb303-0.9.0.jar
> < libthrift-0.9.0.jar
> 112d78
> < mockito-all-1.8.5.jar
> 136d101
> < servlet-api-2.5-20081211.jar
> 139d103
> < snappy-0.2.jar
> 144d107
> < spark-hive_2.10-1.0.0.jar
> 151d113
> < ST4-4.0.4.jar
> 153d114
> < stringtemplate-3.2.1.jar
> 156d116
> < velocity-1.7.jar
> 158d117
> < xz-1.0.jar
> {code}
> Some initial investigation suggests we may need to take some precaution
> surrounding (a) jetty and (b) servlet-api.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
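
The classpath-to-jar-list step in the quoted commands can be factored into a small function, sketched here under the assumption of a POSIX shell. The function name `classpath_jars` is hypothetical. Unlike the quoted pipeline, it passes the separator with `-F/`: awk applies an `FS` assigned inside an action only from the next record onward, so the quoted form would likely print the first classpath entry as a full path rather than a bare file name.

```shell
#!/bin/sh
# Hypothetical helper: read a ':'-separated classpath on stdin and print
# the file name of each entry, one per line, sorted.
classpath_jars() {
    # -F/ sets the field separator before the first record is read, so every
    # line, including the first, is reduced to its last path component.
    # The NF pattern skips empty lines.
    tr ':' '\n' | awk -F/ 'NF { print $NF }' | LC_ALL=C sort
}
```

It would slot into the quoted commands as `mvn dependency:build-classpath -Phive -pl assembly | grep -v INFO | classpath_jars > with_hive.txt`, and likewise without `-Phive` for the other list, before running `diff` on the two files.
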