[ https://issues.apache.org/jira/browse/HADOOP-10115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353321#comment-14353321 ]
Allen Wittenauer commented on HADOOP-10115:
-------------------------------------------

bq. One possible gap caused by just skipping the jars (rather than symlinking) is that if folks rely on the directory layout at deployment time to grab needed jars they might miss out. Presumably they're already grabbing the common share dir though?

If you symlink, is there actually any benefit? It shrinks the distribution size, sure, but I suspect the JVM won't resolve the link to the degree that it realizes it is the same jar. Also, given that, e.g., HDFS requires common, if folks are only grabbing the HDFS deps and not the common deps, they are doing Bad Things (tm). But if we only commit this to trunk, it's even less of a concern. ;)

bq. One good reason to do it as a follow-on is that we could switch to using a maven assembly instead of a shell script.

I'm inclined to commit this now and fix it up, either as a maven assembly or as a separate script, in a separate JIRA, under the guiding principle of "don't let best stop better." I don't think there is any real question of whether or not this is better than what is currently there. Best might end up being more subjective and take longer.

bq. (the two code comments)

Yes, probably a good idea.

bq. Should the yarn get processed before the NFS projects?

I'm not sure if it matters much.


> Exclude duplicate jars in hadoop package under different component's lib
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-10115
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10115
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>              Labels: common, hdfs, mapreduce, nfs, yarn
>         Attachments: HADOOP-10115-004.patch, HADOOP-10115-005.patch, HADOOP-10115-006.patch, HADOOP-10115.patch, HADOOP-10115.patch, HADOOP-10115.patch
>
>
> In the Hadoop package distribution, more than 90% of the jars are duplicated in multiple places.
> For example, almost all jars in share/hadoop/hdfs/lib are already present in share/hadoop/common/lib.
> The same is true for the other lib directories under share.
> In any case, all of these directories are added to the classpath for the daemon processes.
> So, to reduce the package distribution size and the classpath overhead, remove the duplicate jars from the distribution.
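To make the duplication described above concrete, here is a minimal illustrative sketch (not part of the attached patches, and not how the patch itself works) that lists the jars under a component's lib directory that also exist in share/hadoop/common/lib. The script name and the /opt/hadoop-2.2.0 path in the usage comment are made up for the example; it only assumes the standard share/hadoop/<component>/lib layout described in the issue.

{code}
#!/usr/bin/env python3
# find_duplicate_jars.py (hypothetical helper, for illustration only):
# print jar names that appear in both share/hadoop/common/lib and a given
# component's lib directory, i.e. the duplicates this issue proposes to drop.
import os
import sys

def duplicate_jars(dist_root, component):
    """Return jar file names present in both common/lib and <component>/lib."""
    common_lib = os.path.join(dist_root, "share", "hadoop", "common", "lib")
    component_lib = os.path.join(dist_root, "share", "hadoop", component, "lib")
    common = {name for name in os.listdir(common_lib) if name.endswith(".jar")}
    comp = {name for name in os.listdir(component_lib) if name.endswith(".jar")}
    return sorted(common & comp)

if __name__ == "__main__":
    # Example usage: python3 find_duplicate_jars.py /opt/hadoop-2.2.0 hdfs
    root, component = sys.argv[1], sys.argv[2]
    for jar in duplicate_jars(root, component):
        print(jar)
{code}

On a pre-patch layout such a listing would simply enumerate the overlap the issue describes; whether the build then skips those jars or symlinks them back into the component lib directories is exactly the trade-off discussed in the comment above.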