[
https://issues.apache.org/jira/browse/KAFKA-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195538#comment-15195538
]
ASF GitHub Bot commented on KAFKA-3250:
---------------------------------------
GitHub user granthenke opened a pull request:
https://github.com/apache/kafka/pull/1075
KAFKA-3250: release tarball is unnecessarily large due to duplicate libraries
This ensures duplicates are not copied in the distribution without
rewriting all of the tar'ing logic. A larger improvement could be made to the
packaging code, but that should be tracked by another jira.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/granthenke/kafka libs-duplicates
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/kafka/pull/1075.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1075
----
commit 8cdbf18fb5b751e0fc922d405643c152daaef4d1
Author: Grant Henke
Date: 2016-03-15T15:53:43Z
KAFKA-3250: release tarball is unnecessarily large due to duplicate
libraries
This ensures duplicates are not copied in the distribution without
rewriting all of the tar'ing logic. A larger improvement could be made to the
packaging code, but that should be tracked by another jira.
----
> release tarball is unnecessarily large due to duplicate libraries
> -----------------------------------------------------------------
>
> Key: KAFKA-3250
> URL: https://issues.apache.org/jira/browse/KAFKA-3250
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.9.0.1
> Reporter: Gwen Shapira
> Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> Between 0.8.2.2 and 0.9.0, our release tarballs grew from 17M to 34M. We
> thought it was just due to new libraries and dependencies. But:
> 1. If you untar Kafka into a directory and check the directory size (du -sh),
> it is around 28M, smaller than the tarball. Recompressing gives you a 25M
> tarball.
> 2. If you list the original tar contents and grep for "snappy", you see it 4
> times in the tarball.
> Clearly we are creating a tarball with duplicates (and we didn't before).
> I think it's due to how we generate the tarball from core but pull other
> projects into the libs/ directory with their dependencies (which overlap).
> We need to find out how to sort it out (possibly with excludes).
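The "list the tar contents and grep" check in the description can also be scripted. The following is a minimal sketch, not the change made in the pull request: it counts how often each file path is stored in an archive, using only Python's standard tarfile module. The function names and the demo file names (snappy-demo.jar, other.jar) are assumptions made up for illustration; build_demo_tarball exists only to produce a small archive that has a duplicate entry.

```python
import io
import tarfile
from collections import Counter


def duplicate_members(tar_path):
    """Return {member_name: count} for file paths stored more than once."""
    # "r:*" lets tarfile detect the compression (gzip, bz2, none) on its own.
    with tarfile.open(tar_path, "r:*") as tar:
        counts = Counter(m.name for m in tar.getmembers() if m.isfile())
    return {name: n for name, n in counts.items() if n > 1}


def build_demo_tarball(tar_path):
    """Write a tiny archive that stores one path twice (hypothetical demo data)."""
    payload = b"placeholder jar bytes"
    names = ("libs/snappy-demo.jar", "libs/snappy-demo.jar", "libs/other.jar")
    with tarfile.open(tar_path, "w:gz") as tar:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            # The tar format does not forbid repeated paths, so the second
            # entry is appended rather than replacing the first.
            tar.addfile(info, io.BytesIO(payload))
```

Calling duplicate_members on a release tarball path would return a dictionary of the repeated entries; an empty dictionary means no path is stored twice. Because a tar archive simply appends entries, duplicates add to the archive size even though extraction leaves only one copy on disk, which is consistent with the extracted directory being smaller than the tarball.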