André Kelpe created HADOOP-10986:
------------------------------------

             Summary: hadoop tarball is twice as big as prev. version and 6 times as big unpacked
                 Key: HADOOP-10986
                 URL: https://issues.apache.org/jira/browse/HADOOP-10986
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.5.0
            Reporter: André Kelpe

I noticed that the binary tarball for 2.5.0 is almost 300MB, while 2.4.1 is only 132MB. Unpacking the latest tarball gives me 1.8 GB of stuff, with the majority in the "share" directory.

{code}
$ cd hadoop-2.4.1
$ du -sh *
364K	bin
356K	etc
100K	include
2,3M	lib
128K	libexec
24K	LICENSE.txt
12K	NOTICE.txt
12K	README.txt
336K	sbin
280M	share
{code}

{code}
$ cd hadoop-2.5.0
$ du -sh *
512K	bin
332K	etc
100K	include
4,6M	lib
128K	libexec
336K	sbin
1,8G	share
{code}

I also saw some warnings from tar while unpacking:

{code}
$ tar xf hadoop-2.5.0.tar.gz
tar: Ignoring unknown extended header keyword `SCHILY.dev'
tar: Ignoring unknown extended header keyword `SCHILY.ino'
tar: Ignoring unknown extended header keyword `SCHILY.nlink'
tar: Ignoring unknown extended header keyword `SCHILY.dev'
tar: Ignoring unknown extended header keyword `SCHILY.ino'
tar: Ignoring unknown extended header keyword `SCHILY.nlink'
tar: Ignoring unknown extended header keyword `SCHILY.dev'
tar: Ignoring unknown extended header keyword `SCHILY.ino'
tar: Ignoring unknown extended header keyword `SCHILY.nlink'
tar: Ignoring unknown extended header keyword `SCHILY.dev'
tar: Ignoring unknown extended header keyword `SCHILY.ino'
tar: Ignoring unknown extended header keyword `SCHILY.nlink'
tar: Ignoring unknown extended header keyword `SCHILY.dev'
tar: Ignoring unknown extended header keyword `SCHILY.ino'
tar: Ignoring unknown extended header keyword `SCHILY.nlink'
tar: Ignoring unknown extended header keyword `SCHILY.dev'
tar: Ignoring unknown extended header keyword `SCHILY.ino'
tar: Ignoring unknown extended header keyword `SCHILY.nlink'
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)