* Shlomi Fish <shlo...@gmail.com> [2010-11-19 19:55]:
> here is a report on compressing Graph-Easy-0.70.tar with various
> compression methods:
>
> {{{
> shlomif:~/progs/perl/cpan/Graph/Easy/trunk/Graph-Easy/TEMP$ ls -l
> total 3420
> -rw-r--r-- 1 shlomif shlomif 2160640 Nov 14 22:20 Graph-Easy-0.70.tar
> -rw-r--r-- 1 shlomif shlomif  329197 Nov  5 12:24 Graph-Easy-0.70.tar.bz2
> -rw-r--r-- 1 shlomif shlomif  416916 Nov 14 22:23 Graph-Easy-0.70.tar.gz
> -rw-r--r-- 1 shlomif shlomif  270796 Nov 14 22:21 Graph-Easy-0.70.tar.lrz
> -rw-r--r-- 1 shlomif shlomif  312844 Nov  5 12:24 Graph-Easy-0.70.tar.xz
> }}}
>
> As one can see, there are significant savings in size (and
> bandwidth) by switching to .bz2 and .xz.
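
For reference, the sizes in the listing above can be turned into savings
relative to the plain tarball and relative to gzip. This is only a quick
sketch: the byte counts are copied from the quoted `ls -l` output, while
the `sizes` table and the `savings` helper are names made up for
illustration.

```python
# Sizes in bytes, copied from the quoted ls -l listing.
sizes = {
    "tar": 2160640,
    "gz": 416916,
    "bz2": 329197,
    "xz": 312844,
    "lrz": 270796,
}


def savings(sizes, baseline):
    """Return the bytes saved by each format relative to `baseline`.

    Hypothetical helper, not part of any tool discussed in the thread.
    A negative value means the format is larger than the baseline.
    """
    base = sizes[baseline]
    return {fmt: base - size for fmt, size in sizes.items() if fmt != baseline}


vs_tar = savings(sizes, "tar")  # what compression buys at all
vs_gz = savings(sizes, "gz")    # what the newer formats add on top of gzip

for fmt in ("gz", "bz2", "xz", "lrz"):
    print(f"{fmt}: {sizes[fmt] / sizes['tar']:.1%} of the uncompressed size")

for fmt in ("bz2", "xz", "lrz"):
    print(f"{fmt}: {vs_gz[fmt]} bytes smaller than gz")
```

Gzip alone removes about 1.74 MB; the other three formats remove
roughly another 88 to 146 kB beyond that.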

Where does one see that? I see some savings, but not significant ones.

You drop from 2 MB to 400 kB by using gzip, then a further 90 to 150 kB
or so by using more unusual compression programs. Just going to
http://search.cpan.org/dist/Graph-Easy/ will pull down more data than
you just saved.

The initial saving is worthwhile, but the additional gains? The era of
28.8 modems is long past. (And even in areas where internet connectivity
is bad, bandwidth is not the limiting factor. You go from cell phone
with data plan to satellite internet to CD-ROMs delivered by truck: the
scarce resource becomes latency, not the bandwidth at any one instant.)

Gzip has 100% installed base. Even bzip2 does way worse; it has 100%
installed base if you are looking at Linux and the 386BSD family, but is
way less commonplace elsewhere, esp. Windows. And the other tools are
only just making inroads on Linux. How long until they’re as widespread
as bzip2? How long until bzip2 is as widespread as gzip?

How large is the total CPAN archive: 10 GB? Re-compressing all of it
now would yield a benefit of what, 3 GB? 4? Even 5 maybe? As Dave said,
it fits on a thumb drive already. And we’re not even talking about
re-compressing here, just about future support for new distributions.

It’s gonna be a lot of work to iron out the entire tool chain to support
the newer formats; then it will take a lot of time until the work
trickles out far enough that people could start relying on it. All that
for quite piddly gains, in absolute numbers.

I really don’t see the point. Gzip is Good Enough.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>