Re: [bug] gzip archives created with pkg_create have wrong data sizes
On Mon, 25 Apr 2016 18:24:53 -0400, dan mclaughlin wrote:

> is it? one of the reasons i brought it up is that in gzip(1) it seems pretty
> clear that it is supposed to be the file size

The original file size (and thus compression ratio) is not stored in
the gzip archive, it is computed by gzip(1) on the fly by decompressing
the file and keeping track of how many bytes were read vs. how many
would have been written.

 - todd
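For readers who want to see the "decompress and count" approach concretely, here is a minimal sketch. It is not gzip(1)'s code: it uses Python's standard gzip module purely for illustration, and the function name sizes_by_decompressing and the block size are made up for this example. Python's reader keeps going across concatenated members, so the output count covers the whole file.

```python
import gzip
import os


def sizes_by_decompressing(path, blocksize=64 * 1024):
    """Return (bytes_in, bytes_out) for a gzip file.

    bytes_in is the size of the file on disk; bytes_out is found by
    decompressing everything and counting, since no total is looked up.
    Hypothetical helper, not taken from any OpenBSD source.
    """
    bytes_out = 0
    with gzip.open(path, "rb") as f:
        while True:
            block = f.read(blocksize)
            if not block:
                break
            bytes_out += len(block)  # count, then discard the data
    return os.path.getsize(path), bytes_out
```

Calling sizes_by_decompressing("bzip2-1.0.6p7.tgz") on a package would report totals for every chunk in the file, at the cost of decompressing all of it.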
Re: [bug] gzip archives created with pkg_create have wrong data sizes
On Sun, 24 Apr 2016 12:57:46 +0200 Marc Espie wrote:

> On Sun, Apr 24, 2016 at 01:47:24AM -0400, dan mclaughlin wrote:
> > the sizes of the compressed/uncompressed data are wrong. i have tested gzip
> > and 'tar zcf' and the values are right, but using pkg_create fails.
>
> gzip -l will just give you the first chunk, that's a limitation of the gzip
> tool itself.
>

is it? one of the reasons i brought it up is that in gzip(1) it seems pretty
clear that it is supposed to be the file size

     -l      List information for the specified compressed files.  The
             following information is listed:

             compressed size    Size of the compressed file.
             uncompressed size  Size of the file when uncompressed.
             compression ratio  Ratio of the difference between the
                                compressed and uncompressed sizes to the
                                uncompressed size.

so perhaps the manual needs to be adjusted.

On Mon, 25 Apr 2016 11:19:45 -0400 "Ted Unangst" wrote:

> Or we rename the files .pkg and nobody pokes them with the wrong tool. :)

how about setting the final values to 0 (or some other number), so that if
someone did look they would see that it is obviously not a standard gzip, and
that the values cannot be trusted.

as it is, it has values that at least conflict with what gzip(1) says they
should be.

those values are also printed out with the assumption that gzip(1) is right.

$ gzip -vd bzip2-1.0.6p7.tgz
bzip2-1.0.6p7.tgz:      96.0% -- replaced with bzip2-1.0.6p7.tar
322 bytes in, 7680 bytes out
Re: [bug] gzip archives created with pkg_create have wrong data sizes
On Mon, Apr 25, 2016 at 11:19:45AM -0400, Ted Unangst wrote:
> Marc Espie wrote:
> > On Sun, Apr 24, 2016 at 12:57:46PM +0200, Marc Espie wrote:
> > > On Sun, Apr 24, 2016 at 01:47:24AM -0400, dan mclaughlin wrote:
> > > > the sizes of the compressed/uncompressed data are wrong. i have tested
> > > > gzip and 'tar zcf' and the values are right, but using pkg_create fails.
> > >
> > > gzip -l will just give you the first chunk, that's a limitation of the
> > > gzip tool itself.
> >
> > I've had a slightly closer look at gzip...
> >
> > making gzip -l able to recognize multiple chunks archive should be doable,
> > but it would require a lot of code churn.
>
> Or we rename the files .pkg and nobody pokes them with the wrong tool. :)

1/ we're not debian.
2/ they're perfectly agreeable gzip files. It's truly a limitation of
gzip(1).
Re: [bug] gzip archives created with pkg_create have wrong data sizes
Marc Espie wrote:
> On Sun, Apr 24, 2016 at 12:57:46PM +0200, Marc Espie wrote:
> > On Sun, Apr 24, 2016 at 01:47:24AM -0400, dan mclaughlin wrote:
> > > the sizes of the compressed/uncompressed data are wrong. i have tested
> > > gzip and 'tar zcf' and the values are right, but using pkg_create fails.
> >
> > gzip -l will just give you the first chunk, that's a limitation of the gzip
> > tool itself.
>
> I've had a slightly closer look at gzip...
>
> making gzip -l able to recognize multiple chunks archive should be doable,
> but it would require a lot of code churn.

Or we rename the files .pkg and nobody pokes them with the wrong tool. :)
Re: [bug] gzip archives created with pkg_create have wrong data sizes
On Sun, Apr 24, 2016 at 12:57:46PM +0200, Marc Espie wrote:
> On Sun, Apr 24, 2016 at 01:47:24AM -0400, dan mclaughlin wrote:
> > the sizes of the compressed/uncompressed data are wrong. i have tested gzip
> > and 'tar zcf' and the values are right, but using pkg_create fails.
>
> gzip -l will just give you the first chunk, that's a limitation of the gzip
> tool itself.

I've had a slightly closer look at gzip...

making gzip -l able to recognize multi-chunk archives should be doable,
but it would require a lot of code churn.

More precisely, the gz_read code has a check that we arrived at the end,
it tries to read a new header, and it keeps going if it can.

So this would require seeking on the input file, trying to read a new header
and displaying it.

I'm pretty sure it's not worth it.

If you need to look at chunked tarballs further, there's some code in
regress/usr.sbin/pkg_add/extract_chunks
that does precisely that: look at the actual boundaries, and uncompress each
chunk separately.
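As an illustration of "look at the actual boundaries", here is a rough sketch of walking a chunked archive one gzip member at a time. This is not the extract_chunks code from the regress tree, which is not reproduced in this thread; it is a Python stand-in using the standard zlib module, and the function name chunk_sizes is invented for the example. It relies on zlib stopping at the end of each member and handing back the remaining bytes in unused_data.

```python
import gzip
import zlib


def chunk_sizes(data):
    """Return a list of (compressed, uncompressed) sizes, one per gzip member.

    Hypothetical helper for illustration only.
    """
    sizes = []
    while data:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        d = zlib.decompressobj(31)
        out = d.decompress(data) + d.flush()
        # Whatever follows this member is returned untouched in unused_data.
        consumed = len(data) - len(d.unused_data)
        sizes.append((consumed, len(out)))
        data = d.unused_data
    return sizes


# Usage on a made-up two-chunk archive held in memory:
archive = gzip.compress(b"packing list\n") + gzip.compress(b"payload\n" * 1000)
for n, (comp, uncomp) in enumerate(chunk_sizes(archive)):
    print("chunk", n, ":", comp, "bytes in,", uncomp, "bytes out")
```

Reading the whole file into memory keeps the sketch short; a tool meant for large packages would feed the decompressor in blocks instead.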
Re: [bug] gzip archives created with pkg_create have wrong data sizes
On Sun, Apr 24, 2016 at 01:47:24AM -0400, dan mclaughlin wrote:
> the sizes of the compressed/uncompressed data are wrong. i have tested gzip
> and 'tar zcf' and the values are right, but using pkg_create fails.

gzip -l will just give you the first chunk, that's a limitation of the gzip
tool itself.

That could probably get fixed, but it's not that annoying.

pkg_create files are a succession of gzip chunks, for two reasons:

1/ putting the plist in its separate chunk makes pkg_sign drastically
faster, as it doesn't have to uncompress/recompress gzip files.

2/ files are ordered from last changed to least changed, and put into
chunks of 8 files, starting at the end, making it possible for rsync to
perform its magic on compressed packages, since the ending chunks do not
change at all.

The actual uncompressed size of each package can be obtained with
pkg_info -s.
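The rsync argument in 2/ can be demonstrated with a toy example. This is only a sketch under stated assumptions: it uses Python's gzip module rather than pkg_create's Perl code, the chunk contents are made-up placeholders, each "chunk" is a single byte string rather than 8 files, and mtime=0 is passed so the gzip headers are reproducible. Because every chunk is compressed independently, changing the first chunk leaves the compressed bytes of the later chunks identical.

```python
import gzip


def pack(chunks):
    """Compress each chunk as its own gzip member and concatenate them."""
    # mtime=0 keeps the gzip headers identical between runs.
    return b"".join(gzip.compress(c, mtime=0) for c in chunks)


def common_suffix_len(a, b):
    """Number of identical bytes at the end of a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


# Placeholder data: only the first chunk differs between the two builds.
plist_v1 = b"packing list, version 1\n"
plist_v2 = b"packing list, version 2, with an extra line\n"
files_a = b"contents of rarely changed files\n" * 200
files_b = b"more rarely changed files\n" * 200

old = pack([plist_v1, files_a, files_b])
new = pack([plist_v2, files_a, files_b])
unchanged_tail = pack([files_a, files_b])

# The result is still one ordinary gzip stream...
assert gzip.decompress(new) == plist_v2 + files_a + files_b
# ...and both builds end with the same compressed bytes.
print(len(unchanged_tail), "tail bytes;",
      common_suffix_len(old, new), "bytes shared at the end")
```

Compressing the same data as one single gzip stream would not have this property in general, since a change near the start alters the compressor's state for everything after it.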
Re: [bug] gzip archives created with pkg_create have wrong data sizes
On 2016/04/24 01:47, dan mclaughlin wrote:
> the sizes of the compressed/uncompressed data are wrong. i have tested gzip
> and 'tar zcf' and the values are right, but using pkg_create fails.

The gzip stream is broken into chunks for more efficient package
signing and to improve rsync-friendliness. See e.g.
http://anoncvs.spacehopper.org/openbsd-src/commit/?id=86ace4402e1421117708700d6f0ef008e0bee8b6
[bug] gzip archives created with pkg_create have wrong data sizes
the sizes of the compressed/uncompressed data are wrong. i have tested gzip
and 'tar zcf' and the values are right, but using pkg_create fails.

$ sysctl hw.machine kern.version
hw.machine=i386
kern.version=OpenBSD 5.9-current (GENERIC) #0: Thu Apr  7 17:24:30 EDT 2016
    build@node04:/usr/src/sys/arch/i386/compile/GENERIC

it's not just i386 specific though, since i tested the amd64 packages as well.

$ ftp ftp://ftp3.usa.openbsd.org/pub/OpenBSD/snapshots/packages/i386/bzip2...
...
Retrieving pub/OpenBSD/snapshots/packages/i386/bzip2-1.0.6p7.tgz
...

$ ls -l bzip2-1.0.6p7.tgz
-rw-r--r--  1 user  user  125979 Apr 23 12:19 bzip2-1.0.6p7.tgz

$ gzip -l bzip2-1.0.6p7.tgz
compressed  uncompressed  ratio  uncompressed_name
       322          7680  96.0%  bzip2-1.0.6p7.tar

$ gzip -vd bzip2-1.0.6p7.tgz
bzip2-1.0.6p7.tgz:      96.0% -- replaced with bzip2-1.0.6p7.tar
322 bytes in, 7680 bytes out

$ ls -l bzip2-1.0.6p7.tar
-rw-r--r--  1 user  user  375808 Apr 23 12:19 bzip2-1.0.6p7.tar

$ gzip -v bzip2-1.0.6p7.tar
bzip2-1.0.6p7.tar:      66.6% -- replaced with bzip2-1.0.6p7.tar.gz
375808 bytes in, 125704 bytes out

$ gzip -l bzip2-1.0.6p7.tar.gz
compressed  uncompressed  ratio  uncompressed_name
    125704        375808  66.6%  bzip2-1.0.6p7.tar

$ pkg_create -f /var/db/pkg/bzip2-1.0.6p7/+CONTENTS

$ ls -l bzip2-1.0.6p7.tgz
-rw-r--r--  1 user  user  125891 Apr 24 00:36 bzip2-1.0.6p7.tgz

$ gzip -l bzip2-1.0.6p7.tgz
compressed  uncompressed  ratio  uncompressed_name
       319          7680  96.1%  bzip2-1.0.6p7.tar

$ tar zcf test.tgz -I list
and
$ tar cf - -I list | gzip -c >test.tgz

give the expected correct results.

i tried to track it a bit further

$ grep gzip /usr/src/usr.sbin/pkg_add/OpenBSD/*
/usr/src/usr.sbin/pkg_add/OpenBSD/Paths.pm:sub gzip() { '/usr/bin/gzip' }
/usr/src/usr.sbin/pkg_add/OpenBSD/PkgCreate.pm: $state->say("Creating gzip'd tar ball in '#1'", $wname)

but perl's not really my thing and so i don't really know where to go from
here.