something magic about the size of a ports tree

2023-10-03 Thread Matthias Apitz
I have on my poudriere build host a ports tree and wanted to move it to
the host where the resulting packages are installed:

root@jet:/usr/local/poudriere/ports # du -sh ports20230806
397M    ports20230806
root@jet:/usr/local/poudriere/ports # tar cf p.tar ports20230806
root@jet:/usr/local/poudriere/ports # ls -lh p.tar
-rw-r--r--  1 root wheel  672M Oct  3 18:00 p.tar

The size of the tar file is already somewhat magic, but when I un-tar it
on the other host I get:

[guru@c720-1400094 ~]$ ls -lh p.tar
-rw-r--r--  1 guru wheel  672M  3 oct.  18:00 p.tar
[guru@c720-1400094 ~]$ tar xf p.tar
[guru@c720-1400094 ~]$ du -sh ports20230806
1,2G    ports20230806

How is this possible?

matthias

-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub



Re: something magic about the size of a ports tree

2023-10-03 Thread Olivier Certner
Hi Matthias,

Some ZFS dataset with zstd compression on jet, and no compression on 
c720-1400094?

-- 
Olivier Certner





Re: something magic about the size of a ports tree

2023-10-03 Thread Dag-Erling Smørgrav
Matthias Apitz  writes:
> I have on my poudriere build host a ports tree and wanted to move it to
> the host where the resulting packages are installed:
>
> root@jet:/usr/local/poudriere/ports # du -sh ports20230806
> 397M    ports20230806
> root@jet:/usr/local/poudriere/ports # tar cf p.tar ports20230806
> root@jet:/usr/local/poudriere/ports # ls -lh p.tar
> -rw-r--r--  1 root wheel  672M Oct  3 18:00 p.tar
>
> The size of the tar file is already somewhat magic, but when I un-tar it
> on the other host I get:
>
> [guru@c720-1400094 ~]$ ls -lh p.tar
> -rw-r--r--  1 guru wheel  672M  3 oct.  18:00 p.tar
> [guru@c720-1400094 ~]$ tar xf p.tar
> [guru@c720-1400094 ~]$ du -sh ports20230806
> 1,2G  ports20230806
>
> How is this possible?

Most files in the ports tree are very small.  On disk, each file gets
rounded up to the next multiple of the filesystem block size, which
could be as small as 512 bytes or as large as 8 kB (or even more in
pathological cases).  In a tarball, they get rounded up to the next
multiple of 512 bytes plus an additional 512 bytes per file for
metadata.

For instance, your average distinfo file (of which there are 30k in the
ports tree) is only 200-250 bytes long, but it occupies 512 bytes on an
FFS filesystem, 1 kB in a tarball, and 4 kB on a typical ZFS filesystem.
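
To make the arithmetic above concrete, here is a minimal Python sketch
(an illustration, not a measurement).  It assumes 225 bytes as a
representative distinfo size and uses the 30k file count mentioned
above; the helper names round_up and tar_bytes are made up for this
example.  It ignores the end-of-archive padding of a tar file and any
extra header blocks an archive may need for long names or extended
attributes.

```python
def round_up(size, block):
    """Round size up to the next multiple of block (0 stays 0)."""
    return -(-size // block) * block


def tar_bytes(size):
    """Bytes one regular file takes in a tar archive:
    one 512-byte header block plus the data padded to 512 bytes."""
    return 512 + round_up(size, 512)


FILE_SIZE = 225      # assumed typical distinfo size, in bytes
FILE_COUNT = 30_000  # approximate number of distinfo files

for label, per_file in (
    ("512-byte blocks", round_up(FILE_SIZE, 512)),
    ("tar archive", tar_bytes(FILE_SIZE)),
    ("4 kB blocks", round_up(FILE_SIZE, 4096)),
):
    total_mib = per_file * FILE_COUNT / 2**20
    print(f"{label:16} {per_file:5d} B/file {total_mib:7.1f} MiB total")
```

The same logical data therefore costs a different amount of space
depending on the allocation unit, before compression is even considered.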

Note that if the target system is FreeBSD 14 or newer, you can simply
mount the tarball (`sudo mount -rt tarfs p.tar /usr/ports`).

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: something magic about the size of a ports tree

2023-10-03 Thread Matthias Apitz
On Tuesday, October 03, 2023 at 06:14:23 PM +0200, Olivier Certner
wrote:

> Hi Matthias,
> 
> Some ZFS dataset with zstd compression on jet, and no compression on 
> c720-1400094?
> 

Yes, on jet it is ZFS:

root@jet:/usr/local/poudriere/ports # mount | grep ports2023
poudriere/poudriere/ports/ports20230806 on /usr/local/poudriere/ports/ports20230806 (zfs, local, noatime, nfsv4acls)

on c720-1400094 it is only plain UFS.

matthias

-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub



Re: something magic about the size of a ports tree

2023-10-03 Thread Alan Somers
With ZFS, you might be using transparent compression.  "du -sh" will
show you a file's compressed size.  But "ls -lh" will show you the
logical size.  That's probably why the tarball looked so much bigger
than the ports tree on the first system.  If you do "du -sh" on the
tarball, I bet you'll see a much smaller number.
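
The two numbers can also be read side by side with a few lines of
Python.  This is only a sketch: the function name tree_sizes is made up
for the example, it counts non-directory entries only (so the totals
will be a little lower than what du prints, which also counts
directories), and it counts hard-linked files once per name.  It relies
on st_blocks being in 512-byte units, which is the case on FreeBSD and
Linux.

```python
import os
from typing import Tuple


def tree_sizes(root: str) -> Tuple[int, int]:
    """Return (logical_bytes, allocated_bytes) for the files under root.

    logical   = sum of st_size, the number "ls -l" shows per file
    allocated = sum of st_blocks * 512, the space "du" accounts for
    """
    logical = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            logical += st.st_size
            # st_blocks is counted in 512-byte units here, independent
            # of the filesystem's own block size.
            allocated += st.st_blocks * 512
    return logical, allocated


# Example use (the path is whatever tree you want to inspect):
#   logical, allocated = tree_sizes("ports20230806")
#   print(logical, allocated)
```

On a compressed ZFS dataset the allocated total can be well below the
logical total; on a filesystem without compression it is usually above
it, because of the rounding to whole blocks.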




Re: something magic about the size of a ports tree

2023-10-03 Thread Michael Gmelin



> On 3. Oct 2023, at 18:27, Matthias Apitz  wrote:
> 
> On Tuesday, October 03, 2023 at 06:14:23 PM +0200, Olivier Certner
> wrote:
> 
>> Hi Matthias,
>> 
>> Some ZFS dataset with zstd compression on jet, and no compression on 
>> c720-1400094?
>> 
> 
> Yes, on jet it is ZFS:
> 
> root@jet:/usr/local/poudriere/ports # mount | grep ports2023
> poudriere/poudriere/ports/ports20230806 on 
> /usr/local/poudriere/ports/ports20230806 (zfs, local, noatime, nfsv4acls)
> 
> on c720-1400094 it is only plain UFS.
> 

Try

du -hA file

Also, to experience the difference, try:

dd if=/dev/zero of=tempfile bs=1m count=10

and compare the results of ls, du -h, du -hA on the different filesystems.

zfs get all | grep compr

can also be quite enlightening.

Cheers






Re: something magic about the size of a ports tree

2023-10-03 Thread Warner Losh
On Tue, Oct 3, 2023, 10:24 AM Dag-Erling Smørgrav  wrote:

> Note that if the target system is FreeBSD 14 or newer, you can simply
> mount the tarball (`sudo mount -rt tarfs p.tar /usr/ports`).
>

Do we support any compression on top of that? Has support for poudriere
been added for it?

Although I want a pony, I'm mostly curious... I have no immediate plans here
(though aligning with the boot loader and supporting this on a block device
to support rootfs would be cool). Maybe some or all of these wishes would
make good GSOC projects?

Warner



Re: something magic about the size of a ports tree

2023-10-03 Thread Dag-Erling Smørgrav
Warner Losh  writes:
> Do we support any compression on top of that? Has support for
> poudriere been added for it?

Yes (zstd) and no.

DES
-- 
Dag-Erling Smørgrav - d...@freebsd.org



Re: something magic about the size of a ports tree

2023-10-03 Thread Allan Jude

On 2023-10-03 12:24, Dag-Erling Smørgrav wrote:

> Matthias Apitz writes:
> > I have on my poudriere build host a ports tree and wanted to move it to
> > the host where the resulting packages are installed:
> >
> > root@jet:/usr/local/poudriere/ports # du -sh ports20230806
> > 397M    ports20230806
> > root@jet:/usr/local/poudriere/ports # tar cf p.tar ports20230806
> > root@jet:/usr/local/poudriere/ports # ls -lh p.tar
> > -rw-r--r--  1 root wheel  672M Oct  3 18:00 p.tar
> >
> > The size of the tar file is already somewhat magic, but when I un-tar it
> > on the other host I get:
> >
> > [guru@c720-1400094 ~]$ ls -lh p.tar
> > -rw-r--r--  1 guru wheel  672M  3 oct.  18:00 p.tar
> > [guru@c720-1400094 ~]$ tar xf p.tar
> > [guru@c720-1400094 ~]$ du -sh ports20230806
> > 1,2G    ports20230806
> >
> > How is this possible?
>
> Most files in the ports tree are very small.  On disk, each file gets
> rounded up to the next multiple of the filesystem block size, which
> could be as small as 512 bytes or as large as 8 kB (or even more in
> pathological cases).  In a tarball, they get rounded up to the next
> multiple of 512 bytes plus an additional 512 bytes per file for
> metadata.
>
> For instance, your average distinfo file (of which there are 30k in the
> ports tree) is only 200-250 bytes long, but it occupies 512 bytes on an
> FFS filesystem, 1 kB in a tarball, and 4 kB on a typical ZFS filesystem.



As an interesting side note to this: if ZFS is able to compress a file to
under 112 bytes, it does not allocate a sector at all.  Instead it stores
the file in an "embedded blockpointer", using the space that would
normally hold the LBAs and checksum of the block to hold the actual file
data.  The result is a file that appears to use 0 bytes of space, because
it fits entirely in the indirect block that would otherwise have pointed
to the data block.
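
A rough way to get a feel for which files might qualify is to compress
their contents and compare against the 112-byte figure.  The sketch
below is only an approximation: it uses zlib from the Python standard
library as a stand-in compressor, whereas ZFS uses whatever the dataset
is configured with (lz4, zstd, ...), so the real outcome can differ.
The function name and the treatment of the limit as inclusive are
assumptions of this example.

```python
import os
import zlib

# The 112-byte figure cited above; treating it as an inclusive
# limit is an assumption of this sketch.
EMBEDDED_PAYLOAD_LIMIT = 112


def might_fit_in_block_pointer(data: bytes) -> bool:
    """True if data compresses (with zlib, as a stand-in for the
    dataset's real compressor) to at most EMBEDDED_PAYLOAD_LIMIT bytes."""
    return len(zlib.compress(data)) <= EMBEDDED_PAYLOAD_LIMIT


samples = {
    "4 kB of zeros": b"\0" * 4096,       # extremely compressible
    "4 kB of random": os.urandom(4096),  # essentially incompressible
}
for label, data in samples.items():
    print(label, might_fit_in_block_pointer(data))
```

Highly repetitive or very short files are the likely candidates; files
full of hashes, such as distinfo, compress less well.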



> Note that if the target system is FreeBSD 14 or newer, you can simply
> mount the tarball (`sudo mount -rt tarfs p.tar /usr/ports`).
>
> DES




--
Allan Jude