rsync's delta-transfer algorithm plays nicely with a vacuumed SQLite
database. I've seen it do efficient delta updates of databases in the
tens of gigabytes. SQLite's indices would also fit your "untar files as
needed" strategy, since extracting a single file from a tarball
requires a linear read through the tarball until you find the file
you're looking for. I understand that the patch files need to be
written out to build the software, but can you pass the Tcl files
directly to eval() or its Tcl equivalent? If you're going this far, you
could also replace the PortIndex file with queries against the SQLite
database.
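
To make the idea concrete, here is a rough sketch in Python (not Tcl,
and not anything MacPorts does today). The table name port_files, the
helper names build_db and fetch_port, and the sample port paths and
file contents are all hypothetical. It only shows that the primary key
gives an indexed lookup of one port's files, where a tarball would need
a linear scan.

```python
import sqlite3


def build_db(conn, files):
    """Store (port, path, data) rows; the schema is an assumption for illustration."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS port_files ("
        " port TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " data BLOB NOT NULL,"
        " PRIMARY KEY (port, path))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO port_files (port, path, data) VALUES (?, ?, ?)",
        files,
    )
    conn.commit()
    # VACUUM rewrites the database file compactly; it must run outside a transaction.
    conn.execute("VACUUM")


def fetch_port(conn, port):
    """Return {path: bytes} for one port, using the (port, path) primary-key index."""
    rows = conn.execute(
        "SELECT path, data FROM port_files WHERE port = ? ORDER BY path",
        (port,),
    )
    return {path: data for path, data in rows}


# Hypothetical sample data standing in for a ports tree.
conn = sqlite3.connect(":memory:")
build_db(
    conn,
    [
        ("editors/vim", "Portfile", b"name vim\n"),
        ("editors/vim", "files/patch-configure.diff", b"--- a\n+++ b\n"),
        ("lang/tcl", "Portfile", b"name tcl\n"),
    ],
)
vim_files = fetch_port(conn, "editors/vim")
```

Only the patch files in vim_files would need to be written to the work
directory; the Portfile contents could stay in memory if the
interpreter can evaluate them from a string.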

On Fri, Aug 29, 2025 at 9:46 AM Ryan Carsten Schmidt
<[email protected]> wrote:
>
> Users and I have noted selfupdate is slow. Our ports collection has grown, 
> and for many years we no longer just rsync the ports tree, which was fast, 
> but instead rsync a tarball which we then decompress, which involves throwing 
> away and recreating the tens of thousands of files that make up the entire 
> ports tree each time.
>
> Can we speed it up and reduce local disk space usage by not unpacking the 
> tarball at sync time? Maybe we could keep the portindexes and _resources 
> folder on disk but leave everything else in the tarball until it's asked for, 
> e.g. untar the port's directory into the work directory when the user asks to 
> install that port.
>
> Or, any other ideas for speeding up selfupdate?
>
>
>


-- 
David Gilman
:DG<
