Hi

I think this is the most fitting mailing list for my query, even though most of the traffic here seems to be development oriented, so here goes.

Lately I have been experimenting with mirroring the OpenSolaris pkg repository, since at work we will have to perform a number of automated OpenSolaris installations and want to save on bandwidth and time. The guide for setting up a pkg mirror[1] came in handy for this. I do, however, think there could be a smarter way of syncing the repository than rsync, especially since the repository consists of millions of small files.

When I update the repository, each run takes around 24 hours in total. 16 of those hours are spent on the repository server (the one I am syncing from) generating a 76 MB file list, which the rsync client on my end then checks its files against. I can't help but think of the server I am syncing against and the amount of disk activity this must generate. That is probably also the reason why it takes so long in the first place?

So I thought I would try to start a discussion on finding a better way of synchronizing the repository, one that works better for all parties, since I can't imagine the current approach will scale properly.

My current suggestion for improving the situation is to use ZFS snapshots instead of rsync. That way the servers at Sun don't have to generate a file list for every client that wants to update the repository, and the transfer should go much quicker when it is one big file instead of 2.1 million tiny files. Incremental snapshots can then be created daily and weekly, and a full snapshot can be created bi-weekly or even monthly.
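
As a rough sketch of the server side of this idea, the snippet below only builds the `zfs snapshot` and `zfs send` command lines for a full and an incremental stream. It does not run anything. The dataset name `rpool/pkgrepo` and the date-based snapshot names are assumptions made up for illustration, not how the real repository server is laid out.

```python
from typing import List, Optional


def snapshot_command(dataset: str, name: str) -> List[str]:
    """Build the command that creates the snapshot dataset@name."""
    return ["zfs", "snapshot", f"{dataset}@{name}"]


def send_command(dataset: str, name: str, base: Optional[str] = None) -> List[str]:
    """Build the command that writes a snapshot stream to stdout.

    Without a base this is a full stream; with a base it is an
    incremental stream holding only the changes between base and name.
    """
    target = f"{dataset}@{name}"
    if base is None:
        return ["zfs", "send", target]
    return ["zfs", "send", "-i", f"{dataset}@{base}", target]


if __name__ == "__main__":
    repo = "rpool/pkgrepo"  # hypothetical dataset name
    print(" ".join(snapshot_command(repo, "2009-06-02")))
    print(" ".join(send_command(repo, "2009-06-01")))                # full stream
    print(" ".join(send_command(repo, "2009-06-02", "2009-06-01")))  # incremental
```

The output of each `zfs send` would be redirected to a single file that every mirror downloads, so the stream is generated once per snapshot rather than once per client.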

This would then require a small system for fetching the snapshots, applying them, and destroying old snapshots, but I guess most of the job is already done by the auto-snapshot service?
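
The apply and clean-up steps on the mirror could look roughly like the sketch below. It again only builds command lines: `zfs receive -F` to apply a downloaded stream read from stdin, and `zfs destroy` for snapshots that are no longer needed. The dataset name, the snapshot names, and the "keep the newest N" policy are assumptions for illustration. At least one snapshot has to be kept, because the next incremental stream needs it as its base.

```python
from typing import List


def receive_command(dataset: str) -> List[str]:
    """Build the command that applies a snapshot stream read from stdin."""
    return ["zfs", "receive", "-F", dataset]


def prune_commands(dataset: str, snapshots: List[str], keep: int) -> List[List[str]]:
    """Build destroy commands for all but the newest `keep` snapshots.

    `snapshots` must be ordered oldest first.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1: the newest snapshot "
                         "is the base for the next incremental stream")
    stale = snapshots[:max(len(snapshots) - keep, 0)]
    return [["zfs", "destroy", f"{dataset}@{name}"] for name in stale]


if __name__ == "__main__":
    mirror = "tank/pkgmirror"  # hypothetical dataset name
    local = ["2009-05-30", "2009-05-31", "2009-06-01", "2009-06-02"]
    print(" ".join(receive_command(mirror)))
    for cmd in prune_commands(mirror, local, keep=2):
        print(" ".join(cmd))
```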

What do you guys think of this idea, and do you have a better one?


[1]: http://www.opensolaris.org/os/project/pkg/Mirroring/

--
Kind regards
Jeppe Toustrup
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss