Hi,

We do both @ ftp.rz.tu-bs.de

Martin


Sent while on the go.

> On 08.10.2019 at 09:13, SoEasyTo Mirrors Manager
> <mirr...@soeasyto.com> wrote:
> 
>> On 2019-10-08 05:33, Michał Górny wrote :
>> 
>> Hello, everyone.
>> TL;DR: shortly, distfiles will need to be present under two paths for
>> the transitional period.  Would you prefer us using hardlinks or
>> symlinks for that?
>> We're planning to start deploying a new GLEP 75-based [1] mirror layout
>> to our mirrors soonish.  This implies a transitional period during which
>> we'll be using both old and new layouts, so all file entries will be
>> duplicated.  The plan is roughly to:
>> 1. Enable new split layout in emirrordist, and start using both
>> simultaneously for newly-mirrored files.
>> 2. Duplicate the existing distfiles to new layout.
>> 3. Live with both layouts for some longish time, to support people using
>> old Portage versions.
>> 4. Eventually disable the old (flat) layout and start removing files.
>> The basic problem is whether to use hardlinks or symlinks
>> for the duplicate files.  I've elaborated more on both solutions in [2]
>> but I'll summarize briefly here.
>> Hardlinks have the advantage that for mirrors enabling -H, they avoid
>> extra space usage and extra traffic.  However, we don't really know how
>> many mirrors enable that, and I suspect it's around half of them.
>> At initial deployment time, rsync will just hardlink files in new layout
>> to existing entries, and at cleanup time it will just unlink old
>> entries.
>> For mirrors not enabling -H, hardlinks will mean all distfiles being
>> transferred again during deployment time.  Furthermore, throughout the
>> transitional period all files will be duplicated, and so space usage
>> will double.  Cleanup should be lightweight though.
>> Symlinks have the advantage that we know that all or almost all mirrors
>> enable them.  They are lightweight at deployment time since it's just
>> a matter of rsync copying symlinks, and they definitely won't cause
>> double space usage.  However, they will cause all files to be
>> retransferred at cleanup time -- due to symlinks being replaced by real
>> files.
>> Technically, I suppose we could avoid that by splitting that into two
>> stages, repeated for smaller groups of files.  Firstly, replace symlinks
>> with hardlinks which will make it light for at least some of the mirrors.
>> Then, remove old files and jump over to the next group.  For mirrors not
>> using -H, this will still mean double transfer but we'd limit double
>> space usage to one group at a time, and only for a short period.
>> If any mirrors sync over rsync without using -l (talking about private
>> mirrors here), they will not get the new layout at all which is going to
>> suck for their users.
>> Which way do you prefer?
> 
> For the soeasyto mirror, we are already using both -H and --links, and the
> mirror is hosted on a single partition. So, to preserve bandwidth as you
> suggested, hardlinks are the better choice for us. That could cause server
> "overload" as per [1], but it is not an issue here.
> 
> One question remains though: how will layout.conf be created? Is it created
> by the mirror maintainer, or only by the master distfiles mirror, with all
> mirrors then replicating it automatically? It could be interesting to let the
> mirror maintainer decide whether to use the split or flat layout depending on
> their use of hardlinks / symlinks, and to offer that choice by providing
> masters for the flat, hybrid, and split layouts.
> 
>> [1] https://www.gentoo.org/glep/glep-0075.html
>> [2] https://bugs.gentoo.org/534528#c38
> 
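
For readers following the thread, the hardlink and symlink options and the
staged cleanup described in the quoted message can be tried out locally with
a small Python sketch. It is only an illustration under stated assumptions:
the names split_path, duplicate and cleanup_group are hypothetical, and the
two-hex-digit BLAKE2B bucket is an assumed stand-in for the split layout, not
something quoted from the thread. It works on a local POSIX directory tree and
says nothing about how emirrordist or rsync are actually driven.

```python
import hashlib
import os
from pathlib import Path


def split_path(root: Path, name: str) -> Path:
    """Hypothetical split-layout location for a distfile.

    Assumption: the bucket is the first two hex digits of a BLAKE2B hash
    of the file name. The real layout is defined by GLEP 75.
    """
    bucket = hashlib.blake2b(name.encode("utf-8")).hexdigest()[:2]
    return root / bucket / name


def duplicate(root: Path, use_hardlinks: bool) -> None:
    """Expose every flat distfile under the split layout as well."""
    flat_files = sorted(
        p for p in root.iterdir() if p.is_file() and not p.is_symlink()
    )
    for flat in flat_files:
        new = split_path(root, flat.name)
        new.parent.mkdir(exist_ok=True)
        if os.path.lexists(new):
            continue  # already present, keep the run idempotent
        if use_hardlinks:
            # Same inode as the flat entry: no extra data blocks are used.
            os.link(flat, new)
        else:
            # Relative target, so the link survives moving the whole tree.
            os.symlink(os.path.join("..", flat.name), new)


def cleanup_group(root: Path, names: list[str]) -> None:
    """Staged cleanup for one group of files."""
    # Stage 1: turn symlinks into hardlinks to the real flat file.
    for name in names:
        new = split_path(root, name)
        if new.is_symlink():
            tmp = new.with_name(new.name + ".tmp")
            os.link(root / name, tmp)
            os.replace(tmp, new)  # rename over the symlink itself
    # (On a real master, mirrors would get a chance to sync here.)
    # Stage 2: drop the old flat entries; the data stays reachable
    # through the split-layout hardlink.
    for name in names:
        (root / name).unlink()
```

Calling duplicate(root, use_hardlinks=True) on a test directory and comparing
os.stat(...).st_ino of both entries shows that they share one inode, which is
the property mirrors syncing with -H would rely on. With use_hardlinks=False
the split-layout entries are symlinks until cleanup_group replaces them.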

