On 2014-09-14 10:46, Michał Górny wrote:
> On 2014-09-14, at 15:40:06,
> Davide Pesavento <p...@gentoo.org> wrote:
> > How long does the md5-cache regeneration process take? Are you sure it
> > will be able to keep up with the rate of pushes to the repo during
> > "peak hours"? If not, maybe we could use a time-based thing similar to
> > the current cvs->rsync synchronization.
> 
> This strongly depends on how much data there is to update. A few
> ebuilds are quite fast; an eclass change isn't ;). I was thinking of
> something along these lines, in pseudo-code:
> 
>   systemctl restart cache-regen
> 
> That is, we start the regen on every update. If it finishes in time, it
> commits the new metadata. If another update occurs during regen, we
> just restart it to let it catch the new data.
> 
> Of course, if we can't spare the resources to do intermediate updates,
> we may as well switch to a cron-based update method.

I don't think per-push metadata regen will work well here if it is the
only way we generate the metadata cache that users sync. It's easy to
imagine a plausible scenario during a period of high activity: a change
to a widely used eclass is followed by commits arriving less than a
minute apart (or closer together than a metadata regen takes) for at
least 30 minutes (the rsync refresh period for most user-facing
mirrors). Each push would restart the regen before it could finish, so
no updated cache would be committed during that whole window.

I haven't run portage metadata regen on a beefy machine lately, but I
don't think it could keep up in all cases. Perhaps someone can prove me
wrong.
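
To make the concern concrete, here is a rough sketch (a toy simulation,
not a measurement) of the restart-on-every-push policy. The function
name, the 60 second regen time and the 50 second push interval are all
made-up assumptions, and it pessimistically assumes every restart costs
a full fixed-length regen run:

```python
def completed_regens(push_times, regen_seconds):
    """Return the times at which a regen run finishes under a
    restart-on-every-push policy.

    push_times: sorted push timestamps, in seconds.
    regen_seconds: assumed (hypothetical) duration of one regen run.
    """
    finished = []
    for i, start in enumerate(push_times):
        done = start + regen_seconds
        next_push = push_times[i + 1] if i + 1 < len(push_times) else None
        # A run survives only if no later push restarts it first.
        if next_push is None or done <= next_push:
            finished.append(done)
    return finished


# Hypothetical numbers: an eclass change at t=0, then a push every 50 s
# for 30 minutes, with each regen assumed to take 60 s.
pushes = list(range(0, 1800 + 1, 50))
print(completed_regens(pushes, regen_seconds=60))  # prints [1860]
```

With these numbers the only run that completes is the one started by
the last push, i.e. no fresh cache for the whole 30 minutes. Real regen
times after the first run may well be shorter, so treat this as an
illustration of the failure mode rather than a prediction.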

Anyway, things could definitely be sped up if portage merged a few of
the speed tweaks used in pkgcore. Specifically, I think using some of
the weakref and perhaps jitted attrs support, along with the eclass
caching hacks, would give a 2-4x metadata regen speedup. Otherwise,
pkgcore or some other tuned regen tool could potentially be used to
regen the metadata instead.

Tim
