On Mon, Feb 19, 2024 at 10:18 AM Stephen Smoogen <ssmoo...@redhat.com> wrote:
>
>
>
> On Mon, 19 Feb 2024 at 10:08, Kevin Kofler via devel 
> <devel@lists.fedoraproject.org> wrote:
>>
>> Stephen Smoogen wrote:
>> > 1. Drive size is not just what is needed but also throughput. The large
>> > drives needed to store the data COPR uses for its hundreds of chroots are
>> > much 'slower' on reads and writes even when adding in layers of RAID 1+0.
>> > Faster drives are possible but the price goes up considerably.
>> > 2. Throughput of individual drives also requires backplane speeds which
>> > match peak throughput of all the drives. Otherwise you end up with lots of
>> > weird stalling (as seen on certain builders which have such drives).
>>
>> What kind of throughput is needed for a repository that has not seen any new
>> builds for 2 years? Such a repository is going to get only a handful of downloads
>> and no uploads. Instead of deleting old repositories, they can be moved to a
>> low-throughput archive storage. This can be made transparent through
>> symlinks, union file systems, or even just at the HTTPS level if Copr itself
>> knows how to unarchive a repository when internally needed (e.g., if a new
>> build is submitted after 2 years of inactivity).
>>
>
> The throughput is actually in several places even for low/no usage 
> repositories.
> 1. RAID rebuilds will need to go through and check data. RAID-1 might seem 
> like a no-brainer but over time you tend to end up with 'which of these two 
> disks has the correct bit' problems.
> 2. Web spiders and such regularly crawl pretty much every package. 
> Putting some repositories on slow disks and some on fast tends to cause web 
> front ends to stall on unrelated tasks unless you set up your caching 
> and other middleware to deal with it. [This I know from when I tried to make 
> something more 'efficient' on downloads.fedoraproject.org and from some other 
> tooling.] It becomes a complete project of setting up the infrastructure to 
> best handle mixed loads. If you have a limited staff then it may be too much 
> work.
>
> That said, the above does sound like an interesting project to add to copr. I 
> do not know how much work it would take or who would be able to do it these 
> days. [My understanding is that COPR is 'one of many things' that the various 
> developers work on with most of the work done as a volunteer task.]
>

A lot of these problems may go away in the future once the Pulp
backend for COPR is ready. That will allow repository storage to move
from EBS to S3, and repositories in S3 can be set with different
policies with respect to CloudFront for transparent CDN performance.

https://github.com/fedora-copr/copr/issues/2533
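
To make the tiering idea from this thread concrete, here is a minimal
sketch of picking a storage tier from a repository's last build date.
It is only an illustration, not how Copr or Pulp does it: the function
names, the tier labels "hot" and "archive", and the repository names
are all hypothetical, and the 2-year cutoff is just the example figure
mentioned above.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: the thread's example of "no new builds for 2 years".
ARCHIVE_AFTER = timedelta(days=2 * 365)


def pick_tier(last_build: datetime, now: datetime) -> str:
    """Return "archive" for repositories idle longer than ARCHIVE_AFTER, else "hot"."""
    return "archive" if now - last_build > ARCHIVE_AFTER else "hot"


def plan_tiers(last_builds: dict[str, datetime], now: datetime) -> dict[str, str]:
    """Map each repository name to the tier it would be placed in."""
    return {name: pick_tier(ts, now) for name, ts in last_builds.items()}


if __name__ == "__main__":
    now = datetime(2024, 2, 19)
    # Hypothetical repositories, named only for illustration.
    repos = {
        "user/active-project": datetime(2024, 1, 5),
        "user/stale-project": datetime(2021, 6, 1),
    }
    for name, tier in sorted(plan_tiers(repos, now).items()):
        print(f"{name}: {tier}")
```

A repository that receives a new build would simply be classified as
"hot" again on the next pass; how the data actually moves between
tiers is out of scope for this sketch.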



-- 
真実はいつも一つ!/ Always, there's only one truth!