Dear Étienne Mollier,

Aha, a logical response. Thank you for shedding some light on this for me. That is probably the case. My access could perhaps be misinterpreted as an attack.

Maybe I misunderstand the concept of a mirror, but I do not wish to maintain a server which allows the public to download Debian repositories. I'll look into it, in any case. If I find it is possible to simply download the entire collection, without having to host a mirror, I may very well go that route.

If I continue the scraping route, would adding a wait time between downloads in my loop make my repeated access less of a problem? I would like to let it run until it is finished. It is tedious to restart my scrape periodically.

Thanks,
John

On Sat, Jun 12, 2021 at 10:35 AM Étienne Mollier <etienne.moll...@mailoo.org> wrote:

> Hi John,
>
> John E Petersen, on 2021-06-12:
> > Hey folks, I’m developing a unique kernel based on Debian Linux, and I’ve
> > been scraping the website for repositories. After a few thousand, the
> > servers start to block my ip.
>
> I'm not too sure what you are trying to achieve. It sounds to
> me like you wish to either develop a Debian derivative, or make
> a backup copy of Debian. The IP blocking you see is probably
> automated, and the result of your having done repeated access on
> a system that might not have been sized to be mirrored directly.
> I would suppose this is in place, so regular users can access
> these resources without being impacted by numerous background
> download tasks hammering the websites.
>
> Please have a look at the mirroring page[1] to assess whether
> you want to mirror the packages archive, and if so, how to do it
> with tools tailored for such a task.
>
> [1]: https://www.debian.org/mirror/ftpmirror
>
> In hope this helps!
> --
> Étienne Mollier <etienne.moll...@mailoo.org>
> Fingerprint: 8f91 b227 c7d6 f2b1 948c 8236 793c f67e 8f0d 11da
> Sent from /dev/pts/2, please excuse my verbosity.
>
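
For reference, the "wait time in my loop" idea asked about above could be sketched roughly as follows. This is only a minimal illustration in Python, not the loop actually used in this thread: the names download_all, fetch, and delay_seconds are hypothetical, the 2-second default is an arbitrary placeholder, and nothing here says what request rate the Debian servers would tolerate. The mirroring tools on the page linked in the quoted message are probably still the better fit for fetching the whole archive.

```python
import time
from typing import Callable, Iterable, List
from urllib.request import urlopen


def fetch(url: str) -> bytes:
    """Download one URL and return its body (hypothetical helper)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def download_all(
    urls: Iterable[str],
    delay_seconds: float = 2.0,  # assumed value, not a documented limit
    fetch_one: Callable[[str], bytes] = fetch,
    sleep: Callable[[float], None] = time.sleep,
) -> List[bytes]:
    """Fetch each URL in order, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch_one(url))
    return results
```

The fetch_one and sleep parameters exist only so the loop can be tried without touching the network; a real run would leave them at their defaults and pass the list of URLs to download.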