Public bug reported:

Hello, I wanted a mirror of the irc logs hosted on https://irclogs.ubuntu.com/ and started the project with:

wget --mirror https://irclogs.ubuntu.com/

This worked okay but was very slow, as there's probably hundreds of thousands of links to traverse. I switched to wget2 to get the multiple simultaneous connections, and ran with:

wget2 --mirror https://irclogs.ubuntu.com

I assumed that wget2 would try to accomplish the same thing: mirror that site *and only that site*.

What actually happened was that it followed a link on that site to ubuntu.com and downloaded two and a half million files like this:

$ find ubuntu.com/ -ls | head -20
 11190888 995121 drwxr-xr-x  48 sarnold sarnold 2417914 Mar 16 01:47 ubuntu.com/
  6717440     37 -rw-r--r--   1 sarnold sarnold   73591 Mar 16 01:23 ubuntu.com/security?q=&package=epiphany&offset=40
  6717456     29 -rw-r--r--   1 sarnold sarnold   73469 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=40
  6717468     37 -rw-r--r--   1 sarnold sarnold   73687 Mar 16 01:23 ubuntu.com/security?q=&package=webkitgtk&offset=0
  6717648     29 -rw-r--r--   1 sarnold sarnold   73527 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=0
  6717662     29 -rw-r--r--   1 sarnold sarnold   73555 Mar 16 01:23 ubuntu.com/security?q=&package=grub2-unsigned&offset=60
  6717758     37 -rw-r--r--   1 sarnold sarnold   73625 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=80
  6717786     37 -rw-r--r--   1 sarnold sarnold   73693 Mar 16 01:23 ubuntu.com/security?q=&package=php8.0&offset=0
  6717790     37 -rw-r--r--   1 sarnold sarnold   73591 Mar 16 01:23 ubuntu.com/security?q=&package=grub2-unsigned&offset=80
  6717980     29 -rw-r--r--   1 sarnold sarnold   73435 Mar 16 01:23 ubuntu.com/security?q=&package=epiphany&offset=80
  6717984     37 -rw-r--r--   1 sarnold sarnold   73589 Mar 16 01:23 ubuntu.com/security?q=&package=openssh&offset=20
  6717986     37 -rw-r--r--   1 sarnold sarnold   73649 Mar 16 01:23 ubuntu.com/security?q=&package=awstats&offset=40
  6718000     29 -rw-r--r--   1 sarnold sarnold   73495 Mar 16 01:23 ubuntu.com/security?q=&package=grub2-unsigned&offset=60
  6718034     37 -rw-r--r--   1 sarnold sarnold   73649 Mar 16 01:23 ubuntu.com/security?q=&package=mozjs60&offset=60
  6718176     29 -rw-r--r--   1 sarnold sarnold   73555 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=0
  6718210     37 -rw-r--r--   1 sarnold sarnold   73629 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=20
  6718248     37 -rw-r--r--   1 sarnold sarnold   73617 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=60
  6718266     37 -rw-r--r--   1 sarnold sarnold   73673 Mar 16 01:23 ubuntu.com/security?q=&package=mozjs60&offset=60
  6718292     37 -rw-r--r--   1 sarnold sarnold   73593 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=40
  6718354     29 -rw-r--r--   1 sarnold sarnold   73545 Mar 16 01:23 ubuntu.com/security?q=&package=vlc&offset=60

This is unexpected and unpleasant.

Thanks

ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: wget2 1.99.1-2.1
ProcVersionSignature: Ubuntu 5.4.0-166.183-generic 5.4.252
Uname: Linux 5.4.0-166-generic x86_64
NonfreeKernelModules: lkp_Ubuntu_5_4_0_166_183_generic_101 zfs zunicode zavl icp zcommon znvpair
ApportVersion: 2.20.11-0ubuntu27.27
Architecture: amd64
CasperMD5CheckResult: skip
Date: Sat Mar 16 01:50:30 2024
SourcePackage: wget2
UpgradeStatus: Upgraded to focal on 2020-01-24 (1512 days ago)

** Affects: wget2 (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug focal third-party-packages

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2058082

Title:
  wget2 --mirror leaves the specified host

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/wget2/+bug/2058082/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs