On Fri, Sep 24, 2021 at 1:32 AM Nicolás Alvarez
<nicolas.alva...@gmail.com> wrote:
>
> El jue, 23 de sep. de 2021 a la(s) 19:31, Aleix Pol (aleix...@kde.org) 
> escribió:
> >
> > On Thu, Sep 23, 2021 at 10:12 PM Nicolás Alvarez
> > <nicolas.alva...@gmail.com> wrote:
> > >
> > > El jue, 23 de sep. de 2021 a la(s) 08:55, Aleix Pol (aleix...@kde.org) 
> > > escribió:
> > > >
> > > > On Thu, Sep 23, 2021 at 11:52 AM Ben Cooksley <bcooks...@kde.org> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > It has recently come to our attention that the number of queries 
> > > > > being handled for the endpoint 
> > > > > https://autoconfig.kde.org/ocs/providers.xml on a day to day basis 
> > > > > has gotten to the point where it is causing issues with server 
> > > > > responsiveness to other traffic. This is perhaps best summarised by 
> > > > > the following:
> > > > >
> > > > > root@nicoda /var/log/apache2 # ls -lah ...
> > > > > -rw-r----- 1 root adm 458M Sep 23 06:25 autoconfig.kde.org.log.1
> > > > > -rw-r----- 1 root adm 381M Sep 23 06:25 networkcheck.kde.org.log.1
> > > > > -rw-r----- 1 root adm 143M Sep 23 06:25 www.kde.org.log.1
> > > > >
> > > > > root@nicoda /var/log/apache2 # cat autoconfig.kde.org.log.1 | wc -l
> > > > > 4,222,343
> > > > >
> > > > > Based on those numbers we're looking at 48-49 requests per second (on 
> > > > > average - peaks are much higher by many magnitudes), which seems 
> > > > > extremely excessive given that this file is only supposed to be 
> > > > > retrieved by KDE software when GHNS functionality is triggered. That 
> > > > > is supported by the substantial size difference it has over 
> > > > > networkcheck.kde.org - which is used by plasma-nm and NetworkManager 
> > > > > (on Neon) to check for whether they have a working internet 
> > > > > connection - which i'd expect to be the site receiving the most 
> > > > > traffic.
> > > > >
> > > > > As such, I therefore suspect we have bug(s) in software that makes 
> > > > > use of GHNS functionality.
> > > > >
> > > > > It would therefore be appreciated if we could please review the 
> > > > > software in question to determine whether it is operating correctly. 
> > > > > Given that it usually runs in the background on user systems, i'd 
> > > > > especially appreciate it if a detailed review could be conducted on 
> > > > > Discover and other software that conducts package management 
> > > > > operations or assists in managing updates.
> > > > >
> > > > > Unfortunately all these applications submit a fairly useless user 
> > > > > agent (Mozilla/5.0) so it is impossible for Sysadmin to ascertain any 
> > > > > further information. If we could get information on the software that 
> > > > > is originating the request added to the user agent to assist in 
> > > > > investigating these issues in the future that would be extremely 
> > > > > helpful.
> > > > >
> > > > > Thanks,
> > > > > Ben
> > > >
> > > > That's correct. Discover fetches them at startup. It's necessary to be
> > > > able to check if there are updates on KNS-provided resources.
> > > >
> > > > Incidentally,  I was looking into this yesterday incidentally. We
> > > > could see if caching is broken somehow. A request will still be needed
> > > > though to check if the cache is out of date.
> > >
> > > Caching seems to be working, since the vast majority of the requests
> > > are returning 304 Not Modified.
> > >
> > > However in *many* cases I see a single IP making multiple requests in
> > > the same second, and doing it again the next minute. Here's one IP
> > > address picked randomly:
> > >
> > > [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:27:57 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:27:58 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:27:58 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:30:11 +0000] "GET /ocs/providers.xml HTTP/1.1" 200
> > > [22/Sep/2021:06:30:11 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:30:11 +0000] "GET /ocs/providers.xml HTTP/1.1" 200
> > > [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > > [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> > >
> > > This continues for hours. And it's not an isolated case; again, the IP
> > > I searched for was a random one from the log.
> > >
> > > There are 120 IP addresses that *each* made more than 10,000 requests
> > > in a 24h period.
> > >
> > > I tried a few GHNS things on my own system and I couldn't reproduce it...
> > >
> > > --
> > > Nicolás
> >
> > Is there a possibilty that we can get an UserAgent of something like
> > that to see who's asking?
>
> Yes please :P
>
> (all requests to this file currently have "Mozilla/5.0" as the
> User-Agent, I very much welcome modifying the code so that it sends
> something useful instead)
>
> --
> Nicolás

This is for KNS:
https://invent.kde.org/frameworks/knewstuff/-/merge_requests/142

Attica already sets the UserAgent.

Aleix

Reply via email to