On Thu, Feb 10, 2022 at 11:05 AM Ben Cooksley <bcooks...@kde.org> wrote:
>
> On Thu, Feb 10, 2022 at 8:20 AM Aleix Pol <aleix...@kde.org> wrote:
>>
>> [Snip]
>>
>> What we still haven't discussed here is how to prevent this problem from
>> happening again.
>>
>> If we don't have information about what is happening, we cannot fix problems.
>
> Part of the issue here is that the problem only came to Sysadmin attention
> very recently, when the system ran out of disk space as a result of growing
> log files.
> It was at that point we realised we had a serious problem.
>
> Prior to that the system load hadn't climbed to dangerous levels (> number of
> CPU cores) and Apache was keeping up with the traffic, so none of our other
> monitoring was tripped.
>
> If you have any thoughts on what sort of information you are thinking of that
> would be helpful.

We could have plots of the number of queries we get with a KNewStuff/*
user-agent over time and their distribution.

> It would definitely be helpful though to know when new software is going to
> be released that will be interacting with the servers as we will then be able
> to monitor for abnormalities.

We make big announcements of every Plasma release... (?)

>> Is there anything that could be done on this front? The issue here
>> could have been addressed months ago, we just never knew it was
>> happening.
>
> One possibility that did occur to me today would be for us to integrate some
> kind of killswitch that our applications would check on first initialisation
> of functionality that talks to KDE.org servers.
> This would allow us to disable the functionality in question on user systems.
>
> The check would only be done on first initialisation to keep load low, while
> still ensuring all users eventually are affected by the killswitch (as they
> will eventually need to log out/reboot for some reason or another).
>
> The killswitch would probably work best if it had some kind of version check
> in it so we could specify which versions are disabled.
> That would allow for subsequent updates - once delivered by distributions -
> to restore the functionality (while leaving it disabled for those who haven't
> updated).

The file we are serving here is effectively the kill switch to all of
KNewStuff.

Aleix
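
For illustration, the numbers behind a plot of KNewStuff/* queries over time
could be gathered roughly as follows. This is only a sketch under assumptions
that are not confirmed anywhere in this thread: that the server writes Apache's
"combined" log format, and that the client version is whatever follows
"KNewStuff/" up to the first space. The function name knewstuff_counts is made
up for this example.

```python
import re
from collections import Counter

# Assumed Apache "combined" log layout:
#   host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

PREFIX = "KNewStuff/"


def knewstuff_counts(lines):
    """Count requests per (day, version) for KNewStuff/* user-agents.

    `lines` is any iterable of log lines. Lines that do not match the
    assumed format, or whose user-agent is not KNewStuff/*, are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue
        agent = match.group("agent")
        if not agent.startswith(PREFIX):
            continue
        # "10/Feb/2022:11:05:00 +0000" -> "10/Feb/2022"
        day = match.group("time").split(":", 1)[0]
        # Assumption: the version runs up to the first space, if any.
        version = agent[len(PREFIX):].split(" ", 1)[0]
        counts[(day, version)] += 1
    return counts
```

Feeding it an open log file, e.g. knewstuff_counts(open(path)) with path
pointing at an access log, gives a Counter keyed by (day, version) that can be
handed to any plotting tool. Grouping by version as well as by day is what
would show whether one release is responsible for a jump in traffic.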