Thanks for sharing the article, SJ, and the additional details, Peter! Just wanted to mention, tangentially related, that there is one place in Wikimedia where anomaly detection is used for monitoring "performance": detecting instances of Wikipedia outages (often censorship). More details in this blog post: https://techblog.wikimedia.org/2021/01/15/censorship-outages-and-internet-shutdowns-monitoring-wikipedias-accessibility-around-the-world/
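To give a flavor of what that kind of detection can look like (purely illustrative; this is not the actual pipeline described in the post, and the function and parameter names are made up), a minimal Python sketch that flags hours where a country's request counts drop far below the recent median:

    import statistics

    def detect_drops(counts, window=24, k=5.0):
        # counts: hourly request counts for one country (hypothetical input).
        # Flag hours whose count falls far below the median of the
        # preceding window, using median absolute deviation (MAD) as
        # the measure of typical variation.
        anomalies = []
        for i in range(window, len(counts)):
            recent = counts[i - window:i]
            med = statistics.median(recent)
            mad = statistics.median([abs(c - med) for c in recent]) or 1.0
            if (med - counts[i]) / mad > k:  # large drop vs. typical variation
                anomalies.append(i)
        return anomalies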
Best,
Isaac

On Mon, Jan 31, 2022 at 8:30 AM <pe...@wikimedia.org> wrote:
> Hi Samuel,
> my name is Peter and I work in the performance team. I also read the post
> and found it interesting. Our performance metrics are viewable in Grafana;
> a good starting point is the performance summary dashboard:
> https://grafana.wikimedia.org/d/cZgMg49Wz/performance-summary. We have
> many dashboards but we lack some documentation, so please ask and I can
> guide you.
>
> We collect and keep track of performance metrics directly from our users,
> we run synthetic browser tests every X hours where we record a video of
> the browser screen and collect visual metrics, and we also run some tests
> on commits.
>
> The largest research we've done in this area is the study Gilles did on
> the correlation between what users perceive and browser metrics:
> https://techblog.wikimedia.org/2019/06/17/performance-perception-correlation-to-rum-metrics/
> and the paper https://nonsns.github.io/paper/rossi19www.pdf.
>
> For regressions, I've gone down the same path as the people at Netflix,
> trying different numbers of runs, taking the median/fastest/slowest run,
> etc., to find more "stable" metrics. We don't proxy performance by memory
> usage; we focus more on visual metrics for the users, and for us we need
> to do more than three runs. We do 5-11 runs depending on what we test. I
> haven't blogged about that work but it should be in some Phabricator
> tasks; I can look it up if you are interested. What is also interesting
> is what kind of practical regression you could find. In our most trimmed
> systems I think we can find performance regressions that are slightly
> over 2%, but there are parts where the regression needs to be 10-20% for
> us to get an alert.
>
> I wrote a blog post a couple of years ago about one regression:
> https://techblog.wikimedia.org/2018/10/03/best-friends-forever/
>
> I like the use of anomaly detection; we discussed it in the teams some
> time ago but we haven't tried it out. Today we mostly use static
> thresholds in some form. I think a tool for anomaly detection would be
> something many teams could use.
>
> I really like that they have statistics about false alerts etc. We don't
> have that today, but we should. I started to keep track of them manually,
> but hmm, I failed :)
>
> Best
> Peter Hedenskog

--
Isaac Johnson (he/him/his)
Research Scientist
Wikimedia Foundation
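P.S. For anyone curious, a minimal sketch of the multi-run median comparison Peter describes (hypothetical numbers and names, not the performance team's actual tooling):

    import statistics

    def is_regression(baseline_runs, new_runs, threshold=0.02):
        # Runs could be e.g. SpeedIndex values in ms from repeated tests.
        # Use the median of each set of runs to damp run-to-run noise,
        # and flag a regression if the new median is more than
        # `threshold` (here 2%) slower than the baseline median.
        baseline = statistics.median(baseline_runs)
        current = statistics.median(new_runs)
        return (current - baseline) / baseline > threshold

    # With 5 runs per build (Peter mentions 5-11):
    # is_regression([1200, 1180, 1210, 1195, 1205],
    #               [1260, 1240, 1255, 1248, 1262])
    # -> True (median 1255 vs. 1200, ~4.6% slower)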