> > * Time series correlation and anomaly detection: AKA: I want an alert for
> > that massive memcached bytes_out spike that doesn't also wake me up with
> > false positives at 2AM.
Related: Abe Stanway gave a talk at BACON 2013 about Etsy's realtime anomaly
detection and correlation tools, Skyline and Oculus, which form the Kale
stack [0][1].

[0]: http://devslovebacon.com/conferences/bacon-2013/talks/bring-the-noise-continuously-deploying-under-a-hailstorm-of-metrics
[1]: https://codeascraft.com/2013/06/11/introducing-kale/

On Wed, Nov 5, 2014 at 4:22 PM, Toby Negrin <tneg...@wikimedia.org> wrote:
>
> Awesome -- thanks Ori.
>
> On Wed, Nov 5, 2014 at 12:56 AM, Ori Livneh <o...@wikimedia.org> wrote:
>
>> Facebook just published this summary of a summit for database researchers
>> held at Menlo Park last September. I recommend it. It contains a clear and
>> concise description of Facebook's data infrastructure, and a description of
>> the open problems they are thinking about, which is even more interesting.
>>
>> https://research.facebook.com/blog/1522692927972019/facebook-s-top-open-data-problems/
>>
>> To whet your appetite, here are the problems (the summaries mostly my own
>> paraphrase):
>>
>> * Mobile: How should the shift toward mobile devices affect Facebook's
>> data infrastructure?
>>
>> * Reducing replication: How can we reduce the number of round trips
>> between the application and data layers?
>>
>> * Impact of Caching on Availability (aka "oh no, we just restarted
>> memcached"): How do we harness the efficiency gains provided by caching
>> without being brought to our knees by a sudden drop in cache hit rate?
>>
>> * Sampling at logging time in a distributed environment: How should we
>> sample log streams if we want to maintain accuracy and flexibility to
>> answer post-hoc queries?
>>
>> * Trading storage space and CPU: TL;DR: gzip --best or gzip --fast?
>>
>> * Reliability of pipelines: Pipelines are less reliable than the sum of
>> their parts. A pipeline composed of two systems, each 0.999 reliable,
>> is 0.998 reliable. Much sadness. What to do?
>>
>> * Globally distributed warehouse: consistency models and synchronization
>> problems.
>>
>> * Time series correlation and anomaly detection: AKA: I want an alert for
>> that massive memcached bytes_out spike that doesn't also wake me up with
>> false positives at 2AM.
>>
>> _______________________________________________
>> Engineering mailing list
>> engineer...@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/engineering
>>
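
For anyone who wants to check the arithmetic behind the pipeline reliability point in the quoted list: if the stages run in series and fail independently (my assumption, the summary does not say), the reliability of the whole pipeline is the product of the stage reliabilities. A minimal sketch, with a helper name I made up for illustration:

```python
from math import prod


def pipeline_reliability(stage_reliabilities):
    """Reliability of stages chained in series.

    Assumes each stage fails independently of the others, so the
    pipeline succeeds only when every stage succeeds.
    """
    return prod(stage_reliabilities)


# Two stages at 0.999 each, as in the quoted example.
two_stage = pipeline_reliability([0.999, 0.999])

# Ten such stages, to show how quickly the loss compounds.
ten_stage = pipeline_reliability([0.999] * 10)

print(f"two stages: {two_stage:.6f}")
print(f"ten stages: {ten_stage:.6f}")
```

Two stages at 0.999 come out at roughly 0.998, and the figure keeps dropping with every stage added, which is the "less reliable than the sum of their parts" point.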
_______________________________________________
Wikidata-tech mailing list
Wikidata-tech@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-tech