On Wed, Nov 5, 2014 at 9:56 AM, Ori Livneh <o...@wikimedia.org> wrote:
> Facebook just published this summary of a summit for database researchers
> held at Menlo Park last September. I recommend it. It contains a clear and
> concise description of Facebook's data infrastructure, and a description of
> the open problems they are thinking about, which is even more interesting.
>
> https://research.facebook.com/blog/1522692927972019/facebook-s-top-open-data-problems/
>
> To whet your appetite, here are the problems (the summaries mostly my own
> paraphrase):
>
> * Mobile: How should the shift toward mobile devices affect Facebook's data
> infrastructure?
>
> * Reducing replication: How can we reduce the number of round trips between
> the application and data layers?
>
> * Impact of Caching on Availability (aka "oh no, we just restarted
> memcached"): How do we harness the efficiency gains provided by caching
> without being brought to our knees by a sudden drop in cache hit rate?
>
> * Sampling at logging time in a distributed environment: How should we
> sample log streams if we want to maintain accuracy and flexibility to answer
> post-hoc queries?
>
> * Trading storage space and CPU: TL;DR: gzip --best or gzip --fast?
>
> * Reliability of pipelines: Pipelines are less reliable than the sum of
> their parts. A pipeline composed of two systems, each 0.999 reliable, is
> 0.989 reliable. Much sadness. What to do?
<nitpicking>0.999*0.999=0.998001~=0.998</nitpicking>

> * Globally distributed warehouse: consistency models and synchronization
> problems.
>
> * Time series correlation and anomaly detection: AKA: I want an alert for
> that massive memcached bytes_out spike that doesn't also wake me up with
> false positives at 2AM.
>
> _______________________________________________
> Wikidata-tech mailing list
> Wikidata-tech@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
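
To put the nitpick in runnable form, here is a minimal Python sketch. It assumes the stages fail independently and that the pipeline succeeds only when every stage does, so the per-stage reliabilities multiply. The function names `series_reliability` and `stages_until` are my own, made up for illustration; they come from neither the Facebook post nor any library.

```python
import math


def series_reliability(stage_reliabilities):
    """Reliability of a pipeline in which every stage must succeed.

    Assumption: stage failures are independent, so probabilities multiply.
    """
    return math.prod(stage_reliabilities)


def stages_until(per_stage, target):
    """Smallest number of identical stages whose combined reliability
    is at or below `target` (both values strictly between 0 and 1)."""
    # per_stage**n <= target  <=>  n >= log(target) / log(per_stage),
    # because log(per_stage) is negative and flips the inequality.
    return math.ceil(math.log(target) / math.log(per_stage))


two_stage = series_reliability([0.999, 0.999])
print(f"{two_stage:.6f}")            # 0.998001
print(stages_until(0.999, 0.989))    # 12
```

Under the same independence assumption, it would take a chain of about a dozen 0.999 stages, not two, to fall to the quoted 0.989.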