Ahhh... yes. The keynote from Google at Buzzwords this year also had some peripheral comments about variable replication while talking about Spanner.

On Tue, Sep 11, 2012 at 7:34 PM, Ian Holsman <[email protected]> wrote:

> The paper mentions how they selectively replicate different subsets of the
> data. They use 'china queries' or somesuch as their example.
>
> my understanding is that there is some kind of query/subset monitor that
> detects hot spots, and then increases the replication count of them across
> the farm. It must also be responsible for decreasing the count as the
> hotspots become cool again.
>
> regards
> Ian
>
> On Sep 12, 2012, at 12:31 PM, Ted Dunning <[email protected]> wrote:
>
> > What do you mean by selective replication?
> >
> > On Tue, Sep 11, 2012 at 7:23 PM, Worthy LaFollette <[email protected]> wrote:
> >
> >> Very good paper. Am curious now as to the strategies for selective
> >> replication, which looks like it would, if done right, make the query
> >> generation more efficient. Do you know of any papers on that subject?
> >>
> >> On Tue, Sep 11, 2012 at 1:37 PM, Ted Dunning <[email protected]> wrote:
> >>
> >>> Headed into Thursday's meetup, this paper by Jeff Dean provides a very
> >>> good description of strategies for getting fast response times with
> >>> variable quality infrastructure.
> >>>
> >>> http://research.google.com/people/jeff/latency.html
> >>>
> >>> The key point here is that it is very important to have asynchronous
> >>> queries with a cancel. Above that level, there needs to be a simple
> >>> strategy for pushing second versions of queries out to the workers and
> >>> canceling defunct or redundant queries.
>
> --
> Ian Holsman
> [email protected]
> http://doitwithdata.com.au
> PH: +61-400-988-964 Skype:iholsman
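
For anyone who wants the "asynchronous queries with a cancel" point from the bottom of the thread in concrete form, here is a minimal sketch of one way it could look. It is not taken from the paper or from any real system. It assumes a replica can be modelled as an async callable, and the names hedged_query, make_replica and backup_delay are all made up for illustration. The idea: send the query to one replica, push a second copy to the next replica if no answer arrives within backup_delay, take whichever answer comes first, and cancel the now-redundant copies.

```python
import asyncio


async def hedged_query(replicas, query, backup_delay=0.05):
    """Return the first answer from `replicas`, cancelling the leftovers.

    `replicas` is a list of async callables (a hypothetical stand-in for
    real workers). A new copy of the query is pushed out every
    `backup_delay` seconds until one copy has answered.
    """
    if not replicas:
        raise ValueError("need at least one replica")
    tasks = []
    try:
        for replica in replicas:
            # Push the next copy of the query out to a worker.
            tasks.append(asyncio.create_task(replica(query)))
            done, _ = await asyncio.wait(
                tasks, timeout=backup_delay,
                return_when=asyncio.FIRST_COMPLETED)
            if done:
                # First copy to finish wins (its exception, if any, propagates).
                return done.pop().result()
        # Every replica has a copy in flight; wait for whichever finishes first.
        done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        # Cancel defunct or redundant copies and let them unwind.
        for task in tasks:
            if not task.done():
                task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


def make_replica(name, delay, cancelled):
    """Build a fake replica that answers after `delay` seconds (demo only)."""
    async def run(query):
        try:
            await asyncio.sleep(delay)
            return f"{name} answered {query!r}"
        except asyncio.CancelledError:
            cancelled.append(name)  # record that this copy was cancelled
            raise
    return run


if __name__ == "__main__":
    cancelled = []
    replicas = [make_replica("slow", 0.5, cancelled),
                make_replica("fast", 0.01, cancelled)]
    print(asyncio.run(hedged_query(replicas, "q")))
    print("cancelled:", cancelled)
```

The cancel is what keeps this cheap: without it, every backup copy would run to completion and the extra load could create the very hot spots the monitor is supposed to smooth out. A real system would also need the worker itself to honour the cancel, which this sketch only simulates.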
