Hi Mike,

One of the nicest features of Beehive is that it enables the system
designer to specify the performance target for the system, and it then
performs just the bare minimum replication necessary to achieve that
goal. 

If you set a lookup target of O(log N) hops, Beehive will simply make
no additional replicas to improve performance. It will still make k'
replicas to tolerate failures of the home node; you choose k' just as
you choose "r-successors" in DHash.
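For concreteness, those k' failure replicas behave like successor
replication in DHash: the object lives on its home node plus the next
k'-1 nodes on the identifier ring. A minimal sketch (the node IDs and
the `k_prime` name are made-up illustrations, not Beehive's actual
API):

```python
from bisect import bisect_left

def home_and_successors(ring, key, k_prime):
    """Return the home node for `key` plus the next k'-1 successors
    on a sorted identifier ring, wrapping around at the end."""
    ring = sorted(ring)
    i = bisect_left(ring, key) % len(ring)  # first node with ID >= key is the home
    return [ring[(i + j) % len(ring)] for j in range(k_prime)]

# With nodes at IDs 10, 20, 30, 40 and k' = 3, key 25 is stored
# on nodes 30, 40, and 10 (wrapping past the top of the ring).
```

If the home node fails, the next successor already holds a replica and
takes over as the new home, exactly as in DHash's r-successor scheme.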

If you set a performance target of, say, 0.5 hops for lookups, Beehive
will make just enough replicas that, on average, more than 50% of
lookups are satisfied at the first overlay node (zero hops). Some
queries take one more hop, some two, a few as many as log N, for an
average lookup path length of 0.5 hops.
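To make the arithmetic concrete (the hop distribution below is
invented for illustration, not taken from the paper):

```python
def average_hops(distribution):
    """Expected lookup path length, given a map of
    {hop_count: fraction_of_lookups}."""
    return sum(hops * frac for hops, frac in distribution.items())

# Hypothetical distribution meeting a 0.5-hop target:
# 60% of lookups end at the home machine, 30% take one hop, 10% take two.
print(average_hops({0: 0.6, 1: 0.3, 2: 0.1}))  # -> 0.5
```

Beehive's job is to pick per-object replication levels so the
popularity-weighted distribution works out to the target you asked for.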

If you were running, say, a P2P videoclips-on-demand service, you could
guarantee that half the clips a user selects are already on their home
machine. The number of additional replicas needed to go from O(log N)
to, say, 0.5-hop lookups is often quite modest. Our NSDI'04 paper has a
graph showing the bandwidth vs. lookup-time tradeoff.

It would be incorrect to claim that Beehive is aggressive in its
replication. It gives you a pretty convenient knob with which you can
adjust the performance goals of the system. It does just enough
replication to achieve the desired goal. By "just enough," we mean
"absolutely optimal" if your app is amenable to the original Beehive
approach (NSDI'04), or "near-optimal" if you are using the more general
Honeycomb technique (NSDI'06). 

We built a few systems based on Beehive/Honeycomb:
        CoDoNS (SIGCOMM'04), a safety net and replacement for DNS,
        Corona (NSDI'06), a next-gen RSS replacement,
        and CobWeb, an Akamai-like web cache that fields >10 million
        hits/day.
So it's a practical algorithm, and not very hard to layer on top of
Pastry, Chord, Kademlia, Koorde, Viceroy, SkipNet, and many other structured
overlays. Feel free to contact us (beehive AT systems DOT cs DOT cornell
DOT edu) if you have any questions about the system.

Best,
Gun,
Venugopalan "Rama" Ramasubramanian,
Ryan Peterson,
Yee Jiun Song.

On Tue, 2006-07-25 at 20:35 -0700, Michael Parker wrote: 
> Although it's been awhile since I've read the paper, IIRC, Beehive's
> replication policy is very aggressive. If your goal is to replicate so
> that the data remains on the network under the harshest levels of
> churn imaginable, then Beehive would be a good idea. But if you're not
> expecting this sort of workload, and your primary goal for replication
> was to make retrieving the data quicker, it might be better to use a
> more "modest" but proven replication strategy, such as DHash, and then
> look into speeding up lookups using parallelism. (See Kademlia,
> Epichord, or Accordion.)
> 
> 
> - Mike
> 
> 
> 
> On Jul 25, 2006, at 6:44 AM, Bob Harris wrote:
> 
> > Seems like replication is a constant topic around here. Beehive is
> > the state of 
> > the art in replication. They can turn a O(lg N) overlay like chord
> > or pastry into 
> > an O(1) overlay purely via replication.
> > 
> > Bob. 


_______________________________________________
p2p-hackers mailing list
[email protected]
http://lists.zooko.com/mailman/listinfo/p2p-hackers
