Wow, this was thought of 2 years back!! Amazing. I checked the code. One nit: a favored node must be avoided if it is already marked as an excluded node.
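To make the nit concrete, the favored-nodes handling could filter like this before choosing targets (just a rough sketch; the helper class and method names are mine, only the DatanodeDescriptor/Node types are from Hadoop):

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor;
import org.apache.hadoop.net.Node;

// Sketch: drop any favored node the caller has already excluded, so an
// excluded node can never be picked just because it was also favored.
public final class FavoredNodeFilter {

  private FavoredNodeFilter() {
  }

  public static List<DatanodeDescriptor> dropExcluded(
      List<DatanodeDescriptor> favoredNodes, Set<Node> excludedNodes) {
    List<DatanodeDescriptor> usable = new ArrayList<DatanodeDescriptor>();
    if (favoredNodes == null) {
      return usable;
    }
    for (DatanodeDescriptor dn : favoredNodes) {
      if (excludedNodes == null || !excludedNodes.contains(dn)) {
        usable.add(dn); // safe to favor
      }
      // else: skip -- exclusion wins over the favored-nodes hint
    }
    return usable;
  }
}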
The use-case I stated above is a little more complex than this. In addition
to the primary node (the local machine serving the shard), we will now need
to instruct Hadoop to favor the secondary too during re-replication.

Also, we have a mixed-storage implementation: the 1st replica goes on the
SSD of the local node, the 2nd & 3rd on HDDs of random nodes. So we need to
handle that accordingly.
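To make that concrete, here is a rough sketch of what the write path could
look like (assuming Hadoop 2.6+; the favored-nodes create() overload is the
HDFS-2576 hint, and the built-in ONE_SSD storage policy keeps one replica on
SSD and the rest on DISK, which is close to what we want; the host names,
port and file path below are placeholders, not Blur code):

import java.net.InetSocketAddress;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Sketch only: create an index file favoring the primary & secondary
// shard servers, with one replica pinned to SSD via a storage policy.
public class FavoredWriteSketch {

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path indexFile = new Path("/blur/tables/t1/shard-0/_0.cfs"); // placeholder
    DistributedFileSystem dfs =
        (DistributedFileSystem) indexFile.getFileSystem(conf);

    dfs.mkdirs(indexFile.getParent());
    // ONE_SSD: one replica on SSD, remaining replicas on DISK.
    dfs.setStoragePolicy(indexFile.getParent(), "ONE_SSD");

    // Favor the primary (local) & secondary shard servers -- placeholders.
    InetSocketAddress[] favored = new InetSocketAddress[] {
        new InetSocketAddress("primary-shard-server", 50010),
        new InetSocketAddress("secondary-shard-server", 50010) };

    FSDataOutputStream out = dfs.create(indexFile,
        FsPermission.getFileDefault(), true /* overwrite */,
        conf.getInt("io.file.buffer.size", 4096), (short) 3,
        dfs.getDefaultBlockSize(indexFile), null /* progress */, favored);
    try {
      out.write("index bytes go here".getBytes("UTF-8"));
    } finally {
      out.close();
    }
  }
}

Favored nodes are only a best-effort hint though, so the custom
BlockPlacementPolicy would still be needed to keep re-replication on the
designated secondary.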
But this is a great implementation to build on. Many thanks for this.

-- Ravi

On Fri, Apr 28, 2017 at 1:15 AM, Tim Williams <[email protected]> wrote:

> Have you looked in /contrib for the block placement stuff? Maybe it
> provides some ideas?
>
> https://git1-us-west.apache.org/repos/asf?p=incubator-blur.git;a=tree;f=contrib/blur-block-placement-policy;h=743a50d6431f4f8cecbb0f55d75baf187da7f755;hb=HEAD
>
> Thanks,
> --tim
>
> On Wed, Apr 26, 2017 at 9:40 AM, Ravikumar Govindarajan
> <[email protected]> wrote:
>
> >> In case of HDFS or MapR-FS can we dynamically assign shards to
> >> shard servers based on data locality (using block locations)?
> >
> > I was exploring the reverse option. Blur will suggest the set of
> > hadoop-datanodes to replicate to while writing index files.
> >
> > Blur will also explicitly control bootstrapping a new datanode &
> > load-balancing it, as well as removing a datanode from the cluster..
> >
> > Such fine control is possible by customizing the BlockPlacementPolicy API...
> >
> > Have started exploring it. The changes look big. Will keep the group
> > posted on progress.
> >
> > On Fri, Apr 21, 2017 at 10:42 PM, rahul challapalli
> > <[email protected]> wrote:
> >
> >> It's been a while since I looked at the code, but I believe a shard
> >> server has a list of shards which it can serve. Maintaining this static
> >> mapping (or tight coupling) between shard servers and shards is a design
> >> decision which makes complete sense for clusters where nodes do not
> >> share a distributed file system. In case of HDFS or MapR-FS can we
> >> dynamically assign shards to shard servers based on data locality
> >> (using block locations)? Obviously this hasn't been well thought out,
> >> as a lot of components would be affected. Just dumping a few thoughts
> >> from my brain.
> >>
> >> - Rahul
> >>
> >> On Fri, Apr 21, 2017 at 9:44 AM, Ravikumar Govindarajan
> >> <[email protected]> wrote:
> >>
> >> > We have been facing a lot of slowdown in production whenever a
> >> > shard-server is added or removed...
> >> >
> >> > Shards which were locally served via short-circuit reads suddenly
> >> > become fully remote &, at scale, this melts down.
> >> >
> >> > The block cache is a reactive cache & takes a lot of time to settle
> >> > down (at least for us!!).
> >> >
> >> > Have been thinking of handling this locality issue for some time now:
> >> >
> >> > 1. For every shard, Blur can map a primary server & a secondary
> >> > server in ZooKeeper.
> >> > 2. File-writes can use the favored-nodes hint of Hadoop & write to
> >> > both these servers [https://issues.apache.org/jira/browse/HDFS-2576].
> >> > 3. When a machine goes down, instead of randomly assigning shards to
> >> > different shard-servers, Blur can decide to allocate them to the
> >> > designated secondary servers.
> >> >
> >> > Adding a new machine is another problem, as it will immediately start
> >> > serving shards from remote machines. It needs local-disk copies of
> >> > all the primary shards it is supposed to serve.
> >> >
> >> > Hadoop has something called BlockPlacementPolicy that can be hacked
> >> > into
> >> > [http://hadoopblog.blogspot.in/2009/09/hdfs-block-replica-placement-in-your.html].
> >> >
> >> > When booting a new machine, let's say we increase the
> >> > replication-factor from 3 to 4 for the shards that will be hosted by
> >> > the new machine (setrep command from the hdfs console).
> >> >
> >> > Now Hadoop will call our CustomBlockPlacementPolicy class to arrange
> >> > the extra replication, where we can sneak in the new IP..
> >> >
> >> > Once all shards to be hosted by this new machine are replicated, we
> >> > can close these shards, update the mappings in ZK & open them. Data
> >> > will then be served locally.
> >> >
> >> > Similarly, when restoring the replication-factor from 4 to 3, our
> >> > CustomBlockPlacementPolicy class can hook up to ZK, find out which
> >> > node to delete the data from & proceed...
> >> >
> >> > Do let me know your thoughts on this...
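P.S. To put a little code behind the idea in the quoted mail above: the
CustomBlockPlacementPolicy could consult ZK to find which datanode should
receive the extra (4th) replica of a shard's files when replication is
bumped from 3 to 4. A very rough sketch, with an invented ZK layout and
helper names (nothing here is existing Blur or Hadoop code except the
ZooKeeper client calls):

import java.nio.charset.StandardCharsets;

import org.apache.zookeeper.ZooKeeper;

// Hypothetical helper for the custom placement policy: given the HDFS path
// of a file being re-replicated, look up in ZooKeeper which datanode the
// extra replica should be "sneaked" onto. The ZK layout is invented.
public class ShardReplicaTargets {

  private final ZooKeeper zk;

  public ShardReplicaTargets(ZooKeeper zk) {
    this.zk = zk;
  }

  // e.g. srcPath = "/blur/tables/t1/shard-0/_0.cfs" -> shard id "shard-0"
  static String shardOf(String srcPath) {
    String[] parts = srcPath.split("/");
    return parts.length >= 2 ? parts[parts.length - 2] : null;
  }

  // Returns the host of the datanode that should receive the extra replica
  // for this shard, or null if no bootstrap is in progress for it.
  public String extraReplicaTarget(String srcPath) throws Exception {
    String shard = shardOf(srcPath);
    if (shard == null) {
      return null;
    }
    String node = "/blur/replica-targets/" + shard; // invented ZK path
    if (zk.exists(node, false) == null) {
      return null;
    }
    byte[] data = zk.getData(node, false, null);
    return new String(data, StandardCharsets.UTF_8);
  }
}

The policy would call extraReplicaTarget(srcPath) from its chooseTarget()
and, if non-null, place the additional replica on that node; when
replication is restored to 3, the same mapping tells it which replica to
delete.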
