> On Aug. 19, 2014, 6:31 p.m., Josh Elser wrote:
> > One big design concern I have is what gains the final solution would
> > actually have over what is currently possible with Accumulo as it stands.
> >
> > Right now, you can force tablets to migrate by stopping a tserver. This
> > goes back through the balancer, so you have a bit of churn in however many
> > "rounds" the Balancer takes to choose where those tablets should go, and
> > then for the master to process the necessary assignments for each tserver.
> > How I'm seeing it described is that the only piece of the puzzle that we're
> > making better is removing the migration components in favor of letting the
> > user control this directly. How much does a "smart" Balancer implementation
> > close the gap between the user providing migrations in regards to
> > performance? Also, how does removing the Balancer from the equation change
> > the wall time to get a tablet assigned (is it significant)?
> >
> > We have to also understand that while we can decompose the problem into
> > some simple primitives, I believe this approach is still a rather difficult
> > distributed state problem that I'm worried is being over-architected. My
> > $0.02.
>
> Josh Elser wrote:
>     For context, I was reading about HBase's support on the subject and found
>     http://hbase.apache.org/book/node.management.html. Their general approach is
>     to provide a graceful shutdown for regionservers. This is still subject to
>     problems in mass amounts of servers being stopped at one time. To alleviate
>     some of this pain, they use ZK to store what servers are currently in a
>     "draining state" to avoid new assignments to those nodes -- "[...]
>     decommissioning multiple nodes may be non-optimal because regions that are
>     being drained from one region server may be moved to other regionservers that
>     are also draining. Marking RegionServers to be in the draining state prevents
>     this from happening",

An alternative to this design is the one that Mike mentioned on the issue: temporarily replace the balancer. I am thinking that providing these primitives for manipulating tablets will allow an administrator to quickly script a one-off solution to a problem, in addition to solving the rolling restart problem. You do not get this quick flexibility when writing a new balancer.

Killing tablet servers is a solution. I think it would be nice to have a solution that avoids log recovery, minimizes downtime of individual tablets, preserves locality, and is easy to use. It does not have to be this solution. Without additional scripts, the primary use case in 1454 would not be easy to use.

A balancer alone would not be enough to achieve the goal of migrating tablets between old and new tservers on the same node. However, a balancer plus tserver states like the ones you mentioned from HBase may provide enough. We should probably explore the balancer option a bit more.


- kturner


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24855/#review51006
-----------------------------------------------------------


On Aug. 19, 2014, 5:50 p.m., kturner wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24855/
> -----------------------------------------------------------
>
> (Updated Aug. 19, 2014, 5:50 p.m.)
>
>
> Review request for accumulo.
>
>
> Bugs: ACCUMULO-1454
>     https://issues.apache.org/jira/browse/ACCUMULO-1454
>
>
> Repository: accumulo
>
>
> Description
> -------
>
> Positing ACCUMULO-1454 design doc for review
>
>
> Diffs
> -----
>
>   docs/src/main/asciidoc/design/ACCUMULO-1454-proposal-01.adoc PRE-CREATION
>
> Diff: https://reviews.apache.org/r/24855/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> kturner
>