[jira] [Comment Edited] (ACCUMULO-4353) Stabilize tablet assignment during transient failure

2016-06-24 Thread Shawn Walker (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15348409#comment-15348409
 ] 

Shawn Walker edited comment on ACCUMULO-4353 at 6/24/16 4:14 PM:
-

bq. Are you attempting to design a mechanism that could be used to avoid 
re-balancing and have the master keep assignments where they were previously, 
knowing that servers will come back into operation?
That is the idea, yes.  There is definitely a tradeoff here.

bq. If this is really about trying to make rolling-restarts better, I'd 
encourage a look at ACCUMULO-1454.
As I mentioned before, I hadn't seen ACCUMULO-1454 before starting this.  I've 
now looked at the discussion of that ticket.  What I implemented was 
approximately what Christopher Tubbs and David Medinets were suggesting.  I 
also read through Keith Turner's design proposal summary.  I have some 
reservations about it:
* It requires that each planned restart involves tablet servers changing ports. 
 While the recent changes to Accumulo to support a narrow port range during 
port search would make this more plausible, it might still prove difficult to 
establish firewall rules for Accumulo.  (Sean Busby raises this issue in the 
discussion).
* What happens if a tablet is split after migration starts?  It seems to me 
there might be a race condition here which would lead to incomplete migration 
between sibling tablet servers.  Do we block assignment during the rolling 
restart, too?  That seems like a cure worse than the problem.
* Even barring those two concerns, I again raise the spectre of ops complexity. 
 To transition a single server, I need to know (a) which port the "old" tserver 
was running on, and (b) which port the "new" tserver is running on.  If I'm 
using some sort of dynamic port assignment (which I would need to unless I 
pointed the "new" tserver at an entirely different configuration), it could be 
non-trivial to gather these pieces of information.  While the burden on the 
operator of a cluster of 5 tservers might not be significant, the burden on the 
operator of a cluster of 200 tservers might make this approach infeasible. And 
the non-triviality of determining the correct port migration mapping would also 
make the process difficult to robustly automate.

bq. While seeing a pull request accompanying the issue reported, It seems a bit 
premature to me to see code without some discussion on what the problems are 
and how best to solve them.
Ahh, my mistake then.  As a new contributor to Accumulo, I still don't have a 
full grasp of the rules, either written or unwritten.  My feeling from watching 
the list was that the primary modus operandi was to present a (fully implemented) 
solution along with a proposed problem, and then to discuss the merits of the 
solution.




[jira] [Comment Edited] (ACCUMULO-4353) Stabilize tablet assignment during transient failure

2016-06-23 Thread Shawn Walker (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347595#comment-15347595
 ] 

Shawn Walker edited comment on ACCUMULO-4353 at 6/24/16 3:04 AM:
-

I did have rolling restarts as a primary motivation for doing this work, though 
a few other scenarios did come to mind as potential applications:
* tserver loses lock (possibly due to load), dies, and is restarted quickly via 
some external infrastructure, e.g. Puppet
* temporary network connectivity loss

I was thinking a `table.suspend.duration` on the order of 2-3 minutes might 
make sense for general purposes in a large cluster: long enough to catch most 
truly transient problems, yet short enough that many applications wouldn't be 
unduly impacted, particularly since any application already has to deal 
with a ~30 second wait before the master notices that a tablet server is gone 
anyway.  After all, Accumulo is ultimately a consistent+partition tolerant 
database, not an available+partition tolerant database.  If availability is a 
user's top priority, other databases (e.g. Apache Cassandra) offer tradeoffs in 
that direction.
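
To put rough numbers on that tradeoff, here is a small sketch of the arithmetic.  It assumes the simplest possible model: the tablet stays unhosted for the detection delay plus the full suspend duration when the tablet server never comes back.  The class and method names are hypothetical and are not part of Accumulo; the 30 second figure is the approximate detection delay mentioned above.

```java
import java.time.Duration;

public class SuspendWindow {
    // Approximate time before the master notices a lost tserver (figure taken from the comment above).
    public static final Duration DETECTION_DELAY = Duration.ofSeconds(30);

    // Worst case under this simplified model: the tserver never returns, so its tablets
    // stay unhosted for the detection delay plus the entire suspend duration.
    public static Duration worstCaseUnhosted(Duration suspendDuration) {
        return DETECTION_DELAY.plus(suspendDuration);
    }

    public static void main(String[] args) {
        // 0 minutes models today's behaviour (no suspension); 2 and 3 are the proposed range.
        for (long minutes : new long[] {0, 2, 3}) {
            Duration worst = worstCaseUnhosted(Duration.ofMinutes(minutes));
            System.out.println("suspend=" + minutes + "m, unhosted for up to " + worst.getSeconds() + "s");
        }
    }
}
```

Under these assumptions, a suspend duration of 2-3 minutes stretches the worst-case wait from roughly 30 seconds to roughly 150-210 seconds, and only for tablets whose server does not return in time.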

I hadn't seen ACCUMULO-1454; I'll take a closer look in the morning. One 
concern that I had with some rolling-restart ideas was a matter of ops 
complexity.  In my (admittedly limited) experience, orchestrating a rolling 
restart that needs to do much more than "kill daemon, restart daemon" over a 
large cluster can be a huge headache.







> Stabilize tablet assignment during transient failure
> 
>
> Key: ACCUMULO-4353
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4353
> Project: Accumulo
>  Issue Type: Improvement
>Reporter: Shawn Walker
>Assignee: Shawn Walker
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a tablet server dies, Accumulo attempts to reassign the tablets it was 
> hosting as quickly as possible to maintain availability.  If multiple tablet 
> servers die in quick succession, such as from a rolling restart of the 
> Accumulo cluster or a network partition, this behavior can cause a storm of 
> reassignment and rebalancing, placing significant load on the master.
> To avert such load, Accumulo should be capable of maintaining a steady tablet 
> assignment state in the face of transient tablet server loss.  Instead of 
> reassigning tablets as quickly as possible, Accumulo should await the 
> return of a temporarily downed tablet server (for some configurable duration) 
> before assigning its tablets to other tablet servers.
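
The behaviour requested in the description can be sketched as a small decision function.  This is only an illustration under stated assumptions, not the code in the pull request: `SuspendPolicy`, `Action`, and `decide` are hypothetical names, and the sketch only covers a tablet that is still suspended (it has not yet been handed to another server).

```java
import java.time.Duration;

public class SuspendPolicy {
    // Possible outcomes for a tablet whose tablet server has been lost (hypothetical names).
    public enum Action { WAIT_FOR_RETURN, RESTORE_TO_ORIGINAL, REASSIGN_ELSEWHERE }

    // Decide what to do with a still-suspended tablet.
    //   downFor         - how long the original tablet server has been gone
    //   suspendDuration - the configurable grace period from the description
    //   serverIsBack    - whether the original tablet server has returned
    public static Action decide(Duration downFor, Duration suspendDuration, boolean serverIsBack) {
        if (serverIsBack) {
            // The server returned while the tablet was still suspended: keep the old assignment.
            return Action.RESTORE_TO_ORIGINAL;
        }
        if (downFor.compareTo(suspendDuration) >= 0) {
            // Grace period exhausted: fall back to ordinary reassignment.
            return Action.REASSIGN_ELSEWHERE;
        }
        // Still inside the grace period: leave the assignment untouched.
        return Action.WAIT_FOR_RETURN;
    }

    public static void main(String[] args) {
        Duration grace = Duration.ofMinutes(2);
        System.out.println(decide(Duration.ofSeconds(45), grace, false));
        System.out.println(decide(Duration.ofSeconds(45), grace, true));
        System.out.println(decide(Duration.ofMinutes(3), grace, false));
    }
}
```

With a suspend duration of zero, the function always chooses reassignment for a missing server, which corresponds to the current reassign-as-quickly-as-possible behaviour.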



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (ACCUMULO-4353) Stabilize tablet assignment during transient failure

2016-06-23 Thread marco polo (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15347266#comment-15347266
 ] 

marco polo edited comment on ACCUMULO-4353 at 6/23/16 9:46 PM:
---

Are you attempting to design a mechanism that could be used to avoid 
re-balancing and have the master keep assignments where they were previously, 
knowing that servers will come back into operation?

I only ask because I question why the load on the master is a problem. You will 
cause load since clients will persist for that length of time. Won't you 
increase the number of thrift connections waiting since you may not re-balance 
for some time? 





