keith-turner opened a new pull request, #6168:
URL: https://github.com/apache/accumulo/pull/6168
Lays the foundation for multiple managers with the following changes. The
best place to start looking at these changes is the Manager.run() method,
which sets everything up and ties it all together.
* Each manager process now acquires two ZooKeeper locks: a primary lock and
an assistant lock. Only one manager process can obtain the primary lock, and
when it does, it assumes the role of primary manager. All manager processes
acquire an assistant lock, which is similar to a tserver or compactor lock.
The assistant lock advertises the manager process as being available to other
Accumulo processes to handle assistant manager operations.
* Manager processes have a single thrift server, and the thrift services
hosted on that server are categorized into primary manager and assistant
manager services. When an assistant manager receives an RPC for a primary
manager thrift service, it will not execute the request; it will either throw
an error or ignore the request.
* The primary manager process delegates manager responsibility via RPCs to
assistant managers.
* Any management responsibility not delegated runs on the primary manager.
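The service categorization described above can be illustrated with a minimal sketch. All names here (ManagerRoleSketch, ServiceKind, NotPrimaryException, dispatch) are hypothetical and are not taken from the pull request; the sketch only shows the idea of a single server that refuses primary-only requests until the process holds the primary lock, and it always throws rather than ignoring the request.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Illustrative only: every name in this class is an assumption, not Accumulo code.
final class ManagerRoleSketch {

  /** Category of a service hosted on the shared server. */
  enum ServiceKind { PRIMARY, ASSISTANT }

  /** Raised when a primary-only request reaches a process that is not primary. */
  static final class NotPrimaryException extends RuntimeException {
    NotPrimaryException(String message) {
      super(message);
    }
  }

  // Set to true once this process has obtained the primary lock.
  private final AtomicBoolean holdsPrimaryLock = new AtomicBoolean(false);

  void primaryLockAcquired() {
    holdsPrimaryLock.set(true);
  }

  /** Runs the handler only if this process is allowed to serve the given kind. */
  <T> T dispatch(ServiceKind kind, Supplier<T> handler) {
    if (kind == ServiceKind.PRIMARY && !holdsPrimaryLock.get()) {
      throw new NotPrimaryException("this process is not the primary manager");
    }
    return handler.get();
  }

  public static void main(String[] args) {
    ManagerRoleSketch manager = new ManagerRoleSketch();
    // Assistant services are served by every manager process.
    System.out.println(manager.dispatch(ServiceKind.ASSISTANT, () -> "assistant request handled"));
    try {
      manager.dispatch(ServiceKind.PRIMARY, () -> "unreachable");
    } catch (NotPrimaryException e) {
      System.out.println("rejected: " + e.getMessage());
    }
    // After the primary lock is obtained, primary services are served too.
    manager.primaryLockAcquired();
    System.out.println(manager.dispatch(ServiceKind.PRIMARY, () -> "primary request handled"));
  }
}
```

Checking the role at dispatch time, rather than registering services conditionally, matches the description of a single thrift server shared by both categories.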
Using the changes above, fate is now distributed across all manager
processes. In the future, these changes should make it easy to delegate
other responsibilities to assistant managers. The following is an outline of
the fate changes.
* New FateWorker class. This runs in every manager and handles requests
from the primary manager to adjust which range of the fate table it is
currently responsible for. FateWorker implements a new thrift service used to
assign it ranges.
* New FateManager class that is run by the primary manager and is
responsible for partitioning fate processing across all assistant managers. As
manager processes come and go this will repartition the fate table evenly
across all available managers. The FateManager communicates with FateWorkers
via thrift.
* Some new RPCs for best-effort notifications. Before these changes there
were in-memory notification systems that made the manager more responsive.
These would allow a fate operation to signal the Tablet Group Watcher to take
action sooner. FateWorkerEnv sends these notifications to the primary manager
over a new RPC. It does not matter if a notification is lost; the work will
still eventually happen.
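The even repartitioning that FateManager is described as performing can be sketched as follows. This is a hedged illustration only: the numeric key space, the KeyRange type, and the partition method are assumptions made for the example, and the real fate table keys and assignment RPCs are not shown in this description.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: the key space and all names here are assumptions.
final class FatePartitionSketch {

  /** Half-open range [start, end) over a hypothetical numeric key space. */
  static final class KeyRange {
    final long start;
    final long end;

    KeyRange(long start, long end) {
      this.start = start;
      this.end = end;
    }

    @Override
    public String toString() {
      return "[" + start + ", " + end + ")";
    }
  }

  /**
   * Splits [0, keySpace) into one contiguous range per manager. Range sizes
   * differ by at most one, and the ranges cover the key space with no gaps.
   */
  static Map<String, KeyRange> partition(List<String> managers, long keySpace) {
    if (managers.isEmpty()) {
      throw new IllegalArgumentException("at least one manager is required");
    }
    Map<String, KeyRange> assignment = new LinkedHashMap<>();
    int n = managers.size();
    long base = keySpace / n;
    long extra = keySpace % n; // the first 'extra' managers get one more key
    long start = 0;
    for (int i = 0; i < n; i++) {
      long size = base + (i < extra ? 1 : 0);
      assignment.put(managers.get(i), new KeyRange(start, start + size));
      start += size;
    }
    return assignment;
  }

  public static void main(String[] args) {
    // Recomputing with a different manager list models managers coming and going.
    System.out.println(partition(Arrays.asList("m1", "m2", "m3"), 10));
    System.out.println(partition(Arrays.asList("m1", "m3"), 10));
  }
}
```

In this sketch the whole assignment is recomputed whenever the set of managers changes, which is the simplest way to keep the split even; a real implementation would also need to handle workers that have not yet released a range.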
Other than fate, the primary manager process does everything the current
manager does. This change pulls from #3262 and #6139.