From: Daniil Leshchev <[email protected]>

Introduce new command-line options for configuring the balancing process.
Introduce a data collector for gathering information about network speed.
This information can be used to optimize cluster balancing time.
---
 doc/design-migration-speed-hbal.rst | 44 +++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/doc/design-migration-speed-hbal.rst b/doc/design-migration-speed-hbal.rst
index a0dcfe0..14b867e 100644
--- a/doc/design-migration-speed-hbal.rst
+++ b/doc/design-migration-speed-hbal.rst
@@ -26,3 +26,47 @@ a compromise between moves speed and optimal scoring. This can be implemented by
 introducing ``--avoid-disk-moves *FACTOR*`` option which will admit disk moves
 only if the gain in the cluster metrics is *FACTOR* times higher than the gain
 achievable by non disk moves.
+
+Avoiding insignificant long-time solutions
+==========================================
+
+The next step is to estimate the amount of time required to perform a
+balancing step and to introduce a new term: ``long-time`` solution.
+
+The ``--long-solution-threshold`` option will specify a duration in seconds.
+A solution exceeding this duration is a ``long-time`` solution by definition.
+
+With time estimates we will be able to filter Hbal's sequences and reject
+long-time solutions that do not bring enough gain in the cluster metric. This
+can be done by introducing an ``--avoid-long-solutions *FACTOR*`` option, which
+will admit only those long-time solutions whose K/N ratio is greater than
+*FACTOR*, where K is the gain of the solution and N is its estimated duration.
+
+As a result, we can achieve almost the same improvement of the cluster metrics
+after balancing, with a significant decrease in balancing time.
+
+Network bandwidth estimation
+============================
+
+Balancing time can be estimated from the amount of data to be moved and the
+current network bandwidth between each pair of affected nodes.
+
+We propose to add a new data collector that will gather information about
+network speed by sending some amount of data. By measuring the time this takes,
+we can estimate average network speed between any two nodes in the cluster.
+
+DataCollector implementation details
+====================================
+
+As a first approach, we suggest implementing a dummy data collector whose
+output can be configured by the user.
+
+For a real data collector it is useless to send tiny payloads of less than
+100Kb, because connection setup time would dominate. Since in almost all TCP/IP
+stack implementations the MTU is limited to approximately 1500 bytes, we also
+propose not to use the *ping* command, but to implement our own process for
+sending packets or, for example, to parse the output of the *scp* command.
+
+During *dcUpdate* every data collector sends requests to the other nodes and
+measures the time to get a response. So after the master node invokes
+*dcReport* on all collectors, it will have the full graph of network speeds.
-- 
1.9.1
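
As an illustration of the filter proposed in the patch, here is a minimal Python sketch of how the time estimate N and the K/N check might fit together. It is not Ganeti code: the ``Move`` structure, the function names, the units (megabytes and megabytes per second) and the sample numbers are all assumptions made for this example. The only parts taken from the design text are the threshold rule and the ``K/N > FACTOR`` comparison.

```python
from dataclasses import dataclass


@dataclass
class Move:
    """One candidate balancing step (hypothetical structure, not Ganeti's)."""
    src: str
    dst: str
    data_mb: float  # amount of data to transfer, in megabytes (assumed unit)
    gain: float     # K: improvement of the cluster metric


def estimate_seconds(move, speed_mb_per_s):
    """Estimate N as data size divided by the measured speed of the node pair."""
    return move.data_mb / speed_mb_per_s[(move.src, move.dst)]


def admit(move, speed_mb_per_s, threshold_s, factor):
    """Return True if the move passes the long-solution filter."""
    n = estimate_seconds(move, speed_mb_per_s)
    if n <= threshold_s:
        # Not a long-time solution, so the K/N filter does not apply.
        return True
    # Long-time solution: admit only if K/N is greater than FACTOR.
    return move.gain / n > factor


# Assumed sample measurements, as a network speed collector might report them.
speeds = {("node1", "node2"): 100.0, ("node1", "node3"): 10.0}
fast = Move("node1", "node2", data_mb=2000.0, gain=0.5)  # estimated 20 s
slow = Move("node1", "node3", data_mb=2000.0, gain=0.5)  # estimated 200 s

print(admit(fast, speeds, threshold_s=60, factor=0.01))
print(admit(slow, speeds, threshold_s=60, factor=0.01))
```

With these assumed numbers the first move stays under the 60-second threshold and is admitted. The second is a long-time solution with K/N = 0.0025, which is below the factor of 0.01, so it is rejected.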
