From: Daniil Leshchev <[email protected]>

Introduce new command-line options for configuring the
balancing process.

Introduce a data collector for gathering information
about network speed. This information can be used
to reduce cluster balancing time.
---
 doc/design-migration-speed-hbal.rst | 44 +++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/doc/design-migration-speed-hbal.rst b/doc/design-migration-speed-hbal.rst
index a0dcfe0..14b867e 100644
--- a/doc/design-migration-speed-hbal.rst
+++ b/doc/design-migration-speed-hbal.rst
@@ -26,3 +26,47 @@ a compromise between moves speed and optimal scoring. This can be implemented
 by introducing ``--avoid-disk-moves *FACTOR*`` option which will admit disk
 moves only if the gain in the cluster metrics is *FACTOR* times
 higher than the gain achievable by non disk moves.
+
+Avoiding insignificant long-time solutions
+==========================================
+
+The next step is to estimate the time required to perform a balancing
+step and to introduce a new term: a ``long-time`` solution.
+
+The ``--long-solution-threshold`` option will specify a duration in seconds.
+A solution whose estimated time exceeds it is a ``long-time`` solution.
+
+With time estimates we will be able to filter Hbal's move sequences and reject
+long-time solutions that do not bring enough gain in the cluster metric. This
+can be done by introducing an ``--avoid-long-solutions *FACTOR*`` option, which
+will admit a long-time solution only if its K/N ratio exceeds *FACTOR*, where
+K is the gain of the solution and N is the estimated time to perform it.
+
+As a result, we can achieve nearly the same improvement of the cluster metrics
+after balancing with a significant decrease in balancing time.
+
+Network bandwidth estimation
+============================
+
+Balancing time can be estimated from the amount of data to be moved and the
+current network bandwidth between each pair of affected nodes.
+
+We propose adding a new data collector that gathers information about network
+speed by sending a known amount of data. By timing the transfer, we can
+estimate the average network speed between any two nodes in the cluster.
+
+DataCollector implementation details
+====================================
+
+As a first step, we suggest implementing a dummy data collector whose output
+can be configured by the user.
+
+For a real data collector, it is useless to send small payloads of less than
+100Kb, because connection setup time would dominate the measurement. Since the
+MTU is typically limited to approximately 1500 bytes, we also propose not to
+use the *ping* command, but to implement our own packet-sending process or,
+for example, to parse the output of the *scp* command.
+
+During *dcUpdate*, every data collector sends requests to the other nodes and
+measures the time taken to get a response. After the master node invokes
+*dcReport* on all collectors, it will have the full graph of network speeds.
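
To make the proposed rule concrete, here is a minimal sketch of how a
solution's time could be estimated from the bandwidth graph and then checked
against the two options. It is an illustration only and is not part of the
patch: the function names, the (src, dst, size) move format, the MB and MB/s
units, and the assumption that moves run one after another are all
hypothetical.

```python
# Illustrative sketch only. All names, units and the sequential-moves
# assumption are hypothetical, not taken from hbal or the data collector.


def estimate_time(moves, bandwidth):
    """Return the estimated time in seconds to perform all moves in order.

    moves: list of (src, dst, size_mb) tuples, data to move per step.
    bandwidth: dict mapping (src, dst) to a measured speed in MB/s,
               i.e. one edge of the network speed graph.
    """
    return sum(size_mb / bandwidth[(src, dst)] for src, dst, size_mb in moves)


def admit(gain, est_time, threshold, factor):
    """Decide whether a solution passes the long-solution filter.

    gain: improvement of the cluster metric (K in the design text).
    est_time: estimated time in seconds (N in the design text).
    threshold: value of --long-solution-threshold, in seconds.
    factor: value of --avoid-long-solutions.
    """
    if est_time <= threshold:
        # Not a long-time solution, so the K/N check does not apply.
        return True
    # Long-time solution: admit only if K/N is greater than FACTOR.
    return gain / est_time > factor
```

With speeds of 100 MB/s and 50 MB/s on two links, moving 20000 MB over the
first and 5000 MB over the second gives an estimate of 300 seconds. With a
threshold of 120 seconds and a factor of 0.001, such a solution would be
admitted for a gain of 0.6 (ratio 0.002) and rejected for a gain of 0.15
(ratio 0.0005).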
-- 
1.9.1
