[Gluster-devel] Rebalance improvement design

Susant Palai Tue, 31 Mar 2015 01:20:08 -0700

Hi,
   Posted patch for rebalance improvement here: 
http://review.gluster.org/#/c/9657/ .
You can find the feature page here: 
http://www.gluster.org/community/documentation/index.php/Features/improve_rebalance_performance


The current patch addresses two parts of the proposed design:
1. Rebalance multiple files in parallel
2. Crawl only bricks that belong to the current node

Brief design explanation for the above two points.

1. Rebalance multiple files in parallel:
   -------------------------------------

        The existing rebalance engine is single threaded. This patch introduces
multiple migration threads that run in parallel with the crawler.
        Rebalance migration is converted to a "Producer-Consumer" framework,
        where the Producer is : Crawler
              the Consumer is : Migrating Threads
 
        Crawler: The crawler is the main thread. Its job is now limited to
fixing the layout of each directory and adding the files that are eligible
for migration to a global queue. Hence, the crawler is not blocked by the
migration process.

        Consumer: Each migration thread monitors the global queue. When a
file is added to the queue, a thread dequeues that entry and migrates the
file. Currently 15 migration threads are spawned at the beginning of the
rebalance process, so multiple files are migrated in parallel.
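
The crawler and migration threads described above can be sketched roughly as
follows. This is only an illustration in Python, not the code in the patch
(the actual change is in the C rebalance code). The names `crawl`, `migrator`,
`rebalance`, `needs_migration` and `migrate` are hypothetical, and the
fix-layout step is left out. Only the thread count of 15 comes from the
description above.

```python
import queue
import threading

NUM_MIGRATORS = 15   # thread count mentioned in this mail
_STOP = object()     # hypothetical sentinel that tells a migrator to exit


def crawl(tree, work_queue, needs_migration):
    """Producer: walk each directory and enqueue files eligible for migration."""
    for directory, files in tree.items():
        # The fix-layout of `directory` would happen here (omitted in this sketch).
        for name in files:
            path = directory + "/" + name
            if needs_migration(path):
                work_queue.put(path)


def migrator(work_queue, migrate, migrated, lock):
    """Consumer: dequeue entries and migrate them until the sentinel arrives."""
    while True:
        path = work_queue.get()
        if path is _STOP:
            break
        migrate(path)
        with lock:
            migrated.append(path)


def rebalance(tree, needs_migration, migrate, num_migrators=NUM_MIGRATORS):
    """Start the migrators, run the crawler in the calling thread, then drain."""
    work_queue = queue.Queue()
    migrated, lock = [], threading.Lock()
    workers = [
        threading.Thread(target=migrator,
                         args=(work_queue, migrate, migrated, lock))
        for _ in range(num_migrators)
    ]
    for worker in workers:
        worker.start()
    crawl(tree, work_queue, needs_migration)   # never blocks on a migration
    for _ in workers:
        work_queue.put(_STOP)                  # one sentinel per migrator
    for worker in workers:
        worker.join()
    return sorted(migrated)
```

Because the crawler only puts paths on the queue, a slow migration delays one
migration thread rather than the crawl itself.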


2. Crawl only bricks that belong to the current node:
   --------------------------------------------------

           A rebalance process is spawned on each node, and for the sake of
load balancing it migrates only the files that belong to its own node.
However, it currently reads entries from the whole cluster, which is
unnecessary and means readdir hits other nodes as well.

     New Design:
           As part of the new design, the rebalancer determines which subvols
are local to its node by checking the node-uuid of the root directory before
the crawler starts. Hence, readdir no longer hits the whole cluster, since
the rebalancer already has the context of the local subvols, and the
node-uuid request for each file can also be avoided. This makes the
rebalance process more scalable.
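
The local-subvol selection can be pictured with a small sketch. It is a
simplified Python illustration under assumed data structures, not the patch
itself: `root_node_uuids` is a hypothetical mapping from subvol name to the
node-uuid reported for that subvol's root directory, and `subvol_entries`
stands in for what readdir would return on each subvol.

```python
def local_subvols(root_node_uuids, my_uuid):
    """Return the subvols whose root directory reports this node's uuid.

    This check runs once, before the crawl starts.
    """
    return [subvol for subvol, uuid in root_node_uuids.items()
            if uuid == my_uuid]


def crawl_local(subvol_entries, local):
    """Yield (subvol, entry) pairs from local subvols only.

    No per-file node-uuid lookup is needed, because every entry read
    from a local subvol already belongs to this node.
    """
    for subvol in local:
        for entry in subvol_entries[subvol]:
            yield subvol, entry
```

For example, with three subvols of which two report the local uuid, only
those two are ever read, and the third subvol's node sees no readdir traffic
from this rebalance process.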


Requesting reviews asap.

Regards,
Susant
