Hotspotting was the first thing that came to my mind with the proposed
balancer. The tservers don't keep all the K/V in memory. You are balancing
query and live ingest across your resources.
-------- Original message --------
From: Eric Newton <[email protected]>
Date: 07/29/2015 8:46 PM (GMT-05:00)
To: [email protected]
Subject: Re: Entry-based TableBalancer
To my knowledge, nobody has written such a balancer.
In the history of the project, we started writing advanced, complicated
balancers that moved tablets around much too quickly, which degraded
performance. After that, we wrote much simpler balancers to avoid the chaos.
We're moving back to more complex balancers, but mostly just to ensure that we
aren't hotspotting, based on known ingest patterns (date-related, for example).
If you write a new balancer, make it slow to move tablets, and very simple.
Avoid over-optimizing tablet placement.
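
As a rough illustration of "slow to move tablets, and very simple", the sketch below plans at most a fixed number of migrations per balancing pass and does nothing while the cluster is within a tolerance. It is not Accumulo's balancer API: the function name plan_migrations, the max_moves and tolerance parameters, and the dictionary layout are all hypothetical, chosen only for this example.

```python
def plan_migrations(server_tablets, max_moves=1, tolerance=0.10):
    """Plan a small number of tablet moves per pass (hypothetical helper).

    server_tablets: {server_name: {tablet_name: entry_count}}
    Returns a list of (tablet, source_server, destination_server).
    """
    # Work on copies so the caller's view of the cluster is not mutated.
    tablets = {s: dict(t) for s, t in server_tablets.items()}
    loads = {s: sum(t.values()) for s, t in tablets.items()}
    total = sum(loads.values())
    moves = []
    for _ in range(max_moves):  # hard cap keeps the balancer slow
        src = max(loads, key=loads.get)
        dst = min(loads, key=loads.get)
        gap = loads[src] - loads[dst]
        # Leave things alone when the imbalance is small (assumed threshold).
        if total == 0 or gap <= tolerance * total:
            break
        # Only tablets with 0 < entries < gap strictly shrink the gap.
        candidates = {t: e for t, e in tablets[src].items() if 0 < e < gap}
        if not candidates:
            break
        # Prefer the tablet closest to half the gap.
        tablet = min(candidates, key=lambda t: abs(candidates[t] - gap / 2))
        entries = tablets[src].pop(tablet)
        tablets[dst][tablet] = entries
        loads[src] -= entries
        loads[dst] += entries
        moves.append((tablet, src, dst))
    return moves


cluster = {"s1": {"t1": 15, "t2": 5, "t3": 8}, "s2": {}}
moves = plan_migrations(cluster, max_moves=5)
```

Even with max_moves=5, this example plans a single move, because the cluster falls inside the tolerance after the first one.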
-Eric
On Wed, Jul 29, 2015 at 8:20 PM, Konstantin Pelykh <[email protected]> wrote:
Hi,
I'm looking for a tablet balancer which operates based on a number of entries
per tablet as opposed to a number of tablets per tablet server. My goal is to
get even distribution of entries across the cluster.
As an example:
tablet #1 15M entries
tablet #2 5M entries
tablet #3 8M entries
After balancing tablets I would want to get:
Server 1 hosts: tablet1
Server 2 hosts: tablet2, tablet3
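
The placement above can be reproduced with a simple greedy pass: take tablets from largest to smallest and give each one to the server currently holding the fewest entries. This is only a minimal sketch of the idea, not an existing Accumulo balancer; the function name assign_by_entries and the data layout are assumptions made for the example.

```python
import heapq


def assign_by_entries(tablet_entries, num_servers):
    """Greedy placement: largest tablet first, onto the least-loaded server.

    tablet_entries: {tablet_name: entry_count}
    Returns {server_index: [tablet_name, ...]}.
    """
    # Min-heap of (entries_hosted, server_index); the least-loaded server pops first.
    heap = [(0, server) for server in range(num_servers)]
    heapq.heapify(heap)
    placement = {server: [] for server in range(num_servers)}
    by_size = sorted(tablet_entries.items(), key=lambda kv: kv[1], reverse=True)
    for tablet, entries in by_size:
        load, server = heapq.heappop(heap)
        placement[server].append(tablet)
        heapq.heappush(heap, (load + entries, server))
    return placement


tablets = {"tablet1": 15_000_000, "tablet2": 5_000_000, "tablet3": 8_000_000}
placement = assign_by_entries(tablets, 2)
for server, hosted in placement.items():
    print(f"Server {server + 1} hosts: {', '.join(sorted(hosted))}")
```

With the three tablets above, one server ends up with tablet1 (15M entries) and the other with tablet2 and tablet3 (13M entries combined).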
The idea is pretty simple and I believe such a balancer has already been
developed, so I decided to check before reinventing the wheel.
Thanks!
Konstantin
--------
Big Data / Lucene and Solr Consultant
LinkedIn: linkedin.com/in/kpelykh
Website: www.kpelykh.com