I have the following suggestions/tweaks:

- [Balance operation step 5] The Master keeps track of unacknowledged range moves for some time. Periodically (say, once a month) the Master broadcasts a list of unacked range moves to all RangeServers and confirms that each range is owned by at most one RangeServer.
- [sys/RS_STATS table] Stagger writes to the sys/RS_STATS table, for example by distributing them randomly over a 10 min interval every hour.
- [Load balancing pseudocode] Re-sort server_load_vec_asc in the inner loop to avoid heavily loaded ranges always moving together.
- [Load balancing pseudocode] Use a separate loadavg_per_loadestimate for the source and destination RangeServers to calculate separate "partial_deviation" values. Using the same partial_deviation value for both servers assumes they have identical capacity. The loadavg_per_loadestimate values can be stored in the server_load_vec.
- [RangeServer Added] Avoid doing multiple balance operations when a set of servers is added over a time interval of a couple of minutes. When a RangeServer is added, wait for some interval (say 5 min) before doing a balance operation.
- [BasicBalancing Algorithm] Cells/s and BytesRead/s represent network traffic better than they represent load on an individual server. For example, a scan might read 200 MB of data but return only 100 bytes. So the loadestimate should contain some raw Bytes/s data to better estimate the work done by the RangeServer on behalf of a range.
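To make the two timing-related suggestions ([sys/RS_STATS table] and [RangeServer Added]) a bit more concrete, here is a rough Python sketch. It is only an illustration, not Hypertable code: stats_write_offset, BalanceDebouncer, and their parameters are hypothetical names, and restarting the wait whenever another server shows up is my assumption about how the 5 min delay should behave.

```python
import random

WINDOW_S = 600.0   # 10 min window per hour, as suggested above
QUIET_S = 300.0    # 5 min wait after a RangeServer is added


def stats_write_offset(server_id: str, hour_index: int,
                       window_s: float = WINDOW_S) -> float:
    """Seconds after the top of the hour at which a server writes its stats.

    Seeding with the (hypothetical) server id and hour makes the offset
    reproducible for one server while spreading different servers
    across the window.
    """
    rng = random.Random(f"{server_id}:{hour_index}")
    return rng.uniform(0.0, window_s)


class BalanceDebouncer:
    """Collapse a burst of server additions into a single balance operation."""

    def __init__(self, quiet_s: float = QUIET_S) -> None:
        self.quiet_s = quiet_s
        self.last_added = None  # time of the most recent addition, if pending

    def server_added(self, now: float) -> None:
        # Every new addition restarts the quiet period (an assumption).
        self.last_added = now

    def should_balance(self, now: float) -> bool:
        # True exactly once, after quiet_s has passed with no new additions.
        if self.last_added is None:
            return False
        if now - self.last_added >= self.quiet_s:
            self.last_added = None
            return True
        return False
```

With this shape the Master would call server_added() on each registration and poll should_balance() from its timer loop, so servers added a couple of minutes apart trigger one balance instead of several.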
-Sanjit

On Mon, Dec 20, 2010 at 1:23 PM, Doug Judd <[email protected]> wrote:

> Hi Gordon,
>
> Sounds like a plan. Whenever you get the wiki page put together please
> post it to this list so that other folks in the community can keep tabs
> and/or participate. Thanks!
>
> - Doug
>
> On Mon, Dec 20, 2010 at 11:39 AM, Gordon <[email protected]> wrote:
>
>> Hi Doug,
>>
>> That's fine at this point since it simplifies that task to some degree.
>> We'll work together to propose a few ideas for what the basic API might
>> look like and then refine it with input from other interested parties.
>>
>> I will try to formulate it as an ML problem and as an optimization
>> problem so that researchers from those communities can engage it. I'll
>> work up a wiki page for the problem, we'll get some feedback on it, and
>> then when we're ready I can push it to various research groups.
>>
>> On Mon, Dec 20, 2010 at 7:30 PM, Doug Judd <[email protected]> wrote:
>>
>>> Hi Gordon,
>>>
>>> Thank you. Any help from you and the folks in your network would be
>>> much appreciated. Let me know if there is any more information you
>>> need from me or if you'd like to set up a meeting at some point to
>>> discuss.
>>>
>>> As far as the balance policy goes, I think for the first go-round we
>>> might want to have the system automatically balance when nodes are
>>> added and include a BALANCE command to be run manually if/when the
>>> system becomes out of balance.
>>>
>>> - Doug
>>>
>>> On Mon, Dec 20, 2010 at 10:38 AM, Gordon <[email protected]> wrote:
>>>
>>>> Hi Doug,
>>>>
>>>> Thanks for the nice writeup -- at this point we would like to start
>>>> carving out what the API would look like to allow people to extend it
>>>> with their own intelligent load balancing algorithms. Load balancing
>>>> of this type is recognized as a hard research problem(*) and I would
>>>> like to see Hypertable be used as central to that research.
>>>>
>>>> We'll need a set of abstractions and interfaces that allow a variety
>>>> of approaches that people might like to apply -- reinforcement
>>>> learning, constraint programming, or other AI style planning
>>>> approaches. I'd like to collect some feedback from a few different
>>>> sources to see if we can solve for the minimal interface that gives
>>>> everyone the inputs and the controls they need to devise a balancer.
>>>> Then, we can produce an API recommendation.
>>>>
>>>> Also, we should add a bit more detail to talk about policy for
>>>> calling the balancer. Intelligence in the balancing strategy might
>>>> be better served by allowing the balancer policy itself to be learned
>>>> or optimized. For example, the system might be better left in a
>>>> slightly unbalanced condition rather than pay the balancing cost if
>>>> it's learned that the unbalanced condition is temporary.
>>>>
>>>> I'll point out the problem and this mailing list to folks doing
>>>> research in ML (particularly SysML or ML applied to systems
>>>> management) but also from the constraint computing field.
>>>>
>>>> (*) see for example:
>>>> http://www2.cs.uni-paderborn.de/cs/ag-monien/RESEARCH/LOADBAL/
>>>>
>>>> On Sun, Dec 19, 2010 at 9:15 PM, Doug Judd <[email protected]> wrote:
>>>>
>>>>> We're now 100% focused on load balancing. This will handle the
>>>>> addition of new nodes as well as normal load imbalances. I've come
>>>>> up with a design and have described it in the following document:
>>>>>
>>>>> http://code.google.com/p/hypertable/wiki/LoadBalancing
>>>>>
>>>>> Please read it and send your feedback. Thanks!
>>>>>
>>>>> - Doug
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Hypertable Development" group.
>>>>> To post to this group, send email to [email protected].
>>>>> To unsubscribe from this group, send email to [email protected].
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/hypertable-dev?hl=en.
>>>>
>>>> --
>>>> Gordon Rios -- Cork Constraint Computation Centre
>>>> http://www.4c.ucc.ie/web/people.jsp?id=144
>>>> http://www.linkedin.com/in/gordonrios
>>>> Ireland: +353 86 089 2416
>>>> USA: +1 650 906 3473
