Re: StochasticLoadBalancer questions
Regarding #1, my main concern is that if we poll the region load at a "bad"
time and get back an abnormally high or low value, the balancer could
overreact. For example, if a region's most recent readRequestsCount is 100 and
you've been seeing 5 for the previous 9 polls, the "average" reported is 52.5
(0.5 * 100 + 0.5 * 5) instead of the plain mean of 14.5. A temporary spike in
requests could make a region look much worse than it will be going forward and
cause it to be moved when that is actually unnecessary.

On Fri, Jan 13, 2017 at 2:10 PM, Ted Yu wrote:

> For #2, you're more than welcome to attach a patch on the JIRA.
>
> For #1, last time I tried to trace which JIRA introduced the formula, but I
> ended up with one Elliott did which just moved that line of code.
> I can spend more time on this in the future.
>
> What downside have you observed for #1?
>
> Cheers
>
> On Fri, Jan 13, 2017 at 2:07 PM, Timothy Brown wrote:
>
> > I tried it out on our staging cluster and saw that the total number of
> > requests per region server was a bit more balanced with our current
> > weights for the read and write costs. I did not attempt to calculate the
> > exact requests per second but rather looked at a relative rate by
> > averaging the increase in reads and writes over the interval at which the
> > RegionLoad is currently polled. This should have the same desired effect
> > of balancing the number of requests across the cluster. If you don't
> > mind, I would like to take a stab at the JIRA you've created.
> >
> > For #1, any idea if this is the desired behavior?
> >
> > Thanks,
> > Tim
> >
> > On Fri, Jan 13, 2017 at 10:27 AM, Ted Yu wrote:
> >
> > > Logged HBASE-17462 for #2.
> > >
> > > FYI
> > >
> > > On Thu, Jan 12, 2017 at 8:49 AM, Ted Yu wrote:
> > >
> > > > For #2, I think MemstoreSizeCostFunction belongs to the same
> > > > category if we are to adopt a moving average.
> > > >
> > > > Some factors to consider:
> > > >
> > > > The data structure used by StochasticLoadBalancer should be concise.
> > > > The number of regions in a cluster can be expected to approach
> > > > 1 million. We cannot afford to store a long history of read / write
> > > > requests in master.
> > > >
> > > > Efficiency of cost calculation should be high: there are many cost
> > > > functions the balancer goes through, and each cost function is
> > > > expected to return quickly. Otherwise we would not come up with
> > > > proper region movement plan(s) in time.
> > > >
> > > > Cheers
> > > >
> > > > On Wed, Jan 11, 2017 at 5:51 PM, Ted Yu wrote:
> > > >
> > > > > For #2, I think it makes sense to try out using request rates for
> > > > > cost calculation.
> > > > >
> > > > > If the experiment result turns out to be better, we can consider
> > > > > using such a measure.
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Wed, Jan 11, 2017 at 5:34 PM, Timothy Brown wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I have a couple of questions about the StochasticLoadBalancer.
> > > > > >
> > > > > > 1) In CostFromRegionLoadFunction.getRegionLoadCost the cost
> > > > > > weights later samples of the RegionLoad more than previous ones.
> > > > > > For example, with a queue size of 4 it would be
> > > > > > (.5 * load1 + .25*load2 + .125*load3 + .125*load4). Is this the
> > > > > > intended behavior?
> > > > > >
> > > > > > 2) Would it make more sense to calculate the ReadRequestCost and
> > > > > > WriteRequestCost as rates? Right now it looks like the cost is
> > > > > > just based off the total number of read/write requests a region
> > > > > > has gotten over its lifetime.
> > > > > >
> > > > > > -Tim
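The 52.5 versus 14.5 figures at the top of this message can be reproduced
with a small sketch. It assumes the halving rule described in question #1
(each newer sample takes half of the total weight); it is not copied from the
HBase source, and the class and method names are hypothetical.

```java
// Sketch only: the halving rule is assumed from question #1 in this thread,
// not taken from HBase. Class and method names are hypothetical.
final class SpikeSensitivitySketch {

  /** Folds samples oldest-to-newest; each new sample takes half the weight. */
  static double halvingAverage(double[] oldestToNewest) {
    if (oldestToNewest.length == 0) {
      return 0.0;
    }
    double cost = oldestToNewest[0];
    for (int i = 1; i < oldestToNewest.length; i++) {
      cost = 0.5 * cost + 0.5 * oldestToNewest[i];
    }
    return cost;
  }

  /** Plain arithmetic mean: every sample is weighted equally. */
  static double simpleMean(double[] samples) {
    if (samples.length == 0) {
      return 0.0;
    }
    double sum = 0.0;
    for (double s : samples) {
      sum += s;
    }
    return sum / samples.length;
  }

  public static void main(String[] args) {
    // Nine quiet polls followed by one spike, as in the example above.
    double[] polls = {5, 5, 5, 5, 5, 5, 5, 5, 5, 100};
    System.out.println("halving average: " + halvingAverage(polls)); // 52.5
    System.out.println("simple mean:     " + simpleMean(polls));     // 14.5
  }
}
```

The halving version needs only one running value per region, which is cheap,
but the newest poll always carries half of the result, so a single spike
dominates it.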
Re: StochasticLoadBalancer questions
For #2, you're more than welcome to attach a patch on the JIRA.

For #1, last time I tried to trace which JIRA introduced the formula, but I
ended up with one Elliott did which just moved that line of code. I can spend
more time on this in the future.

What downside have you observed for #1?

Cheers
Re: StochasticLoadBalancer questions
I tried it out on our staging cluster and saw that the total number of
requests per region server was a bit more balanced with our current weights
for the read and write costs. I did not attempt to calculate the exact
requests per second but rather looked at a relative rate by averaging the
increase in reads and writes over the interval at which the RegionLoad is
currently polled. This should have the same desired effect of balancing the
number of requests across the cluster. If you don't mind, I would like to take
a stab at the JIRA you've created.

For #1, any idea if this is the desired behavior?

Thanks,
Tim
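The "relative rate" idea in this message (averaging the increase in reads and
writes between polls rather than using lifetime totals) might look roughly
like the sketch below. It is an illustration, not the patch that was tested on
the staging cluster: the class and method names are hypothetical, and skipping
negative deltas as counter resets is an assumption added here.

```java
// Sketch only: hypothetical names, not the HBASE-17462 patch.
final class RequestRateSketch {

  /**
   * Average increase of a cumulative request counter per poll interval.
   * Input is the counter value at each poll, oldest first.
   */
  static double averageIncreasePerPoll(long[] cumulativeCounts) {
    long total = 0;
    int intervals = 0;
    for (int i = 1; i < cumulativeCounts.length; i++) {
      long delta = cumulativeCounts[i] - cumulativeCounts[i - 1];
      if (delta < 0) {
        // Assumption: a drop means the counter restarted, so skip the interval.
        continue;
      }
      total += delta;
      intervals++;
    }
    return intervals == 0 ? 0.0 : (double) total / intervals;
  }

  public static void main(String[] args) {
    // Busy region with a small lifetime total: 130 new requests over 3 polls.
    long[] busy = {1000, 1040, 1100, 1130};
    // Idle region with a large lifetime total: no new requests at all.
    long[] idle = {1_000_000, 1_000_000, 1_000_000};
    System.out.println(averageIncreasePerPoll(busy)); // 130 / 3, about 43.33
    System.out.println(averageIncreasePerPoll(idle)); // 0.0
  }
}
```

With lifetime totals the idle region would look far more expensive than the
busy one; with per-poll increases the ordering flips, which is the effect the
message describes.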
Re: StochasticLoadBalancer questions
Logged HBASE-17462 for #2.

FYI
Re: StochasticLoadBalancer questions
For #2, I think MemstoreSizeCostFunction belongs to the same category if we
are to adopt a moving average.

Some factors to consider:

The data structure used by StochasticLoadBalancer should be concise. The
number of regions in a cluster can be expected to approach 1 million. We
cannot afford to store a long history of read / write requests in master.

Efficiency of cost calculation should be high: there are many cost functions
the balancer goes through, and each cost function is expected to return
quickly. Otherwise we would not come up with proper region movement plan(s)
in time.

Cheers
Re: StochasticLoadBalancer questions
For #2, I think it makes sense to try out using request rates for cost
calculation.

If the experiment result turns out to be better, we can consider using such a
measure.

Thanks

On Wed, Jan 11, 2017 at 5:34 PM, Timothy Brown wrote:

> Hi,
>
> I have a couple of questions about the StochasticLoadBalancer.
>
> 1) In CostFromRegionLoadFunction.getRegionLoadCost the cost weights later
> samples of the RegionLoad more than previous ones. For example, with a
> queue size of 4 it would be (.5 * load1 + .25*load2 + .125*load3 +
> .125*load4). Is this the intended behavior?
>
> 2) Would it make more sense to calculate the ReadRequestCost and
> WriteRequestCost as rates? Right now it looks like the cost is just based
> off the total number of read/write requests a region has gotten over its
> lifetime.
>
> -Tim
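For reference, the weights quoted in question #1 follow a simple pattern:
each sample gets half the weight of the next newer one, and the oldest sample
keeps whatever is left so the weights sum to 1. The sketch below generates
them for any queue size. It is derived only from the example in the question,
with load1 taken to be the newest sample; the names are hypothetical and
nothing here is taken from the HBase source.

```java
import java.util.Arrays;

// Sketch only: derived from the queue-size-4 example in question #1.
// Class and method names are hypothetical.
final class EffectiveWeightsSketch {

  /** Weights ordered newest sample first; they always sum to 1. */
  static double[] newestFirstWeights(int queueSize) {
    if (queueSize <= 0) {
      return new double[0];
    }
    double[] weights = new double[queueSize];
    double weight = 1.0;
    for (int i = 0; i < queueSize - 1; i++) {
      weight *= 0.5;       // each older sample gets half the previous weight
      weights[i] = weight;
    }
    weights[queueSize - 1] = weight; // oldest sample keeps the remainder
    return weights;
  }

  public static void main(String[] args) {
    // Prints [0.5, 0.25, 0.125, 0.125], matching the example in question #1.
    System.out.println(Arrays.toString(newestFirstWeights(4)));
  }
}
```

Whatever the queue size, the newest sample always carries weight 0.5, so a
longer queue does not dampen the effect of the most recent poll.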