Re: Load balancing and load determination

2018-12-15 Thread Greg Ames
On Tue, Oct 30, 2018 at 8:53 AM Jim Jagielski  wrote:

> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>

There are a couple of analogous systems from my former employer that used a
"pull from front end queue" concept for load balancing.  I thought that was
very interesting, although I never had any practical experience servicing
those systems.

The idea is that each back end pulls off work as quickly as possible.  If
one back end is slower/faster than the average, it just does less/more work
than the others, and no (bound to fail) clever oracle is required.  The
downside might be the presence of a front end queue which implies latency.
I don't know how such a system would perform when the back ends are only
lightly or moderately loaded and the front end queue is usually empty.
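
A minimal sketch of the pull model described above, assuming an in-process
queue and threads standing in for back ends. The names run_pull_balancer and
delays are hypothetical and nothing here comes from httpd; it only illustrates
that slower workers end up doing less work without any front-end decision.

```python
import queue
import threading
import time


def run_pull_balancer(jobs, delays):
    """Each back end pulls the next job as soon as it is free.

    `delays` maps a hypothetical back-end name to its per-job service
    time in seconds. Returns how many jobs each back end completed.
    """
    work = queue.Queue()
    for job in jobs:
        work.put(job)

    done = {name: 0 for name in delays}

    def backend(name, delay):
        while True:
            try:
                work.get_nowait()   # pull: no front-end decision involved
            except queue.Empty:
                return              # queue drained, back end goes idle
            time.sleep(delay)       # simulate servicing the request
            done[name] += 1         # only this thread writes done[name]

    threads = [threading.Thread(target=backend, args=item)
               for item in delays.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


counts = run_pull_balancer(range(60), {"fast": 0.001, "slow": 0.01})
```

On most machines the "fast" entry finishes with the larger count, but the
exact split depends on timer resolution and scheduling.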

Greg Ames


Re: Load balancing and load determination

2018-11-09 Thread William A Rowe Jr
On Thu, Nov 8, 2018 at 1:48 PM Jim Jagielski  wrote:

> I have a semi-working implementation that I'll be committing to trunk in a
> bit...


 I'm confused. Semi-working would seem to be orthogonal to keeping trunk in
a releasable state, but it depends on what you mean. But before you commit
a significant change, please first consider posting the patch, or simpler,
please consider a sandbox fork for iterative development?  From the project
bylaws:

When to Commit a Change

Ideas must be review-then-commit; patches can be commit-then-review. With a
commit-then-review process, we trust that the developer doing the commit
has a high degree of confidence in the change. Doubtful changes, new
features, and large-scale overhauls need to be discussed before being
committed to a repository. Any change that affects the semantics of
arguments to configurable directives, significantly adds to the runtime
size of the program, or changes the semantics of an existing API function
must receive consensus approval on the mailing list before being committed.


Re: Load balancing and load determination

2018-11-08 Thread Jim Jagielski
I have a semi-working implementation that I'll be committing to trunk in a 
bit...

> On Nov 8, 2018, at 1:33 AM, Mladen Turk  wrote:
> 
> On 30.10.2018. 13:53, Jim Jagielski wrote:
>> As some of you know, one of my passions and area of focus is
>> on the use of Apache httpd as a reverse proxy and, as such, load
>> balancing, failover, etc are of vital interest to me.
> 
> Been a while, but seems I'm back :D
> Love the idea to have a more intelligent than "let's guess"
> way of deducing the load balancer score.
> 
> What we did for heartbeat/heartmonitor/watchdog can be used
> for collecting backend data.
> 
> The thing I'm trying to do is find a way for a backend to
> register or remove itself as a node inside the load balancer.
> That would also require some sort of backend-server communication,
> shared memory management (mod_slotmem maybe), and a way to
> survive graceful restart.
> 
> A backend sending its load status at regular intervals would
> be an addition to "I'm here, count me in" or
> "I'm out, bye, good luck with other nodes".
> 
> What do you think?
> 
> 
> 
> Regards
> -- 
> ^TM



Re: Load balancing and load determination

2018-11-07 Thread Mladen Turk

On 30.10.2018. 13:53, Jim Jagielski wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.

Been a while, but seems I'm back :D
Love the idea to have a more intelligent than "let's guess"
way of deducing the load balancer score.

What we did for heartbeat/heartmonitor/watchdog can be used
for collecting backend data.

The thing I'm trying to do is find a way for a backend to
register or remove itself as a node inside the load balancer.
That would also require some sort of backend-server communication,
shared memory management (mod_slotmem maybe), and a way to
survive graceful restart.

A backend sending its load status at regular intervals would
be an addition to "I'm here, count me in" or
"I'm out, bye, good luck with other nodes".

What do you think?
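
The self-registration and periodic load reports described above might look
roughly like this on the balancer side. This is only a sketch under stated
assumptions: NodeRegistry and its methods are hypothetical names, the table
lives in ordinary process memory (not shared memory or slotmem), and
surviving a graceful restart is not addressed.

```python
import time


class NodeRegistry:
    """Toy balancer-side table of self-registered back ends."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self.timeout = timeout   # seconds of silence before a node is ignored
        self.clock = clock
        self.nodes = {}          # node name -> (load, last_seen)

    def register(self, name, load=0.0):
        # "I'm here, count me in"; also used for periodic load reports
        self.nodes[name] = (load, self.clock())

    report = register

    def unregister(self, name):
        # "I'm out, bye"
        self.nodes.pop(name, None)

    def active(self):
        # Nodes that reported recently enough, with their last known load
        now = self.clock()
        return {n: load for n, (load, seen) in self.nodes.items()
                if now - seen <= self.timeout}
```

A node that stops reporting simply ages out of active(), so a dying backend
does not need to announce its own departure.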



Regards
--
^TM


Re: Load balancing and load determination

2018-11-06 Thread Jim Jagielski
Which is why we allow for both pre-send checks and out-of-band health checks...

> On Nov 5, 2018, at 10:58 AM, William A Rowe Jr  wrote:
> 
> 
> The last thing we want are the routing headaches of contacting an
> ever-changing list of one-or-many potential balancers. And we can't
> rely on a dying lbmember to "check in" that it isn't functional. Since
> the balancer must already start requests to the backend, having that
> backend supplement the responses with its health status is simple.



Re: Load balancing and load determination

2018-11-06 Thread jean-frederic clere
On 05/11/2018 16:58, William A Rowe Jr wrote:
> On Mon, Nov 5, 2018 at 7:48 AM jean-frederic clere wrote:
> 
> On 30/10/2018 13:53, Jim Jagielski wrote:
> > As some of you know, one of my passions and area of focus is
> > on the use of Apache httpd as a reverse proxy and, as such, load
> > balancing, failover, etc are of vital interest to me.
> >
> > One topic which I have been mulling over, off and on, has been the
> > idea of some sort of universal load number, that could be used
> > and agreed upon by web servers. Right now, the reverse proxy
> > "guesses" the load on the backend servers which is OK, and
> > works well enough, but it would be great if it actually "knew"
> > the current loads on those servers. I already have code that
> > shares basic architectural info, such as number of CPUs, available
> > memory, loadavg, etc which can help, of course, but again, all
> > this info can be used to *infer* the current status of those backend
> > servers; it doesn't really provide what the current load actually
> > *is*.
> >
> > So I was thinking maybe some sort of small, simple and "fast"
> > benchmark which could be run by the backends as part of their
> > "status" update to the front-end reverse proxy server... something
> > that shows general capability at that point in time, like Hanoi or
> > something similar. Or maybe some hash function. Some simple code
> > that could be used to create that "universal" load number.
> >
> > Thoughts? Ideas? Comments? Suggestions? :)
> 
> How about having the back-ends provide the load they are able to handle
> as lbfactor (via w_lf or something similar)? That requires the back-ends
> to be able to send requests to the httpd balancer-manager handler.
> 
> 
> Not really. I'd suggest a response header, travelling with each response
> back to the balancer, which can be composed quickly enough to share
> a play-by-play snapshot of the availability of that backend. This adds
> next to no traffic and minimal cpu drain if composed cleanly. And it can
> optionally be axed by the balancer in the response to the client.

The problem is that if there are no requests going to the back-end, the
load-balancer won't know that the back-end is available again after a
load peak.

> 
> The last thing we want are the routing headaches of contacting an
> ever-changing list of one-or-many potential balancers. And we can't
> rely on a dying lbmember to "check in" that it isn't functional. Since
> the balancer must already start requests to the backend, having that
> backend supplement the responses with its health status is simple.
> 
> 

cping/cpong or OPTIONS * allows checking back-end nodes before sending
requests.

-- 
Cheers

Jean-Frederic


Re: Load balancing and load determination

2018-11-05 Thread Stefan Eissing


> Am 05.11.2018 um 16:58 schrieb William A Rowe Jr :
> 
> On Mon, Nov 5, 2018 at 7:48 AM jean-frederic clere  wrote:
> On 30/10/2018 13:53, Jim Jagielski wrote:
> > As some of you know, one of my passions and area of focus is
> > on the use of Apache httpd as a reverse proxy and, as such, load
> > balancing, failover, etc are of vital interest to me.
> > 
> > One topic which I have been mulling over, off and on, has been the
> > idea of some sort of universal load number, that could be used
> > and agreed upon by web servers. Right now, the reverse proxy
> > "guesses" the load on the backend servers which is OK, and
> > works well enough, but it would be great if it actually "knew"
> > the current loads on those servers. I already have code that
> > shares basic architectural info, such as number of CPUs, available
> > memory, loadavg, etc which can help, of course, but again, all
> > this info can be used to *infer* the current status of those backend
> > servers; it doesn't really provide what the current load actually
> > *is*.
> > 
> > So I was thinking maybe some sort of small, simple and "fast"
> > benchmark which could be run by the backends as part of their
> > "status" update to the front-end reverse proxy server... something
> > that shows general capability at that point in time, like Hanoi or
> > something similar. Or maybe some hash function. Some simple code
> > that could be used to create that "universal" load number.
> > 
> > Thoughts? Ideas? Comments? Suggestions? :)
> 
> How about having the back-ends provide the load they are able to handle
> as lbfactor (via w_lf or something similar)? That requires the back-ends
> to be able to send requests to the httpd balancer-manager handler.
> 
> Not really. I'd suggest a response header, travelling with each response
> back to the balancer, which can be composed quickly enough to share
> a play-by-play snapshot of the availability of that backend. This adds
> next to no traffic and minimal cpu drain if composed cleanly. And it can
> optionally be axed by the balancer in the response to the client.
> 
> The last thing we want are the routing headaches of contacting an
> ever-changing list of one-or-many potential balancers. And we can't
> rely on a dying lbmember to "check in" that it isn't functional. Since
> the balancer must already start requests to the backend, having that
> backend supplement the responses with its health status is simple.

Funnily enough, I did my master thesis (is that a word?) a long, long
while ago on scheduling in distributed systems. And with "distributed"
the general tricky thing is that there is no global knowledge of the
system state.

While any load indicator reported from the backends might look very
useful, once you deal with several front ends, this degenerates quickly
(where each frontend makes its own decision without talking to each
other).

If you detect and exclude any failing backends (heartbeat), then, with a
growing number of back- and frontends, it's very hard to beat a random
job distribution.

I found that, in general, pulling works slightly better than pushing. The
scenario here would be that backends ask frontends for requests to execute.
That is also very stable in case of backend failures, of course.

tl;dr

If your problem scenario includes more than a single frontend, go for random.
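
For illustration, random distribution with failing backends excluded fits in
a few lines. This is a sketch only: pick_backend and the "healthy" set are
hypothetical names, and how health is determined (heartbeat or otherwise) is
left outside the example.

```python
import random


def pick_backend(backends, healthy, rng=random):
    """Pick uniformly at random among back ends currently marked healthy."""
    candidates = [b for b in backends if b in healthy]
    if not candidates:
        return None          # nothing usable; caller decides what to do
    return rng.choice(candidates)
```

Each frontend can run this independently; no shared state between frontends
is needed, which is the point of the argument above.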

Cheers,

Stefan



Re: Load balancing and load determination

2018-11-05 Thread William A Rowe Jr
On Mon, Nov 5, 2018 at 7:48 AM jean-frederic clere 
wrote:

> On 30/10/2018 13:53, Jim Jagielski wrote:
> > As some of you know, one of my passions and area of focus is
> > on the use of Apache httpd as a reverse proxy and, as such, load
> > balancing, failover, etc are of vital interest to me.
> >
> > One topic which I have been mulling over, off and on, has been the
> > idea of some sort of universal load number, that could be used
> > and agreed upon by web servers. Right now, the reverse proxy
> > "guesses" the load on the backend servers which is OK, and
> > works well enough, but it would be great if it actually "knew"
> > the current loads on those servers. I already have code that
> > shares basic architectural info, such as number of CPUs, available
> > memory, loadavg, etc which can help, of course, but again, all
> > this info can be used to *infer* the current status of those backend
> > servers; it doesn't really provide what the current load actually
> > *is*.
> >
> > So I was thinking maybe some sort of small, simple and "fast"
> > benchmark which could be run by the backends as part of their
> > "status" update to the front-end reverse proxy server... something
> > that shows general capability at that point in time, like Hanoi or
> > something similar. Or maybe some hash function. Some simple code
> > that could be used to create that "universal" load number.
> >
> > Thoughts? Ideas? Comments? Suggestions? :)
>
> How about having the back-ends provide the load they are able to handle
> as lbfactor (via w_lf or something similar)? That requires the back-ends
> to be able to send requests to the httpd balancer-manager handler.


Not really. I'd suggest a response header, travelling with each response
back to the balancer, which can be composed quickly enough to share
a play-by-play snapshot of the availability of that backend. This adds
next to no traffic and minimal cpu drain if composed cleanly. And it can
optionally be axed by the balancer in the response to the client.

The last thing we want are the routing headaches of contacting an
ever-changing list of one-or-many potential balancers. And we can't
rely on a dying lbmember to "check in" that it isn't functional. Since
the balancer must already start requests to the backend, having that
backend supplement the responses with its health status is simple.
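
A rough sketch of the response-header idea, seen from the balancer. The
header name X-Backend-Availability and the 0.0 to 1.0 value range are
assumptions made for this example, not an existing httpd header or format,
and header names are matched case-sensitively here for brevity.

```python
HEADER = "X-Backend-Availability"   # hypothetical name, not an httpd header


def absorb_availability(backend, headers, table, strip=True):
    """Record a back end's self-reported availability from one response.

    `headers` is a dict of response headers; `table` maps back-end name
    to the last reported value. Malformed or out-of-range values are
    ignored. Returns the headers to forward to the client.
    """
    raw = headers.get(HEADER)
    if raw is not None:
        try:
            value = float(raw)
        except ValueError:
            value = None
        if value is not None and 0.0 <= value <= 1.0:
            table[backend] = value
    if strip:
        # Optionally drop the header before the response reaches the client
        return {k: v for k, v in headers.items() if k != HEADER}
    return dict(headers)
```

Because the value rides on responses the balancer already receives, no extra
connection from backend to balancer is required.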


Re: Load balancing and load determination

2018-11-05 Thread jean-frederic clere
On 05/11/2018 15:05, Jim Jagielski wrote:
> I was thinking about something more robust and usable than heartbeat (due to 
> multicast) but similar in basic concept.

I remember trying mod_heartmonitor with a simple listener like


Where ProxyList is the list of httpd that are able to proxy to tomcat
back-ends. See http://jfclere.blogspot.com/2009/04/ I guess I need to
revisit that to use mod_proxy_balancer logic.


> 
>> On Nov 5, 2018, at 8:48 AM, jean-frederic clere  wrote:
>>
>> On 30/10/2018 13:53, Jim Jagielski wrote:
>>> As some of you know, one of my passions and area of focus is
>>> on the use of Apache httpd as a reverse proxy and, as such, load
>>> balancing, failover, etc are of vital interest to me.
>>>
>>> One topic which I have been mulling over, off and on, has been the
>>> idea of some sort of universal load number, that could be used
>>> and agreed upon by web servers. Right now, the reverse proxy
>>> "guesses" the load on the backend servers which is OK, and
>>> works well enough, but it would be great if it actually "knew"
>>> the current loads on those servers. I already have code that
>>> shares basic architectural info, such as number of CPUs, available
>>> memory, loadavg, etc which can help, of course, but again, all
>>> this info can be used to *infer* the current status of those backend
>>> servers; it doesn't really provide what the current load actually
>>> *is*.
>>>
>>> So I was thinking maybe some sort of small, simple and "fast"
>>> benchmark which could be run by the backends as part of their
>>> "status" update to the front-end reverse proxy server... something
>>> that shows general capability at that point in time, like Hanoi or
>>> something similar. Or maybe some hash function. Some simple code
>>> that could be used to create that "universal" load number.
>>>
>>> Thoughts? Ideas? Comments? Suggestions? :)
>>
>> How about having the back-ends provide the load they are able to handle
>> as lbfactor (via w_lf or something similar)? That requires the back-ends
>> to be able to send requests to the httpd balancer-manager handler.
>>
>>>
>>
>>
>> -- 
>> Cheers
>>
>> Jean-Frederic
> 
> 


-- 
Cheers

Jean-Frederic


Re: Load balancing and load determination

2018-11-05 Thread Jim Jagielski
I was thinking about something more robust and usable than heartbeat (due to 
multicast) but similar in basic concept.

> On Nov 5, 2018, at 8:48 AM, jean-frederic clere  wrote:
> 
> On 30/10/2018 13:53, Jim Jagielski wrote:
>> As some of you know, one of my passions and area of focus is
>> on the use of Apache httpd as a reverse proxy and, as such, load
>> balancing, failover, etc are of vital interest to me.
>> 
>> One topic which I have been mulling over, off and on, has been the
>> idea of some sort of universal load number, that could be used
>> and agreed upon by web servers. Right now, the reverse proxy
>> "guesses" the load on the backend servers which is OK, and
>> works well enough, but it would be great if it actually "knew"
>> the current loads on those servers. I already have code that
>> shares basic architectural info, such as number of CPUs, available
>> memory, loadavg, etc which can help, of course, but again, all
>> this info can be used to *infer* the current status of those backend
>> servers; it doesn't really provide what the current load actually
>> *is*.
>> 
>> So I was thinking maybe some sort of small, simple and "fast"
>> benchmark which could be run by the backends as part of their
>> "status" update to the front-end reverse proxy server... something
>> that shows general capability at that point in time, like Hanoi or
>> something similar. Or maybe some hash function. Some simple code
>> that could be used to create that "universal" load number.
>> 
>> Thoughts? Ideas? Comments? Suggestions? :)
> 
> How about having the back-ends provide the load they are able to handle
> as lbfactor (via w_lf or something similar)? That requires the back-ends
> to be able to send requests to the httpd balancer-manager handler.
> 
>> 
> 
> 
> -- 
> Cheers
> 
> Jean-Frederic



Re: Load balancing and load determination

2018-11-05 Thread jean-frederic clere
On 30/10/2018 13:53, Jim Jagielski wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)

How about having the back-ends provide the load they are able to handle
as lbfactor (via w_lf or something similar)? That requires the back-ends
to be able to send requests to the httpd balancer-manager handler.

> 


-- 
Cheers

Jean-Frederic


Re: Load balancing and load determination

2018-10-30 Thread Mark Blackman



> On 30 Oct 2018, at 12:53, Jim Jagielski  wrote:
> 
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)

What problem are you trying to solve? Broadly, I think the best you can do is 
ask the backends to include a response header indicating their current appetite 
for more connections.

- Mark



Re: Load balancing and load determination

2018-10-30 Thread Jim Jagielski



> On Oct 30, 2018, at 9:06 AM, Daniel Ruggeri  wrote:
> 
> Hi, Jim J;
> I recall that Jim Riggs proposed a spec for exactly this a while 
> back... I think it was shared here on list and some light iteration was done. 
> IIUC, he was even planning to present it at ACNA until travel plans fell 
> through.
> 


https://lists.apache.org/thread.html/ca115bd3f21f7da91fa01a4d83af7d73987750e1e48bb2bf76236e52@1430369651@%3Cdev.httpd.apache.org%3E





Re: Load balancing and load determination

2018-10-30 Thread Eric Covener
> The main consideration is one of consistency... unless there is some agreed 
> upon "standard" then comparisons are worthless and the resulting load 
> balancing will be inaccurate. For example, say that Apache is front-ending 10 
> servers, 5 are Apache and the other 5 are Foo, but Foo consistently falsifies 
> its capability simply to ensure that it gets all the traffic. Sure, you can 
> adjust settings on the front end to offset that, but that defeats the whole 
> purpose of some *accurate*, objective measure of capability.

Wouldn't the servers you load balance between for the same URL
generally be more similar than that?


Re: Load balancing and load determination

2018-10-30 Thread Jim Jagielski
The only reason why I brought up the concept of a benchmark is because it is 
dead easy to provide the source for said benchmark and have backend servers 
simply time how long it takes to run it at each status update. Each backend would 
simply then send the "time taken" and that would provide some measure of how 
beefy and/or loaded said server was.
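
As a sketch of what "time a small benchmark" could mean, here is a toy
Towers of Hanoi move counter with a timer around it. The function names and
the default of 16 discs are arbitrary choices for this example; no agreed
benchmark exists in this thread.

```python
import time


def hanoi_moves(n):
    """Count the moves needed to solve Towers of Hanoi with n discs."""
    if n == 0:
        return 0
    # Move n-1 discs aside, move the largest, move n-1 discs back on top
    return hanoi_moves(n - 1) + 1 + hanoi_moves(n - 1)


def time_benchmark(discs=16):
    """Return (moves, seconds) for one run of the toy benchmark."""
    start = time.perf_counter()
    moves = hanoi_moves(discs)
    return moves, time.perf_counter() - start
```

The "seconds" value is what a backend would report; it grows both on slower
hardware and under CPU contention, which is why it mixes capability and load.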

The main consideration is one of consistency... unless there is some agreed 
upon "standard" then comparisons are worthless and the resulting load balancing 
will be inaccurate. For example, say that Apache is front-ending 10 servers, 5 
are Apache and the other 5 are Foo, but Foo consistently falsifies its capability 
simply to ensure that it gets all the traffic. Sure, you can adjust settings on 
the front end to offset that, but that defeats the whole purpose of some 
*accurate*, objective measure of capability.

Yeah, I recall JR posting something after I brought up this topic at one of my 
ApacheCon sessions...

Maybe it's more of an "availability factor" than a load factor... with 0 being 
"send me nothing" and 1 being "I am completely unloaded" and decimal values 
between indicating their "availability" to handle traffic.
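
One way a front end might consume such an availability factor is a weighted
random pick, sketched below. pick_weighted is a hypothetical helper, and
treating the factor as a selection probability weight is only one possible
interpretation.

```python
import random


def pick_weighted(availability, rng=random):
    """Choose a back end with probability proportional to its availability.

    `availability` maps name -> value in [0, 1]; 0 means "send me nothing".
    Returns None when no back end accepts traffic.
    """
    names = [n for n, a in availability.items() if a > 0]
    if not names:
        return None
    weights = [availability[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

With {"a": 0.0, "b": 1.0, "c": 0.5}, "a" is never chosen and "b" is chosen
about twice as often as "c" over many requests.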

> On Oct 30, 2018, at 9:06 AM, Daniel Ruggeri  wrote:
> 
> Hi, Jim J;
> I recall that Jim Riggs proposed a spec for exactly this a while 
> back... I think it was shared here on list and some light iteration was done. 
> IIUC, he was even planning to present it at ACNA until travel plans fell 
> through.
> 
> Hi, Jim R;
> Any chance you have the latest and greatest, or is the version from the list 
> archives the current state?
> 
> 
> One of the things I recall *really liking* from the recommendation is letting 
> the backend decide its factor based on whatever it believes is most 
> important. In some servers, that may be available threads. In others it could 
> be percentage of memory used. Still yet, other servers may decide based on 
> number of idle GPUs on-system. I think this is roughly the same as you are 
> suggesting, Jim J, but I struggle to think of a universal benchmark because 
> backends are so varied.
> -- 
> Daniel Ruggeri
> 
> On October 30, 2018 7:53:20 AM CDT, Jim Jagielski  wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)



Re: Load balancing and load determination

2018-10-30 Thread Yehuda Katz
HAProxy has a similar feature called agent-check (
https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#5.2-agent-check)
although in their case, the backend server specifies its own weight.
Either way - whether the frontend or backend determines the weight - it
would be useful.
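
As far as the linked documentation describes it, an agent can answer a check
with a percentage line such as "75%", which scales the server's configured
weight. A tiny sketch of building such a reply follows; agent_reply and its
idle_fraction parameter are hypothetical, and the TCP listener that would
actually send the line is omitted.

```python
def agent_reply(idle_fraction):
    """Format a weight reply as a percentage line, e.g. '75%\\n'.

    `idle_fraction` is the backend's own estimate of spare capacity,
    clamped to the range 0..1 before conversion.
    """
    pct = max(0, min(100, round(idle_fraction * 100)))
    return f"{pct}%\n"
```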

- Y

Sent from a device with a very small keyboard and hyperactive autocorrect.

On Tue, Oct 30, 2018, 8:53 AM Jim Jagielski  wrote:

> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
>
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>


Re: Load balancing and load determination

2018-10-30 Thread Michal Karm
On 10/30/2018 01:53 PM, Jim Jagielski wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
>
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>

Hello,

It seems that is exactly what https://modcluster.io/ does.

- it has a Tomcat listener / JBoss AS (Wildfly) module that reports
  a worker-side calculated load number
https://docs.modcluster.io/#_worker_side_load_metrics

- httpd side is completely oblivious as to how the number was calculated,
  which worker used which load metric to calculate it etc.; it just receives
  a number

- httpd side dynamically configures mod_proxy balancers according to joining
  and leaving worker nodes

- httpd side uses the load number to balance requests among healthy workers


An obvious downside is that the worker must implement this mod_cluster
logic. Implementations exist for JBoss AS/Wildfly/Tomcat, but we don't have
one for Jetty, for example. On the bright side, the protocol itself is dead
simple.

Disclosure: I am involved in the project.


Cheers

Michal Karm Babacek

-- 
Sent from my Hosaka Ono-Sendai Cyberspace 7




Re: Load balancing and load determination

2018-10-30 Thread Daniel Ruggeri
Hi, Jim J;
   I recall that Jim Riggs proposed a spec for exactly this a 
while back... I think it was shared here on list and some light iteration was 
done. IIUC, he was even planning to present it at ACNA until travel plans fell 
through.

Hi, Jim R;
   Any chance you have the latest and greatest, or is the version from the list 
archives the current state?


One of the things I recall *really liking* from the recommendation is letting 
the backend decide its factor based on whatever it believes is most important. 
In some servers, that may be available threads. In others it could be 
percentage of memory used. Still yet, other servers may decide based on number 
of idle GPUs on-system. I think this is roughly the same as you are suggesting, 
Jim J, but I struggle to think of a universal benchmark because backends are so 
varied.
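
To make "the backend decides its factor" concrete, here is a sketch in which
each kind of backend maps its own most relevant metric onto a common 0 to 1
scale. All function names are hypothetical, and the linear mappings are just
the simplest choice, not part of any spec mentioned here.

```python
def clamp01(x):
    """Keep a value inside the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def from_idle_threads(idle, total):
    # Thread-bound server: share of worker threads still free
    return clamp01(idle / total) if total > 0 else 0.0


def from_memory_used(percent_used):
    # Memory-bound server: share of memory still unused
    return clamp01(1.0 - percent_used / 100.0)


def from_idle_gpus(idle, total):
    # GPU-bound server: share of GPUs currently idle
    return clamp01(idle / total) if total > 0 else 0.0
```

The front end never needs to know which of these a given backend used; it
only sees the resulting number.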
-- 
Daniel Ruggeri

On October 30, 2018 7:53:20 AM CDT, Jim Jagielski  wrote:
>As some of you know, one of my passions and area of focus is
>on the use of Apache httpd as a reverse proxy and, as such, load
>balancing, failover, etc are of vital interest to me.
>
>One topic which I have been mulling over, off and on, has been the
>idea of some sort of universal load number, that could be used
>and agreed upon by web servers. Right now, the reverse proxy
>"guesses" the load on the backend servers which is OK, and
>works well enough, but it would be great if it actually "knew"
>the current loads on those servers. I already have code that
>shares basic architectural info, such as number of CPUs, available
>memory, loadavg, etc which can help, of course, but again, all
>this info can be used to *infer* the current status of those backend
>servers; it doesn't really provide what the current load actually
>*is*.
>
>So I was thinking maybe some sort of small, simple and "fast"
>benchmark which could be run by the backends as part of their
>"status" update to the front-end reverse proxy server... something
>that shows general capability at that point in time, like Hanoi or
>something similar. Or maybe some hash function. Some simple code
>that could be used to create that "universal" load number.
>
>Thoughts? Ideas? Comments? Suggestions? :)


Load balancing and load determination

2018-10-30 Thread Jim Jagielski
As some of you know, one of my passions and area of focus is
on the use of Apache httpd as a reverse proxy and, as such, load
balancing, failover, etc are of vital interest to me.

One topic which I have been mulling over, off and on, has been the
idea of some sort of universal load number, that could be used
and agreed upon by web servers. Right now, the reverse proxy
"guesses" the load on the backend servers which is OK, and
works well enough, but it would be great if it actually "knew"
the current loads on those servers. I already have code that
shares basic architectural info, such as number of CPUs, available
memory, loadavg, etc which can help, of course, but again, all
this info can be used to *infer* the current status of those backend
servers; it doesn't really provide what the current load actually
*is*.

So I was thinking maybe some sort of small, simple and "fast"
benchmark which could be run by the backends as part of their
"status" update to the front-end reverse proxy server... something
that shows general capability at that point in time, like Hanoi or
something similar. Or maybe some hash function. Some simple code
that could be used to create that "universal" load number.

Thoughts? Ideas? Comments? Suggestions? :)