Re: Load balancing and load determination

2018-10-30 Thread Mark Blackman



> On 30 Oct 2018, at 12:53, Jim Jagielski  wrote:
> 
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)

What problem are you trying to solve? Broadly, I think the best you can do is 
ask the backends to include a response header indicating their current appetite 
for more connections.

- Mark
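
A minimal sketch of the response-header idea above, written in Python for brevity. The header name `X-Backend-Availability`, the 0.0 to 1.0 scale, and the worker counts are all assumptions made for illustration; nothing in this thread or in httpd defines them.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical header name; not defined by httpd or by anything in this thread.
AVAILABILITY_HEADER = "X-Backend-Availability"


def availability(busy_workers: int, max_workers: int) -> float:
    """Return the share of idle workers, clamped to the range 0.0..1.0."""
    if max_workers <= 0:
        return 0.0
    idle = max(0, max_workers - busy_workers)
    return min(1.0, idle / max_workers)


def format_availability(value: float) -> str:
    """Render the factor with two decimals for use as a header value."""
    return f"{value:.2f}"


class Handler(BaseHTTPRequestHandler):
    # Assumed to be kept up to date elsewhere by the application's worker pool.
    busy_workers = 3
    max_workers = 10

    def do_GET(self):
        body = b"ok\n"
        factor = availability(self.busy_workers, self.max_workers)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        # Piggyback the backend's current appetite on every response.
        self.send_header(AVAILABILITY_HEADER, format_availability(factor))
        self.end_headers()
        self.wfile.write(body)
```

The handler can be served with `http.server.HTTPServer(("", 8080), Handler).serve_forever()`; a proxy would then read the header from each response instead of polling for status.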



Re: Load balancing and load determination

2018-10-30 Thread Jim Jagielski



> On Oct 30, 2018, at 9:06 AM, Daniel Ruggeri  wrote:
> 
> Hi, Jim J;
> I recall that Jim Riggs proposed a spec for exactly this a while 
> back... I think it was shared here on list and some light iteration was done. 
> IIUC, he was even planning to present it at ACNA until travel plans fell 
> through.
> 


https://lists.apache.org/thread.html/ca115bd3f21f7da91fa01a4d83af7d73987750e1e48bb2bf76236e52@1430369651@%3Cdev.httpd.apache.org%3E





Re: Load balancing and load determination

2018-10-30 Thread Eric Covener
> The main consideration is one of consistency... unless there is some agreed 
> upon "standard" then comparisons are worthless and the resulting load 
> balancing will be inaccurate. For example, say that Apache is front-ending 10 
> servers, 5 are Apache and the other 5 are Foo, but Foo consistently falsifies 
> its capability simply to ensure that it gets all the traffic. Sure, you can 
> adjust settings on the front end to offset that, but that defeats the whole 
> purpose of some *accurate*, objective measure of capability.

Wouldn't the servers you load balance between for the same URL
generally be more similar than that?


Re: Load balancing and load determination

2018-10-30 Thread Jim Jagielski
The only reason why I brought up the concept of a benchmark is because it is 
dead easy to provide the source for said benchmark and have backend servers 
simply time how long it takes to run it each status update. Each backend would 
simply then send the "time taken" and that would provide some measure of how 
beefy and/or loaded said server was.
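
A rough sketch of that timing idea, in Python for brevity. The function names and the default disk count are assumptions; the only point is that every backend runs the same fixed workload and reports how long it took.

```python
import time


def hanoi_moves(disks: int) -> int:
    """Solve Towers of Hanoi recursively and return the number of moves made."""
    if disks == 0:
        return 0
    # Move n-1 disks aside, move the largest disk, move n-1 disks back on top.
    return hanoi_moves(disks - 1) + 1 + hanoi_moves(disks - 1)


def timed_benchmark(disks: int = 16) -> float:
    """Return the wall-clock seconds taken to run the fixed workload."""
    start = time.perf_counter()
    moves = hanoi_moves(disks)
    elapsed = time.perf_counter() - start
    # Sanity check: the optimal solution takes 2**n - 1 moves.
    if moves != 2 ** disks - 1:
        raise RuntimeError("benchmark produced an unexpected move count")
    return elapsed


if __name__ == "__main__":
    print(f"time taken: {timed_benchmark():.4f}s")
```

A slower or busier backend reports a larger "time taken". Note that the benchmark itself consumes CPU on every status update, so the workload size would need to stay small.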

The main consideration is one of consistency... unless there is some agreed 
upon "standard" then comparisons are worthless and the resulting load balancing 
will be inaccurate. For example, say that Apache is front-ending 10 servers, 5 
are Apache and the other 5 are Foo, but Foo consistently falsifies its capability 
simply to ensure that it gets all the traffic. Sure, you can adjust settings on 
the front end to offset that, but that defeats the whole purpose of some 
*accurate*, objective measure of capability.

Yeah, I recall JR posting something after I brought up this topic at one of my 
ApacheCon sessions...

Maybe it's more of an "availability factor" than a load factor... with 0 being 
"send me nothing" and 1 being "I am completely unloaded" and decimal values 
between indicating their "availability" to handle traffic.
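
As an illustration only, here is how a front end might use such a factor, sketched in Python. The function name and the choice of weighted random selection are assumptions; the thread does not specify how the proxy would consume the number.

```python
import random
from typing import Dict, Optional


def pick_backend(factors: Dict[str, float], rng: random.Random) -> Optional[str]:
    """Pick a backend with probability proportional to its availability factor.

    A factor of 0 means "send me nothing" and 1 means "completely unloaded".
    Values outside 0..1 are clamped rather than trusted.
    """
    clamped = {name: max(0.0, min(1.0, f)) for name, f in factors.items()}
    candidates = {name: f for name, f in clamped.items() if f > 0.0}
    if not candidates:
        return None  # every backend asked for no traffic
    names = list(candidates)
    weights = [candidates[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    reported = {"app1": 0.9, "app2": 0.3, "app3": 0.0}
    print(pick_backend(reported, random.Random()))
```

With these example values `app3` is never chosen, and `app1` is chosen roughly three times as often as `app2`.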

> On Oct 30, 2018, at 9:06 AM, Daniel Ruggeri  wrote:
> 
> Hi, Jim J;
> I recall that Jim Riggs proposed a spec for exactly this a while 
> back... I think it was shared here on list and some light iteration was done. 
> IIUC, he was even planning to present it at ACNA until travel plans fell 
> through.
> 
> Hi, Jim R;
> Any chance you have the latest and greatest, or is the version from the list 
> archives the current state?
> 
> 
> One of the things I recall *really liking* from the recommendation is letting 
> the backend decide its factor based on whatever it believes is most 
> important. In some servers, that may be available threads. In others it could 
> be percentage of memory used. Still yet, other servers may decide based on 
> number of idle GPUs on-system. I think this is roughly the same as what you are 
> suggesting, Jim J, but I struggle to think of a universal benchmark because 
> backends are so varied.
> -- 
> Daniel Ruggeri
> 
> On October 30, 2018 7:53:20 AM CDT, Jim Jagielski  wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
> 
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
> 
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
> 
> Thoughts? Ideas? Comments? Suggestions? :)



Re: Load balancing and load determination

2018-10-30 Thread Yehuda Katz
HAProxy has a similar feature called agent-check (
https://cbonte.github.io/haproxy-dconv/1.8/configuration.html#5.2-agent-check)
although in their case, the backend server specifies its own weight.
Either way - whether the frontend or backend determines the weight - it
would be useful.

- Y
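
For reference, a small sketch of what a backend-side agent for agent-check might look like, in Python. HAProxy connects to the agent port and reads a short ASCII reply; a percentage such as `75%` scales the server's configured weight. The port number, the `current_availability` hook, and its fixed return value are assumptions for illustration.

```python
import socketserver


def current_availability() -> float:
    """Hypothetical hook: return this backend's availability from 0.0 to 1.0."""
    return 0.75


def agent_reply(availability: float) -> str:
    """Format an agent-check reply as an integer percentage plus a newline."""
    clamped = max(0.0, min(1.0, availability))
    return f"{round(clamped * 100)}%\n"


class AgentHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Send one line and return; the server closes the connection afterwards.
        self.request.sendall(agent_reply(current_availability()).encode("ascii"))


def serve(host: str = "0.0.0.0", port: int = 9999) -> None:
    """Run the agent until interrupted (port 9999 is an arbitrary choice)."""
    with socketserver.TCPServer((host, port), AgentHandler) as server:
        server.serve_forever()
```

On the HAProxy side this would pair with a server line along the lines of `server app1 192.0.2.10:80 weight 100 check agent-check agent-port 9999 agent-inter 5s`, where the address and intervals are placeholders.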

Sent from a device with a very small keyboard and hyperactive autocorrect.

On Tue, Oct 30, 2018, 8:53 AM Jim Jagielski  wrote:

> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
>
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>


Re: Load balancing and load determination

2018-10-30 Thread Michal Karm
On 10/30/2018 01:53 PM, Jim Jagielski wrote:
> As some of you know, one of my passions and area of focus is
> on the use of Apache httpd as a reverse proxy and, as such, load
> balancing, failover, etc are of vital interest to me.
>
> One topic which I have been mulling over, off and on, has been the
> idea of some sort of universal load number, that could be used
> and agreed upon by web servers. Right now, the reverse proxy
> "guesses" the load on the backend servers which is OK, and
> works well enough, but it would be great if it actually "knew"
> the current loads on those servers. I already have code that
> shares basic architectural info, such as number of CPUs, available
> memory, loadavg, etc which can help, of course, but again, all
> this info can be used to *infer* the current status of those backend
> servers; it doesn't really provide what the current load actually
> *is*.
>
> So I was thinking maybe some sort of small, simple and "fast"
> benchmark which could be run by the backends as part of their
> "status" update to the front-end reverse proxy server... something
> that shows general capability at that point in time, like Hanoi or
> something similar. Or maybe some hash function. Some simple code
> that could be used to create that "universal" load number.
>
> Thoughts? Ideas? Comments? Suggestions? :)
>

Hello,

It seems that is exactly what https://modcluster.io/ does.

- it has a Tomcat listener / JBoss AS (Wildfly) module that reports
  a worker-side calculated load number
https://docs.modcluster.io/#_worker_side_load_metrics

- httpd side is completely oblivious as to how the number was calculated,
  which worker used which load metric to calculate it etc, it just receives a
  number

- httpd side dynamically configures mod_proxy balancers according to joining
  and leaving worker nodes

- httpd side uses the load number to balance requests among healthy workers
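
The division of labour in the list above can be sketched as follows, in Python. This is not mod_cluster's protocol or code: the class name, the staleness timeout, and the convention that a higher number means a busier worker are all assumptions made for illustration.

```python
import time
from typing import Dict, Optional, Tuple


class WorkerRegistry:
    """Proxy-side view: workers join by reporting, and only a number is stored."""

    def __init__(self, stale_after: float = 30.0):
        self.stale_after = stale_after
        self._workers: Dict[str, Tuple[float, float]] = {}  # name -> (load, seen)

    def report(self, name: str, load: float, now: Optional[float] = None) -> None:
        """Record a worker-calculated load; the first report acts as a join."""
        seen = time.monotonic() if now is None else now
        self._workers[name] = (load, seen)

    def leave(self, name: str) -> None:
        """Remove a worker that announced it is going away."""
        self._workers.pop(name, None)

    def healthy(self, now: Optional[float] = None) -> Dict[str, float]:
        """Return workers that reported recently enough to be trusted."""
        current = time.monotonic() if now is None else now
        return {
            name: load
            for name, (load, seen) in self._workers.items()
            if current - seen <= self.stale_after
        }

    def pick(self, now: Optional[float] = None) -> Optional[str]:
        """Choose the healthy worker with the lowest reported load."""
        candidates = self.healthy(now)
        if not candidates:
            return None
        return min(candidates, key=candidates.get)
```

The registry never learns which metric a worker used; it only compares the reported numbers.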


An obvious downside is that the worker must implement this mod_cluster
logic. Implementations exist for JBoss AS/Wildfly/Tomcat, but we don't have
one for Jetty, for example. On the bright side, the protocol itself is dead
simple.

Disclosure: I am involved in the project.


Cheers

Michal Karm Babacek

-- 
Sent from my Hosaka Ono-Sendai Cyberspace 7




Re: Load balancing and load determination

2018-10-30 Thread Daniel Ruggeri
Hi, Jim J;
   I recall that Jim Riggs proposed a spec for exactly this a 
while back... I think it was shared here on list and some light iteration was 
done. IIUC, he was even planning to present it at ACNA until travel plans fell 
through.

Hi, Jim R;
   Any chance you have the latest and greatest, or is the version from the list 
archives the current state?


One of the things I recall *really liking* from the recommendation is letting 
the backend decide its factor based on whatever it believes is most important. 
In some servers, that may be available threads. In others it could be 
percentage of memory used. Still yet, other servers may decide based on number 
of idle GPUs on-system. I think this is roughly the same as what you are suggesting, 
Jim J, but I struggle to think of a universal benchmark because backends are so 
varied.
-- 
Daniel Ruggeri
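
To make the "backend decides its own factor" point concrete, here is a small Python sketch. The three functions and the shared 0.0 to 1.0 scale are assumptions for illustration and are not taken from the spec mentioned above.

```python
def clamp(value: float) -> float:
    """Keep a factor inside the agreed 0.0..1.0 range."""
    return max(0.0, min(1.0, value))


def factor_from_threads(idle_threads: int, total_threads: int) -> float:
    """A thread-bound server: availability is the share of idle threads."""
    if total_threads <= 0:
        return 0.0
    return clamp(idle_threads / total_threads)


def factor_from_memory(percent_used: float) -> float:
    """A memory-bound server: availability is the share of memory still free."""
    return clamp(1.0 - percent_used / 100.0)


def factor_from_gpus(idle_gpus: int, total_gpus: int) -> float:
    """A GPU-bound server: availability is the share of idle GPUs."""
    if total_gpus <= 0:
        return 0.0
    return clamp(idle_gpus / total_gpus)


if __name__ == "__main__":
    # Each backend reports one number, whatever metric it chose internally.
    print(factor_from_threads(5, 20), factor_from_memory(25.0), factor_from_gpus(1, 4))
```

Each backend picks whichever function matches its own bottleneck, and the front end only ever sees the resulting number.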

On October 30, 2018 7:53:20 AM CDT, Jim Jagielski  wrote:
>As some of you know, one of my passions and area of focus is
>on the use of Apache httpd as a reverse proxy and, as such, load
>balancing, failover, etc are of vital interest to me.
>
>One topic which I have been mulling over, off and on, has been the
>idea of some sort of universal load number, that could be used
>and agreed upon by web servers. Right now, the reverse proxy
>"guesses" the load on the backend servers which is OK, and
>works well enough, but it would be great if it actually "knew"
>the current loads on those servers. I already have code that
>shares basic architectural info, such as number of CPUs, available
>memory, loadavg, etc which can help, of course, but again, all
>this info can be used to *infer* the current status of those backend
>servers; it doesn't really provide what the current load actually
>*is*.
>
>So I was thinking maybe some sort of small, simple and "fast"
>benchmark which could be run by the backends as part of their
>"status" update to the front-end reverse proxy server... something
>that shows general capability at that point in time, like Hanoi or
>something similar. Or maybe some hash function. Some simple code
>that could be used to create that "universal" load number.
>
>Thoughts? Ideas? Comments? Suggestions? :)


Load balancing and load determination

2018-10-30 Thread Jim Jagielski
As some of you know, one of my passions and area of focus is
on the use of Apache httpd as a reverse proxy and, as such, load
balancing, failover, etc are of vital interest to me.

One topic which I have been mulling over, off and on, has been the
idea of some sort of universal load number, that could be used
and agreed upon by web servers. Right now, the reverse proxy
"guesses" the load on the backend servers which is OK, and
works well enough, but it would be great if it actually "knew"
the current loads on those servers. I already have code that
shares basic architectural info, such as number of CPUs, available
memory, loadavg, etc which can help, of course, but again, all
this info can be used to *infer* the current status of those backend
servers; it doesn't really provide what the current load actually
*is*.
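
To illustrate the kind of inference meant here (this is not the existing code referred to above), a backend could share something like the following Python snapshot. The field names are assumptions, and `os.getloadavg` is only available on Unix-like systems.

```python
import os
from typing import Dict, Optional


def normalized_load(loadavg_1min: float, cpu_count: int) -> float:
    """Return the 1-minute load average per CPU; about 1.0 means fully busy."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return loadavg_1min / cpu_count


def snapshot() -> Dict[str, Optional[float]]:
    """Collect basic machine info that a proxy could use to infer load."""
    cpus = os.cpu_count() or 1
    try:
        load1 = os.getloadavg()[0]
    except (AttributeError, OSError):
        load1 = None  # load average is not available on this platform
    per_cpu = None if load1 is None else normalized_load(load1, cpus)
    return {"cpus": float(cpus), "loadavg_1min": load1, "load_per_cpu": per_cpu}


if __name__ == "__main__":
    print(snapshot())
```

Even normalized per CPU, this remains an inference: two backends with the same load average can have very different headroom for new requests.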

So I was thinking maybe some sort of small, simple and "fast"
benchmark which could be run by the backends as part of their
"status" update to the front-end reverse proxy server... something
that shows general capability at that point in time, like Hanoi or
something similar. Or maybe some hash function. Some simple code
that could be used to create that "universal" load number.

Thoughts? Ideas? Comments? Suggestions? :)