Re: Per EndPoint Threads???

2017-08-13 Thread Christopher Schultz

Owen,

On 8/12/17 12:47 PM, Owen Rubel wrote:
> What I am talking about is something that improves communication as
> we notice that communication channel needing more resources. Not
> caching what is communicated... improving the CHANNEL for
> communicating the resource (whatever it may be).

If the channel is an HTTP connection (or TCP; the application protocol
isn't terribly relevant), then you are limited by the following:

1. Network bandwidth
2. Available threads (to service a particular request)
3. Hardware resources on the server (CPU/memory/disk/etc.)

Let's ignore 1 and 3 for now, since you are primarily concerned with
concurrency, and concurrency is useless if one of the other
resources is the limiting factor.

Let's say we had "per endpoint" thread pools, so that e.g. /create had
its own thread pool, and /show had another one, etc. What would that
buy us?

(Let's ignore for now the fact that one set of threads must always be
used to decode the request to decide where it's going, like /create or
/show.)

If we have a limited total number of threads (e.g. 10), then we could
"reserve" some of them so that we could always have 2 threads for
/create even if all the other threads in the system (the other 8) were
being used for something else. If we had 2 threads for /create and 2
threads for /show, then only 6 would remain for e.g. /edit or /delete.
So if 6 threads were already being used for /edit or /delete, the 7th
incoming request would be queued, but anyone making a request for
/show or /create would (if a thread in those pools is available) be
serviced immediately.
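
To make the scenario concrete, here is a minimal sketch of what such
per-endpoint pools might look like. The class and method names are
invented for illustration; this is not Tomcat's actual dispatch path:

    import java.util.Map;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PerEndpointDispatcher {

        // Reserved pools for the scenario above: 2 threads for
        // /create and 2 for /show, out of 10 total.
        private static final Map<String, ExecutorService> RESERVED =
                Map.of("/create", Executors.newFixedThreadPool(2),
                       "/show",   Executors.newFixedThreadPool(2));

        // The remaining 6 threads, shared by /edit, /delete, etc.
        private static final ExecutorService SHARED =
                Executors.newFixedThreadPool(6);

        // Called once a connector thread has decoded the request
        // path; work runs on the pool reserved for that path, or on
        // the shared pool if the path has no reservation.
        public static void dispatch(String path, Runnable handler) {
            RESERVED.getOrDefault(path, SHARED).execute(handler);
        }
    }

Note that with Executors.newFixedThreadPool, overflow work queues
rather than being rejected, which matches the "would be queued"
behavior described above.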

I can see some utility in this ability, because it would allow the
container to ensure that some resources were never starved... or,
rather, that they have some priority over certain other services. In
other words, the service could enjoy guaranteed provisioning for
certain endpoints.

As it stands, Tomcat (and, I would venture a guess, most if not all
other containers) implements a fair request pipeline where requests
are (at least roughly) serviced in the order in which they are
received. Rather than guaranteeing provisioning for a particular
endpoint, the closest thing that could be implemented (at the
application level) would be a resource-availability-limiting
mechanism: count the number of in-flight requests and, once some
threshold is exceeded, reject new arrivals with e.g. a 503 response.
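
A minimal sketch of such a limiter as a servlet Filter (the class
name is made up, and the threshold of 6 just matches the 10-thread
scenario above):

    import java.io.IOException;
    import java.util.concurrent.Semaphore;
    import javax.servlet.*;
    import javax.servlet.http.HttpServletResponse;

    public class InFlightLimitFilter implements Filter {

        // Illustrative threshold: at most 6 in-flight requests.
        private final Semaphore inFlight = new Semaphore(6);

        @Override
        public void doFilter(ServletRequest req, ServletResponse res,
                             FilterChain chain)
                throws IOException, ServletException {
            // Over the threshold: reject with 503 instead of queueing.
            if (!inFlight.tryAcquire()) {
                ((HttpServletResponse) res).sendError(
                        HttpServletResponse.SC_SERVICE_UNAVAILABLE);
                return;
            }
            try {
                chain.doFilter(req, res);
            } finally {
                // Always release, even if the request failed.
                inFlight.release();
            }
        }

        @Override
        public void init(FilterConfig config) {}

        @Override
        public void destroy() {}
    }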

Unfortunately, that doesn't actually prioritize some requests; it
merely rejects some in order to leave capacity for the ones you want
to prioritize. It also starves endpoints even when there is no reason
to do so (e.g. in the 10-thread scenario, if all 4 reserved /show and
/create threads are idle, but 6 requests are already in progress for
the other endpoints, a 7th request for those other endpoints will be
rejected).

I believe that per-endpoint provisioning is a possibility, but I don't
think that the potential gains are worth the certain complexity of the
system required to implement it.

There are other ways to handle heterogeneous service requests in a way
that doesn't starve one type of request in favor of another. One
obvious solution is horizontal scaling with a load-balancer. An LB can
be used to implement a sort of guaranteed-provisioning for certain
endpoints by providing more back-end servers for certain endpoints. If
you want to make sure that /show can be called by any client at any
time, then make sure you spin up 1000 /show servers and register them
with the load-balancer. You can survive with only maybe 10 nodes
servicing /delete requests; others will either wait in a queue or
receive a 503 from the LB.
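
Purely to illustrate the idea (this is not any particular
load-balancer's configuration, and the node names and pool sizes are
made up), the per-request decision the LB makes is roughly:

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.atomic.AtomicInteger;

    public class EndpointRouter {

        // Invented backend pools: many nodes for /show, few for
        // /delete. A real deployment might register 1000 /show nodes.
        private static final Map<String, List<String>> POOLS = Map.of(
                "/show",   List.of("show-01:8080", "show-02:8080",
                                   "show-03:8080"),
                "/delete", List.of("delete-01:8080"));

        private static final AtomicInteger next = new AtomicInteger();

        // Round-robin across whichever pool serves the requested
        // path; unknown paths fall back to the /show pool for brevity.
        public static String pickBackend(String path) {
            List<String> pool =
                    POOLS.getOrDefault(path, POOLS.get("/show"));
            return pool.get(
                    Math.floorMod(next.getAndIncrement(), pool.size()));
        }
    }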

For my money, I'd maximize the number of threads available to all
requests (whether within a single server, or across a large cluster)
and not reserve any of them for a particular endpoint.
Once you have to depart from a single server, you MUST have something
like a load-balancer involved, and therefore the above solution
becomes not only more practical but also more powerful.

Since relying on a one-box-wonder to run a high-availability web
service isn't practical, provisioning is necessarily above the
cluster-node level, and so the problem has effectively moved from the
app server to the load-balancer (or reverse proxy). I believe the
application server is an inappropriate place to implement this type of
provisioning because it's too small-scale. The app server should serve
requests as quickly as possible, and arranging for this kind of
provisioning would add a level of complexity that would jeopardize
performance of all requests within the application server.

> But like you said, this is not something that is doable so I'll
> look elsewhere.

I think it's doable, just not worth it given the orthogonal solutions
available. Some things are better implemented at other layers of the
application (viewed as a whole system) rather than in the application
server itself.