Owen Rubel
oru...@gmail.com

On Sun, Aug 13, 2017 at 5:57 AM, Christopher Schultz <
ch...@christopherschultz.net> wrote:

> Owen,
>
> On 8/12/17 12:47 PM, Owen Rubel wrote:
> > What I am talking about is something that improves communication as
> > we notice that communication channel needing more resources. Not
> > caching what is communicated... improving the CHANNEL for
> > communicating the resource (whatever it may be).
>
> If the channel is an HTTP connection (or TCP; the application protocol
> isn't terribly relevant), then you are limited by the following:
>
> 1. Network bandwidth
> 2. Available threads (to service a particular request)
> 3. Hardware resources on the server (CPU/memory/disk/etc.)
>
> Let's ignore 1 and 3 for now, since you are primarily concerned with
> concurrency, and concurrency is useless if the other resources are
> constrained or otherwise limiting the equation.
>
> Let's say we had "per endpoint" thread pools, so that e.g. /create had
> its own thread pool, and /show had another one, etc. What would that
> buy us?
>
> (Let's ignore for now the fact that one set of threads must always be
> used to decode the request to decide where it's going, like /create or
> /show.)
>
> If we have a limited total number of threads (e.g. 10), then we could
> "reserve" some of them so that we could always have 2 threads for
> /create even if all the other threads in the system (the other 8) were
> being used for something else. If we had 2 threads for /create and 2
> threads for /show, then only 6 would remain for e.g. /edit or /delete.
> So if 6 threads were already being used for /edit or /delete, the 7th
> incoming request would be queued, but anyone making a request for
> /show or /create would (if a thread in those pools is available) be
> serviced immediately.
>
> I can see some utility in this ability, because it would allow the
> container to ensure that some resources were never starved... or,
> rather, that they have some priority over certain other services. In
> other words, the service could enjoy guaranteed provisioning for
> certain endpoints.
>
> As it stands, Tomcat (and, I would venture a guess, most if not all
> other containers) implements a fair request pipeline where requests
> are (at least roughly) serviced in the order in which they are
> received. Rather than guaranteeing provisioning for a particular
> endpoint, the closest thing that could be implemented (at the
> application level) would be a resource-availability-limiting
> mechanism, such as counting the number of in-flight requests and
> rejecting those which exceed some threshold with e.g. a 503 response.
>
> Unfortunately, that doesn't actually prioritize some requests, it
> merely rejects others in order to attempt to prioritize those others.
> It also starves endpoints even when there is no reason to do so (e.g.
> in the 10-thread scenario, if all 4 /show and /create threads are
> idle, but 6 requests are already in process for the other endpoints, a
> 7th request for those other endpoints will be rejected).
>
> I believe that per-endpoint provisioning is a possibility, but I don't
> think that the potential gains are worth the certain complexity of the
> system required to implement it.
>
> There are other ways to handle heterogeneous service requests in a way
> that doesn't starve one type of request in favor of another. One
> obvious solution is horizontal scaling with a load-balancer. An LB can
> be used to implement a sort of guaranteed-provisioning for certain
> endpoints by providing more back-end servers for certain endpoints. If
> you want to make sure that /show can be called by any client at any
> time, then make sure you spin-up 1000 /show servers and register them
> with the load-balancer. You can survive with only maybe 10 nodes
> servicing /delete requests; others will either wait in a queue or
> receive a 503 from the lb.
>
> For my money, I'd maximize the number of threads available for all
> requests (whether within a single server, or across a large cluster)
> and not require that they be available for any particular endpoint.
> Once you have to depart from a single server, you MUST have something
> like a load-balancer involved, and therefore the above solution
> becomes not only more practical but also more powerful.
>
> Since relying on a one-box-wonder to run a high-availability web
> service isn't practical, provisioning is necessarily above the
> cluster-node level, and so the problem has effectively moved from the
> app server to the load-balancer (or reverse proxy). I believe the
> application server is an inappropriate place to implement this type of
> provisioning because it's too small-scale. The app server should serve
> requests as quickly as possible, and arranging for this kind of
> provisioning would add a level of complexity that would jeopardize
> performance of all requests within the application server.
>
> > But like you said, this is not something that is doable so I'll
> > look elsewhere.
>
> I think it's doable, just not worth it given the orthogonal solutions
> available. Some things are better-implemented at other layers of the
> application (as a whole system) and perhaps not the application server
> itself.
>
> Someone with intimate experience with Obidos should be familiar with
> the benefits of separation of these kinds of concerns ;)
>
> If you are really more concerned with threads that are tied-up with
> I/O-bound work, then Websocket really is your friend. The complex
> threading model of Websocket allows applications to do Real Work on
> application threads and then delegate the work of pushing bytes across
> the wire to the container, resulting in very few I/O-bound threads.
>
> But the way you have phrased your questions seems like you were more
> interested in guaranteed provisioning than avoiding I/O-bound threads.
>
> -chris

>If we have a limited total number of threads (e.g. 10), then we could
>"reserve" some of them so that we could always have 2 threads for
>/create even if all the other threads in the system (the other 8) were
>being used for something else. If we had 2 threads for /create and 2
>threads for /show, then only 6 would remain for e.g. /edit or /delete.
>So if 6 threads were already being used for /edit or /delete, the 7th
>incoming request would be queued, but anyone making a request for
>/show or /create would (if a thread in those pools is available) be
>serviced immediately.

Use percentages, like most load balancers do, to solve that problem, and then
adjust the percentages as traffic changes.


So say we have the following assigned thread percentages:

person/show   - 5%
person/create - 2%
person/edit   - 2%
person/delete - 1%

(always guaranteeing that each endpoint keeps at least one thread from the
shared pool at all times; a rough sketch of that calculation is below)
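
Something like this sketch is what I mean by turning those percentages into
reserved threads (purely illustrative -- EndpointReservations, the map of
percentages, and the 200-thread pool size are all made up here, not an
existing Tomcat API):

// Hypothetical sketch only: map configured percentages to reserved thread
// counts, with a floor of one thread per endpoint so nothing is starved.
import java.util.LinkedHashMap;
import java.util.Map;

public class EndpointReservations {

    public static Map<String, Integer> reserve(Map<String, Double> percentages,
                                                int totalThreads) {
        Map<String, Integer> reserved = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : percentages.entrySet()) {
            // e.g. 5% of a 200-thread pool = 10 threads, but never fewer than 1
            int threads = Math.max(1, (int) Math.floor(totalThreads * e.getValue()));
            reserved.put(e.getKey(), threads);
        }
        return reserved;
    }

    public static void main(String[] args) {
        Map<String, Double> percentages = new LinkedHashMap<>();
        percentages.put("person/show",   0.05);
        percentages.put("person/create", 0.02);
        percentages.put("person/edit",   0.02);
        percentages.put("person/delete", 0.01);

        // prints {person/show=10, person/create=4, person/edit=4, person/delete=2}
        System.out.println(reserve(percentages, 200));
    }
}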

If traffic suddenly starts to spike on 'person/edit', we steal from
'person/show'. Why? Because 'person/show' had its threads created dynamically
and may not be using them all at the moment.

During a spike we steal from the endpoint with the highest percentage,
because the spiking endpoint has effectively become the new highest
percentage.

And if that changes, they will steal back.

At least this is what I was envisioning for an implementation.
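
Very roughly, the stealing itself could look something like the sketch below
(StealingAllocator, EndpointPool and the queue-depth spike check are all
hypothetical; atomicity and the actual executor wiring are ignored for
brevity):

import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StealingAllocator {

    /** Hypothetical per-endpoint pool: a reserved size plus a live queue depth. */
    static class EndpointPool {
        volatile int reservedThreads;
        volatile int queuedRequests;

        EndpointPool(int reservedThreads) {
            this.reservedThreads = reservedThreads;
        }
    }

    private final Map<String, EndpointPool> pools = new ConcurrentHashMap<>();

    public void register(String endpoint, int reservedThreads) {
        pools.put(endpoint, new EndpointPool(reservedThreads));
    }

    /**
     * Called periodically. If an endpoint has more queued requests than
     * reserved threads (a "spike"), steal one thread from the endpoint that
     * currently holds the largest reservation, never dropping any endpoint
     * below the one-thread floor.
     */
    public void rebalance() {
        for (EndpointPool spiking : pools.values()) {
            if (spiking.queuedRequests <= spiking.reservedThreads) {
                continue; // not spiking, nothing to do
            }
            pools.values().stream()
                 .filter(p -> p != spiking && p.reservedThreads > 1)
                 .max(Comparator.comparingInt(p -> p.reservedThreads))
                 .ifPresent(donor -> {
                     donor.reservedThreads--;   // steal from the biggest pool
                     spiking.reservedThreads++; // give to the spiking endpoint
                 });
        }
    }
}

When the traffic pattern shifts again, the same rebalance pass just moves the
threads back the other way.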
