Owen Rubel oru...@gmail.com

On Sun, Aug 13, 2017 at 5:57 AM, Christopher Schultz <ch...@christopherschultz.net> wrote:
> Owen,
>
> On 8/12/17 12:47 PM, Owen Rubel wrote:
> > What I am talking about is something that improves communication as
> > we notice that communication channel needing more resources. Not
> > caching what is communicated... improving the CHANNEL for
> > communicating the resource (whatever it may be).
>
> If the channel is an HTTP connection (or TCP; the application protocol
> isn't terribly relevant), then you are limited by the following:
>
> 1. Network bandwidth
> 2. Available threads (to service a particular request)
> 3. Hardware resources on the server (CPU/memory/disk/etc.)
>
> Let's ignore 1 and 3 for now, since you are primarily concerned with
> concurrency, and concurrency is useless if the other resources are
> constrained or otherwise limiting the equation.
>
> Let's say we had "per endpoint" thread pools, so that e.g. /create had
> its own thread pool, and /show had another one, etc. What would that
> buy us?
>
> (Let's ignore for now the fact that one set of threads must always be
> used to decode the request to decide where it's going, like /create or
> /show.)
>
> If we have a limited total number of threads (e.g. 10), then we could
> "reserve" some of them so that we could always have 2 threads for
> /create even if all the other threads in the system (the other 8) were
> being used for something else. If we had 2 threads for /create and 2
> threads for /show, then only 6 would remain for e.g. /edit or /delete.
> So if 6 threads were already being used for /edit or /delete, the 7th
> incoming request would be queued, but anyone making a request for
> /show or /create would (if a thread in those pools is available) be
> serviced immediately.
>
> I can see some utility in this ability, because it would allow the
> container to ensure that some resources were never starved... or,
> rather, that they have some priority over certain other services. In
> other words, the service could enjoy guaranteed provisioning for
> certain endpoints.
>
> As it stands, Tomcat (and, I would venture a guess, most if not all
> other containers) implements a fair request pipeline where requests
> are (at least roughly) serviced in the order in which they are
> received. Rather than guaranteeing provisioning for a particular
> endpoint, the closest thing that could be implemented (at the
> application level) would be a resource-availability-limiting
> mechanism, such as counting the number of in-flight requests and
> rejecting those which exceed some threshold with e.g. a 503 response.
>
> Unfortunately, that doesn't actually prioritize some requests, it
> merely rejects others in order to attempt to prioritize those others.
> It also starves endpoints even when there is no reason to do so (e.g.
> in the 10-thread scenario, if all 4 /show and /create threads are
> idle, but 6 requests are already in process for the other endpoints, a
> 7th request for those other endpoints will be rejected).
>
> I believe that per-endpoint provisioning is a possibility, but I don't
> think that the potential gains are worth the certain complexity of the
> system required to implement it.
>
> There are other ways to handle heterogeneous service requests in a way
> that doesn't starve one type of request in favor of another. One
> obvious solution is horizontal scaling with a load-balancer. An LB can
> be used to implement a sort of guaranteed-provisioning for certain
> endpoints by providing more back-end servers for certain endpoints. If
> you want to make sure that /show can be called by any client at any
> time, then make sure you spin-up 1000 /show servers and register them
> with the load-balancer. You can survive with only maybe 10 nodes
> servicing /delete requests; others will either wait in a queue or
> receive a 503 from the lb.
>
> For my money, I'd maximize the number of threads available for all
> requests (whether within a single server, or across a large cluster)
> and not require that they be available for any particular endpoint.
> Once you have to depart from a single server, you MUST have something
> like a load-balancer involved, and therefore the above solution
> becomes not only more practical but also more powerful.
>
> Since relying on a one-box-wonder to run a high-availability web
> service isn't practical, provisioning is necessarily above the
> cluster-node level, and so the problem has effectively moved from the
> app server to the load-balancer (or reverse proxy). I believe the
> application server is an inappropriate place to implement this type of
> provisioning because it's too small-scale. The app server should serve
> requests as quickly as possible, and arranging for this kind of
> provisioning would add a level of complexity that would jeopardize
> performance of all requests within the application server.
>
> > But like you said, this is not something that is doable so I'll
> > look elsewhere.
>
> I think it's doable, just not worth it given the orthogonal solutions
> available. Some things are better-implemented at other layers of the
> application (as a whole system) and perhaps not the application server
> itself.
>
> Someone with intimate experience with Obidos should be familiar with
> the benefits of separation of these kinds of concerns ;)
>
> If you are really more concerned with threads that are tied-up with
> I/O-bound work, then Websocket really is your friend. The complex
> threading model of Websocket allows applications to do Real Work on
> application threads and then delegate the work of pushing bytes across
> the wire to the container, resulting in very few I/O-bound threads.
>
> But the way you have phrased your questions seems like you were more
> interested in guaranteed provisioning than avoiding I/O-bound threads.
>
> -chris
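For concreteness, the application-level mechanism Chris describes above
(count in-flight requests, reject the overflow with a 503) could look
roughly like the servlet filter below. This is only a minimal sketch:
the class name InFlightLimitFilter and the threshold of 100 are
invented for illustration, and a real deployment would presumably map
one such filter per endpoint rather than one global limit.

    import java.io.IOException;
    import java.util.concurrent.Semaphore;

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    /** Sketch only: cap in-flight requests, reject the rest with 503. */
    public class InFlightLimitFilter implements Filter {

        // Threshold invented for illustration.
        private final Semaphore inFlight = new Semaphore(100);

        @Override
        public void init(FilterConfig config) throws ServletException {
            // no-op
        }

        @Override
        public void doFilter(ServletRequest req, ServletResponse res,
                FilterChain chain) throws IOException, ServletException {
            if (!inFlight.tryAcquire()) {
                // Over the threshold: reject instead of queueing.
                ((HttpServletResponse) res)
                        .sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
                return;
            }
            try {
                chain.doFilter(req, res);
            } finally {
                inFlight.release(); // always return the permit
            }
        }

        @Override
        public void destroy() {
            // no-op
        }
    }

As the quoted text points out, this rejects the overflow rather than
prioritizing anything, so it only works as a protective limit.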
> If we have a limited total number of threads (e.g. 10), then we could
> "reserve" some of them so that we could always have 2 threads for
> /create even if all the other threads in the system (the other 8) were
> being used for something else. If we had 2 threads for /create and 2
> threads for /show, then only 6 would remain for e.g. /edit or /delete.
> So if 6 threads were already being used for /edit or /delete, the 7th
> incoming request would be queued, but anyone making a request for
> /show or /create would (if a thread in those pools is available) be
> serviced immediately.

Use percentages like most load balancers do to solve that problem, and
then adjust the percentages as traffic changes. Say we have the
following assigned thread percentages:

    person/show   - 5%
    person/create - 2%
    person/edit   - 2%
    person/delete - 1%

(always guaranteeing that each endpoint holds at least 1 thread from
the shared pool at all times)

If traffic suddenly starts to spike on 'person/edit', we steal from
'person/show'. Why? Because 'person/show' had its threads created
dynamically and may not be using them all at the moment. During a spike
we steal from the endpoints with the highest percentages, because the
spiking endpoint has effectively become the new highest percentage. And
if that changes, the others will steal back. At least this is what I
was envisioning for an implementation.
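To make that concrete, here is a minimal sketch of the stealing logic I
am envisioning (the names EndpointBroker and Lease are hypothetical;
none of this is an existing Tomcat API). Each endpoint gets a Semaphore
sized from its percentage with a floor of one permit, and a request
that finds its own pool exhausted borrows a permit from the
highest-percentage pool that still has one free.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Semaphore;

    /** Sketch only: per-endpoint thread budgets from percentages, with
     *  "stealing" from the largest budget when an endpoint spikes. */
    public class EndpointBroker {

        /** Remembers which pool a permit came from so it goes back there. */
        public static final class Lease {
            private final Semaphore pool;
            private Lease(Semaphore pool) { this.pool = pool; }
            public void release() { pool.release(); }
        }

        private final Map<String, Integer> percent = new ConcurrentHashMap<>();
        private final Map<String, Semaphore> pools = new ConcurrentHashMap<>();

        public EndpointBroker(int totalThreads, Map<String, Integer> percentages) {
            percentages.forEach((endpoint, pct) -> {
                // Floor of 1: every endpoint keeps one guaranteed thread.
                int permits = Math.max(1, totalThreads * pct / 100);
                percent.put(endpoint, pct);
                pools.put(endpoint, new Semaphore(permits));
            });
        }

        /** Try the endpoint's own budget first; otherwise borrow from
         *  the highest-percentage endpoint with a permit to spare.
         *  Returns null if nothing is free anywhere. */
        public Lease tryAcquire(String endpoint) {
            Semaphore own = pools.get(endpoint);
            if (own != null && own.tryAcquire()) {
                return new Lease(own);
            }
            // Sort candidate donors by configured percentage, highest first.
            List<Map.Entry<String, Integer>> donors =
                    new ArrayList<>(percent.entrySet());
            donors.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
            for (Map.Entry<String, Integer> donor : donors) {
                if (donor.getKey().equals(endpoint)) {
                    continue; // we already tried our own pool
                }
                Semaphore pool = pools.get(donor.getKey());
                if (pool.tryAcquire()) {
                    return new Lease(pool); // stolen; returned to the donor later
                }
            }
            return null; // caller can queue the request or answer 503
        }
    }

With the table above and a total of 100 threads, 'person/show' would
start with 5 permits and the others with their floors, so a spike on
'person/edit' drains the idle 'person/show' permits first. A real
implementation would also have to re-weight the percentages as traffic
shifts, which is where the complexity Chris warns about would live.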