the retries are only on failure to establish a connection - no other retries should be happening iirc.
-r

On Mon, Jun 25, 2018 at 1:29 PM Tyson Norris <[email protected]> wrote:

> Thanks Markus - one other question:
>
> Assuming retry is the current missing piece to using PoolingRestClient (or
> akka http directly), I’m also wondering if “retry” is the proper approach
> here?
> It may be worthwhile to initiate a port connection (with its own
> timeout/retry behavior) before the /init so that we can distinguish between
> “container startup is slow” and “bad behavior in action container after
> startup”?
>
> Also, I’m wondering if there are cases where rampant retry causes
> unintended side effects, etc - this could be worse with concurrency
> enabled, but I don’t know if this should be considered a real problem.
>
> FWIW We avoid this (http request to container that is not yet listening)
> in mesos by not returning the container till the mesos health check passes
> (which currently just checks the port connection), so this would be a
> similar setup at the invoker layer.
>
> Thanks
> Tyson
>
> > On Jun 25, 2018, at 10:08 AM, Markus Thoemmes <[email protected]> wrote:
> >
> > Hi Tyson,
> >
> > Ha, I was thinking about moving back to akka the other day. A few
> > comments:
> >
> > 1. Travis build environments have 1.5 CPU cores which might explain the
> > "strange" behavior you get from the apache client? Maybe it adjusts its
> > thread pool based on the number of cores available?
> > 2. To implement retries based on akka-http, have a look at what we used
> > to use for invoker communication:
> > https://github.com/apache/incubator-openwhisk/commit/31946029cad740a00c6e6f367637a1bcfea5dd18#diff-5c6f165d3e8395b6fe915ef0d24e5d1f
> > (NewHttpUtils.scala to be precise).
> > 3. I guess you're going to create a new PoolingRestClient per container?
> > I could imagine it is problematic if new containers come and go with a
> > global connection pool. Just something to be aware of.
> >
> > Oh, another thing to be aware of: We **used** to have akka-http there
> > and never tried again after the revert. We're certainly on a much newer
> > version now but we had issues of indefinitely hanging connections when we
> > first implemented it. Running some high-load scenarios before pushing this
> > into master will be needed.
> >
> > I don't want to put you off the task though, give it a shot, I'd love to
> > have this back :). Thanks for attacking!
> >
> > Cheers,
> > -m
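
For illustration, the "port connection before /init" idea from Tyson's message could look roughly like the sketch below. It is a language-agnostic sketch in Python, not OpenWhisk code: `wait_for_port`, `init_when_listening`, the `send_init` callable, and all default timeouts are hypothetical names and values chosen for the example. The probe retries only connection-level failures, and the /init request itself is sent once, so an error from it points at the action container rather than at slow startup.

```python
import socket
import time


def wait_for_port(host, port, attempts=5, connect_timeout=0.5, delay=0.1):
    """Return True once a TCP connection to (host, port) succeeds.

    Only connection-level failures are retried. Nothing is sent to the
    container, so retrying here cannot repeat a request with side effects.
    """
    for attempt in range(attempts):
        try:
            # Raises OSError (refused, timed out, unreachable) while the
            # container is not yet accepting connections.
            with socket.create_connection((host, port), timeout=connect_timeout):
                return True
        except OSError:
            if attempt < attempts - 1:
                time.sleep(delay)
    return False


def init_when_listening(host, port, send_init, **probe_options):
    """Probe the port first, then call send_init exactly once (no retry).

    send_init is a hypothetical callable that performs the /init request.
    """
    if not wait_for_port(host, port, **probe_options):
        # The port never opened: this is the "container startup is slow" case.
        raise TimeoutError(f"{host}:{port} never accepted a connection")
    # Any failure raised from here on happened after the container was
    # listening, i.e. it is behavior of the action container itself.
    return send_init()
```

A caller would pass its own /init request as `send_init`, for example `init_when_listening(host, port, lambda: client.post("/init", body))`, where `client` stands in for whatever HTTP client is in use.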
