P.S. I'd also like to move the logging of attempts to recover workers
from errors to a lower (and by default unlogged) level. The transition of a
worker into an error state should certainly be logged, but logging every
time we find it to still be in an error state seems to be excessive --
at least for a sparsely populated port bank use case.
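
To make the logging idea concrete, here is a minimal sketch in Python rather than mod_jk's C. Worker and note_check_result are names I made up for illustration, not anything in the mod_jk source, and the choice of ERROR/DEBUG/INFO levels is just my assumption about a sensible mapping.

```python
import logging

log = logging.getLogger("jk_sketch")  # hypothetical logger name


class Worker:
    """Illustrative stand-in for a balancer member; not a mod_jk type."""

    def __init__(self, name):
        self.name = name
        self.in_error = False


def note_check_result(worker, reachable):
    """Record a check result and return the log level used (or None)."""
    if not reachable and not worker.in_error:
        worker.in_error = True
        level = logging.ERROR   # transition into error: always logged
    elif not reachable:
        level = logging.DEBUG   # still in error: silent unless debug is on
    elif worker.in_error:
        worker.in_error = False
        level = logging.INFO    # came back
    else:
        return None             # healthy and still healthy: nothing to say
    log.log(level, "worker %s: reachable=%s", worker.name, reachable)
    return level
```

With a sparsely populated port bank, only the first failure per worker would show up at the default level; the repeated "still down" results stay at debug.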
Jess Holle wrote:
Mladen Turk wrote:
Jess Holle wrote:
Mladen Turk wrote:
Is there a means of achieving background-only (or nearly so)
testing of dead workers with mod_jk? That's what I'm looking for
in both jk and mod_proxy_ajp connectors. I guess I was
hoping/assuming it was there in mod_jk from reading the docs.
There is in the mod_jk (SVN trunk).
I've been reading this code now...
The watchdog thread looks very useful. If I understand it correctly,
the watchdog thread can do whatever it feels like but currently
mainly calls wc_maintain, which will only do work at most every
worker.maintain seconds, right?
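
To spell out my reading, here is a minimal Python model of "called often, works at most once per interval". Maintainer and its fields are hypothetical names of mine, and the interval merely stands in for worker.maintain; this is not how the trunk code is actually structured.

```python
import time


class Maintainer:
    """Illustrative model of a rate-limited maintenance call."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval   # stands in for worker.maintain (seconds)
        self.clock = clock         # injectable so the sketch can be tested
        self.last_run = None
        self.runs = 0

    def maintain(self):
        """Return True if maintenance ran, False if it was too soon."""
        now = self.clock()
        if self.last_run is not None and now - self.last_run < self.interval:
            return False           # the watchdog woke up early: do nothing
        self.last_run = now
        self.runs += 1             # real maintenance work would go here
        return True
```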
connection_keepalive does not look like it really fits my bill, though.
I'm most worried about workers in an error state and ensuring that
they are rechecked every recover_wait_time -- but only by the
watchdog thread and ideally via a ping/pong. Currently
recover_workers appears to just put workers into a recovery state
where they'll be eligible to be tried again on a future request --
without checking whether the worker is actually accessible. That's
fine for some use cases, but explicitly what I want to avoid.
Are there any thoughts on adding an option to have recover_workers() do
a ping prior to returning a worker to a non-error state?
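
Roughly what I have in mind, as a Python sketch rather than a patch. The Worker class, the ping callable, the state names and the ping_first flag are all hypothetical; only recover_workers and recover_wait_time are borrowed from the discussion above, and the non-ping branch reflects my reading of the current behaviour, not a verified description of it.

```python
import time


class Worker:
    """Illustrative stand-in for a balancer member; not a mod_jk type."""

    def __init__(self, name, ping):
        self.name = name
        self.ping = ping           # hypothetical: True if the backend answers
        self.state = "ok"          # "ok", "error", or "recovering"
        self.error_time = 0.0      # when the worker last failed a check


def recover_workers(workers, recover_wait_time, ping_first, now=None):
    """Re-examine workers in error for at least recover_wait_time seconds."""
    now = time.monotonic() if now is None else now
    for w in workers:
        if w.state != "error" or now - w.error_time < recover_wait_time:
            continue
        if not ping_first:
            w.state = "recovering"  # a future request gets to find out
        elif w.ping():
            w.state = "ok"          # backend answered: safe to route to
        else:
            w.error_time = now      # still dead: wait another interval
```

Run from the watchdog thread with ping_first enabled, no request would ever be handed to a worker that had not just answered a ping.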
And, yes, a watchdog thread in mod_proxy_balancer /and/ a reasonable
means for the balancer to invoke a ping via mod_proxy_ajp would be really
helpful as far as mod_proxy_ajp is concerned.
Another possibly simpler alternative: we could introduce a limit as to
how many workers we attempt to do (unforced) recoveries on for any
given request. Any request could likely tolerate a recovery attempt
or two. None should have to tolerate 6-12 recovery attempts just
because of a currently sparsely populated port range.
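
A minimal sketch of such a cap, again in Python with made-up names (pick_worker, try_worker, max_recoveries); the default of 2 is only the "attempt or two" from the paragraph above, not a measured value.

```python
def pick_worker(workers, try_worker, max_recoveries=2):
    """Return the first worker that accepts the request, or None.

    workers: list of (name, in_error) pairs in balancer order.
    try_worker: hypothetical callable, True if the worker took the request.
    """
    recoveries = 0
    for name, in_error in workers:
        if in_error:
            if recoveries >= max_recoveries:
                continue          # budget spent: skip remaining dead workers
            recoveries += 1       # this attempt counts against the budget
        if try_worker(name):
            return name
    return None
```

With six dead ports ahead of one live one, a request pays for two failed recovery attempts and then goes straight to the live worker.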
Thoughts?
--
Jess Holle