Re: Apache 2.4 adoption
For myself the compelling feature of 2.4 is the event MPM. But it doesn't work on Windows (nor is there an alternative to thread-per-request processing there). And when HTTPS is used it's still thread-per-request. And of course I need to know mod_jk works absolutely flawlessly with the event MPM too...
So in the end the event MPM is not all that compelling yet -- for me at least. There are a few other features in 2.2 that'd be nice to have, but the big draw just isn't complete enough in scope.
-- Jess Holle
Re: [VOTE] Release Apache httpd 2.4.3 as GA
On 8/18/2012 8:39 AM, Jim Jagielski wrote:
> On Aug 17, 2012, at 11:01 PM, Jess Holle je...@ptc.com wrote:
>> Downstream customers in my case means customers that will deploy Apache and our products on their own servers. In a great many cases these servers run Windows.
> Ahh. That explains it. The Windows MPM is designed to be the most optimal implementation for Windows servers, dedicated and specific to Windows. What is it about the Windows MPM which is inadequate to your or your client's needs? We have direct access to Microsoft engineers, so I think they would also be curious as well. MS is quite interested in ensuring Apache httpd runs extremely well on Windows.
The Windows MPM does indeed work rather well. That said, if one has a lot of long-running connections that are mostly idle, won't one run into exactly the same issues that the worker MPM has vs. the event MPM? What's the strategy for dealing with large numbers of long-poll requests, long HTTP keepalive settings, etc., with the Windows MPM?
Similarly, what's the strategy for this on UNIX when *all* the requests in question are HTTPS? Again, we've not hit the limit there with the worker MPM, but a major interest in 2.4.x was raising the ceiling in this direction.
-- Jess Holle
Re: [VOTE] Release Apache httpd 2.4.3 as GA
Does the event MPM now: 1. Work on Windows? 2. Work with HTTPS? When both are true 2.4.x will become very interesting. Until then, not so much over 2.2.x. On 8/17/2012 12:34 PM, Jim Jagielski wrote: The pre-release test tarballs for Apache httpd 2.4.3 can be found at the usual place: http://httpd.apache.org/dev/dist/ I'm calling a VOTE on releasing these as Apache httpd 2.4.3 GA. NOTE: The -deps tarballs are included here *only* to make life easier for the tester. They will not be, and are not, part of the official release. [ ] +1: Good to go [ ] +0: meh [ ] -1: Danger Will Robinson. And why. Vote will last the normal 72 hrs.
Re: [VOTE] Release Apache httpd 2.4.3 as GA
The fact that there is no event MPM equivalent for Windows is a huge gap for 2.4.x. Given the large percentage of our downstream customers using Windows there's not a huge motivation to move to 2.4.x. Moreover, it's my understanding that the event MPM falls back to behaving like the worker MPM in SSL cases. Is that true? If so, then that further decreases the motivation to move to 2.4.x. Overall, given that a large portion of our downstream usages are on Windows, say 50% for the sake of argument, and that a large percentage of our usages are HTTPS, again say 50% for the sake of argument, the benefits of the event MPM are really quite narrow in practice in our case. That said, I didn't know or had forgotten that SSL didn't work with the Windows MPM in 2.4.x. That would be a substantial regression from 2.2.x -- and resolving this would clear the way for 2.4.x being GA barring any other such regressions. -- Jess Holle On 8/17/2012 12:48 PM, Jim Jagielski wrote: In the Announcement you'll see: NOTE to Windows users: The issues with AcceptFilter None replacing Win32DisableAcceptEx appears to have resolved starting with version 2.4.3 make Apache httpd 2.4.x suitable for Windows servers. NOTE: The event MPM is a *nix mpm and has never worked on Windows. On Aug 17, 2012, at 1:38 PM, Jess Holle je...@ptc.com wrote: Does the event MPM now: • Work on Windows? • Work with HTTPS? When both are true 2.4.x will become very interesting. Until then, not so much over 2.2.x. On 8/17/2012 12:34 PM, Jim Jagielski wrote: The pre-release test tarballs for Apache httpd 2.4.3 can be found at the usual place: http://httpd.apache.org/dev/dist/ I'm calling a VOTE on releasing these as Apache httpd 2.4.3 GA. NOTE: The -deps tarballs are included here *only* to make life easier for the tester. They will not be, and are not, part of the official release. [ ] +1: Good to go [ ] +0: meh [ ] -1: Danger Will Robinson. And why. Vote will last the normal 72 hrs.
Re: [VOTE] Release Apache httpd 2.4.3 as GA
Downstream customers in my case means customers that will deploy Apache and our products on their own servers. In a great many cases these servers run Windows. The clients in most cases are Windows too, but that's a different matter entirely.
On 8/17/2012 3:12 PM, Jim Jagielski wrote:
> I am curious how the number of downstream customers being Windows affects anything on the server side...
> On Aug 17, 2012, at 2:16 PM, Jess Holle je...@ptc.com wrote:
>> The fact that there is no event MPM equivalent for Windows is a huge gap for 2.4.x. Given the large percentage of our downstream customers using Windows there's not a huge motivation to move to 2.4.x.
>> Moreover, it's my understanding that the event MPM falls back to behaving like the worker MPM in SSL cases. Is that true? If so, then that further decreases the motivation to move to 2.4.x.
>> Overall, given that a large portion of our downstream usages are on Windows, say 50% for the sake of argument, and that a large percentage of our usages are HTTPS, again say 50% for the sake of argument, the benefits of the event MPM are really quite narrow in practice in our case.
>> That said, I didn't know or had forgotten that SSL didn't work with the Windows MPM in 2.4.x. That would be a substantial regression from 2.2.x -- and resolving this would clear the way for 2.4.x being GA barring any other such regressions.
>> -- Jess Holle
>> On 8/17/2012 12:48 PM, Jim Jagielski wrote:
>>> In the Announcement you'll see: NOTE to Windows users: The issues with AcceptFilter None replacing Win32DisableAcceptEx appears to have resolved starting with version 2.4.3 make Apache httpd 2.4.x suitable for Windows servers.
>>> NOTE: The event MPM is a *nix mpm and has never worked on Windows.
>>> On Aug 17, 2012, at 1:38 PM, Jess Holle je...@ptc.com wrote:
>>>> Does the event MPM now:
>>>> • Work on Windows?
>>>> • Work with HTTPS?
>>>> When both are true 2.4.x will become very interesting. Until then, not so much over 2.2.x.
Re: [RESULT] Re: [VOTE] Release Apache httpd 2.4.1
Does the event MPM work on Windows? Or is Apache on Windows still limited to the winnt MPM? If so, doesn't this leave Apache on Windows *far* behind other platforms when it comes to threads required for a given load?
I guess it doesn't matter *that* much until the event MPM and mod_ssl work out their differences such that one can reduce the threads required when HTTPS is used. For those who use a lot of HTTPS, the event MPM doesn't seem to buy one anything for now, right?
On 2/21/2012 1:00 AM, William A. Rowe Jr. wrote:
> On 2/20/2012 8:04 AM, Jess Holle wrote:
>> Ok, issues with all mod_ssl would be a big problem. If you needed to do Win32DisableAcceptEx, though, then something was already not quite right. What do you mean by mod_ssl on a port, though? You just mean running an HTTPS listener, right?
> Precisely. mod_ssl does not interact well (expects its bucket read to be blocking) against the incomplete response created by a 'data-less' AcceptEx or accept(). There are a ton of weird variations in which blocking states are inherited from a listening socket to an AcceptEx socket vs an accept socket on Windows. That's where the problem is, and those in a position to debug hadn't hit on this state (and the fact that the timing has to be very fast means that it isn't easily reproduced in a debug environment).
> Most modules won't care. As Steffan points out, most non-ssl modules don't care either. mod_ssl freaks out, but we can't exactly put the blame on mod_ssl when it explicitly demanded a blocking response.
Re: [RESULT] Re: [VOTE] Release Apache httpd 2.4.1
Ok, issues with all mod_ssl would be a big problem. If you needed to do Win32DisableAcceptEx, though, then something was already not quite right.
What do you mean by mod_ssl on a port, though? You just mean running an HTTPS listener, right?
On 2/18/2012 12:43 AM, William A. Rowe Jr. wrote:
> On 2/17/2012 10:38 PM, Gregg Smith wrote:
>> On 2/17/2012 3:15 PM, Jess Holle wrote:
>>> Does this mean the Windows-specific issues have been resolved? Or that this is a non-Windows GA?
>> No, the Windows specific issue (PR 52476) has not been solved. So it's GA for all but Windows.
> It's quite certainly GA for windows. Unless you wish to run mod_ssl on a port, and never successfully ran without the Win32DisableAcceptEx directive. For that small subset of users, there is more diagnostics required, and they won't enjoy success until 2.4.2 if then.
Re: [RESULT] Re: [VOTE] Release Apache httpd 2.4.1
Does this mean the Windows-specific issues have been resolved? Or that this is a non-Windows GA? On 2/17/2012 9:13 AM, Tom Evans wrote: On Fri, Feb 17, 2012 at 1:42 PM, Jim Jagielskij...@jagunet.com wrote: As such, I call the vote as PASSING and that httpd 2.4.1 will be released as GA. Congratulations, very excited to soon have 2.4 in production! Cheers Tom
Re: [VOTE] httpd 2.2.12 tarballs
Rainer Jung wrote:
> 5) Starting a service only works using the ApacheMonitor or the Windows Service Control. Using the commandline httpd.exe I can not start the service. The event log shows:
> [Sat Jul 25 15:11:03 2009] [notice] Disabled use of AcceptEx() WinSock2 API
> (OS 10048)Only one usage of each socket address (protocol/network address/port) is normally permitted. : make_sock: could not bind to address 127.0.0.1:8000
> no listening sockets available, shutting down
> Unable to open logs
> So there's a warning about using IP address or port twice. I did check that no other process uses the port, and starting via ApacheMonitor with the same config is no problem. So I guess (wildly) that we have a bug when starting from the commandline, resulting in the parent and the child both trying to do the bind.
> I'll see what I can find out about it, but I would say it's not a blocker, because IMHO most users do not control the service via the commandline interface.
Hmmm... We do.
-- Jess Holle
Re: mod_proxy / mod_proxy_balancer
Rainer Jung wrote:
> In most situations applications need stickiness. So balancing will not happen in an ideal situation, instead it tries to keep load equal although most requests are sticky. Because of the influence of sticky requests it can happen that accumulated load distributes very unevenly between the nodes. Should the balancer try to correct such accumulated differences?
> Other applications are memory bound. Memory is needed by request handling but also by session handling. Data accumulation is more important here, because of the sessions. Again, we can not be perfect, because we don't get a notification when a session expires or a user logs out. So we can only count the new sessions. This counter in my opinion also needs some aging, so that we won't compensate historic inequality without bounds. I must confess that I don't have an example here of how this inequality can happen for sessions when balancing new session requests (stickiness doesn't influence this), but I think balancing based on old data is the wrong model here too.
An ability to balance based on new sessions with an idle timeout on such sessions would be close enough to reality in cases where sessions expire rather than being explicitly invalidated (e.g. by a logout).
Of course that redoes what a servlet engine would be doing and does so with lower fidelity. An ability to ask a backend for its current session count and load balance new requests on that basis would be really helpful. Whether this ability is buried into AJP, for instance, or is simply a separate request to a designated URL is another question, but the latter approach seems fairly general and the number of such requests could be throttled by a time-to-live setting on the last such count obtained.
Actually this could and should be generalized beyond active sessions to a back-end health metric. Each backend could compute and respond with a relative measure of busyness/health, and the load balancer could then balance new (session-less) requests to the least busy / most healthy backend. This would seem to be a *huge* step forward in load balancing capability/fidelity.
It's my understanding that mod_cluster is pursuing just this sort of thing to some degree -- but currently only works for JBoss backends.
-- Jess Holle
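[Editorial sketch] The health-metric idea in the message above (ask each backend for a single number, cache it for a time-to-live, send new session-less requests to the least busy backend) could look roughly like the following C. Nothing here is httpd or mod_proxy_balancer code: `backend_t`, `metric_query_fn`, `stub_query` and `pick_backend` are hypothetical names, and the convention that a lower number means a healthier backend is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-backend record; not an httpd or mod_proxy structure. */
typedef struct {
    const char *name;
    double      busyness;    /* last reported metric; lower means healthier */
    time_t      fetched_at;  /* when that metric was obtained */
} backend_t;

/* Hypothetical callback that asks a backend for its current metric,
 * for example by requesting a designated URL. Returns 0 on success. */
typedef int (*metric_query_fn)(const backend_t *b, double *out);

/* Stand-in query used only for illustration: every backend reports 0.1. */
int stub_query(const backend_t *b, double *out)
{
    (void)b;
    *out = 0.1;
    return 0;
}

/* Refresh any metric that is at least ttl seconds old, then return the
 * index of the least busy backend, or -1 if there are no backends.
 * A failed query keeps the stale value instead of guessing a new one. */
int pick_backend(backend_t *backends, size_t n, time_t now, time_t ttl,
                 metric_query_fn query)
{
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (now - backends[i].fetched_at >= ttl) {
            double fresh;
            if (query(&backends[i], &fresh) == 0) {
                backends[i].busyness = fresh;
                backends[i].fetched_at = now;
            }
        }
        if (best < 0 || backends[i].busyness < backends[best].busyness)
            best = (int)i;
    }
    return best;
}
```

The time-to-live bounds how often a backend is queried, which is the throttling the message asks for; the price is that decisions may rest on data up to `ttl` seconds old.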
Re: mod_proxy / mod_proxy_balancer
Rainer Jung wrote:
>> An ability to balance based on new sessions with an idle timeout on such sessions would be close enough to reality in cases where sessions expire rather than being explicitly invalidated (e.g. by a logout).
> But then we end up in a stateful situation. This is a serious design decision. If we want to track idleness for sessions, we need to track a list of sessions (session ids) the balancer has seen. This makes things much more complex. Combined with the non-ability to track logouts and the errors coming in from a global situation (more than one Apache instance), I think it will be more of a problem than a solution.
The more I think about this the more I agree.
From the start I preferred the session/health query to the back-end with a time-to-live; on further consideration I *greatly* prefer this approach.
>> Of course that redoes what a servlet engine would be doing and does so with lower fidelity. An ability to ask a backend for its current session count and load balance new requests on that basis would be really helpful.
> Seems much nicer.
Agreed.
>> Actually this could and should be generalized beyond active sessions to a back-end health metric. Each backend could compute and respond with a relative measure of busyness/health, and the load balancer could then balance new (session-less) requests to the least busy / most healthy backend. This would seem to be a *huge* step forward in load balancing capability/fidelity.
>> It's my understanding that mod_cluster is pursuing just this sort of thing to some degree -- but currently only works for JBoss backends.
> Yes, I think the counter/aging discussion is for the baseline, i.e. when we do not have any information channel to or from the backend nodes. As soon as mod_cluster comes into play, we can use more up-to-date real data and only need to decide how to interpret it and how to interpolate during the update interval.
Should general support for a query URL be provided in mod_proxy_balancer? Or should this be left to mod_cluster?
Does mod_cluster provide yet another approach top to bottom (separate from mod_jk and mod_proxy/mod_proxy_ajp)?
It would seem nice to me if mod_jk and/or mod_proxy_balancer could do health checks, but you have to draw the line somewhere on growing any given module, and if mod_jk and mod_proxy_balancer are not going in that direction, at some point mod_cluster may be in my future.
-- Jess Holle
Re: mod_proxy / mod_proxy_balancer
jean-frederic clere wrote:
> Jess Holle wrote:
>> An ability to balance based on new sessions with an idle timeout on such sessions would be close enough to reality in cases where sessions expire rather than being explicitly invalidated (e.g. by a logout).
> Storing the sessionid to share the load depending on the number of active sessions brings a problem of security, no?
To the degree that you consider Apache vulnerable to attack to retrieve these, yes.
I prefer the health check request approach below for this and other reasons (amount of required bookkeeping, etc).
>> Of course that redoes what a servlet engine would be doing and does so with lower fidelity. An ability to ask a backend for its current session count and load balance new requests on that basis would be really helpful. Whether this ability is buried into AJP, for instance, or is simply a separate request to a designated URL is another question, but the latter approach seems fairly general and the number of such requests could be throttled by a time-to-live setting on the last such count obtained.
>> Actually this could and should be generalized beyond active sessions to a back-end health metric. Each backend could compute and respond with a relative measure of busyness/health, and the load balancer could then balance new (session-less) requests to the least busy / most healthy backend. This would seem to be a *huge* step forward in load balancing capability/fidelity.
>> It's my understanding that mod_cluster is pursuing just this sort of thing to some degree -- but currently only works for JBoss backends.
> This is wrong; it works with Tomcat too.
mod_cluster works with Tomcat, but according to the docs I've seen the dynamic (health/session metric based rather than static) load balancing only worked with JBoss backends. Or has this changed?
-- Jess Holle
Re: mod_proxy / mod_proxy_balancer
jean-frederic clere wrote:
>> Should general support for a query URL be provided in mod_proxy_balancer? Or should this be left to mod_cluster?
> Can you explain more? I don't get the question.
What I mean is:
1. Should mod_proxy_balancer be extended to provide a balancer algorithm in which one specifies a backend URL that will provide a single numeric health metric, throttle the number of such requests via a time-to-live associated with this information, and balance on this basis, or
2. Should mod_cluster handle this issue?
3. Or both?
   * For instance, mod_cluster might leverage special nuances in AJP, JBoss, and Tomcat, whereas mod_proxy_balancer might provide more generic support for health checks on any back end server that can expose a health metric URL.
From your response below, it sounds like you're saying it's #2, which is *largely* fine and good -- but this raises questions:
1. How general is the health check metric in mod_cluster?
   * I only care about Tomcat backends myself, but control over the metric would be good.
2. Does this require special JBoss nuggets in Tomcat?
   * I'd hope not, i.e. that this is a simple matter of a pre-designated URL or a very simple standalone socket protocol.
3. When will mod_cluster support health metric based balancing of Tomcat?
4. How disruptive to an existing configuration using mod_proxy_balancer/mod_proxy_ajp is mod_cluster?
   * How much needs to be changed?
5. How portable is the mod_cluster code?
   * Does it build on Windows? HPUX? AIX?
I say this is largely fine and good as I'd like to see just the health-metric based balancing algorithm in Apache 2.2.x itself.
>> Does mod_cluster provide yet another approach top to bottom (separate from mod_jk and mod_proxy/mod_proxy_ajp)?
> Mod_cluster is just a balancer for mod_proxy but due to the dynamic creation of balancers and workers it can't get in the httpd-trunk code right now.
>> It would seem nice to me if mod_jk and/or mod_proxy_balancer could do health checks, but you have to draw the line somewhere on growing any given module, and if mod_jk and mod_proxy_balancer are not going in that direction, at some point mod_cluster may be in my future.
-- Jess Holle
Re: mod_proxy / mod_proxy_balancer
Rainer Jung wrote:
> On 06.05.2009 14:35, jean-frederic clere wrote:
>> Jess Holle wrote:
>>> Rainer Jung wrote:
>>>> Yes, I think the counter/aging discussion is for the baseline, i.e. when we do not have any information channel to or from the backend nodes. As soon as mod_cluster comes into play, we can use more up-to-date real data and only need to decide how to interpret it and how to interpolate during the update interval.
>>> Should general support for a query URL be provided in mod_proxy_balancer? Or should this be left to mod_cluster?
>> Can you explain more? I don't get the question.
>>> Does mod_cluster provide yet another approach top to bottom (separate from mod_jk and mod_proxy/mod_proxy_ajp)?
>> Mod_cluster is just a balancer for mod_proxy but due to the dynamic creation of balancers and workers it can't get in the httpd-trunk code right now.
>>> It would seem nice to me if mod_jk and/or mod_proxy_balancer could do health checks, but you have to draw the line somewhere on growing any given module, and if mod_jk and mod_proxy_balancer are not going in that direction, at some point mod_cluster may be in my future.
>> Cool :-)
> There are several different subsystems, and as I understood mod_cluster it already carefully separates them:
> 1) Dynamic topology detection (optional)
> What are our backend nodes? If you do not want to statically configure them, you need some mechanism based on either
> - registration: backend nodes register at one or multiple topology management nodes; the addresses of those are either configured, or they announce themselves on the network via broad- or multicast.
> - detection: topology manager receives broad- or multicast packets of the backend nodes. They do not need to know the topology manager, only the multicast address.
> More enhanced would be to already learn the forwarding rules (e.g. URLs to map) from the backend nodes. In the simpler case, the topology would be configured statically.
> 2) Dynamic state detection
> a) Liveness
> b) Load numbers
> Both could be either polled by (maybe scalability issues) or pushed to a state manager. Push could be done by tcp (the address could be sent to the backend, once it was detected in 1) or defined statically). Maybe one would use both ways, e.g. push for active state changes, like when an admin stops a node, poll for state manager driven things. Not sure.
> 3) Balancing
> Would be done based on the data collected by the state manager.
> It's not clear at all whether those three should be glued together tightly or kept in different pieces. I had the impression the general direction is more about separating them and allowing multiple experiments, like mod_cluster and mod_heartbeat. The interaction would be done via some common data container, e.g. slotmem or, in a distributed (multiple Apaches) situation, memcache or similar.
> Does this make sense?
Yes.
I've been working around #1 by using pre-designated port ranges for backends, e.g. configuring for balancing over a port range of 10 and only having a couple of servers running in this range at most given times. That's fine as long as one quiets Apache's error logging so that it only complains about backends that are *newly* unreachable rather than complaining each time a backend is retried. I supplied a patch for this some time back.
#2 and #3 are huge, however, and it would be good to see something firm rather than experimental in these areas sooner rather than later.
-- Jess Holle
Re: mod_proxy / mod_proxy_balancer
Jim Jagielski wrote:
> On May 6, 2009, at 4:35 AM, Jess Holle wrote:
>> Of course that redoes what a servlet engine would be doing and does so with lower fidelity. An ability to ask a backend for its current session count and load balance new requests on that basis would be really helpful. Whether this ability is buried into AJP, for instance, or is simply a separate request to a designated URL is another question, but the latter approach seems fairly general and the number of such requests could be throttled by a time-to-live setting on the last such count obtained.
>> Actually this could and should be generalized beyond active sessions to a back-end health metric. Each backend could compute and respond with a relative measure of busyness/health, and the load balancer could then balance new (session-less) requests to the least busy / most healthy backend. This would seem to be a *huge* step forward in load balancing capability/fidelity.
> The trick, of course, at least with HTTP, is that the querying of the backend is, of course, a request, and so one needs to worry about such things as keepalives and persistent connections, and how long do we wait for responses, etc... That's why oob-like health-and-status chatter is nice, because it doesn't interfere with the normal reverse-proxy/host logic.
> An idea: Instead of asking for this info before sending the request, what about the backend sending it as part of the response, as a response header. You don't know the status of the machine now, but you do know the status of it right after it handled the last request (the last time you saw it) and, assuming nothing else touched it, that status is likely still good.
> Latency will be an issue, of course... Overlapping requests where you don't have the response from req1 before you send req2 means that both requests think the server is at the same state, whereas of course, they aren't, but it may even out since req3, for example, (which happens after req1 is done) thinks that the backend has 2 concurrent requests, instead of the 1 (req2) and so maybe isn't selected... The hysteresis would be interesting to model :)
There's inherent hysteresis in this sort of thing.
Including health information (e.g. via a custom response header) on all responses is an interesting notion.
Exposing a URL on Apache through which the backend can push its health information (e.g. upon starting a new session or invalidating a session or detecting a low memory condition) also makes sense.
If these do not suffice, a watchdog thread (as in mod_jk) could do periodic health checks on the backends in a separate thread, or requests could pre-request health information for a backend if that backend's health information is sufficiently old.
There are lots of possibilities here.
-- Jess Holle
Re: mod_proxy / mod_proxy_balancer
Jim Jagielski wrote: On May 6, 2009, at 9:07 AM, Jess Holle wrote: jean-frederic clere wrote: Should general support for a query URL be provided in mod_proxy_balancer? Or should this be left to mod_cluster? Can you explain more? I don't get the question. What I mean is • Should mod_proxy_balancer be extended to provide a balancer algorithm in which one specifies a backend URL that will provide a single numeric health metric, throttle the number of such requests via a time-to-live associated with this information, and balance on this basis or • Should mod_cluster handle this issue? • Or both? Please recall that, afaik, mod_cluster is not AL nor is it part of Apache. So asking for direction for what is basically an external project on the Apache httpd dev list is kinda weird :) In any case, I think the hope of the ASF is that this capability is part of httpd, and you can see, with mod_heartbeat and the like, efforts in the direction. But the world is big enough for different implementations... You're right -- I was being weird. Sorry. I guess part of the reason for my asking was whether the ASF was basically saying we're not chasing this problem, see mod_cluster folk if you need it solved -- and, if so, hoping to get a little starting info as to what I'd be getting into chasing mod_cluster. I'd like to see this capability in httpd itself -- or at least have it very easy to add in a very seamless fashion via a pluggable custom balancer algorithm (without other larger configuration side effects) -- and thus would hope the ASF sees this as within the scope of httpd's core suite of modules. -- Jess Holle
Re: mod_proxy/mod_proxy_balancer bug
Rainer Jung wrote:
> The same type of balancing decision algorithm was part of mod_jk between 1.2.7 and 1.2.15. I always had problems understanding how exactly it behaves in case some workers are out of order. The algorithm is interesting, but I found it very hard to model its mathematics into formulas.
> We finally decided to switch to something else. For request, traffic or session based balancing we do count items (requests, bytes or new sessions), and divide the counters by two once a minute. That way load that happened in the past does count less.
> Furthermore a worker that was dead or deactivated some time gets the biggest current load number when being reactivated, so that it starts as smoothly as possible.
> I expect porting this to mod_proxy in trunk will be easy, but I'm not sure what experience others have with the fairness of balancing in case you add dynamics to the workers (errors and administrative downtimes).
I'd be *very* interested in such a port to mod_proxy_balancer -- in 2.2.x in my case. Any help/pointers/assistance would be appreciated. I could apply such a change as a patch to just my version, but I'd be a lot more interested in getting 2.2.x as a whole to a better place and not having to maintain my own fork of things.
I get a strong impression that others haven't really pushed mod_proxy_balancer in this area.
Overall, having solid mod_proxy_balancer functionality obviously benefits more than just AJP, and I like the idea of mod_proxy_ajp. That said, if mod_jk is going to move ahead and mod_proxy_ajp becomes a backwater, at some point I'll need to move back to mod_jk, though I'd really want the ability to gracefully throttle requests in mod_jk first. [When mod_jk runs out of connections it gives a 503. mod_proxy can queue up requests instead.]
-- Jess Holle
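[Editorial sketch] The counting scheme Rainer describes above (count items, halve the counters once a minute, and give a reactivated worker the largest current load) might be sketched in C as follows. This is not mod_jk source: `lb_worker` and the `lb_*` functions are hypothetical names, and the caller is assumed to invoke `lb_age` once per aging interval.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical worker record holding an aged count of requests,
 * bytes, or new sessions. Not a mod_jk or mod_proxy structure. */
typedef struct {
    uint64_t load;
} lb_worker;

/* Account for `amount` units of work on a worker. */
void lb_account(lb_worker *w, uint64_t amount)
{
    w->load += amount;
}

/* Aging step, meant to run once per interval (once a minute in the
 * description above): halve every counter so old load counts less. */
void lb_age(lb_worker *workers, size_t n)
{
    for (size_t i = 0; i < n; i++)
        workers[i].load /= 2;
}

/* A worker coming back from an error or disabled state starts with the
 * largest load currently seen on the other workers, so it is not
 * flooded with work the moment it returns. */
void lb_reactivate(lb_worker *workers, size_t n, size_t idx)
{
    uint64_t max = 0;
    for (size_t i = 0; i < n; i++) {
        if (i != idx && workers[i].load > max)
            max = workers[i].load;
    }
    workers[idx].load = max;
}

/* Pick the worker with the smallest aged load (first one wins ties).
 * The caller must pass n > 0. */
size_t lb_pick(const lb_worker *workers, size_t n)
{
    size_t best = 0;
    for (size_t i = 1; i < n; i++) {
        if (workers[i].load < workers[best].load)
            best = i;
    }
    return best;
}
```

Because every counter is halved at each step, load recorded k intervals ago contributes only 1/2^k of its original weight, so no counter can grow without bound while a worker is out of service.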
mod_proxy/mod_proxy_balancer bug
proxy_handler() calls ap_proxy_pre_request() inside a do loop over balanced workers. This in turn calls proxy_balancer_pre_request(), which does (*worker)->s->busy++.
Correspondingly proxy_balancer_post_request() does:
if (worker && worker->s->busy) worker->s->busy--;
Unfortunately, proxy_handler() only calls proxy_run_post_request(), and thus proxy_balancer_post_request(), outside the do loop. Thus the busy count of workers which currently cannot take requests (e.g. that are currently dead) increases without bound due to retries -- and is never reset.
Does anyone (i.e. who is more familiar with this code) have suggestions for how this should be fixed? If not, I can take a swing at it.
Similarly, when retrying workers in various routines in mod_proxy_balancer.c those workers' lbstatus is incremented. If the retry fails, however, the lbstatus is never reset. This issue also leads to an lbstatus that increases without bound. Just because a worker was dead for 8 hours does not mean it can handle all the work load now. It needs to start fresh -- not 8 hours in the hole. This issue also creates an unduly huge impact when doing
mycandidate->s->lbstatus -= total_factor;
We're seeing the load balancing be thrown dramatically off in this case. Does anyone have suggestions for how this should be fixed? If not, again I can take a swing at this, e.g. resetting lbstatus to 0 in ap_proxy_retry_worker().
It *seems* like both of the issues center on handling of dead workers, especially having multiple dead workers and/or workers that are dead for long periods of time.
I've not yet checked whether mod_jk (where I believe these basic algorithms came from) has similar issues.
-- Jess Holle
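[Editorial sketch] The busy-count imbalance described above can be reproduced with a toy model in C. This is a deliberate simplification, not the mod_proxy source: `sim_worker`, `sim_pre_request`, `sim_post_request` and `sim_handle` are hypothetical stand-ins for the pre-request hook (inside the retry loop) and the post-request hook (after it).

```c
#include <assert.h>

/* Hypothetical stand-in for a balancer worker. */
typedef struct {
    int busy;   /* number of requests the balancer believes are in flight */
    int alive;  /* 1 if the backend can take the request */
} sim_worker;

/* Mirrors the increment done in the pre-request hook. */
void sim_pre_request(sim_worker *w)
{
    w->busy++;
}

/* Mirrors the guarded decrement done in the post-request hook. */
void sim_post_request(sim_worker *w)
{
    if (w && w->busy)
        w->busy--;
}

/* Try workers in order until one is alive; return its index or -1.
 * release_on_failure == 0 reproduces the pattern described above:
 * the post hook runs once, after the loop, for the final worker only.
 * release_on_failure == 1 pairs every pre hook with a post hook. */
int sim_handle(sim_worker *ws, int n, int release_on_failure)
{
    int chosen = -1;
    for (int i = 0; i < n; i++) {
        sim_pre_request(&ws[i]);
        if (ws[i].alive) {
            chosen = i;
            break;
        }
        if (release_on_failure)
            sim_post_request(&ws[i]);
    }
    if (chosen >= 0)
        sim_post_request(&ws[chosen]);
    return chosen;
}
```

With one dead and one live worker, each request handled with `release_on_failure == 0` leaves the dead worker's `busy` one higher than before, while `release_on_failure == 1` keeps it at zero. Whether releasing inside the loop is the right fix for mod_proxy itself is left open in the message.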
Re: mod_proxy/mod_proxy_balancer bug
Jess Holle wrote:

proxy_handler() calls ap_proxy_pre_request() inside a do loop over balanced workers. This in turn calls proxy_balancer_pre_request() which does (*worker)->s->busy++. Correspondingly proxy_balancer_post_request() does: if (worker && worker->s->busy) worker->s->busy--; Unfortunately, proxy_handler only calls proxy_run_post_request() and thus proxy_balancer_post_request() outside the do loop. Thus the busy count of workers which currently cannot take requests (e.g. that are currently dead) increases without bound due to retries -- and is never reset. Does anyone (i.e. who is more familiar with this code) have suggestions for how this should be fixed? If not, I can take a swing at it.

Similarly, when retrying workers in various routines in mod_proxy_balancer.c those workers' lbstatus is incremented. If the retry fails, however, the lbstatus is never reset. This issue also leads to an lbstatus that increases without bound. Just because a worker was dead for 8 hours does not mean it can handle all the work load now. It needs to start fresh -- not 8 hours in the hole. This issue also creates an unduly huge impact when doing mycandidate->s->lbstatus -= total_factor;

Actually I'm off base here. total_factor places undue emphasis on any worker that satisfies a request when multiple dead workers are retried. For instance, if there are 7 dead workers, all being retried, 2 healthy workers, and all with an lbfactor of 1, the worker that gets the request gets its lbstatus decremented by 9, whereas it really should only be decremented by 2 -- else the weighting gets thrown way off. However, it is /not/ thrown off more due to the huge lbstatus values that build up in dead workers. That only becomes an issue when dead workers come to life. We're seeing the load balancing be thrown dramatically off in this case.

Does anyone have suggestions for how this should be fixed? If not, again I can take a swing at this, e.g. resetting lbstatus to 0 in ap_proxy_retry_worker().

It *seems* like both of the issues center on handling of dead workers, especially having multiple dead workers and/or workers that are dead for long periods of time. I've not yet checked whether mod_jk (where I believe these basic algorithms came from) has similar issues. -- Jess Holle
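The 7-dead/2-healthy arithmetic in this message can be checked with a small stand-alone model. This is only a sketch under stated assumptions: model_worker, model_select() and model_winner_status() are hypothetical names invented for illustration, and the selection rule is a simplified reading of the description above, not the actual mod_proxy_balancer code.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of one balancer member. */
typedef struct {
    int lbfactor;   /* configured weight */
    int lbstatus;   /* running score used for selection */
    int dead;       /* 1 if the worker is in error and merely being retried */
} model_worker;

/*
 * One selection round. Every considered worker gains its lbfactor; the
 * healthy worker with the highest lbstatus wins and is charged total_factor.
 * When count_dead is nonzero, retried dead workers also contribute to
 * total_factor (the behaviour described in the message); when it is zero
 * they are skipped entirely. Returns the chosen index, or -1 if no
 * healthy worker exists.
 */
int model_select(model_worker *w, size_t n, int count_dead)
{
    int total_factor = 0;
    int best = -1;
    size_t i;

    assert(w != NULL);
    for (i = 0; i < n; i++) {
        if (w[i].dead && !count_dead)
            continue;
        w[i].lbstatus += w[i].lbfactor;
        total_factor += w[i].lbfactor;
        if (!w[i].dead && (best < 0 || w[i].lbstatus > w[best].lbstatus))
            best = (int)i;
    }
    if (best >= 0)
        w[best].lbstatus -= total_factor;
    return best;
}

/* Build the 7 dead + 2 healthy scenario (all lbfactor 1), run one round,
 * and return the lbstatus of the worker that received the request. */
int model_winner_status(int count_dead)
{
    model_worker w[9];
    size_t i;
    int chosen;

    for (i = 0; i < 9; i++) {
        w[i].lbfactor = 1;
        w[i].lbstatus = 0;
        w[i].dead = (i < 7);
    }
    chosen = model_select(w, 9, count_dead);
    return chosen < 0 ? 0 : w[chosen].lbstatus;
}
```

In this model the winner gains 1 and is charged 9 when the dead workers are counted, versus being charged 2 when only the healthy workers are counted, which matches the "decremented by 9 ... should only be decremented by 2" observation.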
Proposed proxy logging patch
I have noticed that proxy logging about a dead server is overly verbose in some circumstances. If you use proxy_balancer to load balance over a number of servers and one is dead you'll get 3 lines of error logging every time it retries the server (to see if it's alive yet). This can get rather obnoxious when you're balancing over a number of ports which may or may not have a server listening at the time -- and when you're allowing retries of dead servers with any frequency.

The attached patch changes the level of such logging to debug for retries of a worker known to be in an error state, leaving the level at error for other cases. The result is that you get error logging when a server is first determined to be unavailable -- and then are simply not bothered about this any longer.

Note that this patch does have a little bit of unrelated WIN32-specific material regarding APR_SO_NONBLOCK -- as this is necessary on Windows to get connection timeouts to work -- at least until someone fixes the apr_socket_connect() implementation on Windows.

-- Jess Holle

P.S. I've developed a similar patch for the tomcat JK connectors as well.
--- modules/proxy/mod_proxy.h.orig 2008-11-11 14:04:34.0 -0600
+++ modules/proxy/mod_proxy.h 2009-01-19 14:53:57.745762300 -0600
@@ -272,6 +272,7 @@
 #define PROXY_WORKER_STOPPED        0x0040
 #define PROXY_WORKER_IN_ERROR       0x0080
 #define PROXY_WORKER_HOT_STANDBY    0x0100
+#define PROXY_WORKER_RETRYING       0x0200 /* Additional status flag to indicate worker is being retried [PTC] */
 
 #define PROXY_WORKER_NOT_USABLE_BITMAP ( PROXY_WORKER_IN_SHUTDOWN | \
 PROXY_WORKER_DISABLED | PROXY_WORKER_STOPPED | PROXY_WORKER_IN_ERROR )
--- modules/proxy/proxy_util.c.orig 2008-11-11 14:04:34.0 -0600
+++ modules/proxy/proxy_util.c 2009-01-19 14:57:43.552431300 -0600
@@ -1932,6 +1932,7 @@
         if (apr_time_now() > worker->s->error_time + worker->retry) {
             ++worker->s->retries;
             worker->s->status &= ~PROXY_WORKER_IN_ERROR;
+            worker->s->status |= PROXY_WORKER_RETRYING; /* Flag that we're retrying a worker that was in error [PTC] */
             ap_log_error(APLOG_MARK, APLOG_DEBUG, 0, s,
                          "proxy: %s: worker for (%s) has been marked for retry",
                          proxy_function, worker->hostname);
@@ -2223,6 +2224,7 @@
 {
     apr_status_t rv;
     int connected = 0;
+    int was_retry; /* Added [PTC] */
     int loglevel;
     apr_sockaddr_t *backend_addr = conn->addr;
     apr_socket_t *newsock;
@@ -2301,13 +2303,25 @@
                      "proxy: %s: fam %d socket created to connect to %s",
                      proxy_function, backend_addr->family, worker->hostname);
 
+        /* Workaround needed on Windows to get connection timeout to actually work.
+           Should be fixed in WIN32 implementation of apr_socket_connect(). [PTC] */
+#ifdef WIN32
+        apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 1);
+#endif
+
         /* make the connection out of the socket */
         rv = apr_socket_connect(newsock, backend_addr);
 
+        /* Second part of workaround noted just above [PTC] */
+#ifdef WIN32
+        apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 0);
+#endif
+
         /* if an error occurred, loop round and try again */
         if (rv != APR_SUCCESS) {
             apr_socket_close(newsock);
-            loglevel = backend_addr->next ? APLOG_DEBUG : APLOG_ERR;
+            /* Only log failure at debug level when we know we're retrying a previously failed worker [PTC] */
+            loglevel = backend_addr->next ? APLOG_DEBUG : ( (worker->s->status & PROXY_WORKER_RETRYING) ? APLOG_DEBUG : APLOG_ERR );
             ap_log_error(APLOG_MARK, loglevel, rv, s,
                          "proxy: %s: attempt to connect to %pI (%s) failed",
                          proxy_function,
@@ -2337,19 +2351,27 @@
      * Altrough some connections may be alive
      * no further connections to the worker could be made
      */
+    /* If we are retrying a worker that had previously failed, then only log
+       at debug level verbosity. Clear retry bitmask from status in all cases.
+       Finally, when a retry fails return -DECLINED rather than DECLINED. [PTC]
+    */
     if (!connected && PROXY_WORKER_IS_USABLE(worker) &&
         !(worker->s->status & PROXY_WORKER_IGNORE_ERRORS)) {
+        was_retry = worker->s->status & PROXY_WORKER_RETRYING;
+        loglevel = was_retry ? APLOG_DEBUG : APLOG_ERR;
         worker->s->status |= PROXY_WORKER_IN_ERROR;
         worker->s->error_time = apr_time_now();
-        ap_log_error(APLOG_MARK, APLOG_ERR, 0, s,
+        ap_log_error(APLOG_MARK, loglevel, 0, s,
             "ap_proxy_connect_backend disabling worker for (%s)",
             worker->hostname);
     }
     else {
         worker->s->error_time = 0;
         worker->s->retries = 0;
+        was_retry = 0;
     }
-    return connected ? OK : DECLINED;
+    worker->s->status &= ~PROXY_WORKER_RETRYING;
+    return connected ? OK
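The log-level decision this patch introduces can be isolated into a few lines. The following is a stand-alone sketch, not httpd code: the two status bits use the values shown in the patch, while sketch_mark_retry(), sketch_connect_loglevel() and the sketch_level enum are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Status bits with the values that appear in the patch text. */
#define PROXY_WORKER_IN_ERROR 0x0080
#define PROXY_WORKER_RETRYING 0x0200

/* Illustrative stand-ins for the httpd error and debug log levels. */
enum sketch_level { SKETCH_LOG_ERR = 3, SKETCH_LOG_DEBUG = 7 };

/* Mark a worker for retry: clear the error bit, set the retrying bit. */
unsigned int sketch_mark_retry(unsigned int status)
{
    status &= ~(unsigned int)PROXY_WORKER_IN_ERROR;
    status |= PROXY_WORKER_RETRYING;
    return status;
}

/*
 * Choose the level for a failed connect attempt: debug if another
 * address remains to be tried or if the worker is a known-bad worker
 * being retried, error otherwise.
 */
int sketch_connect_loglevel(int has_next_addr, unsigned int status)
{
    assert(has_next_addr == 0 || has_next_addr == 1);
    if (has_next_addr)
        return SKETCH_LOG_DEBUG;
    return (status & PROXY_WORKER_RETRYING) ? SKETCH_LOG_DEBUG : SKETCH_LOG_ERR;
}
```

The effect is that the first failure of a worker is still reported at error level, while repeated probes of a worker already known to be down drop to debug level.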
Re: proxy_ajp connect timeout fix.
Also note that the mod_jk code works just fine here, i.e. its socket connection timeouts are obeyed without further hackery. This is via jk_connect.c's nb_connect(), not APR, though -- so chalk one up for by-passing APR? -- Jess Holle

Jess Holle wrote: Ruediger Pluem wrote: I guess you should move this over to d...@apr as this is likely a problem with the windows specific connect call not returning immediately. I moved this over to d...@apr as suggested, but have not received any responses there. Note that I applied Matt Stevenson's suggested fix from earlier in this thread [http://marc.info/?l=apache-httpd-dev&m=122358323701009&w=2] and the connection timeout then worked on Windows as expected with 8 dead ports being checked in between 1 and 2 seconds -- which is what I'd expect given a connectiontimeout of 160ms. It would seem proxy_util.c should not have to do this but rather that whatever is needed to get connection timeouts to work on a given platform should be done in apr_socket_connect(). This raises another question, though. Earlier in this thread there were claims that Matt Stevenson's patch had adverse performance impacts, e.g. on HTTPS. Can someone explain how this could be? I ask in part as unless/until someone figures out the right fix in APR, I'll have to use Matt's patch -- and would like to understand the downsides and mitigate them if possible. -- Jess Holle
Re: proxy_ajp connect timeout fix.
Ruediger Pluem wrote: I guess you should move this over to d...@apr as this is likely a problem with the windows specific connect call not returning immediately.

I moved this over to d...@apr as suggested, but have not received any responses there. Note that I applied Matt Stevenson's suggested fix from earlier in this thread [http://marc.info/?l=apache-httpd-dev&m=122358323701009&w=2] and the connection timeout then worked on Windows as expected with 8 dead ports being checked in between 1 and 2 seconds -- which is what I'd expect given a connectiontimeout of 160ms. It would seem proxy_util.c should not have to do this but rather that whatever is needed to get connection timeouts to work on a given platform should be done in apr_socket_connect().

This raises another question, though. Earlier in this thread there were claims that Matt Stevenson's patch had adverse performance impacts, e.g. on HTTPS. Can someone explain how this could be? I ask in part as unless/until someone figures out the right fix in APR, I'll have to use Matt's patch -- and would like to understand the downsides and mitigate them if possible. -- Jess Holle
Re: proxy_ajp connect timeout fix.
Did anyone test this on Windows? I ask as I am trying connectiontimeout=160ms as part of my Apache 2.2.11 configuration and am getting a configuration error. I get the same error with ping and other parameters which now use ap_timeout_parameter_parse(). My BalancerMember looks something like:

BalancerMember ajp://localhost:8010 route=tomcat1 min=16 max=80 smax=40 ttl=900 keepalive=Off timeout=9 retry=30 connectiontimeout=160ms flushpackets=on

-- Jess Holle
Re: proxy_ajp connect timeout fix.
Thanks!

Ruediger Pluem wrote: On 12/16/2008 11:17 PM, Jess Holle wrote: Did anyone test this on Windows? I stumbled across the same issue on Red Hat AS 5 today. Try to patch your APR with http://svn.apache.org/viewvc?rev=727052&view=rev from APR trunk. This should fix this. I ask as I am trying connectiontimeout=160ms as part of my Apache 2.2.11 configuration and am getting a configuration error. I get the same error with ping and other parameters which now use ap_timeout_parameter_parse(). My BalancerMember looks something like: BalancerMember ajp://localhost:8010 route=tomcat1 min=16 max=80 smax=40 ttl=900 keepalive=Off timeout=9 retry=30 connectiontimeout=160ms flushpackets=on Regards Rüdiger
Re: proxy_ajp connect timeout fix.
The errno assignments you added did the trick. Unfortunately, I'm still missing the overall goal. I have many proxy balancer members like:

BalancerMember ajp://localhost:8010 route=tomcat1 min=16 max=80 smax=40 ttl=900 keepalive=Off timeout=9 retry=30 connectiontimeout=160ms flushpackets=on
BalancerMember ajp://localhost:8011 route=tomcat2 min=16 max=80 smax=40 ttl=900 keepalive=Off timeout=9 retry=30 connectiontimeout=160ms flushpackets=on
...

However, the error log says:

[Tue Dec 16 17:32:*25* 2008] [error] (OS 10061)No connection could be made because the target machine actively refused it. : proxy: AJP: attempt to connect to 127.0.0.1:8011 (localhost) failed
[Tue Dec 16 17:32:25 2008] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Tue Dec 16 17:32:25 2008] [error] proxy: AJP: failed to make connection to backend: localhost
[Tue Dec 16 17:32:*26* 2008] [error] (OS 10061)No connection could be made because the target machine actively refused it. : proxy: AJP: attempt to connect to 127.0.0.1:8012 (localhost) failed
[Tue Dec 16 17:32:26 2008] [error] ap_proxy_connect_backend disabling worker for (localhost)
[Tue Dec 16 17:32:26 2008] [error] proxy: AJP: failed to make connection to backend: localhost
...

Each port (on Windows) still consistently takes right around 1 full second to reject, despite having set connectiontimeout to be 160ms. Something seems to still be awry here as 160ms is significantly less than 1000ms...

-- Jess Holle

Ruediger Pluem wrote: On 12/16/2008 11:17 PM, Jess Holle wrote: Did anyone test this on Windows? I stumbled across the same issue on Red Hat AS 5 today. Try to patch your APR with http://svn.apache.org/viewvc?rev=727052&view=rev from APR trunk. This should fix this. I ask as I am trying connectiontimeout=160ms as part of my Apache 2.2.11 configuration and am getting a configuration error. I get the same error with ping and other parameters which now use ap_timeout_parameter_parse(). My BalancerMember looks something like: BalancerMember ajp://localhost:8010 route=tomcat1 min=16 max=80 smax=40 ttl=900 keepalive=Off timeout=9 retry=30 connectiontimeout=160ms flushpackets=on Regards Rüdiger
Re: Time for 2.2.11?
Ruediger Pluem wrote: On 11/15/2008 09:50 PM, Ruediger Pluem wrote: Not that much time has passed since we released 2.2.10 (one month), but I would like to see a release of 2.2.11 in the near future. Why? 2.2.10 has two regressions, one against 2.2.8 (crashes caused by the proxy) which is already backported and one against 2.2.9 (errors in openssl detection) which is currently proposed for backport and misses two votes. There are two further changes in the STATUS file that only miss one additional vote. With these 3 changes in the pipeline and the 10 changes already done for 2.2.11 I think we have enough stuff for a release given the two regressions above. I even volunteer to be the RM for this release and if the remaining proposals get in I would like to TR on 29th / 30th of November and release on 6th / 7th of December if the voting passes. And yes I know some of us will be disappointed that some things will miss the boat again (especially SNI), but they wouldn't be in a 2.2.x release even if we do not release 2.2.11 at the beginning of December. Opinions? Given the positive feedback: Please vote now on the backports :-). I /really/ want to see a sub-second proxy connection timeout as this is needed due to Windows' inappropriate RFC interpretation. This would be r705005 produced by Ruediger, I believe. If this is not part of 2.2.11 (I'm pretty sure it did not go into 2.2.10) then I'm going to have to backport this myself into our binaries for real soon here and keep doing so with each new 2.2.x. It would be /much/ better to just have this in 2.2.x. -- Jess Holle
Re: Time for 2.2.11?
Cool. Thanks! I'll anxiously await 2.2.11 then.

Rainer Jung wrote: Jess Holle schrieb: Ruediger Pluem wrote: On 11/15/2008 09:50 PM, Ruediger Pluem wrote: Given the positive feedback: Please vote now on the backports :-). I /really/ want to see a sub-second proxy connection timeout as this is needed due to Windows' inappropriate RFC interpretation. This would be r705005 produced by Ruediger, I believe. If this is not part of 2.2.11 (I'm pretty sure it did not go into 2.2.10) then I'm going to have to backport this myself into our binaries for real soon here and keep doing so with each new 2.2.x. It would be /much/ better to just have this in 2.2.x. This has already been backported to 2.2.x on November 11th: http://svn.apache.org/viewvc?view=rev&revision=713145 and http://mail-archives.apache.org/mod_mbox/httpd-cvs/200811.mbox/[EMAIL PROTECTED] Regards, Rainer
Re: proxy_ajp connect timeout fix.
Mladen Turk wrote: Ruediger Pluem wrote: This would be similar to what we have for parsing some file sizes across the conf (1024, 1K, 1M, ...) I'm not aware of any standard defining that, but since we underneath use apr_time anything from 1us should be valid time (whether it makes sense is a different story). How about the following: +1 More time modifiers (minutes and hours, 'M', 'hH') would be handy, but that's it.

More modifiers would be /nice/ but seconds and milliseconds suffice for my current needs.

Think it's even backportable to 2.2.x.

That would be good as I need it in 2.2.x :-) -- Jess Holle
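The unit-suffix idea discussed in this message can be illustrated with a small stand-alone parser. This is a sketch only: sketch_parse_timeout() is a hypothetical helper, the accepted suffixes ("ms", "s", "min", "h") are assumptions chosen for the example, and it is not the patch that was proposed on the list. It converts to microseconds because the message notes that apr_time is used underneath; overflow of very large values is not handled.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse "<number>[unit]" into microseconds.
 * If the text carries no unit, default_unit is applied instead.
 * Returns 0 on success and -1 on malformed input or an unknown unit.
 */
int sketch_parse_timeout(const char *text, const char *default_unit,
                         long long *out_us)
{
    char *end;
    long long value;
    const char *unit;

    assert(text != NULL && default_unit != NULL && out_us != NULL);

    errno = 0;
    value = strtoll(text, &end, 10);
    if (end == text || errno == ERANGE || value < 0)
        return -1;

    /* Whatever follows the digits is the unit; fall back to the default. */
    unit = (*end != '\0') ? end : default_unit;

    if (strcmp(unit, "ms") == 0)
        *out_us = value * 1000LL;
    else if (strcmp(unit, "s") == 0)
        *out_us = value * 1000000LL;
    else if (strcmp(unit, "min") == 0)
        *out_us = value * 60LL * 1000000LL;
    else if (strcmp(unit, "h") == 0)
        *out_us = value * 3600LL * 1000000LL;
    else
        return -1;
    return 0;
}
```

With a default unit of seconds, an existing configuration value such as 9 keeps its meaning, while 160ms becomes expressible without changing the unit of the bare number.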
Re: proxy_ajp connect timeout fix.
Ruediger Pluem wrote: I checked 2.2.x and trunk in the meantime and they behave as they should *without* the patch. If I try to connect to a non existing host the apr_socket_connect call returns after the timeout set via connectiontimeout. I guess this leaves us to the question whether we need to be able to set connectiontimeouts below one second. I reverted r703998 on trunk.

I need some means of getting proxy connections to sockets with nothing listening on them to fail in less than 1 second as a workaround to Windows' noncompliant RST handling. I believe Mladen just added this to mod_jk (where I also need such a capability for the IIS/Tomcat connector). This seems cleaner than the GetTcpTable mess I was creating (as I'd assume local connections should take significantly less than 1 second to connect when successful). -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: On 10/13/2008 12:50 AM, Jess Holle wrote: Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just timing out the initial formation of the connection? It would seem logical that there would be a connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case. We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not. Am I just confused here? No you are not. The next 2.2.x release will contain the parameter connectiontimeout where you can set *just* the connection timeout. The other parameter you are referring to is timeout. It will keep its meaning and will be used as a connection timeout if connectiontimeout is not set.

By next 2.2.x release do you mean 2.2.10 (assuming it goes out)? Or trunk beyond 2.2.10? -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine. We have had another engineer verify the issue on another machine using a .Net application, so (1) it is not Apache specific and (2) it is not specific to my machine. He's also somewhat of a Windows guru, but I'd be ecstatic if someone could point out a reasonable way around this issue. -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: On 10/13/2008 12:50 AM, Jess Holle wrote: Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just timing out the initial formation of the connection? It would seem logical that there would be a connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case. We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not. Am I just confused here? No you are not. The next 2.2.x release will contain the parameter connectiontimeout where you can set *just* the connection timeout. The other parameter you are referring to is timeout. It will keep its meaning and will be used as a connection timeout if connectiontimeout is not set.

Ah, so we additionally need something like Matt Stevenson's patch (or just change connection timeout to be a float or double rather than changing its units) to allow connection timeouts of much less than 1 second (e.g. 0.125 seconds) to address my Windows issue with slow connection rejection -- and that's /if/ Windows consistently connects in less than that timeframe when it is going to connect. Right? -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
Mladen Turk wrote: Ruediger Pluem wrote: Not exactly. I would prefer to fix the basic issue with Windows. If we need to support milliseconds for connection timeouts seems to be another story for me. Can some of the Windows gurus come to the rescue to either confirm and explain why it takes that long for connect to return after trying to connect to a closed port (on the same machine !!!) or if this must be a local issue on a specific machine.

According to Microsoft (http://support.microsoft.com/default.aspx/kb/314053):

TcpMaxConnectRetransmissions
Key: Tcpip\Parameters
Value Type: REG_DWORD - Number
Valid Range: 0 - 0x
Default: 2
Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds.

So, looks like with the default of 2 retransmits there is at least a 9 second delay.

Hmm... Oddly I'm seeing right around 1 second (just a little over) delay for the rejection of each connection on a port on which nothing is listening. This obviously does not match up with the 9 seconds in any way. -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
I just set this parameter to 0 and the issue went away entirely. Good catch, Ruediger! Thank you -- and all who helped on this thread! It would appear that Microsoft's documentation slipped a decimal place somewhere as it would appear there is about a 0.3 second delay on the initial retry and about a 0.6 second delay on the second -- resulting in about a 1 second overall delay when other overhead/latency is included. I don't see a way to reduce this delay and overall concur with Andy that this parameter should be 0 by all rights. Any thoughts? -- Jess Holle

Andy Wang wrote: Ruediger Pluem wrote: According to Microsoft (http://support.microsoft.com/default.aspx/kb/314053) TcpMaxConnectRetransmissions Key: Tcpip\Parameters Value Type: REG_DWORD - Number Valid Range: 0 - 0x Default: 2 Description: This parameter determines the number of times that TCP retransmits a connect request (SYN) before aborting the attempt. The retransmission timeout is doubled with each successive retransmission in a particular connect attempt. The initial timeout value is three seconds. So, looks like with the default of 2 retransmits there is at least a 9 second delay.

I thought the problem was with local connects to a port to which no process listens. In this case the OS should immediately send a RST (reset) and the client should not resend the SYN. I expect this TcpMaxConnectRetransmissions to be used in the case where the remote server doesn't send any answer (neither RST nor SYN/ACK) and the client needs to retry the connection establishment after some timeout. That shouldn't be the case here, because the local system should always be able to send a RST if nothing is listening on the port. Except there's a local firewall with packet drop coming into the game (or name resolution timeouts, or ...).

Exactly my understanding of the problem. So I see no use in TcpMaxConnectRetransmissions for this case. Regards Rüdiger

Looks like it might be the retry issue:

3963 13.831213 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4087 14.280717 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4088 14.280735 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
4238 14.827581 source destination TCP 2608 > 1230 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
4239 14.827603 source destination TCP 1230 > 2608 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

The RSTs occur remotely and Windows retries twice.

1428 4.025656 unixsource destination TCP 57864 > 1230 [SYN] Seq=0 Win=32768 Len=0 MSS=1460 WS=0
1429 4.025674 unixsource destination TCP 1230 > 57864 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0

That's the same exchange from an HP-UX source to a linux destination. I'm assuming that windows pulls the same crap for localhost traffic even though we can't capture it to prove the case. Oh the joys of TCP/IP troubleshooting on Windows :) Andy
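The delay arithmetic argued over in this message (a timeout that doubles with each retransmission) is easy to check. A minimal sketch follows; sketch_retransmit_delay_ms() is a hypothetical helper, and it models only the quoted description of the doubling rule, not the Windows TCP stack itself.

```c
#include <assert.h>

/*
 * Sum the waits that precede each of `retransmits` retransmissions when
 * the timeout starts at initial_ms and doubles after every retransmission.
 */
long sketch_retransmit_delay_ms(long initial_ms, int retransmits)
{
    long total = 0;
    long wait = initial_ms;
    int i;

    assert(initial_ms >= 0 && retransmits >= 0);
    for (i = 0; i < retransmits; i++) {
        total += wait;
        wait *= 2;
    }
    return total;
}
```

With the documented 3 second initial timeout and 2 retransmissions this gives 3 + 6 = 9 seconds. With an initial value of roughly 0.3 seconds it gives 0.3 + 0.6 = 0.9 seconds, in line with the observed delay of about 1 second, and with 0 retransmissions the added delay is zero.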
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: So if no one finds a registry entry to stop this RFC violating behaviour

I'd love to see this solved by such a discovery, option 0.

I see only two options on Windows: 1. Fiddle around with GetTcpTable.

I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.

2. Allow connectiontimeout to somehow accept milliseconds.

Or a floating point number? Unfortunately this would seem to impact actual connection timeouts as an undesired side-effect of trying to address Windows' bad treatment of RSTs, right?

-- Jess Holle

31a32,35
> #ifdef WIN32
> #include <iphlpapi.h>
> #endif
>
2268a2273,2397
> #ifdef WIN32
>
> typedef struct live_port_data_t live_port_data_t;
> struct live_port_data_t {
>     apr_time_t time_obtained;
>     int n_ports;
>     int *ports;
> };
>
> static live_port_data_t *live_port_data = NULL;
>
> static int int_comparator( const void *pint1, const void *pint2 )
> {
>     int int1 = *((int*)pint1);
>     int int2 = *((int*)pint2);
>     if ( int1 < int2 )
>         return -1;
>     if ( int1 > int2 )
>         return 1;
>     return 0;
> }
>
> static live_port_data_t *get_port_data()
> {
>     /* Much of this routine adapted directly from http://msdn.microsoft.com/en-us/library/aa366026(VS.85).aspx */
>
>     /* Declare and initialize variables */
>     PMIB_TCPTABLE pTcpTable;
>     DWORD dwSize;
>     DWORD dwRetVal;
>
>     pTcpTable = (MIB_TCPTABLE *) malloc( sizeof (MIB_TCPTABLE) );
>     if ( pTcpTable == NULL )
>         return NULL;
>
>     dwSize = sizeof (MIB_TCPTABLE);
>     /* Make an initial call to GetTcpTable to
>        get the necessary size into the dwSize variable */
>     if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) ==
>         ERROR_INSUFFICIENT_BUFFER) {
>         free(pTcpTable);
>         pTcpTable = (MIB_TCPTABLE *) malloc(dwSize);
>         if (pTcpTable == NULL)
>             return NULL;
>     }
>
>     /* Make a second call to GetTcpTable to get
>        the actual data we require */
>     if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) != NO_ERROR) {
>         free(pTcpTable);
>         return NULL;
>     }
>     else
>     {
>         apr_time_t time_now = apr_time_now();
>         live_port_data_t *port_data;
>         int nUniqPorts = 0;
>         int *uniqPorts;
>         {
>             int nEntries = (int) pTcpTable->dwNumEntries;
>             int *ports = (int*) malloc( nEntries * sizeof( int ) );
>             int prevPort = -9;
>             int i;
>             /* copy ports from pTcpTable to ports array */
>             for (i = 0; i < nEntries; i++)
>                 ports[i] = ntohs( (u_short) pTcpTable->table[i].dwLocalPort );
>             free( pTcpTable );
>             /* sort ports array */
>             qsort( ports, nEntries, sizeof( int ), int_comparator );
>             /* reduce ports array to list of unique ports */
>             uniqPorts = (int*) malloc( nEntries * sizeof( int ) ); /* array will be oversized in the end; value speed over small memory savings */
>             for (i = 0; i < nEntries; i++) {
>                 int port = ports[i];
>                 if ( port != prevPort )
>                 {
>                     uniqPorts[nUniqPorts] = port;
>                     ++nUniqPorts;
>                     prevPort = port;
>                 }
>             }
>             free( ports );
>         }
>         port_data = malloc( sizeof( live_port_data_t ) );
>         port_data->time_obtained = time_now;
>         port_data->n_ports = nUniqPorts;
>         port_data->ports = uniqPorts;
>         return port_data;
>     }
> }
>
> static void destroy_port_data( live_port_data_t *port_data )
> {
>     free( port_data->ports );
>     free( port_data );
> }
>
> static int port_in_data( const live_port_data_t *port_data, int port )
> {
>     return ( bsearch( &port, port_data->ports, port_data->n_ports, sizeof( int ), int_comparator ) != NULL );
> }
>
> /* TO_DO: make this configurable */
> #define LIVE_PORT_DATA_TTL 150 /* use hard-wired time-to-live of 1.5 seconds for port data */
>
> static int port_is_clearly_not_alive( const apr_sockaddr_t *addr, const server_rec *s )
> {
>     /* if not dealing with localhost, then simply return 0
Re: Speeding up mod_proxy_balancer on Windows
Jess Holle wrote: Ruediger Pluem wrote: So if no one finds a registry entry to stop this RFC violating behaviour I'd love to see this solved by such a discovery, option 0. I see only two options on Windows: 1. Fiddle around with GetTcpTable. I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.

I forgot the -u on my diff. Here's a unified diff.

-- Jess Holle

--- proxy_util-2.2.9.c 2008-05-28 16:11:24.0 -0500
+++ proxy_util.c 2008-10-13 14:32:26.342593500 -0500
@@ -29,6 +29,10 @@
 #define apr_socket_create apr_socket_create_ex
 #endif
 
+#ifdef WIN32
+#include <iphlpapi.h>
+#endif
+
 /* Global balancer counter */
 int PROXY_DECLARE_DATA proxy_lb_workers = 0;
 static int lb_workers_limit = 0;
@@ -2266,6 +2270,131 @@
 }
 #endif /* USE_ALTERNATE_IS_CONNECTED */
 
+#ifdef WIN32
+
+typedef struct live_port_data_t live_port_data_t;
+struct live_port_data_t {
+    apr_time_t time_obtained;
+    int n_ports;
+    int *ports;
+};
+
+static live_port_data_t *live_port_data = NULL;
+
+static int int_comparator( const void *pint1, const void *pint2 )
+{
+    int int1 = *((int*)pint1);
+    int int2 = *((int*)pint2);
+    if ( int1 < int2 )
+        return -1;
+    if ( int1 > int2 )
+        return 1;
+    return 0;
+}
+
+static live_port_data_t *get_port_data()
+{
+    /* Much of this routine adapted directly from http://msdn.microsoft.com/en-us/library/aa366026(VS.85).aspx */
+
+    /* Declare and initialize variables */
+    PMIB_TCPTABLE pTcpTable;
+    DWORD dwSize;
+    DWORD dwRetVal;
+
+    pTcpTable = (MIB_TCPTABLE *) malloc( sizeof (MIB_TCPTABLE) );
+    if ( pTcpTable == NULL )
+        return NULL;
+
+    dwSize = sizeof (MIB_TCPTABLE);
+    /* Make an initial call to GetTcpTable to
+       get the necessary size into the dwSize variable */
+    if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) ==
+        ERROR_INSUFFICIENT_BUFFER) {
+        free(pTcpTable);
+        pTcpTable = (MIB_TCPTABLE *) malloc(dwSize);
+        if (pTcpTable == NULL)
+            return NULL;
+    }
+
+    /* Make a second call to GetTcpTable to get
+       the actual data we require */
+    if ((dwRetVal = GetTcpTable(pTcpTable, &dwSize, FALSE)) != NO_ERROR) {
+        free(pTcpTable);
+        return NULL;
+    }
+    else
+    {
+        apr_time_t time_now = apr_time_now();
+        live_port_data_t *port_data;
+        int nUniqPorts = 0;
+        int *uniqPorts;
+        {
+            int nEntries = (int) pTcpTable->dwNumEntries;
+            int *ports = (int*) malloc( nEntries * sizeof( int ) );
+            int prevPort = -9;
+            int i;
+            /* copy ports from pTcpTable to ports array */
+            for (i = 0; i < nEntries; i++)
+                ports[i] = ntohs( (u_short) pTcpTable->table[i].dwLocalPort );
+            free( pTcpTable );
+            /* sort ports array */
+            qsort( ports, nEntries, sizeof( int ), int_comparator );
+            /* reduce ports array to list of unique ports */
+            uniqPorts = (int*) malloc( nEntries * sizeof( int ) ); /* array will be oversized in the end; value speed over small memory savings */
+            for (i = 0; i < nEntries; i++) {
+                int port = ports[i];
+                if ( port != prevPort )
+                {
+                    uniqPorts[nUniqPorts] = port;
+                    ++nUniqPorts;
+                    prevPort = port;
+                }
+            }
+            free( ports );
+        }
+        port_data = malloc( sizeof( live_port_data_t ) );
+        port_data->time_obtained = time_now;
+        port_data->n_ports = nUniqPorts;
+        port_data->ports = uniqPorts;
+        return port_data;
+    }
+}
+
+static void destroy_port_data( live_port_data_t *port_data )
+{
+    free( port_data->ports );
+    free( port_data );
+}
+
+static int port_in_data( const live_port_data_t *port_data, int port )
+{
+    return ( bsearch( &port, port_data->ports, port_data->n_ports, sizeof( int ), int_comparator ) != NULL );
+}
+
+/* TO_DO: make this configurable */
+#define LIVE_PORT_DATA_TTL
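The lookup structure used by this patch (sort the port list, squeeze out duplicates, then binary-search it) can be tried on any platform without GetTcpTable. The following is a portable sketch only: the sketch_* names are hypothetical, the port numbers are made up for the example, and the Windows-specific table retrieval and the caching/locking question raised in the message are deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way integer comparison usable by both qsort() and bsearch(). */
static int sketch_int_cmp(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort ports in place and remove duplicates; returns the new element count. */
size_t sketch_sort_unique(int *ports, size_t n)
{
    size_t i, out = 0;

    assert(ports != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(ports, n, sizeof(int), sketch_int_cmp);
    for (i = 0; i < n; i++) {
        if (out == 0 || ports[i] != ports[out - 1])
            ports[out++] = ports[i];
    }
    return out;
}

/* Returns 1 if port is present in the sorted, de-duplicated array, else 0. */
int sketch_port_listed(const int *ports, size_t n, int port)
{
    assert(ports != NULL);
    return bsearch(&port, ports, n, sizeof(int), sketch_int_cmp) != NULL;
}
```

Note that bsearch() needs the address of the key and a comparator that returns negative, zero, or positive consistently; a comparator that never reports "greater" correctly would make lookups unreliable even on a sorted array.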
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: On 10/13/2008 09:37 PM, Jess Holle wrote: Ruediger Pluem wrote: So if no one finds a registry entry to stop this RFC violating behaviour I'd love to see this solved by such a discovery, option 0. I see only two options on Windows: 1. Fiddle around with GetTcpTable. I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're

Mind to attach this as a unified diff

Already did -- I goofed the first time...

worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate a lock to arbitrate access to this cached data.

I guess the post config hook would be the correct place to create such a mutex. Depending on what type of mutex you need a call to apr_global_mutex_child_init is due additionally in the child_init hook. Have a look at other caching modules in httpd that have to deal with this like ldap or on trunk the small objects caches.

Thanks for the pointer.

2. Allow connectiontimeout to somehow accept milliseconds. Or a floating point number? Unfortunately this would seem to impact actual connection timeouts as an undesired side-effect of trying to address Windows' bad treatment of RSTs, right?

Not directly as you can interpret an integer value of an existing configuration also as a float, but I would like to keep the value an integer. This should be doable the way I proposed (< 100 seconds, >= 100 milliseconds), but comments on this approach are welcome.

The range-based interpretation just seems too subtle to me, but I'm probably biased. I've used floating point seconds for configuration in my own server code after having been burned by having to switch from seconds to milliseconds to nanoseconds, etc. I'd rather hide the implementation's units from the user. I still use an integer when I wish to assert that nothing below a second is allowable and don't see any value in being able to specify 2.5 in addition to 2 and 3, though.

-- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote: On 10/13/2008 10:04 PM, Jess Holle wrote: Jess Holle wrote: Ruediger Pluem wrote: So if noone finds a registry entry to stop this RFC violating behaviour I'd love to see this solved by such a discovery, option 0. I see only two options on Windows: 1. Fiddle around with GetTcpTable. I've attached my incomplete code in this regard (as a diff against 2.2.9, which is what I used as the base for my changes) for what they're worth. There are TO_DO notes where I know I'm missing stuff. I tested basic use of GetTcpTable(), which solved the problem, but haven't completed my conversion to caching this data -- in part because I don't know where to allocate an lock to arbitrate access to this cached data. I forgot the -u on my diff. Here's a unified diff. Thanks for this. Given that it introduces a lot of platform specific code to the proxy and given the outstanding cache problem I would like to wait for Bill's proposal to improve apr_socket_connect within APR as this looks more appealing overall. If improving APR turns out to be not possible I would come back to your patch. That makes perfect sense to me. I was going to set the code aside for now myself for similar reasons, but wanted to share it before I forgot in case it turns out to be useful. [And, yes, I know the platform-specific bit in the middle of mod_proxy was rather ugly -- and requires 1 additional Win32 library be added to mod_proxy's VC++ config as well...] -- Jess Holle
Re: class loader in Apache Jserv and Apache HTTP server
AspectJ's documentation should give some coverage to wedging use of this ClassLoader into an existing app, but this really isn't the place for such a question. If it were Tomcat, I'd suggest the Tomcat user's group as some Tomcat user has likely done something similar, but there's no such resource for JServ at this point. Also I'd suggest just moving up to Java 5 (and a recent AspectJ version) and using the javaagent-based approach, which is a lot easier and cleaner than ClassLoader hackery. -- Jess Holle Ruediger Pluem wrote: On 10/13/2008 10:35 PM, jetpilot wrote: Hi All I'm trying to use Aspect's J load time weaving feature using WeavingURLClassLoader. Aspect that i write is defined on one method in JServConnection class in ApacheJServ module.Basically i need to set class loader(WeavingURLClassLoader, i guess using system property) to be used when Apache HTTP Server starts and load the classes.This way loader will weave the aspect.But i didn't find a way how to set this parameter, in jserv.properties there is wrapper.bin param where you specify java path, but how can i change default class loader being used. Apache Jserv is not part of the Apache HTTP Server project and it was retired long time ago and replaced with Tomcat (see http://jakarta.apache.org/site/retired-projects.html). So I can't give you a pointer where to post your question. Regards Rüdiger
Re: Speeding up mod_proxy_balancer on Windows
I've managed to create a workaround for this issue with GetTcpTable(). The only remaining issue I have is that I don't want to call this too often. I want to hold on to the data with a time-to-live during which I'll assume the data has not changed. That's all easy enough except for locking. That's easy at a logical level, but where can I allocate locks for such a thing so that I simply have a single lock per worker process (there's only one on Windows, of course, which is all I care about) allocated up front for the life of the process? I'd like to just use APR locks (possibly read/write), but to do so I clearly need to hook into the right place in the Apache life cycle and the right pool. -- Jess Holle P.S. Sorry for the stupid question -- the nuances of Apache lifecycle, pools, etc, are still clearly beyond me.
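The time-to-live idea in this message can be sketched in portable C, leaving the locking question aside. This is only a minimal sketch, not the code from the attached patch: the names port_snapshot, snapshot_is_fresh and snapshot_has_port are hypothetical, time is passed in as plain seconds rather than apr_time_t, and the port list is assumed to be already sorted and de-duplicated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical snapshot of locally bound ports (names are illustrative). */
typedef struct {
    long obtained_at;   /* time the snapshot was taken, in seconds */
    size_t n_ports;     /* number of entries in ports[] */
    const int *ports;   /* sorted, unique port numbers */
} port_snapshot;

/* Comparator shared by qsort()/bsearch() style callers. */
static int int_comparator(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* A snapshot may be reused while it is younger than ttl seconds. */
static int snapshot_is_fresh(const port_snapshot *snap, long now, long ttl)
{
    assert(snap != NULL);
    return (now - snap->obtained_at) < ttl;
}

/* Binary search over the sorted ports; bsearch() takes a pointer to the key. */
static int snapshot_has_port(const port_snapshot *snap, int port)
{
    assert(snap != NULL);
    return bsearch(&port, snap->ports, snap->n_ports,
                   sizeof(int), int_comparator) != NULL;
}
```

A caller would rebuild the snapshot only when snapshot_is_fresh() returns 0, which is the point at which a lock would be needed in a threaded server.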
Re: Speeding up mod_proxy_balancer on Windows
Perhaps I misunderstand things here, but isn't this connection timeout setting used for more than just timing out the initial formation of the connection? It would seem logical that there would be one connection timeout for forming the initial connection and another for timeouts of responses, etc, but I had understood this was not the case.

We currently have connection timeout set very, very large as otherwise we got timeouts when the backend URL was something very computationally intensive that took a long time to respond with data (e.g. a good number of minutes). That should seemingly be distinct from an initial connection timeout, but my understanding was that it is not. Am I just confused here?

-- Jess Holle

Matt Stevenson wrote:

Hi,

Sent this to the wrong address the first time. May have saved the GetTcpTable coding.

Here is a usec timeout fix, although I wouldn't go below 100 milliseconds without some testing under load. I'm not sure it's the perfect way to do it, but it avoids changing the connectiontimeout parameter to usec (still defaults to sec). Order is important: connectiontimeoutisusec must come after connectiontimeout. Ideas on better ways to do it welcome. I can see a need for timeouts less than a second outside the Windows case. Also included the non blocking patch without the ifdefs.

Regards
Matt

ProxyPass / balance://hotcluster/
<Proxy balance://hotcluster>
# below IPs are not reachable, acts like a down box (if timeout is small enough)
# 1 sec
BalancerMember ajp://192.168.0.23:7010 loadfactor=1 connectiontimeout=100 connectiontimeoutisusec=1
# 1 sec normal
BalancerMember ajp://192.168.0.24:7010 loadfactor=1 connectiontimeout=1
# 750 milli sec.
BalancerMember ajp://192.168.0.25:7010 loadfactor=1 connectiontimeout=75 connectiontimeoutisusec=1
BalancerMember ajp://localhost:8009 loadfactor=1 connectiontimeout=2
</Proxy>

Index: modules/proxy/proxy_util.c
===================================================================
--- modules/proxy/proxy_util.c	(revision 703688)
+++ modules/proxy/proxy_util.c	(working copy)
@@ -2358,9 +2358,17 @@
                      "proxy: %s: fam %d socket created to connect to %s",
                      proxy_function, backend_addr->family, worker->hostname);
 
+/* use non blocking for connect timeouts to work. The ifdef
+   limits to unix systems which have apr_wait_for_io_or_timeout.
+   TODO: remove the ifdef and see what works/breaks */
+
+apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 1);
+
         /* make the connection out of the socket */
         rv = apr_socket_connect(newsock, backend_addr);
 
+apr_socket_opt_set(newsock, APR_SO_NONBLOCK, 0);
+
         /* if an error occurred, loop round and try again */
         if (rv != APR_SUCCESS) {
             apr_socket_close(newsock);
Index: modules/proxy/mod_proxy.c
===================================================================
--- modules/proxy/mod_proxy.c	(revision 703688)
+++ modules/proxy/mod_proxy.c	(working copy)
@@ -291,6 +291,13 @@
         worker->conn_timeout = apr_time_from_sec(ival);
         worker->conn_timeout_set = 1;
     }
+    else if (!strcasecmp(key, "connectiontimeoutisusec")) {
+        /* change timeout to useconds */
+        ival = atoi(val);
+        if (ival == 1 && worker->conn_timeout_set == 1){
+            worker->conn_timeout = apr_time_make(0, apr_time_sec(worker->conn_timeout) );
+        }
+    }
     else {
         return "unknown Worker parameter";
     }
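The unit conversion in the mod_proxy.c hunk above can be traced with a small stand-alone sketch. It assumes APR's usual definitions (apr_time_t counts microseconds, apr_time_from_sec multiplies by 1,000,000, apr_time_sec divides by it, and apr_time_make(sec, usec) adds the two parts); the helpers below are local stand-ins with hypothetical names, not APR itself. Read literally, connectiontimeout=100 combined with connectiontimeoutisusec=1 comes out as 100 microseconds, which may not be what the comments in the sample configuration intend.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins mirroring the arithmetic of APR's time macros
 * (assumed: time values are counted in microseconds). */
typedef int64_t usec_time;
#define USEC_PER_SEC 1000000

static usec_time time_from_sec(int64_t sec) { return sec * USEC_PER_SEC; }
static int64_t time_sec(usec_time t) { return t / USEC_PER_SEC; }
static usec_time time_make(int64_t sec, int64_t usec)
{
    return sec * USEC_PER_SEC + usec;
}

/* Effective timeout in microseconds for connectiontimeout=value, with or
 * without connectiontimeoutisusec=1, following the logic of the patch. */
static usec_time effective_conn_timeout(int value, int is_usec)
{
    usec_time t;
    assert(value >= 0);
    t = time_from_sec(value);               /* parsed as seconds first */
    if (is_usec == 1) {
        t = time_make(0, time_sec(t));      /* same number, now microseconds */
    }
    return t;
}
```

Under these assumptions a 100 millisecond timeout would need connectiontimeout=100000 together with the flag.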
Re: Speeding up mod_proxy_balancer on Windows
Jess Holle wrote: Ruediger Pluem wrote: Did you check whether the currently running thread proxy_ajp connect timeout fix. (http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED] and http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED]) does fix your issue on Windows? I was watching this. I'll have to give this a try as it just became clear from the latest of these messages that the fix applied to Windows. I just tried this fix -- it didn't help. Also the lowest connection timeout one can set via the proxy config options is 1 second, right? And that's roughly what it takes for connection to each dead port to take anyway (just a little over a second). -- Jess Holle
Speeding up mod_proxy_balancer on Windows
I had previously discovered that mod_proxy_balancer takes over 1 second on Windows to determine that nothing is listening on the target port. This becomes problematic if you are balancing over a sparsely populated set of proxy ports. A Windows guru here found the Windows GetTcpTable which would appear to offer a quicker way to determine a port's status -- whereas doing the obvious thing and attempting to connect takes over a second to fail. I'd like to experiment with using this API to address this issue upon attempted formation of the first connection for a given worker one is balancing over. Can anyone suggest where I should look to do add such a call? Eventually this should presumably be an APR-level thing, but in the short term I'm just looking for where I can experiment with inserting it in an #ifdef in the proxy code -- and getting a little lost here, unfortunately. -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
P.S. Yes, I know this approach only has any hope of working when Apache and the proxy backends are on the same host. Jess Holle wrote: I had previously discovered that mod_proxy_balancer takes over 1 second on Windows to determine that nothing is listening on the target port. This becomes problematic if you are balancing over a sparsely populated set of proxy ports. A Windows guru here found the Windows GetTcpTable which would appear to offer a quicker way to determine a port's status -- whereas doing the obvious thing and attempting to connect takes over a second to fail. I'd like to experiment with using this API to address this issue upon attempted formation of the first connection for a given worker one is balancing over. Can anyone suggest where I should look to do add such a call? Eventually this should presumably be an APR-level thing, but in the short term I'm just looking for where I can experiment with inserting it in an #ifdef in the proxy code -- and getting a little lost here, unfortunately. -- Jess Holle
Re: Speeding up mod_proxy_balancer on Windows
Ruediger Pluem wrote:
> Did you check whether the currently running thread "proxy_ajp connect timeout fix." (http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED] and http://mail-archives.apache.org/mod_mbox/httpd-dev/200810.mbox/[EMAIL PROTECTED]) does fix your issue on Windows?

I was watching this. I'll have to give this a try as it just became clear from the latest of these messages that the fix applied to Windows.

> If httpd and the backends are running on the same machine this shouldn't take a second. The connect call should return immediately with an error code indicating that the connection was refused (if the port is down).

Yes, that's what I'd assumed. It has become clear from testing on multiple machines that this is not the case. Another engineer did testing with Windows APIs directly in a .NET app and came up with the same result -- over 1 second per port refusal.

> If not is it possible that there is a local firewall that causes this trouble?

I tried disabling that and other such steps. I certainly concur -- the refusal should be immediate and certainly /far/ faster than 1 second per port. I've been left wondering if this isn't an odd-ball hack by Microsoft to slow down remote port scans.

I'll give the timeout fix a try, but I'm not hopeful given the data so far.

-- Jess Holle
Pinging proxy backends
I need to add background pinging of proxy backends to Apache 2.2. [I'll contribute code back, but may or may not be able to get it into a form that is acceptable for inclusion in Apache.] Unfortunately, it is clear to me I do not have nearly the familiarity with the Apache APIs to make this possible in a reasonable time period. My C skills are a bit rusty as well, but I mainly have an API knowledge/comprehension issue at this point. I can create a background thread for this easily enough (thanks to reading through code by Mladen Turk), but otherwise I'm struggling. Can anyone give me some pointers on / code snippets for some basic pieces? For instance, how might I initiate a backend request on my own (rather than as part of normal request forwarding in mod_proxy)? What I'm really targeting is mod_proxy_ajp, so I'd really actually like to figure out is how to do a cping in this case rather than any sort of URL request (or should we just have a cping be a special ajp:// URL somehow?). I also need to figure out the same thing for the mod_jk IIS/Tomcat connector, unfortunately... As to why all this bother, as per an e-mail to this group some time back, I need to load balance over a sparsely populated range of ports. In other words, I'll only have a Tomcat actually running on a few of these ports at any given time. This actually works nicely on Linux, for instance, as things stand. Each dead port rejects requests almost instantaneously. On Windows, however, this can take over a second -- so whenever the worker retry interval expires a foreground request can end up trying all the dead ports and being delayed by # of dead ports seconds! If anyone has any ideas as to how to fix this issue with Windows instead, I am /all/ ears. I'd rather just keep the code simple and have reasonable TCP/IP stack behavior -- but I may just be dreaming in this regard. -- Jess Holle
mod_proxy_balancer enhancements
I am looking for 2 things mod_proxy_balancer cannot currently provide:

1. Something to limit the maximum impact of having many dead members under a load balancer on normal requests.
   * The process of discovering that dead workers are still dead shouldn't overtly impact any normal request (assuming there are live workers available)
   * Sample situation:
     o Load balancing over 10 ports, most of which do not have an active backend (Tomcat) associated at the time. If there is only 1 backend alive, every 'retry' seconds, a normal request is delayed by a period of 9*dead-connection-latency. That's neither necessary nor acceptable.
   * Possible solutions include:
     o Having an option to have a background thread ping the backends rather than allowing normal requests to do so.
       + In the case of mod_proxy_ajp, a cping would be preferred here, rather than a full request.
     o Limiting the number of workers any single normal request will attempt to recover
2. Something to reduce the severity of log messages when discovering that a dead worker is still dead.
   * There is no need to fill the error logs with notices that a worker that has been dead is still dead. This is good troubleshooting info and should be logged, but at a lower severity level that does not show up in the logs by default.
   * Depending on the solution to (1), this might just fall out of that.

I had already started a discussion along these lines on the Tomcat development mailing list, as I have the same needs for both mod_proxy_ajp (for Apache 2.2 front ends) and mod_jk/isapi (for IIS front ends). Mladen Turk kindly pointed me to some work he had recently done on trunk for mod_jk to add a background watchdog thread for periodic background work. He has also talked about adding a similar capability to Apache itself in the future. Jim Jagielski pointed out that the discussion should probably move over here as portions have impact on Apache itself.
I need solutions to these problems one way or another, so if nothing else I'll have to hack in something into our own fork of the code. I have a fair amount of time to solve these problems, however, so I'd much rather see them solved in a good, general way that can be a value-add part of both mod_jk and mod_proxy -- rather than a one-off fork. Ideally the solutions would be somewhat consistent as well, for everyone's sanity. Thoughts? Suggestions? -- Jess Holle
Re: Apache - MS LDAPSDK with multi-byte DN
Eric Covener wrote: 2008/7/16 Stusynski, Dan [EMAIL PROTECTED]: Hello devs, It would appear that the MS LDAP SDK has an issue when Apache is compiled against it. Our Apache 2.2.9 compiled with VC6 on Windows against the MS LDAP SDK seems to have an issue when searching for a DN that contains multibyte characters (non ascii), in this case a Chinese character. The ldap_search_ext_s(...) from util_ldap.c returns with a USER_NOT_FOUND. For example, assuming a user exists in LDAP with a UID=testMBUser with a DN: cn=t我st,cn=test,ou=test1,ou=people,cn=myLdapBranch,cn=TestEnvironment,o=testerGroup Have you tried this feature? Presumably your data in LDAP is utf-8, you just need to figure out what to convert _from_. http://httpd.apache.org/docs/2.2/mod/mod_authnz_ldap.html#authldapcharsetconfig I'm pretty sure the issue here is that a UTF-8 DN is being sent but the MS LDAP SDK garbles this. -- Jess Holle
Re: PR42829
Nick Kew wrote: As for maintaining local patches, he's not the only one doing that, and our license clearly allows it. Licenses that restrict such things seem to be widely disliked: c.f. DJB/qmail. We've made a concerted effort to supply all patches back, yet we always find that we maintain a few local patches. We don't want to, but there are various bits that we just never successfully pushed back for one reason or another, e.g.: * mod_authn_alias.dep/.dsp/.mak: changes for building on Windows o Not sure why [as I'm no longer doing these builds myself] could be to allow us to build with an older MS studio * mod_deflate.c: added support for a response header which will allow responses (e.g. from Tomcat) to dynamically opt out of compression o code was suggested on the Apache lists, but uninteresting to Apache trunk apparently * util_ldap*.c: still changing '#if APR_HAS_SHARED_MEMORY' to '#if 0' as last we checked the shared memory stuff was still unstable with the worker MPM -- at least on Solaris and AIX -- Jess Holle
Re: 2.2.9 status
Was a solution ever arrived at for proper handling of %3B (escaped ';') in URLs passed to Tomcat via mod_proxy_ajp? This and 8K AJP packet handling are sorely missing in mod_proxy_ajp. -- Jess Holle
Re: 2.2.9 (Was: Re: [PROPOSAL] Time Based Releases)
jean-frederic clere wrote: I have looked to #44803 in fact we need something like JkOptions +ForwardURIEscaped which means something that requires changes in both mod_rewrite and mod_proxy. I will propose a patch soon. Thank you! -- Jess Holle
Re: 2.2.9 (Was: Re: [PROPOSAL] Time Based Releases)
Plüm wrote: IMHO we already forward them escaped. The problem is that things get unescaped first and for reserved characters like ';' this process cannot be reverted. So if the original URL contained an escaped ';' the forwarded one will contain a literal ';'. With mod_proxy or better ProxyPass you already can get around this by specifying the nocanon option which causes the the original URL to be forwarded (much like JkOptions +ForwardURICompatUnparsed). That makes sense -- as we're only having issues with ; to the best of my knowledge. That said, we need to use the rewrite stuff shown in the bug (unless there's another way we're missing) to forward just the appropriate requests to Tomcat. Is there a way we're missing (that would ideally be clearly documented...) to avoid running into issues with ; when using mod_proxy and mod_rewrite as we are? If not, then one is really needed for ;. -- Jess Holle
Re: 2.2.9 (Was: Re: [PROPOSAL] Time Based Releases)
jean-frederic clere wrote: IMHO we already forward them escaped. The problem is that things get unescaped first and for reserved characters like ';' this process cannot be reverted. So if the original URL contained an escaped ';' the forwarded one will contain a literal ';'. With mod_proxy or better ProxyPass you already can get around this by specifying the nocanon option which causes the the original URL to be forwarded (much like JkOptions +ForwardURICompatUnparsed). No nocanon doesn't do that. it use url (in which the %3B is already converted in ;) instead the r-unparsed_uri. And that would be JK_OPT_FWDURICOMPATUNPARSED and not ForwardURIEscaped. To get ForwardURIEscaped we could call ap_escape_uri() on url. I can confirm that using ProxyPass and nocanon does not solve the problem -- I just tested this. -- Jess Holle
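The point that unescaping cannot be reverted for reserved characters can be shown with a small decoder. This is a minimal sketch with a hypothetical percent_decode helper, not httpd's own unescaping code, and the sample URLs are made up: once "%3B" has been decoded, the result is byte-for-byte identical to a URL that contained a literal ';', so no later re-escaping step can tell the two apart.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Value of a single hex digit; the caller guarantees c is a hex digit. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char) c) - 'a' + 10;
}

/* Decode %XX sequences from src into dst.
 * dst must have room for strlen(src) + 1 bytes. */
static void percent_decode(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    while (*src) {
        if (src[0] == '%' && isxdigit((unsigned char) src[1])
                          && isxdigit((unsigned char) src[2])) {
            *dst++ = (char) (hex_value(src[1]) * 16 + hex_value(src[2]));
            src += 3;
        }
        else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Decoding "/app/a%3Bb" and "/app/a;b" with this helper gives the same string, which is why forwarding the original, undecoded URI is the only way to preserve the distinction.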
Re: 2.2.9 (Was: Re: [PROPOSAL] Time Based Releases)
Jim Jagielski wrote:
> Can you try:
>
> Index: modules/proxy/mod_proxy_ajp.c
> ===================================================================
> --- modules/proxy/mod_proxy_ajp.c	(revision 648735)
> +++ modules/proxy/mod_proxy_ajp.c	(working copy)
> @@ -72,8 +72,13 @@
>      search = r->args;
>
>      /* process path */
> -    path = ap_proxy_canonenc(r->pool, url, strlen(url), enc_path, 0,
> -                             r->proxyreq);
> +    if (apr_table_get(r->notes, "proxy-nocanon")) {
> +        path = url;   /* this is the raw path */
> +    }
> +    else {
> +        path = ap_proxy_canonenc(r->pool, url, strlen(url),
> +                                 enc_path, 0, r->proxyreq);
> +    }
>      if (path == NULL)
>          return HTTP_BAD_REQUEST;

I don't do our Apache builds any more (and don't have things set up to do so), but our engineer who does is slated to test the patch attached to the bug soon. Is this the same as the patch attached to the bug report -- or a different one?

-- Jess Holle
Re: 2.2.9 (Was: Re: [PROPOSAL] Time Based Releases)
Jess Holle wrote:
> Jim Jagielski wrote:
>> Can you try:
>>
>> Index: modules/proxy/mod_proxy_ajp.c
>> ===================================================================
>> --- modules/proxy/mod_proxy_ajp.c	(revision 648735)
>> +++ modules/proxy/mod_proxy_ajp.c	(working copy)
>> @@ -72,8 +72,13 @@
>>      search = r->args;
>>
>>      /* process path */
>> -    path = ap_proxy_canonenc(r->pool, url, strlen(url), enc_path, 0,
>> -                             r->proxyreq);
>> +    if (apr_table_get(r->notes, "proxy-nocanon")) {
>> +        path = url;   /* this is the raw path */
>> +    }
>> +    else {
>> +        path = ap_proxy_canonenc(r->pool, url, strlen(url),
>> +                                 enc_path, 0, r->proxyreq);
>> +    }
>>      if (path == NULL)
>>          return HTTP_BAD_REQUEST;
>
> I don't do our Apache builds any more (and don't have things set up to do so), but our engineer who does is slated to test the patch attached to the bug soon. Is this the same as the patch attached to the bug report -- or a different one?

To be more clear: exactly which patch should we be testing?

-- Jess Holle
Re: 2.2.9 (Was: Re: [PROPOSAL] Time Based Releases)
Jim Jagielski wrote: This section is the same as that in the bug report (make mod_proxy_ajp aware of the nocanon EnvVar), but the attached patch also includes a workaround for the doubling of any query strings. This 2nd part needs to be addressed but the real fix may not be done by this patch. If you have no query args, then either is fine. We generally have query strings (and have to support the most general case due to the number of quite disparate pages being served), so we'll need the full patch. Thanks. -- Jess Holle
Re: 2.2.9 (Was: Re: [PROPOSAL] Time Based Releases)
Jim Jagielski wrote: Plus, every 3 months would coincide with the report-to-board cycle, making it easier for everyone to follow :) Next is due in May, so if we release this month, then we can follow a Release before the board report or else Release at board report cycle (with the caveat I noted before ;) ) As noted (and as I pinged a few people about in Amsterdam), I'm looking to push for a 2.2.9 soon. We have enough for sure warrant it... There is the ship 1.3.0 with 2.2.9 thread going on, but I do not want 2.2.9 to hold off too long for that... So I'd like to propose 2.2.9 for the end of this month. Sorry to butt in, but is there any hope of getting issue #44803 addressed in that timeframe? [The gap between mod_jk and mod_proxy_ajp in this and other areas (I don't believe it can set a longer packet size than 8K yet) is a bit troubling...] -- Jess Holle
Re: time for 1.3.40 and 2.2.7 ?
Guenter Knauf wrote: On 12/08/2007 04:04 PM, Ruediger Pluem wrote: Thanks folks for all the reviewing work done. From my perspective there is now nothing left between us and 2.2.7. Jim do you still volunteer to RM? I see a new small issue with mod_proxy_ajp which I've not yet tracked down; maybe my config is wrong, but now with recent code I see warnings when I start Apache which I didnt see with 2.2.6 with same config; I will try to track down this... Now that you bring up mod_proxy_ajp... Has the flexible packet size stuff been backported to 2.2.x yet? This stuff is important for some cases. mod_jk has it and I believe trunk does as well. -- Jess Holle
Re: time for 1.3.40 and 2.2.7 ?
Thanks! -- Jess Holle Mladen Turk wrote: Jess Holle wrote: Now that you bring up mod_proxy_ajp... Has the flexible packet size stuff been backported to 2.2.x yet? This stuff is important for some cases. mod_jk has it and I believe trunk does as well. It does, but don't know why it was limited to the 16384 bytes, and who committed that, while in mod_jk its 65536 and it works perfectly. I'll propose the backport for sure cause 64K is ajp protocol limit (and Tomcat will accept it) Regards, Mladen
Re: [PATCH] proxy/ajp_header.c: Fix header detection
Martin Kraemer wrote:
> On Thu, Aug 30, 2007 at 04:45:38PM +0200, Rainer Jung wrote:
>> I committed Martin's patch to mod_jk a couple of minutes ago. Thanks Martin! The Content-Type part of the patch didn't apply to mod_jk though.
>> ...
>> -if (memcmp(stringname, "Content-Type", 12) == 0) {
>> +if (strncasecmp(stringname, "Content-Type", 12) == 0) {
>
> That is good, because it was wrong... Of course we need the normal strcasecmp(stringname, "Content-Type"), not the one limited to 12 chars (think of "Content-TypeXYZ"). Already committed to trunk.

Backporting to 2.2.x?

-- Jess Holle
ProxyTimeout Revisited
Currently one can specify timeout on one's BalancerMember (e.g. with mod_proxy_ajp). Does this serve as both a connection and request timeout? If so, in the worst case I can use it to be both and thus set it for the latter (knowing it is ridiculous for the former).

I read the "Re: ProxyTimeout does not work as documented" (http://marc.info/?l=apache-httpd-dev&m=117986243317037&w=2) thread and am trying to figure out how to set connection and request timeouts for mod_proxy_ajp today with 2.2.4 -- and hopefully do so in a manner that is compatible with the resolution of this thread...

-- Jess Holle
Re: ProxyTimeout Revisited
Hmmm

The documentation says:

    timeout |Timeout| Connection timeout in seconds. If not set the Apache will wait until the free connection is available. This directive is used for limiting the number of connections to the backend server together with |max| parameter.

Yet the discussion thread at http://marc.info/?l=apache-httpd-dev&m=117986243317037&w=2 seems to say that timeout on the workers will be used as a connection and/or request timeout -- and then fall back to ProxyTimeout and finally to Timeout.

Essentially I'm looking to transform:

<Proxy balancer://ajpWorker>
BalancerMember ajp://localhost:8010 min=15 max=300 smax=30 ttl=900 keepalive=Off timeout=86400
</Proxy>

into something that will:

1. Wait essentially indefinitely for a response from a backend Tomcat (e.g. up to a day, 86400 seconds).
2. Wait up to a different time period, e.g. 180 seconds, to find a worker rather than immediately returning a 503.

It's not at all clear to me how both are achieved in 2.2.4 and how both will be achieved shortly when the changes discussed in the "Re: ProxyTimeout does not work as documented" thread are backed into a 2.2.x release (or if they will be).

-- Jess Holle

Rainer Jung wrote:
> As I understand mod_proxy_* and APR code, the BalancerMember timeout will set a timeout for individual read and write attempts to backend connections. So it correlates neither to an idle timeout on the connection (see ttl and smax) nor to a request timeout in the sense of a limit to the full request handling time. The timeout starts whenever something is expected to get read or written. So most of the time it should fire when retrieving the initial response packets takes longer than this timeout, or your backend starts to hang in the middle of request processing.
>
> Regards, Rainer
>
> Jess Holle wrote:
>> Currently one can specify timeout on one's BalancerMember (e.g. with mod_proxy_ajp). Does this serve as both a connection and request timeout? If so, in the worst case I can use it to be both and thus set it for the latter (knowing it is ridiculous for the former).
>>
>> I read the "Re: ProxyTimeout does not work as documented" (http://marc.info/?l=apache-httpd-dev&m=117986243317037&w=2) thread and am trying to figure out how to set connection and request timeouts for mod_proxy_ajp today with 2.2.4 -- and hopefully do so in a manner that is compatible with the resolution of this thread...
>>
>> -- Jess Holle
Re: ProxyTimeout Revisited
Okay, I'm still wondering about the future behavior based on the "Re: ProxyTimeout does not work as documented" thread (which is why I'm bothering the dev mailing list, since the thread is from there), but after some testing the current (2.2.4) behavior is clearly that:

1. If no timeout is specified on proxy workers:
   * They will wait indefinitely for a free connection to the back end.
   * They will wait for ProxyTimeout or 300 seconds for a response from the back end servers.
2. If a timeout is specified on proxy workers:
   * They will use this timeout as the time to wait for a free connection.
   * They will use this timeout as the time to wait for a response from the back end servers.

Everything but the last bullet item of (2) is crystal clear from the documentation. That last bullet was clear as mud to me from the docs, though.

-- Jess Holle
Re: ProxyTimeout Revisited
Ah, that would make sense -- but that's not what the docs say as you point out :-)

-- Jess Holle

Rainer Jung wrote:
> I think you need to make a distinction between the timeout *attribute* on a BalancerMember and the one on a balancer itself. At least the code makes the distinction (2.2.4).
>
> a) timeout for a BalancerMember (aka worker): timeout waiting for a read or write on an existing backend connection to complete.
>
> b) timeout for a balancer: if it can't get a connection from the pool, it will try again in intervals of timeout*1000/100 milliseconds until timeout seconds have expired (i.e. 100 times) or it has managed to get a connection.
>
> I think the documentation does not correctly document the code for the a) part!
>
> Regards, Rainer
>
> Jess Holle wrote:
>> Okay, I'm still wondering about the future behavior based on the "Re: ProxyTimeout does not work as documented" thread (which is why I'm bothering the dev mailing list, since the thread is from there), but after some testing the current (2.2.4) behavior is clearly that:
>>
>> 1. If no timeout is specified on proxy workers:
>>    * They will wait indefinitely for a free connection to the back end.
>>    * They will wait for ProxyTimeout or 300 seconds for a response from the back end servers.
>> 2. If a timeout is specified on proxy workers:
>>    * They will use this timeout as the time to wait for a free connection.
>>    * They will use this timeout as the time to wait for a response from the back end servers.
>>
>> Everything but the last bullet item of (2) is crystal clear from the documentation. That last bullet was clear as mud to me from the docs, though.
>>
>> -- Jess Holle
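The balancer-level retry arithmetic described in (b) can be worked through in a few lines. This is a sketch of the formula as stated in the message, not a copy of the mod_proxy_balancer source; the function names and the RETRY_STEPS constant are hypothetical.

```c
#include <assert.h>

#define RETRY_STEPS 100   /* number of attempts, per the description */

/* Interval between attempts to obtain a connection, in milliseconds,
 * for a balancer timeout given in seconds: timeout * 1000 / 100. */
static long retry_interval_ms(long timeout_sec)
{
    assert(timeout_sec >= 0);
    return timeout_sec * 1000 / RETRY_STEPS;
}

/* Total time spent waiting if every attempt fails, in milliseconds. */
static long total_wait_ms(long timeout_sec)
{
    return retry_interval_ms(timeout_sec) * RETRY_STEPS;
}
```

For the 180 second example mentioned earlier in the thread this gives one attempt every 1800 milliseconds, 100 attempts in all.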
Re: bug #42120: Apache improperly handling Path Component parameters?
Nick Kew wrote:
> On Fri, 13 Apr 2007 16:30:06 -0500 Andy Wang [EMAIL PROTECTED] wrote:
>> There are a number of potential workarounds (LocationMatch, or multiple Location blocks to deal with the ;* pattern) but it does seem like this is a bug unless someone can clarify RFC 2396 section 3.3 for me and explain why it isn't.
>
> Your reading of the RFC is correct but irrelevant. The semantics of Location are (like Directory) based on path components.

It's clear they are today, but is that really proper?

Java servlet engines have an encodeURL() API that is the standard means of cookie-less session passing and uses exactly this URL syntax. Thus anyone using Location in conjunction with Java servlet engines may be getting unexpected results. For instance, using Location /x/y/z to establish authentication via Apache will not establish authentication on /x/y/z;jsessionid=xxx -- yet the servlet engine will treat the two the same (except that the latter denotes a particular session). This is thus a security hole for the unaware (if they were using authentication to prevent access to those resources, not just to establish identity for use of them).

We can certainly add more Location blocks or use LocationMatch to handle /x/y/z;*, but this seems to be a clear disconnect between Apache's Location notion and that of both the RFC and Java servlet engines.

I realize that this is an old issue that has been left as is for years. We're running into it only now because we normally only allow cookie-based session passing but suddenly have cause to support this form as well in some corner cases. While we can work around the issue it would seem Location should simply be fixed.

-- Jess Holle
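The mismatch described in this message can be modelled with a simplified component-based matcher. This is only an illustrative model of the behavior discussed here, not httpd's actual Location matching code; location_matches is a hypothetical name, and the model assumes a location written without a trailing slash.

```c
#include <assert.h>
#include <string.h>

/* Simplified model: a location matches when the URL path equals it, or
 * continues with '/' immediately after it (a new path component). */
static int location_matches(const char *location, const char *path)
{
    size_t n;
    assert(location != NULL && path != NULL);
    n = strlen(location);
    if (strncmp(location, path, n) != 0)
        return 0;
    return path[n] == '\0' || path[n] == '/';
}
```

Under this model /x/y/z and /x/y/z/a are covered by a /x/y/z location, while /x/y/z;jsessionid=xxx is not, because the ';' parameter is treated as part of a different final component.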
Re: [Fwd: Re: Apache 2.2.3 mod_proxy issue]
Jim Jagielski wrote: Sounds good. On a related note, our practice with mod_jk is to route only *.jsp, /servlet/*, and a few other URL patterns to Tomcat and let Apache handle everything else. We also want to support load balancing with sticky sessions, of course. That combination is pretty easy and straightforward with mod_jk. It has been *baffling* with mod_proxy_ajp. Perhaps we just haven't spent long enough on mod_rewrite, etc, but so far we're not getting anywhere...

How about

    RewriteEngine On
    RewriteRule ^(.*\.jsp|/servlet/.*)$ balancer://mycluster$1 [P]
    <Proxy balancer://mycluster>
        ProxySet stickysession=JSESSIONID nofailover=On
        BalancerMember ajp://1.2.3.4:8009 route=tomcat1 max=10
        BalancerMember ajp://1.2.3.5:8010 route=tomcat2 max=10
    </Proxy>

Seems to me that we should simply make ProxyPass more pattern aware... We don't need a full regex for 95% of the cases, and so we'd have a nice faster impl. Needing to switch to (and load in) mod_rewrite for something that the proxy module should do itself seems backwards :)

Agreed.

By the way, ProxyPass /servlet/ balancer://mycluster does the 2nd part of what you want, it's just the '*.jsp' stuff we're missing... So basically, ProxyPass more JkMount-like...

Gotcha.

-- Jess Holle
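As a quick sanity check of which URL paths the suggested RewriteRule pattern would send to the balancer, the same regular expression can be tried outside of Apache. This is only a sketch: it uses Python's `re` module rather than the PCRE engine mod_rewrite uses (this particular pattern behaves the same in both), and the helper name is made up.

```python
import re

# The pattern from the RewriteRule above, copied verbatim.
ROUTE_TO_TOMCAT = re.compile(r"^(.*\.jsp|/servlet/.*)$")


def routed_to_backend(url_path):
    """Return True if the URL path would be proxied by the rule (hypothetical helper)."""
    return ROUTE_TO_TOMCAT.match(url_path) is not None
```

Paths such as /app/index.jsp and /servlet/Foo match, while /static/logo.png is left for Apache to serve. Note that a path like /app/index.jsp;jsessionid=abc does not match, since the pattern requires the path to end in .jsp; that is the same path-parameter wrinkle raised in the bug #42120 thread.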
Re: [Fwd: Re: Apache 2.2.3 mod_proxy issue]
Ruediger Pluem wrote: I guess we should create a directive like DefineWorker (I do not really care about the exact name), that enables the administrator to define / create a worker. That would be really handy for mod_rewrite as in the reverse proxy case the number of different backend targets are usually limited and known to the administrator. This would avoid the need for nasty tricks like pseudo balancers with only one member. Sounds good. On a related note, our practice with mod_jk is to route only *.jsp, /servlet/*, and a few other URL patterns to Tomcat and let Apache handle everything else. We also want to support load balancing with sticky sessions, of course. That combination is pretty easy and straightforward with mod_jk. It has been *baffling* with mod_proxy_ajp. Perhaps we just haven't spent long enough on mod_rewrite, etc, but so far we're not getting anywhere... -- Jess Holle
Apache 2.2.3 mod_proxy issue
When I use:

    RewriteCond %{REQUEST_URI} /jsp-examples/(.*).jsp(.*)
    RewriteRule ^/(.*) ajp://localhost:8010/$1 [P]

and pound on Tomcat 5.5.20's /jsp-examples/jsp2/el/basic-arithmetic.jsp through Apache (with only a single thread doing the pounding), I start to get 503's after a while. The error log says:

    [Fri Oct 27 10:49:42 2006] [error] (OS 10048)Only one usage of each socket address (protocol/network address/port) is normally permitted. : proxy: AJP: attempt to connect to 127.0.0.1:8010 (*) failed

This is on Windows. I then look at netstat and see *loads* of connections in TIME_WAIT. This would seem to strongly indicate that no connection pooling is being done -- at least not properly.

On the other hand, if I use:

    ProxyPass /jsp-examples ajp://localhost:8010/jsp-examples

This works fine! I assume I should file a bug against mod_proxy -- or is this a known issue? This has serious implications for those who route only JSP and servlet requests to Tomcat and let Apache serve all static content from a web app. Many Tomcat folk would thus advise not using Apache at all, but I want to use it precisely for its AJP-based load balancing. One might say to use Apache 2.0 and mod_jk, which work fine, but I need authentication against multiple LDAPs -- which is another feature 2.2 has over both 2.0 and Tomcat.

-- Jess Holle
Apache 2.x perf degradation on large downloads on Windows
I'm seeing what appears to be really severe performance degradation during the course of really large downloads (e.g. 800 MB) with Apache on Windows -- both 2.0.x (recent builds) and 2.2.3. Has anyone else seen this? Is this just a lack of tuning? If so, pointers would be appreciated. Note we're using: SendBufferSize 16384 and EnableSendfile Off. Before blaming the latter setting, however, I should point out that the problem we're seeing exists both for this case with simple static file downloads and for dynamic downloads through mod_jk and Tomcat (we've only tested this case with 2.0.x). The latter case is actually our real issue, but unless/until static file downloads don't show this degradation there seems to be little point in chasing the (more complex) dynamic case. -- Jess Holle
Re: Apache 2.x perf degradation on large downloads on Windows
Jess Holle wrote: I'm seeing what appears to be really severe performance degradation during the course of really large downloads (e.g. 800 MB) with Apache on Windows -- both 2.0.x (recent builds) and 2.2.3. Has anyone else seen this? Is this just a lack of tuning? If so, pointers would be appreciated. Note we're using: SendBufferSize 16384 and EnableSendfile Off. Before blaming the latter setting, however, I should point out that the problem we're seeing exists both for this case with simple static file downloads and for dynamic downloads through mod_jk and Tomcat (we've only tested this case with 2.0.x). The latter case is actually our real issue, but unless/until static file downloads don't show this degradation there seems to be little point in chasing the (more complex) dynamic case.

Also, enabling sendfile does not seem to make any difference to the results.

-- Jess Holle
Re: Apache 2.x perf degradation on large downloads on Windows
Jess Holle wrote: In some of my testing, Win32DisableAcceptEx seems to make a huge improvement, however...

Okay, I take that back...

Jess Holle wrote: Jess Holle wrote: I'm seeing what appears to be really severe performance degradation during the course of really large downloads (e.g. 800 MB) with Apache on Windows -- both 2.0.x (recent builds) and 2.2.3. Has anyone else seen this? Is this just a lack of tuning? If so, pointers would be appreciated. Note we're using: SendBufferSize 16384 and EnableSendfile Off. Before blaming the latter setting, however, I should point out that the problem we're seeing exists both for this case with simple static file downloads and for dynamic downloads through mod_jk and Tomcat (we've only tested this case with 2.0.x). The latter case is actually our real issue, but unless/until static file downloads don't show this degradation there seems to be little point in chasing the (more complex) dynamic case. Also, enabling sendfile does not seem to make any difference to the results.

-- Jess Holle
Re: 2.2.3
I personally would assume 2.2.3 is unrelated to stuff in trunk. I assume there must be a backlog of tweaks and fixes for 2.2.x by this point, however -- or was 2.2.2 that good? It would be really good to get bug #40051 resolved. This makes use of AuthnProviderAlias really painful -- as you have to keep sorting the aliases until nothing crashes! It would also be nice to see some of the mod_jk improvements merged into mod_proxy_ajp, but that's not going to happen overnight.

-- Jess Holle

William A. Rowe, Jr. wrote: Eli Marmor wrote: Hi, 3 months have passed since the last release; Is 2.2.3 expected soon? Thanks to your great efforts, there are exciting new features in the trunk, and it would be great to bring them to the masses... FYI - 2.2.3 doesn't have a lot to do with the cool stuff in trunk; for most of those you will be keeping your eyes peeled for 2.4.0 (and a 2.3.0 alpha even before that.)
Re: 2.2.3
I'm not asking for substantive changes in 2.2.x. I'm just hoping to see a steady (but not overly rapid) stream of updates to 2.2.x -- especially to address bug #40051 (short term) and mod_proxy_ajp's lack of the latest mod_jk features (mid term).

-- Jess Holle

William A. Rowe, Jr. wrote: Jess Holle wrote: I personally would assume 2.2.3 is unrelated to stuff in trunk. Sometimes it includes the same - the 2.2 branch changes are in; http://svn.apache.org/repos/asf/httpd/httpd/branches/2.2.x/CHANGES I assume there must be a backlog of tweaks and fixes for 2.2.x by this point, however -- or was 2.2.2 /that/ good? Stellar, but you knew that :) Seriously... if you want another trunk change that *doesn't require us to break .conf structure, binary compat rules, etc*, then see if it's in http://svn.apache.org/repos/asf/httpd/httpd/branches/2.2.x/STATUS and ask for someone to add it as a proposal, if not. Bill
Apache 2.2.2 + CGI - Child Exit?!?
With Apache 2.2.2 on Windows, we're getting the following error:

    Parent: child process exited with status 3221225477 -- Restarting.

This happens whenever URLs are requested that execute a CGI that was working 100% fine in Apache 2.0.x. This leads me to ask: How can a CGI program kill an Apache worker process? How can this happen only in Apache 2.2.2, but not in any 2.0.x that we've tried? The CGI in question is a 3rd-party application for which we have neither source nor official Apache 2.2.2 support. Yet I would think that CGI should just work irrespective, right? Or at least it shouldn't kill the Apache worker processes, right? Could this be an issue in 2.2.2's mod_cgi? To make matters weirder the issue went away for one developer, but he does not know what he did to make it go away nor can anyone else seem to reproduce this.

-- Jess Holle
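A note on the status value in that error message: on Windows, large exit statuses like this are usually NTSTATUS codes printed in decimal, and 3221225477 is 0xC0000005, the access-violation code. That would suggest the child process itself crashed, though it says nothing about where. A small sketch for decoding such values follows; the helper name and the short lookup table are illustrative only and far from a complete list.

```python
# A few well-known Windows NTSTATUS values that show up as process exit codes.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exit_status(status):
    """Render a decimal exit status as hex plus a name, if known (hypothetical helper)."""
    code = status & 0xFFFFFFFF  # also handles statuses reported as negative ints
    name = KNOWN_NTSTATUS.get(code, "unrecognized")
    return f"0x{code:08X} ({name})"
```

For the status above, `describe_exit_status(3221225477)` yields "0xC0000005 (STATUS_ACCESS_VIOLATION)".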
Re: Apache 2.2.2 + CGI - Child Exit?!?
Note: In case anyone thinks I posted to the wrong group, I'm really looking for:

* Developer-level info as to if/how a CGI program could cause an Apache worker process death in 2.2.2
* Developer-level pointers on where in mod_cgi we should place breakpoints, logging, or the like to try to determine what's going wrong -- or if this even makes sense

Of course if there is some out-of-the-box verbosity option I'm missing to help with this, I'm all ears. I used LogLevel debug, of course, but it told me nothing new.

-- Jess Holle

Jess Holle wrote: With Apache 2.2.2 on Windows, we're getting the following error: Parent: child process exited with status 3221225477 -- Restarting. Whenever URLs are requested that execute CGI that was working 100% fine in Apache 2.0.x. This leads me to ask: How can a CGI program kill an Apache worker process? How can this happen only in Apache 2.2.2, but not in any 2.0.x that we've tried. The CGI in question is a 3rd-party application for which we have neither source nor official Apache 2.2.2 support. Yet I would think that CGI should just work irrespective, right? Or at least it shouldn't kill the Apache worker processes, right? Could this be an issue in 2.2.2's mod_cgi? To make matters weirder the issue went away for one developer, but he does not know what he did to make it go away nor can anyone else seem to reproduce this. -- Jess Holle
Firewalls vs. Apache 2.2 mod_ldap
Both we and our customers have had issues with mod_ldap in Apache 2.0.x when one has a firewall between Apache and the target LDAP server that drops connections that have been idle for a long time. Has this been addressed in Apache 2.2.2? If not, are there plans to address it? mod_jk has licked this issue in its way (and I assume mod_proxy_ajp inherited this), but last I checked mod_ldap still had serious issues (e.g. hung and/or failed requests) when such connection drops occur. -- Jess Holle
Re: Apache 2.2.2 + CGI - Child Exit?!?
A little more troubleshooting shows that this is likely not an issue with the particular CGI in question. I say this because the issue goes away when we reorder several of our conf files. Even if this is our error, a child death without any warning or debug output due to a conf file ordering issue seems like a serious bug! We'll do more troubleshooting and hopefully narrow down to what the issue is, but these are not terribly complex conf files overall. The CGI conf file does a bit of ScriptAlias, Alias, Directory, AuthnProviderAlias, AuthLDAPURL, usage. The other conf files (those whose inclusion prior to the CGI conf file causes the issue) contain similar things except no ScriptAlias usage plus Location and some mod_rewrite usage. -- Jess Holle Jess Holle wrote: Note: In case anyone thinks I posted to the wrong group, I'm really looking for: Developer-level info as to if/how a CGI program could cause an Apache worker process death in 2.2.2 Developer-level pointers on where in mod_cgi we should place breakpoints, logging, or the like to try to determine what's going wrong -- or if this even makes sense Of course if there is some out-of-the-box verbosity option I'm missing to help with this, I'm all ears. I used LogLevel debug, of course, but it told me nothing new. -- Jess Holle Jess Holle wrote: With Apache 2.2.2 on Windows, we're getting the following error: Parent: child process exited with status 3221225477 -- Restarting. Whenever URLs are requested that execute CGI that was working 100% fine in Apache 2.0.x. This leads me to ask: How can a CGI program kill an Apache worker process? How can this happen only in Apache 2.2.2, but not in any 2.0.x that we've tried. The CGI in question is a 3rd-party application for which we have neither source nor official Apache 2.2.2 support. Yet I would think that CGI should just work irrespective, right? Or at least it shouldn't kill the Apache worker processes, right? Could this be an issue in 2.2.2's mod_cgi? 
To make matters weirder the issue went away for one developer, but he does not know what he did to make it go away nor can anyone else seem to reproduce this. -- Jess Holle
Re: Apache 2.2.2 + CGI - Child Exit?!?
Hmmm This seems to be something really funny with regards to use of AuthnProviderAlias, AuthLDAPURL, and AuthBasicProvider as commenting these things out causes the issue to go away. We'll hopefully be able to get more precise information so a bug can be filed (and ideally some initial notions as to what's really going on gathered). Jess Holle wrote: A little more troubleshooting shows that this is likely not an issue with the particular CGI in question. I say this because the issue goes away when we reorder several of our conf files. Even if this is our error, a child death without any warning or debug output due to a conf file ordering issue seems like a serious bug! We'll do more troubleshooting and hopefully narrow down to what the issue is, but these are not terribly complex conf files overall. The CGI conf file does a bit of ScriptAlias, Alias, Directory, AuthnProviderAlias, AuthLDAPURL, usage. The other conf files (those whose inclusion prior to the CGI conf file causes the issue) contain similar things except no ScriptAlias usage plus Location and some mod_rewrite usage. -- Jess Holle Jess Holle wrote: Note: In case anyone thinks I posted to the wrong group, I'm really looking for: Developer-level info as to if/how a CGI program could cause an Apache worker process death in 2.2.2 Developer-level pointers on where in mod_cgi we should place breakpoints, logging, or the like to try to determine what's going wrong -- or if this even makes sense Of course if there is some out-of-the-box verbosity option I'm missing to help with this, I'm all ears. I used LogLevel debug, of course, but it told me nothing new. -- Jess Holle Jess Holle wrote: With Apache 2.2.2 on Windows, we're getting the following error: Parent: child process exited with status 3221225477 -- Restarting. Whenever URLs are requested that execute CGI that was working 100% fine in Apache 2.0.x. This leads me to ask: How can a CGI program kill an Apache worker process? 
How can this happen only in Apache 2.2.2, but not in any 2.0.x that we've tried. The CGI in question is a 3rd-party application for which we have neither source nor official Apache 2.2.2 support. Yet I would think that CGI should just work irrespective, right? Or at least it shouldn't kill the Apache worker processes, right? Could this be an issue in 2.2.2's mod_cgi? To make matters weirder the issue went away for one developer, but he does not know what he did to make it go away nor can anyone else seem to reproduce this. -- Jess Holle
Re: IPV6 enabled on supplied Windows 32 binary?
Colm MacCarthaigh wrote: On Thu, Jul 13, 2006 at 10:47:15AM -0500, Jess Holle wrote: So what's the story with IPv6 on Windows? Works fine in every version of windows since 2000, although 2000 itself needs a kit and patching installed. Great. That covers all versions of Windows my employer cares about, so we can build our Apache with IPv6 always enabled. Are there some versions of Windows which always support it, but the headers we use for Windows don't detect this at build time? We don't use an autoconf-like system on Windows, so although we could detect this at build time, it's just something which has been left as a build option. I knew that but was wondering if there was some #if magic based on OS level defines instead. I can see why there is not, though -- that would result in requiring an older OS to build an Apache that is compatible with the older OS. We bundle our own Apache builds for a number of Windows OS levels and have customers who really want IPv6... I don't want to break anything for the rest of the customers, though. That's at your own risk :-) Obviously :-) By "break anything" I meant that it should still work despite the fact that they're actually only using IPv4, etc -- not that this would not uncover a bug -- which is always a risk. -- Jess Holle
Re: Integrated Authentication
This seemed to work fine last I tried it (with mod_jk).

Trent Nelson wrote: You're after NTLM support. There's a module floating around out there named 'mod_auth_sspi' that does this, although it can be a bit hard to track down (see http://www.gknw.at/development/apache/httpd-2.0/win32/modules/). Once loaded, set up a directive like this:

    <IfModule mod_auth_sspi.c>
        Alias /foo C:/bar/foo
        <Location /foo-auth>
            AuthName "Please Enter Your Logon Details"
            AuthType SSPI
            SSPIAuth On
            SSPIAuthoritative On
            SSPIOfferBasic On
            SSPIBasicPreferred Off
            require valid-user
        </Location>
    </IfModule>

By default, if the user uses IE, it'll automatically pick their details up without requiring them to log in. If they're using Firefox or some other browser that doesn't support NTLM, they'll have to log in manually with their Windows domain credentials. I've only ever used this from a Perl handler, so I'm not entirely sure what exactly in the request that it sets (perhaps someone could clarify?), but from the Perl handler, the login name was accessible from $r->user(). Note that the format includes the domain as well, i.e. 'LIME\tnelson'. Actually, I'd be interested to hear if anyone used this in conjunction with mod_jk, such that the user's Windows domain login name was available by the time it got to a servlet via request.getUserPrincipalName() or something. Anyone done that? The Java approach for enabling NTLM support w/ Tomcat directly seems nasty. Trent.

From: Sergio Stateri [mailto:[EMAIL PROTECTED] Sent: 12 April 2006 21:37 To: dev@httpd.apache.org Subject: Integrated Authentication

Hi, Is there any way to make Apache HTTP Server recognize the users of the operating system and put the user name in a system variable, like IIS with Integrated Authentication? (IIS puts the logged-in Windows user in the REMOTE_USER CGI variable.) Thanks in advance for any help, Sergio Stateri Jr. [EMAIL PROTECTED]
Re: [VOTE] Release 2.2.1 as GA
So does that mean if one grabs the 2.2.1 tar ball one will not be able to build on Windows? If so, that's not a very compelling tarball for those needing to support Windows. If 2.2.1 is being labeled as non-GA already then that's quite appropriate, move on to 2.2.2 as soon as possible. 2.2.1 shouldn't be labeled as GA if it does not build on Windows as is, though. -- Jess Holle William A. Rowe, Jr. wrote: On 4/1/06, Steffen [EMAIL PROTECTED] wrote: No go on win32: unresolved external symbol [EMAIL PROTECTED] referenced in function _show_compile_settings .\Release/httpd.exe : fatal error LNK1120 This is fixed on APR 0.9 / 1.2 branches and 1.3 trunk. Brad's fixed this on 1.2 branch and 1.3 trunk - and I have a pending request to him to ensure it's fixed on 0.9 branch as well. My thought is reroll aprutil, which i'll do tonight. The problem isn't with httpd; the problem's with the maintenance on two platforms, and that's what the release notes/CHANGES will say. Bill
Re: mod_proxy_ajp flushing
Is there any way we can manage to extend AJP1.3+ in Tomcat 5.5.x, mod_proxy_ajp, and possibly even mod_jk to *optionally* add explicit flush to the protocol *only* when one opts into it via a configuration option? Some of us need explicit flush, don't want a performance penalty, and see Tomcat 6 as a long ways off from a production support / usage perspective.

-- Jess Holle

Mladen Turk wrote: Jim Jagielski wrote: Any other comments about the patch? Should I just commit the revised one and we can tweak from there... +1 Although I still consider FLUSHING_BANDAID as useless. The closest we can get to the real meaning of Servlet spec out.flush() is to flush on each packet. FLUSHING_BANDAID will simply not flush if the servlet container continues to send the data after the flush packet. I hope we'll soon extend the AJP protocol spec with explicit flushing, but no matter if we did, since we have to be AJP1.3 compatible, the patch is needed. Regards, Mladen
Re: mod_proxy_ajp - The purpose of FLUSHING_BANDAID
I am not concerned with the form of this feature's delivery -- as long as it is well documented. I will say that having an explicit flush that flushes up through the web server is absolutely critical to certain use cases. Lack of this functionality one way or another will break our apps.

-- Jess Holle

Mladen Turk wrote: Hi, I would love that we remove the FLUSHING_BANDAID from the code because its concept breaks the AJP protocol specification. Instead of FLUSHING_BANDAID I propose that we introduce a new directive 'flush=on' that would behave like the most recent mod_jk directive 'JkOptions +FlushPackets'. The point is that the AJP protocol is packet based, so trying to mimic the 'stream' behavior is bogus. Furthermore, this (FLUSHING_BANDAID) does not resolve the explicit flush from the application server, because it takes care only of the transport rather than the spec. I know that for such cases we would need to extend the AJP protocol with explicit flushing, but for now the only solution is to have a directive that will flush on each packet. So, since FLUSHING_BANDAID has no particular use I'm asking the author to remove that code, so we can work on packet flushing rather than time-based flushing. Regards, Mladen.
Re: AW: mod_proxy_ajp - The purpose of FLUSHING_BANDAID
As someone who depends on such flushing I'd echo that we don't need flushing after every AJP packet -- just when we explicitly call flush().

Plüm wrote:

-----Original Message----- From: Mladen Turk

First: I am the author.

Hi, I would love that we remove the FLUSHING_BANDAID from the code because its concept breaks the AJP protocol specification.

I do not understand how this breaks the spec. There might be reasons to handle this differently, but I see no violation of the specs. The flushing bandaid simply tries to detect whether it needs to add a flush bucket or not after the data of *one* packet has been added to the brigade. So if buffering the data in the core output filter without the flushing bandaid or with flush=off does not break the spec and if setting flush=on does not break the spec, how does the flushing bandaid break this?

Instead of FLUSHING_BANDAID I propose that we introduce a new directive 'flush=on' that would behave like the most recent mod_jk directive 'JkOptions +FlushPackets'.

The drawback of this solution from my point of view is that
- The user has to configure it
- It is a bang bang switch: Either you flush after each packet or never do explicit flushing (apart from the case that the core buffer is filled).

BTW: mod_proxy_http also tries to be that intelligent, but it does not work as the EAGAIN handling does not work as expected and httpd always reads in blocking mode from the backend.

The point is that the AJP protocol is packet based, so trying to mimic the 'stream' behavior is bogus. Furthermore, this (FLUSHING_BANDAID) does not resolve the explicit flush from the application server, because it takes care only of the transport rather than the spec.

Ok, it does not do this exactly, but in most cases it works, because if you flush data explicitly on the application server it usually takes some time until you send the next data.

I know that for such cases we would need to extend the AJP protocol with explicit flushing, but for now the only solution is to have a directive that will flush on each packet.

That seems to be the final solution to me. Something like a SEND_BODY_FLUSH AJP message.

So, since FLUSHING_BANDAID has no particular use I'm asking the author to remove that code, so we can work on packet flushing rather than time-based flushing.

I am happy to discuss a better solution. As the name says it is a BANDAID :-). So I am keen on additional proposals / comments on this. As a summary from your side I see:
1. Extend AJP protocol [The desired target from my view].
2. Add an option to flush after every AJP packet [has some drawbacks from my point of view (see above)].

Regards Rüdiger
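To keep the options in this thread apart, here is a toy model of the flushing policies being compared: buffer until the output buffer fills (no flushing), flush after every packet (the proposed 'flush=on', like 'JkOptions +FlushPackets'), and flush only when the backend asks for it (the explicit-flush protocol extension that does not exist yet). This is a sketch for discussion, not mod_proxy_ajp code; the class, its fields, and the `explicit_flush` flag are all made up, and the time-based FLUSHING_BANDAID heuristic is deliberately not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class OutputBuffer:
    """Toy output buffer: records each write that would reach the client."""
    capacity: int                      # bytes buffered before a forced flush
    flush_each_packet: bool = False    # models 'flush=on' / +FlushPackets
    pending: bytearray = field(default_factory=bytearray)
    writes: list = field(default_factory=list)

    def send_packet(self, data, explicit_flush=False):
        """Add one packet; `explicit_flush` models a hypothetical flush message."""
        self.pending.extend(data)
        if (self.flush_each_packet or explicit_flush
                or len(self.pending) >= self.capacity):
            self.flush()

    def flush(self):
        """Push everything buffered so far to the client as a single write."""
        if self.pending:
            self.writes.append(bytes(self.pending))
            self.pending.clear()
```

With per-packet flushing every packet becomes its own write; with an explicit flush marker, small packets are still coalesced and only go out when the application actually called flush(), which is the behaviour asked for at the top of this message.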
Re: mod_proxy_ajp - The purpose of FLUSHING_BANDAID
I believe mod_jk added an explicit flush option rather than reverting the default to flushing -- as I believe we suddenly had to add this after our application stopped behaving properly and traced this issue back.

-- Jess Holle

William A. Rowe, Jr. wrote: Ruediger Pluem wrote: OTOH I guess we still have to convince some people to switch from mod_jk to mod_proxy_ajp. So I guess having a similar behaviour in mod_proxy_ajp as in mod_jk will ease this. Default for mod_jk is: No flushing. Then I'm confused, I thought this was reverted in the current mod_jk code to avoid exactly this problem. Default switched to flush, with the option of no flushing, after many people tripped over this 'bug' in mod_jk. Bill
Re: mod_proxy_ajp - The purpose of FLUSHING_BANDAID
For those in control of both endpoints, it would be nice to have a patch to enable an extension to AJP in Tomcat 5.5.x and Apache 2.2 -- rather than having to wait until Tomcat 6... Of course, ideally said Tomcat patch would allow one to toggle at runtime whether the extended protocol was used or whether it operated in compatibility mode. -- Jess Holle
Re: Win32 Port of Apache 2.2?
William A. Rowe, Jr. wrote: When Apache declares some tarball 2.2.0 released, it never changes. It won't change until a 2.2.1 is released. And 2.2.1 has not been released due to bugs that affect *ALL* platforms, not just your preferred platform. Just to be clear, Josh and I (who are coworkers) don't necessarily have a preferred platform. We have to build, ship, and support a consistent quasi-auto-configuring Apache on Windows, Solaris, AIX, and (soon) some Linux variants. We need all of the above to work and have solid official sources available. -- Jess Holle
Re: Win32 Port of Apache 2.2?
William A. Rowe, Jr. wrote: Jess Holle wrote: William A. Rowe, Jr. wrote: When Apache declares some tarball 2.2.0 released, it never changes. It won't change until a 2.2.1 is released. And 2.2.1 has not been released due to bugs that affect *ALL* platforms, not just your preferred platform. Just to be clear, Josh and I (who are coworkers) don't necessarily have a preferred platform. We have to build, ship, and support a consistent quasi-auto-configuring Apache on Windows, Solaris, AIX, and (soon) some Linux variants. Sounds familiar :) We need all of the above to work and have solid official sources available. Then use the httpd-2.2.0 tarball. If you are building all those platforms, We know full well you've tweaked those in order to get -most- of them to build properly to your requirements. Why would you expect win32 to be different? Actually we do very little tweaking at all on Solaris -- unless Josh has started doing so lately. AIX has a few gotchas, of course, primarily due to its special linking limitations, er, features. With Apache 2.0.x we do no tweaking to speak of from the official sources. This is not including cross-platform patches we apply, of course (which, yes, we provide back to reduce our maintenance load, but some just are apparently not of general interest, e.g. a special response header in mod_deflate to disable its operation, e.g. on a per-response basis from mod_jk, etc). Note that apr/build/lineends.pl and apr/build/fixwin32mak.pl make moving from a unix tarball to a dos file tree, and from .dsp's exported into make files into directory-independent make files quite trivial. That's good to know. The fact that a -rev2 even exists was to get more participation from win32 developers in order to ensure forward progress, for a clean 2.2.1 result. 
You can find the same quasi-official changes in http://www.apache.org/dist/httpd/patches/apply_to_2.2.0/ Perhaps there is some reason you didn't shout when the available candidates were posted to this list (or testers@) and it wouldn't build on win32 for you? I think that was a timing issue as to when Josh first started working on 2.2.0. The time to holler is then, not now, and will be again soon as 2.2.1 becomes available. httpd's success or lack thereof is directly proportional to how many people get involved, and get involved early, from all walks of work and life. Understood. -- Jess Holle
Re: Apache 2.2.0 on Win32
William A. Rowe, Jr. wrote: In fact we are continuing on with Visual Studio 6, for the time being, so folks can continue to use various things like the modperl etc built upon ActiveState.

Okay, that was not quite clear and is helpful to know. So you're saying that if we build with the latest MS dev/net studio, modules built with MS VC++ 6 won't work? It would be good for the overall Apache community to know what they should build 2.2 with for maximum compatibility with non-open-source modules built by others in the community. Another driver, of course, is leveraging the best compiler/optimizer possible without sacrificing too much compatibility. What's the strategy here?

-- Jess Holle
Re: Apache 2.2.0 for Windows
Also the build environment issues Josh brings up (missing mod_authn_alias project, lack of official Windows source ball, references to 2.1 rather than 2.2, etc) should be addressed irrespective of Apachelounge binaries. Anyone with appropriate tools should be able to build a Windows binary -- including mod_authn_alias -- and this should not be harder than with Apache 2.0.x.

-- Jess Holle

Joost de Heer wrote: Apachelounge has a binary available, which you can download after registering. This isn't an official build however. The binary at the Apachelounge is built with the official sources. And includes mod_authn_alias and mod_ssl. My interpretation is that there is a difference between 'an official build' and 'a build from the official sources'. Joost
Re: OT: performance FUD
Paul A Houle wrote: Jess Holle wrote: So if one uses worker and few processes (i.e. lots of threads per), then Solaris should be fine? That's what people think, but I'd like to see some numbers. I've never put a worker Apache into production because most of our systems depend on PHP or something else which I wouldn't trust 100% in a threaded configuration. That's understandable if you're in that boat. We bundle and support our own Apache builds with our products. Our only dynamic content comes from mod_jk (and thus will come from the proxy AJP module in 2.2), so threading is all well and good. Given that most of our content is dynamic and thus via AJP, Apache performance is never really the issue -- if anything above the application code itself is ever an issue it is the extra hop involved with AJP, but there are clear load-balancing, security, etc, benefits from this architecture. Customers seem to consistently assume that using Apache is giving (substantively) lower overall performance than they'd get with something else, though -- chalk that up to good marketing by Microsoft, Sun, et al. As for the big file issue you note, that would only seem to be a big issue when coupled with slow connections -- which are getting rarer these days -- and much more of an issue with prefork than worker. -- Jess Holle
Re: [vote] 2.2.0 tarballs
Once 2.2 is released we'll be working to use it -- and distribute it with our products -- on Windows, Solaris, and AIX. I throw in patches relevant to these platforms when possible, but I don't have the time or interest in native (non-Java) code anymore to help out more. -- Jess Holle William A. Rowe, Jr. wrote: Jim Jagielski wrote: Joe Orton wrote: Win32 is not special. It's a second-class citizen if anything because it gets so little developer attention. Now *that's* a statement for the Release Notes :) Absolutely, add to this list AIX, OS2, Netware, BeOS, HPUX and many others. Not to mention OS/390, BS2000 and several others I don't think we can build on since 1.3. Perhaps the Apache HTTP Server for Linux 2.6/Solaris 10/BSD 4 would be a more appropriate name for this project, based on the current community participation, as long as we are going for Truth in Advertising. Of course there are maintainers for each of those 'others', but since active development has become nothing but Linux/Solaris/BSD we should specify supported platforms, not bother to list the dozens of platforms that are not as closely maintained. Bill
Re: [vote] 2.2.0 tarballs
I'm no committer but must concur -- until the build runs cleanly on Windows, 2.2.0 should not go out the door. Not everyone may like it, but Windows is a major Apache usage platform these days. -- Jess Holle Nick Kew wrote: On Tuesday 29 November 2005 08:32, Paul Querna wrote: Paul Querna wrote: These tarballs are identical to 2.1.10 except for two changes: * include/ap_release.h Updated to be 2.2.0-release * The root directory was changed from httpd-2.1.10 to httpd-2.2.0 Okay, I lied, slightly: * svn r348009: Added AP_DECLARE to mod_dbd exported functions. No functional changes for most operating systems. Yow! That reminds me: that was in response to someone complaining of a build failure on Windows, and he said it *still* failed with AP_DECLARE. Can someone with Windows *please* look at this? -1 for GA while this is outstanding! http://marc.theaimsgroup.com/?l=apache-httpd-dev&m=113266737311013&w=2
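For readers wondering why adding AP_DECLARE would be "no functional change" on most platforms yet matter on Windows: an export-decoration macro of this kind typically expands to nothing special on Unix, while on Windows it marks the symbol for export from the DLL. The following is only a rough sketch under that assumption. DEMO_DECLARE and demo_dbd_pool_size are hypothetical names made up for illustration; they are not httpd's actual macro definition or real mod_dbd symbols.

```c
#include <assert.h>

/*
 * DEMO_DECLARE is a hypothetical stand-in for an export macro such as
 * httpd's AP_DECLARE, spelled out here so the sketch compiles without
 * the httpd headers.
 */
#if defined(_WIN32)
/* On Windows the symbol must be explicitly exported from the DLL. */
#define DEMO_DECLARE(type) __declspec(dllexport) type __stdcall
#else
/* Elsewhere the macro reduces to the bare return type. */
#define DEMO_DECLARE(type) type
#endif

/* Hypothetical exported function; not a real mod_dbd symbol. */
DEMO_DECLARE(int) demo_dbd_pool_size(int requested)
{
    assert(requested >= 0);            /* a negative size makes no sense */
    return requested > 0 ? requested : 1;
}
```

On a Unix build the decorated and undecorated declarations compile to the same thing, which would be consistent with the remark that most operating systems see no functional change.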
Re: [vote] 2.2.0 tarballs
Colm MacCarthaigh wrote: On Tue, Nov 29, 2005 at 05:53:52AM -0600, Jess Holle wrote: I'm no committer but must concur -- until the build runs cleanly on Windows, 2.2.0 should not go out the door. Not everyone may like it, but Windows is a major Apache usage platform these days. mod_dbd isn't included in the win32 build environment yet, so it has no effect on a standard build. mod_dbd may or may not become available within the win32 build environment during the life of 2.2, but I don't think this should hold up GA. It's often the case that some modules or support utilities lag behind on win32. Ah... Sorry to jump the gun. I'm anxious to start the move to 2.2 on various platforms (Windows, Solaris, and AIX). -- Jess Holle
Re: [vote] 2.2.0 tarballs
Joe Orton wrote: On Tue, Nov 29, 2005 at 02:03:59PM +0100, Steffen wrote: Build with no issue here on Windows, except mod_authn_dbd and mod_dbd. In the change log: *) Add mod_authn_dbd (SQL-based authentication) [Nick Kew] I agree with Jess: 2.2.0 should not go out the door until we can build mod_authn_dbd and mod_dbd on Windows. It's pretty silly for anybody to suddenly wake up and declare some random bug as a showstopper for 2.2. Nobody has cared enough about the problem to fix it in the six months and four(?) 2.1.x alpha/beta releases that mod_dbd has been in the tree. So it clearly isn't really very critical to anybody, and isn't showstopper material. As I noted in my previous e-mail, I was over-reacting as I did not understand this module was simply not part of the build on Windows yet. Steffen's thoughts may be quite different than mine on this matter, but I'd say go ahead and go for 2.2.0 if this is the biggest issue out there. [I'm much more concerned about authentication against multiple LDAPs than anything else in the authentication arena.] -- Jess Holle
Re: [vote] 2.2.0 tarballs
Jim Jagielski wrote: Joe Orton wrote: On Tue, Nov 29, 2005 at 02:03:59PM +0100, Steffen wrote: Build with no issue here on Windows, except mod_authn_dbd and mod_dbd. In the change log: *) Add mod_authn_dbd (SQL-based authentication) [Nick Kew] I agree with Jess: 2.2.0 should not go out the door until we can build mod_authn_dbd and mod_dbd on Windows. It's pretty silly for anybody to suddenly wake up and declare some random bug as a showstopper for 2.2. Nobody has cared enough about the problem to fix it in the six months and four(?) 2.1.x alpha/beta releases that mod_dbd has been in the tree. So it clearly isn't really very critical to anybody, and isn't showstopper material. According to: http://httpd.apache.org/docs/2.1/new_features_2_2.html mod_dbd is explicitly mentioned as a new feature of 2.2 and, therefore, a compelling reason to upgrade. Either we stop referring to mod_dbd as something special enough to warrant special attention as a core enhancement or we fix it so it *is* one. That is a good point. Truth in advertising (as best as can be managed) will only help -- and lack thereof only hurt... -- Jess Holle
Re: [vote] 2.2.0 tarballs
We don't until the first GA release, but from there on out we compile just about every release ourselves as we often end up applying our own patches when we find issues (submitting them back, of course) and we do our own cross-platform installation packaging, automated configuration, etc, of Apache for our customers (so the raw build result is more useful). -- Jess Holle Joost de Heer wrote: Win32 is not special. It's a second-class citizen if anything because it gets so little developer attention. And how many people compile the thing on Windows anyway, except the msi builder? My guess is that I need about 2 hands to count them Joost
Re: OT: performance FUD
Paul A Houle wrote: Justin Erenkrantz wrote: If it's on equivalent hardware (i.e. Linux/Intel vs. Solaris/Intel on the same box), I doubt there will be an extreme performance gap. In fact, I've often seen Solaris outperform Linux on certain types of loads. In my experience, a lot of Linux network card drivers are sub-standard; if it's supported by Solaris, there's a fair chance the driver takes full advantage of the hardware. (Netgear GigE drivers on Linux are abysmal.) -- justin I think the issue with Apache/Solaris is that process switches take a long time on Solaris. So if one uses worker and few processes (i.e. lots of threads per), then Solaris should be fine? -- Jess Holle
Re: Shared memory on Win (Was: ldap crash on exit)
I'd long since given up and been patching all the mod_*ldap stuff to pretend shared memory does not exist on Windows. This seems to work fine. While it would be great to have everything work similarly and well on all the platforms, what's the real downside here given that there is only one worker process on Windows? Also, not using shared memory allowed us to keep using a local read/write lock rather than a global lock for a while, but maintaining this diff became unwieldy over time, so I gave up on this. -- Jess Holle Graham Leggett wrote: Michael Vergoz wrote: also note that mod_auth_ldap is experimental in Apache. Only in v2.0 - mod_ldap and mod_authnz_ldap are no longer experimental in v2.2, which is imminent for release. If someone can find a fix for this it would be very cool. Regards, Graham
Re: Shared memory on Win (Was: ldap crash on exit)
Dropping the cache upon a graceful seems like a small price to pay to me, but I can see others begging to differ... William A. Rowe, Jr. wrote: Jess Holle wrote: I'd long since given up and been patching all the mod_*ldap stuff to pretend shared memory does not exist on Windows. This seems to work fine. While it would be great to have everything similarly and well on all the platforms, what's the real downside here given that there is only one worker process on Windows? Actually, what's the downside if they configure one process, 400 threads on solaris? Shouldn't we take the same optimization? Doesn't Netware share this issue (sorry, I'm still foggy on the threads-as-processes deal over on the Netware server.) ap_mpm_query will tell us if we have 'sibling' processes. Simple flag, simple exception. The downside I can think of off hand is that if the current cache has been designed to handle gracefuls - then poof, your cache doesn't persist, and that's especially bad for anyone who's set MaxRequests 1 to mop up any crufty third party module leakage. Bill
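The "simple flag, simple exception" idea above could look roughly like the sketch below. It is only an illustration: cache_needs_shared_memory is a hypothetical helper, not anything in mod_ldap, and the choice of AP_MPMQ_MAX_DAEMONS as the query code is an assumption, since the thread does not say which ap_mpm_query code Bill had in mind. The daemon count is passed in as a plain parameter so the sketch compiles without the httpd headers.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of mod_ldap: decide whether a cache needs
 * cross-process shared memory. In a real module, max_daemons might be
 * filled in by ap_mpm_query(AP_MPMQ_MAX_DAEMONS, &max_daemons) (the query
 * code is an assumption); here it is an ordinary argument.
 */
bool cache_needs_shared_memory(int max_daemons)
{
    assert(max_daemons >= 0);   /* a negative process count is nonsense */

    /*
     * With at most one child process there are no 'sibling' processes,
     * so a process-local cache guarded by a local read/write lock is
     * enough. The trade-off raised above still applies: such a cache
     * does not survive a graceful restart.
     */
    return max_daemons > 1;
}
```

Under this sketch a single-process setup (Windows, or one process with 400 threads on Solaris) would take the local-cache path, while a multi-process configuration would keep using shared memory.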