Re: dynamically change max client value
Based on my experience, this wouldn't be a high-quality solution, it would be a hack. I've seen very few cases where load spiked enough to be an issue, but was transient enough that a solution like this would work - and in those cases, plain old Unix multitasking normally suffices.

What happens if you implement the solution anyhow is that you get a bunch of users stuck in the ListenBacklog, so they'll wait a couple of minutes before their request even starts. If you have a deep backlog, requests just pile up and the machine never gets its head above water. In the worst case, clients will time out while their request is still in the backlog, but since you don't find that out until you send a response which writes out to the network, you can very easily do work that can never be delivered. Beyond all that, the user experience simply _sucks_.

[Yes, I've done what you suggest, just not using the implementation you suggest. It's integrated into an existing custom module; you could also probably do it with a reverse proxy. In the end, it was not a productive solution.]

What I think you really want is a module that will intercept all requests, and send back "The server is really busy, try again in five minutes" if the server is too busy by some measure. You generally want this to be a super-low-cost option, so that you can spin through requests very quickly. Optimally, no externally-blockable pieces (no database connections, no locking filesystem access, etc).

One relatively simple option might be to use Squid, and a URL redirector which implements the magic check. If the machine is not busy, send the request through to the real server; if the machine is busy, redirect to a URL which will deliver your message.

[Again, yes, I've done this in Apache 1.3, but in code targeted to our custom modules. You could certainly do it more generically, I just haven't had the need. You might check mod_backhand.]
Later, scott On Mon, 4 Nov 2002, David Burry wrote: > I realize that allowing _everything_ to be dynamically configured via > SNMP (or signal or something) would probably be too substantial of an > API change to be considered for the current code base, but it would be > nice to consider it for some future major revision of Apache > > And it would be more than just "nice" if at least the max client value > thing could be somehow worked into the current versions of Apache... > There is a current very real and very large problem that could be solved > by this, not just a "nice to have" feature. This is what I meant to > emphasize in my original email... > > Dave > > - Original Message - > From: "Dirk-Willem van Gulik" <[EMAIL PROTECTED]> > To: <[EMAIL PROTECTED]> > Sent: Monday, November 04, 2002 9:35 AM > Subject: Re: dynamically change max client value > > > > > > In my ideal world every config directive would be able to advertise or > > register an optional 'has changed' hook. Which, if present, would be > > called in context whenever a value is somehow updated (through snmp, a > > configd, signal, whatever). If there is no such hook, the old -update- on > > graceful restart is the default (though it sure would be nice to have some > > values also advertise that they need a full shutdown and restart). > > > > Of course one could also argue for not just a put but also for a 'get' > > interface in context :-) > > > > Dw > > > > On Mon, 4 Nov 2002, David Burry wrote: > > > > > Recently there has been a little discussion about an API in apache for > > > controlling starts, stops, restarts, etc... > > > > > > I have an idea that may help me solve a problem I've been having. The > > > problem is in limiting the number of processes that will run on a > machine to > > > somewhere below where the machine will keel over and die, while still > being > > > close to the maximum the machine will handle.
The issue is depending on > > > what the majority of those processes are doing it changes the maximum > number > > > a given machine can handle by a few orders of magnitude, so a > multi-purpose > > > machine that serves, say, static content and cgi scripts (or other > things > > > that vary greatly in machine resource usage) cannot be properly tuned > for > > > maximum performance while guaranteeing the machine won't die under heavy > > > load. > > > > > > The solution I've thought of is... what if Apache had an API that could > be > > > used to say "no more processes, whatever you have NOW is the max!" or > > > otherwise to dynamically raise or lower the max number (perhaps "oh > there's > > > too many, reduce a bit") You see, an external monitoring system > could > > > monitor cpu and memory and whatnot and dynamically adjust apache > depending > > > on what it's doing. This kind of system could really increase the > > > stability of any large Apache server farm, and help keep large traffic > > > spikes from killing apache so bad that nobody gets served anything at > all. > > > > > > In fact
Re: A suggested ROADMAP for working 2.1/2.2 forward?
On Thu, 17 Oct 2002, William A. Rowe, Jr. wrote: > With the simultaneous release of Apache 2.1-stable and Apache > 2.2-development, the Apache HTTP Server project is moving to a more > predictable stable code branch, while opening the development to forward > progress without concern for breaking the stable branch. This document > explains the rationale behind the two versions and their behavior, going > forward.

This is great. I'm our "Apache guy", and 2.0 has been a non-starter. I can fairly easily keep up with the 1.3 changes, because it doesn't involve revision to our codebase, so we get the best of both worlds.

What I think this arrangement would allow me to do is make local adjustments to our 2.1 codebase, and if they prove out in production, I can repackage them as a patch to 2.2. Right now, the likelihood that I'll contribute to the most current development tree is nil, because it's just too different from where most of my work is done.

Excellent, scott
Re: [PATCH] Alerting when fnctl is going bad
On Wed, 25 Sep 2002, Sander van Zoest wrote: > On Wed, 25 Sep 2002, Justin Erenkrantz wrote: > > On Thu, Sep 26, 2002 at 02:11:59AM +0200, Dirk-Willem van Gulik wrote: > > > ->Makes the wait loop no longer endless - but causes it > > > to bail out (and emit some warnings ahead of time) after > > > a couple of thousand consecutive EINTRs. > > Placing a 'magic number' on how many EINTRs is 'failure' doesn't > > seem right. -- justin > > Although, things like these have been done many times in the past. > Especially in BSD. As long as the number is high enough to where there > doesn't seem to be an obvious reason to go above that, then I do not see > why not.

Is there any other way to detect that the fcntl() is bad, other than "we got more than X EINTRs"? For instance, in this case I'm guessing it's related to interrupts due to the lockfile being on a network filesystem of some sort, and it looks like you could have a server run fine for a couple of days, then drop itself for no obvious reason.

Every time I've ever seen code which does something different after getting "too many" EINTR responses, and later rolled that code into a production environment, it's turned out to be wrong. EINTR never seems to happen in development environments :-). If you're getting "too many" EINTR results, in my experience it means that the code isn't handling errors correctly somewhere else, and it will bite you in other ways, so nowadays I always go looking for the _real_ problem.

[Unless, of course, the OS itself has a bug. But that should definitely involve conditional code to fix.]

Later, scott
RE: E-Kabong resolution: Re: acceptance of El-Kabong into APR
On Thu, 12 Sep 2002, Harrie Hazewinkel wrote: > --On Thursday, September 12, 2002 8:50 AM -0500 "Jenkins, David" > > I disagree almost completely. If you are truly dedicated to the ASF > > community, you will understand the cautiousness necessary in deciding > > who has commit privs. > > I was mainly thinking of bigger pieces of code - code component - and > for those there are mostly also maintainers needed. Those maintainers > are mostly first the donator. For small patches I agree not everyone > should get commit access.

I think it's important to keep in mind that being part of the Apache deliverables is not the only option. Contributors can always spin up their own external open source project, as was done for mod_ssl, mod_perl, mod_php, etc. Yes, this places more of a burden on the contributor, but that's fair in cases where the contributor desires to maintain tighter control. A side effect of this is that if the component becomes popular, integrating it becomes more compelling.

Later, scott
Re: El-Kabong -- HTML Parser
[I am not an Apache contributor, merely a lurker, but...]

On Tue, 10 Sep 2002, Jon Travis wrote: > These are not coercive tactics. These are processes which are > beneficial to both the ASF and Covalent. I cannot continually monitor > the progress of this project for eternity. I'm astonished that this > deadline email has caused such a response. This sets an extremely bad > precedent for other companies (or anyone for that matter) who wants to > contribute to the ASF. > > Personally (Covalent hat off), it's a bummer that this is your response > to the donation. I was the one who originally proposed it to > management, they agreed to it, and now I've gotten involved in all kinds > of politics and inflammatory emails. That's a long way from being > excited about contributing to the ASF, and sadly seems like more > trouble than it's worth.

As I said earlier: if all you want is to contribute the code, put a compatible open source license on it and put it on a publicly accessible website, somewhere. From following the thread, I get the feeling you don't want to contribute it, you want someone to take ownership of it.

A couple points:

1) Everyone here has a real-life job.
2) Many of those jobs don't involve Apache directly.
3) Anyone who's writing code has their own pet projects they want done.
4) Anyone without a pet project has a choice of dozens/hundreds of abandoned/unmaintained projects to work on.
5) Integration work is hard work.

If you really want the ASF to pull this project into the Apache core, your best bet is to volunteer to integrate it and write some example code. After all, you're the one with the code, you're the one who wants to contribute it to the community.

This isn't specific to the Apache group. This is just how open source software works. And this basic thread happens every couple of months on every open source project I monitor. As far as inflammatory emails, you must be reading lists that I don't have access to, because I haven't seen it.
Given that you've essentially asked the community to prove that it's worthy of accepting your contribution, I'm actually surprised the responses have been so calm. Later, scott
Re: El-Kabong -- HTML Parser
I'm not sure I understand what your goal is, here. The discussion seems to be +1 for including your parser somewhere in some Apache project in the future, there's just no clear consensus on where. Is there any reason you can't just release your project under the ASF license and be done with it?

Later, scott

On Mon, 9 Sep 2002, Jon Travis wrote: > Ok, since I'm not seeing any activity towards getting this > integrated, I'd like to set a deadline. This would help > me out, since it gives direction as to where the project > can go, as well as the ASF since political discussion shouldn't > weigh down the process. It will just save us all a lot of > time & energy. > > Anyway, I'd like to give an additional week to the ASF > to deal with the code. Next Monday, if it hasn't been > decided I'll look into other options. > > -- Jon > > > On Mon, Sep 09, 2002 at 10:36:21AM -0700, Jon Travis wrote: > > Time for another ping. It's been 2 weeks. Any word? > > > > -- Jon > > > > > > On Mon, Aug 26, 2002 at 08:32:16PM -0700, Jon Travis wrote: > > > Hi all... > > > Jon Travis here... > > > > > > Covalent has written a pretty keen HTML parser (called el-kabong) > > > which we'd like to offer to the ASF for inclusion in APR-util (or > > > whichever other umbrella it fits under.) It's faster than > > > anything I can find, provides a SAX stylee interface, uses > > > APR for most of its operations (hash tables, etc.), and has a > > > pretty nice testsuite. We use it in our code to re-write HTML on > > > the fly. I would be the initial maintainer of the code. > > > > > > Please voice any interest, thanks. > > > > > > -- Jon > > > >
Re: Thread-unsafe libraries in httpd-2.0
On Thu, 15 Aug 2002, William A. Rowe, Jr. wrote: > There's no reason to bloat all of Apache and its well-behaved modules > with extra code, when only a handful of modules care to report that they > can't be compiled for a threaded architecture.

The strict engineer in me agrees. The pragmatic engineer in me realizes that threading issues are hard, and that you're going to get more false positives (modules allowed to run that shouldn't be) if you make threading opt-out rather than opt-in. It's not like this code (or flag) has to be handled on every request.

[Just in case that wasn't clear - modules should indicate that they are thread-safe, else the threaded MPMs should abort. Perhaps it would be sufficient to simply report an error or alert in the logs, so that when things go wrong, it occurs to the admin to consider thread-safety issues alongside other issues.]

When it comes down to it, we're only talking about a couple of extra lines for all of the standard modules to indicate that they are thread-safe. While that road does lead to creature feep, non-thread-safe code running in a threaded program can be very touchy: likely to work in a large number of cases, while crashing with weird, hard-to-debug symptoms.

Later, scott
Re: config handling (was: Re: cvs commit: httpd-2.0/server core.c)
On Mon, 20 May 2002, Greg Stein wrote: > On Sat, May 18, 2002 at 12:32:20PM -0500, William A. Rowe, Jr. wrote: > > On Win32, we load-unload-reload the parent, then load-unload-reload > > the child config. Losing both redundant unload-load sequences will > > be a huge win at startup. > > Yup. If we process the tree in a much smarter fashion, then nothing > should need to be unloaded. One thing I _like_ about the load-unload-reload is that it generally forces you (the module author) to consider the graceful restart case, rather than simply crashing (or getting buggy) the first time someone does it. [Sorry if you're using those terms in a technical fashion that I'm not following.] OTOH, on Windows the parent and child both have to load things, so you get a similar effect. [Speaking of this, one thing I'd like to see for Windows would be a way for the parent process to cache the config (or parse tree) and pass it directly to the child, so that you don't have the possibility of changing config when a new child is spawned due to MaxRequestsPerChild. Yeah, I _should_ submit a patch rather than a request.] Later, scott
RE: is httpd a valid way to start Apache?
On Thu, 16 May 2002, Joshua Slive wrote: > On Thu, 16 May 2002, Ryan Bloom wrote: > > My own opinion is that we leave things exactly as they are today. If > > you are running the binary by hand, you are taking some responsibility > > for knowing what you are doing. That means having the environment > > variables setup correctly before you start. > > > > If you don't want that responsibility, use apachectl to run the > > server. Trying to solve this problem any other way just seems like we > > are asking for trouble. > > I think that is exactly what this proposal is saying. But at the same > time it is cleaning up apachectl and adding some useful functionality to > httpd. As I've said, the current apachectl is over-complicated and the > split between apachectl and httpd is confusing to some people. This > change would clear that up. Would it make sense to move the httpd binary to .../libexec/httpd? That makes it clear that this is an internal binary which you shouldn't run directly, unless you're really smart. Then apachectl stays in .../sbin/. [Idea courtesy of mysql's mysqld.] Later, scott
RE: Move perchild to experimental?
In my experience this argument always ends with: copy the ,v files, then cvs rm the old version, with a comment on the order of "moved to ../wherever". Perhaps with a "moved from .../wherever" comment added to the new version. I think it's even ended that way on this list a couple times. Messing with history is bad! Later, scott On Wed, 17 Apr 2002, Ryan Bloom wrote: > I would much rather move the ,v files. This is a standard argument on > this list, and there has never been consensus. The history is important > with stuff like MPMs, and doing a cvs rm, cvs add removes the history. > > Ryan > > -- > Ryan Bloom [EMAIL PROTECTED] > 645 Howard St. [EMAIL PROTECTED] > San Francisco, CA > > > -Original Message- > > From: Aaron Bannert [mailto:[EMAIL PROTECTED]] > > Sent: Wednesday, April 17, 2002 3:12 PM > > To: [EMAIL PROTECTED] > > Subject: Re: Move perchild to experimental? > > > > On Wed, Apr 17, 2002 at 03:10:10PM -0700, Justin Erenkrantz wrote: > > > Okay, so it seems we have consensus to move it. > > > > > > Uh, how do we move it? > > > > > > - Delete it and re-add them in the new directory > > > - Move the .v files on icarus > > > > If you move the ,v files you'll be messing with history, how about > just > > delete and add? > > > > *cough*svn could probably do it*cough* > > > > -aaron >
Re: performance: using mlock(2) on httpd parent process
On Wed, 20 Mar 2002, Stas Bekman wrote: > mod_perl child processes save a lot of memory when they can share memory > with the parent process and quite often we get reports from people that > they lose that shared memory when the system decides to page out the > parent's memory pages because they are LRU (least recently used, the > algorithm used by many memory managers). I'm fairly certain that this is not an issue. If a page was shared COW before being paged out, I expect it will be shared COW when paged back in, at least for any modern OS. [To verify that I wasn't talking through my hat, here, I just verified this using RedHat 7.2 running kernel 2.4.9-21. If you're interested in my methodology, drop me an email.] > I believe that this applies to all httpd modules and httpd itself, the > more we can share the less memory resources are needed, and usually it > leads to a better performance. I'm absolutely _certain_ that unmodified pages from executable files will be backed by the executable, and will thus be shared by default. > Therefore my question is there any reason for not using mlockall(2) in > the parent process on systems that support it and when the parent httpd > is started as root (mlock* works only within root owned processes). I don't think mlockall is appropriate for something with the heft of mod_perl. Why are the pages being swapped out in the first place? Presumably there's a valid reason. Doing mlockall on your mod_perl would result in restricting the memory available to the rest of the system. Whatever is causing mod_perl to page out would then start thrashing. Worse, since mlockall will lock down mod_perl pages indiscriminately, the resulting thrashing will probably be even worse than what they're seeing right now. Later, scott
Re: performance: using mlock(2) on httpd parent process
On Thu, 21 Mar 2002, Stas Bekman wrote: > > On Wed, 20 Mar 2002, Stas Bekman wrote: > > > >>mod_perl child processes save a lot of memory when they can share > >>memory with the parent process and quite often we get reports from > >>people that they lose that shared memory when the system decides to > >>page out the parent's memory pages because they are LRU (least > >>recently used, the algorithm used by many memory managers). > >> > > > > I'm fairly certain that this is not an issue. If a page was shared > > COW before being paged out, I expect it will be shared COW when paged > > back in, at least for any modern OS. > > But if the system needs to page things out, most of the parent process's > pages will be scheduled to go first, no? So we are talking about a > constant page-in/page-out from/to the parent process as a performance > degradation rather than memory unsharing. Am I correct? The system is going to page out an approximation of the least-recently-used pages. If the children are using those pages, then they won't be paged out, regardless of what the parent is doing. [If the children _aren't_ using those pages, then who cares?] > > [To verify that I wasn't talking through my hat, here, I just verified > > this using RedHat 7.2 running kernel 2.4.9-21. If you're interested in my > > methodology, drop me an email.] > > I suppose that this could vary from one kernel version to another. Perhaps, but I doubt it. I can't really do real tests on older kernels because I don't have them on any machines I control, but I'd be somewhat surprised if any OS which runs on modern hardware worked this way. It would require the OS to map a given page to multiple places in the swapfile, which would be significant extra work, and I can't think of any gains from doing so. > I'm just repeating the reports posted to the mod_perl list. I've never > seen such a problem myself, since I try hard to have close to zero swap > usage. :-). 
In my experience, you can get some really weird stuff happening when you start swapping mod_perl. It seems to be stable in memory usage, though, so long as you have MaxClients set low enough that your maximum amount of committed memory is appropriate. Also, I've seen people run other heavyweight processes, like mysql, on the same system, so that when the volume spikes, mod_perl spikes AND mysql spikes. A sure recipe for disaster.

> [Yes, please let me know your methodology for testing this]

OK, two programs.

bigshare.c:

    #include <signal.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define MEGS 256

    static char *mem = NULL;
    static char vv = 0;

    static void handler(int signo) {
        char val = 0;
        unsigned ii;
        signal(signo, handler);
        for (ii = 0; ii < MEGS*1024*1024; ii++) {
            val += mem[ii];
        }
        vv = val;
    }

    /* The body of main() was garbled in the archive; this version is
       reconstructed to match the behavior described below: allocate and
       touch MEGS of memory, fork so that four processes share the pages
       copy-on-write, and wait for SIGUSR1. */
    int main(int argc, char **argv) {
        unsigned ii;
        signal(SIGUSR1, handler);
        mem = malloc(MEGS*1024*1024);
        for (ii = 0; ii < MEGS*1024*1024; ii++) {
            mem[ii] = 1;
        }
        fork();
        fork();
        for (;;) {
            pause();
        }
        return 0;
    }

makeitswap.c:

    #include <stdlib.h>

    int main(int argc, char **argv) {
        char *mem = calloc(1, 384*1024*1024);
        free(mem);
        return 0;
    }

These both compile under RedHat 7.2; you might have to adjust the #include directives for other systems. Adjust the MEGS value in bigshare.c to be big enough to matter, but not so big that it causes bigshare itself to swap. I chose 1/2 of my real memory size. The 384 in makeitswap.c is 3/4 of my real memory, so it pushes tons of stuff into swap.

Run bigshare. Use ps or something appropriate to determine that, indeed, all four bigshare processes are using up 256M of memory, but it's all shared. Then, run makeitswap. All of the bigshare processes should partly or fully page out. Afterwards I was seeing RSS from 260k to 1M on the bigshare processes.

Then, kill -USR1 one of the bigshare processes. This causes the process to re-read all of the memory it earlier allocated, thus it should page in 256M or so. ps or top should show the RSS rising as it swaps back in. You can also use "vmstat 1" to watch it happen (watch the swap/si column). On some systems you may need to use iostat. More than likely your system response also goes to heck, because it's spending so much time swapping data in. bigshare should end up with RSS about 256M, again.
Then, kill -USR1 another of the bigshare processes. On my system, this happened much faster than the first time. Also, I saw only minimal swapins in vmstat (128 or so per second, versus >10,000 per second for the -USR1 against the first process). Send -USR1 to other bigshare processes, same results. You can verify that the pages are shared with ps or whatever.

> >>Therefore my question is there any reason for not using mlockall(2) in > >>the parent process on systems that support it and when the parent > >>httpd is started as root (mlock* works only within root owned > >>processes). > > > > I don't think mlockall is appropriate for something with the heft of > > mod_perl. > > > > Why are the pages being swapped out in the first place? Presumably > > there's a valid reason.

Well, the system coming close to zero of real memory available. The parent process starts swapping like crazy because most of its pages are LRU, slowing the whole system down and if the load doesn'
Re: Parent death should force children suttee
On Thu, Jan 31, 2002 at 06:40:01PM -0500, Dale Ghent wrote: > From a user's standpoint, it would seem more like a bug in apache if > s/he tries to shut apache down via apachectl, and then start it back up. > > First, the shutdown will fail, because the ppid no longer exists > (thus producing the "unclean shutdown" message), and when the > attempt by the bewildered admin is made to start apache again, it fails > because the children are still bound to ports and whatnot. > > Although I hold no voting power here, I'd say that the children are to > die with the parent.

It might be useful to allow conceptual space for it to work either way.

Locally we've modified Apache 1.3.x for FreeBSD to add a "graceful shutdown" signal at USR2. Upon receiving this signal, the parent drops a mark in the scoreboard, and does a shutdown() on the listen socket(s). This indicates to the OS that it should stop accepting new connections on the socket. The existing requests continue to completion, and children can even accept new requests off of the listen backlog (ap_max_request_per_child is bumped up in the children in this case, because the parent won't spin up new servers if the children die). Once the backlog is empty, continued accept() calls will result in an error (ECONNABORTED), and the child bails out.

Basically, this was to play nice with our load balancer: you can shut down and start up servers with zero requests lost in transit. If that wasn't cool enough - once the parent calls shutdown() on the listen sockets, FreeBSD lets us _immediately_ start a new server on the same sockets. So, rolling a new build is as simple as "shut the old one down, start the new one up."

[BTW, I agree with the children killing themselves when the parent goes away, perhaps configurable between "Kill yourself ASAP" versus "Kill yourself when you come up for air between requests."
The notion of killing everything which _looks_ like an Apache child scares me (what if you're running multiple servers on a box?).] Later, scott hess [EMAIL PROTECTED]