RE: XML::Parser & Expat cause segfaults
> -----Original Message-----
> From: Oskari 'Okko' Ojala [mailto:[EMAIL PROTECTED]]
>
> Got a problem: About 250 of 1000 requests cause a segfault (11) when
> using XML::Parser::parse() under mod_perl. In FAQs it is stated that
> this is because of the bundled Expat in Apache.
>
> I've tried disabling Apache's Expat with --disable-rule=EXPAT, but it
> doesn't help.
>
> Have you found any workarounds or patches, or is the reason for my
> segfaults somewhere else?
>
> Platform:
>
> Red Hat 7.0
> Apache 1.3.19
> mod_perl 1.25
> perl 5.6.0
> expat 1.95.1
> HTML::Mason 1.02
> XML::Parser 2.30

There's (apparently) a known symbol conflict between XML::Parser 2.30
and Apache (which I only know because it happened to someone here just
the other day). Drop down to 2.29 and it should work fine.

Stephen.
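P.S. If you want a guard against the bad combination creeping back in
after an upgrade, something like this in startup.pl would do it (an
untested sketch; the only thing it assumes is the version numbers
reported in this thread):

    # startup.pl - warn if the XML::Parser known to clash with Apache's
    # bundled Expat gets loaded (2.30 bad, 2.29 reported good).
    use XML::Parser ();

    if ($XML::Parser::VERSION == 2.30) {
        warn "XML::Parser $XML::Parser::VERSION reportedly clashes with ",
             "Apache's bundled Expat - consider dropping back to 2.29\n";
    }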
RE: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
> > This doesn't affect the argument, because the core of it is that:
> >
> > a) the CPU will not completely process a single task all at once;
> >    instead, it will divide its time _between_ the tasks
> > b) tasks do not arrive at regular intervals
> > c) tasks take varying amounts of time to complete
>
> [snip]
>
> I won't agree with (a) unless you qualify it further - what do you
> claim is the method or policy for (a)?

I think this has been answered... basically, resource conflicts
(including I/O), interrupts, long-running tasks, higher-priority tasks,
and, of course, the process yielding, can all cause the CPU to switch
processes (which of these qualify depends very much on the OS in
question).

This is why, despite the efficiency of single-task running, you can
usefully run more than one process on a UNIX system. Otherwise, if you
ran a single Apache process and had no traffic, you couldn't run a
shell at the same time - Apache would consume practically all your CPU
in its select() loop 8-)

> Apache httpd's are scheduled on an LRU basis. This was discussed
> early in this thread. Apache uses a file-lock for its mutex around
> the accept call, and file-locking is implemented in the kernel using
> a round-robin (fair) selection in order to prevent starvation. This
> results in incoming requests being assigned to httpd's in an LRU
> fashion.

I'll apologise, and say, yes, of course you're right, but I do have a
query: there are (IIRC) at least five methods that Apache uses to
serialize requests: fcntl(), flock(), Sys V semaphores, uslock (IRIX
only) and Pthreads (reliably only on Solaris). Do they _all_ result in
LRU?

> Remember that the httpd's in the speedycgi case will have very little
> un-shared memory, because they don't have perl interpreters in them.
> So the processes are fairly indistinguishable, and the LRU isn't as
> big a penalty in that case.

Yes, _but_, interpreter for interpreter, won't the equivalent speedycgi
have roughly as much unshared memory as the mod_perl? I've had a lot
of (dumb) discussions with people who complain about the size of
Apache+mod_perl without realising that the interpreter code's all
shared, and with pre-loading a lot of the perl code can be too.

While I _can_ see speedycgi having an advantage (because it's got a
much better overview of what's happening, and can intelligently manage
the situation), I don't think it's as large as you're suggesting. I
think this needs to be intensively benchmarked to answer that.

> other interpreters, and you expand the number of interpreters in use.
> But still, you'll wind up using the smallest number of interpreters
> required for the given load and timeslice. As soon as those 1st and
> 2nd perl interpreters finish their run, they go back at the beginning
> of the queue, and the 7th/8th or later requests can then use them,
> etc. Now you have a pool of maybe four interpreters, all being used
> on an MRU basis. But it won't expand beyond that set unless your load
> goes up or your program's CPU time requirements increase beyond
> another timeslice. MRU will ensure that whatever the number of
> interpreters in use, it is the lowest possible, given the load, the
> CPU-time required by the program and the size of the timeslice.

Yep... no arguments here. SpeedyCGI should result in fewer
interpreters.
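Just to convince myself of the difference the selection policy makes,
here's a toy Perl model (entirely artificial - one request at a time,
load well below capacity - so it only illustrates the LRU/MRU choice,
not real Apache or SpeedyCGI behaviour):

    #!/usr/bin/perl -w
    # Toy model: 10 workers, requests arrive one at a time and each
    # finishes before the next arrives.  LRU hands a request to the
    # worker idle longest; MRU hands it to the one that finished most
    # recently.  Count how many distinct workers ever get used.
    use strict;

    sub simulate {
        my ($policy, $requests) = @_;
        my @idle = (1 .. 10);          # worker ids, all idle to start
        my %touched;
        for (1 .. $requests) {
            my $w = $policy eq 'LRU' ? shift @idle : pop @idle;
            $touched{$w}++;
            push @idle, $w;            # worker returns to the idle list
        }
        return scalar keys %touched;
    }

    printf "LRU: %d distinct workers used\n", simulate('LRU', 100);
    printf "MRU: %d distinct workers used\n", simulate('MRU', 100);

LRU cycles through all ten workers (so all ten end up fat with unshared
memory), while MRU only ever touches one - which, as far as I can see,
is the whole of the point being made above.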
I will say that there are a lot of convincing reasons to follow the
SpeedyCGI model rather than the mod_perl model, but I've generally
thought that the performance gain to be had is sufficiently minimal as
not to warrant the extra layer... thoughts, anyone?

Stephen.
RE: Fwd: [speedycgi] Speedycgi scales better than mod_perl with scripts that contain un-shared memory
> -----Original Message-----
> From: Sam Horrocks [mailto:[EMAIL PROTECTED]]
> Sent: 17 January 2001 23:37
> To: Gunther Birznieks
> Cc: [EMAIL PROTECTED]; mod_perl list; [EMAIL PROTECTED]
> Subject: Re: Fwd: [speedycgi] Speedycgi scales better than mod_perl
> with scripts that contain un-shared memory
>
> > With some modification, I guess I am thinking that the cook is
> > really the OS and the CPU is really the oven. But the hamburgers on
> > an Intel oven have to be timesliced instead of left to cook and
> > then after it's done the next hamburger is put on.
> >
> > So if we think of meals as Perl requests, the reality is that not
> > all meals take the same amount of time to cook. A quarter pounder
> > surely takes longer than your typical paper thin McDonald's Patty.

[snip]

> I don't like your mods to the analogy, because they don't model how
> a CPU actually works. Even if the cook == the OS and the oven == the
> CPU, the oven *must* work on tasks sequentially. If you look at the
> assembly language for your Intel CPU you won't see anything about it
> doing multi-tasking. It does adds, subtracts, stores, loads, jumps,
> etc. It executes code sequentially. You must model this somewhere in
> your analogy if it's going to be accurate.

(I think the analogies have lost their usefulness.)

This doesn't affect the argument, because the core of it is that:

a) the CPU will not completely process a single task all at once;
   instead, it will divide its time _between_ the tasks
b) tasks do not arrive at regular intervals
c) tasks take varying amounts of time to complete

Now, if (a) were true but (b) and (c) were not, then, yes, it would
have the same effective result as sequential processing. Tasks that
arrived first would finish first. In the real world, however, (b) and
(c) are usually true, and it becomes practically impossible to predict
which task handler (in this case, a mod_perl process) will complete
first.

Similarly, because of the non-deterministic nature of computer systems,
Apache doesn't service requests on an LRU basis; you're comparing
SpeedyCGI against a straw man. Apache's servicing algorithm approaches
randomness, so you need to build a comparison between forced-MRU and
random choice. (Note I'm not saying SpeedyCGI _won't_ win - just that
the current comparison doesn't make sense.)

Thinking about it, assuming you are, at some time, servicing requests
_below_ system capacity, SpeedyCGI will always win in memory usage, and
probably have an edge in handling response time. My concern would be,
does it offer _enough_ of an edge? Especially bearing in mind that, if
I understand correctly, you could end up running anywhere up to 2x as
many processes (n Apache handlers + n script handlers)?

> No, homogeneity (or the lack of it) wouldn't make a difference. Those
> 3rd, 5th or 6th processes run only *after* the 1st and 2nd have
> finished using the CPU. And at that point you could re-use those
> interpreters that 1 and 2 were using.

This, if you'll excuse me, is quite clearly wrong. See the above
argument, and imagine that tasks 1 and 2 happen to take three times as
long to complete as task 3: you should see that they could all end up
in the scheduling queue together. Perhaps you're considering tasks
which are too small to take more than 1 or 2 timeslices, in which case
you're much less likely to want to accelerate them.

[snipping obscenely long quoted thread 8-)]

Stephen.
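P.S. A trivially small round-robin model of that last point, in case
the abstract version doesn't convince (timeslice of one unit, three
tasks needing 3, 3 and 1 units of CPU; the numbers are invented purely
for illustration):

    #!/usr/bin/perl -w
    # Round-robin timeslicing: tasks 1 and 2 need 3 units of CPU each,
    # task 3 needs 1.  Show the run queue at each slice and note when
    # each task completes.
    use strict;

    my %remaining = (1 => 3, 2 => 3, 3 => 1);
    my @queue     = (1, 2, 3);               # arrival order
    my $slice     = 0;

    while (@queue) {
        my $task = shift @queue;
        $slice++;
        $remaining{$task}--;
        printf "slice %d: ran task %d, queue now (%s)\n",
               $slice, $task, join(', ', @queue);
        if ($remaining{$task} == 0) {
            print "         task $task complete\n";
        } else {
            push @queue, $task;              # unfinished: back of the queue
        }
    }

Task 3 finishes on slice 3, while tasks 1 and 2 don't finish until
slices 6 and 7 - and all three sit in the queue together at the start,
which is all that points (a)-(c) amount to.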
RE: Wild Proposal :)
> -----Original Message-----
> From: Perrin Harkins [mailto:[EMAIL PROTECTED]]
> Sent: 11 October 2000 04:45
> To: Ajit Deshpande
> Cc: [EMAIL PROTECTED]
> Subject: Re: Wild Proposal :)
>
> Hi Ajit,
>
> It's not entirely clear to me what problem you're trying to solve
> here. I'll comment on some of the specifics you've written down here,
> but I may be missing your larger point.

Ajit's examples aren't perfect, but the problem is a real one. The
problem is one of generalisation. Logically, you don't want to put an
application that is 10% web-related into mod_perl. So you can take the
other 90% out and stick it into an RPC server, but wouldn't it be nice
if there was an application server framework that handled connections,
load balancing and resource management for you?

> There's DBI::Proxy already. Before jumping on the "we need pooled
> connections" bandwagon, you should read Jeffrey Baker's post on the
> subject here:
> http://forum.swarthmore.edu/epigone/modperl/breetalwox/38B4DB3F.612476CE@acm.org

People always manage to miss the point on this one. It's not about
saving the cycles required to open the connection, as they're minimal
at worst. It's about saving the _time_ taken to open the connection.
On a network application, opening a connection is going to be quite
possibly your largest latency. On a large application doing a lot of
transactions per second, the overhead involved in building connections
and tearing them down can lose you serious time. It also complicates
scaling the database server. It's far better to pay your overhead once
and just re-use the connection.

Stephen.
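P.S. For the straightforward per-child case under mod_perl you can
already pay the connect cost once per child rather than once per
request with Apache::DBI - this is a sketch from memory, so check its
docs for the exact incantation, and the DSN and credentials below are
obviously placeholders:

    # startup.pl - load Apache::DBI before anything that uses DBI, so
    # DBI->connect() in the children hands back a cached persistent
    # handle instead of opening a fresh connection on every request.
    use Apache::DBI ();
    use DBI ();

    # Optionally open the connection as each child starts, so even the
    # first request skips the connect latency.
    Apache::DBI->connect_on_init(
        'dbi:Pg:dbname=exampledb', 'exampleuser', 'examplepass',
        { RaiseError => 1, AutoCommit => 1 },
    );

It doesn't give you cross-process pooling, which is what the proposal
is really after, but it does remove the per-request connect latency
that hurts the most.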
RE: Why does Apache do this braindamaged dlclose/dlopen stuff?
> So in the longer term, is there a reason the parent has to contain
> the interpreter at all? Can't it just do a system call when it needs
> one? It seems a bit excessive to put aside a couple of megabytes of
> system memory just to run startup.pl.

Well, remember that the interpreter itself will remain shared
throughout, so there's no real disadvantage in having it in the parent.

The main reason to run startup.pl in the parent is to overcome as much
of Perl's startup time as possible. Compiling the code dominates the
startup time, so the thing to do is to pull in your modules in
startup.pl. That way, it's only done once, and the results are shared
between all children.

I think the thing to do here is fix the memory leaks 8-)

Stephen.
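P.S. For anyone who hasn't seen one, a minimal startup.pl is really
just a list of use statements - the modules below are only examples, so
preload whatever your own code would otherwise compile at request time:

    # startup.pl - compiled once in the parent and shared
    # (copy-on-write) by every child, so the children skip the
    # compilation cost entirely.
    use strict;

    use CGI ();               # example modules only - load what *you* use
    CGI->compile(':all');     # pre-compile CGI.pm's autoloaded methods
    use DBI ();

    # Preload your own application modules too, e.g.:
    # use My::App::Handler ();

    1;                        # startup.pl must return a true value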
RE: Why does Apache do this braindamaged dlclose/dlopen stuff?
> -----Original Message-----
> From: Gerald Richter [mailto:[EMAIL PROTECTED]]
> Sent: 19 January 2000 04:36
> To: Alan Burlison; [EMAIL PROTECTED]
> Subject: RE: Why does Apache do this braindamaged dlclose/dlopen
> stuff?
>
> So I would agree to your last sentences that DynaLoader is
> responsible for unloading, because that's the only module which knows
> what it had loaded.

Agreed. It's a relatively small change to DynaLoader, with great
benefits for embedded Perl.

> Also I am not so sure if unloading all the libraries can be really
> successfully done, because the Perl XS libraries don't assume that
> they will ever be unloaded (because they are actually only unloaded
> when the program exits).
>
> This may be the reason for the memory leaks Daniel mentioned earlier,
> because the XS libraries don't have a chance to free their resources
> (or are not aware that they actually should do so).

Yes and no. If XS libraries are written with OO-style wrappers (which,
IMHO, they always should be), then surely you can catch the unloading
in a DESTROY sub and use that to do the deallocation of resources?
Perl can only manage Perl resources; extension resources should be the
responsibility of the programmer.

Stephen.
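P.S. What I mean by the OO-style escape hatch, roughly. My::XSWrapper
and the xs_* routines are invented for the example (and stubbed in
plain Perl so the sketch runs on its own); in a real XS module they
would be C functions exported by the shared library:

    package My::XSWrapper;    # hypothetical wrapper around an XS library
    use strict;
    use warnings;

    # Stand-ins for the real C-side allocate/release calls.
    sub xs_allocate { my ($size) = @_; return "fake-C-handle($size)" }
    sub xs_release  { my ($h)    = @_; warn "released $h\n" }

    sub new {
        my ($class, %args) = @_;
        my $handle = xs_allocate($args{size} || 0);
        return bless { handle => $handle }, $class;
    }

    sub DESTROY {
        # Runs when the last reference disappears, including at
        # interpreter shutdown, so the C-side resource gets freed even
        # though Perl's own memory management knows nothing about it.
        my $self = shift;
        xs_release(delete $self->{handle}) if $self->{handle};
    }

    package main;
    { my $obj = My::XSWrapper->new(size => 1024); }  # DESTROY fires here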
RE: Another IE5 complaint
> -----Original Message-----
> From: Rod Butcher [mailto:[EMAIL PROTECTED]]
> Sent: 23 November 1999 10:20
> Cc: [EMAIL PROTECTED]
> Subject: Re: Another IE5 complaint
>
> Am I the only battling service vendor who actually feels good when
> somebody bookmarks my website?
> I can absorb the overhead of accesses to a favorites icon.
> This may be a security hazard for the client, but I detect a
> holier-than-thou attitude here against M$.
> Will somebody tell me why this M$ initiative is bad, other than for
> pre-determined prejudices?
> Rgds
> Rod Butcher

Speaking as someone who works for an ISP, anything that obscures (by
volume) genuine errors is a Bad Thing. The error log is a useful
diagnostic tool only if you can see the errors. Yes, you could filter
out the requests before examining the file, but the point is MS is
making more work for people by being thoughtless.

Further reasons it's a bad idea:

* It's not standard
* It's a specific solution to a general problem, and therefore fragile
  (i.e. it breaks too easily)
* It's a quick hack rather than a genuine initiative (which would take
  effort)

Stephen.

--
The views expressed are personal and do not necessarily reflect those
of Planet Online Limited.
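P.S. The filtering itself is a one-liner - favicon.ico being the file
IE5 asks for - it's having to do it at all that grates:

    # strip the IE5 favourites-icon noise before reading the error log
    perl -ne 'print unless /favicon\.ico/' error_log | less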