Re: mod_perl vs. C for high performance Apache modules
At 03:58 PM 12/14/2001, Jeff Yoak wrote:

>At 09:15 PM 12/14/2001 +0100, Thomas Eibner wrote:
>>The key to mod_perl development is speed: there are numerous testimonials
>>from users implementing a lot of work in a very short time with mod_perl.
>>Ask the client's investor whether he wants to pay for having everything you
>>did rewritten as an Apache module in C. That is very likely going to take a
>>lot of time.
>
>Thank you for your reply. I realized in reading it that my tone leads one to
>the common image of a buzzword-driven doody-head who wants this because of
>what he read in Byte. That's certainly common enough, and I've never had a
>problem dealing with such types. (Well... not an unsolvable problem... :-)
>This is something different. The investor is in a related business, and has
>developed substantially similar software for years. And it is really good.
>
>What's worse is that my normal, biggest argument isn't compelling in this
>case: that by the time this would be done in C, I'd be doing contract work
>on Mars. The investor claims to have evaluated Perl vs. C years ago, to have
>witnessed that every single hit on the webserver under mod_perl causes a CPU
>usage spike that isn't seen with C, and that under heavy load mod_perl
>completely falls apart where C doesn't. (This code is, of course, LONG gone,
>so I can't evaluate whether the C was good and the Perl was screwy.) At any
>rate, because of this, he's spent years having good stuff written in C.
>Unbeknownst to either me or my client, both this software and its developer
>were available to us, so in this case it would have been faster, cheaper and
>honestly even better, by which I mean more fully-featured.

CPU usage is certainly one factor... but CPUs are cheap compared to development man-hours.

Since you haven't provided any details on the application, this may not be relevant, but most of the web apps that we write (and that I read about here) spend much of their time waiting for responses from other back-end servers: databases, NFS-mounted file systems, or whatever. It's probably undeniable that a well-written C application will run faster than almost anything in an interpreted language, but that may not make much of a difference to the total response time.

-Simon

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
One Athenaeum Street, Suite 1700, Cambridge, MA 02142
Phone: (617) 621-5296
URL: http://www.northernlight.com
Northern Light - Just what you've been searching for
Re: Phase for controlling network input?
I'm not sure that any mod_perl handlers are dispatched until the whole request has been received, so you may have to deal with this at the core Apache level. I think the following is your best bet (from http://httpd.apache.org/docs/mod/core.html#timeout ):

TimeOut directive
Syntax: TimeOut number
Default: TimeOut 300
Context: server config
Status: core

  The TimeOut directive currently defines the amount of time Apache will
  wait for three things:
  1. The total amount of time it takes to receive a GET request.
  2. The amount of time between receipt of TCP packets on a POST or PUT
     request.
  3. The amount of time between ACKs on transmissions of TCP packets in
     responses.
  We plan on making these separately configurable at some point down the
  road. The timer used to default to 1200 before 1.2, but has been lowered
  to 300, which is still far more than necessary in most situations. It is
  not set any lower by default because there may still be odd places in the
  code where the timer is not reset when a packet is sent.

We've experienced this kind of attack inadvertently (as the result of a totally misconfigured HTTP client app which froze in the middle of sending an HTTP request ;=) but I wasn't aware that there were known attacks based on it.

-Simon

At 11:09 AM 9/26/2001, Bill McGonigle wrote:
>I'm hoping this is possible with mod_perl, since I'm already familiar with
>it and fairly allergic to C, but I can't seem to figure out the right
>phase. I've been seeing log files recently that point to a certain DDOS
>attack brewing on Apache servers. I want to write a module that keeps a
>timer for the interval from when the Apache child gets a network connection
>to when the client request has been sent. I need a trigger when a network
>connection is established and a trigger when Apache thinks it has received
>the request (before the response).
>
>PerlChildInitHandler seems too early, since the child may be a pre-forked
>child without a connection. PerlPostReadRequestHandler seems too late,
>since I can't be guaranteed it will be called if the request isn't
>complete, which is the problem I'm trying to solve. I could clear a flag in
>the post-read-request phase, but that would imply something persisting from
>before that would be able to read the flag. Maybe I'm thinking about this
>all wrong. Any suggestions?
>
>Thanks,
>-Bill

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
One Athenaeum Street, Suite 1700, Cambridge, MA 02142
Phone: (617) 621-5296
URL: http://www.northernlight.com
Northern Light - Just what you've been searching for
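P.S. If you do want to watch for this from mod_perl rather than just lowering TimeOut: when Apache gives up on an incomplete request it logs status 408, and the log phase still runs for that request. A minimal per-child sketch of counting those per client IP; the My::TimeoutWatch name and the reporting threshold are made up for illustration, and this is untested against a real attack:

    package My::TimeoutWatch;
    use strict;
    use Apache::Constants qw(OK DECLINED);

    # Per-child count of 408 (Request Timeout) responses by client IP.
    my %timeouts;

    sub handler {
        my $r = shift;
        return DECLINED unless $r->status == 408;
        my $ip = $r->connection->remote_ip;
        $timeouts{$ip}++;
        $r->log_error("slow-request timeout #$timeouts{$ip} from $ip")
            if $timeouts{$ip} > 5;    # arbitrary threshold
        return OK;
    }

    1;

Installed with "PerlLogHandler My::TimeoutWatch". Note the counts are per child, so a real version would need some shared storage to see the whole picture.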
Re: Dynamic httpd.conf file using mod_perl...
At 04:16 AM 4/17/01, Ask Bjoern Hansen wrote:

>On Mon, 16 Apr 2001, Jim Winstead wrote:
>
>[...]
>
>>you would have to do a "run config template expander; HUP" instead of just
>>doing a HUP of the apache parent process, but that doesn't seem like a big
>>deal to me.
>
>And it has the big advantage of also working with httpds without mod_perl.
>Like proxy servers...
>
> - ask
>
>--
>ask bjoern hansen, http://ask.netcetera.dk/   !try; do();
>more than 70M impressions per day, http://valueclick.com

Going off on a slight tangent from the original topic: the template-based approach would also work well for subsystems that have separate configuration files. We put quite a bit of application configuration info into files other than httpd.conf, so that we can modify it without requiring a server restart.

-Simon
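P.S. For the archives, the expander can be tiny. A sketch of the "run config template expander; HUP" step; all the paths, the [% Name %] placeholder syntax and the %vars values here are invented for illustration:

    #!/usr/bin/perl -w
    # Expand httpd.conf.tmpl into httpd.conf, then HUP the parent.
    use strict;

    my %vars = (
        ServerRoot => '/usr/local/apache',   # example values only
        Port       => 8080,
    );

    open my $in,  '<', '/usr/local/apache/conf/httpd.conf.tmpl' or die $!;
    open my $out, '>', '/usr/local/apache/conf/httpd.conf'      or die $!;
    while (<$in>) {
        # Replace [% Name %] placeholders with values from %vars.
        s/\[%\s*(\w+)\s*%\]/defined $vars{$1} ? $vars{$1} : die "no $1"/ge;
        print $out $_;
    }
    close $out or die $!;

    # Tell the parent to re-read its (now regenerated) configuration.
    open my $pf, '<', '/usr/local/apache/logs/httpd.pid' or die $!;
    chomp(my $pid = <$pf>);
    kill 'HUP', $pid or die "couldn't signal $pid: $!";

The same template with a different %vars gets you a matching config for the mod_perl-less proxy servers too.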
[OT] HTTP/1.1 client support using LWPng
Slightly off topic... I am considering using the LWPng (HTTP/1.1) client code for an app where we could gladly use both of the HTTP/1.1 features that it offers: persistent connections (the client and server are separated by 7 time zones, and the TCP connect time is a horrible 125 ms ;=( ) and pipelining of requests.

The status of the code, according to Gisle Aas, is definitely alpha, and it hasn't been touched in a few years. Has anyone else used this module, and how successfully?

Thanks
-Simon

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
One Athenaeum Street, Suite 1700, Cambridge, MA 02142
Phone: (617) 621-5296
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
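P.S. As a fallback if LWPng proves too raw: the mainstream LWP can at least reuse connections (no pipelining) via the keep_alive option to LWP::UserAgent, which amortizes that 125 ms connect over many requests. A minimal sketch; the host and paths are placeholders:

    use strict;
    use LWP::UserAgent;
    use HTTP::Request;

    # keep_alive => N caches up to N open connections across requests.
    my $ua = LWP::UserAgent->new(keep_alive => 4);

    for my $path (qw(/a.html /b.html /c.html)) {
        my $req = HTTP::Request->new(GET => "http://www.example.com$path");
        my $res = $ua->request($req);
        print "$path: ", $res->status_line, "\n";
    }

Only the first request pays the transatlantic TCP handshake; the rest ride the same connection.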
Re: Socket/PIPE/Stream to long running process
At 11:04 AM 2/2/01 -0800, Rob Bloodgood wrote:

>So, in my mod_perl app, I run thru each request, then blast a UDP packet to
>a process on the local machine that collects statistics on my traffic:
>
><snip>
>
>My question is, should I be creating this socket for every request? OR
>would it be more "correct" to create it once on process startup and stash
>it in $r->pnotes or something?

We have similar code in a mod_perl environment for sending multicast UDP packets. I just store the socket filehandle in a global when it's created, and the next request can pick it up from there (just test whether the global is defined). Keeping the endpoint info in pnotes is only useful if you need to write multiple UDP packets per request, since pnotes is cleared at the end of each request.

-Simon

>And if I did that, would it work w/ TCP? Or unix pipes/sockets (which I
>*don't* understand)? (BTW, the box is Linux.) In testing, I'd prefer not to
>use TCP because it blocks if the count server is hung or down, vs. UDP,
>where I just lose a couple of packets.
>
>TIA!
>
>L8r,
>Rob

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
One Athenaeum Street, Suite 1700, Cambridge, MA 02142
Phone: (617) 621-5296
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
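P.S. A minimal sketch of the cached-socket approach; the package name, stats-server address and port are illustrative:

    package My::Stats;
    use strict;
    use IO::Socket::INET;

    # One UDP socket per Apache child: created on first use,
    # then reused by every request that child serves.
    use vars qw($SOCK);

    sub send_stats {
        my ($payload) = @_;
        $SOCK ||= IO::Socket::INET->new(
            Proto    => 'udp',
            PeerAddr => '127.0.0.1',
            PeerPort => 9999,
        ) or return;              # best-effort; never block the request
        $SOCK->send($payload);
    }

    1;

Because UDP is connectionless, the "connected" socket is really just a stored destination address, so there's no stale-connection problem across requests.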
Re: Caching search results
At 10:10 AM 1/8/01 -0800, you wrote:

>Bill Moseley wrote:
>>Anyway, I'd like to avoid the repeated queries in mod_perl, of course. So,
>>in the short term, I was thinking about caching search results (which is
>>just a sorted list of file names) using a simple file-system db: that is,
>>(carefully) build file names out of the queries and write them to some
>>directory tree. Then I'd use cron to purge LRU files every so often.
>
>I think this approach will work fine instead of a dbm or rdbms approach.
>Always start with CPAN. Try Tie::FileLRUCache or File::Cache for starters.
>A dbm would be fine too, but more trouble to purge old entries from.

An RDBMS is not much more trouble to purge, if you have a time-of-last-update field. And if you're ever going to access your cache from multiple servers, you definitely don't want to deal with locking issues for DBM and filesystem-based solutions ;=(

-Simon

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
One Athenaeum Street, Suite 1700, Cambridge, MA 02142
Phone: (617) 621-5296
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
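P.S. For anyone trying File::Cache, it handles the careful file naming and expiry for you. A minimal sketch; the namespace, expiry time and run_query() helper are hypothetical:

    use strict;
    use File::Cache;

    my $cache = File::Cache->new({
        namespace  => 'search_results',
        expires_in => 3600,              # seconds
    });

    sub cached_search {
        my ($query) = @_;
        my $results = $cache->get($query);   # undef on miss or expiry
        unless (defined $results) {
            $results = run_query($query);    # your expensive search here
            $cache->set($query, $results);
        }
        return $results;
    }

File::Cache can persist complex values, so the cached "sorted list of file names" can stay a real list reference rather than a flattened string.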
Re: Caching search results
At 02:02 PM 1/8/01 -0800, Sander van Zoest wrote:

>On Mon, 8 Jan 2001, Simon Rosenthal wrote:
>
>>An RDBMS is not much more trouble to purge, if you have a
>>time-of-last-update field. And if you're ever going to access your cache
>>from multiple servers, you definitely don't want to deal with locking
>>issues for DBM and filesystem-based solutions ;=(
>
>An RDBMS does bring replication and backup issues. The DBM and FS solutions
>definitely have their advantages. It would not be too difficult to write a
>serialized daemon that makes requests over the net to a DBM file. What in
>your experience makes you pick the overhead of an RDBMS for a simple cache
>in favor of DBM or FS solutions?

We cache user session state (basically using Apache::Session) in a small (maybe 500K records) mysql database, which is accessed by multiple web servers. We made an explicit decision NOT to replicate or back up this database: it's very dynamic, and the only user-visible consequence of losing it would be an unexpected login screen, a tradeoff we felt we could live with. We have a hot-spare mysql instance which can be brought into service immediately if required. I couldn't see a daemon such as you suggest offering us any benefits under those circumstances, given that RDBMS access is built into Apache::Session. I would not be as cavalier as this if we were doing anything more than using the RDBMS as a fast cache.

With decent hardware (which we have: Sun Enterprise servers with nice fast disks and enough memory) the typical record retrieval time is around 10 ms, which even if slow compared to a local FS access is plenty fast enough in the context of the processing we do for dynamic pages.

Hope this answers your question.

-Simon

>--
>Sander van Zoest                 [[EMAIL PROTECTED]]
>Covalent Technologies, Inc.      http://www.covalent.net/
>(415) 536-5218                   http://www.vanzoest.com/sander/

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
One Athenaeum Street, Suite 1700, Cambridge, MA 02142
Phone: (617) 621-5296
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
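P.S. For anyone who hasn't used it, the Apache::Session::MySQL access we're describing is just a tied hash. A minimal sketch; the DSN, credentials and cookie lookup are placeholders:

    use strict;
    use Apache::Session::MySQL;

    my $session_id = undef;   # or an existing id recovered from a cookie

    # Ties %session to a row in the sessions table; an undef id
    # creates a new session and fills in $session{_session_id}.
    my %session;
    tie %session, 'Apache::Session::MySQL', $session_id, {
        DataSource     => 'dbi:mysql:sessions',
        UserName       => 'webuser',
        Password       => 'secret',
        LockDataSource => 'dbi:mysql:sessions',
        LockUserName   => 'webuser',
        LockPassword   => 'secret',
    };

    $session{last_seen} = time;   # flushed to the database at untie
    untie %session;

The ~10 ms figure we quoted is for the retrieval that happens inside that tie().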
Re: [OT?] Cross domain cookie/ticket access
At 11:37 PM 9/7/00 -0600, Joe Pearson wrote:

>I thought you could set a cookie for a different domain - you just can't
>read a different domain's cookie. So you could simply set 3 cookies when
>the user authenticates.

I don't think you can set a cookie for a completely different domain, based on my reading of RFC 2109 and some empirical tests... it would be a massive privacy/security hole, yes?

- Simon

>Now I'm curious, I'll need to try that.
>
>--
>Joe Pearson
>Database Management Services, Inc.
>208-384-1311 ext. 11
>http://www.webdms.com
>
>-----Original Message-----
>From: Aaron Johnson [EMAIL PROTECTED]
>To: [EMAIL PROTECTED] [EMAIL PROTECTED]
>Date: Thursday, September 07, 2000 10:08 AM
>Subject: [OT?] Cross domain cookie/ticket access
>
>I am trying to implement a method of allowing access to three separate
>servers on three separate domains. The goal is to log in only once and have
>free movement across the three protected-access domains. A single cookie
>can't work, due to the one-domain limit. Has anyone out there had to handle
>this situation? I have thought about several different alternatives, but
>they just get uglier and uglier.
>
>One thought was that they could go to a central server and log in. At login
>time they would be redirected to a special page on each of the other two
>servers with any required login information. These pages would in turn
>return them to the login machine. At the end of the login process they
>would be redirected to the web site they originally wanted.
>
>This is a rough summary of what might happen:
>
>domain1.net - the user requests a page in a protected directory. They don't
>have a cookie, so they are redirected to the cookie server. This server
>asks for the user name and password and authenticates the user. Once
>authenticated, the cookie server redirects the client to each of the other
>domains (the ones not matching the originally requested domain). This
>redirect is a page that hands the client a cookie and sets up the session
>information. domain2.net gets the request and redirects the user to a page
>that will return them to the cookie machine, which will add domain2.net to
>the list of domains in the cookie. The process then repeats for each domain
>that needs to be processed.
>
>Am I crazy? Did I miss something in the documentation for the current
>Session/Auth/Cookie modules? I did some hacking of the
>Ticket(Access|Tool|Master) example in the Eagle book, but the cookie limit
>is keeping it from working correctly. (BTW: I already use it for a single
>server login and it works great.)
>
>Any information would be appreciated.
>
>Aaron Johnson

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
222 Third Street, Cambridge MA 02142
Phone: (617) 621-5296
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
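P.S. The redirect dance Aaron describes is workable. A minimal sketch of the hand-off handler each secondary domain would run; everything here (package name, cookie, query parameters) is illustrative, and a real version MUST cryptographically verify the token (e.g. an HMAC signature plus expiry) before trusting it:

    package My::CookieHandoff;
    use strict;
    use Apache::Constants qw(REDIRECT);
    use CGI::Cookie ();

    sub handler {
        my $r = shift;
        my %args = $r->args;

        # $args{token} was minted by the central login server.
        # Verify its signature and expiry here (omitted).

        my $cookie = CGI::Cookie->new(
            -name  => 'session',
            -value => $args{token},
            -path  => '/',
        );
        # err_headers_out survives the non-OK (302) status.
        $r->err_headers_out->add('Set-Cookie' => $cookie->as_string);
        $r->headers_out->set(Location => $args{return_to});
        return REDIRECT;
    }

    1;

Each domain ends up setting its own first-party cookie, so no cross-domain cookie is ever needed; the token in the URL is the only thing that crosses domains.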
Re: What phase am I in
At 12:51 PM 4/7/00 -0400, Paul G. Weiss wrote:

>Is there any way to determine from the Apache::Request object what phase of
>handling we're in? I have some code that is used during more than one phase
>and I'd like it to behave differently for each phase.

The current_callback() method (Eagle book, p. 465). Funny, I had to find this out yesterday...

- Simon

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
222 Third Street, Cambridge MA 02142
Phone: (617) 577-2796
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
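P.S. A trivial example of the pattern; the phase names returned are the Perl*Handler directive names:

    use strict;
    use Apache::Constants qw(OK);

    sub handler {
        my $r = shift;
        my $phase = $r->current_callback;  # e.g. 'PerlHandler'

        if ($phase eq 'PerlLogHandler') {
            # lightweight bookkeeping; the response has already gone out
        }
        elsif ($phase eq 'PerlHandler') {
            # full content-generation path
        }
        return OK;
    }

One handler, pushed at several phases (PerlHandler, PerlLogHandler, etc.), branching on where it finds itself.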
Re: [Rare Modules] Apache::RegistryNG
At 06:17 PM 2/4/00 +0200, Stas Bekman wrote:

>The next module is Apache::RegistryNG.
>
>Apache::RegistryNG is the same as Apache::Registry, aside from using the
>filename instead of the URI for the namespace. It also uses an OO
>interface.
>
><snip>
>
>There is no compelling reason to use Apache::RegistryNG over
>Apache::Registry, unless you want to add to or change the functionality of
>the existing Registry.pm. For example, Apache::RegistryBB (Bare-Bones) is
>another subclass that skips the stat() call performed by Apache::Registry
>on each request.

One situation where Apache::RegistryNG may definitely be required is if you are rewriting URLs (using either mod_rewrite or your own handler) in certain ways. For instance, if you have a rewrite rule of the form

    XYZ123456.html  ==>  /perl/foo.pl?p1=XYZ&p2=123456

Apache::Registry loses big, as it recompiles foo.pl for each unique URL. We ran into this and were totally baffled as to why we had no mod_perl performance boost until Doug pointed us to RegistryNG, which is definitely your friend in these circumstances.

- Simon

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
222 Third Street, Cambridge MA 02142
Phone: (617) 577-2796
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
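P.S. Switching is purely a configuration change, since RegistryNG is a drop-in subclass. One way to express it, using a mod_perl <Perl> section (the /perl location is just an example):

    <Perl>
    # Equivalent to a <Location /perl> block in httpd.conf.
    $Location{'/perl'} = {
        SetHandler  => 'perl-script',
        PerlHandler => 'Apache::RegistryNG->handler',
        Options     => '+ExecCGI',
    };
    </Perl>

The "Class->handler" form is what lets the OO interface dispatch through the subclass.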
Re: Caching DB queries amongst multiple httpd child processes
At 03:33 PM 2/3/00 +1100, Peter Skipworth wrote:

>Does anyone have any experience in using IPC shared memory or similar in
>caching data amongst multiple httpd daemons? We run a large-ish
>database-dependent site, with a mysql daemon serving many hundreds of
>requests a minute. While we are currently caching SQL query results on a
>per-process basis, it would be nice to share this ability across the server
>as a whole. I've played with IPC::Shareable and IPC::ShareLite, but both
>seem to be a little unreliable - unsurprising as both modules are currently
>still under development. Our platform is a combination of FreeBSD and
>Solaris servers - speaking of which, has anyone taken this one step further
>again and cached SQL results amongst multiple web servers?

We looked at this, as we have a busy multiple-web-server environment and are planning to use Apache::Session + Mysql to manage session state. Although per-host caching in shared memory or whatever seemed desirable on paper, the complexities of ensuring that cache entries are not invalidated by an update on another server are major.

When we set up a testbed to benchmark Mysql for this project, the time taken to retrieve or update a session state record across the network, over an established connection to our Mysql host (a 333 MHz Sparc Ultra 5 running Solaris 2.6 with lots of memory), was so small (5-7 ms, including LOCK/UNLOCK TABLE commands where needed) that we didn't pursue per-host caches any further. Clearly, YMMV depending on the hardware you have available.

- Simon

>Thanks in advance,
>Peter Skipworth
>
>--
>Peter Skipworth       Ph: 03 9897 1121
>Senior Programmer     Mob: 0417 013 292
>realestate.com.au     [EMAIL PROTECTED]

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
222 Third Street, Cambridge MA 02142
Phone: (617) 577-2796
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
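P.S. For the curious, the operation we timed is essentially this; the table and column names are illustrative, not our actual schema:

    use strict;
    use DBI;

    # Connection is established once per child and reused.
    my $dbh = DBI->connect('dbi:mysql:sessions;host=dbhost',
                           'webuser', 'secret', { RaiseError => 1 });

    sub get_session {
        my ($id) = @_;
        # mysql's locking is table-level, so keep this window short.
        $dbh->do('LOCK TABLES sessions READ');
        my ($data) = $dbh->selectrow_array(
            'SELECT a_session FROM sessions WHERE id = ?', undef, $id);
        $dbh->do('UNLOCK TABLES');
        return $data;
    }

The 5-7 ms figure covers one such round trip, locks included.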
Job openings at Northern Light Technology, Cambridge, MA.
Who we're looking for:

We're looking for a Senior/Principal Engineer in the Web Architecture/Systems group, which is responsible for all aspects of web server technology and the development of core technology components for our web-based applications. (We have a separate Applications development group who are also hiring; see our jobs page at http://www.northernlight.com/docs/jobs_company.html for other open positions.)

About you:

You will have 2+ years of software development experience using C/C++/Perl in a UNIX environment, plus intimate familiarity with the Apache web server and mod_perl, and a good understanding of system performance and tuning issues in a high-traffic Web environment. Ability to work on multiple projects at once, good communication skills and a tolerance for organized chaos are all highly desirable.

About Northern Light:

Since it premiered as the first Web-based research engine in August of 1997, Northern Light has grown considerably, earning accolades for its search engine technology. We're now a 150-person company (pre-IPO) headquartered in the Kendall Square area of Cambridge, Mass. We're not new kids on the block: our management team has been around a few blocks, with, combined, over 100 years of experience in the software industry. They know what it takes to make the company successful. At the same time, we're very young at heart. We tackle interesting projects and actively encourage creative thinking and continuous learning. The energy, humor, and commitment to quality shared by people at Northern Light are unsurpassed.

Please send your resume by email, fax or ground mail to:

Human Resources
Northern Light Technology
222 Third Street, Suite 1320
Cambridge, MA 02142
Fax: (617) 621-3459
Email: [EMAIL PROTECTED]

Feel free to call me at (617) 621-5296 or email me if you have any questions.

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
222 Third Street, Cambridge MA 02142
Phone: (617) 577-2796
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
Re: access_log
At 11:09 AM 1/12/00 -0500, Gacesa, Petar wrote:

>I was doing stress testing of the Apache web server by simulating a large
>number of HTTP requests. After several hours I started getting the
>following line in my access_log file, instead of the URL that was supposed
>to be accessed:
>
>165.78.11.40 - - [11/Jan/2000:22:33:45 -0500] "-" 408 -
>
>Can somebody please tell me what this means?
>
>Petar

It's a bit off-topic... nothing to do with mod_perl. 408 is the HTTP "Request Timeout" status: it's reporting a situation where the client starts to send an HTTP request but doesn't complete it. You have an Apache process tied up waiting for the request to complete; it doesn't, so Apache eventually times the request out and logs it that way, with a "-" where the request line would be. So look at your simulated client.

- Simon

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
222 Third Street, Cambridge MA 02142
Phone: (617) 577-2796
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
Managing session state over multiple servers
Hi:

We're planning on migrating to an Apache::Session + mysql approach for managing session state, for a large-ish site hosted on multiple servers. While there have been many useful discussions on this list concerning the technologies involved, I haven't seen many war stories from the field, as it were. I have some specific questions; hopefully someone out there has had to address these issues and may have some good advice.

a) If your site runs on multiple servers, do you attempt to cache session state records on the web server for any length of time after they are retrieved from the DBMS? If so, how do you handle cache consistency across all your servers? (We have rather blind load-balancing hardware in front of our server farm, with no way of implementing any kind of server affinity that I am aware of.)

b) Does anyone have redundant database servers? If so, are there any implementation gotchas? And if you have a single server, how does session management work when it goes down? (I'm pretty happy with the hardware - Suns - which we have, but a disk can go at any time.)

c) This is more of a mysql question: when do people consider a session to have expired? And what is the best strategy for deleting expired sessions from a database, especially given that mysql's table-based locking seems to leave a bit to be desired if you're trying to mix update operations with a big SELECT/DELETE to purge expired sessions? (See the purge sketch below.)

TIA
- Simon

-----
Simon Rosenthal ([EMAIL PROTECTED])
Web Systems Architect
Northern Light Technology
222 Third Street, Cambridge MA 02142
Phone: (617) 577-2796
URL: http://www.northernlight.com
"Northern Light - Just what you've been searching for"
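P.S. On (c), the kind of batched purge I have in mind, to keep each table-level write lock short; a sketch only, assuming a last_update timestamp column and a one-hour expiry (both invented for illustration):

    use strict;
    use DBI;

    my $dbh = DBI->connect('dbi:mysql:sessions;host=dbhost',
                           'webuser', 'secret', { RaiseError => 1 });

    my $expiry = 3600;    # seconds of inactivity before a session dies

    # Delete in small batches so each DELETE's table lock is brief
    # and queued session updates get through between batches.
    while (1) {
        my $n = $dbh->do(
            'DELETE FROM sessions
              WHERE UNIX_TIMESTAMP() - UNIX_TIMESTAMP(last_update) > ?
              LIMIT 500',
            undef, $expiry);
        last if !$n or $n eq '0E0';   # DBI returns '0E0' for zero rows
        sleep 1;
    }

Does this strike anyone as sound, or is there a better idiom?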