Re: Growing Up
> There is a consideration, regarding using a proxy or a different server, that has not been brought up: If there is mod_perl based access control for the static files, then it's basically impossible not to go through a mod_perl server to serve them.

If your access control is in mod_perl, you have to at least hit the mod_perl server to check whether access is allowed. I've not used it myself, but Perlbal has a neat feature where it can "internally" redirect. So mod_perl can return a redirect to Perlbal, which will then go and retrieve the real file from your static server and send that to the client. Otherwise I'm not sure how complete a proxy solution Perlbal is, but LiveJournal is supposed to be using it.

> In fact, I'm not sure what the effect would be in that scenario if a proxy was used: would it serve the static file regardless of the access control? Does it depend on the expiration data in the headers sent through the proxy when the access-controlled static file was sent?

Proxies should inspect the Vary: header to see under what conditions they can serve the same content. So if you're using cookies for authentication, you should have 'Cookie' in your Vary header. The proxy will then only re-serve the same content when a later request carries the same values for the headers named in Vary (here, the same Cookie header). Compared with setting the content to be no-cache or immediately expired, this has the advantage that if the client re-requests the same resource it can be served from the proxy cache rather than hitting the end servers again.

Carl
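The Vary behaviour described above can be sketched as a cache lookup. This is only an illustration in Python, not how any particular proxy implements it: `cache_key` and `TinyCache` are hypothetical names, expiry is ignored, and the cache simply remembers the last Vary value it saw for each URL.

```python
def cache_key(method, url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary."""
    # Header names are case-insensitive, so normalise them before lookup.
    headers = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    varied = tuple((name, headers.get(name, "")) for name in names)
    return (method, url, varied)


class TinyCache:
    """Toy in-memory cache that honours Vary (no expiry handling)."""

    def __init__(self):
        self._vary = {}   # (method, url) -> Vary value of the stored response
        self._store = {}  # cache_key(...) -> response body

    def put(self, method, url, request_headers, vary, body):
        self._vary[(method, url)] = vary
        self._store[cache_key(method, url, request_headers, vary)] = body

    def get(self, method, url, request_headers):
        vary = self._vary.get((method, url))
        if vary is None:
            return None  # nothing stored for this URL yet
        return self._store.get(cache_key(method, url, request_headers, vary))
```

With `Vary: Cookie`, a response stored for one session cookie is returned only to requests that send the same cookie; a request with a different cookie, or none, misses the cache and goes back to the end server.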
Re: Growing Up
On Apr 18, 2007, at 12:36 PM, Perrin Harkins wrote:

> On 4/18/07, Denis Banovic <[EMAIL PROTECTED]> wrote:
> > Is it possible to configure Perlbal so there is no single point of failure?
>
> That sort of high-availability setup is beyond the scope of an application-level load balancer like Perlbal. You need to use something that allows for IP takeover. The hardware solutions work, and there are many HA software solutions for Linux as well. Some of them are noted in the mod_perl docs:
> http://perl.apache.org/download/third_party.html#High_Availability_and_Load_Balancing_Projects

To add to that software list: on BSD there's CARP
http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol

FWIW, a common setup I see is something along the lines of

1. internet
2. lan gateway : 2+ nodes running carp or virtual server
3. load balancer : 2+ nodes ( sometimes running on the gateways )
4. lan : db + application servers

Often I see people using Soekris boxes on the gateways ( ever notice that if you subscribe to BSD user groups, that's all people ever talk about? ) http://www.soekris.com - they're little embedded boxes that run off CompactFlash cards. It ends up costing ~$700 for 2 independent boxes. They're not terribly fast, and don't support gigabit, but they hold up well if you're only doing basic firewalling/routing. Unless your traffic requires gigabit connectivity, they're a decent way to put off buying a couple of dedicated gigabit firewall boxes to run as your gateway.
// Jonathan Vanasco
| SyndiClick.com
| FindMeOn.com - The cure for Multiple Web Personality Disorder
| Web Identity Management and 3D Social Networking
| RoadSound.com - Tools For Bands, Stuff For Fans
| Collaborative Online Management And Syndication Tools
Re: Growing Up
On 4/18/07, Denis Banovic <[EMAIL PROTECTED]> wrote:
> Is it possible to configure Perlbal so there is no single point of failure?

That sort of high-availability setup is beyond the scope of an application-level load balancer like Perlbal. You need to use something that allows for IP takeover. The hardware solutions work, and there are many HA software solutions for Linux as well. Some of them are noted in the mod_perl docs:
http://perl.apache.org/download/third_party.html#High_Availability_and_Load_Balancing_Projects

- Perrin
Re: Growing Up
Hi!

Is it possible to configure Perlbal so there is no single point of failure? I wanted to use Perlbal as the LB in front of a few machines, but decided to go with 2 LBs because they can work in redundant mode.

Denis

-----Original Message-----
From: Frank Wiles [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 17, 2007 18:46
To: Perrin Harkins
Cc: Clinton Gormley; modperl@perl.apache.org
Subject: Re: Growing Up

On Tue, 17 Apr 2007 10:48:57 -0400 "Perrin Harkins" <[EMAIL PROTECTED]> wrote:
> On 4/17/07, Clinton Gormley <[EMAIL PROTECTED]> wrote:
> > is it reasonable to serve your static files from a mod_perl server,
> > as long as you have a proxy/pound/squid in front?
>
> Yes, but spending no time in mod_perl for a static file is better than
> spending a little time, and the files will be served faster if there's
> no extra proxying step. If you aren't having scaling problems, then
> don't worry about it.

Personally, I've fallen in love with Perlbal, and it can serve up static files from disk, so that would probably be what I would do in this situation.

- Frank Wiles <[EMAIL PROTECTED]> http://www.wiles.org
Re: Growing Up
On 4/17/07, Rafael Caceres <[EMAIL PROTECTED]> wrote:
> There is a consideration, regarding using a proxy or a different server, that has not been brought up: If there is mod_perl based access control for the static files, then it's basically impossible not to go through a mod_perl server to serve them.

I use mod_auth_tkt. You issue a cookie with credentials, and the C module can use it to check access rights on static files from the proxy server. You have to run Apache as your proxy server, but I prefer that anyway.

> In fact, I'm not sure what the effect would be in that scenario if a proxy was used: would it serve the static file regardless of the access control?

No, it would talk to mod_perl every time and not do any caching, unless you have a mis-configured proxy.

- Perrin
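The general idea behind a ticket cookie, where the application signs the credentials once and the front-end server verifies them without calling the application, can be sketched as follows. This is an assumption-laden illustration in Python: it uses HMAC-SHA256 and a `user|timestamp|signature` layout chosen for clarity, which is not mod_auth_tkt's actual ticket format, and `SECRET`, `issue_ticket` and `check_ticket` are hypothetical names.

```python
import hashlib
import hmac
import time

SECRET = b"change-me"  # hypothetical secret shared by the app and the proxy


def _sign(payload, secret):
    return hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()


def issue_ticket(user, now=None, secret=SECRET):
    """Called by the application after login; the result goes in a cookie."""
    if "|" in user:
        raise ValueError("user name must not contain '|'")
    timestamp = int(time.time() if now is None else now)
    payload = f"{user}|{timestamp}"
    return f"{payload}|{_sign(payload, secret)}"


def check_ticket(ticket, max_age=3600, now=None, secret=SECRET):
    """Called by the front-end server; returns the user name or None."""
    try:
        user, timestamp_text, signature = ticket.split("|")
        timestamp = int(timestamp_text)
    except ValueError:
        return None  # malformed ticket
    expected = _sign(f"{user}|{timestamp_text}", secret)
    # Constant-time comparison, so the check does not leak signature bytes.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    if current - timestamp > max_age:
        return None  # ticket expired
    return user
```

Because verification needs only the shared secret, the server in front can decide whether to hand out a static file without involving a mod_perl process.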
Re: Growing Up
On Mon, 2007-04-16 at 12:21 -0700, Will Fould wrote:
> Hi,
>
> I have a service that is currently running a basic LAMP stack with
> mod_perl and life has been good!
>
> The site has been getting very busy and I've ordered a second
> machine with the intention of moving the database off that machine and
> starting the growing up process.
>
> I am looking for next steps to growing up from this machine. Can
> somebody recommend a good article, presentation or document that
> advocates various strategies for growing up the current architecture
> (i.e. basic load balancing, network topology, switches, etc.)?
>
> I realize that mileage will vary based on the particular service and
> demands. Currently, the site does not deliver a lot of static content
> that can be cached or cause huge I/O issues (i.e. images, media, huge
> pages, etc). Our database is probably 95% read-only.
>
> Thanks a lot

There is a consideration, regarding using a proxy or a different server, that has not been brought up: If there is mod_perl based access control for the static files, then it's basically impossible not to go through a mod_perl server to serve them.

In fact, I'm not sure what the effect would be in that scenario if a proxy was used: would it serve the static file regardless of the access control? Does it depend on the expiration data in the headers sent through the proxy when the access-controlled static file was sent?

Rafael Caceres
Re: Growing Up
On Tue, 17 Apr 2007 10:48:57 -0400 "Perrin Harkins" <[EMAIL PROTECTED]> wrote:
> On 4/17/07, Clinton Gormley <[EMAIL PROTECTED]> wrote:
> > is it reasonable to serve your static files from a mod_perl server,
> > as long as you have a proxy/pound/squid in front?
>
> Yes, but spending no time in mod_perl for a static file is better than
> spending a little time, and the files will be served faster if there's
> no extra proxying step. If you aren't having scaling problems, then
> don't worry about it.

Personally, I've fallen in love with Perlbal, and it can serve up static files from disk, so that would probably be what I would do in this situation.

- Frank Wiles <[EMAIL PROTECTED]> http://www.wiles.org
Re: Growing Up
On 4/17/07, Clinton Gormley <[EMAIL PROTECTED]> wrote:
> is it reasonable to serve your static files from a mod_perl server, as long as you have a proxy/pound/squid in front?

Yes, but spending no time in mod_perl for a static file is better than spending a little time, and the files will be served faster if there's no extra proxying step. If you aren't having scaling problems, then don't worry about it.

- Perrin
Re: Growing Up
On Apr 17, 2007, at 3:55 AM, Clinton Gormley wrote:

> Must disagree with you about pound http://www.apsis.ch/pound/index_html being a PITA to configure and maintain. Pound is really easy to configure, fast as all hell, and just never goes down. I've been using it for about 3 years now and I've never ever had a problem with it.

If it's working for you, great ;) I had some issues when I first tried it, then leaned toward nginx, which can handle proxying + load balancing and serve static content as well.

> Just a point of clarification, with reference to this email: http://marc.info/?l=apache-modperl&m=117595808501296&w=2 (File Uploads using MP2 best practises): is it reasonable to serve your static files from a mod_perl server, as long as you have a proxy/pound/squid in front? My understanding is that the cost of using your mod_perl server to serve static files is the amount of time that a slow request would tie them up. However, if your requests are all fast, because your proxy handles the slow part, then this ceases to be an issue. Am I correct in this assumption? I have a bunch of mod_perl servers behind a single pound proxy (plus failover), and they share the uploaded images via NFS currently, although I'm considering moving to iSCSI with OCFS2 when I am convinced of its stability. Any views on this?

That assumption sounds right -- so long as you have a caching proxy like squid. Not all proxies cache ( I'm pretty sure that pound doesn't ). Any content you can offload from mp should give your app a big boost -- the thing that 'kills' modperl performance is tying up the same apache child used for content generation with 45 .gif/.jpg/.png files and a handful of css/js files.

If you're doing uploaded images over NFS though, chances are you have a lot of images -- which can make caching a bit of a nightmare as you try to balance the cache params. So I'd strongly suggest using a lightweight server for them (even vanilla apache would be an improvement).

Alternately, you could consider using amazon's s3 for mass storage with a CDN for distribution. ( I'm constantly told that s3 has uptime/access issues -- your data is safe, but it might not be accessible for an hour ). Using a combo of the two gives you reliable storage and distribution, both for cheap.

// Jonathan Vanasco
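The offloading rule suggested above, sending images, css and js to a lightweight server and leaving only content generation to mod_perl, comes down to a routing decision like the one below. In practice this lives in the front-end server's configuration; the Python sketch only illustrates the decision, and the extension list and the `pick_backend` name are assumptions.

```python
import posixpath

# Hypothetical list of extensions that should never reach mod_perl.
STATIC_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".css", ".js"}


def pick_backend(request_path):
    """Return "static" for files a lightweight server can send, else "app"."""
    path = request_path.split("?", 1)[0]  # ignore any query string
    extension = posixpath.splitext(path)[1].lower()
    return "static" if extension in STATIC_EXTENSIONS else "app"
```

Routing on the file extension is the simplest rule; routing on a path prefix such as a dedicated image directory works the same way and avoids surprises with dynamically generated images.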
Re: Growing Up
> switch to a lightweight proxy + httpd on port 80. i like nginx
> because it's had far fewer critical bugs than lighttpd. others like
> lighty. either will be fine - they'll free up apache to deal with
> content generation and you'll see a ginormous performance boost off
> that. you could use squid or pound for similar tasks, but they're a
> PITA to configure and maintain

Must disagree with you about pound http://www.apsis.ch/pound/index_html being a PITA to configure and maintain. Pound is really easy to configure, fast as all hell, and just never goes down. I've been using it for about 3 years now and I've never ever had a problem with it.

Just a point of clarification, with reference to this email: http://marc.info/?l=apache-modperl&m=117595808501296&w=2 (File Uploads using MP2 best practises): is it reasonable to serve your static files from a mod_perl server, as long as you have a proxy/pound/squid in front?

My understanding is that the cost of using your mod_perl server to serve static files is the amount of time that a slow request would tie them up. However, if your requests are all fast, because your proxy handles the slow part, then this ceases to be an issue. Am I correct in this assumption?

I have a bunch of mod_perl servers behind a single pound proxy (plus failover), and they share the uploaded images via NFS currently, although I'm considering moving to iSCSI with OCFS2 when I am convinced of its stability. Any views on this?

thanks

Clint
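The reasoning about slow requests can be put into rough numbers. The model below is a back-of-the-envelope sketch with made-up figures (8 processes, 50 ms to generate a response, 2 s to trickle it out to a slow client, 1 ms to hand it to a proxy on the LAN); none of these are measurements from the thread.

```python
def max_requests_per_second(workers, generate_seconds, transfer_seconds):
    """Upper bound on throughput when each request holds one worker
    for the time to generate the response plus the time to send it."""
    return workers / (generate_seconds + transfer_seconds)


# Slow clients talk to mod_perl directly: each process is held for ~2.05 s.
direct = max_requests_per_second(8, 0.05, 2.0)

# A buffering proxy takes the response in ~1 ms and feeds the slow client itself.
proxied = max_requests_per_second(8, 0.05, 0.001)
```

Under these assumptions the same 8 processes go from roughly 4 requests per second to roughly 157, which is why the cost of serving a file from mod_perl depends so heavily on whether something in front absorbs the slow part.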
Re: Growing Up
On 4/16/07, Will Fould <[EMAIL PROTECTED]> wrote:
> I am looking for next steps to growing up from this machine. Can somebody recommend a good article, presentation or document that advocates various strategies for growing up the current architecture (i.e. basic load balancing, network topology, switches, etc.)?

Have you read the book "Practical mod_perl"? That's often a good starting point.
http://modperlbook.org/

For an extreme case, you can read the LiveJournal story:
http://danga.com/words/2005_oscon/

You can also read my story about eToys:
http://perl.apache.org/docs/tutorials/apps/scale_etoys/etoys.html

It sounds like you're mostly looking for database scaling advice, so you may want to check for presentations and papers related to your database. I know that the MySQL conference publishes lots of good stuff every year.

- Perrin
Re: Growing Up
On Apr 16, 2007, at 3:21 PM, Will Fould wrote:

> Hi, I have a service that is currently running a basic LAMP stack with mod_perl and life has been good! The site has been getting very busy and I've ordered a second machine with the intention of moving the database off that machine and starting the growing up process. I am looking for next steps to growing up from this machine. Can somebody recommend a good article, presentation or document that advocates various strategies for growing up the current architecture (i.e. basic load balancing, network topology, switches, etc.)? I realize that mileage will vary based on the particular service and demands. Currently, the site does not deliver a lot of static content that can be cached or cause huge I/O issues (i.e. images, media, huge pages, etc). Our database is probably 95% read-only. Thanks a lot w

I don't have any articles or papers offhand, but I can say what I have been discussing with friends lately -- it seems like everyone is clustering their apps this month.

I'm in the process right now too -- scaling one of my apps from 1 server to a 2-node cluster with a 1TB mirrored raid + 4gb ram postgres store on the back and modperl/nginx on the front. I'm only running 8 apache children on the front, and bumped memcached up to 700mb.

Switch to a lightweight proxy + httpd on port 80. I like nginx because it's had far fewer critical bugs than lighttpd. Others like lighty. Either will be fine - they'll free up apache to deal with content generation and you'll see a ginormous performance boost off that. You could use squid or pound for similar tasks, but they're a PITA to configure and maintain.

The #1 slowdown I've seen from apache is from using the same server to handle the perl/php/python interpreter as is being used for transferring static files. Every request not served by apache is more resources / memory for your app.

Check your db memory usage. If it's 95% read-only, is it full of complex joins? Blocking operations? If so, don't just consider offloading to a dedicated db machine, but also consider running a read-only slave of it on the local machine. Apache sucks a ton of memory, but in comparison to what a db needs it's nothing.

When you migrate to a clustered setup you'll have at least 1gb of 'extra' memory to use. Half of that can easily go to apache, but you'll see the law of diminishing returns weigh in after N httpd instances -- that's when you toss memory to memcached or a local replica.

Profile your db requests: if you have a lot of repetitive queries, you could save a bunch of them by using memcached.

Profile your app design for db handle / connection suitability. A lot of people program for a system with 1 db connection that handles all reads and writes. It's usually fine on 1 box, but it doesn't work in a clustered system.

Question your db / schema. If you're using a 'new' feature in mysql, you might take a giant performance hit. If you have a badly planned/indexed query on postgres, you're looking at the difference between 100ms and 10 minutes on a select. Also check for blocking operations. If your db supports 'select for share', you might get a performance boost from using it.

Question your os. Some distros run apps better than others -- memory management / kqueue v select v poll / io / etc. I have friends who have been swapping distros like crazy over the past few months trying to squeeze a little more performance out. It's easier to change distros than to add new servers to a cluster.

You can hold off on any real networking until you're at a 3+ cluster. If you've got 2-3 machines, you can just handle everything with an extra NIC. At 3-4 you'll want a lan.

This is generic info -- I use it with all my projects ( 60% mp/pg , 20% php/pg , 20% python/pg ), and I have friends using similar stuff in php/mysql , erlang/pg , python/pg , rails/mysql

// Jonathan Vanasco
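The point about connection suitability, that an app written around one handle for all reads and writes does not map onto a cluster with read-only replicas, can be sketched as a small router. This is an illustration under stated assumptions: `ConnectionRouter` is a hypothetical name, the handles are opaque objects (plain strings in the usage note), and the keyword check is far cruder than what a real application would want.

```python
import itertools


class ConnectionRouter:
    """Send plain SELECTs to read-only replicas and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_statement(self, sql):
        text = sql.strip().upper()
        is_plain_select = (
            text.startswith("SELECT")
            and "FOR UPDATE" not in text  # locking reads belong on the primary
            and "FOR SHARE" not in text
        )
        if is_plain_select and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

For example, `ConnectionRouter("primary", ["replica-a", "replica-b"])` hands out the replicas in turn for ordinary SELECT statements and returns the primary for writes. A replica can lag behind the primary, so reads that must see a write the application has just made should still go to the primary.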
Re: Growing Up
On Mon, 16 Apr 2007 12:21:30 -0700 "Will Fould" <[EMAIL PROTECTED]> wrote:
> I have a service that is currently running a basic LAMP stack with
> mod_perl and life has been good!
>
> The site has been getting very busy and I've ordered a second
> machine with the intention of moving the database off that machine and
> starting the growing up process.
>
> I am looking for next steps to growing up from this machine. Can
> somebody recommend a good article, presentation or document that
> advocates various strategies for growing up the current architecture
> (i.e. basic load balancing, network topology, switches, etc.)?
>
> I realize that mileage will vary based on the particular service and
> demands. Currently, the site does not deliver a lot of static content
> that can be cached or cause huge I/O issues (i.e. images, media, huge
> pages, etc). Our database is probably 95% read-only.

You say your DB is 95% read-only, but you can't cache. I assume you mean you can't cache entirely rendered HTML pages with something like squid. But you *can* and should cache the data from the database with something like Cache::Memcached. I'd normally suggest Cache::FastMmap, but you've already indicated you're growing enough to need to be able to scale across multiple machines.

I think Ask Bjørn Hansen's presentation covers all of the recent generally accepted wisdom:
http://develooper.com/talks/Real-World-Scalability-Web-Builder-2006.pdf

- Frank Wiles <[EMAIL PROTECTED]> http://www.wiles.org
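Caching query results for a mostly-read database usually follows a read-through pattern: look in the cache, fall back to the database on a miss, and store the result with a time limit. The sketch below is in Python with an in-memory stand-in for the cache so that it runs on its own; `TTLCache` and `cached_query` are hypothetical names, and a real deployment would put a memcached client behind the same `get`/`set` calls so that all machines share one cache.

```python
import time


class TTLCache:
    """In-memory stand-in for a shared cache such as memcached."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (expiry time, value)
        self._clock = clock

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]  # entry is stale, drop it
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)


def cached_query(cache, key, run_query, ttl=60):
    """Return the cached result for key, running the query only on a miss."""
    value = cache.get(key)
    if value is None:  # note: a None result is treated as a miss
        value = run_query()
        cache.set(key, value, ttl)
    return value
```

The time limit is the simplest way to bound staleness for the 5% of traffic that writes; deleting the affected keys whenever the application writes is the usual refinement.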
Growing Up
Hi,

I have a service that is currently running a basic LAMP stack with mod_perl and life has been good!

The site has been getting very busy and I've ordered a second machine with the intention of moving the database off that machine and starting the growing up process.

I am looking for next steps to growing up from this machine. Can somebody recommend a good article, presentation or document that advocates various strategies for growing up the current architecture (i.e. basic load balancing, network topology, switches, etc.)?

I realize that mileage will vary based on the particular service and demands. Currently, the site does not deliver a lot of static content that can be cached or cause huge I/O issues (i.e. images, media, huge pages, etc). Our database is probably 95% read-only.

Thanks a lot

w