Re: high performance, highly available web clusters
On May 20, 2004, at 9:27 AM, David Wilk wrote:

> Now, here's the other question. Now that the web cluster can scale the
> static content ad infinitum, what about the dynamic content? What can be
> done with MySQL to load balance? Currently they do what everyone does with
> two stand-alone MySQL servers that are updated simultaneously, with the
> client writing to both. The client can then read from the backup MySQL
> server if the primary fails. I could just build two massive stand-alones,
> but a cluster would be more scalable.

Yuck. I'd use MySQL's built-in replication. Build a fairly bullet-proof
master and have as many slaves as needed. Put the slaves behind some sort of
load balancer (LVS or whatnot) and you can scale pretty far with mostly-read
applications. We've used this quite effectively for many parts of Yahoo.

Jeremy

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe".
Trouble? Contact [EMAIL PROTECTED]
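A minimal sketch of the master/slave setup Jeremy describes, using MySQL
4.0-era option names; the server IDs, host name, and replication account are
placeholders, not details from the thread:

```ini
# --- my.cnf on the master ---
[mysqld]
server-id = 1
log-bin   = mysql-bin        # enable the binary log that slaves read

# --- my.cnf on each read-only slave (unique server-id per slave) ---
[mysqld]
server-id = 2
read-only = 1

# --- run once on each slave to point it at the master ---
# CHANGE MASTER TO MASTER_HOST='db-master.example.com',
#     MASTER_USER='repl', MASTER_PASSWORD='...';
# START SLAVE;
```

With the slaves behind LVS, the application sends all writes to the master
and spreads reads across the balanced slave pool.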
Re: high performance, highly available web clusters
On Fri, May 21, 2004 at 01:23:52AM +1000 or thereabouts, Russell Coker wrote:
> On Thu, 20 May 2004 15:48, David Wilk <[EMAIL PROTECTED]> wrote:
> > The cluster comprises a load-balancer, several web servers connected to
> > a redundant pair of NFS servers, and a redundant pair of MySQL servers.
> > The current bottle-neck is, of course, the NFS servers. However, the
> > entire thing needs an increase in capacity by several times.
>
> The first thing I would do in such a situation is remove the redundant
> NFS servers. I have found the NFS client code in Linux to be quite
> fragile and wouldn't be surprised if a cluster fail-over killed all the
> NFS clients (a problem I often had in Solaris 2.6).

In this case the web servers (NFS clients) and NFS servers are FreeBSD. I
believe FreeBSD's NFS is a bit more reliable than Linux's. However, for pure
performance (and scalability) reasons, the NFS has got to go. Local disks
can be used for content that doesn't need to change in real time; real-time
content is what the MySQL servers are for.

Now, here's the other question. Now that the web cluster can scale the
static content ad infinitum, what about the dynamic content? What can be
done with MySQL to load balance? Currently they do what everyone does with
two stand-alone MySQL servers that are updated simultaneously, with the
client writing to both. The client can then read from the backup MySQL
server if the primary fails. I could just build two massive stand-alones,
but a cluster would be more scalable.

> > However, for a lot less money, one could simply do away with the file
> > server entirely. Since this is static content, one could keep these
> > files locally on the web servers and push the content out from a
> > central server via rsync. I figure a pair of redundant internal
> > 'staging' web servers could be used for content updates. Once tested,
> > the update could be pushed to the production servers with a script
> > using rsync and ssh. Each server would, of course, require fast and
> > redundant disk subsystems.
>
> Yes, that's a good option. I designed something similar for an ISP I
> used to work for, though I never got around to implementing it. My idea
> was to have a cron job watch the FTP logs to launch rsync. That way
> rsync would only try to copy the files that were most recently updated.
> There would be a daily rsync cron job to cover for any problems in
> launching rsync from ftpd.
>
> With local disks you get much more bandwidth (even a Gig-E link can't
> compare with a local disk), better reliability, and you can use the
> kernel httpd if you need even better performance for static content.
> Finally, such a design allows you to have a virtually unlimited number
> of web servers.

Agreed. I think the last comment, on scalability, is key; I hadn't thought
of that. Removing the common storage makes adding more web servers as easy
as dropping more boxes into the cluster and updating the load-balancer.
Adding more storage is not a chore either: servers can be removed one at a
time for disk upgrades, or simply add new ones and retire the old ones, add
more drives to the RAID, etc.

Thanks for the advice!

Dave

> --
> http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
> http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
> http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
> http://www.coker.com.au/~russell/  My home page

--
***
David Wilk
System Administrator
Community Internet Access, Inc.
[EMAIL PROTECTED]
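The push scheme described above (test on staging, then rsync over ssh to
each production box) might look roughly like the following; the host names
and document root are hypothetical, and the script only echoes each command
so it reads as a dry run (set DRY_RUN="" to actually transfer):

```shell
#!/bin/sh
# Push tested content from the staging server to every production
# web server.  DRY_RUN=echo prints each command instead of running it.
STAGING_ROOT="/var/www/htdocs"        # assumed staging document root
WEB_SERVERS="web1 web2 web3"          # assumed production host names
DRY_RUN="echo"

for host in $WEB_SERVERS; do
    # -a preserves permissions/times, -z compresses over the wire,
    # --delete removes files that were removed on staging.
    $DRY_RUN rsync -az --delete -e ssh "$STAGING_ROOT/" "$host:/var/www/htdocs/"
done
```

Run from cron or by hand after the staging content has been signed off;
because rsync only moves changed files, repeated pushes are cheap.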
Re: high performance, highly available web clusters
On Thu, 20 May 2004 15:48, David Wilk <[EMAIL PROTECTED]> wrote:
> The cluster comprises a load-balancer, several web servers connected to a
> redundant pair of NFS servers, and a redundant pair of MySQL servers. The
> current bottle-neck is, of course, the NFS servers. However, the entire
> thing needs an increase in capacity by several times.

The first thing I would do in such a situation is remove the redundant NFS
servers. I have found the NFS client code in Linux to be quite fragile and
wouldn't be surprised if a cluster fail-over killed all the NFS clients (a
problem I often had in Solaris 2.6).

> However, for a lot less money, one could simply do away with the file
> server entirely. Since this is static content, one could keep these files
> locally on the web servers and push the content out from a central server
> via rsync. I figure a pair of redundant internal 'staging' web servers
> could be used for content updates. Once tested, the update could be
> pushed to the production servers with a script using rsync and ssh. Each
> server would, of course, require fast and redundant disk subsystems.

Yes, that's a good option. I designed something similar for an ISP I used
to work for, though I never got around to implementing it. My idea was to
have a cron job watch the FTP logs to launch rsync. That way rsync would
only try to copy the files that were most recently updated. There would be
a daily rsync cron job to cover for any problems in launching rsync from
ftpd.

With local disks you get much more bandwidth (even a Gig-E link can't
compare with a local disk), better reliability, and you can use the kernel
httpd if you need even better performance for static content. Finally, such
a design allows you to have a virtually unlimited number of web servers.
--
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page
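The log-watching idea above could be sketched like this, assuming a
wu-ftpd style xferlog (field 9 is the file name, field 12 is "i" for an
upload); the paths are placeholders, and the rsync command is only echoed
so the sketch is safe to run as-is:

```shell
#!/bin/sh
# Extract recently uploaded files from the FTP transfer log and feed
# just those paths to rsync, instead of walking the whole tree.
XFERLOG="${XFERLOG:-/var/log/xferlog}"
LIST="/tmp/changed.$$"

[ -r "$XFERLOG" ] || XFERLOG=/dev/null   # nothing to do if no log yet

# Keep field 9 (file name) of lines whose field 12 is "i" (incoming).
awk '$12 == "i" { print $9 }' "$XFERLOG" | sort -u > "$LIST"

# Echoed here as a dry run; the real cron job would drop the "echo".
echo rsync -az --files-from="$LIST" / web1:/

rm -f "$LIST"
```

Run every few minutes from cron, with the daily full rsync as the safety
net for anything the log-driven pass misses.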
Re: high performance, highly available web clusters
On Thu, May 20, 2004 at 08:43:35AM -0400 or thereabouts, John Keimel wrote:
> Personally, I can't see the sense in replacing a set of NFS servers with
> individual disks. While you might save money going with local disks in
> the short run, your maintenance costs (more so the time cost than the
> dollar cost) would increase accordingly. Just dealing with lots of extra
> moving parts puts a shiver down my spine.

Each web server will need local storage for the system anyway. I would make
that local storage large enough for the static content that is normally
held on the NFS server. Worried about disks failing? That happens, and if a
server drops out of the cluster, we just put it back after repairs. The
cluster offers a level of redundancy that makes a single failure hardly
noticeable.

The problem with NFS is that it simply was not designed to handle the
number of filesystem operations (90-150/s now, and we want 10x that) that
web serving can demand. You suggest a RAM disk, and yet find the NFS server
adequate as well?

> I'm not sure how your 'static content' fits in with your mentioning
> multiple MySQL servers; that seems dynamic to me - or at least leaves
> room for much dynamic content.

Static content is stored on the NFS server; dynamic content is stored on
the MySQL servers. The vast majority of the content is image files.

> If you ARE serving up a lot of static content, I might recommend a
> situation that's similar to a project I worked on for a $FAMOUSAUTHOR,
> where we designed multiple web servers behind a pair of L4 switches. The
> switches (a pair for redundancy) load balanced for us, and we ran THTTPD
> on the servers. There were a few links to offsite content, where content
> hosting providers (I cannot remember the first, but they later went with
> Akamai) served up the larger files people came to download. Over the
> millions of hits we got, it survived quite nicely. We ran out of
> bandwidth (50Mb/s) before the servers even blinked.

That's awesome. Sounds like you got that one nailed.

> Perhaps if it IS static, you might also consider loading your content
> into a RAMdisk, which would probably provide the fastest access time. I
> might consider such a thing these days with the dirt-cheap pricing of
> RAM.

Actually, I figure a large bank of RAM (say, 4GB) will allow Linux to
allocate enough RAM to the disk cache that the most commonly used files
will be read straight from RAM. Does this seem reasonable?

> I think some kind of common disk (NFS, whatever, on RAID) is your best
> solution.

Why does it have to be common disk? Why not local disk that is periodically
updated? The increase in latency from using NFS (or SMB, whatever) and the
overhead of all the filesystem operations is just a killer. Besides, when
you aggregate all your storage onto a single file server, you give yourself
a single point of failure. Even with a dual redundant NFS setup, you still
have only one level of redundancy. With a 10-server web cluster I could
lose half my servers and still serve plenty of content.

> HTH
>
> j
> --
> ==
> + It's simply not    | John Keimel           +
> + RFC1149 compliant! | [EMAIL PROTECTED]     +
> +                    | http://www.keimel.com +
> ==

--
***
David Wilk
System Administrator
Community Internet Access, Inc.
[EMAIL PROTECTED]
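Relying on the page cache as described needs no setup at all; if one did
want an explicit RAM-backed area for the hottest files instead, a Linux
tmpfs mount is one way to get it. The mount point and size below are
illustrative, not details from the thread:

```
# Hypothetical /etc/fstab line: 512 MB RAM-backed area for hot static files
tmpfs  /var/www/ramdocs  tmpfs  size=512m,mode=0755  0  0
```

Unlike the page cache, a tmpfs must be repopulated (e.g. by the content
push script) after every reboot, since its contents live only in memory.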
Re: high performance, highly available web clusters
On Wed, May 19, 2004 at 11:48:31PM -0600, David Wilk wrote:
...
> The cluster comprises a load-balancer, several web servers connected to a
> redundant pair of NFS servers, and a redundant pair of MySQL servers. The
> current bottle-neck is, of course, the NFS servers. However, the entire
> thing needs an increase in capacity by several times.
...
> The expensive option would be to add a high-performance SAN, which would
> do the trick for all of the servers that required high-performance shared
> storage. This would solve the NFS performance problems.
>
> However, for a lot less money, one could simply do away with the file
> server entirely. Since this is static content, one could keep these files
> locally on the web servers and push the content out from a central server
> via rsync. I figure a pair of redundant internal 'staging' web servers
> could be used for content updates. Once tested, the update could be
> pushed to the production servers with a script using rsync and ssh. Each
> server would, of course, require fast and redundant disk subsystems.
>
> I think the lowest-cost option is to increase the number of image
> servers, beef up the NFS and MySQL servers, and add to the number of web
> servers in the cluster. This doesn't really solve the design problem,
> though.

Personally, I can't see the sense in replacing a set of NFS servers with
individual disks. While you might save money going with local disks in the
short run, your maintenance costs (more so the time cost than the dollar
cost) would increase accordingly. Just dealing with lots of extra moving
parts puts a shiver down my spine.

I'm not sure how your 'static content' fits in with your mentioning
multiple MySQL servers; that seems dynamic to me - or at least leaves room
for much dynamic content.

If you ARE serving up a lot of static content, I might recommend a
situation that's similar to a project I worked on for a $FAMOUSAUTHOR,
where we designed multiple web servers behind a pair of L4 switches. The
switches (a pair for redundancy) load balanced for us, and we ran THTTPD on
the servers. There were a few links to offsite content, where content
hosting providers (I cannot remember the first, but they later went with
Akamai) served up the larger files people came to download. Over the
millions of hits we got, it survived quite nicely. We ran out of bandwidth
(50Mb/s) before the servers even blinked.

Perhaps if it IS static, you might also consider loading your content into
a RAMdisk, which would probably provide the fastest access time. I might
consider such a thing these days with the dirt-cheap pricing of RAM.

I think some kind of common disk (NFS, whatever, on RAID) is your best
solution.

HTH

j
--
==
+ It's simply not    | John Keimel           +
+ RFC1149 compliant! | [EMAIL PROTECTED]     +
+                    | http://www.keimel.com +
==
high performance, highly available web clusters
Howdy all,

I am thinking about how to increase the capacity of a web cluster and was
wondering if anyone out there had any experience with this type of thing.

The cluster comprises a load-balancer, several web servers connected to a
redundant pair of NFS servers, and a redundant pair of MySQL servers. The
current bottle-neck is, of course, the NFS servers. However, the entire
thing needs an increase in capacity by several times. First of all, the web
servers need a hardware upgrade and an increase in total number.

The expensive option would be to add a high-performance SAN, which would do
the trick for all of the servers that required high-performance shared
storage. This would solve the NFS performance problems.

However, for a lot less money, one could simply do away with the file
server entirely. Since this is static content, one could keep these files
locally on the web servers and push the content out from a central server
via rsync. I figure a pair of redundant internal 'staging' web servers
could be used for content updates. Once tested, the update could be pushed
to the production servers with a script using rsync and ssh. Each server
would, of course, require fast and redundant disk subsystems.

I think the lowest-cost option is to increase the number of image servers,
beef up the NFS and MySQL servers, and add to the number of web servers in
the cluster. This doesn't really solve the design problem, though.

What have you guys done with web clusters?

Thanks!

Dave
--
***
David Wilk
System Administrator
Community Internet Access, Inc.
[EMAIL PROTECTED]