On 03/31/2010 05:12 PM, Alex Mandel wrote:
> I'm looking for some references and tips on how to tune a server
> specifically for serving large files over the internet. ie 4 GB iso
> files. I'm talking software config tweaks here.
How many 4GB ISO files are there? How many simultaneous downloads? Clients? How fast is the uplink to, er, umm, wherever the clients are (on the internet)?

> The system is using a RAID with striping, the filesystem is XFS (some
> tuning there maybe?)

Your enemy is random access. I wouldn't expect XFS, ext2/3/4, jfs, or any of the many other alternatives to make much difference.

> and will be running Apache2.2 all on a Debian
> Stable install if that helps. It's got 2 2.5ghz cores and 8GB of ram
> (those can be adjusted since it's actually a kvm virtual machine).

Again, your enemy is random access. To keep things simple let's just assume a single disk. If you have one download and relatively current hardware you should be able to manage around 80-90MB/sec (assuming your network can keep up). What gets ugly is when you have 2 or more clients accessing 2 or more files; suddenly it becomes very, very important to handle your I/O intelligently.

Say you have 4 clients reading 4 ISO files, and a relatively stupid/straightforward I/O path: you read 4KB from each file (in user space), then do a send. It turns out a single disk can only do 75 seeks/sec or so, which means you only get 75 * 4KB = 300KB/sec. Obviously you want to read ahead on the files.

You might assume that a "RAID with striping" will be much faster than a single disk on random workloads. I suggest testing this with your favorite benchmark, something like postmark or fio. Set it up to simulate the number of simultaneous random accesses over the size of files you expect to be involved.

Once you fix that problem you run into others as the bandwidth starts increasing. Normal configurations often use basically read(file...) then write(socket...). This involves extra copies and context switches, and context switches hurt kernel performance much like random reads hurt I/O performance. Fixes include using mmap or sendfile. Which brings me to my next question...
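The seek-bound arithmetic above can be sketched in a few lines (the 75 seeks/sec figure is the ballpark from this post, not a measured number):

```python
# Back-of-the-envelope: aggregate throughput of a seek-bound disk.
# Assumption (from the post): a single disk manages roughly 75 seeks/sec.
SEEKS_PER_SEC = 75

def seek_bound_throughput(chunk_bytes):
    """MB/s when every read of chunk_bytes costs one full seek."""
    return SEEKS_PER_SEC * chunk_bytes / 1e6

# 4KB reads: every request pays a seek, so ~0.3 MB/s total.
print(seek_bound_throughput(4 * 1024))      # ~0.3
# 1MB read-ahead: the same seek budget moves ~79 MB/s.
print(seek_bound_throughput(1024 * 1024))   # ~78.6
```

The point of the read-ahead suggestion is visible in the numbers: the seek budget is fixed, so the only lever is how much useful data each seek delivers.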
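The read()/write() vs. sendfile point can be sketched as below — a minimal demo, assuming Linux and Python's os.sendfile wrapper; the file names are invented for the example. In a real server the destination fd would be the client socket; a regular file works here because Linux (2.6.33 and later) accepts any writable fd as the sendfile target:

```python
# Sketch: kernel-side copy with sendfile(2) instead of read()+write().
# sendfile moves the bytes inside the kernel, skipping the user-space
# bounce buffer and the extra context switches per chunk.
import os
import tempfile

# Stand-in for a large ISO on disk.
src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"pretend this is a 4GB ISO image\n" * 1000)
src.close()

dst_path = src.name + ".out"
in_fd = os.open(src.name, os.O_RDONLY)
out_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)

offset = 0
remaining = os.fstat(in_fd).st_size
while remaining:
    # Kernel copies directly from the page cache; no user-space buffer.
    sent = os.sendfile(out_fd, in_fd, offset, remaining)
    offset += sent
    remaining -= sent

os.close(in_fd)
os.close(out_fd)
moved = os.path.getsize(dst_path)
print(moved)  # every byte arrived without touching user space
os.remove(src.name)
os.remove(dst_path)
```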
Do you have a requirement for Apache, or is that just the path of least resistance? Various servers are highly optimized for serving large static files quickly. Tux springs to mind, although it is somewhat dated these days. A decent summary is at:

  http://en.wikipedia.org/wiki/TUX_web_server

A more recent entry among the simple/fast web servers is nginx, which is fairly popular for exactly this niche. Various popular sites like wordpress.com, hulu, and sourceforge use it.

_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech