Re: [SLUG] Performance Tuning
Sorry for the dropout and lack of acknowledgement. Words 'hell', 'loose', 'break' uppermost in memory. Lots of information there. Thanks to everyone who's responded. Plenty of starting points and paths to follow. Looks like it's going to be a long month. :-)

Kyle

Glen Turner wrote:
> Kyle wrote:
>> Ok, a couple of responses thus far. Some further info.
>> The software I can tune myself. I was more looking for Linux specific tuning.
>>
>> * Yes, I was/am concerned about I/O.
>> * But also ensuring the OS itself (system processes) is not hindering
>>   anything otherwise.
>> * The RAID is the storage medium. (Hardware RAID)
>> * Incremental change analysis is done client side.
>> * Dual P4's / 1GB RAM
>> * Filesys is ext3 mounted with 'defaults'
>
> You've chosen *the* application which most stresses the operating system :-)
>
> Cut the problem into three:
>  - tune the disk
>  - tune the network
>  - tune the backup software.

--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
Re: [SLUG] Performance Tuning
On Sat, Sep 06, 2008 at 09:54:13AM +1000, Kyle wrote:
> Can somebody recommend a reasonably comprehensive but straightforward
> performance tuning article/HowTo/PDF/site I could read pls?
>
> Specifically, I am looking to perf-tune a dual-CPU RAID5 box used as a
> backup server.

Are you backing up to disk or tape?
Re: [SLUG] Performance Tuning
Glen Turner <[EMAIL PROTECTED]> writes:

[...]

> Note the contention between network and disk I/O buffers. These both
> need low memory. A 32b OS only has 512MB of that, which is a fail for
> this application (especially since Linux locks hard on kernel memory
> fragmentation). You need a 64b install.

Sorry to change the topic a little, but can you confirm my understanding
here: that 512MB figure comes from what is left of the 896MB of
ZONE_NORMAL after kernel memory, page tables and the like are factored
in, right?

Regards,
Daniel
Re: [SLUG] Performance Tuning
On Sun, Sep 07, 2008 at 07:06:08PM +0930, Glen Turner wrote:
> Kyle wrote:
>> Ok, [snip]
> - use jumbo frames (9000B packet > 8KB disk block, so very efficient)

I've noticed a lot of people have talked about using large (>1500 byte)
frames, usually 9k. I had been using jumbo frames for 8+ months and had
found them beneficial. But since moving to 2.6.25 (and now 2.6.26) I
have been getting a lot of kernel memory allocation errors. I have been
told they were order-2 allocations and not to worry about them, because
of fragmented kernel memory (and some other description that I can't
remember right now, but the gist being not to worry). I found that the
system behaved a bit slowly/differently after these events, usually
brought on by high network load, moving around 40-300G of files with
either scp or NFS.

Since turning off jumbo frames - moving back to the standard MTU - I
have not had these oopses.

My question to the list is: for those of you who use jumbo frames, have
you been seeing these errors? Two of the servers were using forcedeth
and one an RTL8168B (using the Realtek driver). At sites where I haven't
used a large MTU I haven't seen the problem. My setup for a large MTU is
just changing the MTU on the interface. I am guessing there is a leak
somewhere.

Another hiccup at this site is that it is a mixed-MTU site (that's fun);
I have had to hand-code all the relevant MTUs and place them in the
routing table with 'ip r add', for both IPv4 and IPv6.

alex

[snip]
> --
> Glen Turner

--
"Free societies are hopeful societies. And free societies will be allies
against these hateful few who have no conscience, who kill at the whim
of a hat." - George W. Bush 09/17/2004 Washington, DC
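For what it's worth, the interface and per-route MTU overrides alex describes can be expressed like this (interface names and addresses here are made up for illustration, not from his site):

```shell
# Raise the interface MTU -- the only change alex made for jumbo frames.
ip link set eth0 mtu 9000

# On a mixed-MTU network, pin a smaller MTU on routes that cross
# standard-frame segments, for both address families.
ip route add 192.0.2.0/24 dev eth0 mtu 1500
ip -6 route add 2001:db8::/64 dev eth0 mtu 1500
```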
Re: [SLUG] Performance Tuning
Kyle wrote:
> Ok, a couple of responses thus far. Some further info.
> The software I can tune myself. I was more looking for Linux specific tuning.
>
> * Yes, I was/am concerned about I/O.
> * But also ensuring the OS itself (system processes) is not hindering
>   anything otherwise.
> * The RAID is the storage medium. (Hardware RAID)
> * Incremental change analysis is done client side.
> * Dual P4's / 1GB RAM
> * Filesys is ext3 mounted with 'defaults'

You've chosen *the* application which most stresses the operating system :-)

Cut the problem into three:
 - tune the disk
 - tune the network
 - tune the backup software.

Disk:
 - You are writing large files.
 - RAID5 is not your friend; why not RAID10, since disk is so cheap?
 - Some filesystems do big files better than others (xfs > ext3).
 - You need all spindles running under the same load, so lay out your
   disks with that in mind. You'll probably need four spindles running
   to ensure that the average write speed exceeds the maximum read speed
   of the clients. Test this -- the client should not stall.
 - You are not reading, so caching gains you little; adjust the
   weighting so caches are cleared down more aggressively.
 - Discard metadata uselessness (such as atime).
 - Kill all low-value disk-using processes (such as Beagle, slocate and
   other such rubbish, typically run from cron).
 - The stripe sizes used to build the RAID should be unusually large and
   should mesh well with the filesystem's extents.

Network:
 - Set autotuning for the bandwidth-delay product. A reasonable
   reference is:
   http://www.gdt.id.au/~gdt/presentations/2008-01-29-linuxconfau-tcptune/
 - Use jumbo frames (9000B packet > 8KB disk block, so very efficient).
 - Avoid firewalls and other bogusness.
 - Check every counter on every host/switch/router for errors. You need
   zero errors.

Note the contention between network and disk I/O buffers. These both
need low memory. A 32b OS only has 512MB of that, which is a fail for
this application (especially since Linux locks hard on kernel memory
fragmentation). You need a 64b install. Do the math (which depends on
the number of clients), but I think you'll find that 1GB of RAM won't be
sufficient and you'll run out of cache before you run out of filesystem
bandwidth.

Backup software:
 - Chain backups, so only one or two clients are running at a time.
 - Avoid rate limiting; it's more efficient to have one or two clients
   racing to the finish than 30 clients all talking slowly.
 - Set any block sizes way big.
 - Work out how the indexing works. Move that off the main backup
   spindles, so that index updates don't move the disk heads on the
   backup spindles.

Of course, all this needs to be taken with a grain of salt. There's a
world of difference between tuning a small backup server (where you just
want things to complete overnight) and a corporate backup server (where
you are more interested in how many clients each machine can back up per
night).

Finally, what is your offsite strategy? If you're ejecting diskpacks
then note that not all chassis are rated to continually do this. Worse
still, your diskpacks may not fit into a borrowed chassis. Better to use
a third-party container and keep a spare container chassis offsite with
the diskpacks. Also some backup software needs a full scan of all
diskpacks if the software is asked to do a disaster recovery, and this
can take a long time.

--
Glen Turner
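The cache-weighting, atime and autotuning knobs above map onto a handful of mount options and sysctls. A minimal sketch -- the mount point and the numbers are illustrative starting points, not recommendations:

```shell
# Drop atime updates on the backup filesystem (assumes it is mounted
# at /backup -- adjust to your layout).
mount -o remount,noatime /backup

# Reclaim dentry/inode caches more aggressively, and start writeback
# earlier, since data written once will never be re-read from cache.
sysctl -w vm.vfs_cache_pressure=200
sysctl -w vm.dirty_background_ratio=2
sysctl -w vm.dirty_ratio=10

# Let TCP autotune its buffers up to the bandwidth-delay product
# (min / default / max, in bytes).
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
```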
Re: [SLUG] Performance Tuning
On 06/09/2008, at 5:11 PM, Daniel Pittman wrote:
> "Tony Sceats" <[EMAIL PROTECTED]> writes:
>> On Sat, Sep 6, 2008 at 12:22 PM, Daniel Pittman <[EMAIL PROTECTED]> wrote:
>>> Kyle <[EMAIL PROTECTED]> writes:
>>>> The software I can tune myself. I was more looking for Linux
>>>> specific tuning.
>>>>
>>>> * Yes, I was/am concerned about I/O.
>>>> * But also ensuring the OS itself (system processes) is not
>>>>   hindering anything otherwise.

For a backup system I would also suggest tuning the networking in
addition to what people have added here. I would also suggest only
changing one thing at a time, assessing the impact, and moving on.
There are a lot of guides around on getting the best out of the network,
so a Google search should yield results. As a guide:

* If possible, use Gigabit or better for the backup server.
* There may be some benefit in using multiple NICs (bonded).
* Tune the TCP buffers (wmem and rmem).
* Split NICs across CPUs (processor affinity).
* Don't spend too much time on interrupt coalescing unless you have a
  lot of time and need *every* bit of network I/O.
* If your networking infrastructure can support it, use large Ethernet
  frames (>1500 MTU).
* For local backups, avoid transiting large backups over firewalls and
  routers. 802.1q tagging (VLAN trunking) is really useful here. Not
  always avoidable, but it can make a considerable difference.

Cheers
Jason.
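The processor-affinity point above can be done by hand through /proc. A sketch only -- the IRQ numbers and interface names are hypothetical, so check /proc/interrupts on your own box first:

```shell
# Find which IRQs the NICs are using.
grep eth /proc/interrupts

# Pin each NIC's interrupt to its own CPU (the value is a hex CPU
# bitmask: 1 = CPU0, 2 = CPU1).
echo 1 > /proc/irq/24/smp_affinity   # eth0 -> CPU0
echo 2 > /proc/irq/25/smp_affinity   # eth1 -> CPU1
```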
Re: [SLUG] Performance Tuning
"Tony Sceats" <[EMAIL PROTECTED]> writes:
> On Sat, Sep 6, 2008 at 12:22 PM, Daniel Pittman <[EMAIL PROTECTED]> wrote:
>> Kyle <[EMAIL PROTECTED]> writes:
>>> The software I can tune myself. I was more looking for Linux specific
>>> tuning.
>>>
>>> * Yes, I was/am concerned about I/O.
>>> * But also ensuring the OS itself (system processes) is not hindering
>>>   anything otherwise.
>>
>> Unless you are running other processes on the system you can be
>> reasonably confident that early performance measurement will tell you
>> if the OS is responsible for problems.

Oh, look, an assumption of mine. How charmingly old fashioned of me.

> you should look out specifically for things like beagle and updatedb
> and what time they are run (ie if they are running at the same time as
> your backups) as these are disk indexers..

This is sound advice. I assumed that anyone running a server would have
started from an absolutely minimal system with no GUI, no extraneous
daemons, and would know to tune other disk load away from peak periods.
Which is a heck of an assumption to make, so thanks for catching it.

[...]

>> You will probably find performance here disappointing. XFS with a
>> 2.6.24 or more recent kernel will do better, perhaps significantly so.
>
> I would concur with this - XFS on RAID 10 on newer kernels gives you
> very good performance (RAID 0 gives you the write performance, RAID 1
> will give you the redundancy - and read performance)

I should probably add some notes on XFS tuning:

Set your agcount down to ~16 per terabyte or so, since more doesn't
really help that much at most SMB scales.

Use attr2 format, and inode64, if you can -- and consider a 512 byte
inode rather than the default 256 if your backup software makes use of
xattrs or symlinks.

Give the device an external log, if you can safely, and use an external
bitmap for any Linux software RAID devices as well, since both help with
performance. (As long as they are on a distinct spindle, HBA port, etc.)

Ensure that your log is large, and boost the size of your in-kernel log
buffers and log buffer count, to help reduce the time spent waiting on
log writeout for more FS activity.

Test with various disk schedulers -- XFS and CFQ have interacted quite
badly in some kernel versions, and AS or deadline might boost
performance. This is also workload dependent, so test, test, test.

Finally, I am not kidding about the 2.6.24-or-later bit: don't use XFS
in production without a serious UPS unless running the most recent.

[...]

>> The RAID layout and filesystem choices are the only real points to
>> consider tuning up front -- and, probably, enabling LVM for ease of
>> future management of volume space.[1]
>
> In my experience LVM is actually pretty slow, at least running it with
> ext3 over multiple PVs, but then when I was looking at this it was
> probably 2 years ago and things may have improved a lot since then.

From my experience there is no significant performance loss, but it
could be that the workloads I deal with don't stress it enough, or in
the right way, to show performance issues.

> In any case LVM is certainly *really really* handy if you need to grow
> a filesystem, although it's not the only way of doing such a thing (but
> it is by far the easiest and cleanest), and backup systems are
> principally concerned with storage space, so the cost might be worth it
> in the end.

That would be my feeling, unless the RAID HBA provides the same ability
to grow the storage media.

Regards,
Daniel
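Daniel's XFS notes translate roughly into the following commands. Treat this as a sketch under assumptions: device names (/dev/md0, /dev/sdc1, /dev/vg0/backup) are placeholders, and the sizes are examples rather than tuned values:

```shell
# Modest AG count, 512-byte inodes for xattr/symlink-heavy backup data,
# attr2 format, and an external log on its own spindle.
mkfs.xfs -d agcount=16 -i size=512,attr=2 \
         -l logdev=/dev/sdc1,size=128m,version=2 /dev/md0

# Mount with 64-bit inodes and larger/more in-kernel log buffers
# (logbsize above 32k needs the version 2 log created above).
mount -o inode64,logdev=/dev/sdc1,logbufs=8,logbsize=256k /dev/md0 /backup

# Try the deadline scheduler on the underlying disks, per the
# scheduler-testing advice.
echo deadline > /sys/block/sda/queue/scheduler

# If the filesystem sits on LVM, growing it later is two commands;
# xfs_growfs works on a mounted filesystem.
lvextend -L +200G /dev/vg0/backup
xfs_growfs /backup
```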
Re: [SLUG] Performance Tuning
Kyle <[EMAIL PROTECTED]> writes:
> The software I can tune myself. I was more looking for Linux specific
> tuning.
>
> * Yes, I was/am concerned about I/O.
> * But also ensuring the OS itself (system processes) is not hindering
>   anything otherwise.

Unless you are running other processes on the system you can be
reasonably confident that early performance measurement will tell you if
the OS is responsible for problems.

> * The RAID is the storage medium. (Hardware RAID)

You /really/ need to let us know what brand and configuration; 3ware and
Areca have very different performance characteristics; the presence or
absence of a BBU is going to be key, also.

In general, there is a reasonable chance you will find the RAID5
performance of your hardware solution disappointing, and if you can move
to something closer to a RAID10 you will, in general, have better
results.

> * Incremental change analysis is done client side.

Does the data just sit on this system, or is it sent somewhere else
afterwards? (In other words, is this just a big, reliable hard disk?)
Does this require reading the previous backup, or is it purely date
based? Do you have a backup window?

> * Dual P4's / 1GB RAM
> * Filesys is ext3 mounted with 'defaults'

You will probably find performance here disappointing. XFS with a 2.6.24
or more recent kernel will do better, perhaps significantly so.
Otherwise you would want to look for tips on tuning an ext3 filesystem
to be less conservative about performance or, if you can carry the risk,
use ext2, to deliver better write performance.

For a write-only load where the server is, essentially, a big and
reliable hard disk for the backup software then, basically, you don't
have a lot of tuning to do. The RAID layout and filesystem choices are
the only real points to consider tuning up front -- and, probably,
enabling LVM for ease of future management of volume space.[1]

For 3ware, at least, and probably for other hardware controllers, you
can gain significantly by tuning the queue depth, in line with the
vendor's recommendations.

Finally, test, test, test. Get something as close as possible to your
real load running and identify where the bottleneck is.

[...]

>> You have not even given enough information about what you intend to
>> use it for:
>>
>> * what parts of performance are you concerned about?
>> * what is the backup software?
>> * how does it get data from the clients?
>> * where does it spool that data temporarily?
>> * where does it store it finally?
>> * what compression, format transformation, etc happens to the data?
>> * what filesystem are you planning to use, and is that fixed?
>> * can you use something other than RAID5?
>> * is it software RAID, hardware, or FakeRAID?
>> * What sort of dual CPU is it -- two sockets, two cores, or one HT CPU?
>> * how much memory do you have?

Regards,
Daniel

Footnotes:
[1] Depending on your hardware RAID brand you may be able to grow a live
    array, in which case you have more or less the same flexibility that
    LVM would give you.
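The "less conservative ext3" and queue-depth suggestions above look something like this in practice. All names and numbers are illustrative; data=writeback in particular trades journal safety for write speed and must be set when the filesystem is mounted, so it belongs in fstab rather than a remount:

```shell
# Example /etc/fstab line for a relaxed ext3 backup volume
# (hypothetical device and mount point):
#   /dev/sda1  /backup  ext3  noatime,data=writeback,commit=30  0 2

# Deepen the block-layer request queue for a hardware RAID controller;
# for 3ware, follow the vendor's published tuning numbers instead.
echo 254 > /sys/block/sda/device/queue_depth
echo 512 > /sys/block/sda/queue/nr_requests
```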
Re: [SLUG] Performance Tuning
"Tony Sceats" <[EMAIL PROTECTED]> writes:
> Actually I was under the impression that this was a vanilla on-disk
> backup - ie, something akin to dumping tgz's onto a server with
> reliable disks (hence RAID5)..

It could be, but that is a pretty uncommon model in my experience.

> I must say though that I am surprised that a tape feeder is faster than
> disks to the point of having to maintain a large buffer, but then the
> last time I was responsible for any tape system was with a single DDS4
> tape drive..

You poor thing. ;)

Seriously, though, the tape streamers are not faster than disks in the
same class: you can find plenty of enterprise level U320 / SAS SCSI disk
systems that can comfortably supply an LTO4 device. You start to hit
problems when folks want to do this on the cheap, so back their system
with large, slow 7200 or 10K SATA disks on SAS or SATA controllers;
there you really do need to stripe to keep up.

> backup management is the curse of systems admin imho, so I avoid it at
> all costs - it's horribly mundane and prone to break at will (but we
> all thank the great bit keeper when we need them!)
>
> I was also under the (unresearched) opinion that incremental change
> analysis was performed on the client side, not the backup server side,
> although I suppose that backing up a large number of similar servers
> would result in a lot of the same files being written a lot of times
> (eg /lib), so it would be smart to only have one copy of the file and a
> reference for each backup that includes this file.. this would
> certainly be done on the server side...

This depends enormously on your backup software; BackupPC and the
various rsync-and-hard-links based backup systems tend to put a lot of
load on the server side.

> but really the point of the above is that you're almost certainly right
> if you're talking about enterprise backup solutions, and that more to
> the point, the precise backup solution/software chosen will drastically
> change how you tune the server, which is actually what we were both
> saying, so it's a case in point

In my experience there isn't /that/ much difference between "enterprise"
and "home" backup strategies, except the order of magnitude: you face
the same sort of performance issues that an LTO4/SAS array user does
with your SATA/DDS[1]. You just face them slower: your tape device needs
less MB/second to keep it streaming rather than scrubbing, but your
disks are also slower and busier, so you have the same sort of dramas
with keeping the device fed.

Regards,
Daniel

In my experience, of course, which doesn't mean that /everyone/ is going
to face these same issues.

Footnotes:
[1] Well, maybe not with DDS, but with tape hardware that is reasonably
    affordable by home users in this day and age, such as LTO[12], AIT
    and SDLT.
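The rsync-and-hard-links approach mentioned above keeps a single on-disk copy of each unchanged file by hard-linking it into every snapshot. A minimal coreutils demonstration of the idea (paths are hypothetical; real systems use rsync --link-dest or BackupPC's pool to do this per-file):

```shell
#!/bin/sh
set -e

# "Day 1" backup contains a file that will not change.
mkdir -p /tmp/snapdemo/day1
echo "unchanged config" > /tmp/snapdemo/day1/app.conf

# "Day 2" snapshot: hard-link yesterday's tree instead of copying it,
# so unchanged files consume no extra space.
cp -al /tmp/snapdemo/day1 /tmp/snapdemo/day2

# Both snapshots now reference the same inode -- one copy, two names.
if [ /tmp/snapdemo/day1/app.conf -ef /tmp/snapdemo/day2/app.conf ]; then
    echo "shared: one copy, two references"
fi
```

Changed files are then written fresh into the new snapshot, replacing (not editing) the hard link, which is why each day's tree still looks like a complete full backup.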
Re: [SLUG] Performance Tuning
Ok, a couple of responses thus far. Some further info.

The software I can tune myself. I was more looking for Linux specific
tuning.

* Yes, I was/am concerned about I/O.
* But also ensuring the OS itself (system processes) is not hindering
  anything otherwise.
* The RAID is the storage medium. (Hardware RAID)
* Incremental change analysis is done client side.
* Dual P4's / 1GB RAM
* Filesys is ext3 mounted with 'defaults'

Kind Regards
Kyle

Daniel Pittman wrote:
> You have not even given enough information about what you intend to use
> it for:
>
> * what parts of performance are you concerned about?
> * what is the backup software?
> * how does it get data from the clients?
> * where does it spool that data temporarily?
> * where does it store it finally?
> * what compression, format transformation, etc happens to the data?
> * what filesystem are you planning to use, and is that fixed?
> * can you use something other than RAID5?
> * is it software RAID, hardware, or FakeRAID?
> * What sort of dual CPU is it -- two sockets, two cores, or one HT CPU?
> * how much memory do you have?
Re: [SLUG] Performance Tuning
Actually I was under the impression that this was a vanilla on-disk
backup - ie, something akin to dumping tgz's onto a server with reliable
disks (hence RAID5)..

I must say though that I am surprised that a tape feeder is faster than
disks to the point of having to maintain a large buffer, but then the
last time I was responsible for any tape system was with a single DDS4
tape drive..

backup management is the curse of systems admin imho, so I avoid it at
all costs - it's horribly mundane and prone to break at will (but we all
thank the great bit keeper when we need them!)

I was also under the (unresearched) opinion that incremental change
analysis was performed on the client side, not the backup server side,
although I suppose that backing up a large number of similar servers
would result in a lot of the same files being written a lot of times (eg
/lib), so it would be smart to only have one copy of the file and a
reference for each backup that includes this file.. this would certainly
be done on the server side...

but really the point of the above is that you're almost certainly right
if you're talking about enterprise backup solutions, and that more to
the point, the precise backup solution/software chosen will drastically
change how you tune the server, which is actually what we were both
saying, so it's a case in point

On Sat, Sep 6, 2008 at 11:03 AM, Daniel Pittman <[EMAIL PROTECTED]> wrote:
> "Tony Sceats" <[EMAIL PROTECTED]> writes:
>
>> Performance depends specifically on the job at hand, and whilst short
>> on detail, I would suspect that you want fast file systems and disks
>> optimised for write speed,
>
> For backups? Often you need very fast reads, either to feed a modern
> tape streamer at the 30 to 80 MB per second of data it wants[1], or to
> be able to perform "read and compare" operations on existing data for
> comparison or pooling purposes (or even just the metadata of millions
> of files, really.)
>
> Fast writes are typically the least important part of the backup
> system. They matter, sure, for beating your backup window, but the
> actual performance problems often pop up on the read side. In my
> experience, anyhow. :)
>
> Regards,
> Daniel
>
> Footnotes:
> [1] ...and more if you have two or three of these drives attached to a
>     single server, which is more common as we have individual machines
>     with one to two tapes capacity each -- even for the 800GB native
>     LTO4 devices.
Re: [SLUG] Performance Tuning
"Tony Sceats" <[EMAIL PROTECTED]> writes:
> Performance depends specifically on the job at hand, and whilst short
> on detail, I would suspect that you want fast file systems and disks
> optimised for write speed,

For backups? Often you need very fast reads, either to feed a modern
tape streamer at the 30 to 80 MB per second of data it wants[1], or to
be able to perform "read and compare" operations on existing data for
comparison or pooling purposes (or even just the metadata of millions of
files, really.)

Fast writes are typically the least important part of the backup system.
They matter, sure, for beating your backup window, but the actual
performance problems often pop up on the read side. In my experience,
anyhow. :)

Regards,
Daniel

Footnotes:
[1] ...and more if you have two or three of these drives attached to a
    single server, which is more common as we have individual machines
    with one to two tapes capacity each -- even for the 800GB native
    LTO4 devices.
Re: [SLUG] Performance Tuning
Performance depends specifically on the job at hand, and whilst short on
detail, I would suspect that you want fast file systems and disks
optimised for write speed. But that's probably less than half the story,
and I'm guessing that whatever the rest of the details are, you will be
trying to tune an I/O-bound system, so perhaps googling for that will
help.

Anyway, IBM have a pretty good primer on the different types of tuning
you can do that goes reasonably in depth too:

http://safari.ibmpressbooks.com/013144753X

On Sat, Sep 6, 2008 at 9:54 AM, Kyle <[EMAIL PROTECTED]> wrote:
> Can somebody recommend a reasonably comprehensive but straightforward
> performance tuning article/HowTo/PDF/site I could read pls?
>
> Specifically, I am looking to perf-tune a dual-CPU RAID5 box used as a
> backup server.
>
> --
> Kind Regards
> Kyle
Re: [SLUG] Performance Tuning
Kyle <[EMAIL PROTECTED]> writes:
> Can somebody recommend a reasonably comprehensive but straightforward
> performance tuning article/HowTo/PDF/site I could read pls?

There isn't one, because...

> Specifically, I am looking to perf-tune a dual-CPU RAID5 box used as a
> backup server.

...what you need to do to tune your system to work effectively as a
backup server is extremely different from, say, tuning for a database
load, tuning for a compute load, or tuning for anything else.

You have not even given enough information about what you intend to use
it for:

* what parts of performance are you concerned about?
* what is the backup software?
* how does it get data from the clients?
* where does it spool that data temporarily?
* where does it store it finally?
* what compression, format transformation, etc happens to the data?
* what filesystem are you planning to use, and is that fixed?
* can you use something other than RAID5?
* is it software RAID, hardware, or FakeRAID?
* What sort of dual CPU is it -- two sockets, two cores, or one HT CPU?
* how much memory do you have?

Regards,
Daniel