Re: I/O performance issues on 2.4.23 SMP system
> I was the poster who initiated the previous thread on this subject. The
> problem disappeared here after we went down to 2 GB of memory (although
> we physically removed it from the server rather than passing the arg to
> the kernel... shouldn't make a difference though, I'd imagine). We went
> straight from 4 GB to 2 GB, so I can't comment on the results of using 3
> GB.

The above sounds a lot like a bounce buffer issue, not an IO issue. Bounce buffer problems look a lot like IO problems on the surface; what actually happens is that the IO bus gets swamped from having too much memory feeding it. Bounce buffer issues can occur any time you use over 2GB of RAM on a 32-bit system.

I have a dual SMP Xeon 700 (32-bit) with 10GB of RAM in it, under a 10-20% CPU load daily. Originally, I had a bounce buffer problem that occurred during backups and heavy IO loads. The output from sar (the system activity reporter) told me that process switches were not recovering after backups, and IO loads would 'snowball' afterwards. Generally, the whole system seemed to get overwhelmed and unstable after a heavy IO event like a backup, which I found strange.

This bounce buffer problem was resolved with the 00_block-highmem-all-18b-3 patch from
http://www.kernel.org/pub/linux/kernel/people/andrea
and since applying it the server has been running very stable for over 43 days.

For example, the following sar output shows a normal recovery after a heavy IO event:

22:30:01 all83 089 1302172 0.35 0.33 0.35 074 090 183 088    -> backup started (rsync of 100GB RAID 5 array)
23:40:01 all3 14 18261731166 1.44 1.46 1.52
00:00:02 cpu %usr %sys %nice %idle pswch/s runq nrproc lavg1 lavg5 lavg15 _cpu_
00:10:01 all3 14 18256793166 1.62 1.56 1.53 03 13 183 13 15 181
00:20:01 all4 14 18260683156 1.45 1.46 1.46 03 14 182 14 14 181
00:30:01 all2 13 18355855161 1.10 1.16 1.29 03 13 184 12 14 183
00:40:01 all38 18831912146 0.12 0.63 1.01 038 188 138 188
00:50:01 all33 095 863139 0.15 0.23 0.60    -> sync finished

If your sar output does not look like this after a backup, and you have more than 2GB of RAM, something is probably going on with a bounce buffer. You can fix it two ways: upgrade to a 64-bit machine, or patch your kernel with the block-highmem patch written by Andrea.

My kernel: 2.4.18

image=/boot/vmlinuz-2.4.18
#Compiled using GCC-2.95 on new IMAP server
#Debian 2.4.18 kernel package
#Debian 2.4.18 xfs kernel patch
#block-highmem patch from http://www.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/
#00_block-highmem-all-18b-3
#HIGHMEM kernel support to 64GB
#HIGHMEM IO support added
label=LinuxHIMEM
read-only

My hardware:

00:00.0 Host bridge: ServerWorks CNB20HE Host Bridge (rev 21)
00:00.1 Host bridge: ServerWorks CNB20HE Host Bridge (rev 01)
00:00.2 Host bridge: ServerWorks: Unknown device 0006
00:00.3 Host bridge: ServerWorks: Unknown device 0006
00:01.0 SCSI storage controller: Adaptec 7896
00:01.1 SCSI storage controller: Adaptec 7896
00:05.0 Ethernet controller: Advanced Micro Devices [AMD] 79c970 [PCnet LANCE] (rev 44)
00:06.0 VGA compatible controller: S3 Inc. Trio 64 3D (rev 01)
00:0f.0 ISA bridge: ServerWorks OSB4 South Bridge (rev 4f)
00:0f.1 IDE interface: ServerWorks OSB4 IDE Controller
00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 04)
01:01.0 RAID bus controller: IBM Netfinity ServeRAID controller
01:02.0 RAID bus controller: IBM Netfinity ServeRAID controller
02:06.0 Ethernet controller: Intel Corp. 82557 [Ethernet Pro 100] (rev 0c)

On 03/02/04 13:25 -0600, Benjamin Sherman wrote:
> Thanks to all who sent comments on this. I did some more testing and
> went straight to the source for input.
>
> > if you want to try the 4G patch then i'd suggest Andrew Morton's -mm
> > tree, which has it included:
> >
> > http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.2-rc2/2.6.2-rc2-mm2/
> >
> > i've got a 2.4 backport too, included in RHEL3. (the SRPM is
> > downloadable.) But extracting the patch from this srpm will likely not
> > apply to a vanilla 2.4 tree - there are lots of other patches as well
> > and interdependencies. So i'd suggest the RHEL3 kernel as-is, or the -mm
> > tree in 2.6.
> >
> > Ingo
>
> Of course, as newer kernels are released, Andrew releases newer -mm
> patches. This patch set solved the I/O problem and let me use 4GB RAM.
>
> Mark Ferlatte wrote
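The bounce-buffer cost described in this message can be put in toy terms. The sketch below is plain Python, not kernel code; the exact device-addressable limit varies by controller and kernel, so the 1 GB line here is an assumption for illustration. The point is that any page above the limit gets copied into a low bounce page before DMA, doubling the memory traffic for that write.

```python
# Toy sketch (not kernel code) of why bounce buffers hurt: a page above
# the device-addressable limit must first be memcpy'd into a low page,
# so the same disk write moves twice as many bytes through the bus.

LIMIT = 1 * 1024**3   # assumed DMA-addressable limit (illustrative)
PAGE = 4096

def bytes_moved(page_addr: int) -> int:
    """Memory traffic needed to write one page to disk from this address."""
    if page_addr + PAGE <= LIMIT:
        return PAGE       # DMA straight from the page
    return 2 * PAGE       # copy into a bounce page below LIMIT, then DMA

low = bytes_moved(0x100000)        # page well below the limit
high = bytes_moved(3 * 1024**3)    # page up at 3 GB
print(low, high)  # prints: 4096 8192
```

With most of a 10 GB machine's pages sitting above the limit, a backup-sized IO burst roughly doubles its own bus traffic, which fits the "snowballing" behaviour described above.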
Re: I/O performance issues on 2.4.23 SMP system
Thanks to all who sent comments on this. I did some more testing and went straight to the source for input.

> if you want to try the 4G patch then i'd suggest Andrew Morton's -mm
> tree, which has it included:
>
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.2-rc2/2.6.2-rc2-mm2/
>
> i've got a 2.4 backport too, included in RHEL3. (the SRPM is
> downloadable.) But extracting the patch from this srpm will likely not
> apply to a vanilla 2.4 tree - there are lots of other patches as well
> and interdependencies. So i'd suggest the RHEL3 kernel as-is, or the -mm
> tree in 2.6.
>
> Ingo

Of course, as newer kernels are released, Andrew releases newer -mm patches. This patch set solved the I/O problem and let me use 4GB RAM.

Mark Ferlatte wrote:

> Daniel Erat said on Thu, Jan 29, 2004 at 08:08:49AM -0800:
> > I was the poster who initiated the previous thread on this subject. The
> > problem disappeared here after we went down to 2 GB of memory (although
> > we physically removed it from the server rather than passing the arg to
> > the kernel... shouldn't make a difference though, I'd imagine). We went
> > straight from 4 GB to 2 GB, so I can't comment on the results of using 3
> > GB.
> >
> > Our problem didn't seem to directly correspond with the 1 GB threshold
> > -- it wouldn't manifest itself until the server had allocated all 4 GB
> > of RAM. After a reboot, it would be nice and speedy again for a day or
> > two until all the memory was being used for buffering again.
>
> This was the behavior I saw as well. I did a bunch of research and source
> reading before actually figuring out what was going on; it wasn't a well
> documented bug for some reason... I guess there aren't that many people
> running large boxes using 2.4.
>
> This makes me think that the problems I saw with 2GB were not related to
> the IO subsystem, but were something else. Time to go play around a bit;
> getting those boxes up to 2GB without having to do a kernel patch/upgrade
> cycle would be nice.
>
> M

-- 
Benjamin Sherman
Software Developer
Iowa Interactive, Inc
515-323-3468 x14
[EMAIL PROTECTED]
Re: I/O performance issues on 2.4.23 SMP system
On Fri, 30 Jan 2004 01:02, Jeff S Wheeler <[EMAIL PROTECTED]> wrote:
> I don't know anything about this 2.4.23 I/O problem, but I will tell you
> that RAID 5 is not the way to go for big SQL performance. In a RAID 5
> array, all the heads must move for every operation. You already spent a
> lot of money on that server. I suggest you buy more disks for RAID 10.

Any decent RAID-5 implementation will have a non-volatile write-back cache. This will hugely increase performance, as it allows the possibility of combining writes. NB: this is something that Linux software RAID lacks support for.

Moving all heads is not required for every operation. Reading from all disks is not required for a read unless an entire line is to be brought in; last time I did read benchmarks it seemed that this wasn't being done on Mylex RAID controllers or Sun Metadisk (I have never done any real tests on Linux software RAID-5).

Reading from all disks is not necessarily required for a one-block write either. Reading the block that is to be written and the parity block is enough: the new parity block will be old_block ^ old_parity ^ new_block. Doing reads from and writes to two disks should be significantly faster than reads from all disks and writes to two.

The benchmark results Craig Sanders posted when comparing RAID-5 and RAID-10 were surprising: RAID-5 won many of the test scenarios! I recall that Craig posted the results to this list; a google search should return them.

-- 
http://www.coker.com.au/selinux/   My NSA Security Enhanced Linux packages
http://www.coker.com.au/bonnie++/  Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/    Postal SMTP/POP benchmark
http://www.coker.com.au/~russell/  My home page
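The single-block parity update described above (new_parity = old_block ^ old_parity ^ new_block) can be checked in a few lines. This is a toy illustration in Python, not RAID controller code, using tiny four-byte "blocks" on a hypothetical 4-disk stripe:

```python
# Toy model of a RAID-5 single-block update (read-modify-write).
# Only the data block being rewritten and the parity block are touched;
# the remaining data disks are never read.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# One stripe of a 4-disk RAID-5: three data blocks plus parity.
d = [b"\x11" * 4, b"\x22" * 4, b"\x33" * 4]
parity = xor_blocks(xor_blocks(d[0], d[1]), d[2])

# Rewrite block 1: read old data block and old parity, write new ones.
old_block, new_block = d[1], b"\xAA" * 4
parity = xor_blocks(xor_blocks(old_block, parity), new_block)  # old ^ parity ^ new
d[1] = new_block

# The updated parity still reconstructs any lost block, e.g. disk 0:
rebuilt = xor_blocks(xor_blocks(d[1], d[2]), parity)
assert rebuilt == d[0]
print("parity consistent:", rebuilt == d[0])
```

So a one-block write costs two reads and two writes (data + parity) regardless of array width, which is why it can beat the read-the-whole-stripe approach on wide arrays.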
Re: I/O performance issues on 2.4.23 SMP system
Daniel Erat said on Thu, Jan 29, 2004 at 08:08:49AM -0800:
> I was the poster who initiated the previous thread on this subject. The
> problem disappeared here after we went down to 2 GB of memory (although
> we physically removed it from the server rather than passing the arg to
> the kernel... shouldn't make a difference though, I'd imagine). We went
> straight from 4 GB to 2 GB, so I can't comment on the results of using 3
> GB.
>
> Our problem didn't seem to directly correspond with the 1 GB threshold
> -- it wouldn't manifest itself until the server had allocated all 4 GB
> of RAM. After a reboot, it would be nice and speedy again for a day or
> two until all the memory was being used for buffering again.

This was the behavior I saw as well. I did a bunch of research and source reading before actually figuring out what was going on; it wasn't a well documented bug for some reason... I guess there aren't that many people running large boxes using 2.4.

This makes me think that the problems I saw with 2GB were not related to the IO subsystem, but were something else. Time to go play around a bit; getting those boxes up to 2GB without having to do a kernel patch/upgrade cycle would be nice.

M
Re: I/O performance issues on 2.4.23 SMP system
On Wed, Jan 28, 2004 at 01:38:29PM -0800, Mark Ferlatte wrote: [snip]
> The problem (bug) is that block device IO has to go through buffers
> that are below 1GB. The memory manager doesn't know this, so what
> happens is that the IO layer requests a block of memory below 1GB, and
> the swapout daemon (kswapd) then runs around like a madman trying to
> free pages, instead of shuffling pages that don't need to be below 1GB
> to higher memory addresses. Since many of the pages below 1GB can't
> be freed (they belong to active programs), the IO starves.
>
> With 1GB of memory, both the IO layer and the swapout daemon are
> working with the same view of memory, so the bug is concealed, and
> performance is good.
>
> I have heard of people trying 2GB, and having it work, but it didn't
> for me.

I was the poster who initiated the previous thread on this subject. The problem disappeared here after we went down to 2 GB of memory (although we physically removed it from the server rather than passing the arg to the kernel... shouldn't make a difference though, I'd imagine). We went straight from 4 GB to 2 GB, so I can't comment on the results of using 3 GB.

Our problem didn't seem to directly correspond with the 1 GB threshold -- it wouldn't manifest itself until the server had allocated all 4 GB of RAM. After a reboot, it would be nice and speedy again for a day or two until all the memory was being used for buffering again.

Dan
Re: I/O performance issues on 2.4.23 SMP system
> The problem (bug) is that block device IO has to go through buffers
> that are below 1GB. The memory manager doesn't know this, so what
> happens is that the IO layer requests a block of memory below 1GB, and
> the swapout daemon (kswapd) then runs around like a madman trying to
> free pages, instead of shuffling pages that don't need to be below 1GB
> to higher memory addresses. Since many of the pages below 1GB can't
> be freed (they belong to active programs), the IO starves.
>
> With 1GB of memory, both the IO layer and the swapout daemon are
> working with the same view of memory, so the bug is concealed, and
> performance is good.
>
> I have heard of people trying 2GB, and having it work, but it didn't
> for me.

Right, I have seen a 2GB success story. Do you know if this is fixed in kernel 2.6.x?

-- 
Benjamin Sherman
Software Developer
Iowa Interactive, Inc
515-323-3468 x14
[EMAIL PROTECTED]
Re: I/O performance issues on 2.4.23 SMP system
> Is this problem specific to the 3ware cards? does anyone know of any
> issues with the Highpoint 1640 SATA RAID cards? Any experience or
> recommendations with these?

No, this issue is not specific to 3ware cards. The original poster had a QLogic fibre channel card and Adaptec SCSI.

-- 
Benjamin Sherman
Software Developer
Iowa Interactive, Inc
515-323-3468 x14
[EMAIL PROTECTED]
Re: I/O performance issues on 2.4.23 SMP system
On Tue, 2004-01-27 at 16:49, Benjamin Sherman wrote:
> I have a server running dual 2.66Ghz Xeons and 4GB RAM, in a
> PenguinComputing Relion 230S system. It has a 3ware RAID card with 3
> 120GB SATA drives in RAID5. It is currently running Debian 3.0 w/
> vanilla kernel 2.4.23, HIGHMEM4G=y, HIGHIO=y, SMP=y, ACPI=y. I see the
> problem with ACPI and HT turned off OR if I leave them on.

I don't know anything about this 2.4.23 I/O problem, but I will tell you that RAID 5 is not the way to go for big SQL performance. In a RAID 5 array, all the heads must move for every operation. You already spent a lot of money on that server. I suggest you buy more disks for RAID 10.

-- 
Jeff
Re: I/O performance issues on 2.4.23 SMP system
Mark Ferlatte wrote:
> Benjamin Sherman said on Wed, Jan 28, 2004 at 03:16:56PM -0600:
> > > I've got some machines in nearly the same configuration. What I ended
> > > up doing was to put an `append="mem=1G"' in the lilo.conf boot stanza
> > > for the kernel I was using, and rebooted the machine in question.
> > >
> > > This does reduce the available memory in the machine to 1GB, but
> > > solves the IO problem. In my case, it was much faster, even though
> > > MySQL couldn't buffer nearly as much as with 4GB.
> >
> > Thanks, Mark. I will probably try this with 3GB instead of 1GB. Did you
> > try that?
>
> Yes; it didn't work.
>
> The problem (bug) is that block device IO has to go through buffers that
> are below 1GB. The memory manager doesn't know this, so what happens is
> that the IO layer requests a block of memory below 1GB, and the swapout
> daemon (kswapd) then runs around like a madman trying to free pages,
> instead of shuffling pages that don't need to be below 1GB to higher
> memory addresses. Since many of the pages below 1GB can't be freed (they
> belong to active programs), the IO starves.
>
> With 1GB of memory, both the IO layer and the swapout daemon are working
> with the same view of memory, so the bug is concealed, and performance
> is good.
>
> I have heard of people trying 2GB, and having it work, but it didn't
> for me.
>
> M

Is this problem specific to the 3ware cards? Does anyone know of any issues with the Highpoint 1640 SATA RAID cards? Any experience or recommendations with these? Which is the best SATA RAID card for Linux at the moment?

Thanks,
José

PS. please reply to the list.
Re: I/O performance issues on 2.4.23 SMP system
Benjamin Sherman said on Wed, Jan 28, 2004 at 03:16:56PM -0600:
> > I've got some machines in nearly the same configuration. What I ended up
> > doing was to put an `append="mem=1G"' in the lilo.conf boot stanza for the
> > kernel I was using, and rebooted the machine in question.
> >
> > This does reduce the available memory in the machine to 1GB, but solves
> > the IO problem. In my case, it was much faster, even though MySQL
> > couldn't buffer nearly as much as with 4GB.
>
> Thanks, Mark. I will probably try this with 3GB instead of 1GB. Did you
> try that?

Yes; it didn't work.

The problem (bug) is that block device IO has to go through buffers that are below 1GB. The memory manager doesn't know this, so what happens is that the IO layer requests a block of memory below 1GB, and the swapout daemon (kswapd) then runs around like a madman trying to free pages, instead of shuffling pages that don't need to be below 1GB to higher memory addresses. Since many of the pages below 1GB can't be freed (they belong to active programs), the IO starves.

With 1GB of memory, both the IO layer and the swapout daemon are working with the same view of memory, so the bug is concealed, and performance is good.

I have heard of people trying 2GB, and having it work, but it didn't for me.

M
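Mark's explanation can be put in back-of-the-envelope terms. The sketch below is plain Python, not kernel code; the 1 GB low-memory line is the figure from his message, and the point is just that the fraction of RAM the block layer can target directly shrinks as total RAM grows, so the two "views of memory" diverge more the bigger the box.

```python
# Toy sketch (not kernel code) of the constraint described above:
# 2.4 block IO buffers must sit below a 1 GB physical line, so the
# fraction of RAM directly usable for them shrinks as total RAM grows.

GB = 1024 ** 3
LOW_ZONE_LIMIT = 1 * GB   # the "below 1GB" line from the message above

def io_addressable_fraction(total_ram: int) -> float:
    """Fraction of RAM the block layer can target directly."""
    return min(total_ram, LOW_ZONE_LIMIT) / total_ram

for ram_gb in (1, 2, 4, 10):
    frac = io_addressable_fraction(ram_gb * GB)
    print(f"{ram_gb:2d} GB RAM: {frac:.0%} directly usable for block IO")
```

With mem=1G the fraction is 100% and the IO layer and kswapd agree on what memory looks like, which is why the lilo workaround hides the bug; at 4 GB only a quarter of RAM qualifies, and kswapd thrashes trying to free low pages that are pinned by running programs.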
Re: I/O performance issues on 2.4.23 SMP system
> > * Is the I/O patch referenced (by Ingo Molnar) available for 2.4.24?
>
> Possibly; it's certainly not merged into 2.4.24.

Can anyone point me to the specific patch?

> I've got some machines in nearly the same configuration. What I ended up
> doing was to put an `append="mem=1G"' in the lilo.conf boot stanza for the
> kernel I was using, and rebooted the machine in question.
>
> This does reduce the available memory in the machine to 1GB, but solves
> the IO problem. In my case, it was much faster, even though MySQL couldn't
> buffer nearly as much as with 4GB.

Thanks, Mark. I will probably try this with 3GB instead of 1GB. Did you try that?

-- 
Benjamin Sherman
Software Developer
Iowa Interactive, Inc
515-323-3468 x14
[EMAIL PROTECTED]
Re: I/O performance issues on 2.4.23 SMP system
Benjamin Sherman said on Tue, Jan 27, 2004 at 03:49:24PM -0600:
> So, I have a couple of questions because this box made it to production
> before the problem was discovered and I can't test as I'd like.
>
> * If I were to use 64GB HIGHMEM support. Would this problem go away?

Nope.

> * Is the I/O patch referenced (by Ingo Molnar) available for 2.4.24?

Possibly; it's certainly not merged into 2.4.24.

> * Is the patch available individually, if so, where can it be found? I
>   googled quite a bit, but didn't find anything definite.
>
> Any thoughts or suggestions?

I've got some machines in nearly the same configuration. What I ended up
doing was to put an `append="mem=1G"' in the lilo.conf boot stanza for the
kernel I was using, and rebooted the machine in question.

This does reduce the available memory in the machine to 1GB, but solves the
IO problem. In my case, it was much faster, even though MySQL couldn't
buffer nearly as much as with 4GB.

M
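For reference, the mem=1G workaround described above would look something
like this in lilo.conf (the image path and label here are illustrative, not
taken from the thread; run lilo and reboot after editing):

```
image=/boot/vmlinuz-2.4.23
    label=Linux-mem1G
    read-only
    append="mem=1G"
```

The mem= parameter caps the RAM the kernel will use at boot, so it has the
same effect as physically removing the extra memory.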
I/O performance issues on 2.4.23 SMP system
I am following up a message sent to this list:

# Subject: severe I/O performance issues on 2.4.22 SMP system
# From: Daniel Erat <[EMAIL PROTECTED]>
# Date: Fri, 31 Oct 2003 12:38:38 -0800

I have a server running dual 2.66GHz Xeons and 4GB RAM, in a Penguin
Computing Relion 230S system. It has a 3ware RAID card with 3 120GB SATA
drives in RAID5. It is currently running Debian 3.0 with vanilla kernel
2.4.23, HIGHMEM4G=y, HIGHIO=y, SMP=y, ACPI=y. I see the problem whether
ACPI and HT are turned off or left on.

I think my problem is perhaps the same as Mr. Erat's. Basically, I/O on this
box sucks. A good example of the problem is importing identical data into
MySQL. On this box, importing a dataset takes roughly 20 minutes. On another
dev server (single Athlon 2GHz, 1GB RAM, software RAID5 over FireWire), with
identical MySQL and dataset, the same import takes roughly 4.5 minutes.

So, I have a couple of questions because this box made it to production
before the problem was discovered and I can't test as I'd like.

* If I were to use 64GB HIGHMEM support, would this problem go away?
* Is the I/O patch referenced (by Ingo Molnar) available for 2.4.24? Or is
  the patch going to be in the kernel anytime soon?
* Is the patch available individually, and if so, where can it be found? I
  googled quite a bit, but didn't find anything definite.

Any thoughts or suggestions? Thanks!

--
Benjamin Sherman
Iowa Interactive, Inc
[EMAIL PROTECTED]