Re: 4G SGI quad Xeon - memory-related slowdowns
> Sounds like a true-to-God bug. Possibly in the form of incorrect MTRR
> settings. Make sure you enable MTRR support.

MTRR is enabled - here's the dump from /proc/mtrr:

reg00: base=0xc0000000 (3072MB), size=1024MB: uncachable, count=1
reg01: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
reg02: base=0x100000000 (4096MB), size=1024MB: write-back, count=1

Note that the sizes sum to six gigabytes, and we only have four in the
box. We've discovered that if we turn off PCI hotplug support and
resource remapping in the BIOS, then /proc/mtrr looks more sensible:

reg00: base=0xfc000000 (4032MB), size=  64MB: uncachable, count=1
reg01: base=0x00000000 (   0MB), size=4096MB: write-back, count=1

However, the 64G-compiled kernel still hangs. The complete task dump is
at

http://quirk.fnal.gov/xeon/

It looks like everything is blocked on lock_get_status.

> I do need more information on what seems to hang, and how it hangs. One
> of the pre-kernels will give you a nice stack backtrace for each process
> if you press control-scrolllock, and that might be useful.

Sysreq dump, plus meminfo and MTRR are at the above URL. Please let me
know if you need any other information.

FYI, with the revised BIOS settings, even the 4G-compiled kernel sees
and uses the full 4G. So if this problem turns out to take time to fix,
we can still get full use from the machine in the interim.

Many thanks for the help!

-Paul

--
Paul Hubbard
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
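A note on the "six gigabytes" arithmetic above: adding up the size= fields counts a region twice wherever an uncachable entry overlaps a write-back one, and on x86 the uncachable type takes precedence in such an overlap. The C sketch below is one way to redo the sum with that in mind. It is only an illustration: the struct and function names are made up here, it reads just the decimal MB fields of each line, and it assumes that write-back entries do not overlap each other and that uncachable entries do not overlap each other.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_REGS 8	/* illustrative cap, not a kernel constant */

struct mtrr_reg {
	unsigned long base_mb;
	unsigned long size_mb;
	int uncachable;		/* 1 = uncachable, 0 = write-back */
};

/* Parse one /proc/mtrr line via its "(NNNNMB)" and "size=NNNNMB" fields. */
static int mtrr_parse_line(const char *line, struct mtrr_reg *r)
{
	const char *paren = strchr(line, '(');
	const char *size = strstr(line, "size=");
	char type[32];

	if (!paren || !size)
		return 0;
	if (sscanf(paren, "(%luMB)", &r->base_mb) != 1)
		return 0;
	if (sscanf(size, "size=%luMB: %31[^,]", &r->size_mb, type) != 2)
		return 0;
	if (strcmp(type, "uncachable") == 0)
		r->uncachable = 1;
	else if (strcmp(type, "write-back") == 0)
		r->uncachable = 0;
	else
		return 0;	/* other memory types are skipped in this sketch */
	return 1;
}

/* Split a multi-line dump and parse each line; returns entries stored. */
static int mtrr_parse_dump(const char *text, struct mtrr_reg *regs, int max)
{
	int n = 0;

	assert(text && regs);
	while (*text && n < max) {
		const char *eol = strchr(text, '\n');
		size_t len = eol ? (size_t)(eol - text) : strlen(text);
		char line[128];

		if (len < sizeof(line)) {
			memcpy(line, text, len);
			line[len] = '\0';
			if (mtrr_parse_line(line, &regs[n]))
				n++;
		}
		text += len;
		if (*text == '\n')
			text++;
	}
	return n;
}

/* Plain sum of every size= field, regardless of type or overlap. */
static unsigned long mtrr_raw_sum_mb(const struct mtrr_reg *regs, int n)
{
	unsigned long sum = 0;
	int i;

	for (i = 0; i < n; i++)
		sum += regs[i].size_mb;
	return sum;
}

/* Size of the intersection of two ranges, in MB. */
static unsigned long overlap_mb(const struct mtrr_reg *a,
				const struct mtrr_reg *b)
{
	unsigned long start = a->base_mb > b->base_mb ? a->base_mb : b->base_mb;
	unsigned long a_end = a->base_mb + a->size_mb;
	unsigned long b_end = b->base_mb + b->size_mb;
	unsigned long end = a_end < b_end ? a_end : b_end;

	return end > start ? end - start : 0;
}

/* Write-back coverage left after uncachable ranges are carved out. */
static unsigned long mtrr_effective_wb_mb(const struct mtrr_reg *regs, int n)
{
	unsigned long wb = 0;
	int i, j;

	for (i = 0; i < n; i++)
		if (!regs[i].uncachable)
			wb += regs[i].size_mb;
	for (i = 0; i < n; i++) {
		if (!regs[i].uncachable)
			continue;
		for (j = 0; j < n; j++)
			if (!regs[j].uncachable)
				wb -= overlap_mb(&regs[i], &regs[j]);
	}
	return wb;
}
```

Under those assumptions, the first dump gives a raw sum of 6144 MB but 4096 MB of write-back coverage, and the second gives 4160 MB raw and 4032 MB write-back.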
Re: 4G SGI quad Xeon - memory-related slowdowns
Hi Paul,

> 2) Other block I/O output (eg dd if=/dev/zero of=/dev/sdi bs=4M) also
> run very slowly

What do you notice when running "top" and doing the above?
Does the "buff" value grow high (+700MB), with high CPU usage?

If so, I think this might be down to nr_free_buffer_pages().

This function includes the pages in all zones (including the HIGHMEM
zone) in its calculations, while only DMA and NORMAL zone pages are used
for buffers. This upsets the result from balance_dirty_state()
(fs/buffer.c), and as a result the required flushing of buffers is only
done as a result of running very low on pages in the DMA and NORMAL
zones.

I've attached a "quick hack" I did for 2.4.0. It doesn't completely
solve the problem, but moves it in the right direction.

Please let me know if this helps.

Mark

diff -urN -X dontdiff linux-2.4.0/mm/page_alloc.c markhe-2.4.0/mm/page_alloc.c
--- linux-2.4.0/mm/page_alloc.c	Wed Jan 3 17:59:06 2001
+++ markhe-2.4.0/mm/page_alloc.c	Mon Jan 15 15:35:14 2001
@@ -583,6 +583,27 @@
 }
 
 /*
+ * Free pages in zone "type", and the zones below it.
+ */
+unsigned int nr_free_pages_zone (int type)
+{
+	unsigned int sum;
+	zone_t *zone;
+	pg_data_t *pgdat = pgdat_list;
+
+	if (type >= MAX_NR_ZONES)
+		BUG();
+
+	sum = 0;
+	while (pgdat) {
+		for (zone = pgdat->node_zones; zone < pgdat->node_zones + type; zone++)
+			sum += zone->free_pages;
+		pgdat = pgdat->node_next;
+	}
+	return sum;
+}
+
+/*
  * Total amount of inactive_clean (allocatable) RAM:
  */
 unsigned int nr_inactive_clean_pages (void)
@@ -600,6 +621,25 @@
 	return sum;
 }
 
+unsigned int nr_inactive_clean_pages_zone(int type)
+{
+	unsigned int sum;
+	zone_t *zone;
+	pg_data_t *pgdat = pgdat_list;
+
+	if (type >= MAX_NR_ZONES)
+		BUG();
+	type++;
+
+	sum = 0;
+	while (pgdat) {
+		for (zone = pgdat->node_zones; zone < pgdat->node_zones + type; zone++)
+			sum += zone->inactive_clean_pages;
+		pgdat = pgdat->node_next;
+	}
+	return sum;
+}
+
 /*
  * Amount of free RAM allocatable as buffer memory:
  */
@@ -607,9 +647,9 @@
 {
 	unsigned int sum;
 
-	sum = nr_free_pages();
-	sum += nr_inactive_clean_pages();
-	sum += nr_inactive_dirty_pages;
+	sum = nr_free_pages_zone(ZONE_NORMAL);
+	sum += nr_inactive_clean_pages_zone(ZONE_NORMAL);
+	sum += nr_inactive_dirty_pages;	/* XXX */
 
 	/*
 	 * Keep our write behind queue filled, even if
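The idea behind the patch, counting only pages in zones that buffers can actually use, can be tried out in user space with a small model. The sketch below is not kernel code: `zone_model` and `node_model` are stand-ins invented here for `zone_t` and `pg_data_t`, and the zone indices follow the 2.4 ordering (DMA, NORMAL, HIGHMEM). As posted, only nr_inactive_clean_pages_zone() increments `type` before the loop; this model uses an inclusive upper bound, which is an assumption about the intent stated in the patch comment ("zone type, and the zones below it").

```c
#include <assert.h>
#include <stddef.h>

/* Zone indices in the 2.4 ordering; the model types below are hypothetical. */
enum { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct zone_model {
	unsigned int free_pages;
};

struct node_model {
	struct zone_model zones[MAX_NR_ZONES];
	struct node_model *next;	/* next memory node, or NULL */
};

/*
 * Sum free pages in zones 0..type (inclusive) across all nodes.
 * The assert plays the role of the BUG() check in the patch.
 */
static unsigned int free_pages_up_to(const struct node_model *node, int type)
{
	unsigned int sum = 0;

	assert(type >= 0 && type < MAX_NR_ZONES);
	for (; node != NULL; node = node->next)
		for (int z = 0; z <= type; z++)
			sum += node->zones[z].free_pages;
	return sum;
}
```

With a single node holding 100 DMA, 2000 NORMAL and 50000 HIGHMEM free pages, limiting the sum to ZONE_NORMAL yields 2100 rather than 52100, which is the kind of gap that would skew a dirty-buffer threshold computed from the larger figure.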
Re: 4G SGI quad Xeon - memory-related slowdowns
On 15 Jan 2001, Linus Torvalds wrote:

> The performance problem is _probably_ due to the kernel having to
> double-buffer the IO requests, coupled with bad MTRR settings (ie
> memory above the 4GB range is probably marked as non-cacheable or
> something, which means that you'll get really bad performance).

the highmem related double-buffering alone on such a category of system
is minuscule, compared to other costs of IO, and considering the
expected bandwidth (20-30 MB/sec). the MTRR part could be a problem.

> Not using the high memory will avoid the double-buffering, and will
> also avoid using memory that isn't cached. If I'm right.

> The hang still indicates that something is wrong in PAE-land, though.

it's working just fine on all 4GB+ systems tested (including 32GB
systems), Intel, Dell, Compaq boxes. So if it's a unique PAE bug, then
it must be some boundary condition.

Paul, here is the memory map of my 8GB system:

BIOS-provided physical RAM map:
 BIOS-e820: 0009d400 @ (usable)
 BIOS-e820: 2c00 @ 0009d400 (reserved)
 BIOS-e820: 0002 @ 000e (reserved)
 BIOS-e820: 03ef8000 @ 0010 (usable)
 BIOS-e820: 7c00 @ 03ff8000 (ACPI data)
 BIOS-e820: 0400 @ 03fffc00 (ACPI NVS)
 BIOS-e820: ec00 @ 0400 (usable)
 BIOS-e820: 0140 @ fec0 (reserved)
 BIOS-e820: f000 @ 0001 (usable)

and here are the MTRR settings:

[root@m mingo]# cat /proc/mtrr
reg00: base=0xf0000000 (3840MB), size= 256MB: uncachable, count=1
reg01: base=0x00000000 (   0MB), size=4096MB: write-back, count=1
reg02: base=0x100000000 (4096MB), size=2048MB: write-back, count=1
reg03: base=0x180000000 (6144MB), size=1024MB: write-back, count=1
reg04: base=0x1c0000000 (7168MB), size= 512MB: write-back, count=1
reg05: base=0x1e0000000 (7680MB), size= 256MB: write-back, count=1

i'd suggest using the mem=exactmap feature to force different types of
memory maps. Eg. i'm using the following append= line to force a 800 MB
setup:

 append="mem=exactmap mem=0x0009d800@0x mem=0x03ef8000@0x0010 mem=0x2bffe000@0x0400"

such mem=exactmap lines can be constructed based on the BIOS output.

	Ingo
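Building such a line from a memory map is mechanical: emit `mem=exactmap`, then one `mem=SIZE@BASE` term per usable region. A rough C sketch follows; the `e820_region` struct and `exactmap_build()` are hypothetical names introduced for illustration, and the caller is assumed to have already transcribed the usable ranges from the boot messages.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One entry of a BIOS memory map; a hypothetical type for this sketch. */
struct e820_region {
	unsigned long long base;
	unsigned long long size;
	int usable;		/* 1 = usable RAM, 0 = reserved/ACPI/etc. */
};

/*
 * Write "mem=exactmap mem=SIZE@BASE ..." for every usable region.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int exactmap_build(char *buf, size_t len,
			  const struct e820_region *map, int n)
{
	int i;

	assert(buf && map);
	if (snprintf(buf, len, "mem=exactmap") >= (int)len)
		return -1;
	for (i = 0; i < n; i++) {
		size_t used = strlen(buf);
		int w;

		if (!map[i].usable)
			continue;
		w = snprintf(buf + used, len - used, " mem=0x%llx@0x%llx",
			     map[i].size, map[i].base);
		if (w < 0 || (size_t)w >= len - used)
			return -1;
	}
	return 0;
}
```

To force a smaller setup, as in the 800 MB example, one would shrink or drop the last usable entries before calling the builder; reserved and ACPI ranges are never emitted.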
Re: 4G SGI quad Xeon - memory-related slowdowns
In article <[EMAIL PROTECTED]>,
Paul Hubbard <[EMAIL PROTECTED]> wrote:
>
> We're having some problems with the 2.4.0 kernel on our SGI 1450, and
> were hoping for some help.
> The box is a quad Xeon 700/2MB, with 4GB of memory, ServerSet III HE
> chipset, RH6.1 (slightly modified for local configuration) distribution.
>
> a) If we compile the kernel with no high memory support, /proc/meminfo
> shows 1G of memory and everything works fine.

Good.

> b) If we compile for 4G of memory, /proc/meminfo shows about 3G, and
> overriding the amount at the lilo prompt causes kernel panics at bootup.
> However, other than missing a quarter of the memory, it works just
> fine.

3GB is right - your last 1GB is above the 4GB mark, and it's mapped
there explicitly so that you'll have space in the low 32 bits to map PCI
devices etc (and things like the APIC, you get the idea).

If you try to override it, you will very obviously crash, because if you
tell Linux that you have 4GB of memory, Linux will think that you have
4GB of _contiguous_ memory, which is not true.

The only way to use that last gigabyte is to enable support for memory
> 4GB, and get the proper memory map _without_ any overrides that shows
the proper holes for PCI space.

Check your "dmesg" output under a working kernel for details - you'll
see how the memory is laid out and reported by the e820 call..

> c) If we compile the kernel for 64G high memory (PAE mode), we see all
> of the memory but have other problems:
> i) mke2fs -m0 on a 72GB Seagate SCSI disk runs very slowly (about
> 5MB/sec instead of 22-25) and the machine hangs after the format
> completes. To be exact, the command prompt returns, but
> ls or any other command will never return, and you have to reset
> the box. This is a
> showstopper for us!

Sounds like a true-to-God bug. Possibly in the form of incorrect MTRR
settings. Make sure you enable MTRR support.

I do need more information on what seems to hang, and how it hangs. One
of the pre-kernels will give you a nice stack backtrace for each process
if you press control-scrolllock, and that might be useful.

> ii) If I override the amount of memory via lilo, we still get the
> hang, but performance actually improves!

The performance problem is _probably_ due to the kernel having to
double-buffer the IO requests, coupled with bad MTRR settings (ie memory
above the 4GB range is probably marked as non-cacheable or something,
which means that you'll get really bad performance).

Not using the high memory will avoid the double-buffering, and will also
avoid using memory that isn't cached. If I'm right.

The hang still indicates that something is wrong in PAE-land, though.

		Linus
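The "3GB is right" point can be checked with a little arithmetic on a memory map: a kernel limited to 32-bit physical addresses can only count RAM that lies below the 4GB mark, so a gigabyte relocated above it simply is not seen. The sketch below assumes a simplified two-region layout (3GB at address 0, 1GB starting at 4GB); the actual map on the SGI 1450 is not given in the thread, and the names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* A contiguous range of usable RAM; hypothetical type for this sketch. */
struct ram_region {
	unsigned long long base;
	unsigned long long size;
};

/* Bytes of RAM whose physical addresses fall below 'limit'. */
static unsigned long long usable_below(const struct ram_region *map, int n,
				       unsigned long long limit)
{
	unsigned long long sum = 0;
	int i;

	assert(map != NULL || n == 0);
	for (i = 0; i < n; i++) {
		unsigned long long end = map[i].base + map[i].size;

		if (map[i].base >= limit)
			continue;	/* region lies entirely above the limit */
		if (end > limit)
			end = limit;	/* clip a region that straddles it */
		sum += end - map[i].base;
	}
	return sum;
}
```

For the assumed layout, a 4GB limit yields 3072 MB and a 64GB (PAE) limit yields the full 4096 MB, matching observations b) and c) in the original report.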
4G SGI quad Xeon - memory-related slowdowns
We're having some problems with the 2.4.0 kernel on our SGI 1450, and
were hoping for some help.

The box is a quad Xeon 700/2MB, with 4GB of memory, ServerSet III HE
chipset, RH6.1 (slightly modified for local configuration) distribution.

a) If we compile the kernel with no high memory support, /proc/meminfo
shows 1G of memory and everything works fine.

b) If we compile for 4G of memory, /proc/meminfo shows about 3G, and
overriding the amount at the lilo prompt causes kernel panics at bootup.
However, other than missing a quarter of the memory, it works just
fine.

c) If we compile the kernel for 64G high memory (PAE mode), we see all
of the memory but have other problems:

  i) mke2fs -m0 on a 72GB Seagate SCSI disk runs very slowly (about
  5MB/sec instead of 22-25) and the machine hangs after the format
  completes. To be exact, the command prompt returns, but ls or any
  other command will never return, and you have to reset the box. This
  is a showstopper for us!

  ii) If I override the amount of memory via lilo, we still get the
  hang, but performance actually improves! At 1G, it's slow for a few
  seconds, and then runs fine. At 2G, it's slow, and when I tried to
  boot 3G I got an odd startup crash that I've not had time to
  replicate.

Other notes:

1) SCSI is onboard Adaptec 39160 (aic7xxx driver, dual-channel) and
we've tried different drives, cables, terminators, etc.
2) Other block I/O output (eg dd if=/dev/zero of=/dev/sdi bs=4M) also
runs very slowly
3) We are using vmstat 1 to monitor data rates
4) I tried the format with 2.4 prerelease, and the mkfs was very slow,
and I got a SCSI reset at the end of the format. Perhaps this is
related?
5) If necessary, we can easily load a different distribution on the
machine if that might be part of the problem.

If necessary, we can set up a login on the machine, or run whatever test
code is necessary. Other than this, it's a pretty nice box to work on.

Please reply to rjetton and phubbard at fnal.gov, thanks.

-Paul

--
Paul Hubbard
[EMAIL PROTECTED]