Re: arcmsr & areca-1660 - strange behaviour under heavy load
Hi Andrew,

no, right now I have the machine in the weird state: swap is empty (3GB),
and so is the bigger part of RAM (~100MB free), and gcc crashes even when
trying to compile a C program with an empty main function. So it doesn't
seem to be a problem with memory exhaustion.
Hopefully the Areca guys will be able to find out what is going on. But
anyway, if You have any other idea what I should check/try, please let me
know, as I have to admit that I'd really like to hunt it down myself (and
yes, there is some vanity on my side here :))
thanks a lot once more
cheers
nik

On Tue, 26 Feb 2008, Andrew Morton wrote:

> On Tue, 26 Feb 2008 10:35:31 +0100 (CET) Nikola Ciprich <[EMAIL PROTECTED]> wrote:
>
> > Hi
> >
> > On Sun, 24 Feb 2008, Andrew Morton wrote:
> >
> > Hi Andrew,
> > thanks a lot for reply, I'm attaching requested information.
> > please let me know if You need more information/testing, whatever.
> > I'll be glad to help.
> > BR
> > nik
> >
> > >> Areca support doesn't seem to be very interested in the problem :-(
> > >
> > > (cc's added)
> > >
> > > Please get the machine into this state of memory exhaustion then take
> > > copies of the output of the following, and send them via reply-to-all to
> > > this email:
> > >
> > > - cat /proc/meminfo
> > >
> > > - cat /proc/slabinfo
> > >
> > > - dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c
> > >
> > > Thanks.
>
> Alas, that all looks OK to me. You never get any out-of-memory messages,
> and no oom-killing messages?
>
> Possibly what is happening here is that in this low-memory condition, some
> of the driver's internal memory-allocation attempts are failing, and the
> driver isn't correctly handling this. This is a rare situation which may
> well not have been hit in anyone else's testing.
>
> I expect that the Areca engineers will be able to reproduce this with a
> suitably small "mem=" kernel boot option. If not, they could perhaps
> investigate the kernel's fault-injection framework, which permits
> simulation of page allocation failures.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: arcmsr & areca-1660 - strange behaviour under heavy load
On Tue, 26 Feb 2008 10:35:31 +0100 (CET) Nikola Ciprich <[EMAIL PROTECTED]> wrote:

> Hi
>
> On Sun, 24 Feb 2008, Andrew Morton wrote:
>
> Hi Andrew,
> thanks a lot for reply, I'm attaching requested information.
> please let me know if You need more information/testing, whatever.
> I'll be glad to help.
> BR
> nik
>
> >> Areca support doesn't seem to be very interested in the problem :-(
> >
> > (cc's added)
> >
> > Please get the machine into this state of memory exhaustion then take
> > copies of the output of the following, and send them via reply-to-all to
> > this email:
> >
> > - cat /proc/meminfo
> >
> > - cat /proc/slabinfo
> >
> > - dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c
> >
> > Thanks.

Alas, that all looks OK to me. You never get any out-of-memory messages,
and no oom-killing messages?

Possibly what is happening here is that in this low-memory condition, some
of the driver's internal memory-allocation attempts are failing, and the
driver isn't correctly handling this. This is a rare situation which may
well not have been hit in anyone else's testing.

I expect that the Areca engineers will be able to reproduce this with a
suitably small "mem=" kernel boot option. If not, they could perhaps
investigate the kernel's fault-injection framework, which permits
simulation of page allocation failures.
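
The failure mode suggested above (an allocation that fails and a caller that
does not check for it) can be illustrated with a small toy model. This is only
a sketch in Python, not kernel code, and it says nothing about what arcmsr
actually does: FailingAllocator, submit_unchecked and submit_checked are
hypothetical names invented for the illustration, and "fail every N-th
request" is just one simple way to imitate what a fault-injection facility does.

```python
class FailingAllocator:
    """Toy allocator: every `interval`-th request fails by returning None.

    Hypothetical stand-in for an injected allocation failure; it is not
    related to the real kernel fault-injection interface.
    """

    def __init__(self, interval):
        self.interval = interval
        self.calls = 0

    def alloc(self, size):
        self.calls += 1
        if self.interval and self.calls % self.interval == 0:
            return None  # injected failure
        return bytearray(size)


def submit_unchecked(allocator, payload):
    """Caller that assumes the allocation always succeeds."""
    buf = allocator.alloc(len(payload))
    buf[:] = payload  # raises TypeError when buf is None
    return bytes(buf)


def submit_checked(allocator, payload):
    """Caller that handles a failed allocation by reporting it."""
    buf = allocator.alloc(len(payload))
    if buf is None:
        return None  # let the caller retry or fail the request cleanly
    buf[:] = payload
    return bytes(buf)


if __name__ == "__main__":
    alloc = FailingAllocator(interval=3)
    for i in range(4):
        print(i, submit_checked(alloc, b"io"))
```

With failures injected on purpose, the unchecked path breaks on the first
failed request, while the checked path reports it and carries on. That is the
kind of difference a small "mem=" setting or fault injection is meant to
expose during testing.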
RE: arcmsr & areca-1660 - strange behaviour under heavy load
Hi Nikola,

As I said, we will test on our site.
Our support team will help you to settle the issue.
Sorry for your inconvenience,

-----Original Message-----
From: Nikola Ciprich [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 26, 2008 5:36 PM
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org; [EMAIL PROTECTED]; Nick Cheng; Erich Chen; [EMAIL PROTECTED]
Subject: Re: arcmsr & areca-1660 - strange behaviour under heavy load

Hi

On Sun, 24 Feb 2008, Andrew Morton wrote:

Hi Andrew,
thanks a lot for reply, I'm attaching requested information.
please let me know if You need more information/testing, whatever.
I'll be glad to help.
BR
nik

>> Areca support doesn't seem to be very interested in the problem :-(
>
> (cc's added)
>
> Please get the machine into this state of memory exhaustion then take
> copies of the output of the following, and send them via reply-to-all to
> this email:
>
> - cat /proc/meminfo
>
> - cat /proc/slabinfo
>
> - dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c
>
> Thanks.
Re: arcmsr & areca-1660 - strange behaviour under heavy load
Hi

On Sun, 24 Feb 2008, Andrew Morton wrote:

Hi Andrew,
thanks a lot for reply, I'm attaching requested information.
please let me know if You need more information/testing, whatever.
I'll be glad to help.
BR
nik

>> Areca support doesn't seem to be very interested in the problem :-(
>
> (cc's added)
>
> Please get the machine into this state of memory exhaustion then take
> copies of the output of the following, and send them via reply-to-all to
> this email:
>
> - cat /proc/meminfo
>
> - cat /proc/slabinfo
>
> - dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c
>
> Thanks.

MemTotal:         188464 kB
MemFree:           82268 kB
Buffers:           20572 kB
Cached:            40312 kB
SwapCached:         1508 kB
Active:            35220 kB
Inactive:          29436 kB
SwapTotal:       3145712 kB
SwapFree:        3142156 kB
Dirty:               340 kB
Writeback:             0 kB
AnonPages:          2836 kB
Mapped:             4532 kB
Slab:              29824 kB
SReclaimable:      16728 kB
SUnreclaim:        13096 kB
PageTables:         1024 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
CommitLimit:     3239944 kB
Committed_AS:      13732 kB
VmallocTotal: 34359738367 kB
VmallocUsed:       10644 kB
VmallocChunk: 34359727343 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:     2048 kB

slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
fib6_nodes                     5     59     64   59    1 : tunables  120   60    8 : slabdata      1      1      0
ip6_dst_cache                  4     12    320   12    1 : tunables   54   27    8 : slabdata      1      1      0
ndisc_cache                    1     15    256   15    1 : tunables  120   60    8 : slabdata      1      1      0
RAWv6                         11     12    896    4    1 : tunables   54   27    8 : slabdata      3      3      0
UDPLITEv6                      0      0    896    4    1 : tunables   54   27    8 : slabdata      0      0      0
UDPv6                          0      0    896    4    1 : tunables   54   27    8 : slabdata      0      0      0
tw_sock_TCPv6                  0      0    192   20    1 : tunables  120   60    8 : slabdata      0      0      0
request_sock_TCPv6             0      0    192   20    1 : tunables  120   60    8 : slabdata      0      0      0
TCPv6                          2      4   1664    4    2 : tunables   24   12    8 : slabdata      1      1      0
ip_fib_alias                  10     59     64   59    1 : tunables  120   60    8 : slabdata      1      1      0
ip_fib_hash                   10     59     64   59    1 : tunables  120   60    8 : slabdata      1      1      0
reiser_inode_cache            13     15    776    5    1 : tunables   54   27    8 : slabdata      3      3      0
dm_mpath_io                    0      0     40   92    1 : tunables  120   60    8 : slabdata      0      0      0
dm_snap_pending_exception    128    136    112   34    1 : tunables  120   60    8 : slabdata      4      4      0
dm_snap_exception              0      0     32  112    1 : tunables  120   60    8 : slabdata      0      0      0
dm_uevent                      0      0   2608    3    2 : tunables   24   12    8 : slabdata      0      0      0
dm_target_io                1320   1440     24  144    1 : tunables  120   60    8 : slabdata     10     10      0
dm_io                       1320   1472     40   92    1 : tunables  120   60    8 : slabdata     16     16      0
scsi_cmd_cache                38     40    384   10    1 : tunables   54   27    8 : slabdata      4      4      0
sgpool-128                     2      2   4096    1    1 : tunables   24   12    8 : slabdata      2      2      0
sgpool-64                      2      2   2048    2    1 : tunables   24   12    8 : slabdata      1      1      0
sgpool-32                      2      4   1024    4    1 : tunables   54   27    8 : slabdata      1      1      0
sgpool-16                      3      8    512    8    1 : tunables   54   27    8 : slabdata      1      1      0
sgpool-8                      13     45    256   15    1 : tunables  120   60    8 : slabdata      3      3      0
scsi_io_context                0      0    112   34    1 : tunables  120   60    8 : slabdata      0      0      0
ext3_inode_cache            3946   4028    832    4    1 : tunables   54   27    8 : slabdata   1007   1007      0
ext3_xattr                     0      0     88   44    1 : tunables  120   60    8 : slabdata      0      0      0
journal_handle                32    144     24  144    1 : tunables  120   60    8 : slabdata      1      1      0
journal_head                 105    280     96   40    1 : tunables  120   60    8 : slabdata      7      7      0
revoke_table                   6    202     16  202    1 : tunables  120   60    8 : slabdata      1      1      0
revoke_record                  0      0     32  112    1 : tunables  120   60    8 : slabdata      0      0
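
The meminfo figures above can be checked quickly with a few lines of Python.
This is a minimal sketch: parse_meminfo and summarize are hypothetical helper
names, the sample holds only a subset of the posted fields, and "MemFree +
Buffers + Cached" is a rough rule of thumb for easily available memory, not a
figure the kernel reports.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}.

    Values are in kB where the line carries a unit, plain counts otherwise.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.strip():
            continue  # skip blank or malformed lines
        info[name.strip()] = int(rest.split()[0])
    return info


def summarize(info):
    """Derive a few headline numbers from parsed meminfo fields."""
    return {
        "free_fraction": info["MemFree"] / info["MemTotal"],
        # Rough heuristic (assumption): free memory plus page cache and buffers.
        "easily_available_kb": info["MemFree"] + info["Buffers"] + info["Cached"],
        "swap_used_kb": info["SwapTotal"] - info["SwapFree"],
    }


# Subset of the values posted in this message.
SAMPLE = """\
MemTotal:         188464 kB
MemFree:           82268 kB
Buffers:           20572 kB
Cached:            40312 kB
SwapTotal:       3145712 kB
SwapFree:        3142156 kB
"""

if __name__ == "__main__":
    for key, value in summarize(parse_meminfo(SAMPLE)).items():
        print(key, value)
```

On these numbers a bit over 40% of RAM is free and almost no swap is in use,
so the snapshot does not look like a machine that has run out of memory.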
Re: arcmsr & areca-1660 - strange behaviour under heavy load
On Sat, 23 Feb 2008 12:20:12 +0100 (CET) Nikola Ciprich <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I've found strange problem either in arcmsr driver, or maybe in
> areca-1660 card...
> When system on SAS discs RAID connected to areca-1660 card
> gets under heavy I/O load, it gets unusable after some time. I can 100%
> reproduce this, although it needs quite specific conditions:
> It can be reproduced on 2x quad core machine, RAM has to be limited to
> ~192MB to cause heavy paging.
> Only thing needed to cause the problem is to start loop doing kernel
> compilation using make -j 8 - this loads the system heavily, because of
> lack of memory. After few correct compile runs the system gets into
> state when all programs including the basic ones (ls, cp, ..) start
> crashing... dmesg (when it works) doesn't say anything strange...
> After reboot, the system is OK again.
> I have tested it on different motherboards, with different CPUs, RAMs (all
> were properly tested with memtest), with two different areca cards and
> different drives. I can't reproduce the problem on same hardware when
> using different RAID card (ie adaptec). All testing systems were properly
> cooled..
> I have tried all available areca firmwares, two different distributions
> (oracle linux, and centos), and kernels ranging from distribution ones, to
> last GIT snapshot.
> Could somebody please give me some hints on how to hunt this problem?
> Areca support doesn't seem to be very interested in the problem :-(

(cc's added)

Please get the machine into this state of memory exhaustion then take
copies of the output of the following, and send them via reply-to-all to
this email:

- cat /proc/meminfo

- cat /proc/slabinfo

- dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c

Thanks.
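
For repeated test runs, the three collection steps requested above could be
wrapped in a small script. This is a sketch only, assuming Python 3.7+ and
root privileges: read_proc_files and sysrq_memory_dump are hypothetical
helper names, and the proc_root parameter exists just so the file-reading
part can be tried against a scratch directory.

```python
import os
import subprocess


def read_proc_files(proc_root="/proc", names=("meminfo", "slabinfo")):
    """Return {name: contents} for the requested files under proc_root."""
    contents = {}
    for name in names:
        with open(os.path.join(proc_root, name)) as handle:
            contents[name] = handle.read()
    return contents


def sysrq_memory_dump(proc_root="/proc"):
    """Clear the kernel log, trigger SysRq-m, and return the fresh log.

    Mirrors: dmesg -c > /dev/null ; echo m > /proc/sysrq-trigger ; dmesg -c
    Needs root, and SysRq must be enabled on the machine.
    """
    subprocess.run(["dmesg", "-c"], stdout=subprocess.DEVNULL, check=True)
    with open(os.path.join(proc_root, "sysrq-trigger"), "w") as trigger:
        trigger.write("m")
    result = subprocess.run(
        ["dmesg", "-c"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    report = read_proc_files()
    report["sysrq-m"] = sysrq_memory_dump()
    for name, text in report.items():
        print("=====", name, "=====")
        print(text)
```

Note that sysrq_memory_dump clears the kernel log, just like the original
command line, so anything of interest in dmesg should be saved first.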