Re: 2.6.1[01] freeze on x86_64
Jan Engelhardt wrote:
>> Er... by serial console, I assume you mean via a serial cable and some other
>> device. If so, then no, I don't have that capability. I didn't know about
>> netconsole before you mentioned it; I'll do some research and set it up. I do
>
> Serial console -- only requires a serial cable, available in the next computer
> store -- also works with non-Linux, non-x86 and (mostly) systems-w/o-compiler.

Well, that and the knowledge of how to monitor it on the other end, which I lack :-). And does one need a null-modem cable? I haven't used serial cables since USB was introduced. Is serial monitoring preferred over netconsole?

In any case, I've figured out how to get netconsole working, and have started monitoring it from my wife's laptop. I just need to reboot and make sure it is working (that I got the UDP addresses right), and then wait for a crash. I won't get to this until this evening, probably.

Mark, I'm going to start CC'ing you, and I'll forward you the previous emails.

Thanks, everybody, for responding.

--- SER
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
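For anyone wondering what to run on the monitoring machine: netconsole sends kernel messages as plain UDP datagrams, so any UDP listener will do. Below is a minimal sketch in Python, not something posted in the thread. The function names are made up for illustration, and port 6666 is assumed because it is netconsole's documented default target port; use whatever port the crashing machine was actually configured to send to.

```python
import socket


def format_message(sender_ip, payload):
    """Turn one raw datagram into a printable, sender-tagged line."""
    # Kernel text is normally ASCII; undecodable bytes are replaced, not fatal.
    text = payload.decode("utf-8", errors="replace").rstrip("\r\n")
    return f"[{sender_ip}] {text}"


def listen(bind_addr="0.0.0.0", port=6666):
    """Print every UDP datagram arriving on (bind_addr, port) until interrupted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((bind_addr, port))
        while True:
            payload, (sender_ip, _sender_port) = sock.recvfrom(65535)
            # Flush each line so the last messages before a freeze are not
            # left sitting in an output buffer.
            print(format_message(sender_ip, payload), flush=True)
```

Calling listen() on the laptop and redirecting its output to a file should leave a record of whatever the kernel managed to send before locking up. If nothing shows up after a reboot of the monitored machine, the addresses or port are the first thing to recheck.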
Re: 2.6.1[01] freeze on x86_64
On 22 Mar 2005, Jan Engelhardt wrote:
> >> >acpi_thermal-0400 [23] acpi_thermal_get_trip_: Invalid active
> >> > threshold [0]
> >>
> >> You mean you got this in /var/log/messages?
> >
> > Yes, in /var/log/messages. The lock up occurs without warning, so the only
> > opportunity I have to look for error messages is in the syslogs.
> >
> >> Can you connect a serial console or netconsole and see if that
> >>
> > Er... by serial console, I assume you mean via a serial cable and some other
> > device. If so, then no, I don't have that capability. I didn't know about
> > netconsole before you mentioned it; I'll do some research and set it up. I do
> > have a second computer (well, my wife's laptop is also running Linux) that I
> > could use to monitor UDP traffic, if I can figure out what to use as a client
> > to capture the messages. This may take me a couple of days.
>
> Serial console -- only requires a serial cable, available in the next computer
> store -- also works with non-Linux, non-x86 and (mostly) systems-w/o-compiler.
>
> Jan Engelhardt

I've actually got old dumb terminals sitting around. I'll hook one up and set the oops=panic option also. Maybe we can nail this down as I've pretty much avoided using my x86-64 desktop ever since. I'd been torn trying to decide whether or not to migrate to a different file system.

--
Mark Nipper
Re: 2.6.1[01] freeze on x86_64
>> >acpi_thermal-0400 [23] acpi_thermal_get_trip_: Invalid active
>> > threshold [0]
>>
>> You mean you got this in /var/log/messages?
>
> Yes, in /var/log/messages. The lock up occurs without warning, so the only
> opportunity I have to look for error messages is in the syslogs.
>
>> Can you connect a serial console or netconsole and see if that
>>
> Er... by serial console, I assume you mean via a serial cable and some other
> device. If so, then no, I don't have that capability. I didn't know about
> netconsole before you mentioned it; I'll do some research and set it up. I do
> have a second computer (well, my wife's laptop is also running Linux) that I
> could use to monitor UDP traffic, if I can figure out what to use as a client
> to capture the messages. This may take me a couple of days.

Serial console -- only requires a serial cable, available in the next computer store -- also works with non-Linux, non-x86 and (mostly) systems-w/o-compiler.

Jan Engelhardt
--
Re: 2.6.1[01] freeze on x86_64
Andi Kleen wrote:
> Sean Russell <[EMAIL PROTECTED]> writes:
>> acpi_thermal-0400 [23] acpi_thermal_get_trip_: Invalid active
>> threshold [0]
>
> You mean you got this in /var/log/messages?

Yes, in /var/log/messages. The lock up occurs without warning, so the only opportunity I have to look for error messages is in the syslogs.

> Can you connect a serial console or netconsole and see if that

Er... by serial console, I assume you mean via a serial cable and some other device. If so, then no, I don't have that capability. I didn't know about netconsole before you mentioned it; I'll do some research and set it up. I do have a second computer (well, my wife's laptop is also running Linux) that I could use to monitor UDP traffic, if I can figure out what to use as a client to capture the messages. This may take me a couple of days.

I didn't post to the list earlier specifically because I knew the debugging process would rapidly exceed my knowledge about kernel debugging. I apologize for making you walk me through the process.

> catches anything? Also boot with oops=panic

As a boot parameter? I'll give that a try.

--- SER
Re: 2.6.1[01] freeze on x86_64
Sean Russell <[EMAIL PROTECTED]> writes:

> appear to be related to the lockup. In my logs, the last message
> before the crash is always (that I've noticed) an ACPI error:
>
> acpi_thermal-0400 [23] acpi_thermal_get_trip_: Invalid active
> threshold [0]

You mean you got this in /var/log/messages?

Can you connect a serial console or netconsole and see if that catches anything? Also boot with oops=panic

-Andi
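To make the suggestion above concrete, here is a small sketch of how the extra kernel command-line parameters might be assembled. Only oops=panic comes from the thread. The console= and netconsole= syntax follows the kernel's own serial-console and netconsole documentation as I understand it, and every address, port, interface name and serial setting below is a hypothetical placeholder. It also assumes netconsole is built into the kernel; when it is built as a module, the same netconsole= string is given as a module option instead.

```python
def netconsole_param(target_ip, target_mac="", dev="eth0",
                     source_ip="", source_port=6665, target_port=6666):
    """Build a netconsole= option string.

    Layout: netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
    Only the target IP is mandatory; an empty MAC means "broadcast".
    """
    return (f"netconsole={source_port}@{source_ip}/{dev},"
            f"{target_port}@{target_ip}/{target_mac}")


def debug_boot_params(netconsole=None, serial_console=None):
    """Join the debugging-related boot parameters into one string."""
    params = ["oops=panic"]  # turn any oops into a panic so it is not missed
    if serial_console:
        # Keep the normal VGA console and add the serial line as well.
        params += ["console=tty0", f"console={serial_console}"]
    if netconsole:
        params.append(netconsole)
    return " ".join(params)


# Hypothetical example values, to be replaced with real ones:
example = debug_boot_params(
    netconsole=netconsole_param("10.0.0.2", source_ip="10.0.0.1"),
    serial_console="ttyS0,115200n8",
)
```

The resulting string would be appended to the kernel line in the boot loader configuration. Either console alone is enough; both are shown only to illustrate the syntax.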
2.6.1[01] freeze on x86_64
Hello,

One liner: I'm getting mysterious (to me), almost random hard freezes of the kernel running 2.6.10 and 2.6.11.

Kernel version: Linux version 2.6.11-gentoo-r3 ([EMAIL PROTECTED]) (gcc version 3.4.2 (Gentoo Linux 3.4.2-r2, ssp-3.4.1-1, pie-8.7.6.5))

Mark Nipper posted a message on March 5 regarding some mysterious kernel lockups which he didn't get a response to (I've contacted him about it). Since I'm having what I think is the same problem, I thought I'd post a message so he's not just a single lonely voice in the dark.

Mark and I have similar set-ups. We're both running x86_64 kernels, and ReiserFS3. He's running Debian, I'm running Gentoo. We haven't compared kernel config files yet; it might mean something to him, but to be honest, I barely know enough to compile my own kernels and wouldn't know where to begin to look for the problem. Mark has only encountered this on 2.6.11, but I don't think he's tried any other kernel versions on x86_64; I get this problem on both 2.6.10 and 2.6.11. I didn't see the problem on 2.6.9.

In both of our cases, the kernel is locking up, and requires a power cycle to get it back. We're not able to SSH into our machines, and we get no response from any of the input devices. Furthermore, even with full debugging turned on, there are no messages in the log file that appear to be related to the lockup. In my logs, the last message before the crash is always (that I've noticed) an ACPI error:

acpi_thermal-0400 [23] acpi_thermal_get_trip_: Invalid active threshold [0]

but this message appears a lot in my logs, so I think it is coincidence. For Mark, the last message was some ReiserFS message.

Mark feels like the error is ReiserFS related, and I was pretty sure it was swap related, until I turned off all swap partitions and the problem still occurred. I *may* try converting all of my filesystems to something else if somebody knowledgeable thinks it could be the problem, but I'm guessing it is something deeper in; I've never seen a filesystem-related problem that caused a lock-up like this.

I still feel that this may be memory related. When I turn off swap, or when I drastically reduce my memory use, my laptop can run for hours, or even days with little use. On the other hand, it can freeze up after five minutes, even before KDE has finished loading completely, with the swap on. However, I haven't found a situation where it won't, eventually, lock up. But I can't really pin it down, so I don't know where the problem is.

I haven't noticed the lockups without X, but I haven't run for any great length of time without X. I'm running the ATI proprietary drivers, but even when I revert to the XOrg ATI drivers (non-proprietary), I still get the lockups.

I'm really sorry that I can't provide more information; I'm usually not totally incompetent at narrowing down problems in software, but I have no idea where to even start looking for the problem here. If there are any things I should try that might provide more information, please let me know. I'm attaching my kernel config, plus all of the info from /proc that is suggested by the FAQ to be included. I'll be happy to recompile my kernel with other options, if I can get some hints at starting points; I doubt my changing flags at random will help much.

Thanks, in advance.
Sean Russell

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 4
model name      : AMD Athlon(tm) 64 Processor 3400+
stepping        : 10
cpu MHz         : 801.849
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 pni syscall nx mmxext lm 3dnowext 3dnow
bogomips        : 1572.86
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

-0009f7ff : System RAM
0009f800-0009 : reserved
000a-000b : Video RAM area
000c-000cefff : Video ROM
000cf000-000c : Adapter ROM
000f-000f : System ROM
0010-3fee : System RAM
  0010-003ae374 : Kernel code
  003ae375-004e0d27 : Kernel data
3fef-3fef9fff : ACPI Tables
3fefa000-3fef : ACPI Non-volatile Storage
3ff0-3fff : reserved
4000-4fff : :00:05.0
  4000-4fff : ipw2200
40001000-40001fff : :00:0c.0
d000-d0003fff : :00:06.0
d0004000-d0004fff : :00:0e.0
d0005000-d0005fff : :00:0e.0
d0006000-d0006fff : :00:0e.1
d0007000-d0007fff : :00:0e.1
d0008000-d00087ff : :00:06.0
  d0008000-d00087ff : ohci1394
d0008800-d00088ff : :00:08.0
  d0008800-d00088ff : r8169
d0008c00-d0008cff : :00:10.3
  d0008c00-d0008cff : ehci_hcd
d010-d01f : PCI Bus #01
  d010-d010 : :01:00.0
    d010-d010 : radeonfb
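One way to test the "coincidence" guess about the ACPI message is to compare how often it is the last kernel message before a reboot with how often it appears in the log overall. The sketch below is only an illustration, not something from the thread. It assumes the classic syslog layout ("Mon DD HH:MM:SS host kernel: ..."), assumes each boot starts with a kernel line containing "Linux version", and cannot tell a clean reboot from a power cycle after a freeze. All function names are hypothetical.

```python
def last_kernel_lines_before_boot(lines):
    """Return the last kernel message logged before each boot banner."""
    found = []
    last_kernel_line = None
    for raw in lines:
        line = raw.rstrip("\n")
        is_kernel = " kernel: " in line
        if is_kernel and "Linux version " in line and last_kernel_line is not None:
            # A new boot started: whatever kernel line came before it was
            # the final kernel message of the previous session.
            found.append(last_kernel_line)
        if is_kernel:
            last_kernel_line = line
    return found


def summarize(lines, needle):
    """Count how often `needle` appears overall and right before a boot."""
    lines = [raw.rstrip("\n") for raw in lines]
    before_boot = last_kernel_lines_before_boot(lines)
    return {
        "total_lines": len(lines),
        "needle_lines": sum(needle in line for line in lines),
        "boots": len(before_boot),
        "needle_before_boot": sum(needle in line for line in before_boot),
    }
```

Something like summarize(open("/var/log/messages"), "acpi_thermal_get_trip_") would give the four counts. If the message makes up a large share of all log lines anyway, finding it last before most reboots says little; if it is rare overall but nearly always last, it deserves a closer look.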