Re: [PATCH] bad temperature values from w83781d in 2.6.22
Hi Mark,

On Thu, Aug 09, 2007 at 08:26:19AM -0400, Mark M. Hoffman wrote:
> Hi Joerg:
>
> > On Wed, Aug 08, 2007 at 11:56:42PM -0400, Mark M. Hoffman wrote:
> > > Thanks for sending all that. I see one bug clearly, and I'm pretty close to
> > > seeing the other one. But for tonight, I need sleep.
>
> There's just one bug after all. The second was a figment of my sleep-deprived
> imagination.
>
> [...]
>
> My bad, there's a second i2cset command that would have done it. Please try
> this patch against v2.6.22.
>

Great, problem fixed. Thanks a lot for your prompt solution!

-jo
-- 
-rw-r--r-- 1 jo users 62 2007-08-09 08:46 /home/jo/.signature
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [lm-sensors] bad temperature values from w83781d in 2.6.22
Hi Mark, On Wed, Aug 08, 2007 at 11:56:42PM -0400, Mark M. Hoffman wrote: > Hi Joerg: > > * Joerg Sommrey <[EMAIL PROTECTED]> [2007-08-08 17:17:16 +0200]: > > Hi Mark, > > > > just to eliminate as many impacts as possible, I did: > > - reinstall the unmodified sensors.conf from Tyan's support page > > - power off before rebooting > > > > A call to "sensors -s" is done without errors in all cases. > > The module parameters I use currently with both kernels: > > > > options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49 > > options w83627hf force_addr=0x0c00 > > > > When I first realized the problem, I didn't use w83627hf yet. Results > > are the same when w83781d is used as driver for w83627hf. > > Parameters in that case just from Tyan: > > > > options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49 > > force_w83627hf=0,0x2c force_subclients=0,0x2c,0x4a,0x4b init=0 > > > > "My" i2cdump doesn't accept an -y option, maybe a Debianism. Results > > see below. > > Newer i2cdump skips the 5-second warning when given -y, that's all. > > > ### 2.6.21 ### > > Script started on Wed Aug 8 16:53:10 2007 > > bear:~/hwmon# i2cdump 0 0x2d b 0 0x4e > > (snip tons of results) > > Thanks for sending all that. I see one bug clearly, and I'm pretty close to > seeing the other one. But for tonight, I need sleep. > > In the meantime, please try this command as root, against the newer kernel, > *after* you've done 'sensors -s': > > # i2cset -f 0 0x2d 0x5d 0x0e b > > Wait > 2 seconds for the hardware to update itself, then run 'sensors' again. > I'm pretty sure you'll see the correct temps. The displayed temperatures changed to 67.5°C / 66.0°C. Still, this seems to be too high. The power supply's fan runs too slow for such CPU temperatures. In older kernels it becomes noisy above 50°C. Under load the temperatures shown are around 95°C, way too high. 
-jo
-- 
-rw-r--r-- 1 jo users 62 2007-08-08 21:06 /home/jo/.signature
Re: bad temperature values from w83781d in 2.6.22
P.KPP.KPP.KPP.KP d0: 50 00 4b 50 50 00 4b 50 50 00 4b 50 50 00 4b 50P.KPP.KPP.KPP.KP e0: 50 00 4b 50 50 00 4b 50 4f 00 4b 50 4f 00 4b 50P.KPP.KPO.KPO.KP f0: 50 00 4b 50 50 00 4b 50 50 00 4b 50 50 00 4b 50P.KPP.KPP.KPP.KP bear:~/hwmon# i2cdump 0 0x49 No size specified (using byte-data access) WARNING! This program can confuse your I2C bus, cause data loss and worse! I will probe file /dev/i2c-0, address 0x49, mode byte You have five seconds to reconsider and press CTRL-C! 0 1 2 3 4 5 6 7 8 9 a b c d e f0123456789abcdef 00: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 10: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 20: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 30: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 40: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 50: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 60: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 70: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 80: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP 90: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP a0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP b0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP c0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP d0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP e0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP f0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50O.KPO.KPO.KPO.KP bear:~/hwmon# sensors w83782d-i2c-0-2d Adapter: SMBus AMD768 adapter at 80e0 AGP V: +1.73 V (min = +3.14 V, max = +3.47 V) ALARM +5 V: +4.81 V (min = +4.76 V, max = +5.24 V) DDR V: +1.22 V (min = +2.85 V, max = +3.47 V) ALARM 3 VSB: +3.30 V (min = +2.85 V, max = +3.15 V) ALARM Bat V: +0.00 V (min = +2.64 V, max = +3.95 V) ALARM chs1 Fan:0 RPM (min = 2700 RPM, div = 2) ALARM chs2 Fan:0 
RPM (min = 3970 RPM, div = 2) ALARM chs3 Fan:0 RPM (min = 10546 RPM, div = 2) ALARM VRM2 Temp: +56 C (high = +80 C, hyst = +75 C) sensor = transistor CPU1 Temp: +79.5 C (high = +80 C, hyst = +75 C) sensor = transistor ALARM CPU2 Temp: +79.5 C (high = +80 C, hyst = +75 C) sensor = transistor ALARM alarms: beep_enable: Sound alarm enabled w83627hf-isa-0c00 Adapter: ISA adapter VCore1:+1.71 V (min = +1.66 V, max = +1.84 V) VCore2:+1.71 V (min = +1.66 V, max = +1.84 V) +3.3 V:+3.33 V (min = +3.14 V, max = +3.47 V) +12 V: +11.83 V (min = +13.21 V, max = +10.83 V) ALARM -12 V: -12.20 V (min = -13.18 V, max = -10.80 V) CPU1 Fan: 4041 RPM (min = 4687 RPM, div = 2) ALARM CPU2 Fan: 4166 RPM (min = 6750 RPM, div = 2) ALARM VRM1 Temp: +43 C (high = -124 C, hyst = +16 C) sensor = transistor ALARM AGP Temp: +49.5 C (high = +80 C, hyst = +75 C) sensor = transistor DDR Temp: +46.0 C (high = +80 C, hyst = +75 C) sensor = transistor alarms: Chassis intrusion detection ALARM beep_enable: Sound alarm disabled bear:~/hwmon# exit Script done on Wed Aug 8 16:43:20 2007 On Tue, Aug 07, 2007 at 09:03:16PM -0400, Mark M. Hoffman wrote: > Hi Joerg: > > (I tried to follow-up using the gmane.org mail/news gateway... didn't seem > to work.) > > * Joerg Sommrey <[EMAIL PROTECTED]> [2007-08-05 12:26:04 +0200]: > > Hi, > > > > after upgrading from 2.6.21 to 2.6.22 the CPU temperatures shown by > > w83781d look unreal. They were in a range from 40°C when idle to > > 75°C under full load with 2.6.21. The values shown now are in a very > > small range from 77°C to 82°C. From the (low) noise of the fan I can > > tell that the temperature is <50°C. > > The third temperature shown is completely wrong. > > > > I have a Tyan Tiger MPX board with a w83782d chip. 
Output from > > "sensors": > > > > w83782d-i2c-0-2d > > Adapter: SMBus AMD768 adapter at 80e0 > > +5 V: +4.81 V (min = +4.76 V, max = +5.24 V) > > 3 VSB: +3.30 V (min = +2.85 V, max = +3.15 V) ALARM > > chs3 Fan: 2122 RPM (min = 2657 RPM, div = 4) ALARM > > VRM2 Temp: -208°C (high = -176°C, hyst = -181°C) sensor = transistor > > CPU1 Temp: +78.5°C (high = +80°C, hyst = +75°C) sensor = transistor > > ALARM > > CPU2 Temp: +77.5°C (high = +80°C, hyst = +75°C) sensor = transistor
Re: bad temperature values from w83781d in 2.6.22
Thanks for your reply. The sensors.conf I'm currently using is provided by
Tyan, so this seems to be ok. One major difference that I can see: I
don't have compute statements for the CPU temperatures. If I use your
config, I get 7.8°C :-) So there is definitely some difference in our
hardware environment. OTOH with the "right" compute statement the problem
seems fixable.

BTW: there is another hwmon chip on the board, a w83627hf. Up to 2.6.21
this was managed by the w83781d driver, too. Now I use the w83627hf
driver (on the isa bus). No problem with that part.

-jo

On Sun, Aug 05, 2007 at 01:20:23PM +0200, Rene Herman wrote:
> On 08/05/2007 12:26 PM, Joerg Sommrey wrote:
>
> >after upgrading from 2.6.21 to 2.6.22 the CPU temperatures shown by
> >w83781d look unreal. They were in a range from 40°C when idle to
> >75°C under full load with 2.6.21. The values shown now are in a very
> >small range from 77°C to 82°C. From the (low) noise of the fan I can
> >tell that the temperature is <50°C.
> >The third temperature shown is completely wrong.
> >
> >I have a Tyan Tiger MPX board with a w83782d chip. Output from
> >"sensors":
> >
> >w83782d-i2c-0-2d
> >Adapter: SMBus AMD768 adapter at 80e0
>
> As a datapoint, the same W83782D on AMD756 (also I2C) works correctly with
> 2.6.22:
>
> w83782d-i2c-0-2d
> Adapter: SMBus AMD756 adapter at 50e0
>
> Jean Delvare recently worked on the ISA interface to these chips but it
> seems this would not be the cause if you are also using I2C. Our hardware
> appears rather identical...
>
> I've attached (an excerpt of) my /etc/sensors.conf -- I once dug through
> the datasheets for those compute lines for example so perhaps its still
> useful even if 2.6.21 working for you probably means you don't have a
> config problem.
>
> Rene.
bad temperature values from w83781d in 2.6.22
Hi,

after upgrading from 2.6.21 to 2.6.22 the CPU temperatures shown by
w83781d look unreal. They were in a range from 40°C when idle to
75°C under full load with 2.6.21. The values shown now are in a very
small range from 77°C to 82°C. From the (low) noise of the fan I can
tell that the temperature is <50°C.
The third temperature shown is completely wrong.

I have a Tyan Tiger MPX board with a w83782d chip. Output from
"sensors":

w83782d-i2c-0-2d
Adapter: SMBus AMD768 adapter at 80e0
+5 V:      +4.81 V  (min = +4.76 V, max = +5.24 V)
3 VSB:     +3.30 V  (min = +2.85 V, max = +3.15 V)  ALARM
chs3 Fan:  2122 RPM  (min = 2657 RPM, div = 4)  ALARM
VRM2 Temp: -208°C  (high = -176°C, hyst = -181°C)  sensor = transistor
CPU1 Temp: +78.5°C  (high = +80°C, hyst = +75°C)  sensor = transistor  ALARM
CPU2 Temp: +77.5°C  (high = +80°C, hyst = +75°C)  sensor = transistor  ALARM
alarms:
beep_enable:
           Sound alarm enabled

# cat /sys/bus/i2c/devices/0-002d/temp*_input
-209000
77500
77500

Any ideas?

-jo
-- 
-rw-r--r-- 1 jo users 62 2007-08-04 14:02 /home/jo/.signature
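For readers unfamiliar with the raw sysfs values quoted in the report: the hwmon temp*_input files hold millidegrees Celsius. The following is a minimal sketch, in Python, that converts the three values from the report and flags readings outside a sanity range. The function names and the -40..125 °C range are assumptions made for illustration; they are not part of lm-sensors or the driver.

```python
# Raw values copied from the "cat .../temp*_input" output in the report.
RAW_READINGS = ["-209000", "77500", "77500"]


def millidegrees_to_celsius(raw: str) -> float:
    """Convert one hwmon temp*_input string (millidegrees C) to degrees C."""
    return int(raw.strip()) / 1000.0


def is_plausible(temp_c: float, low: float = -40.0, high: float = 125.0) -> bool:
    """Hypothetical sanity range for a board sensor; the bounds are an assumption."""
    return low <= temp_c <= high


temps = [millidegrees_to_celsius(r) for r in RAW_READINGS]
for index, temp in enumerate(temps, start=1):
    note = "" if is_plausible(temp) else "  <-- implausible"
    print(f"temp{index}: {temp:+.1f} C{note}")
```

Note that such a range check only catches the obviously broken first reading; the two CPU values pass it even though the report argues, from fan noise, that they are too high.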
Re: Linux 2.6.21: pmtmr losing time
On Tue, May 01, 2007 at 09:36:24AM +0200, Joerg Sommrey wrote:
> On Mon, Apr 30, 2007 at 11:38:34PM +0200, Thomas Gleixner wrote:
> > On Mon, 2007-04-30 at 18:39 +0200, Joerg Sommrey wrote:
> > > Here it is. Maybe this problem is related to the usage of the
> > > "experimental" amd76x_pm module?
> >
> > Can you please verify what happens w/o that module ?
> >
> After rebooting the problem vanished for now. It first appeared after
> an uptime of about 3 days. I'll wait a few days. If it shows
> again, then I'll check without amd76x_pm.

It really looks like amd76x_pm is causing the time loss. Sadly, I have no
idea how to avoid this. What happens when the pm-timer wraps around while
the processors are in C3 sleep state? How is this wrap detected at all?
Is there anything I could put into amd76x_pm? It's no problem detecting a
timer wrap there.

-jo
-- 
-rw-r--r-- 1 jo users 62 2007-05-13 21:55 /home/jo/.signature
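To illustrate the wrap question asked above, here is a minimal sketch in Python of how a free-running counter is commonly read: the difference of two raw reads is masked to the counter width, which handles one wrap but cannot see a second one. This is a simplified model, not the kernel's clocksource code. The function names are made up for illustration, and the 24-bit width and tick rate are taken from the calculation in the original report of this thread.

```python
PMTMR_BITS = 24                      # counter width, assumed from the 2^24 in the report
PMTMR_MASK = (1 << PMTMR_BITS) - 1
PMTMR_TICKS_PER_SEC = 3_579_545      # tick rate quoted in the report


def elapsed_ticks(last: int, now: int) -> int:
    """Ticks between two raw reads; correct only if at most one wrap occurred."""
    return (now - last) & PMTMR_MASK


def accounted_seconds(true_ticks: int, start: int = 0xFFFF00) -> float:
    """Seconds a reader accounts for when true_ticks really passed between reads."""
    now = (start + true_ticks) & PMTMR_MASK   # the hardware counter wraps silently
    return elapsed_ticks(start, now) / PMTMR_TICKS_PER_SEC


# A short interval that crosses the wrap point is still measured correctly.
short = 1000
assert elapsed_ticks(0xFFFF00, (0xFFFF00 + short) & PMTMR_MASK) == short

# An interval longer than one full counter period loses exactly one period.
true_seconds = 5.0
true_ticks = int(true_seconds * PMTMR_TICKS_PER_SEC)
lost = true_ticks / PMTMR_TICKS_PER_SEC - accounted_seconds(true_ticks)
print(f"lost {lost:.2f} s")
```

In this model, a reader that stays away from the counter for 5 seconds loses about 4.69 s, the same size as the jumps described in the thread. Whether the real system actually goes that long between reads is exactly the open question here.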
Re: Linux 2.6.21: pmtmr losing time
On Mon, Apr 30, 2007 at 11:38:34PM +0200, Thomas Gleixner wrote:
> On Mon, 2007-04-30 at 18:39 +0200, Joerg Sommrey wrote:
> > Here it is. Maybe this problem is related to the usage of the
> > "experimental" amd76x_pm module?
>
> Can you please verify what happens w/o that module ?
>

After rebooting the problem vanished for now. It first appeared after
an uptime of about 3 days. I'll wait a few days. If it shows
again, then I'll check without amd76x_pm.

-jo

> Thanks,
>
> tglx
>
>

-- 
-rw-r--r-- 1 jo users 62 2007-05-01 01:11 /home/jo/.signature
Re: Linux 2.6.21: pmtmr losing time
On Mon, Apr 30, 2007 at 06:23:36PM +0200, Thomas Gleixner wrote: > On Mon, 2007-04-30 at 14:52 +0200, Joerg Sommrey wrote: > > Hi all, > > > > after switching to 2.6.21 the system clock sporadically loses time on my > > box (i386, Athlon MP). > > It's always around 4.68 seconds and happened 7 times in the last 12 > > hours. A simple calculation (2 ^ ACPI_PM_MASK / PMTMR_TICKS_PER_SEC = > > 2 ^ 24 / 3579545 = 4.686968875) shows: There is almost exactly one > > pmtmr-cycle missing. Could this be caused by a pmtmr-wrap when the > > system is in a sleep state? > > Hmm, looks like. That's strange we don't sleep 4.68 seconds. Can you > provide me the output of /proc/timer_list please ? > > tglx > > Here it is. Maybe this problem is related to the usage of the "experimental" amd76x_pm module? -jo Timer List Version: v0.3 HRTIMER_MAX_CLOCK_BASES: 2 now at 249154808025750 nsecs cpu: 0 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1177701795305563440 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: , tick_sched_timer, S:01 # expires at 249154808417850 nsecs [in 392100 nsecs] #1: , it_real_fn, S:01 # expires at 249154832599897 nsecs [in 24574147 nsecs] #2: , hrtimer_wakeup, S:01 # expires at 249154937316640 nsecs [in 129290890 nsecs] #3: , it_real_fn, S:01 # expires at 249154940604253 nsecs [in 132578503 nsecs] #4: , it_real_fn, S:01 # expires at 249156991584989 nsecs [in 2183559239 nsecs] #5: , hrtimer_wakeup, S:01 # expires at 249163930201148 nsecs [in 9122175398 nsecs] #6: , hrtimer_wakeup, S:01 # expires at 249163930619465 nsecs [in 9122593715 nsecs] #7: , it_real_fn, S:01 # expires at 249164018722673 nsecs [in 9210696923 nsecs] #8: , it_real_fn, S:01 # expires at 249164018756764 nsecs [in 9210731014 nsecs] #9: , hrtimer_wakeup, S:01 # expires at 249166140491719 nsecs [in 11332465969 nsecs] #10: , it_real_fn, S:01 # expires at 249168890475020 nsecs [in 14082449270 nsecs] #11: 
, it_real_fn, S:01 # expires at 249216937518155 nsecs [in 62129492405 nsecs] #12: , it_real_fn, S:01 # expires at 249694958542841 nsecs [in 540150517091 nsecs] #13: , hrtimer_wakeup, S:01 # expires at 252071939585424 nsecs [in 2917131559674 nsecs] #14: , it_real_fn, S:01 # expires at 277954353421786 nsecs [in 28799545396036 nsecs] .expires_next : 249154808417850 nsecs .hres_active: 1 .nr_events : 12571303 .nohz_mode : 2 .idle_tick : 249154808417850 nsecs .tick_stopped : 0 .idle_jiffies : 74656449 .idle_calls : 31252991 .idle_sleeps: 18916982 .idle_entrytime : 249154806452801 nsecs .idle_sleeptime : 202805663229475 nsecs .last_jiffies : 74656449 .next_jiffies : 74656452 .idle_expires : 249154815084516 nsecs jiffies: 74656449 cpu: 1 clock 0: .index: 0 .resolution: 1 nsecs .get_time: ktime_get_real .offset: 1177701795305563440 nsecs active timers: clock 1: .index: 1 .resolution: 1 nsecs .get_time: ktime_get .offset: 0 nsecs active timers: #0: , tick_sched_timer, S:01 # expires at 249154808417850 nsecs [in 392100 nsecs] #1: , it_real_fn, S:01 # expires at 249154824005495 nsecs [in 15979745 nsecs] #2: , hrtimer_wakeup, S:01 # expires at 249154876721584 nsecs [in 68695834 nsecs] #3: , hrtimer_wakeup, S:01 # expires at 249154876724658 nsecs [in 68698908 nsecs] #4: , it_real_fn, S:01 # expires at 249156991445550 nsecs [in 2183419800 nsecs] #5: , hrtimer_wakeup, S:01 # expires at 249158033258177 nsecs [in 3225232427 nsecs] #6: , it_real_fn, S:01 # expires at 249158937910855 nsecs [in 4129885105 nsecs] #7: , hrtimer_wakeup, S:01 # expires at 249160645185907 nsecs [in 5837160157 nsecs] #8: , hrtimer_wakeup, S:01 # expires at 249288273241617 nsecs [in 133465215867 nsecs] #9: , it_real_fn, S:01 # expires at 249688052597900 nsecs [in 533244572150 nsecs] #10: , hrtimer_wakeup, S:01 # expires at 250127175002655 nsecs [in 972366976905 nsecs] #11: , hrtimer_wakeup, S:01 # expires at 252087970262178 nsecs [in 2933162236428 nsecs] .expires_next : 249154808417850 nsecs .hres_active: 1 
.nr_events : 13107829 .nohz_mode : 2 .idle_tick : 249154805084517 nsecs .tick_stopped : 0 .idle_jiffies : 74656448 .idle_calls : 27583722 .idle_sleeps: 11208166 .idle_entrytime : 249154805632934 nsecs .idle_sleeptime : 197905845372598 nsecs .last_jiffies : 74656449 .next_jiffies : 74656450 .idle_expires : 249154935084504 nsecs jiffies: 74656449 Tick Device: mode: 1 Clock Event Device: pit max_delta_ns: 27461866 min_delta_ns: 12571 mult: 5124677 shift: 32 mode: 3 next_event: 9223372036854775807 nsecs set_next_event: pit_next_event set_
Linux 2.6.21: pmtmr losing time
Hi all,

after switching to 2.6.21 the system clock sporadically loses time on my
box (i386, Athlon MP).
It's always around 4.68 seconds and happened 7 times in the last 12
hours. A simple calculation (2 ^ ACPI_PM_MASK / PMTMR_TICKS_PER_SEC =
2 ^ 24 / 3579545 = 4.686968875) shows: There is almost exactly one
pmtmr-cycle missing. Could this be caused by a pmtmr-wrap when the
system is in a sleep state?

-jo
-- 
-rw-r--r-- 1 jo users 62 2007-04-29 22:29 /home/jo/.signature
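The arithmetic in the report can be reproduced with a few lines of Python. This sketch assumes, as the report does, a 24-bit counter (so one period is 2^24 ticks, i.e. the mask value plus one) and the 3579545 Hz tick rate; the variable names are chosen for illustration only.

```python
PMTMR_TICKS_PER_SEC = 3_579_545   # tick rate used in the report
COUNTER_BITS = 24                 # counter width implied by the 2 ^ 24 in the report

# One full counter period: number of distinct counter values / tick rate.
wrap_period_s = (1 << COUNTER_BITS) / PMTMR_TICKS_PER_SEC

# Observed loss per incident, as stated in the report.
observed_loss_s = 4.68

print(f"wrap period: {wrap_period_s:.6f} s")
print(f"difference to observed loss: {abs(wrap_period_s - observed_loss_s):.3f} s")
```

The computed period and the observed loss agree to within a hundredth of a second, which is what motivates the "exactly one pmtmr cycle missing" hypothesis.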
Re: [PATCH 1/2] amd76x_pm: C2 powersaving for AMD K7
On Wed, Sep 07, 2005 at 10:00:01AM +0300, Tony Lindgren wrote:
> * Pavel Machek <[EMAIL PROTECTED]> [050906 15:28]:
> > Hi!
> >
> > > > > +NOTE: Currently there's a bug somewhere where the reading the
> > > > > + P_LVL2 for the first time causes the system to sleep instead of
> > > > > + idling. This means that you need to hit the power button once to
> > > > > + wake the system after loading the module for the first time after
> > > > > + reboot. After that the system idles as supposed.
> > > > > + (Only observed on Tony's system.)
> > > >
> > > > Could you fix this before merge?
> > >
> > > I think this is some BIOS issue or hardware bug. It happens only on
> > > Tyan S2460. I tried dumping the registers few years ago on my
> > > Tyan s2460, but no luck.
> > >
> > > Low chance for anybody fixing it...
> > >
> >
> > So at least DMI-blacklist it...
>
> I rarely have access to that hardware, so don't count on me doing
> this...
>

I'm unable to do this either. Looks like this won't be fixed.

-jo
-- 
-rw-r--r-- 1 jo users 63 2005-09-06 20:21 /home/jo/.signature
Re: [PATCH 2/2] amd76x_pm: C2 powersaving for AMD K7
On Mon, Sep 05, 2005 at 04:51:45PM +0200, Pavel Machek wrote:
> Hi!
>
> > This patch adds some experimental features to amd76x_pm, namely C3, NTH
> > and POS support. These features are not enabled by default and are
> > intended for kernel hackers only.
> >
>
> Could those wait until ready?

There seems to be no development in this area at the moment. That's why
these were separated into the -extra patch. Just omit this -extra patch and
everything is fine :-)

-jo
-- 
-rw-r--r-- 1 jo users 63 2005-09-05 08:37 /home/jo/.signature
[PATCH 1/2] amd76x_pm: C2 powersaving for AMD K7
This is a processor idle module for AMD SMP 760MP(X) based systems. The patch was originally written by Tony Lindgren and has been around since 2002. It enables C2 mode on AMD SMP systems and thus saves about 70 - 90 W of energy in the idle mode compared to the default idle mode. The idle function has been rewritten and is now free of locking issues and is independent from the number of CPUs. The impact from this module on the system clock and on i/o transfer rates has been reduced. This patch can also be found at http://www.sommrey.de/amd76x_pm/amd_76x_pm-2.6.13-1.patch Signed-off-by: Joerg Sommrey <[EMAIL PROTECTED]> diff -Nru linux-2.6.13/Documentation/amd76x_pm.txt linux-2.6.13-jo/Documentation/amd76x_pm.txt --- linux-2.6.13/Documentation/amd76x_pm.txt1970-01-01 01:00:00.0 +0100 +++ linux-2.6.13-jo/Documentation/amd76x_pm.txt 2005-09-01 21:50:12.0 +0200 @@ -0,0 +1,326 @@ + ACPI style power management for SMP AMD-760MP(X) based systems + == + +For use until the ACPI project catches up. :-) + +Using this module saves about 70 - 90W of energy in the idle mode compared +to the default idle mode. Waking up from the idle mode is fast to keep the +system response time good. Currently no CPU load calculation is done, the +system exits the idle mode after every C2 call. + +NOTE: Currently there's a bug somewhere where the reading the + P_LVL2 for the first time causes the system to sleep instead of + idling. This means that you need to hit the power button once to + wake the system after loading the module for the first time after + reboot. After that the system idles as supposed. + (Only observed on Tony's system.) + + +Influenced by Vcool, and LVCool. Rewrote everything from scratch to +use the PCI features in Linux, and to support SMP systems. + +Currently tested amongst others on a TYAN S2460 (760MP) system (Tony), an +ASUS A7M266-D (760MPX) system (Johnathan) and a TYAN S2466 (760MPX) +system (Jo). 
Adding support for other Athlon SMP or single processor
+systems should be easy if desired.
+
+The file /sys/devices/pci:00/:00:00.0/C2_cnt shows the number of
+C2 calls since module load.
+
+There are some parameters for tuning the behaviour of amd76x_pm:
+lazy_idle, spin_idle, watch_irqs, watch_int and min_C1
+
+lazy_idle and spin_idle are closely related:
+
+- lazy_idle defines the number of idle calls into amd76x_smp_idle that are
+  needed to *enable* C2 mode. This parameter is the maximum loop counter
+  for an outer loop with interrupts enabled that guarantees low latencies.
+  The default for lazy_idle is 512.
+
+- spin_idle defines the maximum number of spin cycles in an inner idle loop
+  where one CPU waits for all others to get into C2-enabled mode. When all
+  CPUs are in C2-enable mode they (more or less) simultaneously *enter* C2
+  mode. In this inner loop interrupts are disabled. The loop is left
+  immediately if there is something waiting to be scheduled. The default
+  for spin_idle is 2*lazy_idle.
+
+lazy_idle and spin_idle define a "rubber measure" for the idling
+behaviour: lazy_idle defines the minimum "idling" needed to enter C2 and
+spin_idle defines when to give up.
+
+Low values for lazy_idle and high values for spin_idle give better
+cooling. Higher values for lazy_idle simply give less cooling. spin_idle
+is a kind of emergency brake to leave C2-enable mode if CPUs don't
+synchronize.
+
+Interrupts are disabled in C2 mode. The CPUs are woken up by timer
+interrupts or by NMIs. This causes a high interrupt latency for other
+interrupts that leads to a significant reduction in io or network
+throughput. An "irq rate watcher" has been introduced to reduce
+this effect. If the irq rate watcher detects that an interrupt has a
+rate above a given limit, C2 idling is disabled and a low latency C1
+idling mode is used instead.
The parameters watch_irqs, watch_int and
+min_C1 control this irq rate watcher:
+
+- watch_irqs defines which interrupts are to be watched and optionally
+  at which interrupt rate C2 mode shall be disabled. The syntax for
+  watch_irqs is irq1[:rate1],irq2[:rate2],... The rate is measured in
+  interrupts per second and defaults to 128. There is no default for
+  watch_irqs. To enable the irq rate watcher you must specify this
+  parameter. Enter the interrupts used by disk controllers or network
+  adapters here.
+
+- watch_int defines the time interval (in milliseconds) at which the
+  interrupt rate is checked. Too low values may result in an overhead
+  and too high values cause the C1 mode to "kick in" later. The default
+  for watch_int is 1 second.
+
+- min_C1 defines the minimum number of check intervals with low
+  interrupt rates that are needed to leave the forced C1 mode.
+
+All parameters lazy_idle, spin_idle, watch_irqs and watch_int
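The watch_irqs syntax documented above (irq1[:rate1],irq2[:rate2],... with a default rate of 128 interrupts per second) can be sketched as a small Python parser. This is only an illustration of the documented format; it is not the module's own parsing code (the patch handles the parameter in C), and the function name parse_watch_irqs and the example IRQ numbers are hypothetical.

```python
DEFAULT_RATE = 128  # interrupts per second, the default stated in the documentation


def parse_watch_irqs(spec: str) -> list[tuple[int, int]]:
    """Parse 'irq1[:rate1],irq2[:rate2],...' into (irq, rate) pairs."""
    items = []
    for field in spec.split(","):
        field = field.strip()
        if not field:
            continue  # tolerate an empty spec or a trailing comma
        irq_text, separator, rate_text = field.partition(":")
        irq = int(irq_text)
        # Without ":rate" the documented default applies.
        rate = int(rate_text) if separator else DEFAULT_RATE
        items.append((irq, rate))
    return items


# Hypothetical example: watch IRQ 14 and 17 at the default rate, IRQ 15 at 256/s.
print(parse_watch_irqs("14,15:256,17"))
```

An empty result corresponds to the documented behaviour that the irq rate watcher stays disabled unless watch_irqs is specified.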
[PATCH 2/2] amd76x_pm: C2 powersaving for AMD K7
This patch adds some experimental features to amd76x_pm, namely C3, NTH and POS support. These features are not enabled by default and are intended for kernel hackers only. Signed-off-by: Joerg Sommrey <[EMAIL PROTECTED]> --- linux-2.6.13-jo/drivers/acpi/amd76x_pm.c2005-09-01 21:50:34.0 +0200 +++ linux-2.6.13-jo/drivers/acpi/amd76x_pm.c-extra 2005-09-01 21:53:50.0 +0200 @@ -130,12 +130,17 @@ #include -#define VERSION"20050830" +#define VERSION"20050830-extra" +// #define AMD76X_NTH 1 +// #define AMD76X_POS 1 +// #define AMD76X_C3 1 // #define AMD76X_LOG_C1 1 extern void default_idle(void); +#ifndef AMD76X_NTH static void amd76x_smp_idle(void); +#endif static int amd76x_pm_main(void); static unsigned long lazy_idle = 0; @@ -143,8 +148,10 @@ static unsigned long watch_int = 0; static unsigned long min_C1 = AMD76X_MIN_C1; +#ifndef AMD76X_NTH static int show_watch_irqs(char *, struct kernel_param *); static int set_watch_irqs(const char *, struct kernel_param *); +#endif module_param(lazy_idle, long, S_IRUGO | S_IWUSR); @@ -155,6 +162,7 @@ MODULE_PARM_DESC(spin_idle, "\tnumber of spin cycles to wait for other CPUs to become idle"); +#ifndef AMD76X_NTH module_param(watch_int, long, S_IRUGO | S_IWUSR); MODULE_PARM_DESC(watch_int, "\twatch interval (in milliseconds) for interrupts"); @@ -165,6 +173,7 @@ "\tlist of irqs (and optional their limit per second) that " "cause fallback to C1 mode. 
" "Syntax: irq0[:limit0],irq1[:limit1],...");
+#endif
 module_param(min_C1, long, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(min_C1,
@@ -181,6 +190,12 @@
 struct PM_cfg {
 	unsigned int status_reg;
 	unsigned int C2_reg;
+#ifdef AMD76X_C3
+	unsigned int C3_reg;
+#endif
+#ifdef AMD76X_NTH
+	unsigned int NTH_reg;
+#endif
 	unsigned int slp_reg;
 	unsigned int resume_reg;
 	void (*orig_idle) (void);
@@ -189,21 +204,34 @@
 static struct PM_cfg amd76x_pm_cfg __cacheline_aligned_in_smp;
+#ifndef AMD76X_NTH
 struct idle_stat {
 	atomic_t num_idle;
+#ifdef AMD76X_C3
+	atomic_t num_C2;
+#endif
 };
 static struct idle_stat amd76x_stat __cacheline_aligned_in_smp = {
 	.num_idle = ATOMIC_INIT(0),
+#ifdef AMD76X_C3
+	.num_C2 = ATOMIC_INIT(0)
+#endif
 };
 struct cpu_stat {
 	int idle_count;
 	int C2_cnt;
+#ifdef AMD76X_C3
+	int C3_cnt;
+	int C2_active;
+#else
 	int _fill[2];
+#endif
 };
 static struct cpu_stat prs[NR_CPUS] __cacheline_aligned_in_smp;
+#endif
 struct watch_item {
 	int irq;
@@ -287,7 +315,13 @@
 	regdword &= 0xff80;
 	amd76x_pm_cfg.status_reg = (regdword + 0x00);
 	amd76x_pm_cfg.slp_reg =(regdword + 0x04);
+#ifdef AMD76X_NTH
+	amd76x_pm_cfg.NTH_reg =(regdword + 0x10);
+#endif
 	amd76x_pm_cfg.C2_reg = (regdword + 0x14);
+#ifdef AMD76X_C3
+	amd76x_pm_cfg.C3_reg = (regdword + 0x15);
+#endif
 	amd76x_pm_cfg.resume_reg = (regdword + 0x16); /* N/A for 768 */
 }
@@ -332,6 +366,52 @@
 	regdword &= ~((STPCLK_EN | CPUSLP_EN) << C2_REGS);
 	pci_write_config_dword(pdev_sb, 0x50, regdword);
 }
+#ifdef AMD76X_C3
+/*
+ * Untested C3 idle support for AMD-766.
+ */
+static void
+config_amd766_C3(int enable)
+{
+	unsigned int regdword;
+
+	/* Set C3 options in C3A50, page 63 in AMD-766 doc */
+	pci_read_config_dword(pdev_sb, 0x50, &regdword);
+	if(enable) {
+		regdword &= ~((DCSTOP_EN | PCISTP_EN | SUSPND_EN | CPURST_EN)
+			<< C3_REGS);
+		regdword |= (STPCLK_EN /* ~ 20 Watt savings max */
+			| CPUSLP_EN /* Additional ~ 70 Watts max! */
+			| CPUSTP_EN) /* yet more savings!
*/
+			<< C3_REGS;
+	}
+	else
+		regdword &= ~((STPCLK_EN | CPUSLP_EN | CPUSTP_EN) << C3_REGS);
+	pci_write_config_dword(pdev_sb, 0x50, regdword);
+}
+#endif
+
+
+#ifdef AMD76X_POS
+static void
+config_amd766_POS(int enable)
+{
+	unsigned int regdword;
+
+	/* Set C3 options in C3A50, page 63 in AMD-766 doc */
+	pci_read_config_dword(pdev_sb, 0x50, &regdword);
+	if(enable) {
+		regdword &= ~((ZZ_CACHE_EN | CPURST_EN) << POS_REGS);
+		regdword |= ((DCSTOP_EN | STPCLK_EN | CPUSTP_EN | PCISTP_EN |
+			CPUSLP_EN | SUSPND_EN) << POS_REGS);
+	}
+	else
+		regdword ^= (0xff << POS_REGS);
+	pci_write_config_dword(pdev_sb, 0x50, regdword);
+}
+#endif
+
+
 /*
  * Configures the 765 & 766 southbridges.
@@ -342,6 +422,13 @@
 	amd76x_get_PM();
 	config_PMIO_amd76x(1, 1);
 	config_a
2.6.12-rc2: Promise SATA150 TX4 failures
Hi all,

just tried 2.6.12-rc2 and I still have the same errors from my SATA disks as
with 2.6.11.

The setup is a bit complex. The relevant parts (I think) are:

Adaptec AHA-2940UW SCSI-controller, attached are:
	1 harddisk /dev/sda
	1 DDS3 streamer /dev/st0
Promise SATA150 TX4 controller, attached are:
	2 identical harddisks /dev/sdb and /dev/sdc

/dev/sda consists of the root partition, a swap partition and 4 other
partitions that are physical volumes for dm volume group /dev/vg1

/dev/sdb and /dev/sdc have two partitions each, the first of both make
a RAID-0 array /dev/md0 and the second of both make a md RAID-1 array
/dev/md1

/dev/md0 and /dev/md1 are the physical volumes for dm volume groups /dev/vg2
and /dev/vg3 resp.

To trigger the failure:
- For all logical volumes in /dev/vg1, /dev/vg2 and /dev/vg3 a snapshot is
  created.
- All snapshots are mounted read-only in a "snapshot hierarchy" under /snap.
- A backup to tape is taken using something like:
  find /snap -print | cpio -oaH crc -F /dev/st0
  Backup must go to tape, no problem with /dev/null
- At this point, some additional i/o on the SATA disks causes the whole box
  to hang.

Mostly some errors are written to syslog, they are always similar:

Apr 9 01:30:35 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Apr 9 01:30:35 bear kernel: ata2: called with no error (51)!
Apr 9 01:30:35 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Apr 9 01:30:35 bear kernel: sdc: Current: sense key: Medium Error
Apr 9 01:30:35 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Apr 9 01:30:35 bear kernel: end_request: I/O error, dev sdc, sector 43100350
Apr 9 01:30:35 bear kernel: raid1: Disk failure on sdc2, disabling device.
Apr 9 01:30:35 bear kernel: ^IOperation continuing on 1 devices

The errors are always reported for /dev/sdc2, the second device of a
RAID-1 array. After reboot I am able to raidhotadd the failed partition
without problems. The problem is 100% reproducible.

The hang is not a "hard hang": X keeps running, the watchdog doesn't hit
but no new processes can be started. Syslog entries stop after some time
(from a few seconds to several minutes).

The problem appeared somewhere between 2.6.10 and 2.6.11.
2.6.10:       ok
2.6.10-ac8:   ok
2.6.10-ac11:  failed
2.6.11:       failed
2.6.12-rc2:   failed

I'd be glad if there were a solution for this problem as it prevents me
from using any newer kernel.

-jo
-- 
-rw-r--r-- 1 jo users 63 2005-04-09 09:31 /home/jo/.signature
Re: [SATA] libata-dev queue updated
On Fri, Mar 04, 2005 at 11:06:23PM +0100, Joerg Sommrey wrote: > On Fri, Mar 04, 2005 at 03:43:38PM -0500, Jeff Garzik wrote: > > Joerg Sommrey wrote: > > >On Fri, Mar 04, 2005 at 01:07:16PM -0500, Jeff Garzik wrote: > > > > > >>Joerg Sommrey wrote: > > >> > > >>>On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote: > > >>> > > >>> > > >>>>Joerg Sommrey wrote: > > >>>> > > >>>> > > >>>>>On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote: > > >>>>> > > >>>>> > > >>>>> > > >>>>>>Joerg Sommrey wrote: > > >>>>>> > > >>>>>> > > >>>>>> > > >>>>>>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote: > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>>>Joerg Sommrey wrote: > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>>>>>>>Jeff Garzik wrote: > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>>>Patch: > > >>>>>>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2 > > >>>>>>>>> > > >>>>>>>>> > > >>>>>>>>>Still not usable here. The same errors as before when backing up: > > >>>>>>>> > > >>>>>>>>Please try 2.6.11 without any patches. > > >>>>>>> > > >>>>>>>Plain 2.6.11 doesn't work either. All of 2.6.10-ac11, 2.6.11-rc5, > > >>>>>>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with > > >>>>>>>the > > >>>>>>>same symptoms. > > >>>>>>> > > >>>>>>>Reverting to stable 2.6.10-ac8 :-) > > >>>>>> > > >>>>>>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix > > >>>>>>things? > > >>>>>> > > >>>>> > > >>>>> > > >>>>>Still the same with this patch reverted. > > >>>> > > >>>>Does reverting the attached patch in 2.6.11 fix things? (apply with > > >>>>patch -R) > > >>>> > > >>>>This patch reverts the entire libata back to 2.6.10. > > >>>> > > >>> > > >>>I'm confused. Still the same with everything reverted. What shall I do > > >>>now? > > >> > > >>Well, first, thanks for your patience in narrowing this down. 
> > >>
> > >>This means we have eliminated libata as a problem source, but we still
> > >>have the rest of the kernel to go through :)
> > >>
> > >>Try disabling ACPI with 'acpi=off' or 'pci=biosirq' to see if that fixes
> > >>things.
> > >>
> > >
> > >I tried both settings with plain 2.6.11. Almost the same results, in my
> > >impression acpi=off causes the failure to appear even faster.
> >
> > Just to make sure I have things right, please tell me if this is correct:
> >
> > * 2.6.10 vanilla works
> >
> > * 2.6.11 vanilla does not work
> >
> > * 2.6.11 vanilla + 2.6.10 libata does not work
> > [2.6.10 libata == reverting all libata changes]
> >
> > Is that all correct?
>
> Thanks for asking these precise questions. After double-checking
> everything I found a typo in my configuration that changes things a bit.
> I repeated some tests and the correct answers are now:
> * 2.6.10 vanilla              works
> * 2.6.10-ac8                  works
> * 2.6.10-ac11                 does not work
> * 2.6.11 vanilla              does not work
> * 2.6.11 w/o promise.patch    does not work
> * 2.6.11 + 2.6.10 libata      works!
>
> This looks much more consistent to me but brings the case back to
> libata.

After one more test using 2.6.11 + 2.6.10 libata I got some errors. They
are different, they end after some time and they don't lock the system:

Mar 4 23:15:00 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 4 23:15:00 bear kernel: sdb: Current: sense key: Recovered Error
Mar 4 23:15:00 bear kernel: ASC=0x26 <> ASCQ=0xc0
Mar 4 23:15:00 bear kernel: FMK, ILI

Got 1900 of these in 90 seconds and silence afterwards. Maybe that
helps. I'll keep this kernel running and watch it.

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-04 23:12 /home/jo/.signature
Re: [SATA] libata-dev queue updated
On Fri, Mar 04, 2005 at 03:43:38PM -0500, Jeff Garzik wrote: > Joerg Sommrey wrote: > >On Fri, Mar 04, 2005 at 01:07:16PM -0500, Jeff Garzik wrote: > > > >>Joerg Sommrey wrote: > >> > >>>On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote: > >>> > >>> > >>>>Joerg Sommrey wrote: > >>>> > >>>> > >>>>>On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote: > >>>>> > >>>>> > >>>>> > >>>>>>Joerg Sommrey wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote: > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>>Joerg Sommrey wrote: > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>>>Jeff Garzik wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>>Patch: > >>>>>>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2 > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>Still not usable here. The same errors as before when backing up: > >>>>>>>> > >>>>>>>>Please try 2.6.11 without any patches. > >>>>>>> > >>>>>>>Plain 2.6.11 doesn't work either. All of 2.6.10-ac11, 2.6.11-rc5, > >>>>>>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with > >>>>>>>the > >>>>>>>same symptoms. > >>>>>>> > >>>>>>>Reverting to stable 2.6.10-ac8 :-) > >>>>>> > >>>>>>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix > >>>>>>things? > >>>>>> > >>>>> > >>>>> > >>>>>Still the same with this patch reverted. > >>>> > >>>>Does reverting the attached patch in 2.6.11 fix things? (apply with > >>>>patch -R) > >>>> > >>>>This patch reverts the entire libata back to 2.6.10. > >>>> > >>> > >>>I'm confused. Still the same with everything reverted. What shall I do > >>>now? > >> > >>Well, first, thanks for your patience in narrowing this down. > >> > >>This means we have eliminated libata as a problem source, but we still > >>have the rest of the kernel go to through :) > >> > >>Try disabling ACPI with 'acpi=off' or 'pci=biosirq' to see if that fixes > >>things. 
> >>
> >
> >I tried both settings with plain 2.6.11. Almost the same results, in my
> >impression acpi=off causes the failure to appear even faster.
>
> Just to make sure I have things right, please tell me if this is correct:
>
> * 2.6.10 vanilla works
>
> * 2.6.11 vanilla does not work
>
> * 2.6.11 vanilla + 2.6.10 libata does not work
> [2.6.10 libata == reverting all libata changes]
>
> Is that all correct?

Thanks for asking these precise questions. After double-checking
everything I found a typo in my configuration that changes things a bit.
I repeated some tests and the correct answers are now:

* 2.6.10 vanilla              works
* 2.6.10-ac8                  works
* 2.6.10-ac11                 does not work
* 2.6.11 vanilla              does not work
* 2.6.11 w/o promise.patch    does not work
* 2.6.11 + 2.6.10 libata      works!

This looks much more consistent to me but brings the case back to
libata.

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-04 22:48 /home/jo/.signature
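The corrected results can be read as an ordered list in which the breakage starts at one point. As a rough illustration only (the helper name first_failing is hypothetical, and treating the -ac releases and 2.6.11 as one linear sequence is a simplifying assumption), the boundary can be picked out like this:

```python
def first_failing(results):
    """Given (kernel, works) pairs ordered oldest first, return
    (last_good, first_bad); either element is None if it does not exist."""
    last_good = None
    for kernel, works in results:
        if works:
            last_good = kernel
        else:
            return last_good, kernel
    return last_good, None


# Results as reported in this message (unpatched kernels only).
results = [
    ("2.6.10", True),
    ("2.6.10-ac8", True),
    ("2.6.10-ac11", False),
    ("2.6.11", False),
]
print(first_failing(results))
```

With these inputs the interval to inspect is 2.6.10-ac8 to 2.6.10-ac11, which matches the observation that the problem entered with the SATA changes in the later -ac releases.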
Re: [SATA] libata-dev queue updated
On Fri, Mar 04, 2005 at 01:07:16PM -0500, Jeff Garzik wrote: > Joerg Sommrey wrote: > >On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote: > > > >>Joerg Sommrey wrote: > >> > >>>On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote: > >>> > >>> > >>>>Joerg Sommrey wrote: > >>>> > >>>> > >>>>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote: > >>>>> > >>>>> > >>>>> > >>>>>>Joerg Sommrey wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>>>Jeff Garzik wrote: > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>>Patch: > >>>>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2 > >>>>>>> > >>>>>>> > >>>>>>>Still not usable here. The same errors as before when backing up: > >>>>>> > >>>>>>Please try 2.6.11 without any patches. > >>>>> > >>>>>Plain 2.6.11 doesn't work either. All of 2.6.10-ac11, 2.6.11-rc5, > >>>>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the > >>>>>same symptoms. > >>>>> > >>>>>Reverting to stable 2.6.10-ac8 :-) > >>>> > >>>>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix > >>>>things? > >>>> > >>> > >>> > >>>Still the same with this patch reverted. > >> > >>Does reverting the attached patch in 2.6.11 fix things? (apply with > >>patch -R) > >> > >>This patch reverts the entire libata back to 2.6.10. > >> > > > >I'm confused. Still the same with everything reverted. What shall I do > >now? > > Well, first, thanks for your patience in narrowing this down. > > This means we have eliminated libata as a problem source, but we still > have the rest of the kernel go to through :) > > Try disabling ACPI with 'acpi=off' or 'pci=biosirq' to see if that fixes > things. > I tried both settings with plain 2.6.11. Almost the same results, in my impression apci=off causes the failure to appear even faster. 
-jo

--
-rw-r--r-- 1 jo users 63 2005-03-04 20:54 /home/jo/.signature
Re: [SATA] libata-dev queue updated
On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote:
> >
> >>Joerg Sommrey wrote:
> >>
> >>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> >>>
> >>>>Joerg Sommrey wrote:
> >>>>
> >>>>>Jeff Garzik wrote:
> >>>>>
> >>>>>>Patch:
> >>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >>>>>
> >>>>>Still not usable here. The same errors as before when backing up:
> >>>>
> >>>>Please try 2.6.11 without any patches.
> >>>
> >>>Plain 2.6.11 doesn't work either. All of 2.6.10-ac11, 2.6.11-rc5,
> >>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the
> >>>same symptoms.
> >>>
> >>>Reverting to stable 2.6.10-ac8 :-)
> >>
> >>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix
> >>things?
> >>
> >
> >Still the same with this patch reverted.
>
> Does reverting the attached patch in 2.6.11 fix things? (apply with
> patch -R)
>
> This patch reverts the entire libata back to 2.6.10.
>

I'm confused. Still the same with everything reverted. What shall I do
now?

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-04 18:44 /home/jo/.signature
Re: [SATA] libata-dev queue updated
On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> >
> >>Joerg Sommrey wrote:
> >>
> >>>Jeff Garzik wrote:
> >>>
> >>>>Patch:
> >>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >>>
> >>>Still not usable here. The same errors as before when backing up:
> >>
> >>Please try 2.6.11 without any patches.
> >
> >Plain 2.6.11 doesn't work either. All of 2.6.10-ac11, 2.6.11-rc5,
> >2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the
> >same symptoms.
> >
> >Reverting to stable 2.6.10-ac8 :-)
>
> Does reverting the attached patch in 2.6.11 (apply with patch -R) fix
> things?
>

Still the same with this patch reverted.

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-04 07:32 /home/jo/.signature
Re: [SATA] libata-dev queue updated
On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >Jeff Garzik wrote:
> >>Patch:
> >>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >
> >Still not usable here. The same errors as before when backing up:
>
> Please try 2.6.11 without any patches.

Plain 2.6.11 doesn't work either. All of 2.6.10-ac11, 2.6.11-rc5,
2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the
same symptoms.

Reverting to stable 2.6.10-ac8 :-)

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-03 20:23 /home/jo/.signature
Re: [SATA] libata-dev queue updated
Jeff Garzik wrote:
>BK users:
>	bk pull bk://gkernel.bkbits.net/libata-dev-2.6
>Patch:
>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2

Still not usable here. The same errors as before when backing up:

Mar 2 21:09:50 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar 2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar 2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar 2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar 2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar 2 21:09:51 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar 2 21:09:51 bear kernel: sdb: Current: sense key: Medium Error
Mar 2 21:09:51 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar 2 21:09:51 bear kernel: end_request: I/O error, dev sdb, sector 43099350
Mar 2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar 2 21:09:52 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar 2 21:09:52 bear kernel: sdb: Current: sense key: Medium Error
Mar 2 21:09:52 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar 2 21:09:52 bear kernel: end_request: I/O error, dev sdb, sector 43099358
Mar 2 21:09:52 bear kernel: raid1: Disk failure on sdb2, disabling device.
Mar 2 21:09:52 bear kernel: ^IOperation continuing on 1 devices
Mar 2 21:09:52 bear kernel: raid1: sdb2: rescheduling sector 2904720
Mar 2 21:09:52 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 2 21:09:52 bear kernel: ata1: called with no error (51)!
Mar 2 21:09:52 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar 2 21:09:52 bear kernel: sdb: Current: sense key: Medium Error
Mar 2 21:09:52 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed

Using Promise SATA150 TX4 / md-raid1 / lvm / reiserfs

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-02 21:14 /home/jo/.signature
2.6.11-rc5: Promise SATA150 TX4 failure
Hi all, a problem that was introduced between 2.6.10-ac9 and 2.6.10-ac11 made it's way into 2.6.11-rc5. While taking a backup onto a SCSI-streamer one of my RAID1-arrays gets corrupted. Afterwards the system hangs and isn't even bootable. Need to raidhotadd the failed partition in single user mode to get the box working again. Error messages: Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:15 bear kernel: ata2: called with no error (51)! Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:15 bear kernel: ata2: called with no error (51)! Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:15 bear kernel: ata2: called with no error (51)! Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:15 bear kernel: ata2: called with no error (51)! Mar 1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:15 bear kernel: ata2: called with no error (51)! Mar 1 01:46:15 bear kernel: SCSI error : <2 0 0 0> return code = 0x802 Mar 1 01:46:15 bear kernel: sdc: Current: sense key: Medium Error Mar 1 01:46:15 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed Mar 1 01:46:15 bear kernel: end_request: I/O error, dev sdc, sector 52694606 Mar 1 01:46:15 bear kernel: raid1: Disk failure on sdc2, disabling device. Mar 1 01:46:15 bear kernel: ^IOperation continuing on 1 devices Mar 1 01:46:15 bear kernel: raid1: sdc2: rescheduling sector 12499976 Mar 1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:16 bear kernel: ata2: called with no error (51)! 
Mar 1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x802 Mar 1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error Mar 1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed Mar 1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694614 Mar 1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:16 bear kernel: ata2: called with no error (51)! Mar 1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x802 Mar 1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error Mar 1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed Mar 1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694622 Mar 1 01:46:16 bear kernel: raid1: sdc2: rescheduling sector 12499984 Mar 1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error } Mar 1 01:46:16 bear kernel: ata2: called with no error (51)! Mar 1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x802 Mar 1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error Mar 1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed Mar 1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694630 Mar 1 01:46:16 bear kernel: raid1: sdc2: rescheduling sector 1250 Mar 1 01:46:16 bear kernel: RAID1 conf printout: Mar 1 01:46:16 bear kernel: --- wd:1 rd:2 Mar 1 01:46:16 bear kernel: disk 0, wo:0, o:1, dev:sdb2 Mar 1 01:46:16 bear kernel: disk 1, wo:1, o:0, dev:sdc2 Mar 1 01:46:16 bear kernel: RAID1 conf printout: Mar 1 01:46:16 bear kernel: --- wd:1 rd:2 Mar 1 01:46:16 bear kernel: disk 0, wo:0, o:1, dev:sdb2 Mar 1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 12499976 to another mirror Mar 1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 12499984 to another mirror Mar 1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 1250 to another mirror Mar 1 01:46:16 bear kernel: ata1: status=0x51 { 
DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar 1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar 1 01:46:16 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar 1 01:46:16 bear kernel: sdb: Current: sense key: Medium Error

etc. until hard reboot.

The failing array consists of two partitions of two SATA disks connected
to a Promise SATA150 TX4 controller.

-jo

--
-rw-r--r-- 1 jo users 63 2005-03-01 02:26 /home/jo/.signature
Re: Question on CONFIG_IRQBALANCE / 2.6.x
On Fri, Feb 18, 2005 at 02:39:49PM -0800, Martin J. Bligh wrote:
> >
> > there's something I don't understand: With IRQBALANCE *enabled* almost
> > all interrupts are processed on CPU0. This changed in an unexpected way
> > after disabling IRQBALANCE: now all interrupts are distributed uniformly
> > to both CPUs. Maybe it's intentional, but it's not what I expect when a
> > config option named IRQBALANCE is *disabled*.
> >
> > Can anybody comment on this?
>
> If you have a Pentium 3 based system, by default they'll round robin.
> If you turn on IRQbalance, they won't move until the traffic gets high
> enough load to matter. That's presumably what you're seeing.

It's an Athlon box that probably has the same behaviour. Just another
question on this topic: with IRQBALANCE enabled, almost all interrupts
are routed to CPU0. Lately irq 0 runs on CPU1 and never returns to CPU0 -
is there any obvious reason for that?

-jo

--
-rw-r--r-- 1 jo users 63 2005-02-18 23:29 /home/jo/.signature
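To put numbers on "almost all interrupts are routed to CPU0", the per-CPU columns of /proc/interrupts can be summed. The following is only a sketch: the function name cpu_shares is made up here, and the sample text uses invented counts purely to show the input format, not measurements from this box.

```python
def cpu_shares(text):
    """Parse /proc/interrupts-style text and return the fraction of
    counted interrupts handled by each CPU listed in the header."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()                 # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue
        counts = fields[1:1 + len(cpus)]
        # Skip rows such as "ERR:" that carry a single counter only.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for i, value in enumerate(counts):
            totals[i] += int(value)
    grand = sum(totals)
    return {cpu: (t / grand if grand else 0.0) for cpu, t in zip(cpus, totals)}


# Illustrative input only; the counts are invented.
sample = """\
           CPU0       CPU1
  0:        900        100   IO-APIC-edge  timer
  1:         50         50   IO-APIC-edge  i8042
ERR:          0
"""
shares = cpu_shares(sample)
print(shares)
```

On a live system the text would come from reading /proc/interrupts; comparing two readings taken some time apart shows where new interrupts are landing rather than the totals since boot.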
Question on CONFIG_IRQBALANCE / 2.6.x
Hi all,

there's something I don't understand: With IRQBALANCE *enabled* almost
all interrupts are processed on CPU0. This changed in an unexpected way
after disabling IRQBALANCE: now all interrupts are distributed uniformly
to both CPUs. Maybe it's intentional, but it's not what I expect when a
config option named IRQBALANCE is *disabled*.

Can anybody comment on this?

Thanks,
-jo

--
-rw-r--r-- 1 jo users 63 2005-02-18 21:21 /home/jo/.signature
Re: Problem on SATA-disk with Promise SATAII 150 TX4 ("DriveReady SeekComplete Error")
Jeff Garzik wrote:
>Johannes Resch wrote:
>> Hi,
>>
>> [please CC me on replies]
>>
>> I've got a box running 2.6.10 (with the patch[0] needed to support the
>> Promise SATAII 150 TX4 controller).
>> This box has three software raid1 partitions mirrored on a SATA disk on
>> the Promise controller and a disk on the mainboard IDE controller (VIA
>> vt8235).
>>
>> Within 4 days running the raid1, I got those three errors pasted below,
>> each marking the SATA-raidmember as faulty. After "raidhotremove" and
>> "raidhotadd" the SATA-raidmember syncs again fine and works at least a
>> day until it is marked as faulty again.
>>
>> Any pointers where I could look at to resolve this problem?
>> The SATA drive is a new Seagate ST3250823AS.

>I would change out your cables, and also make sure you are running
>2.6.11-rc3-bk-latest, which includes all the SATAII patches and other fixes.

I don't believe it has anything to do with cabling. 2.6.10-ac9
introduced some SATA patches. I didn't check -ac9 and -ac10, but -ac11
and -ac12 are not usable on my box, with exactly the same symptoms.

-jo

--
-rw-r--r-- 1 jo users 63 2005-02-12 18:43 /home/jo/.signature
Need advice on amd76x_pm [patch included]
note + * in amd76x_smp_idle(). I've noticed that when NTH and idling are both + * enabled, my hardware locks and requires a hard reset, so I have + * #ifndefed around the idle loop setting to prevent this. POS locks it up + * too, both ought to be fixable. I've also noticed that idling and NTH + * make some interference that is picked up by the onboard sound chip on + * my ASUS A7M266-D motherboard. + * + * 20030601: Pasi Savolainen + * Simple port to 2.5 + * Added sysfs interface for making nice graphs with mrtg. + * Look for /sys/devices/pci0/00:00.0/C2_cnt & lazy_idle (latter writable) + * + * 20041204: Joerg Sommrey (jo) + * trying to enable preemption + * added C3-count to sysfs + * renamed module parm from "l" to "lazy_idle" + * added some "dummy op" after return from C2 and C3 + * Note: using this module on my S2466 makes the system clock kind + * of instable. After playing with some RR-priorities and processor + * affinities I managed to reduce ntpd's time offsets to about + * 4ms. Without this module time offsets are in the range of 1-2ms. + * + * TODO: Thermal throttling (TTH). + * /proc interface for normal throttling level. + * /proc interface for POS. + * + * + * + * + * Processor idle mode module for AMD SMP 760MP(X) based systems + * + * Copyright (C) 2002 Tony Lindgren <[EMAIL PROTECTED]> + *Johnathan Hicks (768 support) + * + * Using this module saves about 70 - 90W of energy in the idle mode compared + * to the default idle mode. Waking up from the idle mode is fast to keep the + * system response time good. Currently no CPU load calculation is done, the + * system exits the idle mode if the idle function runs twice on the same + * processor in a row. This only works on SMP systems, but maybe the idle mode + * enabling can be integrated to ACPI to provide C2 mode at some point. + * + * NOTE: Currently there's a bug somewhere where the reading the + * P_LVL2 for the first time causes the system to sleep instead of + * idling. 
This means that you need to hit the power button once to + * wake the system after loading the module for the first time after + * reboot. After that the system idles as supposed. + * + * + * Influenced by Vcool, and LVCool. Rewrote everything from scratch to + * use the PCI features in Linux, and to support SMP systems. + * + * Currently only tested on a TYAN S2460 (760MP) system (Tony) and an + * ASUS A7M266-D (760MPX) system (Johnathan). Adding support for other Athlon + * SMP or single processor systems should be easy if desired. + * + * This software is licensed under GNU General Public License Version 2 + * as specified in file COPYING in the Linux kernel source tree main + * directory. + * + * + */ + + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#define VERSION"20041204" + +// #define AMD76X_C3 1 +// #define AMD76X_NTH 1 +// #define AMD76X_POS 1 +// #define AMD76X_PREEMPT_DISABLE 1 +#define AMD76X_IRQ_DISABLE 1 +#define AMD76X_DUMMY_OP 1 + + +extern void default_idle(void); +static void amd76x_smp_idle(void); +static int amd76x_pm_main(void); + +unsigned long lazy_idle = 0; + +/* jo: make some compile time warnings about deprecation go away */ +module_param(lazy_idle, long, 0); +MODULE_PARM_DESC(lazy_idle, "number of idle cycles before entering C2"); + +static struct pci_dev *pdev_nb; +static struct pci_dev *pdev_sb; + +struct PM_cfg { + unsigned int status_reg; + unsigned int C2_reg; + unsigned int C3_reg; + unsigned int NTH_reg; + unsigned int slp_reg; + unsigned int resume_reg; + void (*orig_idle) (void); + void (*curr_idle) (void); + unsigned long C2_cnt, C3_cnt, idle_cnt; + int last_pr; +}; + +static struct PM_cfg amd76x_pm_cfg; + +struct cpu_idle_state { + int idle; + int count; +}; +static struct cpu_idle_state prs[2]; + +static struct pci_device_id __devinitdata amd_nb_tbl[] = { + {PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_FE_GATE_700C, PCI_ANY_ID, PCI_ANY_ID,}, + {0,} +}; + +static struct 
pci_device_id __devinitdata amd_sb_tbl[] = {
+	{PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_VIPER_7413, PCI_ANY_ID, PCI_ANY_ID,},
+	{PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_VIPER_7443, PCI_ANY_ID, PCI_ANY_ID,},
+	{0,}
+};
+
+/*
+ * Configures the AMD-762 northbridge to support PM calls
+ */
+static int
+config_amd762(int enable)
+{
+	unsigned int regdword;
+
+	/* Enable STPGNT in BIU Status/Control for cpu0 */
+	pci_read_config_dword(pdev_nb, 0x60, &regdword);
+	regdword |= (1 << 17);
+	pci_write_config_dword(pdev_nb, 0x60, regdword);
+
+	/* Enable STPGNT in BIU Status/Control for cpu1 */
+	pci_read_config_dword(pdev_nb, 0x68, &regdwo
strange repeating keys and irq 0 routing on 2.6.x
Hi all,

a few times in the last couple of months I experienced some strange key
repeating in X. A single key-press sometimes results in 2-8 typed keys.
This started around 2.6.8 but I'm not sure. It was last seen on
2.6.10-ac8. There were similar problems reported for Toshiba laptops and
within XFree, but my problem seems to be different.

The key repeating starts some hours after reboot and it gets worse with
time. After one day the keyboard is not usable anymore. At the same time
ntpd often fails reading the radio clock attached to a serial port.

It doesn't happen after every reboot. BUT: Sometimes IRQ 0 is processed
on CPU1 after reboot (though /proc/irq/0/smp_affinity is 3 and all other
irqs are handled on CPU0). In these cases the repeating keys appear.

The box has two Athlon MPs.

What can I do to gather more information?

-jo

--
-rw-r--r-- 1 jo users 63 2005-01-10 08:33 /home/jo/.signature
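As a side note on the smp_affinity value mentioned above: the file holds a hexadecimal bitmask in which bit N stands for CPU N, so 3 means "CPU0 or CPU1". A small sketch for decoding such a mask (the helper name cpus_in_mask is hypothetical):

```python
def cpus_in_mask(mask_hex):
    """Return the CPU numbers whose bits are set in an smp_affinity mask.

    The mask is hexadecimal; larger systems print it in comma-separated
    groups, so commas are dropped before parsing."""
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if mask & (1 << cpu)]


print(cpus_in_mask("3"))  # → [0, 1]
```

A mask of 3 therefore allows IRQ 0 on either CPU, so seeing it on CPU1 does not contradict the affinity setting by itself.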
Re: Linux 2.6.10-ac12
On Sun, Feb 06, 2005 at 04:02:42PM +, Alan Cox wrote:
>Arjan van de Ven is now building RPMS of the kernel and those can be found
>in the RPM subdirectory and should be yum-able. Expect the RPMS to lag the
>diff a little as the RPM builds and tests do take time.

>Nothing terribly exciting here security wise but various bugs for problems
>people have been hitting that are now fixed upstream, and also the ULi
>tulip variant should now work. If you are running IPv6 you may well want
>the networking fixes.

Something broke my box after 2.6.10-ac8. Both -ac11 and -ac12 cause
problems when backing up sata/md-raid/dm-snapshots/reiserfs to a SCSI
tape drive. (Works fine when writing to /dev/null :-)

The backup started at 1:30 a.m.; the error messages start with:

Feb 8 01:34:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Feb 8 01:34:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Feb 8 01:34:07 bear kernel: FMK Current sdc: sense = 70 99
Feb 8 01:34:07 bear kernel: ASC=26 ASCQ=c0
Feb 8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294166
Feb 8 01:34:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Feb 8 01:34:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Feb 8 01:34:07 bear kernel: FMK Current sdc: sense = 70 99
Feb 8 01:34:07 bear kernel: ASC=26 ASCQ=c0
Feb 8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294174
Feb 8 01:34:07 bear kernel: raid1: Disk failure on sdc2, disabling device.
Feb 8 01:34:07 bear kernel: ^IOperation continuing on 1 devices
Feb 8 01:34:07 bear kernel: raid1: sdc2: rescheduling sector 12099536
Feb 8 01:34:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Feb 8 01:34:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Feb 8 01:34:07 bear kernel: FMK Current sdc: sense = 70 99
Feb 8 01:34:07 bear kernel: ASC=26 ASCQ=c0
Feb 8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294182
Feb 8 01:34:07 bear kernel: raid1: sdc2: rescheduling sector 12099552

etc. until hard reboot via sysrq-b.

I found only one patch that seems to be related:

>2.6.10-ac11
>* Fix oops with md over dm	(Jens Axboe)

Can this by any chance cause my problems?

-jo

--
-rw-r--r-- 1 jo users 63 2005-02-08 01:47 /home/jo/.signature
2.6.10-ac11 causes failures on sata/md/raid
Hello,

2.6.10-ac11 causes strange errors on a sata-md-raid-1. It starts like this:

Jan 31 23:21:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 31 23:21:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Jan 31 23:21:07 bear kernel: FMK Current sdc: sense = 70 99
Jan 31 23:21:07 bear kernel: ASC=26 ASCQ=c0
Jan 31 23:21:07 bear kernel: end_request: I/O error, dev sdc, sector 50888046
Jan 31 23:21:07 bear kernel: raid1: Disk failure on sdc2, disabling device.
Jan 31 23:21:07 bear kernel: ^IOperation continuing on 1 devices
Jan 31 23:21:07 bear kernel: raid1: sdc2: rescheduling sector 10693416
Jan 31 23:21:07 bear kernel: raid1: sdb2: redirecting sector 10693416 to another mirror
Jan 31 23:21:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 31 23:21:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Jan 31 23:21:07 bear kernel: FMK Current sdc: sense = 70 99
Jan 31 23:21:07 bear kernel: ASC=26 ASCQ=c0
Jan 31 23:21:07 bear kernel: end_request: I/O error, dev sdc, sector 50888054
Jan 31 23:21:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 31 23:21:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Jan 31 23:21:07 bear kernel: FMK Current sdc: sense = 70 99
Jan 31 23:21:07 bear kernel: ASC=26 ASCQ=c0
Jan 31 23:21:07 bear kernel: end_request: I/O error, dev sdc, sector 50888062
Jan 31 23:21:07 bear kernel: raid1: sdc2: rescheduling sector 10693424
Jan 31 23:21:07 bear kernel: raid1: sdb2: redirecting sector 10693424 to another mirror

These messages are repeated forever (with different sector numbers).

I am able to boot into single user mode. Hotadding the "failed" partition
/dev/sdc2 to /dev/md1 works w/o problems, and there are no problems accessing
the filesystem on /dev/md1. But when I enter runlevel 3, the same errors
appear and make the system unusable again.

After reverting to 2.6.10-ac8 everything works fine again.
-jo

--
-rw-r--r-- 1 jo users 63 2005-02-01 18:44 /home/jo/.signature

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.10-ac11
# Sat Jan 29 13:15:21 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=18
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
CONFIG_X86_HZ=1000
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
# CONFIG_HPET_TIMER is not set
CONFIG_SMP=y
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not set
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION