Re: [PATCH] bad temperature values from w83781d in 2.6.22

2007-08-09 Thread Joerg Sommrey
Hi Mark,

On Thu, Aug 09, 2007 at 08:26:19AM -0400, Mark M. Hoffman wrote:
> Hi Joerg:
> 
> > On Wed, Aug 08, 2007 at 11:56:42PM -0400, Mark M. Hoffman wrote:
> > > Thanks for sending all that.  I see one bug clearly, and I'm pretty close to
> > > seeing the other one.  But for tonight, I need sleep.
> 
> There's just one bug after all.  The second was a figment of my sleep-deprived
> imagination.
> 
[...]
> 
> My bad, there's a second i2cset command that would have done it.  Please try
> this patch against v2.6.22.
> 

Great, problem fixed.  Thanks a lot for your prompt solution!

-jo

-- 
-rw-r--r-- 1 jo users 62 2007-08-09 08:46 /home/jo/.signature
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [lm-sensors] bad temperature values from w83781d in 2.6.22

2007-08-09 Thread Joerg Sommrey
Hi Mark,

On Wed, Aug 08, 2007 at 11:56:42PM -0400, Mark M. Hoffman wrote:
> Hi Joerg:
> 
> * Joerg Sommrey <[EMAIL PROTECTED]> [2007-08-08 17:17:16 +0200]:
> > Hi Mark,
> > 
> > just to eliminate as many impacts as possible, I did:
> > - reinstall the unmodified sensors.conf from Tyan's support page
> > - power off before rebooting
> > 
> > A call to "sensors -s" is done without errors in all cases.
> > The module parameters I use currently with both kernels:
> > 
> > options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49
> > options w83627hf force_addr=0x0c00
> > 
> > When I first realized the problem, I didn't use w83627hf yet.  Results
> > are the same when w83781d is used as driver for w83627hf.
> > Parameters in that case just from Tyan:
> > 
> > options w83781d force_w83782d=0,0x2d force_subclients=0,0x2d,0x48,0x49 force_w83627hf=0,0x2c force_subclients=0,0x2c,0x4a,0x4b init=0
> > 
> > "My" i2cdump doesn't accept an -y option, maybe a Debianism.  Results
> > see below.
> 
> Newer i2cdump skips the 5-second warning when given -y, that's all.
> 
> > ### 2.6.21 ###
> > Script started on Wed Aug  8 16:53:10 2007
> > bear:~/hwmon# i2cdump 0 0x2d b 0 0x4e
> 
> (snip tons of results)
> 
> Thanks for sending all that.  I see one bug clearly, and I'm pretty close to
> seeing the other one.  But for tonight, I need sleep.
> 
> In the meantime, please try this command as root, against the newer kernel,
> *after* you've done 'sensors -s':
> 
>   # i2cset -f 0 0x2d 0x5d 0x0e b
> 
> Wait > 2 seconds for the hardware to update itself, then run 'sensors' again.
> I'm pretty sure you'll see the correct temps.
The displayed temperatures changed to 67.5°C / 66.0°C.  Still, this seems to
be too high.  The power supply's fan runs too slow for such CPU
temperatures. In older kernels it becomes noisy above 50°C.

Under load the temperatures shown are around 95°C, way too high.

-jo

-- 
-rw-r--r-- 1 jo users 62 2007-08-08 21:06 /home/jo/.signature


Re: bad temperature values from w83781d in 2.6.22

2007-08-08 Thread Joerg Sommrey
   P.KPP.KPP.KPP.KP
d0: 50 00 4b 50 50 00 4b 50 50 00 4b 50 50 00 4b 50    P.KPP.KPP.KPP.KP
e0: 50 00 4b 50 50 00 4b 50 4f 00 4b 50 4f 00 4b 50    P.KPP.KPO.KPO.KP
f0: 50 00 4b 50 50 00 4b 50 50 00 4b 50 50 00 4b 50    P.KPP.KPP.KPP.KP
bear:~/hwmon# i2cdump 0 0x49
No size specified (using byte-data access)
  WARNING! This program can confuse your I2C bus, cause data loss and worse!
  I will probe file /dev/i2c-0, address 0x49, mode byte
  You have five seconds to reconsider and press CTRL-C!

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
10: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
20: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
30: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
40: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
50: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
60: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
70: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
80: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
90: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
a0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
b0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
c0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
d0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
e0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
f0: 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50 4f 00 4b 50    O.KPO.KPO.KPO.KP
bear:~/hwmon# sensors
w83782d-i2c-0-2d
Adapter: SMBus AMD768 adapter at 80e0
AGP V: +1.73 V  (min =  +3.14 V, max =  +3.47 V)   ALARM  
 +5 V: +4.81 V  (min =  +4.76 V, max =  +5.24 V)  
DDR V: +1.22 V  (min =  +2.85 V, max =  +3.47 V)   ALARM  
3 VSB: +3.30 V  (min =  +2.85 V, max =  +3.15 V)   ALARM  
Bat V: +0.00 V  (min =  +2.64 V, max =  +3.95 V)   ALARM  
chs1 Fan:    0 RPM  (min = 2700 RPM, div = 2)  ALARM
chs2 Fan:    0 RPM  (min = 3970 RPM, div = 2)  ALARM
chs3 Fan:    0 RPM  (min = 10546 RPM, div = 2)  ALARM
VRM2 Temp:   +56 C  (high =   +80 C, hyst =   +75 C)   sensor = transistor
CPU1 Temp: +79.5 C  (high =   +80 C, hyst =   +75 C)   sensor = transistor   ALARM
CPU2 Temp: +79.5 C  (high =   +80 C, hyst =   +75 C)   sensor = transistor   ALARM
alarms:   
beep_enable:
  Sound alarm enabled

w83627hf-isa-0c00
Adapter: ISA adapter
VCore1:    +1.71 V  (min =  +1.66 V, max =  +1.84 V)
VCore2:    +1.71 V  (min =  +1.66 V, max =  +1.84 V)
+3.3 V:    +3.33 V  (min =  +3.14 V, max =  +3.47 V)
 +12 V:   +11.83 V  (min = +13.21 V, max = +10.83 V)   ALARM
 -12 V:   -12.20 V  (min = -13.18 V, max = -10.80 V)
CPU1 Fan: 4041 RPM  (min = 4687 RPM, div = 2)  ALARM
CPU2 Fan: 4166 RPM  (min = 6750 RPM, div = 2)  ALARM
VRM1 Temp:   +43 C  (high =  -124 C, hyst =   +16 C)   sensor = transistor   ALARM
AGP Temp:  +49.5 C  (high =   +80 C, hyst =   +75 C)   sensor = transistor
DDR Temp:  +46.0 C  (high =   +80 C, hyst =   +75 C)   sensor = transistor
alarms:   Chassis intrusion detection  ALARM
beep_enable:
  Sound alarm disabled

bear:~/hwmon# exit

Script done on Wed Aug  8 16:43:20 2007

On Tue, Aug 07, 2007 at 09:03:16PM -0400, Mark M. Hoffman wrote:
> Hi Joerg:
> 
> (I tried to follow-up using the gmane.org mail/news gateway... didn't seem
> to work.)
> 
> * Joerg Sommrey <[EMAIL PROTECTED]> [2007-08-05 12:26:04 +0200]:
> > Hi,
> > 
> > after upgrading from 2.6.21 to 2.6.22 the CPU temperatures shown by
> > w83781d look unreal.  They were in a range from 40°C when idle to
> > 75°C under full load with 2.6.21.  The values shown now are in a very
> > small range from 77°C to 82°C.  From the (low) noise of the fan I can
> > tell that the temperature is <50°C.
> > The third temperature shown is completely wrong.
> > 
> > I have a Tyan Tiger MPX board with a w83782d chip. Output from
> > "sensors":
> > 
> > w83782d-i2c-0-2d
> > Adapter: SMBus AMD768 adapter at 80e0
> >  +5 V: +4.81 V  (min =  +4.76 V, max =  +5.24 V)
> > 3 VSB: +3.30 V  (min =  +2.85 V, max =  +3.15 V)   ALARM
> > chs3 Fan: 2122 RPM  (min = 2657 RPM, div = 4)  ALARM
> > VRM2 Temp:  -208°C  (high =  -176°C, hyst =  -181°C)   sensor = transistor 
> > CPU1 Temp: +78.5°C  (high =   +80°C, hyst =   +75°C)   sensor = transistor   ALARM
> > CPU2 Temp: +77.5°C  (high =   +80°C, hyst =   +75°C)   sensor = transistor

Re: bad temperature values from w83781d in 2.6.22

2007-08-05 Thread Joerg Sommrey
Thanks for your reply.

the sensors.conf I'm currently using is provided by Tyan, so this seems
to be ok.  One major difference that I can see: I don't have compute
statements for the CPU temperatures.  If I use your config, I get 7.8°C
:-)

So there is definitely some difference in our hardware environment. OTOH
with the "right" compute statement the problem seems fixable.

BTW: there is another hwmon chip on the board, a w83627hf.  Up to 2.6.21
this was managed by the w83781d driver, too.  Now I use the w83627hf
driver (on the isa bus).  No problem with that part.

-jo

On Sun, Aug 05, 2007 at 01:20:23PM +0200, Rene Herman wrote:
> On 08/05/2007 12:26 PM, Joerg Sommrey wrote:
> 
> >after upgrading from 2.6.21 to 2.6.22 the CPU temperatures shown by
> >w83781d look unreal.  They were in a range from 40°C when idle to
> >75°C under full load with 2.6.21.  The values shown now are in a very
> >small range from 77°C to 82°C.  From the (low) noise of the fan I can
> >tell that the temperature is <50°C.
> >The third temperature shown is completely wrong.
> >
> >I have a Tyan Tiger MPX board with a w83782d chip. Output from
> >"sensors":
> >
> >w83782d-i2c-0-2d
> >Adapter: SMBus AMD768 adapter at 80e0
> 
> As a datapoint, the same W83782D on AMD756 (also I2C) works correctly with 
> 2.6.22:
> 
> w83782d-i2c-0-2d
> Adapter: SMBus AMD756 adapter at 50e0
> 
> Jean Delvare recently worked on the ISA interface to these chips but it 
> seems this would not be the cause if you are also using I2C. Our hardware 
> appears rather identical...
> 
> I've attached (an excerpt of) my /etc/sensors.conf -- I once dug through 
> the datasheets for those compute lines for example so perhaps it's still 
> useful even if 2.6.21 working for you probably means you don't have a 
> config problem.
> 
> Rene.
> 


bad temperature values from w83781d in 2.6.22

2007-08-05 Thread Joerg Sommrey
Hi,

after upgrading from 2.6.21 to 2.6.22 the CPU temperatures shown by
w83781d look unreal.  They were in a range from 40°C when idle to
75°C under full load with 2.6.21.  The values shown now are in a very
small range from 77°C to 82°C.  From the (low) noise of the fan I can
tell that the temperature is <50°C.
The third temperature shown is completely wrong.

I have a Tyan Tiger MPX board with a w83782d chip. Output from
"sensors":

w83782d-i2c-0-2d
Adapter: SMBus AMD768 adapter at 80e0
 +5 V: +4.81 V  (min =  +4.76 V, max =  +5.24 V)
3 VSB: +3.30 V  (min =  +2.85 V, max =  +3.15 V)   ALARM
chs3 Fan: 2122 RPM  (min = 2657 RPM, div = 4)  ALARM
VRM2 Temp:  -208°C  (high =  -176°C, hyst =  -181°C)   sensor = transistor 
CPU1 Temp: +78.5°C  (high =   +80°C, hyst =   +75°C)   sensor = transistor   ALARM
CPU2 Temp: +77.5°C  (high =   +80°C, hyst =   +75°C)   sensor = transistor   ALARM
alarms:
beep_enable:
  Sound alarm enabled

# cat /sys/bus/i2c/devices/0-002d/temp*_input
-209000
77500
77500
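
For readers comparing these raw values with the "sensors" output: the hwmon sysfs temp*_input files report millidegrees Celsius, so 77500 is 77.5°C.  A minimal sketch of the conversion in C follows; the function names are made up for this example and are not part of lm-sensors or the driver.

```c
#include <stdio.h>

/* Convert a hwmon temp*_input reading (millidegrees Celsius) to degrees. */
double millideg_to_celsius(long millideg)
{
    return (double)millideg / 1000.0;
}

/*
 * Hypothetical helper: format a raw reading with one decimal place,
 * e.g. 77500 becomes "77.5".  Returns what snprintf() returns.
 */
int format_celsius(long millideg, char *buf, size_t len)
{
    return snprintf(buf, len, "%.1f", millideg_to_celsius(millideg));
}
```

Applied to the three readings above, this gives -209.0, 77.5 and 77.5 degrees.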

Any ideas?

-jo

-- 
-rw-r--r-- 1 jo users 62 2007-08-04 14:02 /home/jo/.signature


Re: Linux 2.6.21: pmtmr losing time

2007-05-14 Thread Joerg Sommrey
On Tue, May 01, 2007 at 09:36:24AM +0200, Joerg Sommrey wrote:
> On Mon, Apr 30, 2007 at 11:38:34PM +0200, Thomas Gleixner wrote:
> > On Mon, 2007-04-30 at 18:39 +0200, Joerg Sommrey wrote:
> > > Here it is.  Maybe this problem is related to the usage of the
> > > "experimental" amd76x_pm module?
> > 
> > Can you please verify what happens w/o that module ?
> > 
> After rebooting the problem vanished for now.  It first appeared after
> an uptime of about 3 days.  I'll wait a few days.  If it shows
> again, then I'll check without amd76x_pm.

It really looks like amd76x_pm is causing the time loss.  Sadly, I have
no idea how to avoid this.  What happens when the pm-timer wraps around
while the processors are in C3 sleep state?  How is this wrap detected
at all?  Is there anything I could put into amd76x_pm?  It's no problem
detecting a timer wrap there.
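
To illustrate the question: a free-running counter is commonly read as a masked difference between two samples.  That copes with a single wrap, but it cannot tell whether one or more full periods have passed between the samples.  A minimal sketch in C, assuming a 24-bit counter and that elapsed time is accumulated from such deltas; the names are hypothetical and not taken from the kernel's clocksource code.

```c
#include <stdint.h>

#define PM_MASK UINT32_C(0x00ffffff)   /* 24-bit counter, as assumed in this thread */

/* Ticks elapsed between two counter samples, modulo one counter period. */
uint32_t pm_delta(uint32_t last, uint32_t now)
{
    return (now - last) & PM_MASK;
}
```

With this scheme pm_delta(0x00fffff0, 0x00000010) correctly yields 0x20 across a wrap, but two samples taken exactly one period apart, such as pm_delta(100, 100), yield 0.  If the counter is not sampled for longer than one period, that whole period goes unnoticed, which would be consistent with losing about 4.68 seconds at a time.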

-jo

-- 
-rw-r--r-- 1 jo users 62 2007-05-13 21:55 /home/jo/.signature


Re: Linux 2.6.21: pmtmr losing time

2007-05-01 Thread Joerg Sommrey
On Mon, Apr 30, 2007 at 11:38:34PM +0200, Thomas Gleixner wrote:
> On Mon, 2007-04-30 at 18:39 +0200, Joerg Sommrey wrote:
> > Here it is.  Maybe this problem is related to the usage of the
> > "experimental" amd76x_pm module?
> 
> Can you please verify what happens w/o that module ?
> 
After rebooting the problem vanished for now.  It first appeared after
an uptime of about 3 days.  I'll wait a few days.  If it shows
again, then I'll check without amd76x_pm.

-jo

> Thanks,
> 
>   tglx
> 
> 

-- 
-rw-r--r-- 1 jo users 62 2007-05-01 01:11 /home/jo/.signature


Re: Linux 2.6.21: pmtmr losing time

2007-04-30 Thread Joerg Sommrey
On Mon, Apr 30, 2007 at 06:23:36PM +0200, Thomas Gleixner wrote:
> On Mon, 2007-04-30 at 14:52 +0200, Joerg Sommrey wrote:
> > Hi all,
> > 
> > after switching to 2.6.21 the system clock sporadically loses time on my
> > box (i386, Athlon MP).
> > It's always around 4.68 seconds and happened 7 times in the last 12
> > hours.  A simple calculation (2 ^ ACPI_PM_MASK / PMTMR_TICKS_PER_SEC =
> > 2 ^ 24 / 3579545 = 4.686968875) shows: There is almost exactly one
> > pmtmr-cycle missing.  Could this be caused by a pmtmr-wrap when the
> > system is in a sleep state? 
> 
> Hmm, looks like. That's strange we don't sleep 4.68 seconds. Can you
> provide me the output of /proc/timer_list please ?
> 
>   tglx
> 
> 

Here it is.  Maybe this problem is related to the usage of the
"experimental" amd76x_pm module?

-jo

Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 249154808025750 nsecs

cpu: 0
 clock 0:
  .index:  0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset: 1177701795305563440 nsecs
active timers:
 clock 1:
  .index:  1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset: 0 nsecs
active timers:
 #0: , tick_sched_timer, S:01
 # expires at 249154808417850 nsecs [in 392100 nsecs]
 #1: , it_real_fn, S:01
 # expires at 249154832599897 nsecs [in 24574147 nsecs]
 #2: , hrtimer_wakeup, S:01
 # expires at 249154937316640 nsecs [in 129290890 nsecs]
 #3: , it_real_fn, S:01
 # expires at 249154940604253 nsecs [in 132578503 nsecs]
 #4: , it_real_fn, S:01
 # expires at 249156991584989 nsecs [in 2183559239 nsecs]
 #5: , hrtimer_wakeup, S:01
 # expires at 249163930201148 nsecs [in 9122175398 nsecs]
 #6: , hrtimer_wakeup, S:01
 # expires at 249163930619465 nsecs [in 9122593715 nsecs]
 #7: , it_real_fn, S:01
 # expires at 249164018722673 nsecs [in 9210696923 nsecs]
 #8: , it_real_fn, S:01
 # expires at 249164018756764 nsecs [in 9210731014 nsecs]
 #9: , hrtimer_wakeup, S:01
 # expires at 249166140491719 nsecs [in 11332465969 nsecs]
 #10: , it_real_fn, S:01
 # expires at 249168890475020 nsecs [in 14082449270 nsecs]
 #11: , it_real_fn, S:01
 # expires at 249216937518155 nsecs [in 62129492405 nsecs]
 #12: , it_real_fn, S:01
 # expires at 249694958542841 nsecs [in 540150517091 nsecs]
 #13: , hrtimer_wakeup, S:01
 # expires at 252071939585424 nsecs [in 2917131559674 nsecs]
 #14: , it_real_fn, S:01
 # expires at 277954353421786 nsecs [in 28799545396036 nsecs]
  .expires_next   : 249154808417850 nsecs
  .hres_active: 1
  .nr_events  : 12571303
  .nohz_mode  : 2
  .idle_tick  : 249154808417850 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 74656449
  .idle_calls : 31252991
  .idle_sleeps: 18916982
  .idle_entrytime : 249154806452801 nsecs
  .idle_sleeptime : 202805663229475 nsecs
  .last_jiffies   : 74656449
  .next_jiffies   : 74656452
  .idle_expires   : 249154815084516 nsecs
jiffies: 74656449

cpu: 1
 clock 0:
  .index:  0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset: 1177701795305563440 nsecs
active timers:
 clock 1:
  .index:  1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset: 0 nsecs
active timers:
 #0: , tick_sched_timer, S:01
 # expires at 249154808417850 nsecs [in 392100 nsecs]
 #1: , it_real_fn, S:01
 # expires at 249154824005495 nsecs [in 15979745 nsecs]
 #2: , hrtimer_wakeup, S:01
 # expires at 249154876721584 nsecs [in 68695834 nsecs]
 #3: , hrtimer_wakeup, S:01
 # expires at 249154876724658 nsecs [in 68698908 nsecs]
 #4: , it_real_fn, S:01
 # expires at 249156991445550 nsecs [in 2183419800 nsecs]
 #5: , hrtimer_wakeup, S:01
 # expires at 249158033258177 nsecs [in 3225232427 nsecs]
 #6: , it_real_fn, S:01
 # expires at 249158937910855 nsecs [in 4129885105 nsecs]
 #7: , hrtimer_wakeup, S:01
 # expires at 249160645185907 nsecs [in 5837160157 nsecs]
 #8: , hrtimer_wakeup, S:01
 # expires at 249288273241617 nsecs [in 133465215867 nsecs]
 #9: , it_real_fn, S:01
 # expires at 249688052597900 nsecs [in 533244572150 nsecs]
 #10: , hrtimer_wakeup, S:01
 # expires at 250127175002655 nsecs [in 972366976905 nsecs]
 #11: , hrtimer_wakeup, S:01
 # expires at 252087970262178 nsecs [in 2933162236428 nsecs]
  .expires_next   : 249154808417850 nsecs
  .hres_active: 1
  .nr_events  : 13107829
  .nohz_mode  : 2
  .idle_tick  : 249154805084517 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 74656448
  .idle_calls : 27583722
  .idle_sleeps: 11208166
  .idle_entrytime : 249154805632934 nsecs
  .idle_sleeptime : 197905845372598 nsecs
  .last_jiffies   : 74656449
  .next_jiffies   : 74656450
  .idle_expires   : 249154935084504 nsecs
jiffies: 74656449


Tick Device: mode: 1
Clock Event Device: pit
 max_delta_ns:   27461866
 min_delta_ns:   12571
 mult:   5124677
 shift:  32
 mode:   3
 next_event: 9223372036854775807 nsecs
 set_next_event: pit_next_event
 set_

Linux 2.6.21: pmtmr losing time

2007-04-30 Thread Joerg Sommrey
Hi all,

after switching to 2.6.21 the system clock sporadically loses time on my
box (i386, Athlon MP).
It's always around 4.68 seconds and happened 7 times in the last 12
hours.  A simple calculation (2 ^ ACPI_PM_MASK / PMTMR_TICKS_PER_SEC =
2 ^ 24 / 3579545 = 4.686968875) shows: There is almost exactly one
pmtmr-cycle missing.  Could this be caused by a pmtmr-wrap when the
system is in a sleep state? 
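
The arithmetic above can be checked with a few lines of C.  This is only a sketch: the macro and function names are made up for the example, and the two constants are the ones quoted in the calculation (a 24-bit counter running at 3579545 ticks per second).

```c
#include <stdint.h>

#define PMTMR_BITS          24          /* counter width used in the calculation */
#define PMTMR_TICKS_PER_SEC 3579545UL   /* PM timer ticks per second */

/* Seconds the counter needs to run through all of its values once. */
double pmtmr_wrap_seconds(void)
{
    return (double)(UINT32_C(1) << PMTMR_BITS) / (double)PMTMR_TICKS_PER_SEC;
}
```

The result is roughly 4.687 seconds, i.e. the size of each observed loss.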

-jo

-- 
-rw-r--r-- 1 jo users 62 2007-04-29 22:29 /home/jo/.signature


Re: [PATCH 1/2] amd76x_pm: C2 powersaving for AMD K7

2005-09-07 Thread Joerg Sommrey
On Wed, Sep 07, 2005 at 10:00:01AM +0300, Tony Lindgren wrote:
> * Pavel Machek <[EMAIL PROTECTED]> [050906 15:28]:
> > Hi!
> > 
> > > > > +NOTE: Currently there's a bug somewhere where the reading the
> > > > > +  P_LVL2 for the first time causes the system to sleep instead of
> > > > > +  idling. This means that you need to hit the power button once to
> > > > > +  wake the system after loading the module for the first time after
> > > > > +  reboot. After that the system idles as supposed.
> > > > > +  (Only observed on Tony's system.)
> > > > 
> > > > Could you fix this before merge?
> > > 
> > > I think this is some BIOS issue or hardware bug. It happens only on
> > > Tyan S2460. I tried dumping the registers few years ago on my
> > > Tyan s2460, but no luck.
> > > 
> > > Low chance for anybody fixing it...
> > > 
> > 
> > So at least DMI-blacklist it...
> 
> I rarely have access to that hardware, so don't count on me doing
> this...
> 
I'm unable to do this either.  Looks like this won't be fixed.

-jo

-- 
-rw-r--r--  1 jo users 63 2005-09-06 20:21 /home/jo/.signature


Re: [PATCH 2/2] amd76x_pm: C2 powersaving for AMD K7

2005-09-05 Thread Joerg Sommrey
On Mon, Sep 05, 2005 at 04:51:45PM +0200, Pavel Machek wrote:
> Hi!
> 
> > This patch adds some experimental features to amd76x_pm, namely C3, NTH
> > and POS support.  These features are not enabled by default and are
> > intended for kernel hackers only.
> > 
> 
> Could those wait until ready?
There seems to be no development in this area at the moment.  That's why
these were separated into the -extra patch.  Just omit this -extra patch
and everything is fine :-)

-jo

-- 
-rw-r--r--  1 jo users 63 2005-09-05 08:37 /home/jo/.signature


[PATCH 1/2] amd76x_pm: C2 powersaving for AMD K7

2005-09-01 Thread Joerg Sommrey
This is a processor idle module for AMD SMP 760MP(X) based systems.
The patch was originally written by Tony Lindgren and has been around
since 2002.  It enables C2 mode on AMD SMP systems and thus saves
about 70 - 90 W of energy in the idle mode compared to the default idle
mode.  The idle function has been rewritten and is now free of locking
issues and is independent from the number of CPUs.  The impact
from this module on the system clock and on i/o transfer rates has
been reduced.

This patch can also be found at
http://www.sommrey.de/amd76x_pm/amd_76x_pm-2.6.13-1.patch

Signed-off-by: Joerg Sommrey <[EMAIL PROTECTED]>

diff -Nru linux-2.6.13/Documentation/amd76x_pm.txt linux-2.6.13-jo/Documentation/amd76x_pm.txt
--- linux-2.6.13/Documentation/amd76x_pm.txt 1970-01-01 01:00:00.0 +0100
+++ linux-2.6.13-jo/Documentation/amd76x_pm.txt 2005-09-01 21:50:12.0 +0200
@@ -0,0 +1,326 @@
+   ACPI style power management for SMP AMD-760MP(X) based systems
+   ==============================================================
+
+For use until the ACPI project catches up. :-)
+
+Using this module saves about 70 - 90W of energy in the idle mode compared
+to the default idle mode. Waking up from the idle mode is fast to keep the
+system response time good. Currently no CPU load calculation is done, the
+system exits the idle mode after every C2 call.
+
+NOTE: Currently there's a bug somewhere where reading the
+  P_LVL2 for the first time causes the system to sleep instead of
+  idling. This means that you need to hit the power button once to
+  wake the system after loading the module for the first time after
+  reboot. After that the system idles as expected.
+  (Only observed on Tony's system.)
+
+
+Influenced by Vcool, and LVCool. Rewrote everything from scratch to
+use the PCI features in Linux, and to support SMP systems.
+
+Currently tested amongst others on a TYAN S2460 (760MP) system (Tony), an
+ASUS A7M266-D (760MPX) system (Johnathan) and a TYAN S2466 (760MPX)
+system (Jo). Adding support for other Athlon SMP or single processor
+systems should be easy if desired.  
+
+The file /sys/devices/pci0000:00/0000:00:00.0/C2_cnt shows the number of
+C2 calls since module load.
+
+There are some parameters for tuning the behaviour of amd76x_pm:
+lazy_idle, spin_idle, watch_irqs, watch_int and min_C1
+
+lazy_idle and spin_idle are closely related:
+
+- lazy_idle defines the number of idle calls into amd76x_smp_idle that are
+  needed to *enable* C2 mode.  This parameter is the maximum loop counter
+  for an outer loop with interrupts enabled that guarantees low latencies. 
+  The default for lazy_idle is 512.
+
+- spin_idle defines the maximum number of spin cycles in an inner idle loop
+  where one CPU waits for all others to get into C2-enabled mode. When all
+  CPUs are in C2-enable mode they (more or less) simultaneously *enter* C2
+  mode. In this inner loop interrupts are disabled.  The loop is left
+  immediately if there is something waiting to be scheduled.  The default
+  for spin_idle is 2*lazy_idle.
+
+lazy_idle and spin_idle define a "rubber measure" for the idling
+behaviour: lazy_idle defines the minimum "idling" needed to enter C2 and
+spin_idle defines when to give up.
+
+Low values for lazy_idle and high values for spin_idle give better
+cooling.  Higher values for lazy_idle simply give less cooling.  spin_idle
+is a kind of emergency brake to leave C2-enable mode if CPUs don't
+synchronize.
+
+Interrupts are disabled in C2 mode.  The CPUs are woken up by timer
+interrupts or by NMIs.  This causes a high interrupt latency for other
+interrupts that leads to a significant reduction in io or network
+throughput.  An "irq rate watcher" has been introduced to reduce
+this effect.  If the irq rate watcher detects that an interrupt has a
+rate above a given limit, C2 idling is disabled and a low latency C1
+idling mode is used instead. The parameters watch_irqs, watch_int and
+min_C1 control this irq rate watcher:
+
+- watch_irqs defines which interrupts are to be watched and optionally
+  at which interrupt rate C2 mode shall be disabled.  The syntax for 
+  watch_irqs is irq1[:rate1],irq2[:rate2],...  The rate is measured in
+  interrupts per second and defaults to 128.  There is no default for
+  watch_irqs.  To enable the irq rate watcher you must specify this
+  parameter.  Enter the interrupts used by disk controllers or network
+  adapters here.
+  
+- watch_int defines the time interval (in milliseconds) at which the
+  interrupt rate is checked.  Too low values may result in an overhead
+  and too high values cause the C1 mode to "kick in" later.  The default
+  for watch_int is 1 second.
+
+- min_C1 defines the minimum number of check intervals with low
+  interrupt rates that are needed to leave the forced C1 mode.
+
+All parameters lazy_idle, spin_idle, watch_irqs and watch_int 
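
The watch_irqs syntax described in the documentation text above (irq1[:rate1],irq2[:rate2],... with a default rate of 128 interrupts per second) can be illustrated with a small userspace parser.  This is a sketch only and not the module's own set_watch_irqs() code: the struct, the function name and the MAX_WATCH cap are assumptions made for the example.

```c
#include <stdio.h>

#define DEFAULT_RATE 128   /* default rate limit named in the text above */
#define MAX_WATCH    16    /* arbitrary cap chosen for this sketch */

struct watch_entry {
    int irq;
    int rate;   /* interrupts per second */
};

/*
 * Parse "irq1[:rate1],irq2[:rate2],..." into out[].
 * Returns the number of entries parsed, or -1 on a syntax error
 * or when more than max entries are given.
 */
int parse_watch_irqs(const char *s, struct watch_entry *out, int max)
{
    int n = 0;

    while (*s != '\0') {
        int irq, rate = DEFAULT_RATE, used = 0;

        if (n >= max || sscanf(s, "%d%n", &irq, &used) != 1)
            return -1;
        s += used;
        if (*s == ':') {            /* optional per-irq rate limit */
            s++;
            if (sscanf(s, "%d%n", &rate, &used) != 1)
                return -1;
            s += used;
        }
        out[n].irq = irq;
        out[n].rate = rate;
        n++;
        if (*s == ',')              /* another entry follows */
            s++;
        else if (*s != '\0')        /* anything else is a syntax error */
            return -1;
    }
    return n;
}
```

For example, the string "14:500,15" yields two entries: irq 14 with a limit of 500 per second and irq 15 with the default of 128.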

[PATCH 2/2] amd76x_pm: C2 powersaving for AMD K7

2005-09-01 Thread Joerg Sommrey
This patch adds some experimental features to amd76x_pm, namely C3, NTH
and POS support.  These features are not enabled by default and are
intended for kernel hackers only.

Signed-off-by: Joerg Sommrey <[EMAIL PROTECTED]>

--- linux-2.6.13-jo/drivers/acpi/amd76x_pm.c 2005-09-01 21:50:34.0 +0200
+++ linux-2.6.13-jo/drivers/acpi/amd76x_pm.c-extra 2005-09-01 21:53:50.0 +0200
@@ -130,12 +130,17 @@
 
 #include 
 
-#define VERSION "20050830"
+#define VERSION "20050830-extra"
 
+// #define AMD76X_NTH 1
+// #define AMD76X_POS 1
+// #define AMD76X_C3 1
 // #define AMD76X_LOG_C1 1
 
 extern void default_idle(void);
+#ifndef AMD76X_NTH
 static void amd76x_smp_idle(void);
+#endif
 static int amd76x_pm_main(void);
 
 static unsigned long lazy_idle = 0;
@@ -143,8 +148,10 @@
 static unsigned long watch_int = 0;
 static unsigned long min_C1 = AMD76X_MIN_C1;
 
+#ifndef AMD76X_NTH
 static int show_watch_irqs(char *, struct kernel_param *);
 static int set_watch_irqs(const char *, struct kernel_param *);
+#endif
 
 module_param(lazy_idle, long, S_IRUGO | S_IWUSR);
 
@@ -155,6 +162,7 @@
 MODULE_PARM_DESC(spin_idle,
"\tnumber of spin cycles to wait for other CPUs to become idle");
 
+#ifndef AMD76X_NTH
 module_param(watch_int, long, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(watch_int,
"\twatch interval (in milliseconds) for interrupts");
@@ -165,6 +173,7 @@
"\tlist of irqs (and optional their limit per second) that "
"cause fallback to C1 mode. "
"Syntax: irq0[:limit0],irq1[:limit1],...");
+#endif
 
 module_param(min_C1, long, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(min_C1,
@@ -181,6 +190,12 @@
 struct PM_cfg {
unsigned int status_reg;
unsigned int C2_reg;
+#ifdef AMD76X_C3
+   unsigned int C3_reg;
+#endif
+#ifdef AMD76X_NTH
+   unsigned int NTH_reg;
+#endif
unsigned int slp_reg;
unsigned int resume_reg;
void (*orig_idle) (void);
@@ -189,21 +204,34 @@
 
 static struct PM_cfg amd76x_pm_cfg __cacheline_aligned_in_smp;
 
+#ifndef AMD76X_NTH
 struct idle_stat {
atomic_t num_idle;
+#ifdef AMD76X_C3
+   atomic_t num_C2;
+#endif
 };
 
 static struct idle_stat amd76x_stat __cacheline_aligned_in_smp = {
.num_idle = ATOMIC_INIT(0),
+#ifdef AMD76X_C3
+   .num_C2 = ATOMIC_INIT(0)
+#endif
 };
 
 struct cpu_stat {
int idle_count;
int C2_cnt;
+#ifdef AMD76X_C3
+   int C3_cnt;
+   int C2_active;
+#else
int _fill[2];
+#endif
 };
 
 static struct cpu_stat prs[NR_CPUS] __cacheline_aligned_in_smp;
+#endif
 
 struct watch_item {
int irq;
@@ -287,7 +315,13 @@
regdword &= 0xff80;
amd76x_pm_cfg.status_reg = (regdword + 0x00);
amd76x_pm_cfg.slp_reg =(regdword + 0x04);
+#ifdef AMD76X_NTH
+   amd76x_pm_cfg.NTH_reg =(regdword + 0x10);
+#endif
amd76x_pm_cfg.C2_reg = (regdword + 0x14);
+#ifdef AMD76X_C3
+   amd76x_pm_cfg.C3_reg = (regdword + 0x15);
+#endif
amd76x_pm_cfg.resume_reg = (regdword + 0x16); /* N/A for 768 */
 }
 
@@ -332,6 +366,52 @@
regdword &= ~((STPCLK_EN | CPUSLP_EN) << C2_REGS);
pci_write_config_dword(pdev_sb, 0x50, regdword);
 }
+#ifdef AMD76X_C3
+/*
+ * Untested C3 idle support for AMD-766.
+ */
+static void
+config_amd766_C3(int enable)
+{
+unsigned int regdword;
+
+/* Set C3 options in C3A50, page 63 in AMD-766 doc */
+pci_read_config_dword(pdev_sb, 0x50, &regdword);
+if(enable) {
+regdword &= ~((DCSTOP_EN | PCISTP_EN | SUSPND_EN | CPURST_EN)
+<< C3_REGS);
+regdword |= (STPCLK_EN  /* ~ 20 Watt savings max */
+ |  CPUSLP_EN   /* Additional ~ 70 Watts max! */
+ |  CPUSTP_EN)  /* yet more savings! */
+ << C3_REGS;
+}
+else
+regdword &= ~((STPCLK_EN | CPUSLP_EN | CPUSTP_EN) << C3_REGS);
+pci_write_config_dword(pdev_sb, 0x50, regdword);
+}
+#endif
+
+
+#ifdef AMD76X_POS
+static void
+config_amd766_POS(int enable)
+{
+   unsigned int regdword;
+
+   /* Set C3 options in C3A50, page 63 in AMD-766 doc */
+   pci_read_config_dword(pdev_sb, 0x50, &regdword);
+   if(enable) {
+   regdword &= ~((ZZ_CACHE_EN | CPURST_EN) << POS_REGS);
+   regdword |= ((DCSTOP_EN | STPCLK_EN | CPUSTP_EN | PCISTP_EN |
+   CPUSLP_EN | SUSPND_EN) << POS_REGS);
+   }
+   else
+   regdword ^= (0xff << POS_REGS);
+   pci_write_config_dword(pdev_sb, 0x50, regdword);
+}
+#endif
+
+
 
 /*
  * Configures the 765 & 766 southbridges.
@@ -342,6 +422,13 @@
amd76x_get_PM();
config_PMIO_amd76x(1, 1);
config_a

2.6.12-rc2: Promise SATA150 TX4 failures

2005-04-09 Thread Joerg Sommrey
Hi all,

just tried 2.6.12-rc2 and I still have the same errors from my SATA
disks as with 2.6.11.  The setup is a bit complex.  The relevant parts (I
think) are:

Adaptec AHA-2940UW SCSI-controller, attached are:
1 harddisk /dev/sda
1 DDS3 streamer /dev/st0

Promise SATA150 TX4 controller, attached are:
2 identical harddisks /dev/sdb and /dev/sdc

/dev/sda consists of the root partition, a swap partition and 4 other
partitions that are physical volumes for dm volume group /dev/vg1

/dev/sdb and /dev/sdc have two partitions each: the first partitions form
a RAID-0 array /dev/md0 and the second partitions form an md RAID-1 array
/dev/md1

/dev/md0 and /dev/md1 are the physical volumes for dm volume groups
/dev/vg2 and /dev/vg3 resp.

To trigger the failure:
- For all logical volumes in /dev/vg1, /dev/vg2 and /dev/vg3 a snapshot is
  created.
- All snapshots are mounted read-only in a "snapshot hierarchy" under
  /snap.
- A backup to tape is taken using something like:
  find /snap -print | cpio -oaH crc -F /dev/st0
  Backup must go to tape, no problem with /dev/null
- At this point, some additional i/o on the SATA disks causes the whole
  box to hang. Usually some errors are written to syslog; they are
  always similar:
Apr  9 01:30:35 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Apr  9 01:30:35 bear kernel: ata2: called with no error (51)!
Apr  9 01:30:35 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Apr  9 01:30:35 bear kernel: sdc: Current: sense key: Medium Error
Apr  9 01:30:35 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Apr  9 01:30:35 bear kernel: end_request: I/O error, dev sdc, sector 43100350
Apr  9 01:30:35 bear kernel: raid1: Disk failure on sdc2, disabling device.
Apr  9 01:30:35 bear kernel: ^IOperation continuing on 1 devices

The errors are always reported for /dev/sdc2, the second device of a
RAID-1 array.  After reboot I am able to raidhotadd the failed partition
without problems.
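
As a reading aid for the log above: status=0x51 is the ATA status
register with the bits 0x40, 0x10 and 0x01 set, which is exactly what the
"{ DriveReady SeekComplete Error }" text spells out.  A minimal sketch in
C, assuming the standard ATA status bit layout; the helper name
ata_status_names and the labels for the bits that do not show up in my
logs are made up for illustration and are not taken from the kernel:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write the names of all status bits that are set
 * into buf, separated by single spaces, and return buf.
 */
static char *ata_status_names(unsigned char status, char *buf, size_t len)
{
    /* Bit values follow the ATA status register layout. */
    static const struct {
        unsigned char bit;
        const char *name;
    } bits[] = {
        { 0x80, "Busy" },
        { 0x40, "DriveReady" },
        { 0x20, "DeviceFault" },
        { 0x10, "SeekComplete" },
        { 0x08, "DataRequest" },
        { 0x04, "CorrectedError" },
        { 0x02, "Index" },
        { 0x01, "Error" },
    };
    size_t i;

    if (len == 0)
        return buf;
    buf[0] = '\0';
    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(status & bits[i].bit))
            continue;
        /* Separate names with a space, never overrunning buf. */
        if (buf[0] != '\0')
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, bits[i].name, len - strlen(buf) - 1);
    }
    return buf;
}
```

Calling ata_status_names(0x51, buf, sizeof buf) gives the same three
names as the syslog line, so the drive reports itself ready but flags an
error on the command.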

The problem is 100% reproducible.

The hang is not a "hard hang": X keeps running, the watchdog doesn't hit
but no new processes can be started.  Syslog entries stop after some time
(from a few seconds to several minutes).

The problem appeared somewhere between 2.6.10 and 2.6.11.
2.6.10: ok
2.6.10-ac8: ok
2.6.10-ac11: failed
2.6.11: failed
2.6.12-rc2: failed
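
Read in the order listed, the results bracket the regression between
2.6.10-ac8 and 2.6.10-ac11.  A small sketch of picking that boundary out
of such a list; the struct and function names are hypothetical and the
list order is simply the one used above:

```c
#include <stddef.h>

/* One test result: kernel version string and whether the backup ran ok. */
struct kernel_result {
    const char *version;
    int ok;
};

/*
 * Return the index of the first failing entry in an ordered list, or -1
 * if every entry is ok.  The last good kernel is the entry just before
 * the returned index (if that index is greater than 0).
 */
static int first_bad(const struct kernel_result *r, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (!r[i].ok)
            return (int)i;
    return -1;
}
```

With the five results above this returns the index of 2.6.10-ac11, so
the change to look for went in after -ac8 and no later than -ac11.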

I'd be glad if there were a solution for this problem, as it prevents
me from using any newer kernel.

-jo

-- 
-rw-r--r--  1 jo users 63 2005-04-09 09:31 /home/jo/.signature


Re: [SATA] libata-dev queue updated

2005-03-04 Thread Joerg Sommrey
On Fri, Mar 04, 2005 at 11:06:23PM +0100, Joerg Sommrey wrote:
> On Fri, Mar 04, 2005 at 03:43:38PM -0500, Jeff Garzik wrote:
> > Joerg Sommrey wrote:
> > >On Fri, Mar 04, 2005 at 01:07:16PM -0500, Jeff Garzik wrote:
> > >
> > >>Joerg Sommrey wrote:
> > >>
> > >>>On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote:
> > >>>
> > >>>
> > >>>>Joerg Sommrey wrote:
> > >>>>
> > >>>>
> > >>>>>On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote:
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>>Joerg Sommrey wrote:
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>>Joerg Sommrey wrote:
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>>Jeff Garzik wrote:
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>>Patch:
> > >>>>>>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>Still not usable here.  The same errors as before when backing up:
> > >>>>>>>>
> > >>>>>>>>Please try 2.6.11 without any patches.
> > >>>>>>>
> > >>>>>>>Plain 2.6.11 doesn't work either.  All of 2.6.10-ac11, 2.6.11-rc5,
> > >>>>>>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with 
> > >>>>>>>the
> > >>>>>>>same symptoms. 
> > >>>>>>>
> > >>>>>>>Reverting to stable 2.6.10-ac8 :-)
> > >>>>>>
> > >>>>>>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix 
> > >>>>>>things?
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>>Still the same with this patch reverted.
> > >>>>
> > >>>>Does reverting the attached patch in 2.6.11 fix things?  (apply with 
> > >>>>patch -R)
> > >>>>
> > >>>>This patch reverts the entire libata back to 2.6.10.
> > >>>>
> > >>>
> > >>>I'm confused.  Still the same with everything reverted.  What shall I do
> > >>>now?
> > >>
> > >>Well, first, thanks for your patience in narrowing this down.
> > >>
> > >>This means we have eliminated libata as a problem source, but we still 
> > >>have the rest of the kernel to go through :)
> > >>
> > >>Try disabling ACPI with 'acpi=off' or 'pci=biosirq' to see if that fixes 
> > >>things.
> > >>
> > >
> > >I tried both settings with plain 2.6.11. Almost the same results, in my
> > >impression acpi=off causes the failure to appear even faster.
> > 
> > Just to make sure I have things right, please tell me if this is correct:
> > 
> > * 2.6.10 vanilla works
> > 
> > * 2.6.11 vanilla does not work
> > 
> > * 2.6.11 vanilla + 2.6.10 libata does not work
> >   [2.6.10 libata == reverting all libata changes]
> > 
> > Is that all correct?
> 
> Thanks for asking these precise questions.  After double-checking
> everything I found a typo in my configuration that changes things a bit.
> I repeated some tests and the correct answers are now:
> * 2.6.10 vanilla              works
> * 2.6.10-ac8                  works
> * 2.6.10-ac11                 does not work
> * 2.6.11 vanilla              does not work
> * 2.6.11 w/o promise.patch    does not work
> * 2.6.11 + 2.6.10 libata      works!
> 
> This looks much more consistent to me but brings the case back to
> libata.

After one more test using 2.6.11 + 2.6.10 libata I got some errors.
They are different, they end after some time and they don't lock the system:

Mar  4 23:15:00 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  4 23:15:00 bear kernel: sdb: Current: sense key: Recovered Error
Mar  4 23:15:00 bear kernel: ASC=0x26 <> ASCQ=0xc0
Mar  4 23:15:00 bear kernel: FMK, ILI

Got 1900 of these in 90 seconds and silence afterwards.  Maybe that
helps. I'll keep this kernel running and watch it.

-jo

-- 
-rw-r--r--  1 jo users 63 2005-03-04 23:12 /home/jo/.signature


Re: [SATA] libata-dev queue updated

2005-03-04 Thread Joerg Sommrey
On Fri, Mar 04, 2005 at 03:43:38PM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >On Fri, Mar 04, 2005 at 01:07:16PM -0500, Jeff Garzik wrote:
> >
> >>Joerg Sommrey wrote:
> >>
> >>>On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote:
> >>>
> >>>
> >>>>Joerg Sommrey wrote:
> >>>>
> >>>>
> >>>>>On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>>Joerg Sommrey wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>Joerg Sommrey wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>Jeff Garzik wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>>Patch:
> >>>>>>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>Still not usable here.  The same errors as before when backing up:
> >>>>>>>>
> >>>>>>>>Please try 2.6.11 without any patches.
> >>>>>>>
> >>>>>>>Plain 2.6.11 doesn't work either.  All of 2.6.10-ac11, 2.6.11-rc5,
> >>>>>>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with 
> >>>>>>>the
> >>>>>>>same symptoms. 
> >>>>>>>
> >>>>>>>Reverting to stable 2.6.10-ac8 :-)
> >>>>>>
> >>>>>>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix 
> >>>>>>things?
> >>>>>>
> >>>>>
> >>>>>
> >>>>>Still the same with this patch reverted.
> >>>>
> >>>>Does reverting the attached patch in 2.6.11 fix things?  (apply with 
> >>>>patch -R)
> >>>>
> >>>>This patch reverts the entire libata back to 2.6.10.
> >>>>
> >>>
> >>>I'm confused.  Still the same with everything reverted.  What shall I do
> >>>now?
> >>
> >>Well, first, thanks for your patience in narrowing this down.
> >>
> >>This means we have eliminated libata as a problem source, but we still 
> >>have the rest of the kernel to go through :)
> >>
> >>Try disabling ACPI with 'acpi=off' or 'pci=biosirq' to see if that fixes 
> >>things.
> >>
> >
> >I tried both settings with plain 2.6.11. Almost the same results, in my
> >impression acpi=off causes the failure to appear even faster.
> 
> Just to make sure I have things right, please tell me if this is correct:
> 
> * 2.6.10 vanilla works
> 
> * 2.6.11 vanilla does not work
> 
> * 2.6.11 vanilla + 2.6.10 libata does not work
>   [2.6.10 libata == reverting all libata changes]
> 
> Is that all correct?

Thanks for asking these precise questions.  After double-checking
everything I found a typo in my configuration that changes things a bit.
I repeated some tests and the correct answers are now:
* 2.6.10 vanilla              works
* 2.6.10-ac8                  works
* 2.6.10-ac11                 does not work
* 2.6.11 vanilla              does not work
* 2.6.11 w/o promise.patch    does not work
* 2.6.11 + 2.6.10 libata      works!

This looks much more consistent to me but brings the case back to
libata.

-jo

-- 
-rw-r--r--  1 jo users 63 2005-03-04 22:48 /home/jo/.signature


Re: [SATA] libata-dev queue updated

2005-03-04 Thread Joerg Sommrey
On Fri, Mar 04, 2005 at 01:07:16PM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote:
> >
> >>Joerg Sommrey wrote:
> >>
> >>>On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote:
> >>>
> >>>
> >>>>Joerg Sommrey wrote:
> >>>>
> >>>>
> >>>>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>>Joerg Sommrey wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>Jeff Garzik wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>Patch:
> >>>>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >>>>>>>
> >>>>>>>
> >>>>>>>Still not usable here.  The same errors as before when backing up:
> >>>>>>
> >>>>>>Please try 2.6.11 without any patches.
> >>>>>
> >>>>>Plain 2.6.11 doesn't work either.  All of 2.6.10-ac11, 2.6.11-rc5,
> >>>>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the
> >>>>>same symptoms. 
> >>>>>
> >>>>>Reverting to stable 2.6.10-ac8 :-)
> >>>>
> >>>>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix 
> >>>>things?
> >>>>
> >>>
> >>>
> >>>Still the same with this patch reverted.
> >>
> >>Does reverting the attached patch in 2.6.11 fix things?  (apply with 
> >>patch -R)
> >>
> >>This patch reverts the entire libata back to 2.6.10.
> >>
> >
> >I'm confused.  Still the same with everything reverted.  What shall I do
> >now?
> 
> Well, first, thanks for your patience in narrowing this down.
> 
> This means we have eliminated libata as a problem source, but we still 
> have the rest of the kernel to go through :)
> 
> Try disabling ACPI with 'acpi=off' or 'pci=biosirq' to see if that fixes 
> things.
> 
I tried both settings with plain 2.6.11. Almost the same results, in my
impression acpi=off causes the failure to appear even faster.

-jo

-- 
-rw-r--r--  1 jo users 63 2005-03-04 20:54 /home/jo/.signature


Re: [SATA] libata-dev queue updated

2005-03-04 Thread Joerg Sommrey
On Fri, Mar 04, 2005 at 02:10:14AM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote:
> >
> >>Joerg Sommrey wrote:
> >>
> >>>On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> >>>
> >>>
> >>>>Joerg Sommrey wrote:
> >>>>
> >>>>
> >>>>>Jeff Garzik wrote:
> >>>>>
> >>>>>
> >>>>>>Patch:
> >>>>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >>>>>
> >>>>>
> >>>>>Still not usable here.  The same errors as before when backing up:
> >>>>
> >>>>Please try 2.6.11 without any patches.
> >>>
> >>>Plain 2.6.11 doesn't work either.  All of 2.6.10-ac11, 2.6.11-rc5,
> >>>2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the
> >>>same symptoms. 
> >>>
> >>>Reverting to stable 2.6.10-ac8 :-)
> >>
> >>Does reverting the attached patch in 2.6.11 (apply with patch -R) fix 
> >>things?
> >>
> >
> >
> >Still the same with this patch reverted.
> 
> Does reverting the attached patch in 2.6.11 fix things?  (apply with 
> patch -R)
> 
> This patch reverts the entire libata back to 2.6.10.
> 
I'm confused.  Still the same with everything reverted.  What shall I do
now?

-jo

-- 
-rw-r--r--  1 jo users 63 2005-03-04 18:44 /home/jo/.signature


Re: [SATA] libata-dev queue updated

2005-03-03 Thread Joerg Sommrey
On Thu, Mar 03, 2005 at 11:09:26PM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> >
> >>Joerg Sommrey wrote:
> >>
> >>>Jeff Garzik wrote:
> >>>
> >>>>Patch:
> >>>>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >>>
> >>>
> >>>Still not usable here.  The same errors as before when backing up:
> >>
> >>Please try 2.6.11 without any patches.
> >
> >Plain 2.6.11 doesn't work either.  All of 2.6.10-ac11, 2.6.11-rc5,
> >2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the
> >same symptoms. 
> >
> >Reverting to stable 2.6.10-ac8 :-)
> 
> Does reverting the attached patch in 2.6.11 (apply with patch -R) fix 
> things?
> 

Still the same with this patch reverted.
-jo

-- 
-rw-r--r--  1 jo users 63 2005-03-04 07:32 /home/jo/.signature


Re: [SATA] libata-dev queue updated

2005-03-03 Thread Joerg Sommrey
On Wed, Mar 02, 2005 at 05:43:59PM -0500, Jeff Garzik wrote:
> Joerg Sommrey wrote:
> >Jeff Garzik wrote:
> >>Patch:
> >>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2
> >
> >
> >Still not usable here.  The same errors as before when backing up:
> 
> Please try 2.6.11 without any patches.
Plain 2.6.11 doesn't work either.  All of 2.6.10-ac11, 2.6.11-rc5,
2.6.11-rc5 + 2.6.11-rc5-bk4-libata-dev1.patch and 2.6.11 fail with the
same symptoms. 

Reverting to stable 2.6.10-ac8 :-)

-jo

-- 
-rw-r--r--  1 jo users 63 2005-03-03 20:23 /home/jo/.signature


Re: [SATA] libata-dev queue updated

2005-03-02 Thread Joerg Sommrey
Jeff Garzik wrote:

>BK users:

>   bk pull bk://gkernel.bkbits.net/libata-dev-2.6

>Patch:
>http://www.kernel.org/pub/linux/kernel/people/jgarzik/libata/2.6.11-rc5-bk4-libata-dev1.patch.bz2

Still not usable here.  The same errors as before when backing up:

Mar  2 21:09:50 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar  2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar  2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar  2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar  2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar  2 21:09:51 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar  2 21:09:51 bear kernel: sdb: Current: sense key: Medium Error
Mar  2 21:09:51 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar  2 21:09:51 bear kernel: end_request: I/O error, dev sdb, sector 43099350
Mar  2 21:09:51 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  2 21:09:51 bear kernel: ata1: called with no error (51)!
Mar  2 21:09:52 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar  2 21:09:52 bear kernel: sdb: Current: sense key: Medium Error
Mar  2 21:09:52 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar  2 21:09:52 bear kernel: end_request: I/O error, dev sdb, sector 43099358
Mar  2 21:09:52 bear kernel: raid1: Disk failure on sdb2, disabling device.
Mar  2 21:09:52 bear kernel: ^IOperation continuing on 1 devices
Mar  2 21:09:52 bear kernel: raid1: sdb2: rescheduling sector 2904720
Mar  2 21:09:52 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  2 21:09:52 bear kernel: ata1: called with no error (51)!
Mar  2 21:09:52 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar  2 21:09:52 bear kernel: sdb: Current: sense key: Medium Error
Mar  2 21:09:52 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed

Using Promise SATA150 TX4 / md-raid1 / lvm / reiserfs

-jo
-- 
-rw-r--r--  1 jo users 63 2005-03-02 21:14 /home/jo/.signature


2.6.11-rc5: Promise SATA150 TX4 failure

2005-02-28 Thread Joerg Sommrey
Hi all,

a problem that was introduced between 2.6.10-ac9 and 2.6.10-ac11 made
its way into 2.6.11-rc5.  While taking a backup onto a SCSI streamer, one
of my RAID1 arrays gets corrupted.  Afterwards the system hangs and
isn't even bootable.  I need to raidhotadd the failed partition in
single-user mode to get the box working again.  Error messages:

Mar  1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:15 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:15 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:15 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Mar  1 01:46:15 bear kernel: sdc: Current: sense key: Medium Error
Mar  1 01:46:15 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar  1 01:46:15 bear kernel: end_request: I/O error, dev sdc, sector 52694606
Mar  1 01:46:15 bear kernel: raid1: Disk failure on sdc2, disabling device.
Mar  1 01:46:15 bear kernel: ^IOperation continuing on 1 devices
Mar  1 01:46:15 bear kernel: raid1: sdc2: rescheduling sector 12499976
Mar  1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Mar  1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error
Mar  1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar  1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694614
Mar  1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Mar  1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error
Mar  1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar  1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694622
Mar  1 01:46:16 bear kernel: raid1: sdc2: rescheduling sector 12499984
Mar  1 01:46:16 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata2: called with no error (51)!
Mar  1 01:46:16 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Mar  1 01:46:16 bear kernel: sdc: Current: sense key: Medium Error
Mar  1 01:46:16 bear kernel: Additional sense: Unrecovered read error - auto reallocate failed
Mar  1 01:46:16 bear kernel: end_request: I/O error, dev sdc, sector 52694630
Mar  1 01:46:16 bear kernel: raid1: sdc2: rescheduling sector 1250
Mar  1 01:46:16 bear kernel: RAID1 conf printout:
Mar  1 01:46:16 bear kernel:  --- wd:1 rd:2
Mar  1 01:46:16 bear kernel:  disk 0, wo:0, o:1, dev:sdb2
Mar  1 01:46:16 bear kernel:  disk 1, wo:1, o:0, dev:sdc2
Mar  1 01:46:16 bear kernel: RAID1 conf printout:
Mar  1 01:46:16 bear kernel:  --- wd:1 rd:2
Mar  1 01:46:16 bear kernel:  disk 0, wo:0, o:1, dev:sdb2
Mar  1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 12499976 to another mirror
Mar  1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 12499984 to another mirror
Mar  1 01:46:16 bear kernel: raid1: sdb2: redirecting sector 1250 to another mirror
Mar  1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar  1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar  1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar  1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar  1 01:46:16 bear kernel: ata1: status=0x51 { DriveReady SeekComplete Error }
Mar  1 01:46:16 bear kernel: ata1: called with no error (51)!
Mar  1 01:46:16 bear kernel: SCSI error : <1 0 0 0> return code = 0x802
Mar  1 01:46:16 bear kernel: sdb: Current: sense key: Medium Error

etc. until hard reboot.

The failing array consists of two partitions of two SATA disks connected
to a Promise SATA150 TX4 controller.

-jo

-- 
-rw-r--r--  1 jo users 63 2005-03-01 02:26 /home/jo/.signature

Re: Question on CONFIG_IRQBALANCE / 2.6.x

2005-02-18 Thread Joerg Sommrey
On Fri, Feb 18, 2005 at 02:39:49PM -0800, Martin J. Bligh wrote:
> > 
> > there's something I don't understand:  With IRQBALANCE *enabled* almost
> > all interrupts are processed on CPU0.  This changed in an unexpected way
> > after disabling IRQBALANCE: now all interrupts are distributed uniformly
> > to both CPUs.  Maybe it's intentional, but it's not what I expect when a
> > config option named IRQBALANCE is *disabled*.
> > 
> > Can anybody comment on this?
> 
> If you have a Pentium 3 based system, by default they'll round robin.
> If you turn on IRQbalance, they won't move until the traffic gets high
> enough load to matter. That's presumably what you're seeing.

It's an Athlon box that probably has the same behaviour.  Just another
question on this topic:  with IRQBALANCE enabled, almost all interrupts
are routed to CPU0.  Lately irq 0 runs on CPU1 and never returns to CPU0
- is there any obvious reason for that?

-jo

-- 
-rw-r--r--  1 jo users 63 2005-02-18 23:29 /home/jo/.signature


Question on CONFIG_IRQBALANCE / 2.6.x

2005-02-18 Thread Joerg Sommrey
Hi all,

there's something I don't understand:  With IRQBALANCE *enabled* almost
all interrupts are processed on CPU0.  This changed in an unexpected way
after disabling IRQBALANCE: now all interrupts are distributed uniformly
to both CPUs.  Maybe it's intentional, but it's not what I expect when a
config option named IRQBALANCE is *disabled*.

Can anybody comment on this?

Thanks,
-jo

-- 
-rw-r--r--  1 jo users 63 2005-02-18 21:21 /home/jo/.signature


Re: Problem on SATA-disk with Promise SATAII 150 TX4 ("DriveReady SeekComplete Error")

2005-02-12 Thread Joerg Sommrey
Jeff Garzik wrote:
>Johannes Resch wrote:
>> Hi,
>> 
>> [please CC me on replies]
>> 
>> I've got a box running 2.6.10 (with the patch[0] needed to support the 
>> Promise SATAII 150 TX4 controller).
>> This box has three software raid1 partitions mirrored on a SATA disk on 
>> the Promise controller and a disk on the mainboard IDE controller (VIA 
>> vt8235).
>> 
>> Within 4 days running the raid1, I got those three errors pasted below, 
>> each marking the SATA-raidmember as faulty. After "raidhotremove" and 
>> "raidhotadd" the SATA-raidmember syncs again fine and works at least a 
>> day until it is marked as faulty again.
>> 
>> Any pointers where I could look at to resolve this problem?
>> The SATA drive is a new Seagate ST3250823AS.

>I would change out your cables, and also make sure you are running 
>2.6.11-rc3-bk-latest, which includes all the SATAII patches and other fixes.

I don't believe it has anything to do with cabling.  2.6.10-ac9 introduced
some sata patches.  I didn't check -ac9 and -ac10, but -ac11 and -ac12 are
not usable on my box with exactly the same symptoms.

-jo
-- 
-rw-r--r--  1 jo users 63 2005-02-12 18:43 /home/jo/.signature


Need advice on amd76x_pm [patch included]

2005-02-08 Thread Joerg Sommrey
note
+ * in amd76x_smp_idle(). I've noticed that when NTH and idling are both
+ * enabled, my hardware locks and requires a hard reset, so I have
+ * #ifndefed around the idle loop setting to prevent this. POS locks it up
+ * too, both ought to be fixable. I've also noticed that idling and NTH
+ * make some interference that is picked up by the onboard sound chip on
+ * my ASUS A7M266-D motherboard.
+ *
+ *   20030601: Pasi Savolainen
+ *  Simple port to 2.5
+ *  Added sysfs interface for making nice graphs with mrtg.
+ *  Look for /sys/devices/pci0/00:00.0/C2_cnt & lazy_idle (latter writable)
+ *
+ *   20041204: Joerg Sommrey (jo)
+ *  trying to enable preemption
+ *  added C3-count to sysfs
+ *  renamed module parm from "l" to "lazy_idle"
+ *  added some "dummy op" after return from C2 and C3
+ *  Note: using this module on my S2466 makes the system clock somewhat
+ *  unstable. After playing with some RR-priorities and processor
+ *  affinities I managed to reduce ntpd's time offsets to about
+ *  4ms. Without this module time offsets are in the range of 1-2ms.
+ *
+ * TODO: Thermal throttling (TTH).
+ *  /proc interface for normal throttling level.
+ *  /proc interface for POS.
+ *
+ *
+ *
+ *
+ * Processor idle mode module for AMD SMP 760MP(X) based systems
+ *
+ * Copyright (C) 2002 Tony Lindgren <[EMAIL PROTECTED]>
+ *Johnathan Hicks (768 support)
+ *
+ * Using this module saves about 70 - 90W of energy in the idle mode compared
+ * to the default idle mode. Waking up from the idle mode is fast to keep the
+ * system response time good. Currently no CPU load calculation is done, the
+ * system exits the idle mode if the idle function runs twice on the same
+ * processor in a row. This only works on SMP systems, but maybe the idle mode
+ * enabling can be integrated to ACPI to provide C2 mode at some point.
+ *
+ * NOTE: Currently there's a bug somewhere where reading P_LVL2 for the
+ *   first time causes the system to sleep instead of idling. This
+ *   means that you need to hit the power button once to wake the
+ *   system after loading the module for the first time after reboot.
+ *   After that the system idles as expected.
+ *
+ *
+ * Influenced by Vcool, and LVCool. Rewrote everything from scratch to
+ * use the PCI features in Linux, and to support SMP systems.
+ * 
+ * Currently only tested on a TYAN S2460 (760MP) system (Tony) and an
+ * ASUS A7M266-D (760MPX) system (Johnathan). Adding support for other Athlon
+ * SMP or single processor systems should be easy if desired.
+ *
+ * This software is licensed under GNU General Public License Version 2 
+ * as specified in file COPYING in the Linux kernel source tree main 
+ * directory.
+ * 
+ *   
+ */
+
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#define VERSION "20041204"
+
+// #define AMD76X_C3  1
+// #define AMD76X_NTH 1
+// #define AMD76X_POS 1
+// #define AMD76X_PREEMPT_DISABLE 1
+#define AMD76X_IRQ_DISABLE 1
+#define AMD76X_DUMMY_OP 1
+
+
+extern void default_idle(void);
+static void amd76x_smp_idle(void);
+static int amd76x_pm_main(void);
+
+unsigned long lazy_idle = 0;
+
+/* jo: make some compile time warnings about deprecation go away */
+module_param(lazy_idle, ulong, 0);
+MODULE_PARM_DESC(lazy_idle, "number of idle cycles before entering C2");
+
+static struct pci_dev *pdev_nb;
+static struct pci_dev *pdev_sb;
+
+struct PM_cfg {
+   unsigned int status_reg;
+   unsigned int C2_reg;
+   unsigned int C3_reg;
+   unsigned int NTH_reg;
+   unsigned int slp_reg;
+   unsigned int resume_reg;
+   void (*orig_idle) (void);
+   void (*curr_idle) (void);
+   unsigned long C2_cnt, C3_cnt, idle_cnt;
+   int last_pr;
+};
+
+static struct PM_cfg amd76x_pm_cfg;
+
+struct cpu_idle_state {
+   int idle;
+   int count;
+};
+static struct cpu_idle_state prs[2];
+
+static struct pci_device_id  __devinitdata amd_nb_tbl[] = {
+   {PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_FE_GATE_700C, PCI_ANY_ID, PCI_ANY_ID,},
+   {0,}
+};
+
+static struct pci_device_id  __devinitdata amd_sb_tbl[] = {
+   {PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_VIPER_7413, PCI_ANY_ID, PCI_ANY_ID,},
+   {PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_VIPER_7443, PCI_ANY_ID, PCI_ANY_ID,},
+   {0,}
+};
+
+/*
+ * Configures the AMD-762 northbridge to support PM calls
+ */
+static int
+config_amd762(int enable)
+{
+   unsigned int regdword;
+
+   /* Enable STPGNT in BIU Status/Control for cpu0 */
+   pci_read_config_dword(pdev_nb, 0x60, &regdword);
+   regdword |= (1 << 17);
+   pci_write_config_dword(pdev_nb, 0x60, regdword);
+
+   /* Enable STPGNT in BIU Status/Control for cpu1 */
+   pci_read_config_dword(pdev_nb, 0x68, &regdwo

strange repeating keys and irq 0 routing on 2.6.x

2005-02-08 Thread Joerg Sommrey
Hi all,

a few times in the last couple of months I experienced some strange key
repeating in X.  A single key-press sometimes results in 2-8 typed
keys.  This started around 2.6.8 but I'm not sure.  It was last seen on
2.6.10-ac8.

There were similar problems reported for Toshiba laptops and within XFree,
but my problem seems to be different.

The key repeating starts some hours after reboot and it gets
worse with time.  After one day the keyboard is not usable anymore.
At the same time ntpd often fails reading the radio clock attached to a
serial port. 

It doesn't happen after every reboot.  BUT: Sometimes IRQ 0 is processed
on CPU1 after reboot (though /proc/irq/0/smp_affinity is 3 and all other
IRQs are handled on CPU0).  In these cases the repeating keys appear.
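
For reference, the value in /proc/irq/0/smp_affinity is a hexadecimal CPU
bitmask with bit n standing for CPUn, so 3 permits both CPU0 and CPU1 and
does not pin the timer interrupt to either one.  A minimal sketch of that
reading in C, assuming a two-CPU box where the mask fits into a single
hex word; the function name affinity_allows_cpu is made up for
illustration:

```c
#include <stdlib.h>

/*
 * Hypothetical helper: return 1 if the hex mask string, as read from
 * smp_affinity, allows delivery to the given CPU, and 0 otherwise.
 */
static int affinity_allows_cpu(const char *hex_mask, unsigned int cpu)
{
    unsigned long mask = strtoul(hex_mask, NULL, 16);

    /* A CPU number beyond the width of the mask can never be set. */
    if (cpu >= sizeof(mask) * 8)
        return 0;
    return (int)((mask >> cpu) & 1UL);
}
```

With the mask "3" this returns 1 for CPU0 and for CPU1; a mask of "1"
would restrict the interrupt to CPU0 only.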

The box has two Athlon MPs.

What can I do to gather more information?

-jo

-- 
-rw-r--r--  1 jo users 63 2005-01-10 08:33 /home/jo/.signature


Re: Linux 2.6.10-ac12

2005-02-08 Thread Joerg Sommrey
On Sun, Feb 06, 2005 at 04:02:42PM +, Alan Cox wrote:

>Arjan van de Ven is now building RPMS of the kernel and those can be found
>in the RPM subdirectory and should be yum-able. Expect the RPMS to lag the
>diff a little as the RPM builds and tests do take time.

>Nothing terribly exciting here security wise but various bugs for problems
>people have been hitting that are now fixed upstream, and also the ULi
>tulip variant should now work. If you are running IPv6 you may well want
>the networking fixes.

Something broke my box after 2.6.10-ac8.  Both -ac11 and -ac12 cause
problems when backing up sata/md-raid/dm-snapshots/reiserfs to a SCSI
tape drive.  (Works fine when writing to /dev/null :-)

Backup started at 1:30 a.m.; the error messages start with:

Feb  8 01:34:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Feb  8 01:34:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Feb  8 01:34:07 bear kernel: FMK Current sdc: sense = 70 99
Feb  8 01:34:07 bear kernel: ASC=26 ASCQ=c0
Feb  8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294166
Feb  8 01:34:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Feb  8 01:34:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Feb  8 01:34:07 bear kernel: FMK Current sdc: sense = 70 99
Feb  8 01:34:07 bear kernel: ASC=26 ASCQ=c0
Feb  8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294174
Feb  8 01:34:07 bear kernel: raid1: Disk failure on sdc2, disabling device.
Feb  8 01:34:07 bear kernel: ^IOperation continuing on 1 devices
Feb  8 01:34:07 bear kernel: raid1: sdc2: rescheduling sector 12099536
Feb  8 01:34:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Feb  8 01:34:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Feb  8 01:34:07 bear kernel: FMK Current sdc: sense = 70 99
Feb  8 01:34:07 bear kernel: ASC=26 ASCQ=c0
Feb  8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294182
Feb  8 01:34:07 bear kernel: raid1: sdc2: rescheduling sector 12099552
etc. until hard reboot via sysrq-b
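
The failing sectors in the excerpt above advance in steps of 8.  A small
sketch for pulling the sector numbers out of such a log and checking the
stride is shown below; the regular expression and the function name are
illustrative, and the log lines are copied from the excerpt.

```python
import re

# Matches the "end_request: I/O error" lines as they appear in the excerpt.
SECTOR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def failed_sectors(log_text):
    """Return (device, sector) pairs for every end_request I/O error line."""
    return [(m.group(1), int(m.group(2))) for m in SECTOR_RE.finditer(log_text)]


sample = """\
Feb  8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294166
Feb  8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294174
Feb  8 01:34:07 bear kernel: end_request: I/O error, dev sdc, sector 52294182
"""

pairs = failed_sectors(sample)
sectors = [sector for _, sector in pairs]
# Distance between consecutive failing sectors.
steps = [b - a for a, b in zip(sectors, sectors[1:])]
print(steps)  # [8, 8]
```

Assuming 512-byte sectors, a stride of 8 corresponds to 4 KiB per failed
request, which would suggest every request in sequence fails rather than
isolated bad blocks.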

I found only one patch that seems to be related:
>2.6.10-ac11
>*  Fix oops with md over dm (Jens Axboe)

Can this by any chance cause my problems?

-jo

-- 
-rw-r--r--  1 jo users 63 2005-02-08 01:47 /home/jo/.signature


2.6.10-ac11 causes failures on sata/md/raid

2005-02-01 Thread Joerg Sommrey
Hello,

2.6.10-ac11 causes strange errors on a sata-md-raid-1. It starts like this:

Jan 31 23:21:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 31 23:21:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Jan 31 23:21:07 bear kernel: FMK Current sdc: sense = 70 99
Jan 31 23:21:07 bear kernel: ASC=26 ASCQ=c0
Jan 31 23:21:07 bear kernel: end_request: I/O error, dev sdc, sector 50888046
Jan 31 23:21:07 bear kernel: raid1: Disk failure on sdc2, disabling device.
Jan 31 23:21:07 bear kernel: ^IOperation continuing on 1 devices
Jan 31 23:21:07 bear kernel: raid1: sdc2: rescheduling sector 10693416
Jan 31 23:21:07 bear kernel: raid1: sdb2: redirecting sector 10693416 to another mirror
Jan 31 23:21:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 31 23:21:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Jan 31 23:21:07 bear kernel: FMK Current sdc: sense = 70 99
Jan 31 23:21:07 bear kernel: ASC=26 ASCQ=c0
Jan 31 23:21:07 bear kernel: end_request: I/O error, dev sdc, sector 50888054
Jan 31 23:21:07 bear kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 31 23:21:07 bear kernel: SCSI error : <2 0 0 0> return code = 0x802
Jan 31 23:21:07 bear kernel: FMK Current sdc: sense = 70 99
Jan 31 23:21:07 bear kernel: ASC=26 ASCQ=c0
Jan 31 23:21:07 bear kernel: end_request: I/O error, dev sdc, sector 50888062
Jan 31 23:21:07 bear kernel: raid1: sdc2: rescheduling sector 10693424
Jan 31 23:21:07 bear kernel: raid1: sdb2: redirecting sector 10693424 to another mirror

These messages are repeated forever (with different sector numbers).
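
As a reading aid for the status=0x51 value in the log: the ATA status
register is a bit field, and 0x51 is the combination of the three flags the
kernel prints next to it.  The sketch below decodes it; the three names in
the log come from the excerpt, while the remaining names follow the
conventional ATA status bit layout and are included here as an assumption
about how the kernel would label them.

```python
# ATA status register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_ata_status(status):
    """Return the names of all status bits set in the given byte."""
    return [name for bit, name in ATA_STATUS_BITS if status & bit]


print(decode_ata_status(0x51))  # ['DriveReady', 'SeekComplete', 'Error']
```

The Error bit only says that the command failed; the cause is reported
separately, here through the sense data lines that follow each status line.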

I am able to boot into single user mode. Hotadding the "failed"
partition /dev/sdc2 to /dev/md1 works w/o problems.  No problems
accessing the filesystem on /dev/md1.  But when I enter runlevel 3 the
same errors appear and make the system again unusable.

After reverting to 2.6.10-ac8 everything works fine again.

-jo
-- 
-rw-r--r--  1 jo users 63 2005-02-01 18:44 /home/jo/.signature

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.10-ac11
# Sat Jan 29 13:15:21 2005
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=18
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
# CONFIG_IKCONFIG is not set
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
CONFIG_X86_HZ=1000
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
# CONFIG_HPET_TIMER is not set
CONFIG_SMP=y
CONFIG_NR_CPUS=2
# CONFIG_SCHED_SMT is not set
CONFIG_PREEMPT=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m

#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION