Re: ACPI error messages on Lenovo W540

2014-08-17 Thread Eric McCorkle
Finally got time to do some more poking around regarding this.  I 
haven't read the entire ACPI spec, so bear with me...


As John Baldwin said, the wrapper nvidia_acpi.c passes in a buffer 
instead of a package.  In the definition for _DSM, you can see calls to 
NVOP, NVPS, and NBCI.  I looked at all three, and they seem to treat 
Arg3 as a buffer as well (they call CreateField on it, which according 
to the ACPI spec, should take a buffer argument).


So it looks like both the nvidia driver as well as the ALS code are 
pretty thoroughly convinced that Arg3 (the fourth arg) is a buffer 
instead of a package, despite what the spec says about what it should be.


Could this possibly be fixed by adding some kind of quirk to the ACPI 
execution engine that says _DSM's fourth argument is a buffer?


On 06/24/2014 20:51, Eric McCorkle wrote:


It looks like this is the definition:

 Method (_DSM, 4, NotSerialized)  // _DSM:
Device-Specific Method
 {
 If (\CMPB (Arg0, Buffer (0x10)
 {
 /*  */   0xF8, 0xD8, 0x86,
0xA4, 0xDA, 0x0B, 0x1B, 0x47,
 /* 0008 */   0xA7, 0x2B, 0x60,
0x42, 0xA6, 0xB5, 0xBE, 0xE0
 }))
 {
 Return (NVOP (Arg0, Arg1, Arg2, Arg3))
 }

 If (\CMPB (Arg0, Buffer (0x10)
 {
 /*  */   0x01, 0x2D, 0x13,
0xA3, 0xDA, 0x8C, 0xBA, 0x49,
 /* 0008 */   0xA5, 0x2E, 0xBC,
0x9D, 0x46, 0xDF, 0x6B, 0x81
 }))
 {
 Return (NVPS (Arg0, Arg1, Arg2, Arg3))
 }

 If (\WIN8)
 {
 If (\CMPB (Arg0, Buffer (0x10)
 {
 /*  */   0x75, 0x0B, 0xA5,
0xD4, 0xC7, 0x65, 0xF7, 0x46,
 /* 0008 */   0xBF, 0xB7, 0x41,
0x51, 0x4C, 0xEA, 0x02, 0x44
 }))
 {
 Return (NBCI (Arg0, Arg1, Arg2, Arg3))
 }
 }

 Return (Buffer (0x04)
 {
  0x01, 0x00, 0x00, 0x80
 })
 }

Not sure if that's helpful at all...


Ah, the nvidia driver calls _DSM and it has the bug.  In its
nvidia_acpi.c file
it uses a Buffer instead of a Package for the fourth argument to
_DSM.  OTOH, the
warning doesn't prevent the method from running, so the warning may be
harmless.
You can try contacting the nvidia folks to tell them about the warning
and point
out the bug in their _DSM wrapper in nvidia_acpi.c to see what they say.


Will do.  Is there a preferred point of contact?

___
freebsd-acpi@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
To unsubscribe, send any mail to freebsd-acpi-unsubscr...@freebsd.org


Re: ACPI error messages on Lenovo W540

2014-08-17 Thread Eric McCorkle
Actually, I managed to get a kernel panic to happen, which provides some 
more information:


NVRM: GPU at :01:00.0 has fallen off the bus.
NVRM: RmInitAdapter failed! (0x25:0x28:1169)
nvidia0: NVRM: rm_init_adapter() failed!


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x8
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x817eeeb5
stack pointer   = 0x28:0xfe021d481420
frame pointer   = 0x28:0xfe021d481500
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 3
current process = 73254 (Xorg)
trap number = 12
panic: page fault
cpuid = 2
Uptime: 12h17m8s

It looks like rm_init_adapter is contained in nvidia's .o file that 
comes with the driver, so I'm out of luck for trying to track it down. 
I can't tell whether this is caused by the ACPI errors or not...


On 08/17/2014 08:31, Eric McCorkle wrote:

Finally got time to do some more poking around regarding this.  I
haven't read the entire ACPI spec, so bear with me...

As John Baldwin said, the wrapper nvidia_acpi.c passes in a buffer
instead of a package.  In the definition for _DSM, you can see calls to
NVOP, NVPS, and NBCI.  I looked at all three, and they seem to treat
Arg3 as a buffer as well (they call CreateField on it, which according
to the ACPI spec, should take a buffer argument).

So it looks like both the nvidia driver as well as the ALS code are
pretty thoroughly convinced that Arg3 (the fourth arg) is a buffer
instead of a package, despite what the spec says about what it should be.

Could this possibly be fixed by adding some kind of quirk to the ACPI
execution engine that says _DSM's fourth argument is a buffer?

On 06/24/2014 20:51, Eric McCorkle wrote:


It looks like this is the definition:

 Method (_DSM, 4, NotSerialized)  // _DSM:
Device-Specific Method
 {
 If (\CMPB (Arg0, Buffer (0x10)
 {
 /*  */   0xF8, 0xD8, 0x86,
0xA4, 0xDA, 0x0B, 0x1B, 0x47,
 /* 0008 */   0xA7, 0x2B, 0x60,
0x42, 0xA6, 0xB5, 0xBE, 0xE0
 }))
 {
 Return (NVOP (Arg0, Arg1, Arg2, Arg3))
 }

 If (\CMPB (Arg0, Buffer (0x10)
 {
 /*  */   0x01, 0x2D, 0x13,
0xA3, 0xDA, 0x8C, 0xBA, 0x49,
 /* 0008 */   0xA5, 0x2E, 0xBC,
0x9D, 0x46, 0xDF, 0x6B, 0x81
 }))
 {
 Return (NVPS (Arg0, Arg1, Arg2, Arg3))
 }

 If (\WIN8)
 {
 If (\CMPB (Arg0, Buffer (0x10)
 {
 /*  */   0x75, 0x0B, 0xA5,
0xD4, 0xC7, 0x65, 0xF7, 0x46,
 /* 0008 */   0xBF, 0xB7, 0x41,
0x51, 0x4C, 0xEA, 0x02, 0x44
 }))
 {
 Return (NBCI (Arg0, Arg1, Arg2, Arg3))
 }
 }

 Return (Buffer (0x04)
 {
  0x01, 0x00, 0x00, 0x80
 })
 }

Not sure if that's helpful at all...


Ah, the nvidia driver calls _DSM and it has the bug.  In its
nvidia_acpi.c file
it uses a Buffer instead of a Package for the fourth argument to
_DSM.  OTOH, the
warning doesn't prevent the method from running, so the warning may be
harmless.
You can try contacting the nvidia folks to tell them about the warning
and point
out the bug in their _DSM wrapper in nvidia_acpi.c to see what they say.


Will do.  Is there a preferred point of contact?

___
freebsd-acpi@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
To unsubscribe, send any mail to freebsd-acpi-unsubscr...@freebsd.org

___
freebsd-acpi@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-acpi
To unsubscribe, send any mail to freebsd-acpi-unsubscr...@freebsd.org


Re: [PATCH] ACPI CMOS region support - submission?

2014-08-17 Thread Anthony Jenkins
Thanks for the critique guys, comments inline.  Sorry it took so long...

On 08/05/2014 16:49, Bruce Evans wrote:
 On Tue, 5 Aug 2014, Ian Smith wrote:

 On Sat, 2 Aug 2014 18:10:05 -0400, Anthony Jenkins wrote:

  I made a few minor changes since the last incarnation:
 
   - Defined the CMOS address/data register addresses as macros
   - Defined the (apparent) I/O delay as a macro

 Looks good, Anthony.

  I also verified the ACPI CMOS region code only accesses up to
  register 63 (0x3F - in previous emails I mistakenly said 0x7F).

 Is 0x3f what the ACPI spec limits ACPI access to?

 This is mildly surprising, as all modern RTC chips are at least 128
 bytes; noone's actually used a MC146818 in over ten years, I gather.

 BIOSes at least used to use much more.  It is good for ACPI to not allow
 clobbering the BIOS state.

  If/when this gets in, I'd like to add sysctl controls to e.g. allow
  ACPI access to the date/time registers (I currently return failure
  when attempting to write them via ACPI).  I don't see anything in the
  spec (after re-reading it) that disallows ACPI from touching those.

 I don't know about the spec, but I can't see allowing ACPI write access
 to time/alarm/date registers as being a sensible idea; this could allow
 completely out-of-band modification of time/alarm/date regs without the
 necessary precautions needed for setting these, namely use of the SET
 bit in status reg B (0x0b) to disable updates while setting time  date,
 and anyway why allow changing time/date without the OS knowing about it?
 Reading should be no problem, though without proper observance of UIP
 bit timing, data may not be consistent.

 They might need to be read on resume.  I can't see how resume can possibly
 work without reading them or some clock to reset the time.  But this should
 be done by calling the standard time reading functions, not at a low level.

 I guess there may be a case for messing with the alarm bytes, though we
 don't currently allow use of the alarm for wake from poweroff (S5) or
 STR (S3) states, as doth Linux.  I looked into this some years ago when
 there were a few requests for the feature, however it would involve some
 major reworking to always preserve the alarm interrupt bit through init
 and suspend/resume, and appropriate clearing of same after use, along
 with a utility to setup the alarm .. a potential to leave open, perhaps?

 I use the seconds alarm for pps, but don't use suspend.

 Which leads to my other concern here: that you are allowing out-of-band
 access to especially status reg C (0x0c), which when read clears all
 enabled interrupts, and the other status regs whose settings are also
 too important to allow screwing with.

 Yes, reading it would break interrupt handling.

 We used to use the RTC periodic interrupt as a clock source for a 128Hz
 interrupt (1024Hz while profiling) but since mav@'s reworking of all the
 clocks and timers sometime prior to 9.0-REL (IIRC), that's now rarely if
 ever used as such, but remains an available clock or timer source.

 Reg A (0x0a) is r/w and sets clock divider and rate selection bits.  Do
 we want ACPI to be able to mess with these?  I think not.  Readonly, ok.

 Reg B (0x0b) is r/w and contains the aforementioned SET bit, the three
 interrupt enable bits (PIE, AIE, UIE), the SQWE, DM, 24/12 and DSE bits,
 again none of which should be allowed to be written to out-of-band.

 And reg C (0x0c) is read-only, and as mentioned clears interrupts; again
 not something that should be permitted without the OS knowing about it.

 I suppose you have the MC146818 spec to hand?

 Couple of things from your patch:

 -staticintrtc_reg = -1;

 Indeed; I never could figure out the advantage in this, as how rarely
 would you want to read or write the same reg twice in a row anyway?

 About 99.% of accesses are to the status register for handling RTC
 interrupts.  The RTC is almost never used except for the 128Hz interrupt,
 so the index register is normally constant at the single value used by
 this interrupt (RTC_INTR).  With mav's changes, it is now rarely used.

 Removing this is a good re-pessimization.  rtcin() does 4 i/o's with the
 pessimization.  This takes about 5 usec.  At 128 Hz this only wastes
 0.064% of a CPU.  At 1024 Hz it wastes 0.512%.  The non-pessimized
 version only does 1 i/o so it wastes 1/4 as much.  The wastage should
 be much larger.  1024 Hz is far too slow for profiling a modern system.
 It is about right for a 4.77 MHz 8086, except the RTC overhead is too
 high for that (I used the i8254).  Scaling this for a 2 GHz modern CPU
 gives 429 kHz (or more, since instructions per cycle is more).  The
 overhead for that with the pessimization would be 214% of a CPU.
 Without the pessimization, it would only be 53.7% of a CPU.  Still too
 much.  The RTC is just unsuitable for profiling interrupts at a useful
 rate.  The i8254 is better and the lapic timer is much better, but
 profhz has only been