from:"Jörg Otte"

segmentation faults with kernels >v5.8-rc2

2020-06-29 Thread Jörg Otte

I frequently get segmentation faults with kernels newer than v5.8-rc2, e.g.:
Chrome_ChildIOT[1298]: segfault at 40d048 ip 56111a41970d sp
7fdced6931b0 error 6 in chrome[561115f64000+785b000]

Bisection gave me the following
first bad commit:

e9c15badbb7b20ccdbadf5da14e0a68fbad51015 is the first bad commit
commit e9c15badbb7b20ccdbadf5da14e0a68fbad51015
Author: Mel Gorman 
Date:   Mon Jun 15 13:13:58 2020 +0100

fs: Do not check if there is a fsnotify watcher on pseudo inodes
...

I double checked by reverting this commit on top of v5.8-rc3 and the
segmentation faults are gone.


Jörg
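
The "error 6" field in the segfault line above is the x86 page-fault error code. As a rough sketch (the struct and function names below are made up for illustration and are not kernel APIs), the three low bits can be decoded like this:

```c
#include <stdbool.h>

/* Hypothetical decoder for the low bits of an x86 page-fault error code,
 * as printed in "segfault at ... error N" kernel messages. */
struct pf_error {
    bool protection; /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;      /* bit 1: 1 = write access, 0 = read access */
    bool user;       /* bit 2: 1 = fault happened in user mode */
};

struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;

    e.protection = (code & 0x1UL) != 0;
    e.write      = (code & 0x2UL) != 0;
    e.user       = (code & 0x4UL) != 0;
    return e;
}
```

For the value 6 (binary 110) this gives a user-mode write to a page that was not present.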

Re: [5.2.0-rcx] Bluetooth: hci0: unexpected event for opcode

2019-06-06 Thread Jörg Otte

Am Do., 6. Juni 2019 um 13:20 Uhr schrieb Marcel Holtmann :
>
> Hi Joerg,
>
> >>> In 5.2.0-rcx I see a new error message on startup probably after
> >>> loading the Bluetooth firmware:
> >>> [1.609460] Bluetooth: hci0: unexpected event for opcode 0xfc2f
> >>>
>  dmesg | grep Bluetooth
> >>> [0.130969] Bluetooth: Core ver 2.22
> >>> [0.130973] Bluetooth: HCI device and connection manager initialized
> >>> [0.130974] Bluetooth: HCI socket layer initialized
> >>> [0.130975] Bluetooth: L2CAP socket layer initialized
> >>> [0.130976] Bluetooth: SCO socket layer initialized
> >>> [0.374716] Bluetooth: RFCOMM TTY layer initialized
> >>> [0.374718] Bluetooth: RFCOMM socket layer initialized
> >>> [0.374718] Bluetooth: RFCOMM ver 1.11
> >>> [0.374719] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> >>> [0.374720] Bluetooth: BNEP socket layer initialized
> >>> [0.374721] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
> >>> [0.374722] Bluetooth: HIDP socket layer initialized
> >>> [1.422530] Bluetooth: hci0: read Intel version: 370710018002030d00
> >>> [1.422533] Bluetooth: hci0: Intel Bluetooth firmware file:
> >>> intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq
> >>> [1.609460] Bluetooth: hci0: unexpected event for opcode 0xfc2f
> >>> [1.625557] Bluetooth: hci0: Intel firmware patch completed and 
> >>> activated
> >>> [   21.986125] input: BluetoothMouse3600 Mouse as
> >>> /devices/virtual/misc/uhid/0005:045E:0916.0004/input/input15
> >>> [   21.986329] input: BluetoothMouse3600 Consumer Control as
> >>> /devices/virtual/misc/uhid/0005:045E:0916.0004/input/input16
> >>> [   21.986408] hid-generic 0005:045E:0916.0004: input,hidraw3:
> >>> BLUETOOTH HID v1.10 Mouse [BluetoothMouse3600] on 80:19:34:4D:31:44
> >>>
> >>>
> >>> The error message goes away if I revert following patch:
> >>> f80c5dad7b64 Bluetooth: Ignore CC events not matching the last HCI command
> >>
> >> if you can send btmon trace (or better btmon -w trace.log) for this event 
> >> triggering it, then we can look if this is a hardware issue.
> >
> > The problem is that it happens only once during startup, especially at
> > the very first startup after power-on only. So I can't issue any
> > command.
>
> try to blacklist btusb.ko module. Create /etc/modprobe.d/blacklist-btusb.conf 
> with the content of "blacklist vc4”.

I think you mean "blacklist btusb".

>Then once booted, start “btmon -w trace.log” and then “modprobe btusb”. This 
>should give you the initial firmware loading trace.
>
> I am just assuming that the module is connected via USB, if not then you need 
> to let me know.
>
> >> We have only seen this with Atheros hardware so far, but it might happen 
> >> for others as well. It indicates that this is an unexpected event. 
> >> Normally you can ignore this warning since it just indicates an existing 
> >> issue that we just papered over before. So if everything works as before, 
> >> just ignore it,
> >
> > Yes for me BT works as usual so ignoring it would be no problem (but
> > it looks ugly because the error message is painted right on the
> > boot-screen).
>
> The 0xfc2f command is never issued by btusb.c or btintel.c actually. It is a 
> command to apply the BDDATA information used only by Intel AG6xx devices 
> which are UART only. So I am almost certain that this is a bug in the 
> hardware / firmware and the patch above just started to highlight it. The 
> trace will show if that is the case.

Done! Here comes trace.log.

Thanks, Jörg
[Attachment trace.log: binary btsnoop capture, not reproduced here. Readable header strings: Linux version 5.2.0-rc3-BT (x86_64); Bluetooth subsystem version 2.22; bluetoothd, Bluetooth daemon 5.48.]

Re: [5.2.0-rcx] Bluetooth: hci0: unexpected event for opcode

2019-06-06 Thread Jörg Otte

Am Do., 6. Juni 2019 um 08:18 Uhr schrieb Marcel Holtmann :
>
> Hi Joerg,
>
> > In 5.2.0-rcx I see a new error message on startup probably after
> > loading the Bluetooth firmware:
> > [1.609460] Bluetooth: hci0: unexpected event for opcode 0xfc2f
> >
> >> dmesg | grep Bluetooth
> > [0.130969] Bluetooth: Core ver 2.22
> > [0.130973] Bluetooth: HCI device and connection manager initialized
> > [0.130974] Bluetooth: HCI socket layer initialized
> > [0.130975] Bluetooth: L2CAP socket layer initialized
> > [0.130976] Bluetooth: SCO socket layer initialized
> > [0.374716] Bluetooth: RFCOMM TTY layer initialized
> > [0.374718] Bluetooth: RFCOMM socket layer initialized
> > [0.374718] Bluetooth: RFCOMM ver 1.11
> > [0.374719] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
> > [0.374720] Bluetooth: BNEP socket layer initialized
> > [0.374721] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
> > [0.374722] Bluetooth: HIDP socket layer initialized
> > [1.422530] Bluetooth: hci0: read Intel version: 370710018002030d00
> > [1.422533] Bluetooth: hci0: Intel Bluetooth firmware file:
> > intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq
> > [1.609460] Bluetooth: hci0: unexpected event for opcode 0xfc2f
> > [1.625557] Bluetooth: hci0: Intel firmware patch completed and activated
> > [   21.986125] input: BluetoothMouse3600 Mouse as
> > /devices/virtual/misc/uhid/0005:045E:0916.0004/input/input15
> > [   21.986329] input: BluetoothMouse3600 Consumer Control as
> > /devices/virtual/misc/uhid/0005:045E:0916.0004/input/input16
> > [   21.986408] hid-generic 0005:045E:0916.0004: input,hidraw3:
> > BLUETOOTH HID v1.10 Mouse [BluetoothMouse3600] on 80:19:34:4D:31:44
> >
> >
> > The error message goes away if I revert following patch:
> > f80c5dad7b64 Bluetooth: Ignore CC events not matching the last HCI command
>
> if you can send btmon trace (or better btmon -w trace.log) for this event 
> triggering it, then we can look if this is a hardware issue.

The problem is that it happens only once, during the very first startup
after power-on, so I have no chance to issue a command before it occurs.

>We have only seen this with Atheros hardware so far, but it might happen for 
>others as well. It indicates that this is an unexpected event. Normally you 
>can ignore this warning since it just indicates an existing issue that we just 
>papered over before. So if everything works as before, just ignore it,

Yes for me BT works as usual so ignoring it would be no problem (but
it looks ugly because the error message is painted right on the
boot-screen).

Thanks, Jörg

[5.2.0-rcx] Bluetooth: hci0: unexpected event for opcode

2019-06-05 Thread Jörg Otte

In 5.2.0-rcx I see a new error message on startup probably after
loading the Bluetooth firmware:
[1.609460] Bluetooth: hci0: unexpected event for opcode 0xfc2f

> dmesg | grep Bluetooth
[0.130969] Bluetooth: Core ver 2.22
[0.130973] Bluetooth: HCI device and connection manager initialized
[0.130974] Bluetooth: HCI socket layer initialized
[0.130975] Bluetooth: L2CAP socket layer initialized
[0.130976] Bluetooth: SCO socket layer initialized
[0.374716] Bluetooth: RFCOMM TTY layer initialized
[0.374718] Bluetooth: RFCOMM socket layer initialized
[0.374718] Bluetooth: RFCOMM ver 1.11
[0.374719] Bluetooth: BNEP (Ethernet Emulation) ver 1.3
[0.374720] Bluetooth: BNEP socket layer initialized
[0.374721] Bluetooth: HIDP (Human Interface Emulation) ver 1.2
[0.374722] Bluetooth: HIDP socket layer initialized
[1.422530] Bluetooth: hci0: read Intel version: 370710018002030d00
[1.422533] Bluetooth: hci0: Intel Bluetooth firmware file:
intel/ibt-hw-37.7.10-fw-1.80.2.3.d.bseq
[1.609460] Bluetooth: hci0: unexpected event for opcode 0xfc2f
[1.625557] Bluetooth: hci0: Intel firmware patch completed and activated
[   21.986125] input: BluetoothMouse3600 Mouse as
/devices/virtual/misc/uhid/0005:045E:0916.0004/input/input15
[   21.986329] input: BluetoothMouse3600 Consumer Control as
/devices/virtual/misc/uhid/0005:045E:0916.0004/input/input16
[   21.986408] hid-generic 0005:045E:0916.0004: input,hidraw3:
BLUETOOTH HID v1.10 Mouse [BluetoothMouse3600] on 80:19:34:4D:31:44


The error message goes away if I revert the following patch:
f80c5dad7b64 Bluetooth: Ignore CC events not matching the last HCI command

Thanks, Jörg

Re: [v4.17-rcx] Lost IBPB, IBRS_FW support for spectre_v2 mitigation.

2018-05-05 Thread Jörg Otte

2018-05-04 18:18 GMT+02:00 Borislav Petkov :
> On Wed, May 02, 2018 at 02:20:52PM +0200, Thomas Gleixner wrote:
>> Thanks for confirming. Still need to find a way which is less fragile, but
>> that's probably too much of churn for rc4
>>
>> At least I know exactly what's happening, so I can write a better changelog.
>>
>> Thanks for testing!
>
> Jörg, can you pls also test this one ontop of Thomas' patch to make
> sure it doesn't break your box.
>
> Thx.
>
> ---
> From 6857c2ac8e31f4f9b350cfad4f6b6eb831bf57f1 Mon Sep 17 00:00:00 2001
> From: Borislav Petkov 
> Date: Wed, 2 May 2018 18:15:14 +0200
> Subject: [PATCH] x86/CPU: Use synthetic bits for IBRS/IBPB/STIBP
>
> Intel and AMD have different CPUID bits for those so use synthetic bits
> which get set on the respective vendor in init_speculation_control(). So
> that debacles like the commit message of
>
>   c65732e4f721 ("x86/cpu: Restore CPUID_8000_0008_EBX reload")
>

The patch doesn't cause any problems here. For me it's OK.

Thanks, Jörg

Re: [v4.17-rcx] Lost IBPB, IBRS_FW support for spectre_v2 mitigation.

2018-05-02 Thread Jörg Otte

2018-05-02 11:02 GMT+02:00 Thomas Gleixner :
> On Wed, 2 May 2018, Jörg Otte wrote:
>> With revert:
>>
>> jojo@fichte:~$ dmesg | grep -i -e spec -e micro -e "Linux version"
>>
>> [0.00] microcode: microcode updated early to revision 0x24,
>> date = 2018-01-21
>> [0.00] Linux version 4.17.0-rc3-revert-1-gcb1069f
>> (jojo@fichte) (gcc version 5.4.0 20160609 (Ubuntu
>> 5.4.0-6ubuntu1~16.04.9)) #21 SMP Wed May 2 09:14:29 CEST 2018
>> [0.028414] Spectre V2 : Mitigation: Full generic retpoline
>> [0.028415] Spectre V2 : Spectre v2 mitigation: Enabling Indirect
>> Branch Prediction Barrier
>> [0.028415] Spectre V2 : Enabling Restricted Speculation for firmware 
>> calls
>> [0.500157] microcode: sig=0x306c3, pf=0x10, revision=0x24
>> [0.500183] microcode: Microcode Update Driver: v2.2.
>>
>>
>> With patch:
>>
>> dmesg | grep -i -e spec -e micro -e "Linux version"
>>
>> [0.00] microcode: microcode updated early to revision 0x24,
>> date = 2018-01-21
>> [0.00] Linux version 4.17.0-rc3-patch-1-gdc10603
>> (jojo@fichte) (gcc version 5.4.0 20160609 (Ubuntu
>> 5.4.0-6ubuntu1~16.04.9)) #20 SMP Wed May 2 09:08:07 CEST 2018
>> [0.028417] Spectre V2 : Mitigation: Full generic retpoline
>> [0.491803] microcode: sig=0x306c3, pf=0x10, revision=0x24
>> [0.491831] microcode: Microcode Update Driver: v2.2.
>
> Ok, I think I know what's going wrong in that steaming pile of horrors of
> CPUID detection. I need to analyze it down to the roots, but if you have
> cycles, can you please test the patch below?
>
> It's a hack and even if it fixes the problem I'm going to do it differently.
>
> Thanks,
>
> tglx
>
> 8<---
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -848,6 +848,11 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
> c->x86_power = edx;
> }
>
> +   if (c->extended_cpuid_level >= 0x80000008) {
> +   cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
> +   c->x86_capability[CPUID_8000_0008_EBX] = ebx;
> +   }
> +
> if (c->extended_cpuid_level >= 0x8000000a)
> c->x86_capability[CPUID_8000_000A_EDX] = 
> cpuid_edx(0x8000000a);
>
> @@ -871,7 +876,6 @@ static void get_cpu_address_sizes(struct
>
> c->x86_virt_bits = (eax >> 8) & 0xff;
> c->x86_phys_bits = eax & 0xff;
> -   c->x86_capability[CPUID_8000_0008_EBX] = ebx;
> }
>  #ifdef CONFIG_X86_32
> else if (cpu_has(c, X86_FEATURE_PAE) || cpu_has(c, X86_FEATURE_PSE36))
>

OK, that patch works for me!

Thanks, Jörg

Re: [v4.17-rcx] Lost IBPB, IBRS_FW support for spectre_v2 mitigation.

2018-05-02 Thread Jörg Otte

2018-05-01 22:14 GMT+02:00 Linus Torvalds :
> On Tue, May 1, 2018 at 5:59 AM Thomas Gleixner  wrote:
>
>> Then I really have no idea how reverting the patch you pointed out would
>> fix it.
>
> So I do think that the original patch is buggy.
>
> What I think *may* be going on is:
>
>   - first we do that
>
>  get_cpu_cap(c);
>  get_cpu_address_sizes(c);
>
>  but at that point, CPU levels may be masked, and that 0x80000008 leaf
> isn't seen
>
>   - then we do
>
>  if (this_cpu->c_early_init)
>  this_cpu->c_early_init(c);
>
> which calls early_init_intel(), which does that
>
>  if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
>MSR_IA32_MISC_ENABLE_LIMIT_CPUID_BIT) >
> 0) {
>
> which now raises the cpuid_level.
>
>   - then we do
>
>  get_cpu_cap(c);
>
> again, because the cpuid level has been raised, and _now_ it used to get
> that 0x80000008 leaf information.
>
> But with the change, that second call to get_cpu_cap() didn't do anything,
> because the 0x80000008 leaf handling had been moved away.
>
> However, I agree that your patch to just do that CPUID_8000_0008_EBX in
> get_cpu_cap() should have fixed it, and it's possible that Jörg mis-tested
> it.
>
> Jörg, are you sure you didn't somehow get the wrong microcode? Because
> another way for those bits to be cleared again is if
> bad_spectre_microcode() triggers. That should show up in dmesg as "Intel
> Spectre v2 broken microcode detected" though.
>
> Linus

I downloaded microcode from Intel.
Here are the excerpts from dmesg:

With revert:

jojo@fichte:~$ dmesg | grep -i -e spec -e micro -e "Linux version"

[0.00] microcode: microcode updated early to revision 0x24,
date = 2018-01-21
[0.00] Linux version 4.17.0-rc3-revert-1-gcb1069f
(jojo@fichte) (gcc version 5.4.0 20160609 (Ubuntu
5.4.0-6ubuntu1~16.04.9)) #21 SMP Wed May 2 09:14:29 CEST 2018
[0.028414] Spectre V2 : Mitigation: Full generic retpoline
[0.028415] Spectre V2 : Spectre v2 mitigation: Enabling Indirect
Branch Prediction Barrier
[0.028415] Spectre V2 : Enabling Restricted Speculation for firmware calls
[0.500157] microcode: sig=0x306c3, pf=0x10, revision=0x24
[0.500183] microcode: Microcode Update Driver: v2.2.


With patch:

dmesg | grep -i -e spec -e micro -e "Linux version"

[0.00] microcode: microcode updated early to revision 0x24,
date = 2018-01-21
[0.00] Linux version 4.17.0-rc3-patch-1-gdc10603
(jojo@fichte) (gcc version 5.4.0 20160609 (Ubuntu
5.4.0-6ubuntu1~16.04.9)) #20 SMP Wed May 2 09:08:07 CEST 2018
[0.028417] Spectre V2 : Mitigation: Full generic retpoline
[0.491803] microcode: sig=0x306c3, pf=0x10, revision=0x24
[0.491831] microcode: Microcode Update Driver: v2.2.

Thanks, Jörg
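
The call ordering that Linus describes above as something that "may" be going on can be modelled in a few lines of user-space C. This is only a sketch of that hypothesis, not kernel code and not a confirmed root cause: the fake_hw and fake_cpu structs, the boolean switch and the value 0x1000 are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware: while limit_cpuid is set,
 * extended CPUID leaves are hidden (the masking Linus suspects). */
struct fake_hw {
    bool limit_cpuid;
    uint32_t leaf8_ebx;   /* EBX of leaf 0x80000008 once visible */
};

/* Hypothetical stand-in for struct cpuinfo_x86. */
struct fake_cpu {
    uint32_t extended_level;
    uint32_t cap_8000_0008_ebx;
};

/* Models get_cpu_cap(); reload_leaf8 selects whether the 0x80000008
 * reload lives here or has been moved away. */
static void get_cpu_cap(struct fake_cpu *c, const struct fake_hw *hw,
                        bool reload_leaf8)
{
    c->extended_level = hw->limit_cpuid ? 0 : 0x80000008u;
    if (reload_leaf8 && c->extended_level >= 0x80000008u)
        c->cap_8000_0008_ebx = hw->leaf8_ebx;
}

/* Models get_cpu_address_sizes(), which runs only once, early. */
static void get_cpu_address_sizes(struct fake_cpu *c, const struct fake_hw *hw)
{
    if (c->extended_level >= 0x80000008u)
        c->cap_8000_0008_ebx = hw->leaf8_ebx;
}

/* Models early_init_intel() clearing the CPUID limit bit. */
static void early_init_intel(struct fake_hw *hw)
{
    assert(hw != NULL);
    hw->limit_cpuid = false;
}

/* Returns the capability word left after the modelled boot sequence. */
uint32_t simulate_boot(bool cap_reloads_leaf8)
{
    struct fake_hw hw = { .limit_cpuid = true, .leaf8_ebx = 0x1000 };
    struct fake_cpu c = { 0, 0 };

    get_cpu_cap(&c, &hw, cap_reloads_leaf8);   /* leaf still masked */
    get_cpu_address_sizes(&c, &hw);            /* sees nothing either */
    early_init_intel(&hw);                     /* raises the cpuid level */
    get_cpu_cap(&c, &hw, cap_reloads_leaf8);   /* second pass */
    return c.cap_8000_0008_ebx;
}
```

In this model, simulate_boot(true) ends up with the leaf's EBX value, while simulate_boot(false) leaves the capability word at zero, because the only function that runs after the level is raised no longer reads the leaf.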

Re: [v4.17-rcx] Lost IBPB, IBRS_FW support for spectre_v2 mitigation.

2018-05-01 Thread Jörg Otte

2018-04-30 21:53 GMT+02:00 Thomas Gleixner :
> Jörg,
>
> On Mon, 30 Apr 2018, Jörg Otte wrote:
>
>> In v4.16 I already had support for IBPB, IBRS_FW for spectre_v2 mitigation.
>> But this went away in v4.17-rcx.
>>
>> With 4.16 I have:
>> jojo@fichte:~$ cd /sys/devices/system/cpu/vulnerabilities; grep ".*" *
>> meltdown:Mitigation: PTI
>> spectre_v1:Mitigation: __user pointer sanitization
>> spectre_v2:Mitigation: Full generic retpoline, IBPB, IBRS_FW
>>
>> With 4.17-rcx I have:
>> meltdown:Mitigation: PTI
>> spectre_v1:Mitigation: __user pointer sanitization
>> spectre_v2:Mitigation: Full generic retpoline
>>
>> Processor is
>> vendor_id   : GenuineIntel
>> cpu family  : 6
>> model   : 60
>> model name  : Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
>> stepping: 3
>> microcode   : 0x24
>>
>>
>> The problem goes away if I revert:
>> d94a155 x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption
>
> Does the patch below fix the problem for you?
>
> Thanks,
>
> tglx
>
> 8<--
> Subject: x86/cpu: Restore CPUID_8000_0008_EBX reload
> From: Thomas Gleixner 
> Date: Mon, 30 Apr 2018 21:47:46 +0200
>
> The recent commit which addresses the x86_phys_bits corruption with
> encrypted memory on CPUID reload after a microcode update lost the reload
> of CPUID_8000_0008_EBX as well.
>
> As a consequence IBRS and IBRS_FW are no longer detected.
>
> Restore the behaviour by bringing the reload of CPUID_8000_0008_EBX back.
>
> Fixes: d94a155c59c9 ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment 
> corruption")
> Reported-by: Jörg Otte 
> Signed-off-by: Thomas Gleixner 
> Cc: kirill.shute...@linux.intel.com
> ---
>  arch/x86/kernel/cpu/common.c |5 +
>  1 file changed, 5 insertions(+)
>
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -848,6 +848,11 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
> c->x86_power = edx;
> }
>
> +   if (c->extended_cpuid_level >= 0x80000008) {
> +   cpuid(0x80000008, &eax, &ebx, &ecx, &edx);
> +   c->x86_capability[CPUID_8000_0008_EBX] = ebx;
> +   }
> +
> if (c->extended_cpuid_level >= 0x8000000a)
> c->x86_capability[CPUID_8000_000A_EDX] = 
> cpuid_edx(0x8000000a);
>

No, does not fix it.

Thanks, Jörg

[v4.17-rcx] Lost IBPB, IBRS_FW support for spectre_v2 mitigation.

2018-04-30 Thread Jörg Otte

Hi,

In v4.16 I already had support for IBPB, IBRS_FW for spectre_v2 mitigation.
But this went away in v4.17-rcx.

With 4.16 I have:
jojo@fichte:~$ cd /sys/devices/system/cpu/vulnerabilities; grep ".*" *
meltdown:Mitigation: PTI
spectre_v1:Mitigation: __user pointer sanitization
spectre_v2:Mitigation: Full generic retpoline, IBPB, IBRS_FW

With 4.17-rcx I have:
meltdown:Mitigation: PTI
spectre_v1:Mitigation: __user pointer sanitization
spectre_v2:Mitigation: Full generic retpoline

Processor is
vendor_id   : GenuineIntel
cpu family  : 6
model   : 60
model name  : Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
stepping: 3
microcode   : 0x24


The problem goes away if I revert:
d94a155 x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption

Thanks, Jörg
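
The difference between the two spectre_v2 lines above can also be checked programmatically. The following is a minimal sketch, assuming the status string has already been read from the sysfs file; the function name mitigation_has is made up for illustration and is not part of any kernel or libc API.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if `feature` appears as a whole,
 * comma- or space-delimited token in a mitigation status string such as
 * "Mitigation: Full generic retpoline, IBPB, IBRS_FW". */
bool mitigation_has(const char *status, const char *feature)
{
    size_t len = strlen(feature);

    if (len == 0)
        return false;

    for (const char *p = strstr(status, feature); p != NULL;
         p = strstr(p + 1, feature)) {
        /* The match must start and end on a token boundary, so that
         * "IBRS" does not match inside "IBRS_FW". */
        bool starts = (p == status) || p[-1] == ' ' || p[-1] == ',';
        bool ends = p[len] == '\0' || p[len] == ',' || p[len] == '\n';

        if (starts && ends)
            return true;
    }
    return false;
}
```

With the 4.16 string from the report, both "IBPB" and "IBRS_FW" are found; with the 4.17-rcx string neither is.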

Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Jörg Otte

2018-04-17 16:27 GMT+02:00 Borislav Petkov :
> On Tue, Apr 17, 2018 at 04:16:34PM +0200, Jörg Otte wrote:
>> Current Linus master tree (4.17.0-rc1-00021-ga27fc14) doesn't fix it.
>
> Then pls continue bisecting. Unless someone has a better idea...
>

Finished bisection.
39114b7a743e6759bab4d96b7d9651d44d17e3f9 is the first bad commit
(x86/pti: Never implicitly clear _PAGE_GLOBAL for kernel image).

Thanks, Jörg

Re: 4.17.0-rc1 doesn't boot.

2018-04-17 Thread Jörg Otte

2018-04-17 10:14 GMT+02:00 Borislav Petkov :
> On Tue, Apr 17, 2018 at 10:00:25AM +0200, Jörg Otte wrote:
>> Maybe the problem came in with:
>> 6b0a02e:  "Merge branch 'x86-pti-for-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"
>
> Fetch latest Linus master and try again - there might be a relevant fix
> there.
>

Current Linus master tree (4.17.0-rc1-00021-ga27fc14) doesn't fix it.

Thanks, Jörg

4.17.0-rc1 doesn't boot.

2018-04-17 Thread Jörg Otte

Hi,
my notebook doesn't boot with 4.17.0-rc1. Booting stops right after
displaying "loading initial ramdisk..". Nothing further is displayed.
Also, nothing is written to the logs.

First known bad kernel is: 4.16.0-12564-g6b0a02e
Last known good kernel is: 4.16.0-12548-g71b8ebb

Maybe the problem came in with:
6b0a02e:  "Merge branch 'x86-pti-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"


Thanks, Jörg

v4.15-rc1 regression: No sound via usb-audio

2017-11-27 Thread Jörg Otte

With v4.15-rc1, "alsamixer" has no volume control for USB sound boxes.
The sound boxes have been working well for years, so this is a regression.

Reverting commit:
8428a8e: "ALSA: usb-audio: Fix potential zero-division at parsing FU"

fixes the problem for me completely.


Thanks, Jörg

Re: 4.14.0-rc7: cpufreq interface in sysfs removed ?

2017-11-02 Thread Jörg Otte

2017-11-02 9:37 GMT+01:00 Jörg Otte :
> 2017-11-01 21:24 GMT+01:00 Rafael J. Wysocki :
>> On Wed, Nov 1, 2017 at 6:06 PM, Jörg Otte  wrote:
>>> In 4.14.0-rc7-9-g287683d cpufreq directory under
>>> /sys/devices/system/cpu/cpufreq
>>> is empty. Also link /sys/devices/system/cpu/cpu0/cpufreq is missing.
>>>
>>> Is this change intentional?
>>
>> Sure not.
>>
>> What's the last good kernel you tested before this one?
> Unfortunately I had a disk crash and was offline for the last few weeks.
> This is the first kernel with everything newly installed on a new disk.
> So I don't remember exactly what the last good kernel was. 4.14.0-rc1 was good for sure.
>
> Maybe I made a mistake with the kernel configuration?
>
> #
> # CPU Frequency scaling
> #
> CONFIG_CPU_FREQ=y
> CONFIG_CPU_FREQ_GOV_ATTR_SET=y
> CONFIG_CPU_FREQ_GOV_COMMON=y
> CONFIG_CPU_FREQ_STAT=y
> CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
> # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
> CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
> CONFIG_CPU_FREQ_GOV_POWERSAVE=y
> # CONFIG_CPU_FREQ_GOV_USERSPACE is not set
> CONFIG_CPU_FREQ_GOV_ONDEMAND=y
> CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
> CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y
>
> #
> # CPU frequency scaling drivers
> #
> CONFIG_X86_INTEL_PSTATE=y
> # CONFIG_X86_PCC_CPUFREQ is not set
> # CONFIG_X86_ACPI_CPUFREQ is not set
> # CONFIG_X86_SPEEDSTEP_CENTRINO is not set
> # CONFIG_X86_P4_CLOCKMOD is not set
>
>
> Just discovered:
> If intel_pstate is active, everything looks fine.
> If intel_pstate is disabled (intel_pstate=disable), the directories are
> empty again.
>
> Processors are:
> jojo@fichte:~$ cat /proc/cpuinfo
> processor   : 0 .. 3
> vendor_id   : GenuineIntel
> cpu family  : 6
> model   : 60
> model name  : Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
> stepping: 3
> microcode   : 0x22
> cpu MHz : 2494.433
> cache size  : 3072 KB
> physical id : 0
> siblings: 4
> core id : 0
> cpu cores   : 2
> apicid  : 0
> initial apicid  : 0
> fpu : yes
> fpu_exception   : yes
> cpuid level : 13
> wp  : yes
> flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
> mca cmov pat pse36 clflush dts acpi mmx fxsr
> sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc
> arch_perfmon pebs bts rep_good nopl xtopology nons
> top_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est
> tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1
> sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
> lahf_lm abm cpuid_fault epb tpr_shadow vnmi flexpr
> iority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
> xsaveopt dtherm ida arat pln pts
> bugs:
> bogomips: 4988.86
> clflush size: 64
> cache_alignment : 64
> address sizes   : 39 bits physical, 48 bits virtual
> power management:
>

>> # CONFIG_X86_ACPI_CPUFREQ is not set

OK, that was it! My bad. Sorry for the noise.

Thanks,
Jörg

Re: 4.14.0-rc7: cpufreq interface in sysfs removed ?

2017-11-02 Thread Jörg Otte

2017-11-01 21:24 GMT+01:00 Rafael J. Wysocki :
> On Wed, Nov 1, 2017 at 6:06 PM, Jörg Otte  wrote:
>> In 4.14.0-rc7-9-g287683d cpufreq directory under
>> /sys/devices/system/cpu/cpufreq
>> is empty. Also link /sys/devices/system/cpu/cpu0/cpufreq is missing.
>>
>> Is this change intentional?
>
> Sure not.
>
> What's the last good kernel you tested before this one?
Unfortunately I had a disk crash and was offline for the last few weeks.
This is the first kernel with everything newly installed on a new disk.
So I don't remember exactly what the last good kernel was. 4.14.0-rc1 was good for sure.

Maybe I made a mistake with the kernel configuration?

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_GOV_ATTR_SET=y
CONFIG_CPU_FREQ_GOV_COMMON=y
CONFIG_CPU_FREQ_STAT=y
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_SCHEDUTIL is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
CONFIG_CPU_FREQ_GOV_SCHEDUTIL=y

#
# CPU frequency scaling drivers
#
CONFIG_X86_INTEL_PSTATE=y
# CONFIG_X86_PCC_CPUFREQ is not set
# CONFIG_X86_ACPI_CPUFREQ is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_P4_CLOCKMOD is not set

Just discovered:
If intel_pstate is active, everything looks fine.
If intel_pstate is disabled (intel_pstate=disable), the directories are
empty again.

Processors are:
jojo@fichte:~$ cat /proc/cpuinfo
processor   : 0 .. 3
vendor_id   : GenuineIntel
cpu family  : 6
model   : 60
model name  : Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
stepping: 3
microcode   : 0x22
cpu MHz : 2494.433
cache size  : 3072 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 2
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr
sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc
arch_perfmon pebs bts rep_good nopl xtopology nons
top_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est
tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1
sse4_2 movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm cpuid_fault epb tpr_shadow vnmi flexpr
iority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid
xsaveopt dtherm ida arat pln pts
bugs:
bogomips: 4988.86
clflush size: 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

Thanks,
Jörg

4.14.0-rc7: cpufreq interface in sysfs removed ?

2017-11-01 Thread Jörg Otte

In 4.14.0-rc7-9-g287683d cpufreq directory under
/sys/devices/system/cpu/cpufreq
is empty. Also link /sys/devices/system/cpu/cpu0/cpufreq is missing.

Is this change intentional?

Jörg

[4.12.0-rc0]: compile error in initramfs.c

2017-05-06 Thread Jörg Otte

In kernel 4.11.0-10502-g3ef2bc0 I get the following compile error:

/kernel/linux/init/initramfs.c: In function 'populate_rootfs':
/kernel/linux/init/initramfs.c:644:2: error: label at end of compound statement
  done:

The compile error goes away if I revert
commit 17a9be31747535184f2af156b1f080ec4c92a952
"initramfs: Always do fput() and load modules after rootfs populate"


Thanks
Jörg
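
For readers unfamiliar with this diagnostic: before C23, a label must be followed by a statement, so a label placed directly before a closing brace is rejected with "label at end of compound statement". The sketch below is a made-up function (not the kernel's populate_rootfs()) showing the usual workaround of an empty statement after the label.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums values until the first negative entry.
 * The "done" label sits at the very end of the function body. */
void sum_until_negative(const int *values, size_t n, long *out)
{
    assert(values != NULL && out != NULL);

    *out = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] < 0)
            goto done;
        *out += values[i];
    }
done:
    ;   /* empty statement; removing it leaves the label directly before
         * the closing brace, which pre-C23 compilers reject */
}
```

One common way to end up in this situation is when the statements after a label are compiled out by the preprocessor in some configurations; whether that was the case here is not stated in the report.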

regression: snd-usb-audio kernel panic NULL pointer exception

2016-12-16 Thread Jörg Otte

If I connect a Logitech USB audio speaker (idVendor=1130, idProduct=1620), I get
a kernel panic: fatal exception in interrupt.

Nothing is written to the logs, so this
is a hand-written excerpt from the screen:

call trace:

? snd_complete_urb
? __usb_hcd_giveback_urb
? xhci_giveback_urb_in_irq.isra
? ata_scsi_qc_complete
? finish_td.isra.59
? xhci_irq
? enqueue_task_fair
? __handle_irq_event_percpu
? handle_irq_event
? handle_edge_irq
? handle_irq
? do_irq
? common_interrupt

? cpuidle_enter_state
? cpuidle_enter_state
? do_idle
? cpu_startup_entry
? start_secondary
? start_cpu

Code:
RIP: retrieve_playback_urb+0xd
CR2: 10

Last known good kernel is: 4.9.0-07150-gcdb98c2
First known bad kernel is: 4.9.0-08648-g5cc60ae

Thanks, Jörg

Re: [v4.9-rc4] dvb-usb/cinergyT2 NULL pointer dereference

2016-11-11 Thread Jörg Otte

2016-11-11 14:55 GMT+01:00 Mauro Carvalho Chehab :
> Em Thu, 10 Nov 2016 12:15:39 +0100
> Jörg Otte  escreveu:
>
>> 2016-11-10 9:40 GMT+01:00 Mauro Carvalho Chehab :
>> > Em Wed, 9 Nov 2016 11:07:35 -0800
>> > Linus Torvalds  escreveu:
>> >
>> >> On Wed, Nov 9, 2016 at 3:09 AM, Jörg Otte  wrote:
>> >> >
>> >> > Tried patch with no success. Again a NULL ptr dereference.
>> >>
>> >> I suspect a much simpler approach is to just move the "data_mutex"
>> >> away from the priv area and into "struct dvb_usb_device" and
>> >> "dvb_usb_adapter". Sure, that grows those structures a tiny bit, and
>> >> not every driver may need that mutex, but it simplifies things
>> >> enormously. Mauro?
>> >>
>> >>  Linus
>
>> The patch works for me.
>
> Thanks for testing! That's the (hopefully) final version of it,
> with the fix applied to the other dvb-usb drivers that use
> data_mutex (except for the frontend ones, with uses a different
> private structure, and where the mutex is initialized at attach).
>
> Benjamin,
>
> Could you please test it?
>
> Thanks!
> Mauro
>
> -
>
> [PATCH] dvb-usb: move data_mutex to struct dvb_usb_device
>
> The data_mutex is initialized too late, as it is needed for
> each device driver's power control, causing an OOPS:
>
> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in 
> warm state.
> BUG: unable to handle kernel NULL pointer dereference at   
> (null)
> IP: [] __mutex_lock_slowpath+0x6f/0x100 PGD 0
> Oops: 0002 [#1] SMP
> Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
> CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24
> Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 
> 05/09/2014
> task: 88020e943840 task.stack: 8801f36ec000
> RIP: 0010:[]  [] 
> __mutex_lock_slowpath+0x6f/0x100
> RSP: 0018:8801f36efb10  EFLAGS: 00010282
> RAX:  RBX: 88021509bdc8 RCX: c100
> RDX: 0001 RSI:  RDI: 88021509bdcc
> RBP: 8801f36efb58 R08: 88021f216320 R09: 0010
> R10: 88021f216320 R11: 0023fee6c5a1 R12: 88020e943840
> R13: 88021509bdcc R14:  R15: 88021509bdd0
> FS:  7f21adb86740() GS:88021f20() 
> knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 000215bce000 CR4: 001406f0
> Stack:
>  88021509bdd0   c0137c80
>  88021509bdc8 8801f5944000 0001 c0136b00
>  880213e52000 88021509bdc8 84661856 88021509bd80
> Call Trace:
>  [] ? mutex_lock+0x16/0x25
>  [] ? cinergyt2_power_ctrl+0x1f/0x60 
> [dvb_usb_cinergyT2]
>  [] ? dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
>  [] ? cinergyt2_usb_probe+0x21/0x50 
> [dvb_usb_cinergyT2]
>  [] ? usb_probe_interface+0xf3/0x2a0
>  [] ? driver_probe_device+0x208/0x2b0
>  [] ? __driver_attach+0x87/0x90
>  [] ? driver_probe_device+0x2b0/0x2b0
>  [] ? bus_for_each_dev+0x52/0x80
>  [] ? bus_add_driver+0x1a3/0x220
>  [] ? driver_register+0x56/0xd0
>  [] ? usb_register_driver+0x77/0x130
>  [] ? 0xc013a000
>  [] ? do_one_initcall+0x46/0x180
>  [] ? free_vmap_area_noflush+0x38/0x70
>  [] ? kmem_cache_alloc+0x84/0xc0
>  [] ? do_init_module+0x50/0x1be
>  [] ? load_module+0x1d8b/0x2100
>  [] ? find_symbol_in_section+0xa0/0xa0
>  [] ? SyS_finit_module+0x89/0x90
>  [] ? entry_SYSCALL_64_fastpath+0x13/0x94
> Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c
> 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48>
> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43
> RIP  [] __mutex_lock_slowpath+0x6f/0x100
>  RSP 
> CR2: 
>
> So, move it to the struct dvb_usb_device and initialize it
> before calling the driver's callbacks.
>
> Reported-by: Jörg Otte 
> Signed-off-by: Mauro Carvalho Chehab 
>
> diff --git a/drivers/media/usb/dvb-usb/af9005.c 
> b/drivers/media/usb/dvb-usb/af9005.c
> index b257780fb380..7853261906b1 100644
> --- a/drivers/media/usb/dvb-usb/af9005.c
> +++ b/drivers/media/usb

Re: [v4.9-rc4] dvb-usb/cinergyT2 NULL pointer dereference

2016-11-10 Thread Jörg Otte

2016-11-10 9:40 GMT+01:00 Mauro Carvalho Chehab :
> On Wed, 9 Nov 2016 11:07:35 -0800
> Linus Torvalds wrote:
>
>> On Wed, Nov 9, 2016 at 3:09 AM, Jörg Otte  wrote:
>> >
>> > Tried patch with no success. Again a NULL ptr dereference.
>>
>> That patch was pure garbage, I think. Pretty much all the other
>> drivers that use the same approach will have the same issue. Adding
>> that init function just for the semaphore is crazy.
>>
>> I suspect a much simpler approach is to just move the "data_mutex"
>> away from the priv area and into "struct dvb_usb_device" and
>> "dvb_usb_adapter". Sure, that grows those structures a tiny bit, and
>> not every driver may need that mutex, but it simplifies things
>> enormously. Mauro?
>>
>>  Linus
>
>
> [PATCH] cinergyT2-core: move data_mutex to struct dvb_usb_device
>
> The data_mutex is initialized too late, as it is needed for
> the device's power control, causing an OOPS:
>
> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
> BUG: unable to handle kernel NULL pointer dereference at   (null)
> IP: [] __mutex_lock_slowpath+0x6f/0x100 PGD 0
> Oops: 0002 [#1] SMP
> Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
> CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24
> Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014
> task: 88020e943840 task.stack: 8801f36ec000
> RIP: 0010:[]  [] __mutex_lock_slowpath+0x6f/0x100
> RSP: 0018:8801f36efb10  EFLAGS: 00010282
> RAX:  RBX: 88021509bdc8 RCX: c100
> RDX: 0001 RSI:  RDI: 88021509bdcc
> RBP: 8801f36efb58 R08: 88021f216320 R09: 0010
> R10: 88021f216320 R11: 0023fee6c5a1 R12: 88020e943840
> R13: 88021509bdcc R14:  R15: 88021509bdd0
> FS:  7f21adb86740() GS:88021f20() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 000215bce000 CR4: 001406f0
> Stack:
>  88021509bdd0   c0137c80
>  88021509bdc8 8801f5944000 0001 c0136b00
>  880213e52000 88021509bdc8 84661856 88021509bd80
> Call Trace:
>  [] ? mutex_lock+0x16/0x25
>  [] ? cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2]
>  [] ? dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
>  [] ? cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2]
>  [] ? usb_probe_interface+0xf3/0x2a0
>  [] ? driver_probe_device+0x208/0x2b0
>  [] ? __driver_attach+0x87/0x90
>  [] ? driver_probe_device+0x2b0/0x2b0
>  [] ? bus_for_each_dev+0x52/0x80
>  [] ? bus_add_driver+0x1a3/0x220
>  [] ? driver_register+0x56/0xd0
>  [] ? usb_register_driver+0x77/0x130
>  [] ? 0xc013a000
>  [] ? do_one_initcall+0x46/0x180
>  [] ? free_vmap_area_noflush+0x38/0x70
>  [] ? kmem_cache_alloc+0x84/0xc0
>  [] ? do_init_module+0x50/0x1be
>  [] ? load_module+0x1d8b/0x2100
>  [] ? find_symbol_in_section+0xa0/0xa0
>  [] ? SyS_finit_module+0x89/0x90
>  [] ? entry_SYSCALL_64_fastpath+0x13/0x94
> Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c
> 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48>
> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43
> RIP  [] __mutex_lock_slowpath+0x6f/0x100
>  RSP 
> CR2: 
>
> So, move it to the struct dvb_usb_device and initialize it
> before calling the driver's callbacks.
>
> Reported-by: Jörg Otte 
> Signed-off-by: Mauro Carvalho Chehab 
>
> diff --git a/drivers/media/usb/dvb-usb/cinergyT2-core.c 
> b/drivers/media/usb/dvb-usb/cinergyT2-core.c
> index 8ac825413d5a..87e3bd33900d 100644
> --- a/drivers/media/usb/dvb-usb/cinergyT2-core.c
> +++ b/drivers/media/usb/dvb-usb/cinergyT2-core.c
> @@ -42,7 +42,6 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
>  struct cinergyt2_state {
> u8 rc_counter;
> unsigned char data[64];
> -   struct mutex data_mutex;
>  };
>
>  /* We are missing a release hook with usb_device data */
> @@ -56,12 +55,12 @@ static int cinergyt2_streaming_ctrl(struct 
> dvb_usb_adapter *adap, int enable)
> struct cinergyt2_state *st = d->priv;
> int ret;
>
> -   mutex_lock(&st->data_mutex);
> +   mutex_lock(&d->data_mutex);
> st->data[0] = CINERGYT2_EP1_CONTROL_STREAM_TRANSFER;
> st->data[1] = enable ? 1 : 0;
>
> ret = dvb_usb_generic_rw(d, st->data, 2, st->data, 64, 0);
> -   mutex_unlock(&st->data_mu

Re: [v4.9-rc4] dvb-usb/cinergyT2 NULL pointer dereference

2016-11-09 Thread Jörg Otte

2016-11-08 21:22 GMT+01:00 Mauro Carvalho Chehab :
> On Tue, 8 Nov 2016 10:42:03 -0800
> Linus Torvalds wrote:
>
>> On Sun, Nov 6, 2016 at 7:40 AM, Jörg Otte  wrote:
>> > Since v4.9-rc4 I get the following crash in the dvb-usb-cinergyT2 module.
>>
>> Looks like it's commit 5ef8ed0e5608f ("[media] cinergyT2-core: don't
>> do DMA on stack"), which moved the DMA data array from the stack to
>> the "private" pointer. In the process it also added serialization in
>> the form of "data_mutex", and now it oopses on that mutex because
>> the private pointer is NULL.
>>
>> It looks like the "->private" pointer is allocated in dvb_usb_adapter_init()
>>
>> cinergyt2_usb_probe ->
>>   dvb_usb_device_init ->
>> dvb_usb_init() ->
>>   dvb_usb_adapter_init()
>>
>> but the dvb_usb_init() function calls dvb_usb_device_power_ctrl()
>> (which calls the "power_ctrl" function, which is
>> cinergyt2_power_ctrl() for that driver) *before* it initializes the
>> private field.
>>
>> Mauro, Patrick, could dvb_usb_adapter_init() be called earlier, perhaps?
>
> Calling it earlier won't work, as we need to load the firmware before
> sending the power control commands on some devices.
>
> Probably the best here is to pass an extra optional function parameter
> that will initialize the mutex before calling any functions.
>
> Btw, if it broke here, the DMA fixes will likely break on other drivers.
> So, after Jörg tests this patch, I'll work on a patch series addressing
> this issue on the other drivers I touched.
>
> Regards,
> Mauro
>
> -
>
> [PATCH RFC] cinergyT2-core: initialize the mutex early
>
> NOTE: don't merge this patch as-is... I actually folded two patches
> together here, in order to make it easier to test, but it is best to
> place the changes to the core first, and then the changes to the
> drivers that need an early init.
>
> The mutex used to protect the URB data buffer needs to be
> initialized early, as otherwise it will cause an OOPS:
>
> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
> BUG: unable to handle kernel NULL pointer dereference at   (null)
> IP: [] __mutex_lock_slowpath+0x6f/0x100 PGD 0
> Oops: 0002 [#1] SMP
> Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
> CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24
> Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014
> task: 88020e943840 task.stack: 8801f36ec000
> RIP: 0010:[]  [] __mutex_lock_slowpath+0x6f/0x100
> RSP: 0018:8801f36efb10  EFLAGS: 00010282
> RAX:  RBX: 88021509bdc8 RCX: c100
> RDX: 0001 RSI:  RDI: 88021509bdcc
> RBP: 8801f36efb58 R08: 88021f216320 R09: 0010
> R10: 88021f216320 R11: 0023fee6c5a1 R12: 88020e943840
> R13: 88021509bdcc R14:  R15: 88021509bdd0
> FS:  7f21adb86740() GS:88021f20() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2:  CR3: 000215bce000 CR4: 001406f0
> Stack:
>  88021509bdd0   c0137c80
>  88021509bdc8 8801f5944000 0001 c0136b00
>  880213e52000 88021509bdc8 84661856 88021509bd80
> Call Trace:
>  [] ? mutex_lock+0x16/0x25
>  [] ? cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2]
>  [] ? dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
>  [] ? cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2]
>  [] ? usb_probe_interface+0xf3/0x2a0
>  [] ? driver_probe_device+0x208/0x2b0
>  [] ? __driver_attach+0x87/0x90
>  [] ? driver_probe_device+0x2b0/0x2b0
>  [] ? bus_for_each_dev+0x52/0x80
>  [] ? bus_add_driver+0x1a3/0x220
>  [] ? driver_register+0x56/0xd0
>  [] ? usb_register_driver+0x77/0x130
>  [] ? 0xc013a000
>  [] ? do_one_initcall+0x46/0x180
>  [] ? free_vmap_area_noflush+0x38/0x70
>  [] ? kmem_cache_alloc+0x84/0xc0
>  [] ? do_init_module+0x50/0x1be
>  [] ? load_module+0x1d8b/0x2100
>  [] ? find_symbol_in_section+0xa0/0xa0
>  [] ? SyS_finit_module+0x89/0x90
>  [] ? entry_SYSCALL_64_fastpath+0x13/0x94
> Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c
> 8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48>
> 89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43
> RIP  [] __mutex_lock_slowpath+0x6f/0x100
>  RSP 
> CR2: 
>
> Reported-by: Jörg Otte 
> Fixes: 6679a901c380 ("[media] cinergyT2-core

[v4.9-rc4] dvb-usb/cinergyT2 NULL pointer dereference

2016-11-06 Thread Jörg Otte

Since v4.9-rc4 I get the following crash in the dvb-usb-cinergyT2 module.

dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [] __mutex_lock_slowpath+0x6f/0x100
PGD 0

Oops: 0002 [#1] SMP
Modules linked in: dvb_usb_cinergyT2(+) dvb_usb
CPU: 0 PID: 2029 Comm: modprobe Not tainted 4.9.0-rc4-dvbmod #24
Hardware name: FUJITSU LIFEBOOK A544/FJNBB35 , BIOS Version 1.17 05/09/2014
task: 88020e943840 task.stack: 8801f36ec000
RIP: 0010:[]  [] __mutex_lock_slowpath+0x6f/0x100
RSP: 0018:8801f36efb10  EFLAGS: 00010282
RAX:  RBX: 88021509bdc8 RCX: c100
RDX: 0001 RSI:  RDI: 88021509bdcc
RBP: 8801f36efb58 R08: 88021f216320 R09: 0010
R10: 88021f216320 R11: 0023fee6c5a1 R12: 88020e943840
R13: 88021509bdcc R14:  R15: 88021509bdd0
FS:  7f21adb86740() GS:88021f20() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2:  CR3: 000215bce000 CR4: 001406f0
Stack:
 88021509bdd0   c0137c80
 88021509bdc8 8801f5944000 0001 c0136b00
 880213e52000 88021509bdc8 84661856 88021509bd80
Call Trace:
 [] ? mutex_lock+0x16/0x25
 [] ? cinergyt2_power_ctrl+0x1f/0x60 [dvb_usb_cinergyT2]
 [] ? dvb_usb_device_init+0x21e/0x5d0 [dvb_usb]
 [] ? cinergyt2_usb_probe+0x21/0x50 [dvb_usb_cinergyT2]
 [] ? usb_probe_interface+0xf3/0x2a0
 [] ? driver_probe_device+0x208/0x2b0
 [] ? __driver_attach+0x87/0x90
 [] ? driver_probe_device+0x2b0/0x2b0
 [] ? bus_for_each_dev+0x52/0x80
 [] ? bus_add_driver+0x1a3/0x220
 [] ? driver_register+0x56/0xd0
 [] ? usb_register_driver+0x77/0x130
 [] ? 0xc013a000
 [] ? do_one_initcall+0x46/0x180
 [] ? free_vmap_area_noflush+0x38/0x70
 [] ? kmem_cache_alloc+0x84/0xc0
 [] ? do_init_module+0x50/0x1be
 [] ? load_module+0x1d8b/0x2100
 [] ? find_symbol_in_section+0xa0/0xa0
 [] ? SyS_finit_module+0x89/0x90
 [] ? entry_SYSCALL_64_fastpath+0x13/0x94
Code: e8 a7 1d 00 00 8b 03 83 f8 01 0f 84 97 00 00 00 48 8b 43 10 4c
8d 7b 08 48 89 63 10 4c 89 3c 24 41 be ff ff ff ff 48 89 44 24 08 <48>
89 20 4c 89 64 24 10 eb 1a 49 c7 44 24 08 02 00 00 00 c6 43
RIP  [] __mutex_lock_slowpath+0x6f/0x100
 RSP 
CR2: 
---[ end trace 648d79474da94e34 ]---


Thanks, Jörg
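
The trace above ends in __mutex_lock_slowpath(), reached from cinergyt2_power_ctrl() during dvb_usb_device_init(). The replies in this thread attribute it to data_mutex being taken before it has been initialized. What follows is only a minimal userspace sketch of that ordering problem and of the fix that was discussed (mutex owned by the device structure and initialized before any driver callback runs). All toy_* names are hypothetical stand-ins, not kernel APIs, and the toy mutex merely records whether it was initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel mutex: it only records its state. */
struct toy_mutex {
    int initialized;
    int locked;
};

static void toy_mutex_init(struct toy_mutex *m)
{
    m->initialized = 1;
    m->locked = 0;
}

/* Returns 0 on success, -1 where the real kernel would have oopsed. */
static int toy_mutex_lock(struct toy_mutex *m)
{
    if (!m->initialized)
        return -1;
    m->locked = 1;
    return 0;
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
    m->locked = 0;
}

/* Driver-private state; in the buggy layout the mutex lives here. */
struct toy_state {
    unsigned char data[64];
    struct toy_mutex late_mutex;    /* initialized only at attach time */
};

struct toy_device {
    struct toy_mutex data_mutex;    /* fixed layout: owned by the core */
    struct toy_state priv;
};

/* Power callback; picks the mutex according to the layout being modelled. */
static int toy_power_ctrl(struct toy_device *d, int fixed)
{
    struct toy_mutex *m = fixed ? &d->data_mutex : &d->priv.late_mutex;

    if (toy_mutex_lock(m))
        return -1;
    /* ... fill d->priv.data and talk to the device here ... */
    toy_mutex_unlock(m);
    return 0;
}

/* Core init: the power callback runs before the driver's attach step. */
static int toy_device_init(struct toy_device *d, int fixed)
{
    int ret;

    assert(d != NULL);
    memset(d, 0, sizeof(*d));
    if (fixed)
        toy_mutex_init(&d->data_mutex);     /* before any callback */

    ret = toy_power_ctrl(d, fixed);

    toy_mutex_init(&d->priv.late_mutex);    /* "attach": too late for power_ctrl */
    return ret;
}

/* Convenience wrapper: probe a fresh device, return the power_ctrl result. */
static int toy_probe(int fixed)
{
    struct toy_device d;

    return toy_device_init(&d, fixed);
}
```

With this model, toy_probe(0) fails in the power callback because the driver-private mutex has not been set up yet, while toy_probe(1) succeeds because the core initialized its own mutex first.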

Re: [4.9-rc1] Build-time 2x slower

2016-10-20 Thread Jörg Otte

2016-10-20 13:14 GMT+02:00 Rafael J. Wysocki :
> On Thursday, October 20, 2016 09:57:45 AM Jörg Otte wrote:
>> 2016-10-19 22:55 GMT+02:00 Rafael J. Wysocki :
>> > On Wednesday, October 19, 2016 06:59:35 PM Jörg Otte wrote:
>> >> 2016-10-19 17:29 GMT+02:00 Linus Torvalds :
>> >> > On Wed, Oct 19, 2016 at 4:07 AM, Jörg Otte  wrote:
>> >> >>
>> >> >> Additional info: I usually use the schedutil governor.
>> >> >> If I switch to the performance governor, the problems go away.
>> >> >> Maybe a cpufreq problem?
>> >> >
>> >> > Oh, I completely misread the original bug report, and then didn't read
>> >> > your confirmation email right.
>> >> >
>> >> > I thought you had a slower build of the different kernels (when
>> >> > building on the same kernel), and that the _build_ itself had slowed
>> >> > down for some reason. But you're actually saying that doing the _same_
>> >> > build actually takes longer when running on 4.9-rc1.
>> >>
>> >> Exactly!
>> >>
>> >> Btw: ondemand governor is also good.
>> >>
>> >> > There are a few small cpufreq changes there in between commit
>> >> > 29fbff8698fc (that you reported was fine - please tell me I got _that_
>> >> > right, at least?) and 4.9-rc1.
>> >>
>> >> Perfect! That's what I mean.
>> >>
>> >> > Adding Rafael to the cc.
>> >> >
>> >> > That said, none of them look all that likely to me. It *would* be good
>> >> > if you could bisect it a bit (perhaps not fully, but a couple of
>> >> > bisection steps to narrow down what area it is).
>> >>
>> >> I'll try that tomorrow.
>> >
>> > Well, please try commit ef98988ba369 (Merge tag 'pm-extra-4.9-rc1' of 
>> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm) which is the
>> > merge introducing the late cpufreq changes.  If the issue is there, please
>> > try to revert commit 899bb6642f2a (cpufreq: skip invalid entries when
>> > searching the frequency) which is the only cpufreq one that may matter
>> > for the schedutil governor (and I have one fix for that commit queued up
>> > already).
>> >
>>
>> I first tried the merge, but git said I'm already up to date (my tree
>> is at 1a1891d Merge tag 'for-f2fs-4.9-rc2' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs already)
>>
>> Then I did the revert of 899bb6642f2a and
>> that worked fine for me.
>
> OK, thanks!
>
> Pointer arithmetic is somewhat messed up in that commit, which may be the
> reason for what you see.
>
> Please check if this patch helps (instead of the revert):
>
> https://patchwork.kernel.org/patch/9379639/
>
Yes, works well for me.

Thanks, Jörg

Re: [4.9-rc1] Build-time 2x slower

2016-10-20 Thread Jörg Otte

2016-10-19 22:55 GMT+02:00 Rafael J. Wysocki :
> On Wednesday, October 19, 2016 06:59:35 PM Jörg Otte wrote:
>> 2016-10-19 17:29 GMT+02:00 Linus Torvalds :
>> > On Wed, Oct 19, 2016 at 4:07 AM, Jörg Otte  wrote:
>> >>
>> >> Additional info: I usually use the schedutil governor.
>> >> If I switch to the performance governor, the problems go away.
>> >> Maybe a cpufreq problem?
>> >
>> > Oh, I completely misread the original bug report, and then didn't read
>> > your confirmation email right.
>> >
>> > I thought you had a slower build of the different kernels (when
>> > building on the same kernel), and that the _build_ itself had slowed
>> > down for some reason. But you're actually saying that doing the _same_
>> > build actually takes longer when running on 4.9-rc1.
>>
>> Exactly!
>>
>> Btw: ondemand governor is also good.
>>
>> > There are a few small cpufreq changes there in between commit
>> > 29fbff8698fc (that you reported was fine - please tell me I got _that_
>> > right, at least?) and 4.9-rc1.
>>
>> Perfect! That's what I mean.
>>
>> > Adding Rafael to the cc.
>> >
>> > That said, none of them look all that likely to me. It *would* be good
>> > if you could bisect it a bit (perhaps not fully, but a couple of
>> > bisection steps to narrow down what area it is).
>>
>> I'll try that tomorrow.
>
> Well, please try commit ef98988ba369 (Merge tag 'pm-extra-4.9-rc1' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm) which is the
> merge introducing the late cpufreq changes.  If the issue is there, please
> try to revert commit 899bb6642f2a (cpufreq: skip invalid entries when
> searching the frequency) which is the only cpufreq one that may matter for
> the schedutil governor (and I have one fix for that commit queued up already).
>

I first tried the merge, but git said I'm already up to date (my tree
is at 1a1891d Merge tag 'for-f2fs-4.9-rc2' of
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs already)

Then I did the revert of 899bb6642f2a and
that worked fine for me.

Thanks, Jörg

Re: [4.9-rc1] Build-time 2x slower

2016-10-19 Thread Jörg Otte

2016-10-19 17:29 GMT+02:00 Linus Torvalds :
> On Wed, Oct 19, 2016 at 4:07 AM, Jörg Otte  wrote:
>>
>> Additional info: I usually use the schedutil governor.
>> If I switch to the performance governor, the problems go away.
>> Maybe a cpufreq problem?
>
> Oh, I completely misread the original bug report, and then didn't read
> your confirmation email right.
>
> I thought you had a slower build of the different kernels (when
> building on the same kernel), and that the _build_ itself had slowed
> down for some reason. But you're actually saying that doing the _same_
> build actually takes longer when running on 4.9-rc1.

Exactly!

Btw: ondemand governor is also good.

> There are a few small cpufreq changes there in between commit
> 29fbff8698fc (that you reported was fine - please tell me I got _that_
> right, at least?) and 4.9-rc1.

Perfect! That's what I mean.

> Adding Rafael to the cc.
>
> That said, none of them look all that likely to me. It *would* be good
> if you could bisect it a bit (perhaps not fully, but a couple of
> bisection steps to narrow down what area it is).

I'll try that tomorrow.

Jörg

Re: [4.9-rc1] Build-time 2x slower

2016-10-19 Thread Jörg Otte

2016-10-18 19:19 GMT+02:00 Linus Torvalds :
> On Tue, Oct 18, 2016 at 9:49 AM, Jörg Otte  wrote:
>> On Tue, Oct 18, 2016 at 12:04 AM, Sedat Dilek  wrote:
>>>
>>> not sure whom to address on this issue.
>>>
>>> I have built Linux v4.9-rc1, v4.8.2 and v4.4.25 kernels (in this
>>> order) this morning.
>>>
>>> Building a Linux v4.8.2 under Linux v4.9-rc1 took two times longer.
>>>
>>> As usual, I build with 2 parallel make jobs.
>>> This takes approx. 30mins.
>>> Under Linux v4.9-rc1 it took approx. an hour.
>>>
>>> My system is a Ubuntu/precise AMD64 (WUBI installation).
>>> I use my normal build-environment.
>>
>> I can confirm the problem. I use 3 build jobs in parallel
>> and the kernel build takes 2.5 times longer.
>>
>> I see only 1 (of 4) cores running at maximum frequency.
>> The others are running at minimum frequency. And this does not
>> seem to be limited to build jobs.
>>
>> The last known good kernel for me is  ..-4.8.0-14604-g29fbff8
>
> Well, there are a few merges in 4.9-rc1 since that
> 4.8.0-14604-g29fbff8 version, but the obvious ones are my pulls from:
>
>   Michal Marek (2):
>  kbuild updates
>  misc kbuild changes
>
> (My merge commit ID's are 50cff89837a4 and 84d69848c97f) with
> everything else looking like "normal code updates".
>
> Michal: a 2.5x slowdown of the kernel build was presumably *not* intentional.
>
> I'm not seeing anything obvious, but if it's spending a lot more time
> in fixdep, then it's that "strstr()" change. That commit seems to
> assume that strstr() is fast, which is a debatable assumption and
> might be wrong in some environments.
>
> But even with a "strstr()" written by a sloth that was dropped on its
> head a few too many times when young, I can't see it being *that* much
> slower.
>
> Can you do just a silly
>
>perf record make -j8
>
> of the bad build, and see if something stands out when you do "perf report"?
>
> But maybe Michal has some ideas.
>
>   Linus

Additional info: I usually use the schedutil governor.
If I switch to the performance governor, the problems go away.
Maybe a cpufreq problem?

Jörg
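
Because the slowdown depends on which cpufreq governor is active, it can help to record the governor alongside each build-time measurement. The following is a minimal sketch in C, assuming the usual Linux cpufreq sysfs layout (for example /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor); the helper names read_governor() and governor_is() are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read a governor name from a sysfs-style file into buf.
 * Returns 0 on success, -1 if the file cannot be opened or read.
 */
static int read_governor(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(path != NULL && buf != NULL && len > 0);
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';     /* strip the trailing newline */
    return 0;
}

/*
 * Returns 1 if the file names the given governor, 0 if it names another
 * one, and -1 if the file could not be read (e.g. cpufreq not available).
 */
static int governor_is(const char *path, const char *name)
{
    char buf[64];

    if (read_governor(path, buf, sizeof(buf)) != 0)
        return -1;
    return strcmp(buf, name) == 0;
}
```

Calling governor_is() with the cpu0 path above and "schedutil" before starting a timed build would make it explicit which configuration a given measurement belongs to.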

Re: [4.9-rc1] Build-time 2x slower

2016-10-18 Thread Jörg Otte

2016-10-18 11:30 GMT+02:00 Sedat Dilek :
> On Tue, Oct 18, 2016 at 3:38 AM, Ming Lei  wrote:
>> On Tue, Oct 18, 2016 at 12:04 AM, Sedat Dilek  wrote:
>>> Hi Linus,
>>>
>>> not sure whom to address on this issue.
>>>
>>> I have built Linux v4.9-rc1, v4.8.2 and v4.4.25 kernels (in this
>>> order) this morning.
>>>
>>> Building a Linux v4.8.2 under Linux v4.9-rc1 took two times longer.
>>>
>>> As usual, I build with 2 parallel make jobs.
>>> This takes approx. 30mins.
>>> Under Linux v4.9-rc1 it took approx. an hour.
>>>
>>> My system is a Ubuntu/precise AMD64 (WUBI installation).
>>> I use my normal build-environment.
>>>
>>> If you need further information, please let me know.
>>
>> Kernel building is more like a CPU-bound workload, so maybe
>> some clues can be got by comparing results of 'perf top/record',
>> which should be very easy to collect.
>>
>
> I don't have much experience with perf.
> The last time I played with it was when testing the early days of the
> "lockdep" feature.
> Can you give some clear instructions on how to use perf top/record in
> this scenario?
>
> - Sedat -
>
>>>
>>> Regards,
>>> - Sedat -
>>>
>>> P.S.: Listing content of attached tarball file.
>>>
>>> $ LC_ALL=C ls -R
>>> .:
>>> build-time  configs  scripts  toolchain-amd64
>>>
>>> ./build-time:
>>> ls-alt_building-4-4-25-under-4-8-2.txt
>>> ls-alt_building-4-8-2-under-4-9-0-rc1.txt
>>>
>>> ./configs:
>>> WHATS-IN-DILEKS-LINUX-KERNEL.txt  config-4.4.25-1-iniza-small
>>> config-4.8.2-1-iniza-small  config-4.9.0-rc1-1-iniza-small
>>>
>>> ./scripts:
>>> build_linux-upstream.sh
>>>
>>> ./toolchain-amd64:
>>> HOST-AND-BUILD-TOOLCHAIN.txt
>>
>>
>>
>> Thanks,
>> Ming Lei

I can confirm the problem. I use 3 build jobs in parallel
and the kernel build takes 2.5 times longer.

I see only 1 (of 4) cores running at maximum frequency.
The others are running at minimum frequency. And this does not
seem to be limited to build jobs.

The last known good kernel for me is  ..-4.8.0-14604-g29fbff8

Jörg

Re: [PATCH v2] cinergyT2-core: don't do DMA on stack

2016-10-07 Thread Jörg Otte

2016-10-06 20:29 GMT+02:00 Mauro Carvalho Chehab :
> On Thu, 6 Oct 2016 10:27:56 -0700
> Andy Lutomirski wrote:
>
>> On Wed, Oct 5, 2016 at 11:58 AM, Mauro Carvalho Chehab
>>  wrote:
>> > Sorry, I forgot to Cc the people on the "Re: Problem with VMAP_STACK=y"
>> > thread.
>> >
>> > Forwarded message:
>> >
>> > Date: Wed,  5 Oct 2016 15:54:18 -0300
>> > From: Mauro Carvalho Chehab 
>> > To: Linux Doc Mailing List 
>> > Cc: Mauro Carvalho Chehab , Mauro Carvalho 
>> > Chehab , Mauro Carvalho Chehab 
>> > Subject: [PATCH v2] cinergyT2-core: don't do DMA on stack
>> >
>> >
>> > The USB control messages require DMA to work. We cannot pass
>> > a stack-allocated buffer, as it is not guaranteed that the
>> > stack is in a DMA-enabled area.
>> >
>> > Signed-off-by: Mauro Carvalho Chehab 
>> > ---
>> >
>> > Added the fixups made by Johannes Stezenbach
>> >
>> >  drivers/media/usb/dvb-usb/cinergyT2-core.c | 45 
>> > ++
>> >  1 file changed, 27 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/drivers/media/usb/dvb-usb/cinergyT2-core.c 
>> > b/drivers/media/usb/dvb-usb/cinergyT2-core.c
>> > index 9fd1527494eb..8267e3777af6 100644
>> > --- a/drivers/media/usb/dvb-usb/cinergyT2-core.c
>> > +++ b/drivers/media/usb/dvb-usb/cinergyT2-core.c
>> > @@ -41,6 +41,7 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
>> >
>> >  struct cinergyt2_state {
>> > u8 rc_counter;
>> > +   unsigned char data[64];
>> >  };
>> >
>> >  /* We are missing a release hook with usb_device data */
>> > @@ -50,29 +51,36 @@ static struct dvb_usb_device_properties 
>> > cinergyt2_properties;
>> >
>> >  static int cinergyt2_streaming_ctrl(struct dvb_usb_adapter *adap, int 
>> > enable)
>> >  {
>> > -   char buf[] = { CINERGYT2_EP1_CONTROL_STREAM_TRANSFER, enable ? 1 : 
>> > 0 };
>> > -   char result[64];
>> > -   return dvb_usb_generic_rw(adap->dev, buf, sizeof(buf), result,
>> > -   sizeof(result), 0);
>> > +   struct dvb_usb_device *d = adap->dev;
>> > +   struct cinergyt2_state *st = d->priv;
>> > +
>> > +   st->data[0] = CINERGYT2_EP1_CONTROL_STREAM_TRANSFER;
>> > +   st->data[1] = enable ? 1 : 0;
>> > +
>> > +   return dvb_usb_generic_rw(d, st->data, 2, st->data, 64, 0);
>> >  }
>> >
>> >  static int cinergyt2_power_ctrl(struct dvb_usb_device *d, int enable)
>> >  {
>>
>> This...
>>
>> > -   char buf[] = { CINERGYT2_EP1_SLEEP_MODE, enable ? 0 : 1 };
>> > -   char state[3];
>> > -   return dvb_usb_generic_rw(d, buf, sizeof(buf), state, 
>> > sizeof(state), 0);
>> > +   struct cinergyt2_state *st = d->priv;
>> > +
>> > +   st->data[0] = CINERGYT2_EP1_SLEEP_MODE;
>>
>> ...does not match this:
>>
>> > +   st->data[1] = enable ? 1 : 0;
>>
>> --Andy
>
> Gah! Yes. This is what happens when coding using cut-and-paste ;)
>
> Jörg,
>
> Please test it with the condition reversed with the enclosed patch.
>
> If this doesn't work, you can enable dvb-usb debug at runtime
> by loading it with the debug parameter:
>
> parm:   debug:set debugging level (1=info,xfer=2,pll=4,ts=8,err=16,rc=32,fw=64,mem=128,uxfer=256  (or-able)). (debugging is not enabled) (int)
>
> debug=2 should show the control messages sent to the device on dmesg.
>
> Regards,
> Mauro
>
>
> [PATCH] cinergyT2-core: don't do DMA on stack
>
> The USB control messages require DMA to work. We cannot pass
> a stack-allocated buffer, as it is not guaranteed that the
> stack is in a DMA-enabled area.
>
> Signed-off-by: Mauro Carvalho Chehab 
>
> diff --git a/drivers/media/usb/dvb-usb/cinergyT2-core.c 
> b/drivers/media/usb/dvb-usb/cinergyT2-core.c
> index 9fd1527494eb..91640c927776 100644
> --- a/drivers/media/usb/dvb-usb/cinergyT2-core.c
> +++ b/drivers/media/usb/dvb-usb/cinergyT2-core.c
> @@ -41,6 +41,7 @@ DVB_DEFINE_MOD_OPT_ADAPTER_NR(adapter_nr);
>
>  struct cinergyt2_state {
> u8 rc_counter;
> +   unsigned char data[64];
>  };
>
>  /* We are missing a release hook with usb_device data */
> @@ -50,29 +51,36 @@ static struct dvb_usb_device_properties 
> cinergyt2_properties;
>
>  static int cinergyt2_streaming_ctrl(struct dvb_usb_adapter *adap, int enable)
>  {
> -   char buf[] = { CINERGYT2_EP1_CONTROL_STREAM_TRANSFER, enable ? 1 : 0 
> };
> -   char result[64];
> -   return dvb_usb_generic_rw(adap->dev, buf, sizeof(buf), result,
> -   sizeof(result), 0);
> +   struct dvb_usb_device *d = adap->dev;
> +   struct cinergyt2_state *st = d->priv;
> +
> +   st->data[0] = CINERGYT2_EP1_CONTROL_STREAM_TRANSFER;
> +   st->data[1] = enable ? 1 : 0;
> +
> +   return dvb_usb_generic_rw(d, st->data, 2, st->data, 64, 0);
>  }
>
>  static int cinergyt2_power_ctrl(struct dvb_usb_device *d, int enable)
>  {
> -   char buf[] = { CINERGYT2_EP1_SLEEP_MODE, enable ? 0 : 1 };
> -   char state[3];
> -   return dvb_usb_generic_rw(d, buf, sizeof(buf), state, sizeof(state), 
> 0);
> +

Re: Problem with VMAP_STACK=y

2016-10-06 Thread Jörg Otte

2016-10-05 20:55 GMT+02:00 Mauro Carvalho Chehab :
> Hi Johannes,
>
> On Wed, 5 Oct 2016 20:29:45 +0200
> Johannes Stezenbach wrote:
>
>> On Wed, Oct 05, 2016 at 06:04:50AM -0300, Mauro Carvalho Chehab wrote:
>> >  static int cinergyt2_frontend_attach(struct dvb_usb_adapter *adap)
>> >  {
>> > -   char query[] = { CINERGYT2_EP1_GET_FIRMWARE_VERSION };
>> > -   char state[3];
>> > +   struct dvb_usb_device *d = adap->dev;
>> > +   struct cinergyt2_state *st = d->priv;
>> > int ret;
>> >
>> > adap->fe_adap[0].fe = cinergyt2_fe_attach(adap->dev);
>> >
>> > -   ret = dvb_usb_generic_rw(adap->dev, query, sizeof(query), state,
>> > -   sizeof(state), 0);
>>
>> it seems to miss this:
>>
>>   st->data[0] = CINERGYT2_EP1_GET_FIRMWARE_VERSION;
>>
>> > +   ret = dvb_usb_generic_rw(d, st->data, 1, st->data, 3, 0);
>> > if (ret < 0) {
>> > deb_rc("cinergyt2_power_ctrl() Failed to retrieve sleep "
>> > "state info\n");
>> > @@ -141,13 +147,14 @@ static int repeatable_keys[] = {
>> >  static int cinergyt2_rc_query(struct dvb_usb_device *d, u32 *event, int 
>> > *state)
>> >  {
>> > struct cinergyt2_state *st = d->priv;
>> > -   u8 key[5] = {0, 0, 0, 0, 0}, cmd = CINERGYT2_EP1_GET_RC_EVENTS;
>> > int i;
>> >
>> > *state = REMOTE_NO_KEY_PRESSED;
>> >
>> > -   dvb_usb_generic_rw(d, &cmd, 1, key, sizeof(key), 0);
>> > -   if (key[4] == 0xff) {
>> > +   st->data[0] = CINERGYT2_EP1_SLEEP_MODE;
>>
>> should probably be
>>
>>   st->data[0] = CINERGYT2_EP1_GET_RC_EVENTS;
>>
>> > +
>> > +   dvb_usb_generic_rw(d, st->data, 1, st->data, 5, 0);
>>
>>
>> HTH,
>> Johannes
>
>
> Thanks for the review! Yeah, you're right: both firmware and remote
> controller logic would be broken without the above fixes.
>
> Just sent a version 2 of this patch to the ML with the above fixes.
>
> Regards,
> Mauro

Applied V2 of the patch. Unfortunately no progress.
No video, no error messages.

Jörg
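
The review comments on this patch come down to two things: every request has to place its own command byte in the shared buffer, and the sleep-mode request should keep the polarity of the original code (enable ? 0 : 1). Below is a minimal userspace sketch of both points, with the buffer kept in heap-allocated driver state rather than on the stack. The TOY_* command values and all toy_* helpers are placeholders for illustration and are not taken from the driver headers; toy_generic_rw() only records what would have been sent.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder command codes; the real values live in the driver header. */
enum toy_cmd {
    TOY_CMD_STREAM_TRANSFER = 1,
    TOY_CMD_GET_RC_EVENTS,
    TOY_CMD_SLEEP_MODE,
    TOY_CMD_GET_FIRMWARE_VERSION
};

/* Driver state, heap-allocated so the transfer buffer is never on the stack. */
struct toy_state {
    unsigned char data[64];
    unsigned char last_sent[2];     /* what the mock transfer saw */
};

/* Mock transfer: records the request bytes instead of doing USB I/O. */
static int toy_generic_rw(struct toy_state *st, size_t wlen)
{
    assert(st != NULL && wlen >= 1 && wlen <= sizeof(st->last_sent));
    memset(st->last_sent, 0, sizeof(st->last_sent));
    memcpy(st->last_sent, st->data, wlen);
    return 0;
}

static int toy_power_ctrl(struct toy_state *st, int enable)
{
    st->data[0] = TOY_CMD_SLEEP_MODE;
    st->data[1] = enable ? 0 : 1;   /* powering on means sleep mode off */
    return toy_generic_rw(st, 2);
}

static int toy_rc_query(struct toy_state *st)
{
    st->data[0] = TOY_CMD_GET_RC_EVENTS;    /* not the sleep-mode command */
    return toy_generic_rw(st, 1);
}

/* Returns the sleep flag sent for a given enable value, or -1 on failure. */
static int toy_sleep_flag_for(int enable)
{
    struct toy_state *st = calloc(1, sizeof(*st));
    int flag = -1;

    if (!st)
        return -1;
    if (toy_power_ctrl(st, enable) == 0 &&
        st->last_sent[0] == TOY_CMD_SLEEP_MODE)
        flag = st->last_sent[1];
    free(st);
    return flag;
}

/* Returns the command byte sent by the remote-control query, or -1. */
static int toy_rc_query_cmd(void)
{
    struct toy_state *st = calloc(1, sizeof(*st));
    int cmd = -1;

    if (!st)
        return -1;
    if (toy_rc_query(st) == 0)
        cmd = st->last_sent[0];
    free(st);
    return cmd;
}
```

Checking the recorded bytes this way catches cut-and-paste slips such as a wrong command byte or an inverted flag without needing the hardware.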

Re: Problem with VMAP_STACK=y

2016-10-05 Thread Jörg Otte

2016-10-05 17:53 GMT+02:00 Andy Lutomirski :
> On Wed, Oct 5, 2016 at 8:21 AM, Jörg Otte  wrote:
>> 2016-10-05 11:04 GMT+02:00 Mauro Carvalho Chehab :
>>> On Wed, 5 Oct 2016 09:50:42 +0200 (CEST)
>>> Jiri Kosina wrote:
>>>
>>>> On Wed, 5 Oct 2016, Patrick Boettcher wrote:
>>>>
>>>> > > > Thanks for the quick response.
>>>> > > > Drivers are:
>>>> > > > dvb_core, dvb_usb, dbv_usb_cynergyT2
>>>> > >
>>>> > > This dbv_usb_cynergyT2 is not from Linus' tree, is it? I don't seem
>>>> > > to be able to find it, and the only google hit I am getting is your
>>>> > > very mail to LKML :)
>>>> >
>>>> > It's a typo, it should say dvb_usb_cinergyT2.
>>>>
>>>> Ah, thanks. Same issues there in
>>>>
>>>>   cinergyt2_frontend_attach()
>>>>   cinergyt2_rc_query()
>>>>
>>>> I think this would require more in-depth review of all the media drivers
>>>> and having all this fixed for 4.9. It should be pretty straightforward;
>>>> all the instances I've seen so far should be just straightforward
>>>> conversion to kmalloc() + kfree(), as the buffer is not being embedded in
>>>> any structure etc.
>>>
>>> What we're doing in most cases is to put a buffer (usually with 80
>>> chars for USB drivers) inside the "state" struct (on DVB drivers),
>>> in order to avoid doing kmalloc/kfree all the time. One such patch is
>>> changeset c4a98793a63c4.
>>>
>>> I'm enclosing a non-tested patch fixing it for the cinergyT2-core.c
>>> driver.
>>>
>>> Thanks,
>>> Mauro
>>>
>>> [PATCH] cinergyT2-core: don't do DMA on stack
>>>
>>
>> Tried the cinergyT2 patch. No success. When I select a TV channel
>> just nothing happens. There are no error messages.
>
> Could you try compiling with CONFIG_DEBUG_VIRTUAL=y and seeing if you
> get a nice error message?
>
> --Andy

Done. Still no error messages in dmesg and syslog.

The DVB adapter signals that it is tuned to the channel.
To me it looks like no data is reaching the application
(similar to when I forget to connect an antenna).

Jörg

Re: Problem with VMAP_STACK=y

2016-10-05 Thread Jörg Otte

2016-10-05 11:04 GMT+02:00 Mauro Carvalho Chehab :
> On Wed, 5 Oct 2016 09:50:42 +0200 (CEST)
> Jiri Kosina wrote:
>
>> On Wed, 5 Oct 2016, Patrick Boettcher wrote:
>>
>> > > > Thanks for the quick response.
>> > > > Drivers are:
>> > > > dvb_core, dvb_usb, dbv_usb_cynergyT2
>> > >
>> > > This dbv_usb_cynergyT2 is not from Linus' tree, is it? I don't seem
>> > > to be able to find it, and the only google hit I am getting is your
>> > > very mail to LKML :)
>> >
>> > It's a typo, it should say dvb_usb_cinergyT2.
>>
>> Ah, thanks. Same issues there in
>>
>>   cinergyt2_frontend_attach()
>>   cinergyt2_rc_query()
>>
>> I think this would require more in-depth review of all the media drivers
>> and having all this fixed for 4.9. It should be pretty straightforward;
>> all the instances I've seen so far should be just straightforward
>> conversion to kmalloc() + kfree(), as the buffer is not being embedded in
>> any structure etc.
>
> What we're doing in most cases is to put a buffer (usually with 80
> chars for USB drivers) inside the "state" struct (on DVB drivers),
> in order to avoid doing kmalloc/kfree all the time. One such patch is
> changeset c4a98793a63c4.
>
> I'm enclosing a non-tested patch fixing it for the cinergyT2-core.c
> driver.
>
> Thanks,
> Mauro
>
> [PATCH] cinergyT2-core: don't do DMA on stack
>

Tried the cinergyT2 patch. No success. When I select a TV channel
just nothing happens. There are no error messages.

Jörg

Re: Problem with VMAP_STACK=y

2016-10-04 Thread Jörg Otte

2016-10-04 15:26 GMT+02:00 Jiri Kosina :
> On Tue, 4 Oct 2016, Jörg Otte wrote:
>
>> With kernel 4.8.0-01558-g21f54dd I get thousands of
>> "dvb-usb: bulk message failed: -11 (1/0)"
>> messages in the logs and the DVB adapter is not working.
>>
>> It turned out the new config option VMAP_STACK=y (which is the default)
>> is the culprit.
>> No problems for me with VMAP_STACK=n.
>
> I'd guess that this is EAGAIN coming from usb_hcd_map_urb_for_dma() as the
> DVB driver is trying to perform on-stack DMA.
>
> Not really knowing which driver exactly you're using, I quickly skimmed
> through DVB sources, and it turns out this indeed seems to be rather
> common antipattern, and it should be fixed nevertheless. See
>
> cxusb_ctrl_msg()
> dibusb_power_ctrl()
> dibusb2_0_streaming_ctrl()
> dibusb2_0_power_ctrl()
> digitv_ctrl_msg()
> dtt200u_fe_init()
> dtt200u_fe_set_frontend()
> dtt200u_power_ctrl()
> dtt200u_streaming_ctrl()
> dtt200u_pid_filter()
>
> Adding relevant CCs.
>
> --
> Jiri Kosina
> SUSE Labs

Thanks for the quick response.
Drivers are:
dvb_core, dvb_usb, dbv_usb_cynergyT2


Jörg
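
A side note on the -11 in the log line: on Linux, errno 11 is EAGAIN, which fits Jiri's guess quoted above. The following is only a minimal Python sketch; the helper name decode_bulk_error and the regular expression are illustrative assumptions based on the single log line in this thread, not anything taken from dvb-usb itself.

```python
import errno
import re

# Matches lines such as: dvb-usb: bulk message failed: -11 (1/0)
# The pattern is an assumption derived from the one log line quoted above.
BULK_ERR = re.compile(r"bulk message failed: -(\d+)")


def decode_bulk_error(line):
    """Return (code, name) for a dvb-usb bulk error line, or None if no match."""
    match = BULK_ERR.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    # errno.errorcode maps numbers to names for the platform Python runs on;
    # on Linux, 11 is EAGAIN.
    return code, errno.errorcode.get(code, "unknown")


if __name__ == "__main__":
    print(decode_bulk_error("dvb-usb: bulk message failed: -11 (1/0)"))
```

The lookup is platform dependent, so the symbolic name is only meaningful when run on the same kind of system that produced the log.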

Problem with VMAP_STACK=y

2016-10-04 Thread Jörg Otte

With kernel 4.8.0-01558-g21f54dd I get thousands of
"dvb-usb: bulk message failed: -11 (1/0)"
messages in the logs and the DVB adapter is not working.

It turned out the new config option VMAP_STACK=y (which is the default)
is the culprit.
No problems for me with VMAP_STACK=n.


Thanks, Jörg

Re: Drm compile errors.

2016-08-03 Thread Jörg Otte

2016-08-03 18:49 GMT+02:00 Linus Torvalds :
> On Wed, Aug 3, 2016 at 10:43 AM, Jörg Otte  wrote:
>> Current mainline kernel does not compile for me:
>
> You've managed to turn off DEBUG_FS, which is actually fairly hard
> considering how many things select it.
>
> Does the attached trivial patch fix it for you?
>
>  Linus

Yes, patch works for me.

Thanks Jörg

Drm compile errors.

2016-08-03 Thread Jörg Otte

Current mainline kernel does not compile for me:

In file included from
/media/jojo/deftoshiba/kernel/linux/drivers/gpu/drm/i915/i915_drv.c:48:0:
/media/jojo/deftoshiba/kernel/linux/drivers/gpu/drm/i915/i915_drv.h:
In function 'i915_debugfs_register':
/media/jojo/deftoshiba/kernel/linux/drivers/gpu/drm/i915/i915_drv.h:3654:48:
error: parameter name omitted
 static inline int i915_debugfs_register(struct drm_i915_private *) {return 0;}
^
/media/jojo/deftoshiba/kernel/linux/drivers/gpu/drm/i915/i915_drv.h:
In function 'i915_debugfs_unregister':
/media/jojo/deftoshiba/kernel/linux/drivers/gpu/drm/i915/i915_drv.h:3655:51:
error: parameter name omitted
 static inline void i915_debugfs_unregister(struct drm_i915_private *) {}
   ^
make[5]: *** [drivers/gpu/drm/i915/i915_drv.o] Error 1
make[4]: *** [drivers/gpu/drm/i915] Error 2
make[3]: *** [drivers/gpu/drm] Error 2
make[2]: *** [drivers/gpu] Error 2
make[2]: *** Waiting for unfinished jobs

first known bad kernel: 4.7.0-11470-gd52bd54
last known good kernel: 4.7.0-07962-g07f00f0

Thanks Jörg

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-04-02 Thread Jörg Otte

2016-04-02 17:28 GMT+02:00 Srinivas Pandruvada:
>
> On Sat, 2016-04-02 at 08:30 +0200, Sedat Dilek wrote:
>> > I am trying CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y from
>> > linux-pm.git#linux-next out of curiosity...
>> >
>> > $ ./scripts/diffconfig /boot/config-$(uname -r) .config
>> >  CPU_FREQ_DEFAULT_GOV_PERFORMANCE y -> n
>> > +CPU_FREQ_DEFAULT_GOV_SCHEDUTIL y
>> > +CPU_FREQ_GOV_ATTR_SET y
>> > +CPU_FREQ_GOV_SCHEDUTIL y
>> >
>> > ...will report.
>> >
>>
>> Not sure why I see here "powersave".
>> Does Intel-PState driver not support CPU_FREQ_GOV_SCHEDUTIL?
>>
>> $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver
>> intel_pstate
>> intel_pstate
>> intel_pstate
>> intel_pstate
>>
>> $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
>> powersave
>> powersave
>> powersave
>> powersave
>
> If you are using Ubuntu, the OS has a script which will automatically
> change from performance.
> Doug can give more information on this script.
>
> Thanks,
> Srinivas
>
>
>
>>
>> See also attached files.
>>
>> - sed@ -

maybe:
/etc/init.d/ondemand

Thanks, Jörg
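
The two cat commands quoted above can also be done in one pass. This is a rough Python sketch, assuming the usual sysfs layout shown in the thread (cpuN/cpufreq/scaling_governor); the function name read_governors and the cpu_root parameter are made up for illustration.

```python
import glob
import os


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its current cpufreq governor."""
    governors = {}
    # cpu[0-9]* skips sibling directories such as cpufreq/ and cpuidle/.
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_governor
        with open(path) as handle:
            governors[cpu] = handle.read().strip()
    return governors


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        print(cpu, governor)
```

Running it once right after boot and again a minute or so later would show whether something in userspace changes the governor behind your back.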

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-04-01 Thread Jörg Otte

2016-04-01 18:46 GMT+02:00 Jörg Otte :
> 2016-04-01 17:20 GMT+02:00 Doug Smythies :
>> On 2016.04.01 05:40 Rafael J. Wysocki wrote:
>>> On Friday, April 01, 2016 11:20:42 AM Jörg Otte wrote:
>>>> 2016-03-31 17:43 GMT+02:00 Rafael J. Wysocki :
>>>>> On Thursday, March 31, 2016 05:25:18 PM Jörg Otte wrote:
>>>>>> 2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki :
>>>>>>> On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote:
>>
>>>>
>>>> here they are.
>>>>
>>
>>> First of all, the sampling mechanics works as expected
>>> in the failing case, which is the most important thing
>>> I wanted to know.
>>
>> Yes, but that might be part of the problem, as for some CPUs
>> there is never a long duration, and thus the long duration
>> check never kicks in driving the target pstate down.
>>
>>> The core_busy column is clearly suspicious and it
>>> looks like CPUs 2 and 3 never really go idle.
>>
>> This has been observed several times before [1].
>> Due to beat frequencies between desktop type frame rates
>> and such, the worst manifestation of the issue seems to be
>> for 300 Hz kernels, but Ubuntu uses 250 Hz.
>>
>> Oh look, Jörg is using 300 Hz!!
>>
>> $ grep CONFIG_HZ .config_jorg
>> # CONFIG_HZ_PERIODIC is not set
>> # CONFIG_HZ_100 is not set
>> # CONFIG_HZ_250 is not set
>> CONFIG_HZ_300=y
>> # CONFIG_HZ_1000 is not set
>> CONFIG_HZ=300
>>
>
> I use 300Hz because of:
> "250 Hz is a good compromise choice allowing server performance
> while also showing good interactive responsiveness even
> on SMP and NUMA systems. If you are going to be using NTSC video
> or multimedia, selected 300Hz instead." (from KBuild helptext)
>
> -> I often use multimedia so according to this text 300 Hz is the better
> choice.
>
>>> I guess we'll need to find out
>>> why they don't go idle to get to the bottom of this, but it firmly falls into
>>> the weird stuff territory already.
>
>> I'm compiling a 300 Hz kernel now, also with "# CONFIG_NO_HZ is not set",
>
> Again from KBuild helptext:
> "CONFIG_NO_HZ:
> This is the old config entry that enables dynticks idle.
> We keep it around for a little while to enforce backward
> compatibility with older config files."
>
> -> NO_HZ outdated.
>
>> but I have never been able to re-create these types of findings before.
>>
>> I have also tried several other things in an attempt to re-create Jörg's
>> case, so far without success.
>>
>> References:
>> [1] https://bugzilla.kernel.org/show_bug.cgi?id=93521
>> In particular:
>> https://bugzilla.kernel.org/show_bug.cgi?id=93521#c35
>> https://bugzilla.kernel.org/show_bug.cgi?id=93521#c42
>> https://bugzilla.kernel.org/show_bug.cgi?id=93521#c77
>>
>> ... Doug
>
>
> Nevertheless, I'll try setting 250Hz + NO_HZ
>
For me no improvements.
Neither 300->250Hz  nor  NO_HZ_IDLE + NO_HZ

Thanks, Jörg

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-04-01 Thread Jörg Otte

2016-04-01 17:20 GMT+02:00 Doug Smythies :
> On 2016.04.01 05:40 Rafael J. Wysocki wrote:
>> On Friday, April 01, 2016 11:20:42 AM Jörg Otte wrote:
>>> 2016-03-31 17:43 GMT+02:00 Rafael J. Wysocki :
>>>> On Thursday, March 31, 2016 05:25:18 PM Jörg Otte wrote:
>>>>> 2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki :
>>>>>> On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote:
>
>>>
>>> here they are.
>>>
>
>> First of all, the sampling mechanics works as expected
>> in the failing case, which is the most important thing
>> I wanted to know.
>
> Yes, but that might be part of the problem, as for some CPUs
> there is never a long duration, and thus the long duration
> check never kicks in driving the target pstate down.
>
>> The core_busy column is clearly suspicious and it
>> looks like CPUs 2 and 3 never really go idle.
>
> This has been observed several times before [1].
> Due to beat frequencies between desktop type frame rates
> and such, the worst manifestation of the issue seems to be
> for 300 Hz kernels, but Ubuntu uses 250 Hz.
>
> Oh look, Jörg is using 300 Hz!!
>
> $ grep CONFIG_HZ .config_jorg
> # CONFIG_HZ_PERIODIC is not set
> # CONFIG_HZ_100 is not set
> # CONFIG_HZ_250 is not set
> CONFIG_HZ_300=y
> # CONFIG_HZ_1000 is not set
> CONFIG_HZ=300
>

I use 300Hz because of:
"250 Hz is a good compromise choice allowing server performance
while also showing good interactive responsiveness even
on SMP and NUMA systems. If you are going to be using NTSC video
or multimedia, selected 300Hz instead." (from KBuild helptext)

-> I often use multimedia so according to this text 300 Hz is the better
choice.

>> I guess we'll need to find out
>> why they don't go idle to get to the bottom of this, but it firmly falls into
>> the weird stuff territory already.

> I'm compiling a 300 Hz kernel now, also with "# CONFIG_NO_HZ is not set",

Again from KBuild helptext:
"CONFIG_NO_HZ:
This is the old config entry that enables dynticks idle.
We keep it around for a little while to enforce backward
compatibility with older config files."

-> NO_HZ outdated.

> but I have never been able to re-create these types of findings before.
>
> I have also tried several other things in an attempt to re-create Jörg's
> case, so far without success.
>
> References:
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=93521
> In particular:
> https://bugzilla.kernel.org/show_bug.cgi?id=93521#c35
> https://bugzilla.kernel.org/show_bug.cgi?id=93521#c42
> https://bugzilla.kernel.org/show_bug.cgi?id=93521#c77
>
> ... Doug


Nevertheless, I'll try setting 250Hz + NO_HZ

Thanks, Jörg
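
For reference, the tick period that goes with each CONFIG_HZ choice is simply 1000/HZ milliseconds. A small Python sketch of that arithmetic, fed with lines from the grep output quoted above; the function name tick_period_ms and the SAMPLE constant are illustrative only.

```python
import re


def tick_period_ms(config_text):
    """Return (HZ, tick period in ms) from the text of a kernel .config."""
    # Anchored so that CONFIG_HZ_300=y and friends are not matched.
    match = re.search(r"^CONFIG_HZ=(\d+)$", config_text, re.MULTILINE)
    if match is None:
        raise ValueError("CONFIG_HZ not found")
    hz = int(match.group(1))
    return hz, 1000.0 / hz


SAMPLE = """\
# CONFIG_HZ_250 is not set
CONFIG_HZ_300=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=300
"""

if __name__ == "__main__":
    hz, period = tick_period_ms(SAMPLE)
    print("HZ=%d, tick every %.2f ms" % (hz, period))
```

So switching from 300 Hz to 250 Hz moves the tick from roughly 3.33 ms to 4 ms.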

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-04-01 Thread Jörg Otte

2016-04-01 14:40 GMT+02:00 Rafael J. Wysocki :
> On Friday, April 01, 2016 11:20:42 AM Jörg Otte wrote:
>> 2016-03-31 17:43 GMT+02:00 Rafael J. Wysocki :
>> > On Thursday, March 31, 2016 05:25:18 PM Jörg Otte wrote:
>> >> 2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki :
>> >> > On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote:
>> >> >
>> >> > [cut]
>> >> >
>> >> >> >
>> >> >>
>> >> >> Yes, works for me.
>> >> >>
>> >> >> CPUID(7): No-SGX
>> >> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >> >>     -      11    0.66    1682    2494
>> >> >>     0      11    0.60    1856    2494
>> >> >>     1       6    0.34    1898    2494
>> >> >>     2      13    0.82    1628    2494
>> >> >>     3      13    0.87    1528    2494
>> >> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >> >>     -       6    0.58     963    2494
>> >> >>     0       8    0.83     957    2494
>> >> >>     1       1    0.08     984    2494
>> >> >>     2      10    1.04     975    2494
>> >> >>     3       3    0.35     934    2494
>> >> >>
>> >> >
>> >
>> > [cut]
>> >
>> >> >
>> >>
>> >> No, this patch doesn't help.
>> >
>> > Well, more work to do then.
>> >
>> > I've just noticed a bug in this patch, which is not relevant for the results,
>> > but below is a new version.
>> >
>> >> CPUID(7): No-SGX
>> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >>     -       8    0.32    2507    2495
>> >>     0      13    0.53    2505    2495
>> >>     1       3    0.11    2523    2495
>> >>     2       1    0.06    2555    2495
>> >>     3      15    0.59    2500    2495
>> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >>     -       8    0.33    2486    2495
>> >>     0      12    0.50    2482    2495
>> >>     1       5    0.22    2489    2495
>> >>     2       1    0.04    2492    2495
>> >>     3      15    0.59    2487    2495
>> >
>
> [cut]
>
>>
>> here they are.
>>
>
> Thanks!
>
> First of all, the sampling mechanics works as expected in the failing case,
> which is the most important thing I wanted to know.  However, there are anomalies
> in the failing case trace.  The core_busy column is clearly suspicious and it
> looks like CPUs 2 and 3 never really go idle.  I guess we'll need to find out
> why they don't go idle to get to the bottom of this, but it firmly falls into
> the weird stuff territory already.
>
> In the meantime, below is one more patch to test, on top of the previous one
> (that is, https://patchwork.kernel.org/patch/8714401/).
>
> Again, this is a change I'd like to make regardless, so it would be good to
> know if anything more has to be done before we go further.
>
> ---
> From: Rafael J. Wysocki 
> Subject: [PATCH] intel_pstate: Avoid extra invocation of intel_pstate_sample()
>
> The initialization of intel_pstate for a given CPU involves populating
> the fields of its struct cpudata that represent the previous sample,
> but currently that is done in a problematic way.
>
> Namely, intel_pstate_init_cpu() makes an extra call to
> intel_pstate_sample() so it reads the current register values that
> will be used to populate the "previous sample" record during the
> next invocation of intel_pstate_sample().  However, after commit
> a4675fbc4a7a (cpufreq: intel_pstate: Replace timers with utilization
> update callbacks) that doesn't work for last_sample_time, because
> the time value is passed to intel_pstate_sample() as an argument now.
> Passing 0 to it from intel_pstate_init_cpu() is problematic, because
> that causes cpu->last_sample_time == 0 to be visible in
> get_target_pstate_use_performance() (and hence the extra
> cpu->last_sample_time > 0 check in there) and effectively allows
> the first invocation of intel_pstate_sample() from
> intel_pstate_update_util() to happen immediately after the
> initialization which may lead to a significant "turn on"
> effect in the governor algorithm.
>
> To mitigate that issue, rework the initialization to avoid the
> extra intel_pstate_sample() call from intel_pstate_init_cpu().
> Instead,

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-04-01 Thread Jörg Otte

2016-03-31 19:55 GMT+02:00 Srinivas Pandruvada:
> On Thu, 2016-03-31 at 19:27 +0200, Jörg Otte wrote:
>> 2016-03-31 17:43 GMT+02:00 Rafael J. Wysocki :
>> > On Thursday, March 31, 2016 05:25:18 PM Jörg Otte wrote:
>> > > 2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki :
>> > > > On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote:
>> > > >
>> > > > [cut]
>> > > >
>> > > > > >
>> > > > >
>> > > > > Yes, works for me.
>> > > > >
>> > > > > CPUID(7): No-SGX
>> > > > >   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> > > > >     -      11    0.66    1682    2494
>> > > > >     0      11    0.60    1856    2494
>> > > > >     1       6    0.34    1898    2494
>> > > > >     2      13    0.82    1628    2494
>> > > > >     3      13    0.87    1528    2494
>> > > > >   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> > > > >     -       6    0.58     963    2494
>> > > > >     0       8    0.83     957    2494
>> > > > >     1       1    0.08     984    2494
>> > > > >     2      10    1.04     975    2494
>> > > > >     3       3    0.35     934    2494
>> > > > >
>> > > >
>> >
>> > [cut]
>> >
>> > > >
>> > >
>> > > No, this patch doesn't help.
>> >
>> > Well, more work to do then.
>> >
>> > I've just noticed a bug in this patch, which is not relevant for
>> > the results,
>> > but below is a new version.
>> >
>> > > CPUID(7): No-SGX
>> > >   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> > >     -       8    0.32    2507    2495
>> > >     0      13    0.53    2505    2495
>> > >     1       3    0.11    2523    2495
>> > >     2       1    0.06    2555    2495
>> > >     3      15    0.59    2500    2495
>> > >   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> > >     -       8    0.33    2486    2495
>> > >     0      12    0.50    2482    2495
>> > >     1       5    0.22    2489    2495
>> > >     2       1    0.04    2492    2495
>> > >     3      15    0.59    2487    2495
>> >
>> > Please apply the patch below and take a (1s or so) trace from the
>> > pstate_sample
>> > tracepoint (under /sys/kernel/debug/tracing/events/power/ on my
>> > systems).
>> >
>> > Then please apply the revert instead of it and take a trace from
>> > that tracepoint
>> > again and send both of the traces to me.
>> >
>> > ---
>> > From: Rafael J. Wysocki 
>> > Subject: [PATCH] intel_pstate: Do not set utilization update hook
>> > too early
>> >
>> > The utilization update hook in the intel_pstate driver is set too
>> > early, as it only should be set after the policy has been fully
>> > initialized by the core.  That may cause intel_pstate_update_util()
>> > to use incorrect data and put the CPUs into incorrect P-states as
>> > a result.
>> >
>> > To prevent that from happening, make intel_pstate_set_policy() set
>> > the utilization update hook instead of intel_pstate_init_cpu() so
>> > intel_pstate_update_util() only runs when all things have been
>> > initialized as appropriate.
>> >
>> > Signed-off-by: Rafael J. Wysocki 
>> > ---
>> >  drivers/cpufreq/intel_pstate.c |   27 +++
>> >  1 file changed, 19 insertions(+), 8 deletions(-)
>> >
>> > Index: linux-pm/drivers/cpufreq/intel_pstate.c
>> > ===
>> > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
>> > +++ linux-pm/drivers/cpufreq/intel_pstate.c
>> > @@ -1103,7 +1103,6 @@ static int intel_pstate_init_cpu(unsigne
>> > intel_pstate_sample(cpu, 0);
>> >
>> > cpu->update_util.func = intel_pstate_update_util;
>> > -   cpufreq_set_update_util_data(cpunum, &cpu->update_util);
>> >
>> > pr_debug("intel_pstate: controlling: cpu %d\n", cpunum);
>> >
>> > @@ -1122,18 +1121,29 @@ static unsigned int intel_pstate_get(uns
>> > return get_avg_frequency(cpu);
>> >  }
>> >
>> > +st

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-04-01 Thread Jörg Otte

2016-03-31 17:43 GMT+02:00 Rafael J. Wysocki :
> On Thursday, March 31, 2016 05:25:18 PM Jörg Otte wrote:
>> 2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki :
>> > On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote:
>> >
>> > [cut]
>> >
>> >> >
>> >>
>> >> Yes, works for me.
>> >>
>> >> CPUID(7): No-SGX
>> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >>     -      11    0.66    1682    2494
>> >>     0      11    0.60    1856    2494
>> >>     1       6    0.34    1898    2494
>> >>     2      13    0.82    1628    2494
>> >>     3      13    0.87    1528    2494
>> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >>     -       6    0.58     963    2494
>> >>     0       8    0.83     957    2494
>> >>     1       1    0.08     984    2494
>> >>     2      10    1.04     975    2494
>> >>     3       3    0.35     934    2494
>> >>
>> >
>
> [cut]
>
>> >
>>
>> No, this patch doesn't help.
>
> Well, more work to do then.
>
> I've just noticed a bug in this patch, which is not relevant for the results,
> but below is a new version.
>
>> CPUID(7): No-SGX
>>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>     -       8    0.32    2507    2495
>>     0      13    0.53    2505    2495
>>     1       3    0.11    2523    2495
>>     2       1    0.06    2555    2495
>>     3      15    0.59    2500    2495
>>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>     -       8    0.33    2486    2495
>>     0      12    0.50    2482    2495
>>     1       5    0.22    2489    2495
>>     2       1    0.04    2492    2495
>>     3      15    0.59    2487    2495
>
> Please apply the patch below and take a (1s or so) trace from the pstate_sample
> tracepoint (under /sys/kernel/debug/tracing/events/power/ on my systems).
>
> Then please apply the revert instead of it and take a trace from that tracepoint
> again and send both of the traces to me.
>
> ---
> From: Rafael J. Wysocki 
> Subject: [PATCH] intel_pstate: Do not set utilization update hook too early
>
> The utilization update hook in the intel_pstate driver is set too
> early, as it only should be set after the policy has been fully
> initialized by the core.  That may cause intel_pstate_update_util()
> to use incorrect data and put the CPUs into incorrect P-states as
> a result.
>
> To prevent that from happening, make intel_pstate_set_policy() set
> the utilization update hook instead of intel_pstate_init_cpu() so
> intel_pstate_update_util() only runs when all things have been
> initialized as appropriate.
>
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/intel_pstate.c |   27 +++
>  1 file changed, 19 insertions(+), 8 deletions(-)
>
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -1103,7 +1103,6 @@ static int intel_pstate_init_cpu(unsigne
> intel_pstate_sample(cpu, 0);
>
> cpu->update_util.func = intel_pstate_update_util;
> -   cpufreq_set_update_util_data(cpunum, &cpu->update_util);
>
> pr_debug("intel_pstate: controlling: cpu %d\n", cpunum);
>
> @@ -1122,18 +1121,29 @@ static unsigned int intel_pstate_get(uns
> return get_avg_frequency(cpu);
>  }
>
> +static void intel_pstate_set_update_util_hook(unsigned int cpu)
> +{
> +   cpufreq_set_update_util_data(cpu, &all_cpu_data[cpu]->update_util);
> +}
> +
> +static void intel_pstate_clear_update_util_hook(unsigned int cpu)
> +{
> +   cpufreq_set_update_util_data(cpu, NULL);
> +   synchronize_sched();
> +}
> +
>  static int intel_pstate_set_policy(struct cpufreq_policy *policy)
>  {
> if (!policy->cpuinfo.max_freq)
> return -ENODEV;
>
> +   intel_pstate_clear_update_util_hook(policy->cpu);
> +
> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE &&
> policy->max >= policy->cpuinfo.max_freq) {
> pr_debug("intel_pstate: set performance\n");
> limits = &performance_limits;
> -   if (hwp_active)
> -   intel_pstate_hwp_set(policy->cpus);
> -   return 0;
> +   goto out;
> }

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-31 Thread Jörg Otte

2016-03-31 17:43 GMT+02:00 Rafael J. Wysocki :
> On Thursday, March 31, 2016 05:25:18 PM Jörg Otte wrote:
>> 2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki :
>> > On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote:
>> >
>> > [cut]
>> >
>> >> >
>> >>
>> >> Yes, works for me.
>> >>
>> >> CPUID(7): No-SGX
>> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >>     -      11    0.66    1682    2494
>> >>     0      11    0.60    1856    2494
>> >>     1       6    0.34    1898    2494
>> >>     2      13    0.82    1628    2494
>> >>     3      13    0.87    1528    2494
>> >>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>> >>     -       6    0.58     963    2494
>> >>     0       8    0.83     957    2494
>> >>     1       1    0.08     984    2494
>> >>     2      10    1.04     975    2494
>> >>     3       3    0.35     934    2494
>> >>
>> >
>
> [cut]
>
>> >
>>
>> No, this patch doesn't help.
>
> Well, more work to do then.
>
> I've just noticed a bug in this patch, which is not relevant for the results,
> but below is a new version.
>
>> CPUID(7): No-SGX
>>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>     -       8    0.32    2507    2495
>>     0      13    0.53    2505    2495
>>     1       3    0.11    2523    2495
>>     2       1    0.06    2555    2495
>>     3      15    0.59    2500    2495
>>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>     -       8    0.33    2486    2495
>>     0      12    0.50    2482    2495
>>     1       5    0.22    2489    2495
>>     2       1    0.04    2492    2495
>>     3      15    0.59    2487    2495
>
> Please apply the patch below and take a (1s or so) trace from the pstate_sample
> tracepoint (under /sys/kernel/debug/tracing/events/power/ on my systems).
>
> Then please apply the revert instead of it and take a trace from that tracepoint
> again and send both of the traces to me.
>
> ---
> From: Rafael J. Wysocki 
> Subject: [PATCH] intel_pstate: Do not set utilization update hook too early
>
> The utilization update hook in the intel_pstate driver is set too
> early, as it only should be set after the policy has been fully
> initialized by the core.  That may cause intel_pstate_update_util()
> to use incorrect data and put the CPUs into incorrect P-states as
> a result.
>
> To prevent that from happening, make intel_pstate_set_policy() set
> the utilization update hook instead of intel_pstate_init_cpu() so
> intel_pstate_update_util() only runs when all things have been
> initialized as appropriate.
>
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/intel_pstate.c |   27 +++
>  1 file changed, 19 insertions(+), 8 deletions(-)
>
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -1103,7 +1103,6 @@ static int intel_pstate_init_cpu(unsigne
> intel_pstate_sample(cpu, 0);
>
> cpu->update_util.func = intel_pstate_update_util;
> -   cpufreq_set_update_util_data(cpunum, &cpu->update_util);
>
> pr_debug("intel_pstate: controlling: cpu %d\n", cpunum);
>
> @@ -1122,18 +1121,29 @@ static unsigned int intel_pstate_get(uns
> return get_avg_frequency(cpu);
>  }
>
> +static void intel_pstate_set_update_util_hook(unsigned int cpu)
> +{
> +   cpufreq_set_update_util_data(cpu, &all_cpu_data[cpu]->update_util);
> +}
> +
> +static void intel_pstate_clear_update_util_hook(unsigned int cpu)
> +{
> +   cpufreq_set_update_util_data(cpu, NULL);
> +   synchronize_sched();
> +}
> +
>  static int intel_pstate_set_policy(struct cpufreq_policy *policy)
>  {
> if (!policy->cpuinfo.max_freq)
> return -ENODEV;
>
> +   intel_pstate_clear_update_util_hook(policy->cpu);
> +
> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE &&
> policy->max >= policy->cpuinfo.max_freq) {
> pr_debug("intel_pstate: set performance\n");
> limits = &performance_limits;
> -   if (hwp_active)
> -   intel_pstate_hwp_set(policy->cpus);
> -   return 0;
> +   goto out;
> }

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-31 Thread Jörg Otte

2016-03-31 17:06 GMT+02:00 Srinivas Pandruvada:
> On Thu, 2016-03-31 at 07:39 -0700, Doug Smythies wrote:
>> On 2016.03.31 02:24 Jörg Otte wrote:
>
> Hi Jörg,
>
> Can you send me your kernel config file?
>
> Thanks,
> Srinivas
>>
>> > jojo@fichte:/sys/devices/system/cpu/cpufreq/policy0$ cat
>> > scaling_governor
>> > powersave
>> >
>> > turbostat -i 1 --msr=0x199
>> > CPUID(7): No-SGX
>> >   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz   MSR 0x199
>> >     -      45    1.74    2590    2496          0x
>> >     0      45    1.76    2565    2498      0x0a00
>> >     1      72    2.84    2548    2496      0x0800
>> >     2      30    1.11    2661    2496      0x1a00
>> >     3      33    1.23    2661    2495      0x1a00
>> >   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz   MSR 0x199
>> >     -       9    0.35    2525    2495          0x
>> >     0       1    0.04    2735    2495      0x0800
>> >     1       1    0.05    2501    2495      0x0800
>> >     2      17    0.65    2540    2495      0x1a00
>> >     3      16    0.64    2501    2495      0x1a00
>> >   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz   MSR 0x199
>> >     -      11    0.43    2523    2495          0x
>> >     0       3    0.11    2631    2495      0x0c00
>> >     1       7    0.27    2524    2495      0x0800
>> >     2      18    0.72    2527    2495      0x1a00
>> >     3      15    0.61    2501    2495      0x1a00
>>
>> Very Interesting.
>>
>> I would still like to get a trace sample to post process.
>> Copied from a previous e-mail:
>>
>> On an otherwise idle system, do:
>>
>> # perf record -a --event=power:pstate_sample sleep 300
>>
>> If pressed for time, your sleep time can be less than 5 minutes,
>> but try to get at least 100 seconds.
>>
>> The resulting perf.data file will be too big to include as an
>> on-list attachment, but send it (or them) to me off-list for
>> post processing, and I'll report back.
>>
>> ... Doug
>>
>>

Here it is.

Thanks, Jörg

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86 4.5.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi
-fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9
-fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_DEBUG_RODATA=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION="-reva4675fbc4a7a"
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
# CONFIG_SWAP is not set
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
# CONFIG_USELIB is not set
CONFIG_AUDIT=y
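
A note on reading the turbostat output quoted earlier in this message. Avg_MHz should be roughly Busy% times Bzy_MHz, which gives a way to sanity-check the column values. The MSR 0x199 column can be decoded too, but only under two assumptions that are not stated anywhere in this thread: that bits 15:8 of the value hold the requested ratio, and that the bus clock is 100 MHz. The Python sketch below uses my reading of the first sample; the function names and the ROWS table are illustrative.

```python
BUS_CLOCK_MHZ = 100  # assumption: 100 MHz bus clock on this CPU


def expected_avg_mhz(busy_pct, bzy_mhz):
    """Avg_MHz implied by Busy% (time in C0) and Bzy_MHz (clock while busy)."""
    return busy_pct / 100.0 * bzy_mhz


def perf_ctl_target_mhz(msr_value, bus_clock_mhz=BUS_CLOCK_MHZ):
    """Requested frequency, assuming bits 15:8 of MSR 0x199 hold the ratio."""
    ratio = (msr_value >> 8) & 0xFF
    return ratio * bus_clock_mhz


# (CPU, Avg_MHz, Busy%, Bzy_MHz, MSR 0x199) as read from the first sample.
# The column split is an interpretation of the quoted output.
ROWS = [
    ("0", 45, 1.76, 2565, 0x0A00),
    ("1", 72, 2.84, 2548, 0x0800),
    ("2", 30, 1.11, 2661, 0x1A00),
    ("3", 33, 1.23, 2661, 0x1A00),
]

if __name__ == "__main__":
    for cpu, avg, busy, bzy, msr in ROWS:
        print(cpu, avg, round(expected_avg_mhz(busy, bzy), 1),
              perf_ctl_target_mhz(msr), "MHz requested")
```

If those assumptions hold, CPUs 2 and 3 are asking for 2600 MHz in every sample while CPUs 0 and 1 ask for 800 to 1200 MHz.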

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-31 Thread Jörg Otte

2016-03-31 13:42 GMT+02:00 Rafael J. Wysocki :
> On Thursday, March 31, 2016 11:05:56 AM Jörg Otte wrote:
>
> [cut]
>
>> >
>>
>> Yes, works for me.
>>
>> CPUID(7): No-SGX
>>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>     -      11    0.66    1682    2494
>>     0      11    0.60    1856    2494
>>     1       6    0.34    1898    2494
>>     2      13    0.82    1628    2494
>>     3      13    0.87    1528    2494
>>   CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>     -       6    0.58     963    2494
>>     0       8    0.83     957    2494
>>     1       1    0.08     984    2494
>>     2      10    1.04     975    2494
>>     3       3    0.35     934    2494
>>
>
> Great, thanks!
>
> To me, the only area where things are really different before and after the
> revert is the initialization, so that likely is when the problem triggers.
>
> And sure enough, there is an initialization problem in intel_pstate.
>
> Please test the patch below instead of the revert and let me know if it
> makes any difference.
>
> It (or equivalent) will need to be applied anyway, so we'll work on top of it
> going forward, but also it may just be sufficient to address the problem you're
> seeing.
>
> ---
> From: Rafael J. Wysocki 
> Subject: [PATCH] intel_pstate: Do not set utilization update hook too early
>
> The utilization update hook in the intel_pstate driver is set too
> early, as it only should be set after the policy has been fully
> initialized by the core.  That may cause intel_pstate_update_util()
> to use incorrect data and put the CPUs into incorrect P-states as
> a result.
>
> To prevent that from happening, make intel_pstate_set_policy() set
> the utilization update hook instead of intel_pstate_init_cpu() so
> intel_pstate_update_util() only runs when all things have been
> initialized as appropriate.
>
> Signed-off-by: Rafael J. Wysocki 
> ---
>  drivers/cpufreq/intel_pstate.c |   27 +++
>  1 file changed, 19 insertions(+), 8 deletions(-)
>
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -1103,7 +1103,6 @@ static int intel_pstate_init_cpu(unsigne
> intel_pstate_sample(cpu, 0);
>
> cpu->update_util.func = intel_pstate_update_util;
> -   cpufreq_set_update_util_data(cpunum, &cpu->update_util);
>
> pr_debug("intel_pstate: controlling: cpu %d\n", cpunum);
>
> @@ -1122,18 +1121,29 @@ static unsigned int intel_pstate_get(uns
> return get_avg_frequency(cpu);
>  }
>
> +static void intel_pstate_set_update_util_hook(unsigned int cpu)
> +{
> +   cpufreq_set_update_util_data(cpu, &all_cpu_data[cpu]->update_util);
> +}
> +
> +static void intel_pstate_clear_update_util_hook(unsigned int cpu)
> +{
> +   cpufreq_set_update_util_data(cpu, NULL);
> +   synchronize_sched();
> +}
> +
>  static int intel_pstate_set_policy(struct cpufreq_policy *policy)
>  {
> if (!policy->cpuinfo.max_freq)
> return -ENODEV;
>
> +   intel_pstate_clear_update_util_hook(policy->cpu);
> +
> if (policy->policy == CPUFREQ_POLICY_PERFORMANCE &&
> policy->max >= policy->cpuinfo.max_freq) {
> pr_debug("intel_pstate: set performance\n");
> limits = &performance_limits;
> -   if (hwp_active)
> -   intel_pstate_hwp_set(policy->cpus);
> -   return 0;
> +   goto out;
> }
>
> pr_debug("intel_pstate: set powersave\n");
> @@ -1163,6 +1173,9 @@ static int intel_pstate_set_policy(struc
> limits->max_perf = div_fp(int_tofp(limits->max_perf_pct),
>   int_tofp(100));
>
> + out:
> +   intel_pstate_set_update_util_hook(policy->cpu);
> +
> if (hwp_active)
> intel_pstate_hwp_set(policy->cpus);
>
> @@ -1187,8 +1200,7 @@ static void intel_pstate_stop_cpu(struct
>
> pr_debug("intel_pstate: CPU %d exiting\n", cpu_num);
>
> -   cpufreq_set_update_util_data(cpu_num, NULL);
> -   synchronize_sched();
> +   intel_pstate_set_update_util_hook(cpu_num);
>
> if (hwp_active)
> return;
> @@ -1455,8 +1467,7 @@ out:
> get_online_cpus();
> for_each_online_cpu(cpu) {
> if (all_cpu_data[cpu]) {
>

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-31 Thread Jörg Otte

2016-03-30 22:26 GMT+02:00 Srinivas Pandruvada:
> On Wed, 2016-03-30 at 22:12 +0200, Rafael J. Wysocki wrote:
>> On Wed, Mar 30, 2016 at 8:58 PM, Srinivas Pandruvada wrote:
>> >
>> > On Wed, 2016-03-30 at 11:50 -0700, Doug Smythies wrote:
>> > >
>> > > On 2016.03.30 08:52 Jörg Otte wrote:
>> > > >
>> > > > 2016-03-30 17:33 GMT+02:00 Pandruvada, Srinivas:
>> > > > >
>> > > > > On Wed, 2016-03-30 at 13:05 +0200, Rafael J. Wysocki wrote:
>> > > > > >
>> > > > > > On Wed, Mar 30, 2016 at 12:17 PM, Jörg Otte wrote:
>> > > > > > > > >
>> > > > > > > > > Now in v4.6-rc1 the characteristic has dramatically changed.
>> > > > > > > > > If in idle the processor frequency is more or less a few
>> > > > > > > > > MHz around 2500MHz.
>> > > > > > > > > I currently use acpi_cpufreq which works as usual.
>> > > > > > > > > Processor: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz
>> > > > > > > > > (family: 0x6, model: 0x3c, stepping: 0x3)
>> > > > >
>> > > > >
>> > > > > I want to reproduce this if I can. Can you give us info about
>> > > > > your
>> > > > > setup (Linux distribution, laptop model etc.)?
>> > > I would like to try to reproduce the issue also.
>> > >
>> > > >
>> > > >
>> > > > Distro: Ubuntu 14.04.4 LTS
>> > > Note that with Ubuntu 14.04, I had issues where my CPU
>> > > would lock at pstate 24 (not always 24, but usually),
>> > > regardless of load.
>> > > However, it was always after an S3 suspend, occurred 100%
>> > > of the time, and was independent of intel_pstate or
>> > > acpi-cpufreq CPU frequency scaling drivers.
>> > >
>> > > Since changing my test server to Ubuntu server edition 16.04
>> > > (development version), I have not had those issues. While I have
>> > > no proof, I have assumed the issue elimination was somehow
>> > > related
>> > > to the change to systemd.
>> > >
>> > > It might be worth observing both what the intel_pstate is asking
>> > > for
>> > > and what the processor is actually doing.
>> > If Jörg runs with
>> >
>> > turbostat -i 1 --msr=0x199
>> >
>> > We can tell whether we requested it or whether it is the same problem you had.
>> > I tried on Ubuntu LTS 14.04 on same Haswell CPU model, I didn't see
>> > this issue.
>> There seems to be something odd about Jörg's setup, or we'd have
>> received more reports about this issue.
>>
>> Question is what that is and what really makes the difference.
>>
> I think, somehow we entered performance mode from powersave by default
>
> turbostat -i 1 --msr=0x199 will tell us.
>

jojo@fichte:/sys/devices/system/cpu/cpufreq/policy0$ cat scaling_governor
powersave

turbostat -i 1 --msr=0x199
CPUID(7): No-SGX
 CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz   MSR 0x199
   -      45    1.74    2590    2496          0x
   0      45    1.76    2565    2498      0x0a00
   1      72    2.84    2548    2496      0x0800
   2      30    1.11    2661    2496      0x1a00
   3      33    1.23    2661    2495      0x1a00
 CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz   MSR 0x199
   -       9    0.35    2525    2495          0x
   0       1    0.04    2735    2495      0x0800
   1       1    0.05    2501    2495      0x0800
   2      17    0.65    2540    2495      0x1a00
   3      16    0.64    2501    2495      0x1a00
 CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz   MSR 0x199
   -      11    0.43    2523    2495          0x
   0       3    0.11    2631    2495      0x0c00
   1       7    0.27    2524    2495      0x0800
   2      18    0.72    2527    2495      0x1a00
   3      15    0.61    2501    2495      0x1a00

Thanks, Jörg
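
For anyone reading the MSR 0x199 column above, here is a minimal sketch of how the values can be decoded. It assumes the usual Intel Core layout, in which bits 15:8 of IA32_PERF_CTL hold the requested P-state ratio, and it assumes one ratio step equals 100 MHz on this Haswell part. The function name and the bus clock constant are illustrative only and are not taken from turbostat or the driver.

```python
# Decode the requested P-state from an IA32_PERF_CTL (MSR 0x199) value.
# Assumption: the ratio lives in bits 15:8 and one ratio step is 100 MHz.

BUS_CLOCK_MHZ = 100  # assumed bus clock for this Haswell CPU


def requested_mhz(perf_ctl: int) -> int:
    """Return the frequency in MHz requested by a PERF_CTL value."""
    ratio = (perf_ctl >> 8) & 0xFF
    return ratio * BUS_CLOCK_MHZ


# Values as they appear in the MSR 0x199 column above.
for raw in ("0x0800", "0x0a00", "0x0c00", "0x1a00"):
    print(raw, "->", requested_mhz(int(raw, 16)), "MHz")
```

Under these assumptions 0x0800 is a request for 800 MHz and 0x1a00 a request for 2600 MHz, so in every sample CPUs 2 and 3 carry a high request while CPUs 0 and 1 carry low ones.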

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-31 Thread Jörg Otte

2016-03-30 20:39 GMT+02:00 Rafael J. Wysocki :
> On Wednesday, March 30, 2016 05:29:18 PM Jörg Otte wrote:
>> 2016-03-30 13:05 GMT+02:00 Rafael J. Wysocki :
>> > On Wed, Mar 30, 2016 at 12:17 PM, Jörg Otte  wrote:
>> >> 2016-03-29 23:34 GMT+02:00 Rafael J. Wysocki :
>> >>> On Tuesday, March 29, 2016 07:32:27 PM Jörg Otte wrote:
>> >>>> 2016-03-29 19:24 GMT+02:00 Jörg Otte :
>> >>>> > in v4.5 and earlier intel-pstate downscaled idle processors (load
>> >>>> > 0.1-0.2%) to minimum frequency, in my case 800MHz.
>> >>>> >
>> >>>> > Now in v4.6-rc1 the characteristic has dramatically changed. If in
>> >>>> > idle the processor frequency is more or less a few MHz around 2500MHz.
>> >>>> > This is the maximum non turbo frequency.
>> >>>> >
>> >>>> > No difference between powersave or performance governor.
>> >>>> >
>> >>>> > I currently use acpi_cpufreq which works as usual.
>> >>>> >
>> >>>> > Processor:
>> >>>> > Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c,
>> >>>> > stepping: 0x3)
>> >>>> >
>> >>>> > Last known good kernel is: 4.5.0-01127-g9256d5a
>> >>>> > First known bad kernel is: 4.5.0-02535-g09fd671
>> >>>> >
>> >>>> > There is
>> >>>> > commit 277edba Merge tag 'pm+acpi-4.6-rc1-1' of
>> >>>> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>> >>>> > in between, which brought a few changes in intel_pstate.
>> >>>
>> >>> Can you please check commit a4675fbc4a7a (cpufreq: intel_pstate: Replace
>> >>> timers with utilization update callbacks)?
>> >>>
>> >> Yes, this solved the problem for me.
>> >> I had to resolve some conflicts myself when reverting that
>> >> commit. Hard work :).
>> >
>> > Thanks for doing this.  Can you please post the revert patch you have used?
>> >
>>
>> The patch is on top of 4.5.0-02535-g09fd671.
>> I'm not sure what gmail is doing with spaces and tabs,
>> so I attach the revert patch.
>
> That worked, thanks!
>
>> >> Here is a 10-seconds trace of the used frequencies when
>> >> in "desktop-idle":
>> >>
>> >> driver  cpu0 cpu1 cpu2 cpu3
>> >> -
>> >> intel_pstate (  800  928  941 1200) MHz   load:( 0.2)%
>> >> intel_pstate (  800  928 1181 1800) MHz   load:( 0.0)%
>> >> intel_pstate ( 1675 1576 1347  800) MHz   load:( 0.0)%
>> >> intel_pstate ( 1198 1576  842  800) MHz   load:( 0.5)%
>> >> intel_pstate (  800 1181 1113 1600) MHz   load:( 0.0)%
>> >> intel_pstate (  808 1181  805  800) MHz   load:( 0.5)%
>> >> intel_pstate (  844 1191  900 1082) MHz   load:( 0.3)%
>> >> intel_pstate (  816 1191  800  800) MHz   load:( 0.0)%
>> >> intel_pstate (  800  905  892 1082) MHz   load:( 0.2)%
>> >> intel_pstate (  945  905 1340  800) MHz   load:( 0.3)%
>> >
>> > Please also run turbostat with and without your revert patch applied.
>> >
>>
>> turbostat without revert
>> Kernel: 4.5.0-02535-g09fd671
>> -
>> CPUID(7): No-SGX
>>  CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>    -      13    0.53    2514    2495
>>    0      14    0.55    2518    2495
>>    1       8    0.33    2527    2495
>>    2      15    0.60    2506    2495
>>    3      16    0.62    2509    2495
>>
>> turbostat after revert of commit a4675fbc4a7a
>> kernel: 4.5.0-reva4675fbc4a7a-02536-g77225b1
>> --
>> CPUID(7): No-SGX
>>  CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>    -       4    0.35    1142    2494
>>    0       1    0.11    1016    2494
>>    1       2    0.17     961    2494
>>    2      10    0.82    1215    2494
>>    3       3    0.29    1086    2494
>>  CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
>>    -       4    0.46     885    2494
>>    0       1    0.12     889    2494
>>    1       1    0.16     885    2494
>>    2      10    1.15     883    2494
>>    3       4    0.40     891    2494
>
> Clearly, there's something fis

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-30 Thread Jörg Otte

2016-03-30 17:33 GMT+02:00 Pandruvada, Srinivas :
> On Wed, 2016-03-30 at 13:05 +0200, Rafael J. Wysocki wrote:
>> On Wed, Mar 30, 2016 at 12:17 PM, Jörg Otte wrote:
>> >
>> > 2016-03-29 23:34 GMT+02:00 Rafael J. Wysocki :
>> > >
>> > > On Tuesday, March 29, 2016 07:32:27 PM Jörg Otte wrote:
>> > > >
>> > > > 2016-03-29 19:24 GMT+02:00 Jörg Otte :
>> > > > >
>> > > > > in v4.5 and earlier intel-pstate downscaled idle processors
>> > > > > (load 0.1-0.2%) to minimum frequency, in my case 800MHz.
>> > > > >
>> > > > > Now in v4.6-rc1 the characteristic has dramatically changed.
>> > > > > If in idle the processor frequency is more or less a few MHz
>> > > > > around 2500MHz.
>> > > > > This is the maximum non turbo frequency.
>> > > > >
>> > > > > No difference between powersave or performance governor.
>> > > > >
>> > > > > I currently use acpi_cpufreq which works as usual.
>> > > > >
>> > > > > Processor:
>> > > > > Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model:
>> > > > > 0x3c,
>> > > > > stepping: 0x3)
>> > > > >
>> > > > > Last known good kernel is: 4.5.0-01127-g9256d5a
>> > > > > First known bad kernel is: 4.5.0-02535-g09fd671
>> > > > >
>> > > > > There is
>> > > > > commit 277edba Merge tag 'pm+acpi-4.6-rc1-1' of
>> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>> > > > > in between, which brought a few changes in intel_pstate.
>> > > Can you please check commit a4675fbc4a7a (cpufreq: intel_pstate:
>> > > Replace timers with utilization update callbacks)?
>> > >
>> > Yes, this solved the problem for me.
>> > I had to resolve some conflicts myself when reverting that
>> > commit. Hard work :).
>> Thanks for doing this.  Can you please post the revert patch you have
>> used?
>>
>> >
>> > Here is a 10-seconds trace of the used frequencies when
>> > in "desktop-idle":
>> >
>> > driver  cpu0 cpu1 cpu2 cpu3
>> > -
>> > intel_pstate (  800  928  941 1200) MHz   load:( 0.2)%
>> > intel_pstate (  800  928 1181 1800) MHz   load:( 0.0)%
>> > intel_pstate ( 1675 1576 1347  800) MHz   load:( 0.0)%
>> > intel_pstate ( 1198 1576  842  800) MHz   load:( 0.5)%
>> > intel_pstate (  800 1181 1113 1600) MHz   load:( 0.0)%
>> > intel_pstate (  808 1181  805  800) MHz   load:( 0.5)%
>> > intel_pstate (  844 1191  900 1082) MHz   load:( 0.3)%
>> > intel_pstate (  816 1191  800  800) MHz   load:( 0.0)%
>> > intel_pstate (  800  905  892 1082) MHz   load:( 0.2)%
>> > intel_pstate (  945  905 1340  800) MHz   load:( 0.3)%
>> Please also run turbostat with and without your revert patch applied.
> I want to reproduce this if I can. Can you give us info about your
> setup (Linux distribution, laptop model etc.)?
>
> Thanks,
> Srinivas

Distro: Ubuntu 14.04.4 LTS
Laptop: FUJITSU LIFEBOOK A544

lspci:
===
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core
Processor DRAM Controller (rev 06)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v3/4th Gen Core
Processor PCI Express x16 Controller (rev 06)
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core
Processor Integrated Graphics Controller (rev 06)
00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core
Processor HD Audio Controller (rev 06)
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
Family USB xHCI (rev 04)
00:16.0 Communication controller: Intel Corporation 8 Series/C220
Series Chipset Family MEI Controller #1 (rev 04)
00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset
High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset
Family PCI Express Root Port #1 (rev d4)
00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset
Family PCI Express Root Port #3 (rev d4)
00:1c.5 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset
Family PCI Express Root Port #6 (rev d4)
00:1f.0 ISA bridge: Intel Corporation HM86 Express LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series
Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04)

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-30 Thread Jörg Otte

2016-03-30 13:05 GMT+02:00 Rafael J. Wysocki :
> On Wed, Mar 30, 2016 at 12:17 PM, Jörg Otte  wrote:
>> 2016-03-29 23:34 GMT+02:00 Rafael J. Wysocki :
>>> On Tuesday, March 29, 2016 07:32:27 PM Jörg Otte wrote:
>>>> 2016-03-29 19:24 GMT+02:00 Jörg Otte :
>>>> > in v4.5 and earlier intel-pstate downscaled idle processors (load
>>>> > 0.1-0.2%) to minimum frequency, in my case 800MHz.
>>>> >
>>>> > Now in v4.6-rc1 the characteristic has dramatically changed. If in
>>>> > idle the processor frequency is more or less a few MHz around 2500MHz.
>>>> > This is the maximum non turbo frequency.
>>>> >
>>>> > No difference between powersave or performance governor.
>>>> >
>>>> > I currently use acpi_cpufreq which works as usual.
>>>> >
>>>> > Processor:
>>>> > Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c,
>>>> > stepping: 0x3)
>>>> >
>>>> > Last known good kernel is: 4.5.0-01127-g9256d5a
>>>> > First known bad kernel is: 4.5.0-02535-g09fd671
>>>> >
>>>> > There is
>>>> > commit 277edba Merge tag 'pm+acpi-4.6-rc1-1' of
>>>> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>>>> > in between, which brought a few changes in intel_pstate.
>>>
>>> Can you please check commit a4675fbc4a7a (cpufreq: intel_pstate: Replace
>>> timers with utilization update callbacks)?
>>>
>> Yes, this solved the problem for me.
>> I had to resolve some conflicts myself when reverting that
>> commit. Hard work :).
>
> Thanks for doing this.  Can you please post the revert patch you have used?
>

The patch is on top of 4.5.0-02535-g09fd671.
I'm not sure what gmail is doing with spaces and tabs,
so I attach the revert patch.


>> Here is a 10-seconds trace of the used frequencies when
>> in "desktop-idle":
>>
>> driver  cpu0 cpu1 cpu2 cpu3
>> -
>> intel_pstate (  800  928  941 1200) MHz   load:( 0.2)%
>> intel_pstate (  800  928 1181 1800) MHz   load:( 0.0)%
>> intel_pstate ( 1675 1576 1347  800) MHz   load:( 0.0)%
>> intel_pstate ( 1198 1576  842  800) MHz   load:( 0.5)%
>> intel_pstate (  800 1181 1113 1600) MHz   load:( 0.0)%
>> intel_pstate (  808 1181  805  800) MHz   load:( 0.5)%
>> intel_pstate (  844 1191  900 1082) MHz   load:( 0.3)%
>> intel_pstate (  816 1191  800  800) MHz   load:( 0.0)%
>> intel_pstate (  800  905  892 1082) MHz   load:( 0.2)%
>> intel_pstate (  945  905 1340  800) MHz   load:( 0.3)%
>
> Please also run turbostat with and without your revert patch applied.
>

turbostat without revert
Kernel: 4.5.0-02535-g09fd671
-
CPUID(7): No-SGX
 CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
   -      13    0.53    2514    2495
   0      14    0.55    2518    2495
   1       8    0.33    2527    2495
   2      15    0.60    2506    2495
   3      16    0.62    2509    2495

turbostat after revert of commit a4675fbc4a7a
kernel: 4.5.0-reva4675fbc4a7a-02536-g77225b1
--
CPUID(7): No-SGX
 CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
   -       4    0.35    1142    2494
   0       1    0.11    1016    2494
   1       2    0.17     961    2494
   2      10    0.82    1215    2494
   3       3    0.29    1086    2494
 CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz
   -       4    0.46     885    2494
   0       1    0.12     889    2494
   1       1    0.16     885    2494
   2      10    1.15     883    2494
   3       4    0.40     891    2494
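
A note on reading the two turbostat runs above: as I read the turbostat man page, Avg_MHz is averaged over the whole interval, while Bzy_MHz is the clock rate only while the CPU is not idle, so the two are tied together by Busy%. The sketch below is a rough check under that assumption, using the summary rows read as Busy% 0.53 / Bzy_MHz 2514 without the revert and Busy% 0.35 / Bzy_MHz 1142 after it. The helper name is made up for illustration.

```python
# Relationship between turbostat columns (as I read its man page):
#   Avg_MHz ~= Busy% / 100 * Bzy_MHz


def avg_mhz(busy_pct: float, bzy_mhz: float) -> float:
    """Average MHz over the whole interval, idle time included."""
    return busy_pct / 100.0 * bzy_mhz


# Summary rows ("-") read from the two runs above.
summary = {
    "without revert": (0.53, 2514),
    "after revert": (0.35, 1142),
}

for label, (busy_pct, bzy_mhz) in summary.items():
    print(f"{label}: Avg_MHz ~ {avg_mhz(busy_pct, bzy_mhz):.1f}")
```

The results round to the reported Avg_MHz values of 13 and 4, which suggests the column split is right. Bzy_MHz is therefore the column that shows the regression: the load is about the same in both runs, but the clock while busy is roughly 2500 MHz without the revert and roughly 900 to 1200 MHz with it.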
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index cb56074..97c16af 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -71,7 +71,7 @@ struct sample {
 	u64 mperf;
 	u64 tsc;
 	int freq;
-	u64 time;
+	ktime_t time;
 };
 
 struct pstate_data {
@@ -103,13 +103,13 @@ struct _pid {
 struct cpudata {
 	int cpu;
 
-	struct update_util_data update_util;
+	struct timer_list timer;
 
 	struct pstate_data pstate;
 	struct vid_data vid;
 	struct _pid pid;
 
-	u64	last_sample_time;
+	ktime_t last_sample_time;
 	u64	prev_aperf;
 	u64	prev_mperf;
 	u64	prev_tsc;
@@ -120,7 +120,6 @@ struct cpudata {
 static struct cpudata **all_cpu_data;
 struct pstate_adjust_policy {
 	int sample_rate_ms;
-	s64 sample_rate_ns;
 	int deadband;
 	int setpoint;
 	int p_gain_pct;
@@ -719,7 +718,7 @@ static void core_set_pstate(struct cpudata *cpudata, int pstate)
 	if (limits->no_turbo && !limits->turbo_disabled)
 		val |= (u6

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-30 Thread Jörg Otte

2016-03-29 23:34 GMT+02:00 Rafael J. Wysocki :
> On Tuesday, March 29, 2016 07:32:27 PM Jörg Otte wrote:
>> 2016-03-29 19:24 GMT+02:00 Jörg Otte :
>> > in v4.5 and earlier intel-pstate downscaled idle processors (load
>> > 0.1-0.2%) to minimum frequency, in my case 800MHz.
>> >
>> > Now in v4.6-rc1 the characteristic has dramatically changed. If in
>> > idle the processor frequency is more or less a few MHz around 2500MHz.
>> > This is the maximum non turbo frequency.
>> >
>> > No difference between powersave or performance governor.
>> >
>> > I currently use acpi_cpufreq which works as usual.
>> >
>> > Processor:
>> > Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c,
>> > stepping: 0x3)
>> >
>> > Last known good kernel is: 4.5.0-01127-g9256d5a
>> > First known bad kernel is: 4.5.0-02535-g09fd671
>> >
>> > There is
>> > commit 277edba Merge tag 'pm+acpi-4.6-rc1-1' of
>> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>> > in between, which brought a few changes in intel_pstate.
>
> Can you please check commit a4675fbc4a7a (cpufreq: intel_pstate: Replace
> timers with utilization update callbacks)?
>
Yes, this solved the problem for me.
I had to resolve some conflicts myself when reverting that
commit. Hard work :).

Here is a 10-seconds trace of the used frequencies when
in "desktop-idle":

driver  cpu0 cpu1 cpu2 cpu3
-
intel_pstate (  800  928  941 1200) MHz   load:( 0.2)%
intel_pstate (  800  928 1181 1800) MHz   load:( 0.0)%
intel_pstate ( 1675 1576 1347  800) MHz   load:( 0.0)%
intel_pstate ( 1198 1576  842  800) MHz   load:( 0.5)%
intel_pstate (  800 1181 1113 1600) MHz   load:( 0.0)%
intel_pstate (  808 1181  805  800) MHz   load:( 0.5)%
intel_pstate (  844 1191  900 1082) MHz   load:( 0.3)%
intel_pstate (  816 1191  800  800) MHz   load:( 0.0)%
intel_pstate (  800  905  892 1082) MHz   load:( 0.2)%
intel_pstate (  945  905 1340  800) MHz   load:( 0.3)%


Thanks, Jörg

Re: [intel-pstate driver regression] processor frequency very high even if in idle

2016-03-29 Thread Jörg Otte

2016-03-29 19:24 GMT+02:00 Jörg Otte :
> in v4.5 and earlier intel-pstate downscaled idle processors (load
> 0.1-0.2%) to minimum frequency, in my case 800MHz.
>
> Now in v4.6-rc1 the characteristic has dramatically changed. If in
> idle the processor frequency is more or less a few MHz around 2500MHz.
> This is the maximum non turbo frequency.
>
> No difference between powersave or performance governor.
>
> I currently use acpi_cpufreq which works as usual.
>
> Processor:
> Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c,
> stepping: 0x3)
>
> Last known good kernel is: 4.5.0-01127-g9256d5a
> First known bad kernel is: 4.5.0-02535-g09fd671
>
> There is
> commit 277edba Merge tag 'pm+acpi-4.6-rc1-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
> in between, which brought a few changes in intel_pstate.
>
> thanks, Jörg

[v4.3.0-rc2->3] Regression: BIG networking performance loss

2015-09-29 Thread Jörg Otte

With kernels vmlinuz-4.3.0-rc2-00228-gd4a748a and earlier it is no
problem for me to stream HD-videos (700-800 KByte/s) from YouTube.

With the same video material and kernels
vmlinuz-4.3.0-rc2-00438-gd8cc397 and later I only reach 70-80 KByte/s.
That's one-tenth of what it was before.

The merges between 00228 -> 00438 are:
d8cc397 Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
c91d707 Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
bcba282 Merge tag 'usb-4.3-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
fb740f9 Merge tag 'tty-4.3-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
b11e7b8 Merge tag 'staging-4.3-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
7c1efea Merge tag 'driver-core-4.3-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
64b796e Merge tag 'char-misc-4.3-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
518a7cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411
PCI Express Gigabit Ethernet Controller (rev 07)
Driver: r8169.

Thanks, Jörg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference

2015-07-13 Thread Jörg Otte

2015-07-13 9:58 GMT+02:00 Maarten Lankhorst :
> Op 13-07-15 om 09:42 schreef Jörg Otte:
>> 2015-07-13 9:23 GMT+02:00 Maarten Lankhorst:
>>> Op 13-07-15 om 08:22 schreef Daniel Vetter:
>>>> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>>>>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte  wrote:
>>>>>> BUG: unable to handle kernel NULL pointer dereference at 0009
>>>>>> IP: [] 0xbd3447bb
>>>>> Ugh. Please enable KALLSYMS to get sane symbols.
>>>>>
>>>>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>>>>> so it's pretty clearly just that change from
>>>>>
>>>>> -   if (intel_crtc->active) {
>>>>> +   if (crtc_state->base.active) {
>>>>>
>>>>> and "crtc_state" is NULL.
>>>>>
>>>>> And the code very much knows that crtc_state can be NULL, since it's
>>>>> initialized with
>>>>>
>>>>> crtc_state = state->base.state ?
>>>>> intel_atomic_get_crtc_state(state->base.state,
>>>>> intel_crtc) : NULL;
>>>>>
>>>>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>>>>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>>>>> now, or is there a better fix? Like just checking crtc_state for NULL?
>>>> Indeed embarrassing. I've missed that we still have 1 caller left that's
>>>> using the transitional helpers, and those don't fill out
>>>> plane_state->state backpointers to the global atomic update since there is
>>>> no global atomic update for transitional helpers. Below diff should fix
>>>> this - we need to preferentially check crtc_state->active and if that's
>>>> not set intel_crtc->active should yield the right result for the one
>>>> remaining caller (it's in the crtc_disable paths).
>>>>
>>>> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
>>>> to transition to a different QA team ("we'll do this in 1 week without
>>>> upfront planning") I essentially don't have proper QA support for 1-2
>>>> months by now. The other trouble in this area specifically is that this
>>>> code is already completely changed in -next again, so any testing done on
>>>> integration trees (like -next or drm-intel-nightly) won't test any patches
>>>> for 4.2.
>>>> -Daniel
>>>>
>>>> Oh and Signed-off-by: Daniel Vetter  in case you
>>>> decide to apply this right away.
>>>>
>>> Well your version has the benefit of compiling without errors. :-)
>>>
>>> Reviewed-by: Maarten Lankhorst 
>> Just noticed another problem:
>> On each resume I get the following error:
>> ---[ cut here ]
>> WARNING: CPU: 2 PID: 2663 at
>> /data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319
>> 0x9a33d5e9()
>> WARN_ON(!crtc->state->enable)
>> CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15
>> Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
>> Workqueue: events_unbound 0x9a055750
>>  9a98ea28 9a6d84d2 
>> 9a03c416 88020951c4e0  
>> 8802141cb800 88021630c000 9a03c4d5 9a9c3664
>> Call Trace:
>> [] ? 0x9a6d84d2
>> [] ? 0x9a03c416
>> [] ? 0x9a03c4d5
>> [] ? 0x9a33d5e9
>> [] ? 0x9a343ac3
>> [] ? 0x9a3a
>> [] ? 0x9a345518
>> [] ? 0x9a3246f0
>> [] ? 0x9a2e1ce8
>> [] ? 0x9a236170
>> [] ? 0x9a38b28d
>> [] ? 0x9a38b784
>> [] ? 0x9a38baa4
>> [] ? 0x9a05577d
>> [] ? 0x9a04dc47
>> [] ? 0x9a04dfab
>> [] ? 0x9a04dea0
>> [] ? 0x9a05331c
>> [] ? 0x9a053260
>> [] ? 0x9a6dfa0f
>> [] ? 0x9a053260
>> --[ end trace 1b6d28ee34071679 ]---
>>
>> Nevertheless resume works, so it doesn't hurt me.
>>
>>
>> BTW: I also get up to 40-50 compile warnings like:
>> i915/i915_drv.h: In function 'i915_debugfs_connector_add':
>> i915/i915_drv.h:3119:53

Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference

2015-07-13 Thread Jörg Otte

2015-07-13 9:23 GMT+02:00 Maarten Lankhorst :
> Op 13-07-15 om 08:22 schreef Daniel Vetter:
>> On Sun, Jul 12, 2015 at 09:52:51AM -0700, Linus Torvalds wrote:
>>> On Sun, Jul 12, 2015 at 1:03 AM, Jörg Otte  wrote:
>>>> BUG: unable to handle kernel NULL pointer dereference at 0009
>>>> IP: [] 0xbd3447bb
>>> Ugh. Please enable KALLSYMS to get sane symbols.
>>>
>>> But yes, "crtc_state->base.active" is at offset 9 from "crtc_state",
>>> so it's pretty clearly just that change from
>>>
>>> -   if (intel_crtc->active) {
>>> +   if (crtc_state->base.active) {
>>>
>>> and "crtc_state" is NULL.
>>>
>>> And the code very much knows that crtc_state can be NULL, since it's
>>> initialized with
>>>
>>> crtc_state = state->base.state ?
>>> intel_atomic_get_crtc_state(state->base.state,
>>> intel_crtc) : NULL;
>>>
>>> Tssk. Daniel? Should I just revert that commit dec4f799d0a4
>>> ("drm/i915: Use crtc_state->active in primary check_plane func") for
>>> now, or is there a better fix? Like just checking crtc_state for NULL?
>> Indeed embarrassing. I've missed that we still have 1 caller left that's
>> using the transitional helpers, and those don't fill out
>> plane_state->state backpointers to the global atomic update since there is
>> no global atomic update for transitional helpers. Below diff should fix
>> this - we need to preferentially check crtc_state->active and if that's
>> not set intel_crtc->active should yield the right result for the one
>> remaining caller (it's in the crtc_disable paths).
>>
>> For cheap excuses why i915 is so crap in 4.2: Thanks to a hipshot decision
>> to transition to a different QA team ("we'll do this in 1 week without
>> upfront planning") I essentially don't have proper QA support for 1-2
>> months by now. The other trouble in this area specifically is that this
>> code is already completely changed in -next again, so any testing done on
>> integration trees (like -next or drm-intel-nightly) won't test any patches
>> for 4.2.
>> -Daniel
>>
>> Oh and Signed-off-by: Daniel Vetter  in case you
>> decide to apply this right away.
>>
> Well your version has the benefit of compiling without errors. :-)
>
> Reviewed-by: Maarten Lankhorst 

Just noticed another problem:
On each resume I get the following error:
---[ cut here ]
WARNING: CPU: 2 PID: 2663 at
/data/kernel/linux/drivers/gpu/drm/i915/intel_display.c:6319
0x9a33d5e9()
WARN_ON(!crtc->state->enable)
CPU: 2 PID: 2663 Comm: kworker/u8:80 Not tainted 4.2.0-rc2 #15
Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
Workqueue: events_unbound 0x9a055750
 9a98ea28 9a6d84d2 
9a03c416 88020951c4e0  
8802141cb800 88021630c000 9a03c4d5 9a9c3664
Call Trace:
[] ? 0x9a6d84d2
[] ? 0x9a03c416
[] ? 0x9a03c4d5
[] ? 0x9a33d5e9
[] ? 0x9a343ac3
[] ? 0x9a3a
[] ? 0x9a345518
[] ? 0x9a3246f0
[] ? 0x9a2e1ce8
[] ? 0x9a236170
[] ? 0x9a38b28d
[] ? 0x9a38b784
[] ? 0x9a38baa4
[] ? 0x9a05577d
[] ? 0x9a04dc47
[] ? 0x9a04dfab
[] ? 0x9a04dea0
[] ? 0x9a05331c
[] ? 0x9a053260
[] ? 0x9a6dfa0f
[] ? 0x9a053260
--[ end trace 1b6d28ee34071679 ]---

Nevertheless resume works, so it doesn't hurt me.


BTW: I also get up to 40-50 compile warnings like:
i915/i915_drv.h: In function 'i915_debugfs_connector_add':
i915/i915_drv.h:3119:53: warning: no return statement in function
returning non-void [-Wreturn-type]

which may cause as yet undiscovered trouble.

Thanks, Jörg
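
The remark quoted above, that "crtc_state->base.active" is at offset 9 and so a NULL crtc_state faults at address 9, can be illustrated with a small sketch. The layouts below are hypothetical, trimmed-down stand-ins chosen so that the member lands at offset 9; the real kernel structures have many more members, and the class and function names here are made up for illustration.

```python
import ctypes

# Hypothetical, trimmed-down layouts for illustration only. Pointers are
# modelled as 64-bit integers because the oops comes from an x86-64 machine.


class DrmCrtcState(ctypes.Structure):
    _fields_ = [
        ("crtc", ctypes.c_uint64),   # stands in for a 64-bit pointer
        ("enable", ctypes.c_bool),
        ("active", ctypes.c_bool),
    ]


class IntelCrtcState(ctypes.Structure):
    _fields_ = [("base", DrmCrtcState)]  # embedded as the first member


def fault_address(struct_ptr: int) -> int:
    """Address touched when reading crtc_state->base.active."""
    offset = IntelCrtcState.base.offset + DrmCrtcState.active.offset
    return struct_ptr + offset


print(hex(fault_address(0)))
```

With a NULL (zero) struct pointer the sketch yields address 9, matching the "NULL pointer dereference at 0009" in the oops. Any fix therefore has to test crtc_state for NULL before reading the member.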

Re: [4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference

2015-07-12 Thread Jörg Otte

2015-07-12 10:03 GMT+02:00 Jörg Otte :
> 4.2.0-rc1-00201-g59c3cb5 introduced a NULL pointer dereference and a
> system freeze
> when Xorg is started (4.2.0-rc1-00062-gc4b5fd3 was fine):
>
> BUG: unable to handle kernel NULL pointer dereference at 0009
> IP: [] 0xbd3447bb
> PGD 0
> Oops:  [#1] SMP
> CPU: 1 PID: 1290 Comm: Xorg Not tainted 4.2.0-rc1-00201-g59c3cb5 #6
> Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
> task: 8802149d6c00 ti: 880206df4000 task.ti: 880206df4000
> RIP: 0010:[]  [] 0xbd3447bb
> RSP: 0018:880206df7b08  EFLAGS: 00010246
> RAX:  RBX: 88021578f480 RCX: 88021578f4d0
> RDX:  RSI: 88021630b000 RDI: 880214a68000
> RBP: 88021630b000 R08: 88021578f4e0 R09: 88021578f4f0
> R10: 3c18 R11: fff2 R12: 880214a68000
> R13: 88021634e800 R14:  R15: 
> FS:  7ff3caa60880() GS:88021f28() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0009 CR3: 000206e07000 CR4: 001407e0
> Stack:
>  88020001 88020001 8802 88020001
>  88021578f500 bd2df135 880213f71c00 880214a68000
>   880214a7 0001 880214a68000
> Call Trace:
>  [] ? 0xbd2df135
>  [] ? 0xbd2b6ca8
>  [] ? 0xbd33cc7e
>  [] ? 0xbd343673
>  [] ? 0xbd2d0728
>  [] ? 0xbd2d088e
>  [] ? 0xbd2d10c5
>  [] ? 0xbd2c6976
>  [] ? 0xbd2d0fe0
>  [] ? 0xbd0c6a1f
>  [] ? 0xbd0e79e1
>  [] ? 0xbd0e7ed1
>  [] ? 0xbd6df557
> Code: 48 89 54 24 20 48 8b 54 24 40 48 89 ee 89 0c 24 4c 89 f9 c7 44
> 24 18 01 00 00 00 89 44 24 08 e8 bc 1f f7 ff 85 c0 41 89 c7 75 67 <41>
> 80 7e 09 00 74 56 49 8b 84 24 38 02 00 00 c6 85 d0 08 00 00
> RIP  [] 0xbd3447bb
>  RSP 
> CR2: 0009
> ---[ end trace dd0931f7f0d43d12 ] ---

I can fix the problem for me by reverting:

commit dec4f799d0a4c9edae20512fa60b0a36f3299ca2
Author: Daniel Vetter 
Date:   Tue Jul 7 11:15:47 2015 +0200

drm/i915: Use crtc_state->active in primary check_plane func
Since
commit 8c7b5ccb729870e606321b3703e2c2e698c49a95
Author: Ander Conselvan de Oliveira 
Date:   Tue Apr 21 17:13:19 2015 +0300
drm/i915: Use atomic helpers for computing changed flags

Thanks, Jörg

[4.2.0-rc1-00201-g59c3cb5] Regression: kernel NULL pointer dereference

2015-07-12 Thread Jörg Otte

4.2.0-rc1-00201-g59c3cb5 introduced a NULL pointer dereference and a
system freeze
when Xorg is started (4.2.0-rc1-00062-gc4b5fd3 was fine):

BUG: unable to handle kernel NULL pointer dereference at 0009
IP: [] 0xbd3447bb
PGD 0
Oops:  [#1] SMP
CPU: 1 PID: 1290 Comm: Xorg Not tainted 4.2.0-rc1-00201-g59c3cb5 #6
Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
task: 8802149d6c00 ti: 880206df4000 task.ti: 880206df4000
RIP: 0010:[]  [] 0xbd3447bb
RSP: 0018:880206df7b08  EFLAGS: 00010246
RAX:  RBX: 88021578f480 RCX: 88021578f4d0
RDX:  RSI: 88021630b000 RDI: 880214a68000
RBP: 88021630b000 R08: 88021578f4e0 R09: 88021578f4f0
R10: 3c18 R11: fff2 R12: 880214a68000
R13: 88021634e800 R14:  R15: 
FS:  7ff3caa60880() GS:88021f28() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0009 CR3: 000206e07000 CR4: 001407e0
Stack:
 88020001 88020001 8802 88020001
 88021578f500 bd2df135 880213f71c00 880214a68000
  880214a7 0001 880214a68000
Call Trace:
 [] ? 0xbd2df135
 [] ? 0xbd2b6ca8
 [] ? 0xbd33cc7e
 [] ? 0xbd343673
 [] ? 0xbd2d0728
 [] ? 0xbd2d088e
 [] ? 0xbd2d10c5
 [] ? 0xbd2c6976
 [] ? 0xbd2d0fe0
 [] ? 0xbd0c6a1f
 [] ? 0xbd0e79e1
 [] ? 0xbd0e7ed1
 [] ? 0xbd6df557
Code: 48 89 54 24 20 48 8b 54 24 40 48 89 ee 89 0c 24 4c 89 f9 c7 44
24 18 01 00 00 00 89 44 24 08 e8 bc 1f f7 ff 85 c0 41 89 c7 75 67 <41>
80 7e 09 00 74 56 49 8b 84 24 38 02 00 00 c6 85 d0 08 00 00
RIP  [] 0xbd3447bb
 RSP 
CR2: 0009
---[ end trace dd0931f7f0d43d12 ] ---

Re: [4.1.0-07254-gc13c810] Regression: Bluetooth not working.

2015-06-30 Thread Jörg Otte

2015-06-29 23:13 GMT+02:00 Tedd Ho-Jeong An :
> Hi Jorg
>
> On Mon, 29 Jun 2015 16:37:32 +0200
> Jörg Otte  wrote:
>
>> 2015-06-29 12:30 GMT+02:00 Alexey Dobriyan :
>> > On Mon, Jun 29, 2015 at 12:00 PM, Jörg Otte  wrote:
>> >> 2015-06-28 18:09 GMT+02:00 Alexey Dobriyan :
>> >>> On Sun, Jun 28, 2015 at 05:36:04PM +0200, Jörg Otte wrote:
>> >>>> 2015-06-26 16:28 GMT+02:00 Jörg Otte :
>> >>>> > 2015-06-26 12:03 GMT+02:00 Jörg Otte :
>> >>>> >> 2015-06-26 11:37 GMT+02:00 Marcel Holtmann :
>> >>>> >>> Hi Joerg,
>> >>>> >>>
>> >>>> >>>> Bluetooth is inoperable in current Linus tree and the
>> >>>> >>>> first bad commit is:
>> >>>> >>>>
>> >>>> >>>> 835a6a2f8603237a3e6cded5a6765090ecb06ea5 is the first bad commit
>> >>>> >>>> commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5
>> >>>> >>>> Author: Alexey Dobriyan 
>> >>>> >>>> Date:   Wed Jun 10 20:28:33 2015 +0300
>> >>>> >>>>
>> >>>> >>>>Bluetooth: Stop sabotaging list poisoning
>> >>>> >>>>
>> >>>> >>>>list_del() poisons pointers with special values, no need to 
>> >>>> >>>> overwrite them.
>> >>>> >>>>
>> >>>> >>>>Signed-off-by: Alexey Dobriyan 
>> >>>> >>>>Signed-off-by: Marcel Holtmann 
>> >>>> >>>>
>> >>>> >>>> My BT adapter is an intel 8087:07da
>> >>>> >>>> I reverted that commit and this fixed the problem for me.
>> >>>> >>>
>> >>>> >>> today we had a patch from Tedd fixing the list initialization in 
>> >>>> >>> the HIDP code.
>> >>>> >>>
>> >>>> >>> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
>> >>>> >>> index 9070dfd6b4ad..f1a117f8cad2 100644
>> >>>> >>> --- a/net/bluetooth/hidp/core.c
>> >>>> >>> +++ b/net/bluetooth/hidp/core.c
>> >>>> >>> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session 
>> >>>> >>> **out, const bdaddr_t *bdaddr,
>> >>>> >>> session->conn = l2cap_conn_get(conn);
>> >>>> >>> session->user.probe = hidp_session_probe;
>> >>>> >>> session->user.remove = hidp_session_remove;
>> >>>> >>> +   INIT_LIST_HEAD(&session->user.list);
>> >>>> >>> session->ctrl_sock = ctrl_sock;
>> >>>> >>> session->intr_sock = intr_sock;
>> >>>> >>> skb_queue_head_init(&session->ctrl_transmit);
>> >>>> >>>
>> >>>> >>> Could this be fixing it for you as well?
>> >>>> >>>
>> >>>> >> I will check this when I am at home in the
>> >>>> >> afternoon.
>> >>>> >>
>> >>>> >
>> >>>> > The patch works for me too.
>> >>>> >
>> >>>> Ok, this was a little bit hasty!
>> >>>> I now see the following additional problems:
>> >>>>
>> >>>> - System freeze on resume (always occurs).
>> >>>> - System freeze on shutdown (occurs sometimes).
>> >>>> - System freeze when BT-mouse is connecting (occurs sometimes).
>> >>>>
>> >>>> Then I can't do anything except power off.
>> >>>>
>> >>>> This happens only if both Bluetooth AND BT-mouse are activated.
>> >>>
>> >>> OK, what happens if you just revert only list_del patch?
>> >>
>> >> I have applied this patch:
>> >>
>> >> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
>> >> index 9070dfd6b4ad..f1a117f8cad2 100644
>> >> --- a/net/bluetooth/hidp/core.c
>> >> +++ b/net/bluetooth/hidp/core.c
>> >> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session
>> >> **out, const bdaddr_t *bdaddr,
>>

Re: [4.1.0-07254-gc13c810] Regression: Bluetooth not working.

2015-06-29 Thread Jörg Otte

2015-06-29 12:30 GMT+02:00 Alexey Dobriyan :
> On Mon, Jun 29, 2015 at 12:00 PM, Jörg Otte  wrote:
>> 2015-06-28 18:09 GMT+02:00 Alexey Dobriyan :
>>> On Sun, Jun 28, 2015 at 05:36:04PM +0200, Jörg Otte wrote:
>>>> 2015-06-26 16:28 GMT+02:00 Jörg Otte :
>>>> > 2015-06-26 12:03 GMT+02:00 Jörg Otte :
>>>> >> 2015-06-26 11:37 GMT+02:00 Marcel Holtmann :
>>>> >>> Hi Joerg,
>>>> >>>
>>>> >>>> Bluetooth is inoperable in current Linus tree and the
>>>> >>>> first bad commit is:
>>>> >>>>
>>>> >>>> 835a6a2f8603237a3e6cded5a6765090ecb06ea5 is the first bad commit
>>>> >>>> commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5
>>>> >>>> Author: Alexey Dobriyan 
>>>> >>>> Date:   Wed Jun 10 20:28:33 2015 +0300
>>>> >>>>
>>>> >>>>Bluetooth: Stop sabotaging list poisoning
>>>> >>>>
>>>> >>>>list_del() poisons pointers with special values, no need to 
>>>> >>>> overwrite them.
>>>> >>>>
>>>> >>>>Signed-off-by: Alexey Dobriyan 
>>>> >>>>Signed-off-by: Marcel Holtmann 
>>>> >>>>
>>>> >>>> My BT adapter is an intel 8087:07da
>>>> >>>> I reverted that commit and this fixed the problem for me.
>>>> >>>
>>>> >>> today we had a patch from Tedd fixing the list initialization in the 
>>>> >>> HIDP code.
>>>> >>>
>>>> >>> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
>>>> >>> index 9070dfd6b4ad..f1a117f8cad2 100644
>>>> >>> --- a/net/bluetooth/hidp/core.c
>>>> >>> +++ b/net/bluetooth/hidp/core.c
>>>> >>> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session 
>>>> >>> **out, const bdaddr_t *bdaddr,
>>>> >>> session->conn = l2cap_conn_get(conn);
>>>> >>> session->user.probe = hidp_session_probe;
>>>> >>> session->user.remove = hidp_session_remove;
>>>> >>> +   INIT_LIST_HEAD(&session->user.list);
>>>> >>> session->ctrl_sock = ctrl_sock;
>>>> >>> session->intr_sock = intr_sock;
>>>> >>> skb_queue_head_init(&session->ctrl_transmit);
>>>> >>>
>>>> >>> Could this be fixing it for you as well?
>>>> >>>
>>>> >> I will check this when I am at home in the
>>>> >> afternoon.
>>>> >>
>>>> >
>>>> > The patch works for me too.
>>>> >
>>>> Ok, this was a little bit hasty!
>>>> I now see the following additional problems:
>>>>
>>>> - System freeze on resume (always occurs).
>>>> - System freeze on shutdown (occurs sometimes).
>>>> - System freeze when BT-mouse is connecting (occurs sometimes).
>>>>
>>>> Then I can't do anything except power off.
>>>>
>>>> This happens only if both Bluetooth AND BT-mouse are activated.
>>>
>>> OK, what happens if you just revert only list_del patch?
>>
>> I have applied this patch:
>>
>> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
>> index 9070dfd6b4ad..f1a117f8cad2 100644
>> --- a/net/bluetooth/hidp/core.c
>> +++ b/net/bluetooth/hidp/core.c
>> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session
>> **out, const bdaddr_t *bdaddr,
>> session->conn = l2cap_conn_get(conn);
>> session->user.probe = hidp_session_probe;
>> session->user.remove = hidp_session_remove;
>> +   INIT_LIST_HEAD(&session->user.list);
>> session->ctrl_sock = ctrl_sock;
>> session->intr_sock = intr_sock;
>> skb_queue_head_init(&session->ctrl_transmit);
>>
>> without this patch bluetooth doesn't work at all for me.
>
> Sure.
>
> Please drop this patch, and do
>
>   git revert 835a6a2f8603237a3e6cded5a6765090ecb06ea5
>
> Maybe it's some other changes causing hangs.

Looks good so far. The system freeze on resume is gone.

Thanks, Jörg

Re: [4.1.0-07254-gc13c810] Regression: Bluetooth not working.

2015-06-29 Thread Jörg Otte

2015-06-28 18:09 GMT+02:00 Alexey Dobriyan :
> On Sun, Jun 28, 2015 at 05:36:04PM +0200, Jörg Otte wrote:
>> 2015-06-26 16:28 GMT+02:00 Jörg Otte :
>> > 2015-06-26 12:03 GMT+02:00 Jörg Otte :
>> >> 2015-06-26 11:37 GMT+02:00 Marcel Holtmann :
>> >>> Hi Joerg,
>> >>>
>> >>>> Bluetooth is inoperable in current Linus tree and the
>> >>>> first bad commit is:
>> >>>>
>> >>>> 835a6a2f8603237a3e6cded5a6765090ecb06ea5 is the first bad commit
>> >>>> commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5
>> >>>> Author: Alexey Dobriyan 
>> >>>> Date:   Wed Jun 10 20:28:33 2015 +0300
>> >>>>
>> >>>>Bluetooth: Stop sabotaging list poisoning
>> >>>>
>> >>>>list_del() poisons pointers with special values, no need to 
>> >>>> overwrite them.
>> >>>>
>> >>>>Signed-off-by: Alexey Dobriyan 
>> >>>>Signed-off-by: Marcel Holtmann 
>> >>>>
>> >>>> My BT adapter is an intel 8087:07da
>> >>>> I reverted that commit and this fixed the problem for me.
>> >>>
>> >>> today we had a patch from Tedd fixing the list initialization in the 
>> >>> HIDP code.
>> >>>
>> >>> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
>> >>> index 9070dfd6b4ad..f1a117f8cad2 100644
>> >>> --- a/net/bluetooth/hidp/core.c
>> >>> +++ b/net/bluetooth/hidp/core.c
>> >>> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session 
>> >>> **out, const bdaddr_t *bdaddr,
>> >>> session->conn = l2cap_conn_get(conn);
>> >>> session->user.probe = hidp_session_probe;
>> >>> session->user.remove = hidp_session_remove;
>> >>> +   INIT_LIST_HEAD(&session->user.list);
>> >>> session->ctrl_sock = ctrl_sock;
>> >>> session->intr_sock = intr_sock;
>> >>> skb_queue_head_init(&session->ctrl_transmit);
>> >>>
>> >>> Could this be fixing it for you as well?
>> >>>
>> >> I will check this when I am at home in the
>> >> afternoon.
>> >>
>> >
>> > The patch works for me too.
>> >
>> Ok, this was a little bit hasty!
>> I now see the following additional problems:
>>
>> - System freeze on resume (always occurs).
>> - System freeze on shutdown (occurs sometimes).
>> - System freeze when BT-mouse is connecting (occurs sometimes).
>>
>> Then I can't do anything except power off.
>>
>> This happens only if both Bluetooth AND BT-mouse are activated.
>
> OK, what happens if you just revert only list_del patch?

I have applied this patch:

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index 9070dfd6b4ad..f1a117f8cad2 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session
**out, const bdaddr_t *bdaddr,
session->conn = l2cap_conn_get(conn);
session->user.probe = hidp_session_probe;
session->user.remove = hidp_session_remove;
+   INIT_LIST_HEAD(&session->user.list);
session->ctrl_sock = ctrl_sock;
session->intr_sock = intr_sock;
skb_queue_head_init(&session->ctrl_transmit);

without this patch bluetooth doesn't work at all for me.

Thanks, Jörg

Re: [4.1.0-07254-gc13c810] Regression: Bluetooth not working.

2015-06-28 Thread Jörg Otte

2015-06-26 16:28 GMT+02:00 Jörg Otte :
> 2015-06-26 12:03 GMT+02:00 Jörg Otte :
>> 2015-06-26 11:37 GMT+02:00 Marcel Holtmann :
>>> Hi Joerg,
>>>
>>>> Bluetooth is inoperable in current Linus tree and the
>>>> first bad commit is:
>>>>
>>>> 835a6a2f8603237a3e6cded5a6765090ecb06ea5 is the first bad commit
>>>> commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5
>>>> Author: Alexey Dobriyan 
>>>> Date:   Wed Jun 10 20:28:33 2015 +0300
>>>>
>>>>Bluetooth: Stop sabotaging list poisoning
>>>>
>>>>list_del() poisons pointers with special values, no need to overwrite 
>>>> them.
>>>>
>>>>Signed-off-by: Alexey Dobriyan 
>>>>Signed-off-by: Marcel Holtmann 
>>>>
>>>> My BT adapter is an intel 8087:07da
>>>> I reverted that commit and this fixed the problem for me.
>>>
>>> today we had a patch from Tedd fixing the list initialization in the HIDP 
>>> code.
>>>
>>> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
>>> index 9070dfd6b4ad..f1a117f8cad2 100644
>>> --- a/net/bluetooth/hidp/core.c
>>> +++ b/net/bluetooth/hidp/core.c
>>> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session **out, 
>>> const bdaddr_t *bdaddr,
>>> session->conn = l2cap_conn_get(conn);
>>> session->user.probe = hidp_session_probe;
>>> session->user.remove = hidp_session_remove;
>>> +   INIT_LIST_HEAD(&session->user.list);
>>> session->ctrl_sock = ctrl_sock;
>>> session->intr_sock = intr_sock;
>>> skb_queue_head_init(&session->ctrl_transmit);
>>>
>>> Could this be fixing it for you as well?
>>>
>> I will check this when I am at home in the
>> afternoon.
>>
>
> The patch works for me too.
>
Ok, this was a little bit hasty!
I now see the following additional problems:

- System freeze on resume (always occurs).
- System freeze on shutdown (occurs sometimes).
- System freeze when BT-mouse is connecting (occurs sometimes).

Then I can't do anything except power off.

This happens only if both Bluetooth AND BT-mouse are activated.

Thanks, Jörg

Re: [4.1.0-07254-gc13c810] Regression: Bluetooth not working.

2015-06-26 Thread Jörg Otte

2015-06-26 12:03 GMT+02:00 Jörg Otte :
> 2015-06-26 11:37 GMT+02:00 Marcel Holtmann :
>> Hi Joerg,
>>
>>> Bluetooth is inoperable in current Linus tree and the
>>> first bad commit is:
>>>
>>> 835a6a2f8603237a3e6cded5a6765090ecb06ea5 is the first bad commit
>>> commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5
>>> Author: Alexey Dobriyan 
>>> Date:   Wed Jun 10 20:28:33 2015 +0300
>>>
>>>Bluetooth: Stop sabotaging list poisoning
>>>
>>>list_del() poisons pointers with special values, no need to overwrite 
>>> them.
>>>
>>>Signed-off-by: Alexey Dobriyan 
>>>Signed-off-by: Marcel Holtmann 
>>>
>>> My BT adapter is an intel 8087:07da
>>> I reverted that commit and this fixed the problem for me.
>>
>> today we had a patch from Tedd fixing the list initialization in the HIDP 
>> code.
>>
>> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
>> index 9070dfd6b4ad..f1a117f8cad2 100644
>> --- a/net/bluetooth/hidp/core.c
>> +++ b/net/bluetooth/hidp/core.c
>> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session **out, 
>> const bdaddr_t *bdaddr,
>> session->conn = l2cap_conn_get(conn);
>> session->user.probe = hidp_session_probe;
>> session->user.remove = hidp_session_remove;
>> +   INIT_LIST_HEAD(&session->user.list);
>> session->ctrl_sock = ctrl_sock;
>> session->intr_sock = intr_sock;
>> skb_queue_head_init(&session->ctrl_transmit);
>>
>> Could this be fixing it for you as well?
>>
> I will check this when I am at home in the
> afternoon.
>

The patch works for me too.

Thanks, Jörg

Re: [4.1.0-07254-gc13c810] Regression: Bluetooth not working.

2015-06-26 Thread Jörg Otte

2015-06-26 11:37 GMT+02:00 Marcel Holtmann :
> Hi Joerg,
>
>> Bluetooth is inoperable in current Linus tree and the
>> first bad commit is:
>>
>> 835a6a2f8603237a3e6cded5a6765090ecb06ea5 is the first bad commit
>> commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5
>> Author: Alexey Dobriyan 
>> Date:   Wed Jun 10 20:28:33 2015 +0300
>>
>>Bluetooth: Stop sabotaging list poisoning
>>
>>list_del() poisons pointers with special values, no need to overwrite 
>> them.
>>
>>Signed-off-by: Alexey Dobriyan 
>>Signed-off-by: Marcel Holtmann 
>>
>> My BT adapter is an intel 8087:07da
>> I reverted that commit and this fixed the problem for me.
>
> today we had a patch from Tedd fixing the list initialization in the HIDP 
> code.
>
> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
> index 9070dfd6b4ad..f1a117f8cad2 100644
> --- a/net/bluetooth/hidp/core.c
> +++ b/net/bluetooth/hidp/core.c
> @@ -915,6 +915,7 @@ static int hidp_session_new(struct hidp_session **out, 
> const bdaddr_t *bdaddr,
> session->conn = l2cap_conn_get(conn);
> session->user.probe = hidp_session_probe;
> session->user.remove = hidp_session_remove;
> +   INIT_LIST_HEAD(&session->user.list);
> session->ctrl_sock = ctrl_sock;
> session->intr_sock = intr_sock;
> skb_queue_head_init(&session->ctrl_transmit);
>
> Could this be fixing it for you as well?
>
I will check this when I am at home in the
afternoon.

Thanks, Jörg

[4.1.0-07254-gc13c810] Regression: Bluetooth not working.

2015-06-26 Thread Jörg Otte

Bluetooth is inoperable in current Linus tree and the
first bad commit is:

835a6a2f8603237a3e6cded5a6765090ecb06ea5 is the first bad commit
commit 835a6a2f8603237a3e6cded5a6765090ecb06ea5
Author: Alexey Dobriyan 
Date:   Wed Jun 10 20:28:33 2015 +0300

Bluetooth: Stop sabotaging list poisoning

list_del() poisons pointers with special values, no need to overwrite them.

Signed-off-by: Alexey Dobriyan 
Signed-off-by: Marcel Holtmann 

My BT adapter is an intel 8087:07da
I reverted that commit and this fixed the problem for me.
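
The commit message above relies on list_del() leaving poison values in the
removed node. A side effect is that a node only counts as "empty" in the
list_empty() sense once INIT_LIST_HEAD() has pointed it at itself; a
zero-filled or poisoned node does not. That would be consistent with the
one-line INIT_LIST_HEAD() patch quoted elsewhere in this thread, although
the thread does not spell out which check failed. The following is a
minimal userspace sketch of that behaviour, not kernel code: the helper
names and the DEMO_POISON values are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Stand-ins for the kernel's poison pointers; the values are arbitrary. */
#define DEMO_POISON1 ((struct list_head *)0x100)
#define DEMO_POISON2 ((struct list_head *)0x200)

/* Make a node (or list head) point at itself, i.e. an empty list. */
static inline void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* "Empty" means the node points at itself, not that it is NULL or zeroed. */
static inline bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert node n right after head. */
static inline void list_add_node(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and poison its pointers so that stale use is easy to spot. */
static inline void list_del_node(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = DEMO_POISON1;
	n->prev = DEMO_POISON2;
}
```

In this sketch a node that was merely zero-filled, or one that has just been
deleted, reports "not empty", so any code that tests emptiness before
registering a node needs the node to be initialized first.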

Thanks, Jörg

Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-18 Thread Jörg Otte

What this patch tried to do is to limit it to what userspace is
currently actually using. My mistake was to look only at BlueZ 5.x
userspace and not at BlueZ 4.x userspace. The fix to not break
existing userspace is essentially this:
>
> diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
> index a05b9dbf14c9..9070dfd6b4ad 100644
> --- a/net/bluetooth/hidp/core.c
> +++ b/net/bluetooth/hidp/core.c
> @@ -1313,7 +1313,8 @@ int hidp_connection_add(struct hidp_connadd_req *req,
> struct socket *ctrl_sock,
> struct socket *intr_sock)
>  {
> -   u32 valid_flags = 0;
> +   u32 valid_flags = BIT(HIDP_VIRTUAL_CABLE_UNPLUG) |
> + BIT(HIDP_BOOT_PROTOCOL_MODE);
>
> I'll ask Joerg to test this patch, but from looking at old userspace, that is 
> what is happening there.
>

I think the last 3 lines are missing, the complete patch is:

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index a05b9db..02298bc 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -1313,7 +1313,8 @@ int hidp_connection_add(struct hidp_connadd_req *req,
  struct socket *ctrl_sock,
  struct socket *intr_sock)
 {
- u32 valid_flags = 0;
+ u32 valid_flags = BIT(HIDP_VIRTUAL_CABLE_UNPLUG) |
+ BIT(HIDP_BOOT_PROTOCOL_MODE);
  struct hidp_session *session;
  struct l2cap_conn *conn;
  struct l2cap_chan *chan;

that patch works for me.
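
With valid_flags set to 0, every request that carries any flag bit falls
outside the mask, which would explain why a userspace that sets one of
these flags stops working. A minimal sketch of such a mask check follows.
The rejection with -EINVAL and the bit numbers 0 and 1 are assumptions for
illustration (the quoted hunk shows only the mask), and check_flags() is a
hypothetical helper, not a kernel function.

```c
#include <errno.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Assumed bit numbers; check net/bluetooth/hidp/hidp.h for the real ones. */
#define HIDP_VIRTUAL_CABLE_UNPLUG 0
#define HIDP_BOOT_PROTOCOL_MODE   1

/* Return 0 if req_flags only uses bits from valid_flags, else -EINVAL. */
static inline int check_flags(uint32_t req_flags, uint32_t valid_flags)
{
	return (req_flags & ~valid_flags) ? -EINVAL : 0;
}
```

Under these assumptions, a request with the boot-protocol bit set fails
against a mask of 0 but passes once both bits are listed in the mask.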

Thanks, Jörg

Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Jörg Otte

2015-04-17 18:55 GMT+02:00 Jörg Otte :
> 2015-04-17 18:51 GMT+02:00 Marcel Holtmann :
>> Hi Joerg,
>>
>>>>> On Fri, Apr 17, 2015 at 5:36 AM, Jörg Otte  wrote:
>>>>>> The BT mouse is "dead" in v4.1.
>>>>>> The BT mouse has been working in 4.0 and previous kernels, so this
>>>>>> is a regression.
>>>>>
>>>>> Any chance of bisecting it?
>>>>>
>>>>>  Linus
>>>> I will try that.
>>>>
>>>> Thanks, Jörg
>>>
>>> I first tried to bisect over the whole tree, but then I got an unbootable kernel.
>>> Then I bisected over net/bluetooth only and got the following first bad
>>> commit:
>>>
>>> 5f5da99f1da5b01c7c45473a500c7dbb77a00958 is the first bad commit
>>> commit 5f5da99f1da5b01c7c45473a500c7dbb77a00958
>>> Author: Marcel Holtmann 
>>> Date:   Wed Apr 1 13:51:53 2015 -0700
>>>
>>>Bluetooth: Restrict HIDP flags to only valid ones
>>>
>>>The HIDP flags should be clearly restricted to valid ones. So this puts
>>>extra checks in place to ensure this.
>>>
>>>Signed-off-by: Marcel Holtmann 
>>>Signed-off-by: Johan Hedberg 
>>>
>>> :040000 040000 b51ac3634c9d44f4d9df0e7f548b524954b99c76
>>> 63bfb47283609849f1b3b8f05fe61743ccddfee6 M  net
>>
>> thanks for bi-secting this. I looked at our existing userspace and 
>> restricted it to the flags that are currently in use. However it seems that 
>> I made a mistake. What version of BlueZ userspace are you using (bluetoothd 
>> --version).
>>
>> Regards
>>
>> Marcel
>>
> bluetoothd --version
> 4.98
>
> Thanks, Jörg

Just reverted that commit. It fixed the problem for me.

Thanks, Jörg

Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Jörg Otte

2015-04-17 18:51 GMT+02:00 Marcel Holtmann :
> Hi Joerg,
>
>>>> On Fri, Apr 17, 2015 at 5:36 AM, Jörg Otte  wrote:
>>>>> The BT mouse is "dead" in v4.1.
>>>>> The BT mouse has been working in 4.0 and previous kernels, so this
>>>>> is a regression.
>>>>
>>>> Any chance of bisecting it?
>>>>
>>>>  Linus
>>> I will try that.
>>>
>>> Thanks, Jörg
>>
>> I first tried to bisect over the whole tree, but then I got an unbootable kernel.
>> Then I bisected over net/bluetooth only and got the following first bad
>> commit:
>>
>> 5f5da99f1da5b01c7c45473a500c7dbb77a00958 is the first bad commit
>> commit 5f5da99f1da5b01c7c45473a500c7dbb77a00958
>> Author: Marcel Holtmann 
>> Date:   Wed Apr 1 13:51:53 2015 -0700
>>
>>Bluetooth: Restrict HIDP flags to only valid ones
>>
>>The HIDP flags should be clearly restricted to valid ones. So this puts
>>extra checks in place to ensure this.
>>
>>Signed-off-by: Marcel Holtmann 
>>Signed-off-by: Johan Hedberg 
>>
>> :040000 040000 b51ac3634c9d44f4d9df0e7f548b524954b99c76
>> 63bfb47283609849f1b3b8f05fe61743ccddfee6 M  net
>
> thanks for bi-secting this. I looked at our existing userspace and restricted 
> it to the flags that are currently in use. However it seems that I made a 
> mistake. What version of BlueZ userspace are you using (bluetoothd --version).
>
> Regards
>
> Marcel
>
bluetoothd --version
4.98

Thanks, Jörg

Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Jörg Otte

2015-04-17 16:51 GMT+02:00 Jörg Otte :
> 2015-04-17 15:44 GMT+02:00 Linus Torvalds :
>> On Fri, Apr 17, 2015 at 5:36 AM, Jörg Otte  wrote:
>>> The BT mouse is "dead" in v4.1.
>>> The BT mouse has been working in 4.0 and previous kernels, so this
>>> is a regression.
>>
>> Any chance of bisecting it?
>>
>>   Linus
> I will try that.
>
> Thanks, Jörg

I first tried to bisect over the whole tree, but then I got an unbootable kernel.
Then I bisected over net/bluetooth only and got the following first bad
commit:

5f5da99f1da5b01c7c45473a500c7dbb77a00958 is the first bad commit
commit 5f5da99f1da5b01c7c45473a500c7dbb77a00958
Author: Marcel Holtmann 
Date:   Wed Apr 1 13:51:53 2015 -0700

Bluetooth: Restrict HIDP flags to only valid ones

The HIDP flags should be clearly restricted to valid ones. So this puts
extra checks in place to ensure this.

Signed-off-by: Marcel Holtmann 
Signed-off-by: Johan Hedberg 

:040000 040000 b51ac3634c9d44f4d9df0e7f548b524954b99c76
63bfb47283609849f1b3b8f05fe61743ccddfee6 M  net


Thanks, Jörg

Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Jörg Otte

2015-04-17 15:44 GMT+02:00 Linus Torvalds :
> On Fri, Apr 17, 2015 at 5:36 AM, Jörg Otte  wrote:
>> The BT mouse is "dead" in v4.1.
>> The BT mouse has been working in 4.0 and previous kernels, so this
>> is a regression.
>
> Any chance of bisecting it?
>
>   Linus
I will try that.

Thanks, Jörg

[V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Jörg Otte

The BT mouse is "dead" in v4.1.
The BT mouse has been working in 4.0 and previous kernels, so this
is a regression.

BT adapter is an intel 8087:07da. The mouse is an MS Notebook Mouse 500.

It simply doesn't work, and no errors are displayed in dmesg.

Thanks, Jörg

Re: [V4.0-rc5 Regression] drm/intel problem

2015-03-27 Thread Jörg Otte

2015-03-27 8:03 GMT+01:00 Kalle Valo :
> Linus Torvalds  writes:
>
>>  can you verify/confirm that current git works for you? And if not,
>> maybe bisect exactly where it happened?
>
> I had a similar problem as Jörg on my Lenovo x230, display black on -rc5
> except some small colored line on the top. I updated to current git
> (commit below) and everything looks to be ok after 10 minutes of
> testing.
>
> commit 3c435c1e472ba344ee25f795f4807d4457e61f6c
> Merge: be8a9bc63328 9822393d23ba
> Author: Linus Torvalds 
> Date:   Thu Mar 26 15:04:05 2015 -0700
>
> Merge branch 'drm-fixes' of
> git://people.freedesktop.org/~airlied/linux
>
> Pull drm refcounting fixes from Dave Airlie:
>  "Here is the complete set of i915 bug/warn/refcounting fixes"
>
>
> --
> Kalle Valo

OK, current git

3c435c1 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

fixes the problem for me too.

Thanks, Jörg

[V4.0-rc5 Regression] drm/intel problem

2015-03-23 Thread Jörg Otte

The display remains dark in V4.0-rc5 except for a small white line
at the top of the screen, so I can't see anything. RC4 was good.
The latest kernel that I know to be good is
4.0.0-rc4-00199-gb314aca.

Thanks, Jörg

Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Jörg Otte

2015-03-11 12:01 GMT+01:00 Jörg Otte :
> 2015-03-10 18:04 GMT+01:00 Mathias Nyman :
>> On 10.03.2015 17:36, Jörg Otte wrote:
>>
>>>>> Any chance you could take a log with xhci debugging enabled before 
>>>>> attaching the DVB-T
>>>>> stick?
>>>>>
>>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>>>>
>>>>>
>>>>
>>>> here it comes attached.
>>>>
>>>>
>>>>> I'd suspect one of these two patches:
>>>>>
>>>>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>>>>> xhci: fix reporting of 0-sized URBs in control endpoint
>>>>>
>>>>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>>>>> xhci: Clear the host side toggle manually when endpoint is 'soft 
>>>>> reset'
>>>>>
>>>
>>> I reverted the commits.
>>> The second one, "xhci: Clear the host side...", is the culprit!
>>>
>>
>> Yes, thank you
>>
>> Seems that It wasn't mature enough, I'll revert it.
>>
>> From your logs I can see what went wrong,
>>
>> If you still have some time, could you try out a patch (attached) and see if 
>> it solves the
>> issue for you. (on top of clean 4.0-rc3). I can't reproduce it with my own 
>> USB DVB-T device
>
> Problems:
> error: patch failed: drivers/usb/host/xhci.c:2972
> error: drivers/usb/host/xhci.c: patch does not apply
>
> The patch looks formally correct to me.
> I have no idea why it does not apply.

OK, finally I got it applied successfully.
I can confirm now it works for me.

Thanks, Jörg

Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-11 Thread Jörg Otte

2015-03-10 18:04 GMT+01:00 Mathias Nyman :
> On 10.03.2015 17:36, Jörg Otte wrote:
>
>>>> Any chance you could take a log with xhci debugging enabled before 
>>>> attaching the DVB-T
>>>> stick?
>>>>
>>>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>>>
>>>>
>>>
>>> here it comes attached.
>>>
>>>
>>>> I'd suspect one of these two patches:
>>>>
>>>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>>>> xhci: fix reporting of 0-sized URBs in control endpoint
>>>>
>>>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>>>> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>>>>
>>
>> I reverted the commits.
>> The second one, "xhci: Clear the host side...", is the culprit!
>>
>
> Yes, thank you
>
> Seems that It wasn't mature enough, I'll revert it.
>
> From your logs I can see what went wrong,
>
> If you still have some time, could you try out a patch (attached) and see if 
> it solves the
> issue for you. (on top of clean 4.0-rc3). I can't reproduce it with my own 
> USB DVB-T device

Problems:
error: patch failed: drivers/usb/host/xhci.c:2972
error: drivers/usb/host/xhci.c: patch does not apply

The patch looks formally correct to me.
I have no idea why it does not apply.

Thanks, Jörg

Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Jörg Otte

2015-03-10 15:03 GMT+01:00 Jörg Otte :
> 2015-03-10 14:06 GMT+01:00 Mathias Nyman :
>> On 10.03.2015 11:40, Jörg Otte wrote:
>>> If I plug in my USB DVB-T stick I get the following in dmesg:
>>>
>>> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm 
>>> state.
>>> dvb-usb: will pass the complete MPEG2 transport stream to the software 
>>> demuxer.
>>> DVB: registering new adapter (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)
>>> usb 1-1: DVB: registering adapter 0 frontend 0 (TerraTec/qanu USB2.0
>>> Highspeed DVB-T Receiver)...
>>> input: IR-receiver inside an USB DVB receiver as
>>> /devices/pci:00/:00:14.0/usb1/1-1/input/input17
>>> dvb-usb: schedule remote query interval to 50 msecs.
>>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>>> current TD ep_index 1 comp_code 1
>>> xhci_hcd :00:14.0: Looking for event-dma 000207540400
>>> trb-start 000207540420 trb-end 000207540420 seg-start
>>> 0002075404
>>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>>> current TD ep_index 1 comp_code 1
>>> xhci_hcd :00:14.0: Looking for event-dma 000207540410
>>> trb-start 000207540420 trb-end 000207540420 seg-start
>>> 0002075404
>>> dvb-usb: bulk message failed: -110 (2/0)
>>>
>>> and DVB-T is not functional. The problem came in with:
>>>
>>> 1163d50 Merge tag 'usb-4.0-rc3' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
>>>
>>> I never had this xhci_hcd error before so this is a regression.
>>>
>>>
>>> Thanks, Jörg
>>
>> Oh, thanks.
>>
>> Looks like we get an event for a TRB we just moved past.
>>
>> Any chance you could take a log with xhci debugging enabled before attaching 
>> the DVB-T
>> stick?
>>
>> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>>
>>
>
> here it comes attached.
>
>
>> I'd suspect one of these two patches:
>>
>> commit 45ba2154d12fc43b70312198ec47085f10be801a
>> xhci: fix reporting of 0-sized URBs in control endpoint
>>
>> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
>> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>>

I reverted the commits.
The second one, "xhci: Clear the host side...", is the culprit!
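
In the error quoted in this thread, the event DMA addresses end in ...400
and ...410 while the current TD starts and ends at ...420, so each event
points at a TRB just before the TD. That matches Mathias's remark about an
event for a TRB the driver had already moved past. Below is a simplified
sketch of that comparison, assuming a single ring segment with no
wrap-around; event_in_td() and trbs_behind() are hypothetical helpers, not
the driver's own functions, and the addresses are taken as printed in the
log.

```c
#include <stdbool.h>
#include <stdint.h>

#define TRB_SIZE 16u /* an xHCI TRB is 16 bytes long */

/* True if the event's DMA address lies inside the TD's TRB range. */
static inline bool event_in_td(uint64_t td_start, uint64_t td_end,
			       uint64_t event_dma)
{
	return event_dma >= td_start && event_dma <= td_end;
}

/* Number of TRBs by which the event precedes the TD start (0 if none). */
static inline uint64_t trbs_behind(uint64_t td_start, uint64_t event_dma)
{
	return event_dma < td_start ? (td_start - event_dma) / TRB_SIZE : 0;
}
```

Fed with the values from the log, the two events fall two TRBs and one TRB
before the TD respectively, which is what the "not part of current TD"
message reports.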

Thanks, Jörg

Re: [V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Jörg Otte

2015-03-10 14:06 GMT+01:00 Mathias Nyman :
> On 10.03.2015 11:40, Jörg Otte wrote:
>> If I plug in my USB DVB-T stick I get the following in dmesg:
>>
>> dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
>> dvb-usb: will pass the complete MPEG2 transport stream to the software demuxer.
>> DVB: registering new adapter (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)
>> usb 1-1: DVB: registering adapter 0 frontend 0 (TerraTec/qanu USB2.0
>> Highspeed DVB-T Receiver)...
>> input: IR-receiver inside an USB DVB receiver as
>> /devices/pci:00/:00:14.0/usb1/1-1/input/input17
>> dvb-usb: schedule remote query interval to 50 msecs.
>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>> current TD ep_index 1 comp_code 1
>> xhci_hcd :00:14.0: Looking for event-dma 000207540400
>> trb-start 000207540420 trb-end 000207540420 seg-start
>> 0002075404
>> xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
>> current TD ep_index 1 comp_code 1
>> xhci_hcd :00:14.0: Looking for event-dma 000207540410
>> trb-start 000207540420 trb-end 000207540420 seg-start
>> 0002075404
>> dvb-usb: bulk message failed: -110 (2/0)
>>
>> and DVB-T is not functional. The problem came in with:
>>
>> 1163d50 Merge tag 'usb-4.0-rc3' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
>>
>> I never had this xhci_hcd error before so this is a regression.
>>
>>
>> Thanks, Jörg
>
> Oh, thanks.
>
> Looks like we get an event for a TRB we just moved past.
>
> Any chance you could take a log with xhci debugging enabled before
> attaching the DVB-T stick?
>
> echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
>
>

Here it is, attached.
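
For anyone repeating this capture, the dynamic-debug switch quoted above can be scripted. The following is a minimal Python sketch and not part of the original exchange; the helper names `dyndbg_query` and `apply_queries` are made up for illustration, and writing to the real control file assumes root privileges and a mounted debugfs.

```python
from pathlib import Path

# Control file used in the quoted command; debugfs must be mounted and root is needed.
CONTROL = Path("/sys/kernel/debug/dynamic_debug/control")


def dyndbg_query(module: str, enable: bool = True) -> str:
    """Build a dynamic-debug query for one module.

    '=p' sets the print flag (as in the quoted command), '-p' removes it again.
    """
    return f"module {module} {'=p' if enable else '-p'}"


def apply_queries(queries, control=CONTROL):
    """Write each query to the control file, one write per query."""
    for query in queries:
        with open(control, "w") as fh:
            fh.write(query)
```

Calling `apply_queries([dyndbg_query("xhci_hcd")])` as root before plugging in the stick should have the same effect as the quoted `echo`; `dyndbg_query("xhci_hcd", enable=False)` builds the query that switches the output off again.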


> I'd suspect one of these two patches:
>
> commit 45ba2154d12fc43b70312198ec47085f10be801a
> xhci: fix reporting of 0-sized URBs in control endpoint
>
> commit 27082e2654dc148078b0abdfc3c8e5ccbde0ebfa
> xhci: Clear the host side toggle manually when endpoint is 'soft reset'
>
> -Mathias
>

Thanks, Jörg


xhci-debug.gz
Description: GNU Zip compressed data

[V4.0.0-rc3] Xhci Regression: ERROR Transfer event TRB DMA ptr not part of current TD

2015-03-10 Thread Jörg Otte

If I plug in my USB DVB-T stick I get the following in dmesg:

dvb-usb: found a 'TerraTec/qanu USB2.0 Highspeed DVB-T Receiver' in warm state.
dvb-usb: will pass the complete MPEG2 transport stream to the software demuxer.
DVB: registering new adapter (TerraTec/qanu USB2.0 Highspeed DVB-T Receiver)
usb 1-1: DVB: registering adapter 0 frontend 0 (TerraTec/qanu USB2.0
Highspeed DVB-T Receiver)...
input: IR-receiver inside an USB DVB receiver as
/devices/pci:00/:00:14.0/usb1/1-1/input/input17
dvb-usb: schedule remote query interval to 50 msecs.
xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
current TD ep_index 1 comp_code 1
xhci_hcd :00:14.0: Looking for event-dma 000207540400
trb-start 000207540420 trb-end 000207540420 seg-start
0002075404
xhci_hcd :00:14.0: ERROR Transfer event TRB DMA ptr not part of
current TD ep_index 1 comp_code 1
xhci_hcd :00:14.0: Looking for event-dma 000207540410
trb-start 000207540420 trb-end 000207540420 seg-start
0002075404
dvb-usb: bulk message failed: -110 (2/0)

and DVB-T is not functional. The problem came in with:

1163d50 Merge tag 'usb-4.0-rc3' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

I never had this xhci_hcd error before so this is a regression.
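
As a rough aid for reading the two "Looking for event-dma" lines above, the sketch below computes how far each event pointer lies from the start of the current TD. It assumes the 16-byte TRB size defined by the xHCI specification; the function name `trb_distance` is hypothetical, and the addresses are copied as they appear in the log above, which may have been shortened by the list archive.

```python
TRB_SIZE = 16  # bytes per TRB, as defined by the xHCI specification


def trb_distance(event_dma: int, td_start: int) -> int:
    """TRBs between the event pointer and the first TRB of the current TD.

    A negative result means the event refers to a TRB before the TD start.
    """
    delta = event_dma - td_start
    if delta % TRB_SIZE:
        raise ValueError("pointers are not a whole number of TRBs apart")
    return delta // TRB_SIZE


TD_START = int("000207540420", 16)  # trb-start / trb-end from the log
for event in ("000207540400", "000207540410"):
    print(event, trb_distance(int(event, 16), TD_START))
# prints:
# 000207540400 -2
# 000207540410 -1
```

Both events point one or two TRBs behind the TD the driver is currently looking at.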


Thanks, Jörg

[V4.0] Regression: Support for Bluetooth Adapter dropped.

2015-02-27 Thread Jörg Otte

Bluetooth has always worked on my notebook. I have successfully
used a Bluetooth mouse and OBEX file transfer for years. With
kernel v4.0 the adapter is no longer recognized.

It is an Intel 8087:07da.

The first bad commit is:
commit d0ac9eb72b6dceae318c15ee82ef2aaee233666d
Author: Marcel Holtmann 
Date:   Wed Jan 28 19:41:43 2015 -0800

Bluetooth: btusb: Ignore unknown Intel devices with generic descriptor

Can you please revert that commit?

Thanks, Jörg

Re: [v3.20]: Missing /sys/class/ACAD directory OR Cannot release Mutex

2015-02-24 Thread Jörg Otte

2015-02-24 12:03 GMT+01:00 Jörg Otte :
> 2015-02-24 0:24 GMT+01:00 Rafael J. Wysocki :
>> On Sunday, February 22, 2015 01:33:09 PM Jörg Otte wrote:
>>> Starting with kernel 3.19.0-05184-g18320f2, I often find my notebook
>>> running with 'powersave' policy even if it is on AC.
>>>
>>> It turned out that /sys/class/ACAD directory is missing and in the
>>> logs I see:
>>>
>>> ACPI Error: Cannot release Mutex [MUT0], not acquired (20150204/exmutex-376)
>>> ACPI Error: Method parse/execution failed [\_SB_.ACAD._PSR] (Node
>>> 880216028d48), AE_AML_MUTEX_NOT_ACQUIRED (20150204/psparse-536)
>>> ACPI Exception: AE_AML_MUTEX_NOT_ACQUIRED, Error reading AC Adapter
>>> state (20150204/ac-131)
>>>
>>> In the last 25 startups since 3.19.0-05184-g18320f2 it occurred 13 times.
>>>
>>> It does not occur with kernels older than 3.19.0-05184-g18320f2.
>>
>> Which mainline commit is that?
> 18320f2 Merge tag 'pm+acpi-3.20-rc1-2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
>
>>
>> The only changes in 4.0-rc1 that can affect that are the ACPI EC updates IIRC.
>>
>> Is there a chance to try the acpi-ec branch from linux-pm.git and see if the
>> problem is there?
> Should be no problem. I will give it a try.

I booted kernel 3.19.0-rc5-00560-g92e4b1b from linux-pm/acpi-ec 5 times
and it didn't occur.
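
To judge how much weight 5 clean boots carry, here is a back-of-the-envelope calculation. It is only a sketch: it assumes each boot fails independently with the rate observed earlier (13 failures in 25 startups), which may not hold in practice.

```python
from fractions import Fraction

failures, boots = 13, 25           # figures from the original report
p_fail = Fraction(failures, boots)

clean_boots = 5                    # boots of the acpi-ec branch without the error
p_all_clean = (1 - p_fail) ** clean_boots

print(f"observed failure rate: {float(p_fail):.0%}")
print(f"chance of {clean_boots} clean boots by luck: {float(p_all_clean):.1%}")
# prints:
# observed failure rate: 52%
# chance of 5 clean boots by luck: 2.5%
```

Under that assumption, 5 clean boots in a row would be unlikely if the branch had the same problem, though a few more boots would make the result firmer.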

Today I noticed 2 other things:
1) There are at least 2 slightly different error cases
 a) ACPI Error: Cannot release Mutex [MUT0], not acquired
 b) ACPI Error: Thread 370114560 cannot release Mutex [MUT0]
acquired by thread ..

2) Boot times differ between the good and the bad case; the bad case takes
significantly longer:
 a)  Good case
[0.285892] intel_idle: MWAIT substates: 0x21120
[0.285893] intel_idle: v0.4 model 0x3A
[0.285894] intel_idle: lapic_timer_reliable_states 0x
[0.288461] ACPI: AC Adapter [ACAD] (off-line)

 b) Bad case
[0.284061] intel_idle: MWAIT substates: 0x21120
[0.284062] intel_idle: v0.4 model 0x3A
[0.284063] intel_idle: lapic_timer_reliable_states 0x
[1.279089] tsc: Refined TSC clocksource calibration: 2494.334 MHz
[2.280355] Switched to clocksource tsc
[5.194933] ACPI Error: Thread 370114560 cannot release Mutex [MUT0]
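
The difference between the two cases can be put into numbers with a short script. This is an illustrative sketch; the helper names `stamp_of` and `delay` are made up, and the log lines are copied from the excerpts above.

```python
import re

STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

GOOD = """\
[0.285894] intel_idle: lapic_timer_reliable_states 0x
[0.288461] ACPI: AC Adapter [ACAD] (off-line)
"""

BAD = """\
[0.284063] intel_idle: lapic_timer_reliable_states 0x
[1.279089] tsc: Refined TSC clocksource calibration: 2494.334 MHz
[2.280355] Switched to clocksource tsc
[5.194933] ACPI Error: Thread 370114560 cannot release Mutex [MUT0]
"""


def stamp_of(log: str, marker: str) -> float:
    """Timestamp of the first log line whose message contains marker."""
    for line in log.splitlines():
        m = STAMP.match(line.strip())
        if m and marker in m.group(2):
            return float(m.group(1))
    raise ValueError(f"marker not found: {marker}")


def delay(log: str, start_marker: str, end_marker: str) -> float:
    """Seconds between the first start_marker line and the first end_marker line."""
    return round(stamp_of(log, end_marker) - stamp_of(log, start_marker), 6)


print(delay(GOOD, "lapic_timer_reliable_states", "ACPI"))  # 0.002567
print(delay(BAD, "lapic_timer_reliable_states", "ACPI"))   # 4.91087
```

In the good case the AC adapter message follows within about 3 ms; in the bad case almost 5 seconds pass before the mutex error appears.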

Thanks, Jörg

Re: [v3.20] ACPI backlight problem

2015-02-24 Thread Jörg Otte

2015-02-24 0:18 GMT+01:00 Rafael J. Wysocki :
> On Sunday, February 22, 2015 05:53:11 PM Jörg Otte wrote:
>> When I press the backlight keys, normally the OSD is shown and the
>> backlight immediately follows the keystrokes step by step.
>
> That's how it works in 3.19, right?
Yes.

>
>> Now in v3.20 it behaves differently:
>> When I press the backlight keys, at first nothing happens. If I am
>> watching a video, it stops.
>> After some seconds the OSD is shown, the backlight changes - in one step -
>> to the new level, and the video continues.
>
> So what machine is that?

FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012

Thanks, Jörg

Re: [v3.20]: Missing /sys/class/ACAD directory OR Cannot release Mutex

2015-02-24 Thread Jörg Otte

2015-02-24 0:24 GMT+01:00 Rafael J. Wysocki :
> On Sunday, February 22, 2015 01:33:09 PM Jörg Otte wrote:
>> Starting with kernel 3.19.0-05184-g18320f2, I often find my notebook
>> running with 'powersave' policy even if it is on AC.
>>
>> It turned out that /sys/class/ACAD directory is missing and in the
>> logs I see:
>>
>> ACPI Error: Cannot release Mutex [MUT0], not acquired (20150204/exmutex-376)
>> ACPI Error: Method parse/execution failed [\_SB_.ACAD._PSR] (Node
>> 880216028d48), AE_AML_MUTEX_NOT_ACQUIRED (20150204/psparse-536)
>> ACPI Exception: AE_AML_MUTEX_NOT_ACQUIRED, Error reading AC Adapter
>> state (20150204/ac-131)
>>
>> In the last 25 startups since 3.19.0-05184-g18320f2 it occurred 13 times.
>>
>> It does not occur with kernels older than 3.19.0-05184-g18320f2.
>
> Which mainline commit is that?
18320f2 Merge tag 'pm+acpi-3.20-rc1-2' of
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

>
> The only changes in 4.0-rc1 that can affect that are the ACPI EC updates IIRC.
>
> Is there a chance to try the acpi-ec branch from linux-pm.git and see if the
> problem is there?
Should be no problem. I will give it a try.

Thanks, Jörg

[v3.20] ACPI backlight problem

2015-02-22 Thread Jörg Otte

When I press the backlight keys, normally the OSD is shown and the
backlight immediately follows the keystrokes step by step.

Now in v3.20 it behaves differently:
When I press the backlight keys, at first nothing happens. If I am
watching a video, it stops.
After some seconds the OSD is shown, the backlight changes - in one step -
to the new level, and the video continues.

Thanks, Jörg

[v3.20]: Missing /sys/class/ACAD directory OR Cannot release Mutex

2015-02-22 Thread Jörg Otte

Starting with kernel 3.19.0-05184-g18320f2, I often find my notebook
running with 'powersave' policy even if it is on AC.

It turned out that /sys/class/ACAD directory is missing and in the
logs I see:

ACPI Error: Cannot release Mutex [MUT0], not acquired (20150204/exmutex-376)
ACPI Error: Method parse/execution failed [\_SB_.ACAD._PSR] (Node
880216028d48), AE_AML_MUTEX_NOT_ACQUIRED (20150204/psparse-536)
ACPI Exception: AE_AML_MUTEX_NOT_ACQUIRED, Error reading AC Adapter
state (20150204/ac-131)

In the last 25 startups since 3.19.0-05184-g18320f2 it occurred 13 times.

It does not occur with kernels older than 3.19.0-05184-g18320f2.

Thanks, Jörg

Re: [v3.20] Regression: V3.20 doesn't wake up from suspend

2015-02-13 Thread Jörg Otte

2015-02-13 15:40 GMT+01:00 Rafael J. Wysocki :
> On Friday, February 13, 2015 09:49:20 AM Jörg Otte wrote:
>> In suspend state the power-led remains on (should be blinking).
>> So maybe the kernel does not reach suspend state correctly.
>> Once suspended, pressing the suspend key to wake up does not have
>> any visible effect.
>>
>> There is nothing special in the logs.
>>
>> I tried to bisect, but I ran into an unbootable kernel, so all I
>> can say is:
>> Last good kernel:  3.19.0-00463-g3e8c04e
>> First bad kernel:  3.19.0-02595-gc5ce28d
>>
>> Suspend has always worked, so this is a regression.
>>
>> It is a FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
>
> I'll be sending a pull request with one revert related to broken suspend
> later today, so hopefully this will help you too.

Yes, that fixed it.

Thanks, Jörg

[v3.20] Regression: V3.20 doesn't wake up from suspend

2015-02-13 Thread Jörg Otte

In suspend state the power-led remains on (should be blinking).
So maybe the kernel does not reach suspend state correctly.
Once suspended, pressing the suspend key to wake up does not have
any visible effect.

There is nothing special in the logs.

I tried to bisect, but I ran into an unbootable kernel, so all I
can say is:
Last good kernel:  3.19.0-00463-g3e8c04e
First bad kernel:  3.19.0-02595-gc5ce28d
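
For reference, the two version strings already bound the search. The sketch below (helper name `parse_describe` is made up) splits them into tag, commit count and abbreviated hash, and estimates how many bisection steps remain. It assumes both kernels were described relative to the same tag and that the good commit is an ancestor of the bad one.

```python
import math
import re

# <tag>-<commits since tag>-g<abbreviated hash>, as in the kernel version strings above
DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(version: str):
    """Split a git-describe style version into (tag, commit_count, short_sha)."""
    m = DESCRIBE.match(version)
    if m is None:
        raise ValueError(f"not a describe-style version: {version}")
    return m.group("tag"), int(m.group("count")), m.group("sha")


good = parse_describe("3.19.0-00463-g3e8c04e")
bad = parse_describe("3.19.0-02595-gc5ce28d")

span = bad[1] - good[1]                # commits between good and bad
steps = math.ceil(math.log2(span))     # each bisection step halves the range

print(f"git bisect start {bad[2]} {good[2]}")
print(f"{span} commits apart, about {steps} bisection steps")
# prints:
# git bisect start c5ce28d 3e8c04e
# 2132 commits apart, about 12 bisection steps
```

When a bisection step lands on a kernel that does not boot, `git bisect skip` lets git pick a nearby commit instead of aborting the search.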

Suspend has always worked, so this is a regression.

It is a FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012

Thanks, Jörg

Re: [proc:] 3.16.0-10436-g9138475: access denied to /proc/1540/task/1540/net/dev

2014-08-11 Thread Jörg Otte

2014-08-11 6:30 GMT+02:00 Linus Torvalds :
> On Sun, Aug 10, 2014 at 1:05 PM, Eric W. Biederman
>  wrote:
>>
>> Linus would you like me to send pull request with those two changes reverted?
>
> I just did them (delayed it a bit in the hope to get confirmation, but
> it looks very straightforward, so since I'll be on airplanes most of
> tomorrow..)
>
>  Linus

OK, Kernel: 3.16.0-10473-gc8d6637 fixes the problem for me.

Thanks,
Jörg

[proc:] 3.16.0-10436-g9138475: access denied to /proc/1540/task/1540/net/dev

2014-08-10 Thread Jörg Otte

My network interface eth0 doesn't come up in 3.16.0-10436-g9138475.
I am seeing the following "security problem" in dmesg:

audit: type=1400 audit(1407684227.003:28): apparmor="DENIED"
  operation="open" profile="/sbin/dhclient"
  name="/proc/1540/task/1540/net/dev" pid=1540 comm="dhclient"
  requested_mask="r" denied_mask="r" fsuid=0 ouid=0

I think the problem is introduced by the following commits, especially
6ba8ed7:

344470c proc: Point /proc/mounts at /proc/thread-self/mounts instead
of /proc/self/mounts
e813244 proc: Point /proc/net at /proc/thread-self/net instead of /proc/self/net
0097875 proc: Implement /proc/thread-self to point at the directory of
the current thread
6ba8ed7 proc: Have net show up under /proc//task/

To get eth0 activated I need to MODIFY APPARMOR-CONFIGURATION:

e.g.
# Site-specific additions and overrides for sbin.dhclient.
# For more details, please see /etc/apparmor.d/local/README.
/sbin/dhclient {
  @{PROC}/[0-9]*/task/[0-9]*/net/ r,
  @{PROC}/[0-9]*/task/[0-9]*/net/** r,
}
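
To illustrate why the extra rule is needed, the sketch below parses the audit line from above and tests the denied path against two regular expressions. Both patterns are hand-written approximations of AppArmor globbing (`*` stays within one path component, `**` may cross `/`), with `@{PROC}` taken to be `/proc`; `OLD_STYLE` is a hypothetical rule without the `task/<tid>` component, not a quote from the shipped dhclient profile.

```python
import re

AUDIT = ('audit: type=1400 audit(1407684227.003:28): apparmor="DENIED" '
         'operation="open" profile="/sbin/dhclient" '
         'name="/proc/1540/task/1540/net/dev" pid=1540 comm="dhclient" '
         'requested_mask="r" denied_mask="r" fsuid=0 ouid=0')

FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit(line: str) -> dict:
    """Turn the key=value pairs of an audit record into a dict (quotes stripped)."""
    return {key: value.strip('"') for key, value in FIELD.findall(line)}


# Approximation of: @{PROC}/[0-9]*/task/[0-9]*/net/** r,
NEW_RULE = re.compile(r"^/proc/[0-9][^/]*/task/[0-9][^/]*/net/.*$")
# Hypothetical rule covering only the per-process path /proc/<pid>/net/...
OLD_STYLE = re.compile(r"^/proc/[0-9][^/]*/net/.*$")

denied = parse_audit(AUDIT)["name"]
print(denied)                          # /proc/1540/task/1540/net/dev
print(bool(NEW_RULE.match(denied)))    # True
print(bool(OLD_STYLE.match(denied)))   # False
```

The denied path contains the new `task/<tid>` component, so a rule written for `/proc/<pid>/net/` alone would no longer cover it.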

Is this interface change to user space intentional?

Thanks,
Jörg

Re: [usb resume regression] in 3.16-rc1

2014-06-20 Thread Jörg Otte

2014-06-20 16:57 GMT+02:00 Alan Stern :
> On Fri, 20 Jun 2014, Jörg Otte wrote:
>
>> 2014-06-19 19:35 GMT+02:00 Alan Stern :
>> > On Thu, 19 Jun 2014, Jörg Otte wrote:
>> >
>> >> I don't know how to do this.
>> >
>> > To enable dynamic debugging (as root):
>> >
>> > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
>> > echo 'module ehci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
>> >
>> > Do this before you carry out the suspend, and post the resulting dmesg
>> > log.
>> >
>> > Alan Stern
>> >
>> here it is.
>
> Now I see the reason for the regression.  There already is a patch to
> fix it:
>
> http://marc.info/?l=linux-usb&m=140304702623966&w=2
>
> The fix will probably be included in 3.16-rc2 or -rc3.
>
> Alan Stern
>
The patch works for me.

Thanks, Jörg

Re: [usb resume regression] in 3.16-rc1

2014-06-19 Thread Jörg Otte

I don't know how to do this.

Thanks, Jörg

2014-06-19 17:02 GMT+02:00 Alan Stern :
> On Thu, 19 Jun 2014, Jörg Otte wrote:
>
>> On resume with 3.16-rc1 I get the following error messages in dmesg,
>> none of which are present in 3.15:
>>
>> [   43.518116] dpm_run_callback(): 0xa53c4120 returns -13
>> [   43.518119] PM: Device 1-1 failed to resume async: error -13
>> [   43.522380] hub 1-1:1.0: hub_port_status failed (err = -71)
>> [   43.528538] sd 0:0:0:0: [sda] Starting disk
>> [   43.530886] hub 1-1:1.0: hub_port_status failed (err = -71)
>> [   43.535140] usb 1-1-port2: cannot disable (err = -71)
>> [   43.535142] dpm_run_callback(): 0xa53c4120 returns -71
>> [   43.535145] PM: Device 1-1.2 failed to resume async: error -71
>> [   43.543673] usb 1-1-port4: cannot disable (err = -71)
>> [   43.543674] dpm_run_callback(): 0xa53c4120 returns -71
>> [   43.543679] PM: Device 1-1.4 failed to resume async: error -71
>>
>> USB devices connected to the hub (in my case mouse and keyboard) are
>> all dead after resume and cannot be activated again.
>
> -13 is -EACCES.  It's not clear what could have caused that error,
> since EACCES isn't used in the USB core or in ehci-hcd.
>
> Can you try doing the same thing after enabling dynamic debugging in
> usbcore and ehci-hcd?
>
> Alan Stern
>
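
The numeric error codes in the quoted log can be decoded with a small lookup. This is an illustrative sketch: the table lists only the Linux errno values that appear in these threads, and the name `decode` is made up.

```python
import re

# Linux errno values seen in these reports (asm-generic numbering)
LINUX_ERRNO = {
    13: ("EACCES", "Permission denied"),
    71: ("EPROTO", "Protocol error"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

ERR = re.compile(r"(?:returns|error|err =) -(\d+)")

LOG = """\
dpm_run_callback(): 0xa53c4120 returns -13
PM: Device 1-1 failed to resume async: error -13
hub 1-1:1.0: hub_port_status failed (err = -71)
"""


def decode(log: str):
    """Return (code, name, description) for every distinct error code in the log."""
    codes = sorted({int(number) for number in ERR.findall(log)})
    return [(code, *LINUX_ERRNO.get(code, ("?", "unknown"))) for code in codes]


for code, name, text in decode(LOG):
    print(f"-{code}: {name} ({text})")
# prints:
# -13: EACCES (Permission denied)
# -71: EPROTO (Protocol error)
```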

Re: [Intel-gfx] [3.14.0-rc4] regression: drm FIFO underruns

2014-05-16 Thread Jörg Otte

2014-05-16 13:53 GMT+02:00 Ville Syrjälä :
> On Tue, May 13, 2014 at 06:38:32PM +0200, Daniel Vetter wrote:
>> On Tue, May 13, 2014 at 05:21:49PM +0200, Jörg Otte wrote:
>> > 2014-05-13 15:22 GMT+02:00 Daniel Vetter :
>> > > On Tue, May 13, 2014 at 12:38:41PM +0200, Daniel Vetter wrote:
>> > >> On Tue, May 13, 2014 at 12:29 PM, Jörg Otte  wrote:
>> > >> >>> Branch drm-intel-nightly as of
>> > >> >>> ed60c27 drm-intel-nightly: 2014y-05m-09d-21h-51m-45s integration manifest
>> > >> >>> looks bad:
>> > >> >>>- KDE splash screen on boot-up is not visible
>> > >> >>>- x-windows don't have title and menu bars
>> > >> >>>- KDE system menu is not visible
>> > >> >>>- moving windows around destroys their content
>> > >> >>
>> > >> >> Ugh, that's ugly. Did nothing else change, e.g. the version of
>> > >> >> xf86-video-intel?
>> > >> >
>> > >> >  (II) Loading /usr/lib/xorg/modules/drivers/intel_drv.so
>> > >> >  (II) Module intel: vendor="X.Org Foundation"
>> > >> >  compiled for 1.11.3, module version = 2.17.0
>> > >> >  Module class: X.Org Video Driver
>> > >> >  ABI class: X.Org Video Driver, version 11.0
>> > >>
>> > >> Chris, any ideas? It's an ivybridge apparently.
>> > >>
>> > >> For the fifo underruns I think we've fully confirmed that they only
>> > >> happen on boot-up. I'll try to come up with some ideas what could have
>> > >> gone wrong there.
>> > >
>> > > Please test the below patch.
>> > > -Daniel
>> > >
>> > > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
>> > > index b10fbde1d5ee..63ced2dee027 100644
>> > > --- a/drivers/gpu/drm/i915/i915_irq.c
>> > > +++ b/drivers/gpu/drm/i915/i915_irq.c
>> > > @@ -427,9 +427,6 @@ bool __intel_set_cpu_fifo_underrun_reporting(struct drm_device *dev,
>> > >
>> > > ret = !intel_crtc->cpu_fifo_underrun_disabled;
>> > >
>> > > -   if (enable == ret)
>> > > -   goto done;
>> > > -
>> > > intel_crtc->cpu_fifo_underrun_disabled = !enable;
>> > >
>> > > if (enable && (INTEL_INFO(dev)->gen < 5 || IS_VALLEYVIEW(dev)))
>> > > @@ -441,7 +438,6 @@ bool __intel_set_cpu_fifo_underrun_reporting(struct drm_device *dev,
>> > > else if (IS_GEN8(dev))
>> > > broadwell_set_fifo_underrun_reporting(dev, pipe, enable);
>> > >
>> > > -done:
>> > > return ret;
>> > >  }
>> > >
>> > > --
>> >
>> > Doesn't work for me, I still have an underrun at boot-up.
>>
>> I'm at a loss tbh with ideas. We successfully disable both pipes, then
>> enable pipe A and it all works.
>>
>> Then we enable pipe B and _both_ pipes underrun immediately afterwards.
>> Really strange. Can you please reproduce the issue again on
>> drm-intel-nightly (latest -nightly should also have the display
>> corruptions fixed, so good to retest anyway) and attach a new dmesg with
>> drm.debug=0xe.
>
> I see a few underruns on my IVB as well. But it seems to be limited to
> cases that involve the VGA connector, which doesn't actually exist on
> this machine so I can't be sure if it's really properly set up on the
> PCH. But so far with just two HDMI connectors I was unable to reproduce
> it.
>
>>
>> Meanwhile I'll try to come up with new theories and ideas.
>
> I was thinking that we might frob with the PCH refclk during driver init
> and that might cause the PCH underrun for Jörg, but it looks like the
> underruns really happen at the modeset time which is much later than the
> PCH refclock init.
>
> For the 1<->n pipe transition, I don't think we handle it correctly at
> the moment. I have a fix as part of my remaining watermark patches. I
> rebased those and I'll repost them soon. In the meantime I pushed them
> to [1]. Jörg, can you give that branch a go?
>
> [1] git://gitorious.org/vsyrjala/linux.git watermarks_intm_31_notrace
>
Unfortunately not working for me.
It gives me some more errors:

[drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo
underrun on pipe A
[drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun
[drm:ivybridge_set_fifo_underrun_reporting] *ERROR* uncleared fifo
underrun on pipe B
[drm:ivb_err_int_handler] *ERROR* Pipe B FIFO underrun
[drm:cpt_set_fifo_underrun_reporting] *ERROR* uncleared pch fifo
underrun on pch transcoder A
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
[drm:cpt_set_fifo_underrun_reporting] *ERROR* uncleared pch fifo
underrun on pch transcoder B
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder B FIFO underrun
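
A quick way to see which units are affected is to collect them from the handler messages. The sketch below is illustrative only; the function name `underrun_units` is made up, and the sample text contains the four interrupt-handler lines from above.

```python
import re

UNDERRUN = re.compile(r"\*ERROR\* (Pipe|PCH transcoder) ([A-Z]) FIFO underrun")

LOG = """\
[drm:ivb_err_int_handler] *ERROR* Pipe A FIFO underrun
[drm:ivb_err_int_handler] *ERROR* Pipe B FIFO underrun
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder B FIFO underrun
"""


def underrun_units(log: str):
    """Sorted list of the pipes and PCH transcoders that reported an underrun."""
    return sorted({f"{kind} {unit}" for kind, unit in UNDERRUN.findall(log)})


print(underrun_units(LOG))
# prints: ['PCH transcoder A', 'PCH transcoder B', 'Pipe A', 'Pipe B']
```

With this branch both CPU pipes and both PCH transcoders report underruns.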

Thanks, Jörg

Re: [Intel-gfx] [3.14.0-rc4] regression: drm FIFO underruns

2014-05-13 Thread Jörg Otte

2014-05-13 15:22 GMT+02:00 Daniel Vetter :
> On Tue, May 13, 2014 at 12:38:41PM +0200, Daniel Vetter wrote:
>> On Tue, May 13, 2014 at 12:29 PM, Jörg Otte  wrote:
>> >>> Branch drm-intel-nightly as of
>> >>> ed60c27 drm-intel-nightly: 2014y-05m-09d-21h-51m-45s integration manifest
>> >>> looks bad:
>> >>>- KDE splash screen on boot-up is not visible
>> >>>- x-windows don't have title and menu bars
>> >>>- KDE system menu is not visible
>> >>>- moving windows around destroys their content
>> >>
>> >> Ugh, that's ugly. Did nothing else change, e.g. the version of
>> >> xf86-video-intel?
>> >
>> >  (II) Loading /usr/lib/xorg/modules/drivers/intel_drv.so
>> >  (II) Module intel: vendor="X.Org Foundation"
>> >  compiled for 1.11.3, module version = 2.17.0
>> >  Module class: X.Org Video Driver
>> >  ABI class: X.Org Video Driver, version 11.0
>>
>> Chris, any ideas? It's an ivybridge apparently.
>>
>> For the fifo underruns I think we've fully confirmed that they only
>> happen on boot-up. I'll try to come up with some ideas what could have
>> gone wrong there.
>
> Please test the below patch.
> -Daniel
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index b10fbde1d5ee..63ced2dee027 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -427,9 +427,6 @@ bool __intel_set_cpu_fifo_underrun_reporting(struct drm_device *dev,
>
> ret = !intel_crtc->cpu_fifo_underrun_disabled;
>
> -   if (enable == ret)
> -   goto done;
> -
> intel_crtc->cpu_fifo_underrun_disabled = !enable;
>
> if (enable && (INTEL_INFO(dev)->gen < 5 || IS_VALLEYVIEW(dev)))
> @@ -441,7 +438,6 @@ bool __intel_set_cpu_fifo_underrun_reporting(struct drm_device *dev,
> else if (IS_GEN8(dev))
> broadwell_set_fifo_underrun_reporting(dev, pipe, enable);
>
> -done:
> return ret;
>  }
>
> --

Doesn't work for me, I still have an underrun at boot-up.

Thanks, Jörg

Re: [Intel-gfx] [3.14.0-rc4] regression: drm FIFO underruns

2014-05-13 Thread Jörg Otte

2014-05-12 21:03 GMT+02:00 Daniel Vetter :
> On Mon, May 12, 2014 at 01:25:24PM +0200, Jörg Otte wrote:
>> 2014-05-11 18:49 GMT+02:00 Daniel Vetter :
>> > On Sat, May 10, 2014 at 10:52 AM, Jörg Otte  wrote:
>> >>> On Fri, May 09, 2014 at 05:14:38PM +0100, Damien Lespiau wrote:
>> >>>> On Fri, May 09, 2014 at 06:11:37PM +0200, Jörg Otte wrote:
>> >>>> > > Jörg, can you please boot with drm.debug=0xe, reproduce the issue and
>> >>>> > > then attach the complete dmesg? Please make sure that the dmesg
>> >>>> > > contains the boot-up stuff too.
>> >>>> > >
>> >>>> > > Thanks, Daniel
>> >>>> > Here it is. I should mention it only happens at boot-up.
>> >>>>
>> >>>> [0.374095] [drm] Wrong MCH_SSKPD value: 0x20100406
>> >>>> [0.374096] [drm] This can cause pipe underruns and display issues.
>> >>>> [0.374097] [drm] Please upgrade your BIOS to fix this.
>> >>>
>> >>> That can be a factor, but I think we may have some more general issue
>> >>> in the modeset sequence which causes these to get reported. I'm getting
>> >>> some on my machine as well where SSKPD looks more sane. Maybe we turn on
>> >>> the error reporting too early or something.
>> >>>
>> >>> But I'm not going to spend time worrying about these before my previous
>> >>> watermark stuff gets merged. Also the underrun reporting code itself
>> >>> would need some kind of rewrite to be really useful.
>> >>>
>> >>> If the display doesn't blank out during use everything is more or less
>> >>> fine and you can ignore these errors. It's quite likely that the
>> >>> errors were always present and you didn't know it. We just made them
>> >>> more prominent recently.
>> >>>
>> >>> --
>> >>> Ville Syrjälä
>> >>> Intel OTC
>> >>
>> >> It comes out on the boot-up screen which is normally clean. So it becomes
>> >> highly visible for anyone.
>> >
>> > To make sure that you're only seeing this at boot-up and not elsewhere,
>> > please check that it doesn't show up when you do anything of the
>> > below:
>> > a) suspend/resume
>> > b) changing the output mode (e.g. with xrandr --mode)
>> > c) changing the output pipe (e.g. with xrandr --crtc)
>> > d) all of the above but with heavy system load, e.g. compile kernels
>> > with make -j 
>> >
>> > Also please test the latest drm-intel-nightly branch from
>> > http://cgit.freedesktop.org/drm-intel to make sure we haven't yet
>> > fixed this in our -next branch.
>> >
>>
>> Ok, that was a lot of homework ;)
>>
>> I checked a,b,d):  All worked without FIFO underruns.
>>
>> For c): I must admit I don't know what --crtc is good for and
>> the man page isn't very useful. I can't enter a meaningful command.
>
> $ xrandr --output  --auto --crtc 0
>
> and
>
> $ xrandr --output  --auto --crtc 1
>
> should do the trick, presuming you only have one output in total. Then
> switch a bit between them.

Works without underruns, with standard and nightly kernel.

>> Branch drm-intel-nightly as of
>> ed60c27 drm-intel-nightly: 2014y-05m-09d-21h-51m-45s integration manifest
>> looks bad:
>>- KDE splash screen on boot-up is not visible
>>- x-windows don't have title and menu bars
>>- KDE system menu is not visible
>>- moving windows around destroys their content
>
> Ugh, that's ugly. Did nothing else change, e.g. the version of
> xf86-video-intel?

 (II) Loading /usr/lib/xorg/modules/drivers/intel_drv.so
 (II) Module intel: vendor="X.Org Foundation"
 compiled for 1.11.3, module version = 2.17.0
 Module class: X.Org Video Driver
 ABI class: X.Org Video Driver, version 11.0

>> apart from that: Via control key I can open a terminal and I checked
>> a,b), both worked without FIFO underruns.
>
> And what about at boot? If -nightly regresses even on that that's pretty
> awful.
Sorry, I forgot to mention this. It behaves like the standard kernel.
The FIFO underrun comes at boot-up and then never again.

>
> Also please test
>
> http://patchwork.freedesktop.org/patch/25568/

For me it doesn't make any difference.

>it might help for your case. If it doesn't we need to look into what
>exactly goes wrong on driver load

Don't know if this matters: I always use kernels without loadable modules.
Everything is built-in.


Thanks, Jörg

Re: [Intel-gfx] [3.14.0-rc4] regression: drm FIFO underruns

2014-05-12 Thread Jörg Otte

2014-05-11 18:49 GMT+02:00 Daniel Vetter :
> On Sat, May 10, 2014 at 10:52 AM, Jörg Otte  wrote:
>>> On Fri, May 09, 2014 at 05:14:38PM +0100, Damien Lespiau wrote:
>>>> On Fri, May 09, 2014 at 06:11:37PM +0200, Jörg Otte wrote:
>>>> > > Jörg, can you please boot with drm.debug=0xe, reproduce the issue and
>>>> > > then attach the complete dmesg? Please make sure that the dmesg
>>>> > > contains the boot-up stuff too.
>>>> > >
>>>> > > Thanks, Daniel
>>>> > Here it is. I should mention it only happens at boot-up.
>>>>
>>>> [0.374095] [drm] Wrong MCH_SSKPD value: 0x20100406
>>>> [0.374096] [drm] This can cause pipe underruns and display issues.
>>>> [0.374097] [drm] Please upgrade your BIOS to fix this.
>>>
>>> That can be a factor, but I think we may have some more general issue
>>> in the modeset sequence which causes these to get reported. I'm getting
>>> some on my machine as well where SSKPD looks more sane. Maybe we turn on
>>> the error reporting too early or something.
>>>
>>> But I'm not going to spend time worrying about these before my previous
>>> watermark stuff gets merged. Also the underrun reporting code itself
>>> would need some kind of rewrite to be really useful.
>>>
>>> If the display doesn't blank out during use everything is more or less
>>> fine and you can ignore these errors. It's quite likely that the
>>> errors were always present and you didn't know it. We just made them
>>> more prominent recently.
>>>
>>> --
>>> Ville Syrjälä
>>> Intel OTC
>>
>> It comes out on the boot-up screen which is normally clean. So it becomes
>> highly visible for anyone.
>
> To make sure that you're only seeing this at boot-up and not elsewhere,
> please check that it doesn't show up when you do anything of the
> below:
> a) suspend/resume
> b) changing the output mode (e.g. with xrandr --mode)
> c) changing the output pipe (e.g. with xrandr --crtc)
> d) all of the above but with heavy system load, e.g. compile kernels
> with make -j 
>
> Also please test the latest drm-intel-nightly branch from
> http://cgit.freedesktop.org/drm-intel to make sure we haven't yet
> fixed this in our -next branch.
>

Ok, that was a lot of homework ;)

I checked a), b) and d): all worked without FIFO underruns.

For c): I must admit I don't know what --crtc is good for and
the man page isn't very useful. I can't enter a meaningful command.

Branch drm-intel-nightly as of
ed60c27 drm-intel-nightly: 2014y-05m-09d-21h-51m-45s integration manifest
looks bad:
   - KDE splash screen on boot-up is not visible
   - X windows don't have title and menu bars
   - KDE system menu is not visible
   - moving windows around destroys their content
Apart from that, using the control key I could open a terminal and check
a) and b); both worked without FIFO underruns.

Thanks, Jörg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
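
For checks b) and c), a rough sketch of how the xrandr calls and the underrun check might fit together. The output name LVDS1, the mode 1024x768 and the CRTC index 1 are placeholders (xrandr --verbose lists the real ones for a given machine), and count_underruns is a hypothetical helper, not part of any existing tool.

```shell
# Hypothetical helper: count "FIFO underrun" lines in kernel log text on stdin.
# grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
count_underruns() {
    grep -c 'FIFO underrun' || true
}

# Possible usage (LVDS1, 1024x768 and CRTC index 1 are placeholders):
#   before=$(dmesg | count_underruns)
#   xrandr --output LVDS1 --mode 1024x768    # b) change the output mode
#   xrandr --output LVDS1 --crtc 1           # c) drive the output from another CRTC
#   after=$(dmesg | count_underruns)
#   [ "$after" -gt "$before" ] && echo "new FIFO underruns logged"
```

Comparing the count before and after each step avoids mistaking the boot-time underruns for new ones.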

Re: [Intel-gfx] [3.14.0-rc4] regression: drm FIFO underruns

2014-05-10 Thread Jörg Otte

2014-05-09 19:03 GMT+02:00 Ville Syrjälä :
> On Fri, May 09, 2014 at 05:14:38PM +0100, Damien Lespiau wrote:
>> On Fri, May 09, 2014 at 06:11:37PM +0200, Jörg Otte wrote:
>> > > Jörg, can you please boot with drm.debug=0xe, reproduce the issue and
>> > > then attach the complete dmesg? Please make sure that the dmesg
>> > > contains the boot-up stuff too.
>> > >
>> > > Thanks, Daniel
>> > Here it is. I should mention it only happens at boot-up.
>>
>> [0.374095] [drm] Wrong MCH_SSKPD value: 0x20100406
>> [0.374096] [drm] This can cause pipe underruns and display issues.
>> [0.374097] [drm] Please upgrade your BIOS to fix this.
>
> That can be a factor, but I think we may have some more general issue
> in the modeset sequence which causes these to get reported. I'm getting
> some on my machine as well where SSKPD looks more sane. Maybe we turn on
> the error reporting too early or something.
>
> But I'm not going to spend time worrying about these before my previous
> watermark stuff gets merged. Also the underrun reporting code itself
> would need some kind of rewrite to be really useful.
>
> If the display doesn't blank out during use everything is more or less
> fine and you can ignore these errors. It's quite likely that the
> errors were always present and you didn't know it. We just made them
> more prominent recently.
>
> --
> Ville Syrjälä
> Intel OTC

It shows up on the boot-up screen, which is normally clean, so it is
highly visible to everyone.

Thanks, Jörg
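
The drm.debug=0xe value requested above is a bitmask. A small sketch of how it decodes, assuming the category bits used by kernels of this period (0x1 core, 0x2 driver, 0x4 KMS, 0x8 PRIME); drm_debug_categories is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper: print the DRM debug categories selected by a mask.
# Assumed bit layout: 0x1 CORE, 0x2 DRIVER, 0x4 KMS, 0x8 PRIME.
drm_debug_categories() {
    mask=$(( $1 ))
    out=""
    if [ $(( mask & 1 )) -ne 0 ]; then out="$out CORE"; fi
    if [ $(( mask & 2 )) -ne 0 ]; then out="$out DRIVER"; fi
    if [ $(( mask & 4 )) -ne 0 ]; then out="$out KMS"; fi
    if [ $(( mask & 8 )) -ne 0 ]; then out="$out PRIME"; fi
    # Strip the leading space left by the first appended name.
    echo "${out# }"
}

# Possible usage:
#   drm_debug_categories 0xe
```

Under these assumptions 0xe selects everything except the core messages, which keeps the boot log readable while still covering the modeset path.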

[3.14.0-rc4] regression: drm FIFO underruns

2014-05-05 Thread Jörg Otte

I still have FIFO underruns in drm:
[drm:ivb_err_int_handler] *ERROR* Pipe B FIFO underrun
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder B FIFO underrun

which I already reported here:
https://lkml.org/lkml/2014/4/9/127

and which is still unanswered!

I tried to bisect the thing, but I ran into compile errors. So I
can only say it came in with
e9f37d3 "Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux"

Jörg
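
When a bisection step lands on a commit that does not compile, git bisect can be told to skip it instead of giving up. A minimal sketch, assuming a script-driven bisect; bisect_verdict and its two command arguments are hypothetical. Since this symptom only appears at boot-up, only the build check can really be automated here and the good/bad decision stays manual.

```shell
# Hypothetical helper in the style of a "git bisect run" script.
# Exit status convention of git bisect run: 0 = good, 1 = bad, 125 = skip.
bisect_verdict() {
    build_cmd=$1
    test_cmd=$2
    # A commit that does not build cannot be judged: ask for a skip.
    if ! sh -c "$build_cmd" >/dev/null 2>&1; then
        return 125
    fi
    # The build worked: let the test command decide between good and bad.
    if sh -c "$test_cmd" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# Manual equivalent, using the kernels named in the earlier report:
#   git bisect start
#   git bisect bad 75ff24f
#   git bisect good a7963eb
#   git bisect skip     # whenever the checked-out commit fails to compile
```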

Re: [ 3.14.0-12041-g75ff24f ] regression: drm warning

2014-04-14 Thread Jörg Otte

2014-04-09 12:08 GMT+02:00 Jörg Otte :
> Kernel 3.14.0-12041-g75ff24f from 9.4.2014 introduces the following
> on the console (driver is i915):
>
> [drm:ivb_err_int_handler] *ERROR* Pipe B FIFO underrun
> [drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
> [drm:cpt_serr_int_handler] *ERROR* PCH transcoder B FIFO underrun
>
> In syslog I find:
>
> [ cut here ]
> WARNING: CPU: 1 PID: 1 at
> /data/kernel/linux/drivers/gpu/drm/drm_mm.c:211 0xb22a70fa()
> no hole found for node 0x0 + 0x30
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.14.0-12041-g75ff24f #45
> Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
>   b2967138 b262ae5d 8802160d5a58
>  b20391ad 880214994000 0030 
>  880214952800  b2039255 b2967190
> Call Trace:
>  [] ? 0xb262ae5d
>  [] ? 0xb20391ad
>  [] ? 0xb2039255
>  [] ? 0xb22a70fa
>  [] ? 0xb22d4b11
>  [] ? 0xb2300f92
>  [] ? 0xb232af7b
>  [] ? 0xb232f675
>  [] ? 0xb21fa38c
>  [] ? 0xb232ec04
>  [] ? 0xb2622997
>  [] ? 0xb232ee62
>  [] ? 0xb22a6a5b
>  [] ? 0xb22a3994
>  [] ? 0xb22a5a38
>  [] ? 0xb221a019
>  [] ? 0xb23327b8
>  [] ? 0xb2332a03
>  [] ? 0xb2332970
>  [] ? 0xb2330c45
>  [] ? 0xb2331534
>  [] ? 0xb2332fce
>  [] ? 0xb2aa2cc5
>  [] ? 0xb2a7fe45
>  [] ? 0xb2052100
>  [] ? 0xb2a7ffe2
>  [] ? 0xb2a7f79e
>  [] ? 0xb2624d20
>  [] ? 0xb2624d29
>  [] ? 0xb263223c
>  [] ? 0xb2624d20
> ---[ end trace 0d3f14b61bc31dab ]---
>
> This does not happen in 3.14.0-11030-ga7963eb from 8.4.2014, nor in any
> earlier kernel.
>
> I suppose the merge
> 12024   e9f37d3 Merge branch 'drm-next' of
> git://people.freedesktop.org/~airlied/linux
> introduced this problem.
>

CCing more people.

Jörg

[ 3.14.0-12041-g75ff24f ] regression: drm warning

2014-04-09 Thread Jörg Otte

Kernel 3.14.0-12041-g75ff24f from 9.4.2014 introduces the following
on the console (driver is i915):

[drm:ivb_err_int_handler] *ERROR* Pipe B FIFO underrun
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder A FIFO underrun
[drm:cpt_serr_int_handler] *ERROR* PCH transcoder B FIFO underrun

In syslog I find:

[ cut here ]
WARNING: CPU: 1 PID: 1 at
/data/kernel/linux/drivers/gpu/drm/drm_mm.c:211 0xb22a70fa()
no hole found for node 0x0 + 0x30
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.14.0-12041-g75ff24f #45
Hardware name: FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012
  b2967138 b262ae5d 8802160d5a58
 b20391ad 880214994000 0030 
 880214952800  b2039255 b2967190
Call Trace:
 [] ? 0xb262ae5d
 [] ? 0xb20391ad
 [] ? 0xb2039255
 [] ? 0xb22a70fa
 [] ? 0xb22d4b11
 [] ? 0xb2300f92
 [] ? 0xb232af7b
 [] ? 0xb232f675
 [] ? 0xb21fa38c
 [] ? 0xb232ec04
 [] ? 0xb2622997
 [] ? 0xb232ee62
 [] ? 0xb22a6a5b
 [] ? 0xb22a3994
 [] ? 0xb22a5a38
 [] ? 0xb221a019
 [] ? 0xb23327b8
 [] ? 0xb2332a03
 [] ? 0xb2332970
 [] ? 0xb2330c45
 [] ? 0xb2331534
 [] ? 0xb2332fce
 [] ? 0xb2aa2cc5
 [] ? 0xb2a7fe45
 [] ? 0xb2052100
 [] ? 0xb2a7ffe2
 [] ? 0xb2a7f79e
 [] ? 0xb2624d20
 [] ? 0xb2624d29
 [] ? 0xb263223c
 [] ? 0xb2624d20
---[ end trace 0d3f14b61bc31dab ]---

This does not happen in 3.14.0-11030-ga7963eb from 8.4.2014, nor in any
earlier kernel.

I suppose the merge
12024   e9f37d3 Merge branch 'drm-next' of
git://people.freedesktop.org/~airlied/linux
introduced this problem.

Jörg

Re: [regression] linux-3.14.0-rc5-.. kernel does not switch power off

2014-03-09 Thread Jörg Otte

2014-03-09 18:15 GMT+01:00 Rafael J. Wysocki :
> On Sunday, March 09, 2014 05:58:59 PM Rafael J. Wysocki wrote:
>> On Sunday, March 09, 2014 05:55:28 PM Rafael J. Wysocki wrote:
>> > On Sunday, March 09, 2014 05:12:56 PM Jörg Otte wrote:
>> > > On shutdown power is not switched off. The harddisk is already down.
>> > > Reboot is working.
>> > >
>> > > Last known good kernel is: 3.14.0-rc5-00265-gb01d4e6
>> > > First known bad kernel is: 3.14.0-rc5-00287-gca62eec
>> >
>> > Can you please tell me which commits those two kernels correspond to?
>>
>> And please post a boot log from the failing one.
>
> Never mind, I know what the problem is, commit 3130497f5bab (ACPI / sleep:
> pm_power_off needs more sanity checks to be installed) has to be reverted.
>
> Sorry about the breakage and I'll send a revert patch shortly.
>
> Rafael
>

Works for me, thanks!
Jörg

[regression] linux-3.14.0-rc5-.. kernel does not switch power off

2014-03-09 Thread Jörg Otte

On shutdown power is not switched off. The harddisk is already down.
Reboot is working.

Last known good kernel is: 3.14.0-rc5-00265-gb01d4e6
First known bad kernel is: 3.14.0-rc5-00287-gca62eec

FUJITSU LIFEBOOK AH532/FJNBB1C, BIOS Version 1.09 05/22/2012

Jörg
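
The two version strings above already carry the commits they were built from: the kernel build appends the number of commits since the last tag and "-g" plus the abbreviated commit hash. A small sketch of extracting it; commit_from_version is a hypothetical helper name.

```shell
# Hypothetical helper: print the abbreviated commit hash embedded in a
# kernel version string of the form <tag>-<count>-g<hash>.
commit_from_version() {
    case $1 in
        *-g*) printf '%s\n' "${1##*-g}" ;;   # drop everything up to the last "-g"
        *)    return 1 ;;                    # no hash suffix present
    esac
}

# Possible usage:
#   good=$(commit_from_version 3.14.0-rc5-00265-gb01d4e6)
#   bad=$(commit_from_version 3.14.0-rc5-00287-gca62eec)
#   git log --oneline "$good..$bad"
```

The counters 00265 and 00287 suggest roughly 22 commits between the two kernels, a short enough range to read through or bisect.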

Re: [PATCH] ACPI: reduce log level for message "ACPI: \_PR_.CPU4: failed to get CPU APIC ID"

2014-01-27 Thread Jörg Otte

2014-01-27 Jiang Liu :
> Commit b981513f806d (ACPI / scan: bail out early if failed to parse
> APIC ID for CPU) emits an error message if ACPI processor driver fails
> to query APIC ID for the CPU.
>
> Originally it's designed to catch BIOS bugs for CPU hot-addition. But
> it accidentally reveals another type of BIOS bug that:
> 1) BIOS implements ACPI objects for all possible instead of present
>    CPUs. (It's legal per ACPI specification.)
> 2) BIOS doesn't implement _STA method for CPU objects. OSPM assumes
>    that all CPU objects are present and functioning and binds ACPI
>    processor driver to those CPU objects, which then triggers the error
>    message. According to ACPI spec, BIOS should implement _STA method
>    for those absent CPUs at least.
>
> Though it's a BIOS bug in essence, there are some BIOSes in the field
> which are implemented in this way. So reduce the log level from ERR to
> DEBUG to accommodate these existing BIOSes.
>
> Fixes: b981513f806d (ACPI / scan: bail out early if failed to parse APIC ID for CPU)
> Signed-off-by: Jiang Liu 
> ---
>  drivers/acpi/acpi_processor.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index c9311be..c29c2c3 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -261,7 +261,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
>
>         apic_id = acpi_get_apicid(pr->handle, device_declaration, pr->acpi_id);
>         if (apic_id < 0) {
> -               acpi_handle_err(pr->handle, "failed to get CPU APIC ID.\n");
> +               acpi_handle_debug(pr->handle, "failed to get CPU APIC ID.\n");
>                 return -ENODEV;
>         }
>         pr->apic_id = apic_id;
> --
> 1.7.10.4
>

Thanks, the patch works for me.

Jörg

Re: [kernel 3.13.0-06058-g2d08cd0] Strange ACPI error messages

2014-01-25 Thread Jörg Otte

2014-01-25 Rafael J. Wysocki :
> On Saturday, January 25, 2014 11:03:09 AM Jörg Otte wrote:
>> Kernel 3.13.0-06058-g2d08cd0 displays the following errors on the console:
>>
>> ACPI: \_PR_.CPU4: failed to get CPU APIC ID.
>> ACPI: \_PR_.CPU5: failed to get CPU APIC ID.
>> ACPI: \_PR_.CPU6: failed to get CPU APIC ID.
>> ACPI: \_PR_.CPU7: failed to get CPU APIC ID.
>>
>> I don't have CPUs 4..7! Error messages regarding nonexistent
>> CPUs should not be displayed.
>> Kernel 3.13.0-05617-g3aacd62 and earlier didn't show these messages.
>>
>> This is a regression. I never saw these messages before.
>
> Does your system show any other suspicious symptoms or are you worried about
> those messages only?
>
> Rafael
>

What looks bad is this:

ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_]
(20131218/hwxface-580)
ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_]
(20131218/hwxface-580)

and this:

acpi PNP0A08:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
 \_SB_.PCI0:_OSC invalid UUID
 _OSC request data:1 1e 0
acpi PNP0A08:00: _OSC failed (AE_ERROR); disabling ASPM

but these messages are not new.

Jörg

[kernel 3.13.0-06058-g2d08cd0] Strange ACPI error messages

2014-01-25 Thread Jörg Otte

Kernel 3.13.0-06058-g2d08cd0 displays the following errors on the console:

ACPI: \_PR_.CPU4: failed to get CPU APIC ID.
ACPI: \_PR_.CPU5: failed to get CPU APIC ID.
ACPI: \_PR_.CPU6: failed to get CPU APIC ID.
ACPI: \_PR_.CPU7: failed to get CPU APIC ID.

I don't have CPUs 4..7! Error messages regarding nonexistent
CPUs should not be displayed.
Kernel 3.13.0-05617-g3aacd62 and earlier didn't show these messages.

This is a regression. I never saw these messages before.

Processor model: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
$ cat /proc/cpuinfo | grep processor
processor   : 0
processor   : 1
processor   : 2
processor   : 3


Jörg
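
The comparison made above by eye (four processors in /proc/cpuinfo versus errors for CPU4 to CPU7) can be sketched as a filter over the log. This assumes the ACPI object names are numbered in decimal from 0 in step with the logical processors, which a BIOS is not obliged to do; phantom_cpus is a hypothetical helper name.

```shell
# Hypothetical helper: read kernel log text on stdin and print the ACPI CPU
# objects from "failed to get CPU APIC ID" lines whose number is at or
# beyond the processor count given as $1.
phantom_cpus() {
    ncpu=$1
    sed -n 's/.*CPU\([0-9][0-9]*\): failed to get CPU APIC ID.*/\1/p' |
    while read -r n; do
        if [ "$n" -ge "$ncpu" ]; then
            echo "CPU$n"
        fi
    done
}

# Possible usage:
#   dmesg | phantom_cpus "$(grep -c '^processor' /proc/cpuinfo)"
```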
