Re: sched/fair: Hard lockup from idle_balance()/task_numa_migrate() race

2017-11-07 Thread Ed Swierk
On Mon, Oct 16, 2017 at 4:11 PM, Ed Swierk  wrote:
> To recap: a dual-socket Xeon (E5 v4) server system had been running a
> bunch of KVM workloads just fine for over 6 weeks. Suddenly hard
> lockups occurred on cpu 13 in task_numa_migrate(), and cpu 0 in
> idle_balance(). That condition persisted until the system was rebooted
> after a couple of hours, and meanwhile all manner of soft lockups and
> even SATA errors appeared.
> 
> [3851435.62] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
> [3851435.784582] Modules linked in: vhost_net tun vhost macvtap 8021q garp 
> mrp drbg ansi_cprng dm_crypt algif_skcipher af_alg ip_gre ip_tunnel gre 
> macvlan ebtable_filter ebtables openvswitch nf_defrag_ipv6 libcrc32c nfsd 
> auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ip6table_filter ip6_tables 
> nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp nf_conntrack_ipv4 
> nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables 
> iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp 
> kvm_intel kvm irqbypass crct10dif_pclmul raid10 crc32_pclmul aesni_intel 
> aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac lpc_ich 
> mfd_core mei_me mei i2c_i801 ioatdma ipmi_ssif ipmi_msghandler acpi_pad 
> acpi_cpufreq squashfs lz4_decompress xhci_pci xhci_hcd ixgbe mdio vxlan 
> ip6_udp_tunnel udp_tunnel ptp pps_core dca ehci_pci ehci_hcd usbcore ahci 
> libahci usb_common tpm_tis
> [3851435.874884] irq event stamp: 3828060294
> [3851435.879398] hardirqs last  enabled at (3828060293): [] 
> 810db86c
> [3851435.10] hardirqs last disabled at (3828060294): [] 
> 8170ff74
> [3851435.898254] softirqs last  enabled at (3828060156): [] 
> 8107f31e
> [3851435.907666] softirqs last disabled at (3828060139): [] 
> 8107f648
> [3851435.917094] CPU: 13 PID: 2146 Comm: pocviexg Not tainted 4.4.52-grsec #1
> [3851435.925252] Hardware name: Intel S2600WTTR, BIOS 
> SE5C610.86B.01.01.0016.C1.033120161139 03/31/2016
> [3851435.937473] task: 881c9ff35c00 ti: 881c9ff372b0 task.ti: 
> 881c9ff372b0
> [3851435.946104] RIP: 0010:[]  [] 
> try_to_wake_up+0xe4/0x500
> [3851435.955614] RSP: :c90016beba40  EFLAGS: 0002
> [3851435.961780] RAX:  RBX: 88102cc8e3d8 RCX: 
> 0001
> [3851435.970016] RDX: 0001 RSI: 01e4 RDI: 
> 
> [3851435.978259] RBP: c90016beba80 R08:  R09: 
> 0001
> [3851435.986493] R10: 00e0 R11: 881c9ff35c00 R12: 
> 0082
> [3851435.994727] R13: 88102cc8dc00 R14: 001d3f00 R15: 
> 81f25060
> [3851436.002962] FS:  6e79227fc700() GS:88203400() 
> knlGS:
> [3851436.012266] CS:  0010 DS:  ES:  CR0: 80050033
> [3851436.018920] CR2: 6e79227fba38 CR3: 00201d2fe000 CR4: 
> 003626f0
> [3851436.027155] DR0:  DR1:  DR2: 
> 
> [3851436.035391] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [3851436.043626] Stack:
> [3851436.046098]  8820341ce620 8820341ce608  
> 8820341ce600
> [3851436.054695]  c90016bebb48 000d  
> 881033fce600
> [3851436.063259]  c90016beba90 810adc80 c90016bebbe8 
> 8112c830
> [3851436.071868] Call Trace:
> [3851436.074862]  [] wake_up_process+0x10/0x20  
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1997
> [3851436.081317]  [] stop_two_cpus+0x1b0/0x280  
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/stop_machine.c#L235
> [3851436.087778]  [] ? cpu_stop_park+0x40/0x40
> [3851436.094237]  [] ? cpu_stop_park+0x40/0x40
> [3851436.100715]  [] ? __migrate_swap_task.part.94+0x70/0x70
> [3851436.108539]  [] migrate_swap+0xf1/0x1f0
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1408
> [3851436.114800]  [] task_numa_migrate+0x345/0xb60  
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1572
> [3851436.121656]  [] ? futex_wait_queue_me+0xf3/0x160
> [3851436.128801]  [] numa_migrate_preferred+0x73/0x80   
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1597
> [3851436.135936]  [] task_numa_fault+0x7a4/0xee0
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L2137
> [3851436.142604]  [] ? should_numa_migrate_memory+0x50/0x130
> [3851436.150437]  [] handle_mm_fault+0xfd6/0x1e10   
>   https:/

Re: sched/fair: Hard lockup from idle_balance()/task_numa_migrate() race

2017-11-03 Thread Ed Swierk
On Fri, Nov 3, 2017 at 8:03 AM, Linus Torvalds
 wrote:
> I suspect that you'd get more responses if it was an upstream kernel
> (and preferably newer), or even a distro one.
>
> If you run the grsec patches, you need to go to grsec to get support.
> We don't even know what they are doing, since they're hiding their
> patches and not breaking them out.

I understand--we are also not keen on the way grsec distributes their
patches or the timeliness with which they follow stable/LTS kernel
releases.

In this case, I can't conceive of a way to reproduce the problem with
an upstream kernel or any other, yet to my limited understanding of
the evidence it appears there may indeed be a real problem lurking in
there.

I will follow up with the grsec folks.

> On Thu, Nov 2, 2017 at 5:51 PM, Ed Swierk  wrote:
>> Ping?
>>
>> On Wed, Oct 25, 2017 at 9:35 PM, Ed Swierk  
>> wrote:
>>>
>>> Ping?
>>>
>>> On Mon, Oct 16, 2017 at 4:11 PM, Ed Swierk  
>>> wrote:
>>> >
>>> > Ping for Peter, Ingo and other sched maintainers:
>>> >
>>> > I'd appreciate any feedback on this hard lockup issue, which occurred
>>> > on a system running kernel 4.4.52-grsec.

Thanks,
--Ed


Re: sched/fair: Hard lockup from idle_balance()/task_numa_migrate() race

2017-11-02 Thread Ed Swierk
Ping?

On Wed, Oct 25, 2017 at 9:35 PM, Ed Swierk  wrote:
> 
> Ping?
> 
> On Mon, Oct 16, 2017 at 4:11 PM, Ed Swierk  wrote:
> > 
> > Ping for Peter, Ingo and other sched maintainers:
> > 
> > I'd appreciate any feedback on this hard lockup issue, which occurred
> > on a system running kernel 4.4.52-grsec.
> > 
> > To recap: a dual-socket Xeon (E5 v4) server system had been running a
> > bunch of KVM workloads just fine for over 6 weeks. Suddenly hard
> > lockups occurred on cpu 13 in task_numa_migrate(), and cpu 0 in
> > idle_balance(). That condition persisted until the system was rebooted
> > after a couple of hours, and meanwhile all manner of soft lockups and
> > even SATA errors appeared.
> > 
> > [3851435.62] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
> > [3851435.784582] Modules linked in: vhost_net tun vhost macvtap 8021q garp 
> > mrp drbg ansi_cprng dm_crypt algif_skcipher af_alg ip_gre ip_tunnel gre 
> > macvlan ebtable_filter ebtables openvswitch nf_defrag_ipv6 libcrc32c nfsd 
> > auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ip6table_filter 
> > ip6_tables nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp 
> > nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter 
> > ip_tables x_tables iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal 
> > intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul raid10 
> > crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper 
> > cryptd sb_edac lpc_ich mfd_core mei_me mei i2c_i801 ioatdma ipmi_ssif 
> > ipmi_msghandler acpi_pad acpi_cpufreq squashfs lz4_decompress xhci_pci 
> > xhci_hcd ixgbe mdio vxlan ip6_udp_tunnel udp_tunnel ptp pps_core dca 
> > ehci_pci ehci_hcd usbcore ahci libahci usb_common tpm_tis
> > [3851435.874884] irq event stamp: 3828060294
> > [3851435.879398] hardirqs last  enabled at (3828060293): 
> > [] 810db86c
> > [3851435.10] hardirqs last disabled at (3828060294): 
> > [] 8170ff74
> > [3851435.898254] softirqs last  enabled at (3828060156): 
> > [] 8107f31e
> > [3851435.907666] softirqs last disabled at (3828060139): 
> > [] 8107f648
> > [3851435.917094] CPU: 13 PID: 2146 Comm: pocviexg Not tainted 4.4.52-grsec 
> > #1
> > [3851435.925252] Hardware name: Intel S2600WTTR, BIOS 
> > SE5C610.86B.01.01.0016.C1.033120161139 03/31/2016
> > [3851435.937473] task: 881c9ff35c00 ti: 881c9ff372b0 task.ti: 
> > 881c9ff372b0
> > [3851435.946104] RIP: 0010:[]  [] 
> > try_to_wake_up+0xe4/0x500
> > [3851435.955614] RSP: :c90016beba40  EFLAGS: 0002
> > [3851435.961780] RAX:  RBX: 88102cc8e3d8 RCX: 
> > 0001
> > [3851435.970016] RDX: 0001 RSI: 01e4 RDI: 
> > 
> > [3851435.978259] RBP: c90016beba80 R08:  R09: 
> > 0001
> > [3851435.986493] R10: 00e0 R11: 881c9ff35c00 R12: 
> > 0082
> > [3851435.994727] R13: 88102cc8dc00 R14: 001d3f00 R15: 
> > 81f25060
> > [3851436.002962] FS:  6e79227fc700() GS:88203400() 
> > knlGS:
> > [3851436.012266] CS:  0010 DS:  ES:  CR0: 80050033
> > [3851436.018920] CR2: 6e79227fba38 CR3: 00201d2fe000 CR4: 
> > 003626f0
> > [3851436.027155] DR0:  DR1:  DR2: 
> > 
> > [3851436.035391] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [3851436.043626] Stack:
> > [3851436.046098]  8820341ce620 8820341ce608  
> > 8820341ce600
> > [3851436.054695]  c90016bebb48 000d  
> > 881033fce600
> > [3851436.063259]  c90016beba90 810adc80 c90016bebbe8 
> > 8112c830
> > [3851436.071868] Call Trace:
> > [3851436.074862]  [] wake_up_process+0x10/0x20
> > 
> > https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1997
> > [3851436.081317]  [] stop_two_cpus+0x1b0/0x280
> > 
> > https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/stop_machine.c#L235
> > [3851436.087778]  [] ? cpu_stop_park+0x40/0x40
> > [3851436.094237]  [] ? cpu_stop_park+0x40/0x40
> > [3851436.100715]  [] ? 
> > __migrate_swap_task.part.94+0x70/0x70
> > [3851436.108539]  [] migrate_swap+0xf1/0x1f0  
> > 
> > https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kern

Re: sched/fair: Hard lockup from idle_balance()/task_numa_migrate() race

2017-10-25 Thread Ed Swierk
Ping?

On Mon, Oct 16, 2017 at 4:11 PM, Ed Swierk  wrote:
> 
> Ping for Peter, Ingo and other sched maintainers:
> 
> I'd appreciate any feedback on this hard lockup issue, which occurred
> on a system running kernel 4.4.52-grsec.
> 
> To recap: a dual-socket Xeon (E5 v4) server system had been running a
> bunch of KVM workloads just fine for over 6 weeks. Suddenly hard
> lockups occurred on cpu 13 in task_numa_migrate(), and cpu 0 in
> idle_balance(). That condition persisted until the system was rebooted
> after a couple of hours, and meanwhile all manner of soft lockups and
> even SATA errors appeared.
> 
> [3851435.62] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
> [3851435.784582] Modules linked in: vhost_net tun vhost macvtap 8021q garp 
> mrp drbg ansi_cprng dm_crypt algif_skcipher af_alg ip_gre ip_tunnel gre 
> macvlan ebtable_filter ebtables openvswitch nf_defrag_ipv6 libcrc32c nfsd 
> auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ip6table_filter ip6_tables 
> nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_tcpudp nf_conntrack_ipv4 
> nf_defrag_ipv4 xt_conntrack nf_conntrack iptable_filter ip_tables x_tables 
> iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp 
> kvm_intel kvm irqbypass crct10dif_pclmul raid10 crc32_pclmul aesni_intel 
> aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac lpc_ich 
> mfd_core mei_me mei i2c_i801 ioatdma ipmi_ssif ipmi_msghandler acpi_pad 
> acpi_cpufreq squashfs lz4_decompress xhci_pci xhci_hcd ixgbe mdio vxlan 
> ip6_udp_tunnel udp_tunnel ptp pps_core dca ehci_pci ehci_hcd usbcore ahci 
> libahci usb_common tpm_tis
> [3851435.874884] irq event stamp: 3828060294
> [3851435.879398] hardirqs last  enabled at (3828060293): [] 
> 810db86c
> [3851435.10] hardirqs last disabled at (3828060294): [] 
> 8170ff74
> [3851435.898254] softirqs last  enabled at (3828060156): [] 
> 8107f31e
> [3851435.907666] softirqs last disabled at (3828060139): [] 
> 8107f648
> [3851435.917094] CPU: 13 PID: 2146 Comm: pocviexg Not tainted 4.4.52-grsec #1
> [3851435.925252] Hardware name: Intel S2600WTTR, BIOS 
> SE5C610.86B.01.01.0016.C1.033120161139 03/31/2016
> [3851435.937473] task: 881c9ff35c00 ti: 881c9ff372b0 task.ti: 
> 881c9ff372b0
> [3851435.946104] RIP: 0010:[]  [] 
> try_to_wake_up+0xe4/0x500
> [3851435.955614] RSP: :c90016beba40  EFLAGS: 0002
> [3851435.961780] RAX:  RBX: 88102cc8e3d8 RCX: 
> 0001
> [3851435.970016] RDX: 0001 RSI: 01e4 RDI: 
> 
> [3851435.978259] RBP: c90016beba80 R08:  R09: 
> 0001
> [3851435.986493] R10: 00e0 R11: 881c9ff35c00 R12: 
> 0082
> [3851435.994727] R13: 88102cc8dc00 R14: 001d3f00 R15: 
> 81f25060
> [3851436.002962] FS:  6e79227fc700() GS:88203400() 
> knlGS:
> [3851436.012266] CS:  0010 DS:  ES:  CR0: 80050033
> [3851436.018920] CR2: 6e79227fba38 CR3: 00201d2fe000 CR4: 
> 003626f0
> [3851436.027155] DR0:  DR1:  DR2: 
> 
> [3851436.035391] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [3851436.043626] Stack:
> [3851436.046098]  8820341ce620 8820341ce608  
> 8820341ce600
> [3851436.054695]  c90016bebb48 000d  
> 881033fce600
> [3851436.063259]  c90016beba90 810adc80 c90016bebbe8 
> 8112c830
> [3851436.071868] Call Trace:
> [3851436.074862]  [] wake_up_process+0x10/0x20  
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1997
> [3851436.081317]  [] stop_two_cpus+0x1b0/0x280  
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/stop_machine.c#L235
> [3851436.087778]  [] ? cpu_stop_park+0x40/0x40
> [3851436.094237]  [] ? cpu_stop_park+0x40/0x40
> [3851436.100715]  [] ? __migrate_swap_task.part.94+0x70/0x70
> [3851436.108539]  [] migrate_swap+0xf1/0x1f0
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1408
> [3851436.114800]  [] task_numa_migrate+0x345/0xb60  
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1572
> [3851436.121656]  [] ? futex_wait_queue_me+0xf3/0x160
> [3851436.128801]  [] numa_migrate_preferred+0x73/0x80   
>   
> https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1597
> [3851436.135936]  [] task_numa_fault+0x7a4/0xee0
>   
> https://github.com/skyp

sched/fair: Hard lockup from idle_balance()/task_numa_migrate() race

2017-10-16 Thread Ed Swierk
Ping for Peter, Ingo and other sched maintainers:

I'd appreciate any feedback on this hard lockup issue, which occurred
on a system running kernel 4.4.52-grsec.

To recap: a dual-socket Xeon (E5 v4) server system had been running a
bunch of KVM workloads just fine for over 6 weeks. Suddenly hard
lockups occurred on cpu 13 in task_numa_migrate(), and cpu 0 in
idle_balance(). That condition persisted until the system was rebooted
after a couple of hours, and meanwhile all manner of soft lockups and
even SATA errors appeared.

[3851435.62] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
[3851435.784582] Modules linked in: vhost_net tun vhost macvtap 8021q garp mrp 
drbg ansi_cprng dm_crypt algif_skcipher af_alg ip_gre ip_tunnel gre macvlan 
ebtable_filter ebtables openvswitch nf_defrag_ipv6 libcrc32c nfsd auth_rpcgss 
nfs_acl nfs lockd grace fscache sunrpc ip6table_filter ip6_tables nf_log_ipv4 
nf_log_common xt_LOG xt_limit xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 
xt_conntrack nf_conntrack iptable_filter ip_tables x_tables iTCO_wdt 
iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel 
kvm irqbypass crct10dif_pclmul raid10 crc32_pclmul aesni_intel aes_x86_64 lrw 
gf128mul glue_helper ablk_helper cryptd sb_edac lpc_ich mfd_core mei_me mei 
i2c_i801 ioatdma ipmi_ssif ipmi_msghandler acpi_pad acpi_cpufreq squashfs 
lz4_decompress xhci_pci xhci_hcd ixgbe mdio vxlan ip6_udp_tunnel udp_tunnel ptp 
pps_core dca ehci_pci ehci_hcd usbcore ahci libahci usb_common tpm_tis
[3851435.874884] irq event stamp: 3828060294
[3851435.879398] hardirqs last  enabled at (3828060293): [] 
810db86c
[3851435.10] hardirqs last disabled at (3828060294): [] 
8170ff74
[3851435.898254] softirqs last  enabled at (3828060156): [] 
8107f31e
[3851435.907666] softirqs last disabled at (3828060139): [] 
8107f648
[3851435.917094] CPU: 13 PID: 2146 Comm: pocviexg Not tainted 4.4.52-grsec #1
[3851435.925252] Hardware name: Intel S2600WTTR, BIOS 
SE5C610.86B.01.01.0016.C1.033120161139 03/31/2016
[3851435.937473] task: 881c9ff35c00 ti: 881c9ff372b0 task.ti: 
881c9ff372b0
[3851435.946104] RIP: 0010:[]  [] 
try_to_wake_up+0xe4/0x500
[3851435.955614] RSP: :c90016beba40  EFLAGS: 0002
[3851435.961780] RAX:  RBX: 88102cc8e3d8 RCX: 
0001
[3851435.970016] RDX: 0001 RSI: 01e4 RDI: 

[3851435.978259] RBP: c90016beba80 R08:  R09: 
0001
[3851435.986493] R10: 00e0 R11: 881c9ff35c00 R12: 
0082
[3851435.994727] R13: 88102cc8dc00 R14: 001d3f00 R15: 
81f25060
[3851436.002962] FS:  6e79227fc700() GS:88203400() 
knlGS:
[3851436.012266] CS:  0010 DS:  ES:  CR0: 80050033
[3851436.018920] CR2: 6e79227fba38 CR3: 00201d2fe000 CR4: 
003626f0
[3851436.027155] DR0:  DR1:  DR2: 

[3851436.035391] DR3:  DR6: fffe0ff0 DR7: 
0400
[3851436.043626] Stack:
[3851436.046098]  8820341ce620 8820341ce608  
8820341ce600
[3851436.054695]  c90016bebb48 000d  
881033fce600
[3851436.063259]  c90016beba90 810adc80 c90016bebbe8 
8112c830
[3851436.071868] Call Trace:
[3851436.074862]  [] wake_up_process+0x10/0x20
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1997
[3851436.081317]  [] stop_two_cpus+0x1b0/0x280
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/stop_machine.c#L235
[3851436.087778]  [] ? cpu_stop_park+0x40/0x40
[3851436.094237]  [] ? cpu_stop_park+0x40/0x40
[3851436.100715]  [] ? __migrate_swap_task.part.94+0x70/0x70
[3851436.108539]  [] migrate_swap+0xf1/0x1f0  
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1408
[3851436.114800]  [] task_numa_migrate+0x345/0xb60
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1572
[3851436.121656]  [] ? futex_wait_queue_me+0xf3/0x160
[3851436.128801]  [] numa_migrate_preferred+0x73/0x80 
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1597
[3851436.135936]  [] task_numa_fault+0x7a4/0xee0  
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L2137
[3851436.142604]  [] ? should_numa_migrate_memory+0x50/0x130
[3851436.150437]  [] handle_mm_fault+0xfd6/0x1e10 
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/mm/memory.c#L3503
[3851436.157189]  [] ? handle_mm_fault+0x48/0x1e10
[3851436.164051]  [] ? ___might_sleep+0x1ea/0x220
[3851436.170804]  [] __do_page_fault+0x1bf/0x460
[3851436.177475]  [] ? sys_futex+0x74/0x230
[3851436.183642]  [] do_page_fault+0x2c/0x40
[3851436.189909]  [] page_fault+0x28/0x30
[3851

Re: Which came first, hard kernel lockup or SATA errors?

2017-10-10 Thread Ed Swierk
Continuing the conversation with the voices in my head...

On Mon, Oct 9, 2017 at 10:45 PM, Ed Swierk  wrote:
> Based on the addresses in the stack and registers, here's what I think
> happened.
> 
> On cpu 13:
> 
> - task_numa_fault() calls task_numa_migrate(), which selects the task
>   on cpu 0 as the dst_task.
> - migrate_swap() calls stop_two_cpus(), which acquires the cpu_stopper
>   locks for the dst_cpu (cpu 0, at 0x881033fce600) and src_cpu
>   (cpu X, at 0x8820341ce600).
> - stop_two_cpus() calls wake_up_process() on the lower-numbered cpu
>   first, which has to be cpu 0.
> - wake_up_process() spins until the cpu 0 task (at 0x88102cc8dc00)
>   is no longer on_cpu.
> 
> On cpu 0:
> 
> - pick_next_task_fair() calls idle_balance(). According to the "This
>   is OK" comment, current is on_cpu at this point.
> - idle_balance() calls load_balance() for dst_cpu 0.
> - load_balance() decides to move a task from cpu X, so calls
>   stop_one_cpu_nowait() on cpu X.
> - stop_one_cpu_nowait() spins trying to acquire the cpu_stopper lock
>   for cpu X (at 0x8820341ce600).
> 
> So idle_balance() on cpu 0 is stuck waiting for task_numa_fault() to
> move a task to cpu 0, which is blocked on idle_balance() completing.

Also, it appears that task_numa_fault() tries to migrate current, so
the src_cpu X used by task_numa_migrate() is cpu 13 in this
case. Though the key issue is that both task_numa_migrate() and
idle_balance() are trying to stop the same cpu, regardless of whether
it's the cpu task_numa_migrate() is running on.

So I'm wondering how this situation could be prevented.

Can task_numa_migrate() avoid picking a dst_task that might itself
try to stop either src_cpu or dst_cpu?

Or, can load_balance() avoid a cpu that might be stopped for migration
(or any other reason), or detect such a conflict and bail out rather
than spinning forever?

--Ed


Re: Which came first, hard kernel lockup or SATA errors?

2017-10-09 Thread Ed Swierk
On Fri, Oct 6, 2017 at 6:25 PM, Ed Swierk  wrote:
> I'm trying to untangle a series of problems that suddenly occurred on
> a dual-socket Xeon server system that had been running a bunch of KVM
> workloads just fine for over 6 weeks (4.4.52-grsec kernel,
> Debian-derived userspace).

I think I've narrowed the hard lockup to a race between
task_numa_migrate() and idle_balance().

Here are the two lockup messages again, this time with approximate
code references.

[3851435.62] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
[3851435.784582] Modules linked in: vhost_net tun vhost macvtap 8021q garp mrp 
drbg ansi_cprng dm_crypt algif_skcipher af_alg ip_gre ip_tunnel gre macvlan 
ebtable_filter ebtables openvswitch nf_defrag_ipv6 libcrc32c nfsd auth_rpcgss 
nfs_acl nfs lockd grace fscache sunrpc ip6table_filter ip6_tables nf_log_ipv4 
nf_log_common xt_LOG xt_limit xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 
xt_conntrack nf_conntrack iptable_filter ip_tables x_tables iTCO_wdt 
iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel 
kvm irqbypass crct10dif_pclmul raid10 crc32_pclmul aesni_intel aes_x86_64 lrw 
gf128mul glue_helper ablk_helper cryptd sb_edac lpc_ich mfd_core mei_me mei 
i2c_i801 ioatdma ipmi_ssif ipmi_msghandler acpi_pad acpi_cpufreq squashfs 
lz4_decompress xhci_pci xhci_hcd ixgbe mdio vxlan ip6_udp_tunnel udp_tunnel ptp 
pps_core dca ehci_pci ehci_hcd usbcore ahci libahci usb_common tpm_tis
[3851435.874884] irq event stamp: 3828060294
[3851435.879398] hardirqs last  enabled at (3828060293): [] 
810db86c
[3851435.10] hardirqs last disabled at (3828060294): [] 
8170ff74
[3851435.898254] softirqs last  enabled at (3828060156): [] 
8107f31e
[3851435.907666] softirqs last disabled at (3828060139): [] 
8107f648
[3851435.917094] CPU: 13 PID: 2146 Comm: pocviexg Not tainted 4.4.52-grsec #1
[3851435.925252] Hardware name: Intel S2600WTTR, BIOS 
SE5C610.86B.01.01.0016.C1.033120161139 03/31/2016
[3851435.937473] task: 881c9ff35c00 ti: 881c9ff372b0 task.ti: 
881c9ff372b0
[3851435.946104] RIP: 0010:[]  [] 
try_to_wake_up+0xe4/0x500
[3851435.955614] RSP: :c90016beba40  EFLAGS: 0002
[3851435.961780] RAX:  RBX: 88102cc8e3d8 RCX: 
0001
[3851435.970016] RDX: 0001 RSI: 01e4 RDI: 

[3851435.978259] RBP: c90016beba80 R08:  R09: 
0001
[3851435.986493] R10: 00e0 R11: 881c9ff35c00 R12: 
0082
[3851435.994727] R13: 88102cc8dc00 R14: 001d3f00 R15: 
81f25060
[3851436.002962] FS:  6e79227fc700() GS:88203400() 
knlGS:
[3851436.012266] CS:  0010 DS:  ES:  CR0: 80050033
[3851436.018920] CR2: 6e79227fba38 CR3: 00201d2fe000 CR4: 
003626f0
[3851436.027155] DR0:  DR1:  DR2: 

[3851436.035391] DR3:  DR6: fffe0ff0 DR7: 
0400
[3851436.043626] Stack:
[3851436.046098]  8820341ce620 8820341ce608  
8820341ce600
[3851436.054695]  c90016bebb48 000d  
881033fce600
[3851436.063259]  c90016beba90 810adc80 c90016bebbe8 
8112c830
[3851436.071868] Call Trace:
[3851436.074862]  [] wake_up_process+0x10/0x20
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1997
[3851436.081317]  [] stop_two_cpus+0x1b0/0x280
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/stop_machine.c#L235
[3851436.087778]  [] ? cpu_stop_park+0x40/0x40
[3851436.094237]  [] ? cpu_stop_park+0x40/0x40
[3851436.100715]  [] ? __migrate_swap_task.part.94+0x70/0x70
[3851436.108539]  [] migrate_swap+0xf1/0x1f0  
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/core.c#L1408
[3851436.114800]  [] task_numa_migrate+0x345/0xb60
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1572
[3851436.121656]  [] ? futex_wait_queue_me+0xf3/0x160
[3851436.128801]  [] numa_migrate_preferred+0x73/0x80 
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L1597
[3851436.135936]  [] task_numa_fault+0x7a4/0xee0  
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/kernel/sched/fair.c#L2137
[3851436.142604]  [] ? should_numa_migrate_memory+0x50/0x130
[3851436.150437]  [] handle_mm_fault+0xfd6/0x1e10 
https://github.com/skyportsystems/linux/blob/4.4.52-grsec/mm/memory.c#L3503
[3851436.157189]  [] ? handle_mm_fault+0x48/0x1e10
[3851436.164051]  [] ? ___might_sleep+0x1ea/0x220
[3851436.170804]  [] __do_page_fault+0x1bf/0x460
[3851436.177475]  [] ? sys_futex+0x74/0x230
[3851436.183642]  [] do_page_fault+0x2c/0x40
[3851436.189909]  [] page_fault+0x28/0x30
[3851436.195892] Code: 0f 85 e0 02 00 00 41 8

Which came first, hard kernel lockup or SATA errors?

2017-10-06 Thread Ed Swierk
I'm trying to untangle a series of problems that suddenly occurred on
a dual-socket Xeon server system that had been running a bunch of KVM
workloads just fine for over 6 weeks (4.4.52-grsec kernel,
Debian-derived userspace).

Here are the highlights, with timestamps in seconds:

[3851435] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
[3851438] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0

[3851450] NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [pocviexg:2136]
[3851450] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 22s! [CPU 
0/KVM:9662]
[3851450] NMI watchdog: BUG: soft lockup - CPU#14 stuck for 22s! [CPU 
1/KVM:10302]

[3851478] NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [pocviexg:2136]
[3851478] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 22s! [CPU 
0/KVM:9662]
[3851478] NMI watchdog: BUG: soft lockup - CPU#14 stuck for 22s! [CPU 
1/KVM:10302]

[3851482] BUG: spinlock lockup suspected on CPU#0, migration/0/10

[3851482] Sending NMI to all CPUs:
[3851482] NMI backtrace for cpu 0
[3851483] NMI backtrace for cpu 1
[3851483] NMI backtrace for cpu 2
[3851483] NMI backtrace for cpu 3
[3851483] NMI backtrace for cpu 4

[3851484] ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen **
[3851484] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen **

[3851484] NMI backtrace for cpu 5
[3851484] NMI backtrace for cpu 6
...
[3851489] NMI backtrace for cpu 24

[3851489] ata8.00: qc timeout (cmd 0xec) **
[3851489] ata7.00: qc timeout (cmd 0xec) **

[3851489] NMI backtrace for cpu 25
...
[3851490] NMI backtrace for cpu 31

[3851499] ata8.00: qc timeout (cmd 0xec) **
[3851499] ata7.00: qc timeout (cmd 0xec) **

[3851510] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 22s! [CPU 
0/KVM:9662]
...and so on for 2 more hours, until the system was finally rebooted

First I want to know if I can blame the whole mess on flaky
hardware. The error messages marked with ** are from IO operations on
two SATA SSDs. On one hand, getting errors from two different drives
at the exact same time suggests that the drives themselves aren't the
problem. On the other hand, the fact that the system has two other
drives connected to the same controller that continued to work (even
though all were being accessed more or less simultaneously in a
software RAID-10 configuration) suggests that the controller isn't
totally broken. (The controller is embedded in the C610 PCH on the
motherboard.) No SATA errors have occurred on that system before or
since.

Is there any chance that those SATA errors could have been caused by
the kernel lockups--say due to a timing-sensitive interrupt not
getting serviced? Or by any other conceivable software problem?

If not, is it plausible that a hardware problem with the SATA
controller or drives caused the kernel lockups, even though the first
hard lockups were reported 48 seconds before the first SATA error
messages?

Following is an excerpt of the console log. Let me know if any other
details would be helpful.



[3851435.62] NMI watchdog: Watchdog detected hard LOCKUP on cpu 13
[3851435.784582] Modules linked in: vhost_net tun vhost macvtap 8021q garp mrp 
drbg ansi_cprng dm_crypt algif_skcipher af_alg ip_gre ip_tunnel gre macvlan 
ebtable_filter ebtables openvswitch nf_defrag_ipv6 libcrc32c nfsd auth_rpcgss 
nfs_acl nfs lockd grace fscache sunrpc ip6table_filter ip6_tables nf_log_ipv4 
nf_log_common xt_LOG xt_limit xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 
xt_conntrack nf_conntrack iptable_filter ip_tables x_tables iTCO_wdt 
iTCO_vendor_support x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel 
kvm irqbypass crct10dif_pclmul raid10 crc32_pclmul aesni_intel aes_x86_64 lrw 
gf128mul glue_helper ablk_helper cryptd sb_edac lpc_ich mfd_core mei_me mei 
i2c_i801 ioatdma ipmi_ssif ipmi_msghandler acpi_pad acpi_cpufreq squashfs 
lz4_decompress xhci_pci xhci_hcd ixgbe mdio vxlan ip6_udp_tunnel udp_tunnel ptp 
pps_core dca ehci_pci ehci_hcd usbcore ahci libahci usb_common tpm_tis
[3851435.874884] irq event stamp: 3828060294
[3851435.879398] hardirqs last  enabled at (3828060293): [] 
810db86c
[3851435.10] hardirqs last disabled at (3828060294): [] 
8170ff74
[3851435.898254] softirqs last  enabled at (3828060156): [] 
8107f31e
[3851435.907666] softirqs last disabled at (3828060139): [] 
8107f648
[3851435.917094] CPU: 13 PID: 2146 Comm: pocviexg Not tainted 4.4.52-grsec #1
[3851435.925252] Hardware name: Intel S2600WTTR, BIOS 
SE5C610.86B.01.01.0016.C1.033120161139 03/31/2016
[3851435.937473] task: 881c9ff35c00 ti: 881c9ff372b0 task.ti: 
881c9ff372b0
[3851435.946104] RIP: 0010:[]  [] 
try_to_wake_up+0xe4/0x500
[3851435.955614] RSP: :c90016beba40  EFLAGS: 0002
[3851435.961780] RAX:  RBX: 88102cc8e3d8 RCX: 
0001
[3851435.970016] RDX: 0001 RSI: 01e4 RDI: 

[3851435.978259] RBP: c90016beba80 R08: 

[PATCH] thermal: intel_pch_thermal: Read large temp values correctly

2017-07-19 Thread Ed Swierk
On all supported platforms, the TS Reading (TSR) field in the
Temperature (TEMP) register is 9 bits wide. Values above 0x100 (78
degrees C) are plausible, so don't mask out the topmost bit. And the
register itself is 16 bits wide, so use readw() rather than readl().

Signed-off-by: Ed Swierk 
---
 drivers/thermal/intel_pch_thermal.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/intel_pch_thermal.c 
b/drivers/thermal/intel_pch_thermal.c
index 9889c90..318fc1b 100644
--- a/drivers/thermal/intel_pch_thermal.c
+++ b/drivers/thermal/intel_pch_thermal.c
@@ -49,7 +49,7 @@
 #define WPT_TSGPEN 0x84/* General Purpose Event Enables */
 
 /*  Wildcat Point-LP  PCH Thermal Register bit definitions */
-#define WPT_TEMP_TSR   0x00ff  /* Temp TS Reading */
+#define WPT_TEMP_TSR   0x01ff  /* Temp TS Reading */
 #define WPT_TSC_CPDE   0x01/* Catastrophic Power-Down Enable */
 #define WPT_TSS_TSDSS  0x10/* Thermal Sensor Dynamic Shutdown Status */
 #define WPT_TSS_GPES   0x08/* GPE status */
@@ -174,9 +174,9 @@ static int pch_wpt_init(struct pch_thermal_device *ptd, int 
*nr_trips)
 
 static int pch_wpt_get_temp(struct pch_thermal_device *ptd, int *temp)
 {
-   u8 wpt_temp;
+   u16 wpt_temp;
 
-   wpt_temp = WPT_TEMP_TSR & readl(ptd->hw_base + WPT_TEMP);
+   wpt_temp = WPT_TEMP_TSR & readw(ptd->hw_base + WPT_TEMP);
 
/* Resolution of 1/2 degree C and an offset of -50C */
*temp = (wpt_temp * 1000 / 2 - 5);
-- 
1.9.1



[PATCH] thermal: intel_pch_thermal: Fix enable check on Broadwell-DE

2017-07-19 Thread Ed Swierk
Using the TSDSS flag to determine whether the thermal sensor is
enabled is problematic. Broadwell-DE (Xeon D-1500) does not support
dynamic shutdown and the TSDSS flag always reads 0 (contrary to the
current datasheet). Even on hardware supporting dynamic shutdown, the
driver does nothing to configure it, and the dynamic shutdown state
should not prevent the driver from loading. The ETS flag itself
indicates whether the thermal sensor is enabled, so use it instead of
the TSDSS flag on all hardware platforms.

Signed-off-by: Ed Swierk 
---
 drivers/thermal/intel_pch_thermal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/intel_pch_thermal.c 
b/drivers/thermal/intel_pch_thermal.c
index 2b49e8d..9889c90 100644
--- a/drivers/thermal/intel_pch_thermal.c
+++ b/drivers/thermal/intel_pch_thermal.c
@@ -125,7 +125,7 @@ static int pch_wpt_init(struct pch_thermal_device *ptd, int 
*nr_trips)
*nr_trips = 0;
 
/* Check if BIOS has already enabled thermal sensor */
-   if (WPT_TSS_TSDSS & readb(ptd->hw_base + WPT_TSS)) {
+   if (WPT_TSEL_ETS & readb(ptd->hw_base + WPT_TSEL)) {
ptd->bios_enabled = true;
goto read_trips;
}
@@ -141,7 +141,7 @@ static int pch_wpt_init(struct pch_thermal_device *ptd, int 
*nr_trips)
}
 
writeb(tsel|WPT_TSEL_ETS, ptd->hw_base + WPT_TSEL);
-   if (!(WPT_TSS_TSDSS & readb(ptd->hw_base + WPT_TSS))) {
+   if (!(WPT_TSEL_ETS & readb(ptd->hw_base + WPT_TSEL))) {
dev_err(&ptd->pdev->dev, "Sensor can't be enabled\n");
return -ENODEV;
}
-- 
1.9.1



Disable all network protocols on an interface?

2016-11-14 Thread Ed Swierk
I have a Linux kernel 4.4 system hosting a number of kvm VMs. Physical 
interface eth0 connects to an 802.1Q trunk port on an external switch. Each VM 
has a virtual interface (e1000 or virtio-net) connected to the physical NIC 
through a macvtap interface and a VLAN interface; traffic between the external 
switch and the host is tagged with a per-VM tag. The only logic is 
demultiplexing incoming traffic by VLAN tag and stripping the tag, and adding 
the tag for outgoing traffic. Other than that, the eth0-VM datapath is a dumb 
pipe.

eth0 is assigned an IP address for host applications to send and receive 
untagged packets. For example, here's the setup with 2 VMs.

+- (untagged) 192.168.0.2
  eth0 -+- (tag 1) --- eth0.1 --- macvtap1 --- VM1
+- (tag 2) --- eth0.2 --- macvtap2 --- VM2

Various iptables rules filter the untagged packets received for host 
applications. The last rule in the INPUT chain logs incoming packets that don't 
match earlier rules:

  -A INPUT -m limit --limit 10/min -j LOG --log-prefix FilterInput

This all works, but I see occasional FilterInput messages for traffic received 
on eth0.1 and eth0.2: so far, only DHCP packets with destination MAC address 
ff:ff:ff:ff:ff:ff.

  FilterInput IN=eth0.1 OUT= MAC=ff:ff:ff:ff:ff:ff:00:01:02:03:04:05:08:00 
SRC=0.0.0.0 DST=255.255.255.255 LEN=328 TOS=0x10 PREC=0x00 TTL=128 ID=0 
PROTO=UDP SPT=68 DPT=67 LEN=308

Even though these are IP packets, I naively expect packets received on the VLAN 
interface lacking IP address to be either consumed by the attached macvtap or 
dropped before they trigger an iptables filter INPUT rule. It's a bit alarming 
to see packets destined for a VM to be processed at all by the host IP stack.

Digging through the code, I find that the core packet receive function 
__netif_receive_skb_core() first gives master devices like bridges and 
macvlans/macvtaps a chance to consume the packet; otherwise the packet gets 
handled by all installed protocols like IPv4. The packet gets pretty far down 
the IP receive process before it's discovered that there's nowhere to route it 
to, and no local sockets to deliver it to. The iptables INPUT chain is invoked 
well before that happens. (As far as I can tell, there's no explicit check in 
the IP receive code whether a local interface has an IP address.)

The macvlan's rx_handler definitively consumes or drops unicast packets, 
depending on the destination MAC address. But for broadcast packets, it  passes 
the packet to the attached VM interface and also tells the core receive 
function to continue processing it. Presumably this is to allow a macvlan to 
attach to one or more VMs as well as have a local IP address.

The logic in the bridge driver is a bit different: it consumes all packets from 
the slave interface. This makes sense as only the bridge master interface can 
be assigned a local IP address.

However in my application, I'm setting up the macvtap interfaces in passthrough 
mode, which precludes assigning a local IP address, just like a bridge slave. 
So it stands to reason that for a macvlan in passthrough mode, its rx_handler 
should consume or drop all packets, and not allow broadcast packets to also be 
handled locally.

This one-line change seems to do the trick:

--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -411,7 +411,7 @@ static rx_handler_result_t macvlan_handle_frame(struct 
sk_buff **pskb)
rx_handler_result_t handle_res;

port = macvlan_port_get_rcu(skb->dev);
-   if (is_multicast_ether_addr(eth->h_dest)) {
+   if (is_multicast_ether_addr(eth->h_dest) && !port->passthru) {
skb = ip_check_defrag(dev_net(skb->dev), skb, 
IP_DEFRAG_MACVLAN);
if (!skb)
return RX_HANDLER_CONSUMED;

Well, mostly. I still see FilterInput log messages in the brief window between 
creating the VLAN interface and attaching the macvtap to it, since there's no 
rx_handler to consume them. Hooking the VLAN interface to a bridge rather than 
a macvtap suppresses local IP processing on the slave but enables it on the 
bridge master interface. Apparently any non-slave interface can handle IP 
traffic to some extent, even if it doesn't have an IP address.

I worry that allowing any IP processing at all on eth0-VM traffic is a 
potential security hole, and I'm one configuration typo away from letting VM's 
traffic leak into another VM or a host application, and vice versa. And logging 
those FilterInput messages for non-local traffic just looks like sloppy 
security.

Is there some way to stop all local protocols from handling packets received on 
an interface--a protocol-agnostic equivalent of 
net.ipv6.conf.INTF.disable_ipv6? Would it be reasonable to implement one?

--Ed


Re: [PATCH v2 00/11] staging: octeon: multi rx group (queue) support

2016-08-31 Thread Ed Swierk
On 8/31/16 13:57, Aaro Koskinen wrote:
> This series implements multiple RX group support that should improve
> the networking performance on multi-core OCTEONs. Basically we register
> IRQ and NAPI for each group, and ask the HW to select the group for
> the incoming packets based on hash.
> 
> Tested on EdgeRouter Lite with a simple forwarding test using two flows
> and 16 RX groups distributed between two cores - the routing throughput
> is roughly doubled.
> 
> Also tested with EBH5600 (8 cores) and EBB6800 (16 cores) by sending
> and receiving traffic in both directions using SGMII interfaces.

With this series on 4.4.19, rx works with receive_group_order > 0.
Setting receive_group_order=4, I do see 16 Ethernet interrupts. I tried
fiddling with various smp_affinity values (e.g. setting them all to
, or assigning a different one to each interrupt, or giving a
few to some and a few to others), as well as different values for
rps_cpus. 10-thread parallel iperf performance varies between 0.5 and 1.5
Gbit/sec total depending on the particular settings.

With the SDK kernel I get over 8 Gbit/sec. It seems to be achieving that
using just one interrupt (not even a separate one for tx, as far as I can
tell) pegged to CPU 0 (the default smp_affinity). I must be missing some
other major configuration tweak, perhaps specific to 10G.

Can you run a test on the EBB6800 with the interfaces in 10G mode?

--Ed


Re: [PATCH 0/9] staging: octeon: multi rx group (queue) support

2016-08-31 Thread Ed Swierk
On 8/31/16 14:20, Aaro Koskinen wrote:
> On Wed, Aug 31, 2016 at 09:20:07AM -0700, Ed Swierk wrote:
>> Here's my workaround:
> 
> [...]
> 
>> -static int cvm_oct_poll(struct oct_rx_group *rx_group, int budget)
>> +static int cvm_oct_poll(int group, int budget)
>>  {
>>  const int   coreid = cvmx_get_core_num();
>>  u64 old_group_mask;
>> @@ -181,13 +181,13 @@ static int cvm_oct_poll(struct oct_rx_group *rx_group, 
>> int budget)
>>  if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
>>  old_group_mask = cvmx_read_csr(CVMX_SSO_PPX_GRP_MSK(coreid));
>>  cvmx_write_csr(CVMX_SSO_PPX_GRP_MSK(coreid),
>> -   BIT(rx_group->group));
>> +   BIT(group));
>> @@ -447,7 +447,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, 
>> int budget)
>>   napi);
>>  int rx_count;
>>
>> -rx_count = cvm_oct_poll(rx_group, budget);
>> +rx_count = cvm_oct_poll(rx_group->group, budget);
> 
> I'm confused - there should be no difference?!

I can't figure out the difference either. I get a crash within the first
couple packets, while with the workaround I can't get it to crash at all.
It always bombs in netif_receive_skb(), which isn't very close to any
rx_group pointer dereference.

# ping 172.16.100.253
PING 172.16.100.253 (172.16.100.253): 56 data bytes
Data bus error, epc == 803fd4ac, ra == 801943d8
Oops[#1]:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.19+ #94
task: 80863e80 ti: 8084 task.ti: 8084
$ 0   :  80126078 ef7bdef7bdef7bdf 815d3860
$ 4   : 80e045c8 81aae950 81aae950 
$ 8   : 81aae950 0038 0070 03bf
$12   : 5400 03bd  
$16   : 81aae950 81aae950 80e045c8 
$20   : 00fa 0001 5c02e0fa 
$24   : 0062 80548468  
$28   : 8084 808436d0 80feba38 801943d8
Hi: 
Lo: 05198e3760c0
epc   : 803fd4ac __list_add_rcu+0x7c/0xa0
ra: 801943d8 __lock_acquire+0xd94/0x1bf0
Status: 10008ce2KX SX UX KERNEL EXL 
Cause : 40808c1c (ExcCode 07)
PrId  : 000d910a (Cavium Octeon II)
Modules linked in:
Process swapper/0 (pid: 0, threadinfo=8084, task=80863e80, 
tls=)
Stack : 80863e80 808646c8 81aae950 801943d8
  00fa 808646c0  0002
   8057ab90 80864690 80870990
  0001   0017
   80193e08 0017 80864688
  0001 8057ab90 808a7d28 80007f4b7500
  80007a0b52e8 0001 807f 80007f768068
  8085fac8 8019568c  
  808a7d10 80645e60 80007f4a8600 0254
  808a7d58 8057ab90 0008 80007f7680a0
  ...
Call Trace:
[<__list_add_rcu at list_debug.c:97 (discriminator 2)>] __list_add_rcu+0x7c/0xa0
[] __lock_acquire+0xd94/0x1bf0
[] lock_acquire+0x50/0x78
[<__raw_read_lock at rwlock_api_smp.h:150
 (inlined by) _raw_read_lock at spinlock.c:223>] _raw_read_lock+0x4c/0x90
[] raw_local_deliver+0x58/0x1e8
[] 
ip_local_deliver_finish+0x118/0x4a8
[] ip_local_deliver+0x68/0xe0
[] ip_rcv+0x398/0x478
[<__netif_receive_skb_core at dev.c:3948>] __netif_receive_skb_core+0x764/0x818
[] 
netif_receive_skb_internal+0x148/0x214
[] 
cvm_oct_napi_poll+0x790/0xa2c
[] net_rx_action+0x130/0x2e0
[] __do_softirq+0x1f0/0x318
[] irq_exit+0x64/0xcc
[] octeon_irq_ciu2+0x154/0x1c4
[] plat_irq_dispatch+0x70/0x108
[] ret_from_irq+0x0/0x4
[] __r4k_wait+0x20/0x40
[] cpu_startup_entry+0x154/0x1d0
[] start_kernel+0x538/0x554

Presumably there's some sort of race condition that my change doesn't
really fix but happens to avoid by dereferencing rx_group just once early
on?

--Ed


Re: [PATCH 0/9] staging: octeon: multi rx group (queue) support

2016-08-31 Thread Ed Swierk
Aaro Koskinen wrote:
> Oops, looks like I tested without CONFIG_NET_POLL_CONTROLLER enabled
> and that seems to be broken. Sorry.

I'm not using CONFIG_NET_POLL_CONTROLLER either; the problem is in the
normal cvm_oct_napi_poll() path.

Here's my workaround:

--- a/drivers/staging/octeon/ethernet-rx.c
+++ b/drivers/staging/octeon/ethernet-rx.c
@@ -159,7 +159,7 @@ static inline int cvm_oct_check_rcv_error(cvmx_wqe_t *work)
return 0;
 }

-static int cvm_oct_poll(struct oct_rx_group *rx_group, int budget)
+static int cvm_oct_poll(int group, int budget)
 {
const int   coreid = cvmx_get_core_num();
u64 old_group_mask;
@@ -181,13 +181,13 @@ static int cvm_oct_poll(struct oct_rx_group *rx_group, 
int budget)
if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
old_group_mask = cvmx_read_csr(CVMX_SSO_PPX_GRP_MSK(coreid));
cvmx_write_csr(CVMX_SSO_PPX_GRP_MSK(coreid),
-  BIT(rx_group->group));
+  BIT(group));
cvmx_read_csr(CVMX_SSO_PPX_GRP_MSK(coreid)); /* Flush */
} else {
old_group_mask = cvmx_read_csr(CVMX_POW_PP_GRP_MSKX(coreid));
cvmx_write_csr(CVMX_POW_PP_GRP_MSKX(coreid),
   (old_group_mask & ~0xull) |
-  BIT(rx_group->group));
+  BIT(group));
}

if (USE_ASYNC_IOBDMA) {
@@ -212,15 +212,15 @@ static int cvm_oct_poll(struct oct_rx_group *rx_group, 
int budget)
if (!work) {
if (OCTEON_IS_MODEL(OCTEON_CN68XX)) {
cvmx_write_csr(CVMX_SSO_WQ_IQ_DIS,
-  BIT(rx_group->group));
+  BIT(group));
cvmx_write_csr(CVMX_SSO_WQ_INT,
-  BIT(rx_group->group));
+  BIT(group));
} else {
union cvmx_pow_wq_int wq_int;

wq_int.u64 = 0;
-   wq_int.s.iq_dis = BIT(rx_group->group);
-   wq_int.s.wq_int = BIT(rx_group->group);
+   wq_int.s.iq_dis = BIT(group);
+   wq_int.s.wq_int = BIT(group);
cvmx_write_csr(CVMX_POW_WQ_INT, wq_int.u64);
}
break;
@@ -447,7 +447,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int 
budget)
 napi);
int rx_count;

-   rx_count = cvm_oct_poll(rx_group, budget);
+   rx_count = cvm_oct_poll(rx_group->group, budget);

if (rx_count < budget) {
/* No more work */
@@ -466,7 +466,7 @@ static int cvm_oct_napi_poll(struct napi_struct *napi, int 
budget)
  */
 void cvm_oct_poll_controller(struct net_device *dev)
 {
-   cvm_oct_poll(oct_rx_group, 16);
+   cvm_oct_poll(oct_rx_group->group, 16);
 }
 #endif

> Can you see multiple ethernet IRQs in /proc/interrupts and their
> counters increasing?
> 
> With receive_group_order=4 you should see 16 IRQs.

I see the 16 IRQs, and the first one does increase. But packets don't make
it to the application.

--Ed


Re: [PATCH 0/9] staging: octeon: multi rx group (queue) support

2016-08-30 Thread Ed Swierk
Hi Aaro,

On Tue, Aug 30, 2016 at 11:47 AM, Aaro Koskinen  wrote:
> This series implements multiple RX group support that should improve
> the networking performance on multi-core OCTEONs. Basically we register
> IRQ and NAPI for each group, and ask the HW to select the group for
> the incoming packets based on hash.
>
> Tested on EdgeRouter Lite with a simple forwarding test using two flows
> and 16 RX groups distributed between two cores - the routing throughput
> is roughly doubled.

I applied the series to my 4.4.19 tree, which involved backporting a
bunch of other patches from master, most of them trivial.

When I test it on a Cavium Octeon 2 (CN6880) board, I get an immediate
crash (bus error) in the netif_receive_skb() call from cvm_oct_poll().
Replacing the rx_group argument to cvm_oct_poll() with int group, and
dereferencing rx_group->group in the caller (cvm_oct_napi_poll())
instead makes the crash disappear. Apparently there's some race in
dereferencing rx_group from within cvm_oct_poll().

With this workaround in place, I can send and receive on XAUI
interfaces, but don't see any performance improvement. I'm guessing I
need to set receive_group_order > 0. But any value between 1 and 4
seems to break rx altogether. When I ping another host I see both
request and response on the wire, and the interface counters increase,
but the response doesn't make it back to ping.

Is some other configuration needed to make use of multiple rx groups?

--Ed


Re: [PATCH] xenbus: don't look up transaction IDs for ordinary writes

2016-08-18 Thread Ed Swierk
On Tue, Aug 16, 2016 at 3:07 AM, Juergen Gross  wrote:
> On 15/08/16 17:02, Jan Beulich wrote:
>> This should really only be done for XS_TRANSACTION_END messages, or
>> else at least some of the xenstore-* tools don't work anymore.
>>
>> Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition")
>> Reported-by: Richard Schütz 
>> Cc: 
>> Signed-off-by: Jan Beulich 
>> Tested-by: Richard Schütz 
>
> Reviewed-by: Juergen Gross 

Confirmed that this patch fixes xenstore-read, xenstore-ls et al, on
stable kernel 4.4.17.

It should also go to 4.7, 4.6, 4.1, 3.18.

--Ed


Re: [PATCH v9 0/5] tpm: Command duration logging and chip-specific override

2016-07-13 Thread Ed Swierk
On Wed, Jul 13, 2016 at 10:36 AM, Jason Gunthorpe
 wrote:
> I think your bios is broken?

The BIOS is broken in many ways. I already have to pass
memmap=256M$0x8000, otherwise PCIe extended config space
(MMCONFIG) is inaccessible. Also I found memmap=0x7000$0x7a7d
works around "APEI: Can not request [mem 0x7a7d0018-0x7a7d0067] for
APEI ERST registers", as the BIOS seems to be mistakenly reserving
0x7b7d-7b7d7000 instead.

> A working BIOS will look like this:
>
>  $ cat /proc/iomem  | grep -i fed400
>  fed4-fed44fff : pnp 00:00
>
> It sets aside the struct resource during pnp:
>
>  [0.097318] pnp: PnP ACPI init
>  [0.097366] system 00:00: [mem 0xfed4-0xfed44fff] has been reserved
>
> What did your system do?
>
> You should see prints like this:
>
> printk(KERN_DEBUG
>"e820: reserve RAM buffer [mem %#010llx-%#010llx]\n",
>start, end);
>
> Which only happen if E820_RAM is set, which is certainly not right for
> TPM memory.

On my system I see

e820: reserve RAM buffer [mem 0x0009b000-0x0009]
e820: reserve RAM buffer [mem 0x75b02000-0x77ff]
e820: reserve RAM buffer [mem 0x7a1f6000-0x7bff]
e820: reserve RAM buffer [mem 0x7b80-0x7bff]
e820: reserve RAM buffer [mem 0xfed3-0x]
e820: reserve RAM buffer [mem 0x505deb000-0x507ff]

which doesn't make a whole lot of sense, as several of those areas
overlap each other, never mind devices.

> I don't know what kernel convention is to handle these sorts of
> defects?
>
> Is the use of the memmap kernel command line an appropriate work
> around?

It works for me, though I would like to know if there's another approach.

--Ed


Re: [PATCH v9 0/5] tpm: Command duration logging and chip-specific override

2016-07-13 Thread Ed Swierk
On Wed, Jul 13, 2016 at 9:19 AM, Ed Swierk  wrote:
> v9: Include command duration in existing error messages rather than
> logging an extra debug message. Rebase onto Jarkko's tree.

Incidentally, with Jarkko's tree the tpm_tis module refuses to
initialize (with or without force=1):

tpm_tis 00:03: can't request region for resource [mem 0xfed4-0xfed44fff]
tpm_tis: probe of 00:03 failed with error -16

The memory region is not marked reserved by the BIOS:

e820: BIOS-provided physical RAM map:
Xen: [mem 0x-0x0005] usable
Xen: [mem 0x0006-0x00067fff] reserved
Xen: [mem 0x00068000-0x0009afff] usable
Xen: [mem 0x000a-0x000f] reserved
Xen: [mem 0x0010-0x75b01fff] usable
Xen: [mem 0x75baf000-0x7a1f5fff] usable
Xen: [mem 0x7b7d7000-0x7b7f] usable
Xen: [mem 0x7bf0-0x7bff] usable
Xen: [mem 0xc7ffc000-0xc7ffcfff] reserved
Xen: [mem 0xfbffc000-0xfbffcfff] reserved
Xen: [mem 0xfec0-0xfec01fff] reserved
Xen: [mem 0xfec4-0xfec40fff] reserved
Xen: [mem 0xfed2-0xfed2] usable
Xen: [mem 0xfee0-0xfeef] reserved
Xen: [mem 0x0001-0x000505deafff] usable

With force=1, /proc/iomem shows:

fed3-fedf : RAM buffer
  fed4-fed44fff : tpm_tis

I can work around this problem by passing memmap=0x5000$0xfed4 on
the kernel command line.

--Ed


[PATCH v9 2/5] tpm: Add optional logging of TPM command durations

2016-07-13 Thread Ed Swierk
Some TPMs violate their own advertised command durations. This is much
easier to debug with data about how long each command actually takes
to complete. Add debug messages that can be enabled by running

  echo -n 'module tpm +p' >/sys/kernel/debug/dynamic_debug/control

on a kernel configured with DYNAMIC_DEBUG=y.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-interface.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 5e3c1b6..a4beb53 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -335,13 +335,14 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
 {
ssize_t rc;
u32 count, ordinal;
-   unsigned long stop;
+   unsigned long start, stop;
 
if (bufsiz > TPM_BUFSIZE)
bufsiz = TPM_BUFSIZE;
 
count = be32_to_cpu(*((__be32 *) (buf + 2)));
ordinal = be32_to_cpu(*((__be32 *) (buf + 6)));
+   dev_dbg(&chip->dev, "starting command %d count %d\n", ordinal, count);
if (count == 0)
return -ENODATA;
if (count > bufsiz) {
@@ -362,18 +363,23 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
if (chip->flags & TPM_CHIP_FLAG_IRQ)
goto out_recv;
 
+   start = jiffies;
if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   stop = jiffies + tpm2_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm2_calc_ordinal_duration(chip, ordinal);
else
-   stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm_calc_ordinal_duration(chip, ordinal);
do {
u8 status = chip->ops->status(chip);
if ((status & chip->ops->req_complete_mask) ==
-   chip->ops->req_complete_val)
+   chip->ops->req_complete_val) {
+   dev_dbg(&chip->dev, "completed command %d in %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
goto out_recv;
+   }
 
if (chip->ops->req_canceled(chip, status)) {
-   dev_err(&chip->dev, "Operation Canceled\n");
+   dev_err(&chip->dev, "canceled command %d after %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
rc = -ECANCELED;
goto out;
}
@@ -383,7 +389,8 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
} while (time_before(jiffies, stop));
 
chip->ops->cancel(chip);
-   dev_err(&chip->dev, "Operation Timed out\n");
+   dev_err(&chip->dev, "command %d timed out after %d ms\n", ordinal,
+   jiffies_to_msecs(jiffies - start));
rc = -ETIME;
goto out;
 
-- 
1.9.1



[PATCH v9 3/5] tpm: Clean up reading of timeout and duration capabilities

2016-07-13 Thread Ed Swierk
Call tpm_getcap() from tpm_get_timeouts() to eliminate redundant
code. Return all errors to the caller rather than swallowing them
(e.g. when tpm_transmit_cmd() returns nonzero).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 74 +++-
 1 file changed, 27 insertions(+), 47 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index a4beb53..dc492ee 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -460,9 +460,19 @@ ssize_t tpm_getcap(struct tpm_chip *chip, __be32 
subcap_id, cap_t *cap,
tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
tpm_cmd.params.getcap_in.subcap = subcap_id;
}
+
rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, desc);
+
+   if (!rc &&
+   ((subcap_id == TPM_CAP_PROP_TIS_TIMEOUT &&
+ be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 20) ||
+(subcap_id == TPM_CAP_PROP_TIS_DURATION &&
+ be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 16)))
+   rc = -EINVAL;
+
if (!rc)
*cap = tpm_cmd.params.getcap_out.cap;
+
return rc;
 }
 
@@ -503,10 +513,9 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
 
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
-   struct tpm_cmd_t tpm_cmd;
+   cap_t cap;
unsigned long new_timeout[4];
unsigned long old_timeout[4];
-   struct duration_t *duration_cap;
ssize_t rc;
 
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
@@ -524,42 +533,25 @@ int tpm_get_timeouts(struct tpm_chip *chip)
return 0;
}
 
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
-
+   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   "attempting to determine the timeouts");
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
   Execute a startup command. */
-   dev_info(&chip->dev, "Issuing TPM_STARTUP");
+   dev_info(&chip->dev, "Issuing TPM_STARTUP\n");
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
- NULL);
-   }
-   if (rc) {
-   dev_err(&chip->dev,
-   "A TPM error (%zd) occurred attempting to determine the 
timeouts\n",
-   rc);
-   goto duration;
+   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   "attempting to determine the timeouts");
}
+   if (rc)
+   return rc;
 
-   if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
-   be32_to_cpu(tpm_cmd.header.out.length)
-   != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
-   return -EINVAL;
-
-   old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d);
+   old_timeout[0] = be32_to_cpu(cap.timeout.a);
+   old_timeout[1] = be32_to_cpu(cap.timeout.b);
+   old_timeout[2] = be32_to_cpu(cap.timeout.c);
+   old_timeout[3] = be32_to_cpu(cap.timeout.d);
memcpy(new_timeout, old_timeout, sizeof(new_timeout));
 
/*
@@ -597,29 +589,17 @@ int tpm_get_timeouts(struct tpm_chip *chip)
chip->timeout_c = usecs_to_jiffies(new_timeout[2]);
chip->timeout_d = usecs_to_jiffies(new_timeout[3]);
 
-duration:
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_DURATION;
-
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
- "attempting to determine the durations");
+   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_DURATION

[PATCH v9 4/5] tpm: Allow TPM chip drivers to override reported command durations

2016-07-13 Thread Ed Swierk
Some TPM chips report bogus command durations in their capabilities,
just as others report incorrect timeouts. Rework tpm_get_timeouts() to
allow chip drivers to override either via a single callback. Also
clean up handling of TPMs that report milliseconds instead of
microseconds.

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 148 ++-
 drivers/char/tpm/tpm_tis_core.c  |  32 +++--
 include/linux/tpm.h  |   3 +-
 3 files changed, 94 insertions(+), 89 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index dc492ee..9dafc25 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -513,10 +513,10 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
 
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
-   cap_t cap;
-   unsigned long new_timeout[4];
-   unsigned long old_timeout[4];
-   ssize_t rc;
+   cap_t cap1, cap2;
+   int rc;
+   unsigned long orig_timeout_a, orig_timeout_b, orig_timeout_c,
+   orig_timeout_d, orig_duration[3];
 
if (chip->flags & TPM_CHIP_FLAG_TPM2) {
/* Fixed timeouts for TPM2 */
@@ -533,7 +533,7 @@ int tpm_get_timeouts(struct tpm_chip *chip)
return 0;
}
 
-   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_TIMEOUT, &cap1,
"attempting to determine the timeouts");
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
@@ -542,77 +542,95 @@ int tpm_get_timeouts(struct tpm_chip *chip)
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
-   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_TIMEOUT, &cap1,
"attempting to determine the timeouts");
}
if (rc)
return rc;
 
-   old_timeout[0] = be32_to_cpu(cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(cap.timeout.d);
-   memcpy(new_timeout, old_timeout, sizeof(new_timeout));
-
-   /*
-* Provide ability for vendor overrides of timeout values in case
-* of misreporting.
-*/
-   if (chip->ops->update_timeouts != NULL)
-   chip->timeout_adjusted =
-   chip->ops->update_timeouts(chip, new_timeout);
-
-   if (!chip->timeout_adjusted) {
-   /* Don't overwrite default if value is 0 */
-   if (new_timeout[0] != 0 && new_timeout[0] < 1000) {
-   int i;
-
-   /* timeouts in msec rather usec */
-   for (i = 0; i != ARRAY_SIZE(new_timeout); i++)
-   new_timeout[i] *= 1000;
-   chip->timeout_adjusted = true;
-   }
-   }
-
-   /* Report adjusted timeouts */
-   if (chip->timeout_adjusted) {
-   dev_info(&chip->dev,
-HW_ERR "Adjusting reported timeouts: A %lu->%luus B 
%lu->%luus C %lu->%luus D %lu->%luus\n",
-old_timeout[0], new_timeout[0],
-old_timeout[1], new_timeout[1],
-old_timeout[2], new_timeout[2],
-old_timeout[3], new_timeout[3]);
-   }
-
-   chip->timeout_a = usecs_to_jiffies(new_timeout[0]);
-   chip->timeout_b = usecs_to_jiffies(new_timeout[1]);
-   chip->timeout_c = usecs_to_jiffies(new_timeout[2]);
-   chip->timeout_d = usecs_to_jiffies(new_timeout[3]);
-
-   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_DURATION, &cap,
+   rc = tpm_getcap(chip, TPM_CAP_PROP_TIS_DURATION, &cap2,
"attempting to determine the durations");
if (rc)
return rc;
 
-   chip->duration[TPM_SHORT] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_short));
-   chip->duration[TPM_MEDIUM] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_medium));
-   chip->duration[TPM_LONG] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_long));
+   be32_to_cpus(&cap1.timeout.a);
+   be32_to_cpus(&cap1.timeout.b);
+   be32_to_cpus(&cap1.timeout.c);
+   be32_to_cpus(&cap1.timeout.d);
+   chip->timeout_a = usecs_to_jiffies(cap1.timeout.a);
+   chip->timeout_b = usecs_to_jiffies(cap1.timeout.b);
+   chip->timeout_c = usecs_to_jiffies(cap1.timeout.c);
+   chip->timeout_d = usecs_to_jiffies(cap1.timeout.d);
 
-   /* The Broadcom BCM0

[PATCH v9 0/5] tpm: Command duration logging and chip-specific override

2016-07-13 Thread Ed Swierk
v9: Include command duration in existing error messages rather than
logging an extra debug message. Rebase onto Jarkko's tree.

v8: Fix v7 goof-up in tpm_getcap().

v7: Use tpm_getcap() instead of a redundant new function.

v6: Split tpm_get_cap_prop() out of tpm_get_timeouts(); always return
error on TPM command failure.

v5: Use msecs_to_jiffies() instead of * HZ / 1000.

v4: Rework tpm_get_timeouts() to allow overriding both timeouts and
durations via a single callback.

This series
- improves TPM command error reporting
- adds optional logging of TPM command durations
- allows chip-specific override of command durations as well as protocol
  timeouts
- overrides ST19NP18 TPM command duration to avoid lockups

Ed Swierk (5):
  tpm_tis: Improve reporting of IO errors
  tpm: Add optional logging of TPM command durations
  tpm: Clean up reading of timeout and duration capabilities
  tpm: Allow TPM chip drivers to override reported command durations
  tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

 drivers/char/tpm/tpm-interface.c | 219 ---
 drivers/char/tpm/tpm_tis_core.c  |  46 
 include/linux/tpm.h  |   3 +-
 3 files changed, 136 insertions(+), 132 deletions(-)

-- 
1.9.1



[PATCH v9 5/5] tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

2016-07-13 Thread Ed Swierk
The STMicro ST19NP18-TPM sometimes takes much longer to execute
commands than it reports in its capabilities. For example, command 186
(TPM_FlushSpecific) has been observed to take 14560 msec to complete,
far longer than the 3000 msec limit for "short" commands reported by
the chip. The behavior has also been seen with command 101
(TPM_GetCapability).

Worse, when the tpm_tis driver attempts to cancel the current command
(by writing commandReady = 1 to TPM_STS_x), the chip locks up
completely, returning all-1s from all memory-mapped register
reads. The lockup can be cleared only by resetting the system.

The occurrence of this excessive command duration depends on the
sequence of commands preceding it. One sequence is creating at least 2
new keys via TPM_CreateWrapKey, then letting the TPM idle for at least
30 seconds, then loading a key via TPM_LoadKey2. The next
TPM_FlushSpecific occasionally takes tens of seconds to
complete. Another sequence is creating many keys in a row without
pause. The TPM_CreateWrapKey operation gets much slower after the
first few iterations, as one would expect when the pool of precomputed
keys is exhausted. Then after a 35-second pause, the same TPM_LoadKey2
followed by TPM_FlushSpecific sequence triggers the behavior.

Our working theory is that this older TPM sometimes pauses to
precompute keys, which modern chips implement as a background
process. Without access to the chip's implementation details it's
impossible to know whether any commands are immune to being blocked by
this process. So it seems safest to ignore the chip's reported command
durations, and use a value much higher than any observed duration,
like 180 sec (which is the duration this chip reports for "long"
commands).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm_tis_core.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index f013664..84d6008 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -416,6 +416,12 @@ static void tpm_tis_update_timeouts(struct tpm_chip *chip)
chip->timeout_d = msecs_to_jiffies(TIS_SHORT_TIMEOUT);
chip->timeout_adjusted = true;
break;
+   case 0x104a: /* STMicro ST19NP18-TPM */
+   chip->duration[TPM_SHORT] = 180 * HZ;
+   chip->duration[TPM_MEDIUM] = 180 * HZ;
+   chip->duration[TPM_LONG] = 180 * HZ;
+   chip->duration_adjusted = true;
+   break;
}
 }
 
-- 
1.9.1



[PATCH v9 1/5] tpm_tis: Improve reporting of IO errors

2016-07-13 Thread Ed Swierk
Mysterious TPM behavior can be difficult to track down through all the
layers of software. Add error messages for conditions that should
never happen. Also include the manufacturer ID along with other chip
data printed during init.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm_tis_core.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c
index 8110b52..e62fdeb 100644
--- a/drivers/char/tpm/tpm_tis_core.c
+++ b/drivers/char/tpm/tpm_tis_core.c
@@ -217,6 +217,8 @@ static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, 
size_t count)
 
expected = be32_to_cpu(*(__be32 *) (buf + 2));
if (expected > count) {
+   dev_err(&chip->dev, "Response too long (wanted %zd, got %d)\n",
+   count, expected);
size = -EIO;
goto out;
}
@@ -283,6 +285,8 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &priv->int_queue, false);
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
+   dev_err(&chip->dev, "Chip not accepting %zd bytes\n",
+   len - count);
rc = -EIO;
goto out_err;
}
@@ -297,6 +301,7 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &priv->int_queue, false);
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) != 0) {
+   dev_err(&chip->dev, "Chip not accepting last byte\n");
rc = -EIO;
goto out_err;
}
@@ -707,8 +712,9 @@ int tpm_tis_core_init(struct device *dev, struct 
tpm_tis_data *priv, int irq,
if (rc < 0)
goto out_err;
 
-   dev_info(dev, "%s TPM (device-id 0x%X, rev-id %d)\n",
+   dev_info(dev, "%s TPM (manufacturer-id 0x%X, device-id 0x%X, rev-id 
%d)\n",
 (chip->flags & TPM_CHIP_FLAG_TPM2) ? "2.0" : "1.2",
+priv->manufacturer_id,
 vendor >> 16, rid);
 
if (!(priv->flags & TPM_TIS_ITPM_POSSIBLE)) {
-- 
1.9.1



[PATCH v8 3/5] tpm: Clean up reading of timeout and duration capabilities

2016-06-21 Thread Ed Swierk
Call tpm_getcap() from tpm_get_timeouts() to eliminate redundant
code. Return all errors to the caller rather than swallowing them
(e.g. when tpm_transmit_cmd() returns nonzero).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 74 +++-
 1 file changed, 27 insertions(+), 47 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index cc1e5bc..b65c139 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -461,9 +461,19 @@ ssize_t tpm_getcap(struct device *dev, __be32 subcap_id, 
cap_t *cap,
tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
tpm_cmd.params.getcap_in.subcap = subcap_id;
}
+
rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, desc);
+
+   if (!rc &&
+   ((subcap_id == TPM_CAP_PROP_TIS_TIMEOUT &&
+ be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 20) ||
+(subcap_id == TPM_CAP_PROP_TIS_DURATION &&
+ be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 16)))
+   rc = -EINVAL;
+
if (!rc)
*cap = tpm_cmd.params.getcap_out.cap;
+
return rc;
 }
 
@@ -504,48 +514,30 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
 
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
-   struct tpm_cmd_t tpm_cmd;
+   cap_t cap;
unsigned long new_timeout[4];
unsigned long old_timeout[4];
-   struct duration_t *duration_cap;
ssize_t rc;
 
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
-
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   "attempting to determine the timeouts");
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
   Execute a startup command. */
-   dev_info(chip->pdev, "Issuing TPM_STARTUP");
+   dev_info(chip->pdev, "Issuing TPM_STARTUP\n");
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
- NULL);
-   }
-   if (rc) {
-   dev_err(chip->pdev,
-   "A TPM error (%zd) occurred attempting to determine the 
timeouts\n",
-   rc);
-   goto duration;
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   "attempting to determine the timeouts");
}
+   if (rc)
+   return rc;
 
-   if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
-   be32_to_cpu(tpm_cmd.header.out.length)
-   != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
-   return -EINVAL;
-
-   old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d);
+   old_timeout[0] = be32_to_cpu(cap.timeout.a);
+   old_timeout[1] = be32_to_cpu(cap.timeout.b);
+   old_timeout[2] = be32_to_cpu(cap.timeout.c);
+   old_timeout[3] = be32_to_cpu(cap.timeout.d);
memcpy(new_timeout, old_timeout, sizeof(new_timeout));
 
/*
@@ -583,29 +575,17 @@ int tpm_get_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_c = usecs_to_jiffies(new_timeout[2]);
chip->vendor.timeout_d = usecs_to_jiffies(new_timeout[3]);
 
-duration:
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_DURATION;
-
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
- "attempting to determine the durations");
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_DURATION, &cap,
+   "attempting to determine the durations");
if (rc)
return rc;
 
-   if (be32_to

[PATCH v8 2/5] tpm: Add optional logging of TPM command durations

2016-06-21 Thread Ed Swierk
Some TPMs violate their own advertised command durations. This is much
easier to debug with data about how long each command actually takes
to complete. Add debug messages that can be enabled by running

  echo -n 'module tpm +p' >/sys/kernel/debug/dynamic_debug/control

on a kernel configured with DYNAMIC_DEBUG=y.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-interface.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index c50637d..cc1e5bc 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -333,13 +333,14 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
 {
ssize_t rc;
u32 count, ordinal;
-   unsigned long stop;
+   unsigned long start, stop;
 
if (bufsiz > TPM_BUFSIZE)
bufsiz = TPM_BUFSIZE;
 
count = be32_to_cpu(*((__be32 *) (buf + 2)));
ordinal = be32_to_cpu(*((__be32 *) (buf + 6)));
+   dev_dbg(chip->pdev, "starting command %d count %d\n", ordinal, count);
if (count == 0)
return -ENODATA;
if (count > bufsiz) {
@@ -360,18 +361,24 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
if (chip->vendor.irq)
goto out_recv;
 
+   start = jiffies;
if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   stop = jiffies + tpm2_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm2_calc_ordinal_duration(chip, ordinal);
else
-   stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm_calc_ordinal_duration(chip, ordinal);
do {
u8 status = chip->ops->status(chip);
if ((status & chip->ops->req_complete_mask) ==
-   chip->ops->req_complete_val)
+   chip->ops->req_complete_val) {
+   dev_dbg(chip->pdev, "completed command %d in %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
goto out_recv;
+   }
 
if (chip->ops->req_canceled(chip, status)) {
dev_err(chip->pdev, "Operation Canceled\n");
+   dev_dbg(chip->pdev, "canceled command %d after %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
rc = -ECANCELED;
goto out;
}
@@ -382,6 +389,8 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
 
chip->ops->cancel(chip);
dev_err(chip->pdev, "Operation Timed out\n");
+   dev_dbg(chip->pdev, "command %d timed out after %d ms\n", ordinal,
+   jiffies_to_msecs(jiffies - start));
rc = -ETIME;
goto out;
 
-- 
1.9.1



[PATCH v8 4/5] tpm: Allow TPM chip drivers to override reported command durations

2016-06-21 Thread Ed Swierk
Some TPM chips report bogus command durations in their capabilities,
just as others report incorrect timeouts. Rework tpm_get_timeouts() to
allow chip drivers to override either via a single callback. Also
clean up handling of TPMs that report milliseconds instead of
microseconds.

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 143 +--
 drivers/char/tpm/tpm_tis.c   |  35 +++---
 include/linux/tpm.h  |   3 +-
 3 files changed, 88 insertions(+), 93 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index b65c139..5f98488 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -514,12 +514,11 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
 
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
-   cap_t cap;
-   unsigned long new_timeout[4];
-   unsigned long old_timeout[4];
-   ssize_t rc;
+   cap_t cap1, cap2;
+   int rc;
+   struct tpm_vendor_specific orig_vendor;
 
-   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap1,
"attempting to determine the timeouts");
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
@@ -528,77 +527,91 @@ int tpm_get_timeouts(struct tpm_chip *chip)
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
-   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap1,
"attempting to determine the timeouts");
}
if (rc)
return rc;
 
-   old_timeout[0] = be32_to_cpu(cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(cap.timeout.d);
-   memcpy(new_timeout, old_timeout, sizeof(new_timeout));
-
-   /*
-* Provide ability for vendor overrides of timeout values in case
-* of misreporting.
-*/
-   if (chip->ops->update_timeouts != NULL)
-   chip->vendor.timeout_adjusted =
-   chip->ops->update_timeouts(chip, new_timeout);
-
-   if (!chip->vendor.timeout_adjusted) {
-   /* Don't overwrite default if value is 0 */
-   if (new_timeout[0] != 0 && new_timeout[0] < 1000) {
-   int i;
-
-   /* timeouts in msec rather usec */
-   for (i = 0; i != ARRAY_SIZE(new_timeout); i++)
-   new_timeout[i] *= 1000;
-   chip->vendor.timeout_adjusted = true;
-   }
-   }
-
-   /* Report adjusted timeouts */
-   if (chip->vendor.timeout_adjusted) {
-   dev_info(chip->pdev,
-HW_ERR "Adjusting reported timeouts: A %lu->%luus B 
%lu->%luus C %lu->%luus D %lu->%luus\n",
-old_timeout[0], new_timeout[0],
-old_timeout[1], new_timeout[1],
-old_timeout[2], new_timeout[2],
-old_timeout[3], new_timeout[3]);
-   }
-
-   chip->vendor.timeout_a = usecs_to_jiffies(new_timeout[0]);
-   chip->vendor.timeout_b = usecs_to_jiffies(new_timeout[1]);
-   chip->vendor.timeout_c = usecs_to_jiffies(new_timeout[2]);
-   chip->vendor.timeout_d = usecs_to_jiffies(new_timeout[3]);
-
-   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_DURATION, &cap,
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_DURATION, &cap2,
"attempting to determine the durations");
if (rc)
return rc;
 
-   chip->vendor.duration[TPM_SHORT] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_short));
-   chip->vendor.duration[TPM_MEDIUM] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_medium));
-   chip->vendor.duration[TPM_LONG] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_long));
+   be32_to_cpus(&cap1.timeout.a);
+   be32_to_cpus(&cap1.timeout.b);
+   be32_to_cpus(&cap1.timeout.c);
+   be32_to_cpus(&cap1.timeout.d);
+   chip->vendor.timeout_a = usecs_to_jiffies(cap1.timeout.a);
+   chip->vendor.timeout_b = usecs_to_jiffies(cap1.timeout.b);
+   chip->vendor.timeout_c = usecs_to_jiffies(cap1.timeout.c);
+   chip->vendor.timeout_d = usecs_to_jiffies(cap1.timeout.d);
 
-   /* The Broadcom BCM0102 chipset in a Dell Latitude D820 gets the above
-* value wrong and apparently reports msecs rather than u

[PATCH v8 0/5] tpm: Command duration logging and chip-specific override

2016-06-21 Thread Ed Swierk
v8: Fix v7 goof-up in tpm_getcap().

v7: Use tpm_getcap() instead of a redundant new function.

v6: Split tpm_get_cap_prop() out of tpm_get_timeouts(); always return
error on TPM command failure.

v5: Use msecs_to_jiffies() instead of * HZ / 1000.

v4: Rework tpm_get_timeouts() to allow overriding both timeouts and
durations via a single callback.

This series
- improves TPM command error reporting
- adds optional logging of TPM command durations
- allows chip-specific override of command durations as well as protocol
  timeouts
- overrides ST19NP18 TPM command duration to avoid lockups

Ed Swierk (5):
  tpm_tis: Improve reporting of IO errors
  tpm: Add optional logging of TPM command durations
  tpm: Clean up reading of timeout and duration capabilities
  tpm: Allow TPM chip drivers to override reported command durations
  tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

 drivers/char/tpm/tpm-interface.c | 212 ---
 drivers/char/tpm/tpm_tis.c   |  49 -
 include/linux/tpm.h  |   3 +-
 3 files changed, 130 insertions(+), 134 deletions(-)

-- 
1.9.1



[PATCH v8 5/5] tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

2016-06-21 Thread Ed Swierk
The STMicro ST19NP18-TPM sometimes takes much longer to execute
commands than it reports in its capabilities. For example, command 186
(TPM_FlushSpecific) has been observed to take 14560 msec to complete,
far longer than the 3000 msec limit for "short" commands reported by
the chip. The behavior has also been seen with command 101
(TPM_GetCapability).

Worse, when the tpm_tis driver attempts to cancel the current command
(by writing commandReady = 1 to TPM_STS_x), the chip locks up
completely, returning all-1s from all memory-mapped register
reads. The lockup can be cleared only by resetting the system.

The occurrence of this excessive command duration depends on the
sequence of commands preceding it. One sequence is creating at least 2
new keys via TPM_CreateWrapKey, then letting the TPM idle for at least
30 seconds, then loading a key via TPM_LoadKey2. The next
TPM_FlushSpecific occasionally takes tens of seconds to
complete. Another sequence is creating many keys in a row without
pause. The TPM_CreateWrapKey operation gets much slower after the
first few iterations, as one would expect when the pool of precomputed
keys is exhausted. Then after a 35-second pause, the same TPM_LoadKey2
followed by TPM_FlushSpecific sequence triggers the behavior.

Our working theory is that this older TPM sometimes pauses to
precompute keys, which modern chips implement as a background
process. Without access to the chip's implementation details it's
impossible to know whether any commands are immune to being blocked by
this process. So it seems safest to ignore the chip's reported command
durations, and use a value much higher than any observed duration,
like 180 sec (which is the duration this chip reports for "long"
commands).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm_tis.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index caf7278..8355b45 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -485,6 +485,12 @@ static void tpm_tis_update_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_d = msecs_to_jiffies(TIS_SHORT_TIMEOUT);
chip->vendor.timeout_adjusted = true;
break;
+   case 0x104a: /* STMicro ST19NP18-TPM */
+   chip->vendor.duration[TPM_SHORT] = 180 * HZ;
+   chip->vendor.duration[TPM_MEDIUM] = 180 * HZ;
+   chip->vendor.duration[TPM_LONG] = 180 * HZ;
+   chip->vendor.duration_adjusted = true;
+   break;
}
 }
 
-- 
1.9.1



[PATCH v8 1/5] tpm_tis: Improve reporting of IO errors

2016-06-21 Thread Ed Swierk
Mysterious TPM behavior can be difficult to track down through all the
layers of software. Add error messages for conditions that should
never happen. Also include the manufacturer ID along with other chip
data printed during init.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm_tis.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 65f7eec..088fa86 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -299,6 +299,8 @@ static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, 
size_t count)
 
expected = be32_to_cpu(*(__be32 *) (buf + 2));
if (expected > count) {
+   dev_err(chip->pdev, "Response too long (wanted %zd, got %d)\n",
+   count, expected);
size = -EIO;
goto out;
}
@@ -366,6 +368,8 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
+   dev_err(chip->pdev, "Chip not accepting %zd bytes\n",
+   len - count);
rc = -EIO;
goto out_err;
}
@@ -378,6 +382,7 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if ((status & TPM_STS_DATA_EXPECT) != 0) {
+   dev_err(chip->pdev, "Chip not accepting last byte\n");
rc = -EIO;
goto out_err;
}
@@ -689,8 +694,9 @@ static int tpm_tis_init(struct device *dev, struct tpm_info 
*tpm_info,
vendor = ioread32(chip->vendor.iobase + TPM_DID_VID(0));
chip->vendor.manufacturer_id = vendor;
 
-   dev_info(dev, "%s TPM (device-id 0x%X, rev-id %d)\n",
+   dev_info(dev, "%s TPM (manufacturer-id 0x%X, device-id 0x%X, rev-id 
%d)\n",
 (chip->flags & TPM_CHIP_FLAG_TPM2) ? "2.0" : "1.2",
+chip->vendor.manufacturer_id,
 vendor >> 16, ioread8(chip->vendor.iobase + TPM_RID(0)));
 
if (!itpm) {
-- 
1.9.1



Re: [PATCH v7 3/5] tpm: Clean up reading of timeout and duration capabilities

2016-06-21 Thread Ed Swierk
On Mon, Jun 20, 2016 at 6:54 PM, Ed Swierk  wrote:
> --- a/drivers/char/tpm/tpm-interface.c
> +++ b/drivers/char/tpm/tpm-interface.c
> @@ -461,9 +461,19 @@ ssize_t tpm_getcap(struct device *dev, __be32 subcap_id, 
> cap_t *cap,
> tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
> tpm_cmd.params.getcap_in.subcap = subcap_id;
> }
> +
> rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, desc);
> +
> +   if (!rc &&
> +   ((subcap_id == TPM_CAP_PROP_TIS_TIMEOUT &&
> + be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 5) 
> ||
> +(subcap_id == TPM_CAP_PROP_TIS_DURATION &&
> + be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 4)))
> +   rc = -EINVAL;
> +

Woops, a totally innocuous last-minute (post-testing) cleanup broke
this code; should be TPM_HEADER_SIZE + 20 and + 16. I'll push out v8
as soon as I redo my tests.

--Ed


[PATCH v7 3/5] tpm: Clean up reading of timeout and duration capabilities

2016-06-20 Thread Ed Swierk
Call tpm_getcap() from tpm_get_timeouts() to eliminate redundant
code. Return all errors to the caller rather than swallowing them
(e.g. when tpm_transmit_cmd() returns nonzero).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 74 +++-
 1 file changed, 27 insertions(+), 47 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index cc1e5bc..73c3ee0 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -461,9 +461,19 @@ ssize_t tpm_getcap(struct device *dev, __be32 subcap_id, 
cap_t *cap,
tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
tpm_cmd.params.getcap_in.subcap = subcap_id;
}
+
rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, desc);
+
+   if (!rc &&
+   ((subcap_id == TPM_CAP_PROP_TIS_TIMEOUT &&
+ be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 5) ||
+(subcap_id == TPM_CAP_PROP_TIS_DURATION &&
+ be32_to_cpu(tpm_cmd.header.out.length) != TPM_HEADER_SIZE + 4)))
+   rc = -EINVAL;
+
if (!rc)
*cap = tpm_cmd.params.getcap_out.cap;
+
return rc;
 }
 
@@ -504,48 +514,30 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
 
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
-   struct tpm_cmd_t tpm_cmd;
+   cap_t cap;
unsigned long new_timeout[4];
unsigned long old_timeout[4];
-   struct duration_t *duration_cap;
ssize_t rc;
 
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
-
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   "attempting to determine the timeouts");
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
   Execute a startup command. */
-   dev_info(chip->pdev, "Issuing TPM_STARTUP");
+   dev_info(chip->pdev, "Issuing TPM_STARTUP\n");
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
- NULL);
-   }
-   if (rc) {
-   dev_err(chip->pdev,
-   "A TPM error (%zd) occurred attempting to determine the 
timeouts\n",
-   rc);
-   goto duration;
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   "attempting to determine the timeouts");
}
+   if (rc)
+   return rc;
 
-   if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
-   be32_to_cpu(tpm_cmd.header.out.length)
-   != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
-   return -EINVAL;
-
-   old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d);
+   old_timeout[0] = be32_to_cpu(cap.timeout.a);
+   old_timeout[1] = be32_to_cpu(cap.timeout.b);
+   old_timeout[2] = be32_to_cpu(cap.timeout.c);
+   old_timeout[3] = be32_to_cpu(cap.timeout.d);
memcpy(new_timeout, old_timeout, sizeof(new_timeout));
 
/*
@@ -583,29 +575,17 @@ int tpm_get_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_c = usecs_to_jiffies(new_timeout[2]);
chip->vendor.timeout_d = usecs_to_jiffies(new_timeout[3]);
 
-duration:
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_DURATION;
-
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
- "attempting to determine the durations");
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_DURATION, &cap,
+   "attempting to determine the durations");
if (rc)
return rc;
 
-   if (be32_to

[PATCH v7 1/5] tpm_tis: Improve reporting of IO errors

2016-06-20 Thread Ed Swierk
Mysterious TPM behavior can be difficult to track down through all the
layers of software. Add error messages for conditions that should
never happen. Also include the manufacturer ID along with other chip
data printed during init.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm_tis.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 65f7eec..088fa86 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -299,6 +299,8 @@ static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, 
size_t count)
 
expected = be32_to_cpu(*(__be32 *) (buf + 2));
if (expected > count) {
+   dev_err(chip->pdev, "Response too long (wanted %zd, got %d)\n",
+   count, expected);
size = -EIO;
goto out;
}
@@ -366,6 +368,8 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
+   dev_err(chip->pdev, "Chip not accepting %zd bytes\n",
+   len - count);
rc = -EIO;
goto out_err;
}
@@ -378,6 +382,7 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if ((status & TPM_STS_DATA_EXPECT) != 0) {
+   dev_err(chip->pdev, "Chip not accepting last byte\n");
rc = -EIO;
goto out_err;
}
@@ -689,8 +694,9 @@ static int tpm_tis_init(struct device *dev, struct tpm_info 
*tpm_info,
vendor = ioread32(chip->vendor.iobase + TPM_DID_VID(0));
chip->vendor.manufacturer_id = vendor;
 
-   dev_info(dev, "%s TPM (device-id 0x%X, rev-id %d)\n",
+   dev_info(dev, "%s TPM (manufacturer-id 0x%X, device-id 0x%X, rev-id 
%d)\n",
 (chip->flags & TPM_CHIP_FLAG_TPM2) ? "2.0" : "1.2",
+chip->vendor.manufacturer_id,
 vendor >> 16, ioread8(chip->vendor.iobase + TPM_RID(0)));
 
if (!itpm) {
-- 
1.9.1



[PATCH v7 0/5] tpm: Command duration logging and chip-specific override

2016-06-20 Thread Ed Swierk
v7: Use tpm_getcap() instead of a redundant new function.

v6: Split tpm_get_cap_prop() out of tpm_get_timeouts(); always return
error on TPM command failure.

v5: Use msecs_to_jiffies() instead of * HZ / 1000.

v4: Rework tpm_get_timeouts() to allow overriding both timeouts and
durations via a single callback.

This series
- improves TPM command error reporting
- adds optional logging of TPM command durations
- allows chip-specific override of command durations as well as protocol
  timeouts
- overrides ST19NP18 TPM command duration to avoid lockups

Ed Swierk (5):
  tpm_tis: Improve reporting of IO errors
  tpm: Add optional logging of TPM command durations
  tpm: Clean up reading of timeout and duration capabilities
  tpm: Allow TPM chip drivers to override reported command durations
  tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

 drivers/char/tpm/tpm-interface.c | 212 ---
 drivers/char/tpm/tpm_tis.c   |  49 -
 include/linux/tpm.h  |   3 +-
 3 files changed, 130 insertions(+), 134 deletions(-)

-- 
1.9.1



[PATCH v7 4/5] tpm: Allow TPM chip drivers to override reported command durations

2016-06-20 Thread Ed Swierk
Some TPM chips report bogus command durations in their capabilities,
just as others report incorrect timeouts. Rework tpm_get_timeouts() to
allow chip drivers to override either via a single callback. Also
clean up handling of TPMs that report milliseconds instead of
microseconds.

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 143 +--
 drivers/char/tpm/tpm_tis.c   |  35 +++---
 include/linux/tpm.h  |   3 +-
 3 files changed, 88 insertions(+), 93 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 73c3ee0..36a6861 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -514,12 +514,11 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
 
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
-   cap_t cap;
-   unsigned long new_timeout[4];
-   unsigned long old_timeout[4];
-   ssize_t rc;
+   cap_t cap1, cap2;
+   int rc;
+   struct tpm_vendor_specific orig_vendor;
 
-   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap1,
"attempting to determine the timeouts");
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
@@ -528,77 +527,91 @@ int tpm_get_timeouts(struct tpm_chip *chip)
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
-   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap,
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_TIMEOUT, &cap1,
"attempting to determine the timeouts");
}
if (rc)
return rc;
 
-   old_timeout[0] = be32_to_cpu(cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(cap.timeout.d);
-   memcpy(new_timeout, old_timeout, sizeof(new_timeout));
-
-   /*
-* Provide ability for vendor overrides of timeout values in case
-* of misreporting.
-*/
-   if (chip->ops->update_timeouts != NULL)
-   chip->vendor.timeout_adjusted =
-   chip->ops->update_timeouts(chip, new_timeout);
-
-   if (!chip->vendor.timeout_adjusted) {
-   /* Don't overwrite default if value is 0 */
-   if (new_timeout[0] != 0 && new_timeout[0] < 1000) {
-   int i;
-
-   /* timeouts in msec rather usec */
-   for (i = 0; i != ARRAY_SIZE(new_timeout); i++)
-   new_timeout[i] *= 1000;
-   chip->vendor.timeout_adjusted = true;
-   }
-   }
-
-   /* Report adjusted timeouts */
-   if (chip->vendor.timeout_adjusted) {
-   dev_info(chip->pdev,
-HW_ERR "Adjusting reported timeouts: A %lu->%luus B 
%lu->%luus C %lu->%luus D %lu->%luus\n",
-old_timeout[0], new_timeout[0],
-old_timeout[1], new_timeout[1],
-old_timeout[2], new_timeout[2],
-old_timeout[3], new_timeout[3]);
-   }
-
-   chip->vendor.timeout_a = usecs_to_jiffies(new_timeout[0]);
-   chip->vendor.timeout_b = usecs_to_jiffies(new_timeout[1]);
-   chip->vendor.timeout_c = usecs_to_jiffies(new_timeout[2]);
-   chip->vendor.timeout_d = usecs_to_jiffies(new_timeout[3]);
-
-   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_DURATION, &cap,
+   rc = tpm_getcap(chip->pdev, TPM_CAP_PROP_TIS_DURATION, &cap2,
"attempting to determine the durations");
if (rc)
return rc;
 
-   chip->vendor.duration[TPM_SHORT] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_short));
-   chip->vendor.duration[TPM_MEDIUM] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_medium));
-   chip->vendor.duration[TPM_LONG] =
-   usecs_to_jiffies(be32_to_cpu(cap.duration.tpm_long));
+   be32_to_cpus(&cap1.timeout.a);
+   be32_to_cpus(&cap1.timeout.b);
+   be32_to_cpus(&cap1.timeout.c);
+   be32_to_cpus(&cap1.timeout.d);
+   chip->vendor.timeout_a = usecs_to_jiffies(cap1.timeout.a);
+   chip->vendor.timeout_b = usecs_to_jiffies(cap1.timeout.b);
+   chip->vendor.timeout_c = usecs_to_jiffies(cap1.timeout.c);
+   chip->vendor.timeout_d = usecs_to_jiffies(cap1.timeout.d);
 
-   /* The Broadcom BCM0102 chipset in a Dell Latitude D820 gets the above
-* value wrong and apparently reports msecs rather than u

[PATCH v7 5/5] tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

2016-06-20 Thread Ed Swierk
The STMicro ST19NP18-TPM sometimes takes much longer to execute
commands than it reports in its capabilities. For example, command 186
(TPM_FlushSpecific) has been observed to take 14560 msec to complete,
far longer than the 3000 msec limit for "short" commands reported by
the chip. The behavior has also been seen with command 101
(TPM_GetCapability).

Worse, when the tpm_tis driver attempts to cancel the current command
(by writing commandReady = 1 to TPM_STS_x), the chip locks up
completely, returning all-1s from all memory-mapped register
reads. The lockup can be cleared only by resetting the system.

The occurrence of this excessive command duration depends on the
sequence of commands preceding it. One sequence is creating at least 2
new keys via TPM_CreateWrapKey, then letting the TPM idle for at least
30 seconds, then loading a key via TPM_LoadKey2. The next
TPM_FlushSpecific occasionally takes tens of seconds to
complete. Another sequence is creating many keys in a row without
pause. The TPM_CreateWrapKey operation gets much slower after the
first few iterations, as one would expect when the pool of precomputed
keys is exhausted. Then after a 35-second pause, the same TPM_LoadKey2
followed by TPM_FlushSpecific sequence triggers the behavior.

Our working theory is that this older TPM sometimes pauses to
precompute keys, which modern chips implement as a background
process. Without access to the chip's implementation details it's
impossible to know whether any commands are immune to being blocked by
this process. So it seems safest to ignore the chip's reported command
durations, and use a value much higher than any observed duration,
like 180 sec (which is the duration this chip reports for "long"
commands).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm_tis.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index caf7278..8355b45 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -485,6 +485,12 @@ static void tpm_tis_update_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_d = msecs_to_jiffies(TIS_SHORT_TIMEOUT);
chip->vendor.timeout_adjusted = true;
break;
+   case 0x104a: /* STMicro ST19NP18-TPM */
+   chip->vendor.duration[TPM_SHORT] = 180 * HZ;
+   chip->vendor.duration[TPM_MEDIUM] = 180 * HZ;
+   chip->vendor.duration[TPM_LONG] = 180 * HZ;
+   chip->vendor.duration_adjusted = true;
+   break;
}
 }
 
-- 
1.9.1



[PATCH v7 2/5] tpm: Add optional logging of TPM command durations

2016-06-20 Thread Ed Swierk
Some TPMs violate their own advertised command durations. This is much
easier to debug with data about how long each command actually takes
to complete. Add debug messages that can be enabled by running

  echo -n 'module tpm +p' >/sys/kernel/debug/dynamic_debug/control

on a kernel configured with DYNAMIC_DEBUG=y.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-interface.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index c50637d..cc1e5bc 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -333,13 +333,14 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
 {
ssize_t rc;
u32 count, ordinal;
-   unsigned long stop;
+   unsigned long start, stop;
 
if (bufsiz > TPM_BUFSIZE)
bufsiz = TPM_BUFSIZE;
 
count = be32_to_cpu(*((__be32 *) (buf + 2)));
ordinal = be32_to_cpu(*((__be32 *) (buf + 6)));
+   dev_dbg(chip->pdev, "starting command %d count %d\n", ordinal, count);
if (count == 0)
return -ENODATA;
if (count > bufsiz) {
@@ -360,18 +361,24 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
if (chip->vendor.irq)
goto out_recv;
 
+   start = jiffies;
if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   stop = jiffies + tpm2_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm2_calc_ordinal_duration(chip, ordinal);
else
-   stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm_calc_ordinal_duration(chip, ordinal);
do {
u8 status = chip->ops->status(chip);
if ((status & chip->ops->req_complete_mask) ==
-   chip->ops->req_complete_val)
+   chip->ops->req_complete_val) {
+   dev_dbg(chip->pdev, "completed command %d in %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
goto out_recv;
+   }
 
if (chip->ops->req_canceled(chip, status)) {
dev_err(chip->pdev, "Operation Canceled\n");
+   dev_dbg(chip->pdev, "canceled command %d after %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
rc = -ECANCELED;
goto out;
}
@@ -382,6 +389,8 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
 
chip->ops->cancel(chip);
dev_err(chip->pdev, "Operation Timed out\n");
+   dev_dbg(chip->pdev, "command %d timed out after %d ms\n", ordinal,
+   jiffies_to_msecs(jiffies - start));
rc = -ETIME;
goto out;
 
-- 
1.9.1



[PATCH v6 2/5] tpm: Add optional logging of TPM command durations

2016-06-10 Thread Ed Swierk
Some TPMs violate their own advertised command durations. This is much
easier to debug with data about how long each command actually takes
to complete. Add debug messages that can be enabled by running

  echo -n 'module tpm +p' >/sys/kernel/debug/dynamic_debug/control

on a kernel configured with DYNAMIC_DEBUG=y.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-interface.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index c50637d..cc1e5bc 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -333,13 +333,14 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
 {
ssize_t rc;
u32 count, ordinal;
-   unsigned long stop;
+   unsigned long start, stop;
 
if (bufsiz > TPM_BUFSIZE)
bufsiz = TPM_BUFSIZE;
 
count = be32_to_cpu(*((__be32 *) (buf + 2)));
ordinal = be32_to_cpu(*((__be32 *) (buf + 6)));
+   dev_dbg(chip->pdev, "starting command %d count %d\n", ordinal, count);
if (count == 0)
return -ENODATA;
if (count > bufsiz) {
@@ -360,18 +361,24 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
if (chip->vendor.irq)
goto out_recv;
 
+   start = jiffies;
if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   stop = jiffies + tpm2_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm2_calc_ordinal_duration(chip, ordinal);
else
-   stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm_calc_ordinal_duration(chip, ordinal);
do {
u8 status = chip->ops->status(chip);
if ((status & chip->ops->req_complete_mask) ==
-   chip->ops->req_complete_val)
+   chip->ops->req_complete_val) {
+   dev_dbg(chip->pdev, "completed command %d in %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
goto out_recv;
+   }
 
if (chip->ops->req_canceled(chip, status)) {
dev_err(chip->pdev, "Operation Canceled\n");
+   dev_dbg(chip->pdev, "canceled command %d after %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
rc = -ECANCELED;
goto out;
}
@@ -382,6 +389,8 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
 
chip->ops->cancel(chip);
dev_err(chip->pdev, "Operation Timed out\n");
+   dev_dbg(chip->pdev, "command %d timed out after %d ms\n", ordinal,
+   jiffies_to_msecs(jiffies - start));
rc = -ETIME;
goto out;
 
-- 
1.9.1



[PATCH v6 3/5] tpm: Factor out reading of timeout and duration capabilities

2016-06-10 Thread Ed Swierk
Factor sending the TPM_GetCapability command and validating the result
from tpm_get_timeouts() into a new function. Return all errors to the
caller rather than swallowing them (e.g. when tpm_transmit_cmd()
returns nonzero).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 96 ++--
 1 file changed, 52 insertions(+), 44 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index cc1e5bc..4d1f62c 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -502,6 +502,52 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
"attempting to start the TPM");
 }
 
+static int tpm_get_cap_prop(struct tpm_chip *chip, __be32 type, int size,
+   cap_t *cap, char *desc)
+{
+   struct tpm_cmd_t tpm_cmd;
+   ssize_t rc;
+
+   tpm_cmd.header.in = tpm_getcap_header;
+   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
+   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
+   tpm_cmd.params.getcap_in.subcap = type;
+   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
+
+   if (rc == TPM_ERR_INVALID_POSTINIT) {
+   /* The TPM is not started, we are the first to talk to it.
+  Execute a startup command. */
+   dev_info(chip->pdev, "Issuing TPM_STARTUP\n");
+   if (tpm_startup(chip, TPM_ST_CLEAR))
+   return rc;
+
+   tpm_cmd.header.in = tpm_getcap_header;
+   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
+   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
+   tpm_cmd.params.getcap_in.subcap = type;
+   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
+ NULL);
+   }
+
+   if (rc) {
+   dev_err(chip->pdev,
+   "Error %zd reading %s\n", rc, desc);
+   return rc;
+   }
+
+   if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
+   be32_to_cpu(tpm_cmd.header.out.length)
+   != sizeof(tpm_cmd.header.out) + sizeof(u32) + size * sizeof(u32)) {
+   dev_err(chip->pdev,
+   "Bad return code or length reading %s\n", desc);
+   return -EINVAL;
+   }
+
+   memcpy(cap, &tpm_cmd.params.getcap_out.cap, sizeof(cap_t));
+
+   return 0;
+}
+
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
struct tpm_cmd_t tpm_cmd;
@@ -510,37 +556,10 @@ int tpm_get_timeouts(struct tpm_chip *chip)
struct duration_t *duration_cap;
ssize_t rc;
 
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
-
-   if (rc == TPM_ERR_INVALID_POSTINIT) {
-   /* The TPM is not started, we are the first to talk to it.
-  Execute a startup command. */
-   dev_info(chip->pdev, "Issuing TPM_STARTUP");
-   if (tpm_startup(chip, TPM_ST_CLEAR))
-   return rc;
-
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
-   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
- NULL);
-   }
-   if (rc) {
-   dev_err(chip->pdev,
-   "A TPM error (%zd) occurred attempting to determine the 
timeouts\n",
-   rc);
-   goto duration;
-   }
-
-   if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
-   be32_to_cpu(tpm_cmd.header.out.length)
-   != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
-   return -EINVAL;
+   rc = tpm_get_cap_prop(chip, TPM_CAP_PROP_TIS_TIMEOUT, 4,
+ &tpm_cmd.params.getcap_out.cap, "timeouts");
+   if (rc)
+   return rc;
 
old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
@@ -583,22 +602,11 @@ int tpm_get_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_c = usecs_to_jiffies(new_timeout[2]);
chip->vendor.timeout_d = usecs_to_jiffies(new_timeout[3]);
 
-duration:
-   tpm_cmd.header.in = tpm_getcap_header;
-   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
-   tpm_cmd.params.getcap

[PATCH v6 4/5] tpm: Allow TPM chip drivers to override reported command durations

2016-06-10 Thread Ed Swierk
Some TPM chips report bogus command durations in their capabilities,
just as others report incorrect timeouts. Rework tpm_get_timeouts() to
allow chip drivers to override either via a single callback. Also
clean up handling of TPMs that report milliseconds instead of
microseconds.

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 139 +--
 drivers/char/tpm/tpm_tis.c   |  35 +++---
 include/linux/tpm.h  |   3 +-
 3 files changed, 85 insertions(+), 92 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 4d1f62c..a14adfd 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -550,83 +550,94 @@ static int tpm_get_cap_prop(struct tpm_chip *chip, __be32 
type, int size,
 
 int tpm_get_timeouts(struct tpm_chip *chip)
 {
-   struct tpm_cmd_t tpm_cmd;
-   unsigned long new_timeout[4];
-   unsigned long old_timeout[4];
-   struct duration_t *duration_cap;
-   ssize_t rc;
+   cap_t cap1, cap2;
+   int rc;
+   struct tpm_vendor_specific orig_vendor;
 
-   rc = tpm_get_cap_prop(chip, TPM_CAP_PROP_TIS_TIMEOUT, 4,
- &tpm_cmd.params.getcap_out.cap, "timeouts");
+   rc = tpm_get_cap_prop(chip, TPM_CAP_PROP_TIS_TIMEOUT, 4, &cap1,
+ "timeouts");
if (rc)
return rc;
 
-   old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d);
-   memcpy(new_timeout, old_timeout, sizeof(new_timeout));
+   rc = tpm_get_cap_prop(chip, TPM_CAP_PROP_TIS_DURATION, 3, &cap2,
+  "durations");
+   if (rc)
+   return rc;
 
-   /*
-* Provide ability for vendor overrides of timeout values in case
-* of misreporting.
-*/
-   if (chip->ops->update_timeouts != NULL)
-   chip->vendor.timeout_adjusted =
-   chip->ops->update_timeouts(chip, new_timeout);
+   be32_to_cpus(&cap1.timeout.a);
+   be32_to_cpus(&cap1.timeout.b);
+   be32_to_cpus(&cap1.timeout.c);
+   be32_to_cpus(&cap1.timeout.d);
+   chip->vendor.timeout_a = usecs_to_jiffies(cap1.timeout.a);
+   chip->vendor.timeout_b = usecs_to_jiffies(cap1.timeout.b);
+   chip->vendor.timeout_c = usecs_to_jiffies(cap1.timeout.c);
+   chip->vendor.timeout_d = usecs_to_jiffies(cap1.timeout.d);
 
-   if (!chip->vendor.timeout_adjusted) {
-   /* Don't overwrite default if value is 0 */
-   if (new_timeout[0] != 0 && new_timeout[0] < 1000) {
-   int i;
-
-   /* timeouts in msec rather usec */
-   for (i = 0; i != ARRAY_SIZE(new_timeout); i++)
-   new_timeout[i] *= 1000;
-   chip->vendor.timeout_adjusted = true;
-   }
+   /* Some TPMs report timeouts in milliseconds rather than
+  microseconds. Use a value between 1 and 1000 as an
+  indication that this is the case. */
+   if (cap1.timeout.a > 0 && cap1.timeout.a < 1000) {
+   chip->vendor.timeout_a = msecs_to_jiffies(cap1.timeout.a);
+   chip->vendor.timeout_b = msecs_to_jiffies(cap1.timeout.b);
+   chip->vendor.timeout_c = msecs_to_jiffies(cap1.timeout.c);
+   chip->vendor.timeout_d = msecs_to_jiffies(cap1.timeout.d);
+   chip->vendor.timeout_adjusted = true;
}
 
-   /* Report adjusted timeouts */
+   be32_to_cpus(&cap2.duration.tpm_short);
+   be32_to_cpus(&cap2.duration.tpm_medium);
+   be32_to_cpus(&cap2.duration.tpm_long);
+   chip->vendor.duration[TPM_SHORT] =
+   usecs_to_jiffies(cap2.duration.tpm_short);
+   chip->vendor.duration[TPM_MEDIUM] =
+   usecs_to_jiffies(cap2.duration.tpm_medium);
+   chip->vendor.duration[TPM_LONG] =
+   usecs_to_jiffies(cap2.duration.tpm_long);
+
+   memcpy(&orig_vendor, &chip->vendor, sizeof(orig_vendor));
+
+   /* Interpret duration values between 1 and 1 as
+  milliseconds to deal with TPMs like the Broadcom BCM0102 in
+  the Dell Latitude D820. */
+   if (cap2.duration.tpm_short > 0 && cap2.duration.tpm_short < 1) {
+   chip->vendor.duration[TPM_SHORT] =
+   msecs_to_jiffies(cap2.duration.tpm_short);
+   chip->vendor.duration[TPM_MEDIUM] =
+   msecs_to_jiffies(cap2.dur

[PATCH v6 0/5] tpm: Command duration logging and chip-specific override

2016-06-10 Thread Ed Swierk
v6: Split tpm_get_cap_prop() out of tpm_get_timeouts(); always return
error on TPM command failure.

v5: Use msecs_to_jiffies() instead of * HZ / 1000.

v4: Rework tpm_get_timeouts() to allow overriding both timeouts and
durations via a single callback.

This series
- improves TPM command error reporting
- adds optional logging of TPM command durations
- allows chip-specific override of command durations as well as protocol
  timeouts
- overrides ST19NP18 TPM command duration to avoid lockups

Ed Swierk (5):
  tpm_tis: Improve reporting of IO errors
  tpm: Add optional logging of TPM command durations
  tpm: Factor out reading of timeout and duration capabilities
  tpm: Allow TPM chip drivers to override reported command durations
  tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

 drivers/char/tpm/tpm-interface.c | 192 ++-
 drivers/char/tpm/tpm_tis.c   |  48 +-
 include/linux/tpm.h  |   3 +-
 3 files changed, 132 insertions(+), 111 deletions(-)

-- 
1.9.1



[PATCH v6 5/5] tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

2016-06-10 Thread Ed Swierk
The STMicro ST19NP18-TPM sometimes takes much longer to execute
commands than it reports in its capabilities. For example, command 186
(TPM_FlushSpecific) has been observed to take 14560 msec to complete,
far longer than the 3000 msec limit for "short" commands reported by
the chip. The behavior has also been seen with command 101
(TPM_GetCapability).

Worse, when the tpm_tis driver attempts to cancel the current command
(by writing commandReady = 1 to TPM_STS_x), the chip locks up
completely, returning all-1s from all memory-mapped register
reads. The lockup can be cleared only by resetting the system.

The occurrence of this excessive command duration depends on the
sequence of commands preceding it. One sequence is creating at least 2
new keys via TPM_CreateWrapKey, then letting the TPM idle for at least
30 seconds, then loading a key via TPM_LoadKey2. The next
TPM_FlushSpecific occasionally takes tens of seconds to
complete. Another sequence is creating many keys in a row without
pause. The TPM_CreateWrapKey operation gets much slower after the
first few iterations, as one would expect when the pool of precomputed
keys is exhausted. Then after a 35-second pause, the same TPM_LoadKey2
followed by TPM_FlushSpecific sequence triggers the behavior.

Our working theory is that this older TPM sometimes pauses to
precompute keys, which modern chips implement as a background
process. Without access to the chip's implementation details it's
impossible to know whether any commands are immune to being blocked by
this process. So it seems safest to ignore the chip's reported command
durations, and use a value much higher than any observed duration,
like 180 sec (which is the duration this chip reports for "long"
commands).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm_tis.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index caf7278..862c502 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -485,6 +485,11 @@ static void tpm_tis_update_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_d = msecs_to_jiffies(TIS_SHORT_TIMEOUT);
chip->vendor.timeout_adjusted = true;
break;
+   case 0x104a: /* STMicro ST19NP18-TPM */
+   chip->vendor.duration[TPM_SHORT] = 180 * HZ;
+   chip->vendor.duration[TPM_MEDIUM] = 180 * HZ;
+   chip->vendor.duration[TPM_LONG] = 180 * HZ;
+   chip->vendor.duration_adjusted = true;
}
 }
 
-- 
1.9.1



[PATCH v6 1/5] tpm_tis: Improve reporting of IO errors

2016-06-10 Thread Ed Swierk
Mysterious TPM behavior can be difficult to track down through all the
layers of software. Add error messages for conditions that should
never happen. Also include the manufacturer ID along with other chip
data printed during init.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm_tis.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 65f7eec..088fa86 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -299,6 +299,8 @@ static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, 
size_t count)
 
expected = be32_to_cpu(*(__be32 *) (buf + 2));
if (expected > count) {
+   dev_err(chip->pdev, "Response too long (wanted %zd, got %d)\n",
+   count, expected);
size = -EIO;
goto out;
}
@@ -366,6 +368,8 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
+   dev_err(chip->pdev, "Chip not accepting %zd bytes\n",
+   len - count);
rc = -EIO;
goto out_err;
}
@@ -378,6 +382,7 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if ((status & TPM_STS_DATA_EXPECT) != 0) {
+   dev_err(chip->pdev, "Chip not accepting last byte\n");
rc = -EIO;
goto out_err;
}
@@ -689,8 +694,9 @@ static int tpm_tis_init(struct device *dev, struct tpm_info 
*tpm_info,
vendor = ioread32(chip->vendor.iobase + TPM_DID_VID(0));
chip->vendor.manufacturer_id = vendor;
 
-   dev_info(dev, "%s TPM (device-id 0x%X, rev-id %d)\n",
+   dev_info(dev, "%s TPM (manufacturer-id 0x%X, device-id 0x%X, rev-id 
%d)\n",
 (chip->flags & TPM_CHIP_FLAG_TPM2) ? "2.0" : "1.2",
+chip->vendor.manufacturer_id,
 vendor >> 16, ioread8(chip->vendor.iobase + TPM_RID(0)));
 
if (!itpm) {
-- 
1.9.1



Re: [PATCH v5 3/4] tpm: Allow TPM chip drivers to override reported command durations

2016-06-10 Thread Ed Swierk
On Fri, Jun 10, 2016 at 12:42 PM, Jarkko Sakkinen
 wrote:
> On Fri, Jun 10, 2016 at 10:34:15AM -0700, Ed Swierk wrote:
>> On Fri, Jun 10, 2016 at 5:19 AM, Jarkko Sakkinen
>>  wrote:
>> > On Wed, Jun 08, 2016 at 04:00:17PM -0700, Ed Swierk wrote:
>> >> Some TPM chips report bogus command durations in their capabilities,
>> >> just as others report incorrect timeouts. Rework tpm_get_timeouts()
>> >> to allow chip drivers to override either via a single callback.
>> >> Also clean up handling of TPMs that report milliseconds instead of
>> >> microseconds.
>> >>
>> >> Signed-off-by: Ed Swierk 
>> >> ---
>> >>  drivers/char/tpm/tpm-interface.c | 177 
>> >> +--
>> >>  drivers/char/tpm/tpm_tis.c   |  35 ++--
>> >>  include/linux/tpm.h  |   3 +-
>> >>  3 files changed, 106 insertions(+), 109 deletions(-)
>> >>
>> >> diff --git a/drivers/char/tpm/tpm-interface.c 
>> >> b/drivers/char/tpm/tpm-interface.c
>> >> index cc1e5bc..b8a08bb 100644
>> >> --- a/drivers/char/tpm/tpm-interface.c
>> >> +++ b/drivers/char/tpm/tpm-interface.c
>> >> @@ -502,123 +502,138 @@ static int tpm_startup(struct tpm_chip *chip, 
>> >> __be16 startup_type)
>> >>   "attempting to start the TPM");
>> >>  }
>> >>
>> >> -int tpm_get_timeouts(struct tpm_chip *chip)
>> >> +static int tpm_get_cap_prop(struct tpm_chip *chip, __be32 type, int size,
>> >> + cap_t *cap, char *desc)
>> >>  {
>> >>   struct tpm_cmd_t tpm_cmd;
>> >> - unsigned long new_timeout[4];
>> >> - unsigned long old_timeout[4];
>> >> - struct duration_t *duration_cap;
>> >>   ssize_t rc;
>> >>
>> >>   tpm_cmd.header.in = tpm_getcap_header;
>> >>   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
>> >>   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
>> >> - tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
>> >> + tpm_cmd.params.getcap_in.subcap = type;
>> >>   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, 
>> >> NULL);
>> >>
>> >>   if (rc == TPM_ERR_INVALID_POSTINIT) {
>> >>   /* The TPM is not started, we are the first to talk to it.
>> >>  Execute a startup command. */
>> >> - dev_info(chip->pdev, "Issuing TPM_STARTUP");
>> >> + dev_info(chip->pdev, "Issuing TPM_STARTUP\n");
>> >>   if (tpm_startup(chip, TPM_ST_CLEAR))
>> >>   return rc;
>> >>
>> >>   tpm_cmd.header.in = tpm_getcap_header;
>> >>   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
>> >>   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
>> >> - tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
>> >> + tpm_cmd.params.getcap_in.subcap = type;
>> >>   rc = tpm_transmit_cmd(chip, &tpm_cmd, 
>> >> TPM_INTERNAL_RESULT_SIZE,
>> >> NULL);
>> >>   }
>> >> +
>> >>   if (rc) {
>> >>   dev_err(chip->pdev,
>> >> - "A TPM error (%zd) occurred attempting to determine 
>> >> the timeouts\n",
>> >> - rc);
>> >> - goto duration;
>> >> + "Error %zd reading %s\n", rc, desc);
>> >> + return -EINVAL;
>> >>   }
>> >>
>> >>   if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
>> >>   be32_to_cpu(tpm_cmd.header.out.length)
>> >> - != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
>> >> + != sizeof(tpm_cmd.header.out) + sizeof(u32) + size * 
>> >> sizeof(u32)) {
>> >> + dev_err(chip->pdev,
>> >> + "Bad return code or length reading %s\n", desc);
>> >>   return -EINVAL;
>> >> -
>> >> - old_timeout[0] = 
>> >> be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
>> >> -   

Re: [PATCH v5 3/4] tpm: Allow TPM chip drivers to override reported command durations

2016-06-10 Thread Ed Swierk
On Fri, Jun 10, 2016 at 5:19 AM, Jarkko Sakkinen
 wrote:
> On Wed, Jun 08, 2016 at 04:00:17PM -0700, Ed Swierk wrote:
>> Some TPM chips report bogus command durations in their capabilities,
>> just as others report incorrect timeouts. Rework tpm_get_timeouts()
>> to allow chip drivers to override either via a single callback.
>> Also clean up handling of TPMs that report milliseconds instead of
>> microseconds.
>>
>> Signed-off-by: Ed Swierk 
>> ---
>>  drivers/char/tpm/tpm-interface.c | 177 
>> +--
>>  drivers/char/tpm/tpm_tis.c   |  35 ++--
>>  include/linux/tpm.h  |   3 +-
>>  3 files changed, 106 insertions(+), 109 deletions(-)
>>
>> diff --git a/drivers/char/tpm/tpm-interface.c 
>> b/drivers/char/tpm/tpm-interface.c
>> index cc1e5bc..b8a08bb 100644
>> --- a/drivers/char/tpm/tpm-interface.c
>> +++ b/drivers/char/tpm/tpm-interface.c
>> @@ -502,123 +502,138 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
>> startup_type)
>>   "attempting to start the TPM");
>>  }
>>
>> -int tpm_get_timeouts(struct tpm_chip *chip)
>> +static int tpm_get_cap_prop(struct tpm_chip *chip, __be32 type, int size,
>> + cap_t *cap, char *desc)
>>  {
>>   struct tpm_cmd_t tpm_cmd;
>> - unsigned long new_timeout[4];
>> - unsigned long old_timeout[4];
>> - struct duration_t *duration_cap;
>>   ssize_t rc;
>>
>>   tpm_cmd.header.in = tpm_getcap_header;
>>   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
>>   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
>> - tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
>> + tpm_cmd.params.getcap_in.subcap = type;
>>   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
>>
>>   if (rc == TPM_ERR_INVALID_POSTINIT) {
>>   /* The TPM is not started, we are the first to talk to it.
>>  Execute a startup command. */
>> - dev_info(chip->pdev, "Issuing TPM_STARTUP");
>> + dev_info(chip->pdev, "Issuing TPM_STARTUP\n");
>>   if (tpm_startup(chip, TPM_ST_CLEAR))
>>   return rc;
>>
>>   tpm_cmd.header.in = tpm_getcap_header;
>>   tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
>>   tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
>> - tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
>> + tpm_cmd.params.getcap_in.subcap = type;
>>   rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
>> NULL);
>>   }
>> +
>>   if (rc) {
>>   dev_err(chip->pdev,
>> - "A TPM error (%zd) occurred attempting to determine 
>> the timeouts\n",
>> - rc);
>> - goto duration;
>> + "Error %zd reading %s\n", rc, desc);
>> + return -EINVAL;
>>   }
>>
>>   if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
>>   be32_to_cpu(tpm_cmd.header.out.length)
>> - != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
>> + != sizeof(tpm_cmd.header.out) + sizeof(u32) + size * sizeof(u32)) {
>> + dev_err(chip->pdev,
>> + "Bad return code or length reading %s\n", desc);
>>   return -EINVAL;
>> -
>> - old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
>> - old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
>> - old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c);
>> - old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d);
>> - memcpy(new_timeout, old_timeout, sizeof(new_timeout));
>> -
>> - /*
>> -  * Provide ability for vendor overrides of timeout values in case
>> -  * of misreporting.
>> -  */
>> - if (chip->ops->update_timeouts != NULL)
>> - chip->vendor.timeout_adjusted =
>> - chip->ops->update_timeouts(chip, new_timeout);
>> -
>> - if (!chip->vendor.timeout_adjusted) {
>> - /* Don't overwrite default if value is 0 */
>> - if (new_timeout[0] != 0 && new_timeout[0] < 1000) {
>>

[PATCH v5 4/4] tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

2016-06-08 Thread Ed Swierk
The STMicro ST19NP18-TPM sometimes takes much longer to execute
commands than it reports in its capabilities. For example, command 186
(TPM_FlushSpecific) has been observed to take 14560 msec to complete,
far longer than the 3000 msec limit for "short" commands reported by
the chip. The behavior has also been seen with command 101
(TPM_GetCapability).

Worse, when the tpm_tis driver attempts to cancel the current command
(by writing commandReady = 1 to TPM_STS_x), the chip locks up
completely, returning all-1s from all memory-mapped register
reads. The lockup can be cleared only by resetting the system.

The occurrence of this excessive command duration depends on the
sequence of commands preceding it. One sequence is creating at least 2
new keys via TPM_CreateWrapKey, then letting the TPM idle for at least
30 seconds, then loading a key via TPM_LoadKey2. The next
TPM_FlushSpecific occasionally takes tens of seconds to
complete. Another sequence is creating many keys in a row without
pause. The TPM_CreateWrapKey operation gets much slower after the
first few iterations, as one would expect when the pool of precomputed
keys is exhausted. Then after a 35-second pause, the same TPM_LoadKey2
followed by TPM_FlushSpecific sequence triggers the behavior.

Our working theory is that this older TPM sometimes pauses to
precompute keys, which modern chips implement as a background
process. Without access to the chip's implementation details it's
impossible to know whether any commands are immune to being blocked by
this process. So it seems safest to ignore the chip's reported command
durations, and use a value much higher than any observed duration,
like 180 sec (which is the duration this chip reports for "long"
commands).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm_tis.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index caf7278..862c502 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -485,6 +485,11 @@ static void tpm_tis_update_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_d = msecs_to_jiffies(TIS_SHORT_TIMEOUT);
chip->vendor.timeout_adjusted = true;
break;
+   case 0x104a: /* STMicro ST19NP18-TPM */
+   chip->vendor.duration[TPM_SHORT] = 180 * HZ;
+   chip->vendor.duration[TPM_MEDIUM] = 180 * HZ;
+   chip->vendor.duration[TPM_LONG] = 180 * HZ;
+   chip->vendor.duration_adjusted = true;
}
 }
 
-- 
1.9.1



[PATCH v5 1/4] tpm_tis: Improve reporting of IO errors

2016-06-08 Thread Ed Swierk
Mysterious TPM behavior can be difficult to track down through all the
layers of software. Add error messages for conditions that should
never happen. Also include the manufacturer ID along with other chip
data printed during init.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm_tis.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 65f7eec..088fa86 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -299,6 +299,8 @@ static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, 
size_t count)
 
expected = be32_to_cpu(*(__be32 *) (buf + 2));
if (expected > count) {
+   dev_err(chip->pdev, "Response too long (wanted %zd, got %d)\n",
+   count, expected);
size = -EIO;
goto out;
}
@@ -366,6 +368,8 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
+   dev_err(chip->pdev, "Chip not accepting %zd bytes\n",
+   len - count);
rc = -EIO;
goto out_err;
}
@@ -378,6 +382,7 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if ((status & TPM_STS_DATA_EXPECT) != 0) {
+   dev_err(chip->pdev, "Chip not accepting last byte\n");
rc = -EIO;
goto out_err;
}
@@ -689,8 +694,9 @@ static int tpm_tis_init(struct device *dev, struct tpm_info 
*tpm_info,
vendor = ioread32(chip->vendor.iobase + TPM_DID_VID(0));
chip->vendor.manufacturer_id = vendor;
 
-   dev_info(dev, "%s TPM (device-id 0x%X, rev-id %d)\n",
+   dev_info(dev, "%s TPM (manufacturer-id 0x%X, device-id 0x%X, rev-id 
%d)\n",
 (chip->flags & TPM_CHIP_FLAG_TPM2) ? "2.0" : "1.2",
+chip->vendor.manufacturer_id,
 vendor >> 16, ioread8(chip->vendor.iobase + TPM_RID(0)));
 
if (!itpm) {
-- 
1.9.1



[PATCH v5 0/4] tpm: Command duration logging and chip-specific override

2016-06-08 Thread Ed Swierk
v5: Use msecs_to_jiffies() instead of * HZ / 1000.

v4: Rework tpm_get_timeouts() to allow overriding both timeouts and
durations via a single callback.

This series
- improves TPM command error reporting
- adds optional logging of TPM command durations
- allows chip-specific override of command durations as well as protocol
  timeouts
- overrides ST19NP18 TPM command duration to avoid lockups

Ed Swierk (4):
  tpm_tis: Improve reporting of IO errors
  tpm: Add optional logging of TPM command durations
  tpm: Allow TPM chip drivers to override reported command durations
  tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

 drivers/char/tpm/tpm-interface.c | 194 ++-
 drivers/char/tpm/tpm_tis.c   |  48 +-
 include/linux/tpm.h  |   3 +-
 3 files changed, 131 insertions(+), 114 deletions(-)

-- 
1.9.1



[PATCH v5 2/4] tpm: Add optional logging of TPM command durations

2016-06-08 Thread Ed Swierk
Some TPMs violate their own advertised command durations. This is much
easier to debug with data about how long each command actually takes
to complete. Add debug messages that can be enabled by running

  echo -n 'module tpm +p' >/sys/kernel/debug/dynamic_debug/control

on a kernel configured with DYNAMIC_DEBUG=y.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-interface.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index c50637d..cc1e5bc 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -333,13 +333,14 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
 {
ssize_t rc;
u32 count, ordinal;
-   unsigned long stop;
+   unsigned long start, stop;
 
if (bufsiz > TPM_BUFSIZE)
bufsiz = TPM_BUFSIZE;
 
count = be32_to_cpu(*((__be32 *) (buf + 2)));
ordinal = be32_to_cpu(*((__be32 *) (buf + 6)));
+   dev_dbg(chip->pdev, "starting command %d count %d\n", ordinal, count);
if (count == 0)
return -ENODATA;
if (count > bufsiz) {
@@ -360,18 +361,24 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
if (chip->vendor.irq)
goto out_recv;
 
+   start = jiffies;
if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   stop = jiffies + tpm2_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm2_calc_ordinal_duration(chip, ordinal);
else
-   stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm_calc_ordinal_duration(chip, ordinal);
do {
u8 status = chip->ops->status(chip);
if ((status & chip->ops->req_complete_mask) ==
-   chip->ops->req_complete_val)
+   chip->ops->req_complete_val) {
+   dev_dbg(chip->pdev, "completed command %d in %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
goto out_recv;
+   }
 
if (chip->ops->req_canceled(chip, status)) {
dev_err(chip->pdev, "Operation Canceled\n");
+   dev_dbg(chip->pdev, "canceled command %d after %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
rc = -ECANCELED;
goto out;
}
@@ -382,6 +389,8 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
 
chip->ops->cancel(chip);
dev_err(chip->pdev, "Operation Timed out\n");
+   dev_dbg(chip->pdev, "command %d timed out after %d ms\n", ordinal,
+   jiffies_to_msecs(jiffies - start));
rc = -ETIME;
goto out;
 
-- 
1.9.1



[PATCH v5 3/4] tpm: Allow TPM chip drivers to override reported command durations

2016-06-08 Thread Ed Swierk
Some TPM chips report bogus command durations in their capabilities,
just as others report incorrect timeouts. Rework tpm_get_timeouts()
to allow chip drivers to override either via a single callback.
Also clean up handling of TPMs that report milliseconds instead of
microseconds.

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 177 +--
 drivers/char/tpm/tpm_tis.c   |  35 ++--
 include/linux/tpm.h  |   3 +-
 3 files changed, 106 insertions(+), 109 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index cc1e5bc..b8a08bb 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -502,123 +502,138 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
"attempting to start the TPM");
 }
 
-int tpm_get_timeouts(struct tpm_chip *chip)
+static int tpm_get_cap_prop(struct tpm_chip *chip, __be32 type, int size,
+   cap_t *cap, char *desc)
 {
struct tpm_cmd_t tpm_cmd;
-   unsigned long new_timeout[4];
-   unsigned long old_timeout[4];
-   struct duration_t *duration_cap;
ssize_t rc;
 
tpm_cmd.header.in = tpm_getcap_header;
tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
+   tpm_cmd.params.getcap_in.subcap = type;
rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
 
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
   Execute a startup command. */
-   dev_info(chip->pdev, "Issuing TPM_STARTUP");
+   dev_info(chip->pdev, "Issuing TPM_STARTUP\n");
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
tpm_cmd.header.in = tpm_getcap_header;
tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
+   tpm_cmd.params.getcap_in.subcap = type;
rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
  NULL);
}
+
if (rc) {
dev_err(chip->pdev,
-   "A TPM error (%zd) occurred attempting to determine the 
timeouts\n",
-   rc);
-   goto duration;
+   "Error %zd reading %s\n", rc, desc);
+   return -EINVAL;
}
 
if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
be32_to_cpu(tpm_cmd.header.out.length)
-   != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
+   != sizeof(tpm_cmd.header.out) + sizeof(u32) + size * sizeof(u32)) {
+   dev_err(chip->pdev,
+   "Bad return code or length reading %s\n", desc);
return -EINVAL;
-
-   old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d);
-   memcpy(new_timeout, old_timeout, sizeof(new_timeout));
-
-   /*
-* Provide ability for vendor overrides of timeout values in case
-* of misreporting.
-*/
-   if (chip->ops->update_timeouts != NULL)
-   chip->vendor.timeout_adjusted =
-   chip->ops->update_timeouts(chip, new_timeout);
-
-   if (!chip->vendor.timeout_adjusted) {
-   /* Don't overwrite default if value is 0 */
-   if (new_timeout[0] != 0 && new_timeout[0] < 1000) {
-   int i;
-
-   /* timeouts in msec rather usec */
-   for (i = 0; i != ARRAY_SIZE(new_timeout); i++)
-   new_timeout[i] *= 1000;
-   chip->vendor.timeout_adjusted = true;
-   }
}
 
-   /* Report adjusted timeouts */
-   if (chip->vendor.timeout_adjusted) {
-   dev_info(chip->pdev,
-HW_ERR "Adjusting reported timeouts: A %lu->%luus B 
%lu->%luus C %lu->%luus D %lu->%luus\n",
-old_timeout[0], new_timeout[0],
-old_timeout[1], new_timeout[1],
-old_timeout[2], new_timeout[2],
-old_timeout[3], new_timeout[3]);
-   }
+   memcpy(cap, &tpm_cmd.params.getcap_out.ca

Re: [tpmdd-devel] [PATCH v4 3/4] tpm: Allow TPM chip drivers to override reported command durations

2016-06-08 Thread Ed Swierk
On Wed, Jun 8, 2016 at 12:05 PM, Jason Gunthorpe
 wrote:
> On Tue, Jun 07, 2016 at 05:45:39PM -0700, Ed Swierk wrote:
>> + case 0x32041114: /* Atmel 3204 */
>> + chip->vendor.timeout_a = TIS_SHORT_TIMEOUT * HZ / 1000;
>> + chip->vendor.timeout_b = TIS_LONG_TIMEOUT * HZ / 1000;
>> + chip->vendor.timeout_c = TIS_SHORT_TIMEOUT * HZ / 1000;
>> + chip->vendor.timeout_d = TIS_SHORT_TIMEOUT * HZ / 1000;
>
> Shouldn't these use  msec_to_jiffies?

Indeed * HZ / 1000 can be one jiffy less than msec_to_jiffies(),
depending on HZ. I'll change it.

--Ed


[PATCH v4 2/4] tpm: Add optional logging of TPM command durations

2016-06-07 Thread Ed Swierk
Some TPMs violate their own advertised command durations. This is much
easier to debug with data about how long each command actually takes
to complete. Add debug messages that can be enabled by running

  echo -n 'module tpm +p' >/sys/kernel/debug/dynamic_debug/control

on a kernel configured with DYNAMIC_DEBUG=y.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm-interface.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index c50637d..cc1e5bc 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -333,13 +333,14 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
 {
ssize_t rc;
u32 count, ordinal;
-   unsigned long stop;
+   unsigned long start, stop;
 
if (bufsiz > TPM_BUFSIZE)
bufsiz = TPM_BUFSIZE;
 
count = be32_to_cpu(*((__be32 *) (buf + 2)));
ordinal = be32_to_cpu(*((__be32 *) (buf + 6)));
+   dev_dbg(chip->pdev, "starting command %d count %d\n", ordinal, count);
if (count == 0)
return -ENODATA;
if (count > bufsiz) {
@@ -360,18 +361,24 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char 
*buf,
if (chip->vendor.irq)
goto out_recv;
 
+   start = jiffies;
if (chip->flags & TPM_CHIP_FLAG_TPM2)
-   stop = jiffies + tpm2_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm2_calc_ordinal_duration(chip, ordinal);
else
-   stop = jiffies + tpm_calc_ordinal_duration(chip, ordinal);
+   stop = start + tpm_calc_ordinal_duration(chip, ordinal);
do {
u8 status = chip->ops->status(chip);
if ((status & chip->ops->req_complete_mask) ==
-   chip->ops->req_complete_val)
+   chip->ops->req_complete_val) {
+   dev_dbg(chip->pdev, "completed command %d in %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
goto out_recv;
+   }
 
if (chip->ops->req_canceled(chip, status)) {
dev_err(chip->pdev, "Operation Canceled\n");
+   dev_dbg(chip->pdev, "canceled command %d after %d ms\n",
+   ordinal, jiffies_to_msecs(jiffies - start));
rc = -ECANCELED;
goto out;
}
@@ -382,6 +389,8 @@ ssize_t tpm_transmit(struct tpm_chip *chip, const char *buf,
 
chip->ops->cancel(chip);
dev_err(chip->pdev, "Operation Timed out\n");
+   dev_dbg(chip->pdev, "command %d timed out after %d ms\n", ordinal,
+   jiffies_to_msecs(jiffies - start));
rc = -ETIME;
goto out;
 
-- 
1.9.1



[PATCH v4 1/4] tpm_tis: Improve reporting of IO errors

2016-06-07 Thread Ed Swierk
Mysterious TPM behavior can be difficult to track down through all the
layers of software. Add error messages for conditions that should
never happen. Also include the manufacturer ID along with other chip
data printed during init.

Signed-off-by: Ed Swierk 
Reviewed-by: Jarkko Sakkinen 
---
 drivers/char/tpm/tpm_tis.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 65f7eec..088fa86 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -299,6 +299,8 @@ static int tpm_tis_recv(struct tpm_chip *chip, u8 *buf, 
size_t count)
 
expected = be32_to_cpu(*(__be32 *) (buf + 2));
if (expected > count) {
+   dev_err(chip->pdev, "Response too long (wanted %zd, got %d)\n",
+   count, expected);
size = -EIO;
goto out;
}
@@ -366,6 +368,8 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0) {
+   dev_err(chip->pdev, "Chip not accepting %zd bytes\n",
+   len - count);
rc = -EIO;
goto out_err;
}
@@ -378,6 +382,7 @@ static int tpm_tis_send_data(struct tpm_chip *chip, u8 
*buf, size_t len)
  &chip->vendor.int_queue, false);
status = tpm_tis_status(chip);
if ((status & TPM_STS_DATA_EXPECT) != 0) {
+   dev_err(chip->pdev, "Chip not accepting last byte\n");
rc = -EIO;
goto out_err;
}
@@ -689,8 +694,9 @@ static int tpm_tis_init(struct device *dev, struct tpm_info 
*tpm_info,
vendor = ioread32(chip->vendor.iobase + TPM_DID_VID(0));
chip->vendor.manufacturer_id = vendor;
 
-   dev_info(dev, "%s TPM (device-id 0x%X, rev-id %d)\n",
+   dev_info(dev, "%s TPM (manufacturer-id 0x%X, device-id 0x%X, rev-id 
%d)\n",
 (chip->flags & TPM_CHIP_FLAG_TPM2) ? "2.0" : "1.2",
+chip->vendor.manufacturer_id,
 vendor >> 16, ioread8(chip->vendor.iobase + TPM_RID(0)));
 
if (!itpm) {
-- 
1.9.1



[PATCH v4 0/4] tpm: Command duration logging and chip-specific override

2016-06-07 Thread Ed Swierk
v4: Rework tpm_get_timeouts() to allow overriding both timeouts and
durations via a single callback.

This series
- improves TPM command error reporting
- adds optional logging of TPM command durations
- allows chip-specific override of command durations as well as protocol
  timeouts
- overrides ST19NP18 TPM command duration to avoid lockups

Ed Swierk (4):
  tpm_tis: Improve reporting of IO errors
  tpm: Add optional logging of TPM command durations
  tpm: Allow TPM chip drivers to override reported command durations
  tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

 drivers/char/tpm/tpm-interface.c | 194 ++-
 drivers/char/tpm/tpm_tis.c   |  48 +-
 include/linux/tpm.h  |   3 +-
 3 files changed, 131 insertions(+), 114 deletions(-)

-- 
1.9.1



[PATCH v4 4/4] tpm_tis: Increase ST19NP18 TPM command duration to avoid chip lockup

2016-06-07 Thread Ed Swierk
The STMicro ST19NP18-TPM sometimes takes much longer to execute
commands than it reports in its capabilities. For example, command 186
(TPM_FlushSpecific) has been observed to take 14560 msec to complete,
far longer than the 3000 msec limit for "short" commands reported by
the chip. The behavior has also been seen with command 101
(TPM_GetCapability).

Worse, when the tpm_tis driver attempts to cancel the current command
(by writing commandReady = 1 to TPM_STS_x), the chip locks up
completely, returning all-1s from all memory-mapped register
reads. The lockup can be cleared only by resetting the system.

The occurrence of this excessive command duration depends on the
sequence of commands preceding it. One sequence is creating at least 2
new keys via TPM_CreateWrapKey, then letting the TPM idle for at least
30 seconds, then loading a key via TPM_LoadKey2. The next
TPM_FlushSpecific occasionally takes tens of seconds to
complete. Another sequence is creating many keys in a row without
pause. The TPM_CreateWrapKey operation gets much slower after the
first few iterations, as one would expect when the pool of precomputed
keys is exhausted. Then after a 35-second pause, the same TPM_LoadKey2
followed by TPM_FlushSpecific sequence triggers the behavior.

Our working theory is that this older TPM sometimes pauses to
precompute keys, which modern chips implement as a background
process. Without access to the chip's implementation details it's
impossible to know whether any commands are immune to being blocked by
this process. So it seems safest to ignore the chip's reported command
durations, and use a value much higher than any observed duration,
like 180 sec (which is the duration this chip reports for "long"
commands).

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm_tis.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index 5c74980..0041622 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -485,6 +485,11 @@ static void tpm_tis_update_timeouts(struct tpm_chip *chip)
chip->vendor.timeout_d = TIS_SHORT_TIMEOUT * HZ / 1000;
chip->vendor.timeout_adjusted = true;
break;
+   case 0x104a: /* STMicro ST19NP18-TPM */
+   chip->vendor.duration[TPM_SHORT] = 180 * HZ;
+   chip->vendor.duration[TPM_MEDIUM] = 180 * HZ;
+   chip->vendor.duration[TPM_LONG] = 180 * HZ;
+   chip->vendor.duration_adjusted = true;
}
 }
 
-- 
1.9.1



[PATCH v4 3/4] tpm: Allow TPM chip drivers to override reported command durations

2016-06-07 Thread Ed Swierk
Some TPM chips report bogus command durations in their capabilities,
just as others report incorrect timeouts. Rework tpm_get_timeouts()
to allow chip drivers to override either via a single callback.
Also clean up handling of TPMs that report milliseconds instead of
microseconds.

Signed-off-by: Ed Swierk 
---
 drivers/char/tpm/tpm-interface.c | 177 +--
 drivers/char/tpm/tpm_tis.c   |  35 ++--
 include/linux/tpm.h  |   3 +-
 3 files changed, 106 insertions(+), 109 deletions(-)

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index cc1e5bc..b8a08bb 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -502,123 +502,138 @@ static int tpm_startup(struct tpm_chip *chip, __be16 
startup_type)
"attempting to start the TPM");
 }
 
-int tpm_get_timeouts(struct tpm_chip *chip)
+static int tpm_get_cap_prop(struct tpm_chip *chip, __be32 type, int size,
+   cap_t *cap, char *desc)
 {
struct tpm_cmd_t tpm_cmd;
-   unsigned long new_timeout[4];
-   unsigned long old_timeout[4];
-   struct duration_t *duration_cap;
ssize_t rc;
 
tpm_cmd.header.in = tpm_getcap_header;
tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
+   tpm_cmd.params.getcap_in.subcap = type;
rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE, NULL);
 
if (rc == TPM_ERR_INVALID_POSTINIT) {
/* The TPM is not started, we are the first to talk to it.
   Execute a startup command. */
-   dev_info(chip->pdev, "Issuing TPM_STARTUP");
+   dev_info(chip->pdev, "Issuing TPM_STARTUP\n");
if (tpm_startup(chip, TPM_ST_CLEAR))
return rc;
 
tpm_cmd.header.in = tpm_getcap_header;
tpm_cmd.params.getcap_in.cap = TPM_CAP_PROP;
tpm_cmd.params.getcap_in.subcap_size = cpu_to_be32(4);
-   tpm_cmd.params.getcap_in.subcap = TPM_CAP_PROP_TIS_TIMEOUT;
+   tpm_cmd.params.getcap_in.subcap = type;
rc = tpm_transmit_cmd(chip, &tpm_cmd, TPM_INTERNAL_RESULT_SIZE,
  NULL);
}
+
if (rc) {
dev_err(chip->pdev,
-   "A TPM error (%zd) occurred attempting to determine the 
timeouts\n",
-   rc);
-   goto duration;
+   "Error %zd reading %s\n", rc, desc);
+   return -EINVAL;
}
 
if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
be32_to_cpu(tpm_cmd.header.out.length)
-   != sizeof(tpm_cmd.header.out) + sizeof(u32) + 4 * sizeof(u32))
+   != sizeof(tpm_cmd.header.out) + sizeof(u32) + size * sizeof(u32)) {
+   dev_err(chip->pdev,
+   "Bad return code or length reading %s\n", desc);
return -EINVAL;
-
-   old_timeout[0] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.a);
-   old_timeout[1] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.b);
-   old_timeout[2] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.c);
-   old_timeout[3] = be32_to_cpu(tpm_cmd.params.getcap_out.cap.timeout.d);
-   memcpy(new_timeout, old_timeout, sizeof(new_timeout));
-
-   /*
-* Provide ability for vendor overrides of timeout values in case
-* of misreporting.
-*/
-   if (chip->ops->update_timeouts != NULL)
-   chip->vendor.timeout_adjusted =
-   chip->ops->update_timeouts(chip, new_timeout);
-
-   if (!chip->vendor.timeout_adjusted) {
-   /* Don't overwrite default if value is 0 */
-   if (new_timeout[0] != 0 && new_timeout[0] < 1000) {
-   int i;
-
-   /* timeouts in msec rather usec */
-   for (i = 0; i != ARRAY_SIZE(new_timeout); i++)
-   new_timeout[i] *= 1000;
-   chip->vendor.timeout_adjusted = true;
-   }
}
 
-   /* Report adjusted timeouts */
-   if (chip->vendor.timeout_adjusted) {
-   dev_info(chip->pdev,
-HW_ERR "Adjusting reported timeouts: A %lu->%luus B 
%lu->%luus C %lu->%luus D %lu->%luus\n",
-old_timeout[0], new_timeout[0],
-old_timeout[1], new_timeout[1],
-old_timeout[2], new_timeout[2],
-old_timeout[3], new_timeout[3]);
-   }
+   memcpy(cap, &tpm_cmd.params.getcap_out.ca

Re: [RFC PATCH urcu on mips, parisc] Fix: compat_futex should work-around futex signal-restart kernel bug

2015-12-17 Thread Ed Swierk
I believe e967ef02 "MIPS: Fix restart of indirect syscalls" should be
backported to all stable kernels.

It would be a surprising coincidence if parisc suffers from the same problem.

--Ed


On Thu, Dec 17, 2015 at 4:54 AM, Mathieu Desnoyers
 wrote:
> - On Dec 16, 2015, at 5:09 PM, Mathieu Desnoyers 
> mathieu.desnoy...@efficios.com wrote:
>
>> When testing liburcu on a 3.18 Linux kernel, 2-core MIPS (cpu model :
>> Ingenic JZRISC V4.15  FPU V0.0), we notice that a blocked sys_futex
>> FUTEX_WAIT returns -1, errno=ENOSYS when interrupted by a SA_RESTART
>> signal handler. This spurious ENOSYS behavior causes hangs in liburcu
>> 0.9.x. Running a MIPS 3.18 kernel under a QEMU emulator exhibits the
>> same behavior. This might affect earlier kernels.
>>
>> This issue appears to be fixed in 3.18.y stable kernels and 3.19, but
>> nevertheless, we should try to handle this kernel bug more gracefully
>> than a user-space hang due to unexpected spurious ENOSYS return value.
>
> It's actually fixed in 3.19, but not in 3.18.y stable kernels. The
> Linux kernel upstream fix commit is:
> e967ef02 "MIPS: Fix restart of indirect syscalls"
>
> I've created a small test program that could also be used on parisc
> to check if it suffers from the same issue (see attached).
>
> On bogus mips kernels, we see the following output:
> [OK] Test program with pid: 5748 SIGUSR1 handler
> [FAIL] futex returns -1, Function not implemented
>
> Let me know if someone can try it out on a parisc kernel.
>
> Thanks!
>
> Mathieu
>
>>
>> Therefore, fallback on the "async-safe" version of compat_futex in those
>> situations where FUTEX_WAIT returns ENOSYS. This async-safe fallback has
>> the nice property of being OK to use concurrently with other FUTEX_WAKE
>> and FUTEX_WAIT futex() calls, because it's simply a busy-wait scheme.
>>
>> We suspect that parisc might be affected by a similar issue (Debian
>> build bots reported a similar hang on both mips and parisc), but we do
>> not have access to the hardware required to test this hypothesis.
>>
>> Signed-off-by: Mathieu Desnoyers 
>> CC: Michael Jeanson 
>> CC: Paul E. McKenney 
>> CC: Ralf Baechle 
>> CC: linux-m...@linux-mips.org
>> CC: linux-kernel@vger.kernel.org
>> CC: "James E.J. Bottomley" 
>> CC: Helge Deller 
>> CC: linux-par...@vger.kernel.org
>> ---
>> compat_futex.c |  2 ++
>> urcu/futex.h   | 12 +++-
>> 2 files changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/compat_futex.c b/compat_futex.c
>> index b7f78f0..9e918fe 100644
>> --- a/compat_futex.c
>> +++ b/compat_futex.c
>> @@ -111,6 +111,8 @@ end:
>>  * _ASYNC SIGNAL-SAFE_.
>>  * For now, timeout, uaddr2 and val3 are unused.
>>  * Waiter will busy-loop trying to read the condition.
>> + * It is OK to use compat_futex_async() on a futex address on which
>> + * futex() WAKE operations are also performed.
>>  */
>>
>> int compat_futex_async(int32_t *uaddr, int op, int32_t val,
>> diff --git a/urcu/futex.h b/urcu/futex.h
>> index 4d16cfa..a17eda8 100644
>> --- a/urcu/futex.h
>> +++ b/urcu/futex.h
>> @@ -73,7 +73,17 @@ static inline int futex_noasync(int32_t *uaddr, int op,
>> int32_t val,
>>
>>   ret = futex(uaddr, op, val, timeout, uaddr2, val3);
>>   if (caa_unlikely(ret < 0 && errno == ENOSYS)) {
>> - return compat_futex_noasync(uaddr, op, val, timeout,
>> + /*
>> +  * The fallback on ENOSYS is the async-safe version of
>> +  * the compat futex implementation, because the
>> +  * async-safe compat implementation allows being used
>> +  * concurrently with calls to futex(). Indeed, sys_futex
>> +  * FUTEX_WAIT, on some architectures (e.g. mips), within
>> +  * a given process, spuriously return ENOSYS due to
>> +  * signal restart bugs on some kernel versions (e.g.
>> +  * Linux kernel 3.18 and possibly earlier).
>> +  */
>> + return compat_futex_async(uaddr, op, val, timeout,
>>   uaddr2, val3);
>>   }
>>   return ret;
>> --
>> 2.1.4
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] ethtool: Extend ethtool plugin module eeprom API to phylib

2015-01-02 Thread Ed Swierk
This patch extends the ethtool plugin module eeprom API to support cards
whose phy support is delegated to a separate driver.

The handlers for ETHTOOL_GMODULEINFO and ETHTOOL_GMODULEEEPROM call the
module_info and module_eeprom functions if the phy driver provides them;
otherwise the handlers call the equivalent ethtool_ops functions provided
by network drivers with built-in phy support.

Signed-off-by: Ed Swierk 
---
 include/linux/phy.h |  9 +
 net/core/ethtool.c  | 45 ++---
 2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 565188c..04e5f5c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -472,6 +472,15 @@ struct phy_driver {
/* See set_wol, but for checking whether Wake on LAN is enabled. */
void (*get_wol)(struct phy_device *dev, struct ethtool_wolinfo *wol);
 
+   /* Get the size and type of the eeprom contained within a plug-in
+* module */
+   int (*module_info)(struct phy_device *dev,
+  struct ethtool_modinfo *modinfo);
+
+   /* Get the eeprom information from the plug-in module */
+   int (*module_eeprom)(struct phy_device *dev,
+struct ethtool_eeprom *ee, u8 *data);
+
struct device_driver driver;
 };
 #define to_phy_driver(d) container_of(d, struct phy_driver, driver)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 30071de..466526b 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -1405,20 +1405,31 @@ static int ethtool_get_ts_info(struct net_device *dev, 
void __user *useraddr)
return err;
 }
 
+static int __ethtool_get_module_info(struct net_device *dev,
+struct ethtool_modinfo *modinfo)
+{
+   const struct ethtool_ops *ops = dev->ethtool_ops;
+   struct phy_device *phydev = dev->phydev;
+
+   if (phydev && phydev->drv && phydev->drv->module_info)
+   return phydev->drv->module_info(phydev, modinfo);
+
+   if (ops->get_module_info)
+   return ops->get_module_info(dev, modinfo);
+
+   return -EOPNOTSUPP;
+}
+
 static int ethtool_get_module_info(struct net_device *dev,
   void __user *useraddr)
 {
int ret;
struct ethtool_modinfo modinfo;
-   const struct ethtool_ops *ops = dev->ethtool_ops;
-
-   if (!ops->get_module_info)
-   return -EOPNOTSUPP;
 
if (copy_from_user(&modinfo, useraddr, sizeof(modinfo)))
return -EFAULT;
 
-   ret = ops->get_module_info(dev, &modinfo);
+   ret = __ethtool_get_module_info(dev, &modinfo);
if (ret)
return ret;
 
@@ -1428,21 +1439,33 @@ static int ethtool_get_module_info(struct net_device 
*dev,
return 0;
 }
 
+static int __ethtool_get_module_eeprom(struct net_device *dev,
+  struct ethtool_eeprom *ee, u8 *data)
+{
+   const struct ethtool_ops *ops = dev->ethtool_ops;
+   struct phy_device *phydev = dev->phydev;
+
+   if (phydev && phydev->drv && phydev->drv->module_eeprom)
+   return phydev->drv->module_eeprom(phydev, ee, data);
+
+   if (ops->get_module_eeprom)
+   return ops->get_module_eeprom(dev, ee, data);
+
+   return -EOPNOTSUPP;
+}
+
 static int ethtool_get_module_eeprom(struct net_device *dev,
 void __user *useraddr)
 {
int ret;
struct ethtool_modinfo modinfo;
-   const struct ethtool_ops *ops = dev->ethtool_ops;
-
-   if (!ops->get_module_info || !ops->get_module_eeprom)
-   return -EOPNOTSUPP;
 
-   ret = ops->get_module_info(dev, &modinfo);
+   ret = __ethtool_get_module_info(dev, &modinfo);
if (ret)
return ret;
 
-   return ethtool_get_any_eeprom(dev, useraddr, ops->get_module_eeprom,
+   return ethtool_get_any_eeprom(dev, useraddr,
+ __ethtool_get_module_eeprom,
  modinfo.eeprom_len);
 }
 
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[updated PATCH] forcedeth: power down phy when interface is down

2007-09-21 Thread Ed Swierk
On 9/19/07, Ayaz Abdulla <[EMAIL PROTECTED]> wrote:
> It seems that you are powering down the phy even if WOL is enabled.

Right; I've updated the patch to skip powering down the phy when wol is enabled.

> Secondly, can you powerdown the phy at the same time you start
> performing autoneg restart?

This seems to work fine with a BCM5461S phy; it would be worth testing
with other phys to be sure.

--Ed


forcedeth-phy-power-down.patch
Description: Binary data


[PATCH] forcedeth: "no link" is informational

2007-09-18 Thread Ed Swierk
Log "no link during initialization" at KERN_INFO as it's not an error,
and occurs every time the interface comes up (when the
forcedeth-phy-power-down patch is applied).

Signed-off-by: Ed Swierk <[EMAIL PROTECTED]>


forcedeth-open-no-link-printk.patch
Description: Binary data


[PATCH] forcedeth: power down phy when interface is down

2007-09-18 Thread Ed Swierk
Bring the physical link down when the interface is down, by placing
the PHY in power-down state. This mirrors the behavior of other
drivers including e1000 and tg3.

Signed-off-by: Ed Swierk <[EMAIL PROTECTED]>


forcedeth-phy-power-down.patch
Description: Binary data


Re: [PATCH] Configurable tap interface MTU

2007-09-12 Thread Ed Swierk
On 9/11/07, Herbert Xu <[EMAIL PROTECTED]> wrote:
> Please make it 65535 without an Ethernet header and 65521
> with an Ethernet header.

Here is a revised patch that allows MTUs up to 65535 for tap
interfaces and up to 65521 for tun interfaces.

(If I set the MTU to 65521 on a tun interface, ping complains "message
too long" when I send a 65521-byte packet; 65520 works okay, though.)

--Ed


tap-change-mtu.patch
Description: Binary data


[PATCH] Configurable tap interface MTU

2007-09-11 Thread Ed Swierk
This patch makes it possible to change the MTU on a tap interface.
Increasing the MTU beyond the 1500-byte default is useful for
applications that interoperate with Ethernet devices supporting jumbo
frames.

The patch caps the MTU somewhat arbitrarily at 16000 bytes. This is
slightly lower than the value used by the e1000 driver, so it seems
like a safe upper limit.

Signed-off-by: Ed Swierk <[EMAIL PROTECTED]>

---


tap-change-mtu.patch
Description: Binary data


Re: Longer command line for kexec on i386

2007-07-25 Thread Ed Swierk
Ed Swierk  arastra.com> writes:

> I made the following change to kexec:
>
> [deleted]
> 
> but I'm now seeing intermittent corruption of the initrd in the new kernel--a
> few bytes at different locations each time.
> 
> I suspect I've neglected some other important changes to the boot protocol.  

The corruption is due to a "hardware" bug (qemu), which was in fact corrupting
memory (http://article.gmane.org/gmane.comp.emulators.qemu/18774).  Extending
the command line in kexec somehow made the corruption noticeable.

--Ed


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Longer command line for kexec on i386

2007-07-23 Thread Ed Swierk
I'm attempting to get kexec to pass a command line longer than 256 bytes to the
new kernel, using kexec-tools-testing-20070330 and kernel 2.6.22.1.  The new
kernel is a bzImage, and I'm using kexec to tack on a command line and an 
initrd.

I made the following change to kexec:

--- kexec-tools-testing-20070330.orig/include/x86/x86-linux.h
+++ kexec-tools-testing-20070330/include/x86/x86-linux.h
@@ -148,14 +148,13 @@
 #endif
struct e820entry e820_map[E820MAX]; /* 0x2d0 */
/* 0x550 */
-#define COMMAND_LINE_SIZE 256
+#define COMMAND_LINE_SIZE 2048
 };

 struct x86_linux_faked_param_header {
struct x86_linux_param_header hdr;  /* 0x00 */
uint8_t reserved16[688];/* 0x550 */
-   uint8_t command_line[COMMAND_LINE_SIZE]; /* 0x800 */
-   uint8_t reserved17[1792];   /* 0x900 - 0x1000 */
+   uint8_t command_line[COMMAND_LINE_SIZE]; /* 0x800 - 0x1000 */
 };

 struct x86_linux_header {

but I'm now seeing intermittent corruption of the initrd in the new kernel--a
few bytes at different locations each time.

I suspect I've neglected some other important changes to the boot protocol.  Any
clues would be appreciated.

Thanks,
--Ed


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/