[PATCH] KVM: VMX: Fix vmx->nested freeing when no SMI handler

2017-11-21 Thread Wanpeng Li
From: Wanpeng Li 

Reported by syzkaller:

------------[ cut here ]------------
WARNING: CPU: 5 PID: 2939 at arch/x86/kvm/vmx.c:3844 free_loaded_vmcs+0x77/0x80 [kvm_intel]
CPU: 5 PID: 2939 Comm: repro Not tainted 4.14.0+ #26
RIP: 0010:free_loaded_vmcs+0x77/0x80 [kvm_intel]
Call Trace:
 vmx_free_vcpu+0xda/0x130 [kvm_intel]
 kvm_arch_destroy_vm+0x192/0x290 [kvm]
 kvm_put_kvm+0x262/0x560 [kvm]
 kvm_vm_release+0x2c/0x30 [kvm]
 __fput+0x190/0x370
 task_work_run+0xa1/0xd0
 do_exit+0x4d2/0x13e0
 do_group_exit+0x89/0x140
 get_signal+0x318/0xb80
 do_signal+0x8c/0xb40
 exit_to_usermode_loop+0xe4/0x140
 syscall_return_slowpath+0x206/0x230
 entry_SYSCALL_64_fastpath+0x98/0x9a

The syzkaller testcase executes the VMXON/VMLAUNCH instructions, so the
vmx->nested state is populated, and it also issues the KVM_SMI ioctl. However,
the testcase is just a simple C program and is not launched by something
like SeaBIOS, which implements an SMI handler. Commit 05cade71cf (KVM: nSVM:
fix SMI injection in guest mode) leaves guest mode and sets nested.vmxon
to false for the duration of SMM, following SDM 34.14.1 "leave VMX
operation" upon entering SMM. We can't alloc/free the vmx->nested state
each time SMM is entered/exited since that would add overhead. So
vmx_pre_enter_smm() marks nested.vmxon false even though the vmx->nested
state is still populated, expecting em_rsm() to mark nested.vmxon
true again. However, the SMI handler/RSM never executes since there
is nothing like SeaBIOS in this scenario. free_nested() then
fails to free the vmx->nested state because vmx->nested.vmxon is false,
which results in the above warning.

This patch fixes it by also considering the no-SMI-handler case. Luckily,
vmx->nested.smm.vmxon is set from the value of vmx->nested.vmxon
in vmx_pre_enter_smm(), so we can take advantage of it and free the
vmx->nested state when L1 goes down.

Reported-by: Dmitry Vyukov 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Dmitry Vyukov 
Fixes: 05cade71cf (KVM: nSVM: fix SMI injection in guest mode)
Signed-off-by: Wanpeng Li 
---
 arch/x86/kvm/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index dccc0f7..ed22425 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -7372,7 +7372,7 @@ static inline void nested_release_vmcs12(struct vcpu_vmx *vmx)
  */
 static void free_nested(struct vcpu_vmx *vmx)
 {
-	if (!vmx->nested.vmxon)
+	if (!vmx->nested.vmxon && !vmx->nested.smm.vmxon)
 		return;
 
 	vmx->nested.vmxon = false;
-- 
2.7.4
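
The flag handling described in the commit message above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct nested_model` and the `model_*()` helpers are hypothetical stand-ins and not kernel code; only the two flag names and the shape of the guard condition are taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical userspace model; only the flag names mirror the patch. */
struct nested_model {
	bool vmxon;     /* models vmx->nested.vmxon */
	bool smm_vmxon; /* models vmx->nested.smm.vmxon */
	bool populated; /* true while the nested state is still allocated */
};

/* The guest executed VMXON: the nested state gets allocated. */
static void model_vmxon(struct nested_model *n)
{
	n->vmxon = true;
	n->populated = true;
}

/* SMM entry: remember vmxon, then clear it without freeing anything. */
static void model_enter_smm(struct nested_model *n)
{
	n->smm_vmxon = n->vmxon;
	n->vmxon = false;
}

/*
 * Teardown guard. With fixed == false it checks only vmxon (old code);
 * with fixed == true it also honours smm_vmxon (patched code).
 */
static void model_free_nested(struct nested_model *n, bool fixed)
{
	if (!n->vmxon && !(fixed && n->smm_vmxon))
		return;

	n->vmxon = false;
	n->populated = false;
}
```

In this model, a VM that is torn down after entering SMM but before any RSM keeps its nested state allocated with the old guard and releases it with the patched guard.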



Re: [PATCH 24/30] staging: rts5208: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Greg Kroah-Hartman
On Wed, Nov 22, 2017 at 12:31:09AM -0500, Sinan Kaya wrote:
> pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
> is present in domain 0. This prevents device drivers from being
> reused for other domain numbers.
> 
> Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
> extract the domain number. Elsewhere, use the actual domain number from
> the device.
> 
> Signed-off-by: Sinan Kaya 
> ---
>  drivers/staging/rts5208/rtsx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/rts5208/rtsx.c b/drivers/staging/rts5208/rtsx.c
> index 89e2cfe..13b14fe 100644
> --- a/drivers/staging/rts5208/rtsx.c
> +++ b/drivers/staging/rts5208/rtsx.c
> @@ -281,7 +281,7 @@ int rtsx_read_pci_cfg_byte(u8 bus, u8 dev, u8 func, u8 offset, u8 *val)
>   u8 data;
>   u8 devfn = (dev << 3) | func;
>  
> - pdev = pci_get_bus_and_slot(bus, devfn);
> + pdev = pci_get_domain_bus_and_slot(0, bus, devfn);

Ugh, this whole function should go away, but it's good enough for now,
I'll queue it up after -rc1 is out.

thanks,

greg k-h
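
The `devfn` value computed in the quoted hunk packs a device (slot) number and a function number into one byte. A minimal userspace sketch of that encoding follows; the helper names are hypothetical, and the masking of out-of-range inputs is an addition of this sketch that the quoted driver expression does not perform.

```c
#include <stdint.h>

/* Pack a 5-bit device (slot) number and a 3-bit function number. */
static uint8_t make_devfn(uint8_t dev, uint8_t func)
{
	return (uint8_t)(((dev & 0x1f) << 3) | (func & 0x07));
}

/* Recover the device (slot) number from a packed devfn. */
static uint8_t devfn_slot(uint8_t devfn)
{
	return (devfn >> 3) & 0x1f;
}

/* Recover the function number from a packed devfn. */
static uint8_t devfn_func(uint8_t devfn)
{
	return devfn & 0x07;
}
```

For example, device 2, function 1 packs to 0x11, and unpacking 0x11 gives back slot 2 and function 1.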


Re: [PATCH 29/30] i7300_idle: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Greg Kroah-Hartman
On Wed, Nov 22, 2017 at 12:31:14AM -0500, Sinan Kaya wrote:
> pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
> is present in domain 0. This prevents device drivers from being
> reused for other domain numbers.
> 
> Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
> extract the domain number. Elsewhere, use the actual domain number from
> the device.
> 
> Signed-off-by: Sinan Kaya 
> ---
>  include/linux/i7300_idle.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/i7300_idle.h b/include/linux/i7300_idle.h
> index 4dbe651..58cd9c6 100644
> --- a/include/linux/i7300_idle.h
> +++ b/include/linux/i7300_idle.h
> @@ -48,7 +48,7 @@ static inline int i7300_idle_platform_probe(struct pci_dev **fbd_dev,
>   int i;
>   struct pci_dev *memdev, *dmadev;
>  
> - memdev = pci_get_bus_and_slot(MEMCTL_BUS, MEMCTL_DEVFN);
> + memdev = pci_get_domain_bus_and_slot(0, MEMCTL_BUS, MEMCTL_DEVFN);

You have a pci_dev, why can't you use it here to get the domain?

>   if (!memdev)
>   return -ENODEV;
>  
> @@ -61,7 +61,7 @@ static inline int i7300_idle_platform_probe(struct pci_dev **fbd_dev,
>   if (pci_tbl[i].vendor == 0)
>   return -ENODEV;
>  
> - dmadev = pci_get_bus_and_slot(IOAT_BUS, IOAT_DEVFN);
> + dmadev = pci_get_domain_bus_and_slot(0, IOAT_BUS, IOAT_DEVFN);

Same here.

thanks,

greg k-h


[tip:x86/urgent] x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow

2017-11-21 Thread tip-bot for Andrey Ryabinin
Commit-ID:  f68d62a56708b0c19dca7a998f408510f2fbc3a8
Gitweb: https://git.kernel.org/tip/f68d62a56708b0c19dca7a998f408510f2fbc3a8
Author: Andrey Ryabinin 
AuthorDate: Wed, 15 Nov 2017 17:36:35 -0800
Committer:  Ingo Molnar 
CommitDate: Wed, 22 Nov 2017 07:18:35 +0100

x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow

[ Note, this commit is a cherry-picked version of:

d17a1d97dc20: ("x86/mm/kasan: don't use vmemmap_populate() to initialize shadow")

  ... for easier x86 entry code testing and back-porting. ]

The KASAN shadow is currently mapped using vmemmap_populate() since that
provides a semi-convenient way to map pages into init_top_pgt.  However,
since that no longer zeroes the mapped pages, it is not suitable for
KASAN, which requires zeroed shadow memory.

Add kasan_populate_shadow() interface and use it instead of
vmemmap_populate().  Besides, this allows us to take advantage of
gigantic pages and use them to populate the shadow, which should save us
some memory wasted on page tables and reduce TLB pressure.

Link: http://lkml.kernel.org/r/20171103185147.2688-2-pasha.tatas...@oracle.com
Signed-off-by: Andrey Ryabinin 
Signed-off-by: Pavel Tatashin 
Cc: Andy Lutomirski 
Cc: Steven Sistare 
Cc: Daniel Jordan 
Cc: Bob Picco 
Cc: Michal Hocko 
Cc: Alexander Potapenko 
Cc: Ard Biesheuvel 
Cc: Catalin Marinas 
Cc: Christian Borntraeger 
Cc: David S. Miller 
Cc: Dmitry Vyukov 
Cc: Heiko Carstens 
Cc: "H. Peter Anvin" 
Cc: Ingo Molnar 
Cc: Mark Rutland 
Cc: Matthew Wilcox 
Cc: Mel Gorman 
Cc: Michal Hocko 
Cc: Sam Ravnborg 
Cc: Thomas Gleixner 
Cc: Will Deacon 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Ingo Molnar 
---
 arch/x86/Kconfig            |   2 +-
 arch/x86/mm/kasan_init_64.c | 143 +---
 2 files changed, 137 insertions(+), 8 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a0623f0..09dcc94 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -110,7 +110,7 @@ config X86
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_HUGE_VMAP		if X86_64 || X86_PAE
 	select HAVE_ARCH_JUMP_LABEL
-	select HAVE_ARCH_KASAN			if X86_64 && SPARSEMEM_VMEMMAP
+	select HAVE_ARCH_KASAN			if X86_64
 	select HAVE_ARCH_KGDB
 	select HAVE_ARCH_KMEMCHECK
 	select HAVE_ARCH_MMAP_RND_BITS		if MMU
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 2b60dc6..99dfed6 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -4,12 +4,14 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -18,7 +20,134 @@ extern struct range pfn_mapped[E820_MAX_ENTRIES];
 
 static p4d_t tmp_p4d_table[PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE);
 
-static int __init map_range(struct range *range)
+static __init void *early_alloc(size_t size, int nid)
+{
+	return memblock_virt_alloc_try_nid_nopanic(size, size,
+		__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
+}
+
+static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
+				      unsigned long end, int nid)
+{
+	pte_t *pte;
+
+	if (pmd_none(*pmd)) {
+		void *p;
+
+		if (boot_cpu_has(X86_FEATURE_PSE) &&
+		    ((end - addr) == PMD_SIZE) &&
+		    IS_ALIGNED(addr, PMD_SIZE)) {
+			p = early_alloc(PMD_SIZE, nid);
+			if (p && pmd_set_huge(pmd, __pa(p), PAGE_KERNEL))
+				return;
+			else if (p)
+				memblock_free(__pa(p), PMD_SIZE);
+		}
+
+		p = early_alloc(PAGE_SIZE, nid);
+		pmd_populate_kernel(&init_mm, pmd, p);
+	}
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		pte_t entry;
+		void *p;
+
+		if (!pte_none(*pte))
+			continue;
+
+		p = early_alloc(PAGE_SIZE, nid);
+		entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL);
+		set_pte_at(&init_mm, addr, pte, entry);
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+static void __init kasan_populate_pud(pud_t *pud, unsigned long addr,
+				      unsigned long end, int nid)
+{
+	pmd_t *pmd;
+	unsigned long next;
+
+	if (pud_none(*pud)) {
+		void *p;
+
+		if (boot_cpu_has(X86_FEATURE_GBPAGES) &&
+		    ((end - addr) == PUD_SIZE) &&
+		    IS_ALIGNED(addr, PUD_SIZE)) {
+			p = early_alloc(PUD_SIZE, nid);
+			if (p && pud_set_huge(pud, __pa(p), PAGE_KERNEL))
+				return;
+			else if (p)
+   
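
The huge-page decision in kasan_populate_pmd() above reduces to three conditions: the CPU feature is present, the range is exactly one PMD in size, and the start address is PMD-aligned. A minimal userspace sketch of that check follows. The `MODEL_*` names are hypothetical, and the 2 MiB PMD size is an assumption matching x86-64 with 4 KiB pages; none of this is kernel code.

```c
#include <stdbool.h>

#define MODEL_PMD_SIZE (1UL << 21) /* assumed: 2 MiB, x86-64 with 4 KiB pages */
#define MODEL_IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/*
 * Returns true when [addr, end) could be covered by one huge PMD mapping,
 * mirroring the three conditions tested in the diff above.
 */
static bool model_can_use_huge_pmd(unsigned long addr, unsigned long end,
				   bool cpu_has_pse)
{
	return cpu_has_pse &&
	       (end - addr) == MODEL_PMD_SIZE &&
	       MODEL_IS_ALIGNED(addr, MODEL_PMD_SIZE);
}
```

When the check fails, the code in the diff falls back to allocating a page table and populating it one 4 KiB page at a time.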

Re: Regression with a91d66129fb9 ("ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removal")

2017-11-21 Thread Takashi Iwai
On Tue, 21 Nov 2017 17:25:05 +0100,
Takashi Iwai wrote:
> 
> On Tue, 21 Nov 2017 17:14:42 +0100,
> Laura Abbott wrote:
> > 
> > Hi,
> > 
> > Fedora got a bug report
> > (https://bugzilla.redhat.com/show_bug.cgi?id=1512853)
> > that Line Out stopped working between 4.13.9 and 4.13.10. Reverting
> > 82d745a55779 ("ALSA: hda - Fix incorrect TLV callback check introduced
> > during set_fs() removal")
> > fixed the problem. I didn't ask the reporter to test on 4.14 since I didn't
> > see anything explicitly tagged as fixing the issue. Any ideas?
> 
> It might be that the formerly saved asound.state brought the
> inconsistency.  Try to remove the saved state (either
> /var/lib/alsa/asound.state or /etc/asound.state) after unloading the
> sound driver modules and reboot/retest.
> 
> In any case, please provide the alsa-info.sh output from the affected
> machine.  Run the script with the --no-upload option, and attach the
> generated file.

BTW, having the output of alsa-info.sh with a bug report helps
analysis and debugging quite a lot.

So it'd be appreciated if you could always ask for it as the first step
for a sound-related bug report on Fedora.  Especially when it's a
regression, the outputs from both the working and non-working cases
would be nice, so that we can compare more precisely.


thanks,

Takashi


Re: [PATCH 12/30] Drivers: ide: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Greg KH
On Wed, Nov 22, 2017 at 12:30:57AM -0500, Sinan Kaya wrote:
> pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
> is present in domain 0. This prevents device drivers from being
> reused for other domain numbers.
> 
> Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
> extract the domain number. Elsewhere, use the actual domain number from
> the device.

While this is a great generic text, you might want to make it a bit more
custom to each specific patch.  For example, you don't use a domain of 0
in this one, so the text is a bit wrong and confusing if you look at it
stand-alone.

I like the series and the idea, just fix up this text in some of the
patches and you should be fine.

thanks,

greg k-h


Re: [PATCH] media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION

2017-11-21 Thread Randy Dunlap
On 11/21/17 23:41, kbuild test robot wrote:
> Hi Jesse,
> 
> Thank you for the patch! Yet something to improve:

missing
#include 

Jesse, did you build all of these driver changes?


> [auto build test ERROR on linuxtv-media/master]
> [also build test ERROR on v4.14 next-20171121]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Jesse-Chan/media-mtk-vcodec-add-missing-MODULE_LICENSE-DESCRIPTION/20171122-124620
> base:   git://linuxtv.org/media_tree.git master
> config: xtensa-allmodconfig (attached as .config)
> compiler: xtensa-linux-gcc (GCC) 4.9.0
> reproduce:
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # save the attached .config to linux build tree
>         make.cross ARCH=xtensa
> 
> All errors (new ones prefixed by >>):
> 
> >> drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.c:55:16: error: expected declaration specifiers or '...' before string constant
>     MODULE_LICENSE("GPL v2");
>                    ^
>    drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.c:56:20: error: expected declaration specifiers or '...' before string constant
>     MODULE_DESCRIPTION("Mediatek video codec driver");
>                        ^
> 
> vim +55 drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.c
> 
>     54
>   > 55	MODULE_LICENSE("GPL v2");
> 
> ---
> 0-DAY kernel test infrastructure                Open Source Technology Center
> https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
> 


-- 
~Randy


Re: [PATCH 30/30] PCI: remove pci_get_bus_and_slot() function

2017-11-21 Thread Greg KH
On Wed, Nov 22, 2017 at 12:08:45AM -0600, Timur Tabi wrote:
> On 11/21/17 11:55 PM, Sinan Kaya wrote:
> > For places where domain number information is available, I extracted domain
> > number and added into pci_get_domain_bus_and_slot() call such as xen or bn
> > drivers.
> 
> My suggestion is that you restrict your first patch set to only these
> patches.
> 
> > The assumption at this point is for pci_get_bus_and_slot() usages to be
> > caught in code-review.
> 
> How about this:
> 
> static inline struct pci_dev * __deprecated
> pci_get_bus_and_slot(unsigned int bus, unsigned int devfn)
> {
> 	return pci_get_domain_bus_and_slot(0, bus, devfn);
> }

Ick, no, why?  What is wrong with removing this function as is?  Don't
mark something as __deprecated if there are no in-kernel users, just
delete it and move on.

If you have out-of-tree drivers, then yes, they can make a wrapper for
this function like this if they really feel the need, or they can get
their code merged :)

thanks,

greg k-h
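
The difference between the domain-aware lookup and a domain-0 wrapper of the kind discussed above can be illustrated with a small userspace model. This is a sketch under stated assumptions: `struct model_dev` and both `model_*()` functions are hypothetical and only mimic the lookup semantics; they are not the kernel's `struct pci_dev` or its PCI API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device record; not the kernel's struct pci_dev. */
struct model_dev {
	int domain;
	uint8_t bus;
	uint8_t devfn;
};

/* Domain-aware lookup: all three coordinates must match. */
static const struct model_dev *
model_get_domain_bus_and_slot(const struct model_dev *devs, size_t n,
			      int domain, uint8_t bus, uint8_t devfn)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].domain == domain && devs[i].bus == bus &&
		    devs[i].devfn == devfn)
			return &devs[i];
	}
	return NULL;
}

/* Wrapper in the style an out-of-tree driver might carry: domain 0 only. */
static const struct model_dev *
model_get_bus_and_slot(const struct model_dev *devs, size_t n,
		       uint8_t bus, uint8_t devfn)
{
	return model_get_domain_bus_and_slot(devs, n, 0, bus, devfn);
}
```

In this model the wrapper can never find a device that sits in domain 1, even when bus and devfn match, which is the restriction the series sets out to remove.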


[PATCH v3] staging: fsl-mc: use 32bits to support 64K size mc-portals

2017-11-21 Thread Bharat Bhushan
Per the APIs, each mc-portal is 64K in size, but a 16-bit field
(type u16) is currently used to store the mc-portal size.
As a result, the upper bit of the portal size gets truncated.

Signed-off-by: Bharat Bhushan 
---
v2->v3:
 - v2 patch: https://patchwork.kernel.org/patch/10067661/
 - Changed patch subject and description

v1->v2:
 - v1 patch: https://patchwork.kernel.org/patch/10067657/
 - replaced uint32_t with u32 in patch subject/description

 drivers/staging/fsl-mc/include/mc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/fsl-mc/include/mc.h b/drivers/staging/fsl-mc/include/mc.h
index aafe63a..2cf15b0 100644
--- a/drivers/staging/fsl-mc/include/mc.h
+++ b/drivers/staging/fsl-mc/include/mc.h
@@ -325,7 +325,7 @@ static inline void mc_cmd_read_api_version(struct mc_command *cmd,
 struct fsl_mc_io {
 	struct device *dev;
 	u16 flags;
-	u16 portal_size;
+	u32 portal_size;
 	phys_addr_t portal_phys_addr;
 	void __iomem *portal_virt_addr;
 	struct fsl_mc_device *dpmcp_dev;
-- 
1.9.3
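
The truncation this patch describes is easy to reproduce in userspace: 64K is 0x10000, which needs 17 bits, so storing it in a 16-bit field leaves 0. A minimal sketch follows; the macro and helper names are illustrative and do not come from the driver.

```c
#include <stdint.h>

#define MODEL_PORTAL_SIZE_64K 0x10000u /* 64 KiB; illustrative name */

/* Storing the size in a 16-bit field drops bit 16. */
static uint16_t store_size_u16(uint32_t size)
{
	return (uint16_t)size;
}

/* A 32-bit field keeps the full value. */
static uint32_t store_size_u32(uint32_t size)
{
	return size;
}
```

With a 64K input, the 16-bit variant yields 0 while the 32-bit variant yields 65536.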



[tip:x86/urgent] x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing

2017-11-21 Thread tip-bot for Andy Lutomirski
Commit-ID:  548c3050ea8d16997ae27f9e080a8338a606fc93
Gitweb: https://git.kernel.org/tip/548c3050ea8d16997ae27f9e080a8338a606fc93
Author: Andy Lutomirski 
AuthorDate: Tue, 21 Nov 2017 20:43:56 -0800
Committer:  Ingo Molnar 
CommitDate: Wed, 22 Nov 2017 06:35:48 +0100

x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing

When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
before it.  This means that users of entry_SYSCALL_64_after_hwframe()
were responsible for invoking TRACE_IRQS_OFF, and the one and only
user (Xen, added in the same commit) got it wrong.

I think this would manifest as a warning if a Xen PV guest with
CONFIG_DEBUG_LOCKDEP=y were used with context tracking.  (The
context tracking bit is to cause lockdep to get invoked before we
turn IRQs back on.)  I haven't tested that for real yet because I
can't get a kernel configured like that to boot at all on Xen PV.

Move TRACE_IRQS_OFF below the label.

Signed-off-by: Andy Lutomirski 
Cc: Boris Ostrovsky 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Dave Hansen 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: sta...@vger.kernel.org
Fixes: 8a9949bc71a7 ("x86/xen/64: Rearrange the SYSCALL entries")
Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.l...@kernel.org
Signed-off-by: Ingo Molnar 
---
 arch/x86/entry/entry_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index a2b30ec..5063ed1 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -148,8 +148,6 @@ ENTRY(entry_SYSCALL_64)
 	movq	%rsp, PER_CPU_VAR(rsp_scratch)
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
 
-	TRACE_IRQS_OFF
-
 	/* Construct struct pt_regs on stack */
 	pushq	$__USER_DS			/* pt_regs->ss */
 	pushq	PER_CPU_VAR(rsp_scratch)	/* pt_regs->sp */
@@ -170,6 +168,8 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
 	sub	$(6*8), %rsp			/* pt_regs->bp, bx, r12-15 not saved */
 	UNWIND_HINT_REGS extra=0
 
+	TRACE_IRQS_OFF
+
 	/*
 	 * If we need to do entry work or if we guess we'll need to do
 	 * exit work, go straight to the slow path.


Re: WARNING: can't dereference registers at ffffc90004dfff60 for ip error_entry+0x7d/0xd0 (Re: [PATCH v2 00/18] Entry stack switching)

2017-11-21 Thread Ingo Molnar

* Ingo Molnar  wrote:

> 
> * Andy Lutomirski  wrote:
> 
> > This sets up stack switching, including for SYSCALL.  I think it's
> > in decent shape.
> > 
> > Known issues:
> >  - I think we're going to want a way to turn the stack switching on and
> >    off either at boot time or at runtime.  It should be fairly straightforward
> >    to make it work.
> > 
> >  - I think the ORC unwinder isn't so good at dealing with stack overflows.
> >    It bails too early (I think), resulting in lots of ? entries.  This
> >    isn't a regression with this series -- it's just something that could
> >    be improved.
> 
> Note that with the attached config on an Intel testbox I get the following new ORC
> unwinder warning during bootup:
> 
> [   12.200554] calling  ghash_pclmulqdqni_mod_init+0x0/0x54 @ 1
> [   12.209536] WARNING: can't dereference registers at ffffc90004dfff60 for ip error_entry+0x7d/0xd0
> [   12.231388] initcall ghash_pclmulqdqni_mod_init+0x0/0x54 returned 0 after 23480 usecs
> 
> Thanks,

Also note that the ORC warning goes away if CONFIG_PROVE_LOCKING is disabled.

Thanks,

Ingo


[PATCH v4 1/1] perf/bench/numa: Fixup discontiguous/sparse numa nodes

2017-11-21 Thread sathnaga
From: Satheesh Rajendran 

Certain systems are designed to have sparse/discontiguous nodes.
On such systems, perf bench numa hangs, shows wrong number of nodes
and shows values for non-existent nodes. Handle this by only
taking nodes that are exposed by kernel to userspace.

Cc: Arnaldo Carvalho de Melo 
Cc: Naveen N. Rao 
Reviewed-by: Srikar Dronamraju 
Signed-off-by: Satheesh Rajendran 
Signed-off-by: Balamuruhan S 
---
 tools/perf/bench/numa.c | 57 +++--
 1 file changed, 51 insertions(+), 6 deletions(-)

diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index d95fdcc..ed7db12 100644
--- a/tools/perf/bench/numa.c
+++ b/tools/perf/bench/numa.c
@@ -216,6 +216,47 @@ static const char * const numa_usage[] = {
 	NULL
 };
 
+/*
+ * To get number of numa nodes present.
+ */
+static int nr_numa_nodes(void)
+{
+	int i, nr_nodes = 0;
+
+	for (i = 0; i < g->p.nr_nodes; i++) {
+		if (numa_bitmask_isbitset(numa_nodes_ptr, i))
+			nr_nodes++;
+	}
+
+	return nr_nodes;
+}
+
+/*
+ * To check if given numa node is present.
+ */
+static int is_node_present(int node)
+{
+	return numa_bitmask_isbitset(numa_nodes_ptr, node);
+}
+
+/*
+ * To check given numa node has cpus.
+ */
+static bool node_has_cpus(int node)
+{
+	struct bitmask *cpu = numa_allocate_cpumask();
+	unsigned int i;
+
+	if (cpu && !numa_node_to_cpus(node, cpu)) {
+		for (i = 0; i < cpu->size; i++) {
+			if (numa_bitmask_isbitset(cpu, i))
+				return true;
+		}
+	}
+
+	return false; /* lets fall back to nocpus safely */
+}
+
 static cpu_set_t bind_to_cpu(int target_cpu)
 {
 	cpu_set_t orig_mask, mask;
@@ -244,12 +285,12 @@ static cpu_set_t bind_to_cpu(int target_cpu)
 
 static cpu_set_t bind_to_node(int target_node)
 {
-	int cpus_per_node = g->p.nr_cpus/g->p.nr_nodes;
+	int cpus_per_node = g->p.nr_cpus / nr_numa_nodes();
 	cpu_set_t orig_mask, mask;
 	int cpu;
 	int ret;
 
-	BUG_ON(cpus_per_node*g->p.nr_nodes != g->p.nr_cpus);
+	BUG_ON(cpus_per_node * nr_numa_nodes() != g->p.nr_cpus);
 	BUG_ON(!cpus_per_node);
 
 	ret = sched_getaffinity(0, sizeof(orig_mask), &orig_mask);
@@ -649,7 +690,7 @@ static int parse_setup_node_list(void)
 			int i;
 
 			for (i = 0; i < mul; i++) {
-				if (t >= g->p.nr_tasks) {
+				if (t >= g->p.nr_tasks || !node_has_cpus(bind_node)) {
 					printf("\n# NOTE: ignoring bind NODEs starting at NODE#%d\n", bind_node);
 					goto out;
 				}
@@ -964,13 +1005,14 @@ static void calc_convergence(double runtime_ns_max, double *convergence)
 	sum = 0;
 
 	for (node = 0; node < g->p.nr_nodes; node++) {
+		if (!is_node_present(node))
+			continue;
 		nr = nodes[node];
 		nr_min = min(nr, nr_min);
 		nr_max = max(nr, nr_max);
 		sum += nr;
 	}
 	BUG_ON(nr_min > nr_max);
-
 	BUG_ON(sum > g->p.nr_tasks);
 
 	if (0 && (sum < g->p.nr_tasks))
@@ -984,8 +1026,11 @@ static void calc_convergence(double runtime_ns_max, double *convergence)
 	process_groups = 0;
 
 	for (node = 0; node < g->p.nr_nodes; node++) {
-		int processes = count_node_processes(node);
+		int processes;
 
+		if (!is_node_present(node))
+			continue;
+		processes = count_node_processes(node);
 		nr = nodes[node];
 		tprintf(" %2d/%-2d", nr, processes);
 
@@ -1291,7 +1336,7 @@ static void print_summary(void)
 
 	printf("\n ###\n");
 	printf(" # %d %s will execute (on %d nodes, %d CPUs):\n",
-		g->p.nr_tasks, g->p.nr_tasks == 1 ? "task" : "tasks", g->p.nr_nodes, g->p.nr_cpus);
+		g->p.nr_tasks, g->p.nr_tasks == 1 ? "task" : "tasks", nr_numa_nodes(), g->p.nr_cpus);
 	printf(" #  %5dx %5ldMB global  shared mem operations\n",
 			g->p.nr_loops, g->p.bytes_global/1024/1024);
 	printf(" #  %5dx %5ldMB process shared mem operations\n",
-- 
2.7.4
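
The effect of the is_node_present() checks added to calc_convergence() in the patch above can be sketched without libnuma. Everything here is an illustrative assumption: a 64-bit mask stands in for numa_nodes_ptr, and node_present(), summarize_nodes() and struct node_stats are hypothetical names, not perf code.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the libnuma node mask: bit n set means node n exists. */
static bool node_present(uint64_t mask, int node)
{
	return node >= 0 && node < 64 && ((mask >> node) & 1u) != 0;
}

struct node_stats {
	int nr_min; /* fewest tasks seen on any existing node */
	int nr_max; /* most tasks seen on any existing node */
	int sum;    /* total tasks over existing nodes */
};

/* Walk node ids 0..nr_ids-1 and ignore ids with no node behind them. */
static struct node_stats summarize_nodes(const int *tasks_on_node, int nr_ids,
					 uint64_t mask)
{
	struct node_stats s = { INT_MAX, 0, 0 };
	int node;

	assert(nr_ids >= 0 && nr_ids <= 64);
	for (node = 0; node < nr_ids; node++) {
		int nr;

		if (!node_present(mask, node))
			continue; /* skip ids that are holes in a sparse layout */
		nr = tasks_on_node[node];
		if (nr < s.nr_min)
			s.nr_min = nr;
		if (nr > s.nr_max)
			s.nr_max = nr;
		s.sum += nr;
	}
	return s;
}
```

With nodes 0 and 8 present and 21 and 19 tasks on them, the minimum is 19. If the seven missing ids were treated as real nodes, the minimum would drop to 0 and the spread between minimum and maximum would be overstated.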



[PATCH v4 0/1] Fixup for discontiguous/sparse numa nodes

2017-11-21 Thread sathnaga
From: Satheesh Rajendran 

Certain systems have sparse/discontiguous numa nodes.
perf bench numa doesn't work well on such systems:
1. It shows wrong values.
2. It can hang.
3. It can show redundant information for non-existent nodes.

 #numactl -H
available: 2 nodes (0,8)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 61352 MB
node 0 free: 57168 MB
node 8 cpus: 8 9 10 11 12 13 14 15
node 8 size: 65416 MB
node 8 free: 36593 MB
node distances:
node   0   8
  0:  10  40
  8:  40  10

Scenario 1:

Before Fix:
 # perf bench numa mem --no-data_rand_walk -p 2 -t 20 -G 0 -P 3072 -T 0 -l 50 -c -s 1000
...
...
 # 40 tasks will execute (on 9 nodes, 16 CPUs): > Wrong number of nodes
...
 #2.0%  [0.2 mins]  1/1   0/0   0/0   0/0   0/0   0/0   0/0   0/0   4/1  [ 4/2 ] l:  0-0   (  0) > Shows info on non-existent nodes.

After Fix:
 # ./perf bench numa mem --no-data_rand_walk -p 2 -t 20 -G 0 -P 3072 -T 0 -l 50 -c -s 1000
...
...
 # 40 tasks will execute (on 2 nodes, 16 CPUs):
... 
 #2.0%  [0.2 mins]  9/1   0/0  [ 9/1 ] l:  0-0   (  0)
 #4.0%  [0.4 mins] 21/2  19/1  [ 2/3 ] l:  0-1   (  1) {1-2}

Scenario 2:

Before Fix:
 # perf bench numa all
 # Running numa/mem benchmark...

...
 # Running RAM-bw-remote, "perf bench numa mem -p 1 -t 1 -P 1024 -C 0 -M 1 -s 20 -zZq --thp  1 --no-data_rand_walk"
perf: bench/numa.c:306: bind_to_memnode: Assertion `!(ret)' failed. > Got hung

After Fix:
 # ./perf bench numa all
 # Running numa/mem benchmark...

...
 # Running RAM-bw-remote, "perf bench numa mem -p 1 -t 1 -P 1024 -C 0 -M 1 -s 20 -zZq --thp  1 --no-data_rand_walk"

 # NOTE: ignoring bind NODEs starting at NODE#1
 # NOTE: 0 tasks mem-bound, 1 tasks unbound
 20.017 secs slowest (max) thread-runtime
 20.000 secs fastest (min) thread-runtime
 20.006 secs average thread-runtime
  0.043 % difference between max/avg runtime
413.794 GB data processed, per thread
413.794 GB data processed, total
  0.048 nsecs/byte/thread runtime
 20.672 GB/sec/thread speed
 20.672 GB/sec total speed

Changes in v2:
Addressed review comments on function names and allocation failure handling.

Changes in v3:
Coding Style fixes.

Changes in v4:
Address review comments from Naveen and Arnaldo.
Merge two commits into single.


Satheesh Rajendran (1):
  perf/bench/numa: Handle discontiguous/sparse numa nodes

 tools/perf/bench/numa.c | 57 +++--
 1 file changed, 51 insertions(+), 6 deletions(-)

-- 
2.7.4
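
The "9 nodes" versus "2 nodes" difference in Scenario 1 comes down to counting node ids versus counting nodes that exist. The sketch below assumes the old count was the highest node id plus one, which matches the output for nodes 0 and 8. The 64-bit mask and all function names are hypothetical stand-ins, not perf or libnuma code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of node ids to scan: highest set bit plus one (9 for nodes 0 and 8). */
static int nr_node_ids(uint64_t mask)
{
	int ids = 0;

	while (mask) {
		ids++;
		mask >>= 1;
	}
	return ids;
}

/* Number of nodes that really exist: the count of set bits. */
static int nr_present_nodes(uint64_t mask)
{
	int n = 0;

	for (; mask; mask >>= 1)
		n += (int)(mask & 1u);
	return n;
}

/* The even-split condition that bind_to_node() enforces with BUG_ON(). */
static bool cpus_split_evenly(int nr_cpus, int nr_nodes)
{
	int cpus_per_node;

	assert(nr_nodes > 0);
	cpus_per_node = nr_cpus / nr_nodes;
	return cpus_per_node * nr_nodes == nr_cpus;
}
```

On the machine shown by numactl -H, 16 CPUs over 9 ids gives 1 CPU per node and 1 * 9 != 16, so the even-split check fails. Over the 2 existing nodes it gives 8 per node and the check passes. The assertion quoted in Scenario 2 is in bind_to_memnode(), so this only illustrates the arithmetic behind the bind_to_node() change.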



Re: [PATCH] media: mtk-vcodec: add missing MODULE_LICENSE/DESCRIPTION

2017-11-21 Thread kbuild test robot
Hi Jesse,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linuxtv-media/master]
[also build test ERROR on v4.14 next-20171121]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jesse-Chan/media-mtk-vcodec-add-missing-MODULE_LICENSE-DESCRIPTION/20171122-124620
base:   git://linuxtv.org/media_tree.git master
config: xtensa-allmodconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 4.9.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=xtensa 

All errors (new ones prefixed by >>):

>> drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.c:55:16: error: expected declaration specifiers or '...' before string constant
    MODULE_LICENSE("GPL v2");
                   ^
   drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.c:56:20: error: expected declaration specifiers or '...' before string constant
    MODULE_DESCRIPTION("Mediatek video codec driver");
                       ^

vim +55 drivers/media/platform/mtk-vcodec/mtk_vcodec_intr.c

54  
  > 55  MODULE_LICENSE("GPL v2");

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip
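
A common cause of the "expected declaration specifiers" error reported above is that the macro is not defined in the file, typically because <linux/module.h> is not included; the report itself does not confirm that this is what happened here. The userspace sketch below uses a hypothetical stand-in for the macro (the real kernel macro does more) to show how the same line changes meaning once a definition is visible.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for the kernel's MODULE_LICENSE() macro.
 * It only records the string. The real macro comes from <linux/module.h>.
 */
#define MODULE_LICENSE(str) static const char module_license[] = str

/*
 * With the macro visible, this line expands to a variable definition.
 * Without it, the compiler reads the line as a function declaration whose
 * parameter list begins with a string constant, and rejects it.
 */
MODULE_LICENSE("GPL v2");

/* Accessor so callers can read back what the macro stored. */
static const char *module_license_string(void)
{
	assert(module_license[0] != '\0');
	return module_license;
}
```

Removing the #define from this sketch makes a C compiler fail on the MODULE_LICENSE line with a diagnostic of the same kind as the one in the report.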


Re: [PATCH 3.16 105/133] mm/vmstat.c: fix wrong comment

2017-11-21 Thread Vlastimil Babka
On 11/22/2017 02:58 AM, Ben Hutchings wrote:
> 3.16.51-rc1 review patch.  If anyone has any objections, please let me know.

I don't really care much in the end, but is "fix wrong comment" really
stable patch material these days? :)

> --
> 
> From: SeongJae Park 
> 
> commit f113e64121ba9f4791332248b315d9f57ee33a6b upstream.
> 
> Comment for pagetypeinfo_showblockcount() is mistakenly duplicated from
> pagetypeinfo_show_free()'s comment.  This commit fixes it.
> 
> Link: http://lkml.kernel.org/r/20170809185816.11244-1-sj38.p...@gmail.com
> Fixes: 467c996c1e19 ("Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo")
> Signed-off-by: SeongJae Park 
> Cc: Michal Hocko 
> Cc: Vlastimil Babka 
> Signed-off-by: Andrew Morton 
> Signed-off-by: Linus Torvalds 
> Signed-off-by: Ben Hutchings 
> ---
>  mm/vmstat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -975,7 +975,7 @@ static void pagetypeinfo_showblockcount_
>   seq_putc(m, '\n');
>  }
>  
> -/* Print out the free pages at each order for each migratetype */
> +/* Print out the number of pageblocks for each migratetype */
>  static int pagetypeinfo_showblockcount(struct seq_file *m, void *arg)
>  {
>   int mtype;
> 



Re: [PATCH V14 07/24] mmc: block: Use data timeout in card_busy_detect()

2017-11-21 Thread Adrian Hunter
On 21/11/17 17:39, Ulf Hansson wrote:
> On 21 November 2017 at 14:42, Adrian Hunter  wrote:
>> card_busy_detect() has a 10 minute timeout. However the correct timeout is
>> the data timeout. Change card_busy_detect() to use the data timeout.
> 
> Unfortunately, I don't think there is a "correct" timeout for this case.
> 
> The data->timeout_ns tells the host the maximum time it is allowed
> to take between blocks that are written to the data lines.
> 
> I haven't found a definition of the busy timeout, after the data write
> has completed. The spec only mentions that the device moves to
> programming state and pulls DAT0 to indicate busy.

To me it reads more like the timeout is for each block, including the last
i.e. the same timeout for "busy".  Note the card is also busy between blocks.

Equally it is the timeout we give the host controller.  So either the host
controller does not have a timeout for "busy" - which begs the question why
it has a timeout at all - or it invents its own "busy" timeout - which begs
the question why it isn't in the spec.

> 
> Sure, 10 min seems crazy, perhaps something along the lines of 10-20 s
> is more reasonable. What do you think?

We give SD cards a generous 3 seconds for writes.  SDHCI has long had a 10
second software timer for the whole request, which strongly suggests that
requests have always completed within 10 seconds.  So that puts the range of
an arbitrary timeout 3-10 s.
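
The kind of bounded busy wait being discussed can be sketched as below. This does not mirror card_busy_detect(): busy_fn, wait_while_busy(), struct fake_card and fake_busy() are hypothetical names, and time is counted in poll intervals rather than read from a clock so the behaviour stays deterministic. The 3 second budget in the usage note is the figure mentioned above for SD card writes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical callback: returns true while the card still signals busy. */
typedef bool (*busy_fn)(void *ctx);

/*
 * Poll until the device stops being busy or the time budget runs out.
 * Returns 0 on success and -ETIMEDOUT once timeout_ms has been used up.
 */
static int wait_while_busy(busy_fn is_busy, void *ctx,
			   unsigned int timeout_ms, unsigned int poll_ms)
{
	unsigned int waited = 0;

	assert(poll_ms > 0);
	while (is_busy(ctx)) {
		if (waited >= timeout_ms)
			return -ETIMEDOUT;
		waited += poll_ms; /* a real loop would sleep for poll_ms here */
	}
	return 0;
}

/* Test double: reports busy for a fixed number of polls. */
struct fake_card {
	unsigned int busy_polls_left;
};

static bool fake_busy(void *ctx)
{
	struct fake_card *card = ctx;

	if (card->busy_polls_left == 0)
		return false;
	card->busy_polls_left--;
	return true;
}
```

With a 3000 ms budget and a 100 ms poll interval, a card that is busy for 5 polls completes normally, while one that stays busy for 1000 polls is reported as timed out after 30 intervals. Whichever value in the 3-10 s range is chosen, only the timeout_ms argument changes.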


WARNING: can't dereference registers at ffffc90004dfff60 for ip error_entry+0x7d/0xd0 (Re: [PATCH v2 00/18] Entry stack switching)

2017-11-21 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> This sets up stack switching, including for SYSCALL.  I think it's
> in decent shape.
> 
> Known issues:
>  - I think we're going to want a way to turn the stack switching on and
>    off either at boot time or at runtime.  It should be fairly straightforward
>    to make it work.
> 
>  - I think the ORC unwinder isn't so good at dealing with stack overflows.
>    It bails too early (I think), resulting in lots of ? entries.  This
>    isn't a regression with this series -- it's just something that could
>    be improved.

Note that with the attached config on an Intel testbox I get the following new ORC
unwinder warning during bootup:

[   12.200554] calling  ghash_pclmulqdqni_mod_init+0x0/0x54 @ 1
[   12.209536] WARNING: can't dereference registers at ffffc90004dfff60 for ip error_entry+0x7d/0xd0
[   12.231388] initcall ghash_pclmulqdqni_mod_init+0x0/0x54 returned 0 after 23480 usecs

Thanks,

Ingo
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.14.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
# CONFIG_TICK_CPU_ACCOUNTING is not set
CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
# CONFIG_CPU_ISOLATION is not set

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_CONTEXT_TRACKING=y
CONFIG_CONTEXT_TRACKING_FORCE=y
CONFIG_BUILD_BIN2C=y
CONFIG_IKCONFIG=m
# CONFIG_IKCONFIG_PROC is not set
CONFIG_LOG_BUF_SHIFT=18
CONFIG_LOG_CPU_MAX_BUF_SHIFT=12
CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13
CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y
CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y

Re: [PATCH 2/5] media: dt-bindings: Add bindings for TDA1997X

2017-11-21 Thread Sakari Ailus
Hi Tim,

On Thu, Nov 09, 2017 at 10:45:33AM -0800, Tim Harvey wrote:
> Cc: Rob Herring 
> Signed-off-by: Tim Harvey 
> ---
> v3:
>  - fix typo
> 
> v2:
>  - add vendor prefix and remove _ from vidout-portcfg
>  - remove _ from labels
>  - remove max-pixel-rate property
>  - describe and provide example for single output port
>  - update to new audio port bindings
> ---
>  .../devicetree/bindings/media/i2c/tda1997x.txt | 179 +
>  1 file changed, 179 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/media/i2c/tda1997x.txt
> 
> diff --git a/Documentation/devicetree/bindings/media/i2c/tda1997x.txt b/Documentation/devicetree/bindings/media/i2c/tda1997x.txt
> new file mode 100644
> index 000..dd37f14
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/media/i2c/tda1997x.txt
> @@ -0,0 +1,179 @@
> +Device-Tree bindings for the NXP TDA1997x HDMI receiver
> +
> +The TDA19971/73 are HDMI video receivers.
> +
> +The TDA19971 Video port output pins can be used as follows:
> + - RGB 8bit per color (24 bits total): R[11:4] B[11:4] G[11:4]
> + - YUV444 8bit per color (24 bits total): Y[11:4] Cr[11:4] Cb[11:4]
> + - YUV422 semi-planar 8bit per component (16 bits total): Y[11:4] CbCr[11:4]
> + - YUV422 semi-planar 10bit per component (20 bits total): Y[11:2] CbCr[11:2]
> + - YUV422 semi-planar 12bit per component (24 bits total): - Y[11:0] CbCr[11:0]
> + - YUV422 BT656 8bit per component (8 bits total): YCbCr[11:4] (2-cycles)
> + - YUV422 BT656 10bit per component (10 bits total): YCbCr[11:2] (2-cycles)
> + - YUV422 BT656 12bit per component (12 bits total): YCbCr[11:0] (2-cycles)
> +
> +The TDA19973 Video port output pins can be used as follows:
> + - RGB 12bit per color (36 bits total): R[11:0] B[11:0] G[11:0]
> + - YUV444 12bit per color (36 bits total): Y[11:0] Cb[11:0] Cr[11:0]
> + - YUV422 semi-planar 12bit per component (24 bits total): Y[11:0] CbCr[11:0]
> + - YUV422 BT656 12bit per component (12 bits total): YCbCr[11:0] (2-cycles)
> +
> +The Video port output pins are mapped via 4-bit 'pin groups' allowing
> +for a variety of connection possibilities including swapping pin order within
> +pin groups. The video_portcfg device-tree property consists of register mapping
> +pairs which map a chip-specific VP output register to a 4-bit pin group. If
> +the pin group needs to be bit-swapped you can use the *_S pin-group defines.
> +
> +Required Properties:
> + - compatible  :
> +  - "nxp,tda19971" for the TDA19971
> +  - "nxp,tda19973" for the TDA19973
> + - reg : I2C slave address
> + - interrupts  : The interrupt number
> + - DOVDD-supply: Digital I/O supply
> + - DVDD-supply : Digital Core supply
> + - AVDD-supply : Analog supply
> + - nxp,vidout-portcfg  : array of pairs mapping VP output pins to pin groups.
> +
> +Optional Properties:
> + - nxp,audout-format   : DAI bus format: "i2s" or "spdif".
> + - nxp,audout-width: width of audio output data bus (1-4).
> + - nxp,audout-layout   : data layout (0=AP0 used, 1=AP0/AP1/AP2/AP3 used).
> + - nxp,audout-mclk-fs  : Multiplication factor between stream rate and codec
> + mclk.
> +
> +The device node must contain one 'port' child node for its digital output
> +video port, in accordance with the video interface bindings defined in
> +Documentation/devicetree/bindings/media/video-interfaces.txt.

Could you add that this port has one endpoint node as well? (Unless you
support multiple, that is.)

> +
> +Optional Endpoint Properties:
> +  The following three properties are defined in video-interfaces.txt and
> +  are valid for source endpoints only:

Transmitters? Don't you have an endpoint only in the port representing the
transmitter?

> +  - hsync-active: Horizontal synchronization polarity. Defaults to active high.
> +  - vsync-active: Vertical synchronization polarity. Defaults to active high.
> +  - data-active: Data polarity. Defaults to active high.
> +
> +Examples:
> + - VP[15:0] connected to IMX6 CSI_DATA[19:4] for 16bit YUV422
> +   16bit I2S layout0 with a 128*fs clock (A_WS, AP0, A_CLK pins)
> + hdmi-receiver@48 {
> + compatible = "nxp,tda19971";
> + pinctrl-names = "default";
> + pinctrl-0 = <_tda1997x>;
> + reg = <0x48>;
> + interrupt-parent = <>;
> + interrupts = <7 IRQ_TYPE_LEVEL_LOW>;
> + DOVDD-supply = <_3p3v>;
> + AVDD-supply = <_1p8v>;
> + DVDD-supply = <_1p8v>;
> + /* audio */
> + #sound-dai-cells = <0>;
> + nxp,audout-format = "i2s";
> + nxp,audout-layout = <0>;
> + nxp,audout-width = <16>;
> + nxp,audout-mclk-fs = <128>;
> + /*
> +  * The 8bpp YUV422 semi-planar mode outputs CbCr[11:4]
> +  * and Y[11:4] across 16bits in the same pixclk cycle.
> +  */
> + 

[rcu:rcu/dev 62/62] kernel/rcu/rcuperf.c:649:2: note: in expansion of macro 'if'

2017-11-21 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/dev
head:   b151f93a71fc9fecb560e823a92402d882516483
commit: b151f93a71fc9fecb560e823a92402d882516483 [62/62] torture: Eliminate torture_runnable
config: i386-randconfig-x008-201747 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
git checkout b151f93a71fc9fecb560e823a92402d882516483
# save the attached .config to linux build tree
make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from kernel/rcu/rcuperf.c:22:
   kernel/rcu/rcuperf.c: In function 'rcu_perf_init':
   kernel/rcu/rcuperf.c:649:7: error: too many arguments to function 'torture_init_begin'
 if (!torture_init_begin(perf_type, verbose, _runnable))
  ^
   include/linux/compiler.h:156:30: note: in definition of macro '__trace_if'
 if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
 ^~~~
>> kernel/rcu/rcuperf.c:649:2: note: in expansion of macro 'if'
 if (!torture_init_begin(perf_type, verbose, _runnable))
 ^~
   In file included from kernel/rcu/rcuperf.c:48:0:
   include/linux/torture.h:82:6: note: declared here
bool torture_init_begin(char *ttype, bool v);
 ^~
   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from kernel/rcu/rcuperf.c:22:
   kernel/rcu/rcuperf.c:649:7: error: too many arguments to function 'torture_init_begin'
 if (!torture_init_begin(perf_type, verbose, _runnable))
  ^
   include/linux/compiler.h:156:42: note: in definition of macro '__trace_if'
 if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
 ^~~~
>> kernel/rcu/rcuperf.c:649:2: note: in expansion of macro 'if'
 if (!torture_init_begin(perf_type, verbose, _runnable))
 ^~
   In file included from kernel/rcu/rcuperf.c:48:0:
   include/linux/torture.h:82:6: note: declared here
bool torture_init_begin(char *ttype, bool v);
 ^~
   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from kernel/rcu/rcuperf.c:22:
   kernel/rcu/rcuperf.c:649:7: error: too many arguments to function 'torture_init_begin'
 if (!torture_init_begin(perf_type, verbose, _runnable))
  ^
   include/linux/compiler.h:167:16: note: in definition of macro '__trace_if'
  __r = !!(cond); \
   ^~~~
>> kernel/rcu/rcuperf.c:649:2: note: in expansion of macro 'if'
 if (!torture_init_begin(perf_type, verbose, _runnable))
 ^~
   In file included from kernel/rcu/rcuperf.c:48:0:
   include/linux/torture.h:82:6: note: declared here
bool torture_init_begin(char *ttype, bool v);
 ^~
   In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from kernel/rcu/rcuperf.c:22:
   kernel/rcu/rcuperf.c: At top level:
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in inline function 'strcpy' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:421:2: note: in expansion of macro 'if'
 if (p_size == (size_t)-1 && q_size == (size_t)-1)
 ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in inline function 'kmemdup' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:411:2: note: in expansion of macro 'if'
 if (p_size < size)
 ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in inline function 'kmemdup' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro 
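
The same "too many arguments" error shows up three times in this log because `if` is itself a macro here (`include/linux/compiler.h:154`), and the notes point at three places where `cond` is pasted into `__trace_if` (156:30, 156:42 and 167:16). A minimal sketch of that effect, assuming a hypothetical `TRACE_IF` macro and `branch_hits` counter that are far simpler than the kernel's and only illustrate the repeated expansion:

```c
#include <stdbool.h>

int branch_hits;   /* hypothetical profiling counter, not a kernel symbol */

/*
 * cond is pasted twice into the expansion: once inside sizeof, where it
 * is type-checked but never evaluated at run time, and once in the real
 * test. A compile error inside cond is therefore diagnosed at each paste,
 * even though the condition runs only once.
 */
#define TRACE_IF(cond) \
	if (sizeof(!!(cond)) && (branch_hits++, !!(cond)))

bool is_positive(int x)
{
	TRACE_IF(x > 0)
		return true;
	return false;
}
```

Calling `is_positive()` twice leaves `branch_hits` at 2, so the duplicated text costs nothing at run time; it only multiplies compile-time diagnostics.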

Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)

2017-11-21 Thread Christoph Hellwig
Jens, please don't just revert the commit in your for-linus tree.

On its own this will totally mess up the interrupt assignments.  Give
me a bit of time to sort this out properly.


Re: [PATCH v2] powerpc: fix boot on BOOK3S_32 with CONFIG_STRICT_KERNEL_RWX

2017-11-21 Thread Christophe LEROY



On 22/11/2017 at 00:07, Balbir Singh wrote:
> On Wed, Nov 22, 2017 at 1:28 AM, Christophe Leroy wrote:
>> On powerpc32, patch_instruction() is called by apply_feature_fixups()
>> which is called from early_init()
>>
>> There is the following note in front of early_init():
>>   * Note that the kernel may be running at an address which is different
>>   * from the address that it was linked at, so we must use RELOC/PTRRELOC
>>   * to access static data (including strings).  -- paulus
>>
>> Therefore, slab_is_available() cannot be called yet, and
>> text_poke_area must be addressed with PTRRELOC()
>>
>> Fixes: 37bc3e5fd764f ("powerpc/lib/code-patching: Use alternate map for patch_instruction()")
>> Reported-by: Meelis Roos
>> Cc: Balbir Singh
>> Signed-off-by: Christophe Leroy
>> ---
>>   v2: Added missing asm/setup.h
>>
>>   arch/powerpc/lib/code-patching.c | 6 ++
>>   1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
>> index c9de03e0c1f1..d469224c4ada 100644
>> --- a/arch/powerpc/lib/code-patching.c
>> +++ b/arch/powerpc/lib/code-patching.c
>> @@ -21,6 +21,7 @@
>>   #include 
>>   #include 
>>   #include 
>> +#include 
>>
>>   static int __patch_instruction(unsigned int *addr, unsigned int instr)
>>   {
>> @@ -146,11 +147,8 @@ int patch_instruction(unsigned int *addr, unsigned int instr)
>>   * During early early boot patch_instruction is called
>>   * when text_poke_area is not ready, but we still need
>>   * to allow patching. We just do the plain old patching
>> -* We use slab_is_available and per cpu read * via this_cpu_read
>> -* of text_poke_area. Per-CPU areas might not be up early
>> -* this can create problems with just using this_cpu_read()
>>   */
>> -   if (!slab_is_available() || !this_cpu_read(text_poke_area))
>> +   if (!this_cpu_read(*PTRRELOC(_poke_area)))
>>  return __patch_instruction(addr, instr);
>>
>
> On ppc64, we call apply_feature_fixups() in early_setup() after we've
> relocated ourselves. Sorry for missing the ppc32 case. I would like to
> avoid PTRRELOC when unnecessary.

What do you suggest then? Some #ifdef PPC32 around that?

Christophe




Balbir Singh.
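
For readers following the `#ifdef` question, here is one possible reading of it as a user-space sketch. It is not taken from the thread or from the kernel: `SIM_PPC32`, `PTRRELOC_SIM`, `reloc_offset_sim`, `text_poke_area_sim` and `struct vm_struct_sim` are all hypothetical stand-ins, and relocation is modelled as an index offset into a small array rather than a real address adjustment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU area type. */
struct vm_struct_sim { int unused; };

/* Slot 0 models the link-time location, slot 1 the run-time location. */
static struct vm_struct_sim *text_poke_area_sim[2];

/* Offset between link-time and run-time location, in array slots. */
ptrdiff_t reloc_offset_sim;

/* Simulated relocation: shift a link-time pointer to where the data is. */
#define PTRRELOC_SIM(p) ((p) + reloc_offset_sim)

bool text_poke_area_ready(void)
{
#ifdef SIM_PPC32
	/* Early 32-bit path: adjust the link-time address before reading. */
	return *PTRRELOC_SIM(&text_poke_area_sim[0]) != NULL;
#else
	/* Caller has already relocated: read the variable directly. */
	return text_poke_area_sim[0] != NULL;
#endif
}
```

The guard keeps the relocation adjustment out of builds that do not need it, which is the trade-off the two messages above are weighing.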



[rcu:rcu/dev 62/62] kernel/rcu/rcuperf.c:649:7: error: too many arguments to function 'torture_init_begin'

2017-11-21 Thread kbuild test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/dev
head:   b151f93a71fc9fecb560e823a92402d882516483
commit: b151f93a71fc9fecb560e823a92402d882516483 [62/62] torture: Eliminate torture_runnable
config: i386-randconfig-x001-201747 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
git checkout b151f93a71fc9fecb560e823a92402d882516483
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   kernel/rcu/rcuperf.c: In function 'rcu_perf_init':
>> kernel/rcu/rcuperf.c:649:7: error: too many arguments to function 'torture_init_begin'
 if (!torture_init_begin(perf_type, verbose, _runnable))
  ^~
   In file included from kernel/rcu/rcuperf.c:48:0:
   include/linux/torture.h:82:6: note: declared here
bool torture_init_begin(char *ttype, bool v);
 ^~
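
The error is an arity mismatch: the call at `kernel/rcu/rcuperf.c:649` passes three arguments, while the declaration quoted from `include/linux/torture.h:82` takes two. A minimal user-space sketch of a call that matches that declaration, assuming a stub body for `torture_init_begin()` and placeholder values for `perf_type` and `verbose` (the stub and `rcu_perf_init_sketch()` are hypothetical, not the kernel code):

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Same prototype as the declaration quoted from include/linux/torture.h:82. */
bool torture_init_begin(char *ttype, bool v);

/* Hypothetical stub body, present only so the sketch compiles and links. */
bool torture_init_begin(char *ttype, bool v)
{
	if (v)
		printf("%s: init begin\n", ttype);
	return true;
}

char perf_type[] = "rcu";   /* placeholder value */
bool verbose = true;        /* placeholder value */

/* Mirrors the shape of the failing check, with only two arguments passed. */
int rcu_perf_init_sketch(void)
{
	if (!torture_init_begin(perf_type, verbose))
		return -EBUSY;
	return 0;
}
```

Whether the caller or the prototype should change is a decision for the commit under test; the sketch only shows the two agreeing.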

vim +/torture_init_begin +649 kernel/rcu/rcuperf.c

8704baab9 Paul E. McKenney 2015-12-31  638  
8704baab9 Paul E. McKenney 2015-12-31  639  static int __init
8704baab9 Paul E. McKenney 2015-12-31  640  rcu_perf_init(void)
8704baab9 Paul E. McKenney 2015-12-31  641  {
8704baab9 Paul E. McKenney 2015-12-31  642  	long i;
8704baab9 Paul E. McKenney 2015-12-31  643  	int firsterr = 0;
8704baab9 Paul E. McKenney 2015-12-31  644  	static struct rcu_perf_ops *perf_ops[] = {
f60cb4d4c Paul E. McKenney 2017-04-19  645  		_ops, _bh_ops, _ops, _ops, _ops,
f1dbc54b9 Paul E. McKenney 2017-05-25  646  		_ops,
8704baab9 Paul E. McKenney 2015-12-31  647  	};
8704baab9 Paul E. McKenney 2015-12-31  648  
8704baab9 Paul E. McKenney 2015-12-31 @649  	if (!torture_init_begin(perf_type, verbose, _runnable))
8704baab9 Paul E. McKenney 2015-12-31  650  		return -EBUSY;
8704baab9 Paul E. McKenney 2015-12-31  651  
8704baab9 Paul E. McKenney 2015-12-31  652  	/* Process args and tell the world that the perf'er is on the job. */
8704baab9 Paul E. McKenney 2015-12-31  653  	for (i = 0; i < ARRAY_SIZE(perf_ops); i++) {
8704baab9 Paul E. McKenney 2015-12-31  654  		cur_ops = perf_ops[i];
8704baab9 Paul E. McKenney 2015-12-31  655  		if (strcmp(perf_type, cur_ops->name) == 0)
8704baab9 Paul E. McKenney 2015-12-31  656  			break;
8704baab9 Paul E. McKenney 2015-12-31  657  	}
8704baab9 Paul E. McKenney 2015-12-31  658  	if (i == ARRAY_SIZE(perf_ops)) {
8704baab9 Paul E. McKenney 2015-12-31  659  		pr_alert("rcu-perf: invalid perf type: \"%s\"\n",
8704baab9 Paul E. McKenney 2015-12-31  660  			 perf_type);
8704baab9 Paul E. McKenney 2015-12-31  661  		pr_alert("rcu-perf types:");
8704baab9 Paul E. McKenney 2015-12-31  662  		for (i = 0; i < ARRAY_SIZE(perf_ops); i++)
8704baab9 Paul E. McKenney 2015-12-31  663  			pr_alert(" %s", perf_ops[i]->name);
8704baab9 Paul E. McKenney 2015-12-31  664  		pr_alert("\n");
8704baab9 Paul E. McKenney 2015-12-31  665  		firsterr = -EINVAL;
8704baab9 Paul E. McKenney 2015-12-31  666  		goto unwind;
8704baab9 Paul E. McKenney 2015-12-31  667  	}
8704baab9 Paul E. McKenney 2015-12-31  668  	if (cur_ops->init)
8704baab9 Paul E. McKenney 2015-12-31  669  		cur_ops->init();
8704baab9 Paul E. McKenney 2015-12-31  670  
8704baab9 Paul E. McKenney 2015-12-31  671  	nrealwriters = compute_real(nwriters);
8704baab9 Paul E. McKenney 2015-12-31  672  	nrealreaders = compute_real(nreaders);
8704baab9 Paul E. McKenney 2015-12-31  673  	atomic_set(_rcu_perf_reader_started, 0);
8704baab9 Paul E. McKenney 2015-12-31  674  	atomic_set(_rcu_perf_writer_started, 0);
8704baab9 Paul E. McKenney 2015-12-31  675  	atomic_set(_rcu_perf_writer_finished, 0);
8704baab9 Paul E. McKenney 2015-12-31  676  	rcu_perf_print_module_parms(cur_ops, "Start of test");
8704baab9 Paul E. McKenney 2015-12-31  677  
8704baab9 Paul E. McKenney 2015-12-31  678  	/* Start up the kthreads. */
8704baab9 Paul E. McKenney 2015-12-31  679  
8704baab9 Paul E. McKenney 2015-12-31  680  	if (shutdown) {
8704baab9 Paul E. McKenney 2015-12-31  681  		init_waitqueue_head(_wq);
8704baab9 Paul E. McKenney 2015-12-31  682  		firsterr = torture_create_kthread(rcu_perf_shutdown, NULL,
8704baab9 Paul E. McKenney 2015-12-31  683  						  shutdown_task);
8704baab9 Paul E. McKenney 2015-12-31  684  		if (firsterr)
8704baab9 Paul E. McKenney 2015-12-31  685  			goto unwind;
8704baab9 Paul E. McKenney 2015-12-31  686  		schedule_timeout_uninterruptible(1);
8704baab9 Paul E. McKenney 2015-12-31  687  	}
8704baab9 Paul E. McKenney 2015-12-31  688  	reader_tasks = kcalloc(nrealreaders, sizeof(reader_tasks[0]),
8704baab9 Paul E. McKenney 2015-12-31  689  			       GFP_KERNEL);
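
The error reported for line 649 is a prototype/call-site mismatch: the header at the tested commit declares torture_init_begin() with two parameters, while the call in rcu_perf_init() still passes three. The following is a minimal userspace sketch of a call that matches the quoted prototype. The stub body of torture_init_begin() and the helper perf_init_sketch() are assumptions made for illustration, not kernel code, and dropping the third argument is only one plausible way to resolve the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Prototype as quoted in the compiler note (include/linux/torture.h:82). */
bool torture_init_begin(char *ttype, bool v);

/* Stub body for illustration only; the kernel function does more. */
bool torture_init_begin(char *ttype, bool v)
{
	(void)v;
	return ttype != NULL;
}

/* Hypothetical stand-in for the failing call site in rcu_perf_init(). */
int perf_init_sketch(char *perf_type, bool verbose)
{
	assert(perf_type != NULL);
	/* Two arguments, matching the prototype; the third one is what gcc rejected. */
	if (!torture_init_begin(perf_type, verbose))
		return -16; /* -EBUSY in the kernel */
	return 0;
}
```

With the stub above, perf_init_sketch("rcu", true) returns 0; passing a third argument to torture_init_begin() reproduces the "too many arguments" diagnostic.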

Re: [PATCH v2] staging: comedi: add missing MODULE_DESCRIPTION/LICENSE

2017-11-21 Thread Matthew Giassa

* Ian Abbott  [2017-11-20 10:46:36 +]:


On 20/11/17 10:29, Ian Abbott wrote:

On 20/11/17 07:50, Jesse Chan wrote:

This change resolves a new compile-time warning
when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in 
drivers/staging/comedi/drivers/ni_atmio.o

see include/linux/module.h for more information

This adds the license as "GPL", which matches the header of the file.

MODULE_DESCRIPTION is also added.

Signed-off-by: Jesse Chan 
---
  drivers/staging/comedi/drivers/ni_atmio.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/drivers/staging/comedi/drivers/ni_atmio.c 
b/drivers/staging/comedi/drivers/ni_atmio.c

index 2d62a8c57332..b61d56367773 100644
--- a/drivers/staging/comedi/drivers/ni_atmio.c
+++ b/drivers/staging/comedi/drivers/ni_atmio.c
@@ -361,3 +361,6 @@ static struct comedi_driver ni_atmio_driver = {
  .detach    = ni_atmio_detach,
  };
  module_comedi_driver(ni_atmio_driver);
+
+MODULE_DESCRIPTION("Comedi low-level driver");
+MODULE_LICENSE("GPL");


Thanks!  I wonder how I managed to miss out this driver in commit 
3c323c01b6bd ("Staging: comedi: Add MODULE_LICENSE and similar to NI 
modules")?


Reviewed-by: Ian Abbott 


Despite my above comment, we should probably give precedence to 
Matthew Giassa's patch for the same issue, since it was sent earlier.


--
-=( Ian Abbott @ MEV Ltd.E-mail:  )=-
-=(  Web: http://www.mev.co.uk/  )=-


--

Thanks. Also, this one should probably include the MODULE_AUTHOR macro
as well.

Cheers!


Re: [PATCH] scsi/eh: fix hang adding ehandler wakeups after decrementing host_busy

2017-11-21 Thread Pavel Tikhomirov

Great news that it works for you!

Thanks a lot!
Pavel

On 11/22/2017 03:49 AM, Stuart Hayes wrote:

My apologies... yes, your patch also fixes my issue.  I was looking at the two 
new places from which you were calling scsi_eh_wakeup(), and didn't notice that 
you moved the spinlock in scsi_device_unbusy()... moving the spinlock in 
scsi_device_unbusy() should also fix the issue I'm seeing, given that 
scsi_eh_scmd_add() also uses the spinlock.

I tested your patch on my issue, and it did indeed fix my issue.

So you can add...

Tested-by: Stuart Hayes 

Thanks
Stuart


On 11/21/2017 2:09 AM, Pavel Tikhomirov wrote:

My patch should fix your issue too; please see the explanation in my reply to 
your patch. Does your testing show that it doesn't?

Thanks, Pavel.

On 11/21/2017 09:10 AM, Stuart Hayes wrote:

Pavel,

It turns out that the error handler on our systems was not getting woken up for 
a different reason... I submitted a patch earlier today that fixes the issue we 
were seeing (I CCed you on the patch).

Before I got my hands on the failing system and was able to root cause it, I 
was pretty sure that your patch was going to fix our issue, because after I 
examined the code paths, I couldn't find any other reason that the error 
handler would not get woken up.  I tried forcing the bug that your patch fixes 
to occur, by compiling in some mdelay()s in a key place or two in the scsi 
code, but it never failed for me that way.  With my patch, several systems that 
previously failed in 10 minutes or less successfully ran for many days.

Thanks,
Stuart

On 11/9/2017 8:54 AM, Pavel Tikhomirov wrote:

Are there any issues with this patch 
(https://patchwork.kernel.org/patch/9938919/) that Pavel Tikhomirov submitted 
back in September?  I am willing to help if there's anything I can do to help 
get it accepted.


Hi Stuart, I asked James Bottomley about the patch status off-list, and it seems 
that the problem is that the patch lacks testing and review. I would highly 
appreciate a review from you and from anyone else who wants to participate!

And if you can confirm that the patch solves the problem in your environment 
with no side effects, please also add a "Tested-by:" tag.

Thanks, Pavel

On 09/05/2017 03:54 PM, Pavel Tikhomirov wrote:

We have a problem with SCSI EH on several of our nodes. Imagine the following
order of execution of two threads:

CPU1 scsi_eh_scmd_add    CPU2 scsi_host_queue_ready
/* shost->host_busy == 1 initially */

  if (shost->shost_state == SHOST_RECOVERY)
  /* does not get here */
  return 0;

lock(shost->host_lock);
shost->shost_state = SHOST_RECOVERY;

  busy = shost->host_busy++;
  /* host->can_queue == 1 initially, busy == 1
   * - go to starved label */
  lock(shost->host_lock) /* wait */

shost->host_failed++;
/* shost->host_busy == 2, shost->host_failed == 1 */
call scsi_eh_wakeup(shost) {
  if (host_busy == host_failed) {
  /* does not get here */
  wake_up_process(shost->ehandler)
  }
}
unlock(shost->host_lock)

  /* acquire lock */
  shost->host_busy--;

As a result, scsi_error_handler is never woken up, and all subsequent
commands hang: the host stays in the recovery state forever because
nobody is left to wake the handler.

So the SCSI disk on such a host becomes unresponsive and all bios on the node
hang. (We trigger this problem when SCSI commands to a DVD drive time out.)

The main idea of the fix is to try to do a wakeup every time we decrement
host_busy or increment host_failed (the latter is already OK).

Now the very *last* of the busy threads to take host_lock after
decrementing host_busy will see all writes to the host's
shost_state, host_busy and host_failed completed, thanks to the implied
memory barriers of spin_lock/unlock, so by the time busy == failed
at least one thread will trigger the wakeup. (That is why the
recovery and failed checks are now done under the lock.)

Signed-off-by: Pavel Tikhomirov 
---
    drivers/scsi/scsi_lib.c | 21 +
    1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index f6097b89d5d3..6c99221d60aa 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -320,12 +320,11 @@ void scsi_device_unbusy(struct scsi_device *sdev)
    if (starget->can_queue > 0)
    atomic_dec(>target_busy);
    +    spin_lock_irqsave(shost->host_lock, flags);
    if (unlikely(scsi_host_in_recovery(shost) &&
- (shost->host_failed || shost->host_eh_scheduled))) {
-    spin_lock_irqsave(shost->host_lock, flags);
+ (shost->host_failed || shost->host_eh_scheduled)))
    scsi_eh_wakeup(shost);
-    spin_unlock_irqrestore(shost->host_lock, flags);
-    }
+    spin_unlock_irqrestore(shost->host_lock, flags);
      atomic_dec(>device_busy);
    }
@@ -1503,6 +1502,13 @@ static inline int 
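
The interleaving described in the commit message above can be replayed step by step in a small single-threaded model. This is only a sketch under stated assumptions: struct host_model, eh_wakeup() and replay_race() are hypothetical names, the model has no real locking or threads, and it merely reproduces the counter arithmetic from the description rather than the actual scsi_lib.c code.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the host fields named in the commit message. */
struct host_model {
	int host_busy;
	int host_failed;
	bool in_recovery;
	bool ehandler_woken;
};

/* Mirrors the quoted check: wake only when every busy command has failed. */
static void eh_wakeup(struct host_model *h)
{
	if (h->host_busy == h->host_failed)
		h->ehandler_woken = true;
}

/*
 * Replays the interleaving from the commit message in program order.
 * wake_on_unbusy selects whether the decrement path re-checks afterwards,
 * which is the behaviour the patch description asks for.
 */
bool replay_race(bool wake_on_unbusy)
{
	struct host_model h = { .host_busy = 1 };

	/* CPU2 already passed the recovery check before CPU1 set the state. */
	h.in_recovery = true;   /* CPU1: shost_state = SHOST_RECOVERY */
	h.host_busy++;          /* CPU2: busy is now 2 */
	h.host_failed++;        /* CPU1: failed is now 1 */
	eh_wakeup(&h);          /* CPU1: 2 != 1, so no wakeup */
	assert(!h.ehandler_woken);

	h.host_busy--;          /* CPU2: backs off, busy is 1 again */
	if (wake_on_unbusy && h.in_recovery && h.host_failed)
		eh_wakeup(&h);  /* patched path: 1 == 1, wakeup happens */

	return h.ehandler_woken;
}
```

In this model replay_race(false) returns false (the handler is never woken, matching the reported hang) and replay_race(true) returns true.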

Re: [PATCH v1 3/9] perf util: Reconstruct rblist for supporting per-thread shadow stats

2017-11-21 Thread Jin, Yao



On 11/22/2017 2:31 PM, Ravi Bangoria wrote:


On 11/20/2017 08:13 PM, Jin Yao wrote:
@@ -76,6 +97,17 @@ static struct rb_node *saved_value_new(struct 
rblist *rblist __maybe_unused,

  return >rb_node;
  }

+static void saved_value_delete(struct rblist *rblist __maybe_unused,
+   struct rb_node *rb_node)
+{
+    struct saved_value *v = container_of(rb_node,
+ struct saved_value,
+ rb_node);
+
+    if (v)
+    free(v);
+}


Do we really need if(v) ?

Thanks,
Ravi



Hi Ravi,

It looks like it doesn't need if(v).

I put if(v) there out of coding habit (checking a pointer before freeing it).

It's fine with me if you think the check should be removed.

Thanks
Jin Yao
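
For reference, the point about if(v) can be shown in a small userspace sketch. The C standard defines free(NULL) as a no-op, and container_of() only subtracts a member offset from the node pointer, so testing the derived pointer adds nothing once the node itself is valid. The struct layout, the payload field and saved_value_alloc() below are stand-ins assumed for illustration; they are not the perf definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for the rbtree node; only its embedding matters here. */
struct rb_node { struct rb_node *left, *right; };

/* Hypothetical layout: the node is deliberately not the first member. */
struct saved_value {
	int payload;
	struct rb_node rb_node;
};

/* Classic container_of: step back from the member to the enclosing struct. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct saved_value *saved_value_alloc(void)
{
	return calloc(1, sizeof(struct saved_value));
}

/* Same shape as the quoted callback, without the if (v) guard. */
void saved_value_delete(struct rb_node *rb_node)
{
	struct saved_value *v;

	assert(rb_node != NULL);
	v = container_of(rb_node, struct saved_value, rb_node);
	free(v); /* free(NULL) would be a no-op anyway */
}
```

A caller allocates with saved_value_alloc() and later passes &v->rb_node to saved_value_delete(), which recovers and frees the enclosing object.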




RE: [PATCH v4 2/4] tpm: ignore burstcount to improve tpm_tis send() performance

2017-11-21 Thread Alexander.Steffen
> > > On 10/20/2017 08:12 PM, alexander.stef...@infineon.com wrote:
> > > >> The TPM burstcount status indicates the number of bytes that can
> > > >> be sent to the TPM without causing bus wait states.  Effectively,
> > > >> it is the number of empty bytes in the command FIFO.
> > > >>
> > > >> This patch optimizes the tpm_tis_send_data() function by checking
> > > >> the burstcount only once. And if the burstcount is valid, it writes
> > > >> all the bytes at once, permitting wait state.
> > > >>
> > > >> After this change, performance on a TPM 1.2 with an 8 byte
> > > >> burstcount for 1000 extends improved from ~41sec to ~14sec.
> > > >>
> > > >> Suggested-by: Ken Goldman  in
> > > >> conjunction with the TPM Device Driver work group.
> > > >> Signed-off-by: Nayna Jain
> > > >> Acked-by: Mimi Zohar
> > > >> ---
> > > >>   drivers/char/tpm/tpm_tis_core.c | 42 +++--
> --
> > --
> > > 
> > > >> 
> > > >>   1 file changed, 15 insertions(+), 27 deletions(-)
> > > >>
> > > >> diff --git a/drivers/char/tpm/tpm_tis_core.c
> > > >> b/drivers/char/tpm/tpm_tis_core.c
> > > >> index b33126a35694..993328ae988c 100644
> > > >> --- a/drivers/char/tpm/tpm_tis_core.c
> > > >> +++ b/drivers/char/tpm/tpm_tis_core.c
> > > >> @@ -316,7 +316,6 @@ static int tpm_tis_send_data(struct tpm_chip
> > > *chip,
> > > >> u8 *buf, size_t len)
> > > >>   {
> > > >>struct tpm_tis_data *priv = dev_get_drvdata(>dev);
> > > >>int rc, status, burstcnt;
> > > >> -  size_t count = 0;
> > > >>bool itpm = priv->flags & TPM_TIS_ITPM_WORKAROUND;
> > > >>
> > > >>status = tpm_tis_status(chip);
> > > >> @@ -330,35 +329,24 @@ static int tpm_tis_send_data(struct
> tpm_chip
> > > *chip,
> > > >> u8 *buf, size_t len)
> > > >>}
> > > >>}
> > > >>
> > > >> -  while (count < len - 1) {
> > > >> -  burstcnt = get_burstcount(chip);
> > > >> -  if (burstcnt < 0) {
> > > >> -  dev_err(>dev, "Unable to read
> burstcount\n");
> > > >> -  rc = burstcnt;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> -  burstcnt = min_t(int, burstcnt, len - count - 1);
> > > >> -  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> > > >>> locality),
> > > >> -   burstcnt, buf + count);
> > > >> -  if (rc < 0)
> > > >> -  goto out_err;
> > > >> -
> > > >> -  count += burstcnt;
> > > >> -
> > > >> -  if (wait_for_tpm_stat(chip, TPM_STS_VALID, chip-
> > > >>> timeout_c,
> > > >> -  >int_queue, false) < 0) {
> > > >> -  rc = -ETIME;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> -  status = tpm_tis_status(chip);
> > > >> -  if (!itpm && (status & TPM_STS_DATA_EXPECT) == 0)
> {
> > > >> -  rc = -EIO;
> > > >> -  goto out_err;
> > > >> -  }
> > > >> +  /*
> > > >> +   * Get the initial burstcount to ensure TPM is ready to
> > > >> +   * accept data.
> > > >> +   */
> > > >> +  burstcnt = get_burstcount(chip);
> > > >> +  if (burstcnt < 0) {
> > > >> +  dev_err(>dev, "Unable to read burstcount\n");
> > > >> +  rc = burstcnt;
> > > >> +  goto out_err;
> > > >>}
> > > >>
> > > >> +  rc = tpm_tis_write_bytes(priv, TPM_DATA_FIFO(priv-
> >locality),
> > > >> +  len - 1, buf);
> > > >> +  if (rc < 0)
> > > >> +  goto out_err;
> > > >> +
> > > >>/* write last byte */
> > > >> -  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> > > >> buf[count]);
> > > >> +  rc = tpm_tis_write8(priv, TPM_DATA_FIFO(priv->locality),
> buf[len-
> > > >> 1]);
> > > >>if (rc < 0)
> > > >>goto out_err;
> > > >>
> > > >> --
> > > >> 2.13.3
> > > > This seems to fail reliably with my SPI TPM 2.0. I get EIO when trying 
> > > > to
> > > send large amounts of data, e.g. with TPM2_Hash, and subsequent tests
> > > seem to take an unusual amount of time. More analysis probably has to
> > wait
> > > until November, since I am going to be in Prague next week.
> > >
> > > Thanks Alex for testing these.. Did you get the chance to do any further
> > > analysis ?
> >
> > I am working on that now. Ken's suggestion seems reasonable, so I am
> going
> > to test whether correctly waiting for the flags to change fixes the problem.
> If
> > it does, I'll send the patches.
> 
> Sorry for the delay, I had to take care of some device tree changes in v4.14
> that broke my ARM test machines.
> 
> I've implemented some patches that fix the issue that Ken pointed out and
> rebased your patch 2/4 ("ignore burstcount") on top. While doing this I
> noticed that your original patch does not, as the commit message says, 
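
To make the performance claim in the quoted commit message concrete, the sketch below counts how often the burstcount would be polled by the removed loop versus the patched code path. It is a rough model, assuming the TPM always reports the same burstcount; polls_chunked() and polls_single() are hypothetical helpers and do not touch any hardware or driver API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of burstcount polls made by the removed while loop for `len`
 * bytes, assuming the device always reports `burst` free FIFO bytes.
 */
size_t polls_chunked(size_t len, size_t burst)
{
	size_t count = 0, polls = 0;

	assert(len > 0 && burst > 0);
	while (count < len - 1) {
		size_t n = burst;

		polls++;                      /* one get_burstcount() per chunk */
		if (n > len - count - 1)
			n = len - count - 1;  /* the last byte is written separately */
		count += n;
	}
	return polls;
}

/* The patched path reads burstcount once, then writes len - 1 bytes in one call. */
size_t polls_single(size_t len)
{
	assert(len > 0);
	return 1;
}
```

For a 1024-byte command and an 8-byte burstcount, the chunked loop polls 128 times while the patched path polls once; each avoided poll also avoids the status wait that followed it in the old loop.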

RE: [PATCH] nvmem: uniphier: change access unit from 32bit to 8bit

2017-11-21 Thread Keiji Hayashibara
Reviewed-by: Keiji Hayashibara 

Thanks.

-
Best Regards,
Keiji Hayashibara

> -----Original Message-----
> From: Kunihiko Hayashi [mailto:hayashi.kunih...@socionext.com]
> Sent: Wednesday, November 22, 2017 2:15 PM
> To: Srinivas Kandagatla 
> Cc: Yamada, Masahiro/山田 真弘 ; Hayashibara, 
> Keiji/林原 啓二
> ; masami.hirama...@linaro.org; 
> jaswinder.si...@linaro.org;
> linux-arm-ker...@lists.infradead.org; linux-kernel@vger.kernel.org; Hayashi, 
> Kunihiko/林 邦彦
> 
> Subject: [PATCH] nvmem: uniphier: change access unit from 32bit to 8bit
> 
> The efuse on UniPhier allows 8bit access according to the specification.
> Since the bit offset of nvmem is limited to 0-7, it is desirable to change
> the access unit of nvmem to 8bit.
> 
> Signed-off-by: Kunihiko Hayashi 
> ---
>  drivers/nvmem/uniphier-efuse.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/nvmem/uniphier-efuse.c b/drivers/nvmem/uniphier-efuse.c 
> index 2bb45c4..fac3122 100644
> --- a/drivers/nvmem/uniphier-efuse.c
> +++ b/drivers/nvmem/uniphier-efuse.c
> @@ -27,11 +27,11 @@ static int uniphier_reg_read(void *context,
>       unsigned int reg, void *_val, size_t bytes)
>  {
>   struct uniphier_efuse_priv *priv = context;
> - u32 *val = _val;
> + u8 *val = _val;
>   int offs;
> 
> - for (offs = 0; offs < bytes; offs += sizeof(u32))
> - *val++ = readl(priv->base + reg + offs);
> + for (offs = 0; offs < bytes; offs += sizeof(u8))
> + *val++ = readb(priv->base + reg + offs);
> 
>   return 0;
>  }
> @@ -53,8 +53,8 @@ static int uniphier_efuse_probe(struct platform_device *pdev)
>   if (IS_ERR(priv->base))
>   return PTR_ERR(priv->base);
> 
> - econfig.stride = 4;
> - econfig.word_size = 4;
> + econfig.stride = 1;
> + econfig.word_size = 1;
>   econfig.read_only = true;
>   econfig.reg_read = uniphier_reg_read;
>   econfig.size = resource_size(res);
> --
> 2.7.4
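
One practical effect of the quoted change is that a read no longer has to be a multiple of four bytes. The sketch below models the byte-wise loop in userspace. fake_readb() and reg_read_bytes() are hypothetical stand-ins, and a plain array plays the role of the memory-mapped efuse window, so this illustrates the access pattern only and is not the driver itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for readb(): fetch one byte from a plain memory window. */
static uint8_t fake_readb(const volatile uint8_t *addr)
{
	return *addr;
}

/* Byte-granular copy, shaped like the patched uniphier_reg_read() loop. */
int reg_read_bytes(const uint8_t *base, unsigned int reg, void *_val,
		   size_t bytes)
{
	uint8_t *val = _val;
	size_t offs;

	assert(base != NULL && _val != NULL);
	for (offs = 0; offs < bytes; offs += sizeof(uint8_t))
		*val++ = fake_readb(base + reg + offs);

	return 0;
}
```

Reading 3 bytes at offset 1 fills exactly three destination bytes and leaves the rest of the buffer untouched, whereas a loop stepping by sizeof(u32) would store a full 32-bit word for the same request.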





Re: [tegra186]: emmc resume failing after booting from snapshot image

2017-11-21 Thread Pintu Kumar
Hi,

I am trying to bring up suspend-to-disk (snapshot boot) on jetson-tx2
board (nvidia tegra186).
Suspend is working fine, but during boot with snapshot image, emmc
resume is failing.
Kernel version: 4.4
Repo: https://nv-tegra.nvidia.com/gitweb/?p=linux-4.4.git;a=summary
repo: tegra-l4t-r27.1


It returns with:
mmc0: error -110 during resume (card was removed?)

Please see the complete logs below.

I tried to add some prints in mmc driver to check the cause.
It seems that mmc is not responding during resume.
[  137.125314] mmc_sleep: ERROR: mmc_wait_for_cmd, ret: -110
When I checked further, I found that CMD5 (Sleep/Awake) is not getting a response.
The host is not able to wake the mmc up from sleep.

What could be the cause of this problem?
Any pointers or suggestions about this issue would be really helpful.

When I look at the mainline linux kernel 4.14 (latest), I can see
that there are some patches in drivers/mmc/core/* which are missing in
the jetson tx2 kernel version 4.4.
Do you think any of the latest patches may help to solve this issue?
If yes, can you point to some of the relevant ones?


Please help.


Thanks,
Pintu

LOGS:
---
## Booting ...
[ 117.079061] sdhci-tegra 340.sdhci: Tuning already done,
restoring the best tap value : 112
[ 117.081179] xhci-tegra 353.xhci: exiting ELPG
[ 117.081798] xhci-tegra 353.xhci: Firmware timestamp: 2016-09-01
11:32:41 UTC, Version: 55.05 release
[ 117.085202] xhci-tegra 353.xhci: exiting ELPG done
[ 117.085204] xhci-tegra 353.xhci: entering ELPG
[ 117.087770] xhci-tegra 353.xhci: entering ELPG done
[ 117.087775] Wake76 for irq=199
[ 117.08] Wake77 for irq=199
[ 117.087778] Wake78 for irq=199
[ 117.087779] Wake79 for irq=199
[ 117.087780] Wake80 for irq=199
[ 117.087781] Wake81 for irq=199
[ 117.087782] Wake82 for irq=199
[ 117.087784] Enabling wake76
[ 117.087785] Enabling wake77
[ 117.087786] Enabling wake78
[ 117.087787] Enabling wake79
[ 117.087788] Enabling wake80
[ 117.087789] Enabling wake81
[ 117.087790] Enabling wake82
[ 117.087891] tegra186-cam-rtcpu b00.rtcpu: sce gets halted
[ 117.090012] Wake24 for irq=241
[ 117.090015] Enabling wake24
[ 117.090598] nct1008_nct72 7-004c: success in disabling tmp451 VDD rail
[ 117.090654] gpio tegra-gpio-aon wake29 for gpio=56(FF:0)
[ 117.090655] Enabling wake29
[ 117.090774] gpio tegra-gpio wake53 for gpio=159(X:5)
[ 117.090775] Enabling wake53
[ 117.111219] tegradc 1521.nvdisplay: suspend
[ 117.111422] Wake73 for irq=42
[ 117.111424] Enabling wake73
[ 117.111597] bcm54xx_low_power_mode(): put phy in iddq-lp mode
[ 117.113533] gpio tegra-gpio wake71 for gpio=125(P:6)
[ 117.113535] Enabling wake71
[ 117.113632] PM: suspend of devices complete after 34.680 msecs
[ 117.114829] host1x 13e1.host1x: suspended
[ 117.114898] PM: late suspend of devices complete after 1.262 msecs
[ 117.133746] PM: noirq suspend of devices complete after 18.841 msecs
[ 117.133971] Disabling non-boot CPUs ...
[ 117.134249] CPU3: shutdown
[ 117.148582] psci: Retrying again to check for CPU kill
[ 117.148586] psci: CPU3 killed.
[ 117.149097] CPU4: shutdown
[ 117.164584] psci: Retrying again to check for CPU kill
[ 117.164589] psci: CPU4 killed.
[ 117.165041] CPU5: shutdown
[ 117.180572] psci: Retrying again to check for CPU kill
[ 117.180576] psci: CPU5 killed.
[ 117.180834] Entered SC7
[ 117.180834] Wake[31-0] level=0x4000
[ 117.180834] Wake[63-32] level=0x0
[ 117.180834] Wake[95-64] level=0x7f2a0
[ 117.180834] Wake[31-0] enable=0x2100
[ 117.180834] Wake[63-32] enable=0x20
[ 117.180834] Wake[95-64] enable=0x7f280
[ 117.180834] Wake[31-0] route=0x2100
[ 117.180834] Wake[63-32] route=0x20
[ 117.180834] Wake[95-64] route=0x7f280

[ 117.180834] Wake[32:0] status=0x0
[ 117.180834] Wake[64:32] status=0x0
[ 117.180834] Wake[96:64] status=0x0
[ 117.180834] Exited SC7
[ 117.180834] bpmp: waiting for handshake
[ 117.180834] bpmp: synchronizing channels
[ 117.180834] bpmp: channels synchronized
[ 117.180869] Enabling non-boot CPUs ...
[ 117.181067] CPU3: Booted secondary processor [411fd073]
[ 117.181198] CPU3 is up
[ 117.181353] CPU4: Booted secondary processor [411fd073]
[ 117.181455] CPU4 is up
[ 117.181609] CPU5: Booted secondary processor [411fd073]
[ 117.181721] CPU5 is up
[ 117.182740] xhci-tegra 353.xhci: exiting ELPG
[ 117.183221] xhci-tegra 353.xhci: Firmware timestamp: 2016-09-01
11:32:41 UTC, Version: 55.05 release
[ 117.381630] xhci-tegra 353.xhci: XHCI Controller not ready.
Falcon state: 0x10
[ 117.381633] xhci-tegra 353.xhci: exiting ELPG failed
[ 117.381643] dpm_run_callback(): tegra_xhci_resume_noirq+0x0/0x48 returns -14
[ 117.381653] PM: Device 353.xhci failed to resume noirq: error -14
[ 117.381724] PM: noirq resume of devices complete after 199.999 msecs
[ 117.383100] PM: early resume of devices complete after 1.236 msecs
[ 117.390964] gpio tegra-gpio wake71 for gpio=125(P:6)
[ 117.390966] Disabling wake71
[ 

Re: [PATCH v2 00/18] Entry stack switching

2017-11-21 Thread Ingo Molnar

* Ingo Molnar  wrote:

> 
> * Andy Lutomirski  wrote:
> 
> > This sets up stack switching, including for SYSCALL.  I think it's
> > in decent shape.
> > 
> > Known issues:
> >  - I think we're going to want a way to turn the stack switching on and
> >    off either at boot time or at runtime.  It should be fairly
> >    straightforward to make it work.
> > 
> >  - I think the ORC unwinder isn't so good at dealing with stack overflows.
> >    It bails too early (I think), resulting in lots of ? entries.  This
> >    isn't a regression with this series -- it's just something that could
> >    be improved.
> > 
> > Ingo, patch 1 may be tip/urgent material.  It fixes what I think is
> > a bug in Xen.  I'm having a hard time testing because it's being
> > masked by a bigger unrelated bug that's keeping Xen from booting
> > when configured to hit the bug I'm fixing.  (The latter bug goes at
> > least back to v4.13, I think I know roughly what's wrong, and I've
> > reported it to the maintainers.)
> 
> Hm, with this series the previous IRQ vector bug appears again:
> 
> [   51.156370] do_IRQ: 16.34 No irq handler for vector
> [   57.511030] do_IRQ: 16.34 No irq handler for vector
> [   57.528335] do_IRQ: 16.34 No irq handler for vector
> [   57.533256] do_IRQ: 16.34 No irq handler for vector
> [   63.991913] do_IRQ: 16.34 No irq handler for vector
> [   63.996810] do_IRQ: 16.34 No irq handler for vector
> 
> I've attached the reproducer config. Note that the system appears to be
> working to a certain extent (I could ssh to it and extract its config), but
> produces these warnings sporadically.
> 
> Also note that this is the same AMD system that had irq-tracing/lockdep
> troubles yesterday. So maybe this warning is related and we either have
> broken lockdep, or these IRQ vector warnings.

Yeah, so I just double checked, if from the latest series I revert:

  x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing

then I (no surprise) get the bootup lockdep warning - but don't get the IRQ
vector warnings.

Thanks,

Ingo


Re: [PATCH v1 3/9] perf util: Reconstruct rblist for supporting per-thread shadow stats

2017-11-21 Thread Ravi Bangoria


On 11/20/2017 08:13 PM, Jin Yao wrote:

@@ -76,6 +97,17 @@ static struct rb_node *saved_value_new(struct rblist *rblist __maybe_unused,
 	return &nd->rb_node;
 }

+static void saved_value_delete(struct rblist *rblist __maybe_unused,
+			       struct rb_node *rb_node)
+{
+	struct saved_value *v = container_of(rb_node,
+					     struct saved_value,
+					     rb_node);
+
+	if (v)
+		free(v);
+}


Do we really need if(v) ?

Thanks,
Ravi



Re: [PATCH v2 00/18] Entry stack switching

2017-11-21 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> This sets up stack switching, including for SYSCALL.  I think it's
> in decent shape.
> 
> Known issues:
>  - I think we're going to want a way to turn the stack switching on and
>    off either at boot time or at runtime.  It should be fairly straightforward
>    to make it work.
> 
>  - I think the ORC unwinder isn't so good at dealing with stack overflows.
>    It bails too early (I think), resulting in lots of ? entries.  This
>    isn't a regression with this series -- it's just something that could
>    be improved.
> 
> Ingo, patch 1 may be tip/urgent material.  It fixes what I think is
> a bug in Xen.  I'm having a hard time testing because it's being
> masked by a bigger unrelated bug that's keeping Xen from booting
> when configured to hit the bug I'm fixing.  (The latter bug goes at
> least back to v4.13, I think I know roughly what's wrong, and I've
> reported it to the maintainers.)

Hm, with this series the previous IRQ vector bug appears again:

[   51.156370] do_IRQ: 16.34 No irq handler for vector
[   57.511030] do_IRQ: 16.34 No irq handler for vector
[   57.528335] do_IRQ: 16.34 No irq handler for vector
[   57.533256] do_IRQ: 16.34 No irq handler for vector
[   63.991913] do_IRQ: 16.34 No irq handler for vector
[   63.996810] do_IRQ: 16.34 No irq handler for vector

I've attached the reproducer config. Note that the system appears to be working
to a certain extent (I could ssh to it and extract its config), but produces
these warnings sporadically.

Also note that this is the same AMD system that had irq-tracing/lockdep troubles
yesterday. So maybe this warning is related and we either have broken lockdep,
or these IRQ vector warnings.

Thanks,

Ingo
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.14.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_HAVE_INTEL_TXT=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y
CONFIG_PGTABLE_LEVELS=4
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"
CONFIG_IRQ_WORK=y
CONFIG_BUILDTIME_EXTABLE_SORT=y
CONFIG_THREAD_INFO_IN_TASK=y

#
# General setup
#
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_CROSS_COMPILE=""
# CONFIG_COMPILE_TEST is not set
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_HAVE_KERNEL_GZIP=y
CONFIG_HAVE_KERNEL_BZIP2=y
CONFIG_HAVE_KERNEL_LZMA=y
CONFIG_HAVE_KERNEL_XZ=y
CONFIG_HAVE_KERNEL_LZO=y
CONFIG_HAVE_KERNEL_LZ4=y
CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set
# CONFIG_KERNEL_LZ4 is not set
CONFIG_DEFAULT_HOSTNAME="(none)"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_POSIX_MQUEUE_SYSCTL=y
CONFIG_CROSS_MEMORY_ATTACH=y
CONFIG_FHANDLE=y
CONFIG_USELIB=y
CONFIG_AUDIT=y
CONFIG_HAVE_ARCH_AUDITSYSCALL=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

#
# IRQ subsystem
#
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_IRQ_SHOW=y
CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_GENERIC_IRQ_MIGRATION=y
CONFIG_IRQ_DOMAIN=y
CONFIG_IRQ_DOMAIN_HIERARCHY=y
CONFIG_GENERIC_MSI_IRQ=y
CONFIG_GENERIC_MSI_IRQ_DOMAIN=y
CONFIG_GENERIC_IRQ_MATRIX_ALLOCATOR=y
CONFIG_GENERIC_IRQ_RESERVATION_MODE=y
# CONFIG_IRQ_DOMAIN_DEBUG is not set
CONFIG_IRQ_FORCED_THREADING=y
CONFIG_SPARSE_IRQ=y
# CONFIG_GENERIC_IRQ_DEBUGFS is not set
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_ARCH_CLOCKSOURCE_DATA=y
CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
CONFIG_GENERIC_TIME_VSYSCALL=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
CONFIG_GENERIC_CMOS_UPDATE=y

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_VIRT_CPU_ACCOUNTING=y
# 

Re: [PATCH] nvmem: uniphier: change access unit from 32bit to 8bit

2017-11-21 Thread Masahiro Yamada
2017-11-22 14:14 GMT+09:00 Kunihiko Hayashi :
> The efuse on UniPhier allows 8bit access according to the specification.
> Since bit offset of nvmem is limited to 0-7, it is desirable to change
> access unit of nvmem to 8bit.
>
> Signed-off-by: Kunihiko Hayashi 


Tested on LD4, sLD8, Pro4, PXs2, LD11, LD20, and PXs3.
All worked for me.

Tested-by: Masahiro Yamada 

Thanks.


-- 
Best Regards
Masahiro Yamada


Re: [PATCH v2 06/18] x86/kasan/64: Teach KASAN about the cpu_entry_area

2017-11-21 Thread Ingo Molnar

* Andy Lutomirski  wrote:

> The cpu_entry_area will contain stacks.  Make sure that KASAN has
> appropriate shadow mappings for them.
> 
> Cc: Andrey Ryabinin 
> Cc: Alexander Potapenko 
> Cc: Dmitry Vyukov 
> Cc: kasan-...@googlegroups.com
> Signed-off-by: Andy Lutomirski 
> ---
>  arch/x86/mm/kasan_init_64.c | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
> index 99dfed6dfef8..43d376687315 100644
> --- a/arch/x86/mm/kasan_init_64.c
> +++ b/arch/x86/mm/kasan_init_64.c
> @@ -330,7 +330,14 @@ void __init kasan_init(void)
>  			early_pfn_to_nid(__pa(_stext)));
>  
>  	kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)MODULES_END),
> -			(void *)KASAN_SHADOW_END);
> +				   kasan_mem_to_shadow((void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_BOTTOM))));
> +
> +	kasan_populate_shadow((unsigned long)kasan_mem_to_shadow((void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_BOTTOM))),
> +			      (unsigned long)kasan_mem_to_shadow((void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_TOP) + PAGE_SIZE)),
> +		0);
> +
> +	kasan_populate_zero_shadow(kasan_mem_to_shadow((void *)(__fix_to_virt(FIX_CPU_ENTRY_AREA_TOP) + PAGE_SIZE)),
> +				   (void *)KASAN_SHADOW_END);

Note, this commit has a dependency on:

  d17a1d97dc20: x86/mm/kasan: don't use vmemmap_populate() to initialize shadow

which got merged upstream outside the x86 tree, so it has a whole bunch of
merge window dependencies.

To make testing+backporting to v4.14 easier I've cherry-picked d17a1d97dc20
into x86/urgent.

( I've Cc:-ed Linus, just in case this kind of preemptive cherry-picking is 
  frowned upon. )

Thanks,

Ingo


Re: [PATCH v1] scripts: leaking_addresses.pl: add support for 32-bit kernel addresses

2017-11-21 Thread Kaiwan N Billimoria
Thanks Tobin, for your detailed comments.

On Wed, Nov 22, 2017 at 5:29 AM, Tobin C. Harding  wrote:
> You don't typically need [xxx v1] for version 1, the v1 is implicit.
>
> Please use the git brief description prefix that is already in use i.e
>
> leaking_addresses: add support for 32-bit kernel addresses

Ok..

> On Tue, Nov 21, 2017 at 01:28:14PM +0530, kaiwan.billimo...@gmail.com wrote:
>> - support for ARM-32
>
> Sure, we can do this later.

Righto

>
>> - programmatically query and set the PAGE_OFFSET based on arch (it's currently
>> hard-coded)
>
> Let's do this straight away, it will be much nicer.

Yes, will work on it..

>> 2. Minor edit:
>> the '--raw', '--suppress-dmesg', '--squash-by-path' and
>> '--squash-by-filename' option switches are only meaningful
>> when the 'input-raw=' option is used. So, indent the 'Help' screen lines
>> to reflect the fact.
>
> This is a different change to the architecture stuff so should be in a
> separate patch. You could do both as a series if you like. Off the top
> of my head I have never seen options output like this, but if you have,
> I'm willing to accept your view. You are correct that the options
> mentioned are only used in conjunction with '--input-raw' so some way of
> showing this would be nice.

I realize this; so, yeah, will make the next one a series and put this
in the 2nd..

>>
>> +my $bit_size = 64;   # Check 64-bit kernel addresses by default
>
> This global is unnecessary. You already have is_ix86_32() so you can just
> use that.

From your later comments, I think you see that using this global is necessary.

> Please use kernel coding style
>
> $bit_size = 32;

Ok..

> Bonus points, you uncovered a bug in the current script `if (is_x86_64)`
> was missing the parenthesis!

Yeah :-)

>
>> + if ($match =~ '\b(0x)?(f|F){8}\b') {
>> + return 1;
>> + }
>
> So, may_leak_address() and is_false_positive() are tightly coupled and
> not really that nice. Once we add 32 bit support it gets worse. Going
> forwards, we can either add your 32 bit work then refactor both
> functions or you can refactor them as you add the 32 bit stuff. I'm open
> to either.

Yes I agree. Having said that, I'll leave it on the back burner for now..

> Some things to note
>
> - The mask stuff (all 1's) should have an all 0's regex also.

Well, once we determine the address is >= PAGE_OFFSET, it's
automatically apparent that it's not 0, yes?

> - The mask stuff should probably be closer to the mask stuff for 64
>   bit. It's not immediately apparent a clean way to do this though.
> - It's not immediately apparent if an address less than PAGE_OFFSET is a
>   false positive or should be caught in leaks_address().

Hmm, the only thing I can think of offhand: on many ARM-32 systems, the kernel
module space is below PAGE_OFFSET; we'd have to take that into consideration,
of course. Anything else that is < PAGE_OFFSET and a kernel address? Anyone?

> - Do we need 32 bit equivalents for
>
> if ($line =~ '\bKEY=[[:xdigit:]]{14} [[:xdigit:]]{16} [[:xdigit:]]{16}\b' or
>     $line =~ '\b[[:xdigit:]]{14} [[:xdigit:]]{16} [[:xdigit:]]{16}\b') {
>
OK, I'm unclear on what exactly the above achieves; could you please shed
some light on it? Thanks.

>
> Your patch did not apply, the problem looks to be in the code section
> above. You can see that there is no removed line. For next spin please
> check your patch applies on top of the 'leaks' branch (which now
> includes the fix for the bug you found).

Yes, sorry about that; will do..

> I have one more comment that should have been at the top but I did not
> want to confuse things. Typically, the git brief description should be
> limited to 50 characters. If you do decide to split this patch into two
> and use the prefix suggested you may like to change the git brief
> description but don't feel you have to. If you do decide to do this, your
> next patch set will be a version 1 again. I may be wrong but I never
> increment a patch version if the subject line changes (excluding
> contents of [] ).

Right. I plan to send the next one as a 2-patch series; I will keep the
git prefix you suggest (and, as the subject line changes, will not label
the version).
>
> thanks,
> Tobin.

Thanks,
Kaiwan.
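
The 32-bit check discussed in this exchange can be sketched as follows. The script itself is Perl; this C version only illustrates the logic, and it assumes a PAGE_OFFSET of 0xc0000000 (the common 3G/1G split), which is exactly the value the thread proposes to query per architecture instead of hard-coding. The function name `sketch_may_leak_address_32` is invented for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 3G/1G user/kernel split; real code should query this per arch. */
#define SKETCH_PAGE_OFFSET_32 0xc0000000UL

/* Return true if token looks like a 32-bit kernel address worth reporting. */
static bool sketch_may_leak_address_32(const char *token)
{
	const char *p = token;
	uint32_t val;
	size_t i;

	assert(token != NULL);

	/* Accept an optional 0x/0X prefix, as the quoted regex does. */
	if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X'))
		p += 2;

	/* Exactly 8 hex digits make up a 32-bit value. */
	if (strlen(p) != 8)
		return false;
	for (i = 0; i < 8; i++)
		if (!isxdigit((unsigned char)p[i]))
			return false;

	val = (uint32_t)strtoul(p, NULL, 16);

	/* An all-ones value is treated as a mask, i.e. a false positive. */
	if (val == 0xffffffffU)
		return false;

	/* Anything below PAGE_OFFSET, including all zeros, is rejected here. */
	return val >= SKETCH_PAGE_OFFSET_32;
}
```

Note that the final comparison already rejects an all-zero value, which is the point made in the reply above; it would also reject module addresses that sit below PAGE_OFFSET on some ARM-32 configurations, so that case would need separate handling.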


Re: [PATCH 30/30] PCI: remove pci_get_bus_and_slot() function

2017-11-21 Thread Timur Tabi

On 11/21/17 11:55 PM, Sinan Kaya wrote:

> For places where domain number information is available, I extracted domain
> number and added into pci_get_domain_bus_and_slot() call such as xen or bn
> drivers.

My suggestion is that you restrict your first patch set to only these
patches.

> The assumption at this point is for pci_get_bus_and_slot() usages to be caught
> in code-review.


How about this:

static inline struct pci_dev * __deprecated
pci_get_bus_and_slot(unsigned int bus, unsigned int devfn)
{
	return pci_get_domain_bus_and_slot(0, bus, devfn);
}
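
The effect of such a wrapper can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code: `struct pci_dev`, the device table, and the lookup are mocks invented for illustration, and the compiler's `deprecated` attribute stands in for the kernel's `__deprecated` macro.

```c
#include <stddef.h>

/* Mock stand-in for the kernel's struct pci_dev (assumption of this sketch). */
struct pci_dev {
	int domain;
	unsigned int bus;
	unsigned int devfn;
};

/* Hypothetical device table: the same bus/devfn exists in domains 0 and 1. */
static struct pci_dev mock_devices[] = {
	{ 0, 0, 0x08 },
	{ 1, 0, 0x08 },
};

/* Mock of the domain-aware lookup; the real one also takes a reference. */
static struct pci_dev *pci_get_domain_bus_and_slot(int domain, unsigned int bus,
						   unsigned int devfn)
{
	size_t i;

	for (i = 0; i < sizeof(mock_devices) / sizeof(mock_devices[0]); i++) {
		struct pci_dev *d = &mock_devices[i];

		if (d->domain == domain && d->bus == bus && d->devfn == devfn)
			return d;
	}
	return NULL;
}

/* Legacy entry point kept as a thin wrapper; callers get a compile warning. */
__attribute__((deprecated))
static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus,
						   unsigned int devfn)
{
	return pci_get_domain_bus_and_slot(0, bus, devfn);
}
```

Existing callers keep building and still resolve to domain 0, but each call site produces a deprecation warning, which nudges conversions along without relying on code review alone.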


--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


Re: [PATCH v6 25/37] tracing: Add support for 'field variables'

2017-11-21 Thread Namhyung Kim
On Fri, Nov 17, 2017 at 02:33:04PM -0600, Tom Zanussi wrote:
> @@ -1387,6 +1405,8 @@ static struct trace_event_file *find_var_file(struct trace_array *tr,
>   list_for_each_entry(var_data, &tr->hist_vars, list) {
>   var_hist_data = var_data->hist_data;
>   file = var_hist_data->event_file;
> + if (file == found)
> + continue;

Shouldn't it be moved to the patch 22?

Thanks,
Namhyung


>   call = file->event_call;
>   name = trace_event_name(call);
>  


Re: [PATCH 30/30] PCI: remove pci_get_bus_and_slot() function

2017-11-21 Thread Sinan Kaya
On 11/22/2017 12:45 AM, Timur Tabi wrote:
> On 11/21/17 11:31 PM, Sinan Kaya wrote:
>> Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
>> extract the domain number. Other places, use the actual domain number from
>> the device.
>>
>> Now that all users of pci_get_bus_and_slot() switched to
>> pci_get_domain_bus_and_slot(), it is now safe to remove this function.
> 
> This doesn't really eliminate pci_get_bus_and_slot(), because it doesn't 
> force developers to support non-zero domains.  What's to stop a driver 
> developer from doing this?
> 
> #define pci_get_bus_and_slot(b, d) pci_get_domain_bus_and_slot(0, b, d)
> 
> thereby completely ignoring what you're trying to do?
> 

Surely, the goal is not to eliminate all domain 0 users/assumptions but open the
path for flexibility over time. 

There are patches in this series where I hard-coded a value of 0 because domain
information was not available.

For places where domain number information is available, I extracted the
domain number and passed it into the pci_get_domain_bus_and_slot() call, such
as in the xen or bn drivers.

This will allow these drivers to be used with non-zero segment numbers. These
issues were missed until this refactoring took place.

pci_get_domain_bus_and_slot() function makes the developer think about where to
find the domain number as it is mandatory. 

I also double checked that all current users of pci_get_domain_bus_and_slot()
are actually extracting the domain number correctly along with the bus, device,
function.

The assumption at this point is for pci_get_bus_and_slot() usages to be caught
in code-review.

This is a best-effort approach towards flexibility.
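
The "extract the domain number from the device" pattern described above can be sketched in user space as follows. This is an illustration only: the structures and device table are mocks, `sketch_find_peer` is an invented helper, and the mock `pci_domain_nr()` merely mirrors the shape of the kernel helper of the same name.

```c
#include <assert.h>
#include <stddef.h>

/* Mock stand-ins for the kernel structures (assumptions of this sketch). */
struct pci_bus {
	int domain_nr;
	unsigned int number;
};

struct pci_dev {
	struct pci_bus *bus;
	unsigned int devfn;
};

static struct pci_bus mock_bus_d0 = { 0, 0 };
static struct pci_bus mock_bus_d1 = { 1, 0 };

/* Hypothetical topology: a bridge in each domain, one extra device in domain 1. */
static struct pci_dev mock_devices[] = {
	{ &mock_bus_d0, 0x00 },
	{ &mock_bus_d1, 0x00 },
	{ &mock_bus_d1, 0x08 },
};

/* Mock of the kernel helper that reports which domain a bus belongs to. */
static int pci_domain_nr(const struct pci_bus *bus)
{
	return bus->domain_nr;
}

/* Mock of the domain-aware lookup. */
static struct pci_dev *pci_get_domain_bus_and_slot(int domain, unsigned int bus,
						   unsigned int devfn)
{
	size_t i;

	for (i = 0; i < sizeof(mock_devices) / sizeof(mock_devices[0]); i++) {
		struct pci_dev *d = &mock_devices[i];

		if (pci_domain_nr(d->bus) == domain &&
		    d->bus->number == bus && d->devfn == devfn)
			return d;
	}
	return NULL;
}

/* Look up devfn on the same domain and bus as pdev instead of assuming 0. */
static struct pci_dev *sketch_find_peer(const struct pci_dev *pdev,
					unsigned int devfn)
{
	assert(pdev != NULL && pdev->bus != NULL);
	return pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
					   pdev->bus->number, devfn);
}
```

A driver bound to the device in domain 1 finds the domain 1 bridge this way, whereas a hard-coded domain of 0 would silently return the bridge from the wrong segment.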

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH 30/30] PCI: remove pci_get_bus_and_slot() function

2017-11-21 Thread Timur Tabi

On 11/21/17 11:31 PM, Sinan Kaya wrote:

> Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
> extract the domain number. Other places, use the actual domain number from
> the device.
>
> Now that all users of pci_get_bus_and_slot() switched to
> pci_get_domain_bus_and_slot(), it is now safe to remove this function.


This doesn't really eliminate pci_get_bus_and_slot(), because it doesn't 
force developers to support non-zero domains.  What's to stop a driver 
developer from doing this?


#define pci_get_bus_and_slot(b, d) pci_get_domain_bus_and_slot(0, b, d)

thereby completely ignoring what you're trying to do?

--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.


Re: FW: [PATCH 24/31] nds32: Miscellaneous header files

2017-11-21 Thread Vincent Chen
2017-11-09 18:42 GMT+08:00 Vincent Chen :
>>>On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu  wrote:
>>> +
>>> +static inline void __delay(unsigned long loops)
>>> +{
>>> +   __asm__ __volatile__(".align 2\n"
>>> +                        "1:\n"
>>> +                        "\taddi\t%0, %0, -1\n"
>>> +                        "\tbgtz\t%0, 1b\n"
>>> +                        : "=r"(loops)
>>> +                        : "0"(loops));
>>> +}
>>
>> Does the architecture define a high-resolution clock source? If yes, then 
>> it's better to use that to get exact timing than to rely on the loop 
>> calibration.
>>
> Dear Arnd:
>
> Thanks.
> I will modify it in the next version patch.
>
Sorry, I have to correct my earlier reply.
Our architecture does not define a high-resolution clock source.
I agreed at the time because I thought we could use an SoC-defined
clock source to replace the loop, but that would be a poor choice
for portability.
We will keep the original implementation of __delay() in the next
version of the patch.

Vincent
>>> +/*
>>> + * This file is generally used by user-level software, so you need to
>>> + * be a little careful about namespace pollution etc.  Also, we cannot
>>> + * assume GCC is being used.
>>> + */
>>> +
>>> +typedef unsigned short __kernel_mode_t;
>>> +#define __kernel_mode_t __kernel_mode_t
>>> +
>>> +typedef unsigned short __kernel_ipc_pid_t;
>>> +#define __kernel_ipc_pid_t __kernel_ipc_pid_t
>>> +
>>> +typedef unsigned short __kernel_uid_t;
>>> +typedef unsigned short __kernel_gid_t;
>>> +#define __kernel_uid_t __kernel_uid_t
>>> +
>>> +typedef unsigned short __kernel_old_dev_t;
>>> +#define __kernel_old_dev_t __kernel_old_dev_t
>>> +
>>> +#include 
>>
>> I don't understand why you would want to override any of those.
>> Changing them unfortunately means rebuilding all of your user space, but I 
>> think it would be better to do that now than to suffer from this later on.
>>
>> Arnd
>
> Thanks.
> I will remove them in the next version patch.
>
> Best regards
> Vincent
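
For reference, the control flow of the quoted __delay() loop can be written in portable C. This is only a sketch of the loop's structure under the reading that `addi` decrements the counter and `bgtz` branches back while it is still greater than zero; `sketch_delay` is an invented name, and the iteration counter exists purely so the behaviour can be observed.

```c
/*
 * Portable rendering of the decrement-then-test loop: the body always runs
 * at least once, then repeats while the counter is still positive.
 * The return value is the number of iterations executed.
 */
static unsigned long sketch_delay(long loops)
{
	/* volatile keeps the compiler from optimizing the busy loop away. */
	volatile long n = loops;
	unsigned long iterations = 0;

	do {
		n = n - 1;
		iterations++;
	} while (n > 0);

	return iterations;
}
```

Because the loop only counts instructions, the wall-clock delay it yields depends on CPU speed and on loop calibration, which is the trade-off raised in the review when asking about a high-resolution clock source.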


Re: [PATCHv3 2/2] x86/selftests: Add test for mapping placement for 5-level paging

2017-11-21 Thread Aneesh Kumar K.V
"Kirill A. Shutemov"  writes:

> With 5-level paging, we have 56-bit virtual address space available for
> userspace. But we don't want to expose userspace to addresses above
> 47-bits, unless it asked specifically for it.
>
> We use mmap(2) hint address as a way for kernel to know if it's okay to
> allocate virtual memory above 47-bit.
>
> Let's add a self-test that covers few corner cases of the interface.
>
> Signed-off-by: Kirill A. Shutemov 

Can we move this to selftest/vm/ ? I had a variant which I was using to
test issues on ppc64. One change we made recently was to use >=128TB as
the hint addr value to select the larger address space. I would also like to
check for the exact mmap return addr in some cases. Attaching below the test
I was using. I will check whether this patch can be updated to test what
is covered in my selftest. I also want to do the boundary check twice.
The hash translation mode in POWER requires us to track the addr limit, and
we had bugs around address space selection before and after updating the
addr limit.
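
The hint-address behaviour described above can be probed from user space. The following is a minimal sketch, assuming a 64-bit Linux system; `sketch_map_with_hint` and `SKETCH_MAP_SIZE` are names invented here, and since the outcome depends on the kernel, architecture, and paging mode, no particular placement is claimed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stdint.h>
#include <sys/mman.h>

/* 128TB boundary used as the opt-in hint in the discussion above. */
#define SKETCH_ADDR_SWITCH_HINT (1ULL << 47)
/* 64KB is a multiple of both 4KB and 64KB page sizes. */
#define SKETCH_MAP_SIZE (64 * 1024)

/*
 * Map one anonymous region with the given hint (no MAP_FIXED), touch it,
 * and unmap it again. Returns 1 if the kernel placed it at or above the
 * 128TB boundary, 0 if below, and -1 if the mapping failed.
 */
static int sketch_map_with_hint(uint64_t hint)
{
	void *p = mmap((void *)(uintptr_t)hint, SKETCH_MAP_SIZE,
		       PROT_READ | PROT_WRITE,
		       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
	int above;

	if (p == MAP_FAILED)
		return -1;

	/* Touch the page to exercise the fault path, as the selftest does. */
	*(volatile int *)p = 10;

	above = (uint64_t)(uintptr_t)p >= SKETCH_ADDR_SWITCH_HINT;
	munmap(p, SKETCH_MAP_SIZE);
	return above;
}
```

On a kernel and CPU with the larger address space enabled, calling it with a hint of 0 and then with SKETCH_ADDR_SWITCH_HINT would be expected to show the difference between the default and the opted-in placement; elsewhere the hint is simply not honoured and the mapping lands wherever the kernel chooses.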

From 7739eb02bb6b6602572a9c259e915ef23950aae1 Mon Sep 17 00:00:00 2001
From: "Aneesh Kumar K.V" 
Date: Mon, 13 Nov 2017 10:41:10 +0530
Subject: [PATCH] selftest/mm: Add test for checking mmap across 128TB boundary

Signed-off-by: Aneesh Kumar K.V 
---
 tools/testing/selftests/vm/Makefile |   1 +
 tools/testing/selftests/vm/run_vmtests  |  11 ++
 tools/testing/selftests/vm/va_128TBswitch.c | 170 
 3 files changed, 182 insertions(+)
 create mode 100644 tools/testing/selftests/vm/va_128TBswitch.c

diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index cbb29e41ef2b..b1fb3cd7cf52 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -17,6 +17,7 @@ TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
 TEST_GEN_FILES += mlock-random-test
 TEST_GEN_FILES += virtual_address_range
+TEST_GEN_FILES += va_128TBswitch
 
 TEST_PROGS := run_vmtests
 
diff --git a/tools/testing/selftests/vm/run_vmtests b/tools/testing/selftests/vm/run_vmtests
index 07548a1fa901..b367f7801b67 100755
--- a/tools/testing/selftests/vm/run_vmtests
+++ b/tools/testing/selftests/vm/run_vmtests
@@ -176,4 +176,15 @@ else
echo "[PASS]"
 fi
 
+echo "-"
+echo "running virtual address 128TB switch test" 
+echo "-"
+./va_128TBswitch
+if [ $? -ne 0 ]; then
+   echo "[FAIL]"
+   exitcode=1
+else
+   echo "[PASS]"
+fi
+
 exit $exitcode
diff --git a/tools/testing/selftests/vm/va_128TBswitch.c b/tools/testing/selftests/vm/va_128TBswitch.c
new file mode 100644
index ..dfa501b825a8
--- /dev/null
+++ b/tools/testing/selftests/vm/va_128TBswitch.c
@@ -0,0 +1,170 @@
+/*
+ * Copyright IBM Corporation, 2017
+ * Author Aneesh Kumar K.V 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include 
+#include 
+#include 
+
+#ifdef DEBUG
+#define pr_debug(fmt, ...)  printf(fmt, ##__VA_ARGS__)
+#else
+#define pr_debug(fmt, ...)
+#endif
+
+/*
+ * >= 128TB is the hint addr value we used to select
+ * large address space.
+ */
+#define ADDR_SWITCH_HINT (1UL << 47)
+
+#ifdef __powerpc64__
+#define MAP_SIZE  64*1024
+#else
+#define MAP_SIZE  4*1024
+#endif
+
+
+void report_failure(long in_addr, unsigned long flags)
+{
+   printf("Failed to map 0x%lx with flags 0x%lx\n", in_addr, flags);
+   exit(1);
+}
+
+int *__map_addr(long in_addr, int size, unsigned long flags, int unmap)
+{
+   int *addr;
+
+   addr = (int *)mmap((void *)in_addr, size, PROT_READ | PROT_WRITE,
+  MAP_ANONYMOUS | MAP_PRIVATE | flags, -1, 0);
+   if (addr == MAP_FAILED)
+   report_failure(in_addr, flags);
+   pr_debug("Mapped addr 0x%lx-0x%lx for request 0x%lx with flag 0x%lx\n",
+(unsigned long)addr, ((unsigned long)addr + size), in_addr, flags);
+   /*
+* Try to access to catch errors in fault handling/slb miss handling
+*/
+   *addr = 10;
+   if (unmap)
+   munmap(addr, size);
+   return addr;
+}
+
+int *map_addr(long in_addr, unsigned long flags, int unmap)
+{
+   return __map_addr(in_addr, MAP_SIZE, flags, unmap);
+}
+
+void boundary_check(void)
+{
+   int *a;
+
+   /*
+* If stack is moved, we could possibly allocate
+* this at the requested address.
+*/
+   a = map_addr((ADDR_SWITCH_HINT - MAP_SIZE), 0, 1);
+   if ((unsigned long)a > ADDR_SWITCH_HINT - MAP_SIZE)
+   report_failure(ADDR_SWITCH_HINT - MAP_SIZE, 0);
+
+   /*
+

[PATCH 03/30] x86/PCI: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive in that it assumes the PCI device is
present in domain 0. This prevents device drivers from being reused with
other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Other places, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 arch/x86/pci/irq.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/pci/irq.c b/arch/x86/pci/irq.c
index 04526291..52e5510 100644
--- a/arch/x86/pci/irq.c
+++ b/arch/x86/pci/irq.c
@@ -839,7 +839,8 @@ static void __init pirq_find_router(struct irq_router *r)
DBG(KERN_DEBUG "PCI: Attempting to find IRQ router for [%04x:%04x]\n",
rt->rtr_vendor, rt->rtr_device);
 
-   pirq_router_dev = pci_get_bus_and_slot(rt->rtr_bus, rt->rtr_devfn);
+   pirq_router_dev = pci_get_domain_bus_and_slot(0, rt->rtr_bus,
+ rt->rtr_devfn);
if (!pirq_router_dev) {
DBG(KERN_DEBUG "PCI: Interrupt router not found at "
"%02x:%02x\n", rt->rtr_bus, rt->rtr_devfn);
-- 
1.9.1



[PATCH 04/30] ata: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Other places, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 drivers/ata/pata_ali.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ata/pata_ali.c b/drivers/ata/pata_ali.c
index d19cd88..b297fea 100644
--- a/drivers/ata/pata_ali.c
+++ b/drivers/ata/pata_ali.c
@@ -466,7 +466,7 @@ static void ali_init_chipset(struct pci_dev *pdev)
tmp |= 0x01;/* CD_ROM enable for DMA */
pci_write_config_byte(pdev, 0x53, tmp);
}
-   north = pci_get_bus_and_slot(0, PCI_DEVFN(0,0));
+   north = pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(0, 0));
if (north && north->vendor == PCI_VENDOR_ID_AL && ali_isa_bridge) {
/* Configure the ALi bridge logic. For non ALi rely on BIOS.
   Set the south bridge enable bit */
-- 
1.9.1



[PATCH 06/30] edd: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
is in domain 0. This prevents device drivers from being reused with
other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Elsewhere, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 drivers/firmware/edd.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/firmware/edd.c b/drivers/firmware/edd.c
index e229576..60a8f13 100644
--- a/drivers/firmware/edd.c
+++ b/drivers/firmware/edd.c
@@ -669,10 +669,10 @@ static void edd_release(struct kobject * kobj)
struct edd_info *info = edd_dev_get_info(edev);
 
if (edd_dev_is_type(edev, "PCI") || edd_dev_is_type(edev, "XPRS")) {
-   return pci_get_bus_and_slot(info->params.interface_path.pci.bus,
-   PCI_DEVFN(info->params.interface_path.pci.slot,
-  info->params.interface_path.pci.
-  function));
+   return pci_get_domain_bus_and_slot(0,
+   info->params.interface_path.pci.bus,
+   PCI_DEVFN(info->params.interface_path.pci.slot,
+   info->params.interface_path.pci.function));
}
return NULL;
 }
-- 
1.9.1



Re: [PATCH v1 8/9] perf stat: Remove --per-thread pid/tid limitation

2017-11-21 Thread Jin, Yao



On 11/21/2017 11:18 PM, Jiri Olsa wrote:
> On Mon, Nov 20, 2017 at 10:43:43PM +0800, Jin Yao wrote:
> > Currently, if we execute 'perf stat --per-thread' without specifying
> > pid/tid, perf will return error.
> >
> > root@skl:/tmp# perf stat --per-thread
> > The --per-thread option is only available when monitoring via -p -t options.
> >  -p, --pid <pid>   stat events on existing process id
> >  -t, --tid <tid>   stat events on existing thread id
> >
> > This patch removes this limitation. If no pid/tid specified, it returns
> > all threads (get threads from /proc).
> >
> > Signed-off-by: Jin Yao 
> > ---
> >   tools/perf/builtin-stat.c | 23 +++
> >   tools/perf/util/target.h  |  7 +++
> >   2 files changed, 22 insertions(+), 8 deletions(-)
> >
> > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> > index 9eec145..2d718f7 100644
> > --- a/tools/perf/builtin-stat.c
> > +++ b/tools/perf/builtin-stat.c
> > @@ -277,7 +277,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
> > attr->enable_on_exec = 1;
> > }
> >
> > -   if (target__has_cpu(&target))
> > +   if (target__has_cpu(&target) && !target__has_per_thread(&target))
>
> please add comment on why this is needed..
>
> > return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
> >
> > return perf_evsel__open_per_thread(evsel, evsel_list->threads);
> > @@ -340,7 +340,7 @@ static int read_counter(struct perf_evsel *counter)
> > int nthreads = thread_map__nr(evsel_list->threads);
> > int ncpus, cpu, thread;
> >
> > -   if (target__has_cpu(&target))
> > +   if (target__has_cpu(&target) && !target__has_per_thread(&target))
>
> same here

That's because this patch series doesn't support cpu_list yet. So if 
it's a cpu_list case, then skip.

I plan to add cpu_list support in a follow-up patch to avoid adding too 
much to this patch series.

Thanks
Jin Yao

> thanks,
> jirka
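
The condition under review can be illustrated with a small user-space model. Everything here is an assumption made for illustration: `struct toy_target` is a simplified stand-in for perf's target structure, and the two predicates are assumed definitions (a CPU target means system-wide mode or an explicit CPU list; per-thread mode means system-wide with the per-thread option set). The patch itself may define them differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for perf's target description. */
struct toy_target {
    const char *cpu_list;  /* explicit CPU list, or NULL */
    bool system_wide;      /* no pid/tid was given */
    bool per_thread;       /* --per-thread was requested */
};

/* Assumed definition: the target is expressed in terms of CPUs. */
static bool toy_has_cpu(const struct toy_target *t)
{
    return t->system_wide || t->cpu_list != NULL;
}

/* Assumed definition: system-wide counting, broken down per thread. */
static bool toy_has_per_thread(const struct toy_target *t)
{
    return t->system_wide && t->per_thread;
}

enum toy_open_mode { TOY_OPEN_PER_CPU, TOY_OPEN_PER_THREAD };

/* Mirrors the shape of the patched check: open per CPU only when the
 * target has CPUs and per-thread mode is not in effect. */
static enum toy_open_mode toy_pick_open_mode(const struct toy_target *t)
{
    if (toy_has_cpu(t) && !toy_has_per_thread(t))
        return TOY_OPEN_PER_CPU;
    return TOY_OPEN_PER_THREAD;
}
```

Under these assumptions, a plain system-wide run still opens counters per CPU, while a system-wide run with the per-thread flag falls through to the per-thread path.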



[PATCH 05/30] agp: nvidia: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
is in domain 0. This prevents device drivers from being reused with
other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Elsewhere, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 drivers/char/agp/nvidia-agp.c | 12 +++++++++---
 drivers/char/agp/sworks-agp.c |  3 ++-
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/char/agp/nvidia-agp.c b/drivers/char/agp/nvidia-agp.c
index 828b344..623205b 100644
--- a/drivers/char/agp/nvidia-agp.c
+++ b/drivers/char/agp/nvidia-agp.c
@@ -340,11 +340,17 @@ static int agp_nvidia_probe(struct pci_dev *pdev,
u8 cap_ptr;
 
nvidia_private.dev_1 =
-   pci_get_bus_and_slot((unsigned int)pdev->bus->number, PCI_DEVFN(0, 1));
+   pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+   (unsigned int)pdev->bus->number,
+   PCI_DEVFN(0, 1));
nvidia_private.dev_2 =
-   pci_get_bus_and_slot((unsigned int)pdev->bus->number, PCI_DEVFN(0, 2));
+   pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+   (unsigned int)pdev->bus->number,
+   PCI_DEVFN(0, 2));
nvidia_private.dev_3 =
-   pci_get_bus_and_slot((unsigned int)pdev->bus->number, PCI_DEVFN(30, 0));
+   pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+   (unsigned int)pdev->bus->number,
+   PCI_DEVFN(30, 0));
 
if (!nvidia_private.dev_1 || !nvidia_private.dev_2 || !nvidia_private.dev_3) {
printk(KERN_INFO PFX "Detected an NVIDIA nForce/nForce2 "
diff --git a/drivers/char/agp/sworks-agp.c b/drivers/char/agp/sworks-agp.c
index 03be4ac..4dbdd3b 100644
--- a/drivers/char/agp/sworks-agp.c
+++ b/drivers/char/agp/sworks-agp.c
@@ -474,7 +474,8 @@ static int agp_serverworks_probe(struct pci_dev *pdev,
}
 
/* Everything is on func 1 here so we are hardcoding function one */
-   bridge_dev = pci_get_bus_and_slot((unsigned int)pdev->bus->number,
+   bridge_dev = pci_get_domain_bus_and_slot(pci_domain_nr(pdev->bus),
+   (unsigned int)pdev->bus->number,
PCI_DEVFN(0, 1));
if (!bridge_dev) {
dev_info(&pdev->dev, "can't find secondary device\n");
-- 
1.9.1
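
Where a driver already holds a device, the domain can be taken from that device instead of being hardcoded. A minimal user-space sketch of that idea follows. It is not the kernel API: `struct toy_dev` and `toy_find_sibling()` are hypothetical names, and the device table is passed in explicitly so the example stays self-contained.

```c
#include <stddef.h>

/* Hypothetical stand-in for a PCI device's address. */
struct toy_dev {
    int domain;
    unsigned int bus;
    unsigned int devfn;
};

/* Same bit layout as PCI_DEVFN(): 5 bits of slot, 3 bits of function. */
#define TOY_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Find another function on the same bus AND in the same domain as pdev,
 * which is the role pci_domain_nr(pdev->bus) plays in the patch. */
static const struct toy_dev *toy_find_sibling(const struct toy_dev *devs,
                                              size_t ndevs,
                                              const struct toy_dev *pdev,
                                              unsigned int slot,
                                              unsigned int func)
{
    size_t i;

    for (i = 0; i < ndevs; i++) {
        if (devs[i].domain == pdev->domain &&
            devs[i].bus == pdev->bus &&
            devs[i].devfn == TOY_DEVFN(slot, func))
            return &devs[i];
    }
    return NULL;
}
```

With two domains that both contain a device at bus 0, slot 0, function 1, the lookup returns the one that shares a domain with the probing device rather than whichever sits in domain 0.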



[PATCH 07/30] ibft: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
is in domain 0. This prevents device drivers from being reused with
other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Elsewhere, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 drivers/firmware/iscsi_ibft.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/iscsi_ibft.c b/drivers/firmware/iscsi_ibft.c
index 14042a6..6bc8e66 100644
--- a/drivers/firmware/iscsi_ibft.c
+++ b/drivers/firmware/iscsi_ibft.c
@@ -719,8 +719,9 @@ static int __init ibft_create_kobject(struct acpi_table_ibft *header,
* executes only devices which are in domain 0. Furthermore, the
* iBFT spec doesn't have a domain id field :-(
*/
-   pci_dev = pci_get_bus_and_slot((nic->pci_bdf & 0xff00) >> 8,
-  (nic->pci_bdf & 0xff));
+   pci_dev = pci_get_domain_bus_and_slot(0,
+   (nic->pci_bdf & 0xff00) >> 8,
+   (nic->pci_bdf & 0xff));
if (pci_dev) {
rc = sysfs_create_link(&boot_kobj->kobj,
   &pci_dev->dev.kobj, "device");
-- 
1.9.1
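
The masks in the hunk above split a 16-bit bus/device/function value into a bus number and a devfn byte. A small user-space sketch of that split follows; `struct toy_bdf`, `toy_split_bdf()` and `toy_join_bdf()` are hypothetical names used only for illustration, and the layout (bus in the high byte, devfn in the low byte) is taken from the masks in the patch.

```c
#include <stdint.h>

/* Hypothetical holder for the two values the patch extracts. */
struct toy_bdf {
    unsigned int bus;    /* high byte of the 16-bit value */
    unsigned int devfn;  /* low byte: slot and function combined */
};

/* Split a 16-bit BDF using the same masks as the patch. */
static struct toy_bdf toy_split_bdf(uint16_t pci_bdf)
{
    struct toy_bdf out;

    out.bus = (pci_bdf & 0xff00) >> 8;
    out.devfn = pci_bdf & 0xff;
    return out;
}

/* Inverse operation, useful for checking the split. */
static uint16_t toy_join_bdf(unsigned int bus, unsigned int devfn)
{
    return (uint16_t)(((bus & 0xff) << 8) | (devfn & 0xff));
}
```

Since the value has no room for a domain, the caller has to supply one separately, which is why the patch passes 0.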



[PATCH 17/30] PCI: cpqhp: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
is in domain 0. This prevents device drivers from being reused with
other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Elsewhere, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 drivers/pci/hotplug/cpqphp_pci.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/cpqphp_pci.c b/drivers/pci/hotplug/cpqphp_pci.c
index e220d49..8897a77 100644
--- a/drivers/pci/hotplug/cpqphp_pci.c
+++ b/drivers/pci/hotplug/cpqphp_pci.c
@@ -89,7 +89,9 @@ int cpqhp_configure_device(struct controller *ctrl, struct pci_func *func)
pci_lock_rescan_remove();
 
if (func->pci_dev == NULL)
-   func->pci_dev = pci_get_bus_and_slot(func->bus, PCI_DEVFN(func->device, func->function));
+   func->pci_dev = pci_get_domain_bus_and_slot(0, func->bus,
+   PCI_DEVFN(func->device,
+   func->function));
 
/* No pci device, we need to create it then */
if (func->pci_dev == NULL) {
@@ -99,7 +101,9 @@ int cpqhp_configure_device(struct controller *ctrl, struct pci_func *func)
if (num)
pci_bus_add_devices(ctrl->pci_dev->bus);
 
-   func->pci_dev = pci_get_bus_and_slot(func->bus, PCI_DEVFN(func->device, func->function));
+   func->pci_dev = pci_get_domain_bus_and_slot(0, func->bus,
+   PCI_DEVFN(func->device,
+   func->function));
if (func->pci_dev == NULL) {
dbg("ERROR: pci_dev still null\n");
goto out;
@@ -129,7 +133,10 @@ int cpqhp_unconfigure_device(struct pci_func *func)
 
pci_lock_rescan_remove();
for (j = 0; j < 8 ; j++) {
-   struct pci_dev *temp = pci_get_bus_and_slot(func->bus, PCI_DEVFN(func->device, j));
+   struct pci_dev *temp = pci_get_domain_bus_and_slot(0,
+   func->bus,
+   PCI_DEVFN(func->device,
+   j));
if (temp) {
pci_dev_put(temp);
pci_stop_and_remove_bus_device(temp);
@@ -319,6 +326,7 @@ int cpqhp_save_config(struct controller *ctrl, int busnumber, int is_hot_plug)
int cloop = 0;
int stop_it;
int index;
+   u16 devfn;
 
/* Decide which slots are supported */
 
@@ -416,7 +424,9 @@ int cpqhp_save_config(struct controller *ctrl, int busnumber, int is_hot_plug)
new_slot->switch_save = 0x10;
/* In case of unsupported board */
new_slot->status = DevError;
-   new_slot->pci_dev = pci_get_bus_and_slot(new_slot->bus, (new_slot->device << 3) | new_slot->function);
+   devfn = (new_slot->device << 3) | new_slot->function;
+   new_slot->pci_dev = pci_get_domain_bus_and_slot(0,
+   new_slot->bus, devfn);
 
for (cloop = 0; cloop < 0x20; cloop++) {
rc = pci_bus_read_config_dword(ctrl->pci_bus, PCI_DEVFN(device, function), cloop << 2, (u32 *) &(new_slot->config_space[cloop]));
-- 
1.9.1
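
The last hunk of this patch builds `devfn` by hand as `(device << 3) | function`, while the other hunks use `PCI_DEVFN()`. The sketch below compares the two encodings in user-space C. The `toy_` helpers are hypothetical re-implementations written for this example (the bit layout follows the kernel's `PCI_DEVFN`, `PCI_SLOT` and `PCI_FUNC` macros), not the kernel macros themselves.

```c
/* Masked encoding: 5 bits of slot, 3 bits of function. */
static unsigned int toy_devfn(unsigned int slot, unsigned int func)
{
    return ((slot & 0x1f) << 3) | (func & 0x07);
}

/* Decoders for the two fields. */
static unsigned int toy_slot(unsigned int devfn)
{
    return (devfn >> 3) & 0x1f;
}

static unsigned int toy_func(unsigned int devfn)
{
    return devfn & 0x07;
}

/* Open-coded form as written in the cpqhp_save_config() hunk: the same
 * result for in-range inputs, but without any masking. */
static unsigned int toy_open_coded_devfn(unsigned int device,
                                         unsigned int function)
{
    return (device << 3) | function;
}
```

For valid slot (0 to 31) and function (0 to 7) values both forms agree. They differ only when an input is out of range, because the open-coded form does not mask.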



[PATCH 14/30] powerpc/powermac: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
is in domain 0. This prevents device drivers from being reused with
other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Elsewhere, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 drivers/macintosh/via-pmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/macintosh/via-pmu.c b/drivers/macintosh/via-pmu.c
index c4c2b3b..3e8b3b6 100644
--- a/drivers/macintosh/via-pmu.c
+++ b/drivers/macintosh/via-pmu.c
@@ -1799,7 +1799,7 @@ static int powerbook_sleep_grackle(void)
struct adb_request req;
struct pci_dev *grackle;
 
-   grackle = pci_get_bus_and_slot(0, 0);
+   grackle = pci_get_domain_bus_and_slot(0, 0, 0);
if (!grackle)
return -ENODEV;
 
-- 
1.9.1



[PATCH 15/30] bnx2x: deprecate pci_get_bus_and_slot()

2017-11-21 Thread Sinan Kaya
pci_get_bus_and_slot() is restrictive in that it assumes the PCI device
is in domain 0. This prevents device drivers from being reused with
other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Elsewhere, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya 
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 10 +++++++++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h |  1 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
index 9ca994d..9f40c23 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
@@ -812,7 +812,7 @@ static u8 bnx2x_vf_is_pcie_pending(struct bnx2x *bp, u8 abs_vfid)
if (!vf)
return false;
 
-   dev = pci_get_bus_and_slot(vf->bus, vf->devfn);
+   dev = pci_get_domain_bus_and_slot(vf->domain, vf->bus, vf->devfn);
if (dev)
return bnx2x_is_pcie_pending(dev);
return false;
@@ -1041,6 +1041,13 @@ void bnx2x_iov_init_dmae(struct bnx2x *bp)
REG_WR(bp, DMAE_REG_BACKWARD_COMP_EN, 0);
 }
 
+static int bnx2x_vf_domain(struct bnx2x *bp, int vfid)
+{
+   struct pci_dev *dev = bp->pdev;
+
+   return pci_domain_nr(dev->bus);
+}
+
 static int bnx2x_vf_bus(struct bnx2x *bp, int vfid)
 {
struct pci_dev *dev = bp->pdev;
@@ -1611,6 +1618,7 @@ int bnx2x_iov_nic_init(struct bnx2x *bp)
struct bnx2x_virtf *vf = BP_VF(bp, vfid);
 
/* fill in the BDF and bars */
+   vf->domain = bnx2x_vf_domain(bp, vfid);
vf->bus = bnx2x_vf_bus(bp, vfid);
vf->devfn = bnx2x_vf_devfn(bp, vfid);
bnx2x_vf_set_bars(bp, vf);
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
index 53466f6..eb814c6 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h
@@ -182,6 +182,7 @@ struct bnx2x_virtf {
u32 error;  /* 0 means all's-well */
 
/* BDF */
+   unsigned int domain;
unsigned int bus;
unsigned int devfn;
 
-- 
1.9.1



Re: [BUG] 4.4.x-rt - memcg: refill_stock() use get_cpu_light() has data corruption issue

2017-11-21 Thread Mike Galbraith
On Tue, 2017-11-21 at 22:50 -0500, Steven Rostedt wrote:
> 
> Does it work if you revert the patch?

That would restore the gripe.  How about this..

mm, memcg: serialize consume_stock(), drain_local_stock() and refill_stock()

Haiyang HY1 Tan reports encountering races between drain_stock() and
refill_stock(), resulting in drain_stock() draining stock freshly assigned
by refill_stock().  This doesn't appear to have been safe before RT touched
any of it due to drain_local_stock() being preemptible until db2ba40c277d
came along and disabled irqs across the lot.  Rather than do that with
the upstream RT replacement with local_lock_irqsave/restore() since
older trees don't yet need to be irq safe, use the local lock name and
placement for consistency, but serialize with get/put_locked_var().

The below may not deserve full credit for the breakage, but it surely
didn't help, so tough, it gets to wear the BPB.

Reported-by: Haiyang HY1 Tan 
Signed-off-by: Mike Galbraith 
Fixes: ("mm, memcg: make refill_stock() use get_cpu_light()")
---
 mm/memcontrol.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1861,6 +1861,7 @@ struct memcg_stock_pcp {
 #define FLUSHING_CACHED_CHARGE 0
 };
 static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock);
+static DEFINE_LOCAL_IRQ_LOCK(memcg_stock_ll);
 static DEFINE_MUTEX(percpu_charge_mutex);
 
 /**
@@ -1882,12 +1883,12 @@ static bool consume_stock(struct mem_cgr
if (nr_pages > CHARGE_BATCH)
return ret;
 
-   stock = &get_cpu_var(memcg_stock);
+   stock = &get_locked_var(memcg_stock_ll, memcg_stock);
if (memcg == stock->cached && stock->nr_pages >= nr_pages) {
stock->nr_pages -= nr_pages;
ret = true;
}
-   put_cpu_var(memcg_stock);
+   put_locked_var(memcg_stock_ll, memcg_stock);
return ret;
 }
 
@@ -1914,9 +1915,12 @@ static void drain_stock(struct memcg_sto
  */
 static void drain_local_stock(struct work_struct *dummy)
 {
-   struct memcg_stock_pcp *stock = this_cpu_ptr(&memcg_stock);
+   struct memcg_stock_pcp *stock;
+
+   stock = &get_locked_var(memcg_stock_ll, memcg_stock);
drain_stock(stock);
clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
+   put_locked_var(memcg_stock_ll, memcg_stock);
 }
 
 /*
@@ -1926,16 +1930,15 @@ static void drain_local_stock(struct wor
 static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
struct memcg_stock_pcp *stock;
-   int cpu = get_cpu_light();
 
-   stock = &per_cpu(memcg_stock, cpu);
+   stock = &get_locked_var(memcg_stock_ll, memcg_stock);
 
if (stock->cached != memcg) { /* reset if necessary */
drain_stock(stock);
stock->cached = memcg;
}
stock->nr_pages += nr_pages;
-   put_cpu_light();
+   put_locked_var(memcg_stock_ll, memcg_stock);
 }
 
 /*
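
The serialization idea in this patch can be sketched in user space: one lock guards the cached stock, and all three entry points take it. This is a toy model only. A pthread mutex stands in for the RT local lock, a single global stands in for the per-CPU variable, and `struct toy_stock` with its `returned` counter is a hypothetical simplification of `struct memcg_stock_pcp`.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplification of the per-CPU stock. */
struct toy_stock {
    const void *cached;     /* which group the cached pages belong to */
    unsigned int nr_pages;  /* pages currently cached */
    unsigned int returned;  /* pages given back by drains (for inspection) */
};

static struct toy_stock toy_cpu_stock;
static pthread_mutex_t toy_stock_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold toy_stock_lock. */
static void toy_drain_stock(struct toy_stock *s)
{
    s->returned += s->nr_pages;
    s->nr_pages = 0;
    s->cached = NULL;
}

static bool toy_consume_stock(const void *memcg, unsigned int nr_pages)
{
    bool ret = false;

    pthread_mutex_lock(&toy_stock_lock);
    if (memcg == toy_cpu_stock.cached && toy_cpu_stock.nr_pages >= nr_pages) {
        toy_cpu_stock.nr_pages -= nr_pages;
        ret = true;
    }
    pthread_mutex_unlock(&toy_stock_lock);
    return ret;
}

static void toy_drain_local_stock(void)
{
    pthread_mutex_lock(&toy_stock_lock);
    toy_drain_stock(&toy_cpu_stock);
    pthread_mutex_unlock(&toy_stock_lock);
}

static void toy_refill_stock(const void *memcg, unsigned int nr_pages)
{
    pthread_mutex_lock(&toy_stock_lock);
    if (toy_cpu_stock.cached != memcg) {  /* reset if necessary */
        toy_drain_stock(&toy_cpu_stock);
        toy_cpu_stock.cached = memcg;
    }
    toy_cpu_stock.nr_pages += nr_pages;
    pthread_mutex_unlock(&toy_stock_lock);
}
```

Because the drain and the refill hold the same lock, a drain cannot run between the reset and the `nr_pages` update in the refill path, which is the window the bug report describes.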

