[PATCH v3] kernel/module_64.c: Add REL24 relocation support of livepatch symbols

2017-10-16 Thread Kamalesh Babulal
Livepatch re-uses the module loader's apply_relocate_add() function to
write relocations, instead of managing them with the arch-dependent
klp_write_module_reloc() function.

apply_relocate_add() doesn't understand livepatch symbols (marked with
the SHN_LIVEPATCH symbol section index) and, for the R_PPC64_REL24
relocation type, assumes them to be local symbols by default. It fails
with an error when trying to calculate the offset with
local_entry_offset():

module_64: kpatch_meminfo: REL24 -1152921504897399800 out of range!

Livepatch symbols, however, are essentially SHN_UNDEF and should be
called via the stub used for global calls. This issue can be fixed by
teaching apply_relocate_add() to handle both SHN_UNDEF and SHN_LIVEPATCH
symbols via the same stub. That isn't a complete fix, though, as it still
fails for local calls that become global.

Consider the livepatch sequence [1] below, where function A calls B, B
is the livepatched function, and any call to function B is redirected to
the patched version P. The livepatched function P then calls function C
in M2, where C was local to function B but is global to function P.

  ++++++  ++
  ||   +++--->||  +-->||
  |  A |   ||  B ||  F |  |   |  P |
  ||   ||||+--+   ||
  |+---+||||<-+   ||
  ||<--+   ++   C|||  |   ||
  ||   |   | +->||||  |   ||<---+
  | K / M1 |   |   | |  | K / M2 |  +-+ Kernel |  +---+ Mod3   +--+ |
  ++   |   | |  ++  | ++  ++  | |
   |   | |  | | |
   +---+-+--+ | |
   | || |
   | ++ |
   ++

Handling such a call with the existing stub triggers another error:

module_64: kpatch_meminfo: Expect noop after relocate, got 3d22

The reason is that the ppc64le ELF ABI v2 expects a nop instruction after
every branch to a global call; that nop gets overwritten by an instruction
that restores the TOC in r2 after the call returns. Given that function C
was local to function B, it does not store/restore the TOC, as the TOC is
not expected to be clobbered for functions called via the local entry
point.

The existing stub could be extended to restore the TOC, giving a single
stub for both SHN_UNDEF and SHN_LIVEPATCH symbols, but such a combined
stub adds overhead to non-livepatch calls in the form of the additional
TOC-restore instructions.

A new stub for calling livepatch symbols, using an intermediate stack to
store/restore TOC/LR between the livepatched function and the global
function, would work for most cases but would fail when arguments are
passed on the stack between the functions.

This problem has already been solved by the introduction of
livepatch_handler, which runs in _mcount context and uses a livepatch
stack growing upwards from the base of the regular stack, eliminating the
need for an intermediate stack.

The current approach is to set up a klp_stub that mimics the
functionality of entry_64.S::livepatch_handler, storing/restoring the
TOC/LR values in addition to the usual stub setup and branch. This patch
also introduces the new ppc64le_klp_stub_entry[], along with helpers to
find/allocate a livepatch stub.
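
As a rough illustration only (a hedged sketch, not the literal patch: the
wrapper name branch_target_via_stub() and its signature are made up for this
example, while stub_for_addr() is the existing module_64.c helper and
klp_stub_for_addr() is the new helper described above), the symbol routing
for R_PPC64_REL24 amounts to:

  /* Illustrative sketch: route undefined and livepatch symbols via a stub. */
  static unsigned long branch_target_via_stub(Elf64_Shdr *sechdrs,
                                              Elf64_Sym *sym,
                                              unsigned long value,
                                              struct module *me)
  {
          if (sym->st_shndx == SHN_LIVEPATCH)
                  /* Livepatch symbol: new stub that also saves/restores TOC/LR. */
                  return klp_stub_for_addr(sechdrs, value, me);

          if (sym->st_shndx == SHN_UNDEF)
                  /* Regular global call: existing stub used for SHN_UNDEF. */
                  return stub_for_addr(sechdrs, value, me);

          /* Local symbol: branch directly (caller adds local_entry_offset(sym)). */
          return value;
  }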

[1] ASCII diagram adapted from:
http://mpe.github.io/posts/2016/05/23/kernel-live-patching-for-ppc64le/

Signed-off-by: Kamalesh Babulal 
Reviewed-by: Balbir Singh 
Cc: Naveen N. Rao 
Cc: Josh Poimboeuf 
Cc: Jessica Yu 
Cc: Ananth N Mavinakayanahalli 
Cc: Aravinda Prasad 
Cc: Torsten Duwe 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: live-patch...@vger.kernel.org
---
v3:
 - Defined FUNC_DESC_OFFSET to calculate func_desc offset from
   struct ppc64le_klp_stub_entry.
 - Replaced BUG_ON() with WARN_ON() in klp_stub_for_addr().
 - Major commit message re-write.

v2:
 - Changed klp_stub construction to re-use livepatch_handler and
   additional patch code required for klp_stub, instead of duplicating it.
 - Minor comments and commit body edit.

 arch/powerpc/include/asm/module.h  |   4 +
 arch/powerpc/kernel/module_64.c| 139 -
 arch/powerpc/kernel/trace/ftrace_64_mprofile.S |  31 ++
 3 files changed, 171 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/module.h 
b/arch/powerpc/include/asm/module.h
index 6c0132c..de22c4c 100644
--- a/arch/powerpc/include/asm/module.h
+++ b/arch/powerpc/include/asm/module.h
@@ -44,6 +44,10 @@ struct mod_arch_specific {
unsigned long toc;
unsigned long tramp;
 #endif
+#ifdef CONFIG_LIVEPATCH
+   /* Count of kernel livepatch relocations */
+   unsigned long klp_relocs;
+#endif
 
 #else /* powerpc64 */
/* Indices of PLT sections within module. */
diff --git a/arch/po

Re: [PATCH] powerpc/watchdog: Convert timers to use timer_setup()

2017-10-16 Thread Michael Ellerman
Kees Cook  writes:

> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
>
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Kees Cook 
> ---
>  arch/powerpc/kernel/watchdog.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index 15e209a37c2d..50797528b5e1 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -263,9 +263,8 @@ static void wd_timer_reset(unsigned int cpu, struct 
> timer_list *t)
>   add_timer_on(t, cpu);
>  }
>  
> -static void wd_timer_fn(unsigned long data)
> +static void wd_timer_fn(struct timer_list *t)

At first glance this looks like it will break, changing the signature,
but the new timer_setup() force-casts to the old signature (for now),
and we can fit a pointer in an unsigned long so it's OK.

>  {
> - struct timer_list *t = this_cpu_ptr(&wd_timer);
>   int cpu = smp_processor_id();
>  
>   watchdog_timer_interrupt(cpu);
> @@ -292,7 +291,7 @@ static void start_watchdog_timer_on(unsigned int cpu)
>  
>   per_cpu(wd_timer_tb, cpu) = get_tb();
>  
> - setup_pinned_timer(t, wd_timer_fn, 0);
> + timer_setup(t, wd_timer_fn, TIMER_PINNED);
>   wd_timer_reset(cpu, t);
>  }
>  

Acked-by: Michael Ellerman 

cheers


Re: [PATCH v3 2/2] pseries/eeh: Add Pseries pcibios_bus_add_device

2017-10-16 Thread Michael Ellerman
"Bryant G. Ly"  writes:
> On 10/12/17 1:29 PM, Bjorn Helgaas wrote:
...
>>
>> If that's the case, how to you ever bind a driver to these VFs?  The
>> changelog says you don't want VF drivers to load *immediately*, so I
>> assume you do want them to load eventually.
>>
> The VFs that get dynamically created within the configure SR-IOV call,
> on the Pseries Platform, won't be matched with a driver - we do not
> want them to match.
>
> The Power Hypervisor will load the VFs. The VFs will get assigned (by
> the user) via the HMC or Novalink in this environment, which will
> then trigger PHYP to load the VF device node into the device tree.

What about the other "Power Hypervisor"? ie. KVM running on Power.

We also use the pseries platform when running under KVM.

cheers


[PATCH 2/2] mm/mmu_notifier: avoid call to invalidate_range() in range_end()

2017-10-16 Thread jglisse
From: Jérôme Glisse 

This is an optimization patch that only affects mmu_notifier users which
rely on the invalidate_range() callback. It avoids calling that callback
twice in a row from inside __mmu_notifier_invalidate_range_end().

Existing pattern (before this patch):
mmu_notifier_invalidate_range_start()
pte/pmd/pud_clear_flush_notify()
mmu_notifier_invalidate_range()
mmu_notifier_invalidate_range_end()
mmu_notifier_invalidate_range()

New pattern (after this patch):
mmu_notifier_invalidate_range_start()
pte/pmd/pud_clear_flush_notify()
mmu_notifier_invalidate_range()
mmu_notifier_invalidate_range_only_end()

We call the invalidate_range callback after clearing the page table under
the page table lock and we skip the call to invalidate_range inside the
__mmu_notifier_invalidate_range_end() function.
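
As a rough illustration (a minimal sketch, not code from this series; the
function name and locals are made up, and the page table locking is
omitted), a caller that already notified under the page table lock can use
the new variant like this:

  /*
   * Minimal sketch of the new pattern; pte_offset_map_lock() and friends
   * are omitted for brevity.
   */
  static void example_replace_pte(struct vm_area_struct *vma,
                                  unsigned long addr, pte_t *ptep, pte_t newpte)
  {
          struct mm_struct *mm = vma->vm_mm;
          unsigned long start = addr & PAGE_MASK;
          unsigned long end = start + PAGE_SIZE;

          mmu_notifier_invalidate_range_start(mm, start, end);

          /* Clears the pte and calls ->invalidate_range() under the ptl. */
          ptep_clear_flush_notify(vma, addr, ptep);
          set_pte_at(mm, addr, ptep, newpte);

          /*
           * ->invalidate_range() already ran above, so skip the second call
           * that mmu_notifier_invalidate_range_end() would otherwise make.
           */
          mmu_notifier_invalidate_range_only_end(mm, start, end);
  }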

Idea from Andrea Arcangeli

Signed-off-by: Jérôme Glisse 
Cc: Andrea Arcangeli 
Cc: Andrew Morton 
Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Cc: David Woodhouse 
Cc: Alistair Popple 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Stephen Rothwell 
Cc: Andrew Donnellan 

Cc: io...@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
---
 include/linux/mmu_notifier.h | 17 ++--
 mm/huge_memory.c | 46 
 mm/memory.c  |  6 +-
 mm/migrate.c | 15 ---
 mm/mmu_notifier.c| 11 +--
 5 files changed, 83 insertions(+), 12 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 49c925c96b8a..6665c4624287 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -213,7 +213,8 @@ extern void __mmu_notifier_change_pte(struct mm_struct *mm,
 extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
  unsigned long start, unsigned long end);
 extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
- unsigned long start, unsigned long end);
+ unsigned long start, unsigned long end,
+ bool only_end);
 extern void __mmu_notifier_invalidate_range(struct mm_struct *mm,
  unsigned long start, unsigned long end);
 
@@ -267,7 +268,14 @@ static inline void 
mmu_notifier_invalidate_range_end(struct mm_struct *mm,
  unsigned long start, unsigned long end)
 {
if (mm_has_notifiers(mm))
-   __mmu_notifier_invalidate_range_end(mm, start, end);
+   __mmu_notifier_invalidate_range_end(mm, start, end, false);
+}
+
+static inline void mmu_notifier_invalidate_range_only_end(struct mm_struct *mm,
+ unsigned long start, unsigned long end)
+{
+   if (mm_has_notifiers(mm))
+   __mmu_notifier_invalidate_range_end(mm, start, end, true);
 }
 
 static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
@@ -438,6 +446,11 @@ static inline void 
mmu_notifier_invalidate_range_end(struct mm_struct *mm,
 {
 }
 
+static inline void mmu_notifier_invalidate_range_only_end(struct mm_struct *mm,
+ unsigned long start, unsigned long end)
+{
+}
+
 static inline void mmu_notifier_invalidate_range(struct mm_struct *mm,
  unsigned long start, unsigned long end)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index ff5bc647b51d..b2912305994f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1220,7 +1220,12 @@ static int do_huge_pmd_wp_page_fallback(struct vm_fault 
*vmf, pmd_t orig_pmd,
page_remove_rmap(page, true);
spin_unlock(vmf->ptl);
 
-   mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start, mmun_end);
+   /*
+* No need to double call mmu_notifier->invalidate_range() callback as
+* the above pmdp_huge_clear_flush_notify() did already call it.
+*/
+   mmu_notifier_invalidate_range_only_end(vma->vm_mm, mmun_start,
+   mmun_end);
 
ret |= VM_FAULT_WRITE;
put_page(page);
@@ -1369,7 +1374,12 @@ int do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t 
orig_pmd)
}
spin_unlock(vmf->ptl);
 out_mn:
-   mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start, mmun_end);
+   /*
+* No need to double call mmu_notifier->invalidate_range() callback as
+* the above pmdp_huge_clear_flush_notify() did already call it.
+*/
+   mmu_notifier_invalidate_range_only_end(vma->vm_mm, mmun_start,
+  mmun_end);
 out:
return ret;
 out_unlock:
@@ -2021,7 +2031,12 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t 
*pud,
 
 out:
spin_unlock(ptl);
-   mmu_notifier_invalidate_range_end(mm, haddr, haddr + HPAGE

[PATCH 1/2] mm/mmu_notifier: avoid double notification when it is useless v2

2017-10-16 Thread jglisse
From: Jérôme Glisse 

This patch only affects users of the mmu_notifier->invalidate_range callback
(device drivers related to ATS/PASID, CAPI, IOMMUv2, SVM, ...), and it is an
optimization for those users. Everyone else is unaffected by it.

When clearing a pte/pmd we are given a choice to notify the event under
the page table lock (the notify versions of the *_clear_flush helpers do
call mmu_notifier_invalidate_range). But that notification is not necessary
in all cases.

This patch removes almost all cases where it is useless to call
mmu_notifier_invalidate_range before mmu_notifier_invalidate_range_end.
It also adds documentation for all those cases explaining why.

Below is a more in-depth analysis of why it is fine to do this:

Consider secondary TLBs (non-CPU TLBs) like the IOMMU TLB or a device TLB
(when a device uses something like ATS/PASID to get the IOMMU to walk the
CPU page table to access a process virtual address space). There are only
2 cases when you need to notify those secondary TLBs while holding the
page table lock when clearing a pte/pmd:

  A) the page backing the address is freed before mmu_notifier_invalidate_range_end
  B) a page table entry is updated to point to a new page (COW, write fault
     on the zero page, __replace_page(), ...)

Case A is obvious: you do not want to take the risk of the device writing
to a page that might now be used by something completely different.

Case B is more subtle. For correctness it requires the following sequence
to happen:
  - take the page table lock
  - clear the page table entry and notify (pmd/pte_huge_clear_flush_notify())
  - set the page table entry to point to the new page

If clearing the page table entry is not followed by a notify before setting
the new pte/pmd value, then you can break the memory model (e.g. C11 or
C++11) for the device.

Consider the following scenario (the device uses a feature similar to ATS/
PASID):

Two addresses, addrA and addrB, such that |addrA - addrB| >= PAGE_SIZE; we
assume they are write protected for COW (the other cases of B apply too).

[Time N] -
CPU-thread-0  {try to write to addrA}
CPU-thread-1  {try to write to addrB}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {read addrA and populate device TLB}
DEV-thread-2  {read addrB and populate device TLB}
[Time N+1] ---
CPU-thread-0  {COW_step0: {mmu_notifier_invalidate_range_start(addrA)}}
CPU-thread-1  {COW_step0: {mmu_notifier_invalidate_range_start(addrB)}}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+2] ---
CPU-thread-0  {COW_step1: {update page table point to new page for addrA}}
CPU-thread-1  {COW_step1: {update page table point to new page for addrB}}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+3] ---
CPU-thread-0  {preempted}
CPU-thread-1  {preempted}
CPU-thread-2  {write to addrA which is a write to new page}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+3] ---
CPU-thread-0  {preempted}
CPU-thread-1  {preempted}
CPU-thread-2  {}
CPU-thread-3  {write to addrB which is a write to new page}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+4] ---
CPU-thread-0  {preempted}
CPU-thread-1  {COW_step3: {mmu_notifier_invalidate_range_end(addrB)}}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {}
DEV-thread-2  {}
[Time N+5] ---
CPU-thread-0  {preempted}
CPU-thread-1  {}
CPU-thread-2  {}
CPU-thread-3  {}
DEV-thread-0  {read addrA from old page}
DEV-thread-2  {read addrB from new page}

So here, because at time N+2 the cleared page table entry was not paired
with a notification to invalidate the secondary TLB, the device sees the
new value for addrB before seeing the new value for addrA. This breaks
total memory ordering for the device.

When changing a pte to write protect, or to point to a new write-protected
page with the same content (KSM), it is OK to delay the invalidate_range
callback to mmu_notifier_invalidate_range_end() outside the page table
lock. This is true even if the thread doing the page table update is
preempted right after releasing the page table lock, before calling
mmu_notifier_invalidate_range_end().

Changed since v1:
  - typos (thanks to Andrea)
  - Avoid unnecessary precaution in try_to_unmap() (Andrea)
  - Be more conservative in try_to_unmap_one()

Signed-off-by: Jérôme Glisse 
Cc: Andrea Arcangeli 
Cc: Nadav Amit 
Cc: Linus Torvalds 
Cc: Andrew Morton 
Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Cc: David Woodhouse 
Cc: Alistair Popple 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Stephen Rothwell 
Cc: Andrew Donnellan 

Cc: io...@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-n...@vger.kernel.org
--

[PATCH 0/2] Optimize mmu_notifier->invalidate_range callback

2017-10-16 Thread jglisse
From: Jérôme Glisse 

(Andrew, you already have v1 of patch 1 in your queue; patch 2 is new.
 I think you can drop patch 1 v1 in favour of v2; v2 is a bit more
 conservative and I fixed the typos.)

All this only affects users of the invalidate_range callback (at this time
CAPI in arch/powerpc/platforms/powernv/npu-dma.c, and IOMMU ATS/PASID in
drivers/iommu/amd_iommu_v2.c|intel-svm.c).

This patchset removes the useless double call to the
mmu_notifier->invalidate_range callback wherever it is safe to do so. The
first patch just removes the useless calls and adds documentation explaining
why that is safe. The second patch goes further by introducing
mmu_notifier_invalidate_range_only_end(), which skips the callback to
invalidate_range; this can be done when clearing a pte, pmd or pud with
notification, which calls invalidate_range right after clearing under the
page table lock.

It should improve performance, but I am lacking the hardware and benchmarks
that might show an improvement. Maybe folks in Cc can help here.

Cc: Andrea Arcangeli 
Cc: Andrew Morton 
Cc: Joerg Roedel 
Cc: Suravee Suthikulpanit 
Cc: David Woodhouse 
Cc: Alistair Popple 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Stephen Rothwell 
Cc: Andrew Donnellan 
Cc: io...@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org

Jérôme Glisse (2):
  mm/mmu_notifier: avoid double notification when it is useless v2
  mm/mmu_notifier: avoid call to invalidate_range() in range_end()

 Documentation/vm/mmu_notifier.txt | 93 +++
 fs/dax.c  |  9 +++-
 include/linux/mmu_notifier.h  | 20 +++--
 mm/huge_memory.c  | 66 ---
 mm/hugetlb.c  | 16 +--
 mm/ksm.c  | 15 ++-
 mm/memory.c   |  6 ++-
 mm/migrate.c  | 15 +--
 mm/mmu_notifier.c | 11 -
 mm/rmap.c | 59 ++---
 10 files changed, 281 insertions(+), 29 deletions(-)
 create mode 100644 Documentation/vm/mmu_notifier.txt

-- 
2.13.6



Re: [PATCH] powerpc/watchdog: Convert timers to use timer_setup()

2017-10-16 Thread Nicholas Piggin
On Mon, 16 Oct 2017 16:47:10 -0700
Kees Cook  wrote:

> In preparation for unconditionally passing the struct timer_list pointer to
> all timer callbacks, switch to using the new timer_setup() and from_timer()
> to pass the timer pointer explicitly.
> 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: Nicholas Piggin 
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Kees Cook 

Looks fine to me. Is this intended to be merged via the powerpc tree
in the next merge window?

Acked-by: Nicholas Piggin 

> ---
>  arch/powerpc/kernel/watchdog.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index 15e209a37c2d..50797528b5e1 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -263,9 +263,8 @@ static void wd_timer_reset(unsigned int cpu, struct 
> timer_list *t)
>   add_timer_on(t, cpu);
>  }
>  
> -static void wd_timer_fn(unsigned long data)
> +static void wd_timer_fn(struct timer_list *t)
>  {
> - struct timer_list *t = this_cpu_ptr(&wd_timer);
>   int cpu = smp_processor_id();
>  
>   watchdog_timer_interrupt(cpu);
> @@ -292,7 +291,7 @@ static void start_watchdog_timer_on(unsigned int cpu)
>  
>   per_cpu(wd_timer_tb, cpu) = get_tb();
>  
> - setup_pinned_timer(t, wd_timer_fn, 0);
> + timer_setup(t, wd_timer_fn, TIMER_PINNED);
>   wd_timer_reset(cpu, t);
>  }
>  



Re: BUG_ON() in irq_work_run_list

2017-10-16 Thread Michael Ellerman
Chandan Rajendra  writes:

> Executing fstests' generic/036 test in a loop on next-20171013 kernel causes
> BUG_ON()'s condition to evaluate to true,

Did it use to work, ie. did the bug just start happening? If so, is there
a next tag which *doesn't* have the bug?

> run fstests generic/036 at 2017-10-14 09:23:29
> [ cut here ]
> kernel BUG at /root/repos/linux/kernel/irq_work.c:138!
> Oops: Exception in kernel mode, sig: 5 [#1]
> BE SMP NR_CPUS=2048 DEBUG_PAGEALLOC NUMA pSeries
> Modules linked in:
> CPU: 3 PID: 0 Comm: swapper/3 Tainted: GW
> 4.14.0-rc4-next-20171013 #7
> task: c0063862c780 task.stack: c006387e4000
> NIP:  c02476ac LR: c02477c8 CTR: 
> REGS: c0063ffd3810 TRAP: 0700   Tainted: GW 
> (4.14.0-rc4-next-20171013)
> MSR:  80029032   CR: 24002442  XER: 2000
 ^
 Hard enabled, but that's expected.

> CFAR: c02477c4 SOFTE: 1
^
Also soft enabled, which is bad.

> GPR00: c01b70c4 c0063ffd3a90 c143bb00 c0063fee1a60
> GPR04: 002b c00635ad1b0c c0063383c9e8 
> GPR08: 00063ecd 0001 0001 
> GPR12: 28002482 cfd41080 c006387e7f90 0200
> GPR16: f0b048c0  c13a0920 c13a0920
> GPR20: 0003  0001 0002
> GPR24: 0010 c0063e22c498 c0063ffd3df0 
> GPR28: 00063ecd   c1211a60
> NIP [c02476ac] .irq_work_run_list+0xc/0x100
> LR [c02477c8] .irq_work_run+0x28/0x50
> Call Trace:
> [c0063ffd3a90] [c0787638] 
> .__blk_mq_complete_request_remote+0x38/0x50 (unreliable)
> [c0063ffd3b10] [c01b70c4] 
> .flush_smp_call_function_queue+0xd4/0x1e0
> [c0063ffd3ba0] [c0044a4c] .smp_ipi_demux_relaxed+0x9c/0x110
> [c0063ffd3c30] [c008dbdc] .icp_hv_ipi_action+0x5c/0xb0
> [c0063ffd3cb0] [c0174384] .__handle_irq_event_percpu+0x94/0x2d0
> [c0063ffd3d80] [c01745f4] .handle_irq_event_percpu+0x34/0x90
> [c0063ffd3e10] [c017ae20] .handle_percpu_irq+0x80/0xd0
> [c0063ffd3e90] [c0172ad0] .generic_handle_irq+0x50/0x80
> [c0063ffd3f10] [c0016cd0] .__do_irq+0x90/0x210
> [c0063ffd3f90] [c002a900] .call_do_irq+0x14/0x24
> [c006387e77a0] [c0016ee0] .do_IRQ+0x90/0x140
> [c006387e7840] [c0008c20] hardware_interrupt_common+0x150/0x160
> --- interrupt: 501 at .plpar_hcall_norets+0x14/0x20
> LR = .check_and_cede_processor+0x2c/0x40
> [c006387e7b30] [c0b3f028] .check_and_cede_processor+0x18/0x40 
> (unreliable)
> [c006387e7ba0] [c0b3f3c8] .shared_cede_loop+0x48/0x140
> [c006387e7c20] [c0b3c644] .cpuidle_enter_state+0xa4/0x410
> [c006387e7cd0] [c0159158] .call_cpuidle+0x68/0xd0
> [c006387e7d60] [c0159640] .do_idle+0x2b0/0x310
> [c006387e7e20] [c01598b0] .cpu_startup_entry+0x30/0x40
> [c006387e7ea0] [c0045e38] .start_secondary+0x4e8/0x530
> [c006387e7f90] [c000b06c] start_secondary_prolog+0x10/0x14
> Instruction dump:
> 3861 4e800020 6000 6000 6000 3860 4e800020 6000
> 6000 894d027a 312a 7d295110 <0b09> e923 2fa9 4d9e0020
> ---[ end trace 921006f210ad28ba ]---
>
> The corresponding code is,
>
> static void irq_work_run_list(struct llist_head *list)
> {
> unsigned long flags;
> struct irq_work *work;
> struct llist_node *llnode;
>
> BUG_ON(!irqs_disabled());

OK. Called from irq_work_run() called from the end of
flush_smp_call_function_queue().

That's also before we start running any of the irq_work functions, so it
can't have been one of them.

So seemingly something that was smp_call_function'ed() turned interrupts
back on.

The obvious candidate being __blk_mq_complete_request_remote(). It's
called via smp_call_function_single_async() in __blk_mq_complete_request().

The problem is __blk_mq_complete_request_remote() actually just calls
another function via a pointer, and we don't have a good way of working
out what that was. It's no longer in CTR (although it's a bit odd that
CTR is 0x0), it could be in a GPR but we can't know which one without a
full disassembly.

I dug a bit more but couldn't see anything obvious.

Does this reproduce? Easily?

cheers


Re: [PATCH v2 2/2] vgaarb: Factor out EFI and fallback default device selection

2017-10-16 Thread Daniel Axtens
Bjorn Helgaas  writes:

> The default VGA device is normally set in vga_arbiter_add_pci_device() when
> we call it for the first enabled device that can be accessed with the
> legacy VGA resources ([mem 0xa-0xb], etc.)
>
> That default device can be overridden by an EFI device that owns the boot
> framebuffer.  As a fallback, we can also select a VGA device that can't be
> accessed via legacy VGA resources, or a VGA device that isn't even enabled.
>
> Factor out this EFI and fallback selection from vga_arb_device_init() into
> a separate vga_arb_select_default_device() function.  This doesn't change
> any behavior, but it untangles the "bridge control possible" checking and
> messages from the default device selection.
>
> Tested-by: Zhou Wang   # D05 Hisi Hip07, Hip08
> Signed-off-by: Bjorn Helgaas 
> ---
>  drivers/gpu/vga/vgaarb.c |   57 
> --
>  1 file changed, 35 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/vga/vgaarb.c b/drivers/gpu/vga/vgaarb.c
> index 8035e38d5110..d35d6d271f3f 100644
> --- a/drivers/gpu/vga/vgaarb.c
> +++ b/drivers/gpu/vga/vgaarb.c
> @@ -1402,29 +1402,14 @@ static struct miscdevice vga_arb_device = {
>   MISC_DYNAMIC_MINOR, "vga_arbiter", &vga_arb_device_fops
>  };
>  
> -static int __init vga_arb_device_init(void)
> +static void __init vga_arb_select_default_device(void)
>  {
> - int rc;
>   struct pci_dev *pdev;
>   struct vga_device *vgadev;
>  
> - rc = misc_register(&vga_arb_device);
> - if (rc < 0)
> - pr_err("error %d registering device\n", rc);
> -
> - bus_register_notifier(&pci_bus_type, &pci_notifier);
> -
> - /* We add all pci devices satisfying vga class in the arbiter by
> -  * default */
> - pdev = NULL;
> - while ((pdev =
> - pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> -PCI_ANY_ID, pdev)) != NULL)
> - vga_arbiter_add_pci_device(pdev);
> -
> +#if defined(CONFIG_X86) || defined(CONFIG_IA64)
>   list_for_each_entry(vgadev, &vga_list, list) {
>   struct device *dev = &vgadev->pdev->dev;
> -#if defined(CONFIG_X86) || defined(CONFIG_IA64)
>   /*
>* Override vga_arbiter_add_pci_device()'s I/O based detection
>* as it may take the wrong device (e.g. on Apple system under
> @@ -1461,12 +1446,8 @@ static int __init vga_arb_device_init(void)
>   vgaarb_info(dev, "overriding boot device\n");
>   vga_set_default_device(vgadev->pdev);
>   }
> -#endif
> - if (vgadev->bridge_has_one_vga)
> - vgaarb_info(dev, "bridge control possible\n");
> - else
> - vgaarb_info(dev, "no bridge control possible\n");
>   }
> +#endif
>  
>   if (!vga_default_device()) {
>   list_for_each_entry(vgadev, &vga_list, list) {
> @@ -1492,6 +1473,38 @@ static int __init vga_arb_device_init(void)
>   vga_set_default_device(vgadev->pdev);
>   }
>   }
> +}
> +
> +static int __init vga_arb_device_init(void)
> +{
> + int rc;
> + struct pci_dev *pdev;
> + struct vga_device *vgadev;
> +
> + rc = misc_register(&vga_arb_device);
> + if (rc < 0)
> + pr_err("error %d registering device\n", rc);
> +
> + bus_register_notifier(&pci_bus_type, &pci_notifier);
> +
> + /* We add all PCI devices satisfying VGA class in the arbiter by
> +  * default */
> + pdev = NULL;
> + while ((pdev =
> + pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> +PCI_ANY_ID, pdev)) != NULL)
> + vga_arbiter_add_pci_device(pdev);
> +
> + list_for_each_entry(vgadev, &vga_list, list) {
> + struct device *dev = &vgadev->pdev->dev;
> +
> + if (vgadev->bridge_has_one_vga)
> + vgaarb_info(dev, "bridge control possible\n");
> + else
> + vgaarb_info(dev, "no bridge control possible\n");
> + }

Initially I wondered if this info printk could be moved into
vga_arbiter_check_bridge_sharing(), but it's been separated out since
3448a19da479b ("vgaarb: use bridges to control VGA routing where
possible."), and upon closer examination, it seems you can't be sure a
device doesn't share a bridge until the end of the process, so this is
indeed correct.

Everything else also looks good to me.

Reviewed-by: Daniel Axtens 

Regards,
Daniel

> +
> + vga_arb_select_default_device();
>  
>   pr_info("loaded\n");
>   return rc;


[PATCH 55/58] net: fs_enet: Remove unused timer

2017-10-16 Thread Kees Cook
Removes unused timer and its old initialization call.

Cc: Pantelis Antoniou 
Cc: Vitaly Bordug 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: net...@vger.kernel.org
Signed-off-by: Kees Cook 
---
 drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c | 2 --
 drivers/net/ethernet/freescale/fs_enet/fs_enet.h  | 1 -
 2 files changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c 
b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
index 753259091b22..7892f2f0c6b5 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
@@ -1023,8 +1023,6 @@ static int fs_enet_probe(struct platform_device *ofdev)
 
ndev->ethtool_ops = &fs_ethtool_ops;
 
-   init_timer(&fep->phy_timer_list);
-
netif_carrier_off(ndev);
 
ndev->features |= NETIF_F_SG;
diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h 
b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
index 5ce516c8a62a..dd306deb7cf1 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
@@ -137,7 +137,6 @@ struct fs_enet_private {
cbd_t __iomem *cur_rx;
cbd_t __iomem *cur_tx;
int tx_free;
-   struct timer_list phy_timer_list;
const struct phy_info *phy;
u32 msg_enable;
struct mii_if_info mii_if;
-- 
2.7.4



[PATCH] powerpc/watchdog: Convert timers to use timer_setup()

2017-10-16 Thread Kees Cook
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.
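
To illustrate the general pattern (a minimal sketch with a hypothetical
driver struct, not part of this patch; the powerpc watchdog timer is
per-cpu, so it only needs the callback signature change shown in the diff
below):

  /* Hypothetical example of the timer_setup()/from_timer() conversion. */
  struct my_dev {
          struct timer_list poll_timer;
          /* ... other driver state ... */
  };

  /* New-style callback: receives the timer_list pointer directly. */
  static void my_poll_fn(struct timer_list *t)
  {
          struct my_dev *dev = from_timer(dev, t, poll_timer);

          /* ... use dev ... */
  }

  /* At init, instead of setup_timer(&dev->poll_timer, fn, (unsigned long)dev): */
  timer_setup(&dev->poll_timer, my_poll_fn, 0);
  mod_timer(&dev->poll_timer, jiffies + HZ);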

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Nicholas Piggin 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook 
---
 arch/powerpc/kernel/watchdog.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 15e209a37c2d..50797528b5e1 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -263,9 +263,8 @@ static void wd_timer_reset(unsigned int cpu, struct 
timer_list *t)
add_timer_on(t, cpu);
 }
 
-static void wd_timer_fn(unsigned long data)
+static void wd_timer_fn(struct timer_list *t)
 {
-   struct timer_list *t = this_cpu_ptr(&wd_timer);
int cpu = smp_processor_id();
 
watchdog_timer_interrupt(cpu);
@@ -292,7 +291,7 @@ static void start_watchdog_timer_on(unsigned int cpu)
 
per_cpu(wd_timer_tb, cpu) = get_tb();
 
-   setup_pinned_timer(t, wd_timer_fn, 0);
+   timer_setup(t, wd_timer_fn, TIMER_PINNED);
wd_timer_reset(cpu, t);
 }
 
-- 
2.7.4


-- 
Kees Cook
Pixel Security


[PATCH] crypto: vmx - Use skcipher for ctr fallback

2017-10-16 Thread Paulo Flabiano Smorigo
Signed-off-by: Paulo Flabiano Smorigo 
---
 drivers/crypto/vmx/aes_ctr.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/crypto/vmx/aes_ctr.c b/drivers/crypto/vmx/aes_ctr.c
index 17d8421..fc60d00 100644
--- a/drivers/crypto/vmx/aes_ctr.c
+++ b/drivers/crypto/vmx/aes_ctr.c
@@ -27,21 +27,23 @@
 #include 
 #include 
 #include 
+#include 
+
 #include "aesp8-ppc.h"
 
 struct p8_aes_ctr_ctx {
-   struct crypto_blkcipher *fallback;
+   struct crypto_skcipher *fallback;
struct aes_key enc_key;
 };
 
 static int p8_aes_ctr_init(struct crypto_tfm *tfm)
 {
const char *alg = crypto_tfm_alg_name(tfm);
-   struct crypto_blkcipher *fallback;
+   struct crypto_skcipher *fallback;
struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
 
-   fallback =
-   crypto_alloc_blkcipher(alg, 0, CRYPTO_ALG_NEED_FALLBACK);
+   fallback = crypto_alloc_skcipher(alg, 0,
+   CRYPTO_ALG_ASYNC | CRYPTO_ALG_NEED_FALLBACK);
if (IS_ERR(fallback)) {
printk(KERN_ERR
   "Failed to allocate transformation for '%s': %ld\n",
@@ -49,11 +51,11 @@ static int p8_aes_ctr_init(struct crypto_tfm *tfm)
return PTR_ERR(fallback);
}
printk(KERN_INFO "Using '%s' as fallback implementation.\n",
-  crypto_tfm_alg_driver_name((struct crypto_tfm *) fallback));
+   crypto_skcipher_driver_name(fallback));
 
-   crypto_blkcipher_set_flags(
+   crypto_skcipher_set_flags(
fallback,
-   crypto_blkcipher_get_flags((struct crypto_blkcipher *)tfm));
+   crypto_skcipher_get_flags((struct crypto_skcipher *)tfm));
ctx->fallback = fallback;
 
return 0;
@@ -64,7 +66,7 @@ static void p8_aes_ctr_exit(struct crypto_tfm *tfm)
struct p8_aes_ctr_ctx *ctx = crypto_tfm_ctx(tfm);
 
if (ctx->fallback) {
-   crypto_free_blkcipher(ctx->fallback);
+   crypto_free_skcipher(ctx->fallback);
ctx->fallback = NULL;
}
 }
@@ -83,7 +85,7 @@ static int p8_aes_ctr_setkey(struct crypto_tfm *tfm, const u8 
*key,
pagefault_enable();
preempt_enable();
 
-   ret += crypto_blkcipher_setkey(ctx->fallback, key, keylen);
+   ret += crypto_skcipher_setkey(ctx->fallback, key, keylen);
return ret;
 }
 
@@ -117,15 +119,14 @@ static int p8_aes_ctr_crypt(struct blkcipher_desc *desc,
struct blkcipher_walk walk;
struct p8_aes_ctr_ctx *ctx =
crypto_tfm_ctx(crypto_blkcipher_tfm(desc->tfm));
-   struct blkcipher_desc fallback_desc = {
-   .tfm = ctx->fallback,
-   .info = desc->info,
-   .flags = desc->flags
-   };
 
if (in_interrupt()) {
-   ret = crypto_blkcipher_encrypt(&fallback_desc, dst, src,
-  nbytes);
+   SKCIPHER_REQUEST_ON_STACK(req, ctx->fallback);
+   skcipher_request_set_tfm(req, ctx->fallback);
+   skcipher_request_set_callback(req, desc->flags, NULL, NULL);
+   skcipher_request_set_crypt(req, src, dst, nbytes, desc->info);
+   ret = crypto_skcipher_encrypt(req);
+   skcipher_request_zero(req);
} else {
blkcipher_walk_init(&walk, dst, src, nbytes);
ret = blkcipher_walk_virt_block(desc, &walk, AES_BLOCK_SIZE);
-- 
2.9.4



Re: [PATCH 0/4] char-TPM: Adjustments for ten function implementations

2017-10-16 Thread Joe Perches
On Mon, 2017-10-16 at 21:35 +0300, Jarkko Sakkinen wrote:
> A minor complaint: all commits are missing "Fixes:" tag.

None of these patches fix anything.
All are trivial changes without much of any impact.



Re: [PATCH 0/4] char-TPM: Adjustments for ten function implementations

2017-10-16 Thread SF Markus Elfring
> A minor complaint: all commits are missing "Fixes:" tag.

* Do you require it to be added to the commit messages?

* Would you like to get a finer patch granularity then?

* Do you find any more information missing?

Regards,
Markus


Re: [PATCH 0/4] char-TPM: Adjustments for ten function implementations

2017-10-16 Thread Jarkko Sakkinen
On Mon, Oct 16, 2017 at 09:31:39PM +0300, Jarkko Sakkinen wrote:
> On Mon, Oct 16, 2017 at 07:30:13PM +0200, SF Markus Elfring wrote:
> > From: Markus Elfring 
> > Date: Mon, 16 Oct 2017 19:12:34 +0200
> > 
> > A few update suggestions were taken into account
> > from static source code analysis.
> > 
> > Markus Elfring (4):
> >   Delete an error message for a failed memory allocation
> > in tpm_ascii_bios_measurements_show()
> >   Delete an error message for a failed memory allocation in 
> > tpm_ibmvtpm_probe()
> >   Improve a size determination in nine functions
> >   Less checks in tpm_ibmvtpm_probe() after error detection
> > 
> >  drivers/char/tpm/st33zp24/i2c.c  |  3 +--
> >  drivers/char/tpm/st33zp24/spi.c  |  3 +--
> >  drivers/char/tpm/st33zp24/st33zp24.c |  3 +--
> >  drivers/char/tpm/tpm1_eventlog.c |  5 +
> >  drivers/char/tpm/tpm_crb.c   |  2 +-
> >  drivers/char/tpm/tpm_i2c_atmel.c |  2 +-
> >  drivers/char/tpm/tpm_i2c_nuvoton.c   |  2 +-
> >  drivers/char/tpm/tpm_ibmvtpm.c   | 23 +--
> >  drivers/char/tpm/tpm_tis.c   |  2 +-
> >  drivers/char/tpm/tpm_tis_spi.c   |  3 +--
> >  10 files changed, 18 insertions(+), 30 deletions(-)
> > 
> > -- 
> > 2.14.2
> > 
> 
> For some sparse errors I fixed a while ago, I got review feedback that
> one should explain what is wrong and what the fix does, not just say that
> a tool reported it. And it really does make sense to me.
> 
> Describing the tool that was used to find the issues fits to the cover
> letter but not to the commits themselves.
> 
> I think I recently accepted a small fix with a "tool generated commit
> message" but I don't want to take it as a practice. It was a minor
> mistake from my side to accept such a patch.

A minor complaint: all commits are missing "Fixes:" tag.

/Jarkko


Re: [PATCH 0/4] char-TPM: Adjustments for ten function implementations

2017-10-16 Thread Jarkko Sakkinen
On Mon, Oct 16, 2017 at 07:30:13PM +0200, SF Markus Elfring wrote:
> From: Markus Elfring 
> Date: Mon, 16 Oct 2017 19:12:34 +0200
> 
> A few update suggestions were taken into account
> from static source code analysis.
> 
> Markus Elfring (4):
>   Delete an error message for a failed memory allocation
> in tpm_ascii_bios_measurements_show()
>   Delete an error message for a failed memory allocation in 
> tpm_ibmvtpm_probe()
>   Improve a size determination in nine functions
>   Less checks in tpm_ibmvtpm_probe() after error detection
> 
>  drivers/char/tpm/st33zp24/i2c.c  |  3 +--
>  drivers/char/tpm/st33zp24/spi.c  |  3 +--
>  drivers/char/tpm/st33zp24/st33zp24.c |  3 +--
>  drivers/char/tpm/tpm1_eventlog.c |  5 +
>  drivers/char/tpm/tpm_crb.c   |  2 +-
>  drivers/char/tpm/tpm_i2c_atmel.c |  2 +-
>  drivers/char/tpm/tpm_i2c_nuvoton.c   |  2 +-
>  drivers/char/tpm/tpm_ibmvtpm.c   | 23 +--
>  drivers/char/tpm/tpm_tis.c   |  2 +-
>  drivers/char/tpm/tpm_tis_spi.c   |  3 +--
>  10 files changed, 18 insertions(+), 30 deletions(-)
> 
> -- 
> 2.14.2
> 

For some sparse errors I fixed a while ago, I got review feedback that
one should explain what is wrong and what the fix does, not just say that
a tool reported it. And it really does make sense to me.

Describing the tool that was used to find the issues fits to the cover
letter but not to the commits themselves.

I think I recently accepted a small fix with a "tool generated commit
message" but I don't want to take it as a practice. It was a minor
mistake from my side to accept such a patch.

/Jarkko


[PATCH 4/4] char/tpm: Less checks in tpm_ibmvtpm_probe() after error detection

2017-10-16 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 16 Oct 2017 19:00:34 +0200

Two pointer checks could be repeated by the tpm_ibmvtpm_probe() function
during error handling, even though the relevant properties of the involved
variables can be determined beforehand by source code analysis.

* Return directly after a call of the function "kzalloc" failed
  at the beginning.

* Adjust jump targets so that extra checks can be omitted at the end.

Signed-off-by: Markus Elfring 
---
 drivers/char/tpm/tpm_ibmvtpm.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
index a4b462a77b99..b8dda7546f64 100644
--- a/drivers/char/tpm/tpm_ibmvtpm.c
+++ b/drivers/char/tpm/tpm_ibmvtpm.c
@@ -610,7 +610,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
 
ibmvtpm = kzalloc(sizeof(*ibmvtpm), GFP_KERNEL);
if (!ibmvtpm)
-   goto cleanup;
+   return -ENOMEM;
 
ibmvtpm->dev = dev;
ibmvtpm->vdev = vio_dev;
@@ -619,7 +619,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
crq_q->crq_addr = (struct ibmvtpm_crq *)get_zeroed_page(GFP_KERNEL);
if (!crq_q->crq_addr) {
dev_err(dev, "Unable to allocate memory for crq_addr\n");
-   goto cleanup;
+   goto free_tpm;
}
 
crq_q->num_entry = CRQ_RES_BUF_SIZE / sizeof(*crq_q->crq_addr);
@@ -629,7 +629,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
 
if (dma_mapping_error(dev, ibmvtpm->crq_dma_handle)) {
dev_err(dev, "dma mapping failed\n");
-   goto cleanup;
+   goto free_page;
}
 
rc = plpar_hcall_norets(H_REG_CRQ, vio_dev->unit_address,
@@ -683,13 +683,10 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
 reg_crq_cleanup:
dma_unmap_single(dev, ibmvtpm->crq_dma_handle, CRQ_RES_BUF_SIZE,
 DMA_BIDIRECTIONAL);
-cleanup:
-   if (ibmvtpm) {
-   if (crq_q->crq_addr)
-   free_page((unsigned long)crq_q->crq_addr);
-   kfree(ibmvtpm);
-   }
-
+free_page:
+   free_page((unsigned long)crq_q->crq_addr);
+free_tpm:
+   kfree(ibmvtpm);
return rc;
 }
 
-- 
2.14.2



[PATCH 3/4] char/tpm: Improve a size determination in nine functions

2017-10-16 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 16 Oct 2017 18:28:17 +0200

Replace explicit struct type names with pointer dereferences as the
parameter of the "sizeof" operator, to make the corresponding size
determination a bit safer according to the Linux coding style convention.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/char/tpm/st33zp24/i2c.c  | 3 +--
 drivers/char/tpm/st33zp24/spi.c  | 3 +--
 drivers/char/tpm/st33zp24/st33zp24.c | 3 +--
 drivers/char/tpm/tpm_crb.c   | 2 +-
 drivers/char/tpm/tpm_i2c_atmel.c | 2 +-
 drivers/char/tpm/tpm_i2c_nuvoton.c   | 2 +-
 drivers/char/tpm/tpm_ibmvtpm.c   | 2 +-
 drivers/char/tpm/tpm_tis.c   | 2 +-
 drivers/char/tpm/tpm_tis_spi.c   | 3 +--
 9 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/char/tpm/st33zp24/i2c.c b/drivers/char/tpm/st33zp24/i2c.c
index be5d1abd3e8e..d0cb25688485 100644
--- a/drivers/char/tpm/st33zp24/i2c.c
+++ b/drivers/char/tpm/st33zp24/i2c.c
@@ -245,8 +245,7 @@ static int st33zp24_i2c_probe(struct i2c_client *client,
return -ENODEV;
}
 
-   phy = devm_kzalloc(&client->dev, sizeof(struct st33zp24_i2c_phy),
-  GFP_KERNEL);
+   phy = devm_kzalloc(&client->dev, sizeof(*phy), GFP_KERNEL);
if (!phy)
return -ENOMEM;
 
diff --git a/drivers/char/tpm/st33zp24/spi.c b/drivers/char/tpm/st33zp24/spi.c
index 0fc4f20b5f83..c952df9244c8 100644
--- a/drivers/char/tpm/st33zp24/spi.c
+++ b/drivers/char/tpm/st33zp24/spi.c
@@ -358,8 +358,7 @@ static int st33zp24_spi_probe(struct spi_device *dev)
return -ENODEV;
}
 
-   phy = devm_kzalloc(&dev->dev, sizeof(struct st33zp24_spi_phy),
-  GFP_KERNEL);
+   phy = devm_kzalloc(&dev->dev, sizeof(*phy), GFP_KERNEL);
if (!phy)
return -ENOMEM;
 
diff --git a/drivers/char/tpm/st33zp24/st33zp24.c 
b/drivers/char/tpm/st33zp24/st33zp24.c
index 4d1dc8b46877..0686a129268c 100644
--- a/drivers/char/tpm/st33zp24/st33zp24.c
+++ b/drivers/char/tpm/st33zp24/st33zp24.c
@@ -533,8 +533,7 @@ int st33zp24_probe(void *phy_id, const struct 
st33zp24_phy_ops *ops,
if (IS_ERR(chip))
return PTR_ERR(chip);
 
-   tpm_dev = devm_kzalloc(dev, sizeof(struct st33zp24_dev),
-  GFP_KERNEL);
+   tpm_dev = devm_kzalloc(dev, sizeof(*tpm_dev), GFP_KERNEL);
if (!tpm_dev)
return -ENOMEM;
 
diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c
index 7b3c2a8aa9de..343c46e8560f 100644
--- a/drivers/char/tpm/tpm_crb.c
+++ b/drivers/char/tpm/tpm_crb.c
@@ -557,7 +557,7 @@ static int crb_acpi_add(struct acpi_device *device)
if (sm == ACPI_TPM2_MEMORY_MAPPED)
return -ENODEV;
 
-   priv = devm_kzalloc(dev, sizeof(struct crb_priv), GFP_KERNEL);
+   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
 
diff --git a/drivers/char/tpm/tpm_i2c_atmel.c b/drivers/char/tpm/tpm_i2c_atmel.c
index 95ce2e9ccdc6..2d0df930a76d 100644
--- a/drivers/char/tpm/tpm_i2c_atmel.c
+++ b/drivers/char/tpm/tpm_i2c_atmel.c
@@ -165,7 +165,7 @@ static int i2c_atmel_probe(struct i2c_client *client,
if (IS_ERR(chip))
return PTR_ERR(chip);
 
-   priv = devm_kzalloc(dev, sizeof(struct priv_data), GFP_KERNEL);
+   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
 
diff --git a/drivers/char/tpm/tpm_i2c_nuvoton.c 
b/drivers/char/tpm/tpm_i2c_nuvoton.c
index c6428771841f..5983d52eb6d9 100644
--- a/drivers/char/tpm/tpm_i2c_nuvoton.c
+++ b/drivers/char/tpm/tpm_i2c_nuvoton.c
@@ -551,7 +551,7 @@ static int i2c_nuvoton_probe(struct i2c_client *client,
if (IS_ERR(chip))
return PTR_ERR(chip);
 
-   priv = devm_kzalloc(dev, sizeof(struct priv_data), GFP_KERNEL);
+   priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
 
diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
index b18148ef2612..a4b462a77b99 100644
--- a/drivers/char/tpm/tpm_ibmvtpm.c
+++ b/drivers/char/tpm/tpm_ibmvtpm.c
@@ -608,7 +608,7 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
if (IS_ERR(chip))
return PTR_ERR(chip);
 
-   ibmvtpm = kzalloc(sizeof(struct ibmvtpm_dev), GFP_KERNEL);
+   ibmvtpm = kzalloc(sizeof(*ibmvtpm), GFP_KERNEL);
if (!ibmvtpm)
goto cleanup;
 
diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c
index ebd0e75a3e4d..0a3af60bab2a 100644
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -294,7 +294,7 @@ static int tpm_tis_init(struct device *dev, struct tpm_info 
*tpm_info)
if (rc)
return rc;
 
-   phy = devm_kzalloc(dev, sizeof(struct tpm_tis_tcg_phy), GFP_KERNEL);
+ 

[PATCH 2/4] char/tpm: Delete an error message for a failed memory allocation in tpm_ibmvtpm_probe()

2017-10-16 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 16 Oct 2017 18:08:23 +0200

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/char/tpm/tpm_ibmvtpm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/char/tpm/tpm_ibmvtpm.c b/drivers/char/tpm/tpm_ibmvtpm.c
index 25f6e2665385..b18148ef2612 100644
--- a/drivers/char/tpm/tpm_ibmvtpm.c
+++ b/drivers/char/tpm/tpm_ibmvtpm.c
@@ -609,10 +609,8 @@ static int tpm_ibmvtpm_probe(struct vio_dev *vio_dev,
return PTR_ERR(chip);
 
ibmvtpm = kzalloc(sizeof(struct ibmvtpm_dev), GFP_KERNEL);
-   if (!ibmvtpm) {
-   dev_err(dev, "kzalloc for ibmvtpm failed\n");
+   if (!ibmvtpm)
goto cleanup;
-   }
 
ibmvtpm->dev = dev;
ibmvtpm->vdev = vio_dev;
-- 
2.14.2



[PATCH 1/4] char/tpm: Delete an error message for a failed memory allocation in tpm_ascii_bios_measurements_show()

2017-10-16 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 16 Oct 2017 17:43:55 +0200

Omit an extra message for a memory allocation failure in this function.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 drivers/char/tpm/tpm1_eventlog.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/char/tpm/tpm1_eventlog.c b/drivers/char/tpm/tpm1_eventlog.c
index 9a8605e500b5..9749469cb823 100644
--- a/drivers/char/tpm/tpm1_eventlog.c
+++ b/drivers/char/tpm/tpm1_eventlog.c
@@ -280,11 +280,8 @@ static int tpm_ascii_bios_measurements_show(struct 
seq_file *m, void *v)
(unsigned char *)(v + sizeof(struct tcpa_event));
 
eventname = kmalloc(MAX_TEXT_EVENT, GFP_KERNEL);
-   if (!eventname) {
-   printk(KERN_ERR "%s: ERROR - No Memory for event name\n ",
-  __func__);
+   if (!eventname)
return -EFAULT;
-   }
 
/* 1st: PCR */
seq_printf(m, "%2d ", do_endian_conversion(event->pcr_index));
-- 
2.14.2



[PATCH 0/4] char-TPM: Adjustments for ten function implementations

2017-10-16 Thread SF Markus Elfring
From: Markus Elfring 
Date: Mon, 16 Oct 2017 19:12:34 +0200

A few update suggestions were taken into account
from static source code analysis.

Markus Elfring (4):
  Delete an error message for a failed memory allocation
in tpm_ascii_bios_measurements_show()
  Delete an error message for a failed memory allocation in tpm_ibmvtpm_probe()
  Improve a size determination in nine functions
  Less checks in tpm_ibmvtpm_probe() after error detection

 drivers/char/tpm/st33zp24/i2c.c  |  3 +--
 drivers/char/tpm/st33zp24/spi.c  |  3 +--
 drivers/char/tpm/st33zp24/st33zp24.c |  3 +--
 drivers/char/tpm/tpm1_eventlog.c |  5 +
 drivers/char/tpm/tpm_crb.c   |  2 +-
 drivers/char/tpm/tpm_i2c_atmel.c |  2 +-
 drivers/char/tpm/tpm_i2c_nuvoton.c   |  2 +-
 drivers/char/tpm/tpm_ibmvtpm.c   | 23 +--
 drivers/char/tpm/tpm_tis.c   |  2 +-
 drivers/char/tpm/tpm_tis_spi.c   |  3 +--
 10 files changed, 18 insertions(+), 30 deletions(-)

-- 
2.14.2



Re: [PATCH 2/2] powerpc/perf/imc: use NUMA_NO_NODE for alloc_pages_node

2017-10-16 Thread Madhavan Srinivasan



On Monday 16 October 2017 07:48 AM, Balbir Singh wrote:

On Mon, 16 Oct 2017 00:13:42 +0530
Madhavan Srinivasan  wrote:


alloc_pages_node(), when passed NUMA_NO_NODE for the
node_id, could get memory from the closest node. Clean up the
core imc and thread imc memory init functions to use
NUMA_NO_NODE.


The changelog is not clear, alloc_pages_node() takes a
preferred node id and creates a node zonelist from it.
How is NUMA_NO_NODE better?


IIUC, with NUMA_NO_NODE we could remove the __GFP_NOWARN
for alloc_pages_node(). That said, one must be careful to make
sure we don't end up allocating memory from the boot cpu.
In the case of the In-Memory Collection (IMC) counters, this is handled
in the cpu online path.
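
A minimal sketch of what such an allocation might look like with
NUMA_NO_NODE (illustrative only, not the actual IMC code; the order and
flags are placeholders):

  /*
   * NUMA_NO_NODE lets the allocator pick the node closest to the CPU doing
   * the allocation (the hotplugged CPU when this runs from the cpu online
   * path), so __GFP_NOWARN is no longer needed to hide wrong-node requests.
   */
  struct page *page = alloc_pages_node(NUMA_NO_NODE,
                                       GFP_KERNEL | __GFP_ZERO, 0);
  if (!page)
          return -ENOMEM;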

Will send a v2 with __GFP_NOWARN removed, which
I missed in this one :)

Maddy


Balbir Singh.





[Part1 PATCH v6 09/17] resource: Provide resource struct in resource walk callback

2017-10-16 Thread Brijesh Singh
From: Tom Lendacky 

In prep for a new function that will need additional resource information
during the resource walk, update the resource walk callback to pass the
resource structure.  Since the current callback start and end arguments
are pulled from the resource structure, the callback functions can obtain
them from the resource structure directly.

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: "H. Peter Anvin" 
Cc: Borislav Petkov 
Cc: linux-ker...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Benjamin Herrenschmidt 
Reviewed-by: Kees Cook 
Reviewed-by: Borislav Petkov 
Signed-off-by: Tom Lendacky 
Signed-off-by: Brijesh Singh 
---
 arch/powerpc/kernel/machine_kexec_file_64.c | 12 +---
 arch/x86/kernel/crash.c | 18 +-
 arch/x86/kernel/pmem.c  |  2 +-
 include/linux/ioport.h  |  4 ++--
 include/linux/kexec.h   |  2 +-
 kernel/kexec_file.c |  5 +++--
 kernel/resource.c   |  9 +
 7 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c 
b/arch/powerpc/kernel/machine_kexec_file_64.c
index 992c0d258e5d..e4395f937d63 100644
--- a/arch/powerpc/kernel/machine_kexec_file_64.c
+++ b/arch/powerpc/kernel/machine_kexec_file_64.c
@@ -91,11 +91,13 @@ int arch_kimage_file_post_load_cleanup(struct kimage *image)
  * and that value will be returned. If all free regions are visited without
  * func returning non-zero, then zero will be returned.
  */
-int arch_kexec_walk_mem(struct kexec_buf *kbuf, int (*func)(u64, u64, void *))
+int arch_kexec_walk_mem(struct kexec_buf *kbuf,
+   int (*func)(struct resource *, void *))
 {
int ret = 0;
u64 i;
phys_addr_t mstart, mend;
+   struct resource res = { };
 
if (kbuf->top_down) {
for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
@@ -105,7 +107,9 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, int 
(*func)(u64, u64, void *))
 * range while in kexec, end points to the last byte
 * in the range.
 */
-   ret = func(mstart, mend - 1, kbuf);
+   res.start = mstart;
+   res.end = mend - 1;
+   ret = func(&res, kbuf);
if (ret)
break;
}
@@ -117,7 +121,9 @@ int arch_kexec_walk_mem(struct kexec_buf *kbuf, int 
(*func)(u64, u64, void *))
 * range while in kexec, end points to the last byte
 * in the range.
 */
-   ret = func(mstart, mend - 1, kbuf);
+   res.start = mstart;
+   res.end = mend - 1;
+   ret = func(&res, kbuf);
if (ret)
break;
}
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 44404e2307bb..815008c9ca18 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -209,7 +209,7 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_KEXEC_FILE
-static int get_nr_ram_ranges_callback(u64 start, u64 end, void *arg)
+static int get_nr_ram_ranges_callback(struct resource *res, void *arg)
 {
unsigned int *nr_ranges = arg;
 
@@ -342,7 +342,7 @@ static int elf_header_exclude_ranges(struct crash_elf_data 
*ced,
return ret;
 }
 
-static int prepare_elf64_ram_headers_callback(u64 start, u64 end, void *arg)
+static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg)
 {
struct crash_elf_data *ced = arg;
Elf64_Ehdr *ehdr;
@@ -355,7 +355,7 @@ static int prepare_elf64_ram_headers_callback(u64 start, 
u64 end, void *arg)
ehdr = ced->ehdr;
 
/* Exclude unwanted mem ranges */
-   ret = elf_header_exclude_ranges(ced, start, end);
+   ret = elf_header_exclude_ranges(ced, res->start, res->end);
if (ret)
return ret;
 
@@ -518,14 +518,14 @@ static int add_e820_entry(struct boot_params *params, 
struct e820_entry *entry)
return 0;
 }
 
-static int memmap_entry_callback(u64 start, u64 end, void *arg)
+static int memmap_entry_callback(struct resource *res, void *arg)
 {
struct crash_memmap_data *cmd = arg;
struct boot_params *params = cmd->params;
struct e820_entry ei;
 
-   ei.addr = start;
-   ei.size = end - start + 1;
+   ei.addr = res->start;
+   ei.size = res->end - res->start + 1;
ei.type = cmd->type;
add_e820_entry(params, &ei);
 
@@ -619,12 +619,12 @@ int crash_setup_memmap_entries(struct kimage *image, 
struct boot_params *params)
return ret;
 }
 
-static int determine_backup_region(u64 start, u64 end, void *arg)
+static int determine_backup_regi

Re: [PATCH 0/8] Support for ibm,dynamic-memory-v2 device tree property

2017-10-16 Thread Nathan Fontenot
On 10/06/2017 04:58 PM, Michael Ellerman wrote:
> Nathan Fontenot  writes:
>> This patch set provides a set of updates to de-couple the LMB information
>> provided in the ibm,dynamic-memory device tree property from the device
>> tree property format. The goal is to provide a data set of LMB information
>> so that consumers of this data do not need to understand and provide
>> multiple parsing routines for the supported device tree property formats.
> 
> Couple of build problems in this series.
> 
> The non-pseries build dies with:
> 
>   arch/powerpc/kernel/prom.c:541:4: error: implicit declaration of function 
> 'early_init_dt_scan_drmem_lmbs' [-Werror=implicit-function-declaration]
> 
> And pseries dies with:
> 
>   arch/powerpc/platforms/pseries/hotplug-memory.c:860:21: error: 
> 'lmbs_available' may be used uninitialized in this function 
> [-Werror=maybe-uninitialized]
> 
> cheers
> 

Ugh! I was hoping I had tried all permutations. Obviously I missed some.

I'll send out an updated patch set after fixing the build issues.

-Nathan



Re: [PATCH v2] cxl: Dump PSL_FIR register on PSL9 error irq

2017-10-16 Thread christophe lombard

Le 11/10/2017 à 08:14, Vaibhav Jain a écrit :


For PSL9 we currently aren't dumping the PSL FIR register when a
PSL error interrupt is triggered. The contents of this register are useful
in debugging AFU issues.

This patch fixes the issue by adding a new service_layer_ops callback,
cxl_native_err_irq_dump_regs_psl9(), to dump the PSL_FIR registers on a
PSL error interrupt, thereby bringing the behaviour in line with PSL on
POWER-8. Also, the existing service_layer_ops callback
for PSL8 has been renamed to cxl_native_err_irq_dump_regs_psl8().

Signed-off-by: Vaibhav Jain 
---
Changelog:
[v2] -> Created a separate function to dump the FIR register for PSL9 (Fred)
---


Thanks

Acked-by:  Christophe Lombard



Re: [PATCH 2/2] powerpc/hotplug: Ensure nodes initialized for hotplug

2017-10-16 Thread Michael Ellerman
Michael Bringmann  writes:

> powerpc/hotplug: On systems like PowerPC which allow 'hot-add' of CPU,
> it may occur that the new resources are to be inserted into nodes
> that were not used for memory resources at bootup.  Many different
> configurations of PowerPC resources may need to be supported depending
> upon the environment.

Give me some detail please?!

> This patch fixes some problems encountered at

What problems?

> runtime with configurations that support memory-less nodes, but which
> allow CPUs to be added at and after boot.

How does it fix those problems?

> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index b385cd0..e811dd1 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1325,6 +1325,17 @@ static long vphn_get_associativity(unsigned long cpu,
>   return rc;
>  }
>  
> +static int verify_node_preparation(int nid)
> +{

I would not expect a function called "verify" ...

> + if ((NODE_DATA(nid) == NULL) ||
> + (NODE_DATA(nid)->node_spanned_pages == 0)) {
> + if (try_online_node(nid))

.. to do something like online a node.

> + return first_online_node;
> + }
> +
> + return nid;
> +}
> +
>  /*
>   * Update the CPU maps and sysfs entries for a single CPU when its NUMA
>   * characteristics change. This function doesn't perform any locking and is
> @@ -1433,9 +1444,11 @@ int numa_update_cpu_topology(bool cpus_locked)
>   /* Use associativity from first thread for all siblings */
>   vphn_get_associativity(cpu, associativity);
>   new_nid = associativity_to_nid(associativity);
> - if (new_nid < 0 || !node_online(new_nid))
> + if (new_nid < 0 || !node_possible(new_nid))
>   new_nid = first_online_node;
>  
> + new_nid = verify_node_preparation(new_nid);

You're being called part-way through CPU hotplug here, are we sure it's
safe to go and do memory hotplug from there? What's the locking
situation?

cheers


Re: [PATCH] cxl: Rename register PSL9_FIR2 to PSL9_FIR_MASK

2017-10-16 Thread christophe lombard

Le 09/10/2017 à 19:56, Vaibhav Jain a écrit :


PSL9 doesn't have a FIR2 register as was the case with PSL8. However
currently the register definitions in 'cxl.h' have a definition for
PSL9_FIR2 that actually points to PSL9_FIR_MASK register in the P1
area at offset 0x308.

So this patch renames the def PSL9_FIR2 to PSL9_FIR_MASK and updates
the references in the code to point to the new identifier. It also
removes the code to dump contents of FIR2 (FIR_MASK actually) in
cxl_native_irq_dump_regs_psl9().

Fixes: f24be42aab37("cxl: Add psl9 specific code")
Reported-by: Frederic Barrat 
Signed-off-by: Vaibhav Jain 


Thanks

Acked-by: Christophe Lombard



Re: refactor dma_cache_sync V2

2017-10-16 Thread Christoph Hellwig
On Tue, Oct 03, 2017 at 12:49:51PM +0100, Robin Murphy wrote:
> Reviewed-by: Robin Murphy 

Thanks Robin.  I've heard very little from the arch maintainers,
but if people remain silent I will apply the whole series to the
dma-mapping tree in the next few days.


Re: [PATCH 1/2] powerpc/nodes: Ensure enough nodes avail for operations

2017-10-16 Thread Michael Ellerman
Michael Bringmann  writes:

> powerpc/nodes: On systems like PowerPC which allow 'hot-add' of CPU

This is a powerpc-only patch, so saying "systems like PowerPC" is
confusing. What you should be saying is "On pseries systems".

> or memory resources, it may occur that the new resources are to be
> inserted into nodes that were not used for these resources at bootup.
> In the kernel, any node that is used must be defined and initialized
> at boot.
>
> This patch extracts the value of the lowest domain level (number of
> allocable resources) from the "rtas" device tree property
> "ibm,current-associativity-domains" or the device tree property

What is "current associativity domains"? I've not heard of it; where is it
documented, and what does it mean?

Why would you use the "current" set vs the "max"? I thought the whole point
was to discover the maximum possible set of nodes that could be
hotplugged.

> "ibm,max-associativity-domains" to use as the maximum number of nodes
> to setup as possibly available in the system.  This new setting will
> override the instruction,
>
> nodes_and(node_possible_map, node_possible_map, node_online_map);
>
> presently seen in the function arch/powerpc/mm/numa.c:initmem_init().
>
> If the property is not present at boot, no operation will be performed
> to define or enable additional nodes.
>
> Signed-off-by: Michael Bringmann 
> ---
>  arch/powerpc/mm/numa.c |   47 +++
>  1 file changed, 47 insertions(+)
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index ec098b3..b385cd0 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -892,6 +892,51 @@ static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn)
>   NODE_DATA(nid)->node_spanned_pages = spanned_pages;
>  }
>  
> +static void __init node_associativity_setup(void)

This should really be called "find_possible_nodes()" or something more
descriptive.

> +{
> + struct device_node *rtas;
> +
> + rtas = of_find_node_by_path("/rtas");
> + if (rtas) {

If you just short-circuit that return, the whole function body can be
de-indented, making it significantly more readable.

ie:
+   rtas = of_find_node_by_path("/rtas");
+   if (!rtas)
+   return;

> + const __be32 *prop;
> + u32 len, entries, numnodes, i;
> +
> + prop = of_get_property(rtas,
> + "ibm,current-associativity-domains", 
> &len);

Please don't use of_get_property() in new code; we have much better
accessors these days, which do better error checking and handle the
endian conversions for you.

In this case you'd use eg:

u32 entries;
rc = of_property_read_u32(rtas, "ibm,current-associativity-domains", &entries);

> + if (!prop || len < sizeof(unsigned int)) {
> + prop = of_get_property(rtas,
> + "ibm,max-associativity-domains", &len);
> + goto endit;
> + }
> +
> + entries = of_read_number(prop++, 1);
> +
> + if (len < (entries * sizeof(unsigned int)))
> + goto endit;
> +
> + if ((0 <= min_common_depth) && (min_common_depth <= (entries-1)))
> + entries = min_common_depth;
> + else
> + entries -= 1;
^
You can't just guess that will be the right entry.

If min_common_depth is < 0 the function should have just returned
immediately at the top.

If min_common_depth is outside the range of the property, that's a buggy
device tree; print a warning and return.

> + numnodes = of_read_number(&prop[entries], 1);

u32 num_nodes;
rc = of_property_read_u32_index(rtas, "ibm,current-associativity-domains", min_common_depth, &num_nodes);
> +
> + printk(KERN_INFO "numa: Nodes = %d (mcd = %d)\n", numnodes,
> + min_common_depth);
> +
> + for (i = 0; i < numnodes; i++) {
> + if (!node_possible(i)) {
> + setup_node_data(i, 0, 0);

Do we actually need to set up the NODE_DATA() yet? Doing it now ensures
it will not be allocated node-local, which sucks.

> + node_set(i, node_possible_map);
> + }
> + }
> + }
> +
> +endit:

"out" would be the normal name.

> + if (rtas)
> + of_node_put(rtas);
> +}
> +
>  void __init initmem_init(void)
>  {
>   int nid, cpu;
> @@ -911,6 +956,8 @@ void __init initmem_init(void)
>*/

You need to update the comment above here which is contradicted by the
new function you're adding.

>   nodes_and(node_possible_map, node_possible_map, node_online_map);
>  
> + node_associativity_setup();
> +
>   for_each_online_node(nid) {
>   unsigned long start_pfn, end_
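
Pulling those suggestions together, a purely illustrative sketch of the
restructured helper (early returns, the of_property_read_u32*() accessors,
an "out" label; whether the "current" or "max" associativity-domains
property is the right one is the open question above):

static void __init find_possible_nodes(void)
{
	struct device_node *rtas;
	u32 numnodes, i;

	if (min_common_depth <= 0)
		return;

	rtas = of_find_node_by_path("/rtas");
	if (!rtas)
		return;

	if (of_property_read_u32_index(rtas, "ibm,max-associativity-domains",
				       min_common_depth, &numnodes))
		goto out;

	for (i = 0; i < numnodes; i++) {
		if (!node_possible(i))
			node_set(i, node_possible_map);
	}

out:
	of_node_put(rtas);
}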

Re: [PATCH v4 5/5] powerpc/mce: hookup memory_failure for UE errors

2017-10-16 Thread Michael Ellerman
Balbir Singh  writes:

> On Mon, Oct 16, 2017 at 4:38 PM, Michael Ellerman  wrote:
>> Balbir Singh  writes:
>>
>>> If we are in user space and hit a UE error, we now have the
>>> basic infrastructure to walk the page tables and find out
>>> the effective address that was accessed, since the DAR
>>> is not valid.
>>>
>>> We use a work_queue context to hook up the bad pfn; any
>>> other context causes problems, since memory_failure itself
>>> can call into schedule() via lru_drain_ bits.
>>>
>>> We could probably poison the struct page to avoid a race
>>> between detection and taking corrective action.
>>>
>>> Signed-off-by: Balbir Singh 
>>> Reviewed-by: Nicholas Piggin 
>>> ---
>>>  arch/powerpc/kernel/mce.c | 70 
>>> +--
>>>  1 file changed, 67 insertions(+), 3 deletions(-)
>>
>> I'm not sure why this is in mce.c but the rest was in mce_power.c ?
>
> The way the code is organized is that save_mce_event is implemented here
> and called from mce_power.c. save_mce_event() does the processing
> of the event.

save_mce_event() is only called from mce_power.c :/

I'd be happy if it was mce_real_mode.c and mce.c, that would be a good
distinction to make.

Though even then, most of the code is Book3s64 specific I think so maybe
it shouldn't be in just mce.c.

Anyway I'll merge this and we can improve things later.

cheers


Re: [PATCH kernel] RFC: prom_init: Fetch flatten device tree from the system firmware

2017-10-16 Thread Michael Ellerman
Alexey Kardashevskiy  writes:

> At the moment, on 256CPU + 256 PCI devices guest, it takes the guest
> about 8.5sec to read the entire device tree. Some explanation can be
> found here: https://patchwork.ozlabs.org/patch/826124/ but mostly it is
> because the kernel traverses the tree twice and it calls "getprop" for
> each property, which is really slow as SLOF searches from the beginning
> of the linked list every time.
>
> Since SLOF has just learned to build FDT and this takes less than 0.5sec
> for such a big guest, this makes use of the proposed client interface
> method - "fdt-fetch".

It's a pity doing it the normal way is so slow, but this seems like a
reasonable idea anyway.

> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index 02190e90c7ae..daa50a153737 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -2498,6 +2498,31 @@ static void __init flatten_device_tree(void)
>   prom_panic("Can't allocate initial device-tree chunk\n");
>   mem_end = mem_start + room;
  
I'd prefer you didn't munge it inside flatten_device_tree(), but rather
create a wrapper that does roughly:

void get_flat_devicetree(void)
{
if (!fetch_flat_devicetree())
flatten_device_tree();

printf(...)
}
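
Fleshed out slightly (sketch only; fetch_flat_devicetree() is the
hypothetical fdt-fetch path here, returning non-zero on success):

void get_flat_devicetree(void)
{
	/* Try the firmware's fdt-fetch path first and fall back to
	 * walking the tree the old way if it is not available. */
	if (!fetch_flat_devicetree())
		flatten_device_tree();

	prom_printf("Device tree blob at @%x\n", dt_header_start);
}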

> + if (!call_prom_ret("fdt-fetch", 2, 1, NULL, mem_start,
> +room - sizeof(mem_reserve_map))) {
> + u32 size;
> +
> + hdr = (void *) mem_start;
> +
> + /* Fixup the boot cpuid */
> + hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu);
> +
> + /* Append the reserved map to the end of the blob */
> + hdr->off_mem_rsvmap = hdr->totalsize;
> + size = be32_to_cpu(hdr->totalsize);
> + rsvmap = (void *) hdr + size;
> + hdr->totalsize = cpu_to_be32(size + sizeof(mem_reserve_map));
> + memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
> +
> + /* Store the DT address */
> + dt_header_start = mem_start;
> +
> +#ifdef DEBUG_PROM
> + prom_printf("Fetched DTB: %d bytes to @%x\n", size, mem_start);
> +#endif

I think that should actually not be under DEBUG_PROM. The origin of the
FDT is fairly crucial information, so I think we can tolerate an extra
line of output to know that.

> + goto print_exit;

This was the clue that it should be in a separate function :)

cheers

> + }
> +
>   /* Get root of tree */
>   root = call_prom("peer", 1, 1, (phandle)0);
>   if (root == (phandle)0)
> @@ -2548,6 +2573,7 @@ static void __init flatten_device_tree(void)
>   /* Copy the reserve map in */
>   memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
>  
> +print_exit:
>  #ifdef DEBUG_PROM
>   {
>   int i;
> -- 
> 2.11.0


Re: [PATCH kernel] RFC: prom_init: Fetch flatten device tree from the system firmware

2017-10-16 Thread Alexey Kardashevskiy
On 16/10/17 21:20, Segher Boessenkool wrote:
> On Mon, Oct 16, 2017 at 06:07:06PM +1100, Alexey Kardashevskiy wrote:
>> On 16/10/17 17:46, David Gibson wrote:
> 
>> +/* Fixup the boot cpuid */
>> +hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu);
>
> If SLOF is generating a tree it really should get this header field
> right as well.


 Ah, I did not realize it is just a phandle from /chosen/cpu. Will
 fix.
>>>
>>> It's not a phandle.  It's just the "address" (i.e. reg value) of the
>>> boot cpu.
>>
>> Well, it is "reg" of a CPU with phandle==/chosen/cpu so my fdt code needs
>> to look there to pick the right "reg" rather than just plain 0. I'll fix
>> this but in general can it possibly be not a zero in QEMU/SLOF?
> 
> /chosen/cpu is an ihandle, not a phandle. 

Sure, that is what my proposed fdt-boot-cpu  does already.

> Most (if not all) references
> in /chosen are.





-- 
Alexey


[PATCH v3] KVM: PPC: Book3S PR: only install valid SLBs during KVM_SET_SREGS

2017-10-16 Thread Greg Kurz
Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
some of which are valid (i.e., SLB_ESID_V is set) and the rest are
likely all-zeroes (with QEMU at least).

Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
expects to find the SLB index in the 3 lower bits of its rb argument.
When passed zeroed arguments, it happily overwrites the 0th SLB entry
with zeroes. This is exactly what happens while doing live migration
with QEMU when the destination pushes the incoming SLB descriptors to
KVM PR. When reloading the SLBs at the next synchronization, QEMU first
clears its SLB array and only restores valid ones, but the 0th one is
now gone and we cannot access the corresponding memory anymore:

(qemu) x/x $pc
c00b742c: Cannot access memory

To avoid this, let's filter out non-valid SLB entries. While here, we
also force a full SLB flush before installing new entries. Since SLB
is for 64-bit only, we now build this path conditionally to avoid a
build break on 32-bit, which doesn't define SLB_ESID_V.

Signed-off-by: Greg Kurz 
---
v3: - build SLB path for 64-bit only [*]
- add blank line after declaration of rb and rs to silence checkpatch

v2: - flush SLB before installing new entries

[*] I'm wondering if other users of BOOK3S_HFLAG_SLB in this file
shouldn't be conditionally built as well, for consistency.
---
 arch/powerpc/kvm/book3s_pr.c |   16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 3beb4ff469d1..68c1f8bc17c4 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -1326,12 +1326,22 @@ static int kvm_arch_vcpu_ioctl_set_sregs_pr(struct kvm_vcpu *vcpu,
kvmppc_set_pvr_pr(vcpu, sregs->pvr);
 
vcpu3s->sdr1 = sregs->u.s.sdr1;
+#ifdef CONFIG_PPC_BOOK3S_64
if (vcpu->arch.hflags & BOOK3S_HFLAG_SLB) {
+   /* Flush all SLB entries */
+   vcpu->arch.mmu.slbmte(vcpu, 0, 0);
+   vcpu->arch.mmu.slbia(vcpu);
+
for (i = 0; i < 64; i++) {
-   vcpu->arch.mmu.slbmte(vcpu, sregs->u.s.ppc64.slb[i].slbv,
-   sregs->u.s.ppc64.slb[i].slbe);
+   u64 rb = sregs->u.s.ppc64.slb[i].slbe;
+   u64 rs = sregs->u.s.ppc64.slb[i].slbv;
+
+   if (rb & SLB_ESID_V)
+   vcpu->arch.mmu.slbmte(vcpu, rs, rb);
}
-   } else {
+   } else
+#endif
+   {
for (i = 0; i < 16; i++) {
vcpu->arch.mmu.mtsrin(vcpu, i, sregs->u.s.ppc32.sr[i]);
}



Re: [PATCH kernel] RFC: prom_init: Fetch flatten device tree from the system firmware

2017-10-16 Thread Segher Boessenkool
On Mon, Oct 16, 2017 at 06:07:06PM +1100, Alexey Kardashevskiy wrote:
> On 16/10/17 17:46, David Gibson wrote:

>  +/* Fixup the boot cpuid */
>  +hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu);
> >>>
> >>> If SLOF is generating a tree it really should get this header field
> >>> right as well.
> >>
> >>
> >> Ah, I did not realize it is just a phandle from /chosen/cpu. Will
> >> fix.
> > 
> > It's not a phandle.  It's just the "address" (i.e. reg value) of the
> > boot cpu.
> 
> Well, it is "reg" of a CPU with phandle==/chosen/cpu so my fdt code needs
> to look there to pick the right "reg" rather than just plain 0. I'll fix
> this but in general can it possibly be not a zero in QEMU/SLOF?

/chosen/cpu is an ihandle, not a phandle.  Most (if not all) references
in /chosen are.
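
For reference, the kernel side already resolves the boot CPU through that
ihandle; roughly (a sketch from memory of prom_init's boot-CPU discovery,
exact helper names and error handling may differ):

static void __init prom_find_boot_cpu(void)
{
	__be32 rval = 0;
	ihandle prom_cpu;
	phandle cpu_pkg;

	if (prom_getprop(prom.chosen, "cpu", &rval, sizeof(rval)) <= 0)
		return;
	prom_cpu = be32_to_cpu(rval);

	/* /chosen/cpu is an ihandle; convert it to the package (phandle) */
	cpu_pkg = call_prom("instance-to-package", 1, 1, prom_cpu);
	if (!PHANDLE_VALID(cpu_pkg))
		return;

	prom_getprop(cpu_pkg, "reg", &rval, sizeof(rval));
	prom.cpu = be32_to_cpu(rval);
}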


Segher


Re: [PATCH kernel] RFC: prom_init: Fetch flatten device tree from the system firmware

2017-10-16 Thread David Gibson
On Mon, Oct 16, 2017 at 06:07:06PM +1100, Alexey Kardashevskiy wrote:
> On 16/10/17 17:46, David Gibson wrote:
> > On Mon, Oct 16, 2017 at 05:22:55PM +1100, Alexey Kardashevskiy wrote:
> >> On 16/10/17 17:11, David Gibson wrote:
> >>> On Mon, Oct 16, 2017 at 04:49:17PM +1100, Alexey Kardashevskiy wrote:
>  At the moment, on 256CPU + 256 PCI devices guest, it takes the guest
>  about 8.5sec to read the entire device tree. Some explanation can be
>  found here: https://patchwork.ozlabs.org/patch/826124/ but mostly it is
>  because the kernel traverses the tree twice and it calls "getprop" for
>  each property, which is really slow as SLOF searches from the beginning
>  of the linked list every time.
> 
>  Since SLOF has just learned to build FDT and this takes less than 0.5sec
>  for such a big guest, this makes use of the proposed client interface
>  method - "fdt-fetch".
> 
>  If "fdt-fetch" is not available, the old method is used.
> 
>  Signed-off-by: Alexey Kardashevskiy 
> >>>
> >>> I like the concept, few details though..
> >>>
>  ---
>   arch/powerpc/kernel/prom_init.c | 26 ++
>   1 file changed, 26 insertions(+)
> 
>  diff --git a/arch/powerpc/kernel/prom_init.c 
>  b/arch/powerpc/kernel/prom_init.c
>  index 02190e90c7ae..daa50a153737 100644
>  --- a/arch/powerpc/kernel/prom_init.c
>  +++ b/arch/powerpc/kernel/prom_init.c
>  @@ -2498,6 +2498,31 @@ static void __init flatten_device_tree(void)
>   prom_panic("Can't allocate initial device-tree 
>  chunk\n");
>   mem_end = mem_start + room;
>   
>  +if (!call_prom_ret("fdt-fetch", 2, 1, NULL, mem_start,
>  +   room - sizeof(mem_reserve_map))) {
>  +u32 size;
>  +
>  +hdr = (void *) mem_start;
>  +
>  +/* Fixup the boot cpuid */
>  +hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu);
> >>>
> >>> If SLOF is generating a tree it really should get this header field
> >>> right as well.
> >>
> >>
> >> Ah, I did not realize it is just a phandle from /chosen/cpu. Will
> >> fix.
> > 
> > It's not a phandle.  It's just the "address" (i.e. reg value) of the
> > boot cpu.
> 
> 
> Well, it is "reg" of a CPU with phandle==/chosen/cpu so my fdt code needs
> to look there to pick the right "reg" rather than just plain 0.

Ah, right, I see what you mean.

> I'll fix
> this but in general can it possibly be not a zero in QEMU/SLOF?

Erm.. probably not, but I'm not totally certain what could happen if
you tried creating all your cpu cores explicitly with -device instead
of just using -smp.

I think it's safer to look it up in SLOF, so that it won't break if we
change how cpu addresses are assigned in qemu.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: [PATCH v3 3/3] powerpc:selftest update memcmp_64 selftest for VMX implementation

2017-10-16 Thread Simon Guo
Hi Cyril,

Thanks for the review.

On Mon, Oct 16, 2017 at 02:32:58PM +1100, Cyril Bur wrote:
> On Fri, 2017-10-13 at 12:30 +0800, wei.guo.si...@gmail.com wrote:
> > From: Simon Guo 
> > 
> > This patch adjusts the memcmp_64 selftest so that it can be
> > compiled successfully.
> > 
> 
> Do they not compile at the moment?
> 
Since I renamed enter/exit_vmx_copy() to enter/exit_vmx_ops() in patch 2,
this patch is necessary to make the selftest compile on top of patch 2.

> > It also adds testcases for:
> > - memcmp over 4K bytes size.
> > - s1/s2 with different/random offset on 16 bytes boundary.
> > - enter/exit_vmx_ops pairing.
> > 
> 
> This is a great idea, just a thought though - perhaps it might make
> more sense to have each condition be tested for in a separate binary
> rather than a single binary that tests everything.
> 
I am not sure whether it is better to separate those test cases
for the same memcmp function. Currently the time to execute this single
test binary is within 35 secs, and keeping them in one executable makes
people test more comprehensively. So personally I prefer to keep
them in one at present.

Thanks,
- Simon


[PATCH] powerpc/mm/radix: Drop unneeded NULL check

2017-10-16 Thread Aneesh Kumar K.V
From: Michael Ellerman 

We call these functions with a non-NULL mm or vma, so we can skip the NULL
checks in these functions. We also remove the now-unused function
__local_flush_hugetlb_page().

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/hugetlb.h |  6 --
 arch/powerpc/mm/tlb-radix.c| 18 --
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index b8a0fb442c64..795d825c2edd 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -40,12 +40,6 @@ static inline void flush_hugetlb_page(struct vm_area_struct *vma,
return radix__flush_hugetlb_page(vma, vmaddr);
 }
 
-static inline void __local_flush_hugetlb_page(struct vm_area_struct *vma,
- unsigned long vmaddr)
-{
-   if (radix_enabled())
-   return radix__local_flush_hugetlb_page(vma, vmaddr);
-}
 #else
 
 static inline pte_t *hugepd_page(hugepd_t hpd)
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 191cc5abc940..98fdf2243d35 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -164,7 +164,7 @@ void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmadd
unsigned long ap = mmu_get_ap(psize);
 
preempt_disable();
-   pid = mm ? mm->context.id : 0;
+   pid = mm->context.id;
if (pid != MMU_NO_CONTEXT)
_tlbiel_va(vmaddr, pid, ap, RIC_FLUSH_TLB);
preempt_enable();
@@ -175,10 +175,9 @@ void radix__local_flush_tlb_page(struct vm_area_struct *vma, unsigned long vmadd
 #ifdef CONFIG_HUGETLB_PAGE
/* need the return fix for nohash.c */
if (vma && is_vm_hugetlb_page(vma))
-   return __local_flush_hugetlb_page(vma, vmaddr);
+   return radix__local_flush_hugetlb_page(vma, vmaddr);
 #endif
-   radix__local_flush_tlb_page_psize(vma ? vma->vm_mm : NULL, vmaddr,
- mmu_virtual_psize);
+   radix__local_flush_tlb_page_psize(vma->vm_mm, vmaddr, mmu_virtual_psize);
 }
 EXPORT_SYMBOL(radix__local_flush_tlb_page);
 
@@ -232,7 +231,7 @@ void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
unsigned long ap = mmu_get_ap(psize);
 
preempt_disable();
-   pid = mm ? mm->context.id : 0;
+   pid = mm->context.id;
if (unlikely(pid == MMU_NO_CONTEXT))
goto bail;
if (!mm_is_thread_local(mm))
@@ -247,10 +246,9 @@ void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 {
 #ifdef CONFIG_HUGETLB_PAGE
if (vma && is_vm_hugetlb_page(vma))
-   return flush_hugetlb_page(vma, vmaddr);
+   return radix__flush_hugetlb_page(vma, vmaddr);
 #endif
-   radix__flush_tlb_page_psize(vma ? vma->vm_mm : NULL, vmaddr,
-   mmu_virtual_psize);
+   radix__flush_tlb_page_psize(vma->vm_mm, vmaddr, mmu_virtual_psize);
 }
 EXPORT_SYMBOL(radix__flush_tlb_page);
 
@@ -330,7 +328,7 @@ void radix__flush_tlb_range_psize(struct mm_struct *mm, unsigned long start,
 
 
preempt_disable();
-   pid = mm ? mm->context.id : 0;
+   pid = mm->context.id;
if (unlikely(pid == MMU_NO_CONTEXT))
goto err_out;
 
@@ -361,7 +359,7 @@ void radix__flush_tlb_collapsed_pmd(struct mm_struct *mm, unsigned long addr)
unsigned long pid, end;
 
 
-   pid = mm ? mm->context.id : 0;
+   pid = mm->context.id;
if (unlikely(pid == MMU_NO_CONTEXT))
goto no_context;
 
-- 
2.13.6



Re: [PATCH kernel] RFC: prom_init: Fetch flatten device tree from the system firmware

2017-10-16 Thread Alexey Kardashevskiy
On 16/10/17 17:46, David Gibson wrote:
> On Mon, Oct 16, 2017 at 05:22:55PM +1100, Alexey Kardashevskiy wrote:
>> On 16/10/17 17:11, David Gibson wrote:
>>> On Mon, Oct 16, 2017 at 04:49:17PM +1100, Alexey Kardashevskiy wrote:
 At the moment, on 256CPU + 256 PCI devices guest, it takes the guest
 about 8.5sec to read the entire device tree. Some explanation can be
 found here: https://patchwork.ozlabs.org/patch/826124/ but mostly it is
 because the kernel traverses the tree twice and it calls "getprop" for
 each property, which is really slow as SLOF searches from the beginning
 of the linked list every time.

 Since SLOF has just learned to build FDT and this takes less than 0.5sec
 for such a big guest, this makes use of the proposed client interface
 method - "fdt-fetch".

 If "fdt-fetch" is not available, the old method is used.

 Signed-off-by: Alexey Kardashevskiy 
>>>
>>> I like the concept, few details though..
>>>
 ---
  arch/powerpc/kernel/prom_init.c | 26 ++
  1 file changed, 26 insertions(+)

 diff --git a/arch/powerpc/kernel/prom_init.c 
 b/arch/powerpc/kernel/prom_init.c
 index 02190e90c7ae..daa50a153737 100644
 --- a/arch/powerpc/kernel/prom_init.c
 +++ b/arch/powerpc/kernel/prom_init.c
 @@ -2498,6 +2498,31 @@ static void __init flatten_device_tree(void)
prom_panic("Can't allocate initial device-tree chunk\n");
mem_end = mem_start + room;
  
 +  if (!call_prom_ret("fdt-fetch", 2, 1, NULL, mem_start,
 + room - sizeof(mem_reserve_map))) {
 +  u32 size;
 +
 +  hdr = (void *) mem_start;
 +
 +  /* Fixup the boot cpuid */
 +  hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu);
>>>
>>> If SLOF is generating a tree it really should get this header field
>>> right as well.
>>
>>
>> Ah, I did not realize it is just a phandle from /chosen/cpu. Will
>> fix.
> 
> It's not a phandle.  It's just the "address" (i.e. reg value) of the
> boot cpu.


Well, it is "reg" of a CPU with phandle==/chosen/cpu so my fdt code needs
to look there to pick the right "reg" rather than just plain 0. I'll fix
this but in general can it possibly be not a zero in QEMU/SLOF?


> 
 +  /* Append the reserved map to the end of the blob */
 +  hdr->off_mem_rsvmap = hdr->totalsize;
 +  size = be32_to_cpu(hdr->totalsize);
 +  rsvmap = (void *) hdr + size;
 +  hdr->totalsize = cpu_to_be32(size + sizeof(mem_reserve_map));
 +  memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
>>>
>>> .. and the reserve map for that matter.  I don't really understand
>>> what you're doing here. 
>>
>> ? Get the blob, increase the FDT size by sizeof(mem_reserve_map), fix up
>> totalsize and off_mem_rsvmap, copy mem_reserve_map to the end of the blob
>> (the actual order is slightly different, may be a bit confusing).
> 
> Right.. but where is mem_reserve_map coming from, if it hasn't come
> from an FDT?
> 
>> Asking SLOF to reserve the space seems to be unnecessary complication of
>> the interface - SLOF does not provide any reserved memory records.
> 
> Ah.. right, the reservations are coming from the pre-prom kernel, not
> from the firmware itself.  Yeah, that makes sense.  Ok, this makes
> sense then...


Right, the reservations are added via reserve_mem() in
arch/powerpc/kernel/prom_init.c

> 
>>> Note also that the reserve map is required to
>>> be 8-byte aligned, which totalsize might not be.
>>
>> Ah, good point.
> 
> ..at least with that fixed and maybe some comments to make what's
> going on clearer.

>>
>>
>>>
 +  /* Store the DT address */
 +  dt_header_start = mem_start;
 +
 +#ifdef DEBUG_PROM
 +  prom_printf("Fetched DTB: %d bytes to @%x\n", size, mem_start);
 +#endif
 +  goto print_exit;
 +  }
 +
/* Get root of tree */
root = call_prom("peer", 1, 1, (phandle)0);
if (root == (phandle)0)
 @@ -2548,6 +2573,7 @@ static void __init flatten_device_tree(void)
/* Copy the reserve map in */
memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
  
 +print_exit:
  #ifdef DEBUG_PROM
{
int i;
>>>
>>
>>
> 
> 
> 
> 


-- 
Alexey
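
On the 8-byte alignment point above, the fix-up in the quoted hunk could be
adjusted along these lines (sketch only; whichever alignment helper is
usable in prom_init is assumed):

	size = be32_to_cpu(hdr->totalsize);
	size = ALIGN(size, 8);	/* the FDT reserve map must be 8-byte aligned */
	rsvmap = (void *) hdr + size;
	memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map));
	hdr->off_mem_rsvmap = cpu_to_be32(size);
	hdr->totalsize = cpu_to_be32(size + sizeof(mem_reserve_map));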



[PATCH] powerpc/mm/hash: Add pr_fmt() to hash_utils64.c

2017-10-16 Thread Aneesh Kumar K.V
Make the printks look a bit nicer by adding a prefix.

The radix config now does:
 radix-mmu: Page sizes from device-tree:
 radix-mmu: Page size shift = 12 AP=0x0
 radix-mmu: Page size shift = 16 AP=0x5
 radix-mmu: Page size shift = 21 AP=0x1
 radix-mmu: Page size shift = 30 AP=0x2

This patch updates the hash config to produce similar dmesg output. With the patch we have:

 hash-mmu: Page sizes from device-tree:
 hash-mmu: base_shift=12: shift=12, sllp=0x, avpnm=0x, tlbiel=1, penc=0
 hash-mmu: base_shift=12: shift=16, sllp=0x, avpnm=0x, tlbiel=1, penc=7
 hash-mmu: base_shift=12: shift=24, sllp=0x, avpnm=0x, tlbiel=1, penc=56
 hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x, tlbiel=1, penc=1
 hash-mmu: base_shift=16: shift=24, sllp=0x0110, avpnm=0x, tlbiel=1, penc=8
 hash-mmu: base_shift=20: shift=20, sllp=0x0111, avpnm=0x, tlbiel=0, penc=2
 hash-mmu: base_shift=24: shift=24, sllp=0x0100, avpnm=0x0001, tlbiel=0, penc=0
 hash-mmu: base_shift=34: shift=34, sllp=0x0120, avpnm=0x07ff, tlbiel=0, penc=3
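
As a minimal illustration of the mechanism (not part of the patch), pr_fmt()
only has to be defined before the printk headers are pulled in; every pr_*()
call in the file then picks up the prefix:

#define pr_fmt(fmt) "hash-mmu: " fmt
#include <linux/kernel.h>

static void example(void)
{
	/* Prints: "hash-mmu: Page sizes from device-tree:" */
	pr_info("Page sizes from device-tree:\n");
}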

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/hash_utils_64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 78955c2153ed..b28dafe7d404 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -21,6 +21,7 @@
 #undef DEBUG
 #undef DEBUG_LOW
 
+#define pr_fmt(fmt) "hash-mmu: " fmt
 #include 
 #include 
 #include 
-- 
2.13.6