date:20180104

Re: [GIT PULL] ARM: uniphier: fixes for v4.15 (2nd)

2018-01-04 Thread Arnd Bergmann

On Fri, Dec 29, 2017 at 1:30 PM, Masahiro Yamada
 wrote:
> Hi Arnd, Olof,
>
> This is the 2nd bug-fix pull request for v4.15.
> Just one DT fix.  Please pull!

I've ended up cherry-picking that commit manually into the fixes branch:
We haven't updated the fixes branch to a later -rc, and your pull request
was based on -rc3, so pulling it would create an ugly backmerge.

You did nothing wrong here, so it seemed unnecessary to ask you for
a respin based on -rc1. Hope that works for you.

 Arnd

Re: [git pull] drm fixes for 4.15-rc6

2018-01-04 Thread Jani Nikula

On Fri, 29 Dec 2017, Jani Nikula  wrote:
> On Thu, 28 Dec 2017, Randy Dunlap  wrote:
>> It would be good to get this documentation build error patch
>> merged into 4.15.  Daniel Vetter says that he merged (applied) it.
>>
>> [PATCH] documentation/gpu/i915: fix docs build error after file rename
>>   https://marc.info/?l=linux-kernel&m=151234419425847&w=2
>
> Hi Randy, didn't look too closely but I presume our scripts tripped over
> the Fixes: tag that's split to two lines in that patch. I'll pick it up
> for our next fixes batch.

I just sent the pull request with this to Dave.

BR,
Jani.


-- 
Jani Nikula, Intel Open Source Technology Center

Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

2018-01-04 Thread Andi Kleen

On Thu, Jan 04, 2018 at 10:06:01AM -0600, Josh Poimboeuf wrote:
> On Thu, Jan 04, 2018 at 07:59:14AM -0800, Andi Kleen wrote:
> > > NAK.  We can't blindly disable objtool warnings, that will break
> > > livepatch and the ORC unwinder.  If you share a .o file (or the GCC
> > > code) I can look at adding retpoline support.
> > 
> > I don't think we can wait for that. We can disable livepatch and the
> > unwinder for now. They are not essential. Frame pointers should work
> > well enough for unwinding
> 
> If you want to make this feature conflict with livepatch and ORC,
> silencing objtool warnings is not the way to do it.

I don't see why it would conflict with the unwinder anyways?

It doesn't change the long term stack state, so it should be invisible to the 
unwinder (unless you crash in the thunk, which is very unlikely)

I actually got some unwinder backtraces during development and they seemed
to work.

> 
> > and afaik nobody can use livepatch in mainline anyways.
> 
> Why not?  The patch creation tooling is still out-of-tree, but livepatch
> itself is fully supported in mainline.

Ok.

Still doesn't seem critical at this point if it's some out of tree
thing.

-Andi

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread David Woodhouse

On Thu, 2018-01-04 at 15:29 +, Woodhouse, David wrote:
> 
> > With the GCC -mindirect-branch=thunk-external support, and microcode,
> > Xen will make a boot-time choice between using Retpoline, Lfence (which
> > is the better AMD option, and more performant than retpoline), or IBRS
> > on Skylake and newer processors where it is strictly necessary, as well
> > as using IBPB whenever available.
> 
> I need to pull in the AMD lfence alternative for retpoline, giving us a
> 3-way choice of the existing retpoline thunk, "lfence; jmp *%\reg", and
> a bare "jmp *%\reg".

I think I can abuse X86_FEATURE_SYSCALL for that, right? So it would
look something like this:

 --- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -12,7 +12,7 @@
 
 ENTRY(__x86.indirect_thunk.\reg)
CFI_STARTPROC
-   ALTERNATIVE "call 2f", __stringify(jmp *%\reg), X86_BUG_NO_RETPOLINE
+   ALTERNATIVE_2 "call 2f", __stringify(lfence;jmp *%\reg), 
X86_FEATURE_SYSCALL, __stringify(jmp *%\reg), X86_BUG_NO_RETPOLINE
 1:
lfence
ASM_UNREACHABLE


However, I would very much like to see a categorical statement from AMD
that the lfence is sufficient in all cases. Remember, Intel were saying
that too for a while, before finding that it was not *quite* good
enough.

smime.p7s
Description: S/MIME cryptographic signature

Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs

2018-01-04 Thread Andy Lutomirski

On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner  wrote:
> On Wed, 3 Jan 2018, Andy Lutomirski wrote:
>> On Wed, Jan 3, 2018 at 8:35 PM, Benjamin Gilbert
>>  wrote:
>> > On Wed, Jan 03, 2018 at 04:37:53PM -0800, Andy Lutomirski wrote:
>> >> Maybe try rebuilding a bad kernel with free_ldt_pgtables() modified
>> >> to do nothing, and the read /sys/kernel/debug/page_tables/current (or
>> >> current_kernel, or whatever it's called).  The problem may be obvious.
>> >
>> > current_kernel attached.  I have not seen any crashes with
>> > free_ldt_pgtables() stubbed out.
>>
>> I haven't reproduced it, but I think I see what's wrong.  KASLR sets
>> vaddr_end to a totally bogus value.  It should be no larger than
>> LDT_BASE_ADDR.  I suspect that your vmemmap is getting randomized into
>> the LDT range.  If it weren't for that, it could just as easily land
>> in the cpu_entry_area range.  This will need fixing in all versions
>> that aren't still called KAISER.
>>
>> Our memory map code is utter shite.  This kind of bug should not be
>> possible without a giant warning at boot that something is screwed up.
>
> You're right it's utter shite and the KASLR folks who added this insanity
> of making vaddr_end depend on a gazillion of config options and not
> documenting it in mm.txt or elsewhere where it's obvious to find should
> really sit back and think hard about their half baken 'security' features.
>
> Just look at the insanity of comment above the vaddr_end ifdef maze.
>
> Benjamin, can you test the patch below please?
>
> Thanks,
>
> tglx
>
> 8<--
> --- a/Documentation/x86/x86_64/mm.txt
> +++ b/Documentation/x86/x86_64/mm.txt
> @@ -12,8 +12,9 @@ ea00 - eaff (=40
>  ... unused hole ...
>  ec00 - fbff (=44 bits) kasan shadow memory (16TB)
>  ... unused hole ...
> -fe00 - fe7f (=39 bits) LDT remap for PTI
> -fe80 - feff (=39 bits) cpu_entry_area mapping
> +   vaddr_end for KASLR
> +fe00 - fe7f (=39 bits) cpu_entry_area mapping
> +fe80 - feff (=39 bits) LDT remap for PTI
>  ff00 - ff7f (=39 bits) %esp fixup stacks
>  ... unused hole ...
>  ffef - fffe (=64 GB) EFI region mapping space
> @@ -37,7 +38,9 @@ ffd4 - ffd5 (=49
>  ... unused hole ...
>  ffdf - fc00 (=53 bits) kasan shadow memory (8PB)
>  ... unused hole ...
> -fe80 - feff (=39 bits) cpu_entry_area mapping
> +   vaddr_end for KASLR
> +fe00 - fe7f (=39 bits) cpu_entry_area mapping
> +... unused hole ...
>  ff00 - ff7f (=39 bits) %esp fixup stacks
>  ... unused hole ...
>  ffef - fffe (=64 GB) EFI region mapping space
> --- a/arch/x86/include/asm/pgtable_64_types.h
> +++ b/arch/x86/include/asm/pgtable_64_types.h
> @@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
>  # define VMALLOC_SIZE_TB   _AC(32, UL)
>  # define __VMALLOC_BASE_AC(0xc900, UL)
>  # define __VMEMMAP_BASE_AC(0xea00, UL)
> -# define LDT_PGD_ENTRY _AC(-4, UL)
> +# define LDT_PGD_ENTRY _AC(-3, UL)
>  # define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
>  #endif

If you actually change the memory map order, you need to change the
shadow copy in mm/dump_pagetables.c, too.  I have a draft patch to
just sort the damn list, but that's not ready yet.

Re: [PATCH 1/2] perf-probe: Ensure debuginfo's build-id is correct

2018-01-04 Thread Arnaldo Carvalho de Melo

Em Mon, Dec 18, 2017 at 04:29:03PM +0900, Masami Hiramatsu escreveu:
> Ensure that the build-id of debuginfo is correctly
> matched to target build-id, if not, it warns user
> to check the system debuginfo package is correctly
> installed.

So we look at a variety of files looking for one that has a matching
build-id, I think the warning message should state that the file with
the unmatched build-id is simply being skipped, no?

And why do this at 'perf probe -l' time? I.e. at that point whatever
probes that are in place already have all the needed debug info?

I.e. the warning should be done at probe creation time only?

- Arnaldo
 
> E.g. on such environment, you will see below warning.
>   ==
>   # perf probe -l
>   WARN: There is a build-id mismatch between
>/usr/lib/debug/usr/lib64/libc-2.25.so.debug
>and
>/usr/lib64/libc-2.25.so
>   Please check your system's debuginfo files for mismatches, e.g. check
>   the versions for the target package and debuginfo package.
> probe_libc:malloc_get_state (on malloc_get_state@GLIBC_2.2.5 in /us
>   r/lib64/libc-2.25.so)
>   ==
> 
> Signed-off-by: Masami Hiramatsu 
> ---
>  tools/perf/util/probe-finder.c |   18 ++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index a5731de0e5eb..5bb71e056b21 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -119,9 +119,11 @@ enum dso_binary_type distro_dwarf_types[] = {
>  
>  struct debuginfo *debuginfo__new(const char *path)
>  {
> + u8 bid[BUILD_ID_SIZE], bid2[BUILD_ID_SIZE];
>   enum dso_binary_type *type;
>   char buf[PATH_MAX], nil = '\0';
>   struct dso *dso;
> + bool have_build_id = false;
>   struct debuginfo *dinfo = NULL;
>  
>   /* Try to open distro debuginfo files */
> @@ -129,12 +131,28 @@ struct debuginfo *debuginfo__new(const char *path)
>   if (!dso)
>   goto out;
>  
> + if (filename__read_build_id(path, bid, BUILD_ID_SIZE) > 0)
> + have_build_id = true;
> +
>   for (type = distro_dwarf_types;
>!dinfo && *type != DSO_BINARY_TYPE__NOT_FOUND;
>type++) {
>   if (dso__read_binary_type_filename(dso, *type, &nil,
>  buf, PATH_MAX) < 0)
>   continue;
> +
> + if (have_build_id) {
> + /* This can be fail because the file doesn't exist */
> + if (filename__read_build_id(buf, bid2,
> + BUILD_ID_SIZE) < 0)
> + continue;
> + if (memcmp(bid, bid2, BUILD_ID_SIZE)) {
> + pr_warning("WARN: There is a build-id mismatch 
> between\n %s\n and\n %s\n"
> + "Please check your system's debuginfo 
> files for mismatches, e.g. check the "
> + "versions for the target package and 
> debuginfo package.\n", buf, path);
> + continue;
> + }
> + }
>   dinfo = __debuginfo__new(buf);
>   }
>   dso__put(dso);

Re: [RFC] Retpoline: Binary mitigation for branch-target-injection (aka "Spectre")

2018-01-04 Thread Andy Lutomirski

On Thu, Jan 4, 2018 at 1:30 AM, Woodhouse, David  wrote:
> On Thu, 2018-01-04 at 01:10 -0800, Paul Turner wrote:
>> Apologies for the discombobulation around today's disclosure.  Obviously the
>> original goal was to communicate this a little more coherently, but the
>> unscheduled advances in the disclosure disrupted the efforts to pull this
>> together more cleanly.
>>
>> I wanted to open discussion the "retpoline" approach and and define its
>> requirements so that we can separate the core
>> details from questions regarding any particular implementation thereof.
>>
>> As a starting point, a full write-up describing the approach is available at:
>>   https://support.google.com/faqs/answer/7625886
>
> Note that (ab)using 'ret' in this way is incompatible with CET on
> upcoming processors. HJ added a -mno-indirect-branch-register option to
> the latest round of GCC patches, which puts the branch target in a
> register instead of on the stack. My kernel patches (which I'm about to
> reconcile with Andi's tweaks and post) do the same.
>
> That means that in the cases where at runtime we want to ALTERNATIVE
> out the retpoline, it just turns back into a bare 'jmp *\reg'.
>
>

I hate to say this, but I think Intel should postpone CET until the
dust settles.  Intel should also consider a hardware-protected stack
that is only accessible with PUSH, POP, CALL, RET, and a new MOVSTACK
instruction.  That, by itself, would give considerable protection.
But we still need JMP_NO_SPECULATE.  Or, better yet, get the CPU to
stop leaking data during speculative execution.

Re: [net-next: PATCH 0/8] Armada 7k/8k PP2 ACPI support

2018-01-04 Thread Andrew Lunn

> > I already agreed with 'reg' being awkward in the later emails.
> > Wouldn't _ADR be more appropriate to specify PHY address on MDIO bus?
> > 
> Ah it is an actual address, then yes _ADR is probably more appropriate.

Newbie ACPI question. What is the definition of an address?

In this cause, we are talking about an address of a device on an MDIO
bus. It takes a value between 0 and 31.

How are IC2 device addresses represented in ACPI? MDIO devices and I2C
devices are pretty similar. So it would make sense to use the same as
what I2C uses.

 Andrew

KASLR may break some kernel features (was Re: [PATCH v5 1/4] kaslr: add immovable_mem=nn[KMG]@ss[KMG] to specify extracting memory)

2018-01-04 Thread Luiz Capitulino

On Thu, 4 Jan 2018 18:30:57 +0800
Baoquan He  wrote:

> On 01/04/18 at 04:02pm, Chao Fan wrote:
> > In current code, kaslr may choose the memory region in movable
> > nodes to extract kernel, which will make the nodes can't be hot-removed.
> > To solve it, we can specify the memory region in immovable node.
> > Create immovable_mem to store the regions in immovable_mem, where should
> > be chosen by kaslr.

[...]

> Hi Chao,
> 
> Thanks for your effort on this issue.
> 
> Luiz told me they met a hugetlb issue when kaslr enabled on kvm guest.
> Please check the below bug information. There's only one available
> position which hugepage can use to allocate. In this case, if we have a
> generic parameter to tell kernel where we can randomize into, this
> hugepage issue can be solved. We can restrict kernel to randomize beyond
> [0x4000, 0x7fff]. Not sure if your immovable_mem=nn[KMG]@ss[KMG]
> can be adjusted to do this. I am hesitating on whether we should change
> this or not.

Having a generic kaslr parameter to control where the kernel is extracted
is one solution for this problem.

The general problem statement is that KASLR may break some kernel features
depending on where the kernel is extracted. Two examples are hot-plugged
memory (this series) and 1GB HugeTLB pages.

The 1GB HugeTLB page issue is not specific to KVM guests. It just happens
that there's a bunch of people running guests with up to 5GB of memory and
with that amount of memory you have one or two 1GB pages and is easier for
KASLR to extract the kernel into a 1GB region and split a 1GB page. So,
you may not get any 1GB pages at all when this happens. However, I can also
reproduce this on bare-metal with lots of memory where I can loose a 1GB
page from time to time.

Having a kaslr_range= parameter solves both issues, but two major drawbacks
is that it breaks existing setups and I guess users will have a very hard
time choosing good ranges.

Another idea would be to have a CONFIG_KASLR_RANGES, where each arch
could have a list of ranges known to contain holes and/or immovable
memory and only extract the kernel into those ranges.

Re: [PATCH 01/11] arm64: use RET instruction for exiting the trampoline

2018-01-04 Thread Ard Biesheuvel

On 4 January 2018 at 15:08, Will Deacon  wrote:
> Speculation attacks against the entry trampoline can potentially resteer
> the speculative instruction stream through the indirect branch and into
> arbitrary gadgets within the kernel.
>
> This patch defends against these attacks by forcing a misprediction
> through the return stack: a dummy BL instruction loads an entry into
> the stack, so that the predicted program flow of the subsequent RET
> instruction is to a branch-to-self instruction which is finally resolved
> as a branch to the kernel vectors with speculation suppressed.
>
> Signed-off-by: Will Deacon 
> ---
>  arch/arm64/kernel/entry.S | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 031392ee5f47..b9feb587294d 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -1029,6 +1029,9 @@ alternative_else_nop_endif
> .if \regsize == 64
> msr tpidrro_el0, x30// Restored in kernel_ventry
> .endif
> +   bl  2f
> +   b   .
> +2:

This deserves a comment, I guess?

Also, is deliberately unbalancing the return stack likely to cause
performance problems, e.g., in libc hot paths?

> tramp_map_kernelx30
>  #ifdef CONFIG_RANDOMIZE_BASE
> adr x30, tramp_vectors + PAGE_SIZE
> @@ -1041,7 +1044,7 @@ alternative_insn isb, nop, 
> ARM64_WORKAROUND_QCOM_FALKOR_E1003
> msr vbar_el1, x30
> add x30, x30, #(1b - tramp_vectors)
> isb
> -   br  x30
> +   ret
> .endm
>
> .macro tramp_exit, regsize = 64
> --
> 2.1.4
>

Re: [RFC] Retpoline: Binary mitigation for branch-target-injection (aka "Spectre")

2018-01-04 Thread David Woodhouse

On Thu, 2018-01-04 at 08:18 -0800, Andy Lutomirski wrote:
> I hate to say this, but I think Intel should postpone CET until the
> dust settles.

CET isn't a *problem* for retpoline. We've had a CET-compatible version
for a while now, and I posted it earlier. It's just that Andi was
working from an older version of my patches.

Of course, there's a school of thought that says that Intel should
postpone *everything* until this is all fixed sanely, but there's
nothing special about CET in that respect.

smime.p7s
Description: S/MIME cryptographic signature

Re: [PATCH 06/11] arm64: Move post_ttbr_update_workaround to C code

2018-01-04 Thread Ard Biesheuvel

On 4 January 2018 at 15:08, Will Deacon  wrote:
> From: Marc Zyngier 
>
> We will soon need to invoke a CPU-specific function pointer after changing
> page tables, so move post_ttbr_update_workaround out into C code to make
> this possible.
>
> Signed-off-by: Marc Zyngier 
> Signed-off-by: Will Deacon 
> ---
>  arch/arm64/include/asm/assembler.h | 13 -
>  arch/arm64/kernel/entry.S  |  2 +-
>  arch/arm64/mm/context.c|  9 +
>  arch/arm64/mm/proc.S   |  3 +--
>  4 files changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm64/include/asm/assembler.h 
> b/arch/arm64/include/asm/assembler.h
> index c45bc94f15d0..cee60ce0da52 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -476,17 +476,4 @@ alternative_endif
> mrs \rd, sp_el0
> .endm
>
> -/*
> - * Errata workaround post TTBRx_EL1 update.
> - */
> -   .macro  post_ttbr_update_workaround
> -#ifdef CONFIG_CAVIUM_ERRATUM_27456
> -alternative_if ARM64_WORKAROUND_CAVIUM_27456
> -   ic  iallu
> -   dsb nsh
> -   isb
> -alternative_else_nop_endif
> -#endif
> -   .endm
> -
>  #endif /* __ASM_ASSEMBLER_H */
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index b9feb587294d..6aa112baf601 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -277,7 +277,7 @@ alternative_else_nop_endif
>  * Cavium erratum 27456 (broadcast TLBI instructions may cause I-cache
>  * corruption).
>  */
> -   post_ttbr_update_workaround
> +   bl  post_ttbr_update_workaround
> .endif
>  1:
> .if \el != 0
> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> index 1cb3bc92ae5c..c1e3b6479c8f 100644
> --- a/arch/arm64/mm/context.c
> +++ b/arch/arm64/mm/context.c
> @@ -239,6 +239,15 @@ void check_and_switch_context(struct mm_struct *mm, 
> unsigned int cpu)
> cpu_switch_mm(mm->pgd, mm);
>  }
>
> +/* Errata workaround post TTBRx_EL1 update. */
> +asmlinkage void post_ttbr_update_workaround(void)
> +{
> +   asm volatile(ALTERNATIVE("nop; nop; nop",

What does 'volatile' add here?

> +"ic iallu; dsb nsh; isb",
> +ARM64_WORKAROUND_CAVIUM_27456,
> +CONFIG_CAVIUM_ERRATUM_27456));
> +}
> +
>  static int asids_init(void)
>  {
> asid_bits = get_cpu_asid_bits();
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 3146dc96f05b..6affb68a9a14 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -145,8 +145,7 @@ ENTRY(cpu_do_switch_mm)
> isb
> msr ttbr0_el1, x0   // now update TTBR0
> isb
> -   post_ttbr_update_workaround
> -   ret
> +   b   post_ttbr_update_workaround // Back to C code...
>  ENDPROC(cpu_do_switch_mm)
>
> .pushsection ".idmap.text", "ax"
> --
> 2.1.4
>

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread Andrea Arcangeli

Hello,

On Thu, Jan 04, 2018 at 04:32:01PM +0100, Paolo Bonzini wrote:
> On 04/01/2018 15:51, Andrew Cooper wrote:
> > Where have you got this idea from?  Using IBPB on every mode switch
> > would be an insane overhead to take, and isn't necessary.

It's only on kernel entry and vmexit.

> IIRC it started as a paranoia mode for AMD, but then we found out it was
> actually faster than IBRS on some Intel processor where IBRS performance
> was horrible.  But I don't remember the details of the performance
> testing, sorry.

Yes, it depends on the workload what is faster. ibrs 0 ibpb 2 is
possible to use on CPUs with SPEC_CTRL too in fact.

It's only where SPEC_CTRL is missing and only IBPB_SUPPORT is
available, that ibrs 0 ibpb 2 is the only option to fix variant#2 for
good.

If you run lots of syscalls ibrs 1 ibpb 1 is much faster. If you do
infrequent syscalls computing a lot in kernel like I/O with large
buffers getting copied, ibrs 0 ibpb 2 is much faster than ibrs 1 ibpb
1 (on those microcodes where ibrs 1 reduces performance a lot, not all
microcodes implementing SPEC_CTRL are inefficient like that).

If SPEC_CTRL is available ibrs 1 ibpb 1 should be preferred even if it
may not always be faster in every workload.

AMD website says
https://www.amd.com/en/corporate/speculative-execution

"Differences in AMD architecture mean there is a near zero risk of
exploitation of this variant."

ibrs 0 ibpb 2 brings the probability down to zero even when SPEC_CTRL
is missing and only IBPB_SUPPORT is available in microcode, if you
need that kind of piece of mind.

What exactly would be the point of shipping fixes for variant#2 if we
leave spectre variant#2 unfixed also in cases where we could have
fixed it?

The problem is, it's very unlikely, but if by accident somebody can
mount and setup such an attack, then spectre variant#2 becomes a
problem almost as bad as spectre variant#1 is and your hypervisor
guest/host isolation is fully compromised.

It's not up to us to decide if to leave something with "near zero
risk" unfixed by default, so for now we provided a fix that brings the
probability of such spectre variant#2 attack to zero whenever
possible so that such a spectre varaint#2 attack becomes impossible
(not just "near zero risk"").

Of course we made sure the performance comes back at runtime no matter
what after running this:

echo 0 >/sys/kernel/debug/x86/ibpb_enabled
echo 0 >/sys/kernel/debug/x86/ibrs_enabled

Or if you prefer at boot time with "noibrs noibpb". Not everyone
will necessarily care about that kind of variant#2 attacks of course.

NOTE: if those two tunables both read as 0 it means the fix for
variant#2 isn't activated by the running kernel and you need to
contact your CPU manufacturer for a microcode update providing
SPEC_CTRL or at least IBPB_SUPPORT (in the latter case the fix will
generally tend to perform worse and ibrs 0 ibpb 2 mode will
auto-engage).

For meltdown variant#3 same thing: if you want to disable the fix at
runtime because it's a guest kernel and it's running a single
microservice with a single app (similar to unikernel) or something
like that, you can with "nopti" or:

echo 0 >/sys/kernel/debug/x86/pti_enabled

Same issue if it's a bare metal host and it's running a single app and
it doesn't store secure data in kernel space etc... There's always an
option to disable the fixes.

Only spectre variant#1 fix is always on, as there's no performance
overhead to it.

By default it boots in the most secure setting possible so that all
spectre variant#1 and variant2 and meltdown variant#3 are fixed.

Thanks,
Andrea

Re: [PATCH 08/11] arm64: KVM: Use per-CPU vector when BP hardening is enabled

2018-01-04 Thread Ard Biesheuvel

On 4 January 2018 at 15:08, Will Deacon  wrote:
> From: Marc Zyngier 
>
> Now that we have per-CPU vectors, let's plug then in the KVM/arm64 code.
>

Why does bp hardening require per-cpu vectors?

> Signed-off-by: Marc Zyngier 
> Signed-off-by: Will Deacon 
> ---
>  arch/arm/include/asm/kvm_mmu.h   | 10 ++
>  arch/arm64/include/asm/kvm_mmu.h | 38 ++
>  arch/arm64/kvm/hyp/switch.c  |  2 +-
>  virt/kvm/arm/arm.c   |  8 +++-
>  4 files changed, 56 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index fa6f2174276b..eb46fc81a440 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -221,6 +221,16 @@ static inline unsigned int kvm_get_vmid_bits(void)
> return 8;
>  }
>
> +static inline void *kvm_get_hyp_vector(void)
> +{
> +   return kvm_ksym_ref(__kvm_hyp_vector);
> +}
> +
> +static inline int kvm_map_vectors(void)
> +{
> +   return 0;
> +}
> +
>  #endif /* !__ASSEMBLY__ */
>
>  #endif /* __ARM_KVM_MMU_H__ */
> diff --git a/arch/arm64/include/asm/kvm_mmu.h 
> b/arch/arm64/include/asm/kvm_mmu.h
> index 672c8684d5c2..2d6d4bd9de52 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -309,5 +309,43 @@ static inline unsigned int kvm_get_vmid_bits(void)
> return (cpuid_feature_extract_unsigned_field(reg, 
> ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
>  }
>
> +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
> +#include 
> +
> +static inline void *kvm_get_hyp_vector(void)
> +{
> +   struct bp_hardening_data *data = arm64_get_bp_hardening_data();
> +   void *vect = kvm_ksym_ref(__kvm_hyp_vector);
> +
> +   if (data->fn) {
> +   vect = __bp_harden_hyp_vecs_start +
> +  data->hyp_vectors_slot * SZ_2K;
> +
> +   if (!has_vhe())
> +   vect = lm_alias(vect);
> +   }
> +
> +   return vect;
> +}
> +
> +static inline int kvm_map_vectors(void)
> +{
> +   return create_hyp_mappings(kvm_ksym_ref(__bp_harden_hyp_vecs_start),
> +  kvm_ksym_ref(__bp_harden_hyp_vecs_end),
> +  PAGE_HYP_EXEC);
> +}
> +
> +#else
> +static inline void *kvm_get_hyp_vector(void)
> +{
> +   return kvm_ksym_ref(__kvm_hyp_vector);
> +}
> +
> +static inline int kvm_map_vectors(void)
> +{
> +   return 0;
> +}
> +#endif
> +
>  #endif /* __ASSEMBLY__ */
>  #endif /* __ARM64_KVM_MMU_H__ */
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index f7c651f3a8c0..8d4f3c9d6dc4 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -52,7 +52,7 @@ static void __hyp_text __activate_traps_vhe(void)
> val &= ~(CPACR_EL1_FPEN | CPACR_EL1_ZEN);
> write_sysreg(val, cpacr_el1);
>
> -   write_sysreg(__kvm_hyp_vector, vbar_el1);
> +   write_sysreg(kvm_get_hyp_vector(), vbar_el1);
>  }
>
>  static void __hyp_text __activate_traps_nvhe(void)
> diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
> index 6b60c98a6e22..1c9fdb6db124 100644
> --- a/virt/kvm/arm/arm.c
> +++ b/virt/kvm/arm/arm.c
> @@ -1158,7 +1158,7 @@ static void cpu_init_hyp_mode(void *dummy)
> pgd_ptr = kvm_mmu_get_httbr();
> stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
> hyp_stack_ptr = stack_page + PAGE_SIZE;
> -   vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
> +   vector_ptr = (unsigned long)kvm_get_hyp_vector();
>
> __cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
> __cpu_init_stage2();
> @@ -1403,6 +1403,12 @@ static int init_hyp_mode(void)
> goto out_err;
> }
>
> +   err = kvm_map_vectors();
> +   if (err) {
> +   kvm_err("Cannot map vectors\n");
> +   goto out_err;
> +   }
> +
> /*
>  * Map the Hyp stack pages
>  */
> --
> 2.1.4
>

Re: [PATCH V4 11/26] iommu/amd: deprecate pci_get_bus_and_slot()

2018-01-04 Thread Gary R Hook

On 01/04/2018 06:25 AM, Sinan Kaya wrote:

On 12/19/2017 12:37 AM, Sinan Kaya wrote:

pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.

Getting ready to remove pci_get_bus_and_slot() function in favor of
pci_get_domain_bus_and_slot().

Hard-code the domain number as 0 for the AMD IOMMU driver.

>
> Any comments from the IOMMU people?
>

pci_get_bus_and_slot() appears to (now) be a convenience function that 
wraps pci_get_domain_bus_and_slot() while using a 0 for the domain 
value. Exactly what you are doing here, albeit in a more overt way.

How is this patch advantageous? Seems to me that if other domains need 
to be enabled, that driver could be changed if and when that requirement 
arises.

But perhaps I'm missing a nuance here.

Re: [patch V5 02/11] LICENSES: Add the GPL 2.0 license

2018-01-04 Thread Carmen Bianca Bakker

Hi all,

Since December, `GPL-2.0` is no longer the correct identifier for the
licence.  The American FSF has been in talks with the SPDX Workgroup to
change it to `GPL-2.0-only`.

See the rationale here:

https://www.gnu.org/licenses/identify-licenses-clearly.html

See the new canonical licence list here:

https://spdx.org/licenses/

This change is valid for all GPL licences.  Similarly, `GPL-2.0+` has
been changed to `GPL-2.0-or-later`.

I believe that this patch should be changed to reflect that.  The
identifiers used in this patch are still valid, but deprecated.

Yours sincerely,

-- 
Carmen Bianca Bakker
Technical Intern
Free Software Foundation Europe e.V.



signature.asc
Description: OpenPGP digital signature

Re: [PATCH 11/11] arm64: Implement branch predictor hardening for affected Cortex-A CPUs

2018-01-04 Thread Ard Biesheuvel

On 4 January 2018 at 15:08, Will Deacon  wrote:
> Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing
> and can theoretically be attacked by malicious code.
>
> This patch implements a PSCI-based mitigation for these CPUs when available.
> The call into firmware will invalidate the branch predictor state, preventing
> any malicious entries from affecting other victim contexts.
>
> Signed-off-by: Marc Zyngier 
> Signed-off-by: Will Deacon 
> ---
>  arch/arm64/kernel/bpi.S| 24 
>  arch/arm64/kernel/cpu_errata.c | 42 
> ++
>  2 files changed, 66 insertions(+)
>
> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
> index 06a931eb2673..2b10d52a0321 100644
> --- a/arch/arm64/kernel/bpi.S
> +++ b/arch/arm64/kernel/bpi.S
> @@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start)
> vectors __kvm_hyp_vector
> .endr
>  ENTRY(__bp_harden_hyp_vecs_end)
> +ENTRY(__psci_hyp_bp_inval_start)
> +   stp x0, x1, [sp, #-16]!
> +   stp x2, x3, [sp, #-16]!
> +   stp x4, x5, [sp, #-16]!
> +   stp x6, x7, [sp, #-16]!
> +   stp x8, x9, [sp, #-16]!
> +   stp x10, x11, [sp, #-16]!
> +   stp x12, x13, [sp, #-16]!
> +   stp x14, x15, [sp, #-16]!
> +   stp x16, x17, [sp, #-16]!
> +   stp x18, x19, [sp, #-16]!

Would it be better to update sp only once here?
Also, do x18 and x19 need to be preserved/restored here?

> +   mov x0, #0x8400
> +   smc #0
> +   ldp x18, x19, [sp], #16
> +   ldp x16, x17, [sp], #16
> +   ldp x14, x15, [sp], #16
> +   ldp x12, x13, [sp], #16
> +   ldp x10, x11, [sp], #16
> +   ldp x8, x9, [sp], #16
> +   ldp x6, x7, [sp], #16
> +   ldp x4, x5, [sp], #16
> +   ldp x2, x3, [sp], #16
> +   ldp x0, x1, [sp], #16
> +ENTRY(__psci_hyp_bp_inval_end)
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 16ea5c6f314e..cb0fb3796bb8 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -53,6 +53,8 @@ static int cpu_enable_trap_ctr_access(void *__unused)
>  DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
>
>  #ifdef CONFIG_KVM
> +extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
> +
>  static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
> const char *hyp_vecs_end)
>  {
> @@ -94,6 +96,9 @@ static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
> spin_unlock(&bp_lock);
>  }
>  #else
> +#define __psci_hyp_bp_inval_start  NULL
> +#define __psci_hyp_bp_inval_endNULL
> +
>  static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
>   const char *hyp_vecs_start,
>   const char *hyp_vecs_end)
> @@ -118,6 +123,21 @@ static void  install_bp_hardening_cb(const struct 
> arm64_cpu_capabilities *entry,
>
> __install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
>  }
> +
> +#include 
> +
> +static int enable_psci_bp_hardening(void *data)
> +{
> +   const struct arm64_cpu_capabilities *entry = data;
> +
> +   if (psci_ops.get_version)
> +   install_bp_hardening_cb(entry,
> +  
> (bp_hardening_cb_t)psci_ops.get_version,
> +  __psci_hyp_bp_inval_start,
> +  __psci_hyp_bp_inval_end);
> +
> +   return 0;
> +}
>  #endif /* CONFIG_HARDEN_BRANCH_PREDICTOR */
>
>  #define MIDR_RANGE(model, min, max) \
> @@ -261,6 +281,28 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
> },
>  #endif
> +#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
> +   {
> +   .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> +   MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> +   .enable = enable_psci_bp_hardening,
> +   },
> +   {
> +   .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> +   MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> +   .enable = enable_psci_bp_hardening,
> +   },
> +   {
> +   .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> +   MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
> +   .enable = enable_psci_bp_hardening,
> +   },
> +   {
> +   .capability = ARM64_HARDEN_BRANCH_PREDICTOR,
> +   MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
> +   .enable = enable_psci_bp_hardening,
> +   },
> +#endif
> {
> }
>  };
> --
> 2.1.4
>

Re: [PATCH V4 11/26] iommu/amd: deprecate pci_get_bus_and_slot()

2018-01-04 Thread Sinan Kaya

On 1/4/2018 11:28 AM, Gary R Hook wrote:
> On 01/04/2018 06:25 AM, Sinan Kaya wrote:
>> On 12/19/2017 12:37 AM, Sinan Kaya wrote:
>>> pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
>>> where a PCI device is present. This restricts the device drivers to be
>>> reused for other domain numbers.
>>>
>>> Getting ready to remove pci_get_bus_and_slot() function in favor of
>>> pci_get_domain_bus_and_slot().
>>>
>>> Hard-code the domain number as 0 for the AMD IOMMU driver.
> 
> 
> 
>>
>> Any comments from the IOMMU people?
>>
> 
> pci_get_bus_and_slot() appears to (now) be a convenience function that wraps 
> pci_get_domain_bus_and_slot() while using a 0 for the domain value. Exactly 
> what you are doing here, albeit in a more overt way.
> 
> How is this patch advantageous? Seems to me that if other domains need to be 
> enabled, that driver could be changed if and when that requirement arises.
> 
> But perhaps I'm missing a nuance here.
> 
> 

The benefit of the change was discussed here:

https://lkml.org/lkml/2017/12/19/349

I hope it helps.


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.

Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

2018-01-04 Thread Josh Poimboeuf

On Thu, Jan 04, 2018 at 08:13:08AM -0800, Andi Kleen wrote:
> On Thu, Jan 04, 2018 at 10:06:01AM -0600, Josh Poimboeuf wrote:
> > On Thu, Jan 04, 2018 at 07:59:14AM -0800, Andi Kleen wrote:
> > > > NAK.  We can't blindly disable objtool warnings, that will break
> > > > livepatch and the ORC unwinder.  If you share a .o file (or the GCC
> > > > code) I can look at adding retpoline support.
> > > 
> > > I don't think we can wait for that. We can disable livepatch and the
> > > unwinder for now. They are not essential. Frame pointers should work
> > > well enough for unwinding
> > 
> > If you want to make this feature conflict with livepatch and ORC,
> > silencing objtool warnings is not the way to do it.
> 
> I don't see why it would conflict with the unwinder anyways?
> 
> It doesn't change the long term stack state, so it should be invisible to the 
> unwinder (unless you crash in the thunk, which is very unlikely)
> 
> I actually got some unwinder backtraces during development and they seemed
> to work.

Those objtool warnings are places where ORC annotations are either
missing or wrong.

At the very least, this needs to conflict with HAVE_RELIABLE_STACKTRACE
and HAVE_STACK_VALIDATION until objtool can understand the new code.
Currently ORC relies on HAVE_STACK_VALIDATION, so CONFIG_UNWINDER_ORC
would need to be disabled as well.

> > > and afaik nobody can use livepatch in mainline anyways.
> > 
> > Why not?  The patch creation tooling is still out-of-tree, but livepatch
> > itself is fully supported in mainline.
> 
> Ok.
> 
> Still doesn't seem critical at this point if it's some out of tree
> thing.

There are many livepatch users out there who would disagree.  The
out-of-tree bits aren't in kernel space.  If your patches are ready
before objtool supports them, then fine, make them conflict with
objtool.  But please don't introduce silent breakage.

Either way we'll need to figure out a way to get objtool support ASAP.

-- 
Josh

Re: "bad pmd" errors + oops with KPTI on 4.14.11 after loading X.509 certs

2018-01-04 Thread Thomas Gleixner

On Thu, 4 Jan 2018, Andy Lutomirski wrote:
> On Thu, Jan 4, 2018 at 4:28 AM, Thomas Gleixner  wrote:
> > --- a/arch/x86/include/asm/pgtable_64_types.h
> > +++ b/arch/x86/include/asm/pgtable_64_types.h
> > @@ -88,7 +88,7 @@ typedef struct { pteval_t pte; } pte_t;
> >  # define VMALLOC_SIZE_TB   _AC(32, UL)
> >  # define __VMALLOC_BASE_AC(0xc900, UL)
> >  # define __VMEMMAP_BASE_AC(0xea00, UL)
> > -# define LDT_PGD_ENTRY _AC(-4, UL)
> > +# define LDT_PGD_ENTRY _AC(-3, UL)
> >  # define LDT_BASE_ADDR (LDT_PGD_ENTRY << PGDIR_SHIFT)
> >  #endif
> 
> If you actually change the memory map order, you need to change the
> shadow copy in mm/dump_pagetables.c, too.  I have a draft patch to
> just sort the damn list, but that's not ready yet.

Yes, I forgot that in the first attempt. Noticed myself when dumping it,
but that should be irrelevant to figure out whether it fixes the problem at
hand.

[PATCH] Staging: iio: Prefer using BIT macro

2018-01-04 Thread Sumit Pundir

This patch fixes the following checkpatch.pl error at multiple lines:

CHECK: Prefer using the BIT macro

Signed-off-by: Sumit Pundir 
---
 drivers/staging/iio/cdc/ad7152.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/iio/cdc/ad7152.c b/drivers/staging/iio/cdc/ad7152.c
index 59d1b35..b2b15b9 100644
--- a/drivers/staging/iio/cdc/ad7152.c
+++ b/drivers/staging/iio/cdc/ad7152.c
@@ -47,24 +47,24 @@
 #define AD7152_STATUS_PWDN BIT(7)
 
 /* Setup Register Bit Designations (AD7152_REG_CHx_SETUP) */
-#define AD7152_SETUP_CAPDIFF   (1 << 5)
+#define AD7152_SETUP_CAPDIFF   BIT(5)
 #define AD7152_SETUP_RANGE_2pF (0 << 6)
-#define AD7152_SETUP_RANGE_0_5pF   (1 << 6)
+#define AD7152_SETUP_RANGE_0_5pF   BIT(6)
 #define AD7152_SETUP_RANGE_1pF (2 << 6)
 #define AD7152_SETUP_RANGE_4pF (3 << 6)
 #define AD7152_SETUP_RANGE(x)  ((x) << 6)
 
 /* Config Register Bit Designations (AD7152_REG_CFG) */
-#define AD7152_CONF_CH2EN  (1 << 3)
-#define AD7152_CONF_CH1EN  (1 << 4)
+#define AD7152_CONF_CH2EN  BIT(3)
+#define AD7152_CONF_CH1EN  BIT(4)
 #define AD7152_CONF_MODE_IDLE  (0 << 0)
-#define AD7152_CONF_MODE_CONT_CONV (1 << 0)
+#define AD7152_CONF_MODE_CONT_CONV BIT(0)
 #define AD7152_CONF_MODE_SINGLE_CONV   (2 << 0)
 #define AD7152_CONF_MODE_OFFS_CAL  (5 << 0)
 #define AD7152_CONF_MODE_GAIN_CAL  (6 << 0)
 
 /* Capdac Register Bit Designations (AD7152_REG_CAPDAC_XXX) */
-#define AD7152_CAPDAC_DACEN(1 << 7)
+#define AD7152_CAPDAC_DACENBIT(7)
 #define AD7152_CAPDAC_DACP(x)  ((x) & 0x1F)
 
 /* CFG2 Register Bit Designations (AD7152_REG_CFG2) */
-- 
2.7.4

Re: [PATCH v3 10/13] x86/retpoline/pvops: Convert assembler indirect jumps

2018-01-04 Thread Andi Kleen

On Thu, Jan 04, 2018 at 04:02:06PM +0100, Juergen Gross wrote:
> On 04/01/18 15:37, David Woodhouse wrote:
> > Convert pvops invocations to use non-speculative call sequences, when
> > CONFIG_RETPOLINE is enabled.
> > 
> > There is scope for future optimisation here — once the pvops methods are
> > actually set, we could just turn the damn things into *direct* jumps.
> > But this is perfectly sufficient for now, without that added complexity.
> 
> I don't see the need to modify the pvops calls.
> 
> All indirect calls are replaced by either direct calls or other code
> long before any user code is active.
> 
> For modules the replacements are in place before the module is being
> used.

Agreed. This shouldn't be needed.

-Andi

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Pavel Tatashin

I am getting the following panic when trying to boot 4.4.110rc1 on
Intel(R) Xeon(R) CPU E5-2630:

[5.923489] BUG: unable to handle kernel NULL pointer dereference
at 000d
[5.932259] IP: [] dyntick_save_progress_counter+0x12/0x50
[5.940142] PGD 0
[5.942400] Oops: 0002 [#1] SMP
[5.946023] Modules linked in:
[5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
4.4.110-rc1_pt_linux-4.4.110rc1 #1
[5.958484] Hardware name: Oracle Corporation ORACLE SERVER
X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
881ff2f24000
[5.977905] RIP: 0010:[]  []
dyntick_save_progress_counter+0x12/0x50
[5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
[5.994434] RAX: 0001 RBX: 81b02140 RCX: 883fec768000
[6.002403] RDX:  RSI: 881ff2f27e5f RDI: 88407e958140
[6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 00016110f359
[6.018333] R10: 0b10 R11:  R12: 81b02140
[6.026297] R13: ffdf R14: 0021 R15: 0002
[6.034262] FS:  () GS:881fff94()
knlGS:
[6.043293] CS:  0010 DS:  ES:  CR0: 80050033
[6.049707] CR2: 000d CR3: 01aa6000 CR4: 00360670
[6.057672] DR0:  DR1:  DR2: 
[6.065638] DR3:  DR6: fffe0ff0 DR7: 0400
[6.073603] Stack:
[6.075847]  881ff2f27e18 810e8fac 0202
881ff2f27e60
[6.084158]  881ff2f27e5f 810e70c0 81b02140
81b127a0
[6.092465]  0001  0003
881ff2f27eb8
[6.100768] Call Trace:
[6.103501]  [] force_qs_rnp+0xdc/0x150
[6.109527]  [] ? rcu_start_gp+0x70/0x70
[6.115654]  [] rcu_gp_kthread+0x468/0x9b0
[6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
[6.128973]  [] ? rcu_process_callbacks+0x5f0/0x5f0
[6.136167]  [] kthread+0xe5/0x100
[6.141710]  [] ? kthread_park+0x60/0x60
[6.147840]  [] ret_from_fork+0x3f/0x70
[6.153868]  [] ? kthread_park+0x60/0x60

I tried to bisect the problem, but when I try to boot only with:
"KAISER: Kernel Address Isolation" machine hangs during boot and
reboots without any panic message.

4.4.109 boots fine
4.9.75rc1 also boots fine.

Thank you,
Pavel

On Wed, Jan 3, 2018 at 3:11 PM, Greg Kroah-Hartman
 wrote:
> This is the start of the stable review cycle for the 4.4.110 release.
> There are 37 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Fri Jan  5 19:50:38 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.110-rc1.gz
> or in the git tree and branch at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-4.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
> -
> Pseudo-Shortlog of commits:
>
> Greg Kroah-Hartman 
> Linux 4.4.110-rc1
>
> Kees Cook 
> KPTI: Report when enabled
>
> Kees Cook 
> KPTI: Rename to PAGE_TABLE_ISOLATION
>
> Borislav Petkov 
> x86/kaiser: Move feature detection up
>
> Jiri Kosina 
> kaiser: disabled on Xen PV
>
> Borislav Petkov 
> x86/kaiser: Reenable PARAVIRT
>
> Thomas Gleixner 
> x86/paravirt: Dont patch flush_tlb_single
>
> Hugh Dickins 
> kaiser: kaiser_flush_tlb_on_return_to_user() check PCID
>
> Hugh Dickins 
> kaiser: asm/tlbflush.h handle noPGE at lower level
>
> Hugh Dickins 
> kaiser: drop is_atomic arg to kaiser_pagetable_walk()
>
> Hugh Dickins 
> kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
>
> Borislav Petkov 
> x86/kaiser: Check boottime cmdline params
>
> Borislav Petkov 
> x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
>
> Hugh Dickins 
> kaiser: add "nokaiser" boot option, using ALTERNATIVE
>
> Hugh Dickins 
> kaiser: fix unlikely error in alloc_ldt_struct()
>
> Hugh Dickins 
> kaiser: _pgd_alloc() without __GFP_REPEAT to avoid stalls
>
> Hugh Dickins 
> kaiser: paranoid_entry pass cr3 need to paranoid_exit
>
> Hugh Dickins 
> kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
>
> Hugh Dickins 
> kaiser: PCID 0 for kernel and 128 for user
>
> Hugh Dickins 
> kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user
>
> Dave Hansen 
> kaiser: enhanced by kernel and user PCIDs
>
> Hugh Dickins 
> kaiser: vmstat show NR_KAISERTABLE as nr_overhead
>
> Hugh Dickins 
> kaiser: delete KAISER_REAL_SWITCH option
>
> Hugh Dickins 
> kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET
>
> Hugh Dickins 
> kaiser:

Re: [GIT PULL] ARM: uniphier: fixes for v4.15 (2nd)

2018-01-04 Thread Masahiro Yamada

Hi Arnd,

2018-01-05 1:10 GMT+09:00 Arnd Bergmann :
> On Fri, Dec 29, 2017 at 1:30 PM, Masahiro Yamada
>  wrote:
>> Hi Arnd, Olof,
>>
>> This is the 2nd bug-fix pull request for v4.15.
>> Just one DT fix.  Please pull!
>
> I've ended up cherry-picking that commit manually into the fixes branch:
> We haven't updated the fixes branch to a later -rc, and your pull request
> was based on -rc3, so pulling it would create an ugly backmerge.
>
> You did nothing wrong here, so it seemed unnecessary to ask you for
> a respin based on -rc1. Hope that works for you.
>
>  Arnd

Works for me.  Thanks!

-- 
Best Regards
Masahiro Yamada

Re: [RFC PATCH] asm/generic: introduce if_nospec and nospec_barrier

2018-01-04 Thread Mark Rutland

On Thu, Jan 04, 2018 at 08:54:11AM -0600, Eric W. Biederman wrote:
> Dan Williams  writes:
> > On Wed, Jan 3, 2018 at 9:01 PM, Eric W. Biederman  
> > wrote:
> >> "Williams, Dan J"  writes:
> Either the patch you presented missed a whole lot like 90%+ of the
> user/kernel interface or there is some mitigating factor that I am not
> seeing.  Either way until reasonable people can read the code and
> agree on the potential exploitability of it, I will be nacking these
> patches.

As Dan mentioned, this is the result of auditing some static analysis reports.
I don't think it was claimed that this was complete, just that these are
locations that we're fairly certain need attention.

Auditing the entire user/kernel interface is going to take time, and I don't
think we should ignore this corpus in the mean time (though we certainly want
to avoid a whack-a-mole game).

[...]

> >>> diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
> >>> index 8ca9915befc8..7f83abdea255 100644
> >>> --- a/net/mpls/af_mpls.c
> >>> +++ b/net/mpls/af_mpls.c
> >>> @@ -81,6 +81,8 @@ static struct mpls_route *mpls_route_input_rcu(struct 
> >>> net *net, unsigned index)
> >>>   if (index < net->mpls.platform_labels) {
> >>>   struct mpls_route __rcu **platform_label =
> >>>   rcu_dereference(net->mpls.platform_label);
> >>> +
> >>> + osb();
> >>>   rt = rcu_dereference(platform_label[index]);
> >>>   }
> >>>   return rt;
> >>
> >> Ouch!  This adds a barrier in the middle of an rcu lookup, on the
> >> fast path for routing mpls packets.  Which if memory serves will
> >> noticably slow down software processing of mpls packets.
> >>
> >> Why does osb() fall after the branch for validity?  So that we allow
> >> speculation up until then?
> >
> > It falls there so that the cpu only issues reads with known good 'index' 
> > values.
> >
> >> I suspect it would be better to have those barriers in the tun/tap
> >> interfaces where userspace can inject packets and thus time them.  Then
> >> the code could still speculate and go fast for remote packets.
> >>
> >> Or does the speculation stomping have to be immediately at the place
> >> where we use data from userspace to perform a table lookup?
> >
> > The speculation stomping barrier has to be between where we validate
> > the input and when we may speculate on invalid input.
> 
> So a serializing instruction at the kernel/user boundary (like say
> loading cr3) is not enough?  That would seem to break any chance of a
> controlled timing.

Unfortunately, it isn't sufficient to do this at the kernel/user boundary. Any
subsequent bounds check can be mis-speculated regardless of prior
serialization.

Such serialization has to occur *after* the relevant bounds check, but *before*
use of the value that was checked.

Where it's possible to audit user-provided values up front, we may be able to
batch checks to amortize the cost of such serialization, but typically bounds
checks are spread arbitrarily deep in the kernel.

[...]

> Given what I have seen in other parts of the thread I think an and
> instruction that just limits the index to a sane range is generally
> applicable, and should be cheap enough to not care about.

Where feasible, this sounds good to me.

However, since many places have dynamic bounds which aren't necessarily
powers-of-two, I'm not sure how applicable this is.

Thanks,
Mark.

Re: [PATCH v3 4/9] ARM: dts: r7s72100: Add Capture Engine Unit (CEU)

2018-01-04 Thread Simon Horman

On Thu, Jan 04, 2018 at 05:03:12PM +0100, Jacopo Mondi wrote:
> Add Capture Engine Unit (CEU) node to device tree.
> 
> Signed-off-by: Jacopo Mondi 
> Reviewed-by: Geert Uytterhoeven 
> Reviewed-by: Laurent Pinchart 

This looks good to me. Please ping me once the bindings, which I assume are
the only dependency, are Acked or accepted.

ARM: SoC fixes for 4.15

2018-01-04 Thread Arnd Bergmann

The following changes since commit ce39882eb1d87dd9bb4f89d4ae09ef2547aee079:

  Merge tag 'amlogic-fixes-1' of
git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-amlogic
into fixes (2017-12-09 20:23:29 -0800)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git
tags/armsoc-fixes

for you to fetch changes up to abb62c46d4949d44979fa647740feff3f7538799:

  arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC
(2018-01-04 17:09:01 +0100)


ARM: SoC fixes for 4.15

Fixes this time include mostly device tree changes, as usual,
the notable ones include:

- A number of patches to fix most of the remaining DTC warnings
  that got introduced when DTC started warning about some
  obvious mistakes. We still have some remaining warnings that
  probably may have to wait until 4.16 to get fixed while we
  try to figure out what the correct contents should be.
- On Allwinner A64, Ethernet PHYs need a fix after a mistake in
  coordination between patches merged through multiple branches.
- Various fixes for PMICs on allwinner based boards
- Two fixes for ethernet link detection on some Renesas machines
- Two stability fixes for rockchip based boards

Aside from device-tree, two other areas got fixes for older
problems:

- For TI Davinci DM365, a couple of fixes were needed to repair
  the MMC DMA engine support, apparently this has been broken for
  a while.
- One important fix for all Allwinner chips with the PMIC driver
  as a loadable module.



[note: this is a bit larger than usual. Most of the fixes were merged
before Christmas into the fixes branch, but then neither of us
was around to send the pull request until now].

Alejandro Mery (3):
  ARM: davinci: Use platform_device_register_full() to create pdev
for dm365's eDMA
  ARM: davinci: Add dma_mask to dm365's eDMA device
  ARM: davinci: fix mmc entries in dm365's dma_slave_map

Arnd Bergmann (8):
  Merge tag 'v4.15-rockchip-dts32fixes-1' of
ssh://gitolite.kernel.org/.../mmind/linux-rockchip into fixes
  Merge tag 'v4.15-rockchip-dts64fixes-1' of
ssh://gitolite.kernel.org/.../mmind/linux-rockchip into fixes
  Merge tag 'at91-ab-4.15-dt-fixes' of
ssh://gitolite.kernel.org/.../abelloni/linux into fixes
  Merge tag 'davinci-fixes-for-v4.15' of
ssh://gitolite.kernel.org/.../nsekhar/linux-davinci into fixes
  ARM: dts: ls1021a: fix incorrect clock references
  ARM: dts: tango4: remove bogus interrupt-controller property
  Merge tag 'renesas-fixes-for-v4.15' of
ssh://gitolite.kernel.org/.../horms/renesas into fixes
  Merge tag 'sunxi-fixes-for-4.15' of
ssh://gitolite.kernel.org/.../sunxi/linux into fixes

Bogdan Mirea (2):
  arm64: dts: renesas: salvator-x: Remove renesas, no-ether-link property
  arm64: dts: renesas: ulcb: Remove renesas, no-ether-link property

Chen-Yu Tsai (1):
  ARM: dts: sunxi: Convert to CCU index macros for HDMI controller

David Lechner (1):
  ARM: dts: da850-lego-ev3: Fix battery voltage gpio

Heiko Stuebner (3):
  ARM: dts: rockchip: add cpu0-regulator on rk3066a-marsboard
  arm64: dts: rockchip: fix trailing 0 in rk3328 tsadc interrupts
  arm64: dts: rockchip: limit rk3328-rock64 gmac speed to 100MBit for now

Icenowy Zheng (1):
  arm64: allwinner: a64: add Ethernet PHY regulator for several boards

Jagan Teki (1):
  arm64: allwinner: a64-sopine: Fix to use dcdc1 regulator instead of vcc3v3

Javier Martinez Canillas (1):
  ARM: dts: exynos: Enable Mixer node for Exynos5800 Peach Pi machine

Joel Stanley (1):
  ARM: dts: aspeed-g4: Correct VUART IRQ number

Klaus Goger (1):
  arm64: dts: rockchip: remove vdd_log from rk3399-puma

Masahiro Yamada (1):
  arm64: dts: uniphier: fix gpio-ranges property of PXs3 SoC

Maxime Ripard (1):
  ARM: dts: sun8i: a711: Reinstate the PMIC compatible

Peter Rosin (1):
  ARM: dts: at91: disable the nxp,se97b SMBUS timeout on the TSE-850

Rob Herring (1):
  ARM: dts: rockchip: fix rk3288 iep-IOMMU interrupts property cells

Sergey Matyukevich (1):
  arm64: dts: orange-pi-zero-plus2: fix sdcard detect

Stefan Brüns (1):
  sunxi-rsb: Include OF based modalias in device uevent

 arch/arm/boot/dts/aspeed-g4.dtsi   |  2 +-
 arch/arm/boot/dts/at91-tse850-3.dts|  1 +
 arch/arm/boot/dts/da850-lego-ev3.dts   |  4 +--
 arch/arm/boot/dts/exynos5800-peach-pi.dts  |  4 +++
 arch/arm/boot/dts/ls1021a-qds.dts  |  2 +-
 arch/arm/boot/dts/ls1021a-twr.dts  |  2 +-
 arch/arm/boot/dts/rk3066a-marsboard.dts|  4 +++
 arch/arm/boot/dts/rk3288.dtsi  |  2 +-
 arch/arm/boot/dts/sun4i-a10.dtsi   |  4 +--
 arch/arm/boot/dts/sun5i-a10s.dtsi  |  4 +--
 arch/arm/boot/dts/sun6i-a31.dtsi

Re: [PATCH V7 11/12] arm64: dts: add syscon for whale2 platform

2018-01-04 Thread Arnd Bergmann

On Fri, Dec 22, 2017 at 6:30 AM, Chunyan Zhang  wrote:
> On 22 December 2017 at 07:03, Stephen Boyd  wrote:
>> On 12/07, Chunyan Zhang wrote:
>>> Some clocks on SC9860 are in the same address area with syscon
>>> devices, the proper syscon node will be quoted under the
>>> definitions of those clocks in DT.
>>>
>>> Signed-off-by: Chunyan Zhang 
>>> ---
>>
>> These last two can go via arm-soc?
>
> Thanks Stephen!
>
> Hi Arnd, Olof
>
> Could you please take the patch 11, 12 through arm-soc?

Applied both to next/dt now, thanks!

  Arnd

Re: [RECEND PATCH V7 12/12] arm64: dts: add clocks for SC9860

2018-01-04 Thread Arnd Bergmann

On Thu, Jan 4, 2018 at 8:08 AM, Chunyan Zhang  wrote:
> From: Chunyan Zhang 
>
> Some clocks on SC9860 are in the same address area with syscon devices,
> those are what have a property of 'sprd,syscon' which would refer to
> syscon devices, others would have a reg property indicated their address
> ranges.
>
> Signed-off-by: Chunyan Zhang 

Applied, thanks!

 Arnd

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread Andrea Arcangeli

On Thu, Jan 04, 2018 at 03:29:37PM +, Woodhouse, David wrote:
> On Thu, 2018-01-04 at 14:51 +, Andrew Cooper wrote:
> > 
> > > * never turn off indirect branch prediction, but use a branch prediction
> > > barrier on every mode switch (needed for current AMD microcode)
> > 
> > Where have you got this idea from?  Using IBPB on every mode switch
> > would be an insane overhead to take, and isn't necessary.
> 
> AMD *only* has IBPB and not IBRS, but IIRC you don't need to do it on

AMD 0x10 0x12 0x16 basically have IBRS and no IBPB, those works
perfectly fine in ibrs 2 ibpb 1 mode, variant#2 fixed and zero
overhead.

> every context switch into the kernel; only when switching between
> VMs/processes?

Some AMD only has IBPB and no IBRS, then IBPB has to be called in
every enter kernel or vmexit to give the same security as ibrs 1 ibpb
1 (modulo SMT/HT but that's not the spectre PoC and you can rule that
out mathematically also by simply using cpu pinning as you already do
or disabling SMT if you care that much). Note ibrs 1 ibpb 1 also won't
cover HT effects of guest/user mode vs guest/user mode so cpu pinning
may be advisable anyway in your case (even with ibrs 1 ibpb 1 no
difference).

Of course everything can be trivially opted out at runtime and all
measurable performance restored, but by default it boots in the most
secure config available and it will make spectre variant#2 attack
impossible with only ibpb available.

> I need to pull in the AMD lfence alternative for retpoline, giving us a
> 3-way choice of the existing retpoline thunk, "lfence; jmp *%\reg", and
> a bare "jmp *%\reg".
> 
> Then the IBRS bits can be added on top.

"AMD lfence and reptoline" in the same sentence sounds like somebody
else also cares about spectre variant#2 on AMD. "Reptoline" only ever
makes sense in spectre variant#2 context so either ibrs 0 ibpb 2 mode
makes some sense too, or special lfence repotline for AMD should not
be worth mentioning in the first place.

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Greg Kroah-Hartman

On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> I am getting the following panic when trying to boot 4.4.110rc1 on
> Intel(R) Xeon(R) CPU E5-2630:
> 
> [5.923489] BUG: unable to handle kernel NULL pointer dereference
> at 000d
> [5.932259] IP: [] 
> dyntick_save_progress_counter+0x12/0x50
> [5.940142] PGD 0
> [5.942400] Oops: 0002 [#1] SMP
> [5.946023] Modules linked in:
> [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> 881ff2f24000
> [5.977905] RIP: 0010:[]  []
> dyntick_save_progress_counter+0x12/0x50
> [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> 883fec768000
> [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> 88407e958140
> [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> 00016110f359
> [6.018333] R10: 0b10 R11:  R12: 
> 81b02140
> [6.026297] R13: ffdf R14: 0021 R15: 
> 0002
> [6.034262] FS:  () GS:881fff94()
> knlGS:
> [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> 00360670
> [6.057672] DR0:  DR1:  DR2: 
> 
> [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [6.073603] Stack:
> [6.075847]  881ff2f27e18 810e8fac 0202
> 881ff2f27e60
> [6.084158]  881ff2f27e5f 810e70c0 81b02140
> 81b127a0
> [6.092465]  0001  0003
> 881ff2f27eb8
> [6.100768] Call Trace:
> [6.103501]  [] force_qs_rnp+0xdc/0x150
> [6.109527]  [] ? rcu_start_gp+0x70/0x70
> [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> [6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
> [6.128973]  [] ? rcu_process_callbacks+0x5f0/0x5f0
> [6.136167]  [] kthread+0xe5/0x100
> [6.141710]  [] ? kthread_park+0x60/0x60
> [6.147840]  [] ret_from_fork+0x3f/0x70
> [6.153868]  [] ? kthread_park+0x60/0x60
> 
> I tried to bisect the problem, but when I try to boot only with:
> "KAISER: Kernel Address Isolation" machine hangs during boot and
> reboots without any panic message.
> 
> 4.4.109 boots fine
> 4.9.75rc1 also boots fine.

Hm, so I'm guessing 4.15-rc6 also works?

Odd that 4.9.75-rc1 fails.

Adding Jiri and Hugh and Dave here to see if they have seen this
before...

thanks,

greg k-h

Re: [PATCH v2 0/4] arm64: defconfig: enable additional led triggers

2018-01-04 Thread Arnd Bergmann

On Tue, Jan 2, 2018 at 8:19 AM, Amit Kucheria  wrote:
> On Thu, Dec 21, 2017 at 8:46 PM, Arnd Bergmann  wrote:
>> On Wed, Dec 6, 2017 at 9:57 AM, Amit Kucheria  
>> wrote:
>>> (Adding Arnd)
>>>
>>> Now that the merge window rush has abated, can you please apply this
>>> trivial series?
>>>
>>> On Mon, Nov 6, 2017 at 12:38 PM, Amit Kucheria  
>>> wrote:
 This patchset enables the kernel panic and disk-activity trigger for LEDs
 and then enables the panic trigger for three 96Boards - DB410c, Hikey and
 Hikey960.

 DB410c and Hikey panic behaviour has been tested by triggering a panic
 through /proc/sysrq-trigger, while Hikey960 is only compile-tested.
>>
>> I applied all four now, but it would have been easier for me if you had 
>> either
>> sent them to the platform maintainers, or to a...@kernel.org.
>
> The platform maintainers are cc'ed but I guess nobody took them since
> the patchset touched 3 different platforms and a common defconfig.
>
> I'll remember to cc a...@kernel.org in the future but is there any
> reason why this email address isn't listed in MAINTAINERS?

We normally want to have all patches merged through the platform
maintainers, and have no ambiguity regarding who picks things up.
More importantly, being listed in the MAINTAINERS file would result
in us getting thousands of patches each merge window mixed in with
the stuff that we actually do need to see, so that would likely be more
lossy and more work for us.

  Arnd

Re: [PATCH v11 2/6] mailbox: qcom: Create APCS child device for clock controller

2018-01-04 Thread Georgi Djakov

Hi Jassi,

On 12/29/2017 08:14 AM, Jassi Brar wrote:
> Hi Bjorn,
> 
> On Sun, Dec 24, 2017 at 10:36 AM, Bjorn Andersson
>  wrote:
>> On Fri 22 Dec 20:57 PST 2017, Jassi Brar wrote:
>>
>>> On Tue, Dec 5, 2017 at 9:16 PM, Georgi Djakov  
>>> wrote:
 There is a clock controller functionality provided by the APCS hardware
 block of msm8916 devices. The device-tree would represent an APCS node
 with both mailbox and clock provider properties.

>>> The spec might depict a 'clock' box and 'mailbox' box inside the
>>> bigger APCS box. However, from the code I see in this patchset, they
>>> are orthogonal and can & should be represented as independent DT
>>> nodes.
>>
>> The APCS consists of a number of different hardware blocks, one of them
>> being the "APCS global" block, which is what this node and drivers
>> relate to. On 8916 this contains both the IPC register and clock
>> control. But it's still just one block according to the hardware
>> specification.
>>
>> As such DT should describe the one hardware block by one node IMHO.
>>
> In my even humbler opinion, DT should describe a h/w functional unit
> which _could_ be seen as a standalone component.

The APCS is one separate register block related to the CPU cluster. I
haven't seen any strict guidelines for such cases in the DT docs, and
during the discussion got the impression that this is the preferred
binding. Rob has also reviewed the binding, so we should be fine to move
forward with this one.

> For example, if this APCS had a mac controller, would we also populate
> a netdev from mailbox driver? And what if next revision moves/drops
> this clock controller out of APCS, keeping mailbox controller exactly
> same?

The clock controller may change in some next SoC architecture and that's
why the SoC version is also part of the the compatible string.

Thanks,
Georgi

Re: objtool segfault with ORC unwinder enabled

2018-01-04 Thread Markus

On Thursday, 4 January 2018 16:46:13 CET Josh Poimboeuf wrote:
> On Wed, Jan 03, 2018 at 06:26:19PM +0100, Markus wrote:
> > > > > I'm unable to recreate.  Can you attach one of the .o files (like
> > > > > the
> > > > > above irq.o)?
> > > > 
> > > > Sure, see attached. (From vanilla linux-4.14.11.)
> > > 
> > > There's something weird with the toolchain.  The object file doesn't
> > > have an ELF section symbol for the .irqentry.text section.
> > > 
> > > Are there any special KCFLAGS being added?  Can you build the object
> > > with V=1 to show the full gcc command line?
> > 
> > I have not added anything. There is no env variable set like $KCFLAGS or
> > $CFLAGS. (If that was the question.)
> > 
> > I think you mean this line from output:
> > gcc -Wp,-MD,arch/x86/kernel/.irq.o.d  -nostdinc -isystem
> > /usr/lib/gcc/x86_64- pc-linux-gnu/6.4.0/include -I./arch/x86/include
> > -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi
> > -I./arch/x86/include/generated/uapi -I./ include/uapi
> > -I./include/generated/uapi -include ./include/linux/kconfig.h -
> > D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-
> > aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration
> > -Wno- format-security -std=gnu89 -fno-PIE -mno-sse -mno-mmx -mno-sse2
> > -mno-3dnow - mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387
> > -mno-fp-ret-in-387 - mpreferred-stack-boundary=3 -mskip-rax-setup
> > -mtune=generic -mno-red-zone - mcmodel=kernel -funit-at-a-time
> > -DCONFIG_AS_CFI=1 -
> > DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1
> > -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1
> > -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 - DCONFIG_AS_AVX512=1
> > -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno- sign-compare
> > -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -
> > Wno-frame-address -O2 --param=allow-store-data-races=0 -DCC_HAVE_ASM_GOTO
> > - Wframe-larger-than=2048 -fno-stack-protector
> > -Wno-unused-but-set-variable - Wno-unused-const-variable
> > -fomit-frame-pointer -fno-var-tracking-assignments -
> > Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fno-
> > stack-check -fconserve-stack -Werror=implicit-int
> > -Werror=strict-prototypes - Werror=date-time
> > -Werror=incompatible-pointer-types -Werror=designated-init -
> > Iarch/x86/kernel/../include/asm/trace-DKBUILD_BASENAME='"irq"'  -
> > DKBUILD_MODNAME='"irq"' -c -o arch/x86/kernel/.tmp_irq.o
> > arch/x86/kernel/irq.c
> > 
> > The next line is the objtool that segfaults.
> 
> I don't see anything unusual there.  Are there any Gentoo patches
> against either the kernel or GCC which would strip unused symbols?

The kernel is the vanilla kernel. (4.14.11 and also 4.15-rc6)
Its not a gentoo specific gcc patch. (Then every gentoo user would be 
affected?)

But I enabled ld.gold as default linker like 5 years ago. Never had a problem 
with this.

Is ld.gold supposed to fail here?

I switched back to ld.bfd and it seems to work.

BR,
Markus

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Guenter Roeck

On Thu, Jan 04, 2018 at 05:53:06PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> > I am getting the following panic when trying to boot 4.4.110rc1 on
> > Intel(R) Xeon(R) CPU E5-2630:
> > 
> > [5.923489] BUG: unable to handle kernel NULL pointer dereference
> > at 000d
> > [5.932259] IP: [] 
> > dyntick_save_progress_counter+0x12/0x50
> > [5.940142] PGD 0
> > [5.942400] Oops: 0002 [#1] SMP
> > [5.946023] Modules linked in:
> > [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> > 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> > [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> > X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> > [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> > 881ff2f24000
> > [5.977905] RIP: 0010:[]  []
> > dyntick_save_progress_counter+0x12/0x50
> > [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> > [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> > 883fec768000
> > [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> > 88407e958140
> > [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> > 00016110f359
> > [6.018333] R10: 0b10 R11:  R12: 
> > 81b02140
> > [6.026297] R13: ffdf R14: 0021 R15: 
> > 0002
> > [6.034262] FS:  () GS:881fff94()
> > knlGS:
> > [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> > [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> > 00360670
> > [6.057672] DR0:  DR1:  DR2: 
> > 
> > [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [6.073603] Stack:
> > [6.075847]  881ff2f27e18 810e8fac 0202
> > 881ff2f27e60
> > [6.084158]  881ff2f27e5f 810e70c0 81b02140
> > 81b127a0
> > [6.092465]  0001  0003
> > 881ff2f27eb8
> > [6.100768] Call Trace:
> > [6.103501]  [] force_qs_rnp+0xdc/0x150
> > [6.109527]  [] ? rcu_start_gp+0x70/0x70
> > [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> > [6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
> > [6.128973]  [] ? rcu_process_callbacks+0x5f0/0x5f0
> > [6.136167]  [] kthread+0xe5/0x100
> > [6.141710]  [] ? kthread_park+0x60/0x60
> > [6.147840]  [] ret_from_fork+0x3f/0x70
> > [6.153868]  [] ? kthread_park+0x60/0x60
> > 
> > I tried to bisect the problem, but when I try to boot only with:
> > "KAISER: Kernel Address Isolation" machine hangs during boot and
> > reboots without any panic message.
> > 
> > 4.4.109 boots fine
> > 4.9.75rc1 also boots fine.
> 
> Hm, so I'm guessing 4.15-rc6 also works?
> 
> Odd that 4.9.75-rc1 fails.
> 

I thought the above says that it boots fine ?

Guenter

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Pavel Tatashin

> Hm, so I'm guessing 4.15-rc6 also works?

I have not test 4.15

> Odd that 4.9.75-rc1 fails.

4.9.75-rc1 does NOT fail, it boots fine.

config for 4.4.110rc1 panic is attached.

Thank you,
Pasha


config_linux-4.4.110rc1.gz
Description: GNU Zip compressed data

Re: INFO: rcu detected stall in memcpy

2018-01-04 Thread Takashi Iwai

On Thu, 04 Jan 2018 15:17:23 +0100,
Takashi Iwai wrote:
> 
> On Thu, 04 Jan 2018 15:01:06 +0100,
> Dmitry Vyukov wrote:
> > 
> > On Thu, Jan 4, 2018 at 1:57 PM, Takashi Iwai  wrote:
> > > On Thu, 04 Jan 2018 13:08:45 +0100,
> > > Dmitry Vyukov wrote:
> > >>
> > >> On Thu, Jan 4, 2018 at 1:03 PM, syzbot
> > >>  wrote:
> > >> > Hello,
> > >> >
> > >> > syzkaller hit the following crash on
> > >> > 30a7acd573899fd8b8ac39236eff6468b195ac7d
> > >> > git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> > >> > compiler: gcc (GCC) 7.1.1 20170620
> > >> > .config is attached
> > >> > Raw console output is attached.
> > >> > Unfortunately, I don't have any reproducer for this bug yet.
> > >> >
> > >> >
> > >> > IMPORTANT: if you fix the bug, please add the following tag to the 
> > >> > commit:
> > >> > Reported-by: syzbot+387f48da65cb522ab...@syzkaller.appspotmail.com
> > >> > It will help syzbot understand when the bug is fixed. See footer for
> > >> > details.
> > >> > If you forward the report, please keep this part and the footer.
> > >>
> > >> This looks ALSA-related. +ALSA maintainers.
> > >
> > > Not sure exactly what triggers it.  It's the simple memcpy(), and I
> > > don't know where RCU is involved in that code path.
> > >
> > > BTW, other two suspicious RCU usage reports are actually stopped at
> > > the second WARN_ON() after the RCU message, and the second WARN_ON()
> > > is independent from RCU; it's the known spurious WARN_ON() and was
> > > already removed in the sound git tree.
> > 
> > 
> > Hi Takashi,
> > 
> > Another similar one just popped up:
> > 
> > https://groups.google.com/forum/#!topic/syzkaller-bugs/X3d6-PIrJM0
> > 
> > This looks like mulaw_decode enters an infinite loop, or at least
> > doing very large amount of computations without a resched, e.g.
> > (uint64_t)-1 number of iterations of something along these lines.
> 
> OK, that makes sense.
> 
> My rough guess is that it's the misconfigured aloop device by
> concurrent setup.  The aloop device allows to restrict the parameters
> of the other side of the connection, and something bad may happen
> there if both sides are updated concurrently.
> 
> We've seen segfault by memset() at loopback_preapre() in
> sound/drivers/aloop.c by syzbot+3902b5220e8ca27889ca, too, which
> indicates also the wrongly setup parameters that overflows the
> allocated buffer.

Below two patches may possibly plug the holes, but I'm not entirely
sure whether that's the exact culprit.  Could you put them into syzbot
to watch whether they have any influence?

In anyway, they are obvious bugs to be fixed, so I'm going to queue to
my tree.


thanks,

Takashi



0001-ALSA-pcm-Add-missing-error-checks-in-OSS-emulation-p.patch
Description: Binary data


0002-ALSA-aloop-Fix-racy-hw-constraints-adjustment.patch
Description: Binary data

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Guenter Roeck

On Wed, Jan 03, 2018 at 09:11:06PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.110 release.
> There are 37 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Fri Jan  5 19:50:38 UTC 2018.
> Anything received after that time might be too late.
> 

For v4.4.109-38-g99abd6c:

Build results:
total: 145 pass: 145 fail: 0
Qemu test results:
total: 118 pass: 118 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

Re: [PATCH 08/11] arm64: KVM: Use per-CPU vector when BP hardening is enabled

2018-01-04 Thread Marc Zyngier

On 04/01/18 16:28, Ard Biesheuvel wrote:
> On 4 January 2018 at 15:08, Will Deacon  wrote:
>> From: Marc Zyngier 
>>
>> Now that we have per-CPU vectors, let's plug then in the KVM/arm64 code.
>>
> 
> Why does bp hardening require per-cpu vectors?

The description is not 100% accurate. We have per *CPU type* vectors.
This stems from the following, slightly conflicting requirements:

- We have systems with more than one CPU type (think big-little)
- Different implementations require different BP hardening sequences
- The BP hardening sequence must be executed before doing any branch

The natural solution is to have one set of vectors per CPU type,
containing the BP hardening sequence for that particular implementation,
ending with a branch to the common code.

M.
-- 
Jazz is not dead. It just smells funny...

Re: [PATCH 4.9 00/39] 4.9.75-stable review

2018-01-04 Thread Guenter Roeck

On Wed, Jan 03, 2018 at 09:11:14PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.75 release.
> There are 39 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 

For v4.9.74-40-gd88d440:

Build results:
total: 145 pass: 145 fail: 0
Qemu test results:
total: 126 pass: 126 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Willy Tarreau

On Thu, Jan 04, 2018 at 05:53:06PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> > I am getting the following panic when trying to boot 4.4.110rc1 on
> > Intel(R) Xeon(R) CPU E5-2630:
> > 
> > [5.923489] BUG: unable to handle kernel NULL pointer dereference
> > at 000d
> > [5.932259] IP: [] 
> > dyntick_save_progress_counter+0x12/0x50
> > [5.940142] PGD 0
> > [5.942400] Oops: 0002 [#1] SMP
> > [5.946023] Modules linked in:
> > [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> > 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> > [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> > X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> > [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> > 881ff2f24000
> > [5.977905] RIP: 0010:[]  []
> > dyntick_save_progress_counter+0x12/0x50
> > [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> > [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> > 883fec768000
> > [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> > 88407e958140
> > [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> > 00016110f359
> > [6.018333] R10: 0b10 R11:  R12: 
> > 81b02140
> > [6.026297] R13: ffdf R14: 0021 R15: 
> > 0002
> > [6.034262] FS:  () GS:881fff94()
> > knlGS:
> > [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> > [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> > 00360670
> > [6.057672] DR0:  DR1:  DR2: 
> > 
> > [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> > 0400
> > [6.073603] Stack:
> > [6.075847]  881ff2f27e18 810e8fac 0202
> > 881ff2f27e60
> > [6.084158]  881ff2f27e5f 810e70c0 81b02140
> > 81b127a0
> > [6.092465]  0001  0003
> > 881ff2f27eb8
> > [6.100768] Call Trace:
> > [6.103501]  [] force_qs_rnp+0xdc/0x150
> > [6.109527]  [] ? rcu_start_gp+0x70/0x70
> > [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> > [6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
> > [6.128973]  [] ? rcu_process_callbacks+0x5f0/0x5f0
> > [6.136167]  [] kthread+0xe5/0x100
> > [6.141710]  [] ? kthread_park+0x60/0x60
> > [6.147840]  [] ret_from_fork+0x3f/0x70
> > [6.153868]  [] ? kthread_park+0x60/0x60
> > 
> > I tried to bisect the problem, but when I try to boot only with:
> > "KAISER: Kernel Address Isolation" machine hangs during boot and
> > reboots without any panic message.
> > 
> > 4.4.109 boots fine
> > 4.9.75rc1 also boots fine.
> 
> Hm, so I'm guessing 4.15-rc6 also works?
> 
> Odd that 4.9.75-rc1 fails.

s/4.9.75/4.4.110/ I suppose.

Can't this be because more patches are required in 4.4 to support this
patch set ? Or maybe a manual fix for a conflict that went wrong ? Just
trying to guess.

Willy

Re: [PATCH 08/11] arm64: KVM: Use per-CPU vector when BP hardening is enabled

2018-01-04 Thread Ard Biesheuvel

On 4 January 2018 at 17:04, Marc Zyngier  wrote:
> On 04/01/18 16:28, Ard Biesheuvel wrote:
>> On 4 January 2018 at 15:08, Will Deacon  wrote:
>>> From: Marc Zyngier 
>>>
>>> Now that we have per-CPU vectors, let's plug then in the KVM/arm64 code.
>>>
>>
>> Why does bp hardening require per-cpu vectors?
>
> The description is not 100% accurate. We have per *CPU type* vectors.
> This stems from the following, slightly conflicting requirements:
>
> - We have systems with more than one CPU type (think big-little)
> - Different implementations require different BP hardening sequences
> - The BP hardening sequence must be executed before doing any branch
>
> The natural solution is to have one set of vectors per CPU type,
> containing the BP hardening sequence for that particular implementation,
> ending with a branch to the common code.
>

Crystal clear, thanks.

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread Alan Cox

> If you run lots of syscalls ibrs 1 ibpb 1 is much faster. If you do
> infrequent syscalls computing a lot in kernel like I/O with large
> buffers getting copied, ibrs 0 ibpb 2 is much faster than ibrs 1 ibpb
> 1 (on those microcodes where ibrs 1 reduces performance a lot, not all
> microcodes implementing SPEC_CTRL are inefficient like that).

Have you looked at whether you can measure activity and switch
automatically between the two (or by task). It seems silly to leave
something the machine can accurately assess toa human ?

Alan

Re: [PATCH v9 7/8] crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64

2018-01-04 Thread Logan Gunthorpe




On 04/01/18 12:25 AM, Horia Geantă wrote:

+#include 

Typo: lo-hi should be used instead (see previous patch versions).

Please add in the commit message the explanation (which was there in v8 but
removed in v9):
To be consistent with CAAM engine HW spec: in case of 64-bit registers,
irrespective of device endianness, the lower address should be read from
/ written to first, followed by the upper address. Indeed the I/O
accessors in CAAM driver currently don't follow the spec, however this
is a good opportunity to fix the code.


Ok, well I just copied what the latest code did. I assumed that seeing 
it was cleaned up very recently that they would have done it correctly.


  
  /*

   * Architecture-specific register access methods
@@ -136,7 +136,6 @@ static inline void clrsetbits_32(void __iomem *reg, u32 
clear, u32 set)
   *base + 0x : least-significant 32 bits
   *base + 0x0004 : most-significant 32 bits
   */
-#ifdef CONFIG_64BIT
  static inline void wr_reg64(void __iomem *reg, u64 data)
  {
if (caam_little_end)

Since the 2 cases (32/64-bit) are merged, caam_imx should be accounted for the
logic to stay the same.


Oops, my mistake. I'll fix this and the above and send a revised set.

Logan

Re: "BUG: using smp_processor_id() in preemptible" with KPTI on 4.14.11

2018-01-04 Thread Peter Zijlstra

On Thu, Jan 04, 2018 at 04:37:24PM +0100, Thomas Gleixner wrote:
> > Yes:
> > 
> >BUG: using smp_processor_id() in preemptible [] code: 
> > ovsdb-server/4498
> >caller is native_flush_tlb_single+0x57/0xc0
> >CPU: 2 PID: 4498 Comm: ovsdb-server Not tainted 
> > 4.15.0-rc6-kvm-00423-gea1908c252eb #3
> >Hardware name: MSI MS-7798/B75MA-P45 (MS-7798), BIOS V1.9 09/30/2013
> >Call Trace:
> > dump_stack+0x5c/0x86
> > check_preemption_disabled+0xdd/0xe0
> > native_flush_tlb_single+0x57/0xc0
> > ? __set_pte_vaddr+0x2d/0x40
> > __set_pte_vaddr+0x2d/0x40
> > set_pte_vaddr+0x2f/0x40
> > cea_set_pte+0x30/0x40
> > ds_update_cea.constprop.4+0x4d/0x70
> > reserve_ds_buffers+0x159/0x410
> > ? wp_page_copy+0x370/0x6c0
> > x86_reserve_hardware+0x150/0x160
> > x86_pmu_event_init+0x3e/0x1f0
> > perf_try_init_event+0x69/0x80
> > perf_event_alloc+0x652/0x740
> > SyS_perf_event_open+0x3f6/0xd60
> > do_syscall_64+0x5c/0x190
> > entry_SYSCALL64_slow_path+0x25/0x25
> >RIP: 0033:0x72bff0a3c0b9
> >RSP: 002b:7ffed11c2f18 EFLAGS: 0206 ORIG_RAX: 012a
> >RAX: ffda RBX: 7ffed11c30f0 RCX: 72bff0a3c0b9
> >RDX:  RSI:  RDI: 7ffed11c2f20
> >RBP:  R08:  R09: 0070
> >R10:  R11: 0206 R12: 0008
> >R13:  R14: 7ffed11c30d0 R15: 60986ecfb600

Fun, so set_pte_vaddr() and the whole cpu_entry_area are supposed to be
per CPU. But the DS crud does cross CPU updates of those tables.

So we need some additional fun and games..

How's the below?

---
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8f0aace08b87..8156e47da7ba 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,6 +5,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 #include "../perf_event.h"
@@ -283,20 +284,35 @@ static DEFINE_PER_CPU(void *, insn_buffer);
 
 static void ds_update_cea(void *cea, void *addr, size_t size, pgprot_t prot)
 {
+   unsigned long start = (unsigned long)cea;
phys_addr_t pa;
size_t msz = 0;
 
pa = virt_to_phys(addr);
+
+   preempt_disable();
for (; msz < size; msz += PAGE_SIZE, pa += PAGE_SIZE, cea += PAGE_SIZE)
cea_set_pte(cea, pa, prot);
+
+   /*
+* This is a cross-CPU update of the cpu_entry_area, we must shoot down
+* all TLB entries for it.
+*/
+   flush_tlb_kernel_range(start, start + size);
+   preempt_enable();
 }
 
 static void ds_clear_cea(void *cea, size_t size)
 {
+   unsigned long start = (unsigned long)cea;
size_t msz = 0;
 
+   preempt_disable();
for (; msz < size; msz += PAGE_SIZE, cea += PAGE_SIZE)
cea_set_pte(cea, 0, PAGE_NONE);
+
+   flush_tlb_kernel_range(start, start + size);
+   preempt_enable();
 }
 
 static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)

Re: [0/4] video-UDLFB: Adjustments for five function implementations

2018-01-04 Thread Bartlomiej Zolnierkiewicz

On Friday, December 29, 2017 07:10:00 PM SF Markus Elfring wrote:
> >>   Delete an error message for a failed memory allocation in two functions
> > 
> > This patch removes the information about the device for which the 
> > allocation fails.
> 
> * Do you find a Linux allocation failure report insufficient in this use case?

Yes, there is more information available currently in the driver and
I see no real improvement in removing it.

> * Are you looking for any more clarification?

I will not apply any of such patches for now. The only exception
being drivers that support hardware that can have only one instance
in the system (but it needs to be explicitly stated in the patch
description and the patch needs to be reviewed by a someone else
than the author).

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Greg Kroah-Hartman

On Thu, Jan 04, 2018 at 09:01:06AM -0800, Guenter Roeck wrote:
> On Thu, Jan 04, 2018 at 05:53:06PM +0100, Greg Kroah-Hartman wrote:
> > On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> > > I am getting the following panic when trying to boot 4.4.110rc1 on
> > > Intel(R) Xeon(R) CPU E5-2630:
> > > 
> > > [5.923489] BUG: unable to handle kernel NULL pointer dereference
> > > at 000d
> > > [5.932259] IP: [] 
> > > dyntick_save_progress_counter+0x12/0x50
> > > [5.940142] PGD 0
> > > [5.942400] Oops: 0002 [#1] SMP
> > > [5.946023] Modules linked in:
> > > [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> > > 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> > > [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> > > X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> > > [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> > > 881ff2f24000
> > > [5.977905] RIP: 0010:[]  []
> > > dyntick_save_progress_counter+0x12/0x50
> > > [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> > > [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> > > 883fec768000
> > > [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> > > 88407e958140
> > > [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> > > 00016110f359
> > > [6.018333] R10: 0b10 R11:  R12: 
> > > 81b02140
> > > [6.026297] R13: ffdf R14: 0021 R15: 
> > > 0002
> > > [6.034262] FS:  () GS:881fff94()
> > > knlGS:
> > > [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> > > [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> > > 00360670
> > > [6.057672] DR0:  DR1:  DR2: 
> > > 
> > > [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> > > 0400
> > > [6.073603] Stack:
> > > [6.075847]  881ff2f27e18 810e8fac 0202
> > > 881ff2f27e60
> > > [6.084158]  881ff2f27e5f 810e70c0 81b02140
> > > 81b127a0
> > > [6.092465]  0001  0003
> > > 881ff2f27eb8
> > > [6.100768] Call Trace:
> > > [6.103501]  [] force_qs_rnp+0xdc/0x150
> > > [6.109527]  [] ? rcu_start_gp+0x70/0x70
> > > [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> > > [6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
> > > [6.128973]  [] ? rcu_process_callbacks+0x5f0/0x5f0
> > > [6.136167]  [] kthread+0xe5/0x100
> > > [6.141710]  [] ? kthread_park+0x60/0x60
> > > [6.147840]  [] ret_from_fork+0x3f/0x70
> > > [6.153868]  [] ? kthread_park+0x60/0x60
> > > 
> > > I tried to bisect the problem, but when I try to boot only with:
> > > "KAISER: Kernel Address Isolation" machine hangs during boot and
> > > reboots without any panic message.
> > > 
> > > 4.4.109 boots fine
> > > 4.9.75rc1 also boots fine.
> > 
> > Hm, so I'm guessing 4.15-rc6 also works?
> > 
> > Odd that 4.9.75-rc1 fails.

Sorry, it's been a long few days, I meant "odd that the 4.9 -rc works
and the 4.4 one fails".

{sigh}

I think I need to ignore email for a while...

greg k-h

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Greg Kroah-Hartman

On Thu, Jan 04, 2018 at 06:03:15PM +0100, Willy Tarreau wrote:
> On Thu, Jan 04, 2018 at 05:53:06PM +0100, Greg Kroah-Hartman wrote:
> > On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> > > I am getting the following panic when trying to boot 4.4.110rc1 on
> > > Intel(R) Xeon(R) CPU E5-2630:
> > > 
> > > [5.923489] BUG: unable to handle kernel NULL pointer dereference
> > > at 000d
> > > [5.932259] IP: [] 
> > > dyntick_save_progress_counter+0x12/0x50
> > > [5.940142] PGD 0
> > > [5.942400] Oops: 0002 [#1] SMP
> > > [5.946023] Modules linked in:
> > > [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> > > 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> > > [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> > > X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> > > [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> > > 881ff2f24000
> > > [5.977905] RIP: 0010:[]  []
> > > dyntick_save_progress_counter+0x12/0x50
> > > [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> > > [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> > > 883fec768000
> > > [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> > > 88407e958140
> > > [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> > > 00016110f359
> > > [6.018333] R10: 0b10 R11:  R12: 
> > > 81b02140
> > > [6.026297] R13: ffdf R14: 0021 R15: 
> > > 0002
> > > [6.034262] FS:  () GS:881fff94()
> > > knlGS:
> > > [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> > > [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> > > 00360670
> > > [6.057672] DR0:  DR1:  DR2: 
> > > 
> > > [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> > > 0400
> > > [6.073603] Stack:
> > > [6.075847]  881ff2f27e18 810e8fac 0202
> > > 881ff2f27e60
> > > [6.084158]  881ff2f27e5f 810e70c0 81b02140
> > > 81b127a0
> > > [6.092465]  0001  0003
> > > 881ff2f27eb8
> > > [6.100768] Call Trace:
> > > [6.103501]  [] force_qs_rnp+0xdc/0x150
> > > [6.109527]  [] ? rcu_start_gp+0x70/0x70
> > > [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> > > [6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
> > > [6.128973]  [] ? rcu_process_callbacks+0x5f0/0x5f0
> > > [6.136167]  [] kthread+0xe5/0x100
> > > [6.141710]  [] ? kthread_park+0x60/0x60
> > > [6.147840]  [] ret_from_fork+0x3f/0x70
> > > [6.153868]  [] ? kthread_park+0x60/0x60
> > > 
> > > I tried to bisect the problem, but when I try to boot only with:
> > > "KAISER: Kernel Address Isolation" machine hangs during boot and
> > > reboots without any panic message.
> > > 
> > > 4.4.109 boots fine
> > > 4.9.75rc1 also boots fine.
> > 
> > Hm, so I'm guessing 4.15-rc6 also works?
> > 
> > Odd that 4.9.75-rc1 fails.
> 
> s/4.9.75/4.4.110/ I suppose.

Yes, mistake on my side.

> Can't this be because more patches are required in 4.4 to support this
> patch set ? Or maybe a manual fix for a conflict that went wrong ? Just
> trying to guess.

Odd thing is, the 4.9 series started from the 4.4 code for most of the
patches, so I would expect that one to fail...

greg k-h

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread Dave Hansen

On 01/04/2018 08:25 AM, Andrea Arcangeli wrote:
> It's only where SPEC_CTRL is missing and only IBPB_SUPPORT is
> available, that ibrs 0 ibpb 2 is the only option to fix variant#2 for
> good.

Could you help us decode what "ibrs 0 ibpb 2" means to you?

[PATCH 1/2 v2] jump_label: export static_key_slow_inc/dec_cpuslocked()

2018-01-04 Thread Konstantin Khlebnikov

For fixing cpu_hotplug_lock recursion in tg_set_cfs_bandwidth().

Signed-off-by: Konstantin Khlebnikov 

---

v2: remove EXPORT_SYMBOL_GPL, second patch unchanged
---
 include/linux/jump_label.h |   12 
 kernel/jump_label.c|   16 +++-
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index c7b368c734af..db10e6a9d315 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -160,6 +160,8 @@ extern void arch_jump_label_transform_static(struct 
jump_entry *entry,
 extern int jump_label_text_reserved(void *start, void *end);
 extern void static_key_slow_inc(struct static_key *key);
 extern void static_key_slow_dec(struct static_key *key);
+extern void static_key_slow_inc_cpuslocked(struct static_key *key);
+extern void static_key_slow_dec_cpuslocked(struct static_key *key);
 extern void jump_label_apply_nops(struct module *mod);
 extern int static_key_count(struct static_key *key);
 extern void static_key_enable(struct static_key *key);
@@ -222,6 +224,16 @@ static inline void static_key_slow_dec(struct static_key 
*key)
atomic_dec(&key->enabled);
 }
 
+static inline void static_key_slow_inc_cpuslocked(struct static_key *key)
+{
+   static_key_slow_inc(key);
+}
+
+static inline void static_key_slow_dec_cpuslocked(struct static_key *key)
+{
+   static_key_slow_dec(key);
+}
+
 static inline int jump_label_text_reserved(void *start, void *end)
 {
return 0;
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 8594d24e4adc..934ef0bb6c0d 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -79,7 +79,7 @@ int static_key_count(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_count);
 
-static void static_key_slow_inc_cpuslocked(struct static_key *key)
+void static_key_slow_inc_cpuslocked(struct static_key *key)
 {
int v, v1;
 
@@ -180,9 +180,9 @@ void static_key_disable(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_disable);
 
-static void static_key_slow_dec_cpuslocked(struct static_key *key,
-  unsigned long rate_limit,
-  struct delayed_work *work)
+static void __static_key_slow_dec_cpuslocked(struct static_key *key,
+unsigned long rate_limit,
+struct delayed_work *work)
 {
/*
 * The negative count check is valid even when a negative
@@ -206,12 +206,18 @@ static void static_key_slow_dec_cpuslocked(struct 
static_key *key,
jump_label_unlock();
 }
 
+void static_key_slow_dec_cpuslocked(struct static_key *key)
+{
+   STATIC_KEY_CHECK_USE(key);
+   __static_key_slow_dec_cpuslocked(key, 0, NULL);
+}
+
 static void __static_key_slow_dec(struct static_key *key,
  unsigned long rate_limit,
  struct delayed_work *work)
 {
cpus_read_lock();
-   static_key_slow_dec_cpuslocked(key, rate_limit, work);
+   __static_key_slow_dec_cpuslocked(key, rate_limit, work);
cpus_read_unlock();
 }

Re: [PATCH 11/11] arm64: Implement branch predictor hardening for affected Cortex-A CPUs

2018-01-04 Thread Marc Zyngier

On 04/01/18 16:31, Ard Biesheuvel wrote:
> On 4 January 2018 at 15:08, Will Deacon  wrote:
>> Cortex-A57, A72, A73 and A75 are susceptible to branch predictor aliasing
>> and can theoretically be attacked by malicious code.
>>
>> This patch implements a PSCI-based mitigation for these CPUs when available.
>> The call into firmware will invalidate the branch predictor state, preventing
>> any malicious entries from affecting other victim contexts.
>>
>> Signed-off-by: Marc Zyngier 
>> Signed-off-by: Will Deacon 
>> ---
>>  arch/arm64/kernel/bpi.S| 24 
>>  arch/arm64/kernel/cpu_errata.c | 42 
>> ++
>>  2 files changed, 66 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
>> index 06a931eb2673..2b10d52a0321 100644
>> --- a/arch/arm64/kernel/bpi.S
>> +++ b/arch/arm64/kernel/bpi.S
>> @@ -53,3 +53,27 @@ ENTRY(__bp_harden_hyp_vecs_start)
>> vectors __kvm_hyp_vector
>> .endr
>>  ENTRY(__bp_harden_hyp_vecs_end)
>> +ENTRY(__psci_hyp_bp_inval_start)
>> +   stp x0, x1, [sp, #-16]!
>> +   stp x2, x3, [sp, #-16]!
>> +   stp x4, x5, [sp, #-16]!
>> +   stp x6, x7, [sp, #-16]!
>> +   stp x8, x9, [sp, #-16]!
>> +   stp x10, x11, [sp, #-16]!
>> +   stp x12, x13, [sp, #-16]!
>> +   stp x14, x15, [sp, #-16]!
>> +   stp x16, x17, [sp, #-16]!
>> +   stp x18, x19, [sp, #-16]!
> 
> Would it be better to update sp only once here?

Maybe. I suppose that's quite uarch dependent, but worth trying.

> Also, do x18 and x19 need to be preserved/restored here?

My bad. I misread the SMCCC and though I needed to save it too. For the
reference, the text says:

"Registers  X18-X30 and stack pointers SP_EL0 and SP_ELx are saved by
the function that is called, and must be preserved over the SMC or HVC
call."

I'll amend the patch.

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Greg Kroah-Hartman

On Thu, Jan 04, 2018 at 06:11:02PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 06:03:15PM +0100, Willy Tarreau wrote:
> > On Thu, Jan 04, 2018 at 05:53:06PM +0100, Greg Kroah-Hartman wrote:
> > > On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> > > > I am getting the following panic when trying to boot 4.4.110rc1 on
> > > > Intel(R) Xeon(R) CPU E5-2630:
> > > > 
> > > > [5.923489] BUG: unable to handle kernel NULL pointer dereference
> > > > at 000d
> > > > [5.932259] IP: [] 
> > > > dyntick_save_progress_counter+0x12/0x50
> > > > [5.940142] PGD 0
> > > > [5.942400] Oops: 0002 [#1] SMP
> > > > [5.946023] Modules linked in:
> > > > [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> > > > 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> > > > [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> > > > X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> > > > [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> > > > 881ff2f24000
> > > > [5.977905] RIP: 0010:[]  []
> > > > dyntick_save_progress_counter+0x12/0x50
> > > > [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> > > > [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> > > > 883fec768000
> > > > [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> > > > 88407e958140
> > > > [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> > > > 00016110f359
> > > > [6.018333] R10: 0b10 R11:  R12: 
> > > > 81b02140
> > > > [6.026297] R13: ffdf R14: 0021 R15: 
> > > > 0002
> > > > [6.034262] FS:  () GS:881fff94()
> > > > knlGS:
> > > > [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> > > > [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> > > > 00360670
> > > > [6.057672] DR0:  DR1:  DR2: 
> > > > 
> > > > [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> > > > 0400
> > > > [6.073603] Stack:
> > > > [6.075847]  881ff2f27e18 810e8fac 0202
> > > > 881ff2f27e60
> > > > [6.084158]  881ff2f27e5f 810e70c0 81b02140
> > > > 81b127a0
> > > > [6.092465]  0001  0003
> > > > 881ff2f27eb8
> > > > [6.100768] Call Trace:
> > > > [6.103501]  [] force_qs_rnp+0xdc/0x150
> > > > [6.109527]  [] ? rcu_start_gp+0x70/0x70
> > > > [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> > > > [6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
> > > > [6.128973]  [] ? rcu_process_callbacks+0x5f0/0x5f0
> > > > [6.136167]  [] kthread+0xe5/0x100
> > > > [6.141710]  [] ? kthread_park+0x60/0x60
> > > > [6.147840]  [] ret_from_fork+0x3f/0x70
> > > > [6.153868]  [] ? kthread_park+0x60/0x60
> > > > 
> > > > I tried to bisect the problem, but when I try to boot only with:
> > > > "KAISER: Kernel Address Isolation" machine hangs during boot and
> > > > reboots without any panic message.
> > > > 
> > > > 4.4.109 boots fine
> > > > 4.9.75rc1 also boots fine.
> > > 
> > > Hm, so I'm guessing 4.15-rc6 also works?
> > > 
> > > Odd that 4.9.75-rc1 fails.
> > 
> > s/4.9.75/4.4.110/ I suppose.
> 
> Yes, mistake on my side.
> 
> > Can't this be because more patches are required in 4.4 to support this
> > patch set ? Or maybe a manual fix for a conflict that went wrong ? Just
> > trying to guess.
> 
> Odd thing is, the 4.9 series started from the 4.4 code for most of the
> patches, so I would expect that one to fail...

Also, the 4.4 patches were supposed to have been better tested, I need
to go dig and see what I messed up here...

greg k-h

Re: [V3 2/2] ASoC: max98373: Added Amplifier Driver

2018-01-04 Thread Mark Brown

On Wed, Jan 03, 2018 at 10:39:17AM -0800, Ryan Lee wrote:

This looks mostly good.  There are a few smaller issues but I think at
this point it's most sensible to apply and fix those incrementally so
I'll do that, please follow up with patches fixing the remaining issues.

> --- /dev/null
> +++ b/sound/soc/codecs/max98373.c
> @@ -0,0 +1,971 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (c) 2017, Maxim Integrated */

SPDX headers are supposed to be C++ comments, please send a followup
patch fixing this.

> +static int max98373_get_bclk_sel(int bclk)
> +{
> + int i;
> + /* match BCLKs per LRCLK */
> + for (i = 0; i < ARRAY_SIZE(bclk_sel_table); i++) {
> + if (bclk_sel_table[i] == bclk)
> + return i + 2;
> + }
> + return 0;
> +}
> +static int max98373_set_clock(struct snd_soc_codec *codec,

Missing blank line between the functions.

> + }
> + /* set DAI_SR to correct LRCLK frequency */

Another missing blank line.

> +static int max98373_dai_tdm_slot(struct snd_soc_dai *dai,
> + unsigned int tx_mask, unsigned int rx_mask,
> + int slots, int slot_width)
> +{
> + struct snd_soc_codec *codec = dai->codec;
> + struct max98373_priv *max98373 = snd_soc_codec_get_drvdata(codec);
> + int bsel = 0;
> + unsigned int chan_sz = 0;
> + unsigned int mask;
> + int x, slot_found;
> +
> + max98373->tdm_mode = true;

This should really also support disabling TDM mode - if the parameters
are all 0 just turn TDM off.  Again can be fixed in a followup.

> +SOC_SINGLE_TLV("DHT Gain Min", MAX98373_R20D1_DHT_CFG,
> + MAX98373_DHT_SPK_GAIN_MIN_SHIFT, 9, 0, max98373_dht_spkgain_min_tlv),
> +SOC_SINGLE_TLV("DHT Rot Pnt", MAX98373_R20D1_DHT_CFG,
> + MAX98373_DHT_ROT_PNT_SHIFT, 15, 0, max98373_dht_rotation_point_tlv),
> +SOC_SINGLE_TLV("DHT Attack Step", MAX98373_R20D2_DHT_ATTACK_CFG,
> + MAX98373_DHT_ATTACK_STEP_SHIFT, 4, 0, max98373_dht_step_size_tlv),
> +SOC_SINGLE_TLV("DHT Release Step", MAX98373_R20D3_DHT_RELEASE_CFG,
> + MAX98373_DHT_RELEASE_STEP_SHIFT, 4, 0, max98373_dht_step_size_tlv),

You should add a Volume on the end of these control names so that
userspace knows how to display them properly; it's a little confusing as
they're not actually gains but it tends to work out better.  Same for
most of the other TLV controls.

signature.asc
Description: PGP signature

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Willy Tarreau

On Thu, Jan 04, 2018 at 06:11:02PM +0100, Greg Kroah-Hartman wrote:
> > Can't this be because more patches are required in 4.4 to support this
> > patch set ? Or maybe a manual fix for a conflict that went wrong ? Just
> > trying to guess.
> 
> Odd thing is, the 4.9 series started from the 4.4 code for most of the
> patches, so I would expect that one to fail...

I see. Then maybe a missing patch somewhere in 4.4 compared to 4.9 :-/
I have no idea what to look for however.

Willy

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread Paolo Bonzini

On 04/01/2018 18:13, Dave Hansen wrote:
> On 01/04/2018 08:25 AM, Andrea Arcangeli wrote:
>> It's only where SPEC_CTRL is missing and only IBPB_SUPPORT is
>> available, that ibrs 0 ibpb 2 is the only option to fix variant#2 for
>> good.
> 
> Could you help us decode what "ibrs 0 ibpb 2" means to you?

IBRS 0 = disabled
IBRS 1 = only kernel sets IBRS=1
IBRS 2 = indirect branch prediction fully disabled, or do the right
thing on future processors

IBPB 0 = disabled
IBPB 1 = on context switch
IBPB 2 = on every kernel or hypervisor entry

Thanks,

Paolo

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Greg Kroah-Hartman

On Thu, Jan 04, 2018 at 06:14:15PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 06:11:02PM +0100, Greg Kroah-Hartman wrote:
> > On Thu, Jan 04, 2018 at 06:03:15PM +0100, Willy Tarreau wrote:
> > > On Thu, Jan 04, 2018 at 05:53:06PM +0100, Greg Kroah-Hartman wrote:
> > > > On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> > > > > I am getting the following panic when trying to boot 4.4.110rc1 on
> > > > > Intel(R) Xeon(R) CPU E5-2630:
> > > > > 
> > > > > [5.923489] BUG: unable to handle kernel NULL pointer dereference
> > > > > at 000d
> > > > > [5.932259] IP: [] 
> > > > > dyntick_save_progress_counter+0x12/0x50
> > > > > [5.940142] PGD 0
> > > > > [5.942400] Oops: 0002 [#1] SMP
> > > > > [5.946023] Modules linked in:
> > > > > [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> > > > > 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> > > > > [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> > > > > X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> > > > > [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> > > > > 881ff2f24000
> > > > > [5.977905] RIP: 0010:[]  []
> > > > > dyntick_save_progress_counter+0x12/0x50
> > > > > [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> > > > > [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> > > > > 883fec768000
> > > > > [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> > > > > 88407e958140
> > > > > [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> > > > > 00016110f359
> > > > > [6.018333] R10: 0b10 R11:  R12: 
> > > > > 81b02140
> > > > > [6.026297] R13: ffdf R14: 0021 R15: 
> > > > > 0002
> > > > > [6.034262] FS:  () GS:881fff94()
> > > > > knlGS:
> > > > > [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> > > > > [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> > > > > 00360670
> > > > > [6.057672] DR0:  DR1:  DR2: 
> > > > > 
> > > > > [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> > > > > 0400
> > > > > [6.073603] Stack:
> > > > > [6.075847]  881ff2f27e18 810e8fac 0202
> > > > > 881ff2f27e60
> > > > > [6.084158]  881ff2f27e5f 810e70c0 81b02140
> > > > > 81b127a0
> > > > > [6.092465]  0001  0003
> > > > > 881ff2f27eb8
> > > > > [6.100768] Call Trace:
> > > > > [6.103501]  [] force_qs_rnp+0xdc/0x150
> > > > > [6.109527]  [] ? rcu_start_gp+0x70/0x70
> > > > > [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> > > > > [6.121976]  [] ? prepare_to_wait_event+0xf0/0xf0
> > > > > [6.128973]  [] ? 
> > > > > rcu_process_callbacks+0x5f0/0x5f0
> > > > > [6.136167]  [] kthread+0xe5/0x100
> > > > > [6.141710]  [] ? kthread_park+0x60/0x60
> > > > > [6.147840]  [] ret_from_fork+0x3f/0x70
> > > > > [6.153868]  [] ? kthread_park+0x60/0x60
> > > > > 
> > > > > I tried to bisect the problem, but when I try to boot only with:
> > > > > "KAISER: Kernel Address Isolation" machine hangs during boot and
> > > > > reboots without any panic message.
> > > > > 
> > > > > 4.4.109 boots fine
> > > > > 4.9.75rc1 also boots fine.
> > > > 
> > > > Hm, so I'm guessing 4.15-rc6 also works?
> > > > 
> > > > Odd that 4.9.75-rc1 fails.
> > > 
> > > s/4.9.75/4.4.110/ I suppose.
> > 
> > Yes, mistake on my side.
> > 
> > > Can't this be because more patches are required in 4.4 to support this
> > > patch set ? Or maybe a manual fix for a conflict that went wrong ? Just
> > > trying to guess.
> > 
> > Odd thing is, the 4.9 series started from the 4.4 code for most of the
> > patches, so I would expect that one to fail...
> 
> Also, the 4.4 patches were supposed to have been better tested, I need
> to go dig and see what I messed up here...

Nope, it matches up with what is in SLES12 exactly, I must be missing
something else here as a prerequisite...

Re: [RFC PATCH 2/5] perf jevents: add support for arch recommended events

2018-01-04 Thread John Garry


On 21/12/2017 19:39, Jiri Olsa wrote:

Hi Jirka,
>
> When you say reasonable size for x86, I ran a string duplication finder on
> the x86 JSONs and the results show a huge amount of duplication. Please
> check this:
> 
https://gist.githubusercontent.com/johnpgarry/68bc87e823ae2ce0f7b475b4e55e5795/raw/f4cea138999d8b34151b9586d733592e01774d7a/x86%2520JSON%2520duplication
>
> Extract:
> "Found a 65 line (311 tokens) duplication in the following files:
> Starting at line 100 of
> /linux/tools/perf/pmu-events/arch/x86/ivybridge/ivb-metrics.json
> Starting at line 100 of
> /linux/tools/perf/pmu-events/arch/x86/ivytown/ivt-metrics.json
> Starting at line 100 of
> /linux/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> Starting at line 100 of
> /linux/tools/perf/pmu-events/arch/x86/skylakex/skx-metrics.json
> Starting at line 76 of
> /linux/tools/perf/pmu-events/arch/x86/jaketown/jkt-metrics.json
> Starting at line 100 of
> /linux/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json
> Starting at line 76 of
> /linux/tools/perf/pmu-events/arch/x86/sandybridge/snb-metrics.json
> Starting at line 100 of
> /linux/tools/perf/pmu-events/arch/x86/broadwellx/bdx-metrics.json"
>



Hi Jirka,

Sorry for the slow reply.


> Won't this all potentially have a big maintainence cost?

as Andi said it's mostly just the disk space,
which is not big deal

I'm not doing JSON file updates, but I think having
simple single dir for platform/cpu could save us some
confusion in future


Understood. But for ARM, which has very standardised architecture 
events, it is good to reduce this event duplication between platforms.




however I won't oppose if you want to add this logic,
but please:
  - use the list_head ;-)


Of course


  - leave the process_one_file function simple
and separate the level0 processing


ok, this is how it should look already, albeit a couple of 
process_one_file() modifications. I'll re-check this.



  - you are using 'EventCode' as an unique ID to find
the base, but it's not unique for x86, you'll need
to add some other ID scheme that fits to all archs


Right, so you mentioned earlier using a new keyword token to identify 
whether we use the standard event, so we can go his way - ok?


I would also like to mention at this point why I did the event 
pre-processing in jevents, and not a separate script:

- current build does not transverse the arch tree
- tree transversal for JSON processing is done in jevents
- a script would mean derived objects, which means:
- makefile changes for derived objects
- jevents would have to deal with derived objects
- jevents already has support for JSON processing

The advantage of using a script is that we keep the JSON processing in 
jevents simple.


All the best,
John



thanks,
jirka

Re: general protection fault in nf_tables_dump_obj_done

2018-01-04 Thread Florian Westphal

#syz fix: netfilter: nf_tables: fix potential NULL-ptr deref in 
nf_tables_dump_obj_done()

Re: [PATCH] Remove silentoldconfig from "make help"; fix kconfig/conf's help

2018-01-04 Thread Masahiro Yamada

(+CC Michal's new address)

2017-12-19 10:26 GMT+09:00 Marc Herbert :
> As explained by Michal Marek at https://lkml.org/lkml/2011/8/31/189
> silentoldconfig has become a misnomer. It has become an internal
> interface and "oldconfig" is just as silent now.


Hmm, I'd like to be sure about your intention.

"oldconfig" is not silent.  (nor is silentoldconfig).

When it finds a new symbol, it will show a dialog
to ask users to input a value.

"olddefconfig" is really silent
because it automatically sets new symbols to default.

I agree it is confusing due to the historical misnomer.



> It's not part of the
> user interface so remove it from "make help" to stop confusing people
> trying to use it as seen for instance at
> https://chromium-review.googlesource.com/271688
>
> On the other hand, correct and expand its description in the help of
> scripts/kconfig/conf.c
>
> Signed-off-by: Marc Herbert 
> ---
>  scripts/kconfig/Makefile | 1 -
>  scripts/kconfig/conf.c   | 5 -
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
> index 297c1bf35140..bf13b8fa3ccc 100644
> --- a/scripts/kconfig/Makefile
> +++ b/scripts/kconfig/Makefile
> @@ -142,7 +142,6 @@ help:
> @echo  '  oldconfig   - Update current config utilising a 
> provided .config as base'
> @echo  '  localmodconfig  - Update current config disabling modules 
> not loaded'
> @echo  '  localyesconfig  - Update current config converting local 
> mods to core'
> -   @echo  '  silentoldconfig - Same as oldconfig, but quietly, 
> additionally update deps'
> @echo  '  defconfig   - New config with default from ARCH 
> supplied defconfig'
> @echo  '  savedefconfig   - Save current config as ./defconfig 
> (minimal config)'
> @echo  '  allnoconfig - New config where all options are answered 
> with no'
> diff --git a/scripts/kconfig/conf.c b/scripts/kconfig/conf.c


A few lines below, I see


@echo  '  olddefconfig- Same as silentoldconfig but sets
new symbols to their'
@echo  'default value'


If you drop silentoldconfig help,
the "Same as silentoldconfig" is not sensible.
You need to update this line, too.

I think "Same as oldconfig but ..." will be OK.



> index 866369f10ff8..f8c002a19f62 100644
> --- a/scripts/kconfig/conf.c
> +++ b/scripts/kconfig/conf.c
> @@ -477,7 +477,10 @@ static void conf_usage(const char *progname)
> printf("  --listnewconfig List new options\n");
> printf("  --oldaskconfig  Start a new configuration using a 
> line-oriented program\n");
> printf("  --oldconfig Update a configuration using a 
> provided .config as base\n");
> -   printf("  --silentoldconfig   Same as oldconfig, but quietly, 
> additionally update deps\n");
> +   printf("  --silentoldconfig   Similar to oldconfig but:\n"
> +  "- no re-formatting of .config 
> when nothing's missing\n"
> +  "- generates configuration in 
> include/{generated/,config/}\n"
> +  "  (oldconfig used to be more 
> verbose)\n");


What do you mean by "oldconfig used to be more verbose" ?

Did oldconfig change its behavior?

Unless I am missing something, the current behavior of "oldconfig" has
been the same
at least since the beginning of the git era.





> printf("  --olddefconfig  Same as silentoldconfig but sets 
> new symbols to their default value\n");
> printf("  --oldnoconfig   An alias of olddefconfig\n");
> printf("  --defconfig   New config with default defined in 
> \n");
> --
> 2.9.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Best Regards
Masahiro Yamada

Re: [PATCH V4 11/26] iommu/amd: deprecate pci_get_bus_and_slot()

2018-01-04 Thread Gary R Hook


On 01/04/2018 10:32 AM, Sinan Kaya wrote:

On 1/4/2018 11:28 AM, Gary R Hook wrote:

On 01/04/2018 06:25 AM, Sinan Kaya wrote:

On 12/19/2017 12:37 AM, Sinan Kaya wrote:

pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.

Getting ready to remove pci_get_bus_and_slot() function in favor of
pci_get_domain_bus_and_slot().

Hard-code the domain number as 0 for the AMD IOMMU driver.






Any comments from the IOMMU people?



pci_get_bus_and_slot() appears to (now) be a convenience function that wraps 
pci_get_domain_bus_and_slot() while using a 0 for the domain value. Exactly 
what you are doing here, albeit in a more overt way.

How is this patch advantageous? Seems to me that if other domains need to be 
enabled, that driver could be changed if and when that requirement arises.

But perhaps I'm missing a nuance here.




The benefit of the change was discussed here:

https://lkml.org/lkml/2017/12/19/349

I hope it helps.




Thank you for pointing out that thread directly. I read through it and 
thought further about this change.


I am not the maintainer, but as an AMD developer, this is fine change. I 
can't ACK but I can agree.


Gary

Re: objtool segfault with ORC unwinder enabled

2018-01-04 Thread Josh Poimboeuf

On Thu, Jan 04, 2018 at 05:56:30PM +0100, Markus wrote:
> On Thursday, 4 January 2018 16:46:13 CET Josh Poimboeuf wrote:
> > On Wed, Jan 03, 2018 at 06:26:19PM +0100, Markus wrote:
> > > > > > I'm unable to recreate.  Can you attach one of the .o files (like
> > > > > > the
> > > > > > above irq.o)?
> > > > > 
> > > > > Sure, see attached. (From vanilla linux-4.14.11.)
> > > > 
> > > > There's something weird with the toolchain.  The object file doesn't
> > > > have an ELF section symbol for the .irqentry.text section.
> > > > 
> > > > Are there any special KCFLAGS being added?  Can you build the object
> > > > with V=1 to show the full gcc command line?
> > > 
> > > I have not added anything. There is no env variable set like $KCFLAGS or
> > > $CFLAGS. (If that was the question.)
> > > 
> > > I think you mean this line from output:
> > > gcc -Wp,-MD,arch/x86/kernel/.irq.o.d  -nostdinc -isystem
> > > /usr/lib/gcc/x86_64- pc-linux-gnu/6.4.0/include -I./arch/x86/include
> > > -I./arch/x86/include/generated -I./include -I./arch/x86/include/uapi
> > > -I./arch/x86/include/generated/uapi -I./ include/uapi
> > > -I./include/generated/uapi -include ./include/linux/kconfig.h -
> > > D__KERNEL__ -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-
> > > aliasing -fno-common -fshort-wchar -Werror-implicit-function-declaration
> > > -Wno- format-security -std=gnu89 -fno-PIE -mno-sse -mno-mmx -mno-sse2
> > > -mno-3dnow - mno-avx -m64 -falign-jumps=1 -falign-loops=1 -mno-80387
> > > -mno-fp-ret-in-387 - mpreferred-stack-boundary=3 -mskip-rax-setup
> > > -mtune=generic -mno-red-zone - mcmodel=kernel -funit-at-a-time
> > > -DCONFIG_AS_CFI=1 -
> > > DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1
> > > -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1
> > > -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 - DCONFIG_AS_AVX512=1
> > > -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno- sign-compare
> > > -fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -
> > > Wno-frame-address -O2 --param=allow-store-data-races=0 -DCC_HAVE_ASM_GOTO
> > > - Wframe-larger-than=2048 -fno-stack-protector
> > > -Wno-unused-but-set-variable - Wno-unused-const-variable
> > > -fomit-frame-pointer -fno-var-tracking-assignments -
> > > Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fno-
> > > stack-check -fconserve-stack -Werror=implicit-int
> > > -Werror=strict-prototypes - Werror=date-time
> > > -Werror=incompatible-pointer-types -Werror=designated-init -
> > > Iarch/x86/kernel/../include/asm/trace-DKBUILD_BASENAME='"irq"'  -
> > > DKBUILD_MODNAME='"irq"' -c -o arch/x86/kernel/.tmp_irq.o
> > > arch/x86/kernel/irq.c
> > > 
> > > The next line is the objtool that segfaults.
> > 
> > I don't see anything unusual there.  Are there any Gentoo patches
> > against either the kernel or GCC which would strip unused symbols?
> 
> The kernel is the vanilla kernel. (4.14.11 and also 4.15-rc6)
> Its not a gentoo specific gcc patch. (Then every gentoo user would be 
> affected?)
> 
> But I enabled ld.gold as default linker like 5 years ago. Never had a problem 
> with this.
> 
> Is ld.gold supposed to fail here?
> 
> I switched back to ld.bfd and it seems to work.

Ah, that explains it.  With CONFIG_MODVERSIONS, the linker does some
work after gcc, but before objtool.  Can you try this patch?  (Note this
isn't the final patch, as this breaks the CONFIG_MODVERSIONS=n case.)

diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index cb8997ed0149..3cf3cc6077ea 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -270,7 +270,7 @@ endif
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file
 cmd_objtool = $(if $(patsubst y%,, \

$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
-   $(__objtool_obj) $(objtool_args) "$(@)";)
+   $(__objtool_obj) $(objtool_args) "$(@D)/.tmp_$(@F)";)
 objtool_obj = $(if $(patsubst y%,, \

$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
$(__objtool_obj))
@@ -286,16 +286,16 @@ objtool_dep = $(objtool_obj)  
\
 define rule_cc_o_c
$(call echo-cmd,checksrc) $(cmd_checksrc) \
$(call cmd_and_fixdep,cc_o_c) \
+   $(call echo-cmd,objtool) $(cmd_objtool)   \
$(cmd_modversions_c)  \
$(cmd_checkdoc)   \
-   $(call echo-cmd,objtool) $(cmd_objtool)   \
$(call echo-cmd,record_mcount) $(cmd_record_mcount)
 endef
 
 define rule_as_o_S
$(call cmd_and_fixdep,as_o_S) \
-   $(cmd_modversions_S)  \
-   $(call echo-cmd,objtool) $(cmd_objtool)
+   $(call echo-cmd,objtool) $(cmd_objtoo

Re: [PATCH v2 11/12] retpoline/objtool: Disable some objtool warnings

2018-01-04 Thread Josh Poimboeuf

On Thu, Jan 04, 2018 at 10:32:52AM -0600, Josh Poimboeuf wrote:
> Either way we'll need to figure out a way to get objtool support ASAP.

BTW, I got dwmw2's GCC patches but I'm about to disappear for a few days
so it'll probably be next week before I get a chance to look at this.

-- 
Josh

Re: [PATCH v3] gpio: winbond: add driver

2018-01-04 Thread Andy Shevchenko

On Thu, 2018-01-04 at 00:41 +0100, Maciej S. Szmigiero wrote:
> On 03.01.2018 20:05, Andy Shevchenko wrote:
> > On Sat, 2017-12-30 at 22:02 +0100, Maciej S. Szmigiero wrote:
> > > This commit adds GPIO driver for Winbond Super I/Os.

> > First of all, looking more at this driver, why don't we create a
> > gpiochip per real "port" during actual configuration?
> 
> Hmm.. there is only a one 'chip' here, so why would the driver want to
> register multiple ones?
> 
> That would also create at least one additional point of failure if
> one or more such gpiochip(s) register but one fails to do so.
> 
> > And I still have filing that this one suitable for MFD.
> 
> As I wrote previously, that would necessitate rewriting also w83627ehf
> hwmon and w83627hf_wdt drivers, and would make the driver stand out
> against other, similar Super I/O drivers.
> 
> > Anyone, does it make sense?

OK, at least I shared my point.

> > > +/* returns whether changing a pin is allowed */
> > > +static bool winbond_gpio_get_info(unsigned int *gpio_num,
> > > +   const struct winbond_gpio_info
> > > **info)
> > > +{
> > > + bool allow_changing = true;
> > > + unsigned long i;
> > > +

> > > + for_each_set_bit(i, &gpios, sizeof(gpios)) {

sizeof(gpios) will produce wrong number for you. It's rather
BITS_PER_LONG here. Right?

> > > + if (*gpio_num < 8)
> > > + break;
> > > +
> > > + *gpio_num -= 8;
> > > + }
> > 
> > Why not hweight() here?
> > 
> > unsigned int shift = hweight_long(gpios) * 8;
> > unsigned int index = fls_long(gpios); // AFAIU
> > 
> > *offset -= *offset >= shift ? shift : shift - 8;
> > *info = &winbond_gpio_infos[index];
> > 
> > ...
> 
> Unfortunately, this code doesn't produce the same results as the code
> above.
> 
> First, in this code "index" does not depend on "gpio_num" (or
> "offset")
> passed to winbond_gpio_get_info() function, so gpio 0 (on the first
> GPIO
> device or port) will access the same winbond_gpio_infos entry as gpio
> 18
> (which is located on the third GPIO port).

Actually, it does depend on gpio_num (it's your point to break the
loop).

So, fls(*offset) then (I renamed gpio_num to offset in my example).

> In fact, the calculated "index" would always point to the last enabled
> GPIO port (since that is the meaning of "gpios" MSB, assuming this
> user-provided parameter was properly verified or sanitized).

Yes, I missed that.

> Second, the calculated "offset" would end negative for anything but
> the
> very last GPIO port (ok, not really negative since it is an unsigned
> type,
> but still not correct either).

So, sounds like hweight_int(*offset) then. No?

> And that even not taking into account the special case of GPIO6 port
> that has only 5 gpios.

This doesn't matter because of check in ternary operator.

> What we want in this code is for "i" (or "index") to contain the GPIO
> port number for the passed "gpio_num" (or "offset") and that this
> last variable ends reduced modulo 8 from its original value.

Yep.

> > > + if (gpios & ~GENMASK(ARRAY_SIZE(winbond_gpio_infos) - 1,
> > > 0))
> > > {
> > > + wb_sio_err("unknown ports enabled in GPIO ports
> > > bitmask\n");
> > > + return 0;
> > > + }
> > 
> > Do we care? Perhaps just enforce mask based on the size and leave
> > garbage out.
> 
> Can be done either way, but I think notifying user that he or she has
> provided an incorrect parameter value is a good thing - we can use a
> accept-but-warn style.

I would prefer latter (accept-but-warn).

-- 
Andy Shevchenko 
Intel Finland Oy

Re: [PATCH][V2] wcn36xx: fix incorrect assignment to msg_body.min_ch_time

2018-01-04 Thread Bjorn Andersson

On Fri 29 Dec 01:07 PST 2017, Colin King wrote:

> From: Colin Ian King 
> 
> The second assignment to msg_body.min_ch_time is incorrect, it
> should actually be to msg_body.max_ch_time.
> 
> Thanks to Bjorn Andersson for identifying the correct way to fix
> this as my original fix was incorrect.
> 
> Detected by CoverityScan, CID#1463042 ("Unused Value")
> 
> Fixes: 2f3bef4b247e ("wcn36xx: Add hardware scan offload support")
> Signed-off-by: Colin Ian King 

Thanks Colin,

Acked-by: Bjorn Andersson 

Regards,
Bjorn

> ---
>  drivers/net/wireless/ath/wcn36xx/smd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/ath/wcn36xx/smd.c 
> b/drivers/net/wireless/ath/wcn36xx/smd.c
> index 2914618a0335..2a4871ca9c72 100644
> --- a/drivers/net/wireless/ath/wcn36xx/smd.c
> +++ b/drivers/net/wireless/ath/wcn36xx/smd.c
> @@ -626,7 +626,7 @@ int wcn36xx_smd_start_hw_scan(struct wcn36xx *wcn, struct 
> ieee80211_vif *vif,
>  
>   msg_body.scan_type = WCN36XX_HAL_SCAN_TYPE_ACTIVE;
>   msg_body.min_ch_time = 30;
> - msg_body.min_ch_time = 100;
> + msg_body.max_ch_time = 100;
>   msg_body.scan_hidden = 1;
>   memcpy(msg_body.mac, vif->addr, ETH_ALEN);
>   msg_body.p2p_search = vif->p2p;
> -- 
> 2.14.1
>

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread Andrea Arcangeli

Hi Alan,

On Thu, Jan 04, 2018 at 05:04:42PM +, Alan Cox wrote:
> > If you run lots of syscalls ibrs 1 ibpb 1 is much faster. If you do
> > infrequent syscalls computing a lot in kernel like I/O with large
> > buffers getting copied, ibrs 0 ibpb 2 is much faster than ibrs 1 ibpb
> > 1 (on those microcodes where ibrs 1 reduces performance a lot, not all
> > microcodes implementing SPEC_CTRL are inefficient like that).
> 
> Have you looked at whether you can measure activity and switch
> automatically between the two (or by task). It seems silly to leave
> something the machine can accurately assess toa human ?

We didn't but it'd be definitely reasonable to investigate and it's a
good idea for those CPUs where the updated microcode has to shutdown
way more than just indirect branch prediction speculation to achieve
the ibrs 1 semantics.

If the workload changes from frequent syscalls to reasonably large
read/writes and less frequent syscalls or lots of interrupts in idle
CPUs, it would work well to switch between ibrs 1 ibpb 1 and ibpb 2
ibrs 0 automatically. As long as the pattern keeps repeating for a
while... that is the question ;).

Thanks!
Andrea

Applied "ASoC: mediatek: modify MT2701 AFE driver to adapt mfd device" to the asoc tree

2018-01-04 Thread Mark Brown

The patch

   ASoC: mediatek: modify MT2701 AFE driver to adapt mfd device

has been applied to the asoc tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From dfa3cbb83e099d5ef9809b67ea3bff3a39dc2f06 Mon Sep 17 00:00:00 2001
From: Ryder Lee 
Date: Thu, 4 Jan 2018 15:44:08 +0800
Subject: [PATCH] ASoC: mediatek: modify MT2701 AFE driver to adapt mfd device

As the new MFD parent is in place, modify MT2701 AFE driver to adapt it.

Signed-off-by: Ryder Lee 
Signed-off-by: Mark Brown 
---
 sound/soc/mediatek/mt2701/mt2701-afe-pcm.c | 45 +-
 sound/soc/mediatek/mt2701/mt2701-reg.h |  1 -
 2 files changed, 20 insertions(+), 26 deletions(-)

diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c 
b/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c
index 0edadca12a5e..f0cd08fa5c5d 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c
@@ -17,6 +17,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1368,14 +1369,6 @@ static const struct mt2701_i2s_data 
mt2701_i2s_data[MT2701_I2S_NUM][2] = {
},
 };
 
-static const struct regmap_config mt2701_afe_regmap_config = {
-   .reg_bits = 32,
-   .reg_stride = 4,
-   .val_bits = 32,
-   .max_register = AFE_END_ADDR,
-   .cache_type = REGCACHE_NONE,
-};
-
 static irqreturn_t mt2701_asys_isr(int irq_id, void *dev)
 {
int id;
@@ -1414,9 +1407,9 @@ static int mt2701_afe_runtime_resume(struct device *dev)
 
 static int mt2701_afe_pcm_dev_probe(struct platform_device *pdev)
 {
+   struct snd_soc_component *component;
struct mtk_base_afe *afe;
struct mt2701_afe_private *afe_priv;
-   struct resource *res;
struct device *dev;
int i, irq_id, ret;
 
@@ -1446,17 +1439,11 @@ static int mt2701_afe_pcm_dev_probe(struct 
platform_device *pdev)
return ret;
}
 
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-
-   afe->base_addr = devm_ioremap_resource(&pdev->dev, res);
-
-   if (IS_ERR(afe->base_addr))
-   return PTR_ERR(afe->base_addr);
-
-   afe->regmap = devm_regmap_init_mmio(&pdev->dev, afe->base_addr,
-   &mt2701_afe_regmap_config);
-   if (IS_ERR(afe->regmap))
-   return PTR_ERR(afe->regmap);
+   afe->regmap = syscon_node_to_regmap(dev->parent->of_node);
+   if (!afe->regmap) {
+   dev_err(dev, "could not get regmap from parent\n");
+   return -ENODEV;
+   }
 
mutex_init(&afe->irq_alloc_lock);
 
@@ -1490,6 +1477,12 @@ static int mt2701_afe_pcm_dev_probe(struct 
platform_device *pdev)
= &mt2701_i2s_data[i][I2S_IN];
}
 
+   component = kzalloc(sizeof(*component), GFP_KERNEL);
+   if (!component)
+   return -ENOMEM;
+
+   component->regmap = afe->regmap;
+
afe->mtk_afe_hardware = &mt2701_afe_hardware;
afe->memif_fs = mt2701_memif_fs;
afe->irq_fs = mt2701_irq_fs;
@@ -1502,7 +1495,7 @@ static int mt2701_afe_pcm_dev_probe(struct 
platform_device *pdev)
ret = mt2701_init_clock(afe);
if (ret) {
dev_err(dev, "init clock error\n");
-   return ret;
+   goto err_init_clock;
}
 
platform_set_drvdata(pdev, afe);
@@ -1521,10 +1514,10 @@ static int mt2701_afe_pcm_dev_probe(struct 
platform_device *pdev)
goto err_platform;
}
 
-   ret = snd_soc_register_component(&pdev->dev,
-&mt2701_afe_pcm_dai_component,
-mt2701_afe_pcm_dais,
-ARRAY_SIZE(mt2701_afe_pcm_dais));
+   ret = snd_soc_add_component(dev, component,
+   &mt2701_afe_pcm_dai_component,
+   mt2701_afe_pcm_dais,
+   ARRAY_SIZE(mt2701_afe_pcm_dais));
if (ret) {
dev_warn(dev, "err_dai_component\n");
goto err_dai_component;
@@ -1538,6 +1531,8 @@ static int mt2701_afe_pcm_dev_probe(struct 
platform_device *pdev)
pm_runtime_

Applied "ASoC: Added device tree binding for max98373 amplifier" to the asoc tree

2018-01-04 Thread Mark Brown

The patch

   ASoC: Added device tree binding for max98373 amplifier

has been applied to the asoc tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From bab4a10f0dc745b3c07acb8fa5fbc4337e140f58 Mon Sep 17 00:00:00 2001
From: Ryan Lee 
Date: Wed, 3 Jan 2018 10:38:24 -0800
Subject: [PATCH] ASoC: Added device tree binding for max98373 amplifier

Signed-off-by: Ryan Lee 
Signed-off-by: Mark Brown 
---
 .../devicetree/bindings/sound/max98373.txt | 40 ++
 1 file changed, 40 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/max98373.txt

diff --git a/Documentation/devicetree/bindings/sound/max98373.txt 
b/Documentation/devicetree/bindings/sound/max98373.txt
new file mode 100644
index ..456cb1c59353
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/max98373.txt
@@ -0,0 +1,40 @@
+Maxim Integrated MAX98373 Speaker Amplifier
+
+This device supports I2C.
+
+Required properties:
+
+ - compatible : "maxim,max98373"
+
+ - reg : the I2C address of the device.
+
+Optional properties:
+
+  - maxim,vmon-slot-no : slot number used to send voltage information
+   or in inteleave mode this will be used as
+   interleave slot.
+   slot range : 0 ~ 15,  Default : 0
+
+  - maxim,imon-slot-no : slot number used to send current information
+   slot range : 0 ~ 15,  Default : 0
+
+  - maxim,spkfb-slot-no : slot number used to send speaker feedback information
+   slot range : 0 ~ 15,  Default : 0
+
+  - maxim,interleave-mode : For cases where a single combined channel
+  for the I/V sense data is not sufficient, the device can 
also be configured
+  to share a single data output channel on alternating frames.
+  In this configuration, the current and voltage data will be 
frame interleaved
+  on a single output channel.
+   Boolean, define to enable the interleave mode, Default : 
false
+
+Example:
+
+codec: max98373@31 {
+   compatible = "maxim,max98373";
+   reg = <0x31>;
+   maxim,vmon-slot-no = <0>;
+   maxim,imon-slot-no = <1>;
+   maxim,spkfb-slot-no = <2>;
+   maxim,interleave-mode;
+};
-- 
2.15.1

Applied "ASoC: mediatek: update MT2701 AFE documentation to adapt mfd device" to the asoc tree

2018-01-04 Thread Mark Brown

The patch

   ASoC: mediatek: update MT2701 AFE documentation to adapt mfd device

has been applied to the asoc tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 7f12a56367bf526afde7e81820a8c7d97e75ed10 Mon Sep 17 00:00:00 2001
From: Ryder Lee 
Date: Thu, 4 Jan 2018 15:44:09 +0800
Subject: [PATCH] ASoC: mediatek: update MT2701 AFE documentation to adapt mfd
 device

As the new MFD parent is in place, modify MT2701 AFE documentation to
adapt it. Also add three core clocks in example.

Signed-off-by: Ryder Lee 
Signed-off-by: Mark Brown 
---
 .../devicetree/bindings/sound/mt2701-afe-pcm.txt   | 171 +++--
 1 file changed, 93 insertions(+), 78 deletions(-)

diff --git a/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt 
b/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt
index 0450baad2813..6df87b97f7cb 100644
--- a/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt
+++ b/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt
@@ -2,15 +2,17 @@ Mediatek AFE PCM controller for mt2701
 
 Required properties:
 - compatible = "mediatek,mt2701-audio";
-- reg: register location and size
 - interrupts: should contain AFE and ASYS interrupts
 - interrupt-names: should be "afe" and "asys"
 - power-domains: should define the power domain
 - clocks: Must contain an entry for each entry in clock-names
   See ../clocks/clock-bindings.txt for details
 - clock-names: should have these clock names:
+   "infra_sys_audio_clk",
"top_audio_mux1_sel",
"top_audio_mux2_sel",
+   "top_audio_a1sys_hp",
+   "top_audio_a2sys_hp",
"i2s0_src_sel",
"i2s1_src_sel",
"i2s2_src_sel",
@@ -45,85 +47,98 @@ Required properties:
 - assigned-clocks-parents: parent of input clocks of assigned clocks.
 - assigned-clock-rates: list of clock frequencies of assigned clocks.
 
+Must be a subnode of MediaTek audsys device tree node.
+See ../arm/mediatek/mediatek,audsys.txt for details about the parent node.
+
 Example:
 
-   afe: mt2701-afe-pcm@1122 {
-   compatible = "mediatek,mt2701-audio";
-   reg = <0 0x1122 0 0x2000>,
- <0 0x112A 0 0x2>;
-   interrupts = ,
-;
-   interrupt-names = "afe", "asys";
-   power-domains = <&scpsys MT2701_POWER_DOMAIN_IFR_MSC>;
-   clocks = <&topckgen CLK_TOP_AUD_MUX1_SEL>,
-<&topckgen CLK_TOP_AUD_MUX2_SEL>,
-<&topckgen CLK_TOP_AUD_K1_SRC_SEL>,
-<&topckgen CLK_TOP_AUD_K2_SRC_SEL>,
-<&topckgen CLK_TOP_AUD_K3_SRC_SEL>,
-<&topckgen CLK_TOP_AUD_K4_SRC_SEL>,
-<&topckgen CLK_TOP_AUD_K1_SRC_DIV>,
-<&topckgen CLK_TOP_AUD_K2_SRC_DIV>,
-<&topckgen CLK_TOP_AUD_K3_SRC_DIV>,
-<&topckgen CLK_TOP_AUD_K4_SRC_DIV>,
-<&topckgen CLK_TOP_AUD_I2S1_MCLK>,
-<&topckgen CLK_TOP_AUD_I2S2_MCLK>,
-<&topckgen CLK_TOP_AUD_I2S3_MCLK>,
-<&topckgen CLK_TOP_AUD_I2S4_MCLK>,
-<&audiosys CLK_AUD_I2SO1>,
-<&audiosys CLK_AUD_I2SO2>,
-<&audiosys CLK_AUD_I2SO3>,
-<&audiosys CLK_AUD_I2SO4>,
-<&audiosys CLK_AUD_I2SIN1>,
-<&audiosys CLK_AUD_I2SIN2>,
-<&audiosys CLK_AUD_I2SIN3>,
-<&audiosys CLK_AUD_I2SIN4>,
-<&audiosys CLK_AUD_ASRCO1>,
-<&audiosys CLK_AUD_ASRCO2>,
-<&audiosys CLK_AUD_ASRCO3>,
-<&audiosys CLK_AUD_ASRCO4>,
-<&audiosys CLK_AUD_AFE>,
-<&audiosys CLK_AUD_AFE_CONN>,
-<&audiosys CLK_AUD_A1SYS>,
-<&audiosys CLK_AUD_A2SYS>,
-<&audiosys CLK_AUD_AFE_MRGIF>;
+   audsys: aud

Re: [PATCH v4 2/7] ARM: davinci: don't use static clk_lookup

2018-01-04 Thread David Lechner




On 1/4/18 5:10 AM, Sekhar Nori wrote:

Hi David,

On Monday 01 January 2018 05:09 AM, David Lechner wrote:

In preparation of moving to the common clock framework, usage of static
struct clk_lookup is removed. The common clock framework uses an opaque
struct clk, so we won't be able to use static tables as was previously
done.

davinci_clk_init() is changed to init a single clock instead of a table
and an individual clk_register_clkdev() is added for each clock.

Signed-off-by: David Lechner 


Is there a need for this considering in 6/7 you end up modifying quite a
bit of this patch again?


No, you are right. And I've been working ahead with device tree support 
so I think I want to do this a bit differently anyway.

Applied "ASoC: max98373: Added Amplifier Driver" to the asoc tree

2018-01-04 Thread Mark Brown

The patch

   ASoC: max98373: Added Amplifier Driver

has been applied to the asoc tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 2f3d24a1355ad32845300dfd0a375c361be7ab38 Mon Sep 17 00:00:00 2001
From: Ryan Lee 
Date: Wed, 3 Jan 2018 10:39:17 -0800
Subject: [PATCH] ASoC: max98373: Added Amplifier Driver

Signed-off-by: Ryan Lee 
Signed-off-by: Mark Brown 
---
 sound/soc/codecs/Kconfig|   5 +
 sound/soc/codecs/Makefile   |   2 +
 sound/soc/codecs/max98373.c | 971 
 sound/soc/codecs/max98373.h | 212 ++
 4 files changed, 1190 insertions(+)
 create mode 100644 sound/soc/codecs/max98373.c
 create mode 100644 sound/soc/codecs/max98373.h

diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig
index a42ddbc93f3d..80af1f4d3097 100644
--- a/sound/soc/codecs/Kconfig
+++ b/sound/soc/codecs/Kconfig
@@ -95,6 +95,7 @@ config SND_SOC_ALL_CODECS
select SND_SOC_MAX98925 if I2C
select SND_SOC_MAX98926 if I2C
select SND_SOC_MAX98927 if I2C
+   select SND_SOC_MAX98373 if I2C
select SND_SOC_MAX9850 if I2C
select SND_SOC_MAX9860 if I2C
select SND_SOC_MAX9768 if I2C
@@ -623,6 +624,10 @@ config SND_SOC_MAX98927
tristate "Maxim Integrated MAX98927 Speaker Amplifier"
depends on I2C
 
+config SND_SOC_MAX98373
+   tristate "Maxim Integrated MAX98373 Speaker Amplifier"
+   depends on I2C
+
 config SND_SOC_MAX9850
tristate
 
diff --git a/sound/soc/codecs/Makefile b/sound/soc/codecs/Makefile
index 0001069ce2a7..31a620b5e8a3 100644
--- a/sound/soc/codecs/Makefile
+++ b/sound/soc/codecs/Makefile
@@ -90,6 +90,7 @@ snd-soc-max9867-objs := max9867.o
 snd-soc-max98925-objs := max98925.o
 snd-soc-max98926-objs := max98926.o
 snd-soc-max98927-objs := max98927.o
+snd-soc-max98373-objs := max98373.o
 snd-soc-max9850-objs := max9850.o
 snd-soc-max9860-objs := max9860.o
 snd-soc-mc13783-objs := mc13783.o
@@ -330,6 +331,7 @@ obj-$(CONFIG_SND_SOC_MAX9867)   += snd-soc-max9867.o
 obj-$(CONFIG_SND_SOC_MAX98925) += snd-soc-max98925.o
 obj-$(CONFIG_SND_SOC_MAX98926) += snd-soc-max98926.o
 obj-$(CONFIG_SND_SOC_MAX98927) += snd-soc-max98927.o
+obj-$(CONFIG_SND_SOC_MAX98373) += snd-soc-max98373.o
 obj-$(CONFIG_SND_SOC_MAX9850)  += snd-soc-max9850.o
 obj-$(CONFIG_SND_SOC_MAX9860)  += snd-soc-max9860.o
 obj-$(CONFIG_SND_SOC_MC13783)  += snd-soc-mc13783.o
diff --git a/sound/soc/codecs/max98373.c b/sound/soc/codecs/max98373.c
new file mode 100644
index ..9af0d985d6e9
--- /dev/null
+++ b/sound/soc/codecs/max98373.c
@@ -0,0 +1,971 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2017, Maxim Integrated */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "max98373.h"
+
+static struct reg_default max98373_reg[] = {
+   {MAX98373_R2000_SW_RESET, 0x00},
+   {MAX98373_R2001_INT_RAW1, 0x00},
+   {MAX98373_R2002_INT_RAW2, 0x00},
+   {MAX98373_R2003_INT_RAW3, 0x00},
+   {MAX98373_R2004_INT_STATE1, 0x00},
+   {MAX98373_R2005_INT_STATE2, 0x00},
+   {MAX98373_R2006_INT_STATE3, 0x00},
+   {MAX98373_R2007_INT_FLAG1, 0x00},
+   {MAX98373_R2008_INT_FLAG2, 0x00},
+   {MAX98373_R2009_INT_FLAG3, 0x00},
+   {MAX98373_R200A_INT_EN1, 0x00},
+   {MAX98373_R200B_INT_EN2, 0x00},
+   {MAX98373_R200C_INT_EN3, 0x00},
+   {MAX98373_R200D_INT_FLAG_CLR1, 0x00},
+   {MAX98373_R200E_INT_FLAG_CLR2, 0x00},
+   {MAX98373_R200F_INT_FLAG_CLR3, 0x00},
+   {MAX98373_R2010_IRQ_CTRL, 0x00},
+   {MAX98373_R2014_THERM_WARN_THRESH, 0x10},
+   {MAX98373_R2015_THERM_SHDN_THRESH, 0x27},
+   {MAX98373_R2016_THERM_HYSTERESIS, 0x01},
+   {MAX98373_R2017_THERM_FOLDBACK_SET, 0xC0},
+   {MAX98373_R2018_THERM_FOLDBACK_EN, 0x00},
+   {MAX98373_R201E_PIN_DRIVE_STRENGTH, 0x55},
+   {MAX98373_R2020_PCM_TX_HIZ_EN_1, 0xFE},
+   {MAX98373_R2021_PCM_TX_HIZ_EN_2, 0xFF},
+   {MAX98373_R2022_PCM_TX_SRC_1, 0x00},
+   {MAX98373_R2023_PCM_TX_SRC_2, 0x00},
+   {MAX98373_R2024_PCM_DATA_FMT_CFG, 0xC0},
+   {MAX98373_R2025_AUDIO_IF_MODE, 0x00},
+   {MAX98373_R2

[RFC PATCH] swiotlb: _swiotlb_tbl_map_single() can be static

2018-01-04 Thread kbuild test robot


Fixes: bd4bb89b2f71 ("swiotlb: suppress warning when __GFP_NOWARN is set v2")
Signed-off-by: Fengguang Wu 
---
 swiotlb.c |   10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index ed443d6..e253e80 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -490,11 +490,11 @@ static void swiotlb_bounce(phys_addr_t orig_addr, 
phys_addr_t tlb_addr,
}
 }
 
-phys_addr_t _swiotlb_tbl_map_single(struct device *hwdev,
-   dma_addr_t tbl_dma_addr,
-   phys_addr_t orig_addr, size_t size,
-   enum dma_data_direction dir,
-   unsigned long attrs, bool warn)
+static phys_addr_t _swiotlb_tbl_map_single(struct device *hwdev,
+  dma_addr_t tbl_dma_addr,
+  phys_addr_t orig_addr, size_t size,
+  enum dma_data_direction dir,
+  unsigned long attrs, bool warn)
 {
unsigned long flags;
phys_addr_t tlb_addr;

Re: [PATCH] swiotlb: suppress warning when __GFP_NOWARN is set v2

2018-01-04 Thread kbuild test robot

Hi Christian,

I love your patch! Perhaps something to improve:

[auto build test WARNING on v4.15-rc5]
[also build test WARNING on next-20180104]
[cannot apply to swiotlb/linux-next]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Christian-K-nig/swiotlb-suppress-warning-when-__GFP_NOWARN-is-set-v2/20180104-185406
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)


Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation

Re: [PATCH 1/2] bitmap: new bitmap_copy_safe and bitmap_{from,to}_arr32

2018-01-04 Thread Yury Norov

Hi Andy,

Thanks for review. Comments inline.

On Sun, Dec 31, 2017 at 02:34:42PM +0200, Andy Shevchenko wrote:
> On Thu, Dec 28, 2017 at 5:00 PM, Yury Norov  wrote:
> > This patchset replaces bitmap_{to,from}_u32array with more simple
> > and standard looking copy-like functions.
> >
> > bitmap_from_u32array() takes 4 arguments (bitmap_to_u32array is similar):
> >  - unsigned long *bitmap, which is destination;
> >  - unsigned int nbits, the length of destination bitmap, in bits;
> >  - const u32 *buf, the source; and
> >  - unsigned int nwords, the length of source buffer in ints.
> >
> > In description to the function it is detailed like:
> > * copy min(nbits, 32*nwords) bits from @buf to @bitmap, remaining
> > * bits between nword and nbits in @bitmap (if any) are cleared.
> >
> > Having two size arguments looks unneeded and potentially dangerous.
> 
> For the first argument, it depends what logic we would like to put behind.
> Imagine the case (and since these functions are targetting some wider
> cases) when you have not aligned bitmap (nbits % BITS_PER_LONG != 0).
> 
> So, there are 2 cases, nwords > nbits / BITS_PER_U32, or nbits /
> BITS_PER_U32 > nwords.
> 
> We have at least two options for the first one:
> 1) cut it to fit and return some code (or updated nbits, or ...) to
> tell this to the caller;
> 2) return an error and do nothing.
> 
> For the second case one:
> 1) merge bitmaps;
> 2) fill with 0 or 1 (another parameter?) the rest of bitmap.

This is the whole point of the patch.
Kernel users doesn't need all that functionality to manage the case
nwords != nbits / BITS_PER_U32. All existing users explicitly pass
nwords and nbits matched.

Support for unmatched nwords and nbits is pretty tricky. As you
mentioned here, there are 2 cases with (at least) 2 options for each
case. Existing code doesn't take all that into account, and if we go
handle it properly, we'll end up with quite big portion of code, which
we should also cover with tests, carefully comment and maintain. And
all this will be for nothing because there's no *real* users of that
functionality.

> > It is unneeded because normally user of copy-like function should
> > take care of the size of destination and make it big enough to fit
> > source data.
> >
> > And it is dangerous because function may hide possible error if user
> > doesn't provide big enough bitmap, and data becomes silently dropped.
> 
> We might return -E2BIG, for example.
> 
> > That's why all copy-like functions have 1 argument for size of copying
> > data, and I don't see any reason to make bitmap_from_u32array()
> > different.
> >
> > One exception that comes in mind is strncpy() which also provides size
> > of destination in arguments, but it's strongly argued by the possibility
> > of taking broken strings in source. This is not the case of
> > bitmap_{from,to}_u32array().
> >
> > There is no many real users of bitmap_{from,to}_u32array(), and they all
> > very clearly provide size of destination matched with the size of
> > source, so additional functionality is not used in fact. Like this:
> > bitmap_from_u32array(to->link_modes.supported,
> > __ETHTOOL_LINK_MODE_MASK_NBITS,
> > link_usettings.link_modes.supported,
> > __ETHTOOL_LINK_MODE_MASK_NU32);
> > Where:
> > #define __ETHTOOL_LINK_MODE_MASK_NU32 \
> > DIV_ROUND_UP(__ETHTOOL_LINK_MODE_MASK_NBITS, 32)
> 
> Consider more generic use of them.
> 
> For example, we have a big enough bitmap, but would like to copy only
> few u32 items to it. Another case, we have quite big u32 array, but
> would like to copy only some of them.
> It is first what came to my mind, there are might be much more
> interesting corner cases.
> 
> Will it survive?

For your examples, I think, we already have a set of suitable functions
in lib/bitmap.c. *arr32 functions are only to convert bitmaps from/to
u32 arrays with different endianness, probably taken from userspace or
some hardware.

So, for your 1st example, function may look like this:
void bitmap_insert(hugemap, offset, smallmap, nbits) 
{
unsigned long first = offset / BITS_PER_LONG;
unsigned long off = offset % BITS_PER_LONG;
unsigned long last = first + (nbits + off) / BITS_PER_LONG;
unsigned long first_smallmap = hugemap[first]
 & BITMAP_FIRST_WORD_MASK(offset);
unsigned long last_smallmap = hugemap[first]
 & BITMAP_LAST_WORD_MASK(nbits);

bitmap_zero(tmp, nbits + off);
__bitmap_shift_right(tmp, off);

tmp[0] &= hugemap[first];
tmp[(nbits + off) / BITS_PER_LONG] &= hugemap[last];
bitmap_copy(hugemap[first], smallmap, nbits);
}

And usage:
DECLARE_BITMAP(tmp, nbits + off);

bitmap_from_arr32(tmp, arr32, nbits);
bitmap_insert(hugemap, offset, tmp, nbits) 

This is pretty artificial example. If there'll be real users of
bitmap_insert, I believe imp

Applied "ASoC: mediatek: add some core clocks for MT2701 AFE" to the asoc tree

2018-01-04 Thread Mark Brown

The patch

   ASoC: mediatek: add some core clocks for MT2701 AFE

has been applied to the asoc tree at

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.  

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

>From 96365d9fdb2f0d81bfc010298289a8c168931cd0 Mon Sep 17 00:00:00 2001
From: Ryder Lee 
Date: Thu, 4 Jan 2018 15:44:07 +0800
Subject: [PATCH] ASoC: mediatek: add some core clocks for MT2701 AFE

Add three core clocks for MT2701 AFE.

Signed-off-by: Ryder Lee 
Signed-off-by: Mark Brown 
---
 sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c | 30 ++-
 sound/soc/mediatek/mt2701/mt2701-afe-common.h |  3 +++
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c 
b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
index 56a057c78c9a..949fc3a1d025 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
@@ -18,8 +18,11 @@
 #include "mt2701-afe-clock-ctrl.h"
 
 static const char *const base_clks[] = {
+   [MT2701_INFRA_SYS_AUDIO] = "infra_sys_audio_clk",
[MT2701_TOP_AUD_MCLK_SRC0] = "top_audio_mux1_sel",
[MT2701_TOP_AUD_MCLK_SRC1] = "top_audio_mux2_sel",
+   [MT2701_TOP_AUD_A1SYS] = "top_audio_a1sys_hp",
+   [MT2701_TOP_AUD_A2SYS] = "top_audio_a2sys_hp",
[MT2701_AUDSYS_AFE] = "audio_afe_pd",
[MT2701_AUDSYS_AFE_CONN] = "audio_afe_conn_pd",
[MT2701_AUDSYS_A1SYS] = "audio_a1sys_pd",
@@ -169,10 +172,26 @@ static int mt2701_afe_enable_audsys(struct mtk_base_afe 
*afe)
struct mt2701_afe_private *afe_priv = afe->platform_priv;
int ret;
 
-   ret = clk_prepare_enable(afe_priv->base_ck[MT2701_AUDSYS_AFE]);
+   /* Enable infra clock gate */
+   ret = clk_prepare_enable(afe_priv->base_ck[MT2701_INFRA_SYS_AUDIO]);
if (ret)
return ret;
 
+   /* Enable top a1sys clock gate */
+   ret = clk_prepare_enable(afe_priv->base_ck[MT2701_TOP_AUD_A1SYS]);
+   if (ret)
+   goto err_a1sys;
+
+   /* Enable top a2sys clock gate */
+   ret = clk_prepare_enable(afe_priv->base_ck[MT2701_TOP_AUD_A2SYS]);
+   if (ret)
+   goto err_a2sys;
+
+   /* Internal clock gates */
+   ret = clk_prepare_enable(afe_priv->base_ck[MT2701_AUDSYS_AFE]);
+   if (ret)
+   goto err_afe;
+
ret = clk_prepare_enable(afe_priv->base_ck[MT2701_AUDSYS_A1SYS]);
if (ret)
goto err_audio_a1sys;
@@ -193,6 +212,12 @@ static int mt2701_afe_enable_audsys(struct mtk_base_afe 
*afe)
clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_A1SYS]);
 err_audio_a1sys:
clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_AFE]);
+err_afe:
+   clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A2SYS]);
+err_a2sys:
+   clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A1SYS]);
+err_a1sys:
+   clk_disable_unprepare(afe_priv->base_ck[MT2701_INFRA_SYS_AUDIO]);
 
return ret;
 }
@@ -205,6 +230,9 @@ static void mt2701_afe_disable_audsys(struct mtk_base_afe 
*afe)
clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_A2SYS]);
clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_A1SYS]);
clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_AFE]);
+   clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A1SYS]);
+   clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A2SYS]);
+   clk_disable_unprepare(afe_priv->base_ck[MT2701_INFRA_SYS_AUDIO]);
 }
 
 int mt2701_afe_enable_clock(struct mtk_base_afe *afe)
diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-common.h 
b/sound/soc/mediatek/mt2701/mt2701-afe-common.h
index 9a2b301a4c21..ae8ddeacfbfe 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-common.h
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-common.h
@@ -61,8 +61,11 @@ enum {
 };
 
 enum audio_base_clock {
+   MT2701_INFRA_SYS_AUDIO,
MT2701_TOP_AUD_MCLK_SRC0,
MT2701_TOP_AUD_MCLK_SRC1,
+   MT2701_TOP_AUD_A1SYS,
+   MT2701_TOP_AUD_A2SYS,
MT2701_AUDSYS_AFE,
MT2701_AUDSYS_AFE_CONN,
MT2701_AUDSYS_A1SYS,
-- 
2.15.1

Re: [PATCH v4 3/7] ARM: davinci: fix duplicate clocks

2018-01-04 Thread David Lechner




On 1/4/18 5:12 AM, Sekhar Nori wrote:

On Monday 01 January 2018 05:09 AM, David Lechner wrote:

There are a number of clocks that were duplicated because they are used by
more than one device. It is no longer necessary to do this since we are
explicitly calling clk_register_clkdev() for each clock. In da830.c, some
clocks were using the same LPSC, which would cause problems with reference
counting, so these are combinded into one clock each. In da850.c the
duplicate clocks had already been fixed by creating dummy child clocks, so
these clocks are removed.

Signed-off-by: David Lechner 


If we do end up keeping 2/7, this should be done before that - to avoid
retouching code that was just introduced.



FWIW, this can't be done before because it will cause broken linked 
lists in the davinci clocks. But, as I mentioned already, I am going to 
try a different approach, so this patch will go away completely.

Re: [0/4] video-UDLFB: Adjustments for five function implementations

2018-01-04 Thread SF Markus Elfring

>> * Do you find a Linux allocation failure report insufficient in this use 
>> case?
> 
> Yes,

Interesting …


> there is more information available currently in the driver and
> I see no real improvement in removing it.
> 
>> * Are you looking for any more clarification?
> 
> I will not apply any of such patches for now. The only exception
> being drivers that support hardware that can have only one instance
> in the system …

Thanks for your feedback.


> and the patch needs to be reviewed by a someone else than the author).

I am curious if this will ever happen again for my update suggestions
in such a software area.

Regards,
Markus

Re: [PATCH v4 5/7] clk: Introduce davinci clocks

2018-01-04 Thread David Lechner




On 1/4/18 6:28 AM, Sekhar Nori wrote:

On Wednesday 03 January 2018 03:01 AM, David Lechner wrote:

Forgot to cc linux-clk, so doing that now...


On 12/31/2017 05:39 PM, David Lechner wrote:

This introduces new drivers for arch/arm/mach-davinci. The code is based
on the clock drivers from there and adapted to use the common clock
framework.

Signed-off-by: David Lechner 
---
   drivers/clk/Makefile  |   1 +
   drivers/clk/davinci/Makefile  |   3 +
   drivers/clk/davinci/da8xx-cfgchip-clk.c   | 380
++
   drivers/clk/davinci/pll.c | 333
++
   drivers/clk/davinci/psc.c | 217 +
   include/linux/clk/davinci.h   |  46 
   include/linux/platform_data/davinci_clk.h |  25 ++
   7 files changed, 1005 insertions(+)


This is a pretty huge patch and I think each of cfgchip, pll and PSC
clocks deserve a patch of their own.


Will do.



On the PLL patch, please describe how the PLL implementation on DaVinci
is different from Keystone, so no reuse is really possible. Similarly
for the PSC patch (no non-DT support in keystone etc).


OK.




diff --git a/drivers/clk/davinci/psc.c b/drivers/clk/davinci/psc.c
new file mode 100644
index 000..8ae85ee
--- /dev/null
+++ b/drivers/clk/davinci/psc.c
@@ -0,0 +1,217 @@



+static void psc_config(struct davinci_psc_clk *psc,
+   enum davinci_psc_state next_state)
+{
+    u32 epcpr, ptcmd, pdstat, pdctl, mdstat, mdctl, ptstat;
+
+    mdctl = readl(psc->base + MDCTL + 4 * psc->lpsc);
+    mdctl &= ~MDSTAT_STATE_MASK;
+    mdctl |= next_state;
+    /* TODO: old davinci clocks for da850 set MDCTL_FORCE bit for
sata and
+ * dsp here. Is this really needed?
+ */
+    writel(mdctl, psc->base + MDCTL + 4 * psc->lpsc);
+
+    pdstat = readl(psc->base + PDSTAT + 4 * psc->pd);
+    if ((pdstat & PDSTAT_STATE_MASK) == 0) {
+    pdctl = readl(psc->base + PDSTAT + 4 * psc->pd);
+    pdctl |= PDCTL_NEXT;
+    writel(pdctl, psc->base + PDSTAT + 4 * psc->pd);
+
+    ptcmd = BIT(psc->pd);
+    writel(ptcmd, psc->base + PTCMD);
+
+    do {
+    epcpr = __raw_readl(psc->base + EPCPR);
+    } while (!(epcpr & BIT(psc->pd)));
+
+    pdctl = __raw_readl(psc->base + PDCTL + 4 * psc->pd);
+    pdctl |= PDCTL_EPCGOOD;
+    __raw_writel(pdctl, psc->base + PDCTL + 4 * psc->pd);


Can we shift to regmap here too? Then the polling loops like above can
be converted to regmap_read_poll_timeout() too like you have done elsewhere.



I'll give it a try.

RE: [PATCH v3 1/9] ufs: sysfs: device descriptor

2018-01-04 Thread Stanislav Nijnikov



> -Original Message-
> From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> Sent: Wednesday, January 3, 2018 3:44 AM
> To: Stanislav Nijnikov 
> Cc: linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
> gre...@linuxfoundation.org; Alex Lemberg 
> Subject: Re: [PATCH v3 1/9] ufs: sysfs: device descriptor
> 
> On 01/02, Stanislav Nijnikov wrote:
> >
> >
> > > -Original Message-
> > > From: Jaegeuk Kim [mailto:jaeg...@kernel.org]
> > > Sent: Thursday, December 28, 2017 9:37 PM
> > > To: Stanislav Nijnikov 
> > > Cc: linux-s...@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > gre...@linuxfoundation.org; Alex Lemberg 
> > > Subject: Re: [PATCH v3 1/9] ufs: sysfs: device descriptor
> > >
> > > On 12/28, Stanislav Nijnikov wrote:
> > > > This patch introduces a sysfs group entry for the UFS device
> > > > descriptor parameters. The group adds "device_descriptor" folder
> > > > under the UFS driver sysfs entry
> > > > (/sys/bus/platform/drivers/ufshcd/*). The parameters are shown as
> > > > hexadecimal numbers. The full information about the parameters could
> be found at UFS specifications 2.1.
> > > >
> > > > Signed-off-by: Stanislav Nijnikov 
> > > > ---
> > > >  Documentation/ABI/testing/sysfs-driver-ufs | 223
> > > +
> > > >  drivers/scsi/ufs/Makefile  |   3 +-
> > > >  drivers/scsi/ufs/ufs-sysfs.c   | 170 ++
> > > >  drivers/scsi/ufs/ufs-sysfs.h   |  25 
> > > >  drivers/scsi/ufs/ufs.h |   8 ++
> > > >  drivers/scsi/ufs/ufshcd.c  |  12 +-
> > > >  drivers/scsi/ufs/ufshcd.h  |   4 +
> > > >  7 files changed, 439 insertions(+), 6 deletions(-)  create mode
> > > > 100644 Documentation/ABI/testing/sysfs-driver-ufs
> > > >  create mode 100644 drivers/scsi/ufs/ufs-sysfs.c  create mode
> > > > 100644 drivers/scsi/ufs/ufs-sysfs.h
> > > >
> > > > diff --git a/Documentation/ABI/testing/sysfs-driver-ufs
> > > > b/Documentation/ABI/testing/sysfs-driver-ufs
> > > > new file mode 100644
> > > > index 000..17cc4aa
> > > > --- /dev/null
> > > > +++ b/Documentation/ABI/testing/sysfs-driver-ufs
> > >
> > > [snip]
> > >
> > > > diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile
> > > > index 9310c6c..918f579 100644
> > > > --- a/drivers/scsi/ufs/Makefile
> > > > +++ b/drivers/scsi/ufs/Makefile
> > > > @@ -3,6 +3,7 @@
> > > >  obj-$(CONFIG_SCSI_UFS_DWC_TC_PCI) += tc-dwc-g210-pci.o ufshcd-
> > > dwc.o
> > > > tc-dwc-g210.o
> > > >  obj-$(CONFIG_SCSI_UFS_DWC_TC_PLATFORM) += tc-dwc-g210-
> pltfrm.o
> > > > ufshcd-dwc.o tc-dwc-g210.o
> > > >  obj-$(CONFIG_SCSI_UFS_QCOM) += ufs-qcom.o
> > > > -obj-$(CONFIG_SCSI_UFSHCD) += ufshcd.o
> > > > +obj-$(CONFIG_SCSI_UFSHCD) += ufshcd-core.o ufshcd-core-objs :=
> > > > +ufshcd.o ufs-sysfs.o
> > >
> > > Why not just adding ufs-sysfs.o in the existing configuration?
> >
> > The kernel build robot compiles the UFS driver as a separate module.
> > The existing configuration doesn't allow to add a new file to be
> > compiled as a part of this module. The line like "
> > obj-$(CONFIG_SCSI_UFSHCD) += ufshcd.o ufs-sysfs.o" in the makefile will
> actually create 2 modules.
> > This was the reason why I used EXPORT_SYMBOL in the first version.
> 
> Is there a reason to drop the first version?
> 
It was updated according to Greg Kroah-Hartman' notes (one of them was what is
a reason to use EXPORT_SYMBOL if functions are used only in one module).
> >
> > >
> > > >  obj-$(CONFIG_SCSI_UFSHCD_PCI) += ufshcd-pci.o
> > > >  obj-$(CONFIG_SCSI_UFSHCD_PLATFORM) += ufshcd-pltfrm.o diff --git
> > > > a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c new
> > > > file mode 100644 index 000..1c685f3
> > > > --- /dev/null
> > > > +++ b/drivers/scsi/ufs/ufs-sysfs.c
> > > > @@ -0,0 +1,170 @@
> > > > +/*
> > > > +* UFS Device Management sysfs
> > > > +*
> > > > +* Copyright (C) 2017 Western Digital Corporation
> > > > +*
> > > > +* This program is free software; you can redistribute it and/or
> > > > +* modify it under the terms of the GNU General Public License
> > > > +version
> > > > +* 2 as published by the Free Software Foundation.
> > > > +*
> > > > +* This program is distributed in the hope that it will be useful,
> > > > +but
> > > > +* WITHOUT ANY WARRANTY; without even the implied warranty of
> > > > +* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
> the
> > > GNU
> > > > +* General Public License for more details.
> > > > +*
> > > > +*/
> > > > +
> > > > +#include 
> > > > +#include 
> > > > +
> > > > +#include "ufs.h"
> > > > +#include "ufs-sysfs.h"
> > > > +/* collision between the device descriptor parameter and the
> > > > +definition */ #undef DEVICE_CLASS
> > >
> > > Does this make sense? How about attaching "_" for all the macro like
> > > _DEVICE_CLASS below?
> > >
> >
> > It's not just changing the one line that uses "DEVICE_CLASS" to use
> > "_DEVICE_CLASS". It's will be necessary to add "

Re: [PATCH v4 5/7] clk: Introduce davinci clocks

2018-01-04 Thread David Lechner




On 1/4/18 6:43 AM, Sekhar Nori wrote:

Hi David,

On Monday 01 January 2018 05:09 AM, David Lechner wrote:

+   /* TODO: old davinci clocks for da850 set MDCTL_FORCE bit for sata and
+* dsp here. Is this really needed?
+*/


The commit that introduced this flag suggests so.

commit aad70de20fc69970a3080e7e8f02b54a4a3fe3e6
Author: Sekhar Nori 
AuthorDate: Wed Jul 6 06:01:22 2011 +
Commit: Sekhar Nori 
CommitDate: Fri Jul 8 11:10:09 2011 +0530

 davinci: enable forced transitions on PSC

 Some DaVinci modules like the SATA on DA850
 need forced module state transitions.

 Define a "force" flag which can be passed to
 the PSC config function to enable it to make
 forced transitions.

 Forced transitions shouldn't normally be attempted,
 unless the TRM explicitly specifies its usage.

 ChangeLog:
 v2:
 Modified to take care of the fact that
 davinci_psc_config() now takes the flags
 directly.

 Signed-off-by: Sekhar Nori 

I can check without that flag again, but I do recall it being needed.



OK, I will add it back. I need to add some other flags as well anyway.

Re: [PATCH v5 6/9] ACPI/PPTT: Add topology parsing code

2018-01-04 Thread Jeremy Linton


Hi,

On 01/04/2018 12:48 AM, vkil...@codeaurora.org wrote:

Hi Jeremy


-Original Message-
From: linux-arm-kernel

[mailto:linux-arm-kernel-boun...@lists.infradead.org]

On Behalf Of Jeremy Linton
Sent: Wednesday, January 3, 2018 10:28 PM
To: vkil...@codeaurora.org
Cc: 'Mark Rutland' ; jonathan.zh...@cavium.com;
jayachandran.n...@cavium.com; 'Lorenzo Pieralisi'
; austi...@codeaurora.org; 'Linux PM' ; jh...@codeaurora.org; 'Catalin Marinas'
; 'Sudeep Holla' ; 'Will
Deacon' ; 'Linux Kernel Mailing List' ; wangxiongfe...@huawei.com; 'ACPI Devel Maling
List' ; 'Viresh Kumar'

;

'Rafael J. Wysocki' ; 'Hanjun Guo'
; 'Greg Kroah-Hartman'
; 'Rafael J. Wysocki' ; 'Al
Stone' ; linux-arm-ker...@lists.infradead.org; 'Len

Brown'


Subject: Re: [PATCH v5 6/9] ACPI/PPTT: Add topology parsing code

Hi,

On 01/03/2018 02:49 AM, vkil...@codeaurora.org wrote:

Hi Jeremy,

   Sorry, I don't have your previous patch emails to reply on right
patch context.
So commenting on top of this patch.

AFAIU, the PPTT v5 patches still rely on CLIDR_EL1 register to know
the type of Caches enabled/available on the platform. With PPTT, it
should not rely on architecture registers. There can be platforms
which can report cache availability in PPTT but not in architecture
registers.

The following code snippet shows usage of CLIDR_EL1

In arch/arm64/kernel/cacheinfo.c

static inline enum cache_type get_cache_type(int level) {
   u64 clidr;

   if (level > MAX_CACHE_LEVEL)
   return CACHE_TYPE_NOCACHE;
   clidr = read_sysreg(clidr_el1);
   return CLIDR_CTYPE(clidr, level); }

static int __populate_cache_leaves(unsigned int cpu) {
unsigned int level, idx;
enum cache_type type;
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
struct cacheinfo *this_leaf = this_cpu_ci->info_list;

for (idx = 0, level = 1; level <= this_cpu_ci->num_levels &&
 idx < this_cpu_ci->num_leaves; idx++, level++) {
type = get_cache_type(level);
if (type == CACHE_TYPE_SEPARATE) {
ci_leaf_init(this_leaf++, CACHE_TYPE_DATA,

level);

ci_leaf_init(this_leaf++, CACHE_TYPE_INST,

level);

} else {
ci_leaf_init(this_leaf++, type, level);
}
   }
return 0;
   }

In populate_cache_leaves() the cache type is read from CLIDR_EL1

register.

If CLIDR_EL1 reports CACHE_TYPE_NOCACHE for a particular level then
sysfs entry /sys/devices/system/cpu/cpu0/index/type is not created
and hence userspace tools like lstopo will not report this cache
level.



This sounds suspiciously like one of things tweaked between v4->v5. If you

look

at update_cache_properties() in patch 2/9, you will see that we only
update/find NOCACHE nodes and convert them to UNIFIED when all the
attributes in the node are supplied.

This means that if the node has an incomplete set of attributes we won't

update

it. Can you verify that you have all those attributes set for nodes which

aren't

being described by the hardware?


Thanks for pointing out.
Why do we need to check for set of attributes and decide it as UNIFIED
cache.?
We can get cache type from attributes bits[3:2] if cache type valid flag is
set
irrespective of other attributes. If cache type valid flag is not set then
we can assume
it as NOCACHE type as neither architecture register nor in PPTT has valid
cache type.


To answer the first question, in a strict sense we don't need to check 
any of the attributes in order to override the cache type. That said, 
initially I was going to trigger the override only when important 
attributes were set to assure that we weren't exporting meaningless 
nodes into sysfs. Then while picking which attributes I considered 
important, I came to the conclusion that it was simply better to assure 
that they were all set for nodes entirely generated by the PPTT. AKA, I 
don't want to see L3 cache nodes with their size or associativity unset, 
its better in that case that they remain hidden.


Per, the cache type valid bit. The code is written with the assumption 
that it is overriding probed values (despite that not being true at the 
moment for arm64) in the spirit of the standard. This informs/restricts 
how the code works because we aren't simply generating the entire 
cacheinfo directly from PPTT walks. Instead we are merging the PPTT 
information with anything previously probed, meaning we need a way to 
match existing cacheinfo structures with PPTT nodes.


So, the logic finding/matching an existing probed cache node requires 
that the cache type is valid because the cache level, and type is used 
as the match key. If the PPTT cache node doesn't have the cache type 
valid set, then the match logic won't find the node, and the PPTT code 
won't make any updates. That may also be what your seeing

Re: [PATCH v4 6/7] ARM: davinci: convert to common clock framework

2018-01-04 Thread David Lechner




On 1/4/18 6:39 AM, Sekhar Nori wrote:

On Monday 01 January 2018 05:09 AM, David Lechner wrote:

This converts all of arch/arm/mach-davinci to the common clock framework.
The clock drivers from clock.c and psc.c have been moved to drivers/clk,
so these files are removed.

There is one subtle change in the clock trees. AUX, BPDIV and OSCDIV
clocks now have "ref_clk" as a parent instead of the PLL clock. These
clocks are part of the PLL's MMIO block, but they bypass the PLL and
therefore it makes more sense to have "ref_clk" as their parent since
"ref_clk" is the input clock of the PLL.

CONFIG_DAVINCI_RESET_CLOCKS is removed since the common clock frameworks
takes care of disabling unused clocks.

Known issue: This breaks CPU frequency scaling on da850.


This functionality needs to be restored as part of this series since we
cannot commit anything with regressions.



Do you have a suggestion on how to accomplish this? I don't have a board 
for testing, so I don't have a way of knowing if my changes will work or 
not.




Also, the order of #includes are cleaned up in files while we are touching
this code.

Signed-off-by: David Lechner 


This is a pretty huge patch again and I hope it can be broken down.
Ideally one per SoC converted and then the unused code removal.



Will do.

RFD: Fastpath amelioration of the KAISER/KPTI performance impact

2018-01-04 Thread Kalle A. Sandstrom


[presented with intent to amuse and edumacate, here's a little something
something for the current performance crisis.]


--- cut here ---

Fastpath amelioration of the KAISER fixes' performance impact in Linux.
Kalle A. Sandström, 20180104

[DRAFT VERSION 0: not for publication. not even for serious consideration; v0
should be read as an elaborate joke.]


ABSTRACT.

This document identifies an opportunity for clawing back some of the
performance penalty from the KAISER/KPTI security patch by means of
fast-pathing interprocess communication in the section of code that'd
otherwise trampoline kernel entry. Two possible designs to this end are
briefly outlined.

The designs presented are for the very worst case where microcode updates
don't appear, or are restricted to new CPU models, and consequently
KAISER/KPTI is here to stay for a hojillion people. All of this may be a
terrible idea. Caveat lector; a good argument can be made in favour of not
looking into the abyss.


SYNOPSIS.

Increase the constant function fragment's footprint to handle some forms of
task switching and inter-process communication without enabling the kernel
proper, thereby halving the number of TLB flushes over some IPC roundtrips.
The IPC mechanism might be something as involved as a reimplementation of most
POSIX I/O, or as minimal as a rendezvous synchronization primitive combined
with existing shared memory gubbins. Distinguish this from the ``big'' kernel
with filesystems, MM, block devices, and anything with an infinite memory
requirement; which is stashed behind the extra TLB flush.

Call the intermediary an ittybittykernel. *rimshot*

While most performance gains from this general approach should happen early
on, higher-hanging fruit will be available for a long time to come, so the CFF
is expected to grow indefinitely. It could be foreseen that there'll be a
long-term game of cat and mouse between the speculative information leak
finder and the perpetually-appointed security engineer, providing both with
long-term careers in computational esoterica.

This design document presents a speculative development path towards such a
Frankenstein's architecture as well as a first step along that path,
ultimately motivated by the prospect of recovering some of that putative 20%
performance penalty. On the downside, even the best result will still be worse
than a hypothetical microkernel system written from scratch, but only until
the CPU manufacturers repair their emissions: after that monolithic will rule
microbenchmarks once more (on new hardware, and chips where a microcode fix is
available & yields a lesser penalty).


BACKGROUND.

The KAISER patch makes the kernel invulnerable to the speculative address
space probing feature of certain Intel processors (the ``Meltdown''
vulnerability). It accomplishes this at the cost of a TLB flush coming and
going per syscall, which brings their minimum number over the shortest
possible inter-process roundtrip to 4.

This is a heavy performance cost in applications where out-of-process
computation doesn't dominate TLB reload overhead. It could even be said that
in terms of performance, KAISER turns Linux into the worst possible
microkernel system: one where exactly no services are provided by the
intermediate layer but all of a monolithic design's downsides are retained,
leaving the intermediary's introduction a step for the worse from all
perspectives besides security.


PROPOSAL.

Instead of having the kernel mapped into each process and serving syscalls
etc. directly, the KAISER patch changes the kernel interface to an analogue of
what's used in 4G/4G mode. That's to say, it forwards kernel entry via a set
of IDT and syscall trampolines over the TLB flush boundary into what's
effectively a separate kernel address space. The simplified rationale is that
since the region containing the trampolines is small and its contents easily
audited for security issues, this prevents both leakage of useful information
regarding kernel address space layout randomization, and (consequently) the
utilization of speculative kernel information leak vulnerabilities without
(an)other ASLR leak(s).

The proposal at hand amounts to an increase in the footprint of this
``constant function fragment'' to the end that communication between the X
server and its clients wouldn't suffer double TLB flushes. Two distinct means
are proposed: the first is a conservative reimplementation of a subset of
POSIX file descriptor and process management, and UNIX domain sockets; and the
second a simplistic ``shared memory with rendezvous sync'' primitive coupled
with fiddly business in the C library and a legacy fallback.

Regardless of design particulars, the additional code's presence is justified
by being eventually fully auditable for both KASLR information leaks and
exploitable speculative-execution gadgets. T

Re: [PATCH 4.4 00/37] 4.4.110-stable review

2018-01-04 Thread Guenter Roeck

On Thu, Jan 04, 2018 at 06:16:04PM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 04, 2018 at 06:14:15PM +0100, Greg Kroah-Hartman wrote:
> > On Thu, Jan 04, 2018 at 06:11:02PM +0100, Greg Kroah-Hartman wrote:
> > > On Thu, Jan 04, 2018 at 06:03:15PM +0100, Willy Tarreau wrote:
> > > > On Thu, Jan 04, 2018 at 05:53:06PM +0100, Greg Kroah-Hartman wrote:
> > > > > On Thu, Jan 04, 2018 at 11:38:25AM -0500, Pavel Tatashin wrote:
> > > > > > I am getting the following panic when trying to boot 4.4.110rc1 on
> > > > > > Intel(R) Xeon(R) CPU E5-2630:
> > > > > > 
> > > > > > [5.923489] BUG: unable to handle kernel NULL pointer dereference
> > > > > > at 000d
> > > > > > [5.932259] IP: [] 
> > > > > > dyntick_save_progress_counter+0x12/0x50
> > > > > > [5.940142] PGD 0
> > > > > > [5.942400] Oops: 0002 [#1] SMP
> > > > > > [5.946023] Modules linked in:
> > > > > > [5.949448] CPU: 5 PID: 8 Comm: rcu_sched Not tainted
> > > > > > 4.4.110-rc1_pt_linux-4.4.110rc1 #1
> > > > > > [5.958484] Hardware name: Oracle Corporation ORACLE SERVER
> > > > > > X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
> > > > > > [5.969552] task: 881ff2f1ab00 ti: 881ff2f24000 task.ti:
> > > > > > 881ff2f24000
> > > > > > [5.977905] RIP: 0010:[]  []
> > > > > > dyntick_save_progress_counter+0x12/0x50
> > > > > > [5.988505] RSP: :881ff2f27dc0  EFLAGS: 00010046
> > > > > > [5.994434] RAX: 0001 RBX: 81b02140 RCX: 
> > > > > > 883fec768000
> > > > > > [6.002403] RDX:  RSI: 881ff2f27e5f RDI: 
> > > > > > 88407e958140
> > > > > > [6.010368] RBP: 881ff2f27dc0 R08: 881ff2f27e78 R09: 
> > > > > > 00016110f359
> > > > > > [6.018333] R10: 0b10 R11:  R12: 
> > > > > > 81b02140
> > > > > > [6.026297] R13: ffdf R14: 0021 R15: 
> > > > > > 0002
> > > > > > [6.034262] FS:  () GS:881fff94()
> > > > > > knlGS:
> > > > > > [6.043293] CS:  0010 DS:  ES:  CR0: 80050033
> > > > > > [6.049707] CR2: 000d CR3: 01aa6000 CR4: 
> > > > > > 00360670
> > > > > > [6.057672] DR0:  DR1:  DR2: 
> > > > > > 
> > > > > > [6.065638] DR3:  DR6: fffe0ff0 DR7: 
> > > > > > 0400
> > > > > > [6.073603] Stack:
> > > > > > [6.075847]  881ff2f27e18 810e8fac 0202
> > > > > > 881ff2f27e60
> > > > > > [6.084158]  881ff2f27e5f 810e70c0 81b02140
> > > > > > 81b127a0
> > > > > > [6.092465]  0001  0003
> > > > > > 881ff2f27eb8
> > > > > > [6.100768] Call Trace:
> > > > > > [6.103501]  [] force_qs_rnp+0xdc/0x150
> > > > > > [6.109527]  [] ? rcu_start_gp+0x70/0x70
> > > > > > [6.115654]  [] rcu_gp_kthread+0x468/0x9b0
> > > > > > [6.121976]  [] ? 
> > > > > > prepare_to_wait_event+0xf0/0xf0
> > > > > > [6.128973]  [] ? 
> > > > > > rcu_process_callbacks+0x5f0/0x5f0
> > > > > > [6.136167]  [] kthread+0xe5/0x100
> > > > > > [6.141710]  [] ? kthread_park+0x60/0x60
> > > > > > [6.147840]  [] ret_from_fork+0x3f/0x70
> > > > > > [6.153868]  [] ? kthread_park+0x60/0x60
> > > > > > 
> > > > > > I tried to bisect the problem, but when I try to boot only with:
> > > > > > "KAISER: Kernel Address Isolation" machine hangs during boot and
> > > > > > reboots without any panic message.
> > > > > > 
> > > > > > 4.4.109 boots fine
> > > > > > 4.9.75rc1 also boots fine.
> > > > > 
> > > > > Hm, so I'm guessing 4.15-rc6 also works?
> > > > > 
> > > > > Odd that 4.9.75-rc1 fails.
> > > > 
> > > > s/4.9.75/4.4.110/ I suppose.
> > > 
> > > Yes, mistake on my side.
> > > 
> > > > Can't this be because more patches are required in 4.4 to support this
> > > > patch set ? Or maybe a manual fix for a conflict that went wrong ? Just
> > > > trying to guess.
> > > 
> > > Odd thing is, the 4.9 series started from the 4.4 code for most of the
> > > patches, so I would expect that one to fail...
> > 
> > Also, the 4.4 patches were supposed to have been better tested, I need
> > to go dig and see what I messed up here...
> 
> Nope, it matches up with what is in SLES12 exactly, I must be missing
> something else here as a prerequisite...

FWIW, v4.4.110-rc1 boots fine when merged into chromeos-4.4, on i7-7Y75.

Guenter

[PATCH] proc: spread likely/unlikely a bit

2018-01-04 Thread Alexey Dobriyan

use_pde() is used at every open/read/write/... of every random /proc
file. Negative refcount happens only if PDE is being deleted by module
(read: never). So it gets "likely".

unuse_pde() gets "unlikely" for the same reason.

close_pdeo() gets unlikely as the completion is filled only if there is
a race between PDE removal and close() (read: never ever).

It even saves code on x86_64 defconfig:

add/remove: 0/0 grow/shrink: 1/2 up/down: 2/-20 (-18)
Function old new   delta
close_pdeo   183 185  +2
proc_reg_get_unmapped_area   119 111  -8
proc_reg_poll 85  73 -12

Signed-off-by: Alexey Dobriyan 
---

 fs/proc/inode.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -129,12 +129,12 @@ enum {BIAS = -1U<<31};
 
 static inline int use_pde(struct proc_dir_entry *pde)
 {
-   return atomic_inc_unless_negative(&pde->in_use);
+   return likely(atomic_inc_unless_negative(&pde->in_use));
 }
 
 static void unuse_pde(struct proc_dir_entry *pde)
 {
-   if (atomic_dec_return(&pde->in_use) == BIAS)
+   if (unlikely(atomic_dec_return(&pde->in_use) == BIAS))
complete(pde->pde_unload_completion);
 }
 
@@ -167,7 +167,7 @@ static void close_pdeo(struct proc_dir_entry *pde, struct 
pde_opener *pdeo)
spin_lock(&pde->pde_unload_lock);
/* After ->release. */
list_del(&pdeo->lh);
-   if (pdeo->c)
+   if (unlikely(pdeo->c))
complete(pdeo->c);
kfree(pdeo);
}
@@ -421,7 +421,7 @@ static const char *proc_get_link(struct dentry *dentry,
 struct delayed_call *done)
 {
struct proc_dir_entry *pde = PDE(inode);
-   if (unlikely(!use_pde(pde)))
+   if (!use_pde(pde))
return ERR_PTR(-EINVAL);
set_delayed_call(done, proc_put_link, pde);
return pde->data;

Re: [RFC PATCH] asm/generic: introduce if_nospec and nospec_barrier

2018-01-04 Thread Dan Williams

On Wed, Jan 3, 2018 at 10:28 PM, Julia Lawall  wrote:
>
>
> On Wed, 3 Jan 2018, Dan Williams wrote:
>
>> [ adding Julia and Dan ]
>>
>> On Wed, Jan 3, 2018 at 5:07 PM, Alan Cox  wrote:
>> > On Wed, 3 Jan 2018 16:39:31 -0800
>> > Linus Torvalds  wrote:
>> >
>> >> On Wed, Jan 3, 2018 at 4:15 PM, Dan Williams  
>> >> wrote:
>> >> > The 'if_nospec' primitive marks locations where the kernel is disabling
>> >> > speculative execution that could potentially access privileged data. It
>> >> > is expected to be paired with a 'nospec_{ptr,load}' where the user
>> >> > controlled value is actually consumed.
>> >>
>> >> I'm much less worried about these "nospec_load/if" macros, than I am
>> >> about having a sane way to determine when they should be needed.
>> >>
>> >> Is there such a sane model right now, or are we talking "people will
>> >> randomly add these based on strong feelings"?
>> >
>> > There are people trying to tune coverity and other tool rules to identify
>> > cases, and some of the work so far was done that way. For x86 we didn't
>> > find too many so far so either the needed pattern is uncommon or   8)
>> >
>> > Given you can execute over a hundred basic instructions in a speculation
>> > window it does need to be a tool that can explore not just in function
>> > but across functions. That's really tough for the compiler itself to do
>> > without help.
>> >
>> > What remains to be seen is if there are other patterns that affect
>> > different processors.
>> >
>> > In the longer term the compiler itself needs to know what is and isn't
>> > safe (ie you need to be able to write things like
>> >
>> > void foo(tainted __user int *x)
>> >
>> > and have the compiler figure out what level of speculation it can do and
>> > (on processors with those features like IA64) when it can and can't do
>> > various kinds of non-trapping loads.
>> >
>>
>> It would be great if coccinelle and/or smatch could be taught to catch
>> some of these case at least as a first pass "please audit this code
>> block" type of notification.
>>
>
> What should one be looking for.  Do you have a typical example?
>

See "Exploiting Conditional Branch Misprediction" from the paper [1].

The typical example is an attacker controlled index used to trigger a
dependent read near a branch. Where an example of "near" from the
paper is "up to 188 simple instructions inserted in the source code
between the ‘if’ statement and the line accessing array...".

if (attacker_controlled_index < bound)
 val = array[attacker_controlled_index];
else
return error;

...when the cpu speculates that the 'index < bound' branch is taken it
reads index and uses that value to read array[index]. The result of an
'array' relative read is potentially observable in the cache.

[1]: https://spectreattack.com/spectre.pdf

[PATCH] proc: rearrange args

2018-01-04 Thread Alexey Dobriyan

Rearrange args for smaller code.

lookup revolves around memcmp() which gets len 3rd arg, so propagate
length as 3rd arg.

readdir and lookup add additional arg to VFS ->readdir and ->lookup,
so better add it to the end.

Space savings on x86_64:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-18 (-18)
Function old new   delta
proc_readdir  22  13  -9
proc_lookup   18   9  -9

proc_match() is smaller if not inlined, I promise!

Signed-off-by: Alexey Dobriyan 
---

 fs/proc/generic.c  |   18 +-
 fs/proc/internal.h |5 ++---
 fs/proc/proc_net.c |4 ++--
 3 files changed, 13 insertions(+), 14 deletions(-)

--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -28,7 +28,7 @@
 
 static DEFINE_RWLOCK(proc_subdir_lock);
 
-static int proc_match(unsigned int len, const char *name, struct 
proc_dir_entry *de)
+static int proc_match(const char *name, struct proc_dir_entry *de, unsigned 
int len)
 {
if (len < de->namelen)
return -1;
@@ -60,7 +60,7 @@ static struct proc_dir_entry *pde_subdir_find(struct 
proc_dir_entry *dir,
struct proc_dir_entry *de = rb_entry(node,
 struct proc_dir_entry,
 subdir_node);
-   int result = proc_match(len, name, de);
+   int result = proc_match(name, de, len);
 
if (result < 0)
node = node->rb_left;
@@ -84,7 +84,7 @@ static bool pde_subdir_insert(struct proc_dir_entry *dir,
struct proc_dir_entry *this = rb_entry(*new,
   struct proc_dir_entry,
   subdir_node);
-   int result = proc_match(de->namelen, de->name, this);
+   int result = proc_match(de->name, this, de->namelen);
 
parent = *new;
if (result < 0)
@@ -211,8 +211,8 @@ void proc_free_inum(unsigned int inum)
  * Don't create negative dentries here, return -ENOENT by hand
  * instead.
  */
-struct dentry *proc_lookup_de(struct proc_dir_entry *de, struct inode *dir,
-   struct dentry *dentry)
+struct dentry *proc_lookup_de(struct inode *dir, struct dentry *dentry,
+ struct proc_dir_entry *de)
 {
struct inode *inode;
 
@@ -235,7 +235,7 @@ struct dentry *proc_lookup_de(struct proc_dir_entry *de, 
struct inode *dir,
 struct dentry *proc_lookup(struct inode *dir, struct dentry *dentry,
unsigned int flags)
 {
-   return proc_lookup_de(PDE(dir), dir, dentry);
+   return proc_lookup_de(dir, dentry, PDE(dir));
 }
 
 /*
@@ -247,8 +247,8 @@ struct dentry *proc_lookup(struct inode *dir, struct dentry 
*dentry,
  * value of the readdir() call, as long as it's non-negative
  * for success..
  */
-int proc_readdir_de(struct proc_dir_entry *de, struct file *file,
-   struct dir_context *ctx)
+int proc_readdir_de(struct file *file, struct dir_context *ctx,
+   struct proc_dir_entry *de)
 {
int i;
 
@@ -292,7 +292,7 @@ int proc_readdir(struct file *file, struct dir_context *ctx)
 {
struct inode *inode = file_inode(file);
 
-   return proc_readdir_de(PDE(inode), file, ctx);
+   return proc_readdir_de(file, ctx, PDE(inode));
 }
 
 /*
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -153,10 +153,9 @@ extern bool proc_fill_cache(struct file *, struct 
dir_context *, const char *, i
  * generic.c
  */
 extern struct dentry *proc_lookup(struct inode *, struct dentry *, unsigned 
int);
-extern struct dentry *proc_lookup_de(struct proc_dir_entry *, struct inode *,
-struct dentry *);
+struct dentry *proc_lookup_de(struct inode *, struct dentry *, struct 
proc_dir_entry *);
 extern int proc_readdir(struct file *, struct dir_context *);
-extern int proc_readdir_de(struct proc_dir_entry *, struct file *, struct 
dir_context *);
+int proc_readdir_de(struct file *, struct dir_context *, struct proc_dir_entry 
*);
 
 static inline struct proc_dir_entry *pde_get(struct proc_dir_entry *pde)
 {
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -135,7 +135,7 @@ static struct dentry *proc_tgid_net_lookup(struct inode 
*dir,
de = ERR_PTR(-ENOENT);
net = get_proc_task_net(dir);
if (net != NULL) {
-   de = proc_lookup_de(net->proc_net, dir, dentry);
+   de = proc_lookup_de(dir, dentry, net->proc_net);
put_net(net);
}
return de;
@@ -172,7 +172,7 @@ static int proc_tgid_net_readdir(struct file *file, struct 
dir_context *ctx)
ret = -EINVAL;
net = get_proc_task_net(file_inode(file));
if (net != NULL) {
-   ret = proc_readdir_de(net->proc_net, file, ct

Re: [PATCH v5 7/9] arm64: Topology, rename cluster_id

2018-01-04 Thread Jeremy Linton


Hi,

On 01/03/2018 09:59 PM, Xiongfeng Wang wrote:



On 2018/1/4 1:32, Jeremy Linton wrote:

Hi,

On 01/03/2018 08:29 AM, Sudeep Holla wrote:


On 02/01/18 02:29, Xiongfeng Wang wrote:

Hi,

On 2017/12/18 20:42, Morten Rasmussen wrote:

On Fri, Dec 15, 2017 at 10:36:35AM -0600, Jeremy Linton wrote:

Hi,

On 12/13/2017 12:02 PM, Lorenzo Pieralisi wrote:

[+Morten, Dietmar]

$SUBJECT should be:

arm64: topology: rename cluster_id



[cut]



I think we still need the information describing which cores are in one
cluster. Many arm64 chips have the architecture core/cluster/socket. Cores
in one cluster may share a same L2 cache. That information can be used to
build the sched_domain. If we put cores in one cluster in one sched_domain,
the performance will be better.(please see kernel/sched/topology.c:1197,
cpu_coregroup_mask() uses 'core_sibling' to build a multi-core
sched_domain).


We get all the cache information from DT/ACPI PPTT(mainly topology) and now
even the geometry. So ideally, the sharing information must come from that.
Any other solution might end up in conflict if DT/PPTT and that mismatch.


So I think we still need variable to record which cores are in one
sched_domain for future use.


I tend to say no, at-least not as is.



Well, either way, with DynamiQ (and a55/a75) the cores have private L2's, which 
means that the cluster sharing is happening at what is then the L3 level. So, 
the code I had in earlier versions would have needed tweaks to deal with that 
anyway.

IMHO, if we want to detect this kind of sharing for future scheduling domains, 
it should probably be done independent of PPTT/DT/MIPDR by picking out shared 
cache levels from struct cacheinfo *. Which makes that change unrelated to the 
basic population of cacheinfo and cpu_topology in this patchset.


I think we need to build scheduling domains not only on the cache-sharing 
information,
but also some other information, such as which cores use the same cache 
coherent interconnect
(I don't know the detail, I just guess)

I think PPTT is used to report the cores topology, which cores are more related 
to each other.
They may share the same cache, or use the same CCI, or are physically near to 
each other.
I think we should use this information to build  MC(multi-cores) scheduling 
domains.

Or maybe  we can just discard the MC scheduling domain and handle this 
scheduling-domain-building
task to the NUMA subsystem entirely, I don't know if it is proper.



For the immediate future what I would like is a way to identify where in 
the PPTT topology the NUMA domains begin (rather than assuming socket, 
which is the current plan). That allows the manufactures of systems 
(with say say MCM based topologies) to dictate at which level in the 
cpu/cache topology they want to start describing the topology with the 
SLIT/SRAT tables. I think that moves us in the direction you are 
indicating while still leaving the door open for something like a 
cluster level scheduling domain (based on cores sharing caches) or a 
split LLC domain (also based on cores sharing caches) that happens to be 
on die...

Re: [PATCH v2 0/8] ARM: sun9i: SMP support with Multi-Cluster Power Management

2018-01-04 Thread Lorenzo Pieralisi

On Thu, Jan 04, 2018 at 03:58:38PM +0100, Maxime Ripard wrote:
> On Thu, Jan 04, 2018 at 10:37:46PM +0800, Chen-Yu Tsai wrote:
> > This is v2 of my sun9i SMP support with MCPM series which was started
> > over two years ago [1]. We've tried to implement PSCI for both the A80
> > and A83T. Results were not promising. The issue is that these two chips
> > have a broken security extensions implementation. If a specific bit is
> > not burned in its e-fuse, most if not all security protections don't
> > work [2]. Even worse, non-secure access to the GIC become secure. This
> > requires a crazy workaround in the GIC driver which probably doesn't work
> > in all cases [3].
> > 
> > Nicolas mentioned that the MCPM framework is likely overkill in our
> > case [4]. However the framework does provide cluster/core state tracking
> > and proper sequencing of cache related operations. We could rework
> > the code to use standard smp_ops, but I would like to actually get
> > a working version in first.
> > 
> > Much of the sunxi-specific MCPM code is derived from Allwinner code and
> > documentation, with some references to the other MCPM implementations,
> > as well as the Cortex's Technical Reference Manuals for the power
> > sequencing info.
> > 
> > One major difference compared to other platforms is we currently do not
> > have a standalone PMU or other embedded firmware to do the actually power
> > sequencing. All power/reset control is done by the kernel. Nicolas
> > mentioned that a new optional callback should be added in cases where the
> > kernel has to do the actual power down [5]. For now however I'm using a
> > dedicated single thread workqueue. CPU and cluster power off work is
> > queued from the .{cpu,cluster}_powerdown_prepare callbacks. This solution
> > is somewhat heavy, as I have a total of 10 static work structs. It might
> > also be a bit racy, as nothing prevents the system from bringing a core
> > back before the asynchronous work shuts it down. This would likely
> > happen under a heavily loaded system with a scheduler that brings cores
> > in and out of the system frequently. In simple use-cases it performs OK.
> 
> It all looks sane to me
> Acked-by: Maxime Ripard 

It does not to me, sorry. You do not need MCPM (and workqueues) to
do SMP bring-up.

Nico explained why, just do it:

commit 905cdf9dda5d ("ARM: hisi/hip04: remove the MCPM overhead")

Lorenzo

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread Linus Torvalds

David,
 these are all marked as spam, because your emails have screwed up
DKIM. You used

From: David Woodhouse 

but then you used infradead as a mailer, so it has the DKIM signature
from infradead, not from Amazon.co.uk.

The DKIM signature does pass for infradead, but amazon dmarc - quite
reasonably - wants the from to match.

End result:

   dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=QUARANTINE)
header.from=amazon.co.uk

and everything was in spam.

Please don't do this. There's enough spam in the world that we don't
need people mis-configuring their emails and making real emails look
like spam too.

   Linus

On Thu, Jan 4, 2018 at 6:36 AM, David Woodhouse  wrote:
> Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
> the corresponding thunks. Provide assembler macros for invoking the thunks
> in the same way that GCC does, from native and inline assembler.

Re: Avoid speculative indirect calls in kernel

2018-01-04 Thread Andrea Arcangeli

Hello,

On Thu, Jan 04, 2018 at 06:15:01PM +0100, Paolo Bonzini wrote:
> On 04/01/2018 18:13, Dave Hansen wrote:
> > On 01/04/2018 08:25 AM, Andrea Arcangeli wrote:
> >> It's only where SPEC_CTRL is missing and only IBPB_SUPPORT is
> >> available, that ibrs 0 ibpb 2 is the only option to fix variant#2 for
> >> good.
> > 
> > Could you help us decode what "ibrs 0 ibpb 2" means to you?
> 
> IBRS 0 = disabled
> IBRS 1 = only kernel sets IBRS=1
> IBRS 2 = indirect branch prediction fully disabled, or do the right
> thing on future processors
> 
> IBPB 0 = disabled
> IBPB 1 = on context switch
> IBPB 2 = on every kernel or hypervisor entry

Yes.

ibrs 0 ibpb 2 kernel entry and vmexit.

ibpb 2 if set, is forcing ibrs to 0 (it's sharing the same branch in
the kernel entry points and it wouldn't make sense anyway to enable
ibrs with ibpb 2).

ibrs 0 ibpb 2 is only ever activated if SPEC_CTRL is missing but
IBPB_SUPPORT is present and it does the same as stuff_RSB, imagine it
like a stuff_IBP where stuff_RSB is already called.

Re: [PATCH]cpuidle: preventive check in cpuidle_select against crash

2018-01-04 Thread gaurav jindal

On Wed, Jan 03, 2018 at 12:16:26PM +0100, Rafael J. Wysocki wrote:
> On Friday, December 29, 2017 7:45:22 PM CET gaurav jindal wrote:
> > On Wed, Dec 27, 2017 at 03:30:02AM +0100, Rafael J. Wysocki wrote:
> > > On Wed, Dec 27, 2017 at 2:57 AM, gaurav jindal
> > >  wrote:
> > > > On Wed, Dec 27, 2017 at 01:42:58AM +0100, Rafael J. Wysocki wrote:
> > > >> On Tue, Dec 26, 2017 at 8:26 AM, gaurav jindal
> > > >>  wrote:
> > > >> > When selecting the idle state using cpuidle_select, there is no
> > > >> > check on cpuidle_curr_governor. In cpuidle_switch_governor,
> > > >> > cpuidle_currr_governor can be set to NULL to specify "disabled".
> > > >>
> > > >> How exactly?
> > > >
> > > > In cpuidle_switch_governor:
> > > >
> > > > /**
> > > >  * cpuidle_switch_governor - changes the governor
> > > >  * @gov: the new target governor
> > > >  *
> > > >  * NOTE: "gov" can be NULL to specify disabled
> > > >  * Must be called with cpuidle_lock acquired.
> > > >  */
> > > > int cpuidle_switch_governor(struct cpuidle_governor *gov)
> > > > {
> > > > struct cpuidle_device *dev;
> > > >
> > > > if (gov == cpuidle_curr_governor)
> > > > return 0;
> > > >
> > > > cpuidle_uninstall_idle_handler();
> > > >
> > > > if (cpuidle_curr_governor) {
> > > > list_for_each_entry(dev, &cpuidle_detected_devices, 
> > > > device_list)
> > > > cpuidle_disable_device(dev);
> > > > }
> > > >
> > > > cpuidle_curr_governor = gov;
> > > >
> > > > This allows to set the cpuidle_switch_governor as NULL. Although there 
> > > > is no
> > > > current code flow leading here, but it has a potential for bug in 
> > > > future. So
> > > > may be better to have prevention.
> > > 
> > > Or maybe not.
> > > 
> > > Why don't you make cpuidle_switch_governor() check the argument
> > > against NULL instead?
> > 
> > If we check gov (argument passed in  cpuidle_switch_governor())against
> > NULL in cpuidle_switch_governor, can be a problem in a case where it 
> > is called as
> > cpuidle_switch_governor(NULL);
> > 
> > If cpuidle_curr_governor is not NULL, first the device is disabled.
> > 
> > if (cpuidle_curr_governor) {
> > list_for_each_entry(dev, &cpuidle_detected_devices, device_list)
> > cpuidle_disable_device(dev);
> > }
> > 
> > after this cpuidle_curr_governor is set to gov, which is NULL in this case.
> > 
> > cpuidle_curr_governor = gov;
> > /* if is not updated by inserting a check, it will have an oudated value*/
> > 
> > Now, if gov is not NULL (which it is in this case), cpuidle device is 
> > enabled
> > 
> > if (gov) {
> > list_for_each_entry(dev, &cpuidle_detected_devices, device_list)
> > cpuidle_enable_device(dev);
> > cpuidle_install_idle_handler();
> > printk(KERN_INFO "cpuidle: using governor %s\n", gov->name);
> > }
> > If we check for gov against NULL in this function, it will produce
> > dangling pointers and resource leaks.
> 
> I didn't recommend you to introduce bugs.
> 
I did not intend to do so. I am really sorry it got expressed in that way :(.
> Just return -EINVAL if gov is NULL before checking if gov is equal to
> cpuidle_curr_governor.
> 
Okay 
> Thanks,
> Rafael
> 

this patch checks if the new governor is NULL before updating the
cupidle_curr_governor.

Signed-off-by: gaurav jindal

---

diff --git a/drivers/cpuidle/governor.c b/drivers/cpuidle/governor.c
index 4e78263..5d359af 100644
--- a/drivers/cpuidle/governor.c
+++ b/drivers/cpuidle/governor.c
@@ -36,14 +36,15 @@ static struct cpuidle_governor * 
__cpuidle_find_governor(const char *str)
 /**
  * cpuidle_switch_governor - changes the governor
  * @gov: the new target governor
- *
- * NOTE: "gov" can be NULL to specify disabled
  * Must be called with cpuidle_lock acquired.
  */
 int cpuidle_switch_governor(struct cpuidle_governor *gov)
 {
struct cpuidle_device *dev;
 
+   if (!gov)
+   return -EINVAL;
+
if (gov == cpuidle_curr_governor)
return 0;

Re: general protection fault in __netlink_ns_capable

2018-01-04 Thread Andrei Vagin

On Thu, Jan 04, 2018 at 01:01:17PM +0100, Dmitry Vyukov wrote:
> On Wed, Jan 3, 2018 at 8:37 AM, Andrei Vagin  wrote:
> >> > Hello,
> >> >
> >> > syzkaller hit the following crash on
> >> > 75aa5540627fdb3d8f86229776ea87f995275351
> >> > git://git.cmpxchg.org/linux-mmots.git/master
> >> > compiler: gcc (GCC) 7.1.1 20170620
> >> > .config is attached
> >> > Raw console output is attached.
> >> > C reproducer is attached
> >> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> >> > for information about syzkaller reproducers
> >> >
> >> >
> >> > IMPORTANT: if you fix the bug, please add the following tag to the 
> >> > commit:
> >> > Reported-by: syzbot+e432865c29eb4c48c...@syzkaller.appspotmail.com
> >> > It will help syzbot understand when the bug is fixed. See footer for
> >> > details.
> >> > If you forward the report, please keep this part and the footer.
> >> >
> >> > netlink: 3 bytes leftover after parsing attributes in process
> >> > `syzkaller140561'.
> >> > netlink: 3 bytes leftover after parsing attributes in process
> >> > `syzkaller140561'.
> >> > netlink: 3 bytes leftover after parsing attributes in process
> >> > `syzkaller140561'.
> >> > kasan: CONFIG_KASAN_INLINE enabled
> >> > kasan: GPF could be caused by NULL-ptr deref or user memory access
> >> > general protection fault:  [#1] SMP KASAN
> >> > Dumping ftrace buffer:
> >> >(ftrace buffer empty)
> >> > Modules linked in:
> >> > CPU: 1 PID: 3149 Comm: syzkaller140561 Not tainted 4.15.0-rc4-mm1+ #47
> >> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> >> > Google 01/01/2011
> >> > RIP: 0010:__netlink_ns_capable+0x8b/0x120 net/netlink/af_netlink.c:868
> >>
> >> NETLINK_CB(skb).sk is NULL here. It looks like we have to use
> >> sk_ns_capable instead of netlink_ns_capable:
> >>
> >> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> >> index c688dc564b11..408c75de52ea 100644
> >> --- a/net/core/rtnetlink.c
> >> +++ b/net/core/rtnetlink.c
> >> @@ -1762,7 +1762,7 @@ static struct net *get_target_net(struct sk_buff
> >> *skb, int netnsid)
> >> /* For now, the caller is required to have CAP_NET_ADMIN in
> >>  * the user namespace owning the target net ns.
> >>  */
> >> -   if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) {
> >> +   if (!sk_ns_capable(skb->sk, net->user_ns, CAP_NET_ADMIN)) {
> >> put_net(net);
> >> return ERR_PTR(-EACCES);
> >> }
> >>
> >
> > get_target_net() is used twice in the code. In rtnl_getlink(), we need
> > to use netlink_ns_capable(skb, ...), but in rtnl_dump_ifinfo, we need to
> > use sk_ns_capable(skb->sk, ...).
> >
> > Pls, take a look at this patch:
> > https://patchwork.ozlabs.org/patch/854896/
> > Subject: rtnetlink: give a user socket to get_target_net()
> 
> 
> Please include this tag into the commit:
> 

I sent v2 with this tag. Sorry for inconvenience.
https://patchwork.ozlabs.org/patch/855147/

> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+e432865c29eb4c48c...@syzkaller.appspotmail.com
> > > It will help syzbot understand when the bug is fixed.

PTI build regression with nvidia drivers

2018-01-04 Thread Kees Cook

Hi,

This was pointed out in a few places, but not forwarded to lkml yet that I saw:

https://devtalk.nvidia.com/default/topic/1028222/linux/lts-kernel-patch-for-intel-cpu-vulnerability-breaks-nvidia-driver/post/5230546

Before and after PTI, cpu_tlbstate is a GPL export:

$ git show v4.14:arch/x86/mm/init.c | grep cpu_tlbstate
DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate) = {
EXPORT_SYMBOL_GPL(cpu_tlbstate);

But after PTI, inlining or something is dragging cpu_tlbstate into the
open, causing build failures.

Technically, to avoid regressions for that module, we'll need to drop
the GPL marking on that symbol, or find some other solution...

-Kees

-- 
Kees Cook
Pixel Security

Re: [PATCH v3 01/13] x86/retpoline: Add initial retpoline support

2018-01-04 Thread Alexei Starovoitov

On Thu, Jan 04, 2018 at 02:36:58PM +, David Woodhouse wrote:
> Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
> the corresponding thunks. Provide assembler macros for invoking the thunks
> in the same way that GCC does, from native and inline assembler.
> 
> This adds an X86_BUG_NO_RETPOLINE "feature" for runtime patching out
> of the thunks. This is a placeholder for now; the patches which support
> the new Intel/AMD microcode features will flesh out the precise conditions
> under which we disable the retpoline and do other things instead.
> 
> [Andi Kleen: Rename the macros and add CONFIG_RETPOLINE option]
> 
> Signed-off-by: David Woodhouse 
...
> +.macro THUNK sp reg
> + .section .text.__x86.indirect_thunk.\reg
> +
> +ENTRY(__x86.indirect_thunk.\reg)
> + CFI_STARTPROC
> + ALTERNATIVE "call 2f", __stringify(jmp *%\reg), X86_BUG_NO_RETPOLINE
> +1:
> + lfence
> + jmp 1b
> +2:
> + mov %\reg, (%\sp)
> + ret
> + CFI_ENDPROC
> +ENDPROC(__x86.indirect_thunk.\reg)

Clearly Paul's approach to retpoline without lfence is faster.
I'm guessing it wasn't shared with amazon/intel until now and
this set of patches going to adopt it, right?

Paul, could you share a link to a set of alternative gcc patches
that do retpoline similar to llvm diff ?

[PATCH 7/7] x86/microcode: Recheck IBRS features on microcode reload

2018-01-04 Thread Tim Chen

On new microcode write, check whether IBRS
is present by rescanning scattered CPU features.

Signed-off-by: Tim Chen 
---
 arch/x86/kernel/cpu/microcode/core.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/core.c 
b/arch/x86/kernel/cpu/microcode/core.c
index c4fa4a8..44b9355 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -40,6 +40,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define DRIVER_VERSION "2.2"
 
@@ -444,6 +445,11 @@ static ssize_t microcode_write(struct file *file, const 
char __user *buf,
if (ret > 0)
perf_check_microcode();
 
+   /* check spec_ctrl capabilities */
+   mutex_lock(&spec_ctrl_mutex);
+   init_scattered_cpuid_features(&boot_cpu_data);
+   mutex_unlock(&spec_ctrl_mutex);
+
mutex_unlock(µcode_mutex);
put_online_cpus();
 
-- 
2.9.4

[PATCH 0/7] IBRS patch series

2018-01-04 Thread Tim Chen

This patch series enables the basic detection and usage of x86 indirect
branch speculation feature.  It enables the indirect branch restricted
speculation (IBRS) on kernel entry and disables it on exit.
It enumerates the indirect branch prediction barrier (IBPB).

The x86 IBRS feature requires corresponding microcode support.
It mitigates the variant 2 vulnerability described in
https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

If IBRS is set, near returns and near indirect jumps/calls will not
allow their predicted target address to be controlled by code that
executed in a less privileged prediction mode before the IBRS mode was
last written with a value of 1 or on another logical processor so long
as all RSB entries from the previous less privileged prediction mode
are overwritten.

Setting of IBPB ensures that earlier code's behavior does not control later
indirect branch predictions.  It is used when context switching to new
untrusted address space. Unlike IBRS, IBPB is a command MSR
and does not retain its state.

Speculation on Skylake and later requires these patches ("dynamic IBRS")
be used instead of retpoline[1].  If you are very paranoid or you run on
a CPU where IBRS=1 is cheaper, you may also want to run in "IBRS always"
mode.

See: 
https://docs.google.com/document/d/e/2PACX-1vSMrwkaoSUBAFc6Fjd19F18c1O9pudkfAY-7lGYGOTN8mc9ul-J6pWadcAaBJZcVA7W_3jlLKRtKRbd/pub

More detailed description of IBRS is described in the first patch.

It is applied on top of the page table isolation changes.

A run time and boot time control of the IBRS feature is provided

There are 2 ways to control IBRS

1. At boot time
noibrs kernel boot parameter will disable IBRS usage

Otherwise if the above parameters are not specified, the system
will enable ibrs and ibpb usage if the cpu supports it.

2. At run time
echo 0 > /sys/kernel/debug/ibrs_enabled will turn off IBRS
echo 1 > /sys/kernel/debug/ibrs_enabled will turn on IBRS in kernel
echo 2 > /sys/kernel/debug/ibrs_enabled will turn on IBRS in both userspace 
and kernel (IBRS always)

[1] https://lkml.org/lkml/2018/1/4/174

Tim Chen (7):
  x86/feature: Detect the x86 feature to control Speculation
  x86/enter: MACROS to set/clear IBRS
  x86/enter: Use IBRS on syscall and interrupts
  x86/idle: Disable IBRS entering idle and enable it on wakeup
  x86: Use IBRS for firmware update path
  x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature
  x86/microcode: Recheck IBRS features on microcode reload

 Documentation/admin-guide/kernel-parameters.txt |   4 +
 arch/x86/entry/entry_64.S   |  24 +++
 arch/x86/entry/entry_64_compat.S|   9 +
 arch/x86/include/asm/apm.h  |   6 +
 arch/x86/include/asm/cpufeatures.h  |   1 +
 arch/x86/include/asm/efi.h  |  16 +-
 arch/x86/include/asm/msr-index.h|   7 +
 arch/x86/include/asm/mwait.h|  19 ++
 arch/x86/include/asm/spec_ctrl.h| 253 
 arch/x86/kernel/cpu/Makefile|   1 +
 arch/x86/kernel/cpu/microcode/core.c|   6 +
 arch/x86/kernel/cpu/scattered.c |  11 ++
 arch/x86/kernel/cpu/spec_ctrl.c | 124 
 arch/x86/kernel/process.c   |   9 +-
 14 files changed, 486 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/include/asm/spec_ctrl.h
 create mode 100644 arch/x86/kernel/cpu/spec_ctrl.c

-- 
2.9.4

[PATCH 5/7] x86: Use IBRS for firmware update path

2018-01-04 Thread Tim Chen

From: David Woodhouse 

We are impervious to the indirect branch prediction attack with retpoline
but firmware won't be, so we still need to set IBRS to protect
firmware code execution when calling into firmware at runtime.

Signed-off-by: David Woodhouse 
Signed-off-by: Tim Chen 
---
 arch/x86/include/asm/apm.h   |  6 ++
 arch/x86/include/asm/efi.h   | 16 ++--
 arch/x86/include/asm/spec_ctrl.h | 37 +
 3 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/apm.h b/arch/x86/include/asm/apm.h
index 4d4015d..1ca4f7b 100644
--- a/arch/x86/include/asm/apm.h
+++ b/arch/x86/include/asm/apm.h
@@ -7,6 +7,8 @@
 #ifndef _ASM_X86_MACH_DEFAULT_APM_H
 #define _ASM_X86_MACH_DEFAULT_APM_H
 
+#include 
+
 #ifdef APM_ZERO_SEGS
 #  define APM_DO_ZERO_SEGS \
"pushl %%ds\n\t" \
@@ -28,6 +30,7 @@ static inline void apm_bios_call_asm(u32 func, u32 ebx_in, 
u32 ecx_in,
u32 *eax, u32 *ebx, u32 *ecx,
u32 *edx, u32 *esi)
 {
+   unprotected_firmware_begin();
/*
 * N.B. We do NOT need a cld after the BIOS call
 * because we always save and restore the flags.
@@ -44,6 +47,7 @@ static inline void apm_bios_call_asm(u32 func, u32 ebx_in, 
u32 ecx_in,
  "=S" (*esi)
: "a" (func), "b" (ebx_in), "c" (ecx_in)
: "memory", "cc");
+   unprotected_formware_end();
 }
 
 static inline bool apm_bios_call_simple_asm(u32 func, u32 ebx_in,
@@ -52,6 +56,7 @@ static inline bool apm_bios_call_simple_asm(u32 func, u32 
ebx_in,
int cx, dx, si;
boolerror;
 
+   unprotected_firmware_begin();
/*
 * N.B. We do NOT need a cld after the BIOS call
 * because we always save and restore the flags.
@@ -68,6 +73,7 @@ static inline bool apm_bios_call_simple_asm(u32 func, u32 
ebx_in,
  "=S" (si)
: "a" (func), "b" (ebx_in), "c" (ecx_in)
: "memory", "cc");
+   unprotected_formware_end();
return error;
 }
 
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 85f6ccb..25bd506 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * We map the EFI regions needed for runtime services non-contiguously,
@@ -36,8 +37,17 @@
 
 extern asmlinkage unsigned long efi_call_phys(void *, ...);
 
-#define arch_efi_call_virt_setup() kernel_fpu_begin()
-#define arch_efi_call_virt_teardown()  kernel_fpu_end()
+#define arch_efi_call_virt_setup() \
+{( \
+   kernel_fpu_begin(); \
+   unprotected_firmware_begin();   \
+)}
+
+#define arch_efi_call_virt_teardown()  \
+{( \
+   unprotected_firmware_end(); \
+   kernel_fpu_end();   \
+)}
 
 /*
  * Wrap all the virtual calls in a way that forces the parameters on the stack.
@@ -73,6 +83,7 @@ struct efi_scratch {
efi_sync_low_kernel_mappings(); \
preempt_disable();  \
__kernel_fpu_begin();   \
+   unprotected_firmware_begin();   \
\
if (efi_scratch.use_pgd) {  \
efi_scratch.prev_cr3 = __read_cr3();\
@@ -91,6 +102,7 @@ struct efi_scratch {
__flush_tlb_all();  \
}   \
\
+   unprotected_firmware_end(); \
__kernel_fpu_end(); \
preempt_enable();   \
 })
diff --git a/arch/x86/include/asm/spec_ctrl.h b/arch/x86/include/asm/spec_ctrl.h
index 28b0314..23b2804 100644
--- a/arch/x86/include/asm/spec_ctrl.h
+++ b/arch/x86/include/asm/spec_ctrl.h
@@ -113,5 +113,42 @@ static inline void unprotected_speculation_end(void)
rmb();
 }
 
+
+#if defined(RETPOLINE)
+/*
+ * RETPOLINE does not protect against indirect speculation
+ * in firmware code.  Enable IBRS to protect firmware execution.
+ */
+static inline void unprotected_firmware_begin(void)
+{
+   if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+__disable_indirect_speculation();

[PATCH 6/7] x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL feature

2018-01-04 Thread Tim Chen

There are 2 ways to control IBRS

1. At boot time
noibrs kernel boot parameter will disable IBRS usage

Otherwise if the above parameters are not specified, the system
will enable ibrs and ibpb usage if the cpu supports it.

2. At run time
echo 0 > /sys/kernel/debug/ibrs_enabled will turn off IBRS
echo 1 > /sys/kernel/debug/ibrs_enabled will turn on IBRS in kernel
echo 2 > /sys/kernel/debug/ibrs_enabled will turn on IBRS in both userspace 
and kernel

The implementation was updated with input from Andrea Arcangeli.

Signed-off-by: Tim Chen 
---
 Documentation/admin-guide/kernel-parameters.txt |   4 +
 arch/x86/include/asm/spec_ctrl.h| 163 +++-
 arch/x86/kernel/cpu/Makefile|   1 +
 arch/x86/kernel/cpu/scattered.c |  10 ++
 arch/x86/kernel/cpu/spec_ctrl.c | 124 ++
 5 files changed, 270 insertions(+), 32 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/spec_ctrl.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 5dfd262..d64f49f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2573,6 +2573,10 @@
noexec=on: enable non-executable mappings (default)
noexec=off: disable non-executable mappings
 
+   noibrs  [X86]
+   Don't use indirect branch restricted speculation (IBRS)
+   feature.
+
nosmap  [X86]
Disable SMAP (Supervisor Mode Access Prevention)
even if it is supported by processor.
diff --git a/arch/x86/include/asm/spec_ctrl.h b/arch/x86/include/asm/spec_ctrl.h
index 23b2804..2c35571 100644
--- a/arch/x86/include/asm/spec_ctrl.h
+++ b/arch/x86/include/asm/spec_ctrl.h
@@ -1,13 +1,17 @@
 #ifndef _ASM_X86_SPEC_CTRL_H
 #define _ASM_X86_SPEC_CTRL_H
 
-#include 
 #include 
 #include 
-#include 
+
+#define SPEC_CTRL_IBRS_INUSE   (1<<0)  /* OS enables IBRS usage */
+#define SPEC_CTRL_IBRS_SUPPORTED   (1<<1)  /* System supports IBRS */
+#define SPEC_CTRL_IBRS_ADMIN_DISABLED  (1<<2)  /* Admin disables IBRS */
 
 #ifdef __ASSEMBLY__
 
+.extern spec_ctrl_ibrs
+
 .macro PUSH_MSR_REGS
pushq %rax
pushq %rcx
@@ -27,35 +31,63 @@
 .endm
 
 .macro ENABLE_IBRS
-   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   testl $SPEC_CTRL_IBRS_INUSE, spec_ctrl_ibrs
+   jz .Lskip_\@
+
PUSH_MSR_REGS
WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_ENABLE_IBRS
POP_MSR_REGS
-10:
+
+   jmp .Ldone_\@
+.Lskip_\@:
+   /*
+* prevent speculation beyond here as we could want to
+* stop speculation by enabling IBRS
+*/
+   lfence
+.Ldone_\@:
 .endm
 
 .macro DISABLE_IBRS
-   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   testl $SPEC_CTRL_IBRS_INUSE, spec_ctrl_ibrs
+   jz .Lskip_\@
+
PUSH_MSR_REGS
WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_DISABLE_IBRS
POP_MSR_REGS
-10:
+
+.Lskip_\@:
 .endm
 
 .macro ENABLE_IBRS_CLOBBER
-   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   testl $SPEC_CTRL_IBRS_INUSE, spec_ctrl_ibrs
+   jz .Lskip_\@
+
WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_ENABLE_IBRS
-10:
+
+   jmp .Ldone_\@
+.Lskip_\@:
+   /*
+* prevent speculation beyond here as we could want to
+* stop speculation by enabling IBRS
+*/
+   lfence
+.Ldone_\@:
 .endm
 
 .macro DISABLE_IBRS_CLOBBER
-   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   testl $SPEC_CTRL_IBRS_INUSE, spec_ctrl_ibrs
+   jz .Lskip_\@
+
WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_DISABLE_IBRS
-10:
+
+.Lskip_\@:
 .endm
 
 .macro ENABLE_IBRS_SAVE_AND_CLOBBER save_reg:req
-   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   testl $SPEC_CTRL_IBRS_INUSE, spec_ctrl_ibrs
+   jz .Lskip_\@
+
movl$MSR_IA32_SPEC_CTRL, %ecx
rdmsr
movl%eax, \save_reg
@@ -63,22 +95,103 @@
movl$0, %edx
movl$SPEC_CTRL_FEATURE_ENABLE_IBRS, %eax
wrmsr
-10:
+
+   jmp .Ldone_\@
+.Lskip_\@:
+   /*
+* prevent speculation beyond here as we could want to
+* stop speculation by enabling IBRS
+*/
+   lfence
+.Ldone_\@:
 .endm
 
 .macro RESTORE_IBRS_CLOBBER save_reg:req
-   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   testl $SPEC_CTRL_IBRS_INUSE, spec_ctrl_ibrs
+   jz .Lskip_\@
+
/* Set IBRS to the value saved in the save_reg */
movl$MSR_IA32_SPEC_CTRL, %ecx
 movl$0, %edx
movl\save_reg, %eax
wrmsr
-10:
+
+   jmp .Ldone_\@
+.Lskip_\@:
+   /*
+* prevent speculation beyond here as we could want to
+* stop speculation by enabling IBRS
+*/
+   lfence
+

[PATCH 2/7] x86/enter: MACROS to set/clear IBRS

2018-01-04 Thread Tim Chen

Create macros to control IBRS.  Use these macros to enable IBRS on kernel entry
paths and disable IBRS on kernel exit paths.

The registers rax, rcx and rdx are touched when controlling IBRS
so they need to be saved when they can't be clobbered.

Signed-off-by: Tim Chen 
---
 arch/x86/include/asm/spec_ctrl.h | 80 
 1 file changed, 80 insertions(+)
 create mode 100644 arch/x86/include/asm/spec_ctrl.h

diff --git a/arch/x86/include/asm/spec_ctrl.h b/arch/x86/include/asm/spec_ctrl.h
new file mode 100644
index 000..16fc4f58
--- /dev/null
+++ b/arch/x86/include/asm/spec_ctrl.h
@@ -0,0 +1,80 @@
+#ifndef _ASM_X86_SPEC_CTRL_H
+#define _ASM_X86_SPEC_CTRL_H
+
+#include 
+#include 
+#include 
+#include 
+
+#ifdef __ASSEMBLY__
+
+.macro PUSH_MSR_REGS
+   pushq %rax
+   pushq %rcx
+   pushq %rdx
+.endm
+
+.macro POP_MSR_REGS
+   popq %rdx
+   popq %rcx
+   popq %rax;
+.endm
+
+.macro WRMSR_ASM msr_nr:req eax_val:req
+   movl \msr_nr, %ecx
+   movl $0, %edx
+   movl \eax_val, %eax
+.endm
+
+.macro ENABLE_IBRS
+   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   PUSH_MSR_REGS
+   WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_ENABLE_IBRS
+   POP_MSR_REGS
+10:
+.endm
+
+.macro DISABLE_IBRS
+   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   PUSH_MSR_REGS
+   WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_DISABLE_IBRS
+   POP_MSR_REGS
+10:
+.endm
+
+.macro ENABLE_IBRS_CLOBBER
+   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_ENABLE_IBRS
+10:
+.endm
+
+.macro DISABLE_IBRS_CLOBBER
+   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   WRMSR_ASM $MSR_IA32_SPEC_CTRL, $SPEC_CTRL_FEATURE_DISABLE_IBRS
+10:
+.endm
+
+.macro ENABLE_IBRS_SAVE_AND_CLOBBER save_reg:req
+   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   movl$MSR_IA32_SPEC_CTRL, %ecx
+   rdmsr
+   movl%eax, \save_reg
+
+   movl$0, %edx
+   movl$SPEC_CTRL_FEATURE_ENABLE_IBRS, %eax
+   wrmsr
+10:
+.endm
+
+.macro RESTORE_IBRS_CLOBBER save_reg:req
+   ALTERNATIVE "jmp 10f", "", X86_FEATURE_SPEC_CTRL
+   /* Set IBRS to the value saved in the save_reg */
+   movl$MSR_IA32_SPEC_CTRL, %ecx
+movl$0, %edx
+   movl\save_reg, %eax
+   wrmsr
+10:
+.endm
+
+#endif /* __ASSEMBLY__ */
+#endif /* _ASM_X86_SPEC_CTRL_H */
-- 
2.9.4

[PATCH 3/7] x86/enter: Use IBRS on syscall and interrupts

2018-01-04 Thread Tim Chen

Set IBRS upon kernel entrance via syscall and interrupts. Clear it
upon exit.

If NMI runs when exiting kernel between IBRS_DISABLE and
SWAPGS, the NMI would have turned on IBRS bit 0 and then it would have
left enabled when exiting the NMI. IBRS bit 0 would then be left
enabled in userland until the next enter kernel.

That is a minor inefficiency only, but we can eliminate it by saving
the MSR when entering the NMI in save_paranoid and restoring it when
exiting the NMI.

Signed-off-by: Andrea Arcangeli 
Signed-off-by: Tim Chen 
---
 arch/x86/entry/entry_64.S| 24 
 arch/x86/entry/entry_64_compat.S |  9 +
 2 files changed, 33 insertions(+)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3f72f5c..0c4d542 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "calling.h"
@@ -170,6 +171,8 @@ ENTRY(entry_SYSCALL_64_trampoline)
 
/* Load the top of the task stack into RSP */
movqCPU_ENTRY_AREA_tss + TSS_sp1 + CPU_ENTRY_AREA, %rsp
+   /* Stack is usable, use the non-clobbering IBRS enable: */
+   ENABLE_IBRS
 
/* Start building the simulated IRET frame. */
pushq   $__USER_DS  /* pt_regs->ss */
@@ -213,6 +216,8 @@ ENTRY(entry_SYSCALL_64)
 * is not required to switch CR3.
 */
movqPER_CPU_VAR(cpu_current_top_of_stack), %rsp
+   /* Stack is usable, use the non-clobbering IBRS enable: */
+   ENABLE_IBRS
 
TRACE_IRQS_OFF
 
@@ -407,6 +412,7 @@ syscall_return_via_sysret:
 * We are on the trampoline stack.  All regs except RDI are live.
 * We can do future final exit work right here.
 */
+   DISABLE_IBRS
SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
popq%rdi
@@ -745,6 +751,7 @@ GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 * We can do future final exit work right here.
 */
 
+   DISABLE_IBRS
SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
/* Restore RDI. */
@@ -832,6 +839,14 @@ native_irq_return_ldt:
SWAPGS  /* to kernel GS */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi   /* to kernel CR3 */
 
+   /*
+* Normally we enable IBRS when we switch to kernel's CR3.
+* But we are going to switch back to user CR3 immediately
+* in this routine after fixing ESPFIX stack.  There is
+* no vulnerable code branching for IBRS to protect.
+* We don't toggle IBRS to avoid the cost of two MSR writes.
+*/
+
movqPER_CPU_VAR(espfix_waddr), %rdi
movq%rax, (0*8)(%rdi)   /* user RAX */
movq(1*8)(%rsp), %rax   /* user RIP */
@@ -965,6 +980,8 @@ ENTRY(switch_to_thread_stack)
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
movq%rsp, %rdi
movqPER_CPU_VAR(cpu_current_top_of_stack), %rsp
+   /* Stack is usable, use the non-clobbering IBRS enable: */
+   ENABLE_IBRS
UNWIND_HINT sp_offset=16 sp_reg=ORC_REG_DI
 
pushq   7*8(%rdi)   /* regs->ss */
@@ -1265,6 +1282,7 @@ ENTRY(paranoid_entry)
 
 1:
SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
+   ENABLE_IBRS_SAVE_AND_CLOBBER save_reg=%r13d
 
ret
 END(paranoid_entry)
@@ -1288,6 +1306,7 @@ ENTRY(paranoid_exit)
testl   %ebx, %ebx  /* swapgs needed? */
jnz .Lparanoid_exit_no_swapgs
TRACE_IRQS_IRETQ
+   RESTORE_IBRS_CLOBBER save_reg=%r13d
RESTORE_CR3 scratch_reg=%rbx save_reg=%r14
SWAPGS_UNSAFE_STACK
jmp .Lparanoid_exit_restore
@@ -1318,6 +1337,7 @@ ENTRY(error_entry)
SWAPGS
/* We have user CR3.  Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+   ENABLE_IBRS_CLOBBER
 
 .Lerror_entry_from_usermode_after_swapgs:
/* Put us onto the real thread stack. */
@@ -1365,6 +1385,7 @@ ENTRY(error_entry)
 */
SWAPGS
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+   ENABLE_IBRS_CLOBBER
jmp .Lerror_entry_done
 
 .Lbstep_iret:
@@ -1379,6 +1400,7 @@ ENTRY(error_entry)
 */
SWAPGS
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+   ENABLE_IBRS
 
/*
 * Pretend that the exception came from user mode: set up pt_regs
@@ -1480,6 +1502,7 @@ ENTRY(nmi)
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
movq%rsp, %rdx
movqPER_CPU_VAR(cpu_current_top_of_stack), %rsp
+   ENABLE_IBRS
UNWIND_HINT_IRET_REGS base=%rdx offset=8
pushq   5*8(%rdx)   /* pt_regs->ss */
pushq   4*8(%rdx)   /* pt_regs->rsp */
@@ -1730,6 +1753,7 @@ end_repeat_nmi:
movq$-1, %rsi
calldo_nmi
 
+   RESTORE_IBRS_CLOBBER save_reg=%r13d
RESTORE_CR3 scratch_reg=%r15 save_reg=%r14
 
testl   %

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 966 matches

Mail list logo