Re: [PATCH v3 1/5] perf tools: Add all matching dynamic sort keys for field name

2016-01-06 Thread Namhyung Kim
Hi Arnaldo and Jiri,

On Wed, Jan 06, 2016 at 01:29:48PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Jan 06, 2016 at 12:19:39PM +0100, Jiri Olsa escreveu:
> > On Wed, Jan 06, 2016 at 09:54:57AM +0900, Namhyung Kim wrote:
> > > When a perf.data file has multiple events, they are likely to be similar
> > > (tracepoint) events.  In that case, they might share a field name, so
> > > add all of them to the sort keys instead of bailing out.
> > > 
> > > In addition, this contains a trivial whitespace fix at the callsite of
> > > add_all_dynamic_fields().
> > > 
> > > Acked-by: Jiri Olsa 
> > > Signed-off-by: Namhyung Kim 
> > 
> > hum, I haven't tried with this last version but I get all
> > hist tests failing on acme's perf/core and it seems to be
> > related to sorting changes:
> 
> Bisected it down to
> 
> 0337e6473845 ("perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events")
> 
> Fixed with the following patch, which I'm folding into the above commit,
> thanks for the report!

Ouch, thank you for the fix! :)

Thanks,
Namhyung


> 
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index acd222907bd6..4b4b1c5cccef 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -2187,6 +2187,9 @@ static const char *get_default_sort_order(struct perf_evlist *evlist)
>  
>  	BUG_ON(sort__mode >= ARRAY_SIZE(default_sort_orders));
>  
> +	if (evlist == NULL)
> +		goto out_no_evlist;
> +
>  	evlist__for_each(evlist, evsel) {
>  		if (evsel->attr.type != PERF_TYPE_TRACEPOINT) {
>  			use_trace = false;
> @@ -2199,7 +2202,7 @@ static const char *get_default_sort_order(struct perf_evlist *evlist)
>  		if (symbol_conf.raw_trace)
>  			return "trace_fields";
>  	}
> -
> +out_no_evlist:
>  	return default_sort_orders[sort__mode];
>  }
>  
>  
> > [jolsa@krava perf]$ ./perf test hist 
> > 15: Test matching and linking multiple hists : FAILED!
> > 25: Test filtering hist entries  : FAILED!
> > 28: Test output sorting of hist entries  : FAILED!
> > 29: Test cumulation of child hist entries: FAILED!
> > 
> > 
> > [jolsa@krava perf]$ ./perf test 15 -v
> > 15: Test matching and linking multiple hists :
> > --- start ---
> > test child forked, pid 10676
> > perf: Segmentation fault
> > Obtained 16 stack frames.
> > ./perf(dump_stack+0x2d) [0x50f1d7]
> > ./perf(sighandler_dump_stack+0x2d) [0x50f2b7]
> > /lib64/libc.so.6(+0x34a4f) [0x7fb0b1178a4f]
> > ./perf() [0x508c40]
> > ./perf() [0x508e23]
> > ./perf(setup_sorting+0x26) [0x5097b0]
> > ./perf(test__hists_link+0xb6) [0x487f43]
> > ./perf() [0x4725c0]
> > ./perf() [0x4726ff]
> > ./perf() [0x472986]
> > ./perf(cmd_test+0x1fe) [0x472ddc]
> > ./perf() [0x49b2ab]
> > ./perf() [0x49b513]
> > ./perf() [0x49b661]
> > ./perf(main+0x258) [0x49b9e2]
> > /lib64/libc.so.6(__libc_start_main+0xef) [0x7fb0b11646ff]
> > test child interrupted
> > ---- end ----
> > Test matching and linking multiple hists: FAILED!
> > 
> > 
> > thanks,
> > jirka
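
The fix quoted above boils down to returning the generic default before the event list is ever walked. A minimal sketch of that pattern follows, assuming cut-down stand-in types: `sketch_evsel`, `sketch_evlist` and the `"comm,dso,symbol"` fallback string are illustrative placeholders, not perf's real structures or tables.

```c
#include <stddef.h>

/* Hypothetical stand-ins for perf's evsel/evlist; not the real structs. */
struct sketch_evsel {
	int is_tracepoint;
	struct sketch_evsel *next;
};

struct sketch_evlist {
	struct sketch_evsel *first;
};

/*
 * Pick a default sort order.  A NULL evlist (as a test harness might
 * pass) must fall through to the generic default instead of being
 * dereferenced by the loop.
 */
static const char *sketch_default_sort_order(const struct sketch_evlist *evlist,
					     int raw_trace)
{
	const struct sketch_evsel *evsel;

	if (evlist == NULL)
		goto out_no_evlist;

	/* Any non-tracepoint event disables the trace-based default. */
	for (evsel = evlist->first; evsel; evsel = evsel->next) {
		if (!evsel->is_tracepoint)
			goto out_no_evlist;
	}

	return raw_trace ? "trace_fields" : "trace";

out_no_evlist:
	/* Placeholder for default_sort_orders[sort__mode]. */
	return "comm,dso,symbol";
}
```

Calling `sketch_default_sort_order(NULL, 0)` returns the fallback string rather than crashing, which is the behaviour the hist tests rely on in the backtrace above.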
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Fix documentation for adp1653 DT

2016-01-06 Thread Rob Herring
On Sat, Dec 26, 2015 at 12:37:16AM +0100, Pali Rohár wrote:
> The documented property names do not match the names the driver actually expects.
> This patch fixes this problem.
> 
> Signed-off-by: Pali Rohár 

Applied, thanks.

Rob

> ---
>  .../devicetree/bindings/media/i2c/adp1653.txt  |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/media/i2c/adp1653.txt b/Documentation/devicetree/bindings/media/i2c/adp1653.txt
> index 5ce66f2..4cce0de 100644
> --- a/Documentation/devicetree/bindings/media/i2c/adp1653.txt
> +++ b/Documentation/devicetree/bindings/media/i2c/adp1653.txt
> @@ -12,12 +12,13 @@ There are two LED outputs available - flash and indicator. One LED is
>  represented by one child node, nodes need to be named "flash" and "indicator".
>  
>  Required properties of the LED child node:
> -- max-microamp : see Documentation/devicetree/bindings/leds/common.txt
> +- led-max-microamp : see Documentation/devicetree/bindings/leds/common.txt
>  
>  Required properties of the flash LED child node:
>  
>  - flash-max-microamp : see Documentation/devicetree/bindings/leds/common.txt
>  - flash-timeout-us : see Documentation/devicetree/bindings/leds/common.txt
> +- led-max-microamp : see Documentation/devicetree/bindings/leds/common.txt
>  
>  Example:
>  
> @@ -29,9 +30,9 @@ Example:
>   flash {
>   flash-timeout-us = <500000>;
>   flash-max-microamp = <320000>;
> - max-microamp = <50000>;
> + led-max-microamp = <50000>;
>   };
>   indicator {
> - max-microamp = <17500>;
> + led-max-microamp = <17500>;
>   };
>   };
> -- 
> 1.7.9.5
> 


[PATCH v3] ARM: fix atags_to_fdt with stack-protector-strong

2016-01-06 Thread Kees Cook
Building with CONFIG_CC_STACKPROTECTOR_STRONG triggers protection code
generation under CONFIG_ARM_ATAG_DTB_COMPAT, but this code runs too early
to use any of the stack_chk code. Explicitly disable the stack protector
for only the atags_to_fdt bits.

Suggested-by: zhxihu 
Signed-off-by: Kees Cook 
---
v3:
- actually send to everyone correctly
v2:
- use call cc-option unconditionally, arnd
---
 arch/arm/boot/compressed/Makefile | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
index 3f9a9ebc77c3..d7d2c2981f65 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -106,6 +106,14 @@ ORIG_CFLAGS := $(KBUILD_CFLAGS)
 KBUILD_CFLAGS = $(subst -pg, , $(ORIG_CFLAGS))
 endif
 
+# -fstack-protector-strong triggers protection checks in this code,
+# but it is being used too early to link to meaningful stack_chk logic.
+CFLAGS_atags_to_fdt.o := $(call cc-option, -fno-stack-protector)
+CFLAGS_fdt.o := $(call cc-option, -fno-stack-protector)
+CFLAGS_fdt_ro.o := $(call cc-option, -fno-stack-protector)
+CFLAGS_fdt_rw.o := $(call cc-option, -fno-stack-protector)
+CFLAGS_fdt_wip.o := $(call cc-option, -fno-stack-protector)
+
 ccflags-y := -fpic -mno-single-pic-base -fno-builtin -I$(obj)
 asflags-y := -DZIMAGE
 
-- 
2.6.3


-- 
Kees Cook
Chrome OS & Brillo Security


Re: [PATCH] Input: ALPS - Detect trackstick presence for v7 protocol

2016-01-06 Thread Dmitry Torokhov
On Wed, Jan 06, 2016 at 09:12:30AM +0100, Hans de Goede wrote:
> Hi,
> 
> On 05-01-16 17:44, Pali Rohár wrote:
> >On Sunday 22 March 2015 14:46:11 Pali Rohár wrote:
> >>This patch adds detection of trackstick for v7 protocol devices. Code in this
> >>patch is used in official Dell touchpad linux drivers for Dell models:
> >>Dell Latitude E5250/5250, E5450/5450, E5550/5550
> >>
> >>Detection code and base reg for alps v3 rushmore and v7 devices are exactly
> >>the same.
> >>
> >>Also, a user in bug https://bugzilla.kernel.org/show_bug.cgi?id=94801 reported
> >>that the Toshiba Satellite Z30-A-1DG has only an alps v7 touchpad device without
> >>a trackstick, yet the kernel also reports a redundant trackstick device to userspace.
> >>
> >>Signed-off-by: Pali Rohár 
> >>---
> >
> >Hello!
> >
> >Alex now tested this patch on two Dell machines with ALPS: E5450 (with
> >TrackStick) and E5250 (without TrackStick).
> >
> >With the patch, nothing changed for the E5450, and the E5250 no longer
> >shows a trackstick input device.
> >
> >Tested-by: Alex Hung 
> 
> With that this patch looks good to me:
> 
> Reviewed-by: Hans de Goede 
> 
> Pali, it is probably a good idea to send Dmitry a v2 of these 2
> patches rebased on top of the latest next and with Alex' Tested-By
> and my Reviewed-by-s added.

It's alright, I added the tags and applied.

-- 
Dmitry


weird DirectMap2M accounting.

2016-01-06 Thread Dave Jones
I just spotted this in /proc/meminfo on an old Core2 machine with 4G.

DirectMap2M:18446744073709543424 kB

Looks like we subtracted 8192 from 0 somewhere.

Should split_page_count() be checking that direct_pages_count > 0 ?

Dave
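
The reported figure is exactly 2^64 - 8192, which is what an unsigned 64-bit counter shows after being decremented past zero. The sketch below reproduces the arithmetic under one assumption that the message does not state: that the counter is kept in units of 2M pages and scaled to kB (a shift by 11) when printed. All `sketch_` names are hypothetical, and the guarded variant is only one possible shape for the check being asked about.

```c
#include <stdint.h>

/* One 2 MiB page is 2048 kB, i.e. a left shift by 11. */
#define SKETCH_2M_SHIFT_KB 11

/* Unguarded decrement: unsigned arithmetic wraps past zero. */
static uint64_t sketch_split_unguarded(uint64_t count, uint64_t pages)
{
	return count - pages;
}

/* Guarded decrement: clamp at zero (a real fix might warn instead). */
static uint64_t sketch_split_guarded(uint64_t count, uint64_t pages)
{
	return count >= pages ? count - pages : 0;
}

/* Convert a count of 2M pages into the kB value that gets printed. */
static uint64_t sketch_pages_to_kb(uint64_t count)
{
	return count << SKETCH_2M_SHIFT_KB;
}
```

With a starting count of 0 and four pages subtracted, `sketch_pages_to_kb(sketch_split_unguarded(0, 4))` yields 18446744073709543424, matching the DirectMap2M line, while the guarded variant yields 0.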



[PATCH 09/31] x86, pkeys: store protection in high VMA flags

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

vma->vm_flags is an 'unsigned long', so has space for 32 flags
on 32-bit architectures.  The high 32 bits are unused on 64-bit
platforms.  We've steered away from using the unused high VMA
bits for things because we would have difficulty supporting it
on 32-bit.

Protection Keys are not available in 32-bit mode, so there is
no concern about supporting this feature in 32-bit mode or on
32-bit CPUs.

This patch carves out 4 bits from the high half of
vma->vm_flags and allows architectures to set a config option
to make them available.

Sparse complains about these constants unless we explicitly
call them "UL".
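
The reason the constants must be wider than `int` is that bits 32 to 35 do not exist in a 32-bit value, so the mask has to be built in a 64-bit type. A minimal userspace sketch follows, using `uint64_t` in place of the kernel's `unsigned long`; the `sketch_` helpers, and the idea of packing a 4-bit value into the four bits, are illustrative assumptions rather than code from this patch.

```c
#include <stdint.h>

/* First of the four high bit positions named in the patch. */
#define SKETCH_HIGH_ARCH_BIT_0 32

/* Build a single-bit mask in a 64-bit type so the shift is well defined. */
static uint64_t sketch_bit64(unsigned int nr)
{
	return (uint64_t)1 << nr;
}

/* Store a 4-bit value in bits 32..35, leaving all other flags untouched. */
static uint64_t sketch_set_key(uint64_t flags, unsigned int key)
{
	uint64_t mask = (uint64_t)0xf << SKETCH_HIGH_ARCH_BIT_0;

	return (flags & ~mask) |
	       ((uint64_t)(key & 0xf) << SKETCH_HIGH_ARCH_BIT_0);
}

/* Read the 4-bit value back out of bits 32..35. */
static unsigned int sketch_get_key(uint64_t flags)
{
	return (unsigned int)((flags >> SKETCH_HIGH_ARCH_BIT_0) & 0xf);
}
```

Writing the shift as `1 << 32` with a plain `int` would be undefined behaviour in C, which is the class of problem the explicit "UL" suffix avoids on 64-bit builds.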

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/Kconfig   |    1 +
 b/include/linux/mm.h |   11 +++++++++++
 b/mm/Kconfig         |    3 +++
 3 files changed, 15 insertions(+)

diff -puN arch/x86/Kconfig~pkeys-06-eat-high-vma-flags arch/x86/Kconfig
--- a/arch/x86/Kconfig~pkeys-06-eat-high-vma-flags	2016-01-06 15:50:06.481195258 -0800
+++ b/arch/x86/Kconfig	2016-01-06 15:50:06.488195574 -0800
@@ -152,6 +152,7 @@ config X86
 	select VIRT_TO_BUS
 	select X86_DEV_DMA_OPS			if X86_64
 	select X86_FEATURE_NAMES		if PROC_FS
+	select ARCH_USES_HIGH_VMA_FLAGS		if X86_INTEL_MEMORY_PROTECTION_KEYS
 
 config INSTRUCTION_DECODER
def_bool y
diff -puN include/linux/mm.h~pkeys-06-eat-high-vma-flags include/linux/mm.h
--- a/include/linux/mm.h~pkeys-06-eat-high-vma-flags	2016-01-06 15:50:06.482195303 -0800
+++ b/include/linux/mm.h	2016-01-06 15:50:06.489195619 -0800
@@ -158,6 +158,17 @@ extern unsigned int kobjsize(const void
 #define VM_NOHUGEPAGE	0x40000000	/* MADV_NOHUGEPAGE marked this vma */
 #define VM_MERGEABLE	0x80000000	/* KSM may merge identical pages */
 
+#ifdef CONFIG_ARCH_USES_HIGH_VMA_FLAGS
+#define VM_HIGH_ARCH_BIT_0	32	/* bit only usable on 64-bit architectures */
+#define VM_HIGH_ARCH_BIT_1	33	/* bit only usable on 64-bit architectures */
+#define VM_HIGH_ARCH_BIT_2	34	/* bit only usable on 64-bit architectures */
+#define VM_HIGH_ARCH_BIT_3	35	/* bit only usable on 64-bit architectures */
+#define VM_HIGH_ARCH_0	BIT(VM_HIGH_ARCH_BIT_0)
+#define VM_HIGH_ARCH_1	BIT(VM_HIGH_ARCH_BIT_1)
+#define VM_HIGH_ARCH_2	BIT(VM_HIGH_ARCH_BIT_2)
+#define VM_HIGH_ARCH_3	BIT(VM_HIGH_ARCH_BIT_3)
+#endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */
+
 #if defined(CONFIG_X86)
 # define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
 #elif defined(CONFIG_PPC)
diff -puN mm/Kconfig~pkeys-06-eat-high-vma-flags mm/Kconfig
--- a/mm/Kconfig~pkeys-06-eat-high-vma-flags	2016-01-06 15:50:06.484195393 -0800
+++ b/mm/Kconfig2016-01-06 15:50:06.489195619 -0800
@@ -668,3 +668,6 @@ config ZONE_DEVICE
 
 config FRAME_VECTOR
bool
+
+config ARCH_USES_HIGH_VMA_FLAGS
+   bool
_


[PATCH 12/31] signals, pkeys: notify userspace about protection key faults

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

A protection key fault is very similar to any other access error.
There must be a VMA, etc...  We even want to take the same action
(SIGSEGV) that we do with a normal access fault.

However, we do need to let userspace know that something is
different.  We do this the same way we did with SEGV_BNDERR
with Memory Protection eXtensions (MPX): define a new SEGV code:
SEGV_PKUERR.

We add a siginfo field: si_pkey that reveals to userspace which
protection key was set on the PTE that we faulted on.  There is
no other easy way for userspace to figure this out.  They could
parse smaps but that would be a bit cruel.

We share space in siginfo with _addr_bnd.  #BR faults from
MPX are completely separate from page faults (#PF) that trigger
from protection key violations, so we never need both at the same
time.

Note that _pkey is a 64-bit value.  The current hardware only
supports 4-bit protection keys.  We do this because there is
_plenty_ of space in _sigfault and it is possible that future
processors would support more than 4 bits of protection keys.
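
The space sharing described above can be pictured as a union whose active member is selected by the signal code. The sketch below uses a cut-down, hypothetical payload (`sketch_sigfault`) rather than the real siginfo layout, and the enum values mirror only the low bits of the SEGV codes in this patch.

```c
#include <stdint.h>

/* Low bits of the two si_code values that select a union member. */
enum sketch_segv_code {
	SKETCH_SEGV_BNDERR = 3,
	SKETCH_SEGV_PKUERR = 4,
};

/* Hypothetical cut-down fault payload; not the real siginfo layout. */
struct sketch_sigfault {
	void *addr;
	union {
		struct {
			void *lower;
			void *upper;
		} addr_bnd;	/* valid when code == SKETCH_SEGV_BNDERR */
		uint64_t pkey;	/* valid when code == SKETCH_SEGV_PKUERR */
	} u;
};

/* Return the protection key for a pkey fault, or -1 for any other code. */
static int64_t sketch_fault_pkey(enum sketch_segv_code code,
				 const struct sketch_sigfault *f)
{
	if (code != SKETCH_SEGV_PKUERR)
		return -1;
	return (int64_t)f->u.pkey;
}
```

Because a bounds fault and a protection key fault never describe the same event, a consumer only ever reads the member that matches the code it was handed, and the structure does not grow.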

The x86 code to actually fill in the siginfo is in the next
patch.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/include/uapi/asm-generic/siginfo.h |   17 ++++++++++++-----
 b/kernel/signal.c                    |    4 ++++
 2 files changed, 16 insertions(+), 5 deletions(-)

diff -puN include/uapi/asm-generic/siginfo.h~pkeys-09-siginfo-core include/uapi/asm-generic/siginfo.h
--- a/include/uapi/asm-generic/siginfo.h~pkeys-09-siginfo-core	2016-01-06 15:50:07.838256440 -0800
+++ b/include/uapi/asm-generic/siginfo.h	2016-01-06 15:50:07.843256665 -0800
@@ -91,10 +91,15 @@ typedef struct siginfo {
int _trapno;/* TRAP # which caused the signal */
 #endif
short _addr_lsb; /* LSB of the reported address */
-   struct {
-   void __user *_lower;
-   void __user *_upper;
-   } _addr_bnd;
+   union {
+   /* used when si_code=SEGV_BNDERR */
+   struct {
+   void __user *_lower;
+   void __user *_upper;
+   } _addr_bnd;
+   /* used when si_code=SEGV_PKUERR */
+   u64 _pkey;
+   };
} _sigfault;
 
/* SIGPOLL */
@@ -137,6 +142,7 @@ typedef struct siginfo {
 #define si_addr_lsb	_sifields._sigfault._addr_lsb
 #define si_lower	_sifields._sigfault._addr_bnd._lower
 #define si_upper	_sifields._sigfault._addr_bnd._upper
+#define si_pkey		_sifields._sigfault._pkey
 #define si_band		_sifields._sigpoll._band
 #define si_fd		_sifields._sigpoll._fd
 #ifdef __ARCH_SIGSYS
@@ -206,7 +212,8 @@ typedef struct siginfo {
 #define SEGV_MAPERR	(__SI_FAULT|1)	/* address not mapped to object */
 #define SEGV_ACCERR	(__SI_FAULT|2)	/* invalid permissions for mapped object */
 #define SEGV_BNDERR	(__SI_FAULT|3)	/* failed address bound checks */
-#define NSIGSEGV	3
+#define SEGV_PKUERR	(__SI_FAULT|4)	/* failed protection key checks */
+#define NSIGSEGV	4
 
 /*
  * SIGBUS si_codes
diff -puN kernel/signal.c~pkeys-09-siginfo-core kernel/signal.c
--- a/kernel/signal.c~pkeys-09-siginfo-core	2016-01-06 15:50:07.840256530 -0800
+++ b/kernel/signal.c	2016-01-06 15:50:07.844256710 -0800
@@ -2709,6 +2709,10 @@ int copy_siginfo_to_user(siginfo_t __use
 			err |= __put_user(from->si_upper, &to->si_upper);
 		}
 #endif
+#ifdef SEGV_PKUERR
+		if (from->si_signo == SIGSEGV && from->si_code == SEGV_PKUERR)
+			err |= __put_user(from->si_pkey, &to->si_pkey);
+#endif
 		break;
 	case __SI_CHLD:
 		err |= __put_user(from->si_pid, &to->si_pid);
_


Re: [PATCH] Input: ALPS - Report v3 pinnacle trackstick device only if is present

2016-01-06 Thread Dmitry Torokhov
On Tue, Jan 05, 2016 at 05:54:19PM +0100, Pali Rohár wrote:
> On Monday 23 March 2015 12:42:25 Hans de Goede wrote:
> > Hi,
> > 
> > On 22-03-15 14:47, Pali Rohár wrote:
> > > >This patch moves the v3 pinnacle code for trackstick detection from alps_hw_init_v3()
> > > >to alps_set_protocol() so the ALPS_DUALPOINT flag can be cleared before registering
> > > >the trackstick input device in the kernel.
> > >
> > >Signed-off-by: Pali Rohár 
> > 
> > Looks good:
> > 
> > Acked-by: Hans de Goede 
> > 
> > Regards,
> > 
> > Hans
> > 
> 
> Hi Hans! I would like to remind you of this patch, as it is still pending here on the ML.

Applied, thank you.

> 
> > 
> > >---
> > > >  drivers/input/mouse/alps.c |   12 +++++++-----
> > >  1 file changed, 7 insertions(+), 5 deletions(-)
> > >
> > >diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
> > >index c9cd27a..d24e98d 100644
> > >--- a/drivers/input/mouse/alps.c
> > >+++ b/drivers/input/mouse/alps.c
> > >@@ -1877,15 +1877,12 @@ error:
> > >
> > >  static int alps_hw_init_v3(struct psmouse *psmouse)
> > >  {
> > >+  struct alps_data *priv = psmouse->private;
> > > >   struct ps2dev *ps2dev = &psmouse->ps2dev;
> > >   int reg_val;
> > >   unsigned char param[4];
> > >
> > >-  reg_val = alps_probe_trackstick_v3_v7(psmouse, ALPS_REG_BASE_PINNACLE);
> > >-  if (reg_val == -EIO)
> > >-  goto error;
> > >-
> > >-  if (reg_val == 0 &&
> > >+  if ((priv->flags & ALPS_DUALPOINT) &&
> > >   alps_setup_trackstick_v3(psmouse, ALPS_REG_BASE_PINNACLE) == -EIO)
> > >   goto error;
> > >
> > >@@ -2249,6 +2246,11 @@ static int alps_set_protocol(struct psmouse 
> > >*psmouse,
> > >   priv->decode_fields = alps_decode_pinnacle;
> > >   priv->nibble_commands = alps_v3_nibble_commands;
> > >   priv->addr_command = PSMOUSE_CMD_RESET_WRAP;
> > >+
> > >+  if (alps_probe_trackstick_v3_v7(psmouse,
> > >+  ALPS_REG_BASE_PINNACLE) < 0)
> > >+  priv->flags &= ~ALPS_DUALPOINT;
> > >+
> > >   break;
> > >
> > >   case ALPS_PROTO_V3_RUSHMORE:
> > >
> 
> -- 
> Pali Rohár
> pali.ro...@gmail.com

-- 
Dmitry


[PATCH 01/31] mm, gup: introduce concept of "foreign" get_user_pages()

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

For protection keys, we need to understand whether protections
should be enforced in software or not.  In general, we enforce
protections when working on our own task, but not when on others.
We call these "current" and "foreign" operations.

This introduces two new get_user_pages() variants:

get_current_user_pages()
get_foreign_user_pages()

get_current_user_pages() is a drop-in replacement for when
get_user_pages() was called with (current, current->mm, ...) as
arguments.  Using it makes a few of the call sites look a bit
nicer.

get_foreign_user_pages() is a replacement for when
get_user_pages() is called on non-current tsk/mm.

We leave a stub get_user_pages() around with a __deprecated
warning.

This also effectively turns get_user_pages_unlocked() into
get_user_pages_unlocked_current() since it no longer gets a
tsk/mm passed in.  I thought that would be too long of a name if
we added "_current" on there.  BTW, if someone wants the
get_user_pages_unlocked() behavior with a non-current tsk/mm,
they just have to use __get_user_pages_unlocked() directly.

Signed-off-by: Dave Hansen 
Cc: Andrew Morton 
Cc: Kirill A. Shutemov 
Cc: Andrea Arcangeli 
Cc: Naoya Horiguchi 
Cc: vba...@suse.cz
---

 b/arch/mips/mm/gup.c  |3 -
 b/arch/s390/mm/gup.c  |4 --
 b/arch/sh/mm/gup.c|2 -
 b/arch/sparc/mm/gup.c |2 -
 b/arch/x86/mm/gup.c   |2 -
 b/arch/x86/mm/mpx.c   |4 +-
 b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c |4 +-
 b/drivers/gpu/drm/i915/i915_gem_userptr.c |2 -
 b/drivers/gpu/drm/radeon/radeon_ttm.c |4 +-
 b/drivers/gpu/drm/via/via_dmablit.c   |3 -
 b/drivers/infiniband/core/umem.c  |2 -
 b/drivers/infiniband/core/umem_odp.c  |8 ++--
 b/drivers/infiniband/hw/mthca/mthca_memfree.c |3 -
 b/drivers/infiniband/hw/qib/qib_user_pages.c  |3 -
 b/drivers/infiniband/hw/usnic/usnic_uiom.c|2 -
 b/drivers/media/pci/ivtv/ivtv-udma.c  |4 +-
 b/drivers/media/pci/ivtv/ivtv-yuv.c   |   10 ++---
 b/drivers/media/v4l2-core/videobuf-dma-sg.c   |3 -
 b/drivers/misc/sgi-gru/grufault.c |3 -
 b/drivers/scsi/st.c   |2 -
 b/drivers/video/fbdev/pvr2fb.c|4 +-
 b/drivers/virt/fsl_hypervisor.c   |5 +-
 b/fs/exec.c   |8 +++-
 b/include/linux/mm.h  |   39 +--
 b/kernel/events/uprobes.c |4 +-
 b/mm/frame_vector.c   |2 -
 b/mm/gup.c|   51 --
 b/mm/memory.c |2 -
 b/mm/mempolicy.c  |6 +--
 b/mm/nommu.c  |   34 ++---
 b/mm/process_vm_access.c  |6 ++-
 b/mm/util.c   |4 --
 b/net/ceph/pagevec.c  |2 -
 b/security/tomoyo/domain.c|9 
 b/virt/kvm/async_pf.c |2 -
 b/virt/kvm/kvm_main.c |   13 +++---
 36 files changed, 147 insertions(+), 114 deletions(-)

diff -puN arch/mips/mm/gup.c~get_current_user_pages arch/mips/mm/gup.c
--- a/arch/mips/mm/gup.c~get_current_user_pages	2016-01-06 15:50:02.181001390 -0800
+++ b/arch/mips/mm/gup.c2016-01-06 15:50:02.243004185 -0800
@@ -301,8 +301,7 @@ slow_irqon:
start += nr << PAGE_SHIFT;
pages += nr;
 
-   ret = get_user_pages_unlocked(current, mm, start,
- (end - start) >> PAGE_SHIFT,
+   ret = get_user_pages_unlocked(start, (end - start) >> PAGE_SHIFT,
  write, 0, pages);
 
/* Have to be a bit careful with return values */
diff -puN arch/s390/mm/gup.c~get_current_user_pages arch/s390/mm/gup.c
--- a/arch/s390/mm/gup.c~get_current_user_pages	2016-01-06 15:50:02.183001480 -0800
+++ b/arch/s390/mm/gup.c2016-01-06 15:50:02.243004185 -0800
@@ -230,7 +230,6 @@ int __get_user_pages_fast(unsigned long
 int get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages)
 {
-   struct mm_struct *mm = current->mm;
int nr, ret;
 
start &= PAGE_MASK;
@@ -241,8 +240,7 @@ int get_user_pages_fast(unsigned long st
/* Try to get the remaining pages with get_user_pages */
start += nr << PAGE_SHIFT;
pages += nr;
-   ret = get_user_pages_unlocked(current, mm, start,
-nr_pages - nr, write, 

[PATCH 00/31] x86: Memory Protection Keys (v8)

2016-01-06 Thread Dave Hansen
Memory Protection Keys for User pages is a CPU feature which will
first appear on Skylake Servers, but will also be supported on
future non-server parts (there is also a QEMU implementation).  It
provides a mechanism for enforcing page-based protections, but
without requiring modification of the page tables when an
application changes protection domains.

This set introduces support limited to:
1. Allowing "execute-only" memory
2. Enabling KVM to run Protection-Key-enabled guests

My preference would be to merge this part by itself (presumably
for 4.6, *not* 4.5).  This set contains the vast majority of
the code, with the small but tricky explicit user interface
parts left off.  We can have a more focused review on those at
a later time in a (much smaller) follow-on series.

Changes from v7:
 * Fixed merge issue with cpu feature bitmap definitions
 * Fixed up some comments in get_user_pages() and smaps patches
   (thanks Vlastimil!)

Changes from v6:
 * fix up ??'s showing up in smaps' VmFlags field
 * added execute-only support
 * removed all the new syscalls from this set.  We can discuss
   them in detail after this is merged.

Changes from v5:

 * make types in read_pkru() u32's, not ints
 * rework VM_* bits to avoid using __ffsl() and clean up
   vma_pkey()
 * rework pte_allows_gup() to use p??_val() instead of passing
   around p{te,md,ud}_t types.
 * Fix up some inconsistent bool vs. int usage
 * corrected name of ARCH_VM_PKEY_FLAGS in patch description
 * remove NR_PKEYS... config option.  Just define it directly

Changes from v4:

 * Made "allow setting of XSAVE state" safe if we got preempted
   between when we saved our FPU state and when we restore it.
   (I would appreciate a look from Ingo on this patch).
 * Fixed up a few things from Thomas's latest comments: split up
   siginfo in to x86 and generic, removed extra 'eax' variable
   in rdpkru function, reworked vm_flags assignment, reworded
   a comment in pte_allows_gup()
 * Add missing DISABLED/REQUIRED_MASK14 in cpufeature.h
 * Added comment about compile optimization in fault path
 * Left get_user_pages_locked() alone.  Andrea thinks we need it.

Changes from RFCv3:

 * Added 'current' and 'foreign' variants of get_user_pages() to
   help indicate whether protection keys should be enforced.
   Thanks to Jerome Glisse for pointing out this issue.
 * Added "allocation" and set/get system calls so that we can do
   management of protection keys in the kernel.  This opens the
   door to use of specific protection keys for kernel use in the
   future, such as for execute-only memory.
 * Removed the kselftest code for the moment.  It will be
   submitted separately.

Changes from RFCv2 (Thanks Ingo and Thomas for most of these):

 * few minor compile warnings
 * changed 'nopku' interaction with cpuid bits.  Now, we do not
   clear the PKU cpuid bit, we just skip enabling it.
 * changed __pkru_allows_write() to also check access disable bit
 * removed the unused write_pkru()
 * made si_pkey a u64 and added some patch description details.
   Also made it share space in siginfo with MPX and clarified
   comments.
 * give some real text for the Processor Trace xsave state
 * made vma_pkey() less ugly (and much more optimized actually)
 * added SEGV_PKUERR to copy_siginfo_to_user()
 * remove page table walk when filling in si_pkey, added some
   big fat comments about it being inherently racy.
 * added self test code

This code is not runnable to anyone outside of Intel unless they
have some special hardware or a fancy simulator.  There is a qemu
model to emulate the feature, but it is not currently implemented
fully enough to be usable.  If you are interested in running this
for real, please get in touch with me.  Hardware is available to a
very small but nonzero number of people.

This set is also available here:

git://git.kernel.org/pub/scm/linux/kernel/git/daveh/x86-pkeys.git pkeys-v019

=== diffstat ===

Dave Hansen (31):
  mm, gup: introduce concept of "foreign" get_user_pages()
  x86, fpu: add placeholder for Processor Trace XSAVE state
  x86, pkeys: Add Kconfig option
  x86, pkeys: cpuid bit definition
  x86, pkeys: define new CR4 bit
  x86, pkeys: add PKRU xsave fields and data structure(s)
  x86, pkeys: PTE bits for storing protection key
  x86, pkeys: new page fault error code bit: PF_PK
  x86, pkeys: store protection in high VMA flags
  x86, pkeys: arch-specific protection bits
  x86, pkeys: pass VMA down in to fault signal generation code
  signals, pkeys: notify userspace about protection key faults
  x86, pkeys: fill in pkey field in siginfo
  x86, pkeys: add functions to fetch PKRU
  mm: factor out VMA fault permission checking
  x86, mm: simplify get_user_pages() PTE bit handling
  x86, pkeys: check VMAs and PTEs for protection keys
  mm: add gup flag to indicate "foreign" mm access
  x86, pkeys: 

[PATCH 04/31] x86, pkeys: cpuid bit definition

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

There are two CPUID bits for protection keys.  One is for whether
the CPU contains the feature, and the other will appear set once
the OS enables protection keys.  Specifically:

Bit 04: OSPKE. If 1, OS has set CR4.PKE to enable
Protection keys (and the RDPKRU/WRPKRU instructions)

This is because userspace can not see CR4 contents, but it can
see CPUID contents.

X86_FEATURE_PKU is referred to as "PKU" in the hardware documentation:

CPUID.(EAX=07H,ECX=0H):ECX.PKU [bit 3]

X86_FEATURE_OSPKE is "OSPKE":

CPUID.(EAX=07H,ECX=0H):ECX.OSPKE [bit 4]

These are the first CPU features which need to look at the
ECX word in CPUID leaf 0x7, so this patch also includes
fetching that word into the cpuinfo->x86_capability[] array.

Add it to the disabled-features mask when its config option is
off.  Even though we are not using it here, we also extend the
REQUIRED_MASK_BIT_SET() macro to keep it mirroring the
DISABLED_MASK_BIT_SET() version.

This means that in almost all code, you should use:

cpu_has(c, X86_FEATURE_PKU)

and *not* the CONFIG option.
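
The two CPUID bits and the `word*32+bit` feature numbering can be illustrated with a few pure helpers. This is a sketch under stated assumptions: the caller is assumed to have already read ECX from CPUID leaf 7, subleaf 0, and every `sketch_` name is hypothetical rather than a kernel interface.

```c
#include <stdint.h>

/* Bit positions in CPUID.(EAX=07H,ECX=0H):ECX, as quoted above. */
#define SKETCH_ECX_PKU   (UINT32_C(1) << 3)
#define SKETCH_ECX_OSPKE (UINT32_C(1) << 4)

/* The CPU implements protection keys. */
static int sketch_cpu_has_pku(uint32_t ecx)
{
	return (ecx & SKETCH_ECX_PKU) != 0;
}

/* The OS has set CR4.PKE, so RDPKRU/WRPKRU are usable. */
static int sketch_os_enabled_pkeys(uint32_t ecx)
{
	return (ecx & SKETCH_ECX_OSPKE) != 0;
}

/* Feature numbers encode word*32+bit; recover the capability word. */
static unsigned int sketch_feature_word(unsigned int feature)
{
	return feature >> 5;
}

/* Recover the bit position inside that 32-bit word. */
static unsigned int sketch_feature_bit(unsigned int feature)
{
	return feature & 31;
}
```

An ECX value of 0x8 therefore means the hardware supports the feature but the OS has not enabled it, while 0x18 means both bits are set. The word/bit split is the same `>>5` and `&31` arithmetic used by the mask macros in the diff.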

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/include/asm/cpufeature.h|   56 ++---
 b/arch/x86/include/asm/disabled-features.h |   13 ++
 b/arch/x86/include/asm/required-features.h |5 ++
 b/arch/x86/kernel/cpu/common.c |1 
 4 files changed, 54 insertions(+), 21 deletions(-)

diff -puN arch/x86/include/asm/cpufeature.h~pkeys-01-cpuid arch/x86/include/asm/cpufeature.h
--- a/arch/x86/include/asm/cpufeature.h~pkeys-01-cpuid	2016-01-06 15:50:04.310097377 -0800
+++ b/arch/x86/include/asm/cpufeature.h	2016-01-06 15:50:04.318097738 -0800
@@ -12,7 +12,7 @@
 #include 
 #endif
 
-#define NCAPINTS   14  /* N 32-bit words worth of info */
+#define NCAPINTS   15  /* N 32-bit words worth of info */
 #define NBUGINTS   1   /* N 32-bit bug flags */
 
 /*
@@ -258,6 +258,10 @@
 /* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
 #define X86_FEATURE_CLZERO	(13*32+0) /* CLZERO instruction */
 
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (ecx), word 14 */
+#define X86_FEATURE_PKU		(14*32+ 3) /* Protection Keys for Userspace */
+#define X86_FEATURE_OSPKE	(14*32+ 4) /* OS Protection Keys Enable */
+
+
 /*
  * BUG word(s)
  */
@@ -298,28 +302,38 @@ extern const char * const x86_bug_flags[
 test_bit(bit, (unsigned long *)((c)->x86_capability))
 
 #define REQUIRED_MASK_BIT_SET(bit) \
-( (((bit)>>5)==0 && (1UL<<((bit)&31) & REQUIRED_MASK0)) || \
-  (((bit)>>5)==1 && (1UL<<((bit)&31) & REQUIRED_MASK1)) || \
-  (((bit)>>5)==2 && (1UL<<((bit)&31) & REQUIRED_MASK2)) || \
-  (((bit)>>5)==3 && (1UL<<((bit)&31) & REQUIRED_MASK3)) || \
-  (((bit)>>5)==4 && (1UL<<((bit)&31) & REQUIRED_MASK4)) || \
-  (((bit)>>5)==5 && (1UL<<((bit)&31) & REQUIRED_MASK5)) || \
-  (((bit)>>5)==6 && (1UL<<((bit)&31) & REQUIRED_MASK6)) || \
-  (((bit)>>5)==7 && (1UL<<((bit)&31) & REQUIRED_MASK7)) || \
-  (((bit)>>5)==8 && (1UL<<((bit)&31) & REQUIRED_MASK8)) || \
-  (((bit)>>5)==9 && (1UL<<((bit)&31) & REQUIRED_MASK9)) )
+( (((bit)>>5)==0  && (1UL<<((bit)&31) & REQUIRED_MASK0 )) ||   \
+  (((bit)>>5)==1  && (1UL<<((bit)&31) & REQUIRED_MASK1 )) ||   \
+  (((bit)>>5)==2  && (1UL<<((bit)&31) & REQUIRED_MASK2 )) ||   \
+  (((bit)>>5)==3  && (1UL<<((bit)&31) & REQUIRED_MASK3 )) ||   \
+  (((bit)>>5)==4  && (1UL<<((bit)&31) & REQUIRED_MASK4 )) ||   \
+  (((bit)>>5)==5  && (1UL<<((bit)&31) & REQUIRED_MASK5 )) ||   \
+  (((bit)>>5)==6  && (1UL<<((bit)&31) & REQUIRED_MASK6 )) ||   \
+  (((bit)>>5)==7  && (1UL<<((bit)&31) & REQUIRED_MASK7 )) ||   \
+  (((bit)>>5)==8  && (1UL<<((bit)&31) & REQUIRED_MASK8 )) ||   \
+  (((bit)>>5)==9  && (1UL<<((bit)&31) & REQUIRED_MASK9 )) ||   \
+  (((bit)>>5)==10 && (1UL<<((bit)&31) & REQUIRED_MASK10)) ||   \
+  (((bit)>>5)==11 && (1UL<<((bit)&31) & REQUIRED_MASK11)) ||   \
+  (((bit)>>5)==12 && (1UL<<((bit)&31) & REQUIRED_MASK12)) ||   \
+  (((bit)>>5)==13 && (1UL<<((bit)&31) & REQUIRED_MASK13)) ||   \
+  (((bit)>>5)==14 && (1UL<<((bit)&31) & REQUIRED_MASK14)) )
 
 #define DISABLED_MASK_BIT_SET(bit) \
-( (((bit)>>5)==0 && (1UL<<((bit)&31) & DISABLED_MASK0)) || \
-  (((bit)>>5)==1 && (1UL<<((bit)&31) & DISABLED_MASK1)) || \
-  (((bit)>>5)==2 && (1UL<<((bit)&31) & DISABLED_MASK2)) || \
-  (((bit)>>5)==3 && (1UL<<((bit)&31) & DISABLED_MASK3)) || \
-  (((bit)>>5)==4 && (1UL<<((bit)&31) & DISABLED_MASK4)) || \
-  (((bit)>>5)==5 && 

Re: [PATCH] PCI: iproc: fix msi driver selection

2016-01-06 Thread Bjorn Helgaas
On Fri, Dec 18, 2015 at 03:57:53PM +0100, Arnd Bergmann wrote:
> The newly added MSI support for iproc causes a link error when its
> Kconfig option is disabled:
> 
> ERROR: "iproc_msi_exit" [drivers/pci/host/pcie-iproc.ko] undefined!
> ERROR: "iproc_msi_init" [drivers/pci/host/pcie-iproc.ko] undefined!
> 
> This changes the header file so we use stub functions whenever
> the driver is not built, even when CONFIG_MSI is enabled.
> 
> As the Kconfig logic for the driver is a bit off, I'm rectifying
> that as well, by making it depend on the specific drivers that
> call into the driver, and moving the option behind those instead
> of before them.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: 610894347cbf ("PCI: iproc: Add iProc PCIe MSI support")

Applied with Ray's Reviewed-by to pci/host-iproc for v4.5, thanks!

Actually, since 610894347cbf hasn't been merged upstream yet, I just
squashed this fix into it and updated "pci/host-iproc" and "next".
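
The stub-function approach described in the quoted commit message can be sketched roughly as below. This is only an illustration, not the contents of pcie-iproc.h: CONFIG_DEMO_MSI, struct demo_pcie, demo_msi_init(), demo_msi_exit() and demo_pcie_setup() are hypothetical names, and returning -ENODEV from the stub is an assumption.

```c
#include <errno.h>

/* Hypothetical controller state, standing in for the real driver struct. */
struct demo_pcie {
	int msi_enabled;
};

#ifdef CONFIG_DEMO_MSI
/* Optional driver is built: real implementations live in its own .c file. */
int demo_msi_init(struct demo_pcie *pcie);
void demo_msi_exit(struct demo_pcie *pcie);
#else
/* Optional driver is not built: inline stubs keep every caller linkable. */
static inline int demo_msi_init(struct demo_pcie *pcie)
{
	(void)pcie;
	return -ENODEV;	/* assumption: report "no such device" */
}

static inline void demo_msi_exit(struct demo_pcie *pcie)
{
	(void)pcie;
}
#endif

/* The caller is written once and links in both configurations. */
static int demo_pcie_setup(struct demo_pcie *pcie)
{
	pcie->msi_enabled = (demo_msi_init(pcie) == 0);
	return 0;
}
```

The point of the sketch is that the guard tests the option of the driver that provides the functions, not a broader option that may be enabled while that driver is left out.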

> ---
> Found on ARM randconfig builds a couple of days ago
> 
> diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
> index 490476e172fd..d7c05894af70 100644
> --- a/drivers/pci/host/Kconfig
> +++ b/drivers/pci/host/Kconfig
> @@ -124,15 +124,6 @@ config PCIE_IPROC
> iProc family of SoCs. An appropriate bus interface driver needs
> to be enabled to select this.
>  
> -config PCIE_IPROC_MSI
> - bool "Broadcom iProc PCIe MSI support"
> - depends on ARCH_BCM_IPROC && PCI_MSI
> - select PCI_MSI_IRQ_DOMAIN
> - default ARCH_BCM_IPROC
> - help
> -   Say Y here if you want to enable MSI support for Broadcom's iProc
> -   PCIe controller
> -
>  config PCIE_IPROC_PLATFORM
>   tristate "Broadcom iProc PCIe platform bus driver"
>   depends on ARCH_BCM_IPROC || (ARM && COMPILE_TEST)
> @@ -154,6 +145,16 @@ config PCIE_IPROC_BCMA
> Say Y here if you want to use the Broadcom iProc PCIe controller
> through the BCMA bus interface
>  
> +config PCIE_IPROC_MSI
> + bool "Broadcom iProc PCIe MSI support"
> + depends on PCIE_IPROC_PLATFORM || PCIE_IPROC_BCMA
> + depends on PCI_MSI
> + select PCI_MSI_IRQ_DOMAIN
> + default ARCH_BCM_IPROC
> + help
> +   Say Y here if you want to enable MSI support for Broadcom's iProc
> +   PCIe controller
> +
>  config PCIE_ALTERA
>   bool "Altera PCIe controller"
>   depends on ARM || NIOS2
> diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
> index 6def23a7eb54..e84d93c53c7b 100644
> --- a/drivers/pci/host/pcie-iproc.h
> +++ b/drivers/pci/host/pcie-iproc.h
> @@ -79,7 +79,7 @@ struct iproc_pcie {
>  int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res);
>  int iproc_pcie_remove(struct iproc_pcie *pcie);
>  
> -#ifdef CONFIG_PCI_MSI
> +#ifdef CONFIG_PCIE_IPROC_MSI
>  int iproc_msi_init(struct iproc_pcie *pcie, struct device_node *node);
>  void iproc_msi_exit(struct iproc_pcie *pcie);
>  #else
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 05/31] x86, pkeys: define new CR4 bit

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

There is a new bit in CR4 for enabling protection keys.  We
will actually enable it later in the series.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/include/uapi/asm/processor-flags.h |2 ++
 1 file changed, 2 insertions(+)

diff -puN arch/x86/include/uapi/asm/processor-flags.h~pkeys-02-cr4 
arch/x86/include/uapi/asm/processor-flags.h
--- a/arch/x86/include/uapi/asm/processor-flags.h~pkeys-02-cr4  2016-01-06 
15:50:04.798119379 -0800
+++ b/arch/x86/include/uapi/asm/processor-flags.h   2016-01-06 
15:50:04.801119514 -0800
@@ -118,6 +118,8 @@
 #define X86_CR4_SMEP   _BITUL(X86_CR4_SMEP_BIT)
 #define X86_CR4_SMAP_BIT   21 /* enable SMAP support */
 #define X86_CR4_SMAP   _BITUL(X86_CR4_SMAP_BIT)
+#define X86_CR4_PKE_BIT		22 /* enable Protection Keys support */
+#define X86_CR4_PKE		_BITUL(X86_CR4_PKE_BIT)
 
 /*
  * x86-64 Task Priority Register, CR8
_
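
For readers who want to see what the two new defines evaluate to, here is a small user-space sketch. DEMO_BITUL() is an assumed stand-in for the kernel's _BITUL() helper, and the DEMO_* names and demo_cr4_pke_enabled() are hypothetical; only the bit numbers 21 and 22 come from the patch.

```c
/* Assumption: same effect as the kernel's _BITUL(), an unsigned long with bit n set. */
#define DEMO_BITUL(n)		(1UL << (n))

#define DEMO_CR4_SMAP_BIT	21	/* from the context lines of the patch */
#define DEMO_CR4_PKE_BIT	22	/* the bit added by the patch */
#define DEMO_CR4_PKE		DEMO_BITUL(DEMO_CR4_PKE_BIT)

/* Returns nonzero when a CR4 image has the protection keys enable bit set. */
static inline int demo_cr4_pke_enabled(unsigned long cr4)
{
	return (cr4 & DEMO_CR4_PKE) != 0;
}
```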


[PATCH 07/31] x86, pkeys: PTE bits for storing protection key

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

Previous documentation has referred to these 4 bits as "ignored".
That means that software could have made use of them.  But, as
far as I know, the kernel never used them.

They are still ignored when protection keys is not enabled, so
they could theoretically still get used for software purposes.

We also implement "empty" versions so that code that refers to
them can be optimized away by the compiler when the config
option is not enabled.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/include/asm/pgtable_types.h |   17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff -puN arch/x86/include/asm/pgtable_types.h~pkeys-04-ptebits 
arch/x86/include/asm/pgtable_types.h
--- a/arch/x86/include/asm/pgtable_types.h~pkeys-04-ptebits 2016-01-06 
15:50:05.662158333 -0800
+++ b/arch/x86/include/asm/pgtable_types.h  2016-01-06 15:50:05.665158468 
-0800
@@ -25,7 +25,11 @@
 #define _PAGE_BIT_SPLITTING	_PAGE_BIT_SOFTW2 /* only valid on a PSE pmd */
 #define _PAGE_BIT_HIDDEN	_PAGE_BIT_SOFTW3 /* hidden by kmemcheck */
 #define _PAGE_BIT_SOFT_DIRTY	_PAGE_BIT_SOFTW3 /* software dirty tracking */
-#define _PAGE_BIT_NX		63   /* No execute: only valid after cpuid check */
+#define _PAGE_BIT_PKEY_BIT0	59   /* Protection Keys, bit 1/4 */
+#define _PAGE_BIT_PKEY_BIT1	60   /* Protection Keys, bit 2/4 */
+#define _PAGE_BIT_PKEY_BIT2	61   /* Protection Keys, bit 3/4 */
+#define _PAGE_BIT_PKEY_BIT3	62   /* Protection Keys, bit 4/4 */
+#define _PAGE_BIT_NX		63   /* No execute: only valid after cpuid check */
 
 /* If _PAGE_BIT_PRESENT is clear, we use these: */
 /* - if the user mapped it with PROT_NONE; pte_present gives true */
@@ -47,6 +51,17 @@
 #define _PAGE_SPECIAL  (_AT(pteval_t, 1) << _PAGE_BIT_SPECIAL)
 #define _PAGE_CPA_TEST (_AT(pteval_t, 1) << _PAGE_BIT_CPA_TEST)
 #define _PAGE_SPLITTING	(_AT(pteval_t, 1) << _PAGE_BIT_SPLITTING)
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+#define _PAGE_PKEY_BIT0	(_AT(pteval_t, 1) << _PAGE_BIT_PKEY_BIT0)
+#define _PAGE_PKEY_BIT1	(_AT(pteval_t, 1) << _PAGE_BIT_PKEY_BIT1)
+#define _PAGE_PKEY_BIT2	(_AT(pteval_t, 1) << _PAGE_BIT_PKEY_BIT2)
+#define _PAGE_PKEY_BIT3	(_AT(pteval_t, 1) << _PAGE_BIT_PKEY_BIT3)
+#else
+#define _PAGE_PKEY_BIT0	(_AT(pteval_t, 0))
+#define _PAGE_PKEY_BIT1	(_AT(pteval_t, 0))
+#define _PAGE_PKEY_BIT2	(_AT(pteval_t, 0))
+#define _PAGE_PKEY_BIT3	(_AT(pteval_t, 0))
+#endif
 #define __HAVE_ARCH_PTE_SPECIAL
 
 #ifdef CONFIG_KMEMCHECK
_
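
As a rough illustration of how the four PTE bits above can carry a 4-bit key, here is a user-space sketch. It is not code from this series: demo_pteval_t, DEMO_PAGE_PKEY_MASK, demo_pte_pkey() and demo_pte_set_pkey() are hypothetical names; only the bit positions 59..62 are taken from the patch.

```c
#include <stdint.h>

typedef uint64_t demo_pteval_t;	/* assumption: 64-bit PTE value */

#define DEMO_PAGE_BIT_PKEY_BIT0	59	/* lowest of the four key bits */
#define DEMO_PAGE_PKEY_MASK	((demo_pteval_t)0xf << DEMO_PAGE_BIT_PKEY_BIT0)

/* Extract the 4-bit protection key stored in bits 59..62. */
static inline unsigned int demo_pte_pkey(demo_pteval_t pte)
{
	return (unsigned int)((pte & DEMO_PAGE_PKEY_MASK) >> DEMO_PAGE_BIT_PKEY_BIT0);
}

/* Store a key in bits 59..62 and leave every other PTE bit untouched. */
static inline demo_pteval_t demo_pte_set_pkey(demo_pteval_t pte, unsigned int pkey)
{
	pte &= ~DEMO_PAGE_PKEY_MASK;
	pte |= (demo_pteval_t)(pkey & 0xf) << DEMO_PAGE_BIT_PKEY_BIT0;
	return pte;
}
```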


[PATCH 02/31] x86, fpu: add placeholder for Processor Trace XSAVE state

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

There is an XSAVE state component for Intel Processor Trace (PT).
But, we do not currently use it.

We add a placeholder in the code for it so it is not a mystery and
also so we do not need an explicit enum initialization for Protection
Keys in a moment.

Why don't we use it?

We might end up using this at _some_ point in the future.  But,
this is a "system" state which requires using the currently
unsupported XSAVES feature.  Unlike all the other XSAVE states,
PT state is also not directly tied to a thread.  You might
context-switch between threads, but not want to change any of the
PT state.  Or, you might switch between threads, and *do* want to
change PT state, all depending on what is being traced.

We currently just manually set some MSRs to do this PT context
switching, and it is unclear whether replacing our direct MSR use
with XSAVE will be a net win or loss, both in code complexity and
performance.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
Cc: Andi Kleen 
Cc: yu-cheng...@intel.com
Cc: fenghua...@intel.com
---

 b/arch/x86/include/asm/fpu/types.h |1 +
 b/arch/x86/kernel/fpu/xstate.c |   10 --
 2 files changed, 9 insertions(+), 2 deletions(-)

diff -puN arch/x86/include/asm/fpu/types.h~pt-xstate-bit 
arch/x86/include/asm/fpu/types.h
--- a/arch/x86/include/asm/fpu/types.h~pt-xstate-bit	2016-01-06 15:50:03.468059415 -0800
+++ b/arch/x86/include/asm/fpu/types.h  2016-01-06 15:50:03.473059640 -0800
@@ -108,6 +108,7 @@ enum xfeature {
XFEATURE_OPMASK,
XFEATURE_ZMM_Hi256,
XFEATURE_Hi16_ZMM,
+   XFEATURE_PT_UNIMPLEMENTED_SO_FAR,
 
XFEATURE_MAX,
 };
diff -puN arch/x86/kernel/fpu/xstate.c~pt-xstate-bit 
arch/x86/kernel/fpu/xstate.c
--- a/arch/x86/kernel/fpu/xstate.c~pt-xstate-bit	2016-01-06 15:50:03.470059505 -0800
+++ b/arch/x86/kernel/fpu/xstate.c  2016-01-06 15:50:03.473059640 -0800
@@ -13,6 +13,11 @@
 
 #include 
 
+/*
+ * Although we spell it out in here, the Processor Trace
+ * xfeature is completely unused.  We use other mechanisms
+ * to save/restore PT state in Linux.
+ */
 static const char *xfeature_names[] =
 {
"x87 floating point registers"  ,
@@ -23,7 +28,7 @@ static const char *xfeature_names[] =
"AVX-512 opmask",
"AVX-512 Hi256" ,
"AVX-512 ZMM_Hi256" ,
-   "unknown xstate feature",
+   "Processor Trace (unused)"  ,
 };
 
 /*
@@ -469,7 +474,8 @@ static void check_xstate_against_struct(
 * numbers.
 */
if ((nr < XFEATURE_YMM) ||
-   (nr >= XFEATURE_MAX)) {
+   (nr >= XFEATURE_MAX) ||
+   (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR)) {
WARN_ONCE(1, "no structure for xstate: %d\n", nr);
XSTATE_WARN_ON(1);
}
_
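
The "no explicit enum initialization" argument can be illustrated with a stand-alone sketch. The entries before OPMASK and the PKRU entry are assumptions that do not appear in this patch, all DEMO_* names are hypothetical, and demo_xfeature_has_struct() only mirrors the shape of the check in the last hunk.

```c
/* Implicit numbering: each entry is one more than the previous one. */
enum demo_xfeature {
	DEMO_XFEATURE_FP,
	DEMO_XFEATURE_SSE,
	DEMO_XFEATURE_YMM,
	DEMO_XFEATURE_BNDREGS,
	DEMO_XFEATURE_BNDCSR,
	DEMO_XFEATURE_OPMASK,
	DEMO_XFEATURE_ZMM_Hi256,
	DEMO_XFEATURE_Hi16_ZMM,
	DEMO_XFEATURE_PT_UNIMPLEMENTED_SO_FAR,	/* placeholder occupies slot 8 */
	DEMO_XFEATURE_PKRU,			/* lands on slot 9 without "= 9" */

	DEMO_XFEATURE_MAX,
};

/* Returns 0 for feature numbers that have no state structure to check. */
static inline int demo_xfeature_has_struct(int nr)
{
	if ((nr < DEMO_XFEATURE_YMM) ||
	    (nr >= DEMO_XFEATURE_MAX) ||
	    (nr == DEMO_XFEATURE_PT_UNIMPLEMENTED_SO_FAR))
		return 0;
	return 1;
}
```

Without the placeholder, the entry after Hi16_ZMM would silently take slot 8, so it would need an explicit value to stay aligned with the hardware bit numbering.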


Re: [PATCH v2 31/32] sh: support a 2-byte smp_store_mb

2016-01-06 Thread Michael S. Tsirkin
On Wed, Jan 06, 2016 at 01:23:50PM -0500, Rich Felker wrote:
> On Wed, Jan 06, 2016 at 03:32:18PM +0100, Peter Zijlstra wrote:
> > On Wed, Jan 06, 2016 at 01:52:17PM +0200, Michael S. Tsirkin wrote:
> > > > > Peter, what do you think? How about I leave this patch as is for now?
> > > > 
> > > > No, and I object to removing the single byte implementation too. Either
> > > > remove the full arch or fix xchg() to conform. xchg() should work on all
> > > > native word sizes, for SH that would be 1,2 and 4 bytes.
> > > 
> > > Rick, maybe you could explain how is current 1 byte xchg on llsc wrong?
> > 
> > It doesn't seem to preserve the 3 other bytes in the word.
> > 
> > > It does use 4 byte accesses but IIUC that is all that exists on
> > > this architecture.
> > 
> > Right, that's not a problem, look at arch/alpha/include/asm/xchg.h for
> > example. A store to another portion of the word should make the
> > store-conditional fail and we'll retry the loop.
> > 
> > The short versions should however preserve the other bytes in the word.
> 
> Indeed. Also, accesses must be aligned, so the asm needs to round down
> to an aligned address and perform a correct read-modify-write on it,
> placing the new byte in the correct offset in the word.
> 
> Alternatively (my preference) this logic can be implemented in C as a
> wrapper around the 32-bit cmpxchg. I think this is less error-prone
> and it can be shared between the multiple sh cmpxchg back-ends,
> including the new cas.l one we need for J2.

Sounds much more reasonable.

> > SH's cmpxchg() is equally incomplete and does not provide 1 and 2 byte
> > versions.
> > 
> > In any case, I'm all for rm -rf arch/sh/, one less arch to worry about
> > is always good, but ISTR some people wanting to resurrect SH:
> > 
> >   http://old.lwn.net/Articles/647636/
> > 
> > Rob, Jeff, Sato-san, might I suggest you send a MAINTAINERS patch and
> > take up an active interest in SH lest someone 'accidentally' nukes it?
> 
> We're in the process of preparing such a proposal right now. The
> current intent is that Sato-san and I will co-maintain arch/sh. We'll
> include more details about motivation, proposed development direction,
> existing work to be merged, etc. in that proposal.
> 
> Rich


blog post: Monitoring real-time latencies

2016-01-06 Thread Julien Desfossez
Hi,

Here is a blog post related to detecting and understanding high
interrupt-processing latencies on real-time systems. It is based on a
new project called latency_tracker that hooks on the existing kernel
tracepoints and executes actions when high latency events occur.

https://lttng.org/blog/2016/01/06/monitoring-realtime-latencies/

It is a work in progress, so if you are interested and/or have any
comments, please let me know.

Thanks,

Julien


Re: [PATCH 1/3] jffs2: use to_delayed_work

2016-01-06 Thread Brian Norris
On Fri, Jan 01, 2016 at 10:06:27PM +0800, Geliang Tang wrote:
> Use to_delayed_work() instead of open-coding it.
> 
> Signed-off-by: Geliang Tang 

Applied to l2-mtd.git


Re: [PATCH 02/35] block: add REQ_OP definitions and bi_op/op fields

2016-01-06 Thread Martin K. Petersen
> "Mike" == mchristi   writes:

+enum req_op {
+   REQ_OP_READ,
+   REQ_OP_WRITE= REQ_WRITE,
+   REQ_OP_DISCARD  = REQ_DISCARD,
+   REQ_OP_WRITE_SAME   = REQ_WRITE_SAME,
+};
+

I have been irked by the REQ_ prefix in bios since the flags were
consolidated a while back. When I attempted to fix the READ/WRITE mess I
used a BLK_ prefix as a result.

Anyway. Just bikeshedding...

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCHv1 6/6] rdmacg: Added documentation for rdma controller.

2016-01-06 Thread Tejun Heo
Hello,

On Thu, Jan 07, 2016 at 04:14:26AM +0530, Parav Pandit wrote:
> Yes. I read through. I can see two changes to be made in V2 version of
> this patch.
> 1. rdma.resource.verb.usage and rdma.resource.verb.limit to change
> respectively to,
> 2. rdma.resource.verb.stat and rdma.resource.verb.max.
> 3. rdma.resource.verb.failcnt indicate failure events, which I think
> should go to events.

What's up with the ".resource" part?  Also can't the .max file list
the available resources?  Why does it need a separate list file?

> I roll out new patch for events post this patch as additional feature
> and remove this feature in V2.
> 
> rdma.resource.verb.list file is unique to rdma cgroup, so I believe
> this is fine.

Please see above.

> We will conclude whether to have rdma.resource.hw. or not in
> other patches.
> I am in opinion to keep "resource" and "verb" or "hw" tags around to
> keep it verbose enough to know what are we trying to control.

What does that achieve?  I feel that it's getting overengineered
constantly.

Thanks.

-- 
tejun


Re: [PATCH v3 5/5] perf evlist: Add --trace-fields option to show trace fields

2016-01-06 Thread Namhyung Kim
On Wed, Jan 06, 2016 at 08:29:49PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Jan 07, 2016 at 08:21:44AM +0900, Namhyung Kim escreveu:
> > On Wed, Jan 06, 2016 at 08:10:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Wed, Jan 06, 2016 at 09:55:01AM +0900, Namhyung Kim escreveu:
> > > > To use dynamic sort keys, it might be good to add an option to see the
> > > > list of field names.
> > > > 
> > > >   $ perf evlist -i perf.data.sched
> > > >   sched:sched_switch
> > > >   sched:sched_stat_wait
> > > >   sched:sched_stat_sleep
> > > >   sched:sched_stat_iowait
> > > >   sched:sched_stat_runtime
> > > >   sched:sched_process_fork
> > > >   sched:sched_wakeup
> > > >   sched:sched_wakeup_new
> > > >   sched:sched_migrate_task
> > > >   # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint 
> > > > events
> > > 
> > > Ok, almost there, question is: if I ask explicitly for
> > > "--trace-fields", why should we have the "trace_fields=" in all lines,
> > > instead of just:
> > > 
> > >  
> > >$ perf evlist -i perf.data.sched --trace-fields
> > >sched:sched_switch: 
> > > prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
> > >sched:sched_stat_wait: comm,pid,delay
> > > 
> > > ?
> > 
> > I made it to work with other options too, like 'perf evlist --freq 
> > --trace-fields'.
> > In that case you may want to see it. :)
> 
> I see... And then you want to show those at the same time... Would you
> have a use case for that? Or would it be better to make them mutually
> exclusive? /me unsure...

I don't have one.  But I think I sometimes want to see it with
-v/--verbose option.  Hmm.. do you think --verbose should imply
--trace-fields?

Thanks,
Namhyung


> > > 
> > > I like the lack of spaces after commas, this way we can double click
> > > and select the whole list, then edit it, etc.
> > > 
> > > - Arnaldo
> > >  
> > > >   $ perf evlist -i perf.data.sched --trace-fields
> > > >   sched:sched_switch: 
> > > > trace_fields=prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
> > > >   sched:sched_stat_wait: trace_fields=comm,pid,delay
> > > >   sched:sched_stat_sleep: trace_fields=comm,pid,delay
> > > >   sched:sched_stat_iowait: trace_fields=comm,pid,delay
> > > >   sched:sched_stat_runtime: trace_fields=comm,pid,runtime,vruntime
> > > >   sched:sched_process_fork: 
> > > > trace_fields=parent_comm,parent_pid,child_comm,child_pid
> > > >   sched:sched_wakeup: trace_fields=comm,pid,prio,success,target_cpu
> > > >   sched:sched_wakeup_new: trace_fields=comm,pid,prio,success,target_cpu
> > > >   sched:sched_migrate_task: trace_fields=comm,pid,prio,orig_cpu,dest_cpu
> > > > 
> > > > Signed-off-by: Namhyung Kim 
> > > > ---
> > > >  tools/perf/Documentation/perf-evlist.txt |  3 +++
> > > >  tools/perf/builtin-evlist.c  | 11 ++-
> > > >  tools/perf/util/evsel.c  | 23 +++
> > > >  tools/perf/util/evsel.h  |  1 +
> > > >  4 files changed, 37 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/tools/perf/Documentation/perf-evlist.txt 
> > > > b/tools/perf/Documentation/perf-evlist.txt
> > > > index 1ceb3700ffbb..6f7200fb85cf 100644
> > > > --- a/tools/perf/Documentation/perf-evlist.txt
> > > > +++ b/tools/perf/Documentation/perf-evlist.txt
> > > > @@ -32,6 +32,9 @@ OPTIONS
> > > >  --group::
> > > > Show event group information.
> > > >  
> > > > +--trace-fields::
> > > > +   Show tracepoint field names.
> > > > +
> > > >  SEE ALSO
> > > >  
> > > >  linkperf:perf-record[1], linkperf:perf-list[1],
> > > > diff --git a/tools/perf/builtin-evlist.c b/tools/perf/builtin-evlist.c
> > > > index 08a7d36a2cf8..8a31f511e1a0 100644
> > > > --- a/tools/perf/builtin-evlist.c
> > > > +++ b/tools/perf/builtin-evlist.c
> > > > @@ -26,14 +26,22 @@ static int __cmd_evlist(const char *file_name, 
> > > > struct perf_attr_details *details
> > > > .mode = PERF_DATA_MODE_READ,
> > > > .force = details->force,
> > > > };
> > > > +   bool has_tracepoint = false;
> > > >  
> > > > session = perf_session__new(, 0, NULL);
> > > > if (session == NULL)
> > > > return -1;
> > > >  
> > > > -   evlist__for_each(session->evlist, pos)
> > > > +   evlist__for_each(session->evlist, pos) {
> > > > perf_evsel__fprintf(pos, details, stdout);
> > > >  
> > > > +   if (pos->attr.type == PERF_TYPE_TRACEPOINT)
> > > > +   has_tracepoint = true;
> > > > +   }
> > > > +
> > > > +   if (has_tracepoint && !details->trace_fields)
> > > > +   printf("# Tip: use 'perf evlist --trace-fields' to show 
> > > > fields for tracepoint events\n");
> > > > +
> > > > perf_session__delete(session);
> > > > return 0;
> > > >  }
> > > > @@ -49,6 +57,7 @@ int cmd_evlist(int argc, const char **argv, const 
> > > > char *prefix 

Re: [PATCH v2 31/32] sh: support a 2-byte smp_store_mb

2016-01-06 Thread Rich Felker
On Wed, Jan 06, 2016 at 10:23:12PM +0200, Michael S. Tsirkin wrote:
> On Wed, Jan 06, 2016 at 01:23:50PM -0500, Rich Felker wrote:
> > On Wed, Jan 06, 2016 at 03:32:18PM +0100, Peter Zijlstra wrote:
> > > On Wed, Jan 06, 2016 at 01:52:17PM +0200, Michael S. Tsirkin wrote:
> > > > > > Peter, what do you think? How about I leave this patch as is for 
> > > > > > now?
> > > > > 
> > > > > No, and I object to removing the single byte implementation too. 
> > > > > Either
> > > > > remove the full arch or fix xchg() to conform. xchg() should work on 
> > > > > all
> > > > > native word sizes, for SH that would be 1,2 and 4 bytes.
> > > > 
> > > > Rick, maybe you could explain how is current 1 byte xchg on llsc wrong?
> > > 
> > > It doesn't seem to preserve the 3 other bytes in the word.
> > > 
> > > > It does use 4 byte accesses but IIUC that is all that exists on
> > > > this architecture.
> > > 
> > > Right, that's not a problem, look at arch/alpha/include/asm/xchg.h for
> > > example. A store to another portion of the word should make the
> > > store-conditional fail and we'll retry the loop.
> > > 
> > > The short versions should however preserve the other bytes in the word.
> > 
> > Indeed. Also, accesses must be aligned, so the asm needs to round down
> > to an aligned address and perform a correct read-modify-write on it,
> > placing the new byte in the correct offset in the word.
> > 
> > > Alternatively (my preference) this logic can be implemented in C as a
> > wrapper around the 32-bit cmpxchg. I think this is less error-prone
> > and it can be shared between the multiple sh cmpxchg back-ends,
> > including the new cas.l one we need for J2.
> > 
> > > SH's cmpxchg() is equally incomplete and does not provide 1 and 2 byte
> > > versions.
> > > 
> > > In any case, I'm all for rm -rf arch/sh/, one less arch to worry about
> > > is always good, but ISTR some people wanting to resurrect SH:
> > > 
> > >   http://old.lwn.net/Articles/647636/
> > > 
> > > Rob, Jeff, Sato-san, might I suggest you send a MAINTAINERS patch and
> > > take up an active interest in SH lest someone 'accidentally' nukes it?
> > 
> > > We're in the process of preparing such a proposal right now. The
> > current intent is that Sato-san and I will co-maintain arch/sh. We'll
> > include more details about motivation, proposed development direction,
> > existing work to be merged, etc. in that proposal.
> 
> Well I'd like to be able to make progress with generic
> arch cleanups meanwhile.
> 
> Could you quickly write a version of 1 and 2 byte xchg that
> works so I can include it?

Here are quick, untested generic ones:

static inline unsigned long xchg_u8(volatile u8 *m, unsigned long val)
{
	u32 old;
	unsigned long result;
	unsigned long offset = (unsigned long)m & 3;
	/* round down to the aligned 32-bit word containing *m */
	volatile u32 *w = (volatile u32 *)(m - offset);
	union { u32 w; u8 b[4]; } u;
	do {
		old = u.w = *w;
		result = u.b[offset];
		u.b[offset] = val;
		/* retry if any byte of the word changed meanwhile */
	} while (cmpxchg(w, old, u.w) != old);
	return result;
}

static inline unsigned long xchg_u16(volatile u16 *m, unsigned long val)
{
	u32 old;
	unsigned long result;
	/* index (0 or 1) of the halfword within its 32-bit word */
	unsigned long offset = ((unsigned long)m & 3) >> 1;
	volatile u32 *w = (volatile u32 *)(m - offset);
	union { u32 w; u16 h[2]; } u;
	do {
		old = u.w = *w;
		result = u.h[offset];
		u.h[offset] = val;
	} while (cmpxchg(w, old, u.w) != old);
	return result;
}

It would be nice to have these in asm-generic for archs which don't
define their own versions rather than having cruft like this repeated
per-arch. Strictly speaking, the volatile u32 used to access the
32-bit word containing the u8 or u16 should be
__attribute__((__may_alias__)) too. Is there an existing kernel type
for a "may_alias u32" or should it perhaps be added?
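
For experimenting outside the kernel, the same idea can be sketched in user space, with the GCC/Clang __atomic_compare_exchange_n() builtin standing in for cmpxchg(). This is only a sketch under those assumptions: demo_xchg_u8() and the u32_alias typedef are hypothetical names, not existing kernel types.

```c
#include <stdint.h>

/* Assumption: a 32-bit type that is allowed to alias other object types. */
typedef uint32_t __attribute__((__may_alias__)) u32_alias;

/* Exchange one byte using only an aligned 32-bit compare-and-swap. */
static inline uint8_t demo_xchg_u8(volatile uint8_t *m, uint8_t val)
{
	uintptr_t offset = (uintptr_t)m & 3;
	/* Round down to the aligned 32-bit word containing *m. */
	volatile u32_alias *w = (volatile u32_alias *)((uintptr_t)m - offset);
	union { uint32_t w; uint8_t b[4]; } u;
	u32_alias old;
	uint8_t result;

	do {
		old = u.w = *w;
		result = u.b[offset];
		u.b[offset] = val;	/* the other three bytes stay as read */
		/* Retry if any byte of the word changed underneath us. */
	} while (!__atomic_compare_exchange_n(w, &old, u.w, 0,
					      __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
	return result;
}
```

Because the union is indexed by the byte's address offset, the sketch does not depend on endianness: the replaced byte is always the one at m.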

Rich


[PATCH 17/31] x86, pkeys: check VMAs and PTEs for protection keys

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

Today, for normal faults and page table walks, we check the VMA
and/or PTE to ensure that it is compatible with the action.  For
instance, if we get a write fault on a non-writeable VMA, we
SIGSEGV.

We try to do the same thing for protection keys.  Basically, we
try to make sure that if a user does this:

mprotect(ptr, size, PROT_NONE);
*ptr = foo;

they see the same effects with protection keys when they do this:

mprotect(ptr, size, PROT_READ|PROT_WRITE);
set_pkey(ptr, size, 4);
wrpkru(0xff3f); // access disable pkey 4
*ptr = foo;

The state to do that checking is in the VMA, but we also
sometimes have to do it on the page tables only, like when doing
a get_user_pages_fast() where we have no VMA.

We add two functions and expose them to generic code:

arch_pte_access_permitted(pte_flags, write)
arch_vma_access_permitted(vma, write)

These are, of course, backed up in x86 arch code with checks
against the PTE or VMA's protection key.

But, there are also cases where we do not want to respect
protection keys.  When we ptrace(), for instance, we do not want
to apply the tracer's PKRU permissions to the PTEs from the
process being traced.
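
The per-key check behind these hooks can be sketched in user space as below. The sketch takes the PKRU value as a parameter instead of reading the register, and it assumes the layout of two bits per key, with access-disable in the low bit and write-disable in the high bit. All DEMO_* and demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PKRU_AD_BIT	0x1u	/* access disable */
#define DEMO_PKRU_WD_BIT	0x2u	/* write disable */
#define DEMO_PKRU_BITS_PER_PKEY	2

static inline bool demo_pkru_allows_read(uint32_t pkru, uint16_t pkey)
{
	int shift = pkey * DEMO_PKRU_BITS_PER_PKEY;

	return !(pkru & (DEMO_PKRU_AD_BIT << shift));
}

static inline bool demo_pkru_allows_write(uint32_t pkru, uint16_t pkey)
{
	int shift = pkey * DEMO_PKRU_BITS_PER_PKEY;

	/* Access-disable blocks writes too, so test both bits. */
	return !(pkru & ((DEMO_PKRU_AD_BIT | DEMO_PKRU_WD_BIT) << shift));
}

/* Combined check: reads need AD clear, writes need AD and WD clear. */
static inline bool demo_pkru_allows_pkey(uint32_t pkru, uint16_t pkey, bool write)
{
	if (!demo_pkru_allows_read(pkru, pkey))
		return false;
	if (write && !demo_pkru_allows_write(pkru, pkey))
		return false;
	return true;
}
```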

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/powerpc/include/asm/mmu_context.h   |   11 ++
 b/arch/s390/include/asm/mmu_context.h  |   11 ++
 b/arch/unicore32/include/asm/mmu_context.h |   11 ++
 b/arch/x86/include/asm/mmu_context.h   |   49 +
 b/arch/x86/include/asm/pgtable.h   |   29 +
 b/arch/x86/mm/fault.c  |   21 +++-
 b/arch/x86/mm/gup.c|5 ++
 b/include/asm-generic/mm_hooks.h   |   11 ++
 b/mm/gup.c |   18 --
 b/mm/memory.c  |4 ++
 10 files changed, 166 insertions(+), 4 deletions(-)

diff -puN arch/powerpc/include/asm/mmu_context.h~pkeys-13-pte-fault 
arch/powerpc/include/asm/mmu_context.h
--- a/arch/powerpc/include/asm/mmu_context.h~pkeys-13-pte-fault 2016-01-06 
15:50:09.964352292 -0800
+++ b/arch/powerpc/include/asm/mmu_context.h2016-01-06 15:50:09.981353058 
-0800
@@ -148,5 +148,16 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool 
write)
+{
+   /* by default, allow everything */
+   return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+   /* by default, allow everything */
+   return true;
+}
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff -puN arch/s390/include/asm/mmu_context.h~pkeys-13-pte-fault 
arch/s390/include/asm/mmu_context.h
--- a/arch/s390/include/asm/mmu_context.h~pkeys-13-pte-fault	2016-01-06 15:50:09.965352337 -0800
+++ b/arch/s390/include/asm/mmu_context.h   2016-01-06 15:50:09.981353058 
-0800
@@ -130,4 +130,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool 
write)
+{
+   /* by default, allow everything */
+   return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+   /* by default, allow everything */
+   return true;
+}
 #endif /* __S390_MMU_CONTEXT_H */
diff -puN arch/unicore32/include/asm/mmu_context.h~pkeys-13-pte-fault 
arch/unicore32/include/asm/mmu_context.h
--- a/arch/unicore32/include/asm/mmu_context.h~pkeys-13-pte-fault   
2016-01-06 15:50:09.967352427 -0800
+++ b/arch/unicore32/include/asm/mmu_context.h  2016-01-06 15:50:09.981353058 
-0800
@@ -97,4 +97,15 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool 
write)
+{
+   /* by default, allow everything */
+   return true;
+}
+
+static inline bool arch_pte_access_permitted(pte_t pte, bool write)
+{
+   /* by default, allow everything */
+   return true;
+}
 #endif
diff -puN arch/x86/include/asm/mmu_context.h~pkeys-13-pte-fault 
arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkeys-13-pte-fault 2016-01-06 
15:50:09.968352472 -0800
+++ b/arch/x86/include/asm/mmu_context.h2016-01-06 15:50:09.982353103 
-0800
@@ -254,4 +254,53 @@ static inline int vma_pkey(struct vm_are
return pkey;
 }
 
+static inline bool __pkru_allows_pkey(u16 pkey, bool write)
+{
+   u32 pkru = read_pkru();
+
+   if (!__pkru_allows_read(pkru, pkey))
+   return false;
+   if (write && !__pkru_allows_write(pkru, pkey))
+   return false;
+
+   return true;
+}
+
+/*
+ * We only want to enforce protection keys on the current process
+ * because we effectively have no access to PKRU for other
+ * processes or any way to tell *which * PKRU in 

[PATCH 21/31] x86, pkeys: dump PKRU with other kernel registers

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

Protection Keys never affect kernel mappings.  But, they can
affect whether the kernel will fault when it touches a user
mapping.  The kernel doesn't touch user mappings without some
careful choreography and these accesses don't generally result in
oopses.  But, if one does, we definitely want to have PKRU
available so we can figure out if protection keys played a role.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/kernel/process_64.c |2 ++
 1 file changed, 2 insertions(+)

diff -puN arch/x86/kernel/process_64.c~pkeys-30-kernel-error-dumps 
arch/x86/kernel/process_64.c
--- a/arch/x86/kernel/process_64.c~pkeys-30-kernel-error-dumps  2016-01-06 
15:50:12.265456034 -0800
+++ b/arch/x86/kernel/process_64.c  2016-01-06 15:50:12.268456170 -0800
@@ -116,6 +116,8 @@ void __show_regs(struct pt_regs *regs, i
printk(KERN_DEFAULT "DR0: %016lx DR1: %016lx DR2: %016lx\n", d0, d1, 
d2);
printk(KERN_DEFAULT "DR3: %016lx DR6: %016lx DR7: %016lx\n", d3, d6, 
d7);
 
+   if (boot_cpu_has(X86_FEATURE_OSPKE))
+   printk(KERN_DEFAULT "PKRU: %08x\n", read_pkru());
 }
 
 void release_thread(struct task_struct *dead_task)
_


[PATCH 19/31] x86, pkeys: optimize fault handling in access_error()

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

We might not strictly have to make modifications to
access_error() to check the VMA here.

If we do not, we will do this:
1. app sets VMA pkey to K
2. app touches a !present page
3. do_page_fault(), allocates and maps page, sets pte.pkey=K
4. return to userspace
5. touch instruction reexecutes, but triggers PF_PK
6. do PKEY signal

What happens with this patch applied:
1. app sets VMA pkey to K
2. app touches a !present page
3. do_page_fault() notices that K is inaccessible
4. do PKEY signal

We basically skip the fault that does an allocation.

So what this lets us do is protect areas from even being
*populated* unless it is accessible according to protection
keys.  That seems handy to me and makes protection keys work
more like an mprotect()'d mapping.
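
The shortened flow can be modelled with a small user-space sketch. It is not the x86 fault code: struct demo_vma and the demo_* functions are hypothetical, and the error-code bit values (write = bit 1, protection key = bit 5) are assumptions made for the illustration.

```c
#include <stdbool.h>

#define DEMO_PF_WRITE	(1UL << 1)	/* assumption: fault was a write */
#define DEMO_PF_PK	(1UL << 5)	/* assumption: protection-key fault */

/* Hypothetical stand-in: what the VMA's key currently permits. */
struct demo_vma {
	bool pkey_allows_read;
	bool pkey_allows_write;
};

static bool demo_vma_access_permitted(const struct demo_vma *vma, bool write)
{
	if (!vma->pkey_allows_read)
		return false;
	if (write && !vma->pkey_allows_write)
		return false;
	return true;
}

/* Returns true when the fault should signal instead of populating a page. */
static bool demo_access_error(unsigned long error_code,
			      const struct demo_vma *vma)
{
	/* Hardware already reported a protection-key violation. */
	if (error_code & DEMO_PF_PK)
		return true;
	/* Not-present fault: consult the VMA's key before allocating. */
	if (!demo_vma_access_permitted(vma, (error_code & DEMO_PF_WRITE) != 0))
		return true;
	return false;
}
```

With the VMA check in place, a fault on a not-present page under an inaccessible key is rejected on the first trip through the handler, which corresponds to step 3 of the second list above.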

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/mm/fault.c |   15 +++
 1 file changed, 15 insertions(+)

diff -puN arch/x86/mm/fault.c~pkeys-15-access_error arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-15-access_error 2016-01-06 15:50:11.270411174 -0800
+++ b/arch/x86/mm/fault.c   2016-01-06 15:50:11.274411354 -0800
@@ -900,10 +900,16 @@ bad_area(struct pt_regs *regs, unsigned
 static inline bool bad_area_access_from_pkeys(unsigned long error_code,
struct vm_area_struct *vma)
 {
+   /* This code is always called on the current mm */
+   bool foreign = false;
+
if (!boot_cpu_has(X86_FEATURE_OSPKE))
return false;
if (error_code & PF_PK)
return true;
+   /* this checks permission keys on the VMA: */
+   if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE), foreign))
+   return true;
return false;
 }
 
@@ -1091,6 +1097,8 @@ int show_unhandled_signals = 1;
 static inline int
 access_error(unsigned long error_code, struct vm_area_struct *vma)
 {
+   /* This is only called for the current mm, so: */
+   bool foreign = false;
/*
 * Access or read was blocked by protection keys. We do
 * this check before any others because we do not want
@@ -1099,6 +1107,13 @@ access_error(unsigned long error_code, s
 */
if (error_code & PF_PK)
return 1;
+   /*
+* Make sure to check the VMA so that we do not perform
+* faults just to hit a PF_PK as soon as we fill in a
+* page.
+*/
+   if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE), foreign))
+   return 1;
 
if (error_code & PF_WRITE) {
/* write, present and write, not present: */
_


Re: [PATCH 1/3] mtd: nand: pxa3xx_nand: add register access debug

2016-01-06 Thread Brian Norris
On Sat, Dec 19, 2015 at 01:19:26PM +0100, Robert Jarzmik wrote:
> Brian Norris  writes:
> 
> > I don't have very strong opinions on this. It's kind of annoying to have
> > this sort of stuff duplicated for every driver, if it's really needed.
> > But I'll admit this kind of infrastructure is sometimes useful.
> >
> > Anecdote: I recently found the regmap trace event infrastructure pretty
> > nice for debugging some other drivers. This would only require you to
> > have tracing enabled, and then no recompiles are necessary at all. Just
> > cmdline changes.
> >
> > So, I could go with this patch, if Robert still desires it. Or you could
> > convert to using regmap for MMIO :)
> I'm as you, I don't feel strong opinion about it, I'd like to have a debug
> tracing tool, be that this patch or regmap MMIO.

Regmap is probably overkill, and wouldn't do a lot to satisfy Ezequiel's
concern, I expect.

> If we all agree on a path I could even make the final patch, whichever solution
> is chosen.

I'm OK with this one, which still applies OK for me. But I do have one
comment, which I'll post at the top-level.

Brian


[PATCH 22/31] x86, pkeys: dump pkey from VMA in /proc/pid/smaps

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

The protection key can now be just as important as read/write
permissions on a VMA.  We need some debug mechanism to help
figure out if it is in play.  smaps seems like a logical
place to expose it.

arch/x86/kernel/setup.c is a bit of a weirdo place to put
this code, but it already had seq_file.h and there was not
a much better existing place to put it.

We also use no #ifdef.  If protection keys are .config'd out we
will effectively get the same function as if we used the weak
generic function.
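
For anyone wanting to consume the new field from userspace, here is a
minimal sketch of a line parser. It assumes only the "ProtectionKey:"
label and unsigned value emitted by the seq_printf() in this patch;
parse_pkey_line() is a hypothetical helper name, not an existing API.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a single line of smaps output.
 * Returns 1 and stores the key in *pkey if the line is a
 * "ProtectionKey:" line, 0 for any other line.
 */
static int parse_pkey_line(const char *line, unsigned int *pkey)
{
	assert(line && pkey);
	/* The space in the format skips the column padding before the number. */
	return sscanf(line, "ProtectionKey: %u", pkey) == 1;
}
```

A caller would feed it each line read from /proc/<pid>/smaps and treat a
missing ProtectionKey line as "no pkey information", since the line is
only printed when X86_FEATURE_OSPKE is set.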

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
Cc: vba...@suse.cz
---

 b/arch/x86/kernel/setup.c |9 +
 b/fs/proc/task_mmu.c  |   14 ++
 2 files changed, 23 insertions(+)

diff -puN arch/x86/kernel/setup.c~pkeys-40-smaps arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~pkeys-40-smaps 2016-01-06 15:50:12.674474474 -0800
+++ b/arch/x86/kernel/setup.c   2016-01-06 15:50:12.679474700 -0800
@@ -112,6 +112,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -1282,3 +1283,11 @@ static int __init register_kernel_offset
return 0;
 }
 __initcall(register_kernel_offset_dumper);
+
+void arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
+{
+   if (!boot_cpu_has(X86_FEATURE_OSPKE))
+   return;
+
+   seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
+}
diff -puN fs/proc/task_mmu.c~pkeys-40-smaps fs/proc/task_mmu.c
--- a/fs/proc/task_mmu.c~pkeys-40-smaps 2016-01-06 15:50:12.675474519 -0800
+++ b/fs/proc/task_mmu.c 2016-01-06 15:50:12.679474700 -0800
@@ -615,11 +615,20 @@ static void show_smap_vma_flags(struct s
[ilog2(VM_MERGEABLE)]   = "mg",
[ilog2(VM_UFFD_MISSING)]= "um",
[ilog2(VM_UFFD_WP)] = "uw",
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+   /* These come out via ProtectionKey: */
+   [ilog2(VM_PKEY_BIT0)]   = "",
+   [ilog2(VM_PKEY_BIT1)]   = "",
+   [ilog2(VM_PKEY_BIT2)]   = "",
+   [ilog2(VM_PKEY_BIT3)]   = "",
+#endif
};
size_t i;
 
seq_puts(m, "VmFlags: ");
for (i = 0; i < BITS_PER_LONG; i++) {
+   if (!mnemonics[i][0])
+   continue;
if (vma->vm_flags & (1UL << i)) {
seq_printf(m, "%c%c ",
   mnemonics[i][0], mnemonics[i][1]);
@@ -657,6 +666,10 @@ static int smaps_hugetlb_range(pte_t *pt
 }
 #endif /* HUGETLB_PAGE */
 
+void __weak arch_show_smap(struct seq_file *m, struct vm_area_struct *vma)
+{
+}
+
 static int show_smap(struct seq_file *m, void *v, int is_pid)
 {
struct vm_area_struct *vma = v;
@@ -713,6 +726,7 @@ static int show_smap(struct seq_file *m,
   (vma->vm_flags & VM_LOCKED) ?
(unsigned long)(mss.pss >> (10 + PSS_SHIFT)) : 0);
 
+   arch_show_smap(m, vma);
show_smap_vma_flags(m, vma);
m_cache_vma(m, vma);
return 0;
_


Re: [PATCH] X.509: Partially revert patch to add validation against IMA MOK keyring

2016-01-06 Thread James Morris
> Partially revert commit 41c89b64d7184a780f12f2cccdabe65cb2408893:
> 
>   Author: Petko Manolov 
>   Date:   Wed Dec 2 17:47:55 2015 +0200
>   IMA: create machine owner and blacklist keyrings
> 

If you need this applied to a tree, please state which.

-- 
James Morris




Re: weird DirectMap2M accounting.

2016-01-06 Thread Dave Jones
On Wed, Jan 06, 2016 at 06:55:27PM -0500, Dave Jones wrote:
 > I just spotted this in /proc/meminfo on an old Core2 machine with 4G.
 > 
 > DirectMap2M:18446744073709543424 kB
 > 
 > Looks like we subtracted 8192 from 0 somewhere.
 > 
 > Should split_page_count() be checking that direct_pages_count > 0 ?

Ok, this diff makes that number print out as 0.

If this looks ok, I'll submit it properly, though I'd like to better
understand what's happening here. Shouldn't I have 2M pages ?

Dave


diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index a3137a4feed1..ff0e0c6c350e 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -66,6 +66,9 @@ void update_page_count(int level, unsigned long pages)
 
 static void split_page_count(int level)
 {
+   if (direct_pages_count[level] == 0)
+   return;
+
direct_pages_count[level]--;
direct_pages_count[level - 1] += PTRS_PER_PTE;
 }
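
The number in the report is consistent with an unsigned counter wrapping
below zero. The sketch below is a userspace model, not the kernel code:
struct model_counts and the helper names are made up, 512 stands in for
PTRS_PER_PTE on x86_64, and the shift by 11 assumes the 2M count is
reported in kB (2048 kB per page). Four unchecked splits starting from
zero then reproduce the exact value seen in /proc/meminfo.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two counters touched by a 2M -> 4k split. */
struct model_counts {
	uint64_t pages_2m;
	uint64_t pages_4k;
};

/* Split without the guard: wraps around when pages_2m is already 0. */
static void split_unchecked(struct model_counts *c)
{
	assert(c);
	c->pages_2m--;
	c->pages_4k += 512;
}

/* Split with the guard from the diff above: ignore splits at 0. */
static void split_checked(struct model_counts *c)
{
	assert(c);
	if (c->pages_2m == 0)
		return;
	c->pages_2m--;
	c->pages_4k += 512;
}

/* Convert the 2M page count to kB (one 2M page is 2048 kB). */
static uint64_t directmap_2m_kb(const struct model_counts *c)
{
	return c->pages_2m << 11;
}
```

Note that the guard only hides the symptom in this model: the 4k count
is left alone too, so whatever was split was never accounted as a 2M
page in the first place.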
 


[PATCH -next] serial: 8250: Fix serial port driver for OF platform devices

2016-01-06 Thread Guenter Roeck
Commit afd7f88f1577 ("serial: 8250: move of_serial code to 8250 directory")
moved the serial port driver for Open Firmware platform devices from one
directory to another, but a mixup in Kconfig options resulted in the driver
never being built. This results in runtime failures for some xtensa,
openrisc, and powerpc configurations.

Fixes: afd7f88f1577 ("serial: 8250: move of_serial code to 8250 directory")
Cc: Arnd Bergmann 
Signed-off-by: Guenter Roeck 
---
It might make sense to merge this patch with the commit introducing the problem.

 drivers/tty/serial/8250/8250_of.c | 3 +--
 drivers/tty/serial/8250/Kconfig   | 5 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/8250/8250_of.c b/drivers/tty/serial/8250/8250_of.c
index d66fd24f87cf..33021c1f7d55 100644
--- a/drivers/tty/serial/8250/8250_of.c
+++ b/drivers/tty/serial/8250/8250_of.c
@@ -18,10 +18,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
-#include "8250/8250.h"
+#include "8250.h"
 
 struct of_serial_info {
struct clk *clk;
diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kconfig
index b03cb5175113..e5ab94e381fb 100644
--- a/drivers/tty/serial/8250/Kconfig
+++ b/drivers/tty/serial/8250/Kconfig
@@ -378,9 +378,14 @@ config SERIAL_8250_MID
  present on the UART found on Intel Medfield SOC and various other
  Intel platforms.
 
+config SERIAL_8250_OF
+   tristate
+   depends on SERIAL_8250 && OF
+
 config SERIAL_OF_PLATFORM
tristate "Devicetree based probing for 8250 ports"
depends on SERIAL_8250 && OF
+   select SERIAL_8250_OF
help
  This option is used for all 8250 compatible serial ports that
  are probed through devicetree, including Open Firmware based
-- 
2.1.4



Re: linux-next: build failure after merge of the vfs tree

2016-01-06 Thread Stephen Rothwell
Hi all,

On Mon, 21 Dec 2015 11:23:01 +1100 Stephen Rothwell wrote:
>
> On Thu, 10 Dec 2015 11:18:47 +1100 Stephen Rothwell wrote:
> >
> > After merging the vfs tree, today's linux-next build (x86_64 allmodconfig)
> > failed like this:
> > 
> > fs/orangefs/symlink.c:26:2: error: unknown field 'follow_link' specified in 
> > initializer
> >   .follow_link = pvfs2_follow_link,
> >   ^
> > fs/orangefs/symlink.c:26:17: warning: initialization from incompatible 
> > pointer type [-Wincompatible-pointer-types]
> >   .follow_link = pvfs2_follow_link, 
> >  ^
> > fs/orangefs/symlink.c:26:17: note: (near initialization for 
> > 'pvfs2_symlink_inode_operations.put_link')
> > 
> > Caused by commit
> > 
> >   6b2553918d8b ("replace ->follow_link() with new method that could stay in 
> > RCU mode")
> > 
> > [I wish there was some way to stage these API changes :-(]
> > 
> > I applied the following merge fix patch (which may need more work):
> > 
> > From: Stephen Rothwell 
> > Date: Thu, 10 Dec 2015 11:12:36 +1100
> > Subject: [PATCH] orangfs: update for follow_link to get_link change
> > 
> > Signed-off-by: Stephen Rothwell 
> > ---
> >  fs/orangefs/symlink.c | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/orangefs/symlink.c b/fs/orangefs/symlink.c
> > index 2adfceff7730..dbf24a98a3c9 100644
> > --- a/fs/orangefs/symlink.c
> > +++ b/fs/orangefs/symlink.c
> > @@ -8,9 +8,15 @@
> >  #include "pvfs2-kernel.h"
> >  #include "pvfs2-bufmap.h"
> >  
> > -static const char *pvfs2_follow_link(struct dentry *dentry, void **cookie)
> > +static const char *pvfs2_get_link(struct dentry *dentry, struct inode 
> > *inode,
> > + void **cookie)
> >  {
> > -   char *target =  PVFS2_I(dentry->d_inode)->link_target;
> > +   char *target;
> > +
> > +   if (!dentry)
> > +   return ERR_PTR(-ECHILD);
> > +
> > +   target =  PVFS2_I(inode)->link_target;
> >  
> > gossip_debug(GOSSIP_INODE_DEBUG,
> >  "%s: called on %s (target is %p)\n",
> > @@ -23,7 +29,7 @@ static const char *pvfs2_follow_link(struct dentry 
> > *dentry, void **cookie)
> >  
> >  struct inode_operations pvfs2_symlink_inode_operations = {
> > .readlink = generic_readlink,
> > -   .follow_link = pvfs2_follow_link,
> > +   .get_link = pvfs2_get_link,
> > .setattr = pvfs2_setattr,
> > .getattr = pvfs2_getattr,
> > .listxattr = pvfs2_listxattr,  
> 
> This patch now looks like this (after changes to the orangefs tree):
> 
> diff --git a/fs/orangefs/symlink.c b/fs/orangefs/symlink.c
> index 1b3ae63463dc..01977e88e95d 100644
> --- a/fs/orangefs/symlink.c
> +++ b/fs/orangefs/symlink.c
> @@ -8,9 +8,15 @@
>  #include "orangefs-kernel.h"
>  #include "orangefs-bufmap.h"
>  
> -static const char *orangefs_follow_link(struct dentry *dentry, void **cookie)
> +static const char *orangefs_get_link(struct dentry *dentry, struct inode 
> *inode,
> +  void **cookie)
>  {
> - char *target =  ORANGEFS_I(dentry->d_inode)->link_target;
> + char *target;
> +
> + if (!dentry)
> + return ERR_PTR(-ECHILD);
> +
> + target = ORANGEFS_I(inode)->link_target;
>  
>   gossip_debug(GOSSIP_INODE_DEBUG,
>"%s: called on %s (target is %p)\n",
> @@ -23,7 +29,7 @@ static const char *orangefs_follow_link(struct dentry 
> *dentry, void **cookie)
>  
>  struct inode_operations orangefs_symlink_inode_operations = {
>   .readlink = generic_readlink,
> - .follow_link = orangefs_follow_link,
> + .get_link = orangefs_get_link,
>   .setattr = orangefs_setattr,
>   .getattr = orangefs_getattr,
>   .listxattr = orangefs_listxattr,

This patch now looks like this (I think this is right):

From: Stephen Rothwell 
Date: Thu, 7 Jan 2016 11:37:17 +1100
Subject: [PATCH] orangfs: update for follow_link to get_link change

Signed-off-by: Stephen Rothwell 
---
 fs/orangefs/symlink.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/orangefs/symlink.c b/fs/orangefs/symlink.c
index 1b3ae63463dc..a21083790fb8 100644
--- a/fs/orangefs/symlink.c
+++ b/fs/orangefs/symlink.c
@@ -8,22 +8,26 @@
 #include "orangefs-kernel.h"
 #include "orangefs-bufmap.h"
 
-static const char *orangefs_follow_link(struct dentry *dentry, void **cookie)
+static const char *orangefs_get_link(struct dentry *dentry, struct inode *inode,
+struct delayed_call *done)
 {
-   char *target =  ORANGEFS_I(dentry->d_inode)->link_target;
+   char *target;
+
+   if (!dentry)
+   return ERR_PTR(-ECHILD);
+
+   target =  ORANGEFS_I(dentry->d_inode)->link_target;
 
gossip_debug(GOSSIP_INODE_DEBUG,
 "%s: called on %s (target is %p)\n",
 __func__, (char 

[PATCH RESEND v4 05/11] staging: fsl-mc: Extended MC bus allocator to include IRQs

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

All the IRQs for DPAA2 objects in the same DPRC must use
the ICID of that DPRC, as their device Id in the GIC-ITS.
Thus, all these IRQs must share the same ITT table in the GIC.
As a result, a pool of IRQs with the same device Id must be
preallocated per DPRC (fsl-mc bus instance). So, the fsl-mc
bus object allocator is extended to also provide services
to allocate IRQs to DPAA2 devices, from their parent fsl-mc bus
IRQ pool.
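
As a rough illustration of the pool semantics described above, here is a
self-contained userspace sketch. It is not the driver code: the model_*
names and MODEL_POOL_MAX_IRQS are hypothetical, and a simple singly
linked free list stands in for the kernel list. It shows the three
properties the patch relies on: IRQs are preallocated once per bus,
handed out from a free list, and the pool can only be torn down once
every IRQ has been returned.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_POOL_MAX_IRQS 8	/* hypothetical limit for this sketch */

struct model_irq {
	unsigned int id;
	struct model_irq *next;	/* link in the pool's free list */
};

struct model_irq_pool {
	struct model_irq irqs[MODEL_POOL_MAX_IRQS];
	struct model_irq *free_list;
	unsigned int max_count;
	unsigned int free_count;
};

/* Preallocate irq_count IRQs for one bus instance. */
static int model_pool_populate(struct model_irq_pool *pool,
			       unsigned int irq_count)
{
	unsigned int i;

	if (irq_count == 0 || irq_count > MODEL_POOL_MAX_IRQS)
		return -1;

	pool->free_list = NULL;
	/* Push in reverse so that IRQ 0 ends up at the head of the list. */
	for (i = irq_count; i > 0; i--) {
		pool->irqs[i - 1].id = i - 1;
		pool->irqs[i - 1].next = pool->free_list;
		pool->free_list = &pool->irqs[i - 1];
	}
	pool->max_count = irq_count;
	pool->free_count = irq_count;
	return 0;
}

/* Hand out one IRQ to a child device, or NULL if the pool is exhausted. */
static struct model_irq *model_pool_alloc(struct model_irq_pool *pool)
{
	struct model_irq *irq = pool->free_list;

	if (!irq)
		return NULL;
	pool->free_list = irq->next;
	irq->next = NULL;
	pool->free_count--;
	return irq;
}

/* Return an IRQ to the pool it came from. */
static void model_pool_free(struct model_irq_pool *pool, struct model_irq *irq)
{
	assert(pool->free_count < pool->max_count);
	irq->next = pool->free_list;
	pool->free_list = irq;
	pool->free_count++;
}

/* Refuse teardown while any IRQ is still handed out. */
static int model_pool_cleanup(struct model_irq_pool *pool)
{
	if (pool->max_count == 0 || pool->free_count != pool->max_count)
		return -1;
	pool->free_list = NULL;
	pool->max_count = 0;
	pool->free_count = 0;
	return 0;
}
```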

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4: none

Changes in v3: none

Changes in v2: none

 drivers/staging/fsl-mc/bus/mc-allocator.c   | 199 
 drivers/staging/fsl-mc/include/mc-private.h |  15 +++
 drivers/staging/fsl-mc/include/mc.h |   9 ++
 3 files changed, 223 insertions(+)

diff --git a/drivers/staging/fsl-mc/bus/mc-allocator.c b/drivers/staging/fsl-mc/bus/mc-allocator.c
index 88d1857..c5fa628 100644
--- a/drivers/staging/fsl-mc/bus/mc-allocator.c
+++ b/drivers/staging/fsl-mc/bus/mc-allocator.c
@@ -15,6 +15,7 @@
 #include "../include/dpcon-cmd.h"
 #include "dpmcp-cmd.h"
 #include "dpmcp.h"
+#include 

 /**
  * fsl_mc_resource_pool_add_device - add allocatable device to a resource
@@ -160,6 +161,7 @@ static const char *const fsl_mc_pool_type_strings[] = {
[FSL_MC_POOL_DPMCP] = "dpmcp",
[FSL_MC_POOL_DPBP] = "dpbp",
[FSL_MC_POOL_DPCON] = "dpcon",
+   [FSL_MC_POOL_IRQ] = "irq",
 };

 static int __must_check object_type_to_pool_type(const char *object_type,
@@ -465,6 +467,203 @@ void fsl_mc_object_free(struct fsl_mc_device *mc_adev)
 }
 EXPORT_SYMBOL_GPL(fsl_mc_object_free);

+/*
+ * Initialize the interrupt pool associated with a MC bus.
+ * It allocates a block of IRQs from the GIC-ITS
+ */
+int fsl_mc_populate_irq_pool(struct fsl_mc_bus *mc_bus,
+unsigned int irq_count)
+{
+   unsigned int i;
+   struct msi_desc *msi_desc;
+   struct fsl_mc_device_irq *irq_resources;
+   struct fsl_mc_device_irq *mc_dev_irq;
+   int error;
+   struct fsl_mc_device *mc_bus_dev = &mc_bus->mc_dev;
+   struct fsl_mc_resource_pool *res_pool =
+   &mc_bus->resource_pools[FSL_MC_POOL_IRQ];
+
+   if (WARN_ON(irq_count == 0 ||
+   irq_count > FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS))
+   return -EINVAL;
+
+   error = fsl_mc_msi_domain_alloc_irqs(&mc_bus_dev->dev, irq_count);
+   if (error < 0)
+   return error;
+
+   irq_resources = devm_kzalloc(&mc_bus_dev->dev,
+sizeof(*irq_resources) * irq_count,
+GFP_KERNEL);
+   if (!irq_resources) {
+   error = -ENOMEM;
+   goto cleanup_msi_irqs;
+   }
+
+   for (i = 0; i < irq_count; i++) {
+   mc_dev_irq = &irq_resources[i];
+
+   /*
+* NOTE: This mc_dev_irq's MSI addr/value pair will be set
+* by the fsl_mc_msi_write_msg() callback
+*/
+   mc_dev_irq->resource.type = res_pool->type;
+   mc_dev_irq->resource.data = mc_dev_irq;
+   mc_dev_irq->resource.parent_pool = res_pool;
+   INIT_LIST_HEAD(&mc_dev_irq->resource.node);
+   list_add_tail(&mc_dev_irq->resource.node, &res_pool->free_list);
+   }
+
+   for_each_msi_entry(msi_desc, &mc_bus_dev->dev) {
+   mc_dev_irq = &irq_resources[msi_desc->fsl_mc.msi_index];
+   mc_dev_irq->msi_desc = msi_desc;
+   mc_dev_irq->resource.id = msi_desc->irq;
+   }
+
+   res_pool->max_count = irq_count;
+   res_pool->free_count = irq_count;
+   mc_bus->irq_resources = irq_resources;
+   return 0;
+
+cleanup_msi_irqs:
+   fsl_mc_msi_domain_free_irqs(&mc_bus_dev->dev);
+   return error;
+}
+EXPORT_SYMBOL_GPL(fsl_mc_populate_irq_pool);
+
+/**
+ * Teardown the interrupt pool associated with an MC bus.
+ * It frees the IRQs that were allocated to the pool, back to the GIC-ITS.
+ */
+void fsl_mc_cleanup_irq_pool(struct fsl_mc_bus *mc_bus)
+{
+   struct fsl_mc_device *mc_bus_dev = &mc_bus->mc_dev;
+   struct fsl_mc_resource_pool *res_pool =
+   &mc_bus->resource_pools[FSL_MC_POOL_IRQ];
+
+   if (WARN_ON(!mc_bus->irq_resources))
+   return;
+
+   if (WARN_ON(res_pool->max_count == 0))
+   return;
+
+   if (WARN_ON(res_pool->free_count != res_pool->max_count))
+   return;
+
+   INIT_LIST_HEAD(&res_pool->free_list);
+   res_pool->max_count = 0;
+   res_pool->free_count = 0;
+   mc_bus->irq_resources = NULL;
+   fsl_mc_msi_domain_free_irqs(&mc_bus_dev->dev);
+}
+EXPORT_SYMBOL_GPL(fsl_mc_cleanup_irq_pool);
+
+/**
+ * It allocates the IRQs required by a given MC object device. The
+ * IRQs are allocated from the interrupt pool associated with the
+ * MC bus that contains the device, if the device is 

[PATCH RESEND v4 03/11] staging: fsl-mc: Added generic MSI support for FSL-MC devices

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

Created an MSI domain for the fsl-mc bus, including functions
to create a domain, find a domain, alloc/free domain irqs, and
bus specific overrides for domain and irq_chip ops.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4:
- Addressed comments from Marc Zyngier:
  * Re-implemented fsl_mc_find_msi_domain() using of_msi_get_domain()

Changes in v3:
- Addressed comments from Marc Zyngier:
  * Added WARN_ON in fsl_mc_msi_set_desc to check that caller does
not set set_desc
  * Changed type of paddr in irq_cfg to be phys_addr_t
  * Added WARN_ON in fsl_mc_msi_update_chip_op() to check that caller
does not set irq_write_msi_msg

Changes in v2: none

 drivers/staging/fsl-mc/bus/Kconfig  |   1 +
 drivers/staging/fsl-mc/bus/Makefile |   1 +
 drivers/staging/fsl-mc/bus/mc-msi.c | 276 
 drivers/staging/fsl-mc/include/dprc.h   |   2 +-
 drivers/staging/fsl-mc/include/mc-private.h |  17 ++
 drivers/staging/fsl-mc/include/mc.h |  17 ++
 6 files changed, 313 insertions(+), 1 deletion(-)
 create mode 100644 drivers/staging/fsl-mc/bus/mc-msi.c

diff --git a/drivers/staging/fsl-mc/bus/Kconfig b/drivers/staging/fsl-mc/bus/Kconfig
index 0d779d9..c498ac6 100644
--- a/drivers/staging/fsl-mc/bus/Kconfig
+++ b/drivers/staging/fsl-mc/bus/Kconfig
@@ -9,6 +9,7 @@
 config FSL_MC_BUS
tristate "Freescale Management Complex (MC) bus driver"
depends on OF && ARM64
+   select GENERIC_MSI_IRQ_DOMAIN
help
  Driver to enable the bus infrastructure for the Freescale
   QorIQ Management Complex (fsl-mc). The fsl-mc is a hardware
diff --git a/drivers/staging/fsl-mc/bus/Makefile b/drivers/staging/fsl-mc/bus/Makefile
index 25433a9..a5f2ba4 100644
--- a/drivers/staging/fsl-mc/bus/Makefile
+++ b/drivers/staging/fsl-mc/bus/Makefile
@@ -13,5 +13,6 @@ mc-bus-driver-objs := mc-bus.o \
  dpmng.o \
  dprc-driver.o \
  mc-allocator.o \
+ mc-msi.o \
  dpmcp.o \
  dpbp.o
diff --git a/drivers/staging/fsl-mc/bus/mc-msi.c b/drivers/staging/fsl-mc/bus/mc-msi.c
new file mode 100644
index 000..3a8258f
--- /dev/null
+++ b/drivers/staging/fsl-mc/bus/mc-msi.c
@@ -0,0 +1,276 @@
+/*
+ * Freescale Management Complex (MC) bus driver MSI support
+ *
+ * Copyright (C) 2015 Freescale Semiconductor, Inc.
+ * Author: German Rivera 
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include "../include/mc-private.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../include/mc-sys.h"
+#include "dprc-cmd.h"
+
+static void fsl_mc_msi_set_desc(msi_alloc_info_t *arg,
+   struct msi_desc *desc)
+{
+   arg->desc = desc;
+   arg->hwirq = (irq_hw_number_t)desc->fsl_mc.msi_index;
+}
+
+static void fsl_mc_msi_update_dom_ops(struct msi_domain_info *info)
+{
+   struct msi_domain_ops *ops = info->ops;
+
+   if (WARN_ON(!ops))
+   return;
+
+   /*
+* set_desc should not be set by the caller
+*/
+   if (WARN_ON(ops->set_desc))
+   return;
+
+   ops->set_desc = fsl_mc_msi_set_desc;
+}
+
+static void __fsl_mc_msi_write_msg(struct fsl_mc_device *mc_bus_dev,
+  struct fsl_mc_device_irq *mc_dev_irq)
+{
+   int error;
+   struct fsl_mc_device *owner_mc_dev = mc_dev_irq->mc_dev;
+   struct msi_desc *msi_desc = mc_dev_irq->msi_desc;
+   struct dprc_irq_cfg irq_cfg;
+
+   /*
+* msi_desc->msg.address is 0x0 when this function is invoked in
+* the free_irq() code path. In this case, for the MC, we don't
+* really need to "unprogram" the MSI, so we just return.
+*/
+   if (msi_desc->msg.address_lo == 0x0 && msi_desc->msg.address_hi == 0x0)
+   return;
+
+   if (WARN_ON(!owner_mc_dev))
+   return;
+
+   irq_cfg.paddr = ((u64)msi_desc->msg.address_hi << 32) |
+   msi_desc->msg.address_lo;
+   irq_cfg.val = msi_desc->msg.data;
+   irq_cfg.user_irq_id = msi_desc->irq;
+
+   if (owner_mc_dev == mc_bus_dev) {
+   /*
+* IRQ is for the mc_bus_dev's DPRC itself
+*/
+   error = dprc_set_irq(mc_bus_dev->mc_io,
+MC_CMD_FLAG_INTR_DIS | MC_CMD_FLAG_PRI,
+mc_bus_dev->mc_handle,
+mc_dev_irq->dev_irq_index,
+&irq_cfg);
+   if (error < 0) {
+   dev_err(&owner_mc_dev->dev,
+

[PATCH RESEND v4 04/11] staging: fsl-mc: Added GICv3-ITS support for FSL-MC MSIs

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

Added platform-specific MSI support layer for FSL-MC devices.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4:
- Addressed comments from Marc Zyngier:
  * Moved bus type check earlier in its_fsl_mc_msi_prepare()
  * Removed its_dev_id variable
  * Changed some assignments to keep both sides on the same line

Changes in v3: none

Changes in v2: none

 drivers/staging/fsl-mc/bus/Makefile|   1 +
 .../staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c | 127 +
 drivers/staging/fsl-mc/include/mc-private.h|   4 +
 3 files changed, 132 insertions(+)
 create mode 100644 drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c

diff --git a/drivers/staging/fsl-mc/bus/Makefile b/drivers/staging/fsl-mc/bus/Makefile
index a5f2ba4..e731517 100644
--- a/drivers/staging/fsl-mc/bus/Makefile
+++ b/drivers/staging/fsl-mc/bus/Makefile
@@ -14,5 +14,6 @@ mc-bus-driver-objs := mc-bus.o \
  dprc-driver.o \
  mc-allocator.o \
  mc-msi.o \
+ irq-gic-v3-its-fsl-mc-msi.o \
  dpmcp.o \
  dpbp.o
diff --git a/drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c b/drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c
new file mode 100644
index 000..4e8e822
--- /dev/null
+++ b/drivers/staging/fsl-mc/bus/irq-gic-v3-its-fsl-mc-msi.c
@@ -0,0 +1,127 @@
+/*
+ * Freescale Management Complex (MC) bus driver MSI support
+ *
+ * Copyright (C) 2015 Freescale Semiconductor, Inc.
+ * Author: German Rivera 
+ *
+ * This file is licensed under the terms of the GNU General Public
+ * License version 2. This program is licensed "as is" without any
+ * warranty of any kind, whether express or implied.
+ */
+
+#include "../include/mc-private.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "../include/mc-sys.h"
+#include "dprc-cmd.h"
+
+static struct irq_chip its_msi_irq_chip = {
+   .name = "fsl-mc-bus-msi",
+   .irq_mask = irq_chip_mask_parent,
+   .irq_unmask = irq_chip_unmask_parent,
+   .irq_eoi = irq_chip_eoi_parent,
+   .irq_set_affinity = msi_domain_set_affinity
+};
+
+static int its_fsl_mc_msi_prepare(struct irq_domain *msi_domain,
+ struct device *dev,
+ int nvec, msi_alloc_info_t *info)
+{
+   struct fsl_mc_device *mc_bus_dev;
+   struct msi_domain_info *msi_info;
+
+   if (WARN_ON(dev->bus != &fsl_mc_bus_type))
+   return -EINVAL;
+
+   mc_bus_dev = to_fsl_mc_device(dev);
+   if (WARN_ON(!(mc_bus_dev->flags & FSL_MC_IS_DPRC)))
+   return -EINVAL;
+
+   /*
+* Set the device Id to be passed to the GIC-ITS:
+*
+* NOTE: This device id corresponds to the IOMMU stream ID
+* associated with the DPRC object (ICID).
+*/
+   info->scratchpad[0].ul = mc_bus_dev->icid;
+   msi_info = msi_get_domain_info(msi_domain->parent);
+   return msi_info->ops->msi_prepare(msi_domain->parent, dev, nvec, info);
+}
+
+static struct msi_domain_ops its_fsl_mc_msi_ops = {
+   .msi_prepare = its_fsl_mc_msi_prepare,
+};
+
+static struct msi_domain_info its_fsl_mc_msi_domain_info = {
+   .flags  = (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS),
+   .ops= &its_fsl_mc_msi_ops,
+   .chip   = &its_msi_irq_chip,
+};
+
+static const struct of_device_id its_device_id[] = {
+   {   .compatible = "arm,gic-v3-its", },
+   {},
+};
+
+int __init its_fsl_mc_msi_init(void)
+{
+   struct device_node *np;
+   struct irq_domain *parent;
+   struct irq_domain *mc_msi_domain;
+
+   for (np = of_find_matching_node(NULL, its_device_id); np;
+np = of_find_matching_node(np, its_device_id)) {
+   if (!of_property_read_bool(np, "msi-controller"))
+   continue;
+
+   parent = irq_find_matching_host(np, DOMAIN_BUS_NEXUS);
+   if (!parent || !msi_get_domain_info(parent)) {
+   pr_err("%s: unable to locate ITS domain\n",
+  np->full_name);
+   continue;
+   }
+
+   mc_msi_domain = fsl_mc_msi_create_irq_domain(
+of_node_to_fwnode(np),
+&its_fsl_mc_msi_domain_info,
+parent);
+   if (!mc_msi_domain) {
+   pr_err("%s: unable to create fsl-mc domain\n",
+  np->full_name);
+   continue;
+   }
+
+   WARN_ON(mc_msi_domain->
+   host_data != &its_fsl_mc_msi_domain_info);
+
+   pr_info("fsl-mc MSI: %s domain 

[PATCH RESEND v4 11/11] staging: fsl-mc: Added MSI support to the MC bus driver

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

Initialize/Cleanup ITS-MSI support for the MC bus driver at driver
init/exit time. Associate an MSI domain with each DPAA2 child device.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4: none

Changes in v3: none

Changes in v2: none

 drivers/staging/fsl-mc/bus/mc-bus.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/staging/fsl-mc/bus/mc-bus.c b/drivers/staging/fsl-mc/bus/mc-bus.c
index d34f1af..9317561 100644
--- a/drivers/staging/fsl-mc/bus/mc-bus.c
+++ b/drivers/staging/fsl-mc/bus/mc-bus.c
@@ -16,6 +16,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include "../include/dpmng.h"
 #include "../include/mc-sys.h"
 #include "dprc-cmd.h"
@@ -472,6 +474,8 @@ int fsl_mc_device_add(struct dprc_obj_desc *obj_desc,
mc_dev->icid = parent_mc_dev->icid;
mc_dev->dma_mask = FSL_MC_DEFAULT_DMA_MASK;
mc_dev->dev.dma_mask = &mc_dev->dma_mask;
+   dev_set_msi_domain(&mc_dev->dev,
+  dev_get_msi_domain(&parent_mc_dev->dev));
}

/*
@@ -833,8 +837,15 @@ static int __init fsl_mc_bus_driver_init(void)
if (error < 0)
goto error_cleanup_dprc_driver;

+   error = its_fsl_mc_msi_init();
+   if (error < 0)
+   goto error_cleanup_mc_allocator;
+
return 0;

+error_cleanup_mc_allocator:
+   fsl_mc_allocator_driver_exit();
+
 error_cleanup_dprc_driver:
dprc_driver_exit();

@@ -856,6 +867,7 @@ static void __exit fsl_mc_bus_driver_exit(void)
if (WARN_ON(!mc_dev_cache))
return;

+   its_fsl_mc_msi_cleanup();
fsl_mc_allocator_driver_exit();
dprc_driver_exit();
platform_driver_unregister(&fsl_mc_bus_driver);
--
2.3.3



[PATCH 2/4] x86/fpu: Disable XGETBV1 when no XSAVE

2016-01-06 Thread yu-cheng yu
When "noxsave" is given as a command-line input, the kernel should disable
XGETBV1. This issue currently does not cause any actual problems. XGETBV1
is only useful if we have something using the 'init optimization' (i.e.
xsaveopt, xsaves). We already clear both of those in
fpu__xstate_clear_all_cpu_caps(). But this is good for completeness.

Signed-off-by: Yu-cheng Yu 
Reviewed-by: Dave Hansen 
Cc: x...@kernel.org
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
Cc: Dave Hansen 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Sai Praneeth Prakhya 
Cc: Ravi V. Shankar 
Cc: Fenghua Yu 
---
 arch/x86/kernel/fpu/xstate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 70fc312..b27d3b6 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -52,6 +52,7 @@ void fpu__xstate_clear_all_cpu_caps(void)
setup_clear_cpu_cap(X86_FEATURE_AVX512ER);
setup_clear_cpu_cap(X86_FEATURE_AVX512CD);
setup_clear_cpu_cap(X86_FEATURE_MPX);
+   setup_clear_cpu_cap(X86_FEATURE_XGETBV1);
 }
 
 /*
-- 
1.9.1



[PATCH 4/4] x86/fpu: Disable AVX when eagerfpu is off

2016-01-06 Thread yu-cheng yu
When "eagerfpu=off" is given as a command-line input, the kernel should
disable AVX support.

The Task Switched bit used for lazy context switching does not support
AVX. If AVX is enabled without eagerfpu context switching, one task's AVX
state could become corrupted or leak to other tasks. This is a bug and has
bad security implications.

This only affects systems that have AVX/AVX2/AVX512 and this issue will be
found only when one actually uses AVX/AVX2/AVX512 _AND_ does eagerfpu=off.

Reference: Intel Software Developer's Manual Vol. 3A

Sec. 2.5 Control Registers:
TS Task Switched bit (bit 3 of CR0) -- Allows the saving of the x87 FPU/
MMX/SSE/SSE2/SSE3/SSSE3/SSE4 context on a task switch to be delayed until
an x87 FPU/MMX/SSE/SSE2/SSE3/SSSE3/SSE4 instruction is actually executed
by the new task.

Sec. 13.4.1 Using the TS Flag to Control the Saving of the X87 FPU and SSE
State
When the TS flag is set, the processor monitors the instruction stream for
x87 FPU, MMX, SSE instructions. When the processor detects one of these
instructions, it raises a device-not-available exception (#NM) prior to
executing the instruction.
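
For illustration, a minimal user-space sketch of the mask split this patch
sets up. It assumes the architectural XSAVE state-component bit positions for
the XFEATURE_MASK_* values; usable_xfeatures() is a hypothetical helper and
not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* XSAVE state-component bits (assumed: architectural XCR0 bit positions). */
#define XFEATURE_MASK_FP        (1ULL << 0)
#define XFEATURE_MASK_SSE       (1ULL << 1)
#define XFEATURE_MASK_YMM       (1ULL << 2)
#define XFEATURE_MASK_BNDREGS   (1ULL << 3)
#define XFEATURE_MASK_BNDCSR    (1ULL << 4)
#define XFEATURE_MASK_OPMASK    (1ULL << 5)
#define XFEATURE_MASK_ZMM_Hi256 (1ULL << 6)
#define XFEATURE_MASK_Hi16_ZMM  (1ULL << 7)

/* Only x87 and SSE state can be switched lazily via CR0.TS. */
#define XFEATURE_MASK_LAZY  (XFEATURE_MASK_FP | XFEATURE_MASK_SSE)

/* MPX and all AVX state require eager switching after this patch. */
#define XFEATURE_MASK_EAGER (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR | \
                             XFEATURE_MASK_YMM | XFEATURE_MASK_OPMASK | \
                             XFEATURE_MASK_ZMM_Hi256 | XFEATURE_MASK_Hi16_ZMM)

#define XCNTXT_MASK (XFEATURE_MASK_LAZY | XFEATURE_MASK_EAGER)

static_assert((XFEATURE_MASK_LAZY & XFEATURE_MASK_EAGER) == 0,
              "lazy and eager masks must not overlap");

/*
 * Hypothetical helper: which state components remain usable, given what
 * the hardware enumerates and whether eagerfpu=off was requested.
 */
static uint64_t usable_xfeatures(uint64_t hw_mask, bool eagerfpu_off)
{
    uint64_t mask = hw_mask & XCNTXT_MASK;

    if (eagerfpu_off)
        mask &= ~XFEATURE_MASK_EAGER;
    return mask;
}
```

With every component present in hardware (0xff), usable_xfeatures(0xff, true)
leaves only FP and SSE (0x3), which is the effect "eagerfpu=off" is meant to
have.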

Signed-off-by: Yu-cheng Yu 
Cc: x...@kernel.org
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
Cc: Dave Hansen 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Sai Praneeth Prakhya 
Cc: Ravi V. Shankar 
Cc: Fenghua Yu 
---
 arch/x86/include/asm/fpu/xstate.h | 11 ++-
 arch/x86/kernel/fpu/init.c|  6 ++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 3a6c89b..af30fde 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -20,15 +20,16 @@
 
 /* Supported features which support lazy state saving */
 #define XFEATURE_MASK_LAZY (XFEATURE_MASK_FP | \
-XFEATURE_MASK_SSE | \
+XFEATURE_MASK_SSE)
+
+/* Supported features which require eager state saving */
+#define XFEATURE_MASK_EAGER (XFEATURE_MASK_BNDREGS | \
+XFEATURE_MASK_BNDCSR | \
 XFEATURE_MASK_YMM | \
-XFEATURE_MASK_OPMASK | \
+XFEATURE_MASK_OPMASK | \
 XFEATURE_MASK_ZMM_Hi256 | \
 XFEATURE_MASK_Hi16_ZMM)
 
-/* Supported features which require eager state saving */
-#define XFEATURE_MASK_EAGER (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR)
-
 /* All currently supported features */
 #define XCNTXT_MASK (XFEATURE_MASK_LAZY | XFEATURE_MASK_EAGER)
 
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index aad53cc..e7e3dbfdb 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -291,6 +291,12 @@ u64 __init fpu__get_supported_xfeatures_mask(void)
 static void __init fpu__clear_eager_fpu_features(void)
 {
setup_clear_cpu_cap(X86_FEATURE_MPX);
+   setup_clear_cpu_cap(X86_FEATURE_AVX);
+   setup_clear_cpu_cap(X86_FEATURE_AVX2);
+   setup_clear_cpu_cap(X86_FEATURE_AVX512F);
+   setup_clear_cpu_cap(X86_FEATURE_AVX512PF);
+   setup_clear_cpu_cap(X86_FEATURE_AVX512ER);
+   setup_clear_cpu_cap(X86_FEATURE_AVX512CD);
 }
 
 /*
-- 
1.9.1



[PATCH 3/4] x86/fpu: Disable MPX when eagerfpu is off

2016-01-06 Thread yu-cheng yu
This issue is a fallout from the command-line parsing move.

When "eagerfpu=off" is given as a command-line input, the kernel should
disable MPX support. The decision for turning off MPX was made in
fpu__init_system_ctx_switch(), which is after the selection of the XSAVE
format. This patch fixes it by getting that decision done earlier in
fpu__init_system_xstate().
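
The resulting decision in fpu__init_system_ctx_switch() can be modelled in
user space roughly as follows. This is a sketch under the assumption that the
eager-only bits have already been masked out of xfeatures_mask when
"eagerfpu=off" was given; resolve_eagerfpu() and its parameters are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum eagerfpu_mode { AUTO, ENABLE, DISABLE };

/*
 * Hypothetical model of the context-switch strategy selection.
 * 'has_eager_features' stands for (xfeatures_mask & XFEATURE_MASK_EAGER).
 */
static enum eagerfpu_mode resolve_eagerfpu(enum eagerfpu_mode requested,
                                           bool has_xsaveopt,
                                           bool has_eager_features)
{
    enum eagerfpu_mode mode = requested;

    /*
     * With eagerfpu=off the eager-only features were dropped earlier,
     * so both conditions cannot hold at this point.
     */
    assert(!(requested == DISABLE && has_eager_features));

    /* xsaveopt makes eager switching cheap, so prefer it unless disabled. */
    if (has_xsaveopt && mode != DISABLE)
        mode = ENABLE;

    /* Features such as MPX only work with eager switching. */
    if (has_eager_features)
        mode = ENABLE;

    return mode;
}
```

Because the mask is trimmed before this point, the second check no longer
needs its own DISABLE branch, which is what lets the patch shorten it.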

Signed-off-by: Yu-cheng Yu 
Cc: x...@kernel.org
Cc: H. Peter Anvin 
Cc: Thomas Gleixner 
Cc: Dave Hansen 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Sai Praneeth Prakhya 
Cc: Ravi V. Shankar 
Cc: Fenghua Yu 
---
 arch/x86/include/asm/fpu/internal.h |  1 +
 arch/x86/kernel/fpu/init.c  | 56 +
 arch/x86/kernel/fpu/xstate.c|  3 +-
 3 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 3c3550c..6b07a842 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -42,6 +42,7 @@ extern void fpu__init_cpu_xstate(void);
 extern void fpu__init_system(struct cpuinfo_x86 *c);
 extern void fpu__init_check_bugs(void);
 extern void fpu__resume_cpu(void);
+extern u64 fpu__get_supported_xfeatures_mask(void);
 
 /*
  * Debugging facility:
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 057b73e..aad53cc 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -265,7 +265,45 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 static enum { AUTO, ENABLE, DISABLE } eagerfpu = AUTO;
 
 /*
+ * Find supported xfeatures based on cpu features and command-line input.
+ * This must be called after fpu__init_parse_early_param() is called and
+ * xfeatures_mask is enumerated.
+ */
+u64 __init fpu__get_supported_xfeatures_mask(void)
+{
+   /* Support all xfeatures known to us */
+   if (eagerfpu != DISABLE)
+   return XCNTXT_MASK;
+
+   /* Warning of xfeatures being disabled for no eagerfpu mode */
+   if (xfeatures_mask & XFEATURE_MASK_EAGER) {
+   pr_err("x86/fpu: eagerfpu switching disabled, disabling the following xstate features: 0x%llx.\n",
+   xfeatures_mask & XFEATURE_MASK_EAGER);
+   }
+
+   /* Return a mask that masks out all features requiring eagerfpu mode */
+   return ~XFEATURE_MASK_EAGER;
+}
+
+/*
+ * Disable features dependent on eagerfpu.
+ */
+static void __init fpu__clear_eager_fpu_features(void)
+{
+   setup_clear_cpu_cap(X86_FEATURE_MPX);
+}
+
+/*
  * Pick the FPU context switching strategy:
+ *
+ * When eagerfpu is AUTO or ENABLE, we ensure it is ENABLE if either of
+ * the following is true:
+ *
+ * (1) the cpu has xsaveopt, as it has the optimization and doing eager
+ * FPU switching has a relatively low cost compared to a plain xsave;
+ * (2) the cpu has xsave features (e.g. MPX) that depend on eager FPU
+ * switching. Should the kernel boot with noxsaveopt, we support MPX
+ * with eager FPU switching at a higher cost.
  */
 static void __init fpu__init_system_ctx_switch(void)
 {
@@ -277,19 +315,11 @@ static void __init fpu__init_system_ctx_switch(void)
WARN_ON_FPU(current->thread.fpu.fpstate_active);
current_thread_info()->status = 0;
 
-   /* Auto enable eagerfpu for xsaveopt */
if (cpu_has_xsaveopt && eagerfpu != DISABLE)
eagerfpu = ENABLE;
 
-   if (xfeatures_mask & XFEATURE_MASK_EAGER) {
-   if (eagerfpu == DISABLE) {
-   pr_err("x86/fpu: eagerfpu switching disabled, disabling the following xstate features: 0x%llx.\n",
-  xfeatures_mask & XFEATURE_MASK_EAGER);
-   xfeatures_mask &= ~XFEATURE_MASK_EAGER;
-   } else {
-   eagerfpu = ENABLE;
-   }
-   }
+   if (xfeatures_mask & XFEATURE_MASK_EAGER)
+   eagerfpu = ENABLE;
 
if (eagerfpu == ENABLE)
setup_force_cpu_cap(X86_FEATURE_EAGER_FPU);
@@ -307,10 +337,12 @@ static void __init fpu__init_parse_early_param(void)
 * No need to check "eagerfpu=auto" again, since it is the
 * initial default.
 */
-   if (cmdline_find_option_bool(boot_command_line, "eagerfpu=off"))
+   if (cmdline_find_option_bool(boot_command_line, "eagerfpu=off")) {
eagerfpu = DISABLE;
-   else if (cmdline_find_option_bool(boot_command_line, "eagerfpu=on"))
+   fpu__clear_eager_fpu_features();
+   } else if (cmdline_find_option_bool(boot_command_line, "eagerfpu=on")) {
eagerfpu = ENABLE;
+   }
 
if (cmdline_find_option_bool(boot_command_line, "no387"))
setup_clear_cpu_cap(X86_FEATURE_FPU);
diff 

Re: [PATCH] input: gpio_keys: Fix check for disabling unsupported key

2016-01-06 Thread Dmitry Torokhov
On Tue, Jan 05, 2016 at 09:24:40AM +0200, Ivaylo Dimitrov wrote:
> Hi,
> 
> On  5.01.2016 03:19, Dmitry Torokhov wrote:
> >>/* First validate */
> >>-   for (i = 0; i < ddata->pdata->nbuttons; i++) {
> >>-   struct gpio_button_data *bdata = &ddata->data[i];
> >>+   for (i = 0; i < n_events; i++) {
> >
> >for_each_set_bit()?
> 
> Yeah, seems I must have overslept that helper, will send an updated version.
> 
> >
> >OTOH maybe we should do
> >
> > bitmap = get_bitmap_events_by_type(type); // new, return keybit or swbit
> 
> new helper function? or static function in gpio-keys? who
> allocates/frees the bitmap memory? or this is static data? Maybe I
> don't get the idea :) .
> 
> > if (!bitmap_subset(bits, bitmap, n_events)) {
> > error = -EINVAL;
> > goto out;
> > }
> >
> >... and leave the rest of the loop as is?
> >
> 
> Not sure about that. Unless I miss something, what we want is:
> 
> 1. make sure that what user has written is within the range of the
> event type. I hope bitmap_parselist already does it for us.
> 
> 2. Make sure that for every bit in bits set based on what user has
> provided, there is a matching gpio in this particular gpio-keys
> device.
> 
> 3. Make sure that every gpio user wants disabled is actually allowed
> to be disabled.
> 
> I don't see how 2 is achieved with ^^^ code.
> 
> So, shall I send a new version of the patch with for_each_set_bit()
> used, or you'll fix the $subject problem with whatever magic you
> think is needed?

How about the patch below (compiled but not tested)?

Thanks.

-- 
Dmitry

Input: gpio-keys - fix check for disabling unsupported keys

From: Dmitry Torokhov 

Commit 4ea14a53d8f881034fa9e186653821c4e3d9a8fb ("Input: gpio-keys - report
error when disabling unsupported key") tried to let the user know that they
attempted to disable an unsupported key. Unfortunately the check is wrong,
as it believes that all codes are invalid. Fix it by ensuring that the keys
that we try to disable are a subset of the keys (or switches) that the device
reports.
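
A simplified user-space stand-in for the subset check, assuming bitmaps are
stored as arrays of unsigned long as in the kernel. bitmap_is_subset() is a
hypothetical name and is not the kernel's bitmap_subset() implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return true if every bit set in 'requested' within the first 'nbits'
 * bits is also set in 'supported'.
 */
static bool bitmap_is_subset(const unsigned long *requested,
                             const unsigned long *supported, size_t nbits)
{
    size_t full_words = nbits / BITS_PER_WORD;
    size_t rem = nbits % BITS_PER_WORD;
    size_t i;

    assert(requested != NULL && supported != NULL);

    for (i = 0; i < full_words; i++)
        if (requested[i] & ~supported[i])
            return false;

    if (rem) {
        /* Only the low 'rem' bits of the last word are meaningful. */
        unsigned long mask = (1UL << rem) - 1;

        if (requested[full_words] & ~supported[full_words] & mask)
            return false;
    }
    return true;
}
```

If the device reports codes 0, 1 and 3 (0x0b), a request for codes 0 and 3
(0x09) passes, while a request for code 2 (0x04) is rejected, which is where
the store helper would return -EINVAL.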

Fixes: 4ea14a53d8f8 ("Input: gpio-keys - report error when disabling unsupported key")
Reported-by: Ivaylo Dimitrov 
Signed-off-by: Dmitry Torokhov 
---
 drivers/input/keyboard/gpio_keys.c |   29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/drivers/input/keyboard/gpio_keys.c b/drivers/input/keyboard/gpio_keys.c
index bef317f..b9f01bd 100644
--- a/drivers/input/keyboard/gpio_keys.c
+++ b/drivers/input/keyboard/gpio_keys.c
@@ -96,7 +96,7 @@ struct gpio_keys_drvdata {
  * Return value of this function can be used to allocate bitmap
  * large enough to hold all bits for given type.
  */
-static inline int get_n_events_by_type(int type)
+static int get_n_events_by_type(int type)
 {
BUG_ON(type != EV_SW && type != EV_KEY);
 
@@ -104,6 +104,22 @@ static inline int get_n_events_by_type(int type)
 }
 
 /**
+ * get_bm_events_by_type() - returns bitmap of supported events per @type
+ * @dev: input device from which bitmap is retrieved
+ * @type: type of button (%EV_KEY, %EV_SW)
+ *
+ * Return value of this function is the device's bitmap of supported
+ * events (keybit or swbit) for the given type.
+ */
+static const unsigned long *get_bm_events_by_type(struct input_dev *dev,
+ int type)
+{
+   BUG_ON(type != EV_SW && type != EV_KEY);
+
+   return (type == EV_KEY) ? dev->keybit : dev->swbit;
+}
+
+/**
  * gpio_keys_disable_button() - disables given GPIO button
  * @bdata: button data for button to be disabled
  *
@@ -213,6 +229,7 @@ static ssize_t gpio_keys_attr_store_helper(struct gpio_keys_drvdata *ddata,
   const char *buf, unsigned int type)
 {
int n_events = get_n_events_by_type(type);
+   const unsigned long *bitmap = get_bm_events_by_type(ddata->input, type);
unsigned long *bits;
ssize_t error;
int i;
@@ -226,6 +243,11 @@ static ssize_t gpio_keys_attr_store_helper(struct gpio_keys_drvdata *ddata,
goto out;
 
/* First validate */
+   if (!bitmap_subset(bits, bitmap, n_events)) {
+   error = -EINVAL;
+   goto out;
+   }
+
for (i = 0; i < ddata->pdata->nbuttons; i++) {
struct gpio_button_data *bdata = &ddata->data[i];
 
@@ -239,11 +261,6 @@ static ssize_t gpio_keys_attr_store_helper(struct gpio_keys_drvdata *ddata,
}
}
 
-   if (i == ddata->pdata->nbuttons) {
-   error = -EINVAL;
-   goto out;
-   }
-
mutex_lock(&ddata->disable_lock);
 
for (i = 0; i < ddata->pdata->nbuttons; i++) {

Re: [PATCH] mtd: tests: consolidate kmalloc/memset 0 call to kzalloc

2016-01-06 Thread Brian Norris
On Thu, Dec 31, 2015 at 04:21:22PM +0100, Nicholas Mc Guire wrote:
> This is an API consolidation only. The use of kmalloc + memset to 0
> is equivalent to kzalloc.
> 
> Signed-off-by: Nicholas Mc Guire 

Applied to l2-mtd.git


Re: [PATCH v3 5/5] perf evlist: Add --trace-fields option to show trace fields

2016-01-06 Thread Arnaldo Carvalho de Melo
Em Thu, Jan 07, 2016 at 08:21:44AM +0900, Namhyung Kim escreveu:
> On Wed, Jan 06, 2016 at 08:10:51PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Jan 06, 2016 at 09:55:01AM +0900, Namhyung Kim escreveu:
> > > To use dynamic sort keys, it might be good to add an option to see the
> > > list of field names.
> > > 
> > >   $ perf evlist -i perf.data.sched
> > >   sched:sched_switch
> > >   sched:sched_stat_wait
> > >   sched:sched_stat_sleep
> > >   sched:sched_stat_iowait
> > >   sched:sched_stat_runtime
> > >   sched:sched_process_fork
> > >   sched:sched_wakeup
> > >   sched:sched_wakeup_new
> > >   sched:sched_migrate_task
> > >   # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint 
> > > events
> > 
> > Ok, almost there, question is: if I ask explicitly for
> > "--trace-fields", why should we have the "trace_fields=" in all lines,
> > instead of just:
> > 
> >  
> >$ perf evlist -i perf.data.sched --trace-fields
> >sched:sched_switch: 
> > prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
> >sched:sched_stat_wait: comm,pid,delay
> > 
> > ?
> 
> I made it to work with other options too, like 'perf evlist --freq 
> --trace-fields'.
> In that case you may want to see it. :)

I see... And then you want to show those at the same time... Would you
have a use case for that? Or would it be better to make them mutually
exclusive? /me unsure...

- Arnaldo
 
> Thanks,
> Namhyung
> 
> 
> > 
> > I like the lack of spaces after commas, this way we can double click
> > and select the whole list, then edit it, etc.
> > 
> > - Arnaldo
> >  
> > >   $ perf evlist -i perf.data.sched --trace-fields
> > >   sched:sched_switch: 
> > > trace_fields=prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
> > >   sched:sched_stat_wait: trace_fields=comm,pid,delay
> > >   sched:sched_stat_sleep: trace_fields=comm,pid,delay
> > >   sched:sched_stat_iowait: trace_fields=comm,pid,delay
> > >   sched:sched_stat_runtime: trace_fields=comm,pid,runtime,vruntime
> > >   sched:sched_process_fork: 
> > > trace_fields=parent_comm,parent_pid,child_comm,child_pid
> > >   sched:sched_wakeup: trace_fields=comm,pid,prio,success,target_cpu
> > >   sched:sched_wakeup_new: trace_fields=comm,pid,prio,success,target_cpu
> > >   sched:sched_migrate_task: trace_fields=comm,pid,prio,orig_cpu,dest_cpu
> > > 
> > > Signed-off-by: Namhyung Kim 
> > > ---
> > >  tools/perf/Documentation/perf-evlist.txt |  3 +++
> > >  tools/perf/builtin-evlist.c  | 11 ++-
> > >  tools/perf/util/evsel.c  | 23 +++
> > >  tools/perf/util/evsel.h  |  1 +
> > >  4 files changed, 37 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/tools/perf/Documentation/perf-evlist.txt 
> > > b/tools/perf/Documentation/perf-evlist.txt
> > > index 1ceb3700ffbb..6f7200fb85cf 100644
> > > --- a/tools/perf/Documentation/perf-evlist.txt
> > > +++ b/tools/perf/Documentation/perf-evlist.txt
> > > @@ -32,6 +32,9 @@ OPTIONS
> > >  --group::
> > >   Show event group information.
> > >  
> > > +--trace-fields::
> > > + Show tracepoint field names.
> > > +
> > >  SEE ALSO
> > >  
> > >  linkperf:perf-record[1], linkperf:perf-list[1],
> > > diff --git a/tools/perf/builtin-evlist.c b/tools/perf/builtin-evlist.c
> > > index 08a7d36a2cf8..8a31f511e1a0 100644
> > > --- a/tools/perf/builtin-evlist.c
> > > +++ b/tools/perf/builtin-evlist.c
> > > @@ -26,14 +26,22 @@ static int __cmd_evlist(const char *file_name, struct 
> > > perf_attr_details *details
> > >   .mode = PERF_DATA_MODE_READ,
> > >   .force = details->force,
> > >   };
> > > + bool has_tracepoint = false;
> > >  
> > >   session = perf_session__new(&file, 0, NULL);
> > >   if (session == NULL)
> > >   return -1;
> > >  
> > > - evlist__for_each(session->evlist, pos)
> > > + evlist__for_each(session->evlist, pos) {
> > >   perf_evsel__fprintf(pos, details, stdout);
> > >  
> > > + if (pos->attr.type == PERF_TYPE_TRACEPOINT)
> > > + has_tracepoint = true;
> > > + }
> > > +
> > > + if (has_tracepoint && !details->trace_fields)
> > > + printf("# Tip: use 'perf evlist --trace-fields' to show fields 
> > > for tracepoint events\n");
> > > +
> > >   perf_session__delete(session);
> > >   return 0;
> > >  }
> > > @@ -49,6 +57,7 @@ int cmd_evlist(int argc, const char **argv, const char 
> > > *prefix __maybe_unused)
> > >   OPT_BOOLEAN('g', "group", &details.event_group,
> > >   "Show event group information"),
> > >   OPT_BOOLEAN('f', "force", &details.force, "don't complain, do it"),
> > > + OPT_BOOLEAN(0, "trace-fields", &details.trace_fields, "Show tracepoint fields"),
> > >   OPT_END()
> > >   };
> > >   const char * const evlist_usage[] = {
> > > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > > index 544e4400de13..b7822c98fcca 100644
> > > --- a/tools/perf/util/evsel.c
> > > +++ b/tools/perf/util/evsel.c
> > > @@ -2298,6 

[PATCH 31/31] x86, pkeys: execute-only support

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

Protection keys provide new page-based protection in hardware.
But, they have an interesting attribute: they only affect data
accesses and never affect instruction fetches.  That means that
if we set up some memory which is set as "access-disabled" via
protection keys, we can still execute from it.

This patch uses protection keys to set up mappings to do just that.
If a user calls:

mmap(..., PROT_EXEC);
or
mprotect(ptr, sz, PROT_EXEC);

(note PROT_EXEC-only without PROT_READ/WRITE), the kernel will
notice this, and set a special protection key on the memory.  It
also sets the appropriate bits in the Protection Keys User Rights
(PKRU) register so that the memory becomes unreadable and
unwritable.

I haven't found any userspace that does this today.  With this
facility in place, we expect userspace to move to use it
eventually.

The security provided by this approach is not comprehensive.  The
PKRU register which controls access permissions is a normal
user register writable from unprivileged userspace.  An attacker
who can execute the 'wrpkru' instruction can easily disable the
protection provided by this feature.

The protection key that is used for execute-only support is
permanently dedicated at compile time.  This is fine for now
because there is currently no API to set a protection key other
than this one.

Despite there being a constant PKRU value across the entire
system, we do not set it unless this feature is in use in a
process.  That is to preserve the PKRU XSAVE 'init state',
which can lead to faster context switches.

PKRU *is* a user register and the kernel is modifying it.  That
means that code doing:

pkru = rdpkru()
pkru |= 0x100;
mmap(..., PROT_EXEC);
wrpkru(pkru);

could lose the bits in PKRU that enforce execute-only
permissions.  To avoid this, we suggest avoiding ever calling
mmap() or mprotect() when the PKRU value is expected to be
stable.
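
The lost update described above can be reproduced with a small simulation.
This is only a sketch: the PKRU register is modelled as a plain variable, the
sim_* helpers are hypothetical, and it assumes the kernel sets only the
access-disable bit for the dedicated key (access-disable also blocks writes).

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_AD_BIT                 0x1u
#define PKRU_BITS_PER_PKEY          2
#define PKEY_DEDICATED_EXECUTE_ONLY 15

static_assert((PKEY_DEDICATED_EXECUTE_ONLY + 1) * PKRU_BITS_PER_PKEY <= 32,
              "dedicated key must fit in the 32-bit PKRU");

static uint32_t sim_pkru;   /* stand-in for the per-thread PKRU register */

static uint32_t sim_rdpkru(void) { return sim_pkru; }
static void sim_wrpkru(uint32_t val) { sim_pkru = val; }

/* Models the kernel side of mmap(..., PROT_EXEC): it edits PKRU itself. */
static void sim_mmap_exec_only(void)
{
    int shift = PKEY_DEDICATED_EXECUTE_ONLY * PKRU_BITS_PER_PKEY;

    sim_pkru |= PKRU_AD_BIT << shift;
}

static int execute_only_enforced(void)
{
    int shift = PKEY_DEDICATED_EXECUTE_ONLY * PKRU_BITS_PER_PKEY;

    return (sim_pkru & (PKRU_AD_BIT << shift)) != 0;
}

/* The racy order from the text: the stale value overwrites the kernel's bits. */
static int racy_sequence_keeps_xonly(void)
{
    uint32_t pkru;

    sim_pkru = 0;
    pkru = sim_rdpkru();
    pkru |= 0x100;
    sim_mmap_exec_only();
    sim_wrpkru(pkru);
    return execute_only_enforced();
}

/* Mapping first and doing the read-modify-write afterwards keeps them. */
static int ordered_sequence_keeps_xonly(void)
{
    uint32_t pkru;

    sim_pkru = 0;
    sim_mmap_exec_only();
    pkru = sim_rdpkru();
    pkru |= 0x100;
    sim_wrpkru(pkru);
    return execute_only_enforced();
}
```

In this model racy_sequence_keeps_xonly() returns 0 and
ordered_sequence_keeps_xonly() returns 1, which matches the advice to keep
mmap() and mprotect() calls out of any PKRU read-modify-write window.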

Signed-off-by: Dave Hansen 
Cc: LKML 
Cc: x...@kernel.org
Cc: torva...@linux-foundation.org
Cc: a...@linux-foundation.org
Cc: linux...@kvack.org
Cc: keesc...@google.com
Cc: l...@amacapital.net
---

 b/arch/x86/include/asm/pkeys.h |   25 ++
 b/arch/x86/kernel/fpu/xstate.c |2 
 b/arch/x86/mm/Makefile |2 
 b/arch/x86/mm/fault.c  |   13 +
 b/arch/x86/mm/pkeys.c  |  101 +
 b/include/linux/pkeys.h|3 +
 b/mm/mmap.c|   10 +++-
 b/mm/mprotect.c|8 +--
 8 files changed, 157 insertions(+), 7 deletions(-)

diff -puN arch/x86/include/asm/pkeys.h~pkeys-79-xonly 
arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~pkeys-79-xonly   2016-01-06 
15:50:16.796660318 -0800
+++ b/arch/x86/include/asm/pkeys.h  2016-01-06 15:50:16.809660904 -0800
@@ -6,4 +6,29 @@
 extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
unsigned long init_val);
 
+/*
+ * Try to dedicate one of the protection keys to be used as an
+ * execute-only protection key.
+ */
+#define PKEY_DEDICATED_EXECUTE_ONLY 15
+extern int __execute_only_pkey(struct mm_struct *mm);
+static inline int execute_only_pkey(struct mm_struct *mm)
+{
+   if (!boot_cpu_has(X86_FEATURE_OSPKE))
+   return 0;
+
+   return __execute_only_pkey(mm);
+}
+
+extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
+   int prot, int pkey);
+static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
+   int prot, int pkey)
+{
+   if (!boot_cpu_has(X86_FEATURE_OSPKE))
+   return 0;
+
+   return __arch_override_mprotect_pkey(vma, prot, pkey);
+}
+
 #endif /*_ASM_X86_PKEYS_H */
diff -puN arch/x86/kernel/fpu/xstate.c~pkeys-79-xonly 
arch/x86/kernel/fpu/xstate.c
--- a/arch/x86/kernel/fpu/xstate.c~pkeys-79-xonly   2016-01-06 
15:50:16.797660363 -0800
+++ b/arch/x86/kernel/fpu/xstate.c  2016-01-06 15:50:16.809660904 -0800
@@ -878,8 +878,6 @@ int arch_set_user_pkey_access(struct tas
int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
u32 new_pkru_bits = 0;
 
-   if (!validate_pkey(pkey))
-   return -EINVAL;
/*
 * This check implies XSAVE support.  OSPKE only gets
 * set if we enable XSAVE and we enable PKU in XCR0.
diff -puN arch/x86/mm/fault.c~pkeys-79-xonly arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-79-xonly2016-01-06 15:50:16.799660453 
-0800
+++ b/arch/x86/mm/fault.c   2016-01-06 15:50:16.810660949 -0800
@@ -14,6 +14,8 @@
 #include /* prefetchw*/
 #include /* exception_enter(), ...   */
 #include  /* faulthandler_disabled()  */
+#include/* PKEY_*   */
+#include 
 
 #include /* boot_cpu_has, ...*/
 #include

[PATCH 18/31] mm: add gup flag to indicate "foreign" mm access

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

We try to enforce protection keys in software the same way that we
do in hardware.  (See long example below).

But, we only want to do this when accessing our *own* process's
memory.  If GDB set PKRU[6].AD=1 (disable access to PKEY 6), then
tried to PTRACE_POKE a target process which just happened to have
some mprotect_pkey(pkey=6) memory, we do *not* want to deny the
debugger access to that memory.  PKRU is fundamentally a
thread-local structure and we do not want to enforce it on access
to _another_ thread's data.

This gets especially tricky when we have workqueues or other
delayed-work mechanisms that might run in a random process's context.
We can check that we only enforce pkeys when operating on our *own* mm,
but delayed work gets performed when a random user context is active.
We might end up with a situation where a delayed-work gup fails when
running randomly under its "own" task but succeeds when running under
another process.  We want to avoid that.

To avoid that, we add a GUP flag: FOLL_FOREIGN and a fault flag:
FAULT_FLAG_FOREIGN.  They indicate that we are walking an mm
which is not guaranteed to be the same as current->mm and should
not be subject to protection key enforcement.
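
As a rough user-space model of how such a check could use the new flag: the
function below is a hypothetical sketch, not the x86 implementation of
arch_vma_access_permitted(). It assumes two PKRU bits per key, with
access-disable in the low bit and write-disable in the high bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKRU_AD_BIT 0x1u
#define PKRU_WD_BIT 0x2u

/*
 * Hypothetical model: 'current_pkru' is the PKRU value of the running
 * thread, 'vma_pkey' is the protection key attached to the mapping.
 */
static bool model_vma_access_permitted(uint32_t current_pkru,
                                       unsigned int vma_pkey,
                                       bool write, bool foreign)
{
    unsigned int shift = vma_pkey * 2;

    assert(vma_pkey < 16);

    /* Another mm's memory: the current thread's PKRU must not apply. */
    if (foreign)
        return true;

    if (current_pkru & (PKRU_AD_BIT << shift))
        return false;
    if (write && (current_pkru & (PKRU_WD_BIT << shift)))
        return false;
    return true;
}
```

In the GDB scenario, the debugger's PKRU has AD set for key 6, so a local
access is refused, while the same access with foreign set to true is allowed.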

Thanks to Jerome Glisse for pointing out this scenario.

*** Why do we enforce protection keys in software?? ***

Imagine that we disabled access to the memory pointed to by 'buf'.
Then, we implemented sys_read() like this:

sys_read(fd, buf, len...)
{
struct page *page = follow_page(buf);
void *buf_mapped = kmap(page);
memcpy(buf_mapped, fd_data, len);
...
}

This writes to 'buf' via a *kernel* mapping, without a protection
key.  While this implementation does the same thing:

sys_read(fd, buf, len...)
{
copy_to_user(buf, fd_data, len);
...
}

but would hit a protection key fault because the userspace 'buf'
mapping has a protection key set.

To provide consistency, and to make key-protected memory work
as much like mprotect()ed memory as possible, we try to enforce
the same protections as the hardware would when the *kernel* walks
the page tables (and other mm structures).

Signed-off-by: Dave Hansen 
Cc: linux-a...@vger.kernel.org
---

 b/arch/powerpc/include/asm/mmu_context.h   |3 ++-
 b/arch/s390/include/asm/mmu_context.h  |3 ++-
 b/arch/unicore32/include/asm/mmu_context.h |3 ++-
 b/arch/x86/include/asm/mmu_context.h   |5 +++--
 b/drivers/iommu/amd_iommu_v2.c |8 +---
 b/include/asm-generic/mm_hooks.h   |3 ++-
 b/include/linux/mm.h   |2 ++
 b/mm/gup.c |   15 ++-
 b/mm/ksm.c |   10 --
 b/mm/memory.c  |3 ++-
 10 files changed, 38 insertions(+), 17 deletions(-)

diff -puN 
arch/powerpc/include/asm/mmu_context.h~pkeys-14-gup-fault-foreign-flag 
arch/powerpc/include/asm/mmu_context.h
--- a/arch/powerpc/include/asm/mmu_context.h~pkeys-14-gup-fault-foreign-flag
2016-01-06 15:50:10.622381958 -0800
+++ b/arch/powerpc/include/asm/mmu_context.h2016-01-06 15:50:10.640382770 
-0800
@@ -148,7 +148,8 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool 
write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+   bool write, bool foreign)
 {
/* by default, allow everything */
return true;
diff -puN arch/s390/include/asm/mmu_context.h~pkeys-14-gup-fault-foreign-flag 
arch/s390/include/asm/mmu_context.h
--- a/arch/s390/include/asm/mmu_context.h~pkeys-14-gup-fault-foreign-flag   
2016-01-06 15:50:10.624382049 -0800
+++ b/arch/s390/include/asm/mmu_context.h   2016-01-06 15:50:10.641382815 
-0800
@@ -130,7 +130,8 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool 
write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+   bool write, bool foreign)
 {
/* by default, allow everything */
return true;
diff -puN 
arch/unicore32/include/asm/mmu_context.h~pkeys-14-gup-fault-foreign-flag 
arch/unicore32/include/asm/mmu_context.h
--- a/arch/unicore32/include/asm/mmu_context.h~pkeys-14-gup-fault-foreign-flag  
2016-01-06 15:50:10.625382094 -0800
+++ b/arch/unicore32/include/asm/mmu_context.h  2016-01-06 15:50:10.641382815 
-0800
@@ -97,7 +97,8 @@ static inline void arch_bprm_mm_init(str
 {
 }
 
-static inline bool arch_vma_access_permitted(struct vm_area_struct *vma, bool 
write)
+static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
+   bool write, bool foreign)
 {
/* by default, allow everything */
return 

[PATCH 29/31] x86, pkeys: allow kernel to modify user pkey rights register

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

The Protection Key Rights for User memory (PKRU) is a 32-bit
user-accessible register.  It contains two bits for each
protection key: one to write-disable (WD) access to memory
covered by the key and another to access-disable (AD).

Userspace can read/write the register with the RDPKRU and WRPKRU
instructions.  But, the register is saved and restored with the
XSAVE family of instructions, which means we have to treat it
like a floating point register.

The kernel needs to write to the register if it wants to
implement execute-only memory or if it implements a system call
to change PKRU.

To do this, we need to create a 'pkru_state' buffer, read the old
contents in to it, modify it, and then tell the FPU code that
there is modified data in there so it can (possibly) move the
buffer back in to the registers.

This uses the fpu__xfeature_set_state() function that we defined
in the previous patch.
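
The bit manipulation involved can be sketched in user space as below. The
PKEY_DISABLE_* values are assumptions (the excerpt does not show them), and
pkru_with_pkey_rights() is a hypothetical helper that works on a plain
integer instead of the XSAVE buffer.

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_AD_BIT        0x1u
#define PKRU_WD_BIT        0x2u
#define PKRU_BITS_PER_PKEY 2

/* Assumed flag values for init_val; not taken from the patch excerpt. */
#define PKEY_DISABLE_ACCESS 0x1ul
#define PKEY_DISABLE_WRITE  0x2ul

/* Return 'old_pkru' with the two rights bits for 'pkey' replaced. */
static uint32_t pkru_with_pkey_rights(uint32_t old_pkru, unsigned int pkey,
                                      unsigned long init_val)
{
    unsigned int shift = pkey * PKRU_BITS_PER_PKEY;
    uint32_t new_bits = 0;

    assert(pkey < 16);

    if (init_val & PKEY_DISABLE_ACCESS)
        new_bits |= PKRU_AD_BIT;
    if (init_val & PKEY_DISABLE_WRITE)
        new_bits |= PKRU_WD_BIT;

    /* Clear the old rights for this key, then install the new ones. */
    old_pkru &= ~((PKRU_AD_BIT | PKRU_WD_BIT) << shift);
    return old_pkru | (new_bits << shift);
}
```

For example, disabling access for key 1 on an all-zero PKRU yields 0x4, and
clearing both bits for key 15 on an all-ones PKRU yields 0x3fffffff.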

Signed-off-by: Dave Hansen 
---

 b/arch/x86/include/asm/pgtable.h |5 +-
 b/arch/x86/include/asm/pkeys.h   |3 +
 b/arch/x86/kernel/fpu/xstate.c   |   74 +++
 b/include/linux/pkeys.h  |5 ++
 4 files changed, 85 insertions(+), 2 deletions(-)

diff -puN arch/x86/include/asm/pgtable.h~pkeys-77-arch_set_user_pkey_access 
arch/x86/include/asm/pgtable.h
--- a/arch/x86/include/asm/pgtable.h~pkeys-77-arch_set_user_pkey_access 
2016-01-06 15:50:15.900619921 -0800
+++ b/arch/x86/include/asm/pgtable.h2016-01-06 15:50:15.909620327 -0800
@@ -912,16 +912,17 @@ static inline pte_t pte_swp_clear_soft_d
 
 #define PKRU_AD_BIT 0x1
 #define PKRU_WD_BIT 0x2
+#define PKRU_BITS_PER_PKEY 2
 
 static inline bool __pkru_allows_read(u32 pkru, u16 pkey)
 {
-   int pkru_pkey_bits = pkey * 2;
+   int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
return !(pkru & (PKRU_AD_BIT << pkru_pkey_bits));
 }
 
 static inline bool __pkru_allows_write(u32 pkru, u16 pkey)
 {
-   int pkru_pkey_bits = pkey * 2;
+   int pkru_pkey_bits = pkey * PKRU_BITS_PER_PKEY;
/*
 * Access-disable disables writes too so we need to check
 * both bits here.
diff -puN arch/x86/include/asm/pkeys.h~pkeys-77-arch_set_user_pkey_access 
arch/x86/include/asm/pkeys.h
--- a/arch/x86/include/asm/pkeys.h~pkeys-77-arch_set_user_pkey_access   
2016-01-06 15:50:15.902620011 -0800
+++ b/arch/x86/include/asm/pkeys.h  2016-01-06 15:50:15.909620327 -0800
@@ -3,4 +3,7 @@
 
 #define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)
 
+extern int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+   unsigned long init_val);
+
 #endif /*_ASM_X86_PKEYS_H */
diff -puN arch/x86/kernel/fpu/xstate.c~pkeys-77-arch_set_user_pkey_access 
arch/x86/kernel/fpu/xstate.c
--- a/arch/x86/kernel/fpu/xstate.c~pkeys-77-arch_set_user_pkey_access   
2016-01-06 15:50:15.904620101 -0800
+++ b/arch/x86/kernel/fpu/xstate.c  2016-01-06 15:50:15.909620327 -0800
@@ -5,6 +5,7 @@
  */
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -856,3 +857,76 @@ out:
 */
fpu__current_fpstate_write_end();
 }
+
+#define NR_VALID_PKRU_BITS (CONFIG_NR_PROTECTION_KEYS * 2)
+#define PKRU_VALID_MASK (NR_VALID_PKRU_BITS - 1)
+
+/*
+ * This will go out and modify the XSAVE buffer so that PKRU is
+ * set to a particular state for access to 'pkey'.
+ *
+ * PKRU state does affect kernel access to user memory.  We do
+ * not modify PKRU *itself* here, only the XSAVE state that will
+ * be restored in to PKRU when we return back to userspace.
+ */
+int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+   unsigned long init_val)
+{
+   struct xregs_state *xsave = &tsk->thread.fpu.state.xsave;
+   struct pkru_state *old_pkru_state;
+   struct pkru_state new_pkru_state;
+   int pkey_shift = (pkey * PKRU_BITS_PER_PKEY);
+   u32 new_pkru_bits = 0;
+
+   if (!validate_pkey(pkey))
+   return -EINVAL;
+   /*
+* This check implies XSAVE support.  OSPKE only gets
+* set if we enable XSAVE and we enable PKU in XCR0.
+*/
+   if (!boot_cpu_has(X86_FEATURE_OSPKE))
+   return -EINVAL;
+
+   /* Set the bits we need in PKRU  */
+   if (init_val & PKEY_DISABLE_ACCESS)
+   new_pkru_bits |= PKRU_AD_BIT;
+   if (init_val & PKEY_DISABLE_WRITE)
+   new_pkru_bits |= PKRU_WD_BIT;
+
+   /* Shift the bits in to the correct place in PKRU for pkey. */
+   new_pkru_bits <<= pkey_shift;
+
+   /* Locate old copy of the state in the xsave buffer */
+   old_pkru_state = get_xsave_addr(xsave, XFEATURE_MASK_PKRU);
+
+   /*
+* When state is not in the buffer, it is in the init
+* state, set it manually.  Otherwise, copy out the old
+* state.
+*/
+   if (!old_pkru_state)
+   new_pkru_state.pkru = 0;
+   else
+   

[PATCH 27/31] x86: separate out LDT init from context init

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

The arch-specific mm_context_t is a great place to put
protection-key allocation state.

But, we need to initialize the allocation state because pkey 0 is
always "allocated".  All of the runtime initialization of
mm_context_t is done in *_ldt() manipulation functions.  This
renames the existing LDT functions like this:

init_new_context() -> init_new_context_ldt()
destroy_context() -> destroy_context_ldt()

and makes init_new_context() and destroy_context() available for
generic use.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/include/asm/mmu_context.h |   21 -
 b/arch/x86/kernel/ldt.c  |4 ++--
 2 files changed, 18 insertions(+), 7 deletions(-)

diff -puN arch/x86/include/asm/mmu_context.h~pkeys-72-init-ldt-extricate 
arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkeys-72-init-ldt-extricate
2016-01-06 15:50:15.004579524 -0800
+++ b/arch/x86/include/asm/mmu_context.h2016-01-06 15:50:15.008579705 
-0800
@@ -52,15 +52,15 @@ struct ldt_struct {
 /*
  * Used for LDT copy/destruction.
  */
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
-void destroy_context(struct mm_struct *mm);
+int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm);
+void destroy_context_ldt(struct mm_struct *mm);
 #else  /* CONFIG_MODIFY_LDT_SYSCALL */
-static inline int init_new_context(struct task_struct *tsk,
-  struct mm_struct *mm)
+static inline int init_new_context_ldt(struct task_struct *tsk,
+  struct mm_struct *mm)
 {
return 0;
 }
-static inline void destroy_context(struct mm_struct *mm) {}
+static inline void destroy_context_ldt(struct mm_struct *mm) {}
 #endif
 
 static inline void load_mm_ldt(struct mm_struct *mm)
@@ -104,6 +104,17 @@ static inline void enter_lazy_tlb(struct
 #endif
 }
 
+static inline int init_new_context(struct task_struct *tsk,
+  struct mm_struct *mm)
+{
+   init_new_context_ldt(tsk, mm);
+   return 0;
+}
+static inline void destroy_context(struct mm_struct *mm)
+{
+   destroy_context_ldt(mm);
+}
+
 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 struct task_struct *tsk)
 {
diff -puN arch/x86/kernel/ldt.c~pkeys-72-init-ldt-extricate 
arch/x86/kernel/ldt.c
--- a/arch/x86/kernel/ldt.c~pkeys-72-init-ldt-extricate 2016-01-06 
15:50:15.005579569 -0800
+++ b/arch/x86/kernel/ldt.c 2016-01-06 15:50:15.009579749 -0800
@@ -103,7 +103,7 @@ static void free_ldt_struct(struct ldt_s
  * we do not have to muck with descriptors here, that is
  * done in switch_mm() as needed.
  */
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int init_new_context_ldt(struct task_struct *tsk, struct mm_struct *mm)
 {
struct ldt_struct *new_ldt;
struct mm_struct *old_mm;
@@ -144,7 +144,7 @@ out_unlock:
  *
  * 64bit: Don't touch the LDT register - we're already in the next thread.
  */
-void destroy_context(struct mm_struct *mm)
+void destroy_context_ldt(struct mm_struct *mm)
 {
free_ldt_struct(mm->context.ldt);
mm->context.ldt = NULL;
_
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
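
One observation on the new init_new_context() wrapper in this patch: as posted, it calls init_new_context_ldt() and then returns 0 regardless of the result. Whether that is intentional is not stated in the changelog. If the LDT setup can fail, a wrapper that passes the result through might look like the sketch below. Everything here is a stand-in (struct mm_sketch, the *_sketch functions and the simulate_oom flag are hypothetical), so treat it as an illustration of the pattern only.

```c
#include <errno.h>

/* Stand-in for struct mm_struct; only tracks whether LDT setup ran. */
struct mm_sketch {
	int ldt_ready;
};

/* Stand-in for init_new_context_ldt(): fails on request. */
static int init_new_context_ldt_sketch(struct mm_sketch *mm, int simulate_oom)
{
	if (simulate_oom)
		return -ENOMEM;
	mm->ldt_ready = 1;
	return 0;
}

/* Generic wrapper: further per-mm setup could be added around the call. */
static int init_new_context_sketch(struct mm_sketch *mm, int simulate_oom)
{
	/* Pass the LDT result through instead of always returning 0. */
	return init_new_context_ldt_sketch(mm, simulate_oom);
}
```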


[PATCH 25/31] mm, multi-arch: pass a protection key in to calc_vm_flag_bits()

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

This plumbs a protection key through calc_vm_prot_bits().  We
could have done this in calc_vm_flag_bits() instead, but I did
not feel strongly about which way to go.  The choice of which
one to use was fairly arbitrary.

Signed-off-by: Dave Hansen 
Cc: linux-...@vger.kernel.org
Cc: linux-a...@vger.kernel.org
---

 b/arch/powerpc/include/asm/mman.h  |5 +++--
 b/drivers/char/agp/frontend.c  |2 +-
 b/drivers/staging/android/ashmem.c |4 ++--
 b/include/linux/mman.h |6 +++---
 b/mm/mmap.c|2 +-
 b/mm/mprotect.c|2 +-
 b/mm/nommu.c   |2 +-
 7 files changed, 12 insertions(+), 11 deletions(-)

diff -puN arch/powerpc/include/asm/mman.h~pkeys-70-calc_vm_prot_bits 
arch/powerpc/include/asm/mman.h
--- a/arch/powerpc/include/asm/mman.h~pkeys-70-calc_vm_prot_bits
2016-01-06 15:50:13.971532951 -0800
+++ b/arch/powerpc/include/asm/mman.h   2016-01-06 15:50:13.984533537 -0800
@@ -18,11 +18,12 @@
  * This file is included by linux/mman.h, so we can't use cacl_vm_prot_bits()
  * here.  How important is the optimization?
  */
-static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot)
+static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
+   unsigned long pkey)
 {
return (prot & PROT_SAO) ? VM_SAO : 0;
 }
-#define arch_calc_vm_prot_bits(prot) arch_calc_vm_prot_bits(prot)
+#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 
 static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 {
diff -puN drivers/char/agp/frontend.c~pkeys-70-calc_vm_prot_bits 
drivers/char/agp/frontend.c
--- a/drivers/char/agp/frontend.c~pkeys-70-calc_vm_prot_bits2016-01-06 
15:50:13.972532995 -0800
+++ b/drivers/char/agp/frontend.c   2016-01-06 15:50:13.984533537 -0800
@@ -156,7 +156,7 @@ static pgprot_t agp_convert_mmap_flags(i
 {
unsigned long prot_bits;
 
-   prot_bits = calc_vm_prot_bits(prot) | VM_SHARED;
+   prot_bits = calc_vm_prot_bits(prot, 0) | VM_SHARED;
return vm_get_page_prot(prot_bits);
 }
 
diff -puN drivers/staging/android/ashmem.c~pkeys-70-calc_vm_prot_bits 
drivers/staging/android/ashmem.c
--- a/drivers/staging/android/ashmem.c~pkeys-70-calc_vm_prot_bits   
2016-01-06 15:50:13.974533086 -0800
+++ b/drivers/staging/android/ashmem.c  2016-01-06 15:50:13.985533582 -0800
@@ -372,8 +372,8 @@ static int ashmem_mmap(struct file *file
}
 
/* requested protection bits must match our allowed protection mask */
-   if (unlikely((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask)) &
-calc_vm_prot_bits(PROT_MASK))) {
+   if (unlikely((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask, 0)) &
+calc_vm_prot_bits(PROT_MASK, 0))) {
ret = -EPERM;
goto out;
}
diff -puN include/linux/mman.h~pkeys-70-calc_vm_prot_bits include/linux/mman.h
--- a/include/linux/mman.h~pkeys-70-calc_vm_prot_bits   2016-01-06 
15:50:13.976533176 -0800
+++ b/include/linux/mman.h  2016-01-06 15:50:13.985533582 -0800
@@ -35,7 +35,7 @@ static inline void vm_unacct_memory(long
  */
 
 #ifndef arch_calc_vm_prot_bits
-#define arch_calc_vm_prot_bits(prot) 0
+#define arch_calc_vm_prot_bits(prot, pkey) 0
 #endif
 
 #ifndef arch_vm_get_page_prot
@@ -70,12 +70,12 @@ static inline int arch_validate_prot(uns
  * Combine the mmap "prot" argument into "vm_flags" used internally.
  */
 static inline unsigned long
-calc_vm_prot_bits(unsigned long prot)
+calc_vm_prot_bits(unsigned long prot, unsigned long pkey)
 {
return _calc_vm_trans(prot, PROT_READ,  VM_READ ) |
   _calc_vm_trans(prot, PROT_WRITE, VM_WRITE) |
   _calc_vm_trans(prot, PROT_EXEC,  VM_EXEC) |
-  arch_calc_vm_prot_bits(prot);
+  arch_calc_vm_prot_bits(prot, pkey);
 }
 
 /*
diff -puN mm/mmap.c~pkeys-70-calc_vm_prot_bits mm/mmap.c
--- a/mm/mmap.c~pkeys-70-calc_vm_prot_bits  2016-01-06 15:50:13.977533221 
-0800
+++ b/mm/mmap.c 2016-01-06 15:50:13.986533627 -0800
@@ -1309,7 +1309,7 @@ unsigned long do_mmap(struct file *file,
 * to. we assume access permissions have been handled by the open
 * of the memory object, so we don't do any here.
 */
-   vm_flags |= calc_vm_prot_bits(prot) | calc_vm_flag_bits(flags) |
+   vm_flags |= calc_vm_prot_bits(prot, 0) | calc_vm_flag_bits(flags) |
mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
 
if (flags & MAP_LOCKED)
diff -puN mm/mprotect.c~pkeys-70-calc_vm_prot_bits mm/mprotect.c
--- a/mm/mprotect.c~pkeys-70-calc_vm_prot_bits  2016-01-06 15:50:13.979533311 
-0800
+++ b/mm/mprotect.c 2016-01-06 15:50:13.986533627 -0800
@@ -373,7 +373,7 @@ SYSCALL_DEFINE3(mprotect, unsigned long,
if ((prot & PROT_READ) && (current->personality & READ_IMPLIES_EXEC))
   

[PATCH 26/31] x86, pkeys: add arch_validate_pkey()

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

The syscall-level code is passed a protection key and needs to
return an appropriate error code if the protection key is bogus.
We will be using this in subsequent patches.

Note that this also begins a series of arch-specific calls that
we need to expose in otherwise arch-independent code.  We create
a linux/pkeys.h header where we will put *all* the stubs for
these functions.

Signed-off-by: Dave Hansen 
---

 b/arch/x86/Kconfig |1 +
 b/arch/x86/include/asm/pkeys.h |6 ++
 b/include/linux/pkeys.h|   25 +
 b/mm/Kconfig   |2 ++
 4 files changed, 34 insertions(+)

diff -puN /dev/null arch/x86/include/asm/pkeys.h
--- /dev/null   2015-12-10 15:28:13.322405854 -0800
+++ b/arch/x86/include/asm/pkeys.h  2016-01-06 15:50:14.531558199 -0800
@@ -0,0 +1,6 @@
+#ifndef _ASM_X86_PKEYS_H
+#define _ASM_X86_PKEYS_H
+
+#define arch_max_pkey() (boot_cpu_has(X86_FEATURE_OSPKE) ? 16 : 1)
+
+#endif /*_ASM_X86_PKEYS_H */
diff -puN arch/x86/Kconfig~pkeys-71-arch_validate_pkey arch/x86/Kconfig
--- a/arch/x86/Kconfig~pkeys-71-arch_validate_pkey  2016-01-06 
15:50:14.526557973 -0800
+++ b/arch/x86/Kconfig  2016-01-06 15:50:14.532558243 -0800
@@ -153,6 +153,7 @@ config X86
select X86_DEV_DMA_OPS  if X86_64
select X86_FEATURE_NAMES        if PROC_FS
select ARCH_USES_HIGH_VMA_FLAGS if X86_INTEL_MEMORY_PROTECTION_KEYS
+   select ARCH_HAS_PKEYS           if X86_INTEL_MEMORY_PROTECTION_KEYS
 
 config INSTRUCTION_DECODER
def_bool y
diff -puN /dev/null include/linux/pkeys.h
--- /dev/null   2015-12-10 15:28:13.322405854 -0800
+++ b/include/linux/pkeys.h 2016-01-06 15:50:14.532558243 -0800
@@ -0,0 +1,25 @@
+#ifndef _LINUX_PKEYS_H
+#define _LINUX_PKEYS_H
+
+#include 
+#include 
+
+#ifdef CONFIG_ARCH_HAS_PKEYS
+#include 
+#else /* ! CONFIG_ARCH_HAS_PKEYS */
+#define arch_max_pkey() (1)
+#endif /* ! CONFIG_ARCH_HAS_PKEYS */
+
+/*
+ * This is called from mprotect_pkey().
+ *
+ * Returns true if the protection key is valid.
+ */
+static inline bool validate_pkey(int pkey)
+{
+   if (pkey < 0)
+   return false;
+   return (pkey < arch_max_pkey());
+}
+
+#endif /* _LINUX_PKEYS_H */
diff -puN mm/Kconfig~pkeys-71-arch_validate_pkey mm/Kconfig
--- a/mm/Kconfig~pkeys-71-arch_validate_pkey2016-01-06 15:50:14.528558063 
-0800
+++ b/mm/Kconfig2016-01-06 15:50:14.532558243 -0800
@@ -671,3 +671,5 @@ config FRAME_VECTOR
 
 config ARCH_USES_HIGH_VMA_FLAGS
bool
+config ARCH_HAS_PKEYS
+   bool
_
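
For readers who want to see the range check in isolation, here is a minimal standalone sketch of the validate_pkey() logic from this patch. SKETCH_MAX_PKEY is a hypothetical constant standing in for arch_max_pkey(), fixed at 16, the value the x86 macro above yields when OSPKE is present.

```c
#include <stdbool.h>

/* Assumption: 16 keys, as arch_max_pkey() reports on x86 with OSPKE. */
#define SKETCH_MAX_PKEY 16

/* Valid keys are 0 .. SKETCH_MAX_PKEY - 1; negative values are rejected. */
static bool validate_pkey_sketch(int pkey)
{
	if (pkey < 0)
		return false;
	return pkey < SKETCH_MAX_PKEY;
}
```

On a configuration without protection keys the limit would be 1, which leaves key 0 as the only valid value.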


[PATCH v4 5/5] perf evlist: Add --trace-fields option to show trace fields

2016-01-06 Thread Namhyung Kim
To use dynamic sort keys, it might be good to add an option to see the
list of field names.

  $ perf evlist -i perf.data.sched
  sched:sched_switch
  sched:sched_stat_wait
  sched:sched_stat_sleep
  sched:sched_stat_iowait
  sched:sched_stat_runtime
  sched:sched_process_fork
  sched:sched_wakeup
  sched:sched_wakeup_new
  sched:sched_migrate_task
  # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events

  $ perf evlist -i perf.data.sched --trace-fields
  sched:sched_switch: 
trace_fields=prev_comm,prev_pid,prev_prio,prev_state,next_comm,next_pid,next_prio
  sched:sched_stat_wait: trace_fields=comm,pid,delay
  sched:sched_stat_sleep: trace_fields=comm,pid,delay
  sched:sched_stat_iowait: trace_fields=comm,pid,delay
  sched:sched_stat_runtime: trace_fields=comm,pid,runtime,vruntime
  sched:sched_process_fork: 
trace_fields=parent_comm,parent_pid,child_comm,child_pid
  sched:sched_wakeup: trace_fields=comm,pid,prio,success,target_cpu
  sched:sched_wakeup_new: trace_fields=comm,pid,prio,success,target_cpu
  sched:sched_migrate_task: trace_fields=comm,pid,prio,orig_cpu,dest_cpu

Signed-off-by: Namhyung Kim 
---
 tools/perf/Documentation/perf-evlist.txt |  3 +++
 tools/perf/builtin-evlist.c  | 11 ++-
 tools/perf/util/evsel.c  | 23 +++
 tools/perf/util/evsel.h  |  1 +
 4 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-evlist.txt 
b/tools/perf/Documentation/perf-evlist.txt
index 1ceb3700ffbb..6f7200fb85cf 100644
--- a/tools/perf/Documentation/perf-evlist.txt
+++ b/tools/perf/Documentation/perf-evlist.txt
@@ -32,6 +32,9 @@ OPTIONS
 --group::
Show event group information.
 
+--trace-fields::
+   Show tracepoint field names.
+
 SEE ALSO
 
 linkperf:perf-record[1], linkperf:perf-list[1],
diff --git a/tools/perf/builtin-evlist.c b/tools/perf/builtin-evlist.c
index 08a7d36a2cf8..8a31f511e1a0 100644
--- a/tools/perf/builtin-evlist.c
+++ b/tools/perf/builtin-evlist.c
@@ -26,14 +26,22 @@ static int __cmd_evlist(const char *file_name, struct 
perf_attr_details *details
.mode = PERF_DATA_MODE_READ,
.force = details->force,
};
+   bool has_tracepoint = false;
 
session = perf_session__new(&file, 0, NULL);
if (session == NULL)
return -1;
 
-   evlist__for_each(session->evlist, pos)
+   evlist__for_each(session->evlist, pos) {
perf_evsel__fprintf(pos, details, stdout);
 
+   if (pos->attr.type == PERF_TYPE_TRACEPOINT)
+   has_tracepoint = true;
+   }
+
+   if (has_tracepoint && !details->trace_fields)
+   printf("# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events\n");
+
perf_session__delete(session);
return 0;
 }
@@ -49,6 +57,7 @@ int cmd_evlist(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('g', "group", &details.event_group,
"Show event group information"),
OPT_BOOLEAN('f', "force", &details.force, "don't complain, do it"),
+   OPT_BOOLEAN(0, "trace-fields", &details.trace_fields, "Show tracepoint fields"),
OPT_END()
};
const char * const evlist_usage[] = {
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 544e4400de13..b7822c98fcca 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2298,6 +2298,29 @@ int perf_evsel__fprintf(struct perf_evsel *evsel,
printed += comma_fprintf(fp, &first, " %s=%" PRIu64,
 term, (u64)evsel->attr.sample_freq);
}
+
+   if (details->trace_fields) {
+   struct format_field *field;
+
+   if (evsel->attr.type != PERF_TYPE_TRACEPOINT) {
+   printed += comma_fprintf(fp, &first, " (not a tracepoint)");
+   goto out;
+   }
+
+   field = evsel->tp_format->format.fields;
+   if (field == NULL) {
+   printed += comma_fprintf(fp, &first, " (no trace field)");
+   goto out;
+   }
+
+   printed += comma_fprintf(fp, &first, " trace_fields=%s", field->name);
+
+   field = field->next;
+   while (field) {
+   printed += comma_fprintf(fp, &first, "%s", field->name);
+   field = field->next;
+   }
+   }
 out:
fputc('\n', fp);
return ++printed;
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 5ded1fc0341e..8e75434bd01c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -369,6 +369,7 @@ struct perf_attr_details {
bool verbose;
bool event_group;
bool force;
+   bool trace_fields;
 };
 
 int perf_evsel__fprintf(struct perf_evsel *evsel,
-- 
2.6.4
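
The field walk added to perf_evsel__fprintf() above is a plain traversal of a singly linked list, emitting the first name after "trace_fields=" and each later name after a comma. A minimal standalone sketch of that idea follows. struct field_sketch and format_trace_fields() are hypothetical names, and the sketch writes into a caller-supplied buffer rather than using perf's comma_fprintf().

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for the name/next part of struct format_field. */
struct field_sketch {
	const char *name;
	struct field_sketch *next;
};

/* Write "trace_fields=a,b,c" into buf; returns the resulting string length. */
static size_t format_trace_fields(const struct field_sketch *field,
				  char *buf, size_t len)
{
	const char *sep = "trace_fields=";
	size_t used = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	if (field == NULL) {
		snprintf(buf, len, "(no trace field)");
		return strlen(buf);
	}

	for (; field != NULL && used < len - 1; field = field->next) {
		int n = snprintf(buf + used, len - used, "%s%s", sep, field->name);

		if (n < 0)
			break;
		used += (size_t)n;
		if (used > len - 1)
			used = len - 1;	/* output was truncated */
		sep = ",";
	}
	return used;
}
```

For a list holding comm, pid and delay, the buffer ends up matching the sched_stat_wait line in the example output from the changelog.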


[PATCH v4 3/5] perf tools: Fix dynamic sort keys to sort properly

2016-01-06 Thread Namhyung Kim
Currently, the dynamic sort keys compare trace data using memcmp().
But for output sorting, they should check the data size and compare by
word.  Strings were also sorted in reverse order; fix that too.

Before)

  $ perf report -F overhead -s prev_pid,next_pid
  ...
  # Overhead  prev_pid  next_pid
  # ........  ........  ........
  #
       0.39%       490         0
       9.12%       225         0
       0.04%       224         0
       0.51%       731       189
       0.08%       731         3
       0.12%       731         0
       4.82%       729         0
       0.08%      1229         0
       0.20%       715         0
       4.78%       189       225
  ...

After)

  $ perf report -F overhead -s prev_pid,next_pid
  ...
  # Overhead  prev_pid  next_pid
  # ........  ........  ........
  #
       0.43%         0         7
       0.04%         0        11
       0.04%         0        12
       0.08%         0        14
       0.04%         0        17
       0.08%         0        19
       0.04%         0        22
       0.04%         0        27
       0.04%         0        37
       0.04%         0        42
  ...

Reported-by: Arnaldo Carvalho de Melo 
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/sort.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index e8a5cdee3f0d..4d05b13aeac8 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1804,6 +1804,9 @@ static int64_t __sort__hde_sort(struct perf_hpp_fmt *fmt,
struct hpp_dynamic_entry *hde;
struct format_field *field;
unsigned offset, size;
+   int64_t *a64, *b64;
+   int32_t *a32, *b32;
+   int16_t *a16, *b16;
 
hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
 
@@ -1819,7 +1822,25 @@ static int64_t __sort__hde_sort(struct perf_hpp_fmt *fmt,
size = field->size;
}
 
-   return memcmp(b->raw_data + offset, a->raw_data + offset, size);
+   if (field->flags & FIELD_IS_STRING)
+   return strcmp(b->raw_data + offset, a->raw_data + offset);
+
+   switch (size) {
+   case 8:
+   a64 = a->raw_data + offset;
+   b64 = b->raw_data + offset;
+   return *b64 - *a64;
+   case 4:
+   a32 = a->raw_data + offset;
+   b32 = b->raw_data + offset;
+   return *b32 - *a32;
+   case 2:
+   a16 = a->raw_data + offset;
+   b16 = b->raw_data + offset;
+   return *b16 - *a16;
+   default:
+   return memcmp(b->raw_data + offset, a->raw_data + offset, size);
+   }
 }
 
 bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *fmt)
-- 
2.6.4
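
A side note on the word-sized comparisons in this patch: returning the difference of two values can overflow for operands far apart, and dereferencing raw_data + offset through a typed pointer assumes the field is suitably aligned. The sketch below shows one possible alternative, not what the patch does. cmp_field64() is a hypothetical helper that copies the operands out with memcpy() and returns -1, 0 or 1, keeping the same "b relative to a" direction as the patch.

```c
#include <stdint.h>
#include <string.h>

/*
 * Compare two 8-byte fields stored at arbitrary (possibly unaligned)
 * addresses.  Returns 1 when b > a, -1 when b < a, 0 when equal.
 */
static int64_t cmp_field64(const void *a, const void *b)
{
	int64_t va, vb;

	/* memcpy() avoids unaligned typed loads. */
	memcpy(&va, a, sizeof(va));
	memcpy(&vb, b, sizeof(vb));

	/* Comparison-based result cannot overflow, unlike vb - va. */
	return (vb > va) - (vb < va);
}
```

The same shape works for the 4-byte and 2-byte cases by changing the local types.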



[PATCH v4 1/5] perf tools: Fix sorting of dynamic sort keys

2016-01-06 Thread Namhyung Kim
Currently entries are sorted in reverse (alphabetical) order; fix it.

Signed-off-by: Namhyung Kim 
---
This patch can be folded into the original patch c7c2a5e40f17
("perf tools: Add dynamic sort key for tracepoint events")

 tools/perf/util/sort.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 04e2a5cb19e3..425097d2a1cd 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1795,7 +1795,7 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt *fmt,
update_dynamic_len(hde, b);
}
 
-   return memcmp(a->raw_data + offset, b->raw_data + offset, size);
+   return memcmp(b->raw_data + offset, a->raw_data + offset, size);
 }
 
 bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *fmt)
-- 
2.6.4



[PATCH v4 2/5] perf tools: Separate hpp->sort callback for dynamic sort keys

2016-01-06 Thread Namhyung Kim
The ->sort callback is used for final output sorting.  As it's called
after processing all hist entries, it doesn't need to update dynamic
length anymore.  Also it needs additional handling to sort them
properly (which is the topic of next patch).

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/sort.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 425097d2a1cd..e8a5cdee3f0d 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1798,6 +1798,30 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt *fmt,
return memcmp(b->raw_data + offset, a->raw_data + offset, size);
 }
 
+static int64_t __sort__hde_sort(struct perf_hpp_fmt *fmt,
+   struct hist_entry *a, struct hist_entry *b)
+{
+   struct hpp_dynamic_entry *hde;
+   struct format_field *field;
+   unsigned offset, size;
+
+   hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
+
+   field = hde->field;
+   if (field->flags & FIELD_IS_DYNAMIC) {
+   unsigned long long dyn;
+
+   pevent_read_number_field(field, a->raw_data, &dyn);
+   offset = dyn & 0xffff;
+   size = (dyn >> 16) & 0xffff;
+   } else {
+   offset = field->offset;
+   size = field->size;
+   }
+
+   return memcmp(b->raw_data + offset, a->raw_data + offset, size);
+}
+
 bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *fmt)
 {
return fmt->cmp == __sort__hde_cmp;
@@ -1826,7 +1850,7 @@ __alloc_dynamic_entry(struct perf_evsel *evsel, struct 
format_field *field)
 
hde->hpp.cmp = __sort__hde_cmp;
hde->hpp.collapse = __sort__hde_cmp;
-   hde->hpp.sort = __sort__hde_cmp;
+   hde->hpp.sort = __sort__hde_sort;
 
INIT_LIST_HEAD(&hde->hpp.list);
INIT_LIST_HEAD(&hde->hpp.sort_list);
-- 
2.6.4
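
For dynamic tracepoint fields, the value read from the record is a 32-bit descriptor rather than the data itself: the low 16 bits give the offset of the data within the record and the high 16 bits give its size. A minimal sketch of that decoding, using the hypothetical names struct dyn_loc and decode_data_loc():

```c
#include <stdint.h>

/* Decoded location of a dynamic field inside the raw record. */
struct dyn_loc {
	unsigned int offset;
	unsigned int size;
};

/* Split a 32-bit descriptor into offset (low 16 bits) and size (high 16 bits). */
static struct dyn_loc decode_data_loc(uint32_t dyn)
{
	struct dyn_loc loc;

	loc.offset = dyn & 0xffff;
	loc.size = (dyn >> 16) & 0xffff;
	return loc;
}
```

For example, a descriptor of 0x00080020 describes 8 bytes of data starting at offset 0x20.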



[PATCH v4 4/5] perf tools: Support dynamic sort keys for -F/--fields

2016-01-06 Thread Namhyung Kim
Now that dynamic sort keys are supported for tracepoint events, add
them to the output fields too.

Signed-off-by: Namhyung Kim 
---
 tools/perf/util/sort.c | 51 --
 1 file changed, 33 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 4d05b13aeac8..c09b34f545c6 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1953,7 +1953,7 @@ static struct perf_evsel *find_evsel(struct perf_evlist 
*evlist, char *event_nam
 
 static int __dynamic_dimension__add(struct perf_evsel *evsel,
struct format_field *field,
-   bool raw_trace)
+   bool raw_trace, bool is_sort_key)
 {
struct hpp_dynamic_entry *hde;
 
@@ -1963,18 +1963,24 @@ static int __dynamic_dimension__add(struct perf_evsel 
*evsel,
 
hde->raw_trace = raw_trace;
 
-   perf_hpp__register_sort_field(&hde->hpp);
+   if (is_sort_key)
+   perf_hpp__register_sort_field(&hde->hpp);
+   else
+   perf_hpp__column_register(&hde->hpp);
+
return 0;
 }
 
-static int add_evsel_fields(struct perf_evsel *evsel, bool raw_trace)
+static int add_evsel_fields(struct perf_evsel *evsel, bool raw_trace,
+   bool is_sort_key)
 {
int ret;
struct format_field *field;
 
field = evsel->tp_format->format.fields;
while (field) {
-   ret = __dynamic_dimension__add(evsel, field, raw_trace);
+   ret = __dynamic_dimension__add(evsel, field, raw_trace,
+  is_sort_key);
if (ret < 0)
return ret;
 
@@ -1983,7 +1989,8 @@ static int add_evsel_fields(struct perf_evsel *evsel, 
bool raw_trace)
return 0;
 }
 
-static int add_all_dynamic_fields(struct perf_evlist *evlist, bool raw_trace)
+static int add_all_dynamic_fields(struct perf_evlist *evlist, bool raw_trace,
+ bool is_sort_key)
 {
int ret;
struct perf_evsel *evsel;
@@ -1992,7 +1999,7 @@ static int add_all_dynamic_fields(struct perf_evlist 
*evlist, bool raw_trace)
if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
continue;
 
-   ret = add_evsel_fields(evsel, raw_trace);
+   ret = add_evsel_fields(evsel, raw_trace, is_sort_key);
if (ret < 0)
return ret;
}
@@ -2000,7 +2007,8 @@ static int add_all_dynamic_fields(struct perf_evlist 
*evlist, bool raw_trace)
 }
 
 static int add_all_matching_fields(struct perf_evlist *evlist,
-  char *field_name, bool raw_trace)
+  char *field_name, bool raw_trace,
+  bool is_sort_key)
 {
int ret = -ESRCH;
struct perf_evsel *evsel;
@@ -2014,14 +2022,16 @@ static int add_all_matching_fields(struct perf_evlist 
*evlist,
if (field == NULL)
continue;
 
-   ret = __dynamic_dimension__add(evsel, field, raw_trace);
+   ret = __dynamic_dimension__add(evsel, field, raw_trace,
+  is_sort_key);
if (ret < 0)
break;
}
return ret;
 }
 
-static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
+static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok,
+bool is_sort_key)
 {
char *str, *event_name, *field_name, *opt_name;
struct perf_evsel *evsel;
@@ -2051,12 +2061,13 @@ static int add_dynamic_entry(struct perf_evlist 
*evlist, const char *tok)
}
 
if (!strcmp(field_name, "trace_fields")) {
-   ret = add_all_dynamic_fields(evlist, raw_trace);
+   ret = add_all_dynamic_fields(evlist, raw_trace, is_sort_key);
goto out;
}
 
if (event_name == NULL) {
-   ret = add_all_matching_fields(evlist, field_name, raw_trace);
+   ret = add_all_matching_fields(evlist, field_name, raw_trace,
+ is_sort_key);
goto out;
}
 
@@ -2074,7 +2085,7 @@ static int add_dynamic_entry(struct perf_evlist *evlist, 
const char *tok)
}
 
if (!strcmp(field_name, "*")) {
-   ret = add_evsel_fields(evsel, raw_trace);
+   ret = add_evsel_fields(evsel, raw_trace, is_sort_key);
} else {
field = pevent_find_any_field(evsel->tp_format, field_name);
if (field == NULL) {
@@ -2083,7 +2094,8 @@ static int add_dynamic_entry(struct perf_evlist *evlist, 
const char *tok)
return -ENOENT;
}
 
-   ret = __dynamic_dimension__add(evsel, field, raw_trace);
+  

[PATCH -next] gpio: xilinx: Do not use gpiochip_get_data() in xgpio_save_regs()

2016-01-06 Thread Guenter Roeck
Commit 097d88e94c44 ("gpio: xilinx: use gpiochip data pointer") replaces
the use of container_of() with gpiochip_get_data(). Unfortunately, the
data pointer is not yet set by the time xgpio_save_regs() is called,
causing a system hang.

Fixes: 097d88e94c44 ("gpio: xilinx: use gpiochip data pointer")
Signed-off-by: Guenter Roeck 
---
It might make sense to merge this patch with the patch introducing the problem.

 drivers/gpio/gpio-xilinx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 3345ab0ba1b3..d0fbb7f99523 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -207,7 +207,8 @@ static int xgpio_dir_out(struct gpio_chip *gc, unsigned int 
gpio, int val)
  */
 static void xgpio_save_regs(struct of_mm_gpio_chip *mm_gc)
 {
-   struct xgpio_instance *chip = gpiochip_get_data(&mm_gc->gc);
+   struct xgpio_instance *chip =
+   container_of(mm_gc, struct xgpio_instance, mmchip);
 
xgpio_writereg(mm_gc->regs + XGPIO_DATA_OFFSET, chip->gpio_state[0]);
xgpio_writereg(mm_gc->regs + XGPIO_TRI_OFFSET, chip->gpio_dir[0]);
-- 
2.1.4
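
The reason container_of() is safe here where gpiochip_get_data() is not: it needs no stored pointer at all, only the fact that the embedded structure sits at a fixed offset inside its container. A minimal userspace sketch of the idea follows. container_of_sketch, struct mm_chip_sketch, struct xgpio_sketch and to_xgpio() are hypothetical stand-ins, not the driver's definitions.

```c
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct of_mm_gpio_chip. */
struct mm_chip_sketch {
	int regs;
};

/* Stand-in for struct xgpio_instance, which embeds the chip. */
struct xgpio_sketch {
	int gpio_state;
	struct mm_chip_sketch mmchip;
};

/* Works as soon as the container exists; no registration step required. */
static struct xgpio_sketch *to_xgpio(struct mm_chip_sketch *mm_gc)
{
	return container_of_sketch(mm_gc, struct xgpio_sketch, mmchip);
}
```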



Re: [PATCH v2 10/14] serial: pic32_uart: Add PIC32 UART driver

2016-01-06 Thread Paul.Thacker
On 01/05/2016 03:50 PM, One Thousand Gnomes wrote:
>
>> +#define PIC32_SDEV_NAME "ttyS"
>> +#define PIC32_SDEV_MAJOR    TTY_MAJOR
>> +#define PIC32_SDEV_MINOR    64
>
> No. Same goes for you as every one of the forty other people a year who
> try and claim their console is ttyS. If it's not an 8250 it isn't.
>
> ttyS is the 8250, use dynamic major and minor and a different name.

Ok. Is there a naming convention documented anywhere? How about ttyPIC?

>
>
>> +/* serial core request to change current uart setting */
>> +static void pic32_uart_set_termios(struct uart_port *port,
>> +   struct ktermios *new,
>> +   struct ktermios *old)
>> +{
>
> You need to clear any termios features requested but not supported. In
> your case that appears to be CMSPAR, as you don't seem to support
> mark/space parity.

Ack.

>
> Similarly if you only support 8N1 or 7E1/7O1 you need to force the CSIZE
> bits to match what you ended up setting the UART to do.

Ack.

>
>> +/* update baud */
>> +baud = uart_get_baud_rate(port, new, old, 0, port->uartclk / 16);
>> +quot = uart_get_divisor(port, baud) - 1;
>> +pic32_uart_write(quot, sport, PIC32_UART_BRG);
>> +uart_update_timeout(port, new->c_cflag, baud);
>
> See the 8250 driver for an example: you probably need to write back the
> actual rate you got.

Ack.

>
>> +/* serial core request to release uart iomem */
>> +static void pic32_uart_release_port(struct uart_port *port)
>> +{
>> +struct platform_device *pdev = to_platform_device(port->dev);
>> +struct resource *res_mem;
>> +unsigned int res_size;
>
> resource_size_t for resources. Or you could just avoid the pointless
> variable in the first place 8)

Pointless variable removed.

>
> Other oddments - things like kasprintf() returns should be checked

Ack.

>
>
> Alan

Thanks,
Paul
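
To illustrate the point about writing back the actual rate: the divisor is rounded, so the rate the hardware ends up running at can differ from the one requested. The sketch below works through the arithmetic under two assumptions taken from the quoted code, namely that the divisor is uartclk / (16 * baud) rounded to nearest and that the register holds the divisor minus one. The function names and the 40 MHz clock in the example are hypothetical. In a driver, the resulting rate would then be reported back through the termios (the 8250 driver is the reference Alan points to).

```c
/* Round-to-nearest division for positive operands. */
static unsigned int div_round_closest(unsigned int n, unsigned int d)
{
	return (n + d / 2) / d;
}

/* Register value for a requested baud rate: rounded divisor minus one. */
static unsigned int brg_for_baud(unsigned int uartclk, unsigned int baud)
{
	return div_round_closest(uartclk, 16 * baud) - 1;
}

/* Rate actually produced by a given register value (truncated to an integer). */
static unsigned int baud_from_brg(unsigned int uartclk, unsigned int brg)
{
	return uartclk / (16 * (brg + 1));
}
```

With a 40000000 Hz clock and a request for 115200, the register value is 21 and the resulting rate is about 113636, roughly 1.4% below the request.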



[PATCH RESEND v4 08/11] staging: fsl-mc: set MSI domain for DPRC objects

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

The MSI domain associated with a root DPRC object is
obtained from the device tree. Child DPRCs inherit
the parent DPRC MSI domain.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4:
- Addressed comments from Marc Zyngier:
  * Changed call to fsl_mc_find_msi_domain() to match new
signature changed in patch 3.

Changes in v3: none

Changes in v2: none

 drivers/staging/fsl-mc/bus/dprc-driver.c | 39 
 1 file changed, 39 insertions(+)

diff --git a/drivers/staging/fsl-mc/bus/dprc-driver.c 
b/drivers/staging/fsl-mc/bus/dprc-driver.c
index ef1bb93..38fc404 100644
--- a/drivers/staging/fsl-mc/bus/dprc-driver.c
+++ b/drivers/staging/fsl-mc/bus/dprc-driver.c
@@ -13,6 +13,7 @@
 #include "../include/mc-sys.h"
 #include 
 #include 
+#include 
 #include "dprc-cmd.h"

 struct dprc_child_objs {
@@ -398,11 +399,16 @@ static int dprc_probe(struct fsl_mc_device *mc_dev)
 {
int error;
size_t region_size;
+   struct device *parent_dev = mc_dev->dev.parent;
struct fsl_mc_bus *mc_bus = to_fsl_mc_bus(mc_dev);
+   bool msi_domain_set = false;

if (WARN_ON(strcmp(mc_dev->obj_desc.type, "dprc") != 0))
return -EINVAL;

+   if (WARN_ON(dev_get_msi_domain(&mc_dev->dev)))
+   return -EINVAL;
+
if (!mc_dev->mc_io) {
/*
 * This is a child DPRC:
@@ -421,6 +427,30 @@ static int dprc_probe(struct fsl_mc_device *mc_dev)
&mc_dev->mc_io);
if (error < 0)
return error;
+   /*
+* Inherit parent MSI domain:
+*/
+   dev_set_msi_domain(&mc_dev->dev,
+  dev_get_msi_domain(parent_dev));
+   msi_domain_set = true;
+   } else {
+   /*
+* This is a root DPRC
+*/
+   struct irq_domain *mc_msi_domain;
+
+   if (WARN_ON(parent_dev->bus == &fsl_mc_bus_type))
+   return -EINVAL;
+
+   error = fsl_mc_find_msi_domain(parent_dev,
+  &mc_msi_domain);
+   if (error < 0) {
+   dev_warn(&mc_dev->dev,
+"WARNING: MC bus without interrupt support\n");
+   } else {
+   dev_set_msi_domain(&mc_dev->dev, mc_msi_domain);
+   msi_domain_set = true;
+   }
}

error = dprc_open(mc_dev->mc_io, 0, mc_dev->obj_desc.id,
@@ -446,6 +476,9 @@ error_cleanup_open:
(void)dprc_close(mc_dev->mc_io, 0, mc_dev->mc_handle);

 error_cleanup_mc_io:
+   if (msi_domain_set)
+   dev_set_msi_domain(&mc_dev->dev, NULL);
+
fsl_destroy_mc_io(mc_dev->mc_io);
return error;
 }
@@ -463,6 +496,7 @@ error_cleanup_mc_io:
 static int dprc_remove(struct fsl_mc_device *mc_dev)
 {
int error;
+   struct fsl_mc_bus *mc_bus = to_fsl_mc_bus(mc_dev);

if (WARN_ON(strcmp(mc_dev->obj_desc.type, "dprc") != 0))
return -EINVAL;
@@ -475,6 +509,11 @@ static int dprc_remove(struct fsl_mc_device *mc_dev)
if (error < 0)
dev_err(&mc_dev->dev, "dprc_close() failed: %d\n", error);

+   if (dev_get_msi_domain(&mc_dev->dev)) {
+   fsl_mc_cleanup_irq_pool(mc_bus);
+   dev_set_msi_domain(&mc_dev->dev, NULL);
+   }
+
dev_info(&mc_dev->dev, "DPRC device unbound from driver");
return 0;
 }
--
2.3.3



[PATCH RESEND v4 10/11] staging: fsl-mc: Added DPRC interrupt handler

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

Add the interrupt handler for DPRC IRQs. DPRC IRQs are
generated for hot plug events related to DPAA2 objects in a given
DPRC. These events include creating/destroying DPAA2 objects in
the DPRC, changing the "plugged" state of DPAA2 objects and moving
objects between DPRCs.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4: none

Changes in v3: none

Changes in v2: none

 drivers/staging/fsl-mc/bus/dprc-driver.c | 247 +++
 1 file changed, 247 insertions(+)

diff --git a/drivers/staging/fsl-mc/bus/dprc-driver.c 
b/drivers/staging/fsl-mc/bus/dprc-driver.c
index 42b2494..52c6fce 100644
--- a/drivers/staging/fsl-mc/bus/dprc-driver.c
+++ b/drivers/staging/fsl-mc/bus/dprc-driver.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "dprc-cmd.h"

 struct dprc_child_objs {
@@ -386,6 +387,230 @@ error:
 EXPORT_SYMBOL_GPL(dprc_scan_container);

 /**
+ * dprc_irq0_handler - Regular ISR for DPRC interrupt 0
+ *
+ * @irq_num: IRQ number of the interrupt being handled
+ * @arg: Pointer to device structure
+ */
+static irqreturn_t dprc_irq0_handler(int irq_num, void *arg)
+{
+   return IRQ_WAKE_THREAD;
+}
+
+/**
+ * dprc_irq0_handler_thread - Handler thread function for DPRC interrupt 0
+ *
+ * @irq_num: IRQ number of the interrupt being handled
+ * @arg: Pointer to device structure
+ */
+static irqreturn_t dprc_irq0_handler_thread(int irq_num, void *arg)
+{
+   int error;
+   u32 status;
+   struct device *dev = (struct device *)arg;
+   struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+   struct fsl_mc_bus *mc_bus = to_fsl_mc_bus(mc_dev);
+   struct fsl_mc_io *mc_io = mc_dev->mc_io;
+   struct msi_desc *msi_desc = mc_dev->irqs[0]->msi_desc;
+
+   dev_dbg(dev, "DPRC IRQ %d triggered on CPU %u\n",
+   irq_num, smp_processor_id());
+
+   if (WARN_ON(!(mc_dev->flags & FSL_MC_IS_DPRC)))
+   return IRQ_HANDLED;
+
+   mutex_lock(&mc_bus->scan_mutex);
+   if (WARN_ON(!msi_desc || msi_desc->irq != (u32)irq_num))
+   goto out;
+
+   error = dprc_get_irq_status(mc_io, 0, mc_dev->mc_handle, 0,
+   &status);
+   if (error < 0) {
+   dev_err(dev,
+   "dprc_get_irq_status() failed: %d\n", error);
+   goto out;
+   }
+
+   error = dprc_clear_irq_status(mc_io, 0, mc_dev->mc_handle, 0,
+ status);
+   if (error < 0) {
+   dev_err(dev,
+   "dprc_clear_irq_status() failed: %d\n", error);
+   goto out;
+   }
+
+   if (status & (DPRC_IRQ_EVENT_OBJ_ADDED |
+ DPRC_IRQ_EVENT_OBJ_REMOVED |
+ DPRC_IRQ_EVENT_CONTAINER_DESTROYED |
+ DPRC_IRQ_EVENT_OBJ_DESTROYED |
+ DPRC_IRQ_EVENT_OBJ_CREATED)) {
+   unsigned int irq_count;
+
+   error = dprc_scan_objects(mc_dev, &irq_count);
+   if (error < 0) {
+   /*
+* If the error is -ENXIO, we ignore it, as it indicates
+* that the object scan was aborted, as we detected that
+* an object was removed from the DPRC in the MC, while
+* we were scanning the DPRC.
+*/
+   if (error != -ENXIO) {
+   dev_err(dev, "dprc_scan_objects() failed: %d\n",
+   error);
+   }
+
+   goto out;
+   }
+
+   if (irq_count > FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS) {
+   dev_warn(dev,
+"IRQs needed (%u) exceed IRQs preallocated (%u)\n",
+irq_count, FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS);
+   }
+   }
+
+out:
+   mutex_unlock(&mc_bus->scan_mutex);
+   return IRQ_HANDLED;
+}
+
+/*
+ * Disable and clear interrupt for a given DPRC object
+ */
+static int disable_dprc_irq(struct fsl_mc_device *mc_dev)
+{
+   int error;
+   struct fsl_mc_io *mc_io = mc_dev->mc_io;
+
+   WARN_ON(mc_dev->obj_desc.irq_count != 1);
+
+   /*
+* Disable generation of interrupt, while we configure it:
+*/
+   error = dprc_set_irq_enable(mc_io, 0, mc_dev->mc_handle, 0, 0);
+   if (error < 0) {
+   dev_err(&mc_dev->dev,
+   "Disabling DPRC IRQ failed: dprc_set_irq_enable() failed: %d\n",
+   error);
+   return error;
+   }
+
+   /*
+* Disable all interrupt causes for the interrupt:
+*/
+   error = dprc_set_irq_mask(mc_io, 0, mc_dev->mc_handle, 0, 0x0);
+   if (error < 0) {
+   dev_err(&mc_dev->dev,
+   "Disabling DPRC 

Re: [PATCHv1 6/6] rdmacg: Added documentation for rdma controller.

2016-01-06 Thread Parav Pandit
On Wed, Jan 6, 2016 at 3:23 AM, Tejun Heo  wrote:
> Hello,
>
> On Wed, Jan 06, 2016 at 12:28:06AM +0530, Parav Pandit wrote:
>> +5-4-1. RDMA Interface Files
>> +
>> +  rdma.resource.verb.list
>> +  rdma.resource.verb.limit
>> +  rdma.resource.verb.usage
>> +  rdma.resource.verb.failcnt
>> +  rdma.resource.hw.list
>> +  rdma.resource.hw.limit
>> +  rdma.resource.hw.usage
>> +  rdma.resource.hw.failcnt
>
> Can you please read the rest of cgroup.txt and put the interface in
> line with the common conventions followed by other controllers?
>

Yes, I read through it. I can see two changes to be made in V2 of
this patch.
1. Rename rdma.resource.verb.usage and rdma.resource.verb.limit to
rdma.resource.verb.stat and rdma.resource.verb.max respectively.
2. rdma.resource.verb.failcnt indicates failure events, which I think
should go to events.
I will roll out a new patch for events after this patch as an additional
feature, and remove this feature in V2.

rdma.resource.verb.list file is unique to rdma cgroup, so I believe
this is fine.

We will conclude whether to have rdma.resource.hw. or not in
other patches.
I am of the opinion that we should keep the "resource" and "verb" or "hw"
tags, so that the names stay verbose enough to show what we are trying to
control.

Is that ok?

> Thanks.
>
> --
> tejun


[Patch v6 1/3] media: ti-vpe: Document CAL driver

2016-01-06 Thread Benoit Parrot
Device Tree bindings for the Camera Adaptation Layer (CAL) driver

Signed-off-by: Benoit Parrot 
Acked-by: Rob Herring 
---
 Documentation/devicetree/bindings/media/ti-cal.txt | 72 ++
 1 file changed, 72 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/media/ti-cal.txt

diff --git a/Documentation/devicetree/bindings/media/ti-cal.txt b/Documentation/devicetree/bindings/media/ti-cal.txt
new file mode 100644
index ..ae9b52f37576
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/ti-cal.txt
@@ -0,0 +1,72 @@
+Texas Instruments DRA72x CAMERA ADAPTATION LAYER (CAL)
+--
+
+The Camera Adaptation Layer (CAL) is a key component for image capture
+applications. The capture module provides the system interface and the
+processing capability to connect CSI2 image-sensor modules to the
+DRA72x device.
+
+Required properties:
+- compatible: must be "ti,dra72-cal"
+- reg: CAL Top level, Receiver Core #0, Receiver Core #1 and Camera RX
+   control address space
+- reg-names: cal_top, cal_rx_core0, cal_rx_core1, and camerrx_control
+registers
+- interrupts: should contain IRQ line for the CAL;
+
+CAL supports 2 camera port nodes on MIPI bus. Each CSI2 camera port node
+should contain a 'port' child node with child 'endpoint' node. Please
+refer to the bindings defined in
+Documentation/devicetree/bindings/media/video-interfaces.txt.
+
+Example:
+   cal: cal@4845b000 {
+   compatible = "ti,dra72-cal";
+   ti,hwmods = "cal";
+   reg = <0x4845B000 0x400>,
+ <0x4845B800 0x40>,
+ <0x4845B900 0x40>,
+ <0x4A002e94 0x4>;
+   reg-names = "cal_top",
+   "cal_rx_core0",
+   "cal_rx_core1",
+   "camerrx_control";
+   interrupts = ;
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   csi2_0: port@0 {
+   reg = <0>;
+   endpoint {
+   slave-mode;
+   remote-endpoint = <&ar0330_1>;
+   };
+   };
+   csi2_1: port@1 {
+   reg = <1>;
+   };
+   };
+   };
+
+   i2c5: i2c@4807c000 {
+   ar0330@10 {
+   compatible = "ti,ar0330";
+   reg = <0x10>;
+
+   port {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   ar0330_1: endpoint {
+   reg = <0>;
+   clock-lanes = <1>;
+   data-lanes = <0 2 3 4>;
+   remote-endpoint = <&csi2_0>;
+   };
+   };
+   };
+   };
-- 
1.8.5.1



[Patch v6 0/3] media: ti-vpe: Add CAL v4l2 camera capture driver

2016-01-06 Thread Benoit Parrot
The Camera Adaptation Layer (CAL) is a block which consists of a dual
port CSI2/MIPI camera capture engine.
This camera engine is currently found on DRA72xx family of devices.

Port #0 can handle CSI2 camera connected to up to 4 data lanes.
Port #1 can handle CSI2 camera connected to up to 2 data lanes.

The driver implements the required API/ioctls to be V4L2 compliant.
Driver supports the following:
- V4L2 API using DMABUF/MMAP buffer access based on videobuf2 api
- Asynchronous sensor sub device registration
- DT support

Currently each port is designed to connect to a single sub-device.
In other words port aggregation is not currently supported.

Changes since v5:
- Added ti-vpe entry into the MAINTAINERS file.
- Per review comment corrected potential infinite loop.
- Fix checkpatch alignment and use safer strlcpy.
- Remove trace like debug statement.
- Modified register and bit field macro to use existing
  bitops support, cleaning up the header file and make
  the code a little easier to follow.

Changes since v4:
- Corrected dt bindings per review comment.
- Applied related dt bindings changes to driver code.
- Folded in coccinelle generated patches.
- Corrected checkpatch.pl --strict warnings.

Changes since v3:
- Nothing really, I messed up the previous format-patch with the
  wrong commit-id. Sorry about the repeat.

Changes since v2:
- Rework Kconfig options and added COMPILE_TEST
- Merged in provided vb2 buffer rework
- Rebase on tip of lmm master and fix vb2 split related changes

Changes since v1:
- Remove unnecessary format description
- Reworked how transient frame format is maintained
  in order to make it easier to use the fill helper functions
- Added a per port list of active frame format
- Reworked and added missing vb2 cleanup code
- Fix a module load/unload kernel oops
- Switch to use proper int64 get function for pixel rate control

=

Here is a sample output of the v4l2-compliance tool:

# ./v4l2-compliance -f -s -v -d /dev/video0 
Driver Info:
Driver name   : cal
Card type : cal
Bus info : platform:cal-000

Capabilities  : 0x8521
Video Capture
Read/Write
Streaming
Extended Pix Format
Device Capabilities
Device Caps   : 0x0521
Video Capture
Read/Write
Streaming
Extended Pix Format

Compliance test for device /dev/video0 (not using libv4l2):

Required ioctls:
test VIDIOC_QUERYCAP: OK

Allow for multiple opens:
test second video open: OK
test VIDIOC_QUERYCAP: OK
test VIDIOC_G/S_PRIORITY: OK

Debug ioctls:
test VIDIOC_DBG_G/S_REGISTER: OK (Not Supported)
test VIDIOC_LOG_STATUS: OK

Input ioctls:
test VIDIOC_G/S_TUNER/ENUM_FREQ_BANDS: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_S_HW_FREQ_SEEK: OK (Not Supported)
test VIDIOC_ENUMAUDIO: OK (Not Supported)
test VIDIOC_G/S/ENUMINPUT: OK
test VIDIOC_G/S_AUDIO: OK (Not Supported)
Inputs: 1 Audio Inputs: 0 Tuners: 0

Output ioctls:
test VIDIOC_G/S_MODULATOR: OK (Not Supported)
test VIDIOC_G/S_FREQUENCY: OK (Not Supported)
test VIDIOC_ENUMAUDOUT: OK (Not Supported)
test VIDIOC_G/S/ENUMOUTPUT: OK (Not Supported)
test VIDIOC_G/S_AUDOUT: OK (Not Supported)
Outputs: 0 Audio Outputs: 0 Modulators: 0

Input/Output configuration ioctls:
test VIDIOC_ENUM/G/S/QUERY_STD: OK (Not Supported)
test VIDIOC_ENUM/G/S/QUERY_DV_TIMINGS: OK (Not Supported)
test VIDIOC_DV_TIMINGS_CAP: OK (Not Supported)
test VIDIOC_G/S_EDID: OK (Not Supported)

Test input 0:

Control ioctls:
info: checking v4l2_queryctrl of control 'User Controls' (0x00980001)
info: checking v4l2_queryctrl of control 'Horizontal Flip' (0x00980914)
info: checking v4l2_queryctrl of control 'Vertical Flip' (0x00980915)
info: checking v4l2_queryctrl of control 'Image Processing Controls' (0x009f0001)
info: checking v4l2_queryctrl of control 'Pixel Rate' (0x009f0902)
info: checking v4l2_queryctrl of control 'Horizontal Flip' (0x00980914)
info: checking v4l2_queryctrl of control 'Vertical Flip' (0x00980915)
test VIDIOC_QUERY_EXT_CTRL/QUERYMENU: OK
test VIDIOC_QUERYCTRL: OK
info: checking control 'User Controls' (0x00980001)
info: checking control 'Horizontal Flip' (0x00980914)
info: checking control 'Vertical Flip' (0x00980915)
info: checking control 'Image Processing Controls' (0x009f0001)
info: checking control 'Pixel Rate' (0x009f0902)
test VIDIOC_G/S_CTRL: OK
info: checking extended control 'User Controls' 

[Patch v6 2/3] MAINTAINERS: Add ti-vpe maintainer entry

2016-01-06 Thread Benoit Parrot
Signed-off-by: Benoit Parrot 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 4635e1d14612..ebbdb410c0f0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10631,6 +10631,14 @@ L: linux-o...@vger.kernel.org
 S: Maintained
 F: drivers/thermal/ti-soc-thermal/
 
+TI VPE/CAL DRIVERS
+M: Benoit Parrot 
+L: linux-me...@vger.kernel.org
+W: http://linuxtv.org/
+Q: http://patchwork.linuxtv.org/project/linux-media/list/
+S: Maintained
+F: drivers/media/platform/ti-vpe/
+
 TI CDCE706 CLOCK DRIVER
 M: Max Filippov 
 S: Maintained
-- 
1.8.5.1



Re: [PATCH v3 3/5] perf tools: Fix dynamic sort keys to sort properly

2016-01-06 Thread Arnaldo Carvalho de Melo
Em Wed, Jan 06, 2016 at 08:31:49PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Jan 07, 2016 at 08:26:45AM +0900, Namhyung Kim escreveu:
> > On Wed, Jan 06, 2016 at 08:06:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > > Em Wed, Jan 06, 2016 at 09:54:59AM +0900, Namhyung Kim escreveu:
> > > > Currently, the dynamic sort keys compare trace data using memcmp().
> > > > But for output sorting, it should check data size and compare by word.
> > > > Also it sorted strings in reverse order, fix it.
> > > 
> > > Can this be broken down in two patches? This is complex code, lets try
> > > to make it as bisectable as possible.
> > 
> > OK, I'll break out the string part then.  But I think it doesn't help
> > much to reduce the complexity.
> 
> Well, number of patches is not a problem, every time I see a "Also lets
> do this other thing" I cringe, it is automatic, sorry :-\
> 
> For reviewing it's so much better to see things nicely separated, and
> sometimes I like one part but not the other, so I pick one and continue
> discussion on the other, etc.

Ah, please rebase from my latest perf/core, I'm still holding on to it
since some 'perf test' entries are failing and I want to check first if
it's due to bugs introduced in this branch...

- Arnaldo
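
The quoted commit message says that memcmp() is not suitable for output sorting of trace fields and that values should be compared by word instead. A minimal userspace sketch of why, assuming a 4-byte little-endian unsigned field (the helper names are hypothetical and this is not the perf code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 4-byte little-endian field (layout assumed for this example). */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Bytewise comparison: enough to group equal values, wrong for ordering. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b, size_t len)
{
	assert(len > 0);
	return memcmp(a, b, len);
}

/* Numeric comparison of the decoded words. */
static int cmp_word32(const unsigned char *a, const unsigned char *b)
{
	uint32_t x = load_le32(a), y = load_le32(b);

	return (x > y) - (x < y);
}
```

With the value 1 stored as bytes 01 00 00 00 and 256 stored as 00 01 00 00, cmp_bytes() orders 256 before 1, while cmp_word32() gives the numeric order.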


Re: [BUG] skb corruption and kernel panic at forwarding with fragmentation

2016-01-06 Thread Florian Westphal
Florian Westphal  wrote:
> Thadeu Lima de Souza Cascardo  wrote:
> > On Wed, Jan 06, 2016 at 11:11:41PM +0300, Konstantin Khlebnikov wrote:

[ skb_gso_segment uses skb->cb[], causes oops if ip_fragment is invoked
  on segmented skbs ]

> > I have hit this as well, this fixes it for me on an older kernel. Can you
> > try it on latest kernel?
> 
> > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> > index d8a1745..f44bc91 100644
> > --- a/net/ipv4/ip_output.c
> > +++ b/net/ipv4/ip_output.c
> > @@ -216,6 +216,7 @@ static int ip_finish_output_gso(struct sk_buff *skb)
> > netdev_features_t features;
> > struct sk_buff *segs;
> > int ret = 0;
> > +   struct inet_skb_parm ipcb;
> >  
> > if (skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
> > return ip_finish_output2(skb);
> > @@ -227,6 +228,10 @@ static int ip_finish_output_gso(struct sk_buff *skb)
> >  * 2) skb arrived via virtio-net, we thus get TSO/GSO skbs directly
> >  * from host network stack.
> >  */
> > +   /* We need to save IPCB here because skb_gso_segment will use
> > +* SKB_GSO_CB.
> > +*/
> > +   ipcb = *IPCB(skb);
> > features = netif_skb_features(skb);
> > segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
> > if (IS_ERR_OR_NULL(segs)) {
> > @@ -241,6 +246,7 @@ static int ip_finish_output_gso(struct sk_buff *skb)
> > int err;
> >  
> > segs->next = NULL;
> > +   *IPCB(segs) = ipcb;
> > err = ip_fragment(segs, ip_finish_output2);
> >  
> > if (err && ret == 0)
> 
> I'm worried that this doesn't solve all cases. f.e. xfrm may also
> call skb_gso_segment(), and it will call into ipv4/ipv6 netfilter
> postrouting + ipv4 output functions...
> 
> nfqnl_enqueue_packet() is also affected.

... but it seems that those three are the only affected callers
of skb_gso_segment (tbf is ok since skb isn't owned by anyone,
ovs does save/restore already).

I think this patch is the right way, we just need similar
save/restore in nfqnl_enqueue_packet and xfrm_output_gso().

The latter two can be used by either ipv4 or ipv6 so it might
be preferable to just save/restore sizeof(struct skb_gso_cb);
or a union of inet_skb_parm+inet6_skb_parm.

Cascardo, can you cook a patch?

Thanks!
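
The save/restore pattern discussed above can be sketched in userspace as follows. This is only an illustration under stated assumptions: struct pkt, struct ip_state and segment() are hypothetical stand-ins for sk_buff, inet_skb_parm and skb_gso_segment(), not kernel definitions.

```c
#include <assert.h>
#include <string.h>

#define CB_SIZE 48	/* scratch area size; sk_buff's cb[] is 48 bytes */

/* Hypothetical stand-in for sk_buff: only the shared scratch area. */
struct pkt {
	unsigned char cb[CB_SIZE];
};

/* Hypothetical stand-in for the IP layer's per-packet state kept in cb[]. */
struct ip_state {
	int flags;
	int frag_max_size;
};

/* Hypothetical segmentation step that reuses cb[] as its own scratch space. */
static void segment(struct pkt *p)
{
	memset(p->cb, 0xff, sizeof(p->cb));
}

/* Save the caller's view of cb[] before segmenting, restore it afterwards. */
static void segment_preserving_state(struct pkt *p)
{
	struct ip_state saved;

	assert(sizeof(saved) <= sizeof(p->cb));
	memcpy(&saved, p->cb, sizeof(saved));
	segment(p);
	memcpy(p->cb, &saved, sizeof(saved));
}
```

Saving a fixed number of bytes, or a union of the IPv4 and IPv6 state as suggested above, would follow the same shape with a larger saved object.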


Re: [PATCH v3] PCI: hosts: mark pcie/pci (msi) irq cascade handler as IRQF_NO_THREAD

2016-01-06 Thread Bjorn Helgaas
Hi Grygorii,

On Thu, Dec 10, 2015 at 09:18:20PM +0200, Grygorii Strashko wrote:
> On -RT and if kernel is booting with "threadirqs" cmd line parameter
> pcie/pci (msi) irq cascade handlers (like dra7xx_pcie_msi_irq_handler())
> will be forced threaded and, as result, will generate warnings like:
> 
> WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174()
> irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts
> Backtrace:
>  (warn_slowpath_common) from [] (warn_slowpath_fmt+0x38/0x40)
>  (warn_slowpath_fmt) from [] (handle_irq_event_percpu+0x14c/0x174)
>  (handle_irq_event_percpu) from [] (handle_irq_event+0x84/0xb8)
>  (handle_irq_event) from [] (handle_simple_irq+0x90/0x118)
>  (handle_simple_irq) from [] (generic_handle_irq+0x30/0x44)
>  (generic_handle_irq) from [] (dra7xx_pcie_msi_irq_handler+0x7c/0x8c)
>  (dra7xx_pcie_msi_irq_handler) from [] (irq_forced_thread_fn+0x28/0x5c)
>  (irq_forced_thread_fn) from [] (irq_thread+0x128/0x204)
> 
> This happens because all of them invoke generic_handle_irq() from the
> requested handler. generic_handle_irq grabs raw_locks and this needs to
> run in raw-irq context.
> 
> This issue was originally reproduced on TI dra7-evem, but, as was
> identified during discussion [1], other PCI(e) hosts can also suffer
> from this issue. So let's fix all them at once and mark pcie/pci (msi)
> irq cascade handlers IRQF_NO_THREAD explicitly.
> 
> [1] https://lkml.org/lkml/2015/11/20/356
> 
> Cc: Kishon Vijay Abraham I 
> Cc: Bjorn Helgaas 
> Cc: Jingoo Han 
> Cc: Kukjin Kim 
> Cc: Krzysztof Kozlowski 
> Cc: Richard Zhu 
> Cc: Lucas Stach 
> Cc: Thierry Reding 
> Cc: Stephen Warren 
> Cc: Alexandre Courbot 
> Cc: Simon Horman 
> Cc: Pratyush Anand 
> Cc: Michal Simek 
> Cc: "Sören Brinkmann" 
> Cc: Sebastian Andrzej Siewior 
> Signed-off-by: Grygorii Strashko 
> ---
> Changes in v3:
>  - change applied to all affected pci(e) host drivers in drivers/pci/hosts.
>    After some investigation I've decided not to touch arch code - it is not
>    easy to identify all places which need to be fixed.
>    If it's still required, I can send separate patches for
>    arch/mips/pci/msi-octeon.c and arch/sparc/kernel/pci_msi.c.
> Links
> v2: https://lkml.org/lkml/2015/11/20/356
> v1: https://lkml.org/lkml/2015/11/5/593
> ref: https://lkml.org/lkml/2015/11/3/660
> 
>  drivers/pci/host/pci-dra7xx.c | 13 -
>  drivers/pci/host/pci-exynos.c |  3 ++-
>  drivers/pci/host/pci-imx6.c   |  3 ++-
>  drivers/pci/host/pci-tegra.c  |  2 +-
>  drivers/pci/host/pcie-rcar.c  |  6 --
>  drivers/pci/host/pcie-spear13xx.c |  3 ++-
>  drivers/pci/host/pcie-xilinx.c|  3 ++-
>  7 files changed, 25 insertions(+), 8 deletions(-)

I applied this to pci/host for v4.5, thanks.  I added a stable tag.
I haven't seen any acks from the host driver guys, but I will still add
them if I see any in the next few days.

Bjorn

> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c
> index 8c36880..0415192 100644
> --- a/drivers/pci/host/pci-dra7xx.c
> +++ b/drivers/pci/host/pci-dra7xx.c
> @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx,
>   return -EINVAL;
>   }
>  
> + /*
> +  * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD
> +  * On -RT and if kernel is booting with "threadirqs" cmd line parameter
> +  * the dra7xx_pcie_msi_irq_handler() will be forced threaded but,
> +  * at the same time, it's an IRQ dispatcher and calls generic_handle_irq(),
> +  * which, in turn, will be resolved to a handle_simple_irq() call.
> +  * handle_simple_irq() expects to be called with IRQs disabled; as a
> +  * result the kernel will display the warning:
> +  * "irq XXX handler YYY+0x0/0x14 enabled interrupts".
> +  */
>   ret = devm_request_irq(>dev, pp->irq,
> -dra7xx_pcie_msi_irq_handler, IRQF_SHARED,
> +dra7xx_pcie_msi_irq_handler,
> +IRQF_SHARED | IRQF_NO_THREAD,
>  "dra7-pcie-msi", pp);
>   if (ret) {
>   dev_err(>dev, "failed to request irq\n");
> diff --git a/drivers/pci/host/pci-exynos.c b/drivers/pci/host/pci-exynos.c
> index 01095e1..d997d22 100644
> --- a/drivers/pci/host/pci-exynos.c
> +++ b/drivers/pci/host/pci-exynos.c
> @@ -522,7 +522,8 @@ static int __init exynos_add_pcie_port(struct pcie_port *pp,
>  
>   ret = devm_request_irq(>dev, pp->msi_irq,
>   exynos_pcie_msi_irq_handler,

[PATCH] Input: rohm_bu21023 - fix handling of retrying firmware update

2016-01-06 Thread Dmitry Torokhov
Because of the wrong condition we'd never retry firmware update.

Signed-off-by: Dmitry Torokhov 
---
 drivers/input/touchscreen/rohm_bu21023.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/touchscreen/rohm_bu21023.c b/drivers/input/touchscreen/rohm_bu21023.c
index ba6024f..611156a 100644
--- a/drivers/input/touchscreen/rohm_bu21023.c
+++ b/drivers/input/touchscreen/rohm_bu21023.c
@@ -725,7 +725,7 @@ static int rohm_ts_load_firmware(struct i2c_client *client,
break;
 
error = -EIO;
-   } while (++retry >= FIRMWARE_RETRY_MAX);
+   } while (++retry <= FIRMWARE_RETRY_MAX);
 
 out:
error2 = i2c_smbus_write_byte_data(client, INT_MASK, INT_ALL);
-- 
2.6.0.rc2.230.g3dd15c0


-- 
Dmitry
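
To illustrate why the old condition never retried: a do/while loop repeats only while its condition is true, and `++retry >= FIRMWARE_RETRY_MAX` is already false after the first failed attempt for any maximum above 1. A minimal userspace sketch, assuming a hypothetical try_once() operation and a RETRY_MAX of 3 (neither is taken from the driver):

```c
#include <assert.h>
#include <stdbool.h>

#define RETRY_MAX 3	/* hypothetical; stands in for FIRMWARE_RETRY_MAX */

/* Hypothetical operation that fails until it has been called succeed_on times. */
struct op {
	int calls;
	int succeed_on;
};

static bool try_once(struct op *op)
{
	assert(op);
	return ++op->calls >= op->succeed_on;
}

/* Old condition: false after the first failure, so the loop exits at once. */
static int attempts_old(struct op *op)
{
	int retry = 0;

	do {
		if (try_once(op))
			break;
	} while (++retry >= RETRY_MAX);

	return op->calls;
}

/* New condition: keeps looping until retry exceeds RETRY_MAX. */
static int attempts_new(struct op *op)
{
	int retry = 0;

	do {
		if (try_once(op))
			break;
	} while (++retry <= RETRY_MAX);

	return op->calls;
}
```

In this sketch an operation that succeeds on its second call is attempted once by attempts_old() and twice by attempts_new(); an operation that never succeeds is attempted RETRY_MAX + 1 times by attempts_new().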


[PATCH v5 6/6] nvdimm: Add IOCTL pass thru functions

2016-01-06 Thread Jerry Hoemann
Add ioctl command ND_CMD_CALL_DSM to acpi_nfit_ctl and __nd_ioctl which
allow kernel to call a nvdimm's _DSM as a passthru without using the
marshaling code of the nd_cmd_desc.

Signed-off-by: Jerry Hoemann 
---
 drivers/acpi/nfit.c  | 52 +++-
 drivers/nvdimm/bus.c | 47 ---
 2 files changed, 83 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/nfit.c b/drivers/acpi/nfit.c
index aa45d48..015fc8e 100644
--- a/drivers/acpi/nfit.c
+++ b/drivers/acpi/nfit.c
@@ -85,6 +85,10 @@ static int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc,
const u8 *uuid;
u32 offset;
int rc, i;
+   __u64 rev = 1, func = cmd;
+
+   struct nd_cmd_dsmcall_pkg *pkg = buf;
+   int dsm_call = (cmd == ND_CMD_CALL_DSM);
 
if (nvdimm) {
struct nfit_mem *nfit_mem = nvdimm_provider_data(nvdimm);
@@ -108,6 +112,8 @@ static int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc,
handle = adev->handle;
dimm_name = "bus";
}
+   if (dsm_call)
+   dsm_mask = ~0UL;
 
if (!desc || (cmd && (desc->out_num + desc->in_num == 0)))
return -ENOTTY;
@@ -127,15 +133,25 @@ static int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc,
in_buf.buffer.length += nd_cmd_in_size(nvdimm, cmd, desc,
i, buf);
 
+   if (dsm_call) {
+   /* must skip over package wrapper */
+   in_buf.buffer.pointer = (void *) &pkg->dsm_buf;
+   in_buf.buffer.length = pkg->h.dsm_in;
+   /* for pass thru must use value sent in from user space. */
+   uuid = pkg->h.dsm_uuid;
+   rev  = pkg->h.dsm_rev;
+   func = pkg->h.dsm_fun_idx;
+   }
+
if (IS_ENABLED(CONFIG_ACPI_NFIT_DEBUG)) {
-   dev_dbg(dev, "%s:%s cmd: %s input length: %d\n", __func__,
-   dimm_name, cmd_name, in_buf.buffer.length);
-   print_hex_dump_debug(cmd_name, DUMP_PREFIX_OFFSET, 4,
-   4, in_buf.buffer.pointer, min_t(u32, 128,
-   in_buf.buffer.length), true);
+   dev_dbg(dev, "%s:%s cmd: %d: %llu input length: %d\n", __func__,
+   dimm_name, cmd, func, in_buf.buffer.length);
+   print_hex_dump_debug("nvdimm in  ", DUMP_PREFIX_OFFSET, 4, 4,
+   in_buf.buffer.pointer,
+   min_t(u32, 256, in_buf.buffer.length), true);
}
 
-   out_obj = acpi_evaluate_dsm(handle, uuid, 1, cmd, _obj);
+   out_obj = acpi_evaluate_dsm(handle, uuid, rev, func, _obj);
if (!out_obj) {
dev_dbg(dev, "%s:%s _DSM failed cmd: %s\n", __func__, dimm_name,
cmd_name);
@@ -143,18 +159,28 @@ static int acpi_nfit_ctl(struct nvdimm_bus_descriptor *nd_desc,
}
 
if (out_obj->package.type != ACPI_TYPE_BUFFER) {
-   dev_dbg(dev, "%s:%s unexpected output object type cmd: %s type: %d\n",
-   __func__, dimm_name, cmd_name, out_obj->type);
+   dev_dbg(dev, "%s:%s unexpected output object type cmd: %s %llu, type: %d\n",
+   __func__, dimm_name, cmd_name, func, out_obj->type);
rc = -EINVAL;
goto out;
}
 
if (IS_ENABLED(CONFIG_ACPI_NFIT_DEBUG)) {
-   dev_dbg(dev, "%s:%s cmd: %s output length: %d\n", __func__,
-   dimm_name, cmd_name, out_obj->buffer.length);
-   print_hex_dump_debug(cmd_name, DUMP_PREFIX_OFFSET, 4,
-   4, out_obj->buffer.pointer, min_t(u32, 128,
-   out_obj->buffer.length), true);
+   dev_dbg(dev, "%s:%s cmd %d: %llu output length %d\n", __func__,
+   dimm_name, cmd, func, out_obj->buffer.length);
+   print_hex_dump_debug("nvdimm out ", DUMP_PREFIX_OFFSET, 4, 4,
+   out_obj->buffer.pointer,
+   min_t(u32, 256, out_obj->buffer.length), true);
+   }
+
+   if (dsm_call) {
+   pkg->h.dsm_size = out_obj->buffer.length;
+   memcpy(pkg->dsm_buf + pkg->h.dsm_in,
+   out_obj->buffer.pointer,
+   min(pkg->h.dsm_size, pkg->h.dsm_out));
+
+   ACPI_FREE(out_obj);
+   return 0;
}
 
for (i = 0, offset = 0; i < desc->out_num; i++) {
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 87fe545..8d3a64b 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -367,6 +367,12 @@ static const struct nd_cmd_desc __nd_cmd_dimm_descs[] = {
.out_num = 3,
.out_sizes = { 4, 4, UINT_MAX, },
},
+ 

[PATCH v5 1/6] ACPI / util: Fix acpi_evaluate_dsm() argument type

2016-01-06 Thread Jerry Hoemann
The ACPI spec specifies that arguments "Revision ID" and
"Function Index" to a _DSM are type "Integer."  Type Integers
are 64 bit quantities.

The function acpi_evaluate_dsm declares these arguments as plain "int",
which is 32 bits.  Correct the parameter types of acpi_evaluate_dsm
so that its callers and derived callers pass the correct type.

acpi_check_dsm and acpi_evaluate_dsm_typed had a similar issue
and were corrected as well.

Signed-off-by: Jerry Hoemann 
---
 drivers/acpi/utils.c| 4 ++--
 include/acpi/acpi_bus.h | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
index 475c907..049cba4 100644
--- a/drivers/acpi/utils.c
+++ b/drivers/acpi/utils.c
@@ -628,7 +628,7 @@ acpi_status acpi_evaluate_lck(acpi_handle handle, int lock)
  * some old BIOSes do expect a buffer or an integer etc.
  */
 union acpi_object *
-acpi_evaluate_dsm(acpi_handle handle, const u8 *uuid, int rev, int func,
+acpi_evaluate_dsm(acpi_handle handle, const u8 *uuid, u64 rev, u64 func,
  union acpi_object *argv4)
 {
acpi_status ret;
@@ -677,7 +677,7 @@ EXPORT_SYMBOL(acpi_evaluate_dsm);
  * functions. Currently only support 64 functions at maximum, should be
  * enough for now.
  */
-bool acpi_check_dsm(acpi_handle handle, const u8 *uuid, int rev, u64 funcs)
+bool acpi_check_dsm(acpi_handle handle, const u8 *uuid, u64 rev, u64 funcs)
 {
int i;
u64 mask = 0;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index ad0a5ff..8e6abcf 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -61,12 +61,12 @@ bool acpi_ata_match(acpi_handle handle);
 bool acpi_bay_match(acpi_handle handle);
 bool acpi_dock_match(acpi_handle handle);
 
-bool acpi_check_dsm(acpi_handle handle, const u8 *uuid, int rev, u64 funcs);
+bool acpi_check_dsm(acpi_handle handle, const u8 *uuid, u64 rev, u64 funcs);
 union acpi_object *acpi_evaluate_dsm(acpi_handle handle, const u8 *uuid,
-   int rev, int func, union acpi_object *argv4);
+   u64 rev, u64 func, union acpi_object *argv4);
 
 static inline union acpi_object *
-acpi_evaluate_dsm_typed(acpi_handle handle, const u8 *uuid, int rev, int func,
+acpi_evaluate_dsm_typed(acpi_handle handle, const u8 *uuid, u64 rev, u64 func,
union acpi_object *argv4, acpi_object_type type)
 {
union acpi_object *obj;
-- 
1.7.11.3
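
As an illustration of what the narrower prototype loses, here is a minimal userspace sketch. It models the old 32-bit parameter with uint32_t so that the narrowing is well defined in C; the function names are hypothetical and are not the ACPI API.

```c
#include <assert.h>
#include <stdint.h>

/* Models a callee whose parameter is only 32 bits wide. */
static uint64_t pass_narrow(uint32_t func)
{
	return func;
}

/* Models a callee whose parameter matches the 64-bit ACPI Integer. */
static uint64_t pass_wide(uint64_t func)
{
	return func;
}

/* What a caller holding a 64-bit value gets through each prototype. */
static uint64_t call_narrow(uint64_t requested)
{
	/* explicit cast standing in for the implicit conversion at the call */
	return pass_narrow((uint32_t)requested);
}

static uint64_t call_wide(uint64_t requested)
{
	return pass_wide(requested);
}
```

A value such as 0x100000002 arrives as 2 through call_narrow() and unchanged through call_wide().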



[PATCH v5 0/6] nvdimm: Add an IOCTL pass thru for DSM calls

2016-01-06 Thread Jerry Hoemann
The NVDIMM code in the kernel supports an IOCTL interface to user
space based upon the Intel Example DSM:

http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This interface cannot be used by other NVDIMM DSMs that support
incompatible functions.

This patch set adds a generic "passthru" IOCTL interface which
is not tied to a particular DSM.

A new _IOC_NR ND_CMD_CALL_DSM == "10" is added for the pass thru call.

The new data structure nd_cmd_dsmcall_pkg serves as a wrapper for
the passthru calls.  This wrapper supplies the data that the kernel
needs to make the _DSM call.

Unlike the definitions of the _DSM functions themselves, the nd_cmd_dsmcall_pkg
provides the calling information (input/output sizes) in a uniform
manner, making the kernel marshaling of the arguments
straightforward.

This shifts the marshaling burden from the kernel to the user
space application while still permitting the kernel to internally
call _DSM functions.

The kernel functions __nd_ioctl and acpi_nfit_ctl were modified
to accommodate ND_CMD_CALL_DSM.


Changes in version 5:
-
0. Fixed submit comment for drivers/acpi/utils.c.


Changes in version 4:
-
0. Added patch to correct parameter type passed to acpi_evaluate_dsm.
   ACPI defines arguments rev and func as 64 bit quantities and the ioctl
   exports rev and func to user space. We want those to match the ACPI spec.

   Also modified acpi_evaluate_dsm_typed and acpi_check_dsm which had
   a similar issue.

1. nd_cmd_dsmcall_pkg: rearranged a reserve and rounded up total size
   to 16 byte boundary.

2. Created stand alone patch for the pre-existing security issue related
   to "read only" IOCTL calls.

3. Added patch for increasing envelope size of IOCTL.  Needed to
   be able to read in the wrapper to know remaining size to copy in.

   Note: in_env, out_env are statics sized based upon this change.

4. Moved copyin code to table driven nd_cmd_desc 

  Note, the last 40 lines or so of acpi_nfit_ctl will not return _DSM
  data unless the size allocated in user space buffer equals
  out_obj->buffer.length.

  The semantic we want in the pass thru case is to return as much
  of the _DSM data as the user space buffer would accommodate.

  Hence, in acpi_nfit_ctl I have retained the line:

memcpy(pkg->dsm_buf + pkg->h.dsm_in,
out_obj->buffer.pointer,
min(pkg->h.dsm_size, pkg->h.dsm_out));

  and the early return from the function.
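
The copy-out semantic described above can be sketched in userspace as follows. The field names (dsm_in, dsm_out, dsm_size, dsm_buf) follow the patch; the types, the fixed buffer size and the ex_ prefixed names are assumptions made only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Field names follow the patch; types and layout are assumed here. */
struct ex_dsmcall_hdr {
	uint32_t dsm_in;	/* bytes of input payload at the start of dsm_buf */
	uint32_t dsm_out;	/* bytes of output space following the input */
	uint32_t dsm_size;	/* set on return: bytes the _DSM produced */
};

struct ex_dsmcall_pkg {
	struct ex_dsmcall_hdr h;
	unsigned char dsm_buf[64];	/* fixed size only for this example */
};

/*
 * Record the full output length, then copy as much of the output as the
 * caller left room for. Returns the number of bytes copied.
 */
static uint32_t ex_copy_out(struct ex_dsmcall_pkg *pkg, const void *out,
			    uint32_t out_len)
{
	uint32_t n;

	assert(pkg->h.dsm_in + pkg->h.dsm_out <= sizeof(pkg->dsm_buf));
	pkg->h.dsm_size = out_len;
	n = out_len < pkg->h.dsm_out ? out_len : pkg->h.dsm_out;
	memcpy(pkg->dsm_buf + pkg->h.dsm_in, out, n);
	return n;
}
```

Because dsm_size reports the untruncated length, a caller in this sketch can tell from dsm_size > dsm_out that the output did not fit and retry with a larger buffer.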




Changes in version 3:
-
1. Changed name ND_CMD_PASSTHRU to ND_CMD_CALL_DSM.

2. Value of ND_CMD_CALL_DSM is 10, not 100.

3. Changed name of nd_passthru_pkg to nd_cmd_dsmcall_pkg.

4. Removed separate functions for handling ND_CMD_CALL_DSM.
   Moved functionality to __nd_ioctl and acpi_nfit_ctl proper.
   The resultant code looks very different from prior versions.

5. BUGFIX: __nd_ioctl: Change the if read_only switch to use
 _IOC_NR cmd (not ioctl_cmd) for better protection.

Do we want to make a stand alone patch for this issue?


Changes in version 2:
-
1. Cleanup access mode check in nd_ioctl and nvdimm_ioctl.
2. Change name of ndn_pkg to nd_passthru_pkg
3. Adjust sizes in nd_passthru_pkg. DSM integers are 64 bit.
4. No new ioctl type, instead tunnel into the existing number space.
5. Push down one function level where determine ioctl cmd type.
6. re-work diagnostic print/dump message in pass-thru functions.




Jerry Hoemann (6):
  ACPI / util: Fix acpi_evaluate_dsm() argument type
  nvdimm: Clean-up access mode check.
  nvdimm: Add wrapper for IOCTL pass thru
  nvdimm: Fix security issue with DSM IOCTL.
  nvdimm: Increase max envelope size for IOCTL
  nvdimm: Add IOCTL pass thru functions

 drivers/acpi/nfit.c| 52 ++-
 drivers/acpi/utils.c   |  4 +--
 drivers/nvdimm/bus.c   | 67 +-
 include/acpi/acpi_bus.h|  6 ++---
 include/linux/libnvdimm.h  |  2 +-
 include/uapi/linux/ndctl.h | 19 +
 6 files changed, 118 insertions(+), 32 deletions(-)

-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v5 4/6] nvdimm: Fix security issue with DSM IOCTL.

2016-01-06 Thread Jerry Hoemann
Code attempts to prevent certain IOCTL DSM from being called
when device is opened read only.  This security feature can
be trivially overcome by changing the size portion of the
ioctl_command which isn't used.

Check only the _IOC_NR (i.e. the command).
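
To illustrate why matching on the whole request number is fragile, here is
a small user-space sketch. The DEMO_* names and struct demo_arg are
hypothetical; only the _IOC(), _IOWR() and _IOC_NR() macros come from the
standard ioctl headers:

```c
#include <stdbool.h>
#include <sys/ioctl.h>

/* Hypothetical ioctl type and command number, for illustration only. */
#define DEMO_IOCTL_TYPE	'N'
#define DEMO_CMD_SET	2

struct demo_arg {
	unsigned int value;
};

/* The full request number a driver would advertise for the "set" command. */
#define DEMO_IOCTL_SET	_IOWR(DEMO_IOCTL_TYPE, DEMO_CMD_SET, struct demo_arg)

/* Fragile: matches only when direction, size, type and nr all agree. */
static inline bool is_write_cmd_full(unsigned int ioctl_cmd)
{
	return ioctl_cmd == DEMO_IOCTL_SET;
}

/* Robust: looks only at the command-number field of the request. */
static inline bool is_write_cmd_nr(unsigned int ioctl_cmd)
{
	return _IOC_NR(ioctl_cmd) == DEMO_CMD_SET;
}
```

A request built with the same command number but a different size field
slips past the first check and is still caught by the second.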

Signed-off-by: Jerry Hoemann 
---
 drivers/nvdimm/bus.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 1c81203..87fe545 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -513,10 +513,10 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, 
struct nvdimm *nvdimm,
 
/* fail write commands (when read-only) */
if (read_only)
-   switch (ioctl_cmd) {
-   case ND_IOCTL_VENDOR:
-   case ND_IOCTL_SET_CONFIG_DATA:
-   case ND_IOCTL_ARS_START:
+   switch (cmd) {
+   case ND_CMD_VENDOR:
+   case ND_CMD_SET_CONFIG_DATA:
+   case ND_CMD_ARS_START:
dev_dbg(&nvdimm_bus->dev, "'%s' command while read-only.\n",
nvdimm ? nvdimm_cmd_name(cmd)
: nvdimm_bus_cmd_name(cmd));
-- 
1.7.11.3



[PATCH v5 3/6] nvdimm: Add wrapper for IOCTL pass thru

2016-01-06 Thread Jerry Hoemann
Add struct nd_cmd_dsmcall_pkg, which serves as a wrapper for
the data being passed via a pass thru to an NVDIMM DSM.
This wrapper specifies the extra information in a uniform
manner, allowing the kernel to call a DSM without knowing
the specifics of the DSM.

Add dsm_call command to nvdimm_bus_cmd_name and nvdimm_cmd_name.
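
For readers who want to check the layout from user space, a rough sketch
follows. It mirrors the wrapper with <stdint.h> types; the name
demo_dsmcall_pkg and the helper demo_pkg_alloc_size() are hypothetical,
and the assumption that output follows input in dsm_buf is taken from the
memcpy() shown in the cover letter, not from a documented contract:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * User-space mirror of the wrapper, with the kernel's __u8/__u32/__u64
 * replaced by <stdint.h> types. The struct name is hypothetical.
 */
struct demo_dsmcall_pkg {
	struct {
		uint8_t  dsm_uuid[16];
		uint64_t reserved1;	/* reserved, should be zero */
		uint64_t dsm_rev;	/* revision of dsm call */
		uint64_t dsm_fun_idx;	/* DSM function id */
		uint32_t dsm_in;	/* size of _DSM input */
		uint32_t dsm_out;	/* size of user buffer */
		uint32_t reserved2[23];	/* reserved, must be zero */
		uint32_t dsm_size;	/* size _DSM would write */
	} h;
	unsigned char dsm_buf[];	/* contents of DSM call */
};

/*
 * Bytes a caller would allocate, assuming the output is placed right
 * after the input in dsm_buf.
 */
static inline size_t demo_pkg_alloc_size(uint32_t dsm_in, uint32_t dsm_out)
{
	return sizeof(struct demo_dsmcall_pkg) + (size_t)dsm_in + dsm_out;
}
```

With these field sizes the fixed header works out to 144 bytes, a multiple
of 16.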

Signed-off-by: Jerry Hoemann 
---
 include/uapi/linux/ndctl.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index 5b4a4be..6823af3 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -109,6 +109,7 @@ enum {
ND_CMD_VENDOR_EFFECT_LOG_SIZE = 7,
ND_CMD_VENDOR_EFFECT_LOG = 8,
ND_CMD_VENDOR = 9,
+   ND_CMD_CALL_DSM = 10,
 };
 
 enum {
@@ -122,6 +123,7 @@ static inline const char *nvdimm_bus_cmd_name(unsigned cmd)
[ND_CMD_ARS_CAP] = "ars_cap",
[ND_CMD_ARS_START] = "ars_start",
[ND_CMD_ARS_STATUS] = "ars_status",
+   [ND_CMD_CALL_DSM] = "dsm_call",
};
 
if (cmd < ARRAY_SIZE(names) && names[cmd])
@@ -141,6 +143,7 @@ static inline const char *nvdimm_cmd_name(unsigned cmd)
[ND_CMD_VENDOR_EFFECT_LOG_SIZE] = "effect_size",
[ND_CMD_VENDOR_EFFECT_LOG] = "effect_log",
[ND_CMD_VENDOR] = "vendor",
+   [ND_CMD_CALL_DSM] = "dsm_call",
};
 
if (cmd < ARRAY_SIZE(names) && names[cmd])
@@ -204,4 +207,20 @@ enum ars_masks {
ARS_STATUS_MASK = 0x0000FFFF,
ARS_EXT_STATUS_SHIFT = 16,
 };
+
+
+struct nd_cmd_dsmcall_pkg {
+   struct {
+   __u8dsm_uuid[16];
+   __u64   reserved1;  /* reserved should be zero */
+   __u64   dsm_rev;/* revision of dsm call  */
+   __u64   dsm_fun_idx;/* DSM function id   */
+   __u32   dsm_in; /* size of _DSM input*/
+   __u32   dsm_out;/* size of user buffer   */
+   __u32   reserved2[23];  /* reserved must be zero */
+   __u32   dsm_size;   /* size _DSM would write */
+   } h;
+   unsigned char dsm_buf[];/* Contents of DSM call  */
+};
+
 #endif /* __NDCTL_H__ */
-- 
1.7.11.3



[PATCH v5 5/6] nvdimm: Increase max envelope size for IOCTL

2016-01-06 Thread Jerry Hoemann
__nd_ioctl must first read in the fixed-size portion of an ioctl
so that it can then determine the size of the variable part.

Prepare for ND_CMD_CALL_DSM calls, whose fixed-size wrapper is
larger.
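
The two-stage read can be sketched in user space as below. Everything
named demo_* is hypothetical, and the 144-byte header is only a stand-in
for a wrapper that no longer fits in the old 16-byte envelope:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_ENVELOPE 256	/* mirrors the new ND_CMD_MAX_ENVELOPE */

/* Hypothetical fixed-size header that announces a variable-size tail. */
struct demo_hdr {
	uint32_t in_len;	/* bytes of payload following the header */
	uint32_t reserved[35];	/* pads the header to 144 bytes */
};

/* With an envelope of 16 this assertion would fail at compile time. */
_Static_assert(sizeof(struct demo_hdr) <= DEMO_MAX_ENVELOPE,
	       "header must fit in the envelope");

/*
 * Stage 1: copy only the fixed header into the envelope buffer.
 * Stage 2: use the header to learn how long the payload is.
 * Returns 0 on success, -1 if the request is malformed.
 */
static inline int demo_parse(const unsigned char *req, size_t req_len,
			     size_t *payload_len)
{
	unsigned char env[DEMO_MAX_ENVELOPE];
	struct demo_hdr hdr;

	if (req_len < sizeof(hdr))
		return -1;

	memcpy(env, req, sizeof(hdr));
	memcpy(&hdr, env, sizeof(hdr));

	/* The announced payload must fit in what the caller handed over. */
	if (hdr.in_len > req_len - sizeof(hdr))
		return -1;

	*payload_len = hdr.in_len;
	return 0;
}
```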

Signed-off-by: Jerry Hoemann 
---
 include/linux/libnvdimm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index 3f021dc..b0a2f60 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -27,7 +27,7 @@ enum {
/* need to set a limit somewhere, but yes, this is likely overkill */
ND_IOCTL_MAX_BUFLEN = SZ_4M,
ND_CMD_MAX_ELEM = 4,
-   ND_CMD_MAX_ENVELOPE = 16,
+   ND_CMD_MAX_ENVELOPE = 256,
ND_CMD_ARS_STATUS_MAX = SZ_4K,
ND_MAX_MAPPINGS = 32,
 
-- 
1.7.11.3



Re: [PATCH] scripts: recordmcount: fix incorrect use of sprintf

2016-01-06 Thread Kees Cook
On Wed, Jan 6, 2016 at 3:28 AM, Fengguang Wu  wrote:
> Hi Steven,
>
> On Mon, Jan 04, 2016 at 10:42:46AM -0500, Steven Rostedt wrote:
>> On Wed, 30 Dec 2015 23:06:41 +
>> Colin King  wrote:
>>
>> > From: Colin Ian King 
>> >
>> > Fix build warning:
>> >
>> > scripts/recordmcount.c:589:4: warning: format not a string
>> > literal and no format arguments [-Wformat-security]
>> > sprintf("%s: failed\n", file);
>> >
>> > Fixes: a50bd43935586 ("ftrace/scripts: Have recordmcount copy the object file")
>> > Signed-off-by: Colin Ian King 
>> > ---
>> >  scripts/recordmcount.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
>> > index 301d70b..e1675927 100644
>> > --- a/scripts/recordmcount.c
>> > +++ b/scripts/recordmcount.c
>> > @@ -586,7 +586,7 @@ main(int argc, char *argv[])
>> > do_file(file);
>> > break;
>> > case SJ_FAIL:/* error in do_file or below */
>> > -   sprintf("%s: failed\n", file);
>> > +   fprintf(stderr, "%s: failed\n", file);
>>
>> Paper bag bug. I'm not sure how this passed my tests? My tests check
>> for warnings. And I even got a "BUILD SUCCESS" from Fengguang Wu's
>> kbuild test robot.
>
> Because the error will only show up on "gcc -Wformat-security".

I wish GCC were smarter about this. I would have hoped at least ONE of
the various -Wformat... options would warn about "you did not actually
specify a buffer for sprintf". This warning comes from
-Wformat-security (a subset of -Wformat-nonliteral), which can't be
turned on by default, as mentioned by Fengguang.
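
For reference, a small sketch of the two correct shapes for this kind of
message. The helper names are made up; the format string is the one from
the patch:

```c
#include <stdio.h>

/*
 * Hypothetical helper: format the failure message into a caller-supplied
 * buffer. snprintf() takes the destination and its size explicitly.
 * Returns the number of characters the full message needs.
 */
static inline int format_failure(char *buf, size_t len, const char *file)
{
	return snprintf(buf, len, "%s: failed\n", file);
}

/* Report the failure on a stream, which is what the fix switches to. */
static inline int report_failure(FILE *out, const char *file)
{
	return fprintf(out, "%s: failed\n", file);
}
```

The buggy call passed the format string where sprintf() expects its
destination buffer, so the string literal itself was the write target.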

> It tends to raise false positives, so Kees set up a dedicated tree
> which enables -Wformat-security as well as quieting the common false
> positives. In that way Kees can catch such problems from time to time,
> however the limitation is, only upstreamed code can be tested in Kees'
> tree.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security


Re: [PATCH v4 55/58] mtd: nand: add helpers to access ->priv

2016-01-06 Thread Brian Norris
On Sat, Dec 19, 2015 at 04:01:24AM +0100, Boris Brezillon wrote:
> Actually the nand_{get,set}_controller_data() helpers are not about
> assigning NAND controller private data (as you pointed those can
> already be retrieved thanks to the ->controller field using the
> container_of() trick), but per-chip private data instantiated by the
> NAND controller and attached to a specific chip. For example, some
> controllers pre-compute some register values or a clk rate to set when
> a specific chip is selected. This is what per-chip controller data is
> meant for.

Sure. Really, it's just anything the controller driver needs to store on
a per-chip basis. All I'm suggesting is picking a name that doesn't
imply it's a per-controller instance, when it's actually a
per-flash-chip instance.
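
To make the naming question concrete, a minimal sketch of the kind of
accessor pair under discussion. All names here (demo_nand_chip,
demo_nand_{get,set}_drvdata, demo_chip_timing) are hypothetical and not
taken from the series:

```c
#include <stddef.h>

/* Hypothetical, heavily trimmed stand-in for struct nand_chip. */
struct demo_nand_chip {
	void *priv;	/* per-chip data owned by the controller driver */
};

static inline void demo_nand_set_drvdata(struct demo_nand_chip *chip,
					 void *data)
{
	chip->priv = data;
}

static inline void *demo_nand_get_drvdata(const struct demo_nand_chip *chip)
{
	return chip->priv;
}

/* Example of what a controller driver might cache for each chip. */
struct demo_chip_timing {
	unsigned long clk_rate;		/* clock rate to program on select */
	unsigned int timing_reg;	/* pre-computed register value */
};
```

Whatever the helpers end up being called, the stored pointer is one per
flash chip, not one per controller.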

> Now, the reason I explicitly specified the data usage instead of using
> a generic name like nand_{get,set}_data() is because I plan to define

I never suggested just "_data"; I said "_drvdata".

> other helpers to allow NAND manufacturer code to manipulate its own
> private data. This is required if we want to support read-retry on some
> chips who are requiring a read OTP area step to retrieve some register
> values which will later be used to change from one read-retry mode to
> another.
> The plan was to define the nand_{set,get}_manufacturer_data() helpers,
> and create or reuse an existing priv field (mtd->priv?) to store this
> private data.

That's interesting. Sounds like an OK idea. (Personally, I wouldn't
try to use mtd->priv for this, but otherwise looks OK.)

> Also note that the spi framework provides the same kind of helpers [1].

Hmm, OK. FWIW, they have both "driver data" and "controller state". It's
not perfectly clear to me why both exist.

> This being said, I'm perfectly fine changing the function names, but
> I'd like to replace it by something explicitly telling the user that
> this field should only be set by NAND controller drivers. 

Sure. I thought a "driver data"-based name did this. But I'll leave it to
you. I could even be OK with "controller data", if you still think this
fits your overall controller refactoring plan, and communicates its
purpose best.

> [1]http://lxr.free-electrons.com/source/include/linux/spi/spi.h#L189

Regards,
Brian


Re: [PATCHv1 0/6] rdma controller support

2016-01-06 Thread Parav Pandit
Hi Tejun,

On Wed, Jan 6, 2016 at 3:26 AM, Tejun Heo  wrote:
> Hello,
>
> On Wed, Jan 06, 2016 at 12:28:00AM +0530, Parav Pandit wrote:
>> Resources are not defined by the RDMA cgroup. Resources are defined
>> by RDMA/IB stack & optionally by HCA vendor device drivers.
>
> As I wrote before, I don't think this is a good idea.  Drivers will
> inevitably add non-sensical "resources" which don't make any sense
> without much scrutiny.

In our last discussion on the v0 patch,
http://lkml.iu.edu/hypermail/linux/kernel/1509.1/04331.html
the direction was that vendors should be able to define their own resources.

> If different controllers can't agree upon the
> same set of resources, which probably is a pretty good sign that this
> isn't too well thought out to begin with,

When you said "different controllers", did you mean "different hw vendors"?
Or did you mean rdma, mem and cpu as controllers here?

> at least make all resource
> types defined by the controller itself and let the controllers enable
> them selectively.
>
In this v1 patch, resources are defined by the IB stack, and the rdma
cgroup is a facilitator for them.
This way, IB stack modules can define new resources without
making changes to the cgroup code.
This design also allows hw vendors to define their own resources, which
will be reviewed on the rdma mailing list anyway.
The idea is that different hw versions can have different resource support,
so the intention is not so much to define different resources as to
enable them.
But yes, I agree that this lets different hw controller
vendors define different hw resources.


> Thanks.
>
> --
> tejun


Re: [PATCH v3 3/5] perf tools: Fix dynamic sort keys to sort properly

2016-01-06 Thread Namhyung Kim
On Wed, Jan 06, 2016 at 08:06:43PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Jan 06, 2016 at 09:54:59AM +0900, Namhyung Kim escreveu:
> > Currently, the dynamic sort keys compares trace data using memcmp().
> > But for output sorting, it should check data size and compare by word.
> > Also it sorted strings in reverse order, fix it.
> 
> Can this be broken down in two patches? This is complex code, lets try
> to make it as bisectable as possible.

OK, I'll break out the string part then.  But I think it doesn't help
much to reduce the complexity.
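
As a side note on the byte-wise versus word-wise distinction in the quoted
commit message: on a little-endian machine memcmp() looks at the low byte
first, so it does not order integers numerically. A minimal sketch of a
size-aware comparator follows; demo_cmp_field() is a hypothetical name,
and treating the values as signed is an assumption:

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical comparator: order two raw fields numerically when their
 * size is a common integer width, falling back to memcmp() otherwise.
 * Returns <0, 0 or >0 in ascending order; swap the arguments to get a
 * descending sort.
 */
static inline int demo_cmp_field(const void *a, const void *b, size_t size)
{
	int64_t x, y;
	int32_t a32, b32;
	int16_t a16, b16;

	switch (size) {
	case 8:
		memcpy(&x, a, sizeof(x));
		memcpy(&y, b, sizeof(y));
		break;
	case 4:
		memcpy(&a32, a, sizeof(a32));
		memcpy(&b32, b, sizeof(b32));
		x = a32;
		y = b32;
		break;
	case 2:
		memcpy(&a16, a, sizeof(a16));
		memcpy(&b16, b, sizeof(b16));
		x = a16;
		y = b16;
		break;
	default:
		return memcmp(a, b, size);
	}

	/* Explicit comparison instead of subtraction, so nothing overflows. */
	return (x > y) - (x < y);
}
```

The memcpy() loads are there only to keep the sketch safe for unaligned
offsets into a raw buffer.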

Thanks,
Namhyung


> 
> - Arnaldo
> 
> > 
> > Before)
> > 
> >   $ perf report -F overhead -s prev_pid,next_pid
> >   ...
> >   # Overhead    prev_pid    next_pid
> >   # ........  ..........  ..........
> >   #
> >        0.39%         490           0
> >        9.12%         225           0
> >        0.04%         224           0
> >        0.51%         731         189
> >        0.08%         731           3
> >        0.12%         731           0
> >        4.82%         729           0
> >        0.08%        1229           0
> >        0.20%         715           0
> >        4.78%         189         225
> >   ...
> > 
> > After)
> > 
> >   $ perf report -F overhead -s prev_pid,next_pid
> >   ...
> >   # Overhead    prev_pid    next_pid
> >   # ........  ..........  ..........
> >   #
> >        0.43%           0           7
> >        0.04%           0          11
> >        0.04%           0          12
> >        0.08%           0          14
> >        0.04%           0          17
> >        0.08%           0          19
> >        0.04%           0          22
> >        0.04%           0          27
> >        0.04%           0          37
> >        0.04%           0          42
> >   ...
> > 
> > Reported-by: Arnaldo Carvalho de Melo 
> > Signed-off-by: Namhyung Kim 
> > ---
> >  tools/perf/util/sort.c | 47 ++-
> >  1 file changed, 46 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > index 9618a64875c0..264d2b630549 100644
> > --- a/tools/perf/util/sort.c
> > +++ b/tools/perf/util/sort.c
> > @@ -1798,6 +1798,51 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt 
> > *fmt,
> > return memcmp(a->raw_data + offset, b->raw_data + offset, size);
> >  }
> >  
> > +static int64_t __sort__hde_sort(struct perf_hpp_fmt *fmt,
> > +   struct hist_entry *a, struct hist_entry *b)
> > +{
> > +   struct hpp_dynamic_entry *hde;
> > +   struct format_field *field;
> > +   unsigned offset, size;
> > +   int64_t *a64, *b64;
> > +   int32_t *a32, *b32;
> > +   int16_t *a16, *b16;
> > +
> > +   hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
> > +
> > +   field = hde->field;
> > +   if (field->flags & FIELD_IS_DYNAMIC) {
> > +   unsigned long long dyn;
> > +
> > +   pevent_read_number_field(field, a->raw_data, &dyn);
> > +   offset = dyn & 0xffff;
> > +   size = (dyn >> 16) & 0xffff;
> > +   } else {
> > +   offset = field->offset;
> > +   size = field->size;
> > +   }
> > +
> > +   if (field->flags & FIELD_IS_STRING)
> > +   return strcmp(b->raw_data + offset, a->raw_data + offset);
> > +
> > +   switch (size) {
> > +   case 8:
> > +   a64 = a->raw_data + offset;
> > +   b64 = b->raw_data + offset;
> > +   return *b64 - *a64;
> > +   case 4:
> > +   a32 = a->raw_data + offset;
> > +   b32 = b->raw_data + offset;
> > +   return *b32 - *a32;
> > +   case 2:
> > +   a16 = a->raw_data + offset;
> > +   b16 = b->raw_data + offset;
> > +   return *b16 - *a16;
> > +   default:
> > +   return memcmp(b->raw_data + offset, a->raw_data + offset, size);
> > +   }
> > +}
> > +
> >  bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *fmt)
> >  {
> > return fmt->cmp == __sort__hde_cmp;
> > @@ -1826,7 +1871,7 @@ __alloc_dynamic_entry(struct perf_evsel *evsel, 
> > struct format_field *field)
> >  
> > hde->hpp.cmp = __sort__hde_cmp;
> > hde->hpp.collapse = __sort__hde_cmp;
> > -   hde->hpp.sort = __sort__hde_cmp;
> > +   hde->hpp.sort = __sort__hde_sort;
> >  
> > INIT_LIST_HEAD(&hde->hpp.list);
> > INIT_LIST_HEAD(&hde->hpp.sort_list);
> > -- 
> > 2.6.4


Re: [PATCH v3 3/5] perf tools: Fix dynamic sort keys to sort properly

2016-01-06 Thread Arnaldo Carvalho de Melo
Em Thu, Jan 07, 2016 at 08:26:45AM +0900, Namhyung Kim escreveu:
> On Wed, Jan 06, 2016 at 08:06:43PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Jan 06, 2016 at 09:54:59AM +0900, Namhyung Kim escreveu:
> > > Currently, the dynamic sort keys compares trace data using memcmp().
> > > But for output sorting, it should check data size and compare by word.
> > > Also it sorted strings in reverse order, fix it.
> > 
> > Can this be broken down in two patches? This is complex code, lets try
> > to make it as bisectable as possible.
> 
> OK, I'll break out the string part then.  But I think it doesn't help
> much to reduce the complexity.

Well, the number of patches is not a problem. Every time I see an "Also
let's do this other thing" I cringe, it is automatic, sorry :-\

For reviewing it's so much better to see things nicely separated, and
sometimes I like one part but not the other, so I pick one and continue
discussion on the other, etc.

- Arnaldo
 
> Thanks,
> Namhyung
> 
> 
> > 
> > - Arnaldo
> > 
> > > 
> > > Before)
> > > 
> > >   $ perf report -F overhead -s prev_pid,next_pid
> > >   ...
> > >   # Overhead    prev_pid    next_pid
> > >   # ........  ..........  ..........
> > >   #
> > >        0.39%         490           0
> > >        9.12%         225           0
> > >        0.04%         224           0
> > >        0.51%         731         189
> > >        0.08%         731           3
> > >        0.12%         731           0
> > >        4.82%         729           0
> > >        0.08%        1229           0
> > >        0.20%         715           0
> > >        4.78%         189         225
> > >   ...
> > > 
> > > After)
> > > 
> > >   $ perf report -F overhead -s prev_pid,next_pid
> > >   ...
> > >   # Overhead    prev_pid    next_pid
> > >   # ........  ..........  ..........
> > >   #
> > >        0.43%           0           7
> > >        0.04%           0          11
> > >        0.04%           0          12
> > >        0.08%           0          14
> > >        0.04%           0          17
> > >        0.08%           0          19
> > >        0.04%           0          22
> > >        0.04%           0          27
> > >        0.04%           0          37
> > >        0.04%           0          42
> > >   ...
> > > 
> > > Reported-by: Arnaldo Carvalho de Melo 
> > > Signed-off-by: Namhyung Kim 
> > > ---
> > >  tools/perf/util/sort.c | 47 
> > > ++-
> > >  1 file changed, 46 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > > index 9618a64875c0..264d2b630549 100644
> > > --- a/tools/perf/util/sort.c
> > > +++ b/tools/perf/util/sort.c
> > > @@ -1798,6 +1798,51 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt 
> > > *fmt,
> > >   return memcmp(a->raw_data + offset, b->raw_data + offset, size);
> > >  }
> > >  
> > > +static int64_t __sort__hde_sort(struct perf_hpp_fmt *fmt,
> > > + struct hist_entry *a, struct hist_entry *b)
> > > +{
> > > + struct hpp_dynamic_entry *hde;
> > > + struct format_field *field;
> > > + unsigned offset, size;
> > > + int64_t *a64, *b64;
> > > + int32_t *a32, *b32;
> > > + int16_t *a16, *b16;
> > > +
> > > + hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
> > > +
> > > + field = hde->field;
> > > + if (field->flags & FIELD_IS_DYNAMIC) {
> > > + unsigned long long dyn;
> > > +
> > > + pevent_read_number_field(field, a->raw_data, &dyn);
> > > + offset = dyn & 0xffff;
> > > + size = (dyn >> 16) & 0xffff;
> > > + } else {
> > > + offset = field->offset;
> > > + size = field->size;
> > > + }
> > > +
> > > + if (field->flags & FIELD_IS_STRING)
> > > + return strcmp(b->raw_data + offset, a->raw_data + offset);
> > > +
> > > + switch (size) {
> > > + case 8:
> > > + a64 = a->raw_data + offset;
> > > + b64 = b->raw_data + offset;
> > > + return *b64 - *a64;
> > > + case 4:
> > > + a32 = a->raw_data + offset;
> > > + b32 = b->raw_data + offset;
> > > + return *b32 - *a32;
> > > + case 2:
> > > + a16 = a->raw_data + offset;
> > > + b16 = b->raw_data + offset;
> > > + return *b16 - *a16;
> > > + default:
> > > + return memcmp(b->raw_data + offset, a->raw_data + offset, size);
> > > + }
> > > +}
> > > +
> > >  bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *fmt)
> > >  {
> > >   return fmt->cmp == __sort__hde_cmp;
> > > @@ -1826,7 +1871,7 @@ __alloc_dynamic_entry(struct perf_evsel *evsel, 
> > > struct format_field *field)
> > >  
> > >   hde->hpp.cmp = __sort__hde_cmp;
> > >   hde->hpp.collapse = __sort__hde_cmp;
> > > - hde->hpp.sort = __sort__hde_cmp;
> > > + hde->hpp.sort = __sort__hde_sort;
> > >  
> > >   INIT_LIST_HEAD(&hde->hpp.list);
> > >   INIT_LIST_HEAD(&hde->hpp.sort_list);
> > > -- 
> > > 2.6.4

Re: [PATCHv1 3/6] rdmacg: implements rdma cgroup

2016-01-06 Thread Parav Pandit
On Wed, Jan 6, 2016 at 3:31 AM, Tejun Heo  wrote:
> Hello,
>
> On Wed, Jan 06, 2016 at 12:28:03AM +0530, Parav Pandit wrote:
>> +/* hash table to keep map of tgid to owner cgroup */
>> +DEFINE_HASHTABLE(pid_cg_map_tbl, 7);
>> +DEFINE_SPINLOCK(pid_cg_map_lock);/* lock to protect hash table access */
>> +
>> +/* Keeps mapping of pid to its owning cgroup at rdma level,
>> + * This mapping doesn't change, even if process migrates from one to other
>> + * rdma cgroup.
>> + */
>> +struct pid_cg_map {
>> + struct pid *pid;/* hash key */
>> + struct rdma_cgroup *cg;
>> +
>> + struct hlist_node hlist;/* pid to cgroup hash table link */
>> + atomic_t refcnt;/* count active user tasks to figure 
>> out
>> +  * when to free the memory
>> +  */
>> +};
>
> Ugh, there's something clearly wrong here.  Why does the rdma
> controller need to keep track of pid cgroup membership?
>
An rdma resource can be allocated by a parent process, then used and freed
by a child process.
The child process could belong to a different rdma cgroup.
The parent process might have been terminated after creation of the rdma
cgroup (and the cgroup might then have been deleted too).
It's discussed in https://lkml.org/lkml/2015/11/2/307

In a nutshell, there is a process that clearly owns the rdma resource.
So to keep the design simple, the rdma resource is owned by the creator
process and cgroup, without modifying the task_struct.

>> +static void _dealloc_cg_rpool(struct rdma_cgroup *cg,
>> +   struct cg_resource_pool *rpool)
>> +{
>> + spin_lock(&cg->cg_list_lock);
>> +
>> + /* if its started getting used by other task,
>> +  * before we take the spin lock, then skip,
>> +  * freeing it.
>> +  */
>
> Please follow CodingStyle.
>
>> + if (atomic_read(&rpool->refcnt) == 0) {
>> + list_del_init(&rpool->cg_list);
>> + spin_unlock(&cg->cg_list_lock);
>> +
>> + _free_cg_rpool(rpool);
>> + return;
>> + }
>> + spin_unlock(&cg->cg_list_lock);
>> +}
>> +
>> +static void dealloc_cg_rpool(struct rdma_cgroup *cg,
>> +  struct cg_resource_pool *rpool)
>> +{
>> + /* Don't free the resource pool which is created by the
>> +  * user, otherwise we miss the configured limits. We don't
>> +  * gain much either by splitting storage of limit and usage.
>> +  * So keep it around until user deletes the limits.
>> +  */
>> + if (atomic_read(&rpool->creator) == RDMACG_RPOOL_CREATOR_DEFAULT)
>> + _dealloc_cg_rpool(cg, rpool);
>
> I'm pretty sure you can get away with a fixed length array of
> counters.  Please keep it simple.  It's a simple hard limit enforcer.
> There's no need to create a massive dynamic infrastructure.
>
Every resource pool for verbs resources is a fixed-length array. The length
of the array is defined by the IB stack modules.
This array is per cgroup, per device.
It's per device because we agreed that we want to address the requirement
of controlling/configuring them on a per-device basis.
Devices appear and disappear, so the pools are allocated dynamically.
Otherwise this array could be static in the cgroup structure.
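
To show what I mean by a fixed-length array per pool, here is a rough
sketch. The demo_* names and the three resource types are hypothetical
(in the patch the list comes from the IB stack), and locking is left out
on purpose:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical resource types; the real list would come from the IB stack. */
enum demo_rdma_res {
	DEMO_RES_QP,
	DEMO_RES_CQ,
	DEMO_RES_MR,
	DEMO_RES_MAX
};

/* One pool per (cgroup, device) pair: fixed-length usage and limit arrays. */
struct demo_rpool {
	uint32_t usage[DEMO_RES_MAX];
	uint32_t limit[DEMO_RES_MAX];
};

/* Charge one unit if the hard limit allows it. */
static inline bool demo_try_charge(struct demo_rpool *pool,
				   enum demo_rdma_res res)
{
	if (pool->usage[res] >= pool->limit[res])
		return false;
	pool->usage[res]++;
	return true;
}

/* Give one unit back; never drops below zero. */
static inline void demo_uncharge(struct demo_rpool *pool,
				 enum demo_rdma_res res)
{
	if (pool->usage[res] > 0)
		pool->usage[res]--;
}
```

Only the pool itself is allocated dynamically, when a device shows up;
the counters inside it are plain arrays.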



> Thanks.
>
> --
> tejun


Re: [PATCH v5 0/6] nvdimm: Add an IOCTL pass thru for DSM calls

2016-01-06 Thread Dan Williams
On Wed, Jan 6, 2016 at 3:03 PM, Jerry Hoemann  wrote:
> The NVDIMM code in the kernel supports an IOCTL interface to user
> space based upon the Intel Example DSM:
>
> http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf
>
> This interface cannot be used by other NVDIMM DSMs that support
> incompatible functions.
>
> This patch set adds a generic "passthru" IOCTL interface which
> is not tied to a particular DSM.
>
> A new _IOC_NR ND_CMD_CALL_DSM == "10" is added for the pass thru call.
>
> The new data structure nd_cmd_dsmcall_pkg serves as a wrapper for
> the passthru calls.  This wrapper supplies the data that the kernel
> needs to make the _DSM call.
>
> Unlike the definitions of the _DSM functions themselves, the 
> nd_cmd_dsmcall_pkg
> provides the calling information (input/output sizes) in an uniform
> manner making the kernel marshaling of the arguments straight
> forward.
>
> This shifts the marshaling burden from the kernel to the user
> space application while still permitting the kernel to internally
> call _DSM functions.
>
> The kernel functions __nd_ioctl and acpi_nfit_ctl were modified
> to accommodate ND_CMD_CALL_DSM.
>
>
> Changes in version 5:
> -
> 0. Fixed submit comment for drivers/acpi/utils.c.
>
>
> Changes in version 4:
> -
> 0. Added patch to correct parameter type passed to acpi_evaluate_dsm
>    ACPI defines arguments rev and fun as 64 bit quantities and the ioctl
>exports to user face rev and func. We want those to match the ACPI spec.
>
>Also modified acpi_evaluate_dsm_typed and acpi_check dsm which had
>similar issue.
>
> 1. nd_cmd_dsmcall_pkg rearrange a reserve and rounded up total size
>to 16 byte boundary.
>
> 2. Created stand alone patch for the pre-existing security issue related
>to "read only" IOCTL calls.
>
> 3. Added patch for increasing envelope size of IOCTL.  Needed to
>be able to read in the wrapper to know remaining size to copy in.
>
>Note: in_env, out_env are statics sized based upon this change.
>
> 4. Moved copyin code to table driven nd_cmd_desc
>
>   Note, the last 40 lines or so of acpi_nfit_ctl will not return _DSM
>   data unless the size allocated in user space buffer equals
>   out_obj->buffer.length.
>
>   The semantic we want in the pass thru case is to return as much
>   of the _DSM data as the user space buffer would accommodate.
>
>   Hence, in acpi_nfit_ctl I have retained the line:
>
> memcpy(pkg->dsm_buf + pkg->h.dsm_in,
> out_obj->buffer.pointer,
> min(pkg->h.dsm_size, pkg->h.dsm_out));
>
>   and the early return from the function.
>
>
>
>
> Changes in version 3:
> -
> 1. Changed name ND_CMD_PASSTHRU to ND_CMD_CALL_DSM.
>
> 2. Value of ND_CMD_CALL_DSM is 10, not 100.
>
> 3. Changed name of nd_passthru_pkg to nd_cmd_dsmcall_pkg.
>
> 4. Removed separate functions for handling ND_CMD_CALL_DSM.
>Moved functionality to __nd_ioctl and acpi_nfit_ctl proper.
>The resultant code looks very different from prior versions.
>
> 5. BUGFIX: __nd_ioctl: Change the if read_only switch to use
>  _IOC_NR cmd (not ioctl_cmd) for better protection.
>
> Do we want to make a stand alone patch for this issue?
>
>
> Changes in version 2:
> -
> 1. Cleanup access mode check in nd_ioctl and nvdimm_ioctl.
> 2. Change name of ndn_pkg to nd_passthru_pkg
> 3. Adjust sizes in nd_passthru_pkg. DSM integers are 64 bit.
> 4. No new ioctl type, instead tunnel into the existing number space.
> 5. Push down one function level where determine ioctl cmd type.
> 6. re-work diagnostic print/dump message in pass-thru functions.
>
>
>
>
> Jerry Hoemann (6):
>   ACPI / util: Fix acpi_evaluate_dsm() argument type
>   nvdimm: Clean-up access mode check.
>   nvdimm: Add wrapper for IOCTL pass thru
>   nvdimm: Fix security issue with DSM IOCTL.
>   nvdimm: Increase max envelope size for IOCTL
>   nvdimm: Add IOCTL pass thru functions

These look good to me.

I'll tag "nvdimm: Fix security issue with DSM IOCTL." for -stable.

Thanks Jerry!


[PATCH 20/31] x86, pkeys: differentiate instruction fetches

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

As discussed earlier, we attempt to enforce protection keys in
software.

However, the code checks all faults to ensure that they are not
violating protection key permissions.  It was assumed that all
faults are either write faults where we check PKRU[key].WD (write
disable) or read faults where we check the AD (access disable)
bit.

But, there is a third category of faults for protection keys:
instruction faults.  Instruction faults never run afoul of
protection keys because they do not affect instruction fetches.

So, plumb the PF_INSTR bit down into the
arch_vma_access_permitted() function where we do the protection
key checks.

We also add a new FAULT_FLAG_INSTRUCTION.  This is because
handle_mm_fault() is not passed the architecture-specific
error_code where we keep PF_INSTR, so we need to encode the
instruction fetch information into the arch-generic fault
flags.
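
As a rough illustration of the resulting rule, here is a stand-alone
sketch of the check. The demo_* names are hypothetical and this is not
the kernel function; the only hardware fact relied on is that PKRU holds
an access-disable and a write-disable bit for each key:

```c
#include <stdbool.h>
#include <stdint.h>

/* PKRU holds two bits per key: access-disable (AD) and write-disable (WD). */
#define DEMO_PKRU_AD(key)	(1u << (2 * (key)))
#define DEMO_PKRU_WD(key)	(1u << (2 * (key) + 1))

/*
 * Hypothetical model of the software check for keys 0..15.
 * Returns true if the access should be allowed as far as pkeys go.
 */
static inline bool demo_pkey_access_permitted(uint32_t pkru, unsigned int key,
					      bool write, bool execute,
					      bool foreign)
{
	/* pkeys never affect instruction fetches */
	if (execute)
		return true;
	/* accesses on behalf of another process are not checked */
	if (foreign)
		return true;
	/* AD blocks every data access, reads and writes alike */
	if (pkru & DEMO_PKRU_AD(key))
		return false;
	/* WD blocks only writes */
	if (write && (pkru & DEMO_PKRU_WD(key)))
		return false;
	return true;
}
```

The execute test comes first so that a key with AD set still permits
instruction fetches from the same page.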

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 

---

 b/arch/powerpc/include/asm/mmu_context.h |2 +-
 b/arch/s390/include/asm/mmu_context.h|2 +-
 b/arch/x86/include/asm/mmu_context.h |5 -
 b/arch/x86/mm/fault.c|8 ++--
 b/include/asm-generic/mm_hooks.h |2 +-
 b/include/linux/mm.h |1 +
 b/mm/gup.c   |   11 +--
 b/mm/memory.c|1 +
 8 files changed, 24 insertions(+), 8 deletions(-)

diff -puN 
arch/powerpc/include/asm/mmu_context.h~pkeys-16-allow-execute-on-unreadable 
arch/powerpc/include/asm/mmu_context.h
--- 
a/arch/powerpc/include/asm/mmu_context.h~pkeys-16-allow-execute-on-unreadable   
2016-01-06 15:50:11.677429524 -0800
+++ b/arch/powerpc/include/asm/mmu_context.h2016-01-06 15:50:11.692430200 
-0800
@@ -149,7 +149,7 @@ static inline void arch_bprm_mm_init(str
 }
 
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
-   bool write, bool foreign)
+   bool write, bool execute, bool foreign)
 {
/* by default, allow everything */
return true;
diff -puN 
arch/s390/include/asm/mmu_context.h~pkeys-16-allow-execute-on-unreadable 
arch/s390/include/asm/mmu_context.h
--- a/arch/s390/include/asm/mmu_context.h~pkeys-16-allow-execute-on-unreadable  
2016-01-06 15:50:11.679429614 -0800
+++ b/arch/s390/include/asm/mmu_context.h   2016-01-06 15:50:11.692430200 
-0800
@@ -131,7 +131,7 @@ static inline void arch_bprm_mm_init(str
 }
 
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
-   bool write, bool foreign)
+   bool write, bool execute, bool foreign)
 {
/* by default, allow everything */
return true;
diff -puN 
arch/x86/include/asm/mmu_context.h~pkeys-16-allow-execute-on-unreadable 
arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~pkeys-16-allow-execute-on-unreadable   
2016-01-06 15:50:11.681429704 -0800
+++ b/arch/x86/include/asm/mmu_context.h2016-01-06 15:50:11.693430245 
-0800
@@ -291,8 +291,11 @@ static inline bool vma_is_foreign(struct
 }
 
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
-   bool write, bool foreign)
+   bool write, bool execute, bool foreign)
 {
+   /* pkeys never affect instruction fetches */
+   if (execute)
+   return true;
/* allow access if the VMA is not one from this process */
if (foreign || vma_is_foreign(vma))
return true;
diff -puN arch/x86/mm/fault.c~pkeys-16-allow-execute-on-unreadable 
arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-16-allow-execute-on-unreadable  2016-01-06 
15:50:11.682429749 -0800
+++ b/arch/x86/mm/fault.c   2016-01-06 15:50:11.693430245 -0800
@@ -908,7 +908,8 @@ static inline bool bad_area_access_from_
if (error_code & PF_PK)
return true;
/* this checks permission keys on the VMA: */
-   if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE), foreign))
+   if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
+   (error_code & PF_INSTR), foreign))
return true;
return false;
 }
@@ -1112,7 +1113,8 @@ access_error(unsigned long error_code, s
 * faults just to hit a PF_PK as soon as we fill in a
 * page.
 */
-   if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE), foreign))
+   if (!arch_vma_access_permitted(vma, (error_code & PF_WRITE),
+   (error_code & PF_INSTR), foreign))
return 1;
 
if (error_code & PF_WRITE) {
@@ -1267,6 +1269,8 @@ __do_page_fault(struct pt_regs *regs, un
 
if (error_code & PF_WRITE)
flags |= FAULT_FLAG_WRITE;
+   if (error_code & PF_INSTR)
+   flags |= FAULT_FLAG_INSTRUCTION;
 
/*
 * 

[PATCH 15/31] mm: factor out VMA fault permission checking

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

This code matches a fault condition up with the VMA and ensures
that the VMA allows the fault to be handled instead of just
erroring out.

We will be extending this in a moment to comprehend protection
keys.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/mm/gup.c |   16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff -puN mm/gup.c~pkeys-10-pte-fault mm/gup.c
--- a/mm/gup.c~pkeys-10-pte-fault   2016-01-06 15:50:09.144315322 -0800
+++ b/mm/gup.c  2016-01-06 15:50:09.148315502 -0800
@@ -557,6 +557,18 @@ next_page:
 }
 EXPORT_SYMBOL(__get_user_pages);
 
+bool vma_permits_fault(struct vm_area_struct *vma, unsigned int fault_flags)
+{
+   vm_flags_t vm_flags;
+
+   vm_flags = (fault_flags & FAULT_FLAG_WRITE) ? VM_WRITE : VM_READ;
+
+   if (!(vm_flags & vma->vm_flags))
+   return false;
+
+   return true;
+}
+
 /*
  * fixup_user_fault() - manually resolve a user page fault
  * @tsk:   the task_struct to use for page fault accounting, or
@@ -588,15 +600,13 @@ int fixup_user_fault(struct task_struct
 unsigned long address, unsigned int fault_flags)
 {
struct vm_area_struct *vma;
-   vm_flags_t vm_flags;
int ret;
 
vma = find_extend_vma(mm, address);
if (!vma || address < vma->vm_start)
return -EFAULT;
 
-   vm_flags = (fault_flags & FAULT_FLAG_WRITE) ? VM_WRITE : VM_READ;
-   if (!(vm_flags & vma->vm_flags))
+   if (!vma_permits_fault(vma, fault_flags))
return -EFAULT;
 
ret = handle_mm_fault(mm, vma, address, fault_flags);
_
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
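
For readers following the series, here is a minimal userspace sketch of the
check that the patch above factors out into vma_permits_fault(). The struct
and the flag values are stand-ins chosen for this example (assumptions, not
copied from kernel headers); only the logic mirrors the hunk: a write fault
needs VM_WRITE on the VMA, anything else needs VM_READ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag values for this sketch only. */
#define VM_READ          0x1u
#define VM_WRITE         0x2u
#define FAULT_FLAG_WRITE 0x1u

/* Hypothetical, trimmed-down stand-in for struct vm_area_struct. */
struct vma_sketch {
	unsigned int vm_flags;
};

/* A write fault requires VM_WRITE; every other fault requires VM_READ. */
bool vma_permits_fault_sketch(const struct vma_sketch *vma,
			      unsigned int fault_flags)
{
	unsigned int needed;

	assert(vma != NULL);
	needed = (fault_flags & FAULT_FLAG_WRITE) ? VM_WRITE : VM_READ;
	return (needed & vma->vm_flags) != 0;
}
```

Keeping the test in one helper is what lets a later patch add a protection
key check in a single place instead of at every caller.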


[PATCH 16/31] x86, mm: simplify get_user_pages() PTE bit handling

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

The current get_user_pages() code is a wee bit more complicated
than it needs to be for pte bit checking.  Currently, it establishes
a mask of required pte _PAGE_* bits and ensures that the pte it
goes after has all those bits.

This consolidates the three identical copies of this code.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/mm/gup.c |   39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff -puN arch/x86/mm/gup.c~pkeys-12-gup-swizzle arch/x86/mm/gup.c
--- a/arch/x86/mm/gup.c~pkeys-12-gup-swizzle    2016-01-06 15:50:09.552333717 -0800
+++ b/arch/x86/mm/gup.c 2016-01-06 15:50:09.555333852 -0800
@@ -64,6 +64,24 @@ retry:
 }
 
 /*
+ * 'pteval' can come from a pte, pmd or pud.  We only check
+ * _PAGE_PRESENT, _PAGE_USER, and _PAGE_RW in here which are the
+ * same value on all 3 types.
+ */
+static inline int pte_allows_gup(unsigned long pteval, int write)
+{
+   unsigned long need_pte_bits = _PAGE_PRESENT|_PAGE_USER;
+
+   if (write)
+   need_pte_bits |= _PAGE_RW;
+
+   if ((pteval & need_pte_bits) != need_pte_bits)
+   return 0;
+
+   return 1;
+}
+
+/*
  * The performance critical leaf functions are made noinline otherwise gcc
  * inlines everything into a single function which results in too much
  * register pressure.
@@ -71,13 +89,8 @@ retry:
 static noinline int gup_pte_range(pmd_t pmd, unsigned long addr,
unsigned long end, int write, struct page **pages, int *nr)
 {
-   unsigned long mask;
pte_t *ptep;
 
-   mask = _PAGE_PRESENT|_PAGE_USER;
-   if (write)
-   mask |= _PAGE_RW;
-
ptep = pte_offset_map(&pmd, addr);
do {
pte_t pte = gup_get_pte(ptep);
@@ -88,8 +101,8 @@ static noinline int gup_pte_range(pmd_t
pte_unmap(ptep);
return 0;
}
-
-   if ((pte_flags(pte) & (mask | _PAGE_SPECIAL)) != mask) {
+   if (!pte_allows_gup(pte_val(pte), write) ||
+   pte_special(pte)) {
pte_unmap(ptep);
return 0;
}
@@ -117,14 +130,10 @@ static inline void get_head_page_multipl
 static noinline int gup_huge_pmd(pmd_t pmd, unsigned long addr,
unsigned long end, int write, struct page **pages, int *nr)
 {
-   unsigned long mask;
struct page *head, *page;
int refs;
 
-   mask = _PAGE_PRESENT|_PAGE_USER;
-   if (write)
-   mask |= _PAGE_RW;
-   if ((pmd_flags(pmd) & mask) != mask)
+   if (!pte_allows_gup(pmd_val(pmd), write))
return 0;
/* hugepages are never "special" */
VM_BUG_ON(pmd_flags(pmd) & _PAGE_SPECIAL);
@@ -193,14 +202,10 @@ static int gup_pmd_range(pud_t pud, unsi
 static noinline int gup_huge_pud(pud_t pud, unsigned long addr,
unsigned long end, int write, struct page **pages, int *nr)
 {
-   unsigned long mask;
struct page *head, *page;
int refs;
 
-   mask = _PAGE_PRESENT|_PAGE_USER;
-   if (write)
-   mask |= _PAGE_RW;
-   if ((pud_flags(pud) & mask) != mask)
+   if (!pte_allows_gup(pud_val(pud), write))
return 0;
/* hugepages are never "special" */
VM_BUG_ON(pud_flags(pud) & _PAGE_SPECIAL);
_
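
As an illustration of the consolidated check, the following standalone sketch
reproduces the mask logic of pte_allows_gup() in userspace. The bit positions
(present = bit 0, writable = bit 1, user = bit 2) are the usual x86 layout and
are stated here as an assumption of the sketch; the function and macro names
are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86 page-table flag bits, shared by pte, pmd and pud entries. */
#define SK_PAGE_PRESENT (UINT64_C(1) << 0)
#define SK_PAGE_RW      (UINT64_C(1) << 1)
#define SK_PAGE_USER    (UINT64_C(1) << 2)

/* Returns 1 when the entry has every bit that the access requires. */
int pte_allows_gup_sketch(uint64_t pteval, int write)
{
	uint64_t need = SK_PAGE_PRESENT | SK_PAGE_USER;

	if (write)
		need |= SK_PAGE_RW;

	return (pteval & need) == need;
}

/* Small usage example: a read-only user page allows reads, not writes. */
void pte_allows_gup_example(void)
{
	uint64_t ro_user = SK_PAGE_PRESENT | SK_PAGE_USER;

	assert(pte_allows_gup_sketch(ro_user, 0));
	assert(!pte_allows_gup_sketch(ro_user, 1));
}
```

Because the helper takes the raw entry value rather than a typed pte_t, the
same function can serve the pte, huge pmd and huge pud paths.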


[PATCH 14/31] x86, pkeys: add functions to fetch PKRU

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

This adds the raw instruction to access PKRU as well as some
accessor functions that correctly handle when the CPU does not
support the instruction.  We don't use it here, but we will use
read_pkru() in the next patch.

Signed-off-by: Dave Hansen 
---

 b/arch/x86/include/asm/pgtable.h   |8 
 b/arch/x86/include/asm/special_insns.h |   22 ++
 2 files changed, 30 insertions(+)

diff -puN arch/x86/include/asm/pgtable.h~pkeys-10-kernel-pkru-instructions arch/x86/include/asm/pgtable.h
--- a/arch/x86/include/asm/pgtable.h~pkeys-10-kernel-pkru-instructions  2016-01-06 15:50:08.711295799 -0800
+++ b/arch/x86/include/asm/pgtable.h    2016-01-06 15:50:08.716296025 -0800
@@ -102,6 +102,14 @@ static inline int pte_dirty(pte_t pte)
return pte_flags(pte) & _PAGE_DIRTY;
 }
 
+
+static inline u32 read_pkru(void)
+{
+   if (boot_cpu_has(X86_FEATURE_OSPKE))
+   return __read_pkru();
+   return 0;
+}
+
 static inline int pte_young(pte_t pte)
 {
return pte_flags(pte) & _PAGE_ACCESSED;
diff -puN arch/x86/include/asm/special_insns.h~pkeys-10-kernel-pkru-instructions arch/x86/include/asm/special_insns.h
--- a/arch/x86/include/asm/special_insns.h~pkeys-10-kernel-pkru-instructions    2016-01-06 15:50:08.713295890 -0800
+++ b/arch/x86/include/asm/special_insns.h  2016-01-06 15:50:08.717296070 -0800
@@ -98,6 +98,28 @@ static inline void native_write_cr8(unsi
 }
 #endif
 
+#ifdef CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS
+static inline u32 __read_pkru(void)
+{
+   u32 ecx = 0;
+   u32 edx, pkru;
+
+   /*
+* "rdpkru" instruction.  Places PKRU contents in to EAX,
+* clears EDX and requires that ecx=0.
+*/
+   asm volatile(".byte 0x0f,0x01,0xee\n\t"
+: "=a" (pkru), "=d" (edx)
+: "c" (ecx));
+   return pkru;
+}
+#else
+static inline u32 __read_pkru(void)
+{
+   return 0;
+}
+#endif
+
 static inline void native_wbinvd(void)
 {
asm volatile("wbinvd": : :"memory");
_
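
To show what a caller might do with the value returned by read_pkru(), here is
a small userspace sketch that decodes a PKRU value. It assumes the register
layout described in the Intel SDM: two bits per key, with the low bit of each
pair disabling all data access (AD) and the high bit disabling writes (WD).
The helper names are hypothetical, and the write check is simplified (it
ignores the supervisor-mode special cases).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_PKRU_BITS_PER_PKEY 2
#define SK_PKRU_AD_BIT 0x1u	/* access disable */
#define SK_PKRU_WD_BIT 0x2u	/* write disable */

/* Extract the two permission bits that belong to one protection key. */
static uint32_t pkru_bits_for(uint32_t pkru, unsigned int pkey)
{
	assert(pkey < 16);	/* 4-bit keys: 16 keys in a 32-bit register */
	return (pkru >> (pkey * SK_PKRU_BITS_PER_PKEY)) & 0x3u;
}

/* Reads are allowed unless the key's AD bit is set. */
bool pkru_allows_read_sketch(uint32_t pkru, unsigned int pkey)
{
	return !(pkru_bits_for(pkru, pkey) & SK_PKRU_AD_BIT);
}

/* Writes are allowed only when neither AD nor WD is set. */
bool pkru_allows_write_sketch(uint32_t pkru, unsigned int pkey)
{
	return !(pkru_bits_for(pkru, pkey) &
		 (SK_PKRU_AD_BIT | SK_PKRU_WD_BIT));
}
```

Note that a PKRU value of 0, which is what read_pkru() returns when the CPU
lacks the feature, decodes as "everything allowed" for every key.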


[PATCH 13/31] x86, pkeys: fill in pkey field in siginfo

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

This fills in the new siginfo field: si_pkey to indicate to
userspace which protection key was set on the PTE that we faulted
on.

Note though that *ALL* protection key faults have to be generated
by a valid, present PTE at some point.  But this code does no PTE
lookups which seems odd.  The reason is that we take advantage of
the way we generate PTEs from VMAs.  All PTEs under a VMA share
some attributes.  For instance, they are _all_ either PROT_READ
*OR* PROT_NONE.  They also always share a protection key, so we
never have to walk the page tables; we just use the VMA.

Note that _pkey is a 64-bit value.  The current hardware only
supports 4-bit protection keys.  We do this because there is
_plenty_ of space in _sigfault and it is possible that future
processors would support more than 4 bits of protection keys.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/include/asm/pgtable_types.h |5 ++
 b/arch/x86/mm/fault.c  |   64 -
 2 files changed, 68 insertions(+), 1 deletion(-)

diff -puN arch/x86/include/asm/pgtable_types.h~pkeys-09-siginfo-x86 arch/x86/include/asm/pgtable_types.h
--- a/arch/x86/include/asm/pgtable_types.h~pkeys-09-siginfo-x86 2016-01-06 15:50:08.273276052 -0800
+++ b/arch/x86/include/asm/pgtable_types.h  2016-01-06 15:50:08.278276277 -0800
@@ -64,6 +64,11 @@
 #endif
 #define __HAVE_ARCH_PTE_SPECIAL
 
+#define _PAGE_PKEY_MASK (_PAGE_PKEY_BIT0 | \
+_PAGE_PKEY_BIT1 | \
+_PAGE_PKEY_BIT2 | \
+_PAGE_PKEY_BIT3)
+
 #ifdef CONFIG_KMEMCHECK
 #define _PAGE_HIDDEN   (_AT(pteval_t, 1) << _PAGE_BIT_HIDDEN)
 #else
diff -puN arch/x86/mm/fault.c~pkeys-09-siginfo-x86 arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-09-siginfo-x86  2016-01-06 15:50:08.275276142 -0800
+++ b/arch/x86/mm/fault.c   2016-01-06 15:50:08.279276323 -0800
@@ -15,12 +15,14 @@
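 #include <linux/context_tracking.h> /* exception_enter(), ...   */
 #include <linux/uaccess.h>          /* faulthandler_disabled()  */
 
+#include <asm/cpufeature.h>         /* boot_cpu_has, ...        */
 #include <asm/traps.h>              /* dotraplinkage, ...       */
 #include <asm/pgalloc.h>            /* pgd_*(), ...             */
 #include <asm/kmemcheck.h>          /* kmemcheck_*(), ...       */
 #include <asm/fixmap.h>             /* VSYSCALL_ADDR            */
 #include <asm/vsyscall.h>           /* emulate_vsyscall         */
 #include <asm/vm86.h>               /* struct vm86              */
+#include <asm/mmu_context.h>        /* vma_pkey()               */
 
 #define CREATE_TRACE_POINTS
 #include <asm/trace/exceptions.h>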
@@ -169,6 +171,56 @@ is_prefetch(struct pt_regs *regs, unsign
return prefetch;
 }
 
+/*
+ * A protection key fault means that the PKRU value did not allow
+ * access to some PTE.  Userspace can figure out what PKRU was
+ * from the XSAVE state, and this function fills out a field in
+ * siginfo so userspace can discover which protection key was set
+ * on the PTE.
+ *
+ * If we get here, we know that the hardware signaled a PF_PK
+ * fault and that there was a VMA once we got in the fault
+ * handler.  It does *not* guarantee that the VMA we find here
+ * was the one that we faulted on.
+ *
+ * 1. T1   : mprotect_key(foo, PAGE_SIZE, pkey=4);
+ * 2. T1   : set PKRU to deny access to pkey=4, touches page
+ * 3. T1   : faults...
+ * 4.    T2: mprotect_key(foo, PAGE_SIZE, pkey=5);
+ * 5. T1   : enters fault handler, takes mmap_sem, etc...
+ * 6. T1   : reaches here, sees vma_pkey(vma)=5, when we really
+ *  faulted on a pte with its pkey=4.
+ */
+static void fill_sig_info_pkey(int si_code, siginfo_t *info,
+   struct vm_area_struct *vma)
+{
+   /* This is effectively an #ifdef */
+   if (!boot_cpu_has(X86_FEATURE_OSPKE))
+   return;
+
+   /* Fault not from Protection Keys: nothing to do */
+   if (si_code != SEGV_PKUERR)
+   return;
+   /*
+* force_sig_info_fault() is called from a number of
+* contexts, some of which have a VMA and some of which
+* do not.  The PF_PK handling happens after we have a
+* valid VMA, so we should never reach this without a
+* valid VMA.
+*/
+   if (!vma) {
+   WARN_ONCE(1, "PKU fault with no VMA passed in");
+   info->si_pkey = 0;
+   return;
+   }
+   /*
+* si_pkey should be thought of as a strong hint, but not
+* absolutely guaranteed to be 100% accurate because of
+* the race explained above.
+*/
+   info->si_pkey = vma_pkey(vma);
+}
+
 static void
 force_sig_info_fault(int si_signo, int si_code, unsigned long address,
 struct task_struct *tsk, struct vm_area_struct *vma,
@@ -187,6 +239,8 @@ force_sig_info_fault(int si_signo, int s
lsb = PAGE_SHIFT;
info.si_addr_lsb = lsb;
 
+   
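
The commit message above explains that the key is taken from the VMA rather
than from the PTE. For comparison, this is roughly what reading the 4-bit key
straight out of a 64-bit PTE value would look like. It is only a sketch: it
assumes the key occupies bits 59 to 62 of the entry (the layout given in the
Intel SDM), and the names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the 4-bit protection key inside a 64-bit PTE. */
#define SK_PTE_PKEY_SHIFT 59
#define SK_PTE_PKEY_MASK  (UINT64_C(0xf) << SK_PTE_PKEY_SHIFT)

/* Return the protection key encoded in a raw PTE value (0..15). */
unsigned int pte_pkey_sketch(uint64_t pteval)
{
	return (unsigned int)((pteval & SK_PTE_PKEY_MASK) >> SK_PTE_PKEY_SHIFT);
}

/* Usage example: low flag bits and bit 63 do not disturb the key. */
void pte_pkey_example(void)
{
	uint64_t pte = (UINT64_C(1) << 63) | (UINT64_C(5) << 59) | 0x7;

	assert(pte_pkey_sketch(pte) == 5);
}
```

Doing this in the fault path would need a page table walk, which is exactly
what using vma_pkey() avoids, at the cost of the race described in the
comment above fill_sig_info_pkey().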

Re: [PATCH 1/3] mtd: nand: pxa3xx_nand: add register access debug

2016-01-06 Thread Brian Norris
On Tue, Aug 11, 2015 at 09:57:12PM +0200, Robert Jarzmik wrote:
> Add verbose debug for register accesses. This enables easier debugging
> by following where and how hardware is stimulated, and how it answers.
> 
> Signed-off-by: Robert Jarzmik 
> ---
>  drivers/mtd/nand/pxa3xx_nand.c | 22 +-
>  1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/mtd/nand/pxa3xx_nand.c b/drivers/mtd/nand/pxa3xx_nand.c
> index 1259cc558ce9..ed44bddcc43f 100644
> --- a/drivers/mtd/nand/pxa3xx_nand.c
> +++ b/drivers/mtd/nand/pxa3xx_nand.c
> @@ -127,11 +127,23 @@
>  #define EXT_CMD_TYPE_MONO0 /* Monolithic read/write */
>  
>  /* macros for registers read/write */
> -#define nand_writel(info, off, val)  \
> - writel_relaxed((val), (info)->mmio_base + (off))
> -
> -#define nand_readl(info, off)\
> - readl_relaxed((info)->mmio_base + (off))
> +#define nand_writel(info, off, val)  \
> + do {\
> + dev_vdbg(&info->pdev->dev,  \
> +  "%s():%d nand_writel(0x%x, %s)\n", \
> +  __func__, __LINE__, (val), #off);  \

The stringification of 'off' works for now, but I think that'd be a bit
restrictive in the future, if we ever want to (e.g.) do arithmetic to
compute the offset, like:

nand_writel(info, SOME_REGISTER_MACRO + idx * 4, foo);

You might be better off just printing the hex value of the offset.

Regards,
Brian

> + writel_relaxed((val), (info)->mmio_base + (off));   \
> + } while (0)
> +
> +#define nand_readl(info, off)\
> + ({  \
> + unsigned int _v;\
> + _v = readl_relaxed((info)->mmio_base + (off));  \
> + dev_vdbg(&info->pdev->dev,  \
> +  "%s():%d nand_readl(%s): 0x%x\n",  \
> +  __func__, __LINE__, #off, _v); \
> + _v; \
> + })
>  
>  /* error code and state */
>  enum {
> -- 
> 2.1.4
> 
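
The difference between the two logging styles discussed in this review can be
seen in a small userspace sketch. Everything here is made up for illustration
(the macro names, the register base of 0x40, and the use of snprintf() in
place of dev_vdbg()); it only demonstrates how the preprocessor treats a
stringified offset versus an evaluated one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical register base, invented for this sketch. */
#define SOME_REGISTER_MACRO 0x40

/* Variant in the patch: the offset argument is logged as source text. */
#define FORMAT_WRITE_BY_NAME(buf, len, off, val)			\
	snprintf((buf), (len), "nand_writel(0x%x, %s)",			\
		 (unsigned int)(val), #off)

/* Suggested variant: the offset is evaluated and logged as a number. */
#define FORMAT_WRITE_BY_VALUE(buf, len, off, val)			\
	snprintf((buf), (len), "nand_writel(0x%x, 0x%04x)",		\
		 (unsigned int)(val), (unsigned int)(off))

void offset_logging_example(void)
{
	char by_name[64], by_value[64];
	unsigned int idx = 2;

	FORMAT_WRITE_BY_NAME(by_name, sizeof(by_name),
			     SOME_REGISTER_MACRO + idx * 4, 0x1f);
	FORMAT_WRITE_BY_VALUE(by_value, sizeof(by_value),
			      SOME_REGISTER_MACRO + idx * 4, 0x1f);

	/* The stringified form shows the expression, not the address. */
	assert(strcmp(by_name,
		      "nand_writel(0x1f, SOME_REGISTER_MACRO + idx * 4)") == 0);
	/* The evaluated form shows the register that was actually hit. */
	assert(strcmp(by_value, "nand_writel(0x1f, 0x0048)") == 0);
}
```

With a computed offset the stringified log no longer identifies the register
being written, while the hex form keeps working.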


[PATCH 08/31] x86, pkeys: new page fault error code bit: PF_PK

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

Note: "PK" is how the Intel SDM refers to this bit, so we also
use that nomenclature.

This only defines the bit, it does not plumb it anywhere to be
handled.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/mm/fault.c |8 
 1 file changed, 8 insertions(+)

diff -puN arch/x86/mm/fault.c~pkeys-05-pfec arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-05-pfec 2016-01-06 15:50:06.068176638 -0800
+++ b/arch/x86/mm/fault.c   2016-01-06 15:50:06.071176773 -0800
@@ -33,6 +33,7 @@
  *   bit 2 ==   0: kernel-mode access  1: user-mode access
  *   bit 3 ==  1: use of reserved bit detected
  *   bit 4 ==  1: fault was an instruction fetch
+ *   bit 5 ==  1: protection keys block access
  */
 enum x86_pf_error_code {
 
@@ -41,6 +42,7 @@ enum x86_pf_error_code {
PF_USER =   1 << 2,
PF_RSVD =   1 << 3,
PF_INSTR=   1 << 4,
+   PF_PK   =   1 << 5,
 };
 
 /*
@@ -916,6 +918,12 @@ static int spurious_fault_check(unsigned
 
if ((error_code & PF_INSTR) && !pte_exec(*pte))
return 0;
+   /*
+* Note: We do not do lazy flushing on protection key
+* changes, so no spurious fault will ever set PF_PK.
+*/
+   if ((error_code & PF_PK))
+   return 1;
 
return 1;
 }
_
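
As a quick illustration of how the new bit fits in with the existing ones, the
sketch below decodes an x86 page fault error code in userspace. Bits 2 to 5
follow the enum in the hunk above; bits 0 and 1 (PF_PROT and PF_WRITE) are not
shown in the hunk and are filled in here from the usual x86 error code layout,
so treat them as an assumption. The predicate names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirror of enum x86_pf_error_code, for illustration only. */
enum pf_error_code_sketch {
	PF_PROT  = 1 << 0,	/* assumed: protection fault on a present page */
	PF_WRITE = 1 << 1,	/* assumed: access was a write */
	PF_USER  = 1 << 2,
	PF_RSVD  = 1 << 3,
	PF_INSTR = 1 << 4,
	PF_PK    = 1 << 5,	/* new: protection keys blocked the access */
};

bool pf_is_pkey_fault(unsigned long error_code)
{
	return (error_code & PF_PK) != 0;
}

bool pf_is_write(unsigned long error_code)
{
	return (error_code & PF_WRITE) != 0;
}

bool pf_is_instr_fetch(unsigned long error_code)
{
	return (error_code & PF_INSTR) != 0;
}

/* Usage example: a user-mode write that was blocked by a protection key. */
void pf_decode_example(void)
{
	unsigned long ec = PF_PROT | PF_WRITE | PF_USER | PF_PK;

	assert(pf_is_pkey_fault(ec));
	assert(pf_is_write(ec));
	assert(!pf_is_instr_fetch(ec));
}
```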


[PATCH 11/31] x86, pkeys: pass VMA down in to fault signal generation code

2016-01-06 Thread Dave Hansen

From: Dave Hansen 

During a page fault, we look up the VMA to ensure that the fault
is in a region with a valid mapping.  But, in the top-level page
fault code we don't need the VMA for much else.  Once we have
decided that an access is bad, we are going to send a signal no
matter what and do not need the VMA any more.  So we do not pass
it down in to the signal generation code.

But, for protection keys, we need the VMA.  It tells us *which*
protection key we violated if we get a PF_PK.  So, we need to
pass the VMA down and fill in siginfo->si_pkey.

Signed-off-by: Dave Hansen 
Reviewed-by: Thomas Gleixner 
---

 b/arch/x86/mm/fault.c |   50 --
 1 file changed, 28 insertions(+), 22 deletions(-)

diff -puN arch/x86/mm/fault.c~pkeys-08-pass-down-vma arch/x86/mm/fault.c
--- a/arch/x86/mm/fault.c~pkeys-08-pass-down-vma    2016-01-06 15:50:07.422237684 -0800
+++ b/arch/x86/mm/fault.c   2016-01-06 15:50:07.425237819 -0800
@@ -171,7 +171,8 @@ is_prefetch(struct pt_regs *regs, unsign
 
 static void
 force_sig_info_fault(int si_signo, int si_code, unsigned long address,
-struct task_struct *tsk, int fault)
+struct task_struct *tsk, struct vm_area_struct *vma,
+int fault)
 {
unsigned lsb = 0;
siginfo_t info;
@@ -656,6 +657,8 @@ no_context(struct pt_regs *regs, unsigne
struct task_struct *tsk = current;
unsigned long flags;
int sig;
+   /* No context means no VMA to pass down */
+   struct vm_area_struct *vma = NULL;
 
/* Are we prepared to handle this kernel fault? */
if (fixup_exception(regs)) {
@@ -679,7 +682,8 @@ no_context(struct pt_regs *regs, unsigne
tsk->thread.cr2 = address;
 
/* XXX: hwpoison faults will set the wrong code. */
-   force_sig_info_fault(signal, si_code, address, tsk, 0);
+   force_sig_info_fault(signal, si_code, address,
+tsk, vma, 0);
}
 
/*
@@ -756,7 +760,8 @@ show_signal_msg(struct pt_regs *regs, un
 
 static void
 __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
-  unsigned long address, int si_code)
+  unsigned long address, struct vm_area_struct *vma,
+  int si_code)
 {
struct task_struct *tsk = current;
 
@@ -799,7 +804,7 @@ __bad_area_nosemaphore(struct pt_regs *r
tsk->thread.error_code  = error_code;
tsk->thread.trap_nr = X86_TRAP_PF;
 
-   force_sig_info_fault(SIGSEGV, si_code, address, tsk, 0);
+   force_sig_info_fault(SIGSEGV, si_code, address, tsk, vma, 0);
 
return;
}
@@ -812,14 +817,14 @@ __bad_area_nosemaphore(struct pt_regs *r
 
 static noinline void
 bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
-unsigned long address)
+unsigned long address, struct vm_area_struct *vma)
 {
-   __bad_area_nosemaphore(regs, error_code, address, SEGV_MAPERR);
+   __bad_area_nosemaphore(regs, error_code, address, vma, SEGV_MAPERR);
 }
 
 static void
 __bad_area(struct pt_regs *regs, unsigned long error_code,
-  unsigned long address, int si_code)
+  unsigned long address,  struct vm_area_struct *vma, int si_code)
 {
struct mm_struct *mm = current->mm;
 
@@ -829,25 +834,25 @@ __bad_area(struct pt_regs *regs, unsigne
 */
up_read(>mmap_sem);
 
-   __bad_area_nosemaphore(regs, error_code, address, si_code);
+   __bad_area_nosemaphore(regs, error_code, address, vma, si_code);
 }
 
 static noinline void
 bad_area(struct pt_regs *regs, unsigned long error_code, unsigned long address)
 {
-   __bad_area(regs, error_code, address, SEGV_MAPERR);
+   __bad_area(regs, error_code, address, NULL, SEGV_MAPERR);
 }
 
 static noinline void
 bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
- unsigned long address)
+ unsigned long address, struct vm_area_struct *vma)
 {
-   __bad_area(regs, error_code, address, SEGV_ACCERR);
+   __bad_area(regs, error_code, address, vma, SEGV_ACCERR);
 }
 
 static void
 do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
- unsigned int fault)
+ struct vm_area_struct *vma, unsigned int fault)
 {
struct task_struct *tsk = current;
int code = BUS_ADRERR;
@@ -874,12 +879,13 @@ do_sigbus(struct pt_regs *regs, unsigned
code = BUS_MCEERR_AR;
}
 #endif
-   force_sig_info_fault(SIGBUS, code, address, tsk, fault);
+   force_sig_info_fault(SIGBUS, code, address, tsk, vma, fault);
 }
 
 static noinline void
 

Re: [PATCH v2 15/32] powerpc: define __smp_xxx

2016-01-06 Thread Boqun Feng
On Wed, Jan 06, 2016 at 10:23:51PM +0200, Michael S. Tsirkin wrote:
[...]
> > > 
> > > Sorry, I don't understand - why do you have to do anything?
> > > I changed all users of smp_lwsync so they
> > > use __smp_lwsync on SMP and barrier() on !SMP.
> > > 
> > > This is exactly the current behaviour, I also tested that
> > > generated code does not change at all.
> > > 
> > > Is there a patch in your tree that conflicts with this?
> > > 
> > 
> > Because in a patchset which implements atomic relaxed/acquire/release
> > variants on PPC I use smp_lwsync(), this makes it have another user,
> > please see this mail:
> > 
> > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877
> > 
> > in definition of PPC's __atomic_op_release().
> > 
> > 
> > But I think removing smp_lwsync() is a good idea and actually I think we
> > can go further to remove __smp_lwsync() and let __smp_load_acquire and
> > __smp_store_release call __lwsync() directly, but that is another thing.
> > 
> > Anyway, I will modify my patch.
> > 
> > Regards,
> > Boqun
> 
> 
> Thanks!
> Could you send an ack then please?
> 

Sure, if you need one from me, feel free to add my ack for this patch:

Acked-by: Boqun Feng 

Regards,
Boqun




Re: [linux-sunxi] Re: [PATCH v8 2/2] ASoc: sun4i-codec: Add FM, Line and Mic inputs

2016-01-06 Thread Maxime Ripard
On Mon, Dec 28, 2015 at 04:06:49AM +0100, Danny Milosavljevic wrote:
> Hi Maxime,
> 
> On Sun, 27 Dec 2015 19:21:57 +0100
> Maxime Ripard  wrote:
> 
> > On Mon, Dec 21, 2015 at 12:34:16PM +0100, Danny Milosavljevic wrote:
> > > This is the second part, actually adding FM, Line and Mic inputs.
> > 
> > Again, having a meaningful and standalone commit log would be nice.
> 
> Okay, will elaborate some more in v9.
> 
> > > +#define SUN4I_CODEC_ADC_ACTL_PREG1_A10   (25)
> > > +#define SUN4I_CODEC_ADC_ACTL_PREG2_A10   (23)
> > 
> > Why the A10 suffix?
> 
> The sun4i*_a10 names are for things that work on A10 but not on A20.
> This way whoever touches the driver later can know for which things he has
> to consider multiple cases.
> Otherwise he will have no indication that he is using a bit index where 
> there sometimes is no bit (or, worse, the wrong bit).
> 
> It's intended to be used like this:
> 
> sun4i_foo_a10
>   foo is sun4i_foo_a10 on A10 (only).
> sun4i_foo
>   foo is sun4i_foo on A10 and also on future chips (like A20).
> sun7i_foo
>   foo is sun7i_foo on A20 and (hopefully) also on future chips.

I find the sun4i_*_a10 and sun4i highly redundant. If there is the same
define for sun7i, then you know that it's not meant to be used for the
A20, and that's it.

My point is also that it will just be an ever growing list of suffixes
when we will support more SoCs, for example for symbols that might be
in the A10, not the A20, but the A31.

> > > +static const char * const sun4i_codec_capture_source[] = {
> > > + "Line-In",
> > > + "FM",
> > > + "Mic1",
> > > + "Mic2",
> > > + "Mic1,Mic2",
> > > + "Mic1+Mic2",
> > > + "Output Mixer",
> > > + "Line-In,Mic1",
> > > +};
> > > +static SOC_ENUM_SINGLE_DECL(sun4i_codec_enum_capture_source,
> > > + SUN4I_CODEC_ADC_ACTL,
> > > + SUN4I_CODEC_ADC_ACTL_ADCIS,
> > > + sun4i_codec_capture_source);
> > 
> > Isn't it possible to expose this as two (shared) muxes with different
> > names to make it clear what will go to the left ADC and what will go
> > to the right?
> 
> I don't know how to do that. I'll try to find out.
> 
> (It would be best if someone who knows how that should act did the alsamixer 
> testing later too, though)
> 
> Can two muxes use the same bit in the hardware without problems?
> Or do you mean reuse the same mux instance? (I think _new1 always creates 
> a new instance from each struct kcontrol_new automatically).

Yeah, the case where two widgets share the same bits is handled iirc
by sharing the controls.

> > The power amplifier is only for the playback, so there's no need to
> > differentiate between playback and capture. The current name was fine.
> 
> "Power Amplifier Volume" shows up as Capture control in alsamixer as well.
> It isn't supposed to be a Capture control.
> > > + /* Line-In, FM, Mic1, Mic2 */ \
> > > + SOC_SINGLE_TLV("Line-In Playback Volume", \
> ...
> > > + SOC_SINGLE_TLV("FM Playback Volume", \
> ...
> > > + SOC_SINGLE_TLV("Mic Playback Volume", \
> ...
> > 
> > Those are not volume it's gain, 
> 
> We tried to call the things ..." Gain" before and it was difficult to do, 
> with some breakage along the way, see below.
> Also, Mark said they should be named ..." Volume" (see 
> ).

Ah, my bad.

Judging from ControlNames.txt, the Power Amplifier Volume should
probably be called Headphone Playback Volume then.

> >and it should probably be two different shared controls for mic1 and mic2.
> 
> I'll try...
> 
> >> "Capture Source"
> > This one is the ADC Gain. Please name it that way.
> 
> Unfortunately, the strings have meaning to asoc and alsa-lib and you can't go 
> renaming them like that without breaking things. The names were carefully 
> chosen to make it actually work properly without having to patch alsa-lib and 
> parts of ASoC core (which I did before and have since reverted).
> 
> In this case, there's a special case in upstream alsa-lib:
> 
> alsa-lib-1.0.28:
> ./src/mixer/simple_none.c:if (strcmp(name, "Capture Source") == 0) {
> ...
> 
> I'm not totally against naming them like you suggested - but know that you 
> are requiring changes in alsa-lib as well then - or presumably breaking the 
> user's workflow.
> 
> For example the (upstream) "Capture Source" special case in alsa-lib adds 
> radio-button-like things to the respective elements. 
> You can switch to one of the inputs by pressing Space while its gain element 
> is selected.
> In the mic case, it's the mic preamplifier gain that you press Space on - if
> it's indeed shown as a Capture control...

No, you're totally right, I just entirely missed that ControlNames
file you pointed me to.

Thanks,
Maxime

-- 
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



[GIT] Networking

2016-01-06 Thread David Miller

As usual, there are a couple straggler bug fixes:

1) qlcnic_alloc_mbx_args() error returns are not checked in qlcnic driver.
   Fix from Insu Yun.

2) SKB refcounting bug in connector, from Florian Westphal.

3) vrf_get_saddr() has to propagate fib_lookup() errors to its callers,
   from David Ahern.

4) Fix AF_UNIX splice/bind deadlock, from Rainer Weikusat.

5) qdisc_rcu_free() fails to free the per-cpu qstats.  Fix from John
   Fastabend.

6) vmxnet3 driver passes wrong page to dma_map_page(), fix from
   Shrikrishna Khare.

7) Don't allow zero cwnd in tcp_cwnd_reduction(), from Yuchung Cheng.

Please pull, thanks a lot!

The following changes since commit 9c982e86dbdbaa3fb248dfc776a32cbc8927:

  Merge tag 'pci-v4.4-fixes-3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci (2015-12-31 14:59:21 
-0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to 8b8a321ff72c785ed5e8b4cf6eda20b35d427390:

  tcp: fix zero cwnd in tcp_cwnd_reduction (2016-01-06 16:39:56 -0500)


Alan (1):
  mkiss: fix scribble on freed memory

David Ahern (1):
  net: Propagate lookup failure in l3mdev_get_saddr to caller

Florian Westphal (1):
  connector: bump skb->users before callback invocation

Francesco Ruggeri (1):
  net: possible use after free in dst_release

Hannes Frederic Sowa (1):
  bridge: Only call /sbin/bridge-stp for the initial network namespace

Insu Yun (2):
  qlcnic: correctly handle qlcnic_alloc_mbx_args
  cxgb4: correctly handling failed allocation

John Fastabend (1):
  net: sched: fix missing free per cpu on qstats

Kristian Evensen (1):
  net: qmi_wwan: Add WeTelecom-WPD600N

One Thousand Gnomes (1):
  6pack: fix free memory scribbles

Rabin Vincent (2):
  net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
  ARM: net: bpf: fix zero right shift

Rainer Weikusat (1):
  af_unix: Fix splice-bind deadlock

Shrikrishna Khare (1):
  Driver: Vmxnet3: Fix regression caused by 5738a09

Yuchung Cheng (1):
  tcp: fix zero cwnd in tcp_cwnd_reduction

hayeswang (1):
  r8152: add reset_resume function

 arch/arm/net/bpf_jit_32.c   | 19 +++
 arch/mips/net/bpf_jit.c | 16 +---
 arch/powerpc/net/bpf_jit_comp.c | 13 ++---
 arch/sparc/net/bpf_jit_comp.c   | 17 ++---
 drivers/connector/connector.c   | 11 +++
 drivers/net/ethernet/chelsio/cxgb4/clip_tbl.c   |  4 
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c |  6 --
 drivers/net/hamradio/6pack.c|  6 ++
 drivers/net/hamradio/mkiss.c|  5 +
 drivers/net/usb/qmi_wwan.c  |  1 +
 drivers/net/usb/r8152.c | 10 +-
 drivers/net/vmxnet3/vmxnet3_drv.c   |  8 
 drivers/net/vmxnet3/vmxnet3_int.h   |  4 ++--
 drivers/net/vrf.c   | 10 +++---
 include/linux/filter.h  | 19 +++
 include/net/l3mdev.h| 16 ++--
 include/net/route.h |  7 ++-
 net/bridge/br_stp_if.c  |  5 -
 net/core/dst.c  |  3 ++-
 net/ipv4/raw.c  |  7 +--
 net/ipv4/tcp_input.c|  3 +++
 net/ipv4/udp.c  |  7 +--
 net/sched/sch_generic.c |  4 +++-
 net/unix/af_unix.c  | 66 
--
 24 files changed, 150 insertions(+), 117 deletions(-)


[PATCH RESEND v4 02/11] fsl-mc: msi: Added FSL-MC-specific member to the msi_desc's union

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

FSL-MC is a bus type different from PCI and platform, so it needs
its own member in the msi_desc's union.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4: none

Changes in v3: none

Changes in v2:
- Addressed comment from Jiang Liu
  * Added a dedicated structure for FSL-MC in struct msi_desc

 include/linux/msi.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/linux/msi.h b/include/linux/msi.h
index f71a25e..152e51a 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -33,6 +33,14 @@ struct platform_msi_desc {
 };

 /**
+ * fsl_mc_msi_desc - FSL-MC device specific msi descriptor data
+ * @msi_index: The index of the MSI descriptor
+ */
+struct fsl_mc_msi_desc {
+   u16 msi_index;
+};
+
+/**
  * struct msi_desc - Descriptor structure for MSI based interrupts
  * @list:  List head for management
  * @irq:   The base interrupt number
@@ -87,6 +95,7 @@ struct msi_desc {
 * tree wide cleanup.
 */
struct platform_msi_desc platform;
+   struct fsl_mc_msi_desc fsl_mc;
};
 };

--
2.3.3
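
For anyone unfamiliar with the pattern, here is a minimal standalone sketch of
a descriptor that carries one bus-specific member per bus type in a union, as
this patch does for FSL-MC. The struct layouts are simplified stand-ins (the
real struct msi_desc has more fields and uses an anonymous union); a named
union is used here only to keep the example portable.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the per-bus descriptor data. */
struct platform_msi_desc_sketch {
	uint16_t msi_index;
};

struct fsl_mc_msi_desc_sketch {
	uint16_t msi_index;
};

/* Only one union member is meaningful for any given descriptor. */
struct msi_desc_sketch {
	unsigned int irq;
	union {
		struct platform_msi_desc_sketch platform;
		struct fsl_mc_msi_desc_sketch fsl_mc;
	} bus;
};

/* Hypothetical helper: initialize a descriptor owned by an FSL-MC device. */
void msi_desc_init_fsl_mc(struct msi_desc_sketch *desc, unsigned int irq,
			  uint16_t msi_index)
{
	assert(desc != 0);
	desc->irq = irq;
	desc->bus.fsl_mc.msi_index = msi_index;
}
```

Since the members overlap, adding a new bus type this way does not grow the
descriptor unless the new member is larger than the existing ones.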



[PATCH RESEND v4 07/11] staging: fsl-mc: Populate the IRQ pool for an MC bus instance

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

Scan the corresponding DPRC container to get the total count
of IRQs needed by all its child DPAA2 objects. Then,
preallocate a set of MSI IRQs with the DPRC's ICID
(GIC-ITS device Id) to populate the DPRC's IRQ pool.
Each child DPAA2 object in the DPRC and the DPRC object itself
will allocate their necessary MSI IRQs from the DPRC's IRQ pool,
in their driver probe function.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4: none

Changes in v3: none

Changes in v2: none

 drivers/staging/fsl-mc/bus/dprc-driver.c| 24 ++--
 drivers/staging/fsl-mc/include/mc-private.h |  3 ++-
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/fsl-mc/bus/dprc-driver.c b/drivers/staging/fsl-mc/bus/dprc-driver.c
index 767d437..ef1bb93 100644
--- a/drivers/staging/fsl-mc/bus/dprc-driver.c
+++ b/drivers/staging/fsl-mc/bus/dprc-driver.c
@@ -241,6 +241,7 @@ static void dprc_cleanup_all_resource_pools(struct fsl_mc_device *mc_bus_dev)
  * dprc_scan_objects - Discover objects in a DPRC
  *
  * @mc_bus_dev: pointer to the fsl-mc device that represents a DPRC object
+ * @total_irq_count: total number of IRQs needed by objects in the DPRC.
  *
  * Detects objects added and removed from a DPRC and synchronizes the
  * state of the Linux bus driver, MC by adding and removing
@@ -254,11 +255,13 @@ static void dprc_cleanup_all_resource_pools(struct fsl_mc_device *mc_bus_dev)
  * populated before they can get allocation requests from probe callbacks
  * of the device drivers for the non-allocatable devices.
  */
-int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev)
+int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev,
+ unsigned int *total_irq_count)
 {
int num_child_objects;
int dprc_get_obj_failures;
int error;
+   unsigned int irq_count = mc_bus_dev->obj_desc.irq_count;
struct dprc_obj_desc *child_obj_desc_array = NULL;

error = dprc_get_obj_count(mc_bus_dev->mc_io,
@@ -307,6 +310,7 @@ int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev)
continue;
}

+   irq_count += obj_desc->irq_count;
dev_dbg(&mc_bus_dev->dev,
"Discovered object: type %s, id %d\n",
obj_desc->type, obj_desc->id);
@@ -319,6 +323,7 @@ int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev)
}
}

+   *total_irq_count = irq_count;
dprc_remove_devices(mc_bus_dev, child_obj_desc_array,
num_child_objects);

@@ -344,6 +349,7 @@ EXPORT_SYMBOL_GPL(dprc_scan_objects);
 int dprc_scan_container(struct fsl_mc_device *mc_bus_dev)
 {
int error;
+   unsigned int irq_count;
struct fsl_mc_bus *mc_bus = to_fsl_mc_bus(mc_bus_dev);

dprc_init_all_resource_pools(mc_bus_dev);
@@ -352,11 +358,25 @@ int dprc_scan_container(struct fsl_mc_device *mc_bus_dev)
 * Discover objects in the DPRC:
 */
mutex_lock(&mc_bus->scan_mutex);
-   error = dprc_scan_objects(mc_bus_dev);
+   error = dprc_scan_objects(mc_bus_dev, &irq_count);
mutex_unlock(&mc_bus->scan_mutex);
if (error < 0)
goto error;

+   if (dev_get_msi_domain(&mc_bus_dev->dev) && !mc_bus->irq_resources) {
+   if (irq_count > FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS) {
+   dev_warn(&mc_bus_dev->dev,
+"IRQs needed (%u) exceed IRQs preallocated (%u)\n",
+irq_count, FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS);
+   }
+
+   error = fsl_mc_populate_irq_pool(
+   mc_bus,
+   FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS);
+   if (error < 0)
+   goto error;
+   }
+
return 0;
 error:
dprc_cleanup_all_resource_pools(mc_bus_dev);
diff --git a/drivers/staging/fsl-mc/include/mc-private.h b/drivers/staging/fsl-mc/include/mc-private.h
index 3babe92..be72a44 100644
--- a/drivers/staging/fsl-mc/include/mc-private.h
+++ b/drivers/staging/fsl-mc/include/mc-private.h
@@ -114,7 +114,8 @@ void fsl_mc_device_remove(struct fsl_mc_device *mc_dev);

 int dprc_scan_container(struct fsl_mc_device *mc_bus_dev);

-int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev);
+int dprc_scan_objects(struct fsl_mc_device *mc_bus_dev,
+ unsigned int *total_irq_count);

 int __init dprc_driver_init(void);

--
2.3.3
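
The counting step this patch adds to dprc_scan_objects() can be looked at
in isolation. What follows is a minimal user-space sketch, not the driver
code: struct obj_desc, total_irqs_needed(), exceeds_pool() and IRQ_POOL_MAX
are hypothetical stand-ins for struct dprc_obj_desc, the scan loop, the
dev_warn() condition and FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS (whose real value
is not shown in this patch).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool size; the real constant's value is not shown in the patch. */
#define IRQ_POOL_MAX 256u

/* Hypothetical, trimmed-down object descriptor: only the field used here. */
struct obj_desc {
	unsigned int irq_count;
};

/*
 * Sum the IRQs needed by the container itself and by each child,
 * mirroring how the patch seeds irq_count from the DPRC's own
 * descriptor and then adds every discovered child's count.
 */
static unsigned int total_irqs_needed(const struct obj_desc *container,
				      const struct obj_desc *children,
				      size_t num_children)
{
	unsigned int total;
	size_t i;

	assert(container != NULL);
	assert(children != NULL || num_children == 0);

	total = container->irq_count;
	for (i = 0; i < num_children; i++)
		total += children[i].irq_count;

	return total;
}

/* Non-zero when demand exceeds the preallocated pool (the case the patch warns about). */
static int exceeds_pool(unsigned int needed)
{
	return needed > IRQ_POOL_MAX;
}
```

Note that in the patch the computed total only drives the warning; the pool
itself is always populated with FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS entries.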



[PATCH RESEND v4 01/11] irqdomain: Added domain bus token DOMAIN_BUS_FSL_MC_MSI

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

Since an FSL-MC bus is a new bus type that is neither PCI nor
PLATFORM, we need a new domain bus token to disambiguate the
IRQ domain for FSL-MC MSIs.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4: none

Changes in v3: none

Changes in v2: none

 include/linux/irqdomain.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index d5e5c5b..c0cb5d1 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -73,6 +73,7 @@ enum irq_domain_bus_token {
DOMAIN_BUS_PCI_MSI,
DOMAIN_BUS_PLATFORM_MSI,
DOMAIN_BUS_NEXUS,
+   DOMAIN_BUS_FSL_MC_MSI,
 };

 /**
--
2.3.3



[PATCH RESEND v4 06/11] staging: fsl-mc: Changed DPRC built-in portal's mc_io to be atomic

2016-01-06 Thread J. German Rivera
From: "J. German Rivera" 

The DPRC built-in portal's mc_io is used to send commands to the MC
to program MSIs for MC objects. This is done by the
fsl_mc_msi_write_msg() callback, which is invoked by the generic MSI
layer with interrupts disabled. As a result, the mc_io used in
fsl_mc_msi_write_msg needs to be an atomic mc_io.

Signed-off-by: J. German Rivera 
---
CHANGE HISTORY

Changes in v4: none

Changes in v3: none

Changes in v2: none

 drivers/staging/fsl-mc/bus/dprc-driver.c | 4 +++-
 drivers/staging/fsl-mc/bus/mc-bus.c  | 3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/fsl-mc/bus/dprc-driver.c b/drivers/staging/fsl-mc/bus/dprc-driver.c
index 2c4cd70..767d437 100644
--- a/drivers/staging/fsl-mc/bus/dprc-driver.c
+++ b/drivers/staging/fsl-mc/bus/dprc-driver.c
@@ -396,7 +396,9 @@ static int dprc_probe(struct fsl_mc_device *mc_dev)
error = fsl_create_mc_io(&mc_dev->dev,
 mc_dev->regions[0].start,
 region_size,
-NULL, 0, &mc_dev->mc_io);
+NULL,
+FSL_MC_IO_ATOMIC_CONTEXT_PORTAL,
+&mc_dev->mc_io);
if (error < 0)
return error;
}
diff --git a/drivers/staging/fsl-mc/bus/mc-bus.c b/drivers/staging/fsl-mc/bus/mc-bus.c
index 84db55b..d34f1af 100644
--- a/drivers/staging/fsl-mc/bus/mc-bus.c
+++ b/drivers/staging/fsl-mc/bus/mc-bus.c
@@ -702,7 +702,8 @@ static int fsl_mc_bus_probe(struct platform_device *pdev)
mc_portal_phys_addr = res.start;
mc_portal_size = resource_size(&res);
error = fsl_create_mc_io(&pdev->dev, mc_portal_phys_addr,
-mc_portal_size, NULL, 0, &mc_io);
+mc_portal_size, NULL,
+FSL_MC_IO_ATOMIC_CONTEXT_PORTAL, &mc_io);
if (error < 0)
return error;

--
2.3.3



Re: [PATCH for-4.5 v2] mtd: nand: assign reasonable default name for NAND drivers

2016-01-06 Thread Brian Norris
On Tue, Jan 05, 2016 at 10:39:45AM -0800, Brian Norris wrote:
> Commits such as commit 853f1c58c4b2 ("mtd: nand: omap2: show parent
> device structure in sysfs") attempt to rely on the core MTD code to set
> the MTD name based on the parent device. However, nand_base tries to set
> a different default name according to the flash name (e.g., extracted
> from the ONFI parameter page), which means NAND drivers will never make
> use of the MTD defaults. This is not the intention of commit
> 853f1c58c4b2.
> 
> This results in problems when trying to use the cmdline partition
> parser, since the MTD name is different than expected. Let's fix this by
> providing a default NAND name, where possible.
> 
> Note that this is not really a great default name in the long run, since
> this means that if there are multiple MTDs attached to the same
> controller device, they will have the same name. But that is an existing
> issue and requires future work on a better controller vs. flash chip
> abstraction to fix properly.
> 
> Fixes: 853f1c58c4b2 ("mtd: nand: omap2: show parent device structure in sysfs")
> Reported-by: Heiko Schocher 
> Signed-off-by: Brian Norris 
> Reviewed-by: Boris Brezillon 
> Tested-by: Heiko Schocher 
> Cc: Heiko Schocher 
> Cc: Frans Klaver 
> Cc: 
> ---
> v2:
>  * target 4.5, as 4.4 is getting late
>  * add -stable tags
>  * move assignment directly into nand_scan_ident() (nand_set_defaults() has a
>slightly different purpose and gets reused, so it's not as good of a
>candidate)

Applied to l2-mtd.git, for 4.5 (+ stable)


[PATCH v5 2/6] nvdimm: Clean-up access mode check.

2016-01-06 Thread Jerry Hoemann
Change nd_ioctl and nvdimm_ioctl access mode check to use O_RDONLY.

Signed-off-by: Jerry Hoemann 
---
 drivers/nvdimm/bus.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 7e2c43f..1c81203 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -602,14 +602,14 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
 static long nd_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
long id = (long) file->private_data;
-   int rc = -ENXIO, read_only;
+   int rc = -ENXIO, ro;
struct nvdimm_bus *nvdimm_bus;
 
-   read_only = (O_RDWR != (file->f_flags & O_ACCMODE));
+   ro = ((file->f_flags & O_ACCMODE) == O_RDONLY);
mutex_lock(&nvdimm_bus_list_mutex);
list_for_each_entry(nvdimm_bus, &nvdimm_bus_list, list) {
if (nvdimm_bus->id == id) {
-   rc = __nd_ioctl(nvdimm_bus, NULL, read_only, cmd, arg);
+   rc = __nd_ioctl(nvdimm_bus, NULL, ro, cmd, arg);
break;
}
}
@@ -633,10 +633,10 @@ static int match_dimm(struct device *dev, void *data)
 
 static long nvdimm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
-   int rc = -ENXIO, read_only;
+   int rc = -ENXIO, ro;
struct nvdimm_bus *nvdimm_bus;
 
-   read_only = (O_RDWR != (file->f_flags & O_ACCMODE));
+   ro = ((file->f_flags & O_ACCMODE) == O_RDONLY);
mutex_lock(&nvdimm_bus_list_mutex);
list_for_each_entry(nvdimm_bus, &nvdimm_bus_list, list) {
struct device *dev = device_find_child(&nvdimm_bus->dev,
@@ -647,7 +647,7 @@ static long nvdimm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
continue;
 
nvdimm = to_nvdimm(dev);
-   rc = __nd_ioctl(nvdimm_bus, nvdimm, read_only, cmd, arg);
+   rc = __nd_ioctl(nvdimm_bus, nvdimm, ro, cmd, arg);
put_device(dev);
break;
}
-- 
1.7.11.3
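
For readers comparing the two checks: they differ only for descriptors
opened O_WRONLY. The old test treats anything that is not O_RDWR as
read-only, while the new one requires O_RDONLY exactly. A minimal
user-space sketch follows; the helper names old_read_only(),
new_read_only() and check_access_modes() are made up for illustration and
do not appear in the patch.

```c
#include <assert.h>
#include <fcntl.h>

/* Old check: anything that is not O_RDWR counts as read-only, O_WRONLY included. */
static int old_read_only(int f_flags)
{
	return O_RDWR != (f_flags & O_ACCMODE);
}

/* New check: only a descriptor opened O_RDONLY counts as read-only. */
static int new_read_only(int f_flags)
{
	return (f_flags & O_ACCMODE) == O_RDONLY;
}

/* Self-check: the two predicates agree except for O_WRONLY. */
static void check_access_modes(void)
{
	assert(old_read_only(O_RDONLY) && new_read_only(O_RDONLY));
	assert(!old_read_only(O_RDWR) && !new_read_only(O_RDWR));
	assert(old_read_only(O_WRONLY) && !new_read_only(O_WRONLY));
}
```

Masking with O_ACCMODE first matters in both versions, since f_flags also
carries unrelated bits such as O_NONBLOCK.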



Re: [PATCH v3 3/5] perf tools: Fix dynamic sort keys to sort properly

2016-01-06 Thread Arnaldo Carvalho de Melo
Em Wed, Jan 06, 2016 at 09:54:59AM +0900, Namhyung Kim escreveu:
> Currently, the dynamic sort keys compare trace data using memcmp().
> But for output sorting, it should check data size and compare by word.
> Also it sorted strings in reverse order, fix it.

Can this be broken down into two patches? This is complex code, let's try
to make it as bisectable as possible.

- Arnaldo

> 
> Before)
> 
>   $ perf report -F overhead -s prev_pid,next_pid
>   ...
>   # Overhead    prev_pid    next_pid
>   # ........  ..........  ..........
>   #
>        0.39%         490           0
>        9.12%         225           0
>        0.04%         224           0
>        0.51%         731         189
>        0.08%         731           3
>        0.12%         731           0
>        4.82%         729           0
>        0.08%        1229           0
>        0.20%         715           0
>        4.78%         189         225
>   ...
> 
> After)
> 
>   $ perf report -F overhead -s prev_pid,next_pid
>   ...
>   # Overhead    prev_pid    next_pid
>   # ........  ..........  ..........
>   #
>        0.43%           0           7
>        0.04%           0          11
>        0.04%           0          12
>        0.08%           0          14
>        0.04%           0          17
>        0.08%           0          19
>        0.04%           0          22
>        0.04%           0          27
>        0.04%           0          37
>        0.04%           0          42
>   ...
> 
> Reported-by: Arnaldo Carvalho de Melo 
> Signed-off-by: Namhyung Kim 
> ---
>  tools/perf/util/sort.c | 47 ++-
>  1 file changed, 46 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 9618a64875c0..264d2b630549 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1798,6 +1798,51 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt *fmt,
>   return memcmp(a->raw_data + offset, b->raw_data + offset, size);
>  }
>  
> +static int64_t __sort__hde_sort(struct perf_hpp_fmt *fmt,
> + struct hist_entry *a, struct hist_entry *b)
> +{
> + struct hpp_dynamic_entry *hde;
> + struct format_field *field;
> + unsigned offset, size;
> + int64_t *a64, *b64;
> + int32_t *a32, *b32;
> + int16_t *a16, *b16;
> +
> + hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
> +
> + field = hde->field;
> + if (field->flags & FIELD_IS_DYNAMIC) {
> + unsigned long long dyn;
> +
> + pevent_read_number_field(field, a->raw_data, &dyn);
> + offset = dyn & 0xffff;
> + size = (dyn >> 16) & 0xffff;
> + } else {
> + offset = field->offset;
> + size = field->size;
> + }
> +
> + if (field->flags & FIELD_IS_STRING)
> + return strcmp(b->raw_data + offset, a->raw_data + offset);
> +
> + switch (size) {
> + case 8:
> + a64 = a->raw_data + offset;
> + b64 = b->raw_data + offset;
> + return *b64 - *a64;
> + case 4:
> + a32 = a->raw_data + offset;
> + b32 = b->raw_data + offset;
> + return *b32 - *a32;
> + case 2:
> + a16 = a->raw_data + offset;
> + b16 = b->raw_data + offset;
> + return *b16 - *a16;
> + default:
> + return memcmp(b->raw_data + offset, a->raw_data + offset, size);
> + }
> +}
> +
>  bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *fmt)
>  {
>   return fmt->cmp == __sort__hde_cmp;
> @@ -1826,7 +1871,7 @@ __alloc_dynamic_entry(struct perf_evsel *evsel, struct format_field *field)
>  
>   hde->hpp.cmp = __sort__hde_cmp;
>   hde->hpp.collapse = __sort__hde_cmp;
> - hde->hpp.sort = __sort__hde_cmp;
> + hde->hpp.sort = __sort__hde_sort;
>  
>   INIT_LIST_HEAD(&hde->hpp.list);
>   INIT_LIST_HEAD(&hde->hpp.sort_list);
> -- 
> 2.6.4
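
For anyone reading along, the reason memcmp() is fine for grouping but not
for output ordering can be shown with two little-endian values. This is
only a sketch under stated assumptions: load_le32(), cmp_bytes() and
cmp_value() are hypothetical helpers, the values are treated as unsigned,
and the explicit comparison (rather than the subtraction used in the
patch) is a choice made here to avoid overflow, not something the patch
does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode a 32-bit little-endian value from raw bytes (host-independent). */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Byte-wise comparison, as memcmp() does: fine for equality, wrong for numeric order. */
static int cmp_bytes(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, 4);
}

/* Value comparison: returns <0, 0 or >0 without risking subtraction overflow. */
static int cmp_value(const unsigned char *a, const unsigned char *b)
{
	uint32_t va, vb;

	assert(a != NULL && b != NULL);
	va = load_le32(a);
	vb = load_le32(b);
	return (va > vb) - (va < vb);
}
```

With the bytes 00 01 00 00 (256) and 01 00 00 00 (1), cmp_bytes() reports
the first as smaller because it stops at the first differing byte, while
cmp_value() orders them numerically.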


Re: [PATCHv1 2/6] IB/core: Added members to support rdma cgroup

2016-01-06 Thread Parav Pandit
On Wed, Jan 6, 2016 at 3:26 AM, Tejun Heo  wrote:
> On Wed, Jan 06, 2016 at 12:28:02AM +0530, Parav Pandit wrote:
>> Added function pointer table to store resource pool specific
>> operation for each resource type (verb and hw).
>> Added list node to link device to rdma cgroup so that it can
>> participate in resource accounting and limit configuration.
>
> Is there any point in splitting patches 1 and 2 from 3?
>
Patch 2 is in IB stack, so I separated that patch out from 1. That
makes it 3 patches.
If you think single patch is easier to review, let me know, I can
respin to have one patch for these 3 smaller patches.

> --
> tejun


Re: [PATCH] Staging: wlan-ng: p80211conv.c: Coding style fixes General coding style checks have been fixed. Warnings not fixed.

2016-01-06 Thread Greg KH
On Thu, Jan 07, 2016 at 12:45:40AM +0530, Pranjal Bhor wrote:
> Signed-off-by: Pranjal Bhor 

Your subject is very "odd", please fix it up and provide a real
changelog entry as well.

thanks,

greg k-h


Re: [PATCH v5] mtd: support BB SRAM on ICP DAS LP-8x4x

2016-01-06 Thread Brian Norris
On Sun, Dec 20, 2015 at 01:43:58PM +0300, Sergei Ianovich wrote:
> On Sat, 2015-12-19 at 21:38 -0600, Rob Herring wrote:
> > On Tue, Dec 15, 2015 at 09:58:53PM +0300, Sergei Ianovich wrote:
> > > +Required properties:
> > > +- compatible : should be "icpdas,sram-lp8x4x"
> > 
> > No wildcards please. Otherwise looks fine.
> 
> There is a similar review comment from Arnd Bergmann in the discussion
> of `[PATCH v5] serial: support for 16550A serial ports on LP-8x4x`.
> 
> I'll quote my latest clarification:
> > ... This driver will support ports on LP-8081, 

^^ So 8081 doesn't even match the wildcard scheme you give in the
compatible string, proving the point of the Conventional Wisdom
suggestion Rob gave...

> > LP-8141, LP-8441, LP-8841. Last time I checked the vendor was announcing
> > a series with 3 as the last digit. They use lp8x4x name, eg. in
> > documentation like `LP-8x4x_ChangeLog.txt`. They ship their proprietary
> > SDK in `lp8x4x_sdk_for_linux.tar`. All of this implies that it is a
> > single board.
> 
> I think the solution should be the same for all LP-8x4x drivers (IRQ,
> SRAM, SERIAL, IIO).

The rationale is described here:

http://devicetree.org/Device_Tree_Usage#Understanding_the_compatible_Property

Quote:
> Warning: Don't use wildcard compatible values, like "fsl,mpc83xx-uart"
> or similar. Silicon vendors will invariably make a change that breaks
> your wildcard assumptions the moment it is too late to change it.
> Instead, choose a specific silicon implementations and make all
> subsequent silicon compatible with it.

I don't think your circumstance is anything unique.

Regards,
Brian
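
To make the "specific first, later parts compatible with it" idea
concrete, here is a simplified user-space sketch of matching a node's
compatible list against a driver's table. It is not the kernel's matching
code: match_compatible() is a hypothetical helper, and any strings passed
to it are placeholders rather than real bindings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the first entry in the node's "compatible" list
 * (ordered most specific first) that the driver also lists, or -1 if
 * none matches. No wildcard expansion is done: strings match exactly.
 */
static int match_compatible(const char *const *node, size_t n_node,
			    const char *const *driver, size_t n_driver)
{
	size_t i, j;

	assert(node != NULL && driver != NULL);
	for (i = 0; i < n_node; i++)
		for (j = 0; j < n_driver; j++)
			if (strcmp(node[i], driver[j]) == 0)
				return (int)i;
	return -1;
}
```

A later board would list its own exact name first and the original
board's name second, so an existing driver that only knows the original
name still binds through the second entry.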


Re: [PATCH v2] of/unittest: Show broken behaviour in the platform bus

2016-01-06 Thread Rob Herring
On Mon, Jan 4, 2016 at 6:13 AM, Wolfram Sang  wrote:
> From: Grant Likely 
>
> Add a single resource to the test bus device to exercise the platform
> bus code a little more. This isn't strictly a devicetree test, but it is
> a corner case that the devicetree runs into. Until we've got platform
> device unittests, it can live here. It doesn't need to be an explicit
> test because the kernel will oops when it is wrong.
>
> Cc: Pantelis Antoniou 
> Cc: Rob Herring 
> Cc: Greg Kroah-Hartman 
> Cc: Ricardo Ribalda Delgado 
> Signed-off-by: Grant Likely 
> [wsa: added the comment provided by Grant, rebased, and tested]
> Signed-off-by: Wolfram Sang 

Applied, thanks.

Rob


[PATCH v2 RESEND] tools/hv: Use include/uapi with __EXPORTED_HEADERS__

2016-01-06 Thread Kamal Mostafa
Use the local uapi headers to keep in sync with "recently" added #define's
(e.g. VSS_OP_REGISTER1).

Fixes: 3eb2094c59e8 ("Adding makefile for tools/hv")
Cc: 
Signed-off-by: Kamal Mostafa 
Signed-off-by: K. Y. Srinivasan 
---
 tools/hv/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index a8ab795..a8c4644 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -5,6 +5,8 @@ PTHREAD_LIBS = -lpthread
 WARNINGS = -Wall -Wextra
 CFLAGS = $(WARNINGS) -g $(PTHREAD_LIBS) $(shell getconf LFS_CFLAGS)
 
+CFLAGS += -D__EXPORTED_HEADERS__ -I../../include/uapi -I../../include
+
 all: hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
 %: %.c
$(CC) $(CFLAGS) -o $@ $^
-- 
1.9.1


