Re: [PATCH 2/6] wl1251: Use request_firmware_prefer_user() for loading NVS calibration data

2017-01-30 Thread Pali Rohár
On Monday 30 January 2017 18:53:09 Tony Lindgren wrote:
> * Pavel Machek  [170127 11:41]:
> > On Fri 2017-01-27 17:23:07, Kalle Valo wrote:
> > > Pali Rohár  writes:
> > > > On Friday 27 January 2017 14:26:22 Kalle Valo wrote:
> > > >> Pali Rohár  writes:
> > > >> > 2) It was already tested that example NVS data can be used
> > > >> > for N900 e.g. for SSH connection. If real correct data are
> > > >> > not available it is better to use at least those example
> > > >> > (and probably log warning message) so user can connect via
> > > >> > SSH and start investigating where is problem.
> > > >> 
> > > >> I disagree. Allowing default calibration data to be used can
> > > >> be unnoticed by user and left her wondering why wifi works so
> > > >> badly.
> > > > 
> > > > So there are only two options:
> > > > 
> > > > 1) Disallow it and so these users will have non-working wifi.
> > > > 
> > > > 2) Allow those data to be used as fallback mechanism.
> > > > 
> > > > And personally I'm against 1) because it will break wifi
> > > > support for *all* Nokia N900 devices right now.
> > > 
> > > All two of them? :)
> > 
> > Umm. You clearly want a flock of angry penguins at your doorsteps
> > :-).
> 
> Well this silly issue of symlinking and renaming nvs files in a
> standard Linux distro was also hitting me on various devices with
> wl12xx/wl18xx trying to use the same rootfs.

wl12xx/wl18xx probably have exactly the same problem as wl1251.

> Why don't we just set a custom compatible property for n900 that then
> picks up some other nvs file instead of the default?

But that still does not solve this problem correctly. Every N900 device
has a different NVS file. If we allow loading firmware directly from the
VFS without a userspace helper, we would see the same problem again.

-- 
Pali Rohár
pali.ro...@gmail.com


signature.asc
Description: This is a digitally signed message part.


Re: [PATCH v3 07/24] drm/rockchip: dw-mipi-dsi: include bad value in error message

2017-01-30 Thread Sean Paul
On Sun, Jan 29, 2017 at 01:24:27PM +, John Keeping wrote:
> As an aid to debugging.

Reviewed-by: Sean Paul 

> 
> Signed-off-by: John Keeping 
> Reviewed-by: Chris Zhong 
> ---
> v3:
> - Add Chris' Reviewed-by
> Unchanged in v2
> 
>  drivers/gpu/drm/rockchip/dw-mipi-dsi.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
> index 2e6ad4591ebf..92dbc3e56603 100644
> --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
> +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
> @@ -644,7 +644,8 @@ static ssize_t dw_mipi_dsi_host_transfer(struct mipi_dsi_host *host,
>  		ret = dw_mipi_dsi_dcs_long_write(dsi, msg);
>  		break;
>  	default:
> -		dev_err(dsi->dev, "unsupported message type\n");
> +		dev_err(dsi->dev, "unsupported message type 0x%02x\n",
> +			msg->type);
>  		ret = -EINVAL;
>  	}
>  
> -- 
> 2.11.0.197.gb556de5.dirty
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Sean Paul, Software Engineer, Google / Chromium OS


Re: [PATCH v3 06/24] drm/rockchip: dw-mipi-dsi: avoid out-of-bounds read on tx_buf

2017-01-30 Thread Sean Paul
On Sun, Jan 29, 2017 at 01:24:26PM +, John Keeping wrote:
> As a side-effect of this, encode the endianness explicitly rather than
> casting a u16.
> 
> Signed-off-by: John Keeping 
> Reviewed-by: Chris Zhong 
> ---
> v3:
> - Add Chris' Reviewed-by
> Unchanged in v2
> 
>  drivers/gpu/drm/rockchip/dw-mipi-dsi.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
> index 4be1ff3a42bb..2e6ad4591ebf 100644
> --- a/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
> +++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
> @@ -572,8 +572,13 @@ static int dw_mipi_dsi_gen_pkt_hdr_write(struct dw_mipi_dsi *dsi, u32 hdr_val)
>  static int dw_mipi_dsi_dcs_short_write(struct dw_mipi_dsi *dsi,
>  				       const struct mipi_dsi_msg *msg)
>  {
> -	const u16 *tx_buf = msg->tx_buf;
> -	u32 val = GEN_HDATA(*tx_buf) | GEN_HTYPE(msg->type);
> +	const u8 *tx_buf = msg->tx_buf;
> +	u32 val = GEN_HTYPE(msg->type);
> +
> +	if (msg->tx_len > 0)
> +		val |= GEN_HDATA(tx_buf[0]);
> +	if (msg->tx_len > 1)
> +		val |= GEN_HDATA(tx_buf[1] << 8);

You should probably update the mask inside GEN_HDATA to mask off 8 bits
instead of 16.

Sean

>  
>  	if (msg->tx_len > 2) {
>  		dev_err(dsi->dev, "too long tx buf length %zu for short write\n",
> -- 
> 2.11.0.197.gb556de5.dirty
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Sean Paul, Software Engineer, Google / Chromium OS
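
The byte-wise packing in the quoted hunk can be tried out in user space with
a minimal sketch like the one below. The two macros are assumptions modelled
on a header layout with the data type in bits 7:0 and two payload bytes in
bits 23:8; they are not copied from the driver. pack_short_hdr() is a
hypothetical helper that mirrors the patched logic, including the tx_len
checks that avoid reading past the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: data type in bits 7:0, two payload bytes in bits 23:8. */
#define GEN_HDATA(data)	(((uint32_t)(data) & 0xffff) << 8)
#define GEN_HTYPE(type)	(((uint32_t)(type) & 0xff) << 0)

/*
 * Hypothetical helper: build a short-packet header from up to two payload
 * bytes. Each byte is read only if tx_len says it exists, and the byte
 * order is spelled out instead of depending on a u16 cast.
 */
static uint32_t pack_short_hdr(uint8_t type, const uint8_t *tx_buf,
			       size_t tx_len)
{
	uint32_t val = GEN_HTYPE(type);

	if (tx_len > 0)
		val |= GEN_HDATA(tx_buf[0]);
	if (tx_len > 1)
		val |= GEN_HDATA((uint32_t)tx_buf[1] << 8);

	return val;
}
```

Under these assumed macros, payload bytes 0x51, 0xff with type 0x39 pack to
0xff5139, and a zero-length message yields just the type byte.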


Re: [PATCH 5/5] gpio: ws16c48: Add support for GPIO names

2017-01-30 Thread kbuild test robot
Hi William,

[auto build test ERROR on gpio/for-next]
[also build test ERROR on next-20170130]
[cannot apply to v4.10-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/William-Breathitt-Gray/gpio-Add-support-for-GPIO-names-for-several-ISA_BUS_API-drivers/20170131-013038
base:   https://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio.git for-next
config: i386-randconfig-x003-201705 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All errors (new ones prefixed by >>):

   drivers/gpio/gpio-ws16c48.c: In function 'ws16c48_probe':
>> drivers/gpio/gpio-ws16c48.c:381:28: error: 'ws16c48_names' undeclared (first use in this function)
     ws16c48gpio->chip.names = ws16c48_names;
                               ^
   drivers/gpio/gpio-ws16c48.c:381:28: note: each undeclared identifier is reported only once for each function it appears in
   At top level:
   drivers/gpio/gpio-ws16c48.c:345:20: warning: 'ws14c48_names' defined but not used [-Wunused-variable]
    static const char *ws14c48_names[WS16C48_NGPIO] = {
                       ^

vim +/ws16c48_names +381 drivers/gpio/gpio-ws16c48.c

   375  
   376  ws16c48gpio->chip.label = name;
   377  ws16c48gpio->chip.parent = dev;
   378  ws16c48gpio->chip.owner = THIS_MODULE;
   379  ws16c48gpio->chip.base = -1;
   380  ws16c48gpio->chip.ngpio = WS16C48_NGPIO;
 > 381  ws16c48gpio->chip.names = ws16c48_names;
   382  ws16c48gpio->chip.get_direction = ws16c48_gpio_get_direction;
   383  ws16c48gpio->chip.direction_input = ws16c48_gpio_direction_input;
   384  ws16c48gpio->chip.direction_output = ws16c48_gpio_direction_output;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


.config.gz
Description: application/gzip
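
The error and the warning in the log point at the same cause: the array is
defined as ws14c48_names but used as ws16c48_names. A minimal user-space
sketch of the consistent spelling follows. struct chip_sketch stands in for
the two gpio_chip fields involved; the value 48 for WS16C48_NGPIO and the
label strings are assumptions for illustration only.

```c
#include <stddef.h>

#define WS16C48_NGPIO 48	/* assumed line count, for illustration */

/* Stand-in for the two chip fields that the build log touches. */
struct chip_sketch {
	const char *const *names;
	unsigned int ngpio;
};

/* Spelled ws16c48_names, matching the identifier used at line 381. */
static const char *ws16c48_names[WS16C48_NGPIO] = {
	"Port 0 Bit 0", "Port 0 Bit 1",	/* hypothetical labels; the rest stay NULL */
};

/* Hypothetical init helper mirroring the assignments in the listing. */
static void chip_sketch_init(struct chip_sketch *chip)
{
	chip->ngpio = WS16C48_NGPIO;
	chip->names = ws16c48_names;
}
```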


Re: [RFC V2 11/12] mm: Tag VMA with VM_CDM flag during page fault

2017-01-30 Thread Dave Hansen
Here's the flag definition:

> +#ifdef CONFIG_COHERENT_DEVICE
> +#define VM_CDM   0x0080  /* Contains coherent device memory */
> +#endif

But it doesn't match the implementation:

> +#ifdef CONFIG_COHERENT_DEVICE
> +static void mark_vma_cdm(nodemask_t *nmask,
> +		struct page *page, struct vm_area_struct *vma)
> +{
> +	if (!page)
> +		return;
> +
> +	if (vma->vm_flags & VM_CDM)
> +		return;
> +
> +	if (nmask && !nodemask_has_cdm(*nmask))
> +		return;
> +
> +	if (is_cdm_node(page_to_nid(page)))
> +		vma->vm_flags |= VM_CDM;
> +}

That flag is a one-way trip.  Any VMA with that flag set on it will keep
it for the life of the VMA, regardless of whether it has CDM pages in it
now or not, even if you changed the policy back to one that doesn't allow
CDM and forced all the pages to be migrated out.

This also assumes that the only way to get a page mapped into a VMA is
via alloc_pages_vma().  Do the NUMA migration APIs use this path?

When you *set* this flag, you don't go and turn off KSM merging, for
instance.  You keep it from being turned on from this point forward, but
you don't turn it off.

This is happening with mmap_sem held for read.  Correct?  Is it OK that
you're modifying the VMA?  That vm_flags manipulation is non-atomic, so
how can that even be safe?

If you're going to go down this route, I think you need to be very
careful.  We need to ensure that when this flag gets set, it's never set
on VMAs that are "normal" and will only be set on VMAs that were
*explicitly* set up for accessing CDM.  That means that you'll need to
make sure that there's no possible way to get a CDM page faulted into a
VMA unless it's via an explicitly assigned policy that would have caused
the VMA to be split from any "normal" one in the system.

This all makes me really nervous.
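
The concern about the non-atomic vm_flags update can be illustrated in user
space. The sketch below replays one unlucky interleaving of two plain
"flags |= X" updates by hand, then repeats the same two updates with C11
atomics. FLAG_A, FLAG_B and both function names are hypothetical, and C11
atomics are used only to make the point runnable; this is not a suggestion
for how the kernel should update vm_flags, and it does not address the other
concerns raised above.

```c
#include <stdatomic.h>

#define FLAG_A 0x1UL	/* hypothetical flag bits */
#define FLAG_B 0x2UL

/* Replays one unlucky interleaving of two plain "flags |= X" updates. */
static unsigned long racy_update(void)
{
	unsigned long flags = 0;
	unsigned long seen_by_a = flags;	/* updater A loads */
	unsigned long seen_by_b = flags;	/* updater B loads the same value */

	flags = seen_by_a | FLAG_A;		/* A stores its result */
	flags = seen_by_b | FLAG_B;		/* B stores and overwrites A's bit */
	return flags;
}

/* The same two updates as indivisible read-modify-write operations. */
static unsigned long atomic_update(void)
{
	atomic_ulong flags;

	atomic_init(&flags, 0UL);
	atomic_fetch_or(&flags, FLAG_A);
	atomic_fetch_or(&flags, FLAG_B);
	return atomic_load(&flags);
}
```

In the replayed interleaving only FLAG_B survives, whereas the atomic version
ends with both bits set.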


[PATCH] tick/broadcast: Reduce lock cacheline contention

2017-01-30 Thread Waiman Long
It was observed that on an Intel x86 system without the ARAT (Always
Running APIC Timer) feature, with a fairly large number of CPUs and
with CPUs coming in and out of intel_idle frequently, the lock
contention on the tick_broadcast_lock can become significant.

To reduce contention, the lock is put into its own cacheline and all
the cpumask_var_t variables are put into the __read_mostly section.

Running the SP benchmark of the NAS Parallel Benchmarks on a 4-socket
16-core 32-thread Nehalem system, the performance number improved
from 3353.94 Mop/s to 3469.31 Mop/s when this patch was applied on
a 4.9.6 kernel.  This is a 3.4% improvement.

Signed-off-by: Waiman Long 
---
 include/linux/cpumask.h      |  7 ++++++-
 kernel/time/tick-broadcast.c | 15 ++++++++-------
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index c717f5e..23c1a6d 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -649,11 +649,15 @@ static inline size_t cpumask_size(void)
  * used. Please use this_cpu_cpumask_var_t in those cases. The direct use
  * of this_cpu_ptr() or this_cpu_read() will lead to failures when the
  * other type of cpumask_var_t implementation is configured.
+ *
+ * Please also note that __cpumask_var_read_mostly can be used to declare
+ * a cpumask_var_t variable itself (not its content) as read mostly.
  */
 #ifdef CONFIG_CPUMASK_OFFSTACK
 typedef struct cpumask *cpumask_var_t;
 
-#define this_cpu_cpumask_var_ptr(x) this_cpu_read(x)
+#define this_cpu_cpumask_var_ptr(x)	this_cpu_read(x)
+#define __cpumask_var_read_mostly	__read_mostly
 
 bool alloc_cpumask_var_node(cpumask_var_t *mask, gfp_t flags, int node);
 bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags);
@@ -667,6 +671,7 @@ static inline size_t cpumask_size(void)
 typedef struct cpumask cpumask_var_t[1];
 
 #define this_cpu_cpumask_var_ptr(x) this_cpu_ptr(x)
+#define __cpumask_var_read_mostly
 
 static inline bool alloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
 {
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 3109204..244c935 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -29,12 +29,13 @@
  */
 
 static struct tick_device tick_broadcast_device;
-static cpumask_var_t tick_broadcast_mask;
-static cpumask_var_t tick_broadcast_on;
-static cpumask_var_t tmpmask;
-static DEFINE_RAW_SPINLOCK(tick_broadcast_lock);
+static cpumask_var_t tick_broadcast_mask __cpumask_var_read_mostly;
+static cpumask_var_t tick_broadcast_on __cpumask_var_read_mostly;
+static cpumask_var_t tmpmask __cpumask_var_read_mostly;
 static int tick_broadcast_forced;
 
+static __cacheline_aligned_in_smp DEFINE_RAW_SPINLOCK(tick_broadcast_lock);
+
 #ifdef CONFIG_TICK_ONESHOT
 static void tick_broadcast_clear_oneshot(int cpu);
 static void tick_resume_broadcast_oneshot(struct clock_event_device *bc);
@@ -517,9 +518,9 @@ void tick_resume_broadcast(void)
 
 #ifdef CONFIG_TICK_ONESHOT
 
-static cpumask_var_t tick_broadcast_oneshot_mask;
-static cpumask_var_t tick_broadcast_pending_mask;
-static cpumask_var_t tick_broadcast_force_mask;
+static cpumask_var_t tick_broadcast_oneshot_mask __cpumask_var_read_mostly;
+static cpumask_var_t tick_broadcast_pending_mask __cpumask_var_read_mostly;
+static cpumask_var_t tick_broadcast_force_mask __cpumask_var_read_mostly;
 
 /*
  * Exposed for debugging: see timer_list.c
-- 
1.8.3.1
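
The idea of giving a hot lock a cache line of its own can be sketched in user
space as follows. This is only an approximation of what the patch does: the
64-byte line size, struct padded_lock and the helper functions are
assumptions, and alignas on a struct member both aligns and pads the object,
whereas the kernel annotation works through alignment and section placement.

```c
#include <stdalign.h>
#include <stdint.h>

#define CACHELINE_BYTES 64	/* assumed cache line size */

/*
 * Hypothetical lock type. alignas on the member raises the alignment of
 * the whole struct, which also rounds its size up to one full line.
 */
struct padded_lock {
	alignas(CACHELINE_BYTES) int locked;
};

static struct padded_lock broadcast_lock;	/* written on every acquire/release */
static int broadcast_forced;			/* unrelated neighbouring variable */

/* True if p sits at the start of a cache line. */
static int starts_cacheline(const void *p)
{
	return ((uintptr_t)p % CACHELINE_BYTES) == 0;
}

/* True if a and b start within the same cache line. */
static int same_cacheline(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHELINE_BYTES) ==
	       ((uintptr_t)b / CACHELINE_BYTES);
}
```

Because the lock fills an entire aligned line, writes to it cannot invalidate
the line that holds the neighbouring variable.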



Re: [GIT PULL 4/4] arm64: dts: exynos: for v4.11, 2nd round

2017-01-30 Thread Krzysztof Kozlowski
On Sun, Jan 29, 2017 at 09:23:29PM -0800, Olof Johansson wrote:
> Hi Krzysztof,
> 
> On Sun, Jan 29, 2017 at 10:06:29PM +0200, Krzysztof Kozlowski wrote:
> > Hi,
> > 
> > On top of previous pull request.
> > 
> > This adds proper clocks to LPASS node on Exynos5433 which is needed
> > by Marek's patchset:
> >  - [PATCH v2 0/8] Pad retentions support for Exynos5433
> >https://lkml.kernel.org/r/1485419634-28331-1-git-send-email-m.szyprowski 
> > () samsung ! com
> > 
> > 
> > Cc: Marek Szyprowski 
> > Cc: Sylwester Nawrocki 
> > Cc: Linus Walleij 
> > Cc: Tomasz Figa 
> > Cc: Lee Jones 
> > 
> > Best regards,
> > Krzysztof
> > 
> > 
> > The following changes since commit e4e381133241a27d732e78be09973b89a193eaf7:
> > 
> >   arm64: dts: exynos: Enable HDMI/TV path on Exynos5433-TM2 (2017-01-11 18:20:28 +0200)
> > 
> > are available in the git repository at:
> > 
> >   git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git tags/samsung-dt64-4.11-2
> > 
> > for you to fetch changes up to 7547162ac351483df3641f64e99e10be329dd6a2:
> > 
> >   arm64: dts: exynos: Add clocks to Exynos5433 LPASS module (2017-01-26 22:04:20 +0200)
> 
> I think you tagged the wrong branch here. The log message shows the right hash
> at the tip, but the tag is of 95648b747071d530b5bb983735cfe01b66bf, which
> seems to be on your for-next.
> 
> Care to respin, so your tag and our merged branch match up?
>
Fixed: same tag name, but this time on the proper branch:
tag name: samsung-dt64-4.11-2
https://git.kernel.org/cgit/linux/kernel/git/krzk/linux.git/log/?h=next/dt64

Sorry for the mess.

Best regards,
Krzysztof



Re: [PATCH v20 08/17] clocksource/drivers/arm_arch_timer: Rework counter frequency detection.

2017-01-30 Thread Mark Rutland
On Thu, Jan 26, 2017 at 01:49:03PM +0800, Fu Wei wrote:
> On 26 January 2017 at 01:25, Mark Rutland  wrote:
> > On Wed, Jan 25, 2017 at 02:46:12PM +0800, Fu Wei wrote:
> >> On 25 January 2017 at 01:24, Mark Rutland  wrote:
> >> > On Wed, Jan 18, 2017 at 09:25:32PM +0800, fu@linaro.org wrote:
> >> >> From: Fu Wei 

> > For CNT{,EL0}BaseN.CNTFRQ, I am very concerned by the wording in the
> > current ARMv8 ARM ARM. This does not match my understanding, nor does it
> > match the description in the ARMv7 ARM. I believe this may be a
> > documentation error, and I'm chasing that up internally.
> >
> > Either the current logic in the driver which attempts to read
> > CNT{,EL0}BaseN.CNTFRQ is flawed, or the description in the ARM ARM is
> > erroneous.
> 
> Yes, those descriptions did confuse me. :-(
> 
> But according to another document(ARMv8-A Foundation Platform User
> Guide  ARM DUI0677K),
> Table 3-2 ARMv8-A Foundation Platform memory map (continued)
> 
> AP_REFCLK CNTBase0, Generic Timer 64KB   S
> AP_REFCLK CNTBase1, Generic Timer 64KB   S/NS
> 
> Does it mean that timer frame 0 can be accessed in SECURE state only,
> and timer frame 1 can be accessed in both states?

That does appear to be what it says.

I assume in this case CNTCTLBase.CNTSAR<0> is RES0.

> And because Linux kernel is running on Non-secure EL1, so should we
> skip "SECURE" timer in Linux?

I guess you mean by checking the GTx Common flags, to see if the timer
is secure? Yes, we must skip those.

Looking further at this, the ACPI spec is sorely lacking any statement
as to the configuration of CNTCTLBase.{CNTSAR,CNTTIDR,CNTACR}, so it's
not clear if we can access anything in a frame, even if it is listed as
being a non-secure timer.

I think we need a stronger statement here. Otherwise, we will encounter
problems. Linux currently assumes that CNTCTLBase.CNTACR is
writeable, given a non-secure frame N. This is only the case if
CNTCTLBase.CNTSAR.NS == 1.

Thanks,
Mark.
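
The "skip secure timers" check discussed above could look roughly like the
user-space sketch below. The bit positions of the GTx Common Flags (bit 0 for
a secure timer, bit 1 for always-on) are an assumption based on a reading of
the ACPI GTDT description and should be checked against the spec; struct
frame_desc and both helpers are hypothetical. As noted in the mail, a frame
that passes this test is still not guaranteed to be accessible, since that
also depends on how CNTCTLBase.{CNTSAR,CNTTIDR,CNTACR} are configured.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions within the GTx Common Flags field. */
#define GT_COMMON_SECURE	(1u << 0)
#define GT_COMMON_ALWAYS_ON	(1u << 1)

/* Hypothetical, simplified description of one timer frame. */
struct frame_desc {
	uint8_t frame_number;
	uint32_t common_flags;
};

/* A frame flagged as secure must not be touched from the non-secure side. */
static bool frame_usable(const struct frame_desc *f)
{
	return !(f->common_flags & GT_COMMON_SECURE);
}

/* Return the first frame not flagged as secure, or NULL if there is none. */
static const struct frame_desc *first_usable(const struct frame_desc *frames,
					     size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (frame_usable(&frames[i]))
			return &frames[i];
	}
	return NULL;
}
```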


[PATCH] powerpc: sort Kconfig selects under CONFIG_PPC

2017-01-30 Thread Andrew Donnellan
config PPC has a lot of selects under it. They're not sorted in any
particular order, leading to merge conflicts when adding items at the end.

Sort them alphabetically.

Suggested-by: Michael Ellerman <m...@ellerman.id.au>
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
---

On top of linux-next 20170130

---
 arch/powerpc/Kconfig | 128 +--
 1 file changed, 64 insertions(+), 64 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 689cf9218b21..570195c8a86a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -80,91 +80,91 @@ config ARCH_HAS_DMA_SET_COHERENT_MASK
 config PPC
bool
default y
-   select BUILDTIME_EXTABLE_SORT
+   select ARCH_HAS_DEVMEM_IS_ALLOWED
+   select ARCH_HAS_DMA_SET_COHERENT_MASK
+   select ARCH_HAS_ELF_RANDOMIZE
+   select ARCH_HAS_GCOV_PROFILE_ALL
+   select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE
+   select ARCH_HAS_SG_CHAIN
+   select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
+   select ARCH_HAS_UBSAN_SANITIZE_ALL
+   select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
+   select ARCH_SUPPORTS_ATOMIC_RMW
+   select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
+   select ARCH_USE_BUILTIN_BSWAP
+   select ARCH_USE_CMPXCHG_LOCKREF if PPC64
+   select ARCH_WANT_IPC_PARSE_VERSION
+   select ARCH_WEAK_RELEASE_ACQUIRE
select BINFMT_ELF
-   select ARCH_HAS_ELF_RANDOMIZE
-   select OF
-   select OF_EARLY_FLATTREE
-   select OF_RESERVED_MEM
-   select HAVE_FTRACE_MCOUNT_RECORD
+   select BUILDTIME_EXTABLE_SORT
+   select CLONE_BACKWARDS
+   select DCACHE_WORD_ACCESS if PPC64 && CPU_LITTLE_ENDIAN
+   select EDAC_ATOMIC_SCRUB
+   select EDAC_SUPPORT
+   select GENERIC_ATOMIC64 if PPC32
+   select GENERIC_CLOCKEVENTS
+   select GENERIC_CLOCKEVENTS_BROADCAST if SMP
+   select GENERIC_CMOS_UPDATE
+   select GENERIC_CPU_AUTOPROBE
+   select GENERIC_IRQ_SHOW
+   select GENERIC_IRQ_SHOW_LEVEL
+   select GENERIC_SMP_IDLE_THREAD
+   select GENERIC_STRNCPY_FROM_USER
+   select GENERIC_STRNLEN_USER
+   select GENERIC_TIME_VSYSCALL_OLD
+   select HAVE_ARCH_AUDITSYSCALL
+   select HAVE_ARCH_HARDENED_USERCOPY
+   select HAVE_ARCH_JUMP_LABEL
+   select HAVE_ARCH_KGDB
+   select HAVE_ARCH_SECCOMP_FILTER
+   select HAVE_ARCH_TRACEHOOK
+   select HAVE_CBPF_JIT if !PPC64
+   select HAVE_DEBUG_KMEMLEAK
+   select HAVE_DEBUG_STACKOVERFLOW
+   select HAVE_DMA_API_DEBUG
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS if MPROFILE_KERNEL
-   select HAVE_FUNCTION_TRACER
+   select HAVE_EBPF_JIT if PPC64
+   select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
+   select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_TRACER
-   select SYSCTL_EXCEPTION_TRACE
-   select VIRT_TO_BUS if !PPC64
+   select HAVE_FUNCTION_TRACER
+   select HAVE_GENERIC_RCU_GUP
+   select HAVE_HW_BREAKPOINT if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx)
select HAVE_IDE
select HAVE_IOREMAP_PROT
-   select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU)
+   select HAVE_IRQ_EXIT_ON_IRQ_STACK
+   select HAVE_KERNEL_GZIP
select HAVE_KPROBES
-   select HAVE_ARCH_KGDB
select HAVE_KRETPROBES
-   select HAVE_ARCH_TRACEHOOK
+   select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_MEMBLOCK
select HAVE_MEMBLOCK_NODE_MAP
-   select HAVE_DMA_API_DEBUG
+   select HAVE_MOD_ARCH_SPECIFIC
+   select HAVE_NMI if PERF_EVENTS
select HAVE_OPROFILE
-   select HAVE_DEBUG_KMEMLEAK
-   select ARCH_HAS_SG_CHAIN
-   select GENERIC_ATOMIC64 if PPC32
select HAVE_PERF_EVENTS
+   select HAVE_PERF_EVENTS_NMI if PPC64
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
+   select HAVE_RCU_TABLE_FREE if SMP
select HAVE_REGS_AND_STACK_ACCESS_API
-   select HAVE_HW_BREAKPOINT if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx)
-   select ARCH_WANT_IPC_PARSE_VERSION
-   select SPARSE_IRQ
+   select HAVE_SYSCALL_TRACEPOINTS
+   select HAVE_VIRT_CPU_ACCOUNTING
select IRQ_DOMAIN
-   select GENERIC_IRQ_SHOW
-   select GENERIC_IRQ_SHOW_LEVEL
select IRQ_FORCED_THREADING
-   select HAVE_RCU_TABLE_FREE if SMP
-   select HAVE_SYSCALL_TRACEPOINTS
-   select HAVE_CBPF_JIT if !PPC64
-   select HAVE_EBPF_JIT if PPC64
-   select HAVE_ARCH_JUMP_LABEL
-   select ARCH_HAVE_NMI_SAFE_CMPXCHG
-   select ARCH_HAS_GCOV_PROFILE_ALL
-   select GENERIC_SMP_IDLE_THREAD
-   select GENERIC_CMOS_UPDATE
-  


[PATCH] xen-netfront: Delete rx_refill_timer in xennet_disconnect_backend()

2017-01-30 Thread Boris Ostrovsky
rx_refill_timer should be deleted as soon as we disconnect from the
backend since otherwise it is possible for the timer to go off before
we get to xennet_destroy_queues(). If this happens we may dereference
queue->rx.sring which is set to NULL in xennet_disconnect_backend().

Signed-off-by: Boris Ostrovsky 
CC: sta...@vger.kernel.org
---
 drivers/net/xen-netfront.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 8315fe7..722fe9f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1379,6 +1379,8 @@ static void xennet_disconnect_backend(struct netfront_info *info)
 	for (i = 0; i < num_queues && info->queues; ++i) {
 		struct netfront_queue *queue = &info->queues[i];
 
+		del_timer_sync(&queue->rx_refill_timer);
+
 		if (queue->tx_irq && (queue->tx_irq == queue->rx_irq))
 			unbind_from_irqhandler(queue->tx_irq, queue);
 		if (queue->tx_irq && (queue->tx_irq != queue->rx_irq)) {
@@ -1733,7 +1735,6 @@ static void xennet_destroy_queues(struct netfront_info *info)
 
 		if (netif_running(info->netdev))
 			napi_disable(&queue->napi);
-		del_timer_sync(&queue->rx_refill_timer);
 		netif_napi_del(&queue->napi);
 	}
 
-- 
1.8.3.1
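
The ordering argument in the commit message can be modelled in plain userspace C. This is only a toy sketch: `struct toy_queue` and the `toy_*` functions are made-up stand-ins, not the netfront structures, and a boolean replaces the real kernel timer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a netfront queue; not the real struct netfront_queue. */
struct toy_queue {
	bool timer_armed;	/* models a pending rx_refill_timer */
	int *sring;		/* models queue->rx.sring */
};

/* Timer callback: dereferences sring, so it must not run after teardown. */
static int toy_timer_fire(struct toy_queue *q)
{
	if (!q->timer_armed)
		return 0;			/* timer was deleted, nothing runs */
	q->timer_armed = false;
	return q->sring ? *q->sring : -1;	/* -1 models the NULL dereference */
}

/* Disconnect in the patched order: delete the timer, then clear sring. */
static void toy_disconnect(struct toy_queue *q)
{
	q->timer_armed = false;			/* models del_timer_sync() */
	q->sring = NULL;
}
```

If sring is cleared while the timer is still armed, `toy_timer_fire()` reaches the branch that stands for the NULL dereference; after `toy_disconnect()` it can no longer get there.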



Re: [RFC v2 06/10] KVM: arm/arm64: Update the physical timer interrupt level

2017-01-30 Thread Marc Zyngier
On 30/01/17 15:02, Christoffer Dall wrote:
> On Sun, Jan 29, 2017 at 03:21:06PM +0000, Marc Zyngier wrote:
>> On Fri, Jan 27 2017 at 01:04:56 AM, Jintack Lim wrote:
>>> Now that we maintain the EL1 physical timer register states of VMs,
>>> update the physical timer interrupt level along with the virtual one.
>>>
>>> Note that the emulated EL1 physical timer is not mapped to any hardware
>>> timer, so we call a proper vgic function.
>>>
>>> Signed-off-by: Jintack Lim 
>>> ---
>>>  virt/kvm/arm/arch_timer.c | 20 
>>>  1 file changed, 20 insertions(+)
>>>
>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>> index 0f6e935..3b6bd50 100644
>>> --- a/virt/kvm/arm/arch_timer.c
>>> +++ b/virt/kvm/arm/arch_timer.c
>>> @@ -180,6 +180,21 @@ static void kvm_timer_update_mapped_irq(struct kvm_vcpu *vcpu, bool new_level,
>>> WARN_ON(ret);
>>>  }
>>>  
>>> +static void kvm_timer_update_irq(struct kvm_vcpu *vcpu, bool new_level,
>>> +struct arch_timer_context *timer)
>>> +{
>>> +   int ret;
>>> +
>>> +   BUG_ON(!vgic_initialized(vcpu->kvm));
>>
>> Although I've added my fair share of BUG_ON() in the code base, I've
>> since reconsidered my position. If we get in a situation where the vgic
>> is not initialized, maybe it would be better to just WARN_ON and return
>> early rather than killing the whole box. Thoughts?
>>
> 
> The distinction to me is whether this will cause fatal crashes or
> exploits down the road if we're working on uninitialized data.  If all
> that can happen if the vgic is not initialized, is that the guest
> doesn't see interrupts, for example, then a WARN_ON is appropriate.
> 
> Which is the case here?
> 
> That being said, do we need this at all?  This is in the critical path
> and is actually measurable (I know this from my work on the other timer
> series), so it's better to get rid of it if we can.  Can we simply
> convince ourselves this will never happen, and is the code ever likely
> to change so that it gets called with the vgic disabled later?

That'd be the best course of action. I remember us reworking some of
that in the now defunct vgic-less series. Maybe we could salvage that
code, if only for the time we spent on it...

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...
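
For reference, the "WARN_ON and return early" alternative raised earlier in the thread could look roughly like the userspace model below. Everything here is a stand-in: `toy_warn_on()` only imitates the kernel's WARN_ON() by printing to stderr, and `struct toy_vm` is not a KVM structure.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's WARN_ON(): report and keep running. */
static bool toy_warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

struct toy_vm {
	bool vgic_initialized;
	int irq_level;
};

/* Returns 0 on success, -1 if the update was skipped. */
static int toy_update_irq(struct toy_vm *vm, bool new_level)
{
	if (toy_warn_on(!vm->vgic_initialized, "vgic not initialized"))
		return -1;	/* bail out instead of taking the machine down */
	vm->irq_level = new_level;
	return 0;
}
```

The check still costs a branch on every call, which is the overhead the thread would prefer to remove altogether.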


Re: [PATCH 0/7] Fix issues and factorize arm/arm64 capacity information code

2017-01-30 Thread Catalin Marinas
On Mon, Jan 30, 2017 at 12:29:01PM +0000, Juri Lelli wrote:
> I'd need more advice on this set, especially on how and if patch 6 could fly.

Since you got some comments and said that you are going to fix them in
the next version, I guess people are waiting for you to post a new
series.

-- 
Catalin


[RFC 1/3] system-power: Add system power and restart framework

2017-01-30 Thread Thierry Reding
From: Thierry Reding 

This adds a very simple framework that allows drivers to register system
power and restart controllers. The goal of this framework is to replace
the current notifier based mechanism for restart handlers and the power
off equivalent that is the global pm_power_off() function.

Both of these approaches currently lack any means of locking against
concurrently registering handlers or formal definitions on what proper
priorities are to order handlers.

The system-power framework attempts to remedy this by adding a system
power chip object that drivers can embed in their driver-specific data.
A chip contains a description of capabilities that the framework uses
to determine a good sequence of handlers to use for restart and power
off.

Signed-off-by: Thierry Reding 
---
 drivers/base/Makefile|   3 +-
 drivers/base/system-power.c  | 110 +++
 include/linux/system-power.h |  38 +++
 3 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/system-power.c
 create mode 100644 include/linux/system-power.h

diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index f2816f6ff76a..eef165221d9d 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -4,7 +4,8 @@ obj-y   := component.o core.o bus.o dd.o syscore.o \
   driver.o class.o platform.o \
   cpu.o firmware.o init.o map.o devres.o \
   attribute_container.o transport_class.o \
-  topology.o container.o property.o cacheinfo.o
+  topology.o container.o property.o cacheinfo.o \
+  system-power.o
 obj-$(CONFIG_DEVTMPFS) += devtmpfs.o
 obj-$(CONFIG_DMA_CMA) += dma-contiguous.o
 obj-y  += power/
diff --git a/drivers/base/system-power.c b/drivers/base/system-power.c
new file mode 100644
index 000000000000..96c0cb457933
--- /dev/null
+++ b/drivers/base/system-power.c
@@ -0,0 +1,110 @@
+/*
+ * Copyright (c) 2017 NVIDIA Corporation
+ *
+ * This file is released under the GPL v2
+ */
+
+#define pr_fmt(fmt) "system-power: " fmt
+
+#include 
+
+static DEFINE_MUTEX(system_power_lock);
+static LIST_HEAD(system_power_chips);
+
+int system_power_chip_add(struct system_power_chip *chip)
+{
+   if (!chip->ops || (!chip->ops->restart && !chip->ops->power_off)) {
+   WARN(1, pr_fmt("must implement restart or power off\n"));
+   return -EINVAL;
+   }
+
+   mutex_lock(&system_power_lock);
+
+   INIT_LIST_HEAD(&chip->list);
+   list_add_tail(&chip->list, &system_power_chips);
+
+   mutex_unlock(&system_power_lock);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(system_power_chip_add);
+
+int system_power_chip_remove(struct system_power_chip *chip)
+{
+   mutex_lock(&system_power_lock);
+
+   list_del_init(&chip->list);
+
+   mutex_unlock(&system_power_lock);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(system_power_chip_remove);
+
+bool system_can_power_off(void)
+{
+   /* XXX for backwards compatibility */
+   return pm_power_off != NULL;
+}
+
+int system_restart(char *cmd)
+{
+   struct system_power_chip *chip;
+   int err;
+
+   mutex_lock(&system_power_lock);
+
+   list_for_each_entry(chip, &system_power_chips, list) {
+   if (!chip->ops->restart)
+   continue;
+
+   pr_debug("trying to restart using %ps\n", chip);
+
+   err = chip->ops->restart(chip, reboot_mode, cmd);
+   if (err < 0)
+   dev_warn(chip->dev, "failed to restart: %d\n", err);
+   }
+
+   mutex_unlock(&system_power_lock);
+
+   /* XXX for backwards compatibility */
+   do_kernel_restart(cmd);
+
+   return 0;
+}
+
+int system_power_off_prepare(void)
+{
+   /* XXX for backwards compatibility */
+   if (pm_power_off_prepare)
+   pm_power_off_prepare();
+
+   return 0;
+}
+
+int system_power_off(void)
+{
+   struct system_power_chip *chip;
+   int err;
+
+   mutex_lock(&system_power_lock);
+
+   list_for_each_entry(chip, &system_power_chips, list) {
+   if (!chip->ops->power_off)
+   continue;
+
+   pr_debug("trying to power off using %ps\n", chip);
+
+   err = chip->ops->power_off(chip);
+   if (err < 0)
+   dev_warn(chip->dev, "failed to power off: %d\n", err);
+   }
+
+   mutex_unlock(&system_power_lock);
+
+   /* XXX for backwards compatibility */
+   if (pm_power_off)
+   pm_power_off();
+
+   return 0;
+}
diff --git a/include/linux/system-power.h b/include/linux/system-power.h
new file mode 100644
index 000000000000..f709c14c1552
--- /dev/null
+++ b/include/linux/system-power.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (c) 2017 NVIDIA Corporation
+ *
+ * This file is released under the GPL v2
+ */
+
+#ifndef SYSTEM_POWER_H
+#define SYSTEM_POWER_H
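
The registration and dispatch logic in the patch above can be tried out with a small userspace model. This is a sketch under stated assumptions: the `toy_*` types are invented for illustration (the real struct system_power_chip is not shown in full here), a singly linked list replaces the kernel list, locking is left out, and the restart callback takes fewer arguments than the one in the patch.

```c
#include <errno.h>
#include <stddef.h>

struct toy_chip;

struct toy_chip_ops {
	int (*restart)(struct toy_chip *chip);
	int (*power_off)(struct toy_chip *chip);
};

struct toy_chip {
	const struct toy_chip_ops *ops;
	struct toy_chip *next;
};

static struct toy_chip *toy_chips;	/* head of the registered list */

/* Mirrors the validity check in system_power_chip_add(). */
static int toy_chip_add(struct toy_chip *chip)
{
	struct toy_chip **p = &toy_chips;

	if (!chip->ops || (!chip->ops->restart && !chip->ops->power_off))
		return -EINVAL;

	while (*p)
		p = &(*p)->next;	/* append at the tail */
	chip->next = NULL;
	*p = chip;
	return 0;
}

/* Try every restart handler in registration order; count the attempts. */
static int toy_restart(void)
{
	struct toy_chip *chip;
	int tried = 0;

	for (chip = toy_chips; chip; chip = chip->next) {
		if (!chip->ops->restart)
			continue;
		chip->ops->restart(chip);
		tried++;
	}
	return tried;
}

/* Example handler that pretends the restart request was accepted. */
static int toy_noop_restart(struct toy_chip *chip)
{
	(void)chip;
	return 0;
}
```

A chip whose ops provide neither callback is rejected at registration, so the dispatch loop only has to skip chips that lack the one callback it is looking for.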


Re: irq domain hierarchy vs. chaining w/ PCI MSI-X...

2017-01-30 Thread David Daney

On 01/30/2017 05:32 AM, Thomas Gleixner wrote:
> On Fri, 13 Jan 2017, David Daney wrote:
>> At the point where the handle_*_irq() functions call handle_irq_event(), we
>> need to (optionally) do something both immediately before and after the call
>> to handle_irq_event().
>>
>> In irq_chip add a function:
>>
>> void (*irq_handle)(struct irq_data *data, struct irq_desc *desc);
>>
>> Really this is the per irq_chip flow handler.
>>
>> Then in handle_fasteoi_irq() and probably the other flow handlers as well:
>>
>>    .
>>    .
>>    .
>>    if (chip->irq_handle)
>>       chip->irq_handle(&desc->irq_data, desc);
>>    else
>>       handle_irq_event(desc);
>>    .
>>    .
>>    .
>>
>> Those 4 lines of code could be factored out into a helper function in chip.c
>
> And why don't you just write a flow handler function which does exactly
> what you need instead of adding more conditionals into all hotpath
> functions?

I came to the same conclusion myself and have already started to
implement it that way.

In my particular use case, we already know for certain which flow
handler the parent irqdomain is using, so it is easy to copy it and add
the extra handling needed by the child irq_chip.

In the general case, a driver cannot know which flow handler the parent
irqdomain is using, so it is impossible to know how to write a new flow
handler that implements the desired semantics of the irq_chip hierarchy.

David Daney
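
The dedicated flow handler approach agreed on above can be illustrated with a small userspace model. All names are invented for the sketch: `toy_handle_irq_event()` merely stands in for handle_irq_event(), and the recorded step numbers exist only to show the ordering.

```c
struct toy_desc {
	int log[3];	/* records the order in which the steps ran */
	int n;
};

static void toy_record(struct toy_desc *d, int step)
{
	if (d->n < 3)
		d->log[d->n++] = step;
}

/* Stand-in for handle_irq_event(). */
static void toy_handle_irq_event(struct toy_desc *d)
{
	toy_record(d, 2);
}

/* Child-specific flow handler: a copy of the parent's flow plus extra steps. */
static void toy_child_flow_handler(struct toy_desc *d)
{
	toy_record(d, 1);		/* extra work before the event */
	toy_handle_irq_event(d);
	toy_record(d, 3);		/* extra work after the event */
}
```

The common handlers stay free of extra conditionals, at the cost that the child has to know which parent flow it is copying, which is the limitation noted for the general case.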




Re: [RFC V2 12/12] mm: Tag VMA with VM_CDM flag explicitly during mbind(MPOL_BIND)

2017-01-30 Thread Dave Hansen
On 01/29/2017 07:35 PM, Anshuman Khandual wrote:
> + if ((new_pol->mode == MPOL_BIND)
> + && nodemask_has_cdm(new_pol->v.nodes))
> + set_vm_cdm(vma);

So, if you did:

mbind(addr, PAGE_SIZE, MPOL_BIND, all_nodes, ...);
mbind(addr, PAGE_SIZE, MPOL_BIND, one_non_cdm_node, ...);

You end up with a VMA that can never have KSM done on it, etc...  Even
though there's no good reason for it.  I guess /proc/$pid/smaps might be
able to help us figure out what was going on here, but that still seems
like an awful lot of damage.
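
The sticky-flag problem can be reproduced in a few lines of userspace C. This is only a model: `TOY_VM_CDM` and `struct toy_vma` are made up, and the second function is one possible alternative assumed here for contrast, not something proposed in the thread.

```c
#include <stdbool.h>

#define TOY_VM_CDM 0x1u

struct toy_vma {
	unsigned int flags;
};

/* Behaviour as quoted: the flag is set when a CDM node is bound, never cleared. */
static void toy_mbind_sticky(struct toy_vma *vma, bool nodes_have_cdm)
{
	if (nodes_have_cdm)
		vma->flags |= TOY_VM_CDM;
}

/* Assumed alternative for contrast: re-evaluate the flag on every call. */
static void toy_mbind_reevaluate(struct toy_vma *vma, bool nodes_have_cdm)
{
	if (nodes_have_cdm)
		vma->flags |= TOY_VM_CDM;
	else
		vma->flags &= ~TOY_VM_CDM;
}
```

Calling the sticky variant first with a CDM node and then without one leaves the flag set, which corresponds to the two mbind() calls in the example above.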


Re: [PATCH] printk: fix printk.devkmsg sysctl

2017-01-30 Thread Borislav Petkov
On Mon, Jan 30, 2017 at 06:31:38PM +0100, Rabin Vincent wrote:
> Would it be possible for you to please submit it as a patch yourself so
> that this gets fixed in the way you like?  Thank you.

Sure. I'll add your Reported-by when we're done.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


Re: [PATCH 2/6] wl1251: Use request_firmware_prefer_user() for loading NVS calibration data

2017-01-30 Thread Tony Lindgren
* Pavel Machek  [170127 11:41]:
> On Fri 2017-01-27 17:23:07, Kalle Valo wrote:
> > Pali Rohár  writes:
> > 
> > > On Friday 27 January 2017 14:26:22 Kalle Valo wrote:
> > >> Pali Rohár  writes:
> > >> 
> > >> > 2) It was already tested that example NVS data can be used for N900
> > >> > e.g. for SSH connection. If real correct data are not available it is
> > >> > better to use at least those example (and probably log warning message)
> > >> > so user can connect via SSH and start investigating where is problem.
> > >> 
> > >> I disagree. Allowing default calibration data to be used can be
> > >> unnoticed by user and left her wondering why wifi works so badly.
> > >
> > > So there are only two options:
> > >
> > > 1) Disallow it and so these users will have non-working wifi.
> > >
> > > 2) Allow those data to be used as fallback mechanism.
> > >
> > > And personally I'm against 1) because it will break wifi support for
> > > *all* Nokia N900 devices right now.
> > 
> > All two of them? :)
> 
> Umm. You clearly want a flock of angry penguins at your doorsteps :-).

Well this silly issue of symlinking and renaming nvs files in a standard
Linux distro was also hitting me on various devices with wl12xx/wl18xx
trying to use the same rootfs.

Why don't we just set a custom compatible property for n900 that then
picks up some other nvs file instead of the default?
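
The suggestion above could be sketched roughly as follows. This is plain userspace C rather than driver code, and the compatible strings, the N900-specific file name and the helper nvs_name_for() are all hypothetical; the default path is assumed to match the name the wl1251 driver requests today.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table mapping a compatible string to an NVS file name. */
struct nvs_match {
	const char *compatible;
	const char *nvs_name;
};

/* Assumed default; verify against the driver before relying on it. */
#define DEFAULT_NVS_NAME "ti-connectivity/wl1251-nvs.bin"

static const struct nvs_match nvs_table[] = {
	/* Illustrative entry only, not taken from any existing binding. */
	{ "nokia,n900-wl1251", "ti-connectivity/wl1251-nvs-n900.bin" },
	{ "ti,wl1251",         DEFAULT_NVS_NAME },
};

/* Return the NVS file for the first matching compatible, else the default. */
static const char *nvs_name_for(const char *compatible)
{
	size_t i;

	if (!compatible)
		return DEFAULT_NVS_NAME;

	for (i = 0; i < sizeof(nvs_table) / sizeof(nvs_table[0]); i++)
		if (strcmp(compatible, nvs_table[i].compatible) == 0)
			return nvs_table[i].nvs_name;

	return DEFAULT_NVS_NAME;
}
```

A table like this only selects one file per board type, so it does not by itself address per-device calibration data.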

Regards,

Tony


Re: [PATCH v6 5/5] Documentation:powerpc: Add device-tree bindings for power-mgt

2017-01-30 Thread Rob Herring
On Wed, Jan 25, 2017 at 02:06:29PM +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" 
> 
> Document the device-tree bindings defining the properties under
> the @power-mgt node in the device tree that describe the idle states
> for Linux running on baremetal POWER servers.
> 
> These bindings are documented separately instead of using the
> common idle state bindings since the idle-states on POWER servers
> are exposed as property arrays whereas the common idle state bindings
> expect idle-states to be described as nodes.
> 
> Cc: Rob Herring 
> Signed-off-by: Gautham R. Shenoy 
> ---
>  .../devicetree/bindings/powerpc/opal/power-mgt.txt | 118 +
>  1 file changed, 118 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/powerpc/opal/power-mgt.txt

Acked-by: Rob Herring 


Re: [PATCH RESEND 2/5] dmaengine: Provide a wrapper for memcpy operations

2017-01-30 Thread Boris Brezillon
Hi Vinod,

On Mon, 30 Jan 2017 22:24:17 +0530
Vinod Koul  wrote:

> On Fri, Jan 27, 2017 at 05:42:01PM +0100, Boris Brezillon wrote:
> > Almost all ->device_prep_dma_xx() methods have a wrapper defined in
> > dmaengine.h. Add one for  ->device_prep_dma_memcpy().  
> 
> Looks good to me.
> 
> Acked-by: Vinod Koul 
> 
> 

Maybe you can take this patch directly (the NAND related bits are for
4.12).
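
For readers without the patch at hand, the wrapper pattern being acked can be sketched roughly like this. It is a userspace mock: mock_chan, mock_device, mock_desc and demo_prep() are hypothetical stand-ins, and the real prototype in dmaengine.h takes kernel types and a flags argument that are left out here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the dmaengine types; not the kernel structures. */
struct mock_chan;

struct mock_desc {
	size_t len;
};

struct mock_device {
	/* Optional driver hook, NULL when memcpy offload is not supported. */
	struct mock_desc *(*device_prep_dma_memcpy)(struct mock_chan *chan,
						    unsigned long dst,
						    unsigned long src,
						    size_t len);
};

struct mock_chan {
	struct mock_device *device;
};

/* Wrapper: callers no longer dereference the method pointer themselves. */
static struct mock_desc *mock_prep_dma_memcpy(struct mock_chan *chan,
					      unsigned long dst,
					      unsigned long src, size_t len)
{
	if (!chan || !chan->device || !chan->device->device_prep_dma_memcpy)
		return NULL;

	return chan->device->device_prep_dma_memcpy(chan, dst, src, len);
}

/* Trivial provider used only to exercise the wrapper. */
static struct mock_desc demo_desc;

static struct mock_desc *demo_prep(struct mock_chan *chan, unsigned long dst,
				   unsigned long src, size_t len)
{
	(void)chan;
	(void)dst;
	(void)src;
	demo_desc.len = len;
	return &demo_desc;
}
```

The point of the wrapper is that a client gets a NULL descriptor instead of a NULL pointer dereference when a controller lacks the memcpy hook.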


Re: [PATCH] Staging: omap4iss: fix coding style issues

2017-01-30 Thread Laurent Pinchart
Hello Avraham,

Thank you for the patch.

On Saturday 28 Jan 2017 20:00:08 Avraham Shukron wrote:
> This is a patch that fixes checkpatch.pl issues in omap4iss/iss_video.c
> Specifically, it fixes "line over 80 characters" issues
> 
> Signed-off-by: Avraham Shukron 

This looks OK to me. I've applied the patch to my tree and will push it to 
v4.11.

Acked-by: Laurent Pinchart 

> ---
>  drivers/staging/media/omap4iss/iss_video.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/staging/media/omap4iss/iss_video.c b/drivers/staging/media/omap4iss/iss_video.c
> index c16927a..cdab053 100644
> --- a/drivers/staging/media/omap4iss/iss_video.c
> +++ b/drivers/staging/media/omap4iss/iss_video.c
> @@ -298,7 +298,8 @@ iss_video_check_format(struct iss_video *video, struct iss_video_fh *vfh)
> 
>  static int iss_video_queue_setup(struct vb2_queue *vq,
>                                   unsigned int *count, unsigned int *num_planes,
> -                                 unsigned int sizes[], struct device *alloc_devs[])
> +                                 unsigned int sizes[],
> +                                 struct device *alloc_devs[])
>  {
>          struct iss_video_fh *vfh = vb2_get_drv_priv(vq);
>          struct iss_video *video = vfh->video;
> @@ -678,8 +679,8 @@ iss_video_get_selection(struct file *file, void *fh, struct v4l2_selection *sel)
>          if (subdev == NULL)
>                  return -EINVAL;
> 
> -        /* Try the get selection operation first and fallback to get format if not
> -         * implemented.
> +        /* Try the get selection operation first and fallback to get format if
> +         * not implemented.
>           */
>          sdsel.pad = pad;
>          ret = v4l2_subdev_call(subdev, pad, get_selection, NULL, &sdsel);

-- 
Regards,

Laurent Pinchart



Re: [PATCH] perf/x86/intel/rapl: Rename rapl_cpu_prepare() to rapl_cpu_starting()

2017-01-30 Thread Yasuaki Ishimatsu



On 01/30/2017 11:56 AM, Thomas Gleixner wrote:
> On Mon, 30 Jan 2017, Yasuaki Ishimatsu wrote:
>
>> Hi Thomas,
>>
>> Do you have any idea to fix the issue?
>> If you have the idea, please send me the patch.
>
> Yes, I have a patch, but need to do some tests and get changelogs
> written. Will keep you updated.

Great!!
I'll wait for your patch.

Thanks,
Yasuaki Ishimatsu

> Thanks,
>
>	tglx



Re: [RFC v2 05/10] KVM: arm/arm64: Initialize the emulated EL1 physical timer

2017-01-30 Thread Marc Zyngier
On 30/01/17 14:58, Christoffer Dall wrote:
> On Sun, Jan 29, 2017 at 12:07:48PM +0000, Marc Zyngier wrote:
>> On Fri, Jan 27 2017 at 01:04:55 AM, Jintack Lim wrote:
>>> Initialize the emulated EL1 physical timer with the default irq number.
>>>
>>> Signed-off-by: Jintack Lim 
>>> ---
>>>  arch/arm/kvm/reset.c | 9 -
>>>  arch/arm64/kvm/reset.c   | 9 -
>>>  include/kvm/arm_arch_timer.h | 3 ++-
>>>  virt/kvm/arm/arch_timer.c| 9 +++--
>>>  4 files changed, 25 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
>>> index 4b5e802..1da8b2d 100644
>>> --- a/arch/arm/kvm/reset.c
>>> +++ b/arch/arm/kvm/reset.c
>>> @@ -37,6 +37,11 @@
>>> .usr_regs.ARM_cpsr = SVC_MODE | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT,
>>>  };
>>>  
>>> +static const struct kvm_irq_level cortexa_ptimer_irq = {
>>> +   { .irq = 30 },
>>> +   .level = 1,
>>> +};
>>
>> At some point, we'll have to make that discoverable/configurable. Maybe
>> the time for a discoverable arch timer has come (see below).
>>
>>> +
>>>  static const struct kvm_irq_level cortexa_vtimer_irq = {
>>> { .irq = 27 },
>>> .level = 1,
>>> @@ -58,6 +63,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>>  {
>>> struct kvm_regs *reset_regs;
>>> const struct kvm_irq_level *cpu_vtimer_irq;
>>> +   const struct kvm_irq_level *cpu_ptimer_irq;
>>>  
>>> switch (vcpu->arch.target) {
>>> case KVM_ARM_TARGET_CORTEX_A7:
>>> @@ -65,6 +71,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>> reset_regs = &cortexa_regs_reset;
>>> vcpu->arch.midr = read_cpuid_id();
>>> cpu_vtimer_irq = &cortexa_vtimer_irq;
>>> +   cpu_ptimer_irq = &cortexa_ptimer_irq;
>>> break;
>>> default:
>>> return -ENODEV;
>>> @@ -77,5 +84,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>> kvm_reset_coprocs(vcpu);
>>>  
>>> /* Reset arch_timer context */
>>> -   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>>> +   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq, cpu_ptimer_irq);
>>>  }
>>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>>> index e95d4f6..d9e9697 100644
>>> --- a/arch/arm64/kvm/reset.c
>>> +++ b/arch/arm64/kvm/reset.c
>>> @@ -46,6 +46,11 @@
>>> COMPAT_PSR_I_BIT | COMPAT_PSR_F_BIT),
>>>  };
>>>  
>>> +static const struct kvm_irq_level default_ptimer_irq = {
>>> +   .irq= 30,
>>> +   .level  = 1,
>>> +};
>>> +
>>>  static const struct kvm_irq_level default_vtimer_irq = {
>>> .irq= 27,
>>> .level  = 1,
>>> @@ -104,6 +109,7 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
>>>  int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>>  {
>>> const struct kvm_irq_level *cpu_vtimer_irq;
>>> +   const struct kvm_irq_level *cpu_ptimer_irq;
>>> const struct kvm_regs *cpu_reset;
>>>  
>>> switch (vcpu->arch.target) {
>>> @@ -117,6 +123,7 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>> }
>>>  
>>> cpu_vtimer_irq = &default_vtimer_irq;
>>> +   cpu_ptimer_irq = &default_ptimer_irq;
>>> break;
>>> }
>>>  
>>> @@ -130,5 +137,5 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
>>> kvm_pmu_vcpu_reset(vcpu);
>>>  
>>> /* Reset timer */
>>> -   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq);
>>> +   return kvm_timer_vcpu_reset(vcpu, cpu_vtimer_irq, cpu_ptimer_irq);
>>>  }
>>> diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
>>> index 69f648b..a364593 100644
>>> --- a/include/kvm/arm_arch_timer.h
>>> +++ b/include/kvm/arm_arch_timer.h
>>> @@ -59,7 +59,8 @@ struct arch_timer_cpu {
>>>  int kvm_timer_enable(struct kvm_vcpu *vcpu);
>>>  void kvm_timer_init(struct kvm *kvm);
>>>  int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>> -const struct kvm_irq_level *irq);
>>> +const struct kvm_irq_level *virt_irq,
>>> +const struct kvm_irq_level *phys_irq);
>>>  void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu);
>>>  void kvm_timer_flush_hwstate(struct kvm_vcpu *vcpu);
>>>  void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu);
>>> diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
>>> index f72005a..0f6e935 100644
>>> --- a/virt/kvm/arm/arch_timer.c
>>> +++ b/virt/kvm/arm/arch_timer.c
>>> @@ -329,9 +329,11 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
>>>  }
>>>  
>>>  int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>> -const struct kvm_irq_level *irq)
>>> +const struct kvm_irq_level *virt_irq,
>>> +const struct kvm_irq_level *phys_irq)
>>>  {
>>> struct arch_timer_context *vtimer = vcpu_vtimer(vcpu);
>>> +   struct arch_timer_context *ptimer = vcpu_ptimer(vcpu);
>>>  
>>> /*
>>>  * The vcpu timer irq number cannot be determined in
>>> @@ -339,7 +341,8 @@ int kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
>>>  * kvm_vcpu_set_target(). To handle this, we determine
>>>  * vcpu timer 

Re: [PATCH] i2c: at91: ensure state is restored after suspending

2017-01-30 Thread Wolfram Sang

> + at91_init_twi_bus(twi_dev);

Can't this function reinit the registers to the needed values? I am not
convinced that a cache will always reflect the proper state after
resume.





Re: [RFC v2 10/10] KVM: arm/arm64: Emulate the EL1 phys timer register access

2017-01-30 Thread Marc Zyngier
On 30/01/17 17:26, Peter Maydell wrote:
> On 30 January 2017 at 17:08, Jintack Lim  wrote:
>> On Sun, Jan 29, 2017 at 10:44 AM, Marc Zyngier  wrote:
>>> Shouldn't we take the ENABLE bit into account? The ARMv8 ARM version I
>>> have at hand (version h) seems to indicate that we should, but we should
>>> check with the latest and greatest...
>>
>> Thanks! I was not clear about this. I have ARM ARM version k, and it
>> says that 'When the value of the ENABLE bit is 0, the ISTATUS field is
>> UNKNOWN.' So I thought the istatus value doesn't matter if ENABLE is
>> 0, and just set istatus bit regardless of ENABLE bit. If this is not
>> what the manual meant, then I'm happy to fix this.
> 
> It looks like the spec has been relaxed between the doc version
> that Marc was looking at and the current one. So it's OK for
> an implementation to either (a) set ISTATUS to 0 if ENABLE
> is 0, or (b) do what you've done and set ISTATUS according
> to the timer comparison whether ENABLE is clear or not
> (or even (c) set ISTATUS to a random value if ENABLE is clear,
> and other less likely choices).
> I think we should add a comment to note that it's architecturally
> UNKNOWN and we've made a choice for our implementation convenience.

In that case, the proposed implementation is perfectly fine. I'll retire
the old ARMv8 ARM from my laptop (funnily enough, I didn't fancy
downloading version k while on the train and having my phone as my link
to the outside world... ;-).
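
A minimal sketch of option (b) with the comment Peter asks for might look like the following. The bit positions follow the architectural layout of the timer control register (ENABLE bit 0, IMASK bit 1, ISTATUS bit 2); the helper names emul_ctl_read() and emul_irq_asserted() are hypothetical and are not the functions from the series.

```c
#include <stdbool.h>
#include <stdint.h>

#define CNT_CTL_ENABLE   (1u << 0)
#define CNT_CTL_IMASK    (1u << 1)
#define CNT_CTL_ISTATUS  (1u << 2)

/*
 * ISTATUS is architecturally UNKNOWN while ENABLE is 0. For implementation
 * convenience this sketch always reports the result of the timer comparison,
 * whether or not ENABLE is set.
 */
static uint32_t emul_ctl_read(uint32_t ctl, uint64_t cval, uint64_t now)
{
	ctl &= ~CNT_CTL_ISTATUS;
	if (now >= cval)
		ctl |= CNT_CTL_ISTATUS;

	return ctl;
}

/* The interrupt itself still requires ENABLE set and IMASK clear. */
static bool emul_irq_asserted(uint32_t ctl, uint64_t cval, uint64_t now)
{
	return (ctl & CNT_CTL_ENABLE) && !(ctl & CNT_CTL_IMASK) && now >= cval;
}
```

Only the value read back by the guest depends on the UNKNOWN choice; a disabled timer never raises the interrupt in this sketch.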

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


[RFC 0/3] Add system power and restart framework

2017-01-30 Thread Thierry Reding
From: Thierry Reding 

Hi everyone,

This series of patches proposes a small framework targeted at system
power and restart drivers. Restart drivers currently use a notifier
chain and there was an attempt by Guenter Roeck a while ago to move
power off drivers to similar infrastructure[0]. That attempt had met
with some pushback, with the main criticism being that there was no
formal definition of the priorities of these handlers.

The system power and restart framework tries to solve this by adding
a more explicit framework that power and restart drivers can register
with. This is currently very simple, but it is meant primarily as a
basis for discussion so that we can reach consensus on what we want
such a framework to look like (and if we need one at all).

There was a bit of discussion on this two weeks ago[1], and this set
is an attempt at implementing my proposal from that discussion[2]. A
formal mechanism to implement priorities on top of this could easily
be added (see linked discussion).

One very big advantage of this method is that we have a very easy way
of keeping backwards compatibility with the restart notifier chain and
the pm_power_off() mechanism, which means we can convert drivers one
by one and evolve the framework incrementally without having to have a
flag day where everyone needs to convert.

So the goal of this series is to get some feedback on whether or not the
people involved in earlier discussions around this think this is sound.
If so I've got a couple of patches on top that convert architectures
over to using the new function calls and a couple of drivers that I have
converted as a proof of concept.
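
To make the registration idea concrete, here is a userspace sketch of what such a framework could look like. Every name in it (power_chip, power_chip_add(), system_do_restart() and the demo callbacks) is hypothetical and is not taken from the patches; the "first registered chip that succeeds wins" policy is also just an assumption, since priorities are left open above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical provider object: a driver fills in the callbacks it supports. */
struct power_chip {
	int (*restart)(struct power_chip *chip);
	int (*power_off)(struct power_chip *chip);
	struct power_chip *next;
};

static struct power_chip *chips;

/* Register a provider explicitly instead of hooking a notifier chain. */
static int power_chip_add(struct power_chip *chip)
{
	struct power_chip **p = &chips;

	if (!chip || (!chip->restart && !chip->power_off))
		return -EINVAL;

	while (*p)
		p = &(*p)->next;

	chip->next = NULL;
	*p = chip;
	return 0;
}

/* Try providers in registration order until one reports success. */
static int system_do_restart(void)
{
	struct power_chip *chip;

	for (chip = chips; chip; chip = chip->next)
		if (chip->restart && chip->restart(chip) == 0)
			return 0;

	return -ENODEV;
}

/* Demo callbacks used only to exercise the sketch. */
static int demo_restart_fail(struct power_chip *chip)
{
	(void)chip;
	return -EIO;
}

static int demo_restart_ok(struct power_chip *chip)
{
	(void)chip;
	return 0;
}
```

A legacy fallback (restart notifier chain, pm_power_off()) would slot in where this sketch returns -ENODEV, which is what allows drivers to be converted one at a time.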

Thanks,
Thierry

[0]: https://lkml.org/lkml/2014/11/6/505
[1]: https://lkml.org/lkml/2017/1/12/470
[2]: https://lkml.org/lkml/2017/1/20/89

Thierry Reding (3):
  system-power: Add system power and restart framework
  kernel: Wire up system power framework
  PM / hibernate: Wire up system-power framework

 drivers/base/Makefile|   3 +-
 drivers/base/system-power.c  | 110 +++
 include/linux/system-power.h |  38 +++
 kernel/power/hibernate.c |   3 +-
 kernel/reboot.c  |  11 +++--
 5 files changed, 158 insertions(+), 7 deletions(-)
 create mode 100644 drivers/base/system-power.c
 create mode 100644 include/linux/system-power.h

-- 
2.11.0



[PATCH 1/2] iommu/vt-d: Fix some macros that are incorrectly specified in intel-iommu

2017-01-30 Thread Ashok Raj
From: CQ Tang 

Some of the macros are incorrect with wrong bit-shifts resulting in picking
the incorrect invalidation granularity. Incorrect Source-ID in extended
devtlb invalidation caused device side errors.

To: Joerg Roedel 
To: David Woodhouse 
Cc: io...@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Cc: sta...@vger.kernel.org
Cc: CQ Tang 
Cc: Ashok Raj 

Fixes: 2f26e0a9 ("iommu/vt-d: Add basic SVM PASID support")
Signed-off-by: CQ Tang 
Signed-off-by: Ashok Raj 
Tested-by: CQ Tang 
---
 include/linux/intel-iommu.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index d49e26c..23e129e 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -153,8 +153,8 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 #define DMA_TLB_GLOBAL_FLUSH (((u64)1) << 60)
 #define DMA_TLB_DSI_FLUSH (((u64)2) << 60)
 #define DMA_TLB_PSI_FLUSH (((u64)3) << 60)
-#define DMA_TLB_IIRG(type) ((type >> 60) & 7)
-#define DMA_TLB_IAIG(val) (((val) >> 57) & 7)
+#define DMA_TLB_IIRG(type) ((type >> 60) & 3)
+#define DMA_TLB_IAIG(val) (((val) >> 57) & 3)
 #define DMA_TLB_READ_DRAIN (((u64)1) << 49)
 #define DMA_TLB_WRITE_DRAIN (((u64)1) << 48)
 #define DMA_TLB_DID(id) (((u64)((id) & 0xffff)) << 32)
@@ -164,9 +164,9 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 
 /* INVALID_DESC */
 #define DMA_CCMD_INVL_GRANU_OFFSET  61
-#define DMA_ID_TLB_GLOBAL_FLUSH (((u64)1) << 3)
-#define DMA_ID_TLB_DSI_FLUSH   (((u64)2) << 3)
-#define DMA_ID_TLB_PSI_FLUSH   (((u64)3) << 3)
+#define DMA_ID_TLB_GLOBAL_FLUSH (((u64)1) << 4)
+#define DMA_ID_TLB_DSI_FLUSH   (((u64)2) << 4)
+#define DMA_ID_TLB_PSI_FLUSH   (((u64)3) << 4)
 #define DMA_ID_TLB_READ_DRAIN  (((u64)1) << 7)
 #define DMA_ID_TLB_WRITE_DRAIN (((u64)1) << 6)
 #define DMA_ID_TLB_DID(id) (((u64)((id & 0xffff) << 16)))
@@ -316,8 +316,8 @@ enum {
 #define QI_DEV_EIOTLB_SIZE (((u64)1) << 11)
 #define QI_DEV_EIOTLB_GLOB(g)  ((u64)g)
 #define QI_DEV_EIOTLB_PASID(p) (((u64)p) << 32)
-#define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0xffff) << 32)
-#define QI_DEV_EIOTLB_QDEP(qd) (((qd) & 0x1f) << 16)
+#define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0xffff) << 16)
+#define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4)
+#define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4)
 #define QI_DEV_EIOTLB_MAX_INVS 32
 
 #define QI_PGRP_IDX(idx)   (((u64)(idx)) << 55)
-- 
2.7.4
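
The effect of the shift changes in the last hunk can be checked with plain arithmetic. The sketch below copies the old and new macro bodies under renamed, hypothetical names (EIOTLB_*); the 16-bit Source-ID mask is an assumption, and fields_overlap() is a helper invented for the illustration.

```c
#include <stdint.h>

typedef uint64_t u64;

/* PASID field of the extended device-TLB descriptor, as in the header. */
#define EIOTLB_PASID(p)     (((u64)(p)) << 32)

/* Old and new encodings from the patch; 0xffff mask assumed for the SID. */
#define EIOTLB_SID_OLD(s)   ((u64)((s) & 0xffff) << 32)
#define EIOTLB_SID_NEW(s)   ((u64)((s) & 0xffff) << 16)
#define EIOTLB_QDEP_OLD(q)  (((q) & 0x1f) << 16)
#define EIOTLB_QDEP_NEW(q)  ((u64)((q) & 0x1f) << 4)

/* Nonzero when two encoded fields share at least one bit position. */
static int fields_overlap(u64 a, u64 b)
{
	return (a & b) != 0;
}
```

With the old shift the Source-ID is placed at bit 32 and overlaps the PASID field, and the old queue-depth position is the one the corrected Source-ID now occupies; with the new shifts the three fields are disjoint.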



[PATCH 1/2] iommu/vt-d: Fix some macros that are incorrectly specified in intel-iommu

2017-01-30 Thread Ashok Raj
From: CQ Tang 

Some of the macros are incorrect with wrong bit-shifts resulting in picking
the incorrect invalidation granularity. Incorrect Source-ID in extended
devtlb invalidation caused device side errors.

To: Joerg Roedel 
To: David Woodhouse 
Cc: io...@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Cc: sta...@vger.kernel.org
Cc: CQ Tang 
Cc: Ashok Raj 

Fixes: 2f26e0a9 ("iommu/vt-d: Add basic SVM PASID support")
Signed-off-by: CQ Tang 
Signed-off-by: Ashok Raj 
Tested-by: CQ Tang 
---
 include/linux/intel-iommu.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index d49e26c..23e129e 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -153,8 +153,8 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 #define DMA_TLB_GLOBAL_FLUSH (((u64)1) << 60)
 #define DMA_TLB_DSI_FLUSH (((u64)2) << 60)
 #define DMA_TLB_PSI_FLUSH (((u64)3) << 60)
-#define DMA_TLB_IIRG(type) ((type >> 60) & 7)
-#define DMA_TLB_IAIG(val) (((val) >> 57) & 7)
+#define DMA_TLB_IIRG(type) ((type >> 60) & 3)
+#define DMA_TLB_IAIG(val) (((val) >> 57) & 3)
 #define DMA_TLB_READ_DRAIN (((u64)1) << 49)
 #define DMA_TLB_WRITE_DRAIN (((u64)1) << 48)
 #define DMA_TLB_DID(id)(((u64)((id) & 0x)) << 32)
@@ -164,9 +164,9 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 
 /* INVALID_DESC */
 #define DMA_CCMD_INVL_GRANU_OFFSET  61
-#define DMA_ID_TLB_GLOBAL_FLUSH(((u64)1) << 3)
-#define DMA_ID_TLB_DSI_FLUSH   (((u64)2) << 3)
-#define DMA_ID_TLB_PSI_FLUSH   (((u64)3) << 3)
+#define DMA_ID_TLB_GLOBAL_FLUSH(((u64)1) << 4)
+#define DMA_ID_TLB_DSI_FLUSH   (((u64)2) << 4)
+#define DMA_ID_TLB_PSI_FLUSH   (((u64)3) << 4)
 #define DMA_ID_TLB_READ_DRAIN  (((u64)1) << 7)
 #define DMA_ID_TLB_WRITE_DRAIN (((u64)1) << 6)
 #define DMA_ID_TLB_DID(id) (((u64)((id & 0x) << 16)))
@@ -316,8 +316,8 @@ enum {
 #define QI_DEV_EIOTLB_SIZE (((u64)1) << 11)
 #define QI_DEV_EIOTLB_GLOB(g)  ((u64)g)
 #define QI_DEV_EIOTLB_PASID(p) (((u64)p) << 32)
-#define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0x) << 32)
-#define QI_DEV_EIOTLB_QDEP(qd) (((qd) & 0x1f) << 16)
+#define QI_DEV_EIOTLB_SID(sid) ((u64)((sid) & 0x) << 16)
+#define QI_DEV_EIOTLB_QDEP(qd) ((u64)((qd) & 0x1f) << 4)
 #define QI_DEV_EIOTLB_MAX_INVS 32
 
 #define QI_PGRP_IDX(idx)   (((u64)(idx)) << 55)
-- 
2.7.4
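
For readers following the shift changes, here is a minimal userspace sketch of the field packing the corrected macros perform. The helper names are hypothetical, and the 16-bit source-ID mask is an assumption, since the mask constant is truncated in the archived diff. Only the shift amounts (16 for the source ID, 4 for the queue depth, 32 for the PASID) are taken from the patch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the QI_DEV_EIOTLB_* macros, not kernel code.
 * The 0xffff source-ID mask is an assumption of this sketch. */
static uint64_t eiotlb_sid(uint64_t sid)
{
    return (sid & 0xffffULL) << 16;   /* corrected shift from the patch */
}

static uint64_t eiotlb_qdep(uint64_t qd)
{
    return (qd & 0x1fULL) << 4;       /* corrected shift from the patch */
}

static uint64_t eiotlb_pasid(uint64_t pasid)
{
    return pasid << 32;               /* unchanged by the patch */
}

/* Pre-patch behaviour: the source ID was shifted by 32, the same
 * position the PASID macro uses, so the two fields overlapped. */
static uint64_t eiotlb_sid_old(uint64_t sid)
{
    return (sid & 0xffffULL) << 32;
}
```

With the corrected shift, the source ID and the PASID occupy disjoint bits of the descriptor word; with the old shift they do not, which is consistent with the device-side errors described in the commit message.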



[PATCH 2/2] iommu/vt-d: tylersburg isoch identity map check is done too late.

2017-01-30 Thread Ashok Raj
The check to set identity map for tylersburg is done too late. It needs
to be done before the check for identity_map domain is done.

To: Joerg Roedel 
To: David Woodhouse 
Cc: io...@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Cc: sta...@vger.kernel.org
Cc: Ashok Raj 

Fixes: 86080ccc22 ("iommu/vt-d: Allocate si_domain in init_dmars()")
Signed-off-by: Ashok Raj 
Reported-by: Yunhong Jiang 
---
 drivers/iommu/intel-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8a18525..23eead3 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3325,13 +3325,14 @@ static int __init init_dmars(void)
iommu_identity_mapping |= IDENTMAP_GFX;
 #endif
 
+   check_tylersburg_isoch();
+
if (iommu_identity_mapping) {
ret = si_domain_init(hw_pass_through);
if (ret)
goto free_iommu;
}
 
-   check_tylersburg_isoch();
 
/*
 * If we copied translations from a previous kernel in the kdump
-- 
2.7.4



Re: [RFC v2 02/10] KVM: arm/arm64: Move cntvoff to each timer context

2017-01-30 Thread Jintack Lim
On Mon, Jan 30, 2017 at 9:51 AM, Marc Zyngier  wrote:
> On 30/01/17 14:45, Christoffer Dall wrote:
>> On Sun, Jan 29, 2017 at 11:54:05AM +, Marc Zyngier wrote:
>>> On Fri, Jan 27 2017 at 01:04:52 AM, Jintack Lim  
>>> wrote:
>>>> Make cntvoff per each timer context. This is helpful to abstract kvm
>>>> timer functions to work with timer context without considering timer
>>>> types (e.g. physical timer or virtual timer).
>>>>
>>>> This also would pave the way for ever doing adjustments of the cntvoff
>>>> on a per-CPU basis if that should ever make sense.
>>>>
>>>> Signed-off-by: Jintack Lim 
>>>> ---
>>>>  arch/arm/include/asm/kvm_host.h   |  6 +++---
>>>>  arch/arm64/include/asm/kvm_host.h |  4 ++--
>>>>  include/kvm/arm_arch_timer.h  |  8 +++-
>>>>  virt/kvm/arm/arch_timer.c | 26 --
>>>>  virt/kvm/arm/hyp/timer-sr.c   |  3 +--
>>>>  5 files changed, 29 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/arch/arm/include/asm/kvm_host.h 
>>>> b/arch/arm/include/asm/kvm_host.h
>>>> index d5423ab..f5456a9 100644
>>>> --- a/arch/arm/include/asm/kvm_host.h
>>>> +++ b/arch/arm/include/asm/kvm_host.h
>>>> @@ -60,9 +60,6 @@ struct kvm_arch {
>>>> /* The last vcpu id that ran on each physical CPU */
>>>> int __percpu *last_vcpu_ran;
>>>>
>>>> -   /* Timer */
>>>> -   struct arch_timer_kvm   timer;
>>>> -
>>>> /*
>>>>  * Anything that is not used directly from assembly code goes
>>>>  * here.
>>>> @@ -75,6 +72,9 @@ struct kvm_arch {
>>>> /* Stage-2 page table */
>>>> pgd_t *pgd;
>>>>
>>>> +   /* A lock to synchronize cntvoff among all vtimer context of vcpus */
>>>> +   spinlock_t cntvoff_lock;
>>>
>>> Is there any condition where we need this to be a spinlock? I would have
>>> thought that a mutex should have been enough, as this should only be
>>> updated on migration or initialization. Not that it matters much in this
>>> case, but I wondered if there is something I'm missing.
>>>
>>
>> I would think the critical section is small enough that a spinlock makes
>> sense, but what I don't think we need is to add the additional lock.
>>
>> I think just taking the kvm->lock should be sufficient, which happens to
>> be a mutex, and while that may be a bit slower to take than the
>> spinlock, it's not in the critical path so let's just keep things
>> simple.
>>
>> Perhaps this is what Marc also meant.
>
> That would be the logical conclusion, assuming that we can sleep on this
> path.

All right. I'll take kvm->lock there.

Thanks,
Jintack

>
> Thanks,
>
> M.
> --
> Jazz is not dead. It just smells funny...
>



Re: [PATCH] printk: fix printk.devkmsg sysctl

2017-01-30 Thread Rabin Vincent
On Fri, Jan 27, 2017 at 07:19:30PM +0100, Borislav Petkov wrote:
> On Fri, Jan 27, 2017 at 04:42:30PM +0100, Rabin Vincent wrote:
> > proc_dostring() eats the '\n' and stops
> 
> Not a problem, see diff below.

Would it be possible for you to please submit it as a patch yourself so
that this gets fixed in the way you like?  Thank you.


Re: [PATCH v16 3/3] usb: typec: add driver for Intel Whiskey Cove PMIC USB Type-C PHY

2017-01-30 Thread kbuild test robot
Hi Heikki,

[auto build test WARNING on usb/usb-testing]
[also build test WARNING on v4.10-rc6 next-20170130]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Heikki-Krogerus/lib-string-add-sysfs_match_string-helper/20170130-214825
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git 
usb-testing


coccinelle warnings: (new ones prefixed by >>)

>> drivers/usb/typec/typec.c:1249:1-3: WARNING: PTR_ERR_OR_ZERO can be used

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


[PATCH] usb: typec: fix ptr_ret.cocci warnings

2017-01-30 Thread kbuild test robot
drivers/usb/typec/typec.c:1249:1-3: WARNING: PTR_ERR_OR_ZERO can be used


 Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR

Generated by: scripts/coccinelle/api/ptr_ret.cocci

CC: Heikki Krogerus 
Signed-off-by: Fengguang Wu 
---

 typec.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

--- a/drivers/usb/typec/typec.c
+++ b/drivers/usb/typec/typec.c
@@ -1246,9 +1246,7 @@ EXPORT_SYMBOL_GPL(typec_unregister_port)
 static int __init typec_init(void)
 {
typec_class = class_create(THIS_MODULE, "typec");
-   if (IS_ERR(typec_class))
-   return PTR_ERR(typec_class);
-   return 0;
+   return PTR_ERR_OR_ZERO(typec_class);
 }
 subsys_initcall(typec_init);
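
To illustrate why the replacement is behaviour-preserving, here is a minimal userspace sketch modelled on the kernel's error-pointer helpers. The lower-case function names are stand-ins invented for this example, and the MAX_ERRNO value of 4095 is assumed to match the kernel's definition.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins modelled on IS_ERR(), PTR_ERR() and ERR_PTR(). */
static int is_err(const void *ptr)
{
    /* Error pointers live in the top MAX_ERRNO bytes of the address space. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* Long form, as removed by the patch. */
static int init_long_form(const void *cls)
{
    if (is_err(cls))
        return (int)ptr_err(cls);
    return 0;
}

/* Short form: the same logic folded into one expression. */
static int ptr_err_or_zero(const void *ptr)
{
    return is_err(ptr) ? (int)ptr_err(ptr) : 0;
}
```

Both forms return the negative errno for an error pointer and 0 for a valid pointer, so the coccinelle rewrite only shortens the code.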
 


Re: [PATCH V2] ARM: dts: BCM5301X: Add missing Netgear R8000 LEDs and Keys

2017-01-30 Thread Florian Fainelli
On 01/30/2017 01:08 AM, Aditya Xavier wrote:
> Would you require me to send the revised Patch ?
> 
> Or would this do ?

Will take care of it this time.

> 
> And thanks for guiding me through this process :)
> 
> 
>> On 29-Jan-2017, at 2:45 AM, Rafał Miłecki  wrote:
>>
>> On 28 January 2017 at 15:37, AdityaXavier  wrote:
>>> From: Aditya Xavier 
>>>
>>> Added two WAN status LEDs and a GPIO Key for Brightness which were missing.
>>> V2: Updated subject, Power LED names, and WAN labels.
>>
>> Changelog (V2 ... part) should go into that /comments/ section (see below).
>> Florian: can you drop that line when applying this patch? Otherwise it
>> looks OK to me.
>>
>>
>>> Signed-off-by: Aditya Xavier 
>>
>> Acked-by: Rafał Miłecki 
>>
>> Thanks for the patch!
>>
>>
>>> ---
>>
>> Right here, below these 3 dashes is a place where you can add extra
>> comments (they won't go into log when doing "git am").
>>
>>> arch/arm/boot/dts/bcm4709-netgear-r8000.dts | 22 --
>>> 1 file changed, 20 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm/boot/dts/bcm4709-netgear-r8000.dts 
>>> b/arch/arm/boot/dts/bcm4709-netgear-r8000.dts
>>> index 92f8a72..90d4420 100644
>>> --- a/arch/arm/boot/dts/bcm4709-netgear-r8000.dts
>>> +++ b/arch/arm/boot/dts/bcm4709-netgear-r8000.dts
>>> @@ -27,18 +27,30 @@
>>>leds {
>>>compatible = "gpio-leds";
>>>
>>> -   power0 {
>>> +   power-white {
>>>label = "bcm53xx:white:power";
>>>gpios = < 2 GPIO_ACTIVE_LOW>;
>>>linux,default-trigger = "default-on";
>>>};
>>>
>>> -   power1 {
>>> +   power-amber {
>>>label = "bcm53xx:amber:power";
>>>gpios = < 3 GPIO_ACTIVE_LOW>;
>>>linux,default-trigger = "default-off";
>>>};
>>>
>>> +   wan-white {
>>> +   label = "bcm53xx:white:wan";
>>> +   gpios = < 8 GPIO_ACTIVE_LOW>;
>>> +   linux,default-trigger = "default-on";
>>> +   };
>>> +
>>> +   wan-amber {
>>> +   label = "bcm53xx:amber:wan";
>>> +   gpios = < 9 GPIO_ACTIVE_HIGH>;
>>> +   linux,default-trigger = "default-off";
>>> +   };
>>> +
>>>5ghz-1 {
>>>label = "bcm53xx:white:5ghz-1";
>>>gpios = < 12 GPIO_ACTIVE_LOW>;
>>> @@ -104,6 +116,12 @@
>>>linux,code = ;
>>>gpios = < 6 GPIO_ACTIVE_LOW>;
>>>};
>>> +
>>> +   brightness {
>>> +   label = "Backlight";
>>> +   linux,code = ;
>>> +   gpios = < 19 GPIO_ACTIVE_LOW>;
>>> +   };
>>>};
>>> };
>>>
>>> --
>>> 2.9.3
>>>
>>
>>
>>
>> -- 
>> Rafał
> 


-- 
Florian


Re: [RFC v2 10/10] KVM: arm/arm64: Emulate the EL1 phys timer register access

2017-01-30 Thread Jintack Lim
Hi Peter,

On Mon, Jan 30, 2017 at 12:26 PM, Peter Maydell
 wrote:
> On 30 January 2017 at 17:08, Jintack Lim  wrote:
>> On Sun, Jan 29, 2017 at 10:44 AM, Marc Zyngier  wrote:
>>> Shouldn't we take the ENABLE bit into account? The ARMv8 ARM version I
>>> have at hand (version h) seems to indicate that we should, but we should
>>> check with the latest and greatest...
>>
>> Thanks! I was not clear about this. I have ARM ARM version k, and it
>> says that 'When the value of the ENABLE bit is 0, the ISTATUS field is
>> UNKNOWN.' So I thought the istatus value doesn't matter if ENABLE is
>> 0, and just set istatus bit regardless of ENABLE bit. If this is not
>> what the manual meant, then I'm happy to fix this.
>
> It looks like the spec has been relaxed between the doc version
> that Marc was looking at and the current one. So it's OK for
> an implementation to either (a) set ISTATUS to 0 if ENABLE
> is 0, or (b) do what you've done and set ISTATUS according
> to the timer comparison whether ENABLE is clear or not
> (or even (c) set ISTATUS to a random value if ENABLE is clear,
> and other less likely choices).
> I think we should add a comment to note that it's architecturally
> UNKNOWN and we've made a choice for our implementation convenience.

Thanks for the clarification. I'll put a comment in v3.

>
> thanks
> -- PMM
>
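
The two permitted behaviours discussed above can be sketched as follows. This is an illustrative userspace model, not the code from the series: the function names are made up, and the control-register bit positions (ENABLE bit 0, IMASK bit 1, ISTATUS bit 2) are assumed to follow the architectural timer control register layout.

```c
#include <stdint.h>

#define TIMER_CTL_ENABLE  (1u << 0)
#define TIMER_CTL_IMASK   (1u << 1)
#define TIMER_CTL_ISTATUS (1u << 2)

/* Option (b) from the thread: derive ISTATUS from the comparison alone.
 * With ENABLE clear the architecture leaves ISTATUS UNKNOWN, so this is
 * an implementation choice made for convenience. */
static uint32_t timer_ctl_read(uint32_t ctl, uint64_t cval, uint64_t now)
{
    ctl &= ~TIMER_CTL_ISTATUS;
    if (cval <= now)
        ctl |= TIMER_CTL_ISTATUS;
    return ctl;
}

/* Option (a): report ISTATUS as 0 whenever ENABLE is clear. */
static uint32_t timer_ctl_read_strict(uint32_t ctl, uint64_t cval, uint64_t now)
{
    ctl &= ~TIMER_CTL_ISTATUS;
    if ((ctl & TIMER_CTL_ENABLE) && cval <= now)
        ctl |= TIMER_CTL_ISTATUS;
    return ctl;
}
```

The two variants differ only when ENABLE is 0 and the compare value has already been reached, which is exactly the case the manual marks as UNKNOWN.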



Re: [PATCH 2/2] x86/fpu: copy MXCSR & MXCSR_FLAGS with SSE/YMM state

2017-01-30 Thread Yu-cheng Yu
On Wed, Jan 25, 2017 at 08:57:59PM -0500, r...@redhat.com wrote:
> From: Rik van Riel 
> 
> On Skylake CPUs I noticed that XRSTOR is unable to deal with states
> created by copyout_from_xsaves if the xstate has only SSE/YMM state, and
> no FP state. That is, xfeatures had XFEATURE_MASK_SSE set, but not
> XFEATURE_MASK_FP.
> 
> The reason is that part of the SSE/YMM state lives in the MXCSR and
> MXCSR_FLAGS fields of the FP state.
> 
> Ensure that whenever we copy SSE or YMM state around, the MXCSR and
> MXCSR_FLAGS fields are also copied around.
> 
> Signed-off-by: Rik van Riel 
> ---
>  arch/x86/kernel/fpu/xstate.c | 39 ++-
>  1 file changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index c1508d56ecfb..10b10917af81 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -1004,6 +1004,23 @@ int copyout_from_xsaves(unsigned int pos, unsigned int 
> count, void *kbuf,
>   }
>  
>   /*
> +  * Restoring SSE/YMM state requires that MXCSR & MXCSR_MASK are saved.
> +  * Those fields are part of the legacy FP state, and only get saved
> +  * above if XFEATURES_MASK_FP is set.
> +  *
> +  * Copy out those fields if we have SSE/YMM but no FP register data.
> +  */
> + if ((header.xfeatures & (XFEATURE_MASK_SSE|XFEATURE_MASK_YMM)) &&
> + !(header.xfeatures & XFEATURE_MASK_FP)) {
> + size = sizeof(u64);
> + ret = xstate_copyout(offset, size, kbuf, ubuf,
> +  >i387.mxcsr, 0, count);
> +
> + if (ret)
> + return ret;
> + }
> +
> + /*
>* Fill xsave->i387.sw_reserved value for ptrace frame:
>*/
>   offset = offsetof(struct fxregs_state, sw_reserved);
> @@ -1030,6 +1047,7 @@ int copyin_to_xsaves(const void *kbuf, const void 
> __user *ubuf,
>   int i;
>   u64 xfeatures;
>   u64 allowed_features;
> + void *dst;
>  
>   offset = offsetof(struct xregs_state, header);
>   size = sizeof(xfeatures);
> @@ -1053,7 +1071,7 @@ int copyin_to_xsaves(const void *kbuf, const void 
> __user *ubuf,
>   u64 mask = ((u64)1 << i);
>  
>   if (xfeatures & mask) {
> - void *dst = __raw_xsave_addr(xsave, 1 << i);
> + dst = __raw_xsave_addr(xsave, 1 << i);
>  
>   offset = xstate_offsets[i];
>   size = xstate_sizes[i];
> @@ -1068,6 +1086,25 @@ int copyin_to_xsaves(const void *kbuf, const void 
> __user *ubuf,
>   }
>  
>   /*
> +  * SSE/YMM state depends on the MXCSR & MXCSR_MASK fields from the FP
> +  * state. If we restored only SSE/YMM state but not FP state, copy
> +  * those fields to ensure the SSE/YMM state restore works.
> +  */

In xstateregs_set(), we enforced the starting pos must be from (0), which in
XSAVE time, was probably for this reason.  The real mistake here, I think, is
allowing skipping of xstate[0] and xstate[1].  Both should have been there
even for XSAVES compacted-format.  Would it be a simpler fix just making sure
xstate[0] and xstate[1] are copied?

Yu-cheng
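
For reference, the condition the quoted patch uses to decide when MXCSR has to be copied separately can be modelled in isolation. This is a sketch only: the FEAT_* names are local stand-ins for the XFEATURE_MASK_* constants, with bit positions assumed to follow the XSAVE state-component numbering (x87 = 0, SSE = 1, YMM = 2).

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for XFEATURE_MASK_FP / _SSE / _YMM (assumed bit layout). */
#define FEAT_FP  (1ULL << 0)
#define FEAT_SSE (1ULL << 1)
#define FEAT_YMM (1ULL << 2)

/* True when SSE or YMM state is present but the legacy FP area, which
 * holds MXCSR and its mask, would otherwise not be copied. */
static bool needs_separate_mxcsr_copy(uint64_t xfeatures)
{
    return (xfeatures & (FEAT_SSE | FEAT_YMM)) && !(xfeatures & FEAT_FP);
}
```

Always copying xstate[0] and xstate[1], as suggested above, would make this predicate unnecessary, because the MXCSR fields would travel with the legacy area in every case.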
 


Re: [RFC V2 03/12] mm: Change generic FALLBACK zonelist creation process

2017-01-30 Thread Dave Hansen
On 01/29/2017 07:35 PM, Anshuman Khandual wrote:
> * CDM node's zones are not part of any other node's FALLBACK zonelist
> * CDM node's FALLBACK list contains its own memory zones followed by
>   all system RAM zones in regular order as before
> * CDM node's zones are part of its own NOFALLBACK zonelist

This seems like a sane policy for the system that you're describing.
But, it's still a policy, and it's rather hard-coded into the kernel.
Let's say we had a CDM node with 100x more RAM than the rest of the
system and it was just as fast as the rest of the RAM.  Would we still
want it isolated like this?  Or would we want a different policy?

Why do we need this hard-coded along with the cpuset stuff later in the
series?  Doesn't taking a node out of the cpuset also take it out of the
fallback lists?

>   while ((node = find_next_best_node(local_node, _mask)) >= 0) {
> +#ifdef CONFIG_COHERENT_DEVICE
> + /*
> +  * CDM node's own zones should not be part of any other
> +  * node's fallback zonelist but only it's own fallback
> +  * zonelist.
> +  */
> + if (is_cdm_node(node) && (pgdat->node_id != node))
> + continue;
> +#endif

On a superficial note: Isn't that #ifdef unnecessary?  is_cdm_node() has
a 'return 0' stub when the config option is off anyway.
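
The stub pattern referred to here can be sketched in a few lines. This is an illustration under assumptions, not the code from the series: the body of the "real" is_cdm_node() (treating node 3 as the CDM node) and the skip_in_fallback() helper are made up for the example.

```c
#include <stdbool.h>

#ifdef CONFIG_COHERENT_DEVICE
/* Placeholder for the real check; node 3 is an arbitrary choice here. */
static inline bool is_cdm_node(int node)
{
    return node == 3;
}
#else
/* Stub used when the option is off: always false. */
static inline bool is_cdm_node(int node)
{
    (void)node;
    return false;
}
#endif

/* The caller needs no #ifdef: with the stub the condition is constant
 * false, so the compiler is free to drop the branch entirely. */
static bool skip_in_fallback(int this_node, int node)
{
    return is_cdm_node(node) && this_node != node;
}
```

Built without CONFIG_COHERENT_DEVICE, skip_in_fallback() never skips a node, which is the same result the #ifdef in the quoted hunk produces.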


Re: [PATCH RT] Align rt_mutex inlining with upstream behavior

2017-01-30 Thread Andy Ritger
On Thu, Jan 26, 2017 at 06:01:09PM +0100, Sebastian Andrzej Siewior wrote:
> On 2017-01-24 18:45:50 [-0800], Alex Goins wrote:
> > mutex_destroy is no-op inline when DEBUG_MUTEX is not enabled. The RT Linux
> > patches replace mutex_destroy() with rt_mutex_destroy(). This patch aligns
> > rt_mutex_destroy() with mutex_destroy() by using the same no-op inline
> > technique.
> > 
> > Signed-off-by: Alex Goins 
> > Reviewed-by: Andy Ritger 
> 
> So what is the problem? Why are we doing this? There is still a check to
> see if the lock is in use which is also done for the case where
> DEBUG_MUTEX is disabled.

The problem is that various static inline functions such as
reservation_object_fini() indirectly call mutex_destroy.  On DEBUG_MUTEX
kernels, mutex_destroy is EXPORT_SYMBOL_GPL.

In upstream, non-DEBUG_MUTEX kernels define mutex_destroy to a noop.
This gives users the option of disabling DEBUG_MUTEX if they want to
use non-GPL, reservation_object_fini()-using, kernel modules.

In PREEMPTRT, non-DEBUG_MUTEX kernels export rt_mutex_destroy as
EXPORT_SYMBOL_GPL, so users no longer have the workaround of disabling
DEBUG_MUTEX.

This patch gives PREEMPTRT users the option of disabling DEBUG_MUTEX if
they want to use such kernel modules, matching upstream behavior.
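
The no-op inline technique can be sketched as below. All names are hypothetical stand-ins (demo_mutex_destroy() for mutex_destroy() or rt_mutex_destroy(), demo_object_fini() for a helper such as reservation_object_fini(), DEMO_DEBUG_MUTEX for the debug option); this is not the patch itself.

```c
struct demo_mutex {
    int owner;
};

#ifdef DEMO_DEBUG_MUTEX
/* Debug build: an out-of-line function, which in the kernel would be an
 * exported symbol that callers must link against. */
void demo_mutex_destroy(struct demo_mutex *lock);
#else
/* Non-debug build: an empty inline, so callers reference no symbol. */
static inline void demo_mutex_destroy(struct demo_mutex *lock)
{
    (void)lock;
}
#endif

/* Stand-in for a static inline helper that destroys a mutex indirectly. */
static inline void demo_object_fini(struct demo_mutex *lock)
{
    demo_mutex_destroy(lock);
}
```

In the non-debug configuration the call compiles away, so a module using demo_object_fini() ends up with no dependency on any destroy symbol.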



Re: [PATCH RESEND] staging: media: lirc: use new parport device model

2017-01-30 Thread Sean Young
On Sat, Jan 21, 2017 at 12:55:54AM +, Sudip Mukherjee wrote:
> From: Sudip Mukherjee 
> 
> Modify lirc_parallel driver to use the new parallel port device model.
> 
> Signed-off-by: Sudip Mukherjee 
> ---
> 
> Resending after more than one year.
> Previous patch is at https://patchwork.kernel.org/patch/7883591/

Since no one ported lirc_parallel to rc-core, the lirc_parallel staging
driver has been dropped from the current media tree.

I have ported a few other lirc drivers to rc-core but I never found
anyone using lirc_parallel or the hardware itself.


Sean


Re: [PATCH 11/22] ARM: dts: add top-level DT bindings for Cortina Gemini

2017-01-30 Thread Rob Herring
On Sat, Jan 28, 2017 at 3:56 PM, Linus Walleij  wrote:
> On Mon, Jan 23, 2017 at 9:21 PM, Rob Herring  wrote:
>> On Sun, Jan 22, 2017 at 01:22:19PM +0100, Linus Walleij wrote:
>>> This adds the top level SoC bindings for Cortina systems Gemini
>>> platforms.
> (...)
>>> +- intcon: the root node must have an interrupt controller node pointing to
>>
>> intcon is just a source label and not meaningful for the binding.
>
> OK
>
>>> +Example:
>>> +
>>> +/ {
>>> + interrupt-parent = <>;
>>> +
>>> + syscon: syscon@4000 {
>>
>> This chip has no internal bus? Put all these nodes under a bus.
>
> Are you thinking something of the form:
>
> soc: soc {
> #address-cells = <1>;
> #size-cells = <1>;
> ranges;
> compatible = "simple-bus";
>
> syscon: syscon@4000 {
>
> (...)
>
> ?

Yes.

Rob


Re: linux-next: manual merge of the gpio tree with the staging tree

2017-01-30 Thread Linus Walleij
On Mon, Jan 30, 2017 at 5:28 AM, Stephen Rothwell  wrote:

> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.

OK I'll mention it to Linus (the big penguin). Thanks!

Linus Walleij


[GIT PULL rcu/next] RCU commits for 4.11

2017-01-30 Thread Paul E. McKenney
Hello, Ingo,

This pull request contains the following changes:

1.  Documentation updates.

http://lkml.kernel.org/r/20170114085032.ga18...@linux.vnet.ibm.com

2.  Dyntick updates, consolidating open-coded counter accesses
into a well-defined API.

http://lkml.kernel.org/r/20170124214602.ga2...@linux.vnet.ibm.com

3.  Miscellaneous fixes.

http://lkml.kernel.org/r/20170124215111.gb2...@linux.vnet.ibm.com

4.  SRCU updates: Simplify algorithm, add formal verification.

http://lkml.kernel.org/r/20170124220011.gc2...@linux.vnet.ibm.com

5.  Torture-test updates.

http://lkml.kernel.org/r/20170114092533.ga23...@linux.vnet.ibm.com

All of these changes have been subjected to 0day Test Robot and -next
testing, and are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git for-mingo

for you to fetch changes up to 31945aa9f14085c81cb3257e51bb210698b78626:

  Merge branches 'doc.2017.01.15b', 'dyntick.2017.01.23a', 'fixes.2017.01.23a', 
'srcu.2017.01.25a' and 'torture.2017.01.15b' into HEAD (2017-01-25 12:56:05 
-0800)


Byungchul Park (1):
  rcu: Only dump stalled-tasks stacks if there was a real stall

Joel Fernandes (1):
  llist: Clarify comments about when locking is needed

Lance Roy (2):
  srcu: Implement more-efficient reader counts
  rcutorture: Add CBMC-based formal verification for SRCU

Mathieu Desnoyers (1):
  Fix: Disable sys_membarrier when nohz_full is enabled

Matt Fleming (1):
  rcu: Enable RCU tracepoints by default to aid in debugging

Paul E. McKenney (33):
  rcu: Design documentation for expedited grace periods
  doc: Update control-dependencies section of memory-barriers.txt
  doc: Quick-Quiz answers are now inline
  doc: Add rcutree.rcu_kick_kthreads to kernel-parameters.txt
  torture: Add a check for CONFIG_RCU_STALL_COMMON for TINY01
  torture: Add CONFIG_PROVE_RCU_REPEATEDLY=y for TINY02
  torture: Add tests without slow grace period setup/cleanup
  torture: Run at least one test with CONFIG_DEBUG_OBJECTS_RCU_HEAD
  torture: Run one test with DEBUG_LOCK_ALLOC but not PROVE_LOCKING
  torture: Run a couple scenarios with CONFIG_RCU_EQS_DEBUG
  torture: Update RCU test scenario documentation
  torture: Enable DEBUG_OBJECTS_RCU_HEAD for Tiny RCU
  rcu: Abstract the dynticks momentary-idle operation
  rcu: Abstract the dynticks snapshot operation
  lockdep: Make RCU suspicious-access splats use pr_err
  rcu: Remove unneeded rcu_process_callbacks() declarations
  rcu: Add long-term CPU kicking
  rcu: Remove short-term CPU kicking
  rcu: Once again use NMI-based stack traces in stall warnings
  rcu: Re-enable TASKS_RCU for User Mode Linux
  rcu: Don't wake rcuc/X kthreads on NOCB CPUs
  rcu: Add comment headers to expedited-grace-period counter functions
  rcu: Make rcu_cpu_starting() use its "cpu" argument
  rcu: Fix comment in rcu_organize_nocb_kthreads()
  rcu: Eliminate unused expedited_normal counter
  rcu: Add lockdep checks to synchronous expedited primitives
  rcu: Abstract dynticks extended quiescent state enter/exit operations
  rcu: Abstract extended quiescent state determination
  rcu: Check cond_resched_rcu_qs() state less often to reduce GP overhead
  rcu: Adjust FQS offline checks for exact online-CPU detection
  srcu: Force full grace-period ordering
  srcu: Reduce probability of SRCU ->unlock_count[] counter overflow
  Merge branches 'doc.2017.01.15b', 'dyntick.2017.01.23a', 
'fixes.2017.01.23a', 'srcu.2017.01.25a' and 'torture.2017.01.15b' into HEAD

Sebastian Andrzej Siewior (1):
  rcu: update: Make RCU_EXPEDITE_BOOT be the default

Tetsuo Handa (1):
  doc: Fix RCU requirements typos

Tobias Klauser (1):
  rcu: Remove unused but set variable

Yang Shi (1):
  locktorture: Fix potential memory leak with rw lock test

 .../Design/Data-Structures/Data-Structures.html|   5 +-
 .../Design/Expedited-Grace-Periods/ExpRCUFlow.svg  | 830 +
 .../Expedited-Grace-Periods/ExpSchedFlow.svg   | 826 
 .../Expedited-Grace-Periods.html   | 626 
 .../RCU/Design/Expedited-Grace-Periods/Funnel0.svg | 275 +++
 .../RCU/Design/Expedited-Grace-Periods/Funnel1.svg | 275 +++
 .../RCU/Design/Expedited-Grace-Periods/Funnel2.svg | 287 +++
 .../RCU/Design/Expedited-Grace-Periods/Funnel3.svg | 323 
 .../RCU/Design/Expedited-Grace-Periods/Funnel4.svg | 323 
 .../RCU/Design/Expedited-Grace-Periods/Funnel5.svg | 335 +
 .../RCU/Design/Expedited-Grace-Periods/Funnel6.svg | 335 +
 .../RCU/Design/Expedited-Grace-Periods/Funnel7.svg | 347 +
 .../RCU/Design/Expedited-Grace-Periods/Funnel8.svg | 311 
 

Re: [PATCH 8/9] bcache: use kvmalloc

2017-01-30 Thread Michal Hocko
On Mon 30-01-17 17:47:31, Vlastimil Babka wrote:
> On 01/30/2017 10:49 AM, Michal Hocko wrote:
> > From: Michal Hocko 
> > 
> > bcache_device_init uses kmalloc for small requests and vmalloc for those
> > which are larger than 64 pages. This alone is a strange criterion.
> > Moreover kmalloc can fallback to vmalloc on the failure. Let's simply
> > use kvmalloc instead as it knows how to handle the fallback properly
> 
> I don't see why separate patch, some of the conversions in 5/9 were quite
> similar (except comparing with PAGE_SIZE, not 64*PAGE_SIZE), but nevermind.

I just found it later so I kept it separate. It can be folded to 5/9 if
that makes more sense.
 
> > Cc: Kent Overstreet 
> > Signed-off-by: Michal Hocko 
> 
> Acked-by: Vlastimil Babka 

Thanks!

> > ---
> >  drivers/md/bcache/super.c | 8 ++--
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> > index 3a19cbc8b230..4cb6b88a1465 100644
> > --- a/drivers/md/bcache/super.c
> > +++ b/drivers/md/bcache/super.c
> > @@ -767,16 +767,12 @@ static int bcache_device_init(struct bcache_device 
> > *d, unsigned block_size,
> > }
> > 
> > n = d->nr_stripes * sizeof(atomic_t);
> > -   d->stripe_sectors_dirty = n < PAGE_SIZE << 6
> > -   ? kzalloc(n, GFP_KERNEL)
> > -   : vzalloc(n);
> > +   d->stripe_sectors_dirty = kvzalloc(n, GFP_KERNEL);
> > if (!d->stripe_sectors_dirty)
> > return -ENOMEM;
> > 
> > n = BITS_TO_LONGS(d->nr_stripes) * sizeof(unsigned long);
> > -   d->full_dirty_stripes = n < PAGE_SIZE << 6
> > -   ? kzalloc(n, GFP_KERNEL)
> > -   : vzalloc(n);
> > +   d->full_dirty_stripes = kvzalloc(n, GFP_KERNEL);
> > if (!d->full_dirty_stripes)
> > return -ENOMEM;
> > 
> > 

-- 
Michal Hocko
SUSE Labs
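
As an aside, the "try the contiguous allocator first, fall back on failure"
shape that the commit message relies on can be modelled in userspace
roughly as follows. This is a sketch under stated assumptions, not the
kernel implementation: every demo_* name and the 4096-byte limit are
hypothetical, and the small header used to remember which path served the
request exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical: largest request the "contiguous" allocator will serve. */
#define DEMO_CONTIG_LIMIT 4096u

enum demo_origin { DEMO_FROM_CONTIG, DEMO_FROM_FALLBACK };

/* Header placed in front of each buffer; the union keeps user data aligned. */
union demo_hdr {
	enum demo_origin origin;
	max_align_t align;
};

static void *demo_alloc_tagged(size_t n, enum demo_origin origin)
{
	union demo_hdr *h = calloc(1, sizeof(*h) + n);

	if (!h)
		return NULL;
	h->origin = origin;
	return h + 1;
}

/* Models an allocator that fails for large requests. */
static void *demo_contig_zalloc(size_t n)
{
	if (n > DEMO_CONTIG_LIMIT)
		return NULL;
	return demo_alloc_tagged(n, DEMO_FROM_CONTIG);
}

/* Models the slower allocator that can serve large requests. */
static void *demo_fallback_zalloc(size_t n)
{
	return demo_alloc_tagged(n, DEMO_FROM_FALLBACK);
}

/* One entry point: the caller no longer picks an allocator by size. */
static void *demo_kvzalloc(size_t n)
{
	void *p = demo_contig_zalloc(n);

	return p ? p : demo_fallback_zalloc(n);
}

static enum demo_origin demo_origin_of(const void *p)
{
	assert(p != NULL);
	return ((const union demo_hdr *)p - 1)->origin;
}

/* One free routine for both paths, mirroring the single-call-site idea. */
static void demo_kvfree(void *p)
{
	if (p)
		free((union demo_hdr *)p - 1);
}
```

The design point is that the size threshold lives in one place instead of
being open-coded (as "n < PAGE_SIZE << 6") at every call site.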


Re: [PATCH 04/14] x86/fpu: Remove 'kbuf' parameter from the copy_xstate_to_user() APIs

2017-01-30 Thread Yu-cheng Yu
On Mon, Jan 30, 2017 at 04:45:21PM +0100, Borislav Petkov wrote:
> On Mon, Jan 30, 2017 at 10:57:28AM +0100, Ingo Molnar wrote:
> > Would anyone object to using u32 in these prototypes?
> 
> Well, would there be any disadvantage to forcing them to u32?
> Potentially by something else wanting to use those interfaces besides
> the regset thing and that something else doesn't like u32s?
> 
> Otherwise, I don't see a problem.
> 
> I mean, if 4G are not enough for xstate dimensions then we have a whole
> different problem.

This function pair was intended to be similar to user_regset_copyout(), 
user_regset_copyin() used for the standard-format XSAVE area copying.
I totally agree it is complex and should be simplified.  Why don't we
do both places? 

Yu-cheng
 


[PATCH] i2c: at91: ensure state is restored after suspending

2017-01-30 Thread Alexandre Belloni
When going to suspend, the I2C registers may be lost because the power to
VDDcore is cut. Save them and restore them when resuming.

Signed-off-by: Alexandre Belloni 
---
 drivers/i2c/busses/i2c-at91.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/drivers/i2c/busses/i2c-at91.c b/drivers/i2c/busses/i2c-at91.c
index 0b86c6173e07..633bdd899952 100644
--- a/drivers/i2c/busses/i2c-at91.c
+++ b/drivers/i2c/busses/i2c-at91.c
@@ -140,6 +140,12 @@ struct at91_twi_dev {
unsigned transfer_status;
struct i2c_adapter adapter;
unsigned twi_cwgr_reg;
+   struct {
+   u32 mmr;
+   u32 imr;
+   u32 fmr;
+   u32 fimr;
+   } cache;
struct at91_twi_pdata *pdata;
bool use_dma;
bool use_alt_cmd;
@@ -1172,6 +1178,15 @@ static int at91_twi_runtime_resume(struct device *dev)
 
 static int at91_twi_suspend_noirq(struct device *dev)
 {
+   struct at91_twi_dev *twi_dev = dev_get_drvdata(dev);
+
+   twi_dev->cache.mmr = at91_twi_read(twi_dev, AT91_TWI_MMR);
+   twi_dev->cache.imr = at91_twi_read(twi_dev, AT91_TWI_IMR);
+   if (twi_dev->fifo_size) {
+   twi_dev->cache.fmr = at91_twi_read(twi_dev, AT91_TWI_FMR);
+   twi_dev->cache.fimr = at91_twi_read(twi_dev, AT91_TWI_FIMR);
+   }
+
if (!pm_runtime_status_suspended(dev))
at91_twi_runtime_suspend(dev);
 
@@ -1180,6 +1195,7 @@ static int at91_twi_suspend_noirq(struct device *dev)
 
 static int at91_twi_resume_noirq(struct device *dev)
 {
+   struct at91_twi_dev *twi_dev = dev_get_drvdata(dev);
int ret;
 
if (!pm_runtime_status_suspended(dev)) {
@@ -1191,6 +1207,14 @@ static int at91_twi_resume_noirq(struct device *dev)
pm_runtime_mark_last_busy(dev);
pm_request_autosuspend(dev);
 
+   at91_init_twi_bus(twi_dev);
+   at91_twi_write(twi_dev, AT91_TWI_MMR, twi_dev->cache.mmr);
+   at91_twi_write(twi_dev, AT91_TWI_IER, twi_dev->cache.imr);
+   if (twi_dev->fifo_size) {
+   at91_twi_write(twi_dev, AT91_TWI_FMR, twi_dev->cache.fmr);
+   at91_twi_write(twi_dev, AT91_TWI_FIER, twi_dev->cache.fimr);
+   }
+
return 0;
 }
 
-- 
2.11.0
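
One detail of the patch above that is easy to miss: the interrupt state is
saved by reading the mask register (IMR) but restored by writing the enable
register (IER). The following userspace model sketches why. It assumes, for
illustration only, that IER is write-only and sets bits in a read-only IMR;
the demo_* names and the register model are hypothetical and not taken from
the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum demo_reg { DEMO_MMR, DEMO_IER, DEMO_IMR };

/* Simulated hardware state that is lost when core power is cut. */
struct demo_hw {
	uint32_t mmr;
	uint32_t imr;
};

struct demo_dev {
	struct demo_hw hw;
	struct {
		uint32_t mmr;
		uint32_t imr;
	} cache;
};

static uint32_t demo_read(const struct demo_dev *d, enum demo_reg r)
{
	switch (r) {
	case DEMO_MMR: return d->hw.mmr;
	case DEMO_IMR: return d->hw.imr;
	default:       return 0;	/* IER is write-only in this model */
	}
}

static void demo_write(struct demo_dev *d, enum demo_reg r, uint32_t v)
{
	switch (r) {
	case DEMO_MMR: d->hw.mmr = v; break;
	case DEMO_IER: d->hw.imr |= v; break;	/* enabling sets mask bits */
	default:       break;			/* IMR is read-only */
	}
}

/* Save what can be read back before power is removed. */
static void demo_suspend(struct demo_dev *d)
{
	assert(d != NULL);
	d->cache.mmr = demo_read(d, DEMO_MMR);
	d->cache.imr = demo_read(d, DEMO_IMR);
}

/* Power loss: every register falls back to its reset value (0 here). */
static void demo_power_loss(struct demo_dev *d)
{
	d->hw.mmr = 0;
	d->hw.imr = 0;
}

/* Restore: the saved mask is replayed through the enable register. */
static void demo_resume(struct demo_dev *d)
{
	demo_write(d, DEMO_MMR, d->cache.mmr);
	demo_write(d, DEMO_IER, d->cache.imr);
}
```

Writing the cached value straight back to IMR would be a no-op in this
model, which is why the restore path goes through IER instead.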



Re: [RFC v2 10/10] KVM: arm/arm64: Emulate the EL1 phys timer register access

2017-01-30 Thread Peter Maydell
On 30 January 2017 at 17:08, Jintack Lim  wrote:
> On Sun, Jan 29, 2017 at 10:44 AM, Marc Zyngier  wrote:
>> Shouldn't we take the ENABLE bit into account? The ARMv8 ARM version I
>> have at hand (version h) seems to indicate that we should, but we should
>> check with the latest and greatest...
>
> Thanks! I was not clear about this. I have ARM ARM version k, and it
> says that 'When the value of the ENABLE bit is 0, the ISTATUS field is
> UNKNOWN.' So I thought the istatus value doesn't matter if ENABLE is
> 0, and just set istatus bit regardless of ENABLE bit. If this is not
> what the manual meant, then I'm happy to fix this.

It looks like the spec has been relaxed between the doc version
that Marc was looking at and the current one. So it's OK for
an implementation to either (a) set ISTATUS to 0 if ENABLE
is 0, or (b) do what you've done and set ISTATUS according
to the timer comparison whether ENABLE is clear or not
(or even (c) set ISTATUS to a random value if ENABLE is clear,
and other less likely choices).
I think we should add a comment to note that it's architecturally
UNKNOWN and we've made a choice for our implementation convenience.

thanks
-- PMM
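
To make choices (a) and (b) above concrete, here is a small sketch of an
emulated control-register read. It uses the architectural CNTP_CTL bit
layout (ENABLE bit 0, IMASK bit 1, ISTATUS bit 2) and treats the timer
condition as met once the counter reaches the compare value. The function
name, the policy enum and the parameters are hypothetical; this is not the
code from the patch under review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CTL_ENABLE  (1u << 0)
#define DEMO_CTL_IMASK   (1u << 1)
#define DEMO_CTL_ISTATUS (1u << 2)

enum demo_policy {
	DEMO_ISTATUS_ZERO_WHEN_DISABLED,	/* choice (a) */
	DEMO_ISTATUS_ALWAYS_COMPARE,		/* choice (b) */
};

/*
 * Return the value a guest would read from the control register.
 * With ENABLE clear, ISTATUS is architecturally UNKNOWN, so both
 * policies are permitted; picking one is an implementation choice.
 */
static uint32_t demo_read_ctl(uint32_t ctl, uint64_t now, uint64_t cval,
			      enum demo_policy policy)
{
	bool enabled = (ctl & DEMO_CTL_ENABLE) != 0;
	bool fired = now >= cval;

	assert(policy == DEMO_ISTATUS_ZERO_WHEN_DISABLED ||
	       policy == DEMO_ISTATUS_ALWAYS_COMPARE);

	ctl &= ~DEMO_CTL_ISTATUS;
	if (fired && (enabled || policy == DEMO_ISTATUS_ALWAYS_COMPARE))
		ctl |= DEMO_CTL_ISTATUS;

	return ctl;
}
```

The two policies only differ when ENABLE is clear and the compare value has
already been reached, which is exactly the case the manual leaves UNKNOWN.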


[RFC 3/3] PM / hibernate: Wire up system-power framework

2017-01-30 Thread Thierry Reding
From: Thierry Reding 

Use the system-power framework's equivalent to test for power off
capability instead of relying on the globally defined pm_power_off()
function pointer.

The system-power framework implements a fallback that relies on this
global function in case no system power chips have been registered.

Moving this to the system-power framework allows us to eventually
remove any traces of pm_power_off() once all handlers have moved over
to the new framework.

Signed-off-by: Thierry Reding 
---
 kernel/power/hibernate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index b26dbc48c75b..e7429ea11e9a 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "power.h"
 
@@ -617,7 +618,7 @@ static void power_down(void)
case HIBERNATION_PLATFORM:
hibernation_platform_enter();
case HIBERNATION_SHUTDOWN:
-   if (pm_power_off)
+   if (system_can_power_off())
kernel_power_off();
break;
 #ifdef CONFIG_SUSPEND
-- 
2.11.0
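
The fallback described in the commit message (ask the registered system
power chips first, and only consult the legacy global pointer when none are
registered) could look roughly like the userspace sketch below. The
framework itself is introduced earlier in this series and is not shown
here, so every demo_* name, the fixed-size chip table and the callback
signatures are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical chip descriptor; the real framework's struct is not shown. */
struct demo_power_chip {
	void (*power_off)(struct demo_power_chip *chip);
};

#define DEMO_MAX_CHIPS 4

static struct demo_power_chip *demo_chips[DEMO_MAX_CHIPS];
static size_t demo_nr_chips;

/* Stand-in for the legacy global function pointer. */
static void (*demo_pm_power_off)(void);

/* Example callbacks, present only so the sketch can be exercised. */
static void demo_legacy_off(void)
{
}

static void demo_chip_off(struct demo_power_chip *chip)
{
	(void)chip;
}

static int demo_register_chip(struct demo_power_chip *chip)
{
	assert(chip != NULL);
	if (demo_nr_chips == DEMO_MAX_CHIPS)
		return -1;
	demo_chips[demo_nr_chips++] = chip;
	return 0;
}

/* True if any registered chip can power off, else fall back to the global. */
static bool demo_system_can_power_off(void)
{
	for (size_t i = 0; i < demo_nr_chips; i++)
		if (demo_chips[i]->power_off)
			return true;

	return demo_pm_power_off != NULL;
}
```

With this shape, a caller such as power_down() only ever asks the
framework, and the legacy pointer can be removed later without touching the
call site.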



[PATCH 2/2] watchdog: sama5d4: Implement resume hook

2017-01-30 Thread Alexandre Belloni
When resuming for the deepest state on sama5d2, it is necessary to restore
MR as the registers are lost.

Signed-off-by: Alexandre Belloni 
---
 drivers/watchdog/sama5d4_wdt.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/drivers/watchdog/sama5d4_wdt.c b/drivers/watchdog/sama5d4_wdt.c
index 6dd07bef515a..de8ff10a032e 100644
--- a/drivers/watchdog/sama5d4_wdt.c
+++ b/drivers/watchdog/sama5d4_wdt.c
@@ -258,11 +258,28 @@ static const struct of_device_id sama5d4_wdt_of_match[] = 
{
 };
 MODULE_DEVICE_TABLE(of, sama5d4_wdt_of_match);
 
+#ifdef CONFIG_PM_SLEEP
+static int sama5d4_wdt_resume(struct device *dev)
+{
+   struct sama5d4_wdt *wdt = dev_get_drvdata(dev);
+
+   wdt_write(wdt, AT91_WDT_MR, wdt->mr & ~AT91_WDT_WDDIS);
+   if (wdt->mr & AT91_WDT_WDDIS)
+   wdt_write(wdt, AT91_WDT_MR, wdt->mr);
+
+   return 0;
+}
+#endif
+
+static SIMPLE_DEV_PM_OPS(sama5d4_wdt_pm_ops, NULL,
+sama5d4_wdt_resume);
+
 static struct platform_driver sama5d4_wdt_driver = {
.probe  = sama5d4_wdt_probe,
.remove = sama5d4_wdt_remove,
.driver = {
.name   = "sama5d4_wdt",
+   .pm = &sama5d4_wdt_pm_ops,
.of_match_table = sama5d4_wdt_of_match,
}
 };
-- 
2.11.0



Re: [PATCH v8 07/12] dt-bindings: i2c: i2c-mux-simple: document i2c-mux-simple bindings

2017-01-30 Thread Rob Herring
On Sat, Jan 28, 2017 at 4:42 PM, Peter Rosin  wrote:
> On 2017-01-27 20:39, Rob Herring wrote:
>> On Wed, Jan 18, 2017 at 04:57:10PM +0100, Peter Rosin wrote:
>>> Describe how a generic multiplexer controller is used to mux an i2c bus.
>>>
>>> Acked-by: Jonathan Cameron 
>>> Signed-off-by: Peter Rosin 
>>> ---
>>>  .../devicetree/bindings/i2c/i2c-mux-simple.txt | 81 
>>> ++
>>>  1 file changed, 81 insertions(+)
>>>  create mode 100644 Documentation/devicetree/bindings/i2c/i2c-mux-simple.txt
>>>
>>> diff --git a/Documentation/devicetree/bindings/i2c/i2c-mux-simple.txt 
>>> b/Documentation/devicetree/bindings/i2c/i2c-mux-simple.txt
>>> new file mode 100644
>>> index ..253d5027843b
>>> --- /dev/null
>>> +++ b/Documentation/devicetree/bindings/i2c/i2c-mux-simple.txt
>>> @@ -0,0 +1,81 @@
>>> +Simple I2C Bus Mux
>>> +
>>> +This binding describes an I2C bus multiplexer that uses a mux controller
>>> +from the mux subsystem to route the I2C signals.
>>> +
>>> +                                  .-----.  .-----.
>>> +                                  | dev |  | dev |
>>> +    .------------.                '-----'  '-----'
>>> +    | SoC        |                   |        |
>>> +    |            |         .---------+--------'
>>> +    |   .------. |  .------+    child bus A, on MUX value set to 0
>>> +    |   | I2C  |-|--| Mux  |
>>> +    |   '------' |  '--+---+    child bus B, on MUX value set to 1
>>> +    |   .------. |     |   '----------+--------+--------.
>>> +    |   | MUX- | |     |              |        |        |
>>> +    |   | Ctrl |-|-----+           .-----.  .-----.  .-----.
>>> +    |   '------' |                 | dev |  | dev |  | dev |
>>> +    '------------'                 '-----'  '-----'  '-----'
>>> +
>>> +Required properties:
>>> +- compatible: i2c-mux-simple,mux-locked or i2c-mux-simple,parent-locked
>>
>> Not a fan of using "simple" nor the ','. Perhaps lock type should be
>> separate property.
>
> How about just i2c-mux for the compatible? Because i2c-mux-mux (which
> follows the naming of previous i2c muxes) looks really stupid. Or
> perhaps i2c-mux-generic?

I like "generic" only slightly more than "simple". :)

If the mux is gpio controlled, then it should still be called
i2c-gpio-mux. Let's not invent brand new bindings when current ones
are easily extended. We already have pretty generic names here, let's
not make them more generic.

>
> I'm also happy to have the lock type as a separate property. One lock
> type, e.g. parent-locked, could be the default and adding a 'mux-locked'
> property could change that. Would that be ok?

I prefer this. Then existing bindings can use it.

> Or should it be a requirement that one of 'mux-locked'/'parent-locked'
> is always present?

I would make it boolean and make not present be the more common case.
Not present could also mean determined via other means as you have
today with existing bindings. Maybe then you need both properties.

>> I'm not sure I get the mux vs. parent locked fully. How do I determine
>> what type I have? We already have bindings for several types of i2c
>> muxes. How does the locking annotation fit into those?
>
> We have briefly discussed this before [1] in the context of i2c-mux-gpio
> and i2c-mux-pinctrl, when I added the mux-locked/parent-locked distinction
> to the i2c-mux infrastructure (it wasn't named mux-locked/parent-locked
> way back then though). There is more detail on what the difference is
> between the two in Documentation/i2c/i2c-topology.
>
> Side note regarding your remark "use an I2C controlled mux instead"; it
> appears that I'm not alone [2] with this kind of requirement...
>
> [1] https://lkml.org/lkml/2016/1/6/437
> [2] http://marc.info/?t=14787795912=1=2
>
> But, now that I have pondered on this for a year or so, I firmly
> believe it was a mistake to have the code in i2c-mux-gpio and
> i2c-mux-pinctrl automatically try to deduce if the mux should be
> mux-locked or parent-locked. It might be easy to make that call
> in some trivial cases, but it is not difficult to dream up
> scenarios where it would be extremely hard for the code to get
> this decision right. It's just fragile. But now we have code in
> those two muxes that has unwanted tentacles into the guts of the
> gpio and pinctrl subsystems. Hopefully those unwanted tentacles
> can be replaced with something based on device links? However, it
> is still not hard to come up with scenarios that will require
> manual intervention in order to select the right kind of i2c mux
> locking. So, I fear that we have inadequate code trying to make a
> decision automatically, and that we at some point down the line
> will have some impossible case needing a binding that trumps the
> heuristic. Why have a heuristic at all in that case? In short, it
> should have been a binding from the start, methinks.
>
> That was a long rant regarding i2c-mux-gpio and i2c-mux-pinctrl.

[PATCH 2/2] watchdog: sama5d4: Implement resume hook

2017-01-30 Thread Alexandre Belloni
When resuming from the deepest state on sama5d2, it is necessary to restore
MR as the registers are lost.

Signed-off-by: Alexandre Belloni 
---
 drivers/watchdog/sama5d4_wdt.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/watchdog/sama5d4_wdt.c b/drivers/watchdog/sama5d4_wdt.c
index 6dd07bef515a..de8ff10a032e 100644
--- a/drivers/watchdog/sama5d4_wdt.c
+++ b/drivers/watchdog/sama5d4_wdt.c
@@ -258,11 +258,28 @@ static const struct of_device_id sama5d4_wdt_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, sama5d4_wdt_of_match);
 
+#ifdef CONFIG_PM_SLEEP
+static int sama5d4_wdt_resume(struct device *dev)
+{
+	struct sama5d4_wdt *wdt = dev_get_drvdata(dev);
+
+	wdt_write(wdt, AT91_WDT_MR, wdt->mr & ~AT91_WDT_WDDIS);
+	if (wdt->mr & AT91_WDT_WDDIS)
+		wdt_write(wdt, AT91_WDT_MR, wdt->mr);
+
+	return 0;
+}
+#endif
+
+static SIMPLE_DEV_PM_OPS(sama5d4_wdt_pm_ops, NULL,
+			 sama5d4_wdt_resume);
+
 static struct platform_driver sama5d4_wdt_driver = {
 	.probe		= sama5d4_wdt_probe,
 	.remove		= sama5d4_wdt_remove,
 	.driver		= {
 		.name	= "sama5d4_wdt",
+		.pm	= &sama5d4_wdt_pm_ops,
 		.of_match_table = sama5d4_wdt_of_match,
 	}
 };
-- 
2.11.0
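
For readers following along, the resume sequence can be modelled in userspace. The sketch below is not driver code: struct fake_wdt, fake_write_mr(), fake_suspend_loses_state() and the WDT_WDDIS bit position are stand-ins chosen for illustration. It mirrors the two writes in the hook above: MR is first written with WDDIS cleared and then, only if the cached value has the watchdog disabled, the cached value is written as-is.

```c
#include <stdint.h>

#define WDT_WDDIS (1u << 15)	/* illustrative bit position, an assumption */

struct fake_wdt {
	uint32_t hw_mr;		/* stands in for the hardware MR register */
	uint32_t mr;		/* cached copy kept by the driver */
	unsigned int writes;	/* number of register writes, for inspection */
};

static void fake_write_mr(struct fake_wdt *w, uint32_t val)
{
	w->hw_mr = val;
	w->writes++;
}

/* Model of the deep sleep state: register contents are gone afterwards. */
static void fake_suspend_loses_state(struct fake_wdt *w)
{
	w->hw_mr = 0;
}

/* Same shape as the resume hook: restore MR from the cached value. */
static void fake_resume(struct fake_wdt *w)
{
	fake_write_mr(w, w->mr & ~WDT_WDDIS);
	if (w->mr & WDT_WDDIS)
		fake_write_mr(w, w->mr);
}
```

In both cases the register ends up equal to the cached value; a disabled watchdog simply costs one extra write.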



[PATCH 1/2] watchdog: sama5d4: Cache MR instead of a partial config

2017-01-30 Thread Alexandre Belloni
.config is used to cache a part of WDT_MR at probe time and is not used
afterwards. Instead of doing that, actually cache MR and avoid reading it
every time it is modified.

Signed-off-by: Alexandre Belloni 
---
 drivers/watchdog/sama5d4_wdt.c | 45 +++++++++++++++++++--------------------------
 1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/drivers/watchdog/sama5d4_wdt.c b/drivers/watchdog/sama5d4_wdt.c
index a49634cdc1cc..6dd07bef515a 100644
--- a/drivers/watchdog/sama5d4_wdt.c
+++ b/drivers/watchdog/sama5d4_wdt.c
@@ -28,7 +28,7 @@
 struct sama5d4_wdt {
 	struct watchdog_device	wdd;
 	void __iomem		*reg_base;
-	u32			config;
+	u32			mr;
 };
 
 static int wdt_timeout = WDT_DEFAULT_TIMEOUT;
@@ -53,11 +53,9 @@ MODULE_PARM_DESC(nowayout,
 static int sama5d4_wdt_start(struct watchdog_device *wdd)
 {
 	struct sama5d4_wdt *wdt = watchdog_get_drvdata(wdd);
-	u32 reg;
 
-	reg = wdt_read(wdt, AT91_WDT_MR);
-	reg &= ~AT91_WDT_WDDIS;
-	wdt_write(wdt, AT91_WDT_MR, reg);
+	wdt->mr &= ~AT91_WDT_WDDIS;
+	wdt_write(wdt, AT91_WDT_MR, wdt->mr);
 
 	return 0;
 }
@@ -65,11 +63,9 @@ static int sama5d4_wdt_start(struct watchdog_device *wdd)
 static int sama5d4_wdt_stop(struct watchdog_device *wdd)
 {
 	struct sama5d4_wdt *wdt = watchdog_get_drvdata(wdd);
-	u32 reg;
 
-	reg = wdt_read(wdt, AT91_WDT_MR);
-	reg |= AT91_WDT_WDDIS;
-	wdt_write(wdt, AT91_WDT_MR, reg);
+	wdt->mr |= AT91_WDT_WDDIS;
+	wdt_write(wdt, AT91_WDT_MR, wdt->mr);
 
 	return 0;
 }
@@ -88,14 +84,12 @@ static int sama5d4_wdt_set_timeout(struct watchdog_device *wdd,
 {
 	struct sama5d4_wdt *wdt = watchdog_get_drvdata(wdd);
 	u32 value = WDT_SEC2TICKS(timeout);
-	u32 reg;
 
-	reg = wdt_read(wdt, AT91_WDT_MR);
-	reg &= ~AT91_WDT_WDV;
-	reg &= ~AT91_WDT_WDD;
-	reg |= AT91_WDT_SET_WDV(value);
-	reg |= AT91_WDT_SET_WDD(value);
-	wdt_write(wdt, AT91_WDT_MR, reg);
+	wdt->mr &= ~AT91_WDT_WDV;
+	wdt->mr &= ~AT91_WDT_WDD;
+	wdt->mr |= AT91_WDT_SET_WDV(value);
+	wdt->mr |= AT91_WDT_SET_WDD(value);
+	wdt_write(wdt, AT91_WDT_MR, wdt->mr);
 
 	wdd->timeout = timeout;
 
@@ -132,19 +126,19 @@ static int of_sama5d4_wdt_init(struct device_node *np, struct sama5d4_wdt *wdt)
 {
 	const char *tmp;
 
-	wdt->config = AT91_WDT_WDDIS;
+	wdt->mr = AT91_WDT_WDDIS;
 
 	if (!of_property_read_string(np, "atmel,watchdog-type", &tmp) &&
 	    !strcmp(tmp, "software"))
-		wdt->config |= AT91_WDT_WDFIEN;
+		wdt->mr |= AT91_WDT_WDFIEN;
 	else
-		wdt->config |= AT91_WDT_WDRSTEN;
+		wdt->mr |= AT91_WDT_WDRSTEN;
 
 	if (of_property_read_bool(np, "atmel,idle-halt"))
-		wdt->config |= AT91_WDT_WDIDLEHLT;
+		wdt->mr |= AT91_WDT_WDIDLEHLT;
 
 	if (of_property_read_bool(np, "atmel,dbg-halt"))
-		wdt->config |= AT91_WDT_WDDBGHLT;
+		wdt->mr |= AT91_WDT_WDDBGHLT;
 
 	return 0;
 }
@@ -163,11 +157,10 @@ static int sama5d4_wdt_init(struct sama5d4_wdt *wdt)
 	reg &= ~AT91_WDT_WDDIS;
 	wdt_write(wdt, AT91_WDT_MR, reg);
 
-	reg = wdt->config;
-	reg |= AT91_WDT_SET_WDD(value);
-	reg |= AT91_WDT_SET_WDV(value);
+	wdt->mr |= AT91_WDT_SET_WDD(value);
+	wdt->mr |= AT91_WDT_SET_WDV(value);
 
-	wdt_write(wdt, AT91_WDT_MR, reg);
+	wdt_write(wdt, AT91_WDT_MR, wdt->mr);
 
 	return 0;
 }
@@ -211,7 +204,7 @@ static int sama5d4_wdt_probe(struct platform_device *pdev)
 		return ret;
 	}
 
-	if ((wdt->config & AT91_WDT_WDFIEN) && irq) {
+	if ((wdt->mr & AT91_WDT_WDFIEN) && irq) {
 		ret = devm_request_irq(&pdev->dev, irq, sama5d4_wdt_irq_handler,
 				       IRQF_SHARED | IRQF_IRQPOLL |
 				       IRQF_NO_SUSPEND, pdev->name, pdev);
-- 
2.11.0



Re: [PATCH 09/14] x86/fpu: Change 'size_total' parameter to unsigned and standardize the size checks in copy_xstate_to_*()

2017-01-30 Thread Yu-cheng Yu
On Thu, Jan 26, 2017 at 11:22:54AM +0100, Ingo Molnar wrote:
> 'size_total' is derived from an unsigned input parameter - and then converted
> to 'int' and checked for negative ranges:
> 
>   if (size_total < 0 || offset < size_total) {
> 
> This conversion and the checks are unnecessary obfuscation, reject overly
> large requested copy sizes outright and simplify the underlying code.
> 
> Reported-by: Rik van Riel 
> Cc: Andy Lutomirski 
> Cc: Borislav Petkov 
> Cc: Dave Hansen 
> Cc: Fenghua Yu 
> Cc: H. Peter Anvin 
> Cc: Linus Torvalds 
> Cc: Oleg Nesterov 
> Cc: Thomas Gleixner 
> Cc: Yu-cheng Yu 
> Cc: Fenghua Yu 
> Signed-off-by: Ingo Molnar 
> ---
>  arch/x86/kernel/fpu/xstate.c | 32 +++++++++++++++-----------------
>  1 file changed, 15 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index 8f9da89015e6..cceabca485c8 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -924,15 +924,11 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
>   * the source data pointer or increment pos, count, kbuf, and ubuf.
>   */
>  static inline int
> -__copy_xstate_to_kernel(void *kbuf,
> -			const void *data,
> -			unsigned int offset, unsigned int size, int size_total)
> +__copy_xstate_to_kernel(void *kbuf, const void *data,
> +			unsigned int offset, unsigned int size, unsigned int size_total)
>  {
> -	if (!size)
> -		return 0;
> -
> -	if (size_total < 0 || offset < size_total) {
> -		unsigned int copy = size_total < 0 ? size : min(size, size_total - offset);
> +	if (offset < size_total) {
> +		unsigned int copy = min(size, size_total - offset);
>  
>  		memcpy(kbuf + offset, data, copy);
>  	}
> @@ -985,12 +981,13 @@ int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int of
>  		offset = xstate_offsets[i];
>  		size = xstate_sizes[i];
>  
> +		/* The next component has to fit fully into the output buffer: */
> +		if (offset + size > size_total)
> +			break;

This makes sense, but would be different from the non-compacted format path
where this rule is not enforced.  Do we want to unify both?

Yu-cheng
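
To spell out the two behaviours being compared, a small userspace sketch (not the kernel code; copy_clamped() and copy_if_fits() are made-up names). The first helper clamps the copy to whatever room is left in the buffer, as the helper in the first hunk does. The second copies a component only if it fits completely, as the new check in the second hunk requires.

```c
#include <string.h>

/* Clamp: copy as many bytes as still fit, possibly a partial component. */
static unsigned int copy_clamped(void *dst, const void *src,
				 unsigned int offset, unsigned int size,
				 unsigned int size_total)
{
	unsigned int copy = 0;

	if (offset < size_total) {
		unsigned int room = size_total - offset;

		copy = size < room ? size : room;
		memcpy((char *)dst + offset, src, copy);
	}
	return copy;
}

/* All or nothing: copy only if the whole component fits in the buffer. */
static unsigned int copy_if_fits(void *dst, const void *src,
				 unsigned int offset, unsigned int size,
				 unsigned int size_total)
{
	/* Written as a subtraction so that offset + size cannot wrap around. */
	if (offset > size_total || size > size_total - offset)
		return 0;

	memcpy((char *)dst + offset, src, size);
	return size;
}
```

For a 4-byte component at offset 6 of an 8-byte buffer, the clamped variant copies 2 bytes while the all-or-nothing variant copies none, which is the difference between the two paths I am asking about.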




Re: [PATCH 9/9] net, bpf: use kvzalloc helper

2017-01-30 Thread Michal Hocko
Andrew, please ignore this one.

On Mon 30-01-17 10:49:40, Michal Hocko wrote:
> From: Michal Hocko 
> 
> both bpf_map_area_alloc and xt_alloc_table_info try really hard to
> play nicely with large memory requests which can be triggered from
> the userspace (by an admin). See 5bad87348c70 ("netfilter: x_tables:
> avoid warn and OOM killer on vmalloc call") resp. d407bd25a204 ("bpf:
> don't trigger OOM killer under pressure with map alloc").
> 
> The current allocation pattern strongly resembles kvmalloc helper except
> for one thing __GFP_NORETRY is not used for the vmalloc fallback. The
> main reason why kvmalloc doesn't really support __GFP_NORETRY is
> because vmalloc doesn't support this flag properly and it is far from
> straightforward to make it understand it because there are some hard
> coded GFP_KERNEL allocation deep in the call chains. This patch simply
> replaces the open coded variants with kvmalloc and puts a note to
> push on MM people to support __GFP_NORETRY in kvmalloc if this turns out
> to be really needed along with OOM report pointing at vmalloc.
> 
> If there is an immediate need and no full support yet then
>   kvmalloc(size, gfp | __GFP_NORETRY)
> will work as good as __vmalloc(gfp | __GFP_NORETRY) - in other words it
> might trigger the OOM in some cases.
> 
> Cc: Alexei Starovoitov 
> Cc: Andrey Konovalov 
> Cc: Marcelo Ricardo Leitner 
> Cc: Pablo Neira Ayuso 
> Acked-by: Daniel Borkmann 
> Signed-off-by: Michal Hocko 
> ---
>  kernel/bpf/syscall.c     | 19 +++++--------------
>  net/netfilter/x_tables.c | 16 ++++++----------
>  2 files changed, 11 insertions(+), 24 deletions(-)
> 
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 08a4d287226b..3d38c7a51e1a 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -54,21 +54,12 @@ void bpf_register_map_type(struct bpf_map_type_list *tl)
>  
>  void *bpf_map_area_alloc(size_t size)
>  {
> - /* We definitely need __GFP_NORETRY, so OOM killer doesn't
> -  * trigger under memory pressure as we really just want to
> -  * fail instead.
> + /*
> +  * FIXME: we would really like to not trigger the OOM killer and rather
> +  * fail instead. This is not supported right now. Please nag MM people
> +  * if these OOM start bothering people.
>*/
> - const gfp_t flags = __GFP_NOWARN | __GFP_NORETRY | __GFP_ZERO;
> - void *area;
> -
> - if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> - area = kmalloc(size, GFP_USER | flags);
> - if (area != NULL)
> - return area;
> - }
> -
> - return __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM | flags,
> -  PAGE_KERNEL);
> + return kvzalloc(size, GFP_USER);
>  }
>  
>  void bpf_map_area_free(void *area)
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index d529989f5791..ba8ba633da72 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -995,16 +995,12 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
>   if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
>   return NULL;
>  
> - if (sz <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
> - info = kmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
> - if (!info) {
> - info = __vmalloc(sz, GFP_KERNEL | __GFP_NOWARN |
> -  __GFP_NORETRY | __GFP_HIGHMEM,
> -  PAGE_KERNEL);
> - if (!info)
> - return NULL;
> - }
> - memset(info, 0, sizeof(*info));
> + /*
> +  * FIXME: we would really like to not trigger the OOM killer and rather
> +  * fail instead. This is not supported right now. Please nag MM people
> +  * if these OOM start bothering people.
> +  */
> + info = kvzalloc(sz, GFP_KERNEL);
>   info->size = size;
>   return info;
>  }
> -- 
> 2.11.0
> 

-- 
Michal Hocko
SUSE Labs
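
For context, the open-coded pattern that the quoted patch replaces looks roughly like the userspace model below. This is only a sketch: small_alloc() and large_alloc() stand in for kmalloc() and __vmalloc(), SMALL_LIMIT is a made-up threshold standing in for PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER, and struct table_info is a toy type. The sketch keeps a NULL check after the allocation; as quoted, the x_tables hunk assigns info->size right after kvzalloc() without one.

```c
#include <stdlib.h>
#include <string.h>

#define SMALL_LIMIT 4096u	/* made-up threshold for the "cheap" allocator */

static int small_calls, large_calls;

/* Stand-in for the contiguous allocator: refuses large requests. */
static void *small_alloc(size_t size)
{
	small_calls++;
	return size <= SMALL_LIMIT ? malloc(size) : NULL;
}

/* Stand-in for the fallback allocator. */
static void *large_alloc(size_t size)
{
	large_calls++;
	return malloc(size);
}

/* Try the cheap allocator first, fall back, and zero the result. */
static void *zalloc_with_fallback(size_t size)
{
	void *p = NULL;

	if (size <= SMALL_LIMIT)
		p = small_alloc(size);
	if (!p)
		p = large_alloc(size);
	if (p)
		memset(p, 0, size);
	return p;
}

struct table_info {
	unsigned int size;
};

/* sz must be at least sizeof(struct table_info). */
static struct table_info *alloc_table_info(size_t sz, unsigned int size)
{
	struct table_info *info = zalloc_with_fallback(sz);

	if (!info)	/* the allocation can still fail */
		return NULL;
	info->size = size;
	return info;
}
```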


Re: [RFC V2 02/12] mm: Isolate HugeTLB allocations away from CDM nodes

2017-01-30 Thread Dave Hansen
On 01/29/2017 07:35 PM, Anshuman Khandual wrote:
> HugeTLB allocation/release/accounting currently spans across all the nodes
> under N_MEMORY node mask. Coherent memory nodes should not be part of these
> allocations. So use system_ram() call to fetch system RAM only nodes on the
> platform which can then be used for HugeTLB allocation purpose instead of
> N_MEMORY node mask. This isolates coherent device memory nodes from HugeTLB
> allocations.

Does this end up making it impossible to use hugetlbfs to access device
memory?

