Re: [PATCH 3.19.y-ckt 173/251] clocksource: exynos_mct: Avoid blocking calls in the cpu hotplug notifier

2015-07-15 Thread Krzysztof Kozlowski
2015-07-16 15:37 GMT+09:00 Duan Andy :
> From: Kamal Mostafa  Sent: Thursday, July 16, 2015 9:08 
> AM
>> To: linux-kernel@vger.kernel.org; sta...@vger.kernel.org; kernel-
>> t...@lists.ubuntu.com
>> Cc: Kamal Mostafa; daniel.lezc...@linaro.org; kyungmin.p...@samsung.com;
>> Damian Eppel; kg...@kernel.org; Thomas Gleixner; linux-arm-
>> ker...@lists.infradead.org; m.szyprow...@samsung.com
>> Subject: [PATCH 3.19.y-ckt 173/251] clocksource: exynos_mct: Avoid
>> blocking calls in the cpu hotplug notifier
>>
>> 3.19.8-ckt4 -stable review patch.  If anyone has any objections, please
>> let me know.
>>
>> --
>>
>> From: Damian Eppel 
>>
>> commit 56a94f13919c0db5958611b388e1581b4852f3c9 upstream.
>>
>> Whilst testing cpu hotplug events on kernel configured with DEBUG_PREEMPT
>> and DEBUG_ATOMIC_SLEEP we get following BUG message, caused by calling
>> request_irq() and free_irq() in the context of hotplug notification
>> (which is in this case atomic context).
>>
>> [   40.785859] CPU1: Software reset
>> [   40.786660] BUG: sleeping function called from invalid context at
>> mm/slub.c:1241
>> [   40.786668] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name:
>> swapper/1
>> [   40.786678] Preemption disabled at:[<  (null)>]   (null)
>> [   40.786681]
>> [   40.786692] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc4-
>> 00024-g7dca860 #36
>> [   40.786698] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
>> [   40.786728] [] (unwind_backtrace) from []
>> (show_stack+0x10/0x14)
>> [   40.786747] [] (show_stack) from []
>> (dump_stack+0x70/0xbc)
>> [   40.786767] [] (dump_stack) from []
>> (kmem_cache_alloc+0xd8/0x170)
>> [   40.786785] [] (kmem_cache_alloc) from []
>> (request_threaded_irq+0x64/0x128)
>> [   40.786804] [] (request_threaded_irq) from []
>> (exynos4_local_timer_setup+0xc0/0x13c)
>> [   40.786820] [] (exynos4_local_timer_setup) from []
>> (exynos4_mct_cpu_notify+0x30/0xa8)
>> [   40.786838] [] (exynos4_mct_cpu_notify) from []
>> (notifier_call_chain+0x44/0x84)
>> [   40.786857] [] (notifier_call_chain) from []
>> (__cpu_notify+0x28/0x44)
>> [   40.786873] [] (__cpu_notify) from []
>> (secondary_start_kernel+0xec/0x150)
>> [   40.786886] [] (secondary_start_kernel) from [<40008764>]
>> (0x40008764)
>>
>> Interrupts cannot be requested/freed in the CPU_STARTING/CPU_DYING
>> notifications which run on the hotplugged cpu with interrupts and
>> preemption disabled.
>>
>> To avoid the issue, request the interrupts for all possible cpus in the
>> boot code. The interrupts are marked NO_AUTOENABLE to avoid a racy
>> request_irq/disable_irq() sequence. The flag prevents the
>> request_irq() code from enabling the interrupt immediately.
>>
>> The interrupt is then enabled in the CPU_STARTING notifier of the
>> hotplugged cpu and again disabled with disable_irq_nosync() in the
>> CPU_DYING notifier.
>>
>> [ tglx: Massaged changelog to match the patch ]
>>
>> Fixes: 7114cd749a12 ("clocksource: exynos_mct: use (request/free)_irq
>> calls for local timer registration")
>> Reported-by: Krzysztof Kozlowski 
>> Reviewed-by: Krzysztof Kozlowski 
>> Tested-by: Krzysztof Kozlowski 
>> Tested-by: Marcin Jabrzyk 
>> Signed-off-by: Damian Eppel 
>> Cc: m.szyprow...@samsung.com
>> Cc: kyungmin.p...@samsung.com
>> Cc: daniel.lezc...@linaro.org
>> Cc: kg...@kernel.org
>> Cc: linux-arm-ker...@lists.infradead.org
>> Link: http://lkml.kernel.org/r/1435324984-7328-1-git-send-email-
>> d.ep...@samsung.com
>> Signed-off-by: Thomas Gleixner 
>> Signed-off-by: Kamal Mostafa 
>> ---
>>  drivers/clocksource/exynos_mct.c | 43 --
>> --
>>  1 file changed, 30 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/clocksource/exynos_mct.c
>> b/drivers/clocksource/exynos_mct.c
>> index 83564c9..c844616 100644
>> --- a/drivers/clocksource/exynos_mct.c
>> +++ b/drivers/clocksource/exynos_mct.c
>> @@ -466,15 +466,12 @@ static int exynos4_local_timer_setup(struct
>> clock_event_device *evt)
>>   exynos4_mct_write(TICK_BASE_CNT, mevt->base + MCT_L_TCNTB_OFFSET);
>>
>>   if (mct_int_type == MCT_INT_SPI) {
>> - evt->irq = mct_irqs[MCT_L0_IRQ + cpu];
>> - if (request_irq(evt->irq, exynos4_mct_tick_isr,
>> - IRQF_TIMER | IRQF_NOBALANCING,
>> - evt->name, mevt)) {
>> - pr_err("exynos-mct: cannot register IRQ %d\n",
>> - evt->irq);
>> +
>> + if (evt->irq == -1)
>>   return -EIO;
>> - }
>> - irq_force_affinity(mct_irqs[MCT_L0_IRQ + cpu],
>> cpumask_of(cpu));
>> +
>> + irq_force_affinity(evt->irq, cpumask_of(cpu));
>> + enable_irq(evt->irq);
>>   } else {
>>   enable_percpu_irq(mct_irqs[MCT_L0_IRQ], 0);
>
> Why not use enable_percpu_irq(evt->irq) here?
>
>>   }
>> @@ -487,10 +484,12 @@ static int exynos4_local_timer_setup(struct
>> clock_event_device *e

Re: [PATCH v5 2/3] pwm: add MediaTek display PWM driver support

2015-07-15 Thread Daniel Kurtz
On Thu, Jul 16, 2015 at 1:38 PM, YH Huang  wrote:
> On Wed, 2015-07-15 at 23:59 +0800, YH Huang wrote:
>> On Mon, 2015-07-13 at 18:19 +0800, Daniel Kurtz wrote:
>> > On Mon, Jul 13, 2015 at 5:04 PM, YH Huang  wrote:
>> > > Add display PWM driver support to modify backlight for MT8173 and MT6595.
>> > > The PWM has one channel to control the brightness of the display.
>> > > When the (high_width / period) is closer to 1, the screen is brighter;
>> > > otherwise, it is darker.
>> > >
>> > > Signed-off-by: YH Huang 
>> > > ---
>> > >  drivers/pwm/Kconfig|  10 ++
>> > >  drivers/pwm/Makefile   |   1 +
>> > >  drivers/pwm/pwm-mtk-disp.c | 256 
>> > > +
>> > >  3 files changed, 267 insertions(+)
>> > >  create mode 100644 drivers/pwm/pwm-mtk-disp.c
>> > >
>> > > diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
>> > > index b1541f4..f5b03a4 100644
>> > > --- a/drivers/pwm/Kconfig
>> > > +++ b/drivers/pwm/Kconfig
>> > > @@ -211,6 +211,16 @@ config PWM_LPSS_PLATFORM
>> > >   To compile this driver as a module, choose M here: the module
>> > >   will be called pwm-lpss-platform.
>> > >
>> > > +config PWM_MTK_DISP
>> > > +   tristate "MediaTek display PWM driver"
>> > > +   depends on ARCH_MEDIATEK || COMPILE_TEST
>> > > +   help
>> > > + Generic PWM framework driver for MediaTek disp-pwm device.
>> > > + The PWM is used to control the backlight brightness for 
>> > > display.
>> > > +
>> > > + To compile this driver as a module, choose M here: the module
>> > > + will be called pwm-mtk-disp.
>> > > +
>> > >  config PWM_MXS
>> > > tristate "Freescale MXS PWM support"
>> > > depends on ARCH_MXS && OF
>> > > diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
>> > > index ec50eb5..99c9e75 100644
>> > > --- a/drivers/pwm/Makefile
>> > > +++ b/drivers/pwm/Makefile
>> > > @@ -18,6 +18,7 @@ obj-$(CONFIG_PWM_LPC32XX) += pwm-lpc32xx.o
>> > >  obj-$(CONFIG_PWM_LPSS) += pwm-lpss.o
>> > >  obj-$(CONFIG_PWM_LPSS_PCI) += pwm-lpss-pci.o
>> > >  obj-$(CONFIG_PWM_LPSS_PLATFORM)+= pwm-lpss-platform.o
>> > > +obj-$(CONFIG_PWM_MTK_DISP) += pwm-mtk-disp.o
>> > >  obj-$(CONFIG_PWM_MXS)  += pwm-mxs.o
>> > >  obj-$(CONFIG_PWM_PCA9685)  += pwm-pca9685.o
>> > >  obj-$(CONFIG_PWM_PUV3) += pwm-puv3.o
>> > > diff --git a/drivers/pwm/pwm-mtk-disp.c b/drivers/pwm/pwm-mtk-disp.c
>> > > new file mode 100644
>> > > index 000..1f17cee
>> > > --- /dev/null
>> > > +++ b/drivers/pwm/pwm-mtk-disp.c
>> > > @@ -0,0 +1,256 @@
>> > > +/*
>> > > + * MediaTek display pulse-width-modulation controller driver.
>> > > + * Copyright (c) 2015 MediaTek Inc.
>> > > + * Author: YH Huang 
>> > > + *
>> > > + * This program is free software; you can redistribute it and/or modify
>> > > + * it under the terms of the GNU General Public License version 2 as
>> > > + * published by the Free Software Foundation.
>> > > + *
>> > > + * This program is distributed in the hope that it will be useful,
>> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> > > + * GNU General Public License for more details.
>> > > + */
>> > > +
>> > > +#include 
>> > > +#include 
>> > > +#include 
>> > > +#include 
>> > > +#include 
>> > > +#include 
>> > > +#include 
>> > > +#include 
>> > > +
>> > > +#define DISP_PWM_EN 0
>> >
>> > The "DISP_PWM_*" are register offsets, so use a hex value, like this:
>> >
>> > #define DISP_PWM_EN  0x00
>> >
>> > Use BIT() for register *fields*, that is, the individual bits of a 
>> > register.
>> >
>>
>> Got it!
>>
>> > > +#define PWM_ENABLE_MASK BIT(0)
>> > > +
>> > > +#define DISP_PWM_COMMIT BIT(3)
>> >
>> > #define DISP_PWM_COMMIT 0x08
>> >
>> > > +#define PWM_COMMIT_MASK BIT(0)
>> > > +
>> > > +#define DISP_PWM_CON_0 BIT(4)
>> >
>> > #define DISP_PWM_COMMIT 0x10
>> >
>> > > +#define PWM_CLKDIV_SHIFT   16
>> > > +#define PWM_CLKDIV_MAX 0x3ff
>> > > +#define PWM_CLKDIV_MASK (PWM_CLKDIV_MAX << PWM_CLKDIV_SHIFT)
>> > > +
>> > > +#define DISP_PWM_CON_1 0x14
>> > > +#define PWM_PERIOD_MASK 0xfff
>> > > +/* Shift log2(PWM_PERIOD_MASK + 1) as divisor */
>> > > +#define PWM_PERIOD_BIT_SHIFT   12
>> > > +
>> > > +#define PWM_HIGH_WIDTH_SHIFT   16
>> > > +#define PWM_HIGH_WIDTH_MASK (0x1fff << PWM_HIGH_WIDTH_SHIFT)
>> > > +
>> > > +struct mtk_disp_pwm {
>> > > +   struct pwm_chip chip;
>> > > +   struct device *dev;
>> >
>> > I don't think "dev" is actually used.  And, if needed, it can be
>> > extracted from "chip".
>> >
>>
>> I will drop it.
>>
>> > > +   struct clk *clk_main;
>> > > +   struct clk *clk_mm;
>> > > +   void __iomem *base;
>> > > +};
>> > > +
>> > > +static inline struct mtk_disp_pwm
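
The distinction drawn in the review above (hex values for register offsets, BIT() for fields inside a register) can be sketched in userspace. This is only an illustration, not the driver's code: the register block is simulated with an array, and fake_readl(), fake_writel() and update_bits() are hypothetical helpers invented for the sketch. The offsets and masks follow the values quoted in the thread.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)			(1u << (n))

/* Register offsets: byte addresses relative to the block base, in hex. */
#define DISP_PWM_EN		0x00
#define DISP_PWM_COMMIT		0x08
#define DISP_PWM_CON_0		0x10

/* Register fields: single bits or bit ranges inside one register. */
#define PWM_ENABLE_MASK		BIT(0)
#define PWM_CLKDIV_SHIFT	16
#define PWM_CLKDIV_MAX		0x3ff
#define PWM_CLKDIV_MASK		(PWM_CLKDIV_MAX << PWM_CLKDIV_SHIFT)

/* Simulated 32-bit register file standing in for the MMIO region. */
#define NUM_REGS		8
static uint32_t regs[NUM_REGS];

/* Hypothetical accessor: read the simulated register at a byte offset. */
uint32_t fake_readl(uint32_t offset)
{
	assert(offset % 4 == 0 && offset / 4 < NUM_REGS);
	return regs[offset / 4];
}

/* Hypothetical accessor: write the simulated register at a byte offset. */
void fake_writel(uint32_t value, uint32_t offset)
{
	assert(offset % 4 == 0 && offset / 4 < NUM_REGS);
	regs[offset / 4] = value;
}

/* Read-modify-write: change only the bits selected by mask. */
void update_bits(uint32_t offset, uint32_t mask, uint32_t value)
{
	uint32_t v = fake_readl(offset);

	v &= ~mask;
	v |= value & mask;
	fake_writel(v, offset);
}
```

With this split, a call such as update_bits(DISP_PWM_EN, PWM_ENABLE_MASK, PWM_ENABLE_MASK) reads naturally as "in the register at offset 0x00, set bit 0", which is harder to see when offsets are also written with BIT().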

RE: [PATCH 3.19.y-ckt 173/251] clocksource: exynos_mct: Avoid blocking calls in the cpu hotplug notifier

2015-07-15 Thread Duan Andy
From: Kamal Mostafa  Sent: Thursday, July 16, 2015 9:08 AM
> To: linux-kernel@vger.kernel.org; sta...@vger.kernel.org; kernel-
> t...@lists.ubuntu.com
> Cc: Kamal Mostafa; daniel.lezc...@linaro.org; kyungmin.p...@samsung.com;
> Damian Eppel; kg...@kernel.org; Thomas Gleixner; linux-arm-
> ker...@lists.infradead.org; m.szyprow...@samsung.com
> Subject: [PATCH 3.19.y-ckt 173/251] clocksource: exynos_mct: Avoid
> blocking calls in the cpu hotplug notifier
> 
> 3.19.8-ckt4 -stable review patch.  If anyone has any objections, please
> let me know.
> 
> --
> 
> From: Damian Eppel 
> 
> commit 56a94f13919c0db5958611b388e1581b4852f3c9 upstream.
> 
> Whilst testing cpu hotplug events on kernel configured with DEBUG_PREEMPT
> and DEBUG_ATOMIC_SLEEP we get following BUG message, caused by calling
> request_irq() and free_irq() in the context of hotplug notification
> (which is in this case atomic context).
> 
> [   40.785859] CPU1: Software reset
> [   40.786660] BUG: sleeping function called from invalid context at
> mm/slub.c:1241
> [   40.786668] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name:
> swapper/1
> [   40.786678] Preemption disabled at:[<  (null)>]   (null)
> [   40.786681]
> [   40.786692] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc4-
> 00024-g7dca860 #36
> [   40.786698] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
> [   40.786728] [] (unwind_backtrace) from []
> (show_stack+0x10/0x14)
> [   40.786747] [] (show_stack) from []
> (dump_stack+0x70/0xbc)
> [   40.786767] [] (dump_stack) from []
> (kmem_cache_alloc+0xd8/0x170)
> [   40.786785] [] (kmem_cache_alloc) from []
> (request_threaded_irq+0x64/0x128)
> [   40.786804] [] (request_threaded_irq) from []
> (exynos4_local_timer_setup+0xc0/0x13c)
> [   40.786820] [] (exynos4_local_timer_setup) from []
> (exynos4_mct_cpu_notify+0x30/0xa8)
> [   40.786838] [] (exynos4_mct_cpu_notify) from []
> (notifier_call_chain+0x44/0x84)
> [   40.786857] [] (notifier_call_chain) from []
> (__cpu_notify+0x28/0x44)
> [   40.786873] [] (__cpu_notify) from []
> (secondary_start_kernel+0xec/0x150)
> [   40.786886] [] (secondary_start_kernel) from [<40008764>]
> (0x40008764)
> 
> Interrupts cannot be requested/freed in the CPU_STARTING/CPU_DYING
> notifications which run on the hotplugged cpu with interrupts and
> preemption disabled.
> 
> To avoid the issue, request the interrupts for all possible cpus in the
> boot code. The interrupts are marked NO_AUTOENABLE to avoid a racy
> request_irq/disable_irq() sequence. The flag prevents the
> request_irq() code from enabling the interrupt immediately.
> 
> The interrupt is then enabled in the CPU_STARTING notifier of the
> hotplugged cpu and again disabled with disable_irq_nosync() in the
> CPU_DYING notifier.
> 
> [ tglx: Massaged changelog to match the patch ]
> 
> Fixes: 7114cd749a12 ("clocksource: exynos_mct: use (request/free)_irq
> calls for local timer registration")
> Reported-by: Krzysztof Kozlowski 
> Reviewed-by: Krzysztof Kozlowski 
> Tested-by: Krzysztof Kozlowski 
> Tested-by: Marcin Jabrzyk 
> Signed-off-by: Damian Eppel 
> Cc: m.szyprow...@samsung.com
> Cc: kyungmin.p...@samsung.com
> Cc: daniel.lezc...@linaro.org
> Cc: kg...@kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Link: http://lkml.kernel.org/r/1435324984-7328-1-git-send-email-
> d.ep...@samsung.com
> Signed-off-by: Thomas Gleixner 
> Signed-off-by: Kamal Mostafa 
> ---
>  drivers/clocksource/exynos_mct.c | 43 --
> --
>  1 file changed, 30 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/clocksource/exynos_mct.c
> b/drivers/clocksource/exynos_mct.c
> index 83564c9..c844616 100644
> --- a/drivers/clocksource/exynos_mct.c
> +++ b/drivers/clocksource/exynos_mct.c
> @@ -466,15 +466,12 @@ static int exynos4_local_timer_setup(struct
> clock_event_device *evt)
>   exynos4_mct_write(TICK_BASE_CNT, mevt->base + MCT_L_TCNTB_OFFSET);
> 
>   if (mct_int_type == MCT_INT_SPI) {
> - evt->irq = mct_irqs[MCT_L0_IRQ + cpu];
> - if (request_irq(evt->irq, exynos4_mct_tick_isr,
> - IRQF_TIMER | IRQF_NOBALANCING,
> - evt->name, mevt)) {
> - pr_err("exynos-mct: cannot register IRQ %d\n",
> - evt->irq);
> +
> + if (evt->irq == -1)
>   return -EIO;
> - }
> - irq_force_affinity(mct_irqs[MCT_L0_IRQ + cpu],
> cpumask_of(cpu));
> +
> + irq_force_affinity(evt->irq, cpumask_of(cpu));
> + enable_irq(evt->irq);
>   } else {
>   enable_percpu_irq(mct_irqs[MCT_L0_IRQ], 0);

Why not use enable_percpu_irq(evt->irq) here?

>   }
> @@ -487,10 +484,12 @@ static int exynos4_local_timer_setup(struct
> clock_event_device *evt)  static void exynos4_local_timer_stop(struct
> clock_event_device *evt)  {
>   evt->set_mode(CLOCK_EVT_MODE_UNUSED, evt);
> - if (mct_int_type

Re: [PATCH] csiostor: Use list_for_each_safe instead of re-implementing it

2015-07-15 Thread Johannes Thumshirn
Christophe JAILLET  writes:

> Use 'list_for_each_safe' instead of 'list_for_each' + own logic to keep
> safe when a list entry is deleted.
> Delete the now useless 'csio_list_prev' macro.
>
> Signed-off-by: Christophe JAILLET 
> ---
>  drivers/scsi/csiostor/csio_defs.h |  1 -
>  drivers/scsi/csiostor/csio_hw.c   | 10 --
>  drivers/scsi/csiostor/csio_scsi.c | 10 --
>  3 files changed, 8 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/scsi/csiostor/csio_defs.h 
> b/drivers/scsi/csiostor/csio_defs.h
> index c38017b..4b3557c 100644
> --- a/drivers/scsi/csiostor/csio_defs.h
> +++ b/drivers/scsi/csiostor/csio_defs.h
> @@ -70,7 +70,6 @@ csio_list_deleted(struct list_head *list)
>  }
>  
>  #define csio_list_next(elem) (((struct list_head *)(elem))->next)
> -#define csio_list_prev(elem) (((struct list_head *)(elem))->prev)
>  
>  /* State machine */
>  typedef void (*csio_sm_state_t)(void *, uint32_t);
> diff --git a/drivers/scsi/csiostor/csio_hw.c b/drivers/scsi/csiostor/csio_hw.c
> index 622bdab..61ee6cb 100644
> --- a/drivers/scsi/csiostor/csio_hw.c
> +++ b/drivers/scsi/csiostor/csio_hw.c
> @@ -3643,20 +3643,19 @@ static void
>  csio_mgmt_tmo_handler(uintptr_t data)
>  {
>   struct csio_mgmtm *mgmtm = (struct csio_mgmtm *) data;
> - struct list_head *tmp;
> + struct list_head *tmp, *next;
>   struct csio_ioreq *io_req;
>  
>   csio_dbg(mgmtm->hw, "Mgmt timer invoked!\n");
>  
>   spin_lock_irq(&mgmtm->hw->lock);
>  
> - list_for_each(tmp, &mgmtm->active_q) {
> + list_for_each_safe(tmp, next, &mgmtm->active_q) {
>   io_req = (struct csio_ioreq *) tmp;
>   io_req->tmo -= min_t(uint32_t, io_req->tmo, ECM_MIN_TMO);
>  
>   if (!io_req->tmo) {
>   /* Dequeue the request from retry Q. */
> - tmp = csio_list_prev(tmp);
>   list_del_init(&io_req->sm.sm_list);
>   if (io_req->io_cbfn) {
>   /* io_req will be freed by completion handler */
> @@ -3680,7 +3679,7 @@ csio_mgmtm_cleanup(struct csio_mgmtm *mgmtm)
>  {
>   struct csio_hw *hw = mgmtm->hw;
>   struct csio_ioreq *io_req;
> - struct list_head *tmp;
> + struct list_head *tmp, *next;
>   uint32_t count;
>  
>   count = 30;
> @@ -3692,9 +3691,8 @@ csio_mgmtm_cleanup(struct csio_mgmtm *mgmtm)
>   }
>  
>   /* release outstanding req from ACTIVEQ */
> - list_for_each(tmp, &mgmtm->active_q) {
> + list_for_each_safe(tmp, next, &mgmtm->active_q) {
>   io_req = (struct csio_ioreq *) tmp;
> - tmp = csio_list_prev(tmp);
>   list_del_init(&io_req->sm.sm_list);
>   mgmtm->stats.n_active--;
>   if (io_req->io_cbfn) {
> diff --git a/drivers/scsi/csiostor/csio_scsi.c 
> b/drivers/scsi/csiostor/csio_scsi.c
> index 2c4562d..2bfb401 100644
> --- a/drivers/scsi/csiostor/csio_scsi.c
> +++ b/drivers/scsi/csiostor/csio_scsi.c
> @@ -2322,7 +2322,7 @@ csio_scsi_alloc_ddp_bufs(struct csio_scsim *scm, struct 
> csio_hw *hw,
>int buf_size, int num_buf)
>  {
>   int n = 0;
> - struct list_head *tmp;
> + struct list_head *tmp, *next;
>   struct csio_dma_buf *ddp_desc = NULL;
>   uint32_t unit_size = 0;
>  
> @@ -2370,9 +2370,8 @@ csio_scsi_alloc_ddp_bufs(struct csio_scsim *scm, struct 
> csio_hw *hw,
>   return 0;
>  no_mem:
>   /* release dma descs back to freelist and free dma memory */
> - list_for_each(tmp, &scm->ddp_freelist) {
> + list_for_each_safe(tmp, next, &scm->ddp_freelist) {
>   ddp_desc = (struct csio_dma_buf *) tmp;
> - tmp = csio_list_prev(tmp);
>   pci_free_consistent(hw->pdev, ddp_desc->len, ddp_desc->vaddr,
>   ddp_desc->paddr);
>   list_del_init(&ddp_desc->list);
> @@ -2393,13 +2392,12 @@ no_mem:
>  static void
>  csio_scsi_free_ddp_bufs(struct csio_scsim *scm, struct csio_hw *hw)
>  {
> - struct list_head *tmp;
> + struct list_head *tmp, *next;
>   struct csio_dma_buf *ddp_desc;
>  
>   /* release dma descs back to freelist and free dma memory */
> - list_for_each(tmp, &scm->ddp_freelist) {
> + list_for_each_safe(tmp, next, &scm->ddp_freelist) {
>   ddp_desc = (struct csio_dma_buf *) tmp;
> - tmp = csio_list_prev(tmp);
>   pci_free_consistent(hw->pdev, ddp_desc->len, ddp_desc->vaddr,
>   ddp_desc->paddr);
>   list_del_init(&ddp_desc->list);

Reviewed-by: Johannes Thumshirn 

-- 
Johannes Thumshirn   Storage
jthumsh...@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message t
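
The pattern the patch above switches to can be sketched outside the kernel. The following is a simplified userspace re-creation written for this illustration (the real macros live in the kernel's list header); struct req, reap_expired() and list_count() are hypothetical names. It shows why the "safe" iterator needs no manual step back to the previous entry: the successor is cached before the loop body unlinks the current entry.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink entry and make it point at itself, as list_del_init() does. */
void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry;
	entry->prev = entry;
}

/* 'n' holds the successor before the body runs, so the body may unlink 'pos'. */
#define list_for_each_safe(pos, n, head) \
	for (pos = (head)->next, n = pos->next; pos != (head); \
	     pos = n, n = pos->next)

struct req {
	struct list_head list;	/* first member, so the cast below is valid */
	unsigned int tmo;
};

/* Unlink every request whose timeout is zero; return how many were unlinked. */
int reap_expired(struct list_head *queue)
{
	struct list_head *tmp, *next;
	int reaped = 0;

	assert(queue != NULL);
	list_for_each_safe(tmp, next, queue) {
		struct req *r = (struct req *)tmp;

		if (r->tmo == 0) {
			list_del_init(&r->list);
			reaped++;
		}
	}
	return reaped;
}

/* Count the entries currently linked on the list. */
int list_count(const struct list_head *head)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next)
		n++;
	return n;
}
```

After list_del_init() the removed entry points at itself, so a plain forward iterator that reads the removed entry's next pointer would never leave it. That is the situation the removed csio_list_prev() step worked around by hand.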

[PATCH v2 3/3] ARM: exynos_defconfig: Enable NTC Thermistors support

2015-07-15 Thread Javier Martinez Canillas
The Exynos5420 Peach Pit and Exynos5800 Peach Pi Chromebooks have
IIO based ADC thermistors. Enable built-in support for its driver.

Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Krzysztof Kozlowski 

---

Changes in v2:
- Add Krzysztof Kozlowski Reviewed-by tag in patch #3.

 arch/arm/configs/exynos_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/exynos_defconfig 
b/arch/arm/configs/exynos_defconfig
index 9504e7790288..e5d7d4476a80 100644
--- a/arch/arm/configs/exynos_defconfig
+++ b/arch/arm/configs/exynos_defconfig
@@ -94,6 +94,7 @@ CONFIG_CHARGER_MAX14577=y
 CONFIG_CHARGER_MAX77693=y
 CONFIG_CHARGER_TPS65090=y
 CONFIG_SENSORS_LM90=y
+CONFIG_SENSORS_NTC_THERMISTOR=y
 CONFIG_SENSORS_PWM_FAN=y
 CONFIG_SENSORS_INA2XX=y
 CONFIG_THERMAL=y
-- 
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 1/3] ARM: multi_v7_defconfig: Enable max77802 regulator

2015-07-15 Thread Javier Martinez Canillas
The Exynos5420 based Peach Pit and Exynos5800 based Peach Pi Chromebooks
use the Maxim max77802 Power Management IC (PMIC). This PMIC has, besides
other devices, a set of regulators that can be controlled over I2C.

Commit f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802 regulator,
rtc and clock drivers") was supposed to enable the config option for the
regulator driver as a module but the final version that landed did not
include this. The commit was modified and the REGULATOR_MAX77802 removed
since it was thought to be useless.

Unfortunately, that's not the case, for the reason mentioned above, so this
patch enables the needed Kconfig option.

Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Krzysztof Kozlowski 

---

Changes in v2:
- Better explanation why the max77802 regulator config option is needed.
- Add Krzysztof Kozlowski Reviewed-by tag in patch #1

 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index 4b93761d58d2..b07493997993 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -402,6 +402,7 @@ CONFIG_REGULATOR_MAX14577=m
 CONFIG_REGULATOR_MAX8907=y
 CONFIG_REGULATOR_MAX8973=y
 CONFIG_REGULATOR_MAX77686=y
+CONFIG_REGULATOR_MAX77802=m
 CONFIG_REGULATOR_MAX77693=m
 CONFIG_REGULATOR_PALMAS=y
 CONFIG_REGULATOR_S2MPS11=y
-- 
2.4.3



Re: [BUG] mellanox IB driver fails to load on large config

2015-07-15 Thread Or Gerlitz

On 7/14/2015 11:28 PM, Alex Thorlton wrote:


We see the same exact messages on 4.1-rc8.




Does this solve the problem?


diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index ad31e47..c8ae3b9 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -45,7 +45,7 @@
 #include 

 #define MAX_MSIX_P_PORT 17
-#define MAX_MSIX   64
+#define MAX_MSIX   1024
 #define MIN_MSIX_P_PORT 5
 #define MLX4_IS_LEGACY_EQ_MODE(dev_cap) ((dev_cap).num_comp_vectors < \
(dev_cap).num_ports * MIN_MSIX_P_PORT)
--



Re: [PATCH] asm-generic: {get,put}_user ptr argument evaluate only 1 time

2015-07-15 Thread Geert Uytterhoeven
Hi Sato-san,

On Thu, Jul 16, 2015 at 7:15 AM, Yoshinori Sato
 wrote:
> The current implementation evaluates the ptr argument 2 times,
> which can give an unexpected result.
>
> Signed-off-by: Yoshinori Sato 

Acked-by: Geert Uytterhoeven 

> ---
>  include/asm-generic/uaccess.h | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
> index 72d8803..1b813fb 100644
> --- a/include/asm-generic/uaccess.h
> +++ b/include/asm-generic/uaccess.h
> @@ -163,9 +163,10 @@ static inline __must_check long __copy_to_user(void 
> __user *to,
>
>  #define put_user(x, ptr)   \
>  ({ \
> +   __typeof__((ptr)) __p = (ptr);  \
> might_fault();  \
> -   access_ok(VERIFY_WRITE, ptr, sizeof(*ptr)) ?\
> -   __put_user(x, ptr) :\
> +   access_ok(VERIFY_WRITE, __p, sizeof(*__p)) ?\
> +   __put_user(x, __p) :\

For safety, you may want to change "x" to "(x)" while at it.

> -EFAULT;\
>  })
>
> @@ -225,9 +226,10 @@ extern int __put_user_bad(void) 
> __attribute__((noreturn));
>
>  #define get_user(x, ptr)   \
>  ({ \
> +   __typeof__((ptr)) __p = (ptr);  \
> might_fault();  \
> -   access_ok(VERIFY_READ, ptr, sizeof(*ptr)) ? \
> -   __get_user(x, ptr) :\
> +   access_ok(VERIFY_READ, __p, sizeof(*__p)) ? \
> +   __get_user(x, __p) :\

Likewise.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
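
The double evaluation fixed by the patch above can be reproduced in a small userspace sketch. This is not the kernel's uaccess code: fake_access_ok(), get_twice(), get_once() and the two helper functions are hypothetical names made up for the illustration, and the statement expression with __typeof__ assumes a GNU-compatible compiler (GCC or Clang).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the access check: accept any non-NULL pointer. */
#define fake_access_ok(ptr)	((ptr) != NULL)

/* Old shape: 'ptr' is expanded twice, so its side effects run twice. */
#define get_twice(x, ptr) \
	(fake_access_ok(ptr) ? ((x) = *(ptr), 0) : -EFAULT)

/* New shape: evaluate 'ptr' once into a local and use only the local. */
#define get_once(x, ptr) \
({ \
	__typeof__((ptr)) __p = (ptr); \
	fake_access_ok(__p) ? ((x) = *__p, 0) : -EFAULT; \
})

/* Read through 'p++' with the old macro; return how far p advanced. */
int advance_with_twice(const int *buf, int *out)
{
	const int *p = buf;

	assert(buf != NULL && out != NULL);
	(void)get_twice(*out, p++);
	return (int)(p - buf);
}

/* Same call with the new macro; p advances exactly once. */
int advance_with_once(const int *buf, int *out)
{
	const int *p = buf;

	assert(buf != NULL && out != NULL);
	(void)get_once(*out, p++);
	return (int)(p - buf);
}
```

With the old shape, an argument such as p++ is incremented once by the check and once by the read, so the value comes from the second element and the pointer ends up two elements ahead. The local copy in the new shape avoids both effects.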


Re: ARM: OMAP2: Delete unnecessary checks before three function calls

2015-07-15 Thread SF Markus Elfring
> I have to say, I am a bit leery about applying the omap_device.c and
> omap_hwmod.c changes, since the called functions -- omap_device_delete()
> and clk_disable() -- don't explicitly document that NULLs are allowed
> to be passed in.

What are the chances of improving the documentation around such implementation details?


> So there's no explicit contract that callers can rely upon, to (at least
> in theory) prevent those internal NULL pointer checks from being removed.

Are there any additional variations to consider for source files from different
processor architectures?


> So I would suggest that those two functions' kerneldoc be patched first to 
> explicitly state that passing in a NULL pointer is allowed.

Could my static source code analysis approach help you to clarify
any further open issues?


> So I'll apply that change now for v4.3, touching up the commit message 
> accordingly.

Thanks for your constructive feedback.


>>  arch/arm/mach-omap2/omap_device.c | 3 +--
>>  arch/arm/mach-omap2/omap_hwmod.c  | 5 +
>>  arch/arm/mach-omap2/timer.c   | 3 +--

Did Tony Lindgren pick a similar update suggestion up, too?
https://lkml.org/lkml/2015/7/15/112

Regards,
Markus


[PATCH v2 2/3] ARM: multi_v7_defconfig: Enable NTC Thermistors support

2015-07-15 Thread Javier Martinez Canillas
The Exynos5420 Peach Pit and Exynos5800 Peach Pi Chromebooks have
IIO based ADC thermistors. Enable module support for its driver
and also for the needed Exynos ADC driver.

Signed-off-by: Javier Martinez Canillas 
Reviewed-by: Krzysztof Kozlowski 

---

Changes in v2:
- Add Krzysztof Kozlowski Reviewed-by tag in patch #2.

 arch/arm/configs/multi_v7_defconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index b07493997993..0a8aa724c5a0 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -357,6 +357,7 @@ CONFIG_POWER_RESET_KEYSTONE=y
 CONFIG_POWER_RESET_RMOBILE=y
 CONFIG_SENSORS_LM90=y
 CONFIG_SENSORS_LM95245=y
+CONFIG_SENSORS_NTC_THERMISTOR=m
 CONFIG_THERMAL=y
 CONFIG_CPU_THERMAL=y
 CONFIG_RCAR_THERMAL=y
@@ -617,6 +618,7 @@ CONFIG_MEMORY=y
 CONFIG_TI_AEMIF=y
 CONFIG_IIO=y
 CONFIG_AT91_ADC=m
+CONFIG_EXYNOS_ADC=m
 CONFIG_XILINX_XADC=y
 CONFIG_AK8975=y
 CONFIG_PWM=y
-- 
2.4.3



[PATCH v2 0/3] ARM: Enable NTC Thermistors and max77802 regulator drivers

2015-07-15 Thread Javier Martinez Canillas
Hello Kukjin and Krzysztof,

This series enables support in exynos_defconfig and multi_v7_defconfig
for the NTC Thermistors found on the Exynos5 Peach Chromebooks and
also enables support for the max77802 regulators in multi_v7_defconfig.

This is the second version, which addresses the issues pointed out by Krzysztof
and adds his Reviewed-by tag.

The series is composed of the following patches:

Changes in v2:
- Better explanation why the max77802 regulator config option is needed.
- Add Krzysztof Kozlowski Reviewed-by tag in patch #1
- Add Krzysztof Kozlowski Reviewed-by tag in patch #2.
- Add Krzysztof Kozlowski Reviewed-by tag in patch #3.

Javier Martinez Canillas (3):
  ARM: multi_v7_defconfig: Enable max77802 regulator
  ARM: multi_v7_defconfig: Enable NTC Thermistors support
  ARM: exynos_defconfig: Enable NTC Thermistors support

 arch/arm/configs/exynos_defconfig   | 1 +
 arch/arm/configs/multi_v7_defconfig | 3 +++
 2 files changed, 4 insertions(+)

-- 
2.4.3



Re: [PATCH 2/4] Input: tsc2005 - convert to regmap

2015-07-15 Thread Dmitry Torokhov
On Wed, Jul 15, 2015 at 02:13:26PM +0200, Sebastian Reichel wrote:
> -static int tsc2005_write(struct tsc2005 *ts, u8 reg, u16 value)
> -{
> - u32 tx = ((reg | TSC2005_REG_PND0) << 16) | value;
> - struct spi_transfer xfer = {
> - .tx_buf = &tx,
> - .len= 4,
> - .bits_per_word  = 24,
> - };

I wonder why the original code used 24 bit-sized-words for transfers...

-- 
Dmitry
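
One possible reading of the quoted helper, not confirmed anywhere in this thread: the word it builds is an 8-bit control byte followed by a 16-bit value, so only the low 24 bits of the u32 ever carry data. The sketch below packs a word the same way and counts the bits it uses. The value given to TSC2005_REG_PND0 is an assumption made for the illustration, and tsc2005_pack_write() and bits_used() are hypothetical helpers, not functions from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value, for illustration only; check the driver for the real one. */
#define TSC2005_REG_PND0	0x02

/* Pack the control byte and the 16-bit value as the removed helper did. */
uint32_t tsc2005_pack_write(uint8_t reg, uint16_t value)
{
	uint32_t word = ((uint32_t)(reg | TSC2005_REG_PND0) << 16) | value;

	/* The top byte can never be set: 8 + 16 bits are all that is packed. */
	assert((word >> 24) == 0);
	return word;
}

/* Number of bits needed to represent v (0 for v == 0). */
unsigned int bits_used(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		n++;
		v >>= 1;
	}
	return n;
}
```

Even with every input bit set, the packed word needs no more than 24 bits, which at least matches the bits_per_word value in the quoted transfer. Whether that was the original author's reason is a guess.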


[PATCH] Documentation: Update filesystems/debugfs.txt

2015-07-15 Thread Wang Long
This patch updates the Documentation/filesystems/debugfs.txt
file. The main change is to add descriptions of the following
functions:
debugfs_create_atomic_t
debugfs_create_u32_array
debugfs_create_devm_seqfile
debugfs_create_file_size

Signed-off-by: Wang Long 
---
 Documentation/filesystems/debugfs.txt | 41 +++
 1 file changed, 41 insertions(+)

diff --git a/Documentation/filesystems/debugfs.txt 
b/Documentation/filesystems/debugfs.txt
index 88ab81c..b1ba8df 100644
--- a/Documentation/filesystems/debugfs.txt
+++ b/Documentation/filesystems/debugfs.txt
@@ -1,4 +1,5 @@
 Copyright 2009 Jonathan Corbet 
+Updated by Wang Long  on 2015/07/16
 
 Debugfs exists as a simple way for kernel developers to make information
 available to user space.  Unlike /proc, which is only meant for information
@@ -51,6 +52,17 @@ operations should be provided; others can be included as 
needed.  Again,
 the return value will be a dentry pointer to the created file, NULL for
 error, or ERR_PTR(-ENODEV) if debugfs support is missing.
 
+Create a file with an initial size, the following function can be used
+instead:
+
+struct dentry *debugfs_create_file_size(const char *name, umode_t mode,
+   struct dentry *parent, void *data,
+   const struct file_operations *fops,
+   loff_t file_size);
+
+file_size is the initial file size. The other parameters are the same
+as the function debugfs_create_file.
+
 In a number of cases, the creation of a set of file operations is not
 actually necessary; the debugfs code provides a number of helper functions
 for simple situations.  Files containing a single integer value can be
@@ -100,6 +112,14 @@ A read on the resulting file will yield either Y (for 
non-zero values) or
 N, followed by a newline.  If written to, it will accept either upper- or
 lower-case values, or 1 or 0.  Any other input will be silently ignored.
 
+Also, atomic_t values can be placed in debugfs with:
+
+struct dentry *debugfs_create_atomic_t(const char *name, umode_t mode,
+   struct dentry *parent, atomic_t *value)
+
+A read of this file will get atomic_t values, and a write of this file
+will set atomic_t values.
+
 Another option is exporting a block of arbitrary binary data, with
 this structure and function:
 
@@ -147,6 +167,27 @@ The "base" argument may be 0, but you may want to build the reg32 array
 using __stringify, and a number of register names (macros) are actually
 byte offsets over a base for the register block.
 
+If you want to dump a u32 array in debugfs, you can create a file with:
+
+struct dentry *debugfs_create_u32_array(const char *name, umode_t mode,
+   struct dentry *parent,
+   u32 *array, u32 elements);
+
+The "array" argument provides the data, and the "elements" argument is
+the number of elements in the array. Note: once the array is created, its
+size cannot be changed.
+
+There is a helper function to create a device-related seq_file:
+
+   struct dentry *debugfs_create_devm_seqfile(struct device *dev,
+   const char *name,
+   struct dentry *parent,
+   int (*read_fn)(struct seq_file *s,
+   void *data));
+
+The "dev" argument is the device related to this debugfs file, and
+"read_fn" is a function pointer which is called to print the
+seq_file content.
 
 There are a couple of other directory-oriented helper functions:
 
-- 
1.8.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


linux-next: Tree for Jul 16

2015-07-15 Thread Stephen Rothwell
Hi all,

For those that care:  I have stopped signing these messages because
vger's mailing list software destroys the signatures and then Google
(at least) assumes that they are spam.  I could clear sign them instead
(if anyone thinks that is worthwhile).

Changes since 20150715:

Removed tree: init (it has all been merged)

The tip tree lost its build failure.

The rcu tree gained a conflict against the tip tree and a build failure
for which I applied a merge fix patch.

The akpm-current tree gained a conflict against the arm tree and 2 build
failures for which I applied fix patches.

Non-merge commits (relative to Linus' tree): 2446
 2485 files changed, 100730 insertions(+), 34595 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc and an allmodconfig for x86_64,
a multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), it is also built with powerpc allnoconfig
(32 and 64 bit), ppc44x_defconfig and allyesconfig (this fails its final
link) and i386, sparc, sparc64 and arm defconfig.

Below is a summary of the state of the merge.

I am currently merging 223 trees (counting Linus' and 33 trees of patches
pending for Linus' tree).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (97d6e2b636c6 Merge tag 'module-final-v4.2-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux)
Merging fixes/master (c7e9ad7da219 Merge branch 'perf-urgent-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging kbuild-current/rc-fixes (c517d838eb7d Linux 4.0-rc1)
Merging arc-current/for-curr (e4140819dadc ARC: signal handling robustify)
Merging arm-current/fixes (ab525b473c96 ARM: invalidate L1 before enabling 
coherency)
Merging m68k-current/for-linus (1214c525484c m68k: Use for_each_sg())
Merging metag-fixes/fixes (0164a711c97b metag: Fix ioremap_wc/ioremap_cached 
build errors)
Merging mips-fixes/mips-fixes (1795cd9b3a91 Linux 3.16-rc5)
Merging powerpc-fixes/fixes (bc0195aad0da Linux 4.2-rc2)
Merging powerpc-merge-mpe/fixes (bc0195aad0da Linux 4.2-rc2)
Merging powerpc-merge-benh/merge (c517d838eb7d Linux 4.0-rc1)
Merging sparc/master (4a10a91756ef Merge branch 'upstream' of 
git://git.infradead.org/users/pcmoore/audit)
Merging net/master (bc8c20acaea1 bridge: multicast: treat igmpv3 report with 
INCLUDE and no sources as a leave)
Merging ipsec/master (31a418986a58 xen: netback: read hotplug script once at 
start of day.)
Merging sound-current/for-linus (4d0e677523a9 ALSA: line6: Fix -EBUSY error 
during active monitoring)
Merging pci-current/for-linus (c9ddbac9c891 PCI: Restore PCI_MSIX_FLAGS_BIRMASK 
definition)
Merging wireless-drivers/master (7865598ec24a ath9k_hw: fix device ID check for 
AR956x)
Merging driver-core.current/driver-core-linus (bc0195aad0da Linux 4.2-rc2)
Merging tty.current/tty-linus (bc0195aad0da Linux 4.2-rc2)
Merging usb.current/usb-linus (51f007e1a1f1 Merge tag 'usb-serial-4.2-rc2' of 
git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus)
Merging usb-gadget-fixes/fixes (b2e2c94b878b usb: gadget: f_midi: fix error 
recovery path)
Merging usb-serial-fixes/usb-linus (d23f47d4927f USB: serial: Destroy 
serial_minors IDR on module exit)
Merging staging.current/staging-linus (d309509f8472 staging: vt6656: check 
ieee80211_bss_conf bssid not NULL)
Merging char-misc.current/char-misc-linus (bc0195aad0da Linux 4.2-rc2)
Merging input-current/for-linus (dbf3c370862d Revert "Input: synaptics - 
allocate 3 slots to keep stability in image sensors")
Merging crypto-current/master (030f4e968741 crypto: nx - Fix reentrancy bugs)
Merging ide/master (d681f1166919 ide: remove deprecated use of pci api)
Merging devicetree-current/devicetree/merge (f76502aa9140 of/d

Re: [PATCH] ARM: OMAP2: Delete unnecessary checks before three function calls

2015-07-15 Thread Tony Lindgren
* Paul Walmsley  [150715 22:58]:
> Hello Markus
> 
> On Tue, 30 Jun 2015, SF Markus Elfring wrote:
> 
> > From: Markus Elfring 
> > Date: Tue, 30 Jun 2015 14:00:16 +0200
> > 
> > The functions clk_disable(), of_node_put() and omap_device_delete() test
> > whether their argument is NULL and then return immediately.
> > Thus the test around the call is not needed.
> > 
> > This issue was detected by using the Coccinelle software.
> > 
> > Signed-off-by: Markus Elfring 
> 
> Thanks for the patch.  I have to say, I am a bit leery about applying the 
> omap_device.c and omap_hwmod.c changes, since the called functions -- 
> omap_device_delete() and clk_disable() -- don't explicitly document that 
> NULLs are allowed to be passed in.  So there's no explicit contract that 
> callers can rely upon, to (at least in theory) prevent those internal NULL 
> pointer checks from being removed.
> 
> So I would suggest that those two functions' kerneldoc be patched first to 
> explicitly state that passing in a NULL pointer is allowed.  Then I would 
> feel a bit more comfortable applying the omap_device.c and omap_hwmod.c 
> changes.
> 
> The kerneldoc for of_node_put() does explicitly allow NULLs to be passed 
> in.  So I'll apply that change now for v4.3, touching up the commit 
> message accordingly.

I have them applied from a later thread already, but will drop both in
my branch as I have not pushed them out yet.

Regards,

Tony


Re: [PATCH 1/3] ARM: multi_v7_defconfig: Enable max77802 regulator

2015-07-15 Thread Krzysztof Kozlowski
On 16.07.2015 14:44, Javier Martinez Canillas wrote:
> Hello Krzysztof,
> 
> Thanks for the feedback.
> 
> On 07/16/2015 02:45 AM, Krzysztof Kozlowski wrote:
>> On 16.07.2015 01:32, Javier Martinez Canillas wrote:
>>> The Maxim max77802 Power Management IC has, besides other devices, a set of
>>> regulators. Commit f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802
>>> regulator, rtc and clock drivers") was supposed to enable the config option
>>> for the regulator driver as a module but the final version that landed did
>>> not include this. So this patch enables the needed Kconfig option.
>>>
>>> Signed-off-by: Javier Martinez Canillas 
>>
>> Please describe why you want to enable it (IOW who will benefit from
>> enabling it?). This symbol was removed by Kukjin from your commit:
>>  [kg...@kernel.org: removing useless REGULATOR_MAX77802 config]
>> so justification would be welcomed.
>>
> 
> You are right, sorry for not making the commit message clear. This PMIC
> is used by a couple of Exynos5 based boards such as the Peach Pit and Pi
> Chromebooks. I expect it to be found in other designs too just like the
> max77686 is found in many Exynos5 based boards.
> 
> I'll add this to the commit message on v2.
>  
>> Besides the commit description I agree with the patch.
>>
> 
> Does this mean I can add your Reviewed-by to this patch as well?

With an extended description (something similar to the explanation in your
other patches), yes, go ahead:

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof



Re: [PATCH] kprobes: Use debugfs_remove_recursive instead debugfs_remove

2015-07-15 Thread Masami Hiramatsu
On 2015/07/16 11:16, Wang Long wrote:
> In debugfs_kprobe_init, we create a directory 'kprobes' and three
> files 'list', 'enabled' and 'blacklist'. If the creation of any one of
> the three files fails, we should remove all of them. But debugfs_remove
> cannot do this, because it does not remove a directory's children. So
> use debugfs_remove_recursive instead.
> 

OK, it should be fixed.

Acked-by: Masami Hiramatsu 

Thank you!
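
As a user-space illustration of the failure mode described in the quoted
commit message, the sketch below models a directory that holds child entries.
The node type and the node_create(), node_remove() and
node_remove_recursive() helpers are hypothetical stand-ins, not debugfs code.
The only assumption is that a plain remove refuses a non-empty directory,
while the recursive variant removes the children first.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a debugfs-like entry; not the kernel's struct dentry. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/* Create an entry under parent (parent may be NULL for a root entry). */
struct node *node_create(struct node *parent)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->parent = parent;
	if (parent) {
		n->next_sibling = parent->first_child;
		parent->first_child = n;
	}
	return n;
}

/* Unlink n from its parent's child list, if it has a parent. */
void node_unlink(struct node *n)
{
	struct node **pp;

	if (!n->parent)
		return;
	for (pp = &n->parent->first_child; *pp; pp = &(*pp)->next_sibling) {
		if (*pp == n) {
			*pp = n->next_sibling;
			break;
		}
	}
}

/* Plain remove: refuses a non-empty directory. Returns 0 on success, -1 otherwise. */
int node_remove(struct node *n)
{
	if (!n)
		return 0;
	if (n->first_child)
		return -1;
	node_unlink(n);
	free(n);
	return 0;
}

/* Recursive remove: removes all children first, then the entry itself. */
void node_remove_recursive(struct node *n)
{
	if (!n)
		return;
	while (n->first_child)
		node_remove_recursive(n->first_child);
	node_unlink(n);
	free(n);
}
```

In this model, an error path that calls node_remove() on a directory that
already has one or two files in it leaves everything in place, which mirrors
why the patch switches the error path to the recursive variant.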

> Signed-off-by: Wang Long 
> ---
>  kernel/kprobes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index c90e417..8cd82a5 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -2459,7 +2459,7 @@ static int __init debugfs_kprobe_init(void)
>   return 0;
>  
>  error:
> - debugfs_remove(dir);
> + debugfs_remove_recursive(dir);
>   return -ENOMEM;
>  }
>  
> 


-- 
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: masami.hiramatsu...@hitachi.com


Re: [Ksummit-discuss] [TECH TOPIC] IRQ affinity

2015-07-15 Thread Michael S. Tsirkin
On Wed, Jul 15, 2015 at 02:48:00PM -0400, Matthew Wilcox wrote:
> On Wed, Jul 15, 2015 at 11:25:55AM -0600, Jens Axboe wrote:
> > On 07/15/2015 11:19 AM, Keith Busch wrote:
> > >On Wed, 15 Jul 2015, Bart Van Assche wrote:
> > >>* With blk-mq and scsi-mq optimal performance can only be achieved if
> > >> the relationship between MSI-X vector and NUMA node does not change
> > >> over time. This is necessary to allow a blk-mq/scsi-mq driver to
> > >> ensure that interrupts are processed on the same NUMA node as the
> > >> node on which the data structures for a communication channel have
> > >> been allocated. However, today there is no API that allows
> > >> blk-mq/scsi-mq drivers and irqbalanced to exchange information
> > >> about the relationship between MSI-X vector ranges and NUMA nodes.
> > >
> > >We could have low-level drivers provide blk-mq the controller's irq
> > >associated with a particular h/w context, and the block layer can provide
> > >the context's cpumask to irqbalance with the smp affinity hint.
> > >
> > >The nvme driver already uses the hwctx cpumask to set hints, but this
> > >doesn't seem like it should be a driver responsibility. It currently
> > >doesn't work correctly anyway with hot-cpu since blk-mq could rebalance
> > >the h/w contexts without syncing with the low-level driver.
> > >
> > >If we can add this to blk-mq, one additional case to consider is if the
> > >same interrupt vector is used with multiple h/w contexts. Blk-mq's cpu
> > >assignment needs to be aware of this to prevent sharing a vector across
> > >NUMA nodes.
> > 
> > Exactly. I may have promised to do just that at the last LSF/MM conference,
> > just haven't done it yet. The point is to share the mask, I'd ideally like
> > to take it all the way where the driver just asks for a number of vecs
> > through a nice API that takes care of all this. Lots of duplicated code in
> > drivers for this these days, and it's a mess.
> 
> Yes.  I think the fundamental problem is that our MSI-X API is so funky.
> We have this incredibly flexible scheme where each MSI-X vector could
> have its own interrupt handler, but that's not what drivers want.
> They want to say "Give me eight MSI-X vectors spread across the CPUs,
> and use this interrupt handler for all of them".  That is, instead of
> the current scheme where each MSI-X vector gets its own Linux interrupt,
> we should have one interrupt handler (of the per-cpu interrupt type),
> which shows up with N bits set in its CPU mask.

It would definitely be nice to have a way to express that.  But it's
also pretty common for drivers to have e.g. RX and TX use separate
vectors, and these need separate handlers.
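
One way to picture the request Matthew describes ("eight MSI-X vectors spread
across the CPUs") is as a mapping from vectors to groups of CPUs. The sketch
below is plain user-space C, and spread_vectors() is a hypothetical helper,
not an existing kernel or PCI API. It assumes at most 64 CPUs and a simple
contiguous split, and it ignores NUMA topology entirely.

```c
#include <stdint.h>

/*
 * Hypothetical helper: split ncpus CPUs (at most 64) into nvecs
 * contiguous groups and store one CPU bitmask per vector in masks[].
 * Returns 0 on success, -1 on invalid arguments.
 */
int spread_vectors(unsigned int ncpus, unsigned int nvecs, uint64_t *masks)
{
	unsigned int v, cpu = 0;

	if (!masks || nvecs == 0 || ncpus == 0 || ncpus > 64 || nvecs > ncpus)
		return -1;

	for (v = 0; v < nvecs; v++) {
		/* The first (ncpus % nvecs) vectors take one extra CPU. */
		unsigned int n = ncpus / nvecs + (v < ncpus % nvecs ? 1 : 0);

		masks[v] = 0;
		while (n--)
			masks[v] |= UINT64_C(1) << cpu++;
	}
	return 0;
}
```

The separate RX and TX handlers mentioned above are not modelled here; they
would need one such set of masks per handler.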

> ___
> Ksummit-discuss mailing list
> ksummit-disc...@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss


Re: [PATCH 1/1] add pwm capability to dm816x

2015-07-15 Thread Paul Walmsley
Hello Brian,

On Mon, 15 Jun 2015, Brian Hutchinson wrote:

> Clocks 4-7 are capable of PWM output on dm816x.
> 
> This adds the pwm capability to those timers.
> 
> Cc: Paul Walmsley 
> Cc: Tero Kristo 
> Cc: Tony Lindgren 
> Signed-off-by: Brian Hutchinson >

This patch seems to be corrupted.  The above line doesn't look right; 
there are some spurious newlines in the patch header, and tabs seem to 
have been converted to whitespace.  Some of these issues may be due to 
mailer problems.  Could you please fix and try again?


- Paul

> 
> --- arch/arm/mach-omap2/omap_hwmod_81xx_data.c_orig 2015-06-15
> 13:20:43.174343431 -0400
> +++ arch/arm/mach-omap2/omap_hwmod_81xx_data.c  2015-06-15
> 13:34:51.770551392 -0400
> @@ -546,6 +546,14 @@ static struct omap_timer_capability_dev_
> .timer_capability   = OMAP_TIMER_ALWON,
>  };
> 
> +/* pwm timers dev attribute.
> + * timers 4-7 may be used for PWM output - see datasheet timer terminal
> + * functions table
> + */
> +static struct omap_timer_capability_dev_attr capability_pwm_dev_attr = {
> +   .timer_capability   = OMAP_TIMER_ALWON | OMAP_TIMER_HAS_PWM,
> +};
> +
>  static struct omap_hwmod dm816x_timer1_hwmod = {
> .name   = "timer1",
> .clkdm_name = "alwon_l3s_clkdm",
> @@ -619,7 +627,7 @@ static struct omap_hwmod dm816x_timer4_h
> .modulemode = MODULEMODE_SWCTRL,
> },
> },
> -   .dev_attr   = &capability_alwon_dev_attr,
> +   .dev_attr   = &capability_pwm_dev_attr,
> .class  = &dm816x_timer_hwmod_class,
>  };
> 
> @@ -640,7 +648,7 @@ static struct omap_hwmod dm816x_timer5_h
> .modulemode = MODULEMODE_SWCTRL,
> },
> },
> -   .dev_attr   = &capability_alwon_dev_attr,
> +   .dev_attr   = &capability_pwm_dev_attr,
> .class  = &dm816x_timer_hwmod_class,
>  };
> 
> @@ -661,7 +669,7 @@ static struct omap_hwmod dm816x_timer6_h
> .modulemode = MODULEMODE_SWCTRL,
> },
> },
> -   .dev_attr   = &capability_alwon_dev_attr,
> +   .dev_attr   = &capability_pwm_dev_attr,
> .class  = &dm816x_timer_hwmod_class,
>  };
> 
> @@ -682,7 +690,7 @@ static struct omap_hwmod dm816x_timer7_h
> .modulemode = MODULEMODE_SWCTRL,
> },
> },
> -   .dev_attr   = &capability_alwon_dev_attr,
> +   .dev_attr   = &capability_pwm_dev_attr,
> .class  = &dm816x_timer_hwmod_class,
>  };
> 


- Paul


Re: [RFC v4 17/25] powerpc, fbdev: Use arch_nvram_ops methods instead of nvram_read_byte() and nvram_write_byte()

2015-07-15 Thread Finn Thain

On Wed, 15 Jul 2015, I wrote:

> On Tue, 14 Jul 2015, Benjamin Herrenschmidt wrote:
> 
> > Maybe we should have a dedicated accessor for "mac_xpram" ...
> 
> ... I can see how to implement XPRAM for matroxfb and imsttfb 

I'll have to retract that. The video mode and color mode settings used by 
the PowerMac framebuffer drivers don't exist in the PRAM portion of NVRAM.

Addresses 0x140F and 0x1410 are found in the partition reserved by Apple 
for "Name Registry properties", according to Designing PCI Cards and 
Drivers for Power Macintosh Computers. There is no equivalent on m68k 
Macs, AFAIK.

This is NVRAM partition 2 on my beige g3, which begins at 0x1400. I'm not 
sure that this is true on New World PowerMacs, and I suspect that the 
framebuffer drivers should be calling pmac_get_partition() to determine 
the offset of the beginning of the Name Registry partition.

The arch_nvram_ops methods don't deal with structures like partitions. 
They treat the entire 8 KiB as unstructured, because that's how /dev/nvram 
treats it.
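
To make the addressing concrete: if the Name Registry partition starts at
0x1400, as on the beige G3 mentioned above, then 0x140F and 0x1410 are offsets
0x0F and 0x10 into that partition. The sketch below is user-space C with
hypothetical names (NVRAM_SIZE, nvram_partition_addr()). It only shows the
base-plus-offset arithmetic a driver would need once it knows the partition
base, and it assumes the 8 KiB total size stated above.

```c
#include <stdint.h>

#define NVRAM_SIZE 0x2000u	/* 8 KiB, as stated above */

/*
 * Hypothetical helper: turn an offset inside a partition into an
 * absolute NVRAM address. Returns -1 if the result falls outside NVRAM.
 */
int32_t nvram_partition_addr(uint32_t partition_base, uint32_t offset)
{
	/* Check the bounds first so the addition below cannot wrap. */
	if (partition_base >= NVRAM_SIZE || offset >= NVRAM_SIZE - partition_base)
		return -1;
	return (int32_t)(partition_base + offset);
}
```

With a base of 0x1400, offsets 0x0F and 0x10 give back the two hard-coded
addresses. A different base, for example one obtained at run time, would
shift both addresses without any other change.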

-- 


Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()

2015-07-15 Thread Stephane Eranian
On Wed, Jul 15, 2015 at 2:35 PM, Peter Zijlstra  wrote:
> On Wed, Jul 15, 2015 at 08:42:50AM +0200, Stephane Eranian wrote:
>> On Fri, Jul 3, 2015 at 9:49 PM, Vince Weaver wrote:
>> > On Fri, 3 Jul 2015, Peter Zijlstra wrote:
>> >
>> >> That said, it's far too warm and I might just not be making sense.
>> >
>> > you need to come visit Maine!  Although I am not sure the cooler weather
>> > necessarily improves my kernel debugging skills.
>> >
>> > I managed to lock the machine (again this is with the patch applied).
>> >
>> I can reproduce the problem on my HSW running the fuzzer.
>>
>> I can see why this could be happening if you are mixing PEBS and non-PEBS
>> events in the bottom 4 counters. I suspect:
>>     for (bit = 0; bit < x86_pmu.max_pebs_events; bit++) {
>>             if ((counts[bit] == 0) && (error[bit] == 0))
>>                     continue;
>>
>> This test is not correct when you have non-PEBS events mixed with PEBS
>> events and they overflow at the same time. They will have counts[i] != 0
>> but error[i] == 0, and thus you fall thru the loop and hit the assert.
>> Or it is something along those lines.
>>
>
> The only way I can make this work is if ->status only has !PEBS events
> set, because if it has both set we'll take that slow path which masks
> out the !PEBS bits.
>
> After masking there are 3 options:
>
>  - there is one bit set, and its @bit, we increment counts[bit].
>  - there are multiple bits set, we increment error[] for each set bit,
>we do not increment counts[].
>  - there are no bits set, we do nothing.
>
> The intent was to never increment counts[] for !PEBS events.
>
> Now if we start out with only a single !PEBS event set, we'll pass the
> test and increment counts[] for a !PEBS and hit the warn.
>
> The below patch modifies the code such that it can deal with that
> particular issue. Can you try?
>
Been running it for a couple of hours, so far so good. I will let it
run all night.
Testing it on HSW and NHM, my SNB is broken at the moment.

Don't know if the fuzzer has already reproduced the test case.
Thanks.
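
The three cases Peter lists above can be modelled in user space. This is a
sketch of the description only, not the kernel patch:
classify_pebs_status(), MAX_EVENTS and the array layout are hypothetical, and
the pebs_enabled mask is simply passed in by the caller.

```c
#include <stdint.h>

#define MAX_EVENTS 4	/* hypothetical stand-in for x86_pmu.max_pebs_events */

/*
 * Classify one record's status word after masking out non-PEBS bits:
 *   - exactly one PEBS bit set: counts[bit]++
 *   - several PEBS bits set (collision): error[i]++ for each set bit
 *   - no PEBS bit set: nothing is counted
 * Returns the number of PEBS bits that were set.
 */
int classify_pebs_status(uint64_t status, uint64_t pebs_enabled,
			 unsigned int counts[MAX_EVENTS],
			 unsigned int error[MAX_EVENTS])
{
	uint64_t pebs_status = status & pebs_enabled;
	int i, nbits = 0, first = -1;

	/* Only the low MAX_EVENTS bits can belong to PEBS counters. */
	pebs_status &= (UINT64_C(1) << MAX_EVENTS) - 1;

	for (i = 0; i < MAX_EVENTS; i++) {
		if (pebs_status & (UINT64_C(1) << i)) {
			if (first < 0)
				first = i;
			nbits++;
		}
	}

	if (nbits == 1) {
		counts[first]++;
	} else if (nbits > 1) {
		for (i = 0; i < MAX_EVENTS; i++)
			if (pebs_status & (UINT64_C(1) << i))
				error[i]++;
	}
	return nbits;
}
```

With pebs_enabled set to 0x3, a status of 0x4 (only a non-PEBS counter
overflowed) leaves both arrays untouched. That is the case which, according
to the explanation above, previously incremented counts[] for a !PEBS event
and hit the warning.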

> ---
>  arch/x86/kernel/cpu/perf_event_intel_ds.c | 29 +
>  1 file changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index 71fc40238843..68d0ced1d229 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -1142,6 +1142,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>
> for (at = base; at < top; at += x86_pmu.pebs_record_size) {
> struct pebs_record_nhm *p = at;
> +   u64 pebs_status;
>
> /* PEBS v3 has accurate status bits */
> if (x86_pmu.intel_cap.pebs_format >= 3) {
> @@ -1152,12 +1153,14 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
> continue;
> }
>
> -   bit = find_first_bit((unsigned long *)&p->status,
> +   pebs_status = p->status & cpuc->pebs_enabled;
> +   pebs_status &= (1ULL << x86_pmu.max_pebs_events) - 1;
> +
> +   bit = find_first_bit((unsigned long *)&pebs_status,
> x86_pmu.max_pebs_events);
> if (bit >= x86_pmu.max_pebs_events)
> continue;
> -   if (!test_bit(bit, cpuc->active_mask))
> -   continue;
> +
> /*
>  * The PEBS hardware does not deal well with the situation
>  * when events happen near to each other and multiple bits
> @@ -1172,27 +1175,21 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>  * one, and it's not possible to reconstruct all events
>  * that caused the PEBS record. It's called collision.
>  * If collision happened, the record will be dropped.
> -*
>  */
> -   if (p->status != (1 << bit)) {
> -   u64 pebs_status;
> -
> -   /* slow path */
> -   pebs_status = p->status & cpuc->pebs_enabled;
> -   pebs_status &= (1ULL << MAX_PEBS_EVENTS) - 1;
> -   if (pebs_status != (1 << bit)) {
> -   for_each_set_bit(i, (unsigned long 
> *)&pebs_status,
> -MAX_PEBS_EVENTS)
> -   error[i]++;
> -   continue;
> -   }
> +   if (p->status != (1ULL << bit)) {
> +   for_each_set_bit(i, (unsigned long *)&pebs_status,
> +x86_pmu.max_pebs_events)
> +   error[i]++;
> +   continue;
> }
> +
> 

BUG: perf error on syscalls for powerpc64.

2015-07-15 Thread Zumeng Chen
Hi All,

Commit 1028ccf5 changed sys_call_table from a pointer to an array of
unsigned long. I think that is not correct, and here is my reason:

sys_call_table, defined as a label in assembler, should be a pointer array
rather than an array as described in 1028ccf5. If we define it as an
array, then arch_syscall_addr returns the address of sys_call_table[],
whereas what arch_syscall_addr actually needs is the content of
sys_call_table[]. As a result, 'perf list' ignores all syscalls, because
find_syscall_meta returns NULL in init_ftrace_syscalls due to the wrong
arch_syscall_addr.

Did I miss something, or has the GCC compiler started doing something new?

Cheers,
Zumeng


Re: [PATCH] ARM: OMAP2: Delete unnecessary checks before three function calls

2015-07-15 Thread Paul Walmsley
Hello Markus

On Tue, 30 Jun 2015, SF Markus Elfring wrote:

> From: Markus Elfring 
> Date: Tue, 30 Jun 2015 14:00:16 +0200
> 
> The functions clk_disable(), of_node_put() and omap_device_delete() test
> whether their argument is NULL and then return immediately.
> Thus the test around the call is not needed.
> 
> This issue was detected by using the Coccinelle software.
> 
> Signed-off-by: Markus Elfring 

Thanks for the patch.  I have to say, I am a bit leery about applying the 
omap_device.c and omap_hwmod.c changes, since the called functions -- 
omap_device_delete() and clk_disable() -- don't explicitly document that 
NULLs are allowed to be passed in.  So there's no explicit contract that 
callers can rely upon, to (at least in theory) prevent those internal NULL 
pointer checks from being removed.

So I would suggest that those two functions' kerneldoc be patched first to 
explicitly state that passing in a NULL pointer is allowed.  Then I would 
feel a bit more comfortable applying the omap_device.c and omap_hwmod.c 
changes.

The kerneldoc for of_node_put() does explicitly allow NULLs to be passed 
in.  So I'll apply that change now for v4.3, touching up the commit 
message accordingly.
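
The kind of contract asked for above can be illustrated with a small
user-space sketch. struct widget and widget_put() are hypothetical and are
not OMAP or clk code. The point is only that the kerneldoc comment states
that NULL is accepted, so callers can rely on it and drop their own check.

```c
#include <stddef.h>

/* Hypothetical reference-counted object, for illustration only. */
struct widget {
	int refcount;
};

/**
 * widget_put - drop a reference to a widget
 * @w: widget to release; may be NULL, in which case this is a no-op
 *
 * Returns the remaining reference count, or 0 if @w is NULL.
 */
int widget_put(struct widget *w)
{
	if (!w)
		return 0;
	return --w->refcount;
}
```

Once "may be NULL" is part of the documented interface, removing the internal
check becomes an interface change rather than a silent cleanup, which is the
protection being asked for.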

regards,

- Paul

> ---
>  arch/arm/mach-omap2/omap_device.c | 3 +--
>  arch/arm/mach-omap2/omap_hwmod.c  | 5 +
>  arch/arm/mach-omap2/timer.c   | 3 +--
>  3 files changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c
> index 4cb8fd9..196366e 100644
> --- a/arch/arm/mach-omap2/omap_device.c
> +++ b/arch/arm/mach-omap2/omap_device.c
> @@ -193,8 +193,7 @@ static int _omap_device_notifier_call(struct notifier_block *nb,
>  
>   switch (event) {
>   case BUS_NOTIFY_DEL_DEVICE:
> - if (pdev->archdata.od)
> - omap_device_delete(pdev->archdata.od);
> + omap_device_delete(pdev->archdata.od);
>   break;
>   case BUS_NOTIFY_ADD_DEVICE:
>   if (pdev->dev.of_node)
> diff --git a/arch/arm/mach-omap2/omap_hwmod.c b/arch/arm/mach-omap2/omap_hwmod.c
> index d78c12e..1091ee7 100644
> --- a/arch/arm/mach-omap2/omap_hwmod.c
> +++ b/arch/arm/mach-omap2/omap_hwmod.c
> @@ -921,10 +921,7 @@ static int _disable_clocks(struct omap_hwmod *oh)
>   int i = 0;
>  
>   pr_debug("omap_hwmod: %s: disabling clocks\n", oh->name);
> -
> - if (oh->_clk)
> - clk_disable(oh->_clk);
> -
> + clk_disable(oh->_clk);
>   p = oh->slave_ports.next;
>  
>   while (i < oh->slaves_cnt) {
> diff --git a/arch/arm/mach-omap2/timer.c b/arch/arm/mach-omap2/timer.c
> index cac46d8..15448221 100644
> --- a/arch/arm/mach-omap2/timer.c
> +++ b/arch/arm/mach-omap2/timer.c
> @@ -208,8 +208,7 @@ static void __init omap_dmtimer_init(void)
>   /* If we are a secure device, remove any secure timer nodes */
>   if ((omap_type() != OMAP2_DEVICE_TYPE_GP)) {
>   np = omap_get_timer_dt(omap_timer_match, "ti,timer-secure");
> - if (np)
> - of_node_put(np);
> + of_node_put(np);
>   }
>  }
>  
> -- 
> 2.4.5
> 


- Paul


Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Eric W. Biederman
Andy Lutomirski  writes:

> On Wed, Jul 15, 2015 at 10:04 PM, Eric W. Biederman
>  wrote:
>> Andy Lutomirski  writes:
>>
>>>
>>> So here's the semantic question:
>>>
>>> Suppose an unprivileged user (uid 1000) creates a user namespace and a
>>> mount namespace.  They stick a file (owned by uid 1000 as seen by
>>> init_user_ns) in there and mark it setuid root and give it fcaps.
>>
>> To make this make sense I have to ask, is this file on a filesystem
>> where uid 1000 as seen by the init_user_ns stored as uid 1000 on
>> the filesystem?  Or is this uid 0 as seen by the filesystem?
>>
>> I assume this is uid 0 on the filesystem in question or else your
>> unprivileged user would not have sufficient privileges over the
>> filesystem to setup fcaps.
>
> I was thinking uid 0 as seen by the filesystem.  But even if it were
> uid 1000, the unprivileged user can still set whatever mode and xattrs
> they want -- they control the backing store.

Yes.   And that is what I was really asking.  Are we talking about a
filesystem where the user controls the backing store?

>>> Then global root gets an fd to this filesystem.  If they execve the
>>> file directly, then, with my patch 4, it won't act as setuid 1000 and
>>> the fcaps will be ignored.  Even with my patch 4, though, if they bind
>>> mount the fs and execve the file from their bind mount, it will act as
>>> setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
>>> fcaps will (correctly) not be honored.
>>
>> With patch 3 you can also think of it as fcaps being honored and you
>> get all the caps in the appropriate user namespace, but since you are
>> not in that user namespace and so don't have a place to store them
>> in struct cred you don't get the file caps.
>>
>> From the philosophy of interpreting the file as defined by the
>> filesystem in principle we could extend struct cred so you actually
>> get the creds just in uid 1000s user namespace, but that is very
>> unlikely to be worth it.
>
> I agree.
>
>>
>>> I tend to think that, if we're not honoring the fcaps, we shouldn't be
>>> honoring the setuid bit either.  After all, it's really not a trusted
>>> file, even though the only user who could have messed with it really
>>> is the apparent owner.
>>
>> For the file caps we can't honor them because you don't have the bits
>> in struct cred.
>>
>> For setuid we can honor it, and setuid is something that the user
>> namespace allows.
>>
>
> We certainly *can* honor it.  But why should we?  I'd be more
> comfortable with this if the contents of an untrusted filesystem were
> really treated as just data.

In these weird bleed-through situations I don't know that we should.
But extending nosuid protections in this way is a bit like yama:
a bit of gratuitous stomping on don't-care cases in the semantics to
make bugs harder to exploit.

>>> And, if we're going to say we don't trust the file and shouldn't honor
>>> setuid or fcaps, then merging all the functionality into mnt_may_suid
>>> could make sense.  Yes, these two things do different things, but they
>>> could hook in to the same place.
>>
>> There are really two separate questions:
>> - Do we trust this filesystem?
>> - Do you have the bits to implement this concept?
>>
>> Even if in this specific context the two questions wind up looking
>> exactly the same. I think it makes a lot of sense to ask the two
>> questions separately.  As future maintenance changes may cause the
>> implementation of the questions to diverge.
>>
>
> Agreed.
>
> Unless someone thinks of an argument to the contrary, I'd say "no, we
> don't trust this filesystem".  I could be convinced otherwise.

But this is context dependent.  From the perspective of the container
we really do want to trust the filesystem.  As the container root set it
up, and if he isn't being hostile likely has a use for setfcaps files
and setuid files and all of the rest.

Perhaps I should phrase it as:
- In this context do we trust the code?   AKA mnt_may_suid?
- What do these bits mean in this context?  (Usually something more complicated).

Which says to me we want both patches 3 and 4 (even if 4 uses s_user_ns)
because 3 is different than 4.

And now I better context switch back to fixing bind mounts.

Eric


Re: linux-next: build failure after merge of the rcu tree

2015-07-15 Thread Stephen Rothwell
Hi Paul,

On Wed, 15 Jul 2015 20:51:38 -0700 "Paul E. McKenney" 
 wrote:
>
> Thank you in both cases!  I suspect that more will follow, so is there
> something I can do to make this easier?  (Hard for me to patch stuff
> that is not yet in the tree...)

No, that is what I am here for.  But it would be good if you remember
this when it comes time for your tree to be merged into tip ...

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


Re: [PATCH v2 5/6] locking/pvqspinlock: Opportunistically defer kicking to unlock time

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 10:18:35PM -0400, Waiman Long wrote:
> On 07/15/2015 06:03 AM, Peter Zijlstra wrote:

> >*groan*, so you complained the previous version of this patch was too
> >complex, but let me say I vastly preferred it to this one :/
> 
> I said it was complex as maintaining a tri-state variable needed more
> thought than 2 bi-state variables. I can revert it back to the tri-state
> variable as doing an unconditional kick in unlock simplifies the code at
> pv_wait_head().

Well, your state space isn't shrunk, you just use more variables and I'm
not entirely sure that actually matters.

What also doesn't help is mixing that with the kicking code.


Re: [PATCH v2 4/6] locking/pvqspinlock: Allow vCPUs kick-ahead

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 10:01:02PM -0400, Waiman Long wrote:
> On 07/15/2015 05:39 AM, Peter Zijlstra wrote:
> >On Tue, Jul 14, 2015 at 10:13:35PM -0400, Waiman Long wrote:
> >>Frequent CPU halting (vmexit) and CPU kicking (vmenter) lengthens
> >>critical section and block forward progress.  This patch implements
> >>a kick-ahead mechanism where the unlocker will kick the queue head
> >>vCPUs as well as up to four additional vCPUs next to the queue head
> >>if they were halted.  The kickings are done after exiting the critical
> >>section to improve parallelism.
> >>
> >>The amount of kick-ahead allowed depends on the number of vCPUs
> >>in the VM guest.  This patch, by itself, won't do much as most of
> >>the kickings are currently done at lock time. Coupled with the next
> >>patch that defers lock time kicking to unlock time, it should improve
> >>overall system performance in a busy overcommitted guest.
> >>
> >>Linux kernel builds were run in KVM guest on an 8-socket, 4
> >>cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
> >>Haswell-EX system. Both systems are configured to have 32 physical
> >>CPUs. The kernel build times before and after the patch were:
> >>
> >>                        Westmere                Haswell
> >>   Patch           32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
> >>   -----           --------    --------    --------    --------
> >>   Before patch     3m25.0s    10m34.1s     2m02.0s    15m35.9s
> >>   After patch      3m27.4s    10m32.0s     2m00.8s    14m52.5s
> >>
> >>There wasn't too much difference before and after the patch.
> >That means either the patch isn't worth it, or as you seem to imply its
> >in the wrong place in this series.
> 
> It needs to be coupled with the next patch to be effective, as most of the
> kicking happens at the lock side instead of at the unlock side. If
> you look at the sample pvqspinlock stats in patch 3:
> 
> lock_kick_count=755354
> unlock_kick_count=87
> 
> The number of unlock kicks is negligible compared with the lock kicks. Patch
> 5 does have a dependency on patch 4 unless we make it unconditionally defer
> kicking to the unlock call, which was what I had done in the v1 patch. The
> reason why I changed this in v2 is that I found a very slight performance
> degradation in doing so.

This way we cannot see the gains of the proposed complexity. So put it
in a place where you can.

> >You also do not offer any support for any of the magic numbers..
> 
> I chose 4 for PV_KICK_AHEAD_MAX as I didn't see much performance difference
> when I did a kick-ahead of 5. Also, it may be too unfair to the vCPU that
> was doing the kicking if the number is too big. Another magic number is
> pv_kick_ahead number. This one is kind of arbitrary. Right now I do a log2,
> but it can be divided by 4 (rshift 2) as well.

So what was the difference between 1-2-3-4 ? I would be thinking one
extra kick is the biggest help, no?
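
For readers following the magic-number question, here is a minimal sketch
of the two candidate formulas mentioned in the quoted reply: a log2 of the
vCPU count, or the count divided by 4, each capped at PV_KICK_AHEAD_MAX.
The helper names and the exact capping are assumptions for illustration;
the patch itself is not quoted in this thread.

```c
#include <assert.h>

#define PV_KICK_AHEAD_MAX 4u	/* cap named in the thread */

/* Integer log2, rounding down (hypothetical stand-in for the kernel's ilog2()). */
static unsigned int ilog2_demo(unsigned int v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

static unsigned int min_u(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Variant 1: kick-ahead grows with log2 of the vCPU count. */
static unsigned int kick_ahead_log2(unsigned int nr_vcpus)
{
	return min_u(ilog2_demo(nr_vcpus), PV_KICK_AHEAD_MAX);
}

/* Variant 2: kick-ahead is the vCPU count divided by 4 (right shift by 2). */
static unsigned int kick_ahead_shift(unsigned int nr_vcpus)
{
	return min_u(nr_vcpus >> 2, PV_KICK_AHEAD_MAX);
}
```

Under these assumptions both variants hit the cap of 4 for the 32 and 48
vCPU guests used in the benchmark, so the choice between them only
matters for smaller guests.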


Re: [PATCH 1/3] ARM: multi_v7_defconfig: Enable max77802 regulator

2015-07-15 Thread Javier Martinez Canillas
Hello Krzysztof,

Thanks for the feedback.

On 07/16/2015 02:45 AM, Krzysztof Kozlowski wrote:
> On 16.07.2015 01:32, Javier Martinez Canillas wrote:
>> The Maxim max77802 Power Management IC has besides other devices, a set of
>> regulators. Commit f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802
>> regulator, rtc and clock drivers") was supposed to enable the config option
>> for the regulator driver as a module but the final version that landed did
>> not include this. So this patch enables the needed Kconfig option.
>>
>> Signed-off-by: Javier Martinez Canillas 
> 
> Please describe why do you want to enable it (IOW who will benefit from
> enabling it?). This symbol was removed by Kukjin from your commit:
>   [kg...@kernel.org: removing useless REGULATOR_MAX77802 config]
> so justification would be welcomed.
>

You are right, sorry for not making the commit message clear. This PMIC
is used by a couple of Exynos5 based boards such as the Peach Pit and Pi
Chromebooks. I expect it to be found in other designs too just like the
max77686 is found in many Exynos5 based boards.

I'll add this to the commit message on v2.
 
> Beside the commit description I agree with the patch.
>

Does this mean I can add your Reviewed-by to this patch as well?

> Best regards,
> Krzysztof
> 

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America


Re: [PATCH v2 1/6] locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL

2015-07-15 Thread Peter Zijlstra
On Wed, Jul 15, 2015 at 08:18:23PM -0400, Waiman Long wrote:
> On 07/15/2015 05:10 AM, Peter Zijlstra wrote:
> > /*
> >+ * A failed cmpxchg doesn't provide any memory-ordering guarantees,
> >+ * so we need a barrier to order the read of the node data in
> >+ * pv_unhash *after* we've read the lock being _Q_SLOW_VAL.
> >+ *
> >+ * Matches the cmpxchg() in pv_wait_head() setting _Q_SLOW_VAL.
> >+ */
> >+smp_rmb();
> 
> According to memory_barriers.txt, cmpxchg() is a full memory barrier. It
> didn't say a failed cmpxchg will lose its memory guarantee. So is the
> documentation right? 

The documentation is not entirely clear on this; but there are hints
that this is so.

> Or is that true for some architectures? I think it is
> not true for x86.

On x86 LOCK CMPXCHG is always a sync point, but yes there are archs for
which a failed cmpxchg does _NOT_ provide any barrier semantics.

The reason I started looking was because Will made Argh64 one of those.
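
The point about a failed cmpxchg can be sketched in userspace with C11
atomics, which let the caller name a separate (weaker) ordering for the
failure case. This is only an analogue, not the kernel's cmpxchg() or
smp_rmb(); lock_byte, node_data and unlock_and_read_node() are
hypothetical names made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>

#define Q_LOCKED_VAL 1
#define Q_SLOW_VAL   3

/* Hypothetical stand-ins; not the kernel's struct qspinlock or pv node. */
static _Atomic unsigned char lock_byte;
static int node_data;

/*
 * Try the fast-path unlock (LOCKED -> 0).  The failure ordering asked
 * for here is only "relaxed", so when the compare-exchange fails an
 * explicit acquire fence is issued before node_data is read.  The slow
 * path in this sketch only reads the data; it does not release the lock.
 */
static int unlock_and_read_node(void)
{
	unsigned char expected = Q_LOCKED_VAL;

	if (atomic_compare_exchange_strong_explicit(&lock_byte, &expected, 0,
						    memory_order_release,
						    memory_order_relaxed))
		return -1;	/* fast path: nobody was waiting */

	/* expected now holds the value that was observed, e.g. Q_SLOW_VAL */
	atomic_thread_fence(memory_order_acquire);
	return node_data;
}
```

The fence plays the role the added smp_rmb() plays in the quoted hunk:
it orders the later read of the node data after the read of the lock
value on architectures where the failed operation gives no ordering.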


Re: [PATCH v5 2/3] pwm: add MediaTek display PWM driver support

2015-07-15 Thread YH Huang
On Wed, 2015-07-15 at 23:59 +0800, YH Huang wrote:
> On Mon, 2015-07-13 at 18:19 +0800, Daniel Kurtz wrote:
> > On Mon, Jul 13, 2015 at 5:04 PM, YH Huang  wrote:
> > > Add display PWM driver support to modify backlight for MT8173 and MT6595.
> > > The PWM has one channel to control the brightness of the display.
> > > When the (high_width / period) is closer to 1, the screen is brighter;
> > > otherwise, it is darker.
> > >
> > > Signed-off-by: YH Huang 
> > > ---
> > > >  drivers/pwm/Kconfig        |  10 ++
> > > >  drivers/pwm/Makefile       |   1 +
> > > >  drivers/pwm/pwm-mtk-disp.c | 256 +
> > > >  3 files changed, 267 insertions(+)
> > >  create mode 100644 drivers/pwm/pwm-mtk-disp.c
> > >
> > > diff --git a/drivers/pwm/Kconfig b/drivers/pwm/Kconfig
> > > index b1541f4..f5b03a4 100644
> > > --- a/drivers/pwm/Kconfig
> > > +++ b/drivers/pwm/Kconfig
> > > @@ -211,6 +211,16 @@ config PWM_LPSS_PLATFORM
> > >   To compile this driver as a module, choose M here: the module
> > >   will be called pwm-lpss-platform.
> > >
> > > +config PWM_MTK_DISP
> > > +   tristate "MediaTek display PWM driver"
> > > +   depends on ARCH_MEDIATEK || COMPILE_TEST
> > > +   help
> > > + Generic PWM framework driver for MediaTek disp-pwm device.
> > > + The PWM is used to control the backlight brightness for display.
> > > +
> > > + To compile this driver as a module, choose M here: the module
> > > + will be called pwm-mtk-disp.
> > > +
> > >  config PWM_MXS
> > > tristate "Freescale MXS PWM support"
> > > depends on ARCH_MXS && OF
> > > diff --git a/drivers/pwm/Makefile b/drivers/pwm/Makefile
> > > index ec50eb5..99c9e75 100644
> > > --- a/drivers/pwm/Makefile
> > > +++ b/drivers/pwm/Makefile
> > > @@ -18,6 +18,7 @@ obj-$(CONFIG_PWM_LPC32XX) += pwm-lpc32xx.o
> > >  obj-$(CONFIG_PWM_LPSS) += pwm-lpss.o
> > >  obj-$(CONFIG_PWM_LPSS_PCI) += pwm-lpss-pci.o
> > >  obj-$(CONFIG_PWM_LPSS_PLATFORM)+= pwm-lpss-platform.o
> > > +obj-$(CONFIG_PWM_MTK_DISP) += pwm-mtk-disp.o
> > >  obj-$(CONFIG_PWM_MXS)  += pwm-mxs.o
> > >  obj-$(CONFIG_PWM_PCA9685)  += pwm-pca9685.o
> > >  obj-$(CONFIG_PWM_PUV3) += pwm-puv3.o
> > > diff --git a/drivers/pwm/pwm-mtk-disp.c b/drivers/pwm/pwm-mtk-disp.c
> > > new file mode 100644
> > > index 000..1f17cee
> > > --- /dev/null
> > > +++ b/drivers/pwm/pwm-mtk-disp.c
> > > @@ -0,0 +1,256 @@
> > > +/*
> > > + * MediaTek display pulse-width-modulation controller driver.
> > > + * Copyright (c) 2015 MediaTek Inc.
> > > + * Author: YH Huang 
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License version 2 as
> > > + * published by the Free Software Foundation.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + */
> > > +
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +#include 
> > > +
> > > > +#define DISP_PWM_EN            0
> > 
> > The "DISP_PWM_*" are register offsets, so use a hex value, like this:
> > 
> > #define DISP_PWM_EN  0x00
> > 
> > Use BIT() for register *fields*, that is, the individual bits of a register.
> > 
> 
> Got it!
> 
> > > > +#define PWM_ENABLE_MASK        BIT(0)
> > > > +
> > > > +#define DISP_PWM_COMMIT        BIT(3)
> > > 
> > > #define DISP_PWM_COMMIT        0x08
> > > 
> > > > +#define PWM_COMMIT_MASK        BIT(0)
> > > > +
> > > > +#define DISP_PWM_CON_0         BIT(4)
> > > 
> > > #define DISP_PWM_CON_0         0x10
> > > 
> > > > +#define PWM_CLKDIV_SHIFT       16
> > > > +#define PWM_CLKDIV_MAX         0x3ff
> > > > +#define PWM_CLKDIV_MASK        (PWM_CLKDIV_MAX << PWM_CLKDIV_SHIFT)
> > > > +
> > > > +#define DISP_PWM_CON_1         0x14
> > > > +#define PWM_PERIOD_MASK        0xfff
> > > > +/* Shift log2(PWM_PERIOD_MASK + 1) as divisor */
> > > > +#define PWM_PERIOD_BIT_SHIFT   12
> > > > +
> > > > +#define PWM_HIGH_WIDTH_SHIFT   16
> > > > +#define PWM_HIGH_WIDTH_MASK    (0x1fff << PWM_HIGH_WIDTH_SHIFT)
> > > +
> > > +struct mtk_disp_pwm {
> > > +   struct pwm_chip chip;
> > > +   struct device *dev;
> > 
> > I don't think "dev" is actually used.  And, if needed, it can be
> > extracted from "chip".
> > 
> 
> I will drop it.
> 
> > > +   struct clk *clk_main;
> > > +   struct clk *clk_mm;
> > > +   void __iomem *base;
> > > +};
> > > +
> > > +static inline struct mtk_disp_pwm *to_mtk_disp_pwm(struct pwm_chip *chip)
> > > +{
> > > +   return container_of(chip, struct mtk_disp_pwm, chip);
> > > +}
> > > +
> > > +static void mtk_disp_pwm_update_bits(void
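
The review comment in this message (hex values for register offsets,
BIT() for fields inside a register) can be sketched as follows. The
offsets are the ones suggested in the review; BIT() is redefined locally
and update_bits() with its fake register array is a hypothetical helper,
not the driver's mtk_disp_pwm_update_bits().

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))	/* local stand-in for the kernel's BIT() */

/* Register offsets: byte addresses relative to the block base, in hex. */
#define DISP_PWM_EN		0x00
#define DISP_PWM_COMMIT		0x08
#define DISP_PWM_CON_0		0x10

/* Register fields: individual bits inside one register. */
#define PWM_ENABLE_MASK		BIT(0)
#define PWM_COMMIT_MASK		BIT(0)

/* Read-modify-write of one field in a fake, memory-backed register file. */
static void update_bits(uint32_t *regs, uint32_t offset,
			uint32_t mask, uint32_t data)
{
	uint32_t *reg = &regs[offset / sizeof(uint32_t)];

	*reg = (*reg & ~mask) | (data & mask);
}
```

BIT(3) and 0x08 happen to be the same number, which is why the original
defines still worked; writing offsets in hex just makes it clear that
they are addresses rather than bit positions.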

linux-next: build failure after merge of the akpm-current tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

After merging the akpm-current tree, today's linux-next build (powerpc
ppc64_defconfig) failed like this:

ERROR: ".smpboot_register_percpu_thread_cpumask" [drivers/infiniband/hw/ehca/ib_ehca.ko] undefined!

Caused by commit

  2b07b4da35a9 ("smpboot: allow passing the cpumask on per-cpu thread registration")

I have added the following build fix for today:

From: Stephen Rothwell 
Date: Thu, 16 Jul 2015 15:30:05 +1000
Subject: [PATCH] smpboot: fix for allow passing the cpumask on per-cpu thread registration

Signed-off-by: Stephen Rothwell 
---
 kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index d99a41d25b0c..a818cbc73e14 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -308,7 +308,7 @@ out:
 	put_online_cpus();
 	return ret;
 }
-EXPORT_SYMBOL_GPL(smpboot_register_percpu_thread);
+EXPORT_SYMBOL_GPL(smpboot_register_percpu_thread_cpumask);
 
 /**
  * smpboot_unregister_percpu_thread - Unregister a per_cpu thread related to hotplug
-- 
2.1.4

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


Re: [PATCH] staging: rtl8188eu: core: find and remove code valid only for 5 HGz.

2015-07-15 Thread Sudip Mukherjee
On Wed, Jul 15, 2015 at 10:04:08PM -0400, Sreenath Madasu wrote:
> This one of the TODO tasks for staging rtl8188eu driver. I have removed
> the code referring to channel > 14 for rtw_ap.c, rtw_ieee80211.c and
> rtw_mlme.c files. Please review.
Your patch will give a new build warning:
warning: unused variable ‘pcur_network’ [-Wunused-variable]

regards
sudip


Re: [PATCH 2/9] ARM: multi_v7_defconfig: Enable max77802 regulator, rtc and clock drivers

2015-07-15 Thread Javier Martinez Canillas
Hello Krzysztof,

On 07/16/2015 02:42 AM, Krzysztof Kozlowski wrote:
> On 16.07.2015 00:38, Javier Martinez Canillas wrote:
>> Hello,
>>
>> On Thu, May 14, 2015 at 5:40 PM, Javier Martinez Canillas
>>  wrote:
>>> The Maxim max77802 Power Management IC is used on many Exynos machines.
>>> Besides a bunch of regulators, this chip has a Real-Time-Clock (RTC)
>>> and 2-channel 32kHz clock outputs.
>>>
>>> Enable the kernel config options to have the drivers for these devices
>>> built as a module.
>>>
>>> Signed-off-by: Javier Martinez Canillas 
>>> ---
>>>  arch/arm/configs/multi_v7_defconfig | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/arch/arm/configs/multi_v7_defconfig 
>>> b/arch/arm/configs/multi_v7_defconfig
>>> index 2349584b6e08..080120fe5580 100644
>>> --- a/arch/arm/configs/multi_v7_defconfig
>>> +++ b/arch/arm/configs/multi_v7_defconfig
>>> @@ -373,6 +373,7 @@ CONFIG_POWER_RESET_SYSCON=y
>>>  CONFIG_REGULATOR_MAX8907=y
>>>  CONFIG_REGULATOR_MAX8973=y
>>>  CONFIG_REGULATOR_MAX77686=y
>>> +CONFIG_REGULATOR_MAX77802=m
>>
>> I noticed that the version that landed in 4.2-rc1 as commit
>> f3caa529c6f5 ("ARM: multi_v7_defconfig: Enable max77802 regulator, rtc
>> and clock drivers") doesn't include this symbol. I guess it was caused
>> by a wrong resolved conflict? I'll post a patch to enable the
>> regulator again.
> 
> As you can see in mentioned mainline commit Kukjin removed it manually:
> [kg...@kernel.org: removing useless REGULATOR_MAX77802 config]
>

Oh, I missed that in the commit message. I thought it was a merge / conflict
error, not something done on purpose.
 
> I wonder why?
>

Me too.

> Best regards,
> Krzysztof
> --

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America


Re: [PATCH 1/1] mem-hotplug: Handle node hole when initializing numa_meminfo.

2015-07-15 Thread Tang Chen


On 07/16/2015 05:20 AM, Tejun Heo wrote:
> On Wed, Jul 01, 2015 at 11:16:54AM +0800, Tang Chen wrote:
> ...
>> -   /* and there's no empty block */
>> -   if (bi->start >= bi->end)
>> +   /* and there's no empty or non-exist block */
>> +   if (bi->start >= bi->end ||
>> +       memblock_overlaps_region(&memblock.memory,
>> +               bi->start, bi->end - bi->start) == -1)
>
> Ugh can you please change memblock_overlaps_region() to return
> bool instead?

Well, I think memblock_overlaps_region() is designed to return
the index of the region overlapping with the given region.
Maybe it had some users before.

Of course for now, it is only called by memblock_is_region_reserved().

It is OK to change the return value of memblock_overlaps_region() to bool.
But any caller of memblock_is_region_reserved() should also be changed.

I think it is OK to leave it there.
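
The two shapes under discussion (an index-or-minus-one result versus a
plain bool) can be sketched like this. It is a simplified userspace
model with a hypothetical struct region and helper names, not the
memblock implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region type: half-open interval [base, base + size). */
struct region {
	unsigned long base;
	unsigned long size;
};

/* Index-returning form: first overlapping region, or -1 if there is none. */
static long overlaps_index(const struct region *r, size_t n,
			   unsigned long base, unsigned long size)
{
	unsigned long end = base + size;

	for (size_t i = 0; i < n; i++) {
		unsigned long rend = r[i].base + r[i].size;

		if (r[i].base < end && rend > base)
			return (long)i;
	}
	return -1;
}

/* Bool form: callers no longer compare against the -1 sentinel. */
static bool overlaps(const struct region *r, size_t n,
		     unsigned long base, unsigned long size)
{
	return overlaps_index(r, n, base, size) >= 0;
}
```

With the bool form, the check in the quoted hunk would read as a plain
negation rather than a comparison with -1, at the cost of touching any
caller that still relies on the index.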

Thanks.

> Thanks.



linux-next: build warning after merge of the akpm-current tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

After merging the akpm-current tree, today's linux-next build (arm
multi_v7_defconfig) produced this warning:

lib/genalloc.c: In function 'gen_pool_get':
/scratch/sfr/next/lib/genalloc.c:599:6: warning: passing argument 4 of 'devres_find' discards 'const' qualifier from pointer target type
  p = devres_find(dev, devm_gen_pool_release, devm_gen_pool_match, name);
      ^
In file included from /scratch/sfr/next/include/linux/node.h:17:0,
                 from /scratch/sfr/next/include/linux/cpu.h:16,
                 from /scratch/sfr/next/include/linux/of_device.h:4,
                 from /scratch/sfr/next/lib/genalloc.c:37:
/scratch/sfr/next/include/linux/device.h:620:14: note: expected 'void *' but argument is of type 'const char *'
 extern void *devres_find(struct device *dev, dr_release_t release,
              ^

Caused by commit

  e89a70fd54f2 ("genalloc: add support of multiple gen_pools per device")

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


linux-next: build failure after merge of the akpm-current tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

After merging the akpm-current tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

arch/arm/kernel/entry-common.S: Assembler messages:
arch/arm/kernel/entry-common.S:108: Error: __NR_syscalls is not equal to the size of the syscall table

Caused by commit

  d221fc1f0f25 ("mm: mlock: add new mlock, munlock, and munlockall system calls")

I have added the following fix patch for today:

From: Stephen Rothwell 
Date: Thu, 16 Jul 2015 14:58:53 +1000
Subject: [PATCH] mm: mlock: fix for add new mlock, munlock, and munlockall system calls

Signed-off-by: Stephen Rothwell 
---
 arch/arm/include/asm/unistd.h | 2 +-
 arch/arm/kernel/calls.S   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/unistd.h b/arch/arm/include/asm/unistd.h
index 32640c431a08..2516c09d65d7 100644
--- a/arch/arm/include/asm/unistd.h
+++ b/arch/arm/include/asm/unistd.h
@@ -19,7 +19,7 @@
  * This may need to be greater than __NR_last_syscall+1 in order to
  * account for the padding in the syscall table
  */
-#define __NR_syscalls  (388)
+#define __NR_syscalls  (392)
 
 /*
  * *NOTE*: This is a ghost syscall private to the kernel.  Only the
diff --git a/arch/arm/kernel/calls.S b/arch/arm/kernel/calls.S
index 514e77b26414..88808221383b 100644
--- a/arch/arm/kernel/calls.S
+++ b/arch/arm/kernel/calls.S
@@ -399,7 +399,7 @@
 		CALL(sys_execveat)
 		CALL(sys_mlock2)
 		CALL(sys_munlock2)
-/* 400 */	CALL(sys_munlockall2)
+/* 390 */	CALL(sys_munlockall2)
 #ifndef syscalls_counted
 .equ syscalls_padding, ((NR_syscalls + 3) & ~3) - NR_syscalls
 #define syscalls_counted
-- 
2.1.4
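
The arithmetic behind the new __NR_syscalls value can be checked with a
small sketch of the syscalls_padding expression from the hunk above.
It assumes, from the corrected /* 390 */ comment, that sys_munlockall2
is entry 390 and the table therefore has 391 entries; the function names
are made up for the example.

```c
#include <assert.h>

/* Round a syscall count up to the next multiple of four. */
static unsigned int padded_syscalls(unsigned int nr_syscalls)
{
	return (nr_syscalls + 3u) & ~3u;
}

/* Number of padding slots, mirroring the syscalls_padding expression. */
static unsigned int syscalls_padding(unsigned int nr_syscalls)
{
	return padded_syscalls(nr_syscalls) - nr_syscalls;
}
```

With 391 entries the padded size is 392, which matches the value the
fix patch puts into __NR_syscalls; the old value of 388 was already a
multiple of four and needed no padding.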

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


Re: [PATCH v3 6/6] ARM: PRM: AM437x: Enable IO wakeup feature

2015-07-15 Thread Keerthy



On Thursday 16 July 2015 10:44 AM, Paul Walmsley wrote:
> Hi
>
> On Tue, 14 Jul 2015, Keerthy wrote:
>
>> Enable IO wakeup feature.
>>
>> Signed-off-by: Keerthy
>
> Per my comments on one of the previous patches, please add a short
> description in the commit message for what enabling I/O wakeup will do for
> a user.

Okay will do that.

> - Paul



[PATCH] asm-generic: {get,put}_user ptr argument evaluate only 1 time

2015-07-15 Thread Yoshinori Sato
The current implementation evaluates the ptr argument twice.
This gives unexpected results when the argument has side effects.

Signed-off-by: Yoshinori Sato 
---
 include/asm-generic/uaccess.h | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/asm-generic/uaccess.h b/include/asm-generic/uaccess.h
index 72d8803..1b813fb 100644
--- a/include/asm-generic/uaccess.h
+++ b/include/asm-generic/uaccess.h
@@ -163,9 +163,10 @@ static inline __must_check long __copy_to_user(void __user *to,
 
 #define put_user(x, ptr)                                       \
 ({                                                             \
+       __typeof__((ptr)) __p = (ptr);                          \
        might_fault();                                          \
-       access_ok(VERIFY_WRITE, ptr, sizeof(*ptr)) ?            \
-               __put_user(x, ptr) :                            \
+       access_ok(VERIFY_WRITE, __p, sizeof(*__p)) ?            \
+               __put_user(x, __p) :                            \
                -EFAULT;                                        \
 })
 
@@ -225,9 +226,10 @@ extern int __put_user_bad(void) __attribute__((noreturn));
 
 #define get_user(x, ptr)                                       \
 ({                                                             \
+       __typeof__((ptr)) __p = (ptr);                          \
        might_fault();                                          \
-       access_ok(VERIFY_READ, ptr, sizeof(*ptr)) ?             \
-               __get_user(x, ptr) :                            \
+       access_ok(VERIFY_READ, __p, sizeof(*__p)) ?             \
+               __get_user(x, __p) :                            \
                -EFAULT;                                        \
 })
 
 
-- 
2.1.4
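
The double evaluation this patch removes can be reproduced in userspace
with a cut-down pair of macros. This is a sketch only: access_ok_demo()
and the put_demo_*() macros are hypothetical stand-ins for access_ok()
and put_user(), and the fixed variant relies on the same GNU C
extensions as the patch (statement expressions and __typeof__).

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_EFAULT 14

/* Hypothetical stand-in for access_ok(): any non-NULL pointer is fine. */
#define access_ok_demo(ptr, size)	((ptr) != NULL && (size) > 0)

/*
 * Old shape: ptr is expanded once in the check and once in the store,
 * so an argument such as p++ is evaluated twice.  (sizeof does not
 * evaluate its operand, so it does not count.)
 */
#define put_demo_twice(x, ptr)						\
	(access_ok_demo(ptr, sizeof(*(ptr))) ?				\
		(*(ptr) = (x), 0) : -DEMO_EFAULT)

/* New shape: ptr is evaluated exactly once into a local copy. */
#define put_demo_once(x, ptr)						\
({									\
	__typeof__(ptr) __p = (ptr);					\
	access_ok_demo(__p, sizeof(*__p)) ?				\
		(*__p = (x), 0) : -DEMO_EFAULT;				\
})
```

Calling put_demo_twice(7, p++) advances p by two elements and stores
into the second one, while put_demo_once(7, p++) stores into the first
element and advances p by one, which is what a caller would expect.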



Re: [RFC PATCH v2] memory-barriers: remove smp_mb__after_unlock_lock()

2015-07-15 Thread Benjamin Herrenschmidt
On Thu, 2015-07-16 at 15:03 +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2015-07-16 at 12:00 +1000, Michael Ellerman wrote:
> > That would fix the problem with smp_mb__after_unlock_lock(), but not
> > the original worry we had about loads happening before the SC in lock.
> 
> However I think isync fixes *that* :-) The problem with isync is as you
> said, it's not a -memory- barrier per-se, it's an execution barrier /
> context synchronizing instruction. The combination stwcx. + bne + isync
> however prevents the execution of anything past the isync until the
> stwcx has completed and the bne has been "decided", which prevents loads
> from leaking into the LL/SC loop. It will also prevent a store in the
> lock from being issued before the stwcx. has completed. It does *not*
> prevent as far as I can tell another unrelated store before the lock
> from leaking into the lock, including the one used to unlock a different
> lock.

Except that the architecture says:

<<
Because a Store Conditional instruction may complete before its store
has been performed, a conditional Branch instruction that depends on
the CR0 value set by a Store Conditional instruction does not order the
Store Conditional's store with respect to storage accesses caused by
instructions that follow the Branch
>>

So isync in lock is architecturally incorrect, despite being what the
architecture recommends using, yay !

Ben.




Re: [PATCH v2 1/6] ARM: OMAP4: PRM: Remove hardcoding of PRM_IO_PMCTRL_OFFSET register

2015-07-15 Thread Keerthy

Paul,

Thanks for the review!

On Thursday 16 July 2015 07:24 AM, Paul Walmsley wrote:
> Hi
>
> a few minor comments
>
> On Wed, 8 Jul 2015, Keerthy wrote:
>
>> PRM_IO_PMCTRL_OFFSET need not be same for all SOCs hence
>> remove hardcoding and use the value provided by the omap_prcm_irq_setup
>> structure.
>
> Please mention here that the reason why you're making this change is to
> support AM437x.

Sure. I will do that.

>> Signed-off-by: Keerthy
>> ---
>>  arch/arm/mach-omap2/prcm-common.h |  1 +
>>  arch/arm/mach-omap2/prm44xx.c     | 11 ++++++-----
>>  2 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/mach-omap2/prcm-common.h b/arch/arm/mach-omap2/prcm-common.h
>> index 6ae0b3a..2e60406 100644
>> --- a/arch/arm/mach-omap2/prcm-common.h
>> +++ b/arch/arm/mach-omap2/prcm-common.h
>> @@ -494,6 +494,7 @@ struct omap_prcm_irq {
>>  struct omap_prcm_irq_setup {
>>         u16 ack;
>>         u16 mask;
>> +       u16 pm_ctrl;
>
> Please add a kerneldoc structure documentation line for this new field, to
> match the existing documentation here.

Okay.

>>         u8 nr_regs;
>>         u8 nr_irqs;
>>         const struct omap_prcm_irq *irqs;
>> diff --git a/arch/arm/mach-omap2/prm44xx.c b/arch/arm/mach-omap2/prm44xx.c
>> index 4541700..8149e5a 100644
>> --- a/arch/arm/mach-omap2/prm44xx.c
>> +++ b/arch/arm/mach-omap2/prm44xx.c
>> @@ -45,6 +45,7 @@ static const struct omap_prcm_irq omap4_prcm_irqs[] = {
>>  static struct omap_prcm_irq_setup omap4_prcm_irq_setup = {
>>         .ack            = OMAP4_PRM_IRQSTATUS_MPU_OFFSET,
>>         .mask           = OMAP4_PRM_IRQENABLE_MPU_OFFSET,
>> +       .pm_ctrl        = OMAP4_PRM_IO_PMCTRL_OFFSET,
>>         .nr_regs        = 2,
>>         .irqs           = omap4_prcm_irqs,
>>         .nr_irqs        = ARRAY_SIZE(omap4_prcm_irqs),
>> @@ -306,10 +307,10 @@ static void omap44xx_prm_reconfigure_io_chain(void)
>>         omap4_prm_rmw_inst_reg_bits(OMAP4430_WUCLK_CTRL_MASK,
>>                                     OMAP4430_WUCLK_CTRL_MASK,
>>                                     inst,
>> -                                   OMAP4_PRM_IO_PMCTRL_OFFSET);
>> +                                   omap4_prcm_irq_setup.pm_ctrl);
>>         omap_test_timeout(
>>                 (((omap4_prm_read_inst_reg(inst,
>> -                                          OMAP4_PRM_IO_PMCTRL_OFFSET) &
>> +                                          omap4_prcm_irq_setup.pm_ctrl) &
>>                    OMAP4430_WUCLK_STATUS_MASK) >>
>>                   OMAP4430_WUCLK_STATUS_SHIFT) == 1),
>>                 MAX_IOPAD_LATCH_TIME, i);
>> @@ -319,10 +320,10 @@ static void omap44xx_prm_reconfigure_io_chain(void)
>>         /* Trigger WUCLKIN disable */
>>         omap4_prm_rmw_inst_reg_bits(OMAP4430_WUCLK_CTRL_MASK, 0x0,
>>                                     inst,
>> -                                   OMAP4_PRM_IO_PMCTRL_OFFSET);
>> +                                   omap4_prcm_irq_setup.pm_ctrl);
>>         omap_test_timeout(
>>                 (((omap4_prm_read_inst_reg(inst,
>> -                                          OMAP4_PRM_IO_PMCTRL_OFFSET) &
>> +                                          omap4_prcm_irq_setup.pm_ctrl) &
>>                    OMAP4430_WUCLK_STATUS_MASK) >>
>>                   OMAP4430_WUCLK_STATUS_SHIFT) == 0),
>>                 MAX_IOPAD_LATCH_TIME, i);
>> @@ -350,7 +351,7 @@ static void __init omap44xx_prm_enable_io_wakeup(void)
>>         omap4_prm_rmw_inst_reg_bits(OMAP4430_GLOBAL_WUEN_MASK,
>>                                     OMAP4430_GLOBAL_WUEN_MASK,
>>                                     inst,
>> -                                   OMAP4_PRM_IO_PMCTRL_OFFSET);
>> +                                   omap4_prcm_irq_setup.pm_ctrl);
>>  }
>>
>>  /**
>> --
>> 1.9.1
>
> - Paul




Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets

2015-07-15 Thread Keerthy



On Thursday 16 July 2015 08:08 AM, Paul Walmsley wrote:
> On Thu, 16 Jul 2015, Paul Walmsley wrote:
>
>> On Wed, 8 Jul 2015, Keerthy wrote:
>>
>>> Add the PRM IRQ register offsets.
>>>
>>> Signed-off-by: Keerthy
>>
>> Please add more detail to your commit messages so they conform to
>> Documentation/SubmittingPatches:
>>
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109
>>
>> For example, this commit message should read something like:
>>
>> ---
>>
>> ARM: AM43xx: Add the PRM IRQ register offsets
>>
>> Add the PRM IRQ register offsets.  This is needed to support PRM I/O
>> wakeup on AM43xx.
>>
>> --
>>
>> Basically, your patches need to provide context as to _why_ the change is
>> needed.
>>
>> I've fixed the message for this patch, and queued it for v4.3, but
>> please take care with this issue in the future.
>
> Also I've moved the AM43XX_PRM_IO_PMCTRL_OFFSET macro out of the AM43XX CM
> section, since it doesn't belong there.

Thanks Paul!

> - Paul




Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Andy Lutomirski
On Wed, Jul 15, 2015 at 10:04 PM, Eric W. Biederman
 wrote:
> Andy Lutomirski  writes:
>
>> On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman
>>  wrote:
>>>
>>> Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
>>> this series.  Something I had not done before since it sounded totally
>>> wrong.
>>>
>>> That combined with your earlier comments I think I can say something
>>> meaningful.
>>>
>>> Andy as I read your patch the threat you are primarily worried about is
>>> chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
>>> deal with that case is reasonable, and is unlikely to break userspace.
>>> It is one of those hairy security things so we need to be careful not to
>>> introduce a regression.
>>>
>>
>> Indeed.  It's plausible this could regress something, but it would be
>> really weird.
>>
>>> I think a top down enhancement of nosuid to just block funny cases that
>>> no one cares about is completely sensible.  Removing goofy corner cases
>>> that no one cares about and that are only good for security exploits
>>> seems reasonable.
>>>
>>
>> Agreed.
>>
>>> I am a little concerned that smack does not seem to respect nosuid
>>> on filesystems.  But that is an issue with nosuid not with your enhanced
>>> nosuid.
>>>
>>>
>>>
>>>
>>> Now this patch 3/7 really should be entitled:
>>> "Limit file caps to the userns of the super block".
>>>
>>> It really really is doing something different.   This change is about a
>>> bottom up understanding of what file caps means on a filesystem mounted
>>> by a user namespace root.
>>>
>>> That is file caps should only apply to the user namespace root of the
>>> root user who mounted the filesystem, because that is all the privileges
>>> the mounter of the filesystem had.
>>>
>>> This guarantees that even if the filesystem somehow propagates with
>>> mount propagation that there will be no issues.  I think I know how to
>>> make that happen...
>>>
>>>
>>>
>>>
>>> But deeply and fundamentally limiting a filesystem to only the
>>> privileges of its user namespace root, and enhancing nosuid
>>> protections are rather different things.
>>>
>>
>> So here's the semantic question:
>>
>> Suppose an unprivileged user (uid 1000) creates a user namespace and a
>> mount namespace.  They stick a file (owned by uid 1000 as seen by
>> init_user_ns) in there and mark it setuid root and give it fcaps.
>
> To make this make sense I have to ask, is this file on a filesystem
> where uid 1000 as seen by the init_user_ns stored as uid 1000 on
> the filesystem?  Or is this uid 0 as seen by the filesystem?
>
> I assume this is uid 0 on the filesystem in question or else your
> unprivileged user would not have sufficient privileges over the
> filesystem to setup fcaps.

I was thinking uid 0 as seen by the filesystem.  But even if it were
uid 1000, the unprivileged user can still set whatever mode and xattrs
they want -- they control the backing store.

>
>> Then global root gets an fd to this filesystem.  If they execve the
>> file directly, then, with my patch 4, it won't act as setuid 1000 and
>> the fcaps will be ignored.  Even with my patch 4, though, if they bind
>> mount the fs and execve the file from their bind mount, it will act as
>> setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
>> fcaps will (correctly) not be honored.
>
> With patch 3 you can also think of it as fcaps being honored and you
> get all the caps in the appropriate user namespace, but since you are
> not in that user namespace and so don't have a place to store them
> in struct cred you don't get the file caps.
>
> From the philosophy of interpreting the file as defined by the
> filesystem in principle we could extend struct cred so you actually
> get the creds just in uid 1000s user namespace, but that is very
> unlikely to be worth it.

I agree.

>
>> I tend to think that, if we're not honoring the fcaps, we shouldn't be
>> honoring the setuid bit either.  After all, it's really not a trusted
>> file, even though the only user who could have messed with it really
>> is the apparent owner.
>
> For the file caps we can't honor them because you don't have the bits
> in struct cred.
>
> For setuid we can honor it, and setuid is something that the user
> namespace allows.
>

We certainly *can* honor it.  But why should we?  I'd be more
comfortable with this if the contents of an untrusted filesystem were
really treated as just data.

>> And, if we're going to say we don't trust the file and shouldn't honor
>> setuid or fcaps, then merging all the functionality into mnt_may_suid
>> could make sense.  Yes, these two things do different things, but they
>> could hook in to the same place.
>
> There are really two separate questions:
> - Do we trust this filesystem?
> - Do you have the bits to implement this concept?
>
> Even if in this specific context the two questions wind up looking
> exactly the same, I think it makes a lot of sense to ask the two
> questions sepa

Re: [PATCH v3 6/6] ARM: PRM: AM437x: Enable IO wakeup feature

2015-07-15 Thread Paul Walmsley
Hi

On Tue, 14 Jul 2015, Keerthy wrote:

> Enable IO wakeup feature.
> 
> Signed-off-by: Keerthy 

Per my comments on one of the previous patches, please add a short 
description in the commit message for what enabling I/O wakeup will do for 
a user.

- Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v4 0/6] powernv: cpufreq: Report frequency throttle by OCC

2015-07-15 Thread Viresh Kumar
On 13-07-15, 19:39, Shilpasri G Bhat wrote:
> This patchset intends to add frequency throttle reporting mechanism
> to powernv-cpufreq driver when OCC throttles the frequency. OCC is an
> On-Chip-Controller which takes care of the power and thermal safety of
> the chip. The CPU frequency can be throttled during an OCC reset or
> when OCC tries to limit the max allowed frequency. The patchset will
> report such conditions so as to keep the user informed about reason
> for the drop in performance of workloads when frequency is throttled.
> 
> Changes from v3:
> - Rebased on top of 4.2-rc1
> - Minor changes in patch 2,3,4,6 this does not change the
>   functionality of the code
> - 594fcb9ec9e powerpc/powernv: Expose OPAL APIs required by PRD
>   interface , this patch fixes the build error due to which this
>   series was initially dropped
>   ERROR: ".opal_message_notifier_register"
>   drivers/cpufreq/powernv-cpufreq.ko] undefined!

I have already Acked v3 of this and that applies to this one as well.

-- 
viresh


Re: cpufreq/ondemand: unpinning an unpinned lock.

2015-07-15 Thread Viresh Kumar
On 16-07-15, 02:13, Rafael J. Wysocki wrote:
> Cc: Viresh as he's been working on governors recently.
> 
> On Wednesday, July 15, 2015 06:04:22 PM Dave Jones wrote:
> > WARNING: CPU: 1 PID: 29529 at kernel/locking/lockdep.c:3497 
> > lock_unpin_lock+0x109/0x110()
> > unpinning an unpinned lock
> > CPU: 1 PID: 29529 Comm: kworker/1:1 Not tainted 4.2.0-rc2-think+ #3
> > Workqueue: events od_dbs_timer
> >  0009 880094d5baa8 ae7f5e6f 0007
> >  880094d5baf8 880094d5bae8 ae07b91a 0118
> >  00e0 880507bd5c58 0092 0004
> > Call Trace:
> >  [] dump_stack+0x4f/0x7b
> >  [] warn_slowpath_common+0x8a/0xc0
> >  [] warn_slowpath_fmt+0x46/0x50
> >  [] lock_unpin_lock+0x109/0x110
> >  [] __schedule+0x3ac/0xb60
> >  [] schedule+0x41/0x90
> >  [] schedule_preempt_disabled+0x18/0x30
> >  [] mutex_lock_nested+0x16f/0x3e0
> >  [] ? gov_queue_work+0x2f/0xf0
> >  [] ? od_check_cpu+0x57/0xd0
> >  [] ? gov_queue_work+0x2f/0xf0
> >  [] gov_queue_work+0x2f/0xf0
> >  [] od_dbs_timer+0xbd/0x150
> >  [] process_one_work+0x1f3/0x7a0
> >  [] ? process_one_work+0x162/0x7a0
> >  [] ? worker_thread+0xf9/0x470
> >  [] worker_thread+0x69/0x470
> >  [] ? preempt_count_sub+0xa3/0xf0
> >  [] ? process_one_work+0x7a0/0x7a0
> >  [] kthread+0x11f/0x140
> >  [] ? kthread_create_on_node+0x250/0x250
> >  [] ret_from_fork+0x3f/0x70
> >  [] ? kthread_create_on_node+0x250/0x250
> > ---[ end trace 86cca931caec9193 ]---

I don't know why this would happen. Just to confirm, you are getting
this on 4.2-rc(1 or 2)? And you weren't getting these on 4.1 at all?
And is it always reproducible? How?

There have been races in the cpufreq core for some time, and what got
pushed in 4.2-rc1 is just half of the fix. The other half is present
here:

http://marc.info/?i=cover.1434713657.git.viresh.kumar%40linaro.org

Please try this and let us know if things work well or not.

-- 
viresh


Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Eric W. Biederman
Andy Lutomirski  writes:

> On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman
>  wrote:
>>
>> Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
>> this series.  Something I had not done before since it sounded totally
>> wrong.
>>
>> That combined with your earlier comments I think I can say something
>> meaningful.
>>
>> Andy as I read your patch the threat you are primarily worried about is
>> chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
>> deal with that case is reasonable, and is unlikely to break userspace.
>> It is one of those hairy security things so we need to be careful not to
>> introduce a regression.
>>
>
> Indeed.  It's plausible this could regress something, but it would be
> really weird.
>
>> I think a top down enhancement of nosuid to just block funny cases that
>> no one cares about is completely sensible.  Removing goofy corner cases
>> that no one cares about and that are only good for security exploits
>> seems reasonable.
>>
>
> Agreed.
>
>> I am a little concerned that smack does not seem to respect nosuid
>> on filesystems.  But that is an issue with nosuid not with your enhanced
>> nosuid.
>>
>>
>>
>>
>> Now this patch 3/7 really should be entitled:
>> "Limit file caps to the userns of the super block".
>>
>> It really really is doing something different.   This change is about a
>> bottom up understanding of what file caps means on a filesystem mounted
>> by a user namespace root.
>>
>> That is file caps should only apply to the user namespace root of the
>> root user who mounted the filesystem, because that is all the privileges
>> the mounter of the filesystem had.
>>
>> This guarantees that even if the filesystem somehow propagates with
>> mount propagation that there will be no issues.  I think I know how to
>> make that happen...
>>
>>
>>
>>
>> But deeply and fundamentally limiting a filesystem to only the
>> privileges of its user namespace root, and enhancing nosuid
>> protections are rather different things.
>>
>
> So here's the semantic question:
>
> Suppose an unprivileged user (uid 1000) creates a user namespace and a
> mount namespace.  They stick a file (owned by uid 1000 as seen by
> init_user_ns) in there and mark it setuid root and give it fcaps.

To make this make sense I have to ask, is this file on a filesystem
where uid 1000 as seen by the init_user_ns stored as uid 1000 on
the filesystem?  Or is this uid 0 as seen by the filesystem?

I assume this is uid 0 on the filesystem in question or else your
unprivileged user would not have sufficient privileges over the
filesystem to setup fcaps.

> Then global root gets an fd to this filesystem.  If they execve the
> file directly, then, with my patch 4, it won't act as setuid 1000 and
> the fcaps will be ignored.  Even with my patch 4, though, if they bind
> mount the fs and execve the file from their bind mount, it will act as
> setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
> fcaps will (correctly) not be honored.

With patch 3 you can also think of it as fcaps being honored and you
get all the caps in the appropriate user namespace, but since you are
not in that user namespace and so don't have a place to store them
in struct cred you don't get the file caps.

From the philosophy of interpreting the file as defined by the
filesystem, in principle we could extend struct cred so you actually
get the creds just in uid 1000's user namespace, but that is very
unlikely to be worth it.

> I tend to think that, if we're not honoring the fcaps, we shouldn't be
> honoring the setuid bit either.  After all, it's really not a trusted
> file, even though the only user who could have messed with it really
> is the apparent owner.

For the file caps we can't honor them because you don't have the bits
in struct cred.

For setuid we can honor it, and setuid is something that the user
namespace allows.

> And, if we're going to say we don't trust the file and shouldn't honor
> setuid or fcaps, then merging all the functionality into mnt_may_suid
> could make sense.  Yes, these two things do different things, but they
> could hook in to the same place.

There are really two separate questions:
- Do we trust this filesystem?
- Do you have the bits to implement this concept?

Even if in this specific context the two questions wind up looking
exactly the same, I think it makes a lot of sense to ask the two
questions separately, as future maintenance changes may cause the
implementation of the questions to diverge.
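
As a minimal sketch of keeping the two questions separate, assuming
hypothetical stand-in types (struct sketch_userns, struct sketch_sb) rather
than the real kernel structures, and assuming that "having the bits" means
the task runs in the mounter's user namespace or a descendant of it:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's own structs. */
struct sketch_userns {
	const struct sketch_userns *parent;	/* NULL for the initial namespace */
};

struct sketch_sb {
	const struct sketch_userns *s_user_ns;	/* namespace of the mounter */
	bool nosuid;				/* policy: filesystem is untrusted */
};

/* True if ns is target itself or one of its descendants. */
static bool sketch_ns_within(const struct sketch_userns *ns,
			     const struct sketch_userns *target)
{
	for (; ns != NULL; ns = ns->parent)
		if (ns == target)
			return true;
	return false;
}

/* Question 1: do we trust this filesystem? */
static bool sketch_fs_trusted(const struct sketch_sb *sb)
{
	return !sb->nosuid;
}

/* Question 2: do we have the bits to represent these caps in the cred? */
static bool sketch_have_bits(const struct sketch_sb *sb,
			     const struct sketch_userns *task_ns)
{
	return sketch_ns_within(task_ns, sb->s_user_ns);
}

/* File caps are honored only when both answers are yes. */
static bool sketch_honor_fcaps(const struct sketch_sb *sb,
			       const struct sketch_userns *task_ns)
{
	return sketch_fs_trusted(sb) && sketch_have_bits(sb, task_ns);
}
```

With a filesystem mounted from a child namespace, a task in the initial
namespace fails the second question even when the first one passes, which
is the case discussed above.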

Eric



Re: [RFC PATCH v2] memory-barriers: remove smp_mb__after_unlock_lock()

2015-07-15 Thread Benjamin Herrenschmidt
On Thu, 2015-07-16 at 12:00 +1000, Michael Ellerman wrote:
> That would fix the problem with smp_mb__after_unlock_lock(), but not
> the original worry we had about loads happening before the SC in lock.

However I think isync fixes *that* :-) The problem with isync is, as you
said, that it's not a -memory- barrier per se, it's an execution barrier /
context synchronizing instruction. The combination stwcx. + bne + isync
however prevents the execution of anything past the isync until the
stwcx has completed and the bne has been "decided", which prevents loads
from leaking into the LL/SC loop. It will also prevent a store in the
lock from being issued before the stwcx. has completed. It does *not*
prevent as far as I can tell another unrelated store before the lock
from leaking into the lock, including the one used to unlock a different
lock.
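
A rough analogy in portable C11 atomics, not the powerpc arch code: an
acquire compare-and-swap keeps later accesses from moving ahead of the lock
acquisition but does not order stores that come before it, which matches
the behaviour described above. The names sketch_lock, sketch_trylock and
sketch_unlock are hypothetical, and the assumption that a compiler may
lower the acquire CAS to an LL/SC loop followed by bne + isync on PowerPC
is mine.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration only. */
struct sketch_lock {
	atomic_int locked;	/* 0 = free, 1 = held */
};

/* Try to take the lock once; returns true on success. */
static bool sketch_trylock(struct sketch_lock *l)
{
	int expected = 0;

	/*
	 * Acquire ordering on success: loads and stores after this call
	 * cannot be reordered before it.  Stores issued earlier (for
	 * example the store that unlocked a different lock) are not
	 * ordered by this operation.
	 */
	return atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Release the lock; earlier accesses cannot move past this store. */
static void sketch_unlock(struct sketch_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```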

Cheers,
Ben.




Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Eric W. Biederman
Casey Schaufler  writes:

> On 7/15/2015 6:08 PM, Andy Lutomirski wrote:
>> On Wed, Jul 15, 2015 at 3:39 PM, Casey Schaufler  
>> wrote:
>>> On 7/15/2015 2:06 PM, Eric W. Biederman wrote:
>>>> Casey Schaufler  writes:
>>>> The first step needs to be not trusting those labels and treating such
>>>> filesystems as filesystems without label support.  I hope that is what
>>>> Seth has implemented.
>>> A filesystem with Smack labels gets mounted in a namespace. The labels
>>> are ignored. Instead, the filesystem defaults (potentially specified as
>>> mount options smackfsdef="something", but usually the floor label ("_"))
>>> are used, giving the user the ability to read everything and (usually)
>>> change nothing. This is both dangerous (unintended read access to files)
>>> and pointless (can't make changes).
>> I don't get it.
>>
>> If I mount an unprivileged filesystem, then either the contents were
>> put there *by me*, in which case letting me access them are fine, or
>> (with Seth's patches and then some) I control the backing store, in
>> which case I can do whatever I want regardless of what LSM thinks.
>>
>> So I don't see the problem.  Why would Smack or any other LSM care at
>> all, unless it wants to prevent me from mounting the fs in the first
>> place?
>
> First off, I don't cotton to the notion that you should be able
> to mount filesystems without privilege. But it seems I'm being
> outvoted on that. I suspect that there are cases where it might
> be safe, but I can't think of one off the top of my head.

There are two fundamental issues with mounting filesystems without privilege,
by which I actually mean mounting filesystems as the root user in a user
namespace.

- Are the semantics safe?
- Is the extra attack surface a problem?

Figuring out how to make semantics safe is what we are talking about.

Once we sort out the semantics we can look at the handful of filesystems
like fuse where the extra attack surface is not a concern.

With that said desktop environments have for a long time been
automatically mounting whichever filesystem you place in your computer,
so in practice what this is really about is trying to align the kernel
with how people use filesystems.

I haven't looked closely but I think docker is just about as bad as
those desktop environments when it comes to mounting filesystems.

> If you do mount a filesystem it needs to behave according to the
> rules of the system.

I agree.

> If you have a security module that uses
> attributes on the filesystem you can't ignore them just because
> it's "your data". Mandatory access control schemes, including
> Smack and SELinux don't give a fig about who you are. It's the
> label on the data and the process that matter. If "you" get to
> muck the labels up, you've broken the mandatory access control.

So there are filesystems like fat and minix that can not store a label.
Since it is not possible to store labels securely in filesystems mounted
by unprivileged users (at least in the normal sense) the intent would be
to treat a filesystem mounted without the privileges of the global root
user as a filesystem that does not support xattrs.

Treating such a filesystem as a filesystem that does not support xattrs
is the only possible way to support such a filesystem securely, because as
you have said someone who can muck up the labels breaks mandatory access
control.
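
A minimal sketch of that rule, assuming hypothetical names (struct
sketch_labeled_sb, sketch_get_label) and not the actual LSM or VFS hooks:
a label lookup on a filesystem that was not mounted with global root
privileges reports "not supported", exactly as it would on fat or minix.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a mounted filesystem; illustration only. */
struct sketch_labeled_sb {
	bool mounted_by_global_root;	/* false for user namespace mounts */
	const char *stored_label;	/* label found on disk, may be NULL */
};

/*
 * Copy the on-disk label into buf.  Returns 0 on success, -EOPNOTSUPP if
 * the filesystem must be treated as having no xattr support, -ENODATA if
 * no label is stored, and -ERANGE if buf is too small.
 */
static int sketch_get_label(const struct sketch_labeled_sb *sb,
			    char *buf, size_t len)
{
	/* Labels written by an unprivileged mounter are never trusted. */
	if (!sb->mounted_by_global_root)
		return -EOPNOTSUPP;
	if (sb->stored_label == NULL)
		return -ENODATA;
	if (strlen(sb->stored_label) + 1 > len)
		return -ERANGE;
	strcpy(buf, sb->stored_label);
	return 0;
}
```

A caller that gets -EOPNOTSUPP would then fall back to whatever default
label the security module uses for filesystems without xattrs.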

Given how non-trivial it is to grasp the nuances of different LSMs'
mandatory access control semantics, I am asking Seth for the first pass
to simply forbid mounting of filesystems with just user namespace
permissions when there is an LSM active.

Once we get that far smack may never need to support such systems.

Eric


linux-next: manual merge of the akpm-current tree with the arm tree

2015-07-15 Thread Stephen Rothwell
Hi Andrew,

Today's linux-next merge of the akpm-current tree got a conflict in:

  arch/arm/include/asm/Kbuild

between commit:

  57853e8906a0 ("ARM: 8403/1: kbuild: don't use generic mcs_spinlock.h header")

from the arm tree and commit:

  74cf1a5a0c64 ("mm: clean up per architecture MM hook header files")

from the akpm-current tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc arch/arm/include/asm/Kbuild
index 517ef6dd22b9,30b3bc1666d2..
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@@ -12,6 -12,8 +12,7 @@@ generic-y += irq_regs.
  generic-y += kdebug.h
  generic-y += local.h
  generic-y += local64.h
 -generic-y += mcs_spinlock.h
+ generic-y += mm-arch-hooks.h
  generic-y += msgbuf.h
  generic-y += param.h
  generic-y += parport.h


Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Andy Lutomirski
On Wed, Jul 15, 2015 at 9:23 PM, Eric W. Biederman
 wrote:
>
> Ok.  Andy I have stopped and really looked at your patch that is 4/7 in
> this series.  Something I had not done before since it sounded totally
> wrong.
>
> That combined with your earlier comments I think I can say something
> meaningful.
>
> Andy as I read your patch the threat you are primarily worried about is
> chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
> deal with that case is reasonable, and is unlikely to break userspace.
> It is one of those hairy security things so we need to be careful not to
> introduce a regression.
>

Indeed.  It's plausible this could regress something, but it would be
really weird.

> I think a top down enhancement of nosuid to just block funny cases that
> no one cares about is completely sensible.  Removing goofy corner cases
> that no one cares about and that are only good for security exploits
> seems reasonable.
>

Agreed.

> I am a little concerned that smack does not seem to respect nosuid
> on filesystems.  But that is an issue with nosuid not with your enhanced
> nosuid.
>
>
>
>
> Now this patch 3/7 really should be entitled:
> "Limit file caps to the userns of the super block".
>
> It really really is doing something different.   This change is about a
> bottom up understanding of what file caps means on a filesystem mounted
> by a user namespace root.
>
> That is file caps should only apply to the user namespace root of the
> root user who mounted the filesystem, because that is all the privileges
> the mounter of the filesystem had.
>
> This guarantees that even if the filesystem somehow propagates with
> mount propagation that there will be no issues.  I think I know how to
> make that happen...
>
>
>
>
> But deeply and fundamentally limiting a filesystem to only the
> privileges of its user namespace root, and enhancing nosuid
> protections are rather different things.
>

So here's the semantic question:

Suppose an unprivileged user (uid 1000) creates a user namespace and a
mount namespace.  They stick a file (owned by uid 1000 as seen by
init_user_ns) in there and mark it setuid root and give it fcaps.

Then global root gets an fd to this filesystem.  If they execve the
file directly, then, with my patch 4, it won't act as setuid 1000 and
the fcaps will be ignored.  Even with my patch 4, though, if they bind
mount the fs and execve the file from their bind mount, it will act as
setuid 1000.  Maybe this is odd.  However, with Seth's patch 3, the
fcaps will (correctly) not be honored.

I tend to think that, if we're not honoring the fcaps, we shouldn't be
honoring the setuid bit either.  After all, it's really not a trusted
file, even though the only user who could have messed with it really
is the apparent owner.

And, if we're going to say we don't trust the file and shouldn't honor
setuid or fcaps, then merging all the functionality into mnt_may_suid
could make sense.  Yes, these two things do different things, but they
could hook in to the same place.

--Andy


[PATCH v3 3/4] arm64: Add Broadcom iProc family support

2015-07-15 Thread Ray Jui
This patch adds support for Broadcom's iProc family of arm64 based SoCs
in the arm64 Kconfig and defconfig files.

Signed-off-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 arch/arm64/Kconfig   |5 +
 arch/arm64/configs/defconfig |2 ++
 2 files changed, 7 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 318175f..969ef4a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -162,6 +162,11 @@ source "kernel/Kconfig.freezer"
 
 menu "Platform selection"
 
+config ARCH_BCM_IPROC
+   bool "Broadcom iProc SoC Family"
+   help
+ This enables support for Broadcom iProc based SoCs
+
 config ARCH_EXYNOS
bool
help
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 4e17e7e..c83d51f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -31,6 +31,7 @@ CONFIG_MODULES=y
 CONFIG_MODULE_UNLOAD=y
 # CONFIG_BLK_DEV_BSG is not set
 # CONFIG_IOSCHED_DEADLINE is not set
+CONFIG_ARCH_BCM_IPROC=y
 CONFIG_ARCH_EXYNOS7=y
 CONFIG_ARCH_FSL_LS2085A=y
 CONFIG_ARCH_HISI=y
@@ -102,6 +103,7 @@ CONFIG_SERIO_AMBAKMI=y
 CONFIG_LEGACY_PTY_COUNT=16
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_8250_DW=y
 CONFIG_SERIAL_8250_MT6577=y
 CONFIG_SERIAL_AMBA_PL011=y
 CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
-- 
1.7.9.5



[PATCH v3 1/4] PCI: iproc: enable arm64 support for iProc PCIe

2015-07-15 Thread Ray Jui
This patch enables arm64 support to the iProc PCIe driver

Signed-off-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 drivers/pci/host/pcie-iproc.c |   15 ---
 drivers/pci/host/pcie-iproc.h |8 ++--
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/host/pcie-iproc.c b/drivers/pci/host/pcie-iproc.c
index d77481e..8a556d5 100644
--- a/drivers/pci/host/pcie-iproc.c
+++ b/drivers/pci/host/pcie-iproc.c
@@ -58,11 +58,6 @@
 #define SYS_RC_INTX_EN   0x330
 #define SYS_RC_INTX_MASK 0xf
 
-static inline struct iproc_pcie *sys_to_pcie(struct pci_sys_data *sys)
-{
-   return sys->private_data;
-}
-
 /**
  * Note access to the configuration registers are protected at the higher layer
  * by 'pci_lock' in drivers/pci/access.c
@@ -71,8 +66,7 @@ static void __iomem *iproc_pcie_map_cfg_bus(struct pci_bus *bus,
unsigned int devfn,
int where)
 {
-   struct pci_sys_data *sys = bus->sysdata;
-   struct iproc_pcie *pcie = sys_to_pcie(sys);
+   struct iproc_pcie *pcie = bus->sysdata;
unsigned slot = PCI_SLOT(devfn);
unsigned fn = PCI_FUNC(devfn);
unsigned busno = bus->number;
@@ -208,10 +202,7 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res)
 
iproc_pcie_reset(pcie);
 
-   pcie->sysdata.private_data = pcie;
-
-   bus = pci_create_root_bus(pcie->dev, 0, &iproc_pcie_ops,
- &pcie->sysdata, res);
+   bus = pci_create_root_bus(pcie->dev, 0, &iproc_pcie_ops, pcie, res);
if (!bus) {
dev_err(pcie->dev, "unable to create PCI root bus\n");
ret = -ENOMEM;
@@ -229,7 +220,9 @@ int iproc_pcie_setup(struct iproc_pcie *pcie, struct list_head *res)
 
pci_scan_child_bus(bus);
pci_assign_unassigned_bus_resources(bus);
+#ifdef CONFIG_ARM
pci_fixup_irqs(pci_common_swizzle, pcie->map_irq);
+#endif
pci_bus_add_devices(bus);
 
return 0;
diff --git a/drivers/pci/host/pcie-iproc.h b/drivers/pci/host/pcie-iproc.h
index ba0a108..0ee9673 100644
--- a/drivers/pci/host/pcie-iproc.h
+++ b/drivers/pci/host/pcie-iproc.h
@@ -18,18 +18,22 @@
 
 /**
  * iProc PCIe device
+ * @sysdata: Per PCI controller data. This needs to be kept at the beginning of
+ * struct iproc_pcie, to enable support of both ARM32 and ARM64 platforms with
+ * minimal changes in the iProc PCIe core driver
  * @dev: pointer to device data structure
  * @base: PCIe host controller I/O register base
  * @resources: linked list of all PCI resources
- * @sysdata: Per PCI controller data
  * @root_bus: pointer to root bus
  * @phy: optional PHY device that controls the Serdes
  * @irqs: interrupt IDs
  */
 struct iproc_pcie {
+#ifdef CONFIG_ARM
+   struct pci_sys_data sysdata;
+#endif
struct device *dev;
void __iomem *base;
-   struct pci_sys_data sysdata;
struct pci_bus *root_bus;
struct phy *phy;
int irqs[IPROC_PCIE_MAX_NUM_IRQS];
-- 
1.7.9.5



[PATCH v3 0/4] Add Broadcom North Star 2 support

2015-07-15 Thread Ray Jui
This patch series adds Broadcom North Star 2 (NS2) SoC support. NS2 is an
ARMv8-based SoC in the Broadcom iProc family.

Sorry for tying this with the Broadcom iProc PCIe driver fixes for ARM64. I
have to tie them together because iProc PCIe support is enabled by default
when ARCH_BCM_IPROC is enabled. Without the fixes in the iProc PCIe driver,
enabling CONFIG_ARCH_BCM_IPROC would break the build for arm64 defconfig. Let
me know if there's a better way to handle this.

This patch series is generated based on v4.2-rc2 and tested on Broadcom NS2 SVK

Code available on GITHUB: https://github.com/Broadcom/arm64-linux.git
branch is ns2-core-v3

Changes from V2:
- Drop hardcoded earlycon kernel command line parameter in NS2 SVK dts file
because 1) earlycon is a debugging feature that can be enabled in the
bootloader and should not be enabled by default in the board dts file and 2)
of_earlycon should be used and support should be added to 8250 DW driver

Changes from V1:
- Took Arnd's advice to tweak the location of struct pci_sys_data within
struct iproc_pcie. This helps to get rid of most of the CONFIG_ARM wrap in
iProc PCIe core driver
- Use stdout-path and alias for serial console in NS2 SVK dts
- Add all 4 CPU descriptions in NS2 dtsi
- Remove "clock-frequency" property in the armv8 timer node so timer frequency
can be determined based on readings from CNTFRQ_EL0
- Remove config flag ARCH_BCM_NS2. Leave only ARCH_BCM_IPROC for all Broadcom
arm64 SoCs as advised
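
The first item above relies on a C property: a pointer to a struct also
points at its first member. A minimal sketch of why that lets one pointer
serve both the ARM32 core code and the driver, using hypothetical stand-in
types (struct sketch_sys_data, struct sketch_pcie) rather than the real
struct pci_sys_data and struct iproc_pcie:

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-controller data ARM32 code expects. */
struct sketch_sys_data {
	void *private_data;
};

/* Hypothetical stand-in for the driver's private structure. */
struct sketch_pcie {
	struct sketch_sys_data sysdata;	/* must stay the first member */
	int id;
};

/* Compile-time guard: the cast trick only works at offset zero. */
_Static_assert(offsetof(struct sketch_pcie, sysdata) == 0,
	       "sysdata must be the first member");

/* What the driver does with the pointer stored in the bus. */
static struct sketch_pcie *sketch_bus_to_pcie(void *bus_sysdata)
{
	return (struct sketch_pcie *)bus_sysdata;
}

/* What arch code expecting per-controller data does with the same pointer. */
static struct sketch_sys_data *sketch_bus_to_sys(void *bus_sysdata)
{
	return (struct sketch_sys_data *)bus_sysdata;
}
```

Because both views are valid for the same pointer, the driver can hand its
own structure to the bus layer on either architecture without most of the
CONFIG_ARM wrapping.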

Ray Jui (4):
  PCI: iproc: enable arm64 support for iProc PCIe
  PCI: iproc: Fix ARM64 dependency in Kconfig
  arm64: Add Broadcom iProc family support
  arm64: dts: Add Broadcom North Star 2 support

 Documentation/devicetree/bindings/arm/bcm/ns2.txt |9 ++
 arch/arm64/Kconfig|5 +
 arch/arm64/boot/dts/Makefile  |1 +
 arch/arm64/boot/dts/broadcom/Makefile |5 +
 arch/arm64/boot/dts/broadcom/ns2-svk.dts  |   59 +++
 arch/arm64/boot/dts/broadcom/ns2.dtsi |  118 +
 arch/arm64/configs/defconfig  |2 +
 drivers/pci/host/Kconfig  |2 +-
 drivers/pci/host/pcie-iproc.c |   15 +--
 drivers/pci/host/pcie-iproc.h |8 +-
 10 files changed, 210 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/arm/bcm/ns2.txt
 create mode 100644 arch/arm64/boot/dts/broadcom/Makefile
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2-svk.dts
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2.dtsi

-- 
1.7.9.5



[PATCH v3 4/4] arm64: dts: Add Broadcom North Star 2 support

2015-07-15 Thread Ray Jui
Add Broadcom NS2 device tree binding document. Also add initial device
tree dtsi for Broadcom North Star 2 (NS2) SoC and board support for NS2
SVK board

Signed-off-by: Jon Mason 
Signed-off-by: Ray Jui 
Reviewed-by: Scott Branden 
---
 Documentation/devicetree/bindings/arm/bcm/ns2.txt |9 ++
 arch/arm64/boot/dts/Makefile  |1 +
 arch/arm64/boot/dts/broadcom/Makefile |5 +
 arch/arm64/boot/dts/broadcom/ns2-svk.dts  |   59 +++
 arch/arm64/boot/dts/broadcom/ns2.dtsi |  118 +
 5 files changed, 192 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/arm/bcm/ns2.txt
 create mode 100644 arch/arm64/boot/dts/broadcom/Makefile
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2-svk.dts
 create mode 100644 arch/arm64/boot/dts/broadcom/ns2.dtsi

diff --git a/Documentation/devicetree/bindings/arm/bcm/ns2.txt b/Documentation/devicetree/bindings/arm/bcm/ns2.txt
new file mode 100644
index 000..35f056f
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/bcm/ns2.txt
@@ -0,0 +1,9 @@
+Broadcom North Star 2 (NS2) device tree bindings
+
+
+Boards with NS2 shall have the following properties:
+
+Required root node property:
+
+NS2 SVK board
+compatible = "brcm,ns2-svk", "brcm,ns2";
diff --git a/arch/arm64/boot/dts/Makefile b/arch/arm64/boot/dts/Makefile
index 38913be..9f95941 100644
--- a/arch/arm64/boot/dts/Makefile
+++ b/arch/arm64/boot/dts/Makefile
@@ -1,6 +1,7 @@
 dts-dirs += amd
 dts-dirs += apm
 dts-dirs += arm
+dts-dirs += broadcom
 dts-dirs += cavium
 dts-dirs += exynos
 dts-dirs += freescale
diff --git a/arch/arm64/boot/dts/broadcom/Makefile b/arch/arm64/boot/dts/broadcom/Makefile
new file mode 100644
index 000..e21fe66
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/Makefile
@@ -0,0 +1,5 @@
+dtb-$(CONFIG_ARCH_BCM_IPROC) += ns2-svk.dtb
+
+always := $(dtb-y)
+subdir-y   := $(dts-dirs)
+clean-files:= *.dtb
diff --git a/arch/arm64/boot/dts/broadcom/ns2-svk.dts b/arch/arm64/boot/dts/broadcom/ns2-svk.dts
new file mode 100644
index 000..244baf8
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/ns2-svk.dts
@@ -0,0 +1,59 @@
+/*
+ *  BSD LICENSE
+ *
+ *  Copyright(c) 2015 Broadcom Corporation.  All rights reserved.
+ *
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ ** Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ ** Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in
+ *  the documentation and/or other materials provided with the
+ *  distribution.
+ ** Neither the name of Broadcom Corporation nor the names of its
+ *  contributors may be used to endorse or promote products derived
+ *  from this software without specific prior written permission.
+ *
+ *  THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *  "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *  LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *  A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *  OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *  DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *  THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *  (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/dts-v1/;
+
+#include "ns2.dtsi"
+
+/ {
+   model = "Broadcom NS2 SVK";
+   compatible = "brcm,ns2-svk", "brcm,ns2";
+
+   aliases {
+   serial0 = &uart3;
+   };
+
+   chosen {
+   stdout-path = "serial0:115200n8";
+   };
+
+   memory {
+   device_type = "memory";
+   reg = <0x0 0x8000 0x 0x4000>;
+   };
+
+   soc: soc {
+   uart3: serial@6613 {
+   status = "ok";
+   };
+   };
+};
diff --git a/arch/arm64/boot/dts/broadcom/ns2.dtsi 
b/arch/arm64/boot/dts/broadcom/ns2.dtsi
new file mode 100644
index 000..3c92d92
--- /dev/null
+++ b/arch/arm64/boot/dts/broadcom/ns2.dtsi
@@ -0,0 +1,118 @@
+/*
+ *  BSD LICENSE
+ *
+ *  Copyright(c) 2015 Broadcom Corporation.  All rights reserved.
+ *
+ *  Redistribution and use in source and binary forms, with or without
+ *  modification, are permitted provided that the following conditions
+ *  are met:
+ *
+ ** Redistributions of source code must retain the above copyright
+ *   

[PATCH v3 2/4] PCI: iproc: Fix ARM64 dependency in Kconfig

2015-07-15 Thread Ray Jui
Allow Broadcom iProc PCIe core driver to be compiled for ARM64

Signed-off-by: Ray Jui 
Reviewed-by: Vikram Prakash 
Reviewed-by: Scott Branden 
---
 drivers/pci/host/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig
index c132bdd..d2c6144 100644
--- a/drivers/pci/host/Kconfig
+++ b/drivers/pci/host/Kconfig
@@ -117,7 +117,7 @@ config PCI_VERSATILE
 
 config PCIE_IPROC
tristate "Broadcom iProc PCIe controller"
-   depends on OF && ARM
+   depends on OF && (ARM || ARM64)
default n
help
  This enables the iProc PCIe core controller support for Broadcom's
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/7] fs: Ignore file caps in mounts from other user namespaces

2015-07-15 Thread Eric W. Biederman

Ok.  Andy, I have stopped and really looked at your patch 4/7 in
this series, something I had not done before since it sounded totally
wrong.

With that, combined with your earlier comments, I think I can say
something meaningful.

Andy, as I read your patch the threat you are primarily worried about is
chdir(/some/directory/in/another/mnt/ns).  I think enhancing nosuid to
deal with that case is reasonable, and is unlikely to break userspace.
It is one of those hairy security things so we need to be careful not to
introduce a regression.

I think a top down enhancement of nosuid to just block funny cases that
no one cares about is completely sensible.  Removing goofy corner cases
that no one cares about and that are only good for security exploits
seems reasonable.

I am a little concerned that smack does not seem to respect nosuid
on filesystems.  But that is an issue with nosuid not with your enhanced
nosuid.




Now this patch 3/7 really should be entitled:
"Limit file caps to the userns of the super block".

It really really is doing something different.   This change is about a
bottom up understanding of what file caps means on a filesystem mounted
by a user namespace root. 

That is, file caps should only apply to the user namespace root of the
root user who mounted the filesystem, because those are all the privileges
the mounter of the filesystem had.

This guarantees that even if the filesystem somehow propagates with
mount propagation that there will be no issues.  I think I know how to
make that happen...




But deeply and fundamentally limiting a filesystem to only the
privileges of its user namespace root, and enhancing nosuid
protections are rather different things.


The approaches show up differently for dealing with uids and gids,
as mappings are required.  The approaches will likely continue to
show up differently for file caps when Serge implements a version
of file caps with a user namespace root in them.

The approaches fundamentally will need to do different things with
security xattrs: mnt_may_suid can just treat the filesystem as one
without labels, while ultimately the LSMs will have to do something
meaningful.



So while in the very narrow case of today's file caps the two approaches
are the same, enhancing nosuid is something very different from
limiting a filesystem to its mounter's user namespace.

Eric


Re: [V2 6/7] hvsock: introduce Hyper-V VM Sockets feature

2015-07-15 Thread David Miller
From: Dexuan Cui 
Date: Tue, 14 Jul 2015 03:00:48 -0700

> + pr_debug("hvsock_sk_destruct: called\n");

Debug logging just to state that a function is called is not appropriate,
we have very sophisticated tracing facilities in the kernel that can do
that transparently, and more.

Please remove this.

> + if (hvsk->channel) {
> + pr_debug("hvsock_sk_destruct: calling vmbus_close()\n");

Likewise, these kinds of debug logs are totally inappropriate.

> +static int hvsock_release(struct socket *sock)
> +{
> + /* sock->sk is NULL, if accept() is interrupted by a signal */
> + if (sock->sk) {
> + __hvsock_release(sock->sk);
> + sock->sk = NULL;
> + }
> +
> + sock->state = SS_FREE;
> + pr_debug("hvsock_release called\n\n");

Likewise.


LKML archives at UofI down?

2015-07-15 Thread Josh Triplett
The LKML archives once present at
http://lkml.iu.edu/hypermail/linux/kernel/index.html seem to be down;
http://lkml.iu.edu/hypermail/ appears empty.  Does anyone know what
happened to it?

- Josh Triplett


Re: [V2 3/7] Drivers: hv: vmbus: add APIs to send/recv hvsock packet and get the r/w-ability

2015-07-15 Thread David Miller
From: Dexuan Cui 
Date: Tue, 14 Jul 2015 02:58:56 -0700

> +int vmbus_sendpacket_hvsock(struct vmbus_channel *channel, void *buf, u32 
> len)
> +{
> + struct vmpacket_descriptor desc;
> + struct vmpipe_proto_header pipe_hdr;
> + u32 packetlen;
> + u32 packetlen_aligned;
> + struct kvec bufferlist[4];
> + u64 aligned_data = 0;
> + int ret;
> + bool signal = false;

Reverse christmas-tree (longest to shortest line) order these local
variables, please.

> +EXPORT_SYMBOL(vmbus_sendpacket_hvsock);

EXPORT_SYMBOL_GPL()

> +int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void *buffer,
> + u32 bufferlen, u32 *buffer_actual_len)
> +{
> + struct vmpacket_descriptor *desc;
> + struct vmpipe_proto_header *pipe_hdr;
> + u32 packet_len, payload_len;
> + int ret;
> + bool signal = false;

Again, please use reverse christmas-tree order.

> +void vmbus_get_hvsock_rw_status(struct vmbus_channel *channel,
> +bool *can_read, bool *can_write)

The second line is not properly indented; it should start exactly one
column after the opening parenthesis on the previous line.

> + hv_get_ringbuffer_availbytes(inring_info,
> + bytes_avail_toread,
> + bytes_avail_towrite);

Again, improperly indented.

> +extern int vmbus_sendpacket_hvsock(struct vmbus_channel *channel,
> + void *buf, u32 len);
> +

Likewise.

> +extern int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void 
> *buffer,
> + u32 bufferlen, u32 *buffer_actual_len);
> +
> +extern void vmbus_get_hvsock_rw_status(struct vmbus_channel *channel,
> +bool *can_read, bool *can_write);

Likewise.


Re: [PATCH net-next] hv_netvsc: Add close of RNDIS filter into change mtu call

2015-07-15 Thread David Miller
From: Haiyang Zhang 
Date: Mon, 13 Jul 2015 13:09:16 -0700

> The current change mtu call only stops tx before removing RNDIS filter.
> In case the ring buffer is not empty, rndis_filter_device_remove() may
> hang on removing the buffers.
> 
> This patch adds close of RNDIS filter before removing it, also a
> gradual waiting loop until the ring is empty. The change_mtu hang
> issue under heavy traffic is solved by this patch.
> 
> Signed-off-by: Haiyang Zhang 
> Reviewed-by: K. Y. Srinivasan 

Applied, thanks.


Re: [3/3] IRQ: Print "unexpected IRQ" messages consistently across architectures

2015-07-15 Thread Michael Ellerman
On Mon, 2015-07-13 at 13:35 -0500, Bjorn Helgaas wrote:
> On Sun, Jul 12, 2015 at 10:23 PM, Michael Ellerman  
> wrote:
> > On Sun, 2015-12-07 at 22:02:11 UTC, Bjorn Helgaas wrote:
> >> Many architectures use a variant of "unexpected IRQ trap at vector %x" to
> >> log unexpected IRQs.  This is confusing because (a) it prints the Linux IRQ
> >> number, but "vector" more often refers to a CPU vector number, and (b) it
> >> prints the IRQ number in hex with no base indication, while Linux IRQ
> >> numbers are usually printed in decimal.
> >>
> >> Print the same text ("unexpected IRQ %d") across all architectures.
> >>
> >> No functional change other than the output text.
> >
> > There's already a fallback version in asm-generic, so shouldn't you instead
> > just delete all the versions that are identical to that?
> >
> > eg. on powerpc we have:
> >
> >>  static inline void ack_bad_irq(unsigned int irq)
> >>  {
> >> - printk(KERN_CRIT "unexpected IRQ trap at vector %02x\n", irq);
> >> + printk(KERN_CRIT "unexpected IRQ %d\n", irq);
> >>  }
> >
> > And the generic version is:
> >
> >>  #ifndef ack_bad_irq
> >>  static inline void ack_bad_irq(unsigned int irq)
> >>  {
> >> - printk(KERN_CRIT "unexpected IRQ trap at vector %02x\n", irq);
> >> + printk(KERN_CRIT "unexpected IRQ %d\n", irq);
> >>  }
> >>  #endif
> >
> > So we can just delete the powerpc version?
> 
> Wow, I really didn't do my homework here.  Not only is there a generic
> version already, but there's also print_irq_desc(), which prints way
> more information than any of the ack_bad_irq() implementations.

Even better :)

> I'll try again :)

Thanks.

cheers




Re: [V2 1/7] Drivers: hv: vmbus: define the new offer type for Hyper-V socket (hvsock)

2015-07-15 Thread David Miller
From: Dexuan Cui 
Date: Tue, 14 Jul 2015 02:58:03 -0700

> A helper function is also added.
> 
> Signed-off-by: Dexuan Cui 
> ---
>  include/linux/hyperv.h | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 30d3a1f..aa21814 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -236,6 +236,7 @@ struct vmbus_channel_offer {
>  #define VMBUS_CHANNEL_LOOPBACK_OFFER 0x100
>  #define VMBUS_CHANNEL_PARENT_OFFER   0x200
>  #define VMBUS_CHANNEL_REQUEST_MONITORED_NOTIFICATION 0x400
> +#define VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER   0x2000
>  
>  struct vmpacket_descriptor {
>   u16 type;
> @@ -758,6 +759,12 @@ struct vmbus_channel {
>   struct list_head percpu_list;
>  };
>  
> +static inline bool is_hvsock_channel(const struct vmbus_channel *c)
> +{
> + return !!(c->offermsg.offer.chn_flags &
> + VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER);
> +}
> +

This is not indented properly, plus it makes no sense to add a flag before
anyone even sets the flag.


Re: [PATCH 1/3] KVM: MTRR: fix memory type handling if MTRR is completely disabled

2015-07-15 Thread Alex Williamson
On Thu, 2015-07-16 at 03:25 +0800, Xiao Guangrong wrote:
> From: Xiao Guangrong 
> 
> Currently code uses default memory type if MTRR is fully disabled,
> fix it by using UC instead
> 
> Signed-off-by: Xiao Guangrong 
> ---

Seems to work for me.  I don't see a 0th patch, but for the series:

Tested-by: Alex Williamson 

Thanks!

>  arch/x86/kvm/mtrr.c | 21 -
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> index de1d2d8..e275013 100644
> --- a/arch/x86/kvm/mtrr.c
> +++ b/arch/x86/kvm/mtrr.c
> @@ -120,6 +120,16 @@ static u8 mtrr_default_type(struct kvm_mtrr *mtrr_state)
>   return mtrr_state->deftype & IA32_MTRR_DEF_TYPE_TYPE_MASK;
>  }
>  
> +static u8 mtrr_disabled_type(void)
> +{
> + /*
> +  * Intel SDM 11.11.2.2: all MTRRs are disabled when
> +  * IA32_MTRR_DEF_TYPE.E bit is cleared, and the UC
> +  * memory type is applied to all of physical memory.
> +  */
> + return MTRR_TYPE_UNCACHABLE;
> +}
> +
>  /*
>  * Three terms are used in the following code:
>  * - segment, it indicates the address segments covered by fixed MTRRs.
> @@ -434,6 +444,8 @@ struct mtrr_iter {
>  
>   /* output fields. */
>   int mem_type;
> + /* mtrr is completely disabled? */
> + bool mtrr_disabled;
>   /* [start, end) is not fully covered in MTRRs? */
>   bool partial_map;
>  
> @@ -549,7 +561,7 @@ static void mtrr_lookup_var_next(struct mtrr_iter *iter)
>  static void mtrr_lookup_start(struct mtrr_iter *iter)
>  {
>   if (!mtrr_is_enabled(iter->mtrr_state)) {
> - iter->partial_map = true;
> + iter->mtrr_disabled = true;
>   return;
>   }
>  
> @@ -563,6 +575,7 @@ static void mtrr_lookup_init(struct mtrr_iter *iter,
>   iter->mtrr_state = mtrr_state;
>   iter->start = start;
>   iter->end = end;
> + iter->mtrr_disabled = false;
>   iter->partial_map = false;
>   iter->fixed = false;
>   iter->range = NULL;
> @@ -656,6 +669,9 @@ u8 kvm_mtrr_get_guest_memory_type(struct kvm_vcpu *vcpu, 
> gfn_t gfn)
>   return MTRR_TYPE_WRBACK;
>   }
>  
> + if (iter.mtrr_disabled)
> + return mtrr_disabled_type();
> +
>   /* It is not covered by MTRRs. */
>   if (iter.partial_map) {
>   /*
> @@ -689,6 +705,9 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu 
> *vcpu, gfn_t gfn,
>   return false;
>   }
>  
> + if (iter.mtrr_disabled)
> + return true;
> +
>   if (!iter.partial_map)
>   return true;
>  
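
For readers without the SDM at hand, the rule the patch encodes can be sketched in user space as below. This is only an illustration under stated assumptions: sketch_uncovered_type() is a hypothetical helper, not a function from arch/x86/kvm/mtrr.c, and it only models an address that no fixed or variable MTRR covers. The bit positions follow the IA32_MTRR_DEF_TYPE layout (type in bits 7:0, E in bit 11).

```c
#include <stdint.h>

/* MTRR memory type encodings: UC = 0, WB = 6. */
#define MTRR_TYPE_UNCACHABLE 0
#define MTRR_TYPE_WRBACK     6

/* IA32_MTRR_DEF_TYPE: bits 7:0 hold the default type, bit 11 is E. */
#define SKETCH_DEF_TYPE_MASK 0xffu
#define SKETCH_DEF_TYPE_E    (1u << 11)

/*
 * Hypothetical helper: memory type for an address that no MTRR covers.
 * With E clear all MTRRs are disabled and UC applies everywhere, so the
 * default type field must not be consulted in that case.
 */
static inline uint8_t sketch_uncovered_type(uint64_t deftype)
{
	if (!(deftype & SKETCH_DEF_TYPE_E))
		return MTRR_TYPE_UNCACHABLE;

	return (uint8_t)(deftype & SKETCH_DEF_TYPE_MASK);
}
```

With E clear, a default type of WB in the low byte is ignored and UC is returned, which is the behaviour change the patch describes.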





Re: [Question] How to implement GPIO driver for sparse hw numbers?

2015-07-15 Thread Masahiro Yamada
Hi Linus,


2015-07-15 7:04 GMT+09:00 Linus Walleij :
> On Fri, Jun 19, 2015 at 5:27 AM, Masahiro Yamada
>  wrote:
>
>> In my understanding, the GPIO driver framework requires that
>> the hw numbers should be contiguous within each GPIO chip.
>
> Yes but no one says that .request() to the driver has to succeed
> on every GPIO so just cover all GPIOs from 0 to 307 with
> your GPIO chip and then implement your "holes" in the GPIO
> range from 0 to 307 by letting .request() fail.

Thanks,
At first I also thought about it, but finally I did not adopt it.

Having holes in the GPIO range is not handy because:

[1] When we map a gpio range into a pin range,
we must divide "gpio-ranges" property into many lines
   gpio-ranges = [...]


Re: [PATCH] sm750fb: coding style fixes lines over 80 chars

2015-07-15 Thread Joe Perches
On Thu, 2015-07-16 at 00:16 +0530, Vinay Simha BN wrote:
> scripts/checkpatch.pl kernel coding style fixes of WARNING

Please don't be a checkpatch robot.

Use tools to prompt your brain, but don't ever turn
your brain off.

> diff --git a/drivers/staging/sm750fb/ddk750_help.h 
> b/drivers/staging/sm750fb/ddk750_help.h


> +/* if 718 big endian turned on,be aware that don't use this driver for 
> general
> +  use,only for ppc big-endian */
> +#warning "big endian on target cpu and enable nature big endian support of 
> 718
> + capability !"

Yes, this is #if 0, but it's also obviously incorrect.

I didn't look at the rest.




[PATCH 1/1] ath10k: fixing wrong initialization of struct channel

2015-07-15 Thread Maninder Singh
chandef is initialized with NULL and on the very next line,
we are using it to get channel, which is not correct.

channel should be initialized after obtaining chandef.

Signed-off-by: Maninder Singh 
---
 drivers/net/wireless/ath/ath10k/mac.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/mac.c 
b/drivers/net/wireless/ath/ath10k/mac.c
index 218b6af..3d196b5 100644
--- a/drivers/net/wireless/ath/ath10k/mac.c
+++ b/drivers/net/wireless/ath/ath10k/mac.c
@@ -836,7 +836,7 @@ static inline int ath10k_vdev_setup_sync(struct ath10k *ar)
 static int ath10k_monitor_vdev_start(struct ath10k *ar, int vdev_id)
 {
struct cfg80211_chan_def *chandef = NULL;
-   struct ieee80211_channel *channel = chandef->chan;
+   struct ieee80211_channel *channel = NULL;
struct wmi_vdev_start_request_arg arg = {};
int ret = 0;
 
-- 
1.7.9.5
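
The pattern the patch moves towards can be sketched outside the kernel as follows. This is an illustration only: the two structs are cut-down stand-ins for the mac80211 types, and sketch_monitor_freq() with its found argument is hypothetical, standing in for whatever lookup the driver uses to obtain the chandef.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the types named in the patch. */
struct ieee80211_channel { int center_freq; };
struct cfg80211_chan_def { struct ieee80211_channel *chan; };

/* Returns the centre frequency, or -1 when no chandef is available. */
static inline int sketch_monitor_freq(struct cfg80211_chan_def *found)
{
	struct cfg80211_chan_def *chandef = NULL;
	struct ieee80211_channel *channel = NULL; /* not chandef->chan yet */

	chandef = found;	/* stands in for the real chandef lookup */
	if (!chandef)
		return -1;

	channel = chandef->chan;	/* safe: chandef is valid here */
	return channel ? channel->center_freq : -1;
}
```

The point is only the ordering: channel is read from chandef after chandef has been assigned and checked, never in the initializer.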



Re: linux-next: build failure after merge of the rcu tree

2015-07-15 Thread Paul E. McKenney
On Thu, Jul 16, 2015 at 01:14:23PM +1000, Stephen Rothwell wrote:
> Hi Paul,
> 
> After merging the rcu tree, today's linux-next build (arm
> multi_v7_defconfig) failed like this:
> 
> kernel/notifier.c: In function 'notify_die':
> kernel/notifier.c:547:2: error: implicit declaration of function 
> 'rcu_lockdep_assert' [-Werror=implicit-function-declaration]
>   rcu_lockdep_assert(rcu_is_watching(),
>   ^
> 
> Caused by commit
> 
>   02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")
> 
> interacting with commit
> 
>   e727c7d7a11e ("notifiers, RCU: Assert that RCU is watching in notify_die()")
> 
> [ and I also noted
>   0333a209cbf6 ("x86/irq, context_tracking: Document how IRQ context tracking 
> works and add an RCU assertion")
> ]
> 
> from the tip tree.

Thank you in both cases!  I suspect that more will follow, so is there
something I can do to make this easier?  (Hard for me to patch stuff
that is not yet in the tree...)

Thanx, Paul

> I added the following merge fix patch:
> 
> From: Stephen Rothwell 
> Date: Thu, 16 Jul 2015 13:08:50 +1000
> Subject: [PATCH] rcu: merge fix for Rename rcu_lockdep_assert() to 
> RCU_LOCKDEP_WARN()
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  arch/x86/kernel/irq.c | 2 +-
>  kernel/notifier.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
> index 30dbf35bc90b..f9cd81825187 100644
> --- a/arch/x86/kernel/irq.c
> +++ b/arch/x86/kernel/irq.c
> @@ -234,7 +234,7 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs 
> *regs)
>   entering_irq();
> 
>   /* entering_irq() tells RCU that we're not quiescent.  Check it. */
> - rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
> + RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
> 
>   irq = __this_cpu_read(vector_irq[vector]);
> 
> diff --git a/kernel/notifier.c b/kernel/notifier.c
> index 980e4330fb59..fd2c9acbcc19 100644
> --- a/kernel/notifier.c
> +++ b/kernel/notifier.c
> @@ -544,7 +544,7 @@ int notrace notify_die(enum die_val val, const char *str,
>   .signr  = sig,
> 
>   };
> - rcu_lockdep_assert(rcu_is_watching(),
> + RCU_LOCKDEP_WARN(!rcu_is_watching(),
>  "notify_die called but RCU thinks we're quiescent");
>   return atomic_notifier_call_chain(&die_chain, val, &args);
>  }
> -- 
> 2.1.4
> 
> -- 
> Cheers,
> Stephen Rothwell                    s...@canb.auug.org.au
> 



Re: [PATCH v3] gpio: UniPhier: add driver for UniPhier GPIO controller

2015-07-15 Thread Masahiro Yamada
Hi Linus,


2015-07-15 23:15 GMT+09:00 Linus Walleij :
> On Tue, Jul 14, 2015 at 4:43 AM, Masahiro Yamada
>  wrote:
>
>> This GPIO controller device is used on UniPhier SoCs.
>>
>> Signed-off-by: Masahiro Yamada 
>> ---
>>
>> Changes in v3:
>>   - Use module_platform_driver()
>>
>> Changes in v2:
>>   - Fix typos in the comment block
>
> OK why no device tree bindings? Are they in a separate patch?


Sorry, I was planning to do it later.

OK.  I will come back with
Documentation/devicetree/bindings/gpio/uniphier-gpio.txt, with the binding info in it.


>> +/*
>> + * Unfortunately, the hardware specification adopts weird GPIO pin labeling.
>> + * The ports are named as
>> + *   PORT00,  PORT01,  PORT02,  ..., PORT07,
>> + *   PORT10,  PORT11,  PORT12,  ..., PORT17,
>> + *   PORT20,  PORT21,  PORT22,  ..., PORT27,
>> + *...
>> + *   PORT90,  PORT91,  PORT92,  ..., PORT97,
>> + *   PORT100, PORT101, PORT102, ..., PORT107,
>> + *...
>> + *
>> + * The PORTs with 8 or 9 in the one's place are missing, i.e. the one's 
>> place
>> + * is octal, while the other places are decimal.  If we handle the port 
>> numbers
>> + * as seen in the hardware documents, the GPIO offsets must be 
>> non-contiguous.
>> + * It is possible to have sparse GPIO pins, but not handy for GPIO range
>> + * mappings, register accessing, etc.
>> + *
>> + * To make things simpler (for driver and device tree implementation), this
>> + * driver takes contiguously-numbered GPIO offsets.  GPIO consumers should 
>> make
>> + * sure to convert the PORT number into the one that fits in this driver.
>> + * The conversion logic is very easy math, for example,
>> + *   PORT15  -->  GPIO offset 13   (8 * 1 + 5)
>> + *   PORT123 -->  GPIO offset 99   (8 * 12 + 3)
>> + */
>> +#define UNIPHIER_GPIO_PORTS_PER_BANK   8
>> +#define UNIPHIER_GPIO_BANK_MASK\
>> +   ((1UL << (UNIPHIER_GPIO_PORTS_PER_BANK)) - 1)
>
>
>
>> +
>> +#define UNIPHIER_GPIO_REG_DATA 0   /* data */
>> +#define UNIPHIER_GPIO_REG_DIR  4   /* direction (1:in, 0:out) */
>> +
>> +struct uniphier_gpio_priv {
>> +   struct of_mm_gpio_chip mmchip;
>> +   spinlock_t lock;
>> +};
>> +
>> +static unsigned uniphier_gpio_bank_to_reg(unsigned bank, unsigned reg_type)
>> +{
>> +   unsigned reg;
>> +
>> +   reg = (bank + 1) * 8 + reg_type;
>> +
>> +   /*
>> +* Unfortunately, there is a register hole at offset 0x90-0x9f.
>> +* Add 0x10 when crossing the hole.
>> +*/
>> +   if (reg >= 0x90)
>> +   reg += 0x10;
>> +
>> +   return reg;
>> +}
>> +
>> +static void uniphier_gpio_bank_write(struct gpio_chip *chip,
>> +unsigned bank, unsigned reg_type,
>> +unsigned mask, unsigned value)
>> +{
>> +   struct of_mm_gpio_chip *mmchip = to_of_mm_gpio_chip(chip);
>> +   struct uniphier_gpio_priv *priv;
>> +   unsigned long flags;
>> +   unsigned reg;
>> +   u32 tmp;
>> +
>> +   if (!mask)
>> +   return;
>> +
>> +   priv = container_of(mmchip, struct uniphier_gpio_priv, mmchip);
>> +
>> +   reg = uniphier_gpio_bank_to_reg(bank, reg_type);
>> +
>> +   /*
>> +* Note
>> +* regmap_update_bits() should not be used here.
>> +*
>> +* The DATA registers return the current readback of pins, not the
>> +* previously written data when they are configured as "input".
>> +* The DATA registers must be overwritten even if the data you are
>> +* going to write is the same as what readl() has returned.
>> +*
>> +* regmap_update_bits() does not write back if the data is not 
>> changed.
>> +*/
>
> Why is this mentioned when the driver doesn't even use regmap?
> Development artifact?


At first, I thought regmap_update_bits() might be useful,
but it turned out to be a bad idea.

Anyway, I did not use regmap in this driver, so this comment sounds a
bit weird.
I will delete it in v4.
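
For GPIO consumers, the PORT-to-offset conversion and the register hole described in the quoted comments can be sketched as below. This is a stand-alone illustration, not code from the driver: sketch_port_to_offset() is a hypothetical helper (the driver leaves the conversion to the consumer), and sketch_bank_to_reg() only mirrors the arithmetic of the quoted uniphier_gpio_bank_to_reg().

```c
/* Ports per bank, as in the quoted driver. */
#define SKETCH_PORTS_PER_BANK 8

/*
 * Hypothetical helper: PORT number as printed in the hardware documents
 * to contiguous GPIO offset.  Returns -1 for the ports that do not
 * exist (one's place of 8 or 9).
 */
static inline int sketch_port_to_offset(unsigned int port)
{
	unsigned int bank = port / 10;	/* upper places are decimal */
	unsigned int bit = port % 10;	/* one's place is octal */

	if (bit >= SKETCH_PORTS_PER_BANK)
		return -1;

	return (int)(bank * SKETCH_PORTS_PER_BANK + bit);
}

/* Register offset of a bank, mirroring the quoted bank_to_reg logic. */
static inline unsigned int sketch_bank_to_reg(unsigned int bank,
					      unsigned int reg_type)
{
	unsigned int reg = (bank + 1) * 8 + reg_type;

	if (reg >= 0x90)	/* skip the register hole at 0x90-0x9f */
		reg += 0x10;

	return reg;
}
```

This reproduces the two examples from the comment block: PORT15 maps to offset 13 and PORT123 maps to offset 99.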



>> +static int uniphier_gpio_get_direction(struct gpio_chip *chip, unsigned 
>> offset)
>> +{
>> +   return uniphier_gpio_offset_read(chip, UNIPHIER_GPIO_REG_DIR, 
>> offset) ?
>> +   GPIOF_DIR_IN : GPIOF_DIR_OUT;
>
> Just use
> return !!uniphier_gpio_offset_read(chip, UNIPHIER_GPIO_REG_DIR, offset);


OK, will fix.

>> +static int uniphier_gpio_get(struct gpio_chip *chip, unsigned offset)
>> +{
>> +   return uniphier_gpio_offset_read(chip, offset, 
>> UNIPHIER_GPIO_REG_DATA);
>
> return !!uniphier_gpio_offset_read(chip, offset, UNIPHIER_GPIO_REG_DATA);

Likewise.


>> +static void uniphier_gpio_set_multiple(struct gpio_chip *chip,
>> +  unsigned long *mask,
>> +  unsigned long *bits)
>> +{
>> +   unsigned bank, shift, bank_mask, bank_bits;
>> +   int i;
>> +
>> +   for (i = 0; i < chip->ngpio; i += UNIP

Re: [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian

2015-07-15 Thread Michael Ellerman
On Wed, 2015-07-15 at 08:16 -0700, Kees Cook wrote:
> On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman  
> wrote:
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
> > b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index b2374c131340..51adb9afb511 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -82,7 +82,13 @@ struct seccomp_data {
> >  };
> >  #endif
> >
> > +#if __BYTE_ORDER == __LITTLE_ENDIAN
> >  #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
> > +#elif __BYTE_ORDER == __BIG_ENDIAN
> > +#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]) + 
> > sizeof(__u32))
> > +#else
> > +#error "wut?"
> > +#endif
> 
> Ah-ha! Yes, thanks. Could you change the #error to something that
> describes the particular (impossible) failure condition? "wut? Unknown
> __BYTE_ORDER?!". Not a huge deal, but I always like verbose errors. :)
> Especially for "impossible" situations. :)

Yeah sorry that was a "quick hack" which got promoted into an actual patch.

Fixed to use your message.

cheers
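
For anyone wondering why the extra sizeof(__u32) is needed: the filter loads 32-bit words, and on big endian the low half of each 64-bit argument sits in the second word. A stand-alone sketch of that, with assumptions stated: the sketch_ names are hypothetical, the struct copies the seccomp_data layout from the UAPI header, and endianness is probed at run time here rather than via __BYTE_ORDER as in the selftest.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as struct seccomp_data in the seccomp UAPI header. */
struct sketch_seccomp_data {
	int nr;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Run-time endianness probe (the selftest uses __BYTE_ORDER instead). */
static inline int sketch_is_big_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 0;
}

/* Byte offset of the low 32 bits of args[n]. */
static inline size_t sketch_syscall_arg(unsigned int n)
{
	size_t off = offsetof(struct sketch_seccomp_data, args) +
		     n * sizeof(uint64_t);

	if (sketch_is_big_endian())
		off += sizeof(uint32_t);	/* low word comes second */

	return off;
}

/* Read the 32-bit word that a word-sized load at that offset would see. */
static inline uint32_t sketch_load_arg_lo(const struct sketch_seccomp_data *sd,
					  unsigned int n)
{
	uint32_t word;

	memcpy(&word, (const unsigned char *)sd + sketch_syscall_arg(n),
	       sizeof(word));
	return word;
}
```

On either byte order the load returns the low half of the argument, which is what the tests compare against.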




Re: [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks

2015-07-15 Thread Michael Ellerman
On Wed, 2015-07-15 at 08:12 -0700, Kees Cook wrote:
> On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman  
> wrote:
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
> > b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index c5abe7fd7590..b2374c131340 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -645,6 +645,10 @@ static struct siginfo TRAP_info;
> >  static volatile int TRAP_nr;
> >  static void TRAP_action(int nr, siginfo_t *info, void *void_context)
> >  {
> > +   fprintf(stderr, "in TRAP_action\n");
> > +   fprintf(stderr, "info->si_call_addr %p\n", info->si_call_addr);
> > +   fprintf(stderr, "info->si_syscall %u\n", info->si_syscall);
> > +   fprintf(stderr, "info->si_arch %u\n", info->si_arch);
> > memcpy(&TRAP_info, info, sizeof(TRAP_info));
> > TRAP_nr = nr;
> >  }
> 
> This chunk looks like left-over debugging?

Urgh yep, that's ugly. Thanks for noticing.

Will remove before merging :)

cheers




Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Eric W. Biederman

Seth, I think for the LSMs we should start with:

diff --git a/security/security.c b/security/security.c
index 062f3c997fdc..5b6ece92a8e5 100644
--- a/security/security.c
+++ b/security/security.c
@@ -310,6 +310,8 @@ int security_sb_statfs(struct dentry *dentry)
 int security_sb_mount(const char *dev_name, struct path *path,
const char *type, unsigned long flags, void *data)
 {
+   if (current_user_ns() != &init_user_ns)
+   return -EPERM;
return call_int_hook(sb_mount, 0, dev_name, path, type, flags, data);
 }


Then we should push this down into all of the lsms.
Then we should remove or relax or change the check as appropriate
in each lsm.

The point is this is good enough to see that it is trivially safe,
and this allows us to focus on the core issues, and stop worrying about
the lsms for a bit.

Then we can focus on each lsm one at a time and take the time to really
understand them and talk with their maintainers etc to make certain
we get things correct.

This should remove the need for your patches 5, 6 and 7 for the
immediate future.

Eric


[PATCH 3/4] block: partition: introduce 'cpu' para to part_inc|dec_in_flight

2015-07-15 Thread Ming Lei
This makes it easier to convert part->in_flight[rw] into a percpu
variable in the following patch.

Signed-off-by: Ming Lei 
---
 block/bio.c   | 4 ++--
 block/blk-core.c  | 4 ++--
 block/blk-merge.c | 2 +-
 drivers/nvdimm/core.c | 4 ++--
 include/linux/genhd.h | 4 ++--
 5 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 2a00d34..fe8807f 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1724,7 +1724,7 @@ void generic_start_io_acct(int rw, unsigned long sectors,
part_round_stats(cpu, part);
part_stat_inc(cpu, part, ios[rw]);
part_stat_add(cpu, part, sectors[rw], sectors);
-   part_inc_in_flight(part, rw);
+   part_inc_in_flight(cpu, part, rw);
 
part_stat_unlock();
 }
@@ -1738,7 +1738,7 @@ void generic_end_io_acct(int rw, struct hd_struct *part,
 
part_stat_add(cpu, part, ticks[rw], duration);
part_round_stats(cpu, part);
-   part_dec_in_flight(part, rw);
+   part_dec_in_flight(cpu, part, rw);
 
part_stat_unlock();
 }
diff --git a/block/blk-core.c b/block/blk-core.c
index 82819e6..f180a6d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2194,7 +2194,7 @@ void blk_account_io_done(struct request *req)
part_stat_inc(cpu, part, ios[rw]);
part_stat_add(cpu, part, ticks[rw], duration);
part_round_stats(cpu, part);
-   part_dec_in_flight(part, rw);
+   part_dec_in_flight(cpu, part, rw);
 
hd_struct_put(part);
part_stat_unlock();
@@ -2252,7 +2252,7 @@ void blk_account_io_start(struct request *rq, bool new_io)
hd_struct_get(part);
}
part_round_stats(cpu, part);
-   part_inc_in_flight(part, rw);
+   part_inc_in_flight(cpu, part, rw);
rq->part = part;
}
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 30a0d9f..cb7c46d 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -449,7 +449,7 @@ static void blk_account_io_merge(struct request *req)
part = req->part;
 
part_round_stats(cpu, part);
-   part_dec_in_flight(part, rq_data_dir(req));
+   part_dec_in_flight(cpu, part, rq_data_dir(req));
 
hd_struct_put(part);
part_stat_unlock();
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index cb62ec6..053d026 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -224,7 +224,7 @@ void __nd_iostat_start(struct bio *bio, unsigned long *start)
part_round_stats(cpu, &disk->part0);
part_stat_inc(cpu, &disk->part0, ios[rw]);
part_stat_add(cpu, &disk->part0, sectors[rw], bio_sectors(bio));
-   part_inc_in_flight(&disk->part0, rw);
+   part_inc_in_flight(cpu, &disk->part0, rw);
part_stat_unlock();
 }
 EXPORT_SYMBOL(__nd_iostat_start);
@@ -238,7 +238,7 @@ void nd_iostat_end(struct bio *bio, unsigned long start)
 
part_stat_add(cpu, &disk->part0, ticks[rw], duration);
part_round_stats(cpu, &disk->part0);
-   part_dec_in_flight(&disk->part0, rw);
+   part_dec_in_flight(cpu, &disk->part0, rw);
part_stat_unlock();
 }
 EXPORT_SYMBOL(nd_iostat_end);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 2adbfa6..612ae80 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -381,14 +381,14 @@ static inline void free_part_stats(struct hd_struct *part)
 #define part_stat_sub(cpu, gendiskp, field, subnd) \
part_stat_add(cpu, gendiskp, field, -subnd)
 
-static inline void part_inc_in_flight(struct hd_struct *part, int rw)
+static inline void part_inc_in_flight(int cpu, struct hd_struct *part, int rw)
 {
atomic_inc(&part->in_flight[rw]);
if (part->partno)
atomic_inc(&part_to_disk(part)->part0.in_flight[rw]);
 }
 
-static inline void part_dec_in_flight(struct hd_struct *part, int rw)
+static inline void part_dec_in_flight(int cpu, struct hd_struct *part, int rw)
 {
atomic_dec(&part->in_flight[rw]);
if (part->partno)
-- 
1.9.1



[PATCH 2/4] block: partition: convert percpu ref

2015-07-15 Thread Ming Lei
Percpu refcount is the perfect match for the partition case,
and the conversion is quite straightforward.

With the conversion, one pair of atomic inc/dec can be saved
when accounting block I/O, which runs in the hot path of block I/O.
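
To make the idea concrete, here is a rough userspace model of why a percpu refcount avoids the atomic pair on the hot path. It is only a sketch: struct ref_model, MODEL_CPUS and the ref_model_*() helpers are hypothetical names, it is single-threaded, and it skips the RCU grace period the real percpu_ref code needs when switching modes. The released flag stands in for calling the release callback.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CPUS 4	/* hypothetical CPU count for the model */

struct ref_model {
	long percpu[MODEL_CPUS];	/* fast-path deltas, used while !dead */
	long count;			/* shared count, starts at 1 (initial ref) */
	bool dead;			/* set by kill: gets/puts use the shared count */
	bool released;			/* stands in for the release callback */
};

void ref_model_init(struct ref_model *ref)
{
	for (int cpu = 0; cpu < MODEL_CPUS; cpu++)
		ref->percpu[cpu] = 0;
	ref->count = 1;
	ref->dead = false;
	ref->released = false;
}

/* Fast path: only the local CPU's slot is touched, no shared cacheline. */
void ref_model_get(struct ref_model *ref, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_CPUS);
	if (!ref->dead)
		ref->percpu[cpu]++;
	else
		ref->count++;
}

void ref_model_put(struct ref_model *ref, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_CPUS);
	if (!ref->dead) {
		/* a slot may go negative if the get happened on another CPU */
		ref->percpu[cpu]--;
		return;
	}
	if (--ref->count == 0)
		ref->released = true;
}

/* Slow path: fold the per-CPU deltas into the shared count, drop the initial ref. */
void ref_model_kill(struct ref_model *ref)
{
	ref->dead = true;
	for (int cpu = 0; cpu < MODEL_CPUS; cpu++) {
		ref->count += ref->percpu[cpu];
		ref->percpu[cpu] = 0;
	}
	if (--ref->count == 0)
		ref->released = true;
}
```

In this model, gets and puts before the kill never touch the shared count, which is the saving the changelog refers to; only after the kill does the last put trigger the release.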

Signed-off-by: Ming Lei 
---
 block/genhd.c |  6 +-
 block/partition-generic.c |  9 +
 include/linux/genhd.h | 27 +--
 3 files changed, 27 insertions(+), 15 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index ed3f5b9..3213b66 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1284,7 +1284,11 @@ struct gendisk *alloc_disk_node(int minors, int node_id)
 * converted to make use of bd_mutex and sequence counters.
 */
seqcount_init(&disk->part0.nr_sects_seq);
-   hd_ref_init(&disk->part0);
+   if (hd_ref_init(&disk->part0)) {
+   hd_free_part(&disk->part0);
+   kfree(disk);
+   return NULL;
+   }
 
disk->minors = minors;
rand_initialize_disk(disk);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index eca0d02..e771113 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -232,8 +232,9 @@ static void delete_partition_rcu_cb(struct rcu_head *head)
put_device(part_to_dev(part));
 }
 
-void __delete_partition(struct hd_struct *part)
+void __delete_partition(struct percpu_ref *ref)
 {
+   struct hd_struct *part = container_of(ref, struct hd_struct, ref);
call_rcu(&part->rcu_head, delete_partition_rcu_cb);
 }
 
@@ -254,7 +255,7 @@ void delete_partition(struct gendisk *disk, int partno)
kobject_put(part->holder_dir);
device_del(part_to_dev(part));
 
-   hd_struct_put(part);
+   hd_struct_kill(part);
 }
 
 static ssize_t whole_disk_show(struct device *dev,
@@ -355,8 +356,8 @@ struct hd_struct *add_partition(struct gendisk *disk, int partno,
if (!dev_get_uevent_suppress(ddev))
kobject_uevent(&pdev->kobj, KOBJ_ADD);
 
-   hd_ref_init(p);
-   return p;
+   if (!hd_ref_init(p))
+   return p;
 
 out_free_info:
free_part_info(p);
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index a221220..2adbfa6 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_BLOCK
 
@@ -124,7 +125,7 @@ struct hd_struct {
 #else
struct disk_stats dkstats;
 #endif
-   atomic_t ref;
+   struct percpu_ref ref;
struct rcu_head rcu_head;
 };
 
@@ -611,7 +612,7 @@ extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
 sector_t len, int flags,
 struct partition_meta_info
   *info);
-extern void __delete_partition(struct hd_struct *);
+extern void __delete_partition(struct percpu_ref *);
 extern void delete_partition(struct gendisk *, int);
 extern void printk_all_partitions(void);
 
@@ -640,33 +641,39 @@ extern ssize_t part_fail_store(struct device *dev,
   const char *buf, size_t count);
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
-static inline void hd_ref_init(struct hd_struct *part)
+static inline int hd_ref_init(struct hd_struct *part)
 {
-   atomic_set(&part->ref, 1);
-   smp_mb();
+   if (percpu_ref_init(&part->ref, __delete_partition, 0,
+   GFP_KERNEL))
+   return -ENOMEM;
+   return 0;
 }
 
 static inline void hd_struct_get(struct hd_struct *part)
 {
-   atomic_inc(&part->ref);
-   smp_mb__after_atomic();
+   percpu_ref_get(&part->ref);
 }
 
 static inline int hd_struct_try_get(struct hd_struct *part)
 {
-   return atomic_inc_not_zero(&part->ref);
+   return percpu_ref_tryget_live(&part->ref);
 }
 
 static inline void hd_struct_put(struct hd_struct *part)
 {
-   if (atomic_dec_and_test(&part->ref))
-   __delete_partition(part);
+   percpu_ref_put(&part->ref);
+}
+
+static inline void hd_struct_kill(struct hd_struct *part)
+{
+   percpu_ref_kill(&part->ref);
 }
 
 static inline void hd_free_part(struct hd_struct *part)
 {
free_part_stats(part);
free_part_info(part);
+   percpu_ref_exit(&part->ref);
 }
 
 /*
-- 
1.9.1



[PATCH 4/4] block: account io: convert part->in_flight[] into percpu variable

2015-07-15 Thread Ming Lei
With this change the atomic operations for accounting block I/O can be
removed completely. It is OK to sum the percpu variables in part_in_flight()
because the function runs at most once per tick.
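
As a rough illustration of the trade-off described above, the following userspace model keeps one counter slot per CPU and sums them only on the rare read. It is a sketch under assumptions: struct part_model, NR_CPUS_MODEL and the model_*() helpers are hypothetical names that stand in for the kernel's per-cpu disk_stats machinery, and the model is single-threaded.

```c
#include <assert.h>

#define NR_CPUS_MODEL 4	/* hypothetical CPU count for the model */

struct part_model {
	/* one read/write counter pair per CPU, written only by that CPU */
	int in_flight[NR_CPUS_MODEL][2];
};

/* Hot path: a plain increment of the local slot, no atomic operation. */
void model_inc_in_flight(struct part_model *p, int cpu, int rw)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	p->in_flight[cpu][rw]++;
}

/* Completion may run on a different CPU, so a single slot can go negative. */
void model_dec_in_flight(struct part_model *p, int cpu, int rw)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	p->in_flight[cpu][rw]--;
}

/* Slow path: only the sum over all CPUs is meaningful. */
int model_read_in_flight(const struct part_model *p, int rw)
{
	int sum = 0;

	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		sum += p->in_flight[cpu][rw];
	return sum;
}
```

The read walks every CPU's slot, which is why it matters that the reader runs at most once per tick while the increments and decrements run for every I/O.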

Signed-off-by: Ming Lei 
---
 block/blk-core.c  |  1 +
 block/partition-generic.c |  5 +++--
 drivers/md/dm.c   | 10 ++
 include/linux/genhd.h | 24 ++--
 4 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index f180a6d..0001d4c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1344,6 +1344,7 @@ static void part_round_stats_single(int cpu, struct hd_struct *part,
if (now == part->stamp)
return;
 
+   /* at most one percpu addition per one tick */
inflight = part_in_flight(part);
if (inflight) {
__part_stat_add(cpu, part, time_in_queue,
diff --git a/block/partition-generic.c b/block/partition-generic.c
index e771113..0a553e7 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -140,8 +140,9 @@ ssize_t part_inflight_show(struct device *dev,
 {
struct hd_struct *p = dev_to_part(dev);
 
-   return sprintf(buf, "%8u %8u\n", atomic_read(&p->in_flight[0]),
-   atomic_read(&p->in_flight[1]));
+   return sprintf(buf, "%8u %8u\n",
+   part_stat_read(p, in_flight[0]),
+   part_stat_read(p, in_flight[1]));
 }
 
 #ifdef CONFIG_FAIL_MAKE_REQUEST
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index de70377..1b6d8be 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -651,9 +651,9 @@ static void start_io_acct(struct dm_io *io)
 
cpu = part_stat_lock();
part_round_stats(cpu, &dm_disk(md)->part0);
+   part_stat_set(cpu, &dm_disk(md)->part0, in_flight[rw],
+   atomic_inc_return(&md->pending[rw]));
part_stat_unlock();
-   atomic_set(&dm_disk(md)->part0.in_flight[rw],
-   atomic_inc_return(&md->pending[rw]));
 
if (unlikely(dm_stats_used(&md->stats)))
dm_stats_account_io(&md->stats, bio->bi_rw, 
bio->bi_iter.bi_sector,
@@ -665,7 +665,7 @@ static void end_io_acct(struct dm_io *io)
struct mapped_device *md = io->md;
struct bio *bio = io->bio;
unsigned long duration = jiffies - io->start_time;
-   int pending;
+   int pending, cpu;
int rw = bio_data_dir(bio);
 
generic_end_io_acct(rw, &dm_disk(md)->part0, io->start_time);
@@ -679,7 +679,9 @@ static void end_io_acct(struct dm_io *io)
 * a flush.
 */
pending = atomic_dec_return(&md->pending[rw]);
-   atomic_set(&dm_disk(md)->part0.in_flight[rw], pending);
+   cpu = part_stat_lock();
+   part_stat_set(cpu, &dm_disk(md)->part0, in_flight[rw], pending);
+   part_stat_unlock();
pending += atomic_read(&md->pending[rw^0x1]);
 
/* nudge anyone waiting on suspend queue */
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 612ae80..abe5567 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -86,6 +86,7 @@ struct disk_stats {
unsigned long ticks[2];
unsigned long io_ticks;
unsigned long time_in_queue;
+   unsigned int  in_flight[2];
 };
 
 #define PARTITION_META_INFO_VOLNAMELTH 64
@@ -119,7 +120,6 @@ struct hd_struct {
int make_it_fail;
 #endif
unsigned long stamp;
-   atomic_t in_flight[2];
 #ifdef CONFIG_SMP
struct disk_stats __percpu *dkstats;
 #else
@@ -320,6 +320,9 @@ extern struct hd_struct *disk_map_sector_rcu(struct gendisk *disk,
res;\
 })
 
+#define part_stat_set(cpu, part, field, seted) \
+   (per_cpu_ptr((part)->dkstats, (cpu))->field = (seted))
+
 static inline void part_stat_set_all(struct hd_struct *part, int value)
 {
int i;
@@ -351,6 +354,9 @@ static inline void free_part_stats(struct hd_struct *part)
 
 #define part_stat_read(part, field)((part)->dkstats.field)
 
+#define part_stat_set(cpu, part, field, seted) \
+   ((part)->dkstats.field = (seted))
+
 static inline void part_stat_set_all(struct hd_struct *part, int value)
 {
memset(&part->dkstats, value, sizeof(struct disk_stats));
@@ -383,21 +389,27 @@ static inline void free_part_stats(struct hd_struct *part)
 
 static inline void part_inc_in_flight(int cpu, struct hd_struct *part, int rw)
 {
-   atomic_inc(&part->in_flight[rw]);
+   part_stat_inc(cpu, part, in_flight[rw]);
if (part->partno)
-   atomic_inc(&part_to_disk(part)->part0.in_flight[rw]);
+   part_stat_inc(cpu, &part_to_disk(part)->part0, in_flight[rw]);
 }
 
 static inline void part_dec_in_flight(int cpu, struct hd_struct *part, int rw)
 {
-   atomic_dec(&part->in_flight[rw]);
+   part_stat_dec(cpu, part, in_flight[rw]);
if (part->partno)
-   atomic_d

[PATCH 0/4] block: account io: kill atomic operations

2015-07-15 Thread Ming Lei
Hi,

This patchset kills two kinds of atomic operations in block
I/O accounting.

The first two patches convert the atomic refcount of a partition
into a percpu refcount.

The last two patches convert partition->in_flight[] into a percpu
variable.

With this change, ~15% throughput improvement can be observed
when running fio(randread) over null blk in a dual-socket
environment.

 block/bio.c   |  4 ++--
 block/blk-core.c  |  5 ++--
 block/blk-merge.c |  2 +-
 block/genhd.c |  9 ---
 block/partition-generic.c | 17 ++---
 drivers/md/dm.c   | 10 
 drivers/nvdimm/core.c |  4 ++--
 include/linux/genhd.h | 61 +--
 8 files changed, 72 insertions(+), 40 deletions(-)

Thanks,
Ming



[PATCH 1/4] block: partition: introduce hd_free_part()

2015-07-15 Thread Ming Lei
So the helper can be used in both generic partition
case and part0 case.

Signed-off-by: Ming Lei 
---
 block/genhd.c | 3 +--
 block/partition-generic.c | 3 +--
 include/linux/genhd.h | 6 ++
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index e552e1b..ed3f5b9 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -1110,8 +1110,7 @@ static void disk_release(struct device *dev)
disk_release_events(disk);
kfree(disk->random);
disk_replace_part_tbl(disk, NULL);
-   free_part_stats(&disk->part0);
-   free_part_info(&disk->part0);
+   hd_free_part(&disk->part0);
if (disk->queue)
blk_put_queue(disk->queue);
kfree(disk);
diff --git a/block/partition-generic.c b/block/partition-generic.c
index 0d9e5f9..eca0d02 100644
--- a/block/partition-generic.c
+++ b/block/partition-generic.c
@@ -212,8 +212,7 @@ static void part_release(struct device *dev)
 {
struct hd_struct *p = dev_to_part(dev);
blk_free_devt(dev->devt);
-   free_part_stats(p);
-   free_part_info(p);
+   hd_free_part(p);
kfree(p);
 }
 
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index ec274e0..a221220 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -663,6 +663,12 @@ static inline void hd_struct_put(struct hd_struct *part)
__delete_partition(part);
 }
 
+static inline void hd_free_part(struct hd_struct *part)
+{
+   free_part_stats(part);
+   free_part_info(part);
+}
+
 /*
  * Any access of part->nr_sects which is not protected by partition
  * bd_mutex or gendisk bdev bd_mutex, should be done using this
-- 
1.9.1



Re: [RFC PATCH perf/core v2 00/16] perf-probe --cache and SDT support

2015-07-15 Thread Hemant Kumar

Hi Masami,

On 07/15/2015 02:43 PM, Masami Hiramatsu wrote:

Hi,

Here is the 2nd version of the patchset for probe-cache and
initial SDT support, which will eventually become perf-cache.


Thanks for adding the SDT support.


perf-probe is useful for debugging, but it strongly depends
on the debuginfo. Without debuginfo, it is just a frontend for
ftrace's dynamic events. This commonly happens in server
farms or on cloud systems, since no one wants to distribute
big debuginfo packages.

To solve this issue, I had tried to make pre-analyzed probes
( https://lkml.org/lkml/2014/10/31/207 ) but that has a problem:
we can't ensure the probed binary is the same as what we analyzed.
Arnaldo gave me an idea to reuse the build-id cache for that purpose
and this series is the first prototype of that.

At the same time, Hemant has started to support SDT probes which
also use the cache file of SDT info. So I decided to merge this
into the same build-id cache.
In this version, SDT support is still very limited; it works
as a part of probe-cache.

In this version, perf probe supports a --cache option, which means
that perf probe manipulates probe caches. For example,

   # perf probe --cache --add "probe-desc"

not only adds probe events but also adds "probe-desc" and
its result to the cache. (Note that the cached entry is always
referred to, even without --cache.)
The --list and --del commands also support --cache. Note that
both only manipulate caches, not real events.

To use SDT, we have to scan the target binary at first by using
perf-buildid-cache, e.g.

   # perf buildid-cache --add /lib/libc-2.17.so

And perf probe --cache --list shows what SDTs are scanned.

   # perf probe --cache --list
   /usr/lib/libc-2.17.so (a6fb821bdf53660eb2c29f778757aef294d3d392):
   libc:setjmp=setjmp
   libc:longjmp=longjmp
   libc:longjmp_target=longjmp_target
   libc:memory_heap_new=memory_heap_new
   libc:memory_sbrk_less=memory_sbrk_less
   libc:memory_arena_reuse_free_list=memory_arena_reuse_free_list
   libc:memory_arena_reuse=memory_arena_reuse
   ...

To use the SDT events, perf probe -x BIN %SDTEVENT allows you to
add a probe on SDTEVENT@BIN.

   # perf probe -x /lib/libc-2.17.so %memory_heap_new

If you define a cached probe with an event name, you can also reuse
it in the same way as SDT events.

   # perf probe -x ./perf --cache -n 'myevent=dso__load $params'

(Note that "-n" option only updates caches)
To use the above "myevent", you just have to add "%myevent".

   # perf probe -x ./perf %myevent


TODOs:
  - Show available cached/SDT events by perf-list
  - Allow perf-record to use cached/SDT events directly


As I was already working on SDT events' recording
https://lkml.org/lkml/2014/11/2/73,
I can re-spin the patches on top of your patchset and make the
required changes to implement the above TODOs.
What would you suggest?


Thank you,

---

Hemant Kumar (1):
   perf/sdt: ELF support for SDT

Masami Hiramatsu (15):
   perf probe: Simplify __add_probe_trace_events code
   perf probe: Move ftrace probe-event operations to probe-file.c
   perf probe: Use strbuf for making strings in probe-event.c
   perf-buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid
   perf buildid: Use SBUILD_ID_SIZE macro
   perf buildid: Introduce sysfs/filename__sprintf_build_id
   perf: Add lsdir to read a directory
   perf-buildid-cache: Use lsdir for looking up buildid caches
   perf probe: Add --cache option to cache the probe definitions
   perf probe: Use cache entry if possible
   perf probe: Show all cached probes
   perf probe: Remove caches when --cache is given
   perf probe: Add group name support
   perf buildid-cache: Scan and import user SDT events to probe cache
   perf probe: Accept %sdt and %cached event name


  tools/perf/Documentation/perf-probe.txt |   14
  tools/perf/builtin-buildid-cache.c  |   22 -
  tools/perf/builtin-buildid-list.c   |   28 -
  tools/perf/builtin-probe.c  |3
  tools/perf/util/Build   |1
  tools/perf/util/build-id.c  |  230 ++--
  tools/perf/util/build-id.h  |   11
  tools/perf/util/dso.h   |5
  tools/perf/util/probe-event.c   |  918 ++-
  tools/perf/util/probe-event.h   |   16 -
  tools/perf/util/probe-file.c|  763 ++
  tools/perf/util/probe-file.h|   46 ++
  tools/perf/util/probe-finder.c  |   10
  tools/perf/util/symbol-elf.c|  252 +
  tools/perf/util/symbol.c|2
  tools/perf/util/symbol.h|   22 +
  tools/perf/util/util.c  |   34 +
  tools/perf/util/util.h  |4
  18 files changed, 1781 insertions(+), 600 deletions(-)
  create mode 100644 tools/perf/util/probe-file.c
  create mode 100644 tools/perf/util/probe-file.h




--
Thanks,
Hemant Kumar


linux-next: build failure after merge of the rcu tree

2015-07-15 Thread Stephen Rothwell
Hi Paul,

After merging the rcu tree, today's linux-next build (arm
multi_v7_defconfig) failed like this:

kernel/notifier.c: In function 'notify_die':
kernel/notifier.c:547:2: error: implicit declaration of function 'rcu_lockdep_assert' [-Werror=implicit-function-declaration]
  rcu_lockdep_assert(rcu_is_watching(),
  ^

Caused by commit

  02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")

interacting with commit

  e727c7d7a11e ("notifiers, RCU: Assert that RCU is watching in notify_die()")

[ and I also noted
  0333a209cbf6 ("x86/irq, context_tracking: Document how IRQ context tracking works and add an RCU assertion")
]

from the tip tree.

I added the following merge fix patch:

From: Stephen Rothwell 
Date: Thu, 16 Jul 2015 13:08:50 +1000
Subject: [PATCH] rcu: merge fix for Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()

Signed-off-by: Stephen Rothwell 
---
 arch/x86/kernel/irq.c | 2 +-
 kernel/notifier.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 30dbf35bc90b..f9cd81825187 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -234,7 +234,7 @@ __visible unsigned int __irq_entry do_IRQ(struct pt_regs *regs)
entering_irq();
 
/* entering_irq() tells RCU that we're not quiescent.  Check it. */
-   rcu_lockdep_assert(rcu_is_watching(), "IRQ failed to wake up RCU");
+   RCU_LOCKDEP_WARN(!rcu_is_watching(), "IRQ failed to wake up RCU");
 
irq = __this_cpu_read(vector_irq[vector]);
 
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 980e4330fb59..fd2c9acbcc19 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -544,7 +544,7 @@ int notrace notify_die(enum die_val val, const char *str,
.signr  = sig,
 
};
-   rcu_lockdep_assert(rcu_is_watching(),
+   RCU_LOCKDEP_WARN(!rcu_is_watching(),
   "notify_die called but RCU thinks we're quiescent");
return atomic_notifier_call_chain(&die_chain, val, &args);
 }
-- 
2.1.4
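
Both hunks above negate the condition as well as rename the macro, because the two forms have opposite polarity. The small userspace model below shows that an "assert" on a condition and a "warn" on its negation fire in exactly the same cases. The MODEL_* macros and model_check_both() are hypothetical stand-ins for this sketch, not the kernel definitions, and they only count instead of printing a lockdep splat.

```c
#include <assert.h>
#include <stdbool.h>

int model_warnings;	/* counts how many times a check fired */

/* Old style: complain when the condition that should hold is false. */
#define MODEL_LOCKDEP_ASSERT(cond, msg) \
	do { if (!(cond)) model_warnings++; } while (0)

/* New style: complain when the "something is wrong" condition is true. */
#define MODEL_LOCKDEP_WARN(cond, msg) \
	do { if (cond) model_warnings++; } while (0)

/* Run both forms on the same state and return how many of them fired. */
int model_check_both(bool rcu_watching)
{
	model_warnings = 0;
	MODEL_LOCKDEP_ASSERT(rcu_watching, "old form");
	MODEL_LOCKDEP_WARN(!rcu_watching, "new form");
	return model_warnings;
}
```

Renaming without adding the "!" would silently invert every check, which is why a call site added in another tree needs a merge fix like the one above rather than a plain rename.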

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au


linux-next: manual merge of the rcu tree with the tip tree

2015-07-15 Thread Stephen Rothwell
Hi Paul,

Today's linux-next merge of the rcu tree got a conflict in:

  arch/x86/kernel/traps.c

between commit:

  8c84014f3bbb ("x86/entry: Remove exception_enter() from most trap handlers")

from the tip tree and commit:

  02300fdb3e5f ("rcu: Rename rcu_lockdep_assert() to RCU_LOCKDEP_WARN()")

from the rcu tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc arch/x86/kernel/traps.c
index 8e65d8a9b8db,c5a5231d1d11..
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@@ -131,14 -136,19 +131,14 @@@ void ist_enter(struct pt_regs *regs
preempt_count_add(HARDIRQ_OFFSET);
  
/* This code is a bit fragile.  Test it. */
-   rcu_lockdep_assert(rcu_is_watching(), "ist_enter didn't work");
+   RCU_LOCKDEP_WARN(!rcu_is_watching(), "ist_enter didn't work");
 -
 -  return prev_state;
  }
  
 -void ist_exit(struct pt_regs *regs, enum ctx_state prev_state)
 +void ist_exit(struct pt_regs *regs)
  {
 -  /* Must be before exception_exit. */
preempt_count_sub(HARDIRQ_OFFSET);
  
 -  if (user_mode(regs))
 -  return exception_exit(prev_state);
 -  else
 +  if (!user_mode(regs))
rcu_nmi_exit();
  }
  


Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Casey Schaufler
On 7/15/2015 6:08 PM, Andy Lutomirski wrote:
> On Wed, Jul 15, 2015 at 3:39 PM, Casey Schaufler wrote:
>> On 7/15/2015 2:06 PM, Eric W. Biederman wrote:
>>> Casey Schaufler  writes:
>>> The first step needs to be not trusting those labels and treating such
>>> filesystems as filesystems without label support.  I hope that is what
>>> Seth has implemented.
>> A filesystem with Smack labels gets mounted in a namespace. The labels
>> are ignored. Instead, the filesystem defaults (potentially specified as
>> mount options smackfsdef="something", but usually the floor label ("_"))
>> are used, giving the user the ability to read everything and (usually)
>> change nothing. This is both dangerous (unintended read access to files)
>> and pointless (can't make changes).
> I don't get it.
>
> If I mount an unprivileged filesystem, then either the contents were
> put there *by me*, in which case letting me access them is fine, or
> (with Seth's patches and then some) I control the backing store, in
> which case I can do whatever I want regardless of what LSM thinks.
>
> So I don't see the problem.  Why would Smack or any other LSM care at
> all, unless it wants to prevent me from mounting the fs in the first
> place?

First off, I don't cotton to the notion that you should be able
to mount filesystems without privilege. But it seems I'm being
outvoted on that. I suspect that there are cases where it might
be safe, but I can't think of one off the top of my head.

If you do mount a filesystem it needs to behave according to the
rules of the system. If you have a security module that uses
attributes on the filesystem you can't ignore them just because
it's "your data". Mandatory access control schemes, including
Smack and SELinux don't give a fig about who you are. It's the
label on the data and the process that matter. If "you" get to
muck the labels up, you've broken the mandatory access control.

> --Andy



Re: [PATCH 1/7] fs: Add user namespace member to struct super_block

2015-07-15 Thread Eric W. Biederman
Seth Forshee  writes:

> Initially this will be used to eliminate the implicit MNT_NODEV
> flag for mounts from user namespaces. In the future it will also
> be used for translating ids and checking capabilities for
> filesystems mounted from user namespaces.
>
> s_user_ns is initialized in alloc_super() and is generally set to
> current_user_ns(). To avoid security and corruption issues, two
> additional mount checks are also added:
>
>  - do_new_mount() gains a check that the user has CAP_SYS_ADMIN
>in current_user_ns().
>
>  - sget() will fail with EBUSY when the filesystem it's looking
>for is already mounted from another user namespace.
>
> proc needs some special handling here. The user namespace of
> current isn't appropriate when forking as a result of clone (2)
> with CLONE_NEWPID|CLONE_NEWUSER, as it will make proc unmountable
> from within the new user namespace. Instead, the user namespace
> which owns the new pid namespace should be used. sget_userns() is
> added to allow passing of a user namespace other than that of
> current, and this is used by proc_mount(). sget() becomes a
> wrapper around sget_userns() which passes current_user_ns().

From bits of the previous conversation.

We need sget_userns(..., &init_user_ns) for sysfs.  The sysfs
xattrs can travel from one mount of sysfs to another via the sysfs
backing store.

For tmpfs and any other filesystems that we support mounting without
privilege and that support xattrs, we need to identify them and
see if userspace is taking advantage of the ability to set
xattrs and file caps (unlikely).  If it is, we need to call
sget_userns(..., &init_user_ns) on those filesystems as well.

Possibly/Probably we should just do that for all of the interesting
filesystems to start with and then change back to an ordinary old sget
after we have done the testing and confirmed we will not be introducing
userspace regressions.

Eric


Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework

2015-07-15 Thread Linus Torvalds
On Wed, Jul 15, 2015 at 5:34 PM, Dave Hansen wrote:
>
> I understand why you were misled by it, but the old "xsave_hdr_struct"
> was wrong.  Fenghua even posted patches to remove it before the FPU
> rework (you were cc'd):
>
> https://lkml.org/lkml/2015/4/18/164

Oh, and that patch looks like a good idea.

I wish there was some way to make sure sizeof() fails on it so that
we'd enforce that nobody allocates that thing as-is. I had this dim
memory that an unsized array at the end would do that, but I was
clearly wrong. It's just the array itself you can't do sizeof on, not
the structure that contains it. Is there some magic trick that I'm
forgetting?
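
For reference, a small standalone C check of the behaviour described here. It is only an illustration: struct hdr_model and the two helper functions are hypothetical and have nothing to do with the real xsave structures. sizeof() on the containing structure compiles fine and simply leaves the flexible array's elements out; only sizeof on the array member itself is rejected as an incomplete type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header with a C99 flexible array member at the end. */
struct hdr_model {
	unsigned long long bv;
	unsigned char data[];
};

/* Compiles without complaint: the array contributes no elements to the size. */
size_t hdr_model_struct_size(void)
{
	return sizeof(struct hdr_model);
}

/* Where the variable-length payload starts inside an allocation. */
size_t hdr_model_payload_offset(void)
{
	return offsetof(struct hdr_model, data);
}

/*
 * By contrast, sizeof(((struct hdr_model *)0)->data) does not compile,
 * because the array itself has incomplete type.
 */
```

So a flexible array member does not stop anyone from allocating the bare structure; it only means the allocation will be too small for any payload.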

  Linus


Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp

2015-07-15 Thread Naoya Horiguchi
On Thu, Jul 16, 2015 at 04:33:07AM +0200, Andi Kleen wrote:
> > @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
> >  * directly for tail pages.
> >  */
> > if (PageTransHuge(head)) {
> > +   /*
> > +* Non anonymous thp exists only in allocation/free time. We
> > +* can't handle such a case correctly, so let's give it up.
> > +* This should be better than triggering BUG_ON when kernel
> > +* tries to touch a "partially handled" page.
> > +*/
> > +   if (!PageAnon(head))
> > +   return 0;
> 
> Please print a message for this case. In the future there will be
> likely more non anonymous THP pages from Kirill's large page cache work
> (so eventually we'll need it)

OK, I'll do this.

Thanks,
Naoya Horiguchi


[GIT PULL] ARM: EXYNOS: mach: Improvements for 4.3

2015-07-15 Thread Krzysztof Kozlowski
Dear Kukjin,

Exynos mach-code related improvements. Description along with a tag.
You can find them also on the lists with my reviewed-by.

Best regards,
Krzysztof


The following changes since commit 1c4c7159ed2468f3ac4ce5a7f08d79663d381a93:

  Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 (2015-07-05 16:24:54 -0700)

are available in the git repository at:


  https://github.com/krzk/linux.git tags/samsung-mach-4.3

for you to fetch changes up to 70f83b6716ea0e5944071c12ff1716f93a9c2d8d:

  cpufreq: exynos: remove Exynos5250 specific cpufreq driver support (2015-07-16 10:39:56 +0900)


Improvements for Exynos based boards:
1. Switch to generic cpufreq-dt driver for Exynos5250. The old driver
   is removed.
2. Fix memory leak in cpufreq error path.
3. Cleanups: remove duplicated define with bootloader's sleep magic
   constant, staticize local function, drop 'owner' from
   platform driver, fix cast of iomem to ERR_PTR.


Bartlomiej Zolnierkiewicz (1):
  cpufreq: exynos: remove Exynos5250 specific cpufreq driver support

Krzysztof Kozlowski (4):
  ARM: EXYNOS: pmu: Make local function static
  ARM: EXYNOS: Remove duplicated define of SLEEP_MAGIC
  ARM: EXYNOS: pmu: Drop owner assignment
  ARM: EXYNOS: Use IOMEM_ERR_PTR when function returns iomem

Shailendra Verma (1):
  cpufreq: exynos: Fix for memory leak in case SOC name does not match

Thomas Abraham (3):
  clk: samsung: exynos5250: add cpu clock configuration data and instantiate cpu clock
  ARM: dts: Exynos5250: add CPU OPP and regulator supply property
  ARM: Exynos: switch to using generic cpufreq driver for Exynos5250

 arch/arm/boot/dts/exynos5250-arndale.dts  |   4 +
 arch/arm/boot/dts/exynos5250-smdk5250.dts |   4 +
 arch/arm/boot/dts/exynos5250-snow.dts |   4 +
 arch/arm/boot/dts/exynos5250-spring.dts   |   4 +
 arch/arm/boot/dts/exynos5250.dtsi |  22 
 arch/arm/mach-exynos/common.h |   6 +
 arch/arm/mach-exynos/exynos.c |   1 +
 arch/arm/mach-exynos/firmware.c   |   2 -
 arch/arm/mach-exynos/platsmp.c|   2 +-
 arch/arm/mach-exynos/pmu.c|   3 +-
 arch/arm/mach-exynos/suspend.c|   4 +-
 drivers/clk/samsung/clk-exynos5250.c  |  31 +
 drivers/cpufreq/Kconfig.arm   |  11 --
 drivers/cpufreq/Makefile  |   1 -
 drivers/cpufreq/exynos-cpufreq.c  |   9 +-
 drivers/cpufreq/exynos-cpufreq.h  |  17 ---
 drivers/cpufreq/exynos5250-cpufreq.c  | 210 --
 include/dt-bindings/clock/exynos5250.h|   1 +
 18 files changed, 84 insertions(+), 252 deletions(-)
 delete mode 100644 drivers/cpufreq/exynos5250-cpufreq.c
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT PULL] ARM: EXYNOS: dts: Improvements for 4.3

2015-07-15 Thread Krzysztof Kozlowski
Dear Kukjin,

DTS related improvements. Description along with a tag.
You can also find them on the lists with my Reviewed-by.

Best regards,
Krzysztof


The following changes since commit a419d78a6f97f8c977fe55d5d590cd0654ecd1ee:

  ARM: dts: Exynos4210: add CPU OPP and regulator supply property (2015-07-13 21:16:05 +0900)

are available in the git repository at:

  https://github.com/krzk/linux.git tags/samsung-dt-4.3

for you to fetch changes up to cd0b551be420d49c2bde8dcf5ea147278dc89ffb:

  ARM: dts: Extend exynos5420-pinctrl nodes using labels instead of paths (2015-07-16 11:22:11 +0900)


Device Tree improvements for Exynos based boards:
1. Enable proper USB 3.0 regulators on Odroid XU3 board.
2. Set over-heat and over-voltage thresholds for Trats2 board fuel
   gauge.
3. Fix missing display frequency on Exynos3250 Rinato board
   (necessary to fix the display).
4. Enable thermal management and fan control on Odroid XU3 board.
   The fan speed is adjusted to the current temperature of the SoC.
5. Cleanups and usage of label-notation for overriding nodes.


Anand Moon (5):
  ARM: dts: odroidxu3: Enable USB3 regulators
  ARM: dts: exynos5422-odroidxu3: Add pwm-fan node
  ARM: dts: exynos5422-odroidxu3: Enable TMU at Exynos5422 base
  ARM: dts: exynos5422-odroidxu3: Define default thermal-zones
  ARM: dts: exynos5422-odroidxu3: Enable thermal-zones

Andreas Färber (1):
  ARM: dts: Clean up exynos5410-smdk5410 indentation

Hyungwon Hwang (1):
  ARM: dts: fix the clock-frequency of exynos3250-rinato board's panel

Javier Martinez Canillas (4):
  ARM: dts: Include exynos5250-pinctrl after the nodes were defined
  ARM: dts: Extend exynos5250-pinctrl nodes using labels instead of paths
  ARM: dts: Include exynos5420-pinctrl after the nodes were defined
  ARM: dts: Extend exynos5420-pinctrl nodes using labels instead of paths

Krzysztof Kozlowski (2):
  ARM: dts: Set max17047 over heat and over voltage thresholds
  ARM: dts: Use labels for overriding nodes in exynos4210-universal

 arch/arm/boot/dts/exynos3250-rinato.dts|2 +-
 arch/arm/boot/dts/exynos4210-universal_c210.dts|  620 
 arch/arm/boot/dts/exynos4412-trats2.dts|3 +
 arch/arm/boot/dts/exynos5250-pinctrl.dtsi  | 1600 ++--
 arch/arm/boot/dts/exynos5250.dtsi  |3 +-
 arch/arm/boot/dts/exynos5410-smdk5410.dts  |6 +-
 arch/arm/boot/dts/exynos5420-pinctrl.dtsi  | 1411 +
 arch/arm/boot/dts/exynos5420.dtsi  |3 +-
 arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi  |   59 +
 arch/arm/boot/dts/exynos5422-odroidxu3-common.dtsi |   46 +
 10 files changed, 1930 insertions(+), 1823 deletions(-)
 create mode 100644 arch/arm/boot/dts/exynos5422-cpu-thermal.dtsi


Re: [PATCH v2 4/6] ARM: OMAP: PRM: Remove hardcoding of IRQENABLE_MPU_2 and IRQSTATUS_MPU_2 register offsets

2015-07-15 Thread Paul Walmsley
On Wed, 8 Jul 2015, Keerthy wrote:

> The register offsets of IRQENABLE_MPU_2 and IRQSTATUS_MPU_2 are hardcoded.
> This makes it difficult to reuse the code for SoCs like AM437x that have
> a single instance of IRQENABLE_MPU and IRQSTATUS_MPU registers.
> Hence handle this case generically, using an offset of 4 to accommodate a
> single set of IRQ* registers.
> 
> Signed-off-by: Keerthy 

Thanks, queued for v4.3.


- Paul


Re: [REGRESSION] 4.2-rc2: early boot memory corruption from FPU rework

2015-07-15 Thread Linus Torvalds
On Wed, Jul 15, 2015 at 5:34 PM, Dave Hansen
 wrote:
>
> The old code sized the buffer in a fully architectural way and it
> worked.  The CPU *tells* you how much memory the 'xsave' instruction is
> going to scribble on.  The new code just merrily calls it and lets it
> scribble away.  This is as clear-cut a regression as I've ever seen.

Yes, I think we'll need to revert it, or do something else drastic
like make that initial fp state allocation *much* bigger and then have
a "disable xsaves if it's still not big enough".

setup_xstate_features() should be able to easily just say "this was
the maximum offset+size we saw", and we can take that to either do a
proper allocation, or verify that the static allocation is indeed big
enough.
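
The "maximum offset+size" idea above can be sketched in userspace C. This is
only an illustration under stated assumptions: struct xfeature and both
function names are hypothetical and are not the kernel's
setup_xstate_features() code, and the offsets and sizes would in practice come
from the CPU rather than from a hand-written table.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one state component: where it starts in the
 * save area and how many bytes it occupies. */
struct xfeature {
	size_t offset;
	size_t size;
};

/* Return the largest offset + size seen over all components, i.e. the
 * number of bytes the save area must provide. */
size_t xstate_required_size(const struct xfeature *f, size_t n)
{
	size_t max_end = 0;

	for (size_t i = 0; i < n; i++) {
		size_t end = f[i].offset + f[i].size;

		if (end > max_end)
			max_end = end;
	}
	return max_end;
}

/* True when a statically sized buffer can hold every component. */
bool xstate_buffer_fits(const struct xfeature *f, size_t n, size_t buf_size)
{
	return xstate_required_size(f, n) <= buf_size;
}
```

The same value can drive either choice mentioned above: use it as the size of
a proper allocation, or compare it against the static buffer and fall back
when the check fails.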

Apparently a straight revert doesn't work, if only because things in
that area have been renamed very aggressively (both files and
functions and variables). Ingo?

Linus


Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets

2015-07-15 Thread Paul Walmsley
On Thu, 16 Jul 2015, Paul Walmsley wrote:

> On Wed, 8 Jul 2015, Keerthy wrote:
> 
> > Add the PRM IRQ register offsets.
> > 
> > Signed-off-by: Keerthy 
> 
> Please add more detail to your commit messages so they conform to 
> Documentation/SubmittingPatches:
> 
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109
> 
> For example, this commit message should read something like:
> 
> ---
> 
> ARM: AM43xx: Add the PRM IRQ register offsets
> 
> Add the PRM IRQ register offsets.  This is needed to support PRM I/O 
> wakeup on AM43xx.
> 
> --
> 
> Basically, your patches need to provide context as to _why_ the change is 
> needed. 
> 
> I've fixed the message for this patch, and queued it for v4.3, but 
> please take care with this issue in the future.

Also I've moved the AM43XX_PRM_IO_PMCTRL_OFFSET macro out of the AM43XX CM 
section, since it doesn't belong there.


- Paul


Re: [PATCH v2 2/6] ARM: AM43xx: Add the PRM IRQ register offsets

2015-07-15 Thread Paul Walmsley
On Wed, 8 Jul 2015, Keerthy wrote:

> Add the PRM IRQ register offsets.
> 
> Signed-off-by: Keerthy 

Please add more detail to your commit messages so they conform to 
Documentation/SubmittingPatches:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/SubmittingPatches#n109

For example, this commit message should read something like:

---

ARM: AM43xx: Add the PRM IRQ register offsets

Add the PRM IRQ register offsets.  This is needed to support PRM I/O 
wakeup on AM43xx.

--

Basically, your patches need to provide context as to _why_ the change is 
needed. 

I've fixed the message for this patch, and queued it for v4.3, but 
please take care with this issue in the future.


- Paul


Re: [PATCH v1 3/4] mm/memory-failure: give up error handling for non-tail-refcounted thp

2015-07-15 Thread Andi Kleen
> @@ -909,6 +909,15 @@ int get_hwpoison_page(struct page *page)
>* directly for tail pages.
>*/
>   if (PageTransHuge(head)) {
> + /*
> +  * Non anonymous thp exists only in allocation/free time. We
> +  * can't handle such a case correctly, so let's give it up.
> +  * This should be better than triggering BUG_ON when kernel
> +  * tries to touch a "partially handled" page.
> +  */
> + if (!PageAnon(head))
> + return 0;

Please print a message for this case. In the future there will likely
be more non-anonymous THP pages from Kirill's large page cache work
(so eventually we'll need it).

-Andi



[PATCH 3.19.y-ckt 002/251] sctp: fix ASCONF list handling

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let
me know.

--

From: Marcelo Ricardo Leitner 

[ Upstream commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 ]

->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.

Also, the call to inet_sk_copy_descendant() was backing up
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.

This commit thus fixes the list handling by using the ->addr_wq_lock
spinlock to protect the list. Special handling is done upon socket
creation and destruction for that. Error handling in sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addr_wq_lock. The lock will
now be taken in sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().

Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewriting it by
implementing sctp_copy_descendant().
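
One way to picture "avoid rewriting the list fields on copy" is a copy that
stops just before the trailing fields. The sketch below is a userspace
illustration only: struct demo_sock and demo_copy_descendant() are
hypothetical names, and the real sctp_copy_descendant() works on the kernel
socket structures rather than on this simplified layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a socket structure whose trailing fields
 * must not be carried over by a copy. */
struct demo_sock {
	int state;
	int pd_mode;
	/* These must stay last: the copy helper skips them. */
	void *list_next;
	void *list_prev;
	int do_auto;
};

/* Copy everything that precedes list_next. The destination keeps its own
 * list linkage and its own do_auto flag. */
void demo_copy_descendant(struct demo_sock *dst, const struct demo_sock *src)
{
	memcpy(dst, src, offsetof(struct demo_sock, list_next));
}
```

Because the skipped fields are never written, the destination cannot end up
with list pointers that disagree with its own flag, which is the mismatch the
commit message describes.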

Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.

Fixes: 9f7d653b67ae ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen 
Suggested-by: Neil Horman 
Suggested-by: Hannes Frederic Sowa 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: Marcelo Ricardo Leitner 
Signed-off-by: David S. Miller 
Cc: Moritz Mühlenhoff 
Reference: CVE-2015-3212
Signed-off-by: Kamal Mostafa 
---
 include/net/netns/sctp.h   |  1 +
 include/net/sctp/structs.h |  4 
 net/sctp/socket.c  | 43 ---
 3 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/include/net/netns/sctp.h b/include/net/netns/sctp.h
index 3573a81..8ba379f 100644
--- a/include/net/netns/sctp.h
+++ b/include/net/netns/sctp.h
@@ -31,6 +31,7 @@ struct netns_sctp {
struct list_head addr_waitq;
struct timer_list addr_wq_timer;
struct list_head auto_asconf_splist;
+   /* Lock that protects both addr_waitq and auto_asconf_splist */
spinlock_t addr_wq_lock;
 
/* Lock that protects the local_addr_list writers */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 2bb2fcf..495c87e 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -223,6 +223,10 @@ struct sctp_sock {
atomic_t pd_mode;
/* Receive to here while partial delivery is in effect. */
struct sk_buff_head pd_lobby;
+
+   /* These must be the last fields, as they will skipped on copies,
+* like on accept and peeloff operations
+*/
struct list_head auto_asconf_list;
int do_auto_asconf;
 };
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index aafe94b..4e56571 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1533,8 +1533,10 @@ static void sctp_close(struct sock *sk, long timeout)
 
/* Supposedly, no process has access to the socket, but
 * the net layers still may.
+* Also, sctp_destroy_sock() needs to be called with addr_wq_lock
+* held and that should be grabbed before socket lock.
 */
-   local_bh_disable();
+   spin_lock_bh(&net->sctp.addr_wq_lock);
bh_lock_sock(sk);
 
/* Hold the sock, since sk_common_release() will put sock_put()
@@ -1544,7 +1546,7 @@ static void sctp_close(struct sock *sk, long timeout)
sk_common_release(sk);
 
bh_unlock_sock(sk);
-   local_bh_enable();
+   spin_unlock_bh(&net->sctp.addr_wq_lock);
 
sock_put(sk);
 
@@ -3587,6 +3589,7 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval,
if ((val && sp->do_auto_asconf) || (!val && !sp->do_auto_asconf))
return 0;
 
+   spin_lock_bh(&sock_net(sk)->sctp.addr_wq_lock);
if (val == 0 && sp->do_auto_asconf) {
list_del(&sp->auto_asconf_list);
sp->do_auto_asconf = 0;
@@ -3595,6 +3598,7 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval,
&sock_net(sk)->sctp.auto_asconf_splist);
sp->do_auto_asconf = 1;
}
+   spin_unlock_bh(&sock_net(sk)->sctp.addr_wq_lock);
return 0;
 }
 
@@ -4128,18 +4132,28 @@ static int sctp_init_sock(struct sock *sk)
local_bh_disable();
percpu_counter_inc(&sctp_sockets_allocated);
sock_prot_inuse_add(net, sk->sk_prot, 1);
+
+   /* Nothing can fail after this block, otherwise
+* sctp_destroy_sock() will be called without addr_wq_lock held
+*/
if (net->sctp.default_auto_asconf) {
+  

[PATCH 3.19.y-ckt 009/251] net/mlx4_en: Wake TX queues only when there's enough room

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let
me know.

--

From: Ido Shamay 

[ Upstream commit 488a9b48e398b157703766e2cd91ea45ac6997c5 ]

Indication of a single completed packet, marked by txbbs_skipped
being bigger than zero, is not enough to wake up a
stopped TX queue. The completed packet may contain a single TXBB,
while next packet to be sent (after the wake up) may have multiple
TXBBs (LSO/TSO packets for example), causing overflow in queue followed
by WQE corruption and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.

Also created a helper routine - mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.
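
The wake-up rule described above can be sketched in userspace C. This is a
minimal model, assuming free-running 32-bit producer and consumer counters;
struct demo_ring, demo_ring_full() and demo_should_wake() are hypothetical
names, and full_size stands for the ring size minus the reserved headroom and
the worst-case packet, as in the patch below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring: prod and cons are free-running 32-bit counters. */
struct demo_ring {
	uint32_t prod;
	uint32_t cons;
	uint32_t full_size; /* size minus headroom and worst-case packet */
};

/* The ring counts as full when fewer than a worst-case packet's worth of
 * slots remain. Unsigned subtraction stays correct after the counters wrap. */
bool demo_ring_full(const struct demo_ring *r)
{
	return (uint32_t)(r->prod - r->cons) > r->full_size;
}

/* Wake a stopped queue only when the worst-case packet would fit, not
 * merely because one packet completed. */
bool demo_should_wake(const struct demo_ring *r, bool stopped)
{
	return stopped && !demo_ring_full(r);
}
```

Using one predicate for both the stop decision and the wake decision keeps the
two paths from drifting apart.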

Signed-off-by: Ido Shamay 
Signed-off-by: Or Gerlitz 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 19 +++
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h |  1 +
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 06c0de6..b54e621 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -66,6 +66,7 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
ring->size = size;
ring->size_mask = size - 1;
ring->stride = stride;
+   ring->full_size = ring->size - HEADROOM - MAX_DESC_TXBBS;
 
tmp = size * sizeof(struct mlx4_en_tx_info);
ring->tx_info = kmalloc_node(tmp, GFP_KERNEL | __GFP_NOWARN, node);
@@ -232,6 +233,11 @@ void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv,
   MLX4_QP_STATE_RST, NULL, 0, 0, &ring->qp);
 }
 
+static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring)
+{
+   return ring->prod - ring->cons > ring->full_size;
+}
+
 static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
  struct mlx4_en_tx_ring *ring, int index,
  u8 owner)
@@ -474,11 +480,10 @@ static bool mlx4_en_process_tx_cq(struct net_device *dev,
 
netdev_tx_completed_queue(ring->tx_queue, packets, bytes);
 
-   /*
-* Wakeup Tx queue if this stopped, and at least 1 packet
-* was completed
+   /* Wakeup Tx queue if this stopped, and ring is not full.
 */
-   if (netif_tx_queue_stopped(ring->tx_queue) && txbbs_skipped > 0) {
+   if (netif_tx_queue_stopped(ring->tx_queue) &&
+   !mlx4_en_is_tx_ring_full(ring)) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
@@ -922,8 +927,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
skb_tx_timestamp(skb);
 
/* Check available TXBBs And 2K spare for prefetch */
-   stop_queue = (int)(ring->prod - ring_cons) >
- ring->size - HEADROOM - MAX_DESC_TXBBS;
+   stop_queue = mlx4_en_is_tx_ring_full(ring);
if (unlikely(stop_queue)) {
netif_tx_stop_queue(ring->tx_queue);
ring->queue_stopped++;
@@ -992,8 +996,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
smp_rmb();
 
ring_cons = ACCESS_ONCE(ring->cons);
-   if (unlikely(((int)(ring->prod - ring_cons)) <=
-ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+   if (unlikely(!mlx4_en_is_tx_ring_full(ring))) {
netif_tx_wake_queue(ring->tx_queue);
ring->wake_queue++;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 0e80118..18f8578 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -280,6 +280,7 @@ struct mlx4_en_tx_ring {
u32 size; /* number of TXBBs */
u32 size_mask;
u16 stride;
+   u32 full_size;
u16 cqn;/* index of port CQ associated with 
this ring */
u32 buf_size;
__be32  doorbell_qpn;
-- 
1.9.1



[PATCH 3.19.y-ckt 008/251] net/mlx4_en: Release TX QP when destroying TX ring

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let
me know.

--

From: Eran Ben Elisha 

[ Upstream commit 0eb08514fdbdcd16fd6870680cd638f203662e9d ]

TX ring QP wasn't released at mlx4_en_destroy_tx_ring. Instead, the code
used the deprecated base_tx_qpn field. Move TX QP release to
mlx4_en_destroy_tx_ring and remove the base_tx_qpn field.

Fixes: ddae0349fdb7 ('net/mlx4: Change QP allocation scheme')
Signed-off-by: Eran Ben Elisha 
Signed-off-by: Or Gerlitz 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 4 
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 1 +
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   | 1 -
 3 files changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index c998c4d..99b99eb 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1973,10 +1973,6 @@ void mlx4_en_free_resources(struct mlx4_en_priv *priv)
mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
}
 
-   if (priv->base_tx_qpn) {
-   mlx4_qp_release_range(priv->mdev->dev, priv->base_tx_qpn, priv->tx_ring_num);
-   priv->base_tx_qpn = 0;
-   }
 }
 
 int mlx4_en_alloc_resources(struct mlx4_en_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 18db895..06c0de6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -180,6 +180,7 @@ void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
mlx4_bf_free(mdev->dev, &ring->bf);
mlx4_qp_remove(mdev->dev, &ring->qp);
mlx4_qp_free(mdev->dev, &ring->qp);
+   mlx4_qp_release_range(priv->mdev->dev, ring->qpn, 1);
mlx4_en_unmap_buffer(&ring->wqres.buf);
mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size);
kfree(ring->bounce_buf);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 6cc49c1..0e80118 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -599,7 +599,6 @@ struct mlx4_en_priv {
int vids[128];
bool wol;
struct device *ddev;
-   int base_tx_qpn;
struct hlist_head mac_hash[MLX4_EN_MAC_HASH_SIZE];
struct hwtstamp_config hwtstamp_config;
 
-- 
1.9.1



[PATCH 3.19.y-ckt 010/251] net/mlx4_en: Fix wrong csum complete report when rxvlan offload is disabled

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let
me know.

--

From: Ido Shamay 

[ Upstream commit 79a258526ce1051cb9684018c25a89d51ac21be8 ]

The check_csum() function relied on hwtstamp_rx_filter to know if rxvlan
offload is disabled. This is wrong since rxvlan offload can be switched
on/off regardless of hwtstamp_rx_filter.

Also moved check_csum to query CQE information to identify VLAN packets
and removed the check of IP packets, since it has been validated before.

Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
Signed-off-by: Ido Shamay 
Signed-off-by: Or Gerlitz 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 10d3533..7f16627 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -719,7 +719,7 @@ static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
 }
 #endif
 static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
- int hwtstamp_rx_filter)
+ netdev_features_t dev_features)
 {
__wsum hw_checksum = 0;
 
@@ -727,14 +727,8 @@ static int check_csum(struct mlx4_cqe *cqe, struct sk_buff *skb, void *va,
 
hw_checksum = csum_unfold((__force __sum16)cqe->checksum);
 
-   if (((struct ethhdr *)va)->h_proto == htons(ETH_P_8021Q) &&
-   hwtstamp_rx_filter != HWTSTAMP_FILTER_NONE) {
-   /* next protocol non IPv4 or IPv6 */
-   if (((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-   != htons(ETH_P_IP) &&
-   ((struct vlan_hdr *)hdr)->h_vlan_encapsulated_proto
-   != htons(ETH_P_IPV6))
-   return -1;
+   if (cqe->vlan_my_qpn & cpu_to_be32(MLX4_CQE_VLAN_PRESENT_MASK) &&
+   !(dev_features & NETIF_F_HW_VLAN_CTAG_RX)) {
hw_checksum = get_fixed_vlan_csum(hw_checksum, hdr);
hdr += sizeof(struct vlan_hdr);
}
@@ -897,7 +891,8 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 
if (ip_summed == CHECKSUM_COMPLETE) {
void *va = skb_frag_address(skb_shinfo(gro_skb)->frags);
-   if (check_csum(cqe, gro_skb, va, ring->hwtstamp_rx_filter)) {
+   if (check_csum(cqe, gro_skb, va,
+  dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_none++;
ring->csum_complete--;
@@ -952,7 +947,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
}
 
if (ip_summed == CHECKSUM_COMPLETE) {
-   if (check_csum(cqe, skb, skb->data, ring->hwtstamp_rx_filter)) {
+   if (check_csum(cqe, skb, skb->data, dev->features)) {
ip_summed = CHECKSUM_NONE;
ring->csum_complete--;
ring->csum_none++;
-- 
1.9.1



Re: [PATCH 0/7] Initial support for user namespace owned mounts

2015-07-15 Thread Eric W. Biederman
Andy Lutomirski  writes:

> On Jul 15, 2015 3:34 PM, "Eric W. Biederman"  wrote:
>>
>> Seth Forshee  writes:
>>
>> > On Wed, Jul 15, 2015 at 04:06:35PM -0500, Eric W. Biederman wrote:
>> >> Casey Schaufler  writes:
>> >>
>> >> > On 7/15/2015 12:46 PM, Seth Forshee wrote:
>> >> >> These are the first in a larger set of patches that I've been
>> >> >> working on (with help from Eric Biederman) to support mounting
>> >> >> ext4 and fuse filesystems from within user namespaces. I've
>> >> >> pushed the full series to:
>> >> >>
>> >> >>   git://kernel.ubuntu.com/sforshee/linux.git userns-mounts
>> >> >>
>> >> >> Taking the series as a whole, the strategy is to handle as much of the
>> >> >> heavy lifting as possible in the vfs so the filesystems don't have to
>> >> >> handle weird edge cases. If you look at the full series you'll find
>> >> >> that the changes in ext4 to support user namespace mounts turn out to
>> >> >> be fairly minimal (fuse is a bit more complicated though as it must
>> >> >> deal with translating ids for a userspace process which is running in
>> >> >> pid and user namespaces).
>> >> >>
>> >> >> The patches I'm sending today lay some of the groundwork in the vfs and
>> >> >> related code. They fall into two broad groups:
>> >> >>
>> >> >>  1. Patches 1-2 add s_user_ns and simplify MNT_NODEV handling. These
>> >> >>     are pretty straightforward, and Eric has expressed interest in
>> >> >>     merging these patches soon. Note that patch 2 won't apply cleanly
>> >> >>     without Eric's noexec patches for proc and sys [1].
>> >> >>
>> >> >>  2. Patches 2-7 tighten down security for mounts with s_user_ns !=
>> >> >>     &init_user_ns. This includes updates to how file caps and suid are
>> >> >>     handled and LSM updates to ignore security labels on superblocks
>> >> >>     from non-init namespaces.
>> >> >>
>> >> >>     The LSM changes in particular may not be optimal, as I don't have a
>> >> >>     lot of familiarity with this code, so I'd be especially
>> >> >>     appreciative of review of these changes and suggestions on how to
>> >> >>     improve them.
>> >> >
>> >> > Lukasz Pawelczyk  proposed
>> >> > LSM support in user namespaces ([RFC] lsm: namespace hooks)
>> >> > that make a whole lot more sense than just turning off
>> >> > the option of using labels on files. Gutting the ability
>> >> > to use MAC in a namespace is a step down the road of
>> >> > making MAC and namespaces incompatible.
>> >>
>> >> This is not "turning off the option to use labels on files".
>> >>
>> >> This is supporting mounting filesystems like ext4 by unprivileged users
>> >> and not trusting the labels they set in the same way as we trust labels
>> >> on filesystems mounted by privileged users.
>> >>
>> >> The first step needs to be not trusting those labels and treating such
>> >> filesystems as filesystems without label support.  I hope that is what
>> >> Seth has implemented.
>> >>
>> >> In the long run we can do more interesting things with such filesystems
>> >> once the appropriate LSM policy is in place.
>> >
>> > Yes, this exactly. Right now it looks to me like the only safe thing to
>> > do with mounts from unprivileged users is to ignore the security labels,
>> > so that's what I'm trying to do with these changes. If there's some
>> > better thing to do, or some better way to do it, I'm more than happy to
>> > receive that feedback.
>>
>> Ugh.
>>
>> This made me realize that we have an interesting problem here.  An
>> unprivileged mount of tmpfs probably needs to have
>> s_user_ns == &init_user_ns.
>>
>> Otherwise we will break security labels on tmpfs for no good reason.
>> ramfs and sysfs also seem to have similar concerns.
>>
>> Because they have no backing store we can trust those filesystems with
>> security labels.  Plus for at least sysfs there is the security label
>> bleed through issue, that we need to make certain works.
>>
>> Perhaps these filesystems with trusted backing store need to call
>> "sget_userns(..., &init_user_ns)".
>>
>> If we don't get this right we will have significant regressions with
>> respect to security labels, and that is not ok.
>
> That's only a problem if there's anyone who sets security labels on
> such a mount.  You need global caps to do that (I hope), which
> requires someone outside the userns to help, which means there's a
> good chance that literally no one does this.

Fair enough.  That is however something we need to test.  If no one
puts security labels or file caps on such a mount we can change things.
If not we can't because it would introduce regressions.

Eric


[PATCH 3.19.y-ckt 020/251] [media] cx24116: fix a buffer overflow when checking userspace params

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let
me know.

--

From: Mauro Carvalho Chehab 

commit 1fa2337a315a2448c5434f41e00d56b01a22283c upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows writing many more values:
drivers/media/dvb-frontends/cx24116.c:983 cx24116_send_diseqc_msg() 
error: buffer overflow 'd->msg' 6 <= 23
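
The ordering matters here: the length has to be checked against the size of
the source array before any code walks that array. A userspace sketch of the
pattern follows; struct demo_cmd, DEMO_MSG_MAX and demo_send_cmd() are
hypothetical names chosen for illustration and are not part of the DVB API.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MSG_MAX 6 /* hypothetical: capacity of the message array */

struct demo_cmd {
	unsigned char msg[DEMO_MSG_MAX];
	unsigned char msg_len; /* supplied by the caller, so untrusted */
};

/* Copy the command into out. Returns the number of bytes copied, or
 * -EINVAL when msg_len exceeds either buffer. */
int demo_send_cmd(const struct demo_cmd *d, unsigned char *out, size_t out_size)
{
	/* Validate against the source array first, before anything reads it. */
	if (d->msg_len > sizeof(d->msg))
		return -EINVAL;
	if (d->msg_len > out_size)
		return -EINVAL;

	memcpy(out, d->msg, d->msg_len);
	return d->msg_len;
}
```

Bounding the length by sizeof(d->msg) rather than by the destination's
capacity alone is the point of the fix: a destination that is larger than the
source does not make a long read from the source safe.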

Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Kamal Mostafa 
---
 drivers/media/dvb-frontends/cx24116.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/dvb-frontends/cx24116.c b/drivers/media/dvb-frontends/cx24116.c
index 2916d7c..7bc68b3 100644
--- a/drivers/media/dvb-frontends/cx24116.c
+++ b/drivers/media/dvb-frontends/cx24116.c
@@ -963,6 +963,10 @@ static int cx24116_send_diseqc_msg(struct dvb_frontend *fe,
struct cx24116_state *state = fe->demodulator_priv;
int i, ret;
 
+   /* Validate length */
+   if (d->msg_len > sizeof(d->msg))
+return -EINVAL;
+
/* Dump DiSEqC message */
if (debug) {
printk(KERN_INFO "cx24116: %s(", __func__);
@@ -974,10 +978,6 @@ static int cx24116_send_diseqc_msg(struct dvb_frontend *fe,
printk(") toneburst=%d\n", toneburst);
}
 
-   /* Validate length */
-   if (d->msg_len > (CX24116_ARGLEN - CX24116_DISEQC_MSGOFS))
-   return -EINVAL;
-
/* DiSEqC message */
for (i = 0; i < d->msg_len; i++)
state->dsec_cmd.args[CX24116_DISEQC_MSGOFS + i] = d->msg[i];
-- 
1.9.1



[PATCH 3.19.y-ckt 001/251] net: don't wait for order-3 page allocation

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please let
me know.

--

From: Shaohua Li 

[ Upstream commit fb05e7a89f500cfc06ae277bdc911b281928995d ]

We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and adds latency. Commit 5640f7685831e0
introduces the order-3 allocation. According to the changelog, the order-3
allocation isn't a must-have but is there to improve performance. But direct
memory compaction has high overhead. The benefit of order-3 allocation can't
compensate for the overhead of direct memory compaction.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the allocation will still succeed, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered, skb_page_frag_refill will
fall back to order-0 immediately, hence the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is woken up and does
compaction, so chances are the allocation could succeed next time.

alloc_skb_with_frags is the same.
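
The "opportunistic high-order attempt, then fall back" pattern can be modelled
in userspace C. Everything in this sketch is an assumption made for
illustration: the DEMO_* flags, struct demo_mem and the simulated allocator
are hypothetical and only mimic the behaviour described above, where a
non-waiting request never enters direct compaction.

```c
#include <stdbool.h>

#define DEMO_WAIT    0x1u /* hypothetical: caller may block and compact */
#define DEMO_NOWARN  0x2u
#define DEMO_NORETRY 0x4u

/* Hypothetical memory state for the simulation. */
struct demo_mem {
	bool order3_free;     /* an order-3 block is immediately available */
	unsigned compactions; /* counts expensive direct-compaction passes */
};

/* Flags for the high-order attempt: the caller's flags with waiting removed. */
unsigned demo_high_order_flags(unsigned flags)
{
	return (flags & ~DEMO_WAIT) | DEMO_NOWARN | DEMO_NORETRY;
}

/* Simulated allocator: order 0 always succeeds; order 3 succeeds only when a
 * block is free, and compacts (in vain, in this model) if allowed to wait. */
static bool demo_alloc(struct demo_mem *m, unsigned flags, unsigned order)
{
	if (order == 0 || m->order3_free)
		return true;
	if (flags & DEMO_WAIT)
		m->compactions++;
	return false;
}

/* Returns the order that was obtained, or -1 on failure. */
int demo_frag_refill(struct demo_mem *m, unsigned flags)
{
	if (demo_alloc(m, demo_high_order_flags(flags), 3))
		return 3;
	return demo_alloc(m, flags, 0) ? 0 : -1;
}
```

In this model a fragmented system falls straight through to order 0 without
ever incrementing the compaction counter, while an unfragmented one still gets
the order-3 block.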

The mellanox driver does a similar thing; if this is accepted, we must fix
the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet 
Cc: Chris Mason 
Cc: Debabrata Banerjee 
Signed-off-by: Shaohua Li 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/core/skbuff.c | 2 +-
 net/core/sock.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 3b0a8b0..0998af7 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4414,7 +4414,7 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
 
while (order) {
if (npages >= 1 << order) {
-   page = alloc_pages(gfp_mask |
+   page = alloc_pages((gfp_mask & ~__GFP_WAIT) |
   __GFP_COMP |
   __GFP_NOWARN |
   __GFP_NORETRY,
diff --git a/net/core/sock.c b/net/core/sock.c
index a91f99f..3606cc5 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1888,7 +1888,7 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t gfp)
 
pfrag->offset = 0;
if (SKB_FRAG_PAGE_ORDER) {
-   pfrag->page = alloc_pages(gfp | __GFP_COMP |
+   pfrag->page = alloc_pages((gfp & ~__GFP_WAIT) | __GFP_COMP |
  __GFP_NOWARN | __GFP_NORETRY,
  SKB_FRAG_PAGE_ORDER);
if (likely(pfrag->page)) {
-- 
1.9.1



[PATCH 3.19.y-ckt 019/251] [media] s5h1420: fix a buffer overflow when checking userspace params

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please
let me know.

--

From: Mauro Carvalho Chehab 

commit 12f4543f5d6811f864e6c4952eb27253c7466c02 upstream.

The maximum size for a DiSEqC command is 6, according to the
userspace API. However, the code allows writing up to 7 values:
	drivers/media/dvb-frontends/s5h1420.c:193 s5h1420_send_master_cmd() error: buffer overflow 'cmd->msg' 6 <= 7
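
The kind of bound check this fix relies on can be sketched in userspace C
as follows. This is only an illustrative model: struct demo_diseqc_cmd and
demo_send_master_cmd() are hypothetical names, and the 6-byte buffer is
taken from the description above. The point is that comparing the length
against sizeof() of the array keeps the check tied to the real buffer size
instead of a hard-coded constant.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a DiSEqC command: a 6-byte buffer plus a length. */
struct demo_diseqc_cmd {
	unsigned char msg[6];
	unsigned char msg_len;  /* supplied by the caller, so untrusted */
};

/*
 * Copy the command bytes into 'out'. Returns the number of bytes copied,
 * or -EINVAL if the claimed length does not fit the source or destination.
 */
static int demo_send_master_cmd(const struct demo_diseqc_cmd *cmd,
				unsigned char *out, size_t out_len)
{
	/* Bound by the real array size, not by a magic number. */
	if (cmd->msg_len > sizeof(cmd->msg))
		return -EINVAL;
	if (cmd->msg_len > out_len)
		return -EINVAL;

	memcpy(out, cmd->msg, cmd->msg_len);
	return cmd->msg_len;
}
```

With this check a claimed length of 7 is rejected before any byte of the
6-byte array is read, whereas a hard-coded limit of 8 would let it through.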

Signed-off-by: Mauro Carvalho Chehab 
Signed-off-by: Kamal Mostafa 
---
 drivers/media/dvb-frontends/s5h1420.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/media/dvb-frontends/s5h1420.c b/drivers/media/dvb-frontends/s5h1420.c
index 93eeaf7..0b4f8fe 100644
--- a/drivers/media/dvb-frontends/s5h1420.c
+++ b/drivers/media/dvb-frontends/s5h1420.c
@@ -180,7 +180,7 @@ static int s5h1420_send_master_cmd (struct dvb_frontend* fe,
 	int result = 0;
 
 	dprintk("enter %s\n", __func__);
-	if (cmd->msg_len > 8)
+	if (cmd->msg_len > sizeof(cmd->msg))
 		return -EINVAL;
 
 	/* setup for DISEQC */
-- 
1.9.1



[PATCH 3.19.y-ckt 006/251] neigh: do not modify unlinked entries

2015-07-15 Thread Kamal Mostafa
3.19.8-ckt4 -stable review patch.  If anyone has any objections, please
let me know.

--

From: Julian Anastasov 

[ Upstream commit 2c51a97f76d20ebf1f50fef908b986cb051fdff9 ]

The lockless lookups can return an entry that is unlinked.
Sometimes they get a reference before the last neigh_cleanup_and_release,
sometimes they do not need a reference. Later, any
modification attempts may result in the following problems:

1. The entry is not destroyed immediately because neigh_update
can start the timer for a dead entry, e.g. on change to NUD_REACHABLE
state. As a result, the entry lives for some time but is invisible
and out of control.

2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0, but if the timer is started and expires, refcnt can
reach 0 for a second time, leading to a second neigh_destroy and
a possible crash.
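
The guard this patch adds can be pictured with a small userspace model in
C. Everything here is hypothetical (struct demo_neigh, demo_arm_timer(),
demo_neigh_update() and the -1 return value are made up for illustration,
and there is no locking), but it shows the idea: once an entry is marked
dead, an update must neither change its state nor arm a timer, because
arming the timer takes a new reference on an entry that is on its way to
being destroyed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified model of a neighbour entry. */
struct demo_neigh {
	bool dead;         /* set once the entry is unlinked from the table */
	int refcnt;        /* outstanding references */
	bool timer_armed;  /* a pending timer holds one reference */
	int state;
};

/* Arming the timer takes a reference in this model. */
static void demo_arm_timer(struct demo_neigh *n)
{
	if (!n->timer_armed) {
		n->timer_armed = true;
		n->refcnt++;
	}
}

/*
 * Update the entry's state. Returns 0 on success, or -1 (a made-up error
 * value) if the entry is already unlinked and must not be modified.
 */
static int demo_neigh_update(struct demo_neigh *n, int new_state)
{
	if (n->dead)
		return -1;  /* leave dead entries untouched */

	n->state = new_state;
	demo_arm_timer(n);
	return 0;
}
```

Without the early return, an update on a dead entry with refcnt 0 would
raise the count back to 1, and the timer's later release would bring it to
0 a second time, which is the double-destroy scenario described in item 2.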

Thanks to Eric Dumazet and Ying Xue for their work and analysis
on the __neigh_event_send change.

Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Cc: Eric Dumazet 
Cc: Ying Xue 
Signed-off-by: Julian Anastasov 
Acked-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Kamal Mostafa 
---
 net/core/neighbour.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 8d614c9..0385351 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -971,6 +971,8 @@ int __neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
rc = 0;
if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
goto out_unlock_bh;
+   if (neigh->dead)
+   goto out_dead;
 
if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {
if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
@@ -1027,6 +1029,13 @@ out_unlock_bh:
write_unlock(&neigh->lock);
local_bh_enable();
return rc;
+
+out_dead:
+   if (neigh->nud_state & NUD_STALE)
+   goto out_unlock_bh;
+   write_unlock_bh(&neigh->lock);
+   kfree_skb(skb);
+   return 1;
 }
 EXPORT_SYMBOL(__neigh_event_send);
 
@@ -1090,6 +1099,8 @@ int neigh_update(struct neighbour *neigh, const u8 *lladdr, u8 new,
if (!(flags & NEIGH_UPDATE_F_ADMIN) &&
(old & (NUD_NOARP | NUD_PERMANENT)))
goto out;
+   if (neigh->dead)
+   goto out;
 
if (!(new & NUD_VALID)) {
neigh_del_timer(neigh);
@@ -1239,6 +1250,8 @@ EXPORT_SYMBOL(neigh_update);
  */
 void __neigh_set_probe_once(struct neighbour *neigh)
 {
+   if (neigh->dead)
+   return;
neigh->updated = jiffies;
if (!(neigh->nud_state & NUD_FAILED))
return;
-- 
1.9.1
