Re: [drm:qxl] BUG: sleeping function called from invalid context - qxl_bo_kmap_atomic_page()...splat

2017-05-11 Thread Mike Galbraith
On Tue, 2017-05-09 at 04:37 +0200, Mike Galbraith wrote:
> On Mon, 2017-05-08 at 16:48 -0300, Gabriel Krisman Bertazi wrote:
> 
> > Thanks for reporting this.  Can you confirm the following patch prevents
> > the issue?
> 
> Nope, it still gripes.

The reason for this gripe is that we find that we don't have memory
reserved.. a tad too late.

Xorg-2252  [000]    135.409756: qxl_release_map <-qxl_cursor_atomic_update
Xorg-2252  [000]    135.409756: qxl_bo_kmap_atomic_page <-qxl_release_map
Xorg-2252  [000]    135.409757: qxl_bo_kmap_atomic_page: ENTER
Xorg-2252  [000]    135.409757: ttm_mem_io_lock <-qxl_bo_kmap_atomic_page
Xorg-2252  [000]    135.409757: ttm_mem_io_reserve <-qxl_bo_kmap_atomic_page
Xorg-2252  [000]    135.409757: qxl_ttm_io_mem_reserve <-ttm_mem_io_reserve
Xorg-2252  [000]    135.409757: ttm_mem_io_unlock <-qxl_bo_kmap_atomic_page
Xorg-2252  [000]    135.409757: qxl_bo_kmap_atomic_page: PREEMPTION DISABLED
Xorg-2252  [000] ...1   135.409758: qxl_bo_kmap <-qxl_cursor_atomic_update
Xorg-2252  [000] ...1   135.409758: ttm_bo_kmap <-qxl_bo_kmap  <== too late
Xorg-2252  [000] ...1   135.409758: ttm_mem_io_reserve <-ttm_bo_kmap
Xorg-2252  [000] ...1   135.409758: qxl_ttm_io_mem_reserve <-ttm_mem_io_reserve
Xorg-2252  [000] ...1   135.409759: ioremap_nocache <-ttm_bo_kmap <== game over
Xorg-2252  [000] ...1   135.409759: __ioremap_caller <-ioremap_nocache
Xorg-2252  [000] ...1   135.409759: walk_system_ram_range <-__ioremap_caller
Xorg-2252  [000] ...1   135.409759: find_next_iomem_res <-walk_system_ram_range
Xorg-2252  [000] ...1   135.409759: _raw_read_lock <-find_next_iomem_res
Xorg-2252  [000] ...1   135.409760: reserve_memtype <-__ioremap_caller
Xorg-2252  [000] ...1   135.409760: pat_pagerange_is_ram <-reserve_memtype
Xorg-2252  [000] ...1   135.409761: walk_system_ram_range <-pat_pagerange_is_ram
Xorg-2252  [000] ...1   135.409761: find_next_iomem_res <-walk_system_ram_range
Xorg-2252  [000] ...1   135.409761: _raw_read_lock <-find_next_iomem_res
Xorg-2252  [000] ...1   135.409761: kmem_cache_alloc_trace <-reserve_memtype
Xorg-2252  [000] ...1   135.409761: __might_sleep <-kmem_cache_alloc_trace
Xorg-2252  [000] ...1   135.409762: ___might_sleep <-__might_sleep
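
To make the ordering problem in the trace easier to see, here is a small userspace toy model. It is only a sketch: every toy_* name is made up for illustration and none of this is the qxl or ttm API. It just counts how often a call that may sleep runs while "preemption is disabled", once in the order the trace shows and once with the sleeping setup done first. The second ordering is a hypothetical illustration, not a proposed fix for the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the qxl/ttm API. */
struct toy_ctx {
	bool atomic;      /* models "preemption disabled" */
	bool mapped;      /* models "io memory reserved and ioremap done" */
	int violations;   /* sleeping calls made while atomic */
};

/* Models a call that may sleep, like the allocation under ioremap. */
static void toy_map_may_sleep(struct toy_ctx *c)
{
	assert(c != NULL);
	if (c->atomic)
		c->violations++;   /* this is what the splat complains about */
	c->mapped = true;
}

static void toy_atomic_begin(struct toy_ctx *c) { c->atomic = true; }
static void toy_atomic_end(struct toy_ctx *c)   { c->atomic = false; }

/* Order seen in the trace: go atomic, then notice the mapping is missing. */
static int toy_update_late(void)
{
	struct toy_ctx c = { false, false, 0 };

	toy_atomic_begin(&c);
	if (!c.mapped)
		toy_map_may_sleep(&c);
	toy_atomic_end(&c);
	return c.violations;
}

/* Hypothetical alternative: do the sleeping setup before going atomic. */
static int toy_update_early(void)
{
	struct toy_ctx c = { false, false, 0 };

	toy_map_may_sleep(&c);
	toy_atomic_begin(&c);
	if (!c.mapped)
		toy_map_may_sleep(&c);
	toy_atomic_end(&c);
	return c.violations;
}
```

In this model toy_update_late() reports one violation and toy_update_early() reports none, which is the "a tad too late" point above in miniature.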


Re: [PATCH v9 4/9] MAINTAINERS: update file entries for Coresight subsystem

2017-05-11 Thread Mathieu Poirier
On Tue, May 09, 2017 at 10:49:57AM +0800, Leo Yan wrote:
> Update document file entries for Coresight debug module.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Mathieu Poirier 

> ---
>  MAINTAINERS | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index b948dfa..a4b1f60 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1191,7 +1191,9 @@ L:  linux-arm-ker...@lists.infradead.org (moderated for non-subscribers)
>  S:   Maintained
>  F:   drivers/hwtracing/coresight/*
>  F:   Documentation/trace/coresight.txt
> +F:   Documentation/trace/coresight-cpu-debug.txt
>  F:   Documentation/devicetree/bindings/arm/coresight.txt
> +F:   Documentation/devicetree/bindings/arm/coresight-cpu-debug.txt
>  F:   Documentation/ABI/testing/sysfs-bus-coresight-devices-*
>  F:   tools/perf/arch/arm/util/pmu.c
>  F:   tools/perf/arch/arm/util/auxtrace.c
> -- 
> 2.7.4
> 


Re: [PATCH] staging: typec: Fix sparse warnings about incorrect types

2017-05-11 Thread Guenter Roeck
On Wed, May 10, 2017 at 10:51:35PM -0700, Guru Das Srinagesh wrote:
> Fix the following sparse warnings about incorrect type usage:
> 
> fusb302.c:1028:32: warning: incorrect type in argument 1 (different base types)
> fusb302.c:1028:32:    expected unsigned short [unsigned] [usertype] header
> fusb302.c:1028:32:    got restricted __le16 const [usertype] header
> fusb302.c:1484:32: warning: incorrect type in argument 1 (different base types)
> fusb302.c:1484:32:    expected unsigned short [unsigned] [usertype] header
> fusb302.c:1484:32:    got restricted __le16 [usertype] header
> 
> Signed-off-by: Guru Das Srinagesh 

Reviewed-by: Guenter Roeck 

> ---
>  drivers/staging/typec/fusb302/fusb302.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/typec/fusb302/fusb302.c b/drivers/staging/typec/fusb302/fusb302.c
> index 2cee9a9..3bec9d5 100644
> --- a/drivers/staging/typec/fusb302/fusb302.c
> +++ b/drivers/staging/typec/fusb302/fusb302.c
> @@ -1025,7 +1025,7 @@ static int fusb302_pd_send_message(struct fusb302_chip *chip,
>   buf[pos++] = FUSB302_TKN_SYNC1;
>   buf[pos++] = FUSB302_TKN_SYNC2;
>  
> - len = pd_header_cnt(msg->header) * 4;
> + len = pd_header_cnt_le(msg->header) * 4;
>   /* plug 2 for header */
>   len += 2;
>   if (len > 0x1F) {
> @@ -1481,7 +1481,7 @@ static int fusb302_pd_read_message(struct fusb302_chip *chip,
>    (u8 *)&msg->header);
>   if (ret < 0)
>   return ret;
> - len = pd_header_cnt(msg->header) * 4;
> + len = pd_header_cnt_le(msg->header) * 4;
>   /* add 4 to length to include the CRC */
>   if (len > PD_MAX_PAYLOAD * 4) {
>   fusb302_log(chip, "PD message too long %d", len);
> -- 
> 2.7.4
> 
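
For readers wondering why sparse cares here: the header arrives in little-endian wire order, so it has to be converted to CPU order before bits are extracted from it. A minimal userspace sketch of that idea follows. The sketch_* names are made up for illustration and are not the kernel helpers; the only fact assumed from the USB PD spec is that bits 14..12 of the message header hold the number of data objects. The "+ 2" mirrors the header bytes added in the quoted send path.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 14..12 of a USB PD message header: number of data objects. */
#define SKETCH_PD_CNT_SHIFT 12
#define SKETCH_PD_CNT_MASK  0x7u

/* Build a CPU-order value from two little-endian wire bytes. */
static uint16_t sketch_le16_to_cpu(uint8_t lo, uint8_t hi)
{
	return (uint16_t)(lo | ((uint16_t)hi << 8));
}

/* Extract the object count from a CPU-order header. */
static unsigned int sketch_pd_cnt(uint16_t header)
{
	return (header >> SKETCH_PD_CNT_SHIFT) & SKETCH_PD_CNT_MASK;
}

/* Convert first, then extract: the step the _le variant adds. */
static unsigned int sketch_pd_cnt_from_wire(uint8_t lo, uint8_t hi)
{
	return sketch_pd_cnt(sketch_le16_to_cpu(lo, hi));
}

/* Payload bytes (4 per object) plus 2 bytes for the header itself. */
static unsigned int sketch_pd_msg_len(uint8_t lo, uint8_t hi)
{
	return sketch_pd_cnt_from_wire(lo, hi) * 4 + 2;
}
```

With wire bytes 0x61, 0x31 the header is 0x3161, the count is 3 and the length comes out as 14. Doing the extraction on the unconverted value only happens to work on little-endian hosts, which is the class of bug the __le16 annotation is meant to catch.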


Re: [PATCH] mdio: mux: Correct mdio_mux_init error path issues

2017-05-11 Thread David Miller
From: Florian Fainelli 
Date: Thu, 11 May 2017 10:05:27 -0700

> On 05/10/2017 08:20 AM, Jon Mason wrote:
>> There is a potential unnecessary refcount decrement on error path of
>> put_device(&pb->mii_bus->dev), as it is possible to avoid the
>> of_mdio_find_bus() call if mux_bus is specified by the calling function.
>> 
>> The same put_device() is not called in the error path if the
>> devm_kzalloc of pb fails.  This caused the variable used in the
>> put_device() to be changed, as the pb pointer was obviously not set up.
>> 
>> There is an unnecessary of_node_get() on child_bus_node if the
>> of_mdiobus_register() is successful, as the
>> for_each_available_child_of_node() automatically increments this.
>> Thus the refcount on this node will always be +1 more than it should be.
>> 
>> There is no of_node_put() on child_bus_node if the of_mdiobus_register()
>> call fails.
>> 
>> Finally, it is lacking devm_kfree() of pb in the error path.  While this
>> might not be technically necessary, it was present in other parts of the
>> function.  So, I am adding it where necessary to make it uniform.
>> 
>> Signed-off-by: Jon Mason 
>> Fixes: f20e6657a875 ("mdio: mux: Enhanced MDIO mux framework for integrated multiplexers")
>> Fixes: 0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.")
> 
> Reviewed-by: Florian Fainelli 
> 
> Please include "net" in the subject for future submissions, thanks!

Applied.
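
The node refcount accounting described in the quoted commit message can be modelled in a few lines of userspace C. This is a toy sketch with hypothetical toy_* names, not the of_node or mdio-mux code: it assumes, as the commit message states, that the iterator already holds one reference on the child, and shows what is left after the success path, the success path with a redundant get, and the error path with a put.

```c
#include <assert.h>

/* Toy refcounted node; stands in for a device tree node. */
struct toy_node {
	int refs;
};

static void toy_node_get(struct toy_node *n)
{
	n->refs++;
}

static void toy_node_put(struct toy_node *n)
{
	assert(n->refs > 0);   /* dropping below zero is a bug */
	n->refs--;
}

/*
 * One loop iteration. Returns the references still held on the child.
 * register_fails: models a failing registration call.
 * extra_get:      models the redundant get on the success path.
 */
static int toy_register_child(int register_fails, int extra_get)
{
	struct toy_node child = { 0 };

	toy_node_get(&child);          /* reference taken by the iterator */

	if (register_fails) {
		toy_node_put(&child);  /* error path gives it back */
		return child.refs;
	}

	if (extra_get)
		toy_node_get(&child);  /* leaves the count one too high */

	return child.refs;
}
```

In this model a successful registration without the extra get leaves one reference, the extra get leaves two (the "+1 more than it should be" case), and a failed registration that puts the node leaves none.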


Re: [PATCH v9 3/9] doc: Add coresight_cpu_debug.enable to kernel-parameters.txt

2017-05-11 Thread Mathieu Poirier
On Tue, May 09, 2017 at 10:49:56AM +0800, Leo Yan wrote:
> Add coresight_cpu_debug.enable to kernel-parameters.txt, this flag is
> used to enable/disable the CPU sampling based debugging.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Mathieu Poirier 

> ---
>  Documentation/admin-guide/kernel-parameters.txt | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index e4c9e0e..010ae02 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -649,6 +649,13 @@
>   /proc/<pid>/coredump_filter.
>   See also Documentation/filesystems/proc.txt.
>  
> + coresight_cpu_debug.enable
> + [ARM,ARM64]
> + Format: <bool>
> + Enable/disable the CPU sampling based debugging.
> + 0: default value, disable debugging
> + 1: enable debugging at boot time
> +
>   cpuidle.off=1   [CPU_IDLE]
>   disable the cpuidle sub-system
>  
> -- 
> 2.7.4
> 


Re: [PATCH] ARM: remove duplicate 'const' annotations'

2017-05-11 Thread Florian Fainelli
On 05/11/2017 04:50 AM, Arnd Bergmann wrote:
> gcc-7 warns about some declarations that are more 'const' than necessary:
> 
> arch/arm/mach-at91/pm.c:338:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
>  static const struct of_device_id const ramc_ids[] __initconst = {
> arch/arm/mach-bcm/bcm_kona_smc.c:36:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
>  static const struct of_device_id const bcm_kona_smc_ids[] __initconst = {
> arch/arm/mach-spear/time.c:207:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
>  static const struct of_device_id const timer_of_match[] __initconst = {
> arch/arm/mach-omap2/prm_common.c:714:34: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
>  static const struct of_device_id const omap_prcm_dt_match_table[] __initconst = {
> arch/arm/mach-omap2/vc.c:562:35: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
>  static const struct i2c_init_data const omap4_i2c_timing_data[] __initconst = {
> 
> The ones in arch/arm were apparently all introduced accidentally by one
> commit that correctly marked a lot of variables as __initconst.
> 
> Cc: Nicolas Pitre 
> Fixes: 19c233b79d1a ("ARM: appropriate __init annotation for const data")
> Signed-off-by: Arnd Bergmann 
> ---

>  arch/arm/mach-bcm/bcm_kona_smc.c | 2 +-

Acked-by: Florian Fainelli 
-- 
Florian
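
As a quick illustration of what the warning is about: in an array declaration the second 'const' qualifies the same element type as the first, so it adds nothing. Below is a minimal userspace sketch of the single-const form. The struct, the compatible strings and the helper are made up for illustration, and the kernel-only __initconst annotation is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a match table entry. */
struct toy_id {
	const char *compatible;
};

/*
 * gcc-7 flags the doubled form:
 *     static const struct toy_id const toy_ids[] = { ... };
 * Both qualifiers apply to the element type, so one 'const' is enough.
 */
static const struct toy_id toy_ids[] = {
	{ "vendor,example-a" },
	{ "vendor,example-b" },
	{ NULL },               /* sentinel */
};

/* Count entries up to the NULL sentinel. */
static size_t toy_count_ids(const struct toy_id *ids)
{
	size_t n = 0;

	assert(ids != NULL);
	while (ids[n].compatible != NULL)
		n++;
	return n;
}
```

The table is just as read-only with one qualifier as with two; toy_count_ids(toy_ids) sees the same two entries either way.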


Re: [PATCH v9 2/9] doc: Add documentation for Coresight CPU debug

2017-05-11 Thread Mathieu Poirier
On Tue, May 09, 2017 at 10:49:55AM +0800, Leo Yan wrote:
> Add detailed documentation for Coresight CPU debug driver, which
> contains the info for driver implementation, Mike Leach excellent
> summary for "clock and power domain". At the end some examples on how
> to enable the debugging functionality are provided.
> 
> Suggested-by: Mike Leach 
> Signed-off-by: Leo Yan 

Reviewed-by: Mathieu Poirier 

> ---
>  Documentation/trace/coresight-cpu-debug.txt | 174 
> 
>  1 file changed, 174 insertions(+)
>  create mode 100644 Documentation/trace/coresight-cpu-debug.txt
> 
> diff --git a/Documentation/trace/coresight-cpu-debug.txt b/Documentation/trace/coresight-cpu-debug.txt
> new file mode 100644
> index 0000000..0426d50
> --- /dev/null
> +++ b/Documentation/trace/coresight-cpu-debug.txt
> @@ -0,0 +1,174 @@
> +		Coresight CPU Debug Module
> +		==========================
> +
> +   Author:   Leo Yan 
> +   Date:     April 5th, 2017
> +
> +Introduction
> +------------
> +
> +Coresight CPU debug module is defined in ARMv8-a architecture reference manual
> +(ARM DDI 0487A.k) Chapter 'Part H: External debug', the CPU can integrate
> +debug module and it is mainly used for two modes: self-hosted debug and
> +external debug. Usually the external debug mode is well known as the external
> +debugger connects with SoC from JTAG port; on the other hand the program can
> +explore debugging method which rely on self-hosted debug mode, this document
> +is to focus on this part.
> +
> +The debug module provides sample-based profiling extension, which can be used
> +to sample CPU program counter, secure state and exception level, etc; usually
> +every CPU has one dedicated debug module to be connected. Based on self-hosted
> +debug mechanism, Linux kernel can access these related registers from mmio
> +region when the kernel panic happens. The callback notifier for kernel panic
> +will dump related registers for every CPU; finally this is good for assistant
> +analysis for panic.
> +
> +
> +Implementation
> +--------------
> +
> +- During driver registration, use EDDEVID and EDDEVID1 two device ID
> +  registers to decide if sample-based profiling is implemented or not. On some
> +  platforms this hardware feature is fully or partially implemented; and if
> +  this feature is not supported then registration will fail.
> +
> +- At the time of writing, the debug driver mainly relies on three sampling
> +  registers. The kernel panic callback notifier gathers info from EDPCSR,
> +  EDVIDSR and EDCIDSR; from EDPCSR we can get program counter, EDVIDSR has
> +  information for secure state, exception level, bit width, etc; EDCIDSR is
> +  context ID value which contains the sampled value of CONTEXTIDR_EL1.
> +
> +- The driver supports CPU running mode with either AArch64 or AArch32. The
> +  registers naming convention is a bit different between them, AArch64 uses
> +  'ED' for register prefix (ARM DDI 0487A.k, chapter H9.1) and AArch32 uses
> +  'DBG' as prefix (ARM DDI 0487A.k, chapter G5.1). The driver is unified to
> +  use AArch64 naming convention.
> +
> +- ARMv8-a (ARM DDI 0487A.k) and ARMv7-a (ARM DDI 0406C.b) have different
> +  register bits definition. So the driver consolidates two difference:
> +
> +  If PCSROffset=0b0000, on ARMv8-a the feature of EDPCSR is not implemented;
> +  but ARMv7-a defines "PCSR samples are offset by a value that depends on the
> +  instruction set state". For ARMv7-a, the driver checks furthermore if CPU
> +  runs with ARM or thumb instruction set and calibrate PCSR value, the
> +  detailed description for offset is in ARMv7-a ARM (ARM DDI 0406C.b) chapter
> +  C11.11.34 "DBGPCSR, Program Counter Sampling Register".
> +
> +  If PCSROffset=0b0010, ARMv8-a defines "EDPCSR implemented, and samples have
> +  no offset applied and do not sample the instruction set state in AArch32
> +  state". So on ARMv8 if EDDEVID1.PCSROffset is 0b0010 and the CPU operates
> +  in AArch32 state, EDPCSR is not sampled; when the CPU operates in AArch64
> +  state EDPCSR is sampled and no offset are applied.
> +
> +
> +Clock and power domain
> +----------------------
> +
> +Before accessing debug registers, we should ensure the clock and power domain
> +have been enabled properly. In ARMv8-a ARM (ARM DDI 0487A.k) chapter 'H9.1
> +Debug registers', the debug registers are spread into two domains: the debug
> +domain and the CPU domain.
> +
> +                                +---------------+
> +                                |               |
> +                                |               |
> +                     +----------+--+            |
> +        dbg_clk -->  |          |**|            |<-- cpu_clk
> +                     |    Debug |**|   CPU      |
> +         dbg_pd -->  |          |**|            |<-- cpu_pd
> +                     +----------+--+            |
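
The PCSR calibration described in the quoted document (samples on ARMv7-a may carry an offset that depends on the instruction set state) can be sketched in userspace C as below. This is a rough sketch under assumptions, not the driver code: the sketch_* names are made up, bit 0 of the sample is assumed to flag Thumb state, and the offsets of 8 bytes for ARM and 4 bytes for Thumb are taken from my reading of the DBGPCSR description in the ARMv7-a ARM. Please check them against the manual before relying on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PCSR_THUMB   (1u << 0)   /* assumed: bit 0 set means Thumb */
#define SKETCH_PCSR_BIT1    (1u << 1)

/*
 * Turn a raw PC sample into an instruction address.
 * has_offset: true when the implementation applies the ARMv7-a style
 *             offset, false when samples are reported without offset.
 */
static uint32_t sketch_adjust_pc(uint32_t pcsr, bool has_offset)
{
	uint32_t arm_off = has_offset ? 8u : 0u;     /* assumed ARM offset */
	uint32_t thumb_off = has_offset ? 4u : 0u;   /* assumed Thumb offset */

	/* Thumb sample: clear the state bit, then remove the offset. */
	if (pcsr & SKETCH_PCSR_THUMB)
		return (pcsr & ~SKETCH_PCSR_THUMB) - thumb_off;

	/* Not word aligned and not Thumb: leave the value untouched. */
	if (pcsr & SKETCH_PCSR_BIT1)
		return pcsr;

	/* ARM sample: already word aligned, just remove the offset. */
	return pcsr - arm_off;
}
```

For example, a Thumb sample of 0x80001009 with the offset applied maps back to 0x80001004, and an ARM sample of 0x80001008 maps back to 0x80001000; with has_offset false the ARM sample is returned unchanged.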

[GIT PULL] arm64 2nd set of updates for 4.12

2017-05-11 Thread Catalin Marinas
Hi Linus,

Please pull the arm64 updates below. The mm/vmalloc.c change was acked
by Michal Hocko and the arch/arm one by Russell King. Thanks.

The following changes since commit 92f66f84d9695d07adf9bc987bbcce4bf9b8e87c:

  arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUS (2017-05-05 11:41:35 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux tags/arm64-upstream

for you to fetch changes up to 0c2cf6d9487cb90be6ad7fac66044dfa8e8e5243:

  arm64: Silence first allocation with CONFIG_ARM64_MODULE_PLTS=y (2017-05-11 14:43:40 +0100)


arm64 2nd set of updates for 4.12:

- Silence module allocation failures when CONFIG_ARM*_MODULE_PLTS is
  enabled. This requires a check for __GFP_NOWARN in alloc_vmap_area()

- Improve/sanitise user tagged pointers handling in the kernel

- Inline asm fixes/cleanups


Florian Fainelli (3):
  mm: Silence vmap() allocation failures based on caller gfp_flags
  ARM: Silence first allocation with CONFIG_ARM_MODULE_PLTS=y
  arm64: Silence first allocation with CONFIG_ARM64_MODULE_PLTS=y

Kristina Martsenko (4):
  arm64: traps: fix userspace cache maintenance emulation on a tagged pointer
  arm64: hw_breakpoint: fix watchpoint matching for tagged pointers
  arm64: entry: improve data abort handling of tagged pointers
  arm64: documentation: document tagged pointer stack constraints

Mark Rutland (6):
  arm64: xchg: hazard against entire exchange variable
  arm64: ensure extension of smp_store_release value
  arm64: uaccess: ensure extension of access_ok() addr
  arm64: armv8_deprecated: ensure extension of addr
  arm64: atomic_lse: match asm register sizes
  arm64: uaccess: suppress spurious clang warning

 Documentation/arm64/tagged-pointers.txt | 62 +
 arch/arm/kernel/module.c| 11 --
 arch/arm64/include/asm/asm-uaccess.h|  9 +
 arch/arm64/include/asm/atomic_lse.h |  4 +--
 arch/arm64/include/asm/barrier.h| 20 ---
 arch/arm64/include/asm/cmpxchg.h|  2 +-
 arch/arm64/include/asm/uaccess.h| 13 +++
 arch/arm64/kernel/armv8_deprecated.c|  3 +-
 arch/arm64/kernel/entry.S   |  5 +--
 arch/arm64/kernel/hw_breakpoint.c   |  3 ++
 arch/arm64/kernel/module.c  |  7 +++-
 arch/arm64/kernel/traps.c   |  4 +--
 mm/vmalloc.c|  2 +-
 13 files changed, 107 insertions(+), 38 deletions(-)

-- 
Catalin


Re: [PATCH v9 9/9] arm64: dts: qcom: msm8916: Add debug unit

2017-05-11 Thread Mathieu Poirier
On Tue, May 09, 2017 at 10:50:02AM +0800, Leo Yan wrote:
> Add debug unit on Qualcomm msm8916 based platforms, including the
> DragonBoard 410c board.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Mathieu Poirier 

> ---
>  arch/arm64/boot/dts/qcom/msm8916.dtsi | 32 
>  1 file changed, 32 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi 
> b/arch/arm64/boot/dts/qcom/msm8916.dtsi
> index 68a8e67..3af814b 100644
> --- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
> +++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
> @@ -1104,6 +1104,38 @@
>   };
>   };
>  
> + debug@85 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0x85 0x1000>;
> + clocks = < RPM_QDSS_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@852000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0x852000 0x1000>;
> + clocks = < RPM_QDSS_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@854000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0x854000 0x1000>;
> + clocks = < RPM_QDSS_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@856000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0x856000 0x1000>;
> + clocks = < RPM_QDSS_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
>   etm@85c000 {
>   compatible = "arm,coresight-etm4x", "arm,primecell";
>   reg = <0x85c000 0x1000>;
> -- 
> 2.7.4
> 


Re: [PATCH v9 8/9] arm64: dts: hi6220: register debug module

2017-05-11 Thread Mathieu Poirier
On Tue, May 09, 2017 at 10:50:01AM +0800, Leo Yan wrote:
> Bind debug module driver for Hi6220.
> 
> Signed-off-by: Leo Yan 

Reviewed-by: Mathieu Poirier 

> ---
>  arch/arm64/boot/dts/hisilicon/hi6220.dtsi | 64 
> +++
>  1 file changed, 64 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi 
> b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
> index 470461d..467aa15 100644
> --- a/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
> +++ b/arch/arm64/boot/dts/hisilicon/hi6220.dtsi
> @@ -913,5 +913,69 @@
>   };
>   };
>   };
> +
> + debug@f659 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf659 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@f6592000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf6592000 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@f6594000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf6594000 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@f6596000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf6596000 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@f65d {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf65d 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@f65d2000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf65d2000 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@f65d4000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf65d4000 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
> +
> + debug@f65d6000 {
> + compatible = "arm,coresight-cpu-debug","arm,primecell";
> + reg = <0 0xf65d6000 0 0x1000>;
> + clocks = <_ctrl HI6220_DAPB_CLK>;
> + clock-names = "apb_pclk";
> + cpu = <>;
> + };
>   };
>  };
> -- 
> 2.7.4
> 


Re: [PATCH] input: cros_ec_keyb: remove extraneous 'const'

2017-05-11 Thread Dmitry Torokhov
On Thu, May 11, 2017 at 01:48:04PM +0200, Arnd Bergmann wrote:
> gcc-7 warns about 'const SIMPLE_DEV_PM_OPS', as that macro already contains
> a 'const' keyword:
> 
> drivers/input/keyboard/cros_ec_keyb.c:663:14: error: duplicate 'const' declaration specifier [-Werror=duplicate-decl-specifier]
>  static const SIMPLE_DEV_PM_OPS(cros_ec_keyb_pm_ops, NULL, cros_ec_keyb_resume);
> 
> This removes the extra one.
> 
> Fixes: 6af6dc2d2aa6 ("input: Add ChromeOS EC keyboard driver")
> Signed-off-by: Arnd Bergmann 

Applied, thank you.
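
A minimal sketch of why the extra qualifier is redundant, using a stand-in
macro. DEMO_PM_OPS, struct demo_pm_ops and the demo_* helpers are hypothetical
names invented for illustration; they are not the kernel definitions. The point
is only that a macro whose expansion already begins with 'const' needs nothing
more than a storage class from its caller:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PM ops structure; not the kernel's type. */
struct demo_pm_ops {
	int (*suspend)(void *dev);
	int (*resume)(void *dev);
};

/*
 * Hypothetical macro modelled on the pattern described above: the
 * expansion already starts with 'const'.
 */
#define DEMO_PM_OPS(name, suspend_fn, resume_fn) \
	const struct demo_pm_ops name = { \
		.suspend = (suspend_fn), \
		.resume = (resume_fn), \
	}

static int demo_resume(void *dev)
{
	(void)dev;
	return 0;
}

/*
 * "static const DEMO_PM_OPS(...)" would expand to
 * "static const const struct demo_pm_ops ...", which is what
 * -Wduplicate-decl-specifier reports. "static" alone is enough.
 */
static DEMO_PM_OPS(demo_ops, NULL, demo_resume);

/* Returns nonzero when the table was filled in as requested. */
int demo_ops_ok(void)
{
	return demo_ops.suspend == NULL && demo_ops.resume == demo_resume;
}

/* Calls the resume hook through the const table. */
int demo_call_resume(void)
{
	assert(demo_ops.resume != NULL);
	return demo_ops.resume(NULL);
}
```

The object is still read-only after dropping the caller's 'const', because the
qualifier comes from the macro expansion itself.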

> ---
>  drivers/input/keyboard/cros_ec_keyb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/input/keyboard/cros_ec_keyb.c 
> b/drivers/input/keyboard/cros_ec_keyb.c
> index c7a8120b13c0..79eb29550c34 100644
> --- a/drivers/input/keyboard/cros_ec_keyb.c
> +++ b/drivers/input/keyboard/cros_ec_keyb.c
> @@ -660,7 +660,7 @@ static const struct of_device_id cros_ec_keyb_of_match[] = {
>  MODULE_DEVICE_TABLE(of, cros_ec_keyb_of_match);
>  #endif
>  
> -static const SIMPLE_DEV_PM_OPS(cros_ec_keyb_pm_ops, NULL, cros_ec_keyb_resume);
> +static SIMPLE_DEV_PM_OPS(cros_ec_keyb_pm_ops, NULL, cros_ec_keyb_resume);
>  
>  static struct platform_driver cros_ec_keyb_driver = {
>   .probe = cros_ec_keyb_probe,
> -- 
> 2.9.0
> 

-- 
Dmitry


Re: [PATCH v9 7/9] coresight: add support for CPU debug module

2017-05-11 Thread Mathieu Poirier
On Tue, May 09, 2017 at 10:50:00AM +0800, Leo Yan wrote:
> CoreSight includes a debug module, which usually connects to the CPU
> debug logic. The ARMv8 architecture reference manual (ARM DDI 0487A.k)
> describes it in "Part H: External Debug".
> 
> Chapter H7 "The Sample-based Profiling Extension" introduces several
> sampling registers, e.g. we can check the program counter value together
> with the CPU exception level, secure state, etc. This is helpful for
> analysing CPU lockup scenarios, e.g. if one CPU has run into an infinite
> loop with IRQs disabled. In this case the CPU cannot switch context or
> handle any interrupt (including IPIs), and as a result it cannot handle
> the SMP call for a stack dump.
> 
> This patch enables the coresight debug module. First, the driver binds
> the apb clock for the debug module, which ensures the debug module can
> be accessed from a program or an external debugger. The driver then uses
> the sample-based registers for debugging, e.g. when the system triggers
> a panic, the driver dumps the program counter and the combined context
> registers (EDCIDSR, EDVIDSR); parsing the context registers quickly
> reveals the CPU secure state, exception level, etc.
> 
> Some of the debug module registers are located in the CPU power domain,
> so the CPU power domain must stay on while the related debug registers
> are accessed, but power management for the CPU power domain depends
> heavily on the SoC integration. For platforms with sane power
> controller implementations, this driver sets EDPRCR to try to pull the
> CPU out of its low power state and then sets the 'no power down request'
> bit so the CPU has no chance to lose power.
> 
> If the SoC's power management controller does not follow this design,
> the user should use the command line parameter or sysfs to constrain all
> or some idle states, so that the CPU power domain stays enabled and the
> coresight CPU debug component can be accessed safely.
> 
> Signed-off-by: Leo Yan 
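
The two-step EDPRCR sequence described in the commit message can be sketched
against a simulated register. Everything below is an assumption made for
illustration: the bit positions, the fake_edprcr variable standing in for a
memory-mapped register, and the demo_* helper names. The ARM ARM and the driver
itself remain the reference for the real layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define DEMO_EDPRCR_CORENPDRQ (1u << 0) /* 'no power down request' */
#define DEMO_EDPRCR_COREPURQ  (1u << 3) /* power up request */

/* Plain variable standing in for the memory-mapped EDPRCR register. */
static uint32_t fake_edprcr;

static uint32_t demo_read(void)
{
	return fake_edprcr;
}

static void demo_write(uint32_t val)
{
	fake_edprcr = val;
}

/* True once the 'no power down request' bit is set. */
bool demo_power_is_held(void)
{
	return (demo_read() & DEMO_EDPRCR_CORENPDRQ) != 0;
}

/*
 * Step 1: ask for the CPU to be pulled out of its low power state.
 * Step 2: set 'no power down request' so it cannot lose power while
 * the debug registers are being read.
 */
void demo_force_power_on(void)
{
	demo_write(demo_read() | DEMO_EDPRCR_COREPURQ);
	/* Real hardware would need a wait for power-up here (not modelled). */
	demo_write(demo_read() | DEMO_EDPRCR_CORENPDRQ);
	assert(demo_power_is_held());
}
```

On a platform whose power controller ignores these requests, the sequence
would have no effect, which is why the commit message falls back to
constraining idle states.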
> ---
>  drivers/hwtracing/coresight/Kconfig   |  14 +
>  drivers/hwtracing/coresight/Makefile  |   1 +
>  drivers/hwtracing/coresight/coresight-cpu-debug.c | 693 
> ++
>  3 files changed, 708 insertions(+)
>  create mode 100644 drivers/hwtracing/coresight/coresight-cpu-debug.c
> 
> diff --git a/drivers/hwtracing/coresight/Kconfig 
> b/drivers/hwtracing/coresight/Kconfig
> index 130cb21..8d55d6d 100644
> --- a/drivers/hwtracing/coresight/Kconfig
> +++ b/drivers/hwtracing/coresight/Kconfig
> @@ -89,4 +89,18 @@ config CORESIGHT_STM
> logging useful software events or data coming from various entities
> in the system, possibly running different OSs
>  
> +config CORESIGHT_CPU_DEBUG
> + tristate "CoreSight CPU Debug driver"
> + depends on ARM || ARM64
> + depends on DEBUG_FS
> + help
> +   This driver provides support for coresight debugging module. This
> +   is primarily used to dump sample-based profiling registers when
> +   system triggers panic, the driver will parse context registers so
> +   can quickly get to know program counter (PC), secure state,
> +   exception level, etc. Before use debugging functionality, platform
> +   needs to ensure the clock domain and power domain are enabled
> +   properly, please refer Documentation/trace/coresight-cpu-debug.txt
> +   for detailed description and the example for usage.
> +
>  endif
> diff --git a/drivers/hwtracing/coresight/Makefile 
> b/drivers/hwtracing/coresight/Makefile
> index af480d9..433d590 100644
> --- a/drivers/hwtracing/coresight/Makefile
> +++ b/drivers/hwtracing/coresight/Makefile
> @@ -16,3 +16,4 @@ obj-$(CONFIG_CORESIGHT_SOURCE_ETM4X) += coresight-etm4x.o \
>   coresight-etm4x-sysfs.o
>  obj-$(CONFIG_CORESIGHT_QCOM_REPLICATOR) += coresight-replicator-qcom.o
>  obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o
> +obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o
> diff --git a/drivers/hwtracing/coresight/coresight-cpu-debug.c 
> b/drivers/hwtracing/coresight/coresight-cpu-debug.c
> new file mode 100644
> index 000..ab12ec1
> --- /dev/null
> +++ b/drivers/hwtracing/coresight/coresight-cpu-debug.c
> @@ -0,0 +1,693 @@
> +/*
> + * Copyright (c) 2017 Linaro Limited. All rights reserved.
> + *
> + * Author: Leo Yan 
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published 
> by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but 
> WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public 

Re: [PATCH v6 3/5] test: add new driver_data load tester

2017-05-11 Thread Luis R. Rodriguez
On Thu, May 11, 2017 at 07:46:27PM +0900, AKASHI Takahiro wrote:
> Luis,
> 
> On Fri, Apr 28, 2017 at 03:45:35AM +0200, Luis R. Rodriguez wrote:
> > > > +To test an async call one could do::
> > > > +
> > > > +echo anything > /lib/firmware/test-driver_data.bin
> > > 
> > > Your current shell script doesn't search for the firmware in
> > > /lib/firmware unless you explicitly specify $FWPATH.
> > 
> > This is true but that is the *test* shell script, and it purposely avoids the
> > existing firmware path to avoid overriding dummy test files on the production
> > path. So the above still stands as it is not using the test shell script
> > driver_data.sh.
> > 
> > I'll add a note:
> > 
> > """
> > Note that driver_data.sh uses its own temporary custom path for creating and
> > looking for driver data files, it does this to not overwrite any production
> > files you might have which may share the same names used by the test shell
> > script driver_data.sh. If you are not using the driver_data.sh script your
> > default path will be used.
> > """
> 
> That looks fine, but I think we'd better change the line:
> 
> > > > +echo anything > /lib/firmware/test-driver_data.bin
> 
> since it is just incorrect as far as driver_data.sh goes.

But that is accurate, given the default file we search for in test_driver_data.c
is test-driver_data.bin. It also does not create a conflict by overwriting a file
used by driver_data.sh, as driver_data.sh uses a custom path. I think the note
above on the custom path is sufficient for the developer or user to be aware of
the fact that driver_data.sh does its own thing, and that the example is just a
manual test case.

What do you mean by saying it's incorrect?

  Luis


Threads stuck in zap_pid_ns_processes()

2017-05-11 Thread Guenter Roeck
Hi all,

The test program attached below almost always results in one of the child
processes being stuck in zap_pid_ns_processes(). When this happens, I can
see from test logs that nr_hashed == 2 and init_pids == 1, but there is only
a single thread left in the pid namespace (the one that is stuck).
Traceback from /proc/<pid>/stack is

[] zap_pid_ns_processes+0x1ee/0x2a0
[] do_exit+0x10d4/0x1330
[] do_group_exit+0x86/0x130
[] get_signal+0x367/0x8a0
[] do_signal+0x83/0xb90
[] exit_to_usermode_loop+0x75/0xc0
[] syscall_return_slowpath+0xc6/0xd0
[] entry_SYSCALL_64_fastpath+0xab/0xad
[] 0x

After 120 seconds, I get the "hung task" message.

Example from v4.11:

...
[ 3263.379545] INFO: task clone:27910 blocked for more than 120 seconds.
[ 3263.379561]   Not tainted 4.11.0+ #1
[ 3263.379569] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[ 3263.379577] clone   D0 27910  27909 0x
[ 3263.379587] Call Trace:
[ 3263.379608]  __schedule+0x677/0xda0
[ 3263.379621]  ? pci_mmcfg_check_reserved+0xc0/0xc0
[ 3263.379634]  ? task_stopped_code+0x70/0x70
[ 3263.379643]  schedule+0x4d/0xd0
[ 3263.379653]  zap_pid_ns_processes+0x1ee/0x2a0
[ 3263.379659]  ? copy_pid_ns+0x4d0/0x4d0
[ 3263.379670]  do_exit+0x10d4/0x1330
...

The problem is seen in all kernels up to v4.11.
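
One way to read the two counters reported above, as a hedged sketch: assuming
the wait loop in zap_pid_ns_processes() only exits once nr_hashed has dropped
to init_pids (an assumption about the kernel code, not something the logs
state), the observed values can never satisfy it. The struct and function
names below are hypothetical and exist only for this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the two counters seen in the test logs. */
struct demo_ns_state {
	int nr_hashed; /* pids still hashed in the namespace */
	int init_pids; /* pids the exiting init expects to remain */
};

/*
 * Assumed exit condition of the wait loop: the namespace init may stop
 * waiting only when nr_hashed equals init_pids.
 */
bool demo_zap_must_wait(const struct demo_ns_state *st)
{
	assert(st != NULL);
	return st->nr_hashed != st->init_pids;
}
```

With nr_hashed == 2 and init_pids == 1 the model keeps waiting, which would
match a task sitting in D state if nothing ever lowers nr_hashed again.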

Any idea what might be going on and how to fix the problem?

Thanks,
Guenter

---
This test program was kindly provided by Vovo Yang.

Note that the ptrace() call in child1() is not necessary for the problem
to be seen, though it seems to make it a bit more likely.
---

#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ptrace.h>
#include <unistd.h>

#define STACK_SIZE 65536

int child1(void* arg);
int child2(void* arg);

int main(int argc, char **argv)
{
  int child_pid;
  char* child_stack = malloc(STACK_SIZE);
  char* stack_top = child_stack + STACK_SIZE;
  char command[256];

  /* child1 runs as the first process of a new pid namespace */
  child_pid = clone(child1, stack_top, CLONE_NEWPID, NULL);
  if (child_pid == -1) {
    printf("parent: clone failed: %s\n", strerror(errno));
    return EXIT_FAILURE;
  }
  printf("parent: child1_pid: %d\n", child_pid);

  sleep(2);
  printf("child state, if it's D (disk sleep), the child process is hung\n");
  sprintf(command, "cat /proc/%d/status | grep State:", child_pid);
  system(command);
  sleep(3600);
  return EXIT_SUCCESS;
}

int child1(void* arg)
{
  int flags = CLONE_FILES | CLONE_FS | CLONE_VM | CLONE_SIGHAND | CLONE_THREAD;
  char* child_stack = malloc(STACK_SIZE);
  char* stack_top = child_stack + STACK_SIZE;
  long ret;

  ret = ptrace(PTRACE_TRACEME, 0, NULL, NULL);
  if (ret == -1) {
    printf("child1: ptrace failed: %s\n", strerror(errno));
    return EXIT_FAILURE;
  }

  /* child2 is a thread in the same thread group as child1 */
  ret = clone(child2, stack_top, flags, NULL);
  if (ret == -1) {
    printf("child1: clone failed: %s\n", strerror(errno));
    return EXIT_FAILURE;
  }
  printf("child1: child2 pid: %ld\n", ret);

  sleep(1);
  printf("child1: end\n");
  return EXIT_SUCCESS;
}

int child2(void* arg)
{
  long ret = ptrace(PTRACE_TRACEME, 0, NULL, NULL);
  if (ret == -1) {
    printf("child2: ptrace failed: %s\n", strerror(errno));
    return EXIT_FAILURE;
  }

  printf("child2: end\n");
  return EXIT_SUCCESS;
}




Re: [PATCH RFC] sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks

2017-05-11 Thread Daniel Bristot de Oliveira
On 05/04/2017 04:26 PM, Peter Zijlstra wrote:
> On Thu, May 04, 2017 at 04:17:21PM +0200, Peter Zijlstra wrote:
>> We use two different CBS rules:
>>
>>  1) the original CBS rule for implicit deadline tasks;
>>  2) the revised CBS rule for constrained deadline tasks.
>>> @@ -500,6 +550,14 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>>>  
>>> if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>>> dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
>>> +
>>> +   if (unlikely(dl_is_constrained(dl_se) &&
>>> +   !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
>>> +   !dl_se->dl_boosted)){
>>> +   update_dl_revised_wakeup(dl_se, rq);
>>> +   return;
>>> +   }
>>> +
>>> dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
>>> dl_se->runtime = pi_se->dl_runtime;
>>> }
> That comment above does not match the code. We can still use the
> original CBS rule for constrained tasks.
> 
> We only use the revised CBS rule when the constrained task hasn't missed
> its deadline yet.

Correct. I will improve the explanation.
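
The selection Peter describes can be sketched in plain C as follows. This is
an illustration only: the enum, the function name and the boolean parameters
are hypothetical stand-ins for dl_time_before(), dl_entity_overflow(),
dl_is_constrained() and dl_se->dl_boosted in the quoted hunk, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical labels for the three outcomes of the wakeup check. */
enum wakeup_rule {
	KEEP_PARAMETERS,	/* neither condition holds: nothing is updated */
	ORIGINAL_CBS,		/* replenish runtime, set a new absolute deadline */
	REVISED_CBS,		/* keep the deadline, shrink the remaining runtime */
};

/*
 * Mirrors the structure of the quoted hunk: the outer test decides
 * whether anything must be updated, the inner test picks the revised
 * rule only for a constrained, non-boosted task whose deadline has
 * not passed yet.
 */
static enum wakeup_rule pick_wakeup_rule(bool deadline_passed, bool overflow,
					 bool constrained, bool boosted)
{
	if (!deadline_passed && !overflow)
		return KEEP_PARAMETERS;

	if (constrained && !deadline_passed && !boosted)
		return REVISED_CBS;

	return ORIGINAL_CBS;
}
```

In this sketch a constrained task whose deadline has already passed still
ends up in the ORIGINAL_CBS branch, which is the case the original comment
did not describe.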

Thank you very much for all comments :-)

-- Daniel


Re: [PATCH RFC] sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks

2017-05-11 Thread Daniel Bristot de Oliveira
On 05/04/2017 04:23 PM, Peter Zijlstra wrote:
> On Thu, May 04, 2017 at 04:17:21PM +0200, Peter Zijlstra wrote:
>> On Mon, Apr 24, 2017 at 05:18:35PM +0200, Daniel Bristot de Oliveira wrote:
>>> +static void
>>> +update_dl_revised_wakeup(struct sched_dl_entity *dl_se, struct rq *rq)
>>> +{
>>> +   u64 density = div64_u64(dl_se->dl_runtime << 20, dl_se->dl_deadline);
>>> +   u64 laxity = dl_se->deadline - rq_clock(rq);
>>> +
>>> +   BUG_ON(laxity < 0);
>> Compiler will make that go away, by virtue of laxity being unsigned.
> Also, so we want that to BUG (or even WARN) in a soft RT setting.
> Remember, GEDF does not guarantee we make our deadlines.

The point is that, a task with laxity < 0 should be throttled before
this point by:

df8eac8cafce ("sched/deadline: Throttle a constrained deadline task
activated after the deadline")

Arriving here with (laxity < 0) means that we skipped that check.
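
Peter's remark about the unsigned type can be seen in a small user-space
sketch. The function names below are hypothetical; the signed cast follows
the same idiom as the kernel's dl_time_before() helper.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Laxity computed the way the RFC does it. Both operands are unsigned,
 * so when "now" is past "deadline" the result wraps around to a huge
 * positive value instead of going negative, and a "< 0" test on it can
 * never fire.
 */
static uint64_t laxity_unsigned(uint64_t deadline, uint64_t now)
{
	return deadline - now;
}

/*
 * Hypothetical alternative: reinterpret the difference as a signed
 * 64-bit value before comparing, so a missed deadline is detectable.
 */
static bool deadline_missed(uint64_t deadline, uint64_t now)
{
	return (int64_t)(deadline - now) < 0;
}
```

With deadline = 5 and now = 7, laxity_unsigned() yields 2^64 - 2, while
deadline_missed() reports true.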

>> We use two difference CBS rules:
> _different_, clearly I cannot type anymore either ;-)

ops... :-)

-- Daniel


Re: [PATCH] mdio: mux: Correct mdio_mux_init error path issues

2017-05-11 Thread Florian Fainelli
On 05/10/2017 08:20 AM, Jon Mason wrote:
> There is a potential unnecessary refcount decrement on error path of
> put_device(&pb->mii_bus->dev), as it is possible to avoid the
> of_mdio_find_bus() call if mux_bus is specified by the calling function.
> 
> The same put_device() is not called in the error path if the
> devm_kzalloc of pb fails.  This caused the variable used in the
> put_device() to be changed, as the pb pointer was obviously not set up.
> 
> There is an unnecessary of_node_get() on child_bus_node if the
> of_mdiobus_register() is successful, as the
> for_each_available_child_of_node() automatically increments this.
> Thus the refcount on this node will always be +1 more than it should be.
> 
> There is no of_node_put() on child_bus_node if the of_mdiobus_register()
> call fails.
> 
> Finally, it is lacking devm_kfree() of pb in the error path.  While this
> might not be technically necessary, it was present in other parts of the
> function.  So, I am adding it where necessary to make it uniform.
> 
> Signed-off-by: Jon Mason 
> Fixes: f20e6657a875 ("mdio: mux: Enhanced MDIO mux framework for integrated 
> multiplexers")
> Fixes: 0ca2997d1452 ("netdev/of/phy: Add MDIO bus multiplexer support.")

Reviewed-by: Florian Fainelli 

Please include "net" in the subject for future submissions, thanks!
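
The error-path rules in the changelog (drop only the references that were
actually taken, and unwind in reverse order) can be illustrated with a
user-space sketch. Everything in it is hypothetical: struct ref_obj,
obj_get()/obj_put() and mux_init_sketch() merely stand in for the device and
device-tree node refcounting and are not the mdio-mux code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted kernel object. */
struct ref_obj {
	int refcount;
};

static void obj_get(struct ref_obj *o) { o->refcount++; }
static void obj_put(struct ref_obj *o) { o->refcount--; }

/*
 * Hypothetical init routine with two failure points. When the caller
 * supplies "bus", no lookup happens and no reference is taken, so the
 * error path must not drop one. When "bus" is NULL, "fallback" plays
 * the role of the looked-up bus and a reference is taken on it.
 */
static int mux_init_sketch(struct ref_obj *node, struct ref_obj *bus,
			   struct ref_obj *fallback, bool fail_alloc,
			   bool fail_register, void **out_priv)
{
	struct ref_obj *parent = bus ? bus : fallback;
	void *priv;
	int ret;

	obj_get(node);
	if (!bus)
		obj_get(parent);	/* reference taken by the "lookup" */

	priv = fail_alloc ? NULL : malloc(16);
	if (!priv) {
		ret = -1;
		goto err_put_parent;	/* nothing allocated yet */
	}

	if (fail_register) {
		ret = -2;
		goto err_free_priv;
	}

	*out_priv = priv;		/* success: references are kept */
	return 0;

err_free_priv:
	free(priv);
err_put_parent:
	if (!bus)			/* only balance what was taken above */
		obj_put(parent);
	obj_put(node);
	return ret;
}
```

After either failure the refcounts are back at their starting values, whether
or not the caller passed its own bus.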

> ---
>  drivers/net/phy/mdio-mux.c | 12 +++-
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/phy/mdio-mux.c b/drivers/net/phy/mdio-mux.c
> index 963838d4fac1..6943c5ece44a 100644
> --- a/drivers/net/phy/mdio-mux.c
> +++ b/drivers/net/phy/mdio-mux.c
> @@ -122,10 +122,9 @@ int mdio_mux_init(struct device *dev,
>   pb = devm_kzalloc(dev, sizeof(*pb), GFP_KERNEL);
>   if (pb == NULL) {
>   ret_val = -ENOMEM;
> - goto err_parent_bus;
> + goto err_pb_kz;
>   }
>  
> -
>   pb->switch_data = data;
>   pb->switch_fn = switch_fn;
>   pb->current_child = -1;
> @@ -154,6 +153,7 @@ int mdio_mux_init(struct device *dev,
>   cb->mii_bus = mdiobus_alloc();
>   if (!cb->mii_bus) {
>   ret_val = -ENOMEM;
> + devm_kfree(dev, cb);
>   of_node_put(child_bus_node);
>   break;
>   }
> @@ -169,8 +169,8 @@ int mdio_mux_init(struct device *dev,
>   if (r) {
>   mdiobus_free(cb->mii_bus);
>   devm_kfree(dev, cb);
> + of_node_put(child_bus_node);
>   } else {
> - of_node_get(child_bus_node);
>   cb->next = pb->children;
>   pb->children = cb;
>   }
> @@ -181,9 +181,11 @@ int mdio_mux_init(struct device *dev,
>   return 0;
>   }
>  
> + devm_kfree(dev, pb);
> +err_pb_kz:
>   /* balance the reference of_mdio_find_bus() took */
> - put_device(&pb->mii_bus->dev);
> -
> + if (!mux_bus)
> + put_device(&parent_bus->dev);
>  err_parent_bus:
>   of_node_put(parent_bus_node);
>   return ret_val;
> 


-- 
Florian


Re: [PATCH 4.4 16/60] x86/ioapic: Restore IO-APIC irq_chip retrigger callback

2017-05-11 Thread Ben Hutchings
On Thu, 2017-05-11 at 16:12 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Ruslan Ruslichenko 
> 
> commit a9b4f08770b415f30f2fb0f8329a370c8f554aa3 upstream.
> 
> commit d32932d02e18 removed the irq_retrigger callback from the IO-APIC
> chip and did not add it to the new IO-APIC-IR irq chip.
> 
> There is no harm because the interrupts are resent in software when the
> retrigger callback is NULL, but it's less efficient. So restore them.

Sounds like it's not important enough for stable, then?

Ben.

> [ tglx: Massaged changelog ]
> 
> Fixes: d32932d02e18  ("x86/irq: Convert IOAPIC to use hierarchical irqdomain 
> interfaces")
> Signed-off-by: Ruslan Ruslichenko 
> Cc: xe-linux-exter...@cisco.com
> Link: 
> http://lkml.kernel.org/r/1484662432-13580-1-git-send-email-rrusl...@cisco.com
> Signed-off-by: Thomas Gleixner 
> Signed-off-by: Greg Kroah-Hartman 
> 
> ---
>  arch/x86/kernel/apic/io_apic.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -1875,6 +1875,7 @@ static struct irq_chip ioapic_chip __rea
>   .irq_ack= irq_chip_ack_parent,
>   .irq_eoi= ioapic_ack_level,
>   .irq_set_affinity   = ioapic_set_affinity,
> + .irq_retrigger  = irq_chip_retrigger_hierarchy,
>   .flags  = IRQCHIP_SKIP_SET_WAKE,
>  };
>  
> @@ -1886,6 +1887,7 @@ static struct irq_chip ioapic_ir_chip __
>   .irq_ack= irq_chip_ack_parent,
>   .irq_eoi= ioapic_ir_ack_level,
>   .irq_set_affinity   = ioapic_set_affinity,
> + .irq_retrigger  = irq_chip_retrigger_hierarchy,
>   .flags  = IRQCHIP_SKIP_SET_WAKE,
>  };
>  
> 
> 
> 

-- 
Ben Hutchings
Software Developer, Codethink Ltd.




[PATCHv2] Make initramfs honor CONFIG_DEVTMPFS_MOUNT

2017-05-11 Thread Rob Landley
From: Rob Landley 

Make initramfs honor CONFIG_DEVTMPFS_MOUNT, move /dev/console
open after devtmpfs mount, and update help text.

Signed-off-by: Rob Landley 
---

 drivers/base/Kconfig |   14 --
 init/main.c  |   15 +--
 2 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index d718ae4..74779ee 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -48,16 +48,10 @@ config DEVTMPFS_MOUNT
bool "Automount devtmpfs at /dev, after the kernel mounted the rootfs"
depends on DEVTMPFS
help
- This will instruct the kernel to automatically mount the
- devtmpfs filesystem at /dev, directly after the kernel has
- mounted the root filesystem. The behavior can be overridden
- with the commandline parameter: devtmpfs.mount=0|1.
- This option does not affect initramfs based booting, here
- the devtmpfs filesystem always needs to be mounted manually
- after the rootfs is mounted.
- With this option enabled, it allows to bring up a system in
- rescue mode with init=/bin/sh, even when the /dev directory
- on the rootfs is completely empty.
+ Automatically mount devtmpfs at /dev on the root filesystem, which
+ lets the system come up in rescue mode with [rd]init=/bin/sh.
+ Override with devtmpfs.mount=0 on the commandline. Initramfs can
+ create a /dev dir as needed, other rootfs needs the mount point.
 
 config STANDALONE
bool "Select only drivers that don't need compile-time external 
firmware"
diff --git a/init/main.c b/init/main.c
index f866510..9ec09ff 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1038,12 +1038,6 @@ static noinline void __init kernel_init_freeable(void)
 
do_basic_setup();
 
-   /* Open the /dev/console on the rootfs, this should never fail */
-   if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
-   pr_err("Warning: unable to open an initial console.\n");
-
-   (void) sys_dup(0);
-   (void) sys_dup(0);
/*
 * check if there is an early userspace init.  If yes, let it do all
 * the work
@@ -1055,8 +1049,17 @@ static noinline void __init kernel_init_freeable(void)
if (sys_access((const char __user *) ramdisk_execute_command, 0) != 0) {
ramdisk_execute_command = NULL;
prepare_namespace();
+   } else if (IS_ENABLED(CONFIG_DEVTMPFS_MOUNT)) {
+   sys_mkdir("/dev", 0755);
+   devtmpfs_mount("/dev");
}
 
+   /* Open the /dev/console on the rootfs, this should never fail */
+   if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
+   pr_err("Warning: unable to open an initial console.\n");
+   (void) sys_dup(0);
+   (void) sys_dup(0);
+
/*
 * Ok, we have completed the initial bootup, and
 * we're essentially up and running. Get rid of the


Re: [PATCH RFC] sched/deadline: Use the revised wakeup rule for suspending constrained dl tasks

2017-05-11 Thread Daniel Bristot de Oliveira
On 05/04/2017 04:17 PM, Peter Zijlstra wrote:
> On Mon, Apr 24, 2017 at 05:18:35PM +0200, Daniel Bristot de Oliveira wrote:
>> We have been facing some problems with self-suspending constrained
>> deadline tasks. The main reason is that the original CBS was not
>> designed for such sort of tasks.
>>
>> One problem reported by Xunlei Pang takes place when a task
>> suspends, and then is awakened before the deadline, but so close
>> to the deadline that its remaining runtime can cause the task
>> to have an absolute density higher than allowed. In such a situation,
>> the original CBS assumes that the task is facing an early activation,
>> and so it replenishes the task and sets another deadline, one deadline
>> in the future. This rule works fine for implicit deadline tasks.
>> Moreover, it allows the system to adapt the period of a task in which
>> the external event source suffered from a clock drift.
>>
>> However, this opens the window for bandwidth leakage for constrained
>> deadline tasks. For instance, a task with the following parameters:
>>
>>   runtime   = 5 ms
>>   deadline  = 7 ms
>>   [density] = 5 / 7 = 0.71
>>   period= 1000 ms
>>
>> If the task runs for 1 ms, and then suspends for another 1ms,
>> it will be awakened with the following parameters:
>>
>>   remaining runtime = 4
>>   laxity = 5
>>
>> presenting an absolute density of 4 / 5 = 0.80.
>>
>> In this case, the original CBS would assume the task had an early
>> wakeup. Then, CBS will reset the runtime, and the absolute deadline will
>> be postponed by one relative deadline, allowing the task to run.
>>
>> The problem is that, if the task runs this pattern forever, it will keep
>> receiving bandwidth, being able to run 1ms every 2ms. Following this
>> behavior, the task would be able to run 500 ms in 1 sec. Thus running
>> more than the 5 ms / 1 sec the admission control allowed it to run.
>>
>> Trying to address the self-suspending case, Luca Abeni, Giuseppe
>> Lipari, and Juri Lelli [1] revisited the CBS in order to deal with
>> self-suspending tasks. In the new approach, rather than
>> replenishing/postponing the absolute deadline, the revised wakeup rule
>> adjusts the remaining runtime, reducing it to fit into the allowed
>> density.
>>
>> A summarized version of the idea is:
>>
>> At a given time t, the maximum absolute density of a task cannot be
>> higher than its relative density, that is:
>>
>>   runtime / (deadline - t) <= dl_runtime / dl_deadline
>>
>> Knowing the laxity of a task (deadline - t), it is possible to move
>> it to the other side of the inequality, which defines the maximum
>> remaining runtime a task can use within the absolute deadline without
>> over-running the allowed density:
>>
>>   runtime = (dl_runtime / dl_deadline) * (deadline - t)
>>
>> For instance, in our previous example, the task could still run:
>>
>>   runtime = ( 5 / 7 ) * 5
>>   runtime = 3.57 ms
>>
>> Without causing damage to other deadline tasks. It is worth noting
>> that the laxity cannot be negative, because that would cause a
>> negative runtime. Thus, this patch depends on the patch:
>>
>>   edf5835 sched/deadline: Throttle a constrained deadline task activated
>>   after the deadline
> 
> My git tree says that is:
> 
> df8eac8cafce ("sched/deadline: Throttle a constrained deadline task activated 
> after the deadline")

Oops, you are right. I was using my own tree, sorry.

>> Which throttles a constrained deadline task activated after the
>> deadline.
>>
>> Finally, it is also possible to use the revised wakeup rule for
>> all other tasks, but that would require some more discussions
>> about pros and cons.
>>
>> Reported-by: Xunlei Pang 
>> Signed-off-by: Daniel Bristot de Oliveira 
>> Cc: Xunlei Pang 
>> Cc: Ingo Molnar 
>> Cc: Peter Zijlstra 
>> Cc: Juri Lelli 
>> Cc: Steven Rostedt 
>> Cc: Luca Abeni 
>> Cc: Tommaso Cucinotta 
>> Cc: Romulo Silva de Oliveira 
>> Cc: linux-kernel@vger.kernel.org
>> ---
>>  kernel/sched/deadline.c | 67 
>> +++--
>>  1 file changed, 60 insertions(+), 7 deletions(-)
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index a2ce590..71e5bcf 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -484,13 +484,63 @@ static bool dl_entity_overflow(struct sched_dl_entity 
>> *dl_se,
>>  }
>>  
>>  /*
>> + * Revised wakeup rule [1]: For self-suspending tasks, rather then
>> + * re-initializing task's runtime and deadline, the revised wakeup
>> + * rule adjusts the task's runtime to avoid the task to overrun its
>> + * density.
>> + *
>> + * Reasoning: a task may overrun the density if:
>> + *runtime / (deadline - t) > dl_runtime / dl_deadline
> 
> When reading that, I have the instant question: "why / how ?" I suspect
> the blurb below (at update_dl_entity) has the answer, if so this can use
> a reference thereto.

Yeah, I will connect the two comments.
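
The bound from the changelog, runtime = (dl_runtime / dl_deadline) *
(deadline - t), can be tried out in user space with the same 20-bit
fixed-point scaling the patch uses for the density. The function name and the
final multiply-and-shift step are assumptions made for this sketch; only the
density and laxity computations are visible in the quoted code.

```c
#include <stdint.h>

/* Fixed-point scale, taken from the "<< 20" in the quoted patch. */
#define DENSITY_SHIFT 20

/*
 * Hypothetical helper: maximum remaining runtime (in ns) that keeps
 * runtime / laxity <= dl_runtime / dl_deadline. All arguments are in
 * nanoseconds; "laxity" is the absolute deadline minus the current time
 * and is assumed to be non-negative.
 */
static uint64_t revised_runtime_ns(uint64_t dl_runtime, uint64_t dl_deadline,
				   uint64_t laxity)
{
	/* density = dl_runtime / dl_deadline, scaled by 2^20 */
	uint64_t density = (dl_runtime << DENSITY_SHIFT) / dl_deadline;

	/* scale the laxity by the density and drop the fixed-point bits */
	return (density * laxity) >> DENSITY_SHIFT;
}
```

For the example task (dl_runtime = 5 ms, dl_deadline = 7 ms) waking up with a
laxity of 5 ms, this gives a little under 3.57 ms. Note that the multiplier is
the laxity, not the 4 ms of remaining runtime, and that both integer divisions
round down, so the result never exceeds the exact bound.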

>> + *
>> + * Therefore, runtime can be adjusted to:
>> + 

Re: [PATCH v5 5/6] mtd: dataflash: Make use of "extened device information"

2017-05-11 Thread Brian Norris
On Fri, Apr 21, 2017 at 07:19:21PM +0200, Marek Vasut wrote:
> On 04/21/2017 06:30 PM, Andrey Smirnov wrote:
> > In anticipation of supporting chips that need it, extend the size of
> > > struct flash_info's 'jedec_id' field to make room for 2 bytes of extended
> > device information as well as add code to fetch this data during
> > jedec_probe().
> > 
> > Cc: cphe...@gmail.com
> > Cc: David Woodhouse 
> > Cc: Brian Norris 
> > Cc: Boris Brezillon 
> > Cc: Marek Vasut 
> > Cc: Richard Weinberger 
> > Cc: Cyrille Pitchen 
> > Cc: linux-kernel@vger.kernel.org
> > Acked-by: Marek Vasut 
> > Signed-off-by: Andrey Smirnov 
> > ---
> > 
> > Changes since [v4]:
> > 
> > - Corrected value of SUP_EXTID from BIT(3) to 0x0004
> > 
> > - Collected Acked-by from Marek
> 
> Super, entire series is great, thanks !

Applied to l2-mtd.git/next for 4.13


Re: [PATCH] Make initramfs honor CONFIG_DEVTMPFS_MOUNT

2017-05-11 Thread Rob Landley


On 05/09/2017 04:31 PM, Andrew Morton wrote:
> On Thu, 4 May 2017 16:09:06 -0500 Rob Landley  wrote:
> 
>> From: Rob Landley 
>>
>> Make initramfs honor CONFIG_DEVTMPFS_MOUNT, and move
>> /dev/console open after devtmpfs mount.
> 
> 
> Could we please see complete description of the runtime effects of this
> change?  How does it affect users?  How does it benefit users?

It makes the behavior consistent. If you're going to have the config
symbol anyway, why is initramfs a second class citizen?

That said, I was fixing a specific bug when I started the patch: when
you statically link in an initramfs by pointing the kernel build at a
directory (so it makes its own cpio archive from that), if you're not
running the build as root you can't create dev/console in there and
there's no obvious way to add nodes (like you can by editing the
gen_initramfs_list output).

This means there's no /dev/console when init gets launched, so PID 1's
stdin/stdout/stderr go nowhere, and until your init script can open its
own and redirect you get no output if something goes wrong, so debugging
is fiddly and there's a hole where output gets lost. Userspace can't
close that hole.

When making the patch I did a version that mounted /proc /sys and
/dev/pts too, so rdinit=/bin/sh had pretty much its full environment
without an init script just like the DEVTMPFS_MOUNT option's help text
implied... but that seemed unlikely to be accepted. The console gap is a
problem userspace can't fix, the rest userspace can, so I did the
minimal thing.
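
For reference, the userspace side of the console workaround described above might look roughly like the sketch below. It assumes /dev has already been populated (for example by mounting devtmpfs) before it runs, which is exactly the window the patch is about. The helper name reopen_stdio_on() is made up for illustration and is not part of the patch.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: point stdin/stdout/stderr at the given device node.
 * Returns 0 on success, -1 if the node cannot be opened or duplicated.
 */
int reopen_stdio_on(const char *path)
{
	int fd = open(path, O_RDWR);
	int i;

	if (fd < 0)
		return -1;	/* e.g. no /dev/console in the initramfs yet */

	for (i = 0; i <= 2; i++) {
		if (dup2(fd, i) < 0) {
			if (fd > 2)
				close(fd);
			return -1;
		}
	}

	/* Drop the extra descriptor unless it already is one of 0..2. */
	if (fd > 2)
		close(fd);
	return 0;
}
```

An init program would call reopen_stdio_on("/dev/console") as early as it can. Anything it prints before that call succeeds goes nowhere, which is the hole userspace cannot close on its own.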

> The DEVTMPFS_MOUNT Kconfig help (drivers/base/Kconfig) says:
> 
> This option does not affect initramfs based booting, here
> the devtmpfs filesystem always needs to be mounted manually
> after the rootfs is mounted.
> 
> which seems to no longer be correct?

Ah, sorry. I rewrote the help text and didn't include that file in the
diff. And rechecking I see the override part wasn't implemented by my
patch, I'll send a new one.

Rob


Re: [PATCH v7 3/5] test: add new driver_data load tester

2017-05-11 Thread Luis R. Rodriguez
On Thu, May 11, 2017 at 07:10:18PM +0900, AKASHI Takahiro wrote:
> Luis,
> 
> On Tue, May 02, 2017 at 01:49:12AM -0700, Luis R. Rodriguez wrote:
> > 
> > diff --git a/lib/test_driver_data.c b/lib/test_driver_data.c
> > new file mode 100644
> > index ..488cc6e9eed4
> > --- /dev/null
> > +++ b/lib/test_driver_data.c
> 
>   ...
> 
> > +static int trigger_config_sync(struct driver_data_test_device *test_dev)
> > +{
> > +   struct test_config *config = &test_dev->config;
> > +   int ret;
> > +   const struct driver_data_req_params req_params_default = {
> > +   DRIVER_DATA_DEFAULT_SYNC_REQS(config_sync_req_cb, test_dev,
> > + DRIVER_DATA_REQ_OPTIONAL |
> > + DRIVER_DATA_REQ_KEEP)
> 
> Are these flags always on?

Ah no, indeed they are conditional on the config as with the others.

With this, kmemleak on the test driver is back to squeaky clean. I had
failed to test with kmemleak on the test driver after these changes;
sorry, and thanks for picking this up.

> > +void free_test_dev_driver_data(struct driver_data_test_device *test_dev)
> > +{
> > +   kfree_const(test_dev->misc_dev.name);
> > +   test_dev->misc_dev.name = NULL;
> > +   vfree(test_dev);
> > +   test_dev = NULL;
> > +   driver_data_config_free(test_dev);
> 
> Removing this test module fails.
> 
> The last three lines should be:
>   driver_data_config_free(test_dev);
>   vfree(test_dev);

Fixed, thanks!
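
The ordering problem pointed out above can be shown with a small userspace model. Everything below is a stand-in invented for this sketch: plain malloc()/free() replace vfree()/kfree_const(), and the struct and function names are hypothetical rather than the test driver's real ones. The point is only that whatever hangs off the device is released while the device pointer is still valid, and the container goes last.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in types for this sketch; not the test driver's real structures. */
struct model_config {
	char *name;
};

struct model_dev {
	struct model_config config;
	char *misc_name;
};

static char *dup_str(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

struct model_dev *model_dev_alloc(const char *misc_name, const char *cfg_name)
{
	struct model_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->misc_name = dup_str(misc_name);
	dev->config.name = dup_str(cfg_name);
	if (!dev->misc_name || !dev->config.name) {
		free(dev->misc_name);
		free(dev->config.name);
		free(dev);
		return NULL;
	}
	return dev;
}

/* Releases what the config owns; needs a still-valid 'dev'. */
static int model_config_free(struct model_dev *dev)
{
	free(dev->config.name);
	dev->config.name = NULL;
	return 1;
}

/* Returns the number of allocations released (0 for a NULL device). */
int model_dev_free(struct model_dev *dev)
{
	int released = 0;

	if (!dev)
		return 0;
	free(dev->misc_name);
	dev->misc_name = NULL;
	released++;
	released += model_config_free(dev);	/* before the container goes away */
	free(dev);				/* container last */
	return released + 1;
}
```

Setting the pointer to NULL before handing it to the config free routine, as the quoted version did, would turn the last step into a NULL dereference, and freeing the container first would make it a use-after-free.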

> > +}
> > +
> > +void unregister_test_dev_driver_data(struct driver_data_test_device *test_dev)
> > +{
> > +   wait_for_completion_timeout(&test_dev->request_complete, 5 * HZ);
> > +   dev_info(test_dev->dev, "removing interface\n");
> > +   misc_deregister(&test_dev->misc_dev);
> > +   kfree(&test_dev->misc_dev.name);
> 
> Don't need this kfree().

Indeed, thanks!

  Luis


Re: pinctrl-sx150x.c broken in 4.11

2017-05-11 Thread Nikita Yushchenko
>>> Hmm maybe yeah. I don't quite follow the above the "pinctrl-0 property
>>> of sx150x device tree node, is misinterpreted as hog" part though.
>>
>> sx150x is i2c-gpio device.  It has 16 GPIO lines that are communicated
>> with via i2c bus, and an interrupt line.
>>
>> Interrupt line is typically connected to SoC's pin.
>> This pin has to be configured.
>> This is done by providing appropriate subnode in SoC's pinmux node, with
>> information with pin configuration, and pinctrl-0 property in sx150x's
>> node with phandle to that subnode:
>>
>> ...
>>  {
>>  sx1503@20 {
>>  compatible = "semtech,sx1503q";
>>  pinctrl-names = "default";
>>  pinctrl-0 = <&pinctrl_sx1503_20>;
>>  ...
>>  };
>> };
>> ...
>>  {
>>  pinctrl_sx1503_20: pinctrl-sx1503-20 {
>>  fsl,pins = <
>>  VF610_PAD_PTB1__GPIO_23 0x219d
>>  >;
>>  };
>> };
>>
>> This pin configuration is handled by driver core, i.e. before probe()
>> for sx150x is called, core applies pin configuration.
>>
>> However sx150x driver is currently implemented as a pinctrl driver.
>>
>> When it initializes, pinctrl searches for "hog", i.e. pin config that
>> should be applied at driver registration time.
>>
>> While doing so, core searches for any registered pinctrl_map for device
>> being registered. Search loop is in create_pinctrl().
>>
>> In this case, this loop finds map that is defined above.
>>
>> This is *not* hog.  This is pin setting already applied in SoC's pinmux
>> controller for sx1503 device.
>>
>> However code in create_pinctrl() tries to apply it, and use sx1503's
>> methods to do so. Which is plain wrong and errors out.
> 
> Maybe create_pinctrl() could check if the pin controller device
> for a potential hog points to the device itself and bail out
> if that's not the case?

Well, that's exactly what the patch from my first mail in this thread does.
It does fix my case, but I don't know whether it is correct in the generic
case.
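
To make the idea concrete, here is a userspace model of the check being discussed. The structure is a trimmed-down stand-in and not the kernel's struct pinctrl_map (only the field names dev_name and ctrl_dev_name mirror it), and the helper names are invented for this sketch. It treats a map as a hog only when the setting is for the device itself and the device is also the controller that applies it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a pin map entry; hypothetical, for illustration. */
struct model_map {
	const char *dev_name;		/* device the setting is for */
	const char *ctrl_dev_name;	/* pin controller that applies it */
};

/* A map is a hog for 'self' only if 'self' is both consumer and controller. */
bool map_is_hog_for(const char *self, const struct model_map *map)
{
	return strcmp(map->dev_name, self) == 0 &&
	       strcmp(map->ctrl_dev_name, self) == 0;
}

/* Count the maps that would be claimed as hogs when 'self' registers. */
size_t count_hogs(const char *self, const struct model_map *maps, size_t n)
{
	size_t i, hogs = 0;

	for (i = 0; i < n; i++) {
		if (!map_is_hog_for(self, &maps[i]))
			continue;	/* served by another controller: skip */
		hogs++;
	}
	return hogs;
}
```

With this rule, the interrupt pin setting for the sx1503 that is served by the SoC's pinmux controller is skipped, while a setting the sx1503 applies to its own pins would still be claimed.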

Should I submit it? Do you ack?

Nikita


Re: [PATCH v4] sd: Ignore sync cache failures when not supported

2017-05-11 Thread Ewan D. Milne
On Thu, 2017-05-11 at 14:34 +0200, Thierry Escande wrote:
> From: Derek Basehore 
> 
> Some external hard drives don't support the sync command even though the
> hard drive has write cache enabled. In this case, upon suspend request,
> sync cache failures are ignored if the error code in the sense header is
> ILLEGAL_REQUEST. There's not much we can do for these drives, so we
> shouldn't fail to suspend for this error case. The drive may stay
> powered if that's the setup for the port it's plugged into.
> 
> Signed-off-by: Derek Basehore 
> Signed-off-by: Thierry Escande 
> ---
> 
> v4 changes:
> - Check sense header validity before checking the sense_key field
> - Get rid of both goto statements (the one in the previous patch and the
>   one in the existing code)
> 
> v3 changes:
> - Pass the sense_hdr structure to sd_sync_cache() instead of the
>   lonely sense_key field
> 
> v2 changes:
> - Change sense_key type to u8 in sd_sync_cache()
> 
>  drivers/scsi/sd.c | 40 
>  1 file changed, 28 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
> index fcfeddc..823ab8b 100644
> --- a/drivers/scsi/sd.c
> +++ b/drivers/scsi/sd.c
> @@ -1489,17 +1489,21 @@ static unsigned int sd_check_events(struct gendisk 
> *disk, unsigned int clearing)
>   return retval;
>  }
>  
> -static int sd_sync_cache(struct scsi_disk *sdkp)
> +static int sd_sync_cache(struct scsi_disk *sdkp, struct scsi_sense_hdr 
> *sshdr)
>  {
>   int retries, res;
>   struct scsi_device *sdp = sdkp->device;
>   const int timeout = sdp->request_queue->rq_timeout
>   * SD_FLUSH_TIMEOUT_MULTIPLIER;
> - struct scsi_sense_hdr sshdr;
> + struct scsi_sense_hdr my_sshdr;
>  
>   if (!scsi_device_online(sdp))
>   return -ENODEV;
>  
> + /* caller might not be interested in sense, but we need it */
> + if (!sshdr)
> + sshdr = &my_sshdr;
> +
>   for (retries = 3; retries > 0; --retries) {
>   unsigned char cmd[10] = { 0 };
>  
> @@ -1508,7 +1512,7 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
>* Leave the rest of the command zero to indicate
>* flush everything.
>*/
> - res = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, &sshdr,
> + res = scsi_execute(sdp, cmd, DMA_NONE, NULL, 0, NULL, sshdr,
>   timeout, SD_MAX_RETRIES, 0, RQF_PM, NULL);
>   if (res == 0)
>   break;
> @@ -1518,11 +1522,12 @@ static int sd_sync_cache(struct scsi_disk *sdkp)
>   sd_print_result(sdkp, "Synchronize Cache(10) failed", res);
>  
>   if (driver_byte(res) & DRIVER_SENSE)
> - sd_print_sense_hdr(sdkp, &sshdr);
> + sd_print_sense_hdr(sdkp, sshdr);
> +
>   /* we need to evaluate the error return  */
> - if (scsi_sense_valid(&sshdr) &&
> - (sshdr.asc == 0x3a ||   /* medium not present */
> -  sshdr.asc == 0x20))/* invalid command */
> + if (scsi_sense_valid(sshdr) &&
> + (sshdr->asc == 0x3a ||  /* medium not present */
> +  sshdr->asc == 0x20))   /* invalid command */
>   /* this is no error here */
>   return 0;
>  
> @@ -3323,7 +3328,7 @@ static void sd_shutdown(struct device *dev)
>  
>   if (sdkp->WCE && sdkp->media_present) {
>   sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
> - sd_sync_cache(sdkp);
> + sd_sync_cache(sdkp, NULL);
>   }
>  
>   if (system_state != SYSTEM_RESTART && sdkp->device->manage_start_stop) {
> @@ -3335,6 +3340,7 @@ static void sd_shutdown(struct device *dev)
>  static int sd_suspend_common(struct device *dev, bool ignore_stop_errors)
>  {
>   struct scsi_disk *sdkp = dev_get_drvdata(dev);
> + struct scsi_sense_hdr sshdr;
>   int ret = 0;
>  
>   if (!sdkp)  /* E.g.: runtime suspend following sd_remove() */
> @@ -3342,12 +3348,23 @@ static int sd_suspend_common(struct device *dev, bool 
> ignore_stop_errors)
>  
>   if (sdkp->WCE && sdkp->media_present) {
>   sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
> - ret = sd_sync_cache(sdkp);
> + ret = sd_sync_cache(sdkp, &sshdr);
> +
>   if (ret) {
>   /* ignore OFFLINE device */
>   if (ret == -ENODEV)
> - ret = 0;
> - goto done;
> + return 0;
> +
> + if (!scsi_sense_valid(&sshdr) ||
> + sshdr.sense_key != ILLEGAL_REQUEST)
> + return ret;
> +
> + /*
> +  * sshdr.sense_key == ILLEGAL_REQUEST means this drive
> +  * doesn't 
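
Since the quoted hunk is cut off here, the decision it implements can be summarized with a small userspace model. This is a sketch under assumptions: the struct and function names are hypothetical, and the final branch (treating ILLEGAL_REQUEST as non-fatal) is inferred from the commit message rather than from the truncated code. ILLEGAL_REQUEST is the SCSI sense key 0x05.

```c
#include <errno.h>
#include <stdbool.h>

#define ILLEGAL_REQUEST 0x05	/* SCSI sense key */

/* Hypothetical, reduced view of a decoded sense header. */
struct model_sense {
	bool valid;			/* was usable sense data returned? */
	unsigned char sense_key;
};

/*
 * Given the result of a sync-cache attempt, return what a suspend path
 * following the commit message would propagate: 0 to carry on, or the
 * original error to abort the suspend.
 */
int suspend_result_after_sync(int ret, const struct model_sense *sense)
{
	if (ret == 0)
		return 0;		/* cache flushed */
	if (ret == -ENODEV)
		return 0;		/* offline device: ignore */
	if (!sense->valid || sense->sense_key != ILLEGAL_REQUEST)
		return ret;		/* genuine failure: report it */
	return 0;			/* drive rejects the command: don't block suspend */
}
```

Checking sense->valid before looking at sense_key matches the v4 change note about validating the sense header first.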

Re: vmbus: Delete an error message for a failed memory allocation in vmbus_device_create()

2017-05-11 Thread Stephen Hemminger
On Thu, 11 May 2017 18:36:44 +0200
SF Markus Elfring  wrote:

> > Taking out the message assumes that all callers of this function either log an
> > error or pass appropriate error code back to userspace.
> 
> Do you like the default error response by Linux memory allocation functions?

The default error message only helps a little.
I doubt this will ever fail anyway, since it is only allocated at boot.


Re: [PATCH] net: ethernet: ti: netcp_core: return error while dma channel open issue

2017-05-11 Thread David Miller
From: Ivan Khoronzhuk 
Date: Wed, 10 May 2017 10:28:05 -0700

> Fix error path while dma open channel issue. Also, no need to check output
> on NULL if it's never returned.
> 
> Signed-off-by: Ivan Khoronzhuk 

Applied.


Re: vmbus: Delete an error message for a failed memory allocation in vmbus_device_create()

2017-05-11 Thread SF Markus Elfring
> Taking out the message assumes that all callers of this function either log an
> error or pass appropriate error code back to userspace.

Do you like the default error response by Linux memory allocation functions?

Regards,
Markus


Re: pull request: linux-firmware: update cxgb4 firmware

2017-05-11 Thread Kyle McMartin
On Wed, Apr 26, 2017 at 08:06:53PM +0530, Ganesh Goudar wrote:
> Hi,
> 
> Kindly pull the new firmware from the following URL.
>   git://git.chelsio.net/pub/git/linux-firmware.git for-upstream
> 
> Thanks,
> Ganesh
> 
> The following changes since commit 766da91d4831792540451403ad349e2ece368f84:
> 
>   cxgb4: update firmware to revision 1.16.43.0 (2017-04-26 07:17:50 -0700)
> 
> are available in the git repository at:
> 
>   git://git.chelsio.net/pub/git/linux-firmware.git for-upstream
> 

Pulled, thanks Ganesh.

regards, Kyle

> for you to fetch changes up to 766da91d4831792540451403ad349e2ece368f84:
> 
>   cxgb4: update firmware to revision 1.16.43.0 (2017-04-26 07:17:50 -0700)
> 
> 


Re: pinctrl-sx150x.c broken in 4.11

2017-05-11 Thread Tony Lindgren
* Nikita Yushchenko  [170511 09:27]:
> > Hmm maybe yeah. I don't quite follow the above the "pinctrl-0 property
> > of sx150x device tree node, is misinterpreted as hog" part though.
> 
> sx150x is i2c-gpio device.  It has 16 GPIO lines that are communicated
> with via i2c bus, and an interrupt line.
> 
> Interrupt line is typically connected to SoC's pin.
> This pin has to be configured.
> This is done by providing appropriate subnode in SoC's pinmux node, with
> information with pin configuration, and pinctrl-0 property in sx150x's
> node with phandle to that subnode:
> 
> ...
>  {
>   sx1503@20 {
>   compatible = "semtech,sx1503q";
>   pinctrl-names = "default";
>   pinctrl-0 = <&pinctrl_sx1503_20>;
>   ...
>   };
> };
> ...
>  {
>   pinctrl_sx1503_20: pinctrl-sx1503-20 {
>   fsl,pins = <
>   VF610_PAD_PTB1__GPIO_23 0x219d
>   >;
>   };
> };
> 
> This pin configuration is handled by driver core, i.e. before probe()
> for sx150x is called, core applies pin configuration.
> 
> However sx150x driver is currently implemented as a pinctrl driver.
> 
> When it initializes, pinctrl searches for "hog", i.e. pin config that
> should be applied at driver registration time.
> 
> While doing so, core searches for any registered pinctrl_map for device
> being registered. Search loop is in create_pinctrl().
> 
> In this case, this loop finds map that is defined above.
> 
> This is *not* hog.  This is pin setting already applied in SoC's pinmux
> controller for sx1503 device.
> 
> However code in create_pinctrl() tries to apply it, and use sx1503's
> methods to do so. Which is plain wrong and errors out.

Maybe create_pinctrl() could check if the pin controller device
for a potential hog points to the device itself and bail out
if that's not the case?

> > But at least with updating the probe to use pinctrl_register_and_init()
> > and pinctrl_enable() the driver can do something before the hogs are
> > claimed. I just don't know what the driver would do here as I don't
> > understand the "misinterpreted as hog" part :)
> 
> Tried to explain above :)

Yup OK based on that this seems like a pinctrl core issue.

Regards,

Tony


Re: [PATCH 4/4] vmbus: Adjust five checks for null pointers

2017-05-11 Thread Stephen Hemminger
On Thu, 11 May 2017 18:19:21 +0200
SF Markus Elfring  wrote:

> From: Markus Elfring 
> Date: Thu, 11 May 2017 17:52:38 +0200
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> The script “checkpatch.pl” pointed information out like the following.
> 
> Comparison to NULL could be written …
> 
> Thus fix the affected source code places.
> 
> Signed-off-by: Markus Elfring 

Please don't do this kind of checkpatch "fix ups" on existing code.
The comparison with NULL is fine; doing this is just useless churn.



Re: [PATCH 3/4] vmbus: Fix a typo in a comment line

2017-05-11 Thread Stephen Hemminger
On Thu, 11 May 2017 18:18:12 +0200
SF Markus Elfring  wrote:

> From: Markus Elfring 
> Date: Thu, 11 May 2017 17:43:55 +0200
> 
> Add a missing character in this description.
> 
> Signed-off-by: Markus Elfring 
> ---
>  drivers/hv/vmbus_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 96328aebae5a..ff94b111ed8d 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -991,7 +991,7 @@ static void vmbus_isr(void)
>   /*
>* Our host is win8 or above. The signaling mechanism
>* has changed and we can directly look at the event page.
> -  * If bit n is set then we have an interrup on the channel
> +  * If bit n is set then we have an interrupt on the channel
>* whose id is n.
>*/
>   handled = true;


Acked-by: Stephen Hemminger 


[RFC 07/11] ima: new namespace policy structure to track initial namespace policy data

2017-05-11 Thread Guilherme Magalhaes
Add the global ima_initial_namespace_policy, which is used when the initial
namespace IMA policy data must be referenced or when
CONFIG_IMA_PER_NAMESPACE is not defined.
Add new functions to retrieve the correct namespace IMA policy data from the
radix tree map or from ima_initial_namespace_policy.
If the given namespace has not yet defined a private IMA policy, the IMA
policy for that namespace falls back to the initial IMA policy by using
ima_initial_namespace_policy.
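
The fallback rule described above can be modeled in a few lines of userspace C. This is only a sketch: a flat array stands in for the radix tree, locking is left out, and every name below (model_policy, model_entry, model_initial_policy, policy_for_namespace) is invented for illustration and does not appear in the patch.

```c
#include <stddef.h>

/* Hypothetical, reduced policy: one field is enough to tell policies apart. */
struct model_policy {
	int ima_appraise;
};

/* One slot of the namespace-id to policy map (array instead of radix tree). */
struct model_entry {
	unsigned int ns_id;
	struct model_policy *policy;	/* NULL: no private policy defined yet */
};

/* Stand-in for the initial namespace policy that every lookup can fall back to. */
struct model_policy model_initial_policy = { .ima_appraise = 1 };

/* Return the private policy for ns_id, or the initial policy if none exists. */
struct model_policy *policy_for_namespace(const struct model_entry *map,
					  size_t n, unsigned int ns_id)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (map[i].ns_id == ns_id && map[i].policy)
			return map[i].policy;
	}
	return &model_initial_policy;	/* fallback described above */
}
```

A lookup therefore never returns NULL: callers always get some policy to evaluate, either the namespace's own or the initial one.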

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima.h|   6 ++
 security/integrity/ima/ima_fs.c | 112 +---
 security/integrity/ima/ima_policy.c |  72 +++
 3 files changed, 170 insertions(+), 20 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 1c5c875..20b927e 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -150,6 +150,7 @@ struct ima_ns_policy {
int ima_appraise;
 };
 
+extern struct ima_ns_policy ima_initial_namespace_policy;
 #ifdef CONFIG_IMA_PER_NAMESPACE
 extern spinlock_t ima_ns_policy_lock;
 extern struct radix_tree_root ima_ns_policy_mapping;
@@ -203,6 +204,11 @@ static inline void ima_namespace_unlock(void) {
 }
 #endif
 
+/* IMA namespace function definitions */
+struct ima_ns_policy *ima_get_current_namespace_policy(void);
+struct ima_ns_policy *ima_get_namespace_policy_from_inode(struct inode *inode);
+struct ima_ns_policy *ima_get_policy_from_namespace(unsigned int ns_id);
+
 /*
  * used to protect h_table and sha_table
  */
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 56ba0ff..61f8da1 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -274,6 +274,22 @@ static const struct file_operations 
ima_ascii_measurements_ops = {
.release = seq_release,
 };
 
+static struct dentry *ima_dir;
+static struct dentry *binary_runtime_measurements;
+static struct dentry *ascii_runtime_measurements;
+static struct dentry *runtime_measurements_count;
+static struct dentry *violations;
+static struct dentry *ima_policy_initial_ns;
+#ifdef CONFIG_IMA_PER_NAMESPACE
+static struct dentry *ima_namespaces;
+#endif
+
+enum ima_fs_flags {
+   IMA_FS_BUSY,
+};
+
+static unsigned long ima_fs_flags;
+
 #ifdef CONFIG_IMA_PER_NAMESPACE
 /* used for namespace policy rules initialization */
 static LIST_HEAD(empty_policy);
@@ -348,6 +364,76 @@ static int check_mntns(unsigned int ns_id)
 
return result;
 }
+
+/*
+ * ima_find_namespace_id_from_inode
+ * @policy_inode: the inode of the securityfs policy file for a given
+ * namespace
+ *
+ * Return 0 if the namespace id is not found in ima_ns_policy_mapping
+ */
+static unsigned int find_namespace_id_from_inode(struct inode *policy_inode)
+{
+   unsigned int ns_id = 0;
+#ifdef CONFIG_IMA_PER_NAMESPACE
+   struct ima_ns_policy *ins;
+   void **slot;
+   struct radix_tree_iter iter;
+
+   rcu_read_lock();
+   radix_tree_for_each_slot(slot, _ns_policy_mapping, , 0) {
+   ins = radix_tree_deref_slot(slot);
+   if (unlikely(!ins))
+   continue;
+   if (radix_tree_deref_retry(ins)) {
+   slot = radix_tree_iter_retry();
+   continue;
+   }
+
+   if (ins->policy_dentry && ins->policy_dentry->d_inode == 
policy_inode) {
+   ns_id = iter.index;
+   break;
+   }
+   }
+   rcu_read_unlock();
+#endif
+
+   return ns_id;
+}
+
+/*
+ * get_namespace_policy_from_inode - Finds namespace mapping from
+ * securityfs policy file
+ * It is called to get the namespace policy reference when a seurityfs
+ * file such as the namespace or policy files are read or written.
+ * @inode: inode of the securityfs policy file under a namespace
+ * folder
+ * Expects the ima_ns_policy_lock already held
+ *
+ * Returns NULL if the namespace policy reference is not reliable once it
+ * probably was already released after a concurrent namespace release.
+ * Otherwise, the namespace policy reference is returned.
+ */
+struct ima_ns_policy *ima_get_namespace_policy_from_inode(struct inode *inode)
+{
+   unsigned int ns_id;
+   struct ima_ns_policy *ins;
+
+   ns_id = find_namespace_id_from_inode(inode);
+#ifdef CONFIG_IMA_PER_NAMESPACE
+   if (ns_id == 0 &&
+   (!ima_policy_initial_ns || inode != 
ima_policy_initial_ns->d_inode)) {
+   /* ns_id == 0 refers to initial namespace, but inode refers to a
+* namespaced policy file. It might be a race condition with
+* namespace release, return invalid reference. */
+   return NULL;
+   }
+#endif
+
+   ins = ima_get_policy_from_namespace(ns_id);
+
+   return ins;
+}
 #endif
 
 static ssize_t 

[RFC 07/11] ima: new namespace policy structure to track initial namespace policy data

2017-05-11 Thread Guilherme Magalhaes
Add the global ima_initial_namespace_policy, which is used when the
initial namespace IMA policy data must be referenced or when
CONFIG_IMA_PER_NAMESPACE is not defined.
Add new functions to retrieve the correct namespace IMA policy data from
the radix tree map or from ima_initial_namespace_policy.
If the given namespace has not yet defined a private IMA policy, the IMA
policy for that namespace falls back to the initial IMA policy by using
ima_initial_namespace_policy.

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima.h        |   6 ++
 security/integrity/ima/ima_fs.c     | 112 +---
 security/integrity/ima/ima_policy.c |  72 +++
 3 files changed, 170 insertions(+), 20 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 1c5c875..20b927e 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -150,6 +150,7 @@ struct ima_ns_policy {
int ima_appraise;
 };
 
+extern struct ima_ns_policy ima_initial_namespace_policy;
 #ifdef CONFIG_IMA_PER_NAMESPACE
 extern spinlock_t ima_ns_policy_lock;
 extern struct radix_tree_root ima_ns_policy_mapping;
@@ -203,6 +204,11 @@ static inline void ima_namespace_unlock(void) {
 }
 #endif
 
+/* IMA namespace function definitions */
+struct ima_ns_policy *ima_get_current_namespace_policy(void);
+struct ima_ns_policy *ima_get_namespace_policy_from_inode(struct inode *inode);
+struct ima_ns_policy *ima_get_policy_from_namespace(unsigned int ns_id);
+
 /*
  * used to protect h_table and sha_table
  */
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 56ba0ff..61f8da1 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -274,6 +274,22 @@ static const struct file_operations ima_ascii_measurements_ops = {
.release = seq_release,
 };
 
+static struct dentry *ima_dir;
+static struct dentry *binary_runtime_measurements;
+static struct dentry *ascii_runtime_measurements;
+static struct dentry *runtime_measurements_count;
+static struct dentry *violations;
+static struct dentry *ima_policy_initial_ns;
+#ifdef CONFIG_IMA_PER_NAMESPACE
+static struct dentry *ima_namespaces;
+#endif
+
+enum ima_fs_flags {
+   IMA_FS_BUSY,
+};
+
+static unsigned long ima_fs_flags;
+
 #ifdef CONFIG_IMA_PER_NAMESPACE
 /* used for namespace policy rules initialization */
 static LIST_HEAD(empty_policy);
@@ -348,6 +364,76 @@ static int check_mntns(unsigned int ns_id)
 
return result;
 }
+
+/*
+ * ima_find_namespace_id_from_inode
+ * @policy_inode: the inode of the securityfs policy file for a given
+ * namespace
+ *
+ * Return 0 if the namespace id is not found in ima_ns_policy_mapping
+ */
+static unsigned int find_namespace_id_from_inode(struct inode *policy_inode)
+{
+   unsigned int ns_id = 0;
+#ifdef CONFIG_IMA_PER_NAMESPACE
+   struct ima_ns_policy *ins;
+   void **slot;
+   struct radix_tree_iter iter;
+
+   rcu_read_lock();
+   radix_tree_for_each_slot(slot, &ima_ns_policy_mapping, &iter, 0) {
+   ins = radix_tree_deref_slot(slot);
+   if (unlikely(!ins))
+   continue;
+   if (radix_tree_deref_retry(ins)) {
+   slot = radix_tree_iter_retry(&iter);
+   continue;
+   }
+
+   if (ins->policy_dentry && ins->policy_dentry->d_inode == policy_inode) {
+   ns_id = iter.index;
+   break;
+   }
+   }
+   rcu_read_unlock();
+#endif
+
+   return ns_id;
+}
+
+/*
+ * get_namespace_policy_from_inode - Finds namespace mapping from
+ * securityfs policy file
+ * It is called to get the namespace policy reference when a securityfs
+ * file such as the namespace or policy file is read or written.
+ * @inode: inode of the securityfs policy file under a namespace
+ * folder
+ * Expects the ima_ns_policy_lock already held
+ *
+ * Returns NULL if the namespace policy reference is not reliable because it
+ * was probably already released by a concurrent namespace release.
+ * Otherwise, the namespace policy reference is returned.
+ */
+struct ima_ns_policy *ima_get_namespace_policy_from_inode(struct inode *inode)
+{
+   unsigned int ns_id;
+   struct ima_ns_policy *ins;
+
+   ns_id = find_namespace_id_from_inode(inode);
+#ifdef CONFIG_IMA_PER_NAMESPACE
+   if (ns_id == 0 &&
+   (!ima_policy_initial_ns || inode != ima_policy_initial_ns->d_inode)) {
+   /* ns_id == 0 refers to initial namespace, but inode refers to a
+* namespaced policy file. It might be a race condition with
+* namespace release, return invalid reference. */
+   return NULL;
+   }
+#endif
+
+   ins = ima_get_policy_from_namespace(ns_id);
+
+   return ins;
+}
 #endif
 
 static ssize_t ima_read_policy(char *path)
@@ -439,22 
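
The fallback described in this patch can be sketched in user space roughly as follows. This is an illustration only, not code from the series: the demo_* names, the fixed table standing in for the radix tree map, and the sample ids and flag values are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified policy record; the struct in the patch has more fields. */
struct demo_ns_policy {
	unsigned int ns_id;
	int policy_flag;
};

/* Policy used for the initial namespace and as the fallback. */
static struct demo_ns_policy demo_initial_policy = { 0, 1 };

/* Small fixed table standing in for the radix tree map (assumption). */
static struct demo_ns_policy demo_private_policies[] = {
	{ 4026531999u, 7 },
	{ 4026532001u, 3 },
};

/* Return the private policy for ns_id, or the initial policy if none exists. */
static struct demo_ns_policy *demo_get_policy(unsigned int ns_id)
{
	size_t n = sizeof(demo_private_policies) / sizeof(demo_private_policies[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (demo_private_policies[i].ns_id == ns_id)
			return &demo_private_policies[i];
	}

	/* No private policy defined for this namespace: fall back. */
	return &demo_initial_policy;
}
```

The caller never sees a NULL policy in this sketch; the separate NULL return in the patch covers the race with namespace release, which is not modelled here.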

[RFC 06/11] ima, fs: release namespace policy resources

2017-05-11 Thread Guilherme Magalhaes
Release all namespace IMA policy resources when the mount namespace is
released.
This is the suggested mechanism to release namespace policy resources,
but we can still discuss other methods that avoid cross-component changes.

Signed-off-by: Guilherme Magalhaes 
---
 fs/namespace.c                  |  4 
 include/linux/integrity.h       |  9 +
 security/integrity/ima/ima_fs.c | 26 ++
 3 files changed, 39 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index cc1375ef..80940998 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include /* init_rootfs */
@@ -3283,6 +3284,9 @@ void put_mnt_ns(struct mnt_namespace *ns)
 {
if (!atomic_dec_and_test(&ns->count))
return;
+
+   ima_mnt_namespace_dying(ns->ns.inum);
+
drop_collected_mounts(&ns->root->mnt);
free_mnt_ns(ns);
 }
diff --git a/include/linux/integrity.h b/include/linux/integrity.h
index c2d6082..034d082 100644
--- a/include/linux/integrity.h
+++ b/include/linux/integrity.h
@@ -43,4 +43,13 @@ static inline void integrity_load_keys(void)
 }
 #endif /* CONFIG_INTEGRITY */
 
+#ifdef CONFIG_IMA_PER_NAMESPACE
+extern void ima_mnt_namespace_dying(unsigned int ns_id);
+#else
+static inline void ima_mnt_namespace_dying(unsigned int ns_id)
+{
+   return;
+}
+#endif /* CONFIG_IMA_PER_NAMESPACE */
+
 #endif /* _LINUX_INTEGRITY_H */
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index ce6dcdf..56ba0ff 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -423,6 +423,7 @@ static ssize_t ima_write_policy(struct file *file, const char __user *buf,
integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
"policy_update", "signed policy required",
1, 0);
+
if (ima_appraise & IMA_APPRAISE_ENFORCE)
result = -EACCES;
} else {
@@ -579,6 +580,31 @@ static int create_mnt_ns_directory(unsigned int ns_id)
return result;
 }
 
+/*
+ * ima_mnt_namespace_dying - releases all namespace policy resources
+ * It is called automatically when the namespace is released.
+ * @ns_id namespace id to be released
+ *
+ * Note: This function is called by put_mnt_ns() in the context
+ * of a namespace release. We need to make sure that a lock on
+ * this path is allowed.
+ */
+void ima_mnt_namespace_dying(unsigned int ns_id)
+{
+   struct ima_ns_policy *p;
+
+   spin_lock(&ima_ns_policy_lock);
+   p = radix_tree_delete(&ima_ns_policy_mapping, ns_id);
+
+   if (!p) {
+   spin_unlock(&ima_ns_policy_lock);
+   return;
+   }
+
+   free_namespace_policy(p);
+   spin_unlock(&ima_ns_policy_lock);
+}
+
 static ssize_t handle_new_namespace_policy(const char *data, size_t datalen)
 {
unsigned int ns_id;
-- 
2.7.4
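
The release path in this patch (unlink the entry for the dying namespace, then free it if it existed) can be sketched in user space as below. This is a hedged illustration: the demo_* names and the fixed array standing in for the radix tree are assumptions, and locking is deliberately left out, with comments marking where the lock would be taken and dropped.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_NS 8

/* Hypothetical policy record keyed by namespace id. */
struct demo_policy {
	unsigned int ns_id;
};

static struct demo_policy *demo_map[DEMO_MAX_NS];
static int demo_freed_count;

/* Store a new entry for ns_id; returns 0 on success, -1 if full or out of memory. */
static int demo_map_insert(unsigned int ns_id)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_NS; i++) {
		if (!demo_map[i]) {
			demo_map[i] = calloc(1, sizeof(*demo_map[i]));
			if (!demo_map[i])
				return -1;
			demo_map[i]->ns_id = ns_id;
			return 0;
		}
	}
	return -1;
}

/* Unlink and return the entry for ns_id, or NULL if there is none. */
static struct demo_policy *demo_map_delete(unsigned int ns_id)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_NS; i++) {
		if (demo_map[i] && demo_map[i]->ns_id == ns_id) {
			struct demo_policy *p = demo_map[i];

			demo_map[i] = NULL;
			return p;
		}
	}
	return NULL;
}

/* Called when a namespace goes away; safe to call for unknown ids. */
static void demo_namespace_dying(unsigned int ns_id)
{
	struct demo_policy *p;

	/* the map lock would be taken here */
	p = demo_map_delete(ns_id);
	if (p) {
		free(p);
		demo_freed_count++;
	}
	/* the map lock would be released here, on every path */
}
```

Writing the function with a single exit, as above, keeps the unlock in one place; the patch itself unlocks separately on the early-return path.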



[PATCH 09/14] Sample program for driving fsopen/fsmount [ver #2]

2017-05-11 Thread David Howells

---

 samples/fsmount/test-fsmount.c |   79 
 1 file changed, 79 insertions(+)
 create mode 100644 samples/fsmount/test-fsmount.c

diff --git a/samples/fsmount/test-fsmount.c b/samples/fsmount/test-fsmount.c
new file mode 100644
index ..98b0258ae08f
--- /dev/null
+++ b/samples/fsmount/test-fsmount.c
@@ -0,0 +1,79 @@
+/* fd-based mount test.
+ *
+ * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowe...@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define E(x) do { if ((x) == -1) { perror(#x); exit(1); } } while(0)
+
+static __attribute__((noreturn))
+void mount_error(int fd, const char *s)
+{
+   char buf[4096];
+   int err, n;
+
+   err = errno;
+   n = read(fd, buf, sizeof(buf));
+   errno = err;
+   if (n > 0) {
+   n -= 2;
+   fprintf(stderr, "Error: '%s': %*.*s: %m\n", s, n, n, buf + 2);
+   } else {
+   fprintf(stderr, "%s: %m\n", s);
+   }
+   exit(1);
+}
+
+#define E_write(fd, s) \
+   do {\
+   if (write(fd, s, sizeof(s) - 1) == -1)  \
+   mount_error(fd, s); \
+   } while (0)
+
+static inline int fsopen(const char *fs_name, int reserved, int flags)
+{
+   return syscall(333, fs_name, reserved, flags);
+}
+
+static inline int fsmount(int fsfd, int dfd, const char *path,
+ unsigned int at_flags, unsigned int flags)
+{
+   return syscall(334, fsfd, dfd, path, at_flags, flags);
+}
+
+int main()
+{
+   int mfd;
+
+   /* Mount an NFS filesystem */
+   mfd = fsopen("nfs4", -1, 0);
+   if (mfd == -1) {
+   perror("fsopen");
+   exit(1);
+   }
+
+   E_write(mfd, "s warthog:/data");
+   E_write(mfd, "o fsc");
+   E_write(mfd, "o sync");
+   E_write(mfd, "o intr");
+   E_write(mfd, "o vers=4.2");
+   E_write(mfd, "o addr=90.155.74.18");
+   E_write(mfd, "o clientaddr=90.155.74.21");
+   if (fsmount(mfd, AT_FDCWD, "/mnt", 0, 0) < 0)
+   mount_error(mfd, "fsmount");
+   E(close(mfd));
+
+   exit(0);
+}
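
The sample passes each configuration item as one write() of a string literal, which is why E_write() can use sizeof(s) - 1. For strings built at run time, a helper along these lines could produce the same kind of line. This is only a sketch: demo_format_config() is a hypothetical name, and reading the "s" and "o" prefixes as "source" and "option" is inferred from the strings in the sample rather than stated in it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build one configuration line of the form "<prefix> <text>", for example
 * "o vers=4.2". Returns the number of bytes to hand to write(), or -1 if
 * the buffer is too small.
 */
static int demo_format_config(char *buf, size_t size, char prefix,
			      const char *text)
{
	int n = snprintf(buf, size, "%c %s", prefix, text);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}
```

The return value plays the role of sizeof(s) - 1 in the macro: it excludes the terminating NUL.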



[PATCH 08/14] Implement fsmount() to effect a pre-configured mount [ver #2]

2017-05-11 Thread David Howells
Provide a system call by which a filesystem opened with fsopen() and
configured by a series of writes can be mounted:

int ret = fsmount(int fsfd, int dfd, const char *path,
  unsigned int at_flags, unsigned int flags);

where fsfd is the fd returned by fsopen(); dfd, path and at_flags locate
the mountpoint; and flags are the applicable MS_* flags.  dfd can be
AT_FDCWD or an fd open to a directory.

In the event that fsmount() fails, it may be possible to get an error
message by calling read().  If no message is available, ENODATA will be
reported.

Signed-off-by: David Howells 
---

 arch/x86/entry/syscalls/syscall_32.tbl |    1 
 arch/x86/entry/syscalls/syscall_64.tbl |    1 
 fs/namespace.c                         |   99 
 include/linux/lsm_hooks.h              |    6 ++
 include/linux/security.h               |    6 ++
 include/linux/syscalls.h               |    2 +
 kernel/sys_ni.c                        |    1 
 security/security.c                    |    7 ++
 security/selinux/hooks.c               |   13 
 9 files changed, 136 insertions(+)

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 9bf8d4c62f85..abe6ea95e0e6 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -392,3 +392,4 @@
 383  i386    statx           sys_statx
 384  i386    arch_prctl      sys_arch_prctl          compat_sys_arch_prctl
 385  i386    fsopen          sys_fsopen
+386  i386    fsmount         sys_fsmount
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 9b198c5fc412..0977c5079831 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -340,6 +340,7 @@
 331  common  pkey_free       sys_pkey_free
 332  common  statx           sys_statx
 333  common  fsopen          sys_fsopen
+334  common  fsmount         sys_fsmount
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/namespace.c b/fs/namespace.c
index 8ade7252ee34..6e43657d78bd 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3253,6 +3253,105 @@ vfs_submount_sc(const struct dentry *mountpoint, struct sb_config *sc)
 EXPORT_SYMBOL_GPL(vfs_submount_sc);
 
 /*
+ * Mount a new, prepared superblock (specified by fs_fd) on the location
+ * specified by dfd and dir_name.  dfd can be AT_FDCWD, a dir fd or a container
+ * fd.  This cannot be used for binding, moving or remounting mounts.
+ */
+SYSCALL_DEFINE5(fsmount, int, fs_fd, int, dfd, const char __user *, dir_name,
+   unsigned int, at_flags, unsigned int, flags)
+{
+   struct sb_config *sc;
+   struct inode *inode;
+   struct path mountpoint;
+   struct fd f;
+   unsigned int lookup_flags, mnt_flags = 0;
+   long ret;
+
+   if ((at_flags & ~(AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT |
+ AT_EMPTY_PATH)) != 0)
+   return -EINVAL;
+
+   if (flags & ~(MS_RDONLY | MS_NOSUID | MS_NODEV | MS_NOEXEC |
+ MS_NOATIME | MS_NODIRATIME | MS_RELATIME | MS_STRICTATIME))
+   return -EINVAL;
+
+   if (flags & MS_RDONLY)
+   mnt_flags |= MNT_READONLY;
+   if (flags & MS_NOSUID)
+   mnt_flags |= MNT_NOSUID;
+   if (flags & MS_NODEV)
+   mnt_flags |= MNT_NODEV;
+   if (flags & MS_NOEXEC)
+   mnt_flags |= MNT_NOEXEC;
+   if (flags & MS_NODIRATIME)
+   mnt_flags |= MNT_NODIRATIME;
+
+   if (flags & MS_STRICTATIME) {
+   if (flags & MS_NOATIME)
+   return -EINVAL;
+   } else if (flags & MS_NOATIME) {
+   mnt_flags |= MNT_NOATIME;
+   } else {
+   mnt_flags |= MNT_RELATIME;
+   }
+
+   f = fdget(fs_fd);
+   if (!f.file)
+   return -EBADF;
+
+   ret = -EINVAL;
+   if (f.file->f_op != _fs_fops)
+   goto err_fsfd;
+
+   sc = f.file->private_data;
+
+   ret = -EPERM;
+   if (!may_mount() ||
+   ((sc->ms_flags & MS_MANDLOCK) && !may_mandlock()))
+   goto err_fsfd;
+
+   /* Prevent further changes. */
+   inode = file_inode(f.file);
+   ret = inode_lock_killable(inode);
+   if (ret < 0)
+   goto err_fsfd;
+   ret = -EBUSY;
+   if (!sc->mounted) {
+   sc->mounted = true;
+   ret = 0;
+   }
+   inode_unlock(inode);
+   if (ret < 0)
+   goto err_fsfd;
+
+   /* Find the mountpoint.  A container can be specified in dfd. */
+   lookup_flags = LOOKUP_FOLLOW | LOOKUP_AUTOMOUNT;
+   if (at_flags & AT_SYMLINK_NOFOLLOW)
+   lookup_flags &= ~LOOKUP_FOLLOW;
+   if (at_flags & AT_NO_AUTOMOUNT)
+   lookup_flags &= ~LOOKUP_AUTOMOUNT;
+   
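
The flag handling at the top of fsmount() (reject unknown bits, map MS_* to MNT_*, and default to relatime unless noatime or strictatime is requested) can be exercised in user space with a sketch like the one below. The DEMO_* constants are made-up bit values chosen for illustration; they are not the real MS_* or MNT_* values, and demo_translate_flags() is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative bit values only; the real MS_* and MNT_* constants differ. */
enum {
	DEMO_MS_RDONLY      = 1 << 0,
	DEMO_MS_NOSUID      = 1 << 1,
	DEMO_MS_NODEV       = 1 << 2,
	DEMO_MS_NOEXEC      = 1 << 3,
	DEMO_MS_NOATIME     = 1 << 4,
	DEMO_MS_NODIRATIME  = 1 << 5,
	DEMO_MS_RELATIME    = 1 << 6,
	DEMO_MS_STRICTATIME = 1 << 7,
};

enum {
	DEMO_MNT_READONLY   = 1 << 0,
	DEMO_MNT_NOSUID     = 1 << 1,
	DEMO_MNT_NODEV      = 1 << 2,
	DEMO_MNT_NOEXEC     = 1 << 3,
	DEMO_MNT_NOATIME    = 1 << 4,
	DEMO_MNT_NODIRATIME = 1 << 5,
	DEMO_MNT_RELATIME   = 1 << 6,
};

/* Returns 0 and fills *mnt_flags, or -EINVAL for bad flag combinations. */
static int demo_translate_flags(unsigned int flags, unsigned int *mnt_flags)
{
	const unsigned int known =
		DEMO_MS_RDONLY | DEMO_MS_NOSUID | DEMO_MS_NODEV |
		DEMO_MS_NOEXEC | DEMO_MS_NOATIME | DEMO_MS_NODIRATIME |
		DEMO_MS_RELATIME | DEMO_MS_STRICTATIME;
	unsigned int out = 0;

	if (flags & ~known)
		return -EINVAL;

	if (flags & DEMO_MS_RDONLY)
		out |= DEMO_MNT_READONLY;
	if (flags & DEMO_MS_NOSUID)
		out |= DEMO_MNT_NOSUID;
	if (flags & DEMO_MS_NODEV)
		out |= DEMO_MNT_NODEV;
	if (flags & DEMO_MS_NOEXEC)
		out |= DEMO_MNT_NOEXEC;
	if (flags & DEMO_MS_NODIRATIME)
		out |= DEMO_MNT_NODIRATIME;

	/* strictatime and noatime are mutually exclusive; relatime is the default. */
	if (flags & DEMO_MS_STRICTATIME) {
		if (flags & DEMO_MS_NOATIME)
			return -EINVAL;
	} else if (flags & DEMO_MS_NOATIME) {
		out |= DEMO_MNT_NOATIME;
	} else {
		out |= DEMO_MNT_RELATIME;
	}

	*mnt_flags = out;
	return 0;
}
```

As in the patch, strictatime on its own sets no atime-related mount flag, and the output is left untouched when the input is rejected.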

[RFC 08/11] ima: block initial namespace id on the namespace policy interface

2017-05-11 Thread Guilherme Magalhaes
The initial namespace policy is set through the existing interface
in the ima/policy securityfs file. Block the initial namespace
id when it is written to the ima/namespace securityfs file.

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima_fs.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 61f8da1..65c43e7 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -365,6 +365,16 @@ static int check_mntns(unsigned int ns_id)
return result;
 }
 
+static unsigned int initial_mntns_id;
+static void get_initial_mntns_id(void)
+{
+   struct ns_common *ns;
+
+   ns = mntns_operations.get(&init_task);
+   initial_mntns_id = ns->inum;
+   mntns_operations.put(ns);
+}
+
 /*
  * ima_find_namespace_id_from_inode
  * @policy_inode: the inode of the securityfs policy file for a given
@@ -699,6 +709,12 @@ static ssize_t handle_new_namespace_policy(const char *data, size_t datalen)
goto out;
}
 
+   if (ns_id == initial_mntns_id) {
+   pr_err("IMA: invalid use of the initial mount namespace\n");
+   result = -EINVAL;
+   goto out;
+   }
+
ima_namespace_lock();
if (check_mntns(ns_id)) {
result = -ENOENT;
@@ -835,6 +851,8 @@ int __init ima_fs_init(void)
&ima_namespaces_ops);
if (IS_ERR(ima_namespaces))
goto out;
+
+   get_initial_mntns_id();
 #endif
 
return 0;
-- 
2.7.4
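
The check added here amounts to recording the initial namespace id once at init time and refusing it on the per-namespace interface. A minimal user-space sketch, assuming hypothetical demo_* names and made-up id values:

```c
#include <assert.h>
#include <errno.h>

/* Id of the initial namespace, recorded once during initialisation. */
static unsigned int demo_initial_ns_id;

static void demo_record_initial_ns_id(unsigned int id)
{
	demo_initial_ns_id = id;
}

/* Returns 0 if ns_id may be used on the namespace interface, -EINVAL otherwise. */
static int demo_check_ns_id(unsigned int ns_id)
{
	if (ns_id == demo_initial_ns_id)
		return -EINVAL;
	return 0;
}
```

Doing this comparison before taking the namespace lock, as the patch does, keeps the rejection path free of any locking.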



Re: [PATCH 1/4] vmbus: Improve a size determination in vmbus_device_create()

2017-05-11 Thread Stephen Hemminger
On Thu, 11 May 2017 18:15:46 +0200
SF Markus Elfring  wrote:

> From: Markus Elfring 
> Date: Thu, 11 May 2017 17:30:10 +0200
> 
> Replace the specification of a data structure by a pointer dereference
> as the parameter for the operator "sizeof" to make the corresponding size
> determination a bit safer according to the Linux coding style convention.
> 
> Signed-off-by: Markus Elfring 
> ---
>  drivers/hv/vmbus_drv.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 0087b49095eb..6802d74f162c 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1145,5 +1145,5 @@ struct hv_device *vmbus_device_create(const uuid_le *type,
>  {
>   struct hv_device *child_device_obj;
>  
> - child_device_obj = kzalloc(sizeof(struct hv_device), GFP_KERNEL);
> + child_device_obj = kzalloc(sizeof(*child_device_obj), GFP_KERNEL);
>   if (!child_device_obj) {

This looks fine.

Acked-by: Stephen Hemminger 
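
For anyone reading along, the idiom the patch switches to looks like this in a user-space sketch. struct demo_device and demo_device_create() are hypothetical stand-ins, and calloc() is used here as the user-space analogue of kzalloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a driver structure; not the real struct hv_device. */
struct demo_device {
	int id;
	char name[32];
};

/*
 * Allocate a zeroed object. sizeof(*dev) follows the pointer's type, so the
 * size stays correct even if the declaration of dev is changed later.
 */
static struct demo_device *demo_device_create(int id)
{
	struct demo_device *dev;

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return NULL;

	dev->id = id;
	return dev;
}
```

With sizeof(struct demo_device) the type name would have to be kept in sync by hand in two places; with sizeof(*dev) there is only one.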


[RFC 05/11] ima: store new namespace policy structure in a radix tree

2017-05-11 Thread Guilherme Magalhaes
New ima_ns_policy structure to describe IMA policy data per namespace.
Using a radix tree to map namespace ids to a respective ima_ns_policy
structure.
When it is needed to retrieve IMA policy rules/flags, the target
ima_ns_policy structure is retrieved from the radix tree by getting the
namespace id from the current context.

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima.h| 37 +
 security/integrity/ima/ima_fs.c | 79 ++---
 security/integrity/ima/ima_init.c   |  2 +
 security/integrity/ima/ima_policy.c | 29 +-
 4 files changed, 133 insertions(+), 14 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 6e8ca8e..1c5c875 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -140,6 +140,21 @@ static inline void ima_load_kexec_buffer(void) {}
  */
 extern bool ima_canonical_fmt;
 
+/* Namespace policy globals */
+struct ima_ns_policy {
+   struct dentry *policy_dentry;
+   struct dentry *ns_dentry;
+   struct list_head *ima_rules;
+   struct list_head ima_policy_rules;
+   int ima_policy_flag;
+   int ima_appraise;
+};
+
+#ifdef CONFIG_IMA_PER_NAMESPACE
+extern spinlock_t ima_ns_policy_lock;
+extern struct radix_tree_root ima_ns_policy_mapping;
+#endif
+
 /* Internal IMA function definitions */
 int ima_init(void);
 int ima_fs_init(void);
@@ -166,6 +181,27 @@ int ima_measurements_show(struct seq_file *m, void *v);
 unsigned long ima_get_binary_runtime_size(void);
 int ima_init_template(void);
 void ima_init_template_list(void);
+#ifdef CONFIG_IMA_PER_NAMESPACE
+static inline void ima_namespace_lock_init(void) {
+   spin_lock_init(&ima_ns_policy_lock);
+}
+static inline void ima_namespace_lock(void) {
+   spin_lock(&ima_ns_policy_lock);
+}
+static inline void ima_namespace_unlock(void) {
+   spin_unlock(&ima_ns_policy_lock);
+}
+#else
+static inline void ima_namespace_lock_init(void) {
+   return;
+}
+static inline void ima_namespace_lock(void) {
+   return;
+}
+static inline void ima_namespace_unlock(void) {
+   return;
+}
+#endif
 
 /*
  * used to protect h_table and sha_table
@@ -226,6 +262,7 @@ void ima_update_policy(void);
 void ima_update_policy_flag(void);
 ssize_t ima_parse_add_rule(char *);
 void ima_delete_rules(void);
+void ima_free_policy_rules(struct list_head *policy_rules);
 int ima_check_policy(void);
 void *ima_policy_start(struct seq_file *m, loff_t *pos);
 void *ima_policy_next(struct seq_file *m, void *v, loff_t *pos);
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 6456407..ce6dcdf 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -275,6 +275,48 @@ static const struct file_operations ima_ascii_measurements_ops = {
 };
 
 #ifdef CONFIG_IMA_PER_NAMESPACE
+/* used for namespace policy rules initialization */
+static LIST_HEAD(empty_policy);
+
+static int allocate_namespace_policy(struct ima_ns_policy **ins,
+   struct dentry *policy_dentry, struct dentry *ns_dentry)
+{
+   int result;
+   struct ima_ns_policy *p;
+
+   p = kmalloc(sizeof(struct ima_ns_policy), GFP_KERNEL);
+   if (!p) {
+   result = -ENOMEM;
+   goto out;
+   }
+
+   p->policy_dentry = policy_dentry;
+   p->ns_dentry = ns_dentry;
+   p->ima_appraise = 0;
+   p->ima_policy_flag = 0;
+   INIT_LIST_HEAD(&p->ima_policy_rules);
+   /* namespace starts with empty rules and not pointing to
+    * ima_policy_rules */
+   p->ima_rules = &empty_policy;
+
+   result = 0;
+   *ins = p;
+
+out:
+   return result;
+}
+
+static void free_namespace_policy(struct ima_ns_policy *ins)
+{
+   if (ins->policy_dentry)
+   securityfs_remove(ins->policy_dentry);
+   securityfs_remove(ins->ns_dentry);
+
+   ima_free_policy_rules(&ins->ima_policy_rules);
+
+   kfree(ins);
+}
+
 /*
  * check_mntns: check a mount namespace is valid
  *
@@ -476,9 +518,11 @@ static int ima_release_policy(struct inode *inode, struct file *file)
 #ifndef CONFIG_IMA_WRITE_POLICY
securityfs_remove(ima_policy);
ima_policy = NULL;
-#else
-   clear_bit(IMA_FS_BUSY, &ima_fs_flags);
 #endif
+
+   /* always clear the busy flag so other namespaces can use it */
+   clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+
return 0;
 }
 
@@ -500,11 +544,14 @@ static int create_mnt_ns_directory(unsigned int ns_id)
int result;
struct dentry *ns_dir, *ns_policy;
char dir_name[64];
+   struct ima_ns_policy *ins;
 
snprintf(dir_name, sizeof(dir_name), "%u", ns_id);
 
ns_dir = securityfs_create_dir(dir_name, ima_dir);
if (IS_ERR(ns_dir)) {
+   /* TODO: handle EEXIST error, remove the folder and
+   continue the procedure */
result = PTR_ERR(ns_dir);
goto out;
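
The id-to-policy mapping described in the commit message can be illustrated outside the kernel. This is only a sketch under stated assumptions: a tiny chained hash table stands in for the kernel radix tree, all `demo_*` names are hypothetical, and the locking the patch adds around the mapping is left out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified per-namespace policy record. */
struct demo_ns_policy {
    unsigned int ns_id;
    int policy_flag;
    struct demo_ns_policy *next;
};

#define DEMO_BUCKETS 16

/* A small chained hash table stands in for the kernel radix tree. */
static struct demo_ns_policy *demo_map[DEMO_BUCKETS];

/* Insert a policy for ns_id; returns 0 on success, -1 on failure or duplicate. */
static int demo_policy_insert(unsigned int ns_id, int policy_flag)
{
    struct demo_ns_policy *p;
    unsigned int b = ns_id % DEMO_BUCKETS;

    for (p = demo_map[b]; p; p = p->next)
        if (p->ns_id == ns_id)
            return -1;

    p = calloc(1, sizeof(*p));
    if (!p)
        return -1;

    p->ns_id = ns_id;
    p->policy_flag = policy_flag;
    p->next = demo_map[b];
    demo_map[b] = p;
    return 0;
}

/* Look up the policy for ns_id; NULL means no mapping exists. */
static struct demo_ns_policy *demo_policy_lookup(unsigned int ns_id)
{
    struct demo_ns_policy *p;

    for (p = demo_map[ns_id % DEMO_BUCKETS]; p; p = p->next)
        if (p->ns_id == ns_id)
            return p;

    return NULL;
}
```

A caller would obtain the namespace id of the current context, call the lookup, and fall back to some default policy when it returns NULL; how the RFC handles that fallback is not shown in this excerpt.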
  

[RFC 04/11] ima: add support to namespace securityfs file

2017-05-11 Thread Guilherme Magalhaes
Creating the namespace securityfs file under ima folder. When a mount
namespace id is written to the namespace file, a new folder is created and
with a policy file for that specified namespace. Then, user defined policy
for namespaces may be set by writing rules to this namespace policy file.
With this interface, there is no need to give visibility for the securityfs
inside mount namespaces or containers in userspace.

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima.h|   4 +
 security/integrity/ima/ima_fs.c | 183 
 2 files changed, 187 insertions(+)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 42fb91ba..6e8ca8e 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -326,4 +326,8 @@ static inline int security_filter_rule_match(u32 secid, u32 field, u32 op,
 #define POLICY_FILE_FLAGS   S_IWUSR
 #endif /* CONFIG_IMA_WRITE_POLICY */
 
+#ifdef CONFIG_IMA_PER_NAMESPACE
+#define NAMESPACES_FILE_FLAGS  S_IWUSR
+#endif
+
 #endif /* __LINUX_IMA_H */
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index ca303e5..6456407 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -23,6 +23,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 #include "ima.h"
 
@@ -272,6 +274,40 @@ static const struct file_operations ima_ascii_measurements_ops = {
.release = seq_release,
 };
 
+#ifdef CONFIG_IMA_PER_NAMESPACE
+/*
+ * check_mntns: check a mount namespace is valid
+ *
+ * @ns_id: namespace id to be checked
+ * Returns 0 if the namespace is valid.
+ *
+ * Note: a better way to implement this check is needed. There are
+ * cases where the namespace id is valid but not in use by any process
+ * and then this implementation misses this case. Could we use an
+ * interface similar to what setns implements?
+ */
+static int check_mntns(unsigned int ns_id)
+{
+   struct task_struct *p;
+   int result = 1;
+   struct ns_common *ns;
+
+   rcu_read_lock();
+   for_each_process(p) {
+   ns = mntns_operations.get(p);
+   if (ns->inum == ns_id) {
+   result = 0;
+   mntns_operations.put(ns);
+   break;
+   }
+   mntns_operations.put(ns);
+   }
+   rcu_read_unlock();
+
+   return result;
+}
+#endif
+
 static ssize_t ima_read_policy(char *path)
 {
void *data;
@@ -366,6 +402,9 @@ static struct dentry *ascii_runtime_measurements;
 static struct dentry *runtime_measurements_count;
 static struct dentry *violations;
 static struct dentry *ima_policy;
+#ifdef CONFIG_IMA_PER_NAMESPACE
+static struct dentry *ima_namespaces;
+#endif
 
 enum ima_fs_flags {
IMA_FS_BUSY,
@@ -451,6 +490,139 @@ static const struct file_operations ima_measure_policy_ops = {
.llseek = generic_file_llseek,
 };
 
+#ifdef CONFIG_IMA_PER_NAMESPACE
+/*
+ * Assumes namespace id is in use by some process and this mapping
+ * does not exist in the map table.
+ */
+static int create_mnt_ns_directory(unsigned int ns_id)
+{
+   int result;
+   struct dentry *ns_dir, *ns_policy;
+   char dir_name[64];
+
+   snprintf(dir_name, sizeof(dir_name), "%u", ns_id);
+
+   ns_dir = securityfs_create_dir(dir_name, ima_dir);
+   if (IS_ERR(ns_dir)) {
+   result = PTR_ERR(ns_dir);
+   goto out;
+   }
+
+   ns_policy = securityfs_create_file("policy", POLICY_FILE_FLAGS,
+   ns_dir, NULL,
+   &ima_measure_policy_ops);
+   if (IS_ERR(ns_policy)) {
+   result = PTR_ERR(ns_policy);
+   securityfs_remove(ns_dir);
+   goto out;
+   }
+
+   result = 0;
+
+out:
+   return result;
+}
+
+static ssize_t handle_new_namespace_policy(const char *data, size_t datalen)
+{
+   unsigned int ns_id;
+   ssize_t result;
+
+   result = -EINVAL;
+
+   if (sscanf(data, "%u", &ns_id) != 1) {
+   pr_err("IMA: invalid namespace id: %s\n", data);
+   goto out;
+   }
+
+   if (check_mntns(ns_id)) {
+   result = -ENOENT;
+   pr_err("IMA: unused namespace id %u\n", ns_id);
+   goto out;
+   }
+
+   result = create_mnt_ns_directory(ns_id);
+   if (result != 0) {
+   pr_err("IMA: namespace id %u directory creation failed\n", ns_id);
+   goto out;
+   }
+
+   result = datalen;
+   pr_info("IMA: directory created for namespace id %u\n", ns_id);
+
+out:
+   return result;
+}
+
+static ssize_t ima_write_namespaces(struct file *file, const char __user *buf,
+   size_t datalen, loff_t *ppos)
+{
+   char *data;
+   ssize_t result;
+
+   if (datalen >= PAGE_SIZE)
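
The parse-then-name step used by the handler above (read an unsigned id with `sscanf()`, format the directory name with `snprintf()`) can be sketched in userspace as follows. `demo_ns_dir_name()` is a hypothetical helper, not part of the patch, and the truncation check is an addition made for the sketch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a decimal namespace id from text and format the directory name
 * that would be created for it. Returns 0 on success, -1 on bad input
 * or if the name does not fit into the buffer.
 */
static int demo_ns_dir_name(const char *data, char *name, size_t len)
{
    unsigned int ns_id;
    int n;

    assert(data && name && len > 0);

    if (sscanf(data, "%u", &ns_id) != 1)
        return -1;

    n = snprintf(name, len, "%u", ns_id);
    if (n < 0 || (size_t)n >= len)
        return -1;  /* output would have been truncated */

    return 0;
}
```

The patch uses a 64-byte buffer, which is more than enough for a 32-bit id, so the truncation branch mainly matters if the buffer size is ever reduced.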

Re: [PATCH 2/4] vmbus: Delete an error message for a failed memory allocation in vmbus_device_create()

2017-05-11 Thread Stephen Hemminger
On Thu, 11 May 2017 18:17:01 +0200
SF Markus Elfring  wrote:

> From: Markus Elfring 
> Date: Thu, 11 May 2017 17:33:14 +0200
> 
> Omit an extra message for a memory allocation failure in this function.
> 
> This issue was detected by using the Coccinelle software.
> 
> Link: 
> http://events.linuxfoundation.org/sites/events/files/slides/LCJ16-Refactor_Strings-WSang_0.pdf
> Signed-off-by: Markus Elfring 
> ---
>  drivers/hv/vmbus_drv.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 6802d74f162c..96328aebae5a 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1149,7 +1149,5 @@ struct hv_device *vmbus_device_create(const uuid_le 
> *type,
> - if (!child_device_obj) {
> - pr_err("Unable to allocate device object for child device\n");
> + if (!child_device_obj)
>   return NULL;
> - }
>  
>   child_device_obj->channel = channel;
>   memcpy(&child_device_obj->dev_type, type, sizeof(uuid_le));

Taking out the message assumes that all callers of this function either log an
error or pass appropriate error code back to userspace. Did you walk back
through all the callers?

Just because an automated tool says that this needs to change does not
mean it has to.
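
The concern raised here can be made concrete with a small userspace sketch. All names are hypothetical and `simulate_failure` exists only so the failure path can be exercised. Once the helper stays silent on allocation failure, each caller has to turn the NULL into an error code or a log message of its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_obj {
    int id;
};

/* Allocation helper that prints nothing on failure, as after the patch. */
static struct demo_obj *demo_obj_create(int simulate_failure)
{
    if (simulate_failure)
        return NULL;  /* pretend the allocator failed */

    return calloc(1, sizeof(struct demo_obj));
}

/* Caller: the NULL must become an error code that travels upward. */
static int demo_register(int simulate_failure, struct demo_obj **out)
{
    struct demo_obj *obj;

    assert(out != NULL);

    obj = demo_obj_create(simulate_failure);
    if (!obj)
        return -ENOMEM;

    obj->id = 1;
    *out = obj;
    return 0;
}
```

A caller that drops the NULL without returning an error would leave the failure invisible, which is the case the question about walking the callers is meant to rule out.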


[RFC 09/11] ima: delete namespace policy securityfs file in write-once mode

2017-05-11 Thread Guilherme Magalhaes
When policy file is written and write-once is enabled, the policy file
must be deleted. Select the namespace policy structure to get the correct
policy file descriptor.

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima_fs.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 65c43e7..94e89fe 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -575,6 +575,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
 static int ima_release_policy(struct inode *inode, struct file *file)
 {
const char *cause = valid_policy ? "completed" : "failed";
+   struct ima_ns_policy *ins;
 
if ((file->f_flags & O_ACCMODE) == O_RDONLY)
return seq_release(inode, file);
@@ -595,15 +596,37 @@ static int ima_release_policy(struct inode *inode, struct file *file)
return 0;
}
 
+   /* get the namespace id from file->inode (policy file inode).
+* We also need to synchronize this operation with concurrent namespace
+* releasing. */
+   ima_namespace_lock();
+   ins = ima_get_namespace_policy_from_inode(inode);
+   if (!ins) {
+   /* the namespace is not valid anymore, discard new policy
+* rules and exit */
+   ima_delete_rules();
+   valid_policy = 1;
+   clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+   ima_namespace_unlock();
+   return 0;
+   }
+
ima_update_policy();
 #ifndef CONFIG_IMA_WRITE_POLICY
-   securityfs_remove(ima_policy_initial_ns);
-   ima_policy = NULL;
+   if (ins == &ima_initial_namespace_policy) {
+   securityfs_remove(ima_policy_initial_ns);
+   ima_policy_initial_ns = NULL;
+   } else {
+   securityfs_remove(ins->policy_dentry);
+   ins->policy_dentry = NULL;
+   }
 #endif
 
/* always clear the busy flag so other namespaces can use it */
clear_bit(IMA_FS_BUSY, &ima_fs_flags);
 
+   ima_namespace_unlock();
+
return 0;
 }
 
-- 
2.7.4
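
The patch above releases the namespace lock separately on the early-exit path and on the normal path. As a hedged illustration of the same requirement (every return path must drop the lock), here is a userspace sketch that routes both paths through one exit label. This is a variation for illustration, not what the patch does; the `demo_*` names are hypothetical and a depth counter stands in for the spinlock.

```c
#include <assert.h>
#include <stddef.h>

/* Depth counter standing in for the spinlock; the asserts catch imbalance. */
static int demo_lock_depth;

static void demo_lock(void)
{
    assert(demo_lock_depth == 0);
    demo_lock_depth++;
}

static void demo_unlock(void)
{
    assert(demo_lock_depth == 1);
    demo_lock_depth--;
}

struct demo_policy {
    int busy;
};

/*
 * Returns 0 in both cases, like the release handler; *updated reports
 * whether a policy was found and updated.
 */
static int demo_release(struct demo_policy *ins, int *updated)
{
    demo_lock();

    if (!ins) {
        /* namespace went away: nothing to update */
        *updated = 0;
        goto out;
    }

    ins->busy = 0;
    *updated = 1;

out:
    demo_unlock();
    return 0;
}
```

With a single unlock site, adding a further early exit later cannot forget the unlock as long as it jumps to `out`.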



Re: [PATCH 5/5] phy: bcm-ns-usb3: add MDIO driver using proper bus layer

2017-05-11 Thread Florian Fainelli
On 05/11/2017 06:29 AM, Rafał Miłecki wrote:
> From: Rafał Miłecki 
> 
> As USB 3.0 PHY is attached to the MDIO bus this module should provide a
> MDIO driver and use a proper bus layer. This is a proper (cleaner)
> solution which doesn't require code to know this specific MDIO bus
> details. It also allows reusing the driver with other MDIO buses.
> 
> Signed-off-by: Rafał Miłecki 
> ---
>  drivers/phy/Kconfig   |  1 +
>  drivers/phy/phy-bcm-ns-usb3.c | 98 
> ++-
>  2 files changed, 98 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/phy/Kconfig b/drivers/phy/Kconfig
> index afaf7b643eeb..2a9186b98ae0 100644
> --- a/drivers/phy/Kconfig
> +++ b/drivers/phy/Kconfig
> @@ -29,6 +29,7 @@ config PHY_BCM_NS_USB3
>   depends on ARCH_BCM_IPROC || COMPILE_TEST
>   depends on HAS_IOMEM && OF
>   select GENERIC_PHY
> + select PHYLIB

Should not this be select MDIO_DEVICE instead? 4.11 introduced the
possibility to build support for MDIO bus/devices without requiring PHYLIB.

>   help
> Enable this to support Broadcom USB 3.0 PHY connected to the USB
> controller on Northstar family.
> diff --git a/drivers/phy/phy-bcm-ns-usb3.c b/drivers/phy/phy-bcm-ns-usb3.c
> index 2c9a0d5f43d8..389f5e5a6238 100644
> --- a/drivers/phy/phy-bcm-ns-usb3.c
> +++ b/drivers/phy/phy-bcm-ns-usb3.c
> @@ -16,7 +16,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -52,6 +54,7 @@ struct bcm_ns_usb3 {
>   enum bcm_ns_family family;
>   void __iomem *dmp;
>   void __iomem *ccb_mii;
> + struct mdio_device *mdiodev;
>   struct phy *phy;
>  
>   int (*phy_write)(struct bcm_ns_usb3 *usb3, u16 reg, u16 value);
> @@ -183,6 +186,77 @@ static const struct phy_ops ops = {
>  };
>  
>  /**
> + * MDIO driver code
> + **/
> +
> +static int bcm_ns_usb3_mdiodev_phy_write(struct bcm_ns_usb3 *usb3, u16 reg,
> +  u16 value)
> +{
> + struct mdio_device *mdiodev = usb3->mdiodev;
> +
> + return mdiobus_write(mdiodev->bus, mdiodev->addr, reg, value);
> +}
> +
> +static int bcm_ns_usb3_mdio_probe(struct mdio_device *mdiodev)
> +{
> + struct device *dev = &mdiodev->dev;
> + const struct of_device_id *of_id;
> + struct phy_provider *phy_provider;
> + struct device_node *syscon_np;
> + struct bcm_ns_usb3 *usb3;
> + struct resource res;
> + int err;
> +
> + usb3 = devm_kzalloc(dev, sizeof(*usb3), GFP_KERNEL);
> + if (!usb3)
> + return -ENOMEM;
> +
> + usb3->dev = dev;
> + usb3->mdiodev = mdiodev;
> +
> + of_id = of_match_device(bcm_ns_usb3_id_table, dev);
> + if (!of_id)
> + return -EINVAL;
> + usb3->family = (enum bcm_ns_family)of_id->data;
> +
> + syscon_np = of_parse_phandle(dev->of_node, "usb3-dmp-syscon", 0);
> + err = of_address_to_resource(syscon_np, 0, &res);
> + of_node_put(syscon_np);
> + if (err)
> + return err;
> +
> + usb3->dmp = devm_ioremap_resource(dev, &res);
> + if (IS_ERR(usb3->dmp)) {
> + dev_err(dev, "Failed to map DMP regs\n");
> + return PTR_ERR(usb3->dmp);
> + }
> +
> + usb3->phy_write = bcm_ns_usb3_mdiodev_phy_write;
> +
> + usb3->phy = devm_phy_create(dev, NULL, &ops);
> + if (IS_ERR(usb3->phy)) {
> + dev_err(dev, "Failed to create PHY\n");
> + return PTR_ERR(usb3->phy);
> + }
> +
> + phy_set_drvdata(usb3->phy, usb3);
> +
> + phy_provider = devm_of_phy_provider_register(dev, of_phy_simple_xlate);
> +
> + return PTR_ERR_OR_ZERO(phy_provider);
> +}
> +
> +static struct mdio_driver bcm_ns_usb3_mdio_driver = {
> + .mdiodrv = {
> + .driver = {
> + .name = "bcm_ns_mdio_usb3",
> + .of_match_table = bcm_ns_usb3_id_table,
> + },
> + },
> + .probe = bcm_ns_usb3_mdio_probe,
> +};
> +
> +/**
>   * Platform driver code
>   **/
>  
> @@ -297,6 +371,28 @@ static struct platform_driver bcm_ns_usb3_driver = {
>   .of_match_table = bcm_ns_usb3_id_table,
>   },
>  };
> -module_platform_driver(bcm_ns_usb3_driver);
> +
> +static int __init bcm_ns_usb3_module_init(void)
> +{
> + int err;
> +
> + err = mdio_driver_register(&bcm_ns_usb3_mdio_driver);
> + if (err)
> + return err;
> +
> + err = platform_driver_register(&bcm_ns_usb3_driver);
> + if (err)
> + mdio_driver_unregister(&bcm_ns_usb3_mdio_driver);
> +
> + return err;
> +}
> +module_init(bcm_ns_usb3_module_init);
> +
> +static void __exit bcm_ns_usb3_module_exit(void)
> +{
> + platform_driver_unregister(&bcm_ns_usb3_driver);
> + mdio_driver_unregister(&bcm_ns_usb3_mdio_driver);
> +}
> 
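
The init/exit pair quoted above follows a common pattern: register two things in order, undo the first if the second fails, and unregister in reverse order on exit. Below is a minimal userspace sketch of that pattern. The `demo_*` stubs are hypothetical stand-ins for the MDIO and platform driver registration calls, and `demo_fail_b` exists only to exercise the error path.

```c
/* Hypothetical registration state; 1 means "currently registered". */
static int demo_registered_a, demo_registered_b;
static int demo_fail_b;  /* set to non-zero to make the second step fail */

static int demo_register_a(void)
{
    demo_registered_a = 1;
    return 0;
}

static void demo_unregister_a(void)
{
    demo_registered_a = 0;
}

static int demo_register_b(void)
{
    if (demo_fail_b)
        return -1;
    demo_registered_b = 1;
    return 0;
}

static void demo_unregister_b(void)
{
    demo_registered_b = 0;
}

/* Register A, then B; if B fails, undo A so nothing stays half set up. */
static int demo_module_init(void)
{
    int err;

    err = demo_register_a();
    if (err)
        return err;

    err = demo_register_b();
    if (err)
        demo_unregister_a();

    return err;
}

/* Tear down in the reverse order of registration. */
static void demo_module_exit(void)
{
    demo_unregister_b();
    demo_unregister_a();
}
```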

[RFC 09/11] ima: delete namespace policy securityfs file in write-once mode

2017-05-11 Thread Guilherme Magalhaes
When the policy file is written and write-once mode is enabled, the policy file
must be deleted. Select the namespace policy structure to get the correct
policy file dentry.

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima_fs.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 65c43e7..94e89fe 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -575,6 +575,7 @@ static int ima_open_policy(struct inode *inode, struct file *filp)
 static int ima_release_policy(struct inode *inode, struct file *file)
 {
const char *cause = valid_policy ? "completed" : "failed";
+   struct ima_ns_policy *ins;
 
if ((file->f_flags & O_ACCMODE) == O_RDONLY)
return seq_release(inode, file);
@@ -595,15 +596,37 @@ static int ima_release_policy(struct inode *inode, struct file *file)
return 0;
}
 
+   /* get the namespace id from file->inode (policy file inode).
+* We also need to synchronize this operation with concurrent namespace
+* releasing. */
+   ima_namespace_lock();
+   ins = ima_get_namespace_policy_from_inode(inode);
+   if (!ins) {
+   /* the namespace is not valid anymore, discard new policy
+* rules and exit */
+   ima_delete_rules();
+   valid_policy = 1;
+   clear_bit(IMA_FS_BUSY, &ima_fs_flags);
+   ima_namespace_unlock();
+   return 0;
+   }
+
ima_update_policy();
 #ifndef CONFIG_IMA_WRITE_POLICY
-   securityfs_remove(ima_policy_initial_ns);
-   ima_policy = NULL;
+   if (ins == &ima_initial_namespace_policy) {
+   securityfs_remove(ima_policy_initial_ns);
+   ima_policy_initial_ns = NULL;
+   } else {
+   securityfs_remove(ins->policy_dentry);
+   ins->policy_dentry = NULL;
+   }
 #endif
 
/* always clear the busy flag so other namespaces can use it */
clear_bit(IMA_FS_BUSY, &ima_fs_flags);
 
+   ima_namespace_unlock();
+
return 0;
 }
 
-- 
2.7.4



[PATCH 07/14] Implement fsopen() to prepare for a mount [ver #2]

2017-05-11 Thread David Howells
Provide an fsopen() system call that starts the process of preparing to
mount, using an fd as a context handle.  fsopen() is given the name of the
filesystem that will be used:

int mfd = fsopen(const char *fsname, int reserved,
 int open_flags);

where reserved should be -1 for the moment (it will be used to pass the
namespace information in future) and open_flags can be 0 or O_CLOEXEC.

For example:

mfd = fsopen("ext4", -1, O_CLOEXEC);
write(mfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
write(mfd, "o noatime");
write(mfd, "o acl");
write(mfd, "o user_attr");
write(mfd, "o iversion");
write(mfd, "o ");
write(mfd, "r /my/container"); // root inside the fs
fsmount(mfd, container_fd, "/mnt", AT_NO_FOLLOW);

mfd = fsopen("afs", -1, 0);
write(mfd, "s %grand.central.org:root.cell");
write(mfd, "o cell=grand.central.org");
write(mfd, "r /");
fsmount(mfd, AT_FDCWD, "/mnt", 0);

If an error is reported at any step, an error message may be available to be
read() back (ENODATA will be reported if there isn't an error available) in
the form:

"e :"
"e SELinux:Mount on mountpoint not permitted"

Once fsmount() has been called, further write() calls will incur EBUSY,
even if the fsmount() fails.  read() is still possible to retrieve error
information.
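
The examples above leave out write()'s length argument. A minimal sketch of a
wrapper that supplies it is shown below; mount_ctx_write() is a hypothetical
name used only for illustration and is not part of this patch. It assumes each
command is passed as a single string in one write() call.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of this patch): send one command string,
 * such as "o noatime", to a context fd in a single write() call.
 * Returns 0 on success, -1 with errno set on failure or a short write.
 */
static int mount_ctx_write(int fd, const char *cmd)
{
	size_t len = strlen(cmd);
	ssize_t n = write(fd, cmd, len);

	if (n < 0)
		return -1;
	if ((size_t)n != len) {
		errno = EIO;	/* treat a short write as an error */
		return -1;
	}
	return 0;
}
```

With such a helper, each line of the examples becomes a call like
mount_ctx_write(mfd, "o noatime"), and a failure can be followed by a read()
on the same fd to fetch any error message.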

The fsopen() syscall creates a mount context and hangs it off the fd that it
returns.

Netlink is not used because it is optional.

Signed-off-by: David Howells 
---

 arch/x86/entry/syscalls/syscall_32.tbl |    1 
 arch/x86/entry/syscalls/syscall_64.tbl |    1 
 fs/Makefile                            |    2 
 fs/fsopen.c                            |  279 
 include/linux/syscalls.h               |    1 
 include/uapi/linux/magic.h             |    1 
 kernel/sys_ni.c                        |    3 
 7 files changed, 287 insertions(+), 1 deletion(-)
 create mode 100644 fs/fsopen.c

diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 448ac2161112..9bf8d4c62f85 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -391,3 +391,4 @@
 382	i386	pkey_free		sys_pkey_free
 383	i386	statx			sys_statx
 384	i386	arch_prctl		sys_arch_prctl			compat_sys_arch_prctl
+385	i386	fsopen			sys_fsopen
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 5aef183e2f85..9b198c5fc412 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -339,6 +339,7 @@
 330	common	pkey_alloc		sys_pkey_alloc
 331	common	pkey_free		sys_pkey_free
 332	common	statx			sys_statx
+333	common	fsopen			sys_fsopen
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/Makefile b/fs/Makefile
index 8f5142525866..b8fcf48b0400 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -12,7 +12,7 @@ obj-y :=  open.o read_write.o file_table.o super.o \
seq_file.o xattr.o libfs.o fs-writeback.o \
pnode.o splice.o sync.o utimes.o \
stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \
-   sb_config.o
+   sb_config.o fsopen.o
 
 ifeq ($(CONFIG_BLOCK),y)
 obj-y +=   buffer.o block_dev.o direct-io.o mpage.o
diff --git a/fs/fsopen.c b/fs/fsopen.c
new file mode 100644
index ..a4e9d5a7ce2b
--- /dev/null
+++ b/fs/fsopen.c
@@ -0,0 +1,279 @@
+/* Filesystem access-by-fd.
+ *
+ * Copyright (C) 2017 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowe...@redhat.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public Licence
+ * as published by the Free Software Foundation; either version
+ * 2 of the Licence, or (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static struct vfsmount *fs_fs_mnt __read_mostly;
+
+static int fs_fs_release(struct inode *inode, struct file *file)
+{
+   struct sb_config *sc = file->private_data;
+
+   file->private_data = NULL;
+
+   put_sb_config(sc);
+   return 0;
+}
+
+/*
+ * Read any error message back from the fd.  Will be prefixed by "e ".
+ */
+static ssize_t fs_fs_read(struct file *file, char __user *_buf, size_t len, loff_t *pos)
+{
+   struct sb_config *sc = file->private_data;
+   const char *msg;
+   size_t mlen;
+
+   msg = READ_ONCE(sc->error_msg);
+   if (!msg)
+   return -ENODATA;
+
+   mlen = strlen(msg);
+   if (mlen + 2 > len)
+   return -ETOOSMALL;
+   if (copy_to_user(_buf, "e ", 2) != 0 ||
+   

Re: [PATCH] arm64/cpufeature: don't use mutex in bringup path

2017-05-11 Thread Marc Zyngier
On 11/05/17 14:12, Mark Rutland wrote:
> Currently, cpus_set_cap() calls static_branch_enable_cpuslocked(), which
> must take the jump_label mutex.
> 
> We call cpus_set_cap() in the secondary bringup path, from the idle
> thread where interrupts are disabled. Taking a mutex in this path "is a
> NONO" regardless of whether it's contended, and something we must avoid.
> Additionally, the secondary CPU doesn't hold the percpu rwsem (as this
> is held by the primary CPU), so this triggers a lockdep splat.
> 
> This patch fixes both issues by moving the static_key poking from
> cpus_set_cap() into enable_cpu_capabilities(). This means that there is
> a period between detecting an erratum and cpus_have_const_cap(erratum)
> being true, but largely this is fine. Features are only enabled later
> regardless, and most errata workarounds are best-effort.
> 
> This rework means that we can remove the *_cpuslocked() helpers added in
> commit d54bb72551b999dd ("arm64/cpufeature: Use
> static_branch_enable_cpuslocked()").
> 
> Fixes: efd9e03facd075f5 ("arm64: Use static keys for CPU features")
> Signed-off-by: Mark Rutland 
> Cc: Catalin Marinas 
> Cc: Marc Zyniger 
> Cc: Peter Zijlstra 
> Cc: Sebastian Sewior 
> Cc: Suzuki Poulose 
> Cc: Thomas Gleixner 
> Cc: Will Deacon 
> ---
>  arch/arm64/include/asm/cpufeature.h |  3 +--
>  arch/arm64/kernel/cpu_errata.c  |  9 +
>  arch/arm64/kernel/cpufeature.c  | 16 +---
>  3 files changed, 15 insertions(+), 13 deletions(-)
> 
> I'm not sure what to do about ARM64_WORKAROUND_CAVIUM_23154.
> 
> This patch will defer enabling the workaround until all CPUs are up, and I
> can't see a good way of having the workaround on by default, then subsequently
> disabled if no CPUs are affected.

Yeah, this is pretty horrible.

The way I see it, we need an extra static key that would indicate that 
the errata have been applied. In the interval, we need to use the slow 
path and check the per-cpu state. Something like:

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index c132f29322cc..b4cc5a3573eb 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -123,10 +123,17 @@ static void gic_redist_wait_for_rwp(void)
 
 static u64 __maybe_unused gic_read_iar(void)
 {
-   if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_23154))
-   return gic_read_iar_cavium_thunderx();
-   else
-   return gic_read_iar_common();
+   if (static_branch_likely(_workarounds_enabled)) {
+   if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_23154))
+   return gic_read_iar_cavium_thunderx();
+   else
+   return gic_read_iar_common();
+   } else {
+   if (this_cpu_has_cap(ARM64_WORKAROUND_CAVIUM_23154))
+   return gic_read_iar_cavium_thunderx();
+   else
+   return gic_read_iar_common();
+   }
 }
 #endif
 
You can probably easily turn it into something that looks less shit though.
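
One possible tidier shape, sketched as a userspace model rather than kernel
code: pick the predicate first, then dispatch once. All names here are
hypothetical stand-ins: workarounds_finalized models the extra static key,
system_has_erratum models cpus_have_const_cap(), and this_cpu_has_erratum
models this_cpu_has_cap().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; plain variables, not static keys. */
static bool workarounds_finalized;	/* set once all CPUs have been checked */
static bool system_has_erratum;		/* system-wide result, valid once finalized */
static bool this_cpu_has_erratum;	/* per-CPU result, usable before that */

enum iar_path { IAR_COMMON, IAR_CAVIUM_THUNDERX };

/* Use the system-wide answer when it is final, else the per-CPU slow path. */
static bool need_cavium_workaround(void)
{
	if (workarounds_finalized)
		return system_has_erratum;
	return this_cpu_has_erratum;
}

/* Single dispatch point, so the two return statements are not duplicated. */
static enum iar_path select_iar_path(void)
{
	return need_cavium_workaround() ? IAR_CAVIUM_THUNDERX : IAR_COMMON;
}
```

Whether this is worth it in the real driver depends on how the static key and
the capability check interact with code patching, which this model ignores.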

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...


[RFC 02/11] ima: qualify pathname in audit measurement record

2017-05-11 Thread Guilherme Magalhaes
Add new fields (mount namespace id, file inode and device name) to
uniquely identify a pathname across different mount namespaces.
The file inode on a given device is unique, and these fields are
needed alongside the namespace id because that id can be released
and later reused by a different namespace.
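
For illustration, the three added fields could be rendered as in the sketch
below. This is a userspace model with a hypothetical helper name; the patch
itself uses audit_log_format() and audit_log_untrustedstring(), and the latter
may encode the device name differently from the plain quoting assumed here.

```c
#include <stdio.h>

/*
 * Hypothetical userspace model of the fields this patch appends:
 * " mnt_ns=<id> dev=<name> ino=<inode>". Returns the snprintf() result.
 */
static int format_ns_fields(char *buf, size_t len, unsigned int mnt_ns,
			    const char *dev, unsigned long ino)
{
	return snprintf(buf, len, " mnt_ns=%u dev=\"%s\" ino=%lu",
			mnt_ns, dev, ino);
}
```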

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/ima_api.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index c2edba8..b05c1fd 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ima.h"
 
@@ -293,6 +294,7 @@ void ima_audit_measurement(struct integrity_iint_cache *iint,
char hash[(iint->ima_hash->length * 2) + 1];
const char *algo_name = hash_algo_name[iint->ima_hash->algo];
char algo_hash[sizeof(hash) + strlen(algo_name) + 2];
+   struct ns_common *ns;
int i;
 
if (iint->flags & IMA_AUDITED)
@@ -312,6 +314,12 @@ void ima_audit_measurement(struct integrity_iint_cache *iint,
audit_log_format(ab, " hash=");
snprintf(algo_hash, sizeof(algo_hash), "%s:%s", algo_name, hash);
audit_log_untrustedstring(ab, algo_hash);
+   ns = mntns_operations.get(current);
+   audit_log_format(ab, " mnt_ns=%u", ns->inum);
+   mntns_operations.put(ns);
+   audit_log_format(ab, " dev=");
+   audit_log_untrustedstring(ab, iint->inode->i_sb->s_id);
+   audit_log_format(ab, " ino=%lu", iint->inode->i_ino);
 
audit_log_task_info(ab, current);
audit_log_end(ab);
-- 
2.7.4



[RFC 03/11] ima: qualify pathname in measurement file

2017-05-11 Thread Guilherme Magalhaes
Adding new fields (mount namespace id, file inode and device name) to
uniquely identify a pathname in the measurement file considering
multiple mount namespaces. The file inode on a given device is unique
and these fields are required to identify a namespace id since this
id can be released and later reused by a different namespace.
These new fields are added to all measurement templates if
CONFIG_IMA_PER_NAMESPACE is defined.
There will still be one single measurement file even with multiple
namespaces, since remote attestation requires a single, complete list.
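
As an illustration of how the extended format strings are composed, the sketch
below counts the '|'-separated field identifiers in a template format. It is a
userspace model with a hypothetical function name, not the kernel's template
parser.

```c
#include <stddef.h>

/* Count '|'-separated field ids in a format such as "nid|fi|dev|d-ng|n-ng". */
static size_t template_num_fields(const char *fmt)
{
	size_t n;

	if (!fmt || !*fmt)
		return 0;
	for (n = 1; *fmt; fmt++)
		if (*fmt == '|')
			n++;
	return n;
}
```

With CONFIG_IMA_PER_NAMESPACE set, each built-in template gains three fields,
so "ima-sig" grows from three fields to six, still below
IMA_TEMPLATE_NUM_FIELDS_MAX.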

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/ima/Kconfig            |  8 
 security/integrity/ima/ima.h              | 12 ++
 security/integrity/ima/ima_template.c     | 10 -
 security/integrity/ima/ima_template_lib.c | 70 +++
 security/integrity/ima/ima_template_lib.h | 13 ++
 5 files changed, 111 insertions(+), 2 deletions(-)

diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index 370eb2f..7331ff6 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -219,3 +219,11 @@ config IMA_APPRAISE_SIGNED_INIT
default n
help
   This option requires user-space init to be signed.
+
+config IMA_PER_NAMESPACE
+   bool "Enable per mount-namespace handling of IMA policy."
+   depends on IMA
+   default n
+   help
+   This option enables another API in securityfs allowing IMA policies
+   to be defined per mount namespace.
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index b563fbd..42fb91ba 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -47,7 +47,19 @@ enum tpm_pcrs { TPM_PCR0 = 0, TPM_PCR8 = 8 };
 #define IMA_TEMPLATE_NUM_FIELDS_MAX 15
 
 #define IMA_TEMPLATE_IMA_NAME "ima"
+#define IMA_TEMPLATE_IMA_NG_NAME "ima-ng"
+#define IMA_TEMPLATE_IMA_SIG_NAME "ima-sig"
+
+#ifndef CONFIG_IMA_PER_NAMESPACE
 #define IMA_TEMPLATE_IMA_FMT "d|n"
+#define IMA_TEMPLATE_IMA_NG_FMT "d-ng|n-ng"
+#define IMA_TEMPLATE_IMA_SIG_FMT "d-ng|n-ng|sig"
+#else
+#define IMA_TEMPLATE_IMA_FMT "nid|fi|dev|d|n"
+#define IMA_TEMPLATE_IMA_NG_FMT "nid|fi|dev|d-ng|n-ng"
+#define IMA_TEMPLATE_IMA_SIG_FMT "nid|fi|dev|d-ng|n-ng|sig"
+#endif
+
 
 /* current content of the policy */
 extern int ima_policy_flag;
diff --git a/security/integrity/ima/ima_template.c b/security/integrity/ima/ima_template.c
index cebb37c..db65c09 100644
--- a/security/integrity/ima/ima_template.c
+++ b/security/integrity/ima/ima_template.c
@@ -21,8 +21,8 @@
 
 static struct ima_template_desc builtin_templates[] = {
{.name = IMA_TEMPLATE_IMA_NAME, .fmt = IMA_TEMPLATE_IMA_FMT},
-   {.name = "ima-ng", .fmt = "d-ng|n-ng"},
-   {.name = "ima-sig", .fmt = "d-ng|n-ng|sig"},
+   {.name = IMA_TEMPLATE_IMA_NG_NAME, .fmt = IMA_TEMPLATE_IMA_NG_FMT},
+   {.name = IMA_TEMPLATE_IMA_SIG_NAME, .fmt = IMA_TEMPLATE_IMA_SIG_FMT},
{.name = "", .fmt = ""},/* placeholder for a custom format */
 };
 
@@ -40,6 +40,12 @@ static struct ima_template_field supported_fields[] = {
 .field_show = ima_show_template_string},
{.field_id = "sig", .field_init = ima_eventsig_init,
 .field_show = ima_show_template_sig},
+   {.field_id = "nid", .field_init = ima_namespaceid_init,
+.field_show = ima_show_namespaceid},
+   {.field_id = "fi", .field_init = ima_filei_init,
+.field_show = ima_show_filei},
+   {.field_id = "dev", .field_init = ima_dev_init,
+.field_show = ima_show_dev},
 };
 #define MAX_TEMPLATE_NAME_LEN 15
 
diff --git a/security/integrity/ima/ima_template_lib.c b/security/integrity/ima/ima_template_lib.c
index f9ba37b..50cde10 100644
--- a/security/integrity/ima/ima_template_lib.c
+++ b/security/integrity/ima/ima_template_lib.c
@@ -14,6 +14,8 @@
  */
 
 #include "ima_template_lib.h"
+#include 
+#include 
 
 static bool ima_template_hash_algo_allowed(u8 algo)
 {
@@ -330,3 +332,71 @@ int ima_eventsig_init(struct ima_event_data *event_data,
 out:
return rc;
 }
+
+int ima_namespaceid_init(struct ima_event_data *event_data,
+struct ima_field_data *field_data)
+{
+   u8 tmpbuf[64];
+   struct ns_common *ns;
+
+   ns = mntns_operations.get(current);
+   snprintf(tmpbuf, sizeof(tmpbuf), "mnt-ns=%u", ns->inum);
+   mntns_operations.put(ns);
+
+   return ima_write_template_field_data(tmpbuf, strlen(tmpbuf), DATA_FMT_STRING, field_data);
+}
+
+void ima_show_namespaceid(struct seq_file *m, enum ima_show_type show,
+   struct ima_field_data *field_data)
+{
+   ima_show_template_field_data(m, show, DATA_FMT_STRING, field_data);
+}
+
+int ima_filei_init(struct ima_event_data *event_data,
+struct ima_field_data *field_data)
+{
+   u8 tmpbuf[64];
+   struct inode *inode;
+   int 

Re: [RFC][PATCH] sched/deadline: Remove if statement before clearing throttle and yielded

2017-05-11 Thread Juri Lelli
Hi,

On 10/05/17 09:50, Steven Rostedt wrote:
> [
>   This is an RFC as I didn't run any benchmarks. It just seemed a bit 
>   weird to me that we would add such a check instead of just clearing
>   these variables out regardless.
> ]
> 
> The function replenish_dl_entity() clears dl_throttled and dl_yielded,
> but checks first if they are set before doing so. As these variables
> are in the same cache locale of other variables being modified, there's
> no advantage in checking if they are set before clearing them. But
> having the compare takes slots away from the branch prediction.
> 
> Signed-off-by: Steven Rostedt (VMware) 
> ---
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index a2ce590..9748d33 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -423,10 +423,8 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se,
>   dl_se->runtime = pi_se->dl_runtime;
>   }
>  
> - if (dl_se->dl_yielded)
> - dl_se->dl_yielded = 0;
> - if (dl_se->dl_throttled)
> - dl_se->dl_throttled = 0;
> + dl_se->dl_yielded = 0;
> + dl_se->dl_throttled = 0;
>  }

Looks good to me.

Peter, any particular reason why you wanted to first check the values?
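
For reference, both forms leave the two fields in the same state; what differs
is whether a compare and branch precede each store. A small userspace check,
using a hypothetical two-field struct in place of struct sched_dl_entity:

```c
/* Hypothetical stand-in for the two flags in struct sched_dl_entity. */
struct dl_flags {
	int dl_throttled;
	int dl_yielded;
};

/* Current code: test each flag, then clear it only if it is set. */
static void clear_checked(struct dl_flags *f)
{
	if (f->dl_yielded)
		f->dl_yielded = 0;
	if (f->dl_throttled)
		f->dl_throttled = 0;
}

/* Proposed code: store zero unconditionally. */
static void clear_unconditional(struct dl_flags *f)
{
	f->dl_yielded = 0;
	f->dl_throttled = 0;
}
```

This only shows functional equivalence. It says nothing about cache or branch
predictor behaviour, which is the part that would need benchmarks.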

Best,

- Juri


[PATCH 06/14] VFS: Introduce a superblock configuration context [ver #2]

2017-05-11 Thread David Howells
Introduce a superblock configuration context concept to be used during
superblock creation for mount and superblock reconfiguration for remount.
This is allocated at the beginning of the mount procedure and into it is
placed:

 (1) Filesystem type.

 (2) Namespaces.

 (3) Device name.

 (4) Superblock flags (MS_*).

 (5) Security details.

 (6) Filesystem-specific data, as set by the mount options.

It also gives a place in which to hang an error message for later retrieval
(see the mount-by-fd syscall later in this series).

Rather than calling fs_type->mount(), an sb_config struct is created and
fs_type->init_sb_config() is called to set it up.  fs_type->sb_config_size
says how much space should be allocated for the config context.  The
sb_config struct is placed at the beginning and any extra space is for the
filesystem's use.
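
The layout described above can be sketched in userspace C as follows. The
struct and function names are hypothetical and heavily simplified (the real
struct sb_config carries many more fields), and the fallback to the common
size is a choice made for this sketch, not something taken from the patch.

```c
#include <stdlib.h>

/* Hypothetical, simplified model of the common context. */
struct sb_config_model {
	const char *device;
};

/* Filesystem-private context: the common part must come first. */
struct myfs_config {
	struct sb_config_model sc;	/* placed at the beginning */
	int myfs_option;		/* extra space for the filesystem's use */
};

/*
 * Allocate 'size' zeroed bytes for a context, as a core layer might do with
 * fs_type->sb_config_size. Uses the common size if 'size' is smaller.
 */
static struct sb_config_model *alloc_config(size_t size)
{
	if (size < sizeof(struct sb_config_model))
		size = sizeof(struct sb_config_model);
	return calloc(1, size);
}
```

Because the common struct is the first member, a filesystem can cast the
pointer it is handed back to its own type, for example
(struct myfs_config *)sc, to reach its private fields.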

A set of operations have to be set by ->init_sb_config() to provide
freeing, duplication, option parsing, binary data parsing, validation,
mounting and superblock filling.
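
As a rough illustration of how such an operation table might be driven, here
is a userspace sketch. All names (struct ctx, struct ctx_ops, demo_init(),
run_mount()) are hypothetical and cover only a subset of the operations
listed above; the real member names and signatures are those in
include/linux/sb_config.h in this patch.

```c
#include <assert.h>	/* for the checks in the usage note */
#include <string.h>

struct ctx;

/* Hypothetical operation table; member names are illustrative only. */
struct ctx_ops {
	int  (*parse_option)(struct ctx *c, const char *opt);
	int  (*validate)(struct ctx *c);
	int  (*mount)(struct ctx *c);
	void (*free_ctx)(struct ctx *c);
};

struct ctx {
	const struct ctx_ops *ops;
	int read_only;
	int mounted;
	int freed;
};

static int demo_parse_option(struct ctx *c, const char *opt)
{
	if (strcmp(opt, "ro") == 0) {
		c->read_only = 1;
		return 0;
	}
	if (strcmp(opt, "rw") == 0) {
		c->read_only = 0;
		return 0;
	}
	return -1;	/* unknown option */
}

static int demo_validate(struct ctx *c) { (void)c; return 0; }
static int demo_mount(struct ctx *c) { c->mounted = 1; return 0; }
static void demo_free(struct ctx *c) { c->freed = 1; }

static const struct ctx_ops demo_ops = {
	.parse_option	= demo_parse_option,
	.validate	= demo_validate,
	.mount		= demo_mount,
	.free_ctx	= demo_free,
};

/* Stand-in for ->init_sb_config(): its job here is to install the ops. */
int demo_init(struct ctx *c)
{
	memset(c, 0, sizeof(*c));
	c->ops = &demo_ops;
	return 0;
}

/* Generic driver: parse each option, validate, mount; always free. */
int run_mount(struct ctx *c, const char *const *opts, int n)
{
	int ret = 0;

	for (int i = 0; i < n && ret == 0; i++)
		ret = c->ops->parse_option(c, opts[i]);
	if (ret == 0)
		ret = c->ops->validate(c);
	if (ret == 0)
		ret = c->ops->mount(c);
	c->ops->free_ctx(c);
	return ret;
}
```

The point of the indirection is that the generic code never needs to know
which filesystem it is driving; a failed parse stops the sequence before
validation or mounting, and the context is still released.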

It should be noted that, whilst this patch adds a lot of lines of code,
there is quite a bit of duplication with existing code that can be
eliminated should all filesystems be converted over.

Signed-off-by: David Howells 
---

 Documentation/filesystems/mounting.txt |  456
 fs/Makefile                            |    3
 fs/internal.h                          |    2
 fs/libfs.c                             |    1
 fs/namespace.c                         |  256 --
 fs/nfs/nfs4super.c                     |    1
 fs/proc/root.c                         |    1
 fs/sb_config.c                         |  326 +++
 fs/super.c                             |   54 +++-
 include/linux/fs.h                     |   14 +
 include/linux/lsm_hooks.h              |   38 +++
 include/linux/mount.h                  |    4
 include/linux/sb_config.h              |   93 +++
 include/linux/security.h               |   29 ++
 security/security.c                    |   32 ++
 security/selinux/hooks.c               |  170
 16 files changed, 1449 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/filesystems/mounting.txt
 create mode 100644 fs/sb_config.c
 create mode 100644 include/linux/sb_config.h

diff --git a/Documentation/filesystems/mounting.txt b/Documentation/filesystems/mounting.txt
new file mode 100644
index ..03e9086f754d
--- /dev/null
+++ b/Documentation/filesystems/mounting.txt
@@ -0,0 +1,456 @@
+ ===================
+ FILESYSTEM MOUNTING
+ ===================
+
+CONTENTS
+
+ (1) Overview.
+
+ (2) The superblock configuration context.
+
+ (3) The superblock config operations.
+
+ (4) Superblock config security.
+
+ (5) VFS superblock config operations.
+
+
+
+OVERVIEW
+
+
+The creation of new mounts is now to be done in a multistep process:
+
+ (1) Create a superblock configuration context.
+
+ (2) Parse the options and attach them to the context.  Options may be passed
+ individually from userspace.
+
+ (3) Validate and pre-process the context.
+
+ (4) Get or create a superblock and mountable root.
+
+ (5) Perform the mount.
+
+ (6) Return an error message attached to the context.
+
+ (7) Destroy the context.
+
+To support this, the file_system_type struct gains two new fields:
+
+   unsigned short sb_config_size;
+
+which indicates the total amount of space that should be allocated for context
+data (see the Superblock Configuration Context section), and:
+
+   int (*init_sb_config)(struct sb_config *sc, struct super_block *src_sb);
+
+which is invoked to set up the filesystem-specific parts of a superblock
+configuration context, including the additional space.  The src_sb parameter is
+used to convey the superblock from which the filesystem may draw extra
+information (such as namespaces), for submount (SB_CONFIG_FOR_SUBMOUNT) or
+remount (SB_CONFIG_FOR_REMOUNT) purposes; otherwise it will be NULL.
+
+Note that security initialisation is done *after* the filesystem is called so
+that the namespaces may be adjusted first.
+
+And the super_operations struct gains one:
+
+   int (*remount_fs_sc) (struct super_block *, struct sb_config *);
+
+This shadows the ->remount_fs() operation and takes a prepared superblock
+configuration context instead of the mount flags and data page.  It may modify
+the ms_flags in the context for the caller to pick up.
+
+[NOTE] remount_fs_sc is intended as a replacement for remount_fs.
+
+
+
+THE SUPERBLOCK CONFIGURATION CONTEXT
+
+
+The creation and reconfiguration of a superblock is governed by a superblock
+configuration context.  This is represented by the sb_config structure:
+
+   struct sb_config {
+   const struct sb_config_operations *ops;
+   

[RFC 01/11] ima: qualify pathname in audit info record

2017-05-11 Thread Guilherme Magalhaes
Add a new field (the mount namespace id, alongside the already existing file
inode and device name) to uniquely identify a pathname across different
mount namespaces. The file inode on a given device is unique, and these
fields are required to disambiguate a namespace id, since the id can be
released and later reused by a different namespace.
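
For comparison from userspace, the same namespace number is visible as the
inode number of /proc/<pid>/ns/mnt. The sketch below reads it with stat()
and formats it the way the new audit field would look. read_ns_inum() and
format_mnt_ns() are hypothetical helper names, and the sketch assumes a
Linux system with /proc mounted.

```c
#include <assert.h>	/* for the checks in the usage note */
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Read the inode number of a namespace link such as /proc/self/ns/mnt.
 * Returns 0 and stores the number on success, -1 if stat() fails.
 */
int read_ns_inum(const char *path, unsigned long *inum)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	*inum = (unsigned long)st.st_ino;
	return 0;
}

/* Format an audit-style fragment; returns snprintf()'s result. */
int format_mnt_ns(char *buf, size_t len, unsigned long inum)
{
	return snprintf(buf, len, " mnt_ns=%lu", inum);
}
```

Calling read_ns_inum("/proc/self/ns/mnt", &v) and passing v to
format_mnt_ns() yields a string that can be matched against the mnt_ns=
value in an audit record produced by a task in the same mount namespace.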

Signed-off-by: Guilherme Magalhaes 
---
 security/integrity/integrity_audit.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/security/integrity/integrity_audit.c b/security/integrity/integrity_audit.c
index 90987d1..e675e42 100644
--- a/security/integrity/integrity_audit.c
+++ b/security/integrity/integrity_audit.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "integrity.h"
 
 static int integrity_audit_info;
@@ -52,8 +53,12 @@ void integrity_audit_msg(int audit_msgno, struct inode *inode,
 	audit_log_format(ab, " comm=");
 	audit_log_untrustedstring(ab, get_task_comm(name, current));
 	if (fname) {
+		struct ns_common *ns;
 		audit_log_format(ab, " name=");
 		audit_log_untrustedstring(ab, fname);
+		ns = mntns_operations.get(current);
+		audit_log_format(ab, " mnt_ns=%u", ns->inum);
+		mntns_operations.put(ns);
 	}
 	if (inode) {
 		audit_log_format(ab, " dev=");
-- 
2.7.4


