[PATCH net-next 3/5] net: l2tp: netlink: l2tp_nl_tunnel_send: set UDP6 checksum flags

2016-11-04 Thread Asbjoern Sloth Toennesen
This patch causes the proper attribute flags to be set,
in the case that IPv6 UDP checksums are disabled, so that
userspace, i.e. `ip l2tp show tunnel`, knows about it.

Signed-off-by: Asbjoern Sloth Toennesen 
---
 net/l2tp/l2tp_netlink.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index e45c5409..1b3fcde 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -385,6 +385,16 @@ static int l2tp_nl_tunnel_send(struct sk_buff *skb, u32 portid, u32 seq, int fla
 			    nla_put_flag(skb, L2TP_ATTR_UDP_CSUM))
 				goto nla_put_failure;
 			break;
+#if IS_ENABLED(CONFIG_IPV6)
+		case AF_INET6:
+			if (udp_get_no_check6_tx(sk) &&
+			    nla_put_flag(skb, L2TP_ATTR_UDP_ZERO_CSUM6_TX))
+				goto nla_put_failure;
+			if (udp_get_no_check6_rx(sk) &&
+			    nla_put_flag(skb, L2TP_ATTR_UDP_ZERO_CSUM6_RX))
+				goto nla_put_failure;
+			break;
+#endif
 		}
 		if (nla_put_u16(skb, L2TP_ATTR_UDP_SPORT, ntohs(inet->inet_sport)) ||
 		    nla_put_u16(skb, L2TP_ATTR_UDP_DPORT, ntohs(inet->inet_dport)))
-- 
2.10.1
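
For readers unfamiliar with netlink flag attributes: a flag carries no payload, so its mere presence in the message is the information. The userspace sketch below mimics that encoding and the "only emit the flag when the condition holds" pattern of the patch. Everything in it is an assumption made for illustration: `demo_attr`, `demo_msg`, `put_flag()`, `has_flag()`, `fill_csum6_flags()` and the attribute numbers are hypothetical, and none of it is the kernel's `nla_put_flag()` or the real `L2TP_ATTR_*` values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute numbers, for illustration only. */
enum { DEMO_ATTR_ZERO_CSUM6_TX = 1, DEMO_ATTR_ZERO_CSUM6_RX = 2 };

/* Same idea as a netlink attribute header: a length, then a type. */
struct demo_attr {
	uint16_t len;	/* header plus payload, in bytes */
	uint16_t type;
};

struct demo_msg {
	unsigned char buf[64];
	size_t used;
};

/* Append a header-only attribute; returns -1 when the buffer is full. */
static int put_flag(struct demo_msg *msg, uint16_t type)
{
	struct demo_attr attr = { .len = sizeof(attr), .type = type };

	assert(msg != NULL);
	if (msg->used + sizeof(attr) > sizeof(msg->buf))
		return -1;
	memcpy(msg->buf + msg->used, &attr, sizeof(attr));
	msg->used += sizeof(attr);
	return 0;
}

/* Walk the attributes; a flag counts as set simply by being present. */
static int has_flag(const struct demo_msg *msg, uint16_t type)
{
	size_t off = 0;

	while (off + sizeof(struct demo_attr) <= msg->used) {
		struct demo_attr attr;

		memcpy(&attr, msg->buf + off, sizeof(attr));
		if (attr.type == type)
			return 1;
		if (attr.len < sizeof(attr))
			break;	/* malformed entry, stop walking */
		off += attr.len;
	}
	return 0;
}

/* Mirrors the shape of the patch: emit each flag only if its condition holds. */
static int fill_csum6_flags(struct demo_msg *msg, int no_check6_tx,
			    int no_check6_rx)
{
	if (no_check6_tx && put_flag(msg, DEMO_ATTR_ZERO_CSUM6_TX))
		return -1;
	if (no_check6_rx && put_flag(msg, DEMO_ATTR_ZERO_CSUM6_RX))
		return -1;
	return 0;
}
```

Calling `fill_csum6_flags(&msg, 1, 0)` on an empty message leaves a single four-byte attribute in the buffer; a reader then treats the TX flag as set and the RX flag as clear, which is the distinction the patch lets userspace see.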



Re: [PATCH 0/3] rtc: remove modular usage from non-modular code

2016-11-04 Thread Alexandre Belloni
On 31/10/2016 at 14:55:24 -0400, Paul Gortmaker wrote :
> My ongoing audit looking for non-modular code that needlessly uses
> modular macros (vs. built-in equivalents) and/or has dead code
> relating to module unloading that can never be executed led to the
> creation of these rtc related commits.
> 
> For anyone new to the underlying goal of this cleanup, we are trying to
> not use module support for code that can never be built as a module since:
> 
>  (1) it is easy to accidentally write unused module_exit and remove code
>  (2) it can be misleading when reading the source, thinking it can be
>  modular when the Makefile and/or Kconfig prohibit it
>  (3) it requires the include of the module.h header file which in turn
>  includes nearly everything else, thus adding to CPP overhead.
>  (4) it gets copied/replicated into other code and spreads like weeds.
> 
> Build tested on current linux-next (sparc32) to ensure no silly typos
> or implicit include issues that would break compilation crept in.
> 
> ---
> 
> Cc: Alessandro Zummo 
> Cc: Alexandre Belloni 
> Cc: "David S. Miller" 
> Cc: rtc-li...@googlegroups.com
> Cc: sparcli...@vger.kernel.org
> 
> Paul Gortmaker (3):
>   rtc: make rtc-lib explicitly non-modular
>   rtc: sparc: make starfire explicitly non-modular
>   rtc: sparc: make sun4v explicitly non-modular
> 
>  drivers/rtc/rtc-lib.c  |  4 +---
>  drivers/rtc/rtc-starfire.c | 10 --
>  drivers/rtc/rtc-sun4v.c| 10 --
>  3 files changed, 9 insertions(+), 15 deletions(-)
> 
Applied, thanks.

-- 
Alexandre Belloni, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Re: [PATCH 1/2] pinctrl: tegra: Add DT binding for io pads control

2016-11-04 Thread Linus Walleij
On Wed, Nov 2, 2016 at 10:09 AM, Laxman Dewangan  wrote:

> NVIDIA Tegra124 and later SoCs support the multi-voltage level and
> low power state of some of its IO pads. The IO pads can work in
> the voltage of the 1.8V and 3.3V of IO power rail sources. When IO
> interface are not used then IO pads can be configure in low power
> state to reduce the power from that IO pads.
>
> On Tegra124, the IO power rail source is auto detected by SoC and hence
> it is only require to configure in low power mode if IO pads are not
> used.
>
> On T210 onwards, the auto-detection is removed from SoC and hence SW
> must configure the PMC register explicitly to set proper voltage in
> IO pads based on IO rail power source voltage.
>
> Add DT binding document for detailing the DT properties for
> configuring IO pads voltage levels and its power state.
>
> Signed-off-by: Laxman Dewangan 
(...)

> +-nvidia,power-source-voltage:  Integer. The voltage level of IO pads. The
> +   valid values are 1.8V and 3.3V. Macros are
> +   defined for these voltage levels in
> +   
> +   Use TEGRA_IO_PAD_POWER_SOURCE_180UV for 1.8V
> +   Use TEGRA_IO_PAD_POWER_SOURCE_330UV for 3.3V
> +
> +   All IO pads do not support the 1.8V/3.3V
> +   configurations. Valid values for "pins" are
> +   audio-hv, dmic, gpio, sdmmc1, sdmmc3, spi-hv.


As mentioned in another patch, what is wrong with the standard
power-source binding?

Yours,
Linus Walleij


Re: [PATCH] pinctrl: sunxi: make bool drivers explicitly non-modular

2016-11-04 Thread Linus Walleij
On Sun, Oct 30, 2016 at 2:00 AM, Paul Gortmaker
 wrote:

> None of the Kconfigs for any of these drivers are tristate,
> meaning that they currently are not being built as a module by anyone.
>
> Lets remove the modular code that is essentially orphaned, so that
> when reading the drivers there is no doubt they are builtin-only.  All
> drivers get essentially the same change, so they are handled in batch.
>
> Changes are (1) use builtin_platform_driver, (2) use init.h header
> (3) delete module_exit related code, (4) delete MODULE_DEVICE_TABLE,
> and (5) delete MODULE_LICENCE/MODULE_AUTHOR and associated tags.
>
> Since module_platform_driver() uses the same init level priority as
> builtin_platform_driver() the init ordering remains unchanged with
> this commit.
>
> Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.
>
> We do delete the MODULE_LICENSE etc. tags since all that information
> is already contained at the top of each file in the comments.
>
> Cc: Boris Brezillon 
> Cc: Chen-Yu Tsai 
> Cc: Hans de Goede 
> Cc: Maxime Ripard 
> Cc: Linus Walleij 
> Cc: Patrice Chotard 
> Cc: Hongzhou Yang 
> Cc: Fabian Frederick 
> Cc: Maxime Coquelin 
> Cc: Vishnu Patekar 
> Cc: Mylene Josserand 
> Cc: linux-g...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Signed-off-by: Paul Gortmaker 

Patch applied with Maxime's ACK.

Yours,
Linus Walleij


Re: [PATCH 2/2] pinctrl: tegra: Add driver to configure voltage and power of io pads

2016-11-04 Thread Linus Walleij
On Wed, Nov 2, 2016 at 10:09 AM, Laxman Dewangan  wrote:

> NVIDIA Tegra124 and later SoCs support the multi-voltage level and
> low power state of some of its IO pads. The IO pads can work in
> the voltage of the 1.8V and 3.3V of IO power rail sources. When IO
> interface are not used then IO pads can be configure in low power
> state to reduce the power from that IO pads.
>
> On Tegra124, the IO power rail source is auto detected by SoC and hence
> it is only require to configure in low power mode if IO pads are not
> used.
>
> On T210 onwards, the auto-detection is removed from SoC and hence SW
> must configure the PMC register explicitly to set proper voltage in
> IO pads based on IO rail power source voltage.
>
> This driver adds the IO pad driver to configure the power state and
> IO pad voltage based on the usage and power tree via pincontrol
> framework. The configuration can be static and dynamic.
>
> Signed-off-by: Laxman Dewangan 

Looking for an ACK from Stephen &| Thierry.

> ---
> On top of the branch from Thierry's T186 work
> https://github.com/thierryreding/linux/tree/tegra186

But it's an orthogonal patch right?

The build robot seems to have problems with it so pls fix these.

> +static const struct pinconf_generic_params tegra_io_pads_cfg_params[] = {
> +   {
> +   .property = "nvidia,power-source-voltage",
> +   .param = TEGRA_IO_PAD_POWER_SOURCE_VOLTAGE,
> +   },
> +};

Why can you not use the standard power-source binding
from Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt
instead of inventing this nvidia,* variant?

Yours,
Linus Walleij


Re: [PATCH v2 2/3] irqchip: mtk-cirq: Add mediatek mtk-cirq implement

2016-11-04 Thread Marc Zyngier
On Fri, Nov 04 2016 at 04:42:57 AM, Youlin Pei  wrote:
> On Tue, 2016-11-01 at 20:49 +, Marc Zyngier wrote:
>> On Tue, Nov 01 2016 at 11:52:01 AM, Youlin Pei  wrote:
>> > In Mediatek SOCs, the CIRQ is a low power interrupt controller
>> > designed to works outside MCUSYS which comprises with Cortex-Ax
>> > cores,CCI and GIC.
>> >
>> > The CIRQ controller is integrated in between MCUSYS( include
>> > Cortex-Ax, CCI and GIC ) and interrupt sources as the second
>> > level interrupt controller. The external interrupts which outside
>> > MCUSYS will feed through CIRQ then bypass to GIC. CIRQ can monitors
>> > all edge trigger interupts. When an edge interrupt is triggered,
>> > CIRQ can record the status and generate a pulse signal to GIC when
>> > flush command executed.
>> >
>> > When system enters sleep mode, MCUSYS will be turned off to improve
>> > power consumption, also GIC is power down. The edge trigger interrupts
>> > will be lost in this scenario without CIRQ.
>> >
>> > This commit provides the CIRQ irqchip implement.
>> >
>> > Signed-off-by: Youlin Pei 
>> > ---
>> >  drivers/irqchip/Makefile   |2 +-
>> >  drivers/irqchip/irq-mtk-cirq.c |  262 
>> > 
>> >  2 files changed, 263 insertions(+), 1 deletion(-)
>> >  create mode 100644 drivers/irqchip/irq-mtk-cirq.c
>> >
>> > diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
>> > index e4dbfc8..8f33580 100644
>> > --- a/drivers/irqchip/Makefile
>> > +++ b/drivers/irqchip/Makefile
>> > @@ -60,7 +60,7 @@ obj-$(CONFIG_BCM7120_L2_IRQ) += irq-bcm7120-l2.o
>> >  obj-$(CONFIG_BRCMSTB_L2_IRQ)  += irq-brcmstb-l2.o
>> >  obj-$(CONFIG_KEYSTONE_IRQ)    += irq-keystone.o
>> >  obj-$(CONFIG_MIPS_GIC)        += irq-mips-gic.o
>> > -obj-$(CONFIG_ARCH_MEDIATEK)   += irq-mtk-sysirq.o
>> > +obj-$(CONFIG_ARCH_MEDIATEK)   += irq-mtk-sysirq.o irq-mtk-cirq.o
>> >  obj-$(CONFIG_ARCH_DIGICOLOR)  += irq-digicolor.o
>> >  obj-$(CONFIG_RENESAS_H8300H_INTC) += irq-renesas-h8300h.o
>> >  obj-$(CONFIG_RENESAS_H8S_INTC)    += irq-renesas-h8s.o
>> > diff --git a/drivers/irqchip/irq-mtk-cirq.c b/drivers/irqchip/irq-mtk-cirq.c
>> > new file mode 100644
>> > index 000..fc43ef3
>> > --- /dev/null
>> > +++ b/drivers/irqchip/irq-mtk-cirq.c
>> > @@ -0,0 +1,262 @@
>> > +/*
>> > + * Copyright (c) 2016 MediaTek Inc.
>> > + * Author: Youlin.Pei 
>> > + *
>> > + * This program is free software; you can redistribute it and/or modify
>> > + * it under the terms of the GNU General Public License version 2 as
>> > + * published by the Free Software Foundation.
>> > + *
>> > + * This program is distributed in the hope that it will be useful,
>> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> > + * GNU General Public License for more details.
>> > + */
>> > +
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +
>> > +#define CIRQ_ACK  0x40
>> > +#define CIRQ_MASK_SET 0xc0
>> > +#define CIRQ_MASK_CLR 0x100
>> > +#define CIRQ_SENS_SET 0x180
>> > +#define CIRQ_SENS_CLR 0x1c0
>> > +#define CIRQ_POL_SET  0x240
>> > +#define CIRQ_POL_CLR  0x280
>> > +#define CIRQ_CONTROL  0x300
>> > +
>> > +#define CIRQ_EN       0x1
>> > +#define CIRQ_EDGE     0x2
>> > +#define CIRQ_FLUSH    0x4
>> > +
>> > +#define CIRQ_IRQ_NUM  0x200
>> > +
>> > +struct mtk_cirq_chip_data {
>> > +  void __iomem *base;
>> > +  unsigned int ext_irq_start;
>> > +};
>> > +
>> > +static struct mtk_cirq_chip_data *cirq_data;
>> 
>> Are you guaranteed that you'll only ever have a single CIRQ in any
>> system?
>
> In Mediatek's SOC, only hace a single CIRQ.
>
>> 
>> > +
>> > +static void mtk_cirq_write_mask(struct irq_data *data, unsigned int offset)
>> > +{
>> > +  struct mtk_cirq_chip_data *chip_data = data->chip_data;
>> > +  unsigned int cirq_num = data->hwirq;
>> > +  u32 mask = 1 << (cirq_num % 32);
>> > +
>> > +  writel(mask, chip_data->base + offset + (cirq_num / 32) * 4);
>> 
>> Why can't you use the relaxed accessors?
>
> It seems that i use wrong function, i will change the writel to
> writel_relaxed in next version.
>
>> 
>> > +}
>> > +
>> > +static void mtk_cirq_mask(struct irq_data *data)
>> > +{
>> > +  mtk_cirq_write_mask(data, CIRQ_MASK_SET);
>> > +  irq_chip_mask_parent(data);
>> > +}
>> > +
>> > +static void mtk_cirq_unmask(struct irq_data *data)
>> > +{
>> > +  mtk_cirq_write_mask(data, CIRQ_MASK_CLR);
>> > +  irq_chip_unmask_parent(data);
>> > +}
>> > +
>> > +static void mtk_cirq_eoi(struct irq_data *data)
>> > +{
>> > +  mtk_cirq_write_mask(data, CIRQ_ACK);
>> 
>> EOI and ACK have very different semantics. What is this write actually
>> doing? Also, you're now doing an additional MMIO write on each interrupt
>> EOI, doubling 
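
The register arithmetic in the quoted `mtk_cirq_write_mask()` can be checked in isolation. The sketch below is a userspace illustration, not driver code: `cirq_reg_bit` and `cirq_locate()` are hypothetical names, the offsets are copied from the quoted patch, and treating `CIRQ_IRQ_NUM` (0x200) as the number of interrupt lines is an assumption. It uses an unsigned constant for the shift, because shifting a signed `1` into bit 31 is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets copied from the quoted patch. */
#define CIRQ_MASK_SET 0xc0
#define CIRQ_MASK_CLR 0x100

struct cirq_reg_bit {
	uint32_t reg_offset;	/* byte offset of the 32-bit register */
	uint32_t mask;		/* single bit for this interrupt */
};

/*
 * Each 32-bit register covers 32 interrupts, so hwirq / 32 selects the
 * register within the bank and hwirq % 32 selects the bit within it.
 */
static struct cirq_reg_bit cirq_locate(uint32_t bank_offset, uint32_t hwirq)
{
	struct cirq_reg_bit loc;

	assert(hwirq < 0x200);	/* assumed meaning of CIRQ_IRQ_NUM */
	loc.reg_offset = bank_offset + (hwirq / 32) * 4;
	loc.mask = UINT32_C(1) << (hwirq % 32);
	return loc;
}
```

For example, interrupt 37 in the mask-set bank lands in the second register (offset 0xc4) at bit 5, while interrupt 31 in the mask-clear bank uses the top bit of the register at 0x100.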

Re: [PATCH 1/1] gpio: lib: Add gpio_is_enabled() to get pin mode

2016-11-04 Thread Linus Walleij
On Wed, Nov 2, 2016 at 1:17 PM, Laxman Dewangan  wrote:

> Many of devices/SoCs supports the GPIO and special IO function
> from their pins. On such cases, there is always configuration
> bits to select the mode of pin as GPIO or as the special IO mode.
> The functional modes are selected by pinmux option.
>
> When device booted and reach to kernel, it is not possible to get
> the current configuration of pin whether it is in GPIO mode or
> in special IO mode without configurations.
>
> Add APIs to return the current mode of pins without requesting it
> as GPIO to find out the current mode.
> This helps on dumping the pin configuration from debug/test utility
> to get the current mode GPIO or functional mode.
>
> The typical utility looks as:
> pin_dump(pin)
> {
> if(gpio_is_enabled(pin)) {
> dump direction using get_direction()
> } else {
> dump pinmux option and its configurations.
> }
> }

Yeah but since pinctrl and pinmux have their own debugfs files why is this
necessary? I understand it is convenient but only for debugging
right? Then the inconvenience of using pinctrl's debugfs files should
be bearable.

Also it is possible for any GPIO chip to implement its own
debug print if they like, check what we do in
->dbg_show in drivers/pinctrl/nomadik/pinctrl-nomadik.c
for example.

If the use is for debug prints, keep it driver-local.

(...)
> +static inline int gpio_is_enabled(unsigned gpio)
> +{
> +   return gpiod_is_enabled(gpio_to_desc(gpio));
> +}

NAK. Why would we encourage the non-descriptor interface by
adding to it? Better only offer this to the descriptor users.

Yours,
Linus Walleij


Re: console issue since 3.6, console=ttyS1 hangs

2016-11-04 Thread Peter Hurley
On Fri, Nov 4, 2016 at 3:33 PM, Nathan Zimmer  wrote:
> On Thu, Nov 03, 2016 at 06:25:46PM -0600, Peter Hurley wrote:
>> On Wed, Nov 2, 2016 at 9:29 AM, Nathan Zimmer  wrote:
>> > On Mon, Oct 31, 2016 at 08:55:49PM -0600, Peter Hurley wrote:
>> >> On Mon, Oct 31, 2016 at 2:27 PM, Sean Young  wrote:
>> >> > On Sun, Oct 30, 2016 at 10:33:02AM -0500, Nathan wrote:
>> >> >> I think this should be PNP0501 instead of PNP0c02.
>> >> >> Once I alter that then when I boot the serial comes up on irq 3. However it
>> >> >> still hangs.
>> >> >> I'll keep digging.
>> >> >
>> >> > Well that's that theory out of the window. I'm not sure where to look now,
>> >> > I would start by enabling as many as possible of the "kernel hacking" config
>> >> > options and see if anything gets caught.
>> >> >
>> >> > Looking at your earlier messages, you have a collection of percpu allocation
>> >> > failures. That might be worth resolving before anything else.
>> >>
>> >> Hi Nathan,
>> >>
>> >> Couple of questions:
>> >> 1. Was login over serial console setup and working on SLES 11?  or was
>> >> the 'console=ttyS1' only for debug output?
>> >> I ask because console output doesn't use IRQs; iow, maybe the serial
>> >> port w/ driver never actually worked.
>> >> 2. Can you post dmesg for the SLES 11 setup?  That would show if there
>> >> were probe errors even on that.
>> >>
>> >> An alternative that should be equivalent to your previous setup is to
>> >> build w/ CONFIG_SERIAL_8250_PNP=n
>> >> Seems like your ACPI BIOS is buggy, but also that something else is using IRQ 3?
>> >>
>> >> Regards,
>> >> Peter Hurley
>> >
>> >
>> >
>> > 1) Yes I can confirm I used it to login sometimes.
>> >
>> > I built with CONFIG_SERIAL_8250_PNP=n and that seemed to work better, in that the system did not hang.
>> > However I couldn't login on the serial and got these error messages, I suspect I broke something while trying different permutations.
>> >
>> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.136636 seconds
>> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.180955 seconds
>> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.161415 seconds
>> > gdm[5206]: WARNING: GdmLocalDisplayFactory: maximum number of X display failures reached: check X server log for errors
>> >
>> > It did boot all the way though.
>> >
>> > 2) attached log
>>
>> So I'm confused where this leaves us.
>>
>> In your OP, you claim to have gotten it working with a partial revert
>> of commit 835d844d1a28 (but you didn't attach the partial revert so no
>> one knows what you did); however, my suggestion should have been
>> equivalent.
>
> I apologize, if I was unclear.  Your suggestion of CONFIG_SERIAL_8250_PNP=n did successfully boot and provide messages
> across the console, and yes is basically equivalent to the revert.

Ok, so the partial revert didn't get the login working then?

> Those warnings I just noticed in the dmesg and they weren't there before.
>
>>
>> Note that you have the serial port disabled in BIOS; that's why you're
>> getting the probe error for PNP.
>
> Now when you say it's disabled in BIOS, how can I be sure and double-check that?

Well, the ACPI BIOS is reporting it as disabled. Even the SLES11 log says:

[2.136899] pnp 00:04: Plug and Play ACPI device, IDs PNP0501 (disabled)


> These bios screens do not have any mention of PNP settings.
> I am getting output over the console (via ipmi) until the boot hangs.

Yeah, the device probably decodes I/O address accesses anyway,
but in the disabled state it probably has not routed the IRQ.
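
For anyone following along, here is a minimal sketch of why console output can still show up when the IRQ is not routed: the transmit path can simply poll the line status register. This is not the 8250 driver; `struct fake_uart` and the helper names are made up for illustration, and the register is simulated in memory. Only the THRE bit value (0x20) is taken from the 16550 register layout.

```c
#include <stddef.h>
#include <stdint.h>

#define LSR_THRE 0x20u  /* 16550 LSR bit 5: transmit holding register empty */

/* Hypothetical stand-in for the UART registers; a real port is reached
 * through I/O port or MMIO accessors instead. */
struct fake_uart {
	uint8_t lsr;        /* line status register */
	char    sent[64];   /* bytes "transmitted" so far */
	size_t  nsent;
};

/* Simulated register write: the byte goes out and the transmitter is
 * immediately ready again, so THRE is set afterwards. */
static void fake_uart_write_thr(struct fake_uart *u, char c)
{
	if (u->nsent < sizeof(u->sent))
		u->sent[u->nsent++] = c;
	u->lsr |= LSR_THRE;
}

/* Polled transmit: spin on THRE, then write. No interrupt is involved,
 * so this works even if the port's IRQ was never routed. Returns the
 * number of bytes written, or -1 if the transmitter never became ready. */
static int polled_console_write(struct fake_uart *u, const char *s, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int spins = 10000;

		while (!(u->lsr & LSR_THRE)) {
			if (--spins == 0)
				return -1;
		}
		fake_uart_write_thr(u, s[i]);
	}
	return (int)n;
}
```

A login session also needs the receive side, which normally depends on the interrupt, so output working while login fails would be consistent with an unrouted IRQ.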

I have no idea how to help you with the bios, sorry.

Regards,
Peter Hurley


Re: v4.8-rc1: thinkpad x60: running at low frequency even during kernel build

2016-11-04 Thread Pavel Machek
Hi!

> I'd prefer mails over bugzilla for now...
> 
> 4.9-rc2 has bios_limit:
> 
> pavel@duo:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
> 1833000
> 
> and it has thermal zones:
> 
> /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp 127000
> /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_type critical
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_0_temp 97000
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_0_type critical
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_temp 92500
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_type passive
> 
> ..so it should slow down CPU at 92C.
> 
> So let's push the temperature up a bit...
> 
> sudo watch cat /proc/acpi/ibm/thermal
> /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
> /sys/devices/virtual/thermal/thermal_zone1/temp  
> /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
> 
> temperatures:   98 49 -128 85 28 -128 28 -128 49 58 -128 -128 -128
> -128 -128 -128
> 1833000
> 95000
> 1833000
> 
> Hmm. bios_limit does not seem to change, even when the temperature is
> clearly above the trip point. (It is also interesting that acpi/ibm
> reports bigger temperatures than
> /sys/devices/virtual/thermal/thermal_zone1/temp . I have seen 103C
> there.)

Under v4.8-rc, behaviour is different: bios_limit goes to 1GHz there
when temperature is around 84C at the thermal zone. That keeps
ibm/thermal temperatures under 90C, and no "thermal emergency"
messages in syslog.

So we seem to have a thermal or ACPI regression in v4.9-rc3.
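
To make the comparison above concrete, a small sketch of the check being done by eye. The sysfs thermal files report millidegrees Celsius, so 92500 is 92.5C. The function names are invented for this note, and treating 1833000 kHz as the unthrottled maximum is an assumption based on the bios_limit value quoted above.

```c
#include <stdbool.h>

/* sysfs thermal files report temperatures in millidegrees Celsius. */
static long millideg_to_deg(long mdeg)
{
	return mdeg / 1000;
}

/* True once the zone temperature has reached the passive trip point. */
static bool above_passive_trip(long zone_mdeg, long trip_mdeg)
{
	return zone_mdeg >= trip_mdeg;
}

/* The symptom described above: the zone is past its passive trip point,
 * yet bios_limit still equals the (assumed) maximum frequency. */
static bool throttle_missing(long zone_mdeg, long trip_mdeg,
			     long bios_limit_khz, long max_khz)
{
	return above_passive_trip(zone_mdeg, trip_mdeg) &&
	       bios_limit_khz >= max_khz;
}
```

With the quoted readings (zone at 95000, passive trip at 92500, bios_limit at 1833000) `throttle_missing()` is true; with bios_limit dropped to 1 GHz, as reported for v4.8-rc, it is false.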

Best regards,
 
Pavel
 
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH] pinctrl: meson: Add GXL pinctrl definitions

2016-11-04 Thread Linus Walleij
On Mon, Oct 31, 2016 at 5:32 PM, Neil Armstrong  wrote:

> Add support for the Amlogic Meson GXL SoC, this is a partially complete
> definition only based on the Amlogic Vendor tree.
>
> This definition differs a lot from the GXBB and needs a separate entry.
>
> Acked-by: Rob Herring 
> Signed-off-by: Neil Armstrong 

Patch applied with Kevin's ACK.

Yours,
Linus Walleij


Re: [PATCH] gpio: htc-egpio: Make it explicitly non-modular

2016-11-04 Thread Linus Walleij
On Mon, Oct 31, 2016 at 7:27 PM, Paul Gortmaker
 wrote:

> The Kconfig currently controlling compilation of this code is:
>
> drivers/gpio/Kconfig:config HTC_EGPIO
> drivers/gpio/Kconfig:   bool "HTC EGPIO support"
>
> ...meaning that it currently is not being built as a module by anyone.
>
> Let's remove the modular code that is essentially orphaned, so that
> when reading the driver there is no doubt it is builtin-only.
>
> We explicitly disallow a driver unbind, since that doesn't have a
> sensible use case anyway, and it allows us to drop the ".remove"
> code for non-modular drivers.
>
> Since module_init was not in use by this code, the init ordering
> remains unchanged with this commit.
>
> We also delete the MODULE_LICENSE tag etc. since all that information
> is already contained at the top of the file in the comments.
>
> Cc: Linus Walleij 
> Cc: Alexandre Courbot 
> Cc: Kevin O'Connor 
> Cc: linux-g...@vger.kernel.org
> Signed-off-by: Paul Gortmaker 
> ---
>
> [we had all these fixed for gpio, but then this file recently moved from
>  the mfd subsystem over here to gpio]

Sorry about the fuss!

Patch applied.

Yours,
Linus Walleij


Re: [PATCH v6] powerpc: Do not make the entire heap executable

2016-11-04 Thread Kees Cook
Hi,

Jason just reminded me about this patch. :)

Denys, can you resend a v7 with all the Acked/Reviewed/Tested-bys
added and send it To: akpm, with everyone else (and lkml) in CC? That
should be the easiest way for Andrew to pick it up.

Thanks!

-Kees


On Mon, Oct 24, 2016 at 5:17 PM, Kees Cook  wrote:
> On Thu, Oct 20, 2016 at 3:45 PM, Jason Gunthorpe
>  wrote:
>> On Tue, Oct 04, 2016 at 09:54:12AM -0700, Kees Cook wrote:
>>> On Mon, Oct 3, 2016 at 5:18 PM, Michael Ellerman  
>>> wrote:
>>> > Kees Cook  writes:
>>> >
>>> >> On Mon, Oct 3, 2016 at 9:13 AM, Denys Vlasenko  
>>> >> wrote:
>>> >>> On 32-bit powerpc the ELF PLT sections of binaries (built with 
>>> >>> --bss-plt,
>>> >>> or with a toolchain which defaults to it) look like this:
>>> > ...
>>> >>>
>>> >>> Signed-off-by: Jason Gunthorpe 
>>> >>> Signed-off-by: Denys Vlasenko 
>>> >>> Acked-by: Kees Cook 
>>> >>> Acked-by: Michael Ellerman 
>>> >>> CC: Benjamin Herrenschmidt 
>>> >>> CC: Paul Mackerras 
>>> >>> CC: "Aneesh Kumar K.V" 
>>> >>> CC: Kees Cook 
>>> >>> CC: Oleg Nesterov 
>>> >>> CC: Michael Ellerman 
>>> >>> CC: Florian Weimer 
>>> >>> CC: linux...@kvack.org
>>> >>> CC: linuxppc-...@lists.ozlabs.org
>>> >>> CC: linux-kernel@vger.kernel.org
>>> >>> Changes since v5:
>>> >>> * made do_brk_flags() error out if any bits other than VM_EXEC are set.
>>> >>>   (Kees Cook: "With this, I'd be happy to Ack.")
>>> >>>   See https://patchwork.ozlabs.org/patch/661595/
>>> >>
>>> >> Excellent, thanks for the v6! Should this go via the ppc tree or the -mm 
>>> >> tree?
>>> >
>>> > -mm would be best, given the diffstat I think it's less likely to
>>> >  conflict if it goes via -mm.
>>>
>>> Okay, excellent. Andrew, do you have this already in email? I think
>>> you weren't on the explicit CC from the v6...
>>
>> FWIW (and ping),
>>
>> Tested-by: Jason Gunthorpe 
>>
>> On ARM32 (kirkwood) and PPC32 (405)
>>
>> For reference, here is the patchwork URL:
>>
>> https://patchwork.ozlabs.org/patch/677753/
>
> Hi Andrew,
>
> Can you pick this up?
>
> Thanks!
>
> -Kees
>
> --
> Kees Cook
> Nexus Security



-- 
Kees Cook
Nexus Security


Re: [git pull] drm fixes for 4.9-rc4

2016-11-04 Thread Theodore Ts'o
On Fri, Nov 04, 2016 at 01:38:25PM -0700, Linus Torvalds wrote:
> On Wed, Nov 2, 2016 at 5:31 PM, Dave Airlie  wrote:
> >
> > There are a set of fixes for an oops we've been seeing around MST
> > display unplug,
> 
> Side note: I heard from a couple of people at the KS that said that
> they had had problems with suspend/resume after plugging in to a 4k
> display (but _only_ a 4k display - apparently normal FHD displays
> didn't show this). I think at least one was USB3/Thunderbolt. Ted with
> a Lenovo laptop (intel GPU) was one, I forget who else mentioned this.

Actually, it's after unplugging from a Dell 30" monitor with a 3k
display (2560 x 1920).  This is after I've carefully deactivated the
video output to the Dell 30" monitor, unplugged the Dell 30" monitor
(at which point the system becomes non-responsive for 2-3 seconds for
reasons unknown), and only suspending after the system has recovered
from the unplug.

At that point, it's a 20-30% chance that the system will never come
back after a suspend.  So I have to make a point of saving all of my
editor buffers, etc., since I never can know whether my laptop will
come back.

This was happening for years and years on the T540p laptop, as well as
my new T460 laptop.  I've complained about this in the past, and
gotten no response, and I've just gotten used to the fact that if I'm
transitioning from home (where I have the 30" display) to work,
there's a good chance the resume will lock up, and I will be forced to
push the power button for 8 seconds to forcibly power down the laptop
to recover from the suspend.  :-(

I agree with Linus's suspicion that I probably need to bite the bullet
and just buy a new SST monitor, and that will probably make the
problem go away.  But if the bug can be fixed, that would be really
great.

Thanks,

- Ted


Re: [PATCH v8 7/7] KVM: x86: virtualize cpuid faulting

2016-11-04 Thread Paolo Bonzini


On 04/11/2016 21:34, David Matlack wrote:
> On Mon, Oct 31, 2016 at 6:37 PM, Kyle Huey  wrote:
>> +   case MSR_PLATFORM_INFO:
>> +   /* cpuid faulting is supported */
>> +   msr_info->data = PLATINFO_CPUID_FAULT;
>> +   break;
> 
> This could break save/restore, if for example, a VM is migrated to a
> version of KVM without MSR_PLATFORM_INFO support. I think the way to
> handle this is to make MSR_PLATFORM_INFO writeable (but only from
> userspace) so that hypervisors can defend themselves (by setting this
> MSR to 0).

Right---and with my QEMU hat on, this feature will have to be enabled
manually on the command line because of the way QEMU supports running
with old kernels. :(  This however does not impact the KVM patch.

We may decide that, because CPUID faulting doesn't have a CPUID bit and
is relatively a "fringe" feature, we are okay if the kernel enables this
unconditionally and then userspace can arrange to block migration (in
QEMU this would use a subsection).  David, Eduardo, opinions?
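
A rough illustration of David's suggestion, as a stand-alone toy rather than KVM code: the write handler accepts the value only when the write comes from the host side, so userspace can clear the capability before migrating to an older kernel while the guest cannot change it. The `toy_` names are invented; the `host_initiated` flag only echoes the idea of distinguishing userspace writes from guest writes, and the bit position is an assumption for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for the "CPUID faulting supported" capability. */
#define TOY_PLATINFO_CPUID_FAULT (1ull << 31)

struct toy_vcpu {
	uint64_t platform_info;
};

/* Returns 0 on success, 1 to reject the write (the same "return 1"
 * convention as the quoted rdmsr code uses for unhandled MSRs). */
static int toy_set_platform_info(struct toy_vcpu *v, uint64_t data,
				 bool host_initiated)
{
	/* Guest writes are refused: the MSR stays read-only for the guest. */
	if (!host_initiated)
		return 1;

	/* Userspace may only clear or keep the one capability bit. */
	if (data & ~TOY_PLATINFO_CPUID_FAULT)
		return 1;

	v->platform_info = data;
	return 0;
}
```

Whether the real patch should also validate the written bits this way is a design choice for the series, not something settled in this thread.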

> 
>> +   case MSR_MISC_FEATURES_ENABLES:
>> +   msr_info->data = 0;
>> +   if (vcpu->arch.cpuid_fault)
>> +   msr_info->data |= CPUID_FAULT_ENABLE;
>> +   break;
> 
> MSR_MISC_FEATURES_ENABLES should be added to emulated_msrs[] so that
> the hypervisor will maintain the value of CPUID_FAULT_ENABLE across a
> save/restore.

This is definitely necessary.  Thanks David.
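
For the archive, a toy model of why the list matters. It assumes, as a simplification, that save/restore copies exactly the MSRs named in a list and nothing else; the struct, the index values and the function names are all invented here and are not the KVM ones.

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary indices for the sketch, not real MSR numbers. */
#define TOY_MSR_OTHER            0x10u
#define TOY_MSR_FEATURES_ENABLES 0x20u

struct toy_vcpu {
	uint64_t other;
	uint64_t features_enables;  /* holds the "CPUID fault enabled" bit */
};

/* Map an MSR index to its backing field, or NULL if unknown. */
static uint64_t *toy_msr_slot(struct toy_vcpu *v, uint32_t idx)
{
	switch (idx) {
	case TOY_MSR_OTHER:
		return &v->other;
	case TOY_MSR_FEATURES_ENABLES:
		return &v->features_enables;
	default:
		return NULL;
	}
}

/* Copy only the MSRs named in list[] from src to dst, the way a
 * save/restore loop driven by an MSR list would. */
static void toy_migrate(const uint32_t *list, size_t n,
			struct toy_vcpu *src, struct toy_vcpu *dst)
{
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t *from = toy_msr_slot(src, list[i]);
		uint64_t *to = toy_msr_slot(dst, list[i]);

		if (from && to)
			*to = *from;
	}
}
```

If `TOY_MSR_FEATURES_ENABLES` is left out of the list, the destination comes up with the enable bit cleared even though the source had it set.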

Paolo

>> default:
>> if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
>> return kvm_pmu_get_msr(vcpu, msr_info->index, &msr_info->data);
>> if (!ignore_msrs) {
>> vcpu_unimpl(vcpu, "unhandled rdmsr: 0x%x\n", 
>> msr_info->index);
>> return 1;
>> } else {
>> vcpu_unimpl(vcpu, "ignored rdmsr: 0x%x\n", 
>> msr_info->index);
>> @@ -7493,16 +7507,18 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool 
>> init_event)
>> kvm_update_dr0123(vcpu);
>> vcpu->arch.dr6 = DR6_INIT;
>> kvm_update_dr6(vcpu);
>> vcpu->arch.dr7 = DR7_FIXED_1;
>> kvm_update_dr7(vcpu);
>>
>> vcpu->arch.cr2 = 0;
>>
>> +   vcpu->arch.cpuid_fault = false;
>> +
>> kvm_make_request(KVM_REQ_EVENT, vcpu);
>> vcpu->arch.apf.msr_val = 0;
>> vcpu->arch.st.msr_val = 0;
>>
>> kvmclock_reset(vcpu);
>>
>> kvm_clear_async_pf_completion_queue(vcpu);
>> kvm_async_pf_hash_reset(vcpu);
>> --
>> 2.10.2
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


Re: [PATCH 4/4] ARM: dts: Add #pinctrl-cells for pinctrl-single instances

2016-11-04 Thread Linus Walleij
On Thu, Nov 3, 2016 at 5:35 PM, Tony Lindgren  wrote:

> Drivers using pinctrl-single,pins have #pinctrl-cells = <1>, while
> pinctrl-single,bits need #pinctrl-cells = <2>.
>
> Note that this patch can be optionally applied separately from the
> driver changes as the driver supports also the legacy binding without
> #pinctrl-cells.
>
> Acked-by: Rob Herring 
> Signed-off-by: Tony Lindgren 
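
As a reading aid for the cell counts quoted above, a minimal sketch, assuming each entry in the property is one index cell (the register offset) followed by #pinctrl-cells argument cells: one value for pinctrl-single,pins, a value plus a mask for pinctrl-single,bits. `toy_count_entries()` is a made-up helper, not the generic parser from this series.

```c
#include <stddef.h>

/* Number of entries in a flat cell array where each entry is one index
 * cell followed by `pinctrl_cells` argument cells. Returns -1 if the
 * array length is not a whole number of entries. */
static int toy_count_entries(size_t ncells_total, unsigned int pinctrl_cells)
{
	size_t per_entry = 1 + (size_t)pinctrl_cells;

	if (ncells_total % per_entry)
		return -1;
	return (int)(ncells_total / per_entry);
}
```

So a four-cell pinctrl-single,pins property holds two entries, and a six-cell pinctrl-single,bits property also holds two.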

Reviewed-by: Linus Walleij 

Please take this through the OMAP tree to avoid hassle.

Yours,
Linus Walleij


Re: [PATCH 3/4] pinctrl: single: Use generic parser and #pinctrl-cells for pinctrl-single,bits

2016-11-04 Thread Linus Walleij
On Thu, Nov 3, 2016 at 5:35 PM, Tony Lindgren  wrote:

> We can now use generic parser and keep things compatible with the
> old binding.
>
> Signed-off-by: Tony Lindgren 

V2 Patch applied.

Yours,
Linus Walleij


Re: [PATCH 2/4] pinctrl: single: Use generic parser and #pinctrl-cells for pinctrl-single,pins

2016-11-04 Thread Linus Walleij
On Thu, Nov 3, 2016 at 5:35 PM, Tony Lindgren  wrote:

> We can now use generic parser. To support the legacy binding without
> #pinctrl-cells, add pcs_quirk_missing_pinctrl_cells() and warn about
> missing #pinctrl-cells.
>
> Let's also update the documentation for struct pcs_soc_data while at it
> as that seems to be out of date.
>
> Signed-off-by: Tony Lindgren 

This v2 patch applied.

Yours,
Linus Walleij


Re: [PATCH 1/4] pinctrl: Introduce generic #pinctrl-cells and pinctrl_parse_index_with_args

2016-11-04 Thread Linus Walleij
On Thu, Nov 3, 2016 at 5:35 PM, Tony Lindgren  wrote:

> Introduce #pinctrl-cells helper binding and generic helper functions
> pinctrl_count_index_with_args() and pinctrl_parse_index_with_args().
>
> Acked-by: Rob Herring 
> Signed-off-by: Tony Lindgren 

Ooops applied this v2 version instead of the v1.

* kbuild test robot  [161103 13:29]:
>In file included from drivers/pinctrl/core.c:36:0:
> >> drivers/pinctrl/devicetree.h:29:14: warning: 'struct of_phandle_args' 
> >> declared inside parameter list will not be visible outside of this
definition or declaratio

> Hmm maybe we should just include of.h in core.c?

Nah. I just did this:

diff --git a/drivers/pinctrl/devicetree.h b/drivers/pinctrl/devicetree.h
index 7f0a5c4e15ad..c2d1a5505850 100644
--- a/drivers/pinctrl/devicetree.h
+++ b/drivers/pinctrl/devicetree.h
@@ -16,6 +16,8 @@
  * along with this program.  If not, see .
  */

+struct of_phandle_args;
+
 #ifdef CONFIG_OF

Let's see if it works!
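
For reference, a stand-alone sketch of why a forward declaration is enough here. Without it, a struct first named inside a prototype's parameter list is scoped to that prototype only, which is what the warning complains about. The `toy_` names are invented and the struct layout is made up; the real definition lives elsewhere in the kernel headers.

```c
/* Forward declaration: the type now exists at file scope, so every
 * prototype that mentions it refers to the same type. */
struct toy_phandle_args;

/* A pointer to an incomplete type is fine in a prototype. */
int toy_parse(const struct toy_phandle_args *args);

/* The full definition can come later, or from another header. */
struct toy_phandle_args {
	int args_count;
	unsigned int args[4];
};

int toy_parse(const struct toy_phandle_args *args)
{
	return args->args_count;
}
```

This avoids pulling a full header into every user of devicetree.h just to name a pointer type.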

Yours,
Linus Walleij


RE: [PATCH 8/9] bus: fsl-mc: dpio: add the DPAA2 DPIO object driver

2016-11-04 Thread Ruxandra Ioana Radulescu
> -Original Message-
> From: Stuart Yoder
> Sent: Thursday, November 03, 2016 4:38 PM
> To: Ruxandra Ioana Radulescu 
> Subject: FW: [PATCH 8/9] bus: fsl-mc: dpio: add the DPAA2 DPIO object
> driver
> 
> 
> 
> -Original Message-
> From: Stuart Yoder [mailto:stuart.yo...@nxp.com]
> Sent: Friday, October 21, 2016 9:02 AM
> To: gre...@linuxfoundation.org
> Cc: German Rivera ; de...@driverdev.osuosl.org;
> linux-kernel@vger.kernel.org; ag...@suse.de; a...@arndb.de; Leo Li
> ; Roy Pledge ; Haiying Wang
> ; Stuart Yoder 
> Subject: [PATCH 8/9] bus: fsl-mc: dpio: add the DPAA2 DPIO object driver
> 
> From: Roy Pledge 
> 
> The DPIO driver registers with the fsl-mc bus to handle bus-related
> events for DPIO objects.  Key responsibility is mapping I/O
> regions, setting up interrupt handlers, and calling the DPIO
> service initialization during probe.
> 
> Signed-off-by: Roy Pledge 
> Signed-off-by: Haiying Wang 
> Signed-off-by: Stuart Yoder 
> ---
>  drivers/bus/fsl-mc/dpio/Makefile  |   2 +-
>  drivers/bus/fsl-mc/dpio/dpio-driver.c | 289
> ++
>  2 files changed, 290 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/bus/fsl-mc/dpio/dpio-driver.c
> 
> diff --git a/drivers/bus/fsl-mc/dpio/Makefile b/drivers/bus/fsl-
> mc/dpio/Makefile
> index 0778da7..837d330 100644
> --- a/drivers/bus/fsl-mc/dpio/Makefile
> +++ b/drivers/bus/fsl-mc/dpio/Makefile
> @@ -6,4 +6,4 @@ subdir-ccflags-y := -Werror
> 
>  obj-$(CONFIG_FSL_MC_DPIO) += fsl-mc-dpio.o
> 
> -fsl-mc-dpio-objs := dpio.o qbman-portal.o dpio-service.o
> +fsl-mc-dpio-objs := dpio.o qbman-portal.o dpio-service.o dpio-driver.o
> diff --git a/drivers/bus/fsl-mc/dpio/dpio-driver.c b/drivers/bus/fsl-
> mc/dpio/dpio-driver.c
> new file mode 100644
> index 000..ad04a2c
> --- /dev/null
> +++ b/drivers/bus/fsl-mc/dpio/dpio-driver.c
> @@ -0,0 +1,289 @@
> +/*
> + * Copyright 2014-2016 Freescale Semiconductor Inc.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions are
> met:
> + * * Redistributions of source code must retain the above copyright
> + *notice, this list of conditions and the following disclaimer.
> + * * Redistributions in binary form must reproduce the above copyright
> + *notice, this list of conditions and the following disclaimer in the
> + *documentation and/or other materials provided with the
> distribution.
> + * * Neither the name of Freescale Semiconductor nor the
> + *names of its contributors may be used to endorse or promote
> products
> + *derived from this software without specific prior written permission.
> + *
> + * ALTERNATIVELY, this software may be distributed under the terms of the
> + * GNU General Public License ("GPL") as published by the Free Software
> + * Foundation, either version 2 of that License or (at your option) any
> + * later version.
> + *
> + * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND
> ANY
> + * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
> THE IMPLIED
> + * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
> PURPOSE ARE
> + * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR
> ANY
> + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
> CONSEQUENTIAL DAMAGES
> + * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
> GOODS OR SERVICES;
> + * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
> HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> OR TORT
> + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
> THE USE OF THIS
> + * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +
> +#include "qbman-portal.h"
> +#include "dpio.h"
> +#include "dpio-cmd.h"
> +
> +MODULE_LICENSE("Dual BSD/GPL");
> +MODULE_AUTHOR("Freescale Semiconductor, Inc");
> +MODULE_DESCRIPTION("DPIO Driver");
> +
> +struct dpio_priv {
> + struct dpaa2_io *io;
> +};
> +
> +static irqreturn_t dpio_irq_handler(int irq_num, void *arg)
> +{
> + struct device *dev = (struct device *)arg;
> + struct dpio_priv *priv = dev_get_drvdata(dev);
> +
> + return dpaa2_io_irq(priv->io);
> +}
> +
> +static void unregister_dpio_irq_handlers(struct fsl_mc_device *dpio_dev)
> +{
> + struct fsl_mc_device_irq *irq;
> +
> + irq = dpio_dev->irqs[0];
> +
> + /* clear the affinity hint */
> + irq_set_affinity_hint(irq->msi_desc->irq, NULL);
> +}
> +
> +static int register_dpio_irq_handlers(struct fsl_mc_device *dpio_dev, int
> cpu)
> +{
> + struct dpio_priv *priv;
> + int error;
> + struct fsl_mc_device_irq *irq;
> + cpumask_t mask;
> +
> + priv = dev_get_drvdata(&dpio_dev->dev);
> +
> + irq = dpio_dev->irqs[0];
> + error = 

Re: [PATCH v2 2/2] power: bq27xxx_battery: add poll interval property query

2016-11-04 Thread Pavel Machek
On Fri 2016-11-04 13:39:19, Matt Ranostay wrote:
> On Fri, Nov 4, 2016 at 1:29 PM, Pavel Machek  wrote:
> > On Fri 2016-11-04 07:58:55, Tony Lindgren wrote:
> >> * Pavel Machek  [161104 00:10]:
> >> > On Thu 2016-11-03 22:00:56, Matt Ranostay wrote:
> >> > Do you actually have hardware with more than one bq27xxx?
> >>
> >> I can at least see the twl4030 battery/charger features
> >> being used together with some bq device to monitor the
> >> battery state. Not sure if that counts as multiple
> >> instances here though :)
> >
> > I have that, too, but that was not what I was asking.
> >
> > Matt wanted support for different polling intervals on different
> > bq27xxx chips. I'd like to know if he actually has more than one
> > bq27xxx in his device...
> >
> 
> Actually only one bq27xxx chip but in theory we could have more.

Hmm. We'd have to keep both the old and new interfaces to change the
polling interval. Let's not do that unless we really need to, ok?

Thanks,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


signature.asc
Description: Digital signature


Re: [PATCH 2/4] pinctrl: single: Use generic parser and #pinctrl-cells for pinctrl-single,pins

2016-11-04 Thread Linus Walleij
On Tue, Oct 25, 2016 at 6:45 PM, Tony Lindgren  wrote:

> We can now use generic parser. To support the legacy binding without
> #pinctrl-cells, add pcs_quirk_missing_pinctrl_cells() and warn about
> missing #pinctrl-cells.
>
> Let's also update the documentation for struct pcs_soc_data while at it
> as that seems to be out of date.
>
> Signed-off-by: Tony Lindgren 

Patch applied.

> Looking at some make randconfig results, looks like we don't have
> of_add_property() and of_remove_property() exported. Is there some
> reason not to export them?

If there is too much fuss from the builders we need to think of something.

Yours,
Linus Walleij


Re: [PATCH 1/4] pinctrl: Introduce generic #pinctrl-cells and pinctrl_parse_index_with_args

2016-11-04 Thread Linus Walleij
On Fri, Oct 28, 2016 at 6:53 PM, Tony Lindgren  wrote:

> From tony Mon Sep 17 00:00:00 2001
> From: Tony Lindgren 
> Date: Tue, 25 Oct 2016 08:33:34 -0700
> Subject: [PATCH] pinctrl: Introduce generic #pinctrl-cells and
>  pinctrl_parse_index_with_args
>
> Introduce #pinctrl-cells helper binding and generic helper functions
> pinctrl_count_index_with_args() and pinctrl_parse_index_with_args().
>
> Signed-off-by: Tony Lindgren 

This updated version applied with Rob's ACK.

Yours,
Linus Walleij
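
For readers following along, the idea behind #pinctrl-cells can be sketched
in plain C. This is a simplified userspace model, not the kernel helpers: it
assumes each entry in the flat cell list is one index cell followed by
#pinctrl-cells argument cells, and the names struct pin_entry,
count_entries() and parse_entry() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One parsed entry: a pin index plus a view of its argument cells. */
struct pin_entry {
	uint32_t index;
	const uint32_t *args;
	size_t nargs;
};

/*
 * Number of entries in a flat cell list, or -1 if the list length is not
 * a whole multiple of (1 + pinctrl_cells).
 */
static int count_entries(size_t ncells_total, unsigned int pinctrl_cells)
{
	size_t stride = (size_t)pinctrl_cells + 1;

	if (ncells_total % stride != 0)
		return -1;

	return (int)(ncells_total / stride);
}

/* Fill *out with entry number n; returns 0 on success, -1 on bad input. */
static int parse_entry(const uint32_t *cells, size_t ncells_total,
		       unsigned int pinctrl_cells, size_t n,
		       struct pin_entry *out)
{
	size_t stride = (size_t)pinctrl_cells + 1;
	int count = count_entries(ncells_total, pinctrl_cells);

	if (cells == NULL || out == NULL || count < 0 || n >= (size_t)count)
		return -1;

	out->index = cells[n * stride];      /* first cell of the entry */
	out->args = &cells[n * stride + 1];  /* remaining cells are arguments */
	out->nargs = pinctrl_cells;
	return 0;
}
```

With one argument cell, a four-cell list holds two (index, value) pairs; a
three-cell list is rejected because it cannot be split into whole entries.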


[PATCH v3 0/2] regulator: handling of error conditions for usb drivers

2016-11-04 Thread Axel Haslam
Some usb drivers rely on external power switches/regulators
to power the port vbus. Some of these drivers use a plain
gpio for the enable pin and another for the over current
indicator pin.

To make these drivers more generic, we can use a regulator
to handle vbus and send an over current event, but we are
missing a way to transmit the over current pin status, which
the usb layer may poll at any time.

We would like to move these drivers to use a regulator. This
would make the usb driver generic, allowing it to use any type
of regulator. It would also help remove code, make DT
migration simpler and avoid new DT bindings for each driver.

These patches do 2 things:
* Add a new API that consumers can use to poll the regulator
  error status.
* Extend the fixed regulator driver to handle an optional
  over current gpio pin.

Changes v2 -> v3
* dropped the merged patch that adds the new API
* rebased on top of regulator-next

Changes v1->v2
* add new API to get error status instead of extending events (Mark)
* use gpiod for fixed regulator: This spares us extra platform
  data and bindings

Axel Haslam (2):
  regulator: fixed: dt: Allow an optional over current pin
  regulator: fixed: Handle optional overcurrent pin

 .../bindings/regulator/fixed-regulator.txt |  2 +
 drivers/regulator/fixed.c  | 59 ++
 2 files changed, 61 insertions(+)

-- 
2.10.1
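
The polling side described in this cover letter can be sketched in plain C.
This is a userspace model under stated assumptions, not kernel code:
struct mock_regulator, mock_get_error_flags(), vbus_over_current() and the
MOCK_ERROR_OVER_CURRENT bit are hypothetical stand-ins for the regulator
handle, the error status query and the over current flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the over current error bit. */
#define MOCK_ERROR_OVER_CURRENT (1u << 2)

/* Hypothetical regulator: only models the optional over current input. */
struct mock_regulator {
	bool has_oc_pin;   /* is the optional pin wired up? */
	bool oc_asserted;  /* logical level of that pin */
};

/*
 * Same shape as an error status query: returns 0 on success and reports
 * the current error bits through *flags.
 */
static int mock_get_error_flags(const struct mock_regulator *reg,
				unsigned int *flags)
{
	if (reg == NULL || flags == NULL)
		return -1;

	*flags = 0;
	if (reg->has_oc_pin && reg->oc_asserted)
		*flags |= MOCK_ERROR_OVER_CURRENT;

	return 0;
}

/* What a usb driver might do when the usb layer polls the port status. */
static bool vbus_over_current(const struct mock_regulator *vbus)
{
	unsigned int flags;

	if (mock_get_error_flags(vbus, &flags) != 0)
		return false; /* a failed query is treated as "no known error" */

	return (flags & MOCK_ERROR_OVER_CURRENT) != 0;
}
```

The point of polling rather than relying only on events is that the usb
layer can ask for the pin state at any time, including after an event was
missed.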



[PATCH v3 2/2] regulator: fixed: Handle optional overcurrent pin

2016-11-04 Thread Axel Haslam
Fixed regulators (e.g. TPS2087D) may have an over current pin that
is asserted on over current. Consumers may be interested to know
about over current events so they can take appropriate action.

Allow the fixed regulator to take an optional gpio pin for over
current and send the respective event to the consumer.

Signed-off-by: Axel Haslam 
---
 drivers/regulator/fixed.c | 59 +++
 1 file changed, 59 insertions(+)

diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c
index a43b0e8..06ed2f6 100644
--- a/drivers/regulator/fixed.c
+++ b/drivers/regulator/fixed.c
@@ -33,10 +33,12 @@
 #include 
 #include 
 #include 
+#include 
 
 struct fixed_voltage_data {
struct regulator_desc desc;
struct regulator_dev *dev;
+   struct gpio_desc *oc_gpio;
 };
 
 
@@ -135,7 +137,36 @@ acpi_get_fixed_voltage_config(struct device *dev,
return config;
 }
 
+static irqreturn_t reg_fixed_overcurrent_irq(int irq, void *data)
+{
+   struct fixed_voltage_data *drvdata = data;
+
+   regulator_notifier_call_chain(drvdata->dev,
+   REGULATOR_EVENT_OVER_CURRENT, NULL);
+
+   return IRQ_HANDLED;
+}
+
+static int reg_fixed_get_error_flags(struct regulator_dev *dev,
+   unsigned int *flags)
+{
+   struct fixed_voltage_data *drvdata = rdev_get_drvdata(dev);
+   int oc_value;
+
+   *flags = 0;
+
+   if (!drvdata->oc_gpio)
+   return 0;
+
+   oc_value = gpiod_get_value_cansleep(drvdata->oc_gpio);
+   if (oc_value)
+   *flags = REGULATOR_ERROR_OVER_CURRENT;
+
+   return 0;
+}
+
 static struct regulator_ops fixed_voltage_ops = {
+   .get_error_flags = reg_fixed_get_error_flags,
 };
 
 static int reg_fixed_voltage_probe(struct platform_device *pdev)
@@ -143,6 +174,7 @@ static int reg_fixed_voltage_probe(struct platform_device *pdev)
struct fixed_voltage_config *config;
struct fixed_voltage_data *drvdata;
struct regulator_config cfg = { };
+   unsigned long irqflags = IRQF_ONESHOT;
int ret;
 
   drvdata = devm_kzalloc(&pdev->dev, sizeof(struct fixed_voltage_data),
@@ -221,6 +253,33 @@ static int reg_fixed_voltage_probe(struct platform_device *pdev)
cfg.driver_data = drvdata;
cfg.of_node = pdev->dev.of_node;
 
+
+   drvdata->oc_gpio = devm_gpiod_get_optional(&pdev->dev, "over-current",
+   GPIOF_DIR_IN);
+   if (IS_ERR(drvdata->oc_gpio)) {
+   ret = PTR_ERR(drvdata->oc_gpio);
+   dev_err(&pdev->dev,
+   "Failed to get over current gpio: %d\n", ret);
+   return ret;
+   }
+
+   if (drvdata->oc_gpio) {
+   if (gpiod_is_active_low(drvdata->oc_gpio))
+   irqflags |= IRQF_TRIGGER_FALLING;
+   else
+   irqflags |= IRQF_TRIGGER_RISING;
+
+   ret = devm_request_threaded_irq(&pdev->dev,
+   gpiod_to_irq(drvdata->oc_gpio), NULL,
+   reg_fixed_overcurrent_irq, irqflags,
+   "over_current", drvdata);
+   if (ret) {
+   dev_err(&pdev->dev,
+   "Failed to request irq: %d\n", ret);
+   return ret;
+   }
+   }
+
   drvdata->dev = devm_regulator_register(&pdev->dev, &drvdata->desc,
   &cfg);
if (IS_ERR(drvdata->dev)) {
-- 
2.10.1



[PATCH v3 1/2] regulator: fixed: dt: Allow an optional over current pin

2016-11-04 Thread Axel Haslam
Add support for an optional over current input pin which
can be used to send an over current event to the regulator
consumer.

Cc: devicet...@vger.kernel.org
Signed-off-by: Axel Haslam 
---
 Documentation/devicetree/bindings/regulator/fixed-regulator.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/regulator/fixed-regulator.txt b/Documentation/devicetree/bindings/regulator/fixed-regulator.txt
index 4fae41d..b145abb 100644
--- a/Documentation/devicetree/bindings/regulator/fixed-regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/fixed-regulator.txt
@@ -11,6 +11,7 @@ If this property is missing, the default assumed is Active low.
 - gpio-open-drain: GPIO is open drain type.
   If this property is missing then default assumption is false.
 -vin-supply: Input supply name.
+- over-current-gpios: Input gpio that signal an over current condition.
 
 Any property defined as part of the core regulator
 binding, defined in regulator.txt, can also be used.
@@ -26,6 +27,7 @@ Example:
regulator-min-microvolt = <180>;
regulator-max-microvolt = <180>;
gpio = < 16 0>;
+   over-current-gpios = < 18 0>;
startup-delay-us = <7>;
enable-active-high;
regulator-boot-on;
-- 
2.10.1



Re: console issue since 3.6, console=ttyS1 hangs

2016-11-04 Thread Nathan Zimmer
On Thu, Nov 03, 2016 at 06:25:46PM -0600, Peter Hurley wrote:
> On Wed, Nov 2, 2016 at 9:29 AM, Nathan Zimmer  wrote:
> > On Mon, Oct 31, 2016 at 08:55:49PM -0600, Peter Hurley wrote:
> >> On Mon, Oct 31, 2016 at 2:27 PM, Sean Young  wrote:
> >> > On Sun, Oct 30, 2016 at 10:33:02AM -0500, Nathan wrote:
> >> >> I think this should be PNP0501 instead of PNP0c02.
> >> >> Once I alter that then when I boot the serial comes up on irq 3.
> >> >> However it still hangs.
> >> >> I'll keep digging.
> >> >
> >> > Well that's that theory out of the window. I'm not sure where to look
> >> > now, I would start by enabling as many as possible of the "kernel
> >> > hacking" config options and see if anything gets caught.
> >> >
> >> > Looking at your earlier messages, you have a collection of percpu
> >> > allocation failures. That might be worth resolving before anything else.
> >>
> >> Hi Nathan,
> >>
> >> Couple of questions:
> >> 1. Was login over serial console setup and working on SLES 11?  or was
> >> the 'console=ttyS1' only for debug output?
> >> I ask because console output doesn't use IRQs; iow, maybe the serial
> >> port w/ driver never actually worked.
> >> 2. Can you post dmesg for the SLES 11 setup?  That would show if there
> >> were probe errors even on that.
> >>
> >> An alternative that should be equivalent to your previous setup is to
> >> build w/ CONFIG_SERIAL_8250_PNP=n
> >> Seems like your ACPI BIOS is buggy, but also that something else is
> >> using IRQ 3?
> >>
> >> Regards,
> >> Peter Hurley
> >
> >
> >
> > 1) Yes I can confirm I used it to login sometimes.
> >
> > I built with CONFIG_SERIAL_8250_PNP=n and that seemed to work better,
> > in that the system did not hang.
> > However I couldn't login on the serial and got these error messages, I
> > suspect I broke something while trying different permutations.
> >
> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.136636 seconds
> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.180955 seconds
> > gdm[5206]: WARNING: GdmDisplay: display lasted 0.161415 seconds
> > gdm[5206]: WARNING: GdmLocalDisplayFactory: maximum number of X display failures reached: check X server log for errors
> >
> > It did boot all the way though.
> >
> > 2) attached log
> 
> So I'm confused where this leaves us.
> 
> In your OP, you claim to have gotten it working with a partial revert
> of commit 835d844d1a28 (but you didn't attach the partial revert so no
> one knows what you did); however, my suggestion should have been
> equivalent.

I apologize if I was unclear.  Your suggestion of CONFIG_SERIAL_8250_PNP=n did
successfully boot and provide messages across the console, and yes, it is
basically equivalent to the revert.
I only just noticed those warnings in the dmesg; they weren't there before.

> 
> Note that you have the serial port disabled in BIOS; that's why you're
> getting the probe error for PNP.
> 
> Regards,
> Peter Hurley


Now when you say it's disabled in the BIOS, how can I be sure and double-check that?
These BIOS screens do not have any mention of PNP settings.
I am getting output over the console (via IPMI) until the boot hangs.

Nate



Re: [PATCH 01/14] pinctrl-sx150x: Rely on of_modalias_node for OF matching

2016-11-04 Thread Linus Walleij
On Fri, Nov 4, 2016 at 9:09 PM, Andrey Smirnov  wrote:
> On Fri, Nov 4, 2016 at 5:28 AM, Linus Walleij  wrote:
>> On Tue, Nov 1, 2016 at 4:57 PM, Andrey Smirnov  wrote:
>>
>>> None of the OF match table entries contain any compatibility strings that
>>> could not be matched against using i2c_device_id table above and
>>> of_modalias_node. Besides that, entries in the OF match table do not carry
>>> proper device variant information, which is needed by the driver. Those two
>>> facts combined, IMHO, make a compelling case for removal of that code
>>> altogether.
>>>
>>> Signed-off-by: Andrey Smirnov 
>> (...)
>>> -static const struct of_device_id sx150x_of_match[] = {
>>> -   { .compatible = "semtech,sx1508q" },
>>> -   { .compatible = "semtech,sx1509q" },
>>> -   { .compatible = "semtech,sx1506q" },
>>> -   { .compatible = "semtech,sx1502q" },
>>> -   {},
>>> -};
>>
>> I'm a bit hesitant about this since we should ideally first match on the
>> compatible string for any device. We have tried to alleviate the situation
>> in I2C devices but it has been a bit so-so.
>>
>
> Ah, good to know. Let's do it that way then.
>
>> It would be best if we make a separate patch after this tjat adds it
>> back, set the variant data also in the .data of the match and
>> use of_device_get_match_data().
>
> Do you prefer it as a separate patch, or, instead, should I change
> this patch of the series to do what you describe? I'd be happy to do
> either and it seems like it would be a trivial change so it should
> invalidate any of the testing done by Neil.

Yeah it would ideally be a modification of this patch.

Whatever is easiest for you to do!

BTW this is a great patch set and I'm very grateful for your and Neil's
combined contributions on getting this driver into shape.

Yours,
Linus Walleij
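
A minimal userspace sketch of the approach discussed in this thread, where each
match entry carries its variant data in .data and a single lookup returns it.
The struct names, the ngpios field and its values are assumptions made for
illustration only; in the kernel the table would be an array of
struct of_device_id and the lookup would be of_device_get_match_data().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-variant data; the driver's real structure is not shown in the thread. */
struct variant_data {
    int ngpios; /* illustrative value only */
};

/* Userspace stand-in for the kernel's struct of_device_id. */
struct match_entry {
    const char *compatible;
    const void *data;
};

static const struct variant_data sx1508q_data = { .ngpios = 8 };
static const struct variant_data sx1509q_data = { .ngpios = 16 };

static const struct match_entry match_table[] = {
    { .compatible = "semtech,sx1508q", .data = &sx1508q_data },
    { .compatible = "semtech,sx1509q", .data = &sx1509q_data },
    { NULL, NULL } /* sentinel terminates the table */
};

/* Stand-in for of_device_get_match_data(): .data of the first match, or NULL. */
static const void *get_match_data(const struct match_entry *table,
                                  const char *compatible)
{
    for (; table->compatible; table++)
        if (strcmp(table->compatible, compatible) == 0)
            return table->data;
    return NULL;
}
```

Keeping the variant data next to the compatible string means one lookup both
matches the device and tells the probe path which variant it is handling.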


[PATCH v11 12/22] vfio: Add notifier callback to parent's ops structure of mdev

2016-11-04 Thread Kirti Wankhede
Add a notifier callback to the parent's ops structure of the mdev device so
that a per-device notifier for the vfio module is registered through the
vfio_mdev module.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Iafa6f1721aecdd6e50eb93b153b5621e6d29b637
---
 drivers/vfio/mdev/vfio_mdev.c | 19 +++
 include/linux/mdev.h  |  9 +
 2 files changed, 28 insertions(+)

diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
index bb534d19e321..2b7c24aa9e46 100644
--- a/drivers/vfio/mdev/vfio_mdev.c
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -24,6 +24,15 @@
 #define DRIVER_AUTHOR   "NVIDIA Corporation"
 #define DRIVER_DESC "VFIO based driver for Mediated device"
 
+static int vfio_mdev_notifier(struct notifier_block *nb, unsigned long action,
+ void *data)
+{
+   struct mdev_device *mdev = container_of(nb, struct mdev_device, nb);
+   struct parent_device *parent = mdev->parent;
+
+   return parent->ops->notifier(mdev, action, data);
+}
+
 static int vfio_mdev_open(void *device_data)
 {
struct mdev_device *mdev = device_data;
@@ -40,6 +49,11 @@ static int vfio_mdev_open(void *device_data)
if (ret)
module_put(THIS_MODULE);
 
+   if (likely(parent->ops->notifier)) {
+   mdev->nb.notifier_call = vfio_mdev_notifier;
+   if (vfio_register_notifier(&mdev->dev, &mdev->nb))
+   pr_err("Failed to register notifier for mdev\n");
+   }
return ret;
 }
 
@@ -48,6 +62,11 @@ static void vfio_mdev_release(void *device_data)
struct mdev_device *mdev = device_data;
struct parent_device *parent = mdev->parent;
 
+   if (likely(parent->ops->notifier)) {
+   if (vfio_unregister_notifier(&mdev->dev, &mdev->nb))
+   pr_err("Failed to unregister notifier for mdev\n");
+   }
+
if (likely(parent->ops->release))
parent->ops->release(mdev);
 
diff --git a/include/linux/mdev.h b/include/linux/mdev.h
index 0352febc1944..2999ef0ddaed 100644
--- a/include/linux/mdev.h
+++ b/include/linux/mdev.h
@@ -37,6 +37,7 @@ struct mdev_device {
struct kref ref;
struct list_head next;
struct kobject  *type_kobj;
+   struct notifier_block   nb;
 };
 
 
@@ -84,6 +85,12 @@ struct mdev_device {
  * @cmd: mediated device structure
  * @arg: mediated device structure
  * @mmap:  mmap callback
+ * @mdev: mediated device structure
+ * @vma: vma structure
+ * @notifer:   Notifier callback
+ * @mdev: mediated device structure
+ * @action: Action for which notifier is called
+ * @data: Data associated with the notifier
  * Parent device that support mediated device should be registered with mdev
  * module with parent_ops structure.
  **/
@@ -105,6 +112,8 @@ struct parent_ops {
ssize_t (*ioctl)(struct mdev_device *mdev, unsigned int cmd,
 unsigned long arg);
int (*mmap)(struct mdev_device *mdev, struct vm_area_struct *vma);
+   int (*notifier)(struct mdev_device *mdev, unsigned long action,
+   void *data);
 };
 
 /* interface for exporting mdev supported type attributes */
-- 
2.7.0
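
The container_of() step in vfio_mdev_notifier() can be tried outside the kernel.
The sketch below is a userspace approximation: struct device_state is a
hypothetical stand-in for struct mdev_device, and the macro is a simplified
version of the kernel's container_of() without its type checking.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): step back from a member to its enclosing struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct notifier_block {
    int (*notifier_call)(struct notifier_block *nb, unsigned long action,
                         void *data);
};

/* Hypothetical stand-in for struct mdev_device with an embedded notifier. */
struct device_state {
    int id;
    unsigned long last_action;
    struct notifier_block nb;
};

/* The callback only receives nb, yet it can recover the owning device. */
static int device_notifier(struct notifier_block *nb, unsigned long action,
                           void *data)
{
    struct device_state *dev = container_of(nb, struct device_state, nb);

    (void)data; /* unused in this sketch */
    dev->last_action = action;
    return dev->id;
}
```

Embedding the notifier block in the per-device structure is what lets one
shared callback serve every device without a separate lookup table.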



[PATCH v11 07/22] vfio iommu type1: Update argument of vaddr_get_pfn()

2016-11-04 Thread Kirti Wankhede
Update arguments of vaddr_get_pfn() to take struct mm_struct *mm as input
argument.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I885fd4cd4a9f66f4ee2c1caf58267464ec239f52
---
 drivers/vfio/vfio_iommu_type1.c | 30 +++---
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 02b302d0b7de..653386e80e85 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -226,20 +226,36 @@ static int put_pfn(unsigned long pfn, int prot)
return 0;
 }
 
-static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
+static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
+int prot, unsigned long *pfn)
 {
struct page *page[1];
struct vm_area_struct *vma;
int ret = -EFAULT;
 
-   if (get_user_pages_fast(vaddr, 1, !!(prot & IOMMU_WRITE), page) == 1) {
+   if (mm == current->mm)
+   ret = get_user_pages_fast(vaddr, 1, !!(prot & IOMMU_WRITE),
+ page);
+   else {
+   unsigned int flags = 0;
+
+   if (prot & IOMMU_WRITE)
+   flags |= FOLL_WRITE;
+
+   down_read(&mm->mmap_sem);
+   ret = get_user_pages_remote(NULL, mm, vaddr, 1, flags, page,
+   NULL);
+   up_read(&mm->mmap_sem);
+   }
+
+   if (ret == 1) {
*pfn = page_to_pfn(page[0]);
return 0;
}
 
-   down_read(&current->mm->mmap_sem);
+   down_read(&mm->mmap_sem);
 
-   vma = find_vma_intersection(current->mm, vaddr, vaddr + 1);
+   vma = find_vma_intersection(mm, vaddr, vaddr + 1);
 
if (vma && vma->vm_flags & VM_PFNMAP) {
*pfn = ((vaddr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
@@ -247,7 +263,7 @@ static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
ret = 0;
}
 
-   up_read(&current->mm->mmap_sem);
+   up_read(&mm->mmap_sem);
 
return ret;
 }
@@ -268,7 +284,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
if (!current->mm)
return -ENODEV;
 
-   ret = vaddr_get_pfn(vaddr, prot, pfn_base);
+   ret = vaddr_get_pfn(current->mm, vaddr, prot, pfn_base);
if (ret)
return ret;
 
@@ -291,7 +307,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
for (i = 1, vaddr += PAGE_SIZE; i < npage; i++, vaddr += PAGE_SIZE) {
unsigned long pfn = 0;
 
-   ret = vaddr_get_pfn(vaddr, prot, &pfn);
+   ret = vaddr_get_pfn(current->mm, vaddr, prot, &pfn);
if (ret)
break;
 
-- 
2.7.0
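
For the VM_PFNMAP fallback in vaddr_get_pfn(), the PFN is derived from the
offset of vaddr into the mapping. A small worked sketch of that arithmetic
follows; struct sketch_vma and SKETCH_PAGE_SHIFT are local stand-ins, and a
4 KiB page size is assumed.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Only the two fields the calculation needs; not the kernel's vm_area_struct. */
struct sketch_vma {
    unsigned long vm_start; /* first virtual address of the mapping */
    unsigned long vm_pgoff; /* PFN backing vm_start in a PFNMAP mapping */
};

/* Same expression as the patch: page index within the VMA plus the base PFN. */
static unsigned long pfnmap_vaddr_to_pfn(const struct sketch_vma *vma,
                                         unsigned long vaddr)
{
    assert(vaddr >= vma->vm_start);
    return ((vaddr - vma->vm_start) >> SKETCH_PAGE_SHIFT) + vma->vm_pgoff;
}
```

With vm_start = 0x10000000 and vm_pgoff = 0x80000, any address inside the
fourth page of the mapping (0x10003000 to 0x10003fff) resolves to PFN 0x80003.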



[PATCH v11 08/22] vfio iommu type1: Add find_iommu_group() function

2016-11-04 Thread Kirti Wankhede
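Add find_iommu_group() to look up the vfio_group that matches a given
iommu_group within a vfio_domain, and use it in place of the open-coded
list walks in the attach and detach paths.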

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I9d372f1ebe9eb01a5a21374b8a2b03f7df73601f
---
 drivers/vfio/vfio_iommu_type1.c | 58 -
 1 file changed, 34 insertions(+), 24 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 653386e80e85..422c8d198abb 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -748,11 +748,24 @@ static void vfio_test_domain_fgsp(struct vfio_domain *domain)
__free_pages(pages, order);
 }
 
+static struct vfio_group *find_iommu_group(struct vfio_domain *domain,
+  struct iommu_group *iommu_group)
+{
+   struct vfio_group *g;
+
+   list_for_each_entry(g, &domain->group_list, next) {
+   if (g->iommu_group == iommu_group)
+   return g;
+   }
+
+   return NULL;
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 struct iommu_group *iommu_group)
 {
struct vfio_iommu *iommu = iommu_data;
-   struct vfio_group *group, *g;
+   struct vfio_group *group;
struct vfio_domain *domain, *d;
struct bus_type *bus = NULL;
int ret;
@@ -760,10 +773,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
mutex_lock(&iommu->lock);
 
list_for_each_entry(d, &iommu->domain_list, next) {
-   list_for_each_entry(g, &d->group_list, next) {
-   if (g->iommu_group != iommu_group)
-   continue;
-
+   if (find_iommu_group(d, iommu_group)) {
mutex_unlock(&iommu->lock);
return -EINVAL;
}
@@ -882,28 +892,28 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 
mutex_lock(&iommu->lock);
 
+
list_for_each_entry(domain, &iommu->domain_list, next) {
-   list_for_each_entry(group, &domain->group_list, next) {
-   if (group->iommu_group != iommu_group)
-   continue;
+   group = find_iommu_group(domain, iommu_group);
+   if (!group)
+   continue;
 
-   iommu_detach_group(domain->domain, iommu_group);
-   list_del(&group->next);
-   kfree(group);
-   /*
-* Group ownership provides privilege, if the group
-* list is empty, the domain goes away.  If it's the
-* last domain, then all the mappings go away too.
-*/
-   if (list_empty(&domain->group_list)) {
-   if (list_is_singular(&iommu->domain_list))
-   vfio_iommu_unmap_unpin_all(iommu);
-   iommu_domain_free(domain->domain);
-   list_del(&domain->next);
-   kfree(domain);
-   }
-   goto done;
+   iommu_detach_group(domain->domain, iommu_group);
+   list_del(&group->next);
+   kfree(group);
+   /*
+* Group ownership provides privilege, if the group
+* list is empty, the domain goes away.  If it's the
+* last domain, then all the mappings go away too.
+*/
+   if (list_empty(&domain->group_list)) {
+   if (list_is_singular(&iommu->domain_list))
+   vfio_iommu_unmap_unpin_all(iommu);
+   iommu_domain_free(domain->domain);
+   list_del(&domain->next);
+   kfree(domain);
}
+   goto done;
}
 
 done:
-- 
2.7.0
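
The refactoring replaces two nested loops with one lookup helper. A rough
userspace sketch of the same shape is shown below; it uses a plain singly
linked list and an integer ID instead of the kernel's list_head and
struct iommu_group pointer, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct group {
    int iommu_group_id;  /* stand-in for the struct iommu_group pointer */
    struct group *next;
};

struct domain {
    struct group *groups; /* head of this domain's group list */
};

/* Counterpart of find_iommu_group(): first matching group, or NULL. */
static struct group *find_group(const struct domain *domain, int iommu_group_id)
{
    struct group *g;

    for (g = domain->groups; g; g = g->next)
        if (g->iommu_group_id == iommu_group_id)
            return g;
    return NULL;
}

/* Attach-style check: is the group already present in any domain? */
static int group_already_attached(const struct domain *domains, size_t n,
                                  int iommu_group_id)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (find_group(&domains[i], iommu_group_id))
            return 1;
    return 0;
}
```

Returning the entry itself rather than a boolean lets the attach path use the
helper as a presence test while the detach path reuses the returned pointer.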



[PATCH v11 03/22] vfio: Rearrange functions to get vfio_group from dev

2016-11-04 Thread Kirti Wankhede
This patch rearranges functions to get vfio_group from device

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I1f93262bdbab75094bc24b087b29da35ba70c4c6
---
 drivers/vfio/vfio.c | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index d1d70e0b011b..23bc86c1d05d 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -480,6 +480,21 @@ static struct vfio_group *vfio_group_get_from_minor(int minor)
return group;
 }
 
+static struct vfio_group *vfio_group_get_from_dev(struct device *dev)
+{
+   struct iommu_group *iommu_group;
+   struct vfio_group *group;
+
+   iommu_group = iommu_group_get(dev);
+   if (!iommu_group)
+   return NULL;
+
+   group = vfio_group_get_from_iommu(iommu_group);
+   iommu_group_put(iommu_group);
+
+   return group;
+}
+
 /**
  * Device objects - create, release, get, put, search
  */
@@ -811,16 +826,10 @@ EXPORT_SYMBOL_GPL(vfio_add_group_dev);
  */
 struct vfio_device *vfio_device_get_from_dev(struct device *dev)
 {
-   struct iommu_group *iommu_group;
struct vfio_group *group;
struct vfio_device *device;
 
-   iommu_group = iommu_group_get(dev);
-   if (!iommu_group)
-   return NULL;
-
-   group = vfio_group_get_from_iommu(iommu_group);
-   iommu_group_put(iommu_group);
+   group = vfio_group_get_from_dev(dev);
if (!group)
return NULL;
 
-- 
2.7.0
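
The new helper holds the iommu_group reference only for the duration of the
lookup and hands back a referenced vfio_group. The following userspace sketch
models that with plain counters; every sk_ name is a hypothetical stand-in and
none of the kernel's locking or kref machinery is reproduced.

```c
#include <assert.h>
#include <stddef.h>

struct sk_vfio_group { int refs; };
struct sk_iommu_group { int refs; struct sk_vfio_group *vfio_group; };
struct sk_device { struct sk_iommu_group *iommu_group; };

/* Take a reference on the device's iommu group, if it has one. */
static struct sk_iommu_group *sk_iommu_group_get(struct sk_device *dev)
{
    if (dev->iommu_group)
        dev->iommu_group->refs++;
    return dev->iommu_group;
}

static void sk_iommu_group_put(struct sk_iommu_group *group)
{
    group->refs--;
}

/* Return the vfio group for an iommu group with an extra reference held. */
static struct sk_vfio_group *sk_vfio_group_get_from_iommu(struct sk_iommu_group *group)
{
    if (group->vfio_group)
        group->vfio_group->refs++;
    return group->vfio_group;
}

/* Same shape as vfio_group_get_from_dev(): get, look up, put, return. */
static struct sk_vfio_group *sk_vfio_group_get_from_dev(struct sk_device *dev)
{
    struct sk_iommu_group *iommu_group = sk_iommu_group_get(dev);
    struct sk_vfio_group *group;

    if (!iommu_group)
        return NULL;

    group = sk_vfio_group_get_from_iommu(iommu_group);
    sk_iommu_group_put(iommu_group);
    return group;
}
```

After a successful call the iommu group's count is back where it started and
only the returned vfio group carries the extra reference, which the caller is
then responsible for dropping.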



[PATCH v11 13/22] vfio: Introduce common function to add capabilities

2016-11-04 Thread Kirti Wankhede
Vendor drivers using the mediated device framework should use
vfio_info_add_capability() to add capabilities.
This function is introduced to reduce code duplication in vendor drivers.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I6fca329fa2291f37a2c859d0bc97574d9e2ce1a6
---
 drivers/vfio/vfio.c  | 60 +++-
 include/linux/vfio.h |  3 +++
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 4ed1a6a247c6..9a03be0942a1 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1797,8 +1797,66 @@ void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset)
for (tmp = caps->buf; tmp->next; tmp = (void *)tmp + tmp->next - offset)
tmp->next += offset;
 }
-EXPORT_SYMBOL_GPL(vfio_info_cap_shift);
+EXPORT_SYMBOL(vfio_info_cap_shift);
 
+static int sparse_mmap_cap(struct vfio_info_cap *caps, void *cap_type)
+{
+   struct vfio_info_cap_header *header;
+   struct vfio_region_info_cap_sparse_mmap *sparse_cap, *sparse = cap_type;
+   size_t size;
+
+   size = sizeof(*sparse) + sparse->nr_areas *  sizeof(*sparse->areas);
+   header = vfio_info_cap_add(caps, size,
+  VFIO_REGION_INFO_CAP_SPARSE_MMAP, 1);
+   if (IS_ERR(header))
+   return PTR_ERR(header);
+
+   sparse_cap = container_of(header,
+   struct vfio_region_info_cap_sparse_mmap, header);
+   sparse_cap->nr_areas = sparse->nr_areas;
+   memcpy(sparse_cap->areas, sparse->areas,
+  sparse->nr_areas * sizeof(*sparse->areas));
+   return 0;
+}
+
+static int region_type_cap(struct vfio_info_cap *caps, void *cap_type)
+{
+   struct vfio_info_cap_header *header;
+   struct vfio_region_info_cap_type *type_cap, *cap = cap_type;
+
+   header = vfio_info_cap_add(caps, sizeof(*cap),
+  VFIO_REGION_INFO_CAP_TYPE, 1);
+   if (IS_ERR(header))
+   return PTR_ERR(header);
+
+   type_cap = container_of(header, struct vfio_region_info_cap_type,
+   header);
+   type_cap->type = cap->type;
+   type_cap->subtype = cap->subtype;
+   return 0;
+}
+
+int vfio_info_add_capability(struct vfio_info_cap *caps, int cap_type_id,
+void *cap_type)
+{
+   int ret = -EINVAL;
+
+   if (!cap_type)
+   return 0;
+
+   switch (cap_type_id) {
+   case VFIO_REGION_INFO_CAP_SPARSE_MMAP:
+   ret = sparse_mmap_cap(caps, cap_type);
+   break;
+
+   case VFIO_REGION_INFO_CAP_TYPE:
+   ret = region_type_cap(caps, cap_type);
+   break;
+   }
+
+   return ret;
+}
+EXPORT_SYMBOL(vfio_info_add_capability);
 
 /*
  * Pin a set of guest PFNs and return their associated host PFNs for local
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index dcda8fccefab..cf90393a11e2 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -113,6 +113,9 @@ extern struct vfio_info_cap_header *vfio_info_cap_add(
struct vfio_info_cap *caps, size_t size, u16 id, u16 version);
 extern void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset);
 
+extern int vfio_info_add_capability(struct vfio_info_cap *caps,
+   int cap_type_id, void *cap_type);
+
 struct pci_dev;
 #ifdef CONFIG_EEH
 extern void vfio_spapr_pci_eeh_open(struct pci_dev *pdev);
-- 
2.7.0
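
The size computation in sparse_mmap_cap() is the usual pattern for a structure
that ends in a variable-length array: fixed header plus nr_areas elements. A
userspace sketch of that sizing and copy is given below; the struct layouts are
simplified assumptions and are not the definitions from the VFIO uapi header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for the sparse mmap capability layout. */
struct area {
    uint64_t offset;
    uint64_t size;
};

struct sparse_cap {
    uint32_t nr_areas;
    uint32_t reserved;
    struct area areas[]; /* flexible array member */
};

/* Fixed part plus one struct area per entry, as in the patch. */
static size_t sparse_cap_size(uint32_t nr_areas)
{
    return sizeof(struct sparse_cap) + (size_t)nr_areas * sizeof(struct area);
}

/* Allocate a copy sized for src->nr_areas entries; caller frees the result. */
static struct sparse_cap *sparse_cap_dup(const struct sparse_cap *src)
{
    size_t size = sparse_cap_size(src->nr_areas);
    struct sparse_cap *dst = malloc(size);

    if (!dst)
        return NULL;
    memcpy(dst, src, size);
    return dst;
}
```

Centralising the size calculation in one function is the same motivation the
patch gives: each vendor driver would otherwise repeat it and risk getting the
trailing array length wrong.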



[PATCH v11 17/22] vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare()

2016-11-04 Thread Kirti Wankhede
Updated vfio_platform_common.c file to use
vfio_set_irqs_validate_and_prepare()

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Id87cd6b78ae901610b39bf957974baa6f40cd7b0
---
 drivers/vfio/platform/vfio_platform_common.c | 31 +++-
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index d78142830754..4c27f4be3c3d 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -364,36 +364,21 @@ static long vfio_platform_ioctl(void *device_data,
struct vfio_irq_set hdr;
u8 *data = NULL;
int ret = 0;
+   size_t data_size = 0;
 
minsz = offsetofend(struct vfio_irq_set, count);
 
if (copy_from_user(&hdr, (void __user *)arg, minsz))
return -EFAULT;
 
-   if (hdr.argsz < minsz)
-   return -EINVAL;
-
-   if (hdr.index >= vdev->num_irqs)
-   return -EINVAL;
-
-   if (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
- VFIO_IRQ_SET_ACTION_TYPE_MASK))
-   return -EINVAL;
-
-   if (!(hdr.flags & VFIO_IRQ_SET_DATA_NONE)) {
-   size_t size;
-
-   if (hdr.flags & VFIO_IRQ_SET_DATA_BOOL)
-   size = sizeof(uint8_t);
-   else if (hdr.flags & VFIO_IRQ_SET_DATA_EVENTFD)
-   size = sizeof(int32_t);
-   else
-   return -EINVAL;
-
-   if (hdr.argsz - minsz < size)
-   return -EINVAL;
+   ret = vfio_set_irqs_validate_and_prepare(&hdr, vdev->num_irqs,
+vdev->num_irqs, &data_size);
+   if (ret)
+   return ret;
 
-   data = memdup_user((void __user *)(arg + minsz), size);
+   if (data_size) {
+   data = memdup_user((void __user *)(arg + minsz),
+   data_size);
if (IS_ERR(data))
return PTR_ERR(data);
}
-- 
2.7.0
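
The checks this patch removes from vfio_platform_ioctl() show what a shared
validation helper has to cover. The sketch below restates only those removed
checks in userspace C; the SK_ constants, struct sk_irq_set and the function
signature are assumptions for illustration, and the real
vfio_set_irqs_validate_and_prepare() may differ in both signature and
behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the flag bits; the real values live in <linux/vfio.h>. */
#define SK_DATA_NONE    (1u << 0)
#define SK_DATA_BOOL    (1u << 1)
#define SK_DATA_EVENTFD (1u << 2)
#define SK_DATA_MASK    (SK_DATA_NONE | SK_DATA_BOOL | SK_DATA_EVENTFD)
#define SK_ACTION_MASK  (7u << 3)

struct sk_irq_set {
    uint32_t argsz, flags, index, start, count;
};

/*
 * Validate the header and report how many payload bytes follow it.
 * Returns 0 on success or -EINVAL, mirroring the removed open-coded checks.
 */
static int sk_validate(const struct sk_irq_set *hdr, uint32_t num_irqs,
                       size_t minsz, size_t *data_size)
{
    size_t size = 0;

    if (hdr->argsz < minsz)
        return -EINVAL;
    if (hdr->index >= num_irqs)
        return -EINVAL;
    if (hdr->flags & ~(SK_DATA_MASK | SK_ACTION_MASK))
        return -EINVAL;

    if (!(hdr->flags & SK_DATA_NONE)) {
        if (hdr->flags & SK_DATA_BOOL)
            size = sizeof(uint8_t);
        else if (hdr->flags & SK_DATA_EVENTFD)
            size = sizeof(int32_t);
        else
            return -EINVAL;

        if (hdr->argsz - minsz < size)
            return -EINVAL;
    }

    *data_size = size;
    return 0;
}
```

A caller would copy the fixed header first, run the validation, and only then
copy data_size bytes of payload, which is the order the patched ioctl follows.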



Re: v4.8-rc1: thinkpad x60: running at low frequency even during kernel build

2016-11-04 Thread Pandruvada, Srinivas
On Fri, 2016-11-04 at 21:44 +0100, Pavel Machek wrote:
> Hi!
> 
> > > Let me try v4.9-rc2... that works ok (cpus at the high frequency
> > > during the kernel build). Unfortunately that sends my cpus to 99C
> > > temperature range (and eventually forces emergency shutdown).
> > 
> > This we have to debug. Do you see a line like
> > "/sys/devices/system/cpu/cpu0/cpufreq/bios_limit:100"?
> > If not, we need to find out why.
> 
> I'd prefer mails over bugzilla for now...
> 
> 4.9-rc2 has bios_limit:
> 
> pavel@duo:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
> 1833000
> 
> and it has thermal zones:
> 
> /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp 127000
> /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_type critical
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_0_temp 97000
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_0_type critical
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_temp 92500
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_type passive
> 
It will not act if there is no binding information. Do you have more
files in this folder?

The output of "grep -r . *" run in /sys/class/thermal would be helpful.


> ..so it should slow down CPU at 92C.
> 
> So let's push the temperature up a bit...
> 
> sudo watch cat /proc/acpi/ibm/thermal \
>     /sys/devices/system/cpu/cpu0/cpufreq/bios_limit \
>     /sys/devices/virtual/thermal/thermal_zone1/temp \
>     /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
> 
> temperatures:   98 49 -128 85 28 -128 28 -128 49 58 -128 -128 -128 -128 -128 -128
> 1833000
> 95000
> 1833000
> 
> Hmm. bios_limit does not seem to change, even when the temperature is
> clearly above the trip point. (It is also interesting that acpi/ibm
> reports bigger temperatures than
> /sys/devices/virtual/thermal/thermal_zone1/temp . I have seen 103C
> there.)
Probably they are showing package and core temperature or have a
different sampling interval.

Try enabling the thermald service in Debian. It has access to more knobs to
control thermals.

Thanks,
Srinivas


[PATCH v11 13/22] vfio: Introduce common function to add capabilities

2016-11-04 Thread Kirti Wankhede
Vendor driver using mediated device framework should use
vfio_info_add_capability() to add capabilities.
Introduced this function to reduce code duplication in vendor drivers.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I6fca329fa2291f37a2c859d0bc97574d9e2ce1a6
---
 drivers/vfio/vfio.c  | 60 +++-
 include/linux/vfio.h |  3 +++
 2 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 4ed1a6a247c6..9a03be0942a1 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1797,8 +1797,66 @@ void vfio_info_cap_shift(struct vfio_info_cap *caps, 
size_t offset)
for (tmp = caps->buf; tmp->next; tmp = (void *)tmp + tmp->next - offset)
tmp->next += offset;
 }
-EXPORT_SYMBOL_GPL(vfio_info_cap_shift);
+EXPORT_SYMBOL(vfio_info_cap_shift);
 
+static int sparse_mmap_cap(struct vfio_info_cap *caps, void *cap_type)
+{
+   struct vfio_info_cap_header *header;
+   struct vfio_region_info_cap_sparse_mmap *sparse_cap, *sparse = cap_type;
+   size_t size;
+
+   size = sizeof(*sparse) + sparse->nr_areas *  sizeof(*sparse->areas);
+   header = vfio_info_cap_add(caps, size,
+  VFIO_REGION_INFO_CAP_SPARSE_MMAP, 1);
+   if (IS_ERR(header))
+   return PTR_ERR(header);
+
+   sparse_cap = container_of(header,
+   struct vfio_region_info_cap_sparse_mmap, header);
+   sparse_cap->nr_areas = sparse->nr_areas;
+   memcpy(sparse_cap->areas, sparse->areas,
+  sparse->nr_areas * sizeof(*sparse->areas));
+   return 0;
+}
+
+static int region_type_cap(struct vfio_info_cap *caps, void *cap_type)
+{
+   struct vfio_info_cap_header *header;
+   struct vfio_region_info_cap_type *type_cap, *cap = cap_type;
+
+   header = vfio_info_cap_add(caps, sizeof(*cap),
+  VFIO_REGION_INFO_CAP_TYPE, 1);
+   if (IS_ERR(header))
+   return PTR_ERR(header);
+
+   type_cap = container_of(header, struct vfio_region_info_cap_type,
+   header);
+   type_cap->type = cap->type;
+   type_cap->subtype = cap->subtype;
+   return 0;
+}
+
+int vfio_info_add_capability(struct vfio_info_cap *caps, int cap_type_id,
+void *cap_type)
+{
+   int ret = -EINVAL;
+
+   if (!cap_type)
+   return 0;
+
+   switch (cap_type_id) {
+   case VFIO_REGION_INFO_CAP_SPARSE_MMAP:
+   ret = sparse_mmap_cap(caps, cap_type);
+   break;
+
+   case VFIO_REGION_INFO_CAP_TYPE:
+   ret = region_type_cap(caps, cap_type);
+   break;
+   }
+
+   return ret;
+}
+EXPORT_SYMBOL(vfio_info_add_capability);
 
 /*
  * Pin a set of guest PFNs and return their associated host PFNs for local
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index dcda8fccefab..cf90393a11e2 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -113,6 +113,9 @@ extern struct vfio_info_cap_header *vfio_info_cap_add(
struct vfio_info_cap *caps, size_t size, u16 id, u16 version);
 extern void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset);
 
+extern int vfio_info_add_capability(struct vfio_info_cap *caps,
+   int cap_type_id, void *cap_type);
+
 struct pci_dev;
 #ifdef CONFIG_EEH
 extern void vfio_spapr_pci_eeh_open(struct pci_dev *pdev);
-- 
2.7.0
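
Editorial sketch (not part of the patch): sparse_mmap_cap() above sizes a
variable-length capability as a fixed header plus nr_areas array entries and
then copies the areas across. The user-space C below mirrors that arithmetic
only. The demo_* structs are local stand-ins assumed to match the UAPI layout,
and demo_sparse_cap_dup() is a hypothetical helper; neither is a kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins for the UAPI structures (assumed layout, demo names). */
struct demo_cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;
};

struct demo_sparse_area {
	uint64_t offset;
	uint64_t size;
};

struct demo_sparse_cap {
	struct demo_cap_header header;
	uint32_t nr_areas;
	uint32_t reserved;
	struct demo_sparse_area areas[];	/* flexible array member */
};

/* Bytes needed for a capability that carries nr_areas entries. */
static size_t demo_sparse_cap_size(uint32_t nr_areas)
{
	return sizeof(struct demo_sparse_cap) +
	       (size_t)nr_areas * sizeof(struct demo_sparse_area);
}

/* Allocate a zeroed copy of src; the caller frees it. NULL on failure. */
static struct demo_sparse_cap *
demo_sparse_cap_dup(const struct demo_sparse_cap *src)
{
	struct demo_sparse_cap *dst;

	dst = calloc(1, demo_sparse_cap_size(src->nr_areas));
	if (!dst)
		return NULL;

	dst->nr_areas = src->nr_areas;
	memcpy(dst->areas, src->areas,
	       (size_t)src->nr_areas * sizeof(src->areas[0]));
	return dst;
}
```

In the kernel the buffer comes from vfio_info_cap_add() rather than calloc(),
and the header fields are filled in by that helper.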



[PATCH v11 17/22] vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare()

2016-11-04 Thread Kirti Wankhede
Updated vfio_platform_common.c file to use
vfio_set_irqs_validate_and_prepare()

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Id87cd6b78ae901610b39bf957974baa6f40cd7b0
---
 drivers/vfio/platform/vfio_platform_common.c | 31 +++-
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/drivers/vfio/platform/vfio_platform_common.c b/drivers/vfio/platform/vfio_platform_common.c
index d78142830754..4c27f4be3c3d 100644
--- a/drivers/vfio/platform/vfio_platform_common.c
+++ b/drivers/vfio/platform/vfio_platform_common.c
@@ -364,36 +364,21 @@ static long vfio_platform_ioctl(void *device_data,
struct vfio_irq_set hdr;
u8 *data = NULL;
int ret = 0;
+   size_t data_size = 0;
 
minsz = offsetofend(struct vfio_irq_set, count);
 
if (copy_from_user(&hdr, (void __user *)arg, minsz))
return -EFAULT;
 
-   if (hdr.argsz < minsz)
-   return -EINVAL;
-
-   if (hdr.index >= vdev->num_irqs)
-   return -EINVAL;
-
-   if (hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
- VFIO_IRQ_SET_ACTION_TYPE_MASK))
-   return -EINVAL;
-
-   if (!(hdr.flags & VFIO_IRQ_SET_DATA_NONE)) {
-   size_t size;
-
-   if (hdr.flags & VFIO_IRQ_SET_DATA_BOOL)
-   size = sizeof(uint8_t);
-   else if (hdr.flags & VFIO_IRQ_SET_DATA_EVENTFD)
-   size = sizeof(int32_t);
-   else
-   return -EINVAL;
-
-   if (hdr.argsz - minsz < size)
-   return -EINVAL;
+   ret = vfio_set_irqs_validate_and_prepare(&hdr, vdev->num_irqs,
+vdev->num_irqs, &data_size);
+   if (ret)
+   return ret;
 
-   data = memdup_user((void __user *)(arg + minsz), size);
+   if (data_size) {
+   data = memdup_user((void __user *)(arg + minsz),
+   data_size);
if (IS_ERR(data))
return PTR_ERR(data);
}
-- 
2.7.0



Re: v4.8-rc1: thinkpad x60: running at low frequency even during kernel build

2016-11-04 Thread Pandruvada, Srinivas
On Fri, 2016-11-04 at 21:44 +0100, Pavel Machek wrote:
> Hi!
> 
> > 
> > > 
> > > Let me try v4.9-rc2... that works ok (cpus at the high frequency
> > > during the kernel build). Unfortunately that sends my cpus to 99C
> > > temperature range (and eventually forces emergency shutdown).
> > 
> > This we have to debug. Do you see same line like 
> > "
> > /sys/devices/system/cpu/cpu0/cpufreq/bios_limit:100
> > "
> > If not we need
> > to find out why.
> 
> I'd prefer mails over bugzilla for now...
> 
> 4.9-rc2 has bios_limit:
> 
> pavel@duo:~$ cat /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
> 1833000
> 
> and it has thermal zones:
> 
> /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_temp 127000
> /sys/devices/virtual/thermal/thermal_zone0/trip_point_0_type critical
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_0_temp 97000
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_0_type critical
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_temp 92500
> /sys/devices/virtual/thermal/thermal_zone1/trip_point_1_type passive
> 
It will not act if there is no binding information. Do you have more
files in this folder?

grep -r . * in /sys/class/thermal will be helpful.


> ..so it should slow down CPU at 92C.
> 
> So lets push the temperature up a bit...
> 
> sudo watch cat /proc/acpi/ibm/thermal
> /sys/devices/system/cpu/cpu0/cpufreq/bios_limit
> /sys/devices/virtual/thermal/thermal_zone1/temp  /sys/devices/system/
> cpu/cpu0/cpufreq/cpuinfo_cur_freq
> 
> temperatures:   98 49 -128 85 28 -128 28 -128 49 58 -128 -128 -128
> -128 -128 -128
> 1833000
> 95000
> 1833000
> 
> Hmm. bios_limit does not seem to change, even when the temperature is
> clearly above the trip point. (It is also interestng that acpi/ibm
> reports bigger temperatures than
> /sys/devices/virtual/thermal/thermal_zone1/temp . I have seen 103C
> there.)
Probably they are showing package and core temperature or have a
different sampling interval.

Try enabling the thermald service in Debian. It has access to more knobs to
control thermals.

Thanks,
Srinivas


[PATCH v11 21/22] docs: Sample driver to demonstrate how to use Mediated device framework.

2016-11-04 Thread Kirti Wankhede
The Sample driver creates mdev device that simulates serial port over PCI
card.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I857f8f12f8b275f2498dfe8c628a5cdc7193b1b2
---
 Documentation/vfio-mediated-device.txt |  103 ++-
 samples/vfio-mdev/Makefile |   13 +
 samples/vfio-mdev/mtty.c   | 1503 
 3 files changed, 1618 insertions(+), 1 deletion(-)
 create mode 100644 samples/vfio-mdev/Makefile
 create mode 100644 samples/vfio-mdev/mtty.c

diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
index d61e95aec961..146da548c8d2 100644
--- a/Documentation/vfio-mediated-device.txt
+++ b/Documentation/vfio-mediated-device.txt
@@ -289,8 +289,109 @@ these callbacks are supported in the TYPE1 IOMMU module. To enable them for
 other IOMMU backend modules, such as PPC64 sPAPR module, they need to provide
 these two callback functions.
 
+Using the Sample Code
+=
+
+mtty.c in samples/vfio-mdev/ directory is a sample driver program to
+demonstrate how to use the mediated device framework.
+
+The sample driver creates an mdev device that simulates a serial port over a PCI
+card.
+
+1. Build and load the mtty.ko module.
+
+   This step creates a dummy device, /sys/devices/virtual/mtty/mtty/
+
+   Files in this device directory in sysfs are similar to the following:
+
+   # tree /sys/devices/virtual/mtty/mtty/
+  /sys/devices/virtual/mtty/mtty/
+  |-- mdev_supported_types
+  |   |-- mtty-1
+  |   |   |-- available_instances
+  |   |   |-- create
+  |   |   |-- device_api
+  |   |   |-- devices
+  |   |   `-- name
+  |   `-- mtty-2
+  |   |-- available_instances
+  |   |-- create
+  |   |-- device_api
+  |   |-- devices
+  |   `-- name
+  |-- mtty_dev
+  |   `-- sample_mtty_dev
+  |-- power
+  |   |-- autosuspend_delay_ms
+  |   |-- control
+  |   |-- runtime_active_time
+  |   |-- runtime_status
+  |   `-- runtime_suspended_time
+  |-- subsystem -> ../../../../class/mtty
+  `-- uevent
+
+2. Create a mediated device by using the dummy device that you created in the
+   previous step.
+
+   # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
+  /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
+
+3. Add parameters to qemu-kvm.
+
+   -device vfio-pci,\
+sysfsdev=/sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001
+
+4. Boot the VM.
+
+   In the Linux guest VM, with no hardware on the host, the device appears
+   as follows:
+
+   # lspci -s 00:05.0 -xxvv
+   00:05.0 Serial controller: Device 4348:3253 (rev 10) (prog-if 02 [16550])
+   Subsystem: Device 4348:3253
+   Physical Slot: 5
+   Control: I/O+ Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr-
+   Stepping- SERR- FastB2B- DisINTx-
+   Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
+   SERR-  Link[LNKA] -> GSI 10 (level, high) -> IRQ
+10
+   :00:05.0: ttyS1 at I/O 0xc150 (irq = 10) is a 16550A
+   :00:05.0: ttyS2 at I/O 0xc158 (irq = 10) is a 16550A
+
+
+5. In the Linux guest VM, check the serial ports.
+
+   # setserial -g /dev/ttyS*
+   /dev/ttyS0, UART: 16550A, Port: 0x03f8, IRQ: 4
+   /dev/ttyS1, UART: 16550A, Port: 0xc150, IRQ: 10
+   /dev/ttyS2, UART: 16550A, Port: 0xc158, IRQ: 10
+
+6. Using minicom or any terminal emulation program, open port /dev/ttyS1 or
+   /dev/ttyS2 with hardware flow control disabled.
+
+7. Type data on the minicom terminal or send data to the terminal emulation
+   program and read the data.
+
+   Data is looped back from the host's mtty driver.
+
+8. Destroy the mediated device that you created.
+
+   # echo 1 > /sys/bus/mdev/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001/remove
+
 References
---
+==
 
 [1] See Documentation/vfio.txt for more information on VFIO.
 [2] struct mdev_driver in include/linux/mdev.h
diff --git a/samples/vfio-mdev/Makefile b/samples/vfio-mdev/Makefile
new file mode 100644
index ..a932edbe38eb
--- /dev/null
+++ b/samples/vfio-mdev/Makefile
@@ -0,0 +1,13 @@
+#
+# Makefile for mtty.c file
+#
+KERNEL_DIR:=/lib/modules/$(shell uname -r)/build
+
+obj-m:=mtty.o
+
+modules clean modules_install:
+   $(MAKE) -C $(KERNEL_DIR) SUBDIRS=$(PWD) $@
+
+default: modules
+
+module: modules
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
new file mode 100644
index ..6c71d12288d1
--- /dev/null
+++ b/samples/vfio-mdev/mtty.c
@@ -0,0 +1,1503 @@
+/*
+ * Mediated virtual PCI serial host device driver
+ *
+ * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
+ * Author: Neo Jia 
+ * Kirti Wankhede 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Sample 
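
Editorial sketch (not part of the patch): steps 2 and 8 of the walkthrough in
this patch create and remove a mediated device by writing to sysfs attributes
from a shell. The same writes can be done from C. The helper names below are
hypothetical, error handling is reduced to 0 or -1, and the code assumes the
sysfs layout described in the documentation text of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<parent>/mdev_supported_types/<type_id>/create" in buf.
 * Returns 0 on success, -1 if the result does not fit in len bytes.
 */
static int demo_mdev_create_path(char *buf, size_t len,
				 const char *parent, const char *type_id)
{
	int n = snprintf(buf, len, "%s/mdev_supported_types/%s/create",
			 parent, type_id);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write value to the attribute file at path, like `echo value > path`
 * without the trailing newline. Returns 0 on success, -1 on any error.
 */
static int demo_write_attr(const char *path, const char *value)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Creating the device from step 2 would then be a call to
demo_mdev_create_path() with "/sys/devices/virtual/mtty/mtty" and "mtty-2",
followed by demo_write_attr() with the UUID string. Removing it (step 8) is
demo_write_attr() with "1" on the device's remove attribute.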

[PATCH v11 16/22] vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare()

2016-11-04 Thread Kirti Wankhede
Updated vfio_pci.c file to use vfio_set_irqs_validate_and_prepare()

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I9f3daba89d8dba5cb5b01a8cff420412f30686c7
---
 drivers/vfio/pci/vfio_pci.c | 34 +++---
 1 file changed, 7 insertions(+), 27 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 03b5434f4d5b..dcd7c2a99618 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -818,45 +818,25 @@ static long vfio_pci_ioctl(void *device_data,
 
} else if (cmd == VFIO_DEVICE_SET_IRQS) {
struct vfio_irq_set hdr;
-   size_t size;
u8 *data = NULL;
int max, ret = 0;
+   size_t data_size = 0;
 
minsz = offsetofend(struct vfio_irq_set, count);
 
if (copy_from_user(&hdr, (void __user *)arg, minsz))
return -EFAULT;
 
-   if (hdr.argsz < minsz || hdr.index >= VFIO_PCI_NUM_IRQS ||
-   hdr.count >= (U32_MAX - hdr.start) ||
-   hdr.flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
- VFIO_IRQ_SET_ACTION_TYPE_MASK))
-   return -EINVAL;
-
max = vfio_pci_get_irq_count(vdev, hdr.index);
-   if (hdr.start >= max || hdr.start + hdr.count > max)
-   return -EINVAL;
 
-   switch (hdr.flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
-   case VFIO_IRQ_SET_DATA_NONE:
-   size = 0;
-   break;
-   case VFIO_IRQ_SET_DATA_BOOL:
-   size = sizeof(uint8_t);
-   break;
-   case VFIO_IRQ_SET_DATA_EVENTFD:
-   size = sizeof(int32_t);
-   break;
-   default:
-   return -EINVAL;
-   }
-
-   if (size) {
-   if (hdr.argsz - minsz < hdr.count * size)
-   return -EINVAL;
+   ret = vfio_set_irqs_validate_and_prepare(&hdr, max,
+VFIO_PCI_NUM_IRQS, &data_size);
+   if (ret)
+   return ret;
 
+   if (data_size) {
data = memdup_user((void __user *)(arg + minsz),
-  hdr.count * size);
+   data_size);
if (IS_ERR(data))
return PTR_ERR(data);
}
-- 
2.7.0



[PATCH v11 20/22] docs: Sysfs ABI for mediated device framework

2016-11-04 Thread Kirti Wankhede
Added details of sysfs ABI for mediated device framework

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Icb0fd4ed58a2fa793fbcb1c3d5009a4403c1f3ac
---
 Documentation/ABI/testing/sysfs-bus-vfio-mdev | 111 ++
 1 file changed, 111 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-vfio-mdev

diff --git a/Documentation/ABI/testing/sysfs-bus-vfio-mdev b/Documentation/ABI/testing/sysfs-bus-vfio-mdev
new file mode 100644
index ..452dbe39270e
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-vfio-mdev
@@ -0,0 +1,111 @@
+What:   /sys/...//mdev_supported_types/
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+This directory contains list of directories of currently
+   supported mediated device types and their details for
+   . Supported type attributes are defined by the
+   vendor driver who registers with Mediated device framework.
+   Each supported type is a directory whose name is created
+   by adding the device driver string as a prefix to the
+   string provided by the vendor driver.
+
+What:   /sys/...//mdev_supported_types//
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+This directory gives details of supported type, like name,
+   description, available_instances, device_api etc.
+   'device_api' and 'available_instances' are mandatory
+   attributes to be provided by vendor driver. 'name',
+   'description' and other vendor driver specific attributes
+   are optional.
+
+What:   /sys/.../mdev_supported_types//create
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   Writing UUID to this file will create mediated device of
+   type  for parent device . This is a
+   write-only file.
+   For example:
+   # echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001" > \
+  /sys/devices/foo/mdev_supported_types/foo-1/create
+
+What:   /sys/.../mdev_supported_types//devices/
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   This directory contains symbolic links pointing to mdev
+   devices sysfs entries which are created of this .
+
+What:   /sys/.../mdev_supported_types//available_instances
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   Reading this attribute will show the number of mediated
+   devices of type  that can be created. This is a
+   readonly file.
+Users:
+   Userspace applications interested in creating a mediated
+   device of that type. A userspace application should check
+   the number of available instances before
+   creating a mediated device of this type.
+
+What:   /sys/.../mdev_supported_types//device_api
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   Reading this attribute will show VFIO device API supported
+   by this type. For example, "vfio-pci" for a PCI device,
+   "vfio-platform" for platform device.
+
+What:   /sys/.../mdev_supported_types//name
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   Reading this attribute will show human readable name of the
+   mediated device that will get created of type .
+   This is optional attribute. For example: "Grid M60-0Q"
+Users:
+   Userspace applications interested in knowing the name of
+   a particular  that can help in understanding the
+   type of mediated device.
+
+What:   /sys/.../mdev_supported_types//description
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   Reading this attribute will show description of the type of
+   mediated device that will get created of type .
+   This is optional attribute. For example:
+   "2 heads, 512M FB, 2560x1600 maximum resolution"
+Users:
+   Userspace applications interested in knowing the details of
+   a particular  that can help in understanding the
+   features provided by that type of mediated device.
+
+What:   /sys/...///
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   This directory represents device directory of mediated
+   device. It contains all the attributes related to mediated
+   device.
+
+What:   /sys/...///mdev_type
+Date:   October 2016
+Contact:Kirti Wankhede 
+Description:
+   This is symbolic link pointing to supported type, 
+  

[PATCH v11 15/22] vfio: Introduce vfio_set_irqs_validate_and_prepare()

2016-11-04 Thread Kirti Wankhede
Vendor drivers using the mediated device framework would use the same
mechanism to validate and prepare IRQs. Introduce this function to reduce
code duplication across drivers.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Ie201f269dda0713ca18a07dc4852500bd8b48309
---
 drivers/vfio/vfio.c  | 48 
 include/linux/vfio.h |  4 
 2 files changed, 52 insertions(+)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 9a03be0942a1..ed2361e4b904 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1858,6 +1858,54 @@ int vfio_info_add_capability(struct vfio_info_cap *caps, int cap_type_id,
 }
 EXPORT_SYMBOL(vfio_info_add_capability);
 
+int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr, int num_irqs,
+  int max_irq_type, size_t *data_size)
+{
+   unsigned long minsz;
+   size_t size;
+
+   minsz = offsetofend(struct vfio_irq_set, count);
+
+   if ((hdr->argsz < minsz) || (hdr->index >= max_irq_type) ||
+   (hdr->count >= (U32_MAX - hdr->start)) ||
+   (hdr->flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
+   VFIO_IRQ_SET_ACTION_TYPE_MASK)))
+   return -EINVAL;
+
+   if (data_size)
+   *data_size = 0;
+
+   if (hdr->start >= num_irqs || hdr->start + hdr->count > num_irqs)
+   return -EINVAL;
+
+   switch (hdr->flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+   case VFIO_IRQ_SET_DATA_NONE:
+   size = 0;
+   break;
+   case VFIO_IRQ_SET_DATA_BOOL:
+   size = sizeof(uint8_t);
+   break;
+   case VFIO_IRQ_SET_DATA_EVENTFD:
+   size = sizeof(int32_t);
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (size) {
+   if (hdr->argsz - minsz < hdr->count * size)
+   return -EINVAL;
+
+   if (!data_size)
+   return -EINVAL;
+
+   *data_size = hdr->count * size;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
+
 /*
  * Pin a set of guest PFNs and return their associated host PFNs for local
  * domain only.
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index cf90393a11e2..87c9afecd822 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -116,6 +116,10 @@ extern void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset);
 extern int vfio_info_add_capability(struct vfio_info_cap *caps,
int cap_type_id, void *cap_type);
 
+extern int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr,
+ int num_irqs, int max_irq_type,
+ size_t *data_size);
+
 struct pci_dev;
 #ifdef CONFIG_EEH
 extern void vfio_spapr_pci_eeh_open(struct pci_dev *pdev);
-- 
2.7.0
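
Editorial sketch (not part of the patch): the checks in
vfio_set_irqs_validate_and_prepare() can be exercised in user space with a
stand-in header struct. The code below follows the order of checks in the
patch, under these assumptions: demo_irq_set omits the trailing data[] member
so sizeof() plays the role of offsetofend(..., count); the DEMO_* flag values
are local definitions assumed to match the UAPI bits; num_irqs and
max_irq_type are unsigned here, whereas the patch declares them int.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local flag definitions (demo names, values assumed to match the UAPI). */
#define DEMO_DATA_NONE		(1u << 0)
#define DEMO_DATA_BOOL		(1u << 1)
#define DEMO_DATA_EVENTFD	(1u << 2)
#define DEMO_ACTION_MASK	(1u << 3)
#define DEMO_ACTION_UNMASK	(1u << 4)
#define DEMO_ACTION_TRIGGER	(1u << 5)
#define DEMO_DATA_TYPE_MASK \
	(DEMO_DATA_NONE | DEMO_DATA_BOOL | DEMO_DATA_EVENTFD)
#define DEMO_ACTION_TYPE_MASK \
	(DEMO_ACTION_MASK | DEMO_ACTION_UNMASK | DEMO_ACTION_TRIGGER)

/* Fixed part of the ioctl header; the payload follows it in user memory. */
struct demo_irq_set {
	uint32_t argsz;
	uint32_t flags;
	uint32_t index;
	uint32_t start;
	uint32_t count;
};

/* Returns 0 and sets *data_size to the payload length, or -EINVAL. */
static int demo_validate_and_prepare(const struct demo_irq_set *hdr,
				     uint32_t num_irqs, uint32_t max_irq_type,
				     size_t *data_size)
{
	const size_t minsz = sizeof(*hdr);
	size_t size;

	/* Header sanity: size, IRQ type index, overflow guard, known flags. */
	if (hdr->argsz < minsz || hdr->index >= max_irq_type ||
	    hdr->count >= UINT32_MAX - hdr->start ||
	    (hdr->flags & ~(DEMO_DATA_TYPE_MASK | DEMO_ACTION_TYPE_MASK)))
		return -EINVAL;

	if (data_size)
		*data_size = 0;

	/* The requested range must lie inside the IRQs of this index. */
	if (hdr->start >= num_irqs || hdr->start + hdr->count > num_irqs)
		return -EINVAL;

	/* Exactly one data type selects the per-IRQ element size. */
	switch (hdr->flags & DEMO_DATA_TYPE_MASK) {
	case DEMO_DATA_NONE:
		size = 0;
		break;
	case DEMO_DATA_BOOL:
		size = sizeof(uint8_t);
		break;
	case DEMO_DATA_EVENTFD:
		size = sizeof(int32_t);
		break;
	default:
		return -EINVAL;
	}

	if (size) {
		/* The caller must have supplied room for count elements. */
		if (hdr->argsz - minsz < (size_t)hdr->count * size)
			return -EINVAL;
		if (!data_size)
			return -EINVAL;
		*data_size = (size_t)hdr->count * size;
	}

	return 0;
}
```

A caller then copies data_size bytes of payload only when it is non-zero,
which is the shape the vfio_pci and vfio_platform conversions in this series
use with memdup_user().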



[PATCH v11 19/22] docs: Add Documentation for Mediated devices

2016-11-04 Thread Kirti Wankhede
Add file Documentation/vfio-mediated-device.txt that includes details of
the mediated device framework.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I137dd646442936090d92008b115908b7b2c7bc5d
---
 Documentation/vfio-mediated-device.txt | 298 +
 1 file changed, 298 insertions(+)
 create mode 100644 Documentation/vfio-mediated-device.txt

diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
new file mode 100644
index ..d61e95aec961
--- /dev/null
+++ b/Documentation/vfio-mediated-device.txt
@@ -0,0 +1,298 @@
+/*
+ * VFIO Mediated devices
+ *
+ * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
+ * Author: Neo Jia 
+ * Kirti Wankhede 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+Virtual Function I/O (VFIO) Mediated devices[1]
+===
+
+The number of use cases for virtualizing DMA devices that do not have built-in
+SR-IOV capability is increasing. Previously, to virtualize such devices,
+developers had to create their own management interfaces and APIs, and then
+integrate them with user space software. To simplify integration with user space
+software, we have identified common requirements and a unified management
+interface for such devices.
+
+The VFIO driver framework provides unified APIs for direct device access. It is
+an IOMMU/device-agnostic framework for exposing direct device access to user
+space in a secure, IOMMU-protected environment. This framework is used for
+multiple devices, such as GPUs, network adapters, and compute accelerators. With
+direct device access, virtual machines or user space applications have direct
+access to the physical device. This framework is reused for mediated devices.
+
+The mediated core driver provides a common interface for mediated device
+management that can be used by drivers of different devices. This module
+provides a generic interface to perform these operations:
+
+* Create and destroy a mediated device
+* Add a mediated device to and remove it from a mediated bus driver
+* Add a mediated device to and remove it from an IOMMU group
+
+The mediated core driver also provides an interface to register a bus driver.
+For example, the mediated VFIO mdev driver is designed for mediated devices and
+supports VFIO APIs. The mediated bus driver adds a mediated device to and
+removes it from a VFIO group.
+
+The following high-level block diagram shows the main components and interfaces
+in the VFIO mediated driver framework. The diagram shows NVIDIA, Intel, and IBM
+devices as examples, as these devices are the first devices to use this module.
+
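+     +---------------+
+     |               |
+     | +-----------+ |  mdev_register_driver() +--------------+
+     | |           | +<------------------------+              |
+     | |  mdev     | |                         |              |
+     | |  bus      | +------------------------>+ vfio_mdev.ko |<-> VFIO user
+     | |  driver   | |     probe()/remove()    |              |    APIs
+     | |           | |                         +--------------+
+     | +-----------+ |
+     |               |
+     |  MDEV CORE    |
+     |   MODULE      |
+     |   mdev.ko     |
+     | +-----------+ |  mdev_register_device() +--------------+
+     | |           | +<------------------------+              |
+     | |           | |                         |  nvidia.ko   |<-> physical
+     | |           | +------------------------>+              |    device
+     | |           | |        callbacks        +--------------+
+     | | Physical  | |
+     | |  device   | |  mdev_register_device() +--------------+
+     | | interface | |<------------------------+              |
+     | |           | |                         |  i915.ko     |<-> physical
+     | |           | +------------------------>+              |    device
+     | |           | |        callbacks        +--------------+
+     | |           | |
+     | |           | |  mdev_register_device() +--------------+
+     | |           | +<------------------------+              |
+     | |           | |                         | ccw_device.ko|<-> physical
+     | |           | +------------------------>+              |    device
+     | |           | |        callbacks        +--------------+
+     | +-----------+ |
+     +---------------+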
+
+
+Registration Interfaces
+=======================
+
+The mediated core driver provides the following types of registration
+interfaces:
+
+* Registration interface for a mediated bus driver
+* Physical device driver interface
+
+Registration Interface for a Mediated Bus Driver
+------------------------------------------------
+
+The 

[PATCH v11 11/22] vfio iommu: Add blocking notifier to notify DMA_UNMAP

2016-11-04 Thread Kirti Wankhede
Add a blocking notifier to the IOMMU TYPE1 driver to notify vendor drivers
about DMA_UNMAP.
Export two APIs, vfio_register_notifier() and vfio_unregister_notifier().
A notifier must be registered before an external user calls the
vfio_pin_pages()/vfio_unpin_pages() APIs to pin/unpin pages; pinning fails
while the notifier list is empty.
The vendor driver should use the VFIO_IOMMU_NOTIFY_DMA_UNMAP action to
invalidate mappings.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I5910d0024d6be87f3e8d3e0ca0eaeaaa0b17f271
---
 drivers/vfio/vfio.c | 73 +
 drivers/vfio/vfio_iommu_type1.c | 47 --
 include/linux/vfio.h| 11 +++
 3 files changed, 121 insertions(+), 10 deletions(-)
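
The register/notify flow described in the commit message can be sketched in
user space as follows. This is only an illustration under stated assumptions:
every demo_* name is hypothetical, locking is omitted, and the kernel code
uses struct notifier_block and a blocking_notifier_head instead.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical action code standing in for VFIO_IOMMU_NOTIFY_DMA_UNMAP. */
#define DEMO_NOTIFY_DMA_UNMAP 1UL

struct demo_notifier {
	int (*call)(struct demo_notifier *nb, unsigned long action, void *data);
	struct demo_notifier *next;
};

struct demo_iommu {
	struct demo_notifier *head;	/* empty list: nobody listens for unmaps */
};

static int demo_register_notifier(struct demo_iommu *iommu,
				  struct demo_notifier *nb)
{
	if (!iommu || !nb)
		return -EINVAL;
	nb->next = iommu->head;
	iommu->head = nb;
	return 0;
}

static int demo_unregister_notifier(struct demo_iommu *iommu,
				    struct demo_notifier *nb)
{
	struct demo_notifier **pp;

	if (!iommu || !nb)
		return -EINVAL;
	for (pp = &iommu->head; *pp; pp = &(*pp)->next) {
		if (*pp == nb) {
			*pp = nb->next;
			nb->next = NULL;
			return 0;
		}
	}
	return -ENOENT;
}

/* Mirrors the "fail if notifier list is empty" check in the pin path. */
static int demo_pin_allowed(const struct demo_iommu *iommu)
{
	return iommu->head != NULL;
}

/* Called after a successful unmap; returns how many listeners ran. */
static int demo_notify_unmap(struct demo_iommu *iommu, void *unmap_info)
{
	struct demo_notifier *nb;
	int called = 0;

	for (nb = iommu->head; nb; nb = nb->next) {
		nb->call(nb, DEMO_NOTIFY_DMA_UNMAP, unmap_info);
		called++;
	}
	return called;
}

/* Example vendor callback: counts how often it was asked to invalidate. */
static int demo_invalidations;

static int demo_vendor_cb(struct demo_notifier *nb, unsigned long action,
			  void *data)
{
	(void)nb;
	(void)data;
	if (action == DEMO_NOTIFY_DMA_UNMAP)
		demo_invalidations++;
	return 0;
}
```

A vendor driver would register its callback before pinning pages and drop
its own mappings when the unmap action arrives.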

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 76d260e98930..4ed1a6a247c6 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1895,6 +1895,79 @@ err_unpin_pages:
 }
 EXPORT_SYMBOL(vfio_unpin_pages);
 
+int vfio_register_notifier(struct device *dev, struct notifier_block *nb)
+{
+   struct vfio_container *container;
+   struct vfio_group *group;
+   struct vfio_iommu_driver *driver;
+   ssize_t ret;
+
+   if (!dev || !nb)
+   return -EINVAL;
+
+   group = vfio_group_get_from_dev(dev);
+   if (IS_ERR(group))
+   return PTR_ERR(group);
+
+   ret = vfio_group_add_container_user(group);
+   if (ret)
+   goto err_register_nb;
+
+   container = group->container;
+	down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->register_notifier))
+   ret = driver->ops->register_notifier(container->iommu_data, nb);
+   else
+   ret = -EINVAL;
+
+	up_read(&container->group_lock);
+   vfio_group_try_dissolve_container(group);
+
+err_register_nb:
+   vfio_group_put(group);
+   return ret;
+}
+EXPORT_SYMBOL(vfio_register_notifier);
+
+int vfio_unregister_notifier(struct device *dev, struct notifier_block *nb)
+{
+   struct vfio_container *container;
+   struct vfio_group *group;
+   struct vfio_iommu_driver *driver;
+   ssize_t ret;
+
+   if (!dev || !nb)
+   return -EINVAL;
+
+   group = vfio_group_get_from_dev(dev);
+   if (IS_ERR(group))
+   return PTR_ERR(group);
+
+   ret = vfio_group_add_container_user(group);
+   if (ret)
+   goto err_unregister_nb;
+
+   container = group->container;
+	down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->unregister_notifier))
+   ret = driver->ops->unregister_notifier(container->iommu_data,
+  nb);
+   else
+   ret = -EINVAL;
+
+	up_read(&container->group_lock);
+   vfio_group_try_dissolve_container(group);
+
+err_unregister_nb:
+   vfio_group_put(group);
+   return ret;
+}
+EXPORT_SYMBOL(vfio_unregister_notifier);
+
 /**
  * Module/class support
  */
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index e511073446a0..c2d3a84c447b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson "
@@ -60,6 +61,7 @@ struct vfio_iommu {
 	struct vfio_domain	*external_domain; /* domain for external user */
 	struct mutex		lock;
 	struct rb_root		dma_list;
+	struct blocking_notifier_head notifier;
 	bool			v2;
 	bool			nesting;
 };
@@ -550,7 +552,8 @@ static int vfio_iommu_type1_pin_pages(void *iommu_data,
 
 	mutex_lock(&iommu->lock);
 
-   if (!iommu->external_domain) {
+   /* Fail if notifier list is empty */
+   if ((!iommu->external_domain) || (!iommu->notifier.head)) {
ret = -EINVAL;
goto pin_done;
}
@@ -867,6 +870,11 @@ unlock:
/* Report how much was unmapped */
unmap->size = unmapped;
 
+	if (unmapped && iommu->external_domain)
+		blocking_notifier_call_chain(&iommu->notifier,
+					     VFIO_IOMMU_NOTIFY_DMA_UNMAP,
+					     unmap);
+
return ret;
 }
 
@@ -1474,6 +1482,7 @@ static void *vfio_iommu_type1_open(unsigned long arg)
 	INIT_LIST_HEAD(&iommu->addr_space_list);
 	iommu->dma_list = RB_ROOT;
 	mutex_init(&iommu->lock);
+	BLOCKING_INIT_NOTIFIER_HEAD(&iommu->notifier);
 
return iommu;
 }
@@ -1610,16 +1619,34 @@ static long vfio_iommu_type1_ioctl(void *iommu_data,
return -ENOTTY;
 }
 
+static int vfio_iommu_type1_register_notifier(void *iommu_data,
+ struct notifier_block *nb)
+{
+   struct 

[PATCH v11 06/22] vfio iommu type1: Update arguments of vfio_lock_acct

2016-11-04 Thread Kirti Wankhede
Update vfio_lock_acct() to take the mm structure as an input argument.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I5d3673cc9d3786bb436b395d5f74537f1a36da80
---
 drivers/vfio/vfio_iommu_type1.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 7fb87f008e0a..02b302d0b7de 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -150,17 +150,16 @@ static void vfio_lock_acct_bg(struct work_struct *work)
kfree(vwork);
 }
 
-static void vfio_lock_acct(long npage)
+static void vfio_lock_acct(struct mm_struct *mm, long npage)
 {
struct vwork *vwork;
-   struct mm_struct *mm;
 
-   if (!current->mm || !npage)
+   if (!mm || !npage)
return; /* process exited or nothing to do */
 
-	if (down_write_trylock(&current->mm->mmap_sem)) {
-		current->mm->locked_vm += npage;
-		up_write(&current->mm->mmap_sem);
+	if (down_write_trylock(&mm->mmap_sem)) {
+		mm->locked_vm += npage;
+		up_write(&mm->mmap_sem);
return;
}
 
@@ -172,8 +171,7 @@ static void vfio_lock_acct(long npage)
vwork = kmalloc(sizeof(struct vwork), GFP_KERNEL);
if (!vwork)
return;
-   mm = get_task_mm(current);
-   if (!mm) {
+   if (!mmget_not_zero(mm)) {
kfree(vwork);
return;
}
@@ -285,7 +283,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
 
if (unlikely(disable_hugepages)) {
if (!rsvd)
-   vfio_lock_acct(1);
+   vfio_lock_acct(current->mm, 1);
return 1;
}
 
@@ -313,7 +311,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
}
 
if (!rsvd)
-   vfio_lock_acct(i);
+   vfio_lock_acct(current->mm, i);
 
return i;
 }
@@ -328,7 +326,7 @@ static long __vfio_unpin_pages_remote(unsigned long pfn, long npage,
unlocked += put_pfn(pfn++, prot);
 
if (do_accounting)
-   vfio_lock_acct(-unlocked);
+   vfio_lock_acct(current->mm, -unlocked);
 
return unlocked;
 }
@@ -390,7 +388,7 @@ static void vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma)
cond_resched();
}
 
-   vfio_lock_acct(-unlocked);
+   vfio_lock_acct(current->mm, -unlocked);
 }
 
 static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
-- 
2.7.0



[PATCH v11 14/22] vfio_pci: Update vfio_pci to use vfio_info_add_capability()

2016-11-04 Thread Kirti Wankhede
Update msix_sparse_mmap_cap() to use vfio_info_add_capability()
Update region type capability to use vfio_info_add_capability()

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I52bb28c7875a6da5a79ddad1843e6088aff58a45
---
 drivers/vfio/pci/vfio_pci.c | 49 ++---
 1 file changed, 19 insertions(+), 30 deletions(-)
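
For context, a capability chain is a growing buffer in which each entry
starts with a header whose next field holds the offset of the following
entry. The user-space sketch below illustrates that idea only. It is not
the implementation of vfio_info_add_capability(), which this patch merely
calls; all demo_* names and the DEMO_CAP_TYPE value are assumptions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_CAP_TYPE 2	/* hypothetical capability id */

struct demo_cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;		/* offset of the next entry, 0 ends the chain */
};

struct demo_cap_type {
	struct demo_cap_header header;
	uint32_t type;
	uint32_t subtype;
};

struct demo_caps {
	unsigned char *buf;
	size_t size;
};

/* Grow the buffer by size bytes and link the new entry at the chain tail. */
static struct demo_cap_header *demo_cap_add(struct demo_caps *caps, size_t size,
					    uint16_t id, uint16_t version)
{
	unsigned char *buf = realloc(caps->buf, caps->size + size);
	struct demo_cap_header *header, *tmp;

	if (!buf) {
		free(caps->buf);
		caps->buf = NULL;
		caps->size = 0;
		return NULL;
	}
	caps->buf = buf;
	header = (struct demo_cap_header *)(void *)(buf + caps->size);
	memset(header, 0, size);
	header->id = id;
	header->version = version;

	/* Walk to the last entry; for the first entry this is a no-op link. */
	for (tmp = (struct demo_cap_header *)(void *)buf; tmp->next;
	     tmp = (struct demo_cap_header *)(void *)(buf + tmp->next))
		;
	tmp->next = (uint32_t)caps->size;
	caps->size += size;
	return header;
}

/* Caller fills a local struct; the helper copies the payload into the chain. */
static int demo_add_type_cap(struct demo_caps *caps,
			     const struct demo_cap_type *src)
{
	struct demo_cap_header *header;
	struct demo_cap_type *dst;

	header = demo_cap_add(caps, sizeof(*src), DEMO_CAP_TYPE, 1);
	if (!header)
		return -ENOMEM;
	dst = (struct demo_cap_type *)(void *)header;
	dst->type = src->type;
	dst->subtype = src->subtype;
	return 0;
}
```

The point of the helper shape is the one this patch relies on: the caller
prepares the payload, and a single function owns allocation and linking.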

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 031bc08d000d..03b5434f4d5b 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -558,10 +558,9 @@ static int vfio_pci_for_each_slot_or_bus(struct pci_dev *pdev,
 static int msix_sparse_mmap_cap(struct vfio_pci_device *vdev,
struct vfio_info_cap *caps)
 {
-   struct vfio_info_cap_header *header;
struct vfio_region_info_cap_sparse_mmap *sparse;
size_t end, size;
-   int nr_areas = 2, i = 0;
+   int nr_areas = 2, i = 0, ret;
 
end = pci_resource_len(vdev->pdev, vdev->msix_bar);
 
@@ -572,13 +571,10 @@ static int msix_sparse_mmap_cap(struct vfio_pci_device *vdev,
 
size = sizeof(*sparse) + (nr_areas * sizeof(*sparse->areas));
 
-   header = vfio_info_cap_add(caps, size,
-  VFIO_REGION_INFO_CAP_SPARSE_MMAP, 1);
-   if (IS_ERR(header))
-   return PTR_ERR(header);
+   sparse = kzalloc(size, GFP_KERNEL);
+   if (!sparse)
+   return -ENOMEM;
 
-   sparse = container_of(header,
- struct vfio_region_info_cap_sparse_mmap, header);
sparse->nr_areas = nr_areas;
 
if (vdev->msix_offset & PAGE_MASK) {
@@ -594,26 +590,11 @@ static int msix_sparse_mmap_cap(struct vfio_pci_device *vdev,
i++;
}
 
-   return 0;
-}
-
-static int region_type_cap(struct vfio_pci_device *vdev,
-  struct vfio_info_cap *caps,
-  unsigned int type, unsigned int subtype)
-{
-   struct vfio_info_cap_header *header;
-   struct vfio_region_info_cap_type *cap;
-
-   header = vfio_info_cap_add(caps, sizeof(*cap),
-  VFIO_REGION_INFO_CAP_TYPE, 1);
-   if (IS_ERR(header))
-   return PTR_ERR(header);
-
-   cap = container_of(header, struct vfio_region_info_cap_type, header);
-   cap->type = type;
-   cap->subtype = subtype;
+   ret = vfio_info_add_capability(caps, VFIO_REGION_INFO_CAP_SPARSE_MMAP,
+  sparse);
+   kfree(sparse);
 
-   return 0;
+   return ret;
 }
 
 int vfio_pci_register_dev_region(struct vfio_pci_device *vdev,
@@ -752,6 +733,9 @@ static long vfio_pci_ioctl(void *device_data,
 
break;
default:
+   {
+   struct vfio_region_info_cap_type cap_type;
+
if (info.index >=
VFIO_PCI_NUM_REGIONS + vdev->num_regions)
return -EINVAL;
@@ -762,11 +746,16 @@ static long vfio_pci_ioctl(void *device_data,
info.size = vdev->region[i].size;
info.flags = vdev->region[i].flags;
 
-			ret = region_type_cap(vdev, &caps,
-					      vdev->region[i].type,
-					      vdev->region[i].subtype);
+			cap_type.type = vdev->region[i].type;
+			cap_type.subtype = vdev->region[i].subtype;
+
+			ret = vfio_info_add_capability(&caps,
+						      VFIO_REGION_INFO_CAP_TYPE,
+						      &cap_type);
if (ret)
return ret;
+
+   }
}
 
if (caps.size) {
-- 
2.7.0



[PATCH v11 15/22] vfio: Introduce vfio_set_irqs_validate_and_prepare()

2016-11-04 Thread Kirti Wankhede
Vendor drivers using the mediated device framework would use the same
mechanism to validate and prepare IRQs. Introduce this function to reduce
code duplication across drivers.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Ie201f269dda0713ca18a07dc4852500bd8b48309
---
 drivers/vfio/vfio.c  | 48 
 include/linux/vfio.h |  4 
 2 files changed, 52 insertions(+)
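
The checks performed by the new helper can be tried out in user space with
the sketch below. It is a rough port under stated assumptions: the demo_*
and DEMO_* names are hypothetical stand-ins, the flag values are defined
locally rather than taken from the uapi header, and num_irqs and
max_irq_type are unsigned here where the patch uses int.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the VFIO_IRQ_SET_* flag bits. */
#define DEMO_DATA_NONE		(1u << 0)
#define DEMO_DATA_BOOL		(1u << 1)
#define DEMO_DATA_EVENTFD	(1u << 2)
#define DEMO_ACTION_MASK	(1u << 3)
#define DEMO_ACTION_UNMASK	(1u << 4)
#define DEMO_ACTION_TRIGGER	(1u << 5)
#define DEMO_DATA_TYPE_MASK \
	(DEMO_DATA_NONE | DEMO_DATA_BOOL | DEMO_DATA_EVENTFD)
#define DEMO_ACTION_TYPE_MASK \
	(DEMO_ACTION_MASK | DEMO_ACTION_UNMASK | DEMO_ACTION_TRIGGER)

/* Fixed part of the request; the per-IRQ data would follow it. */
struct demo_irq_set {
	uint32_t argsz;
	uint32_t flags;
	uint32_t index;
	uint32_t start;
	uint32_t count;
};

static int demo_validate_irq_set(const struct demo_irq_set *hdr,
				 uint32_t num_irqs, uint32_t max_irq_type,
				 size_t *data_size)
{
	const size_t minsz = sizeof(*hdr);	/* header ends at count */
	size_t elem;

	/* Header sanity: size, IRQ index, range overflow, unknown flags. */
	if (hdr->argsz < minsz || hdr->index >= max_irq_type ||
	    hdr->count >= UINT32_MAX - hdr->start ||
	    (hdr->flags & ~(DEMO_DATA_TYPE_MASK | DEMO_ACTION_TYPE_MASK)))
		return -EINVAL;

	if (data_size)
		*data_size = 0;

	/* The requested sub-range must fit inside the device's IRQs. */
	if (hdr->start >= num_irqs || hdr->start + hdr->count > num_irqs)
		return -EINVAL;

	/* Exactly one data type must be selected. */
	switch (hdr->flags & DEMO_DATA_TYPE_MASK) {
	case DEMO_DATA_NONE:
		elem = 0;
		break;
	case DEMO_DATA_BOOL:
		elem = sizeof(uint8_t);
		break;
	case DEMO_DATA_EVENTFD:
		elem = sizeof(int32_t);
		break;
	default:
		return -EINVAL;
	}

	if (elem) {
		/* The caller must have supplied count elements of payload. */
		if (hdr->argsz - minsz < (size_t)hdr->count * elem)
			return -EINVAL;
		if (!data_size)
			return -EINVAL;
		*data_size = (size_t)hdr->count * elem;
	}
	return 0;
}
```

A driver's ioctl handler would call such a helper first, then copy exactly
data_size bytes of payload from user space.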

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 9a03be0942a1..ed2361e4b904 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1858,6 +1858,54 @@ int vfio_info_add_capability(struct vfio_info_cap *caps, int cap_type_id,
 }
 EXPORT_SYMBOL(vfio_info_add_capability);
 
+int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr, int num_irqs,
+  int max_irq_type, size_t *data_size)
+{
+   unsigned long minsz;
+   size_t size;
+
+   minsz = offsetofend(struct vfio_irq_set, count);
+
+   if ((hdr->argsz < minsz) || (hdr->index >= max_irq_type) ||
+   (hdr->count >= (U32_MAX - hdr->start)) ||
+   (hdr->flags & ~(VFIO_IRQ_SET_DATA_TYPE_MASK |
+   VFIO_IRQ_SET_ACTION_TYPE_MASK)))
+   return -EINVAL;
+
+   if (data_size)
+   *data_size = 0;
+
+   if (hdr->start >= num_irqs || hdr->start + hdr->count > num_irqs)
+   return -EINVAL;
+
+   switch (hdr->flags & VFIO_IRQ_SET_DATA_TYPE_MASK) {
+   case VFIO_IRQ_SET_DATA_NONE:
+   size = 0;
+   break;
+   case VFIO_IRQ_SET_DATA_BOOL:
+   size = sizeof(uint8_t);
+   break;
+   case VFIO_IRQ_SET_DATA_EVENTFD:
+   size = sizeof(int32_t);
+   break;
+   default:
+   return -EINVAL;
+   }
+
+   if (size) {
+   if (hdr->argsz - minsz < hdr->count * size)
+   return -EINVAL;
+
+   if (!data_size)
+   return -EINVAL;
+
+   *data_size = hdr->count * size;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL(vfio_set_irqs_validate_and_prepare);
+
 /*
  * Pin a set of guest PFNs and return their associated host PFNs for local
  * domain only.
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index cf90393a11e2..87c9afecd822 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -116,6 +116,10 @@ extern void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset);
 extern int vfio_info_add_capability(struct vfio_info_cap *caps,
int cap_type_id, void *cap_type);
 
+extern int vfio_set_irqs_validate_and_prepare(struct vfio_irq_set *hdr,
+ int num_irqs, int max_irq_type,
+ size_t *data_size);
+
 struct pci_dev;
 #ifdef CONFIG_EEH
 extern void vfio_spapr_pci_eeh_open(struct pci_dev *pdev);
-- 
2.7.0



[PATCH v11 09/22] vfio iommu type1: Add task structure to vfio_dma

2016-11-04 Thread Kirti Wankhede
Add a task structure to vfio_dma.
Add an address space structure. Each vfio_dma structure points to the
address space of the task that mapped it.
A list of address spaces is maintained in the vfio_iommu structure.
On a DMA_MAP call, if the address space already exists in the address space
list, vfio_dma points to it. If the address space doesn't exist, allocate
one, save the mm pointer in it, and point vfio_dma to it.
Two tasks can share the same address space, so the address space structure
is kept separate from the task in the vfio_dma structure. vfio_dma keeps a
pointer to its corresponding address space.
During DMA_UNMAP, the task that mapped the range, or another task that
shares the same address space, is allowed to unmap it; otherwise the unmap
fails.
QEMU maps a few iova ranges initially, then forks threads, and a child
thread calls DMA_UNMAP on a previously mapped iova. Since the child shares
the same address space, DMA_UNMAP succeeds.
This address space structure is used to track pages pinned by external
users in later changes.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I7600f1bea6b384fd589fa72421ccf031bcfd9ac5
---
 drivers/vfio/vfio_iommu_type1.c | 182 +---
 1 file changed, 134 insertions(+), 48 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 422c8d198abb..8d64528dcc22 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -55,12 +55,20 @@ MODULE_PARM_DESC(disable_hugepages,
 
 struct vfio_iommu {
 	struct list_head	domain_list;
+	struct list_head	addr_space_list;
 	struct mutex		lock;
 	struct rb_root		dma_list;
 	bool			v2;
 	bool			nesting;
 };
 
+/* address space */
+struct vfio_addr_space {
+	struct mm_struct	*mm;
+	struct list_head	next;
+	atomic_t		ref_count;
+};
+
 struct vfio_domain {
struct iommu_domain *domain;
 	struct list_head	next;
@@ -75,6 +83,9 @@ struct vfio_dma {
 	unsigned long		vaddr;		/* Process virtual addr */
 	size_t			size;		/* Map size (bytes) */
 	int			prot;		/* IOMMU_READ/WRITE */
+	struct vfio_addr_space	*addr_space;
+	struct task_struct	*task;
+	bool			mlock_cap;
 };
 
 struct vfio_group {
@@ -130,6 +141,18 @@ static void vfio_unlink_dma(struct vfio_iommu *iommu, struct vfio_dma *old)
 	rb_erase(&old->node, &iommu->dma_list);
 }
 
+static struct vfio_addr_space *vfio_find_addr_space(struct vfio_iommu *iommu,
+   struct mm_struct *mm)
+{
+   struct vfio_addr_space *as;
+
+	list_for_each_entry(as, &iommu->addr_space_list, next) {
+   if (as->mm == mm)
+   return as;
+   }
+   return NULL;
+}
+
 struct vwork {
 	struct mm_struct	*mm;
 	long			npage;
@@ -273,24 +296,24 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
  * the iommu can only map chunks of consecutive pfns anyway, so get the
  * first page and all consecutive pages with the same locking.
  */
-static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
-   int prot, unsigned long *pfn_base)
+static long __vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
+   long npage, int prot,
+   unsigned long *pfn_base)
 {
-   unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-   bool lock_cap = capable(CAP_IPC_LOCK);
+   struct task_struct *task = dma->task;
+   unsigned long limit = task_rlimit(task, RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+   bool lock_cap = dma->mlock_cap;
+   struct mm_struct *mm = dma->addr_space->mm;
long ret, i;
bool rsvd;
 
-   if (!current->mm)
-   return -ENODEV;
-
-   ret = vaddr_get_pfn(current->mm, vaddr, prot, pfn_base);
+   ret = vaddr_get_pfn(mm, vaddr, prot, pfn_base);
if (ret)
return ret;
 
rsvd = is_invalid_reserved_pfn(*pfn_base);
 
-   if (!rsvd && !lock_cap && current->mm->locked_vm + 1 > limit) {
+   if (!rsvd && !lock_cap && mm->locked_vm + 1 > limit) {
put_pfn(*pfn_base, prot);
pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n", __func__,
limit << PAGE_SHIFT);
@@ -299,7 +322,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
 
if (unlikely(disable_hugepages)) {
if (!rsvd)
-   vfio_lock_acct(current->mm, 1);
+   vfio_lock_acct(mm, 1);
return 1;
}
 
@@ -307,7 +330,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, 
long 

[PATCH v11 18/22] vfio: Define device_api strings

2016-11-04 Thread Kirti Wankhede
Define device API strings. A vendor driver using the mediated device
framework should use the corresponding string for its device_api attribute.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I42d29f475f02a7132ce13297fbf2b48f1da10995
---
 include/uapi/linux/vfio.h | 10 ++
 1 file changed, 10 insertions(+)
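
One way a vendor driver might pick the string from its device flags is
sketched below. This is a hypothetical helper, not part of the patch: the
demo_device_api() name is invented, and the DEMO_FLAGS_* bit values are
local stand-ins assumed to correspond to the PCI, platform, and AMBA device
flags in struct vfio_device_info.

```c
#include <stddef.h>

/* Assumed stand-ins for the VFIO_DEVICE_FLAGS_* bits. */
#define DEMO_FLAGS_PCI		(1u << 1)
#define DEMO_FLAGS_PLATFORM	(1u << 2)
#define DEMO_FLAGS_AMBA		(1u << 3)

/* The three strings defined by this patch. */
#define VFIO_DEVICE_API_PCI_STRING	"vfio-pci"
#define VFIO_DEVICE_API_PLATFORM_STRING	"vfio-platform"
#define VFIO_DEVICE_API_AMBA_STRING	"vfio-amba"

/* Return the device_api string for the given flags, or NULL if none fits. */
static const char *demo_device_api(unsigned int flags)
{
	if (flags & DEMO_FLAGS_PCI)
		return VFIO_DEVICE_API_PCI_STRING;
	if (flags & DEMO_FLAGS_PLATFORM)
		return VFIO_DEVICE_API_PLATFORM_STRING;
	if (flags & DEMO_FLAGS_AMBA)
		return VFIO_DEVICE_API_AMBA_STRING;
	return NULL;
}
```

A driver's device_api show callback could then print the returned string
into the sysfs buffer.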

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 255a2113f53c..519eff362c1c 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -203,6 +203,16 @@ struct vfio_device_info {
 };
 #define VFIO_DEVICE_GET_INFO   _IO(VFIO_TYPE, VFIO_BASE + 7)
 
+/*
+ * Vendor driver using Mediated device framework should provide device_api
+ * attribute in supported type attribute groups. Device API string should be one
+ * of the following corresponding to device flags in vfio_device_info structure.
+ */
+
+#define VFIO_DEVICE_API_PCI_STRING		"vfio-pci"
+#define VFIO_DEVICE_API_PLATFORM_STRING		"vfio-platform"
+#define VFIO_DEVICE_API_AMBA_STRING		"vfio-amba"
+
 /**
  * VFIO_DEVICE_GET_REGION_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 8,
  *struct vfio_region_info)
-- 
2.7.0



[PATCH v11 09/22] vfio iommu type1: Add task structure to vfio_dma

2016-11-04 Thread Kirti Wankhede
Add task structure to vfio_dma.
Add an address space structure. Each vfio_dma structure points to the
address space of the task that mapped it.
A list of address spaces is maintained in the vfio_iommu structure.
On a DMA_MAP call, if the address space already exists in the address
space list, vfio_dma points to it. If it doesn't exist, allocate an
address space, save the mm pointer in it, and point vfio_dma to it.
Two tasks can share the same address space, so the address space
structure is kept separate from the task in the vfio_dma structure.
vfio_dma keeps a pointer to its corresponding address space.
During DMA_UNMAP, the task that mapped the range, or another task that
shares the same address space, is allowed to unmap; otherwise unmap fails.
QEMU maps a few iova ranges initially, then forks threads, and a child
thread calls DMA_UNMAP on a previously mapped iova. Since the child shares
the same address space, DMA_UNMAP succeeds.
This address space structure is used to track pages pinned by external
users in later changes.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I7600f1bea6b384fd589fa72421ccf031bcfd9ac5
---
 drivers/vfio/vfio_iommu_type1.c | 182 +---
 1 file changed, 134 insertions(+), 48 deletions(-)
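
The lookup-or-create behaviour and the unmap permission rule described in
the commit message can be modelled in plain C. This is only a sketch under
stated assumptions: the demo_* types are simplified stand-ins for
mm_struct, task_struct, vfio_addr_space and vfio_dma, a singly linked list
replaces the kernel list, and a plain int replaces the atomic reference
count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_mm { int id; };			/* stand-in for mm_struct */
struct demo_task { struct demo_mm *mm; };	/* stand-in for task_struct */

struct demo_addr_space {
	struct demo_mm *mm;
	int ref_count;
	struct demo_addr_space *next;
};

struct demo_dma {
	struct demo_addr_space *addr_space;
	struct demo_task *task;
};

/*
 * Return the address space tracking mm, taking a reference. If mm is not
 * in the list yet, allocate an entry and link it at the head.
 */
struct demo_addr_space *demo_get_addr_space(struct demo_addr_space **list,
					    struct demo_mm *mm)
{
	struct demo_addr_space *as;

	assert(list && mm);
	for (as = *list; as; as = as->next) {
		if (as->mm == mm) {
			as->ref_count++;
			return as;
		}
	}

	as = calloc(1, sizeof(*as));
	if (!as)
		return NULL;
	as->mm = mm;
	as->ref_count = 1;
	as->next = *list;
	*list = as;
	return as;
}

/* Unmap is allowed for the mapping task or any task sharing its mm. */
bool demo_unmap_allowed(const struct demo_dma *dma,
			const struct demo_task *caller)
{
	return dma->task == caller || dma->addr_space->mm == caller->mm;
}
```

In this model a thread forked after DMA_MAP passes the check because it
shares the mapper's mm, which matches the QEMU case described above.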

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 422c8d198abb..8d64528dcc22 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -55,12 +55,20 @@ MODULE_PARM_DESC(disable_hugepages,
 
 struct vfio_iommu {
    struct list_head    domain_list;
+   struct list_head    addr_space_list;
    struct mutex        lock;
    struct rb_root      dma_list;
    bool                v2;
    bool                nesting;
 };
 
+/* address space */
+struct vfio_addr_space {
+   struct mm_struct    *mm;
+   struct list_head    next;
+   atomic_t            ref_count;
+};
+
 struct vfio_domain {
    struct iommu_domain *domain;
    struct list_head    next;
@@ -75,6 +83,9 @@ struct vfio_dma {
    unsigned long       vaddr;  /* Process virtual addr */
    size_t              size;   /* Map size (bytes) */
    int                 prot;   /* IOMMU_READ/WRITE */
+   struct vfio_addr_space  *addr_space;
+   struct task_struct  *task;
+   bool                mlock_cap;
 };
 
 struct vfio_group {
@@ -130,6 +141,18 @@ static void vfio_unlink_dma(struct vfio_iommu *iommu, struct vfio_dma *old)
    rb_erase(&old->node, &iommu->dma_list);
 }
 
+static struct vfio_addr_space *vfio_find_addr_space(struct vfio_iommu *iommu,
+                                                    struct mm_struct *mm)
+{
+   struct vfio_addr_space *as;
+
+   list_for_each_entry(as, &iommu->addr_space_list, next) {
+       if (as->mm == mm)
+           return as;
+   }
+   return NULL;
+}
+
 struct vwork {
    struct mm_struct    *mm;
    long                npage;
@@ -273,24 +296,24 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
  * the iommu can only map chunks of consecutive pfns anyway, so get the
  * first page and all consecutive pages with the same locking.
  */
-static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
-   int prot, unsigned long *pfn_base)
+static long __vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
+   long npage, int prot,
+   unsigned long *pfn_base)
 {
-   unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
-   bool lock_cap = capable(CAP_IPC_LOCK);
+   struct task_struct *task = dma->task;
+   unsigned long limit = task_rlimit(task, RLIMIT_MEMLOCK) >> PAGE_SHIFT;
+   bool lock_cap = dma->mlock_cap;
+   struct mm_struct *mm = dma->addr_space->mm;
long ret, i;
bool rsvd;
 
-   if (!current->mm)
-   return -ENODEV;
-
-   ret = vaddr_get_pfn(current->mm, vaddr, prot, pfn_base);
+   ret = vaddr_get_pfn(mm, vaddr, prot, pfn_base);
if (ret)
return ret;
 
rsvd = is_invalid_reserved_pfn(*pfn_base);
 
-   if (!rsvd && !lock_cap && current->mm->locked_vm + 1 > limit) {
+   if (!rsvd && !lock_cap && mm->locked_vm + 1 > limit) {
put_pfn(*pfn_base, prot);
pr_warn("%s: RLIMIT_MEMLOCK (%ld) exceeded\n", __func__,
limit << PAGE_SHIFT);
@@ -299,7 +322,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
 
if (unlikely(disable_hugepages)) {
if (!rsvd)
-   vfio_lock_acct(current->mm, 1);
+   vfio_lock_acct(mm, 1);
return 1;
}
 
@@ -307,7 +330,7 @@ static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
for (i = 1, vaddr += 

[PATCH v11 22/22] MAINTAINERS: Add entry VFIO based Mediated device drivers

2016-11-04 Thread Kirti Wankhede
Add myself as a maintainer of the mediated device framework,
a submodule of VFIO.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I58f6717783e0d4008ca31f4a5c4494696bae8571
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f30b8ea700fd..a3165b6407a5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12729,6 +12729,15 @@ F: drivers/vfio/
 F: include/linux/vfio.h
 F: include/uapi/linux/vfio.h
 
+VFIO MEDIATED DEVICE DRIVERS
+M: Kirti Wankhede 
+L: k...@vger.kernel.org
+S: Maintained
+F: Documentation/vfio-mediated-device.txt
+F: drivers/vfio/mdev/
+F: include/linux/mdev.h
+F: samples/vfio-mdev/
+
 VFIO PLATFORM DRIVER
 M: Baptiste Reynal 
 L: k...@vger.kernel.org
-- 
2.7.0



[PATCH v11 10/22] vfio iommu type1: Add support for mediated devices

2016-11-04 Thread Kirti Wankhede
VFIO IOMMU drivers are designed for devices which are IOMMU capable.
A mediated device only uses IOMMU APIs; the underlying hardware can be
managed by an IOMMU domain.

The aims of this change are:
- To use most of the code of the TYPE1 IOMMU driver for mediated devices
- To support direct assigned devices and mediated devices in a single module

This change adds pin and unpin support for mediated devices to the TYPE1
IOMMU backend module. More details:
- The vfio_pin_pages() callback here uses the task and address space of
  vfio_dma, that is, of the process that mapped that iova range.
- Added pfn_list tracking logic to the address space structure. All pages
  pinned through this interface are tracked in its address space.
- The pinned pages list is used to verify unpinning requests and to unpin
  remaining pages while detaching the group for that device.
- Page accounting is updated to account in the address space where the
  pages are pinned/unpinned.
- Accounting for an mdev device is only done if there is no iommu capable
  domain in the container. When there is a direct device assigned to the
  container and that domain is iommu capable, all pages are already pinned
  during DMA_MAP.
- Page accounting is updated on hot plug and unplug of mdev devices and
  pass through devices.

Tested by assigning the below combinations of devices to a single VM:
- GPU pass through only
- vGPU device only
- One GPU pass through and one vGPU device
- Linux VM hot plug and unplug of a vGPU device while a GPU pass through
  device exists
- Linux VM hot plug and unplug of a GPU pass through device while a vGPU
  device exists

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I295d6f0f2e0579b8d9882bfd8fd5a4194b97bd9a
---
 drivers/vfio/vfio_iommu_type1.c | 538 +---
 1 file changed, 500 insertions(+), 38 deletions(-)
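
The pfn_list bookkeeping described above (reference-counted tracking of
pinned host pfns, used to validate unpin requests) can be sketched in
userspace C. Assumptions: a plain unbalanced binary search tree stands in
for the kernel rb-tree, locking is omitted, and a node whose count drops to
zero is left in the tree instead of being erased and freed as the driver
does. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_pfn {
	unsigned long pfn;		/* host pfn */
	int ref_count;			/* number of outstanding pins */
	struct demo_pfn *left, *right;
};

/* Same descent as vfio_find_pfn(): go left or right by comparing pfn. */
struct demo_pfn *demo_find_pfn(struct demo_pfn *root, unsigned long pfn)
{
	while (root) {
		if (pfn < root->pfn)
			root = root->left;
		else if (pfn > root->pfn)
			root = root->right;
		else
			return root;
	}
	return NULL;
}

/* Pin: take a reference on an existing node, or link a new one. */
int demo_pin_pfn(struct demo_pfn **root, unsigned long pfn)
{
	struct demo_pfn **link = root;
	struct demo_pfn *node;

	assert(root);
	while (*link) {
		if (pfn < (*link)->pfn) {
			link = &(*link)->left;
		} else if (pfn > (*link)->pfn) {
			link = &(*link)->right;
		} else {
			(*link)->ref_count++;
			return 0;
		}
	}

	node = calloc(1, sizeof(*node));
	if (!node)
		return -ENOMEM;
	node->pfn = pfn;
	node->ref_count = 1;
	*link = node;
	return 0;
}

/*
 * Unpin: reject pfns that are not currently pinned, otherwise drop one
 * reference and return the number of pins that remain.
 */
int demo_unpin_pfn(struct demo_pfn *root, unsigned long pfn)
{
	struct demo_pfn *node = demo_find_pfn(root, pfn);

	if (!node || node->ref_count == 0)
		return -EINVAL;
	return --node->ref_count;
}
```

Keying the tree by host pfn is what lets an unpin request be checked
against what was actually pinned through this interface.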

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 8d64528dcc22..e511073446a0 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define DRIVER_VERSION  "0.2"
 #define DRIVER_AUTHOR   "Alex Williamson "
@@ -56,6 +57,7 @@ MODULE_PARM_DESC(disable_hugepages,
 struct vfio_iommu {
    struct list_head    domain_list;
    struct list_head    addr_space_list;
+   struct vfio_domain  *external_domain; /* domain for external user */
    struct mutex        lock;
    struct rb_root      dma_list;
    bool                v2;
@@ -67,6 +69,9 @@ struct vfio_addr_space {
    struct mm_struct    *mm;
    struct list_head    next;
    atomic_t            ref_count;
+   /* external user pinned pfns */
+   struct rb_root      pfn_list;       /* pinned Host pfn list */
+   struct mutex        pfn_list_lock;  /* mutex for pfn_list */
 };
 
 struct vfio_domain {
@@ -83,6 +88,7 @@ struct vfio_dma {
    unsigned long       vaddr;  /* Process virtual addr */
    size_t              size;   /* Map size (bytes) */
    int                 prot;   /* IOMMU_READ/WRITE */
+   bool                iommu_mapped;
    struct vfio_addr_space  *addr_space;
    struct task_struct  *task;
    bool                mlock_cap;
@@ -94,6 +100,19 @@ struct vfio_group {
 };
 
 /*
+ * Guest RAM pinning working set or DMA target
+ */
+struct vfio_pfn {
+   struct rb_node      node;
+   unsigned long       pfn;        /* Host pfn */
+   int                 prot;
+   atomic_t            ref_count;
+};
+
+#define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)    \
+   (!list_empty(&iommu->domain_list))
+
+/*
  * This code handles mapping and unmapping of user data buffers
  * into DMA'ble space using the IOMMU
  */
@@ -153,6 +172,93 @@ static struct vfio_addr_space *vfio_find_addr_space(struct vfio_iommu *iommu,
return NULL;
 }
 
+/*
+ * Helper Functions for host pfn list
+ */
+static struct vfio_pfn *vfio_find_pfn(struct vfio_addr_space *addr_space,
+ unsigned long pfn)
+{
+   struct vfio_pfn *vpfn;
+   struct rb_node *node = addr_space->pfn_list.rb_node;
+
+   while (node) {
+   vpfn = rb_entry(node, struct vfio_pfn, node);
+
+   if (pfn < vpfn->pfn)
+   node = node->rb_left;
+   else if (pfn > vpfn->pfn)
+   node = node->rb_right;
+   else
+   return vpfn;
+   }
+
+   return NULL;
+}
+
+static void vfio_link_pfn(struct vfio_addr_space *addr_space,
+ struct vfio_pfn *new)
+{
+   struct rb_node **link, *parent = NULL;
+   struct vfio_pfn *vpfn;
+
+   link = &addr_space->pfn_list.rb_node;
+   while (*link) {
+       parent = *link;
+       vpfn = rb_entry(parent, struct vfio_pfn, node);

[PATCH v11 05/22] vfio iommu: Added pin and unpin callback functions to vfio_iommu_driver_ops

2016-11-04 Thread Kirti Wankhede
Added two new callback functions to struct vfio_iommu_driver_ops. A backend
IOMMU module that supports pinning and unpinning pages for mdev devices
should provide these functions.
Added APIs for pinning and unpinning pages to the VFIO module. These call
back into the backend iommu module to actually pin and unpin pages.
Renamed static functions in vfio_iommu_type1.c to resolve conflicts.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Ia7417723aaae86bec2959ad9ae6c2915ddd340e0
---
 drivers/vfio/vfio.c | 96 +
 drivers/vfio/vfio_iommu_type1.c | 20 -
 include/linux/vfio.h| 14 +-
 3 files changed, 119 insertions(+), 11 deletions(-)
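
The dispatch pattern used by the new exported wrappers (validate arguments,
then call the backend's callback if it provides one, else return -EINVAL)
can be sketched in userspace C. This is a simplified model under stated
assumptions: struct demo_iommu_ops is a hypothetical stand-in for
vfio_iommu_driver_ops, group and container lookup and locking are left
out, and demo_backend_pin() is a toy backend that only adds an offset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the two callbacks added by this patch. */
struct demo_iommu_ops {
	int (*pin_pages)(void *iommu_data, unsigned long *user_pfn,
			 int npage, int prot, unsigned long *phys_pfn);
	int (*unpin_pages)(void *iommu_data, unsigned long *user_pfn,
			   unsigned long *pfn, int npage);
};

/* Wrapper: check arguments, then forward to the backend if it can pin. */
int demo_pin_pages(const struct demo_iommu_ops *ops, void *iommu_data,
		   unsigned long *user_pfn, int npage, int prot,
		   unsigned long *phys_pfn)
{
	if (!user_pfn || !phys_pfn)
		return -EINVAL;
	if (!ops || !ops->pin_pages)
		return -EINVAL;
	return ops->pin_pages(iommu_data, user_pfn, npage, prot, phys_pfn);
}

/* Toy backend: "translate" each user pfn by a fixed offset. */
int demo_backend_pin(void *iommu_data, unsigned long *user_pfn,
		     int npage, int prot, unsigned long *phys_pfn)
{
	unsigned long offset;
	int i;

	assert(iommu_data != NULL);
	(void)prot;
	offset = *(unsigned long *)iommu_data;
	for (i = 0; i < npage; i++)
		phys_pfn[i] = user_pfn[i] + offset;
	return npage;	/* number of pages pinned */
}
```

A backend that leaves pin_pages unset makes every pin request fail with
-EINVAL, which is the behaviour of the else branch in the wrappers.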

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 2e83bdf007fe..76d260e98930 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1799,6 +1799,102 @@ void vfio_info_cap_shift(struct vfio_info_cap *caps, size_t offset)
 }
 EXPORT_SYMBOL_GPL(vfio_info_cap_shift);
 
+
+/*
+ * Pin a set of guest PFNs and return their associated host PFNs for local
+ * domain only.
+ * @dev [in] : device
+ * @user_pfn [in]: array of user/guest PFNs
+ * @npage [in]: count of array elements
+ * @prot [in] : protection flags
+ * @phys_pfn[out] : array of host PFNs
+ * Return error or number of pages pinned.
+ */
+int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
+  int npage, int prot, unsigned long *phys_pfn)
+{
+   struct vfio_container *container;
+   struct vfio_group *group;
+   struct vfio_iommu_driver *driver;
+   int ret;
+
+   if (!dev || !user_pfn || !phys_pfn)
+   return -EINVAL;
+
+   group = vfio_group_get_from_dev(dev);
+   if (IS_ERR(group))
+   return PTR_ERR(group);
+
+   ret = vfio_group_add_container_user(group);
+   if (ret)
+   goto err_pin_pages;
+
+   container = group->container;
+   down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->pin_pages))
+   ret = driver->ops->pin_pages(container->iommu_data, user_pfn,
+npage, prot, phys_pfn);
+   else
+   ret = -EINVAL;
+
+   up_read(&container->group_lock);
+   vfio_group_try_dissolve_container(group);
+
+err_pin_pages:
+   vfio_group_put(group);
+   return ret;
+
+}
+EXPORT_SYMBOL(vfio_pin_pages);
+
+/*
+ * Unpin set of host PFNs for local domain only.
+ * @dev [in] : device
+ * @user_pfn [in]: array of user/guest PFNs to be unpinned
+ * @pfn [in] : array of host PFNs to be unpinned.
+ * @npage [in] :count of elements in array, that is number of pages.
+ * Return error or number of pages unpinned.
+ */
+int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
+unsigned long *pfn, int npage)
+{
+   struct vfio_container *container;
+   struct vfio_group *group;
+   struct vfio_iommu_driver *driver;
+   int ret;
+
+   if (!dev || !pfn)
+   return -EINVAL;
+
+   group = vfio_group_get_from_dev(dev);
+   if (IS_ERR(group))
+   return PTR_ERR(group);
+
+   ret = vfio_group_add_container_user(group);
+   if (ret)
+   goto err_unpin_pages;
+
+   container = group->container;
+   down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->unpin_pages))
+   ret = driver->ops->unpin_pages(container->iommu_data, user_pfn,
+  pfn, npage);
+   else
+   ret = -EINVAL;
+
+   up_read(&container->group_lock);
+   vfio_group_try_dissolve_container(group);
+
+err_unpin_pages:
+   vfio_group_put(group);
+   return ret;
+}
+EXPORT_SYMBOL(vfio_unpin_pages);
+
 /**
  * Module/class support
  */
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2ba19424e4a1..7fb87f008e0a 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -259,8 +259,8 @@ static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
  * the iommu can only map chunks of consecutive pfns anyway, so get the
  * first page and all consecutive pages with the same locking.
  */
-static long vfio_pin_pages(unsigned long vaddr, long npage,
-  int prot, unsigned long *pfn_base)
+static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
+   int prot, unsigned long *pfn_base)
 {
unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
bool lock_cap = capable(CAP_IPC_LOCK);
@@ -318,8 +318,8 @@ static long vfio_pin_pages(unsigned long vaddr, long npage,
return i;
 }
 
-static long vfio_unpin_pages(unsigned long pfn, long npage,
-int prot, bool do_accounting)
+static long __vfio_unpin_pages_remote(unsigned long pfn, long npage,

[PATCH v11 00/22] Add Mediated device support

2016-11-04 Thread Kirti Wankhede
This series adds Mediated device support to the Linux host kernel. The
purpose of this series is to provide a common interface for mediated device
management that can be used by different devices. This series introduces
the Mdev core module that creates and manages mediated devices, a VFIO
based driver for mediated devices that are created by the mdev core module,
and updates the VFIO type1 IOMMU module to support pinning & unpinning for
mediated devices.

What changed in v11?
mdev core:
  Register the mdev_bus class when the first device is registered, to avoid
  a panic if a vendor driver and the mdev driver are both selected as
  built-in but the vendor driver loads before the mdev module.
vfio_mdev:
  Added a notifier callback function to the mdev parent's ops so that the
  notifier is registered from the vfio_mdev module during device open and
  unregistered from the device close call. This is an optional callback.
  Some drivers using the mdev framework might not pin or unpin pages, for
  example the sample mtty driver that simulates a serial port. A vendor
  driver that needs to pin/unpin pages should provide this callback.
  Otherwise pin requests would fail.
vfio_iommu_type1:
  Updated to keep track of who (task and address space) mapped an iova range.
  During DMA_UNMAP, the task that mapped it, or another task that shares the
  same address space, is allowed to unmap; otherwise unmap fails.
  QEMU maps a few iova ranges initially, then forks threads, and a child
  thread calls DMA_UNMAP on a previously mapped iova. Since the child shares
  the same address space, DMA_UNMAP is successful.
  The address space keeps track of pages pinned (pfn_list) by external users /
  mdev devices. This pfn_list is used to verify the pfn during an unpin
  request, and for re-accounting of pages when a directly assigned device is
  hot-unplugged and an mdev device is present in the same container.
  When the container is released, all mapped iova from all tasks are
  unmapped and removed.

  Tested by assigning the below combinations of devices to a single VM:
   - GPU pass through only
   - vGPU device only
   - One GPU pass through and one vGPU device
   - Linux VM hot plug and unplug of a vGPU device while a GPU pass through
     device exists
   - Linux VM hot plug and unplug of a GPU pass through device while a vGPU
     device exists

  Patch series tested with linux-next up to commit 14970f204b19 @Fri Oct 28
  Resolved against conflicting change:
  05692d7005a3 vfio/pci: Fix integer overflows, bitmask check


Kirti Wankhede (22):
  vfio: Mediated device Core driver
  vfio: VFIO based driver for Mediated devices
  vfio: Rearrange functions to get vfio_group from dev
  vfio: Common function to increment container_users
  vfio iommu: Added pin and unpin callback functions to
vfio_iommu_driver_ops
  vfio iommu type1: Update arguments of vfio_lock_acct
  vfio iommu type1: Update argument of vaddr_get_pfn()
  vfio iommu type1: Add find_iommu_group() function
  vfio iommu type1: Add task structure to vfio_dma
  vfio iommu type1: Add support for mediated devices
  vfio iommu: Add blocking notifier to notify DMA_UNMAP
  vfio: Add notifier callback to parent's ops structure of mdev
  vfio: Introduce common function to add capabilities
  vfio_pci: Update vfio_pci to use vfio_info_add_capability()
  vfio: Introduce vfio_set_irqs_validate_and_prepare()
  vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare()
  vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare()
  vfio: Define device_api strings
  docs: Add Documentation for Mediated devices
  docs: Sysfs ABI for mediated device framework
  docs: Sample driver to demonstrate how to use Mediated device
framework.
  MAINTAINERS: Add entry VFIO based Mediated device drivers

 Documentation/ABI/testing/sysfs-bus-vfio-mdev |  111 ++
 Documentation/vfio-mediated-device.txt|  399 +++
 MAINTAINERS   |9 +
 drivers/vfio/Kconfig  |1 +
 drivers/vfio/Makefile |1 +
 drivers/vfio/mdev/Kconfig |   17 +
 drivers/vfio/mdev/Makefile|5 +
 drivers/vfio/mdev/mdev_core.c |  388 +++
 drivers/vfio/mdev/mdev_driver.c   |  122 ++
 drivers/vfio/mdev/mdev_private.h  |   41 +
 drivers/vfio/mdev/mdev_sysfs.c|  286 +
 drivers/vfio/mdev/vfio_mdev.c |  167 +++
 drivers/vfio/pci/vfio_pci.c   |   83 +-
 drivers/vfio/platform/vfio_platform_common.c  |   31 +-
 drivers/vfio/vfio.c   |  334 +-
 drivers/vfio/vfio_iommu_type1.c   |  831 --
 include/linux/mdev.h  |  176 +++
 include/linux/vfio.h  |   32 +-
 include/uapi/linux/vfio.h |   10 +
 samples/vfio-mdev/Makefile|   13 +
 samples/vfio-mdev/mtty.c  | 1503 +
 21 files changed, 4342 insertions(+), 218 deletions(-)
 create mode 100644 

[PATCH v11 02/22] vfio: VFIO based driver for Mediated devices

2016-11-04 Thread Kirti Wankhede
The vfio_mdev driver registers with the mdev core driver.
The mdev core driver creates a mediated device and calls the probe routine
of the vfio_mdev driver for each device.
The probe routine of the vfio_mdev driver adds the mediated device to the
VFIO core module.

This driver forms a shim layer that passes VFIO device operations through
to the vendor driver for mediated devices.

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I583f4734752971d3d112324d69e2508c88f359ec
---
 drivers/vfio/mdev/Kconfig |   9 ++-
 drivers/vfio/mdev/Makefile|   1 +
 drivers/vfio/mdev/vfio_mdev.c | 148 ++
 3 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/mdev/vfio_mdev.c
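
The shim behaviour of vfio_mdev_open() and vfio_mdev_release() in the diff
below (forward to the vendor's callback, hold a module reference only while
the device is open) can be modelled in userspace C. Assumptions: struct
demo_parent_ops and struct demo_mdev are hypothetical stand-ins, and the
integer module_refs replaces try_module_get()/module_put(), which cannot
fail in this model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_mdev;

/* Hypothetical stand-in for the vendor (parent) driver's callbacks. */
struct demo_parent_ops {
	int (*open)(struct demo_mdev *mdev);
	void (*release)(struct demo_mdev *mdev);
};

struct demo_mdev {
	const struct demo_parent_ops *ops;
	int module_refs;	/* models the shim module's refcount */
};

/* Open: a missing callback is an error; keep the ref only on success. */
int demo_mdev_open(struct demo_mdev *mdev)
{
	int ret;

	assert(mdev && mdev->ops);
	if (!mdev->ops->open)
		return -EINVAL;

	mdev->module_refs++;		/* models try_module_get() */
	ret = mdev->ops->open(mdev);
	if (ret)
		mdev->module_refs--;	/* models module_put() on failure */
	return ret;
}

/* Release: the callback is optional, the reference is always dropped. */
void demo_mdev_release(struct demo_mdev *mdev)
{
	assert(mdev && mdev->ops);
	if (mdev->ops->release)
		mdev->ops->release(mdev);
	mdev->module_refs--;
}

/* Toy vendor callbacks used to exercise both paths. */
int demo_open_ok(struct demo_mdev *mdev)
{
	(void)mdev;
	return 0;
}

int demo_open_fail(struct demo_mdev *mdev)
{
	(void)mdev;
	return -EIO;
}
```

The read, write, ioctl and mmap handlers in the diff follow the same shape
without the reference counting: check the callback, then forward the call.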

diff --git a/drivers/vfio/mdev/Kconfig b/drivers/vfio/mdev/Kconfig
index 303c14ce2847..79c9cface7b1 100644
--- a/drivers/vfio/mdev/Kconfig
+++ b/drivers/vfio/mdev/Kconfig
@@ -5,6 +5,13 @@ config VFIO_MDEV
 default n
 help
 Provides a framework to virtualize devices.
-   See Documentation/vfio-mdev/vfio-mediated-device.txt for more details.
+   See Documentation/vfio-mediated-device.txt for more details.
 
 If you don't know what do here, say N.
+
+config VFIO_MDEV_DEVICE
+tristate "VFIO support for Mediated devices"
+depends on VFIO && VFIO_MDEV
+default n
+help
+VFIO based driver for mediated devices.
diff --git a/drivers/vfio/mdev/Makefile b/drivers/vfio/mdev/Makefile
index 31bc04801d94..fa2d5ea466ee 100644
--- a/drivers/vfio/mdev/Makefile
+++ b/drivers/vfio/mdev/Makefile
@@ -2,3 +2,4 @@
 mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o
 
 obj-$(CONFIG_VFIO_MDEV) += mdev.o
+obj-$(CONFIG_VFIO_MDEV_DEVICE) += vfio_mdev.o
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
new file mode 100644
index ..bb534d19e321
--- /dev/null
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -0,0 +1,148 @@
+/*
+ * VFIO based driver for Mediated device
+ *
+ * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
+ * Author: Neo Jia 
+ *Kirti Wankhede 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mdev_private.h"
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "NVIDIA Corporation"
+#define DRIVER_DESC "VFIO based driver for Mediated device"
+
+static int vfio_mdev_open(void *device_data)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+   int ret;
+
+   if (unlikely(!parent->ops->open))
+   return -EINVAL;
+
+   if (!try_module_get(THIS_MODULE))
+   return -ENODEV;
+
+   ret = parent->ops->open(mdev);
+   if (ret)
+   module_put(THIS_MODULE);
+
+   return ret;
+}
+
+static void vfio_mdev_release(void *device_data)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (likely(parent->ops->release))
+   parent->ops->release(mdev);
+
+   module_put(THIS_MODULE);
+}
+
+static long vfio_mdev_unlocked_ioctl(void *device_data,
+unsigned int cmd, unsigned long arg)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->ioctl))
+   return -EINVAL;
+
+   return parent->ops->ioctl(mdev, cmd, arg);
+}
+
+static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->read))
+   return -EINVAL;
+
+   return parent->ops->read(mdev, buf, count, ppos);
+}
+
+static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
+  size_t count, loff_t *ppos)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->write))
+   return -EINVAL;
+
+   return parent->ops->write(mdev, buf, count, ppos);
+}
+
+static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->mmap))
+   return -EINVAL;
+
+   return parent->ops->mmap(mdev, vma);
+}
+
+static const struct vfio_device_ops vfio_mdev_dev_ops = {
+   .name   = "vfio-mdev",
+   .open   = vfio_mdev_open,
+   .release= vfio_mdev_release,
+   .ioctl  = vfio_mdev_unlocked_ioctl,
+ 

[PATCH v11 05/22] vfio iommu: Added pin and unpin callback functions to vfio_iommu_driver_ops

2016-11-04 Thread Kirti Wankhede
Added two new callback functions to struct vfio_iommu_driver_ops. Backend
IOMMU module that supports pining and unpinning pages for mdev devices
should provide these functions.
Added APIs for pining and unpining pages to VFIO module. These calls back
into backend iommu module to actually pin and unpin pages.
Renamed static functions in vfio_type1_iommu.c to resolve conflicts

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: Ia7417723aaae86bec2959ad9ae6c2915ddd340e0
---
 drivers/vfio/vfio.c | 96 +
 drivers/vfio/vfio_iommu_type1.c | 20 -
 include/linux/vfio.h| 14 +-
 3 files changed, 119 insertions(+), 11 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 2e83bdf007fe..76d260e98930 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1799,6 +1799,102 @@ void vfio_info_cap_shift(struct vfio_info_cap *caps, 
size_t offset)
 }
 EXPORT_SYMBOL_GPL(vfio_info_cap_shift);
 
+
+/*
+ * Pin a set of guest PFNs and return their associated host PFNs for local
+ * domain only.
+ * @dev [in] : device
+ * @user_pfn [in]: array of user/guest PFNs
+ * @npage [in]: count of array elements
+ * @prot [in] : protection flags
+ * @phys_pfn[out] : array of host PFNs
+ * Return error or number of pages pinned.
+ */
+int vfio_pin_pages(struct device *dev, unsigned long *user_pfn,
+  int npage, int prot, unsigned long *phys_pfn)
+{
+   struct vfio_container *container;
+   struct vfio_group *group;
+   struct vfio_iommu_driver *driver;
+   int ret;
+
+   if (!dev || !user_pfn || !phys_pfn)
+   return -EINVAL;
+
+   group = vfio_group_get_from_dev(dev);
+   if (IS_ERR(group))
+   return PTR_ERR(group);
+
+   ret = vfio_group_add_container_user(group);
+   if (ret)
+   goto err_pin_pages;
+
+   container = group->container;
+   down_read(>group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->pin_pages))
+   ret = driver->ops->pin_pages(container->iommu_data, user_pfn,
+npage, prot, phys_pfn);
+   else
+   ret = -EINVAL;
+
+   up_read(>group_lock);
+   vfio_group_try_dissolve_container(group);
+
+err_pin_pages:
+   vfio_group_put(group);
+   return ret;
+
+}
+EXPORT_SYMBOL(vfio_pin_pages);
+
+/*
+ * Unpin set of host PFNs for local domain only.
+ * @dev [in] : device
+ * @user_pfn [in]: array of user/guest PFNs to be unpinned
+ * @pfn [in] : array of host PFNs to be unpinned.
+ * @npage [in] :count of elements in array, that is number of pages.
+ * Return error or number of pages unpinned.
+ */
+int vfio_unpin_pages(struct device *dev, unsigned long *user_pfn,
+unsigned long *pfn, int npage)
+{
+   struct vfio_container *container;
+   struct vfio_group *group;
+   struct vfio_iommu_driver *driver;
+   int ret;
+
+   if (!dev || !pfn)
+   return -EINVAL;
+
+   group = vfio_group_get_from_dev(dev);
+   if (IS_ERR(group))
+   return PTR_ERR(group);
+
+   ret = vfio_group_add_container_user(group);
+   if (ret)
+   goto err_unpin_pages;
+
+   container = group->container;
+   down_read(&container->group_lock);
+
+   driver = container->iommu_driver;
+   if (likely(driver && driver->ops->unpin_pages))
+   ret = driver->ops->unpin_pages(container->iommu_data, user_pfn,
+  pfn, npage);
+   else
+   ret = -EINVAL;
+
+   up_read(&container->group_lock);
+   vfio_group_try_dissolve_container(group);
+
+err_unpin_pages:
+   vfio_group_put(group);
+   return ret;
+}
+EXPORT_SYMBOL(vfio_unpin_pages);
+
 /**
  * Module/class support
  */
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2ba19424e4a1..7fb87f008e0a 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -259,8 +259,8 @@ static int vaddr_get_pfn(unsigned long vaddr, int prot, unsigned long *pfn)
  * the iommu can only map chunks of consecutive pfns anyway, so get the
  * first page and all consecutive pages with the same locking.
  */
-static long vfio_pin_pages(unsigned long vaddr, long npage,
-  int prot, unsigned long *pfn_base)
+static long __vfio_pin_pages_remote(unsigned long vaddr, long npage,
+   int prot, unsigned long *pfn_base)
 {
unsigned long limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
bool lock_cap = capable(CAP_IPC_LOCK);
@@ -318,8 +318,8 @@ static long vfio_pin_pages(unsigned long vaddr, long npage,
return i;
 }
 
-static long vfio_unpin_pages(unsigned long pfn, long npage,
-int prot, bool do_accounting)
+static long __vfio_unpin_pages_remote(unsigned long pfn, long npage,
+

[PATCH v11 00/22] Add Mediated device support

2016-11-04 Thread Kirti Wankhede
This series adds mediated device support to the Linux host kernel. Its
purpose is to provide a common interface for mediated device management
that can be used by different devices. The series introduces the mdev core
module, which creates and manages mediated devices; a VFIO based driver for
the mediated devices created by the mdev core module; and updates to the
VFIO type1 IOMMU module to support pinning and unpinning for mediated
devices.

What changed in v11?
mdev core:
  Register mdev_bus class when the first device is registered, to avoid a
  panic when a vendor driver and the mdev driver are both built in but the
  vendor driver loads before the mdev module.
vfio_mdev:
  Added a notifier callback function to the mdev parent's ops so that the
  notifier is registered from the vfio_mdev module during device open and
  unregistered during device close. This is an optional callback. Some
  drivers using the mdev framework might not pin or unpin pages, for example
  the sample mtty driver that simulates a serial port. Vendor drivers that
  need to pin/unpin pages should provide this callback; otherwise pin
  requests fail.
vfio_iommu_type1:
  Updated to keep track of who (task and address space) mapped an iova range.
  During DMA_UNMAP, the task that mapped the range, or another task that
  shares the same address space, is allowed to unmap it; otherwise the unmap
  fails.
  QEMU maps a few iova ranges initially, then forks threads, and a child
  thread calls DMA_UNMAP on a previously mapped iova. Since the child shares
  the same address space, DMA_UNMAP succeeds.
  The address space keeps track of pages pinned (pfn_list) by external users /
  mdev devices. This pfn_list is used to verify the pfn during an unpin
  request, and to re-account pages when a directly assigned device is
  hot-unplugged while an mdev device is present in the same container.
  When the container is released, all mapped iova from all tasks are
  unmapped and removed.
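
  The DMA_UNMAP ownership rule described above can be sketched in userspace
  C roughly as follows. This is only an illustration under stated
  assumptions: mock_mm, mock_task, mock_dma and mock_may_unmap() are
  hypothetical names made up for this note and are not the structures or
  helpers used by the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel's task and mm structures are far richer. */
struct mock_mm {
	int id;
};

struct mock_task {
	int pid;
	struct mock_mm *mm;	/* address space this task runs in */
};

/* One mapped iova range that remembers which task created it. */
struct mock_dma {
	unsigned long iova;
	size_t size;
	const struct mock_task *task;
};

/*
 * Unmap is allowed for the task that created the mapping, or for any
 * task that shares the mapping task's address space.
 */
static bool mock_may_unmap(const struct mock_dma *dma,
			   const struct mock_task *caller)
{
	if (dma->task == caller)
		return true;

	return dma->task->mm != NULL && dma->task->mm == caller->mm;
}
```

  With this model, a QEMU child thread that shares the parent's mock_mm
  passes the check, while a task in a different address space does not.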

  Tested by assigning the following combinations of devices to a single VM:
   - GPU pass through only
   - vGPU device only
   - One GPU pass through and one vGPU device
   - Linux VM hot plug and unplug vGPU device while GPU pass through device
     exists
   - Linux VM hot plug and unplug GPU pass through device while vGPU device
     exists

  Patch series tested with linux-next up to commit 14970f204b19 @Fri Oct 28
  Resolved against conflicting change:
  05692d7005a3 vfio/pci: Fix integer overflows, bitmask check


Kirti Wankhede (22):
  vfio: Mediated device Core driver
  vfio: VFIO based driver for Mediated devices
  vfio: Rearrange functions to get vfio_group from dev
  vfio: Common function to increment container_users
  vfio iommu: Added pin and unpin callback functions to
vfio_iommu_driver_ops
  vfio iommu type1: Update arguments of vfio_lock_acct
  vfio iommu type1: Update argument of vaddr_get_pfn()
  vfio iommu type1: Add find_iommu_group() function
  vfio iommu type1: Add task structure to vfio_dma
  vfio iommu type1: Add support for mediated devices
  vfio iommu: Add blocking notifier to notify DMA_UNMAP
  vfio: Add notifier callback to parent's ops structure of mdev
  vfio: Introduce common function to add capabilities
  vfio_pci: Update vfio_pci to use vfio_info_add_capability()
  vfio: Introduce vfio_set_irqs_validate_and_prepare()
  vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare()
  vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare()
  vfio: Define device_api strings
  docs: Add Documentation for Mediated devices
  docs: Sysfs ABI for mediated device framework
  docs: Sample driver to demonstrate how to use Mediated device
framework.
  MAINTAINERS: Add entry VFIO based Mediated device drivers

 Documentation/ABI/testing/sysfs-bus-vfio-mdev |  111 ++
 Documentation/vfio-mediated-device.txt|  399 +++
 MAINTAINERS   |9 +
 drivers/vfio/Kconfig  |1 +
 drivers/vfio/Makefile |1 +
 drivers/vfio/mdev/Kconfig |   17 +
 drivers/vfio/mdev/Makefile|5 +
 drivers/vfio/mdev/mdev_core.c |  388 +++
 drivers/vfio/mdev/mdev_driver.c   |  122 ++
 drivers/vfio/mdev/mdev_private.h  |   41 +
 drivers/vfio/mdev/mdev_sysfs.c|  286 +
 drivers/vfio/mdev/vfio_mdev.c |  167 +++
 drivers/vfio/pci/vfio_pci.c   |   83 +-
 drivers/vfio/platform/vfio_platform_common.c  |   31 +-
 drivers/vfio/vfio.c   |  334 +-
 drivers/vfio/vfio_iommu_type1.c   |  831 --
 include/linux/mdev.h  |  176 +++
 include/linux/vfio.h  |   32 +-
 include/uapi/linux/vfio.h |   10 +
 samples/vfio-mdev/Makefile|   13 +
 samples/vfio-mdev/mtty.c  | 1503 +
 21 files changed, 4342 insertions(+), 218 deletions(-)
 create mode 100644 

[PATCH v11 02/22] vfio: VFIO based driver for Mediated devices

2016-11-04 Thread Kirti Wankhede
The vfio_mdev driver registers with the mdev core driver.
The mdev core driver creates a mediated device and calls the probe routine
of the vfio_mdev driver for each device.
The probe routine of the vfio_mdev driver adds the mediated device to the
VFIO core module.

This driver forms a shim layer that passes VFIO device operations through
to the vendor driver for mediated devices.
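
The shim idea can be modeled outside the kernel as below. This is a minimal
sketch, assuming a trimmed-down ops table; mock_parent_ops, mock_mdev and the
mock_shim_*() and mock_vendor_read() functions are hypothetical names for
illustration, not the interfaces added by this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct mock_mdev;

/* Hypothetical subset of the vendor callbacks; any of them may be NULL. */
struct mock_parent_ops {
	ssize_t (*read)(struct mock_mdev *mdev, char *buf, size_t count);
	long (*ioctl)(struct mock_mdev *mdev, unsigned int cmd,
		      unsigned long arg);
};

struct mock_mdev {
	const struct mock_parent_ops *ops;
};

/* Shim: forward to the vendor callback, or fail with -EINVAL if missing. */
static ssize_t mock_shim_read(void *device_data, char *buf, size_t count)
{
	struct mock_mdev *mdev = device_data;

	if (!mdev->ops->read)
		return -EINVAL;

	return mdev->ops->read(mdev, buf, count);
}

static long mock_shim_ioctl(void *device_data, unsigned int cmd,
			    unsigned long arg)
{
	struct mock_mdev *mdev = device_data;

	if (!mdev->ops->ioctl)
		return -EINVAL;

	return mdev->ops->ioctl(mdev, cmd, arg);
}

/* Example vendor callback: emulates a region that always reads as zero. */
static ssize_t mock_vendor_read(struct mock_mdev *mdev, char *buf,
				size_t count)
{
	(void)mdev;
	memset(buf, 0, count);
	return (ssize_t)count;
}
```

The shim keeps no state of its own: every operation either reaches the
vendor callback unchanged or is rejected because the vendor did not supply
one.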

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I583f4734752971d3d112324d69e2508c88f359ec
---
 drivers/vfio/mdev/Kconfig |   9 ++-
 drivers/vfio/mdev/Makefile|   1 +
 drivers/vfio/mdev/vfio_mdev.c | 148 ++
 3 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100644 drivers/vfio/mdev/vfio_mdev.c

diff --git a/drivers/vfio/mdev/Kconfig b/drivers/vfio/mdev/Kconfig
index 303c14ce2847..79c9cface7b1 100644
--- a/drivers/vfio/mdev/Kconfig
+++ b/drivers/vfio/mdev/Kconfig
@@ -5,6 +5,13 @@ config VFIO_MDEV
 default n
 help
 Provides a framework to virtualize devices.
-   See Documentation/vfio-mdev/vfio-mediated-device.txt for more details.
+   See Documentation/vfio-mediated-device.txt for more details.
 
 If you don't know what do here, say N.
+
+config VFIO_MDEV_DEVICE
+tristate "VFIO support for Mediated devices"
+depends on VFIO && VFIO_MDEV
+default n
+help
+VFIO based driver for mediated devices.
diff --git a/drivers/vfio/mdev/Makefile b/drivers/vfio/mdev/Makefile
index 31bc04801d94..fa2d5ea466ee 100644
--- a/drivers/vfio/mdev/Makefile
+++ b/drivers/vfio/mdev/Makefile
@@ -2,3 +2,4 @@
 mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o
 
 obj-$(CONFIG_VFIO_MDEV) += mdev.o
+obj-$(CONFIG_VFIO_MDEV_DEVICE) += vfio_mdev.o
diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c
new file mode 100644
index ..bb534d19e321
--- /dev/null
+++ b/drivers/vfio/mdev/vfio_mdev.c
@@ -0,0 +1,148 @@
+/*
+ * VFIO based driver for Mediated device
+ *
+ * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
+ * Author: Neo Jia 
+ *Kirti Wankhede 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "mdev_private.h"
+
+#define DRIVER_VERSION  "0.1"
+#define DRIVER_AUTHOR   "NVIDIA Corporation"
+#define DRIVER_DESC "VFIO based driver for Mediated device"
+
+static int vfio_mdev_open(void *device_data)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+   int ret;
+
+   if (unlikely(!parent->ops->open))
+   return -EINVAL;
+
+   if (!try_module_get(THIS_MODULE))
+   return -ENODEV;
+
+   ret = parent->ops->open(mdev);
+   if (ret)
+   module_put(THIS_MODULE);
+
+   return ret;
+}
+
+static void vfio_mdev_release(void *device_data)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (likely(parent->ops->release))
+   parent->ops->release(mdev);
+
+   module_put(THIS_MODULE);
+}
+
+static long vfio_mdev_unlocked_ioctl(void *device_data,
+unsigned int cmd, unsigned long arg)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->ioctl))
+   return -EINVAL;
+
+   return parent->ops->ioctl(mdev, cmd, arg);
+}
+
+static ssize_t vfio_mdev_read(void *device_data, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->read))
+   return -EINVAL;
+
+   return parent->ops->read(mdev, buf, count, ppos);
+}
+
+static ssize_t vfio_mdev_write(void *device_data, const char __user *buf,
+  size_t count, loff_t *ppos)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->write))
+   return -EINVAL;
+
+   return parent->ops->write(mdev, buf, count, ppos);
+}
+
+static int vfio_mdev_mmap(void *device_data, struct vm_area_struct *vma)
+{
+   struct mdev_device *mdev = device_data;
+   struct parent_device *parent = mdev->parent;
+
+   if (unlikely(!parent->ops->mmap))
+   return -EINVAL;
+
+   return parent->ops->mmap(mdev, vma);
+}
+
+static const struct vfio_device_ops vfio_mdev_dev_ops = {
+   .name   = "vfio-mdev",
+   .open   = vfio_mdev_open,
+   .release= vfio_mdev_release,
+   .ioctl  = vfio_mdev_unlocked_ioctl,
+   .read   = vfio_mdev_read,
+   .write  = 

[PATCH v11 04/22] vfio: Common function to increment container_users

2016-11-04 Thread Kirti Wankhede
This change rearranges functions so that a common function increments
container_users.
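
The increment-then-roll-back pattern that the common function captures can be
sketched with C11 atomics as below. This is a userspace approximation only:
mock_group, mock_inc_not_zero() and mock_group_add_container_user() are
hypothetical names, and the boolean fields stand in for the checks the kernel
code performs on the real group.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical, simplified group; the real structure has many more fields. */
struct mock_group {
	atomic_int container_users;
	bool noiommu;
	bool has_iommu_driver;
	bool viable;
};

/* Userspace analogue of atomic_inc_not_zero(): increment unless value is 0. */
static bool mock_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Take a container user reference; undo the increment on later failure. */
static int mock_group_add_container_user(struct mock_group *group)
{
	if (!mock_inc_not_zero(&group->container_users))
		return -EINVAL;

	if (group->noiommu) {
		atomic_fetch_sub(&group->container_users, 1);
		return -EPERM;
	}
	if (!group->has_iommu_driver || !group->viable) {
		atomic_fetch_sub(&group->container_users, 1);
		return -EINVAL;
	}

	return 0;
}
```

Callers then only need to translate the integer result, for example by
wrapping it in an error pointer, instead of repeating the three checks.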

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I8bdeb352bc8439b107ffd519480fd4dc238677f2
---
 drivers/vfio/vfio.c | 34 +-
 1 file changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 23bc86c1d05d..2e83bdf007fe 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1385,6 +1385,23 @@ static bool vfio_group_viable(struct vfio_group *group)
 group, vfio_dev_viable) == 0);
 }
 
+static int vfio_group_add_container_user(struct vfio_group *group)
+{
+   if (!atomic_inc_not_zero(&group->container_users))
+   return -EINVAL;
+
+   if (group->noiommu) {
+   atomic_dec(&group->container_users);
+   return -EPERM;
+   }
+   if (!group->container->iommu_driver || !vfio_group_viable(group)) {
+   atomic_dec(&group->container_users);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 static const struct file_operations vfio_device_fops;
 
 static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
@@ -1694,23 +1711,14 @@ static const struct file_operations vfio_device_fops = {
 struct vfio_group *vfio_group_get_external_user(struct file *filep)
 {
struct vfio_group *group = filep->private_data;
+   int ret;
 
if (filep->f_op != &vfio_group_fops)
return ERR_PTR(-EINVAL);
 
-   if (!atomic_inc_not_zero(&group->container_users))
-   return ERR_PTR(-EINVAL);
-
-   if (group->noiommu) {
-   atomic_dec(&group->container_users);
-   return ERR_PTR(-EPERM);
-   }
-
-   if (!group->container->iommu_driver ||
-   !vfio_group_viable(group)) {
-   atomic_dec(&group->container_users);
-   return ERR_PTR(-EINVAL);
-   }
+   ret = vfio_group_add_container_user(group);
+   if (ret)
+   return ERR_PTR(ret);
 
vfio_group_get(group);
 
-- 
2.7.0



[PATCH v11 01/22] vfio: Mediated device Core driver

2016-11-04 Thread Kirti Wankhede
Design for Mediated Device Driver:
The main purpose of this driver is to provide a common interface for mediated
device management that can be used by different drivers of different
devices.

This module provides a generic interface to create the device, add it to the
mediated bus, add the device to an IOMMU group and then add it to a vfio group.

Below is the high-level block diagram, with Nvidia, Intel and IBM devices
as examples, since these are the devices that are going to actively use
this module as of now.

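     +---------------+
     |               |
     | +-----------+ |  mdev_register_driver() +--------------+
     | |           | +<------------------------+ __init()     |
     | |  mdev     | |                         |              |
     | |  bus      | +------------------------>+              |<-> VFIO user
     | |  driver   | |     probe()/remove()    | vfio_mdev.ko |    APIs
     | |           | |                         |              |
     | +-----------+ |                         +--------------+
     |               |
     |  MDEV CORE    |
     |   MODULE      |
     |   mdev.ko     |
     | +-----------+ |  mdev_register_device() +--------------+
     | |           | +<------------------------+              |
     | |           | |                         |  nvidia.ko   |<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callback         +--------------+
     | | Physical  | |
     | |  device   | |  mdev_register_device() +--------------+
     | | interface | |<------------------------+              |
     | |           | |                         |  i915.ko     |<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callback         +--------------+
     | |           | |
     | |           | |  mdev_register_device() +--------------+
     | |           | +<------------------------+              |
     | |           | |                         | ccw_device.ko|<-> physical
     | |           | +------------------------>+              |    device
     | |           | |        callback         +--------------+
     | +-----------+ |
     +---------------+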

Core driver provides two types of registration interfaces:
1. Registration interface for mediated bus driver:

/**
 * struct mdev_driver - Mediated device's driver
 * @name: driver name
 * @probe: called when new device created
 * @remove: called when device removed
 * @driver: device driver structure
 *
 **/
struct mdev_driver {
        const char *name;
        int  (*probe)(struct device *dev);
        void (*remove)(struct device *dev);
        struct device_driver driver;
};

The mediated bus driver for an mdev device should use these interfaces to
register with and unregister from the core driver, respectively:

int  mdev_register_driver(struct mdev_driver *drv, struct module *owner);
void mdev_unregister_driver(struct mdev_driver *drv);
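
The registration flow can be modeled in userspace C as below. This is a rough
sketch with a single-slot "bus" and hypothetical names (mock_device,
mock_mdev_driver, mock_register_driver() and friends); it is not the code in
mdev_driver.c, which builds on the kernel driver model instead.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct mdev_driver. */
struct mock_device {
	const char *name;
	int bound;
};

struct mock_mdev_driver {
	const char *name;
	int  (*probe)(struct mock_device *dev);
	void (*remove)(struct mock_device *dev);
};

/* Single-slot "bus": at most one registered driver, for illustration only. */
static struct mock_mdev_driver *mock_bus_driver;

static int mock_register_driver(struct mock_mdev_driver *drv)
{
	if (!drv || !drv->probe)
		return -EINVAL;
	if (mock_bus_driver)
		return -EBUSY;

	mock_bus_driver = drv;
	return 0;
}

static void mock_unregister_driver(struct mock_mdev_driver *drv)
{
	if (mock_bus_driver == drv)
		mock_bus_driver = NULL;
}

/* Called by the mock core when a mediated device is created. */
static int mock_device_created(struct mock_device *dev)
{
	if (!mock_bus_driver)
		return -ENODEV;

	return mock_bus_driver->probe(dev);
}

/* Called by the mock core when a mediated device is removed. */
static void mock_device_removed(struct mock_device *dev)
{
	if (mock_bus_driver && mock_bus_driver->remove)
		mock_bus_driver->remove(dev);
}

/* Example callbacks: a real bus driver would add/delete the VFIO device. */
static int example_probe(struct mock_device *dev)
{
	dev->bound = 1;
	return 0;
}

static void example_remove(struct mock_device *dev)
{
	dev->bound = 0;
}
```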

The mediated bus driver is responsible for adding/deleting mediated devices
to/from the VFIO group when devices are bound to and unbound from the driver.

2. Physical device driver interface
This interface provides the vendor driver with a set of APIs to manage
physical-device-related work in its driver. The APIs are:

* dev_attr_groups: attributes of the parent device.
* mdev_attr_groups: attributes of the mediated device.
* supported_type_groups: attributes to define supported types. This is a
  mandatory field.
* create: to allocate basic resources in the vendor driver for a mediated
  device. This must be provided by the vendor driver.
* remove: to free resources in the vendor driver when a mediated device is
  destroyed. This must be provided by the vendor driver.
* open: open callback of the mediated device.
* release: release callback of the mediated device.
* read: read emulation callback.
* write: write emulation callback.
* ioctl: ioctl callback.
* mmap: mmap emulation callback.

Drivers should use these interfaces to register a device with, and unregister
it from, the mdev core driver, respectively:

extern int  mdev_register_device(struct device *dev,
 const struct parent_ops *ops);
extern void mdev_unregister_device(struct device *dev);
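
Since supported_type_groups, create and remove are described above as
mandatory, a registration path would plausibly reject an ops table that
lacks any of them. The sketch below illustrates that check in userspace C;
mock_parent_ops, mock_register_device() and the example_vendor_*() callbacks
are hypothetical, and the field types are placeholders rather than the real
struct parent_ops layout.

```c
#include <errno.h>
#include <stddef.h>

struct mock_mdev;

/* Hypothetical, trimmed-down parent ops; field types are placeholders. */
struct mock_parent_ops {
	const void *supported_type_groups;		/* mandatory */
	int (*create)(struct mock_mdev *mdev);		/* mandatory */
	int (*remove)(struct mock_mdev *mdev);		/* mandatory */
	int (*open)(struct mock_mdev *mdev);		/* optional */
	void (*release)(struct mock_mdev *mdev);	/* optional */
};

/* Reject a registration that lacks any field marked mandatory above. */
static int mock_register_device(const struct mock_parent_ops *ops)
{
	if (!ops || !ops->supported_type_groups ||
	    !ops->create || !ops->remove)
		return -EINVAL;

	return 0;
}

/* Example vendor callbacks that do nothing and report success. */
static int example_vendor_create(struct mock_mdev *mdev)
{
	(void)mdev;
	return 0;
}

static int example_vendor_remove(struct mock_mdev *mdev)
{
	(void)mdev;
	return 0;
}
```

Checking once at registration time means the core never has to test the
mandatory callbacks for NULL on the create and remove paths.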

There are no locks to serialize the above callbacks in the mdev driver or the
vfio_mdev driver. If required, a vendor driver can use its own locks to
serialize these callbacks.
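
One way a vendor driver might serialize its own callbacks is a per-device
lock taken at the top of each callback. The sketch below shows the idea in
userspace C with a pthread mutex standing in for a kernel mutex;
mock_vendor_state and the mock_state_*() functions are hypothetical names
for illustration only.

```c
#include <pthread.h>

/* Hypothetical per-device vendor state guarded by a single lock. */
struct mock_vendor_state {
	pthread_mutex_t lock;
	unsigned long reg;	/* emulated register value */
	unsigned long writes;	/* number of write callbacks served */
};

static void mock_state_init(struct mock_vendor_state *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->reg = 0;
	s->writes = 0;
}

/* Write callback body: both fields are updated under the lock. */
static void mock_state_write(struct mock_vendor_state *s, unsigned long val)
{
	pthread_mutex_lock(&s->lock);
	s->reg = val;
	s->writes++;
	pthread_mutex_unlock(&s->lock);
}

/* Read callback body: takes the same lock, so it never sees a half update. */
static unsigned long mock_state_read(struct mock_vendor_state *s)
{
	unsigned long val;

	pthread_mutex_lock(&s->lock);
	val = s->reg;
	pthread_mutex_unlock(&s->lock);

	return val;
}
```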

Signed-off-by: Kirti Wankhede 
Signed-off-by: Neo Jia 
Change-Id: I73a5084574270b14541c529461ea2f03c292d510
---
 drivers/vfio/Kconfig |   1 +
 drivers/vfio/Makefile|   1 +
 drivers/vfio/mdev/Kconfig|  10 +
 drivers/vfio/mdev/Makefile   |   4 +
 drivers/vfio/mdev/mdev_core.c| 388 +++
 drivers/vfio/mdev/mdev_driver.c  | 122 
 drivers/vfio/mdev/mdev_private.h |  41 +
 drivers/vfio/mdev/mdev_sysfs.c   | 286 +
 include/linux/mdev.h | 167 +
 9 files changed, 1020 insertions(+)
 create mode 100644 
