date:20170320

[PATCH 4.9 87/93] x86/tsc: Fix ART for TSC_KNOWN_FREQ

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Peter Zijlstra 

commit 44fee88cea43d3c2cac962e0439cb10a3cabff6d upstream.

Subhransu reported that convert_art_to_tsc() isn't working for him.

The ART to TSC relation is only set up for systems which use the refined
TSC calibration. Systems with known TSC frequency (available via CPUID 15)
are not using the refined calibration and therefore the ART to TSC relation
is never established.

Add the setup to the known frequency init path which skips ART
calibration. The init code needs to be duplicated as for systems which use
refined calibration the ART setup must be delayed until calibration has
been done.

The problem has been there since the ART support was introduced, but only
detected now because Subhransu tested the first time on hardware which has
TSC frequency enumerated via CPUID 15.

Note for stable: The conditional has changed from TSC_RELIABLE to
 TSC_KNOWN_FREQUENCY.

[ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]

Fixes: f9677e0f8308 ("x86/tsc: Always Running Timer (ART) correlated clocksource")
Reported-by: "Prusty, Subhransu S" 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: sta...@vger.kernel.org
Cc: christopher.s.h...@intel.com
Cc: kevin.b.stan...@intel.com
Cc: john.stu...@linaro.org
Cc: akata...@vmware.com
Link: http://lkml.kernel.org/r/20170313145712.gi3...@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/x86/kernel/tsc.c |2 ++
 1 file changed, 2 insertions(+)

--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1287,6 +1287,8 @@ static int __init init_tsc_clocksource(v
 	 * exporting a reliable TSC.
 	 */
 	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE)) {
+		if (boot_cpu_has(X86_FEATURE_ART))
+			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
 		return 0;
 	}
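
For readers unfamiliar with the ART to TSC relation mentioned in the changelog, here is a minimal user-space sketch of the scaling arithmetic, assuming the ratio comes from CPUID leaf 0x15 (EAX as denominator, EBX as numerator). The struct and function names are hypothetical and are not the kernel's; the kernel only performs the conversion once art_related_clocksource has been set, which is what the hunk above adds for this init path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the ART/TSC ratio; names are illustrative only. */
struct art_tsc_ratio {
	uint32_t numerator;	/* assumed to come from CPUID.15H:EBX */
	uint32_t denominator;	/* assumed to come from CPUID.15H:EAX */
	uint64_t offset;	/* extra TSC offset, 0 if the TSC is not adjusted */
};

/* Scale an ART value to TSC units: tsc = art * num / den + offset. */
static uint64_t art_to_tsc(const struct art_tsc_ratio *r, uint64_t art)
{
	assert(r->denominator != 0);

	uint64_t quot = art / r->denominator;
	uint64_t rem = art % r->denominator;

	/* Split the multiply so art * numerator does not overflow 64 bits early. */
	return quot * r->numerator +
	       (rem * r->numerator) / r->denominator + r->offset;
}
```

With a ratio of 3/2 and an offset of 10, an ART value of 5 maps to 17 (7 after truncation, plus the offset).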

[PATCH 4.9 86/93] irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Shanker Donthineni 

commit 90922a2d03d84de36bf8a9979d62580102f31a92 upstream.

On Qualcomm Datacenter Technologies QDF2400 SoCs, the ITS hardware
implementation uses 16 bytes for each Interrupt Translation Entry (ITE),
but reports an incorrect value of 8 bytes in GITS_TYPER.ITTE_size.

It might cause kernel memory corruption depending on the number
of MSI(x) that are configured and the amount of memory that has
been allocated for ITEs in its_create_device().

This patch fixes the potential memory corruption by setting the
correct ITE size to 16 bytes.

Cc: sta...@vger.kernel.org
Signed-off-by: Shanker Donthineni 
Signed-off-by: Marc Zyngier 
Signed-off-by: Greg Kroah-Hartman 

---
 Documentation/arm64/silicon-errata.txt |   44 +
 arch/arm64/Kconfig |   10 +++
 drivers/irqchip/irq-gic-v3-its.c   |   16 
 3 files changed, 49 insertions(+), 21 deletions(-)

--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -42,24 +42,26 @@ file acts as a registry of software work
 will be updated when new workarounds are committed and backported to
 stable kernels.
 
-| Implementor    | Component       | Erratum ID      | Kconfig                 |
-+----------------+-----------------+-----------------+-------------------------+
-| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319    |
-| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319    |
-| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069    |
-| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
-| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719    |
-| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419    |
-| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
-| ARM            | Cortex-A57      | #852523         | N/A                     |
-| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220    |
-| ARM            | Cortex-A72      | #853709         | N/A                     |
-| ARM            | MMU-500         | #841119,#826419 | N/A                     |
-|                |                 |                 |                         |
-| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375    |
-| Cavium         | ThunderX ITS    | #23144          | CAVIUM_ERRATUM_23144    |
-| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154    |
-| Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456    |
-| Cavium         | ThunderX SMMUv2 | #27704          | N/A                     |
-|                |                 |                 |                         |
-| Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
+| Implementor    | Component       | Erratum ID      | Kconfig                     |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319        |
+| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319        |
+| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069        |
+| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472        |
+| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719        |
+| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419        |
+| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075        |
+| ARM            | Cortex-A57      | #852523         | N/A                         |
+| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220        |
+| ARM            | Cortex-A72      | #853709         | N/A                         |
+| ARM            | MMU-500         | #841119,#826419 | N/A                         |
+|                |                 |                 |                             |
+| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375        |
+| Cavium         | ThunderX ITS    | #23144          | CAVIUM_ERRATUM_23144        |
+| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154        |
+| Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456        |
+| Cavium         | ThunderX SMMUv2 | #27704          | N/A                         |
+|                |                 |                 |                             |
+| Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585         |
+|                |                 |                 |                             |
+| Qualcomm Tech. | QDF2400 ITS     | E0065           |
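
To illustrate why a wrong ITE size can corrupt memory, here is a rough sketch of how a translation table allocation scales with the entry size. It assumes the table is sized as a power-of-two entry count times the per-entry size; itt_bytes() and roundup_pow2() are hypothetical helpers, not the driver's functions, and the driver's alignment padding is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next power of two (n >= 1). */
static size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Hypothetical helper: bytes needed for a table holding nvecs entries. */
static size_t itt_bytes(size_t nvecs, size_t ite_size)
{
	assert(nvecs > 0 && ite_size > 0);
	return roundup_pow2(nvecs) * ite_size;
}
```

For 32 vectors, the reported 8-byte entry size gives 256 bytes while hardware that really uses 16-byte entries needs 512, so the hardware would write past the end of the smaller allocation.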

[PATCH 4.9 61/93] ibmveth: calculate gso_segs for large packets

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Thomas Falcon 

[ Upstream commit 94acf164dc8f1184e8d0737be7125134c2701dbe ]

Include calculations to compute the number of segments
that comprise an aggregated large packet.

Signed-off-by: Thomas Falcon 
Reviewed-by: Marcelo Ricardo Leitner 
Reviewed-by: Jonathan Maxwell 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/ibm/ibmveth.c |   12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1181,7 +1181,9 @@ map_failed:
 
 static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
 {
+	struct tcphdr *tcph;
 	int offset = 0;
+	int hdr_len;
 
 	/* only TCP packets will be aggregated */
 	if (skb->protocol == htons(ETH_P_IP)) {
@@ -1208,14 +1210,20 @@ static void ibmveth_rx_mss_helper(struct
 	/* if mss is not set through Large Packet bit/mss in rx buffer,
 	 * expect that the mss will be written to the tcp header checksum.
 	 */
+	tcph = (struct tcphdr *)(skb->data + offset);
 	if (lrg_pkt) {
 		skb_shinfo(skb)->gso_size = mss;
 	} else if (offset) {
-		struct tcphdr *tcph = (struct tcphdr *)(skb->data + offset);
-
 		skb_shinfo(skb)->gso_size = ntohs(tcph->check);
 		tcph->check = 0;
 	}
+
+	if (skb_shinfo(skb)->gso_size) {
+		hdr_len = offset + tcph->doff * 4;
+		skb_shinfo(skb)->gso_segs =
+				DIV_ROUND_UP(skb->len - hdr_len,
+					     skb_shinfo(skb)->gso_size);
+	}
 }
 
 static int ibmveth_poll(struct napi_struct *napi, int budget)
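
The segment count added above is a ceiling division of the TCP payload length by the MSS. A standalone sketch of that arithmetic follows; the function name and parameters are hypothetical stand-ins for the skb fields, and DIV_ROUND_UP is redefined locally so the snippet builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical standalone version of the gso_segs calculation.
 * tcp_offset: bytes in front of the TCP header (the patch's "offset")
 * tcp_doff:   TCP data offset field, counted in 32-bit words
 */
static uint32_t gso_segs(uint32_t pkt_len, uint32_t tcp_offset,
			 uint32_t tcp_doff, uint32_t gso_size)
{
	assert(gso_size != 0);

	uint32_t hdr_len = tcp_offset + tcp_doff * 4;

	assert(pkt_len >= hdr_len);
	return DIV_ROUND_UP(pkt_len - hdr_len, gso_size);
}
```

With a 20-byte IP header, a 20-byte TCP header (doff = 5) and an MSS of 1460, a 2960-byte packet yields 2 segments and a 2961-byte packet yields 3.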

[PATCH 4.9 84/93] drm/vc4: Fix ->clock_select setting for the VEC encoder

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Boris Brezillon 

commit ab8df60e3a3b68420d0d4477c5f07c00fbfb078b upstream.

PV_CONTROL_CLK_SELECT_VEC is actually 2 and not 0. Fix the definition and
rework the vc4_set_crtc_possible_masks() to cover the full range of the
PV_CONTROL_CLK_SELECT field.

Signed-off-by: Boris Brezillon 
Signed-off-by: Eric Anholt 
Cc: Amit Pundir 
Signed-off-by: Greg Kroah-Hartman 


---
 drivers/gpu/drm/vc4/vc4_crtc.c |   36 ++--
 drivers/gpu/drm/vc4/vc4_drv.h  |1 +
 drivers/gpu/drm/vc4/vc4_regs.h |3 ++-
 3 files changed, 25 insertions(+), 15 deletions(-)

--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -83,8 +83,7 @@ struct vc4_crtc_data {
 	/* Which channel of the HVS this pixelvalve sources from. */
 	int hvs_channel;
 
-	enum vc4_encoder_type encoder0_type;
-	enum vc4_encoder_type encoder1_type;
+	enum vc4_encoder_type encoder_types[4];
 };
 
 #define CRTC_WRITE(offset, val) writel(val, vc4_crtc->regs + (offset))
@@ -867,20 +866,26 @@ static const struct drm_crtc_helper_func
 
 static const struct vc4_crtc_data pv0_data = {
 	.hvs_channel = 0,
-	.encoder0_type = VC4_ENCODER_TYPE_DSI0,
-	.encoder1_type = VC4_ENCODER_TYPE_DPI,
+	.encoder_types = {
+		[PV_CONTROL_CLK_SELECT_DSI] = VC4_ENCODER_TYPE_DSI0,
+		[PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_DPI,
+	},
 };
 
 static const struct vc4_crtc_data pv1_data = {
 	.hvs_channel = 2,
-	.encoder0_type = VC4_ENCODER_TYPE_DSI1,
-	.encoder1_type = VC4_ENCODER_TYPE_SMI,
+	.encoder_types = {
+		[PV_CONTROL_CLK_SELECT_DSI] = VC4_ENCODER_TYPE_DSI1,
+		[PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_SMI,
+	},
 };
 
 static const struct vc4_crtc_data pv2_data = {
 	.hvs_channel = 1,
-	.encoder0_type = VC4_ENCODER_TYPE_VEC,
-	.encoder1_type = VC4_ENCODER_TYPE_HDMI,
+	.encoder_types = {
+		[PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI] = VC4_ENCODER_TYPE_HDMI,
+		[PV_CONTROL_CLK_SELECT_VEC] = VC4_ENCODER_TYPE_VEC,
+	},
 };
 
 static const struct of_device_id vc4_crtc_dt_match[] = {
@@ -894,17 +899,20 @@ static void vc4_set_crtc_possible_masks(
 					struct drm_crtc *crtc)
 {
 	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
+	const struct vc4_crtc_data *crtc_data = vc4_crtc->data;
+	const enum vc4_encoder_type *encoder_types = crtc_data->encoder_types;
 	struct drm_encoder *encoder;
 
 	drm_for_each_encoder(encoder, drm) {
 		struct vc4_encoder *vc4_encoder = to_vc4_encoder(encoder);
+		int i;
 
-		if (vc4_encoder->type == vc4_crtc->data->encoder0_type) {
-			vc4_encoder->clock_select = 0;
-			encoder->possible_crtcs |= drm_crtc_mask(crtc);
-		} else if (vc4_encoder->type == vc4_crtc->data->encoder1_type) {
-			vc4_encoder->clock_select = 1;
-			encoder->possible_crtcs |= drm_crtc_mask(crtc);
+		for (i = 0; i < ARRAY_SIZE(crtc_data->encoder_types); i++) {
+			if (vc4_encoder->type == encoder_types[i]) {
+				vc4_encoder->clock_select = i;
+				encoder->possible_crtcs |= drm_crtc_mask(crtc);
+				break;
+			}
 		}
 	}
 }
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -194,6 +194,7 @@ to_vc4_plane(struct drm_plane *plane)
 }
 
 enum vc4_encoder_type {
+	VC4_ENCODER_TYPE_NONE,
 	VC4_ENCODER_TYPE_HDMI,
 	VC4_ENCODER_TYPE_VEC,
 	VC4_ENCODER_TYPE_DSI0,
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -177,8 +177,9 @@
 # define PV_CONTROL_WAIT_HSTART			BIT(12)
 # define PV_CONTROL_PIXEL_REP_MASK		VC4_MASK(5, 4)
 # define PV_CONTROL_PIXEL_REP_SHIFT		4
-# define PV_CONTROL_CLK_SELECT_DSI_VEC		0
+# define PV_CONTROL_CLK_SELECT_DSI		0
 # define PV_CONTROL_CLK_SELECT_DPI_SMI_HDMI	1
+# define PV_CONTROL_CLK_SELECT_VEC		2
 # define PV_CONTROL_CLK_SELECT_MASK		VC4_MASK(3, 2)
 # define PV_CONTROL_CLK_SELECT_SHIFT		2
 # define PV_CONTROL_FIFO_CLR			BIT(1)
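
The rework above indexes a small table by the clock-select value, so the array index is the register field value. The sketch below shows that pattern in isolation; the enum and function names are simplified, hypothetical stand-ins for the driver's. It also suggests why a zero-valued NONE enumerator is needed: slots left out of a designated initializer are zero-filled, and without NONE they would compare equal to the first real encoder type.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the driver's enums; NONE must be 0. */
enum enc_type { ENC_NONE, ENC_HDMI, ENC_VEC, ENC_DSI0 };
enum { CLK_SEL_DSI = 0, CLK_SEL_DPI_SMI_HDMI = 1, CLK_SEL_VEC = 2 };

#define NUM_CLK_SEL 4	/* a 2-bit field has four possible values */

/* Unlisted slots are zero-initialized, i.e. ENC_NONE. */
static const enum enc_type pv2_types[NUM_CLK_SEL] = {
	[CLK_SEL_DPI_SMI_HDMI] = ENC_HDMI,
	[CLK_SEL_VEC] = ENC_VEC,
};

/* Return the clock-select value for an encoder type, or -1 if unsupported. */
static int clock_select_for(const enum enc_type *types, enum enc_type want)
{
	assert(want != ENC_NONE);

	for (int i = 0; i < NUM_CLK_SEL; i++)
		if (types[i] == want)
			return i;
	return -1;
}
```

Looking up ENC_VEC in pv2_types returns 2, which matches the corrected PV_CONTROL_CLK_SELECT_VEC value, while ENC_DSI0 is reported as unsupported.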

[PATCH 4.9 59/93] PCI: Ignore BAR updates on virtual functions

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Bjorn Helgaas 

[ Upstream commit 63880b230a4af502c56dde3d4588634c70c66006 ]

VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.

We already ignore these updates because of 70675e0b6a1a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Gavin Shan 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/pci/pci.c   |4 
 drivers/pci/setup-res.c |5 ++---
 2 files changed, 2 insertions(+), 7 deletions(-)

--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -564,10 +564,6 @@ static void pci_restore_bars(struct pci_
 {
 	int i;
 
-	/* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
-	if (dev->is_virtfn)
-		return;
-
 	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
 		pci_update_resource(dev, i);
 }
--- a/drivers/pci/setup-res.c
+++ b/drivers/pci/setup-res.c
@@ -34,10 +34,9 @@ static void pci_std_update_resource(stru
 	int reg;
 	struct resource *res = dev->resource + resno;
 
-	if (dev->is_virtfn) {
-		dev_warn(&dev->dev, "can't update VF BAR%d\n", resno);
+	/* Per SR-IOV spec 3.4.1.11, VF BARs are RO zero */
+	if (dev->is_virtfn)
 		return;
-	}
 
 	/*
 	 * Ignore resources for unimplemented BARs and unused resource slots

[PATCH 4.9 60/93] PCI: Do any VF BAR updates before enabling the BARs

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Gavin Shan 

[ Upstream commit f40ec3c748c6912f6266c56a7f7992de61b255ed ]

Previously we enabled VFs and enabled their memory space before calling
pcibios_sriov_enable().  But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.

Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled.  The half-updated state may conflict with other
devices in the system.

Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.

[bhelgaas: changelog]
Tested-by: Carol Soto 
Signed-off-by: Gavin Shan 
Signed-off-by: Bjorn Helgaas 

Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/pci/iov.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -306,13 +306,6 @@ static int sriov_enable(struct pci_dev *
 			return rc;
 	}
 
-	pci_iov_set_numvfs(dev, nr_virtfn);
-	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
-	pci_cfg_access_lock(dev);
-	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
-	msleep(100);
-	pci_cfg_access_unlock(dev);
-
 	iov->initial_VFs = initial;
 	if (nr_virtfn < initial)
 		initial = nr_virtfn;
@@ -323,6 +316,13 @@ static int sriov_enable(struct pci_dev *
 		goto err_pcibios;
 	}
 
+	pci_iov_set_numvfs(dev, nr_virtfn);
+	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+	pci_cfg_access_lock(dev);
+	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	msleep(100);
+	pci_cfg_access_unlock(dev);
+
 	for (i = 0; i < initial; i++) {
 		rc = pci_iov_add_virtfn(dev, i, 0);
 		if (rc)
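
The changelog's point about 64-bit BARs can be made concrete with a toy model: the register pair is written as two 32-bit halves, so between the two writes the device decodes an address that is neither the old nor the new one. The struct and helpers below are hypothetical and ignore the low flag bits of a real BAR.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 64-bit BAR stored as two 32-bit config registers. */
struct bar64 {
	uint32_t lo;
	uint32_t hi;
};

static uint64_t bar64_value(const struct bar64 *b)
{
	return ((uint64_t)b->hi << 32) | b->lo;
}

/*
 * Update the BAR low half first, then the high half, and return the
 * address that was visible between the two writes.
 */
static uint64_t bar64_update(struct bar64 *b, uint64_t new_addr)
{
	b->lo = (uint32_t)new_addr;

	uint64_t transient = bar64_value(b);

	b->hi = (uint32_t)(new_addr >> 32);
	return transient;
}
```

Moving a BAR from 0x1F0000000 to 0x200000000 passes through 0x100000000, which could overlap another device's window if decoding is enabled at that moment. Doing the update before setting VFE/MSE avoids that exposure.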

[PATCH 4.9 62/93] Drivers: hv: ring_buffer: count on wrap around mappings in get_next_pkt_raw() (v2)

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Vitaly Kuznetsov 

[ Upstream commit fa32ff6576623616c1751562edaed8c164ca5199 ]

With wrap around mappings in place we can always provide drivers with
direct links to packets on the ring buffer, even when they wrap around.
Do the required updates to get_next_pkt_raw()/put_pkt_raw().

The first version of this commit was reverted (65a532f3d50a) to deal with
cross-tree merge issues which are (hopefully) resolved now.

Signed-off-by: Vitaly Kuznetsov 
Signed-off-by: K. Y. Srinivasan 
Tested-by: Dexuan Cui 
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 include/linux/hyperv.h |   32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1548,31 +1548,23 @@ static inline struct vmpacket_descriptor
 get_next_pkt_raw(struct vmbus_channel *channel)
 {
 	struct hv_ring_buffer_info *ring_info = &channel->inbound;
-	u32 read_loc = ring_info->priv_read_index;
+	u32 priv_read_loc = ring_info->priv_read_index;
 	void *ring_buffer = hv_get_ring_buffer(ring_info);
-	struct vmpacket_descriptor *cur_desc;
-	u32 packetlen;
 	u32 dsize = ring_info->ring_datasize;
-	u32 delta = read_loc - ring_info->ring_buffer->read_index;
+	/*
+	 * delta is the difference between what is available to read and
+	 * what was already consumed in place. We commit read index after
+	 * the whole batch is processed.
+	 */
+	u32 delta = priv_read_loc >= ring_info->ring_buffer->read_index ?
+		priv_read_loc - ring_info->ring_buffer->read_index :
+		(dsize - ring_info->ring_buffer->read_index) + priv_read_loc;
 	u32 bytes_avail_toread = (hv_get_bytes_to_read(ring_info) - delta);
 
 	if (bytes_avail_toread < sizeof(struct vmpacket_descriptor))
 		return NULL;
 
-	if ((read_loc + sizeof(*cur_desc)) > dsize)
-		return NULL;
-
-	cur_desc = ring_buffer + read_loc;
-	packetlen = cur_desc->len8 << 3;
-
-	/*
-	 * If the packet under consideration is wrapping around,
-	 * return failure.
-	 */
-	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > (dsize - 1))
-		return NULL;
-
-	return cur_desc;
+	return ring_buffer + priv_read_loc;
 }
 
 /*
@@ -1584,16 +1576,14 @@ static inline void put_pkt_raw(struct vm
 		struct vmpacket_descriptor *desc)
 {
 	struct hv_ring_buffer_info *ring_info = &channel->inbound;
-	u32 read_loc = ring_info->priv_read_index;
 	u32 packetlen = desc->len8 << 3;
 	u32 dsize = ring_info->ring_datasize;
 
-	if ((read_loc + packetlen + VMBUS_PKT_TRAILER) > dsize)
-		BUG();
 	/*
 	 * Include the packet trailer.
 	 */
 	ring_info->priv_read_index += packetlen + VMBUS_PKT_TRAILER;
+	ring_info->priv_read_index %= dsize;
 }
 
 /*
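
The two index calculations in this patch are ordinary circular-buffer arithmetic and can be checked in isolation. The helpers below are hypothetical user-space versions: consumed_in_place() mirrors the new delta expression, and advance() mirrors the wrapped update of the private read index. The trailer size is passed in as a parameter rather than assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes consumed in place since the committed read index, allowing for wrap. */
static uint32_t consumed_in_place(uint32_t priv_read, uint32_t read_index,
				  uint32_t dsize)
{
	assert(priv_read < dsize && read_index < dsize);

	return priv_read >= read_index ?
		priv_read - read_index :
		(dsize - read_index) + priv_read;
}

/* Advance the private read index past one packet and its trailer. */
static uint32_t advance(uint32_t priv_read, uint32_t packetlen,
			uint32_t trailer, uint32_t dsize)
{
	assert(dsize != 0);
	return (priv_read + packetlen + trailer) % dsize;
}
```

In a 4096-byte ring, a private index of 16 with a committed index of 4000 means 112 bytes were consumed across the wrap, and advancing from 4080 by a 24-byte packet plus an 8-byte trailer lands on 16.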

[PATCH 4.9 78/93] ACPI / blacklist: Make Dell Latitude 3350 ethernet work

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Michael Pobega 

[ Upstream commit 708f5dcc21ae9b35f395865fc154b0105baf4de4 ]

The Dell Latitude 3350's ethernet card attempts to use a reserved
IRQ (18), resulting in ACPI being unable to enable the ethernet.

Adding it to acpi_rev_dmi_table[] helps to work around this problem.

Signed-off-by: Michael Pobega 
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki 

Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/acpi/blacklist.c |   12 
 1 file changed, 12 insertions(+)

--- a/drivers/acpi/blacklist.c
+++ b/drivers/acpi/blacklist.c
@@ -176,6 +176,18 @@ static struct dmi_system_id acpi_rev_dmi
 		      DMI_MATCH(DMI_PRODUCT_NAME, "Precision 3520"),
 		},
 	},
+	/*
+	 * Resolves a quirk with the Dell Latitude 3350 that
+	 * causes the ethernet adapter to not function.
+	 */
+	{
+	 .callback = dmi_enable_rev_override,
+	 .ident = "DELL Latitude 3350",
+	 .matches = {
+		      DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+		      DMI_MATCH(DMI_PRODUCT_NAME, "Latitude 3350"),
+		},
+	},
 #endif
 	{}
 };
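
For anyone new to DMI quirk tables, the pattern is a sentinel-terminated array that is scanned for an entry whose strings all match the running system. The sketch below is a user-space approximation with hypothetical names; it assumes substring matching, which is how DMI_MATCH() behaves as far as I know, and it does not model the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified quirk entry: vendor and product substrings. */
struct quirk {
	const char *vendor;
	const char *product;
};

static const struct quirk rev_override_table[] = {
	{ "Dell Inc.", "Precision 3520" },
	{ "Dell Inc.", "Latitude 3350" },
	{ NULL, NULL }	/* sentinel terminates the table */
};

/* Return 1 if the system matches any entry in the table, 0 otherwise. */
static int needs_rev_override(const char *vendor, const char *product)
{
	assert(vendor != NULL && product != NULL);

	for (const struct quirk *q = rev_override_table; q->vendor; q++)
		if (strstr(vendor, q->vendor) && strstr(product, q->product))
			return 1;
	return 0;
}
```

A "Dell Inc." / "Latitude 3350" system matches the second entry, while a "Latitude 7480" matches nothing and keeps the default behaviour.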

[PATCH 4.9 65/93] powerpc/iommu: Stop using @current in mm_iommu_xxx

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexey Kardashevskiy 

[ Upstream commit d7baee6901b34c4895eb78efdbf13a49079d7404 ]

This changes mm_iommu_xxx helpers to take mm_struct as a parameter
instead of getting it from @current which in some situations may
not have a valid reference to mm.

This changes helpers to receive @mm and moves all references to @current
to the caller, including checks for !current and !current->mm;
checks in mm_iommu_preregistered() are removed as there is no caller
yet.

This moves the mm_iommu_adjust_locked_vm() call to the caller as
it receives mm_iommu_table_group_mem_t but it needs mm.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy 
Reviewed-by: David Gibson 
Acked-by: Alex Williamson 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/include/asm/mmu_context.h |   16 ++-
 arch/powerpc/mm/mmu_context_iommu.c|   46 -
 drivers/vfio/vfio_iommu_spapr_tce.c|   14 +++---
 3 files changed, 36 insertions(+), 40 deletions(-)

--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -19,16 +19,18 @@ extern void destroy_context(struct mm_st
 struct mm_iommu_table_group_mem_t;
 
 extern int isolate_lru_page(struct page *page);/* from internal.h */
-extern bool mm_iommu_preregistered(void);
-extern long mm_iommu_get(unsigned long ua, unsigned long entries,
+extern bool mm_iommu_preregistered(struct mm_struct *mm);
+extern long mm_iommu_get(struct mm_struct *mm,
+   unsigned long ua, unsigned long entries,
struct mm_iommu_table_group_mem_t **pmem);
-extern long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem);
+extern long mm_iommu_put(struct mm_struct *mm,
+   struct mm_iommu_table_group_mem_t *mem);
 extern void mm_iommu_init(struct mm_struct *mm);
 extern void mm_iommu_cleanup(struct mm_struct *mm);
-extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
-   unsigned long size);
-extern struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
-   unsigned long entries);
+extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(struct mm_struct *mm,
+   unsigned long ua, unsigned long size);
+extern struct mm_iommu_table_group_mem_t *mm_iommu_find(struct mm_struct *mm,
+   unsigned long ua, unsigned long entries);
 extern long mm_iommu_ua_to_hpa(struct mm_iommu_table_group_mem_t *mem,
unsigned long ua, unsigned long *hpa);
 extern long mm_iommu_mapped_inc(struct mm_iommu_table_group_mem_t *mem);
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -56,7 +56,7 @@ static long mm_iommu_adjust_locked_vm(st
}
 
pr_debug("[%d] RLIMIT_MEMLOCK HASH64 %c%ld %ld/%ld\n",
-   current->pid,
+   current ? current->pid : 0,
incr ? '+' : '-',
npages << PAGE_SHIFT,
mm->locked_vm << PAGE_SHIFT,
@@ -66,12 +66,9 @@ static long mm_iommu_adjust_locked_vm(st
return ret;
 }
 
-bool mm_iommu_preregistered(void)
+bool mm_iommu_preregistered(struct mm_struct *mm)
 {
-   if (!current || !current->mm)
-   return false;
-
-   return !list_empty(&current->mm->context.iommu_group_mem_list);
+   return !list_empty(&mm->context.iommu_group_mem_list);
 }
 EXPORT_SYMBOL_GPL(mm_iommu_preregistered);
 
@@ -124,19 +121,16 @@ static int mm_iommu_move_page_from_cma(s
return 0;
 }
 
-long mm_iommu_get(unsigned long ua, unsigned long entries,
+long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
struct mm_iommu_table_group_mem_t **pmem)
 {
struct mm_iommu_table_group_mem_t *mem;
long i, j, ret = 0, locked_entries = 0;
struct page *page = NULL;
 
-   if (!current || !current->mm)
-   return -ESRCH; /* process exited */
-
    mutex_lock(&mem_list_mutex);
 
-   list_for_each_entry_rcu(mem, &current->mm->context.iommu_group_mem_list,
+   list_for_each_entry_rcu(mem, &mm->context.iommu_group_mem_list,
next) {
if ((mem->ua == ua) && (mem->entries == entries)) {
++mem->used;
@@ -154,7 +148,7 @@ long mm_iommu_get(unsigned long ua, unsi
 
}
 
-   ret = mm_iommu_adjust_locked_vm(current->mm, entries, true);
+   ret = mm_iommu_adjust_locked_vm(mm, entries, true);
if (ret)
goto unlock_exit;
 
@@ -215,11 +209,11 @@ populate:
mem->entries = entries;
*pmem = mem;
 
-   list_add_rcu(&mem->next, &current->mm->context.iommu_group_mem_list);
+   list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);
 
 unlock_exit:
if (locked_entries && ret)
-

[PATCH 4.9 64/93] powerpc/iommu: Pass mm_struct to init/cleanup helpers

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexey Kardashevskiy 

[ Upstream commit 88f54a3581eb9deaa3bd1aade40aef266d782385 ]

We are going to get rid of @current references in mmu_context_book3s64.c
and cache mm_struct in the VFIO container. Since mm_context_t does not
have reference counting, we will be using mm_struct which does have
the reference counter.

This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather
than mm_context_t (which is embedded into mm).

This should not cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy 
Reviewed-by: David Gibson 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/include/asm/mmu_context.h |4 ++--
 arch/powerpc/kernel/setup-common.c |2 +-
 arch/powerpc/mm/mmu_context_book3s64.c |4 ++--
 arch/powerpc/mm/mmu_context_iommu.c|9 +
 4 files changed, 10 insertions(+), 9 deletions(-)

--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,8 +23,8 @@ extern bool mm_iommu_preregistered(void)
 extern long mm_iommu_get(unsigned long ua, unsigned long entries,
struct mm_iommu_table_group_mem_t **pmem);
 extern long mm_iommu_put(struct mm_iommu_table_group_mem_t *mem);
-extern void mm_iommu_init(mm_context_t *ctx);
-extern void mm_iommu_cleanup(mm_context_t *ctx);
+extern void mm_iommu_init(struct mm_struct *mm);
+extern void mm_iommu_cleanup(struct mm_struct *mm);
 extern struct mm_iommu_table_group_mem_t *mm_iommu_lookup(unsigned long ua,
unsigned long size);
 extern struct mm_iommu_table_group_mem_t *mm_iommu_find(unsigned long ua,
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -915,7 +915,7 @@ void __init setup_arch(char **cmdline_p)
init_mm.context.pte_frag = NULL;
 #endif
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-   mm_iommu_init(&init_mm.context);
+   mm_iommu_init(&init_mm);
 #endif
irqstack_early_init();
exc_lvl_early_init();
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -115,7 +115,7 @@ int init_new_context(struct task_struct
mm->context.pte_frag = NULL;
 #endif
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-   mm_iommu_init(&mm->context);
+   mm_iommu_init(mm);
 #endif
return 0;
 }
@@ -160,7 +160,7 @@ static inline void destroy_pagetable_pag
 void destroy_context(struct mm_struct *mm)
 {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
-   mm_iommu_cleanup(&mm->context);
+   mm_iommu_cleanup(mm);
 #endif
 
 #ifdef CONFIG_PPC_ICSWX
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -373,16 +373,17 @@ void mm_iommu_mapped_dec(struct mm_iommu
 }
 EXPORT_SYMBOL_GPL(mm_iommu_mapped_dec);
 
-void mm_iommu_init(mm_context_t *ctx)
+void mm_iommu_init(struct mm_struct *mm)
 {
-   INIT_LIST_HEAD_RCU(&ctx->iommu_group_mem_list);
+   INIT_LIST_HEAD_RCU(&mm->context.iommu_group_mem_list);
 }
 
-void mm_iommu_cleanup(mm_context_t *ctx)
+void mm_iommu_cleanup(struct mm_struct *mm)
 {
struct mm_iommu_table_group_mem_t *mem, *tmp;
 
-   list_for_each_entry_safe(mem, tmp, &ctx->iommu_group_mem_list, next) {
+   list_for_each_entry_safe(mem, tmp, &mm->context.iommu_group_mem_list,
+   next) {
    list_del_rcu(&mem->next);
mm_iommu_do_free(mem);
}

[PATCH 4.9 63/93] vfio/spapr: Postpone allocation of userspace version of TCE table

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexey Kardashevskiy 

[ Upstream commit 39701e56f5f16ea0cf8fc9e8472e645f8de91d23 ]

The iommu_table struct manages a hardware TCE table and a vmalloc'd
table with corresponding userspace addresses. Both are allocated when
the default DMA window is created and this happens when the very first
group is attached to a container.

As we are going to allow userspace to configure a container in one
memory context and pass the container fd to another, we have to postpone
such allocations until the container fd is passed to the destination
user process, so that the locked memory limit is accounted against the
actual container user's constraints.

This postpones the it_userspace array allocation until it is first used
for mapping. The unmapping path already checks whether the array is
allocated.

Signed-off-by: Alexey Kardashevskiy 
Reviewed-by: David Gibson 
Acked-by: Alex Williamson 
Signed-off-by: Michael Ellerman 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/vfio/vfio_iommu_spapr_tce.c |   20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -509,6 +509,12 @@ static long tce_iommu_build_v2(struct tc
unsigned long hpa;
enum dma_data_direction dirtmp;
 
+   if (!tbl->it_userspace) {
+   ret = tce_iommu_userspace_view_alloc(tbl);
+   if (ret)
+   return ret;
+   }
+
for (i = 0; i < pages; ++i) {
struct mm_iommu_table_group_mem_t *mem = NULL;
unsigned long *pua = IOMMU_TABLE_USERSPACE_ENTRY(tbl,
@@ -582,15 +588,6 @@ static long tce_iommu_create_table(struc
WARN_ON(!ret && !(*ptbl)->it_ops->free);
WARN_ON(!ret && ((*ptbl)->it_allocated_size != table_size));
 
-   if (!ret && container->v2) {
-   ret = tce_iommu_userspace_view_alloc(*ptbl);
-   if (ret)
-   (*ptbl)->it_ops->free(*ptbl);
-   }
-
-   if (ret)
-   decrement_locked_vm(table_size >> PAGE_SHIFT);
-
return ret;
 }
 
@@ -1062,10 +1059,7 @@ static int tce_iommu_take_ownership(stru
if (!tbl || !tbl->it_map)
continue;
 
-   rc = tce_iommu_userspace_view_alloc(tbl);
-   if (!rc)
-   rc = iommu_take_ownership(tbl);
-
+   rc = iommu_take_ownership(tbl);
if (rc) {
for (j = 0; j < i; ++j)
iommu_release_ownership(
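
The allocate-on-first-use pattern that this patch moves into the mapping
path can be sketched in plain userspace C as follows. This is only an
illustration under stated assumptions: demo_table, demo_map and the
other demo_* names are hypothetical, and calloc() stands in for whatever
allocator the driver really uses.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a table with an optional userspace view. */
struct demo_table {
	size_t entries;
	unsigned long *userspace;	/* NULL until the first mapping */
};

/* Allocate the view; the caller checks that it does not exist yet. */
static long demo_userspace_view_alloc(struct demo_table *tbl)
{
	tbl->userspace = calloc(tbl->entries, sizeof(*tbl->userspace));
	return tbl->userspace ? 0 : -ENOMEM;
}

/* Map one entry, allocating the view lazily on first use. */
static long demo_map(struct demo_table *tbl, size_t idx, unsigned long ua)
{
	long ret;

	if (idx >= tbl->entries)
		return -EINVAL;

	if (!tbl->userspace) {
		ret = demo_userspace_view_alloc(tbl);
		if (ret)
			return ret;
	}

	tbl->userspace[idx] = ua;
	return 0;
}

/* Unmap tolerates a table whose view was never allocated. */
static void demo_unmap(struct demo_table *tbl, size_t idx)
{
	if (!tbl->userspace || idx >= tbl->entries)
		return;
	tbl->userspace[idx] = 0;
}
```

The point of the deferral is that the allocation (and whatever accounting
goes with it) happens in the context of the process that actually maps
memory, not the one that created the table.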

[PATCH v5 0/7] Xen transport for 9pfs frontend driver

2017-03-20 Thread Stefano Stabellini

Hi all,

This patch series implements a new transport for 9pfs, aimed at Xen
systems.

The transport is based on a traditional Xen frontend and backend driver
pair. This patch series implements the frontend, which typically runs in
a regular unprivileged guest.

I also sent a series that implements the backend in userspace in QEMU,
which typically runs in Dom0 (but could also run in another guest).

The frontend complies with the Xen transport for 9pfs specification
version 1, available here:

https://xenbits.xen.org/docs/unstable/misc/9pfs.html


Changes in v5:
- test priv->tag instead of ret
- run checkpatch.pl against the whole series, fix all issues
- set intf->ring_order appropriately
- use shorter link to 9pfs spec

Changes in v4:
- code style improvements
- use xenbus_read_unsigned when possible
- do not leak "versions"
- introduce BUILD_BUG_ON
- introduce rwlock to protect the xen_9pfs_devs list
- add review-by

Changes in v3:
- add full copyright header to trans_xen.c
- rename ring->ring to ring->data
- handle gnttab_grant_foreign_access errors
- remove ring->bytes
- wrap long lines
- add reviewed-by

Changes in v2:
- use XEN_PAGE_SHIFT instead of PAGE_SHIFT
- remove unnecessary initializations
- fix error paths
- fix memory allocations for 64K kernels
- simplify p9_xen_create and p9_xen_close
- use virt_XXX barriers
- set status = REQ_STATUS_ERROR inside the p9_xen_response loop
- add in-code comments


Stefano Stabellini (7):
  xen: import new ring macros in ring.h
  xen: introduce the header file for the Xen 9pfs transport protocol
  xen/9pfs: introduce Xen 9pfs transport driver
  xen/9pfs: connect to the backend
  xen/9pfs: send requests to the backend
  xen/9pfs: receive responses
  xen/9pfs: build 9pfs Xen transport driver

 include/xen/interface/io/9pfs.h |  42 
 include/xen/interface/io/ring.h | 131 ++
 net/9p/Kconfig  |   8 +
 net/9p/Makefile |   4 +
 net/9p/trans_xen.c  | 541 
 5 files changed, 726 insertions(+)
 create mode 100644 include/xen/interface/io/9pfs.h
 create mode 100644 net/9p/trans_xen.c

Cheers,

Stefano

[PATCH 4.9 56/93] PCI: Decouple IORESOURCE_ROM_ENABLE and PCI_ROM_ADDRESS_ENABLE

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Bjorn Helgaas 

[ Upstream commit 7a6d312b50e63f598f5b5914c4fd21878ac2b595 ]

Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.

Signed-off-by: Bjorn Helgaas 
Reviewed-by: Gavin Shan 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/pci/probe.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -227,7 +227,8 @@ int __pci_read_base(struct pci_dev *dev,
mask64 = (u32)PCI_BASE_ADDRESS_MEM_MASK;
}
} else {
-   res->flags |= (l & IORESOURCE_ROM_ENABLE);
+   if (l & PCI_ROM_ADDRESS_ENABLE)
+   res->flags |= IORESOURCE_ROM_ENABLE;
l64 = l & PCI_ROM_ADDRESS_MASK;
sz64 = sz & PCI_ROM_ADDRESS_MASK;
mask64 = (u32)PCI_ROM_ADDRESS_MASK;
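
To see why the old one-liner only works when the two constants share a
value, consider the small standalone sketch below. The DEMO_* constants
are hypothetical and deliberately different from each other; they are not
the values of PCI_ROM_ADDRESS_ENABLE or IORESOURCE_ROM_ENABLE.

```c
#include <stdint.h>

/* Hypothetical values: the register bit and the flag bit differ on purpose. */
#define DEMO_ROM_ADDRESS_ENABLE   0x01u	/* bit in the raw register value */
#define DEMO_RESOURCE_ROM_ENABLE  0x04u	/* bit in the software flags word */

/* Coupled form: only correct if both constants happen to be equal. */
static uint32_t flags_coupled(uint32_t reg)
{
	return reg & DEMO_RESOURCE_ROM_ENABLE;
}

/* Decoupled form: test the register bit, then set the flag bit. */
static uint32_t flags_decoupled(uint32_t reg)
{
	uint32_t flags = 0;

	if (reg & DEMO_ROM_ADDRESS_ENABLE)
		flags |= DEMO_RESOURCE_ROM_ENABLE;
	return flags;
}
```

With these made-up values, a register value with only the enable bit set
yields no flag from flags_coupled() but the expected flag from
flags_decoupled(); an unrelated register bit that collides with the flag
value produces the opposite error.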

[PATCH 4.9 93/93] crypto: powerpc - Fix initialisation of crc32c context

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Daniel Axtens 

commit aa2be9b3d6d2d699e9ca7cbfc00867c80e5da213 upstream.

Turning on crypto self-tests on a POWER8 shows:

alg: hash: Test 1 failed for crc32c-vpmsum
: ff ff ff ff

Comparing the code with the Intel CRC32c implementation on which
ours is based shows that we are doing an init with 0, not ~0
as CRC32c requires.

This probably wasn't caught because btrfs does its own weird
open-coded initialisation.

Initialise our internal context to ~0 on init.

This makes the self-tests pass, and btrfs continues to work.

Fixes: 6dd7a82cc54e ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: Anton Blanchard 
Signed-off-by: Daniel Axtens 
Acked-by: Anton Blanchard 
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/powerpc/crypto/crc32c-vpmsum_glue.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/crypto/crc32c-vpmsum_glue.c
+++ b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
@@ -52,7 +52,7 @@ static int crc32c_vpmsum_cra_init(struct
 {
u32 *key = crypto_tfm_ctx(tfm);
 
-   *key = 0;
+   *key = ~0;
 
return 0;
 }
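
The effect of the seed can be reproduced with a slow bitwise CRC32c in
userspace. The sketch below is an assumption-laden illustration, not the
vpmsum code: demo_crc32c_update() and demo_crc32c_digest() are
hypothetical names, and the final inversion is done in the digest helper
for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Raw CRC32c update (Castagnoli, reflected polynomial 0x82F63B78). */
static uint32_t demo_crc32c_update(uint32_t crc, const void *buf, size_t len)
{
	const uint8_t *p = buf;

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* One-shot digest: start from the given seed, invert at the end. */
static uint32_t demo_crc32c_digest(uint32_t seed, const void *buf, size_t len)
{
	return ~demo_crc32c_update(seed, buf, len);
}
```

Seeding with ~0 gives the well-known CRC32c check value 0xE3069283 for
the ASCII string "123456789"; seeding with 0 gives a different result for
the same input, which is the kind of mismatch a self-test catches.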

[PATCH 4.9 88/93] x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Andrey Ryabinin 

commit be3606ff739d1c1be36389f8737c577ad87e1f57 upstream.

The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before KASAN has been
initialized leads to a crash.

Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().

Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu 
Signed-off-by: Andrey Ryabinin 
Cc: kasan-...@googlegroups.com
Cc: Alexander Potapenko 
Cc: Andrew Morton 
Cc: l...@01.org
Cc: Dmitry Vyukov 
Link: http://lkml.kernel.org/r/20170313163337.1704-1-aryabi...@virtuozzo.com
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/x86/kernel/head64.c|1 +
 arch/x86/mm/kasan_init_64.c |1 +
 2 files changed, 2 insertions(+)

--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -4,6 +4,7 @@
  *  Copyright (C) 2000 Andrea Arcangeli  SuSE
  */
 
+#define DISABLE_BRANCH_PROFILING
 #include 
 #include 
 #include 
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -1,3 +1,4 @@
+#define DISABLE_BRANCH_PROFILING
 #define pr_fmt(fmt) "kasan: " fmt
 #include 
 #include
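
The mechanism relied on here, a macro defined before the headers that
consult it, can be illustrated with a self-contained sketch. All demo_*
names are hypothetical, and the two demo_likely() variants only imitate
the idea of a profiled versus a plain branch hint; they are not the
kernel's definitions.

```c
/* Must come before the macro definitions it controls, as in the patch. */
#define DISABLE_BRANCH_PROFILING

static unsigned long demo_profile_hits;

/* Hypothetical profiling hook, standing in for an instrumented helper. */
__attribute__((unused))
static int demo_profile(int cond)
{
	demo_profile_hits++;
	return cond;
}

#ifdef DISABLE_BRANCH_PROFILING
/* Plain hint: no call into the profiling hook. */
#define demo_likely(x) __builtin_expect(!!(x), 1)
#else
/* Profiled hint: every evaluation calls the hook first. */
#define demo_likely(x) __builtin_expect(!!demo_profile(x), 1)
#endif

/* Stand-in for early code that must not reach the profiling hook. */
static int demo_early_code(int value)
{
	if (demo_likely(value > 0))
		return 1;
	return 0;
}
```

Because the choice is made by the preprocessor per translation unit, only
the files that carry the define lose the profiling call; the rest of the
build is unaffected.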

[PATCH 4.10 03/63] net/mlx5e: Fix broken CQE compression initialization

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Tariq Toukan 


[ Upstream commit b0d4660b4cc52e6477ca3a43435351d565dfcedc ]

Some RQ type parameters are derived from the CQE compression state flag,
but that flag was initialized only after the RQ type parameters were
set up. As a result, the RQ is loaded with a smaller stride size than
we want when CQE compression is on.

This bug causes no functional damage; it only makes CQE compression
occur less often, since on ConnectX4-LX CQE compression is performed
only on packets smaller than the stride size.

Fix this by setting the default CQE compression status in the PFLAG before
calling mlx5e_set_rq_priv_params(), which initializes some fields based on it.

Tested:
 load driver on systems where rx CQE compress will be on (MH)
 pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
 verify `ethtool -S ethxx | grep compress` are advancing more often
 (rapidly)

Fixes: 2fc4bfb7250d ("net/mlx5e: Dynamic RQ type infrastructure")
Signed-off-by: Tariq Toukan 
Cc: kernel-t...@fb.com
Signed-off-by: Saeed Mahameed 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3500,6 +3500,9 @@ static void mlx5e_build_nic_netdev_priv(
cqe_compress_heuristic(link_speed, pci_bw);
}
 
+   MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS,
+   priv->params.rx_cqe_compress_def);
+
mlx5e_set_rq_priv_params(priv);
if (priv->params.rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ)
priv->params.lro_en = true;
@@ -3525,7 +3528,6 @@ static void mlx5e_build_nic_netdev_priv(
/* Initialize pflags */
MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
priv->params.rx_cq_period_mode == 
MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
-   MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS, 
priv->params.rx_cqe_compress_def);
 
mutex_init(&priv->state_lock);
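
The ordering problem fixed by the patch above can be shown with a small
userspace sketch. This is not driver code: struct toy_params, the toy_*
helpers and the stride sizes 64 and 256 are all made up for illustration.
The only point carried over from the changelog is that a value derived
from a flag is wrong if the flag is set after the derivation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver parameters; not the mlx5e layout. */
struct toy_params {
    bool cqe_compress_def;   /* default picked earlier by some heuristic */
    bool cqe_compress_flag;  /* flag that the RQ setup actually reads */
    unsigned int stride_sz;  /* value derived from the flag */
};

/* Derive the stride size from the flag. The sizes are illustrative only. */
static void toy_set_rq_params(struct toy_params *p)
{
    assert(p != NULL);
    p->stride_sz = p->cqe_compress_flag ? 256 : 64;
}

/* Old order: derive first, set the flag afterwards (flag is still false). */
static unsigned int toy_init_buggy(struct toy_params *p)
{
    toy_set_rq_params(p);
    p->cqe_compress_flag = p->cqe_compress_def;
    return p->stride_sz;
}

/* Fixed order: set the flag first, then derive from it. */
static unsigned int toy_init_fixed(struct toy_params *p)
{
    p->cqe_compress_flag = p->cqe_compress_def;
    toy_set_rq_params(p);
    return p->stride_sz;
}
```

With cqe_compress_def set to true, toy_init_buggy() ends up with the small
stride while toy_init_fixed() gets the large one, which mirrors the
"stride size smaller than what we want" symptom in the changelog.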

[PATCH 4.9 92/93] locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Niklas Cassel 

commit 17fcbd590d0c3e35bd9646e2215f86586378bc42 upstream.

We hang if SIGKILL has been sent, but the task is stuck in down_read()
(after do_exit()), even though no task is doing down_write() on the
rwsem in question:

  INFO: task libupnp:21868 blocked for more than 120 seconds.
  libupnp D0 21868  1 0x0818
  ...
  Call Trace:
  __schedule()
  schedule()
  __down_read()
  do_exit()
  do_group_exit()
  __wake_up_parent()

This bug has already been fixed for CONFIG_RWSEM_XCHGADD_ALGORITHM=y in
the following commit:

 04cafed7fc19 ("locking/rwsem: Fix down_write_killable()")

... however, this bug also exists for CONFIG_RWSEM_GENERIC_SPINLOCK=y.

Signed-off-by: Niklas Cassel 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Niklas Cassel 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Fixes: d47996082f52 ("locking/rwsem: Introduce basis for down_write_killable()")
Link: http://lkml.kernel.org/r/1487981873-12649-1-git-send-email-nikl...@axis.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/locking/rwsem-spinlock.c |   15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/kernel/locking/rwsem-spinlock.c
+++ b/kernel/locking/rwsem-spinlock.c
@@ -216,10 +216,8 @@ int __sched __down_write_common(struct r
 */
if (sem->count == 0)
break;
-   if (signal_pending_state(state, current)) {
-   ret = -EINTR;
-   goto out;
-   }
+   if (signal_pending_state(state, current))
+   goto out_nolock;
set_task_state(tsk, state);
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
schedule();
@@ -227,12 +225,19 @@ int __sched __down_write_common(struct r
}
/* got the lock */
sem->count = -1;
-out:
list_del(&waiter.list);
 
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
 
return ret;
+
+out_nolock:
+   list_del(&waiter.list);
+   if (!list_empty(&sem->wait_list))
+   __rwsem_do_wake(sem, 1);
+   raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
+
+   return -EINTR;
 }
 
 void __sched __down_write(struct rw_semaphore *sem)
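
The effect of the new out_nolock path can be sketched with a toy wait
queue in userspace. This is a simplified model, not the kernel code: it
has no sem->count and no locking, and every toy_* name is hypothetical.
It only shows why a killed writer that leaves the queue should wake
whoever is queued behind it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_type { TOY_READER, TOY_WRITER };

struct toy_waiter {
    enum toy_type type;
    bool woken;
};

#define TOY_MAX_WAITERS 8

/* FIFO of waiters; index 0 is the head of the queue. */
struct toy_sem {
    struct toy_waiter *queue[TOY_MAX_WAITERS];
    size_t nwaiters;
};

static void toy_enqueue(struct toy_sem *sem, struct toy_waiter *w)
{
    assert(sem->nwaiters < TOY_MAX_WAITERS);
    w->woken = false;
    sem->queue[sem->nwaiters++] = w;
}

/* Remove the waiter at idx, keeping the order of the others. */
static void toy_dequeue(struct toy_sem *sem, size_t idx)
{
    assert(idx < sem->nwaiters);
    for (size_t i = idx; i + 1 < sem->nwaiters; i++)
        sem->queue[i] = sem->queue[i + 1];
    sem->nwaiters--;
}

/* Wake the head: one writer, or every reader ahead of the next writer. */
static void toy_wake_front(struct toy_sem *sem)
{
    if (sem->nwaiters == 0)
        return;
    if (sem->queue[0]->type == TOY_WRITER) {
        sem->queue[0]->woken = true;
        return;
    }
    for (size_t i = 0;
         i < sem->nwaiters && sem->queue[i]->type == TOY_READER; i++)
        sem->queue[i]->woken = true;
}

/* A killed writer leaves; wake_rest=false is the old behaviour. */
static void toy_writer_killed(struct toy_sem *sem, size_t idx, bool wake_rest)
{
    toy_dequeue(sem, idx);
    if (wake_rest && sem->nwaiters > 0)
        toy_wake_front(sem);
}
```

In this model a reader queued behind the killed writer stays asleep when
wake_rest is false, which corresponds to the hung task in the report.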

Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

2017-03-20 Thread Tejun Heo

Hello,

On Tue, Feb 28, 2017 at 02:38:38PM +, Patrick Bellasi wrote:
> This patch extends the CPU controller by adding a couple of new
> attributes, capacity_min and capacity_max, which can be used to enforce
> bandwidth boosting and capping. More specifically:
> 
> - capacity_min: defines the minimum capacity which should be granted
> (by schedutil) when a task in this group is running,
> i.e. the task will run at least at that capacity
> 
> - capacity_max: defines the maximum capacity which can be granted
> (by schedutil) when a task in this group is running,
> i.e. the task can run up to that capacity

cpu.capacity.min and cpu.capacity.max are the more conventional names.
I'm not sure about the name capacity as it doesn't encode what it does
and is difficult to tell apart from cpu bandwidth limits.  I think
it'd be better to represent what it controls more explicitly.

> These attributes:
> a) are tunable at all hierarchy levels, i.e. root group too

This usually is problematic because there should be a non-cgroup way
of configuring the feature in case cgroup isn't configured or used,
and it becomes awkward to have two separate mechanisms configuring the
same thing.  Maybe the feature is cgroup specific enough that it makes
sense here but this needs more explanation / justification.

> b) allow to create subgroups of tasks which are not violating the
>capacity constraints defined by the parent group.
>Thus, tasks on a subgroup can only be more boosted and/or more

For both limits and protections, the parent caps the maximum the
children can get.  At least that's what memcg does for memory.low.
Doing that makes sense for memcg because for memory the parent can
still do protections regardless of what its children are doing and it
makes delegation safe by default.

I understand why you would want a property like capacity to be the
other direction as that way you get more specific as you walk down the
tree for both limits and protections; however, I think we need to
think a bit more about it and ensure that the resulting interface
isn't confusing.  Would it work for capacity to behave in the other
direction, i.e. a parent's min restricting the highest min that its
descendants can get?  It's completely fine if that's weird.

Thanks.

-- 
tejun
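
The "parent caps the maximum the children can get" rule from the reply
above can be sketched as follows. This is only an illustration: struct
toy_group and toy_effective_min() are hypothetical names, not cgroup
code, and the numbers used with it are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group node; parent is NULL for the root. */
struct toy_group {
    const struct toy_group *parent;
    unsigned int min;               /* configured protection value */
};

/*
 * Effective value of a group: its own configured value, clamped by
 * every ancestor, so no descendant can get more than its parent allows.
 */
static unsigned int toy_effective_min(const struct toy_group *g)
{
    unsigned int eff;

    assert(g != NULL);
    eff = g->min;
    for (const struct toy_group *p = g->parent; p != NULL; p = p->parent)
        if (p->min < eff)
            eff = p->min;
    return eff;
}
```

For example, a child configured with 800 under a parent configured with
512 gets an effective value of 512, while a child configured with 300
keeps 300.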

Re: [PATCH v1 0/3] ioremap() tidy-up

2017-03-20 Thread Bjorn Helgaas

On Fri, Mar 17, 2017 at 11:00:03PM +0100, Arnd Bergmann wrote:
> On Fri, Mar 17, 2017 at 6:46 PM, Bjorn Helgaas  wrote:
> > 1) Fix some comments that say "IOMMU" when they mean "MMU".
> >
> > 2) Remove the generic __ioremap() definition, which I think is unused and
> > confusing.
> >
> > 3) Simplify the comments about ioremap() implementation.  I split this out
> > in case I went too far and made this controversial.
> 
> All look good
> 
> Reviewed-by: Arnd Bergmann 
> 
> Do you want to take them through your PCI tree?

Sure, I can.  I should have copied linux-pci; this was all motivated by
reviewing Lorenzo's patches, which will probably go via my tree and
should be coordinated with these.

Bjorn

[PATCH 4.10 02/63] net/mlx5e: Do not reduce LRO WQE size when not using build_skb

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Tariq Toukan 


[ Upstream commit 4078e637c12f1e0a74293f1ec9563f42bff14a03 ]

When rq_type is Striding RQ, no room for SKB_RESERVE is needed,
as SKB allocation is not done via build_skb.

Fixes: e4b85508072b ("net/mlx5e: Slightly reduce hardware LRO size")
Signed-off-by: Tariq Toukan 
Signed-off-by: Saeed Mahameed 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -81,6 +81,7 @@ static bool mlx5e_check_fragmented_strid
 static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
 {
priv->params.rq_wq_type = rq_type;
+   priv->params.lro_wqe_sz = MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
switch (priv->params.rq_wq_type) {
case MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ:
priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE_MPW;
@@ -93,6 +94,10 @@ static void mlx5e_set_rq_type_params(str
break;
default: /* MLX5_WQ_TYPE_LINKED_LIST */
priv->params.log_rq_size = MLX5E_PARAMS_DEFAULT_LOG_RQ_SIZE;
+
+   /* Extra room needed for build_skb */
+   priv->params.lro_wqe_sz -= MLX5_RX_HEADROOM +
+   SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
}
priv->params.min_rx_wqes = mlx5_min_rx_wqes(priv->params.rq_wq_type,
   BIT(priv->params.log_rq_size));
@@ -3517,12 +3522,6 @@ static void mlx5e_build_nic_netdev_priv(
mlx5e_build_default_indir_rqt(mdev, priv->params.indirection_rqt,
  MLX5E_INDIR_RQT_SIZE, 
profile->max_nch(mdev));
 
-   priv->params.lro_wqe_sz =
-   MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ -
-   /* Extra room needed for build_skb */
-   MLX5_RX_HEADROOM -
-   SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-
/* Initialize pflags */
MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_BASED_MODER,
priv->params.rx_cq_period_mode == 
MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
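
The size computation after the patch above boils down to "start from the
default and subtract the build_skb overhead only for the linked-list RQ
type". A minimal userspace sketch, assuming made-up constants (the TOY_*
values below are not the driver's real sizes):

```c
/* Hypothetical RQ types and sizes, for illustration only. */
enum toy_rq_type { TOY_RQ_LINKED_LIST, TOY_RQ_STRIDING };

#define TOY_DEFAULT_LRO_WQE_SZ 65536u
#define TOY_RX_HEADROOM        64u
#define TOY_SHINFO_SZ          320u  /* stands in for the aligned shared info */

/* Reserve build_skb room only when the RQ type actually uses build_skb. */
static unsigned int toy_lro_wqe_sz(enum toy_rq_type type)
{
    unsigned int sz = TOY_DEFAULT_LRO_WQE_SZ;

    if (type == TOY_RQ_LINKED_LIST)
        sz -= TOY_RX_HEADROOM + TOY_SHINFO_SZ;
    return sz;
}
```

With these toy numbers the striding type keeps the full 65536 bytes and
the linked-list type gets 65152.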

[PATCH 4.10 15/63] vxlan: lock RCU on TX path

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Jakub Kicinski 


[ Upstream commit 56de859e9967c070464a9a9f4f18d73f9447298e ]

There are no guarantees that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: c6fcc4fc5f8b ("vxlan: avoid using stale vxlan socket.")
Signed-off-by: Jakub Kicinski 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/vxlan.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2062,6 +2062,7 @@ static void vxlan_xmit_one(struct sk_buf
src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
 vxlan->cfg.port_max, true);
 
+   rcu_read_lock();
if (dst->sa.sa_family == AF_INET) {
struct vxlan_sock *sock4 = rcu_dereference(vxlan->vn4_sock);
struct rtable *rt;
@@ -2084,7 +2085,7 @@ static void vxlan_xmit_one(struct sk_buf
dst_port, vni, &rt->dst,
rt->rt_flags);
if (err)
-   return;
+   goto out_unlock;
} else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT) {
df = htons(IP_DF);
}
@@ -2123,7 +2124,7 @@ static void vxlan_xmit_one(struct sk_buf
dst_port, vni, ndst,
rt6i_flags);
if (err)
-   return;
+   goto out_unlock;
}
 
tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
@@ -2140,6 +2141,8 @@ static void vxlan_xmit_one(struct sk_buf
 label, src_port, dst_port, !udp_sum);
 #endif
}
+out_unlock:
+   rcu_read_unlock();
return;
 
 drop:
@@ -2148,6 +2151,7 @@ drop:
return;
 
 tx_error:
+   rcu_read_unlock();
if (err == -ELOOP)
dev->stats.collisions++;
else if (err == -ENETUNREACH)
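
The shape of the change above is "take the read-side lock once, and make
every exit path after that point release it". A userspace sketch of that
pattern follows. It does not use RCU: toy_read_depth is a plain counter
standing in for the read-side nesting, and all toy_* names and the return
codes are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for read-side critical section nesting; not real RCU. */
static int toy_read_depth;

static void toy_read_lock(void)
{
    toy_read_depth++;
}

static void toy_read_unlock(void)
{
    assert(toy_read_depth > 0);
    toy_read_depth--;
}

/* Returns 0 after a normal send, 1 on local bypass, -1 on a route error. */
static int toy_xmit_one(bool bypass_local, bool route_error)
{
    int ret = 0;

    toy_read_lock();
    if (route_error) {
        ret = -1;
        goto tx_error;
    }
    if (bypass_local) {
        /* Early exit that used to be a bare "return". */
        ret = 1;
        goto out_unlock;
    }
    /* Normal path: the packet would be built and sent here. */
out_unlock:
    toy_read_unlock();
    return ret;

tx_error:
    toy_read_unlock();
    return ret;
}
```

After any call to toy_xmit_one() the counter is back at zero, which is
the property the added goto out_unlock and the unlock under tx_error are
there to preserve.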

[PATCH 4.10 17/63] mlxsw: spectrum_router: Avoid potential packets loss

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Ido Schimmel 


[ Upstream commit f7df4923fa986247e93ec2cdff5ca168fff14dcf ]

When the structure of the LPM tree changes (e.g., due to the addition of
a new prefix), we unbind the old tree and then bind the new one. This
may result in temporary packet loss.

Instead, overwrite the old binding with the new one.

Fixes: 6b75c4807db3 ("mlxsw: spectrum_router: Add virtual router management")
Signed-off-by: Ido Schimmel 
Signed-off-by: Jiri Pirko 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c |   30 --
 1 file changed, 20 insertions(+), 10 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -496,30 +496,40 @@ static int
 mlxsw_sp_vr_lpm_tree_check(struct mlxsw_sp *mlxsw_sp, struct mlxsw_sp_vr *vr,
   struct mlxsw_sp_prefix_usage *req_prefix_usage)
 {
-   struct mlxsw_sp_lpm_tree *lpm_tree;
+   struct mlxsw_sp_lpm_tree *lpm_tree = vr->lpm_tree;
+   struct mlxsw_sp_lpm_tree *new_tree;
+   int err;
 
-   if (mlxsw_sp_prefix_usage_eq(req_prefix_usage,
-   &vr->lpm_tree->prefix_usage))
+   if (mlxsw_sp_prefix_usage_eq(req_prefix_usage, &lpm_tree->prefix_usage))
return 0;
 
-   lpm_tree = mlxsw_sp_lpm_tree_get(mlxsw_sp, req_prefix_usage,
+   new_tree = mlxsw_sp_lpm_tree_get(mlxsw_sp, req_prefix_usage,
 vr->proto, false);
-   if (IS_ERR(lpm_tree)) {
+   if (IS_ERR(new_tree)) {
/* We failed to get a tree according to the required
 * prefix usage. However, the current tree might be still good
 * for us if our requirement is subset of the prefixes used
 * in the tree.
 */
if (mlxsw_sp_prefix_usage_subset(req_prefix_usage,
-   &vr->lpm_tree->prefix_usage))
+   &lpm_tree->prefix_usage))
return 0;
-   return PTR_ERR(lpm_tree);
+   return PTR_ERR(new_tree);
}
 
-   mlxsw_sp_vr_lpm_tree_unbind(mlxsw_sp, vr);
-   mlxsw_sp_lpm_tree_put(mlxsw_sp, vr->lpm_tree);
+   /* Prevent packet loss by overwriting existing binding */
+   vr->lpm_tree = new_tree;
+   err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, vr);
+   if (err)
+   goto err_tree_bind;
+   mlxsw_sp_lpm_tree_put(mlxsw_sp, lpm_tree);
+
+   return 0;
+
+err_tree_bind:
vr->lpm_tree = lpm_tree;
-   return mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, vr);
+   mlxsw_sp_lpm_tree_put(mlxsw_sp, new_tree);
+   return err;
 }
 
 static struct mlxsw_sp_vr *mlxsw_sp_vr_get(struct mlxsw_sp *mlxsw_sp,
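
The "overwrite the binding, roll back on failure" flow of the patch above
can be sketched in userspace like this. It is a simplified model under
stated assumptions: the toy_* structures, the hw_bound field and the fail
parameter are hypothetical and only imitate the order of operations, not
the mlxsw API.

```c
#include <assert.h>
#include <stddef.h>

struct toy_tree {
    int refcount;
};

struct toy_vr {
    struct toy_tree *tree;      /* tree the router thinks it uses */
    struct toy_tree *hw_bound;  /* tree currently programmed in "hardware" */
};

static void toy_tree_put(struct toy_tree *t)
{
    assert(t->refcount > 0);
    t->refcount--;
}

/* Hypothetical bind step; a non-zero fail simulates a hardware error. */
static int toy_bind(struct toy_vr *vr, int fail)
{
    if (fail)
        return -1;
    vr->hw_bound = vr->tree;
    return 0;
}

/*
 * Replace the bound tree without unbinding first. The caller is assumed
 * to hold a reference on new_tree. On failure the old tree stays bound
 * and the reference on new_tree is dropped.
 */
static int toy_replace_tree(struct toy_vr *vr, struct toy_tree *new_tree,
                            int fail)
{
    struct toy_tree *old = vr->tree;
    int err;

    vr->tree = new_tree;
    err = toy_bind(vr, fail);
    if (err) {
        vr->tree = old;
        toy_tree_put(new_tree);
        return err;
    }
    toy_tree_put(old);
    return 0;
}
```

At no point in this model is hw_bound empty: it points at the old tree
until the bind of the new one succeeds, which is the reason given in the
changelog for avoiding the unbind step.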

[PATCH 4.10 05/63] net/mlx5e: Fix wrong CQE decompression

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Tariq Toukan 


[ Upstream commit 36154be40a28e4afaa0416da2681d80b7e2ca319 ]

In CQE compression with striding RQ, the CQE field wqe_counter was
decompressed with a wrong wraparound value.
This led to handling CQEs with a wrong pointer to the WQE (RX descriptor)
and creating SKBs with wrong data, pointing to wrong (and already consumed)
strides/pages.

In striding RQ, the CQE field wqe_counter holds the stride index rather
than the WQE index. Hence, when decompressing a CQE, wqe_counter should
have wrapped around at the number of strides in a single multi-packet
WQE.

We drop the wrap-around mask altogether in CQE decompression for striding
RQ. It is not needed, because in such cases the CQE compression session
would break due to a different value of the wqe_id field, starting a new
compression session.

Tested:
 ethtool -K ethxx lro off/on
 ethtool --set-priv-flags ethxx rx_cqe_compress on
 super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
 verified no csum errors and no page refcount issues.

Fixes: 7219ab34f184 ("net/mlx5e: CQE compression")
Signed-off-by: Tariq Toukan 
Reported-by: Tom Herbert 
Cc: kernel-t...@fb.com
Signed-off-by: Saeed Mahameed 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c |   13 ++---
 1 file changed, 6 insertions(+), 7 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -92,19 +92,18 @@ static inline void mlx5e_cqes_update_own
 static inline void mlx5e_decompress_cqe(struct mlx5e_rq *rq,
struct mlx5e_cq *cq, u32 cqcc)
 {
-   u16 wqe_cnt_step;
-
cq->title.byte_cnt = cq->mini_arr[cq->mini_arr_idx].byte_cnt;
cq->title.check_sum= cq->mini_arr[cq->mini_arr_idx].checksum;
cq->title.op_own  &= 0xf0;
cq->title.op_own  |= 0x01 & (cqcc >> cq->wq.log_sz);
cq->title.wqe_counter  = cpu_to_be16(cq->decmprs_wqe_counter);
 
-   wqe_cnt_step =
-   rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ ?
-   mpwrq_get_cqe_consumed_strides(&cq->title) : 1;
-   cq->decmprs_wqe_counter =
-   (cq->decmprs_wqe_counter + wqe_cnt_step) & rq->wq.sz_m1;
+   if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ)
+   cq->decmprs_wqe_counter +=
+   mpwrq_get_cqe_consumed_strides(&cq->title);
+   else
+   cq->decmprs_wqe_counter =
+   (cq->decmprs_wqe_counter + 1) & rq->wq.sz_m1;
 }
 
 static inline void mlx5e_decompress_cqe_no_hash(struct mlx5e_rq *rq,
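
The arithmetic behind the fix above can be checked with a few lines of
userspace C. The helper names and the example numbers are hypothetical;
sz_m1 is taken to mean "queue size minus one", as the field name in the
diff suggests. The old code masked the counter with the WQ size in both
modes, the new code masks only in the non-striding case.

```c
#include <stdint.h>

/* Old behaviour: always wrap at the WQ size, even for stride indices. */
static uint16_t toy_next_counter_old(uint16_t counter, uint16_t step,
                                     uint16_t sz_m1)
{
    return (uint16_t)((counter + step) & sz_m1);
}

/* New behaviour: plain add for striding RQ, wrap by one WQE otherwise. */
static uint16_t toy_next_counter_new(uint16_t counter, uint16_t step,
                                     uint16_t sz_m1, int striding)
{
    if (striding)
        return (uint16_t)(counter + step);
    return (uint16_t)((counter + 1) & sz_m1);
}
```

With a WQ of 8 entries (sz_m1 = 7), a stride index of 6 and 3 consumed
strides, the old helper yields 1 while the new one yields 9, so the old
mask would have pointed at a stride that was already consumed.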

[PATCH 4.10 16/63] geneve: lock RCU on TX path

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Jakub Kicinski 


[ Upstream commit a717e3f740803cc88bd5c9a70c93504f6a368663 ]

There is no guarantee that callers of the TX path will hold
the RCU lock.  Grab it explicitly.

Fixes: fceb9c3e3825 ("geneve: avoid using stale geneve socket.")
Signed-off-by: Jakub Kicinski 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/geneve.c |2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -881,12 +881,14 @@ static netdev_tx_t geneve_xmit(struct sk
 		info = &geneve->info;
 	}
 
+	rcu_read_lock();
 #if IS_ENABLED(CONFIG_IPV6)
 	if (info->mode & IP_TUNNEL_INFO_IPV6)
 		err = geneve6_xmit_skb(skb, dev, geneve, info);
 	else
 #endif
 		err = geneve_xmit_skb(skb, dev, geneve, info);
+	rcu_read_unlock();
 
 	if (likely(!err))
 		return NETDEV_TX_OK;

[PATCH 4.10 20/63] net: dont call strlen() on the user buffer in packet_bind_spkt()

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Alexander Potapenko 


[ Upstream commit 540e2894f7905538740aaf122bd8e0548e1c34a4 ]

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in packet_bind_spkt():
Acked-by: Eric Dumazet 

==
BUG: KMSAN: use of unitialized memory
CPU: 0 PID: 1074 Comm: packet Not tainted 4.8.0-rc6+ #1891
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
01/01/2011
  88006b6dfc08 82559ae8 88006b6dfb48
 818a7c91 85b9c870 0092 85b9c550
  0092 ec400911 0002
Call Trace:
 [< inline >] __dump_stack lib/dump_stack.c:15
 [] dump_stack+0x238/0x290 lib/dump_stack.c:51
 [] kmsan_report+0x276/0x2e0 mm/kmsan/kmsan.c:1003
 [] __msan_warning+0x5b/0xb0
mm/kmsan/kmsan_instr.c:424
 [< inline >] strlen lib/string.c:484
 [] strlcpy+0x9d/0x200 lib/string.c:144
 [] packet_bind_spkt+0x144/0x230
net/packet/af_packet.c:3132
 [] SYSC_bind+0x40d/0x5f0 net/socket.c:1370
 [] SyS_bind+0x82/0xa0 net/socket.c:1356
 [] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
chained origin: eba00911
 [] save_stack_trace+0x27/0x50
arch/x86/kernel/stacktrace.c:67
 [< inline >] kmsan_save_stack_with_flags mm/kmsan/kmsan.c:322
 [< inline >] kmsan_save_stack mm/kmsan/kmsan.c:334
 [] kmsan_internal_chain_origin+0x118/0x1e0
mm/kmsan/kmsan.c:527
 [] __msan_set_alloca_origin4+0xc3/0x130
mm/kmsan/kmsan_instr.c:380
 [] SYSC_bind+0x129/0x5f0 net/socket.c:1356
 [] SyS_bind+0x82/0xa0 net/socket.c:1356
 [] entry_SYSCALL_64_fastpath+0x13/0x8f
arch/x86/entry/entry_64.o:?
origin description: address@SYSC_bind (origin=eb400911)
==
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

, when I run the following program as root:

=
 #include <arpa/inet.h>
 #include <net/ethernet.h>
 #include <string.h>
 #include <sys/socket.h>

 int main() {
   struct sockaddr addr;
   memset(&addr, 0xff, sizeof(addr));
   addr.sa_family = AF_PACKET;
   int fd = socket(PF_PACKET, SOCK_PACKET, htons(ETH_P_ALL));
   bind(fd, &addr, sizeof(addr));
   return 0;
 }
=

This happens because addr.sa_data copied from the userspace is not
zero-terminated, and copying it with strlcpy() in packet_bind_spkt()
results in calling strlen() on the kernel copy of that non-terminated
buffer.

Signed-off-by: Alexander Potapenko 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/packet/af_packet.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3082,7 +3082,7 @@ static int packet_bind_spkt(struct socke
 			    int addr_len)
 {
 	struct sock *sk = sock->sk;
-	char name[15];
+	char name[sizeof(uaddr->sa_data) + 1];
 
 	/*
 	 *	Check legality
@@ -3090,7 +3090,11 @@ static int packet_bind_spkt(struct socke
 
 	if (addr_len != sizeof(struct sockaddr))
 		return -EINVAL;
-	strlcpy(name, uaddr->sa_data, sizeof(name));
+	/* uaddr->sa_data comes from the userspace, it's not guaranteed to be
+	 * zero-terminated.
+	 */
+	memcpy(name, uaddr->sa_data, sizeof(uaddr->sa_data));
+	name[sizeof(uaddr->sa_data)] = 0;
 
 	return packet_do_bind(sk, name, 0, pkt_sk(sk)->num);
 }
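
The copy pattern used by the fix can be tried out in userspace. The following is only a sketch: toy_copy_name and TOY_SA_DATA_LEN are hypothetical names, and the length of 14 is assumed to match sizeof(sa_data) in struct sockaddr on Linux.

```c
#include <stddef.h>
#include <string.h>

#define TOY_SA_DATA_LEN 14	/* assumed sizeof(sa_data) in struct sockaddr */

/*
 * Copy a fixed-size, possibly unterminated field into a buffer that is
 * one byte larger, then terminate it explicitly. Unlike strlcpy(), this
 * never runs strlen() over the source buffer.
 */
static void toy_copy_name(char dst[TOY_SA_DATA_LEN + 1],
			  const char src[TOY_SA_DATA_LEN])
{
	memcpy(dst, src, TOY_SA_DATA_LEN);
	dst[TOY_SA_DATA_LEN] = '\0';
}
```

Sizing the destination as source length plus one means a source that uses all 14 bytes is kept whole and the result is still a valid C string.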

[PATCH 4.10 22/63] ipv6: orphan skbs in reassembly unit

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Eric Dumazet 


[ Upstream commit 48cac18ecf1de82f76259a54402c3adb7839ad01 ]

Andrey reported a use-after-free in IPv6 stack.

The issue here is that we free the socket while it still has skbs
in the TX path and in some queues.

It happens here because the IPv6 reassembly unit messes up skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer 

==
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr 880062da0060 by task a.out/4140

page:ea00018b6800 count:1 mapcount:0 mapping:  (null)
index:0x0 compound_mapcount: 0
flags: 0x1008100(slab|head)
raw: 01008100   000180130013
raw: dead0100 dead0200 88006741f140 
page dumped because: kasan: bad access detected

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 describe_address mm/kasan/report.c:262
 kasan_report_error+0x121/0x560 mm/kasan/report.c:370
 kasan_report mm/kasan/report.c:392
 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
 sock_flag ./arch/x86/include/asm/bitops.h:324
 sock_wfree+0x118/0x120 net/core/sock.c:1631
 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put ./include/net/inet_frag.h:133
 nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn ./include/linux/netfilter.h:102
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook ./include/linux/netfilter.h:212
 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
 rawv6_push_pending_frames net/ipv6/raw.c:613
 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x620 net/socket.c:848
 new_sync_write fs/read_write.c:499
 __vfs_write+0x483/0x760 fs/read_write.c:512
 vfs_write+0x187/0x530 fs/read_write.c:560
 SYSC_write fs/read_write.c:607
 SyS_write+0xfb/0x230 fs/read_write.c:599
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:7ff268e0ed98 EFLAGS: 0206 ORIG_RAX: 0001
RAX: ffda RBX: 7ff268e0f9c0 RCX: 7ff26e6f5b79
RDX: 0010 RSI: 20f50fe1 RDI: 0003
RBP: 7ff26ebc1220 R08:  R09: 
R10:  R11: 0206 R12: 
R13: 7ff268e0f9c0 R14: 7ff26efec040 R15: 0003

The buggy address belongs to the object at 880062da
 which belongs to the cache RAWv6 of size 1504
The buggy address 880062da0060 is located 96 bytes inside
 of 1504-byte region [880062da, 880062da05e0)

Freed by task 4113:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1352
 slab_free_freelist_hook mm/slub.c:1374
 slab_free mm/slub.c:2951
 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
 sk_prot_free net/core/sock.c:1377
 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sk_free+0x23/0x30 net/core/sock.c:1479
 sock_put ./include/net/sock.h:1638
 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
 sock_release+0x8d/0x1e0 net/socket.c:599
 sock_close+0x16/0x20 net/socket.c:1063
 __fput+0x332/0x7f0 fs/file_table.c:208
 fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work ./include/linux/task_work.h:21
 do_exit+0x186b/0x2800 kernel/exit.c:839
 do_group_exit+0x149/0x420 kernel/exit.c:943
 SYSC_exit_group kernel/exit.c:954
 SyS_exit_group+0x1d/0x20 kernel/exit.c:952
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Allocated by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track

[PATCH 4.10 23/63] dccp: Unlock sock before calling sk_free()

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Arnaldo Carvalho de Melo 


[ Upstream commit d5afb6f9b6bb2c57bd0c05e76e12489dc0d037d9 ]

The code where sk_clone() came from created a new socket and locked it,
but then, on the error path didn't unlock it.

This problem stayed there for a long while, till b0691c8ee7c2 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_locked()) were not audited and the
one in dccp_create_openreq_child() remained.

Now in the age of the syzkaller fuzzer, this was finally uncovered, as
reported by Dmitry:

  8< 

I've got the following report while running syzkaller fuzzer on
86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)")

  [ BUG: held lock freed! ]
  4.10.0+ #234 Not tainted
  -
  syz-executor6/6898 is freeing memory
  88006286cac0-88006286d3b7, with a lock still held there!
   (slock-AF_INET6){+.-...}, at: [] spin_lock
  include/linux/spinlock.h:299 [inline]
   (slock-AF_INET6){+.-...}, at: []
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
  5 locks held by syz-executor6/6898:
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: [] lock_sock
  include/net/sock.h:1460 [inline]
   #0:  (sk_lock-AF_INET6){+.+.+.}, at: []
  inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
   #1:  (rcu_read_lock){..}, at: []
  inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
   #2:  (rcu_read_lock){..}, at: [] __skb_unlink
  include/linux/skbuff.h:1767 [inline]
   #2:  (rcu_read_lock){..}, at: [] __skb_dequeue
  include/linux/skbuff.h:1783 [inline]
   #2:  (rcu_read_lock){..}, at: []
  process_backlog+0x264/0x730 net/core/dev.c:4835
   #3:  (rcu_read_lock){..}, at: []
  ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
   #4:  (slock-AF_INET6){+.-...}, at: [] spin_lock
  include/linux/spinlock.h:299 [inline]
   #4:  (slock-AF_INET6){+.-...}, at: []
  sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504

Fix it the same way as was done by b0691c8ee7c2 ("net: Unlock sock before
calling sk_free()").

Reported-by: Dmitry Vyukov 
Cc: Cong Wang 
Cc: Eric Dumazet 
Cc: Gerrit Renker 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/20170301153510.ge15...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/dccp/minisocks.c |1 +
 1 file changed, 1 insertion(+)

--- a/net/dccp/minisocks.c
+++ b/net/dccp/minisocks.c
@@ -122,6 +122,7 @@ struct sock *dccp_create_openreq_child(c
 			/* It is still raw copy of parent, so invalidate
 			 * destructor and make plain sk_free() */
 			newsk->sk_destruct = NULL;
+			bh_unlock_sock(newsk);
 			sk_free(newsk);
 			return NULL;
 		}
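
The rule behind this one-line fix (an object that is handed back locked must be unlocked on the error path before it is freed) can be modelled with a toy object. Everything below is hypothetical illustration code, not the kernel API: toy_sock, toy_clone_locked, toy_free and toy_error_path only mimic the shape of the problem.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket with a single lock bit. */
struct toy_sock {
	bool locked;
};

/* Like a clone helper that returns the new object already locked. */
static struct toy_sock *toy_clone_locked(void)
{
	struct toy_sock *sk = calloc(1, sizeof(*sk));

	if (sk)
		sk->locked = true;
	return sk;
}

/* Frees the object; returns false if it was still locked (the bug). */
static bool toy_free(struct toy_sock *sk)
{
	bool clean = !sk->locked;

	free(sk);
	return clean;
}

/* Error path of a caller; unlock_first selects the patched behaviour. */
static bool toy_error_path(bool unlock_first)
{
	struct toy_sock *sk = toy_clone_locked();

	if (!sk)
		return false;
	if (unlock_first)
		sk->locked = false;	/* corresponds to the added unlock */
	return toy_free(sk);
}
```

Calling toy_error_path(false) models the pre-patch behaviour, where the free happens with the lock still held; toy_error_path(true) models the patched order.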

[PATCH 4.10 19/63] net: bridge: allow IPv6 when multicast flood is disabled

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Mike Manning 


[ Upstream commit 8953de2f02ad7b15e4964c82f9afd60f128e4e98 ]

Even with multicast flooding turned off, IPv6 ND should still work so
that IPv6 connectivity is provided. Allow this by continuing to flood
multicast traffic originated by us.

Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag")
Cc: Nikolay Aleksandrov 
Signed-off-by: Mike Manning 
Acked-by: Nikolay Aleksandrov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/bridge/br_forward.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -186,8 +186,9 @@ void br_flood(struct net_bridge *br, str
 		/* Do not flood unicast traffic to ports that turn it off */
 		if (pkt_type == BR_PKT_UNICAST && !(p->flags & BR_FLOOD))
 			continue;
+		/* Do not flood if mc off, except for traffic we originate */
 		if (pkt_type == BR_PKT_MULTICAST &&
-		    !(p->flags & BR_MCAST_FLOOD))
+		    !(p->flags & BR_MCAST_FLOOD) && skb->dev != br->dev)
 			continue;
 
 		/* Do not flood to ports that enable proxy ARP */
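
The per-port decision after this change can be written as a small predicate. This is a sketch with hypothetical names (toy_skip_port, the TOY_* constants); the locally_originated flag stands in for the skb->dev == br->dev comparison in the patch.

```c
#include <stdbool.h>

enum toy_pkt_type { TOY_PKT_UNICAST, TOY_PKT_MULTICAST, TOY_PKT_BROADCAST };

#define TOY_FLOOD	(1u << 0)	/* port floods unknown unicast */
#define TOY_MCAST_FLOOD	(1u << 1)	/* port floods multicast */

/* Returns true if the flood loop should skip this port for the packet. */
static bool toy_skip_port(enum toy_pkt_type type, unsigned int port_flags,
			  bool locally_originated)
{
	/* Do not flood unicast traffic to ports that turn it off. */
	if (type == TOY_PKT_UNICAST && !(port_flags & TOY_FLOOD))
		return true;
	/* Do not flood multicast if off, except for traffic we originate. */
	if (type == TOY_PKT_MULTICAST && !(port_flags & TOY_MCAST_FLOOD) &&
	    !locally_originated)
		return true;
	return false;
}
```

The third condition is the whole change: multicast sent by the bridge device itself, such as IPv6 neighbour discovery, is still flooded even when the port has multicast flooding disabled.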

Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent

2017-03-20 Thread Arnaldo Carvalho de Melo

Em Mon, Mar 20, 2017 at 11:56:55AM +0900, Taeung Song escreveu:
> Currently perf-annotate with --print-line can print
> -nan(0x8) because of division by zero
> when calculating percent.
> 
> So if a sum of samples is zero, skip calculating percent.

I tried to reproduce it here with a system-wide record, but couldn't:

[root@jouet ~]# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: 
IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, 
task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
[root@jouet ~]# perf annotate --stdio -l 2> /dev/null  | grep -i nan
[root@jouet ~]# 

Can you please send me a perf.data file with this problem? I have to go
through the code to see how this can take place...

- Arnaldo

 
> Before:
> 
> $ perf annotate --stdio -l
> 
> Sorted summary for file /home/taeung/workspace/a.out
> --
> 
>    32.89    -nan    7.04 a.c:38
>    25.14    -nan    0.00 a.c:34
>    16.26    -nan   56.34 a.c:31
>    15.88    -nan    1.41 a.c:37
>     5.67    -nan    0.00 a.c:39
>     1.13    -nan   35.21 a.c:26
>     0.95    -nan    0.00 a.c:44
>     0.57    -nan    0.00 a.c:32
>  Percent |  Source code & Disassembly of a.out for cycles (529 samples)
> -
>  :
> ...
> 
>  a.c:26    0.57    -nan    4.23 :   40081a:   mov    %edi,-0x24(%rbp)
>  a.c:26    0.00    -nan    9.86 :   40081d:   mov    %rsi,-0x30(%rbp)
> 
> ...
> 
> After:
> 
> $ perf annotate --stdio -l
> 
> Sorted summary for file /home/taeung/workspace/a.out
> --
> 
>    32.89    0.00    7.04 a.c:38
>    25.14    0.00    0.00 a.c:34
>    16.26    0.00   56.34 a.c:31
>    15.88    0.00    1.41 a.c:37
>     5.67    0.00    0.00 a.c:39
>     1.13    0.00   35.21 a.c:26
>     0.95    0.00    0.00 a.c:44
>     0.57    0.00    0.00 a.c:32
>  Percent |  Source code & Disassembly of old for cycles (529 samples)
> -
>  :
> ...
> 
>  a.c:26    0.57    0.00    4.23 :   40081a:   mov    %edi,-0x24(%rbp)
>  a.c:26    0.00    0.00    9.86 :   40081d:   mov    %rsi,-0x30(%rbp)
> 
> ...
> 
> Cc: Namhyung Kim 
> Cc: Jiri Olsa 
> Signed-off-by: Taeung Song 
> ---
>  tools/perf/util/annotate.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index fc91c6b..9bb43cd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -1665,11 +1665,15 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map,
>  		src_line->nr_pcnt = nr_pcnt;
>  
>  		for (k = 0; k < nr_pcnt; k++) {
> +			double percent = 0.0;
> +
>  			h = annotation__histogram(notes, evidx + k);
> -			src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum;
> +			if (h->sum)
> +				percent = 100.0 * h->addr[i] / h->sum;
>  
> -			if (src_line->samples[k].percent > percent_max)
> -				percent_max = src_line->samples[k].percent;
> +			if (percent > percent_max)
> +				percent_max = percent;
> +			src_line->samples[k].percent = percent;
>  		}
>  
>  		if (percent_max <= 0.5)
> -- 
> 2.7.4
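
For reference, the guard in the quoted patch boils down to the following standalone helper. This is a sketch with a hypothetical name (toy_percent), not code from perf; it relies on the fact that under IEEE 754 arithmetic 0.0 / 0.0 produces NaN, which is what shows up as -nan in the output.

```c
/*
 * Percentage of samples at one address relative to the histogram sum.
 * A zero sum yields 0.0 instead of dividing by zero.
 */
static double toy_percent(unsigned long long hits, unsigned long long sum)
{
	if (sum == 0)
		return 0.0;
	return 100.0 * (double)hits / (double)sum;
}
```

An event with no samples at all then reports 0.00 for every line, which matches the "After" output quoted above.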

[PATCH v4] usb: hub: Fix error loop seen after hub communication errors

2017-03-20 Thread Guenter Roeck

While stress testing a USB controller using a bind/unbind loop, the
following error loop was observed.

usb 7-1.2: new low-speed USB device number 3 using xhci-hcd
usb 7-1.2: hub failed to enable device, error -108
usb 7-1-port2: cannot disable (err = -22)
usb 7-1-port2: couldn't allocate usb_device
usb 7-1-port2: cannot disable (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
** 57 printk messages dropped ** hub 7-1:1.0: activate --> -22
** 82 printk messages dropped ** hub 7-1:1.0: hub_ext_port_status failed (err = -22)

This continues forever. After adding tracebacks into the code,
the call sequence leading to this is found to be as follows.

[] hub_activate+0x368/0x7b8
[] hub_resume+0x2c/0x3c
[] usb_resume_interface.isra.6+0x128/0x158
[] usb_suspend_both+0x1e8/0x288
[] usb_runtime_suspend+0x3c/0x98
[] __rpm_callback+0x48/0x7c
[] rpm_callback+0xa8/0xd4
[] rpm_suspend+0x84/0x758
[] rpm_idle+0x2c8/0x498
[] __pm_runtime_idle+0x60/0xac
[] usb_autopm_put_interface+0x6c/0x7c
[] hub_event+0x10ac/0x12ac
[] process_one_work+0x390/0x6b8
[] worker_thread+0x480/0x610
[] kthread+0x164/0x178
[] ret_from_fork+0x10/0x40

kick_hub_wq() is called from hub_activate() even after failures to
communicate with the hub. This results in an endless sequence of
hub event -> hub activate -> wq trigger -> hub event -> ...

Provide two solutions for the problem.

- Only trigger the hub event queue if communication with the hub
  is successful.
- After a suspend failure, only resume already suspended interfaces
  if the communication with the device is still possible.

Each of the changes fixes the observed problem. Use both to improve
robustness.

Acked-by: Alan Stern 
Signed-off-by: Guenter Roeck 
---
v4: Other code uses a space before labels in hub_activate(). Do the same
for consistency.
v3: In hub.c, abort immediately if hub_port_status() returns an error.
Since hub_port_status() already logs the error, don't log it again.
In device,c, log the error return value from usb_suspend_device()
if usb_get_status() failed as well.
Don't check if the hub is still accessible if the error returned
from hub_port_status() is -EBUSY.
v2: Instead of not triggering the hub wq after an error to submit an urb,
implement a more complex error detection and handling. Do it in two
places. Marked as RFC to determine if one (or both) of those solutions
are viable.

 drivers/usb/core/driver.c | 18 ++
 drivers/usb/core/hub.c|  5 -
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index cdee5130638b..7ebdf2a4e8fe 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -1331,6 +1331,24 @@ static int usb_suspend_both(struct usb_device *udev, 
pm_message_t msg)
 */
if (udev->parent && !PMSG_IS_AUTO(msg))
status = 0;
+
+   /*
+* If the device is inaccessible, don't try to resume
+* suspended interfaces and just return the error.
+*/
+   if (status && status != -EBUSY) {
+   int err;
+   u16 devstat;
+
+   err = usb_get_status(udev, USB_RECIP_DEVICE, 0,
+);
+   if (err) {
+   dev_err(>dev,
+   "Failed to suspend device, error %d\n",
+   status);
+   goto done;
+   }
+   }
}
 
/* If the suspend failed, resume interfaces that did get suspended */
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index

[PATCH v4] usb: hub: Fix error loop seen after hub communication errors

2017-03-20 Thread Guenter Roeck

While stress testing a USB controller using a bind/unbind loop, the
following error loop was observed.

usb 7-1.2: new low-speed USB device number 3 using xhci-hcd
usb 7-1.2: hub failed to enable device, error -108
usb 7-1-port2: cannot disable (err = -22)
usb 7-1-port2: couldn't allocate usb_device
usb 7-1-port2: cannot disable (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: activate --> -22
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
hub 7-1:1.0: hub_ext_port_status failed (err = -22)
** 57 printk messages dropped ** hub 7-1:1.0: activate --> -22
** 82 printk messages dropped ** hub 7-1:1.0: hub_ext_port_status failed (err = -22)

This continues forever. After adding tracebacks into the code,
the call sequence leading to this is found to be as follows.

[] hub_activate+0x368/0x7b8
[] hub_resume+0x2c/0x3c
[] usb_resume_interface.isra.6+0x128/0x158
[] usb_suspend_both+0x1e8/0x288
[] usb_runtime_suspend+0x3c/0x98
[] __rpm_callback+0x48/0x7c
[] rpm_callback+0xa8/0xd4
[] rpm_suspend+0x84/0x758
[] rpm_idle+0x2c8/0x498
[] __pm_runtime_idle+0x60/0xac
[] usb_autopm_put_interface+0x6c/0x7c
[] hub_event+0x10ac/0x12ac
[] process_one_work+0x390/0x6b8
[] worker_thread+0x480/0x610
[] kthread+0x164/0x178
[] ret_from_fork+0x10/0x40

kick_hub_wq() is called from hub_activate() even after failures to
communicate with the hub. This results in an endless sequence of
hub event -> hub activate -> wq trigger -> hub event -> ...

Provide two solutions for the problem.

- Only trigger the hub event queue if communication with the hub
  is successful.
- After a suspend failure, only resume already suspended interfaces
  if the communication with the device is still possible.

Each of the changes fixes the observed problem. Use both to improve
robustness.

Acked-by: Alan Stern 
Signed-off-by: Guenter Roeck 
---
v4: Other code uses a space before labels in hub_activate(). Do the same
for consistency.
v3: In hub.c, abort immediately if hub_port_status() returns an error.
Since hub_port_status() already logs the error, don't log it again.
In driver.c, log the error return value from usb_suspend_device()
if usb_get_status() failed as well.
Don't check if the hub is still accessible if the error returned
from hub_port_status() is -EBUSY.
v2: Instead of not triggering the hub wq after an error to submit an urb,
implement a more complex error detection and handling. Do it in two
places. Marked as RFC to determine if one (or both) of those solutions
are viable.

 drivers/usb/core/driver.c | 18 ++
 drivers/usb/core/hub.c|  5 -
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/core/driver.c b/drivers/usb/core/driver.c
index cdee5130638b..7ebdf2a4e8fe 100644
--- a/drivers/usb/core/driver.c
+++ b/drivers/usb/core/driver.c
@@ -1331,6 +1331,24 @@ static int usb_suspend_both(struct usb_device *udev, pm_message_t msg)
 */
if (udev->parent && !PMSG_IS_AUTO(msg))
status = 0;
+
+   /*
+* If the device is inaccessible, don't try to resume
+* suspended interfaces and just return the error.
+*/
+   if (status && status != -EBUSY) {
+   int err;
+   u16 devstat;
+
+   err = usb_get_status(udev, USB_RECIP_DEVICE, 0,
+                        &devstat);
+   if (err) {
+   dev_err(&udev->dev,
+   "Failed to suspend device, error %d\n",
+   status);
+   goto done;
+   }
+   }
}
 
/* If the suspend failed, resume interfaces that did get suspended */
diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
index 5286bf67869a..2e047f982af3 100644
--- a/drivers/usb/core/hub.c
+++
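
The first of the two fixes described above (only trigger the hub event queue when communication with the hub succeeds) can be sketched in isolation. This is a minimal illustration, not the kernel code: `struct hub_sketch`, `hub_activate_sketch()` and the `kicks` counter are hypothetical names that stand in for the hub state, hub_activate() and kick_hub_wq().

```c
#include <errno.h>

/* Hypothetical model of a hub: not the kernel's struct usb_hub. */
struct hub_sketch {
	int port_status_err;	/* 0 on success, negative errno on failure */
	int kicks;		/* how often the work queue was triggered */
};

/*
 * Abort as soon as reading the port status fails, so that no new hub
 * event is queued for a hub that cannot be reached.
 */
int hub_activate_sketch(struct hub_sketch *hub)
{
	if (hub->port_status_err)
		return hub->port_status_err;	/* no kick: the loop is broken here */

	hub->kicks++;	/* stands in for kick_hub_wq() */
	return 0;
}
```

With a failing hub (`port_status_err` set to `-EINVAL`, the -22 seen in the log) the function returns the error and `kicks` stays at 0, so the event -> activate -> trigger cycle cannot restart.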

Re: [PATCH 2/4] perf annotate: Avoid division by zero when calculating percent

2017-03-20 Thread Arnaldo Carvalho de Melo

On Mon, Mar 20, 2017 at 11:56:55AM +0900, Taeung Song wrote:
> Currently perf-annotate with --print-line can print
> -nan(0x8) because of division by zero
> when calculating percent.
> 
> So if a sum of samples is zero, skip calculating percent.

Tried to reproduce it here, couldn't, syswide record:

[root@jouet ~]# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: 
IP|TID|TIME|CPU|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, 
task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
[root@jouet ~]# perf annotate --stdio -l 2> /dev/null  | grep -i nan
[root@jouet ~]# 

Can you please send me a perf.data file with this problem? I have to go
through the code to see how this can take place...

- Arnaldo

 
> Before:
> 
> $ perf annotate --stdio -l
> 
> Sorted summary for file /home/taeung/workspace/a.out
> --
> 
>    32.89    -nan    7.04 a.c:38
>    25.14    -nan    0.00 a.c:34
>    16.26    -nan   56.34 a.c:31
>    15.88    -nan    1.41 a.c:37
>     5.67    -nan    0.00 a.c:39
>     1.13    -nan   35.21 a.c:26
>     0.95    -nan    0.00 a.c:44
>     0.57    -nan    0.00 a.c:32
>  Percent |  Source code & Disassembly of a.out for cycles (529 samples)
> -
>  :
> ...
> 
>  a.c:26    0.57    -nan    4.23 :        40081a:   mov    %edi,-0x24(%rbp)
>  a.c:26    0.00    -nan    9.86 :        40081d:   mov    %rsi,-0x30(%rbp)
> 
> ...
> 
> After:
> 
> $ perf annotate --stdio -l
> 
> Sorted summary for file /home/taeung/workspace/a.out
> --
> 
>    32.89    0.00    7.04 a.c:38
>    25.14    0.00    0.00 a.c:34
>    16.26    0.00   56.34 a.c:31
>    15.88    0.00    1.41 a.c:37
>     5.67    0.00    0.00 a.c:39
>     1.13    0.00   35.21 a.c:26
>     0.95    0.00    0.00 a.c:44
>     0.57    0.00    0.00 a.c:32
>  Percent |  Source code & Disassembly of old for cycles (529 samples)
> -
>  :
> ...
> 
> a.c:26    0.57    0.00    4.23 :        40081a:   mov    %edi,-0x24(%rbp)
> a.c:26    0.00    0.00    9.86 :        40081d:   mov    %rsi,-0x30(%rbp)
> 
> ...
> 
> Cc: Namhyung Kim 
> Cc: Jiri Olsa 
> Signed-off-by: Taeung Song 
> ---
>  tools/perf/util/annotate.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index fc91c6b..9bb43cd 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -1665,11 +1665,15 @@ static int symbol__get_source_line(struct symbol *sym, struct map *map,
>   src_line->nr_pcnt = nr_pcnt;
>  
>   for (k = 0; k < nr_pcnt; k++) {
> + double percent = 0.0;
> +
>   h = annotation__histogram(notes, evidx + k);
> - src_line->samples[k].percent = 100.0 * h->addr[i] / h->sum;
> + if (h->sum)
> + percent = 100.0 * h->addr[i] / h->sum;
>  
> - if (src_line->samples[k].percent > percent_max)
> - percent_max = src_line->samples[k].percent;
> + if (percent > percent_max)
> + percent_max = percent;
> + src_line->samples[k].percent = percent;
>   }
>  
>   if (percent_max <= 0.5)
> -- 
> 2.7.4
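
The guard in the quoted patch can be tried outside perf. The following is a minimal sketch under stated assumptions: `struct hist_sketch` and `sample_percent()` are hypothetical stand-ins for the histogram returned by annotation__histogram() and for the loop body, not perf's actual types.

```c
#include <stddef.h>

/* Hypothetical stand-in for perf's per-event histogram. */
struct hist_sketch {
	unsigned long sum;		/* total samples for this event */
	const unsigned long *addr;	/* per-address sample counts */
};

/* Share of samples at index i in percent, or 0.0 when the event has none. */
double sample_percent(const struct hist_sketch *h, size_t i)
{
	if (h->sum == 0)
		return 0.0;	/* avoids 0.0 / 0, which evaluates to NaN */

	return 100.0 * h->addr[i] / h->sum;
}
```

An event with no samples at all has `sum == 0`, which is the case that printed `-nan` in the "Before" output; the sketch returns 0.00 for it instead.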

[PATCH 4.10 26/63] amd-xgbe: Don't overwrite SFP PHY mod_absent settings

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: "Lendacky, Thomas" 


[ Upstream commit 2697ea5a859b83ca49511dcfd98daf42584eb3cf ]

If an SFP module is not present, xgbe_phy_sfp_phy_settings() should
return after applying the default settings. Currently there is no return
statement and the default settings are overwritten.

Signed-off-by: Tom Lendacky 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c |2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
@@ -716,6 +716,8 @@ static void xgbe_phy_sfp_phy_settings(st
pdata->phy.duplex = DUPLEX_UNKNOWN;
pdata->phy.autoneg = AUTONEG_ENABLE;
pdata->phy.advertising = pdata->phy.supported;
+
+   return;
}
 
pdata->phy.advertising &= ~ADVERTISED_Autoneg;
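
The effect of the missing early return can be shown with a reduced model. This is only a sketch: `struct phy_sketch`, the two constants and `sfp_phy_settings_sketch()` are hypothetical and much simpler than the driver's real state.

```c
#include <stdbool.h>

/* Hypothetical values; the driver uses the kernel's SPEED_* constants. */
enum { SPEED_UNKNOWN_SKETCH = -1, SPEED_1000_SKETCH = 1000 };

struct phy_sketch {
	int speed;
	bool autoneg;
};

void sfp_phy_settings_sketch(struct phy_sketch *phy, bool mod_absent)
{
	if (mod_absent) {
		/* Defaults for "no module present". */
		phy->speed = SPEED_UNKNOWN_SKETCH;
		phy->autoneg = true;
		return;	/* without this, the code below overwrites the defaults */
	}

	/* Settings for a module that is present (fixed values, for the sketch). */
	phy->speed = SPEED_1000_SKETCH;
	phy->autoneg = false;
}
```

Dropping the `return` makes the function fall through, and an absent module ends up with the settings meant for a present one.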

[PATCH 4.10 10/63] ipv4: add missing initialization for flowi4_uid

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Julian Anastasov 


[ Upstream commit 8bcfd0925ef15f072ba1e7bee2c25e9e1b5fd6ca ]

Avoid matching of random stack value for uid when rules
are looked up on input route or when RP filter is used.
Problem should affect only setups that use ip rules with
uid range.

Fixes: 622ec2c9d524 ("net: core: add UID to flows, rules, and routes")
Signed-off-by: Julian Anastasov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/ipv4/fib_frontend.c |6 +++---
 net/ipv4/route.c|1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -319,7 +319,7 @@ static int __fib_validate_source(struct
int ret, no_addr;
struct fib_result res;
struct flowi4 fl4;
-   struct net *net;
+   struct net *net = dev_net(dev);
bool dev_match;
 
fl4.flowi4_oif = 0;
@@ -332,6 +332,7 @@ static int __fib_validate_source(struct
fl4.flowi4_scope = RT_SCOPE_UNIVERSE;
fl4.flowi4_tun_key.tun_id = 0;
fl4.flowi4_flags = 0;
+   fl4.flowi4_uid = sock_net_uid(net, NULL);
 
no_addr = idev->ifa_list == NULL;
 
@@ -339,13 +340,12 @@ static int __fib_validate_source(struct
 
trace_fib_validate_source(dev, &fl4);
 
-   net = dev_net(dev);
if (fib_lookup(net, &fl4, &res, 0))
goto last_resort;
if (res.type != RTN_UNICAST &&
(res.type != RTN_LOCAL || !IN_DEV_ACCEPT_LOCAL(idev)))
goto e_inval;
-   if (!rpf && !fib_num_tclassid_users(dev_net(dev)) &&
+   if (!rpf && !fib_num_tclassid_users(net) &&
(dev->ifindex != oif || !IN_DEV_TX_REDIRECTS(idev)))
goto last_resort;
fib_combine_itag(itag, &res);
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1858,6 +1858,7 @@ static int ip_route_input_slow(struct sk
fl4.flowi4_flags = 0;
fl4.daddr = daddr;
fl4.saddr = saddr;
+   fl4.flowi4_uid = sock_net_uid(net, NULL);
err = fib_lookup(net, &fl4, &res, 0);
if (err != 0) {
if (!IN_DEV_FORWARD(in_dev))

[PATCH 4.10 12/63] sctp: set sin_port for addr param when checking duplicate address

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Xin Long 


[ Upstream commit 2e3ce5bc2aa938653c3866aa7f4901a1f199b1c8 ]

Commit b8607805dd15 ("sctp: not copying duplicate addrs to the assoc's
bind address list") tried to check for duplicate address before copying
to asoc's bind_addr list from global addr list.

But all the addrs' sin_ports in global addr list are 0 while the addrs'
sin_ports are bp->port in asoc's bind_addr list. It means even if it's
a duplicate address, af->cmp_addr will still return 0 as their
sin_ports are different.

This patch is to fix it by setting the sin_port for addr param with
bp->port before comparing the addrs.

Fixes: b8607805dd15 ("sctp: not copying duplicate addrs to the assoc's bind address list")
Reported-by: Wei Chen 
Signed-off-by: Xin Long 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/sctp/protocol.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -199,6 +199,7 @@ int sctp_copy_local_addr_list(struct net
  sctp_scope_t scope, gfp_t gfp, int copy_flags)
 {
struct sctp_sockaddr_entry *addr;
+   union sctp_addr laddr;
int error = 0;
 
rcu_read_lock();
@@ -220,7 +221,10 @@ int sctp_copy_local_addr_list(struct net
 !(copy_flags & SCTP_ADDR6_PEERSUPP)))
continue;
 
-   if (sctp_bind_addr_state(bp, &addr->a) != -1)
+   laddr = addr->a;
+   /* also works for setting ipv6 address port */
+   laddr.v4.sin_port = htons(bp->port);
+   if (sctp_bind_addr_state(bp, &laddr) != -1)
continue;
 
error = sctp_add_bind_addr(bp, &addr->a, sizeof(addr->a),
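
Why the port has to be set before comparing can be seen in a small model. It is a sketch with hypothetical names (`struct addr_sketch`, `addr_equal()`, `is_duplicate()`); the real code compares `union sctp_addr` values through af->cmp_addr.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, IPv4-only stand-in for union sctp_addr. */
struct addr_sketch {
	uint32_t ip;
	uint16_t port;
};

/* Like the address comparison in the patch: both IP and port must match. */
int addr_equal(const struct addr_sketch *a, const struct addr_sketch *b)
{
	return a->ip == b->ip && a->port == b->port;
}

/*
 * The candidate comes from the global list, where the port is 0. It is
 * passed by value, so setting the port changes only the local copy.
 */
int is_duplicate(const struct addr_sketch *bound, size_t n,
		 struct addr_sketch candidate, uint16_t bound_port)
{
	candidate.port = bound_port;

	for (size_t i = 0; i < n; i++)
		if (addr_equal(&bound[i], &candidate))
			return 1;
	return 0;
}
```

Comparing the raw candidate (port 0) against a bound entry with the same IP but a non-zero port never matches, which is the bug described in the changelog.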

[PATCH 4.10 41/63] mpls: Do not decrement alive counter for unregister events

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: David Ahern 


[ Upstream commit 79099aab38c8f5c746748b066ae74ba984fe2cc8 ]

Multipath routes can be rendered useless when a device in one of the
paths is deleted. For example:

$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2  dev virt12
nexthop as to 300 via inet 172.16.3.2  dev br0
101
nexthop as to 201 via inet6 2000:2::2  dev virt12
nexthop as to 301 via inet6 2000:3::2  dev br0

$ ip li del br0

When br0 is deleted the other hop is not considered in
mpls_select_multipath because of the alive check -- rt_nhn_alive
is 0.

rt_nhn_alive is decremented once in mpls_ifdown when the device is taken
down (NETDEV_DOWN) and again when it is deleted (NETDEV_UNREGISTER). For
a 2 hop route, deleting one device drops the alive count to 0. Since
devices are taken down before unregistering, the decrement on
NETDEV_UNREGISTER is redundant.

Fixes: c89359a42e2a4 ("mpls: support for dead routes")
Signed-off-by: David Ahern 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 net/mpls/af_mpls.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -956,7 +956,8 @@ static void mpls_ifdown(struct net_devic
/* fall through */
case NETDEV_CHANGE:
nh->nh_flags |= RTNH_F_LINKDOWN;
-   ACCESS_ONCE(rt->rt_nhn_alive) = rt->rt_nhn_alive - 1;
+   if (event != NETDEV_UNREGISTER)
+   ACCESS_ONCE(rt->rt_nhn_alive) = rt->rt_nhn_alive - 1;
break;
}
if (event == NETDEV_UNREGISTER)
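
The double decrement is easy to reproduce with a counter. The sketch below is an assumption-laden reduction: `enum netdev_event_sketch`, `struct route_sketch` and `ifdown_sketch()` are hypothetical, and the flag handling of the real mpls_ifdown() is left out.

```c
/* Hypothetical event codes; the kernel uses NETDEV_DOWN and friends. */
enum netdev_event_sketch { EV_DOWN, EV_CHANGE, EV_UNREGISTER };

struct route_sketch {
	unsigned int nhn_alive;	/* number of usable next hops */
};

/*
 * One next hop of the route sits on the device that raised the event.
 * A device is always taken down before it is unregistered, so the
 * unregister event must not count the same next hop a second time.
 */
void ifdown_sketch(struct route_sketch *rt, enum netdev_event_sketch event)
{
	if (event != EV_UNREGISTER)
		rt->nhn_alive--;
}
```

For a route with two next hops, a down event followed by an unregister event on one device leaves `nhn_alive` at 1, so the remaining next hop is still considered.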

[PATCH 4.10 09/63] vxlan: don't allow overwrite of config src addr

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Brian Russell 


[ Upstream commit 1158632b5a2dcce0786c1b1b99654e81cc867981 ]

When using IPv6 transport and a default dst, a pointer to the configured
source address is passed into the route lookup. If no source address is
configured, then the value is overwritten.

IPv6 route lookup ignores egress ifindex match if the source address is set,
so if egress ifindex match is desired, the source address must be passed
as any. The overwrite breaks this for subsequent lookups.

Avoid this by copying the configured address to an existing stack variable
and pass a pointer to that instead.

Fixes: 272d96a5ab10 ("net: vxlan: lwt: Use source ip address during route lookup.")

Signed-off-by: Brian Russell 
Acked-by: Jiri Benc 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/vxlan.c |   12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1992,7 +1992,6 @@ static void vxlan_xmit_one(struct sk_buf
const struct iphdr *old_iph = ip_hdr(skb);
union vxlan_addr *dst;
union vxlan_addr remote_ip, local_ip;
-   union vxlan_addr *src;
struct vxlan_metadata _md;
struct vxlan_metadata *md = &_md;
__be16 src_port = 0, dst_port;
@@ -2019,7 +2018,7 @@ static void vxlan_xmit_one(struct sk_buf
 
dst_port = rdst->remote_port ? rdst->remote_port : vxlan->cfg.dst_port;
vni = rdst->remote_vni;
-   src = &vxlan->cfg.saddr;
+   local_ip = vxlan->cfg.saddr;
dst_cache = &rdst->dst_cache;
md->gbp = skb->mark;
ttl = vxlan->cfg.ttl;
@@ -2052,7 +2051,6 @@ static void vxlan_xmit_one(struct sk_buf
dst = &remote_ip;
dst_port = info->key.tp_dst ? : vxlan->cfg.dst_port;
vni = tunnel_id_to_key32(info->key.tun_id);
-   src = &local_ip;
dst_cache = &info->dst_cache;
if (info->options_len)
md = ip_tunnel_info_opts(info);
@@ -2072,7 +2070,7 @@ static void vxlan_xmit_one(struct sk_buf
rt = vxlan_get_route(vxlan, dev, sock4, skb,
 rdst ? rdst->remote_ifindex : 0, tos,
 dst->sin.sin_addr.s_addr,
-   &src->sin.sin_addr.s_addr,
+   &local_ip.sin.sin_addr.s_addr,
 dst_port, src_port,
 dst_cache, info);
if (IS_ERR(rt)) {
@@ -2099,7 +2097,7 @@ static void vxlan_xmit_one(struct sk_buf
if (err < 0)
goto tx_error;
 
-   udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, src->sin.sin_addr.s_addr,
+   udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, local_ip.sin.sin_addr.s_addr,
dst->sin.sin_addr.s_addr, tos, ttl, df,
src_port, dst_port, xnet, !udp_sum);
 #if IS_ENABLED(CONFIG_IPV6)
@@ -2109,7 +2107,7 @@ static void vxlan_xmit_one(struct sk_buf
ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
rdst ? rdst->remote_ifindex : 0, tos,
label, &dst->sin6.sin6_addr,
-   &src->sin6.sin6_addr,
+   &local_ip.sin6.sin6_addr,
dst_port, src_port,
dst_cache, info);
if (IS_ERR(ndst)) {
@@ -2137,7 +2135,7 @@ static void vxlan_xmit_one(struct sk_buf
goto tx_error;
 
udp_tunnel6_xmit_skb(ndst, sock6->sock->sk, skb, dev,
-   &src->sin6.sin6_addr,
+   &local_ip.sin6.sin6_addr,
 &dst->sin6.sin6_addr, tos, ttl,
 label, src_port, dst_port, !udp_sum);
 #endif
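
The idea of the fix, copying the configured address to a stack variable and passing a pointer to the copy, can be sketched as follows. All names here (`struct cfg_sketch`, `route_lookup_sketch()`, `xmit_sketch()`) and the address 10.0.0.1 are hypothetical; the real lookup works on `union vxlan_addr` and IPv6 addresses.

```c
#include <stdint.h>

/* Hypothetical device configuration; 0 means "no source address set". */
struct cfg_sketch {
	uint32_t saddr;
};

/* Stand-in for a route lookup that fills in the source address if unset. */
void route_lookup_sketch(uint32_t *saddr)
{
	if (*saddr == 0)
		*saddr = UINT32_C(0x0a000001);	/* 10.0.0.1, chosen for the sketch */
}

/* Transmit path: the lookup may write to local_ip, never to the config. */
uint32_t xmit_sketch(const struct cfg_sketch *cfg)
{
	uint32_t local_ip = cfg->saddr;	/* copy, as in the patch */

	route_lookup_sketch(&local_ip);
	return local_ip;
}
```

Passing `&cfg->saddr` directly would store the selected address in the configuration, and every later lookup would then run with a source address set instead of "any".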

[PATCH 4.10 25/63] amd-xgbe: Be sure to set MDIO modes on device (re)start

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: "Lendacky, Thomas" 


[ Upstream commit b42c6761fd1651f564491b53016046c9ebf0b2a9 ]

The MDIO register mode is set when the device is probed. But when the
device is brought down and then back up, the MDIO register mode has been
reset.  Be sure to reset the mode during device startup and only change
the mode of the address specified.

Signed-off-by: Tom Lendacky 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c|2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c |   22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -1323,7 +1323,7 @@ static int xgbe_read_ext_mii_regs(struct
 static int xgbe_set_ext_mii_mode(struct xgbe_prv_data *pdata, unsigned int port,
 enum xgbe_mdio_mode mode)
 {
-   unsigned int reg_val = 0;
+   unsigned int reg_val = XGMAC_IOREAD(pdata, MAC_MDIOCL22R);
 
switch (mode) {
case XGBE_MDIO_MODE_CL22:
--- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
@@ -875,6 +875,16 @@ static int xgbe_phy_find_phy_device(stru
!phy_data->sfp_phy_avail)
return 0;
 
+   /* Set the proper MDIO mode for the PHY */
+   ret = pdata->hw_if.set_ext_mii_mode(pdata, phy_data->mdio_addr,
+   phy_data->phydev_mode);
+   if (ret) {
+   netdev_err(pdata->netdev,
+  "mdio port/clause not compatible (%u/%u)\n",
+  phy_data->mdio_addr, phy_data->phydev_mode);
+   return ret;
+   }
+
/* Create and connect to the PHY device */
phydev = get_phy_device(phy_data->mii, phy_data->mdio_addr,
(phy_data->phydev_mode == XGBE_MDIO_MODE_CL45));
@@ -2722,6 +2732,18 @@ static int xgbe_phy_start(struct xgbe_pr
if (ret)
return ret;
 
+   /* Set the proper MDIO mode for the re-driver */
+   if (phy_data->redrv && !phy_data->redrv_if) {
+   ret = pdata->hw_if.set_ext_mii_mode(pdata, phy_data->redrv_addr,
+   XGBE_MDIO_MODE_CL22);
+   if (ret) {
+   netdev_err(pdata->netdev,
+  "redriver mdio port not compatible (%u)\n",
+  phy_data->redrv_addr);
+   return ret;
+   }
+   }
+
/* Start in highest supported mode */
xgbe_phy_set_mode(pdata, phy_data->start_mode);
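
The change in xgbe-dev.c replaces a register value built from 0 with one read from the hardware first, so that only the mode of the given address changes. A minimal read-modify-write sketch, assuming one mode bit per port (a hypothetical layout, not taken from the hardware documentation) and a hypothetical helper `set_port_cl22()`:

```c
#include <stdint.h>

/*
 * Return the register value with the mode bit of one port updated and
 * all other ports left as they were. Assumes port < 32.
 */
uint32_t set_port_cl22(uint32_t reg, unsigned int port, int cl22)
{
	uint32_t bit = UINT32_C(1) << port;

	return cl22 ? (reg | bit) : (reg & ~bit);
}
```

Starting from 0 instead of the current register value would clear the mode bits of every other port each time one port is configured.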

[PATCH 4.10 56/63] x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Andrey Ryabinin 

commit be3606ff739d1c1be36389f8737c577ad87e1f57 upstream.

The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before kasan has been
initialized leads to a crash.

Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().

Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu 
Signed-off-by: Andrey Ryabinin 
Cc: kasan-...@googlegroups.com
Cc: Alexander Potapenko 
Cc: Andrew Morton 
Cc: l...@01.org
Cc: Dmitry Vyukov 
Link: http://lkml.kernel.org/r/20170313163337.1704-1-aryabi...@virtuozzo.com
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/x86/kernel/head64.c|1 +
 arch/x86/mm/kasan_init_64.c |1 +
 2 files changed, 2 insertions(+)

--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -4,6 +4,7 @@
  *  Copyright (C) 2000 Andrea Arcangeli  SuSE
  */
 
+#define DISABLE_BRANCH_PROFILING
 #include 
 #include 
 #include 
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -1,3 +1,4 @@
+#define DISABLE_BRANCH_PROFILING
 #define pr_fmt(fmt) "kasan: " fmt
 #include 
 #include

[PATCH 4.10 09/63] vxlan: dont allow overwrite of config src addr

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Brian Russell 


[ Upstream commit 1158632b5a2dcce0786c1b1b99654e81cc867981 ]

When using IPv6 transport and a default dst, a pointer to the configured
source address is passed into the route lookup. If no source address is
configured, then the value is overwritten.

IPv6 route lookup ignores egress ifindex match if the source address is set,
so if egress ifindex match is desired, the source address must be passed
as any. The overwrite breaks this for subsequent lookups.

Avoid this by copying the configured address to an existing stack variable
and pass a pointer to that instead.

Fixes: 272d96a5ab10 ("net: vxlan: lwt: Use source ip address during route 
lookup.")

Signed-off-by: Brian Russell 
Acked-by: Jiri Benc 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/vxlan.c |   12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1992,7 +1992,6 @@ static void vxlan_xmit_one(struct sk_buf
const struct iphdr *old_iph = ip_hdr(skb);
union vxlan_addr *dst;
union vxlan_addr remote_ip, local_ip;
-   union vxlan_addr *src;
struct vxlan_metadata _md;
struct vxlan_metadata *md = &_md;
__be16 src_port = 0, dst_port;
@@ -2019,7 +2018,7 @@ static void vxlan_xmit_one(struct sk_buf
 
dst_port = rdst->remote_port ? rdst->remote_port : 
vxlan->cfg.dst_port;
vni = rdst->remote_vni;
-   src = >cfg.saddr;
+   local_ip = vxlan->cfg.saddr;
dst_cache = >dst_cache;
md->gbp = skb->mark;
ttl = vxlan->cfg.ttl;
@@ -2052,7 +2051,6 @@ static void vxlan_xmit_one(struct sk_buf
dst = &remote_ip;
dst_port = info->key.tp_dst ? : vxlan->cfg.dst_port;
vni = tunnel_id_to_key32(info->key.tun_id);
-   src = &local_ip;
dst_cache = &info->dst_cache;
if (info->options_len)
md = ip_tunnel_info_opts(info);
@@ -2072,7 +2070,7 @@ static void vxlan_xmit_one(struct sk_buf
rt = vxlan_get_route(vxlan, dev, sock4, skb,
 rdst ? rdst->remote_ifindex : 0, tos,
 dst->sin.sin_addr.s_addr,
-   &src->sin.sin_addr.s_addr,
+   &local_ip.sin.sin_addr.s_addr,
 dst_port, src_port,
 dst_cache, info);
if (IS_ERR(rt)) {
@@ -2099,7 +2097,7 @@ static void vxlan_xmit_one(struct sk_buf
if (err < 0)
goto tx_error;
 
-   udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, src->sin.sin_addr.s_addr,
+   udp_tunnel_xmit_skb(rt, sock4->sock->sk, skb, local_ip.sin.sin_addr.s_addr,
dst->sin.sin_addr.s_addr, tos, ttl, df,
src_port, dst_port, xnet, !udp_sum);
 #if IS_ENABLED(CONFIG_IPV6)
@@ -2109,7 +2107,7 @@ static void vxlan_xmit_one(struct sk_buf
ndst = vxlan6_get_route(vxlan, dev, sock6, skb,
rdst ? rdst->remote_ifindex : 0, tos,
label, &dst->sin6.sin6_addr,
-   &src->sin6.sin6_addr,
+   &local_ip.sin6.sin6_addr,
dst_port, src_port,
dst_cache, info);
if (IS_ERR(ndst)) {
@@ -2137,7 +2135,7 @@ static void vxlan_xmit_one(struct sk_buf
goto tx_error;
 
udp_tunnel6_xmit_skb(ndst, sock6->sock->sk, skb, dev,
-   &src->sin6.sin6_addr,
+   &local_ip.sin6.sin6_addr,
 &dst->sin6.sin6_addr, tos, ttl,
 label, src_port, dst_port, !udp_sum);
 #endif
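
The changelog above describes a lookup writing through a pointer into the stored configuration. A minimal userspace sketch of that pattern follows; `struct addr`, `struct config`, `route_lookup()`, `xmit_buggy()` and `xmit_fixed()` are hypothetical stand-ins invented for illustration, not the driver's real types or functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's types; not the kernel definitions. */
struct addr { unsigned char bytes[16]; };
struct config { struct addr saddr; };

static int addr_is_any(const struct addr *a)
{
	static const struct addr any;	/* all zeroes, i.e. "any" */

	return memcmp(a, &any, sizeof(any)) == 0;
}

/* Stand-in for a route lookup that fills in *saddr when it is "any". */
static void route_lookup(struct addr *saddr)
{
	assert(saddr != NULL);
	if (addr_is_any(saddr))
		saddr->bytes[15] = 1;	/* pretend the lookup picked an address */
}

/* Buggy pattern: the lookup writes through a pointer into the config. */
static void xmit_buggy(struct config *cfg)
{
	route_lookup(&cfg->saddr);
}

/* Fixed pattern: copy the configured value to the stack and pass that. */
static void xmit_fixed(const struct config *cfg)
{
	struct addr local_ip = cfg->saddr;

	route_lookup(&local_ip);
}
```

With the buggy variant the configured address stops being "any" after the first call, so later lookups no longer behave as configured; the fixed variant leaves the configuration untouched.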

[PATCH 4.10 25/63] amd-xgbe: Be sure to set MDIO modes on device (re)start

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: "Lendacky, Thomas" 


[ Upstream commit b42c6761fd1651f564491b53016046c9ebf0b2a9 ]

The MDIO register mode is set when the device is probed. But when the
device is brought down and then back up, the MDIO register mode has been
reset.  Be sure to reset the mode during device startup and only change
the mode of the address specified.

Signed-off-by: Tom Lendacky 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/amd/xgbe/xgbe-dev.c|2 +-
 drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c |   22 ++
 2 files changed, 23 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c
@@ -1323,7 +1323,7 @@ static int xgbe_read_ext_mii_regs(struct
 static int xgbe_set_ext_mii_mode(struct xgbe_prv_data *pdata, unsigned int port,
 enum xgbe_mdio_mode mode)
 {
-   unsigned int reg_val = 0;
+   unsigned int reg_val = XGMAC_IOREAD(pdata, MAC_MDIOCL22R);
 
switch (mode) {
case XGBE_MDIO_MODE_CL22:
--- a/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
+++ b/drivers/net/ethernet/amd/xgbe/xgbe-phy-v2.c
@@ -875,6 +875,16 @@ static int xgbe_phy_find_phy_device(stru
!phy_data->sfp_phy_avail)
return 0;
 
+   /* Set the proper MDIO mode for the PHY */
+   ret = pdata->hw_if.set_ext_mii_mode(pdata, phy_data->mdio_addr,
+   phy_data->phydev_mode);
+   if (ret) {
+   netdev_err(pdata->netdev,
+  "mdio port/clause not compatible (%u/%u)\n",
+  phy_data->mdio_addr, phy_data->phydev_mode);
+   return ret;
+   }
+
/* Create and connect to the PHY device */
phydev = get_phy_device(phy_data->mii, phy_data->mdio_addr,
(phy_data->phydev_mode == XGBE_MDIO_MODE_CL45));
@@ -2722,6 +2732,18 @@ static int xgbe_phy_start(struct xgbe_pr
if (ret)
return ret;
 
+   /* Set the proper MDIO mode for the re-driver */
+   if (phy_data->redrv && !phy_data->redrv_if) {
+   ret = pdata->hw_if.set_ext_mii_mode(pdata, phy_data->redrv_addr,
+   XGBE_MDIO_MODE_CL22);
+   if (ret) {
+   netdev_err(pdata->netdev,
+  "redriver mdio port not compatible (%u)\n",
+  phy_data->redrv_addr);
+   return ret;
+   }
+   }
+
/* Start in highest supported mode */
xgbe_phy_set_mode(pdata, phy_data->start_mode);
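
The one-line change in xgbe-dev.c turns a write-from-zero into a read-modify-write. The sketch below models that with a simulated register; the one-bit-per-port layout, the clearing of the bit for clause 45, and all names (`fake_mdio_reg`, `set_mode_from_zero()`, `set_mode_rmw()`) are assumptions made for illustration, not the hardware definition.

```c
#include <assert.h>

enum mdio_mode { MODE_CL22, MODE_CL45 };

/* Simulated register: bit N set means port N uses clause 22 (assumed layout). */
static unsigned int fake_mdio_reg;

/* Starts from zero, so every call forgets the bits of the other ports. */
static void set_mode_from_zero(unsigned int port, enum mdio_mode mode)
{
	unsigned int reg_val = 0;

	if (mode == MODE_CL22)
		reg_val |= 1u << port;
	fake_mdio_reg = reg_val;
}

/* Read-modify-write: only the bit of the addressed port changes. */
static void set_mode_rmw(unsigned int port, enum mdio_mode mode)
{
	unsigned int reg_val = fake_mdio_reg;

	assert(port < 32);
	if (mode == MODE_CL22)
		reg_val |= 1u << port;
	else
		reg_val &= ~(1u << port);
	fake_mdio_reg = reg_val;
}
```

Setting ports 0 and 3 to clause 22 with the first helper leaves only bit 3 set; with the second helper both bits survive.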

[PATCH 4.10 56/63] x86/kasan: Fix boot with KASAN=y and PROFILE_ANNOTATED_BRANCHES=y

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Andrey Ryabinin 

commit be3606ff739d1c1be36389f8737c577ad87e1f57 upstream.

The kernel doesn't boot with both PROFILE_ANNOTATED_BRANCHES=y and KASAN=y
options selected. With branch profiling enabled we end up calling
ftrace_likely_update() before kasan_early_init(). ftrace_likely_update() is
built with KASAN instrumentation, so calling it before KASAN has been
initialized leads to a crash.

Use DISABLE_BRANCH_PROFILING define to make sure that we don't call
ftrace_likely_update() from early code before kasan_early_init().

Fixes: ef7f0d6a6ca8 ("x86_64: add KASan support")
Reported-by: Fengguang Wu 
Signed-off-by: Andrey Ryabinin 
Cc: kasan-...@googlegroups.com
Cc: Alexander Potapenko 
Cc: Andrew Morton 
Cc: l...@01.org
Cc: Dmitry Vyukov 
Link: http://lkml.kernel.org/r/20170313163337.1704-1-aryabi...@virtuozzo.com
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/x86/kernel/head64.c|1 +
 arch/x86/mm/kasan_init_64.c |1 +
 2 files changed, 2 insertions(+)

--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -4,6 +4,7 @@
  *  Copyright (C) 2000 Andrea Arcangeli  SuSE
  */
 
+#define DISABLE_BRANCH_PROFILING
 #include 
 #include 
 #include 
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -1,3 +1,4 @@
+#define DISABLE_BRANCH_PROFILING
 #define pr_fmt(fmt) "kasan: " fmt
 #include 
 #include
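
Why the define has to be the first line of each file can be shown with a small userspace model. This is only a sketch: `profile_branch()` stands in for ftrace_likely_update(), `early_init()` is a made-up caller, and `__builtin_expect` assumes GCC or Clang.

```c
#include <assert.h>

static int hook_calls;	/* counts calls into the simulated profiling hook */

/* Stands in for ftrace_likely_update(); unsafe to call "too early". */
int profile_branch(int cond)
{
	hook_calls++;
	return cond;
}

/* Must be seen before the macro below is defined, hence "first line". */
#define DISABLE_BRANCH_PROFILING

#ifdef DISABLE_BRANCH_PROFILING
#define likely(x)	__builtin_expect(!!(x), 1)
#else
#define likely(x)	profile_branch(!!(x))
#endif

/* Hypothetical early-boot code path that uses an annotated branch. */
static int early_init(int ready)
{
	assert(ready == 0 || ready == 1);
	if (likely(ready))
		return 1;
	return 0;
}
```

With the define in place the annotated branch compiles to a plain hint and the hook is never entered; removing the define routes every likely() through the hook.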

[PATCH] drm/gma500: fix memory leak on edid

2017-03-20 Thread Colin King

From: Colin Ian King 

edid is allocated on the call to psb_intel_sdvo_get_edid but not
kfree'd at all, causing a memory leak.  Fix this by kfree'ing
the edid.  (This may be null, but kfree can handle null frees).

Detected by CoverityScan, CID#1090730 ("Resource Leak")

Fixes: 5736995b473b ("gma500: Replace SDVO code with slightly modified version from i915")
Signed-off-by: Colin Ian King 
---
 drivers/gpu/drm/gma500/psb_intel_sdvo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/gma500/psb_intel_sdvo.c b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
index e787d376ba67..f38e6ad1ab9b 100644
--- a/drivers/gpu/drm/gma500/psb_intel_sdvo.c
+++ b/drivers/gpu/drm/gma500/psb_intel_sdvo.c
@@ -1650,6 +1650,7 @@ static bool psb_intel_sdvo_detect_hdmi_audio(struct drm_connector *connector)
edid = psb_intel_sdvo_get_edid(connector);
if (edid != NULL && edid->input & DRM_EDID_INPUT_DIGITAL)
has_audio = drm_detect_monitor_audio(edid);
+   kfree(edid);
 
return has_audio;
 }
-- 
2.11.0
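
The shape of the fix, an unconditional free that is safe for a NULL pointer, can be sketched in userspace with free() in place of kfree(). `struct fake_edid`, `get_edid()`, `put_edid()` and the `live_allocs` counter are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct fake_edid { unsigned char input; };
#define FAKE_INPUT_DIGITAL 0x80

static int live_allocs;	/* allocations not yet freed */

/* Returns a freshly allocated EDID, or NULL when nothing is connected. */
static struct fake_edid *get_edid(int connected)
{
	struct fake_edid *edid;

	if (!connected)
		return NULL;
	edid = malloc(sizeof(*edid));
	if (edid) {
		edid->input = FAKE_INPUT_DIGITAL;
		live_allocs++;
	}
	return edid;
}

static void put_edid(struct fake_edid *edid)
{
	if (edid)
		live_allocs--;
	assert(live_allocs >= 0);
	free(edid);	/* free(NULL) is a no-op, like kfree(NULL) */
}

static int detect_hdmi_audio(int connected)
{
	struct fake_edid *edid = get_edid(connected);
	int has_audio = 0;

	if (edid != NULL && (edid->input & FAKE_INPUT_DIGITAL))
		has_audio = 1;
	put_edid(edid);	/* runs on every path, NULL included */

	return has_audio;
}
```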

Re: [PATCH] fs/pstore: Perform erase from a worker

2017-03-20 Thread Kees Cook

On Fri, Mar 17, 2017 at 2:52 AM, Chris Wilson  wrote:
> In order to prevent a cyclic recursion between psi->read_mutex and the
> inode_lock, we need to move the pse->erase to a worker.
>
> [  605.374955] ==
> [  605.381281] [ INFO: possible circular locking dependency detected ]
> [  605.387679] 4.11.0-rc2-CI-CI_DRM_2352+ #1 Not tainted
> [  605.392826] ---
> [  605.399196] rm/7298 is trying to acquire lock:
> [  605.403720]  (&psi->read_mutex){+.+.+.}, at: [] pstore_unlink+0x3f/0xa0
> [  605.412300]
> [  605.412300] but task is already holding lock:
> [  605.418237]  (&sb->s_type->i_mutex_key#14){++}, at: [] vfs_unlink+0x4c/0x190
> [  605.427397]
> [  605.427397] which lock already depends on the new lock.
> [  605.427397]
> [  605.435770]
> [  605.435770] the existing dependency chain (in reverse order) is:
> [  605.443396]
> [  605.443396] -> #1 (>s_type->i_mutex_key#14){++}:
> [  605.450347]lock_acquire+0xc9/0x220
> [  605.454551]down_write+0x3f/0x70
> [  605.458484]pstore_mkfile+0x1f4/0x460
> [  605.462835]pstore_get_records+0x17a/0x320
> [  605.467664]pstore_fill_super+0xa4/0xc0
> [  605.472205]mount_single+0x89/0xb0
> [  605.476314]pstore_mount+0x13/0x20
> [  605.480411]mount_fs+0xf/0x90
> [  605.484122]vfs_kern_mount+0x66/0x170
> [  605.488464]do_mount+0x190/0xd50
> [  605.492397]SyS_mount+0x90/0xd0
> [  605.496212]entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  605.501496]
> [  605.501496] -> #0 (>read_mutex){+.+.+.}:
> [  605.507747]__lock_acquire+0x1ac0/0x1bb0
> [  605.512401]lock_acquire+0xc9/0x220
> [  605.516594]__mutex_lock+0x6e/0x990
> [  605.520755]mutex_lock_nested+0x16/0x20
> [  605.525279]pstore_unlink+0x3f/0xa0
> [  605.529465]vfs_unlink+0xb5/0x190
> [  605.533477]do_unlinkat+0x24c/0x2a0
> [  605.537672]SyS_unlinkat+0x16/0x30
> [  605.541781]entry_SYSCALL_64_fastpath+0x1c/0xb1

If I'm reading this right it's a race between mount and unlink...
that's quite a corner case. :)

> [  605.547067]
> [  605.547067] other info that might help us debug this:
> [  605.547067]
> [  605.555221]  Possible unsafe locking scenario:
> [  605.555221]
> [  605.561280]        CPU0                    CPU1
> [  605.565883]        ----                    ----
> [  605.570502]   lock(&sb->s_type->i_mutex_key#14);
> [  605.575217]                                lock(&psi->read_mutex);
> [  605.581803]                                lock(&sb->s_type->i_mutex_key#14);
> [  605.589159]   lock(&psi->read_mutex);

I haven't had time to dig much yet, but I wonder if the locking order
on unlink could just be reversed, and the deadlock would go away?
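
The report above boils down to two code paths taking the same two locks in opposite order. A toy order checker, far simpler than lockdep and only able to spot direct inversions, might look like the sketch below; `record_order()` and the lock ids are invented for illustration.

```c
#include <assert.h>

enum { READ_MUTEX, INODE_LOCK, MAX_LOCKS };

/* seen[a][b] != 0 means lock b was once taken while lock a was held. */
static int seen[MAX_LOCKS][MAX_LOCKS];

/* Record "held -> next"; return 1 if the reverse order was seen before. */
static int record_order(int held, int next)
{
	assert(held >= 0 && held < MAX_LOCKS);
	assert(next >= 0 && next < MAX_LOCKS);
	seen[held][next] = 1;
	return seen[next][held];
}
```

In the trace, the mount path takes the inode lock under read_mutex (chain entry #1), and the unlink path takes read_mutex under the inode lock (entry #0), so the second recording reports an inversion.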

> [  605.593156]
> [  605.593156]  *** DEADLOCK ***
> [  605.593156]
> [  605.599214] 3 locks held by rm/7298:
> [  605.602896]  #0:  (sb_writers#11){.+.+..}, at: [] mnt_want_write+0x1f/0x50
> [  605.611490]  #1:  (&sb->s_type->i_mutex_key#14/1){+.+...}, at: [] do_unlinkat+0x11c/0x2a0
> [  605.621417]  #2:  (&sb->s_type->i_mutex_key#14){++}, at: [] vfs_unlink+0x4c/0x190
> [  605.630995]
> [  605.630995] stack backtrace:
> [  605.635450] CPU: 7 PID: 7298 Comm: rm Not tainted 4.11.0-rc2-CI-CI_DRM_2352+ #1
> [  605.642999] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F21 01/06/2017
> [  605.652305] Call Trace:
> [  605.654814]  dump_stack+0x67/0x92
> [  605.658184]  print_circular_bug+0x1e0/0x2e0
> [  605.662465]  __lock_acquire+0x1ac0/0x1bb0
> [  605.34]  ? retint_kernel+0x2d/0x2d
> [  605.670456]  lock_acquire+0xc9/0x220
> [  605.674112]  ? pstore_unlink+0x3f/0xa0
> [  605.677970]  ? pstore_unlink+0x3f/0xa0
> [  605.681818]  __mutex_lock+0x6e/0x990
> [  605.685456]  ? pstore_unlink+0x3f/0xa0
> [  605.689791]  ? pstore_unlink+0x3f/0xa0
> [  605.694124]  ? vfs_unlink+0x4c/0x190
> [  605.698310]  mutex_lock_nested+0x16/0x20
> [  605.702859]  pstore_unlink+0x3f/0xa0
> [  605.707021]  vfs_unlink+0xb5/0x190
> [  605.711024]  do_unlinkat+0x24c/0x2a0
> [  605.715194]  SyS_unlinkat+0x16/0x30
> [  605.719275]  entry_SYSCALL_64_fastpath+0x1c/0xb1
> [  605.724543] RIP: 0033:0x7f8b08073ed7
> [  605.728676] RSP: 002b:7ffe70eff628 EFLAGS: 0206 ORIG_RAX: 0107
> [  605.736929] RAX: ffda RBX: 8147ea93 RCX: 7f8b08073ed7
> [  605.744711] RDX:  RSI: 0145 RDI: ff9c
> [  605.752512] RBP: c9000338ff88 R08: 0003 R09: 
> [  605.760276] R10: 015e R11: 0206 R12: 
> [  605.768040] R13: 7ffe70eff750 R14: 0144ff70 R15: 01451230
> [  605.775800]  ? __this_cpu_preempt_check+0x13/0x20
>
> Reported-by: Tomi Sarvela 
> Fixes: e9e360b08a44 ("pstore: Protect unlink with read_mutex")

[PATCH 4.10 61/63] locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Niklas Cassel 

commit 17fcbd590d0c3e35bd9646e2215f86586378bc42 upstream.

We hang if SIGKILL has been sent, but the task is stuck in down_read()
(after do_exit()), even though no task is doing down_write() on the
rwsem in question:

  INFO: task libupnp:21868 blocked for more than 120 seconds.
  libupnp D0 21868  1 0x0818
  ...
  Call Trace:
  __schedule()
  schedule()
  __down_read()
  do_exit()
  do_group_exit()
  __wake_up_parent()

This bug has already been fixed for CONFIG_RWSEM_XCHGADD_ALGORITHM=y in
the following commit:

 04cafed7fc19 ("locking/rwsem: Fix down_write_killable()")

... however, this bug also exists for CONFIG_RWSEM_GENERIC_SPINLOCK=y.

Signed-off-by: Niklas Cassel 
Signed-off-by: Peter Zijlstra (Intel) 
Cc: 
Cc: Andrew Morton 
Cc: Linus Torvalds 
Cc: Niklas Cassel 
Cc: Paul E. McKenney 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Fixes: d47996082f52 ("locking/rwsem: Introduce basis for down_write_killable()")
Link: http://lkml.kernel.org/r/1487981873-12649-1-git-send-email-nikl...@axis.com
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/locking/rwsem-spinlock.c |   15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

--- a/kernel/locking/rwsem-spinlock.c
+++ b/kernel/locking/rwsem-spinlock.c
@@ -216,10 +216,8 @@ int __sched __down_write_common(struct r
 */
if (sem->count == 0)
break;
-   if (signal_pending_state(state, current)) {
-   ret = -EINTR;
-   goto out;
-   }
+   if (signal_pending_state(state, current))
+   goto out_nolock;
set_task_state(tsk, state);
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
schedule();
@@ -227,12 +225,19 @@ int __sched __down_write_common(struct r
}
/* got the lock */
sem->count = -1;
-out:
list_del(&waiter.list);
 
raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
 
return ret;
+
+out_nolock:
+   list_del(&waiter.list);
+   if (!list_empty(&sem->wait_list))
+   __rwsem_do_wake(sem, 1);
+   raw_spin_unlock_irqrestore(&sem->wait_lock, flags);
+
+   return -EINTR;
 }
 
 void __sched __down_write(struct rw_semaphore *sem)
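
The hang described in the changelog comes from a killed writer leaving the wait queue without waking whoever queued behind it. The following is a heavily simplified, single-threaded model of that situation; the array-based queue, `wake_front_readers()` and `writer_bails_out()` are assumptions for illustration and do not mirror the real rwsem data structures.

```c
#include <assert.h>

enum kind { READER, WRITER };
struct waiter { enum kind kind; int asleep; };

#define MAX_WAITERS 8
static struct waiter *queue[MAX_WAITERS];
static int nwaiters;
static int count;	/* >0: readers holding, -1: writer holding, 0: free */

static void enqueue(struct waiter *w)
{
	assert(nwaiters < MAX_WAITERS);
	w->asleep = 1;
	queue[nwaiters++] = w;
}

static void dequeue_head(void)
{
	int i;

	for (i = 1; i < nwaiters; i++)
		queue[i - 1] = queue[i];
	nwaiters--;
}

/* Simplified wake-up: grant the lock to readers at the queue head. */
static void wake_front_readers(void)
{
	while (nwaiters > 0 && queue[0]->kind == READER && count >= 0) {
		queue[0]->asleep = 0;
		count++;
		dequeue_head();
	}
}

/* A killed writer at the head of the queue gives up waiting. */
static void writer_bails_out(int wake_next)
{
	assert(nwaiters > 0 && queue[0]->kind == WRITER);
	dequeue_head();
	if (wake_next && nwaiters > 0)
		wake_front_readers();
}
```

With one reader holding the lock, a queued writer and a reader behind it, bailing out without the wake-up leaves the queued reader asleep even though it could share the lock; with the wake-up it is granted the lock immediately.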

[PATCH 4.10 60/63] futex: Add missing error handling to FUTEX_REQUEUE_PI

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Peter Zijlstra 

commit 9bbb25afeb182502ca4f2c4f3f88af0681b34cae upstream.

Thomas spotted that fixup_pi_state_owner() can return errors and we
fail to unlock the rt_mutex in that case.

Reported-by: Thomas Gleixner 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Darren Hart 
Cc: juri.le...@arm.com
Cc: bige...@linutronix.de
Cc: xlp...@redhat.com
Cc: rost...@goodmis.org
Cc: mathieu.desnoy...@efficios.com
Cc: jdesfos...@efficios.com
Cc: dvh...@infradead.org
Cc: bris...@redhat.com
Link: http://lkml.kernel.org/r/20170304093558.867401...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/futex.c |2 ++
 1 file changed, 2 insertions(+)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2896,6 +2896,8 @@ static int futex_wait_requeue_pi(u32 __u
if (q.pi_state && (q.pi_state->owner != current)) {
spin_lock(q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
+   if (ret && rt_mutex_owner(&q.pi_state->pi_mutex) == current)
+   rt_mutex_unlock(&q.pi_state->pi_mutex);
/*
 * Drop the reference to the pi state which
 * the requeue_pi() code acquired for us.
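
The two added lines follow a common rule: if a fixup step fails while we own the lock, release it before returning the error. A userspace sketch under stated assumptions: `struct fake_mutex`, `fixup_owner()` and `finish_requeue()` are hypothetical, and -14 merely stands in for -EFAULT.

```c
#include <assert.h>
#include <stddef.h>

struct fake_mutex { const void *owner; };

static const void *mutex_owner(const struct fake_mutex *m)
{
	return m->owner;
}

static void mutex_unlock(struct fake_mutex *m)
{
	m->owner = NULL;
}

/* Hypothetical fixup step that may fail with a negative error code. */
static int fixup_owner(int simulate_fault)
{
	return simulate_fault ? -14 : 0;
}

static int finish_requeue(struct fake_mutex *m, const void *self,
			  int simulate_fault)
{
	int ret = fixup_owner(simulate_fault);

	assert(m != NULL);
	/* On error, do not return to the caller still owning the lock. */
	if (ret && mutex_owner(m) == self)
		mutex_unlock(m);
	return ret;
}
```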

[PATCH 4.10 59/63] futex: Fix potential use-after-free in FUTEX_REQUEUE_PI

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Peter Zijlstra 

commit c236c8e95a3d395b0494e7108f0d41cf36ec107c upstream.

While working on the futex code, I stumbled over this potential
use-after-free scenario. Dmitry triggered it later with syzkaller.

pi_mutex is a pointer into pi_state, which we drop the reference on in
unqueue_me_pi(). So any access to that pointer after that is bad.

Since other sites already do rt_mutex_unlock() with hb->lock held, see
for example futex_lock_pi(), simply move the unlock before
unqueue_me_pi().

Reported-by: Dmitry Vyukov 
Signed-off-by: Peter Zijlstra (Intel) 
Reviewed-by: Darren Hart 
Cc: juri.le...@arm.com
Cc: bige...@linutronix.de
Cc: xlp...@redhat.com
Cc: rost...@goodmis.org
Cc: mathieu.desnoy...@efficios.com
Cc: jdesfos...@efficios.com
Cc: dvh...@infradead.org
Cc: bris...@redhat.com
Link: http://lkml.kernel.org/r/20170304093558.801744...@infradead.org
Signed-off-by: Thomas Gleixner 
Signed-off-by: Greg Kroah-Hartman 

---
 kernel/futex.c |   20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2813,7 +2813,6 @@ static int futex_wait_requeue_pi(u32 __u
 {
struct hrtimer_sleeper timeout, *to = NULL;
struct rt_mutex_waiter rt_waiter;
-   struct rt_mutex *pi_mutex = NULL;
struct futex_hash_bucket *hb;
union futex_key key2 = FUTEX_KEY_INIT;
struct futex_q q = futex_q_init;
@@ -2905,6 +2904,8 @@ static int futex_wait_requeue_pi(u32 __u
spin_unlock(q.lock_ptr);
}
} else {
+   struct rt_mutex *pi_mutex;
+
/*
 * We have been woken up by futex_unlock_pi(), a timeout, or a
 * signal.  futex_unlock_pi() will not destroy the lock_ptr nor
@@ -2928,18 +2929,19 @@ static int futex_wait_requeue_pi(u32 __u
if (res)
ret = (res < 0) ? res : 0;
 
+   /*
+* If fixup_pi_state_owner() faulted and was unable to handle
+* the fault, unlock the rt_mutex and return the fault to
+* userspace.
+*/
+   if (ret && rt_mutex_owner(pi_mutex) == current)
+   rt_mutex_unlock(pi_mutex);
+
/* Unqueue and drop the lock. */
unqueue_me_pi(&q);
}
 
-   /*
-* If fixup_pi_state_owner() faulted and was unable to handle the
-* fault, unlock the rt_mutex and return the fault to userspace.
-*/
-   if (ret == -EFAULT) {
-   if (pi_mutex && rt_mutex_owner(pi_mutex) == current)
-   rt_mutex_unlock(pi_mutex);
-   } else if (ret == -EINTR) {
+   if (ret == -EINTR) {
/*
 * We've already been requeued, but cannot restart by calling
 * futex_lock_pi() directly. We could restart this syscall, but
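
The ordering rule behind this patch is that a pointer into a reference-counted object must be used before the reference is dropped. A minimal sketch follows; `struct pi_state` here is a toy structure, and `put_pi_state()` and `unlock_then_put()` are invented helpers, not the futex code.

```c
#include <assert.h>
#include <stdlib.h>

struct pi_state { int refcount; int mutex_locked; };

static int states_freed;	/* how many objects have been released */

static void put_pi_state(struct pi_state *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0) {
		free(s);
		states_freed++;
	}
}

/* Safe order: touch the embedded mutex first, drop the reference last. */
static void unlock_then_put(struct pi_state *s)
{
	int *pi_mutex = &s->mutex_locked;	/* pointer into *s */

	*pi_mutex = 0;		/* still valid: we hold a reference */
	put_pi_state(s);	/* may free s; pi_mutex is dead after this */
}
```

Swapping the two statements in unlock_then_put() would write through `pi_mutex` after the object may already have been freed, which is the use-after-free the changelog describes.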

[PATCH 4.10 49/63] arm64: KVM: VHE: Clear HCR_TGE when invalidating guest TLBs

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Marc Zyngier 

commit 68925176296a8b995e503349200e256674bfe5ac upstream.

When invalidating guest TLBs, special care must be taken to
actually shoot the guest TLBs and not the host ones if we're
running on a VHE system.  This is controlled by the HCR_EL2.TGE
bit, which we forget to clear before invalidating TLBs.

Address the issue by introducing two wrappers (__tlb_switch_to_guest
and __tlb_switch_to_host) that take care of both the VTTBR_EL2
and HCR_EL2.TGE switching.

Reported-by: Tomasz Nowicki 
Tested-by: Tomasz Nowicki 
Reviewed-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/arm64/kvm/hyp/tlb.c |   64 ---
 1 file changed, 55 insertions(+), 9 deletions(-)

--- a/arch/arm64/kvm/hyp/tlb.c
+++ b/arch/arm64/kvm/hyp/tlb.c
@@ -17,14 +17,62 @@
 
 #include 
 
+static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm)
+{
+   u64 val;
+
+   /*
+* With VHE enabled, we have HCR_EL2.{E2H,TGE} = {1,1}, and
+* most TLB operations target EL2/EL0. In order to affect the
+* guest TLBs (EL1/EL0), we need to change one of these two
+* bits. Changing E2H is impossible (goodbye TTBR1_EL2), so
+* let's flip TGE before executing the TLB operation.
+*/
+   write_sysreg(kvm->arch.vttbr, vttbr_el2);
+   val = read_sysreg(hcr_el2);
+   val &= ~HCR_TGE;
+   write_sysreg(val, hcr_el2);
+   isb();
+}
+
+static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm)
+{
+   write_sysreg(kvm->arch.vttbr, vttbr_el2);
+   isb();
+}
+
+static hyp_alternate_select(__tlb_switch_to_guest,
+   __tlb_switch_to_guest_nvhe,
+   __tlb_switch_to_guest_vhe,
+   ARM64_HAS_VIRT_HOST_EXTN);
+
+static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm)
+{
+   /*
+* We're done with the TLB operation, let's restore the host's
+* view of HCR_EL2.
+*/
+   write_sysreg(0, vttbr_el2);
+   write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
+}
+
+static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm)
+{
+   write_sysreg(0, vttbr_el2);
+}
+
+static hyp_alternate_select(__tlb_switch_to_host,
+   __tlb_switch_to_host_nvhe,
+   __tlb_switch_to_host_vhe,
+   ARM64_HAS_VIRT_HOST_EXTN);
+
 void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
dsb(ishst);
 
/* Switch to requested VMID */
kvm = kern_hyp_va(kvm);
-   write_sysreg(kvm->arch.vttbr, vttbr_el2);
-   isb();
+   __tlb_switch_to_guest()(kvm);
 
/*
 * We could do so much better if we had the VA as well.
@@ -45,7 +93,7 @@ void __hyp_text __kvm_tlb_flush_vmid_ipa
dsb(ish);
isb();
 
-   write_sysreg(0, vttbr_el2);
+   __tlb_switch_to_host()(kvm);
 }
 
 void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
@@ -54,14 +102,13 @@ void __hyp_text __kvm_tlb_flush_vmid(str
 
/* Switch to requested VMID */
kvm = kern_hyp_va(kvm);
-   write_sysreg(kvm->arch.vttbr, vttbr_el2);
-   isb();
+   __tlb_switch_to_guest()(kvm);
 
asm volatile("tlbi vmalls12e1is" : : );
dsb(ish);
isb();
 
-   write_sysreg(0, vttbr_el2);
+   __tlb_switch_to_host()(kvm);
 }
 
 void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
@@ -69,14 +116,13 @@ void __hyp_text __kvm_tlb_flush_local_vm
struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm);
 
/* Switch to requested VMID */
-   write_sysreg(kvm->arch.vttbr, vttbr_el2);
-   isb();
+   __tlb_switch_to_guest()(kvm);
 
asm volatile("tlbi vmalle1" : : );
dsb(nsh);
isb();
 
-   write_sysreg(0, vttbr_el2);
+   __tlb_switch_to_host()(kvm);
 }
 
 void __hyp_text __kvm_flush_vm_context(void)
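
The TGE flip described in the __tlb_switch_to_guest_vhe() comment comes down to a single bit-clear on the HCR_EL2 value. A minimal sketch of that arithmetic, outside the kernel: the EX_ names and the helper are hypothetical, and the bit positions (TGE = bit 27, E2H = bit 34) are taken from the Arm architecture definition of HCR_EL2, not from this patch.

```c
#include <stdint.h>

/* Assumed bit positions of HCR_EL2.TGE and HCR_EL2.E2H (Arm architecture). */
#define EX_HCR_TGE (UINT64_C(1) << 27)
#define EX_HCR_E2H (UINT64_C(1) << 34)

/*
 * Hypothetical helper: the HCR_EL2 value to use while invalidating
 * guest TLBs. Only TGE is cleared; E2H and all other bits are kept.
 */
static uint64_t ex_hcr_for_guest_tlbi(uint64_t hcr)
{
	return hcr & ~EX_HCR_TGE;
}
```

In the patch itself the host value is not recomputed afterwards; __tlb_switch_to_host_vhe() simply writes HCR_HOST_VHE_FLAGS back.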

[PATCH 4.10 50/63] irqchip/gicv3-its: Add workaround for QDF2400 ITS erratum 0065

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Shanker Donthineni 

commit 90922a2d03d84de36bf8a9979d62580102f31a92 upstream.

On Qualcomm Datacenter Technologies QDF2400 SoCs, the ITS hardware
implementation uses 16Bytes for Interrupt Translation Entry (ITE),
but reports an incorrect value of 8Bytes in GITS_TYPER.ITTE_size.

It might cause kernel memory corruption depending on the number
of MSI(x) that are configured and the amount of memory that has
been allocated for ITEs in its_create_device().

This patch fixes the potential memory corruption by setting the
correct ITE size to 16Bytes.

Cc: sta...@vger.kernel.org
Signed-off-by: Shanker Donthineni 
Signed-off-by: Marc Zyngier 
Signed-off-by: Greg Kroah-Hartman 

---
 Documentation/arm64/silicon-errata.txt |   44 +
 arch/arm64/Kconfig |   10 +++
 drivers/irqchip/irq-gic-v3-its.c   |   16 
 3 files changed, 49 insertions(+), 21 deletions(-)

--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -42,24 +42,26 @@ file acts as a registry of software work
 will be updated when new workarounds are committed and backported to
 stable kernels.
 
-| Implementor    | Component       | Erratum ID      | Kconfig                 |
-+----------------+-----------------+-----------------+-------------------------+
-| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319    |
-| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319    |
-| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069    |
-| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472    |
-| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719    |
-| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419    |
-| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075    |
-| ARM            | Cortex-A57      | #852523         | N/A                     |
-| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220    |
-| ARM            | Cortex-A72      | #853709         | N/A                     |
-| ARM            | MMU-500         | #841119,#826419 | N/A                     |
-|                |                 |                 |                         |
-| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375    |
-| Cavium         | ThunderX ITS    | #23144          | CAVIUM_ERRATUM_23144    |
-| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154    |
-| Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456    |
-| Cavium         | ThunderX SMMUv2 | #27704          | N/A                     |
-|                |                 |                 |                         |
-| Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585     |
+| Implementor    | Component       | Erratum ID      | Kconfig                     |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A53      | #826319         | ARM64_ERRATUM_826319        |
+| ARM            | Cortex-A53      | #827319         | ARM64_ERRATUM_827319        |
+| ARM            | Cortex-A53      | #824069         | ARM64_ERRATUM_824069        |
+| ARM            | Cortex-A53      | #819472         | ARM64_ERRATUM_819472        |
+| ARM            | Cortex-A53      | #845719         | ARM64_ERRATUM_845719        |
+| ARM            | Cortex-A53      | #843419         | ARM64_ERRATUM_843419        |
+| ARM            | Cortex-A57      | #832075         | ARM64_ERRATUM_832075        |
+| ARM            | Cortex-A57      | #852523         | N/A                         |
+| ARM            | Cortex-A57      | #834220         | ARM64_ERRATUM_834220        |
+| ARM            | Cortex-A72      | #853709         | N/A                         |
+| ARM            | MMU-500         | #841119,#826419 | N/A                         |
+|                |                 |                 |                             |
+| Cavium         | ThunderX ITS    | #22375, #24313  | CAVIUM_ERRATUM_22375        |
+| Cavium         | ThunderX ITS    | #23144          | CAVIUM_ERRATUM_23144        |
+| Cavium         | ThunderX GICv3  | #23154          | CAVIUM_ERRATUM_23154        |
+| Cavium         | ThunderX Core   | #27456          | CAVIUM_ERRATUM_27456        |
+| Cavium         | ThunderX SMMUv2 | #27704          | N/A                         |
+|                |                 |                 |                             |
+| Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585         |
+|                |                 |                 |                             |
+| Qualcomm Tech. | QDF2400 ITS     | E0065           |
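
The corruption risk described in the commit message comes from the table being sized with the ITE size the hardware reports while the hardware writes entries of its real size. A rough sketch of that arithmetic, assuming the table size is simply entries times entry size: ex_itt_bytes() is a hypothetical helper, not the driver's code, and the real its_create_device() applies additional rounding and alignment.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed for a table of nr_ites entries. */
static size_t ex_itt_bytes(size_t nr_ites, size_t ite_size)
{
	return nr_ites * ite_size;
}
```

For 32 entries, sizing with the reported 8 bytes gives 256 bytes, while 16-byte entries occupy 512 bytes, so the second half of the entries would land outside the allocation.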

[PATCH 4.10 62/63] crypto: powerpc - Fix initialisation of crc32c context

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Daniel Axtens 

commit aa2be9b3d6d2d699e9ca7cbfc00867c80e5da213 upstream.

Turning on crypto self-tests on a POWER8 shows:

alg: hash: Test 1 failed for crc32c-vpmsum
: ff ff ff ff

Comparing the code with the Intel CRC32c implementation on which
ours is based shows that we are doing an init with 0, not ~0
as CRC32c requires.

This probably wasn't caught because btrfs does its own weird
open-coded initialisation.

Initialise our internal context to ~0 on init.

This makes the self-tests pass, and btrfs continues to work.

Fixes: 6dd7a82cc54e ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: Anton Blanchard 
Signed-off-by: Daniel Axtens 
Acked-by: Anton Blanchard 
Signed-off-by: Herbert Xu 
Signed-off-by: Greg Kroah-Hartman 

---
 arch/powerpc/crypto/crc32c-vpmsum_glue.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/powerpc/crypto/crc32c-vpmsum_glue.c
+++ b/arch/powerpc/crypto/crc32c-vpmsum_glue.c
@@ -52,7 +52,7 @@ static int crc32c_vpmsum_cra_init(struct
 {
u32 *key = crypto_tfm_ctx(tfm);
 
-   *key = 0;
+   *key = ~0;
 
return 0;
 }
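
To illustrate why the seed matters, here is a small bitwise CRC32c in plain C. It is a sketch for illustration only, not the vpmsum implementation: the ex_ names are hypothetical, and the polynomial constant is the reflected Castagnoli polynomial. The state is seeded with ~0 and the result is inverted at the end, which is the convention the commit message says CRC32c requires.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the Castagnoli polynomial used by CRC32c. */
#define EX_CRC32C_POLY UINT32_C(0x82F63B78)

/* Fold len bytes into a running CRC state, one bit at a time. */
static uint32_t ex_crc32c_update(uint32_t crc, const unsigned char *buf,
				 size_t len)
{
	while (len--) {
		crc ^= *buf++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (EX_CRC32C_POLY & (0U - (crc & 1U)));
	}
	return crc;
}

/* CRC32c convention: state seeded with ~0, result inverted at the end. */
static uint32_t ex_crc32c(const unsigned char *buf, size_t len)
{
	return ~ex_crc32c_update(~UINT32_C(0), buf, len);
}
```

Calling ex_crc32c_update() with a seed of 0 instead of ~0 yields a different checksum for the same input, which is the kind of mismatch the self-test reported.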

[PATCH 4.10 04/63] net/mlx5e: Update MPWQE stride size when modifying CQE compress state

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: Saeed Mahameed 


[ Upstream commit 6dc4b54e77282caf17f0ff72aa32dd296037fbc0 ]

When the admin enables/disables cqe compression, updating
mpwqe stride size is required:
CQE compress ON  ==> stride size = 256B
CQE compress OFF ==> stride size = 64B

This is already done on driver load via mlx5e_set_rq_type_params, all we
need is just to call it on arbitrary admin changes of cqe compression
state via priv flags or when changing timestamping state
(as it is mutually exclusive with cqe compression).

This bug introduces no functional damage, it only makes cqe compression
occur less often, since in ConnectX4-LX CQE compression is performed
only on packets smaller than stride size.

Tested:
 ethtool --set-priv-flags ethxx rx_cqe_compress on
 pktgen with  64 < pkt size < 256 and netperf TCP_STREAM (IPv4/IPv6)
 verify `ethtool -S ethxx | grep compress` are advancing more often
 (rapidly)

Fixes: 7219ab34f184 ("net/mlx5e: CQE compression")
Signed-off-by: Saeed Mahameed 
Reviewed-by: Tariq Toukan 
Cc: kernel-t...@fb.com
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h |1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c |1 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c|2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c  |1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -803,6 +803,7 @@ int mlx5e_get_max_linkspeed(struct mlx5_
 
 void mlx5e_set_rx_cq_mode_params(struct mlx5e_params *params,
 u8 cq_period_mode);
+void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type);
 
 static inline void mlx5e_tx_notify_hw(struct mlx5e_sq *sq,
  struct mlx5_wqe_ctrl_seg *ctrl, int bf_sz)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -1477,6 +1477,7 @@ static int set_pflag_rx_cqe_compress(str
 
MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS, enable);
priv->params.rx_cqe_compress_def = enable;
+   mlx5e_set_rq_type_params(priv, priv->params.rq_wq_type);
 
if (reset)
err = mlx5e_open_locked(netdev);
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -78,7 +78,7 @@ static bool mlx5e_check_fragmented_strid
MLX5_CAP_ETH(mdev, reg_umr_sq);
 }
 
-static void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
+void mlx5e_set_rq_type_params(struct mlx5e_priv *priv, u8 rq_type)
 {
priv->params.rq_wq_type = rq_type;
priv->params.lro_wqe_sz = MLX5E_PARAMS_DEFAULT_LRO_WQE_SZ;
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -172,6 +172,7 @@ void mlx5e_modify_rx_cqe_compression(str
mlx5e_close_locked(priv->netdev);
 
MLX5E_SET_PFLAG(priv, MLX5E_PFLAG_RX_CQE_COMPRESS, val);
+   mlx5e_set_rq_type_params(priv, priv->params.rq_wq_type);
 
if (was_opened)
mlx5e_open_locked(priv->netdev);
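
The relationship stated in the commit message can be sketched as two small helpers. This is an illustration under stated assumptions, not driver code: the EX_ constants and ex_ functions are hypothetical names, and the values come only from the commit message (256B with CQE compression on, 64B with it off, compression applied only to packets smaller than the stride size).

```c
#include <stdbool.h>

/* Stride sizes in bytes, as given in the commit message. */
#define EX_STRIDE_CQE_COMPRESS_ON  256u
#define EX_STRIDE_CQE_COMPRESS_OFF 64u

/* Hypothetical helper: stride size that matches the compression state. */
static unsigned int ex_mpwqe_stride_size(bool cqe_compress)
{
	return cqe_compress ? EX_STRIDE_CQE_COMPRESS_ON
			    : EX_STRIDE_CQE_COMPRESS_OFF;
}

/* Per the commit message, only packets smaller than the stride compress. */
static bool ex_cqe_compressible(unsigned int pkt_len, unsigned int stride_size)
{
	return pkt_len < stride_size;
}
```

With a stale 64B stride, a 128-byte packet is not eligible even though compression is on; after the stride is updated to 256B it is. That matches the "64 < pkt size < 256" range used in the test description.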

[PATCH 4.9 18/93] ipv6: orphan skbs in reassembly unit

2017-03-20 Thread Greg Kroah-Hartman

4.9-stable review patch.  If anyone has any objections, please let me know.

--

From: Eric Dumazet 


[ Upstream commit 48cac18ecf1de82f76259a54402c3adb7839ad01 ]

Andrey reported a use-after-free in IPv6 stack.

Issue here is that we free the socket while it still has skb
in TX path and in some queues.

It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer 

==
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr 880062da0060 by task a.out/4140

page:ea00018b6800 count:1 mapcount:0 mapping:  (null)
index:0x0 compound_mapcount: 0
flags: 0x1008100(slab|head)
raw: 01008100   000180130013
raw: dead0100 dead0200 88006741f140 
page dumped because: kasan: bad access detected

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 describe_address mm/kasan/report.c:262
 kasan_report_error+0x121/0x560 mm/kasan/report.c:370
 kasan_report mm/kasan/report.c:392
 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
 sock_flag ./arch/x86/include/asm/bitops.h:324
 sock_wfree+0x118/0x120 net/core/sock.c:1631
 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put ./include/net/inet_frag.h:133
 nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn ./include/linux/netfilter.h:102
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook ./include/linux/netfilter.h:212
 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
 rawv6_push_pending_frames net/ipv6/raw.c:613
 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x620 net/socket.c:848
 new_sync_write fs/read_write.c:499
 __vfs_write+0x483/0x760 fs/read_write.c:512
 vfs_write+0x187/0x530 fs/read_write.c:560
 SYSC_write fs/read_write.c:607
 SyS_write+0xfb/0x230 fs/read_write.c:599
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:7ff268e0ed98 EFLAGS: 0206 ORIG_RAX: 0001
RAX: ffda RBX: 7ff268e0f9c0 RCX: 7ff26e6f5b79
RDX: 0010 RSI: 20f50fe1 RDI: 0003
RBP: 7ff26ebc1220 R08:  R09: 
R10:  R11: 0206 R12: 
R13: 7ff268e0f9c0 R14: 7ff26efec040 R15: 0003

The buggy address belongs to the object at 880062da
 which belongs to the cache RAWv6 of size 1504
The buggy address 880062da0060 is located 96 bytes inside
 of 1504-byte region [880062da, 880062da05e0)

Freed by task 4113:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1352
 slab_free_freelist_hook mm/slub.c:1374
 slab_free mm/slub.c:2951
 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
 sk_prot_free net/core/sock.c:1377
 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sk_free+0x23/0x30 net/core/sock.c:1479
 sock_put ./include/net/sock.h:1638
 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
 sock_release+0x8d/0x1e0 net/socket.c:599
 sock_close+0x16/0x20 net/socket.c:1063
 __fput+0x332/0x7f0 fs/file_table.c:208
 fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work ./include/linux/task_work.h:21
 do_exit+0x186b/0x2800 kernel/exit.c:839
 do_group_exit+0x149/0x420 kernel/exit.c:943
 SYSC_exit_group kernel/exit.c:954
 SyS_exit_group+0x1d/0x20 kernel/exit.c:952
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203

Allocated by task 4115:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track

Re: [PATCH v3] usb: hub: Fix error loop seen after hub communication errors

2017-03-20 Thread Doug Anderson

Hi,

On Thu, Mar 16, 2017 at 12:24 PM, Guenter Roeck  wrote:
> @@ -1198,7 +1201,7 @@ static void hub_activate(struct usb_hub *hub, enum 
> hub_activation_type type)
>
> /* Scan all ports that need attention */
> kick_hub_wq(hub);
> -
> +abort:

One tiny nit that could be done when applying this patch is to add a
space before "abort".  Other goto labels in this function are preceded
by a space and it's sane to try to match the existing coding
convention in the function rather than trying to mix and match.

Other than that this patch seems sane to me, but I am by no means an
expert on this code.  ;)

-Doug

[PATCH 4.10 37/63] uapi: fix linux/packet_diag.h userspace compilation error

2017-03-20 Thread Greg Kroah-Hartman

4.10-stable review patch.  If anyone has any objections, please let me know.

--

From: "Dmitry V. Levin" 


[ Upstream commit 745cb7f8a5de0805cade3de3991b7a95317c7c73 ]

Replace MAX_ADDR_LEN with its numeric value to fix the following
linux/packet_diag.h userspace compilation error:

/usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here 
(not in a function)
  __u8 pdmc_addr[MAX_ADDR_LEN];

This is not the first case in the UAPI where the numeric value
of MAX_ADDR_LEN is used instead of symbolic one, uapi/linux/if_link.h
already does the same:

$ grep MAX_ADDR_LEN include/uapi/linux/if_link.h
__u8 mac[32]; /* MAX_ADDR_LEN */

There are no UAPI headers besides these two that use MAX_ADDR_LEN.

Signed-off-by: Dmitry V. Levin 
Acked-by: Pavel Emelyanov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman 
---
 include/uapi/linux/packet_diag.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/include/uapi/linux/packet_diag.h
+++ b/include/uapi/linux/packet_diag.h
@@ -64,7 +64,7 @@ struct packet_diag_mclist {
__u32   pdmc_count;
__u16   pdmc_type;
__u16   pdmc_alen;
-   __u8pdmc_addr[MAX_ADDR_LEN];
+   __u8pdmc_addr[32]; /* MAX_ADDR_LEN */
 };
 
 struct packet_diag_ring {
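
A userspace program can check the effect of the change without any kernel-internal headers. The sketch below mirrors only the fields visible in the hunk above, using standard fixed-width types: struct ex_packet_diag_mclist is a hypothetical stand-in, not the UAPI definition, and it omits any fields the hunk does not show.

```c
#include <stdint.h>

/* Hypothetical mirror of the fields shown in the hunk above. */
struct ex_packet_diag_mclist {
	uint32_t pdmc_count;
	uint16_t pdmc_type;
	uint16_t pdmc_alen;
	uint8_t  pdmc_addr[32];	/* MAX_ADDR_LEN, spelled out numerically */
};

/* Compile-time check that the address field is 32 bytes (C11). */
_Static_assert(sizeof(((struct ex_packet_diag_mclist *)0)->pdmc_addr) == 32,
	       "pdmc_addr must hold 32 bytes");
```

Because the array bound is a literal, the structure compiles in userspace even though MAX_ADDR_LEN is not defined there.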

Re: [PATCH v2 06/14] mmc: dw_mmc: simplify optional reset handling

2017-03-20 Thread Ulf Hansson

On 20 March 2017 at 12:00, Philipp Zabel  wrote:
> On Mon, 2017-03-20 at 11:49 +0100, Andrzej Hajda wrote:
>> On 20.03.2017 11:27, Philipp Zabel wrote:
> [...]
>> > diff --git a/include/linux/reset.h b/include/linux/reset.h
>> > index 86b4ed75359e8..c905ff1c21ec6 100644
>> > --- a/include/linux/reset.h
>> > +++ b/include/linux/reset.h
>> > @@ -74,14 +74,14 @@ static inline struct reset_control *__of_reset_control_get(
>> > const char *id, int index, bool shared,
>> > bool optional)
>> >  {
>> > -   return ERR_PTR(-ENOTSUPP);
>> > +   return optional ? NULL : ERR_PTR(-ENOTSUPP);
>> >  }
>> >
>> >  static inline struct reset_control *__devm_reset_control_get(
>> > struct device *dev, const char *id,
>> > int index, bool shared, bool optional)
>> >  {
>> > -   return ERR_PTR(-ENOTSUPP);
>> > +   return optional ? NULL : ERR_PTR(-ENOTSUPP);
>> >  }
>> >
>> >  #endif /* CONFIG_RESET_CONTROLLER */
>> > -->8--
>>
>> In dw_mmc.c file there are also unconditional calls to
>> reset_control_assert, with disabled RESET_CONTROLLER it will cause
>> unexpected WARNs.
>> Anyway if you change reset API as above I think you should remove all
>> warns from reset stubs, because NULL reset is valid, but these warns are
>> there for reason - contradiction.
>
> You are right, I have to let go of those, too.


Until fixed, I have dropped the three changes from my next branch
related to this. Please re-post when fixed.

Kind regards
Uffe

>
> regards
> Philipp
>
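
The contradiction raised in the quoted discussion can be illustrated with a small userspace mock. This is a sketch under stated assumptions, not the kernel code: `mock_reset_get()`, `mock_reset_assert()`, `mock_is_err()`, and `warn_count` are hypothetical stand-ins, and the error-pointer encoding only imitates the kernel's `ERR_PTR()`/`IS_ERR()` convention. It pairs a stub getter that returns NULL for optional resets with an assert stub that treats NULL as a valid no-op instead of warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct reset_control;            /* opaque, as in the real API */

#define MOCK_ENOTSUPP 524        /* value of the kernel-internal ENOTSUPP */

int warn_count;                  /* counts how often the stub "warns" */

/* Imitation of ERR_PTR(): encode a negative errno in a pointer. */
static struct reset_control *mock_err_ptr(long err)
{
	return (struct reset_control *)(intptr_t)err;
}

/* Imitation of IS_ERR(): the top 4095 addresses are treated as errors. */
bool mock_is_err(const struct reset_control *rstc)
{
	return (uintptr_t)rstc >= (uintptr_t)-4095;
}

/* Stub getter for a build without a reset controller. */
struct reset_control *mock_reset_get(bool optional)
{
	return optional ? NULL : mock_err_ptr(-MOCK_ENOTSUPP);
}

/* Stub assert: NULL means "no reset line", which must not warn. */
int mock_reset_assert(struct reset_control *rstc)
{
	if (rstc == NULL)
		return 0;
	warn_count++;            /* anything else reaching a stub is a bug */
	return -MOCK_ENOTSUPP;
}
```

With stubs shaped like this, a driver that requested an optional reset could call the assert helper unconditionally without triggering a warning when the reset controller is compiled out.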

Re: [PATCH v5 38/39] media: imx: csi: fix crop rectangle reset in sink set_fmt

2017-03-20 Thread Russell King - ARM Linux

On Mon, Mar 20, 2017 at 06:40:21PM +0100, Philipp Zabel wrote:
> On Mon, 2017-03-20 at 14:17 +, Russell King - ARM Linux wrote:
> > I have tripped over a bug in media-ctl when specifying both a crop and
> > compose rectangle - the --help output suggests that "," should be used
> > to separate them.  media-ctl rejects that, telling me the character at
> > the "," should be "]".  Replacing the "," with " " allows media-ctl to
> > accept it and set both rectangles, so it sounds like a parser bug - I've
> > not looked into this any further yet.
> 
> I can confirm this. I don't see any place in
> v4l2_subdev_parse_pad_format that handles the "," separator. There's
> just whitespace skipping between the v4l2-properties.

Maybe this is the easiest solution:

 utils/media-ctl/options.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utils/media-ctl/options.c b/utils/media-ctl/options.c
index 83ca1ca..8b97874 100644
--- a/utils/media-ctl/options.c
+++ b/utils/media-ctl/options.c
@@ -65,7 +65,7 @@ static void usage(const char *argv0)
 	printf("\tentity  = entity-number | ( '\"' entity-name '\"' ) ;\n");
 	printf("\n");
 	printf("\tv4l2= pad '[' v4l2-properties ']' ;\n");
-	printf("\tv4l2-properties = v4l2-property { ',' v4l2-property } ;\n");
+	printf("\tv4l2-properties = v4l2-property { ' '* v4l2-property } ;\n");
 	printf("\tv4l2-property   = v4l2-mbusfmt | v4l2-crop | v4l2-interval\n");
 	printf("\t| v4l2-compose | v4l2-interval ;\n");
 	printf("\tv4l2-mbusfmt= 'fmt:' fcc '/' size ; { 'field:' v4l2-field ; } { 'colorspace:' v4l2-colorspace ; }\n");

;)

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
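
For illustration, a parser that accepts both the documented ',' and the whitespace that media-ctl currently accepts could skip either one between properties. The following is a standalone sketch, not the `v4l2_subdev_parse_pad_format` code: `skip_separators()` and `count_properties()` are hypothetical helpers, and the property strings mentioned below are examples only. Separators inside parentheses are ignored so that a rectangle such as `crop:(0,0)/640x480` stays a single property.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Skip any run of whitespace and ',' between two properties. */
const char *skip_separators(const char *p)
{
	while (*p == ',' || isspace((unsigned char)*p))
		p++;
	return p;
}

/*
 * Count the properties in the text between '[' and ']'.
 * A property ends at the first separator found outside parentheses,
 * so the ',' inside "(left,top)" does not split a rectangle.
 */
size_t count_properties(const char *p)
{
	size_t count = 0;
	int depth = 0;

	p = skip_separators(p);
	while (*p != '\0') {
		count++;
		/* Consume one property, honouring nested parentheses. */
		while (*p != '\0') {
			if (*p == '(')
				depth++;
			else if (*p == ')' && depth > 0)
				depth--;
			else if (depth == 0 &&
				 (*p == ',' || isspace((unsigned char)*p)))
				break;
			p++;
		}
		p = skip_separators(p);
	}
	return count;
}
```

In this sketch, `crop:(0,0)/640x480,compose:(0,0)/320x240` and the same string with a space in place of the separating comma both yield two properties.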

Re: [PATCH v21 13/13] acpi/arm64: Add SBSA Generic Watchdog support in GTDT driver

2017-03-20 Thread Mark Rutland

On Tue, Mar 21, 2017 at 01:57:58AM +0800, Fu Wei wrote:
> On 18 March 2017 at 04:01, Mark Rutland  wrote:
> > On Tue, Feb 07, 2017 at 02:50:15AM +0800, fu@linaro.org wrote:

> > I've not been able to find where the ACPI spec says that zero is not a
> > valid GSIV. This may simply be an oversight/ambiguity in the spec.
> >
> > Is there any statement to that effect?
> 
> you are right, zero is a  valid GSIV, I will delete this check. Thanks

That being the case, how does one describe a watchdog that does not have
an interrupt?

As I mentioned, I think this is an oversight/ambiguity in the spec that
we should address.

> > My reading of SBSA is that there is one watchdog in the system.
> >
> > Is that not the case?
> 
> do you mean:
> ---
> 4.2.4 Watchdogs
> The base server system implements a Generic Watchdog as specified in
> APPENDIX A: Generic Watchdog.
> ---
> 
> I am not sure about that if this is saying "we only have one SBSA
> watchdog in a system"
> 
> would you let me know where mention it? Do I miss something?

My reading was that the 'a' above meant a single element. i.e.

The base server system implements _a_ Generic Watchdog as
specified in APPENDIX A: Generic Watchdog.

Subsequently in 4.2.5, it is stated:

In this scenario, the system wakeup timer or generic watchdog is
still required to send its interrupt.

... which only makes sense if there is a single watchdog in the system.

Perhaps this is an oversight in the specification.

Thanks,
Mark.
