Re: [PATCH 2/2] mm: memcontrol: try harder to set a new memory.high

2019-10-23 Thread Michal Hocko
On Tue 22-10-19 16:15:18, Johannes Weiner wrote:
> Setting a memory.high limit below the usage makes almost no effort to
> shrink the cgroup to the new target size.
> 
> While memory.high is a "soft" limit that isn't supposed to cause OOM
> situations, we should still try harder to meet a user request through
> persistent reclaim.
> 
> For example, after setting a 10M memory.high on an 800M cgroup full of
> file cache, the usage shrinks to about 350M:
> 
> + cat /cgroup/workingset/memory.current
> 841568256
> + echo 10M
> + cat /cgroup/workingset/memory.current
> 355729408
> 
> This isn't exactly what the user would expect to happen. Setting the
> value a few more times eventually whittles the usage down to what we
> are asking for:
> 
> + echo 10M
> + cat /cgroup/workingset/memory.current
> 104181760
> + echo 10M
> + cat /cgroup/workingset/memory.current
> 31801344
> + echo 10M
> + cat /cgroup/workingset/memory.current
> 10440704
> 
> To improve this, add reclaim retry loops to the memory.high write()
> callback, similar to what we do for memory.max, to make a reasonable
> effort that the usage meets the requested size after the call returns.

That suggests that the reclaim couldn't meet the given reclaim target
but later attempts just made it through. Is this due to the amount of
dirty pages, or what prevented the reclaim from doing its job?

While I am not against the reclaim retry loop I would like to understand
the underlying issue. Because if this is really about dirty memory then
we should probably be more pro-active in flushing it. Otherwise the
retry might not be of any help.
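
To make the concern concrete, here is a minimal userspace sketch of the
retry loop from the patch. Everything named fake_* or *_sketch is a
hypothetical stand-in, not kernel code, and RECLAIM_RETRIES is an
illustrative value. It shows that the loop converges when each pass
frees something, and gives up without progress when reclaim frees
nothing (for example, if all remaining pages were dirty).

```c
#include <assert.h>
#include <stdbool.h>

#define RECLAIM_RETRIES 5 /* illustrative stand-in for MEM_CGROUP_RECLAIM_RETRIES */

/* Hypothetical model of a cgroup, all sizes in pages. */
struct fake_memcg {
	unsigned long usage;    /* pages currently charged */
	unsigned long stock;    /* charged pages that a drain can release */
	unsigned long per_pass; /* most pages one reclaim pass can free */
};

/* Stand-in for drain_all_stock(): return cached charges to the counter. */
static void fake_drain(struct fake_memcg *cg)
{
	assert(cg->stock <= cg->usage);
	cg->usage -= cg->stock;
	cg->stock = 0;
}

/* Stand-in for try_to_free_mem_cgroup_pages(): returns pages freed. */
static unsigned long fake_reclaim(struct fake_memcg *cg, unsigned long want)
{
	unsigned long got = want < cg->per_pass ? want : cg->per_pass;

	cg->usage -= got;
	return got;
}

/* Same control flow as the patched write callback; returns final usage. */
unsigned long set_high_sketch(struct fake_memcg *cg, unsigned long high)
{
	unsigned int nr_retries = RECLAIM_RETRIES;
	bool drained = false;

	for (;;) {
		unsigned long nr_pages = cg->usage;
		unsigned long reclaimed;

		if (nr_pages <= high)
			break;

		if (!drained) {
			fake_drain(cg);
			drained = true;
			continue;
		}

		reclaimed = fake_reclaim(cg, nr_pages - high);

		/* Only passes that free nothing consume a retry. */
		if (!reclaimed && !nr_retries--)
			break;
	}
	return cg->usage;
}
```

With per_pass > 0 the loop always reaches the target; with per_pass == 0
it exits after the retries are used up and the usage is unchanged, which
is the case where the retry is of no help.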

> Afterwards, a single write() to memory.high is enough in all but
> extreme cases:
> 
> + cat /cgroup/workingset/memory.current
> 841609216
> + echo 10M
> + cat /cgroup/workingset/memory.current
> 10182656
> 
> Signed-off-by: Johannes Weiner 
> ---
>  mm/memcontrol.c | 30 --
>  1 file changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ff90d4e7df37..8090b4c99ac7 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -6074,7 +6074,8 @@ static ssize_t memory_high_write(struct 
> kernfs_open_file *of,
>char *buf, size_t nbytes, loff_t off)
>  {
>   struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
> - unsigned long nr_pages;
> + unsigned int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
> + bool drained = false;
>   unsigned long high;
>   int err;
>  
> @@ -6085,12 +6086,29 @@ static ssize_t memory_high_write(struct 
> kernfs_open_file *of,
>  
>   memcg->high = high;
>  
> - nr_pages = page_counter_read(&memcg->memory);
> - if (nr_pages > high)
> - try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
> -  GFP_KERNEL, true);
> + for (;;) {
> + unsigned long nr_pages = page_counter_read(&memcg->memory);
> + unsigned long reclaimed;
> +
> + if (nr_pages <= high)
> + break;
> +
> + if (signal_pending(current))
> + break;
> +
> + if (!drained) {
> + drain_all_stock(memcg);
> + drained = true;
> + continue;
> + }
> +
> + reclaimed = try_to_free_mem_cgroup_pages(memcg, nr_pages - high,
> +  GFP_KERNEL, true);
> +
> + if (!reclaimed && !nr_retries--)
> + break;
> + }
>  
> - memcg_wb_domain_size_changed(memcg);
>   return nbytes;
>  }
>  
> -- 
> 2.23.0

-- 
Michal Hocko
SUSE Labs


[PATCH] spi: document CS setup, hold & inactive times in header

2019-10-23 Thread Alexandru Ardelean
This change documents the CS setup, hold & inactive times. They were
omitted when the fields were added, and were caught by one of the build
bots.

Fixes: 25093bdeb6bc ("spi: implement SW control for CS times")
Reported-by: kbuild test robot 
Signed-off-by: Alexandru Ardelean 
---
 include/linux/spi/spi.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index c40d6af2bf07..98fe8663033a 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -407,6 +407,11 @@ static inline void spi_unregister_driver(struct spi_driver 
*sdrv)
  *  controller has native support for memory like operations.
  * @unprepare_message: undo any work done by prepare_message().
  * @slave_abort: abort the ongoing transfer request on an SPI slave controller
+ * @cs_setup: delay to be introduced by the controller after CS is asserted
+ * @cs_hold: delay to be introduced by the controller before CS is deasserted
+ * @cs_inactive: delay to be introduced by the controller after CS is
+ * deasserted. If @cs_change_delay is used from @spi_transfer, then the
+ * two delays will be added up.
  * @cs_gpios: LEGACY: array of GPIO descs to use as chip select lines; one per
  * CS number. Any individual value may be -ENOENT for CS lines that
  * are not GPIOs (driven by the SPI controller itself). Use the cs_gpiods
-- 
2.20.1



[PATCH RESEND 2/2] dmaengine: JZ4780: Add support for the X1000.

2019-10-23 Thread Zhou Yanjie
Add support for probing the dma-jz4780 driver on the X1000 SoC.

Signed-off-by: Zhou Yanjie 
---
 drivers/dma/dma-jz4780.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/dma/dma-jz4780.c b/drivers/dma/dma-jz4780.c
index cafb1cc0..f809a6e 100644
--- a/drivers/dma/dma-jz4780.c
+++ b/drivers/dma/dma-jz4780.c
@@ -1019,11 +1019,18 @@ static const struct jz4780_dma_soc_data 
jz4780_dma_soc_data = {
.flags = JZ_SOC_DATA_ALLOW_LEGACY_DT | JZ_SOC_DATA_PROGRAMMABLE_DMA,
 };
 
+static const struct jz4780_dma_soc_data x1000_dma_soc_data = {
+   .nb_channels = 8,
+   .transfer_ord_max = 7,
+   .flags = JZ_SOC_DATA_ALLOW_LEGACY_DT | JZ_SOC_DATA_PROGRAMMABLE_DMA,
+};
+
 static const struct of_device_id jz4780_dma_dt_match[] = {
{ .compatible = "ingenic,jz4740-dma", .data = &jz4740_dma_soc_data },
{ .compatible = "ingenic,jz4725b-dma", .data = &jz4725b_dma_soc_data },
{ .compatible = "ingenic,jz4770-dma", .data = &jz4770_dma_soc_data },
{ .compatible = "ingenic,jz4780-dma", .data = &jz4780_dma_soc_data },
+   { .compatible = "ingenic,x1000-dma", .data = &x1000_dma_soc_data },
{},
 };
 MODULE_DEVICE_TABLE(of, jz4780_dma_dt_match);
-- 
2.7.4




[PATCH RESEND 1/2] dt-bindings: dmaengine: Add X1000 bindings.

2019-10-23 Thread Zhou Yanjie
Add the dmaengine bindings for the X1000 SoC from Ingenic.

Signed-off-by: Zhou Yanjie 
---
 .../devicetree/bindings/dma/jz4780-dma.txt |  3 +-
 include/dt-bindings/dma/x1000-dma.h| 40 ++
 2 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 include/dt-bindings/dma/x1000-dma.h

diff --git a/Documentation/devicetree/bindings/dma/jz4780-dma.txt 
b/Documentation/devicetree/bindings/dma/jz4780-dma.txt
index 636fcb2..ec89782 100644
--- a/Documentation/devicetree/bindings/dma/jz4780-dma.txt
+++ b/Documentation/devicetree/bindings/dma/jz4780-dma.txt
@@ -7,10 +7,11 @@ Required properties:
   * ingenic,jz4725b-dma
   * ingenic,jz4770-dma
   * ingenic,jz4780-dma
+  * ingenic,x1000-dma
 - reg: Should contain the DMA channel registers location and length, followed
   by the DMA controller registers location and length.
 - interrupts: Should contain the interrupt specifier of the DMA controller.
-- clocks: Should contain a clock specifier for the JZ4780 PDMA clock.
+- clocks: Should contain a clock specifier for the JZ4780/X1000 PDMA clock.
 - #dma-cells: Must be <2>. Number of integer cells in the dmas property of
   DMA clients (see below).
 
diff --git a/include/dt-bindings/dma/x1000-dma.h 
b/include/dt-bindings/dma/x1000-dma.h
new file mode 100644
index ..401e165
--- /dev/null
+++ b/include/dt-bindings/dma/x1000-dma.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * This header provides macros for X1000 DMA bindings.
+ *
+ * Copyright (c) 2019 Zhou Yanjie 
+ */
+
+#ifndef __DT_BINDINGS_DMA_X1000_DMA_H__
+#define __DT_BINDINGS_DMA_X1000_DMA_H__
+
+/*
+ * Request type numbers for the X1000 DMA controller (written to the DRTn
+ * register for the channel).
+ */
+#define X1000_DMA_DMIC_RX  0x5
+#define X1000_DMA_I2S0_TX  0x6
+#define X1000_DMA_I2S0_RX  0x7
+#define X1000_DMA_AUTO 0x8
+#define X1000_DMA_UART2_TX 0x10
+#define X1000_DMA_UART2_RX 0x11
+#define X1000_DMA_UART1_TX 0x12
+#define X1000_DMA_UART1_RX 0x13
+#define X1000_DMA_UART0_TX 0x14
+#define X1000_DMA_UART0_RX 0x15
+#define X1000_DMA_SSI0_TX  0x16
+#define X1000_DMA_SSI0_RX  0x17
+#define X1000_DMA_MSC0_TX  0x1a
+#define X1000_DMA_MSC0_RX  0x1b
+#define X1000_DMA_MSC1_TX  0x1c
+#define X1000_DMA_MSC1_RX  0x1d
+#define X1000_DMA_PCM0_TX  0x20
+#define X1000_DMA_PCM0_RX  0x21
+#define X1000_DMA_SMB0_TX  0x24
+#define X1000_DMA_SMB0_RX  0x25
+#define X1000_DMA_SMB1_TX  0x26
+#define X1000_DMA_SMB1_RX  0x27
+#define X1000_DMA_SMB2_TX  0x28
+#define X1000_DMA_SMB2_RX  0x29
+
+#endif /* __DT_BINDINGS_DMA_X1000_DMA_H__ */
-- 
2.7.4




Re: DMA: JZ4780: Add DMA driver for X1000.

2019-10-23 Thread Zhou Yanjie

Hi Vinod,

On 2019-10-23 13:15, Vinod Koul wrote:
> On 23-10-19, 11:05, Zhou Yanjie wrote:
>> 1.Add the DMA bindings for the X1000 SoC from Ingenic.
>> 2.Add support for probing the dma-jz4780 driver on the
>>    X1000 SoC from Ingenic.
>
> The subsystem is dmaengine and not dma
>
> Please resend with correct tags!
>
> Thanks


I have already resent these patches.

Best regards!





[PATCH net-next] ieee802154: remove set but not used variable 'status'

2019-10-23 Thread YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ieee802154/cc2520.c:221:5: warning:
 variable status set but not used [-Wunused-but-set-variable]

It is never used, so it can be removed.
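
For readers unfamiliar with the warning, a minimal userspace sketch of
the pattern follows. fake_sync() is a hypothetical stand-in for
spi_sync(); the "before" shape is shown only as a comment so the
example itself compiles without the warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for spi_sync(): fills buf[0], returns 0 on success. */
static int fake_sync(uint8_t *buf)
{
	buf[0] = 0x42;
	return 0;
}

/*
 * Before the cleanup the function had this shape:
 *
 *	u8 status = 0xff;
 *	...
 *	ret = spi_sync(...);
 *	if (!ret)
 *		status = priv->buf[0];	// stored, never read afterwards
 *
 * gcc reports such a variable with -Wunused-but-set-variable. Dropping
 * the variable and the dead store does not change behaviour.
 */
int strobe_after_sketch(uint8_t *buf)
{
	assert(buf != NULL);
	return fake_sync(buf);
}
```

The return value and the buffer contents are the same with or without
the removed store.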

Signed-off-by: YueHaibing 
---
 drivers/net/ieee802154/cc2520.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ieee802154/cc2520.c b/drivers/net/ieee802154/cc2520.c
index 4350694..89c046b 100644
--- a/drivers/net/ieee802154/cc2520.c
+++ b/drivers/net/ieee802154/cc2520.c
@@ -218,7 +218,6 @@ static int
 cc2520_cmd_strobe(struct cc2520_private *priv, u8 cmd)
 {
int ret;
-   u8 status = 0xff;
struct spi_message msg;
struct spi_transfer xfer = {
.len = 0,
@@ -236,8 +235,6 @@ cc2520_cmd_strobe(struct cc2520_private *priv, u8 cmd)
 priv->buf[0]);
 
ret = spi_sync(priv->spi, &msg);
-   if (!ret)
-   status = priv->buf[0];
dev_vdbg(&priv->spi->dev,
 "buf[0] = %02x\n", priv->buf[0]);
mutex_unlock(&priv->buffer_mutex);
-- 
2.7.4




[PATCH][RESEND] input: adp5589: Make keypad support optional

2019-10-23 Thread Alexandru Ardelean
From: Lars-Peter Clausen 

On some platforms the adp5589 is used in GPIO only mode. On these platforms
we do not want to register an input device, so make that optional and only
create the input device if a keymap is supplied.
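
A minimal userspace sketch of that decision, assuming trimmed-down
hypothetical structures (pdata_sketch, kpad_sketch) in place of the
driver's real ones. Only the keymap and keymapsize field names follow
the patch; the -1 return stands in for -EINVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down platform data. */
struct pdata_sketch {
	const unsigned short *keymap;
	unsigned int keymapsize;
};

/* Hypothetical device state. */
struct kpad_sketch {
	bool input_registered;            /* stands in for a registered input device */
	unsigned int expected_keymapsize; /* what this chip variant requires */
};

/* Returns 0 on success (including GPIO-only use), -1 on a bad keymap. */
int keypad_add_sketch(struct kpad_sketch *kpad, const struct pdata_sketch *pdata)
{
	assert(kpad != NULL && pdata != NULL);

	/* No keymap supplied: GPIO-only mode, nothing to register. */
	if (pdata->keymapsize == 0)
		return 0;

	if (!pdata->keymap || pdata->keymapsize != kpad->expected_keymapsize)
		return -1;

	kpad->input_registered = true;
	return 0;
}
```

A zero keymapsize is treated as "no keypad wanted" rather than as an
error, so GPIO-only platforms probe successfully.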

Signed-off-by: Lars-Peter Clausen 
Signed-off-by: Alexandru Ardelean 
---
 drivers/input/keyboard/adp5589-keys.c | 197 +++---
 1 file changed, 111 insertions(+), 86 deletions(-)

diff --git a/drivers/input/keyboard/adp5589-keys.c 
b/drivers/input/keyboard/adp5589-keys.c
index 4f96a4a99e5b..08bfa8b213e8 100644
--- a/drivers/input/keyboard/adp5589-keys.c
+++ b/drivers/input/keyboard/adp5589-keys.c
@@ -495,10 +495,10 @@ static int adp5589_build_gpiomap(struct adp5589_kpad 
*kpad,
return n_unused;
 }
 
-static int adp5589_gpio_add(struct adp5589_kpad *kpad)
+static int adp5589_gpio_add(struct adp5589_kpad *kpad,
+   const struct adp5589_kpad_platform_data *pdata)
 {
struct device *dev = &kpad->client->dev;
-   const struct adp5589_kpad_platform_data *pdata = dev_get_platdata(dev);
const struct adp5589_gpio_platform_data *gpio_data = pdata->gpio_data;
int i, error;
 
@@ -550,10 +550,10 @@ static int adp5589_gpio_add(struct adp5589_kpad *kpad)
return 0;
 }
 
-static void adp5589_gpio_remove(struct adp5589_kpad *kpad)
+static void adp5589_gpio_remove(struct adp5589_kpad *kpad,
+   const struct adp5589_kpad_platform_data *pdata)
 {
struct device *dev = &kpad->client->dev;
-   const struct adp5589_kpad_platform_data *pdata = dev_get_platdata(dev);
const struct adp5589_gpio_platform_data *gpio_data = pdata->gpio_data;
int error;
 
@@ -571,12 +571,14 @@ static void adp5589_gpio_remove(struct adp5589_kpad *kpad)
gpiochip_remove(&kpad->gc);
 }
 #else
-static inline int adp5589_gpio_add(struct adp5589_kpad *kpad)
+static inline int adp5589_gpio_add(struct adp5589_kpad *kpad,
+   const struct adp5589_kpad_platform_data *pdata)
 {
return 0;
 }
 
-static inline void adp5589_gpio_remove(struct adp5589_kpad *kpad)
+static inline void adp5589_gpio_remove(struct adp5589_kpad *kpad,
+   const struct adp5589_kpad_platform_data *pdata)
 {
 }
 #endif
@@ -652,11 +654,10 @@ static int adp5589_get_evcode(struct adp5589_kpad *kpad, 
unsigned short key)
return -EINVAL;
 }
 
-static int adp5589_setup(struct adp5589_kpad *kpad)
+static int adp5589_setup(struct adp5589_kpad *kpad, 
+   const struct adp5589_kpad_platform_data *pdata)
 {
struct i2c_client *client = kpad->client;
-   const struct adp5589_kpad_platform_data *pdata =
-   dev_get_platdata(&client->dev);
u8 (*reg) (u8) = kpad->var->reg;
unsigned char evt_mode1 = 0, evt_mode2 = 0, evt_mode3 = 0;
unsigned char pull_mask = 0;
@@ -857,70 +858,37 @@ static void adp5589_report_switch_state(struct 
adp5589_kpad *kpad)
input_sync(kpad->input);
 }
 
-static int adp5589_probe(struct i2c_client *client,
-const struct i2c_device_id *id)
+static int adp5589_keypad_add(struct adp5589_kpad *kpad, unsigned int revid,
+   const struct adp5589_kpad_platform_data *pdata)
 {
-   struct adp5589_kpad *kpad;
-   const struct adp5589_kpad_platform_data *pdata =
-   dev_get_platdata(&client->dev);
+   struct i2c_client *client = kpad->client;
struct input_dev *input;
-   unsigned int revid;
-   int ret, i;
+   unsigned int i;
int error;
 
-   if (!i2c_check_functionality(client->adapter,
-I2C_FUNC_SMBUS_BYTE_DATA)) {
-   dev_err(&client->dev, "SMBUS Byte Data not Supported\n");
-   return -EIO;
-   }
-
-   if (!pdata) {
-   dev_err(&client->dev, "no platform data?\n");
-   return -EINVAL;
-   }
-
-   kpad = kzalloc(sizeof(*kpad), GFP_KERNEL);
-   if (!kpad)
-   return -ENOMEM;
-
-   switch (id->driver_data) {
-   case ADP5585_02:
-   kpad->support_row5 = true;
-   /* fall through */
-   case ADP5585_01:
-   kpad->is_adp5585 = true;
-   kpad->var = &const_adp5585;
-   break;
-   case ADP5589:
-   kpad->support_row5 = true;
-   kpad->var = &const_adp5589;
-   break;
-   }
+   if (pdata->keymapsize == 0)
+   return 0;
 
if (!((pdata->keypad_en_mask & kpad->var->row_mask) &&
(pdata->keypad_en_mask >> kpad->var->col_shift)) ||
!pdata->keymap) {
dev_err(&client->dev, "no rows, cols or keymap from pdata\n");
-   error = -EINVAL;
-   goto err_free_mem;
+   return -EINVAL;
}
 
if (pdata->keymapsize != kpad->var->keymapsize) {
dev_err(&client->dev, "invalid keymapsize\n");
-   error = -EINVAL;
-   goto err_free_mem;

[PATCH V9 6/6] backlight: qcom-wled: Add auto string detection logic

2019-10-23 Thread Kiran Gunda
The auto string detection algorithm checks if the current WLED
sink configuration is valid. It tries enabling every sink and
checks if the OVP fault is observed. Based on this information
it detects and enables the valid sink configuration.
Auto calibration is triggered when OVP fault interrupts are seen
frequently, and it then tries to fix the sink configuration.

The auto-detection also kicks in when the connected LED string
of the display-backlight malfunctions (because of damage) and
requires the damaged string to be turned off to prevent the
complete panel and/or board from being damaged.
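
The detection idea can be sketched in userspace as follows. This is a
simplified model, not the driver code: the probe callback, its
signature and the example board are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STRINGS 4 /* mirrors WLED_MAX_STRINGS in the driver */

/*
 * Hypothetical probe: enable only the sinks in 'mask' and report
 * whether an OVP fault is observed.
 */
typedef bool (*ovp_probe_fn)(unsigned int mask, void *ctx);

/* Try every sink on its own; keep the ones that do not raise OVP. */
unsigned int detect_valid_sinks(unsigned int string_count,
				ovp_probe_fn probe, void *ctx)
{
	unsigned int valid = 0;
	unsigned int i;

	assert(string_count <= MAX_STRINGS && probe != NULL);

	for (i = 0; i < string_count; i++) {
		unsigned int one = 1u << i;

		if (!probe(one, ctx))
			valid |= one;
	}
	return valid;
}

/* Example probe: enabling a sink with no LED string attached faults. */
bool example_probe(unsigned int mask, void *ctx)
{
	unsigned int populated = *(const unsigned int *)ctx;

	return (mask & ~populated) != 0;
}
```

On a board where only strings 0 and 2 are populated (mask 0x5), the
loop yields 0x5 as the valid sink configuration.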

Signed-off-by: Kiran Gunda 
---
 drivers/video/backlight/qcom-wled.c | 400 +++-
 1 file changed, 394 insertions(+), 6 deletions(-)

diff --git a/drivers/video/backlight/qcom-wled.c 
b/drivers/video/backlight/qcom-wled.c
index 658b1e0..33b6007 100644
--- a/drivers/video/backlight/qcom-wled.c
+++ b/drivers/video/backlight/qcom-wled.c
@@ -17,19 +17,29 @@
 #define WLED_MAX_STRINGS   4
 
 #define WLED_DEFAULT_BRIGHTNESS2048
-
+#define WLED_SOFT_START_DLY_US 1
 #define WLED3_SINK_REG_BRIGHT_MAX  0xFFF
 
 /* WLED3/WLED4 control registers */
+#define WLED3_CTRL_REG_FAULT_STATUS0x08
+#define  WLED3_CTRL_REG_ILIM_FAULT_BIT BIT(0)
+#define  WLED3_CTRL_REG_OVP_FAULT_BIT  BIT(1)
+#define  WLED4_CTRL_REG_SC_FAULT_BIT   BIT(2)
+
+#define WLED3_CTRL_REG_INT_RT_STS  0x10
+#define  WLED3_CTRL_REG_OVP_FAULT_STATUS   BIT(1)
+
 #define WLED3_CTRL_REG_MOD_EN  0x46
 #define  WLED3_CTRL_REG_MOD_EN_MASKBIT(7)
 #define  WLED3_CTRL_REG_MOD_EN_SHIFT   7
 
+#define WLED3_CTRL_REG_FEEDBACK_CONTROL0x48
+
 #define WLED3_CTRL_REG_FREQ0x4c
 #define  WLED3_CTRL_REG_FREQ_MASK  GENMASK(3, 0)
 
 #define WLED3_CTRL_REG_OVP 0x4d
-#define  WLED3_CTRL_REG_OVP_MASK   GENMASK(1, 0)
+#define  WLED3_CTRL_REG_OVP_MASK   GENMASK(1, 0)
 
 #define WLED3_CTRL_REG_ILIMIT  0x4e
 #define  WLED3_CTRL_REG_ILIMIT_MASKGENMASK(2, 0)
@@ -119,6 +129,7 @@ struct wled_config {
bool ext_gen;
bool cabc;
bool external_pfet;
+   bool auto_detection_enabled;
 };
 
 struct wled {
@@ -127,17 +138,22 @@ struct wled {
struct regmap *regmap;
struct mutex lock;  /* Lock to avoid race from thread irq handler */
ktime_t last_short_event;
+   ktime_t start_ovp_fault_time;
u16 ctrl_addr;
u16 sink_addr;
u16 max_string_count;
+   u16 auto_detection_ovp_count;
u32 brightness;
u32 max_brightness;
u32 short_count;
+   u32 auto_detect_count;
bool disabled_by_short;
bool has_short_detect;
int short_irq;
+   int ovp_irq;
 
struct wled_config cfg;
+   struct delayed_work ovp_work;
int (*wled_set_brightness)(struct wled *wled, u16 brightness);
 };
 
@@ -182,6 +198,13 @@ static int wled4_set_brightness(struct wled *wled, u16 
brightness)
return 0;
 }
 
+static void wled_ovp_work(struct work_struct *work)
+{
+   struct wled *wled = container_of(work,
+struct wled, ovp_work.work);
+   enable_irq(wled->ovp_irq);
+}
+
 static int wled_module_enable(struct wled *wled, int val)
 {
int rc;
@@ -193,7 +216,25 @@ static int wled_module_enable(struct wled *wled, int val)
WLED3_CTRL_REG_MOD_EN,
WLED3_CTRL_REG_MOD_EN_MASK,
val << WLED3_CTRL_REG_MOD_EN_SHIFT);
-   return rc;
+   if (rc < 0)
+   return rc;
+
+   if (wled->ovp_irq > 0) {
+   if (val) {
+   /*
+* The hardware generates a storm of spurious OVP
+* interrupts during soft start operations. So defer
+* enabling the IRQ for 10ms to ensure that the
+* soft start is complete.
+*/
+   schedule_delayed_work(&wled->ovp_work, HZ / 100);
+   } else {
+   if (!cancel_delayed_work_sync(&wled->ovp_work))
+   disable_irq(wled->ovp_irq);
+   }
+   }
+
+   return 0;
 }
 
 static int wled_sync_toggle(struct wled *wled)
@@ -300,6 +341,304 @@ static irqreturn_t wled_short_irq_handler(int irq, void 
*_wled)
return IRQ_HANDLED;
 }
 
+#define AUTO_DETECT_BRIGHTNESS 200
+
+static void wled_auto_string_detection(struct wled *wled)
+{
+   int rc = 0, i;
+   u32 sink_config = 0, i

Re: [PATCH 3/7] soc: fsl: qe: avoid ppc-specific io accessors

2019-10-23 Thread Rasmus Villemoes
On 22/10/2019 17.01, Christophe Leroy wrote:
> 
> 
> On 10/18/2019 12:52 PM, Rasmus Villemoes wrote:
>> In preparation for allowing to build QE support for architectures
>> other than PPC, replace the ppc-specific io accessors. Done via
>>
> 
> This patch is not transparent in terms of performance, functions get
> changed significantly.
> 
> Before the patch:
> 
> 0330 :
>  330:    81 43 00 04 lwz r10,4(r3)
>  334:    7c 00 04 ac hwsync
>  338:    81 2a 00 00 lwz r9,0(r10)
>  33c:    0c 09 00 00 twi 0,r9,0
>  340:    4c 00 01 2c isync
>  344:    70 88 00 02 andi.   r8,r4,2
>  348:    41 82 00 10 beq 358 
>  34c:    39 00 00 01 li  r8,1
>  350:    91 03 00 10 stw r8,16(r3)
>  354:    61 29 00 10 ori r9,r9,16
>  358:    70 88 00 01 andi.   r8,r4,1
>  35c:    41 82 00 10 beq 36c 
>  360:    39 00 00 01 li  r8,1
>  364:    91 03 00 14 stw r8,20(r3)
>  368:    61 29 00 20 ori r9,r9,32
>  36c:    7c 00 04 ac hwsync
>  370:    91 2a 00 00 stw r9,0(r10)
>  374:    4e 80 00 20 blr
> 
> After the patch:
> 
> 030c :
>  30c:    94 21 ff e0 stwu    r1,-32(r1)
>  310:    7c 08 02 a6 mflr    r0
>  314:    bf a1 00 14 stmw    r29,20(r1)
>  318:    7c 9f 23 78 mr  r31,r4
>  31c:    90 01 00 24 stw r0,36(r1)
>  320:    7c 7e 1b 78 mr  r30,r3
>  324:    83 a3 00 04 lwz r29,4(r3)
>  328:    7f a3 eb 78 mr  r3,r29
>  32c:    48 00 00 01 bl  32c 
>     32c: R_PPC_REL24    ioread32be
>  330:    73 e9 00 02 andi.   r9,r31,2
>  334:    41 82 00 10 beq 344 
>  338:    39 20 00 01 li  r9,1
>  33c:    91 3e 00 10 stw r9,16(r30)
>  340:    60 63 00 10 ori r3,r3,16
>  344:    73 e9 00 01 andi.   r9,r31,1
>  348:    41 82 00 10 beq 358 
>  34c:    39 20 00 01 li  r9,1
>  350:    91 3e 00 14 stw r9,20(r30)
>  354:    60 63 00 20 ori r3,r3,32
>  358:    80 01 00 24 lwz r0,36(r1)
>  35c:    7f a4 eb 78 mr  r4,r29
>  360:    bb a1 00 14 lmw r29,20(r1)
>  364:    7c 08 03 a6 mtlr    r0
>  368:    38 21 00 20 addi    r1,r1,32
>  36c:    48 00 00 00 b   36c 
>     36c: R_PPC_REL24    iowrite32be

True. Do you know why powerpc uses out-of-line versions of these
accessors when !PPC_INDIRECT_PIO, i.e. at least all of PPC32? It's quite
a bit beyond the scope of this series, but I'd expect moving most if not
all of arch/powerpc/kernel/iomap.c into asm/io.h (guarded by
!defined(CONFIG_PPC_INDIRECT_PIO) of course) as static inlines would
benefit all ppc32 users of iowrite32 and friends.

Is there some other primitive available that (a) is defined on all
architectures (or at least both ppc and arm) and (b) expands to good
code in both/all cases?

Note that a few uses of the iowrite32be accessors have already
appeared in the qe code with the introduction of the qe_clrsetbits()
helpers in bb8b2062af.

Rasmus


[PATCH v5 0/4] perf/core: fix restoring of Intel LBR call stack on a context switch

2019-10-23 Thread Alexey Budankov


Restore Intel LBR call stack from cloned inactive task perf context on
a context switch. This change inherently addresses inconsistency in LBR 
call stack data provided on a sample in record profiling mode:

  $ perf record -N -B -T -R --call-graph lbr \
 -e cpu/period=0xcdfe60,event=0x3c,name=\'CPU_CLK_UNHALTED.THREAD\'/Duk \
 --clockid=monotonic_raw -- ./miniFE.x nx 25 ny 25 nz 25

Assume threads A, B and C belong to the same process. B and C are
siblings of A, and their perf contexts are treated as equivalent.
At some point B blocks on a futex (a non-preempt context switch).
B's LBRs are saved in the task_ctx_data of B's perf context, and B's
events are removed from the PMU and disabled. B's perf context becomes
inactive.

Later C gets on a CPU, runs, gets profiled and eventually switches to
B, which has been woken but is not yet running. The optimized context
switch path is executed, swapping B's and C's task_ctx_data pointers in
the perf event contexts. So C's task_ctx_data will refer to B's
preserved LBRs on the following switch-in event.

However, because B's perf context is inactive, it has no enabled events
and B's task_ctx_data->lbr_callstack_users is 0. When B gets on the CPU,
reviving B's events is skipped on the optimized context switch path and
B's task_ctx_data->lbr_callstack_users remains 0. Thus B's LBRs are not
restored by the pmu sched_task() code called at the end of the perf
context switch-in callback for B.

In the report this manifests as short fragments of B's call stack,
still tracked by the LBR hardware between adjacent samples, while the
whole thread call tree doesn't aggregate.
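
The scenario can be modelled in userspace with the sketch below. The
structures are hypothetical miniatures, and the second step (swapping
the user counts back so each count stays with its context) is an
assumption about one possible way to synchronize the contexts; it is
not copied from the patches in this series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the per-task LBR data. */
struct task_ctx_data_sketch {
	int lbr_callstack_users; /* enabled LBR call-stack events */
	int saved_lbr_owner;     /* tag for whose LBR stack is saved here */
};

/* Hypothetical miniature of a perf event context. */
struct event_ctx_sketch {
	struct task_ctx_data_sketch *task_ctx_data;
};

void swap_task_ctx_sketch(struct event_ctx_sketch *prev,
			  struct event_ctx_sketch *next)
{
	struct task_ctx_data_sketch *tmp_data;
	int tmp_users;

	assert(prev != NULL && next != NULL);

	/* Step 1: the saved LBR data follows the task. */
	tmp_data = prev->task_ctx_data;
	prev->task_ctx_data = next->task_ctx_data;
	next->task_ctx_data = tmp_data;

	if (!prev->task_ctx_data || !next->task_ctx_data)
		return;

	/*
	 * Step 2 (assumed): the user counts describe the events of the
	 * context, not of the task, so swap them back.
	 */
	tmp_users = prev->task_ctx_data->lbr_callstack_users;
	prev->task_ctx_data->lbr_callstack_users =
		next->task_ctx_data->lbr_callstack_users;
	next->task_ctx_data->lbr_callstack_users = tmp_users;
}
```

With C's context as prev (one user) and B's as next (no users), the
context that stays active ends up pointing at B's saved LBR data with a
non-zero user count, which is the condition the restore path checks.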

The fix has been evaluated when profiling the miniFE [1] (C++, OpenMP)
workload running 64 threads on Intel Skylake EP (64 cores, 2 sockets):

  $ perf report --call-graph callee,flat

5.3.0-rc6+ (tip perf/core) - fixed

-   92.66%82.64%  miniFE.x  libiomp5.so [.] 
_INTERNAL_25___src_kmp_barrier_cpp_1d20fae8::__kmp_hyper_barrier_release
   - 69.14% 
_INTERNAL_25___src_kmp_barrier_cpp_1d20fae8::__kmp_hyper_barrier_release
__kmp_fork_barrier
__kmp_launch_thread
_INTERNAL_24___src_z_Linux_util_c_3e0095e6::__kmp_launch_worker
start_thread
__clone
   - 21.89% 
_INTERNAL_25___src_kmp_barrier_cpp_1d20fae8::__kmp_hyper_barrier_release
__kmp_barrier
__kmpc_reduce_nowait
miniFE::cg_solve, 
miniFE::Vector, miniFE::matvec_std, miniFE::Vector, 
miniFE::Vector, miniFE::matvec_std, miniFE::Vector, 
miniFE::Vector, miniFE::matvec_std, miniFE::Vector, 
miniFE::Vector, miniFE::matvec_std, miniFE::Vector
[...]

[1] https://www.hpcadvisorycouncil.com/pdf/miniFE_Analysis_and_Profiling.pdf

---
Alexey Budankov (4):
  perf/core,x86: introduce swap_task_ctx() method at struct pmu
  perf/x86: install platform specific swap_task_ctx adapter
  perf/x86/intel: implement LBR callstacks context synchronization
  perf/core,x86: synchronize PMU task contexts on optimized context switches

 arch/x86/events/core.c   |  8 
 arch/x86/events/intel/core.c |  7 +++
 arch/x86/events/intel/lbr.c  | 23 +++
 arch/x86/events/perf_event.h | 11 +++
 include/linux/perf_event.h   |  9 +
 kernel/events/core.c | 13 -
 6 files changed, 70 insertions(+), 1 deletion(-)

---
Changes in v5:
- renamed sync_task_ctx to swap_task_ctx;
- converted type of prev and next swap_task_ctx() params to struct 
perf_event_context;
- implemented check on availability of swap_task_ctx() pointer in pmu type
  at perf core implementation;
- moved swap of ctx->task_ctx_data pointers to architecture specific
  intel_pmu_lbr_swap_task_ctx() implementation;

Changes in v4:
- moved check on simultaneous task_ctx_data objects availability 
  to the perf/core layer;
- marked sync_task_ctx() as the optional in code comments;
- renamed params of sync_task_ctx() to prev and next;

Changes in v3:
- replaced assignment with swap at intel_pmu_lbr_sync_task_ctx()

Changes in v2:
- implemented sync_task_ctx() method at perf,x86,intel pmu types;
- employed the method on the optimized context switch path between 
  equivalent perf event contexts;

-- 
2.20.1



dmaengine: JZ4780: Add dmaengine driver for X1000.

2019-10-23 Thread Zhou Yanjie
1.Add the dmaengine bindings for the X1000 SoC from Ingenic.
2.Add support for probing the dma-jz4780 driver on the
  X1000 SoC from Ingenic.




Re: [PATCH] perf/core: fix multiplexing event scheduling issue

2019-10-23 Thread Stephane Eranian
On Mon, Oct 21, 2019 at 3:06 AM Peter Zijlstra  wrote:
>
> On Thu, Oct 17, 2019 at 05:27:46PM -0700, Stephane Eranian wrote:
> > @@ -2153,6 +2157,7 @@ __perf_remove_from_context(struct perf_event *event,
> >  void *info)
> >  {
> >   unsigned long flags = (unsigned long)info;
> > + int was_necessary = ctx->rotate_necessary;
> >
> >   if (ctx->is_active & EVENT_TIME) {
> >   update_context_time(ctx);
> > @@ -2171,6 +2176,37 @@ __perf_remove_from_context(struct perf_event *event,
> >   cpuctx->task_ctx = NULL;
> >   }
> >   }
> > +
> > + /*
> > +  * sanity check that event_sched_out() does not and will not
> > +  * change the state of ctx->rotate_necessary
> > +  */
> > + WARN_ON(was_necessary != event->ctx->rotate_necessary);
>
> It doesn't... why is this important to check?
>
I can remove that. It is leftover from debugging. It is okay to look at
the situation after event_sched_out(). Today, it does not change
rotate_necessary.

> > + /*
> > +  * if we remove an event AND we were multiplexing then, that means
> > +  * we had more events than we have counters for, and thus, at least,
> > +  * one event was in INACTIVE state. Now, that we removed an event,
> > +  * we need to resched to give a chance to all events to get scheduled,
> > +  * otherwise some may get stuck.
> > +  *
> > +  * By the time this function is called the event is usually in the OFF
> > +  * state.
> > +  * Note that this is not a duplicate of the same code in 
> > _perf_event_disable()
> > +  * because the call path are different. Some events may be simply 
> > disabled
>
> It is the exact same code twice though; IIRC this C language has a
> feature to help with that.

Sure! I will make a function to check on the condition.

>
> > +  * others removed. There is a way to get removed and not be disabled 
> > first.
> > +  */
> > + if (ctx->rotate_necessary && ctx->nr_events) {
> > + int type = get_event_type(event);
> > + /*
> > +  * In case we removed a pinned event, then we need to
> > +  * resched for both pinned and flexible events. The
> > +  * opposite is not true. A pinned event can never be
> > +  * inactive due to multiplexing.
> > +  */
> > + if (type & EVENT_PINNED)
> > + type |= EVENT_FLEXIBLE;
> > + ctx_resched(cpuctx, cpuctx->task_ctx, type);
> > + }
>
> What you're relying on is that ->rotate_necessary implies ->is_active
> and there's pending events. And if we tighten ->rotate_necessary you can
> remove the && ->nr_events.
>
Imagine I have 6 events and 4 counters, and I delete them all before
the timer expires. Then I can be in a situation where rotate_necessary
is still true and yet there are no more events in the context. That is
because only ctx_sched_out() clears rotate_necessary, IIRC. So that is
why there is the && nr_events. Calling ctx_resched() with no events
probably wouldn't cause any harm, it would just be wasted work. By
tightening, I am guessing you mean clearing rotate_necessary earlier.
But that would be tricky, because the only reliable way of clearing it
is when you know you are about to reschedule everything. Removing an
event by itself may not be enough to eliminate multiplexing.
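
As a rough idea of the shared helper, here is a userspace sketch. The
names and flag values are hypothetical (the kernel has its own EVENT_*
constants and struct perf_event_context); only the condition and the
pinned/flexible rule come from the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values, not the kernel's. */
enum { SK_EVENT_FLEXIBLE = 0x1, SK_EVENT_PINNED = 0x2 };

/* Hypothetical miniature of the context state used by the check. */
struct ctx_sketch {
	bool rotate_necessary; /* multiplexing was in effect */
	int nr_events;         /* events left in the context */
};

/*
 * Returns the event types to pass to a resched, or 0 when no resched
 * is needed. Usable from both the remove and the disable paths.
 */
int resched_type_sketch(const struct ctx_sketch *ctx, int removed_type)
{
	assert(ctx != NULL);

	if (!ctx->rotate_necessary || !ctx->nr_events)
		return 0;

	/*
	 * Removing a pinned event frees a counter for everyone, so both
	 * pinned and flexible events need rescheduling. The opposite is
	 * not true.
	 */
	if (removed_type & SK_EVENT_PINNED)
		removed_type |= SK_EVENT_FLEXIBLE;

	return removed_type;
}
```

The nr_events test covers the case described above: all events deleted
before the timer fires, with rotate_necessary still set.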


> > @@ -2232,6 +2270,35 @@ static void __perf_event_disable(struct perf_event 
> > *event,
> >   event_sched_out(event, cpuctx, ctx);
> >
> >   perf_event_set_state(event, PERF_EVENT_STATE_OFF);
> > + /*
> > +  * sanity check that event_sched_out() does not and will not
> > +  * change the state of ctx->rotate_necessary
> > +  */
> > + WARN_ON_ONCE(was_necessary != event->ctx->rotate_necessary);
> > +
> > + /*
> > +  * if we disable an event AND we were multiplexing then, that means
> > +  * we had more events than we have counters for, and thus, at least,
> > +  * one event was in INACTIVE state. Now, that we disabled an event,
> > +  * we need to resched to give a chance to all events to be scheduled,
> > +  * otherwise some may get stuck.
> > +  *
> > +  * Note that this is not a duplicate of the same code in
> > +  * __perf_remove_from_context()
> > +  * because events can be disabled without being removed.
>
> It _IS_ a duplicate, it is the _exact_ same code twice. What you're
> trying to say is that we need it in both places, but that's something
> else entirely.
>

Will refactor.

> > +  */
> > + if (ctx->rotate_necessary && ctx->nr_events) {
> > + int type = get_event_type(event);
> > + /*
> > +  * In case we removed a pinned event, then we need to
> > +  * resched for both pinned and flexible events. The
> > +  * opposite is not true. A pinned event can never be
> > +  * inactive

Re: [PATCH v2] vhost: introduce mdev based hardware backend

2019-10-23 Thread Tiwei Bie
On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:
> On 2019/10/23 上午11:02, Tiwei Bie wrote:
> > On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:
> > > On 2019/10/22 下午5:52, Tiwei Bie wrote:
> > > > This patch introduces a mdev based hardware vhost backend.
> > > > This backend is built on top of the same abstraction used
> > > > in virtio-mdev and provides a generic vhost interface for
> > > > userspace to accelerate the virtio devices in guest.
> > > > 
> > > > This backend is implemented as a mdev device driver on top
> > > > of the same mdev device ops used in virtio-mdev but using
> > > > a different mdev class id, and it will register the device
> > > > as a VFIO device for userspace to use. Userspace can setup
> > > > the IOMMU with the existing VFIO container/group APIs and
> > > > then get the device fd with the device name. After getting
> > > > the device fd of this device, userspace can use vhost ioctls
> > > > to setup the backend.
> > > > 
> > > > Signed-off-by: Tiwei Bie 
> > > > ---
> > > > This patch depends on below series:
> > > > https://lkml.org/lkml/2019/10/17/286
> > > > 
> > > > v1 -> v2:
> > > > - Replace _SET_STATE with _SET_STATUS (MST);
> > > > - Check status bits at each step (MST);
> > > > - Report the max ring size and max number of queues (MST);
> > > > - Add missing MODULE_DEVICE_TABLE (Jason);
> > > > - Only support the network backend w/o multiqueue for now;
> > > 
> > > Any idea on how to extend it to support devices other than net? I think we
> > > want a generic API or an API that could be made generic in the future.
> > > 
> > > Do we want e.g. a generic vhost mdev for all kinds of devices, or to
> > > introduce e.g. vhost-net-mdev and vhost-scsi-mdev?
> > One possible way is to do what vhost-user does. I.e. Apart from
> > the generic ring, features, ... related ioctls, we also introduce
> > device specific ioctls when we need them. As vhost-mdev just needs
> > to forward configs between parent and userspace and even won't
> > cache any info when possible,
> 
> 
> So it looks to me this is only possible if we expose e.g set_config and
> get_config to userspace.

The set_config and get_config interfaces don't cover all of the
device specific settings. We also have ctrlq in virtio-net.

> 
> 
> > I think it might be better to do
> > this in one generic vhost-mdev module.
> 
> 
> Looking at the definitions of VhostUserRequest in qemu, it mixes generic API
> with device specific API. If we want to go this way (a generic vhost-mdev),
> more questions need to be answered:
> 
> 1) How could userspace know which type of vhost it would use? Do we need to
> expose the virtio subsystem device to userspace in this case?
> 
> 2) That generic vhost-mdev module still needs to filter out unsupported
> ioctls for a specific type. E.g. if it probes a net device, it should refuse
> APIs for other types. This is in fact a vhost-mdev-net, just not modularized
> on top of vhost-mdev.
> 
> 
> > 
> > > 
> > > > - Some minor fixes and improvements;
> > > > - Rebase on top of virtio-mdev series v4;
[...]
> > > > +
> > > > +static long vhost_mdev_get_features(struct vhost_mdev *m, u64 __user *featurep)
> > > > +{
> > > > +   if (copy_to_user(featurep, &m->features, sizeof(m->features)))
> > > > +   return -EFAULT;
> > > 
> > > As discussed in previous version do we need to filter out MQ feature here?
> > I think it's more straightforward to let the parent drivers
> > filter out the unsupported features. Otherwise it would be tricky
> > when we want to add more features in vhost-mdev module,
> 
> 
> It's as simple as removing the feature from the blacklist?

It's not really that easy. It may break the old drivers.

> 
> 
> > i.e. if
> > > the parent drivers may expose unsupported features and rely on
> > vhost-mdev to filter them out, these features will be exposed
> > to userspace automatically when they are enabled in vhost-mdev
> > in the future.
> 
> 
> The issue is that only vhost-mdev knows its own limitations. E.g. in
> this patch, vhost-mdev only implements a subset of the transport API, but the
> parent doesn't know about that.
> 
> Taking MQ as an example again, there's no way (or no need) for the parent to
> know that vhost-mdev does not support MQ.

The mdev is a MDEV_CLASS_ID_VHOST mdev device. When the parent
is being developed, it should know the currently supported features
of vhost-mdev.

> And this allows old kernel to work with new
> parent drivers.

The new drivers should provide things like VIRTIO_MDEV_F_VERSION_1
to be compatible with the old kernels. When VIRTIO_MDEV_F_VERSION_1
is provided/negotiated, the behaviours should be consistent.
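
For illustration, the filtering option debated above (vhost-mdev hiding features it cannot handle, instead of trusting the parent) could look roughly like the sketch below. This is a userspace model only: vhost_mdev_filter_features() and VHOST_MDEV_UNSUPPORTED are hypothetical names that do not appear in the patch, and the only fact taken from the virtio spec is that VIRTIO_NET_F_MQ is feature bit 22.

```c
#include <assert.h>
#include <stdint.h>

/* Bit number of the multiqueue feature in the virtio-net feature set. */
#define VIRTIO_NET_F_MQ 22

/* Hypothetical mask of features this backend does not implement yet. */
#define VHOST_MDEV_UNSUPPORTED (1ULL << VIRTIO_NET_F_MQ)

/*
 * Hypothetical helper: clear the bits the backend cannot support from
 * whatever the parent driver offers, before reporting to userspace.
 */
static uint64_t vhost_mdev_filter_features(uint64_t parent_features)
{
	return parent_features & ~VHOST_MDEV_UNSUPPORTED;
}
```

With this approach, enabling MQ later means removing one bit from the mask, which is exactly the step that could expose the feature from parents that never expected it to be used.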

> 
> So basically we have three choices here:
> 
> 1) Implement what vhost-user did and implement a generic vhost-mdev (but may
> still have lots of device specific code). To support advanced features which
> require access to config, lots of API still needs to be added.
> 
> 2) Implement what vhost-kern

[PATCH v5 1/4] perf/core,x86: introduce swap_task_ctx() method at struct pmu

2019-10-23 Thread Alexey Budankov


Declare swap_task_ctx() methods at the generic and x86 specific
pmu types to bridge calls to platform specific pmu code on optimized
context switch path between equivalent task perf event contexts.

Signed-off-by: Alexey Budankov 
---
 arch/x86/events/perf_event.h | 8 
 include/linux/perf_event.h   | 9 +
 2 files changed, 17 insertions(+)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index ecacfbf4ebc1..5384317eaa16 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -682,6 +682,14 @@ struct x86_pmu {
 */
    atomic_t lbr_exclusive[x86_lbr_exclusive_max];
 
+   /*
+* perf task context (i.e. struct perf_event_context::task_ctx_data)
+* switch helper to bridge calls from perf/core to perf/x86.
+* See struct pmu::swap_task_ctx() usage for examples;
+*/
+   void (*swap_task_ctx)(struct perf_event_context *prev,
+struct perf_event_context *next);
+
/*
 * AMD bits
 */
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 587ae4d002f5..7887e4a3d487 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -410,6 +410,15 @@ struct pmu {
 */
size_t  task_ctx_size;
 
+   /*
+* PMU specific parts of task perf event context (i.e. ctx->task_ctx_data)
+* can be synchronized using this function. See Intel LBR callstack support
+* implementation and Perf core context switch handling callbacks for usage
+* examples.
+*/
+   void (*swap_task_ctx)   (struct perf_event_context *prev,
+struct perf_event_context *next);
+   /* optional */
 
/*
 * Set up pmu-private data structures for an AUX area
-- 
2.20.1



[PATCH v5 2/4] perf/x86: install platform specific swap_task_ctx adapter

2019-10-23 Thread Alexey Budankov


Bridge perf core and x86 swap_task_ctx() method calls.

Signed-off-by: Alexey Budankov 
---
 arch/x86/events/core.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7b21455d7504..6e3f0c18908e 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2243,6 +2243,13 @@ static void x86_pmu_sched_task(struct perf_event_context *ctx, bool sched_in)
x86_pmu.sched_task(ctx, sched_in);
 }
 
+static void x86_pmu_swap_task_ctx(struct perf_event_context *prev,
+ struct perf_event_context *next)
+{
+   if (x86_pmu.swap_task_ctx)
+   x86_pmu.swap_task_ctx(prev, next);
+}
+
 void perf_check_microcode(void)
 {
if (x86_pmu.check_microcode)
@@ -2297,6 +2304,7 @@ static struct pmu pmu = {
.event_idx  = x86_pmu_event_idx,
.sched_task = x86_pmu_sched_task,
.task_ctx_size  = sizeof(struct x86_perf_task_context),
+   .swap_task_ctx  = x86_pmu_swap_task_ctx,
.check_period   = x86_pmu_check_period,
 
.aux_output_match   = x86_pmu_aux_output_match,
-- 
2.20.1



Re: [PATCH v6 3/4] arm64: use both ZONE_DMA and ZONE_DMA32

2019-10-23 Thread Matthias Brugger



On 22/10/2019 13:23, Nicolas Saenz Julienne wrote:
> On Mon, 2019-10-21 at 16:36 -0400, Qian Cai wrote:
>> I managed to get more information here,
>>
>> [0.00] cma: dma_contiguous_reserve(limit c000)
>> [0.00] cma: dma_contiguous_reserve: reserving 64 MiB for global area
>> [0.00] cma: cma_declare_contiguous(size 0x0400, base
>> 0x, limit 0xc000 alignment 0x)
>> [0.00] cma: Failed to reserve 512 MiB
>>
>> Full dmesg:
>>
>> https://cailca.github.io/files/dmesg.txt
> 
> OK I got it, reproduced it too.
> 
> Here are the relevant logs:
> 
>   [0.00]   DMA  [mem 0x802f-0xbfff]
>   [0.00]   DMA32[mem 0xc000-0x]
>   [0.00]   Normal   [mem 0x0001-0x0097fcff]
> 
> As you can see ZONE_DMA spans from 0x802f-0xbfff which
> is slightly smaller than 1GB.
> 
>   [0.00] crashkernel reserved: 0x9fe0 - 0xbfe0 (512 MB)
> 
> Here crashkernel reserved 512M in ZONE_DMA.
> 
>   [0.00] cma: Failed to reserve 512 MiB
> 
> CMA tried to allocate 512M in ZONE_DMA, which fails as there is not enough
> space. Makes sense.
> 
> A fix could be moving crashkernel reservations after CMA and then if unable to
> fit in ZONE_DMA try ZONE_DMA32 before bailing out. Maybe it's a little over
> the top, yet although most devices will be fine with ZONE_DMA32, the RPi4 needs
> crashkernel to be reserved in ZONE_DMA.
> 
> My knowledge of Kdump is limited, so I'd love to see what Catalin has to say.
> Here's a tested patch of what I'm proposing:
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 120c26af916b..49f3c3a34ae2 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -76,6 +76,7 @@ phys_addr_t arm64_dma32_phys_limit __ro_after_init;
>  static void __init reserve_crashkernel(void)
>  {
> unsigned long long crash_base, crash_size;
> +   phys_addr_t limit = arm64_dma_phys_limit;
> int ret;
> 
> ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> @@ -86,11 +87,14 @@ static void __init reserve_crashkernel(void)
> 
> crash_size = PAGE_ALIGN(crash_size);
> 
> +again:
> if (crash_base == 0) {
> /* Current arm64 boot protocol requires 2MB alignment */
> -   crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
> -   crash_size, SZ_2M);
> -   if (crash_base == 0) {
> +   crash_base = memblock_find_in_range(0, limit, crash_size,
> SZ_2M);
> +   if (!crash_base && limit == arm64_dma_phys_limit) {
> +   limit = arm64_dma32_phys_limit;
> +   goto again;

I'd try to avoid the goto.
Apart from that, we should print an informational message that the crashkernel
got reserved in arm64_dma_phys_limit. Otherwise the RPi4 might break silently,
and this will give the user at least a hint about what happened.

Regards,
Matthias

> +   } else if (!crash_base && limit == arm64_dma32_phys_limit) {
> pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> crash_size);
> return;
> @@ -448,13 +452,13 @@ void __init arm64_memblock_init(void)
> else
> arm64_dma32_phys_limit = PHYS_MASK + 1;
> 
> -   reserve_crashkernel();
> -
> reserve_elfcorehdr();
> 
> high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
> 
> dma_contiguous_reserve(arm64_dma_phys_limit ? : arm64_dma32_phys_limit);
> +
> +   reserve_crashkernel();
>  }
> 
>  void __init bootmem_init(void)
> 
> 
> Regards,
> Nicolas
> 
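
As a rough illustration of the two review comments above (no goto, and a message saying where the reservation ended up), the retry can be written as a loop over candidate limits. This is a userspace sketch under stated assumptions, not the proposed kernel change: mock_find_in_range() is a made-up stand-in for memblock_find_in_range(), free_start is an invented knob for the test, and reserve_crash_base() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t phys_addr_t;

/* Test knob: lowest address at which the mock allocator has free memory. */
static phys_addr_t free_start;

/*
 * Made-up stand-in for memblock_find_in_range(): returns the lowest
 * suitable base inside [start, end), or 0 when the range cannot fit size.
 */
static phys_addr_t mock_find_in_range(phys_addr_t start, phys_addr_t end,
				      phys_addr_t size)
{
	phys_addr_t base = start > free_start ? start : free_start;

	return (base + size <= end) ? base : 0;
}

/*
 * Try each limit in order (e.g. the ZONE_DMA limit first, then the
 * ZONE_DMA32 limit) and log which one satisfied the request, so that a
 * fallback to the larger zone is visible in the log.
 */
static phys_addr_t reserve_crash_base(const phys_addr_t *limits, size_t n,
				      phys_addr_t size)
{
	assert(size > 0);

	for (size_t i = 0; i < n; i++) {
		phys_addr_t base = mock_find_in_range(0, limits[i], size);

		if (base) {
			printf("crashkernel: base 0x%llx found below limit 0x%llx\n",
			       (unsigned long long)base,
			       (unsigned long long)limits[i]);
			return base;
		}
	}

	return 0; /* caller reports "cannot allocate crashkernel" */
}
```

Listing the limits in an array keeps the ordering policy in one place; the warning path for total failure stays with the caller, as in the quoted patch.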


[PATCH v5 3/4] perf/x86/intel: implement LBR callstacks context synchronization

2019-10-23 Thread Alexey Budankov


Implement intel_pmu_lbr_swap_task_ctx() method updating counters
of the events that requested LBR callstack data on a sample.

The counter can be zero for the case when task context belongs to
a thread that has just come from a block on a futex and the context
contains saved (lbr_stack_state == LBR_VALID) LBR register values.

For the values to be restored at LBR registers on the next thread's
switch-in event it swaps the counter value with the one that is
expected to be non zero at the previous equivalent task perf event
context.

Swap operation type ensures the previous task perf event context
stays consistent with the amount of events that requested LBR
callstack data on a sample.

Signed-off-by: Alexey Budankov 
---
 arch/x86/events/intel/lbr.c  | 23 +++
 arch/x86/events/perf_event.h |  3 +++
 2 files changed, 26 insertions(+)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index ea54634eabf3..534c76606049 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -417,6 +417,29 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
cpuc->last_log_id = ++task_ctx->log_id;
 }
 
+void intel_pmu_lbr_swap_task_ctx(struct perf_event_context *prev,
+struct perf_event_context *next)
+{
+   struct x86_perf_task_context *prev_ctx_data, *next_ctx_data;
+
+   swap(prev->task_ctx_data, next->task_ctx_data);
+
+   /*
+* Architecture specific synchronization makes sense in
+* case both prev->task_ctx_data and next->task_ctx_data
+* pointers are allocated.
+*/
+
+   prev_ctx_data = next->task_ctx_data;
+   next_ctx_data = prev->task_ctx_data;
+
+   if (!prev_ctx_data || !next_ctx_data)
+   return;
+
+   swap(prev_ctx_data->lbr_callstack_users,
+next_ctx_data->lbr_callstack_users);
+}
+
 void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in)
 {
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 5384317eaa16..930611db8f9a 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1024,6 +1024,9 @@ void intel_pmu_store_pebs_lbrs(struct pebs_lbr *lbr);
 
 void intel_ds_init(void);
 
+void intel_pmu_lbr_swap_task_ctx(struct perf_event_context *prev,
+struct perf_event_context *next);
+
 void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in);
 
 u64 lbr_from_signext_quirk_wr(u64 val);
-- 
2.20.1
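
The effect of the double swap in this patch can be modelled in userspace as below. The structures are simplified stand-ins invented for this sketch (only the field names task_ctx_data and lbr_callstack_users come from the patch), and swap_task_ctx() here is an illustrative function rather than the kernel one. The point it shows: after the data blocks change owners, swapping the counter back leaves each context with the user count it had before.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct x86_perf_task_context. */
struct task_ctx_data {
	int lbr_callstack_users;
};

/* Simplified stand-in for struct perf_event_context. */
struct event_context {
	struct task_ctx_data *task_ctx_data;
};

static void swap_task_ctx(struct event_context *prev,
			  struct event_context *next)
{
	struct task_ctx_data *tmp = prev->task_ctx_data;
	int users;

	/* Exchange the PMU specific data blocks between the two contexts. */
	prev->task_ctx_data = next->task_ctx_data;
	next->task_ctx_data = tmp;

	/* Counter synchronization only makes sense if both blocks exist. */
	if (!prev->task_ctx_data || !next->task_ctx_data)
		return;

	/* Swap the counters back so each context keeps its own user count. */
	users = prev->task_ctx_data->lbr_callstack_users;
	prev->task_ctx_data->lbr_callstack_users =
		next->task_ctx_data->lbr_callstack_users;
	next->task_ctx_data->lbr_callstack_users = users;
}
```

In this model the saved LBR state travels with the data block, while the number of events requesting callstacks stays attached to the context that owns those events.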



[PATCH v5 4/4] perf/core,x86: synchronize PMU task contexts on optimized context switches

2019-10-23 Thread Alexey Budankov


Install Intel specific PMU task context synchronization adapter and
extend optimized context switch path with PMU specific task context
synchronization to fix LBR callstack virtualization on context switches.

Signed-off-by: Alexey Budankov 
---
 arch/x86/events/intel/core.c |  7 +++
 kernel/events/core.c | 13 -
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index bbf6588d47ee..dc64b16e6b71 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3820,6 +3820,12 @@ static void intel_pmu_sched_task(struct perf_event_context *ctx,
intel_pmu_lbr_sched_task(ctx, sched_in);
 }
 
+static void intel_pmu_swap_task_ctx(struct perf_event_context *prev,
+   struct perf_event_context *next)
+{
+   intel_pmu_lbr_swap_task_ctx(prev, next);
+}
+
 static int intel_pmu_check_period(struct perf_event *event, u64 value)
 {
return intel_pmu_has_bts_period(event, value) ? -EINVAL : 0;
@@ -3955,6 +3961,7 @@ static __initconst const struct x86_pmu intel_pmu = {
 
.guest_get_msrs = intel_guest_get_msrs,
.sched_task = intel_pmu_sched_task,
+   .swap_task_ctx  = intel_pmu_swap_task_ctx,
 
.check_period   = intel_pmu_check_period,
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f9a5d4356562..ed31aa849161 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3204,10 +3204,21 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
raw_spin_lock(&ctx->lock);
raw_spin_lock_nested(&next_ctx->lock, SINGLE_DEPTH_NESTING);
if (context_equiv(ctx, next_ctx)) {
+   struct pmu *pmu = ctx->pmu;
+
WRITE_ONCE(ctx->task, next);
WRITE_ONCE(next_ctx->task, task);
 
-   swap(ctx->task_ctx_data, next_ctx->task_ctx_data);
+   /*
+* PMU specific parts of task perf context can require
+* additional synchronization. As an example of such
+* synchronization see implementation details of Intel
+* LBR call stack data profiling;
+*/
+   if (pmu->swap_task_ctx)
+   pmu->swap_task_ctx(ctx, next_ctx);
+   else
+   swap(ctx->task_ctx_data, next_ctx->task_ctx_data);
 
/*
 * RCU_INIT_POINTER here is safe because we've not
-- 
2.20.1
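
The dispatch pattern used in the core hunk above (call the PMU hook when one is installed, otherwise fall back to the plain pointer swap) can be sketched in isolation. All type and function names below are invented for this userspace model; counting_swap() is a hypothetical hook that only exists so the dispatch can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Invented, minimal stand-ins for the context and the pmu ops table. */
struct ctx {
	void *task_ctx_data;
};

struct pmu_ops {
	/* optional: may be NULL when the PMU needs no extra synchronization */
	void (*swap_task_ctx)(struct ctx *prev, struct ctx *next);
};

static void plain_swap(struct ctx *prev, struct ctx *next)
{
	void *tmp = prev->task_ctx_data;

	prev->task_ctx_data = next->task_ctx_data;
	next->task_ctx_data = tmp;
}

/* Hypothetical hook: does the swap and counts how often it was called. */
static int hook_calls;

static void counting_swap(struct ctx *prev, struct ctx *next)
{
	hook_calls++;
	plain_swap(prev, next);
}

/* Use the PMU specific hook if present, else the generic swap. */
static void sched_out_swap(const struct pmu_ops *pmu, struct ctx *prev,
			   struct ctx *next)
{
	if (pmu->swap_task_ctx)
		pmu->swap_task_ctx(prev, next);
	else
		plain_swap(prev, next);
}
```

Because the hook replaces the generic swap instead of running after it, an implementation that installs the hook is responsible for exchanging the data pointers itself, which is what the Intel LBR implementation in patch 3/4 does.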



Re: [PATCH] hugetlbfs: add O_TMPFILE support

2019-10-23 Thread Piotr Sarna

On 10/23/19 4:55 AM, Mike Kravetz wrote:

On 10/22/19 12:09 AM, Piotr Sarna wrote:

On 10/21/19 7:17 PM, Mike Kravetz wrote:

On 10/15/19 4:37 PM, Mike Kravetz wrote:

On 10/15/19 3:50 AM, Michal Hocko wrote:

On Tue 15-10-19 11:01:12, Piotr Sarna wrote:

With hugetlbfs, a common pattern for mapping anonymous huge pages
is to create a temporary file first.


Really? I thought that this is normally done by shmget(SHM_HUGETLB) or
mmap(MAP_HUGETLB). Or maybe I misunderstood your definition of anonymous
huge pages.


Currently libraries like
libhugetlbfs and seastar create these with a standard mkstemp+unlink
trick,


I would guess that much of libhugetlbfs was written before MAP_HUGETLB
was implemented.  So, that is why it does not make (more) use of that
option.

The implementation looks to be straightforward.  However, I really do
not want to add more functionality to hugetlbfs unless there is a specific
use case that needs it.


It was not my intention to shut down discussion on this patch.  I was just
asking if there was a (new) use case for such a change.  I am checking with
our DB team as I seem to remember them using the create/unlink approach for
hugetlbfs in one of their upcoming models.

Is there a new use case you were thinking about?



Oh, I indeed thought it was a shutdown. The use case I was thinking about was
in Seastar, where the create+unlink trick is used for creating temporary files
(in a generic way, not only for hugetlbfs). I simply intended to migrate it to
a newer approach - O_TMPFILE. However, for the specific case of hugetlbfs it
indeed makes more sense to skip it and use mmap's MAP_HUGETLB, so perhaps it's
not worth it to patch a perfectly good and stable file system just to provide
semi-useful flag support. My implementation of tmpfile for hugetlbfs is
straightforward indeed, but the MAP_HUGETLB argument made me realize that it
may not be worth the trouble - especially since MAP_HUGETLB has been available
since 2.6 and O_TMPFILE was introduced around v3.11, so the mmap way looks more
portable.

tldr: I'd be very happy to get my patch accepted, but the use case I had in 
mind can be easily solved with MAP_HUGETLB, so I don't insist.


If you really are after something like 'anonymous memory' for Seastar,
then MAP_HUGETLB would be the better approach.


Just to clarify - my original goal was to migrate Seastar's temporary 
file implementation (which is fs-agnostic, based on descriptors) from 
the current create+unlink to O_TMPFILE, for robustness. One of the 
internal usages of this generic mechanism was to create a tmpfile on 
hugetlbfs and that's why I sent this patch. However, this particular 
internal usage can be easily switched to more portable MAP_HUGETLB, 
which will also mean that the generic tmpfile implementation will not be 
used internally for hugetlbfs anymore.


There *may* still be value in being able to support hugetlbfs once 
Seastar's tmpfile implementation migrates to O_TMPFILE, since the 
library offers creating temporary files in its public API, but there's 
no immediate use case I can apply it to.




I'm still checking with Oracle DB team as they may have a use for O_TMPFILE
in an upcoming release.  In their use case, they want an open fd to work with.
If it looks like they will proceed in this direction, we can work to get
your patch moved forward.

Thanks,


Great, if it turns out that my patch helps anyone with their O_TMPFILE 
usage, I'd be very glad to see it merged.
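
For readers following the thread, the MAP_HUGETLB route mentioned several times above needs no hugetlbfs file at all. The sketch below is a minimal Linux-only example, not code from Seastar or libhugetlbfs: map_anon_buffer() is a hypothetical helper, and the fallback to regular pages when no huge pages are available is a design choice of this sketch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map an anonymous read/write buffer, backed by huge
 * pages when the kernel can reserve them and by regular pages otherwise.
 * len should be a multiple of the huge page size. *used_huge is set to 1
 * when the huge page mapping succeeded, 0 on fallback.
 * Returns NULL if both attempts fail.
 */
static void *map_anon_buffer(size_t len, int *used_huge)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	*used_huge = (p != MAP_FAILED);
	if (p == MAP_FAILED)
		p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}
```

This covers the "anonymous memory" case only. A caller that needs an open file descriptor, as in the Oracle DB case mentioned above, still needs a file on hugetlbfs, which is where create+unlink or O_TMPFILE come in.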




INFO: task hung in acct_process

2019-10-23 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:4f5cafb5 Linux 5.4-rc3
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14e1d2a0e0
kernel config:  https://syzkaller.appspot.com/x/.config?x=de66e73d1c10cebb
dashboard link: https://syzkaller.appspot.com/bug?extid=bece7c62047c98a5aa90
compiler:   clang version 9.0.0 (/home/glider/llvm/clang  
80fee25776c2fb61e74c1ecb1a523375c2500b69)


Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+bece7c62047c98a5a...@syzkaller.appspotmail.com

INFO: task syz-executor.5:19837 blocked for more than 143 seconds.
  Not tainted 5.4.0-rc3 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.5  D24680 19837  1 0x80004004
Call Trace:
 context_switch kernel/sched/core.c:3384 [inline]
 __schedule+0x74b/0xb80 kernel/sched/core.c:4069
 schedule+0x131/0x1e0 kernel/sched/core.c:4136
 schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:4195
 __mutex_lock_common+0x1411/0x2e20 kernel/locking/mutex.c:1033
 __mutex_lock kernel/locking/mutex.c:1103 [inline]
 mutex_lock_nested+0x1b/0x30 kernel/locking/mutex.c:1118
 acct_get kernel/acct.c:161 [inline]
 slow_acct_process kernel/acct.c:577 [inline]
 acct_process+0x3af/0x570 kernel/acct.c:605
 do_exit+0x573/0x2190 kernel/exit.c:807
 do_group_exit+0x15c/0x2b0 kernel/exit.c:921
 get_signal+0x4ac/0x1d60 kernel/signal.c:2734
 do_signal+0x37/0x640 arch/x86/kernel/signal.c:815
 exit_to_usermode_loop arch/x86/entry/common.c:159 [inline]
 prepare_exit_to_usermode+0x303/0x580 arch/x86/entry/common.c:194
 syscall_return_slowpath+0x113/0x4a0 arch/x86/entry/common.c:274
 do_syscall_64+0x11f/0x1c0 arch/x86/entry/common.c:300
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4139ea
Code: 00 48 89 04 24 48 c7 44 24 08 21 00 00 00 e8 ed 7e 01 00 0f 0b 48 89  
1c 24 e8 b2 f9 03 00 48 8b 44 24 10 48 89 44 24 38 48 8b <4c> 24 08 48 89  
4c 24 40 e8 69 88 01 00 48 8d 05 f3 a4 4e 00 48 89

RSP: 002b:7ffcbd566058 EFLAGS: 0246 ORIG_RAX: 003d
RAX: fe00 RBX: 00bfa940 RCX: 004139ea
RDX: 4000 RSI: 7ffcbd566090 RDI: 
RBP: 39b0 R08: 0001 R09: 0001
R10:  R11: 0246 R12: 0011
R13: 7ffcbd566090 R14: 00bfa99b R15: 7ffcbd5660a0

Showing all locks held in the system:
1 lock held by khungtaskd/1064:
 #0: 888d3f80 (rcu_read_lock){}, at: rcu_lock_acquire+0x4/0x30  
include/linux/rcupdate.h:207

1 lock held by rsyslogd/7848:
 #0: 8880a2fecde0 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x243/0x2e0  
fs/file.c:801

2 locks held by getty/7938:
 #0: 88809a98c090 (&tty->ldisc_sem){}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
 #1: c90005f012e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

2 locks held by getty/7939:
 #0: 8880a58d5090 (&tty->ldisc_sem){}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
 #1: c90005f152e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

2 locks held by getty/7940:
 #0: 8880a12f7090 (&tty->ldisc_sem){}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
 #1: c90005f092e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

2 locks held by getty/7941:
 #0: 8880a7a88090 (&tty->ldisc_sem){}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
 #1: c90005f1d2e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

2 locks held by getty/7942:
 #0: 8880a892b090 (&tty->ldisc_sem){}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
 #1: c90005f292e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

2 locks held by getty/7943:
 #0: 8880a7bb4090 (&tty->ldisc_sem){}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
 #1: c90005f252e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

2 locks held by getty/7944:
 #0: 8880993fd090 (&tty->ldisc_sem){}, at:  
tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:272
 #1: c90005ef12e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x221/0x1b00 drivers/tty/n_tty.c:2156

1 lock held by syz-executor.5/19837:
 #0: 8880a7dbb0f0 (&acct->lock#2){+.+.}, at: acct_get kernel/acct.c:161  
[inline]
 #0: 8880a7dbb0f0 (&acct->lock#2){+.+.}, at: slow_acct_process  
kernel/acct.c:577 [inline]
 #0: 8880a7dbb0f0 (&acct->lock#2){+.+.}, at: acct_process+0x3af/0x570  
kernel/acct.c:605

3 locks held by syz-executor.5/17136:
 #0: 8880a7dbb0f0 (&acct->lock#2){+.+.}, at: acct_get kernel/acct.c:161  
[inline]
 #0: 8880a7dbb0f0 (&acct

Re: [PATCH] clocksource/drivers: Fix error handling in ttc_setup_clocksource

2019-10-23 Thread Markus Elfring
> In the implementation of ttc_setup_clocksource() when
> clk_notifier_register() fails the execution should go to error handling.
> Additionally, to avoid memory leak the allocated memory for ttccs should
> be released, too.

I have other wording preferences, so I imagine that this change
description can still be improved a bit.

What do you think about omitting the word “should” when describing
the previous software situation?


> So, goto error handling to release the memory and return.

Would you like to express the addition of a jump target (according to
the Linux coding style) for the completion of desired exception handling
in a different way?

Regards,
Markus
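
For context, the kind of change the quoted commit message describes (jump to a cleanup label when notifier registration fails, and free the allocation there) follows the usual kernel error-handling shape. The sketch below is a userspace model with invented names; it is not the ttc_setup_clocksource() code, and register_notifier() and live_allocations exist only to make the behaviour observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Invented stand-in for the driver's clocksource state. */
struct clocksource_state {
	int rate;
};

/* Tracks outstanding allocations so a leak would be visible. */
static int live_allocations;

/* Invented stand-in for clk_notifier_register(); fails on request. */
static int register_notifier(int should_fail)
{
	return should_fail ? -EINVAL : 0;
}

static int setup_clocksource(int notifier_should_fail,
			     struct clocksource_state **out)
{
	struct clocksource_state *cs;
	int err;

	cs = calloc(1, sizeof(*cs));
	if (!cs)
		return -ENOMEM;
	live_allocations++;

	err = register_notifier(notifier_should_fail);
	if (err)
		goto out_free; /* do not continue with a failed notifier */

	*out = cs;
	return 0;

out_free:
	/* Single exit point for failures: release what was allocated. */
	free(cs);
	live_allocations--;
	return err;
}
```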


INFO: task hung in register_netdevice_notifier (2)

2019-10-23 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:3b7c59a1 Merge tag 'pinctrl-v5.4-2' of git://git.kernel.or..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=131abff760
kernel config:  https://syzkaller.appspot.com/x/.config?x=420126a10fdda0f1
dashboard link: https://syzkaller.appspot.com/bug?extid=355f8edb2ff45d5f95fa
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+355f8edb2ff45d5f9...@syzkaller.appspotmail.com

INFO: task syz-executor.3:12938 blocked for more than 143 seconds.
  Not tainted 5.4.0-rc4+ #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.3  D28568 12938  12570 0x0004
Call Trace:
 context_switch kernel/sched/core.c:3384 [inline]
 __schedule+0x94f/0x1e70 kernel/sched/core.c:4069
 schedule+0xd9/0x260 kernel/sched/core.c:4136
 rwsem_down_write_slowpath+0x70b/0xf90 kernel/locking/rwsem.c:1238
 __down_write kernel/locking/rwsem.c:1392 [inline]
 down_write+0x13c/0x150 kernel/locking/rwsem.c:1535
 register_netdevice_notifier+0x7e/0x650 net/core/dev.c:1644
 bcm_init+0x1a8/0x220 net/can/bcm.c:1451
 can_create+0x288/0x4b0 net/can/af_can.c:167
 __sock_create+0x3d8/0x730 net/socket.c:1418
 sock_create net/socket.c:1469 [inline]
 __sys_socket+0x103/0x220 net/socket.c:1511
 __do_sys_socket net/socket.c:1520 [inline]
 __se_sys_socket net/socket.c:1518 [inline]
 __x64_sys_socket+0x73/0xb0 net/socket.c:1518
 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x459ef9
Code: Bad RIP value.
RSP: 002b:7f95783e1c78 EFLAGS: 0246 ORIG_RAX: 0029
RAX: ffda RBX: 0003 RCX: 00459ef9
RDX: 0002 RSI: 0002 RDI: 001d
RBP: 0075bf20 R08:  R09: 
R10:  R11: 0246 R12: 7f95783e26d4
R13: 004c8f16 R14: 004e02c0 R15: 
INFO: task syz-executor.3:12940 blocked for more than 143 seconds.
  Not tainted 5.4.0-rc4+ #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor.3  D29112 12940  12570 0x0004
Call Trace:
 context_switch kernel/sched/core.c:3384 [inline]
 __schedule+0x94f/0x1e70 kernel/sched/core.c:4069
 schedule+0xd9/0x260 kernel/sched/core.c:4136
 rwsem_down_write_slowpath+0x70b/0xf90 kernel/locking/rwsem.c:1238
 __down_write kernel/locking/rwsem.c:1392 [inline]
 down_write+0x13c/0x150 kernel/locking/rwsem.c:1535
 register_netdevice_notifier+0x7e/0x650 net/core/dev.c:1644
 bcm_init+0x1a8/0x220 net/can/bcm.c:1451
 can_create+0x288/0x4b0 net/can/af_can.c:167
 __sock_create+0x3d8/0x730 net/socket.c:1418
 sock_create net/socket.c:1469 [inline]
 __sys_socket+0x103/0x220 net/socket.c:1511
 __do_sys_socket net/socket.c:1520 [inline]
 __se_sys_socket net/socket.c:1518 [inline]
 __x64_sys_socket+0x73/0xb0 net/socket.c:1518
 do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x459ef9
Code: Bad RIP value.
RSP: 002b:7f95783c0c78 EFLAGS: 0246 ORIG_RAX: 0029
RAX: ffda RBX: 0003 RCX: 00459ef9
RDX: 0002 RSI: 0002 RDI: 001d
RBP: 0075bfc8 R08:  R09: 
R10:  R11: 0246 R12: 7f95783c16d4
R13: 004c8f16 R14: 004e02c0 R15: 

Showing all locks held in the system:
1 lock held by khungtaskd/1070:
 #0: 88fab040 (rcu_read_lock){}, at:  
debug_show_all_locks+0x5f/0x27e kernel/locking/lockdep.c:5337

2 locks held by rs:main Q:Reg/8631:
 #0: 88809a078d60 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xee/0x110  
fs/file.c:801
 #1: 88821637c428 (sb_writers#3){.+.+}, at: file_start_write  
include/linux/fs.h:2882 [inline]
 #1: 88821637c428 (sb_writers#3){.+.+}, at: vfs_write+0x485/0x5d0  
fs/read_write.c:557

1 lock held by rsyslogd/8633:
 #0: 8880a9391120 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0xee/0x110  
fs/file.c:801

2 locks held by getty/8723:
 #0: 888096a75090 (&tty->ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f1d2e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8724:
 #0: 8880a181f090 (&tty->ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f392e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks held by getty/8725:
 #0: 88809ccbf090 (&tty->ldisc_sem){}, at:  
ldsem_down_read+0x33/0x40 drivers/tty/tty_ldsem.c:340
 #1: c90005f292e0 (&ldata->atomic_read_lock){+.+.}, at:  
n_tty_read+0x232/0x1c10 drivers/tty/n_tty.c:2156

2 locks h

INFO: task hung in vfs_unlink

2019-10-23 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:d72e90f3 Linux 4.18-rc6
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=104dc65840
kernel config:  https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
dashboard link: https://syzkaller.appspot.com/bug?extid=36feff43582f1f97716a
compiler:   gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+36feff43582f1f977...@syzkaller.appspotmail.com

binder: 12679:12746 ioctl 40046207 0 returned -16
binder: 12679:12746 unknown command -565157109
binder: 12679:12746 ioctl c0306201 204edfd0 returned -22
binder: 12679:12743 unknown command 0
binder: 12679:12743 ioctl c0306201 20007000 returned -22
INFO: task syz-executor7:12738 blocked for more than 140 seconds.
  Not tainted 4.18.0-rc6+ #160
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D25832 12738   4599 0x0004
Call Trace:
 context_switch kernel/sched/core.c:2853 [inline]
 __schedule+0x87c/0x1ed0 kernel/sched/core.c:3501
 schedule+0xfb/0x450 kernel/sched/core.c:3545
 __rwsem_down_write_failed_common+0x95d/0x1630  
kernel/locking/rwsem-xadd.c:566

 rwsem_down_write_failed+0xe/0x10 kernel/locking/rwsem-xadd.c:595
 call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
 __down_write arch/x86/include/asm/rwsem.h:142 [inline]
 down_write+0xaa/0x130 kernel/locking/rwsem.c:72
 inode_lock include/linux/fs.h:715 [inline]
 vfs_unlink+0xd1/0x510 fs/namei.c:4001
 do_unlinkat+0x6cc/0xa30 fs/namei.c:4073
 __do_sys_unlink fs/namei.c:4120 [inline]
 __se_sys_unlink fs/namei.c:4118 [inline]
 __x64_sys_unlink+0x42/0x50 fs/namei.c:4118
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455ab9
Code: e0 1f 48 89 04 24 e8 b6 6f fd ff e8 81 6a fd ff e8 5c 68 fd ff 48 8d  
05 23 cd 48 00 48 89 04 24 48 c7 44 24 08 1d 00 00 00 e8 <13> 5e fd ff 0f  
0b e8 8c 44 00 00 e9 07 f0 ff ff cc cc cc cc cc cc

RSP: 002b:7f2301e2cc68 EFLAGS: 0246 ORIG_RAX: 0057
RAX: ffda RBX: 7f2301e2d6d4 RCX: 00455ab9
RDX:  RSI:  RDI: 2300
RBP: 0072bf48 R08:  R09: 
R10:  R11: 0246 R12: 
R13: 004c0088 R14: 004d4350 R15: 0001
INFO: task syz-executor7:12740 blocked for more than 140 seconds.
  Not tainted 4.18.0-rc6+ #160
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D24216 12740   4599 0x0004
Call Trace:
 context_switch kernel/sched/core.c:2853 [inline]
 __schedule+0x87c/0x1ed0 kernel/sched/core.c:3501
 schedule+0xfb/0x450 kernel/sched/core.c:3545
 __rwsem_down_write_failed_common+0x95d/0x1630  
kernel/locking/rwsem-xadd.c:566

 rwsem_down_write_failed+0xe/0x10 kernel/locking/rwsem-xadd.c:595
 call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
 __down_write arch/x86/include/asm/rwsem.h:142 [inline]
 down_write+0xaa/0x130 kernel/locking/rwsem.c:72
 inode_lock include/linux/fs.h:715 [inline]
 lock_mount+0x8c/0x2e0 fs/namespace.c:2088
 do_add_mount+0x27/0x370 fs/namespace.c:2465
 do_new_mount fs/namespace.c:2532 [inline]
 do_mount+0x193f/0x30e0 fs/namespace.c:2848
 ksys_mount+0x12d/0x140 fs/namespace.c:3064
 __do_sys_mount fs/namespace.c:3078 [inline]
 __se_sys_mount fs/namespace.c:3075 [inline]
 __x64_sys_mount+0xbe/0x150 fs/namespace.c:3075
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x455ab9
Code: e0 1f 48 89 04 24 e8 b6 6f fd ff e8 81 6a fd ff e8 5c 68 fd ff 48 8d 05 23 cd 48 00 48 89 04 24 48 c7 44 24 08 1d 00 00 00 e8 <13> 5e fd ff 0f 0b e8 8c 44 00 00 e9 07 f0 ff ff cc cc cc cc cc cc
RSP: 002b:7f2301e0bc68 EFLAGS: 0246 ORIG_RAX: 00a5
RAX: ffda RBX: 7f2301e0c6d4 RCX: 00455ab9
RDX: 2900 RSI: 2000 RDI: 
RBP: 0072bff0 R08: 2380 R09: 
R10:  R11: 0246 R12: 
R13: 004c0201 R14: 004cfe50 R15: 0002
INFO: task syz-executor7:12742 blocked for more than 140 seconds.
  Not tainted 4.18.0-rc6+ #160
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor7   D25408 12742   4599 0x0004
Call Trace:
 context_switch kernel/sched/core.c:2853 [inline]
 __schedule+0x87c/0x1ed0 kernel/sched/core.c:3501
 schedule+0xfb/0x450 kernel/sched/core.c:3545
 __rwsem_down_write_failed_common+0x95d/0x1630 kernel/locking/rwsem-xadd.c:566
 rwsem_down_write_failed+0xe/0x10 kernel/locking/rwsem-xadd.c:595
 call_rwsem_down_write_failed+0x17/0x30 arch/x86/lib/rwsem.S:117
 __down_write arch/x86/include/asm/rwsem.h:142 [inline]
 down_w

RE: [PATCH v1 1/1] scsi: ufs: Add command logging infrastructure

2019-10-23 Thread Winkler, Tomas
> Add the necessary infrastructure to keep timestamp history of commands,
> events and other useful info for debugging complex issues. This helps in
> diagnosing events leading up to failure.

Why not use tracepoints for that?
Thanks
Tomas

> Signed-off-by: Can Guo 
> ---
>  drivers/scsi/ufs/Kconfig  |  12 +++
>  drivers/scsi/ufs/ufshcd.c | 214 +++---
>  drivers/scsi/ufs/ufshcd.h |  24 +-
>  3 files changed, 218 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig
> index 0b845ab..afc70cb 100644
> --- a/drivers/scsi/ufs/Kconfig
> +++ b/drivers/scsi/ufs/Kconfig
> @@ -50,6 +50,18 @@ config SCSI_UFSHCD
> However, do not compile this as a module if your root file system
> (the one containing the directory /) is located on a UFS device.
> 
> +config SCSI_UFSHCD_CMD_LOGGING
> + bool "Universal Flash Storage host controller driver layer command logging support"
> + depends on SCSI_UFSHCD
> + help
> +   This selects the UFS host controller driver layer command logging.
> +   UFS host controller driver layer command logging records all the
> +   command information sent from UFS host controller for debugging
> +   purpose.
> +
> +   Select this if you want above mentioned debug information captured.
> +   If unsure, say N.
> +
>  config SCSI_UFSHCD_PCI
>   tristate "PCI bus based UFS Controller support"
>   depends on SCSI_UFSHCD && PCI
> diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
> index c28c144..f3faa85 100644
> --- a/drivers/scsi/ufs/ufshcd.c
> +++ b/drivers/scsi/ufs/ufshcd.c
> @@ -91,6 +91,9 @@
>  /* default delay of autosuspend: 2000 ms */
>  #define RPM_AUTOSUSPEND_DELAY_MS 2000
> 
> +/* Maximum command logging entries */
> +#define UFSHCD_MAX_CMD_LOGGING   20
> +
>  #define ufshcd_toggle_vreg(_dev, _vreg, _on) \
>   ({  \
>   int _ret;   \
> @@ -328,14 +331,135 @@ static void ufshcd_add_tm_upiu_trace(struct
> ufs_hba *hba, unsigned int tag,
>   &descp->input_param1);
>  }
> 
> -static void ufshcd_add_command_trace(struct ufs_hba *hba,
> - unsigned int tag, const char *str)
> +static inline void ufshcd_add_command_trace(struct ufs_hba *hba,
> + struct ufshcd_cmd_log_entry *entry)
> +{
> + if (trace_ufshcd_command_enabled()) {
> + u32 intr = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
> +
> + trace_ufshcd_command(dev_name(hba->dev), entry->str,
> entry->tag,
> +  entry->doorbell, entry->transfer_len, intr,
> +  entry->lba, entry->cmd_id);
> + }
> +}
> +
> +#ifdef CONFIG_SCSI_UFSHCD_CMD_LOGGING
> +static void ufshcd_cmd_log_init(struct ufs_hba *hba)
> +{
> + /* Allocate log entries */
> + if (!hba->cmd_log.entries) {
> + hba->cmd_log.entries =
> kcalloc(UFSHCD_MAX_CMD_LOGGING,
> + sizeof(struct ufshcd_cmd_log_entry), GFP_KERNEL);
> + if (!hba->cmd_log.entries)
> + return;
> + dev_dbg(hba->dev, "%s: cmd_log.entries initialized\n",
> + __func__);
> + }
> +}
> +
> +static void __ufshcd_cmd_log(struct ufs_hba *hba, char *str, char *cmd_type,
> +  unsigned int tag, u8 cmd_id, u8 idn, u8 lun,
> +  sector_t lba, int transfer_len)
> +{
> + struct ufshcd_cmd_log_entry *entry;
> +
> + if (!hba->cmd_log.entries)
> + return;
> +
> + entry = &hba->cmd_log.entries[hba->cmd_log.pos];
> + entry->lun = lun;
> + entry->str = str;
> + entry->cmd_type = cmd_type;
> + entry->cmd_id = cmd_id;
> + entry->lba = lba;
> + entry->transfer_len = transfer_len;
> + entry->idn = idn;
> + entry->doorbell = ufshcd_readl(hba,
> REG_UTP_TRANSFER_REQ_DOOR_BELL);
> + entry->tag = tag;
> + entry->tstamp = ktime_get();
> + entry->outstanding_reqs = hba->outstanding_reqs;
> + entry->seq_num = hba->cmd_log.seq_num;
> + hba->cmd_log.seq_num++;
> + hba->cmd_log.pos = (hba->cmd_log.pos + 1) %
> UFSHCD_MAX_CMD_LOGGING;
> +
> + ufshcd_add_command_trace(hba, entry);
> +}
> +
> +static void ufshcd_cmd_log(struct ufs_hba *hba, char *str, char *cmd_type,
> + unsigned int tag, u8 cmd_id, u8 idn)
> +{
> + __ufshcd_cmd_log(hba, str, cmd_type, tag, cmd_id, idn, 0, 0, 0);
> +}
> +
> +static void ufshcd_dme_cmd_log(struct ufs_hba *hba, char *str, u8 cmd_id)
> +{
> + ufshcd_cmd_log(hba, str, "dme", 0, cmd_id, 0);
> +}
> +
> +static void ufshcd_print_cmd_log(struct ufs_hba *hba)
> +{
> + int i;
> + int pos;
> + struct ufshcd_cmd_log_entry *p;
> +
> + if (!hba->cmd_log.entries)
> + return;
> +
> + pos = hba->cmd_log.pos;
> + for 
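
The fixed-size log in the quoted __ufshcd_cmd_log() behaves as a ring
buffer: the write position wraps modulo UFSHCD_MAX_CMD_LOGGING, so only the
most recent entries survive. A minimal user-space sketch of that indexing
follows. The names cmd_log, cmd_log_add, cmd_log_count and cmd_log_oldest are
hypothetical and not part of the patch, and the entry is trimmed to two
fields for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CMD_LOGGING 20	/* mirrors UFSHCD_MAX_CMD_LOGGING in the patch */

/* Hypothetical, trimmed-down entry; the patch stores many more fields. */
struct cmd_log_entry {
	uint32_t seq_num;
	unsigned int tag;
};

struct cmd_log {
	struct cmd_log_entry entries[MAX_CMD_LOGGING];
	unsigned int pos;	/* next slot to be written */
	uint32_t seq_num;	/* total number of entries ever logged */
};

/* Record one entry, overwriting the oldest one once the buffer is full. */
static void cmd_log_add(struct cmd_log *cl, unsigned int tag)
{
	struct cmd_log_entry *entry;

	assert(cl != NULL);
	entry = &cl->entries[cl->pos];
	entry->tag = tag;
	entry->seq_num = cl->seq_num++;
	cl->pos = (cl->pos + 1) % MAX_CMD_LOGGING;
}

/* Number of valid entries currently held in the buffer. */
static unsigned int cmd_log_count(const struct cmd_log *cl)
{
	return cl->seq_num < MAX_CMD_LOGGING ? cl->seq_num : MAX_CMD_LOGGING;
}

/* Index of the oldest valid entry: slot 0 until the buffer has wrapped. */
static unsigned int cmd_log_oldest(const struct cmd_log *cl)
{
	return cl->seq_num < MAX_CMD_LOGGING ? 0 : cl->pos;
}
```

A dump routine would start at cmd_log_oldest() and walk cmd_log_count()
slots, wrapping with the same modulo, to print entries in chronological
order.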

Re: [PATCH v2] vhost: introduce mdev based hardware backend

2019-10-23 Thread Jason Wang



On 2019/10/23 3:07 PM, Tiwei Bie wrote:

On Wed, Oct 23, 2019 at 01:46:23PM +0800, Jason Wang wrote:

On 2019/10/23 11:02 AM, Tiwei Bie wrote:

On Tue, Oct 22, 2019 at 09:30:16PM +0800, Jason Wang wrote:

On 2019/10/22 5:52 PM, Tiwei Bie wrote:

This patch introduces a mdev based hardware vhost backend.
This backend is built on top of the same abstraction used
in virtio-mdev and provides a generic vhost interface for
userspace to accelerate the virtio devices in guest.

This backend is implemented as a mdev device driver on top
of the same mdev device ops used in virtio-mdev but using
a different mdev class id, and it will register the device
as a VFIO device for userspace to use. Userspace can setup
the IOMMU with the existing VFIO container/group APIs and
then get the device fd with the device name. After getting
the device fd of this device, userspace can use vhost ioctls
to setup the backend.

Signed-off-by: Tiwei Bie 
---
This patch depends on below series:
https://lkml.org/lkml/2019/10/17/286

v1 -> v2:
- Replace _SET_STATE with _SET_STATUS (MST);
- Check status bits at each step (MST);
- Report the max ring size and max number of queues (MST);
- Add missing MODULE_DEVICE_TABLE (Jason);
- Only support the network backend w/o multiqueue for now;

Any idea on how to extend it to support devices other than net? I think we
want a generic API or an API that could be made generic in the future.

Do we want e.g. a generic vhost-mdev for all kinds of devices, or to
introduce e.g. vhost-net-mdev and vhost-scsi-mdev?

One possible way is to do what vhost-user does. I.e. Apart from
the generic ring, features, ... related ioctls, we also introduce
device specific ioctls when we need them. As vhost-mdev just needs
to forward configs between parent and userspace and even won't
cache any info when possible,


So it looks to me this is only possible if we expose e.g set_config and
get_config to userspace.

The set_config and get_config interface isn't really everything
of device specific settings. We also have ctrlq in virtio-net.



Yes, but that could be handled by the existing API, couldn't it? Just set the
ctrl vq address and let the parent deal with it.








I think it might be better to do
this in one generic vhost-mdev module.


Looking at the definition of VhostUserRequest in qemu, it mixes the generic API
with device specific APIs. If we want to go this way (a generic vhost-mdev),
more questions need to be answered:

1) How could userspace know which type of vhost it would use? Do we need to
expose the virtio subsystem device to userspace in this case?

2) That generic vhost-mdev module still needs to filter out unsupported
ioctls for a specific type. E.g. if it probes a net device, it should refuse
the API for other types. This is in fact a vhost-mdev-net, just not modularized
on top of vhost-mdev.



- Some minor fixes and improvements;
- Rebase on top of virtio-mdev series v4;

[...]

+
+static long vhost_mdev_get_features(struct vhost_mdev *m, u64 __user *featurep)
+{
+   if (copy_to_user(featurep, &m->features, sizeof(m->features)))
+   return -EFAULT;

As discussed in the previous version, do we need to filter out the MQ feature here?

I think it's more straightforward to let the parent drivers
filter out the unsupported features. Otherwise it would be tricky
when we want to add more features in vhost-mdev module,


Isn't it as simple as removing the feature from the blacklist?

It's not really that easy. It may break the old drivers.



I'm not sure I understand here; we do feature negotiation anyhow. By
old drivers, do you mean the guest drivers without MQ?








i.e. if
the parent drivers may expose unsupported features and rely on
vhost-mdev to filter them out, these features will be exposed
to userspace automatically when they are enabled in vhost-mdev
in the future.


The issue is that only vhost-mdev knows its own limitations. E.g. in
this patch, vhost-mdev only implements a subset of the transport API, but the
parent doesn't know about that.

Taking MQ as an example again, there's no way (or no need) for the parent to
know that vhost-mdev does not support MQ.
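
The filtering argued for here amounts to masking the parent's feature bits
with a list of features the vhost-mdev layer cannot handle yet. A minimal
user-space sketch follows. The helper name vhost_mdev_filter_features and the
VHOST_MDEV_UNSUPPORTED mask are hypothetical and not taken from the patch;
the bit numbers are the ones defined by the virtio specification
(VIRTIO_NET_F_CSUM is bit 0, VIRTIO_NET_F_MQ is bit 22, VIRTIO_F_VERSION_1
is bit 32).

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_NET_F_CSUM	0
#define VIRTIO_NET_F_MQ		22
#define VIRTIO_F_VERSION_1	32

/* Hypothetical blacklist: features this vhost-mdev layer cannot handle yet. */
#define VHOST_MDEV_UNSUPPORTED	(1ULL << VIRTIO_NET_F_MQ)

/* Hide blacklisted bits from what the parent driver advertises. */
static uint64_t vhost_mdev_filter_features(uint64_t parent_features)
{
	return parent_features & ~VHOST_MDEV_UNSUPPORTED;
}
```

With this shape, enabling MQ later means dropping one bit from the mask, and
an old kernel keeps hiding MQ even when a newer parent driver advertises it.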

The mdev is a MDEV_CLASS_ID_VHOST mdev device. When the parent
is being developed, it should know the currently supported features
of vhost-mdev.



How can the parent know MQ is not supported by vhost-mdev?





And this allows an old kernel to work with new
parent drivers.

The new drivers should provide things like VIRTIO_MDEV_F_VERSION_1
to be compatible with the old kernels. When VIRTIO_MDEV_F_VERSION_1
is provided/negotiated, the behaviours should be consistent.



To be clear, I didn't mean a change in virtio-mdev API, I meant:

1) old vhost-mdev kernel driver that filters out MQ

2) new parent driver that supports MQ





So basically we have three choices here:

1) Implement what vhost-user did and implement a generic vhost-mdev (but may
still have lots of device specific code). To support advanced feature which
requires the access to config, still lots of API th

Re: [PATCH 3/7] Add a UFFD_SECURE flag to the userfaultfd API.

2019-10-23 Thread Cyrill Gorcunov
On Tue, Oct 22, 2019 at 09:11:04PM -0700, Andy Lutomirski wrote:
> Trying again.  It looks like I used the wrong address for Pavel.

Thanks for the CC, Andy! I must confess I haven't dived into the userfaultfd
engine personally, but let me CC more people involved from the CRIU side
(overquoting left untouched for their sake).

> 
> On Sat, Oct 12, 2019 at 6:14 PM Andy Lutomirski  wrote:
> >
> > [adding more people because this is going to be an ABI break, sigh]
> >
> > On Sat, Oct 12, 2019 at 5:52 PM Daniel Colascione  wrote:
> > >
> > > On Sat, Oct 12, 2019 at 4:10 PM Andy Lutomirski  wrote:
> > > >
> > > > On Sat, Oct 12, 2019 at 12:16 PM Daniel Colascione  wrote:
> > > > >
> > > > > The new secure flag makes userfaultfd use a new "secure" anonymous
> > > > > file object instead of the default one, letting security modules
> > > > > supervise userfaultfd use.
> > > > >
> > > > > Requiring that users pass a new flag lets us avoid changing the
> > > > > semantics for existing callers.
> > > >
> > > > Is there any good reason not to make this be the default?
> > > >
> > > >
> > > > The only downside I can see is that it would increase the memory usage
> > > > of userfaultfd(), but that doesn't seem like such a big deal.  A
> > > > lighter-weight alternative would be to have a single inode shared by
> > > > all userfaultfd instances, which would require a somewhat different
> > > > internal anon_inode API.
> > >
> > > I'd also prefer to just make SELinux use mandatory, but there's a
> > > nasty interaction with UFFD_EVENT_FORK. Adding a new UFFD_SECURE mode
> > > which blocks UFFD_EVENT_FORK sidesteps this problem. Maybe you know a
> > > better way to deal with it.
> >
> > ...
> >
> > > But maybe we can go further: let's separate authentication and
> > > authorization, as we do in other LSM hooks. Let's split my
> > > inode_init_security_anon into two hooks, inode_init_security_anon and
> > > inode_create_anon. We'd define the former to just initialize the file
> > > object's security information --- in the SELinux case, figuring out
> > > its class and SID --- and define the latter to answer the yes/no
> > > question of whether a particular anonymous inode creation should be
> > > allowed. Normally, anon_inode_getfile2() would just call both hooks.
> > > We'd add another anon_inode_getfd flag, ANON_INODE_SKIP_AUTHORIZATION
> > > or something, that would tell anon_inode_getfile2() to skip calling
> > > the authorization hook, effectively making the creation always
> > > succeed. We can then make the UFFD code pass
> > > ANON_INODE_SKIP_AUTHORIZATION when it's creating a file object in the
> > > fork child while creating UFFD_EVENT_FORK messages.
> >
> > That sounds like an improvement.  Or maybe just teach SELinux that
> > this particular fd creation is actually making an anon_inode that is a
> > child of an existing anon inode and that the context should be copied
> > or whatever SELinux wants to do.  Like this, maybe:
> >
> > static int resolve_userfault_fork(struct userfaultfd_ctx *ctx,
> >   struct userfaultfd_ctx *new,
> >   struct uffd_msg *msg)
> > {
> > int fd;
> >
> > Change this:
> >
> > fd = anon_inode_getfd("[userfaultfd]", &userfaultfd_fops, new,
> >   O_RDWR | (new->flags & UFFD_SHARED_FCNTL_FLAGS));
> >
> > to something like:
> >
> >   fd = anon_inode_make_child_fd(..., ctx->inode, ...);
> >
> > where ctx->inode is the one context's inode.
> >
> > *** HOWEVER *** !!!
> >
> > Now that you've pointed this mechanism out, it is utterly and
> > completely broken and should be removed from the kernel outright or at
> > least severely restricted.  A .read implementation MUST NOT ACT ON THE
> > CALLING TASK.  Ever.  Just imagine the effect of passing a userfaultfd
> > as stdin to a setuid program.
> >
> > So I think the right solution might be to attempt to *remove*
> > UFFD_EVENT_FORK.  Maybe the solution is to say that, unless the
> > creator of a userfaultfd() has global CAP_SYS_ADMIN, then it cannot
> > use UFFD_FEATURE_EVENT_FORK, and print a warning (once) when
> > UFFD_FEATURE_EVENT_FORK is allowed.  And, after some suitable
> > deprecation period, just remove it.  If it's genuinely useful, it
> > needs an entirely new API based on ioctl() or a syscall.  Or even
> > recvmsg() :)
> >
> > And UFFD_SECURE should just become automatic, since you don't have a
> > problem any more. :-p
> >
> > --Andy
> 

Cyrill


Re: [PATCH] perf/core: fix multiplexing event scheduling issue

2019-10-23 Thread Stephane Eranian
On Mon, Oct 21, 2019 at 3:21 AM Peter Zijlstra  wrote:
>
> On Thu, Oct 17, 2019 at 05:27:46PM -0700, Stephane Eranian wrote:
> > This patch complements the following commit:
> > 7fa343b7fdc4 ("perf/core: Fix corner case in perf_rotate_context()")
> >
> > The fix from Song addresses the consequences of the problem but
> > not the cause. This patch fixes the causes and can sit on top of
> > Song's patch.
>
> I'm tempted to say the other way around.
>
> Consider the case where you claim fixed2 with a pinned event and then
> have another fixed2 in the flexible list. At that point you're _never_
> going to run any other flexible events (without Song's patch).
>
In that case, there is no deactivation or removal of events, so yes, my patch
will not help that case. I said his patch is still useful. You gave one example,
even though in this case the rotate will not yield a reschedule of that flexible
event because fixed2 is used by a pinned event. So checking for it, will not
really help.

> This patch isn't going to help with that. Similarly, Song's patch helps
> with your situation where it will allow rotation to resume after you
> disable/remove all active events (while you still have pending events).
>
Yes, it will unblock the case where active events are deactivated or
removed. But it will delay the unblocking until the next mux timer
expires. And I am saying this is too far away in many cases. For instance,
we do not run with the 1ms timer for uncore, this is way too much overhead.
Imagine this timer is set to 10ms or even 100ms: with just Song's patch, the
inactive events would have to wait for up to 100ms to be scheduled again.
This is not acceptable for us.
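
To put numbers on that delay, here is a toy model, not kernel code. It
assumes the mux timer fires at every multiple of its period and that an
inactive event is only picked up at the next expiry after the active events
go away. Both helper names are hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical model: the mux timer fires at every multiple of period_ms.
 * Returns how long an inactive event waits when it is only rescheduled by
 * the next timer expiry strictly after the removal at removed_at_ms.
 */
static unsigned long wait_for_mux_timer_ms(unsigned long removed_at_ms,
					   unsigned long period_ms)
{
	assert(period_ms > 0);
	return period_ms - (removed_at_ms % period_ms);
}

/* With a reschedule performed at removal time, the wait is zero. */
static unsigned long wait_with_immediate_resched_ms(void)
{
	return 0;
}
```

In this model a removal 30ms into a 100ms period leaves the counters idle for
70ms, and a removal right after an expiry leaves them idle for the full
period.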

> > This patch fixes a scheduling problem in the core functions of
> > perf_events. Under certain conditions, some events would not be
> > scheduled even though many counters would be available. This
> > is related to multiplexing and is architecture agnostic and
> > PMU agnostic (i.e., core or uncore).
> >
> > This problem can easily be reproduced when you have two perf
> > stat sessions. The first session does not cause multiplexing,
> > let's say it is measuring 1 event, E1. While it is measuring,
> > a second session starts and causes multiplexing. Let's say it
> > adds 6 events, B1-B6. Now, 7 events compete and are multiplexed.
> > When the second session terminates, all 6 (B1-B6) events are
> > removed. Normally, you'd expect the E1 event to continue to run
> > with no multiplexing. However, the problem is that depending on
> > the state of E1 when B1-B6 are removed, it may never be scheduled
> > again. If E1 was inactive at the time of removal, despite the
> > multiplexing hrtimer still firing, it will not find any active
> > events and will not try to reschedule. This is what Song's patch
> > fixes. It forces the multiplexing code to consider non-active events.
>
> This; so Song's patch fixes the fundamental problem of the rotation not
> working right under certain conditions.
>
> > However, the cause is not addressed. The kernel should not rely on
> > the multiplexing hrtimer to unblock inactive events. That timer
> > can have arbitrary duration in the milliseconds. Until the timer
> > fires, counters are available, but no measurable events are using
> > them. We do not want to introduce blind spots of arbitrary durations.
>
> This I disagree with -- you don't get a guarantee other than
> timer_period/n when you multiplex, and idling the counters until the
> next tick doesn't violate that at all.

My take is that if you have free counters and "idling" events, the kernel
should take every effort to schedule them as soon as they become available.
In the situation I described in the patch, once I remove the active events,
there is no more reason for multiplexing; all the counters are free (ignoring
the watchdog).
Now you may be arguing that it may take more time to ctx_resched() than to
wait for the timer to expire. But I am not sure I buy that. Similarly, I am
not sure there is code to cancel an active mux hrtimer when we clear
rotate_necessary. Maybe we just let it lapse and clear itself via a
ctx_sched_out() in the rotation code.

>
> > This patch addresses the cause of the problem, by checking that,
> > when an event is disabled or removed and the context was multiplexing
> > events, inactive events immediately get a chance to be scheduled by
> > calling ctx_resched(). The rescheduling is done on events of equal
> > or lower priority types.  With that in place, as soon as a counter
> > is freed, schedulable inactive events may run, thereby eliminating
> > a blind spot.
>
> Disagreed, Song's patch removed the fundamental blind spot of rotation
> completely failing.
Sure, it removed the infinite blocking of schedulable events. My patch addresses
the issue of having free counters following a deactivation/removal and not
scheduling the idling events on them, thereby creating a blind spot where no
event is monitoring.

>
> This just slightly optimizes 

[GIT PULL] EDAC fix for 5.4

2019-10-23 Thread Borislav Petkov
Hi Linus,

please pull,
thx.

---
The following changes since commit 54ecb8f7028c5eb3d740bb82b0f1d90f2df63c5c:

  Linux 5.4-rc1 (2019-09-30 10:35:40 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras.git tags/edac_urgent_for_5.4

for you to fetch changes up to 1e72e673b9d102ff2e8333e74b3308d012ddf75b:

  EDAC/ghes: Fix Use after free in ghes_edac remove path (2019-10-17 11:27:05 +0200)


Fix ghes_edac UAF case triggered by KASAN and DEBUG_TEST_DRIVER_REMOVE.

Future pending rework of the ghes_edac instances registration will do
away with the single memory controller per system model and that ugly
hackery there.

This is a minimal fix for stable@, courtesy of James Morse.


James Morse (1):
  EDAC/ghes: Fix Use after free in ghes_edac remove path

 drivers/edac/ghes_edac.c | 4 
 1 file changed, 4 insertions(+)

-- 
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg


[PATCH] ASoC: fsl_esai: Add spin lock to protect reset and stop

2019-10-23 Thread Shengjiu Wang
An xrun may happen at the end of a stream, in which case
trigger->fsl_esai_trigger_stop may be called in the middle of
fsl_esai_hw_reset. This may leave the ESAI in a wrong state
after stop, and there may be endless xrun interrupts.
So add a spin lock to serialize these two functions.

Fixes: 7ccafa2b3879 ("ASoC: fsl_esai: recover the channel swap after xrun")
Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_esai.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index 37b14c48b537..6a797648b66d 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -33,6 +33,7 @@
  * @fsysclk: system clock source to derive HCK, SCK and FS
  * @spbaclk: SPBA clock (optional, depending on SoC design)
  * @task: tasklet to handle the reset operation
+ * @lock: spin lock to handle reset and stop behavior
  * @fifo_depth: depth of tx/rx FIFO
  * @slot_width: width of each DAI slot
  * @slots: number of slots
@@ -56,6 +57,7 @@ struct fsl_esai {
struct clk *fsysclk;
struct clk *spbaclk;
struct tasklet_struct task;
+   spinlock_t lock; /* Protect reset and stop */
u32 fifo_depth;
u32 slot_width;
u32 slots;
@@ -676,8 +678,10 @@ static void fsl_esai_hw_reset(unsigned long arg)
 {
struct fsl_esai *esai_priv = (struct fsl_esai *)arg;
bool tx = true, rx = false, enabled[2];
+   unsigned long lock_flags;
u32 tfcr, rfcr;
 
+   spin_lock_irqsave(&esai_priv->lock, lock_flags);
/* Save the registers */
regmap_read(esai_priv->regmap, REG_ESAI_TFCR, &tfcr);
regmap_read(esai_priv->regmap, REG_ESAI_RFCR, &rfcr);
@@ -715,6 +719,8 @@ static void fsl_esai_hw_reset(unsigned long arg)
fsl_esai_trigger_start(esai_priv, tx);
if (enabled[rx])
fsl_esai_trigger_start(esai_priv, rx);
+
+   spin_unlock_irqrestore(&esai_priv->lock, lock_flags);
 }
 
 static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
@@ -722,6 +728,7 @@ static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
 {
struct fsl_esai *esai_priv = snd_soc_dai_get_drvdata(dai);
bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
+   unsigned long lock_flags;
 
esai_priv->channels[tx] = substream->runtime->channels;
 
@@ -734,7 +741,9 @@ static int fsl_esai_trigger(struct snd_pcm_substream *substream, int cmd,
case SNDRV_PCM_TRIGGER_SUSPEND:
case SNDRV_PCM_TRIGGER_STOP:
case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
+   spin_lock_irqsave(&esai_priv->lock, lock_flags);
fsl_esai_trigger_stop(esai_priv, tx);
+   spin_unlock_irqrestore(&esai_priv->lock, lock_flags);
break;
default:
return -EINVAL;
@@ -1002,6 +1011,7 @@ static int fsl_esai_probe(struct platform_device *pdev)
 
dev_set_drvdata(&pdev->dev, esai_priv);
 
+   spin_lock_init(&esai_priv->lock);
ret = fsl_esai_hw_init(esai_priv);
if (ret)
return ret;
-- 
2.21.0



Re: [PATCH v2] clocksource/drivers: Fix memory leak in ttc_setup_clockevent

2019-10-23 Thread Michal Simek
On 23. 10. 19 6:31, Navid Emamdoost wrote:
> In the implementation of ttc_setup_clockevent() release the allocated
> memory for ttcce if clk_notifier_register() fails.
> 
> Fixes: 70504f311d4b ("clocksource/drivers/cadence_ttc: Convert init function to return error")
> Signed-off-by: Navid Emamdoost 
> ---
> Changes in v2:
>   - Added goto label for error handling
>   - Update description and fix typo
> 
>  drivers/clocksource/timer-cadence-ttc.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/clocksource/timer-cadence-ttc.c b/drivers/clocksource/timer-cadence-ttc.c
> index 88fe2e9ba9a3..0caacbc67102 100644
> --- a/drivers/clocksource/timer-cadence-ttc.c
> +++ b/drivers/clocksource/timer-cadence-ttc.c
> @@ -411,10 +411,8 @@ static int __init ttc_setup_clockevent(struct clk *clk,
>   ttcce->ttc.clk = clk;
>  
>   err = clk_prepare_enable(ttcce->ttc.clk);
> - if (err) {
> - kfree(ttcce);
> - return err;
> - }
> + if (err)
> + goto release_ttcce;
>  
>   ttcce->ttc.clk_rate_change_nb.notifier_call =
>   ttc_rate_change_clockevent_cb;
> @@ -424,7 +422,7 @@ static int __init ttc_setup_clockevent(struct clk *clk,
>   &ttcce->ttc.clk_rate_change_nb);
>   if (err) {
>   pr_warn("Unable to register clock notifier.\n");
> - return err;
> + goto release_ttcce;
>   }
>  
>   ttcce->ttc.freq = clk_get_rate(ttcce->ttc.clk);
> @@ -453,15 +451,18 @@ static int __init ttc_setup_clockevent(struct clk *clk,
>  
>   err = request_irq(irq, ttc_clock_event_interrupt,
> IRQF_TIMER, ttcce->ce.name, ttcce);
> - if (err) {
> - kfree(ttcce);
> - return err;
> - }
> + if (err)
> + goto release_ttcce;
>  
>   clockevents_config_and_register(&ttcce->ce,
>   ttcce->ttc.freq / PRESCALE, 1, 0xfffe);
>  
>   return 0;
> +
> +release_ttcce:
> +
> + kfree(ttcce);
> + return err;
>  }
>  
>  /**
> 

Acked-by: Michal Simek 

Thanks,
Michal
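
The single-exit error handling adopted in v2 can be illustrated outside the
kernel. A minimal user-space sketch follows. The names struct ttc_ce,
setup_clockevent, release and the fail_step parameter are hypothetical
stand-ins for the driver's allocation and its three failure points; they are
not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct ttc_ce { int freq; };	/* hypothetical stand-in for the driver state */

static int free_calls;		/* counts releases so the sketch can be checked */

static void release(struct ttc_ce *ce)
{
	free(ce);
	free_calls++;
}

/*
 * fail_step selects which of the three setup steps fails (0 = none),
 * standing in for clk_prepare_enable(), clk_notifier_register() and
 * request_irq() in the quoted patch.
 */
static int setup_clockevent(int fail_step, struct ttc_ce **out)
{
	struct ttc_ce *ce;
	int err;

	assert(out != NULL);
	*out = NULL;

	ce = calloc(1, sizeof(*ce));
	if (!ce)
		return -ENOMEM;

	err = (fail_step == 1) ? -EINVAL : 0;
	if (err)
		goto release_ce;

	err = (fail_step == 2) ? -EINVAL : 0;
	if (err)
		goto release_ce;

	err = (fail_step == 3) ? -EINVAL : 0;
	if (err)
		goto release_ce;

	ce->freq = 1;
	*out = ce;		/* success: ownership passes to the caller */
	return 0;

release_ce:
	/* Every failure path funnels through one release. */
	release(ce);
	return err;
}
```

Because every failure jumps to the same label, adding a fourth setup step
cannot reintroduce the leak as long as it uses the same goto.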



Re: [PATCH 1/3] phy: cadence: Sierra: add phy_reset hook

2019-10-23 Thread Kishon Vijay Abraham I
Roger,

On 22/10/19 6:52 PM, Roger Quadros wrote:
> This is required if type C driver needs to hold
> global reset on J7ES to perform LN10 swap.

Can you replace "This" with something more specific?

Thanks
Kishon
> 
> Signed-off-by: Roger Quadros 
> Signed-off-by: Sekhar Nori 
> ---
>  drivers/phy/cadence/phy-cadence-sierra.c | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/phy/cadence/phy-cadence-sierra.c b/drivers/phy/cadence/phy-cadence-sierra.c
> index affede8c4368..e6d27bdec22a 100644
> --- a/drivers/phy/cadence/phy-cadence-sierra.c
> +++ b/drivers/phy/cadence/phy-cadence-sierra.c
> @@ -339,10 +339,20 @@ static int cdns_sierra_phy_off(struct phy *gphy)
>   return reset_control_assert(ins->lnk_rst);
>  }
>  
> +static int cdns_sierra_phy_reset(struct phy *gphy)
> +{
> + struct cdns_sierra_phy *sp = dev_get_drvdata(gphy->dev.parent);
> +
> + reset_control_assert(sp->phy_rst);
> + reset_control_deassert(sp->phy_rst);
> + return 0;
> +};
> +
>  static const struct phy_ops ops = {
>   .init   = cdns_sierra_phy_init,
>   .power_on   = cdns_sierra_phy_on,
>   .power_off  = cdns_sierra_phy_off,
> + .reset  = cdns_sierra_phy_reset,
>   .owner  = THIS_MODULE,
>  };
>  
> 


[PATCH net-next] net: lan78xx: remove set but not used variable 'event'

2019-10-23 Thread YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/usb/lan78xx.c:3995:6: warning:
 variable event set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Signed-off-by: YueHaibing 
---
 drivers/net/usb/lan78xx.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 6294809..f8c0818 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -3992,9 +3992,6 @@ static int lan78xx_suspend(struct usb_interface *intf, pm_message_t message)
struct lan78xx_priv *pdata = (struct lan78xx_priv *)(dev->data[0]);
u32 buf;
int ret;
-   int event;
-
-   event = message.event;
 
if (!dev->suspend_count++) {
spin_lock_irq(&dev->txq.lock);
-- 
2.7.4




[PATCH] adm80211: remove set but not used variables 'mem_addr' and 'io_addr'

2019-10-23 Thread YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/wireless/admtek/adm8211.c:1784:16:
 warning: variable mem_addr set but not used [-Wunused-but-set-variable]
drivers/net/wireless/admtek/adm8211.c:1785:15:
 warning: variable io_addr set but not used [-Wunused-but-set-variable]

They are never used, so can be removed.

Signed-off-by: YueHaibing 
---
 drivers/net/wireless/admtek/adm8211.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/admtek/adm8211.c b/drivers/net/wireless/admtek/adm8211.c
index 46f1427..ba326f6 100644
--- a/drivers/net/wireless/admtek/adm8211.c
+++ b/drivers/net/wireless/admtek/adm8211.c
@@ -1781,8 +1781,8 @@ static int adm8211_probe(struct pci_dev *pdev,
 {
struct ieee80211_hw *dev;
struct adm8211_priv *priv;
-   unsigned long mem_addr, mem_len;
-   unsigned int io_addr, io_len;
+   unsigned long mem_len;
+   unsigned int io_len;
int err;
u32 reg;
u8 perm_addr[ETH_ALEN];
@@ -1794,9 +1794,7 @@ static int adm8211_probe(struct pci_dev *pdev,
return err;
}
 
-   io_addr = pci_resource_start(pdev, 0);
io_len = pci_resource_len(pdev, 0);
-   mem_addr = pci_resource_start(pdev, 1);
mem_len = pci_resource_len(pdev, 1);
if (io_len < 256 || mem_len < 1024) {
printk(KERN_ERR "%s (adm8211): Too short PCI resources\n",
-- 
2.7.4




[PATCH] uprobes/x86: fix arch_uprobe_analyze_insn() comment

2019-10-23 Thread Yi Wang
Fix parameter name in comment and adjust the order.
No functional change.

Signed-off-by: Yi Wang 
---
 arch/x86/kernel/uprobes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 8cd745e..15e5aad 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -842,8 +842,8 @@ static int push_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 
 /**
  * arch_uprobe_analyze_insn - instruction analysis including validity and fixups.
+ * @auprobe: the probepoint information.
  * @mm: the probed address space.
- * @arch_uprobe: the probepoint information.
  * @addr: virtual address at which to install the probepoint
  * Return 0 on success or a -ve number on error.
  */
-- 
1.8.3.1



[PATCH] atmel: remove set but not used variable 'dev'

2019-10-23 Thread YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/wireless/atmel/atmel_cs.c:120:21:
 warning: variable dev set but not used [-Wunused-but-set-variable]

It is never used, so it can be removed.

Signed-off-by: YueHaibing 
---
 drivers/net/wireless/atmel/atmel_cs.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/wireless/atmel/atmel_cs.c b/drivers/net/wireless/atmel/atmel_cs.c
index 7afc9c5..368eebe 100644
--- a/drivers/net/wireless/atmel/atmel_cs.c
+++ b/drivers/net/wireless/atmel/atmel_cs.c
@@ -117,11 +117,9 @@ static int atmel_config_check(struct pcmcia_device *p_dev, void *priv_data)
 
 static int atmel_config(struct pcmcia_device *link)
 {
-   struct local_info *dev;
int ret;
const struct pcmcia_device_id *did;
 
-   dev = link->priv;
did = dev_get_drvdata(&link->dev);
 
dev_dbg(&link->dev, "atmel_config\n");
-- 
2.7.4




[PATCH -next] phy: ti: dm816x: remove set but not used variable 'phy_data'

2019-10-23 Thread YueHaibing
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/phy/ti/phy-dm816x-usb.c:192:29: warning:
 variable phy_data set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Signed-off-by: YueHaibing 
---
 drivers/phy/ti/phy-dm816x-usb.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/phy/ti/phy-dm816x-usb.c b/drivers/phy/ti/phy-dm816x-usb.c
index cbcce7c..26f1947 100644
--- a/drivers/phy/ti/phy-dm816x-usb.c
+++ b/drivers/phy/ti/phy-dm816x-usb.c
@@ -189,7 +189,6 @@ static int dm816x_usb_phy_probe(struct platform_device *pdev)
struct phy_provider *phy_provider;
struct usb_otg *otg;
const struct of_device_id *of_id;
-   const struct usb_phy_data *phy_data;
int error;
 
of_id = of_match_device(of_match_ptr(dm816x_usb_phy_id_table),
@@ -220,8 +219,6 @@ static int dm816x_usb_phy_probe(struct platform_device *pdev)
if (phy->usbphy_ctrl == 0x2c)
phy->instance = 1;
 
-   phy_data = of_id->data;
-
otg = devm_kzalloc(&pdev->dev, sizeof(*otg), GFP_KERNEL);
if (!otg)
return -ENOMEM;
-- 
2.7.4




Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

2019-10-23 Thread Daniel Wagner
> > Wouldn't handle_nested_irq() work here instead of the simple thingy?
> 
> Daniel could you try this suggestion? Would it work?


[6.427289] [ cut here ]
[6.431977] kernel BUG at drivers/net/phy/mdio_bus.c:626!
[6.437453] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[6.443013] Modules linked in:
[6.446116] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc3-00100-g70cc5ab156c3-dirty #50
[6.454763] Hardware name: Raspberry Pi 3 Model B+ (DT)
[6.460062] pstate: 0005 (nzcv daif -PAN -UAO)
[6.464928] pc : mdiobus_read+0x68/0x70
[6.468821] lr : lan88xx_phy_ack_interrupt+0x1c/0x30
[6.473852] sp : 800010003d70
[6.477208] x29: 800010003d70 x28:  
[6.482597] x27: 0001 x26: 0060 
[6.487985] x25: 00e0 x24: 8000113ca680 
[6.493373] x23: 372ac174 x22: 3768f6d4 
[6.498762] x21: 3768f600 x20:  
[6.504149] x19: 372c1000 x18: 000e 
[6.509537] x17: 0001 x16: 0007 
[6.514923] x15: 000e x14: 0013 
[6.520311] x13:  x12:  
[6.525699] x11: 057b x10: 0003 
[6.531087] x9 : 383f4750 x8 : 383f3dc0 
[6.536474] x7 : 374be100 x6 : 383f4750 
[6.541862] x5 : 37cab700 x4 : 0008 
[6.547249] x3 : 0101 x2 : 001a 
[6.552636] x1 : 0001 x0 : 372c0800 
[6.558023] Call trace:
[6.560506]  mdiobus_read+0x68/0x70
[6.564045]  phy_interrupt+0x5c/0xb0
[6.567672]  handle_nested_irq+0xb8/0x130
[6.571740]  intr_complete+0xb0/0xe0
[6.575368]  __usb_hcd_giveback_urb+0x58/0xf8
[6.579787]  usb_giveback_urb_bh+0xac/0x108
[6.584032]  tasklet_action_common.isra.0+0x154/0x1a0
[6.589155]  tasklet_hi_action+0x24/0x30
[6.593134]  __do_softirq+0x120/0x23c
[6.596847]  irq_exit+0xb8/0xd8
[6.600031]  __handle_domain_irq+0x64/0xb8
[6.604183]  bcm2836_arm_irqchip_handle_irq+0x60/0xc0
[6.609305]  el1_irq+0xb8/0x180
[6.612490]  arch_cpu_idle+0x10/0x18
[6.616115]  do_idle+0x200/0x280
[6.619387]  cpu_startup_entry+0x20/0x40
[6.623366]  rest_init+0xd4/0xe0
[6.626642]  arch_call_rest_init+0xc/0x14
[6.630706]  start_kernel+0x420/0x44c
[6.634425] Code: a94153f3 a9425bf5 a8c37bfd d65f03c0 (d421) 
[6.640614] ---[ end trace 97efd7bf12ed0c65 ]---
[6.645295] Kernel panic - not syncing: Fatal exception in interrupt
[6.651740] SMP: stopping secondary CPUs
[6.655719] Kernel Offset: disabled
[6.659254] CPU features: 0x0002,24002004
[6.663315] Memory Limit: none
[6.666416] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---


Not really: handle_nested_irq() ends up in mdiobus_read() from softirq
context and hits the BUG above. It turns out phy_interrupt() wants to be run
in a threaded context. As Sebastian says on IRC: "but this means, that it was
broken from the beginning".


[PATCH -next] staging: comedi: dt2814: remove set but not used variables 'data'

2019-10-23 Thread YueHaibing
drivers/staging/comedi/drivers/dt2814.c:193:6:
 warning: variable data set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Signed-off-by: YueHaibing 
---
 drivers/staging/comedi/drivers/dt2814.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/staging/comedi/drivers/dt2814.c b/drivers/staging/comedi/drivers/dt2814.c
index d2c7157..e7c6787 100644
--- a/drivers/staging/comedi/drivers/dt2814.c
+++ b/drivers/staging/comedi/drivers/dt2814.c
@@ -186,21 +186,17 @@ static int dt2814_ai_cmd(struct comedi_device *dev, struct comedi_subdevice *s)
 
 static irqreturn_t dt2814_interrupt(int irq, void *d)
 {
-   int lo, hi;
struct comedi_device *dev = d;
struct dt2814_private *devpriv = dev->private;
struct comedi_subdevice *s = dev->read_subdev;
-   int data;
 
if (!dev->attached) {
dev_err(dev->class_dev, "spurious interrupt\n");
return IRQ_HANDLED;
}
 
-   hi = inb(dev->iobase + DT2814_DATA);
-   lo = inb(dev->iobase + DT2814_DATA);
-
-   data = (hi << 4) | (lo >> 4);
+   inb(dev->iobase + DT2814_DATA);
+   inb(dev->iobase + DT2814_DATA);
 
if (!(--devpriv->ntrig)) {
int i;
-- 
2.7.4




[PATCH v3] ACPI/processor_idle: Remove dummy wait if kernel is in guest mode

2019-10-23 Thread Yin Fengwei
In acpi_idle_do_entry(), an I/O port access is used as a dummy wait to
guarantee hardware behavior. But it can trigger an unnecessary VM exit
if the kernel is running as a guest in a virtualization environment.

In a virtualization environment, the operation that enters the deeper
C state (inb()) already traps to the hypervisor, so the dummy wait
after the inb() call is not needed. Remove the dummy I/O port access
in that case to avoid the unnecessary VM exit.

Keep the dummy I/O port access to maintain timing in a native environment.

Signed-off-by: Yin Fengwei 
---
ChangeLog:
v2 -> v3:
 - Remove dummy io port access totally for virtualization env.

v1 -> v2:
 - Use ndelay instead of dead loop for dummy delay.

 drivers/acpi/processor_idle.c | 36 ---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index ed56c6d20b08..0c4a97dd6917 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -58,6 +58,17 @@ struct cpuidle_driver acpi_idle_driver = {
 static
 DEFINE_PER_CPU(struct acpi_processor_cx * [CPUIDLE_STATE_MAX], acpi_cstate);
 
+static void (*dummy_wait)(u64 address);
+
+static void default_dummy_wait(u64 address)
+{
+   inl(address);
+}
+
+static void default_noop_wait(u64 address)
+{
+}
+
 static int disabled_by_idle_boot_param(void)
 {
return boot_option_idle_override == IDLE_POLL ||
@@ -660,8 +671,13 @@ static void __cpuidle acpi_idle_do_entry(struct acpi_processor_cx *cx)
inb(cx->address);
/* Dummy wait op - must do something useless after P_LVL2 read
   because chipsets cannot guarantee that STPCLK# signal
-  gets asserted in time to freeze execution properly. */
-   inl(acpi_gbl_FADT.xpm_timer_block.address);
+  gets asserted in time to freeze execution properly.
+
+  This dummy wait is only needed for native env. If we are running
+  as guest of a hypervisor, we don't need wait op here. We have
+  different implementation for dummy_wait on native/virtual env. */
+
+   dummy_wait(acpi_gbl_FADT.xpm_timer_block.address);
}
 }
 
@@ -683,7 +699,7 @@ static int acpi_idle_play_dead(struct cpuidle_device *dev, int index)
else if (cx->entry_method == ACPI_CSTATE_SYSTEMIO) {
inb(cx->address);
/* See comment in acpi_idle_do_entry() */
-   inl(acpi_gbl_FADT.xpm_timer_block.address);
+   dummy_wait(acpi_gbl_FADT.xpm_timer_block.address);
} else
return -ENODEV;
}
@@ -912,6 +928,20 @@ static inline void acpi_processor_cstate_first_run_checks(void)
  max_cstate);
first_run++;
 
+   dummy_wait = default_dummy_wait;
+
+#ifdef CONFIG_X86
+   /* For x86, if we are running in guest, we don't need extra
+* access ioport as dummy wait.
+*/
+   if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+   pr_err("We are in virtual env");
+   dummy_wait = default_noop_wait;
+   } else {
+   pr_err("We are not in virtual env");
+   }
+#endif
+
if (acpi_gbl_FADT.cst_control && !nocst) {
status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
acpi_gbl_FADT.cst_control, 8);
-- 
2.19.1
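
The callback selection in the processor_idle patch above can be sketched in
plain userspace C. This is only an illustration of the pattern, not the kernel
code: port_reads, port_read_wait(), noop_wait() and select_dummy_wait() are
hypothetical names, the counter stands in for the real inl() port read, and
the running_as_guest flag stands in for the boot_cpu_has(X86_FEATURE_HYPERVISOR)
check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counter standing in for the real I/O port read. */
static unsigned int port_reads;

typedef void (*wait_fn)(uint64_t address);

/* Native case: do the extra port access (the patch calls inl(address)). */
static void port_read_wait(uint64_t address)
{
	(void)address;
	port_reads++;
}

/* Guest case: do nothing, the preceding inb() already trapped. */
static void noop_wait(uint64_t address)
{
	(void)address;
}

/*
 * Pick the callback once at init time. running_as_guest is an assumed
 * stand-in for the hypervisor feature check done in the patch.
 */
static wait_fn select_dummy_wait(bool running_as_guest)
{
	return running_as_guest ? noop_wait : port_read_wait;
}
```

Choosing the callback once keeps the idle entry path free of a per-call
feature check; whether that matters in practice would have to be measured.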



Re: [PATCH] sched/fair: fix rework of find_idlest_group()

2019-10-23 Thread Chen, Rong A

Tested-by: kernel test robot 

On 10/23/2019 12:46 AM, Vincent Guittot wrote:

The task, for which the scheduler looks for the idlest group of CPUs, must
be discounted from all statistics in order to get a fair comparison
between groups. This includes utilization, load, nr_running and idle_cpus.

Such unfairness can be easily highlighted with the unixbench execl 1 task.
This test continuously calls execve() and the scheduler looks for the idlest
group/CPU on which it should place the task. Because the task runs on the
local group/CPU, the latter seems already busy even if there is nothing
else running on it. As a result, the scheduler will always select a
group/CPU other than the local one.

Fixes: 57abff067a08 ("sched/fair: Rework find_idlest_group()")
Reported-by: kernel test robot 
Signed-off-by: Vincent Guittot 
---

This recovers most of the perf regression on my system, and I have asked
Rong if he can rerun the test with the patch to check that it fixes his
system as well.

  kernel/sched/fair.c | 90 -
  1 file changed, 83 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a81c364..0ad4b21 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5379,6 +5379,36 @@ static unsigned long cpu_load(struct rq *rq)
  {
return cfs_rq_load_avg(&rq->cfs);
  }
+/*
+ * cpu_load_without - compute cpu load without any contributions from *p
+ * @cpu: the CPU which load is requested
+ * @p: the task which load should be discounted
+ *
+ * The load of a CPU is defined by the load of tasks currently enqueued on that
+ * CPU as well as tasks which are currently sleeping after an execution on that
+ * CPU.
+ *
+ * This method returns the load of the specified CPU by discounting the load of
+ * the specified task, whenever the task is currently contributing to the CPU
+ * load.
+ */
+static unsigned long cpu_load_without(struct rq *rq, struct task_struct *p)
+{
+   struct cfs_rq *cfs_rq;
+   unsigned int load;
+
+   /* Task has no contribution or is new */
+   if (cpu_of(rq) != task_cpu(p) || !READ_ONCE(p->se.avg.last_update_time))
+   return cpu_load(rq);
+
+   cfs_rq = &rq->cfs;
+   load = READ_ONCE(cfs_rq->avg.load_avg);
+
+   /* Discount task's load from CPU's load */
+   lsub_positive(&load, task_h_load(p));
+
+   return load;
+}
  
  static unsigned long capacity_of(int cpu)

  {
@@ -8117,10 +8147,55 @@ static inline enum fbq_type fbq_classify_rq(struct rq *rq)
  struct sg_lb_stats;
  
  /*

+ * task_running_on_cpu - return 1 if @p is running on @cpu.
+ */
+
+static unsigned int task_running_on_cpu(int cpu, struct task_struct *p)
+{
+   /* Task has no contribution or is new */
+   if (cpu != task_cpu(p) || !READ_ONCE(p->se.avg.last_update_time))
+   return 0;
+
+   if (task_on_rq_queued(p))
+   return 1;
+
+   return 0;
+}
+
+/**
+ * idle_cpu_without - would a given CPU be idle without p ?
+ * @cpu: the processor on which idleness is tested.
+ * @p: task which should be ignored.
+ *
+ * Return: 1 if the CPU would be idle. 0 otherwise.
+ */
+static int idle_cpu_without(int cpu, struct task_struct *p)
+{
+   struct rq *rq = cpu_rq(cpu);
+
+   if ((rq->curr != rq->idle) && (rq->curr != p))
+   return 0;
+
+   /*
+* rq->nr_running can't be used but an updated version without the
+* impact of p on cpu must be used instead. The updated nr_running
+* must be computed and tested before calling idle_cpu_without().
+*/
+
+#ifdef CONFIG_SMP
+   if (!llist_empty(&rq->wake_list))
+   return 0;
+#endif
+
+   return 1;
+}
+
+/*
   * update_sg_wakeup_stats - Update sched_group's statistics for wakeup.
- * @denv: The ched_domain level to look for idlest group.
+ * @sd: The sched_domain level to look for idlest group.
   * @group: sched_group whose statistics are to be updated.
   * @sgs: variable to hold the statistics for this group.
+ * @p: The task for which we look for the idlest group/CPU.
   */
  static inline void update_sg_wakeup_stats(struct sched_domain *sd,
  struct sched_group *group,
@@ -8133,21 +8208,22 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
  
  	for_each_cpu(i, sched_group_span(group)) {

struct rq *rq = cpu_rq(i);
+   unsigned int local;
  
-		sgs->group_load += cpu_load(rq);

+   sgs->group_load += cpu_load_without(rq, p);
sgs->group_util += cpu_util_without(i, p);
-   sgs->sum_h_nr_running += rq->cfs.h_nr_running;
+   local = task_running_on_cpu(i, p);
+   sgs->sum_h_nr_running += rq->cfs.h_nr_running - local;
  
-		nr_running = rq->nr_running;

+   nr_running = rq->nr_running - local;
sgs->sum_nr_running += nr_running;
  
  		/*

-* No need to call
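
The discounting that cpu_load_without() and task_running_on_cpu() perform in
the patch above can be sketched in userspace C. This is a simplified
illustration under stated assumptions: struct cpu_stats, struct task_stats,
sub_positive(), load_without() and nr_running_without() are hypothetical
stand-ins, and the last_update_time check for new tasks is left out.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for struct rq and struct task_struct. */
struct cpu_stats {
	int cpu;			/* CPU number */
	unsigned long load;		/* total load currently seen on this CPU */
	unsigned int nr_running;	/* runnable tasks on this CPU */
};

struct task_stats {
	int cpu;			/* CPU the task last ran on */
	unsigned long load;		/* the task's own contribution */
	int queued;			/* non-zero if the task is on the runqueue */
};

/* Subtract val from *ptr but never go below zero (the idea behind lsub_positive()). */
static void sub_positive(unsigned long *ptr, unsigned long val)
{
	*ptr -= (*ptr < val) ? *ptr : val;
}

/* Load of the CPU as it would look without the task. */
static unsigned long load_without(const struct cpu_stats *c,
				  const struct task_stats *t)
{
	unsigned long load = c->load;

	if (c->cpu != t->cpu)
		return load;	/* the task contributes nothing here */

	sub_positive(&load, t->load);
	return load;
}

/* Runnable count of the CPU as it would look without the task. */
static unsigned int nr_running_without(const struct cpu_stats *c,
				       const struct task_stats *t)
{
	unsigned int local = (c->cpu == t->cpu && t->queued) ? 1 : 0;

	assert(c->nr_running >= local);	/* a queued task is counted in nr_running */
	return c->nr_running - local;
}
```

With a CPU whose only load comes from the task itself, both helpers report an
empty CPU, which is the fair comparison the commit message asks for.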

[PATCH -next] staging: comedi: remove unused variable 'route_table_size'

2019-10-23 Thread YueHaibing
drivers/staging/comedi/drivers/ni_routes.c:52:21: warning:
 route_table_size defined but not used [-Wunused-const-variable=]

It is never used since introduction.

Signed-off-by: YueHaibing 
---
 drivers/staging/comedi/drivers/ni_routes.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/staging/comedi/drivers/ni_routes.c b/drivers/staging/comedi/drivers/ni_routes.c
index eb61494..673d732 100644
--- a/drivers/staging/comedi/drivers/ni_routes.c
+++ b/drivers/staging/comedi/drivers/ni_routes.c
@@ -49,8 +49,6 @@
 /* Helper for accessing data. */
 #define RVi(table, src, dest)  ((table)[(dest) * NI_NUM_NAMES + (src)])
 
-static const size_t route_table_size = NI_NUM_NAMES * NI_NUM_NAMES;
-
 /*
  * Find the proper route_values and ni_device_routes tables for this particular
  * device.
-- 
2.7.4




Re: [PATCH v2 2/2] PCI: pciehp: Prevent deadlock on disconnect

2019-10-23 Thread Mika Westerberg
On Tue, Oct 22, 2019 at 06:00:06PM -0500, Bjorn Helgaas wrote:
> On Mon, Aug 12, 2019 at 05:31:33PM +0300, Mika Westerberg wrote:
> > If there are more than one PCIe switch with hotplug downstream ports
> > hot-removing them leads to a following deadlock:
> > 
> >  INFO: task irq/126-pciehp:198 blocked for more than 120 seconds.
> >  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >  irq/126-pciehp  D0   198  2 0x8000
> >  Call Trace:
> >   __schedule+0x2a2/0x880
> >   schedule+0x2c/0x80
> >   schedule_timeout+0x246/0x350
> >   ? ttwu_do_activate+0x67/0x90
> >   wait_for_completion+0xb7/0x140
> >   ? wake_up_q+0x80/0x80
> >   kthread_stop+0x49/0x110
> >   __free_irq+0x15c/0x2a0
> >   free_irq+0x32/0x70
> >   pcie_shutdown_notification+0x2f/0x50
> >   pciehp_remove+0x27/0x50
> >   pcie_port_remove_service+0x36/0x50
> >   device_release_driver_internal+0x18c/0x250
> >   device_release_driver+0x12/0x20
> >   bus_remove_device+0xec/0x160
> >   device_del+0x13b/0x350
> >   ? pcie_port_find_device+0x60/0x60
> >   device_unregister+0x1a/0x60
> >   remove_iter+0x1e/0x30
> >   device_for_each_child+0x56/0x90
> >   pcie_port_device_remove+0x22/0x40
> >   pcie_portdrv_remove+0x20/0x60
> >   pci_device_remove+0x3e/0xc0
> >   device_release_driver_internal+0x18c/0x250
> >   device_release_driver+0x12/0x20
> >   pci_stop_bus_device+0x6f/0x90
> >   pci_stop_bus_device+0x31/0x90
> >   pci_stop_and_remove_bus_device+0x12/0x20
> >   pciehp_unconfigure_device+0x88/0x140
> >   pciehp_disable_slot+0x6a/0x110
> >   pciehp_handle_presence_or_link_change+0x263/0x400
> >   pciehp_ist+0x1c9/0x1d0
> >   ? irq_forced_thread_fn+0x80/0x80
> >   irq_thread_fn+0x24/0x60
> >   irq_thread+0xeb/0x190
> >   ? irq_thread_fn+0x60/0x60
> >   kthread+0x120/0x140
> >   ? irq_thread_check_affinity+0xf0/0xf0
> >   ? kthread_park+0x90/0x90
> >   ret_from_fork+0x35/0x40
> >  INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds.
> >  irq/190-pciehp  D0  2288  2 0x8000
> >  Call Trace:
> >   __schedule+0x2a2/0x880
> >   schedule+0x2c/0x80
> >   schedule_preempt_disabled+0xe/0x10
> >   __mutex_lock.isra.9+0x2e0/0x4d0
> >   ? __mutex_lock_slowpath+0x13/0x20
> >   __mutex_lock_slowpath+0x13/0x20
> >   mutex_lock+0x2c/0x30
> >   pci_lock_rescan_remove+0x15/0x20
> >   pciehp_unconfigure_device+0x4d/0x140
> >   pciehp_disable_slot+0x6a/0x110
> >   pciehp_handle_presence_or_link_change+0x263/0x400
> >   pciehp_ist+0x1c9/0x1d0
> >   ? irq_forced_thread_fn+0x80/0x80
> >   irq_thread_fn+0x24/0x60
> >   irq_thread+0xeb/0x190
> >   ? irq_thread_fn+0x60/0x60
> >   kthread+0x120/0x140
> >   ? irq_thread_check_affinity+0xf0/0xf0
> >   ? kthread_park+0x90/0x90
> >   ret_from_fork+0x35/0x40
> > 
> > What happens here is that the whole hierarchy is runtime resumed and the
> > parent PCIe downstream port, who got the hot-remove event, starts
> > removing devices below it taking pci_lock_rescan_remove() lock. When the
> > child PCIe port is runtime resumed it calls pciehp_check_presence()
> > which ends up calling pciehp_card_present() and pciehp_check_link_active().
> > Both of these read their parts of PCIe config space by calling helper
> > function pcie_capability_read_word(). Now, this function notices that
> > the underlying device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND
> > with the capability value set to 0. When pciehp gets this value it
> > thinks that its child device is also hot-removed and schedules its IRQ
> > thread to handle the event.
> 
> I can't remember if there was a reason why 8c0d3a02c130 ("PCI: Add
> accessors for PCI Express Capability") reset *val to 0 if
> pci_read_config_word() fails.  It doesn't seem like the right thing;
> it seems like it would be better for it to be consistent with a plain
> pci_read_config_word().
> 
> > The deadlock happens when the child's IRQ thread runs and tries to
> > acquire pci_lock_rescan_remove() which is already taken by the parent
> > and the parent waits for the child's IRQ thread to finish.
> > 
> > We can prevent this from happening by checking the return value of
> > pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
> > performing any hot-removal activities.
> > 
> > Signed-off-by: Mika Westerberg 
> > ---
> > No changes from the previous version.
> > 
> >  drivers/pci/hotplug/pciehp.h  |  6 +++---
> >  drivers/pci/hotplug/pciehp_core.c | 11 ---
> >  drivers/pci/hotplug/pciehp_ctrl.c |  4 ++--
> >  drivers/pci/hotplug/pciehp_hpc.c  | 32 +++
> >  4 files changed, 37 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
> > index 8c51a04b8083..81c514ab9518 100644
> > --- a/drivers/pci/hotplug/pciehp.h
> > +++ b/drivers/pci/hotplug/pciehp.h
> > @@ -173,10 +173,10 @@ int pciehp_query_power_fault(struct controller *ctrl);
> >  void pciehp_green_led_on(struct controller *ctrl);
> >  void pciehp_green_led_off(struct cont

[PATCH] rtl8xxxu: remove set but not used variable 'rate_mask'

2019-10-23 Thread YueHaibing
drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c:4484:6:
 warning: variable rate_mask set but not used [-Wunused-but-set-variable]

It is never used since commit a9bb0b515778 ("rtl8xxxu: Improve
TX performance of RTL8723BU on rtl8xxxu driver")

Signed-off-by: YueHaibing 
---
 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
index 1e3b716..3843d7a 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
@@ -4481,11 +4481,6 @@ static u16
 rtl8xxxu_wireless_mode(struct ieee80211_hw *hw, struct ieee80211_sta *sta)
 {
u16 network_type = WIRELESS_MODE_UNKNOWN;
-   u32 rate_mask;
-
-   rate_mask = (sta->supp_rates[0] & 0xfff) |
-   (sta->ht_cap.mcs.rx_mask[0] << 12) |
-   (sta->ht_cap.mcs.rx_mask[0] << 20);
 
if (hw->conf.chandef.chan->band == NL80211_BAND_5GHZ) {
if (sta->vht_cap.vht_supported)
-- 
2.7.4




Re: [PATCHv2 0/2] perf tools: Share struct map after clone

2019-10-23 Thread Jiri Olsa
On Wed, Oct 16, 2019 at 10:22:24AM +0200, Jiri Olsa wrote:
> hi,
> Andi reported that maps cloning is eating lot of memory and
> it's probably unnecessary, because they keep the same data.
> 
> This 'maps sharing' seems to save lot of heap for reports with
> many forks/cloned mmaps (over 60% in example below).
> 
> Profile kernel build:
> 
>   $ perf record make -j 40
> 
> Get heap profile (tools/perf directory):
> 
>   $ 
>   $ make TCMALLOC=1
>   $ HEAPPROFILE=/tmp/heapprof ./perf report -i perf.data --stdio > out
>   $ pprof ./perf /tmp/heapprof.000*
> 
> Before:
> 
>   (pprof) top
>   Total: 2335.5 MB
> 1735.1  74.3%  74.3%   1735.1  74.3% memdup
>  402.0  17.2%  91.5%402.0  17.2% zalloc
>  140.2   6.0%  97.5%145.8   6.2% map__new
>   33.6   1.4%  98.9% 33.6   1.4% symbol__new
>   12.4   0.5%  99.5% 12.4   0.5% alloc_event
>6.2   0.3%  99.7%  6.2   0.3% nsinfo__new
>5.5   0.2% 100.0%  5.5   0.2% nsinfo__copy
>0.3   0.0% 100.0%  0.3   0.0% dso__new
>0.1   0.0% 100.0%  0.1   0.0% do_read_string
>0.0   0.0% 100.0%  0.0   0.0% __GI__IO_file_doallocate
> 
> After:
> 
>   (pprof) top
>   Total: 784.5 MB
>  385.8  49.2%  49.2%385.8  49.2% memdup
>  285.8  36.4%  85.6%285.8  36.4% zalloc
>   80.4  10.3%  95.9% 83.7  10.7% map__new
>   19.1   2.4%  98.3% 19.1   2.4% symbol__new
>6.2   0.8%  99.1%  6.2   0.8% alloc_event
>3.6   0.5%  99.6%  3.6   0.5% nsinfo__new
>3.2   0.4% 100.0%  3.2   0.4% nsinfo__copy
>0.2   0.0% 100.0%  0.2   0.0% dso__new
>0.0   0.0% 100.0%  0.0   0.0% do_read_string
>0.0   0.0% 100.0%  0.0   0.0% elf_fill
> 
> v2 changes:
>   - rebased to Arnaldo's perf/core
>   - patch 1 already taken
> 
> Also available in here:
>   git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   perf/map_shared

I rebased to latest perf/core and pushed the branch out

jirka



[PATCH] x86/mce/amd: fix -Wmissing-prototypes warnings

2019-10-23 Thread Yi Wang
We get two warnings when building the kernel with W=1:
arch/x86/kernel/cpu/mce/amd.c:586:6: warning: no previous prototype for ‘disable_err_thresholding’ [-Wmissing-prototypes]

Make the function static to fix this.

Signed-off-by: Yi Wang 
---
 arch/x86/kernel/cpu/mce/amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 6ea7fdc..5167bd2 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -583,7 +583,7 @@ bool amd_filter_mce(struct mce *m)
  * - Prevent possible spurious interrupts from the IF bank on Family 0x17
  *   Models 0x10-0x2F due to Erratum #1114.
  */
-void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
+static void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
 {
int i, num_msrs;
u64 hwcr;
-- 
1.8.3.1



Re: [PATCH] arch: microblaze: support for reserved-memory entries in DT

2019-10-23 Thread Michal Simek
Hi,


On 22. 10. 19 10:19, Alvaro Gamez Machado wrote:
> Signed-off-by: Alvaro Gamez Machado 

Please add a reasonable description to the commit message.

> ---
>  arch/microblaze/mm/init.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
> index a015a951c8b7..928c5c2816e4 100644
> --- a/arch/microblaze/mm/init.c
> +++ b/arch/microblaze/mm/init.c
> @@ -17,6 +17,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 

of_fdt.h should be enough.

>  
>  #include 
>  #include 
> @@ -188,6 +190,9 @@ void __init setup_memory(void)
>  
>  void __init mem_init(void)
>  {
> + early_init_fdt_reserve_self();
> + early_init_fdt_scan_reserved_mem();
> +
>   high_memory = (void *)__va(memory_start + lowmem_size - 1);
>  
>   /* this will put all memory onto the freelists */
> 


Also, I have looked at other architectures; take a look at

1b10cb21d888c021bedbe678f7c26aee1bf04ffa
ARC: add support for reserved memory defined by device tree

where they also enable OF_RESERVED_MEM to call fdt_init_reserved_mem().

The same here
4e7c84ec045921dacc78d36295e2e61390249665
 xtensa: support reserved-memory DT node

and here
9bf14b7c540ae9ca7747af3a0c0d8470ef77b6ce
arm64: add support for reserved memory defined by device tree


Please note this in the commit message.

Thanks,
Michal


-- 
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Xilinx Microblaze
Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP/Versal SoCs



[PATCH -next] drm/amdgpu: remove set but not used variable 'adev'

2019-10-23 Thread YueHaibing
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:1221:24: warning: variable adev set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:488:24: warning: variable adev set but not used [-Wunused-but-set-variable]
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:547:24: warning: variable adev set but not used [-Wunused-but-set-variable]

It is never used, so it can be removed.

Signed-off-by: YueHaibing 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index a61b0d9..ba00262 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -485,15 +485,12 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo, bool evict,
struct ttm_operation_ctx *ctx,
struct ttm_mem_reg *new_mem)
 {
-   struct amdgpu_device *adev;
struct ttm_mem_reg *old_mem = &bo->mem;
struct ttm_mem_reg tmp_mem;
struct ttm_place placements;
struct ttm_placement placement;
int r;
 
-   adev = amdgpu_ttm_adev(bo->bdev);
-
/* create space/pages for new_mem in GTT space */
tmp_mem = *new_mem;
tmp_mem.mm_node = NULL;
@@ -544,15 +541,12 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo, bool evict,
struct ttm_operation_ctx *ctx,
struct ttm_mem_reg *new_mem)
 {
-   struct amdgpu_device *adev;
struct ttm_mem_reg *old_mem = &bo->mem;
struct ttm_mem_reg tmp_mem;
struct ttm_placement placement;
struct ttm_place placements;
int r;
 
-   adev = amdgpu_ttm_adev(bo->bdev);
-
/* make space in GTT for old_mem buffer */
tmp_mem = *new_mem;
tmp_mem.mm_node = NULL;
@@ -1218,11 +1212,8 @@ static struct ttm_backend_func amdgpu_backend_func = {
 static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_buffer_object *bo,
   uint32_t page_flags)
 {
-   struct amdgpu_device *adev;
struct amdgpu_ttm_tt *gtt;
 
-   adev = amdgpu_ttm_adev(bo->bdev);
-
gtt = kzalloc(sizeof(struct amdgpu_ttm_tt), GFP_KERNEL);
if (gtt == NULL) {
return NULL;
-- 
2.7.4




Re: [PATCH v2] clocksource/drivers: Fix memory leak in ttc_setup_clockevent

2019-10-23 Thread Markus Elfring
> In the implementation of ttc_setup_clockevent() release the allocated
> memory for ttcce if clk_notifier_register() fails.

I have other wording preferences, so I think this change description
can still be improved a bit.
Would you like to describe the addition of a jump target (according to
the Linux coding style), which completes the desired exception handling,
in a different way?


…
> +++ b/drivers/clocksource/timer-cadence-ttc.c
…
> @@ -453,15 +451,18 @@ static int __init ttc_setup_clockevent(struct clk *clk,
…
> +release_ttcce:
> +
> + kfree(ttcce);
…

I would prefer that no blank line be added directly after such a label.

Regards,
Markus


Re: [PATCH v2 10/11] gpio: pca953x: Convert to use bitmap API

2019-10-23 Thread Andy Shevchenko
On Tue, Oct 22, 2019 at 08:03:00PM +0200, Geert Uytterhoeven wrote:
> On Tue, Oct 22, 2019 at 7:29 PM Andy Shevchenko
>  wrote:
> > Instead of customized approach convert the driver to use bitmap API.

> >  #define MAX_BANK 5
> >  #define BANK_SZ 8
> > +#define MAX_LINE   (MAX_BANK * BANK_SZ)
> 
> Given (almost) everything is now bitmap (i.e. long [])-based, you might
> as well increase MAX_BANK to a multiple of 4 or 8, e.g. 8.

We can do it any time we really need it.

(Yes, I understand that we have no penalty for the change anyway)
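
The "no penalty" point can be checked with a little arithmetic: a bitmap is
stored in whole unsigned longs, so the storage depends on the bit count
rounded up to a multiple of the long width. A minimal sketch, assuming
BANK_SZ is 8 as in the quoted hunk; BITS_PER_LONG_SKETCH and
longs_for_banks() are hypothetical names that mirror the idea behind the
kernel's BITS_TO_LONGS().

```c
#include <limits.h>
#include <stddef.h>

#define BANK_SZ 8	/* lines per bank, as in the quoted hunk */

/* Width of one bitmap word on the machine this is compiled for. */
#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold max_bank * BANK_SZ bits. */
static size_t longs_for_banks(size_t max_bank)
{
	size_t bits = max_bank * BANK_SZ;

	return (bits + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;
}
```

With a 64-bit long, 5 banks (40 bits) and 8 banks (64 bits) both fit in one
word; with a 32-bit long both need two words, so the bitmap storage is the
same either way.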

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH] ASoC: mediatek: Check SND_SOC_CROS_EC_CODEC dependency

2019-10-23 Thread Tzung-Bi Shih
On Wed, Oct 23, 2019 at 2:31 PM Mao Wenan  wrote:
>
> If SND_SOC_MT8183_MT6358_TS3A227E_MAX98357A=y,
> below errors can be seen:
> sound/soc/codecs/cros_ec_codec.o: In function `send_ec_host_command':
> cros_ec_codec.c:(.text+0x534): undefined reference to `cros_ec_cmd_xfer_status'
> cros_ec_codec.c:(.text+0x101c): undefined reference to `cros_ec_get_host_event'
>
> This is because it will select SND_SOC_CROS_EC_CODEC
> after commit 2cc3cd5fdc8b ("ASoC: mediatek: mt8183: support WoV"),
> but SND_SOC_CROS_EC_CODEC depends on CROS_EC.
>
> Fixes: 2cc3cd5fdc8b ("ASoC: mediatek: mt8183: support WoV")
> Signed-off-by: Mao Wenan 

Acked-by: Tzung-Bi Shih 

Thanks for catching this.


[PATCH RESEND] dt-bindings: max77693: fix missing curly brace

2019-10-23 Thread Matti Vaittinen
Add missing curly brace to charger node example.

Signed-off-by: Matti Vaittinen 
---

Resending as I forgot to add LKML in the first attempt. Sorry, peeps.

 Documentation/devicetree/bindings/mfd/max77693.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/mfd/max77693.txt b/Documentation/devicetree/bindings/mfd/max77693.txt
index a3c60a7a3be1..0ced96e16c16 100644
--- a/Documentation/devicetree/bindings/mfd/max77693.txt
+++ b/Documentation/devicetree/bindings/mfd/max77693.txt
@@ -175,6 +175,7 @@ Example:
maxim,thermal-regulation-celsius = <75>;
maxim,battery-overcurrent-microamp = <300>;
maxim,charge-input-threshold-microvolt = <430>;
+   };
 
led {
compatible = "maxim,max77693-led";
-- 
2.21.0


-- 
Matti Vaittinen, Linux device drivers
ROHM Semiconductors, Finland SWDC
Kiviharjunlenkki 1E
90220 OULU
FINLAND

~~~ "I don't think so," said Rene Descartes. Just then he vanished ~~~
Simon says - in Latin please.
~~~ "non cogito me" dixit Rene Descarte, deinde evanescavit ~~~
Thanks to Simon Glass for the translation =] 


Re: [PATCH] net: usb: lan78xx: Use phy_mac_interrupt() for interrupt handling

2019-10-23 Thread Daniel Wagner
Sebastian suggested trying this:

--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1264,8 +1264,11 @@ static void lan78xx_status(struct lan78xx_net *dev, struct urb *urb)
netif_dbg(dev, link, dev->net, "PHY INTR: 0x%08x\n", intdata);
lan78xx_defer_kevent(dev, EVENT_LINK_RESET);
 
-   if (dev->domain_data.phyirq > 0)
+   if (dev->domain_data.phyirq > 0) {
+   local_irq_disable();
generic_handle_irq(dev->domain_data.phyirq);
+   local_irq_enable();
+   }
} else
netdev_warn(dev->net,
"unexpected interrupt: 0x%08x\n", intdata);

While this gets rid of the warning, the networking interface is not
really stable:

[   43.999628] nfs: server 192.168.19.2 not responding, still trying
[   43.999633] nfs: server 192.168.19.2 not responding, still trying
[   43.999649] nfs: server 192.168.19.2 not responding, still trying
[   43.999674] nfs: server 192.168.19.2 not responding, still trying
[   43.999678] nfs: server 192.168.19.2 not responding, still trying
[   44.006712] nfs: server 192.168.19.2 OK
[   44.018443] nfs: server 192.168.19.2 OK
[   44.024765] nfs: server 192.168.19.2 OK
[   44.025361] nfs: server 192.168.19.2 OK
[   44.025420] nfs: server 192.168.19.2 OK
[  256.991659] nfs: server 192.168.19.2 not responding, still trying
[  256.991664] nfs: server 192.168.19.2 not responding, still trying
[  256.991669] nfs: server 192.168.19.2 not responding, still trying
[  256.991685] nfs: server 192.168.19.2 not responding, still trying
[  256.991713] nfs: server 192.168.19.2 not responding, still trying
[  256.998797] nfs: server 192.168.19.2 OK
[  256.999745] nfs: server 192.168.19.2 OK
[  256.999828] nfs: server 192.168.19.2 OK
[  257.000438] nfs: server 192.168.19.2 OK
[  257.004784] nfs: server 192.168.19.2 OK


Eventually, the rootfs can be loaded and the system boots. Though the
system is not really usable because it often stalls:


root@debian:~# apt update
Ign:1 http://deb.debian.org/debian stretch InRelease
Hit:2 http://deb.debian.org/debian stretch Release
Reading package lists... 0% 


I don't see this with the irqdomain code reverted.


Re: [PATCH] x86/mce/amd: fix -Wmissing-prototypes warnings

2019-10-23 Thread Borislav Petkov
On Wed, Oct 23, 2019 at 03:57:17PM +0800, Yi Wang wrote:
> We get two warnings when build kernel W=1:
> arch/x86/kernel/cpu/mce/amd.c:586:6: warning: no previous prototype for 
> ‘disable_err_thresholding’ [-Wmissing-prototypes]
> 
> Make the function static to fix this.
> 
> Signed-off-by: Yi Wang 
> ---
>  arch/x86/kernel/cpu/mce/amd.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
> index 6ea7fdc..5167bd2 100644
> --- a/arch/x86/kernel/cpu/mce/amd.c
> +++ b/arch/x86/kernel/cpu/mce/amd.c
> @@ -583,7 +583,7 @@ bool amd_filter_mce(struct mce *m)
>   * - Prevent possible spurious interrupts from the IF bank on Family 0x17
>   *   Models 0x10-0x2F due to Erratum #1114.
>   */
> -void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
> +static void disable_err_thresholding(struct cpuinfo_x86 *c, unsigned int bank)
>  {
>   int i, num_msrs;
>   u64 hwcr;
> --

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=ras/core&id=47cd84e98f512eac5aad988f08baff432aea35ba

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH -next] phy: ti: dm816x: remove set but not used variable 'phy_data'

2019-10-23 Thread Roger Quadros




On 23/10/2019 10:45, YueHaibing wrote:

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/phy/ti/phy-dm816x-usb.c:192:29: warning:
  variable phy_data set but not used [-Wunused-but-set-variable]

It is never used, so can be removed.

Signed-off-by: YueHaibing 


Reviewed-by: Roger Quadros 

cheers,
-roger


---
  drivers/phy/ti/phy-dm816x-usb.c | 3 ---
  1 file changed, 3 deletions(-)

diff --git a/drivers/phy/ti/phy-dm816x-usb.c b/drivers/phy/ti/phy-dm816x-usb.c
index cbcce7c..26f1947 100644
--- a/drivers/phy/ti/phy-dm816x-usb.c
+++ b/drivers/phy/ti/phy-dm816x-usb.c
@@ -189,7 +189,6 @@ static int dm816x_usb_phy_probe(struct platform_device *pdev)
struct phy_provider *phy_provider;
struct usb_otg *otg;
const struct of_device_id *of_id;
-   const struct usb_phy_data *phy_data;
int error;
  
  	of_id = of_match_device(of_match_ptr(dm816x_usb_phy_id_table),

@@ -220,8 +219,6 @@ static int dm816x_usb_phy_probe(struct platform_device *pdev)
if (phy->usbphy_ctrl == 0x2c)
phy->instance = 1;
  
-	phy_data = of_id->data;

-
otg = devm_kzalloc(&pdev->dev, sizeof(*otg), GFP_KERNEL);
if (!otg)
return -ENOMEM;



--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


Re: [RFC v1] memcg: add memcg lru for page reclaiming

2019-10-23 Thread Michal Hocko
On Wed 23-10-19 12:44:48, Hillf Danton wrote:
> 
> On Tue, 22 Oct 2019 15:58:32 +0200 Michal Hocko wrote:
> > 
> > On Tue 22-10-19 21:30:50, Hillf Danton wrote:
[...]
> > > in this RFC after ripping pages off
> > > the first victim, the work finishes with the first ancestor of the victim
> > > added to lru.
> > > 
> > > Recaliming is defered until kswapd becomes active.
> > 
> > This is a wrong assumption because high limit might be configured way
> > before kswapd is woken up.
> 
> This change was introduced because high limit breach looks not like a
> serious problem in the absence of memory pressure. Lets do the hard work,
> reclaiming one memcg a time up through the hierarchy, when kswapd becomes
> active. It also explains the BH introduced.

But this goes against the main motivation for the high limit, which is
to throttle. It does not matter much that there is no global memory
pressure. The preventive high limit reclaim is there to make sure that
the specific memcg is kept reasonably contained.
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 1/3] phy: cadence: Sierra: add phy_reset hook

2019-10-23 Thread Roger Quadros

Kishon,

On 23/10/2019 10:36, Kishon Vijay Abraham I wrote:

Roger,

On 22/10/19 6:52 PM, Roger Quadros wrote:

This is required if type C driver needs to hold
global reset on J7ES to perform LN10 swap.


Can you replace "This" with something more specific?


I meant this patch, but I will revise the commit message.

cheers,
-roger


Thanks
Kishon


Signed-off-by: Roger Quadros 
Signed-off-by: Sekhar Nori 
---
  drivers/phy/cadence/phy-cadence-sierra.c | 10 ++
  1 file changed, 10 insertions(+)

diff --git a/drivers/phy/cadence/phy-cadence-sierra.c b/drivers/phy/cadence/phy-cadence-sierra.c
index affede8c4368..e6d27bdec22a 100644
--- a/drivers/phy/cadence/phy-cadence-sierra.c
+++ b/drivers/phy/cadence/phy-cadence-sierra.c
@@ -339,10 +339,20 @@ static int cdns_sierra_phy_off(struct phy *gphy)
return reset_control_assert(ins->lnk_rst);
  }
  
+static int cdns_sierra_phy_reset(struct phy *gphy)

+{
+   struct cdns_sierra_phy *sp = dev_get_drvdata(gphy->dev.parent);
+
+   reset_control_assert(sp->phy_rst);
+   reset_control_deassert(sp->phy_rst);
+   return 0;
+};
+
  static const struct phy_ops ops = {
.init   = cdns_sierra_phy_init,
.power_on   = cdns_sierra_phy_on,
.power_off  = cdns_sierra_phy_off,
+   .reset  = cdns_sierra_phy_reset,
.owner  = THIS_MODULE,
  };
  



--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


Re: [PATCH v2 0/6] musb: Improve performance for hub-attached webcams

2019-10-23 Thread Matwey V. Kornilov
On Mon, 9 Sep 2019 at 19:33, Matwey V. Kornilov wrote:
>
> On Tue, 2 Jul 2019 at 20:33, Bin Liu wrote:
> >
> > Matwey,
> >
> > On Tue, Jul 02, 2019 at 08:29:03PM +0300, Matwey V. Kornilov wrote:
> > > Ping?
> >
> > I was offline and just got back. I will review it soon. Sorry for the
> > delay.
>
> Ping?
>

Ping?

> >
> > -Bin.
> >
> > >
> > > On Fri, 14 Jun 2019 at 19:47, Matwey V. Kornilov wrote:
> > > >
> > > > The series is concerned to issues with isochronous transfer while
> > > > streaming the USB webcam data. I discovered the issue first time
> > > > when attached PWC USB webcam to AM335x-based BeagleBone Black SBC.
> > > > It appeared that the root issue was in numerous missed IN requests
> > > > during isochronous transfer where each missing leaded to the frame
> > > > drop. Since every IN request is triggered in MUSB driver
> > > > individually, it is important to queue the send IN request as
> > > > earlier as possible when the previous IN completed. At the same
> > > > time the URB giveback handler of the device driver has also to be
> > > > called there, that leads to arbitrarily delay depending on the
> > > > device driver performance. The details with the references are
> > > > described in [1].
> > > >
> > > > The issue has two parts:
> > > >
> > > >   1) peripheral driver URB callback performance
> > > >   2) MUSB host driver performance
> > > >
> > > > It appeared that the first part is related to the wrong memory
> > > > allocation strategy in the most USB webcam drivers. Non-cached
> > > > memory is used in assumption that coherent DMA memory leads to
> > > > the better performance than non-coherent memory in conjunction with
> > > > the proper synchronization. Yet the assumption might be valid for
> > > > x86 platforms some time ago, the issue was fixed for PWC driver in:
> > > >
> > > > 1161db6776bd ("media: usb: pwc: Don't use coherent DMA buffers for ISO transfer")
> > > >
> > > > that leads to 3.5x performance gain. The more generic fix for this
> > > > common issue are coming for the rest drivers [2].
> > > >
> > > > The patch allowed successfully running full-speed USB PWC webcams
> > > > attached directly to BeagleBone Black USB port.
> > > >
> > > > However, the second part of the issue is still present for
> > > > peripheral device attached through the high-speed USB hub due to
> > > > its 125us frame time. The patch series is intended to reorganize
> > > > musb_advance_schedule() to allow host to send IN request quicker.
> > > >
> > > > The patch series is organized as the following. First three patches
> > > > improve readability of the existing code in
> > > > musb_advance_schedule(). Patches 4 and 5 introduce updated
> > > > signature for musb_start_urb(). The last patch introduce new
> > > > code-path in musb_advance_schedule() which allows for faster
> > > > response.
> > > >
> > > > References:
> > > >
> > > > [1] https://www.spinics.net/lists/linux-usb/msg165735.html
> > > > [2] https://www.spinics.net/lists/linux-media/msg144279.html
> > > >
> > > > Changes since v1:
> > > >  - Patch 6 was redone to keep URB giveback order and stop transmission
> > > >    at erroneous URB.
> > > >
> > > > Matwey V. Kornilov (6):
> > > >   usb: musb: Use USB_DIR_IN when calling musb_advance_schedule()
> > > >   usb: musb: Introduce musb_qh_empty() helper function
> > > >   usb: musb: Introduce musb_qh_free() helper function
> > > >   usb: musb: Rename musb_start_urb() to musb_start_next_urb()
> > > >   usb: musb: Introduce musb_start_urb()
> > > >   usb: musb: Decrease URB starting latency in musb_advance_schedule()
> > > >
> > > >  drivers/usb/musb/musb_host.c | 132 
> > > > ---
> > > >  drivers/usb/musb/musb_host.h |   1 +
> > > >  2 files changed, 86 insertions(+), 47 deletions(-)
> > > >
> > > > --
> > > > 2.16.4
> > > >
> > >
> > >
> > > --
> > > With best regards,
> > > Matwey V. Kornilov
>
>
>
> --
> With best regards,
> Matwey V. Kornilov



-- 
With best regards,
Matwey V. Kornilov


Re: [PATCH 3/3] phy: ti: j721e-wiz: Manage typec-gpio-dir

2019-10-23 Thread Roger Quadros




On 23/10/2019 08:28, Jyri Sarha wrote:

On 22/10/2019 16:22, Roger Quadros wrote:

Based on this GPIO state we need to configure LN10
bit to swap lane0 and lane1 if required (flipped connector).

Type-C companions typically need some time after the cable is
plugged before they reflect the correct status of
Type-C plug orientation on the DIR line.

Type-C Spec specifies CC attachment debounce time (tCCDebounce)
of 100 ms (min) to 200 ms (max).

Use the DT property to figure out if we need to add delay
or not before sampling the Type-C DIR line.

Signed-off-by: Roger Quadros 
Signed-off-by: Sekhar Nori 
---
  drivers/phy/ti/phy-j721e-wiz.c | 41 ++
  1 file changed, 41 insertions(+)

diff --git a/drivers/phy/ti/phy-j721e-wiz.c b/drivers/phy/ti/phy-j721e-wiz.c
index 2a95da843e9f..2becdbcb762a 100644
--- a/drivers/phy/ti/phy-j721e-wiz.c
+++ b/drivers/phy/ti/phy-j721e-wiz.c
@@ -9,6 +9,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  #include 
  #include 
  #include 
@@ -22,6 +24,7 @@
  #define WIZ_SERDES_CTRL   0x404
  #define WIZ_SERDES_TOP_CTRL   0x408
  #define WIZ_SERDES_RST    0x40c
+#define WIZ_SERDES_TYPEC   0x410
  #define WIZ_LANECTL(n)    (0x480 + (0x40 * (n)))
  
  #define WIZ_MAX_LANES		4

@@ -29,6 +32,8 @@
  #define WIZ_DIV_NUM_CLOCKS_16G    2
  #define WIZ_DIV_NUM_CLOCKS_10G    1
  
+#define WIZ_SERDES_TYPEC_LN10_SWAP	BIT(30)

+
  enum wiz_lane_standard_mode {
LANE_MODE_GEN1,
LANE_MODE_GEN2,
@@ -206,6 +211,8 @@ struct wiz {
u32 num_lanes;
struct platform_device  *serdes_pdev;
struct reset_controller_dev wiz_phy_reset_dev;
+   struct gpio_desc *gpio_typec_dir;
+   int typec_dir_delay;
  };
  
  static int wiz_reset(struct wiz *wiz)

@@ -703,6 +710,21 @@ static int wiz_phy_reset_deassert(struct reset_controller_dev *rcdev,
struct wiz *wiz = dev_get_drvdata(dev);
int ret;
  
+	/* if typec-dir gpio was specified, set LN10 SWAP bit based on that */

+   if (id == 0 && wiz->gpio_typec_dir) {
+   if (wiz->typec_dir_delay)
+   msleep_interruptible(wiz->typec_dir_delay);
+
+   if (gpiod_get_value_cansleep(wiz->gpio_typec_dir)) {
+   regmap_update_bits(wiz->regmap, WIZ_SERDES_TYPEC,
+  WIZ_SERDES_TYPEC_LN10_SWAP,
+  WIZ_SERDES_TYPEC_LN10_SWAP);


A nit pick, but wouldn't it be more coherent with the rest of the driver
to define a REG_FIELD also for TYPEC_LN10_SWAP bit?


I agree. Although, I hate fields as you need to do so much boilerplate just to
flip one bit.
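
For anyone following the nit above: regmap_update_bits() does a masked
read-modify-write, so with a one-bit mask it either sets or clears that bit
and leaves the rest of the register alone. A rough userspace model of that
behaviour is sketched below. update_bits() and apply_typec_dir() are
hypothetical names made up for this illustration, and no kernel API is used;
only the bit position (30) is taken from the patch.

```c
#include <stdint.h>

/* Mirrors WIZ_SERDES_TYPEC_LN10_SWAP, i.e. BIT(30), from the patch. */
#define LN10_SWAP_BIT	(UINT32_C(1) << 30)

/*
 * Hypothetical userspace model of a masked register update:
 * only the bits selected by mask may change, all others are preserved.
 */
static uint32_t update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}

/*
 * Hypothetical helper: set the lane swap bit when the DIR line reads
 * "flipped", clear it otherwise.
 */
static uint32_t apply_typec_dir(uint32_t reg, int dir_flipped)
{
	return update_bits(reg, LN10_SWAP_BIT, dir_flipped ? LN10_SWAP_BIT : 0);
}
```

A regmap field would express the same single-bit write through a field
descriptor; which form is preferable here is the style question raised above.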

cheers,
-roger



+   } else {
+   regmap_update_bits(wiz->regmap, WIZ_SERDES_TYPEC,
+  WIZ_SERDES_TYPEC_LN10_SWAP, 0);
+   }
+   }
+
if (id == 0) {
ret = regmap_field_write(wiz->phy_reset_n, true);
return ret;
@@ -789,6 +811,25 @@ static int wiz_probe(struct platform_device *pdev)
goto err_addr_to_resource;
}
  
+	wiz->gpio_typec_dir = devm_gpiod_get_optional(dev, "typec-dir",

+ GPIOD_IN);
+   if (IS_ERR(wiz->gpio_typec_dir)) {
+   ret = PTR_ERR(wiz->gpio_typec_dir);
+   if (ret != -EPROBE_DEFER)
+   dev_err(dev, "Failed to request typec-dir gpio: %d\n",
+   ret);
+   goto err_addr_to_resource;
+   }
+
+   if (wiz->gpio_typec_dir) {
+   ret = of_property_read_u32(node, "typec-dir-debounce",
+  &wiz->typec_dir_delay);
+   if (ret && ret != -EINVAL) {
+   dev_err(dev, "Invalid typec-dir-debounce property\n");
+   goto err_addr_to_resource;
+   }
+   }
+
wiz->dev = dev;
wiz->regmap = regmap;
wiz->num_lanes = num_lanes;






--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


Re: [PATCH 2/5] ARM: qcom_defconfig: add msm8974 interconnect support

2019-10-23 Thread Georgi Djakov
Hi Brian,

Thank you for working on this!

On 13.10.19 11:08, Brian Masney wrote:
> Add interconnect support for msm8974-based SoCs in order to support the
> GPU on this platform.
> 
> Signed-off-by: Brian Masney 
> ---
>  arch/arm/configs/qcom_defconfig | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/arm/configs/qcom_defconfig b/arch/arm/configs/qcom_defconfig
> index b6faf6f2ddb4..32fc8a24e5c7 100644
> --- a/arch/arm/configs/qcom_defconfig
> +++ b/arch/arm/configs/qcom_defconfig
> @@ -252,6 +252,9 @@ CONFIG_PHY_QCOM_IPQ806X_SATA=y
>  CONFIG_PHY_QCOM_USB_HS=y
>  CONFIG_PHY_QCOM_USB_HSIC=y
>  CONFIG_QCOM_QFPROM=y
> +CONFIG_INTERCONNECT=m

We want to change it from tristate to bool [1].

> +CONFIG_INTERCONNECT_QCOM=y
> +CONFIG_INTERCONNECT_QCOM_MSM8974=m
>  CONFIG_EXT2_FS=y
>  CONFIG_EXT2_FS_XATTR=y
>  CONFIG_EXT3_FS=y
> 

Otherwise looks good to me.

Thanks,
Georgi

[1]
https://lore.kernel.org/r/b789cce388dd1f2906492f307dea6780c398bc6a.1567065991.git.viresh.ku...@linaro.org


Re: [PATCH v5.1 RESEND] dt-bindings: hwrng: Add Samsung Exynos 5250+ True RNG bindings

2019-10-23 Thread Andreas Färber
Hi guys,

On 26.03.19 at 12:42, Krzysztof Kozlowski wrote:
> On Fri, 11 Jan 2019 at 14:22, Łukasz Stelmach  wrote:
>>
>> Add binding documentation for the True Random Number Generator
>> found on Samsung Exynos 5250+ SoCs.
>>
>> Acked-by: Rob Herring 
>> Reviewed-by: Krzysztof Kozlowski 
>> Signed-off-by: Łukasz Stelmach 
>> ---
> 
> Rob,
> Could you apply this directly? You acked this some time ago but it
> never went through drivers tree. Lukasz resent this patch and it is
> waiting since then.
> The driver implementing compatible is already in mainline.

For some reason this text file in linux-next is lonely in devicetree/...
rather than living in Documentation/devicetree/... - please fix that.
The patch here looks correct, so not sure what went wrong:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/devicetree/bindings/rng/samsung,exynos5250-trng.txt?h=next-20191023&id=85552c22f03c9066c33f26f34538b67fee6a91a8

Thanks,
Andreas

-- 
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer
HRB 36809 (AG Nürnberg)


Re: [PATCH v4] erofs: support superblock checksum

2019-10-23 Thread Chao Yu
Hi, Xiang, Pratik,

On 2019/10/23 12:05, Gao Xiang wrote:
> From: Pratik Shinde 
> 
> Introduce superblock checksum feature in order to check
> a number of given blocks at mounting time.
> 
> Signed-off-by: Pratik Shinde 
> Signed-off-by: Gao Xiang 
> ---
> changes from v3:
>  (based on Pratik's v3 patch)
>  - add LIBCRC32C dependency;
>  - use kmap() in order to avoid sleeping in atomic context;
>  - skip the first 1024 byte for x86 boot sector,
>co-tested with userspace utils,
>https://lore.kernel.org/r/20191023034957.184711-1-gaoxian...@huawei.com
> 
>  fs/erofs/Kconfig|  1 +
>  fs/erofs/erofs_fs.h |  6 +++--
>  fs/erofs/internal.h |  2 ++
>  fs/erofs/super.c| 53 +++--
>  4 files changed, 58 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
> index 9d634d3a1845..74b0aaa7114c 100644
> --- a/fs/erofs/Kconfig
> +++ b/fs/erofs/Kconfig
> @@ -3,6 +3,7 @@
>  config EROFS_FS
>   tristate "EROFS filesystem support"
>   depends on BLOCK
> + select LIBCRC32C
>   help
> EROFS (Enhanced Read-Only File System) is a lightweight
> read-only file system with modern designs (eg. page-sized
> diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
> index b1ee5654750d..461913be1d1c 100644
> --- a/fs/erofs/erofs_fs.h
> +++ b/fs/erofs/erofs_fs.h
> @@ -11,6 +11,8 @@
>  
>  #define EROFS_SUPER_OFFSET  1024
>  
> +#define EROFS_FEATURE_COMPAT_SB_CHKSUM  0x0001
> +
>  /*
>   * Any bits that aren't in EROFS_ALL_FEATURE_INCOMPAT should
>   * be incompatible with this kernel version.
> @@ -37,8 +39,8 @@ struct erofs_super_block {
>   __u8 uuid[16];  /* 128-bit uuid for volume */
>   __u8 volume_name[16];   /* volume name */
>   __le32 feature_incompat;
> -
> - __u8 reserved2[44];
> + __le32 chksum_blocks;   /* number of blocks used for checksum */
> + __u8 reserved2[40];
>  };
>  
>  /*
> diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
> index 544a453f3076..a3778f597bf6 100644
> --- a/fs/erofs/internal.h
> +++ b/fs/erofs/internal.h
> @@ -85,6 +85,7 @@ struct erofs_sb_info {
>  
>   u8 uuid[16];/* 128-bit uuid for volume */
>   u8 volume_name[16]; /* volume name */
> + u32 feature_compat;
>   u32 feature_incompat;
>  
>   unsigned int mount_opt;
> @@ -426,6 +427,7 @@ static inline void z_erofs_exit_zip_subsystem(void) {}
>  #endif   /* !CONFIG_EROFS_FS_ZIP */
>  
> #define EFSCORRUPTED    EUCLEAN /* Filesystem is corrupted */
> +#define EFSBADCRC   EBADMSG /* Bad CRC detected */
>  
>  #endif   /* __EROFS_INTERNAL_H */
>  
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index 0e369494f2f2..18d1ec18a671 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "xattr.h"
>  
>  #define CREATE_TRACE_POINTS
> @@ -46,6 +47,47 @@ void _erofs_info(struct super_block *sb, const char *function,
>   va_end(args);
>  }
>  
> +static int erofs_superblock_csum_verify(struct super_block *sb, void *sbdata)
> +{
> + struct erofs_super_block *dsb;
> + u32 expected_crc, nblocks, crc;
> + void *kaddr;
> + struct page *page;
> + int i;
> +
> + dsb = kmemdup(sbdata + EROFS_SUPER_OFFSET,
> +   EROFS_BLKSIZ - EROFS_SUPER_OFFSET, GFP_KERNEL);
> + if (!dsb)
> + return -ENOMEM;
> +
> + expected_crc = le32_to_cpu(dsb->checksum);
> + nblocks = le32_to_cpu(dsb->chksum_blocks);

Here nblocks is used before it has been validated. I guess a fuzz test can
easily make the value extremely large, which would result in checksumming later
blocks unnecessarily.

IMO, we'd better
1. validate the superblock first, to make sure all fields in sb are valid
2. then use nblocks to count and check the payload blocks following sb
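
As a rough illustration of the ordering suggested above, the bound check
could look something like the sketch below. This is plain userspace C, not
erofs code: check_chksum_blocks() is a hypothetical helper, and total_blocks
stands in for a block count taken from an already validated superblock (an
assumption, since the patch does not show where that bound would come from).

```c
#include <stdint.h>

/*
 * Hypothetical helper: reject a checksum block count that would make the
 * verification loop walk past the end of the image.
 * Returns 0 if the count is acceptable, -1 otherwise.
 */
static int check_chksum_blocks(uint32_t chksum_blocks, uint32_t total_blocks)
{
	if (chksum_blocks > total_blocks)
		return -1;	/* caller would treat this as corruption */
	return 0;
}
```

With a check like this done first, a fuzzed chksum_blocks value is rejected
before any payload block is read.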

Thanks,

> + dsb->checksum = 0;
> + /* to allow for x86 boot sectors and other oddities. */
> + crc = crc32c(~0, dsb, EROFS_BLKSIZ - EROFS_SUPER_OFFSET);
> + kfree(dsb);
> +
> + for (i = 1; i < nblocks; i++) {
> + page = erofs_get_meta_page(sb, i);
> + if (IS_ERR(page))
> + return PTR_ERR(page);
> +
> + kaddr = kmap_atomic(page);
> + crc = crc32c(crc, kaddr, EROFS_BLKSIZ);
> + kunmap_atomic(kaddr);
> +
> + unlock_page(page);
> + put_page(page);
> + }
> +
> + if (crc != expected_crc) {
> + erofs_err(sb, "invalid checksum 0x%08x, 0x%08x expected",
> +   crc, expected_crc);
> + return -EFSBADCRC;
> + }
> + return 0;
> +}
> +
>  static void erofs_inode_init_once(void *ptr)
>  {
>   struct erofs_inode *vi = ptr;
> @@ -112,7 +154,7 @@ static int erofs_read_superblock(struct super_block *sb)
>  
>   sbi = EROFS_SB(sb);
>  
> - data = kmap_atomic(page);
> + data = kma

Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()

2019-10-23 Thread Wanpeng Li
On Fri, 28 Jun 2019 at 09:10, Frederic Weisbecker  wrote:
>
> On Fri, Jun 28, 2019 at 08:43:12AM +0800, Wanpeng Li wrote:
> > From: Wanpeng Li 
> >
> > On a machine, cpu 0 is used for housekeeping, the other 39 cpus in the
> > same socket are in nohz_full mode. We can observe huge time burn in the
> > loop for seaching nearest busy housekeeper cpu by ftrace.
> >
> >   2)   |   get_nohz_timer_target() {
> >   2)   0.240 us| housekeeping_test_cpu();
> >   2)   0.458 us| housekeeping_test_cpu();
> >
> >   ...
> >
> >   2)   0.292 us| housekeeping_test_cpu();
> >   2)   0.240 us| housekeeping_test_cpu();
> >   2)   0.227 us| housekeeping_any_cpu();
> >   2) + 43.460 us   |   }
> >
> > This patch optimizes the searching logic by finding a nearest housekeeper
> > cpu in the housekeeping cpumask, it can minimize the worst searching time
> > from ~44us to < 10us in my testing. In addition, the last iterated busy
> > housekeeper can become a random candidate while current CPU is a better
> > fallback if it is a housekeeper.
> >
> > Cc: Ingo Molnar 
> > Cc: Peter Zijlstra 
> > Cc: Frederic Weisbecker 
> > Cc: Thomas Gleixner 
> > Signed-off-by: Wanpeng Li 
>
> Reviewed-by: Frederic Weisbecker 

Hi Thomas,

It has been four months and I haven't seen the refactor of
get_nohz_timer_target() that you mentioned on IRC. Without this patch,
when there is no busy housekeeping cpu, I can observe the cyclictest
latency in a kvm guest regress from 4~5us to 8us (we offload the lapic
timer emulation to a housekeeping cpu, so that a firing timer does not
raise an external interrupt on the pCPU where the vCPU resides and
cause a vCPU vmexit). The score recovers after I add stress to create
a busy housekeeping cpu.

Could you consider applying this patch for the time being, since I'm
not sure when the refactor will be ready?

Wanpeng


Re: [RFC v1] mm: add page preemption

2019-10-23 Thread Michal Hocko
On Tue 22-10-19 22:28:02, Hillf Danton wrote:
> 
> On Tue, 22 Oct 2019 14:42:41 +0200 Michal Hocko wrote:
> > 
> > On Tue 22-10-19 20:14:39, Hillf Danton wrote:
> > > 
> > > On Mon, 21 Oct 2019 14:27:28 +0200 Michal Hocko wrote:
> > [...]
> > > > Why do we care and which workloads would benefit and how much.
> > > 
> > > Page preemption, disabled by default, should be turned on by those
> > > who wish that the performance of their workloads can survive memory
> > > pressure to certain extent.
> > 
> > I am sorry but this doesn't say anything to me. How come not all
> > workloads would fit that description?
> 
> That means pp plays a role when kswapd becomes active, and it may
> prevent too much jitters in active lru pages.

This is still too vague to be useful in any way.

> > > The number of pp users is supposed near the people who change the
> > > nice value of their apps either to -1 or higher at least once a week,
> > > less than vi users among UK's undergraduates.
> > > 
> > > > And last but not least why the existing infrastructure doesn't help
> > > > (e.g. if you have clearly defined workloads with different
> > > > memory consumption requirements then why don't you use memory cgroups to
> > > > reflect the priority).
> > > 
> > > Good question:)
> > > 
> > > Though pp is implemented by preventing any task from reclaiming as many
> > > pages as possible from other tasks that are higher on priority, it is
> > > trying to introduce prio into page reclaiming, to add a feature.
> > > 
> > > Page and memcg are different objects after all; pp is being added at
> > > the page granularity. It should be an option available in environments
> > > without memcg enabled.
> > 
> > So do you actually want to establish LRUs per priority?
> 
> No, no change other than the prio for every lru page was added. LRU per prio
> is too much to implement.

Well, considering that per-page priority is a no-go, as already pointed
out by Willy, you do not have any other choice, right?

> > Why using memcgs is not an option?
> 
> I have plan to add prio in memcg. As you see, I sent a rfc before v0 with
> nice added in memcg, and realised a couple days ago that its dependence on
> soft limit reclaim is not acceptable.
> 
> But we can't do that without determining how to define memcg's prio.
> What is in mind now is the highest (or lowest) prio of tasks in a memcg
> with a knob offered to userspace.
> 
> If you like, I want to have a talk about it sometime later.

This doesn't really answer my question. Why can't you use memcgs as
they are now? Why exactly do you need a fixed priority?

> > This is the main facility to partition reclaimable
> > memory in the first place. You should really focus on explaining on why
> > a much more fine grained control is needed much more thoroughly.
> > 
> > > What is way different from the protections offered by memory cgroup
> > > is that pages protected by memcg:min/low can't be reclaimed regardless
> > > of memory pressure. Such guarantee is not available under pp as it only
> > > suggests an extra factor to consider on deactivating lru pages.
> > 
> > Well, low limit can be breached if there is no eliglible memcg to be
> > reclaimed. That means that you can shape some sort of priority by
> > setting the low limit already.
> > 
> > [...]
> > 
> > > What was added on the reclaimer side is
> > > 
> > > 1, kswapd sets pgdat->kswapd_prio, the switch between page reclaimer
> > >and allocator in terms of prio, to the lowest value before taking
> > >a nap.
> > > 
> > > 2, any allocator is able to wake up the reclaimer because of the
> > >lowest prio, and it starts reclaiming pages using the waker's prio.
> > > 
> > > 3, allocator comes while kswapd is active, its prio is checked and
> > >no-op if kswapd is higher on prio; otherwise switch is updated
> > >with the higher prio.
> > > 
> > > 4, every time kswapd raises sc.priority that starts with DEF_PRIORITY,
> > >it is checked if there is pending update of switch; and kswapd's
> > >prio steps up if there is a pending one, thus its prio never steps
> > >down. Nor prio inversion. 
> > > 
> > > 5, goto 1 when kswapd finishes its work.
> > 
> > What about the direct reclaim?
> 
> Their prio will not change before reclaiming finishes, so leave it be.

This doesn't answer my question.

> > What if pages of a lower priority are
> > hard to reclaim? Do you want a process of a higher priority stall more
> > just because it has to wait for those lower priority pages?
> 
> The problems above are not introduced by pp, let Mr. Kswapd take care of
> them.

No, this is not an answer.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH v4] media: vimc: Implement debayer control for mean window size

2019-10-23 Thread Hans Verkuil
Hi Arthur,

I added this patch to my pull request, but I have a request for a follow-up
patch:

On 10/2/19 2:46 AM, Arthur Moraes do Lago wrote:
> Add mean window size parameter for debayer filter as a control in
> vimc-debayer.
> 
> vimc-debayer was patched to allow changing mean window parameter
> of the filter without needing to reload the driver. The parameter
> can now be set using a v4l2-ctl control(mean_window_size).
> 
> Co-developed-by: Laís Pessine do Carmo 
> Signed-off-by: Laís Pessine do Carmo 
> Signed-off-by: Arthur Moraes do Lago 
> ---
> Changes in V2:
>  - Updated documentation
>  - Added v4l2_subev_core_ops to solve errors in v4l2-ctl compliance test
>  - Changed control naming to follow English capitalization rules
>  - Rebased to Shuah Khan's newest patch series 171283
> ("Collapse vimc single monolithic driver")
>  - Change maximum value for mean window size
> Changes in V3:
>  - Renamed debayer control
>  - Fixed typo in documentation
>  - Freed control handler in vimc_deb_release
> Changes in V4:
>  - Removed unecessary function and checking for setting the control
> 
> We had originally intended to leave that bit of code checking if the
> value is being set to make it similar to what's done in vimc-sensor,
> and in case some extra caution is needed when chaging control in the
> future. But I guess they really were not necessary.
> 
> Thanks!
> 
> ---
>  Documentation/media/v4l-drivers/vimc.rst   | 10 +--
>  drivers/media/platform/vimc/vimc-common.h  |  1 +
>  drivers/media/platform/vimc/vimc-debayer.c | 81 ++
>  3 files changed, 71 insertions(+), 21 deletions(-)
> 
> diff --git a/Documentation/media/v4l-drivers/vimc.rst b/Documentation/media/v4l-drivers/vimc.rst
> index a582af0509ee..28646c76dad5 100644
> --- a/Documentation/media/v4l-drivers/vimc.rst
> +++ b/Documentation/media/v4l-drivers/vimc.rst
> @@ -80,9 +80,7 @@ vimc-capture:
>  Module options
>  ---
>  
> -Vimc has a few module parameters to configure the driver.
> -
> -param=value
> +Vimc has a module parameter to configure the driver.
>  
>  * ``sca_mult=``
>  
> @@ -91,12 +89,6 @@ Vimc has a few module parameters to configure the driver.
>  original one. Currently, only supports scaling up (the default value
>  is 3).
>  
> -* ``deb_mean_win_size=``
> -
> -Window size to calculate the mean. Note: the window size needs to be an
> -odd number, as the main pixel stays in the center of the window,
> -otherwise the next odd number is considered (the default value is 3).
> -
>  Source code documentation
>  -
>  
> diff --git a/drivers/media/platform/vimc/vimc-common.h b/drivers/media/platform/vimc/vimc-common.h
> index 236412ad7548..3a5102ddf794 100644
> --- a/drivers/media/platform/vimc/vimc-common.h
> +++ b/drivers/media/platform/vimc/vimc-common.h
> @@ -19,6 +19,7 @@
>  #define VIMC_CID_VIMC_BASE   (0x00f0 | 0xf000)
>  #define VIMC_CID_VIMC_CLASS  (0x00f0 | 1)
> #define VIMC_CID_TEST_PATTERN    (VIMC_CID_VIMC_BASE + 0)
> +#define VIMC_CID_MEAN_WIN_SIZE   (VIMC_CID_VIMC_BASE + 1)
>  
>  #define VIMC_FRAME_MAX_WIDTH 4096
>  #define VIMC_FRAME_MAX_HEIGHT 2160
> diff --git a/drivers/media/platform/vimc/vimc-debayer.c b/drivers/media/platform/vimc/vimc-debayer.c
> index 37f3767db469..ba0af4b2fb9b 100644
> --- a/drivers/media/platform/vimc/vimc-debayer.c
> +++ b/drivers/media/platform/vimc/vimc-debayer.c
> @@ -11,17 +11,12 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  
>  #include "vimc-common.h"
>  
> -static unsigned int deb_mean_win_size = 3;
> -module_param(deb_mean_win_size, uint, );
> -MODULE_PARM_DESC(deb_mean_win_size, " the window size to calculate the mean.\n"
> - "NOTE: the window size needs to be an odd number, as the main pixel "
> - "stays in the center of the window, otherwise the next odd number "
> - "is considered");
> -
>  enum vimc_deb_rgb_colors {
>   VIMC_DEB_RED = 0,
>   VIMC_DEB_GREEN = 1,
> @@ -46,6 +41,8 @@ struct vimc_deb_device {
>   u8 *src_frame;
>   const struct vimc_deb_pix_map *sink_pix_map;
>   unsigned int sink_bpp;
> + unsigned int mean_win_size;
> + struct v4l2_ctrl_handler hdl;
>  };
>  
>  static const struct v4l2_mbus_framefmt sink_fmt_default = {
> @@ -346,11 +343,18 @@ static int vimc_deb_s_stream(struct v4l2_subdev *sd, int enable)
>   return 0;
>  }
>  
> +static const struct v4l2_subdev_core_ops vimc_deb_core_ops = {
> + .log_status = v4l2_ctrl_subdev_log_status,
> + .subscribe_event = v4l2_ctrl_subdev_subscribe_event,
> + .unsubscribe_event = v4l2_event_subdev_unsubscribe,
> +};
> +
>  static const struct v4l2_subdev_video_ops vimc_deb_video_ops = {
>   .s_stream = vimc_deb_s_stream,
>  };
>  
>  static const struct v4l2_subdev_ops vimc_deb_ops = {
> + .core = &vimc_deb_core_ops,
>   .pad = &vimc_

Re: [PATCH] clocksource/drivers: Fix error handling in ttc_setup_clocksource

2019-10-23 Thread Markus Elfring
> Fixes: e932900a3279 ("arm: zynq: Use standard timer binding")

What do you think about adding the tag “Reported-by” for Michal Simek?
https://lore.kernel.org/linux-arm-kernel/2a6cdb63-397b-280a-7379-740e8f43d...@xilinx.com/
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?id=3b7c59a1950c75f2c0152e5a9cd77675b09233d6#n584

Regards,
Markus


Re: [PATCH v2] perf jevents: Fix resource leak in process_mapfile() and main()

2019-10-23 Thread Yunfeng Ye



On 2019/10/16 22:25, Arnaldo Carvalho de Melo wrote:
> On Wed, Oct 16, 2019 at 09:50:17PM +0800, Yunfeng Ye wrote:
>> There are memory leaks and file descriptor resource leaks in
>> process_mapfile() and main().
>>
>> Fix this by adding free(), fclose() and free_arch_std_events()
>> on the error paths.
>>
>> Fixes: 80eeb67fe577 ("perf jevents: Program to convert JSON file")
>> Fixes: 3f056b66647b ("perf jevents: Make build fail on JSON parse error")
>> Fixes: e9d32c1bf0cd ("perf vendor events: Add support for arch standard events")
> 
> Nice, thanks for adding the fixes line, I looked at those three patches
> and indeed they were leaky, thanks for the fixes, we shouldn't have
> those leaks even if that, for now, makes the tool to end anyway.
> 
The other 3 patches have been applied; has this patch been applied as well? Thanks.
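
The error-path cleanup described in the commit message above (release the
line buffer and close the stream on every exit) can be sketched in plain
userspace C. This is only an illustration of the single-exit "goto out"
pattern, not the jevents.c code itself: count_lines_checked() and MAX_LINE
are hypothetical names chosen for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 64	/* hypothetical limit, chosen only for this sketch */

/*
 * Count the lines of a file, failing if a line does not fit in the buffer.
 * Returns the line count, or -1 on error. Every exit taken after both
 * resources exist goes through "out", so each is released exactly once.
 */
static int count_lines_checked(const char *path)
{
	char *line;
	FILE *fp;
	int ret = 0;

	line = malloc(MAX_LINE);
	if (!line)
		return -1;

	fp = fopen(path, "r");
	if (!fp) {
		free(line);	/* only the buffer exists at this point */
		return -1;
	}

	while (fgets(line, MAX_LINE, fp)) {
		size_t len = strlen(line);

		if (len > 0 && line[len - 1] != '\n' && !feof(fp)) {
			ret = -1;	/* line too long: record the error ... */
			goto out;	/* ... and still run the cleanup */
		}
		ret++;
	}

out:
	fclose(fp);
	free(line);
	return ret;
}
```

Carrying the result in ret and funnelling the exits through one label is
what lets the error branch share the fclose()/free() calls with the
success path.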

> - Arnaldo
> 
>> Signed-off-by: Yunfeng Ye 
>> ---
>> v1 -> v2:
>>  - add free(eventsfp) to fix eventsfp resource leaks
>>  - add free_arch_std_events() on the error path
>>
>>  tools/perf/pmu-events/jevents.c | 13 +++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/pmu-events/jevents.c 
>> b/tools/perf/pmu-events/jevents.c
>> index e2837260ca4d..99e3fd04a5cb 100644
>> --- a/tools/perf/pmu-events/jevents.c
>> +++ b/tools/perf/pmu-events/jevents.c
>> @@ -758,6 +758,7 @@ static int process_mapfile(FILE *outfp, char *fpath)
>>  char *line, *p;
>>  int line_num;
>>  char *tblname;
>> +int ret = 0;
>>
>>  pr_info("%s: Processing mapfile %s\n", prog, fpath);
>>
>> @@ -769,6 +770,7 @@ static int process_mapfile(FILE *outfp, char *fpath)
>>  if (!mapfp) {
>>  pr_info("%s: Error %s opening %s\n", prog, strerror(errno),
>>  fpath);
>> +free(line);
>>  return -1;
>>  }
>>
>> @@ -795,7 +797,8 @@ static int process_mapfile(FILE *outfp, char *fpath)
>>  /* TODO Deal with lines longer than 16K */
>>  pr_info("%s: Mapfile %s: line %d too long, aborting\n",
>>  prog, fpath, line_num);
>> -return -1;
>> +ret = -1;
>> +goto out;
>>  }
>>  line[strlen(line)-1] = '\0';
>>
>> @@ -825,7 +828,9 @@ static int process_mapfile(FILE *outfp, char *fpath)
>>
>>  out:
>>  print_mapping_table_suffix(outfp);
>> -return 0;
>> +fclose(mapfp);
>> +free(line);
>> +return ret;
>>  }
>>
>>  /*
>> @@ -1122,6 +1127,7 @@ int main(int argc, char *argv[])
>>  goto empty_map;
>>  } else if (rc < 0) {
>>  /* Make build fail */
>> +fclose(eventsfp);
>>  free_arch_std_events();
>>  return 1;
>>  } else if (rc) {
>> @@ -1134,6 +1140,7 @@ int main(int argc, char *argv[])
>>  goto empty_map;
>>  } else if (rc < 0) {
>>  /* Make build fail */
>> +fclose(eventsfp);
>>  free_arch_std_events();
>>  return 1;
>>  } else if (rc) {
>> @@ -1151,6 +1158,8 @@ int main(int argc, char *argv[])
>>  if (process_mapfile(eventsfp, mapfile)) {
>>  pr_info("%s: Error processing mapfile %s\n", prog, mapfile);
>>  /* Make build fail */
>> +fclose(eventsfp);
>> +free_arch_std_events();
>>  return 1;
>>  }
>>
>> -- 
>> 2.7.4.3
> 



Re: [PATCH] PCI: Warn about host bridge device when its numa node is NO_NODE

2019-10-23 Thread Yunsheng Lin
On 2019/10/23 5:04, Bjorn Helgaas wrote:
> On Sat, Oct 19, 2019 at 02:45:43PM +0800, Yunsheng Lin wrote:
>> As the disscusion in [1]:
> 
> We need to justify this patch right here in the commit log, not with a
> pointer to a 50+ message email thread.

Ok, thanks.

> 
>> A PCI device really _MUST_ have a node assigned. 
> 
> No, it's not really essential.  It's *nice* if we know the node
> closest to a PCI device, but the system should function correctly even
> if we don't.  The only problem is that it will be slower.

Ok, I will try to include the information you mention here in the commit log.

> 
> I think the underlying problem you're addressing is that:
> 
>   - NUMA_NO_NODE == -1,
>   - dev_to_node(dev) may return NUMA_NO_NODE,
>   - kmalloc(dev) relies on cpumask_of_node(dev_to_node(dev)), and
>   - cpumask_of_node(NUMA_NO_NODE) makes an invalid array reference
> 
> For example, on arm64, mips loongson, s390, and x86,
> cpumask_of_node(node) returns "node_to_cpumask_map[node]", and -1 is
> an invalid array index.

The invalid array index of -1 is the underlying problem here: it occurs
when cpumask_of_node(dev_to_node(dev)) is called and cpumask_of_node()
is not yet NUMA_NO_NODE aware.

In the "numa: make node_to_cpumask_map() NUMA_NO_NODE aware" thread
discussion, it was requested that we warn about PCIe devices without
a node assigned by the firmware before making cpumask_of_node()
NUMA_NO_NODE aware, so that systems with PCI devices of node
"NUMA_NO_NODE" can be fixed by their vendors.

See: 
https://lore.kernel.org/lkml/2019101539.gx2...@hirez.programming.kicks-ass.net/

> 
> That problem can't be solved by emitting a warning, of course.  I
> assume some variation of your "numa: make node_to_cpumask_map()
> NUMA_NO_NODE aware" patch [a] will solve that problem.
> 
> [a] 
> https://lore.kernel.org/linux-mips/1568535656-158979-1-git-send-email-linyunsh...@huawei.com/T/#u
> 
> It is probably a good idea to emit a warning about the performance
> issue.
> 
> When I run your patch on qemu, I see this:
> 
>   ACPI: PCI Root Bridge [PCI0] (domain  [bus 00-ff])
>   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments 
> MSI HPX-Type3]
>   acpi PNP0A08:00: _OSC: platform does not support [LTR]
>   acpi PNP0A08:00: _OSC: OS now controls [PME AER PCIeCapability]
>   PCI host bridge to bus :00
>   pci_bus :00: root bus resource [io  0x-0x0cf7 window]
>   pci_bus :00: root bus resource [io  0x0d00-0x window]
>   pci_bus :00: root bus resource [mem 0x000a-0x000b window]
>   pci_bus :00: root bus resource [mem 0xc000-0xfebf window]
>   pci_bus :00: root bus resource [mem 0x1-0x8 window]
>   pci_bus :00: root bus resource [bus 00-ff]
>pci:00: [Firmware Bug]: No node assigned on NUMA capable HW by BIOS. 
> Please contact your vendor for updates.
> 
> I didn't debug it to see what's wrong with the " pci:00" string.
> Ideally it would be connected with "acpi PNP0A08:00" since that's the
> place where BIOS would make a fix but I suppose "pci_bus :00"
> would be adequate.

It seems the string at the beginning of the dev_err() output consists
of dev_driver_string() and dev_name().

drivers/base/core.c:
const char *dev_driver_string(const struct device *dev)
{
	struct device_driver *drv;

	/* dev->driver can change to NULL underneath us because of unbinding,
	 * so be careful about accessing it.  dev->bus and dev->class should
	 * never change once they are set, so they don't need special care.
	 */
	drv = READ_ONCE(dev->driver);
	return drv ? drv->name :
			(dev->bus ? dev->bus->name :
			(dev->class ? dev->class->name : ""));
}

The bridge device does not have a driver, a bus or a class, so
dev_driver_string() returns "", which is why there is an extra
space at the beginning of " pci:00". I am not familiar with PCI,
so I am not sure whether it is a problem that the bridge device
does not have a driver, a bus or a class.

And the bus device has a class of pcibus_class, and name of
pcibus_class is "pci_bus".
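
To make the effect on the log prefix concrete, here is a small userspace
model. It is a sketch under assumptions: the fake_* structures and
functions are hypothetical stand-ins, and the "<driver string> <device
name>: " layout is taken from the description above rather than from the
printk code itself.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct fake_named {
	const char *name;
};

struct fake_device {
	const struct fake_named *driver;
	const struct fake_named *bus;
	const struct fake_named *cls;	/* models dev->class */
	const char *name;		/* models dev_name(dev) */
};

/* Same fallback order as dev_driver_string(): driver, bus, class, "". */
static const char *fake_driver_string(const struct fake_device *dev)
{
	if (dev->driver)
		return dev->driver->name;
	if (dev->bus)
		return dev->bus->name;
	if (dev->cls)
		return dev->cls->name;
	return "";
}

/* Build the "<driver string> <device name>: " prefix into buf. */
static int fake_log_prefix(char *buf, size_t len,
			   const struct fake_device *dev)
{
	return snprintf(buf, len, "%s %s: ",
			fake_driver_string(dev), dev->name);
}
```

With no driver, bus or class the first "%s" expands to nothing, so the
prefix starts with the separator space. A device that has a class gets
the class name there instead.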

So maybe change the warning to below:

if (nr_node_ids > 1 && pcibus_to_node(bus) == NUMA_NO_NODE)
	dev_err(&bus->dev, FW_BUG "No node assigned on NUMA capable HW. Please contact your vendor for updates.\n");

And it seems a PCI device's parent is always set to the bridge
device in pci_setup_device(), and device_add() sets the child's
node to its parent's when the child device's node is NUMA_NO_NODE.
So maybe we can also check the bridge device's node to make sure
the PCI device really does not have a node assigned, as below:

if (nr_node_ids > 1 && pcibus_to_node(bus) == NUMA_NO_NODE &&
    dev_to_node(bus->bridge) == NUMA_NO_NODE)
	dev_err(&bus->dev, FW_BUG "No node assigned on NUMA capable HW. Please contact your vendor for updates.\n");
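
For reference, the two pieces discussed in this mail (a lookup that
tolerates "no node", and the proposed warning condition) can be modelled
in userspace as below. Everything prefixed fake_/FAKE_ is hypothetical,
and falling back to "all CPUs" is only one plausible policy assumed for
the sketch, not necessarily what the node_to_cpumask_map() patch does.

```c
#define FAKE_NUMA_NO_NODE	(-1)
#define FAKE_NR_NODES		2

/* One bit per CPU; node 0 owns CPUs 0-3, node 1 owns CPUs 4-7. */
static const unsigned long fake_node_to_cpumask[FAKE_NR_NODES] = {
	0x0f, 0xf0
};
static const unsigned long fake_all_cpus = 0xff;

/* Never index the map with -1: treat "no node" as "any CPU". */
static unsigned long fake_cpumask_of_node(int node)
{
	if (node == FAKE_NUMA_NO_NODE)
		return fake_all_cpus;
	if (node < 0 || node >= FAKE_NR_NODES)
		return 0;	/* out of range: empty mask */
	return fake_node_to_cpumask[node];
}

/*
 * Mirrors the proposed condition: warn only on multi-node systems, and
 * only when neither the bus nor its bridge has a node assigned.
 */
static int fake_should_warn(int nr_node_ids, int bus_node, int bridge_node)
{
	return nr_node_ids > 1 &&
	       bus_node == FAKE_NUMA_NO_NODE &&
	       bridge_node == FAKE_NUMA_NO_NODE;
}
```

The warning and the lookup guard are independent: the guard keeps the
system working, while the warning only reports the missing firmware
information.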


> 
>> It is possible to
>> have a PCI bridge share

[PATCH v9 1/3] PM: wakeup: Add routine to help fetch wakeup source object.

2019-10-23 Thread Ran Wang
Some users might want to go through all registered wakeup sources
and act on them accordingly. For example, an SoC PM driver might need
to do HW programming to prevent powering down a specific IP that a
wakeup source depends on. So add this API to help walk through all
registered wakeup source objects on that list and return them one by one.
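
The walk-start / walk-next / for-each shape of this API can be
illustrated with a userspace model. This is a sketch only: it uses a
plain NULL-terminated list instead of the SRCU-protected kernel list,
and all fake_* names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical userspace model of a registered wakeup source. */
struct fake_ws {
	const char *name;
	struct fake_ws *next;
};

static struct fake_ws *fake_ws_list;	/* head of the registered list */

/* Return the first object on the list, or NULL if it is empty. */
static struct fake_ws *fake_ws_walk_start(void)
{
	return fake_ws_list;
}

/* Return the object following ws, or NULL at the end of the list. */
static struct fake_ws *fake_ws_walk_next(struct fake_ws *ws)
{
	return ws->next;
}

#define for_each_fake_ws(ws)			\
	for ((ws) = fake_ws_walk_start();	\
	     (ws);				\
	     (ws) = fake_ws_walk_next(ws))

/* Example consumer: count the registered objects. */
static int fake_ws_count(void)
{
	struct fake_ws *ws;
	int n = 0;

	for_each_fake_ws(ws)
		n++;
	return n;
}
```

In the kernel version the caller is additionally expected to hold the
read lock around the whole loop and to pass the returned index to the
matching unlock call.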

Signed-off-by: Ran Wang 
Tested-by: Leonard Crestez 
---
Change in v9:
- Supplement comments for wakeup_sources_read_lock(),
  wakeup_sources_read_unlock, wakeup_sources_walk_start and
  wakeup_sources_walk_next().

Change in v8:
- Rename wakeup_source_get_next() to wakeup_sources_walk_next().
- Add wakeup_sources_read_lock() to take over locking job of
  wakeup_source_get_start().
- Rename wakeup_source_get_start() to wakeup_sources_walk_start().
- Replace wakeup_source_get_stop() with wakeup_sources_read_unlock().
- Define macro for_each_wakeup_source(ws).

Change in v7:
- Remove define of member *dev in wake_irq to fix conflict with commit 
c8377adfa781 ("PM / wakeup: Show wakeup sources stats in sysfs"), user 
will use ws->dev->parent instead.
- Remove '#include ' because it is not used.

Change in v6:
- Add wakeup_source_get_start() and wakeup_source_get_stop() to align
with wakeup_sources_stats_seq_start/next/stop.

Change in v5:
- Update commit message, add description of walk through all wakeup
source objects.
- Add SRCU protection in function wakeup_source_get_next().
- Rename wakeup_source member 'attached_dev' to 'dev' and move it up
(before wakeirq).

Change in v4:
- None.

Change in v3:
- Adjust indentation of *attached_dev;.

Change in v2:
- None.

 drivers/base/power/wakeup.c | 54 +
 include/linux/pm_wakeup.h   |  9 
 2 files changed, 63 insertions(+)

diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index 5817b51..70a9edb 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -248,6 +248,60 @@ void wakeup_source_unregister(struct wakeup_source *ws)
 EXPORT_SYMBOL_GPL(wakeup_source_unregister);
 
 /**
+ * wakeup_sources_read_lock - Lock wakeup source list for read.
+ *
+ * Returns an index of srcu lock for struct wakeup_srcu.
+ * This index must be passed to the matching wakeup_sources_read_unlock().
+ */
+int wakeup_sources_read_lock(void)
+{
+   return srcu_read_lock(&wakeup_srcu);
+}
+EXPORT_SYMBOL_GPL(wakeup_sources_read_lock);
+
+/**
+ * wakeup_sources_read_unlock - Unlock wakeup source list.
+ * @idx: return value from corresponding wakeup_sources_read_lock()
+ */
+void wakeup_sources_read_unlock(int idx)
+{
+   srcu_read_unlock(&wakeup_srcu, idx);
+}
+EXPORT_SYMBOL_GPL(wakeup_sources_read_unlock);
+
+/**
+ * wakeup_sources_walk_start - Begin a walk on wakeup source list
+ *
+ * Returns first object of the list of wakeup sources.
+ *
+ * Note that to be safe, wakeup sources list needs to be locked by calling
+ * wakeup_source_read_lock() for this.
+ */
+struct wakeup_source *wakeup_sources_walk_start(void)
+{
+   struct list_head *ws_head = &wakeup_sources;
+
+   return list_entry_rcu(ws_head->next, struct wakeup_source, entry);
+}
+EXPORT_SYMBOL_GPL(wakeup_sources_walk_start);
+
+/**
+ * wakeup_sources_walk_next - Get next wakeup source from the list
+ * @ws: Previous wakeup source object
+ *
+ * Note that to be safe, wakeup sources list needs to be locked by calling
+ * wakeup_source_read_lock() for this.
+ */
+struct wakeup_source *wakeup_sources_walk_next(struct wakeup_source *ws)
+{
+   struct list_head *ws_head = &wakeup_sources;
+
+   return list_next_or_null_rcu(ws_head, &ws->entry,
+   struct wakeup_source, entry);
+}
+EXPORT_SYMBOL_GPL(wakeup_sources_walk_next);
+
+/**
  * device_wakeup_attach - Attach a wakeup source object to a device object.
  * @dev: Device to handle.
  * @ws: Wakeup source object to attach to @dev.
diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h
index 661efa0..aa3da66 100644
--- a/include/linux/pm_wakeup.h
+++ b/include/linux/pm_wakeup.h
@@ -63,6 +63,11 @@ struct wakeup_source {
boolautosleep_enabled:1;
 };
 
+#define for_each_wakeup_source(ws) \
+   for ((ws) = wakeup_sources_walk_start();\
+(ws);  \
+(ws) = wakeup_sources_walk_next((ws)))
+
 #ifdef CONFIG_PM_SLEEP
 
 /*
@@ -92,6 +97,10 @@ extern void wakeup_source_remove(struct wakeup_source *ws);
 extern struct wakeup_source *wakeup_source_register(struct device *dev,
const char *name);
 extern void wakeup_source_unregister(struct wakeup_source *ws);
+extern int wakeup_sources_read_lock(void);
+extern void wakeup_sources_read_unlock(int idx);
+extern struct wakeup_source *wa

[PATCH v9 3/3] soc: fsl: add RCPM driver

2019-10-23 Thread Ran Wang
NXP's QorIQ processors based on ARM cores have an RCPM module
(Run Control and Power Management), which performs system-level
tasks associated with power management, such as wakeup source control.

This driver depends on the PM wakeup source framework, which helps to
collect wakeup information.

Signed-off-by: Ran Wang 
---
Change in v9:
- Add kerneldoc for rcpm_pm_prepare().
- Use pr_debug() instead of dev_info() to print a message when deciding
  to skip the current wakeup object. This is mainly for debugging (in
  order to detect a potentially improper device tree implementation
  that might cause this skip).
- Refactor looping implementation in rcpm_pm_prepare(), add more
  comments to help clarify.

Change in v8:
- Adjust related API usage to meet wakeup.c's update in patch 1/3.
- Add sanity checking for the case of ws->dev or ws->dev->parent
  is null.

Change in v7:
- Replace 'ws->dev' with 'ws->dev->parent' to get aligned with
c8377adfa781 ("PM / wakeup: Show wakeup sources stats in sysfs")
- Remove '+obj-y += ftm_alarm.o' since it is wrong.
- Cosmetic work.

Change in v6:
- Adjust related API usage to meet wakeup.c's update in patch 1/3.

Change in v5:
- Fix v4 regression where the return value of wakeup_source_get_next()
was not passed to ws in the while loop.
- Rename wakeup_source member 'attached_dev' to 'dev'.
- Rename property 'fsl,#rcpm-wakeup-cells' to '#fsl,rcpm-wakeup-cells'.
please see https://lore.kernel.org/patchwork/patch/1101022/

Change in v4:
- Remove extra ',' in author line of rcpm.c
- Update usage of wakeup_source_get_next() to be less confusing to the
reader, code logic remain the same.

Change in v3:
- Some whitespace adjustment.

Change in v2:
- Rebase Kconfig and Makefile update to latest mainline.

 drivers/soc/fsl/Kconfig  |   8 +++
 drivers/soc/fsl/Makefile |   1 +
 drivers/soc/fsl/rcpm.c   | 148 +++
 3 files changed, 157 insertions(+)
 create mode 100644 drivers/soc/fsl/rcpm.c

diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
index f9ad8ad..4918856 100644
--- a/drivers/soc/fsl/Kconfig
+++ b/drivers/soc/fsl/Kconfig
@@ -40,4 +40,12 @@ config DPAA2_CONSOLE
  /dev/dpaa2_mc_console and /dev/dpaa2_aiop_console,
  which can be used to dump the Management Complex and AIOP
  firmware logs.
+
+config FSL_RCPM
+   bool "Freescale RCPM support"
+   depends on PM_SLEEP
+   help
+ The NXP QorIQ Processors based on ARM Core have RCPM module
+ (Run Control and Power Management), which performs all device-level
+ tasks associated with power management, such as wakeup source control.
 endmenu
diff --git a/drivers/soc/fsl/Makefile b/drivers/soc/fsl/Makefile
index 71dee8d..906f1cd 100644
--- a/drivers/soc/fsl/Makefile
+++ b/drivers/soc/fsl/Makefile
@@ -6,6 +6,7 @@
 obj-$(CONFIG_FSL_DPAA) += qbman/
 obj-$(CONFIG_QUICC_ENGINE) += qe/
 obj-$(CONFIG_CPM)  += qe/
+obj-$(CONFIG_FSL_RCPM) += rcpm.o
 obj-$(CONFIG_FSL_GUTS) += guts.o
 obj-$(CONFIG_FSL_MC_DPIO)  += dpio/
 obj-$(CONFIG_DPAA2_CONSOLE)+= dpaa2-console.o
diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
new file mode 100644
index 000..9378073
--- /dev/null
+++ b/drivers/soc/fsl/rcpm.c
@@ -0,0 +1,148 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// rcpm.c - Freescale QorIQ RCPM driver
+//
+// Copyright 2019 NXP
+//
+// Author: Ran Wang 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define RCPM_WAKEUP_CELL_MAX_SIZE  7
+
+struct rcpm {
+   unsigned intwakeup_cells;
+   void __iomem*ippdexpcr_base;
+   boollittle_endian;
+};
+
+/**
+ * rcpm_pm_prepare - performs device-level tasks associated with power
+ * management, such as programming related to the wakeup source control.
+ * @dev: Device to handle.
+ *
+ */
+static int rcpm_pm_prepare(struct device *dev)
+{
+   int i, ret, idx;
+   void __iomem *base;
+   struct wakeup_source*ws;
+   struct rcpm *rcpm;
+   struct device_node  *np = dev->of_node;
+   u32 value[RCPM_WAKEUP_CELL_MAX_SIZE + 1];
+
+   rcpm = dev_get_drvdata(dev);
+   if (!rcpm)
+   return -EINVAL;
+
+   base = rcpm->ippdexpcr_base;
+   idx = wakeup_sources_read_lock();
+
+   /* Begin with first registered wakeup source */
+   for_each_wakeup_source(ws) {
+
+   /* skip object which is not attached to device */
+   if (!ws->dev || !ws->dev->parent)
+   continue;
+
+   ret = device_property_read_u32_array(ws->dev->parent,
+   "fsl,rcpm-wakeup", value,
+   rcpm->wakeup_cells + 

[PATCH v9 2/3] Documentation: dt: binding: fsl: Add 'little-endian' and update Chassis define

2019-10-23 Thread Ran Wang
By default, the QorIQ SoC's RCPM register block is Big Endian. But
there are some exceptions, such as LS1088A and LS2088A, which are
Little Endian. So add this optional property to help identify
them.

Actually LS2021A and other Layerscapes won't totally follow Chassis
2.1, so separate them from powerpc SoC.

Signed-off-by: Ran Wang 
Reviewed-by: Rob Herring 
---
Change in v9:
- None

Change in v8:
- None.

Change in v7:
- None.

Change in v6:
- None.

Change in v5:
- Add 'Reviewed-by: Rob Herring ' to commit message.
- Rename property 'fsl,#rcpm-wakeup-cells' to '#fsl,rcpm-wakeup-cells'.
please see https://lore.kernel.org/patchwork/patch/1101022/

Change in v4:
- Adjust indentation of 'ls1021a, ls1012a, ls1043a, ls1046a'.

Change in v3:
- None.

Change in v2:
- None.

 Documentation/devicetree/bindings/soc/fsl/rcpm.txt | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
index e284e4e..5a33619 100644
--- a/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
+++ b/Documentation/devicetree/bindings/soc/fsl/rcpm.txt
@@ -5,7 +5,7 @@ and power management.
 
 Required properites:
   - reg : Offset and length of the register set of the RCPM block.
-  - fsl,#rcpm-wakeup-cells : The number of IPPDEXPCR register cells in the
+  - #fsl,rcpm-wakeup-cells : The number of IPPDEXPCR register cells in the
fsl,rcpm-wakeup property.
   - compatible : Must contain a chip-specific RCPM block compatible string
and (if applicable) may contain a chassis-version RCPM compatible
@@ -20,6 +20,7 @@ Required properites:
* "fsl,qoriq-rcpm-1.0": for chassis 1.0 rcpm
* "fsl,qoriq-rcpm-2.0": for chassis 2.0 rcpm
* "fsl,qoriq-rcpm-2.1": for chassis 2.1 rcpm
+   * "fsl,qoriq-rcpm-2.1+": for chassis 2.1+ rcpm
 
 All references to "1.0" and "2.0" refer to the QorIQ chassis version to
 which the chip complies.
@@ -27,14 +28,19 @@ Chassis Version Example Chips
 ------
 1.0p4080, p5020, p5040, p2041, p3041
 2.0t4240, b4860, b4420
-2.1t1040, ls1021
+2.1t1040,
+2.1+   ls1021a, ls1012a, ls1043a, ls1046a
+
+Optional properties:
+ - little-endian : RCPM register block is Little Endian. Without it RCPM
+   will be Big Endian (default case).
 
 Example:
 The RCPM node for T4240:
rcpm: global-utilities@e2000 {
compatible = "fsl,t4240-rcpm", "fsl,qoriq-rcpm-2.0";
reg = <0xe2000 0x1000>;
-   fsl,#rcpm-wakeup-cells = <2>;
+   #fsl,rcpm-wakeup-cells = <2>;
};
 
 * Freescale RCPM Wakeup Source Device Tree Bindings
@@ -44,7 +50,7 @@ can be used as a wakeup source.
 
   - fsl,rcpm-wakeup: Consists of a phandle to the rcpm node and the IPPDEXPCR
register cells. The number of IPPDEXPCR register cells is defined in
-   "fsl,#rcpm-wakeup-cells" in the rcpm node. The first register cell is
+   "#fsl,rcpm-wakeup-cells" in the rcpm node. The first register cell is
the bit mask that should be set in IPPDEXPCR0, and the second register
cell is for IPPDEXPCR1, and so on.
 
-- 
2.7.4
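
What the optional 'little-endian' property means for a consumer of the
register block can be sketched in userspace C: the same four raw bytes
decode to different 32-bit values depending on the flag. This is a
minimal illustration, not the driver's register accessors;
fake_rcpm_decode() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode one 32-bit register from its raw bytes (lowest address first).
 * little_endian models the presence of the 'little-endian' property;
 * without it the block is treated as Big Endian (the default case).
 */
static uint32_t fake_rcpm_decode(const uint8_t raw[4], bool little_endian)
{
	if (little_endian)
		return (uint32_t)raw[0] |
		       ((uint32_t)raw[1] << 8) |
		       ((uint32_t)raw[2] << 16) |
		       ((uint32_t)raw[3] << 24);

	return ((uint32_t)raw[0] << 24) |
	       ((uint32_t)raw[1] << 16) |
	       ((uint32_t)raw[2] << 8) |
	       (uint32_t)raw[3];
}
```

Decoding byte by byte keeps the helper independent of the host CPU's own
endianness.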



Re: [PATCH] PCI: Warn about host bridge device when its numa node is NO_NODE

2019-10-23 Thread Yunsheng Lin
On 2019/10/22 21:55, Robin Murphy wrote:
> On 21/10/2019 05:05, Yunsheng Lin wrote:
>> On 2019/10/19 16:34, Christoph Hellwig wrote:
>>> On Sat, Oct 19, 2019 at 02:45:43PM +0800, Yunsheng Lin wrote:
 +if (nr_node_ids > 1 && dev_to_node(bus->bridge) == NUMA_NO_NODE)
 +dev_err(bus->bridge, FW_BUG "No node assigned on NUMA capable HW 
 by BIOS. Please contact your vendor for updates.\n");
 +
>>>
>>> The whole idea of mentioning a BIOS in architeture indepent code doesn't
>>> make sense at all.
> 
> [ Come to think of it, I'm sure an increasing number of x86 firmwares don't 
> even implement a PC BIOS any more... ]
> 
> In all fairness, the server-class Arm-based machines I've come across so far 
> do seem to consistently call their EFI firmware images "BIOS" despite the 
> clear anachronism. At least the absurdity of conflating a system setup 
> program with a semiconductor process seems to have mostly died out ;)
> 
>> Mentioning the BIOS is to tell user what firmware is broken, so that
>> user can report this to their vendor by referring the specific firmware.
>>
>> It seems we can specific the node through different ways(DT, ACPI, etc).
>>
>> Is there a better name for mentioning instead of BIOS, or we should do
>> the checking and warning in the architeture dependent code?
>>
>> Or maybe just remove the BIOS from the above log?
> 
> Even though there may be some degree of historical convention hanging around 
> on ACPI-based systems, that argument almost certainly doesn't hold for 
> OF/FDT/etc. - the "[Firmware Bug]:" prefix is hopefully indicative enough, so 
> I'd say just drop the "by BIOS" part.

Will drop the "by BIOS" part if there is another version.
Thanks for clarifying.

> 
> Robin.
> 
> .
> 



[PATCH] iio: dln2-adc: fix iio_triggered_buffer_postenable() position

2019-10-23 Thread Alexandru Ardelean
The iio_triggered_buffer_postenable() hook should be called first to
attach the poll function. The iio_triggered_buffer_predisable() hook is
called last (as it should be).

This change moves iio_triggered_buffer_postenable() to be called first. It
adds iio_triggered_buffer_predisable() on the error paths of the postenable
hook.
For the predisable hook, some code-paths have been changed to make sure
that the iio_triggered_buffer_predisable() hook gets called in case there
is an error before it.

Signed-off-by: Alexandru Ardelean 
---
 drivers/iio/adc/dln2-adc.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/iio/adc/dln2-adc.c b/drivers/iio/adc/dln2-adc.c
index 5fa78c273a25..65c7c9329b1c 100644
--- a/drivers/iio/adc/dln2-adc.c
+++ b/drivers/iio/adc/dln2-adc.c
@@ -524,6 +524,10 @@ static int dln2_adc_triggered_buffer_postenable(struct iio_dev *indio_dev)
u16 conflict;
unsigned int trigger_chan;
 
+   ret = iio_triggered_buffer_postenable(indio_dev);
+   if (ret)
+   return ret;
+
mutex_lock(&dln2->mutex);
 
/* Enable ADC */
@@ -537,6 +541,7 @@ static int dln2_adc_triggered_buffer_postenable(struct iio_dev *indio_dev)
(int)conflict);
ret = -EBUSY;
}
+   iio_triggered_buffer_predisable(indio_dev);
return ret;
}
 
@@ -550,6 +555,7 @@ static int dln2_adc_triggered_buffer_postenable(struct iio_dev *indio_dev)
mutex_unlock(&dln2->mutex);
if (ret < 0) {
dev_dbg(&dln2->pdev->dev, "Problem in %s\n", __func__);
+   iio_triggered_buffer_predisable(indio_dev);
return ret;
}
} else {
@@ -557,12 +563,12 @@ static int dln2_adc_triggered_buffer_postenable(struct iio_dev *indio_dev)
mutex_unlock(&dln2->mutex);
}
 
-   return iio_triggered_buffer_postenable(indio_dev);
+   return 0;
 }
 
 static int dln2_adc_triggered_buffer_predisable(struct iio_dev *indio_dev)
 {
-   int ret;
+   int ret, ret2;
struct dln2_adc *dln2 = iio_priv(indio_dev);
 
mutex_lock(&dln2->mutex);
@@ -577,12 +583,14 @@ static int dln2_adc_triggered_buffer_predisable(struct iio_dev *indio_dev)
ret = dln2_adc_set_port_enabled(dln2, false, NULL);
 
mutex_unlock(&dln2->mutex);
-   if (ret < 0) {
+   if (ret < 0)
dev_dbg(&dln2->pdev->dev, "Problem in %s\n", __func__);
-   return ret;
-   }
 
-   return iio_triggered_buffer_predisable(indio_dev);
+   ret2 = iio_triggered_buffer_predisable(indio_dev);
+   if (ret == 0)
+   ret = ret2;
+
+   return ret;
 }
 
 static const struct iio_buffer_setup_ops dln2_adc_buffer_setup_ops = {
-- 
2.20.1
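
The ordering rule behind this patch (attach first and undo on failure;
shut the hardware down first and detach last, while still reporting the
first error) can be modelled in userspace. This is a sketch under
assumptions: the fake_* functions are hypothetical stand-ins for the IIO
hooks and the driver's hardware calls, and they return 0 or a negative
error code.

```c
/* Hypothetical device model: injected errors plus observable state. */
struct fake_dev {
	int attach_err;		/* error to return from attach */
	int hw_on_err;		/* error to return from hardware enable */
	int hw_off_err;		/* error to return from hardware disable */
	int detach_err;		/* error to return from detach */
	int attached;		/* poll function currently attached? */
	int hw_on;		/* hardware currently enabled? */
};

static int fake_attach(struct fake_dev *d)
{
	if (d->attach_err)
		return d->attach_err;
	d->attached = 1;
	return 0;
}

static int fake_detach(struct fake_dev *d)
{
	d->attached = 0;
	return d->detach_err;
}

static int fake_hw_enable(struct fake_dev *d)
{
	if (d->hw_on_err)
		return d->hw_on_err;
	d->hw_on = 1;
	return 0;
}

static int fake_hw_disable(struct fake_dev *d)
{
	d->hw_on = 0;
	return d->hw_off_err;
}

/* Attach first; if the hardware step fails, undo the attach. */
static int fake_postenable(struct fake_dev *d)
{
	int ret = fake_attach(d);

	if (ret)
		return ret;

	ret = fake_hw_enable(d);
	if (ret)
		fake_detach(d);
	return ret;
}

/* Disable the hardware, always detach last, report the first error. */
static int fake_predisable(struct fake_dev *d)
{
	int ret, ret2;

	ret = fake_hw_disable(d);
	ret2 = fake_detach(d);
	if (ret == 0)
		ret = ret2;
	return ret;
}
```

Keeping the detach unconditional in the predisable path is the point of
the ret/ret2 pair: an error from the hardware step no longer leaves the
poll function attached.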



[PATCH] iio: at91-sama5d2_adc: fix iio_triggered_buffer_{predisable,postenable} positions

2019-10-23 Thread Alexandru Ardelean
The iio_triggered_buffer_{predisable,postenable} functions attach/detach
poll functions.

The iio_triggered_buffer_postenable() should be called first to attach the
poll function, and then the driver can init the data to be triggered.

Similarly, iio_triggered_buffer_predisable() should be called last to first
disable the data (to be triggered) and then the poll function should be
detached.

For this driver, the predisable and postenable hooks also need to take
the touchscreen into consideration, so the calls need to be placed where
they avoid the code that handles it.

Signed-off-by: Alexandru Ardelean 
---
 drivers/iio/adc/at91-sama5d2_adc.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/iio/adc/at91-sama5d2_adc.c b/drivers/iio/adc/at91-sama5d2_adc.c
index e1850f3d5cf3..ac3e5c4c9840 100644
--- a/drivers/iio/adc/at91-sama5d2_adc.c
+++ b/drivers/iio/adc/at91-sama5d2_adc.c
@@ -889,20 +889,24 @@ static int at91_adc_buffer_postenable(struct iio_dev *indio_dev)
if (!(indio_dev->currentmode & INDIO_ALL_TRIGGERED_MODES))
return -EINVAL;
 
+   ret = iio_triggered_buffer_postenable(indio_dev);
+   if (ret)
+   return ret;
+
/* we continue with the triggered buffer */
ret = at91_adc_dma_start(indio_dev);
if (ret) {
dev_err(&indio_dev->dev, "buffer postenable failed\n");
+   iio_triggered_buffer_predisable(indio_dev);
return ret;
}
 
-   return iio_triggered_buffer_postenable(indio_dev);
+   return 0;
 }
 
 static int at91_adc_buffer_predisable(struct iio_dev *indio_dev)
 {
struct at91_adc_state *st = iio_priv(indio_dev);
-   int ret;
u8 bit;
 
/* check if we are disabling triggered buffer or the touchscreen */
@@ -916,13 +920,8 @@ static int at91_adc_buffer_predisable(struct iio_dev *indio_dev)
if (!(indio_dev->currentmode & INDIO_ALL_TRIGGERED_MODES))
return -EINVAL;
 
-   /* continue with the triggered buffer */
-   ret = iio_triggered_buffer_predisable(indio_dev);
-   if (ret < 0)
-   dev_err(&indio_dev->dev, "buffer predisable failed\n");
-
if (!st->dma_st.dma_chan)
-   return ret;
+   goto out;
 
/* if we are using DMA we must clear registers and end DMA */
dmaengine_terminate_sync(st->dma_st.dma_chan);
@@ -949,7 +948,9 @@ static int at91_adc_buffer_predisable(struct iio_dev *indio_dev)
 
/* read overflow register to clear possible overflow status */
at91_adc_readl(st, AT91_SAMA5D2_OVER);
-   return ret;
+
+out:
+   return iio_triggered_buffer_predisable(indio_dev);
 }
 
 static const struct iio_buffer_setup_ops at91_buffer_setup_ops = {
-- 
2.20.1



Re: [PATCH v12 2/6] mm: Use zone and order instead of free area in free_list manipulators

2019-10-23 Thread David Hildenbrand

On 23.10.19 00:28, Alexander Duyck wrote:

From: Alexander Duyck 

In order to enable the use of the zone from the list manipulator functions
I will need access to the zone pointer. As it turns out most of the
accessors were always just being directly passed &zone->free_area[order]
anyway so it would make sense to just fold that into the function itself
and pass the zone and order as arguments instead of the free area.

In order to be able to reference the zone we need to move the declaration
of the functions down so that we have the zone defined before we define the
list manipulation functions. Since the functions are only used in the file
mm/page_alloc.c we can just move them there to reduce noise in the header.

Reviewed-by: Dan Williams 
Reviewed-by: David Hildenbrand 
Reviewed-by: Pankaj Gupta 
Signed-off-by: Alexander Duyck 
---
  include/linux/mmzone.h |   32 ---
  mm/page_alloc.c|   67 +++-
  2 files changed, 49 insertions(+), 50 deletions(-)


Did you see

https://lore.kernel.org/lkml/20191001152928.27008.8178.stgit@localhost.localdomain/T/#m4d2bc2f37bd7bdc3ae35c4f197905c275d0ad2f9

this time?

And the difference to the old patch is only an empty line.

--

Thanks,

David / dhildenb



[PATCH] iio: hdc100x: fix iio_triggered_buffer_{predisable,postenable} positions

2019-10-23 Thread Alexandru Ardelean
The iio_triggered_buffer_postenable() hook should be called first to
attach the poll function and the iio_triggered_buffer_predisable() hook
should be called last in the predisable hook.

This change updates the driver to attach/detach the poll func in the
correct order.

Signed-off-by: Alexandru Ardelean 
---
 drivers/iio/humidity/hdc100x.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/iio/humidity/hdc100x.c b/drivers/iio/humidity/hdc100x.c
index bfe1cdb16846..963ff043eecf 100644
--- a/drivers/iio/humidity/hdc100x.c
+++ b/drivers/iio/humidity/hdc100x.c
@@ -278,31 +278,34 @@ static int hdc100x_buffer_postenable(struct iio_dev *indio_dev)
struct hdc100x_data *data = iio_priv(indio_dev);
int ret;
 
+   ret = iio_triggered_buffer_postenable(indio_dev);
+   if (ret)
+   return ret;
+
/* Buffer is enabled. First set ACQ Mode, then attach poll func */
mutex_lock(&data->lock);
ret = hdc100x_update_config(data, HDC100X_REG_CONFIG_ACQ_MODE,
HDC100X_REG_CONFIG_ACQ_MODE);
mutex_unlock(&data->lock);
if (ret)
-   return ret;
+   iio_triggered_buffer_predisable(indio_dev);
 
-   return iio_triggered_buffer_postenable(indio_dev);
+   return ret;
 }
 
 static int hdc100x_buffer_predisable(struct iio_dev *indio_dev)
 {
struct hdc100x_data *data = iio_priv(indio_dev);
-   int ret;
-
-   /* First detach poll func, then reset ACQ mode. OK to disable buffer */
-   ret = iio_triggered_buffer_predisable(indio_dev);
-   if (ret)
-   return ret;
+   int ret, ret2;
 
mutex_lock(&data->lock);
ret = hdc100x_update_config(data, HDC100X_REG_CONFIG_ACQ_MODE, 0);
mutex_unlock(&data->lock);
 
+   ret2 = iio_triggered_buffer_predisable(indio_dev);
+   if (ret == 0)
+   ret = ret2;
+
return ret;
 }
 
-- 
2.20.1



Re: [PATCH] arch: microblaze: support for reserved-memory entries in DT

2019-10-23 Thread Alvaro Gamez Machado
Hi Michal

On Wed, Oct 23, 2019 at 09:59:40AM +0200, Michal Simek wrote:
> Hi,
> 
> 
> On 22. 10. 19 10:19, Alvaro Gamez Machado wrote:
> > Signed-off-by: Alvaro Gamez Machado 
> 
> please put there reasonable description to commit message.

Ok, will use those below as template.
 
> > ---
> >  arch/microblaze/mm/init.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
> > index a015a951c8b7..928c5c2816e4 100644
> > --- a/arch/microblaze/mm/init.c
> > +++ b/arch/microblaze/mm/init.c
> > @@ -17,6 +17,8 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> 
> of_fdt.h should be enough.

Ok

> >  
> >  #include 
> >  #include 
> > @@ -188,6 +190,9 @@ void __init setup_memory(void)
> >  
> >  void __init mem_init(void)
> >  {
> > +   early_init_fdt_reserve_self();
> > +   early_init_fdt_scan_reserved_mem();
> > +
> > high_memory = (void *)__va(memory_start + lowmem_size - 1);
> >  
> > /* this will put all memory onto the freelists */
> > 
> 
> 
> Also I have looked at others arch and take a look at
> 
> 1b10cb21d888c021bedbe678f7c26aee1bf04ffa
> ARC: add support for reserved memory defined by device tree
> 
> where they also enable OF_RESERVED_MEM to call fdt_init_reserve_mem()
> 
> The same here
> 4e7c84ec045921dacc78d36295e2e61390249665
>  xtensa: support reserved-memory DT node
> 
> and here
> 9bf14b7c540ae9ca7747af3a0c0d8470ef77b6ce
> arm64: add support for reserved memory defined by device tree
> 

They did that at the time, but it seems it's not needed anymore:

34e04eedd1cf1be714abb0e5976338cc72ccc05f
  of: select OF_RESERVED_MEM automatically

This commit removed those select OF_RESERVED_MEM lines. Is it needed
specifically for microblaze? I didn't need to do that in order for
reserved-memory entries to work on my platform.

Thanks!

> Please note this in commit message.
> 
> Thanks,
> Michal
> 
> 
> -- 
> Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
> w: www.monstr.eu p: +42-0-721842854
> Maintainer of Linux kernel - Xilinx Microblaze
> Maintainer of Linux kernel - Xilinx Zynq ARM and ZynqMP ARM64 SoCs
> U-Boot custodian - Xilinx Microblaze/Zynq/ZynqMP/Versal SoCs
> 

-- 
Alvaro G. M.


[PATCH] can: peak_usb: fix slab info leak

2019-10-23 Thread Johan Hovold
Fix a small slab info leak due to a failure to clear the command buffer
at allocation.

The first 16 bytes of the command buffer are always sent to the device
in pcan_usb_send_cmd() even though only the first two may have been
initialised in case no argument payload is provided (e.g. when waiting
for a response).

Fixes: bb4785551f64 ("can: usb: PEAK-System Technik USB adapters driver core")
Cc: stable  # 3.4
Reported-by: syzbot+863724e7128e14b26...@syzkaller.appspotmail.com
Signed-off-by: Johan Hovold 
---
 drivers/net/can/usb/peak_usb/pcan_usb_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_core.c b/drivers/net/can/usb/peak_usb/pcan_usb_core.c
index 65dce642b86b..0b7766b715fd 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_core.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_core.c
@@ -750,7 +750,7 @@ static int peak_usb_create_dev(const struct peak_usb_adapter *peak_usb_adapter,
dev = netdev_priv(netdev);
 
/* allocate a buffer large enough to send commands */
-   dev->cmd_buf = kmalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL);
+   dev->cmd_buf = kzalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL);
if (!dev->cmd_buf) {
err = -ENOMEM;
goto lbl_free_candev;
-- 
2.23.0
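
A userspace sketch of the leak pattern described in the commit message, not the driver code. CMD_LEN, alloc_cmd_buf(), build_cmd() and tail_is_clean() are hypothetical names, and calloc() stands in for kzalloc(). It shows why a zeroing allocator matters when a fixed-length buffer is sent in full but only its two header bytes are written:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CMD_LEN 16 /* hypothetical stand-in for the number of bytes sent */

/* Allocate a command buffer the way kzalloc() would: zero-filled. */
static unsigned char *alloc_cmd_buf(void)
{
	return calloc(1, CMD_LEN);
}

/*
 * Fill in only the two header bytes, as happens when no argument
 * payload is provided. The remaining bytes keep whatever the
 * allocator left there.
 */
static void build_cmd(unsigned char *buf, unsigned char f, unsigned char n)
{
	assert(buf != NULL);
	buf[0] = f;
	buf[1] = n;
}

/* Return 1 if every byte after the two-byte header is zero. */
static int tail_is_clean(const unsigned char *buf)
{
	static const unsigned char zeros[CMD_LEN - 2];

	return memcmp(buf + 2, zeros, sizeof(zeros)) == 0;
}
```

With malloc() in place of calloc(), the 14 trailing bytes would be indeterminate and tail_is_clean() could not be relied on, which is the userspace analogue of the kmalloc() case the patch replaces.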



Re: [PATCH v1 1/2] usb: chipidea: use of extcon framework to work for non OTG case

2019-10-23 Thread Peter Chen
On 19-10-22 16:54:30, Igor Opaniuk wrote:
> Hi Peter,
> 
> On Tue, Oct 22, 2019 at 5:11 AM Peter Chen  wrote:
> >
> > On 19-10-21 19:16:53, Igor Opaniuk wrote:
> > > From: Stefan Agner 
> > >
> > > The existing usage of extcon in chipidea driver freezes the kernel
> > > presumably due to OTGSC register access.
> > >
> > > Prevent accessing any OTG registers for SoC with dual role devices
> > > but no true OTG support. Use the flag CI_HDRC_DUAL_ROLE_NOT_OTG for
> > > those devices and in case extcon is present, do the role switch
> > > using extcon only.
> >
> > Hi Igor & Stefan,
> >
> > I have several questions about the problem you met:
> > - Which vendor's controller you have used?
> > - What do you mean "no true OTG"? Does it mean no "OTGSC" register?
> 
> Probably the commit message adds a bit of confusion here
> (I've kept the original one from the patch in our downstream kernel,
> but will probably reword it).
> 
> The actual problem is that USB_OTG1_ID pin isn't wired, so we can't rely
> on the value of ID pin state in OTGSC for the role detection.
> In our SoM (Colibri iMX6ULL) ID pin from USB connector is wired
> to SNVS_TAMPER2 which is pinmuxed as GPIO pin (GPIO5_02),
> [1] (this is schematic for the Carrier Board, not SoM (isn't publicly
> available),
> but there is a pretty good explanation + schematic
> in the section "2.3.2.2 USB 2.0 OTG Schematic Example ").

Ok, it is clear now. Then you may not use CI_HDRC_DUAL_ROLE_NOT_OTG, which
is for controllers without OTGSC. For imx6ull, accessing OTGSC will not
hang the system if USB is NOT in suspend mode.

The current upstream design already considers the use case of switching
roles through GPIO, but there is an issue: the external cable wakeup
doesn't work. I will submit a fix later (see the ci_extcon_wakeup_int
implementation in the downstream kernel).

You could try disabling runtime-pm to see whether the behaviour is as
expected; if it is NOT, please report what you observe.

> 
> >
> > >   if (dr_mode == USB_DR_MODE_OTG || dr_mode == USB_DR_MODE_HOST) {
> > >   ret = ci_hdrc_host_init(ci);
> > > @@ -1145,8 +1208,18 @@ static int ci_hdrc_probe(struct platform_device 
> > > *pdev)
> > >
> > >   if (!ci_otg_is_fsm_mode(ci)) {
> > >   /* only update vbus status for peripheral */
> > > - if (ci->role == CI_ROLE_GADGET)
> > > - ci_handle_vbus_change(ci);
> > > + if (dr_mode == USB_DR_MODE_PERIPHERAL) {
> > > + usb_gadget_vbus_connect(&ci->gadget);
> >
> > We only use ci->role at runtime, since it has already considered the
> > dts setting, kernel configuration and hardware setting.
> >
> > > If your controller doesn't have an otgsc register, but does need to support
> > role switch, you may enhance the function ci_get_role
> 
> Btw, ci_get_role() implementation still resides in the NXP downstream kernel
> and I've never seen anything posted to the ML (if it was, could you please
> point me to the patch?). I can introduce the new one, which wraps both OTGSC
> handling + extcon for CI_HDRC_DUAL_ROLE_NOT_OTG controllers.

Sorry about that, I just read code for the upstream kernel with some
downstream patches on it.

> 
> Frankly speaking, I don't know the reason why additional workqueue
> (ci->work_dr) was introduced (will try to reach Stefan regarding this).
> As I see it's valid to call extcon_get_state() from the atomic context,
> so probably using something like ci_get_role()(or ci_detect_role(),
> whatever) instead of explicitly retrieving bits from OTGSC in every ID
> pin check is a good choice.
> 

There is VBUS and ID event handling which is not atomic.

-- 

Thanks,
Peter Chen

Re: [PATCH] perf tools: avoid reading out of scope array

2019-10-23 Thread Jiri Olsa
On Thu, Oct 17, 2019 at 10:05:31AM -0700, Ian Rogers wrote:
> Modify tracepoint name into 2 sys components and assemble at use. This
> avoids the sys_name array being out of scope at the point of use.
> Bug caught with LLVM's address sanitizer with fuzz generated input of
> ":cs\1" to parse_events.
> 
> Signed-off-by: Ian Rogers 
> ---
>  tools/perf/util/parse-events.y | 36 +++---
>  1 file changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/perf/util/parse-events.y b/tools/perf/util/parse-events.y
> index 48126ae4cd13..28be39a703c9 100644
> --- a/tools/perf/util/parse-events.y
> +++ b/tools/perf/util/parse-events.y
> @@ -104,7 +104,8 @@ static void inc_group_count(struct list_head *list,
>   struct list_head *head;
>   struct parse_events_term *term;
>   struct tracepoint_name {
> - char *sys;
> + char *sys1;
> + char *sys2;
>   char *event;
>   } tracepoint_name;
>   struct parse_events_array array;
> @@ -425,9 +426,19 @@ tracepoint_name opt_event_config
>   if (error)
>   error->idx = @1.first_column;
>  
> - if (parse_events_add_tracepoint(list, &parse_state->idx, $1.sys, $1.event,
> - error, $2))
> - return -1;
> +if ($1.sys2) {
> + char sys_name[128];
> + snprintf(&sys_name, sizeof(sys_name), "%s-%s",
> + $1.sys1, $1.sys2);
> + if (parse_events_add_tracepoint(list, &parse_state->idx,
> + sys_name, $1.event,
> + error, $2))
> + return -1;
> +} else
> + if (parse_events_add_tracepoint(list, &parse_state->idx,
> + $1.sys1, $1.event,
> + error, $2))
> + return -1;

nice catch, please enclose all multiline condition legs with {}

other than that

Acked-by: Jiri Olsa 

thanks,
jirka

>  
>   $$ = list;
>  }
> @@ -435,19 +446,22 @@ tracepoint_name opt_event_config
>  tracepoint_name:
>  PE_NAME '-' PE_NAME ':' PE_NAME
>  {
> - char sys_name[128];
> - struct tracepoint_name tracepoint;
> -
> - snprintf(&sys_name, 128, "%s-%s", $1, $3);
> - tracepoint.sys = &sys_name;
> - tracepoint.event = $5;
> + struct tracepoint_name tracepoint = {
> + .sys1 = $1,
> + .sys2 = $3,
> + .event = $5,
> + };
>  
>   $$ = tracepoint;
>  }
>  |
>  PE_NAME ':' PE_NAME
>  {
> - struct tracepoint_name tracepoint = {$1, $3};
> + struct tracepoint_name tracepoint = {
> + .sys1 = $1,
> + .sys2 = NULL,
> + .event = $3,
> + };
>  
>   $$ = tracepoint;
>  }
> -- 
> 2.23.0.700.g56cf767bdb-goog
> 
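
A minimal sketch of the approach the commit message describes, keeping the two components and assembling the name at the point of use, written as plain C rather than as a bison action. struct tp_name and tp_sys_name() are hypothetical names for illustration only. The joined string is written into a buffer owned by the caller, so it cannot go out of scope before it is consumed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the two-component name from the patch. */
struct tp_name {
	const char *sys1;
	const char *sys2; /* NULL when the system name has no '-' part */
	const char *event;
};

/*
 * Assemble "sys1-sys2" (or just "sys1") into a buffer owned by the
 * caller, so the string lives as long as the caller needs it.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int tp_sys_name(const struct tp_name *tp, char *out, size_t len)
{
	int n;

	assert(tp && tp->sys1 && out);
	if (tp->sys2)
		n = snprintf(out, len, "%s-%s", tp->sys1, tp->sys2);
	else
		n = snprintf(out, len, "%s", tp->sys1);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Checking the snprintf() return value for truncation is an addition of this sketch; the quoted patch does not do so.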



Re: [PATCH v2] sched/nohz: Optimize get_nohz_timer_target()

2019-10-23 Thread Thomas Gleixner
On Wed, 23 Oct 2019, Wanpeng Li wrote:
> I didn't see your refactor to get_nohz_timer_target() which you
> mentioned in IRC after four months, I can observe cyclictest drop from
> 4~5us to 8us in kvm guest(we offload the lapic timer emulation to
> housekeeping cpu to avoid timer fire external interrupt on the pCPU
> which vCPU resident incur a vCPU vmexit) w/o this patch in the case of
> there is no busy housekeeping cpu. The score can be recovered after I
> give stress to create a busy housekeeping cpu.
> 
> Could you consider applying this patch temporarily, since I'm not
> sure when the refactor will be ready?

Yeah. It's delayed (again). Will pick that up.

Thanks,

tglx


Re: [PATCH V2] usb: typec: Add sysfs node to show connector orientation

2019-10-23 Thread Heikki Krogerus
+Guenter

On Tue, Oct 22, 2019 at 04:59:24PM +0800, Puma Hsu wrote:
> Export the Type-C connector orientation so that user space
> can get this information.
> 
> Signed-off-by: Puma Hsu 
> ---
>  Documentation/ABI/testing/sysfs-class-typec | 11 +++
>  drivers/usb/typec/class.c   | 18 ++
>  2 files changed, 29 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-class-typec b/Documentation/ABI/testing/sysfs-class-typec
> index d7647b258c3c..b22f71801671 100644
> --- a/Documentation/ABI/testing/sysfs-class-typec
> +++ b/Documentation/ABI/testing/sysfs-class-typec
> @@ -108,6 +108,17 @@ Contact: Heikki Krogerus 
> 
>  Description:
>   Revision number of the supported USB Type-C specification.
>  
> +What:/sys/class/typec//connector_orientation
> +Date:October 2019
> +Contact: Puma Hsu 
> +Description:
> + Indicates which typec connector orientation is configured now.
> + cc1 is defined as "normal" and cc2 is defined as "reversed".
> +
> + Valid value:
> + - unknown (nothing configured)

"unknown" means we do not know the orientation.

> + - normal (configured in cc1 side)
> + - reversed (configured in cc2 side)

Guenter, do you think "connector_orientation" is OK? I proposed it, but
I'm now wondering if something like "polarity" would be better.

>  USB Type-C partner devices (eg. /sys/class/typec/port0-partner/)
>  
> diff --git a/drivers/usb/typec/class.c b/drivers/usb/typec/class.c
> index 94a3eda62add..911d06676aeb 100644
> --- a/drivers/usb/typec/class.c
> +++ b/drivers/usb/typec/class.c
> @@ -1245,6 +1245,23 @@ static ssize_t usb_power_delivery_revision_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RO(usb_power_delivery_revision);
>  
> +static const char * const typec_connector_orientation[] = {
> + [TYPEC_ORIENTATION_NONE]= "unknown",
> + [TYPEC_ORIENTATION_NORMAL]  = "normal",
> + [TYPEC_ORIENTATION_REVERSE] = "reversed",
> +};
> +
> +static ssize_t connector_orientation_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + struct typec_port *p = to_typec_port(dev);
> +
> + return sprintf(buf, "%s\n",
> +typec_connector_orientation[p->orientation]);
> +}
> +static DEVICE_ATTR_RO(connector_orientation);
> +
>  static struct attribute *typec_attrs[] = {
>   &dev_attr_data_role.attr,
>   &dev_attr_power_operation_mode.attr,
> @@ -1255,6 +1272,7 @@ static struct attribute *typec_attrs[] = {
>   &dev_attr_usb_typec_revision.attr,
>   &dev_attr_vconn_source.attr,
>   &dev_attr_port_type.attr,
> + &dev_attr_connector_orientation.attr,
>   NULL,
>  };
>  ATTRIBUTE_GROUPS(typec);

thanks,

-- 
heikki


[PATCH] dc.c:use kzalloc without test

2019-10-23 Thread zhongshiqi
dc.c:583:null check is needed after using kzalloc function

Signed-off-by: zhongshiqi 
---
 drivers/gpu/drm/amd/display/dc/core/dc.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c
index 5d1aded..4b8819c 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc.c
@@ -580,6 +580,10 @@ static bool construct(struct dc *dc,
 #ifdef CONFIG_DRM_AMD_DC_DCN2_0
// Allocate memory for the vm_helper
dc->vm_helper = kzalloc(sizeof(struct vm_helper), GFP_KERNEL);
+   if (!dc->vm_helper) {
+   dm_error("%s: failed to create dc->vm_helper\n", __func__);
+   goto fail;
+   }
 
 #endif
memcpy(&dc->bb_overrides, &init_params->bb_overrides, sizeof(dc->bb_overrides));
-- 
2.9.5



Re: [PATCH] mm/vmstat: Reduce zone lock hold time when reading /proc/pagetypeinfo

2019-10-23 Thread Mel Gorman
On Tue, Oct 22, 2019 at 06:57:45PM +0200, Michal Hocko wrote:
> [Cc Mel]
> 
> On Tue 22-10-19 12:21:56, Waiman Long wrote:
> > The pagetypeinfo_showfree_print() function prints out the number of
> > free blocks for each of the page orders and migrate types. The current
> > > code just iterates over each of the free lists to get counts.  There are
> > > bug reports about hard lockup panics when reading the /proc/pagetypeinfo
> > > file just because it took too long to iterate all the free lists within
> > > a zone while holding the zone lock with irq disabled.
> > > 
> > > Given the fact that /proc/pagetypeinfo is readable by all, the possibility
> > of crashing a system by the simple act of reading /proc/pagetypeinfo
> > by any user is a security problem that needs to be addressed.
> 
> Should we make the file 0400? It is a useful thing when debugging but
> not something regular users would really need for life.
> 

I think this would be useful in general. The information is not that
useful outside of debugging. Even then it's only useful when trying to
get a handle on why a path like compaction is taking too long.

> > There is a free_area structure associated with each page order. There
> > is also a nr_free count within the free_area for all the different
> > migration types combined. Tracking the number of free list entries
> > for each migration type will probably add some overhead to the fast
> > paths like moving pages from one migration type to another which may
> > not be desirable.
> 
> Have you tried to measure that overhead?
>  

I would prefer this option not be taken. It would increase the cost of
watermark calculations which is a relatively fast path.

> > we can actually skip iterating the list of one of the migration types
> > > and use nr_free to compute the missing count. Since MIGRATE_MOVABLE
> > is usually the largest one on large memory systems, this is the one
> > to be skipped. Since the printing order is migration-type => order, we
> > will have to store the counts in an internal 2D array before printing
> > them out.
> > 
> > Even by skipping the MIGRATE_MOVABLE pages, we may still be holding the
> > zone lock for too long blocking out other zone lock waiters from being
> > run. This can be problematic for systems with large amount of memory.
> > So a check is added to temporarily release the lock and reschedule if
> > more than 64k of list entries have been iterated for each order. With
> > a MAX_ORDER of 11, the worst case will be iterating about 700k of list
> > entries before releasing the lock.
> 
> But you are still iterating through the whole free_list at once so if it
> gets really large then this is still possible. I think it would be
> preferable to use per migratetype nr_free if it doesn't cause any
> regressions.
> 

I think it will. The patch as it is contains the overhead within the
reader of the pagetypeinfo proc file, which is a non-critical path. The
page allocator paths, on the other hand, are very important.

-- 
Mel Gorman
SUSE Labs
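
A minimal sketch of the counting trick from the quoted patch description: walk every free list except one and derive the skipped type's count from the combined nr_free total. The enum values and derive_skipped() are hypothetical names for illustration, not kernel identifiers, and the clamp to zero is a defensive choice of this sketch rather than something the thread specifies:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of migration types, for illustration only. */
enum { MT_UNMOVABLE, MT_MOVABLE, MT_RECLAIMABLE, MT_TYPES };

/*
 * Given the per-order total (nr_free) and the counts obtained by
 * walking every list except `skip`, derive the skipped type's count.
 */
static unsigned long derive_skipped(unsigned long nr_free,
				    const unsigned long counted[MT_TYPES],
				    int skip)
{
	unsigned long walked = 0;
	int mt;

	assert(skip >= 0 && skip < MT_TYPES);
	for (mt = 0; mt < MT_TYPES; mt++)
		if (mt != skip)
			walked += counted[mt];

	/* Counts may be taken at different times; clamp instead of underflowing. */
	return nr_free > walked ? nr_free - walked : 0;
}
```

The cost stays entirely in the reader of the statistics: the allocator's fast paths keep a single combined counter per order.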


Re: [PATCH] ASoC: mediatek: Check SND_SOC_CROS_EC_CODEC dependency

2019-10-23 Thread Tzung-Bi Shih
On Wed, Oct 23, 2019 at 2:31 PM Mao Wenan  wrote:
>
> If SND_SOC_MT8183_MT6358_TS3A227E_MAX98357A=y,
> below errors can be seen:
> sound/soc/codecs/cros_ec_codec.o: In function `send_ec_host_command':
> cros_ec_codec.c:(.text+0x534): undefined reference to `cros_ec_cmd_xfer_status'
> cros_ec_codec.c:(.text+0x101c): undefined reference to `cros_ec_get_host_event'
>
> This is because it will select SND_SOC_CROS_EC_CODEC
> after commit 2cc3cd5fdc8b ("ASoC: mediatek: mt8183: support WoV"),
> but SND_SOC_CROS_EC_CODEC depends on CROS_EC.
>
> Fixes: 2cc3cd5fdc8b ("ASoC: mediatek: mt8183: support WoV")
> Signed-off-by: Mao Wenan 
> ---
>  sound/soc/mediatek/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/sound/soc/mediatek/Kconfig b/sound/soc/mediatek/Kconfig
> index 8b29f39..a656d20 100644
> --- a/sound/soc/mediatek/Kconfig
> +++ b/sound/soc/mediatek/Kconfig
> @@ -125,7 +125,7 @@ config SND_SOC_MT8183_MT6358_TS3A227E_MAX98357A
> select SND_SOC_MAX98357A
> select SND_SOC_BT_SCO
> select SND_SOC_TS3A227E
> -   select SND_SOC_CROS_EC_CODEC
> +   select SND_SOC_CROS_EC_CODEC if CROS_EC
> help
>   This adds ASoC driver for Mediatek MT8183 boards
>   with the MT6358 TS3A227E MAX98357A audio codec.
> --
> 2.7.4
>

Just realized your patch does not seem to show up on the list
(https://mailman.alsa-project.org/pipermail/alsa-devel/2019-October/thread.html).
I have no idea why.


Re: [PATCH v2 10/11] gpio: pca953x: Convert to use bitmap API

2019-10-23 Thread Geert Uytterhoeven
Hi Andy,

On Wed, Oct 23, 2019 at 10:01 AM Andy Shevchenko
 wrote:
> On Tue, Oct 22, 2019 at 08:03:00PM +0200, Geert Uytterhoeven wrote:
> > On Tue, Oct 22, 2019 at 7:29 PM Andy Shevchenko
> >  wrote:
> > > Instead of customized approach convert the driver to use bitmap API.
>
> > >  #define MAX_BANK 5
> > >  #define BANK_SZ 8
> > > +#define MAX_LINE   (MAX_BANK * BANK_SZ)
> >
> > Given (almost) everything is now bitmap (i.e. long [])-based, you might
> > as well increase MAX_BANK to a multiple of 4 or 8, e.g. 8.
>
> We can do it any time when we will really need it.

True. Especially as there's no real need to do it now.
(sorry, my mind mixed this up with gpio-74x164...)

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH] perf tests: Fix a typo

2019-10-23 Thread Leo Yan
Correct typo in comment: s/suck/stuck.

Signed-off-by: Leo Yan 
---
 tools/perf/tests/bp_signal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index 166f411568a5..415903b48578 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -295,7 +295,7 @@ bool test__bp_signal_is_supported(void)
 * breakpointed instruction.
 *
 * Since arm64 has the same issue with arm for the single-step
-* handling, this case also gets suck on the breakpointed
+* handling, this case also gets stuck on the breakpointed
 * instruction.
 *
 * Just disable the test for these architectures until these
-- 
2.17.1



Re: [PATCH v2 1/9] perf tools: add parse events append error

2019-10-23 Thread Jiri Olsa
On Tue, Oct 22, 2019 at 05:53:29PM -0700, Ian Rogers wrote:
> Parse event error handling may overwrite one error string with another
> creating memory leaks and masking errors. Introduce a helper routine
> that appends error messages and avoids the memory leak.

good idea, it became a little messy with time ;-)
some comments below

thanks,
jirka


> 
> Signed-off-by: Ian Rogers 
> ---
>  tools/perf/util/parse-events.c | 102 ++---
>  tools/perf/util/parse-events.h |   2 +
>  tools/perf/util/pmu.c  |  36 ++--
>  3 files changed, 89 insertions(+), 51 deletions(-)
> 
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index db882f630f7e..4d42344698b8 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -182,6 +182,34 @@ static int tp_event_has_id(const char *dir_path, struct dirent *evt_dir)
>  
>  #define MAX_EVENT_LENGTH 512
>  
> +void parse_events__append_error(struct parse_events_error *err, int idx,
> + char *str, char *help)
> +{
> + char *new_str = NULL;
> +
> + WARN(!str, "WARNING: failed to provide error string");

should we also bail out if str is NULL?

> + if (err->str) {
> + int ret;
> +
> + if (err->help)
> + ret = asprintf(&new_str,
> + "%s (previous error: %s(help: %s))",
> + str, err->str, err->help);
> + else

please use {} for multiline condition legs

> + ret = asprintf(&new_str,
> + "%s (previous error: %s)",
> + str, err->str);

does this actually happen? could you please provide output
of this in the changelog?

> + if (ret < 0)
> + new_str = NULL;
> + else
> + zfree(&str);
> + }
> + err->idx = idx;
> + free(err->str);
> + err->str = new_str ?: str;
> + free(err->help);
> + err->help = help;
> +}
>  

SNIP
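
A simplified userspace sketch of the append idea under review, reduced to a single string and using only standard C (malloc() and snprintf() instead of asprintf()). struct err_state, dup_str() and err_append() are hypothetical names, not the perf API. It also bails out on a NULL message, as asked in the review comment above:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct err_state {	/* hypothetical, reduced to one string */
	char *str;
};

/* Portable heap copy of a string (avoids relying on strdup()). */
static char *dup_str(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/*
 * Record `msg` (heap-allocated, ownership passes to `err`). If an
 * earlier message exists, fold it into the new one rather than
 * dropping it. Bails out on a NULL message.
 */
static void err_append(struct err_state *err, char *msg)
{
	assert(err != NULL);
	if (!msg)
		return;

	if (err->str) {
		size_t len = strlen(msg) + strlen(err->str) +
			     sizeof(" (previous error: )");
		char *joined = malloc(len);

		if (joined) {
			snprintf(joined, len, "%s (previous error: %s)",
				 msg, err->str);
			free(msg);
			msg = joined;
		}
		free(err->str);
	}
	err->str = msg;
}
```

If the allocation for the joined string fails, the sketch falls back to keeping only the newest message, which mirrors the `new_str ?: str` fallback in the quoted patch.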



Re: [PATCH 3/3] vhost, kcov: collect coverage from vhost_worker

2019-10-23 Thread Dmitry Vyukov
On Tue, Oct 22, 2019 at 6:46 PM Andrey Konovalov  wrote:
>
> This patch adds kcov_remote_start()/kcov_remote_stop() annotations to the
> vhost_worker() function, which is responsible for processing vhost works.
> Since vhost_worker() threads are spawned per vhost device instance
> the common kcov handle is used for kcov_remote_start()/stop() annotations
> (see Documentation/dev-tools/kcov.rst for details). As the result kcov can
> now be used to collect coverage from vhost worker threads.
>
> Signed-off-by: Andrey Konovalov 
> ---
>  drivers/vhost/vhost.c | 6 ++
>  drivers/vhost/vhost.h | 1 +
>  2 files changed, 7 insertions(+)
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index 36ca2cf419bf..a5a557c4b67f 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -30,6 +30,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "vhost.h"
>
> @@ -357,7 +358,9 @@ static int vhost_worker(void *data)
> llist_for_each_entry_safe(work, work_next, node, node) {
> clear_bit(VHOST_WORK_QUEUED, &work->flags);
> __set_current_state(TASK_RUNNING);
> +   kcov_remote_start(dev->kcov_handle);
> work->fn(work);
> +   kcov_remote_stop();
> if (need_resched())
> schedule();
> }
> @@ -546,6 +549,7 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
>
> /* No owner, become one */
> dev->mm = get_task_mm(current);
> +   dev->kcov_handle = current->kcov_handle;

kcov_handle is not present in task_struct if !CONFIG_KCOV

Also this does not use KCOV_SUBSYSTEM_COMMON.
We discussed something along the following lines:

u64 kcov_remote_handle(u64 subsys, u64 id)
{
  WARN_ON(subsys or id has wrong bits set).
  return ...;
}

kcov_remote_handle(KCOV_SUBSYSTEM_USB, bus);
kcov_remote_handle(KCOV_SUBSYSTEM_COMMON, current->kcov_handle);
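
A runnable sketch of the helper outlined above. The bit layout (subsystem in the top byte, instance id in the low four bytes), the macro names and remote_handle() are assumptions made for illustration; the message only says that the handle combines a subsystem and an id and that wrong bits should be caught. Returning 0 for an invalid combination stands in for the WARN_ON():

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: top byte = subsystem, low 4 bytes = instance. */
#define SUBSYS_MASK   (0xffull << 56)
#define INSTANCE_MASK (0xffffffffull)
#define SUBSYS_COMMON (0x00ull << 56)
#define SUBSYS_USB    (0x01ull << 56)

/* Combine subsystem and instance id; return 0 if either has stray bits. */
static uint64_t remote_handle(uint64_t subsys, uint64_t id)
{
	/* The two fields must not overlap for the encoding to be reversible. */
	assert((SUBSYS_MASK & INSTANCE_MASK) == 0);

	if ((subsys & ~SUBSYS_MASK) || (id & ~INSTANCE_MASK))
		return 0;
	return subsys | id;
}
```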


> worker = kthread_create(vhost_worker, dev, "vhost-%d", current->pid);
> if (IS_ERR(worker)) {
> err = PTR_ERR(worker);
> @@ -571,6 +575,7 @@ long vhost_dev_set_owner(struct vhost_dev *dev)
> if (dev->mm)
> mmput(dev->mm);
> dev->mm = NULL;
> +   dev->kcov_handle = 0;
>  err_mm:
> return err;
>  }
> @@ -682,6 +687,7 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
> if (dev->worker) {
> kthread_stop(dev->worker);
> dev->worker = NULL;
> +   dev->kcov_handle = 0;
> }
> if (dev->mm)
> mmput(dev->mm);
> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
> index e9ed2722b633..a123fd70847e 100644
> --- a/drivers/vhost/vhost.h
> +++ b/drivers/vhost/vhost.h
> @@ -173,6 +173,7 @@ struct vhost_dev {
> int iov_limit;
> int weight;
> int byte_weight;
> +   u64 kcov_handle;
>  };
>
>  bool vhost_exceeds_weight(struct vhost_virtqueue *vq, int pkts, int total_len);
> --
> 2.23.0.866.gb869b98d4c-goog
>


[PATCH 0/1] Add support for setting MMIO PREF hotplug bridge size

2019-10-23 Thread Nicholas Johnson
This patch adds support for two new kernel parameters. This patch has
been in the making for quite some time, and has changed several times
based on feedback.

I realised I was making the mistake of putting it as part of my
Thunderbolt patch series. Although the other patches in the series are
very important for my goal, I realised that they are just a heap of
patches that are not Thunderbolt-specific. The only thing that is
Thunderbolt-related is the intended use case.

I hope that posting this alone can ease the difficulty of reviewing it.

Nicholas Johnson (1):
  PCI: Add hp_mmio_size and hp_mmio_pref_size parameters

 .../admin-guide/kernel-parameters.txt |  9 ++-
 drivers/pci/pci.c | 17 ++---
 drivers/pci/pci.h |  3 ++-
 drivers/pci/setup-bus.c   | 25 +++
 4 files changed, 38 insertions(+), 16 deletions(-)

-- 
2.23.0



Re: [PATCH 0/3] kcov: collect coverage from usb and vhost

2019-10-23 Thread Dmitry Vyukov
On Tue, Oct 22, 2019 at 6:46 PM Andrey Konovalov  wrote:
>
> This patchset extends kcov to allow collecting coverage from the USB
> subsystem and vhost workers. See the first patch description for details
> about the kcov extension. The other two patches apply this kcov extension
> to USB and vhost.
>
> These patches have been used to enable coverage-guided USB fuzzing with
> syzkaller for the last few years, see the details here:
>
> https://github.com/google/syzkaller/blob/master/docs/linux/external_fuzzing_usb.md
>
> This patchset has been pushed to the public Linux kernel Gerrit instance:
>
> https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/1524

Oh, so much easier to review with side-by-side diffs, context and
smart in-line colouring!

> Changes from RFC v1:
> - Remove unnecessary #ifdef's from drivers/vhost/vhost.c.
> - Reset t->kcov when area allocation fails in kcov_remote_start().
> - Use struct_size to calculate array size in kcov_ioctl().
> - Add a limit on area_size in kcov_remote_arg.
> - Added kcov_disable() helper.
> - Changed encoding of kcov remote handle ids, see the documentation.
> - Added a comment reference for kcov_sequence task_struct field.
> - Change common_handle type to u32.
> - Add checks for handle validity into kcov_ioctl_locked() and
> kcov_remote_start().
> - Updated documentation to reflect the changes.
>
> Andrey Konovalov (3):
>   kcov: remote coverage support
>   usb, kcov: collect coverage from hub_event
>   vhost, kcov: collect coverage from vhost_worker
>
>  Documentation/dev-tools/kcov.rst | 120 
>  drivers/usb/core/hub.c   |   5 +
>  drivers/vhost/vhost.c|   6 +
>  drivers/vhost/vhost.h|   1 +
>  include/linux/kcov.h |   6 +
>  include/linux/sched.h|   6 +
>  include/uapi/linux/kcov.h|  20 ++
>  kernel/kcov.c| 464 ---
>  8 files changed, 593 insertions(+), 35 deletions(-)
>
> --
> 2.23.0.866.gb869b98d4c-goog
>


[PATCH 1/1] PCI: Add hp_mmio_size and hp_mmio_pref_size parameters

2019-10-23 Thread Nicholas Johnson
Add kernel parameter pci=hpmmiosize=nn[KMG] to set MMIO bridge window
size for hotplug bridges.

Add kernel parameter pci=hpmmioprefsize=nn[KMG] to set MMIO_PREF bridge
window size for hotplug bridges.

Leave pci=hpmemsize=nn[KMG] unchanged, to prevent disruptions to
existing users. This sets both MMIO and MMIO_PREF to the same size.

The two new parameters conform to the style of pci=hpiosize=nn[KMG].

Signed-off-by: Nicholas Johnson 
---
 .../admin-guide/kernel-parameters.txt |  9 ++-
 drivers/pci/pci.c | 17 ++---
 drivers/pci/pci.h |  3 ++-
 drivers/pci/setup-bus.c   | 25 +++
 4 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a84a83f88..cfe8c2b67 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3492,8 +3492,15 @@
hpiosize=nn[KMG]The fixed amount of bus space which is
reserved for hotplug bridge's IO window.
Default size is 256 bytes.
+   hpmmiosize=nn[KMG]  The fixed amount of bus space which is
+   reserved for hotplug bridge's MMIO window.
+   Default size is 2 megabytes.
+   hpmmioprefsize=nn[KMG]  The fixed amount of bus space which is
+   reserved for hotplug bridge's MMIO_PREF window.
+   Default size is 2 megabytes.
hpmemsize=nn[KMG]   The fixed amount of bus space which is
-   reserved for hotplug bridge's memory window.
+   reserved for hotplug bridge's MMIO and
+   MMIO_PREF window.
Default size is 2 megabytes.
hpbussize=nnThe minimum amount of additional bus numbers
reserved for buses below a hotplug bridge.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a..f3adab84b 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -85,10 +85,12 @@ unsigned long pci_cardbus_io_size = DEFAULT_CARDBUS_IO_SIZE;
 unsigned long pci_cardbus_mem_size = DEFAULT_CARDBUS_MEM_SIZE;
 
 #define DEFAULT_HOTPLUG_IO_SIZE(256)
-#define DEFAULT_HOTPLUG_MEM_SIZE   (2*1024*1024)
+#define DEFAULT_HOTPLUG_MMIO_SIZE  (2*1024*1024)
+#define DEFAULT_HOTPLUG_MMIO_PREF_SIZE (2*1024*1024)
 /* pci=hpmemsize=nnM,hpiosize=nn can override this */
 unsigned long pci_hotplug_io_size  = DEFAULT_HOTPLUG_IO_SIZE;
-unsigned long pci_hotplug_mem_size = DEFAULT_HOTPLUG_MEM_SIZE;
+unsigned long pci_hotplug_mmio_size = DEFAULT_HOTPLUG_MMIO_SIZE;
+unsigned long pci_hotplug_mmio_pref_size = DEFAULT_HOTPLUG_MMIO_PREF_SIZE;
 
 #define DEFAULT_HOTPLUG_BUS_SIZE   1
 unsigned long pci_hotplug_bus_size = DEFAULT_HOTPLUG_BUS_SIZE;
@@ -6286,8 +6288,17 @@ static int __init pci_setup(char *str)
pcie_ecrc_get_policy(str + 5);
} else if (!strncmp(str, "hpiosize=", 9)) {
pci_hotplug_io_size = memparse(str + 9, &str);
+   } else if (!strncmp(str, "hpmmiosize=", 11)) {
+   pci_hotplug_mmio_size =
+   memparse(str + 11, &str);
+   } else if (!strncmp(str, "hpmmioprefsize=", 15)) {
+   pci_hotplug_mmio_pref_size =
+   memparse(str + 15, &str);
} else if (!strncmp(str, "hpmemsize=", 10)) {
-   pci_hotplug_mem_size = memparse(str + 10, &str);
+   pci_hotplug_mmio_size =
+   memparse(str + 10, &str);
+   pci_hotplug_mmio_pref_size =
+   memparse(str + 10, &str);
} else if (!strncmp(str, "hpbussize=", 10)) {
pci_hotplug_bus_size =
simple_strtoul(str + 10, &str, 0);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 3f6947ee3..9faa55a15 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -218,7 +218,8 @@ extern const struct device_type pci_dev_type;
 extern const struct attribute_group *pci_bus_groups[];
 
 extern unsigned long pci_hotplug_io_size;
-extern unsigned long pci_hotplug_mem_size;
+extern unsigned long pci_hotplug_mmio_size;
+extern unsigned long pci_hotplug_mmio_pref_size;
 extern unsigned long pci_hotplug_bus_size;
 
 /**
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index e7dbe2170..24fc4c715 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup
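
A userspace sketch of the option semantics described in the commit message: two independent sizes, plus a legacy option that sets both. parse_size() is a simplified stand-in for the kernel's memparse(), and apply_opt() and the two variables are hypothetical names. The legacy option is parsed once and the result copied, so the outcome does not depend on how the parser advances through the string:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static unsigned long long mmio_size      = 2ull << 20; /* 2 MiB default */
static unsigned long long mmio_pref_size = 2ull << 20;

/* Simplified userspace stand-in for the kernel's memparse(). */
static unsigned long long parse_size(const char *s)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g': v <<= 10; /* fall through */
	case 'M': case 'm': v <<= 10; /* fall through */
	case 'K': case 'k': v <<= 10;
	}
	return v;
}

/* Apply one option; returns 1 if it was recognised, 0 otherwise. */
static int apply_opt(const char *opt)
{
	assert(opt != NULL);
	if (!strncmp(opt, "hpmmiosize=", 11)) {
		mmio_size = parse_size(opt + 11);
	} else if (!strncmp(opt, "hpmmioprefsize=", 15)) {
		mmio_pref_size = parse_size(opt + 15);
	} else if (!strncmp(opt, "hpmemsize=", 10)) {
		/* Parse once, then assign the same value to both windows. */
		mmio_size = parse_size(opt + 10);
		mmio_pref_size = mmio_size;
	} else {
		return 0;
	}
	return 1;
}
```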

Re: [RFC PATCH 11/13] led: bd71828: Support LED outputs on ROHM BD71828 PMIC

2019-10-23 Thread Vaittinen, Matti
Morning Jacek,

Thanks for the reply again. I did some cleaning to this mail as it was
getting lengthy.

On Tue, 2019-10-22 at 19:40 +0200, Jacek Anaszewski wrote:
> Matti,
> 
> On 10/22/19 2:40 PM, Vaittinen, Matti wrote:
> > Hello Jacek,
> > 
> > Thanks for the clarifications. I think I now understand the LED
> > subsystem a bit better :)
> > 
> > On Mon, 2019-10-21 at 21:09 +0200, Jacek Anaszewski wrote:
> > > Hi Matti,
> > > 
> > > On 10/21/19 10:00 AM, Vaittinen, Matti wrote:
> > > > Hello Dan,
> > > > 
> > > > Thanks for taking the time to check my driver :) I truly appreciate
> > > > all the help!
> > > > 
> > > > A "fundamental question" regarding these review comments is whether
> > > > I should add DT entries for these LEDs or not. I thought I shouldn't
> > > > but I would like to get a comment from Rob regarding it.
> > > 
> > > If the LED controller is a part of MFD device probed from DT then
> > > there is no doubt it should have corresponding DT sub-node.
> > 
> > Sorry but I still see not much benefit from adding this information in
> > DT. Why should it have corresponding DT-node if the LED properties are
> > fixed and if we only wish to allow user-space control and have no
> > dependencies to other devices in DT?
> > 
> > In this specific case the information we can provide from DT is
> > supposed to be fixed. No board based variation. Furthermore, there is
> > not much generic driver/led core functionality which would be able to
> > parse and utilize relevant information from DT. I think we can only
> > give the name (function) and colour. And they are supposed to be fixed
> > and thus could be just hard-coded in driver. Hard-coding these would be
> > simpler and less error prone for users (no DT bindings to write) and
> > simpler to create and probably also to maintain (no separate binding
> > documents needed for LEDs).
> 
> AFAICS it is possible to connect LED of arbitrary color to the iouts
> of this device. If this is the case then it is justified to have DT
> node only to allow for LED name customization.

In theory, yes. In practice (if I understand it correctly) the color in
this case is only visible in sysfs path name. I am not at all sure that
reflecting the (unlikely) color change in path name is worth the
hassle. Besides - if this happens, then the driver and DT can be
changed. It is easier to add DT entries than remove them. If you see
the color change support as really crucial - then I could even consider
defaulting the colours to amber and green if no colour property is
present in DT. I see no point in _requiring_ the DT entry to be there.
If we like being prepared for the theoretical possibilities - what if
x86 is used to control this PMIC? I guess we wouldn't have DT there
then (And no - I don't see such use-case).

> > But assuming this is Ok to DT-folks and if you insist - I will add
> > LED information to DT for the next patches. Hopefully this extra
> > complexity helps in some oddball use-case which I can't foresee =)
> > 
> > Then what comes to the DT format.
> > 
> > Do you think LED subsystem should try to follow the convention with
> > other sub-systems and not introduce multiple compatibles for single
> > device? MFD can handle instantiating the sub-devices just fine even
> > when sub-devices have no own compatible property or of_match. Maybe
> > we should also avoid unnecessary sub-nodes when they are not really
> > required.
> 
> This is beyond my scope of responsibility. It is MFD subsystem thing
> to choose the way of LED class driver instantiation. When it comes to
> LED subsystem - it expects single compatible pertaining to a physical
> device.

Sorry but I don't quite follow. What does the LED subsystem do with the
compatible property? How does it expect this?

> Nonetheless, so far we used to have separate compatibles for drivers
> of MFD devices' LED cells. If we are going to change that I'd like to
> see explicit DT maintainer's statement confirming that.

I don't expect that existing DTs would be changed. But as I said, the
consensus amongst most of the subsystem maintainers and DT maintainers
seems to be that sub-devices should not have their own compatibles. I
hope Rob acks this here - but knowing he is a busy guy I add some old
discussions from which I have gathered my understanding:

BD71837 - first patch where regulators had a compatible - Mark (the
regulator maintainer) instructed me to drop it:
https://lore.kernel.org/linux-clk/20180524140118.gs4...@sirena.org.uk/

And here Stephen (the clk subsystem maintainer) told me to drop the
whole clocks sub-node (including the compatible):
https://lore.kernel.org/linux-clk/152777867392.144038.18188452389972834...@swboyd.mtv.corp.google.com/


> And one benefit of having separate nodes per MFD cells is that we can
> easily discern the support for which cells is to be turned on.

We don't want to do DT modifications to drop so

Re: [PATCH v3 1/2] arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear

2019-10-23 Thread liwei (GF)
Hi Marc,

On 2019/10/2 17:06, Marc Zyngier wrote:
> The GICv3 architecture specification is incredibly misleading when it
> comes to PMR and the requirement for a DSB. It turns out that this DSB
> is only required if the CPU interface sends an Upstream Control
> message to the redistributor in order to update the RD's view of PMR.
> 
> This message is only sent when ICC_CTLR_EL1.PMHE is set, which isn't
> the case in Linux. It can still be set from EL3, so some special care
> is required. But the upshot is that in the (hopefuly large) majority
> of the cases, we can drop the DSB altogether.
> 
> This relies on a new static key being set if the boot CPU has PMHE
> set. The drawback is that this static key has to be exported to
> modules.
> 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: James Morse 
> Cc: Julien Thierry 
> Cc: Suzuki K Poulose 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/include/asm/barrier.h   | 12 
>  arch/arm64/include/asm/daifflags.h |  3 ++-
>  arch/arm64/include/asm/irqflags.h  | 19 ++-
>  arch/arm64/include/asm/kvm_host.h  |  3 +--
>  arch/arm64/kernel/entry.S  |  6 --
>  arch/arm64/kvm/hyp/switch.c|  4 ++--
>  drivers/irqchip/irq-gic-v3.c   | 20 
>  include/linux/irqchip/arm-gic-v3.h |  2 ++
>  8 files changed, 53 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index e0e2b1946f42..7d9cc5ec4971 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -29,6 +29,18 @@
>SB_BARRIER_INSN"nop\n",
> \
>ARM64_HAS_SB))
>  
> +#ifdef CONFIG_ARM64_PSEUDO_NMI
> +#define pmr_sync()   \
> + do {\
> + extern struct static_key_false gic_pmr_sync;\
> + \
> + if (static_branch_unlikely(&gic_pmr_sync))  \
> + dsb(sy);\
> + } while(0)
> +#else
> +#define pmr_sync()   do {} while (0)
> +#endif
> +

Thank you for solving this problem, it helps a lot indeed.

pmr_sync() will call dsb(sy) when ARM64_PSEUDO_NMI=y and gic_pmr_sync=force,
but if pseudo NMI is not enabled through the boot option, it just adds one more
redundant call than before at the following two places.

I think changing dsb(sy) to
+   asm volatile(ALTERNATIVE("nop", "dsb sy",   \
+   ARM64_HAS_IRQ_PRIO_MASKING) \
+   : : : "memory");\
may be more appropriate.

Thanks,
Wei

>  
> @@ -34,14 +35,14 @@ static inline void arch_local_irq_enable(void)
>   }
>  
>   asm volatile(ALTERNATIVE(
> - "msr daifclr, #2 // arch_local_irq_enable\n"
> - "nop",
> - __msr_s(SYS_ICC_PMR_EL1, "%0")
> - "dsb sy",
> + "msr daifclr, #2 // arch_local_irq_enable",
> + __msr_s(SYS_ICC_PMR_EL1, "%0"),
>   ARM64_HAS_IRQ_PRIO_MASKING)
>   :
>   : "r" ((unsigned long) GIC_PRIO_IRQON)
>   : "memory");
> +
> + pmr_sync();
>  }
>  
>  static inline void arch_local_irq_disable(void)
> @@ -116,14 +117,14 @@ static inline unsigned long arch_local_irq_save(void)
>  static inline void arch_local_irq_restore(unsigned long flags)
>  {
>   asm volatile(ALTERNATIVE(
> - "msr daif, %0\n"
> - "nop",
> - __msr_s(SYS_ICC_PMR_EL1, "%0")
> - "dsb sy",
> - ARM64_HAS_IRQ_PRIO_MASKING)
> + "msr daif, %0",
> + __msr_s(SYS_ICC_PMR_EL1, "%0"),
> + ARM64_HAS_IRQ_PRIO_MASKING)
>   :
>   : "r" (flags)
>   : "memory");
> +
> + pmr_sync();
>  }
>  



Re: [PATCH v4] arm64: dts: imx8mq: Init rates and parents configs for clocks

2019-10-23 Thread Leonard Crestez
On 2019-10-23 9:29 AM, Viorel Suman wrote:
> On Mi, 2019-08-21 at 20:39 +, Leonard Crestez wrote:
>> The audio PLLs should run below 650 mHz so please use 393216000 and
>> 361267200 instead of 786432000 and 722534400. For the 8mm equivalent see
>> commit 053a4ffe2988 ("clk: imx: imx8mm: fix audio pll setting").
> 
> Hi Leonard,
> 
> Audio PLL IP on 8mm and 8mn is different than the Audio PLL IP on 8mq,
> so the requirement to run below 650 MHZ may not apply to 8mq.

This "max 650 MHz" limit is from internal ADD and is also mentioned for 
imx8mq.

Peng: you made the change in our internal tree, can you confirm this 
requirement also applies to 8mq?

Viorel: Is there any impact on audio from running the PLL at 393216000 vs
786432000? As far as I know this rate goes through various dividers anyway.
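
The arithmetic behind the two suggested rates can be checked with a small
user-space C sketch. The helper names and the AUDIO_PLL_MAX_HZ constant are
illustrative only (the constant is the limit quoted in this thread); this is
not kernel code, and whether the limit applies to 8mq is still the open
question above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Limit quoted in this thread for the audio PLLs (assumed: 650 MHz). */
#define AUDIO_PLL_MAX_HZ 650000000ULL

/* Hypothetical helper: true if the rate is strictly below the quoted limit. */
bool audio_pll_rate_ok(uint64_t rate_hz)
{
	return rate_hz < AUDIO_PLL_MAX_HZ;
}

/*
 * Hypothetical helper: how many times a sample rate fits into a PLL rate,
 * or 0 if the PLL rate is not an exact multiple of the sample rate.
 */
uint64_t audio_pll_multiple(uint64_t rate_hz, uint64_t sample_rate_hz)
{
	assert(sample_rate_hz != 0);
	return (rate_hz % sample_rate_hz == 0) ? rate_hz / sample_rate_hz : 0;
}
```

Under these assumptions 393216000 and 361267200 pass the check while
786432000 and 722534400 do not, and the two lower rates are exactly half of
the higher ones. They remain exact multiples of 48000 and 44100 respectively.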

--
Regards,
Leonard


Re: [PATCH] ASoC: mediatek: Check SND_SOC_CROS_EC_CODEC dependency

2019-10-23 Thread maowenan



On 2019/10/23 16:32, Tzung-Bi Shih wrote:
> On Wed, Oct 23, 2019 at 2:31 PM Mao Wenan  wrote:
>>
>> If SND_SOC_MT8183_MT6358_TS3A227E_MAX98357A=y,
>> below errors can be seen:
>> sound/soc/codecs/cros_ec_codec.o: In function `send_ec_host_command':
>> cros_ec_codec.c:(.text+0x534): undefined reference to 
>> `cros_ec_cmd_xfer_status'
>> cros_ec_codec.c:(.text+0x101c): undefined reference to 
>> `cros_ec_get_host_event'
>>
>> This is because it will select SND_SOC_CROS_EC_CODEC
>> after commit 2cc3cd5fdc8b ("ASoC: mediatek: mt8183: support WoV"),
>> but SND_SOC_CROS_EC_CODEC depends on CROS_EC.
>>
>> Fixes: 2cc3cd5fdc8b ("ASoC: mediatek: mt8183: support WoV")
>> Signed-off-by: Mao Wenan 
>> ---
>>  sound/soc/mediatek/Kconfig | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/sound/soc/mediatek/Kconfig b/sound/soc/mediatek/Kconfig
>> index 8b29f39..a656d20 100644
>> --- a/sound/soc/mediatek/Kconfig
>> +++ b/sound/soc/mediatek/Kconfig
>> @@ -125,7 +125,7 @@ config SND_SOC_MT8183_MT6358_TS3A227E_MAX98357A
>> select SND_SOC_MAX98357A
>> select SND_SOC_BT_SCO
>> select SND_SOC_TS3A227E
>> -   select SND_SOC_CROS_EC_CODEC
>> +   select SND_SOC_CROS_EC_CODEC if CROS_EC
>> help
>>   This adds ASoC driver for Mediatek MT8183 boards
>>   with the MT6358 TS3A227E MAX98357A audio codec.
>> --
>> 2.7.4
>>
> 
> Just realized your patch seems not showing in the list
> (https://mailman.alsa-project.org/pipermail/alsa-devel/2019-October/thread.html).
> I have no idea why.
> 
I received the message below after posting; do you know why?
'''
Your mail to 'Alsa-devel' with the subject

[PATCH] ASoC: mediatek: Check SND_SOC_CROS_EC_CODEC dependency

Is being held until the list moderator can review it for approval.

The reason it is being held:

Post by non-member to a members-only list

Either the message will get posted to the list, or you will receive
notification of the moderator's decision.  If you would like to cancel
this posting, please visit the following URL:


https://mailman.alsa-project.org/mailman/confirm/alsa-devel/574c24ad00f4d1aefc802a8a4b2c5fbda710e4e9
'''

> .
> 



Re: [PATCH v2 2/9] perf tools: splice events onto evlist even on error

2019-10-23 Thread Jiri Olsa
On Tue, Oct 22, 2019 at 05:53:30PM -0700, Ian Rogers wrote:
> If event parsing fails the event list is leaked, instead splice the list
> onto the out result and let the caller cleanup.
> 
> Signed-off-by: Ian Rogers 
> ---
>  tools/perf/util/parse-events.c | 17 +++--
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 4d42344698b8..a8f8801bd127 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1962,15 +1962,20 @@ int parse_events(struct evlist *evlist, const char 
> *str,
>  
>   ret = parse_events__scanner(str, &parse_state, PE_START_EVENTS);
>   perf_pmu__parse_cleanup();
> +

I don't understand.. is there something on the list in case we fail?

> + if (list_empty(&parse_state.list)) {
> + WARN_ONCE(true, "WARNING: event parser found nothing\n");
> + return -1;
> + }

this will display an extra warning message in the failure case:

[jolsa@krava perf]$ ./perf record -e krava ls
WARNING: event parser found nothing
event syntax error: 'krava'
 \___ parser error

we don't want that

jirka

> +
> + /*
> +  * Add list to the evlist even with errors to allow callers to clean up.
> +  */
> + perf_evlist__splice_list_tail(evlist, &parse_state.list);
> +
>   if (!ret) {
>   struct evsel *last;
>  
> - if (list_empty(&parse_state.list)) {
> - WARN_ONCE(true, "WARNING: event parser found 
> nothing\n");
> - return -1;
> - }
> -
> - perf_evlist__splice_list_tail(evlist, &parse_state.list);
>   evlist->nr_groups += parse_state.nr_groups;
>   last = evlist__last(evlist);
>   last->cmdline_group_boundary = true;
> -- 
> 2.23.0.866.gb869b98d4c-goog
> 



Re: [PATCH v5.1 RESEND] dt-bindings: hwrng: Add Samsung Exynos 5250+ True RNG bindings

2019-10-23 Thread Herbert Xu
On Wed, Oct 23, 2019 at 10:16:48AM +0200, Andreas Färber wrote:
>
> For some reason this text file in linux-next is lonely in devicetree/...
> rather than living in Documentation/devicetree/... - please fix that.
> The patch here looks correct, so not sure what went wrong:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/devicetree/bindings/rng/samsung,exynos5250-trng.txt?h=next-20191023&id=85552c22f03c9066c33f26f34538b67fee6a91a8

It's because the patch

https://patchwork.kernel.org/patch/11181265/

was generated at the wrong level (p0 instead of p1).

I'll fix this up in my tree.  Thanks for the heads up.
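
For readers unfamiliar with strip levels, here is a minimal user-space sketch
of how "patch -pN" derives the target file name: it drops the smallest prefix
of the path containing N slashes. The function name is hypothetical, and the
model ignores details such as runs of adjacent slashes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper modelling the -pN strip level: skip past the first
 * N slashes of the path. Returns a pointer into the original string, or
 * NULL (a choice of this model) if the path has fewer than N slashes.
 */
const char *strip_path_components(const char *path, int n)
{
	assert(path != NULL && n >= 0);

	while (n-- > 0) {
		const char *slash = strchr(path, '/');

		if (!slash)
			return NULL;
		path = slash + 1;
	}
	return path;
}
```

Under this model, a name of the form Documentation/devicetree/bindings/...
stripped with N=1 becomes devicetree/bindings/..., which would be consistent
with the misplaced file reported above, assuming the patch carried no leading
a/ or b/ component.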

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


RE: [PATCH] ALSA: hda/realtek - Fix 2 front mics of codec 0x623

2019-10-23 Thread Kailang



> -Original Message-
> From: Takashi Iwai 
> Sent: Wednesday, October 23, 2019 12:08 AM
> To: Aaron Ma 
> Cc: pe...@perex.cz; Kailang ;
> hui.w...@canonical.com; alsa-de...@alsa-project.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] ALSA: hda/realtek - Fix 2 front mics of codec 0x623
> 
> On Tue, 22 Oct 2019 17:38:55 +0200,
> Aaron Ma wrote:
> >
> > These 2 ThinkCentres installed a new realtek codec ID 0x623, it has 2
> > front mics with the same location on pin 0x18 and 0x19.
> >
> > Apply fixup ALC283_FIXUP_HEADSET_MIC to change 1 front mic location to
> > right, then pulseaudio can handle them.
> > One "Front Mic" and one "Mic" will be shown, and audio output works
> > fine.
> >
> > Signed-off-by: Aaron Ma 
> 
> I'd like to have Kailang's review about the new codec before applying.
> 
> Kailang, could you take a look?
OK.
I will post you the patch for ALC623 codec tomorrow.
Thanks.

> 
> 
> thanks,
> 
> Takashi
> 
> > ---
> >  sound/pci/hda/patch_realtek.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/sound/pci/hda/patch_realtek.c
> > b/sound/pci/hda/patch_realtek.c index b000b36ac3c6..c34d8b435f58
> > 100644
> > --- a/sound/pci/hda/patch_realtek.c
> > +++ b/sound/pci/hda/patch_realtek.c
> > @@ -7186,6 +7186,8 @@ static const struct snd_pci_quirk alc269_fixup_tbl[]
> = {
> > SND_PCI_QUIRK(0x17aa, 0x312f, "ThinkCentre Station",
> ALC294_FIXUP_LENOVO_MIC_LOCATION),
> > SND_PCI_QUIRK(0x17aa, 0x313c, "ThinkCentre Station",
> ALC294_FIXUP_LENOVO_MIC_LOCATION),
> > SND_PCI_QUIRK(0x17aa, 0x3151, "ThinkCentre Station",
> > ALC283_FIXUP_HEADSET_MIC),
> > +   SND_PCI_QUIRK(0x17aa, 0x3178, "ThinkCentre Station",
> ALC283_FIXUP_HEADSET_MIC),
> > +   SND_PCI_QUIRK(0x17aa, 0x3176, "ThinkCentre Station",
> > +ALC283_FIXUP_HEADSET_MIC),
> > SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80",
> ALC269_FIXUP_DMIC_THINKPAD_ACPI),
> > SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210",
> ALC283_FIXUP_INT_MIC),
> > SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70",
> > ALC269_FIXUP_DMIC_THINKPAD_ACPI), @@ -9187,6 +9189,7 @@ static
> const struct hda_device_id snd_hda_id_realtek[] = {
> > HDA_CODEC_ENTRY(0x10ec0298, "ALC298", patch_alc269),
> > HDA_CODEC_ENTRY(0x10ec0299, "ALC299", patch_alc269),
> > HDA_CODEC_ENTRY(0x10ec0300, "ALC300", patch_alc269),
> > +   HDA_CODEC_ENTRY(0x10ec0623, "ALC623", patch_alc269),
> > HDA_CODEC_REV_ENTRY(0x10ec0861, 0x100340, "ALC660",
> patch_alc861),
> > HDA_CODEC_ENTRY(0x10ec0660, "ALC660-VD", patch_alc861vd),
> > HDA_CODEC_ENTRY(0x10ec0861, "ALC861", patch_alc861),
> > --
> > 2.17.1
> >
> 


[PATCH] mm, meminit: Recalculate pcpu batch and high limits after init completes -fix

2019-10-23 Thread Mel Gorman
LKP reported the following build problem from two hunks that did not
survive the reordering of the series.

 ld: mm/page_alloc.o: in function `page_alloc_init_late':
 mm/page_alloc.c:1956: undefined reference to `zone_pcp_update'

This is a fix for the mmotm patch
mm-meminit-recalculate-pcpu-batch-and-high-limits-after-init-completes.patch

Reported-by: kbuild test robot 
Signed-off-by: Mel Gorman 
---
 mm/page_alloc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f9488efff680..12f3ce09d33d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8627,7 +8627,6 @@ void free_contig_range(unsigned long pfn, unsigned int 
nr_pages)
WARN(count != 0, "%d pages are still in use!\n", count);
 }
 
-#ifdef CONFIG_MEMORY_HOTPLUG
 /*
  * The zone indicated has a new number of managed_pages; batch sizes and percpu
  * page high values need to be recalulated.
@@ -8638,7 +8637,6 @@ void __meminit zone_pcp_update(struct zone *zone)
__zone_pcp_update(zone);
mutex_unlock(&pcp_batch_high_lock);
 }
-#endif
 
 void zone_pcp_reset(struct zone *zone)
 {


[PATCH v2 0/3] phy: cadence: j721e-wiz: Add Type-C plug flip support

2019-10-23 Thread Roger Quadros
Hi,

On J721e platform, the 2 lanes of SERDES PHY are used to achieve
USB Type-C plug flip support without any additional MUX component
by using a lane swap feature.

However, the driver needs to know the Type-C plug orientation before
it can decide whether to swap the lanes or not. This is achieved via a
GPIO named DIR.

Another constraint is that the lane swap must happen only when the PHY
is in the inactive state. This is achieved by sampling the GPIO and
programming the lane swap before bringing the PHY out of reset.

This series adds support to read the GPIO and accordingly program
the lane swap for Type-C plug flip support.
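
The ordering constraint described above can be sketched as a small user-space
model. The struct and function names are hypothetical and are not the
driver's actual types or API. The model also assumes that a high DIR level
means "swap lanes"; the real polarity is not stated here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the PHY wrapper state; not the driver's types. */
struct wiz_model {
	bool in_reset;   /* PHY is held in reset (inactive) */
	bool lane_swap;  /* lanes are swapped for a flipped plug */
};

/*
 * Program the lane swap from the sampled DIR GPIO level.
 * Refuses to change anything once the PHY is out of reset.
 * Returns 0 on success, -1 if the PHY is already active.
 */
int wiz_model_set_lane_swap(struct wiz_model *wiz, bool dir_gpio_level)
{
	assert(wiz != NULL);

	if (!wiz->in_reset)
		return -1;
	wiz->lane_swap = dir_gpio_level;
	return 0;
}

/* Sample-then-release: the swap is programmed before reset is deasserted. */
int wiz_model_power_up(struct wiz_model *wiz, bool dir_gpio_level)
{
	int ret = wiz_model_set_lane_swap(wiz, dir_gpio_level);

	if (ret)
		return ret;
	wiz->in_reset = false;
	return 0;
}
```

In this model a later attempt to change the swap after power-up is rejected,
which mirrors the "only when the PHY is inactive" constraint.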

Series must be applied on top of
https://lkml.org/lkml/2019/10/16/517

cheers,
-roger

Changelog:
v2
- revise commit log of patch 1
- use regmap_field in patch 3

Roger Quadros (3):
  phy: cadence: Sierra: add phy_reset hook
  dt-bindings: phy: ti,phy-j721e-wiz: Add Type-C dir GPIO
  phy: ti: j721e-wiz: Manage typec-gpio-dir

 .../bindings/phy/ti,phy-j721e-wiz.txt |  9 
 drivers/phy/cadence/phy-cadence-sierra.c  | 10 
 drivers/phy/ti/phy-j721e-wiz.c| 48 +++
 3 files changed, 67 insertions(+)

-- 
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


