date:20151111

Re: [PATCH] x86/mm: fix regression with huge pages on PAE

2015-11-11 Thread Kirill A. Shutemov

On Thu, Nov 12, 2015 at 08:48:54AM +0100, Ingo Molnar wrote:
> 
> * Borislav Petkov  wrote:
> 
> > --- a/arch/x86/include/asm/pgtable_types.h
> > +++ b/arch/x86/include/asm/pgtable_types.h
> > @@ -279,17 +279,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
> >  static inline pudval_t pud_pfn_mask(pud_t pud)
> >  {
> > if (native_pud_val(pud) & _PAGE_PSE)
> > -   return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > +   return ~((1ULL << PUD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > else
> > return PTE_PFN_MASK;
> >  }
> 
> >  static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
> >  {
> > if (native_pmd_val(pmd) & _PAGE_PSE)
> > -   return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > +   return ~((1ULL << PMD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > else
> > return PTE_PFN_MASK;
> >  }
> 
> So instead of uglifying the code, why not fix the real bug: change the 
> PMD_PAGE_MASK/PUD_PAGE_MASK definitions to be 64-bit everywhere?

*PAGE_MASK are usually applied to virtual addresses. I don't think they
should be anything but 'unsigned long'. This is an odd use case really.
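
For illustration, a minimal userspace sketch of the truncation under
discussion. It assumes a 32-bit 'unsigned long' (modelled here with
uint32_t) and a made-up 44-bit physical mask. PMD_SHIFT_EXAMPLE,
PHYS_MASK_EXAMPLE and both function names are hypothetical and are not
the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical values, chosen only to make the effect visible. */
#define PMD_SHIFT_EXAMPLE	21
#define PHYS_MASK_EXAMPLE	((1ULL << 44) - 1)

/* Page mask built in a 32-bit type, as on a PAE build. */
static uint64_t pfn_mask_narrow(void)
{
	uint32_t page_mask = ~((UINT32_C(1) << PMD_SHIFT_EXAMPLE) - 1);

	/* page_mask is zero-extended to 64 bits before the AND. */
	return page_mask & PHYS_MASK_EXAMPLE;
}

/* Page mask built in a 64-bit type, as in the patch quoted above. */
static uint64_t pfn_mask_wide(void)
{
	return ~((1ULL << PMD_SHIFT_EXAMPLE) - 1) & PHYS_MASK_EXAMPLE;
}
```

In this sketch the narrow variant yields 0xffe00000 and so drops every
physical address bit above bit 31, while the wide variant keeps bits 21
to 43.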

-- 
 Kirill A. Shutemov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v4 2/2] leds: rt5033: Add RT5033 Flash led device driver

2015-11-11 Thread Ingi Kim

Hi Jacek,

Thanks for the review.
Your feedback is highly appreciated :)
I'll send the next patch set soon.

On 2015-11-11 01:30, Jacek Anaszewski wrote:
> Hi Ingi,
> 
> Thanks for the update. Please find my comments below.
> 
> On 11/10/2015 03:17 AM, Ingi Kim wrote:
>> This patch adds a device driver for the Richtek RT5033 PMIC.
>> The driver supports a current-regulated output to drive white LEDs.
> 
> I would add here also the part from leds-rt5033.txt header.
> 

Okay, I will add it.

>>
>> Signed-off-by: Ingi Kim 
>> ---
>>   drivers/leds/Kconfig   |   8 +
>>   drivers/leds/Makefile  |   1 +
>>   drivers/leds/leds-rt5033.c | 502 +
>>   include/linux/mfd/rt5033-private.h |  51 
>>   4 files changed, 562 insertions(+)
>>   create mode 100644 drivers/leds/leds-rt5033.c
>>
>> diff --git a/drivers/leds/Kconfig b/drivers/leds/Kconfig
>> index 42990f2..29613e3 100644
>> --- a/drivers/leds/Kconfig
>> +++ b/drivers/leds/Kconfig
>> @@ -345,6 +345,14 @@ config LEDS_PCA963X
>> LED driver chip accessed via the I2C bus. Supported
>> devices include PCA9633 and PCA9634
>>
>> +config LEDS_RT5033
>> +tristate "LED support for RT5033 PMIC"
>> +depends on LEDS_CLASS_FLASH && OF
>> +depends on MFD_RT5033
>> +help
>> +  This option enables support for on-chip LED driver on
>> +  RT5033 PMIC.
>> +
>>   config LEDS_WM831X_STATUS
>>   tristate "LED support for status LEDs on WM831x PMICs"
>>   depends on LEDS_CLASS
>> diff --git a/drivers/leds/Makefile b/drivers/leds/Makefile
>> index b503f92..bcc4d93 100644
>> --- a/drivers/leds/Makefile
>> +++ b/drivers/leds/Makefile
>> @@ -23,6 +23,7 @@ obj-$(CONFIG_LEDS_COBALT_QUBE)+= leds-cobalt-qube.o
>>   obj-$(CONFIG_LEDS_COBALT_RAQ)+= leds-cobalt-raq.o
>>   obj-$(CONFIG_LEDS_SUNFIRE)+= leds-sunfire.o
>>   obj-$(CONFIG_LEDS_PCA9532)+= leds-pca9532.o
>> +obj-$(CONFIG_LEDS_RT5033)+= leds-rt5033.o
>>   obj-$(CONFIG_LEDS_GPIO_REGISTER)+= leds-gpio-register.o
>>   obj-$(CONFIG_LEDS_GPIO)+= leds-gpio.o
>>   obj-$(CONFIG_LEDS_LP3944)+= leds-lp3944.o
>> diff --git a/drivers/leds/leds-rt5033.c b/drivers/leds/leds-rt5033.c
>> new file mode 100644
>> index 000..eb89731
>> --- /dev/null
>> +++ b/drivers/leds/leds-rt5033.c
>> @@ -0,0 +1,502 @@
>> +/*
>> + * led driver for RT5033
>> + *
>> + * Copyright (C) 2015 Samsung Electronics, Co., Ltd.
>> + * Ingi Kim 
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#define RT5033_LED_FLASH_TIMEOUT_MIN 64000
>> +#define RT5033_LED_FLASH_TIMEOUT_STEP 32000
>> +#define RT5033_LED_FLASH_BRIGHTNESS_MIN 5
>> +#define RT5033_LED_FLASH_BRIGHTNESS_MAX_1CH 60
>> +#define RT5033_LED_FLASH_BRIGHTNESS_MAX_2CH 80
>> +#define RT5033_LED_FLASH_BRIGHTNESS_STEP 25000
>> +#define RT5033_LED_TORCH_BRIGHTNESS_MIN 12500
>> +#define RT5033_LED_TORCH_BRIGHTNESS_STEP 12500
>> +
>> +/* Macro to get offset of rt5033_led_config_data */
>> +#define RT5033_LED_CONFIG_DATA_OFFSET(val, step, min) (((val) - (min)) \
>> +	/ (step))
>> +#define MIN(a, b) ((a) > (b) ? (b) : (a))
> 
> Please use min() macro from include/linux/kernel.h
> 

Oh, I'll check.
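
One reason to avoid a hand-rolled MIN() like the one quoted above is
that it evaluates one of its arguments twice. A minimal userspace
sketch of that hazard follows. NAIVE_MIN, next_value and min_int are
hypothetical names, and min_int is only a stand-in: the kernel's min()
macro is implemented differently, although it also evaluates each
argument once.

```c
/* Same shape as the MIN() macro in the patch. */
#define NAIVE_MIN(a, b)	((a) > (b) ? (b) : (a))

static int calls;

/* Argument with a side effect: counts how often it is evaluated. */
static int next_value(void)
{
	return ++calls;
}

/* Each argument is evaluated exactly once before the call. */
static inline int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Number of next_value() calls made through the macro. */
static int calls_with_macro(void)
{
	calls = 0;
	(void)NAIVE_MIN(next_value(), 10);
	return calls;
}

/* Number of next_value() calls made through the function. */
static int calls_with_function(void)
{
	calls = 0;
	(void)min_int(next_value(), 10);
	return calls;
}
```

Here the macro form calls next_value() twice, because the smaller
operand is expanded a second time in the result position, while the
function form calls it once.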

>> +#define FLED1_IOUT (BIT(0))
>> +#define FLED2_IOUT (BIT(1))
>> +
>> +enum rt5033_fled {
>> +FLED1,
>> +FLED2,
>> +};
>> +
>> +struct rt5033_sub_led {
>> +enum rt5033_fled fled_id;
>> +struct led_classdev_flash fled_cdev;
>> +struct work_struct work_brightness_set;
>> +
>> +u32 torch_brightness;
>> +u32 flash_brightness;
>> +u32 flash_timeout;
>> +};
>> +
>> +/* RT5033 Flash led platform data */
>> +struct rt5033_led {
>> +struct device *dev;
>> +struct mutex lock;
>> +struct regmap *regmap;
>> +struct rt5033_sub_led sub_leds[2];
>> +
>> +u32 iout_torch_max[2];
>> +u32 iout_flash_max[2];
> 
> You're not using the above two properties anywhere in the driver.
> 

Thanks, I'll remove them.

>> +u8 fled_mask;
>> +
>> +/* arrangement of current outputs */
>> +bool iout_joint;
>> +};
>> +
>> +struct rt5033_led_config_data {
>> +const char *label[2];
>> +u32 flash_max_microamp[2];
>> +u32 flash_max_timeout[2];
>> +u32 torch_max_microamp[2];
>> +u32 num_leds;
>> +};
>> +
>> +static struct rt5033_sub_led *flcdev_to_sub_led(
>> +struct led_classdev_flash *fled_cdev)
>> +{
>> +return container_of(fled_cdev, struct rt5033_sub_led, fled_cdev);
>> +}
>> +
>> +static struct rt5033_led *sub_led_to_led(struct rt5033_sub_led *sub_led)
>> +{
>> +return container_of(sub_led, struct rt5033_le

[PATCH] mm: change mm_vmscan_lru_shrink_inactive() proto types

2015-11-11 Thread yalin wang

Move the node_id, zone_idx and shrink flags computation into the trace
function, so that we don't need to calculate these arguments if the trace
is disabled. This also gives the function fewer arguments.

Signed-off-by: yalin wang 
---
 include/trace/events/vmscan.h | 14 +++---
 mm/vmscan.c   |  7 ++-
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index dae7836..f8d6b34 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -352,11 +352,11 @@ TRACE_EVENT(mm_vmscan_writepage,
 
 TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
 
-   TP_PROTO(int nid, int zid,
-   unsigned long nr_scanned, unsigned long nr_reclaimed,
-   int priority, int reclaim_flags),
+   TP_PROTO(struct zone *zone,
+   unsigned long nr_scanned, unsigned long nr_reclaimed,
+   int priority, int file),
 
-   TP_ARGS(nid, zid, nr_scanned, nr_reclaimed, priority, reclaim_flags),
+   TP_ARGS(zone, nr_scanned, nr_reclaimed, priority, file),
 
TP_STRUCT__entry(
__field(int, nid)
@@ -368,12 +368,12 @@ TRACE_EVENT(mm_vmscan_lru_shrink_inactive,
),
 
TP_fast_assign(
-   __entry->nid = nid;
-   __entry->zid = zid;
+   __entry->nid = zone->zone_pgdat->node_id;
+   __entry->zid = zone_idx(zone);
__entry->nr_scanned = nr_scanned;
__entry->nr_reclaimed = nr_reclaimed;
__entry->priority = priority;
-   __entry->reclaim_flags = reclaim_flags;
+   __entry->reclaim_flags = trace_shrink_flags(file);
),
 
TP_printk("nid=%d zid=%d nr_scanned=%ld nr_reclaimed=%ld priority=%d flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 83cea53..bd2918e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1691,11 +1691,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
current_may_throttle())
wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);
 
-   trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
-   zone_idx(zone),
-   nr_scanned, nr_reclaimed,
-   sc->priority,
-   trace_shrink_flags(file));
+   trace_mm_vmscan_lru_shrink_inactive(zone, nr_scanned, nr_reclaimed,
+   sc->priority, file);
return nr_reclaimed;
 }
 
-- 
1.9.1
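
The idea in this patch can be modelled in plain userspace C. The sketch
below is an assumption-laden illustration, not the tracepoint
machinery: trace_enabled stands in for the tracepoint's enabled state,
and compute_flags, trace_old, trace_new and the two counting helpers
are hypothetical names. It only shows that deriving a value inside the
enabled branch skips the work when tracing is off. Whether a real
compiler already avoids that work at an inlined call site is not
something this sketch shows.

```c
#include <stdbool.h>

/* Stand-in for the tracepoint's enabled state. */
static bool trace_enabled;
/* Counts how many times the flags are derived. */
static int flag_computations;
/* Last value "recorded" by the trace function. */
static int last_flags;

static int compute_flags(int file)
{
	flag_computations++;
	return file ? 0x2 : 0x1;
}

/* Old shape: the caller derives the flags and passes the result in. */
static void trace_old(int flags)
{
	if (trace_enabled)
		last_flags = flags;
}

/* New shape: the caller passes the raw input, derived only if enabled. */
static void trace_new(int file)
{
	if (trace_enabled)
		last_flags = compute_flags(file);
}

static int computations_old(bool enabled)
{
	trace_enabled = enabled;
	flag_computations = 0;
	trace_old(compute_flags(1));
	return flag_computations;
}

static int computations_new(bool enabled)
{
	trace_enabled = enabled;
	flag_computations = 0;
	trace_new(1);
	return flag_computations;
}
```

With tracing disabled, the old shape still derives the flags once and
the new shape does not derive them at all.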


Re: [PATCH] i2c: taos-evm: replace simple_strtoul by kstrtou8

2015-11-11 Thread LABBE Corentin

On Thu, Nov 12, 2015 at 08:46:43AM +0100, Uwe Kleine-König wrote:
> On Thu, Nov 12, 2015 at 08:26:33AM +0100, LABBE Corentin wrote:
> > The simple_strtoul function is marked as obsolete.
> > This patch replaces it with kstrtou8.
> > 
> > Signed-off-by: LABBE Corentin 
> > ---
> >  drivers/i2c/busses/i2c-taos-evm.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/i2c/busses/i2c-taos-evm.c b/drivers/i2c/busses/i2c-taos-evm.c
> > index 4c7fc2d..fe2b705 100644
> > --- a/drivers/i2c/busses/i2c-taos-evm.c
> > +++ b/drivers/i2c/busses/i2c-taos-evm.c
> > @@ -70,6 +70,7 @@ static int taos_smbus_xfer(struct i2c_adapter *adapter, u16 addr,
> > struct serio *serio = adapter->algo_data;
> > struct taos_data *taos = serio_get_drvdata(serio);
> > char *p;
> > +   int err;
> >  
> > /* Encode our transaction. "@" is for the device address, "$" for the
> >SMBus command and "#" for the data. */
> > @@ -130,7 +131,9 @@ static int taos_smbus_xfer(struct i2c_adapter *adapter, u16 addr,
> > return 0;
> > } else {
> > if (p[0] == 'x') {
> > -   data->byte = simple_strtol(p + 1, NULL, 16);
> > +   err = kstrtou8(p + 1, 16, &data->byte);
> > +   if (err)
> > +   return -EPROTO;
> > return 0;
> 
> This is nearly equivalent to the probably more correct:
> 
>   return kstrtou8(p + 1, 16, &data->byte);
> 

As reported by Jean Delvare, kstrtou8 could return -EINVAL.
That is why I "drop" the return code from kstrtou8 and return -EPROTO, as
suggested by Jean.

I hesitated to add a comment for this, but it seems necessary after all.
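
A userspace sketch of the error mapping described above, for
illustration only. parse_hex_u8 is a hypothetical stand-in built on
strtoul(); it is not kstrtou8 and differs from it in details such as
accepted prefixes and whitespace. decode_byte_payload is likewise a
hypothetical name for the step that turns any parse failure into
-EPROTO.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Strict hex parse of a whole string into one byte (stand-in helper). */
static int parse_hex_u8(const char *s, uint8_t *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &end, 16);
	if (errno || end == s || *end != '\0')
		return -EINVAL;		/* not a clean hex number */
	if (v > UINT8_MAX)
		return -ERANGE;		/* does not fit in a byte */

	*out = (uint8_t)v;
	return 0;
}

/*
 * The payload comes from the device, so any parse failure is reported
 * as a protocol error instead of leaking -EINVAL or -ERANGE upward.
 */
static int decode_byte_payload(const char *payload, uint8_t *byte)
{
	if (parse_hex_u8(payload, byte))
		return -EPROTO;
	return 0;
}
```

Returning the parser's code directly would hand -EINVAL or -ERANGE to
the caller, which is the behaviour the -EPROTO mapping avoids.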

Regards


Re: [PATCH 3/3] x86, ras: Add mcsafe_memcpy() function to recover from machine checks

2015-11-11 Thread Ingo Molnar


* Tony Luck  wrote:

> Using __copy_user_nocache() as inspiration create a memory copy
> routine for use by kernel code with annotations to allow for
> recovery from machine checks.
> 
> Notes:
> 1) Unlike the original we make no attempt to copy all the bytes
>up to the faulting address. The original achieves that by
>re-executing the failing part as a byte-by-byte copy,
>which will take another page fault. We don't want to have
>a second machine check!
> 2) Likewise the return value for the original indicates exactly
>how many bytes were not copied. Instead we provide the physical
>address of the fault (thanks to help from do_machine_check()

> +extern phys_addr_t mcsafe_memcpy(void *dst, const void __user *src,
> + unsigned size);

So what's the longer term purpose, where will mcsafe_memcpy() be used?

Thanks,

Ingo

Re: [PATCH] i2c: rcar: fix a possible NULL dereference

2015-11-11 Thread Wolfram Sang

On Thu, Nov 12, 2015 at 08:44:47AM +0100, Uwe Kleine-König wrote:
> Hello,
> 
> On Thu, Nov 12, 2015 at 08:25:09AM +0100, LABBE Corentin wrote:
> > of_match_device could return NULL, and so cause a NULL pointer
> > dereference later.
> > 
> > Reported-by: coverity (CID 1130036)
> > Signed-off-by: LABBE Corentin 
> > ---
> >  drivers/i2c/busses/i2c-rcar.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c
> > index b0ae560..d2bdbda 100644
> > --- a/drivers/i2c/busses/i2c-rcar.c
> > +++ b/drivers/i2c/busses/i2c-rcar.c
> > @@ -639,6 +639,7 @@ static int rcar_i2c_probe(struct platform_device *pdev)
> > struct device *dev = &pdev->dev;
> > u32 bus_speed;
> > int irq, ret;
> > +   const struct of_device_id *of_id;
> >  
> > priv = devm_kzalloc(dev, sizeof(struct rcar_i2c_priv), GFP_KERNEL);
> > if (!priv)
> > @@ -653,7 +654,10 @@ static int rcar_i2c_probe(struct platform_device *pdev)
> > bus_speed = 10; /* default 100 kHz */
> > of_property_read_u32(dev->of_node, "clock-frequency", &bus_speed);
> >  
> > -   priv->devtype = (enum rcar_i2c_type)of_match_device(rcar_i2c_dt_ids, dev)->data;
> > +   of_id = of_match_device(rcar_i2c_dt_ids, dev);
> > +   if (!of_id)
> > +   return -ENODEV;
> > +   priv->devtype = (enum rcar_i2c_type)of_id->data;
> 
> This is nearly an open coding of of_device_get_match_data. Maybe using
> 
>   priv->devtype = (enum rcar_i2c_type)of_device_get_match_data(dev)
> 
> > is good enough?
> 
> > Other than that, the NULL pointer dereference should only happen if the
> > device was bound using the driver name. That might be worth pointing out
> > in the commit log. So maybe make (in a separate patch) the probe
> > function fail when probed by name?

RCar is a DT only platform.




[PATCH] mfd: qcom_rpm: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Signed-off-by: LABBE Corentin 
---
 drivers/mfd/qcom_rpm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mfd/qcom_rpm.c b/drivers/mfd/qcom_rpm.c
index 207a3bd..1be47ad 100644
--- a/drivers/mfd/qcom_rpm.c
+++ b/drivers/mfd/qcom_rpm.c
@@ -495,6 +495,8 @@ static int qcom_rpm_probe(struct platform_device *pdev)
}
 
match = of_match_device(qcom_rpm_of_match, &pdev->dev);
+   if (!match)
+   return -ENODEV;
rpm->data = match->data;
 
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-- 
2.4.10


[PATCH] mtd: nand: atmel_nand: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Signed-off-by: LABBE Corentin 
---
 drivers/mtd/nand/atmel_nand.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c
index 475c938..f3cf68b 100644
--- a/drivers/mtd/nand/atmel_nand.c
+++ b/drivers/mtd/nand/atmel_nand.c
@@ -1495,9 +1495,12 @@ static int atmel_of_init_port(struct atmel_nand_host *host,
int ecc_mode;
struct atmel_nand_data *board = &host->board;
enum of_gpio_flags flags = 0;
+   const struct of_device_id *of_id;
 
-   host->caps = (struct atmel_nand_caps *)
-   of_match_device(atmel_nand_dt_ids, host->dev)->data;
+   of_id = of_match_device(atmel_nand_dt_ids, host->dev);
+   if (!of_id)
+   return -ENODEV;
+   host->caps = of_id->data;
 
if (of_property_read_u32(np, "atmel,nand-addr-offset", &val) == 0) {
if (val >= 32) {
-- 
2.4.10


Re: [PATCH] x86/mm: fix regression with huge pages on PAE

2015-11-11 Thread Ingo Molnar


* Borislav Petkov  wrote:

> --- a/arch/x86/include/asm/pgtable_types.h
> +++ b/arch/x86/include/asm/pgtable_types.h
> @@ -279,17 +279,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
>  static inline pudval_t pud_pfn_mask(pud_t pud)
>  {
>   if (native_pud_val(pud) & _PAGE_PSE)
> - return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> + return ~((1ULL << PUD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
>   else
>   return PTE_PFN_MASK;
>  }

>  static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
>  {
>   if (native_pmd_val(pmd) & _PAGE_PSE)
> - return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> + return ~((1ULL << PMD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
>   else
>   return PTE_PFN_MASK;
>  }

So instead of uglifying the code, why not fix the real bug: change the 
PMD_PAGE_MASK/PUD_PAGE_MASK definitions to be 64-bit everywhere?

Thanks,

Ingo

[PATCH] mtd: nand: mxc_nand: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Signed-off-by: LABBE Corentin 
---
 drivers/mtd/nand/mxc_nand.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/nand/mxc_nand.c b/drivers/mtd/nand/mxc_nand.c
index 136e73a..9e42431 100644
--- a/drivers/mtd/nand/mxc_nand.c
+++ b/drivers/mtd/nand/mxc_nand.c
@@ -1464,8 +1464,7 @@ static int __init mxcnd_probe_dt(struct mxc_nand_host *host)
 {
struct device_node *np = host->dev->of_node;
struct mxc_nand_platform_data *pdata = &host->pdata;
-   const struct of_device_id *of_id =
-   of_match_device(mxcnd_dt_ids, host->dev);
+   const struct of_device_id *of_id;
int buswidth;
 
if (!np)
@@ -1482,6 +1481,9 @@ static int __init mxcnd_probe_dt(struct mxc_nand_host *host)
 
pdata->width = buswidth / 8;
 
+   of_id = of_match_device(mxcnd_dt_ids, host->dev);
+   if (!of_id)
+   return -ENODEV;
host->devtype_data = of_id->data;
 
return 0;
-- 
2.4.10


Re: [PATCH] i2c: taos-evm: replace simple_strtoul by kstrtou8

2015-11-11 Thread Uwe Kleine-König

On Thu, Nov 12, 2015 at 08:26:33AM +0100, LABBE Corentin wrote:
> The simple_strtoul function is marked as obsolete.
> This patch replaces it with kstrtou8.
> 
> Signed-off-by: LABBE Corentin 
> ---
>  drivers/i2c/busses/i2c-taos-evm.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/i2c/busses/i2c-taos-evm.c b/drivers/i2c/busses/i2c-taos-evm.c
> index 4c7fc2d..fe2b705 100644
> --- a/drivers/i2c/busses/i2c-taos-evm.c
> +++ b/drivers/i2c/busses/i2c-taos-evm.c
> @@ -70,6 +70,7 @@ static int taos_smbus_xfer(struct i2c_adapter *adapter, u16 addr,
>   struct serio *serio = adapter->algo_data;
>   struct taos_data *taos = serio_get_drvdata(serio);
>   char *p;
> + int err;
>  
>   /* Encode our transaction. "@" is for the device address, "$" for the
>  SMBus command and "#" for the data. */
> @@ -130,7 +131,9 @@ static int taos_smbus_xfer(struct i2c_adapter *adapter, u16 addr,
>   return 0;
>   } else {
>   if (p[0] == 'x') {
> - data->byte = simple_strtol(p + 1, NULL, 16);
> + err = kstrtou8(p + 1, 16, &data->byte);
> + if (err)
> + return -EPROTO;
>   return 0;

This is nearly equivalent to the probably more correct:

return kstrtou8(p + 1, 16, &data->byte);

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

[PATCH] serial: imx: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Signed-off-by: LABBE Corentin 
---
 drivers/tty/serial/imx.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/serial/imx.c b/drivers/tty/serial/imx.c
index 016e4be..22e91f7 100644
--- a/drivers/tty/serial/imx.c
+++ b/drivers/tty/serial/imx.c
@@ -1857,8 +1857,7 @@ static int serial_imx_probe_dt(struct imx_port *sport,
struct platform_device *pdev)
 {
struct device_node *np = pdev->dev.of_node;
-   const struct of_device_id *of_id =
-   of_match_device(imx_uart_dt_ids, &pdev->dev);
+   const struct of_device_id *of_id;
int ret;
 
if (!np)
@@ -1878,6 +1877,9 @@ static int serial_imx_probe_dt(struct imx_port *sport,
if (of_get_property(np, "fsl,dte-mode", NULL))
sport->dte_mode = 1;
 
+   of_id = of_match_device(imx_uart_dt_ids, &pdev->dev);
+   if (!of_id)
+   return -ENODEV;
sport->devdata = of_id->data;
 
return 0;
-- 
2.4.10


[PATCH] usb: phy: phy-mxs-usb: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Signed-off-by: LABBE Corentin 
---
 drivers/usb/phy/phy-mxs-usb.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/phy/phy-mxs-usb.c b/drivers/usb/phy/phy-mxs-usb.c
index 4d863eb..b7536af 100644
--- a/drivers/usb/phy/phy-mxs-usb.c
+++ b/drivers/usb/phy/phy-mxs-usb.c
@@ -452,10 +452,13 @@ static int mxs_phy_probe(struct platform_device *pdev)
struct clk *clk;
struct mxs_phy *mxs_phy;
int ret;
-   const struct of_device_id *of_id =
-   of_match_device(mxs_phy_dt_ids, &pdev->dev);
+   const struct of_device_id *of_id;
struct device_node *np = pdev->dev.of_node;
 
+   of_id = of_match_device(mxs_phy_dt_ids, &pdev->dev);
+   if (!of_id)
+   return -ENODEV;
+
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
base = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(base))
-- 
2.4.10


[PATCH] usb: phy: msm: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later. Renaming id to of_id (like all others do) in the
process.

Reported-by: coverity (CID 1324133)
Signed-off-by: LABBE Corentin 
---
 drivers/usb/phy/phy-msm-usb.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/phy/phy-msm-usb.c b/drivers/usb/phy/phy-msm-usb.c
index 80eb991..c4a66cf 100644
--- a/drivers/usb/phy/phy-msm-usb.c
+++ b/drivers/usb/phy/phy-msm-usb.c
@@ -1506,20 +1506,23 @@ static int msm_otg_read_dt(struct platform_device *pdev, struct msm_otg *motg)
 {
struct msm_otg_platform_data *pdata;
struct extcon_dev *ext_id, *ext_vbus;
-   const struct of_device_id *id;
+   const struct of_device_id *of_id;
struct device_node *node = pdev->dev.of_node;
struct property *prop;
int len, ret, words;
u32 val, tmp[3];
 
+   of_id = of_match_device(msm_otg_dt_match, &pdev->dev);
+   if (!of_id)
+   return -ENODEV;
+
pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL);
if (!pdata)
return -ENOMEM;
 
motg->pdata = pdata;
 
-   id = of_match_device(msm_otg_dt_match, &pdev->dev);
-   pdata->phy_type = (enum msm_usb_phy_type) id->data;
+   pdata->phy_type = (enum msm_usb_phy_type)of_id->data;
 
motg->link_rst = devm_reset_control_get(&pdev->dev, "link");
if (IS_ERR(motg->link_rst))
-- 
2.4.10


Re: [PATCH] i2c: rcar: fix a possible NULL dereference

2015-11-11 Thread Uwe Kleine-König

Hello,

On Thu, Nov 12, 2015 at 08:25:09AM +0100, LABBE Corentin wrote:
> of_match_device could return NULL, and so cause a NULL pointer
> dereference later.
> 
> Reported-by: coverity (CID 1130036)
> Signed-off-by: LABBE Corentin 
> ---
>  drivers/i2c/busses/i2c-rcar.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c
> index b0ae560..d2bdbda 100644
> --- a/drivers/i2c/busses/i2c-rcar.c
> +++ b/drivers/i2c/busses/i2c-rcar.c
> @@ -639,6 +639,7 @@ static int rcar_i2c_probe(struct platform_device *pdev)
>   struct device *dev = &pdev->dev;
>   u32 bus_speed;
>   int irq, ret;
> + const struct of_device_id *of_id;
>  
>   priv = devm_kzalloc(dev, sizeof(struct rcar_i2c_priv), GFP_KERNEL);
>   if (!priv)
> @@ -653,7 +654,10 @@ static int rcar_i2c_probe(struct platform_device *pdev)
>   bus_speed = 10; /* default 100 kHz */
>   of_property_read_u32(dev->of_node, "clock-frequency", &bus_speed);
>  
> - priv->devtype = (enum rcar_i2c_type)of_match_device(rcar_i2c_dt_ids, dev)->data;
> + of_id = of_match_device(rcar_i2c_dt_ids, dev);
> + if (!of_id)
> + return -ENODEV;
> + priv->devtype = (enum rcar_i2c_type)of_id->data;

This is nearly an open coding of of_device_get_match_data. Maybe using

priv->devtype = (enum rcar_i2c_type)of_device_get_match_data(dev)

is good enough?

Other than that, the NULL pointer dereference should only happen if the
device was bound using the driver name. That might be worth pointing out
in the commit log. So maybe make (in a separate patch) the probe
function fail when probed by name?
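
A userspace model of the two lookup styles discussed in this thread,
offered as a sketch only. struct match_entry, match_device,
get_match_data, the "vendor,..." strings and the example_type enum are
all hypothetical; they mirror the shape of a match table but are not
the kernel's of_match_device() or of_device_get_match_data().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum example_type { EXAMPLE_GEN1 = 0, EXAMPLE_GEN2 = 1 };

struct match_entry {
	const char *compatible;
	const void *data;	/* here: an enum value stored in a pointer */
};

/* Table terminated by an entry with a NULL compatible string. */
static const struct match_entry example_ids[] = {
	{ "vendor,gen1", (const void *)(intptr_t)EXAMPLE_GEN1 },
	{ "vendor,gen2", (const void *)(intptr_t)EXAMPLE_GEN2 },
	{ NULL, NULL },
};

/* Returns the matching entry, or NULL when nothing matches. */
static const struct match_entry *match_device(const struct match_entry *table,
					      const char *compatible)
{
	if (!compatible)
		return NULL;
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	return NULL;
}

/* Convenience form: returns the entry's data, or NULL when no match. */
static const void *get_match_data(const struct match_entry *table,
				  const char *compatible)
{
	const struct match_entry *m = match_device(table, compatible);

	return m ? m->data : NULL;
}
```

One design point the model makes visible: when the data field encodes
an enum whose first value is 0, the convenience form cannot tell "no
match" from "matched the first type", whereas checking the entry
pointer can. Whether that matters depends on whether a probe without a
match is possible at all, which is the question raised above.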

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | http://www.pengutronix.de/  |

[PATCH 2/2] usb: chipidea: imx: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Reported-by: coverity (CID 1324138)
Signed-off-by: LABBE Corentin 
---
 drivers/usb/chipidea/ci_hdrc_imx.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/chipidea/ci_hdrc_imx.c b/drivers/usb/chipidea/ci_hdrc_imx.c
index 6ccbf60..a021bd5 100644
--- a/drivers/usb/chipidea/ci_hdrc_imx.c
+++ b/drivers/usb/chipidea/ci_hdrc_imx.c
@@ -145,9 +145,14 @@ static int ci_hdrc_imx_probe(struct platform_device *pdev)
.flags  = CI_HDRC_SET_NON_ZERO_TTHA,
};
int ret;
-   const struct of_device_id *of_id =
-   of_match_device(ci_hdrc_imx_dt_ids, &pdev->dev);
-   const struct ci_hdrc_imx_platform_flag *imx_platform_flag = of_id->data;
+   const struct of_device_id *of_id;
+   const struct ci_hdrc_imx_platform_flag *imx_platform_flag;
+
+   of_id = of_match_device(ci_hdrc_imx_dt_ids, &pdev->dev);
+   if (!of_id)
+   return -ENODEV;
+
+   imx_platform_flag = of_id->data;
 
data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
if (!data)
-- 
2.4.10


[PATCH 1/2] usb: chipidea: imx: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later. Renaming tmp_dev to of_id (like all others do) in the
process.

Reported-by: coverity (CID 1324135)
Signed-off-by: LABBE Corentin 
---
 drivers/usb/chipidea/usbmisc_imx.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/usb/chipidea/usbmisc_imx.c b/drivers/usb/chipidea/usbmisc_imx.c
index fcea4eb..ab8b027 100644
--- a/drivers/usb/chipidea/usbmisc_imx.c
+++ b/drivers/usb/chipidea/usbmisc_imx.c
@@ -500,7 +500,11 @@ static int usbmisc_imx_probe(struct platform_device *pdev)
 {
struct resource *res;
struct imx_usbmisc *data;
-   struct of_device_id *tmp_dev;
+   const struct of_device_id *of_id;
+
+   of_id = of_match_device(usbmisc_imx_dt_ids, &pdev->dev);
+   if (!of_id)
+   return -ENODEV;
 
data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
if (!data)
@@ -513,9 +517,7 @@ static int usbmisc_imx_probe(struct platform_device *pdev)
if (IS_ERR(data->base))
return PTR_ERR(data->base);
 
-   tmp_dev = (struct of_device_id *)
-   of_match_device(usbmisc_imx_dt_ids, &pdev->dev);
-   data->ops = (const struct usbmisc_ops *)tmp_dev->data;
+   data->ops = (const struct usbmisc_ops *)of_id->data;
platform_set_drvdata(pdev, data);
 
return 0;
-- 
2.4.10


[PATCH] pwm: sun4i: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Reported-by: coverity (CID 1324139)
Signed-off-by: LABBE Corentin 
---
 drivers/pwm/pwm-sun4i.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pwm/pwm-sun4i.c b/drivers/pwm/pwm-sun4i.c
index cd9dde5..3011fcc 100644
--- a/drivers/pwm/pwm-sun4i.c
+++ b/drivers/pwm/pwm-sun4i.c
@@ -291,6 +291,8 @@ static int sun4i_pwm_probe(struct platform_device *pdev)
const struct of_device_id *match;
 
match = of_match_device(sun4i_pwm_dt_ids, &pdev->dev);
+   if (!match)
+   return -ENODEV;
 
pwm = devm_kzalloc(&pdev->dev, sizeof(*pwm), GFP_KERNEL);
if (!pwm)
-- 
2.4.10


[PATCH] mm: change trace_mm_vmscan_writepage() proto type

2015-11-11 Thread yalin wang

Move trace_reclaim_flags() into the trace function,
so that we don't need to calculate these flags if the trace is disabled.

Signed-off-by: yalin wang 
---
 include/trace/events/vmscan.h | 7 +++
 mm/vmscan.c   | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index f66476b..dae7836 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -330,10 +330,9 @@ DEFINE_EVENT(mm_vmscan_lru_isolate_template, mm_vmscan_memcg_isolate,
 
 TRACE_EVENT(mm_vmscan_writepage,
 
-   TP_PROTO(struct page *page,
-   int reclaim_flags),
+   TP_PROTO(struct page *page),
 
-   TP_ARGS(page, reclaim_flags),
+   TP_ARGS(page),
 
TP_STRUCT__entry(
__field(unsigned long, pfn)
@@ -342,7 +341,7 @@ TRACE_EVENT(mm_vmscan_writepage,
 
TP_fast_assign(
__entry->pfn = page_to_pfn(page);
-   __entry->reclaim_flags = reclaim_flags;
+   __entry->reclaim_flags = trace_reclaim_flags(page);
),
 
TP_printk("page=%p pfn=%lu flags=%s",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a4507ec..83cea53 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -594,7 +594,7 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
/* synchronous write or broken a_ops? */
ClearPageReclaim(page);
}
-   trace_mm_vmscan_writepage(page, trace_reclaim_flags(page));
+   trace_mm_vmscan_writepage(page);
inc_zone_page_state(page, NR_VMSCAN_WRITE);
return PAGE_SUCCESS;
}
-- 
1.9.1


[PATCH v4 1/3] ARM: DTS: dra7: Fix McASP3 node regarding to clocks

2015-11-11 Thread Peter Ujfalusi

McASP node needs to list all mandatory clocks: gfclk and ahclkx

Signed-off-by: Peter Ujfalusi 
Tested-by: Felipe Balbi 
---
 arch/arm/boot/dts/dra7.dtsi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/dra7.dtsi b/arch/arm/boot/dts/dra7.dtsi
index bc672fb91466..fe99231cbde5 100644
--- a/arch/arm/boot/dts/dra7.dtsi
+++ b/arch/arm/boot/dts/dra7.dtsi
@@ -1459,8 +1459,8 @@
interrupt-names = "tx", "rx";
dmas = <&sdma_xbar 133>, <&sdma_xbar 132>;
dma-names = "tx", "rx";
-   clocks = <&mcasp3_ahclkx_mux>;
-   clock-names = "fck";
+   clocks = <&mcasp3_aux_gfclk_mux>, <&mcasp3_ahclkx_mux>;
+   clock-names = "fck", "ahclkx";
status = "disabled";
};
 
-- 
2.6.2

[PATCH v4 0/3] ARM: OMAP2+ McASP(3) support for DRA7xx family

2015-11-11 Thread Peter Ujfalusi

Hi Tony,

Changes since v3:
- rebased on mainline's HEAD
- Added Tested-by from Felipe
- Added Acked-by from Paul for the hwmod patches

Changes since v2:
- DTS patch added which is needed because of the clock handling changes

Felipe Balbi reported that linux-next is broken right now since the DTS part of
the earlier series has been applied, but we do not have the mcasp hwmod in the
kernel:
...
[0.181029] platform 48468000.mcasp: Cannot lookup hwmod 'mcasp3'
...
[6.121072] davinci-mcasp 48468000.mcasp: _od_fail_runtime_resume: FIXME: 
missing hwmod/omap_dev info
[6.130790] [ cut here ]
[6.135643] WARNING: CPU: 0 PID: 244 at drivers/bus/omap_l3_noc.c:147 
l3_interrupt_handler+0x220/0x34c()
[6.145576] 4400.ocp:L3 Custom Error: MASTER MPU TARGET L4_PER2_P3 
(Read): Data Access in User mode during Functional access
...

This is the follow-up series for the hwmod changes needed to get audio working
on DRA7xx family based boards.
The DTS patches have been applied by Tony from the original series:
http://www.spinics.net/lists/linux-omap/msg121473.html

I have addressed your comments in the hwmod data and also did some research
regarding the use of ahclkx as fclk in the original submission.
It turned out that McASP _needs_ all clocks to be enabled (fclk, iclk and
ahclkx/r) to be able to access registers. The original patch, where we handled
the ahclkx as fclk, worked because the fclk clock got enabled in the HW w/o
any SW interaction.
All in all, the McASP found in DRA7 needs all clocks to be enabled.
To satisfy this I have introduced a new flag to hwmod, which means that the
listed optional clocks need to be handled alongside the fclk clock.

Regards,
Peter
---
Peter Ujfalusi (3):
  ARM: DTS: dra7: Fix McASP3 node regarding to clocks
  ARM: OMAP2+: hwmod: Add hwmod flag for HWMOD_OPT_CLKS_NEEDED
  ARM: OMAP: DRA7: hwmod: Add data for McASP3

 arch/arm/boot/dts/dra7.dtsi   |  4 +-
 arch/arm/mach-omap2/omap_hwmod.c  | 66 +--
 arch/arm/mach-omap2/omap_hwmod.h  |  3 ++
 arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 56 ++
 4 files changed, 97 insertions(+), 32 deletions(-)

-- 
2.6.2

[PATCH v4 2/3] ARM: OMAP2+: hwmod: Add hwmod flag for HWMOD_OPT_CLKS_NEEDED

2015-11-11 Thread Peter Ujfalusi

Some modules need more than one functional clock in order to be accessible,
like the McASPs found in the DRA7xx family.
This flag will indicate that the opt_clks need to be handled at the same
time as the main_clk for the given hwmod, ensuring that all needed clocks
are enabled before we try to access the module's address space.

Signed-off-by: Peter Ujfalusi 
Acked-by: Paul Walmsley 
Tested-by: Felipe Balbi 
---
 arch/arm/mach-omap2/omap_hwmod.c | 66 ++--
 arch/arm/mach-omap2/omap_hwmod.h |  3 ++
 2 files changed, 39 insertions(+), 30 deletions(-)

diff --git a/arch/arm/mach-omap2/omap_hwmod.c b/arch/arm/mach-omap2/omap_hwmod.c
index cc8a987149e2..48495ad82aba 100644
--- a/arch/arm/mach-omap2/omap_hwmod.c
+++ b/arch/arm/mach-omap2/omap_hwmod.c
@@ -890,6 +890,36 @@ static int _init_opt_clks(struct omap_hwmod *oh)
return ret;
 }
 
+static void _enable_optional_clocks(struct omap_hwmod *oh)
+{
+   struct omap_hwmod_opt_clk *oc;
+   int i;
+
+   pr_debug("omap_hwmod: %s: enabling optional clocks\n", oh->name);
+
+   for (i = oh->opt_clks_cnt, oc = oh->opt_clks; i > 0; i--, oc++)
+   if (oc->_clk) {
+   pr_debug("omap_hwmod: enable %s:%s\n", oc->role,
+__clk_get_name(oc->_clk));
+   clk_enable(oc->_clk);
+   }
+}
+
+static void _disable_optional_clocks(struct omap_hwmod *oh)
+{
+   struct omap_hwmod_opt_clk *oc;
+   int i;
+
+   pr_debug("omap_hwmod: %s: disabling optional clocks\n", oh->name);
+
+   for (i = oh->opt_clks_cnt, oc = oh->opt_clks; i > 0; i--, oc++)
+   if (oc->_clk) {
+   pr_debug("omap_hwmod: disable %s:%s\n", oc->role,
+__clk_get_name(oc->_clk));
+   clk_disable(oc->_clk);
+   }
+}
+
 /**
  * _enable_clocks - enable hwmod main clock and interface clocks
  * @oh: struct omap_hwmod *
@@ -917,6 +947,9 @@ static int _enable_clocks(struct omap_hwmod *oh)
clk_enable(os->_clk);
}
 
+   if (oh->flags & HWMOD_OPT_CLKS_NEEDED)
+   _enable_optional_clocks(oh);
+
/* The opt clocks are controlled by the device driver. */
 
return 0;
@@ -948,41 +981,14 @@ static int _disable_clocks(struct omap_hwmod *oh)
clk_disable(os->_clk);
}
 
+   if (oh->flags & HWMOD_OPT_CLKS_NEEDED)
+   _disable_optional_clocks(oh);
+
/* The opt clocks are controlled by the device driver. */
 
return 0;
 }
 
-static void _enable_optional_clocks(struct omap_hwmod *oh)
-{
-   struct omap_hwmod_opt_clk *oc;
-   int i;
-
-   pr_debug("omap_hwmod: %s: enabling optional clocks\n", oh->name);
-
-   for (i = oh->opt_clks_cnt, oc = oh->opt_clks; i > 0; i--, oc++)
-   if (oc->_clk) {
-   pr_debug("omap_hwmod: enable %s:%s\n", oc->role,
-__clk_get_name(oc->_clk));
-   clk_enable(oc->_clk);
-   }
-}
-
-static void _disable_optional_clocks(struct omap_hwmod *oh)
-{
-   struct omap_hwmod_opt_clk *oc;
-   int i;
-
-   pr_debug("omap_hwmod: %s: disabling optional clocks\n", oh->name);
-
-   for (i = oh->opt_clks_cnt, oc = oh->opt_clks; i > 0; i--, oc++)
-   if (oc->_clk) {
-   pr_debug("omap_hwmod: disable %s:%s\n", oc->role,
-__clk_get_name(oc->_clk));
-   clk_disable(oc->_clk);
-   }
-}
-
 /**
  * _omap4_enable_module - enable CLKCTRL modulemode on OMAP4
  * @oh: struct omap_hwmod *
diff --git a/arch/arm/mach-omap2/omap_hwmod.h b/arch/arm/mach-omap2/omap_hwmod.h
index ca6df1a73475..76bce11c85a4 100644
--- a/arch/arm/mach-omap2/omap_hwmod.h
+++ b/arch/arm/mach-omap2/omap_hwmod.h
@@ -523,6 +523,8 @@ struct omap_hwmod_omap4_prcm {
  * HWMOD_RECONFIG_IO_CHAIN: omap_hwmod code needs to reconfigure wake-up 
  * events by calling _reconfigure_io_chain() when a device is enabled
  * or idled.
+ * HWMOD_OPT_CLKS_NEEDED: The optional clocks are needed for the module to
+ * operate and they need to be handled at the same time as the main_clk.
  */
 #define HWMOD_SWSUP_SIDLE  (1 << 0)
 #define HWMOD_SWSUP_MSTANDBY   (1 << 1)
@@ -538,6 +540,7 @@ struct omap_hwmod_omap4_prcm {
 #define HWMOD_FORCE_MSTANDBY   (1 << 11)
 #define HWMOD_SWSUP_SIDLE_ACT  (1 << 12)
 #define HWMOD_RECONFIG_IO_CHAIN(1 << 13)
+#define HWMOD_OPT_CLKS_NEEDED  (1 << 14)
 
 /*
  * omap_hwmod._int_flags definitions
-- 
2.6.2
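
As an illustration of what the new flag does, here is a simplified user-space
sketch. The sketch_* types and functions and SKETCH_OPT_CLKS_NEEDED are
hypothetical stand-ins for struct clk, struct omap_hwmod and
HWMOD_OPT_CLKS_NEEDED; the real code also handles interface clocks and goes
through the clock framework, which is left out here.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct clk and struct omap_hwmod. */
#define SKETCH_OPT_CLKS_NEEDED	(1u << 14)

struct sketch_clk {
	const char *name;
	int enable_count;
};

struct sketch_hwmod {
	unsigned int flags;
	struct sketch_clk *main_clk;
	struct sketch_clk **opt_clks;
	size_t opt_clks_cnt;
};

static void sketch_clk_enable(struct sketch_clk *clk)
{
	if (clk)
		clk->enable_count++;
}

static void sketch_clk_disable(struct sketch_clk *clk)
{
	if (clk && clk->enable_count > 0)
		clk->enable_count--;
}

/* Enable the main clock, plus the optional clocks when the flag asks for it. */
static void sketch_enable_clocks(struct sketch_hwmod *oh)
{
	size_t i;

	sketch_clk_enable(oh->main_clk);
	if (oh->flags & SKETCH_OPT_CLKS_NEEDED)
		for (i = 0; i < oh->opt_clks_cnt; i++)
			sketch_clk_enable(oh->opt_clks[i]);
}

/* Mirror image of sketch_enable_clocks(). */
static void sketch_disable_clocks(struct sketch_hwmod *oh)
{
	size_t i;

	sketch_clk_disable(oh->main_clk);
	if (oh->flags & SKETCH_OPT_CLKS_NEEDED)
		for (i = 0; i < oh->opt_clks_cnt; i++)
			sketch_clk_disable(oh->opt_clks[i]);
}
```

A module without the flag leaves its optional clocks to the device driver,
as before; a module with the flag gets them switched together with the
main clock.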

[PATCH v4 3/3] ARM: OMAP: DRA7: hwmod: Add data for McASP3

2015-11-11 Thread Peter Ujfalusi

McASP3 is used by default on DRA7x based boards for audio.

Signed-off-by: Peter Ujfalusi 
Acked-by: Paul Walmsley 
Tested-by: Felipe Balbi 
---
 arch/arm/mach-omap2/omap_hwmod_7xx_data.c | 56 +++
 1 file changed, 56 insertions(+)

diff --git a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c 
b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
index 51d1ecb384bd..ee4e04434a94 100644
--- a/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
+++ b/arch/arm/mach-omap2/omap_hwmod_7xx_data.c
@@ -1298,6 +1298,44 @@ static struct omap_hwmod dra7xx_mcspi4_hwmod = {
 };
 
 /*
+ * 'mcasp' class
+ *
+ */
+static struct omap_hwmod_class_sysconfig dra7xx_mcasp_sysc = {
+   .sysc_offs  = 0x0004,
+   .sysc_flags = SYSC_HAS_SIDLEMODE,
+   .idlemodes  = (SIDLE_FORCE | SIDLE_NO | SIDLE_SMART),
+   .sysc_fields= &omap_hwmod_sysc_type3,
+};
+
+static struct omap_hwmod_class dra7xx_mcasp_hwmod_class = {
+   .name   = "mcasp",
+   .sysc   = &dra7xx_mcasp_sysc,
+};
+
+/* mcasp3 */
+static struct omap_hwmod_opt_clk mcasp3_opt_clks[] = {
+   { .role = "ahclkx", .clk = "mcasp3_ahclkx_mux" },
+};
+
+static struct omap_hwmod dra7xx_mcasp3_hwmod = {
+   .name   = "mcasp3",
+   .class  = &dra7xx_mcasp_hwmod_class,
+   .clkdm_name = "l4per2_clkdm",
+   .main_clk   = "mcasp3_aux_gfclk_mux",
+   .flags  = HWMOD_OPT_CLKS_NEEDED,
+   .prcm = {
+   .omap4 = {
+   .clkctrl_offs = DRA7XX_CM_L4PER2_MCASP3_CLKCTRL_OFFSET,
+   .context_offs = DRA7XX_RM_L4PER2_MCASP3_CONTEXT_OFFSET,
+   .modulemode   = MODULEMODE_SWCTRL,
+   },
+   },
+   .opt_clks   = mcasp3_opt_clks,
+   .opt_clks_cnt   = ARRAY_SIZE(mcasp3_opt_clks),
+};
+
+/*
  * 'mmc' class
  *
  */
@@ -2566,6 +2604,22 @@ static struct omap_hwmod_ocp_if dra7xx_l3_main_1__hdmi = 
{
.user   = OCP_USER_MPU | OCP_USER_SDMA,
 };
 
+/* l4_per2 -> mcasp3 */
+static struct omap_hwmod_ocp_if dra7xx_l4_per2__mcasp3 = {
+   .master = &dra7xx_l4_per2_hwmod,
+   .slave  = &dra7xx_mcasp3_hwmod,
+   .clk= "l4_root_clk_div",
+   .user   = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
+/* l3_main_1 -> mcasp3 */
+static struct omap_hwmod_ocp_if dra7xx_l3_main_1__mcasp3 = {
+   .master = &dra7xx_l3_main_1_hwmod,
+   .slave  = &dra7xx_mcasp3_hwmod,
+   .clk= "l3_iclk_div",
+   .user   = OCP_USER_MPU | OCP_USER_SDMA,
+};
+
 /* l4_per1 -> elm */
 static struct omap_hwmod_ocp_if dra7xx_l4_per1__elm = {
.master = &dra7xx_l4_per1_hwmod,
@@ -3308,6 +3362,8 @@ static struct omap_hwmod_ocp_if *dra7xx_hwmod_ocp_ifs[] 
__initdata = {
&dra7xx_l4_wkup__dcan1,
&dra7xx_l4_per2__dcan2,
&dra7xx_l4_per2__cpgmac0,
+   &dra7xx_l4_per2__mcasp3,
+   &dra7xx_l3_main_1__mcasp3,
&dra7xx_gmac__mdio,
&dra7xx_l4_cfg__dma_system,
&dra7xx_l3_main_1__dss,
-- 
2.6.2

Re: Change of my repository (samsung-krzk)

2015-11-11 Thread Krzysztof Kozlowski

On 12.11.2015 16:25, Stephen Rothwell wrote:
> Hi Krzysztof,
> 
> On Thu, 12 Nov 2015 16:05:04 +0900 Krzysztof Kozlowski 
>  wrote:
>>
>> I moved my samsung-krzk repository for Samsung SoC (as co-maintainer)
>> from:
>> git://github.com/krzk/linux.git#for-next
>>
>> to:
>> git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git#for-next
>>
>> (the name of branch stays the same)
>>
>> Can you update the linux-next trees?
> 
> Done.


Great, thank you!

BR,
Krzysztof

[PATCH] i2c: tegra: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later at line 809:
i2c_dev->hw = match->data;

Signed-off-by: LABBE Corentin 
---
 drivers/i2c/busses/i2c-tegra.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index a0522fc..c803551 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -806,7 +806,10 @@ static int tegra_i2c_probe(struct platform_device *pdev)
 
if (pdev->dev.of_node) {
const struct of_device_id *match;
+
match = of_match_device(tegra_i2c_of_match, &pdev->dev);
+   if (!match)
+   return -ENODEV;
i2c_dev->hw = match->data;
i2c_dev->is_dvc = of_device_is_compatible(pdev->dev.of_node,
"nvidia,tegra20-i2c-dvc");
-- 
2.4.10
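
The pattern applied by the patch above, reduced to a self-contained sketch.
The sketch_match table type and both functions are hypothetical and only
mimic the shape of struct of_device_id and of_match_device(); the point is
the NULL check before the match result is dereferenced.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct of_device_id. */
struct sketch_match {
	const char *compatible;
	const void *data;
};

/* Returns the first table entry matching @compatible, or NULL if none does. */
static const struct sketch_match *
sketch_match_device(const struct sketch_match *table, const char *compatible)
{
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	return NULL;
}

/* Probe-style caller: bail out with -ENODEV instead of dereferencing NULL. */
static int sketch_probe(const struct sketch_match *table,
			const char *compatible, const void **hw)
{
	const struct sketch_match *match;

	match = sketch_match_device(table, compatible);
	if (!match)
		return -ENODEV;

	*hw = match->data;
	return 0;
}
```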

[PATCH] i2c: taos-evm: replace simple_strtoul by kstrtou8

2015-11-11 Thread LABBE Corentin

The simple_strtoul function is marked as obsolete.
This patch replaces it with kstrtou8.

Signed-off-by: LABBE Corentin 
---
 drivers/i2c/busses/i2c-taos-evm.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-taos-evm.c 
b/drivers/i2c/busses/i2c-taos-evm.c
index 4c7fc2d..fe2b705 100644
--- a/drivers/i2c/busses/i2c-taos-evm.c
+++ b/drivers/i2c/busses/i2c-taos-evm.c
@@ -70,6 +70,7 @@ static int taos_smbus_xfer(struct i2c_adapter *adapter, u16 
addr,
struct serio *serio = adapter->algo_data;
struct taos_data *taos = serio_get_drvdata(serio);
char *p;
+   int err;
 
/* Encode our transaction. "@" is for the device address, "$" for the
   SMBus command and "#" for the data. */
@@ -130,7 +131,9 @@ static int taos_smbus_xfer(struct i2c_adapter *adapter, u16 
addr,
return 0;
} else {
if (p[0] == 'x') {
-   data->byte = simple_strtol(p + 1, NULL, 16);
+   err = kstrtou8(p + 1, 16, &data->byte);
+   if (err)
+   return -EPROTO;
return 0;
}
}
-- 
2.4.10
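
To make the behavioural difference concrete, here is a user-space
approximation of a strict hex-to-u8 parser. sketch_parse_hex_u8() is a
hypothetical helper built on strtoul(); it is not kstrtou8() itself, it only
mimics the property the patch relies on: garbage or out-of-range input is
reported as an error instead of being silently truncated or ignored.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * User-space approximation of a strict "hex string to u8" parser.
 * Returns 0 on success, -EINVAL for malformed input and -ERANGE when
 * the value does not fit in 8 bits.
 */
static int sketch_parse_hex_u8(const char *s, unsigned char *res)
{
	char *end;
	unsigned long val;

	if (!isxdigit((unsigned char)s[0]))
		return -EINVAL;		/* empty string, sign or whitespace */

	errno = 0;
	val = strtoul(s, &end, 16);
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++;			/* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	if (val > 0xff)
		return -ERANGE;		/* does not fit in 8 bits */

	*res = (unsigned char)val;
	return 0;
}
```

On failure the output byte is left untouched, so a caller can map any
non-zero return to its own error code, as the patch does with -EPROTO.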

[PATCH] i2c: rcar: fix a possible NULL dereference

2015-11-11 Thread LABBE Corentin

of_match_device could return NULL, and so cause a NULL pointer
dereference later.

Reported-by: coverity (CID 1130036)
Signed-off-by: LABBE Corentin 
---
 drivers/i2c/busses/i2c-rcar.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-rcar.c b/drivers/i2c/busses/i2c-rcar.c
index b0ae560..d2bdbda 100644
--- a/drivers/i2c/busses/i2c-rcar.c
+++ b/drivers/i2c/busses/i2c-rcar.c
@@ -639,6 +639,7 @@ static int rcar_i2c_probe(struct platform_device *pdev)
struct device *dev = &pdev->dev;
u32 bus_speed;
int irq, ret;
+   const struct of_device_id *of_id;
 
priv = devm_kzalloc(dev, sizeof(struct rcar_i2c_priv), GFP_KERNEL);
if (!priv)
@@ -653,7 +654,10 @@ static int rcar_i2c_probe(struct platform_device *pdev)
bus_speed = 10; /* default 100 kHz */
of_property_read_u32(dev->of_node, "clock-frequency", &bus_speed);
 
-   priv->devtype = (enum rcar_i2c_type)of_match_device(rcar_i2c_dt_ids, 
dev)->data;
+   of_id = of_match_device(rcar_i2c_dt_ids, dev);
+   if (!of_id)
+   return -ENODEV;
+   priv->devtype = (enum rcar_i2c_type)of_id->data;
 
ret = rcar_i2c_clock_calculate(priv, bus_speed, dev);
if (ret < 0)
-- 
2.4.10

Re: Change of my repository (samsung-krzk)

2015-11-11 Thread Stephen Rothwell

Hi Krzysztof,

On Thu, 12 Nov 2015 16:05:04 +0900 Krzysztof Kozlowski 
 wrote:
>
> I moved my samsung-krzk repository for Samsung SoC (as co-maintainer)
> from:
> git://github.com/krzk/linux.git#for-next
> 
> to:
> git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git#for-next
> 
> (the name of branch stays the same)
> 
> Can you update the linux-next trees?

Done.

-- 
Cheers,
Stephen Rothwell s...@canb.auug.org.au

Re: [PATCH 1/1] drivers/hv: correct tsc page sequence invalid value

2015-11-11 Thread Denis V. Lunev

On 11/02/2015 10:42 PM, KY Srinivasan wrote:

-Original Message-
From: Denis V. Lunev [mailto:d...@openvz.org]
Sent: Monday, November 2, 2015 3:34 AM
Cc: rka...@virtuozzo.com; de...@linuxdriverproject.org; linux-
ker...@vger.kernel.org; Andrey Smetanin ; KY
Srinivasan ; Haiyang Zhang
; Vitaly Kuznetsov ;
Denis V. Lunev 
Subject: [PATCH 1/1] drivers/hv: correct tsc page sequence invalid value

From: Andrey Smetanin 

Hypervisor Top Level Functional Specification v3/4 says
that TSC page sequence value = -1(0x) is used to
indicate that the TSC page is no longer a reliable source for
the reference timer. Unfortunately, we found that the Windows
Hyper-V guest side implementation uses sequence value = 0 to
indicate that the TSC page is no longer valid. This is clearly
visible inside the Windows 2012R2 ntoskrnl.exe
HvlGetReferenceTime() function disassembly:

HvlGetReferenceTime proc near
  xchg ax, ax
loc_1401C3132:
  mov rax, cs:HvlpReferenceTscPage
  mov r9d, [rax]
  test r9d, r9d
  jz  short loc_1401C3176
  rdtsc
  mov rcx, cs:HvlpReferenceTscPage
  shl rdx, 20h
  or  rdx, rax
  mov rax, [rcx+8]
  mov rcx, cs:HvlpReferenceTscPage
  mov r8, [rcx+10h]
  mul rdx
  mov rax, cs:HvlpReferenceTscPage
  add rdx, r8
  mov ecx, [rax]
  cmp ecx, r9d
  jnz short loc_1401C3132
  jmp short loc_1401C3184
loc_1401C3176:
  mov ecx, 4020h
  rdmsr
  shl rdx, 20h
  or  rdx, rax
loc_1401C3184:
  mov rax, rdx
  retn
HvlGetReferenceTime endp
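
Read as C, the listing above is roughly the following retry loop. This is a
sketch under assumptions: the field names (sequence, scale, offset) are
labels chosen here for the dword at +0 and the qwords at +8 and +10h, and
the two sketch_read_* functions are simulated stand-ins for RDTSC and for
the RDMSR fallback, so that the code runs in user space.

```c
#include <stdint.h>

/* Layout read by the listing: dword at +0, qwords at +8 and +10h. */
struct sketch_tsc_page {
	volatile uint32_t sequence;
	uint32_t reserved;
	volatile uint64_t scale;
	volatile uint64_t offset;
};

/* Simulated time sources standing in for RDTSC and the RDMSR fallback. */
static uint64_t fake_tsc;
static uint64_t fake_ref_msr;

static uint64_t sketch_read_tsc(void) { return fake_tsc; }
static uint64_t sketch_read_ref_msr(void) { return fake_ref_msr; }

/* Upper 64 bits of a 64x64-bit product, i.e. RDX after "mul rdx". */
static uint64_t mul_hi64(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
	uint64_t lo_lo = a_lo * b_lo;
	uint64_t hi_lo = a_hi * b_lo;
	uint64_t lo_hi = a_lo * b_hi;
	uint64_t hi_hi = a_hi * b_hi;
	uint64_t cross = (lo_lo >> 32) + (uint32_t)hi_lo + lo_hi;

	return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

static uint64_t sketch_get_reference_time(const struct sketch_tsc_page *pg)
{
	for (;;) {
		uint32_t seq = pg->sequence;
		uint64_t t;

		if (seq == 0)		/* "jz": page invalid, use the MSR */
			return sketch_read_ref_msr();

		t = mul_hi64(sketch_read_tsc(), pg->scale) + pg->offset;

		if (pg->sequence == seq)	/* "cmp ecx, r9d": unchanged? */
			return t;
		/* "jnz": the page changed under us, retry */
	}
}
```

The only sequence value that sends the reader to the MSR path is 0, which
is the behaviour the commit message describes.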

This patch aligns the TSC page invalid sequence value with the
Windows Hyper-V guest implementation, which is more
compatible with both the Hyper-V and KVM hypervisors.

Signed-off-by: Andrey Smetanin 
CC: "K. Y. Srinivasan" 

Thanks Andrey; the Hyper-V team will be updating the Hyper-V documentation.

Acked-by: K. Y. Srinivasan 

Regards,

K. Y

K.Y.,

Can you please clarify the state of this patch? It is a bit unclear
to me whether it is applied or not.

By the way, I also do not see the following patch
"drivers/hv: cleanup synic msrs if vmbus connect failed"
as applied in linux-next. You promised to resend
it with the correct author.

Thank you in advance,
Den

Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()

2015-11-11 Thread Boqun Feng

On Wed, Nov 11, 2015 at 08:39:53PM +0100, Oleg Nesterov wrote:
> On 11/11, Peter Zijlstra wrote:
> >
> > On Wed, Nov 11, 2015 at 05:39:40PM +0800, Boqun Feng wrote:
> >
> > > Just be curious, should spin_unlock_wait() semantically be an ACQUIRE?
> >
> > I did wonder the same thing, it would simplify a number of things if
> > this were so.
> 
> Yes, me too.
> 
> Sometimes I even think it should have both ACQUIRE + RELEASE semantics.
> IOW, it should be "equivalent" to spin_lock() + spin_unlock().
> 
> Consider this code:
> 
>   object_t *object;
>   spinlock_t lock;
> 
>   void update(void)
>   {
>   object_t *o;
> 
>   spin_lock(&lock);
>   o = READ_ONCE(object);
>   if (o) {
>   BUG_ON(o->dead);
>   do_something(o);
>   }
>   spin_unlock(&lock);
>   }
> 
>   void destroy(void) // can be called only once, can't race with itself
>   {
>   object_t *o;
> 
>   o = object;
>   object = NULL;
> 
>   /*
>* pairs with lock/ACQUIRE. The next update() must see
>* object == NULL after spin_lock();
>*/
>   smp_mb();
> 
>   spin_unlock_wait(&lock);
> 
>   /*
>* pairs with unlock/RELEASE. The previous update() has
>* already passed BUG_ON(o->dead).
>*
>* (Yes, yes, in this particular case it is not needed,
>*  we can rely on the control dependency).
>*/
>   smp_mb();
> 
>   o->dead = true;
>   }
> 
> I believe the code above is correct and it needs the barriers on both sides.
> 

Hmm.. probably incorrect.. because the ACQUIRE semantics of spin_lock()
only guarantees that the memory operations following spin_lock() can't
be reordered before the *LOAD* part of spin_lock(), not the *STORE* part,
i.e. the case below can happen (assuming spin_lock() is implemented
as an ll/sc loop):

spin_lock(&lock):
  r1 = *lock; // LL, r1 == 0
o = READ_ONCE(object); // could be reordered here.
  *lock = 1; // SC

This could happen because of the ACQUIRE semantics of spin_lock(), and
the current implementation of spin_lock() on PPC allows this to happen.

(Cc PPC maintainers for their opinions on this one)

Therefore the case below can happen:

CPU 1                 CPU 2                           CPU 3
==================    ============================    ========================
spin_unlock(&lock);
                      spin_lock(&lock):
                        r1 = *lock; // r1 == 0;
                      o = READ_ONCE(object); // reordered here
                                                      object = NULL;
                                                      smp_mb();
                                                      spin_unlock_wait(&lock);
                        *lock = 1;
                                                      smp_mb();
                                                      o->dead = true;
                      if (o) // true
                        BUG_ON(o->dead); // true!!
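
The same interleaving can be written down as a step-by-step replay. The
sketch below is hypothetical and runs on a single thread: it does not
demonstrate any hardware behaviour, it only encodes the order of events
claimed in the table (with "lock" as a plain int and the ll/sc halves of
spin_lock() as separate statements) and checks where CPU 2 ends up.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_obj {
	bool dead;
};

/*
 * Replays the interleaving from the table, one step at a time on a single
 * thread. Returns true if CPU 2 ends up at BUG_ON(o->dead) with o->dead set.
 */
static bool sketch_replay_interleaving(void)
{
	struct sketch_obj victim = { false };
	struct sketch_obj *object = &victim;	/* shared pointer */
	int lock = 1;				/* held by CPU 1 */
	struct sketch_obj *o2, *o3;		/* "o" on CPU 2 and CPU 3 */
	int r1;

	lock = 0;			/* CPU 1: spin_unlock(&lock) */

	r1 = lock;			/* CPU 2: LL half of spin_lock() */
	o2 = object;			/* CPU 2: READ_ONCE() hoisted above SC */

	o3 = object;			/* CPU 3: o = object */
	object = NULL;			/* CPU 3: object = NULL; smp_mb() */
	if (lock != 0)			/* CPU 3: spin_unlock_wait() sees 0 */
		return false;

	if (r1 == 0)
		lock = 1;		/* CPU 2: SC half of spin_lock() */

	o3->dead = true;		/* CPU 3: smp_mb(); o->dead = true */

	return o2 && o2->dead;		/* CPU 2: if (o) BUG_ON(o->dead) */
}
```

Because CPU 3 samples the lock between the two halves of CPU 2's
spin_lock(), it never waits, and CPU 2 keeps a stale non-NULL pointer.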


To show this, I also translated this situation into a PPC litmus test for
herd[1]:

PPC spin-lock-wait
"
r1: local variable of lock
r2: constant 1
r3: constant 0 or NULL
r4: local variable of object, i.e. o
r5: local variable of *o (simulate ->dead as I didn't know 
how to write fields of structure in herd ;-()
r13: the address of lock, i.e. &lock
r14: the address of object, i.e. &object
"
{
0:r1=0;0:r2=1;0:r3=0;0:r13=lock;0:r14=object;
1:r1=0;1:r2=1;1:r3=0;1:r4=0;1:r5=0;1:r13=lock;1:r14=object;
2:r1=0;2:r13=lock;
lock=1; object=old; old=0;
}

P0  | P1 | P2 ;
ld r4,0(r14)| Lock:  | stw r1,0(r13);
std r3,0(r14)   | lwarx r1,r3,r13| ;
| cmpwi r1,0 | ;
sync| bne Lock   | ;
| stwcx. r2,r3,r13   | ;
Wait:   | bne Lock   | ;
lwz r1,0(r13)   | lwsync | ;
cmpwi r1,0  | ld r4,0(r14)   | ;
bne Wait| cmpwi r4,0 | ;
| beq Fail   | ;
sync| lwz r5, 0(r4)  | ;
stw r2,0(r4)| Fail:  | ;
| lwsync | ;
| stw r3, 0(r13) | ;

exists
(1:r4=old /\ 1:r5=1)

whose result says that (1:r4=old /\ 1:r5=1) can happen:

Test spin-lock-wait Allowed
States 3
1:r4=0; 1:r5=0;
1:r4=old; 1:r5=0;
1:r4=old; 1:r5=1;
Loop Ok
Witnesses
Positive: 18 Negative: 108
Condition exists (1:r4=old /\ 1:r5=1)
Observation spin-lock-wait Sometimes 18 108
Hash=244f8

Re: [PATCH V2] mm: fix kernel crash in khugepaged thread

2015-11-11 Thread yalin wang

OK,
I will send a V3 patch.
> On Nov 5, 2015, at 16:50, Kirill A. Shutemov  wrote:
> 
> On Thu, Nov 05, 2015 at 09:12:34AM +0100, Vlastimil Babka wrote:
>> On 10/29/2015 01:35 AM, Kirill A. Shutemov wrote:
 @@ -2605,9 +2603,9 @@ out_unmap:
/* collapse_huge_page will return with the mmap_sem released */
collapse_huge_page(mm, address, hpage, vma, node);
}
 -out:
 -  trace_mm_khugepaged_scan_pmd(mm, page_to_pfn(page), writable, 
 referenced,
 -   none_or_zero, result, unmapped);
 +  trace_mm_khugepaged_scan_pmd(mm, pte_present(pteval) ?
 +  pte_pfn(pteval) : -1, writable, referenced,
 +  none_or_zero, result, unmapped);
>>> 
>>> maybe passing down pte instead of pfn?
>> 
>> Maybe just pass the page, and have tracepoint's fast assign check for !NULL 
>> and
>> do page_to_pfn itself? That way the complexity and overhead is only in the
>> tracepoint and when enabled.
> 
> Agreed.
> 
> -- 
> Kirill A. Shutemov

[PATCH V3] mm: fix kernel crash in khugepaged thread

2015-11-11 Thread yalin wang

This crash is caused by a NULL pointer dereference in the page_to_pfn() macro,
when page == NULL:

[  182.639154 ] Unable to handle kernel NULL pointer dereference at virtual 
address 
[  182.639491 ] pgd = ffc00077a000
[  182.639761 ] [] *pgd=b9422003, *pud=b9422003, 
*pmd=b9423003, *pte=006008000707
[  182.640749 ] Internal error: Oops: 9406 [#1] SMP
[  182.641197 ] Modules linked in:
[  182.641580 ] CPU: 1 PID: 26 Comm: khugepaged Tainted: GW   
4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[  182.642077 ] Hardware name: linux,dummy-virt (DT)
[  182.642227 ] task: ffc07957c080 ti: ffc079638000 task.ti: 
ffc079638000
[  182.642598 ] PC is at khugepaged+0x378/0x1af8
[  182.642826 ] LR is at khugepaged+0x418/0x1af8
[  182.643047 ] pc : [] lr : [] pstate: 
6145
[  182.643490 ] sp : ffc07963bca0
[  182.643650 ] x29: ffc07963bca0 x28: ffc00075c000
[  182.644024 ] x27: ffc00f275040 x26: ffc0006c7000
[  182.644334 ] x25: 00e848800f51 x24: 0640
[  182.644687 ] x23: 0002 x22: 
[  182.644972 ] x21:  x20: 
[  182.645446 ] x19:  x18: 007ff86d0990
[  182.645931 ] x17: 007ef9c8 x16: ffc98390
[  182.646236 ] x15:  x14: 
[  182.646649 ] x13: 016a x12: 
[  182.647046 ] x11: ffc07f025020 x10: 
[  182.647395 ] x9 : 0048 x8 : ffc000721e28
[  182.647872 ] x7 :  x6 : ffc07f02d000
[  182.648261 ] x5 : fe00 x4 : ffc00f275040
[  182.648611 ] x3 :  x2 : ffc00f2ad000
[  182.648908 ] x1 :  x0 : ffc000727000
[  182.649147 ]
[  182.649252 ] Process khugepaged (pid: 26, stack limit = 0xffc079638020)
[  182.649724 ] Stack: (0xffc07963bca0 to 0xffc07963c000)
[  182.650141 ] bca0: ffc07963be30 ffcb5044 ffc07961fb80 
ffc00072e630
[  182.650587 ] bcc0: ffc0005d5090  ffc000197d34 

[  182.651009 ] bce0:    

[  182.651446 ] bd00: ffc07963bd90 ffc07f1cbf80 4f3be003 
ffc00f2750a4
[  182.651956 ] bd20: ffc00f3bf000 ffc1 0001 
ffc07f085740
[  182.652520 ] bd40: ffc00f2ad188 ffc0 0620 
ffc00f275040
[  182.652972 ] bd60: ffc0006b1a90 ffc079638000 ffc07963be20 
ffc00f0144d0
[  182.653357 ] bd80: ffc0 0640 ffc00f0144d0 
0a080001
[  182.653793 ] bda0: 1001 ffc1 ffc07f025000 
ffc00f2750a8
[  182.654226 ] bdc0: 000105f8 ffc00075a000 06a0 
ffc000727000
[  182.654522 ] bde0: ffc0006e8478 ffc0 0001 
ffc078fb9000
[  182.654869 ] be00: ffc07963be30 ffc0 ffc07957c080 
ffccfc4c
[  182.655225 ] be20: ffc07963be20 ffc07963be20  
ffc85c50
[  182.655588 ] be40: ffcb4f64 ffc07961fb80  

[  182.656138 ] be60:  ffcbee2c ffcb4f64 

[  182.656609 ] be80:    

[  182.657145 ] bea0: ffc07963bea0 ffc07963bea0  
ffc0
[  182.657475 ] bec0: ffc07963bec0 ffc07963bec0  

[  182.657922 ] bee0:    

[  182.658558 ] bf00:    

[  182.658972 ] bf20:    

[  182.659291 ] bf40:    

[  182.659722 ] bf60:    

[  182.660122 ] bf80:    

[  182.660654 ] bfa0:    

[  182.661064 ] bfc0:    
0005
[  182.661466 ] bfe0:    

[  182.661848 ] Call trace:
[  182.662050 ] [] khugepaged+0x378/0x1af8
[  182.662294 ] [] kthread+0xdc/0xf4
[  182.662605 ] [] ret_from_fork+0xc/0x40
[  182.663046 ] Code: 35001700 f0002c60 aa0703e3 f9009fa0 (f94000e0)
[  182.663901 ] ---[ end trace 637503d8e28ae69e  ]---
[  182.664160 ] Kernel panic - not syncing: Fatal exception
[  182.664571 ] CPU2: stopping
[  182.664794 ] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G  D W   
4.3.0-rc6-next-20151022ajb-1-g32f3386-dirty #3
[  182.665248 ] Hardware name: linux,dummy-virt (DT)

add the trace point with TP_CONDITION(page),
avoid

Re: [PATCH v2 3/4] KVM: X86: Migration is supported

2015-11-11 Thread Jian Zhou




On 2015/11/11 23:15, Paolo Bonzini wrote:



On 23/10/2015 11:15, Jian Zhou wrote:

data *msr_info)
}
break;
case MSR_IA32_DEBUGCTLMSR:
-   if (!data) {
-   /* We support the non-activated case already */
-   break;
-   } else if (data & ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF)) {
-   /* Values other than LBR and BTF are vendor-specific,
-  thus reserved and should throw a #GP */
+   supported = DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF |
+   DEBUGCTLMSR_FREEZE_LBRS_ON_PMI;
+
+   if (data & ~supported) {
+   /*
+* Values other than LBR/BTF/FREEZE_LBRS_ON_PMI
+* are not supported, thus reserved and should throw a 
#GP
+*/
+   vcpu_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, 
nop\n",
+   __func__, data);
return 1;
}
-   vcpu_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n",
-   __func__, data);
+   if (kvm_x86_ops->set_debugctlmsr) {
+   if (kvm_x86_ops->set_debugctlmsr(vcpu, data))
+   return 1;
+   }
+   else
+   return 1;
+
break;
case 0x200 ... 0x2ff:
return kvm_mtrr_set_msr(vcpu, msr, data);
@@ -2078,6 +2090,33 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct 
msr_data *msr_info)
vcpu_unimpl(vcpu, "disabled perfctr wrmsr: "
"0x%x data 0x%llx\n", msr, data);
break;
+   case MSR_LBR_STATUS:
+   if (kvm_x86_ops->set_debugctlmsr) {
+   vcpu->arch.lbr_status = (data == 0) ? 0 : 1;
+   if (data)
+   kvm_x86_ops->set_debugctlmsr(vcpu,
+   DEBUGCTLMSR_LBR | 
DEBUGCTLMSR_FREEZE_LBRS_ON_PMI);
+   } else
+   vcpu_unimpl(vcpu, "lbr is disabled, ignored wrmsr: "
+   "0x%x data 0x%llx\n", msr, data);
+   break;
+   case MSR_LBR_SELECT:
+   case MSR_LBR_TOS:
+   case MSR_PENTIUM4_LER_FROM_LIP:
+   case MSR_PENTIUM4_LER_TO_LIP:
+   case MSR_PENTIUM4_LBR_TOS:
+   case MSR_IA32_LASTINTFROMIP:
+   case MSR_IA32_LASTINTTOIP:
+   case MSR_LBR_CORE2_FROM ... MSR_LBR_CORE2_FROM + 0x7:
+   case MSR_LBR_CORE2_TO ... MSR_LBR_CORE2_TO + 0x7:
+   case MSR_LBR_NHM_FROM ... MSR_LBR_NHM_FROM + 0x1f:
+   case MSR_LBR_NHM_TO ... MSR_LBR_NHM_TO + 0x1f:
+   if (kvm_x86_ops->set_lbr_msr)
+   kvm_x86_ops->set_lbr_msr(vcpu, msr, data);
+   else
+   vcpu_unimpl(vcpu, "lbr is disabled, ignored wrmsr: "
+   "0x%x data 0x%llx\n", msr, data);


I think you can just do this in kvm_x86_ops->set_msr.  The old
implementation for DEBUGCTL MSR can be moved to svm.c.


  I think you mean "moved to vmx.c"?

  Thanks,
  Jian


Paolo

.



[PATCH] drm/tegra: fix locking of SET_TILING ioctl

2015-11-11 Thread Alexandre Courbot

drm_gem_object_unreference() now expects obj->dev->struct_mutex to be
held. Use the newly-introduced drm_gem_object_unreference_unlocked()
which handles locking for us.

If we don't do this drm_gem_object_unreference() will get noisy about
struct_mutex not being held every time we call the SET_TILING ioctl.

Signed-off-by: Alexandre Courbot 
---
 drivers/gpu/drm/tegra/drm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index cc48334ef164..c0ae89865958 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -683,7 +683,7 @@ static int tegra_gem_set_tiling(struct drm_device *drm, 
void *data,
bo->tiling.mode = mode;
bo->tiling.value = value;
 
-   drm_gem_object_unreference(gem);
+   drm_gem_object_unreference_unlocked(gem);
 
return 0;
 }
-- 
2.6.2

Change of my repository (samsung-krzk)

2015-11-11 Thread Krzysztof Kozlowski

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Stephen,


I moved my samsung-krzk repository for Samsung SoC (as co-maintainer)
from:
git://github.com/krzk/linux.git#for-next

to:
git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git#for-next

(the name of branch stays the same)

Can you update the linux-next trees?


Best regards,
Krzysztof

-BEGIN PGP SIGNATURE-
Version: GnuPG v1

iQIcBAEBAgAGBQJWRDoSAAoJEME3ZuaGi4PXerQP/0dlSlUBsNM6cZnE5cQ8L3uY
EgwqQm0vc6KBGl3+5fZRcDUt/6izD4DfR1tUqE81ChkCfGvJXFzZVV9KckMF6Nda
AyiS6FSaMb9CmMbDVgfUBG9dB1ODR7XIHg+imYgW3VV9fgaBhRL/Ex2hoSM7r8yl
UsVyQRl4W2JQX4ibEF4MMwFh/XqeV8Zb3OtW6r70guplfVzXA9RyBZ69Vs01Hhnu
OKGJoc09IncsCSFX7daxdN6WTmnL+HqL2WhkzTosApqPI4DKVAv7z3dbUln4qUmf
Z8WielS5ubf1JkY1abNL1lsIOJmkYNrABJOVj+tUoQ9/yDGI10A4kOf85gZUtc10
krBiC3C+hf+WKGnZIkWYxH9s5N2HXjgCwVuqzwi46BlBXCt18qdFOAkMqhRq+2ii
482D5PDRFI6D/mI+uxz3m6WJWUgcLHX7bomO4g5H0I4sxwyxWteH8ZWQZF2j1zvp
61y9l6Ob1Flgs/TlDkc/vELG6iKtrrgTxYkku5e0F/ifC8HGGXoPdI3piQq6n4Pg
8ETB8krz5sPzm97JVT+B4ys8opBHTJifRAbAAamhlTy8BQ0cPY3nU2RPysehGpo5
bcYTsSkjeKmbvQ4m0FU9bsTGPTs7rmbUCj0rNMP2EVzcDu2iQ6E5P+MlEflMYodp
2zhWbJ1BQyYoSsO7/tCF
=OtZP
-END PGP SIGNATURE-

RE: PING: [PATCH] net: smsc911x: Reset PHY during initialization

2015-11-11 Thread Pavel Fedin

 Hello!

> >> If you think I should reconsider the patch, you should resubmit it.
> >
> >  I understand this, of course. But, before doing this i'd like to
> > clarify your concern, why exactly you think that loopback test will
> > break.
> 
> If I didn't reply it means I don't have anything constructive to say
> to you, and probably I'll end up agreeing with your analysis of the
> loopback test issue.
> 
> I'm not going to ask more than one more time for you to repost your
> patch.

 But in this case I don't have anything to change, do I? Or is it
just a formal requirement to RESEND? I can do that if you want.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



RE: [PATCH v7 1/4] Documentation: dt-bindings: Describe SROMc configuration

2015-11-11 Thread Pavel Fedin

 Hello!

> >> > +- samsung,srom-timing : array of 6 integers, specifying bank timings in the
> >> > +                        following order: Tacp, Tcah, Tcoh, Tacc, Tcos, Tacs.
> >> > +                        Each value is specified in cycles and has the following
> >> > +                        meaning and valid range:
> >> > +                        Tacp : Page mode access cycle at Page mode (0 - 15)
> >> > +                        Tcah : Address holding time after CSn (0 - 15)
> >> > +                        Tcoh : Chip selection hold on OEn (0 - 15)
> >> > +                        Tacc : Access cycle (0 - 31, the actual time is N + 1)
> >> > +                        Tcos : Chip selection set-up before OEn (0 - 15)
> >> > +                        Tacs : Address set-up before CSn (0 - 15)
> >>
> >> This is not easily extended. Perhaps a property per value instead.
> >
> >  We had a discussion with Krzysztof about it, he agreed with this form of the property.
> > My concern was that it's just too much typing, and makes little sense because these
> > settings always go together. If register layout changes, or parameter set changes in
> > incompatible way, then it's another device, not exynos-srom anymore.
> >  So would you agree with that, or is your position strong?
> 
> I'm thinking for a new version of the controller which could add (or
> remove) new timing parameters, but then I guess you can interpret the
> field differently based on the compatible string. Anyway, your problem
> to deal with.

 Of course. My thought is that if the compatible string is different,
then it is already a different device, and of course it would have different
parameters.
 So, OK, I'll post a new version with fixed documentation today.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia
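
For readers following the samsung,srom-timing discussion, the ranges quoted above can be written down as a small range check. This is a minimal sketch, not code from the patch: `srom_timing_check()` and `srom_timing_max` are hypothetical names, and only the cell order and the upper bounds are taken from the quoted binding text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SROM_TIMING_CELLS = 6 };

/*
 * Upper bounds in the order given by the binding text:
 * Tacp, Tcah, Tcoh, Tacc, Tcos, Tacs.
 */
static const uint32_t srom_timing_max[SROM_TIMING_CELLS] = {
	15, 15, 15, 31, 15, 15,
};

/*
 * Return 0 if every cell is within range, the 1-based index of the
 * first out-of-range cell, or -1 if the array has the wrong shape.
 */
int srom_timing_check(const uint32_t *timing, size_t count)
{
	size_t i;

	if (!timing || count != SROM_TIMING_CELLS)
		return -1;
	for (i = 0; i < count; i++)
		if (timing[i] > srom_timing_max[i])
			return (int)i + 1;
	return 0;
}
```

Because the six values are checked against a single table, a controller with a different parameter set would need its own table, which matches the point made in the thread that a different compatible string implies a different layout.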



Re: [PATCH v5 0/2] sched: consider missed ticks when updating cpu load

2015-11-11 Thread Byungchul Park

On Wed, Nov 11, 2015 at 06:38:10PM +0900, Byungchul Park wrote:
> On Tue, Nov 10, 2015 at 09:36:00AM +0900, byungchul.p...@lge.com wrote:
> > From: Byungchul Park 
> > 
> > change from v4 to v5
> > - fix comments and commit message
> > - take new_load into account in update_cpu_load_nohz() and __update_cpu_load()
> >   because it's non-zero in NOHZ_FULL
> > 
> > change from v3 to v4
> > - focus the problem on full NOHZ
> > 
> > change from v2 to v3
> > - add a patch which make __update_cpu_load() handle active tickless
> > 
> > change from v1 to v2
> > - add some additional commit message (logic is same exactly)
> > 
> > additionally i tried to use the return value of hrtimer_forward() instead of
> > jiffies to get pending tick for updating cpu load. it's easy for
> > update_cpu_load_nohz() to do that. but for update_idle_cpu_load(), i need
> > more time to think about how to implement it nicely.
> 
> Actually because of update_idle_load() which is used when doing nohz
> load balancing while the cpu has been still tick-stopped, this looks
> not be able to be implemented without any kind of last_update_xxx
                                                                ^^^
                                                    last_load_update_tick
> variable. What do you think about it, Peter?
> 
> > 
> > and i will try to fix other stuffs caused by full NOHZ later with another
> > patch. i decided to send this patch at first because "cpu load update" is a
> > standalone problem which is not coupled with other stuffs.
> > 
> > Byungchul Park (2):
> >   sched: make __update_cpu_load() handle NOHZ_FULL tickless
> >   sched: make update_cpu_load_nohz() consider missed ticks in NOHZ_FULL
> > 
> >  include/linux/sched.h|4 ++--
> >  kernel/sched/fair.c  |   57 +-
> >  kernel/time/tick-sched.c |8 +++
> >  3 files changed, 52 insertions(+), 17 deletions(-)
> > 
> > -- 
> > 1.7.9.5
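
The idea of accounting for missed ticks from a stored "last update" tick, as discussed above, can be sketched as follows. This is a simplified model and not the scheduler's code: `load_step()` and `load_catch_up()` are hypothetical names, the averaging formula is an assumed exponential average with a power-of-two scale, and the missed ticks are replayed in a plain loop.

```c
#include <assert.h>

/* One tick of averaging at index idx: load = (load * (2^idx - 1) + sample) / 2^idx. */
unsigned long load_step(unsigned long load, unsigned long sample, unsigned int idx)
{
	unsigned long scale = 1UL << idx;

	return (load * (scale - 1) + sample) >> idx;
}

/*
 * Catch up after the tick was stopped. 'last_update' and 'now' are tick
 * counts. Every missed tick is replayed with 'missed_sample' (0 models an
 * idle CPU, non-zero models a CPU that kept running without a tick), and
 * the final tick uses the current 'sample'.
 */
unsigned long load_catch_up(unsigned long load, unsigned long sample,
			    unsigned long missed_sample, unsigned int idx,
			    unsigned long last_update, unsigned long now)
{
	unsigned long pending = now - last_update;

	if (pending == 0)
		return load;
	while (--pending)
		load = load_step(load, missed_sample, idx);
	return load_step(load, sample, idx);
}
```

The loop keeps the model easy to follow; a real implementation would likely avoid iterating once per missed tick.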

Re: [PATCH] perf symbols/KCORE: Rebuild rbtree when adjusting symbols for kcore

2015-11-11 Thread Adrian Hunter

On 11/11/15 22:44, Arnaldo Carvalho de Melo wrote:
> On Wed, Nov 11, 2015 at 03:02:35PM +0800, Wangnan (F) wrote:
>>
>>
>> On 2015/11/6 21:59, Adrian Hunter wrote:
>>> On 06/11/15 15:19, Arnaldo Carvalho de Melo wrote:
>>>> On Fri, Nov 06, 2015 at 09:46:12AM +, Wang Nan wrote:
>>>>> In dso__split_kallsyms_for_kcore(), current code adjusts symbol's
>>>>> address but only reinsert it into rbtree if the symbol belongs to
>>>>> another map. However, the expression for adjusting symbol (pos->start -=
>>>>> curr_map->start - curr_map->pgoff) can change the relative order between
>>>>> two symbols (even if the affected symbols are in different maps, in
>>>>> kcore case they are possible to share one same dso), which damages the
>>>>> rbtree.
>>>> Right, some code does change the symbol values it gets from whatever
>>>> symtab (kallsyms, ELF, JIT maps, etc) when it should instead use the per
>>>> map data structure (struct map) and its ->{map,unmap}_ip, ->pgoff,
>>>> ->reloc, members for that :-\
>>>>
>>>> I.e. 'struct dso' should be just what comes from the symtab, while
>>>> 'struct map' should be about where that DSO is in memory.
>>>>
>>>> With that in mind, do you still think your fix is the correct one?
>>>>
>>>> Adrian?
>>> The problem is when the order in memory (in kallsyms) is different
>>> to the order on the dso (kcore).
>>>
>>> I think to make it more general it needs to insert to a new tree.
>>> e.g.
>>>
>>
>> I have tested this patch and it works for me.
>>
>> Thank you.
> 
> Adrian, I took your explanation as the commit log, adding your S-o-B, so
> far not provided, is that ok with you, can I have your S-o-B?

Yes.  Thank you!

> 
> From 500fe7dbd2c6cebc3638196352439490e1e3a8a4 Mon Sep 17 00:00:00 2001
> From: Adrian Hunter 
> Date: Fri, 6 Nov 2015 15:59:29 +0200
> Subject: [PATCH 1/1] perf symbols: Rebuild rbtree when adjusting symbols for
>  kcore
> 
> Normally symbols are read from the DSO and adjusted, if need be, so that
> the symbol start matches the file offset in the DSO file (we want the
> file offset because that is what we know from MMAP events). That is done
> by dso__load_sym() which inserts the symbols *after* adjusting them.
> 
> In the case of kcore, the symbols have been read from kallsyms and the
> symbol start is the memory address. The symbols have to be adjusted to
> match the kcore file offsets. dso__split_kallsyms_for_kcore() does that,
> but now the adjustment is being done *after* the symbols have been
> inserted. It appears dso__split_kallsyms_for_kcore() was assuming that
> changing the symbol start would not change the order in the rbtree -
> which is, of course, not guaranteed.
> 
> Signed-off-by: Adrian Hunter 
> Tested-by: Wang Nan 
> Cc: Jiri Olsa 
> Cc: Masami Hiramatsu 
> Cc: Namhyung Kim 
> Cc: Zefan Li 
> Cc: pi3or...@163.com
> Link: http://lkml.kernel.org/r/563cb241.2090...@intel.com
> Signed-off-by: Arnaldo Carvalho de Melo 
> ---
>  tools/perf/util/symbol.c | 30 ++
>  1 file changed, 14 insertions(+), 16 deletions(-)
> 
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index b4cc7662677e..09343a880c0b 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -654,19 +654,24 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
>   struct map_groups *kmaps = map__kmaps(map);
>   struct map *curr_map;
>   struct symbol *pos;
> - int count = 0, moved = 0;
> + int count = 0;
> + struct rb_root old_root = dso->symbols[map->type];
>   struct rb_root *root = &dso->symbols[map->type];
>   struct rb_node *next = rb_first(root);
>  
>   if (!kmaps)
>   return -1;
>  
> + *root = RB_ROOT;
> +
>   while (next) {
>   char *module;
>  
>   pos = rb_entry(next, struct symbol, rb_node);
>   next = rb_next(&pos->rb_node);
>  
> + rb_erase_init(&pos->rb_node, &old_root);
> +
>   module = strchr(pos->name, '\t');
>   if (module)
>   *module = '\0';
> @@ -674,28 +679,21 @@ static int dso__split_kallsyms_for_kcore(struct dso *dso, struct map *map,
>   curr_map = map_groups__find(kmaps, map->type, pos->start);
>  
>   if (!curr_map || (filter && filter(curr_map, pos))) {
> - rb_erase_init(&pos->rb_node, root);
>   symbol__delete(pos);
> - } else {
> - pos->start -= curr_map->start - curr_map->pgoff;
> - if (pos->end)
> - pos->end -= curr_map->start - curr_map->pgoff;
> - if (curr_map->dso != map->dso) {
> - rb_erase_init(&pos->rb_node, root);
> - symbols__insert(
> - &curr_map->dso->symbols[curr_map->type],
> - pos);
> -
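
The ordering problem described in the commit log above (keys changed after insertion, so the container is no longer sorted) and the "insert into a new tree" remedy can be shown with a much simpler container. The sketch below is an illustration only: a sorted singly linked list stands in for the rbtree, and `struct sym`, `sym_insert()`, `sym_adjust_and_rebuild()` and `sym_is_sorted()` are hypothetical names, with `delta` modelling curr_map->start - curr_map->pgoff for the map a symbol falls in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified symbol: 'start' is the sort key. */
struct sym {
	unsigned long start;
	unsigned long delta;	/* models curr_map->start - curr_map->pgoff */
	struct sym *next;
};

/* Insert into a list kept sorted by ascending start (stand-in for the rbtree). */
void sym_insert(struct sym **head, struct sym *s)
{
	while (*head && (*head)->start <= s->start)
		head = &(*head)->next;
	s->next = *head;
	*head = s;
}

/*
 * Detach every symbol from the old container, adjust its key, and insert
 * it into a fresh one, so the ordering is re-established for the new keys.
 */
void sym_adjust_and_rebuild(struct sym **head)
{
	struct sym *old = *head;

	*head = NULL;
	while (old) {
		struct sym *s = old;

		old = old->next;
		s->start -= s->delta;
		sym_insert(head, s);
	}
}

/* Return 1 if the list is sorted by start, 0 otherwise. */
int sym_is_sorted(const struct sym *head)
{
	for (; head && head->next; head = head->next)
		if (head->start > head->next->start)
			return 0;
	return 1;
}
```

With two symbols whose deltas differ enough, adjusting the keys in place would leave the list out of order, while the rebuild keeps it sorted and swaps their positions.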

[tip:perf/urgent] tools include: Add compiler.h to list.h

2015-11-11 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  5602ea09c19e85557f2b4d30be1d6ba349b7a038
Gitweb: http://git.kernel.org/tip/5602ea09c19e85557f2b4d30be1d6ba349b7a038
Author: Arnaldo Carvalho de Melo 
AuthorDate: Wed, 11 Nov 2015 12:54:42 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 11 Nov 2015 18:41:33 -0300

tools include: Add compiler.h to list.h

list.h needs WRITE_ONCE() since 7f5f873c6a07 ("rculist: Use WRITE_ONCE()
when deleting from reader-visible list"), so include compiler.h before
including the kernel's list.h file.

This fixes out-of-tree builds of the perf tool tarball, i.e. those
created with 'make perf-tar-src-pkg'.

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-e0rb8f7jwz0jn24ttyick...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/include/linux/list.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/include/linux/list.h b/tools/include/linux/list.h
index 76b014c..a017f15 100644
--- a/tools/include/linux/list.h
+++ b/tools/include/linux/list.h
@@ -1,3 +1,4 @@
+#include 
 #include 
 #include 
 

[tip:perf/urgent] perf session: Add missing newlines to some pr_err() calls

2015-11-11 Thread tip-bot for Arnaldo Carvalho de Melo

Commit-ID:  e87b49116dedba3464fd8d0ec9393b4841167334
Gitweb: http://git.kernel.org/tip/e87b49116dedba3464fd8d0ec9393b4841167334
Author: Arnaldo Carvalho de Melo 
AuthorDate: Mon, 9 Nov 2015 17:12:03 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 11 Nov 2015 18:41:31 -0300

perf session: Add missing newlines to some pr_err() calls

Before:

  [acme@zoo linux]$ perf evlist
  WARNING: The perf.data file's data size field is 0 which is unexpected.
  Was the 'perf record' command properly terminated?
  non matching sample_type[acme@zoo linux]$

After:

  [acme@zoo linux]$ perf evlist
  WARNING: The perf.data file's data size field is 0 which is unexpected.
  Was the 'perf record' command properly terminated?
  non matching sample_type
  [acme@zoo linux]$

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: http://lkml.kernel.org/n/tip-wscok3a2s7yrj8156oc2r...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/session.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 428149b..c35ffdd 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -29,7 +29,7 @@ static int perf_session__open(struct perf_session *session)
struct perf_data_file *file = session->file;
 
if (perf_session__read_header(session) < 0) {
-   pr_err("incompatible file format (rerun with -v to learn more)");
+   pr_err("incompatible file format (rerun with -v to learn more)\n");
return -1;
}
 
@@ -37,17 +37,17 @@ static int perf_session__open(struct perf_session *session)
return 0;
 
if (!perf_evlist__valid_sample_type(session->evlist)) {
-   pr_err("non matching sample_type");
+   pr_err("non matching sample_type\n");
return -1;
}
 
if (!perf_evlist__valid_sample_id_all(session->evlist)) {
-   pr_err("non matching sample_id_all");
+   pr_err("non matching sample_id_all\n");
return -1;
}
 
if (!perf_evlist__valid_read_format(session->evlist)) {
-   pr_err("non matching read_format");
+   pr_err("non matching read_format\n");
return -1;
}
 

[tip:perf/urgent] perf probe: Verify parameters in two functions

2015-11-11 Thread tip-bot for Wang Nan

Commit-ID:  421fd0845eaeecce6b3806f7f0c0d67d1f9ad108
Gitweb: http://git.kernel.org/tip/421fd0845eaeecce6b3806f7f0c0d67d1f9ad108
Author: Wang Nan 
AuthorDate: Fri, 6 Nov 2015 09:50:15 +
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Wed, 11 Nov 2015 18:41:32 -0300

perf probe: Verify parameters in two functions

On a kernel with only one of CONFIG_KPROBE_EVENTS and
CONFIG_UPROBE_EVENTS enabled, 'perf probe -d' causes a segfault because
perf_del_probe_events() calls probe_file__get_events() with a negative
fd.

This patch fixes it by adding parameter validation at the entry of
probe_file__get_events() and probe_file__get_rawlist(). Since they are
both non-static public functions (declared in a .h file), parameter
verification is required.

v1 -> v2: Verify fd at the head of probe_file__get_rawlist() instead of
  checking at call site (suggested by Masami and Arnaldo at [1,2]).

[1] http://lkml.kernel.org/r/50399556c9727b4d88a595c8584aab3752604...@gsjptkydcembx32.service.hitachi.net
[2] http://lkml.kernel.org/r/20151105155830.gv13...@kernel.org

Signed-off-by: Wang Nan 
Acked-by: Masami Hiramatsu 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Zefan Li 
Cc: pi3or...@163.com
Link: http://lkml.kernel.org/r/1446803415-83382-1-git-send-email-wangn...@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/probe-file.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c
index 89dbeb9..e3b3b92 100644
--- a/tools/perf/util/probe-file.c
+++ b/tools/perf/util/probe-file.c
@@ -138,6 +138,9 @@ struct strlist *probe_file__get_rawlist(int fd)
char *p;
struct strlist *sl;
 
+   if (fd < 0)
+   return NULL;
+
sl = strlist__new(NULL, NULL);
 
fp = fdopen(dup(fd), "r");
@@ -271,6 +274,9 @@ int probe_file__get_events(int fd, struct strfilter *filter,
const char *p;
int ret = -ENOENT;
 
+   if (!plist)
+   return -EINVAL;
+
namelist = __probe_file__get_namelist(fd, true);
if (!namelist)
return -ENOENT;
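
The pattern this patch applies, validating a file descriptor at the entry of a public function before handing it to dup()/fdopen(), can be tried out in isolation. The following is a minimal sketch assuming a POSIX environment; `count_lines_from_fd()` is a hypothetical helper and not part of perf.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: count the lines readable from fd, validating fd first. */
int count_lines_from_fd(int fd)
{
	FILE *fp;
	int dup_fd, c, lines = 0;

	if (fd < 0)		/* reject bad descriptors at the entry point */
		return -EINVAL;

	dup_fd = dup(fd);	/* keep the caller's descriptor open after fclose() */
	if (dup_fd < 0)
		return -errno;

	fp = fdopen(dup_fd, "r");
	if (!fp) {
		int err = -errno;

		close(dup_fd);
		return err;
	}

	while ((c = fgetc(fp)) != EOF)
		if (c == '\n')
			lines++;

	fclose(fp);
	return lines;
}
```

Checking inside the function rather than at each call site means every caller, including future ones, gets an error code instead of undefined behaviour from a negative descriptor.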

[PATCH V4 2/4] dma: add Qualcomm Technologies HIDMA management driver

2015-11-11 Thread Sinan Kaya

The Qualcomm Technologies HIDMA device has been designed
to support virtualization technology. The driver has been
divided into two parts to follow the hardware design:

1. HIDMA Management driver
2. HIDMA Channel driver

Each HIDMA HW consists of multiple channels. These channels
share a set of common parameters, which are initialized by
the management driver during power up. The same management
driver is used for monitoring the execution of the channels.
The management driver can also change performance behavior
dynamically, such as bandwidth allocation and prioritization.

The management driver is executed in hypervisor context and
is the main management entity for all channels provided by
the device.

Signed-off-by: Sinan Kaya 
---
 .../devicetree/bindings/dma/qcom_hidma_mgmt.txt|  61 
 drivers/dma/qcom/Kconfig   |  11 +
 drivers/dma/qcom/Makefile  |   1 +
 drivers/dma/qcom/hidma_mgmt.c  | 312 +
 drivers/dma/qcom/hidma_mgmt.h  |  38 +++
 drivers/dma/qcom/hidma_mgmt_sys.c  | 231 +++
 6 files changed, 654 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt
 create mode 100644 drivers/dma/qcom/hidma_mgmt.c
 create mode 100644 drivers/dma/qcom/hidma_mgmt.h
 create mode 100644 drivers/dma/qcom/hidma_mgmt_sys.c

diff --git a/Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt b/Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt
new file mode 100644
index 000..eb053b9
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt
@@ -0,0 +1,61 @@
+Qualcomm Technologies HIDMA Management interface
+
+The Qualcomm Technologies HIDMA device has been designed
+to support virtualization technology. The driver has been
+divided into two to follow the hardware design. The management
+driver is executed in hypervisor context and is the main
+management entity for all channels provided by the device.
+
+Each HIDMA HW consists of multiple channels. These channels
+share some set of common parameters. These parameters are
+initialized by the management driver during power up.
+Same management driver is used for monitoring the execution
+of the channels. Management driver can change the performance
+behavior dynamically such as bandwidth allocation and
+prioritization.
+
+All channel devices get probed in the hypervisor
+context during power up. They show up as DMA engine
+DMA channels. Then, before starting the virtualization, each
+channel device is unbound from the hypervisor by VFIO
+and assigned to the guest machine for control.
+
+This management driver will be used by the system
+admin to monitor/reset the execution state of the DMA
+channels. This will be the management interface.
+
+
+Required properties:
+- compatible: "qcom,hidma-mgmt-1.0";
+- reg: Address range for DMA device
+- dma-channels: Number of channels supported by this DMA controller.
+- max-write-burst-bytes: Maximum write burst in bytes. A memcpy request is
+  fragmented to multiples of this amount.
+- max-read-burst-bytes: Maximum read burst in bytes. A memcpy request is
+  fragmented to multiples of this amount.
+- max-write-transactions: Maximum write transactions to perform in a burst
+- max-read-transactions: Maximum read transactions to perform in a burst
+- channel-reset-timeout-cycles: Channel reset timeout in cycles for this SOC.
+- channel-priority: Priority of the channel.
+  Each DMA channel shares the same HW bandwidth with the other DMA channels.
+  If two requests reach the HW at the same time from a low priority and
+  a high priority channel, the high priority channel will claim the bus.
+  0=low priority, 1=high priority
+- channel-weight: Round robin weight of the channel
+  Since there are only two priority levels supported, scheduling among
+  the equal priority channels is done via weights.
+
+Example:
+
+   hidma-mgmt@f9984000 {
+           compatible = "qcom,hidma-mgmt-1.0";
+           reg = <0xf9984000 0x15000>;
+           dma-channels = <6>;
+           max-write-burst-bytes = <1024>;
+           max-read-burst-bytes = <1024>;
+           max-write-transactions = <31>;
+           max-read-transactions = <31>;
+           channel-reset-timeout-cycles = <0x500>;
+           channel-priority = <1 1 0 0 0 0>;
+           channel-weight = <1 13 10 3 4 5>;
+   };
diff --git a/drivers/dma/qcom/Kconfig b/drivers/dma/qcom/Kconfig
index 17545df..f3e2d4c 100644
--- a/drivers/dma/qcom/Kconfig
+++ b/drivers/dma/qcom/Kconfig
@@ -7,3 +7,14 @@ config QCOM_BAM_DMA
  Enable support for the QCOM BAM DMA controller.  This controller
  provides DMA capabilities for a variety of on-chip devices.
 
+config QCOM_HIDMA_MGMT
+   tristate "Qualcomm Technologies HIDMA Management support"
+   select DMA_ENGINE
+   help
+ Enable support for the Qualcomm Technologies HIDM
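
The channel-priority and channel-weight properties in the binding text above describe a two-level arbitration: priority first, then weights among channels of equal priority. The binding does not say how the hardware implements this, so the sketch below is only one possible software interpretation, a credit-based weighted round robin. `struct chan_model`, `pick_with_credit()` and `hidma_model_pick()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical software model of one channel; not a hardware register layout. */
struct chan_model {
	int priority;		/* 0 = low, 1 = high, as in channel-priority */
	unsigned int weight;	/* as in channel-weight */
	unsigned int credit;	/* remaining turns in the current round */
	bool pending;		/* channel has a request waiting */
};

/* Pick the pending channel with the most credit at the given priority, or -1. */
static int pick_with_credit(const struct chan_model *ch, size_t n, int prio)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!ch[i].pending || ch[i].priority != prio || !ch[i].credit)
			continue;
		if (best < 0 || ch[i].credit > ch[best].credit)
			best = (int)i;
	}
	return best;
}

/* Return the index of the channel to serve next, or -1 if none is pending. */
int hidma_model_pick(struct chan_model *ch, size_t n)
{
	int prio = -1, best;
	size_t i;

	for (i = 0; i < n; i++)		/* highest priority with work to do */
		if (ch[i].pending && ch[i].priority > prio)
			prio = ch[i].priority;
	if (prio < 0)
		return -1;

	best = pick_with_credit(ch, n, prio);
	if (best < 0) {			/* round exhausted: refill from weights */
		for (i = 0; i < n; i++)
			if (ch[i].priority == prio)
				ch[i].credit = ch[i].weight;
		best = pick_with_credit(ch, n, prio);
	}
	if (best < 0)			/* all weights zero: take the first pending one */
		for (i = 0; i < n && best < 0; i++)
			if (ch[i].pending && ch[i].priority == prio)
				best = (int)i;

	if (ch[best].credit)
		ch[best].credit--;
	return best;
}
```

In this model two low-priority channels with weights 2 and 1 are served in a 2:1 ratio, and a pending high-priority channel is always served before either of them.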

[PATCH V4 4/4] dma: add Qualcomm Technologies HIDMA channel driver

2015-11-11 Thread Sinan Kaya

This patch adds support for the HIDMA engine. The driver consists
of two logical blocks: the DMA engine interface and the
low-level interface. The hardware only supports memcpy/memset,
and this driver only supports the memcpy interface. Neither the HW
nor the driver supports the slave interface.

Signed-off-by: Sinan Kaya 
---
 .../devicetree/bindings/dma/qcom_hidma.txt |  18 +
 drivers/dma/qcom/Kconfig   |   9 +
 drivers/dma/qcom/Makefile  |   2 +
 drivers/dma/qcom/hidma.c   | 736 
 drivers/dma/qcom/hidma.h   | 157 
 drivers/dma/qcom/hidma_dbg.c   | 225 +
 drivers/dma/qcom/hidma_ll.c| 939 +
 7 files changed, 2086 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/qcom_hidma.txt
 create mode 100644 drivers/dma/qcom/hidma.c
 create mode 100644 drivers/dma/qcom/hidma.h
 create mode 100644 drivers/dma/qcom/hidma_dbg.c
 create mode 100644 drivers/dma/qcom/hidma_ll.c

diff --git a/Documentation/devicetree/bindings/dma/qcom_hidma.txt b/Documentation/devicetree/bindings/dma/qcom_hidma.txt
new file mode 100644
index 000..d18c8fc
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/qcom_hidma.txt
@@ -0,0 +1,18 @@
+Qualcomm Technologies HIDMA Channel driver
+
+Required properties:
+- compatible: must contain "qcom,hidma-1.0"
+- reg: Addresses for the transfer and event channel
+- interrupts: Should contain the event interrupt
+- desc-count: Number of asynchronous requests this channel can handle
+- event-channel: The HW event channel to which completions will be delivered.
+Example:
+
+   hidma_24: dma-controller@0x5c05 {
+   compatible = "qcom,hidma-1.0";
+   reg = <0 0x5c05 0x0 0x1000>,
+ <0 0x5c0b 0x0 0x1000>;
+   interrupts = <0 389 0>;
+   desc-count = <10>;
+   event-channel = <4>;
+   };
diff --git a/drivers/dma/qcom/Kconfig b/drivers/dma/qcom/Kconfig
index f3e2d4c..5588e1c 100644
--- a/drivers/dma/qcom/Kconfig
+++ b/drivers/dma/qcom/Kconfig
@@ -18,3 +18,12 @@ config QCOM_HIDMA_MGMT
  the guest OS would run QCOM_HIDMA channel driver and the
  hypervisor would run the QCOM_HIDMA_MGMT management driver.
 
+config QCOM_HIDMA
+   tristate "Qualcomm Technologies HIDMA Channel support"
+   select DMA_ENGINE
+   help
+ Enable support for the Qualcomm Technologies HIDMA controller.
+ The HIDMA controller supports optimized buffer copies
+ (user to kernel, kernel to kernel, etc.).  It only supports
+ memcpy interface. The core is not intended for general
+ purpose slave DMA.
diff --git a/drivers/dma/qcom/Makefile b/drivers/dma/qcom/Makefile
index 1a5a96d..2b68c9c 100644
--- a/drivers/dma/qcom/Makefile
+++ b/drivers/dma/qcom/Makefile
@@ -1,2 +1,4 @@
 obj-$(CONFIG_QCOM_BAM_DMA) += bam_dma.o
 obj-$(CONFIG_QCOM_HIDMA_MGMT) += hidma_mgmt.o hidma_mgmt_sys.o
+obj-$(CONFIG_QCOM_HIDMA) +=  hdma.o
+hdma-objs:= hidma_ll.o hidma.o hidma_dbg.o ../dmaselftest.o
diff --git a/drivers/dma/qcom/hidma.c b/drivers/dma/qcom/hidma.c
new file mode 100644
index 000..1af301c
--- /dev/null
+++ b/drivers/dma/qcom/hidma.c
@@ -0,0 +1,736 @@
+/*
+ * Qualcomm Technologies HIDMA DMA engine interface
+ *
+ * Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+/*
+ * Copyright (C) Freescale Semicondutor, Inc. 2007, 2008.
+ * Copyright (C) Semihalf 2009
+ * Copyright (C) Ilya Yanok, Emcraft Systems 2010
+ * Copyright (C) Alexander Popov, Promcontroller 2014
+ *
+ * Written by Piotr Ziecik . Hardware description
+ * (defines, structures and comments) was taken from MPC5121 DMA driver
+ * written by Hongjun Chen .
+ *
+ * Approved as OSADL project by a majority of OSADL members and funded
+ * by OSADL membership fees in 2009;  for details see www.osadl.org.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the Free
+ * Software Foundation; either version 2 of the License, or (at your option)
+ * any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * The full GNU General Public License is includ

[PATCH V4 0/4] dma: add Qualcomm Technologies HIDMA driver

2015-11-11 Thread Sinan Kaya

The Qualcomm Technologies HIDMA device has been designed
to support virtualization technology. The driver has been
divided into two parts to follow the hardware design:

1. HIDMA Management driver
2. HIDMA Channel driver

Each HIDMA HW consists of multiple channels. These channels
share a set of common parameters, which are initialized by
the management driver during power up. The same management
driver is used for monitoring the execution of the channels.
In the future, the management driver can also change performance
behavior dynamically, such as bandwidth allocation and
prioritization.

The management driver is executed in hypervisor context and
is the main management entity for all channels provided by
the device.

Changes from V3: (https://lkml.org/lkml/2015/11/7/256)
* use git format-patch -M to reduce the difference

Changes from V3: (https://lkml.org/lkml/2015/11/7/257)
* add missing EXPORT_SYMBOL_GPL for hidma_mgmt_init_sys and
hidma_mgmt_setup
* simplify bitwise set and clear statements

Changes from V3: (https://lkml.org/lkml/2015/11/7/258)
* none

Changes from V3: (https://lkml.org/lkml/2015/11/7/259)
* remove return code on hidma_ll_start and hidma_ll_queue_request
* remove the checks after platform_get_resource.
* reorder the pm calls in failure path.
* simplify bit clear and set operations.
* correct device tree documentation compatible string
* clean unnecessary initializations and use unsigned int for
iterator types

Sinan Kaya (4):
  dma: qcom_bam_dma: move to qcom directory
  dma: add Qualcomm Technologies HIDMA management driver
  dmaselftest: add memcpy selftest support functions
  dma: add Qualcomm Technologies HIDMA channel driver

 .../devicetree/bindings/dma/qcom_hidma.txt |  18 +
 .../devicetree/bindings/dma/qcom_hidma_mgmt.txt|  61 ++
 drivers/dma/Kconfig|  11 +-
 drivers/dma/Makefile   |   2 +-
 drivers/dma/dmaengine.h|   2 +
 drivers/dma/dmaselftest.c  | 638 ++
 drivers/dma/qcom/Kconfig   |  29 +
 drivers/dma/qcom/Makefile  |   4 +
 drivers/dma/{qcom_bam_dma.c => qcom/bam_dma.c} |   6 +-
 drivers/dma/qcom/hidma.c   | 736 
 drivers/dma/qcom/hidma.h   | 157 
 drivers/dma/qcom/hidma_dbg.c   | 225 +
 drivers/dma/qcom/hidma_ll.c| 939 +
 drivers/dma/qcom/hidma_mgmt.c  | 312 +++
 drivers/dma/qcom/hidma_mgmt.h  |  38 +
 drivers/dma/qcom/hidma_mgmt_sys.c  | 231 +
 16 files changed, 3396 insertions(+), 13 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/qcom_hidma.txt
 create mode 100644 Documentation/devicetree/bindings/dma/qcom_hidma_mgmt.txt
 create mode 100644 drivers/dma/dmaselftest.c
 create mode 100644 drivers/dma/qcom/Kconfig
 create mode 100644 drivers/dma/qcom/Makefile
 rename drivers/dma/{qcom_bam_dma.c => qcom/bam_dma.c} (99%)
 create mode 100644 drivers/dma/qcom/hidma.c
 create mode 100644 drivers/dma/qcom/hidma.h
 create mode 100644 drivers/dma/qcom/hidma_dbg.c
 create mode 100644 drivers/dma/qcom/hidma_ll.c
 create mode 100644 drivers/dma/qcom/hidma_mgmt.c
 create mode 100644 drivers/dma/qcom/hidma_mgmt.h
 create mode 100644 drivers/dma/qcom/hidma_mgmt_sys.c

-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project


[PATCH V4 3/4] dmaselftest: add memcpy selftest support functions

2015-11-11 Thread Sinan Kaya

This patch adds supporting utility functions for selftest.
The intention is to share the self test code between
different drivers.

Supported test cases include:
1. dma_map_single
2. streaming DMA
3. coherent DMA
4. scatter-gather DMA
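
As a rough illustration of the scatter-gather test case listed above, the sketch below models the same shape in user space: the total size is split into equal chunks, the chunks are filled with a running byte pattern, gathered into one destination buffer, and verified. It is not the selftest code from this patch. `sg_chunk_size()` and `sg_model_selftest()` are hypothetical names, plain malloc() replaces the kernel allocators, and memcpy() stands in for the DMA transfer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Size of each of 'nents' equal chunks needed to cover 'total' bytes (rounds up). */
size_t sg_chunk_size(size_t total, size_t nents)
{
	return (total + nents - 1) / nents;
}

/*
 * Fill 'nents' source chunks with a running byte counter, gather them into
 * one destination buffer with memcpy() (standing in for the DMA transfer),
 * and verify the pattern. Returns 0 on success, -1 on failure.
 */
int sg_model_selftest(size_t total, size_t nents)
{
	size_t chunk, i, j, off = 0;
	uint8_t **src, *dest, count = 0;
	int ret = -1;

	if (!total || !nents)
		return -1;
	chunk = sg_chunk_size(total, nents);

	src = calloc(nents, sizeof(*src));
	dest = malloc(chunk * nents);
	if (!src || !dest)
		goto out;

	for (i = 0; i < nents; i++) {
		src[i] = malloc(chunk);
		if (!src[i])
			goto out;
		for (j = 0; j < chunk; j++)
			src[i][j] = count++;	/* one-byte counter, wraps at 256 */
	}

	for (i = 0; i < nents; i++, off += chunk)
		memcpy(dest + off, src[i], chunk);

	ret = 0;
	for (i = 0; i < chunk * nents; i++)
		if (dest[i] != (uint8_t)i)
			ret = -1;
out:
	if (src)
		for (i = 0; i < nents; i++)
			free(src[i]);
	free(src);
	free(dest);
	return ret;
}
```

The chunk size is computed with a general round-up division so that it works for any number of entries, including ones that are not a power of two.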

Signed-off-by: Sinan Kaya 
---
 drivers/dma/dmaengine.h   |   2 +
 drivers/dma/dmaselftest.c | 638 ++
 2 files changed, 640 insertions(+)
 create mode 100644 drivers/dma/dmaselftest.c

diff --git a/drivers/dma/dmaengine.h b/drivers/dma/dmaengine.h
index 17f983a..05b5a84 100644
--- a/drivers/dma/dmaengine.h
+++ b/drivers/dma/dmaengine.h
@@ -86,4 +86,6 @@ static inline void dma_set_residue(struct dma_tx_state *state, u32 residue)
state->residue = residue;
 }
 
+int dma_selftest_memcpy(struct dma_device *dmadev);
+
 #endif
diff --git a/drivers/dma/dmaselftest.c b/drivers/dma/dmaselftest.c
new file mode 100644
index 000..423a9a3
--- /dev/null
+++ b/drivers/dma/dmaselftest.c
@@ -0,0 +1,638 @@
+/*
+ * DMA self test code borrowed from Qualcomm Technologies HIDMA driver
+ *
+ * Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct test_result {
+   atomic_t counter;
+   wait_queue_head_t wq;
+   struct dma_device *dma;
+};
+
+static void dma_selftest_complete(void *arg)
+{
+   struct test_result *result = arg;
+   struct dma_device *dma = result->dma;
+
+   atomic_inc(&result->counter);
+   wake_up(&result->wq);
+   dev_dbg(dma->dev, "self test transfer complete :%d\n",
+   atomic_read(&result->counter));
+}
+
+/*
+ * Perform a transaction to verify the HW works.
+ */
+static int dma_selftest_sg(struct dma_device *dma,
+   struct dma_chan *chan, u64 size,
+   unsigned long flags)
+{
+   struct dma_async_tx_descriptor *tx;
+   struct sg_table sg_table;
+   struct scatterlist *sg;
+   struct test_result result;
+   dma_addr_t src, dest, dest_it;
+   u8 *src_buf, *dest_buf;
+   unsigned int i, j;
+   dma_cookie_t cookie;
+   int err;
+   int nents = 10, count;
+   bool free_channel = true;
+   int map_count;
+
+   init_waitqueue_head(&result.wq);
+   atomic_set(&result.counter, 0);
+   result.dma = dma;
+
+   if (!chan)
+   return -ENOMEM;
+
+   if (dma->device_alloc_chan_resources(chan) < 1)
+   return -ENODEV;
+
+   if (!chan->device || !dma->dev) {
+   dma->device_free_chan_resources(chan);
+   return -ENODEV;
+   }
+
+   err = sg_alloc_table(&sg_table, nents, GFP_KERNEL);
+   if (err)
+   goto sg_table_alloc_failed;
+
+   for_each_sg(sg_table.sgl, sg, nents, i) {
+   u64 alloc_sz;
+   void *cpu_addr;
+
+   alloc_sz = round_up(size, nents);
+   do_div(alloc_sz, nents);
+   cpu_addr = kmalloc(alloc_sz, GFP_KERNEL);
+
+   if (!cpu_addr) {
+   err = -ENOMEM;
+   goto sg_buf_alloc_failed;
+   }
+
+   dev_dbg(dma->dev, "set sg buf[%d] :%p\n", i, cpu_addr);
+   sg_set_buf(sg, cpu_addr, alloc_sz);
+   }
+
+   dest_buf = kmalloc(round_up(size, nents), GFP_KERNEL);
+   if (!dest_buf) {
+   err = -ENOMEM;
+   goto dst_alloc_failed;
+   }
+   dev_dbg(dma->dev, "dest:%p\n", dest_buf);
+
+   /* Fill in src buffer */
+   count = 0;
+   for_each_sg(sg_table.sgl, sg, nents, i) {
+   src_buf = sg_virt(sg);
+   dev_dbg(dma->dev,
+   "set src[%d, %p] = %d\n", i, src_buf, count);
+
+   for (j = 0; j < sg_dma_len(sg); j++)
+   src_buf[j] = count++;
+   }
+
+   /* dma_map_sg cleans and invalidates the cache in arm64 when
+* DMA_TO_DEVICE is selected for src. That's why, we need to do
+* the mapping after the data is copied.
+*/
+   map_count = dma_map_sg(dma->dev, sg_table.sgl, nents, DMA_TO_DEVICE);
+   if (!map_count) {
+   err =  -EINVAL;
+   goto src_map_failed;
+   }
+
+   dest = dma_map_single(dma->dev, dest_buf, size, DMA_FROM_DEVICE);
+
+   err = dma_mapping_error(dma->dev, dest);
+   if (err)
+   goto dest_map_failed;
+
+   /* check scatter gather list contents */
+   for_
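
The buffer sizing and fill pattern used by the scatter-gather test above can be sketched in userspace. This is a minimal sketch under stated assumptions, not the driver code: roundup_to_multiple(), chunk_size() and fill_pattern() are hypothetical helpers, and plain arithmetic is used for rounding because the kernel's round_up() macro is meant for power-of-two alignments while nents is 10 in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round size up to the next multiple of nents; works for any nents > 0. */
static uint64_t roundup_to_multiple(uint64_t size, unsigned int nents)
{
	assert(nents > 0);
	return ((size + nents - 1) / nents) * nents;
}

/* Bytes per scatterlist entry when size is split into nents equal chunks. */
static uint64_t chunk_size(uint64_t size, unsigned int nents)
{
	return roundup_to_multiple(size, nents) / nents;
}

/*
 * Fill one chunk with an incrementing byte pattern. The counter is shared
 * across chunks so the pattern continues from one entry to the next, as in
 * the "Fill in src buffer" loop of the patch.
 */
static void fill_pattern(uint8_t *buf, size_t len, unsigned int *count)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = (uint8_t)(*count)++;
}
```

With size = 1024 and nents = 10, each chunk is 103 bytes and the destination buffer needs 1030 bytes, which is larger than the 1024 bytes actually transferred.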

[PATCH V4 1/4] dma: qcom_bam_dma: move to qcom directory

2015-11-11 Thread Sinan Kaya

Creating a QCOM directory for all QCOM DMA source
files.

Signed-off-by: Sinan Kaya 
---
 drivers/dma/Kconfig| 11 ++-
 drivers/dma/Makefile   |  2 +-
 drivers/dma/qcom/Kconfig   |  9 +
 drivers/dma/qcom/Makefile  |  1 +
 drivers/dma/{qcom_bam_dma.c => qcom/bam_dma.c} |  6 +++---
 5 files changed, 16 insertions(+), 13 deletions(-)
 create mode 100644 drivers/dma/qcom/Kconfig
 create mode 100644 drivers/dma/qcom/Makefile
 rename drivers/dma/{qcom_bam_dma.c => qcom/bam_dma.c} (99%)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index b458475..47b1b98 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -408,15 +408,6 @@ config PXA_DMA
  16 to 32 channels for peripheral to memory or memory to memory
  transfers.
 
-config QCOM_BAM_DMA
-   tristate "QCOM BAM DMA support"
-   depends on ARCH_QCOM || (COMPILE_TEST && OF && ARM)
-   select DMA_ENGINE
-   select DMA_VIRTUAL_CHANNELS
-   ---help---
- Enable support for the QCOM BAM DMA controller.  This controller
- provides DMA capabilities for a variety of on-chip devices.
-
 config SIRF_DMA
tristate "CSR SiRFprimaII/SiRFmarco DMA support"
depends on ARCH_SIRF
@@ -527,6 +518,8 @@ config ZX_DMA
 # driver files
 source "drivers/dma/bestcomm/Kconfig"
 
+source "drivers/dma/qcom/Kconfig"
+
 source "drivers/dma/dw/Kconfig"
 
 source "drivers/dma/hsu/Kconfig"
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 7711a71..8dba90d 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -52,7 +52,6 @@ obj-$(CONFIG_PCH_DMA) += pch_dma.o
 obj-$(CONFIG_PL330_DMA) += pl330.o
 obj-$(CONFIG_PPC_BESTCOMM) += bestcomm/
 obj-$(CONFIG_PXA_DMA) += pxa_dma.o
-obj-$(CONFIG_QCOM_BAM_DMA) += qcom_bam_dma.o
 obj-$(CONFIG_RENESAS_DMA) += sh/
 obj-$(CONFIG_SIRF_DMA) += sirf-dma.o
 obj-$(CONFIG_STE_DMA40) += ste_dma40.o ste_dma40_ll.o
@@ -66,4 +65,5 @@ obj-$(CONFIG_TI_EDMA) += edma.o
 obj-$(CONFIG_XGENE_DMA) += xgene-dma.o
 obj-$(CONFIG_ZX_DMA) += zx296702_dma.o
 
+obj-y += qcom/
 obj-y += xilinx/
diff --git a/drivers/dma/qcom/Kconfig b/drivers/dma/qcom/Kconfig
new file mode 100644
index 000..17545df
--- /dev/null
+++ b/drivers/dma/qcom/Kconfig
@@ -0,0 +1,9 @@
+config QCOM_BAM_DMA
+   tristate "QCOM BAM DMA support"
+   depends on ARCH_QCOM || (COMPILE_TEST && OF && ARM)
+   select DMA_ENGINE
+   select DMA_VIRTUAL_CHANNELS
+   ---help---
+ Enable support for the QCOM BAM DMA controller.  This controller
+ provides DMA capabilities for a variety of on-chip devices.
+
diff --git a/drivers/dma/qcom/Makefile b/drivers/dma/qcom/Makefile
new file mode 100644
index 000..f612ae3
--- /dev/null
+++ b/drivers/dma/qcom/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_QCOM_BAM_DMA) += bam_dma.o
diff --git a/drivers/dma/qcom_bam_dma.c b/drivers/dma/qcom/bam_dma.c
similarity index 99%
rename from drivers/dma/qcom_bam_dma.c
rename to drivers/dma/qcom/bam_dma.c
index 5a250cd..5359234 100644
--- a/drivers/dma/qcom_bam_dma.c
+++ b/drivers/dma/qcom/bam_dma.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013-2014, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2013-2015, The Linux Foundation. All rights reserved.
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 and
@@ -49,8 +49,8 @@
 #include 
 #include 
 
-#include "dmaengine.h"
-#include "virt-dma.h"
+#include "../dmaengine.h"
+#include "../virt-dma.h"
 
 struct bam_desc_hw {
u32 addr;   /* Buffer physical address */
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project


Re: [GIT PULL 0/5] perf/urgent fixes

2015-11-11 Thread Ingo Molnar


* Arnaldo Carvalho de Melo  wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> The following changes since commit b71b437eedaed985062492565d9d421d975ae845:
> 
>   perf: Fix inherited events vs. tracepoint filters (2015-11-09 16:13:11 
> +0100)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git 
> tags/perf-urgent-for-mingo
> 
> for you to fetch changes up to 5602ea09c19e85557f2b4d30be1d6ba349b7a038:
> 
>   tools include: Add compiler.h to list.h (2015-11-11 18:41:33 -0300)
> 
> 
> perf/urgent fixes:
> 
> User visible:
> 
> - Add missing newlines to some pr_err() calls (Arnaldo Carvalho de Melo)
> 
> - Print full source file paths when using
>   'perf annotate --print-line --full-paths' (Michael Petlan)
> 
> - Fix 'perf probe -d' when just one out of uprobes and kprobes is
>   enabled (Wang Nan)
> 
> Developer stuff:
> 
> - Add compiler.h to list.h to fix 'make perf-tar-src-pkg' generated
>   tarballs, i.e. out of tree building (Arnaldo Carvalho de Melo)
> 
> - Add the llvm-src-base.c and llvm-src-kbuild.c files, generated by the
>   'perf test' LLVM entries, when running it in-tree, to .gitignore (Yunlong 
> Song)
> 
> Signed-off-by: Arnaldo Carvalho de Melo 
> 
> 
> Arnaldo Carvalho de Melo (2):
>   perf session: Add missing newlines to some pr_err() calls
>   tools include: Add compiler.h to list.h
> 
> Michael Petlan (1):
>   perf annotate: Support full source file paths for srcline fix
> 
> Wang Nan (1):
>   perf probe: Verify parameters in two functions
> 
> Yunlong Song (1):
>   perf test: Add llvm-src-base.c and llvm-src-kbuild.c to .gitignore
> 
>  tools/include/linux/list.h   | 1 +
>  tools/perf/tests/.gitignore  | 2 ++
>  tools/perf/util/annotate.c   | 1 +
>  tools/perf/util/probe-file.c | 6 ++
>  tools/perf/util/session.c| 8 
>  5 files changed, 14 insertions(+), 4 deletions(-)
>  create mode 100644 tools/perf/tests/.gitignore

Pulled, thanks a lot Arnaldo!

Ingo

[PATCH] mtd: phram: error handling

2015-11-11 Thread Saurabh Sengar


> More importantly, it's good to test these cases too:

> * phram is built-in (not a module), with and without a phram= line on
>   the commandline
> * writing to /sys/module/phram/parameters/phram (for both the module
>   and built-in cases)

Hi Brian,

1) I have tried phram as built-in, with and without a phram= line on the
command line, but both times there was no phram directory in /sys/module,
nor a /dev/mtd0
(do I need to enable some config options?)

2) There was no 'parameters' directory inside /sys/module/phram when I used
phram as a module,
though /dev/mtd0 and /dev/mtd0ro were present
 

I tried searching for phram in the kernel Documentation but couldn't find anything.
I have a few queries related to the phram driver; please answer if your time permits.
(Feel free to ignore them if I am taking too much of your time, I know these are
too many :) )

Q1) Is the phram driver used for accessing memory that is present but not
currently mapped in the system? Am I correct?

Q2) When I register a device with a junk name like
phram=saurabh,0x1f700,0x400, it registers fine with the name 'saurabh'. Isn't
that wrong?

Q3) If I access memory that does not even exist, the driver still registers
and even a read operation on it is successful.
e.g.: phram=ram,8Gi,1ki (my laptop has 4GB of RAM, but this accesses the 8GB address)


Regards,
Saurabh
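
For reference, the start and length values in the examples above (0x1f700, 8Gi, 1ki) are numbers with an optional binary suffix. The following is a userspace sketch of how such a value can be parsed; parse_size() is a hypothetical helper written for illustration and is not the driver's own parser.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a number with an optional k/M/G suffix, each optionally followed
 * by 'i' (ki, Mi, Gi). Returns 0 on success and -1 on malformed input.
 */
static int parse_size(const char *s, uint64_t *out)
{
	char *end;
	uint64_t val;
	unsigned int shift = 0;

	assert(s != NULL && out != NULL);

	/* Base 0 accepts decimal, octal and 0x-prefixed hexadecimal. */
	val = strtoull(s, &end, 0);
	if (end == s)
		return -1;

	switch (*end) {
	case 'k':
		shift = 10;
		break;
	case 'M':
		shift = 20;
		break;
	case 'G':
		shift = 30;
		break;
	default:
		break;
	}

	if (shift) {
		end++;
		if (*end == 'i')
			end++;
		val <<= shift;
	}

	/* Anything left over means the string was not a valid size. */
	if (*end != '\0')
		return -1;

	*out = val;
	return 0;
}
```

Note that parsing only validates the syntax. Whether the resulting address range is backed by real memory, as asked in Q3, is a separate check that this sketch does not attempt.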

---

Re: [PATCH 3/3] tools/vm/slabinfo: update struct slabinfo members' types

2015-11-11 Thread Sergey Senozhatsky

On (11/11/15 21:07), David Rientjes wrote:
[..]
> > > > /* Object size */
> > > > -   unsigned long long min_objsize = max, max_objsize = 0, 
> > > > avg_objsize;
> > > > +   unsigned int min_objsize = UINT_MAX, max_objsize = 0, 
> > > > avg_objsize;
> > > >  
> > > > /* Number of partial slabs in a slabcache */
> > > > unsigned long long min_partial = max, max_partial = 0,
> > > 
> > > avg_objsize should not be unsigned int.
> > 
> > Hm. the assumption is that `avg_objsize' cannot be larger
> > than `max_objsize', which is
> > `int object_size;' in `struct kmem_cache' from slab_def.h
> > and
> > `unsigned int object_size;' in `struct kmem_cache' from slab.h.
> > 
> > 
> >  avg_objsize = total_used / total_objects;
> > 
> 

I'm not sure I clearly understand the problems you're pointing
me to.

> This has nothing to do with object_size in the kernel.

what we have in slabinfo as slab_size(), ->object_size, etc.
is coming from slub's sysfs attrs:

chdir("/sys/kernel/slab")
while readdir
...
slab->object_size = get_obj("object_size");
slab->slab_size = get_obj("slab_size");
...

and attr show handlers are:

...
 static ssize_t slab_size_show(struct kmem_cache *s, char *buf)
 {
return sprintf(buf, "%d\n", s->size);
 }
 SLAB_ATTR_RO(slab_size);

 static ssize_t object_size_show(struct kmem_cache *s, char *buf)
 {
return sprintf(buf, "%d\n", s->object_size);
 }
 SLAB_ATTR_RO(object_size);
...

so those are sprintf("%d") of `struct kmem_cache'-s `int'
values.


> total_used and total_objects are unsigned long long.

yes, that's correct.
but `total_used / total_objects' cannot be larger than the size
of the largest object, which is represented in the kernel and
returned to user space as `int'. it must fit into `unsigned int'.


> If you need to convert max_objsize to be unsigned long long as
> well, that would be better.

... in case someday `struct kmem_cache' is updated to keep
`unsigned long' sized objects and sysfs attrs do sprintf("%lu")?
IOW, if slabs keep objects bigger than 4gig?

-ss
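
The bound argued above (a weighted average of object sizes can never exceed the largest object size) can be checked with a small userspace sketch. The struct and function names here are hypothetical and only mirror the idea of the slabinfo totals; the sums are kept in unsigned long long, as in the discussion, and only the final average is narrowed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-cache record: object size and number of objects. */
struct cache_sample {
	unsigned int object_size;
	unsigned long long objects;
};

struct objsize_stats {
	unsigned int min_objsize;
	unsigned int max_objsize;
	unsigned int avg_objsize;
};

static struct objsize_stats summarize(const struct cache_sample *c, size_t n)
{
	struct objsize_stats st = { UINT_MAX, 0, 0 };
	unsigned long long total_used = 0, total_objects = 0;

	for (size_t i = 0; i < n; i++) {
		if (c[i].object_size < st.min_objsize)
			st.min_objsize = c[i].object_size;
		if (c[i].object_size > st.max_objsize)
			st.max_objsize = c[i].object_size;
		total_used += c[i].objects * c[i].object_size;
		total_objects += c[i].objects;
	}

	if (total_objects) {
		unsigned long long avg = total_used / total_objects;

		/* A weighted mean is bounded by the largest sample. */
		assert(avg <= st.max_objsize);
		st.avg_objsize = (unsigned int)avg;
	}
	return st;
}
```

The wide type matters for the running sums, which can overflow 32 bits long before any single object size does.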

[PATCH] acpi: add support for extended IRQ to PCI link

2015-11-11 Thread Sinan Kaya

The ACPI compiler uses the extended format when the interrupt
numbers used are greater than 256. The PCI link code
currently only supports the simple interrupt format. IRQ
numbers are represented using 32 bits in the extended IRQ
syntax. This patch changes the interrupt number type to
32 bits and places an upper limit of 1020 on possible
interrupt IDs.

1020 is the maximum interrupt ID that can be assigned to
an ARM SPI interrupt according to ARM architecture.

Additional checks have been placed to prevent out of bounds
writes.

Signed-off-by: Sinan Kaya 
---
 drivers/acpi/pci_link.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
index 7c8408b..a2becab 100644
--- a/drivers/acpi/pci_link.c
+++ b/drivers/acpi/pci_link.c
@@ -1,6 +1,7 @@
 /*
  *  pci_link.c - ACPI PCI Interrupt Link Device Driver ($Revision: 34 $)
  *
+ *  Copyright (c) 2015, The Linux Foundation. All rights reserved.
  *  Copyright (C) 2001, 2002 Andy Grover 
  *  Copyright (C) 2001, 2002 Paul Diefenbaugh 
  *  Copyright (C) 2002   Dominik Brodowski 
@@ -67,12 +68,12 @@ static struct acpi_scan_handler pci_link_handler = {
  * later even the link is disable. Instead, we just repick the active irq
  */
 struct acpi_pci_link_irq {
-   u8 active;  /* Current IRQ */
+   u32 active; /* Current IRQ */
u8 triggering;  /* All IRQs */
u8 polarity;/* All IRQs */
u8 resource_type;
u8 possible_count;
-   u8 possible[ACPI_PCI_LINK_MAX_POSSIBLE];
+   u32 possible[ACPI_PCI_LINK_MAX_POSSIBLE];
u8 initialized:1;
u8 reserved:7;
 };
@@ -437,7 +438,7 @@ static int acpi_pci_link_set(struct acpi_pci_link *link, int irq)
  * enabled system.
  */
 
-#define ACPI_MAX_IRQS  256
+#define ACPI_MAX_IRQS  1020
 #define ACPI_MAX_ISA_IRQ   16
 
 #define PIRQ_PENALTY_PCI_AVAILABLE (0)
@@ -493,7 +494,8 @@ int __init acpi_irq_penalty_init(void)
penalty;
}
 
-   } else if (link->irq.active) {
+   } else if (link->irq.active &&
+   (link->irq.active < ACPI_MAX_IRQS)) {
acpi_irq_penalty[link->irq.active] +=
PIRQ_PENALTY_PCI_POSSIBLE;
}
@@ -541,14 +543,16 @@ static int acpi_pci_link_allocate(struct acpi_pci_link *link)
else
irq = link->irq.possible[link->irq.possible_count - 1];
 
-   if (acpi_irq_balance || !link->irq.active) {
+   if ((acpi_irq_balance || !link->irq.active) && (irq < ACPI_MAX_IRQS)) {
/*
-* Select the best IRQ.  This is done in reverse to promote
-* the use of IRQs 9, 10, 11, and >15.
+* Select the best IRQ.  This is done in reverse to
+* promote the use of IRQs 9, 10, 11, and >15.
 */
-   for (i = (link->irq.possible_count - 1); i >= 0; i--) {
-   if (acpi_irq_penalty[irq] >
-   acpi_irq_penalty[link->irq.possible[i]])
+   i = link->irq.possible_count;
+   while (--i) {
+   if ((link->irq.possible[i] < ACPI_MAX_IRQS) &&
+   (acpi_irq_penalty[irq] >
+   acpi_irq_penalty[link->irq.possible[i]]))
irq = link->irq.possible[i];
}
}
@@ -568,7 +572,9 @@ static int acpi_pci_link_allocate(struct acpi_pci_link *link)
acpi_device_bid(link->device));
return -ENODEV;
} else {
-   acpi_irq_penalty[link->irq.active] += PIRQ_PENALTY_PCI_USING;
+   if (link->irq.active < ACPI_MAX_IRQS)
+   acpi_irq_penalty[link->irq.active] +=
+   PIRQ_PENALTY_PCI_USING;
printk(KERN_WARNING PREFIX "%s [%s] enabled at IRQ %d\n",
   acpi_device_name(link->device),
   acpi_device_bid(link->device), link->irq.active);
-- 
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project
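
The bounds-checked selection described in the patch above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: pick_irq() and MAX_IRQS are hypothetical names, the penalty table is passed in by the caller, and the reverse loop is written so that index 0 of the candidate list is also visited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IRQS 1020	/* mirrors the ACPI_MAX_IRQS value in the patch */

/*
 * Return the candidate IRQ with the lowest penalty, scanning the list in
 * reverse so that later entries win ties. Candidates that would index past
 * the penalty table are skipped. Returns -1 if no candidate is usable.
 */
static int pick_irq(const uint32_t *possible, size_t count, const int *penalty)
{
	int best = -1;
	size_t i = count;

	assert(possible != NULL && penalty != NULL);

	while (i-- > 0) {
		uint32_t irq = possible[i];

		if (irq >= MAX_IRQS)
			continue;	/* never index past the table */
		if (best < 0 || penalty[best] > penalty[irq])
			best = (int)irq;
	}
	return best;
}
```

Every read of the penalty table is guarded by the same range check, which is the property the commit message asks for with 32-bit IRQ numbers.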


Re: samples: livepatch: init reloc list and mark as klp module

2015-11-11 Thread Jessica Yu


+++ Petr Mladek [11/11/15 16:42 +0100]:

> On Mon 2015-11-09 23:45:54, Jessica Yu wrote:
> > Initialize the list of relocation sections in the sample
> > klp_object (even if the list will be empty in this case).
> > Also mark module as a livepatch module so that the module
> > loader can appropriately initialize it.
> >
> > Signed-off-by: Jessica Yu 
> > ---
> >  samples/livepatch/livepatch-sample.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/samples/livepatch/livepatch-sample.c b/samples/livepatch/livepatch-sample.c
> > index fb8c861..2ef9345 100644
> > --- a/samples/livepatch/livepatch-sample.c
> > +++ b/samples/livepatch/livepatch-sample.c
> > @@ -89,3 +90,4 @@ static void livepatch_exit(void)
> >  module_init(livepatch_init);
> >  module_exit(livepatch_exit);
> >  MODULE_LICENSE("GPL");
> > +MODULE_INFO(livepatch, "Y");
>
> This looks a bit error prone. I wonder if we could detect this
> information another way. For example, by a check for the
> livepatch-related elf sections. If it is missing,
> we do not need to preserve struct load_info even
> when it is a livepatch.


Yeah, I agree that it is unnecessary for a livepatch module without
reloc secs to keep a copy of the load_info struct. My justification
for using MODULE_INFO is that I was trying to be consistent with the
way how other module "characteristics" are checked in the module
loader. For example, if the module came from the staging tree, the
module loader simply checks get_modinfo(info, "staging")). If the
module is a livepatch module, we check get_modinfo(info,
"livepatch")). I also thought that it might be useful additional
information for the user to be able to issue the modinfo command on a
module to see if it's a livepatch module or not (but maybe this
information won't be so useful after all, that's quite subjective).
But if we want to do a more thorough check, we could, like you said,
check for the livepatch-related elf sections before copying load_info.

Thanks,
Jessica
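
The tag lookup mentioned above can be illustrated outside the kernel. The sketch below assumes the module information is a buffer of NUL-terminated "tag=value" strings; find_modinfo() is a hypothetical userspace helper written for this example and is not the module loader's get_modinfo().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Search a buffer of NUL-terminated "tag=value" entries for tag.
 * Returns a pointer to the value, or NULL if the tag is not present.
 */
static const char *find_modinfo(const char *sec, size_t len, const char *tag)
{
	const char *p = sec;
	const char *end = sec + len;
	size_t taglen;

	assert(sec != NULL && tag != NULL);
	taglen = strlen(tag);

	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		size_t entry_len;

		if (!nul)
			break;	/* unterminated trailing entry, stop */
		entry_len = (size_t)(nul - p);

		/* Match "tag=" exactly, so "live" does not match "livepatch". */
		if (entry_len > taglen &&
		    strncmp(p, tag, taglen) == 0 && p[taglen] == '=')
			return p + taglen + 1;

		p = nul + 1;
	}
	return NULL;
}
```

A check for livepatch-related ELF sections, as suggested in the quoted review, would replace this string lookup with a scan of the section headers instead.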

Re: [PATCH v3 00/12] usb: early: add support for early printk through USB3 debug port

2015-11-11 Thread Dave Young

Hi, Baolu

On 11/12/15 at 10:45am, Lu, Baolu wrote:
> Hi Dave,
> 
> Which device are you testing with? This implementation was developed
> and tested on Intel Skylake devices.
> 
> It doesn't surprise me if it doesn't work with other silicon. But it does
> remind me to create a verified-list and put those known-to-work devices
> in it.
> 

I have not had time to do more tests. What information do you want?
lsusb -v?


> Thanks,
> Baolu
> 
> On 11/11/2015 10:25 AM, Dave Young wrote:
> >Hi,
> >
> >On 11/11/15 at 09:32am, Lu, Baolu wrote:
> >>
> >>On 11/10/2015 05:39 PM, Dave Young wrote:
> >>>Hi,
> >>>
> >>>On 11/09/15 at 03:38pm, Lu Baolu wrote:
> This patch series adds support for early printk through USB3 debug port.
> USB3 debug port is described in xHCI specification as an optional extended
> capability.
> 
> >>>I did a test with your previous patchset with the manually wired cable.
> >>>debug host detected the remote device, but later the device automatically
> >>>disconnected and earlyprintk hangs.
> >>Hi Dave,
> >>
> >>What I have done is:
> >Retested it, seems it is not stable. I got a successful boot with earlyprintk
> >But only once and there was no "Press Y to continue", I just blindly pressed 
> >Y.
> >
> >The other tests failed.
> >
> >Since it is not convenient to test, do you have a way to enable the dbc
> >after kernel boot? like echo 1 to a sysfs file to enable it.
> >>(1) Build a new kernel for debug target with this patch series applied.
> >>(2) Add "earlyprintk=xdbc" to the kernel option of debug target. The
> >>  "keep" option for early printk isn't supported yet. (That's my next
> >>  target.)
> >>
> >>(3) Boot the debug host, and disable USB runtime suspend:
> >>
> >># echo on > /sys/bus/pci/devices//power/control
> >># echo on | tee /sys/bus/usb/devices/*/power/control
> >>
> >>(4) Boot the debug target. Check the dmesg message on debug host.
> >>
> >># tail -f /var/log/kern.log
> >>
> >>Nov 12 01:27:50 allen-ult kernel: [ 1815.983374] usb 4-3: new SuperSpeed USB
> >>device number 4 using xhci_hcd
> >>Nov 12 01:27:50 allen-ult kernel: [ 1815.999595] usb 4-3: LPM exit latency
> >>is zeroed, disabling LPM.
> >>Nov 12 01:27:50 allen-ult kernel: [ 1815.999899] usb 4-3: New USB device
> >>found, idVendor=1d6b, idProduct=0004
> >>Nov 12 01:27:50 allen-ult kernel: [ 1815.02] usb 4-3: New USB device
> >>strings: Mfr=1, Product=2, SerialNumber=3
> >>Nov 12 01:27:50 allen-ult kernel: [ 1815.03] usb 4-3: Product: Remote
> >>GDB
> >>Nov 12 01:27:50 allen-ult kernel: [ 1815.04] usb 4-3: Manufacturer:
> >>Linux
> >>Nov 12 01:27:50 allen-ult kernel: [ 1815.05] usb 4-3: SerialNumber: 0001
> >>Nov 12 01:27:50 allen-ult kernel: [ 1816.000240] usb_debug 4-3:1.0: xhci_dbc
> >>converter detected
> >>Nov 12 01:27:50 allen-ult kernel: [ 1816.000360] usb 4-3: xhci_dbc converter
> >>now attached to ttyUSB0
> >>
> >>(5) Host has completed enumeration of debug device. Start "minicom" on debug
> >>host.
> >>
> >Most times I have no chance to run minicom before the usb disconnection here.
> >
> >>Welcome to minicom 2.7
> >>
> >>OPTIONS: I18n
> >>Compiled on Jan  1 2014, 17:13:19.
> >>Port /dev/ttyUSB0, 01:28:02
> >>
> >>Press CTRL-A Z for help on special keys
> >>
> >>Press Y to continue...
> >>
> >>(6) You should be able to see "Press Y to continue..." (if not, try pressing
> >>Enter key)
> >>Press Y key, debug target should go ahead with boot and early boot messages
> >>should show in minicom.
> >>
> >>Press Y to continue...
> >>[0.00] Initializing cgroup subsys cpuset
> >>[0.00] Initializing cgroup subsys cpu
> >>[0.00] Initializing cgroup subsys cpuacct
> >>[0.00] Linux version 4.3.0-rc7+ (allen@blu-skl) (gcc version 4.8.4
> >>(Ubuntu 4.8.4-2ubuntu1~14.04) 5
> >>[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.3.0-rc7+
> >>root=UUID=5a2fb856-0238-4b6e-aa45-beeccb7
> >>[0.00] KERNEL supported cpus:
> >>
> >>[...skipped...]
> >>
> >>[0.00]  Offload RCU callbacks from CPUs: 0-7.
> >>[0.00] Console: colour dummy device 80x25
> >>[0.00] console [tty0] enabled
> >>[0.00] bootconsole [earlyxdbc0] disabled
> >>
> >>
> >>So "the device automatically disconnected and earlyprintk hangs" happens in
> >>which step?
> >>
> >Here is some log on host side.
> >
> >[ 1568.052135] usb 2-2: new SuperSpeed USB device number 5 using xhci_hcd
> >[ 1568.063416] usb 2-2: LPM exit latency is zeroed, disabling LPM.
> >[ 1568.063750] usb 2-2: New USB device found, idVendor=1d6b, idProduct=0004
> >[ 1568.063751] usb 2-2: New USB device strings: Mfr=1, Product=2, 
> >SerialNumber=3
> >[ 1568.063752] usb 2-2: Product: Remote GDB
> >[ 1568.063753] usb 2-2: Manufacturer: Linux
> >[ 1568.063754] usb 2-2: SerialNumber: 0001
> >[ 1568.065580] usb_debug 2-2:1.0: xhci_dbc converter detected
> >[ 1568.066309] usb 2-2: xhci_dbc converter now attached to ttyUSB0
> >[ 1580.464706] usb 2-2: USB disconnect, device number 5
> >[ 1580.464996] xh

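You normally reach three thousand companies at most; where are the other hundred thousand-plus customers worldwide?

2015-11-11 Thread iSayor

Hello:
Are you still using the Ali platform to find foreign-trade customers?
   Still using trade shows to promote your company and products?
 You are out of date!!!
 Finding foreign-trade customers is hard right now; are you also looking for good channels beyond trade shows and B2B?
 Your industry has over a hundred thousand customers worldwide, yet you normally reach three thousand at most; would you like to reach the rest as well?
 Add QQ767650805 for a demonstration of how to find customers proactively; the sooner you use it the sooner you benefit, and nearly ten thousand companies are already using it ahead of you!!
 Recommended by the Guangdong Provincial Federation of Commerce, the number one brand for proactive customer development, with nearly ten thousand companies already benefiting. You may choose not to use it, but you should not go without knowing about it.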

Re: [PATCH 2/3] tools/vm/page-types: suppress gcc warnings

2015-11-11 Thread David Rientjes

On Thu, 12 Nov 2015, Sergey Senozhatsky wrote:

> > This can't possibly be correct, the warnings are legitimate and the result
> > of the sigsetjmp() in the function.  You may be interested in
> > returns_twice rather than marking random automatic variables as volatile.
> 
> Hm, ok. I saw no probs with `int first' and `end' being volatile
> 

This will only happen with the undocumented change in your first patch 
which adds -O2.

I don't know what version of gcc you're using, but only "first" and "end" 
being marked volatile isn't sufficient since mere code inspection would 
show that "off" will also be clobbered -- it's part of the loop.
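
The rule behind these warnings can be shown in isolation. The following is a minimal sketch, assuming nothing from page-types.c: count_retries() is a hypothetical function, and plain setjmp()/longjmp() stand in for sigsetjmp()/siglongjmp() to stay within ISO C. Any automatic variable that is modified after the setjmp() call and read after the jump back must be volatile, otherwise its value is indeterminate once optimization is enabled.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;

/*
 * Jump back to the setjmp() point 'limit' times and return how many jumps
 * were taken. 'attempts' is modified between setjmp() and longjmp() and
 * read afterwards, so it has to be volatile to keep a defined value.
 */
static int count_retries(int limit)
{
	volatile int attempts = 0;

	assert(limit >= 0);

	if (setjmp(env) != 0) {
		/* Re-entered through longjmp(); attempts keeps its value. */
	}

	if (attempts < limit) {
		attempts++;
		longjmp(env, 1);
	}
	return attempts;
}
```

In a loop like the one discussed, this applies to every variable updated inside the loop body, which is why marking only some of them volatile is not enough.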

Re: kdbus refactoring?

2015-11-11 Thread Kalle A. Sandstrom

[unrelated quotes trimmed, attribution preserved.]

> >> > On Sun, Nov 8, 2015 at 3:30 PM, Greg KH  
> >> > wrote:
> >> >> On Sun, Nov 08, 2015 at 10:39:43PM +0100, Richard Weinberger wrote:
> >> >>>
> >> >>> If you rework/redesign something you have to know what you want to 
> >> >>> change.
> >> >>> That's why I was asking for the plan...
> >> >>
> >> >> Since when do people post "plans" or "design documents" on lkml without
> >> >> real code?  Again, code will be posted when it's ready, like any other
> >> >> kernel submission.
> >> >

tl;dr: perhaps they should start doing that.

In the case of kdbus' 4.1 iteration, several of its defects could've been
spotted from its design alone. For example: the expected userspace
behaviour when clients and servers notice that a message wasn't delivered
(which was underspecified to say the least); difficulty in guaranteeing
forward progress in the face of e.g. surprising scheduling behaviour and the
dropped_msgs field; the O(n) nature of the broadcast filtering bitmap
construct wrt # of connections on the bus[0]; and the feature that permits
opaque falsification of sender credentials by the bus owner[1].

Each of these has a significant real-world impact on designs built atop
kdbus, regardless of whether such things are closer to a layman's
approximation of formal engineering or green-field hack-job proofs of
concept. Literally each must be accounted for in userspace applications that
even as much as breathe in kdbus' direction. I'd have hated to run into the
credential-faking feature if I'd already been sack-deep into a derivative
that relied on the integrity of kdbus' metadata; and as of the most recent
version, the effects of a broadcast storm during high scheduling latency
(load, memory pressure, block device lag, w/e) were still very difficult to
parlay into a predictable design that left no dangling wires.

I don't mean to suggest that the defects cited were due to an incomplete
understanding of kdbus (or indeed IPC) on the part of its authorship.
However there's a very strong argument that these aspects weren't considered
when kdbus was submitted for inclusion, and then given a hard shove.

Moreover, even a semi-formal requirements document would've made reviewing
kdbus much easier without compromising quality of review. As it stood, the
things that would be reasoned about during review had to be sussed out from
kdbus' API documentation, the comments of its developers elsewhere, various
forms of PR surrounding the topic, header files, and from existing knowledge
of things that really must be in there somewhere (e.g. locking).

Similarly knowing the (implicit, patchwork, _anything_ really) arguments why
kdbus' design meets those requirements, how its implementation corresponds
to the design, and how its test suite verifies that the design's properties
are present in the implementation, would've permitted review besides the
"off-road" style which would therefore have been available sooner. That's to
say: there'd have been less of the cranium-oriented demolitions on both
sides of the fence, if any kind or quality of design document had been
available.

Considering that a req spec would've led to a design spec, in turn leading
to impl and test plans, each subject to review, the utility that could've
been had _at 0 SLOC_ would've definitely been significant. Also, their
existence would help manage long-term rot of the implementation and its test
suite by making both unambiguously remediable where rot's effects were
discovered[2]. Further progress could be built on that foundation instead of
hacks upon hacks, ever-mounting technical debt, and eventual CADT.

For instance: who's had a poke at Linux mm in the past two years? Or the
scheduler? Who even could, and where would they start? Both appear as
interlocking mishmashes of subtle oft-historical concerns ranging from the
humdrum to "must have been employed at SGI in the early aughties to
understand" grade NUMA, making each alteration unverifiable outside of test
environments the hacker has access to -- i.e. VMs and maybe a handful of
off-the-shelf microcomputers. IIRC the last major scheduling change was a
nigh-complete rewrite that had CFS emerge from failings indicated by Con
Kolivas' interactivity work.

I'm sure there's people who're well savvy to mm, sched, and maybe even both
at once. To the rest of us it might as well be opaque as far as
non-regressive modification is concerned. kdbus is certainly big enough to
suffer a similar fate, given time.

On Mon, Nov 09, 2015 at 09:23:34AM -0800, Andy Lutomirski wrote:
> On Mon, Nov 9, 2015 at 9:07 AM, Greg KH  wrote:
> > On Mon, Nov 09, 2015 at 05:02:45PM +, Måns Rullgård wrote:
> >> Andy Lutomirski  writes:
> >>
[quote moved up top]
> >> > I ask for feedback on ideas and designs on a fairly regular basis.  I
> >> > even frequently get valuable feedback :)
> >> >
> >> > I would like to think that the kernel community would have something
> >> > of value t

Re: module: save load_info for livepatch modules

2015-11-11 Thread Jessica Yu


+++ Miroslav Benes [11/11/15 15:17 +0100]:

> On Mon, 9 Nov 2015, Jessica Yu wrote:
>
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index 3a19c79..c8680b1 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
>
> [...]
>
> > +#ifdef CONFIG_LIVEPATCH
> > +extern void klp_prepare_patch_module(struct module *mod,
> > +struct load_info *info);
> > +extern int
> > +apply_relocate_add(Elf64_Shdr *sechdrs, const char *strtab,
> > +  unsigned int symindex, unsigned int relsec,
> > +  struct module *me);
> > +#endif
> > +
> >  #else /* !CONFIG_MODULES... */
>
> apply_relocate_add() is already in include/linux/moduleloader.h (guarded
> by CONFIG_MODULES_USE_ELF_RELA), so maybe we can just include that where
> we need it. As for the klp_prepare_patch_module() wouldn't it be better to
> have it in our livepatch.h and include that in kernel/module.c?


Yeah, Petr pointed this out as well :-) I will just include
moduleloader.h for the apply_relocate_add() declaration.

It also looks like we have some disagreement over where to put
klp_prepare_patch_module(), either in livepatch/core.c (and add the
function declaration in livepatch.h, and have module.c include
livepatch.h) or in kernel/module.c, keeping the
klp_prepare_patch_module() declaration in module.h. Maybe Rusty can
provide some input.


> >  /* Given an address, look for it in the exception tables. */
> > diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> > index 6e53441..087a8c7 100644
> > --- a/kernel/livepatch/core.c
> > +++ b/kernel/livepatch/core.c
> > @@ -1001,6 +1001,23 @@ static struct notifier_block klp_module_nb = {
> > .priority = INT_MIN+1, /* called late but before ftrace notifier */
> >  };
> >
> > +/*
> > + * Save necessary information from info in order to be able to
> > + * patch modules that might be loaded later
> > + */
> > +void klp_prepare_patch_module(struct module *mod, struct load_info *info)
> > +{
> > +   Elf_Shdr *symsect;
> > +
> > +   symsect = info->sechdrs + info->index.sym;
> > +   /* update sh_addr to point to symtab */
> > +   symsect->sh_addr = (unsigned long)info->hdr + symsect->sh_offset;
> > +
> > +   mod->info = kzalloc(sizeof(*info), GFP_KERNEL);
> > +   memcpy(mod->info, info, sizeof(*info));
> > +
> > +}
>
> What about arch-specific 'struct mod_arch_specific'? We need to preserve
> it somewhere as well for s390x and other non-x86 architectures.


Ah! Thank you for catching this, I overlooked this important detail.
Yes, we do need to save the arch-specific struct. We would be in
trouble for s390 relocs if we didn't. I am trying to think of a way to
save this information for s390, since s390's module_finalize() frees
mod->arch.syminfo, which we definitely need in order for the call to
apply_relocate_add() to work. Maybe we can add an extra call right
before module_finalize() that will do some livepatch-specific
processing and copy this information (this would be in
post_relocation() in kernel/module.c). Perhaps this patchset cannot be
entirely free of arch-specific code after all :-( Still thinking.


+#ifdef CONFIG_LIVEPATCH
+   /*
+* Save sechdrs, indices, and other data from info
+* in order to patch to-be-loaded modules.
+* Do not call free_copy() for livepatch modules.
+*/
+   if (get_modinfo((struct load_info *)info, "livepatch"))
+   klp_prepare_patch_module(mod, info);
+   else
+   free_copy(info);
+#else
/* Get rid of temporary copy. */
free_copy(info);
+#endif


Maybe I am missing something but isn't it necessary to call vfree() on
info somewhere in the end?


So free_copy() will call vfree(info->hdr), except in livepatch modules
we want to keep all the elf section information stored there, so we
avoid calling free_copy(). As for the info struct itself, if you look
at the init_module and finit_module syscall definitions in
kernel/module.c, you will see that info is actually a local function
variable, simply passed in to the call to load_module(), and will be
automatically deallocated when the syscall returns. :-) No need to
explicitly free info.
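
For illustration only, the ownership split described above can be mimicked in user space. This is a sketch under stated assumptions: `fake_load_info`, `fake_free_copy()` and `fake_load()` are hypothetical stand-ins (not kernel code), and `malloc()`/`free()` stand in for the kernel's vmalloc()/vfree().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct load_info: the struct itself lives on
 * the caller's stack, only the buffer behind ->hdr is heap-allocated. */
struct fake_load_info {
    void *hdr;
    size_t len;
};

/* Plays the role of free_copy(): releases the buffer, not the struct. */
static void fake_free_copy(struct fake_load_info *info)
{
    free(info->hdr);
    info->hdr = NULL;
}

/* Returns the preserved buffer for a "livepatch" module, NULL otherwise. */
static void *fake_load(int is_livepatch)
{
    struct fake_load_info info = { 0 };   /* automatic storage */
    void *kept = NULL;

    info.len = 64;
    info.hdr = malloc(info.len);
    if (!info.hdr)
        return NULL;
    memset(info.hdr, 0, info.len);

    if (is_livepatch)
        kept = info.hdr;         /* keep the ELF data alive past return */
    else
        fake_free_copy(&info);   /* ordinary module: drop the copy */

    return kept;   /* `info` itself goes away here; nothing to free */
}
```

The caller that receives a non-NULL pointer from `fake_load(1)` becomes responsible for freeing the buffer later, which is the part the livepatch case still has to handle somewhere.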

Thanks for the comments,
Jessica

Re: [PATCH v3 01/17] mm: support madvise(MADV_FREE)

2015-11-11 Thread Daniel Micay

> I also think that the kernel should commit to either zeroing the page
> or leaving it unchanged in response to MADV_FREE (even if the decision
> of which to do is made later on).  I think that your patch series does
> this, but only after a few of the patches are applied (the swap entry
> freeing), and I think that it should be a real guaranteed part of the
> semantics and maybe have a test case.

This would be a good thing to test because it would be required to add
MADV_FREE_UNDO down the road. It would mean the same semantics as the
MEM_RESET and MEM_RESET_UNDO features on Windows, and there's probably
value in that for the sake of migrating existing software too.

For one example, it could be dropped into Firefox:

https://dxr.mozilla.org/mozilla-central/source/memory/volatile/VolatileBufferWindows.cpp

And in Chromium:

https://code.google.com/p/chromium/codesearch#chromium/src/base/memory/discardable_shared_memory.cc

Worth noting that both also support the API for pinning/unpinning that's
used by Android's ashmem too. Linux really needs a feature like this for
caches. Firefox simply doesn't drop the memory at all on Linux right now:

https://dxr.mozilla.org/mozilla-central/source/memory/volatile/VolatileBufferFallback.cpp

(Lock == pin, Unlock == unpin)

For reference:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx




Re: [PATCH 00/19] drm: Add Allwinner A10 display engine support

2015-11-11 Thread Maxime Ripard

On Fri, Oct 30, 2015 at 03:52:17PM +0100, Daniel Vetter wrote:
> On Fri, Oct 30, 2015 at 03:20:46PM +0100, Maxime Ripard wrote:
> > Hi everyone,
> > 
> > The Allwinner SoCs (except for the very latest ones) all share the
> > same set of controllers, loosely coupled together to form the display
> > pipeline.
> > 
> > Depending on the SoC, the number of instances of the controller will
> > change (2 instances of each in the A10, only one in the A13, for
> > > example), and the available outputs will change too (HDMI, composite,
> > VGA on the A20, none of them on the A13).
> > 
> > On most featured SoCs, it looks like that:
> > 
> >  ++
> >  |RAM |
> >  ++
> >||  ||
> >v|  |v
> >  ++ |  | ++
> >  |Frontend| |  | |Frontend|
> >  ++ |  | ++
> >  |  |  | |
> >  v  |  | v
> >  ++ |  | ++
> >  |Backend |<+  +>|Backend |
> >  ++  ++
> >  |   |
> >  v   v
> >  ++  ++---> LVDS
> >  |  TCON  |  |  TCON  |---> RGB
> >  ++  ++
> >|   +---+   +---+  |
> >|   |   |  |
> >v   v   v  v
> >  ++  ++  ++---> VGA
> >  | TV Encoder |  |HDMI|  | TV Encoder |---> Composite
> >  ++  ++  ++
> > 
> > The current code only assumes that there is a single instance of all
> > the controllers. It also supports only the RGB and Composite
> > interfaces.
> > 
> > A few more things are missing though, and will be supported
> > eventually:
> >   - Overscan support
> >   - Asynchronous page flip
> >   - Multiple plane support
> >   - Composite / VGA Hotplug detection
> >   - More outputs
> > >   - Support for several video pipelines
> > > 
> > > And there's one big gotcha: the code to parse the mode from the
> > kernel commandline doesn't seem to support named modes. Since we
> > expose the various TV standards through named modes, it means that
> > there's no way to use a particular standard to display something
> > during the kernel boot. The default will always be used, in other
> > words, PAL.
> 
> Simply not done yet.

Ok, so I guess there's no fundamental objection to it. I'll do it then
:)

> > A few more questions that are probably going to be raised during the
> > review:
> >   - How do you associate private data to a mode, for example to deal
> > with the non-generic, driver-specific settings required to deal
> > with the various TV standards? drm_display_mode seems to have a
> > private field, but it isn't always preserved.
> 
> Analog TV in general is a giant mess, and there's not really all that much
> of a standardized solution. I also don't have much clue at all about what
> needs to be tuned with analog TV. Probably the best would be to look at
> existing drivers with TV-out support and what kind of properties they
> support.

The intel i915 driver already uses a generic property to support this
(using drm_mode_create_tv_properties), so I guess the generic part is
covered.

That might need some work, for example to support a different overscan
configuration in X and Y, but that's a detail, and we can always
extend it.

However, I'm more interested in actual register and bits that you have
to poke to enable one of the two modes, that are usually not directly
something you can expose through a generic mode field, either because
you don't really know what the bit is doing (because the datasheet
sucks), or because it's simply something completely tied to the
hardware itself, and not really the mode you expose (for example, I
have to enable the DAC0 in PAL and the DAC0 and DAC3 in NTSC).

> Then standardize them (with relevant helper code in drm core) and
> use that. Additional flags on the mode, especially using the private stuff
> is kinda the deprecated non-atomic approach.

Ok.

> If you need private data beyond the mode and any additional
> properties on the crtc (for derived state) then just subclass
> drm_crtc_state and put it there.

Except that it's not really tied to the CRTC in my case, but to the
encoder. I don't think the CRTC simply knows if it's using an RGB,
LVDS or Composite output, and then if the user wants NTSC or PAL on
the composite output. It just doesn't seem to really fit in the
current way the pipeline is represented.

> >   - How do you setup properties in the kernel command line? In order
> > to have a decent display during boot on rele

Re: [PATCH] sched: prevent getting too much vruntime

2015-11-11 Thread Byungchul Park

On Wed, Nov 11, 2015 at 12:50:43PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 11, 2015 at 06:48:49PM +0900, Byungchul Park wrote:
> > On Wed, Nov 11, 2015 at 10:26:32AM +0100, Peter Zijlstra wrote:
> > > On Wed, Nov 11, 2015 at 05:50:27PM +0900, byungchul.p...@lge.com wrote:
> > > 
> > > I've not actually read anything; my brain isn't working right today.
> > > 
> > > > +static inline void vruntime_unnormalize(struct cfs_rq *cfs_rq, struct 
> > > > sched_entity *se)
> > > > +{
> > > > +   se->vruntime += cfs_rq->min_vruntime;
> > > > +   if (unlikely((s64)se->vruntime < 0))
> > > > +   se->vruntime = 0;
> > > > +}
> > > 
> > > But this is broken. This simply _cannot_ be right.
> > > 
> > > > vruntime very much needs to wrap in u64 space. While regular time in ns
> > > > takes some 584 years to wrap, vruntime is scaled. The fastest vruntime is
> > > > 2/1024 or 512 times faster than normal time, making it take just over a
> > > > year to wrap around. This will happen.
> > 
> > > Then do you mean there is no problem even when we compare a vruntime
> > > that has not wrapped yet with one that already has? I really wonder about that.
> 
> It should be; we were really careful with this back when we wrote all
> that. All vruntime comparisons should be of the form (s64)(a-b). Which
> gets you the correct order assuming things haven't drifted more than
> 2^63 apart.

I checked it. It looks like there is no problem, as you said.
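
For reference, a minimal user-space sketch of the (s64)(a-b) idiom quoted above. The helper names are made up for this example and are not the scheduler's own; `naive_before()` is included only to show what goes wrong across a wrap.

```c
#include <stdint.h>

/* Wrap-safe "a is before b" test: subtract in u64 space, then interpret
 * the difference as signed. Valid while a and b are < 2^63 apart. */
static int vruntime_before(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b) < 0;
}

/* Naive comparison, for contrast only: it gives the wrong order as soon
 * as one value has wrapped past zero and the other has not. */
static int naive_before(uint64_t a, uint64_t b)
{
    return a < b;
}
```

With `a = UINT64_MAX - 5` and `b = a + 10` (which wraps to 4), `vruntime_before(a, b)` still reports that `a` comes first, while `naive_before(a, b)` does not.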

Thank you very much.


Re: [PATCH V3 2/5] PM / OPP: Add {opp-microvolt|opp-microamp}- binding

2015-11-11 Thread Viresh Kumar

On 11-11-15, 14:31, Rob Herring wrote:
> > +   opp00 {
> 
> Thought we are doing frequency for unit address here.

That's done by the (already reviewed) 4th patch.

-- 
viresh

Re: [PATCH 3/3] tools/vm/slabinfo: update struct slabinfo members' types

2015-11-11 Thread David Rientjes

On Thu, 12 Nov 2015, Sergey Senozhatsky wrote:

> > >   /* Object size */
> > > - unsigned long long min_objsize = max, max_objsize = 0, avg_objsize;
> > > + unsigned int min_objsize = UINT_MAX, max_objsize = 0, avg_objsize;
> > >  
> > >   /* Number of partial slabs in a slabcache */
> > >   unsigned long long min_partial = max, max_partial = 0,
> > 
> > avg_objsize should not be unsigned int.
> 
> Hm. the assumption is that `avg_objsize' cannot be larger
> than `max_objsize', which is
>   `int object_size;' in `struct kmem_cache' from slab_def.h
> and
>   `unsigned int object_size;' in `struct kmem_cache' from slab.h.
> 
> 
>  avg_objsize = total_used / total_objects;
> 

total_used and total_objects are unsigned long long.  This has nothing to 
do with object_size in the kernel.  If you need to convert max_objsize to 
be unsigned long long as well, that would be better.

Re: [PATCH v4 2/3] mtd: nand: Add support for Arasan Nand Flash Controller

2015-11-11 Thread punnaiah choudary kalluri

On Mon, Nov 9, 2015 at 7:20 PM, Andy Shevchenko
 wrote:
> On Thu, 2015-11-05 at 08:19 +0530, Punnaiah Choudary Kalluri wrote:
>> Added the basic driver for Arasan Nand Flash Controller used in
>> Zynq UltraScale+ MPSoC. It supports only Hw Ecc and upto 24bit
>> correction.
>>
>
>> +config MTD_NAND_ARASAN
>> + tristate "Support for Arasan Nand Flash controller"
>> + depends on MTD_NAND
>
> This looks useless since you can't see the item without MTD_NAND is
> chosen.
>
>> + help
>> +   Enables the driver for the Arasan Nand Flash controller on
>> +   Zynq UltraScale+ MPSoC.
>> +
>>  endif # MTD_NAND
>>
>
> +obj-$(CONFIG_MTD_NAND_ARASAN)  += arasan_nfc.o
>
> "nfc" part a bit ambiguous since NFC might be Near Field Communication.
>
> Perhaps "nand_fc" or something like that?
>

This driver lives under mtd/nand, so there is no room for confusion here;
in this context, nfc means NAND flash controller.


>> +static u8 anfc_page(u32 pagesize)
>> +{
>> + switch (pagesize) {
>> + case 512:
>> + return PAGE_SIZE_512;
>> + case 2048:
>> + return PAGE_SIZE_2K;
>> + case 4096:
>> + return PAGE_SIZE_4K;
>> + case 8192:
>> + return PAGE_SIZE_8K;
>> + case 16384:
>> + return PAGE_SIZE_16K;
>> + case 1024:
>
> Why not keep sorted here?
>

It is sorted by return value. I will change it to sort by page size
in bytes.

>> + return PAGE_SIZE_1K;
>> + default:
>> + break;
>> + }
>> +
>> + return 0;
>
>
>
>> +}
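
A sketch of what the reordering discussed above could look like, with the cases sorted by page size in bytes. The encodings are as read from the patch's PAGE_SIZE_* defines; the `ANFC_` prefix and the function name `anfc_page_sorted()` are hypothetical, chosen for this example only.

```c
#include <stdint.h>

/* Register encodings as read from the patch; names are hypothetical. */
enum {
    ANFC_PAGE_SIZE_512 = 0,
    ANFC_PAGE_SIZE_1K  = 5,
    ANFC_PAGE_SIZE_2K  = 1,
    ANFC_PAGE_SIZE_4K  = 2,
    ANFC_PAGE_SIZE_8K  = 3,
    ANFC_PAGE_SIZE_16K = 4
};

/* Maps a page size in bytes to its register encoding, cases in size order. */
static uint8_t anfc_page_sorted(uint32_t pagesize)
{
    switch (pagesize) {
    case 512:
        return ANFC_PAGE_SIZE_512;
    case 1024:
        return ANFC_PAGE_SIZE_1K;
    case 2048:
        return ANFC_PAGE_SIZE_2K;
    case 4096:
        return ANFC_PAGE_SIZE_4K;
    case 8192:
        return ANFC_PAGE_SIZE_8K;
    case 16384:
        return ANFC_PAGE_SIZE_16K;
    default:
        return 0;   /* same fallback value as the quoted function */
    }
}
```

Note that the fallback value 0 coincides with the 512-byte encoding, exactly as in the quoted code, so an unknown size is indistinguishable from 512 at the call site.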
>> +
>> +static inline void anfc_enable_intrs(struct anfc *nfc, u32 val)
>> +{
>> + writel(val, nfc->base + INTR_STS_EN_OFST);
>> + writel(val, nfc->base + INTR_SIG_EN_OFST);
>> +}
>> +
>> +static int anfc_wait_for_event(struct anfc *nfc, u32 event)
>> +{
>> + struct completion *comp;
>> + int ret;
>> +
>> + if (event == XFER_COMPLETE)
>> + comp = &nfc->xfercomp;
>> + else
>> + comp = &nfc->bufrdy;
>> +
>> + ret = wait_for_completion_timeout(comp,
>> msecs_to_jiffies(EVNT_TIMEOUT));
>> +
>> + return ret;
>
> return func();
>
>> +}
>> +
>> +static inline void anfc_setpktszcnt(struct anfc *nfc, u32 pktsize,
>> + u32 pktcount)
>> +{
>> + writel(pktsize | (pktcount << PKT_CNT_SHIFT), nfc->base +
>> PKT_OFST);
>> +}
>> +
>> +static inline void anfc_set_eccsparecmd(struct anfc *nfc, u8 cmd1,
>> u8 cmd2)
>> +{
>> + writel(cmd1 | (cmd2 << CMD2_SHIFT) |
>> +(nfc->caddr_cycles << ADDR_CYCLES_SHIFT),
>> +nfc->base + ECC_SPR_CMD_OFST);
>> +}
>> +
>> +static void anfc_setpagecoladdr(struct anfc *nfc, u32 page, u16 col)
>> +{
>> + u32 val;
>> +
>> + writel(col | (page << PG_ADDR_SHIFT), nfc->base +
>> MEM_ADDR1_OFST);
>> +
>> + val = readl(nfc->base + MEM_ADDR2_OFST);
>> + val = (val & ~MEM_ADDR_MASK) |
>> +   ((page >> PG_ADDR_SHIFT) & MEM_ADDR_MASK);
>> + writel(val, nfc->base + MEM_ADDR2_OFST);
>> +}
>> +
>> +static void anfc_prepare_cmd(struct anfc *nfc, u8 cmd1, u8 cmd2,
>> +  u8 dmamode, u32 pagesize, u8
>> addrcycles)
>> +{
>> + u32 regval;
>> +
>> + regval = cmd1 | (cmd2 << CMD2_SHIFT);
>> + if (dmamode && nfc->dma)
>> + regval |= DMA_ENABLE << DMA_EN_SHIFT;
>> + if (addrcycles)
>> + regval |= addrcycles << ADDR_CYCLES_SHIFT;
>> + if (pagesize)
>> + regval |= anfc_page(pagesize) << PAGE_SIZE_SHIFT;
>> + writel(regval, nfc->base + CMD_OFST);
>> +}
>> +
>> +static int anfc_device_ready(struct mtd_info *mtd,
>> +  struct nand_chip *chip)
>> +{
>> + u8 status;
>> + unsigned long timeout = jiffies + STATUS_TIMEOUT;
>> +
>> + do {
>> + chip->cmdfunc(mtd, NAND_CMD_STATUS, 0, 0);
>> + status = chip->read_byte(mtd);
>> + if (status & ONFI_STATUS_READY) {
>
>> + if (status & ONFI_STATUS_FAIL)
>> + return NAND_STATUS_FAIL;
>
> This is invariant to the loop, perhaps move outside.
>

A NAND device being ready means that it has finished the previous
command and can accept the next one. It does not mean that the previous
command succeeded; it may also have failed.

>> + break;
>> + }
>> + cpu_relax();
>> + } while (!time_after_eq(jiffies, timeout));
>> +
>> + if (time_after_eq(jiffies, timeout)) {
>> + pr_err("%s timed out\n", __func__);
>
> dev_err?
>
>> + return -ETIMEDOUT;
>> + }
>> +
>> + return 0;
>> +}
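
To make the ready-versus-failed distinction concrete, here is a small user-space sketch of how a single status byte could be classified. The bit positions follow the patch's ONFI_STATUS_FAIL and ONFI_STATUS_READY defines; the enum and `classify_status()` are hypothetical names used only for this example.

```c
#include <stdint.h>

#define ONFI_STATUS_FAIL   (1u << 0)   /* BIT(0) in the patch */
#define ONFI_STATUS_READY  (1u << 6)   /* BIT(6) in the patch */

/* Hypothetical result type for this sketch. */
enum nand_poll_result {
    NAND_BUSY,        /* keep polling */
    NAND_DONE_OK,     /* previous command finished and succeeded */
    NAND_DONE_FAIL    /* previous command finished and failed */
};

/* Like the quoted loop, look at FAIL only once READY is set. */
static enum nand_poll_result classify_status(uint8_t status)
{
    if (!(status & ONFI_STATUS_READY))
        return NAND_BUSY;
    if (status & ONFI_STATUS_FAIL)
        return NAND_DONE_FAIL;
    return NAND_DONE_OK;
}
```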
>> +
>> +static int anfc_read_oob(struct mtd_info *mtd, struct nand_chip
>> *chip,
>> +  int page)
>> +{
>> + struct anfc *nfc = container_of(mtd, struct anfc, mtd);
>
> Since you use it more than once might be a good idea to do something
> like
>
> #define to_anfc() container_of()
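
A user-space sketch of the helper suggested above. `container_of()` is re-implemented here with `offsetof()` as a stand-in for the kernel macro, and `mtd_info_stub` / `anfc_stub` are hypothetical stand-ins for the real structures; the only thing taken from the patch is that the mtd member is embedded in the driver structure.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct mtd_info. */
struct mtd_info_stub {
    int dummy;
};

/* Hypothetical stand-in for struct anfc, with the mtd member embedded. */
struct anfc_stub {
    int id;
    struct mtd_info_stub mtd;
};

/* One named helper instead of repeating container_of() at each call site. */
static inline struct anfc_stub *to_anfc(struct mtd_info_stub *mtd)
{
    return container_of(mtd, struct anfc_stub, mtd);
}
```

An inline function is used here rather than a macro so that the argument type is checked; a `#define` as suggested in the review would work the same way.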
>
>> +
>> + chip->cmdfunc(mtd, NAND_CMD_READOOB,

Is ndo_do_ioctl still acceptable?

2015-11-11 Thread Jason A. Donenfeld

Hi David & Folks,

Soon I will submit a virtual tunnel device driver to LKML for review.
It uses rtnl_link_register to create a virtual network interface,
which then handles encryption, authentication, and some other things,
amongst various configured peers.

Right now the device is configurable via Netlink. It receives new
peers and configuration via a rtnl_link_ops->changelink function, and
it reports information back to userspace via a
rtnl_link_ops->fill_info function.

Configuration works fine, though it is rather cumbersome to do this
all via Netlink.

Reporting information back to userspace does not work fine. The reason
is that sometimes there's more information to report back to
userspace than can fit in a single preallocated Netlink skb. And
since rtnl_link_ops->fill_info doesn't receive any information from
userspace, I'm unable to use it to send back information in smaller
pieces.

I realize I could register a whole new rtnl packet family and related
set of functions with rtnl_register, such as what's done at the bottom
of `net/core/rtnetlink.c`. This is extremely cumbersome and invasive
though. It would require adding a new protocol family (like the
already existing rtnl_register-ified functions for PF_UNSPEC and
PF_BRIDGE), and I don't have enough clout to confidently submit a
patch that augments `include/linux/socket.h` with a new PF/AF define.
This seems very invasive and not appropriate for my driver.

What I'd really like to do is just implement ndo_do_ioctl. It seems to
me that this gives me a precise interface to do exactly what I want in
the cleanest and most readable way possible. I could have differing
ioctls for differing things. I could write memory back to userspace in
proper chunks, with the proper size. It's clear and straightforward
how to do it, and what the completed result looks like. It doesn't
require invasive changes into other parts of the kernel, as this would
be self-contained. It's hard to imagine a better interface to use than
ndo_do_ioctl.

But. But the word on the street is that kernel hipsters hate ioctls
and espouse the use of netlink everywhere with religious fervor, and
will burn at the stake any submissions I might send that go anywhere
near using ndo_do_ioctl rather than (the most likely ill-fitting for
the task) netlink. That, and the maintainers of the `ip` tool will be
upset too (even though they do already make use of several ioctls
instead of netlink). I'm told everybody will leer and jeer at me if I
use ndo_do_ioctl instead of netlink.

Except ndo_do_ioctl is *so* perfectly fitting here for my use case!

So what's the verdict on this? Do these aforementioned kernel hipsters
not really matter so much, and ndo_do_ioctl is actually perfectly
fine? Or must I really affix netlink onto my forthcoming submission?

Thanks,
Jason

Re: [PATCH v5 4/5] ARM: dts: mt8135: enable basic SMP bringup for mt8135

2015-11-11 Thread Kevin Hilman

Hi Eddie,

Kevin Hilman  writes:

> Eddie Huang  writes:
>
>> On Tue, 2015-11-10 at 17:16 -0800, Kevin Hilman wrote:
>>> Hi Eddie,
>>> 
>>> [...]
>>> 
>>> > I check the log [0],
>>> 
>>> Thanks for checking into this boot failure.
>>> 
>>> > it seems that the first time, mt8135-evbp1 boots to a kernel
>>> > shell successfully, then boots again. The second time, mt8135 stays in
>>> > fastboot mode, waiting for the host to send a boot image, then times out.
>>> 
>>> Actually, it never gets to a shell the first time.  If you look closely,
>>> the target reboots as soon as userspace starts.   Look for the PYBOOT
>>> line which says "finished booting, starting userspace"
>>> 
>>> Later on, pyboot thinks it finds a root shell due to finding '#'
>>> characters, but clearly it never got to a shell.
>>> 
>>> > I download zImage and dtb in [1], and kernel run to shell successfully
>>> > on my platform.
>>> 
>>> Can you try using a ramdisk as well?  You can use the pre-built
>>> one here:
>>> http://storage.kernelci.org/images/rootfs/buildroot/armel/rootfs.cpio.gz
>>> 
>>
>> Yes, I tried this ramdisk, and I can reproduce the failure.
>>
>
> OK, good.   Thanks for looking into it.
>
>>> Please check my boot logs to see how I'm generating the boot.img file
>>> (search for mkbootimg) with a kernel/dtb/ramdisk.  It may be possible
>>> that the kernel image size with a ramdisk is breaking some of the
>>> assumptions in the fastboot mode.  I've seen problems like this on other
>>> platforms due to hard-coded sizes/addresses in the boot firmware.
>>> 
>>
>> MT8135 allocates 10MB for the BOOT partition, but the test boot.img is 11MB,
>> which causes user space to fail.
>
> Aha, I was right!  ;)

Also notice in kernelci.org that the mt8173 board has also been failing
to boot in mainline[1].  I wonder if this same limitation exists in the
mt8173 boot firmware?

Kevin

[1] 
http://kernelci.org/boot/mt8173-evb/job/mainline/kernel/v4.3-11553-g8d3de01cfa37/defconfig/defconfig/lab/lab-khilman/?_id=5643bc3959b5145c9e0918f4

[GIT PULL] h8300 update

2015-11-11 Thread Yoshinori Sato

Hello Linus,
Could you pull these changes?

The following changes since commit 7379047d5585187d1288486d4627873170d0005a:

  Linux 4.3-rc6 (2015-10-18 16:08:42 -0700)

are available in the git repository at:

  git://git.osdn.jp/gitroot/uclinux-h8/linux.git tags/for-4.4

for you to fetch changes up to f639eeb4a60ce39f154753e3a745bd755e0fe084:

  h8300: enable CLKSRC_OF (2015-11-12 12:18:25 +0900)


h8300 update for v4.4

Some bug fixes.


Javier Martinez Canillas (1):
  h8300: Don't set CROSS_COMPILE unconditionally

Yoshinori Sato (7):
  h8300: unaligned divcr register support.
  h8300: Fix alignment for .data
  h8300: register address fix
  h8300: zImage fix
  h8300: bit io fix
  asm-generic: {get,put}_user ptr argument evaluate only 1 time
  h8300: enable CLKSRC_OF

 arch/h8300/Kconfig |  1 +
 arch/h8300/Makefile|  2 ++
 arch/h8300/boot/compressed/Makefile|  5 +++--
 arch/h8300/boot/compressed/head.S  |  4 ++--
 arch/h8300/boot/compressed/misc.c  |  7 +--
 arch/h8300/boot/compressed/vmlinux.lds |  2 +-
 arch/h8300/boot/dts/edosk2674.dts  |  6 +++---
 arch/h8300/include/asm/io.h| 12 ++--
 arch/h8300/include/asm/thread_info.h   | 14 ++
 arch/h8300/kernel/setup.c  |  2 ++
 arch/h8300/kernel/vmlinux.lds.S|  4 ++--
 drivers/clk/h8300/clk-div.c|  6 +-
 include/asm-generic/uaccess.h  | 10 ++
 13 files changed, 40 insertions(+), 35 deletions(-)

-- 
Yoshinori Sato


Re: [PATCH v3 01/17] mm: support madvise(MADV_FREE)

2015-11-11 Thread Andy Lutomirski

On Wed, Nov 11, 2015 at 8:32 PM, Minchan Kim  wrote:
>
> Linux doesn't have the ability to free pages lazily, while other OSes
> already support this under the name madvise(MADV_FREE).
>
> The gain is clear: the kernel can discard freed pages rather than swapping
> them out or hitting OOM when memory pressure happens.

>
> When the madvise syscall is called, the VM clears the dirty bit of the PTEs
> in the range.  If memory pressure happens, the VM checks the dirty bit in
> the page table; if it is still "clean", the page is a "lazyfree" page, so
> the VM can discard it instead of swapping it out.  If there was a store to
> the page before the VM picked it for reclaim, the dirty bit is set, so the
> VM swaps the page out instead of discarding it.
>

I realize that this lends itself to an efficient implementation, but
it's certainly the case that the kernel *could* use the accessed bit
instead of the dirty bit to give more sensible user semantics, and the
semantics that rely on the dirty bit make me uncomfortable from an ABI
perspective.

I also think that the kernel should commit to either zeroing the page
or leaving it unchanged in response to MADV_FREE (even if the decision
of which to do is made later on).  I think that your patch series does
this, but only after a few of the patches are applied (the swap entry
freeing), and I think that it should be a real guaranteed part of the
semantics and maybe have a test case.
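
As a rough idea of what such a test case might look like from user space, here is a sketch. It assumes a kernel and libc that provide MADV_FREE (the fallback value 8 is an assumption for older headers), and the function name is hypothetical. It fills a page, calls madvise(MADV_FREE), performs one store (which, per the description quoted above, re-dirties the page), and then checks that the rest of the page is either fully intact or fully zeroed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS and madvise() under strict -std= */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8       /* assumed value for headers that predate the flag */
#endif

/*
 * Returns 1 if the page, after MADV_FREE and one store, reads back either
 * fully intact or fully zeroed (apart from the stored byte), 0 if it holds
 * a mix of both, and -1 if the mapping could not be set up.
 */
static int madv_free_all_or_nothing(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    long i, intact = 0, zeroed = 0;

    if (psz <= 1)
        return -1;

    p = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, (size_t)psz);

    /* If the kernel rejects the flag, the page simply stays intact. */
    (void)madvise(p, (size_t)psz, MADV_FREE);

    /* One store, so the contents are stable while we inspect them. */
    p[0] = 1;

    for (i = 1; i < psz; i++) {
        intact += (p[i] == 0xAB);
        zeroed += (p[i] == 0x00);
    }

    munmap(p, (size_t)psz);
    return intact == psz - 1 || zeroed == psz - 1;
}
```

A single run without memory pressure will almost always see the intact case, so a real test would also need a way to force reclaim; that part is left out here.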

--Andy

Re: [PATCH v4 2/3] mtd: nand: Add support for Arasan Nand Flash Controller

2015-11-11 Thread punnaiah choudary kalluri

On Mon, Nov 9, 2015 at 7:20 PM, Andy Shevchenko
 wrote:
> On Thu, 2015-11-05 at 08:19 +0530, Punnaiah Choudary Kalluri wrote:
>> Added the basic driver for Arasan Nand Flash Controller used in
>> Zynq UltraScale+ MPSoC. It supports only Hw Ecc and upto 24bit
>> correction.
>>
>
>> +config MTD_NAND_ARASAN
>> + tristate "Support for Arasan Nand Flash controller"
>> + depends on MTD_NAND
>
> This looks useless since you can't see the item without MTD_NAND is
> chosen.
>
>> + help
>> +   Enables the driver for the Arasan Nand Flash controller on
>> +   Zynq UltraScale+ MPSoC.
>> +
>>  endif # MTD_NAND
>>
>
> +obj-$(CONFIG_MTD_NAND_ARASAN)  += arasan_nfc.o
>
> "nfc" part a bit ambiguous since NFC might be Near Field Communication.

This driver lives under mtd/nand, so there is no room for confusion;
in this context, nfc means NAND flash controller.
>
> Perhaps "nand_fc" or something like that?
>
>>
>>  nand-objs := nand_base.o nand_bbt.o nand_timings.o
>> diff --git a/drivers/mtd/nand/arasan_nfc.c
>> b/drivers/mtd/nand/arasan_nfc.c
>> new file mode 100644
>> index 000..9d4665e
>> --- /dev/null
>> +++ b/drivers/mtd/nand/arasan_nfc.c
>> @@ -0,0 +1,1026 @@
>> +/*
>> + * Arasan Nand Flash Controller Driver
>> + *
>> + * Copyright (C) 2014 - 2015 Xilinx, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> modify it under
>> + * the terms of the GNU General Public License version 2 as
>> published by the
>> + * Free Software Foundation; either version 2 of the License, or (at
>> your
>> + * option) any later version.
>> + */
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#define DRIVER_NAME  "arasan_nfc"
>
> Ditto.
>
>> +#define EVNT_TIMEOUT             1000
>> +#define STATUS_TIMEOUT           2000
>> +
>> +#define PKT_OFST                 0x00
>> +#define MEM_ADDR1_OFST           0x04
>> +#define MEM_ADDR2_OFST           0x08
>> +#define CMD_OFST                 0x0C
>> +#define PROG_OFST                0x10
>> +#define INTR_STS_EN_OFST         0x14
>> +#define INTR_SIG_EN_OFST         0x18
>> +#define INTR_STS_OFST            0x1C
>> +#define READY_STS_OFST           0x20
>> +#define DMA_ADDR1_OFST           0x24
>> +#define FLASH_STS_OFST           0x28
>> +#define DATA_PORT_OFST           0x30
>> +#define ECC_OFST                 0x34
>> +#define ECC_ERR_CNT_OFST         0x38
>> +#define ECC_SPR_CMD_OFST         0x3C
>> +#define ECC_ERR_CNT_1BIT_OFST    0x40
>> +#define ECC_ERR_CNT_2BIT_OFST    0x44
>> +#define DMA_ADDR0_OFST           0x50
>> +#define DATA_INTERFACE_REG       0x6C
>> +
>> +#define PKT_CNT_SHIFT            12
>> +
>> +#define ECC_ENABLE               BIT(31)
>> +#define DMA_EN_MASK              GENMASK(27, 26)
>> +#define DMA_ENABLE               0x2
>> +#define DMA_EN_SHIFT             26
>> +#define PAGE_SIZE_MASK           GENMASK(25, 23)
>
> PAGE_SIZE_ prefix is too broad, might conflict with global definitions
> on some architectures.
>
>> +#define PAGE_SIZE_SHIFT          23
>> +#define PAGE_SIZE_512            0
>> +#define PAGE_SIZE_1K             5
>> +#define PAGE_SIZE_2K             1
>> +#define PAGE_SIZE_4K             2
>> +#define PAGE_SIZE_8K             3
>> +#define PAGE_SIZE_16K            4
>> +#define CMD2_SHIFT               8
>> +#define ADDR_CYCLES_SHIFT        28
>> +
>> +#define XFER_COMPLETE            BIT(2)
>> +#define READ_READY               BIT(1)
>> +#define WRITE_READY              BIT(0)
>> +#define MBIT_ERROR               BIT(3)
>> +#define ERR_INTRPT               BIT(4)
>> +
>> +#define PROG_PGRD                BIT(0)
>> +#define PROG_ERASE               BIT(2)
>> +#define PROG_STATUS              BIT(3)
>> +#define PROG_PGPROG              BIT(4)
>> +#define PROG_RDID                BIT(6)
>> +#define PROG_RDPARAM             BIT(7)
>> +#define PROG_RST                 BIT(8)
>> +#define PROG_GET_FEATURE         BIT(9)
>> +#define PROG_SET_FEATURE         BIT(10)
>> +
>> +#define ONFI_STATUS_FAIL         BIT(0)
>> +#define ONFI_STATUS_READY        BIT(6)
>> +
>> +#define PG_ADDR_SHIFT            16
>> +#define BCH_MODE_SHIFT           25
>> +#define BCH_EN_SHIFT             27
>> +#define ECC_SIZE_SHIFT           16
>> +
>> +#define MEM_ADDR_MASK            GENMASK(7, 0)
>> +#define BCH_MODE_MASK            GENMASK(27, 25)
>> +
>> +#define CS_MASK                  GENMASK(31, 30)
>> +#define CS_SHIFT                 30
>> +#define PAGE_ERR_CNT_MAS

Re: [RFC] usb: dwc2: hcd: fix split schedule issue

2015-11-11 Thread Doug Anderson

John,

On Wed, Nov 11, 2015 at 8:29 PM, John Youn  wrote:
> I also feel it is not quite right as the SSPLIT should be able to
> happen during the SSPLIT of another device. I tried to reproduce
> and see the same scheduling but don't see any hang due to it.
>
> Yunzhi, any details on what kind of hub and keyboard you are
> using? I have the same Jabra 410 speaker.

I saw it with a standard Logitech mouse.  It wasn't a hang, but the
mouse effectively became non-functional (behaved like it hung) when
you started playing audio.  Once the audio stream stopped, the mouse
would work again.  I was using the same Jabra 410 as well.

/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=dwc2/1p, 480M
|__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
|__ Port 2: Dev 6, If 0, Class=Human Interface Device,
Driver=usbhid, 1.5M
|__ Port 3: Dev 5, If 0, Class=Audio, Driver=snd-usb-audio, 12M
|__ Port 3: Dev 5, If 1, Class=Audio, Driver=snd-usb-audio, 12M
|__ Port 3: Dev 5, If 2, Class=Audio, Driver=snd-usb-audio, 12M
|__ Port 3: Dev 5, If 3, Class=Human Interface Device, Driver=, 12M

Bus 002 Device 005: ID 0b0e:0412 GN Netcom
Bus 002 Device 006: ID 046d:c05a Logitech, Inc. M90/M100 Optical Mouse
Bus 002 Device 002: ID 1a40:0101 Terminus Technology Inc. Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

We've also had some discussion of this patch in our bug tracker at
.

I'll keep digging tomorrow, too.

-Doug

Re: module: save load_info for livepatch modules

2015-11-11 Thread Jessica Yu


+++ Petr Mladek [11/11/15 15:31 +0100]:

On Mon 2015-11-09 23:45:52, Jessica Yu wrote:

In livepatch modules, preserve section, symbol, string information from
the load_info struct in the module loader. This information is used to
patch modules that are not loaded in memory yet; specifically it is used
to resolve remaining symbols and write relocations when the target
module loads.

Signed-off-by: Jessica Yu 
---
 include/linux/module.h  | 25 +
 kernel/livepatch/core.c | 17 +
 kernel/module.c | 36 ++--
 3 files changed, 64 insertions(+), 14 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index 3a19c79..c8680b1 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h

[...]

@@ -635,6 +651,15 @@ static inline bool module_requested_async_probing(struct 
module *module)
return module && module->async_probe_requested;
 }

+#ifdef CONFIG_LIVEPATCH
+extern void klp_prepare_patch_module(struct module *mod,
+struct load_info *info);
+extern int
+apply_relocate_add(Elf64_Shdr *sechdrs, const char *strtab,
+  unsigned int symindex, unsigned int relsec,
+  struct module *me);
+#endif


This function is already declared in moduleloader.h.
It is implemented only when CONFIG_MODULES_USE_ELF_RELA is defined.

I guess that we want to include moduleloader.h in livepatch.


+
 #else /* !CONFIG_MODULES... */

 /* Given an address, look for it in the exception tables. */
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 6e53441..087a8c7 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -1001,6 +1001,23 @@ static struct notifier_block klp_module_nb = {
.priority = INT_MIN+1, /* called late but before ftrace notifier */
 };

+/*
+ * Save necessary information from info in order to be able to
+ * patch modules that might be loaded later
+ */
+void klp_prepare_patch_module(struct module *mod, struct load_info *info)
+{
+   Elf_Shdr *symsect;
+
+   symsect = info->sechdrs + info->index.sym;
+   /* update sh_addr to point to symtab */
+   symsect->sh_addr = (unsigned long)info->hdr + symsect->sh_offset;


Is livepatch the only user of this value? In other words, is this safe?


I think it is safe to say yes. klp_prepare_patch_module() is only
called at the very end of load_module(), right before
do_init_module(). Normally, at that point, info->hdr will have already
been freed by free_copy() along with the elf section information
associated with it. But if we have a livepatch module, we don't free.
So we should be the very last user, and there should be nobody
utilizing the memory associated with the load_info struct anymore at
that point.


+   mod->info = kzalloc(sizeof(*info), GFP_KERNEL);
+   memcpy(mod->info, info, sizeof(*info));
+
+}


It is strange that this function is defined in livepatch/core.c
but declared in module.h. I would move the definition to
module.c.


Right, I was trying to keep all the livepatch-related functions
together in livepatch/core.c, but I can move it to module.c if it
makes more sense and Rusty doesn't object to it :-)


 static int __init klp_init(void)
 {
int ret;
diff --git a/kernel/module.c b/kernel/module.c
index 8f051a1..8ae3ca5 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -318,20 +318,6 @@ int unregister_module_notifier(struct notifier_block *nb)
 }
 EXPORT_SYMBOL(unregister_module_notifier);

-struct load_info {
-   Elf_Ehdr *hdr;
-   unsigned long len;
-   Elf_Shdr *sechdrs;
-   char *secstrings, *strtab;
-   unsigned long symoffs, stroffs;
-   struct _ddebug *debug;
-   unsigned int num_debug;
-   bool sig_ok;
-   struct {
-   unsigned int sym, str, mod, vers, info, pcpu;
-   } index;
-};
-
 /* We require a truly strong try_module_get(): 0 means failure due to
ongoing or failed initialization etc. */
 static inline int strong_try_module_get(struct module *mod)
@@ -2137,6 +2123,11 @@ static int simplify_symbols(struct module *mod, const 
struct load_info *info)
   (long)sym[i].st_value);
break;

+#ifdef CONFIG_LIVEPATCH
+   case SHN_LIVEPATCH:
+   break;
+#endif


IMHO, even a kernel compiled without CONFIG_LIVEPATCH should handle livepatch
modules with grace. That means rejecting the load.


I think even right now, without considering this patchset, we don't
reject modules "gracefully" when we load a livepatch module without
CONFIG_LIVEPATCH. The module loader will complain and reject the
livepatch module, saying something like "Unknown symbol
klp_register_patch." This behavior is the same with or without
this patch series applied. If we want to add a bit more logic to
gracefully reject patch modules, perhaps that should be a different
patch altogether, as I think it is unrelated to the g

Re: [PATCH v4 8/9] ARM: EXYNOS: rearrange static and non-static functions of PMU driver

2015-11-11 Thread Krzysztof Kozlowski

On 10.11.2015 20:43, Pankaj Dubey wrote:
> This patch moves exynos_sys_powerdown_conf function above all
> static functions.

Please (always) describe the reason, the answer to "why?". In this case
I know why, but other reviewers may not and other people grepping
through history definitely won't know.

Best regards,
Krzysztof

> 
> Signed-off-by: Pankaj Dubey 
> ---
>  arch/arm/mach-exynos/pmu.c | 34 +-
>  1 file changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/arm/mach-exynos/pmu.c b/arch/arm/mach-exynos/pmu.c
> index e01bdf1..f300ac9 100644
> --- a/arch/arm/mach-exynos/pmu.c
> +++ b/arch/arm/mach-exynos/pmu.c
> @@ -39,23 +39,6 @@ u32 pmu_raw_readl(u32 offset)
>   return readl_relaxed(pmu_base_addr + offset);
>  }
>  
> -static void exynos_power_off(void)
> -{
> - unsigned int tmp;
> -
> - pr_info("Power down.\n");
> - tmp = pmu_raw_readl(EXYNOS_PS_HOLD_CONTROL);
> - tmp ^= (1 << 8);
> - pmu_raw_writel(tmp, EXYNOS_PS_HOLD_CONTROL);
> -
> - /* Wait a little so we don't give a false warning below */
> - mdelay(100);
> -
> - pr_err("Power down failed, please power off system manually.\n");
> - while (1)
> - ;
> -}
> -
>  void exynos_sys_powerdown_conf(enum sys_powerdown mode)
>  {
>   unsigned int i;
> @@ -85,6 +68,23 @@ void exynos_sys_powerdown_conf(enum sys_powerdown mode)
>   }
>  }
>  
> +static void exynos_power_off(void)
> +{
> + unsigned int tmp;
> +
> + pr_info("Power down.\n");
> + tmp = pmu_raw_readl(EXYNOS_PS_HOLD_CONTROL);
> + tmp ^= (1 << 8);
> + pmu_raw_writel(tmp, EXYNOS_PS_HOLD_CONTROL);
> +
> + /* Wait a little so we don't give a false warning below */
> + mdelay(100);
> +
> + pr_err("Power down failed, please power off system manually.\n");
> + while (1)
> + ;
> +}
> +
>  static int pmu_restart_notify(struct notifier_block *this,
>   unsigned long code, void *unused)
>  {
> 


[PATCH v3 04/17] mm: free swp_entry in madvise_free

2015-11-11 Thread Minchan Kim

When I test the piece of code below with 12 processes (i.e., 512M * 12 = 6G
consumed) on my machine (3G RAM + 12 CPUs + 8G swap), madvise_free is
significantly slower (i.e., 2x) than madvise_dontneed.

loop = 5;
mmap(512M);
while (loop--) {
memset(512M);
madvise(MADV_FREE or MADV_DONTNEED);
}

The reason is lots of swapin.

1) dontneed: 1,612 swapin
2) madvfree: 879,585 swapin

If we find hinted pages were already swapped out when syscall is called,
it's pointless to keep the swapped-out pages in pte.
Instead, let's free the cold page because swapin is more expensive
than (alloc page + zeroing).

With this patch, swapin dropped from 879,585 to 1,878, which improved the
elapsed time:

1) dontneed: 6.10user 233.50system 0:50.44elapsed
2) madvfree: 6.03user 401.17system 1:30.67elapsed
3) madvfree + below patch: 6.70user 339.14system 1:04.45elapsed
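
For readers who want to try the loop from the description above, here is a
minimal userspace sketch of it in C. It is not the program used for the
numbers in this mail; the function name run_madvise_loop() and its
parameters are made up for illustration. The advice value is passed in by
the caller, so it can be MADV_DONTNEED, or MADV_FREE where the installed
headers define it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1	/* for MAP_ANONYMOUS and madvise() in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of anonymous memory, then
 * `loops` times dirty the whole range and hint it away with
 * madvise(advice).  Returns 0 on success, -1 if mmap() or madvise()
 * fails.
 */
int run_madvise_loop(size_t len, int loops, int advice)
{
	char *buf;
	int ret = 0;

	assert(len > 0 && loops >= 0);

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	while (loops--) {
		memset(buf, 0xa5, len);		/* touch every page */
		if (madvise(buf, len, advice) != 0) {
			ret = -1;
			break;
		}
	}

	munmap(buf, len);
	return ret;
}
```

Calling run_madvise_loop(512UL << 20, 5, advice) from 12 processes would
roughly match the setup described above, assuming the machine has the
memory and swap for it.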

Acked-by: Michal Hocko 
Acked-by: Hugh Dickins 
Signed-off-by: Minchan Kim 
---
 mm/madvise.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index a8813f7b37b3..6240a5de4a3a 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -270,6 +270,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long 
addr,
spinlock_t *ptl;
pte_t *pte, ptent;
struct page *page;
+   int nr_swap = 0;
 
split_huge_page_pmd(vma, addr, pmd);
if (pmd_trans_unstable(pmd))
@@ -280,8 +281,24 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned 
long addr,
for (; addr != end; pte++, addr += PAGE_SIZE) {
ptent = *pte;
 
-   if (!pte_present(ptent))
+   if (pte_none(ptent))
continue;
+   /*
+* If the pte has swp_entry, just clear page table to
+* prevent swap-in which is more expensive rather than
+* (page allocation + zeroing).
+*/
+   if (!pte_present(ptent)) {
+   swp_entry_t entry;
+
+   entry = pte_to_swp_entry(ptent);
+   if (non_swap_entry(entry))
+   continue;
+   nr_swap--;
+   free_swap_and_cache(entry);
+   pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
+   continue;
+   }
 
page = vm_normal_page(vma, addr, ptent);
if (!page)
@@ -317,6 +334,13 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned 
long addr,
}
}
 
+   if (nr_swap) {
+   if (current->mm == mm)
+   sync_mm_rss(mm);
+
+   add_mm_counter(mm, MM_SWAPENTS, nr_swap);
+   }
+
arch_leave_lazy_mmu_mode();
pte_unmap_unlock(pte - 1, ptl);
cond_resched();
-- 
1.9.1


[PATCH v3 07/17] mm: mark stable page dirty in KSM

2015-11-11 Thread Minchan Kim

The MADV_FREE patchset changes page reclaim to simply free a clean
anonymous page with no dirty ptes, instead of swapping it out; but
KSM uses clean write-protected ptes to reference the stable ksm page.
So be sure to mark that page dirty, so it's never mistakenly discarded.

[hughd: adjusted comments]
Acked-by: Hugh Dickins 
Signed-off-by: Minchan Kim 
---
 mm/ksm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/mm/ksm.c b/mm/ksm.c
index 7ee101eaacdf..18d2b7afecff 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1053,6 +1053,12 @@ static int try_to_merge_one_page(struct vm_area_struct 
*vma,
 */
set_page_stable_node(page, NULL);
mark_page_accessed(page);
+   /*
+* Page reclaim just frees a clean page with no dirty
+* ptes: make sure that the ksm page would be swapped.
+*/
+   if (!PageDirty(page))
+   SetPageDirty(page);
err = 0;
} else if (pages_identical(page, kpage))
err = replace_page(vma, page, kpage, orig_pte);
-- 
1.9.1


[PATCH v3 02/17] mm: define MADV_FREE for some arches

2015-11-11 Thread Minchan Kim

Most architectures use asm-generic, but alpha, mips, parisc, xtensa
need their own definitions.

This patch defines MADV_FREE for them, which should fix the build
breakage on those architectures.

Maybe I should split this up and feed the pieces to the arch maintainers,
but it is included here for mmotm convenience.

Cc: Michael Kerrisk 
Cc: Richard Henderson 
Cc: Ivan Kokshaysky 
Cc: "James E.J. Bottomley" 
Cc: Helge Deller 
Cc: Ralf Baechle 
Cc: Chris Zankel 
Acked-by: Max Filippov 
Reported-by: kbuild test robot 
Signed-off-by: Minchan Kim 
---
 arch/alpha/include/uapi/asm/mman.h  | 1 +
 arch/mips/include/uapi/asm/mman.h   | 1 +
 arch/parisc/include/uapi/asm/mman.h | 1 +
 arch/xtensa/include/uapi/asm/mman.h | 1 +
 4 files changed, 4 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/mman.h 
b/arch/alpha/include/uapi/asm/mman.h
index 0086b472bc2b..836fbd44f65b 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -44,6 +44,7 @@
 #define MADV_WILLNEED  3   /* will need these pages */
 #define MADV_SPACEAVAIL 5   /* ensure resources are available */
 #define MADV_DONTNEED  6   /* don't need these pages */
+#define MADV_FREE  7   /* free pages only if memory pressure */
 
 /* common/generic parameters */
 #define MADV_REMOVE 9   /* remove these pages & resources */
diff --git a/arch/mips/include/uapi/asm/mman.h 
b/arch/mips/include/uapi/asm/mman.h
index cfcb876cae6b..106e741aa7ee 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -67,6 +67,7 @@
 #define MADV_SEQUENTIAL 2  /* expect sequential page references */
 #define MADV_WILLNEED  3   /* will need these pages */
 #define MADV_DONTNEED  4   /* don't need these pages */
+#define MADV_FREE  5   /* free pages only if memory pressure */
 
 /* common parameters: try to keep these consistent across architectures */
 #define MADV_REMOVE 9   /* remove these pages & resources */
diff --git a/arch/parisc/include/uapi/asm/mman.h 
b/arch/parisc/include/uapi/asm/mman.h
index 294d251ca7b2..6cb8db76fd4e 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -40,6 +40,7 @@
 #define MADV_SPACEAVAIL 5   /* insure that resources are reserved */
 #define MADV_VPS_PURGE  6   /* Purge pages from VM page cache */
 #define MADV_VPS_INHERIT 7  /* Inherit parents page size */
+#define MADV_FREE  8   /* free pages only if memory pressure */
 
 /* common/generic parameters */
 #define MADV_REMOVE 9   /* remove these pages & resources */
diff --git a/arch/xtensa/include/uapi/asm/mman.h 
b/arch/xtensa/include/uapi/asm/mman.h
index 201aec0e0446..1b19f25bc567 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -80,6 +80,7 @@
 #define MADV_SEQUENTIAL 2   /* expect sequential page references */
 #define MADV_WILLNEED  3   /* will need these pages */
 #define MADV_DONTNEED  4   /* don't need these pages */
+#define MADV_FREE  5   /* free pages only if memory pressure */
 
 /* common parameters: try to keep these consistent across architectures */
 #define MADV_REMOVE 9   /* remove these pages & resources */
-- 
1.9.1


[PATCH v3 08/17] x86: add pmd_[dirty|mkclean] for THP

2015-11-11 Thread Minchan Kim

MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the contents
of a THP page have been overwritten since the MADV_FREE syscall was called.

This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
support.
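
As a rough illustration of what the two helpers do, here is a userspace toy
model in C. It is only a sketch: toy_pmd_t and the toy_* names are
hypothetical and are not the kernel's types or API, and the bit positions
(accessed = bit 5, dirty = bit 6) are an assumption that mirrors the x86
page table layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a pmd entry; not the kernel's pmd_t. */
typedef uint64_t toy_pmd_t;

/* Assumed flag positions, mirroring the x86 layout. */
#define TOY_PAGE_ACCESSED	(1ULL << 5)
#define TOY_PAGE_DIRTY		(1ULL << 6)

/* Clear the given flag bits and leave every other bit untouched. */
toy_pmd_t toy_pmd_clear_flags(toy_pmd_t pmd, uint64_t clear)
{
	return pmd & ~clear;
}

/* Has the huge page been written to since the entry was last cleaned? */
bool toy_pmd_dirty(toy_pmd_t pmd)
{
	return (pmd & TOY_PAGE_DIRTY) != 0;
}

/* What MADV_FREE wants at hint time: drop only the dirty bit. */
toy_pmd_t toy_pmd_mkclean(toy_pmd_t pmd)
{
	return toy_pmd_clear_flags(pmd, TOY_PAGE_DIRTY);
}
```

The point of the model is that mkclean leaves the accessed bit and the
address bits alone, so a later store is the only thing that can make the
entry look dirty again.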

Signed-off-by: Minchan Kim 
Signed-off-by: Andrew Morton 
---
 arch/x86/include/asm/pgtable.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 867da5bbb4a3..b964d54300e1 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -267,6 +267,11 @@ static inline pmd_t pmd_mkold(pmd_t pmd)
return pmd_clear_flags(pmd, _PAGE_ACCESSED);
 }
 
+static inline pmd_t pmd_mkclean(pmd_t pmd)
+{
+   return pmd_clear_flags(pmd, _PAGE_DIRTY);
+}
+
 static inline pmd_t pmd_wrprotect(pmd_t pmd)
 {
return pmd_clear_flags(pmd, _PAGE_RW);
-- 
1.9.1


[PATCH v3 06/17] mm: clear PG_dirty to mark page freeable

2015-11-11 Thread Minchan Kim

Basically, MADV_FREE relies on the dirty bit in the page table entry to
decide whether the VM is allowed to discard the page.  IOW, if the page
table entry has the dirty bit set, the VM shouldn't discard the page.

However, if, for example, a swap-in by read fault happens, the page table
entry doesn't have the dirty bit set, so MADV_FREE could wrongly discard
the page.

To avoid the problem, MADV_FREE did additional checks with PageDirty
and PageSwapCache. That worked because a swapped-in page lives in the
swap cache, and once it is evicted from the swap cache the page has the
PG_dirty flag. So checking both page flags effectively prevents
wrong discarding by MADV_FREE.

However, a problem with the above logic is that a swapped-in page still
has PG_dirty after it is removed from the swap cache, so the VM can no
longer consider the page freeable even if madvise_free is called later.

See the example below for details.

ptr = malloc();
memset(ptr);
..
..
.. heavy memory pressure so all of pages are swapped out
..
..
var = *ptr; -> a page swapped-in and could be removed from
   swapcache. Then, page table doesn't mark
   dirty bit and page descriptor includes PG_dirty
..
..
madvise_free(ptr); -> It doesn't clear PG_dirty of the page.
..
..
..
.. heavy memory pressure again.
.. In this time, VM cannot discard the page because the page
.. has *PG_dirty*

To solve the problem, this patch clears PG_dirty only if the page is owned
exclusively by the current process when madvise is called. PG_dirty
represents the dirtiness of ptes in several processes, so we can clear it
only if we own the page exclusively.
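
The exclusivity rule above can be sketched as a small userspace toy model
in C. This is not kernel code: struct toy_page and the toy_* functions are
hypothetical, the refcount field stands in for page_count(), and the model
assumes that dropping the swap cache reference always succeeds.

```c
#include <stdbool.h>

/* Toy stand-in for struct page; field names are illustrative only. */
struct toy_page {
	int refcount;	/* models page_count() */
	bool swapcache;	/* models PageSwapCache() */
	bool dirty;	/* models PageDirty() */
};

/*
 * The page is exclusive if the only references are the caller's
 * mapping plus, when present, the swap cache's own reference.
 */
bool toy_page_exclusive(const struct toy_page *page)
{
	return page->refcount == 1 + (page->swapcache ? 1 : 0);
}

/*
 * Model of the per-page step at madvise time.  Returns true if the
 * page ends up freeable (not in swap cache, PG_dirty clear), false
 * if it had to be skipped because somebody else shares it.
 */
bool toy_madvise_free_page(struct toy_page *page)
{
	if (!page->swapcache && !page->dirty)
		return true;		/* nothing to clean up */

	if (!toy_page_exclusive(page))
		return false;		/* shared: must not clear PG_dirty */

	if (page->swapcache) {
		/* assumed to always succeed in this model */
		page->swapcache = false;
		page->refcount--;
	}
	page->dirty = false;
	return true;
}
```

A page with refcount 2 and no swap cache reference models a page shared
with another process, which is the case where PG_dirty has to stay set.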

Acked-by: Michal Hocko 
Acked-by: Hugh Dickins 
Signed-off-by: Minchan Kim 
---
 mm/madvise.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 3462a3ca9690..4e67ba0b1104 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -304,11 +304,19 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned 
long addr,
if (!page)
continue;
 
-   if (PageSwapCache(page)) {
+   if (PageSwapCache(page) || PageDirty(page)) {
if (!trylock_page(page))
continue;
+   /*
+* If page is shared with others, we couldn't clear
+* PG_dirty of the page.
+*/
+   if (page_count(page) != 1 + !!PageSwapCache(page)) {
+   unlock_page(page);
+   continue;
+   }
 
-   if (!try_to_free_swap(page)) {
+   if (PageSwapCache(page) && !try_to_free_swap(page)) {
unlock_page(page);
continue;
}
-- 
1.9.1


[PATCH v3 05/17] mm: move lazily freed pages to inactive list

2015-11-11 Thread Minchan Kim

MADV_FREE is a hint that it's okay to discard pages if there is memory
pressure, and we use reclaimers (i.e., kswapd and direct reclaim) to free
them. There is no value in keeping them on the active anonymous LRU, so
this patch moves them to the head of the inactive LRU list.

This means that MADV_FREE-ed pages which were living on the inactive list
are reclaimed first because they are more likely to be cold than
recently active pages.

An arguable issue with this approach is whether we should put the page
at the head or the tail of the inactive list.  I chose the head because
the kernel cannot tell whether the page is really cold or warm for every
MADV_FREE use case, but at least we know it's not *hot*, so landing at the
inactive head is a compromise for various use cases.

This fixes the suboptimal behavior of MADV_FREE where pages living on the
active list sit there for a long time even under memory pressure
while the inactive list is reclaimed heavily.  This basically breaks the
whole purpose of using MADV_FREE to help the system free memory which
might not be used.
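
To make the head-versus-tail choice concrete, here is a toy model in C of
two LRU lists. It is only a sketch under simplifying assumptions: the
toy_lru structure and toy_* functions are hypothetical, pages are plain
integers, index 0 is the head of a list, and reclaim always takes from the
tail.

```c
#include <string.h>

#define TOY_LRU_MAX 16

/* pages[0] is the head of the list, pages[nr - 1] is the tail. */
struct toy_lru {
	int pages[TOY_LRU_MAX];
	int nr;
};

/* Insert at the head.  Returns 0 on success, -1 if the list is full. */
int toy_lru_add_head(struct toy_lru *lru, int page)
{
	if (lru->nr == TOY_LRU_MAX)
		return -1;
	memmove(&lru->pages[1], &lru->pages[0], lru->nr * sizeof(int));
	lru->pages[0] = page;
	lru->nr++;
	return 0;
}

/* Remove a page wherever it sits.  Returns 0 if it was found. */
int toy_lru_del(struct toy_lru *lru, int page)
{
	for (int i = 0; i < lru->nr; i++) {
		if (lru->pages[i] != page)
			continue;
		memmove(&lru->pages[i], &lru->pages[i + 1],
			(lru->nr - i - 1) * sizeof(int));
		lru->nr--;
		return 0;
	}
	return -1;
}

/* Model of deactivation: active list -> head of the inactive list. */
int toy_deactivate_page(struct toy_lru *active, struct toy_lru *inactive,
			int page)
{
	if (inactive->nr == TOY_LRU_MAX)
		return -1;
	if (toy_lru_del(active, page) != 0)
		return -1;
	return toy_lru_add_head(inactive, page);
}

/* Reclaim scans from the tail.  Returns the page, or -1 if empty. */
int toy_reclaim_one(struct toy_lru *inactive)
{
	if (inactive->nr == 0)
		return -1;
	return inactive->pages[--inactive->nr];
}
```

In this model a freshly deactivated page is reclaimed only after the pages
that were already on the inactive list, which is the compromise described
above.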

Cc: Johannes Weiner 
Cc: Mel Gorman 
Cc: Rik van Riel 
Cc: Shaohua Li 
Acked-by: Hugh Dickins 
Acked-by: Michal Hocko 
Signed-off-by: Minchan Kim 
---
 include/linux/swap.h |  2 +-
 mm/madvise.c |  3 +++
 mm/swap.c| 62 +---
 mm/truncate.c|  2 +-
 4 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7ba7dccaf0e7..8e944c0cedea 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -307,7 +307,7 @@ extern void lru_add_drain(void);
 extern void lru_add_drain_cpu(int cpu);
 extern void lru_add_drain_all(void);
 extern void rotate_reclaimable_page(struct page *page);
-extern void deactivate_file_page(struct page *page);
+extern void deactivate_page(struct page *page);
 extern void swap_setup(void);
 
 extern void add_page_to_unevictable_list(struct page *page);
diff --git a/mm/madvise.c b/mm/madvise.c
index 6240a5de4a3a..3462a3ca9690 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -317,6 +317,9 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long 
addr,
unlock_page(page);
}
 
+   if (PageActive(page))
+   deactivate_page(page);
+
if (pte_young(ptent) || pte_dirty(ptent)) {
/*
 * Some of architecture(ex, PPC) don't update TLB
diff --git a/mm/swap.c b/mm/swap.c
index 983f692a47fd..a2f2cd458de0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -44,7 +44,7 @@ int page_cluster;
 
 static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
 static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
-static DEFINE_PER_CPU(struct pagevec, lru_deactivate_file_pvecs);
+static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
 
 /*
  * This path almost never happens for VM activity - pages are normally
@@ -733,13 +733,13 @@ void lru_cache_add_active_or_unevictable(struct page 
*page,
 }
 
 /*
- * If the page can not be invalidated, it is moved to the
+ * If the file page can not be invalidated, it is moved to the
  * inactive list to speed up its reclaim.  It is moved to the
  * head of the list, rather than the tail, to give the flusher
  * threads some time to write it out, as this is much more
  * effective than the single-page writeout from reclaim.
  *
- * If the page isn't page_mapped and dirty/writeback, the page
+ * If the file page isn't page_mapped and dirty/writeback, the page
  * could reclaim asap using PG_reclaim.
  *
  * 1. active, mapped page -> none
@@ -752,32 +752,36 @@ void lru_cache_add_active_or_unevictable(struct page 
*page,
  * In 4, why it moves inactive's head, the VM expects the page would
  * be write it out by flusher threads as this is much more effective
  * than the single-page writeout from reclaim.
+ *
+ * If @page is anonymous page, it is moved to the inactive list.
  */
-static void lru_deactivate_file_fn(struct page *page, struct lruvec *lruvec,
+static void lru_deactivate_fn(struct page *page, struct lruvec *lruvec,
  void *arg)
 {
-   int lru, file;
-   bool active;
+   int lru;
+   bool file, active;
 
-   if (!PageLRU(page))
+   if (!PageLRU(page) || PageUnevictable(page))
return;
 
-   if (PageUnevictable(page))
-   return;
+   file = page_is_file_cache(page);
+   active = PageActive(page);
+   lru = page_lru_base_type(page);
 
-   /* Some processes are using the page */
-   if (page_mapped(page))
+   if (!file && !active)
return;
 
-   active = PageActive(page);
-   file = page_is_file_cache(page);
-   lru = page_lru_base_type(page);
+   if (file && page_mapped(page))
+   return;
 
del_page_from_lru_list(page, lruvec, lru + active);
ClearPageActive(page);
-   ClearPageReferenced(page);

[PATCH v3 12/17] arm64: add pmd_mkclean for THP

2015-11-11 Thread Minchan Kim

MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the contents
of a THP page have been overwritten since the MADV_FREE syscall was called.

This patch adds pmd_mkclean for THP page MADV_FREE support.

Signed-off-by: Minchan Kim 
---
 arch/arm64/include/asm/pgtable.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 26b066690593..a945263addd4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -325,6 +325,7 @@ void pmdp_splitting_flush(struct vm_area_struct *vma, 
unsigned long address,
 #define pmd_mksplitting(pmd)   pte_pmd(pte_mkspecial(pmd_pte(pmd)))
 #define pmd_mkold(pmd) pte_pmd(pte_mkold(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)   pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#define pmd_mkclean(pmd)   pte_pmd(pte_mkclean(pmd_pte(pmd)))
 #define pmd_mkdirty(pmd)   pte_pmd(pte_mkdirty(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)   pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mknotpresent(pmd)  (__pmd(pmd_val(pmd) & ~PMD_TYPE_MASK))
-- 
1.9.1


[PATCH v3 01/17] mm: support madvise(MADV_FREE)

2015-11-11 Thread Minchan Kim

Linux doesn't have the ability to free pages lazily, while other OSes have
long supported it via madvise(MADV_FREE).

The gain is clear: the kernel can discard freed pages rather than swapping
them out or going OOM if memory pressure happens.

Without memory pressure, freed pages can be reused by userspace without
additional overhead (e.g., page fault + allocation + zeroing).

Jason Evans said:

: Facebook has been using MAP_UNINITIALIZED
: (https://lkml.org/lkml/2012/1/18/308) in some of its applications for
: several years, but there are operational costs to maintaining this
: out-of-tree in our kernel and in jemalloc, and we are anxious to retire it
: in favor of MADV_FREE.  When we first enabled MAP_UNINITIALIZED it
: increased throughput for much of our workload by ~5%, and although the
: benefit has decreased using newer hardware and kernels, there is still
: enough benefit that we cannot reasonably retire it without a replacement.
:
: Aside from Facebook operations, there are numerous broadly used
: applications that would benefit from MADV_FREE.  The ones that immediately
: come to mind are redis, varnish, and MariaDB.  I don't have much insight
: into Android internals and development process, but I would hope to see
: MADV_FREE support eventually end up there as well to benefit applications
: linked with the integrated jemalloc.
:
: jemalloc will use MADV_FREE once it becomes available in the Linux kernel.
: In fact, jemalloc already uses MADV_FREE or equivalent everywhere it's
: available: *BSD, OS X, Windows, and Solaris -- every platform except Linux
: (and AIX, but I'm not sure it even compiles on AIX).  The lack of
: MADV_FREE on Linux forced me down a long series of increasingly
: sophisticated heuristics for madvise() volume reduction, and even so this
: remains a common performance issue for people using jemalloc on Linux.
: Please integrate MADV_FREE; many people will benefit substantially.

How it works:

When the madvise syscall is called, the VM clears the dirty bit of the ptes
in the range.  If memory pressure happens, the VM checks the dirty bit in
the page table, and if it is still "clean", the page is a "lazyfree" page,
so the VM can discard it instead of swapping it out.  If there was a store
to the page before the VM picks it for reclaim, the dirty bit is set, so
the VM swaps the page out instead of discarding it.
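
The lifecycle just described can be written down as a tiny state model in
C. This is a sketch for illustration only: struct toy_pte, the enum and the
toy_* functions are hypothetical names, and the model tracks nothing but
the dirty bit of a single pte.

```c
#include <stdbool.h>

/* What reclaim does with the page behind a pte in this model. */
enum toy_reclaim_result {
	TOY_DISCARDED,		/* lazyfree page: dropped without I/O */
	TOY_SWAPPED_OUT,	/* contents still matter: written to swap */
};

/* Toy pte that tracks only the dirty bit. */
struct toy_pte {
	bool dirty;
};

/* madvise(MADV_FREE): clear the dirty bit for the hinted page. */
void toy_madvise_free(struct toy_pte *pte)
{
	pte->dirty = false;
}

/* A store to the page: the hardware sets the dirty bit again. */
void toy_store(struct toy_pte *pte)
{
	pte->dirty = true;
}

/* Reclaim under memory pressure: discard if clean, else swap out. */
enum toy_reclaim_result toy_reclaim(const struct toy_pte *pte)
{
	return pte->dirty ? TOY_SWAPPED_OUT : TOY_DISCARDED;
}
```

A store between the hint and reclaim flips the outcome from discard to
swap-out, which is what lets userspace safely reuse a hinted range.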

The first heavy users would be general allocators (e.g., jemalloc, tcmalloc,
and hopefully glibc), and jemalloc/tcmalloc already support the feature
on other OSes (e.g., FreeBSD).

barrios@blaptop:~/benchmark/ebizzy$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                12
On-line CPU(s) list:   0-11
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             12
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 2
Stepping:              3
CPU MHz:               3200.185
BogoMIPS:              6400.53
Virtualization:        VT-x
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0-11

ebizzy benchmark (./ebizzy -S 10 -n 512)

Higher avg is better.

             vanilla-jemalloc        MADV_free-jemalloc

1 thread
records:     10                      10
avg:         2961.90                 12069.70
std:         71.96 (2.43%)           186.68 (1.55%)
max:         3070.00                 12385.00
min:         2796.00                 11746.00

2 threads
records:     10                      10
avg:         5020.00                 17827.00
std:         264.87 (5.28%)          358.52 (2.01%)
max:         5244.00                 18760.00
min:         4251.00                 17382.00

4 threads
records:     10                      10
avg:         8988.80                 27930.80
std:         1175.33 (13.08%)        3317.33 (11.88%)
max:         9508.00                 30879.00
min:         5477.00                 21024.00

8 threads
records:     10                      10
avg:         13036.50                33739.40
std:         170.67 (1.31%)          5146.22 (15.25%)
max:         13371.00                40572.00
min:         12785.00                24088.00

16 threads
records:     10                      10
avg:         11092.40                31424.20
std:         710.60 (6.41%)          3763.89 (11.98%)
max:         12446.00                36635.00
min:         9949.00                 25669.00

32 threads
records:     10                      10
avg:         11067.00                34495.80
std:         971.06 (8.77%)          2721.36 (7.89%)
max:         12010.00                38598.00
min:         9002.00                 30636.00
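
For anyone re-deriving figures from the table, the percentage next to each
std value appears to be std divided by avg, and the gain for a given thread
count is the ratio of the two avg values. A small C sketch of that
arithmetic follows; the function names are hypothetical and the
interpretation of the percentage is an assumption checked only against the
numbers above.

```c
/* Relative spread: standard deviation as a percentage of the mean. */
double toy_std_percent(double std, double avg)
{
	return 100.0 * std / avg;
}

/* Throughput ratio of the patched run over the vanilla run. */
double toy_speedup(double vanilla_avg, double patched_avg)
{
	return patched_avg / vanilla_avg;
}
```

For the 1 thread row, toy_std_percent(71.96, 2961.90) reproduces the 2.43%
shown in the table, and toy_speedup(2961.90, 12069.70) comes out at a
little over 4x.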

In summary, MADV_FRE

[PATCH v3 00/17] MADV_FREE support

2015-11-11 Thread Minchan Kim

MADV_FREE has been in linux-next for a long time. There were two reasons,
I think.

1. The MADV_FREE code on the reclaim path was a real mess.

2. Andrew really wanted to hear from userland people who want to use
   the syscall.

A few months ago, Daniel Micay (an active jemalloc contributor) asked me
to make progress on upstreaming, but I was busy at the time, so it took
me a long time to revisit the code. I finally cleaned up the mess
recently, which solves the #1 issue.

In addition, Daniel and Jason (the jemalloc maintainer) recently asked
Andrew again, and they said it would be great to have even though
it has a swap dependency now, so Andrew decided he will do that for v4.4.

However, there were still some concerns.

* hotness

Some think MADV_FREEed pages are really cold while others do not.
See the details in the description of "mm: add knob to tune lazyfreeing".

* swap dependency

In the old version, MADV_FREE is equal to MADV_DONTNEED on a swapless
system because we don't age the anonymous LRU list on swapless systems.
So there are requests for MADV_FREE to support swapless systems.

To address these issues, this version includes a new LRU list for
hinted pages and a tuning knob. With that, we can support swapless
systems without zapping hinted pages instantly.

Please review and comment.

I have tested it on v4.3-rc7 and haven't found any problems so far.

git: git://git.kernel.org/pub/scm/linux/kernel/git/minchan/linux.git
branch: mm/madv_free-v4.3-rc7-v3-lazyfreelru

At this stage, I don't think we need to write a man page.
That can be done once the policy and implementation are solid.

 * Change from v2
   * add new LRU list and tuning knob
   * support swapless

 * Change from v1
   * Don't do unnecessary TLB flush - Shaohua
   * Added Acked-by - Hugh, Michal
   * Merge deactivate_page and deactivate_file_page
   * Add pmd_dirty/pmd_mkclean patches for several arches
   * Add lazy THP split patch
   * Drop zhangyan...@cn.fujitsu.com - Delivery Failure

Chen Gang (1):
  arch: uapi: asm: mman.h: Let MADV_FREE have same value for all
architectures

Minchan Kim (16):
  mm: support madvise(MADV_FREE)
  mm: define MADV_FREE for some arches
  mm: free swp_entry in madvise_free
  mm: move lazily freed pages to inactive list
  mm: clear PG_dirty to mark page freeable
  mm: mark stable page dirty in KSM
  x86: add pmd_[dirty|mkclean] for THP
  sparc: add pmd_[dirty|mkclean] for THP
  powerpc: add pmd_[dirty|mkclean] for THP
  arm: add pmd_mkclean for THP
  arm64: add pmd_mkclean for THP
  mm: don't split THP page when syscall is called
  mm: introduce wrappers to add new LRU
  mm: introduce lazyfree LRU list
  mm: support MADV_FREE on swapless system
  mm: add knob to tune lazyfreeing

 Documentation/sysctl/vm.txt   |  13 +++
 arch/alpha/include/uapi/asm/mman.h|   1 +
 arch/arm/include/asm/pgtable-3level.h |   1 +
 arch/arm64/include/asm/pgtable.h  |   1 +
 arch/mips/include/uapi/asm/mman.h |   1 +
 arch/parisc/include/uapi/asm/mman.h   |   1 +
 arch/powerpc/include/asm/pgtable-ppc64.h  |   2 +
 arch/sparc/include/asm/pgtable_64.h   |   9 ++
 arch/x86/include/asm/pgtable.h|   5 +
 arch/xtensa/include/uapi/asm/mman.h   |   1 +
 drivers/base/node.c   |   2 +
 drivers/staging/android/lowmemorykiller.c |   3 +-
 fs/proc/meminfo.c |   2 +
 include/linux/huge_mm.h   |   3 +
 include/linux/memcontrol.h|   1 +
 include/linux/mm_inline.h |  83 ++-
 include/linux/mmzone.h|  16 ++-
 include/linux/page-flags.h|   5 +
 include/linux/rmap.h  |   1 +
 include/linux/swap.h  |  18 +++-
 include/linux/vm_event_item.h |   3 +-
 include/trace/events/vmscan.h |  38 ---
 include/uapi/asm-generic/mman-common.h|   1 +
 kernel/sysctl.c   |   9 ++
 mm/compaction.c   |  14 ++-
 mm/huge_memory.c  |  51 +++--
 mm/ksm.c  |   6 ++
 mm/madvise.c  | 171 ++
 mm/memcontrol.c   |  44 +++-
 mm/memory-failure.c   |   7 +-
 mm/memory_hotplug.c   |   3 +-
 mm/mempolicy.c|   3 +-
 mm/migrate.c  |  28 ++---
 mm/page_alloc.c   |   3 +
 mm/rmap.c |  14 +++
 mm/swap.c | 128 +++---
 mm/swap_state.c   |  11 +-
 mm/truncate.c |   2 +-
 mm/vmscan.c   | 157 ---
 mm/vmstat.c   |   4 +
 40 files changed, 713 insertions(+), 153 deletions(-)

-- 
1.9.1


Re: [PATCH v4 9/9] drivers: soc: Add support for Exynos PMU driver

2015-11-11 Thread Krzysztof Kozlowski

On 10.11.2015 20:43, Pankaj Dubey wrote:
> This patch moves Exynos PMU driver implementation from "arm/mach-exynos"
> to "drivers/soc/samsung". This driver is mainly used for setting misc
> bits of register from PMU IP of Exynos SoC which will be required to
> configure before Suspend/Resume. Currently all these settings are done
> in "arch/arm/mach-exynos/pmu.c" but moving ahead for ARM64 based SoC
> support, there is a need of this PMU driver in driver/* folder.
> 
> This driver uses existing DT binding information and there should
> be no functionality change in the supported platforms.
> 
> Signed-off-by: Amit Daniel Kachhap 
> Signed-off-by: Pankaj Dubey 
> ---
>  arch/arm/mach-exynos/Kconfig  |   1 +
>  arch/arm/mach-exynos/Makefile |   4 +-
>  arch/arm/mach-exynos/exynos-pmu.h |  45 --
>  arch/arm/mach-exynos/exynos3250-pmu.c | 175 -
>  arch/arm/mach-exynos/exynos4-pmu.c| 223 ---
>  arch/arm/mach-exynos/exynos5250-pmu.c | 196 
>  arch/arm/mach-exynos/exynos5420-pmu.c | 280 
> --
>  arch/arm/mach-exynos/pmu.c| 183 --
>  drivers/soc/samsung/Kconfig   |   4 +
>  drivers/soc/samsung/Makefile  |   4 +
>  drivers/soc/samsung/exynos-pmu.c  | 183 ++
>  drivers/soc/samsung/exynos-pmu.h  |  45 ++
>  drivers/soc/samsung/exynos3250-pmu.c  | 175 +
>  drivers/soc/samsung/exynos4-pmu.c | 223 +++
>  drivers/soc/samsung/exynos5250-pmu.c  | 196 
>  drivers/soc/samsung/exynos5420-pmu.c  | 280 
> ++
>  16 files changed, 1112 insertions(+), 1105 deletions(-)
>  delete mode 100644 arch/arm/mach-exynos/exynos-pmu.h
>  delete mode 100644 arch/arm/mach-exynos/exynos3250-pmu.c
>  delete mode 100644 arch/arm/mach-exynos/exynos4-pmu.c
>  delete mode 100644 arch/arm/mach-exynos/exynos5250-pmu.c
>  delete mode 100644 arch/arm/mach-exynos/exynos5420-pmu.c
>  delete mode 100644 arch/arm/mach-exynos/pmu.c
>  create mode 100644 drivers/soc/samsung/exynos-pmu.c
>  create mode 100644 drivers/soc/samsung/exynos-pmu.h
>  create mode 100644 drivers/soc/samsung/exynos3250-pmu.c
>  create mode 100644 drivers/soc/samsung/exynos4-pmu.c
>  create mode 100644 drivers/soc/samsung/exynos5250-pmu.c
>  create mode 100644 drivers/soc/samsung/exynos5420-pmu.c

Again - renames were not detected. This is strange... and actually
unreadable. The previous patch looked much better. What happened?

Please also send the entire patchset to the linux-pm mailing list. I asked
about it last time and can't see it as a recipient here.

Best regards,
Krzysztof


[PATCH v3 11/17] arm: add pmd_mkclean for THP

2015-11-11 Thread Minchan Kim

MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the contents
of a THP page have been overwritten since the MADV_FREE syscall was called.

This patch adds pmd_mkclean for THP page MADV_FREE support.
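
As a rough userspace illustration of why both helpers are needed, here is
a toy model. It is not the ARM LPAE code: toy_pmd_t, TOY_PMD_DIRTY and all
toy_* helpers are made-up names, and the bit position is arbitrary. The
hint clears the dirty bit, and reclaim later checks whether it was set
again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a huge-page PMD entry; the bit position is arbitrary. */
typedef struct { uint64_t val; } toy_pmd_t;
#define TOY_PMD_DIRTY (UINT64_C(1) << 55)

/* Generate a toy_pmd_<fn>() helper, loosely modelled on PMD_BIT_FUNC. */
#define TOY_PMD_BIT_FUNC(fn, op) \
static inline toy_pmd_t toy_pmd_##fn(toy_pmd_t pmd) { pmd.val op; return pmd; }

TOY_PMD_BIT_FUNC(mkdirty, |= TOY_PMD_DIRTY)
TOY_PMD_BIT_FUNC(mkclean, &= ~TOY_PMD_DIRTY)

static inline bool toy_pmd_dirty(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PMD_DIRTY) != 0;
}

/* Hint time: forget earlier writes so that later ones become visible. */
static inline void toy_madvise_free(toy_pmd_t *pmd)
{
	assert(pmd != NULL);
	*pmd = toy_pmd_mkclean(*pmd);
}

/* Reclaim time: the page may be dropped only if nobody wrote to it since. */
static inline bool toy_can_discard(const toy_pmd_t *pmd)
{
	assert(pmd != NULL);
	return !toy_pmd_dirty(*pmd);
}
```

In this model a later store is represented by calling toy_pmd_mkdirty()
on the entry, after which toy_can_discard() reports false.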

Signed-off-by: Minchan Kim 
---
 arch/arm/include/asm/pgtable-3level.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/include/asm/pgtable-3level.h 
b/arch/arm/include/asm/pgtable-3level.h
index a745a2a53853..6d6012a320b2 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -249,6 +249,7 @@ PMD_BIT_FUNC(mkold, &= ~PMD_SECT_AF);
 PMD_BIT_FUNC(mksplitting, |= L_PMD_SECT_SPLITTING);
 PMD_BIT_FUNC(mkwrite,   &= ~L_PMD_SECT_RDONLY);
 PMD_BIT_FUNC(mkdirty,   |= L_PMD_SECT_DIRTY);
+PMD_BIT_FUNC(mkclean,   &= ~L_PMD_SECT_DIRTY);
 PMD_BIT_FUNC(mkyoung,   |= PMD_SECT_AF);
 
 #define pmd_mkhuge(pmd)(__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
-- 
1.9.1


[PATCH v3 10/17] powerpc: add pmd_[dirty|mkclean] for THP

2015-11-11 Thread Minchan Kim

MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the contents
of a THP page have been overwritten since the MADV_FREE syscall was called.

This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
support.

Signed-off-by: Minchan Kim 
Signed-off-by: Andrew Morton 
---
 arch/powerpc/include/asm/pgtable-ppc64.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h 
b/arch/powerpc/include/asm/pgtable-ppc64.h
index fa1dfb7f7b48..85e15c8067be 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -507,9 +507,11 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_pfn(pmd)   pte_pfn(pmd_pte(pmd))
 #define pmd_dirty(pmd) pte_dirty(pmd_pte(pmd))
 #define pmd_young(pmd) pte_young(pmd_pte(pmd))
+#define pmd_dirty(pmd) pte_dirty(pmd_pte(pmd))
 #define pmd_mkold(pmd) pte_pmd(pte_mkold(pmd_pte(pmd)))
 #define pmd_wrprotect(pmd) pte_pmd(pte_wrprotect(pmd_pte(pmd)))
 #define pmd_mkdirty(pmd)   pte_pmd(pte_mkdirty(pmd_pte(pmd)))
+#define pmd_mkclean(pmd)   pte_pmd(pte_mkclean(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)   pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)   pte_pmd(pte_mkwrite(pmd_pte(pmd)))
 
-- 
1.9.1


[PATCH v3 09/17] sparc: add pmd_[dirty|mkclean] for THP

2015-11-11 Thread Minchan Kim

MADV_FREE needs pmd_dirty and pmd_mkclean to detect whether the contents
of a THP page have been overwritten since the MADV_FREE syscall was called.

This patch adds pmd_dirty and pmd_mkclean for THP page MADV_FREE
support.

Signed-off-by: Minchan Kim 
Signed-off-by: Andrew Morton 
---
 arch/sparc/include/asm/pgtable_64.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/sparc/include/asm/pgtable_64.h 
b/arch/sparc/include/asm/pgtable_64.h
index 131d36fcd07a..5833dc5ee7d7 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -717,6 +717,15 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
return __pmd(pte_val(pte));
 }
 
+static inline pmd_t pmd_mkclean(pmd_t pmd)
+{
+   pte_t pte = __pte(pmd_val(pmd));
+
+   pte = pte_mkclean(pte);
+
+   return __pmd(pte_val(pte));
+}
+
 static inline pmd_t pmd_mkyoung(pmd_t pmd)
 {
pte_t pte = __pte(pmd_val(pmd));
-- 
1.9.1


[PATCH v3 17/17] mm: add knob to tune lazyfreeing

2015-11-11 Thread Minchan Kim

The hotness of MADV_FREEed pages is very arguable.
Some think they are hot while others think they are cold.

Quote from Shaohua
"
My main concern is the policy how we should treat the FREE pages. Moving it to
inactive lru is definitely a good start, I'm wondering if it's enough. The
MADV_FREE increases memory pressure and cause unnecessary reclaim because of
the lazy memory free. While MADV_FREE is intended to be a better replacement of
MADV_DONTNEED, MADV_DONTNEED doesn't have the memory pressure issue as it free
memory immediately. So I hope the MADV_FREE doesn't have impact on memory
pressure too. I'm thinking of adding an extra lru list and watermark for this
to make sure FREE pages can be freed before system wide page reclaim. As you
said, this is arguable, but I hope we can discuss about this issue more.
"

Quote from me
"
It seems the divergence comes from the view that MADV_FREE is a *replacement*
for MADV_DONTNEED. But I don't think so. If we could discard a MADV_FREEed page
*anytime*, I would agree, but that's not true because the page could be in a
dirty state when the VM wants to reclaim it.

I'm also against your suggestion to discard FREEed pages before
system-wide page reclaim, because the system could have lots of clean cold page
cache or anonymous pages. In such a case, reclaiming them would be better.
Yep, it's really workload-dependent, so we might need some heuristic, which is
normally what we want to avoid.

Having said that, I agree with you that we could do better than the deactivation
and, frankly speaking, I'm thinking of another LRU list (e.g. tentatively named
"ezreclaim LRU list"). What I have in mind is to age (anon|file|ez)
fairly. IOW, I want to percolate ez-LRU list reclaiming into get_scan_count.
When MADV_FREE is called, we could move the hinted pages from the anon-LRU to
the ez-LRU, and then if the VM finds it cannot discard a page in the ez-LRU,
it could promote it to the active-anon-LRU, which would be a very natural aging
concept because it means someone touched the page recently.
With that, I don't want to bias one side and don't want to add some knob for
tuning the heuristic, but let's rely on the common fair aging scheme of the VM.
"

Quote from Johannes
"
thread 1:
Even if we're wrong about the aging of those MADV_FREE pages, their
contents are invalidated; they can be discarded freely, and restoring
them is a mere GFP_ZERO allocation. All other anonymous pages have to
be written to disk, and potentially be read back.

[ Arguably, MADV_FREE pages should even be reclaimed before inactive
  page cache. It's the same cost to discard both types of pages, but
  restoring page cache involves IO. ]

It probably makes sense to stop thinking about them as anonymous pages
entirely at this point when it comes to aging. They're really not. The
LRU lists are split to differentiate access patterns and cost of page
stealing (and restoring). From that angle, MADV_FREE pages really have
nothing in common with in-use anonymous pages, and so they shouldn't
be on the same LRU list.

thread:2
What about them is hot? They contain garbage, you have to write to
them before you can use them. Granted, you might have to refetch
cachelines if you don't do cacheline-aligned populating writes, but
you can do a lot of them before it's more expensive than doing IO.

"

Quote from Daniel
"
thread:1
Keep in mind that this is memory the kernel wouldn't be getting back at
all if the allocator wasn't going out of the way to purge it, and they
aren't going to go out of their way to purge it if it means the kernel
is going to steal the pages when there isn't actually memory pressure.

An allocator would be using MADV_DONTNEED if it didn't expect that the
pages were going to be used again shortly. MADV_FREE indicates that it
has time to inform the kernel that they're unused but they could still
be very hot.

thread:2
It's hot because applications churn through memory via the allocator.

Drop the pages and the application is now churning through page faults
and zeroing rather than simply reusing memory. It's not something that
may happen, it *will* happen. A page in the page cache *may* be reused,
but often won't be, especially when the I/O patterns don't line up well
with the way it works.

The whole point of the feature is not requiring the allocator to have
elaborate mechanisms for aging pages and throttling purging. That ends
up resulting in lots of memory held by userspace where the kernel can't
reclaim it under memory pressure. If it's dropped before page cache, it
isn't going to be able to replace any of that logic in allocators.

The page cache is speculative. Page caching by allocators is not really
speculative. Using MADV_FREE on the pages at all is speculative. The
memory is probably going to be reused fairly soon (unless the process
exits, and then it doesn't matter), but purging will end up reducing
memory usage for the portions that aren't.

It would be a different story for a full unpinning/pinning feature since
that would have other use cases (speculative caches

Re: [PATCH v3 1/5] spi: introduce mmap read support for spi flash devices

2015-11-11 Thread Vignesh R

Hi Brian,

On 11/12/2015 12:54 AM, Brian Norris wrote:
> In addition to my other comments:
> 

[...]

>> +int (*spi_mtd_mmap_read)(struct  spi_device *spi,
>> + loff_t from, size_t len,
>> + size_t *retlen, u_char *buf,
>> + u8 read_opcode, u8 addr_width,
>> + u8 dummy_bytes);
> 
> This is seeming to be a longer and longer list of arguments. I know MTD
> has a bad habit of long argument lists (which then cause a ton of
> unnecessary churn when things need changed in the API), but perhaps we
> can limit the damage to the SPI layer. Perhaps this deserves a struct to
> encapsulate all the flash read arguments? Like:
> 
> struct spi_flash_read_message {
>   loff_t from;
>   size_t len;
>   size_t *retlen;
>   void *buf;
>   u8 read_opcode;
>   u8 addr_width;
>   u8 dummy_bits;
>   // additional fields to describe rx_nbits for opcode/addr/data
> };
> 
> struct spi_master {
>   ...
>   int (*spi_flash_read)(struct spi_device *spi,
> struct spi_flash_message *msg);
> };


Yeah.. I think struct encapsulation helps, this can also be used to pass
sg lists for dma in future. I will rework the series with your
suggestion to include nbits for opcode/addr/data.
Also, will add validation logic (similar to __spi_validate()) to check
whether master supports dual/quad mode for opcode/addr/data. I am
planning to add this validation code to spi_flash_read_validate(in place
of spi_mmap_read_supported())
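
To make the idea concrete, here is a minimal userspace sketch of what the
encapsulated message plus a validation step could look like. Everything
named toy_* or TOY_* is hypothetical: the *_nbits fields, the capability
bits and the rule that a quad-capable controller also accepts dual
transfers are assumptions for illustration, not the final kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller capability bits. */
#define TOY_TX_DUAL (1u << 0)
#define TOY_TX_QUAD (1u << 1)
#define TOY_RX_DUAL (1u << 2)
#define TOY_RX_QUAD (1u << 3)

/* Userspace mock of the proposed message; the *_nbits fields are guesses. */
struct toy_flash_read_message {
	long long from;		/* flash offset to read from */
	size_t len;		/* bytes requested */
	size_t *retlen;		/* bytes actually read */
	void *buf;		/* destination buffer */
	uint8_t read_opcode;
	uint8_t addr_width;
	uint8_t dummy_bytes;
	uint8_t opcode_nbits;	/* 1, 2 or 4 data lines */
	uint8_t addr_nbits;
	uint8_t data_nbits;
};

/* Assumption: a quad-capable controller can handle dual transfers too. */
static bool toy_nbits_ok(uint8_t nbits, unsigned int caps,
			 unsigned int dual, unsigned int quad)
{
	switch (nbits) {
	case 1:
		return true;
	case 2:
		return (caps & (dual | quad)) != 0;
	case 4:
		return (caps & quad) != 0;
	default:
		return false;
	}
}

/* Returns 0 if the controller can carry out the read, -EINVAL otherwise. */
static int toy_flash_read_validate(const struct toy_flash_read_message *msg,
				   unsigned int caps)
{
	assert(msg != NULL);

	if (msg->buf == NULL || msg->retlen == NULL || msg->len == 0)
		return -EINVAL;
	/* Opcode and address are transmitted; data is received. */
	if (!toy_nbits_ok(msg->opcode_nbits, caps, TOY_TX_DUAL, TOY_TX_QUAD) ||
	    !toy_nbits_ok(msg->addr_nbits, caps, TOY_TX_DUAL, TOY_TX_QUAD) ||
	    !toy_nbits_ok(msg->data_nbits, caps, TOY_RX_DUAL, TOY_RX_QUAD))
		return -EINVAL;
	return 0;
}
```

Keeping the per-phase line counts in the struct means that new fields
(such as sg lists for DMA) can be added later without touching every
caller's argument list.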
Thanks!


-- 
Regards
Vignesh

[PATCH v3 15/17] mm: introduce lazyfree LRU list

2015-11-11 Thread Minchan Kim

There are two issues with supporting MADV_FREE.

* Hotness of MADV_FREE pages

It's really arguable. Some think they are cold while others do not.
It depends on the workload, so I think there is no single right
answer. IOW, we need a tunable knob.

* MADV_FREE on swapless system

Currently, we free MADV_FREEed pages instantly on a swapless system
because the anonymous LRU list is not aged on such a system,
so there is no chance to discard them later.

I tried to solve it with the inactive anonymous LRU list, without
introducing a new LRU list, but that needs a few hooks in the reclaim
path to fix the old behavior, which did not look good to me. Moreover,
it makes implementing the tuning knob hard.

To address these issues, this patch adds a new LazyFree LRU list and
functions for its statistics. Pages on the list have the PG_lazyfree flag,
which overrides PG_mappedtodisk (this should be safe because
no anonymous page can have that flag).

If the user calls madvise(start, len, MADV_FREE), pages in the range
move from the anonymous LRU to the lazyfree LRU. When memory pressure
happens, they can be discarded if there has been no store
operation since then. If there has been a store operation, they move
to the active anonymous LRU list.

In this patch, the aging of lazyfree pages is very basic: it just
discards all pages in the list whenever memory pressure happens.
That's enough to prove that it works. A later patch will implement the policy.
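
The page life cycle described above can be sketched as a small state
machine. This is only a userspace model under simplifying assumptions:
struct toy_page, enum toy_lru and the toy_* functions are invented for
illustration, and a "store" is represented by a plain dirty flag rather
than a hardware PTE bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which list the toy page currently sits on. */
enum toy_lru {
	TOY_LRU_ANON,
	TOY_LRU_ACTIVE_ANON,
	TOY_LRU_LAZYFREE,
	TOY_LRU_FREED,
};

struct toy_page {
	enum toy_lru lru;
	bool dirty;		/* set by a store after the hint */
};

/* madvise(MADV_FREE): clear the dirty state and move to the lazyfree list. */
static void toy_madvise_free(struct toy_page *page)
{
	assert(page != NULL);
	if (page->lru == TOY_LRU_ANON || page->lru == TOY_LRU_ACTIVE_ANON) {
		page->dirty = false;
		page->lru = TOY_LRU_LAZYFREE;
	}
}

/* A store only marks the page dirty; reclaim notices it later. */
static void toy_store(struct toy_page *page)
{
	assert(page != NULL);
	page->dirty = true;
}

/* Memory pressure: discard clean lazyfree pages, promote written ones. */
static void toy_reclaim(struct toy_page *page)
{
	assert(page != NULL);
	if (page->lru != TOY_LRU_LAZYFREE)
		return;
	page->lru = page->dirty ? TOY_LRU_ACTIVE_ANON : TOY_LRU_FREED;
}
```

Two pages hinted with toy_madvise_free() end up on different lists after
toy_reclaim() if only one of them saw a toy_store() in between.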

Signed-off-by: Minchan Kim 
---
 drivers/base/node.c   |  2 +
 drivers/staging/android/lowmemorykiller.c |  3 +-
 fs/proc/meminfo.c |  2 +
 include/linux/mm_inline.h | 25 +--
 include/linux/mmzone.h| 11 +++--
 include/linux/page-flags.h|  5 +++
 include/linux/rmap.h  |  2 +-
 include/linux/swap.h  |  1 +
 include/linux/vm_event_item.h |  4 +-
 include/trace/events/vmscan.h | 18 +---
 mm/compaction.c   | 12 --
 mm/huge_memory.c  |  4 +-
 mm/madvise.c  |  3 +-
 mm/memcontrol.c   | 14 +-
 mm/migrate.c  |  2 +
 mm/page_alloc.c   |  3 ++
 mm/rmap.c | 15 +--
 mm/swap.c | 48 +
 mm/vmscan.c   | 71 +--
 mm/vmstat.c   |  3 ++
 20 files changed, 203 insertions(+), 45 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 560751bad294..f7a1f2107b43 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -70,6 +70,7 @@ static ssize_t node_read_meminfo(struct device *dev,
   "Node %d Active(file):   %8lu kB\n"
   "Node %d Inactive(file): %8lu kB\n"
   "Node %d Unevictable:%8lu kB\n"
+  "Node %d LazyFree:   %8lu kB\n"
   "Node %d Mlocked:%8lu kB\n",
   nid, K(i.totalram),
   nid, K(i.freeram),
@@ -83,6 +84,7 @@ static ssize_t node_read_meminfo(struct device *dev,
   nid, K(node_page_state(nid, NR_ACTIVE_FILE)),
   nid, K(node_page_state(nid, NR_INACTIVE_FILE)),
   nid, K(node_page_state(nid, NR_UNEVICTABLE)),
+  nid, K(node_page_state(nid, NR_LZFREE)),
   nid, K(node_page_state(nid, NR_MLOCK)));
 
 #ifdef CONFIG_HIGHMEM
diff --git a/drivers/staging/android/lowmemorykiller.c 
b/drivers/staging/android/lowmemorykiller.c
index 872bd603fd0d..658c16a653c2 100644
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -72,7 +72,8 @@ static unsigned long lowmem_count(struct shrinker *s,
return global_page_state(NR_ACTIVE_ANON) +
global_page_state(NR_ACTIVE_FILE) +
global_page_state(NR_INACTIVE_ANON) +
-   global_page_state(NR_INACTIVE_FILE);
+   global_page_state(NR_INACTIVE_FILE) +
+   global_page_state(NR_LZFREE);
 }
 
 static unsigned long lowmem_scan(struct shrinker *s, struct shrink_control *sc)
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index d3ebf2e61853..3444f7c4e0b6 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -102,6 +102,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
"Active(file):   %8lu kB\n"
"Inactive(file): %8lu kB\n"
"Unevictable:%8lu kB\n"
+   "LazyFree:   %8lu kB\n"
"Mlocked:%8lu kB\n"
 #ifdef CONFIG_HIGHMEM
"HighTotal:  %8lu kB\n"
@@ -159,6 +160,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
K(pages[LRU_ACTIVE_FILE]),
K(pages[LRU_INACTIVE_FILE]),
K(pages[LRU_UNEVICTABLE])

[PATCH v3 03/17] arch: uapi: asm: mman.h: Let MADV_FREE have same value for all architectures

2015-11-11 Thread Minchan Kim

From: Chen Gang 

For uapi, we should try to give each macro the same value on all
architectures. MADV_FREE was added to the main branch recently, so
redefine it accordingly.

At present, '8' can be shared by all architectures, so redefine it to
'8'.
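
For userspace, a single value means a fallback definition can be written
once. The sketch below is an illustration only: release_lazily() is a
made-up helper, the fallback value 8 is the one proposed here, and real
programs should prefer whatever their installed headers define. It falls
back to MADV_DONTNEED when the kernel rejects MADV_FREE with EINVAL.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_FREE
#define MADV_FREE 8	/* value proposed by this patch for all architectures */
#endif

/*
 * Tell the kernel that [addr, addr + len) is no longer needed.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int release_lazily(void *addr, size_t len)
{
	assert(addr != NULL && len > 0);

	if (madvise(addr, len, MADV_FREE) == 0)
		return 0;
	if (errno != EINVAL)
		return -1;
	/* Kernel without MADV_FREE support: free the range immediately. */
	return madvise(addr, len, MADV_DONTNEED);
}
```

The caller must treat the contents of the range as undefined after a
successful call, since either advice allows the kernel to drop the pages.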

Cc: r...@twiddle.net ,
Cc: i...@jurassic.park.msu.ru 
Cc: matts...@gmail.com 
Cc: Ralf Baechle 
Cc: j...@parisc-linux.org 
Cc: del...@gmx.de 
Cc: ch...@zankel.net 
Cc: jcmvb...@gmail.com 
Cc: Arnd Bergmann 
Cc: linux-a...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: rol...@kernel.org
Cc: darrick.w...@oracle.com
Cc: da...@davemloft.net
Acked-by: Hugh Dickins 
Acked-by: Minchan Kim 
Signed-off-by: Chen Gang 
---
 arch/alpha/include/uapi/asm/mman.h | 2 +-
 arch/mips/include/uapi/asm/mman.h  | 2 +-
 arch/parisc/include/uapi/asm/mman.h| 2 +-
 arch/xtensa/include/uapi/asm/mman.h| 2 +-
 include/uapi/asm-generic/mman-common.h | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/mman.h 
b/arch/alpha/include/uapi/asm/mman.h
index 836fbd44f65b..0b8a5de7aee3 100644
--- a/arch/alpha/include/uapi/asm/mman.h
+++ b/arch/alpha/include/uapi/asm/mman.h
@@ -44,9 +44,9 @@
 #define MADV_WILLNEED  3   /* will need these pages */
 #defineMADV_SPACEAVAIL 5   /* ensure resources are 
available */
 #define MADV_DONTNEED  6   /* don't need these pages */
-#define MADV_FREE  7   /* free pages only if memory pressure */
 
 /* common/generic parameters */
+#define MADV_FREE  8   /* free pages only if memory pressure */
 #define MADV_REMOVE9   /* remove these pages & resources */
 #define MADV_DONTFORK  10  /* don't inherit across fork */
 #define MADV_DOFORK11  /* do inherit across fork */
diff --git a/arch/mips/include/uapi/asm/mman.h 
b/arch/mips/include/uapi/asm/mman.h
index 106e741aa7ee..d247f5457944 100644
--- a/arch/mips/include/uapi/asm/mman.h
+++ b/arch/mips/include/uapi/asm/mman.h
@@ -67,9 +67,9 @@
 #define MADV_SEQUENTIAL 2  /* expect sequential page references */
 #define MADV_WILLNEED  3   /* will need these pages */
 #define MADV_DONTNEED  4   /* don't need these pages */
-#define MADV_FREE  5   /* free pages only if memory pressure */
 
 /* common parameters: try to keep these consistent across architectures */
+#define MADV_FREE  8   /* free pages only if memory pressure */
 #define MADV_REMOVE9   /* remove these pages & resources */
 #define MADV_DONTFORK  10  /* don't inherit across fork */
 #define MADV_DOFORK11  /* do inherit across fork */
diff --git a/arch/parisc/include/uapi/asm/mman.h 
b/arch/parisc/include/uapi/asm/mman.h
index 6cb8db76fd4e..700d83fd9352 100644
--- a/arch/parisc/include/uapi/asm/mman.h
+++ b/arch/parisc/include/uapi/asm/mman.h
@@ -40,9 +40,9 @@
 #define MADV_SPACEAVAIL 5   /* insure that resources are reserved 
*/
 #define MADV_VPS_PURGE  6   /* Purge pages from VM page cache */
 #define MADV_VPS_INHERIT 7  /* Inherit parents page size */
-#define MADV_FREE  8   /* free pages only if memory pressure */
 
 /* common/generic parameters */
+#define MADV_FREE  8   /* free pages only if memory pressure */
 #define MADV_REMOVE9   /* remove these pages & resources */
 #define MADV_DONTFORK  10  /* don't inherit across fork */
 #define MADV_DOFORK11  /* do inherit across fork */
diff --git a/arch/xtensa/include/uapi/asm/mman.h 
b/arch/xtensa/include/uapi/asm/mman.h
index 1b19f25bc567..77eaca434071 100644
--- a/arch/xtensa/include/uapi/asm/mman.h
+++ b/arch/xtensa/include/uapi/asm/mman.h
@@ -80,9 +80,9 @@
 #define MADV_SEQUENTIAL2   /* expect sequential page 
references */
 #define MADV_WILLNEED  3   /* will need these pages */
 #define MADV_DONTNEED  4   /* don't need these pages */
-#define MADV_FREE  5   /* free pages only if memory pressure */
 
 /* common parameters: try to keep these consistent across architectures */
+#define MADV_FREE  8   /* free pages only if memory pressure */
 #define MADV_REMOVE9   /* remove these pages & resources */
 #define MADV_DONTFORK  10  /* don't inherit across fork */
 #define MADV_DOFORK11  /* do inherit across fork */
diff --git a/include/uapi/asm-generic/mman-common.h 
b/include/uapi/asm-generic/mman-common.h
index 7a94102b7a02..869595947873 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -34,9 +34,9 @@
 #define MADV_SEQUENTIAL2   /* expect sequential page 
references */
 #define MADV_WILLNEED  3   /* will need these pages */
 #define MADV_DONTNEED  4   /* don't need these pages */
-#def

[PATCH v3 13/17] mm: don't split THP page when syscall is called

2015-11-11 Thread Minchan Kim

We don't need to split a THP page when the MADV_FREE syscall is called.
The split can be done when the VM decides to free the page in the reclaim
path under heavy memory pressure, so we can avoid unnecessary THP splits.

For that, this patch changes two things:

1. __split_huge_page_map

It applies pte_mkdirty to subpages only if pmd_dirty is true.

2. __split_huge_page_refcount

It no longer marks subpages PG_dirty unconditionally.
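
The first change can be pictured with a toy model of the split. This is
an assumption-laden sketch, not the kernel code: struct toy_pmd,
struct toy_pte and toy_split_huge_map() are invented names, the subpage
count is shrunk to 8, and write permission is copied directly instead of
going through maybe_mkwrite().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SUBPAGES 8	/* stand-in for HPAGE_PMD_NR */

struct toy_pmd { bool dirty; bool write; };
struct toy_pte { bool dirty; bool write; };

/* Split one huge mapping into TOY_NR_SUBPAGES small mappings. */
static void toy_split_huge_map(const struct toy_pmd *pmd,
			       struct toy_pte ptes[TOY_NR_SUBPAGES])
{
	assert(pmd != NULL && ptes != NULL);

	/* Sample the dirty state once, before the huge entry is replaced. */
	bool dirty = pmd->dirty;

	for (size_t i = 0; i < TOY_NR_SUBPAGES; i++) {
		/* Propagate dirtiness instead of forcing it on. */
		ptes[i].dirty = dirty;
		ptes[i].write = pmd->write;
	}
}
```

With the old unconditional pte_mkdirty, every subpage of a clean,
MADV_FREEed huge page would look freshly written after the split, and
reclaim could no longer discard it.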

Cc: Kirill A. Shutemov 
Cc: Hugh Dickins 
Cc: Andrea Arcangeli 
Signed-off-by: Minchan Kim 
---
 include/linux/huge_mm.h |  3 +++
 mm/huge_memory.c| 46 ++
 mm/madvise.c| 12 +++-
 3 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index ecb080d6ff42..e9db238a75c1 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -19,6 +19,9 @@ extern struct page *follow_trans_huge_pmd(struct 
vm_area_struct *vma,
  unsigned long addr,
  pmd_t *pmd,
  unsigned int flags);
+extern int madvise_free_huge_pmd(struct mmu_gather *tlb,
+   struct vm_area_struct *vma,
+   pmd_t *pmd, unsigned long addr);
 extern int zap_huge_pmd(struct mmu_gather *tlb,
struct vm_area_struct *vma,
pmd_t *pmd, unsigned long addr);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bbac913f96bc..b8c9b44af864 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1453,6 +1453,41 @@ int do_huge_pmd_numa_page(struct mm_struct *mm, struct 
vm_area_struct *vma,
return 0;
 }
 
+int madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
+   pmd_t *pmd, unsigned long addr)
+
+{
+   spinlock_t *ptl;
+   pmd_t orig_pmd;
+   struct page *page;
+   struct mm_struct *mm = tlb->mm;
+
+   if (__pmd_trans_huge_lock(pmd, vma, &ptl) != 1)
+   return 1;
+
+   orig_pmd = *pmd;
+   if (is_huge_zero_pmd(orig_pmd))
+   goto out;
+
+   page = pmd_page(orig_pmd);
+   if (PageActive(page))
+   deactivate_page(page);
+
+   if (pmd_young(orig_pmd) || pmd_dirty(orig_pmd)) {
+   orig_pmd = pmdp_huge_get_and_clear_full(tlb->mm, addr, pmd,
+   tlb->fullmm);
+   orig_pmd = pmd_mkold(orig_pmd);
+   orig_pmd = pmd_mkclean(orig_pmd);
+
+   set_pmd_at(mm, addr, pmd, orig_pmd);
+   tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
+   }
+out:
+   spin_unlock(ptl);
+
+   return 0;
+}
+
 int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 pmd_t *pmd, unsigned long addr)
 {
@@ -1752,8 +1787,8 @@ static void __split_huge_page_refcount(struct page *page,
  (1L << PG_mlocked) |
  (1L << PG_uptodate) |
  (1L << PG_active) |
- (1L << PG_unevictable)));
-   page_tail->flags |= (1L << PG_dirty);
+ (1L << PG_unevictable) |
+ (1L << PG_dirty)));
 
/* clear PageTail before overwriting first_page */
smp_wmb();
@@ -1787,7 +1822,6 @@ static void __split_huge_page_refcount(struct page *page,
 
BUG_ON(!PageAnon(page_tail));
BUG_ON(!PageUptodate(page_tail));
-   BUG_ON(!PageDirty(page_tail));
BUG_ON(!PageSwapBacked(page_tail));
 
lru_add_page_tail(page, page_tail, lruvec, list);
@@ -1831,10 +1865,12 @@ static int __split_huge_page_map(struct page *page,
int ret = 0, i;
pgtable_t pgtable;
unsigned long haddr;
+   bool dirty;
 
pmd = page_check_address_pmd(page, mm, address,
PAGE_CHECK_ADDRESS_PMD_SPLITTING_FLAG, &ptl);
if (pmd) {
+   dirty = pmd_dirty(*pmd);
pgtable = pgtable_trans_huge_withdraw(mm, pmd);
pmd_populate(mm, &_pmd, pgtable);
if (pmd_write(*pmd))
@@ -1850,7 +1886,9 @@ static int __split_huge_page_map(struct page *page,
 * permissions across VMAs.
 */
entry = mk_pte(page + i, vma->vm_page_prot);
-   entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+   if (dirty)
+   entry = pte_mkdirty(entry);
+   entry = maybe_mkwrite(entry, vma);
if (!pmd_write(*pmd))
entry = pte_wrprotect(entry);
if (!pmd_young(*pmd))
diff --git a/mm/madvise.c b/mm/madvise.c
index 4e67ba0b1104..27ed057c0bd7 100644
--- a/mm/madvise.c
+++ b/mm/m

[PATCH v3 16/17] mm: support MADV_FREE on swapless system

2015-11-11 Thread Minchan Kim

Historically, we have completely disabled reclaim of anonymous pages
after swapoff or on systems configured without swap.
That made sense, but the problem for lazily freed pages is that
we never get a chance to discard MADV_FREE hinted pages
in the reclaim path on those systems.

That's why the current MADV_FREE implementation drops pages instantly,
like MADV_DONTNEED, on swapless systems, so users of those
systems cannot get the benefit of MADV_FREE.

Now we have the lazyfree LRU list to keep MADV_FREEed pages, so
we can scan MADV_FREE pages on swapless systems without relying
on the anonymous LRU list.

Signed-off-by: Minchan Kim 
---
 mm/madvise.c|  7 +--
 mm/swap_state.c |  6 --
 mm/vmscan.c | 37 +++--
 3 files changed, 28 insertions(+), 22 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 7c88c6cfe300..3a4c3f7efe20 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -547,12 +547,7 @@ madvise_vma(struct vm_area_struct *vma, struct 
vm_area_struct **prev,
case MADV_WILLNEED:
return madvise_willneed(vma, prev, start, end);
case MADV_FREE:
-   /*
-* XXX: In this implementation, MADV_FREE works like
-* MADV_DONTNEED on swapless system or full swap.
-*/
-   if (get_nr_swap_pages() > 0)
-   return madvise_free(vma, prev, start, end);
+   return madvise_free(vma, prev, start, end);
/* passthrough */
case MADV_DONTNEED:
return madvise_dontneed(vma, prev, start, end);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 10f63eded7b7..49c683b02ee4 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -170,12 +170,6 @@ int add_to_swap(struct page *page, struct list_head *list)
if (!entry.val)
return 0;
 
-   if (unlikely(PageTransHuge(page)))
-   if (unlikely(split_huge_page_to_list(page, list))) {
-   swapcache_free(entry);
-   return 0;
-   }
-
/*
 * Radix-tree node allocations from PF_MEMALLOC contexts could
 * completely exhaust the page allocator. __GFP_NOMEMALLOC
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3a7d57cbceb3..cd65db9d3004 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -611,13 +611,18 @@ static int __remove_mapping(struct address_space 
*mapping, struct page *page,
bool reclaimed)
 {
unsigned long flags;
-   struct mem_cgroup *memcg;
+   struct mem_cgroup *memcg = NULL;
+   int expected = mapping ? 2 : 1;
 
BUG_ON(!PageLocked(page));
BUG_ON(mapping != page_mapping(page));
+   VM_BUG_ON_PAGE(mapping == NULL && !PageLazyFree(page), page);
+
+   if (mapping) {
+   memcg = mem_cgroup_begin_page_stat(page);
+   spin_lock_irqsave(&mapping->tree_lock, flags);
+   }
 
-   memcg = mem_cgroup_begin_page_stat(page);
-   spin_lock_irqsave(&mapping->tree_lock, flags);
/*
 * The non racy check for a busy page.
 *
@@ -643,14 +648,18 @@ static int __remove_mapping(struct address_space 
*mapping, struct page *page,
 * Note that if SetPageDirty is always performed via set_page_dirty,
 * and thus under tree_lock, then this ordering is not required.
 */
-   if (!page_freeze_refs(page, 2))
+   if (!page_freeze_refs(page, expected))
goto cannot_free;
/* note: atomic_cmpxchg in page_freeze_refs provides the smp_rmb */
if (unlikely(PageDirty(page))) {
-   page_unfreeze_refs(page, 2);
+   page_unfreeze_refs(page, expected);
goto cannot_free;
}
 
+   /* No more work to do with backing store */
+   if (!mapping)
+   return 1;
+
if (PageSwapCache(page)) {
swp_entry_t swap = { .val = page_private(page) };
mem_cgroup_swapout(page, swap);
@@ -687,8 +696,10 @@ static int __remove_mapping(struct address_space *mapping, 
struct page *page,
return 1;
 
 cannot_free:
-   spin_unlock_irqrestore(&mapping->tree_lock, flags);
-   mem_cgroup_end_page_stat(memcg);
+   if (mapping) {
+   spin_unlock_irqrestore(&mapping->tree_lock, flags);
+   mem_cgroup_end_page_stat(memcg);
+   }
return 0;
 }
 
@@ -1051,7 +1062,12 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
if (PageAnon(page) && !PageSwapCache(page)) {
if (!(sc->gfp_mask & __GFP_IO))
goto keep_locked;
-   if (!add_to_swap(page, page_list))
+   if (unlikely(PageTransHuge(page)) &&
+   unlikely(split_huge_page_to_list(page,
+   page_list)))
+

[PATCH v3 14/17] mm: introduce wrappers to add new LRU

2015-11-11 Thread Minchan Kim

We have used the binary variable "file" to identify whether a list is the
anon LRU or the file LRU. That works, but it becomes an obstacle if we add
a new LRU.

So this patch introduces some wrapper functions to handle it.

Signed-off-by: Minchan Kim 
---
 include/linux/mm_inline.h | 64 +--
 include/trace/events/vmscan.h | 24 
 mm/compaction.c   |  2 +-
 mm/huge_memory.c  |  5 ++--
 mm/memory-failure.c   |  7 ++---
 mm/memory_hotplug.c   |  3 +-
 mm/mempolicy.c|  3 +-
 mm/migrate.c  | 26 ++
 mm/swap.c | 22 ++-
 mm/vmscan.c   | 12 
 10 files changed, 104 insertions(+), 64 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index cf55945c83fb..5e08a354f936 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -8,8 +8,8 @@
  * page_is_file_cache - should the page be on a file LRU or anon LRU?
  * @page: the page to test
  *
- * Returns 1 if @page is page cache page backed by a regular filesystem,
- * or 0 if @page is anonymous, tmpfs or otherwise ram or swap backed.
+ * Returns true if @page is page cache page backed by a regular filesystem,
+ * or false if @page is anonymous, tmpfs or otherwise ram or swap backed.
  * Used by functions that manipulate the LRU lists, to sort a page
  * onto the right LRU list.
  *
@@ -17,7 +17,7 @@
  * needs to survive until the page is last deleted from the LRU, which
  * could be as far down as __page_cache_release.
  */
-static inline int page_is_file_cache(struct page *page)
+static inline bool page_is_file_cache(struct page *page)
 {
return !PageSwapBacked(page);
 }
@@ -56,6 +56,64 @@ static inline enum lru_list page_lru_base_type(struct page 
*page)
 }
 
 /**
+ * lru_index - which LRU list is lru on for accounting update_page_reclaim_stat
+ *
+ * Used for LRU list index arithmetic.
+ *
+ * Returns 0 if @lru is anon, 1 if it is file.
+ */
+static inline int lru_index(enum lru_list lru)
+{
+   int base;
+
+   switch (lru) {
+   case LRU_INACTIVE_ANON:
+   case LRU_ACTIVE_ANON:
+   base = 0;
+   break;
+   case LRU_INACTIVE_FILE:
+   case LRU_ACTIVE_FILE:
+   base = 1;
+   break;
+   default:
+   BUG();
+   }
+   return base;
+}
+
+/*
+ * page_off_isolate - which LRU list was page on for accounting NR_ISOLATED.
+ * @page: the page to test
+ *
+ * Returns the LRU list a page was on, as an index into the array of
+ * zone_page_state;
+ */
+static inline int page_off_isolate(struct page *page)
+{
+   int lru = NR_ISOLATED_ANON;
+
+   if (!PageSwapBacked(page))
+   lru = NR_ISOLATED_FILE;
+   return lru;
+}
+
+/**
+ * lru_off_isolate - which LRU list was @lru on for accounting NR_ISOLATED.
+ * @lru: the lru to test
+ *
+ * Returns the LRU list a page was on, as an index into the array of
+ * zone_page_state;
+ */
+static inline int lru_off_isolate(enum lru_list lru)
+{
+   int base = NR_ISOLATED_FILE;
+
+   if (lru <= LRU_ACTIVE_ANON)
+   base = NR_ISOLATED_ANON;
+   return base;
+}
+
+/**
  * page_off_lru - which LRU list was page on? clearing its lru flags.
  * @page: the page to test
  *
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index f66476b96264..4e9e86733849 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -30,9 +30,9 @@
(RECLAIM_WB_ASYNC) \
)
 
-#define trace_shrink_flags(file) \
+#define trace_shrink_flags(lru) \
( \
-   (file ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
+   (lru ? RECLAIM_WB_FILE : RECLAIM_WB_ANON) | \
(RECLAIM_WB_ASYNC) \
)
 
@@ -271,9 +271,9 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
unsigned long nr_scanned,
unsigned long nr_taken,
isolate_mode_t isolate_mode,
-   int file),
+   enum lru_list lru),
 
-   TP_ARGS(order, nr_requested, nr_scanned, nr_taken, isolate_mode, file),
+   TP_ARGS(order, nr_requested, nr_scanned, nr_taken, isolate_mode, lru),
 
TP_STRUCT__entry(
__field(int, order)
@@ -281,7 +281,7 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
__field(unsigned long, nr_scanned)
__field(unsigned long, nr_taken)
__field(isolate_mode_t, isolate_mode)
-   __field(int, file)
+   __field(enum lru_list, lru)
),
 
TP_fast_assign(
@@ -290,16 +290,16 @@ DECLARE_EVENT_CLASS(mm_vmscan_lru_isolate_template,
__entry->nr_scanned = nr_scanned;
__entry->nr_taken = nr_taken;
__entry->isolate_mode = isolate_mode;
-   __entry->file = file;
+   __entry->lru = lru;
),
 
-

Re: [RFC] usb: dwc2: hcd: fix split schedule issue

2015-11-11 Thread John Youn

On 11/11/2015 4:22 PM, Doug Anderson wrote:
> John,
> 
> On Fri, Nov 6, 2015 at 2:04 AM, Yunzhi Li  wrote:
>> hi John ,
>>
>>   As we talked yesterday, I tried to fix the split schedule sequence. This
>> patch will
>> avoid scheduling SSPLIT-IN packet for another device between
>> SSPLIT-OUT-begin and
>> SSPLIT-OUT-end, now the keyboard and Jabra audio speaker could work together
>> well, but
>> I'm not sure if this is exactly the right way to schedule split transfers
>> and if there
>> is any side effect with this patch. Please help review this patch. Thanks.
>>
>>> Fix dwc2 split schedule sequence issue. Not schedule a SSPLIT_IN
>>> packet between SSPLIT-begin and SSPLIT-end.
>>>
>>> Signed-off-by: Yunzhi Li 
>>> ---
>>>   drivers/usb/dwc2/hcd.c | 4 
>>>   1 file changed, 4 insertions(+)
> 
> Did you have any thoughts on this patch?  Although this patch didn't
> fix the problems I was seeing with the Microsoft Wireless Keyboard
> (see the patch I sent out earlier which does seem to fix it), I can
> confirm that in a different setup (HUB goes to USB audio + mouse) that
> this patch does fix some problems.
> 
> That being said, it feels to me like a band-aid rather than an actual
> fix (I'm talking out of my rear end, though, since my USB experience
> is lacking at best).  It feels like perhaps we're just not keeping
> track of the xact_pos correctly, but of course I don't know that for
> sure...
> 
> Anyway, just fishing...  ;)
> 
> -Doug
> 

Hi Doug,

I also feel it is not quite right as the SSPLIT should be able to
happen during the SSPLIT of another device. I tried to reproduce
and see the same scheduling but don't see any hang due to it.

Yunzhi, any details on what kind of hub and keyboard you are
using? I have the same Jabra 410 speaker.

Regards,
John





Re: [PATCH v4 7/9] ARM: EXYNOS: split up exynos5420 SoC specific PMU data

2015-11-11 Thread Krzysztof Kozlowski

On 10.11.2015 20:43, Pankaj Dubey wrote:
> This patch splits up mach-exynos/pmu.c file, and moves exynos5420
> PMU configuration data and functions handling that data into exynos5420
> SoC specific PMU file mach-exynos/exynos5420-pmu.c.
> 
> Signed-off-by: Pankaj Dubey 
> ---
>  arch/arm/mach-exynos/Makefile |   2 +-
>  arch/arm/mach-exynos/exynos-pmu.h |   1 +
>  arch/arm/mach-exynos/exynos5420-pmu.c | 280 
> ++
>  arch/arm/mach-exynos/pmu.c| 263 ---
>  4 files changed, 282 insertions(+), 264 deletions(-)
>  create mode 100644 arch/arm/mach-exynos/exynos5420-pmu.c
> 
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index bfb23a5..2d58063 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -11,7 +11,7 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) += 
> -I$(srctree)/$(src)/include -I$(srctree)
>  
>  obj-$(CONFIG_ARCH_EXYNOS)+= exynos.o pmu.o exynos-smc.o firmware.o \
>   exynos3250-pmu.o exynos4-pmu.o \
> - exynos5250-pmu.o
> + exynos5250-pmu.o exynos5420-pmu.o
>  
>  obj-$(CONFIG_EXYNOS_CPU_SUSPEND) += pm.o sleep.o
>  obj-$(CONFIG_PM_SLEEP)   += suspend.o
> diff --git a/arch/arm/mach-exynos/exynos-pmu.h 
> b/arch/arm/mach-exynos/exynos-pmu.h
> index 003fa6d..306f5c7 100644
> --- a/arch/arm/mach-exynos/exynos-pmu.h
> +++ b/arch/arm/mach-exynos/exynos-pmu.h
> @@ -38,6 +38,7 @@ extern const struct exynos_pmu_data exynos4210_pmu_data;
>  extern const struct exynos_pmu_data exynos4212_pmu_data;
>  extern const struct exynos_pmu_data exynos4412_pmu_data;
>  extern const struct exynos_pmu_data exynos5250_pmu_data;
> +extern const struct exynos_pmu_data exynos5420_pmu_data;
>  
>  extern void pmu_raw_writel(u32 val, u32 offset);
>  extern u32 pmu_raw_readl(u32 offset);
> diff --git a/arch/arm/mach-exynos/exynos5420-pmu.c 
> b/arch/arm/mach-exynos/exynos5420-pmu.c
> new file mode 100644
> index 000..5810afe
> --- /dev/null
> +++ b/arch/arm/mach-exynos/exynos5420-pmu.c
> @@ -0,0 +1,280 @@
> +/*
> + * Copyright (c) 2011-2015 Samsung Electronics Co., Ltd.
> + *   http://www.samsung.com/
> + *
> + * EXYNOS5420 - CPU PMU (Power Management Unit) support
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +#include "exynos-pmu.h"
> +
> +static struct exynos_pmu_conf exynos5420_pmu_config[] = {
> + /* { .offset = offset, .val = { AFTR, LPA, SLEEP } */
> + { EXYNOS5_ARM_CORE0_SYS_PWR_REG,{ 0x0, 0x0, 0x0} },
> + { EXYNOS5_DIS_IRQ_ARM_CORE0_LOCAL_SYS_PWR_REG,  { 0x0, 0x0, 0x0} },
> + { EXYNOS5_DIS_IRQ_ARM_CORE0_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5_ARM_CORE1_SYS_PWR_REG,{ 0x0, 0x0, 0x0} },
> + { EXYNOS5_DIS_IRQ_ARM_CORE1_LOCAL_SYS_PWR_REG,  { 0x0, 0x0, 0x0} },
> + { EXYNOS5_DIS_IRQ_ARM_CORE1_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_ARM_CORE2_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_ARM_CORE2_LOCAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_ARM_CORE2_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_ARM_CORE3_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_ARM_CORE3_LOCAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_ARM_CORE3_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_KFC_CORE0_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE0_LOCAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE0_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_KFC_CORE1_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE1_LOCAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE1_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_KFC_CORE2_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE2_LOCAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE2_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_KFC_CORE3_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE3_LOCAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5420_DIS_IRQ_KFC_CORE3_CENTRAL_SYS_PWR_REG, { 0x0, 0x0, 0x0} },
> + { EXYNOS5_ISP_ARM_SYS_PWR_REG,  { 0x1, 0x0, 0x0} },
> + { EXYNOS5_DIS_IRQ_ISP_ARM_LOCAL_SYS_PWR_REG,{ 0x1, 0x0, 0x0} },
> + { EXYNOS5_DIS_IRQ_ISP_ARM_CENTRAL_SYS_PWR_REG,  { 0x1, 0x0, 0x0} },
> + { EXYNOS5420_ARM_COMMON_SYS_PWR_REG,{ 0x0, 0x0, 0x0} },
> + { EXYNOS5420_KFC_COMMON_SYS_PWR_REG,

Re: [UNTESTED PATCH] x86, mce: Avoid double entry of deferred errors into the genpool.

2015-11-11 Thread Chen, Gong

On Wed, Nov 11, 2015 at 02:01:51PM -0800, Luck, Tony wrote:
> Date: Wed, 11 Nov 2015 14:01:51 -0800
> From: Tony Luck 
> To: "Chen, Gong" 
> Cc: b...@alien8.de, linux-e...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: [UNTESTED PATCH] x86, mce: Avoid double entry of deferred errors
>  into the genpool.
> 
> We used to have a special ring buffer for deferred errors that
> was used to mark problem pages. We replaced that with a genpool.
> Then later converted mce_log() to also use the same genpool. As
> a result we end up adding all deferred errors to the genpool twice.
> 
> Rearrange this code. Make sure to set the m.severity and m.usable_addr
> fields for deferred errors. Then if flags and mca_cfg.dont_log_ce mean
> we call mce_log() we are done, because that will add this entry to the
> genpool.
> 
> If we skipped mce_log(), then we still want to take action for the
> deferred error, so add to the genpool.
> 
> Changed the name of the boolean "error_logged" to "error_seen", we
> should set it whether or not we logged an error because the return
> value from machine_check_poll() is used to decide whether storms
> have subsided or not.
> 
> Reported-by: Chen, Gong 
> Signed-off-by: Tony Luck 
> ---

It's much better than my original version.
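
The rearranged flow described in the quoted commit message can be modelled with a toy sketch. Everything here is hypothetical (`toy_genpool`, `toy_mce_log`, `toy_poll_deferred` are illustration names, not kernel symbols); it only shows the intended invariant that a deferred error lands in the pool exactly once, and that "error_seen" is set whether or not the error was logged.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the genpool: we only count entries. */
struct toy_genpool {
	int entries;
};

/* Stand-in for mce_log(): per the commit message, logging also adds
 * the record to the same pool. */
static void toy_mce_log(struct toy_genpool *pool)
{
	pool->entries++;
}

/*
 * Handle one deferred error. @will_log models the outcome of the
 * "flags and mca_cfg.dont_log_ce" decision (an assumption of this sketch).
 * Returns error_seen, which is true on both paths.
 */
static bool toy_poll_deferred(struct toy_genpool *pool, bool will_log)
{
	bool error_seen = true;

	assert(pool);
	if (will_log)
		toy_mce_log(pool);	/* logging path already adds to the pool */
	else
		pool->entries++;	/* skipped logging: add it ourselves */

	return error_seen;
}
```

Either path leaves exactly one new entry in the pool, which is the double-entry problem the patch sets out to avoid.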



Re: [PATCH v4 4/9] ARM: EXYNOS: split up exynos3250 SoC specific PMU data

2015-11-11 Thread Krzysztof Kozlowski

On 10.11.2015 20:42, Pankaj Dubey wrote:
> This patch splits up mach-exynos/pmu.c file, and moves exynos3250 PMU
> configuration data and functions handling that data into exynos3250
> SoC specific PMU file mach-exynos/exynos3250-pmu.c.
> 
> Signed-off-by: Pankaj Dubey 
> ---
>  arch/arm/mach-exynos/Makefile |   2 +-
>  arch/arm/mach-exynos/exynos-pmu.h |  40 
>  arch/arm/mach-exynos/exynos.c |   2 -
>  arch/arm/mach-exynos/exynos3250-pmu.c | 175 
>  arch/arm/mach-exynos/pmu.c| 184 
> +-
>  5 files changed, 220 insertions(+), 183 deletions(-)
>  create mode 100644 arch/arm/mach-exynos/exynos-pmu.h
>  create mode 100644 arch/arm/mach-exynos/exynos3250-pmu.c
> 
> diff --git a/arch/arm/mach-exynos/Makefile b/arch/arm/mach-exynos/Makefile
> index 2f30676..e869f86 100644
> --- a/arch/arm/mach-exynos/Makefile
> +++ b/arch/arm/mach-exynos/Makefile
> @@ -9,7 +9,7 @@ ccflags-$(CONFIG_ARCH_MULTIPLATFORM) += 
> -I$(srctree)/$(src)/include -I$(srctree)
>  
>  # Core
>  
> -obj-$(CONFIG_ARCH_EXYNOS)+= exynos.o pmu.o exynos-smc.o firmware.o
> +obj-$(CONFIG_ARCH_EXYNOS)+= exynos.o pmu.o exynos-smc.o firmware.o 
> exynos3250-pmu.o
>  
>  obj-$(CONFIG_EXYNOS_CPU_SUSPEND) += pm.o sleep.o
>  obj-$(CONFIG_PM_SLEEP)   += suspend.o
> diff --git a/arch/arm/mach-exynos/exynos-pmu.h 
> b/arch/arm/mach-exynos/exynos-pmu.h
> new file mode 100644
> index 000..6f95c7d
> --- /dev/null
> +++ b/arch/arm/mach-exynos/exynos-pmu.h
> @@ -0,0 +1,40 @@
> +/*
> + * Copyright (c) 2015 Samsung Electronics Co., Ltd.
> + *   http://www.samsung.com
> + *
> + * Header for EXYNOS PMU Driver support
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef __EXYNOS_PMU_H
> +#define __EXYNOS_PMU_H
> +
> +#include 
> +
> +#define PMU_TABLE_END(-1U)
> +
> +

Unnecessary blank line.

Rest looks good, so with this fix:

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof


Re: [PATCH 2/3] x86, ras: Extend machine check recovery code to annotated ring0 areas

2015-11-11 Thread Andy Lutomirski


On 11/06/2015 01:01 PM, Tony Luck wrote:

Extend the severity checking code to add a new context IN_KERNEL_RECOV
which is used to indicate that the machine check was triggered by code
in the kernel with a fixup entry.

Add code to check for this situation and respond by altering the return
IP to the fixup address and changing the regs->ax so that the recovery
code knows the physical address of the error. Note that we also set bit
63 because 0x0 is a legal physical address.

Signed-off-by: Tony Luck 
---
  arch/x86/kernel/cpu/mcheck/mce-severity.c | 19 +--
  arch/x86/kernel/cpu/mcheck/mce.c  | 13 ++---
  2 files changed, 27 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c 
b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 9c682c222071..1e83842310e8 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -12,6 +12,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 

@@ -29,7 +30,7 @@
   * panic situations)
   */

-enum context { IN_KERNEL = 1, IN_USER = 2 };
+enum context { IN_KERNEL = 1, IN_USER = 2, IN_KERNEL_RECOV = 3 };
  enum ser { SER_REQUIRED = 1, NO_SER = 2 };
  enum exception { EXCP_CONTEXT = 1, NO_EXCP = 2 };

@@ -48,6 +49,7 @@ static struct severity {
  #define MCESEV(s, m, c...) { .sev = MCE_ ## s ## _SEVERITY, .msg = m, ## c }
  #define  KERNEL   .context = IN_KERNEL
  #define  USER .context = IN_USER
+#define  KERNEL_RECOV  .context = IN_KERNEL_RECOV
  #define  SER  .ser = SER_REQUIRED
  #define  NOSER.ser = NO_SER
  #define  EXCP .excp = EXCP_CONTEXT
@@ -87,6 +89,10 @@ static struct severity {
EXCP, KERNEL, MCGMASK(MCG_STATUS_RIPV, 0)
),
MCESEV(
+   PANIC, "In kernel and no restart IP",
+   EXCP, KERNEL_RECOV, MCGMASK(MCG_STATUS_RIPV, 0)
+   ),
+   MCESEV(
DEFERRED, "Deferred error",
NOSER, 
MASK(MCI_STATUS_UC|MCI_STATUS_DEFERRED|MCI_STATUS_POISON, MCI_STATUS_DEFERRED)
),
@@ -123,6 +129,11 @@ static struct severity {
MCGMASK(MCG_STATUS_RIPV|MCG_STATUS_EIPV, MCG_STATUS_RIPV)
),
MCESEV(
+   AR, "Action required: data load error recoverable area of 
kernel",
+   SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, 
MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
+   KERNEL_RECOV
+   ),
+   MCESEV(
AR, "Action required: data load error in a user process",
SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCI_ADDR|MCACOD, 
MCI_UC_SAR|MCI_ADDR|MCACOD_DATA),
USER
@@ -183,7 +194,11 @@ static struct severity {
   */
  static int error_context(struct mce *m)
  {
-   return ((m->cs & 3) == 3) ? IN_USER : IN_KERNEL;
+   if ((m->cs & 3) == 3)
+   return IN_USER;
+   if (search_mcexception_tables(m->ip))
+   return IN_KERNEL_RECOV;
+   return IN_KERNEL;
  }

  /*
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 9d014b82a124..472d11150b7a 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -31,6 +31,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -1132,9 +1133,15 @@ void do_machine_check(struct pt_regs *regs, long 
error_code)
if (no_way_out)
mce_panic("Fatal machine check on current CPU", &m, 
msg);
if (worst == MCE_AR_SEVERITY) {
-   recover_paddr = m.addr;
-   if (!(m.mcgstatus & MCG_STATUS_RIPV))
-   flags |= MF_MUST_KILL;
+   if ((m.cs & 3) == 3) {
+   recover_paddr = m.addr;
+   if (!(m.mcgstatus & MCG_STATUS_RIPV))
+   flags |= MF_MUST_KILL;
+   } else if (fixup_mcexception(regs)) {
+   regs->ax = BIT(63) | m.addr;
+   } else
+   mce_panic("Failed kernel mode recovery",
+ &m, NULL);


Maybe I'm misunderstanding this, but presumably you shouldn't call 
fixup_mcexception unless you've first verified RIPV (i.e. that the ip 
you're looking up in the table is valid).


Also... I find the general flow of this code very hard to follow.  It's 
critical that an MCE hitting kernel mode not get as far as 
ist_begin_non_atomic.  It was already hard enough to tell that the code 
follows that rule, and now it's even harder.  Would it make sense to add 
clear assertions that m.cs == regs->cs and that user_mode(regs) when you 
get to the end?  Simplifying the control flow might also be nice.
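
The ordering suggested above (confirm RIPV first, then consult the fixup table) can be sketched in isolation. This is a minimal model under stated assumptions, not the patch's code: `toy_mc_fixup` and `toy_mc_recover_ip` are hypothetical names, and the table stores absolute addresses rather than the relative offsets a real exception table uses. The only value taken from the kernel is that RIPV is bit 0 of MCG_STATUS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* RIPV is bit 0 of IA32_MCG_STATUS. */
#define MCG_STATUS_RIPV (1ULL << 0)

/* Hypothetical, simplified fixup entry (absolute addresses). */
struct toy_mc_fixup {
	uint64_t insn;	/* address of the instruction that may fault */
	uint64_t fixup;	/* address to resume at after a machine check */
};

/*
 * Return the fixup address for @ip, or 0 if recovery must not be attempted.
 * The table is only searched once RIPV says the saved IP can be relied on.
 */
static uint64_t toy_mc_recover_ip(uint64_t mcgstatus, uint64_t ip,
				  const struct toy_mc_fixup *tbl, size_t n)
{
	size_t i;

	assert(tbl != NULL || n == 0);

	if (!(mcgstatus & MCG_STATUS_RIPV))
		return 0;	/* IP not trustworthy: skip the lookup */

	for (i = 0; i < n; i++)
		if (tbl[i].insn == ip)
			return tbl[i].fixup;

	return 0;		/* no annotated fixup for this instruction */
}
```

A caller would treat a return value of 0 as "not recoverable" and fall through to the panic path.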



} else if (kill_it) {
force_sig(SIGBU

Re: [PATCH 1/3] x86, ras: Add new infrastructure for machine check fixup tables

2015-11-11 Thread Andy Lutomirski


On 11/06/2015 12:57 PM, Tony Luck wrote:

Copy the existing page fault fixup mechanisms to create a new table
to be used when fixing machine checks. Note:
1) At this time we only provide a macro to annotate assembly code
2) We assume all fixups will be in code built into the kernel.


Shouldn't the first step be to fixup failures during user memory access?



Signed-off-by: Tony Luck 
---
  arch/x86/include/asm/asm.h|  7 +++
  arch/x86/include/asm/uaccess.h|  1 +
  arch/x86/mm/extable.c | 16 
  include/asm-generic/vmlinux.lds.h |  6 ++
  include/linux/module.h|  1 +
  kernel/extable.c  | 14 ++
  6 files changed, 45 insertions(+)

diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h
index 189679aba703..f2fa7973f18f 100644
--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -58,6 +58,13 @@
.long (to) - . + 0x7ff0 ;   \
.popsection

+# define _ASM_MCEXTABLE(from, to)  \
+   .pushsection "__mcex_table", "a" ;  \
+   .balign 8 ; \
+   .long (from) - . ;  \
+   .long (to) - . ;\
+   .popsection
+


This does something really weird to rax.  (Also, what happens on 32-bit 
kernels?  There's no bit 63.)


Please at least document it clearly.

--Andy
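
For readers following the bit-63 question: patch 2/3 in this series sets `regs->ax = BIT(63) | m.addr`, so that a fault at physical address 0 is still distinguishable from "no fault". The sketch below models that encoding and what a 32-bit register would retain. The helper names (`toy_mc_encode`, `toy_mc_faulted`, `toy_mc_paddr`, `toy_mc_truncate32`) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the marker bit; the patch writes BIT(63) inline. */
#define TOY_MC_FAULT_FLAG (1ULL << 63)

/* Encode a faulting physical address the way the patch fills regs->ax. */
static uint64_t toy_mc_encode(uint64_t paddr)
{
	assert(!(paddr & TOY_MC_FAULT_FLAG));	/* bit 63 must be free */
	return TOY_MC_FAULT_FLAG | paddr;
}

/* Nonzero if the value carries the fault marker. */
static int toy_mc_faulted(uint64_t ax)
{
	return (ax & TOY_MC_FAULT_FLAG) != 0;
}

/* Strip the marker to recover the physical address. */
static uint64_t toy_mc_paddr(uint64_t ax)
{
	return ax & ~TOY_MC_FAULT_FLAG;
}

/* What a 32-bit wide register would keep of the encoded value. */
static uint32_t toy_mc_truncate32(uint64_t ax)
{
	return (uint32_t)ax;
}
```

On a 64-bit register the marker survives and address 0 stays detectable; truncated to 32 bits, the marker is lost and an error at address 0 becomes indistinguishable from success, which is the concern raised above.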

Re: [PATCH v4 3/9] ARM: EXYNOS: Move pmu specific headers under "linux/soc/samsung"

2015-11-11 Thread Krzysztof Kozlowski

On 10.11.2015 20:42, Pankaj Dubey wrote:
> Move the Exynos PMU specific header files into "include/linux/soc/samsung"
> and update the affected files under "mach-exynos" to use the new location of
> these header files.
> 
> Signed-off-by: Amit Daniel Kachhap 
> Signed-off-by: Pankaj Dubey 
> ---
>  arch/arm/mach-exynos/exynos-pmu.h   |  24 -
>  arch/arm/mach-exynos/exynos.c   |   2 +-
>  arch/arm/mach-exynos/mcpm-exynos.c  |   2 +-
>  arch/arm/mach-exynos/platsmp.c  |   2 +-
>  arch/arm/mach-exynos/pm.c   |   4 +-
>  arch/arm/mach-exynos/pmu.c  |   6 +-
>  arch/arm/mach-exynos/regs-pmu.h | 693 
> 
>  arch/arm/mach-exynos/suspend.c  |   4 +-
>  include/linux/soc/samsung/exynos-pmu.h  |  24 +
>  include/linux/soc/samsung/exynos-regs-pmu.h | 693 
> 

Did you disable the rename-detection for format-patch? Default rename
detection mechanism (50% of similarity) should detect two renames here:
exynos-pmu.h and exynos-regs-pmu.h

Best regards,
Krzysztof


>  10 files changed, 727 insertions(+), 727 deletions(-)
>  delete mode 100644 arch/arm/mach-exynos/exynos-pmu.h
>  delete mode 100644 arch/arm/mach-exynos/regs-pmu.h
>  create mode 100644 include/linux/soc/samsung/exynos-pmu.h
>  create mode 100644 include/linux/soc/samsung/exynos-regs-pmu.h


Re: [PATCH 1/2] block: Introduce BIO_ENDIO_FREE for bio flags

2015-11-11 Thread Baolin Wang

On 12 November 2015 at 01:54, Mike Snitzer  wrote:
> On Wed, Nov 11 2015 at  4:31am -0500,
> Baolin Wang  wrote:
>
>> When we use dm-crypt to decrypt block data, it will decrypt the block data
>> in endio() when one IO is completed. In this situation we don't want the
>> cloned bios to be freed before calling the endio().
>>
>> Thus introduce the 'BIO_ENDIO_FREE' flag to support request handling for
>> dm-crypt. This flag ensures that the block layer does not complete the cloned
>> bios before completing the request. When the crypt endio is called,
>> post-processing is done and then the dm layer completes the cloned bios and
>> frees them.
>
> Not following why request-based DM's partial completion handling
> (drivers/md/dm.c:end_clone_bio) isn't a sufficient hook -- no need to
> add block complexity.
>

Sorry for not explaining this more fully. dm-crypt decrypts the block data
in the end_io() callback when a request completes, so we don't want the bios
of that request to be freed before the end_io() callback is called. Thus we
introduce a flag to indicate that the bios of such a request will be freed
by the dm layer rather than the block layer.
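
The ownership hand-off described here can be illustrated with a toy model. This is only a sketch under assumptions: `toy_bio`, `TOY_ENDIO_FREE`, `toy_block_complete` and `toy_crypt_endio` are made-up names, and none of it reflects the real block layer or device-mapper interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag: "the upper layer frees this bio, not the block layer". */
#define TOY_ENDIO_FREE (1u << 0)

struct toy_bio {
	unsigned int flags;
	bool freed;
	bool decrypted;
};

/* Block-layer completion: frees the bio unless ownership was deferred. */
static void toy_block_complete(struct toy_bio *bio)
{
	if (!(bio->flags & TOY_ENDIO_FREE))
		bio->freed = true;
}

/* Upper-layer end_io: needs the data intact, then releases the bio. */
static void toy_crypt_endio(struct toy_bio *bio)
{
	assert(!bio->freed);	/* data must still be there to decrypt */
	bio->decrypted = true;
	bio->freed = true;
}
```

With the flag set, completion leaves the bio alive until the end_io step has run; without it, the bio is released at completion as usual.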

> But that aside, I'm not liking the idea of a request-based dm-crypt.
>
>> diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
>> index 76d23fa..f636c50 100644
>> --- a/include/linux/device-mapper.h
>> +++ b/include/linux/device-mapper.h
>> @@ -407,6 +407,11 @@ union map_info *dm_get_rq_mapinfo(struct request *rq);
>>
>>  struct queue_limits *dm_get_queue_limits(struct mapped_device *md);
>>
>> +void dm_end_request(struct request *clone, int error);
>> +void dm_kill_unmapped_request(struct request *rq, int error);
>> +void dm_dispatch_clone_request(struct request *clone, struct request *rq);
>> +struct request *dm_get_orig_rq(struct request *clone);
>> +
>>  /*
>>   * Geometry functions.
>>   */
>
> I have no interest in seeing any request-based DM interfaces exported.

OK.


-- 
Baolin.wang
Best Regards

Re: [PATCH v4 2/9] ARM: EXYNOS: Fix potential NULL pointer access in exynos_sys_powerdown_conf

2015-11-11 Thread Krzysztof Kozlowski

On 10.11.2015 20:42, Pankaj Dubey wrote:
> If no platform device is bound to the driver but the driver itself is loaded
> and exynos_sys_powerdown_conf is called from
> arch/arm/mach-exynos/{suspend.c, pm.c}, it will result in a NULL pointer access.
> To prevent this, add a NULL check on pmu_context.
> 
> Signed-off-by: Pankaj Dubey 
> ---
>  arch/arm/mach-exynos/pmu.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Reviewed-by: Krzysztof Kozlowski 

Best regards,
Krzysztof

