RE: [PATCH 0/8] Input: support for latest Lenovo thinkpads (series 80)

2018-04-09 Thread 廖崇榮
Hi Benjamin,

Thanks so much for your patch.

I have tested them on Elan Gen5/Gen6 (new) touchpads over SMBus and PS/2.
They work fine on my ThinkPad so far, but today I found an issue after a
lid close/open cycle.

I am not sure whether you can reproduce it on the T480s; my guess is that it
is related to i2c_i801.

Closing the lid enters deep sleep and cuts power to the touchpad.
After the lid is opened I can see the resume flow, and the SMBus
initialization tries to request the hello packet but fails.
Strangely, I can't see any SMBus host signal on the logic analyzer after
power-on.

I can't switch to SMBus by running rmmod/modprobe on psmouse, because an error
occurs in elantech_create_smbus.
It recovers only if I rmmod/modprobe i2c_i801 first.

Do you have any idea about it?

Thanks
KT
-----Original Message-----
From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com] 
Sent: Friday, April 06, 2018 2:51 PM
To: Dmitry Torokhov
Cc: 廖崇榮; Oliver Haessler; Benjamin Berg; open list:HID CORE LAYER; lkml
Subject: Re: [PATCH 0/8] Input: support for latest Lenovo thinkpads (series 80)

Hi Dmitry,

On Fri, Apr 6, 2018 at 1:51 AM, Dmitry Torokhov wrote:
> Hi Benjamin,
>
> On Thu, Apr 05, 2018 at 03:25:29PM +0200, Benjamin Tissoires wrote:
>> Hi Dmitry,
>>
>> well, this year, Lenovo gave us a surprise and decided not to use the
>> same touchpad/trackstick in all its models. And by default, the
>> support under Linux is less than ideal.
>>
>> Please find a series that should fix those issues. Compared to the 60
>> series, there do not seem to be BIOS table issues this time, and
>> suspend/resume works fine thanks to your latest trackstick fixes.
>>
>> The T480s is a different beast, as it uses an Elan touchpad.
>> I have been carrying the patches 3-6 for a while and tested previous
>> versions on various Elan PS/2 hardware without an issue as far as I
>> could tell. I was lacking tests from users with SMBus as all the
>> laptops I tried were pure PS/2.
>>
>> Anyway, it would be cool if you could have a look at the series.
>
> I am mostly happy with the series, but I would love to hear KT's take 
> on it.

thanks for the quick review.
I worked closely with KT for this series. He helped me a lot for the tiny 
firmware changes that were required. However, quoting his email from Tuesday:
"There will be a spring vacation in Taiwan from tomorrow." I guess we won't 
hear from him until the end of next week as we always have a backlog of urgent 
things to do after holidays...

Cheers,
Benjamin



[PATCH] drm: xlnx: pl_disp: fix odd_ptr_err.cocci warnings

2018-04-09 Thread Julia Lawall
From: Fengguang Wu 

 PTR_ERR should normally access the value just tested by IS_ERR

Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci

Fixes: 742243a44a73 ("drm: xlnx: pl_disp: Use xlnx pipeline calls")
CC: Hyun Kwon 
Signed-off-by: Fengguang Wu 
Signed-off-by: Julia Lawall 
---

tree:   https://github.com/Xilinx/linux-xlnx xlnx_rebase_v4.14
head:   fe04d2ee0dfea6b5fdbb04f4f6dbcaa13bfd2fda
commit: 742243a44a738b165f8da5cbdb6662139e85a5c5 [651/842] drm: xlnx:
pl_disp: Use xlnx pipeline calls


 xlnx_pl_disp.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/xlnx/xlnx_pl_disp.c
+++ b/drivers/gpu/drm/xlnx/xlnx_pl_disp.c
@@ -482,7 +482,7 @@ static int xlnx_pl_disp_probe(struct pla

xlnx_pl_disp->master = xlnx_drm_pipeline_init(pdev);
if (IS_ERR(xlnx_pl_disp->master)) {
-   ret = PTR_ERR(xlnx_pl_disp->dev);
+   ret = PTR_ERR(xlnx_pl_disp->master);
dev_err(dev, "failed to initialize the drm pipeline\n");
goto err_component;
}
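
The rule the Coccinelle script enforces can be tried outside the kernel. Below is a minimal user-space sketch, assuming simplified stand-ins for the kernel's ERR_PTR/PTR_ERR/IS_ERR helpers; struct disp, pipeline_init_failing(), probe_buggy() and probe_fixed() are hypothetical names made up for illustration and are not part of the driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified user-space model of the kernel's error-pointer helpers.
 * Assumption: the real definitions in include/linux/err.h differ in detail.
 */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the driver's private structure. */
struct disp {
	void *dev;
	void *master;
};

/* Hypothetical stand-in for an init call that always fails with -ENODEV. */
static inline void *pipeline_init_failing(void)
{
	return ERR_PTR(-ENODEV);
}

/* Tests ->master but decodes ->dev: returns a meaningless value. */
long probe_buggy(struct disp *d)
{
	d->master = pipeline_init_failing();
	if (IS_ERR(d->master))
		return PTR_ERR(d->dev);
	return 0;
}

/* Decodes the same pointer that was just tested by IS_ERR(). */
long probe_fixed(struct disp *d)
{
	d->master = pipeline_init_failing();
	if (IS_ERR(d->master))
		return PTR_ERR(d->master);
	return 0;
}
```

With a valid ->dev pointer, probe_buggy() returns that pointer's address reinterpreted as an integer, while probe_fixed() returns -ENODEV, which is what the one-line change in the patch is after.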


Re: [PATCH v6 0/6] Add MediaTek PMIC keys support

2018-04-09 Thread Chen Zhong
On Thu, 2018-03-29 at 09:15 -0700, Dmitry Torokhov wrote:
> 
> 
> Oh, sorry, I did not realize you wanted my Ack for bindings. I usually
> leave it to Rob and simply ack the driver itself when I am happy with
> the code.
> 
> I'll go and add my ack to the binding post if that will help merging
> the series.
> 
> Thanks. 

Hi Lee,

May I know if I need to collect Dmitry's comments and send a new version
for the merging?

Thank you.



Re: [PATCH v2 01/14] Input: atmel_mxt_ts - do not pass suspend mode in platform data

2018-04-09 Thread Benson Leung
On Tue, Mar 20, 2018 at 03:31:25PM -0700, Dmitry Torokhov wrote:
> The way we are supposed to put controller to sleep and wake it up does not
> depend on the platform, but rather on controller itself, so we want to get
> rid of suspend mode in platform data (and eventually get rid of platform
> data completely). Unfortunately some early chromebooks (the original Pixel,
> Acer C720) were shipped with config that requires manually re-enabling
> touch reporting in T9. We will sort it out, but in the meantime let's
> switch to a simple DMI quirk.
> 
> We'll keep pdata->suspend_mode for now and remove it when we rework
> chromeos-laptop driver.
> 
> Signed-off-by: Dmitry Torokhov 

Applied, thanks.

> ---
>  drivers/input/touchscreen/atmel_mxt_ts.c | 27 +++-
>  1 file changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/input/touchscreen/atmel_mxt_ts.c 
> b/drivers/input/touchscreen/atmel_mxt_ts.c
> index 7659bc48f1db8..20e1224d1a6db 100644
> --- a/drivers/input/touchscreen/atmel_mxt_ts.c
> +++ b/drivers/input/touchscreen/atmel_mxt_ts.c
> @@ -324,6 +324,8 @@ struct mxt_data {
>  
>   /* for config update handling */
>   struct completion crc_completion;
> +
> + enum mxt_suspend_mode suspend_mode;
>  };
>  
>  struct mxt_vb2_buffer {
> @@ -2868,7 +2870,7 @@ static const struct attribute_group mxt_attr_group = {
>  
>  static void mxt_start(struct mxt_data *data)
>  {
> - switch (data->pdata->suspend_mode) {
> + switch (data->suspend_mode) {
>   case MXT_SUSPEND_T9_CTRL:
>   mxt_soft_reset(data);
>  
> @@ -2886,12 +2888,11 @@ static void mxt_start(struct mxt_data *data)
>   mxt_t6_command(data, MXT_COMMAND_CALIBRATE, 1, false);
>   break;
>   }
> -
>  }
>  
>  static void mxt_stop(struct mxt_data *data)
>  {
> - switch (data->pdata->suspend_mode) {
> + switch (data->suspend_mode) {
>   case MXT_SUSPEND_T9_CTRL:
>   /* Touch disable */
>   mxt_write_object(data,
> @@ -2954,8 +2955,6 @@ static const struct mxt_platform_data 
> *mxt_parse_dt(struct i2c_client *client)
>   pdata->t19_keymap = keymap;
>   }
>  
> - pdata->suspend_mode = MXT_SUSPEND_DEEP_SLEEP;
> -
>   return pdata;
>  }
>  #else
> @@ -3109,6 +3108,21 @@ mxt_get_platform_data(struct i2c_client *client)
>   return ERR_PTR(-EINVAL);
>  }
>  
> +static const struct dmi_system_id chromebook_T9_suspend_dmi[] = {
> + {
> + .matches = {
> + DMI_MATCH(DMI_SYS_VENDOR, "GOOGLE"),
> + DMI_MATCH(DMI_PRODUCT_NAME, "Link"),
> + },
> + },
> + {
> + .matches = {
> + DMI_MATCH(DMI_PRODUCT_NAME, "Peppy"),
> + },
> + },
> + { }
> +};
> +
>  static int mxt_probe(struct i2c_client *client, const struct i2c_device_id 
> *id)
>  {
>   struct mxt_data *data;
> @@ -3135,6 +3149,9 @@ static int mxt_probe(struct i2c_client *client, const 
> struct i2c_device_id *id)
>   init_completion(&data->reset_completion);
>   init_completion(&data->crc_completion);
>  
> + data->suspend_mode = dmi_check_system(chromebook_T9_suspend_dmi) ?
> + MXT_SUSPEND_T9_CTRL : MXT_SUSPEND_DEEP_SLEEP;
> +
>   data->reset_gpio = devm_gpiod_get_optional(&client->dev,
>  "reset", GPIOD_OUT_LOW);
>   if (IS_ERR(data->reset_gpio)) {
> -- 
> 2.16.2.804.g6dcf76e118-goog
> 

-- 
Benson Leung
Staff Software Engineer
Chrome OS Kernel
Google Inc.
ble...@google.com
Chromium OS Project
ble...@chromium.org
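
The quirk-table lookup in the quoted patch can be modelled in user space. This is only a sketch under stated assumptions: struct quirk, t9_quirks, quirk_matches() and pick_suspend_mode() are hypothetical names, a NULL field is treated as "match anything", and fields are compared as substrings, which is how DMI_MATCH behaves in the kernel.

```c
#include <stddef.h>
#include <string.h>

enum suspend_mode {
	SUSPEND_DEEP_SLEEP,
	SUSPEND_T9_CTRL,
};

/* Hypothetical, simplified quirk entry: a NULL field matches anything. */
struct quirk {
	const char *vendor;
	const char *product;
};

/* Mirrors the two entries of the quoted table; an all-NULL entry ends it. */
static const struct quirk t9_quirks[] = {
	{ "GOOGLE", "Link" },
	{ NULL, "Peppy" },
	{ NULL, NULL },
};

/* Every non-NULL field must appear as a substring of the system's value. */
int quirk_matches(const struct quirk *q, const char *vendor,
		  const char *product)
{
	if (q->vendor && !strstr(vendor, q->vendor))
		return 0;
	if (q->product && !strstr(product, q->product))
		return 0;
	return 1;
}

/* Default to deep sleep; fall back to T9 control only for listed systems. */
enum suspend_mode pick_suspend_mode(const char *vendor, const char *product)
{
	const struct quirk *q;

	for (q = t9_quirks; q->vendor || q->product; q++)
		if (quirk_matches(q, vendor, product))
			return SUSPEND_T9_CTRL;

	return SUSPEND_DEEP_SLEEP;
}
```

The point of the design is that the driver-wide default stays MXT_SUSPEND_DEEP_SLEEP and the exception list is the only place that names specific machines.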




Re: [PATCH] xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen

2018-04-09 Thread Juergen Gross
On 09/04/18 20:51, Boris Ostrovsky wrote:
> Pre-4.17 kernels ignored start_info's rsdp_paddr pointer and instead
> relied on finding RSDP in standard location in BIOS RO memory. This
> has worked since that's where Xen used to place it.
> 
> However, with recent Xen change (commit 4a5733771e6f ("libxl: put RSDP
> for PVH guest near 4GB")) it prefers to keep RSDP at a "non-standard"
> address. Even though as of commit b17d9d1df3c3 ("x86/xen: Add pvh
> specific rsdp address retrieval function") Linux is able to find RSDP,
> for back-compatibility reasons we need to indicate to Xen that we can
> handle this, and we do so by setting XENFEAT_linux_rsdp_unrestricted
> flag in ELF notes.
> 
> (Also take this opportunity and sync features.h header file with Xen)
> 
> Signed-off-by: Boris Ostrovsky 

Reviewed-by: Juergen Gross 


Juergen
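
The handshake described in the commit message is a feature-bit check. The sketch below shows the general idea only; the bit index, the names (FEAT_RSDP_UNRESTRICTED, feature_mask(), choose_rsdp_placement()) and the placement policy are assumptions for illustration, not Xen's or libxl's actual code.

```c
#include <stdint.h>

/* Illustrative bit index only; the real value lives in Xen's features.h. */
#define FEAT_RSDP_UNRESTRICTED 15

enum rsdp_placement {
	RSDP_LEGACY_BIOS_AREA,	/* where older kernels scan for the RSDP */
	RSDP_NEAR_4GB,		/* the "non-standard" address */
};

/* Turn a feature number into its bit in a 32-bit feature bitmap. */
uint32_t feature_mask(unsigned int feature)
{
	return UINT32_C(1) << feature;
}

/* True if the guest advertised the given feature in its ELF notes. */
int guest_supports(uint32_t advertised, unsigned int feature)
{
	return (advertised & feature_mask(feature)) != 0;
}

/*
 * Hypothetical toolstack-side policy: only use the new location when the
 * guest has said it can find the RSDP through start_info.
 */
enum rsdp_placement choose_rsdp_placement(uint32_t advertised)
{
	if (guest_supports(advertised, FEAT_RSDP_UNRESTRICTED))
		return RSDP_NEAR_4GB;
	return RSDP_LEGACY_BIOS_AREA;
}
```

A kernel that never sets the bit keeps getting the legacy placement, which is the back-compatibility property the commit message is concerned with.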


[tip:perf/urgent] perf tests clang: Fix function name for clang IR test

2018-04-09 Thread tip-bot for Sandipan Das
Commit-ID:  fcbd8fa44664e99a5d8c7ab97f1afdd82472f973
Gitweb: https://git.kernel.org/tip/fcbd8fa44664e99a5d8c7ab97f1afdd82472f973
Author: Sandipan Das 
AuthorDate: Wed, 4 Apr 2018 23:34:19 +0530
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 9 Apr 2018 11:13:09 -0300

perf tests clang: Fix function name for clang IR test

As stated in tests/llvm-src-base.c, the name of the bpf function should
be "bpf_func__SyS_epoll_pwait" but this clang test fails as it tries to
lookup "bpf_func__SyS_epoll_wait".

Before applying patch:

55: builtin clang support :
55.1: builtin clang compile C source to IR: FAILED!
55.2: builtin clang compile C source to ELF object: Skip

After applying patch:

55: builtin clang support :
55.1: builtin clang compile C source to IR: Ok
55.2: builtin clang compile C source to ELF object: Ok

Signed-off-by: Sandipan Das 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Naveen N. Rao 
Fixes: e67d52d411c3 ("perf clang: Update test case to use real BPF script")
Link: 
http://lkml.kernel.org/r/20180404180419.19056-3-sandi...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/c++/clang-test.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/c++/clang-test.cpp 
b/tools/perf/util/c++/clang-test.cpp
index a4014d786676..7b042a5ebc68 100644
--- a/tools/perf/util/c++/clang-test.cpp
+++ b/tools/perf/util/c++/clang-test.cpp
@@ -41,7 +41,7 @@ int test__clang_to_IR(void)
if (!M)
return -1;
for (llvm::Function& F : *M)
-   if (F.getName() == "bpf_func__SyS_epoll_wait")
+   if (F.getName() == "bpf_func__SyS_epoll_pwait")
return 0;
return -1;
 }


[tip:perf/urgent] perf clang: Add support for recent clang versions

2018-04-09 Thread tip-bot for Sandipan Das
Commit-ID:  7854e499f33fd9c7e63288692ffb754d9b1d02fd
Gitweb: https://git.kernel.org/tip/7854e499f33fd9c7e63288692ffb754d9b1d02fd
Author: Sandipan Das 
AuthorDate: Wed, 4 Apr 2018 23:34:18 +0530
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 9 Apr 2018 11:13:08 -0300

perf clang: Add support for recent clang versions

The clang API calls used by perf have changed in recent releases and
builds succeed with libclang-3.9 only. This introduces compatibility
with libclang-4.0 and above.

Without this patch, we will see the following compilation errors with
libclang-4.0+:

 util/c++/clang.cpp: In function ‘clang::CompilerInvocation* 
perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, 
clang::DiagnosticsEngine&)’:
 util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
   Opts.Inputs.emplace_back(Path, IK_C);
  ^~~~
 util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::Module> 
perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, 
llvm::IntrusiveRefCntPtr<clang::vfs::FileSystem>)’:
 util/c++/clang.cpp:75:26: error: no matching function for call to 
‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
   Clang.setInvocation(&*CI);
   ^
 In file included from util/c++/clang.cpp:14:0:
 /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void 
clang::CompilerInstance::setInvocation(std::shared_ptr<clang::CompilerInvocation>)
void setInvocation(std::shared_ptr<CompilerInvocation> Value);
 ^

Committer testing:

Tested on Fedora 27 after installing the clang-devel and llvm-devel
packages, versions:

  # rpm -qa | egrep llvm\|clang
  llvm-5.0.1-6.fc27.x86_64
  clang-libs-5.0.1-5.fc27.x86_64
  clang-5.0.1-5.fc27.x86_64
  clang-tools-extra-5.0.1-5.fc27.x86_64
  llvm-libs-5.0.1-6.fc27.x86_64
  llvm-devel-5.0.1-6.fc27.x86_64
  clang-devel-5.0.1-5.fc27.x86_64
  #

Make sure you don't have some older version lying around in /usr/local,
etc, then:

  $ make LIBCLANGLLVM=1 -C tools/perf install-bin

And in the end perf will be linked against these libraries:

  # ldd ~/bin/perf | egrep -i llvm\|clang
libclangAST.so.5 => /lib64/libclangAST.so.5 (0x7f8bb2eb4000)
libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x7f8bb29e3000)
libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x7f8bb23f7000)
libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x7f8bb206)
libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 
(0x7f8bb1d06000)
libclangLex.so.5 => /lib64/libclangLex.so.5 (0x7f8bb1a3e000)
libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x7f8bb17d4000)
libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x7f8bb15c5000)
libclangSema.so.5 => /lib64/libclangSema.so.5 (0x7f8bb0cc9000)
libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 
(0x7f8bb0a23000)
libclangParse.so.5 => /lib64/libclangParse.so.5 (0x7f8bb0725000)
libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 
(0x7f8bb039a000)
libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x7f8bace98000)
libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 
(0x7f8bab735000)
libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 
(0x7f8bab4b2000)
libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 
(0x7f8bab2a1000)
libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 
(0x7f8bab08e000)
  #

Signed-off-by: Sandipan Das 
Tested-by: Arnaldo Carvalho de Melo 
Cc: Jiri Olsa 
Cc: Naveen N. Rao 
Fixes: 00b86691c77c ("perf clang: Add builtin clang support ant test case")
Link: 
http://lkml.kernel.org/r/20180404180419.19056-2-sandi...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/c++/clang.cpp | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
index 1bfc946e37dc..bf31ceab33bd 100644
--- a/tools/perf/util/c++/clang.cpp
+++ b/tools/perf/util/c++/clang.cpp
@@ -9,6 +9,7 @@
  * Copyright (C) 2016 Huawei Inc.
  */
 
+#include "clang/Basic/Version.h"
 #include "clang/CodeGen/CodeGenAction.h"
 #include "clang/Frontend/CompilerInvocation.h"
 #include "clang/Frontend/CompilerInstance.h"
@@ -58,7 +59,8 @@ createCompilerInvocation(llvm::opt::ArgStringList CFlags, 
StringRef& Path,
 
FrontendOptions& Opts = CI->getFrontendOpts();
Opts.Inputs.clear();
-   Opts.Inputs.emplace_back(Path, IK_C);
+   Opts.Inputs.emplace_back(Path,
+   FrontendOptions::getInputKindForExtension("c"));
return CI;
 }
 
@@ -71,10 +73,17 @@ getModuleFromSource(llvm::opt::ArgStringList CFlags,
 
Clang.setVirtualFileSystem(&*VFS);
 
+#if CLANG_VERSION_MAJOR < 4
IntrusiveRefCntPtr<CompilerInvocation> CI =
createCompilerInvocation(std::move(CFlags), Path,
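
The diff above is cut off in this archive, but the visible "#if CLANG_VERSION_MAJOR < 4" line shows the approach: select code at compile time from the library's major version. A minimal sketch of that pattern in plain C follows, assuming a hypothetical LIB_VERSION_MAJOR macro in place of CLANG_VERSION_MAJOR and a made-up invocation_kind() helper that only reports which branch was compiled.

```c
/*
 * Hypothetical stand-in for a version macro supplied by a library header.
 * A real build would get it from the header, not define it here.
 */
#ifndef LIB_VERSION_MAJOR
#define LIB_VERSION_MAJOR 5
#endif

enum invocation_kind {
	INVOCATION_RAW_POINTER,	/* older API: takes a raw pointer */
	INVOCATION_SHARED_PTR,	/* newer API: takes a shared pointer */
};

/* Only one of the two bodies is ever compiled. */
#if LIB_VERSION_MAJOR < 4
enum invocation_kind invocation_kind(void)
{
	return INVOCATION_RAW_POINTER;
}
#else
enum invocation_kind invocation_kind(void)
{
	return INVOCATION_SHARED_PTR;
}
#endif
```

Keeping both branches behind one function name means callers do not need their own version checks.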
 

[tip:perf/urgent] perf tools: Fix perf builds with clang support

2018-04-09 Thread tip-bot for Sandipan Das
Commit-ID:  c2fb54a183cfe77c6fdc9d71e2d5299c1c302a6e
Gitweb: https://git.kernel.org/tip/c2fb54a183cfe77c6fdc9d71e2d5299c1c302a6e
Author: Sandipan Das 
AuthorDate: Wed, 4 Apr 2018 23:34:17 +0530
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 9 Apr 2018 11:13:07 -0300

perf tools: Fix perf builds with clang support

For libclang, some distro packages provide static libraries (.a) while
some provide shared libraries (.so). Currently, perf code can only be
linked with static libraries. This makes perf build possible for both
cases.

Signed-off-by: Sandipan Das 
Cc: Jiri Olsa 
Cc: Naveen N. Rao 
Fixes: d58ac0bf8d1e ("perf build: Add clang and llvm compile and linking 
support")
Link: 
http://lkml.kernel.org/r/20180404180419.19056-1-sandi...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/Makefile.perf | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index f7517e1b73f8..83e453de36f8 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -364,7 +364,8 @@ LIBS = -Wl,--whole-archive $(PERFLIBS) $(EXTRA_PERFLIBS) 
-Wl,--no-whole-archive
 
 ifeq ($(USE_CLANG), 1)
   CLANGLIBS_LIST = AST Basic CodeGen Driver Frontend Lex Tooling Edit Sema 
Analysis Parse Serialization
-  LIBCLANG = $(foreach l,$(CLANGLIBS_LIST),$(wildcard $(shell $(LLVM_CONFIG) 
--libdir)/libclang$(l).a))
+  CLANGLIBS_NOEXT_LIST = $(foreach l,$(CLANGLIBS_LIST),$(shell $(LLVM_CONFIG) 
--libdir)/libclang$(l))
+  LIBCLANG = $(foreach l,$(CLANGLIBS_NOEXT_LIST),$(wildcard $(l).a $(l).so))
   LIBS += -Wl,--start-group $(LIBCLANG) -Wl,--end-group
 endif
 


[tip:perf/urgent] perf tools: No need to include namespaces.h in util.h

2018-04-09 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  ad0902e0c4004dc95bf15229933012121ff54033
Gitweb: https://git.kernel.org/tip/ad0902e0c4004dc95bf15229933012121ff54033
Author: Arnaldo Carvalho de Melo 
AuthorDate: Fri, 6 Apr 2018 14:53:56 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Mon, 9 Apr 2018 10:57:50 -0300

perf tools: No need to include namespaces.h in util.h

The only thing that is needed there is a forward declaration for 'struct
nsinfo', so disentangle this, which in turn allows built-in clang
builds, i.e. 'make LIBCLANGLLVM=1 -C tools/perf'.
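
For readers less familiar with the idiom, a minimal sketch of why a forward declaration is enough here: code that only stores or passes a 'struct nsinfo *' never needs the full definition. The names example_ctx, example_ctx__init() and nsinfo_is_set() are hypothetical, made up for this illustration, and are not part of perf.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: the type is incomplete, its members are unknown here. */
struct nsinfo;

/* Hypothetical holder that only keeps a pointer to the incomplete type. */
struct example_ctx {
	struct nsinfo *nsi;
};

/* Storing the pointer compiles without the full definition of struct nsinfo. */
static void example_ctx__init(struct example_ctx *ctx, struct nsinfo *nsi)
{
	assert(ctx != NULL);
	ctx->nsi = nsi;
}

/* Comparing the pointer against NULL also needs no knowledge of the members. */
static int nsinfo_is_set(const struct nsinfo *nsi)
{
	return nsi != NULL;
}
```

Only code that dereferences the pointer or needs sizeof(struct nsinfo) has to include the defining header.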

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Naveen N. Rao 
Cc: Sandipan Das 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-vq26rsuwq1cqylpcyvq89...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/util.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 9496365da3d7..c9626c206208 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -11,8 +11,7 @@
 #include 
 #include 
 #include 
-#include 
-#include "namespaces.h"
+#include 
 
 /* General helper functions */
 void usage(const char *err) __noreturn;
@@ -26,6 +25,7 @@ static inline void *zalloc(size_t size)
 #define zfree(ptr) ({ free(*ptr); *ptr = NULL; })
 
 struct dirent;
+struct nsinfo;
 struct strlist;
 
 int mkdir_p(char *path, mode_t mode);


[tip:perf/urgent] perf hists browser: Remove leftover from row returned from refresh

2018-04-09 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  94e87a8bd529121ea90219164c65c36ea1d19e56
Gitweb: https://git.kernel.org/tip/94e87a8bd529121ea90219164c65c36ea1d19e56
Author: Arnaldo Carvalho de Melo 
AuthorDate: Fri, 6 Apr 2018 12:11:11 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 Apr 2018 12:23:25 -0300

perf hists browser: Remove leftover from row returned from refresh

The per-browser screen refresh routine (ui_browser->refresh()) should
return the first row that should be cleaned after the rows just printed,
in case not all rows available on the screen get filled.

When moving the extra title lines logic from the hists browser to the
generic ui_browser class, one piece of that logic remained in the hists
browser and then when going back from the annotate browser to the hists
browser in a case where fewer lines were displayed in the hists browser,
for instance when filtering the entries per substring, one line of the
annotate browser would remain on the screen, fix that.

Example of the screen artifact:


Samples: 73K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 45172901394
Overhead  Shared O  Symbol
   0.30%  [kernel]  [k] __indirect_thunk_start
   0.09%  [kernel]  [k] __x86_indirect_thunk_r10
   │  lfence


Here from 'perf top' the view was zoomed with '/thunk' to functions
having that substring, then the first was annotated and from the
annotate browser ESC was pressed, then the first lines were overwritten,
but the 'lfence' line remained due to the off by one bug fixed in this
cset.

Cc: Adrian Hunter 
Cc: Andi Kleen 
Cc: David Ahern 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Fixes: ef9ff6017e3c ("perf ui browser: Move the extra title lines from the hists browser")
Link: https://lkml.kernel.org/n/tip-odryfso74eaarm0z3e4v9...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/ui/browsers/hists.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index de17e59d9952..0eec06c105c6 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1744,17 +1744,11 @@ static void ui_browser__hists_init_top(struct ui_browser *browser)
 static unsigned int hist_browser__refresh(struct ui_browser *browser)
 {
 	unsigned row = 0;
-	u16 header_offset = 0;
 	struct rb_node *nd;
 	struct hist_browser *hb = container_of(browser, struct hist_browser, b);
-	struct hists *hists = hb->hists;
-
-	if (hb->show_headers) {
-		struct perf_hpp_list *hpp_list = hists->hpp_list;
 
+	if (hb->show_headers)
 		hist_browser__show_headers(hb);
-		header_offset = hpp_list->nr_header_lines;
-	}
 
 	ui_browser__hists_init_top(browser);
 	hb->he_selection = NULL;
@@ -1792,7 +1786,7 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 			break;
 	}
 
-	return row + header_offset;
+	return row;
 }
 
 static struct rb_node *hists__filter_entries(struct rb_node *nd,


[tip:perf/urgent] perf hists browser: Show extra_title_lines in the 'D' debug hotkey

2018-04-09 Thread tip-bot for Arnaldo Carvalho de Melo
Commit-ID:  fdae6400809aa179f8ca04e32f3eb176fb3b3a9d
Gitweb: https://git.kernel.org/tip/fdae6400809aa179f8ca04e32f3eb176fb3b3a9d
Author: Arnaldo Carvalho de Melo 
AuthorDate: Fri, 6 Apr 2018 11:56:11 -0300
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 Apr 2018 12:22:06 -0300

perf hists browser: Show extra_title_lines in the 'D' debug hotkey

To help in fixing problems in the browser.

Cc: Adrian Hunter 
Cc: David Ahern 
Cc: Jiri Olsa 
Cc: Namhyung Kim 
Cc: Wang Nan 
Link: https://lkml.kernel.org/n/tip-uj0n76yqh5bf98i0edckd...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/ui/browsers/hists.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index b06afb8f51fb..de17e59d9952 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -659,9 +659,10 @@ int hist_browser__run(struct hist_browser *browser, const char *help,
 			struct hist_entry *h = rb_entry(browser->b.top,
 							struct hist_entry, rb_node);
 			ui_helpline__pop();
-			ui_helpline__fpush("%d: nr_ent=(%d,%d), rows=%d, idx=%d, fve: idx=%d, row_off=%d, nrows=%d",
+			ui_helpline__fpush("%d: nr_ent=(%d,%d), etl: %d, rows=%d, idx=%d, fve: idx=%d, row_off=%d, nrows=%d",
 					   seq++, browser->b.nr_entries,
 					   browser->hists->nr_entries,
+					   browser->b.extra_title_lines,
 					   browser->b.rows,
 					   browser->b.index,
 					   browser->b.top_idx,


[tip:perf/urgent] perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering

2018-04-09 Thread tip-bot for Adrian Hunter
Commit-ID:  b238db655796e74b59d9ece58b645ad0b494d615
Gitweb: https://git.kernel.org/tip/b238db655796e74b59d9ece58b645ad0b494d615
Author: Adrian Hunter 
AuthorDate: Tue, 6 Mar 2018 11:13:18 +0200
Committer:  Arnaldo Carvalho de Melo 
CommitDate: Fri, 6 Apr 2018 09:40:41 -0300

perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering

In preparation for supporting AUX area sampling buffers,
auxtrace_queues__add_buffer() needs to be more generic. To that end, move
CPU filtering into it.
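
A minimal userspace sketch of the filtering rule that is being moved, for illustration only. bitmap_test() is a local stand-in for the kernel-style test_bit(), and should_filter_cpu() is a hypothetical name; the actual helper is filter_cpu() in the diff below.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in for test_bit(): is bit 'nr' set in 'bitmap'? */
static bool bitmap_test(const unsigned long *bitmap, unsigned int nr)
{
	return (bitmap[nr / BITS_PER_ULONG] >> (nr % BITS_PER_ULONG)) & 1UL;
}

/*
 * Return true when a buffer for 'cpu' should be dropped: a bitmap was
 * supplied, the buffer is tied to a CPU (cpu != -1) and that CPU's bit
 * is clear. No bitmap, or cpu == -1, means nothing is filtered.
 */
static bool should_filter_cpu(const unsigned long *cpu_bitmap, int cpu)
{
	assert(cpu >= -1);
	return cpu_bitmap && cpu != -1 &&
	       !bitmap_test(cpu_bitmap, (unsigned int)cpu);
}
```

Doing this check inside the add-buffer path means every caller that queues a buffer gets the same filtering, instead of each caller having to remember it.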

Signed-off-by: Adrian Hunter 
Cc: Jiri Olsa 
Link: http://lkml.kernel.org/r/1520327598-1317-8-git-send-email-adrian.hun...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/auxtrace.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index e1aff91c54a8..857de69a5361 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -302,6 +302,13 @@ static int auxtrace_queues__split_buffer(struct auxtrace_queues *queues,
 	return 0;
 }
 
+static bool filter_cpu(struct perf_session *session, int cpu)
+{
+	unsigned long *cpu_bitmap = session->itrace_synth_opts->cpu_bitmap;
+
+	return cpu_bitmap && cpu != -1 && !test_bit(cpu, cpu_bitmap);
+}
+
 static int auxtrace_queues__add_buffer(struct auxtrace_queues *queues,
 				       struct perf_session *session,
 				       unsigned int idx,
@@ -310,6 +317,9 @@ static int auxtrace_queues__add_buffer(struct auxtrace_queues *queues,
 {
 	int err = -ENOMEM;
 
+	if (filter_cpu(session, buffer->cpu))
+		return 0;
+
 	buffer = memdup(buffer, sizeof(*buffer));
 	if (!buffer)
 		return -ENOMEM;
@@ -344,13 +354,6 @@ out_free:
 	return err;
 }
 
-static bool filter_cpu(struct perf_session *session, int cpu)
-{
-	unsigned long *cpu_bitmap = session->itrace_synth_opts->cpu_bitmap;
-
-	return cpu_bitmap && cpu != -1 && !test_bit(cpu, cpu_bitmap);
-}
-
 int auxtrace_queues__add_event(struct auxtrace_queues *queues,
 			       struct perf_session *session,
 			       union perf_event *event, off_t data_offset,
@@ -367,9 +370,6 @@ int auxtrace_queues__add_event(struct auxtrace_queues *queues,
 	};
 	unsigned int idx = event->auxtrace.idx;
 
-	if (filter_cpu(session, event->auxtrace.cpu))
-		return 0;
-
 	return auxtrace_queues__add_buffer(queues, session, idx, &buffer,
 					   buffer_ptr);
 }


[PATCH v2 1/2] vhost: fix vhost_vq_access_ok() log check

2018-04-09 Thread Stefan Hajnoczi
Commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log
when IOTLB is enabled") introduced a regression.  The logic was
originally:

  if (vq->iotlb)
  return 1;
  return A && B;

After the patch the short-circuit logic for A was inverted:

  if (A || vq->iotlb)
  return A;
  return B;

This patch fixes the regression by rewriting the checks in the obvious
way, no longer returning A when vq->iotlb is non-NULL (which is hard to
understand).
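
To make the difference concrete, here is a small standalone sketch that models the two shapes of the check as pure functions of A (the log check), B (the ring check) and whether vq->iotlb is set. The function names are hypothetical and the sketch only models the control flow described above, not the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Shape of the check after the regression: A short-circuits B. */
static bool access_ok_regressed(bool log_ok, bool ring_ok, bool iotlb)
{
	if (log_ok || iotlb)
		return log_ok;
	return ring_ok;
}

/* Shape of the check after this patch. */
static bool access_ok_fixed(bool log_ok, bool ring_ok, bool iotlb)
{
	if (!log_ok)
		return false;
	if (iotlb)
		return true;
	return ring_ok;
}

/* The fixed version is equivalent to A && (iotlb || B) on all inputs. */
static void check_fixed_matches_intent(void)
{
	for (int bits = 0; bits < 8; bits++) {
		bool a = bits & 1, b = bits & 2, io = bits & 4;

		assert(access_ok_fixed(a, b, io) == (a && (io || b)));
	}
}

/* Count the input combinations on which the two versions disagree. */
static int count_disagreements(void)
{
	int n = 0;

	for (int bits = 0; bits < 8; bits++) {
		bool a = bits & 1, b = bits & 2, io = bits & 4;

		if (access_ok_regressed(a, b, io) != access_ok_fixed(a, b, io))
			n++;
	}
	return n;
}
```

The two versions disagree only without IOTLB: the regressed check accepts a queue when exactly one of A and B holds, whereas the fixed check requires both.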

Reported-by: syzbot+65a84dde0214b0387...@syzkaller.appspotmail.com
Cc: Jason Wang 
Signed-off-by: Stefan Hajnoczi 
---
 drivers/vhost/vhost.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 5320039671b7..93fd0c75b0d8 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1244,10 +1244,12 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq,
 /* Caller should have vq mutex and device mutex */
 int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 {
-	int ret = vq_log_access_ok(vq, vq->log_base);
+	if (!vq_log_access_ok(vq, vq->log_base))
+		return 0;
 
-	if (ret || vq->iotlb)
-		return ret;
+	/* Access validation occurs at prefetch time with IOTLB */
+	if (vq->iotlb)
+		return 1;
 
 	return vq_access_ok(vq, vq->num, vq->desc, vq->avail, vq->used);
 }
-- 
2.14.3



[PATCH v2 2/2] vhost: return bool from *_access_ok() functions

2018-04-09 Thread Stefan Hajnoczi
Currently vhost *_access_ok() functions return int.  This is error-prone
because there are two popular conventions:

1. 0 means failure, 1 means success
2. -errno means failure, 0 means success

Although vhost mostly uses #1, it does not do so consistently.
umem_access_ok() uses #2.

This patch changes the return type from int to bool so that false means
failure and true means success.  This eliminates a potential source of
errors.
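
A small sketch of how mixing the two int conventions goes wrong, and why bool avoids it. All names here (range_ok_v1(), range_ok_v2(), range_ok(), caller_accepts()) are hypothetical and made up for this illustration; they are not vhost functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Convention #1: 1 means success, 0 means failure. */
static int range_ok_v1(unsigned long size)
{
	return size != 0;
}

/* Convention #2: 0 means success, -errno means failure. */
static int range_ok_v2(unsigned long size)
{
	return size != 0 ? 0 : -EINVAL;
}

/* bool return: true means success, false means failure, nothing to guess. */
static bool range_ok(unsigned long size)
{
	return size != 0;
}

/* A caller written with convention #1 in mind. */
static bool caller_accepts(int ret)
{
	return ret != 0;
}

/* Feeding a convention #2 result to that caller inverts the decision. */
static void check_conventions_differ(void)
{
	assert(caller_accepts(range_ok_v1(8)));
	assert(!caller_accepts(range_ok_v2(8))); /* valid input rejected */
	assert(caller_accepts(range_ok_v2(0)));  /* invalid input accepted */
	assert(caller_accepts(range_ok(8)));
	assert(!caller_accepts(range_ok(0)));
}
```

Both int variants have the same signature, so the compiler cannot flag the mismatch; a bool return type at least documents which reading is intended.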

Suggested-by: Linus Torvalds 
Signed-off-by: Stefan Hajnoczi 
---
 drivers/vhost/vhost.h |  4 ++--
 drivers/vhost/vhost.c | 66 +--
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index ac4b6056f19a..6e00fa57af09 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -178,8 +178,8 @@ void vhost_dev_cleanup(struct vhost_dev *);
 void vhost_dev_stop(struct vhost_dev *);
 long vhost_dev_ioctl(struct vhost_dev *, unsigned int ioctl, void __user *argp);
 long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp);
-int vhost_vq_access_ok(struct vhost_virtqueue *vq);
-int vhost_log_access_ok(struct vhost_dev *);
+bool vhost_vq_access_ok(struct vhost_virtqueue *vq);
+bool vhost_log_access_ok(struct vhost_dev *);
 
 int vhost_get_vq_desc(struct vhost_virtqueue *,
 		      struct iovec iov[], unsigned int iov_count,
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 93fd0c75b0d8..b6a082ef33dd 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -641,14 +641,14 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
 }
 EXPORT_SYMBOL_GPL(vhost_dev_cleanup);
 
-static int log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
+static bool log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
 {
 	u64 a = addr / VHOST_PAGE_SIZE / 8;
 
 	/* Make sure 64 bit math will not overflow. */
 	if (a > ULONG_MAX - (unsigned long)log_base ||
 	    a + (unsigned long)log_base > ULONG_MAX)
-		return 0;
+		return false;
 
 	return access_ok(VERIFY_WRITE, log_base + a,
 			 (sz + VHOST_PAGE_SIZE * 8 - 1) / VHOST_PAGE_SIZE / 8);
@@ -661,30 +661,30 @@ static bool vhost_overflow(u64 uaddr, u64 size)
 }
 
 /* Caller should have vq mutex and device mutex. */
-static int vq_memory_access_ok(void __user *log_base, struct vhost_umem *umem,
-			       int log_all)
+static bool vq_memory_access_ok(void __user *log_base, struct vhost_umem *umem,
+				int log_all)
 {
 	struct vhost_umem_node *node;
 
 	if (!umem)
-		return 0;
+		return false;
 
 	list_for_each_entry(node, &umem->umem_list, link) {
 		unsigned long a = node->userspace_addr;
 
 		if (vhost_overflow(node->userspace_addr, node->size))
-			return 0;
+			return false;
 
 
 		if (!access_ok(VERIFY_WRITE, (void __user *)a,
 				    node->size))
-			return 0;
+			return false;
 		else if (log_all && !log_access_ok(log_base,
 						   node->start,
 						   node->size))
-			return 0;
+			return false;
 	}
-	return 1;
+	return true;
 }
 
 static inline void __user *vhost_vq_meta_fetch(struct vhost_virtqueue *vq,
@@ -701,13 +701,13 @@ static inline void __user *vhost_vq_meta_fetch(struct vhost_virtqueue *vq,
 
 /* Can we switch to this memory table? */
 /* Caller should have device mutex but not vq mutex */
-static int memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
-			    int log_all)
+static bool memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
+			     int log_all)
 {
 	int i;
 
 	for (i = 0; i < d->nvqs; ++i) {
-		int ok;
+		bool ok;
 		bool log;
 
 		mutex_lock(&d->vqs[i]->mutex);
@@ -717,12 +717,12 @@ static int memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
 			ok = vq_memory_access_ok(d->vqs[i]->log_base,
 						 umem, log);
 		else
-			ok = 1;
+			ok = true;
 		mutex_unlock(&d->vqs[i]->mutex);
 		if (!ok)
-			return 0;
+			return false;
 	}
-	return 1;
+	return true;
 }
 
 static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
@@ -959,21 +959,21 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
 	spin_unlock(&d->iotlb_lock);
 }
 
-static int umem_access_ok(u64 uaddr, u64 size, int access)
+static bool umem_access_ok(u64 uaddr, u64 size, int 

[PATCH v2 0/2] vhost: fix vhost_vq_access_ok() log check

2018-04-09 Thread Stefan Hajnoczi
v2:
 * Rewrote the conditional to make the vq access check clearer [Linus]
 * Added Patch 2 to make the return type consistent and harder to misuse [Linus]

The first patch fixes the vhost virtqueue access check which was recently
broken.  The second patch replaces the int return type with bool to prevent
future bugs.

Stefan Hajnoczi (2):
  vhost: fix vhost_vq_access_ok() log check
  vhost: return bool from *_access_ok() functions

 drivers/vhost/vhost.h |  4 +--
 drivers/vhost/vhost.c | 70 ++-
 2 files changed, 38 insertions(+), 36 deletions(-)

-- 
2.14.3



Re: [GIT PULL 00/13] perf/urgent fixes

2018-04-09 Thread Ingo Molnar

* Arnaldo Carvalho de Melo <a...@kernel.org> wrote:

> Hi Ingo,
> 
>   Please consider pulling,
> 
> - Arnaldo
> 
> 
> Test results at the end of this message, as usual.
> 
> The following changes since commit d1e7e602cd64cf61f87dbf30df07c24df9eb1d99:
> 
>   perf/x86/intel: Move regs->flags EXACT bit init (2018-04-05 09:28:40 +0200)
> 
> are available in the Git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo-4.17-20180409
> 
> for you to fetch changes up to fcbd8fa44664e99a5d8c7ab97f1afdd82472f973:
> 
>   perf tests clang: Fix function name for clang IR test (2018-04-09 11:13:09 -0300)
> 
> 
> perf/urgent fixes:
> 
> . Fix the --stdio2/TUI annotate output to include group details,
>   be it for a recorded '{a,b,f}' explicit event group or when
>   forcing group display using 'perf report --group' for a set of
>   events not recorded as a group (Arnaldo Carvalho de Melo)
> 
> . Fix display artifacts in the ui browser (base class for the
>   annotate and main report/top TUI browser) related to the extra
>   title lines work (Arnaldo Carvalho de Melo)
> 
> . perf auxtrace refactorings, leftovers from a previously partially
>   processed patchset (Adrian Hunter)
> 
> . Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de Melo)
> 
> - Synchronize i915_drm.h, silencing a perf build warning and
>   in the process automagically adding support for a new ioctl
>   command (Arnaldo Carvalho de Melo)
> 
> Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
> 
> 
> Adrian Hunter (2):
>   perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
>   perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
> 
> Arnaldo Carvalho de Melo (8):
>   perf annotate: Show group details on the title line
>   perf annotate browser: Fixup vertical line separating metrics from instructions
>   perf ui browser: Fixup cleaning unused lines at the bottom
>   perf report: Remove duplicated 'samples' in lost samples warning
>   tools headers uapi: Synchronize i915_drm.h
>   perf hists browser: Show extra_title_lines in the 'D' debug hotkey
>   perf hists browser: Remove leftover from row returned from refresh
>   perf tools: No need to include namespaces.h in util.h
> 
> Sandipan Das (3):
>   perf tools: Fix perf builds with clang support
>   perf clang: Add support for recent clang versions
>   perf tests clang: Fix function name for clang IR test
> 
>  tools/include/uapi/drm/i915_drm.h  | 112 +++--
>  tools/perf/Makefile.perf   |   3 +-
>  tools/perf/ui/browser.c|   4 +-
>  tools/perf/ui/browsers/annotate.c  |   2 +-
>  tools/perf/ui/browsers/hists.c |  13 ++---
>  tools/perf/util/annotate.c |   7 ++-
>  tools/perf/util/auxtrace.c |  72 +++-
>  tools/perf/util/c++/clang-test.cpp |   2 +-
>  tools/perf/util/c++/clang.cpp  |  11 +++-
>  tools/perf/util/session.c  |   2 +-
>  tools/perf/util/util.h |   4 +-
>  11 files changed, 169 insertions(+), 63 deletions(-)

Pulled, thanks a lot Arnaldo!

Ingo


[PATCH 2/2] drm/bridge: sii902x: add optional power supplies

2018-04-09 Thread Philippe Cornu
Add the 3 optional power supplies using the exact description
found in the document named
"SiI9022A/SiI9024A HDMI Transmitter Data Sheet (August 2016)".

Signed-off-by: Philippe Cornu 
---
 drivers/gpu/drm/bridge/sii902x.c | 39 +++
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c
index 60373d7eb220..e17ba6db1ec8 100644
--- a/drivers/gpu/drm/bridge/sii902x.c
+++ b/drivers/gpu/drm/bridge/sii902x.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -86,6 +87,7 @@ struct sii902x {
struct drm_bridge bridge;
struct drm_connector connector;
struct gpio_desc *reset_gpio;
+   struct regulator_bulk_data supplies[3];
 };
 
 static inline struct sii902x *bridge_to_sii902x(struct drm_bridge *bridge)
@@ -392,23 +394,43 @@ static int sii902x_probe(struct i2c_client *client,
return PTR_ERR(sii902x->reset_gpio);
}
 
+   sii902x->supplies[0].supply = "iovcc";
+   sii902x->supplies[1].supply = "avcc12";
+   sii902x->supplies[2].supply = "cvcc12";
+   ret = devm_regulator_bulk_get(dev, ARRAY_SIZE(sii902x->supplies),
+ sii902x->supplies);
+   if (ret) {
+   dev_err(dev, "regulator_bulk_get failed\n");
+   return ret;
+   }
+
+   ret = regulator_bulk_enable(ARRAY_SIZE(sii902x->supplies),
+   sii902x->supplies);
+   if (ret) {
+   dev_err(dev, "regulator_bulk_enable failed\n");
+   return ret;
+   }
+
+   usleep_range(1, 2);
+
sii902x_reset(sii902x);
 
ret = regmap_write(sii902x->regmap, SII902X_REG_TPI_RQB, 0x0);
if (ret)
-   return ret;
+   goto err_disable_regulator;
 
ret = regmap_bulk_read(sii902x->regmap, SII902X_REG_CHIPID(0),
   &chipid, 4);
if (ret) {
dev_err(dev, "regmap_read failed %d\n", ret);
-   return ret;
+   goto err_disable_regulator;
}
 
if (chipid[0] != 0xb0) {
dev_err(dev, "Invalid chipid: %02x (expecting 0xb0)\n",
chipid[0]);
-   return -EINVAL;
+   ret = -EINVAL;
+   goto err_disable_regulator;
}
 
/* Clear all pending interrupts */
@@ -424,7 +446,7 @@ static int sii902x_probe(struct i2c_client *client,
IRQF_ONESHOT, dev_name(dev),
sii902x);
if (ret)
-   return ret;
+   goto err_disable_regulator;
}
 
sii902x->bridge.funcs = &sii902x_bridge_funcs;
@@ -434,6 +456,12 @@ static int sii902x_probe(struct i2c_client *client,
i2c_set_clientdata(client, sii902x);
 
return 0;
+
+err_disable_regulator:
+   regulator_bulk_disable(ARRAY_SIZE(sii902x->supplies),
+  sii902x->supplies);
+
+   return ret;
 }
 
 static int sii902x_remove(struct i2c_client *client)
@@ -443,6 +471,9 @@ static int sii902x_remove(struct i2c_client *client)
 
drm_bridge_remove(&sii902x->bridge);
 
+   regulator_bulk_disable(ARRAY_SIZE(sii902x->supplies),
+  sii902x->supplies);
+
return 0;
 }
 
-- 
2.15.1



[PATCH 0/2] drm/bridge: sii902x: add optional power supplies

2018-04-09 Thread Philippe Cornu
This patchset adds the 3 optional power supplies to the sii902x
drm bridge driver.

Philippe Cornu (2):
  dt-bindings/display/bridge: sii902x: add optional power supplies
  drm/bridge: sii902x: add optional power supplies

 .../devicetree/bindings/display/bridge/sii902x.txt |  3 ++
 drivers/gpu/drm/bridge/sii902x.c   | 39 +++---
 2 files changed, 38 insertions(+), 4 deletions(-)

-- 
2.15.1



[PATCH 1/2] dt-bindings/display/bridge: sii902x: add optional power supplies

2018-04-09 Thread Philippe Cornu
Add the 3 optional power supplies using the exact description
found in the document named
"SiI9022A/SiI9024A HDMI Transmitter Data Sheet (August 2016)".

Signed-off-by: Philippe Cornu 
---
 Documentation/devicetree/bindings/display/bridge/sii902x.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/bridge/sii902x.txt b/Documentation/devicetree/bindings/display/bridge/sii902x.txt
index 56a3e68ccb80..cf53678fe574 100644
--- a/Documentation/devicetree/bindings/display/bridge/sii902x.txt
+++ b/Documentation/devicetree/bindings/display/bridge/sii902x.txt
@@ -8,6 +8,9 @@ Optional properties:
- interrupts-extended or interrupt-parent + interrupts: describe
  the interrupt line used to inform the host about hotplug events.
- reset-gpios: OF device-tree gpio specification for RST_N pin.
+   - iovcc-supply: I/O supply voltage (1.8V or 3.3V, host-dependent).
+   - avcc12-supply: TMDS analog supply voltage (1.2V).
+   - cvcc12-supply: Digital core supply voltage (1.2V).
 
 Optional subnodes:
- video input: this subnode can contain a video input port node
-- 
2.15.1



Re: [PATCH v2 2/9] PCI: dwc: Add support for endpoint mode

2018-04-09 Thread Kishon Vijay Abraham I
Hi,

On Monday 09 April 2018 03:11 PM, Gustavo Pimentel wrote:
> The PCIe controller dual mode is capable of operating in host mode as well
> as endpoint mode by configuration, therefore this patch aims to add
> endpoint mode support to the designware driver.
> 
> Signed-off-by: Gustavo Pimentel 
> ---
> Change v1->v2:
>  - Removed dw_plat_pcie_stop_link empty function.
>  - Implemented Kishon's suggestions about dw-pcie-rc and dw-pcie strings.
> compatibility.
>  - Added second entry on pci_epf_test_ids structure.
> 
>  drivers/pci/dwc/Kconfig   |  45 ++--
>  drivers/pci/dwc/pcie-designware-ep.c  |   4 +-
>  drivers/pci/dwc/pcie-designware-plat.c| 153 --
>  drivers/pci/endpoint/functions/pci-epf-test.c |   9 ++
>  4 files changed, 190 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
> index 2f3f5c5..3fd7daf 100644
> --- a/drivers/pci/dwc/Kconfig
> +++ b/drivers/pci/dwc/Kconfig
> @@ -7,8 +7,7 @@ config PCIE_DW
>  
>  config PCIE_DW_HOST
>  bool
> - depends on PCI
> - depends on PCI_MSI_IRQ_DOMAIN
> + depends on PCI && PCI_MSI_IRQ_DOMAIN
>  select PCIE_DW
>  
>  config PCIE_DW_EP
> @@ -52,16 +51,42 @@ config PCI_DRA7XX_EP
>  
>  config PCIE_DW_PLAT
>   bool "Platform bus based DesignWare PCIe Controller"
> - depends on PCI
> - depends on PCI_MSI_IRQ_DOMAIN
> - select PCIE_DW_HOST
> - ---help---
> -  This selects the DesignWare PCIe controller support. Select this if
> -  you have a PCIe controller on Platform bus.
> + help
> +   There are two instances of PCIe controller in Designware IP.
> +   This controller can work either as EP or RC. In order to enable
> +   host-specific features PCIE_DW_PLAT_HOST must be selected and in
> +   order to enable device-specific features PCIE_DW_PLAT_EP must be
> +   selected.
>  
> -  If you have a controller with this interface, say Y or M here.
> +config PCIE_DW_PLAT_HOST
> + bool "Platform bus based DesignWare PCIe Controller - Host mode"
> + depends on PCI && PCI_MSI_IRQ_DOMAIN
> + select PCIE_DW_HOST
> + select PCIE_DW_PLAT
> + default y
> + help
> +   Enables support for the PCIe controller in the Designware IP to
> +   work in host mode. There are two instances of PCIe controller in
> +   Designware IP.
> +   This controller can work either as EP or RC. In order to enable
> +   host-specific features PCIE_DW_PLAT_HOST must be selected and in
> +   order to enable device-specific features PCI_DW_PLAT_EP must be
> +   selected.
>  
> -  If unsure, say N.
> +config PCIE_DW_PLAT_EP
> + bool "Platform bus based DesignWare PCIe Controller - Endpoint mode"
> + depends on PCI && PCI_MSI_IRQ_DOMAIN
> + depends on PCI_ENDPOINT
> + select PCIE_DW_EP
> + select PCIE_DW_PLAT
> + help
> +   Enables support for the PCIe controller in the Designware IP to
> +   work in endpoint mode. There are two instances of PCIe controller
> +   in Designware IP.
> +   This controller can work either as EP or RC. In order to enable
> +   host-specific features PCIE_DW_PLAT_HOST must be selected and in
> +   order to enable device-specific features PCI_DW_PLAT_EP must be
> +   selected.
>  
>  config PCI_EXYNOS
>   bool "Samsung Exynos PCIe controller"
> diff --git a/drivers/pci/dwc/pcie-designware-ep.c b/drivers/pci/dwc/pcie-designware-ep.c
> index f07678b..4ac135a 100644
> --- a/drivers/pci/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/dwc/pcie-designware-ep.c
> @@ -15,8 +15,10 @@
>  void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
>  {
>   struct pci_epc *epc = ep->epc;
> + struct pci_epf *epf;
>  
> - pci_epc_linkup(epc);
> + list_for_each_entry(epf, &epc->pci_epf, list)
> + pci_epf_linkup(epf);
>  }

This shouldn't be required anymore.
>  
>  static void __dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar,
> diff --git a/drivers/pci/dwc/pcie-designware-plat.c b/drivers/pci/dwc/pcie-designware-plat.c
> index 5416aa8..5382a7a 100644
> --- a/drivers/pci/dwc/pcie-designware-plat.c
> +++ b/drivers/pci/dwc/pcie-designware-plat.c
> @@ -12,19 +12,29 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "pcie-designware.h"
>  
>  struct dw_plat_pcie {
> - struct dw_pcie  *pci;
> + struct dw_pcie  *pci;
> + struct regmap   *regmap;
> + enum dw_pcie_device_mode mode;
>  };
>  
> +struct dw_plat_pcie_of_data {
> + enum dw_pcie_device_mode mode;
> +};
> +
> +static const struct of_device_id dw_plat_pcie_of_match[];
> +
>  static int dw_plat_pcie_host_init(struct pcie_port *pp)
>  {
>   struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> @@ -42,9 +52,53 @@ static const struct 

[GIT PULL] libnvdimm for 4.17

2018-04-09 Thread Williams, Dan J
Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.17

...to receive the libnvdimm update for 4.17.

This cycle was not something I ever want to repeat, as there were
several late changes that have only now just settled. Half of the
branch, up to commit d2c997c0f145 "fs, dax: use page->mapping to
warn...", has been in -next for several releases. The of_pmem driver
and the address range scrub rework were late arrivals, and the dax work
was scaled back at the last moment.

The of_pmem driver missed a previous merge window due to an oversight.
A sense of obligation to rectify that miss is why it is included for
4.17. It has acks from PowerPC folks. Stephen reported a build failure
that only occurs when merging it with your latest tree; for now I have
fixed that up by disabling modular builds of of_pmem. A test merge with
your tree has received a build success report from the 0day robot over
156 configs.

An initial version of the ARS rework was submitted before the merge
window. It is self-contained within libnvdimm, is a net code reduction,
and passes all unit tests.

The filesystem-dax changes are based on the wait_var_event()
functionality from tip/sched/core. However, late review feedback showed
that those changes regressed truncate performance to a large degree.
The branch was rewound to drop the truncate behavior change and now
only includes preparation patches and cleanups (with full acks and
reviews). The finalization of this dax-dma-vs-truncate work will need
to wait for 4.18.

git picked the wait_var_event() baseline for the diffstat, so I also
include the diffstat of the test merge below.

The following changes since commit 3eb2ce825ea1ad89d20f7a3b5780df850e4be274:

  Linux 4.16-rc7 (2018-03-25 12:44:30 -1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.17

for you to fetch changes up to e13e75b86ef2f88e3a47d672dd4c52a293efb95b:

  Merge branch 'for-4.17/dax' into libnvdimm-for-next (2018-04-09 10:50:17 -0700)


libnvdimm for 4.17

* A rework of the filesystem-dax implementation provides for detection of
  unmap operations (truncate / hole punch) colliding with in-progress
  device-DMA. A fix for these collisions remains a work-in-progress
  pending resolution of truncate latency and starvation regressions.

* The of_pmem driver expands the users of libnvdimm outside of x86 and
  ACPI to describe an implementation of persistent memory on PowerPC with
  Open Firmware / Device tree.

* Address Range Scrub (ARS) handling is completely rewritten to account for
  the fact that ARS may run for 100s of seconds and there is no
  platform-defined way to cancel it. ARS will now no longer block namespace
  initialization.

* The NVDIMM Namespace Label implementation is updated to handle label
  areas as small as 1K, down from 128K.

* Miscellaneous cleanups and updates to unit test infrastructure.


Dan Williams (26):
  libnvdimm: remove redundant __func__ in dev_dbg
  device-dax: remove redundant __func__ in dev_dbg
  nfit: skip region registration for incomplete control regions
  acpi, nfit: rework NVDIMM leaf method detection
  dax: store pfns in the radix
  fs, dax: prepare for dax-specific address_space_operations
  block, dax: remove dead code in blkdev_writepages()
  xfs, dax: introduce xfs_dax_aops
  ext4, dax: introduce ext4_dax_aops
  nfit: fix region registration vs block-data-window ranges
  ext2, dax: introduce ext2_dax_aops
  fs, dax: use page->mapping to warn if truncate collides with a busy page
  dax: introduce CONFIG_DAX_DRIVER
  dax, dm: allow device-mapper to operate without dax support
  nfit, address-range-scrub: fix scrub in-progress reporting
  libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device'
  nfit, address-range-scrub: introduce nfit_spa->ars_state
  libnvdimm, dimm: fix dpa reservation vs uninitialized label area
  libnvdimm, namespace: use a safe lookup for dimm device name
  libnvdimm, region: quiet region probe
  nfit, address-range-scrub: determine one platform max_ars value
  nfit, address-range-scrub: rework and simplify ARS state machine
  nfit, address-range-scrub: add module option to skip initial ars
  libnvdimm, of_pmem: workaround OF_NUMA=n build error
  Merge branch 'for-4.17/libnvdimm' into libnvdimm-for-next
  Merge branch 'for-4.17/dax' into libnvdimm-for-next

Johannes Thumshirn (4):
  acpi, nfit: remove redundant __func__ in dev_dbg
  libnvdimm: provide module_nd_driver wrapper
  libnvdimm, pmem: use module_nd_driver
  device-dax: use module_nd_driver

Oliver O'Halloran (4):
  libnvdimm: Add of_node to region and bus descriptors
  libnvdimm: Add 

WARNING: kobject bug in corrupted

2018-04-09 Thread syzbot

Hello,

syzbot hit the following crash on upstream commit
fd40ffc72e2f74c7db61e400903e7d50a88bc0b0 (Mon Apr 9 18:36:05 2018 +)
selinux: fix missing dput() before selinuxfs unmount
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=dd8fe49d0d1423aa5295

C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5710100694040576
syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=5951393567342592
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=6276231339180032
Kernel config: https://syzkaller.appspot.com/x/.config?id=-771321277174894814

compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+dd8fe49d0d1423aa5...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.

If you forward the report, please keep this part and the footer.

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
kobject_add_internal failed for gfs2meta with -EEXIST, don't try to  
register things with the same name in the same directory.

 sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
 sysfs_create_dir_ns+0x184/0x1d0 fs/sysfs/dir.c:58
WARNING: CPU: 1 PID: 4473 at lib/kobject.c:238  
kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236

 create_dir lib/kobject.c:69 [inline]
 kobject_add_internal+0x353/0xba0 lib/kobject.c:228
Kernel panic - not syncing: panic_on_warn set ...

 kobject_add_varg lib/kobject.c:364 [inline]
 kobject_init_and_add+0xed/0x130 lib/kobject.c:435
 gfs2_sys_fs_add+0x1ff/0x500 fs/gfs2/sys.c:652
 fill_super+0x8c9/0x1a40 fs/gfs2/ops_fstype.c:1118
 gfs2_mount+0x5e6/0x712 fs/gfs2/ops_fstype.c:1321
 mount_fs+0xae/0x328 fs/super.c:1222
 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:1027 [inline]
 do_new_mount fs/namespace.c:2517 [inline]
 do_mount+0x564/0x3070 fs/namespace.c:2847
 ksys_mount+0x12d/0x140 fs/namespace.c:3063
 SYSC_mount fs/namespace.c:3077 [inline]
 SyS_mount+0x35/0x50 fs/namespace.c:3074
 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x4430ca
RSP: 002b:7fff5f80e158 EFLAGS: 0297 ORIG_RAX: 00a5
RAX: ffda RBX: 0003 RCX: 004430ca
RDX: 2040 RSI: 2080 RDI: 7fff5f80e170
RBP: 006cb018 R08: 24c0 R09: 000a
R10:  R11: 0297 R12: 6e5f6b636f6c3d6f
R13: 746f72706b636f6c R14: 0030656c69662f2e R15: 0004
CPU: 1 PID: 4473 Comm: syzkaller208561 Not tainted 4.16.0+ #14
[ cut here ]
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
kobject_add_internal failed for gfs2meta with -EEXIST, don't try to  
register things with the same name in the same directory.

 panic+0x22f/0x4de kernel/panic.c:183
WARNING: CPU: 0 PID: 4470 at lib/kobject.c:238  
kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236

Modules linked in:
CPU: 0 PID: 4470 Comm: syzkaller208561 Not tainted 4.16.0+ #14
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

RIP: 0010:kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236
 __warn.cold.8+0x163/0x1a3 kernel/panic.c:547
RSP: 0018:8801af7af480 EFLAGS: 00010286
 report_bug+0x252/0x2d0 lib/bug.c:186
RAX: 007d RBX: 8801af24d1d0 RCX: 815f42ed
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
RDX:  RSI: 815f8fa1 RDI: 8801af7aefe0
RBP: 8801af7af578 R08: 8801af794640 R09: 0006
R10: 8801af794640 R11:  R12: ffef
R13: 8801d3abea48 R14: 110035ef5e9a R15: 8801d3abea00
FS:  011be880() GS:8801db00() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7fff0fb79330 CR3: 0001af48 CR4: 001406f0
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
DR0:  DR1:  DR2: 
 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:991
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
RIP: 0010:kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236
RSP: 0018:8801af4ef480 EFLAGS: 00010286
RAX: 007d RBX: 8801af2a1210 RCX: 815f42ed
RDX:  RSI: 815f8fa1 RDI: 8801af4eefe0
RBP: 8801af4ef578 R08: 8801af00c700 R09: 0006
R10: 8801af00c700 R11:  R12: ffef
R13: 8801d3abea48 R14: 110035e9de9a R15: 8801d3abea00
 kobject_add_varg lib/kobject.c:364 [inline]
 kobject_init_and_add+0xed/0x130 lib/kobject.c:435
 gfs2_sys_fs_add+0x1ff/0x500 fs/gfs2/sys.c:652
 

Re: [RFC] vhost: introduce mdev based hardware vhost backend

2018-04-09 Thread Tiwei Bie
On Tue, Apr 10, 2018 at 10:52:52AM +0800, Jason Wang wrote:
> On 2018-04-02 23:23, Tiwei Bie wrote:
> > This patch introduces a mdev (mediated device) based hardware
> > vhost backend. This backend is an abstraction of the various
> > hardware vhost accelerators (potentially any device that uses
> > virtio ring can be used as a vhost accelerator). Some generic
> > mdev parent ops are provided for accelerator drivers to support
> > generating mdev instances.
> > 
> > What's this
> > ===
> > 
> > The idea is that we can set up a virtio ring compatible device
> > with the messages available at the vhost-backend. Originally,
> > these messages were used to implement a software vhost backend,
> > but now we will use these messages to set up a virtio ring
> > compatible hardware device. Then the hardware device will be
> > able to work with the guest virtio driver in the VM just as
> > the software backend does. That is to say, we can implement
> > a hardware based vhost backend in QEMU, and any virtio ring
> > compatible device can potentially be used with this backend.
> > (We also call it vDPA -- vhost Data Path Acceleration).
> > 
> > One problem is that different virtio ring compatible devices
> > may have different device interfaces. That is to say, we will
> > need different drivers in QEMU. It could be troublesome. And
> > that's what this patch is trying to fix. The idea behind this
> > patch is very simple: mdev is a standard way to emulate a device
> > in the kernel.
> 
> So you just move the abstraction layer from qemu to the kernel, and you
> still need different drivers in the kernel for the different device
> interfaces of accelerators. This looks even more complex than leaving it in
> qemu. As you said, another idea is to implement a userspace vhost backend
> for accelerators, which seems easier and could co-work with other parts of
> qemu without inventing a new type of message.

I'm not quite sure. Do you think it's acceptable to
add various vendor specific hardware drivers in QEMU?

> 
> This needs careful thought to find the best solution.

Yeah, definitely! :)
And your opinions would be very helpful!

> 
> >   So we defined a standard device based on mdev, which
> > is able to accept vhost messages. When the mdev emulation code
> > (i.e. the generic mdev parent ops provided by this patch) gets
> > vhost messages, it will parse and deliver them to accelerator
> > drivers. Drivers can use these messages to set up accelerators.
> > 
> > That is to say, the generic mdev parent ops (e.g. read()/write()/
> > ioctl()/...) will be provided for accelerator drivers to register
> > accelerators as mdev parent devices. And each accelerator device
> > will support generating standard mdev instance(s).
> > 
> > With this standard device interface, we will be able to just
> > develop one userspace driver to implement the hardware based
> > vhost backend in QEMU.
> > 
> > Difference between vDPA and PCI passthru
> > ===
> > 
> > The key difference between vDPA and PCI passthru is that, in
> > vDPA, only the data path of the device (e.g. DMA ring, notify
> > region and queue interrupt) is passed through to the VM, while
> > the device control path (e.g. PCI configuration space and MMIO
> > regions) is still defined and emulated by QEMU.
> > 
> > The benefits of keeping virtio device emulation in QEMU compared
> > with virtio device PCI passthru include (but are not limited to):
> > 
> > - consistent device interface for guest OS in the VM;
> > - max flexibility on the hardware design, especially the
> >   accelerator for each vhost backend doesn't have to be a
> >   full PCI device;
> > - leveraging the existing virtio live-migration framework;
> > 
> > The interface of this mdev based device
> > ===
> > 
> > 1. BAR0
> > 
> > The MMIO region described by BAR0 is the main control
> > interface. Messages will be written to or read from
> > this region.
> > 
> > The message type is determined by the `request` field
> > in message header. The message size is encoded in the
> > message header too. The message format looks like this:
> > 
> > struct vhost_vfio_op {
> >     __u64 request;
> >     __u32 flags;
> >     /* Flag values: */
> > #define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */
> >     __u32 size;
> >     union {
> >         __u64 u64;
> >         struct vhost_vring_state state;
> >         struct vhost_vring_addr addr;
> >         struct vhost_memory memory;
> >     } payload;
> > };
> > 
> > The existing vhost-kernel ioctl cmds are reused as
> > the message requests in above structure.
> > 
> > Each message will be written to or read from this
> > region at offset 0:
> > 
> > int vhost_vfio_write(struct vhost_dev *dev, struct vhost_vfio_op *op)
> > {
> >     int count = VHOST_VFIO_OP_HDR_SIZE + op->size;
> >     struct vhost_vfio *vfio = dev->opaque;
> >     int ret;
> > 
> >     ret = pwrite64(vfio->device_fd, op, count, 
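
The quoted message is cut off at this point in the archive. As a rough illustration of the write path it describes (header plus `size` bytes of payload, written at offset 0 of the device file descriptor), here is a minimal user-space sketch. It is not the patch's code: `struct demo_op` only mirrors the header fields shown above and uses a fixed 64-byte payload, and `DEMO_OP_HDR_SIZE`, `demo_op_count()` and `demo_op_write()` are hypothetical names. The definition of `VHOST_VFIO_OP_HDR_SIZE` does not appear in the quoted text, so treating the header size as the offset of the payload is an assumption.

```c
#define _XOPEN_SOURCE 700 /* for pwrite() */

#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for struct vhost_vfio_op with a fixed-size payload. */
struct demo_op {
	uint64_t request;    /* message type */
	uint32_t flags;      /* e.g. a "need reply" bit */
	uint32_t size;       /* number of valid payload bytes */
	uint8_t payload[64];
};

/* Assumption: the header is everything that precedes the payload. */
#define DEMO_OP_HDR_SIZE offsetof(struct demo_op, payload)

/* Bytes to transfer for one message, or 0 if op->size is out of range. */
static size_t demo_op_count(const struct demo_op *op)
{
	if (op->size > sizeof(op->payload))
		return 0;
	return DEMO_OP_HDR_SIZE + op->size;
}

/*
 * Write header plus payload at offset 0 of fd.
 * Returns 0 on success, -1 on error or short write.
 */
static int demo_op_write(int fd, const struct demo_op *op)
{
	size_t count = demo_op_count(op);
	ssize_t ret;

	if (count == 0)
		return -1;

	ret = pwrite(fd, op, count, 0);
	return (ret == (ssize_t)count) ? 0 : -1;
}
```

Bounding `size` before computing the byte count keeps a malformed message from reading past the end of the structure; whether the real backend performs such a check is not visible in the quoted excerpt.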

linux-next: Tree for Apr 10

2018-04-09 Thread Stephen Rothwell
Hi all,

Please do not add any v4.18 destined stuff to your linux-next included
trees until after v4.17-rc1 has been released.

Changes since 20180409:

The parisc-hd tree still had its build failure for which I applied a patch.

The nvdimm tree lost its build failure.

Non-merge commits (relative to Linus' tree): 1678
 1682 files changed, 62884 insertions(+), 31587 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 258 trees (counting Linus' and 44 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (fd3b36d27566 Merge branch 'work.namei' of 
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs)
Merging fixes/master (147a89bc71e7 Merge tag 'kconfig-v4.17' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging kbuild-current/fixes (28913ee8191a netfilter: nf_nat_snmp_basic: add 
correct dependency to Makefile)
Merging arc-current/for-curr (661e50bc8532 Linux 4.16-rc4)
Merging arm-current/fixes (2a141cd0d83b ARM: 8758/1: decompressor: restore r1 
and r2 just before jumping to the kernel)
Merging arm64-fixes/for-next/fixes (e21da1c99200 arm64: Relax 
ARM_SMCCC_ARCH_WORKAROUND_1 discovery)
Merging m68k-current/for-linus (ecd685580c8f m68k/mac: Remove bogus "FIXME" 
comment)
Merging powerpc-fixes/fixes (52396500f97c powerpc/64s: Fix i-side SLB miss bad 
address handler saving nonvolatile GPRs)
Merging sparc/master (17dec0a94915 Merge branch 'userns-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (a2ac99905f1e vhost-net: set packet weight of tx polling to 
2 * vq size)
Merging bpf/master (33491588c1fb kernel/bpf/syscall: fix warning defined but 
not used)
Merging ipsec/master (4b66af2d6356 af_key: Always verify length of provided 
sadb_key)
Merging netfilter/master (3f1e53abff84 netfilter: ebtables: don't attempt to 
allocate 0-sized compat array)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook 
mask only if set)
Merging wireless-drivers/master (77e30e10ee28 iwlwifi: mvm: query regdb for wmm 
rule if needed)
Merging mac80211/master (b5dbc28762fd Merge tag 'kbuild-fixes-v4.16-3' of 
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging rdma-fixes/for-rc (84652aefb347 RDMA/ucma: Introduce safer 
rdma_addr_size() variants)
Merging sound-current/for-linus (e1a3a981e320 ALSA: pcm: Remove WARN_ON() at 
snd_pcm_hw_params() error)
Merging pci-current/for-linus (fc110ebdd014 PCI: dwc: Fix enumeration end when 
reaching root subordinate)
Merging driver-core.current/driver-core-linus (38c23685b273 Merge tag 
'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging tty.current/tty-linus (38c23685b273 Merge tag 'armsoc-drivers' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging usb.current/usb-linus (38c23685b273 Merge tag 'armsoc-drivers' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging usb-gadget-fixes/fixes (c6ba5084ce0d usb: gadget: udc: renesas_usb3: 
add binging for r8a77965)
Merging usb-serial-fixes/usb-linus (86d71233b615 USB: serial: ftdi_sio: add 
support for Harman FirmwareHubEmulator)
Merging usb-chipidea-fixes/ci-fo

Re: [PATCH] xhci: Fix USB ports for Dell Inspiron 5775

2018-04-09 Thread Kai Heng Feng

Hi Matthias,

On Mar 18, 2018, at 11:11 PM, Kai-Heng Feng wrote:


The Dell Inspiron 5775 is a Raven Ridge. The Enable Slot command times
out when a USB device is plugged in:
[ 212.156326] xhci_hcd :03:00.3: Error while assigning device slot ID
[ 212.156340] xhci_hcd :03:00.3: Max number of devices this xHCI host supports is 64.
[ 212.156348] usb usb2-port3: couldn't allocate usb_device

AMD suggests that a delay before xHC suspends can fix the issue.

I can confirm it fixes the issue, so use the suspend delay quirk for
Raven Ridge's xHC.


I am hoping this patch can get merged in v4.17...

Thanks,
Kai-Heng



Cc: sta...@vger.kernel.org
Signed-off-by: Kai-Heng Feng 
---
 drivers/usb/host/xhci-pci.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index d9f831b67e57..93ce34bce7b5 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -126,7 +126,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 	if (pdev->vendor == PCI_VENDOR_ID_AMD && usb_amd_find_chipset_info())
 		xhci->quirks |= XHCI_AMD_PLL_FIX;

-	if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->device == 0x43bb)
+	if (pdev->vendor == PCI_VENDOR_ID_AMD &&
+	    (pdev->device == 0x15e0 ||
+	     pdev->device == 0x15e1 ||
+	     pdev->device == 0x43bb))
 		xhci->quirks |= XHCI_SUSPEND_DELAY;

 	if (pdev->vendor == PCI_VENDOR_ID_AMD)
--
2.15.1


Re: [PATCH v5 0/6] enable creating [k,u]probe with perf_event_open

2018-04-09 Thread Alexei Starovoitov

On 4/9/18 9:45 PM, Ravi Bangoria wrote:

Hi Song,

On 12/07/2017 04:15 AM, Song Liu wrote:

With the current kernel, user space tools can only create/destroy [k,u]probes
with a text-based API (kprobe_events and uprobe_events in tracefs). This
approach relies on user space to clean up the [k,u]probes after using them.
However, it is not easy for user space to clean up properly.

To solve this problem, we introduce a file descriptor based API.
Specifically, we extended perf_event_open to create a [k,u]probe and attach
it to the file descriptor created by perf_event_open. These
[k,u]probes are associated with this file descriptor, so they are not
available in tracefs.


Sorry for being late. One simple question..

Will it be good to support k/uprobe arguments with perf_event_open()?
Do you have any plans about that?


no plans for that. People that use text based interfaces should
probably be using text interfaces consistently.
imo mixing FD-based kprobe api with text is not worth the complexity.
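
As a rough illustration of the FD-based interface discussed above, here is a minimal sketch of how user space might fill a perf_event_attr for a kprobe. It assumes the series exposes a dynamic "kprobe" PMU whose type number is read from /sys/bus/event_source/devices/kprobe/type, and that the function-name pointer and offset travel in config1/config2; the helper names are hypothetical, and none of this is taken verbatim from the series.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: read the dynamic PMU type, or -1 if unavailable. */
int read_kprobe_pmu_type(void)
{
	FILE *f = fopen("/sys/bus/event_source/devices/kprobe/type", "r");
	int type = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &type) != 1)
		type = -1;
	fclose(f);
	return type;
}

/* Hypothetical helper: describe a kprobe on func+offset in attr. */
void fill_kprobe_attr(struct perf_event_attr *attr, uint32_t pmu_type,
		      const char *func, uint64_t offset)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = pmu_type;
	attr->config1 = (uint64_t)(uintptr_t)func;	/* pointer to the symbol name */
	attr->config2 = offset;				/* offset within the function */
}

/*
 * Hypothetical helper: returns an fd, or -1 on error. The probe lives only
 * as long as the fd does, so closing the fd is the whole cleanup.
 */
int open_kprobe(const char *func, uint64_t offset)
{
	struct perf_event_attr attr;
	int type = read_kprobe_pmu_type();

	if (type < 0)
		return -1;
	fill_kprobe_attr(&attr, (uint32_t)type, func, offset);
	/* pid = -1, cpu = 0: all tasks on CPU 0; this normally needs privileges. */
	return (int)syscall(SYS_perf_event_open, &attr, -1, 0, -1,
			    PERF_FLAG_FD_CLOEXEC);
}
```

Note that nothing here passes probe arguments, which matches the answer above: argument fetching stays with the text-based tracefs interface.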



Re: [PATCH v2] resource: Fix integer overflow at reallocation

2018-04-09 Thread Takashi Iwai
On Tue, 10 Apr 2018 02:23:26 +0200,
Andrew Morton wrote:
> 
> On Sun,  8 Apr 2018 09:20:26 +0200 Takashi Iwai  wrote:
> 
> > We've got a bug report indicating a kernel panic at booting on an
> > x86-32 system, and it turned out to be the invalid resource assigned
> > after PCI resource reallocation.  __find_resource() first aligns the
> > resource start address and resets the end address with start+size-1
> > accordingly, then checks whether it's contained.  Here the end address
> > may overflow the integer, although resource_contains() still returns
> > true because the function validates only start and end address.  So
> > this ends up with returning an invalid resource (start > end).
> > 
> > There was already an attempt to cover such a problem in the commit
> > 47ea91b4052d ("Resource: fix wrong resource window calculation"), but
> > this case is an overseen one.
> > 
> > This patch adds the validity check in resource_contains() to see
> > whether the given resource has a valid range for avoiding the integer
> > overflow problem.
> > 
> > ...
> >
> > --- a/include/linux/ioport.h
> > +++ b/include/linux/ioport.h
> > @@ -212,6 +212,9 @@ static inline bool resource_contains(struct resource *r1, struct resource *r2)
> > return false;
> > if (r1->flags & IORESOURCE_UNSET || r2->flags & IORESOURCE_UNSET)
> > return false;
> > +   /* sanity check whether it's a valid resource range */
> > +   if (r2->end < r2->start)
> > +   return false;
> > return r1->start <= r2->start && r1->end >= r2->end;
> >  }
> 
> This doesn't look like the correct place to handle this?  Clearly .end
> < .start is an invalid state for a resource and we should never have
> constructed such a thing in the first place?  So adding a check at the
> place where this resource was initially created seems to be the correct
> fix?

Yes, that was also my first thought, and the v1 patch actually did that.
The v2 one follows Ram's suggestion, so that we also cover potential
bugs in all other callers.

I don't mind which way we fix it; below is the v1 version.
Please choose the one you think is better.


Thanks!

Takashi

-- 8< --

From: Takashi Iwai 
Subject: [PATCH v1] resource: Fix integer overflow at reallocation

We've got a bug report indicating a kernel panic at booting on an
x86-32 system, and it turned out to be the invalid PCI resource
assigned after reallocation.  __find_resource() first aligns the
resource start address and resets the end address with start+size-1
accordingly, then checks whether it's contained.  Here the end address
may overflow the integer, although resource_contains() still returns
true because the function validates only start and end address.  So
this ends up with returning an invalid resource (start > end).

There was already an attempt to cover such a problem in the commit
47ea91b4052d ("Resource: fix wrong resource window calculation"), but
this case is an overseen one.

This patch adds the validity check of the newly calculated resource
for avoiding the integer overflow problem.

Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739
Fixes: 23c570a67448 ("resource: ability to resize an allocated resource")
Reported-and-tested-by: Michael Henders 
Cc: 
Signed-off-by: Takashi Iwai 
---

 kernel/resource.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index e270b5048988..2af6c03858b9 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -651,7 +651,8 @@ static int __find_resource(struct resource *root, struct resource *old,
 			alloc.start = constraint->alignf(constraint->alignf_data, &avail,
 					size, constraint->align);
 			alloc.end = alloc.start + size - 1;
-			if (resource_contains(&avail, &alloc)) {
+			if (alloc.start <= alloc.end &&
+			    resource_contains(&avail, &alloc)) {
 				new->start = alloc.start;
 				new->end = alloc.end;
 				return 0;
-- 
2.16.2
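
To make the overflow concrete, here is a small user-space sketch of the situation the commit message describes. It assumes a 32-bit resource_size_t (as on the reported x86-32 system) and uses a simplified resource_contains() that only compares start and end; the struct, the alloc_fits() helper and the example addresses are illustrative, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 32-bit resource addresses, as on the reported x86-32 system. */
typedef uint32_t resource_size_t;

/* Simplified stand-in for the kernel's struct resource. */
struct resource {
	resource_size_t start;
	resource_size_t end;
};

/* Simplified: only the start/end comparison, no flag or type checks. */
bool resource_contains(const struct resource *r1, const struct resource *r2)
{
	return r1->start <= r2->start && r1->end >= r2->end;
}

/*
 * Hypothetical helper mirroring the v1 fix: reject a candidate whose end
 * address wrapped around (start > end) before asking whether it is contained.
 */
bool alloc_fits(const struct resource *avail, resource_size_t start,
		resource_size_t size, struct resource *out)
{
	struct resource alloc = {
		.start = start,
		.end = start + size - 1,	/* may wrap around in 32 bits */
	};

	if (alloc.start <= alloc.end && resource_contains(avail, &alloc)) {
		*out = alloc;
		return true;
	}
	return false;
}
```

With a window of 0xf0000000-0xffffffff, a start of 0xf8000000 and a size of 0x10000000, the end wraps to 0x07ffffff. The plain containment test still passes because the wrapped end is below the window's end, which is exactly the invalid result (start > end) that either version of the patch rejects.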



Re: [PATCH] mmc: sdhci-pci: Only do AMD tuning for HS200

2018-04-09 Thread Shyam Sundar S K


On 4/7/2018 3:37 AM, Daniel Kurtz wrote:
> Commit c31165d7400b ("mmc: sdhci-pci: Add support for HS200 tuning mode
> on AMD, eMMC-4.5.1") added a HS200 tuning method for use with AMD SDHCI
> controllers.  As described in the commit subject, this tuning is specific
> for HS200.  However, as implemented, this method is used for all host
> timings, because platform_execute_tuning, if it exists, is called
> unconditionally by sdhci_execute_tuning().  This breaks tuning when using
> the AMD controller with, for example, a DDR50 SD card.
>
> Instead, we can implement an amd execute_tuning wrapper callback, and
> then conditionally do the HS200 specific tuning for HS200, and otherwise
> call back to the standard sdhci_execute_tuning().
>
> Signed-off-by: Daniel Kurtz 
Looks good.

Acked-by: Shyam Sundar S K 
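
The wrapper approach described in the quoted commit message can be sketched roughly as follows. This is a toy model, not the driver code: the enum, the struct and all function names are hypothetical stand-ins, and the counters exist only so the dispatch can be observed.

```c
/* Hypothetical stand-ins for the MMC timing modes mentioned in the thread. */
enum timing {
	TIMING_DDR50,
	TIMING_HS200,
};

/* Toy host state; the counters only record which tuning path ran. */
struct host {
	enum timing timing;
	int amd_tuned;
	int std_tuned;
};

/* Stand-in for the standard sdhci_execute_tuning() path. */
int standard_execute_tuning(struct host *h)
{
	h->std_tuned++;
	return 0;
}

/* Stand-in for the AMD-specific HS200 tuning routine. */
int amd_hs200_tuning(struct host *h)
{
	h->amd_tuned++;
	return 0;
}

/* The wrapper: AMD tuning only for HS200, standard tuning for everything else. */
int amd_execute_tuning(struct host *h)
{
	if (h->timing == TIMING_HS200)
		return amd_hs200_tuning(h);
	return standard_execute_tuning(h);
}
```

The point of the design is that the decision moves into the wrapper, so a DDR50 SD card no longer reaches the HS200-only routine through an unconditional platform hook.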




Re: [PATCH bpf-next v8 05/11] seccomp,landlock: Enforce Landlock programs per process hierarchy

2018-04-09 Thread Alexei Starovoitov
On Mon, Apr 09, 2018 at 12:01:59AM +0200, Mickaël Salaün wrote:
> 
> On 04/08/2018 11:06 PM, Andy Lutomirski wrote:
> > On Sun, Apr 8, 2018 at 6:13 AM, Mickaël Salaün  wrote:
> >>
> >> On 02/27/2018 10:48 PM, Mickaël Salaün wrote:
> >>>
> >>> On 27/02/2018 17:39, Andy Lutomirski wrote:
>  On Tue, Feb 27, 2018 at 5:32 AM, Alexei Starovoitov
>   wrote:
> > On Tue, Feb 27, 2018 at 05:20:55AM +, Andy Lutomirski wrote:
> >> On Tue, Feb 27, 2018 at 4:54 AM, Alexei Starovoitov
> >>  wrote:
> >>> On Tue, Feb 27, 2018 at 04:40:34AM +, Andy Lutomirski wrote:
>  On Tue, Feb 27, 2018 at 2:08 AM, Alexei Starovoitov
>   wrote:
> > On Tue, Feb 27, 2018 at 01:41:15AM +0100, Mickaël Salaün wrote:
> >> The seccomp(2) syscall can be used by a task to apply a Landlock program
> >> to itself. As a seccomp filter, a Landlock program is enforced for the
> >> current task and all its future children. A program is immutable and a
> >> task can only add new restricting programs to itself, forming a list of
> >> programs.
> >>
> >> A Landlock program is tied to a Landlock hook. If the action on a kernel
> >> object is allowed by the other Linux security mechanisms (e.g. DAC,
> >> capabilities, other LSM), then a Landlock hook related to this kind of
> >> object is triggered. The list of programs for this hook is then
> >> evaluated. Each program returns a 32-bit value which can deny the action
> >> on a kernel object with a non-zero value. If every program of the list
> >> returns zero, then the action on the object is allowed.
> >>
> >> Multiple Landlock programs can be chained to share a 64-bit value for a
> >> call chain (e.g. evaluating multiple elements of a file path).  This
> >> chaining is restricted when a process constructs this chain by loading a
> >> program, but additional checks are performed when it requests to apply
> >> this chain of programs to itself.  The restrictions ensure that it is
> >> not possible to call multiple programs in a way that would imply
> >> handling multiple shared values (i.e. cookies) for one chain.  For now,
> >> only a fs_pick program can be chained to the same type of program,
> >> because it may make sense if they have different triggers (cf. next
> >> commits).  These restrictions still allow reusing Landlock programs in
> >> a safe way (e.g. use the same loaded fs_walk program with multiple
> >> chains of fs_pick programs).
> >>
> >> Signed-off-by: Mickaël Salaün 
> >
> > ...
> >
> >> +struct landlock_prog_set *landlock_prepend_prog(
> >> + struct landlock_prog_set *current_prog_set,
> >> + struct bpf_prog *prog)
> >> +{
> >> + struct landlock_prog_set *new_prog_set = current_prog_set;
> >> + unsigned long pages;
> >> + int err;
> >> + size_t i;
> >> + struct landlock_prog_set tmp_prog_set = {};
> >> +
> >> + if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK)
> >> + return ERR_PTR(-EINVAL);
> >> +
> >> + /* validate memory size allocation */
> >> + pages = prog->pages;
> >> + if (current_prog_set) {
> >> + size_t i;
> >> +
> >> + for (i = 0; i < 
> >> ARRAY_SIZE(current_prog_set->programs); i++) {
> >> + struct landlock_prog_list *walker_p;
> >> +
> >> + for (walker_p = 
> >> current_prog_set->programs[i];
> >> + walker_p; walker_p = 
> >> walker_p->prev)
> >> + pages += walker_p->prog->pages;
> >> + }
> >> + /* count a struct landlock_prog_set if we need to 
> >> allocate one */
> >> + if (refcount_read(_prog_set->usage) != 1)
> >> + pages += round_up(sizeof(*current_prog_set), 
> >> PAGE_SIZE)
> >> + / PAGE_SIZE;
> >> + }
> >> + if (pages > LANDLOCK_PROGRAMS_MAX_PAGES)
> >> + return ERR_PTR(-E2BIG);
> >> +
> >> + /* ensure early that we can allocate enough memory for the 
> >> new
> >> +  * prog_lists */
> >> + err = store_landlock_prog(_prog_set, current_prog_set, 
> >> prog);
> >> + if (err)
> >> + 
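
The evaluation rule stated in the quoted commit message (any non-zero return denies, all zeros allow) can be sketched as below. This is a user-space illustration only: the function-pointer type and the helper names are hypothetical and do not reflect how the patch actually stores or runs eBPF programs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one program attached to a hook, returning a 32-bit verdict. */
typedef uint32_t (*landlock_prog_fn)(const void *ctx);

/*
 * Evaluate every program attached to a hook. The action is allowed only if
 * all programs return zero; the first non-zero value denies it.
 */
bool hook_allows(const landlock_prog_fn *progs, size_t n, const void *ctx)
{
	for (size_t i = 0; i < n; i++) {
		if (progs[i](ctx) != 0)
			return false;
	}
	return true;
}

/* Two trivial example programs for demonstration. */
uint32_t prog_allow(const void *ctx)
{
	(void)ctx;
	return 0;
}

uint32_t prog_deny(const void *ctx)
{
	(void)ctx;
	return 1;
}
```

Because a task can only add programs, each addition can only shrink the set of allowed actions, which is the monotonic restriction the description relies on.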

Re: [PATCH v5 0/6] enable creating [k,u]probe with perf_event_open

2018-04-09 Thread Ravi Bangoria
Hi Song,

On 12/07/2017 04:15 AM, Song Liu wrote:
> With current kernel, user space tools can only create/destroy [k,u]probes
> with a text-based API (kprobe_events and uprobe_events in tracefs). This
> approach relies on user space to clean up the [k,u]probe after using them.
> However, this is not easy for user space to clean up properly.
>
> To solve this problem, we introduce a file descriptor based API.
> Specifically, we extended perf_event_open to create [k,u]probe, and attach
> this [k,u]probe to the file descriptor created by perf_event_open. These
> [k,u]probe are associated with this file descriptor, so they are not
> available in tracefs.

Sorry for being late. One simple question..

Will it be good to support k/uprobe arguments with perf_event_open()?
Do you have any plans about that?

Thanks,
Ravi



Re: [PATCH 5/7] arm64: dts: msm8996: Add rpmpd device node

2018-04-09 Thread Rajendra Nayak


On 04/09/2018 09:33 PM, Stephen Boyd wrote:
> Quoting Rajendra Nayak (2018-03-15 21:08:22)
>> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
>> index 0a6f7952bbb1..43757a078146 100644
>> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
>> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
>> @@ -297,6 +297,52 @@
>> #clock-cells = <1>;
>> };
>>  
>> +   rpmpd: qcom,rpmpd {
> 
> power-controller? power-domain-controller? power-domains? Or something
> like that.
> 
>> +   compatible = "qcom,rpmpd-msm8996";
>> +   #power-domain-cells = <1>;
>> +   operating-points-v2 = <_opp_table>, /* cx */
>> +                         <_opp_table>, /* cx_ao */
>> +                         <_opp_table>, /* cx_vfc */
>> +                         <_opp_table>, /* mx */
>> +                         <_opp_table>, /* mx_ao */
>> +                         <_opp_table>, /* sscx */
>> +                         <_opp_table>; /* sscx_vfc */
>> +   };
>> +
>> +   rpmpd_opp_table: opp-table {
> 
> This should go into the root of the tree? Otherwise it may be populated
> by the RPMh platform populate code which would be odd. We should go and
> update the platform populate code to always ignore operating-points-v2
> compatible nodes too.
> 
>> +   compatible = "operating-points-v2", "operating-points-v2-qcom";
> 
> This is backwards? I thought more specific compatible went first.

Thanks for the review; I will fix all of these up when I respin.


-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH] f2fs: enlarge block plug coverage

2018-04-09 Thread Jaegeuk Kim
On 04/10, Chao Yu wrote:
> On 2018/4/10 2:02, Jaegeuk Kim wrote:
> > On 04/08, Chao Yu wrote:
> >> On 2018/4/5 11:51, Jaegeuk Kim wrote:
> >>> On 04/04, Chao Yu wrote:
>  This patch enlarges block plug coverage in __issue_discard_cmd, in
>  order to collect more pending bios before issuing them, to avoid
>  being disturbed by previous discard I/O in IO aware discard mode.
> >>>
> >>> Hmm, then we need to wait for huge discard IO for over 10 secs, which
> >>
> >> We found that total discard latency depends on the number of discards we
> >> issued last time, rather than on the range or length the discards covered.
> >> IMO, if we don't change the .max_requests value, we will not suffer longer
> >> latency.
> >>
> >>> will affect following read/write IOs accordingly. In order to avoid that,
> >>> we actually need to limit the discard size.
> 
> Do you mean limit discard count or discard length?

Both of them.

> 
> >>
> >> If you are worried about I/O interference between discard and rw, I
> >> suggest decreasing the .max_requests value.
> > 
> > What do you mean? This will produce more pending requests in the queue?
> 
> I mean that after applying this patch, we can queue more discard IOs in the
> plug inside the task; otherwise, previously issued discards in the block layer
> can make is_idle() return false, which stops the IO-aware user from issuing
> pending discard commands.

Then, unplug will issue lots of discard commands, which affects the following rw
latencies. My preference would be issuing discard commands one by one as much as
possible.

> 
> Thanks,
> 
> > 
> >>
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
> 
>  Signed-off-by: Chao Yu 
>  ---
>   fs/f2fs/segment.c | 7 +--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
>  diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
>  index 8f0b5ba46315..4287e208c040 100644
>  --- a/fs/f2fs/segment.c
>  +++ b/fs/f2fs/segment.c
>  @@ -1208,10 +1208,12 @@ static int __issue_discard_cmd(struct 
>  f2fs_sb_info *sbi,
>   pend_list = >pend_list[i];
>   
>   mutex_lock(>cmd_lock);
>  +
>  +blk_start_plug();
>  +
>   if (list_empty(pend_list))
>   goto next;
>   f2fs_bug_on(sbi, !__check_rb_tree_consistence(sbi, 
>  >root));
>  -blk_start_plug();
>   list_for_each_entry_safe(dc, tmp, pend_list, list) {
>   f2fs_bug_on(sbi, dc->state != D_PREP);
>   
>  @@ -1227,8 +1229,9 @@ static int __issue_discard_cmd(struct f2fs_sb_info 
>  *sbi,
>   if (++iter >= dpolicy->max_requests)
>   break;
>   }
>  -blk_finish_plug();
>   next:
>  +blk_finish_plug();
>  +
>   mutex_unlock(>cmd_lock);
>   
>   if (iter >= dpolicy->max_requests)
>  -- 
>  2.15.0.55.gc2ece9dc4de6
> >>>
> >>> .
> >>>
> > 
> > .
> > 


Re: [PATCH] dmaengine: dmatest: Remove use of VLAs

2018-04-09 Thread kbuild test robot
Hi Laura,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.16 next-20180409]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/Laura-Abbott/dmaengine-dmatest-Remove-use-of-VLAs/20180410-094633
config: i386-randconfig-x076-201814 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All warnings (new ones prefixed by >>):

   Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_write
   Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:set_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:constant_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:variable_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls
   Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u32
   Cyclomatic Complexity 3 include/linux/log2.h:is_power_of_2
   Cyclomatic Complexity 1 include/linux/list.h:INIT_LIST_HEAD
   Cyclomatic Complexity 1 include/linux/list.h:__list_add_valid
   Cyclomatic Complexity 1 include/linux/list.h:__list_del_entry_valid
   Cyclomatic Complexity 2 include/linux/list.h:__list_add
   Cyclomatic Complexity 1 include/linux/list.h:list_add_tail
   Cyclomatic Complexity 1 include/linux/list.h:__list_del
   Cyclomatic Complexity 2 include/linux/list.h:__list_del_entry
   Cyclomatic Complexity 1 include/linux/list.h:list_del
   Cyclomatic Complexity 1 include/linux/err.h:IS_ERR
   Cyclomatic Complexity 1 arch/x86/include/asm/current.h:get_current
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_read
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_inc
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_dec_and_test
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_read
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_dec_and_test
   Cyclomatic Complexity 1 include/asm-generic/getorder.h:__get_order
   Cyclomatic Complexity 3 include/linux/bitmap.h:bitmap_zero
   Cyclomatic Complexity 1 include/linux/jiffies.h:_msecs_to_jiffies
   Cyclomatic Complexity 3 include/linux/jiffies.h:msecs_to_jiffies
   Cyclomatic Complexity 70 include/linux/ktime.h:ktime_divns
   Cyclomatic Complexity 1 include/linux/ktime.h:ktime_to_us
   Cyclomatic Complexity 1 include/linux/mmzone.h:pfn_to_section_nr
   Cyclomatic Complexity 2 include/linux/mmzone.h:__nr_to_section
   Cyclomatic Complexity 1 include/linux/mmzone.h:__section_mem_map_addr
   Cyclomatic Complexity 1 include/linux/mmzone.h:__pfn_to_section
   Cyclomatic Complexity 1 include/linux/kobject.h:kobject_name
   Cyclomatic Complexity 2 include/linux/device.h:dev_name
   Cyclomatic Complexity 1 include/linux/dma-debug.h:debug_dma_map_page
   Cyclomatic Complexity 1 include/linux/dma-debug.h:debug_dma_mapping_error
   Cyclomatic Complexity 1 include/linux/dma-mapping.h:valid_dma_direction
   Cyclomatic Complexity 1 arch/x86/include/asm/dma-mapping.h:get_arch_dma_ops
   Cyclomatic Complexity 4 include/linux/dma-mapping.h:get_dma_ops
   Cyclomatic Complexity 1 include/linux/dma-mapping.h:dma_map_page_attrs
   Cyclomatic Complexity 2 include/linux/dma-mapping.h:dma_mapping_error
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_submit_error
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_chan_name
   Cyclomatic Complexity 2 include/linux/dmaengine.h:dmaengine_terminate_all
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dmaf_continue
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dmaf_p_disabled_continue
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_dev_has_pq_continue
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_dev_to_maxpq
   Cyclomatic Complexity 4 include/linux/dmaengine.h:dma_maxpq
   Cyclomatic Complexity 1 include/linux/dmaengine.h:__dma_cap_set
   Cyclomatic Complexity 1 include/linux/dmaengine.h:__dma_cap_zero
   Cyclomatic Complexity 2 include/linux/dmaengine.h:__dma_has_cap
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_async_issue_pending
   Cyclomatic Complexity 3 include/linux/dmaengine.h:dma_async_is_tx_complete
   Cyclomatic Complexity 2 include/linux/freezer.h:freezing
   Cyclomatic Complexity 2 include/linux/freezer.h:try_to_freeze_unsafe
   Cyclomatic Complexity 2 include/linux/freezer.h:try_to_freeze
   Cyclomatic Complexity 1 include/linux/kasan.h:kasan_kmalloc
   Cyclomatic Complexity 28 include/linux/slab.h:kmalloc_index
   Cyclomatic Complexity 1 include/linux/slab.h:kmem_cache_alloc_trace
   Cyclomatic Complexity 1 include/linux/slab.h:kmalloc_order_trace
   Cyclomatic Complexity 67 include/linux/slab.h:kmalloc_large
   Cyclomatic Complexity 5 include/linux/slab.h:k

Re: [PATCH] f2fs: don't use GFP_ZERO for page caches

2018-04-09 Thread Jaegeuk Kim
On 04/10, Chao Yu wrote:
> On 2018/4/10 3:00, Jaegeuk Kim wrote:
> > From: Chao Yu 
> > 
> > Related to https://lkml.org/lkml/2018/4/8/661
> > 
> > Sometimes, we need to write meta data to new allocated block address,
> > then we will allocate a zeroed page in inner inode's address space, and
> > fill partial data in it, and leave other place with zero value which means
> > some fields are initial status.
> > 
> > There are two inner inodes (meta inode and node inode) setting __GFP_ZERO,
> > I have just checked them, for both of them, we can avoid using __GFP_ZERO,
> > and do initialization by ourselves to avoid unneeded/redundant zeroing
> > from mm.
> > 
> > Cc: 
> > Signed-off-by: Chao Yu 
> > Signed-off-by: Jaegeuk Kim 
> > ---
> >  fs/f2fs/inode.c    | 4 ++--
> >  fs/f2fs/node.c     | 6 ++++--
> >  fs/f2fs/node.h     | 7 ++-----
> >  fs/f2fs/recovery.c | 3 +--
> >  4 files changed, 9 insertions(+), 11 deletions(-)
> > 
> > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > index 417c9dcd0269..87535bf63421 100644
> > --- a/fs/f2fs/inode.c
> > +++ b/fs/f2fs/inode.c
> > @@ -320,10 +320,10 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
> >  make_now:
> > if (ino == F2FS_NODE_INO(sbi)) {
> > inode->i_mapping->a_ops = &f2fs_node_aops;
> > -   mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> > +   mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
> > } else if (ino == F2FS_META_INO(sbi)) {
> > inode->i_mapping->a_ops = &f2fs_meta_aops;
> > -   mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> > +   mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
> > } else if (S_ISREG(inode->i_mode)) {
> > inode->i_op = &f2fs_file_inode_operations;
> > inode->i_fop = &f2fs_file_operations;
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index 9a99243054ba..6fc3311820ec 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1096,7 +1096,8 @@ struct page *new_node_page(struct dnode_of_data *dn, unsigned int ofs)
> > set_node_addr(sbi, &new_ni, NEW_ADDR, false);
> >  
> > f2fs_wait_on_page_writeback(page, NODE, true);
> > -   fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs, true);
> > +   memset(F2FS_NODE(page), 0, PAGE_SIZE);
> > +   fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs);
> > set_cold_node(page, S_ISDIR(dn->inode->i_mode));
> > if (!PageUptodate(page))
> > SetPageUptodate(page);
> > @@ -2311,7 +2312,8 @@ int recover_inode_page(struct f2fs_sb_info *sbi, struct page *page)
> >  
> > if (!PageUptodate(ipage))
> > SetPageUptodate(ipage);
> > -   fill_node_footer(ipage, ino, ino, 0, true);
> > +   memset(F2FS_NODE(page), 0, PAGE_SIZE);
> 
> At a glance, should be memset(F2FS_NODE(ipage), 0, PAGE_SIZE);

Actually, we don't need to do this, since fill_node_footer(true) will reset the
page.
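
For reference, the reset behaviour mentioned above can be modelled in user
space roughly as follows. This is a simplified sketch under stated
assumptions, not the f2fs code: struct node_block_model,
fill_footer_model(), the 4096-byte block size and the shift/mask values are
hypothetical names and numbers picked for illustration, and the endianness
conversion and the remaining footer fields are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_BLOCK_SIZE 4096u	/* assumed block size for this sketch */
#define MODEL_FLAG_SHIFT 3u	/* assumed position of the offset inside flag */
#define MODEL_FLAG_MASK  0x07u	/* assumed low bits carried over from old flag */

/* Hypothetical, reduced footer: only the fields this sketch touches. */
struct node_footer_model {
	uint32_t nid;
	uint32_t ino;
	uint32_t flag;
};

/* One node block: payload followed by the footer, MODEL_BLOCK_SIZE in total. */
struct node_block_model {
	uint8_t payload[MODEL_BLOCK_SIZE - sizeof(struct node_footer_model)];
	struct node_footer_model footer;
};

/*
 * With reset == true the whole block is zeroed before the footer is written,
 * so a separate memset of the page by the caller would be redundant.
 * With reset == false the payload is kept and the low flag bits survive.
 */
static void fill_footer_model(struct node_block_model *rn, uint32_t nid,
			      uint32_t ino, uint32_t ofs, bool reset)
{
	uint32_t old_flag = 0;

	if (reset)
		memset(rn, 0, sizeof(*rn));
	else
		old_flag = rn->footer.flag;

	rn->footer.nid = nid;
	rn->footer.ino = ino;
	rn->footer.flag = (ofs << MODEL_FLAG_SHIFT) |
			  (old_flag & MODEL_FLAG_MASK);
}
```

Calling fill_footer_model(&blk, nid, ino, ofs, true) on a block that holds
stale bytes leaves the payload all zero, which is the property the reply
relies on.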

> 
> Sorry about that.
> 
> Thanks,
> 
> > +   fill_node_footer(ipage, ino, ino, 0);
> > set_cold_node(page, false);
> >  
> > src = F2FS_INODE(page);
> > diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
> > index b95e49e4a928..42cd081114ab 100644
> > --- a/fs/f2fs/node.h
> > +++ b/fs/f2fs/node.h
> > @@ -263,15 +263,12 @@ static inline block_t next_blkaddr_of_node(struct page *node_page)
> >  }
> >  
> >  static inline void fill_node_footer(struct page *page, nid_t nid,
> > -   nid_t ino, unsigned int ofs, bool reset)
> > +   nid_t ino, unsigned int ofs)
> >  {
> > struct f2fs_node *rn = F2FS_NODE(page);
> > unsigned int old_flag = 0;
> >  
> > -   if (reset)
> > -   memset(rn, 0, sizeof(*rn));
> > -   else
> > -   old_flag = le32_to_cpu(rn->footer.flag);
> > +   old_flag = le32_to_cpu(rn->footer.flag);
> >  
> > rn->footer.nid = cpu_to_le32(nid);
> > rn->footer.ino = cpu_to_le32(ino);
> > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > index 1b23d3febe4c..de24f3247aa5 100644
> > --- a/fs/f2fs/recovery.c
> > +++ b/fs/f2fs/recovery.c
> > @@ -540,8 +540,7 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode,
> > }
> >  
> > copy_node_footer(dn.node_page, page);
> > -   fill_node_footer(dn.node_page, dn.nid, ni.ino,
> > -   ofs_of_node(page), false);
> > +   fill_node_footer(dn.node_page, dn.nid, ni.ino, ofs_of_node(page));
> > set_page_dirty(dn.node_page);
> >  err:
> > f2fs_put_dnode(&dn);
> > 


Re: [PATCH v1] ringbuffer: Don't choose the process with adj equal OOM_SCORE_ADJ_MIN

2018-04-09 Thread Zhaoyang Huang
On Tue, Apr 10, 2018 at 11:12 AM, Steven Rostedt  wrote:
> On Tue, 10 Apr 2018 10:32:36 +0800
> Zhaoyang Huang  wrote:
>
>> For the scenario below, process A has no intention to exhaust the
>> memory, but is likely to be selected by the OOM killer, for we set
>> OOM_SCORE_ADJ_MIN for it.
>> process A(-1000)  process B
>>
>>   i = si_mem_available();
>>if (i < nr_pages)
>>return -ENOMEM;
>>schedule
>> --->
>> allocate huge memory
>> <-
>> if (user_thread)
>>   set_current_oom_origin();
>>
>>   for (i = 0; i < nr_pages; i++) {
>>  bpage = kzalloc_node
>
> Is this really an issue though?
>
> Seriously, do you think you will ever hit this?
>
> How often do you increase the size of the ftrace ring buffer? For this
> to be an issue, the system has to trigger an OOM at the exact moment
> you decide to increase the size of the ring buffer. That would be an
> impressive attack, with little to gain.
>
> Ask the memory management people. If they think this could be a
> problem, then I'll be happy to take your patch.
>
> -- Steve
Adding Michael for review.
Hi Michael,
I would like to suggest to Steve that a process with adj = -1000
(OOM_SCORE_ADJ_MIN) should NOT be marked when setting the user space
process as a potential victim of OOM. Steve doubts that the scenario can
happen. In my opinion, we should NOT break the original concept of OOM,
that is, the OOM killer would not select a -1000 process unless the
process configures it itself.
With regard to the possibility, on memory-hungry systems such as
Android on mobile phones, there are different kinds of user behavior
or test scripts that attack or verify the stability of the system. So I
suggest we keep every corner case safe. Would you please
comment on that? Thanks.
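
The check proposed in this thread can be sketched in user space C as shown
below. This is only an illustration under stated assumptions:
struct task_model and may_set_oom_origin() are hypothetical names that do
not exist in the kernel, and the real decision would sit in front of
set_current_oom_origin() in the ring buffer allocation path.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as OOM_SCORE_ADJ_MIN in include/uapi/linux/oom.h. */
#define OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical stand-in for the few task attributes the check needs. */
struct task_model {
	bool user_thread;	/* true for a user space task */
	int oom_score_adj;	/* value from /proc/<pid>/oom_score_adj */
};

/*
 * Returns true when the task may be flagged as the OOM origin before the
 * ring buffer pages are allocated. Kernel threads are never flagged, and
 * neither is a task that opted out of OOM killing with adj == -1000.
 */
static bool may_set_oom_origin(const struct task_model *t)
{
	if (!t->user_thread)
		return false;
	return t->oom_score_adj != OOM_SCORE_ADJ_MIN;
}
```

With this rule, process A from the scenario above (adj = -1000) keeps its
protection even if process B allocates a large amount of memory between
the si_mem_available() check and the page allocations.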


Re: [PATCH 5/7] arm64: dts: msm8996: Add rpmpd device node

2018-04-09 Thread Viresh Kumar
On 09-04-18, 09:03, Stephen Boyd wrote:
> We should go and
> update the platform populate code to always ignore operating-points-v2
> compatible nodes too.

Will do that.

-- 
viresh


Re: [PATCH] f2fs: don't use GFP_ZERO for page caches

2018-04-09 Thread Chao Yu
On 2018/4/10 3:00, Jaegeuk Kim wrote:
> From: Chao Yu 
> 
> Related to https://lkml.org/lkml/2018/4/8/661
> 
> Sometimes, we need to write meta data to new allocated block address,
> then we will allocate a zeroed page in inner inode's address space, and
> fill partial data in it, and leave other place with zero value which means
> some fields are initial status.
> 
> There are two inner inodes (meta inode and node inode) setting __GFP_ZERO,
> I have just checked them, for both of them, we can avoid using __GFP_ZERO,
> and do initialization by ourselves to avoid unneeded/redundant zeroing
> from mm.
> 
> Cc: 
> Signed-off-by: Chao Yu 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/inode.c    | 4 ++--
>  fs/f2fs/node.c     | 6 ++++--
>  fs/f2fs/node.h     | 7 ++-----
>  fs/f2fs/recovery.c | 3 +--
>  4 files changed, 9 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> index 417c9dcd0269..87535bf63421 100644
> --- a/fs/f2fs/inode.c
> +++ b/fs/f2fs/inode.c
> @@ -320,10 +320,10 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
>  make_now:
>   if (ino == F2FS_NODE_INO(sbi)) {
>   inode->i_mapping->a_ops = &f2fs_node_aops;
> - mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> + mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
>   } else if (ino == F2FS_META_INO(sbi)) {
>   inode->i_mapping->a_ops = &f2fs_meta_aops;
> - mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> + mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
>   } else if (S_ISREG(inode->i_mode)) {
>   inode->i_op = &f2fs_file_inode_operations;
>   inode->i_fop = &f2fs_file_operations;
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 9a99243054ba..6fc3311820ec 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1096,7 +1096,8 @@ struct page *new_node_page(struct dnode_of_data *dn, unsigned int ofs)
>   set_node_addr(sbi, &new_ni, NEW_ADDR, false);
>  
>   f2fs_wait_on_page_writeback(page, NODE, true);
> - fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs, true);
> + memset(F2FS_NODE(page), 0, PAGE_SIZE);
> + fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs);
>   set_cold_node(page, S_ISDIR(dn->inode->i_mode));
>   if (!PageUptodate(page))
>   SetPageUptodate(page);
> @@ -2311,7 +2312,8 @@ int recover_inode_page(struct f2fs_sb_info *sbi, struct page *page)
>  
>   if (!PageUptodate(ipage))
>   SetPageUptodate(ipage);
> - fill_node_footer(ipage, ino, ino, 0, true);
> + memset(F2FS_NODE(page), 0, PAGE_SIZE);

At a glance, should be memset(F2FS_NODE(ipage), 0, PAGE_SIZE);

Sorry about that.

Thanks,

> + fill_node_footer(ipage, ino, ino, 0);
>   set_cold_node(page, false);
>  
>   src = F2FS_INODE(page);
> diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
> index b95e49e4a928..42cd081114ab 100644
> --- a/fs/f2fs/node.h
> +++ b/fs/f2fs/node.h
> @@ -263,15 +263,12 @@ static inline block_t next_blkaddr_of_node(struct page *node_page)
>  }
>  
>  static inline void fill_node_footer(struct page *page, nid_t nid,
> - nid_t ino, unsigned int ofs, bool reset)
> + nid_t ino, unsigned int ofs)
>  {
>   struct f2fs_node *rn = F2FS_NODE(page);
>   unsigned int old_flag = 0;
>  
> - if (reset)
> - memset(rn, 0, sizeof(*rn));
> - else
> - old_flag = le32_to_cpu(rn->footer.flag);
> + old_flag = le32_to_cpu(rn->footer.flag);
>  
>   rn->footer.nid = cpu_to_le32(nid);
>   rn->footer.ino = cpu_to_le32(ino);
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index 1b23d3febe4c..de24f3247aa5 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -540,8 +540,7 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode,
>   }
>  
>   copy_node_footer(dn.node_page, page);
> - fill_node_footer(dn.node_page, dn.nid, ni.ino,
> - ofs_of_node(page), false);
> + fill_node_footer(dn.node_page, dn.nid, ni.ino, ofs_of_node(page));
>   set_page_dirty(dn.node_page);
>  err:
>   f2fs_put_dnode(&dn);
> 



Re: [f2fs-dev] [PATCH v3] f2fs: don't use GFP_ZERO for page caches

2018-04-09 Thread Jaegeuk Kim
Change log from v2:
 - consider IO error case when dealing with metapage
 - memset by fill_node_footer

Change log from v1:
 - don't memset for recovered page
 
Related to https://lkml.org/lkml/2018/4/8/661

Sometimes, we need to write meta data to new allocated block address,
then we will allocate a zeroed page in inner inode's address space, and
fill partial data in it, and leave other place with zero value which means
some fields are initial status.

There are two inner inodes (meta inode and node inode) setting __GFP_ZERO,
I have just checked them, for both of them, we can avoid using __GFP_ZERO,
and do initialization by ourselves to avoid unneeded/redundant zeroing
from mm.

Cc: 
Signed-off-by: Chao Yu 
Signed-off-by: Jaegeuk Kim 
---
 fs/f2fs/checkpoint.c | 4 +++-
 fs/f2fs/inode.c      | 4 ++--
 fs/f2fs/segment.c    | 3 +++
 fs/f2fs/segment.h    | 1 +
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index bf779461df13..2e23b953d304 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -100,8 +100,10 @@ static struct page *__get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index,
 * readonly and make sure do not write checkpoint with non-uptodate
 * meta page.
 */
-   if (unlikely(!PageUptodate(page)))
+   if (unlikely(!PageUptodate(page))) {
+   memset(page_address(page), 0, PAGE_SIZE);
f2fs_stop_checkpoint(sbi, false);
+   }
 out:
return page;
 }
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 417c9dcd0269..87535bf63421 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -320,10 +320,10 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
 make_now:
if (ino == F2FS_NODE_INO(sbi)) {
inode->i_mapping->a_ops = &f2fs_node_aops;
-   mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
+   mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
} else if (ino == F2FS_META_INO(sbi)) {
inode->i_mapping->a_ops = &f2fs_meta_aops;
-   mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
+   mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
} else if (S_ISREG(inode->i_mode)) {
inode->i_op = &f2fs_file_inode_operations;
inode->i_fop = &f2fs_file_operations;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index a4b8e3e24ccb..1f5db557ab96 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2021,6 +2021,7 @@ static void write_current_sum_page(struct f2fs_sb_info *sbi,
 	struct f2fs_summary_block *dst;
 
 	dst = (struct f2fs_summary_block *)page_address(page);
+	memset(dst, 0, PAGE_SIZE);
 
 	mutex_lock(&curseg->curseg_mutex);
 
@@ -3117,6 +3118,7 @@ static void write_compacted_summaries(struct f2fs_sb_info *sbi, block_t blkaddr)
 
 	page = grab_meta_page(sbi, blkaddr++);
 	kaddr = (unsigned char *)page_address(page);
+	memset(kaddr, 0, PAGE_SIZE);
 
 	/* Step 1: write nat cache */
 	seg_i = CURSEG_I(sbi, CURSEG_HOT_DATA);
@@ -3141,6 +3143,7 @@ static void write_compacted_summaries(struct f2fs_sb_info *sbi, block_t blkaddr)
 			if (!page) {
 				page = grab_meta_page(sbi, blkaddr++);
 				kaddr = (unsigned char *)page_address(page);
+				memset(kaddr, 0, PAGE_SIZE);
 				written_size = 0;
 			}
 			summary = (struct f2fs_summary *)(kaddr + written_size);
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 3325d0769723..492ad0c86fa9 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -375,6 +375,7 @@ static inline void seg_info_to_sit_page(struct f2fs_sb_info *sbi,
 	int i;
 
 	raw_sit = (struct f2fs_sit_block *)page_address(page);
+	memset(raw_sit, 0, PAGE_SIZE);
 	for (i = 0; i < end - start; i++) {
 		rs = &raw_sit->entries[i];
 		se = get_seg_entry(sbi, start + i);
-- 
2.15.0.531.g2ccb3012c9-goog
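
To illustrate the idea in the commit message above, here is a minimal
user-space sketch of zeroing a page explicitly before filling in part of it,
rather than relying on the allocator to hand back zeroed memory. All names
(sketch_grab_page, sketch_fill_block, struct sketch_meta_block,
SKETCH_PAGE_SIZE) are hypothetical stand-ins and not f2fs code; malloc() plus
a 0xAA fill only simulates a page that still holds stale contents.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* stand-in for the kernel's PAGE_SIZE */

/* Hypothetical on-disk layout: a few used fields, the rest must stay zero. */
struct sketch_meta_block {
	uint32_t entries[4];
	uint8_t reserved[SKETCH_PAGE_SIZE - 4 * sizeof(uint32_t)];
};

/* Stand-in for grabbing a page that the allocator did NOT zero for us. */
static void *sketch_grab_page(void)
{
	void *page = malloc(SKETCH_PAGE_SIZE);

	if (page)
		memset(page, 0xAA, SKETCH_PAGE_SIZE);	/* simulate stale data */
	return page;
}

/* Zero the whole page ourselves, then fill in only the field we own. */
static struct sketch_meta_block *sketch_fill_block(void *page, uint32_t value)
{
	struct sketch_meta_block *blk = page;

	memset(blk, 0, SKETCH_PAGE_SIZE);
	blk->entries[0] = value;
	return blk;
}
```

The point of the sketch is only that every writer which fills a page partially
has to zero it first once the mapping no longer asks for zeroed pages.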



Re: [RFC v2] virtio: support packed ring

2018-04-09 Thread Tiwei Bie
On Tue, Apr 10, 2018 at 10:55:25AM +0800, Jason Wang wrote:
> On 2018年04月01日 22:12, Tiwei Bie wrote:
> > Hello everyone,
> > 
> > This RFC implements packed ring support for virtio driver.
> > 
> > The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
> > by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
> > Minor changes are needed for the vhost code, e.g. to kick the guest.
> > 
> > TODO:
> > - Refinements and bug fixes;
> > - Split into small patches;
> > - Test indirect descriptor support;
> > - Test/fix event suppression support;
> > - Test devices other than net;
> > 
> > RFC v1 -> RFC v2:
> > - Add indirect descriptor support - compile test only;
> > - Add event suppression support - compile test only;
> > - Move vring_packed_init() out of uapi (Jason, MST);
> > - Merge two loops into one in virtqueue_add_packed() (Jason);
> > - Split vring_unmap_one() for packed ring and split ring (Jason);
> > - Avoid using '%' operator (Jason);
> > - Rename free_head -> next_avail_idx (Jason);
> > - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> > - Some other refinements and bug fixes;
> > 
> > Thanks!
> 
> Will try to review this later.
> 
> But it would be better if you could split it (more than 1000 lines is too big
> to be reviewed easily). E.g., you could at least split it into three patches:
> new structures, datapath, and event suppression.
> 

No problem! It's on my TODO list. I'll get it done in the next version.

Thanks!
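
As an illustration of the "Avoid using '%' operator" item in the quoted
changelog, here is a minimal sketch of advancing a ring index with a compare
instead of a modulo. The struct and function names are hypothetical and are
not taken from the RFC; the wrap counter is an assumption based on the packed
ring layout in the virtio specification, not on anything stated in this mail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring state; field names are illustrative, not the driver's. */
struct sketch_ring {
	uint16_t num;			/* number of descriptors in the ring */
	uint16_t next_avail_idx;	/* next slot the driver will fill */
	bool wrap_counter;		/* flips each time the index wraps */
};

/* Advance by one slot using a compare-and-reset instead of '%'. */
static void sketch_ring_advance(struct sketch_ring *r)
{
	if (++r->next_avail_idx >= r->num) {
		r->next_avail_idx = 0;
		r->wrap_counter = !r->wrap_counter;
	}
}
```

This also works when the ring size is not a power of two, where a mask could
not replace the modulo.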


Re: [PATCH v2 11/21] stack-protector: test compiler capability in Kconfig and drop AUTO mode

2018-04-09 Thread Masahiro Yamada
2018-04-10 0:04 GMT+09:00 Kees Cook :
> On Mon, Apr 9, 2018 at 1:54 AM, Masahiro Yamada
>  wrote:
>> 2018-03-28 20:18 GMT+09:00 Kees Cook :
>>> On Mon, Mar 26, 2018 at 10:29 PM, Masahiro Yamada
>>>  wrote:
>>>> diff --git a/arch/Kconfig b/arch/Kconfig
>>>> index 8e0d665..b42378d 100644
>>>> --- a/arch/Kconfig
>>>> +++ b/arch/Kconfig
>>>> @@ -535,13 +535,13 @@ config HAVE_CC_STACKPROTECTOR
>>>>         bool
>>>>         help
>>>>           An arch should select this symbol if:
>>>> -         - its compiler supports the -fstack-protector option
>>>
>>> Please leave this note: it's still valid. An arch must still have
>>> compiler support for this to be sensible.
>>>
>>
>> No.
>>
>> "its compiler supports the -fstack-protector option"
>> is tested by $(cc-option -fstack-protector)
>>
>> ARCH does not need to know the GCC support level.
>
> That's not correct: if you enable stack protector for a kernel
> architecture that doesn't have it enabled, it's unlikely for the
> resulting kernel to boot. An architecture must handle the changes that
> the compiler introduces when adding -fstack-protector (for example,
> having the stack protector canary value defined, having the failure
> function defined, handling context switches changing canaries, etc).
>



It is still hard to understand this.


When we say "its compiler supports the -fstack-protector option",
there are two possible meanings:

[1] the stack protector feature is implemented in GCC source code.

[2] -fstack-protector is recognized as a valid option in the GCC being used.
This can be tested by $(cc-option -fstack-protector)

I guess you were talking about [1], whereas I was talking about [2].
Is this correct?


Does [2] happen only after [1] happens?
Or, are they independent?

Is there a case where GCC recognizes -fstack-protector
but does not implement it?


For x86, there are cases where the option is recognized but not working.
That's why we have
scripts/gcc-x86_{32,64}-has-stack-protector.sh

Generally, if GCC accepts -fstack-protector as a valid option,
we expect "it is working".

I wonder why we need additional information about the compiler
even after $(cc-option -fstack-protector) succeeds.


This is just a matter of comment.

Can you clarify your problem?




> resulting kernel to boot. An architecture must handle the changes that
> the compiler introduces when adding -fstack-protector (for example,
> having the stack protector canary value defined, having the failure
> function defined, handling context switches changing canaries, etc).
>

All of these are talking about the kernel side implementation.
So, it is included in the following comment I am still keeping.

  - it has implemented a stack canary (e.g. __stack_chk_guard)



-- 
Best Regards
Masahiro Yamada
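
For readers following the thread, the kernel-side pieces listed in the quoted
text (a canary value, a failure function) can be pictured with a small
user-space simulation. This is only a sketch: sketch_stack_chk_guard and
sketch_stack_chk_fail are hypothetical stand-ins for what an architecture has
to provide, and the check is written by hand here, whereas the compiler emits
it automatically when -fstack-protector is in effect.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the guard value an architecture must define. */
static uintptr_t sketch_stack_chk_guard = 0x5ca1ab1eu;

/* Hypothetical stand-in for the failure hook; counts instead of panicking. */
static int sketch_failures;

static void sketch_stack_chk_fail(void)
{
	sketch_failures++;
}

/* A fake "stack frame": a buffer followed by the canary slot. */
struct sketch_frame {
	char buf[8];
	uintptr_t canary;
};

/* Copy len bytes to the start of the frame; len > 8 spills past buf. */
static void sketch_copy(const char *src, size_t len)
{
	struct sketch_frame f;

	f.canary = sketch_stack_chk_guard;	/* what a prologue would do */
	memcpy(&f, src, len);			/* caller keeps len <= sizeof(f) */
	if (f.canary != sketch_stack_chk_guard)	/* what an epilogue would do */
		sketch_stack_chk_fail();
}
```

If the guard value or the failure hook were missing, code built with the
option would have nothing to compare against or call, which is the kind of
dependency the quoted text describes.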


Re: [lkp-robot] [init, tracing] 2580d6b795: BUG:kernel_reboot-without-warning_in_boot_stage

2018-04-09 Thread Ye Xiaolong
On 04/09, Steven Rostedt wrote:
>On Tue, 10 Apr 2018 09:23:40 +0800
>Ye Xiaolong  wrote:
>
>> Hi, Steven
>> 
>> On 04/09, Steven Rostedt wrote:
>> >On Mon, 9 Apr 2018 13:32:52 +0800
>> >kernel test robot  wrote:
>> >  
>> >> FYI, we noticed the following commit (built with gcc-7):
>> >> 
>> >> commit: 2580d6b795e25879c825a0891cf67390f665b11f ("init, tracing: Have 
>> >> printk come through the trace events for initcall_debug")
>> >> url: 
>> >> https://github.com/0day-ci/linux/commits/Steven-Rostedt/init-tracing/20180407-130743
>> >> 
>> >> 
>> >> in testcase: boot
>> >> 
>> >> on test machine: qemu-system-x86_64 -enable-kvm -cpu Nehalem -smp 2 -m 
>> >> 512M
>> >> 
>> >> caused below changes (please refer to attached dmesg/kmsg for entire 
>> >> log/backtrace):
>> >> 
>> >> 
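>> >> +------------------------------------------------------------------+------------+------------+
>> >> |                                                                  | ecf6709d07 | 2580d6b795 |
>> >> +------------------------------------------------------------------+------------+------------+
>> >> | boot_successes                                                   | 0          | 0          |
>> >> | boot_failures                                                    | 8          | 8          |
>> >> | invoked_oom-killer:gfp_mask=0x                                   | 8          |            |
>> >> | Mem-Info                                                         | 8          |            |
>> >> | Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 8          |            |
>> >> | BUG:kernel_reboot-without-warning_in_boot_stage                  | 0          | 8          |
>> >> +------------------------------------------------------------------+------------+------------+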
>> >>   
>> >
>> >What does this mean?  
>> 
>> It means BUG:kernel_reboot-without-warning_in_boot_stage occurred 8 times
>> in boot tests for commit 2580d6b795, and 0 times for its parent ecf6709d07.
>
>I don't have a commit 2580d6b795.
>
>The commit with the title "init, tracing: Have printk come through the
>trace events for initcall_debug" is 4e37958d1288ce. linux-next doesn't
>have that commit sha1 either.

This commit was generated by the 0day service: it captured the patchset you
posted on LKML, applied it on top of 06dd3dfeea60, and performed build/boot
tests accordingly.

>
>
>> 
>> >  
>> >> 
>> >> 
>> >> 
>> >> [0.00] RAMDISK: [mem 0x1b7e2000-0x1ffc]
>> >> [0.00] ACPI: Early table checksum verification disabled
>> >> [0.00] ACPI: RSDP 0x000F6860 14 (v00 BOCHS )
>> >> [0.00] ACPI: RSDT 0x1FFE1628 30 (v01 BOCHS  BXPCRSDT 
>> >> 0001 BXPC 0001)
>> >> [0.00] ACPI: FACP 0x1FFE147C 74 (v01 BOCHS  BXPCFACP 
>> >> 0001 BXPC 0001)
>> >> BUG: kernel reboot-without-warning in boot stage
>> >> 
>> >> Elapsed time: 10
>> >> 
>> >> #!/bin/bash
>> >> 
>> >> 
>> >> 
>> >> To reproduce:
>> >> 
>> >> git clone https://github.com/intel/lkp-tests.git
>> >> cd lkp-tests
>> >> bin/lkp qemu -k  job-script  # job-script is attached in 
>> >> this email
>> >>   
>> >
>> >The config boots fine for me. But I don't have the setup to run the
>> >above and get it to work, nor the time to figure out why it doesn't
>> >work.  
>> 
>> Could you paste your failure log here? We can see if there is something we 
>> can help with.
>
>I tried it on a more up-to-date box, after checking out my commit with
>the title you say is an error. I compiled your config with gcc (GCC)
>7.3.1 20180130 (Red Hat 7.3.1-2), and ran the above (which did work).
>It ended after it got to a login prompt.
>
>---
>[..]
>[   10.588029] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
>[   10.589363] platform regulatory.0: Direct firmware load for regulatory.db 
>failed with error -2
>[   10.590385] cfg80211: failed to load regulatory.db
>[   10.610002] Freeing unused kernel memory: 1980K
>[   10.610533] Write protecting the kernel read-only data: 49152k
>[   10.615728] Freeing unused kernel memory: 2028K
>[   10.620754] Freeing unused kernel memory: 464K
>INIT: version 2.88 booting
>/etc/rcS.d/S00fbsetup: line 3: /sbin/modprobe: not found
>
>Please wait: booting...
>[   10.676861] rc (151) used greatest stack depth: 27848 bytes left
>Starting udev
>[   10.750281] udevd[175]: starting version 3.1.5
>[   10.759923] udevd (175) used greatest stack depth: 27696 bytes left
>[   11.12] udevadm (178) used greatest stack depth: 26696 bytes left
>Populating dev cache
>INIT: Entering runlevel: 5
>Configuring network interfaces... done.
>Starting syslogd/klogd: done
>
>Poky (Yocto Project Reference Distro) 2.1 qemux86-64 /dev/ttyS0
>
>qemux86-64 login:
>---
>
>What am I supposed to see?
>

So I figure the gap may be that the 0day bot applied your patchset to an
inappropriate base.

Thanks,
Xiaolong
>-- Steve
>


Re: [PATCH v1] ringbuffer: Don't choose the process with adj equal OOM_SCORE_ADJ_MIN

2018-04-09 Thread Steven Rostedt
On Tue, 10 Apr 2018 10:32:36 +0800
Zhaoyang Huang  wrote:

> For the scenario below, process A has no intention to exhaust the
> memory, but it is likely to be selected by the OOM killer, for we set
> OOM_SCORE_ADJ_MIN for it.
> process A(-1000)                        process B
>
>   i = si_mem_available();
>   if (i < nr_pages)
>           return -ENOMEM;
>   schedule
>                         --------------->
>                                         allocate huge memory
>                         <---------------
>   if (user_thread)
>           set_current_oom_origin();
>
>   for (i = 0; i < nr_pages; i++) {
>           bpage = kzalloc_node

Is this really an issue though?

Seriously, do you think you will ever hit this?

How often do you increase the size of the ftrace ring buffer? For this
to be an issue, the system has to trigger an OOM at the exact moment
you decide to increase the size of the ring buffer. That would be an
impressive attack, with little to gain.

Ask the memory management people. If they think this could be a
problem, then I'll be happy to take your patch.

-- Steve


Re: [PATCH v3 3/3] mm: restructure memfd code

2018-04-09 Thread Mike Kravetz
On 04/09/2018 06:41 PM, Matthew Wilcox wrote:
> On Mon, Apr 09, 2018 at 04:05:05PM -0700, Mike Kravetz wrote:
>> +/*
>> + * We need a tag: a new tag would expand every radix_tree_node by 8 bytes,
>> + * so reuse a tag which we firmly believe is never set or cleared on shmem.
>> + */
>> +#define SHMEM_TAG_PINNED        PAGECACHE_TAG_TOWRITE
> 
> Do we also firmly believe it's never used on hugetlbfs?
> 

Yes.  hugetlbfs is memory resident only with no writeback.
This comment and name should have been updated when hugetlbfs support was
added.

Also, ideally all the memfd related function names of the form shmem_* should
have been changed to memfd_* when hugetlbfs support was added.  Some of them
were changed, but not all.

I can clean all this up.  But, I would want to do it in patch 2 of the series.
That is where other cleanup such as this was done before code movement.

Will wait a little while for any additional comments before sending series
again.
-- 
Mike Kravetz


[PATCH V2] x86/boot/e820: add new chareater - to free BIOS memory in memmap bootargs

2018-04-09 Thread zoucao
This uses memmap=0x4101000-0x6aeff000 to free the BIOS reserved memory
"6aeff000-6eff : reserved":

..
0010-6aefefff : System RAM
0100-0165537a : Kernel code
0165537b-01a8873f : Kernel data
01c31000-01f4efff : Kernel bss
2800-320f : Crash kernel
6aeff000-6eff : reserved   --> it is e820 reserved memory
6f00-78240fff : System RAM
..


Add the boot argument memmap=0x4101000-0x6aeff000 to free the memory region
6aeff000-6eff; 6aeff000-6eff will then be merged into 0010-78240fff.

new iomem:
cat /proc/iomem:
..
0010-78240fff : System RAM
0100-0165537a : Kernel code
0165537b-01a8873f : Kernel data
01c31000-01f4efff : Kernel bss
..


V1 -> V2: fixed the wrong characters

zoucao (1):
  x86/boot/e820: add new chareater "-" to free BIOS memory in memmap 
bootargs

 7u/Documentation/kernel-parameters.txt | 6 ++
 7u/arch/x86/kernel/e820.c  | 3 +++
 2 files changed, 9 insertions(+)



Re: [Resend Patch 1/3] Vmbus: Add function to report available ring buffer to write in total ring size percentage

2018-04-09 Thread Martin K. Petersen

Long,

> I hope this patch set goes through SCSI, because its purpose is to
> improve storvsc.
>
> If this strategy is not possible, I can resubmit the 1st two patches to
> net, and the 3rd patch to scsi after the 1st two are merged.

Applied to my staging tree for 4.18/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH] x86/boot/e820: add new chareater "-" to free BIOS memory in memmap bootargs

2018-04-09 Thread zoucao
From: zoucao 

Normally every BIOS reserved memory region is used for some feature and we
can't use it. In some conditions, however, users can be sure that certain
BIOS reserved regions are unused and safe to free, but they have no good
way to free them. Add a new character "-" to memmap to free reserved
memory.

Signed-off-by: zou cao 
---
 7u/Documentation/kernel-parameters.txt | 6 ++
 7u/arch/x86/kernel/e820.c  | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/7u/Documentation/kernel-parameters.txt b/7u/Documentation/kernel-parameters.txt
index 9a1abb99a..dbea75e12 100644
--- a/7u/Documentation/kernel-parameters.txt
+++ b/7u/Documentation/kernel-parameters.txt
@@ -1677,6 +1677,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 or
 memmap=0x1$0x1869
 
+   memmap=nn[KMG]-ss[KMG]
+   Free E820 reserved memory, as specified by the user.
+   Region of reserved memory to be free, from ss to ss+nn.
+   Example: free reserved memory from 0x1869-0x186a
+   memmap=0x4101000-0x6aeff000
+
memory_corruption_check=0/1 [X86]
Some BIOSes seem to corrupt the first 64k of
memory when doing things like suspend/resume.
diff --git a/7u/arch/x86/kernel/e820.c b/7u/arch/x86/kernel/e820.c
index 174da5fc5..b8a042981 100644
--- a/7u/arch/x86/kernel/e820.c
+++ b/7u/arch/x86/kernel/e820.c
@@ -875,6 +875,9 @@ static int __init parse_memmap_one(char *p)
 	} else if (*p == '$') {
 		start_at = memparse(p+1, &p);
 		e820_add_region(start_at, mem_size, E820_RESERVED);
+	} else if (*p == '-') {
+		start_at = memparse(p+1, &p);
+		e820_remove_range(start_at, mem_size, E820_RESERVED, E820_RAM);
 	} else
 		e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
 
-- 
2.14.1.40.g8e62ba1
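
To make the proposed memmap=nn[KMG]-ss[KMG] syntax concrete, here is a minimal
user-space sketch of how such an argument splits into a size, a separator
character and a start address. It is not the kernel code: sketch_memparse is
a rough approximation of the kernel's memparse() built on strtoull(), and
sketch_parse_memmap and struct sketch_region are hypothetical names used only
for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* User-space approximation of memparse(): a number plus optional K/M/G. */
static uint64_t sketch_memparse(const char *s, char **end)
{
	uint64_t v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;	/* consume the suffix */
		break;
	default:
		break;
	}
	return v;
}

struct sketch_region {
	uint64_t start;
	uint64_t size;
	char kind;	/* '@', '#', '$' or the proposed '-' */
};

/* Parse "nn[KMG]<sep>ss[KMG]"; returns 0 on success, -1 on malformed input. */
static int sketch_parse_memmap(const char *arg, struct sketch_region *out)
{
	char *p;
	const char *start_str;
	uint64_t size = sketch_memparse(arg, &p);

	if (p == arg)
		return -1;	/* no size was parsed */
	if (*p != '@' && *p != '#' && *p != '$' && *p != '-')
		return -1;	/* unknown separator */

	out->kind = *p;
	out->size = size;
	start_str = p + 1;
	out->start = sketch_memparse(start_str, &p);
	if (p == start_str)
		return -1;	/* no start address was parsed */
	return 0;
}
```

With the example from the documentation hunk, memmap=0x4101000-0x6aeff000
gives a size of 0x4101000 and a start of 0x6aeff000, so the freed range ends
just below 0x6aeff000 + 0x4101000 = 0x6f000000.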



Re: [PATCH] mm: workingset: fix NULL ptr dereference

2018-04-09 Thread Minchan Kim
On Mon, Apr 09, 2018 at 07:41:52PM -0700, Matthew Wilcox wrote:
> On Tue, Apr 10, 2018 at 11:33:39AM +0900, Minchan Kim wrote:
> > @@ -522,7 +532,7 @@ EXPORT_SYMBOL(radix_tree_preload);
> >   */
> >  int radix_tree_maybe_preload(gfp_t gfp_mask)
> >  {
> > -   if (gfpflags_allow_blocking(gfp_mask))
> > +   if (gfpflags_allow_blocking(gfp_mask) && !(gfp_mask & __GFP_ZERO))
> > return __radix_tree_preload(gfp_mask, RADIX_TREE_PRELOAD_SIZE);
> > /* Preloading doesn't help anything with this gfp mask, skip it */
> > preempt_disable();
> 
> No, you've completely misunderstood what's going on in this function.

Okay, I hope this version clears up the current concerns.

>From fb37c41b90f7d3ead1798e5cb7baef76709afd94 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Tue, 10 Apr 2018 11:54:57 +0900
Subject: [PATCH v3] mm: workingset: fix NULL ptr dereference

The workingset shadow entry code relies on node->private_list of a
newly allocated radix tree node being in the list_empty state.
Currently, it is initialized in the SLAB constructor, which means a
radix tree node is initialized only when *slub allocates a new page*,
not when *slub allocates a new object*.

If some FS or subsystem passes a gfp_mask with __GFP_ZERO set, the
newly allocated node can have !list_empty(node->private_list)
because of the memset in the slab allocator. This ends in a NULL
dereference in workingset_update_node because the list_empty check fails.

This patch fixes it.

Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Cc: Johannes Weiner 
Cc: Jan Kara 
Cc: Matthew Wilcox 
Cc: Jaegeuk Kim 
Cc: Chao Yu 
Cc: Christopher Lameter 
Cc: linux-fsde...@vger.kernel.org
Cc: sta...@vger.kernel.org
Reported-by: Chris Fries 
Signed-off-by: Minchan Kim 
---
 lib/radix-tree.c | 9 +
 mm/filemap.c | 5 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index da9e10c827df..7569e637dbaa 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -470,6 +470,15 @@ static __must_check int __radix_tree_preload(gfp_t gfp_mask, unsigned nr)
struct radix_tree_node *node;
int ret = -ENOMEM;
 
+   /*
+* A newly allocated node must have node->private_list in INIT_LIST_HEAD
+* state, as required by the workingset shadow memory implementation.
+* If a user passes __GFP_ZERO by mistake, the slab allocator will clear
+* node->private_list, which triggers a BUG. Rather than oopsing,
+* just fix it up and warn about it.
+*/
+   if (WARN_ON(gfp_mask & __GFP_ZERO))
+   gfp_mask &= ~__GFP_ZERO;
/*
 * Nodes preloaded by one cgroup can be be used by another cgroup, so
 * they should never be accounted to any particular memory cgroup.
diff --git a/mm/filemap.c b/mm/filemap.c
index ab77e19ab09c..b6de9d691c8a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -786,7 +786,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
VM_BUG_ON_PAGE(!PageLocked(new), new);
VM_BUG_ON_PAGE(new->mapping, new);
 
-   error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
+   error = radix_tree_preload(gfp_mask & ~(__GFP_HIGHMEM | __GFP_ZERO));
if (!error) {
struct address_space *mapping = old->mapping;
void (*freepage)(struct page *);
@@ -842,7 +842,8 @@ static int __add_to_page_cache_locked(struct page *page,
return error;
}
 
-   error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
+   error = radix_tree_maybe_preload(gfp_mask &
+   ~(__GFP_HIGHMEM | __GFP_ZERO));
if (error) {
if (!huge)
mem_cgroup_cancel_charge(page, memcg, false);
-- 
2.17.0.484.g0c8726318c-goog
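
To illustrate the failure mode described in the commit message, here is a small user-space sketch of why zeroing an object that a constructor has already initialized breaks the list_empty() invariant. It only models the idea: the sketch_* names and the SKETCH_GFP_ZERO value are hypothetical, and sketch_node_alloc() merely simulates an allocator that hands out a pre-constructed slot.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Minimal doubly linked list head, shaped like the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for a radix tree node. */
struct sketch_node {
	unsigned int count;
	struct list_head private_list;
};

/* Hypothetical flag value; only its role (zero the object) matters here. */
#define SKETCH_GFP_ZERO 0x100u

/* An empty list head points at itself, so NULL pointers are NOT "empty". */
bool sketch_list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Runs once when the backing page is set up, like a slab constructor. */
void sketch_node_ctor(struct sketch_node *node)
{
	memset(node, 0, sizeof(*node));
	node->private_list.next = &node->private_list;
	node->private_list.prev = &node->private_list;
}

/*
 * Simulated object allocation: the constructor is not re-run here, and a
 * zeroing request wipes the self-pointers the constructor had set up.
 */
struct sketch_node *sketch_node_alloc(struct sketch_node *slot,
				      unsigned int gfp_mask)
{
	if (gfp_mask & SKETCH_GFP_ZERO)
		memset(slot, 0, sizeof(*slot));
	return slot;
}

/* The approach taken by the patch: drop the zeroing flag before allocating. */
unsigned int sketch_strip_zero(unsigned int gfp_mask)
{
	return gfp_mask & ~SKETCH_GFP_ZERO;
}
```

After a zeroing allocation, private_list.next is NULL, so the list no longer looks empty and any code that goes on to unlink the node would follow a NULL pointer. Masking the flag first keeps the constructor's state intact.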




[lkp-robot] [hugetlbfs] e979e5a059: BUG_hugetlbfs_inode_cache(Not_tainted):Objects_remaining_in_hugetlbfs_inode_cache_on__kmem_cache_shutdown()

2018-04-09 Thread kernel test robot

FYI, we noticed the following commit (built with gcc-7):

commit: e979e5a0591e70ad0b41cf876ee987de468a220e ("hugetlbfs: Convert to fs_context")
https://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git mount-context

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -smp 2 -m 512M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


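+-------------------------------------------------------------------------------------------------------------+------------+------------+
|                                                                                                             | 838d9ecc64 | e979e5a059 |
+-------------------------------------------------------------------------------------------------------------+------------+------------+
| boot_successes                                                                                              | 0          | 0          |
| boot_failures                                                                                               | 54         | 17         |
| BUG:stack_guard_page_was_hit_at#(stack_is#..#)                                                              | 54         |            |
| RIP:legacy_parse_monolithic                                                                                 | 54         |            |
| Kernel_panic-not_syncing:Fatal_exception                                                                    | 54         |            |
| BUG_hugetlbfs_inode_cache(Not_tainted):Objects_remaining_in_hugetlbfs_inode_cache_on__kmem_cache_shutdown() | 0          | 17         |
| INFO:Slab#objects=#used=#fp=#flags=                                                                         | 0          | 17         |
| INFO:Object#@offset=                                                                                        | 0          | 17         |
+-------------------------------------------------------------------------------------------------------------+------------+------------+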



[0.160565] PCI: pci_cache_line_size set to 64 bytes
[0.161260] e820: reserve RAM buffer [mem 0x0009fc00-0x0009]
[0.161969] e820: reserve RAM buffer [mem 0x1ffe-0x1fff]
[0.163220] clocksource: Switched to clocksource kvm-clock
[0.175560] =============================================================================
[0.176568] BUG hugetlbfs_inode_cache (Not tainted): Objects remaining in hugetlbfs_inode_cache on __kmem_cache_shutdown()
[0.176640] -----------------------------------------------------------------------------
[0.176640] 
[0.176640] Disabling lock debugging due to kernel taint
[0.176640] INFO: Slab 0x6376557a objects=17 used=1 fp=0x154e780a flags=0x40008100
[0.176640] CPU: 0 PID: 1 Comm: swapper Tainted: GB 4.16.0-10623-ge979e5a #1
[0.176640] Call Trace:
[0.176640]  slab_err+0xad/0xcf
[0.176640]  ? __kmem_cache_shutdown+0x93/0x301
[0.176640]  ? __need_fs_reclaim+0x5/0x4e
[0.176640]  ? prefetch_freepointer+0x5/0x14
[0.176640]  ? __kmalloc+0x122/0x1c4
[0.176640]  __kmem_cache_shutdown+0x163/0x301
[0.176640]  shutdown_cache+0x14/0xf7
[0.176640]  kmem_cache_destroy+0x15c/0x1a5
[0.176640]  init_hugetlbfs_fs+0x85/0x15c
[0.176640]  ? init_ramfs_fs+0x1f/0x1f
[0.176640]  ? set_debug_rodata+0x11/0x11
[0.176640]  do_one_initcall+0x9c/0x148
[0.176640]  kernel_init_freeable+0x11b/0x1a8
[0.176640]  ? rest_init+0x119/0x119
[0.176640]  kernel_init+0xa/0xe1
[0.176640]  ret_from_fork+0x3a/0x50
[0.176640] INFO: Object 0xe4f03853 @offset=12768
[0.190206] kmem_cache_destroy hugetlbfs_inode_cache: Slab cache still has objects
[0.191091] CPU: 0 PID: 1 Comm: swapper Tainted: GB 4.16.0-10623-ge979e5a #1
[0.192084] Call Trace:
[0.192383]  kmem_cache_destroy+0x175/0x1a5
[0.192889]  init_hugetlbfs_fs+0x85/0x15c
[0.193362]  ? init_ramfs_fs+0x1f/0x1f
[0.193809]  ? set_debug_rodata+0x11/0x11
[0.194282]  do_one_initcall+0x9c/0x148
[0.194738]  kernel_init_freeable+0x11b/0x1a8
[0.195249]  ? rest_init+0x119/0x119
[0.195673]  kernel_init+0xa/0xe1
[0.196091]  ret_from_fork+0x3a/0x50
[0.196575] pnp: PnP ACPI init
[0.197162] pnp 00:00: Plug and Play ACPI device, IDs PNP0b00 (active)
[0.198248] pnp 00:01: Plug and Play ACPI device, IDs PNP0303 (active)
[0.199306] pnp 00:02: Plug and Play ACPI device, IDs PNP0f13 (active)
[0.200357] pnp 00:03: [dma 2]


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script  # job-script is attached in this email



Thanks,
Xiaolong
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.16.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"

Re: [PATCH 0/8] hisi_sas: some misc changes

2018-04-09 Thread Martin K. Petersen

John,

> This patchset introduces some minor, more trivial patches, some of
> which have been sitting on our internal dev branch for a while.

Applied to 4.18/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [RFC v2] virtio: support packed ring

2018-04-09 Thread Jason Wang



On 2018-04-01 22:12, Tiwei Bie wrote:

Hello everyone,

This RFC implements packed ring support for virtio driver.

The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
Minor changes are needed for the vhost code, e.g. to kick the guest.

TODO:
- Refinements and bug fixes;
- Split into small patches;
- Test indirect descriptor support;
- Test/fix event suppression support;
- Test devices other than net;

RFC v1 -> RFC v2:
- Add indirect descriptor support - compile test only;
- Add event suppression support - compile test only;
- Move vring_packed_init() out of uapi (Jason, MST);
- Merge two loops into one in virtqueue_add_packed() (Jason);
- Split vring_unmap_one() for packed ring and split ring (Jason);
- Avoid using '%' operator (Jason);
- Rename free_head -> next_avail_idx (Jason);
- Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
- Some other refinements and bug fixes;
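
As a rough illustration of two changelog items above (avoiding the '%' operator and tracking next_avail_idx), here is a minimal sketch of advancing a packed-ring cursor with a compare-and-reset instead of a modulo, flipping the driver wrap counter on each wrap. The struct and function names are hypothetical and are not taken from the patch. The sketch assumes the wrap counter starts at 1.

```c
#include <stdint.h>

/* Hypothetical cursor; field names echo next_avail_idx/wrap_counter. */
struct ring_cursor {
	unsigned int next_avail_idx;	/* index of the next avail descriptor */
	uint8_t wrap_counter;		/* flips every time the index wraps   */
	unsigned int num;		/* ring size in descriptors           */
};

/* Start at slot 0 with the wrap counter set to 1 (assumption of the sketch). */
void ring_cursor_init(struct ring_cursor *c, unsigned int num)
{
	c->next_avail_idx = 0;
	c->wrap_counter = 1;
	c->num = num;
}

/* Advance by one descriptor without '%': compare, reset, and flip. */
void ring_cursor_advance(struct ring_cursor *c)
{
	if (++c->next_avail_idx >= c->num) {
		c->next_avail_idx = 0;
		c->wrap_counter ^= 1;
	}
}
```

The compare-and-reset form works for any ring size, and it gives a natural place to flip the wrap counter at the moment the index returns to zero.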

Thanks!


Will try to review this later.

But it would be better if you could split it (more than 1000 lines is too 
big to be reviewed easily). E.g., you could split it into at least three 
patches: new structures, datapath, and event suppression.


Thanks




Signed-off-by: Tiwei Bie 
---
  drivers/virtio/virtio_ring.c   | 1094 +---
  include/linux/virtio_ring.h|8 +-
  include/uapi/linux/virtio_config.h |   12 +-
  include/uapi/linux/virtio_ring.h   |   61 ++
  4 files changed, 980 insertions(+), 195 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 71458f493cf8..0515dca34d77 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -58,14 +58,15 @@
  
  struct vring_desc_state {

void *data; /* Data for callback. */
-   struct vring_desc *indir_desc;  /* Indirect descriptor, if any. */
+   void *indir_desc;   /* Indirect descriptor, if any. */
+   int num;/* Descriptor list length. */
  };
  
  struct vring_virtqueue {

struct virtqueue vq;
  
-	/* Actual memory layout for this queue */

-   struct vring vring;
+   /* Is this a packed ring? */
+   bool packed;
  
  	/* Can we use weak barriers? */

bool weak_barriers;
@@ -79,19 +80,45 @@ struct vring_virtqueue {
/* Host publishes avail event idx */
bool event;
  
-	/* Head of free buffer list. */

-   unsigned int free_head;
/* Number we've added since last sync. */
unsigned int num_added;
  
  	/* Last used index we've seen. */

u16 last_used_idx;
  
-	/* Last written value to avail->flags */

-   u16 avail_flags_shadow;
+   union {
+   /* Available for split ring */
+   struct {
+   /* Actual memory layout for this queue. */
+   struct vring vring;
  
-	/* Last written value to avail->idx in guest byte order */

-   u16 avail_idx_shadow;
+   /* Head of free buffer list. */
+   unsigned int free_head;
+
+   /* Last written value to avail->flags */
+   u16 avail_flags_shadow;
+
+   /* Last written value to avail->idx in
+* guest byte order. */
+   u16 avail_idx_shadow;
+   };
+
+   /* Available for packed ring */
+   struct {
+   /* Actual memory layout for this queue. */
+   struct vring_packed vring_packed;
+
+   /* Driver ring wrap counter. */
+   u8 wrap_counter;
+
+   /* Index of the next avail descriptor. */
+   unsigned int next_avail_idx;
+
+   /* Last written value to driver->flags in
+* guest byte order. */
+   u16 event_flags_shadow;
+   };
+   };
  
  	/* How to notify other side. FIXME: commonalize hcalls! */

bool (*notify)(struct virtqueue *vq);
@@ -201,8 +228,33 @@ static dma_addr_t vring_map_single(const struct vring_virtqueue *vq,
  cpu_addr, size, direction);
  }
  
-static void vring_unmap_one(const struct vring_virtqueue *vq,

-   struct vring_desc *desc)
+static void vring_unmap_one_split(const struct vring_virtqueue *vq,
+ struct vring_desc *desc)
+{
+   u16 flags;
+
+   if (!vring_use_dma_api(vq->vq.vdev))
+   return;
+
+   flags = virtio16_to_cpu(vq->vq.vdev, desc->flags);
+
+   if (flags & VRING_DESC_F_INDIRECT) {
+   dma_unmap_single(vring_dma_dev(vq),
+virtio64_to_cpu(vq->vq.vdev, desc->addr),
+virtio32_to_cpu(vq->vq.vdev, desc->len),
+   
