RE: [PATCH 0/8] Input: support for latest Lenovo thinkpads (series 80)
Hi Benjamin,

Thanks so much for your patches. I have tested them with Elan Gen5/Gen6 (new) touchpads over SMBus and PS/2.

They work fine on my ThinkPad so far, but today I found an issue after lid-close/open. I am not sure whether you can see it on the T480s; my guess is that it may be related to i2c_i801.

Closing the lid enters deep sleep and cuts power to the touchpad. After lid-open I can see the resume flow, and the SMBus initialization tries to request the hello packet but fails. Strangely, I can't see any SMBus host signal on the LA scope after power-on.

I can't switch to SMBus after rmmod/modprobe psmouse because an error happens in elantech_create_smbus. It recovers only if I rmmod/modprobe i2c_i801 first.

Do you have any idea about it?

Thanks
KT

-Original Message-
From: Benjamin Tissoires [mailto:benjamin.tissoi...@redhat.com]
Sent: Friday, April 06, 2018 2:51 PM
To: Dmitry Torokhov
Cc: 廖崇榮; Oliver Haessler; Benjamin Berg; open list:HID CORE LAYER; lkml
Subject: Re: [PATCH 0/8] Input: support for latest Lenovo thinkpads (series 80)

Hi Dmitry,

On Fri, Apr 6, 2018 at 1:51 AM, Dmitry Torokhov wrote:
> Hi Benjamin,
>
> On Thu, Apr 05, 2018 at 03:25:29PM +0200, Benjamin Tissoires wrote:
>> Hi Dmitry,
>>
>> well, this year, Lenovo gave us a surprise and decided to not use the
>> same touchpad/trackstick in all its models. And by default, the
>> support under Linux is less than ideal.
>>
>> Please find a series that should fix those issues. Compared to the 60
>> series, there do not seem to be BIOS table issues this time, and
>> suspend/resume works fine thanks to your latest trackstick fixes.
>>
>> The T480s is a different beast, as it uses an Elan touchpad.
>> I have been carrying the patches 3-6 for a while and tested previous
>> versions on various Elan PS/2 hardware without an issue as far as I
>> could tell. I was lacking tests from users with SMBus as all the
>> laptops I tried were pure PS/2.
>>
>> Anyway, it would be cool if you could have a look at the series.
>
> I am mostly happy with the series, but I would love to hear KT's take
> on it.

thanks for the quick review.

I worked closely with KT for this series. He helped me a lot with the tiny firmware changes that were required. However, quoting his email from Tuesday: "There will be a spring vacation in Taiwan from tomorrow."

I guess we won't hear from him until the end of next week as we always have a backlog of urgent things to do after holidays...

Cheers,
Benjamin
[PATCH] drm: xlnx: pl_disp: fix odd_ptr_err.cocci warnings
From: Fengguang Wu

PTR_ERR should normally access the value just tested by IS_ERR

Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci

Fixes: 742243a44a73 ("drm: xlnx: pl_disp: Use xlnx pipeline calls")
CC: Hyun Kwon
Signed-off-by: Fengguang Wu
Signed-off-by: Julia Lawall
---

tree:   https://github.com/Xilinx/linux-xlnx xlnx_rebase_v4.14
head:   fe04d2ee0dfea6b5fdbb04f4f6dbcaa13bfd2fda
commit: 742243a44a738b165f8da5cbdb6662139e85a5c5 [651/842] drm: xlnx: pl_disp: Use xlnx pipeline calls

 xlnx_pl_disp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/xlnx/xlnx_pl_disp.c
+++ b/drivers/gpu/drm/xlnx/xlnx_pl_disp.c
@@ -482,7 +482,7 @@ static int xlnx_pl_disp_probe(struct pla
 	xlnx_pl_disp->master = xlnx_drm_pipeline_init(pdev);
 	if (IS_ERR(xlnx_pl_disp->master)) {
-		ret = PTR_ERR(xlnx_pl_disp->dev);
+		ret = PTR_ERR(xlnx_pl_disp->master);
 		dev_err(dev, "failed to initialize the drm pipeline\n");
 		goto err_component;
 	}
Re: [PATCH v6 0/6] Add MediaTek PMIC keys support
On Thu, 2018-03-29 at 09:15 -0700, Dmitry Torokhov wrote:
> Oh, sorry, I did not realize you wanted my Ack for bindings. I usually
> leave it to Rob and simply ack the driver itself when I am happy with
> the code.
>
> I'll go and add my ack to the binding post if that will help merging
> the series.
>
> Thanks.

Hi Lee,

May I know if I need to collect Dmitry's comments and send a new version for the merging?

Thank you.
Re: [PATCH v2 01/14] Input: atmel_mxt_ts - do not pass suspend mode in platform data
On Tue, Mar 20, 2018 at 03:31:25PM -0700, Dmitry Torokhov wrote:
> The way we are supposed to put controller to sleep and wake it up does not
> depend on the platform, but rather on controller itself, so we want to get
> rid of suspend mode in platform data (and eventually get rid of platform
> data completely). Unfortunately some early chromebooks (the original Pixel,
> Acer C720) were shipped with config that requires manually re-enabling
> touch reporting in T9. We will sort it out, but in the meantime let's
> switch to a simple DMI quirk.
>
> We'll keep pdata->suspend_mode for now and remove it when we rework
> chromeos-laptop driver.
>
> Signed-off-by: Dmitry Torokhov

Applied, thanks.

> ---
>  drivers/input/touchscreen/atmel_mxt_ts.c | 27 +++-
>  1 file changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/input/touchscreen/atmel_mxt_ts.c b/drivers/input/touchscreen/atmel_mxt_ts.c
> index 7659bc48f1db8..20e1224d1a6db 100644
> --- a/drivers/input/touchscreen/atmel_mxt_ts.c
> +++ b/drivers/input/touchscreen/atmel_mxt_ts.c
> @@ -324,6 +324,8 @@ struct mxt_data {
>
>  	/* for config update handling */
>  	struct completion crc_completion;
> +
> +	enum mxt_suspend_mode suspend_mode;
>  };
>
>  struct mxt_vb2_buffer {
> @@ -2868,7 +2870,7 @@ static const struct attribute_group mxt_attr_group = {
>
>  static void mxt_start(struct mxt_data *data)
>  {
> -	switch (data->pdata->suspend_mode) {
> +	switch (data->suspend_mode) {
>  	case MXT_SUSPEND_T9_CTRL:
>  		mxt_soft_reset(data);
>
> @@ -2886,12 +2888,11 @@ static void mxt_start(struct mxt_data *data)
>  		mxt_t6_command(data, MXT_COMMAND_CALIBRATE, 1, false);
>  		break;
>  	}
> -
>  }
>
>  static void mxt_stop(struct mxt_data *data)
>  {
> -	switch (data->pdata->suspend_mode) {
> +	switch (data->suspend_mode) {
>  	case MXT_SUSPEND_T9_CTRL:
>  		/* Touch disable */
>  		mxt_write_object(data,
> @@ -2954,8 +2955,6 @@ static const struct mxt_platform_data *mxt_parse_dt(struct i2c_client *client)
>  		pdata->t19_keymap = keymap;
>  	}
>
> -	pdata->suspend_mode = MXT_SUSPEND_DEEP_SLEEP;
> -
>  	return pdata;
>  }
>  #else
> @@ -3109,6 +3108,21 @@ mxt_get_platform_data(struct i2c_client *client)
>  	return ERR_PTR(-EINVAL);
>  }
>
> +static const struct dmi_system_id chromebook_T9_suspend_dmi[] = {
> +	{
> +		.matches = {
> +			DMI_MATCH(DMI_SYS_VENDOR, "GOOGLE"),
> +			DMI_MATCH(DMI_PRODUCT_NAME, "Link"),
> +		},
> +	},
> +	{
> +		.matches = {
> +			DMI_MATCH(DMI_PRODUCT_NAME, "Peppy"),
> +		},
> +	},
> +	{ }
> +};
> +
>  static int mxt_probe(struct i2c_client *client, const struct i2c_device_id *id)
>  {
>  	struct mxt_data *data;
> @@ -3135,6 +3149,9 @@ static int mxt_probe(struct i2c_client *client, const struct i2c_device_id *id)
>  	init_completion(&data->reset_completion);
>  	init_completion(&data->crc_completion);
>
> +	data->suspend_mode = dmi_check_system(chromebook_T9_suspend_dmi) ?
> +		MXT_SUSPEND_T9_CTRL : MXT_SUSPEND_DEEP_SLEEP;
> +
>  	data->reset_gpio = devm_gpiod_get_optional(&client->dev,
>  						   "reset", GPIOD_OUT_LOW);
>  	if (IS_ERR(data->reset_gpio)) {
> --
> 2.16.2.804.g6dcf76e118-goog
>

-- 
Benson Leung
Staff Software Engineer
Chrome OS Kernel
Google Inc.
ble...@google.com
Chromium OS Project
ble...@chromium.org
Re: [PATCH] xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen
On 09/04/18 20:51, Boris Ostrovsky wrote:
> Pre-4.17 kernels ignored start_info's rsdp_paddr pointer and instead
> relied on finding RSDP in standard location in BIOS RO memory. This
> has worked since that's where Xen used to place it.
>
> However, with recent Xen change (commit 4a5733771e6f ("libxl: put RSDP
> for PVH guest near 4GB")) it prefers to keep RSDP at a "non-standard"
> address. Even though as of commit b17d9d1df3c3 ("x86/xen: Add pvh
> specific rsdp address retrieval function") Linux is able to find RSDP,
> for back-compatibility reasons we need to indicate to Xen that we can
> handle this, and we do so by setting XENFEAT_linux_rsdp_unrestricted
> flag in ELF notes.
>
> (Also take this opportunity and sync features.h header file with Xen)
>
> Signed-off-by: Boris Ostrovsky

Reviewed-by: Juergen Gross

Juergen
[tip:perf/urgent] perf tests clang: Fix function name for clang IR test
Commit-ID:  fcbd8fa44664e99a5d8c7ab97f1afdd82472f973
Gitweb:     https://git.kernel.org/tip/fcbd8fa44664e99a5d8c7ab97f1afdd82472f973
Author:     Sandipan Das
AuthorDate: Wed, 4 Apr 2018 23:34:19 +0530
Committer:  Arnaldo Carvalho de Melo
CommitDate: Mon, 9 Apr 2018 11:13:09 -0300

perf tests clang: Fix function name for clang IR test

As stated in tests/llvm-src-base.c, the name of the bpf function should be
"bpf_func__SyS_epoll_pwait" but this clang test fails as it tries to lookup
"bpf_func__SyS_epoll_wait".

Before applying patch:

  55: builtin clang support :
  55.1: builtin clang compile C source to IR: FAILED!
  55.2: builtin clang compile C source to ELF object: Skip

After applying patch:

  55: builtin clang support :
  55.1: builtin clang compile C source to IR: Ok
  55.2: builtin clang compile C source to ELF object: Ok

Signed-off-by: Sandipan Das
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Naveen N. Rao
Fixes: e67d52d411c3 ("perf clang: Update test case to use real BPF script")
Link: http://lkml.kernel.org/r/20180404180419.19056-3-sandi...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/c++/clang-test.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/c++/clang-test.cpp b/tools/perf/util/c++/clang-test.cpp
index a4014d786676..7b042a5ebc68 100644
--- a/tools/perf/util/c++/clang-test.cpp
+++ b/tools/perf/util/c++/clang-test.cpp
@@ -41,7 +41,7 @@ int test__clang_to_IR(void)
 	if (!M)
 		return -1;
 	for (llvm::Function& F : *M)
-		if (F.getName() == "bpf_func__SyS_epoll_wait")
+		if (F.getName() == "bpf_func__SyS_epoll_pwait")
 			return 0;
 	return -1;
 }
[tip:perf/urgent] perf clang: Add support for recent clang versions
Commit-ID:  7854e499f33fd9c7e63288692ffb754d9b1d02fd
Gitweb:     https://git.kernel.org/tip/7854e499f33fd9c7e63288692ffb754d9b1d02fd
Author:     Sandipan Das
AuthorDate: Wed, 4 Apr 2018 23:34:18 +0530
Committer:  Arnaldo Carvalho de Melo
CommitDate: Mon, 9 Apr 2018 11:13:08 -0300

perf clang: Add support for recent clang versions

The clang API calls used by perf have changed in recent releases and
builds succeed with libclang-3.9 only. This introduces compatibility
with libclang-4.0 and above.

Without this patch, we will see the following compilation errors with
libclang-4.0+:

  util/c++/clang.cpp: In function ‘clang::CompilerInvocation* perf::createCompilerInvocation(llvm::opt::ArgStringList, llvm::StringRef&, clang::DiagnosticsEngine&)’:
  util/c++/clang.cpp:62:33: error: ‘IK_C’ was not declared in this scope
    Opts.Inputs.emplace_back(Path, IK_C);
                                   ^~~~
  util/c++/clang.cpp: In function ‘std::unique_ptr perf::getModuleFromSource(llvm::opt::ArgStringList, llvm::StringRef, llvm::IntrusiveRefCntPtr)’:
  util/c++/clang.cpp:75:26: error: no matching function for call to ‘clang::CompilerInstance::setInvocation(clang::CompilerInvocation*)’
    Clang.setInvocation(&*CI);
                            ^
  In file included from util/c++/clang.cpp:14:0:
  /usr/include/clang/Frontend/CompilerInstance.h:231:8: note: candidate: void clang::CompilerInstance::setInvocation(std::shared_ptr)
     void setInvocation(std::shared_ptr Value);
          ^

Committer testing:

Tested on Fedora 27 after installing the clang-devel and llvm-devel
packages, versions:

  # rpm -qa | egrep llvm\|clang
  llvm-5.0.1-6.fc27.x86_64
  clang-libs-5.0.1-5.fc27.x86_64
  clang-5.0.1-5.fc27.x86_64
  clang-tools-extra-5.0.1-5.fc27.x86_64
  llvm-libs-5.0.1-6.fc27.x86_64
  llvm-devel-5.0.1-6.fc27.x86_64
  clang-devel-5.0.1-5.fc27.x86_64
  #

Make sure you don't have some older version lying around in /usr/local,
etc, then:

  $ make LIBCLANGLLVM=1 -C tools/perf install-bin

And in the end perf will be linked against these libraries:

  # ldd ~/bin/perf | egrep -i llvm\|clang
	libclangAST.so.5 => /lib64/libclangAST.so.5 (0x7f8bb2eb4000)
	libclangBasic.so.5 => /lib64/libclangBasic.so.5 (0x7f8bb29e3000)
	libclangCodeGen.so.5 => /lib64/libclangCodeGen.so.5 (0x7f8bb23f7000)
	libclangDriver.so.5 => /lib64/libclangDriver.so.5 (0x7f8bb206)
	libclangFrontend.so.5 => /lib64/libclangFrontend.so.5 (0x7f8bb1d06000)
	libclangLex.so.5 => /lib64/libclangLex.so.5 (0x7f8bb1a3e000)
	libclangTooling.so.5 => /lib64/libclangTooling.so.5 (0x7f8bb17d4000)
	libclangEdit.so.5 => /lib64/libclangEdit.so.5 (0x7f8bb15c5000)
	libclangSema.so.5 => /lib64/libclangSema.so.5 (0x7f8bb0cc9000)
	libclangAnalysis.so.5 => /lib64/libclangAnalysis.so.5 (0x7f8bb0a23000)
	libclangParse.so.5 => /lib64/libclangParse.so.5 (0x7f8bb0725000)
	libclangSerialization.so.5 => /lib64/libclangSerialization.so.5 (0x7f8bb039a000)
	libLLVM-5.0.so => /lib64/libLLVM-5.0.so (0x7f8bace98000)
	libclangASTMatchers.so.5 => /lib64/../lib64/libclangASTMatchers.so.5 (0x7f8bab735000)
	libclangFormat.so.5 => /lib64/../lib64/libclangFormat.so.5 (0x7f8bab4b2000)
	libclangRewrite.so.5 => /lib64/../lib64/libclangRewrite.so.5 (0x7f8bab2a1000)
	libclangToolingCore.so.5 => /lib64/../lib64/libclangToolingCore.so.5 (0x7f8bab08e000)
  #

Signed-off-by: Sandipan Das
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Naveen N. Rao
Fixes: 00b86691c77c ("perf clang: Add builtin clang support ant test case")
Link: http://lkml.kernel.org/r/20180404180419.19056-2-sandi...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/c++/clang.cpp | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/c++/clang.cpp b/tools/perf/util/c++/clang.cpp
index 1bfc946e37dc..bf31ceab33bd 100644
--- a/tools/perf/util/c++/clang.cpp
+++ b/tools/perf/util/c++/clang.cpp
@@ -9,6 +9,7 @@
  * Copyright (C) 2016 Huawei Inc.
  */
 
+#include "clang/Basic/Version.h"
 #include "clang/CodeGen/CodeGenAction.h"
 #include "clang/Frontend/CompilerInvocation.h"
 #include "clang/Frontend/CompilerInstance.h"
@@ -58,7 +59,8 @@ createCompilerInvocation(llvm::opt::ArgStringList CFlags, StringRef& Path,
 	FrontendOptions& Opts = CI->getFrontendOpts();
 	Opts.Inputs.clear();
-	Opts.Inputs.emplace_back(Path, IK_C);
+	Opts.Inputs.emplace_back(Path,
+			FrontendOptions::getInputKindForExtension("c"));
 	return CI;
 }
 
@@ -71,10 +73,17 @@ getModuleFromSource(llvm::opt::ArgStringList CFlags,
 	Clang.setVirtualFileSystem(&*VFS);
 
+#if CLANG_VERSION_MAJOR < 4
 	IntrusiveRefCntPtr CI =
 		createCompilerInvocation(std::move(CFlags), Path,
[tip:perf/urgent] perf tools: Fix perf builds with clang support
Commit-ID:  c2fb54a183cfe77c6fdc9d71e2d5299c1c302a6e
Gitweb:     https://git.kernel.org/tip/c2fb54a183cfe77c6fdc9d71e2d5299c1c302a6e
Author:     Sandipan Das
AuthorDate: Wed, 4 Apr 2018 23:34:17 +0530
Committer:  Arnaldo Carvalho de Melo
CommitDate: Mon, 9 Apr 2018 11:13:07 -0300

perf tools: Fix perf builds with clang support

For libclang, some distro packages provide static libraries (.a) while
some provide shared libraries (.so). Currently, perf code can only be
linked with static libraries. This makes perf build possible for both
cases.

Signed-off-by: Sandipan Das
Cc: Jiri Olsa
Cc: Naveen N. Rao
Fixes: d58ac0bf8d1e ("perf build: Add clang and llvm compile and linking support")
Link: http://lkml.kernel.org/r/20180404180419.19056-1-sandi...@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/Makefile.perf | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index f7517e1b73f8..83e453de36f8 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -364,7 +364,8 @@ LIBS = -Wl,--whole-archive $(PERFLIBS) $(EXTRA_PERFLIBS) -Wl,--no-whole-archive
 
 ifeq ($(USE_CLANG), 1)
   CLANGLIBS_LIST = AST Basic CodeGen Driver Frontend Lex Tooling Edit Sema Analysis Parse Serialization
-  LIBCLANG = $(foreach l,$(CLANGLIBS_LIST),$(wildcard $(shell $(LLVM_CONFIG) --libdir)/libclang$(l).a))
+  CLANGLIBS_NOEXT_LIST = $(foreach l,$(CLANGLIBS_LIST),$(shell $(LLVM_CONFIG) --libdir)/libclang$(l))
+  LIBCLANG = $(foreach l,$(CLANGLIBS_NOEXT_LIST),$(wildcard $(l).a $(l).so))
   LIBS += -Wl,--start-group $(LIBCLANG) -Wl,--end-group
 endif
[tip:perf/urgent] perf tools: No need to include namespaces.h in util.h
Commit-ID:  ad0902e0c4004dc95bf15229933012121ff54033
Gitweb:     https://git.kernel.org/tip/ad0902e0c4004dc95bf15229933012121ff54033
Author:     Arnaldo Carvalho de Melo
AuthorDate: Fri, 6 Apr 2018 14:53:56 -0300
Committer:  Arnaldo Carvalho de Melo
CommitDate: Mon, 9 Apr 2018 10:57:50 -0300

perf tools: No need to include namespaces.h in util.h

The only thing that is needed there is a forward declaration for
'struct nsinfo', so disentangle this, which in turn allows built-in
clang builds, i.e. 'make LIBCLANGLLVM=1 -C tools/perf'.

Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Naveen N. Rao
Cc: Sandipan Das
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-vq26rsuwq1cqylpcyvq89...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/util.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 9496365da3d7..c9626c206208 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -11,8 +11,7 @@
 #include
 #include
 #include
-#include
-#include "namespaces.h"
+#include
 
 /* General helper functions */
 void usage(const char *err) __noreturn;
@@ -26,6 +25,7 @@ static inline void *zalloc(size_t size)
 #define zfree(ptr) ({ free(*ptr); *ptr = NULL; })
 
 struct dirent;
+struct nsinfo;
 struct strlist;
 
 int mkdir_p(char *path, mode_t mode);
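
For readers less familiar with the idiom this commit relies on, here is a minimal sketch of why a forward declaration is enough when a header only passes pointers around. The helper name nsinfo_is_set() is hypothetical and exists only for illustration; it is not part of perf.

```c
#include <stddef.h>

/*
 * Forward declaration: the type is known to exist, its layout is not.
 * That is sufficient for declaring and passing pointers to it.
 */
struct nsinfo;

/* Hypothetical helper, for illustration only (not a perf function). */
int nsinfo_is_set(const struct nsinfo *nsi)
{
	/* Comparing the pointer never needs the struct definition. */
	return nsi != NULL;
}
```

Only code that dereferences the pointer or takes sizeof(struct nsinfo) needs the full definition, so only those source files have to include the defining header.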
[tip:perf/urgent] perf hists browser: Remove leftover from row returned from refresh
Commit-ID:  94e87a8bd529121ea90219164c65c36ea1d19e56
Gitweb:     https://git.kernel.org/tip/94e87a8bd529121ea90219164c65c36ea1d19e56
Author:     Arnaldo Carvalho de Melo
AuthorDate: Fri, 6 Apr 2018 12:11:11 -0300
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 6 Apr 2018 12:23:25 -0300

perf hists browser: Remove leftover from row returned from refresh

The per-browser screen refresh routine (ui_browser->refresh()) should
return the first row that should be cleaned after the rows just printed,
in case not all rows available on the screen get filled.

When moving the extra title lines logic from the hists browser to the
generic ui_browser class, one piece of that logic remained in the hists
browser. As a result, when going back from the annotate browser to the
hists browser in a case where fewer lines were displayed in the hists
browser, for instance when filtering the entries per substring, one line
of the annotate browser would remain on the screen. Fix that.

Example of the screen artifact:

  Samples: 73K of event 'cycles:ppp', 4000 Hz, Event count (approx.): 45172901394
  Overhead  Shared O  Symbol
     0.30%  [kernel]  [k] __indirect_thunk_start
     0.09%  [kernel]  [k] __x86_indirect_thunk_r10
           │      lfence

Here from 'perf top' the view was zoomed with '/thunk' to functions
having that substring, then the first was annotated and from the
annotate browser ESC was pressed, then the first lines were overwritten,
but the 'lfence' line remained due to the off by one bug fixed in this
cset.

Cc: Adrian Hunter
Cc: Andi Kleen
Cc: David Ahern
Cc: Jin Yao
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Fixes: ef9ff6017e3c ("perf ui browser: Move the extra title lines from the hists browser")
Link: https://lkml.kernel.org/n/tip-odryfso74eaarm0z3e4v9...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/ui/browsers/hists.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index de17e59d9952..0eec06c105c6 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1744,17 +1744,11 @@ static void ui_browser__hists_init_top(struct ui_browser *browser)
 static unsigned int hist_browser__refresh(struct ui_browser *browser)
 {
 	unsigned row = 0;
-	u16 header_offset = 0;
 	struct rb_node *nd;
 	struct hist_browser *hb = container_of(browser, struct hist_browser, b);
-	struct hists *hists = hb->hists;
-
-	if (hb->show_headers) {
-		struct perf_hpp_list *hpp_list = hists->hpp_list;
 
+	if (hb->show_headers)
 		hist_browser__show_headers(hb);
-		header_offset = hpp_list->nr_header_lines;
-	}
 
 	ui_browser__hists_init_top(browser);
 	hb->he_selection = NULL;
@@ -1792,7 +1786,7 @@ static unsigned int hist_browser__refresh(struct ui_browser *browser)
 			break;
 	}
 
-	return row + header_offset;
+	return row;
 }
 
 static struct rb_node *hists__filter_entries(struct rb_node *nd,
[tip:perf/urgent] perf hists browser: Show extra_title_lines in the 'D' debug hotkey
Commit-ID:  fdae6400809aa179f8ca04e32f3eb176fb3b3a9d
Gitweb:     https://git.kernel.org/tip/fdae6400809aa179f8ca04e32f3eb176fb3b3a9d
Author:     Arnaldo Carvalho de Melo
AuthorDate: Fri, 6 Apr 2018 11:56:11 -0300
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 6 Apr 2018 12:22:06 -0300

perf hists browser: Show extra_title_lines in the 'D' debug hotkey

To help in fixing problems in the browser.

Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: https://lkml.kernel.org/n/tip-uj0n76yqh5bf98i0edckd...@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/ui/browsers/hists.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index b06afb8f51fb..de17e59d9952 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -659,9 +659,10 @@ int hist_browser__run(struct hist_browser *browser, const char *help,
 			struct hist_entry *h = rb_entry(browser->b.top,
 							struct hist_entry, rb_node);
 			ui_helpline__pop();
-			ui_helpline__fpush("%d: nr_ent=(%d,%d), rows=%d, idx=%d, fve: idx=%d, row_off=%d, nrows=%d",
+			ui_helpline__fpush("%d: nr_ent=(%d,%d), etl: %d, rows=%d, idx=%d, fve: idx=%d, row_off=%d, nrows=%d",
 					   seq++, browser->b.nr_entries,
 					   browser->hists->nr_entries,
+					   browser->b.extra_title_lines,
 					   browser->b.rows,
 					   browser->b.index,
 					   browser->b.top_idx,
[tip:perf/urgent] perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
Commit-ID:  b238db655796e74b59d9ece58b645ad0b494d615
Gitweb:     https://git.kernel.org/tip/b238db655796e74b59d9ece58b645ad0b494d615
Author:     Adrian Hunter
AuthorDate: Tue, 6 Mar 2018 11:13:18 +0200
Committer:  Arnaldo Carvalho de Melo
CommitDate: Fri, 6 Apr 2018 09:40:41 -0300

perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering

In preparation for supporting AUX area sampling buffers,
auxtrace_queues__add_buffer() needs to be more generic. To that end, move
CPU filtering into it.

Signed-off-by: Adrian Hunter
Cc: Jiri Olsa
Link: http://lkml.kernel.org/r/1520327598-1317-8-git-send-email-adrian.hun...@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/util/auxtrace.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index e1aff91c54a8..857de69a5361 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -302,6 +302,13 @@ static int auxtrace_queues__split_buffer(struct auxtrace_queues *queues,
 	return 0;
 }
 
+static bool filter_cpu(struct perf_session *session, int cpu)
+{
+	unsigned long *cpu_bitmap = session->itrace_synth_opts->cpu_bitmap;
+
+	return cpu_bitmap && cpu != -1 && !test_bit(cpu, cpu_bitmap);
+}
+
 static int auxtrace_queues__add_buffer(struct auxtrace_queues *queues,
 				       struct perf_session *session,
 				       unsigned int idx,
@@ -310,6 +317,9 @@ static int auxtrace_queues__add_buffer(struct auxtrace_queues *queues,
 {
 	int err = -ENOMEM;
 
+	if (filter_cpu(session, buffer->cpu))
+		return 0;
+
 	buffer = memdup(buffer, sizeof(*buffer));
 	if (!buffer)
 		return -ENOMEM;
@@ -344,13 +354,6 @@ out_free:
 	return err;
 }
 
-static bool filter_cpu(struct perf_session *session, int cpu)
-{
-	unsigned long *cpu_bitmap = session->itrace_synth_opts->cpu_bitmap;
-
-	return cpu_bitmap && cpu != -1 && !test_bit(cpu, cpu_bitmap);
-}
-
 int auxtrace_queues__add_event(struct auxtrace_queues *queues,
 			       struct perf_session *session,
 			       union perf_event *event, off_t data_offset,
@@ -367,9 +370,6 @@ int auxtrace_queues__add_event(struct auxtrace_queues *queues,
 	};
 	unsigned int idx = event->auxtrace.idx;
 
-	if (filter_cpu(session, event->auxtrace.cpu))
-		return 0;
-
 	return auxtrace_queues__add_buffer(queues, session, idx, &buffer,
 					   buffer_ptr);
 }
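
As a rough illustration of what the filter_cpu() predicate in this patch decides, here is a user-space sketch. The names test_bit_sketch() and filter_cpu_sketch() are stand-ins invented for this example, and the bitmap is passed directly instead of being read from the session, so this is not the perf code itself.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* User-space stand-in for test_bit(): is bit 'nr' set in the bitmap? */
bool test_bit_sketch(int nr, const unsigned long *bitmap)
{
	return (bitmap[nr / BITS_PER_LONG_SKETCH] >>
		(nr % BITS_PER_LONG_SKETCH)) & 1UL;
}

/*
 * Returns true when a buffer from 'cpu' should be skipped:
 * a bitmap was given, the buffer is tied to a CPU (cpu != -1),
 * and that CPU is not selected in the bitmap.
 */
bool filter_cpu_sketch(const unsigned long *cpu_bitmap, int cpu)
{
	return cpu_bitmap && cpu != -1 && !test_bit_sketch(cpu, cpu_bitmap);
}
```

With a bitmap that selects only CPU 2, buffers from CPU 3 are filtered out, while buffers from CPU 2, buffers with cpu == -1, and all buffers when no bitmap is set pass through.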
[PATCH v2 1/2] vhost: fix vhost_vq_access_ok() log check
Commit d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log
when IOTLB is enabled") introduced a regression. The logic was
originally:

  if (vq->iotlb)
      return 1;
  return A && B;

After the patch the short-circuit logic for A was inverted:

  if (A || vq->iotlb)
      return A;
  return B;

This patch fixes the regression by rewriting the checks in the obvious
way, no longer returning A when vq->iotlb is non-NULL (which is hard to
understand).

Reported-by: syzbot+65a84dde0214b0387...@syzkaller.appspotmail.com
Cc: Jason Wang
Signed-off-by: Stefan Hajnoczi
---
 drivers/vhost/vhost.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 5320039671b7..93fd0c75b0d8 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1244,10 +1244,12 @@ static int vq_log_access_ok(struct vhost_virtqueue *vq,
 /* Caller should have vq mutex and device mutex */
 int vhost_vq_access_ok(struct vhost_virtqueue *vq)
 {
-	int ret = vq_log_access_ok(vq, vq->log_base);
+	if (!vq_log_access_ok(vq, vq->log_base))
+		return 0;
 
-	if (ret || vq->iotlb)
-		return ret;
+	/* Access validation occurs at prefetch time with IOTLB */
+	if (vq->iotlb)
+		return 1;
 
 	return vq_access_ok(vq, vq->num, vq->desc, vq->avail, vq->used);
 }
-- 
2.14.3
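
To make the difference between the regressed and the fixed control flow concrete, the two variants can be modelled as pure functions of three flags. This is only a sketch: the function names are made up for the example, log_ok stands for A (the log check), ring_ok for B (the ring check), and bool is used for brevity even though patch 1/2 itself still returns int.

```c
#include <stdbool.h>

/* Control flow after commit d65026c6c62e, as quoted in the patch text. */
bool access_ok_regressed(bool log_ok, bool has_iotlb, bool ring_ok)
{
	if (log_ok || has_iotlb)
		return log_ok;	/* a passing log check skips the ring check */
	return ring_ok;		/* a failing log check is ignored */
}

/* Control flow after this fix. */
bool access_ok_fixed(bool log_ok, bool has_iotlb, bool ring_ok)
{
	if (!log_ok)
		return false;	/* the log check must always pass */
	if (has_iotlb)
		return true;	/* ring is validated at prefetch time instead */
	return ring_ok;
}
```

Without an IOTLB the two versions disagree in both mixed cases: with a good log and a bad ring, and with a bad log and a good ring, the regressed version reports success while the fixed one reports failure.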
[PATCH v2 2/2] vhost: return bool from *_access_ok() functions
Currently vhost *_access_ok() functions return int. This is error-prone
because there are two popular conventions:

1. 0 means failure, 1 means success
2. -errno means failure, 0 means success

Although vhost mostly uses #1, it does not do so consistently.
umem_access_ok() uses #2.

This patch changes the return type from int to bool so that false means
failure and true means success. This eliminates a potential source of
errors.

Suggested-by: Linus Torvalds
Signed-off-by: Stefan Hajnoczi
---
 drivers/vhost/vhost.h | 4 ++--
 drivers/vhost/vhost.c | 66 +--
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index ac4b6056f19a..6e00fa57af09 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -178,8 +178,8 @@ void vhost_dev_cleanup(struct vhost_dev *);
 void vhost_dev_stop(struct vhost_dev *);
 long vhost_dev_ioctl(struct vhost_dev *, unsigned int ioctl, void __user *argp);
 long vhost_vring_ioctl(struct vhost_dev *d, int ioctl, void __user *argp);
-int vhost_vq_access_ok(struct vhost_virtqueue *vq);
-int vhost_log_access_ok(struct vhost_dev *);
+bool vhost_vq_access_ok(struct vhost_virtqueue *vq);
+bool vhost_log_access_ok(struct vhost_dev *);
 
 int vhost_get_vq_desc(struct vhost_virtqueue *,
 		      struct iovec iov[], unsigned int iov_count,
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 93fd0c75b0d8..b6a082ef33dd 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -641,14 +641,14 @@ void vhost_dev_cleanup(struct vhost_dev *dev)
 }
 EXPORT_SYMBOL_GPL(vhost_dev_cleanup);
 
-static int log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
+static bool log_access_ok(void __user *log_base, u64 addr, unsigned long sz)
 {
 	u64 a = addr / VHOST_PAGE_SIZE / 8;
 
 	/* Make sure 64 bit math will not overflow. */
 	if (a > ULONG_MAX - (unsigned long)log_base ||
 	    a + (unsigned long)log_base > ULONG_MAX)
-		return 0;
+		return false;
 
 	return access_ok(VERIFY_WRITE, log_base + a,
 			 (sz + VHOST_PAGE_SIZE * 8 - 1) / VHOST_PAGE_SIZE / 8);
@@ -661,30 +661,30 @@ static bool vhost_overflow(u64 uaddr, u64 size)
 }
 
 /* Caller should have vq mutex and device mutex. */
-static int vq_memory_access_ok(void __user *log_base, struct vhost_umem *umem,
-			       int log_all)
+static bool vq_memory_access_ok(void __user *log_base, struct vhost_umem *umem,
+				int log_all)
 {
 	struct vhost_umem_node *node;
 
 	if (!umem)
-		return 0;
+		return false;
 
 	list_for_each_entry(node, &umem->umem_list, link) {
 		unsigned long a = node->userspace_addr;
 
 		if (vhost_overflow(node->userspace_addr, node->size))
-			return 0;
+			return false;
 
 		if (!access_ok(VERIFY_WRITE, (void __user *)a,
 				    node->size))
-			return 0;
+			return false;
 		else if (log_all && !log_access_ok(log_base,
 						   node->start,
 						   node->size))
-			return 0;
+			return false;
 	}
-	return 1;
+	return true;
 }
 
 static inline void __user *vhost_vq_meta_fetch(struct vhost_virtqueue *vq,
@@ -701,13 +701,13 @@ static inline void __user *vhost_vq_meta_fetch(struct vhost_virtqueue *vq,
 
 /* Can we switch to this memory table? */
 /* Caller should have device mutex but not vq mutex */
-static int memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
-			    int log_all)
+static bool memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
+			     int log_all)
 {
 	int i;
 
 	for (i = 0; i < d->nvqs; ++i) {
-		int ok;
+		bool ok;
 		bool log;
 
 		mutex_lock(&d->vqs[i]->mutex);
@@ -717,12 +717,12 @@ static int memory_access_ok(struct vhost_dev *d, struct vhost_umem *umem,
 			ok = vq_memory_access_ok(d->vqs[i]->log_base,
 						 umem, log);
 		else
-			ok = 1;
+			ok = true;
 		mutex_unlock(&d->vqs[i]->mutex);
 		if (!ok)
-			return 0;
+			return false;
 	}
-	return 1;
+	return true;
 }
 
 static int translate_desc(struct vhost_virtqueue *vq, u64 addr, u32 len,
@@ -959,21 +959,21 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d,
 	spin_unlock(&d->iotlb_lock);
 }
 
-static int umem_access_ok(u64 uaddr, u64 size, int access)
+static
[PATCH v2 0/2] vhost: fix vhost_vq_access_ok() log check
v2:
 * Rewrote the conditional to make the vq access check clearer [Linus]
 * Added Patch 2 to make the return type consistent and harder to misuse
   [Linus]

The first patch fixes the vhost virtqueue access check which was recently
broken. The second patch replaces the int return type with bool to prevent
future bugs.

Stefan Hajnoczi (2):
  vhost: fix vhost_vq_access_ok() log check
  vhost: return bool from *_access_ok() functions

 drivers/vhost/vhost.h | 4 +--
 drivers/vhost/vhost.c | 70 ++-
 2 files changed, 38 insertions(+), 36 deletions(-)

-- 
2.14.3
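
As a small illustration of why an int return type invites the kind of mistake the second patch guards against, consider one overflow check written under the "-errno on failure, 0 on success" convention and again as a bool. Both helper names are hypothetical and chosen only for this sketch; they are not vhost functions.

```c
#include <errno.h>
#include <stdbool.h>

/* "-errno means failure, 0 means success" convention (hypothetical helper). */
int range_check_errno(unsigned long addr, unsigned long size)
{
	/* Unsigned addition wraps on overflow, so a smaller sum means overflow. */
	return (addr + size < addr) ? -EFAULT : 0;
}

/* The same check as a bool: true means success, false means failure. */
bool range_ok(unsigned long addr, unsigned long size)
{
	return addr + size >= addr;
}
```

A caller used to the "0 means failure" convention would read `if (!range_check_errno(addr, size))` as the error path, although with this helper the branch is taken on success. With the bool variant, `if (!range_ok(addr, size))` can only mean failure.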
Re: [GIT PULL 00/13] perf/urgent fixes
* Arnaldo Carvalho de Melo <a...@kernel.org> wrote:

> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> Test results at the end of this message, as usual.
>
> The following changes since commit d1e7e602cd64cf61f87dbf30df07c24df9eb1d99:
>
>   perf/x86/intel: Move regs->flags EXACT bit init (2018-04-05 09:28:40 +0200)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git tags/perf-urgent-for-mingo-4.17-20180409
>
> for you to fetch changes up to fcbd8fa44664e99a5d8c7ab97f1afdd82472f973:
>
>   perf tests clang: Fix function name for clang IR test (2018-04-09 11:13:09 -0300)
>
> perf/urgent fixes:
>
> . Fix the --stdio2/TUI annotate output to include group details,
>   be it for a recorded '{a,b,f}' explicit event group or when
>   forcing group display using 'perf report --group' for a set of
>   events not recorded as a group (Arnaldo Carvalho de Melo)
>
> . Fix display artifacts in the ui browser (base class for the
>   annotate and main report/top TUI browser) related to the extra
>   title lines work (Arnaldo Carvalho de Melo)
>
> . perf auxtrace refactorings, leftovers from a previously partially
>   processed patchset (Adrian Hunter)
>
> . Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de Melo)
>
> - Synchronize i915_drm.h, silencing a perf build warning and
>   in the process automagically adding support for a new ioctl
>   command (Arnaldo Carvalho de Melo)
>
> Signed-off-by: Arnaldo Carvalho de Melo <a...@redhat.com>
>
> Adrian Hunter (2):
>       perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
>       perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
>
> Arnaldo Carvalho de Melo (8):
>       perf annotate: Show group details on the title line
>       perf annotate browser: Fixup vertical line separating metrics from instructions
>       perf ui browser: Fixup cleaning unused lines at the bottom
>       perf report: Remove duplicated 'samples' in lost samples warning
>       tools headers uapi: Synchronize i915_drm.h
>       perf hists browser: Show extra_title_lines in the 'D' debug hotkey
>       perf hists browser: Remove leftover from row returned from refresh
>       perf tools: No need to include namespaces.h in util.h
>
> Sandipan Das (3):
>       perf tools: Fix perf builds with clang support
>       perf clang: Add support for recent clang versions
>       perf tests clang: Fix function name for clang IR test
>
>  tools/include/uapi/drm/i915_drm.h  | 112 +++--
>  tools/perf/Makefile.perf           | 3 +-
>  tools/perf/ui/browser.c            | 4 +-
>  tools/perf/ui/browsers/annotate.c  | 2 +-
>  tools/perf/ui/browsers/hists.c     | 13 ++---
>  tools/perf/util/annotate.c         | 7 ++-
>  tools/perf/util/auxtrace.c         | 72 +++-
>  tools/perf/util/c++/clang-test.cpp | 2 +-
>  tools/perf/util/c++/clang.cpp      | 11 +++-
>  tools/perf/util/session.c          | 2 +-
>  tools/perf/util/util.h             | 4 +-
>  11 files changed, 169 insertions(+), 63 deletions(-)

Pulled, thanks a lot Arnaldo!

Ingo
[PATCH 2/2] drm/bridge: sii902x: add optional power supplies
Add the 3 optional power supplies using the exact description
found in the document named
"SiI9022A/SiI9024A HDMI Transmitter Data Sheet (August 2016)".

Signed-off-by: Philippe Cornu
---
 drivers/gpu/drm/bridge/sii902x.c | 39 +++
 1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c
index 60373d7eb220..e17ba6db1ec8 100644
--- a/drivers/gpu/drm/bridge/sii902x.c
+++ b/drivers/gpu/drm/bridge/sii902x.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -86,6 +87,7 @@ struct sii902x {
 	struct drm_bridge bridge;
 	struct drm_connector connector;
 	struct gpio_desc *reset_gpio;
+	struct regulator_bulk_data supplies[3];
 };
 
 static inline struct sii902x *bridge_to_sii902x(struct drm_bridge *bridge)
@@ -392,23 +394,43 @@ static int sii902x_probe(struct i2c_client *client,
 		return PTR_ERR(sii902x->reset_gpio);
 	}
 
+	sii902x->supplies[0].supply = "iovcc";
+	sii902x->supplies[1].supply = "avcc12";
+	sii902x->supplies[2].supply = "cvcc12";
+	ret = devm_regulator_bulk_get(dev, ARRAY_SIZE(sii902x->supplies),
+				      sii902x->supplies);
+	if (ret) {
+		dev_err(dev, "regulator_bulk_get failed\n");
+		return ret;
+	}
+
+	ret = regulator_bulk_enable(ARRAY_SIZE(sii902x->supplies),
+				    sii902x->supplies);
+	if (ret) {
+		dev_err(dev, "regulator_bulk_enable failed\n");
+		return ret;
+	}
+
+	usleep_range(1, 2);
+
 	sii902x_reset(sii902x);
 
 	ret = regmap_write(sii902x->regmap, SII902X_REG_TPI_RQB, 0x0);
 	if (ret)
-		return ret;
+		goto err_disable_regulator;
 
 	ret = regmap_bulk_read(sii902x->regmap, SII902X_REG_CHIPID(0),
 			       &chipid, 4);
 	if (ret) {
 		dev_err(dev, "regmap_read failed %d\n", ret);
-		return ret;
+		goto err_disable_regulator;
 	}
 
 	if (chipid[0] != 0xb0) {
 		dev_err(dev, "Invalid chipid: %02x (expecting 0xb0)\n",
 			chipid[0]);
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err_disable_regulator;
 	}
 
 	/* Clear all pending interrupts */
@@ -424,7 +446,7 @@ static int sii902x_probe(struct i2c_client *client,
 						IRQF_ONESHOT, dev_name(dev),
 						sii902x);
 		if (ret)
-			return ret;
+			goto err_disable_regulator;
 	}
 
 	sii902x->bridge.funcs = &sii902x_bridge_funcs;
@@ -434,6 +456,12 @@ static int sii902x_probe(struct i2c_client *client,
 	i2c_set_clientdata(client, sii902x);
 
 	return 0;
+
+err_disable_regulator:
+	regulator_bulk_disable(ARRAY_SIZE(sii902x->supplies),
+			       sii902x->supplies);
+
+	return ret;
 }
 
 static int sii902x_remove(struct i2c_client *client)
@@ -443,6 +471,9 @@ static int sii902x_remove(struct i2c_client *client)
 
 	drm_bridge_remove(&sii902x->bridge);
 
+	regulator_bulk_disable(ARRAY_SIZE(sii902x->supplies),
+			       sii902x->supplies);
+
 	return 0;
 }
 
-- 
2.15.1
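For readers less familiar with the error-unwinding style used in the probe path, here is a minimal userspace sketch of the same idea: once the supplies have been enabled, every later failure jumps to a single label that disables them again. This is not the driver code. The fake_supplies type and the fake_* helpers are hypothetical stand-ins for the regulator calls, and the chip ID check only imitates the 0xb0 comparison in the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the supply bank; not the kernel regulator API. */
struct fake_supplies {
	int enabled;       /* 1 while the bank is enabled, 0 otherwise */
	bool fail_enable;  /* force the enable step to fail */
};

static int fake_bulk_enable(struct fake_supplies *s)
{
	if (s->fail_enable)
		return -EIO;
	s->enabled++;
	return 0;
}

static void fake_bulk_disable(struct fake_supplies *s)
{
	s->enabled--;
}

/* chip_id stands in for the value read back from the chip after reset. */
int fake_probe(struct fake_supplies *s, int chip_id)
{
	int ret;

	ret = fake_bulk_enable(s);
	if (ret)
		return ret;	/* nothing was enabled, so nothing to undo */

	if (chip_id != 0xb0) {
		ret = -EINVAL;
		goto err_disable;	/* supplies are on: undo before returning */
	}

	return 0;

err_disable:
	fake_bulk_disable(s);
	return ret;
}
```

With this shape, a failed probe leaves the enable count where it started, which is the property the err_disable_regulator label is meant to give the real driver.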
[PATCH 0/2] drm/bridge: sii902x: add optional power supplies
This patchset adds the 3 optional power supplies to the
sii902x drm bridge driver.

Philippe Cornu (2):
  dt-bindings/display/bridge: sii902x: add optional power supplies
  drm/bridge: sii902x: add optional power supplies

 .../devicetree/bindings/display/bridge/sii902x.txt |  3 ++
 drivers/gpu/drm/bridge/sii902x.c                   | 39 +++---
 2 files changed, 38 insertions(+), 4 deletions(-)

-- 
2.15.1
[PATCH 1/2] dt-bindings/display/bridge: sii902x: add optional power supplies
Add the 3 optional power supplies using the exact description
found in the document named
"SiI9022A/SiI9024A HDMI Transmitter Data Sheet (August 2016)".

Signed-off-by: Philippe Cornu
---
 Documentation/devicetree/bindings/display/bridge/sii902x.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/bridge/sii902x.txt b/Documentation/devicetree/bindings/display/bridge/sii902x.txt
index 56a3e68ccb80..cf53678fe574 100644
--- a/Documentation/devicetree/bindings/display/bridge/sii902x.txt
+++ b/Documentation/devicetree/bindings/display/bridge/sii902x.txt
@@ -8,6 +8,9 @@ Optional properties:
 	- interrupts-extended or interrupt-parent + interrupts: describe
 	  the interrupt line used to inform the host about hotplug events.
 	- reset-gpios: OF device-tree gpio specification for RST_N pin.
+	- iovcc-supply: I/O supply voltage (1.8V or 3.3V, host-dependent).
+	- avcc12-supply: TMDS analog supply voltage (1.2V).
+	- cvcc12-supply: Digital core supply voltage (1.2V).
 
 Optional subnodes:
 	- video input: this subnode can contain a video input port node
-- 
2.15.1
Re: [PATCH v2 2/9] PCI: dwc: Add support for endpoint mode
Hi,

On Monday 09 April 2018 03:11 PM, Gustavo Pimentel wrote:
> The PCIe controller dual mode is capable of operating in host mode as well
> as endpoint mode by configuration, therefore this patch aims to add
> endpoint mode support to the designware driver.
>
> Signed-off-by: Gustavo Pimentel
> ---
> Change v1->v2:
>  - Removed dw_plat_pcie_stop_link empty function.
>  - Implemented Kishon's suggestions about dw-pcie-rc and dw-pcie strings.
>    compatibility.
>  - Added second entry on pci_epf_test_ids structure.
>
>  drivers/pci/dwc/Kconfig                       |  45 ++--
>  drivers/pci/dwc/pcie-designware-ep.c          |   4 +-
>  drivers/pci/dwc/pcie-designware-plat.c        | 153 --
>  drivers/pci/endpoint/functions/pci-epf-test.c |   9 ++
>  4 files changed, 190 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
> index 2f3f5c5..3fd7daf 100644
> --- a/drivers/pci/dwc/Kconfig
> +++ b/drivers/pci/dwc/Kconfig
> @@ -7,8 +7,7 @@ config PCIE_DW
>  
>  config PCIE_DW_HOST
>  	bool
> -	depends on PCI
> -	depends on PCI_MSI_IRQ_DOMAIN
> +	depends on PCI && PCI_MSI_IRQ_DOMAIN
>  	select PCIE_DW
>  
>  config PCIE_DW_EP
> @@ -52,16 +51,42 @@ config PCI_DRA7XX_EP
>  
>  config PCIE_DW_PLAT
>  	bool "Platform bus based DesignWare PCIe Controller"
> -	depends on PCI
> -	depends on PCI_MSI_IRQ_DOMAIN
> -	select PCIE_DW_HOST
> -	---help---
> -	  This selects the DesignWare PCIe controller support. Select this if
> -	  you have a PCIe controller on Platform bus.
> +	help
> +	  There are two instances of PCIe controller in Designware IP.
> +	  This controller can work either as EP or RC. In order to enable
> +	  host-specific features PCIE_DW_PLAT_HOST must be selected and in
> +	  order to enable device-specific features PCIE_DW_PLAT_EP must be
> +	  selected.
>  
> -	  If you have a controller with this interface, say Y or M here.
> +config PCIE_DW_PLAT_HOST
> +	bool "Platform bus based DesignWare PCIe Controller - Host mode"
> +	depends on PCI && PCI_MSI_IRQ_DOMAIN
> +	select PCIE_DW_HOST
> +	select PCIE_DW_PLAT
> +	default y
> +	help
> +	  Enables support for the PCIe controller in the Designware IP to
> +	  work in host mode. There are two instances of PCIe controller in
> +	  Designware IP.
> +	  This controller can work either as EP or RC. In order to enable
> +	  host-specific features PCIE_DW_PLAT_HOST must be selected and in
> +	  order to enable device-specific features PCI_DW_PLAT_EP must be
> +	  selected.
>  
> -	  If unsure, say N.
> +config PCIE_DW_PLAT_EP
> +	bool "Platform bus based DesignWare PCIe Controller - Endpoint mode"
> +	depends on PCI && PCI_MSI_IRQ_DOMAIN
> +	depends on PCI_ENDPOINT
> +	select PCIE_DW_EP
> +	select PCIE_DW_PLAT
> +	help
> +	  Enables support for the PCIe controller in the Designware IP to
> +	  work in endpoint mode. There are two instances of PCIe controller
> +	  in Designware IP.
> +	  This controller can work either as EP or RC. In order to enable
> +	  host-specific features PCIE_DW_PLAT_HOST must be selected and in
> +	  order to enable device-specific features PCI_DW_PLAT_EP must be
> +	  selected.
>  
>  config PCI_EXYNOS
>  	bool "Samsung Exynos PCIe controller"
> diff --git a/drivers/pci/dwc/pcie-designware-ep.c b/drivers/pci/dwc/pcie-designware-ep.c
> index f07678b..4ac135a 100644
> --- a/drivers/pci/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/dwc/pcie-designware-ep.c
> @@ -15,8 +15,10 @@
>  void dw_pcie_ep_linkup(struct dw_pcie_ep *ep)
>  {
>  	struct pci_epc *epc = ep->epc;
> +	struct pci_epf *epf;
>  
> -	pci_epc_linkup(epc);
> +	list_for_each_entry(epf, &epc->pci_epf, list)
> +		pci_epf_linkup(epf);
>  }

This shouldn't be required anymore.

>  
>  static void __dw_pcie_ep_reset_bar(struct dw_pcie *pci, enum pci_barno bar,
> diff --git a/drivers/pci/dwc/pcie-designware-plat.c b/drivers/pci/dwc/pcie-designware-plat.c
> index 5416aa8..5382a7a 100644
> --- a/drivers/pci/dwc/pcie-designware-plat.c
> +++ b/drivers/pci/dwc/pcie-designware-plat.c
> @@ -12,19 +12,29 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
>  #include
>  #include
>  #include
> +#include
>  
>  #include "pcie-designware.h"
>  
>  struct dw_plat_pcie {
> -	struct dw_pcie *pci;
> +	struct dw_pcie			*pci;
> +	struct regmap			*regmap;
> +	enum dw_pcie_device_mode	mode;
>  };
>  
> +struct dw_plat_pcie_of_data {
> +	enum dw_pcie_device_mode	mode;
> +};
> +
> +static const struct of_device_id dw_plat_pcie_of_match[];
> +
>  static int dw_plat_pcie_host_init(struct pcie_port *pp)
>  {
>  	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
> @@ -42,9
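The quoted diff is cut off before the probe logic, so the following is only a rough userspace sketch of the general idea behind a per-compatible mode field: look up the mode in a match table, then refuse modes whose support was not built in. It is not the patch's code. The ctrl_* names, the "example,..." compatible strings and the host_built/ep_built flags (standing in for the two Kconfig options) are all hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum ctrl_mode { CTRL_MODE_UNKNOWN = 0, CTRL_MODE_RC, CTRL_MODE_EP };

struct ctrl_match {
	const char *compatible;
	enum ctrl_mode mode;
};

/* Hypothetical compatible strings, for illustration only. */
static const struct ctrl_match ctrl_match_table[] = {
	{ "example,pcie-rc", CTRL_MODE_RC },
	{ "example,pcie-ep", CTRL_MODE_EP },
};

/* Map a compatible string to the mode recorded in the match table. */
enum ctrl_mode ctrl_lookup_mode(const char *compatible)
{
	size_t n = sizeof(ctrl_match_table) / sizeof(ctrl_match_table[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(ctrl_match_table[i].compatible, compatible) == 0)
			return ctrl_match_table[i].mode;

	return CTRL_MODE_UNKNOWN;
}

/*
 * host_built and ep_built model whether host-mode and endpoint-mode
 * support were compiled in; a mode without support is rejected.
 */
int ctrl_probe(const char *compatible, int host_built, int ep_built)
{
	switch (ctrl_lookup_mode(compatible)) {
	case CTRL_MODE_RC:
		return host_built ? 0 : -ENODEV;
	case CTRL_MODE_EP:
		return ep_built ? 0 : -ENODEV;
	default:
		return -EINVAL;
	}
}
```

The point of keeping the mode in the match data is that one platform driver can serve both roles, with the device tree node deciding which path runs.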
[GIT PULL] libnvdimm for 4.17
Hi Linus, please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.17

...to receive the libnvdimm update for 4.17.

This cycle was not something I ever want to repeat as there were several late changes that have only now just settled.

Half of the branch up to commit d2c997c0f145 "fs, dax: use page->mapping to warn..." has been in -next for several releases. The of_pmem driver and the address range scrub rework were late arrivals, and the dax work was scaled back at the last moment.

The of_pmem driver missed a previous merge window due to an oversight. A sense of obligation to rectify that miss is why it is included for 4.17. It has acks from PowerPC folks. Stephen reported a build failure that only occurs when merging it with your latest tree; for now I have fixed that up by disabling modular builds of of_pmem. A test merge with your tree has received a build success report from the 0day robot over 156 configs.

An initial version of the ARS rework was submitted before the merge window. It is self-contained to libnvdimm, a net code reduction, and passing all unit tests.

The filesystem-dax changes are based on the wait_var_event() functionality from tip/sched/core. However, late review feedback showed that those changes regressed truncate performance to a large degree. The branch was rewound to drop the truncate behavior change and now only includes preparation patches and cleanups (with full acks and reviews). The finalization of this dax-dma-vs-truncate work will need to wait for 4.18.

git picked the wait_var_event() baseline for the diffstat, so I also include the diffstat of the test merge below.
The following changes since commit 3eb2ce825ea1ad89d20f7a3b5780df850e4be274:

  Linux 4.16-rc7 (2018-03-25 12:44:30 -1000)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm tags/libnvdimm-for-4.17

for you to fetch changes up to e13e75b86ef2f88e3a47d672dd4c52a293efb95b:

  Merge branch 'for-4.17/dax' into libnvdimm-for-next (2018-04-09 10:50:17 -0700)

libnvdimm for 4.17

* A rework of the filesystem-dax implementation provides for detection of
  unmap operations (truncate / hole punch) colliding with in-progress
  device-DMA. A fix for these collisions remains a work-in-progress
  pending resolution of truncate latency and starvation regressions.

* The of_pmem driver expands the users of libnvdimm outside of x86 and
  ACPI to describe an implementation of persistent memory on PowerPC with
  Open Firmware / Device tree.

* Address Range Scrub (ARS) handling is completely rewritten to account for
  the fact that ARS may run for 100s of seconds and there is no platform
  defined way to cancel it. ARS will now no longer block namespace
  initialization.

* The NVDIMM Namespace Label implementation is updated to handle label
  areas as small as 1K, down from 128K.

* Miscellaneous cleanups and updates to unit test infrastructure.
Dan Williams (26):
      libnvdimm: remove redundant __func__ in dev_dbg
      device-dax: remove redundant __func__ in dev_dbg
      nfit: skip region registration for incomplete control regions
      acpi, nfit: rework NVDIMM leaf method detection
      dax: store pfns in the radix
      fs, dax: prepare for dax-specific address_space_operations
      block, dax: remove dead code in blkdev_writepages()
      xfs, dax: introduce xfs_dax_aops
      ext4, dax: introduce ext4_dax_aops
      nfit: fix region registration vs block-data-window ranges
      ext2, dax: introduce ext2_dax_aops
      fs, dax: use page->mapping to warn if truncate collides with a busy page
      dax: introduce CONFIG_DAX_DRIVER
      dax, dm: allow device-mapper to operate without dax support
      nfit, address-range-scrub: fix scrub in-progress reporting
      libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device'
      nfit, address-range-scrub: introduce nfit_spa->ars_state
      libnvdimm, dimm: fix dpa reservation vs uninitialized label area
      libnvdimm, namespace: use a safe lookup for dimm device name
      libnvdimm, region: quiet region probe
      nfit, address-range-scrub: determine one platform max_ars value
      nfit, address-range-scrub: rework and simplify ARS state machine
      nfit, address-range-scrub: add module option to skip initial ars
      libnvdimm, of_pmem: workaround OF_NUMA=n build error
      Merge branch 'for-4.17/libnvdimm' into libnvdimm-for-next
      Merge branch 'for-4.17/dax' into libnvdimm-for-next

Johannes Thumshirn (4):
      acpi, nfit: remove redundant __func__ in dev_dbg
      libnvdimm: provide module_nd_driver wrapper
      libnvdimm, pmem: use module_nd_driver
      device-dax: use module_nd_driver

Oliver O'Halloran (4):
      libnvdimm: Add of_node to region and bus descriptors
      libnvdimm: Add
WARNING: kobject bug in corrupted
Hello, syzbot hit the following crash on upstream commit fd40ffc72e2f74c7db61e400903e7d50a88bc0b0 (Mon Apr 9 18:36:05 2018 +) selinux: fix missing dput() before selinuxfs unmount syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=dd8fe49d0d1423aa5295 C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5710100694040576 syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=5951393567342592 Raw console output: https://syzkaller.appspot.com/x/log.txt?id=6276231339180032 Kernel config: https://syzkaller.appspot.com/x/.config?id=-771321277174894814 compiler: gcc (GCC) 8.0.1 20180301 (experimental) IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+dd8fe49d0d1423aa5...@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. See footer for details. If you forward the report, please keep this part and the footer. Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 kobject_add_internal failed for gfs2meta with -EEXIST, don't try to register things with the same name in the same directory. sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30 sysfs_create_dir_ns+0x184/0x1d0 fs/sysfs/dir.c:58 WARNING: CPU: 1 PID: 4473 at lib/kobject.c:238 kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236 create_dir lib/kobject.c:69 [inline] kobject_add_internal+0x353/0xba0 lib/kobject.c:228 Kernel panic - not syncing: panic_on_warn set ... 
kobject_add_varg lib/kobject.c:364 [inline] kobject_init_and_add+0xed/0x130 lib/kobject.c:435 gfs2_sys_fs_add+0x1ff/0x500 fs/gfs2/sys.c:652 fill_super+0x8c9/0x1a40 fs/gfs2/ops_fstype.c:1118 gfs2_mount+0x5e6/0x712 fs/gfs2/ops_fstype.c:1321 mount_fs+0xae/0x328 fs/super.c:1222 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037 vfs_kern_mount fs/namespace.c:1027 [inline] do_new_mount fs/namespace.c:2517 [inline] do_mount+0x564/0x3070 fs/namespace.c:2847 ksys_mount+0x12d/0x140 fs/namespace.c:3063 SYSC_mount fs/namespace.c:3077 [inline] SyS_mount+0x35/0x50 fs/namespace.c:3074 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x4430ca RSP: 002b:7fff5f80e158 EFLAGS: 0297 ORIG_RAX: 00a5 RAX: ffda RBX: 0003 RCX: 004430ca RDX: 2040 RSI: 2080 RDI: 7fff5f80e170 RBP: 006cb018 R08: 24c0 R09: 000a R10: R11: 0297 R12: 6e5f6b636f6c3d6f R13: 746f72706b636f6c R14: 0030656c69662f2e R15: 0004 CPU: 1 PID: 4473 Comm: syzkaller208561 Not tainted 4.16.0+ #14 [ cut here ] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 kobject_add_internal failed for gfs2meta with -EEXIST, don't try to register things with the same name in the same directory. 
panic+0x22f/0x4de kernel/panic.c:183 WARNING: CPU: 0 PID: 4470 at lib/kobject.c:238 kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236 Modules linked in: CPU: 0 PID: 4470 Comm: syzkaller208561 Not tainted 4.16.0+ #14 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236 __warn.cold.8+0x163/0x1a3 kernel/panic.c:547 RSP: 0018:8801af7af480 EFLAGS: 00010286 report_bug+0x252/0x2d0 lib/bug.c:186 RAX: 007d RBX: 8801af24d1d0 RCX: 815f42ed fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296 RDX: RSI: 815f8fa1 RDI: 8801af7aefe0 RBP: 8801af7af578 R08: 8801af794640 R09: 0006 R10: 8801af794640 R11: R12: ffef R13: 8801d3abea48 R14: 110035ef5e9a R15: 8801d3abea00 FS: 011be880() GS:8801db00() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 7fff0fb79330 CR3: 0001af48 CR4: 001406f0 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 DR0: DR1: DR2: invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:991 DR3: DR6: fffe0ff0 DR7: 0400 Call Trace: RIP: 0010:kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236 RSP: 0018:8801af4ef480 EFLAGS: 00010286 RAX: 007d RBX: 8801af2a1210 RCX: 815f42ed RDX: RSI: 815f8fa1 RDI: 8801af4eefe0 RBP: 8801af4ef578 R08: 8801af00c700 R09: 0006 R10: 8801af00c700 R11: R12: ffef R13: 8801d3abea48 R14: 110035e9de9a R15: 8801d3abea00 kobject_add_varg lib/kobject.c:364 [inline] kobject_init_and_add+0xed/0x130 lib/kobject.c:435 gfs2_sys_fs_add+0x1ff/0x500 fs/gfs2/sys.c:652
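To make the -EEXIST message in the report concrete, here is a small userspace sketch of a directory that refuses a second entry with the same name, which is the condition the warning text describes ("don't try to register things with the same name in the same directory"). It only models the naming rule and is not how kobject or sysfs are implemented. The name_dir type and name_dir_add() are hypothetical.

```c
#include <errno.h>
#include <string.h>

#define NAME_DIR_MAX_ENTRIES 8
#define NAME_DIR_MAX_NAME    32

/* Hypothetical flat directory of unique names. */
struct name_dir {
	char names[NAME_DIR_MAX_ENTRIES][NAME_DIR_MAX_NAME];
	int count;
};

/* Add a name; a duplicate in the same directory is rejected with -EEXIST. */
int name_dir_add(struct name_dir *dir, const char *name)
{
	int i;

	if (strlen(name) >= NAME_DIR_MAX_NAME)
		return -ENAMETOOLONG;

	for (i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return -EEXIST;

	if (dir->count == NAME_DIR_MAX_ENTRIES)
		return -ENOSPC;

	strcpy(dir->names[dir->count++], name);
	return 0;
}
```

In the report above, two processes (PID 4470 and PID 4473) each reach gfs2_sys_fs_add() and try to register "gfs2meta"; in this model the second name_dir_add() call with that name is the one that fails.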
WARNING: kobject bug in corrupted
Hello,

syzbot hit the following crash on upstream commit
fd40ffc72e2f74c7db61e400903e7d50a88bc0b0 (Mon Apr 9 18:36:05 2018 +)
selinux: fix missing dput() before selinuxfs unmount
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=dd8fe49d0d1423aa5295

C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5710100694040576
syzkaller reproducer: https://syzkaller.appspot.com/x/repro.syz?id=5951393567342592
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=6276231339180032
Kernel config: https://syzkaller.appspot.com/x/.config?id=-771321277174894814
compiler: gcc (GCC) 8.0.1 20180301 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+dd8fe49d0d1423aa5...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
kobject_add_internal failed for gfs2meta with -EEXIST, don't try to register things with the same name in the same directory.
 sysfs_warn_dup.cold.3+0x1c/0x2b fs/sysfs/dir.c:30
 sysfs_create_dir_ns+0x184/0x1d0 fs/sysfs/dir.c:58
WARNING: CPU: 1 PID: 4473 at lib/kobject.c:238 kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236
 create_dir lib/kobject.c:69 [inline]
 kobject_add_internal+0x353/0xba0 lib/kobject.c:228
Kernel panic - not syncing: panic_on_warn set ...

 kobject_add_varg lib/kobject.c:364 [inline]
 kobject_init_and_add+0xed/0x130 lib/kobject.c:435
 gfs2_sys_fs_add+0x1ff/0x500 fs/gfs2/sys.c:652
 fill_super+0x8c9/0x1a40 fs/gfs2/ops_fstype.c:1118
 gfs2_mount+0x5e6/0x712 fs/gfs2/ops_fstype.c:1321
 mount_fs+0xae/0x328 fs/super.c:1222
 vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:1027 [inline]
 do_new_mount fs/namespace.c:2517 [inline]
 do_mount+0x564/0x3070 fs/namespace.c:2847
 ksys_mount+0x12d/0x140 fs/namespace.c:3063
 SYSC_mount fs/namespace.c:3077 [inline]
 SyS_mount+0x35/0x50 fs/namespace.c:3074
 do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x4430ca
RSP: 002b:7fff5f80e158 EFLAGS: 0297 ORIG_RAX: 00a5
RAX: ffda RBX: 0003 RCX: 004430ca
RDX: 2040 RSI: 2080 RDI: 7fff5f80e170
RBP: 006cb018 R08: 24c0 R09: 000a
R10: R11: 0297 R12: 6e5f6b636f6c3d6f
R13: 746f72706b636f6c R14: 0030656c69662f2e R15: 0004
CPU: 1 PID: 4473 Comm: syzkaller208561 Not tainted 4.16.0+ #14
[ cut here ]
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1b9/0x294 lib/dump_stack.c:113
kobject_add_internal failed for gfs2meta with -EEXIST, don't try to register things with the same name in the same directory.
 panic+0x22f/0x4de kernel/panic.c:183
WARNING: CPU: 0 PID: 4470 at lib/kobject.c:238 kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236
Modules linked in:
CPU: 0 PID: 4470 Comm: syzkaller208561 Not tainted 4.16.0+ #14
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236
 __warn.cold.8+0x163/0x1a3 kernel/panic.c:547
RSP: 0018:8801af7af480 EFLAGS: 00010286
 report_bug+0x252/0x2d0 lib/bug.c:186
RAX: 007d RBX: 8801af24d1d0 RCX: 815f42ed
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
RDX: RSI: 815f8fa1 RDI: 8801af7aefe0
RBP: 8801af7af578 R08: 8801af794640 R09: 0006
R10: 8801af794640 R11: R12: ffef
R13: 8801d3abea48 R14: 110035ef5e9a R15: 8801d3abea00
FS: 011be880() GS:8801db00() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7fff0fb79330 CR3: 0001af48 CR4: 001406f0
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
DR0: DR1: DR2:
 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:991
DR3: DR6: fffe0ff0 DR7: 0400
Call Trace:
RIP: 0010:kobject_add_internal+0x8e0/0xba0 lib/kobject.c:236
RSP: 0018:8801af4ef480 EFLAGS: 00010286
RAX: 007d RBX: 8801af2a1210 RCX: 815f42ed
RDX: RSI: 815f8fa1 RDI: 8801af4eefe0
RBP: 8801af4ef578 R08: 8801af00c700 R09: 0006
R10: 8801af00c700 R11: R12: ffef
R13: 8801d3abea48 R14: 110035e9de9a R15: 8801d3abea00
 kobject_add_varg lib/kobject.c:364 [inline]
 kobject_init_and_add+0xed/0x130 lib/kobject.c:435
 gfs2_sys_fs_add+0x1ff/0x500 fs/gfs2/sys.c:652
Re: [RFC] vhost: introduce mdev based hardware vhost backend
On Tue, Apr 10, 2018 at 10:52:52AM +0800, Jason Wang wrote:
> On 2018-04-02 23:23, Tiwei Bie wrote:
> > This patch introduces a mdev (mediated device) based hardware
> > vhost backend. This backend is an abstraction of the various
> > hardware vhost accelerators (potentially any device that uses
> > virtio ring can be used as a vhost accelerator). Some generic
> > mdev parent ops are provided for accelerator drivers to support
> > generating mdev instances.
> >
> > What's this
> > ===
> >
> > The idea is that we can set up a virtio ring compatible device
> > with the messages available at the vhost-backend. Originally,
> > these messages are used to implement a software vhost backend,
> > but now we will use these messages to set up a virtio ring
> > compatible hardware device. Then the hardware device will be
> > able to work with the guest virtio driver in the VM just like
> > what the software backend does. That is to say, we can implement
> > a hardware based vhost backend in QEMU, and any virtio ring
> > compatible devices potentially can be used with this backend.
> > (We also call it vDPA -- vhost Data Path Acceleration).
> >
> > One problem is that different virtio ring compatible devices
> > may have different device interfaces. That is to say, we will
> > need different drivers in QEMU. It could be troublesome. And
> > that's what this patch is trying to fix. The idea behind this
> > patch is very simple: mdev is a standard way to emulate device
> > in kernel.
>
> So you just move the abstraction layer from qemu to kernel, and you still
> need different drivers in kernel for different device interfaces of
> accelerators. This looks even more complex than leaving it in qemu. As you
> said, another idea is to implement userspace vhost backend for accelerators
> which seems easier and could co-work with other parts of qemu without
> inventing new type of messages.

I'm not quite sure. Do you think it's acceptable to
add various vendor specific hardware drivers in QEMU?

>
> Need careful thought here to seek a best solution here.

Yeah, definitely! :)
And your opinions would be very helpful!

>
> > So we defined a standard device based on mdev, which
> > is able to accept vhost messages. When the mdev emulation code
> > (i.e. the generic mdev parent ops provided by this patch) gets
> > vhost messages, it will parse and deliver them to accelerator
> > drivers. Drivers can use these messages to set up accelerators.
> >
> > That is to say, the generic mdev parent ops (e.g. read()/write()/
> > ioctl()/...) will be provided for accelerator drivers to register
> > accelerators as mdev parent devices. And each accelerator device
> > will support generating standard mdev instance(s).
> >
> > With this standard device interface, we will be able to just
> > develop one userspace driver to implement the hardware based
> > vhost backend in QEMU.
> >
> > Difference between vDPA and PCI passthru
> >
> > The key difference between vDPA and PCI passthru is that, in
> > vDPA only the data path of the device (e.g. DMA ring, notify
> > region and queue interrupt) is passed through to the VM, the
> > device control path (e.g. PCI configuration space and MMIO
> > regions) is still defined and emulated by QEMU.
> >
> > The benefits of keeping virtio device emulation in QEMU compared
> > with virtio device PCI passthru include (but are not limited to):
> >
> > - consistent device interface for guest OS in the VM;
> > - max flexibility on the hardware design, especially the
> >   accelerator for each vhost backend doesn't have to be a
> >   full PCI device;
> > - leveraging the existing virtio live-migration framework;
> >
> > The interface of this mdev based device
> > ===
> >
> > 1. BAR0
> >
> > The MMIO region described by BAR0 is the main control
> > interface. Messages will be written to or read from
> > this region.
> >
> > The message type is determined by the `request` field
> > in message header. The message size is encoded in the
> > message header too. The message format looks like this:
> >
> > struct vhost_vfio_op {
> > 	__u64 request;
> > 	__u32 flags;
> > 	/* Flag values: */
> > #define VHOST_VFIO_NEED_REPLY 0x1 /* Whether need reply */
> > 	__u32 size;
> > 	union {
> > 		__u64 u64;
> > 		struct vhost_vring_state state;
> > 		struct vhost_vring_addr addr;
> > 		struct vhost_memory memory;
> > 	} payload;
> > };
> >
> > The existing vhost-kernel ioctl cmds are reused as
> > the message requests in above structure.
> >
> > Each message will be written to or read from this
> > region at offset 0:
> >
> > int vhost_vfio_write(struct vhost_dev *dev, struct vhost_vfio_op *op)
> > {
> > 	int count = VHOST_VFIO_OP_HDR_SIZE + op->size;
> > 	struct vhost_vfio *vfio = dev->opaque;
> > 	int ret;
> >
> > 	ret = pwrite64(vfio->device_fd, op, count,
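
To make the framing described above concrete (a fixed header carrying `request`, `flags` and `size`, followed by `size` bytes of payload, all transferred at offset 0), here is a minimal userspace sketch. It is not the patch's code: every `demo_`/`DEMO_` name is hypothetical, the payload union is reduced to a `u64` plus a raw byte buffer standing in for the vring and memory structs, and a scratch file stands in for the mdev device fd.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical mirror of the quoted message layout (not the real uapi). */
struct demo_vfio_op {
    uint64_t request;            /* message type */
    uint32_t flags;              /* e.g. DEMO_NEED_REPLY */
    uint32_t size;               /* number of payload bytes that follow */
    union {
        uint64_t u64;
        unsigned char raw[64];   /* stand-in for the vring/memory structs */
    } payload;
};

#define DEMO_NEED_REPLY  0x1u
#define DEMO_OP_HDR_SIZE offsetof(struct demo_vfio_op, payload)

/* Write header plus payload in one call at offset 0. */
static int demo_op_write(int fd, const struct demo_vfio_op *op)
{
    size_t count = DEMO_OP_HDR_SIZE + op->size;

    if (op->size > sizeof(op->payload))
        return -1;               /* refuse to read past the struct */
    return pwrite(fd, op, count, 0) == (ssize_t)count ? 0 : -1;
}

/* Read the header first, then exactly `size` payload bytes. */
static int demo_op_read(int fd, struct demo_vfio_op *op)
{
    memset(op, 0, sizeof(*op));
    if (pread(fd, op, DEMO_OP_HDR_SIZE, 0) != (ssize_t)DEMO_OP_HDR_SIZE)
        return -1;
    if (op->size > sizeof(op->payload))
        return -1;
    if (pread(fd, &op->payload, op->size, DEMO_OP_HDR_SIZE) != (ssize_t)op->size)
        return -1;
    return 0;
}

/* Round-trip one u64 message through a scratch file standing in for BAR0. */
static int demo_roundtrip(uint64_t request, uint64_t value,
                          struct demo_vfio_op *in)
{
    struct demo_vfio_op out;
    FILE *f = tmpfile();
    int rc;

    if (!f)
        return -1;
    memset(&out, 0, sizeof(out));
    out.request = request;
    out.flags = DEMO_NEED_REPLY;
    out.size = sizeof(out.payload.u64);
    out.payload.u64 = value;
    rc = demo_op_write(fileno(f), &out);
    if (rc == 0)
        rc = demo_op_read(fileno(f), in);
    fclose(f);
    return rc;
}
```

The bounds check on `size` is an addition of this sketch; whether and where the real backend validates the length is not visible in the quoted excerpt.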
linux-next: Tree for Apr 10
Hi all,

Please do not add any v4.18 destined stuff to your linux-next included
trees until after v4.17-rc1 has been released.

Changes since 20180409:

The parisc-hd tree still had its build failure for which I applied a patch.

The nvdimm tree lost its build failure.

Non-merge commits (relative to Linus' tree): 1678
 1682 files changed, 62884 insertions(+), 31587 deletions(-)

I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one. You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source. There are also quilt-import.log and merge.log
files in the Next directory. Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 258 trees (counting Linus' and 44 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next . If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds. And to Paul
Gortmaker for triage and bug fixes.

--
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (fd3b36d27566 Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs)
Merging fixes/master (147a89bc71e7 Merge tag 'kconfig-v4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging kbuild-current/fixes (28913ee8191a netfilter: nf_nat_snmp_basic: add correct dependency to Makefile)
Merging arc-current/for-curr (661e50bc8532 Linux 4.16-rc4)
Merging arm-current/fixes (2a141cd0d83b ARM: 8758/1: decompressor: restore r1 and r2 just before jumping to the kernel)
Merging arm64-fixes/for-next/fixes (e21da1c99200 arm64: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery)
Merging m68k-current/for-linus (ecd685580c8f m68k/mac: Remove bogus "FIXME" comment)
Merging powerpc-fixes/fixes (52396500f97c powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs)
Merging sparc/master (17dec0a94915 Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace)
Merging fscrypt-current/for-stable (ae64f9bd1d36 Linux 4.15-rc2)
Merging net/master (a2ac99905f1e vhost-net: set packet weight of tx polling to 2 * vq size)
Merging bpf/master (33491588c1fb kernel/bpf/syscall: fix warning defined but not used)
Merging ipsec/master (4b66af2d6356 af_key: Always verify length of provided sadb_key)
Merging netfilter/master (3f1e53abff84 netfilter: ebtables: don't attempt to allocate 0-sized compat array)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook mask only if set)
Merging wireless-drivers/master (77e30e10ee28 iwlwifi: mvm: query regdb for wmm rule if needed)
Merging mac80211/master (b5dbc28762fd Merge tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild)
Merging rdma-fixes/for-rc (84652aefb347 RDMA/ucma: Introduce safer rdma_addr_size() variants)
Merging sound-current/for-linus (e1a3a981e320 ALSA: pcm: Remove WARN_ON() at snd_pcm_hw_params() error)
Merging pci-current/for-linus (fc110ebdd014 PCI: dwc: Fix enumeration end when reaching root subordinate)
Merging driver-core.current/driver-core-linus (38c23685b273 Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging tty.current/tty-linus (38c23685b273 Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging usb.current/usb-linus (38c23685b273 Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging usb-gadget-fixes/fixes (c6ba5084ce0d usb: gadget: udc: renesas_usb3: add binging for r8a77965)
Merging usb-serial-fixes/usb-linus (86d71233b615 USB: serial: ftdi_sio: add support for Harman FirmwareHubEmulator)
Merging usb-chipidea-fixes/ci-fo
Re: [PATCH] xhci: Fix USB ports for Dell Inspiron 5775
Hi Matthias,

> On Mar 18, 2018, at 11:11 PM, Kai-Heng Feng wrote:
>
> The Dell Inspiron 5775 is a Raven Ridge. The Enable Slot command timed
> out when a USB device gets plugged:
> [ 212.156326] xhci_hcd :03:00.3: Error while assigning device slot ID
> [ 212.156340] xhci_hcd :03:00.3: Max number of devices this xHCI host supports is 64.
> [ 212.156348] usb usb2-port3: couldn't allocate usb_device
>
> AMD suggests that a delay before xHC suspends can fix the issue.
>
> I can confirm it fixes the issue, so use the suspend delay quirk for
> Raven Ridge's xHC.

I am hoping this patch can get merged in v4.17...

Thanks,
Kai-Heng

> Cc: sta...@vger.kernel.org
> Signed-off-by: Kai-Heng Feng
> ---
>  drivers/usb/host/xhci-pci.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index d9f831b67e57..93ce34bce7b5 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -126,7 +126,10 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
>  	if (pdev->vendor == PCI_VENDOR_ID_AMD && usb_amd_find_chipset_info())
>  		xhci->quirks |= XHCI_AMD_PLL_FIX;
>
> -	if (pdev->vendor == PCI_VENDOR_ID_AMD && pdev->device == 0x43bb)
> +	if (pdev->vendor == PCI_VENDOR_ID_AMD &&
> +		(pdev->device == 0x15e0 ||
> +		 pdev->device == 0x15e1 ||
> +		 pdev->device == 0x43bb))
>  		xhci->quirks |= XHCI_SUSPEND_DELAY;
>
>  	if (pdev->vendor == PCI_VENDOR_ID_AMD)
> --
> 2.15.1
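
For readers skimming the diff, the quirk selection amounts to a lookup over PCI vendor and device IDs. The sketch below is a table-driven rewrite for illustration only, not how the patch is written: the `demo_` names are hypothetical and the quirk bit is a placeholder rather than the kernel's `XHCI_SUSPEND_DELAY` value. Only the three device IDs (from the diff) and AMD's vendor ID 0x1022 are real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PCI_VENDOR_ID_AMD   0x1022u      /* AMD's PCI vendor ID */
#define DEMO_QUIRK_SUSPEND_DELAY (1ull << 0)  /* placeholder bit, not the kernel's */

/* Device IDs that get the suspend delay after the patch. */
static const uint16_t demo_suspend_delay_ids[] = { 0x15e0, 0x15e1, 0x43bb };

/* Return the quirk mask this sketch would assign to a controller. */
static uint64_t demo_xhci_quirks(uint16_t vendor, uint16_t device)
{
    uint64_t quirks = 0;
    size_t i;

    if (vendor != DEMO_PCI_VENDOR_ID_AMD)
        return quirks;

    for (i = 0; i < sizeof(demo_suspend_delay_ids) /
                    sizeof(demo_suspend_delay_ids[0]); i++) {
        if (device == demo_suspend_delay_ids[i])
            quirks |= DEMO_QUIRK_SUSPEND_DELAY;
    }
    return quirks;
}
```

A table keeps the condition short if more Raven Ridge IDs turn up later; the open-coded comparison in the patch is equally valid for three entries.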
Re: [PATCH v5 0/6] enable creating [k,u]probe with perf_event_open
On 4/9/18 9:45 PM, Ravi Bangoria wrote:
> Hi Song,
>
> On 12/07/2017 04:15 AM, Song Liu wrote:
> > With current kernel, user space tools can only create/destroy [k,u]probes
> > with a text-based API (kprobe_events and uprobe_events in tracefs). This
> > approach relies on user space to clean up the [k,u]probe after using them.
> > However, this is not easy for user space to clean up properly.
> >
> > To solve this problem, we introduce a file descriptor based API.
> > Specifically, we extended perf_event_open to create [k,u]probe, and attach
> > this [k,u]probe to the file descriptor created by perf_event_open. These
> > [k,u]probe are associated with this file descriptor, so they are not
> > available in tracefs.
>
> Sorry for being late. One simple question..
>
> Will it be good to support k/uprobe arguments with perf_event_open()?
> Do you have any plans about that?

no plans for that. People that use text based interfaces should probably
be using text interfaces consistently.
imo mixing FD-based kprobe api with text is not worth the complexity.
Re: [PATCH v2] resource: Fix integer overflow at reallocation
On Tue, 10 Apr 2018 02:23:26 +0200, Andrew Morton wrote:
>
> On Sun, 8 Apr 2018 09:20:26 +0200 Takashi Iwai wrote:
>
> > We've got a bug report indicating a kernel panic at booting on an
> > x86-32 system, and it turned out to be the invalid resource assigned
> > after PCI resource reallocation. __find_resource() first aligns the
> > resource start address and resets the end address with start+size-1
> > accordingly, then checks whether it's contained. Here the end address
> > may overflow the integer, although resource_contains() still returns
> > true because the function validates only start and end address. So
> > this ends up with returning an invalid resource (start > end).
> >
> > There was already an attempt to cover such a problem in the commit
> > 47ea91b4052d ("Resource: fix wrong resource window calculation"), but
> > this case is an overseen one.
> >
> > This patch adds the validity check in resource_contains() to see
> > whether the given resource has a valid range for avoiding the integer
> > overflow problem.
> >
> > ...
> >
> > --- a/include/linux/ioport.h
> > +++ b/include/linux/ioport.h
> > @@ -212,6 +212,9 @@ static inline bool resource_contains(struct resource
> > *r1, struct resource *r2)
> >  		return false;
> >  	if (r1->flags & IORESOURCE_UNSET || r2->flags & IORESOURCE_UNSET)
> >  		return false;
> > +	/* sanity check whether it's a valid resource range */
> > +	if (r2->end < r2->start)
> > +		return false;
> >  	return r1->start <= r2->start && r1->end >= r2->end;
> >  }
>
> This doesn't look like the correct place to handle this? Clearly .end
> < .start is an invalid state for a resource and we should never have
> constructed such a thing in the first place? So adding a check at the
> place where this resource was initially created seems to be the correct
> fix?

Yes, that was also my first thought and actually the v1 patch was like
that. The v2 one was by Ram's suggestion so that we can cover
potential bugs by all other callers as well.

I don't mind in which way to fix; below is the v1 version. Please
choose the one you think better.

Thanks!

Takashi

-- 8< --
From: Takashi Iwai
Subject: [PATCH v1] resource: Fix integer overflow at reallocation

We've got a bug report indicating a kernel panic at booting on an
x86-32 system, and it turned out to be the invalid PCI resource
assigned after reallocation. __find_resource() first aligns the
resource start address and resets the end address with start+size-1
accordingly, then checks whether it's contained. Here the end address
may overflow the integer, although resource_contains() still returns
true because the function validates only start and end address. So
this ends up with returning an invalid resource (start > end).

There was already an attempt to cover such a problem in the commit
47ea91b4052d ("Resource: fix wrong resource window calculation"), but
this case is an overseen one.

This patch adds the validity check of the newly calculated resource
for avoiding the integer overflow problem.

Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739
Fixes: 23c570a67448 ("resource: ability to resize an allocated resource")
Reported-and-tested-by: Michael Henders
Cc:
Signed-off-by: Takashi Iwai
---
 kernel/resource.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index e270b5048988..2af6c03858b9 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -651,7 +651,8 @@ static int __find_resource(struct resource *root, struct resource *old,
 			alloc.start = constraint->alignf(constraint->alignf_data, &avail,
 					size, constraint->align);
 			alloc.end = alloc.start + size - 1;
-			if (resource_contains(&avail, &alloc)) {
+			if (alloc.start <= alloc.end &&
+			    resource_contains(&avail, &alloc)) {
 				new->start = alloc.start;
 				new->end = alloc.end;
 				return 0;
--
2.16.2
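
The wraparound described in the commit message can be reproduced in a few lines of userspace C. This is a sketch under stated assumptions, not kernel code: `demo_size_t` models a 32-bit `resource_size_t` (as one would expect on the x86-32 system in the report), the `demo_` helpers are hypothetical stand-ins for `resource_contains()` with and without the extra range check, and the addresses are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t demo_size_t;  /* assumed 32-bit resource_size_t */

struct demo_res {
    demo_size_t start;
    demo_size_t end;
};

/* Containment test comparing only the two bounds, as in the quoted code. */
static bool demo_contains(const struct demo_res *r1, const struct demo_res *r2)
{
    return r1->start <= r2->start && r1->end >= r2->end;
}

/* Same test with the sanity check from the v2 patch added. */
static bool demo_contains_checked(const struct demo_res *r1,
                                  const struct demo_res *r2)
{
    if (r2->end < r2->start)
        return false;
    return demo_contains(r1, r2);
}

/* Mirror of "alloc.end = alloc.start + size - 1"; wraps modulo 2^32. */
static struct demo_res demo_alloc(demo_size_t start, demo_size_t size)
{
    struct demo_res r;

    r.start = start;
    r.end = (demo_size_t)(start + size - 1);
    return r;
}
```

With a window of 0xf0000000..0xffffffff, an aligned start of 0xfff00000 and a size of 0x00200000, the end wraps to 0x000fffff. The unchecked test accepts that range because both comparisons hold, while the checked variant rejects it. The v1 patch gets the same result by testing `alloc.start <= alloc.end` at the call site.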
Re: [PATCH] mmc: sdhci-pci: Only do AMD tuning for HS200
On 4/7/2018 3:37 AM, Daniel Kurtz wrote:
> Commit c31165d7400b ("mmc: sdhci-pci: Add support for HS200 tuning mode
> on AMD, eMMC-4.5.1") added a HS200 tuning method for use with AMD SDHCI
> controllers. As described in the commit subject, this tuning is specific
> for HS200. However, as implemented, this method is used for all host
> timings, because platform_execute_tuning, if it exists, is called
> unconditionally by sdhci_execute_tuning(). This breaks tuning when using
> the AMD controller with, for example, a DDR50 SD card.
>
> Instead, we can implement an amd execute_tuning wrapper callback, and
> then conditionally do the HS200 specific tuning for HS200, and otherwise
> call back to the standard sdhci_execute_tuning().
>
> Signed-off-by: Daniel Kurtz

Looks good.

Acked-by: Shyam Sundar S K
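The dispatch described in the quoted commit message can be pictured with a small user-space model. This is a sketch only, not the patch itself: `struct host`, `enum timing`, `amd_hs200_tuning()` and `standard_tuning()` are hypothetical stand-ins for the driver's host structure, the kernel's timing constants, the AMD-specific routine and sdhci_execute_tuning().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified timing modes; the real driver uses kernel constants. */
enum timing { TIMING_LEGACY, TIMING_UHS_DDR50, TIMING_MMC_HS200 };

/* Hypothetical host state; the counters only record which path ran. */
struct host {
	enum timing timing;
	int hs200_tuning_runs;
	int standard_tuning_runs;
};

/* Stand-in for the AMD-specific HS200 tuning routine. */
static int amd_hs200_tuning(struct host *h)
{
	h->hs200_tuning_runs++;
	return 0;
}

/* Stand-in for the generic tuning path (sdhci_execute_tuning() in the thread). */
static int standard_tuning(struct host *h, uint32_t opcode)
{
	(void)opcode;
	h->standard_tuning_runs++;
	return 0;
}

/* Wrapper callback: vendor tuning only for HS200, generic tuning otherwise. */
static int amd_execute_tuning(struct host *h, uint32_t opcode)
{
	assert(h != NULL);
	if (h->timing == TIMING_MMC_HS200)
		return amd_hs200_tuning(h);
	return standard_tuning(h, opcode);
}
```

In this model a host in DDR50 timing goes through the generic path only, which is the case the commit message says was broken when the platform hook was called unconditionally.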
Re: [PATCH bpf-next v8 05/11] seccomp,landlock: Enforce Landlock programs per process hierarchy
On Mon, Apr 09, 2018 at 12:01:59AM +0200, Mickaël Salaün wrote:
>
> On 04/08/2018 11:06 PM, Andy Lutomirski wrote:
> > On Sun, Apr 8, 2018 at 6:13 AM, Mickaël Salaün wrote:
> >>
> >> On 02/27/2018 10:48 PM, Mickaël Salaün wrote:
> >>>
> >>> On 27/02/2018 17:39, Andy Lutomirski wrote:
> On Tue, Feb 27, 2018 at 5:32 AM, Alexei Starovoitov wrote:
> > On Tue, Feb 27, 2018 at 05:20:55AM +, Andy Lutomirski wrote:
> >> On Tue, Feb 27, 2018 at 4:54 AM, Alexei Starovoitov wrote:
> >>> On Tue, Feb 27, 2018 at 04:40:34AM +, Andy Lutomirski wrote:
> On Tue, Feb 27, 2018 at 2:08 AM, Alexei Starovoitov wrote:
> > On Tue, Feb 27, 2018 at 01:41:15AM +0100, Mickaël Salaün wrote:
> >> The seccomp(2) syscall can be used by a task to apply a Landlock program
> >> to itself. As a seccomp filter, a Landlock program is enforced for the
> >> current task and all its future children. A program is immutable and a
> >> task can only add new restricting programs to itself, forming a list of
> >> programs.
> >>
> >> A Landlock program is tied to a Landlock hook. If the action on a kernel
> >> object is allowed by the other Linux security mechanisms (e.g. DAC,
> >> capabilities, other LSM), then a Landlock hook related to this kind of
> >> object is triggered. The list of programs for this hook is then
> >> evaluated. Each program returns a 32-bit value which can deny the action
> >> on a kernel object with a non-zero value. If every program of the list
> >> returns zero, then the action on the object is allowed.
> >>
> >> Multiple Landlock programs can be chained to share a 64-bit value for a
> >> call chain (e.g. evaluating multiple elements of a file path). This
> >> chaining is restricted when a process constructs this chain by loading a
> >> program, but additional checks are performed when it requests to apply
> >> this chain of programs to itself. The restrictions ensure that it is
> >> not possible to call multiple programs in a way that would imply to
> >> handle multiple shared values (i.e. cookies) for one chain. For now,
> >> only a fs_pick program can be chained to the same type of program,
> >> because it may make sense if they have different triggers (cf. next
> >> commits). These restrictions still allow reusing Landlock programs in
> >> a safe way (e.g. use the same loaded fs_walk program with multiple
> >> chains of fs_pick programs).
> >>
> >> Signed-off-by: Mickaël Salaün
> >
> > ...
> >
> >> +struct landlock_prog_set *landlock_prepend_prog(
> >> +		struct landlock_prog_set *current_prog_set,
> >> +		struct bpf_prog *prog)
> >> +{
> >> +	struct landlock_prog_set *new_prog_set = current_prog_set;
> >> +	unsigned long pages;
> >> +	int err;
> >> +	size_t i;
> >> +	struct landlock_prog_set tmp_prog_set = {};
> >> +
> >> +	if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK)
> >> +		return ERR_PTR(-EINVAL);
> >> +
> >> +	/* validate memory size allocation */
> >> +	pages = prog->pages;
> >> +	if (current_prog_set) {
> >> +		size_t i;
> >> +
> >> +		for (i = 0; i < ARRAY_SIZE(current_prog_set->programs); i++) {
> >> +			struct landlock_prog_list *walker_p;
> >> +
> >> +			for (walker_p = current_prog_set->programs[i];
> >> +					walker_p; walker_p = walker_p->prev)
> >> +				pages += walker_p->prog->pages;
> >> +		}
> >> +		/* count a struct landlock_prog_set if we need to allocate one */
> >> +		if (refcount_read(_prog_set->usage) != 1)
> >> +			pages += round_up(sizeof(*current_prog_set), PAGE_SIZE)
> >> +				/ PAGE_SIZE;
> >> +	}
> >> +	if (pages > LANDLOCK_PROGRAMS_MAX_PAGES)
> >> +		return ERR_PTR(-E2BIG);
> >> +
> >> +	/* ensure early that we can allocate enough memory for the new
> >> +	 * prog_lists */
> >> +	err = store_landlock_prog(_prog_set, current_prog_set, prog);
> >> +	if (err)
> >> +
Re: [PATCH v5 0/6] enable creating [k,u]probe with perf_event_open
Hi Song,

On 12/07/2017 04:15 AM, Song Liu wrote:
> With current kernel, user space tools can only create/destroy [k,u]probes
> with a text-based API (kprobe_events and uprobe_events in tracefs). This
> approach relies on user space to clean up the [k,u]probe after using them.
> However, this is not easy for user space to clean up properly.
>
> To solve this problem, we introduce a file descriptor based API.
> Specifically, we extended perf_event_open to create [k,u]probe, and attach
> this [k,u]probe to the file descriptor created by perf_event_open. These
> [k,u]probe are associated with this file descriptor, so they are not
> available in tracefs.

Sorry for being late. One simple question..

Will it be good to support k/uprobe arguments with perf_event_open()?
Do you have any plans about that?

Thanks,
Ravi
Re: [PATCH 5/7] arm64: dts: msm8996: Add rpmpd device node
On 04/09/2018 09:33 PM, Stephen Boyd wrote:
> Quoting Rajendra Nayak (2018-03-15 21:08:22)
>> diff --git a/arch/arm64/boot/dts/qcom/msm8996.dtsi b/arch/arm64/boot/dts/qcom/msm8996.dtsi
>> index 0a6f7952bbb1..43757a078146 100644
>> --- a/arch/arm64/boot/dts/qcom/msm8996.dtsi
>> +++ b/arch/arm64/boot/dts/qcom/msm8996.dtsi
>> @@ -297,6 +297,52 @@
>>  				#clock-cells = <1>;
>>  			};
>>
>> +			rpmpd: qcom,rpmpd {
>
> power-controller? power-domain-controller? power-domains? Or something
> like that.
>
>> +				compatible = "qcom,rpmpd-msm8996";
>> +				#power-domain-cells = <1>;
>> +				operating-points-v2 = <_opp_table>, /* cx */
>> +						<_opp_table>, /* cx_ao */
>> +						<_opp_table>, /* cx_vfc */
>> +						<_opp_table>, /* mx */
>> +						<_opp_table>, /* mx_ao */
>> +						<_opp_table>, /* sscx */
>> +						<_opp_table>; /* sscx_vfc */
>> +			};
>> +
>> +			rpmpd_opp_table: opp-table {
>
> This should go into the root of the tree? Otherwise it may be populated
> by the RPMh platform populate code which would be odd. We should go and
> update the platform populate code to always ignore operating-points-v2
> compatible nodes too.
>
>> +				compatible = "operating-points-v2", "operating-points-v2-qcom";
>
> This is backwards? I thought more specific compatible went first.

thanks for the review, will fixup all of these when I respin.

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
Re: [PATCH] f2fs: enlarge block plug coverage
On 04/10, Chao Yu wrote:
> On 2018/4/10 2:02, Jaegeuk Kim wrote:
> > On 04/08, Chao Yu wrote:
> >> On 2018/4/5 11:51, Jaegeuk Kim wrote:
> >>> On 04/04, Chao Yu wrote:
> This patch enlarges block plug coverage in __issue_discard_cmd, in
> order to collect more pending bios before issuing them, to avoid
> being disturbed by previous discard I/O in IO aware discard mode.
> >>>
> >>> Hmm, then we need to wait for huge discard IO for over 10 secs, which
> >>
> >> We found that total discard latency relies on the total number of discards
> >> we issued last time instead of the range or length the discards covered.
> >> IMO, if we don't change .max_requests value, we will not suffer longer latency.
> >>
> >>> will affect following read/write IOs accordingly. In order to avoid that,
> >>> we actually need to limit the discard size.
>
> Do you mean limit discard count or discard length?

Both of them.

>
> >>
> >> If you are worried about I/O interference in between discard and rw, I
> >> suggest to decrease .max_requests value.
> >
> > What do you mean? This will produce more pending requests in the queue?
>
> I mean after applying this patch, we can queue more discard IOs in plug inside
> task, otherwise, previously issued discards in block layer can make is_idle() be
> false, then it can stop an IO-aware user from issuing pending discard commands.

Then, unplug will issue lots of discard commands, which affects the following rw
latencies. My preference would be issuing discard commands one by one as much as
possible.

>
> Thanks,
>
> >
> >>
> >> Thanks,
> >>
> >>>
> >>> Thanks,
> >>>
>
> Signed-off-by: Chao Yu
> ---
>  fs/f2fs/segment.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> index 8f0b5ba46315..4287e208c040 100644
> --- a/fs/f2fs/segment.c
> +++ b/fs/f2fs/segment.c
> @@ -1208,10 +1208,12 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>  		pend_list = >pend_list[i];
>
>  		mutex_lock(>cmd_lock);
> +
> +		blk_start_plug();
> +
>  		if (list_empty(pend_list))
>  			goto next;
>  		f2fs_bug_on(sbi, !__check_rb_tree_consistence(sbi, >root));
> -		blk_start_plug();
>  		list_for_each_entry_safe(dc, tmp, pend_list, list) {
>  			f2fs_bug_on(sbi, dc->state != D_PREP);
>
> @@ -1227,8 +1229,9 @@ static int __issue_discard_cmd(struct f2fs_sb_info *sbi,
>  			if (++iter >= dpolicy->max_requests)
>  				break;
>  		}
> -		blk_finish_plug();
>  next:
> +		blk_finish_plug();
> +
>  		mutex_unlock(>cmd_lock);
>
>  		if (iter >= dpolicy->max_requests)
> --
> 2.15.0.55.gc2ece9dc4de6
> >>>
> >>> .
> >>>
> >
> > .
> >
Re: [PATCH] dmaengine: dmatest: Remove use of VLAs
Hi Laura,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.16 next-20180409]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Laura-Abbott/dmaengine-dmatest-Remove-use-of-VLAs/20180410-094633
config: i386-randconfig-x076-201814 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386

All warnings (new ones prefixed by >>):

   Cyclomatic Complexity 1 include/linux/kasan-checks.h:kasan_check_write
   Cyclomatic Complexity 2 arch/x86/include/asm/bitops.h:set_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:constant_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:variable_test_bit
   Cyclomatic Complexity 1 arch/x86/include/asm/bitops.h:fls
   Cyclomatic Complexity 1 include/linux/log2.h:__ilog2_u32
   Cyclomatic Complexity 3 include/linux/log2.h:is_power_of_2
   Cyclomatic Complexity 1 include/linux/list.h:INIT_LIST_HEAD
   Cyclomatic Complexity 1 include/linux/list.h:__list_add_valid
   Cyclomatic Complexity 1 include/linux/list.h:__list_del_entry_valid
   Cyclomatic Complexity 2 include/linux/list.h:__list_add
   Cyclomatic Complexity 1 include/linux/list.h:list_add_tail
   Cyclomatic Complexity 1 include/linux/list.h:__list_del
   Cyclomatic Complexity 2 include/linux/list.h:__list_del_entry
   Cyclomatic Complexity 1 include/linux/list.h:list_del
   Cyclomatic Complexity 1 include/linux/err.h:IS_ERR
   Cyclomatic Complexity 1 arch/x86/include/asm/current.h:get_current
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_read
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_inc
   Cyclomatic Complexity 1 arch/x86/include/asm/atomic.h:arch_atomic_dec_and_test
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_read
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_inc
   Cyclomatic Complexity 1 include/asm-generic/atomic-instrumented.h:atomic_dec_and_test
   Cyclomatic Complexity 1 include/asm-generic/getorder.h:__get_order
   Cyclomatic Complexity 3 include/linux/bitmap.h:bitmap_zero
   Cyclomatic Complexity 1 include/linux/jiffies.h:_msecs_to_jiffies
   Cyclomatic Complexity 3 include/linux/jiffies.h:msecs_to_jiffies
   Cyclomatic Complexity 70 include/linux/ktime.h:ktime_divns
   Cyclomatic Complexity 1 include/linux/ktime.h:ktime_to_us
   Cyclomatic Complexity 1 include/linux/mmzone.h:pfn_to_section_nr
   Cyclomatic Complexity 2 include/linux/mmzone.h:__nr_to_section
   Cyclomatic Complexity 1 include/linux/mmzone.h:__section_mem_map_addr
   Cyclomatic Complexity 1 include/linux/mmzone.h:__pfn_to_section
   Cyclomatic Complexity 1 include/linux/kobject.h:kobject_name
   Cyclomatic Complexity 2 include/linux/device.h:dev_name
   Cyclomatic Complexity 1 include/linux/dma-debug.h:debug_dma_map_page
   Cyclomatic Complexity 1 include/linux/dma-debug.h:debug_dma_mapping_error
   Cyclomatic Complexity 1 include/linux/dma-mapping.h:valid_dma_direction
   Cyclomatic Complexity 1 arch/x86/include/asm/dma-mapping.h:get_arch_dma_ops
   Cyclomatic Complexity 4 include/linux/dma-mapping.h:get_dma_ops
   Cyclomatic Complexity 1 include/linux/dma-mapping.h:dma_map_page_attrs
   Cyclomatic Complexity 2 include/linux/dma-mapping.h:dma_mapping_error
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_submit_error
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_chan_name
   Cyclomatic Complexity 2 include/linux/dmaengine.h:dmaengine_terminate_all
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dmaf_continue
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dmaf_p_disabled_continue
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_dev_has_pq_continue
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_dev_to_maxpq
   Cyclomatic Complexity 4 include/linux/dmaengine.h:dma_maxpq
   Cyclomatic Complexity 1 include/linux/dmaengine.h:__dma_cap_set
   Cyclomatic Complexity 1 include/linux/dmaengine.h:__dma_cap_zero
   Cyclomatic Complexity 2 include/linux/dmaengine.h:__dma_has_cap
   Cyclomatic Complexity 1 include/linux/dmaengine.h:dma_async_issue_pending
   Cyclomatic Complexity 3 include/linux/dmaengine.h:dma_async_is_tx_complete
   Cyclomatic Complexity 2 include/linux/freezer.h:freezing
   Cyclomatic Complexity 2 include/linux/freezer.h:try_to_freeze_unsafe
   Cyclomatic Complexity 2 include/linux/freezer.h:try_to_freeze
   Cyclomatic Complexity 1 include/linux/kasan.h:kasan_kmalloc
   Cyclomatic Complexity 28 include/linux/slab.h:kmalloc_index
   Cyclomatic Complexity 1 include/linux/slab.h:kmem_cache_alloc_trace
   Cyclomatic Complexity 1 include/linux/slab.h:kmalloc_order_trace
   Cyclomatic Complexity 67 include/linux/slab.h:kmalloc_large
   Cyclomatic Complexity 5 include/linux/slab.h:k
Re: [PATCH] f2fs: don't use GFP_ZERO for page caches
On 04/10, Chao Yu wrote:
> On 2018/4/10 3:00, Jaegeuk Kim wrote:
> > From: Chao Yu
> >
> > Related to https://lkml.org/lkml/2018/4/8/661
> >
> > Sometimes, we need to write meta data to new allocated block address,
> > then we will allocate a zeroed page in inner inode's address space, and
> > fill partial data in it, and leave other place with zero value which means
> > some fields are initial status.
> >
> > There are two inner inodes (meta inode and node inode) setting __GFP_ZERO,
> > I have just checked them, for both of them, we can avoid using __GFP_ZERO,
> > and do initialization by ourselves to avoid unneeded/redundant zeroing
> > from mm.
> >
> > Cc:
> > Signed-off-by: Chao Yu
> > Signed-off-by: Jaegeuk Kim
> > ---
> >  fs/f2fs/inode.c    | 4 ++--
> >  fs/f2fs/node.c     | 6 --
> >  fs/f2fs/node.h     | 7 ++-
> >  fs/f2fs/recovery.c | 3 +--
> >  4 files changed, 9 insertions(+), 11 deletions(-)
> >
> > diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> > index 417c9dcd0269..87535bf63421 100644
> > --- a/fs/f2fs/inode.c
> > +++ b/fs/f2fs/inode.c
> > @@ -320,10 +320,10 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
> >  make_now:
> >  	if (ino == F2FS_NODE_INO(sbi)) {
> >  		inode->i_mapping->a_ops = _node_aops;
> > -		mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> > +		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
> >  	} else if (ino == F2FS_META_INO(sbi)) {
> >  		inode->i_mapping->a_ops = _meta_aops;
> > -		mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> > +		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
> >  	} else if (S_ISREG(inode->i_mode)) {
> >  		inode->i_op = _file_inode_operations;
> >  		inode->i_fop = _file_operations;
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index 9a99243054ba..6fc3311820ec 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -1096,7 +1096,8 @@ struct page *new_node_page(struct dnode_of_data *dn, unsigned int ofs)
> >  	set_node_addr(sbi, _ni, NEW_ADDR, false);
> >
> >  	f2fs_wait_on_page_writeback(page, NODE, true);
> > -	fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs, true);
> > +	memset(F2FS_NODE(page), 0, PAGE_SIZE);
> > +	fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs);
> >  	set_cold_node(page, S_ISDIR(dn->inode->i_mode));
> >  	if (!PageUptodate(page))
> >  		SetPageUptodate(page);
> > @@ -2311,7 +2312,8 @@ int recover_inode_page(struct f2fs_sb_info *sbi, struct page *page)
> >
> >  	if (!PageUptodate(ipage))
> >  		SetPageUptodate(ipage);
> > -	fill_node_footer(ipage, ino, ino, 0, true);
> > +	memset(F2FS_NODE(page), 0, PAGE_SIZE);
>
> At a glance, should be memset(F2FS_NODE(ipage), 0, PAGE_SIZE);

Actually, we don't need to do this, since fill_node_footer(true) will reset
the page.

>
> Sorry about that.
>
> Thanks,
>
> > +	fill_node_footer(ipage, ino, ino, 0);
> >  	set_cold_node(page, false);
> >
> >  	src = F2FS_INODE(page);
> > diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
> > index b95e49e4a928..42cd081114ab 100644
> > --- a/fs/f2fs/node.h
> > +++ b/fs/f2fs/node.h
> > @@ -263,15 +263,12 @@ static inline block_t next_blkaddr_of_node(struct page *node_page)
> >  }
> >
> >  static inline void fill_node_footer(struct page *page, nid_t nid,
> > -				nid_t ino, unsigned int ofs, bool reset)
> > +				nid_t ino, unsigned int ofs)
> >  {
> >  	struct f2fs_node *rn = F2FS_NODE(page);
> >  	unsigned int old_flag = 0;
> >
> > -	if (reset)
> > -		memset(rn, 0, sizeof(*rn));
> > -	else
> > -		old_flag = le32_to_cpu(rn->footer.flag);
> > +	old_flag = le32_to_cpu(rn->footer.flag);
> >
> >  	rn->footer.nid = cpu_to_le32(nid);
> >  	rn->footer.ino = cpu_to_le32(ino);
> > diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> > index 1b23d3febe4c..de24f3247aa5 100644
> > --- a/fs/f2fs/recovery.c
> > +++ b/fs/f2fs/recovery.c
> > @@ -540,8 +540,7 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode,
> >  			}
> >
> >  			copy_node_footer(dn.node_page, page);
> > -			fill_node_footer(dn.node_page, dn.nid, ni.ino,
> > -					ofs_of_node(page), false);
> > +			fill_node_footer(dn.node_page, dn.nid, ni.ino, ofs_of_node(page));
> >  			set_page_dirty(dn.node_page);
> >  err:
> >  		f2fs_put_dnode();
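
The point made in the reply above (the pre-patch fill_node_footer() with reset == true already zeroes the block, so an extra memset in the caller is redundant) can be illustrated with a small user-space model. This is only a sketch: struct model_node, struct model_footer and the model_* helpers are hypothetical stand-ins, not the f2fs definitions, and the flag handling is simplified compared with the kernel helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_BLOCK_SIZE 4096

/* Hypothetical stand-in for the node footer; field names are illustrative. */
struct model_footer {
	uint32_t nid;
	uint32_t ino;
	uint32_t flag;
};

/* Hypothetical stand-in for a node block: payload followed by a footer. */
struct model_node {
	uint8_t payload[MODEL_BLOCK_SIZE - sizeof(struct model_footer)];
	struct model_footer footer;
};

/*
 * Model of the pre-patch helper: with reset == true the whole block is
 * zeroed before the footer is written, so the caller needs no memset of
 * its own. With reset == false the old flag and the payload are kept.
 */
static void model_fill_node_footer(struct model_node *rn, uint32_t nid,
				   uint32_t ino, bool reset)
{
	uint32_t old_flag = 0;

	assert(rn != NULL);
	if (reset)
		memset(rn, 0, sizeof(*rn));
	else
		old_flag = rn->footer.flag;

	rn->footer.nid = nid;
	rn->footer.ino = ino;
	rn->footer.flag = old_flag; /* simplified: the kernel also encodes ofs here */
}

/* Returns true when every payload byte is zero. */
static bool model_payload_is_zero(const struct model_node *rn)
{
	for (size_t i = 0; i < sizeof(rn->payload); i++)
		if (rn->payload[i] != 0)
			return false;
	return true;
}
```

In this model, calling model_fill_node_footer() with reset == true on a block full of stale bytes leaves a zeroed payload, while reset == false leaves the stale payload untouched. That is also why a memset aimed at the wrong page (page instead of ipage) would matter in the posted hunk: it would wipe the source block rather than the destination.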
Re: [PATCH v1] ringbuffer: Don't choose the process with adj equal OOM_SCORE_ADJ_MIN
On Tue, Apr 10, 2018 at 11:12 AM, Steven Rostedt wrote:
> On Tue, 10 Apr 2018 10:32:36 +0800
> Zhaoyang Huang wrote:
>
>> For the following scenario, process A has no intention to exhaust the
>> memory, but will be likely to be selected by OOM for we set
>> OOM_SCORE_ADJ_MIN for it.
>>  process A(-1000)                    process B
>>
>>  i = si_mem_available();
>>         if (i < nr_pages)
>>                 return -ENOMEM;
>>         schedule
>>                 --->
>>                                     allocate huge memory
>>                 <---
>>  if (user_thread)
>>         set_current_oom_origin();
>>
>>         for (i = 0; i < nr_pages; i++) {
>>                 bpage = kzalloc_node
>
> Is this really an issue though?
>
> Seriously, do you think you will ever hit this?
>
> How often do you increase the size of the ftrace ring buffer? For this
> to be an issue, the system has to trigger an OOM at the exact moment
> you decide to increase the size of the ring buffer. That would be an
> impressive attack, with little to gain.
>
> Ask the memory management people. If they think this could be a
> problem, then I'll be happy to take your patch.
>
> -- Steve

Adding Michael for review.

Hi Michael,
I would like to suggest that Steve NOT set OOM_SCORE_ADJ_MIN for a process
with adj = -1000 when marking the user-space process as a potential victim of
OOM. Steve doubts that the scenario can really happen. In my opinion, we
should NOT break the original concept of OOM, that is, OOM would not select a
-1000 process unless the process configures that itself. As for how likely it
is: on memory-hungry systems such as Android on mobile phones, there are many
kinds of user behavior and test scripts that attack or verify the stability
of the system. So I suggest we keep every corner case safe. Would you please
comment on that? Thanks.
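
One way to read the proposal in this thread is as an extra condition in front of set_current_oom_origin(): a user thread is only marked as a preferred OOM victim when its adjustment is not the minimum. The sketch below is a user-space model of that reading, not the posted patch; struct model_task and model_mark_oom_origin() are hypothetical names, and only the value -1000 for OOM_SCORE_ADJ_MIN is taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric value as the kernel's OOM_SCORE_ADJ_MIN. */
#define MODEL_OOM_SCORE_ADJ_MIN (-1000)

/* Hypothetical, minimal view of the task state the check depends on. */
struct model_task {
	bool user_thread;   /* the allocation was requested from user space */
	int oom_score_adj;  /* -1000 means "never pick this task" */
	bool oom_origin;    /* set when the task is a preferred OOM victim */
};

/*
 * Mark the task as a preferred OOM victim only if it is a user thread
 * and has not opted out of OOM killing with the minimum adjustment.
 * Returns the resulting oom_origin state.
 */
static bool model_mark_oom_origin(struct model_task *t)
{
	assert(t != NULL);
	if (t->user_thread && t->oom_score_adj != MODEL_OOM_SCORE_ADJ_MIN)
		t->oom_origin = true;
	return t->oom_origin;
}
```

With this guard, process A from the diagram (adj = -1000) would keep its opt-out even while it grows the ring buffer; whether that trade-off is acceptable is exactly what the thread asks the memory management people to judge.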
Re: [PATCH 5/7] arm64: dts: msm8996: Add rpmpd device node
On 09-04-18, 09:03, Stephen Boyd wrote:
> We should go and
> update the platform populate code to always ignore operating-points-v2
> compatible nodes too.

Will do that.

--
viresh
Re: [PATCH] f2fs: don't use GFP_ZERO for page caches
On 2018/4/10 3:00, Jaegeuk Kim wrote:
> From: Chao Yu
>
> Related to https://lkml.org/lkml/2018/4/8/661
>
> Sometimes, we need to write meta data to new allocated block address,
> then we will allocate a zeroed page in inner inode's address space, and
> fill partial data in it, and leave other place with zero value which means
> some fields are initial status.
>
> There are two inner inodes (meta inode and node inode) setting __GFP_ZERO,
> I have just checked them, for both of them, we can avoid using __GFP_ZERO,
> and do initialization by ourselves to avoid unneeded/redundant zeroing
> from mm.
>
> Cc:
> Signed-off-by: Chao Yu
> Signed-off-by: Jaegeuk Kim
> ---
>  fs/f2fs/inode.c    | 4 ++--
>  fs/f2fs/node.c     | 6 --
>  fs/f2fs/node.h     | 7 ++-
>  fs/f2fs/recovery.c | 3 +--
>  4 files changed, 9 insertions(+), 11 deletions(-)
>
> diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
> index 417c9dcd0269..87535bf63421 100644
> --- a/fs/f2fs/inode.c
> +++ b/fs/f2fs/inode.c
> @@ -320,10 +320,10 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
>  make_now:
>  	if (ino == F2FS_NODE_INO(sbi)) {
>  		inode->i_mapping->a_ops = _node_aops;
> -		mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> +		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
>  	} else if (ino == F2FS_META_INO(sbi)) {
>  		inode->i_mapping->a_ops = _meta_aops;
> -		mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
> +		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
>  	} else if (S_ISREG(inode->i_mode)) {
>  		inode->i_op = _file_inode_operations;
>  		inode->i_fop = _file_operations;
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 9a99243054ba..6fc3311820ec 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1096,7 +1096,8 @@ struct page *new_node_page(struct dnode_of_data *dn, unsigned int ofs)
>  	set_node_addr(sbi, _ni, NEW_ADDR, false);
>
>  	f2fs_wait_on_page_writeback(page, NODE, true);
> -	fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs, true);
> +	memset(F2FS_NODE(page), 0, PAGE_SIZE);
> +	fill_node_footer(page, dn->nid, dn->inode->i_ino, ofs);
>  	set_cold_node(page, S_ISDIR(dn->inode->i_mode));
>  	if (!PageUptodate(page))
>  		SetPageUptodate(page);
> @@ -2311,7 +2312,8 @@ int recover_inode_page(struct f2fs_sb_info *sbi, struct page *page)
>
>  	if (!PageUptodate(ipage))
>  		SetPageUptodate(ipage);
> -	fill_node_footer(ipage, ino, ino, 0, true);
> +	memset(F2FS_NODE(page), 0, PAGE_SIZE);

At a glance, should be memset(F2FS_NODE(ipage), 0, PAGE_SIZE);

Sorry about that.

Thanks,

> +	fill_node_footer(ipage, ino, ino, 0);
>  	set_cold_node(page, false);
>
>  	src = F2FS_INODE(page);
> diff --git a/fs/f2fs/node.h b/fs/f2fs/node.h
> index b95e49e4a928..42cd081114ab 100644
> --- a/fs/f2fs/node.h
> +++ b/fs/f2fs/node.h
> @@ -263,15 +263,12 @@ static inline block_t next_blkaddr_of_node(struct page *node_page)
>  }
>
>  static inline void fill_node_footer(struct page *page, nid_t nid,
> -				nid_t ino, unsigned int ofs, bool reset)
> +				nid_t ino, unsigned int ofs)
>  {
>  	struct f2fs_node *rn = F2FS_NODE(page);
>  	unsigned int old_flag = 0;
>
> -	if (reset)
> -		memset(rn, 0, sizeof(*rn));
> -	else
> -		old_flag = le32_to_cpu(rn->footer.flag);
> +	old_flag = le32_to_cpu(rn->footer.flag);
>
>  	rn->footer.nid = cpu_to_le32(nid);
>  	rn->footer.ino = cpu_to_le32(ino);
> diff --git a/fs/f2fs/recovery.c b/fs/f2fs/recovery.c
> index 1b23d3febe4c..de24f3247aa5 100644
> --- a/fs/f2fs/recovery.c
> +++ b/fs/f2fs/recovery.c
> @@ -540,8 +540,7 @@ static int do_recover_data(struct f2fs_sb_info *sbi, struct inode *inode,
>  			}
>
>  			copy_node_footer(dn.node_page, page);
> -			fill_node_footer(dn.node_page, dn.nid, ni.ino,
> -					ofs_of_node(page), false);
> +			fill_node_footer(dn.node_page, dn.nid, ni.ino, ofs_of_node(page));
>  			set_page_dirty(dn.node_page);
>  err:
>  		f2fs_put_dnode();
Re: [f2fs-dev] [PATCH v3] f2fs: don't use GFP_ZERO for page caches
Change log from v2:
 - consider IO error case when dealing with metapage
 - memset by fill_node_footer

Change log from v1:
 - don't memset for recovered page

Related to https://lkml.org/lkml/2018/4/8/661

Sometimes, we need to write meta data to new allocated block address,
then we will allocate a zeroed page in inner inode's address space, and
fill partial data in it, and leave other place with zero value which means
some fields are initial status.

There are two inner inodes (meta inode and node inode) setting __GFP_ZERO,
I have just checked them, for both of them, we can avoid using __GFP_ZERO,
and do initialization by ourselves to avoid unneeded/redundant zeroing
from mm.

Cc:
Signed-off-by: Chao Yu
Signed-off-by: Jaegeuk Kim
---
 fs/f2fs/checkpoint.c | 4 +++-
 fs/f2fs/inode.c      | 4 ++--
 fs/f2fs/segment.c    | 3 +++
 fs/f2fs/segment.h    | 1 +
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index bf779461df13..2e23b953d304 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -100,8 +100,10 @@ static struct page *__get_meta_page(struct f2fs_sb_info *sbi, pgoff_t index,
 	 * readonly and make sure do not write checkpoint with non-uptodate
 	 * meta page.
 	 */
-	if (unlikely(!PageUptodate(page)))
+	if (unlikely(!PageUptodate(page))) {
+		memset(page_address(page), 0, PAGE_SIZE);
 		f2fs_stop_checkpoint(sbi, false);
+	}
 out:
 	return page;
 }
diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 417c9dcd0269..87535bf63421 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -320,10 +320,10 @@ struct inode *f2fs_iget(struct super_block *sb, unsigned long ino)
 make_now:
 	if (ino == F2FS_NODE_INO(sbi)) {
 		inode->i_mapping->a_ops = _node_aops;
-		mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
+		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
 	} else if (ino == F2FS_META_INO(sbi)) {
 		inode->i_mapping->a_ops = _meta_aops;
-		mapping_set_gfp_mask(inode->i_mapping, GFP_F2FS_ZERO);
+		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
 	} else if (S_ISREG(inode->i_mode)) {
 		inode->i_op = _file_inode_operations;
 		inode->i_fop = _file_operations;
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index a4b8e3e24ccb..1f5db557ab96 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2021,6 +2021,7 @@ static void write_current_sum_page(struct f2fs_sb_info *sbi,
 	struct f2fs_summary_block *dst;

 	dst = (struct f2fs_summary_block *)page_address(page);
+	memset(dst, 0, PAGE_SIZE);

 	mutex_lock(>curseg_mutex);

@@ -3117,6 +3118,7 @@ static void write_compacted_summaries(struct f2fs_sb_info *sbi, block_t blkaddr)

 	page = grab_meta_page(sbi, blkaddr++);
 	kaddr = (unsigned char *)page_address(page);
+	memset(kaddr, 0, PAGE_SIZE);

 	/* Step 1: write nat cache */
 	seg_i = CURSEG_I(sbi, CURSEG_HOT_DATA);
@@ -3141,6 +3143,7 @@ static void write_compacted_summaries(struct f2fs_sb_info *sbi, block_t blkaddr)
 			if (!page) {
 				page = grab_meta_page(sbi, blkaddr++);
 				kaddr = (unsigned char *)page_address(page);
+				memset(kaddr, 0, PAGE_SIZE);
 				written_size = 0;
 			}
 			summary = (struct f2fs_summary *)(kaddr + written_size);
diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 3325d0769723..492ad0c86fa9 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -375,6 +375,7 @@ static inline void seg_info_to_sit_page(struct f2fs_sb_info *sbi,
 	int i;

 	raw_sit = (struct f2fs_sit_block *)page_address(page);
+	memset(raw_sit, 0, PAGE_SIZE);
 	for (i = 0; i < end - start; i++) {
 		rs = _sit->entries[i];
 		se = get_seg_entry(sbi, start + i);
--
2.15.0.531.g2ccb3012c9-goog
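
The pattern this version applies to the summary and SIT pages (zero the page explicitly, then fill only part of it) can be sketched in user space as follows. The buffer stands in for a page obtained without __GFP_ZERO, so it may hold stale bytes; model_write_partial_page() and MODEL_PAGE_SIZE are hypothetical names chosen for the illustration, not f2fs symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/*
 * Copy len bytes of src to the start of a page-sized buffer that may
 * contain stale data. The explicit memset guarantees that every byte
 * past len reads back as zero, which is what the on-disk format treats
 * as "field in its initial state".
 */
static void model_write_partial_page(uint8_t *page, const uint8_t *src,
				     size_t len)
{
	assert(page != NULL && src != NULL && len <= MODEL_PAGE_SIZE);
	memset(page, 0, MODEL_PAGE_SIZE);
	memcpy(page, src, len);
}
```

Zeroing in the writer rather than in the allocator means only the paths that really depend on a clean page pay for the memset, which is the saving the commit message describes.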
Re: [RFC v2] virtio: support packed ring
On Tue, Apr 10, 2018 at 10:55:25AM +0800, Jason Wang wrote:
> On 2018-04-01 22:12, Tiwei Bie wrote:
> > Hello everyone,
> >
> > This RFC implements packed ring support for virtio driver.
> >
> > The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
> > by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
> > Minor changes are needed for the vhost code, e.g. to kick the guest.
> >
> > TODO:
> > - Refinements and bug fixes;
> > - Split into small patches;
> > - Test indirect descriptor support;
> > - Test/fix event suppression support;
> > - Test devices other than net;
> >
> > RFC v1 -> RFC v2:
> > - Add indirect descriptor support - compile test only;
> > - Add event suppression support - compile test only;
> > - Move vring_packed_init() out of uapi (Jason, MST);
> > - Merge two loops into one in virtqueue_add_packed() (Jason);
> > - Split vring_unmap_one() for packed ring and split ring (Jason);
> > - Avoid using '%' operator (Jason);
> > - Rename free_head -> next_avail_idx (Jason);
> > - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> > - Some other refinements and bug fixes;
> >
> > Thanks!
>
> Will try to review this later.
>
> But it would be better if you can split it (more than 1000 lines is too big
> to be reviewed easily). E.g you can at least split it into three patches,
> new structures, datapath, and event suppression.
>

No problem! It's on my TODO list. I'll get it done in the next version.

Thanks!
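
The changelog item "Avoid using '%' operator" refers to ring index arithmetic. A common way to do this, shown here only as a generic sketch and not as the code from the RFC, is to add and then subtract the ring size once when the index runs past the end. model_ring_advance() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Advance a ring index by step without a modulo. Valid as long as
 * idx < ring_size and step <= ring_size, so a single conditional
 * subtraction is enough to wrap. Works for any ring size, not only
 * powers of two.
 */
static uint16_t model_ring_advance(uint16_t idx, uint16_t step,
				   uint16_t ring_size)
{
	assert(ring_size > 0 && idx < ring_size && step <= ring_size);

	/* Widen first so idx + step cannot overflow 16 bits. */
	uint32_t next = (uint32_t)idx + step;

	if (next >= ring_size)
		next -= ring_size;
	return (uint16_t)next;
}
```

A compare and subtract is typically cheaper than an integer division on the datapath, and unlike masking with ring_size - 1 it does not require the size to be a power of two.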
Re: [PATCH v2 11/21] stack-protector: test compiler capability in Kconfig and drop AUTO mode
2018-04-10 0:04 GMT+09:00 Kees Cook:
> On Mon, Apr 9, 2018 at 1:54 AM, Masahiro Yamada wrote:
>> 2018-03-28 20:18 GMT+09:00 Kees Cook:
>>> On Mon, Mar 26, 2018 at 10:29 PM, Masahiro Yamada wrote:
>>>> diff --git a/arch/Kconfig b/arch/Kconfig
>>>> index 8e0d665..b42378d 100644
>>>> --- a/arch/Kconfig
>>>> +++ b/arch/Kconfig
>>>> @@ -535,13 +535,13 @@ config HAVE_CC_STACKPROTECTOR
>>>>         bool
>>>>         help
>>>>           An arch should select this symbol if:
>>>> -         - its compiler supports the -fstack-protector option
>>>
>>> Please leave this note: it's still valid. An arch must still have
>>> compiler support for this to be sensible.
>>>
>>
>> No.
>>
>> "its compiler supports the -fstack-protector option"
>> is tested by $(cc-option -fstack-protector)
>>
>> ARCH does not need to know the GCC support level.
>
> That's not correct: if you enable stack protector for a kernel
> architecture that doesn't have it enabled, it's unlikely for the
> resulting kernel to boot. An architecture must handle the changes that
> the compiler introduces when adding -fstack-protector (for example,
> having the stack protector canary value defined, having the failure
> function defined, handling context switches changing canaries, etc).
>

It is still hard to understand this.

When we say "its compiler supports the -fstack-protector option",
we have two meanings:

[1] the stack protector feature is implemented in GCC source code.
[2] -fstack-protector is recognized as a valid option in the GCC being used.
    This can be tested by $(cc-option -fstack-protector)

I guess you were talking about [1], whereas I was talking about [2].
Is this correct?

Does [2] happen only after [1] happens? Or are they independent?

Is there a case where GCC recognizes -fstack-protector, but it is not
implemented?

For x86, there are cases where the option is recognized but not working.
That's why we have scripts/gcc-x86_{32,64}-has-stack-protector.sh

Generally, if GCC accepts -fstack-protector as a valid option,
we expect "it is working".

I wonder why we need additional information about the compiler
even after $(cc-option -fstack-protector) succeeds.

This is just a matter of comment.
Can you clarify your problem?

> resulting kernel to boot. An architecture must handle the changes that
> the compiler introduces when adding -fstack-protector (for example,
> having the stack protector canary value defined, having the failure
> function defined, handling context switches changing canaries, etc).
>

All of these are talking about the kernel side implementation.
So, it is included in the following comment I am still keeping.

  - it has implemented a stack canary (e.g. __stack_chk_guard)

-- 
Best Regards
Masahiro Yamada
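[Archive note] The kernel-side pieces Kees lists above (a canary value, a failure function, and a check on function exit) can be illustrated with a minimal userspace sketch. This is not kernel code and not what GCC emits: every demo_* name is a hypothetical stand-in (the real symbols are __stack_chk_guard and __stack_chk_fail), and the "prologue" and "epilogue" are written by hand rather than generated by -fstack-protector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for what an architecture must provide. */
static unsigned long demo_stack_chk_guard = 0x5ca1ab1eUL; /* canary value */
static int demo_fail_count;                               /* failures seen */

static void demo_stack_chk_fail(void)
{
	/* A real failure handler does not return; this one only counts. */
	demo_fail_count++;
}

/* A fake "stack frame": a small buffer followed by the canary slot. */
struct demo_frame {
	char buf[8];
	unsigned long canary;
};

/*
 * Copy len bytes into the frame and report whether the canary check
 * fired (1) or not (0). Copies longer than buf overwrite the canary.
 */
int demo_copy(const char *src, size_t len)
{
	struct demo_frame f;
	int before = demo_fail_count;

	assert(src != NULL);
	if (len > sizeof(f))
		len = sizeof(f);              /* keep the demo itself in bounds */

	f.canary = demo_stack_chk_guard;      /* hand-written "prologue" */
	memcpy(&f, src, len);                 /* may run past buf into canary */
	if (f.canary != demo_stack_chk_guard) /* hand-written "epilogue" */
		demo_stack_chk_fail();

	return demo_fail_count != before;
}
```

The point of the sketch is only that the option being accepted by the compiler ([2] above) says nothing about whether the guard variable and the failure handler exist on the kernel side.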
Re: [lkp-robot] [init, tracing] 2580d6b795: BUG:kernel_reboot-without-warning_in_boot_stage
On 04/09, Steven Rostedt wrote: >On Tue, 10 Apr 2018 09:23:40 +0800 >Ye Xiaolong wrote: > >> Hi, Steven >> >> On 04/09, Steven Rostedt wrote: >> >On Mon, 9 Apr 2018 13:32:52 +0800 >> >kernel test robot wrote: >> > >> >> FYI, we noticed the following commit (built with gcc-7): >> >> >> >> commit: 2580d6b795e25879c825a0891cf67390f665b11f ("init, tracing: Have >> >> printk come through the trace events for initcall_debug") >> >> url: >> >> https://github.com/0day-ci/linux/commits/Steven-Rostedt/init-tracing/20180407-130743 >> >> >> >> >> >> in testcase: boot >> >> >> >> on test machine: qemu-system-x86_64 -enable-kvm -cpu Nehalem -smp 2 -m >> >> 512M >> >> >> >> caused below changes (please refer to attached dmesg/kmsg for entire >> >> log/backtrace): >> >> >> >> >> >> +--+++ >> >> | | >> >> ecf6709d07 | 2580d6b795 | >> >> +--+++ >> >> | boot_successes | 0 >> >> | 0 | >> >> | boot_failures| 8 >> >> | 8 | >> >> | invoked_oom-killer:gfp_mask=0x | 8 >> >> || >> >> | Mem-Info | 8 >> >> || >> >> | Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 8 >> >> || >> >> | BUG:kernel_reboot-without-warning_in_boot_stage | 0 >> >> | 8 | >> >> +--+++ >> >> >> > >> >What does this mean? >> >> It means BUG:BUG:kernel_reboot-without-warning_in_boot_stage occurred 8 times >> in boot tests for commit 2580d6b795, while 0 time for its parent ecf6709d07. > >I don't have a commit 2580d6b795. > >The commit with the title "init, tracing: Have printk come through the >trace events for initcall_debug" is 4e37958d1288ce. linux-next doesn't >have that commit sha1 either. This commit was generated by 0day service, it captured your email patchset which posted on LKML and then applied it on top of 06dd3dfeea60 and performed build/boot tests accordingly. 
> > >> >> > >> >> >> >> >> >> >> >> [0.00] RAMDISK: [mem 0x1b7e2000-0x1ffc] >> >> [0.00] ACPI: Early table checksum verification disabled >> >> [0.00] ACPI: RSDP 0x000F6860 14 (v00 BOCHS ) >> >> [0.00] ACPI: RSDT 0x1FFE1628 30 (v01 BOCHS BXPCRSDT >> >> 0001 BXPC 0001) >> >> [0.00] ACPI: FACP 0x1FFE147C 74 (v01 BOCHS BXPCFACP >> >> 0001 BXPC 0001) >> >> BUG: kernel reboot-without-warning in boot stage >> >> >> >> Elapsed time: 10 >> >> >> >> #!/bin/bash >> >> >> >> >> >> >> >> To reproduce: >> >> >> >> git clone https://github.com/intel/lkp-tests.git >> >> cd lkp-tests >> >> bin/lkp qemu -k job-script # job-script is attached in >> >> this email >> >> >> > >> >The config boots fine for me. But I don't have the setup to run the >> >above and get it to work, nor the time to figure out why it doesn't >> >work. >> >> Could you paste your failure log here, we can see if there is something we >> can help. > >I tried it on a more up-to-date box, after checking out my commit with >the title you say is an error. I compiled your config with gcc (GCC) >7.3.1 20180130 (Red Hat 7.3.1-2), and ran the above (which did work). >It ended after it got to a login prompt. > >--- >[..] >[ 10.588029] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7' >[ 10.589363] platform regulatory.0: Direct firmware load for regulatory.db >failed with error -2 >[ 10.590385] cfg80211: failed to load regulatory.db >[ 10.610002] Freeing unused kernel memory: 1980K >[ 10.610533] Write protecting the kernel read-only data: 49152k >[ 10.615728] Freeing unused kernel memory: 2028K >[ 10.620754] Freeing unused kernel memory: 464K >INIT: version 2.88 booting >/etc/rcS.d/S00fbsetup: line 3: /sbin/modprobe: not found > >Please wait: booting... 
>[ 10.676861] rc (151) used greatest stack depth: 27848 bytes left >Starting udev >[ 10.750281] udevd[175]: starting version 3.1.5 >[ 10.759923] udevd (175) used greatest stack depth: 27696 bytes left >[ 11.12] udevadm (178) used greatest stack depth: 26696 bytes left >Populating dev cache >INIT: Entering runlevel: 5 >Configuring network interfaces... done. >Starting syslogd/klogd: done > >Poky (Yocto Project Reference Distro) 2.1 qemux86-64 /dev/ttyS0 > >qemux86-64 login: >--- > >What am I suppose to see? > So I figure the gap may be 0day bot applied your patchset to a inappropriate base. Thanks, Xiaolong >-- Steve >
Re: [PATCH v1] ringbuffer: Don't choose the process with adj equal OOM_SCORE_ADJ_MIN
On Tue, 10 Apr 2018 10:32:36 +0800 Zhaoyang Huang wrote:

> In the following scenario, process A has no intention to exhaust the
> memory, but will be likely to be selected by OOM because we set
> OOM_SCORE_ADJ_MIN for it.
> process A(-1000)                        process B
>
> i = si_mem_available();
> if (i < nr_pages)
>     return -ENOMEM;
> schedule
>                         --->
>                                         allocate huge memory
>                         <---
> if (user_thread)
>     set_current_oom_origin();
>
> for (i = 0; i < nr_pages; i++) {
>     bpage = kzalloc_node

Is this really an issue though?

Seriously, do you think you will ever hit this?

How often do you increase the size of the ftrace ring buffer? For this
to be an issue, the system has to trigger an OOM at the exact moment
you decide to increase the size of the ring buffer. That would be an
impressive attack, with little to gain.

Ask the memory management people. If they think this could be a
problem, then I'll be happy to take your patch.

-- Steve
Re: [PATCH v3 3/3] mm: restructure memfd code
On 04/09/2018 06:41 PM, Matthew Wilcox wrote:
> On Mon, Apr 09, 2018 at 04:05:05PM -0700, Mike Kravetz wrote:
>> +/*
>> + * We need a tag: a new tag would expand every radix_tree_node by 8 bytes,
>> + * so reuse a tag which we firmly believe is never set or cleared on shmem.
>> + */
>> +#define SHMEM_TAG_PINNED        PAGECACHE_TAG_TOWRITE
>
> Do we also firmly believe it's never used on hugetlbfs?
>

Yes. hugetlbfs is memory resident only with no writeback.

This comment and name should have been updated when hugetlbfs support
was added. Also, ideally all the memfd related function names of the form
shmem_* should have been changed to memfd_* when hugetlbfs support was
added. Some of them were changed, but not all.

I can clean all this up. But, I would want to do it in patch 2 of the
series. That is where other cleanup such as this was done before code
movement.

Will wait a little while for any additional comments before sending
series again.

-- 
Mike Kravetz
[PATCH V2] x86/boot/e820: add new chareater - to free BIOS memory in memmap bootargs
This uses memmap=0x4101000-0x6aeff000 to free the BIOS reserved memory
"6aeff000-6eff : reserved":

..
0010-6aefefff : System RAM
  0100-0165537a : Kernel code
  0165537b-01a8873f : Kernel data
  01c31000-01f4efff : Kernel bss
  2800-320f : Crash kernel
6aeff000-6eff : reserved   --> it is e820 reserved memory
6f00-78240fff : System RAM
..

Add the bootarg memmap=0x4101000-0x6aeff000 to free the memory region
6aeff000-6eff; then 6aeff000-6eff will be merged into 0010-78240fff.

New iomem (cat /proc/iomem):

..
0010-78240fff : System RAM
  0100-0165537a : Kernel code
  0165537b-01a8873f : Kernel data
  01c31000-01f4efff : Kernel bss
..

V1->V2: fixed the wrong characters

zoucao (1):
  x86/boot/e820: add new chareater "-" to free BIOS memory in memmap bootargs

 7u/Documentation/kernel-parameters.txt | 6 ++++++
 7u/arch/x86/kernel/e820.c              | 3 +++
 2 files changed, 9 insertions(+)
Re: [Resend Patch 1/3] Vmbus: Add function to report available ring buffer to write in total ring size percentage
Long,

> I hope this patch set goes through SCSI, because its purpose is to
> improve storvsc.
>
> If this strategy is not possible, I can resubmit the 1st two patches to
> net, and the 3rd patch to scsi after the 1st two are merged.

Applied to my staging tree for 4.18/scsi-queue. Thanks!

-- 
Martin K. Petersen	Oracle Linux Engineering
[PATCH] x86/boot/e820: add new chareater "-" to free BIOS memory in memmap bootargs
From: zoucao

Normally every BIOS reserved memory region is used for some feature, so we
can't use it. But in some conditions users can ensure that some BIOS memory
is not used and that the reserved memory is safe to free. They have no good
way to free this memory, so add a new character "-" in memmap to free
reserved memory.

Signed-off-by: zou cao
---
 7u/Documentation/kernel-parameters.txt | 6 ++++++
 7u/arch/x86/kernel/e820.c              | 3 +++
 2 files changed, 9 insertions(+)

diff --git a/7u/Documentation/kernel-parameters.txt b/7u/Documentation/kernel-parameters.txt
index 9a1abb99a..dbea75e12 100644
--- a/7u/Documentation/kernel-parameters.txt
+++ b/7u/Documentation/kernel-parameters.txt
@@ -1677,6 +1677,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			 or
 			 memmap=0x1$0x1869
 
+	memmap=nn[KMG]-ss[KMG]
+			Free E820 reserved memory, as specified by the user.
+			Region of reserved memory to be free, from ss to ss+nn.
+			Example: free reserved memory from 0x1869-0x186a
+			 memmap=0x4101000-0x6aeff000
+
 	memory_corruption_check=0/1 [X86]
 			Some BIOSes seem to corrupt the first 64k of
 			memory when doing things like suspend/resume.
diff --git a/7u/arch/x86/kernel/e820.c b/7u/arch/x86/kernel/e820.c
index 174da5fc5..b8a042981 100644
--- a/7u/arch/x86/kernel/e820.c
+++ b/7u/arch/x86/kernel/e820.c
@@ -875,6 +875,9 @@ static int __init parse_memmap_one(char *p)
 	} else if (*p == '$') {
 		start_at = memparse(p+1, &p);
 		e820_add_region(start_at, mem_size, E820_RESERVED);
+	} else if (*p == '-') {
+		start_at = memparse(p+1, &p);
+		e820_remove_range(start_at, mem_size, E820_RESERVED, E820_RAM);
 	} else
 		e820_remove_range(mem_size, ULLONG_MAX - mem_size, E820_RAM, 1);
 
-- 
2.14.1.40.g8e62ba1
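[Archive note] The proposed memmap=nn[KMG]-ss[KMG] syntax reads as "size, then '-', then start address". A rough userspace sketch of that parsing order follows. It is an illustration under assumptions, not the patch itself: demo_memparse() is a hypothetical stand-in for the kernel's memparse() that handles only the K/M/G suffixes, and demo_parse_free_range() is an invented helper name.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for memparse(): parse a number
 * (base auto-detected, so 0x... works) with an optional K/M/G suffix
 * and leave *end pointing just past what was consumed.
 */
static unsigned long long demo_memparse(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Parse "nn[KMG]-ss[KMG]" into a size (nn) and a start address (ss).
 * Returns 0 on success and -1 if the string is malformed.
 */
int demo_parse_free_range(const char *arg, unsigned long long *start,
			  unsigned long long *size)
{
	char *p;

	assert(arg && start && size);

	*size = demo_memparse(arg, &p);
	if (p == arg || *p != '-')
		return -1;	/* no size, or separator missing */

	arg = p + 1;
	*start = demo_memparse(arg, &p);
	if (p == arg || *p != '\0')
		return -1;	/* no start address, or trailing junk */

	return 0;
}
```

For the example argument in the patch, 0x4101000-0x6aeff000, this yields a size of 0x4101000 and a start of 0x6aeff000, so the range ends (exclusively) at 0x6f000000.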
Re: [PATCH] mm: workingset: fix NULL ptr dereference
On Mon, Apr 09, 2018 at 07:41:52PM -0700, Matthew Wilcox wrote:
> On Tue, Apr 10, 2018 at 11:33:39AM +0900, Minchan Kim wrote:
> > @@ -522,7 +532,7 @@ EXPORT_SYMBOL(radix_tree_preload);
> >   */
> >  int radix_tree_maybe_preload(gfp_t gfp_mask)
> >  {
> > -	if (gfpflags_allow_blocking(gfp_mask))
> > +	if (gfpflags_allow_blocking(gfp_mask) && !(gfp_mask & __GFP_ZERO))
> >  		return __radix_tree_preload(gfp_mask, RADIX_TREE_PRELOAD_SIZE);
> >  	/* Preloading doesn't help anything with this gfp mask, skip it */
> >  	preempt_disable();
>
> No, you've completely misunderstood what's going on in this function.

Okay, I hope this version clears the current concerns.

From fb37c41b90f7d3ead1798e5cb7baef76709afd94 Mon Sep 17 00:00:00 2001
From: Minchan Kim
Date: Tue, 10 Apr 2018 11:54:57 +0900
Subject: [PATCH v3] mm: workingset: fix NULL ptr dereference

The workingset shadow-entry code relies on a newly allocated radix tree
node having node->private_list in the list_empty state.
Currently, it is initialized in the SLAB constructor, which means a radix
tree node is initialized only when *slub allocates a new page*, not when
*slub allocates a new object*. If some FS or subsystem passes a gfp_mask
with __GFP_ZERO, a newly allocated node can have
!list_empty(node->private_list) because of the memset done by the slab
allocator. This ends up as a NULL dereference in workingset_update_node
because the list_empty check fails.

This patch fixes it.

Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Cc: Johannes Weiner
Cc: Jan Kara
Cc: Matthew Wilcox
Cc: Jaegeuk Kim
Cc: Chao Yu
Cc: Christopher Lameter
Cc: linux-fsde...@vger.kernel.org
Cc: sta...@vger.kernel.org
Reported-by: Chris Fries
Signed-off-by: Minchan Kim
---
 lib/radix-tree.c | 9 +++++++++
 mm/filemap.c     | 5 +++--
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index da9e10c827df..7569e637dbaa 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -470,6 +470,15 @@ static __must_check int __radix_tree_preload(gfp_t gfp_mask, unsigned nr)
 	struct radix_tree_node *node;
 	int ret = -ENOMEM;
 
+	/*
+	 * New allocate node must have node->private_list as INIT_LIST_HEAD
+	 * state by workingset shadow memory implementation.
+	 * If user pass __GFP_ZERO by mistake, slab allocator will clear
+	 * node->private_list, which makes a BUG. Rather than going Oops,
+	 * just fix and warn about it.
+	 */
+	if (WARN_ON(gfp_mask & __GFP_ZERO))
+		gfp_mask &= ~__GFP_ZERO;
 	/*
 	 * Nodes preloaded by one cgroup can be be used by another cgroup, so
 	 * they should never be accounted to any particular memory cgroup.
diff --git a/mm/filemap.c b/mm/filemap.c
index ab77e19ab09c..b6de9d691c8a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -786,7 +786,7 @@ int replace_page_cache_page(struct page *old, struct page *new, gfp_t gfp_mask)
 	VM_BUG_ON_PAGE(!PageLocked(new), new);
 	VM_BUG_ON_PAGE(new->mapping, new);
 
-	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_preload(gfp_mask & ~(__GFP_HIGHMEM | __GFP_ZERO));
 	if (!error) {
 		struct address_space *mapping = old->mapping;
 		void (*freepage)(struct page *);
@@ -842,7 +842,8 @@ static int __add_to_page_cache_locked(struct page *page,
 		return error;
 	}
 
-	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
+	error = radix_tree_maybe_preload(gfp_mask &
+					 ~(__GFP_HIGHMEM | __GFP_ZERO));
 	if (error) {
 		if (!huge)
 			mem_cgroup_cancel_charge(page, memcg, false);
-- 
2.17.0.484.g0c8726318c-goog
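[Archive note] The failure mode described in the commit message comes down to one property of circular lists: an initialized list head points at itself, while a zero-filled one holds NULL pointers and therefore does not test as empty. A small userspace sketch of that property follows. The struct list_head layout mirrors the kernel's, but the demo_* helpers and struct demo_node are hypothetical, and the memset only emulates the effect the commit message attributes to __GFP_ZERO.

```c
#include <assert.h>
#include <string.h>

/* Userspace copy of the kernel's circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

static void demo_init_list_head(struct list_head *h)
{
	h->next = h;	/* an empty list points back at itself */
	h->prev = h;
}

static int demo_list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical stand-in for a radix tree node with its private_list. */
struct demo_node {
	unsigned long payload;
	struct list_head private_list;
};

/* What a slab constructor would do once, when the object is first set up. */
void demo_node_ctor(struct demo_node *n)
{
	assert(n != NULL);
	memset(n, 0, sizeof(*n));
	demo_init_list_head(&n->private_list);
}

/*
 * Emulate handing out a constructed node. If zero is nonzero, the node is
 * memset to 0 afterwards, as a zeroing allocation would do. Returns the
 * result of the list_empty check on private_list.
 */
int demo_alloc_is_list_empty(int zero)
{
	struct demo_node n;

	demo_node_ctor(&n);
	if (zero)
		memset(&n, 0, sizeof(n));	/* wipes next/prev to NULL */

	return demo_list_empty(&n.private_list);
}
```

With zero set, the list head has next == NULL, so the emptiness check fails and any later list deletion on that head would follow NULL pointers, which matches the dereference the patch guards against by masking out __GFP_ZERO.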
[lkp-robot] [hugetlbfs] e979e5a059: BUG_hugetlbfs_inode_cache(Not_tainted):Objects_remaining_in_hugetlbfs_inode_cache_on__kmem_cache_shutdown()
FYI, we noticed the following commit (built with gcc-7):

commit: e979e5a0591e70ad0b41cf876ee987de468a220e ("hugetlbfs: Convert to fs_context")
https://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git mount-context

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -smp 2 -m 512M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

+-------------------------------------------------------------------------------------------------------------+------------+------------+
|                                                                                                             | 838d9ecc64 | e979e5a059 |
+-------------------------------------------------------------------------------------------------------------+------------+------------+
| boot_successes                                                                                              | 0          | 0          |
| boot_failures                                                                                               | 54         | 17         |
| BUG:stack_guard_page_was_hit_at#(stack_is#..#)                                                              | 54         |            |
| RIP:legacy_parse_monolithic                                                                                 | 54         |            |
| Kernel_panic-not_syncing:Fatal_exception                                                                    | 54         |            |
| BUG_hugetlbfs_inode_cache(Not_tainted):Objects_remaining_in_hugetlbfs_inode_cache_on__kmem_cache_shutdown() | 0          | 17         |
| INFO:Slab#objects=#used=#fp=#flags=                                                                         | 0          | 17         |
| INFO:Object#@offset=                                                                                        | 0          | 17         |
+-------------------------------------------------------------------------------------------------------------+------------+------------+

[0.160565] PCI: pci_cache_line_size set to 64 bytes
[0.161260] e820: reserve RAM buffer [mem 0x0009fc00-0x0009]
[0.161969] e820: reserve RAM buffer [mem 0x1ffe-0x1fff]
[0.163220] clocksource: Switched to clocksource kvm-clock
[0.175560] =
[0.176568] BUG hugetlbfs_inode_cache (Not tainted): Objects remaining in hugetlbfs_inode_cache on __kmem_cache_shutdown()
[0.176640] -
[0.176640]
[0.176640] Disabling lock debugging due to kernel taint
[0.176640] INFO: Slab 0x6376557a objects=17 used=1 fp=0x154e780a flags=0x40008100
[0.176640] CPU: 0 PID: 1 Comm: swapper Tainted: GB 4.16.0-10623-ge979e5a #1
[0.176640] Call Trace:
[0.176640]  slab_err+0xad/0xcf
[0.176640]  ? __kmem_cache_shutdown+0x93/0x301
[0.176640]  ? __need_fs_reclaim+0x5/0x4e
[0.176640]  ? prefetch_freepointer+0x5/0x14
[0.176640]  ? __kmalloc+0x122/0x1c4
[0.176640]  __kmem_cache_shutdown+0x163/0x301
[0.176640]  shutdown_cache+0x14/0xf7
[0.176640]  kmem_cache_destroy+0x15c/0x1a5
[0.176640]  init_hugetlbfs_fs+0x85/0x15c
[0.176640]  ? init_ramfs_fs+0x1f/0x1f
[0.176640]  ? set_debug_rodata+0x11/0x11
[0.176640]  do_one_initcall+0x9c/0x148
[0.176640]  kernel_init_freeable+0x11b/0x1a8
[0.176640]  ? rest_init+0x119/0x119
[0.176640]  kernel_init+0xa/0xe1
[0.176640]  ret_from_fork+0x3a/0x50
[0.176640] INFO: Object 0xe4f03853 @offset=12768
[0.190206] kmem_cache_destroy hugetlbfs_inode_cache: Slab cache still has objects
[0.191091] CPU: 0 PID: 1 Comm: swapper Tainted: GB 4.16.0-10623-ge979e5a #1
[0.192084] Call Trace:
[0.192383]  kmem_cache_destroy+0x175/0x1a5
[0.192889]  init_hugetlbfs_fs+0x85/0x15c
[0.193362]  ? init_ramfs_fs+0x1f/0x1f
[0.193809]  ? set_debug_rodata+0x11/0x11
[0.194282]  do_one_initcall+0x9c/0x148
[0.194738]  kernel_init_freeable+0x11b/0x1a8
[0.195249]  ? rest_init+0x119/0x119
[0.195673]  kernel_init+0xa/0xe1
[0.196091]  ret_from_fork+0x3a/0x50
[0.196575] pnp: PnP ACPI init
[0.197162] pnp 00:00: Plug and Play ACPI device, IDs PNP0b00 (active)
[0.198248] pnp 00:01: Plug and Play ACPI device, IDs PNP0303 (active)
[0.199306] pnp 00:02: Plug and Play ACPI device, IDs PNP0f13 (active)
[0.200357] pnp 00:03: [dma 2]

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k job-script  # job-script is attached in this email

Thanks,
Xiaolong

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.16.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
[lkp-robot] [hugetlbfs] e979e5a059: BUG_hugetlbfs_inode_cache(Not_tainted):Objects_remaining_in_hugetlbfs_inode_cache_on__kmem_cache_shutdown()
FYI, we noticed the following commit (built with gcc-7):

commit: e979e5a0591e70ad0b41cf876ee987de468a220e ("hugetlbfs: Convert to fs_context")
https://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git mount-context

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -smp 2 -m 512M

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

+-------------------------------------------------------------------------------------------------------------+------------+------------+
|                                                                                                             | 838d9ecc64 | e979e5a059 |
+-------------------------------------------------------------------------------------------------------------+------------+------------+
| boot_successes                                                                                              | 0          | 0          |
| boot_failures                                                                                               | 54         | 17         |
| BUG:stack_guard_page_was_hit_at#(stack_is#..#)                                                              | 54         |            |
| RIP:legacy_parse_monolithic                                                                                 | 54         |            |
| Kernel_panic-not_syncing:Fatal_exception                                                                    | 54         |            |
| BUG_hugetlbfs_inode_cache(Not_tainted):Objects_remaining_in_hugetlbfs_inode_cache_on__kmem_cache_shutdown() | 0          | 17         |
| INFO:Slab#objects=#used=#fp=#flags=                                                                         | 0          | 17         |
| INFO:Object#@offset=                                                                                        | 0          | 17         |
+-------------------------------------------------------------------------------------------------------------+------------+------------+

[0.160565] PCI: pci_cache_line_size set to 64 bytes
[0.161260] e820: reserve RAM buffer [mem 0x0009fc00-0x0009]
[0.161969] e820: reserve RAM buffer [mem 0x1ffe-0x1fff]
[0.163220] clocksource: Switched to clocksource kvm-clock
[0.175560] =
[0.176568] BUG hugetlbfs_inode_cache (Not tainted): Objects remaining in hugetlbfs_inode_cache on __kmem_cache_shutdown()
[0.176640] -
[0.176640]
[0.176640] Disabling lock debugging due to kernel taint
[0.176640] INFO: Slab 0x6376557a objects=17 used=1 fp=0x154e780a flags=0x40008100
[0.176640] CPU: 0 PID: 1 Comm: swapper Tainted: GB 4.16.0-10623-ge979e5a #1
[0.176640] Call Trace:
[0.176640]  slab_err+0xad/0xcf
[0.176640]  ? __kmem_cache_shutdown+0x93/0x301
[0.176640]  ? __need_fs_reclaim+0x5/0x4e
[0.176640]  ? prefetch_freepointer+0x5/0x14
[0.176640]  ? __kmalloc+0x122/0x1c4
[0.176640]  __kmem_cache_shutdown+0x163/0x301
[0.176640]  shutdown_cache+0x14/0xf7
[0.176640]  kmem_cache_destroy+0x15c/0x1a5
[0.176640]  init_hugetlbfs_fs+0x85/0x15c
[0.176640]  ? init_ramfs_fs+0x1f/0x1f
[0.176640]  ? set_debug_rodata+0x11/0x11
[0.176640]  do_one_initcall+0x9c/0x148
[0.176640]  kernel_init_freeable+0x11b/0x1a8
[0.176640]  ? rest_init+0x119/0x119
[0.176640]  kernel_init+0xa/0xe1
[0.176640]  ret_from_fork+0x3a/0x50
[0.176640] INFO: Object 0xe4f03853 @offset=12768
[0.190206] kmem_cache_destroy hugetlbfs_inode_cache: Slab cache still has objects
[0.191091] CPU: 0 PID: 1 Comm: swapper Tainted: GB 4.16.0-10623-ge979e5a #1
[0.192084] Call Trace:
[0.192383]  kmem_cache_destroy+0x175/0x1a5
[0.192889]  init_hugetlbfs_fs+0x85/0x15c
[0.193362]  ? init_ramfs_fs+0x1f/0x1f
[0.193809]  ? set_debug_rodata+0x11/0x11
[0.194282]  do_one_initcall+0x9c/0x148
[0.194738]  kernel_init_freeable+0x11b/0x1a8
[0.195249]  ? rest_init+0x119/0x119
[0.195673]  kernel_init+0xa/0xe1
[0.196091]  ret_from_fork+0x3a/0x50
[0.196575] pnp: PnP ACPI init
[0.197162] pnp 00:00: Plug and Play ACPI device, IDs PNP0b00 (active)
[0.198248] pnp 00:01: Plug and Play ACPI device, IDs PNP0303 (active)
[0.199306] pnp 00:02: Plug and Play ACPI device, IDs PNP0f13 (active)
[0.200357] pnp 00:03: [dma 2]

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k job-script  # job-script is attached in this email

Thanks,
Xiaolong

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.16.0 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
Re: [PATCH 0/8] hisi_sas: some misc changes
John,

> This patchset introduces some minor, more trivial patches, some of
> which have been sitting on our internal dev branch for a while.

Applied to 4.18/scsi-queue. Thank you!

-- 
Martin K. Petersen	Oracle Linux Engineering
Re: [RFC v2] virtio: support packed ring
On 2018-04-01 22:12, Tiwei Bie wrote:
> Hello everyone,
>
> This RFC implements packed ring support for virtio driver.
>
> The code was tested with DPDK vhost (testpmd/vhost-PMD) implemented
> by Jens at http://dpdk.org/ml/archives/dev/2018-January/089417.html
> Minor changes are needed for the vhost code, e.g. to kick the guest.
>
> TODO:
> - Refinements and bug fixes;
> - Split into small patches;
> - Test indirect descriptor support;
> - Test/fix event suppression support;
> - Test devices other than net;
>
> RFC v1 -> RFC v2:
> - Add indirect descriptor support - compile test only;
> - Add event suppression support - compile test only;
> - Move vring_packed_init() out of uapi (Jason, MST);
> - Merge two loops into one in virtqueue_add_packed() (Jason);
> - Split vring_unmap_one() for packed ring and split ring (Jason);
> - Avoid using '%' operator (Jason);
> - Rename free_head -> next_avail_idx (Jason);
> - Add comments for virtio_wmb() in virtqueue_add_packed() (Jason);
> - Some other refinements and bug fixes;

Thanks! Will try to review this later.

But it would be better if you can split it (more than 1000 lines is too big to be reviewed easily). E.g., you can at least split it into three patches: new structures, datapath, and event suppression.

Thanks

> Signed-off-by: Tiwei Bie
> ---
>  drivers/virtio/virtio_ring.c       | 1094 +---
>  include/linux/virtio_ring.h        |    8 +-
>  include/uapi/linux/virtio_config.h |   12 +-
>  include/uapi/linux/virtio_ring.h   |   61 ++
>  4 files changed, 980 insertions(+), 195 deletions(-)
>
> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> index 71458f493cf8..0515dca34d77 100644
> --- a/drivers/virtio/virtio_ring.c
> +++ b/drivers/virtio/virtio_ring.c
> @@ -58,14 +58,15 @@
>  
>  struct vring_desc_state {
>  	void *data;			/* Data for callback. */
> -	struct vring_desc *indir_desc;	/* Indirect descriptor, if any. */
> +	void *indir_desc;		/* Indirect descriptor, if any. */
> +	int num;			/* Descriptor list length. */
>  };
>  
>  struct vring_virtqueue {
>  	struct virtqueue vq;
>  
> -	/* Actual memory layout for this queue */
> -	struct vring vring;
> +	/* Is this a packed ring? */
> +	bool packed;
>  
>  	/* Can we use weak barriers? */
>  	bool weak_barriers;
> @@ -79,19 +80,45 @@ struct vring_virtqueue {
>  	/* Host publishes avail event idx */
>  	bool event;
>  
> -	/* Head of free buffer list. */
> -	unsigned int free_head;
>  	/* Number we've added since last sync. */
>  	unsigned int num_added;
>  
>  	/* Last used index we've seen. */
>  	u16 last_used_idx;
>  
> -	/* Last written value to avail->flags */
> -	u16 avail_flags_shadow;
> +	union {
> +		/* Available for split ring */
> +		struct {
> +			/* Actual memory layout for this queue. */
> +			struct vring vring;
>  
> -	/* Last written value to avail->idx in guest byte order */
> -	u16 avail_idx_shadow;
> +			/* Head of free buffer list. */
> +			unsigned int free_head;
> +
> +			/* Last written value to avail->flags */
> +			u16 avail_flags_shadow;
> +
> +			/* Last written value to avail->idx in
> +			 * guest byte order. */
> +			u16 avail_idx_shadow;
> +		};
> +
> +		/* Available for packed ring */
> +		struct {
> +			/* Actual memory layout for this queue. */
> +			struct vring_packed vring_packed;
> +
> +			/* Driver ring wrap counter. */
> +			u8 wrap_counter;
> +
> +			/* Index of the next avail descriptor. */
> +			unsigned int next_avail_idx;
> +
> +			/* Last written value to driver->flags in
> +			 * guest byte order. */
> +			u16 event_flags_shadow;
> +		};
> +	};
>  
>  	/* How to notify other side. FIXME: commonalize hcalls! */
>  	bool (*notify)(struct virtqueue *vq);
> @@ -201,8 +228,33 @@ static dma_addr_t vring_map_single(const struct vring_virtqueue *vq,
>  			      cpu_addr, size, direction);
>  }
>  
> -static void vring_unmap_one(const struct vring_virtqueue *vq,
> -			    struct vring_desc *desc)
> +static void vring_unmap_one_split(const struct vring_virtqueue *vq,
> +				  struct vring_desc *desc)
> +{
> +	u16 flags;
> +
> +	if (!vring_use_dma_api(vq->vq.vdev))
> +		return;
> +
> +	flags = virtio16_to_cpu(vq->vq.vdev, desc->flags);
> +
> +	if (flags & VRING_DESC_F_INDIRECT) {
> +		dma_unmap_single(vring_dma_dev(vq),
> +				 virtio64_to_cpu(vq->vq.vdev, desc->addr),
> +				 virtio32_to_cpu(vq->vq.vdev, desc->len),
> +
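
As a side note on the "Avoid using '%' operator" and "Driver ring wrap counter" items in the changelog: the sketch below is a minimal userspace illustration, not code from the patch, of how a next-avail index could be advanced with a compare-and-subtract instead of a modulo, flipping a wrap counter each time the index passes the end of the ring. The struct and function names (packed_ring_state, packed_ring_state_init, packed_ring_advance) are hypothetical and chosen only for this example; the real driver keeps this state in struct vring_virtqueue and may handle it differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified driver-side state; not the kernel's struct. */
struct packed_ring_state {
	uint16_t num;            /* ring size (number of descriptors) */
	uint16_t next_avail_idx; /* index of the next avail descriptor */
	bool wrap_counter;       /* driver ring wrap counter */
};

/* Start at index 0 with the wrap counter set. */
static void packed_ring_state_init(struct packed_ring_state *s, uint16_t num)
{
	assert(num > 0);
	s->num = num;
	s->next_avail_idx = 0;
	s->wrap_counter = true;
}

/*
 * Advance the index by n descriptors (n <= ring size) without '%'.
 * When the index runs past the end of the ring, subtract the ring
 * size once and flip the wrap counter.
 */
static void packed_ring_advance(struct packed_ring_state *s, uint16_t n)
{
	uint32_t idx;

	assert(n <= s->num);
	idx = (uint32_t)s->next_avail_idx + n;
	if (idx >= s->num) {
		idx -= s->num;
		s->wrap_counter = !s->wrap_counter;
	}
	s->next_avail_idx = (uint16_t)idx;
}
```

Because at most one ring's worth of descriptors is added per call, a single subtraction is enough, which is why the division implied by '%' can be avoided on this path.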