Re: [PATCH] hwmon: (coretemp) Handle frozen hotplug state correctly
2017-05-10 23:09 GMT+03:00 Guenter Roeck: > On Wed, May 10, 2017 at 10:16:33PM +0300, Tommi Rantala wrote: >> 2017-05-10 17:30 GMT+03:00 Thomas Gleixner : >> > The recent conversion to the hotplug state machine missed that the original >> > hotplug notifiers did not execute in the frozen state, which is used on >> > suspend on resume. >> > >> > This does not matter on single socket machines, but on multi socket systems >> > this breaks when the device for a non-boot socket is removed when the last >> > CPU of that socket is brought offline. The device removal locks up the >> > machine hard w/o any debug output. >> > >> > Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true. >> > >> > Thanks to Tommi for providing debug information patiently while I failed to >> > spot the obvious. >> > >> > Fixes: e00ca5df37ad ("hwmon: (coretemp) Convert to hotplug state machine") >> > Reported-by: Tommi Rantala >> > Signed-off-by: Thomas Gleixner >> >> Many thanks, I can confirm that it works well! >> > Ok if I add your Tested-by: ? Sure! Tested-by: Tommi Rantala > Thanks, > Guenter > >> -Tommi >> >> > --- >> > drivers/hwmon/coretemp.c | 14 ++ >> > 1 file changed, 14 insertions(+) >> > >> > --- a/drivers/hwmon/coretemp.c >> > +++ b/drivers/hwmon/coretemp.c >> > @@ -605,6 +605,13 @@ static int coretemp_cpu_online(unsigned >> > struct platform_data *pdata; >> > >> > /* >> > +* Don't execute this on resume as the offline callback did >> > +* not get executed on suspend. >> > +*/ >> > + if (cpuhp_tasks_frozen) >> > + return 0; >> > + >> > + /* >> > * CPUID.06H.EAX[0] indicates whether the CPU has thermal >> > * sensors. We check this bit only, all the early CPUs >> > * without thermal sensors will be filtered out. >> > @@ -654,6 +661,13 @@ static int coretemp_cpu_offline(unsigned >> > struct temp_data *tdata; >> > int indx, target; >> > >> > + /* >> > +* Don't execute this on suspend as the device remove locks >> > +* up the machine. >> > +*/ >> > + if (cpuhp_tasks_frozen) >> > + return 0; >> > + >> > /* If the physical CPU device does not exist, just return */ >> > if (!pdev) >> > return 0;
Re: [PATCH] hwmon: (coretemp) Handle frozen hotplug state correctly
2017-05-10 23:09 GMT+03:00 Guenter Roeck : > On Wed, May 10, 2017 at 10:16:33PM +0300, Tommi Rantala wrote: >> 2017-05-10 17:30 GMT+03:00 Thomas Gleixner : >> > The recent conversion to the hotplug state machine missed that the original >> > hotplug notifiers did not execute in the frozen state, which is used on >> > suspend on resume. >> > >> > This does not matter on single socket machines, but on multi socket systems >> > this breaks when the device for a non-boot socket is removed when the last >> > CPU of that socket is brought offline. The device removal locks up the >> > machine hard w/o any debug output. >> > >> > Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true. >> > >> > Thanks to Tommi for providing debug information patiently while I failed to >> > spot the obvious. >> > >> > Fixes: e00ca5df37ad ("hwmon: (coretemp) Convert to hotplug state machine") >> > Reported-by: Tommi Rantala >> > Signed-off-by: Thomas Gleixner >> >> Many thanks, I can confirm that it works well! >> > Ok if I add your Tested-by: ? Sure! Tested-by: Tommi Rantala > Thanks, > Guenter > >> -Tommi >> >> > --- >> > drivers/hwmon/coretemp.c | 14 ++ >> > 1 file changed, 14 insertions(+) >> > >> > --- a/drivers/hwmon/coretemp.c >> > +++ b/drivers/hwmon/coretemp.c >> > @@ -605,6 +605,13 @@ static int coretemp_cpu_online(unsigned >> > struct platform_data *pdata; >> > >> > /* >> > +* Don't execute this on resume as the offline callback did >> > +* not get executed on suspend. >> > +*/ >> > + if (cpuhp_tasks_frozen) >> > + return 0; >> > + >> > + /* >> > * CPUID.06H.EAX[0] indicates whether the CPU has thermal >> > * sensors. We check this bit only, all the early CPUs >> > * without thermal sensors will be filtered out. >> > @@ -654,6 +661,13 @@ static int coretemp_cpu_offline(unsigned >> > struct temp_data *tdata; >> > int indx, target; >> > >> > + /* >> > +* Don't execute this on suspend as the device remove locks >> > +* up the machine. >> > +*/ >> > + if (cpuhp_tasks_frozen) >> > + return 0; >> > + >> > /* If the physical CPU device does not exist, just return */ >> > if (!pdev) >> > return 0;
[PATCH] staging: typec: Fix sparse warnings about incorrect types
Fix the following sparse warnings about incorrect type usage: fusb302.c:1028:32: warning: incorrect type in argument 1 (different base types) fusb302.c:1028:32:expected unsigned short [unsigned] [usertype] header fusb302.c:1028:32:got restricted __le16 const [usertype] header fusb302.c:1484:32: warning: incorrect type in argument 1 (different base types) fusb302.c:1484:32:expected unsigned short [unsigned] [usertype] header fusb302.c:1484:32:got restricted __le16 [usertype] header Signed-off-by: Guru Das Srinagesh--- drivers/staging/typec/fusb302/fusb302.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/typec/fusb302/fusb302.c b/drivers/staging/typec/fusb302/fusb302.c index 2cee9a9..3bec9d5 100644 --- a/drivers/staging/typec/fusb302/fusb302.c +++ b/drivers/staging/typec/fusb302/fusb302.c @@ -1025,7 +1025,7 @@ static int fusb302_pd_send_message(struct fusb302_chip *chip, buf[pos++] = FUSB302_TKN_SYNC1; buf[pos++] = FUSB302_TKN_SYNC2; - len = pd_header_cnt(msg->header) * 4; + len = pd_header_cnt_le(msg->header) * 4; /* plug 2 for header */ len += 2; if (len > 0x1F) { @@ -1481,7 +1481,7 @@ static int fusb302_pd_read_message(struct fusb302_chip *chip, (u8 *)>header); if (ret < 0) return ret; - len = pd_header_cnt(msg->header) * 4; + len = pd_header_cnt_le(msg->header) * 4; /* add 4 to length to include the CRC */ if (len > PD_MAX_PAYLOAD * 4) { fusb302_log(chip, "PD message too long %d", len); -- 2.7.4
[PATCH] staging: typec: Fix sparse warnings about incorrect types
Fix the following sparse warnings about incorrect type usage: fusb302.c:1028:32: warning: incorrect type in argument 1 (different base types) fusb302.c:1028:32:expected unsigned short [unsigned] [usertype] header fusb302.c:1028:32:got restricted __le16 const [usertype] header fusb302.c:1484:32: warning: incorrect type in argument 1 (different base types) fusb302.c:1484:32:expected unsigned short [unsigned] [usertype] header fusb302.c:1484:32:got restricted __le16 [usertype] header Signed-off-by: Guru Das Srinagesh --- drivers/staging/typec/fusb302/fusb302.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/staging/typec/fusb302/fusb302.c b/drivers/staging/typec/fusb302/fusb302.c index 2cee9a9..3bec9d5 100644 --- a/drivers/staging/typec/fusb302/fusb302.c +++ b/drivers/staging/typec/fusb302/fusb302.c @@ -1025,7 +1025,7 @@ static int fusb302_pd_send_message(struct fusb302_chip *chip, buf[pos++] = FUSB302_TKN_SYNC1; buf[pos++] = FUSB302_TKN_SYNC2; - len = pd_header_cnt(msg->header) * 4; + len = pd_header_cnt_le(msg->header) * 4; /* plug 2 for header */ len += 2; if (len > 0x1F) { @@ -1481,7 +1481,7 @@ static int fusb302_pd_read_message(struct fusb302_chip *chip, (u8 *)>header); if (ret < 0) return ret; - len = pd_header_cnt(msg->header) * 4; + len = pd_header_cnt_le(msg->header) * 4; /* add 4 to length to include the CRC */ if (len > PD_MAX_PAYLOAD * 4) { fusb302_log(chip, "PD message too long %d", len); -- 2.7.4
Re: [patch V2 24/24] cpu/hotplug: Convert hotplug locking to percpu rwsem
Thomas Gleixnerwrites: > On Wed, 10 May 2017, Michael Ellerman wrote: > >> Thomas Gleixner writes: >> >> > @@ -130,6 +130,7 @@ void __static_key_slow_inc(struct static >> > * the all CPUs, for that to be serialized against CPU hot-plug >> > * we need to avoid CPUs coming online. >> > */ >> > + lockdep_assert_hotplug_held(); >> >jump_label_lock(); >> >if (atomic_read(>enabled) == 0) { >> >atomic_set(>enabled, -1); >> >> I seem to be hitting this assert from the ftrace event selftests, >> enabled at boot with CONFIG_FTRACE_STARTUP_TEST=y, using next-20170509 >> (on powerpc). >> >> The stupidly obvious (or perhaps obviously stupid) patch below fixes it: > > Kinda. There is more horror in that area lurking and I'm still trying to > figure out all the convoluted call pathes. OK thanks. cheers
Re: [patch V2 24/24] cpu/hotplug: Convert hotplug locking to percpu rwsem
Thomas Gleixner writes: > On Wed, 10 May 2017, Michael Ellerman wrote: > >> Thomas Gleixner writes: >> >> > @@ -130,6 +130,7 @@ void __static_key_slow_inc(struct static >> > * the all CPUs, for that to be serialized against CPU hot-plug >> > * we need to avoid CPUs coming online. >> > */ >> > + lockdep_assert_hotplug_held(); >> >jump_label_lock(); >> >if (atomic_read(>enabled) == 0) { >> >atomic_set(>enabled, -1); >> >> I seem to be hitting this assert from the ftrace event selftests, >> enabled at boot with CONFIG_FTRACE_STARTUP_TEST=y, using next-20170509 >> (on powerpc). >> >> The stupidly obvious (or perhaps obviously stupid) patch below fixes it: > > Kinda. There is more horror in that area lurking and I'm still trying to > figure out all the convoluted call pathes. OK thanks. cheers
Re: [Linux-graphics-maintainer] No mouse cursor since 36cc79bc9077319c04bd3b132edcacaa9a0d9f2b
> Sinclair Yeh hat am 10. Mai 2017 um 18:46 geschrieben: > > > Hi, Hi, > On Wed, May 10, 2017 at 12:31:39PM +0200, m.t wrote: > > > > > Thomas Hellstrom hat am 10. Mai 2017 um 10:35 geschrieben: > > > > > > Hi, > > > > > > Thanks for reporting. > > > > I would have reported earlier if it had not taken 3 days on the vm to > > bisect. > > > > > I think there is a fix for this either in preparation or queued for > > > submission. Sinclair (CC'd) should know more. > > > > If you point me to a patch I can test it. > > please give this patch a try: > > https://cgit.freedesktop.org/mesa/vmwgfx/commit/?id=324722b1e1582d45e865dcd2233a17edcfbd1638 Works fine. Thanks > If it doesn't work, then please send me the following: > 1. dmesg from the guest > 2. vmware.log from the host > 3. .vmx file for your VM > > thanks, > > Sinclair you're welcome Mark > > > > > Thanks, > > > Thomas > > > > Mark > > > > > > > > On 05/10/2017 09:17 AM, m.t wrote: > > > > Hi, > > > > I don't have a mouse cursor in my virtual machine (vmware workstation > > > > 12.5.5 > > > > build-5234757) with latest master from > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_=DwICAg=uilaK90D4TOVoH58JNXRgQ=j2ey7nuAQ5d6NbMmCOnRsTIJfmv7blo3UCKagbsM9o2-No8JdlhkKK3ty481ErVu=DWnNRbcrxMhc78PIvembLdV3OsJi3vPwcdG03kqVpJo=mc7-KF2t4BktXs11MShGZBf9bzgxumqhmVQ0ocAIS0k= > > > > > > > > kernel/git/torvalds/linux.git > > > > > > > > git bisect led me to 36cc79bc9077319c04bd3b132edcacaa9a0d9f2b as > > > > culprit. > > > > > > > > Regards > > > > Mark > > > > ___ > > > > Sent to linux-graphics-maintai...@vmware.com > > > > > >
Re: [Linux-graphics-maintainer] No mouse cursor since 36cc79bc9077319c04bd3b132edcacaa9a0d9f2b
> Sinclair Yeh hat am 10. Mai 2017 um 18:46 geschrieben: > > > Hi, Hi, > On Wed, May 10, 2017 at 12:31:39PM +0200, m.t wrote: > > > > > Thomas Hellstrom hat am 10. Mai 2017 um 10:35 geschrieben: > > > > > > Hi, > > > > > > Thanks for reporting. > > > > I would have reported earlier if it had not taken 3 days on the vm to > > bisect. > > > > > I think there is a fix for this either in preparation or queued for > > > submission. Sinclair (CC'd) should know more. > > > > If you point me to a patch I can test it. > > please give this patch a try: > > https://cgit.freedesktop.org/mesa/vmwgfx/commit/?id=324722b1e1582d45e865dcd2233a17edcfbd1638 Works fine. Thanks > If it doesn't work, then please send me the following: > 1. dmesg from the guest > 2. vmware.log from the host > 3. .vmx file for your VM > > thanks, > > Sinclair you're welcome Mark > > > > > Thanks, > > > Thomas > > > > Mark > > > > > > > > On 05/10/2017 09:17 AM, m.t wrote: > > > > Hi, > > > > I don't have a mouse cursor in my virtual machine (vmware workstation > > > > 12.5.5 > > > > build-5234757) with latest master from > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__git.kernel.org_pub_scm_linux_=DwICAg=uilaK90D4TOVoH58JNXRgQ=j2ey7nuAQ5d6NbMmCOnRsTIJfmv7blo3UCKagbsM9o2-No8JdlhkKK3ty481ErVu=DWnNRbcrxMhc78PIvembLdV3OsJi3vPwcdG03kqVpJo=mc7-KF2t4BktXs11MShGZBf9bzgxumqhmVQ0ocAIS0k= > > > > > > > > kernel/git/torvalds/linux.git > > > > > > > > git bisect led me to 36cc79bc9077319c04bd3b132edcacaa9a0d9f2b as > > > > culprit. > > > > > > > > Regards > > > > Mark > > > > ___ > > > > Sent to linux-graphics-maintai...@vmware.com > > > > > >
Re: [PATCH 0/4] KVM: x86: kvm_mwait_in_guest() cleanup and fixes
On 06/05/2017 18:48, Gabriel L. Somlo wrote: > So, in conclusion; it's not important to *me* that this old machine > keeps working, I'm just volunteering test data points. So please don't > feel obligated in any way to go out of your way on my account. OTOH, > I'm happy to provide feedback as long as you would like me to. > > Along the same lines: Paolo, as the author of commit 2c82878b0cb38fd, > is the Xeon chip listed above one of the "obsolete for virtualization" > models ? Yes - I hadn't tested this model in particular, and this one is a little less obsolete compared to the ones I found without NMI support (a 64-bit Prescott and a 32-bit Yonah), but I still believe it's saner to treat them as obsolete. Can you please run vmxcap (from QEMU's git repository) on that processor and include the output? Paolo > In that case, it makes no sense for me to keep using it for > tests, and the fact that it misbehaves with L1 MWAIT should also not > matter at all.
Re: [PATCH 0/4] KVM: x86: kvm_mwait_in_guest() cleanup and fixes
On 06/05/2017 18:48, Gabriel L. Somlo wrote: > So, in conclusion; it's not important to *me* that this old machine > keeps working, I'm just volunteering test data points. So please don't > feel obligated in any way to go out of your way on my account. OTOH, > I'm happy to provide feedback as long as you would like me to. > > Along the same lines: Paolo, as the author of commit 2c82878b0cb38fd, > is the Xeon chip listed above one of the "obsolete for virtualization" > models ? Yes - I hadn't tested this model in particular, and this one is a little less obsolete compared to the ones I found without NMI support (a 64-bit Prescott and a 32-bit Yonah), but I still believe it's saner to treat them as obsolete. Can you please run vmxcap (from QEMU's git repository) on that processor and include the output? Paolo > In that case, it makes no sense for me to keep using it for > tests, and the fact that it misbehaves with L1 MWAIT should also not > matter at all.
Re: [PATCH] FS: Fixing return type of unsigned_offsets
If I remove '!!', sparse flags warning: fs/read_write.c:38:29: warning: incorrect type in return expression (different base types) fs/read_write.c:38:29:expected bool fs/read_write.c:38:29:got restricted fmode_t It means explicit conversion is needed. On Thu, May 11, 2017 at 10:25 AM, Joe Percheswrote: > On Thu, 2017-05-11 at 10:13 +0530, Pushkar Jambhlekar wrote: >> Should I change my implementation, i.e. remove '!!'? > > That'd be up to Al. > > At least one implementation using similar bit comparisons > in fs/*.c does not use !! > > fs/locks.c:static bool lease_breaking(struct file_lock *fl) > fs/locks.c-{ > fs/locks.c- return fl->fl_flags & (FL_UNLOCK_PENDING | > FL_DOWNGRADE_PENDING) > fs/locks.c-} > > I didn't look very hard. -- Jambhlekar Pushkar Arun
Re: [PATCH] FS: Fixing return type of unsigned_offsets
If I remove '!!', sparse flags warning: fs/read_write.c:38:29: warning: incorrect type in return expression (different base types) fs/read_write.c:38:29:expected bool fs/read_write.c:38:29:got restricted fmode_t It means explicit conversion is needed. On Thu, May 11, 2017 at 10:25 AM, Joe Perches wrote: > On Thu, 2017-05-11 at 10:13 +0530, Pushkar Jambhlekar wrote: >> Should I change my implementation, i.e. remove '!!'? > > That'd be up to Al. > > At least one implementation using similar bit comparisons > in fs/*.c does not use !! > > fs/locks.c:static bool lease_breaking(struct file_lock *fl) > fs/locks.c-{ > fs/locks.c- return fl->fl_flags & (FL_UNLOCK_PENDING | > FL_DOWNGRADE_PENDING) > fs/locks.c-} > > I didn't look very hard. -- Jambhlekar Pushkar Arun
[PATCH] kernel/power: Declare variables as static
Fixing sparse warnings: 'symbol not declared. Should it be static?' Variables need to be static since they have not used outside of the files. Signed-off-by: Pushkar Jambhlekar--- kernel/power/snapshot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index 3b1e0f3..fa46606 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -1425,7 +1425,7 @@ static unsigned int nr_meta_pages; * Numbers of normal and highmem page frames allocated for hibernation image * before suspending devices. */ -unsigned int alloc_normal, alloc_highmem; +static unsigned int alloc_normal, alloc_highmem; /* * Memory bitmap used for marking saveable pages (during hibernation) or * hibernation image pages (during restore) -- 2.7.4
Re: [PATCH v2 0/3] nVMX: Emulated Page Modification Logging for Nested Virtualization
On 09/05/2017 18:03, Bandan Das wrote: >> I tested this with api/dirty-log-perf, and nested PML is more than 3 >> times faster than pml=0. I want to do a few more tests because I don't >> see any PML full exits in the L1 trace, but it seems to be a nice >> improvement! > > Thanks for testing! Regarding the PML full exits, I did notice their > absence. I induced it artifically in my testing with a lower index > and it seemed to work fine. Yes, tracing showed that it's because the guest's EXTERNAL_INTERRUPT exits are too frequent. Paolo
Re: [PATCH v2 0/3] nVMX: Emulated Page Modification Logging for Nested Virtualization
On 09/05/2017 18:03, Bandan Das wrote: >> I tested this with api/dirty-log-perf, and nested PML is more than 3 >> times faster than pml=0. I want to do a few more tests because I don't >> see any PML full exits in the L1 trace, but it seems to be a nice >> improvement! > > Thanks for testing! Regarding the PML full exits, I did notice their > absence. I induced it artifically in my testing with a lower index > and it seemed to work fine. Yes, tracing showed that it's because the guest's EXTERNAL_INTERRUPT exits are too frequent. Paolo
[PATCH] kernel/power: Declare variables as static
Fixing sparse warnings: 'symbol not declared. Should it be static?' Variables need to be static since they have not used outside of the files. Signed-off-by: Pushkar Jambhlekar --- kernel/power/snapshot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c index 3b1e0f3..fa46606 100644 --- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -1425,7 +1425,7 @@ static unsigned int nr_meta_pages; * Numbers of normal and highmem page frames allocated for hibernation image * before suspending devices. */ -unsigned int alloc_normal, alloc_highmem; +static unsigned int alloc_normal, alloc_highmem; /* * Memory bitmap used for marking saveable pages (during hibernation) or * hibernation image pages (during restore) -- 2.7.4
[GIT PULL] MTD updates for 4.12-rc1
Hi Linus, The following changes since commit c1ae3cfa0e89fa1a7ecc4c99031f5e9ae99d9201: Linux 4.11-rc1 (2017-03-05 12:59:56 -0800) are available in the git repository at: git://git.infradead.org/linux-mtd.git tags/for-linus-20170510 for you to fetch changes up to a9402889f41cc2db7a9b162990bef271be098ff0: MAINTAINERS: Update NAND subsystem git repositories (2017-05-10 18:22:38 -0700) MTD updates for 4.12-rc1: NAND, from Boris: """ - some minor fixes/improvements on existing drivers (fsmc, gpio, ifc, davinci, brcmnand, omap) - a huge cleanup/rework of the denali driver accompanied with core fixes/improvements to simplify the driver code - a complete rewrite of the atmel driver to support new DT bindings make future evolution easier - the addition of per-vendor detection/initialization steps to avoid extending the nand_ids table with more extended-id entries """ SPI NOR, from Cyrille: """ - fixes in the hisi SPI controller driver. - fixes in the intel SPI controller driver. - fixes in the Mediatek SPI controller driver. - fixes to some SPI flash memories not supported the Chip Erase command. - add support to some new memory parts (Winbond, Macronix, Micron, ESMT). - add new driver for the STM32 QSPI controller. """ And a few fixes for Gemini and Versatile platforms on physmap-of Alexander Couzens (1): mtd: nand: add ooblayout for old hamming layout Alexander Kurz (2): drivers mtd: spi-nor: add Winbond W25Q20 variants drivers mtd: spi-nor: add Macronix MX25Ux033E and MX25Ux035 variants Alexey Khoroshilov (1): mtd: spi-nor: hisi: do not ignore clk_prepare_enable() failure Alison Wang (2): memory: ifc: Update dependency of IFC for LS1021A mtd: nand: Update dependency of IFC for LS1021A Boris Brezillon (21): mtd: nand: Get rid of the mtd parameter in all auto-detection functions mtd: nand: Store nand ID in struct nand_chip mtd: nand: Get rid of busw parameter mtd: nand: Rename nand_get_flash_type() into nand_detect() mtd: nand: Rename the nand_manufacturers struct mtd: nand: Kill the MTD_NAND_IDS Kconfig option mtd: nand: Do not expose the NAND manufacturer table directly mtd: nand: Add manufacturer specific initialization/detection steps mtd: nand: Move Samsung specific init/detection logic in nand_samsung.c mtd: nand: Move Hynix specific init/detection logic in nand_hynix.c mtd: nand: Move Toshiba specific init/detection logic in nand_toshiba.c mtd: nand: Move Micron specific init logic in nand_micron.c mtd: nand: Move AMD/Spansion specific init/detection logic in nand_amd.c mtd: nand: Move Macronix specific initialization in nand_macronix.c mtd: nand: hynix: Rework NAND ID decoding to extract more information mtd: nand: hynix: Add read-retry support for 1x nm MLC NANDs mtd: nand: tango: Enforce DMA direction type mtd: nand: Cleanup/rework the atmel_nand driver mtd: nand: atmel: Document the new DT bindings mtd: nand: Remove unused chip->write_page() hook MAINTAINERS: Update NAND subsystem git repositories Brian Norris (2): Merge tag 'nand/for-4.12' of github.com:linux-nand/linux into MTD Merge tag 'spi-nor/for-4.12-v2' of git://github.com/spi-nor/linux into MTD Christophe Jaillet (1): mtd: nand: NULL terminate a of_device_id table Christophe Leroy (2): mtd: nand: gpio: make nCE GPIO optional mtd: nand: gpio: update binding Colin Ian King (2): mtd: nand: nandsim: fix spelling mistake: "weakpagess" -> "weakpages" jffs2: fix spelling mistake: "requestied" -> "requested" Cyrille Pitchen (1): MAINTAINERS: change email address from atmel.com to wedev4u.fr Dan Carpenter (3): mtd: nand: hynix: Fix an error code in init mtd: nand: Fix a couple error codes mtd: oxnas_nand: Allocating more than necessary in probe() Geliang Tang (1): mtd: mtdswap: use MTDSWAP_ECNT_MIN/MAX Guochun Mao (1): mtd: mtk-nor: set controller's address width according to nor flash Hans de Goede (1): mtd: nand: samsung: Retrieve ECC requirements from extended ID Joe Perches (1): drivers/mtd: Convert remaining uses of pr_warning to pr_warn Kamal Dasu (1): mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program L. D. Pinney (1): mtd: spi-nor: Add support for ESMT F25L32QA and F25L64QA Linus Walleij (1): mtd: physmap_of: really fix the physmap add-ons Ludovic Barre (2): mtd: spi-nor: add driver for STM32 quad spi flash controller dt-bindings: mtd: Document the STM32 QSPI bindings Masahiro Yamada (31): mtd: nand: allow to set only one of ECC size and ECC stre
[GIT PULL] MTD updates for 4.12-rc1
Hi Linus, The following changes since commit c1ae3cfa0e89fa1a7ecc4c99031f5e9ae99d9201: Linux 4.11-rc1 (2017-03-05 12:59:56 -0800) are available in the git repository at: git://git.infradead.org/linux-mtd.git tags/for-linus-20170510 for you to fetch changes up to a9402889f41cc2db7a9b162990bef271be098ff0: MAINTAINERS: Update NAND subsystem git repositories (2017-05-10 18:22:38 -0700) MTD updates for 4.12-rc1: NAND, from Boris: """ - some minor fixes/improvements on existing drivers (fsmc, gpio, ifc, davinci, brcmnand, omap) - a huge cleanup/rework of the denali driver accompanied with core fixes/improvements to simplify the driver code - a complete rewrite of the atmel driver to support new DT bindings make future evolution easier - the addition of per-vendor detection/initialization steps to avoid extending the nand_ids table with more extended-id entries """ SPI NOR, from Cyrille: """ - fixes in the hisi SPI controller driver. - fixes in the intel SPI controller driver. - fixes in the Mediatek SPI controller driver. - fixes to some SPI flash memories not supported the Chip Erase command. - add support to some new memory parts (Winbond, Macronix, Micron, ESMT). - add new driver for the STM32 QSPI controller. """ And a few fixes for Gemini and Versatile platforms on physmap-of Alexander Couzens (1): mtd: nand: add ooblayout for old hamming layout Alexander Kurz (2): drivers mtd: spi-nor: add Winbond W25Q20 variants drivers mtd: spi-nor: add Macronix MX25Ux033E and MX25Ux035 variants Alexey Khoroshilov (1): mtd: spi-nor: hisi: do not ignore clk_prepare_enable() failure Alison Wang (2): memory: ifc: Update dependency of IFC for LS1021A mtd: nand: Update dependency of IFC for LS1021A Boris Brezillon (21): mtd: nand: Get rid of the mtd parameter in all auto-detection functions mtd: nand: Store nand ID in struct nand_chip mtd: nand: Get rid of busw parameter mtd: nand: Rename nand_get_flash_type() into nand_detect() mtd: nand: Rename the nand_manufacturers struct mtd: nand: Kill the MTD_NAND_IDS Kconfig option mtd: nand: Do not expose the NAND manufacturer table directly mtd: nand: Add manufacturer specific initialization/detection steps mtd: nand: Move Samsung specific init/detection logic in nand_samsung.c mtd: nand: Move Hynix specific init/detection logic in nand_hynix.c mtd: nand: Move Toshiba specific init/detection logic in nand_toshiba.c mtd: nand: Move Micron specific init logic in nand_micron.c mtd: nand: Move AMD/Spansion specific init/detection logic in nand_amd.c mtd: nand: Move Macronix specific initialization in nand_macronix.c mtd: nand: hynix: Rework NAND ID decoding to extract more information mtd: nand: hynix: Add read-retry support for 1x nm MLC NANDs mtd: nand: tango: Enforce DMA direction type mtd: nand: Cleanup/rework the atmel_nand driver mtd: nand: atmel: Document the new DT bindings mtd: nand: Remove unused chip->write_page() hook MAINTAINERS: Update NAND subsystem git repositories Brian Norris (2): Merge tag 'nand/for-4.12' of github.com:linux-nand/linux into MTD Merge tag 'spi-nor/for-4.12-v2' of git://github.com/spi-nor/linux into MTD Christophe Jaillet (1): mtd: nand: NULL terminate a of_device_id table Christophe Leroy (2): mtd: nand: gpio: make nCE GPIO optional mtd: nand: gpio: update binding Colin Ian King (2): mtd: nand: nandsim: fix spelling mistake: "weakpagess" -> "weakpages" jffs2: fix spelling mistake: "requestied" -> "requested" Cyrille Pitchen (1): MAINTAINERS: change email address from atmel.com to wedev4u.fr Dan Carpenter (3): mtd: nand: hynix: Fix an error code in init mtd: nand: Fix a couple error codes mtd: oxnas_nand: Allocating more than necessary in probe() Geliang Tang (1): mtd: mtdswap: use MTDSWAP_ECNT_MIN/MAX Guochun Mao (1): mtd: mtk-nor: set controller's address width according to nor flash Hans de Goede (1): mtd: nand: samsung: Retrieve ECC requirements from extended ID Joe Perches (1): drivers/mtd: Convert remaining uses of pr_warning to pr_warn Kamal Dasu (1): mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program L. D. Pinney (1): mtd: spi-nor: Add support for ESMT F25L32QA and F25L64QA Linus Walleij (1): mtd: physmap_of: really fix the physmap add-ons Ludovic Barre (2): mtd: spi-nor: add driver for STM32 quad spi flash controller dt-bindings: mtd: Document the STM32 QSPI bindings Masahiro Yamada (31): mtd: nand: allow to set only one of ECC size and ECC stre
Re: [PATCH] FS: Fixing return type of unsigned_offsets
On Thu, 2017-05-11 at 10:13 +0530, Pushkar Jambhlekar wrote: > Should I change my implementation, i.e. remove '!!'? That'd be up to Al. At least one implementation using similar bit comparisons in fs/*.c does not use !! fs/locks.c:static bool lease_breaking(struct file_lock *fl) fs/locks.c-{ fs/locks.c- return fl->fl_flags & (FL_UNLOCK_PENDING | FL_DOWNGRADE_PENDING) fs/locks.c-} I didn't look very hard.
Re: [PATCH] FS: Fixing return type of unsigned_offsets
On Thu, 2017-05-11 at 10:13 +0530, Pushkar Jambhlekar wrote: > Should I change my implementation, i.e. remove '!!'? That'd be up to Al. At least one implementation using similar bit comparisons in fs/*.c does not use !! fs/locks.c:static bool lease_breaking(struct file_lock *fl) fs/locks.c-{ fs/locks.c- return fl->fl_flags & (FL_UNLOCK_PENDING | FL_DOWNGRADE_PENDING) fs/locks.c-} I didn't look very hard.
Re: [PATCH 1/1] selftests: sync: add config fragment for testing sync framework
Fathi Boudrawrites: > Unless the software synchronization objects (CONFIG_SW_SYNC) is enabled, > the sync test will fail: > > Additional Information: > Running tests in sync > > [RUN] Testing sync framework > [RUN] Executing test_alloc_timeline > [ERROR] Failure allocating timeline It would be better if the test just detected that the kernel didn't support the API. It seems to rely on /sys/kernel/debug/sync/sw_sync existing. How about this? diff --git a/tools/testing/selftests/sync/sync_test.c b/tools/testing/selftests/sync/sync_test.c index 9ea08d9f0b13..62fa666e501a 100644 --- a/tools/testing/selftests/sync/sync_test.c +++ b/tools/testing/selftests/sync/sync_test.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include "synctest.h" @@ -52,10 +53,22 @@ static int run_test(int (*test)(void), char *name) exit(test()); } +static int sync_api_supported(void) +{ + struct stat sbuf; + + return 0 == stat("/sys/kernel/debug/sync/sw_sync", ); +} + int main(void) { int err = 0; + if (!sync_api_supported()) { + printf("SKIP: Sync framework not supported by kernel\n"); + return 0; + } + printf("[RUN]\tTesting sync framework\n"); err += RUN_TEST(test_alloc_timeline); cheers
Re: [PATCH 1/1] selftests: sync: add config fragment for testing sync framework
Fathi Boudra writes: > Unless the software synchronization objects (CONFIG_SW_SYNC) is enabled, > the sync test will fail: > > Additional Information: > Running tests in sync > > [RUN] Testing sync framework > [RUN] Executing test_alloc_timeline > [ERROR] Failure allocating timeline It would be better if the test just detected that the kernel didn't support the API. It seems to rely on /sys/kernel/debug/sync/sw_sync existing. How about this? diff --git a/tools/testing/selftests/sync/sync_test.c b/tools/testing/selftests/sync/sync_test.c index 9ea08d9f0b13..62fa666e501a 100644 --- a/tools/testing/selftests/sync/sync_test.c +++ b/tools/testing/selftests/sync/sync_test.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include "synctest.h" @@ -52,10 +53,22 @@ static int run_test(int (*test)(void), char *name) exit(test()); } +static int sync_api_supported(void) +{ + struct stat sbuf; + + return 0 == stat("/sys/kernel/debug/sync/sw_sync", ); +} + int main(void) { int err = 0; + if (!sync_api_supported()) { + printf("SKIP: Sync framework not supported by kernel\n"); + return 0; + } + printf("[RUN]\tTesting sync framework\n"); err += RUN_TEST(test_alloc_timeline); cheers
RE: [PATCH] net: fec: select queue depending on VLAN priority
From: Stefan AgnerSent: Thursday, May 11, 2017 12:08 PM >To: Andy Duan >Cc: David Miller ; and...@lunn.ch; >feste...@gmail.com; net...@vger.kernel.org; linux- >ker...@vger.kernel.org >Subject: Re: [PATCH] net: fec: select queue depending on VLAN priority > >On 2017-05-09 19:42, Andy Duan wrote: >> From: David Miller Sent: Tuesday, May 09, 2017 >> 9:39 PM >>>To: ste...@agner.ch >>>Cc: Andy Duan ; and...@lunn.ch; >>>feste...@gmail.com; net...@vger.kernel.org; linux- >>>ker...@vger.kernel.org >>>Subject: Re: [PATCH] net: fec: select queue depending on VLAN priority >>> >>>From: Stefan Agner >>>Date: Mon, 8 May 2017 22:37:08 -0700 >>> Since the addition of the multi queue code with commit 59d0f7465644 ("net: fec: init multi queue date structure") the queue selection has been handelt by the default transmit queue selection implementation which tries to evenly distribute the traffic across all available queues. This selection presumes that the queues are using an equal priority, however, the queues 1 and 2 are actually of higher priority (the classification of the queues is enabled in >fec_enet_enable_ring). This can lead to net scheduler warnings and continuous TX ring dumps when exercising the system with iperf. Use only queue 0 for all common traffic (no VLAN and P802.1p priority 0 and 1) and route level 2-7 through queue 1 and 2. Signed-off-by: Fugang Duan Fixes: 59d0f7465644 ("net: fec: init multi queue date structure") >>> >>>If the queues are used for prioritization, and it does not have >>>multiple normal priority level queues, multiqueue is not what the >>>driver should have implemented. >> Firstly, HW multiple queues support: >> - Traffic-shaping bandwidth distribution supports credit-based and >> round-robin-based policies. Either policy can be combined with >> time-based shaping. >> - AVB (Audio Video Bridging, IEEE 802.1Qav) features: >> * Credit-based bandwidth distribution policy can be combined >with >> time-based shaping >> * AVB endpoint talker and listener support >> * Support for arbitration between different priority traffic >> (for >> example, AVB class A, AVB class B, and non-AVB) Round-robin-based >> policies: >> It has the same priority for three queues: In the round-robin QoS >> scheme, each queue is given an equal opportunity to transmit one >> frame. For example, if queue n has a frame to transmit, the queue >> transmits its frame. After queue n has transmitted its frame, or if >> queue n does not have a frame to transmit, queue n+1 is then allowed >> to transmit its frame, and so on. >> >> Credit-based policies: >> The AVB credit based shaper acts independently, per class, to control >> the bandwidth distribution between normal traffic and time-sensitive >> traffic with respect to the total link bandwidth available. >> Credit based shaper conbined with time-based shaping: >> - priority: ClassA queue > ClassB queue > best effort >> - ensure the queue bandwidth as user set based on time- >based shaping >> algorithms (transmitter transmit frame from three queue in turn based >> on time-based shaping algorithms) >> And in real AVB case, each streaming can be independent, and are >> fixed on related queue. Then driver level should implement >> .ndo_select_queue() to put the streaming into related queue. That is >> what the patch did. >> >> The current driver config the three queue to credit-based policies >> (AVB), the patch seems no problem for the implementation. Do you have >> any suggestion ? >> > >I tried using the round robin mode by adding this: > >+ /* Set Round-Robin policy */ >+ writel(1, fep->hwp + FEC_QOS_SCHEME); > >After a while I got the warning non the less: > >WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 >dev_watchdog+0x248/0x24c NETDEV WATCHDOG: eth0 (fec): transmit queue >2 timed out Modules linked in: >CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00058-g56d22eced8bc- >dirty #377 Hardware name: Freescale i.MX7 Dual (Device Tree) [] >(unwind_backtrace) from [] (show_stack+0x10/0x14) [] >(show_stack) from [] (dump_stack+0x78/0x8c) [] >(dump_stack) from [] (__warn+0xe8/0x100) [] >(__warn) from [] (warn_slowpath_fmt+0x38/0x48) [] >(warn_slowpath_fmt) from [] >(dev_watchdog+0x248/0x24c) >[] (dev_watchdog) from [] (call_timer_fn+0x28/0x98) >[] (call_timer_fn) from [] (expire_timers+0xa0/0xac) >[] (expire_timers) from [] >(run_timer_softirq+0x9c/0x194) >[] (run_timer_softirq) from [] >(__do_softirq+0x114/0x234) >[] (__do_softirq) from [] (irq_exit+0xcc/0x108) >[] (irq_exit) from [] >(__handle_domain_irq+0x80/0xec) >[] (__handle_domain_irq) from [] >(gic_handle_irq+0x48/0x8c) >[] (gic_handle_irq) from [] (__irq_svc+0x58/0x8c)
RE: [PATCH] net: fec: select queue depending on VLAN priority
From: Stefan Agner Sent: Thursday, May 11, 2017 12:08 PM >To: Andy Duan >Cc: David Miller ; and...@lunn.ch; >feste...@gmail.com; net...@vger.kernel.org; linux- >ker...@vger.kernel.org >Subject: Re: [PATCH] net: fec: select queue depending on VLAN priority > >On 2017-05-09 19:42, Andy Duan wrote: >> From: David Miller Sent: Tuesday, May 09, 2017 >> 9:39 PM >>>To: ste...@agner.ch >>>Cc: Andy Duan ; and...@lunn.ch; >>>feste...@gmail.com; net...@vger.kernel.org; linux- >>>ker...@vger.kernel.org >>>Subject: Re: [PATCH] net: fec: select queue depending on VLAN priority >>> >>>From: Stefan Agner >>>Date: Mon, 8 May 2017 22:37:08 -0700 >>> Since the addition of the multi queue code with commit 59d0f7465644 ("net: fec: init multi queue date structure") the queue selection has been handelt by the default transmit queue selection implementation which tries to evenly distribute the traffic across all available queues. This selection presumes that the queues are using an equal priority, however, the queues 1 and 2 are actually of higher priority (the classification of the queues is enabled in >fec_enet_enable_ring). This can lead to net scheduler warnings and continuous TX ring dumps when exercising the system with iperf. Use only queue 0 for all common traffic (no VLAN and P802.1p priority 0 and 1) and route level 2-7 through queue 1 and 2. Signed-off-by: Fugang Duan Fixes: 59d0f7465644 ("net: fec: init multi queue date structure") >>> >>>If the queues are used for prioritization, and it does not have >>>multiple normal priority level queues, multiqueue is not what the >>>driver should have implemented. >> Firstly, HW multiple queues support: >> - Traffic-shaping bandwidth distribution supports credit-based and >> round-robin-based policies. Either policy can be combined with >> time-based shaping. >> - AVB (Audio Video Bridging, IEEE 802.1Qav) features: >> * Credit-based bandwidth distribution policy can be combined >with >> time-based shaping >> * AVB endpoint talker and listener support >> * Support for arbitration between different priority traffic >> (for >> example, AVB class A, AVB class B, and non-AVB) Round-robin-based >> policies: >> It has the same priority for three queues: In the round-robin QoS >> scheme, each queue is given an equal opportunity to transmit one >> frame. For example, if queue n has a frame to transmit, the queue >> transmits its frame. After queue n has transmitted its frame, or if >> queue n does not have a frame to transmit, queue n+1 is then allowed >> to transmit its frame, and so on. >> >> Credit-based policies: >> The AVB credit based shaper acts independently, per class, to control >> the bandwidth distribution between normal traffic and time-sensitive >> traffic with respect to the total link bandwidth available. >> Credit based shaper conbined with time-based shaping: >> - priority: ClassA queue > ClassB queue > best effort >> - ensure the queue bandwidth as user set based on time- >based shaping >> algorithms (transmitter transmit frame from three queue in turn based >> on time-based shaping algorithms) >> And in real AVB case, each streaming can be independent, and are >> fixed on related queue. Then driver level should implement >> .ndo_select_queue() to put the streaming into related queue. That is >> what the patch did. >> >> The current driver config the three queue to credit-based policies >> (AVB), the patch seems no problem for the implementation. Do you have >> any suggestion ? >> > >I tried using the round robin mode by adding this: > >+ /* Set Round-Robin policy */ >+ writel(1, fep->hwp + FEC_QOS_SCHEME); > >After a while I got the warning non the less: > >WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 >dev_watchdog+0x248/0x24c NETDEV WATCHDOG: eth0 (fec): transmit queue >2 timed out Modules linked in: >CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00058-g56d22eced8bc- >dirty #377 Hardware name: Freescale i.MX7 Dual (Device Tree) [] >(unwind_backtrace) from [] (show_stack+0x10/0x14) [] >(show_stack) from [] (dump_stack+0x78/0x8c) [] >(dump_stack) from [] (__warn+0xe8/0x100) [] >(__warn) from [] (warn_slowpath_fmt+0x38/0x48) [] >(warn_slowpath_fmt) from [] >(dev_watchdog+0x248/0x24c) >[] (dev_watchdog) from [] (call_timer_fn+0x28/0x98) >[] (call_timer_fn) from [] (expire_timers+0xa0/0xac) >[] (expire_timers) from [] >(run_timer_softirq+0x9c/0x194) >[] (run_timer_softirq) from [] >(__do_softirq+0x114/0x234) >[] (__do_softirq) from [] (irq_exit+0xcc/0x108) >[] (irq_exit) from [] >(__handle_domain_irq+0x80/0xec) >[] (__handle_domain_irq) from [] >(gic_handle_irq+0x48/0x8c) >[] (gic_handle_irq) from [] (__irq_svc+0x58/0x8c) >Exception stack(0xc1001f28 to 0xc1001f70) >1f20: 0001 c022fdc0 c100 >c1003d80 >1f40: c1003d34
Re: [PATCH] FS: Fixing return type of unsigned_offsets
Should I change my implementation, i.e. remove '!!'? On Thu, May 11, 2017 at 10:09 AM, Joe Percheswrote: > On Thu, 2017-05-11 at 09:57 +0530, Pushkar Jambhlekar wrote: >> Fixing Sparse warning. It should return bool, instead it returns >> int. > [] >> diff --git a/fs/read_write.c b/fs/read_write.c > [] >> @@ -33,9 +33,9 @@ const struct file_operations generic_ro_fops = { >> >> EXPORT_SYMBOL(generic_ro_fops); >> >> -static inline int unsigned_offsets(struct file *file) >> +static inline bool unsigned_offsets(struct file *file) >> { >> - return file->f_mode & FMODE_UNSIGNED_OFFSET; >> + return !!(file->f_mode & FMODE_UNSIGNED_OFFSET); > > trivia: the !! isn't necessary as by definition > all non-zero assigns to bool are converted to 1. > -- Jambhlekar Pushkar Arun
Re: [PATCH] FS: Fixing return type of unsigned_offsets
Should I change my implementation, i.e. remove '!!'? On Thu, May 11, 2017 at 10:09 AM, Joe Perches wrote: > On Thu, 2017-05-11 at 09:57 +0530, Pushkar Jambhlekar wrote: >> Fixing Sparse warning. It should return bool, instead it returns >> int. > [] >> diff --git a/fs/read_write.c b/fs/read_write.c > [] >> @@ -33,9 +33,9 @@ const struct file_operations generic_ro_fops = { >> >> EXPORT_SYMBOL(generic_ro_fops); >> >> -static inline int unsigned_offsets(struct file *file) >> +static inline bool unsigned_offsets(struct file *file) >> { >> - return file->f_mode & FMODE_UNSIGNED_OFFSET; >> + return !!(file->f_mode & FMODE_UNSIGNED_OFFSET); > > trivia: the !! isn't necessary as by definition > all non-zero assigns to bool are converted to 1. > -- Jambhlekar Pushkar Arun
Re: [PATCH] FS: Fixing return type of unsigned_offsets
On Thu, 2017-05-11 at 09:57 +0530, Pushkar Jambhlekar wrote: > Fixing Sparse warning. It should return bool, instead it returns > int. [] > diff --git a/fs/read_write.c b/fs/read_write.c [] > @@ -33,9 +33,9 @@ const struct file_operations generic_ro_fops = { > > EXPORT_SYMBOL(generic_ro_fops); > > -static inline int unsigned_offsets(struct file *file) > +static inline bool unsigned_offsets(struct file *file) > { > - return file->f_mode & FMODE_UNSIGNED_OFFSET; > + return !!(file->f_mode & FMODE_UNSIGNED_OFFSET); trivia: the !! isn't necessary as by definition all non-zero assigns to bool are converted to 1.
Re: [PATCH] FS: Fixing return type of unsigned_offsets
On Thu, 2017-05-11 at 09:57 +0530, Pushkar Jambhlekar wrote: > Fixing Sparse warning. It should return bool, instead it returns > int. [] > diff --git a/fs/read_write.c b/fs/read_write.c [] > @@ -33,9 +33,9 @@ const struct file_operations generic_ro_fops = { > > EXPORT_SYMBOL(generic_ro_fops); > > -static inline int unsigned_offsets(struct file *file) > +static inline bool unsigned_offsets(struct file *file) > { > - return file->f_mode & FMODE_UNSIGNED_OFFSET; > + return !!(file->f_mode & FMODE_UNSIGNED_OFFSET); trivia: the !! isn't necessary as by definition all non-zero assigns to bool are converted to 1.
Re: [PATCH -mm -v10 1/3] mm, THP, swap: Delay splitting THP during swap out
On Thu, May 11, 2017 at 08:50:01AM +0800, Huang, Ying wrote: < snip > > >> > @@ -1125,8 +1125,28 @@ static unsigned long shrink_page_list(struct > >> > list_head *page_list, > >> > !PageSwapCache(page)) { > >> > if (!(sc->gfp_mask & __GFP_IO)) > >> > goto keep_locked; > >> > -if (!add_to_swap(page, page_list)) > >> > +swap_retry: > >> > +/* > >> > + * Retry after split if we fail to allocate > >> > + * swap space of a THP. > >> > + */ > >> > +if (!add_to_swap(page)) { > >> > +if (!PageTransHuge(page) || > >> > +split_huge_page_to_list(page, > >> > page_list)) > >> > +goto activate_locked; > >> > +goto swap_retry; > >> > +} > >> > >> This is definitely better. > > > > Thanks. > > > >> > >> However, I think it'd be cleaner without the label here: > >> > >>if (!add_to_swap(page)) { > >>if (!PageTransHuge(page)) > >>goto activate_locked; > >>/* Split THP and swap individual base pages */ > >>if (split_huge_page_to_list(page, page_list)) > >>goto activate_locked; > >>if (!add_to_swap(page)) > >>goto activate_locked; > > > > Yes. > > > >>} > >> > >> > +/* > >> > + * Got swap space successfully. But > >> > unfortunately, > >> > + * we don't support a THP page writeout so > >> > split it. > >> > + */ > >> > +if (PageTransHuge(page) && > >> > + split_huge_page_to_list(page, > >> > page_list)) { > >> > +delete_from_swap_cache(page); > >> > goto activate_locked; > >> > +} > >> > >> Pulling this out of add_to_swap() is an improvement for sure. Add an > >> XXX: before that "we don't support THP writes" comment for good > >> measure :) > > > > Sure. > > > > It could be a separate patch which makes add_to_swap clean via > > removing page_list argument but I hope Huang take/fold it when he > > resend it because it would be more important with THP swap. > > Sure. I will take this patch as one patch of the THP swap series. > Because the first patch of the THP swap series is a little big, I don't > think it is a good idea to fold this patch into it. Could you update > the patch according to Johannes' comments and resend it? Okay, I will resend this clean-up patch against on yours patch after finishing this discussion. Thanks.
Re: [PATCH -mm -v10 1/3] mm, THP, swap: Delay splitting THP during swap out
On Thu, May 11, 2017 at 08:50:01AM +0800, Huang, Ying wrote: < snip > > >> > @@ -1125,8 +1125,28 @@ static unsigned long shrink_page_list(struct > >> > list_head *page_list, > >> > !PageSwapCache(page)) { > >> > if (!(sc->gfp_mask & __GFP_IO)) > >> > goto keep_locked; > >> > -if (!add_to_swap(page, page_list)) > >> > +swap_retry: > >> > +/* > >> > + * Retry after split if we fail to allocate > >> > + * swap space of a THP. > >> > + */ > >> > +if (!add_to_swap(page)) { > >> > +if (!PageTransHuge(page) || > >> > +split_huge_page_to_list(page, > >> > page_list)) > >> > +goto activate_locked; > >> > +goto swap_retry; > >> > +} > >> > >> This is definitely better. > > > > Thanks. > > > >> > >> However, I think it'd be cleaner without the label here: > >> > >>if (!add_to_swap(page)) { > >>if (!PageTransHuge(page)) > >>goto activate_locked; > >>/* Split THP and swap individual base pages */ > >>if (split_huge_page_to_list(page, page_list)) > >>goto activate_locked; > >>if (!add_to_swap(page)) > >>goto activate_locked; > > > > Yes. > > > >>} > >> > >> > +/* > >> > + * Got swap space successfully. But > >> > unfortunately, > >> > + * we don't support a THP page writeout so > >> > split it. > >> > + */ > >> > +if (PageTransHuge(page) && > >> > + split_huge_page_to_list(page, > >> > page_list)) { > >> > +delete_from_swap_cache(page); > >> > goto activate_locked; > >> > +} > >> > >> Pulling this out of add_to_swap() is an improvement for sure. Add an > >> XXX: before that "we don't support THP writes" comment for good > >> measure :) > > > > Sure. > > > > It could be a separate patch which makes add_to_swap clean via > > removing page_list argument but I hope Huang take/fold it when he > > resend it because it would be more important with THP swap. > > Sure. I will take this patch as one patch of the THP swap series. > Because the first patch of the THP swap series is a little big, I don't > think it is a good idea to fold this patch into it. Could you update > the patch according to Johannes' comments and resend it? Okay, I will resend this clean-up patch against on yours patch after finishing this discussion. Thanks.
[PATCH] FS: Fixing return type of unsigned_offsets
Fixing Sparse warning. It should return bool, instead it returns int. Signed-off-by: Pushkar Jambhlekar--- fs/read_write.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index 47c1d44..d672830 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -33,9 +33,9 @@ const struct file_operations generic_ro_fops = { EXPORT_SYMBOL(generic_ro_fops); -static inline int unsigned_offsets(struct file *file) +static inline bool unsigned_offsets(struct file *file) { - return file->f_mode & FMODE_UNSIGNED_OFFSET; + return !!(file->f_mode & FMODE_UNSIGNED_OFFSET); } /** -- 2.7.4
[PATCH] FS: Fixing return type of unsigned_offsets
Fixing Sparse warning. It should return bool, instead it returns int. Signed-off-by: Pushkar Jambhlekar --- fs/read_write.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index 47c1d44..d672830 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -33,9 +33,9 @@ const struct file_operations generic_ro_fops = { EXPORT_SYMBOL(generic_ro_fops); -static inline int unsigned_offsets(struct file *file) +static inline bool unsigned_offsets(struct file *file) { - return file->f_mode & FMODE_UNSIGNED_OFFSET; + return !!(file->f_mode & FMODE_UNSIGNED_OFFSET); } /** -- 2.7.4
[PATCH] staging: lustre: ptlrpc: remove unnecessary code
offset is an unsigned variable and, greater-than-or-equal-to-zero comparison of an unsigned variable is always true. Addresses-Coverity-ID: 1373919 Signed-off-by: Gustavo A. R. Silva--- drivers/staging/lustre/lustre/ptlrpc/layout.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c index 356d735..ff77b52 100644 --- a/drivers/staging/lustre/lustre/ptlrpc/layout.c +++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c @@ -1762,7 +1762,7 @@ static u32 __req_capsule_offset(const struct req_capsule *pill, field->rmf_name, offset, loc); offset--; - LASSERT(0 <= offset && offset < REQ_MAX_FIELD_NR); + LASSERT(offset < REQ_MAX_FIELD_NR); return offset; } -- 2.5.0
[PATCH] staging: lustre: ptlrpc: remove unnecessary code
offset is an unsigned variable and, greater-than-or-equal-to-zero comparison of an unsigned variable is always true. Addresses-Coverity-ID: 1373919 Signed-off-by: Gustavo A. R. Silva --- drivers/staging/lustre/lustre/ptlrpc/layout.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c index 356d735..ff77b52 100644 --- a/drivers/staging/lustre/lustre/ptlrpc/layout.c +++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c @@ -1762,7 +1762,7 @@ static u32 __req_capsule_offset(const struct req_capsule *pill, field->rmf_name, offset, loc); offset--; - LASSERT(0 <= offset && offset < REQ_MAX_FIELD_NR); + LASSERT(offset < REQ_MAX_FIELD_NR); return offset; } -- 2.5.0
Re: [PATCH] FS: Making aproriate return type
On Thu, May 11, 2017 at 09:44:09AM +0530, Pushkar Jambhlekar wrote: > unsigned_offsets function returns fmode_t but function definition returns > int. sparse generate warning. > Updating proper return type You do realize that it's a predicate? This is actually one case where bool would be appropriate...
Re: [PATCH] FS: Making aproriate return type
On Thu, May 11, 2017 at 09:44:09AM +0530, Pushkar Jambhlekar wrote: > unsigned_offsets function returns fmode_t but function definition returns > int. sparse generate warning. > Updating proper return type You do realize that it's a predicate? This is actually one case where bool would be appropriate...
[PATCH] FS: Making aproriate return type
unsigned_offsets function returns fmode_t but function definition returns int. sparse generate warning. Updating proper return type Signed-off-by: Pushkar Jambhlekar--- fs/read_write.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/read_write.c b/fs/read_write.c index 47c1d44..d11eabc 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -33,7 +33,7 @@ const struct file_operations generic_ro_fops = { EXPORT_SYMBOL(generic_ro_fops); -static inline int unsigned_offsets(struct file *file) +static inline fmode_t unsigned_offsets(struct file *file) { return file->f_mode & FMODE_UNSIGNED_OFFSET; } -- 2.7.4
[PATCH] FS: Making aproriate return type
unsigned_offsets function returns fmode_t but function definition returns int. sparse generate warning. Updating proper return type Signed-off-by: Pushkar Jambhlekar --- fs/read_write.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/read_write.c b/fs/read_write.c index 47c1d44..d11eabc 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -33,7 +33,7 @@ const struct file_operations generic_ro_fops = { EXPORT_SYMBOL(generic_ro_fops); -static inline int unsigned_offsets(struct file *file) +static inline fmode_t unsigned_offsets(struct file *file) { return file->f_mode & FMODE_UNSIGNED_OFFSET; } -- 2.7.4
Re: [PATCH] net: fec: select queue depending on VLAN priority
On 2017-05-09 19:42, Andy Duan wrote: > From: David MillerSent: Tuesday, May 09, 2017 9:39 PM >>To: ste...@agner.ch >>Cc: Andy Duan ; and...@lunn.ch; >>feste...@gmail.com; net...@vger.kernel.org; linux- >>ker...@vger.kernel.org >>Subject: Re: [PATCH] net: fec: select queue depending on VLAN priority >> >>From: Stefan Agner >>Date: Mon, 8 May 2017 22:37:08 -0700 >> >>> Since the addition of the multi queue code with commit 59d0f7465644 >>> ("net: fec: init multi queue date structure") the queue selection has >>> been handelt by the default transmit queue selection implementation >>> which tries to evenly distribute the traffic across all available >>> queues. This selection presumes that the queues are using an equal >>> priority, however, the queues 1 and 2 are actually of higher priority >>> (the classification of the queues is enabled in fec_enet_enable_ring). >>> >>> This can lead to net scheduler warnings and continuous TX ring dumps >>> when exercising the system with iperf. >>> >>> Use only queue 0 for all common traffic (no VLAN and P802.1p priority >>> 0 and 1) and route level 2-7 through queue 1 and 2. >>> >>> Signed-off-by: Fugang Duan >>> Fixes: 59d0f7465644 ("net: fec: init multi queue date structure") >> >>If the queues are used for prioritization, and it does not have multiple >>normal >>priority level queues, multiqueue is not what the driver should have >>implemented. > Firstly, HW multiple queues support: > - Traffic-shaping bandwidth distribution supports credit-based and > round-robin-based policies. Either policy can be combined with > time-based shaping. > - AVB (Audio Video Bridging, IEEE 802.1Qav) features: > * Credit-based bandwidth distribution policy can be combined > with > time-based shaping > * AVB endpoint talker and listener support > * Support for arbitration between different priority traffic > (for > example, AVB class A, AVB class B, and non-AVB) > Round-robin-based policies: > It has the same priority for three queues: In the round-robin QoS > scheme, each queue is given an equal opportunity to transmit one > frame. For example, if queue n has a frame to transmit, the queue > transmits its frame. After queue n has transmitted its frame, or if > queue n does not have a frame to transmit, queue n+1 is then allowed > to transmit its frame, and so on. > > Credit-based policies: > The AVB credit based shaper acts independently, per class, to control > the bandwidth distribution between normal traffic and time-sensitive > traffic with respect to the total link bandwidth available. > Credit based shaper conbined with time-based shaping: > - priority: ClassA queue > ClassB queue > best effort > - ensure the queue bandwidth as user set based on time-based > shaping > algorithms (transmitter transmit frame from three queue in turn based > on time-based shaping algorithms) > And in real AVB case, each streaming can be independent, and are > fixed on related queue. Then driver level should implement > .ndo_select_queue() to put the streaming into related queue. That is > what the patch did. > > The current driver config the three queue to credit-based policies > (AVB), the patch seems no problem for the implementation. Do you have > any suggestion ? > I tried using the round robin mode by adding this: + /* Set Round-Robin policy */ + writel(1, fep->hwp + FEC_QOS_SCHEME); After a while I got the warning non the less: WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x248/0x24c NETDEV WATCHDOG: eth0 (fec): transmit queue 2 timed out Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00058-g56d22eced8bc-dirty #377 Hardware name: Freescale i.MX7 Dual (Device Tree) [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0x78/0x8c) [] (dump_stack) from [] (__warn+0xe8/0x100) [] (__warn) from [] (warn_slowpath_fmt+0x38/0x48) [] (warn_slowpath_fmt) from [] (dev_watchdog+0x248/0x24c) [] (dev_watchdog) from [] (call_timer_fn+0x28/0x98) [] (call_timer_fn) from [] (expire_timers+0xa0/0xac) [] (expire_timers) from [] (run_timer_softirq+0x9c/0x194) [] (run_timer_softirq) from [] (__do_softirq+0x114/0x234) [] (__do_softirq) from [] (irq_exit+0xcc/0x108) [] (irq_exit) from [] (__handle_domain_irq+0x80/0xec) [] (__handle_domain_irq) from [] (gic_handle_irq+0x48/0x8c) [] (gic_handle_irq) from [] (__irq_svc+0x58/0x8c) Exception stack(0xc1001f28 to 0xc1001f70) 1f20: 0001 c022fdc0 c100 c1003d80 1f40: c1003d34 c0e72ed0 c0bd9b04 c1001f80 c1001f78 1f60: c022048c c0220490 6013 [] (__irq_svc) from [] (arch_cpu_idle+0x38/0x3c) [] (arch_cpu_idle) from [] (do_idle+0x170/0x204) [] (do_idle) from [] (cpu_startup_entry+0x18/0x1c) []
Re: [PATCH] net: fec: select queue depending on VLAN priority
On 2017-05-09 19:42, Andy Duan wrote: > From: David Miller Sent: Tuesday, May 09, 2017 9:39 PM >>To: ste...@agner.ch >>Cc: Andy Duan ; and...@lunn.ch; >>feste...@gmail.com; net...@vger.kernel.org; linux- >>ker...@vger.kernel.org >>Subject: Re: [PATCH] net: fec: select queue depending on VLAN priority >> >>From: Stefan Agner >>Date: Mon, 8 May 2017 22:37:08 -0700 >> >>> Since the addition of the multi queue code with commit 59d0f7465644 >>> ("net: fec: init multi queue date structure") the queue selection has >>> been handelt by the default transmit queue selection implementation >>> which tries to evenly distribute the traffic across all available >>> queues. This selection presumes that the queues are using an equal >>> priority, however, the queues 1 and 2 are actually of higher priority >>> (the classification of the queues is enabled in fec_enet_enable_ring). >>> >>> This can lead to net scheduler warnings and continuous TX ring dumps >>> when exercising the system with iperf. >>> >>> Use only queue 0 for all common traffic (no VLAN and P802.1p priority >>> 0 and 1) and route level 2-7 through queue 1 and 2. >>> >>> Signed-off-by: Fugang Duan >>> Fixes: 59d0f7465644 ("net: fec: init multi queue date structure") >> >>If the queues are used for prioritization, and it does not have multiple >>normal >>priority level queues, multiqueue is not what the driver should have >>implemented. > Firstly, HW multiple queues support: > - Traffic-shaping bandwidth distribution supports credit-based and > round-robin-based policies. Either policy can be combined with > time-based shaping. > - AVB (Audio Video Bridging, IEEE 802.1Qav) features: > * Credit-based bandwidth distribution policy can be combined > with > time-based shaping > * AVB endpoint talker and listener support > * Support for arbitration between different priority traffic > (for > example, AVB class A, AVB class B, and non-AVB) > Round-robin-based policies: > It has the same priority for three queues: In the round-robin QoS > scheme, each queue is given an equal opportunity to transmit one > frame. For example, if queue n has a frame to transmit, the queue > transmits its frame. After queue n has transmitted its frame, or if > queue n does not have a frame to transmit, queue n+1 is then allowed > to transmit its frame, and so on. > > Credit-based policies: > The AVB credit based shaper acts independently, per class, to control > the bandwidth distribution between normal traffic and time-sensitive > traffic with respect to the total link bandwidth available. > Credit based shaper conbined with time-based shaping: > - priority: ClassA queue > ClassB queue > best effort > - ensure the queue bandwidth as user set based on time-based > shaping > algorithms (transmitter transmit frame from three queue in turn based > on time-based shaping algorithms) > And in real AVB case, each streaming can be independent, and are > fixed on related queue. Then driver level should implement > .ndo_select_queue() to put the streaming into related queue. That is > what the patch did. > > The current driver config the three queue to credit-based policies > (AVB), the patch seems no problem for the implementation. Do you have > any suggestion ? > I tried using the round robin mode by adding this: + /* Set Round-Robin policy */ + writel(1, fep->hwp + FEC_QOS_SCHEME); After a while I got the warning non the less: WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x248/0x24c NETDEV WATCHDOG: eth0 (fec): transmit queue 2 timed out Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00058-g56d22eced8bc-dirty #377 Hardware name: Freescale i.MX7 Dual (Device Tree) [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (dump_stack+0x78/0x8c) [] (dump_stack) from [] (__warn+0xe8/0x100) [] (__warn) from [] (warn_slowpath_fmt+0x38/0x48) [] (warn_slowpath_fmt) from [] (dev_watchdog+0x248/0x24c) [] (dev_watchdog) from [] (call_timer_fn+0x28/0x98) [] (call_timer_fn) from [] (expire_timers+0xa0/0xac) [] (expire_timers) from [] (run_timer_softirq+0x9c/0x194) [] (run_timer_softirq) from [] (__do_softirq+0x114/0x234) [] (__do_softirq) from [] (irq_exit+0xcc/0x108) [] (irq_exit) from [] (__handle_domain_irq+0x80/0xec) [] (__handle_domain_irq) from [] (gic_handle_irq+0x48/0x8c) [] (gic_handle_irq) from [] (__irq_svc+0x58/0x8c) Exception stack(0xc1001f28 to 0xc1001f70) 1f20: 0001 c022fdc0 c100 c1003d80 1f40: c1003d34 c0e72ed0 c0bd9b04 c1001f80 c1001f78 1f60: c022048c c0220490 6013 [] (__irq_svc) from [] (arch_cpu_idle+0x38/0x3c) [] (arch_cpu_idle) from [] (do_idle+0x170/0x204) [] (do_idle) from [] (cpu_startup_entry+0x18/0x1c) [] (cpu_startup_entry) from [] (start_kernel+0x394/0x3a0) ---[ end trace a474f341d40e0705 ]---
Re: [RFC 07/10] fscrypt: move to generic async completion
Hi Gilad, On Sat, May 06, 2017 at 03:59:56PM +0300, Gilad Ben-Yossef wrote: > int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw, > u64 lblk_num, struct page *src_page, > struct page *dest_page, unsigned int len, > @@ -150,7 +135,7 @@ int fscrypt_do_page_crypto(const struct inode *inode, > fscrypt_direction_t rw, > u8 padding[FS_XTS_TWEAK_SIZE - sizeof(__le64)]; > } xts_tweak; > struct skcipher_request *req = NULL; > - DECLARE_FS_COMPLETION_RESULT(ecr); > + DECLARE_CRYPTO_WAIT(ecr); > struct scatterlist dst, src; > struct fscrypt_info *ci = inode->i_crypt_info; > struct crypto_skcipher *tfm = ci->ci_ctfm; > @@ -168,7 +153,7 @@ int fscrypt_do_page_crypto(const struct inode *inode, > fscrypt_direction_t rw, [...] This patch looks good --- thanks for doing this! I suggest also renaming 'ecr' to 'wait' in the places being updated to use crypto_wait, since the name is obsolete (and was already; it originally stood for "ext4 completion result"). - Eric
Re: [RFC 07/10] fscrypt: move to generic async completion
Hi Gilad, On Sat, May 06, 2017 at 03:59:56PM +0300, Gilad Ben-Yossef wrote: > int fscrypt_do_page_crypto(const struct inode *inode, fscrypt_direction_t rw, > u64 lblk_num, struct page *src_page, > struct page *dest_page, unsigned int len, > @@ -150,7 +135,7 @@ int fscrypt_do_page_crypto(const struct inode *inode, > fscrypt_direction_t rw, > u8 padding[FS_XTS_TWEAK_SIZE - sizeof(__le64)]; > } xts_tweak; > struct skcipher_request *req = NULL; > - DECLARE_FS_COMPLETION_RESULT(ecr); > + DECLARE_CRYPTO_WAIT(ecr); > struct scatterlist dst, src; > struct fscrypt_info *ci = inode->i_crypt_info; > struct crypto_skcipher *tfm = ci->ci_ctfm; > @@ -168,7 +153,7 @@ int fscrypt_do_page_crypto(const struct inode *inode, > fscrypt_direction_t rw, [...] This patch looks good --- thanks for doing this! I suggest also renaming 'ecr' to 'wait' in the places being updated to use crypto_wait, since the name is obsolete (and was already; it originally stood for "ext4 completion result"). - Eric
Re: [RFC 01/10] crypto: factor async completion for general use
Hi Gilad, On Sat, May 06, 2017 at 03:59:50PM +0300, Gilad Ben-Yossef wrote: > Invoking a possibly async. crypto op and waiting for completion > while correctly handling backlog processing is a common task > in the crypto API implementation and outside users of it. > > This patch re-factors one of the in crypto API implementation in > preparation for using it across the board instead of hand > rolled versions. Thanks for doing this! It annoyed me too that there wasn't a helper function for this. Just a few comments below: > diff --git a/crypto/af_alg.c b/crypto/af_alg.c > index 3556d8e..bf4acaf 100644 > --- a/crypto/af_alg.c > +++ b/crypto/af_alg.c > @@ -480,33 +480,6 @@ int af_alg_cmsg_send(struct msghdr *msg, struct > af_alg_control *con) > } > EXPORT_SYMBOL_GPL(af_alg_cmsg_send); > > -int af_alg_wait_for_completion(int err, struct af_alg_completion *completion) > -{ > - switch (err) { > - case -EINPROGRESS: > - case -EBUSY: > - wait_for_completion(>completion); > - reinit_completion(>completion); > - err = completion->err; > - break; > - }; > - > - return err; > -} > -EXPORT_SYMBOL_GPL(af_alg_wait_for_completion); > - > -void af_alg_complete(struct crypto_async_request *req, int err) > -{ > - struct af_alg_completion *completion = req->data; > - > - if (err == -EINPROGRESS) > - return; > - > - completion->err = err; > - complete(>completion); > -} > -EXPORT_SYMBOL_GPL(af_alg_complete); > - I think it would be cleaner to switch af_alg and algif_* over to the new interface in its own patch, rather than in the same patch that introduces the new interface. > @@ -88,8 +88,8 @@ static int hash_sendmsg(struct socket *sock, struct msghdr > *msg, > if ((msg->msg_flags & MSG_MORE)) > hash_free_result(sk, ctx); > > - err = af_alg_wait_for_completion(crypto_ahash_init(>req), > - >completion); > + err = crypto_wait_req(crypto_ahash_init(>req), > + >wait); > if (err) > goto unlock; > } In general can you try to keep the argument lists indented sanely? (This applies throughout the patch series.) e.g. here I'd have expected: err = crypto_wait_req(crypto_ahash_init(>req), >wait); > > diff --git a/crypto/api.c b/crypto/api.c > index 941cd4c..1c6e9cd 100644 > --- a/crypto/api.c > +++ b/crypto/api.c > @@ -24,6 +24,7 @@ > #include > #include > #include > +#include > #include "internal.h" > > LIST_HEAD(crypto_alg_list); > @@ -595,5 +596,32 @@ int crypto_has_alg(const char *name, u32 type, u32 mask) > } > EXPORT_SYMBOL_GPL(crypto_has_alg); > > +int crypto_wait_req(int err, struct crypto_wait *wait) > +{ > + switch (err) { > + case -EINPROGRESS: > + case -EBUSY: > + wait_for_completion(>completion); > + reinit_completion(>completion); > + err = wait->err; > + break; > + }; > + > + return err; > +} > +EXPORT_SYMBOL_GPL(crypto_wait_req); crypto_wait_req() maybe should be inlined, since it doesn't do much (note that reinit_completion is actually just a variable assignment), and the common case is that 'err' will be 0, so there will be nothing to wait for. Also drop the unnecessary semicolon at the end of the switch block. With regards to the wait being uninterruptible, I agree that this should be the default behavior, because I think users waiting for specific crypto requests are generally not prepared to handle the wait actually being interrupted. After interruption the crypto operation will still proceed in the background, and it will use buffers which the caller has in many cases already freed. However, I'd suggest taking a close look at anything that was actually doing an interruptible wait before, to see whether it was a bug or intentional (or "doesn't matter"). And yes there could always be a crypto_wait_req_interruptible() introduced if some users need it. > > #define CRYPTO_MINALIGN_ATTR __attribute__ ((__aligned__(CRYPTO_MINALIGN))) > > +/* > + * Macro for declaring a crypto op async wait object on stack > + */ > +#define DECLARE_CRYPTO_WAIT(_wait) \ > + struct crypto_wait _wait = { \ > + COMPLETION_INITIALIZER_ONSTACK((_wait).completion), 0 } > + Move this definition down below, so it's next to crypto_wait? > > /* > * Algorithm registration interface. > */ > @@ -1604,5 +1631,6 @@ static inline int crypto_comp_decompress(struct > crypto_comp *tfm, > src, slen, dst, dlen); > } > > + Don't add unrelated blank lines. - Eric
Re: [RFC 01/10] crypto: factor async completion for general use
Hi Gilad, On Sat, May 06, 2017 at 03:59:50PM +0300, Gilad Ben-Yossef wrote: > Invoking a possibly async. crypto op and waiting for completion > while correctly handling backlog processing is a common task > in the crypto API implementation and outside users of it. > > This patch re-factors one of the in crypto API implementation in > preparation for using it across the board instead of hand > rolled versions. Thanks for doing this! It annoyed me too that there wasn't a helper function for this. Just a few comments below: > diff --git a/crypto/af_alg.c b/crypto/af_alg.c > index 3556d8e..bf4acaf 100644 > --- a/crypto/af_alg.c > +++ b/crypto/af_alg.c > @@ -480,33 +480,6 @@ int af_alg_cmsg_send(struct msghdr *msg, struct > af_alg_control *con) > } > EXPORT_SYMBOL_GPL(af_alg_cmsg_send); > > -int af_alg_wait_for_completion(int err, struct af_alg_completion *completion) > -{ > - switch (err) { > - case -EINPROGRESS: > - case -EBUSY: > - wait_for_completion(>completion); > - reinit_completion(>completion); > - err = completion->err; > - break; > - }; > - > - return err; > -} > -EXPORT_SYMBOL_GPL(af_alg_wait_for_completion); > - > -void af_alg_complete(struct crypto_async_request *req, int err) > -{ > - struct af_alg_completion *completion = req->data; > - > - if (err == -EINPROGRESS) > - return; > - > - completion->err = err; > - complete(>completion); > -} > -EXPORT_SYMBOL_GPL(af_alg_complete); > - I think it would be cleaner to switch af_alg and algif_* over to the new interface in its own patch, rather than in the same patch that introduces the new interface. > @@ -88,8 +88,8 @@ static int hash_sendmsg(struct socket *sock, struct msghdr > *msg, > if ((msg->msg_flags & MSG_MORE)) > hash_free_result(sk, ctx); > > - err = af_alg_wait_for_completion(crypto_ahash_init(>req), > - >completion); > + err = crypto_wait_req(crypto_ahash_init(>req), > + >wait); > if (err) > goto unlock; > } In general can you try to keep the argument lists indented sanely? (This applies throughout the patch series.) e.g. here I'd have expected: err = crypto_wait_req(crypto_ahash_init(>req), >wait); > > diff --git a/crypto/api.c b/crypto/api.c > index 941cd4c..1c6e9cd 100644 > --- a/crypto/api.c > +++ b/crypto/api.c > @@ -24,6 +24,7 @@ > #include > #include > #include > +#include > #include "internal.h" > > LIST_HEAD(crypto_alg_list); > @@ -595,5 +596,32 @@ int crypto_has_alg(const char *name, u32 type, u32 mask) > } > EXPORT_SYMBOL_GPL(crypto_has_alg); > > +int crypto_wait_req(int err, struct crypto_wait *wait) > +{ > + switch (err) { > + case -EINPROGRESS: > + case -EBUSY: > + wait_for_completion(>completion); > + reinit_completion(>completion); > + err = wait->err; > + break; > + }; > + > + return err; > +} > +EXPORT_SYMBOL_GPL(crypto_wait_req); crypto_wait_req() maybe should be inlined, since it doesn't do much (note that reinit_completion is actually just a variable assignment), and the common case is that 'err' will be 0, so there will be nothing to wait for. Also drop the unnecessary semicolon at the end of the switch block. With regards to the wait being uninterruptible, I agree that this should be the default behavior, because I think users waiting for specific crypto requests are generally not prepared to handle the wait actually being interrupted. After interruption the crypto operation will still proceed in the background, and it will use buffers which the caller has in many cases already freed. However, I'd suggest taking a close look at anything that was actually doing an interruptible wait before, to see whether it was a bug or intentional (or "doesn't matter"). And yes there could always be a crypto_wait_req_interruptible() introduced if some users need it. > > #define CRYPTO_MINALIGN_ATTR __attribute__ ((__aligned__(CRYPTO_MINALIGN))) > > +/* > + * Macro for declaring a crypto op async wait object on stack > + */ > +#define DECLARE_CRYPTO_WAIT(_wait) \ > + struct crypto_wait _wait = { \ > + COMPLETION_INITIALIZER_ONSTACK((_wait).completion), 0 } > + Move this definition down below, so it's next to crypto_wait? > > /* > * Algorithm registration interface. > */ > @@ -1604,5 +1631,6 @@ static inline int crypto_comp_decompress(struct > crypto_comp *tfm, > src, slen, dst, dlen); > } > > + Don't add unrelated blank lines. - Eric
[PATCH] iommu: remove unnecessary code
did_old is an unsigned variable and, greater-than-or-equal-to-zero comparison of an unsigned variable is always true. Addresses-Coverity-ID: 1398477 Signed-off-by: Gustavo A. R. Silva--- drivers/iommu/intel-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index d412a31..98daf4a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2050,7 +2050,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain, if (context_copied(context)) { u16 did_old = context_domain_id(context); - if (did_old >= 0 && did_old < cap_ndoms(iommu->cap)) + if (did_old < cap_ndoms(iommu->cap)) iommu->flush.flush_context(iommu, did_old, (((u16)bus) << 8) | devfn, DMA_CCMD_MASK_NOBIT, -- 2.5.0
[PATCH] iommu: remove unnecessary code
did_old is an unsigned variable and, greater-than-or-equal-to-zero comparison of an unsigned variable is always true. Addresses-Coverity-ID: 1398477 Signed-off-by: Gustavo A. R. Silva --- drivers/iommu/intel-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index d412a31..98daf4a 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -2050,7 +2050,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain, if (context_copied(context)) { u16 did_old = context_domain_id(context); - if (did_old >= 0 && did_old < cap_ndoms(iommu->cap)) + if (did_old < cap_ndoms(iommu->cap)) iommu->flush.flush_context(iommu, did_old, (((u16)bus) << 8) | devfn, DMA_CCMD_MASK_NOBIT, -- 2.5.0
[PATCH v3 2/4] hwmon: (adt7475) fan stall prevention
By default adt7475 will stop the fans (pwm duty cycle 0%) when the temperature drops past Tmin - hysteresis. Some systems want to keep the fans moving even when the temperature drops so add new sysfs attributes that configure the enhanced acoustics min 1-3 which allows the fans to run at the minimum configure pwm duty cycle. Signed-off-by: Chris Packham--- Changes in v2: - use pwmN_stall_dis as the attribute name. I think this describes the purpose pretty well. I went with a new attribute instead of overloading pwmN_auto_point1_pwm so this doesn't affect existing users. Changes in v3: - Fix grammar. - change enh_acou to enh_acoustics Documentation/hwmon/adt7475 | 5 + drivers/hwmon/adt7475.c | 50 + 2 files changed, 55 insertions(+) diff --git a/Documentation/hwmon/adt7475 b/Documentation/hwmon/adt7475 index 0502f2b464e1..3990bae60e78 100644 --- a/Documentation/hwmon/adt7475 +++ b/Documentation/hwmon/adt7475 @@ -109,6 +109,11 @@ fan speed) is applied. PWM values range from 0 (off) to 255 (full speed). Fan speed may be set to maximum when the temperature sensor associated with the PWM control exceeds temp#_max. +At Tmin - hysteresis the PWM output can either be off (0% duty cycle) or at the +minimum (i.e. auto_point1_pwm). This behaviour can be configured using the +pwm[1-*]_stall_dis sysfs attribute. A value of 0 means the fans will shut off. +A value of 1 means the fans will run at auto_point1_pwm. + Notes - diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index ec0c43fbcdce..4d6c625fec70 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -79,6 +79,9 @@ #define REG_TEMP_TRANGE_BASE 0x5F +#define REG_ENHANCE_ACOUSTICS1 0x62 +#define REG_ENHANCE_ACOUSTICS2 0x63 + #define REG_PWM_MIN_BASE 0x64 #define REG_TEMP_TMIN_BASE 0x67 @@ -209,6 +212,7 @@ struct adt7475_data { u8 range[3]; u8 pwmctl[3]; u8 pwmchan[3]; + u8 enh_acoustics[2]; u8 vid; u8 vrm; @@ -700,6 +704,43 @@ static ssize_t set_pwm(struct device *dev, struct device_attribute *attr, data->pwm[sattr->nr][sattr->index] = clamp_val(val, 0, 0xFF); i2c_smbus_write_byte_data(client, reg, data->pwm[sattr->nr][sattr->index]); + mutex_unlock(>lock); + + return count; +} + + +static ssize_t show_stall_dis(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + u8 mask = BIT(5 + sattr->index); + + return sprintf(buf, "%d\n", !!(data->enh_acoustics[0] & mask)); +} + +static ssize_t set_stall_dis(struct device *dev, struct device_attribute *attr, +const char *buf, size_t count) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + long val; + u8 mask = BIT(5 + sattr->index); + + if (kstrtol(buf, 10, )) + return -EINVAL; + + mutex_lock(>lock); + + data->enh_acoustics[0] &= ~mask; + if (val) + data->enh_acoustics[0] |= mask; + + i2c_smbus_write_byte_data(client, REG_ENHANCE_ACOUSTICS1, + data->enh_acoustics[0]); mutex_unlock(>lock); @@ -1028,6 +1069,8 @@ static SENSOR_DEVICE_ATTR_2(pwm1_auto_point1_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MIN, 0); static SENSOR_DEVICE_ATTR_2(pwm1_auto_point2_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MAX, 0); +static SENSOR_DEVICE_ATTR_2(pwm1_stall_dis, S_IRUGO | S_IWUSR, show_stall_dis, + set_stall_dis, 0, 0); static SENSOR_DEVICE_ATTR_2(pwm2, S_IRUGO | S_IWUSR, show_pwm, set_pwm, INPUT, 1); static SENSOR_DEVICE_ATTR_2(pwm2_freq, S_IRUGO | S_IWUSR, show_pwmfreq, @@ -1040,6 +1083,8 @@ static SENSOR_DEVICE_ATTR_2(pwm2_auto_point1_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MIN, 1); static SENSOR_DEVICE_ATTR_2(pwm2_auto_point2_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MAX, 1); +static SENSOR_DEVICE_ATTR_2(pwm2_stall_dis, S_IRUGO | S_IWUSR, show_stall_dis, + set_stall_dis, 0, 1); static SENSOR_DEVICE_ATTR_2(pwm3, S_IRUGO | S_IWUSR, show_pwm, set_pwm, INPUT, 2); static SENSOR_DEVICE_ATTR_2(pwm3_freq, S_IRUGO | S_IWUSR, show_pwmfreq, @@ -1052,6 +1097,8 @@ static SENSOR_DEVICE_ATTR_2(pwm3_auto_point1_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MIN, 2); static
[PATCH v3 4/4] hwmon: (adt7475) add high frequency support
Systems using 4-wire fans usually require high frequency (22.5kHz) output on the pwm. Add 22500 as a valid option in the pwmfreq_table. In high frequency mode the low-order bit are ignored so they can safely be set to 0. Signed-off-by: Chris Packham--- Changes in v3: - New drivers/hwmon/adt7475.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index f7322330789c..f5a65d1166cd 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -936,7 +936,7 @@ static ssize_t set_pwmctrl(struct device *dev, struct device_attribute *attr, /* List of frequencies for the PWM */ static const int pwmfreq_table[] = { - 11, 14, 22, 29, 35, 44, 58, 88 + 11, 14, 22, 29, 35, 44, 58, 88, 22500 }; static ssize_t show_pwmfreq(struct device *dev, struct device_attribute *attr, @@ -944,9 +944,10 @@ static ssize_t show_pwmfreq(struct device *dev, struct device_attribute *attr, { struct adt7475_data *data = adt7475_update_device(dev); struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + int i = clamp_val(data->range[sattr->index] & 0xf, 0, + sizeof(pwmfreq_table)); - return sprintf(buf, "%d\n", - pwmfreq_table[data->range[sattr->index] & 7]); + return sprintf(buf, "%d\n", pwmfreq_table[i]); } static ssize_t set_pwmfreq(struct device *dev, struct device_attribute *attr, @@ -967,7 +968,7 @@ static ssize_t set_pwmfreq(struct device *dev, struct device_attribute *attr, data->range[sattr->index] = adt7475_read(TEMP_TRANGE_REG(sattr->index)); - data->range[sattr->index] &= ~7; + data->range[sattr->index] &= ~0xf; data->range[sattr->index] |= out; i2c_smbus_write_byte_data(client, TEMP_TRANGE_REG(sattr->index), -- 2.11.0.24.ge6920cf
[PATCH v3 2/4] hwmon: (adt7475) fan stall prevention
By default adt7475 will stop the fans (pwm duty cycle 0%) when the temperature drops past Tmin - hysteresis. Some systems want to keep the fans moving even when the temperature drops so add new sysfs attributes that configure the enhanced acoustics min 1-3 which allows the fans to run at the minimum configure pwm duty cycle. Signed-off-by: Chris Packham --- Changes in v2: - use pwmN_stall_dis as the attribute name. I think this describes the purpose pretty well. I went with a new attribute instead of overloading pwmN_auto_point1_pwm so this doesn't affect existing users. Changes in v3: - Fix grammar. - change enh_acou to enh_acoustics Documentation/hwmon/adt7475 | 5 + drivers/hwmon/adt7475.c | 50 + 2 files changed, 55 insertions(+) diff --git a/Documentation/hwmon/adt7475 b/Documentation/hwmon/adt7475 index 0502f2b464e1..3990bae60e78 100644 --- a/Documentation/hwmon/adt7475 +++ b/Documentation/hwmon/adt7475 @@ -109,6 +109,11 @@ fan speed) is applied. PWM values range from 0 (off) to 255 (full speed). Fan speed may be set to maximum when the temperature sensor associated with the PWM control exceeds temp#_max. +At Tmin - hysteresis the PWM output can either be off (0% duty cycle) or at the +minimum (i.e. auto_point1_pwm). This behaviour can be configured using the +pwm[1-*]_stall_dis sysfs attribute. A value of 0 means the fans will shut off. +A value of 1 means the fans will run at auto_point1_pwm. + Notes - diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index ec0c43fbcdce..4d6c625fec70 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -79,6 +79,9 @@ #define REG_TEMP_TRANGE_BASE 0x5F +#define REG_ENHANCE_ACOUSTICS1 0x62 +#define REG_ENHANCE_ACOUSTICS2 0x63 + #define REG_PWM_MIN_BASE 0x64 #define REG_TEMP_TMIN_BASE 0x67 @@ -209,6 +212,7 @@ struct adt7475_data { u8 range[3]; u8 pwmctl[3]; u8 pwmchan[3]; + u8 enh_acoustics[2]; u8 vid; u8 vrm; @@ -700,6 +704,43 @@ static ssize_t set_pwm(struct device *dev, struct device_attribute *attr, data->pwm[sattr->nr][sattr->index] = clamp_val(val, 0, 0xFF); i2c_smbus_write_byte_data(client, reg, data->pwm[sattr->nr][sattr->index]); + mutex_unlock(>lock); + + return count; +} + + +static ssize_t show_stall_dis(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + u8 mask = BIT(5 + sattr->index); + + return sprintf(buf, "%d\n", !!(data->enh_acoustics[0] & mask)); +} + +static ssize_t set_stall_dis(struct device *dev, struct device_attribute *attr, +const char *buf, size_t count) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + long val; + u8 mask = BIT(5 + sattr->index); + + if (kstrtol(buf, 10, )) + return -EINVAL; + + mutex_lock(>lock); + + data->enh_acoustics[0] &= ~mask; + if (val) + data->enh_acoustics[0] |= mask; + + i2c_smbus_write_byte_data(client, REG_ENHANCE_ACOUSTICS1, + data->enh_acoustics[0]); mutex_unlock(>lock); @@ -1028,6 +1069,8 @@ static SENSOR_DEVICE_ATTR_2(pwm1_auto_point1_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MIN, 0); static SENSOR_DEVICE_ATTR_2(pwm1_auto_point2_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MAX, 0); +static SENSOR_DEVICE_ATTR_2(pwm1_stall_dis, S_IRUGO | S_IWUSR, show_stall_dis, + set_stall_dis, 0, 0); static SENSOR_DEVICE_ATTR_2(pwm2, S_IRUGO | S_IWUSR, show_pwm, set_pwm, INPUT, 1); static SENSOR_DEVICE_ATTR_2(pwm2_freq, S_IRUGO | S_IWUSR, show_pwmfreq, @@ -1040,6 +1083,8 @@ static SENSOR_DEVICE_ATTR_2(pwm2_auto_point1_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MIN, 1); static SENSOR_DEVICE_ATTR_2(pwm2_auto_point2_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MAX, 1); +static SENSOR_DEVICE_ATTR_2(pwm2_stall_dis, S_IRUGO | S_IWUSR, show_stall_dis, + set_stall_dis, 0, 1); static SENSOR_DEVICE_ATTR_2(pwm3, S_IRUGO | S_IWUSR, show_pwm, set_pwm, INPUT, 2); static SENSOR_DEVICE_ATTR_2(pwm3_freq, S_IRUGO | S_IWUSR, show_pwmfreq, @@ -1052,6 +1097,8 @@ static SENSOR_DEVICE_ATTR_2(pwm3_auto_point1_pwm, S_IRUGO | S_IWUSR, show_pwm, set_pwm, MIN, 2); static SENSOR_DEVICE_ATTR_2(pwm3_auto_point2_pwm, S_IRUGO |
[PATCH v3 4/4] hwmon: (adt7475) add high frequency support
Systems using 4-wire fans usually require high frequency (22.5kHz) output on the pwm. Add 22500 as a valid option in the pwmfreq_table. In high frequency mode the low-order bit are ignored so they can safely be set to 0. Signed-off-by: Chris Packham --- Changes in v3: - New drivers/hwmon/adt7475.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index f7322330789c..f5a65d1166cd 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -936,7 +936,7 @@ static ssize_t set_pwmctrl(struct device *dev, struct device_attribute *attr, /* List of frequencies for the PWM */ static const int pwmfreq_table[] = { - 11, 14, 22, 29, 35, 44, 58, 88 + 11, 14, 22, 29, 35, 44, 58, 88, 22500 }; static ssize_t show_pwmfreq(struct device *dev, struct device_attribute *attr, @@ -944,9 +944,10 @@ static ssize_t show_pwmfreq(struct device *dev, struct device_attribute *attr, { struct adt7475_data *data = adt7475_update_device(dev); struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + int i = clamp_val(data->range[sattr->index] & 0xf, 0, + sizeof(pwmfreq_table)); - return sprintf(buf, "%d\n", - pwmfreq_table[data->range[sattr->index] & 7]); + return sprintf(buf, "%d\n", pwmfreq_table[i]); } static ssize_t set_pwmfreq(struct device *dev, struct device_attribute *attr, @@ -967,7 +968,7 @@ static ssize_t set_pwmfreq(struct device *dev, struct device_attribute *attr, data->range[sattr->index] = adt7475_read(TEMP_TRANGE_REG(sattr->index)); - data->range[sattr->index] &= ~7; + data->range[sattr->index] &= ~0xf; data->range[sattr->index] |= out; i2c_smbus_write_byte_data(client, TEMP_TRANGE_REG(sattr->index), -- 2.11.0.24.ge6920cf
[PATCH v3 3/4] hwmon: (adt7475) temperature smoothing
When enabled temperature smoothing allows ramping the fan speed over a configurable period of time instead of jumping to the new speed instantaneously. Signed-off-by: Chris Packham--- Changes in v2: - use a single tempN_smoothing attribute Changes in v3: - change enh_acou to enh_acoustics - simplify show_temp_st() Documentation/hwmon/adt7475 | 4 ++ drivers/hwmon/adt7475.c | 93 + 2 files changed, 97 insertions(+) diff --git a/Documentation/hwmon/adt7475 b/Documentation/hwmon/adt7475 index 3990bae60e78..e82b24ec4b07 100644 --- a/Documentation/hwmon/adt7475 +++ b/Documentation/hwmon/adt7475 @@ -114,6 +114,10 @@ minimum (i.e. auto_point1_pwm). This behaviour can be configured using the pwm[1-*]_stall_dis sysfs attribute. A value of 0 means the fans will shut off. A value of 1 means the fans will run at auto_point1_pwm. +The responsiveness of the ADT747x to temperature changes can be configured. +This allows smoothing of the fan speed transition. To set the transition time +set the value in ms in the temp[1-*]_smoothing sysfs attribute. + Notes - diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index 4d6c625fec70..f7322330789c 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -526,6 +526,90 @@ static ssize_t set_temp(struct device *dev, struct device_attribute *attr, return count; } +/* Assuming CONFIG6[SLOW] is 0 */ +static const int ad7475_st_map[] = { + 37500, 18800, 12500, 7500, 4700, 3100, 1600, 800, +}; + +static ssize_t show_temp_st(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + long val; + + switch (sattr->index) { + case 0: + val = data->enh_acoustics[0] & 0xf; + break; + case 1: + val = (data->enh_acoustics[1] >> 4) & 0xf; + break; + case 2: + val = data->enh_acoustics[1] & 0xf; + break; + default: + return -EINVAL; + } + + if (val & 0x8) + return sprintf(buf, "%d\n", ad7475_st_map[val & 0x7]); + else + return sprintf(buf, "0\n"); +} + +static ssize_t set_temp_st(struct device *dev, struct device_attribute *attr, +const char *buf, size_t count) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + unsigned char reg; + int shift, idx; + ulong val; + + if (kstrtoul(buf, 10, )) + return -EINVAL; + + switch (sattr->index) { + case 0: + reg = REG_ENHANCE_ACOUSTICS1; + shift = 0; + idx = 0; + break; + case 1: + reg = REG_ENHANCE_ACOUSTICS2; + shift = 4; + idx = 1; + break; + case 2: + reg = REG_ENHANCE_ACOUSTICS2; + shift = 0; + idx = 1; + break; + default: + return -EINVAL; + } + + if (val > 0) { + val = find_closest_descending(val, ad7475_st_map, + ARRAY_SIZE(ad7475_st_map)); + val |= 0x8; + } + + mutex_lock(>lock); + + data->enh_acoustics[idx] &= ~(0xf << shift); + data->enh_acoustics[idx] |= (val << shift); + + i2c_smbus_write_byte_data(client, reg, data->enh_acoustics[idx]); + + mutex_unlock(>lock); + + return count; +} + /* * Table of autorange values - the user will write the value in millidegrees, * and we'll convert it @@ -1008,6 +1092,8 @@ static SENSOR_DEVICE_ATTR_2(temp1_crit, S_IRUGO | S_IWUSR, show_temp, set_temp, THERM, 0); static SENSOR_DEVICE_ATTR_2(temp1_crit_hyst, S_IRUGO | S_IWUSR, show_temp, set_temp, HYSTERSIS, 0); +static SENSOR_DEVICE_ATTR_2(temp1_smoothing, S_IRUGO | S_IWUSR, show_temp_st, + set_temp_st, 0, 0); static SENSOR_DEVICE_ATTR_2(temp2_input, S_IRUGO, show_temp, NULL, INPUT, 1); static SENSOR_DEVICE_ATTR_2(temp2_alarm, S_IRUGO, show_temp, NULL, ALARM, 1); static SENSOR_DEVICE_ATTR_2(temp2_max, S_IRUGO | S_IWUSR, show_temp, set_temp, @@ -1024,6 +1110,8 @@ static SENSOR_DEVICE_ATTR_2(temp2_crit, S_IRUGO | S_IWUSR, show_temp, set_temp, THERM, 1); static SENSOR_DEVICE_ATTR_2(temp2_crit_hyst, S_IRUGO | S_IWUSR, show_temp, set_temp, HYSTERSIS, 1); +static SENSOR_DEVICE_ATTR_2(temp2_smoothing, S_IRUGO |
[PATCH v3 3/4] hwmon: (adt7475) temperature smoothing
When enabled temperature smoothing allows ramping the fan speed over a configurable period of time instead of jumping to the new speed instantaneously. Signed-off-by: Chris Packham --- Changes in v2: - use a single tempN_smoothing attribute Changes in v3: - change enh_acou to enh_acoustics - simplify show_temp_st() Documentation/hwmon/adt7475 | 4 ++ drivers/hwmon/adt7475.c | 93 + 2 files changed, 97 insertions(+) diff --git a/Documentation/hwmon/adt7475 b/Documentation/hwmon/adt7475 index 3990bae60e78..e82b24ec4b07 100644 --- a/Documentation/hwmon/adt7475 +++ b/Documentation/hwmon/adt7475 @@ -114,6 +114,10 @@ minimum (i.e. auto_point1_pwm). This behaviour can be configured using the pwm[1-*]_stall_dis sysfs attribute. A value of 0 means the fans will shut off. A value of 1 means the fans will run at auto_point1_pwm. +The responsiveness of the ADT747x to temperature changes can be configured. +This allows smoothing of the fan speed transition. To set the transition time +set the value in ms in the temp[1-*]_smoothing sysfs attribute. + Notes - diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index 4d6c625fec70..f7322330789c 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -526,6 +526,90 @@ static ssize_t set_temp(struct device *dev, struct device_attribute *attr, return count; } +/* Assuming CONFIG6[SLOW] is 0 */ +static const int ad7475_st_map[] = { + 37500, 18800, 12500, 7500, 4700, 3100, 1600, 800, +}; + +static ssize_t show_temp_st(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + long val; + + switch (sattr->index) { + case 0: + val = data->enh_acoustics[0] & 0xf; + break; + case 1: + val = (data->enh_acoustics[1] >> 4) & 0xf; + break; + case 2: + val = data->enh_acoustics[1] & 0xf; + break; + default: + return -EINVAL; + } + + if (val & 0x8) + return sprintf(buf, "%d\n", ad7475_st_map[val & 0x7]); + else + return sprintf(buf, "0\n"); +} + +static ssize_t set_temp_st(struct device *dev, struct device_attribute *attr, +const char *buf, size_t count) +{ + struct sensor_device_attribute_2 *sattr = to_sensor_dev_attr_2(attr); + struct i2c_client *client = to_i2c_client(dev); + struct adt7475_data *data = i2c_get_clientdata(client); + unsigned char reg; + int shift, idx; + ulong val; + + if (kstrtoul(buf, 10, )) + return -EINVAL; + + switch (sattr->index) { + case 0: + reg = REG_ENHANCE_ACOUSTICS1; + shift = 0; + idx = 0; + break; + case 1: + reg = REG_ENHANCE_ACOUSTICS2; + shift = 4; + idx = 1; + break; + case 2: + reg = REG_ENHANCE_ACOUSTICS2; + shift = 0; + idx = 1; + break; + default: + return -EINVAL; + } + + if (val > 0) { + val = find_closest_descending(val, ad7475_st_map, + ARRAY_SIZE(ad7475_st_map)); + val |= 0x8; + } + + mutex_lock(>lock); + + data->enh_acoustics[idx] &= ~(0xf << shift); + data->enh_acoustics[idx] |= (val << shift); + + i2c_smbus_write_byte_data(client, reg, data->enh_acoustics[idx]); + + mutex_unlock(>lock); + + return count; +} + /* * Table of autorange values - the user will write the value in millidegrees, * and we'll convert it @@ -1008,6 +1092,8 @@ static SENSOR_DEVICE_ATTR_2(temp1_crit, S_IRUGO | S_IWUSR, show_temp, set_temp, THERM, 0); static SENSOR_DEVICE_ATTR_2(temp1_crit_hyst, S_IRUGO | S_IWUSR, show_temp, set_temp, HYSTERSIS, 0); +static SENSOR_DEVICE_ATTR_2(temp1_smoothing, S_IRUGO | S_IWUSR, show_temp_st, + set_temp_st, 0, 0); static SENSOR_DEVICE_ATTR_2(temp2_input, S_IRUGO, show_temp, NULL, INPUT, 1); static SENSOR_DEVICE_ATTR_2(temp2_alarm, S_IRUGO, show_temp, NULL, ALARM, 1); static SENSOR_DEVICE_ATTR_2(temp2_max, S_IRUGO | S_IWUSR, show_temp, set_temp, @@ -1024,6 +1110,8 @@ static SENSOR_DEVICE_ATTR_2(temp2_crit, S_IRUGO | S_IWUSR, show_temp, set_temp, THERM, 1); static SENSOR_DEVICE_ATTR_2(temp2_crit_hyst, S_IRUGO | S_IWUSR, show_temp, set_temp, HYSTERSIS, 1); +static SENSOR_DEVICE_ATTR_2(temp2_smoothing, S_IRUGO | S_IWUSR, show_temp_st, +
Re: [PATCH] libertas: Avoid reading past end of buffer
Joe Percheswrites: > unrelated trivia: > > lbs_deb_enter is used incorrectly here at > function exit as both enter and leave calls. > > That type of copy/paste defect may be common. > > $ git grep -w lbs_deb_enter | wc -l > 148 > $ git grep -w lbs_deb_leave | wc -l > 71 > > One would expect these numbers to be the same. > > Another option would be to delete all these > calls as ftrace function tracing works well. Yeah, deleting all the enter/exit calls would be better. -- Kalle Valo
Re: [PATCH] libertas: Avoid reading past end of buffer
Joe Perches writes: > unrelated trivia: > > lbs_deb_enter is used incorrectly here at > function exit as both enter and leave calls. > > That type of copy/paste defect may be common. > > $ git grep -w lbs_deb_enter | wc -l > 148 > $ git grep -w lbs_deb_leave | wc -l > 71 > > One would expect these numbers to be the same. > > Another option would be to delete all these > calls as ftrace function tracing works well. Yeah, deleting all the enter/exit calls would be better. -- Kalle Valo
[PATCH v3 1/4] hwmon: (adt7475) replace find_nearest() with find_closest()
The adt7475 has had find_nearest() since it's creation in 2009. Since then find_closest() has been introduced and several drivers have been updated to use it. Update the adt7475 to use find_closest() and remove the now unused find_nearest(). Signed-off-by: Chris Packham--- Changes in v3: - None drivers/hwmon/adt7475.c | 34 +++--- 1 file changed, 3 insertions(+), 31 deletions(-) diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index c803e3c5fcd4..ec0c43fbcdce 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -22,6 +22,7 @@ #include #include #include +#include /* Indexes for the sysfs hooks */ @@ -314,35 +315,6 @@ static void adt7475_write_word(struct i2c_client *client, int reg, u16 val) i2c_smbus_write_byte_data(client, reg, val & 0xFF); } -/* - * Find the nearest value in a table - used for pwm frequency and - * auto temp range - */ -static int find_nearest(long val, const int *array, int size) -{ - int i; - - if (val < array[0]) - return 0; - - if (val > array[size - 1]) - return size - 1; - - for (i = 0; i < size - 1; i++) { - int a, b; - - if (val > array[i + 1]) - continue; - - a = val - array[i]; - b = array[i + 1] - val; - - return (a <= b) ? i : i + 1; - } - - return 0; -} - static ssize_t show_voltage(struct device *dev, struct device_attribute *attr, char *buf) { @@ -606,7 +578,7 @@ static ssize_t set_point2(struct device *dev, struct device_attribute *attr, val -= temp; /* Find the nearest table entry to what the user wrote */ - val = find_nearest(val, autorange_table, ARRAY_SIZE(autorange_table)); + val = find_closest(val, autorange_table, ARRAY_SIZE(autorange_table)); data->range[sattr->index] &= ~0xF0; data->range[sattr->index] |= val << 4; @@ -864,7 +836,7 @@ static ssize_t set_pwmfreq(struct device *dev, struct device_attribute *attr, if (kstrtol(buf, 10, )) return -EINVAL; - out = find_nearest(val, pwmfreq_table, ARRAY_SIZE(pwmfreq_table)); + out = find_closest(val, pwmfreq_table, ARRAY_SIZE(pwmfreq_table)); mutex_lock(>lock); -- 2.11.0.24.ge6920cf
[PATCH v3 1/4] hwmon: (adt7475) replace find_nearest() with find_closest()
The adt7475 has had find_nearest() since it's creation in 2009. Since then find_closest() has been introduced and several drivers have been updated to use it. Update the adt7475 to use find_closest() and remove the now unused find_nearest(). Signed-off-by: Chris Packham --- Changes in v3: - None drivers/hwmon/adt7475.c | 34 +++--- 1 file changed, 3 insertions(+), 31 deletions(-) diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c index c803e3c5fcd4..ec0c43fbcdce 100644 --- a/drivers/hwmon/adt7475.c +++ b/drivers/hwmon/adt7475.c @@ -22,6 +22,7 @@ #include #include #include +#include /* Indexes for the sysfs hooks */ @@ -314,35 +315,6 @@ static void adt7475_write_word(struct i2c_client *client, int reg, u16 val) i2c_smbus_write_byte_data(client, reg, val & 0xFF); } -/* - * Find the nearest value in a table - used for pwm frequency and - * auto temp range - */ -static int find_nearest(long val, const int *array, int size) -{ - int i; - - if (val < array[0]) - return 0; - - if (val > array[size - 1]) - return size - 1; - - for (i = 0; i < size - 1; i++) { - int a, b; - - if (val > array[i + 1]) - continue; - - a = val - array[i]; - b = array[i + 1] - val; - - return (a <= b) ? i : i + 1; - } - - return 0; -} - static ssize_t show_voltage(struct device *dev, struct device_attribute *attr, char *buf) { @@ -606,7 +578,7 @@ static ssize_t set_point2(struct device *dev, struct device_attribute *attr, val -= temp; /* Find the nearest table entry to what the user wrote */ - val = find_nearest(val, autorange_table, ARRAY_SIZE(autorange_table)); + val = find_closest(val, autorange_table, ARRAY_SIZE(autorange_table)); data->range[sattr->index] &= ~0xF0; data->range[sattr->index] |= val << 4; @@ -864,7 +836,7 @@ static ssize_t set_pwmfreq(struct device *dev, struct device_attribute *attr, if (kstrtol(buf, 10, )) return -EINVAL; - out = find_nearest(val, pwmfreq_table, ARRAY_SIZE(pwmfreq_table)); + out = find_closest(val, pwmfreq_table, ARRAY_SIZE(pwmfreq_table)); mutex_lock(>lock); -- 2.11.0.24.ge6920cf
Re: [PATCH v7 20/26] x86/cpufeature: Add User-Mode Instruction Prevention definitions
On Sat, 2017-05-06 at 11:04 +0200, Paolo Bonzini wrote: > > > On 05/05/2017 20:17, Ricardo Neri wrote: > > User-Mode Instruction Prevention is a security feature present in > new > > Intel processors that, when set, prevents the execution of a subset > of > > instructions if such instructions are executed in user mode (CPL > > 0). > > Attempting to execute such instructions causes a general protection > > exception. > > > > The subset of instructions comprises: > > > > * SGDT - Store Global Descriptor Table > > * SIDT - Store Interrupt Descriptor Table > > * SLDT - Store Local Descriptor Table > > * SMSW - Store Machine Status Word > > * STR - Store Task Register > > > > This feature is also added to the list of disabled-features to allow > > a cleaner handling of build-time configuration. > > > > Cc: Andy Lutomirski> > Cc: Andrew Morton > > Cc: H. Peter Anvin > > Cc: Borislav Petkov > > Cc: Brian Gerst > > Cc: Chen Yucong > > Cc: Chris Metcalf > > Cc: Dave Hansen > > Cc: Fenghua Yu > > Cc: Huang Rui > > Cc: Jiri Slaby > > Cc: Jonathan Corbet > > Cc: Michael S. Tsirkin > > Cc: Paul Gortmaker > > Cc: Peter Zijlstra > > Cc: Ravi V. Shankar > > Cc: Shuah Khan > > Cc: Vlastimil Babka > > Cc: Tony Luck > > Cc: Paolo Bonzini > > Cc: Liang Z. Li > > Cc: Alexandre Julliard > > Cc: Stas Sergeev > > Cc: x...@kernel.org > > Cc: linux-ms...@vger.kernel.org > > > > Signed-off-by: Ricardo Neri > > Would it be possible to have this patch in a topic branch for KVM's > consumption? > I have put a branch here with this single patch: https://github.com/ricardon/tip.git rneri/umip_for_kvm This is based on Linux v4.11. Please let me know if this works for your or you'd prefer it to be based on a different branch/commit/repo. Thanks and BR, Ricardo
Re: [f2fs-dev] [PATCH 1/3] f2fs: use f2fs_submit_page_bio for ra_meta_pages
On 2017/5/11 10:41, Jaegeuk Kim wrote: > On 05/11, Chao Yu wrote: >> Hi Jaegeuk, >> >> On 2017/5/11 7:48, Jaegeuk Kim wrote: >>> This patch avoids to use f2fs_submit_merged_bio for read, which was the only >>> read case. >> >> This makes f2fs losing the chance to merge multiple pages into one bio during >> reading continuous physical blocks, it may cause potential performance >> regression, how about using a local bio in ra_meta_pages to cache more pages? > > This is a readahead flow, which is asynchronous and not a performance critical > flow. And, I expect blk_plug is still able to merge them. We're using this in > readahead of node blocks as well. Confirmed, block plug can do the merge. :) Thanks, > > Thanks, > >> >> Thanks, >> >>> >>> Signed-off-by: Jaegeuk Kim>>> --- >>> fs/f2fs/checkpoint.c | 4 +--- >>> 1 file changed, 1 insertion(+), 3 deletions(-) >>> >>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c >>> index ea9c317b5916..8d92f8249000 100644 >>> --- a/fs/f2fs/checkpoint.c >>> +++ b/fs/f2fs/checkpoint.c >>> @@ -207,12 +207,10 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t >>> start, int nrpages, >>> } >>> >>> fio.page = page; >>> - fio.old_blkaddr = fio.new_blkaddr; >>> - f2fs_submit_page_mbio(); >>> + f2fs_submit_page_bio(); >>> f2fs_put_page(page, 0); >>> } >>> out: >>> - f2fs_submit_merged_bio(sbi, META, READ); >>> blk_finish_plug(); >>> return blkno - start; >>> } >>> > > . >
Re: [PATCH v7 20/26] x86/cpufeature: Add User-Mode Instruction Prevention definitions
On Sat, 2017-05-06 at 11:04 +0200, Paolo Bonzini wrote: > > > On 05/05/2017 20:17, Ricardo Neri wrote: > > User-Mode Instruction Prevention is a security feature present in > new > > Intel processors that, when set, prevents the execution of a subset > of > > instructions if such instructions are executed in user mode (CPL > > 0). > > Attempting to execute such instructions causes a general protection > > exception. > > > > The subset of instructions comprises: > > > > * SGDT - Store Global Descriptor Table > > * SIDT - Store Interrupt Descriptor Table > > * SLDT - Store Local Descriptor Table > > * SMSW - Store Machine Status Word > > * STR - Store Task Register > > > > This feature is also added to the list of disabled-features to allow > > a cleaner handling of build-time configuration. > > > > Cc: Andy Lutomirski > > Cc: Andrew Morton > > Cc: H. Peter Anvin > > Cc: Borislav Petkov > > Cc: Brian Gerst > > Cc: Chen Yucong > > Cc: Chris Metcalf > > Cc: Dave Hansen > > Cc: Fenghua Yu > > Cc: Huang Rui > > Cc: Jiri Slaby > > Cc: Jonathan Corbet > > Cc: Michael S. Tsirkin > > Cc: Paul Gortmaker > > Cc: Peter Zijlstra > > Cc: Ravi V. Shankar > > Cc: Shuah Khan > > Cc: Vlastimil Babka > > Cc: Tony Luck > > Cc: Paolo Bonzini > > Cc: Liang Z. Li > > Cc: Alexandre Julliard > > Cc: Stas Sergeev > > Cc: x...@kernel.org > > Cc: linux-ms...@vger.kernel.org > > > > Signed-off-by: Ricardo Neri > > Would it be possible to have this patch in a topic branch for KVM's > consumption? > I have put a branch here with this single patch: https://github.com/ricardon/tip.git rneri/umip_for_kvm This is based on Linux v4.11. Please let me know if this works for your or you'd prefer it to be based on a different branch/commit/repo. Thanks and BR, Ricardo
Re: [f2fs-dev] [PATCH 1/3] f2fs: use f2fs_submit_page_bio for ra_meta_pages
On 2017/5/11 10:41, Jaegeuk Kim wrote: > On 05/11, Chao Yu wrote: >> Hi Jaegeuk, >> >> On 2017/5/11 7:48, Jaegeuk Kim wrote: >>> This patch avoids to use f2fs_submit_merged_bio for read, which was the only >>> read case. >> >> This makes f2fs losing the chance to merge multiple pages into one bio during >> reading continuous physical blocks, it may cause potential performance >> regression, how about using a local bio in ra_meta_pages to cache more pages? > > This is a readahead flow, which is asynchronous and not a performance critical > flow. And, I expect blk_plug is still able to merge them. We're using this in > readahead of node blocks as well. Confirmed, block plug can do the merge. :) Thanks, > > Thanks, > >> >> Thanks, >> >>> >>> Signed-off-by: Jaegeuk Kim >>> --- >>> fs/f2fs/checkpoint.c | 4 +--- >>> 1 file changed, 1 insertion(+), 3 deletions(-) >>> >>> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c >>> index ea9c317b5916..8d92f8249000 100644 >>> --- a/fs/f2fs/checkpoint.c >>> +++ b/fs/f2fs/checkpoint.c >>> @@ -207,12 +207,10 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t >>> start, int nrpages, >>> } >>> >>> fio.page = page; >>> - fio.old_blkaddr = fio.new_blkaddr; >>> - f2fs_submit_page_mbio(); >>> + f2fs_submit_page_bio(); >>> f2fs_put_page(page, 0); >>> } >>> out: >>> - f2fs_submit_merged_bio(sbi, META, READ); >>> blk_finish_plug(); >>> return blkno - start; >>> } >>> > > . >
Re: Lockdep splat involving all_q_mutex
On Wed, May 10, 2017 at 08:55:54PM -0600, Jens Axboe wrote: > On 05/10/2017 04:34 PM, Paul E. McKenney wrote: > > Hello! > > > > I got the lockdep splat shown below during some rcutorture testing (which > > does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge > > tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm"). > > My kneejerk reaction was just to reverse the "mutex_lock(_q_mutex);" > > and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then > > I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock > > inversion") just got done moving these two statements in the other > > direction. > > The problem is that that patch got merged too early, as it only > fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus' > way, it's in my for-linus tree. Thank you for the update, looking forward to the fix. Thanx, Paul
Re: Lockdep splat involving all_q_mutex
On Wed, May 10, 2017 at 08:55:54PM -0600, Jens Axboe wrote: > On 05/10/2017 04:34 PM, Paul E. McKenney wrote: > > Hello! > > > > I got the lockdep splat shown below during some rcutorture testing (which > > does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge > > tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm"). > > My kneejerk reaction was just to reverse the "mutex_lock(_q_mutex);" > > and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then > > I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock > > inversion") just got done moving these two statements in the other > > direction. > > The problem is that that patch got merged too early, as it only > fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus' > way, it's in my for-linus tree. Thank you for the update, looking forward to the fix. Thanx, Paul
Re: [RFC] adding of TGID to ftrace output
Hi Steven, Thanks for your quick reply. On Wed, May 10, 2017 at 6:28 PM, Steven Rostedtwrote: > On Wed, 10 May 2017 16:04:55 -0700 > Joel Fernandes wrote: > >> Hi Steven, >> >> Can we add TGID information along with PID to ftrace output? >> >> Something like: >> # _-=> irqs-off >> # / _=> need-resched >> # | / _---=> hardirq/softirq >> # || / _--=> preempt-depth >> # ||| / delay >> # TASK-PID:TGID CPU# TIMESTAMP FUNCTION >> # | | | | | | >> bash-1977:1977 [000] 17284.993652: sys_close <- >> >> Instead of: >> # _-=> irqs-off >> # / _=> need-resched >> #| / _---=> hardirq/softirq >> #|| / _--=> preempt-depth >> #||| / delay >> # TASK-PID CPU# TIMESTAMP FUNCTION >> # | | | | | >> bash-1977 [000] 17284.993652: sys_close <- > > Well the thing about this is that it adds more overhead to each event > that is mostly not needed by users. There will not be any overhead if we can retrieve the tgid at print time and not trace time. I was thinking something like trace_find_cmdline approach would work great. The only draw back of something like this is if a thread is added and removed to a thread group during a trace, then we might not have all tgid information available. I think this is the draw back of the trace_find_cmdline code as well where if a thread doesn't exist at trace output time, then we end up printing "<...>" as the task comm? At the end of this email, I make a suggestion to fix this. >> Currently in android, we inject tgid into each trace event at the end >> of the trace just so we can group threads under a process in our trace >> viewer. >> >> The other option we're thinking off is we can retrieve tgid during the >> trace output (reading trace file from debugfs/tracefs) from the >> task_struct and then have ftrace output it that way. > > Hmm, is there any events that show the TGID currently? If we have that, > you can use another tracing instance to read from (one that wont get > overridden by other events), and keep track of what TGID processes > have. The timestamp could be used to keep everything in sync. There is a sched_process_fork event that can show ppid but not tgid. I was thinking we modify that to show tgid as well, that could work. However the draw back doing this is we can only see when something is added or removed to the thread group during tracing. If something was already a part of the thread group _before_ tracing, then we don't have the information. >> Anyway, having this will really simplify things and make it more >> reliable (we run ps -T at the end of the trace right now, but its >> unreliable because threads might have exited by then). We also have to >> call getpid() and inject pid into each userspace trace_marker write >> just so we can do this grouping. > > Perhaps we need to have a way to record it via another event. Fork or > sched switch? Same reasoning above, that separate events are not enough to have this information as such events may not fire for the duration of the tracing for the threads we're interested in. > Perhaps we can add a trigger to record extra data like this, similar to > the stack trace trigger does now. Are you talking about 1 event trigger another event to have more information? That would have the side-effect of having a separate event for each event that we need the tgid information for, thus causing lots of space wastage in the right buffer, right? To fix the suggestion I made in the very beginning of this email, I suggest a combination of trace_find_cmdline type of approach to save the tgid of all current threads available, and then using trace_sched_fork to find out what we missed (such as threads that were added and removed.) I think this approach will be robust and not affect any of the existing ftrace users (since we can make it a print option). Thoughts? Update: actually I find that someone already wrote an out-of-tree patch for Android long time back during 3.10 kernel days that added a 'print-tgid' option that saved tgid in the trace_save_cmdline function and then find it later during output time: https://android.googlesource.com/kernel/msm.git/+/0c5363f6a0a89952b0d0072ff4e4c3bd042add9a%5E%21/ It will be nice if we can have a more upstream ready version of this patch or some other approach that you think works best. Thanks! Regards, Joel
Re: [RFC] adding of TGID to ftrace output
Hi Steven, Thanks for your quick reply. On Wed, May 10, 2017 at 6:28 PM, Steven Rostedt wrote: > On Wed, 10 May 2017 16:04:55 -0700 > Joel Fernandes wrote: > >> Hi Steven, >> >> Can we add TGID information along with PID to ftrace output? >> >> Something like: >> # _-=> irqs-off >> # / _=> need-resched >> # | / _---=> hardirq/softirq >> # || / _--=> preempt-depth >> # ||| / delay >> # TASK-PID:TGID CPU# TIMESTAMP FUNCTION >> # | | | | | | >> bash-1977:1977 [000] 17284.993652: sys_close <- >> >> Instead of: >> # _-=> irqs-off >> # / _=> need-resched >> #| / _---=> hardirq/softirq >> #|| / _--=> preempt-depth >> #||| / delay >> # TASK-PID CPU# TIMESTAMP FUNCTION >> # | | | | | >> bash-1977 [000] 17284.993652: sys_close <- > > Well the thing about this is that it adds more overhead to each event > that is mostly not needed by users. There will not be any overhead if we can retrieve the tgid at print time and not trace time. I was thinking something like trace_find_cmdline approach would work great. The only draw back of something like this is if a thread is added and removed to a thread group during a trace, then we might not have all tgid information available. I think this is the draw back of the trace_find_cmdline code as well where if a thread doesn't exist at trace output time, then we end up printing "<...>" as the task comm? At the end of this email, I make a suggestion to fix this. >> Currently in android, we inject tgid into each trace event at the end >> of the trace just so we can group threads under a process in our trace >> viewer. >> >> The other option we're thinking off is we can retrieve tgid during the >> trace output (reading trace file from debugfs/tracefs) from the >> task_struct and then have ftrace output it that way. > > Hmm, is there any events that show the TGID currently? If we have that, > you can use another tracing instance to read from (one that wont get > overridden by other events), and keep track of what TGID processes > have. The timestamp could be used to keep everything in sync. There is a sched_process_fork event that can show ppid but not tgid. I was thinking we modify that to show tgid as well, that could work. However the draw back doing this is we can only see when something is added or removed to the thread group during tracing. If something was already a part of the thread group _before_ tracing, then we don't have the information. >> Anyway, having this will really simplify things and make it more >> reliable (we run ps -T at the end of the trace right now, but its >> unreliable because threads might have exited by then). We also have to >> call getpid() and inject pid into each userspace trace_marker write >> just so we can do this grouping. > > Perhaps we need to have a way to record it via another event. Fork or > sched switch? Same reasoning above, that separate events are not enough to have this information as such events may not fire for the duration of the tracing for the threads we're interested in. > Perhaps we can add a trigger to record extra data like this, similar to > the stack trace trigger does now. Are you talking about 1 event trigger another event to have more information? That would have the side-effect of having a separate event for each event that we need the tgid information for, thus causing lots of space wastage in the right buffer, right? To fix the suggestion I made in the very beginning of this email, I suggest a combination of trace_find_cmdline type of approach to save the tgid of all current threads available, and then using trace_sched_fork to find out what we missed (such as threads that were added and removed.) I think this approach will be robust and not affect any of the existing ftrace users (since we can make it a print option). Thoughts? Update: actually I find that someone already wrote an out-of-tree patch for Android long time back during 3.10 kernel days that added a 'print-tgid' option that saved tgid in the trace_save_cmdline function and then find it later during output time: https://android.googlesource.com/kernel/msm.git/+/0c5363f6a0a89952b0d0072ff4e4c3bd042add9a%5E%21/ It will be nice if we can have a more upstream ready version of this patch or some other approach that you think works best. Thanks! Regards, Joel
linux-next: Tree for May 11
Hi all, Please do not add any v4.13 destined material in your linux-next included branches until after v4.12-rc1 has been released. Changes since 20170510: The tpmdd tree lost its build failure. Non-merge commits (relative to Linus' tree): 697 763 files changed, 20547 insertions(+), 20505 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. Below is a summary of the state of the merge. I am currently merging 258 trees (counting Linus' and 37 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (b5a53b61a289 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux) Merging fixes/master (97da3854c526 Linux 4.11-rc3) Merging kbuild-current/fixes (9be3213b14d4 gconfig: remove misleading parentheses around a condition) Merging arc-current/for-curr (cf4100d1cddc Revert "ARCv2: Allow enabling PAE40 w/o HIGHMEM") Merging arm-current/fixes (6d8059493691 ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 call) Merging m68k-current/for-linus (f6ab4d59a5fe nubus: Add MVC and VSC video card definitions) Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups) Merging powerpc-fixes/fixes (be5c5e843c4a powerpc/64: Fix HMI exception on LE with CONFIG_RELOCATABLE=y) Merging sparc/master (3c7f62212018 sparc64: fix fault handling in NGbzero.S and GENbzero.S) Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files) Merging net/master (56868a460b83 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide) Merging ipsec/master (2c1497bbc8fd xfrm: Fix NETDEV_DOWN with IPSec offload) Merging netfilter/master (f411af682218 Merge branch 'ibmvnic-Updated-reset-handler-andcode-fixes') Merging ipvs/master (3c5ab3f395d6 ipvs: SNAT packet replies only for NATed connections) Merging wireless-drivers/master (d77facb88448 brcmfmac: use local iftype avoiding use-after-free of virtual interface) Merging mac80211/master (29cee56c0be4 Merge tag 'mac80211-for-davem-2017-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211) Merging sound-current/for-linus (960013762df0 ALSA: hda: Fix cpu lockup when stopping the cmd dmas) Merging pci-current/for-linus (b9c1153f7a9c PCI: hisi: Fix DT binding (hisi-pcie-almost-ecam)) Merging driver-core.current/driver-core-linus (af82455f7dbd Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc) Merging tty.current/tty-linus (4f7d029b9bf0 Linux 4.11-rc7) Merging usb.current/usb-linus (a71c9a1c779f Linux 4.11-rc5) Merging usb-gadget-fixes/fixes (a351e9b9fc24 Linux 4.11) Merging usb-serial-fixes/usb-linus (c02ed2e75ef4 Linux 4.11-rc4) Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move the lock initialization to core file) Merging phy/fixes (1a09b6a7c10e phy: qcom-usb-hs: Add depends on EXTCON) Merging staging.current/staging-linus (56868a460b83 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide) Merging char-misc.current/char-misc-linus (af82455f7dbd Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc) Merging input-current/for-linus (8be193c7b1f4 Input: add support for PlayStation 1/2 joypads connected via SPI) Merging crypto-current/master (929562b14478 crypto: stm32 - Fix OF module alias information) Merging ide/master (acfead32f3f9 ide: don't call memcpy with th
linux-next: Tree for May 11
Hi all, Please do not add any v4.13 destined material in your linux-next included branches until after v4.12-rc1 has been released. Changes since 20170510: The tpmdd tree lost its build failure. Non-merge commits (relative to Linus' tree): 697 763 files changed, 20547 insertions(+), 20505 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc and an allmodconfig (with CONFIG_BUILD_DOCSRC=n) for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig. Below is a summary of the state of the merge. I am currently merging 258 trees (counting Linus' and 37 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (b5a53b61a289 Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux) Merging fixes/master (97da3854c526 Linux 4.11-rc3) Merging kbuild-current/fixes (9be3213b14d4 gconfig: remove misleading parentheses around a condition) Merging arc-current/for-curr (cf4100d1cddc Revert "ARCv2: Allow enabling PAE40 w/o HIGHMEM") Merging arm-current/fixes (6d8059493691 ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 call) Merging m68k-current/for-linus (f6ab4d59a5fe nubus: Add MVC and VSC video card definitions) Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups) Merging powerpc-fixes/fixes (be5c5e843c4a powerpc/64: Fix HMI exception on LE with CONFIG_RELOCATABLE=y) Merging sparc/master (3c7f62212018 sparc64: fix fault handling in NGbzero.S and GENbzero.S) Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and linking special files) Merging net/master (56868a460b83 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide) Merging ipsec/master (2c1497bbc8fd xfrm: Fix NETDEV_DOWN with IPSec offload) Merging netfilter/master (f411af682218 Merge branch 'ibmvnic-Updated-reset-handler-andcode-fixes') Merging ipvs/master (3c5ab3f395d6 ipvs: SNAT packet replies only for NATed connections) Merging wireless-drivers/master (d77facb88448 brcmfmac: use local iftype avoiding use-after-free of virtual interface) Merging mac80211/master (29cee56c0be4 Merge tag 'mac80211-for-davem-2017-05-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211) Merging sound-current/for-linus (960013762df0 ALSA: hda: Fix cpu lockup when stopping the cmd dmas) Merging pci-current/for-linus (b9c1153f7a9c PCI: hisi: Fix DT binding (hisi-pcie-almost-ecam)) Merging driver-core.current/driver-core-linus (af82455f7dbd Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc) Merging tty.current/tty-linus (4f7d029b9bf0 Linux 4.11-rc7) Merging usb.current/usb-linus (a71c9a1c779f Linux 4.11-rc5) Merging usb-gadget-fixes/fixes (a351e9b9fc24 Linux 4.11) Merging usb-serial-fixes/usb-linus (c02ed2e75ef4 Linux 4.11-rc4) Merging usb-chipidea-fixes/ci-for-usb-stable (c7fbb09b2ea1 usb: chipidea: move the lock initialization to core file) Merging phy/fixes (1a09b6a7c10e phy: qcom-usb-hs: Add depends on EXTCON) Merging staging.current/staging-linus (56868a460b83 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide) Merging char-misc.current/char-misc-linus (af82455f7dbd Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc) Merging input-current/for-linus (8be193c7b1f4 Input: add support for PlayStation 1/2 joypads connected via SPI) Merging crypto-current/master (929562b14478 crypto: stm32 - Fix OF module alias information) Merging ide/master (acfead32f3f9 ide: don't call memcpy with th
[PATCH] filesystem-dax: fix broken __dax_zero_page_range() conversion
The conversion of __dax_zero_page_range() to 'struct dax_operations' caused it to frequently fail. The mistake was treating the @size parameter as a dax mapping length rather than just a length of the clear_pmem() operation. The dax mapping length is assumed to be hard coded as PAGE_SIZE. Without this fix any page unaligned zeroing request will trigger a -EINVAL return from bdev_dax_pgoff(). Cc: Jan KaraCc: Christoph Hellwig Reported-by: Ross Zwisler Fixes: cccbce671582 ("filesystem-dax: convert to dax_direct_access()") Signed-off-by: Dan Williams --- fs/dax.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index ce9dc9c3e829..5ee1d212d81f 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -971,12 +971,12 @@ int __dax_zero_page_range(struct block_device *bdev, void *kaddr; pfn_t pfn; - rc = bdev_dax_pgoff(bdev, sector, size, ); + rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, ); if (rc) return rc; id = dax_read_lock(); - rc = dax_direct_access(dax_dev, pgoff, PHYS_PFN(size), , + rc = dax_direct_access(dax_dev, pgoff, 1, , ); if (rc < 0) { dax_read_unlock(id);
[PATCH] filesystem-dax: fix broken __dax_zero_page_range() conversion
The conversion of __dax_zero_page_range() to 'struct dax_operations' caused it to frequently fail. The mistake was treating the @size parameter as a dax mapping length rather than just a length of the clear_pmem() operation. The dax mapping length is assumed to be hard coded as PAGE_SIZE. Without this fix any page unaligned zeroing request will trigger a -EINVAL return from bdev_dax_pgoff(). Cc: Jan Kara Cc: Christoph Hellwig Reported-by: Ross Zwisler Fixes: cccbce671582 ("filesystem-dax: convert to dax_direct_access()") Signed-off-by: Dan Williams --- fs/dax.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index ce9dc9c3e829..5ee1d212d81f 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -971,12 +971,12 @@ int __dax_zero_page_range(struct block_device *bdev, void *kaddr; pfn_t pfn; - rc = bdev_dax_pgoff(bdev, sector, size, ); + rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, ); if (rc) return rc; id = dax_read_lock(); - rc = dax_direct_access(dax_dev, pgoff, PHYS_PFN(size), , + rc = dax_direct_access(dax_dev, pgoff, 1, , ); if (rc < 0) { dax_read_unlock(id);
Re: Lockdep splat involving all_q_mutex
On 05/10/2017 04:34 PM, Paul E. McKenney wrote: > Hello! > > I got the lockdep splat shown below during some rcutorture testing (which > does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge > tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm"). > My kneejerk reaction was just to reverse the "mutex_lock(_q_mutex);" > and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then > I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock > inversion") just got done moving these two statements in the other > direction. The problem is that that patch got merged too early, as it only fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus' way, it's in my for-linus tree. -- Jens Axboe
Re: Lockdep splat involving all_q_mutex
On 05/10/2017 04:34 PM, Paul E. McKenney wrote: > Hello! > > I got the lockdep splat shown below during some rcutorture testing (which > does CPU hotplug operations) on mainline at commit dc9edaab90de ("Merge > tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/.../rafael/linux-pm"). > My kneejerk reaction was just to reverse the "mutex_lock(_q_mutex);" > and "get_online_cpus();" in blk_mq_init_allocated_queue(), but then > I noticed that commit eabe06595d62 ("block/mq: Cure cpu hotplug lock > inversion") just got done moving these two statements in the other > direction. The problem is that that patch got merged too early, as it only fixes a lockdep splat with the cpu hotplug rework. Fix is coming Linus' way, it's in my for-linus tree. -- Jens Axboe
Re: (radeon?) WARNING: drivers/gpu/drm/drm_irq.c:1195 drm_vblank_put (v4.11-12441-g56868a4)
On 11/05/17 04:33 AM, Tommi Rantala wrote: > Hi, > > I just tested v4.11-12441-g56868a4 on HP xw6600 with radeon graphics, > and I'm seeing the following WARNING triggered constantly. > > I have not seen this earlier e.g. with the distro kernel > 4.10.13-200.fc25.x86_64 > > $ lspci|grep -i amd > 60:00.0 VGA compatible controller: Advanced Micro Devices, Inc. > [AMD/ATI] Curacao PRO [Radeon R7 370 / R9 270/370 OEM] > 60:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape > Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] > > Complete kernel log: > http://termbin.com/dzy5 > > [ 249.952546] [ cut here ] > [ 249.952593] WARNING: CPU: 5 PID: 0 at > /home/ttrantal/git/linux/drivers/gpu/drm/drm_irq.c:1195 > drm_vblank_put+0xc4/0x120 [drm] > [ 249.952596] Modules linked in: fuse tun bridge stp llc af_packet > pl2303 usbserial shpchp acpi_cpufreq binfmt_misc amdgpu hid_generic > uhci_hcd radeon 3c59x mii tg3 ehci_pci ehci_hcd i2c_algo_bit > drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm > agpgart unix autofs4 > [ 249.952675] CPU: 5 PID: 0 Comm: swapper/5 Tainted: GW > 4.11.0+ #4 > [ 249.952678] Hardware name: Hewlett-Packard HP xw6600 > Workstation/0A9Ch, BIOS 786F4 v01.46 09/20/2012 > [ 249.952681] task: 88080aea task.stack: c900031b > [ 249.952695] RIP: 0010:drm_vblank_put+0xc4/0x120 [drm] > [ 249.952698] RSP: 0018:88080f003d70 EFLAGS: 00010046 > [ 249.952703] RAX: RBX: 880801d53000 RCX: > > [ 249.952706] RDX: RSI: RDI: > 88080a4ac000 > [ 249.952709] RBP: 88080f003d88 R08: 0001 R09: > 0003 > [ 249.952711] R10: 88080f003d08 R11: 001da540 R12: > 88080a4ac000 > [ 249.952714] R13: R14: 0086 R15: > 8808019a > [ 249.952717] FS: () GS:88080f00() > knlGS: > [ 249.952720] CS: 0010 DS: ES: CR0: 80050033 > [ 249.952723] CR2: 7f8bcc3a5810 CR3: 000808789000 CR4: > 06e0 > [ 249.952726] Call Trace: > [ 249.952731] > [ 249.952746] drm_crtc_vblank_put+0x1b/0x30 [drm] > [ 249.952813] radeon_crtc_handle_flip+0xdc/0x140 [radeon] > [ 249.952843] si_irq_process+0x610/0x1e90 [radeon] > [ 249.952872] radeon_driver_irq_handler_kms+0x39/0xc0 [radeon] > [ 249.952881] __handle_irq_event_percpu+0x60/0x580 > [ 249.952887] handle_irq_event_percpu+0x20/0x90 > [ 249.952892] handle_irq_event+0x46/0xb0 > [ 249.952897] handle_edge_irq+0x13d/0x370 > [ 249.952903] handle_irq+0x66/0x210 > [ 249.952908] ? __local_bh_enable+0x34/0x50 > [ 249.952914] do_IRQ+0x7e/0x1b0 > [ 249.952920] common_interrupt+0x95/0x95 Weird, not sure how this could happen. Can you bisect? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: (radeon?) WARNING: drivers/gpu/drm/drm_irq.c:1195 drm_vblank_put (v4.11-12441-g56868a4)
On 11/05/17 04:33 AM, Tommi Rantala wrote: > Hi, > > I just tested v4.11-12441-g56868a4 on HP xw6600 with radeon graphics, > and I'm seeing the following WARNING triggered constantly. > > I have not seen this earlier e.g. with the distro kernel > 4.10.13-200.fc25.x86_64 > > $ lspci|grep -i amd > 60:00.0 VGA compatible controller: Advanced Micro Devices, Inc. > [AMD/ATI] Curacao PRO [Radeon R7 370 / R9 270/370 OEM] > 60:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cape > Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series] > > Complete kernel log: > http://termbin.com/dzy5 > > [ 249.952546] [ cut here ] > [ 249.952593] WARNING: CPU: 5 PID: 0 at > /home/ttrantal/git/linux/drivers/gpu/drm/drm_irq.c:1195 > drm_vblank_put+0xc4/0x120 [drm] > [ 249.952596] Modules linked in: fuse tun bridge stp llc af_packet > pl2303 usbserial shpchp acpi_cpufreq binfmt_misc amdgpu hid_generic > uhci_hcd radeon 3c59x mii tg3 ehci_pci ehci_hcd i2c_algo_bit > drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm > agpgart unix autofs4 > [ 249.952675] CPU: 5 PID: 0 Comm: swapper/5 Tainted: GW > 4.11.0+ #4 > [ 249.952678] Hardware name: Hewlett-Packard HP xw6600 > Workstation/0A9Ch, BIOS 786F4 v01.46 09/20/2012 > [ 249.952681] task: 88080aea task.stack: c900031b > [ 249.952695] RIP: 0010:drm_vblank_put+0xc4/0x120 [drm] > [ 249.952698] RSP: 0018:88080f003d70 EFLAGS: 00010046 > [ 249.952703] RAX: RBX: 880801d53000 RCX: > > [ 249.952706] RDX: RSI: RDI: > 88080a4ac000 > [ 249.952709] RBP: 88080f003d88 R08: 0001 R09: > 0003 > [ 249.952711] R10: 88080f003d08 R11: 001da540 R12: > 88080a4ac000 > [ 249.952714] R13: R14: 0086 R15: > 8808019a > [ 249.952717] FS: () GS:88080f00() > knlGS: > [ 249.952720] CS: 0010 DS: ES: CR0: 80050033 > [ 249.952723] CR2: 7f8bcc3a5810 CR3: 000808789000 CR4: > 06e0 > [ 249.952726] Call Trace: > [ 249.952731] > [ 249.952746] drm_crtc_vblank_put+0x1b/0x30 [drm] > [ 249.952813] radeon_crtc_handle_flip+0xdc/0x140 [radeon] > [ 249.952843] si_irq_process+0x610/0x1e90 [radeon] > [ 249.952872] radeon_driver_irq_handler_kms+0x39/0xc0 [radeon] > [ 249.952881] __handle_irq_event_percpu+0x60/0x580 > [ 249.952887] handle_irq_event_percpu+0x20/0x90 > [ 249.952892] handle_irq_event+0x46/0xb0 > [ 249.952897] handle_edge_irq+0x13d/0x370 > [ 249.952903] handle_irq+0x66/0x210 > [ 249.952908] ? __local_bh_enable+0x34/0x50 > [ 249.952914] do_IRQ+0x7e/0x1b0 > [ 249.952920] common_interrupt+0x95/0x95 Weird, not sure how this could happen. Can you bisect? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [PATCH net-next V4 10/10] vhost_net: try batch dequing from skb array
On 2017年05月10日 20:34, Michael S. Tsirkin wrote: On Wed, May 10, 2017 at 11:36:22AM +0800, Jason Wang wrote: We used to dequeue one skb during recvmsg() from skb_array, this could be inefficient because of the bad cache utilization and spinlock touching for each packet. This patch tries to batch them by calling batch dequeuing helpers explicitly on the exported skb array and pass the skb back through msg_control for underlayer socket to finish the userspace copying. Batch dequeuing is also the requirement for more batching improvement on rx. Tests were done by pktgen on tap with XDP1 in guest on top of batch zeroing: rx batch | pps 2562.41Mpps (+6.16%) 1282.48Mpps (+8.80%) 64 2.38Mpps (+3.96%) <- Default 16 2.31Mpps (+1.76%) 4 2.31Mpps (+1.76%) 1 2.30Mpps (+1.32%) 0 2.27Mpps (+7.48%) Signed-off-by: Jason Wang--- drivers/vhost/net.c | 117 +--- 1 file changed, 111 insertions(+), 6 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 9b51989..fbaecf3 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -28,6 +28,8 @@ #include #include #include +#include +#include #include @@ -85,6 +87,13 @@ struct vhost_net_ubuf_ref { struct vhost_virtqueue *vq; }; +#define VHOST_RX_BATCH 64 +struct vhost_net_buf { + struct sk_buff *queue[VHOST_RX_BATCH]; + int tail; + int head; +}; + Do you strictly need to put this inline? This structure is quite big already. Do you see a measureabe difference if you make it struct sk_buff **queue; int tail; int head; ? I don't. Will also make it easier to play with the size in the future should someone want to see how does it work e.g. for different ring sizes. Ok, will do this in next version Thanks
Re: [PATCH net-next V4 10/10] vhost_net: try batch dequing from skb array
On 2017年05月10日 20:34, Michael S. Tsirkin wrote: On Wed, May 10, 2017 at 11:36:22AM +0800, Jason Wang wrote: We used to dequeue one skb during recvmsg() from skb_array, this could be inefficient because of the bad cache utilization and spinlock touching for each packet. This patch tries to batch them by calling batch dequeuing helpers explicitly on the exported skb array and pass the skb back through msg_control for underlayer socket to finish the userspace copying. Batch dequeuing is also the requirement for more batching improvement on rx. Tests were done by pktgen on tap with XDP1 in guest on top of batch zeroing: rx batch | pps 2562.41Mpps (+6.16%) 1282.48Mpps (+8.80%) 64 2.38Mpps (+3.96%) <- Default 16 2.31Mpps (+1.76%) 4 2.31Mpps (+1.76%) 1 2.30Mpps (+1.32%) 0 2.27Mpps (+7.48%) Signed-off-by: Jason Wang --- drivers/vhost/net.c | 117 +--- 1 file changed, 111 insertions(+), 6 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 9b51989..fbaecf3 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -28,6 +28,8 @@ #include #include #include +#include +#include #include @@ -85,6 +87,13 @@ struct vhost_net_ubuf_ref { struct vhost_virtqueue *vq; }; +#define VHOST_RX_BATCH 64 +struct vhost_net_buf { + struct sk_buff *queue[VHOST_RX_BATCH]; + int tail; + int head; +}; + Do you strictly need to put this inline? This structure is quite big already. Do you see a measureabe difference if you make it struct sk_buff **queue; int tail; int head; ? I don't. Will also make it easier to play with the size in the future should someone want to see how does it work e.g. for different ring sizes. Ok, will do this in next version Thanks
[RFC][Patch 0/3] securityfs: add the ability to support symlinks
Currently securityfs does not support the creation/use of symlinks. AppArmor would like to be able to use symlinks to map some policy relationships between profiles and the data set it was loaded from (patch 2), and to create a specialized magic policy tree that maps visible policy to the policy namespace that a task is confined by. The following patchset adds symlink support to securityfs and then updates apparmor to leverage them.
[RFC][Patch 0/3] securityfs: add the ability to support symlinks
Currently securityfs does not support the creation/use of symlinks. AppArmor would like to be able to use symlinks to map some policy relationships between profiles and the data set it was loaded from (patch 2), and to create a specialized magic policy tree that maps visible policy to the policy namespace that a task is confined by. The following patchset adds symlink support to securityfs and then updates apparmor to leverage them.
[PATCH 3/3] apparmor: virtualize the policy/ directory
virtualize the apparmor policy/ directory so that the current namespace affects what part of policy is seen. This is done by * creating a new apparmorfs filesystem * creating a magic symlink from securityfs to the correct apparmorfs file in the tree (similar to nsfs use). apparmor fs data and fns also get renamed some to help indicate where they are used aafs - special magic apparmorfs aa_sfs - for fns/data that go into securityfs aa_fs - for fns/data that may be used in the either of aafs or securityfs Signed-off-by: John JohansenReviewed-by: Seth Arnold --- include/uapi/linux/magic.h | 2 + security/apparmor/Makefile | 6 +- security/apparmor/apparmorfs.c | 773 - security/apparmor/capability.c | 4 +- security/apparmor/include/apparmorfs.h | 60 +-- security/apparmor/include/capability.h | 2 +- security/apparmor/include/resource.h | 2 +- security/apparmor/policy.c | 6 +- security/apparmor/policy_ns.c | 4 +- security/apparmor/resource.c | 4 +- 10 files changed, 611 insertions(+), 252 deletions(-) diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index e230af2e6855..a0908f1d2760 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -80,6 +80,8 @@ #define BTRFS_TEST_MAGIC 0x73727279 #define NSFS_MAGIC 0x6e736673 #define BPF_FS_MAGIC 0xcafe4a11 +#define AAFS_MAGIC 0x5a3c69f0 + /* Since UDF 2.01 is ISO 13346 based... */ #define UDF_SUPER_MAGIC0x15013346 #define BALLOON_KVM_MAGIC 0x13661366 diff --git a/security/apparmor/Makefile b/security/apparmor/Makefile index ad369a7aac24..b3b6f70f32c5 100644 --- a/security/apparmor/Makefile +++ b/security/apparmor/Makefile @@ -20,7 +20,7 @@ cmd_make-caps = echo "static const char *const capability_names[] = {" > $@ ;\ sed $< >>$@ -r -n -e '/CAP_FS_MASK/d' \ -e 's/^\#define[ \t]+CAP_([A-Z0-9_]+)[ \t]+([0-9]+)/[\2] = "\L\1",/p';\ echo "};" >> $@ ;\ - echo -n '\#define AA_FS_CAPS_MASK "' >> $@ ;\ + echo -n '\#define AA_SFS_CAPS_MASK "' >> $@ ;\ sed $< -r -n -e '/CAP_FS_MASK/d' \ -e 's/^\#define[ \t]+CAP_([A-Z0-9_]+)[ \t]+([0-9]+)/\L\1/p' | \ tr '\n' ' ' | sed -e 's/ $$/"\n/' >> $@ @@ -46,7 +46,7 @@ cmd_make-caps = echo "static const char *const capability_names[] = {" > $@ ;\ ##define RLIMIT_FSIZE1 /* Maximum filesize */ ##define RLIMIT_STACK 3 /* max stack size */ # to -# #define AA_FS_RLIMIT_MASK "fsize stack" +# #define AA_SFS_RLIMIT_MASK "fsize stack" quiet_cmd_make-rlim = GEN $@ cmd_make-rlim = echo "static const char *const rlim_names[RLIM_NLIMITS] = {" \ > $@ ;\ @@ -56,7 +56,7 @@ cmd_make-rlim = echo "static const char *const rlim_names[RLIM_NLIMITS] = {" \ echo "static const int rlim_map[RLIM_NLIMITS] = {" >> $@ ;\ sed -r -n "s/^\# ?define[ \t]+(RLIMIT_[A-Z0-9_]+).*/\1,/p" $< >> $@ ;\ echo "};" >> $@ ; \ - echo -n '\#define AA_FS_RLIMIT_MASK "' >> $@ ;\ + echo -n '\#define AA_SFS_RLIMIT_MASK "' >> $@ ;\ sed -r -n 's/^\# ?define[ \t]+RLIMIT_([A-Z0-9_]+).*/\L\1/p' $< | \ tr '\n' ' ' | sed -e 's/ $$/"\n/' >> $@ diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c index 5a6010007046..cc1b95628f0f 100644 --- a/security/apparmor/apparmorfs.c +++ b/security/apparmor/apparmorfs.c @@ -35,6 +35,35 @@ #include "include/resource.h" #include "include/policy_unpack.h" +/* + * The apparmor filesystem interface used for policy load and introspection + * The interface is split into two main components based on their function + * a securityfs component: + * used for static files that are always available, and which allows + * userspace to specificy the location of the security filesystem. + * + * fns and data are prefixed with + * aa_sfs_ + * + * an apparmorfs component: + * used loaded policy content and introspection. It is not part of a + * regular mounted filesystem and is available only through the magic + * policy symlink in the root of the securityfs apparmor/ directory. + * Tasks queries will be magically redirected to the correct portion + * of the policy tree based on their confinement. + * + * fns and data are prefixed with + * aafs_ + * + * The aa_fs_ prefix is used to indicate the fn is used by both the + * securityfs and apparmorfs filesystems. + */ + + +/* + * support fns + */ + /** * aa_mangle_name - mangle a profile name to std profile layout form * @name: profile name to mangle (NOT NULL) @@ -74,6 +103,265 @@ static int mangle_name(const char *name, char *target) return t - target; } + +/* + * aafs - core fns and data for the policy tree + */ + +#define AAFS_NAME "apparmorfs" +static struct
[PATCH 3/3] apparmor: virtualize the policy/ directory
virtualize the apparmor policy/ directory so that the current namespace affects what part of policy is seen. This is done by * creating a new apparmorfs filesystem * creating a magic symlink from securityfs to the correct apparmorfs file in the tree (similar to nsfs use). apparmor fs data and fns also get renamed some to help indicate where they are used aafs - special magic apparmorfs aa_sfs - for fns/data that go into securityfs aa_fs - for fns/data that may be used in the either of aafs or securityfs Signed-off-by: John Johansen Reviewed-by: Seth Arnold --- include/uapi/linux/magic.h | 2 + security/apparmor/Makefile | 6 +- security/apparmor/apparmorfs.c | 773 - security/apparmor/capability.c | 4 +- security/apparmor/include/apparmorfs.h | 60 +-- security/apparmor/include/capability.h | 2 +- security/apparmor/include/resource.h | 2 +- security/apparmor/policy.c | 6 +- security/apparmor/policy_ns.c | 4 +- security/apparmor/resource.c | 4 +- 10 files changed, 611 insertions(+), 252 deletions(-) diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index e230af2e6855..a0908f1d2760 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -80,6 +80,8 @@ #define BTRFS_TEST_MAGIC 0x73727279 #define NSFS_MAGIC 0x6e736673 #define BPF_FS_MAGIC 0xcafe4a11 +#define AAFS_MAGIC 0x5a3c69f0 + /* Since UDF 2.01 is ISO 13346 based... */ #define UDF_SUPER_MAGIC0x15013346 #define BALLOON_KVM_MAGIC 0x13661366 diff --git a/security/apparmor/Makefile b/security/apparmor/Makefile index ad369a7aac24..b3b6f70f32c5 100644 --- a/security/apparmor/Makefile +++ b/security/apparmor/Makefile @@ -20,7 +20,7 @@ cmd_make-caps = echo "static const char *const capability_names[] = {" > $@ ;\ sed $< >>$@ -r -n -e '/CAP_FS_MASK/d' \ -e 's/^\#define[ \t]+CAP_([A-Z0-9_]+)[ \t]+([0-9]+)/[\2] = "\L\1",/p';\ echo "};" >> $@ ;\ - echo -n '\#define AA_FS_CAPS_MASK "' >> $@ ;\ + echo -n '\#define AA_SFS_CAPS_MASK "' >> $@ ;\ sed $< -r -n -e '/CAP_FS_MASK/d' \ -e 's/^\#define[ \t]+CAP_([A-Z0-9_]+)[ \t]+([0-9]+)/\L\1/p' | \ tr '\n' ' ' | sed -e 's/ $$/"\n/' >> $@ @@ -46,7 +46,7 @@ cmd_make-caps = echo "static const char *const capability_names[] = {" > $@ ;\ ##define RLIMIT_FSIZE1 /* Maximum filesize */ ##define RLIMIT_STACK 3 /* max stack size */ # to -# #define AA_FS_RLIMIT_MASK "fsize stack" +# #define AA_SFS_RLIMIT_MASK "fsize stack" quiet_cmd_make-rlim = GEN $@ cmd_make-rlim = echo "static const char *const rlim_names[RLIM_NLIMITS] = {" \ > $@ ;\ @@ -56,7 +56,7 @@ cmd_make-rlim = echo "static const char *const rlim_names[RLIM_NLIMITS] = {" \ echo "static const int rlim_map[RLIM_NLIMITS] = {" >> $@ ;\ sed -r -n "s/^\# ?define[ \t]+(RLIMIT_[A-Z0-9_]+).*/\1,/p" $< >> $@ ;\ echo "};" >> $@ ; \ - echo -n '\#define AA_FS_RLIMIT_MASK "' >> $@ ;\ + echo -n '\#define AA_SFS_RLIMIT_MASK "' >> $@ ;\ sed -r -n 's/^\# ?define[ \t]+RLIMIT_([A-Z0-9_]+).*/\L\1/p' $< | \ tr '\n' ' ' | sed -e 's/ $$/"\n/' >> $@ diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c index 5a6010007046..cc1b95628f0f 100644 --- a/security/apparmor/apparmorfs.c +++ b/security/apparmor/apparmorfs.c @@ -35,6 +35,35 @@ #include "include/resource.h" #include "include/policy_unpack.h" +/* + * The apparmor filesystem interface used for policy load and introspection + * The interface is split into two main components based on their function + * a securityfs component: + * used for static files that are always available, and which allows + * userspace to specificy the location of the security filesystem. + * + * fns and data are prefixed with + * aa_sfs_ + * + * an apparmorfs component: + * used loaded policy content and introspection. It is not part of a + * regular mounted filesystem and is available only through the magic + * policy symlink in the root of the securityfs apparmor/ directory. + * Tasks queries will be magically redirected to the correct portion + * of the policy tree based on their confinement. + * + * fns and data are prefixed with + * aafs_ + * + * The aa_fs_ prefix is used to indicate the fn is used by both the + * securityfs and apparmorfs filesystems. + */ + + +/* + * support fns + */ + /** * aa_mangle_name - mangle a profile name to std profile layout form * @name: profile name to mangle (NOT NULL) @@ -74,6 +103,265 @@ static int mangle_name(const char *name, char *target) return t - target; } + +/* + * aafs - core fns and data for the policy tree + */ + +#define AAFS_NAME "apparmorfs" +static struct vfsmount *aafs_mnt; +static int aafs_count; + + +static
[PATCH 1/3] securityfs: add the ability to support symlinks
Signed-off-by: John JohansenReviewed-by: Seth Arnold --- include/linux/security.h | 12 security/inode.c | 140 +-- 2 files changed, 134 insertions(+), 18 deletions(-) diff --git a/include/linux/security.h b/include/linux/security.h index 78d8e03be5d3..28e2be5dd6df 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -1645,6 +1645,10 @@ extern struct dentry *securityfs_create_file(const char *name, umode_t mode, struct dentry *parent, void *data, const struct file_operations *fops); extern struct dentry *securityfs_create_dir(const char *name, struct dentry *parent); +struct dentry *securityfs_create_symlink(const char *name, +struct dentry *parent, +const char *target, +const struct inode_operations *iops); extern void securityfs_remove(struct dentry *dentry); #else /* CONFIG_SECURITYFS */ @@ -1664,6 +1668,14 @@ static inline struct dentry *securityfs_create_file(const char *name, return ERR_PTR(-ENODEV); } +struct dentry *securityfs_create_symlink(const char *name, +struct dentry *parent, +const char *target, +const struct inode_operations *iops) +{ + return ERR_PTR(-ENODEV); +} + static inline void securityfs_remove(struct dentry *dentry) {} diff --git a/security/inode.c b/security/inode.c index 2cb14162ff8d..10c4955d0bed 100644 --- a/security/inode.c +++ b/security/inode.c @@ -26,11 +26,31 @@ static struct vfsmount *mount; static int mount_count; +static void securityfs_evict_inode(struct inode *inode) +{ + truncate_inode_pages_final(>i_data); + clear_inode(inode); + if (S_ISLNK(inode->i_mode)) + kfree(inode->i_link); +} + +static const struct super_operations securityfs_super_operations = { + .statfs = simple_statfs, + .evict_inode= securityfs_evict_inode, +}; + static int fill_super(struct super_block *sb, void *data, int silent) { static struct tree_descr files[] = {{""}}; + int error; + + error = simple_fill_super(sb, SECURITYFS_MAGIC, files); + if (error) + return error; + + sb->s_op = _super_operations; - return simple_fill_super(sb, SECURITYFS_MAGIC, files); + return 0; } static struct dentry *get_sb(struct file_system_type *fs_type, @@ -48,7 +68,7 @@ static struct file_system_type fs_type = { }; /** - * securityfs_create_file - create a file in the securityfs filesystem + * securityfs_create_dentry - create a dentry in the securityfs filesystem * * @name: a pointer to a string containing the name of the file to create. * @mode: the permission that the file should have @@ -60,34 +80,35 @@ static struct file_system_type fs_type = { *the open() call. * @fops: a pointer to a struct file_operations that should be used for *this file. + * @iops: a point to a struct of inode_operations that should be used for + *this file/dir * - * This is the basic "create a file" function for securityfs. It allows for a - * wide range of flexibility in creating a file, or a directory (if you - * want to create a directory, the securityfs_create_dir() function is - * recommended to be used instead). + * This is the basic "create a file/dir/symlink" function for + * securityfs. It allows for a wide range of flexibility in creating + * a file, or a directory (if you want to create a directory, the + * securityfs_create_dir() function is recommended to be used + * instead). * * This function returns a pointer to a dentry if it succeeds. This - * pointer must be passed to the securityfs_remove() function when the file is - * to be removed (no automatic cleanup happens if your module is unloaded, - * you are responsible here). If an error occurs, the function will return - * the error value (via ERR_PTR). + * pointer must be passed to the securityfs_remove() function when the + * file is to be removed (no automatic cleanup happens if your module + * is unloaded, you are responsible here). If an error occurs, the + * function will return the error value (via ERR_PTR). * * If securityfs is not enabled in the kernel, the value %-ENODEV is * returned. */ -struct dentry *securityfs_create_file(const char *name, umode_t mode, - struct dentry *parent, void *data, - const struct file_operations *fops) +static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, + struct dentry *parent, void *data, +
[PATCH 2/3] apparmor: move to per loaddata files, instead of replicating in profiles
The loaddata sets cover more than just a single profile and should be tracked at the ns level. Move the load data files under the namespace and reference the files from the profiles via a symlink. Signed-off-by: John JohansenReviewed-by: Seth Arnold --- security/apparmor/apparmorfs.c| 288 -- security/apparmor/include/apparmorfs.h| 5 + security/apparmor/include/policy_ns.h | 4 + security/apparmor/include/policy_unpack.h | 67 ++- security/apparmor/policy.c| 42 - security/apparmor/policy_ns.c | 2 + security/apparmor/policy_unpack.c | 49 - 7 files changed, 393 insertions(+), 64 deletions(-) diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c index 6d1a4a67abce..5a6010007046 100644 --- a/security/apparmor/apparmorfs.c +++ b/security/apparmor/apparmorfs.c @@ -101,10 +101,10 @@ static struct aa_loaddata *aa_simple_write_to_buffer(const char __user *userbuf, data = kvmalloc(sizeof(*data) + alloc_size); if (data == NULL) return ERR_PTR(-ENOMEM); + memset(data, 0, sizeof(*data)); kref_init(>count); + INIT_LIST_HEAD(>list); data->size = copy_size; - data->hash = NULL; - data->abi = 0; if (copy_from_user(data->data, userbuf, copy_size)) { kvfree(data); @@ -559,68 +559,92 @@ static const struct file_operations aa_fs_ns_name = { .release= single_release, }; -static int rawdata_release(struct inode *inode, struct file *file) + +/* policy/raw_data/ * file ops */ + +#define SEQ_RAWDATA_FOPS(NAME) \ +static int seq_rawdata_ ##NAME ##_open(struct inode *inode, struct file *file)\ +{\ + return seq_rawdata_open(inode, file, seq_rawdata_ ##NAME ##_show);\ +}\ + \ +static const struct file_operations seq_rawdata_ ##NAME ##_fops = { \ + .owner = THIS_MODULE,\ + .open = seq_rawdata_ ##NAME ##_open,\ + .read = seq_read, \ + .llseek = seq_lseek, \ + .release= seq_rawdata_release,\ +}\ + +static int seq_rawdata_open(struct inode *inode, struct file *file, + int (*show)(struct seq_file *, void *)) { - /* TODO: switch to loaddata when profile switched to symlink */ - aa_put_loaddata(file->private_data); + struct aa_loaddata *data = __aa_get_loaddata(inode->i_private); + int error; - return 0; + if (!data) + /* lost race this ent is being reaped */ + return -ENOENT; + + error = single_open(file, show, data); + if (error) { + file->private_data = NULL; + aa_put_loaddata(data); + } + + return error; } -static int aa_fs_seq_raw_abi_show(struct seq_file *seq, void *v) +static int seq_rawdata_release(struct inode *inode, struct file *file) { - struct aa_proxy *proxy = seq->private; - struct aa_profile *profile = aa_get_profile_rcu(>profile); + struct seq_file *seq = (struct seq_file *) file->private_data; - if (profile->rawdata->abi) - seq_printf(seq, "v%d\n", profile->rawdata->abi); + if (seq) + aa_put_loaddata(seq->private); + return single_release(inode, file); +} - aa_put_profile(profile); +static int seq_rawdata_abi_show(struct seq_file *seq, void *v) +{ + struct aa_loaddata *data = seq->private; + + if (data->abi) { + seq_printf(seq, "v%d", data->abi); + seq_puts(seq, "\n"); + } return 0; } -static int aa_fs_seq_raw_abi_open(struct inode *inode, struct file *file) +static int seq_rawdata_revision_show(struct seq_file *seq, void *v) { - return aa_fs_seq_profile_open(inode, file, aa_fs_seq_raw_abi_show); -} + struct aa_loaddata *data = seq->private; -static const struct file_operations aa_fs_seq_raw_abi_fops = { - .owner = THIS_MODULE, - .open = aa_fs_seq_raw_abi_open, - .read = seq_read, - .llseek = seq_lseek, - .release= aa_fs_seq_profile_release, -}; + if (data->revision) { + seq_printf(seq, "%ld", data->revision); + seq_puts(seq, "\n"); + } + + return 0; +} -static int aa_fs_seq_raw_hash_show(struct
[PATCH 1/3] securityfs: add the ability to support symlinks
Signed-off-by: John Johansen Reviewed-by: Seth Arnold --- include/linux/security.h | 12 security/inode.c | 140 +-- 2 files changed, 134 insertions(+), 18 deletions(-) diff --git a/include/linux/security.h b/include/linux/security.h index 78d8e03be5d3..28e2be5dd6df 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -1645,6 +1645,10 @@ extern struct dentry *securityfs_create_file(const char *name, umode_t mode, struct dentry *parent, void *data, const struct file_operations *fops); extern struct dentry *securityfs_create_dir(const char *name, struct dentry *parent); +struct dentry *securityfs_create_symlink(const char *name, +struct dentry *parent, +const char *target, +const struct inode_operations *iops); extern void securityfs_remove(struct dentry *dentry); #else /* CONFIG_SECURITYFS */ @@ -1664,6 +1668,14 @@ static inline struct dentry *securityfs_create_file(const char *name, return ERR_PTR(-ENODEV); } +struct dentry *securityfs_create_symlink(const char *name, +struct dentry *parent, +const char *target, +const struct inode_operations *iops) +{ + return ERR_PTR(-ENODEV); +} + static inline void securityfs_remove(struct dentry *dentry) {} diff --git a/security/inode.c b/security/inode.c index 2cb14162ff8d..10c4955d0bed 100644 --- a/security/inode.c +++ b/security/inode.c @@ -26,11 +26,31 @@ static struct vfsmount *mount; static int mount_count; +static void securityfs_evict_inode(struct inode *inode) +{ + truncate_inode_pages_final(>i_data); + clear_inode(inode); + if (S_ISLNK(inode->i_mode)) + kfree(inode->i_link); +} + +static const struct super_operations securityfs_super_operations = { + .statfs = simple_statfs, + .evict_inode= securityfs_evict_inode, +}; + static int fill_super(struct super_block *sb, void *data, int silent) { static struct tree_descr files[] = {{""}}; + int error; + + error = simple_fill_super(sb, SECURITYFS_MAGIC, files); + if (error) + return error; + + sb->s_op = _super_operations; - return simple_fill_super(sb, SECURITYFS_MAGIC, files); + return 0; } static struct dentry *get_sb(struct file_system_type *fs_type, @@ -48,7 +68,7 @@ static struct file_system_type fs_type = { }; /** - * securityfs_create_file - create a file in the securityfs filesystem + * securityfs_create_dentry - create a dentry in the securityfs filesystem * * @name: a pointer to a string containing the name of the file to create. * @mode: the permission that the file should have @@ -60,34 +80,35 @@ static struct file_system_type fs_type = { *the open() call. * @fops: a pointer to a struct file_operations that should be used for *this file. + * @iops: a point to a struct of inode_operations that should be used for + *this file/dir * - * This is the basic "create a file" function for securityfs. It allows for a - * wide range of flexibility in creating a file, or a directory (if you - * want to create a directory, the securityfs_create_dir() function is - * recommended to be used instead). + * This is the basic "create a file/dir/symlink" function for + * securityfs. It allows for a wide range of flexibility in creating + * a file, or a directory (if you want to create a directory, the + * securityfs_create_dir() function is recommended to be used + * instead). * * This function returns a pointer to a dentry if it succeeds. This - * pointer must be passed to the securityfs_remove() function when the file is - * to be removed (no automatic cleanup happens if your module is unloaded, - * you are responsible here). If an error occurs, the function will return - * the error value (via ERR_PTR). + * pointer must be passed to the securityfs_remove() function when the + * file is to be removed (no automatic cleanup happens if your module + * is unloaded, you are responsible here). If an error occurs, the + * function will return the error value (via ERR_PTR). * * If securityfs is not enabled in the kernel, the value %-ENODEV is * returned. */ -struct dentry *securityfs_create_file(const char *name, umode_t mode, - struct dentry *parent, void *data, - const struct file_operations *fops) +static struct dentry *securityfs_create_dentry(const char *name, umode_t mode, + struct dentry *parent, void *data, + const struct file_operations *fops, +
[PATCH 2/3] apparmor: move to per loaddata files, instead of replicating in profiles
The loaddata sets cover more than just a single profile and should be tracked at the ns level. Move the load data files under the namespace and reference the files from the profiles via a symlink. Signed-off-by: John Johansen Reviewed-by: Seth Arnold --- security/apparmor/apparmorfs.c| 288 -- security/apparmor/include/apparmorfs.h| 5 + security/apparmor/include/policy_ns.h | 4 + security/apparmor/include/policy_unpack.h | 67 ++- security/apparmor/policy.c| 42 - security/apparmor/policy_ns.c | 2 + security/apparmor/policy_unpack.c | 49 - 7 files changed, 393 insertions(+), 64 deletions(-) diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c index 6d1a4a67abce..5a6010007046 100644 --- a/security/apparmor/apparmorfs.c +++ b/security/apparmor/apparmorfs.c @@ -101,10 +101,10 @@ static struct aa_loaddata *aa_simple_write_to_buffer(const char __user *userbuf, data = kvmalloc(sizeof(*data) + alloc_size); if (data == NULL) return ERR_PTR(-ENOMEM); + memset(data, 0, sizeof(*data)); kref_init(>count); + INIT_LIST_HEAD(>list); data->size = copy_size; - data->hash = NULL; - data->abi = 0; if (copy_from_user(data->data, userbuf, copy_size)) { kvfree(data); @@ -559,68 +559,92 @@ static const struct file_operations aa_fs_ns_name = { .release= single_release, }; -static int rawdata_release(struct inode *inode, struct file *file) + +/* policy/raw_data/ * file ops */ + +#define SEQ_RAWDATA_FOPS(NAME) \ +static int seq_rawdata_ ##NAME ##_open(struct inode *inode, struct file *file)\ +{\ + return seq_rawdata_open(inode, file, seq_rawdata_ ##NAME ##_show);\ +}\ + \ +static const struct file_operations seq_rawdata_ ##NAME ##_fops = { \ + .owner = THIS_MODULE,\ + .open = seq_rawdata_ ##NAME ##_open,\ + .read = seq_read, \ + .llseek = seq_lseek, \ + .release= seq_rawdata_release,\ +}\ + +static int seq_rawdata_open(struct inode *inode, struct file *file, + int (*show)(struct seq_file *, void *)) { - /* TODO: switch to loaddata when profile switched to symlink */ - aa_put_loaddata(file->private_data); + struct aa_loaddata *data = __aa_get_loaddata(inode->i_private); + int error; - return 0; + if (!data) + /* lost race this ent is being reaped */ + return -ENOENT; + + error = single_open(file, show, data); + if (error) { + file->private_data = NULL; + aa_put_loaddata(data); + } + + return error; } -static int aa_fs_seq_raw_abi_show(struct seq_file *seq, void *v) +static int seq_rawdata_release(struct inode *inode, struct file *file) { - struct aa_proxy *proxy = seq->private; - struct aa_profile *profile = aa_get_profile_rcu(>profile); + struct seq_file *seq = (struct seq_file *) file->private_data; - if (profile->rawdata->abi) - seq_printf(seq, "v%d\n", profile->rawdata->abi); + if (seq) + aa_put_loaddata(seq->private); + return single_release(inode, file); +} - aa_put_profile(profile); +static int seq_rawdata_abi_show(struct seq_file *seq, void *v) +{ + struct aa_loaddata *data = seq->private; + + if (data->abi) { + seq_printf(seq, "v%d", data->abi); + seq_puts(seq, "\n"); + } return 0; } -static int aa_fs_seq_raw_abi_open(struct inode *inode, struct file *file) +static int seq_rawdata_revision_show(struct seq_file *seq, void *v) { - return aa_fs_seq_profile_open(inode, file, aa_fs_seq_raw_abi_show); -} + struct aa_loaddata *data = seq->private; -static const struct file_operations aa_fs_seq_raw_abi_fops = { - .owner = THIS_MODULE, - .open = aa_fs_seq_raw_abi_open, - .read = seq_read, - .llseek = seq_lseek, - .release= aa_fs_seq_profile_release, -}; + if (data->revision) { + seq_printf(seq, "%ld", data->revision); + seq_puts(seq, "\n"); + } + + return 0; +} -static int aa_fs_seq_raw_hash_show(struct seq_file *seq, void *v) +static int
Re: [PATCH v3 2/2] dt-bindings: pcie: Add documentation for Mediatek PCIe
On Wed, 2017-05-10 at 12:01 +0200, Arnd Bergmann wrote: > On Wed, May 10, 2017 at 11:31 AM, Ryder Leewrote: > > On Wed, 2017-05-10 at 10:08 +0200, Arnd Bergmann wrote: > >> On Wed, May 10, 2017 at 4:07 AM, Ryder Lee wrote: > >> > >> > +- ranges: > >> > + - The first three entries are expected to translate the addresses for > >> > the root > >> > +port registers, which are referenced by the assigned-addresses > >> > property of > >> > +the root port nodes (see below). > >> > >> I don't understand this part. Why do you need a static translation for > >> these? > >> Shouldn't they just be listed in the 'reg' property of the parent node now > >> that > >> you have the clk/reset/phy properties in the parent as well? > > > > At first, I did like that. But I noticed that someone suggest it's > > better to use 'assigned-addresses' to handle per-port registers, the > > same path as tegra and marvell did, in other platform discussion thread. > > So I just put shared register in root node. It could be rolled back if > > you feel this is inappropriate. > > The marvell case is not a good example for your case: their top-level > device is made up by the OS to help with the shared resource allocation, > while in your case the bus bridge actually exists in hardware. > > I'm not too familiar with the Tegra case, and haven't looked at that here, > but it could be an artifact of how for a while we used to list the config > space access in the top-level "ranges" instead of the "reg" property. > > I'd vote for moving it back, for consistency with the other port specific > properties that are now in the root node. Once you do that, the port > nodes can be removed completely, which is what I was aiming for with > the comments on the previous version. I'll move it back. > >> > +Required properties: > >> > +- device_type: Must be "pci" > >> > +- assigned-addresses: Address and size of the port configuration > >> > registers > >> > +- reg: Only the first four bytes are used to refer to the correct bus > >> > number > >> > + and device number. > >> > +- #address-cells: Must be 3 > >> > +- #size-cells: Must be 2 > >> > +- #interrupt-cells: Must be 1 > >> > +- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping > >> > properties > >> > + Please refer to the standard PCI bus binding document for a more > >> > detailed > >> > + explanation. > >> > >> Child nodes do not normally have interrupt-map properties. Isn't this > >> already covered by the interrupt-map in the parent? > >> > > > > I have one Intel 4 port ethernet card(:00:01) and MTK WLAN card > > (:00:02), probe message looks good to me. > > > > pci :00:01.0: fixup irq: got 224 > > pci :00:01.0: assigning IRQ 224 > > pci :00:02.0: fixup irq: got 225 > > pci :00:02.0: assigning IRQ 225 > > > > pci :01:00.0: fixup irq: got 224 > > pci :01:00.0: assigning IRQ 224 > > pci :01:00.1: fixup irq: got 224 > > pci :01:00.1: assigning IRQ 224 > > pci :01:00.2: fixup irq: got 224 > > pci :01:00.2: assigning IRQ 224 > > pci :01:00.3: fixup irq: got 224 > > pci :01:00.3: assigning IRQ 224 > > > > pci :02:00.0: fixup irq: got 225 > > pci :02:00.0: assigning IRQ 225 > > > > > > But child nodes without interrupt-map properties: > > It seems incorrect. > > > > pci :00:01.0: fixup irq: got 224 > > pci :00:01.0: assigning IRQ 224 > > pci :00:02.0: fixup irq: got 225 > > pci :00:02.0: assigning IRQ 225 > > > > pci :01:00.0: fixup irq: got 223 > > pci :01:00.0: assigning IRQ 223 > > Not entirely sure what happens here, but I guess the problem > is that the 'reg' portion of the parent interrupt-map refers to > the port devices, not the devices attached the devices behind > them. I agree with you. That's why I need additional interrupt-map properties to resolve IRQ correctly for the devices behind root ports. Not sure whether other platforms have similar case like me here. > On a related note, I see that you still list > > > +- interrupts: Three interrupt outputs of the controller. Must contain an > > + entry for each entry in the interrupt-names property. > > +- interrupt-names: Must include the following names > > + - "pcie-int0" > > + - "pcie-int1" > > + - "pcie-int2" > > This seems to be an artifact from the older version and should be > removed as the driver correctly ignores the properties now. Actually, everything works fine without these properties however when it loads we see a few weird error message: pcieport :00:01.0: Signaling PME with IRQ 232 pcieport :00:02.0: enabling device (0140 -> 0142) pcieport :00:02.0: enabling bus mastering irq 232: nobody cared (try booting with the "irqpoll" option) ... [] (pcie_pme_probe) from [] (pcie_port_probe_service +0x44/0x6c) (pcie_port_probe_service) from [] (driver_probe_device +0x280/0x470) ... (pcie_port_device_register) from [] (pcie_portdrv_probe
Re: [PATCH v3 2/2] dt-bindings: pcie: Add documentation for Mediatek PCIe
On Wed, 2017-05-10 at 12:01 +0200, Arnd Bergmann wrote: > On Wed, May 10, 2017 at 11:31 AM, Ryder Lee wrote: > > On Wed, 2017-05-10 at 10:08 +0200, Arnd Bergmann wrote: > >> On Wed, May 10, 2017 at 4:07 AM, Ryder Lee wrote: > >> > >> > +- ranges: > >> > + - The first three entries are expected to translate the addresses for > >> > the root > >> > +port registers, which are referenced by the assigned-addresses > >> > property of > >> > +the root port nodes (see below). > >> > >> I don't understand this part. Why do you need a static translation for > >> these? > >> Shouldn't they just be listed in the 'reg' property of the parent node now > >> that > >> you have the clk/reset/phy properties in the parent as well? > > > > At first, I did like that. But I noticed that someone suggest it's > > better to use 'assigned-addresses' to handle per-port registers, the > > same path as tegra and marvell did, in other platform discussion thread. > > So I just put shared register in root node. It could be rolled back if > > you feel this is inappropriate. > > The marvell case is not a good example for your case: their top-level > device is made up by the OS to help with the shared resource allocation, > while in your case the bus bridge actually exists in hardware. > > I'm not too familiar with the Tegra case, and haven't looked at that here, > but it could be an artifact of how for a while we used to list the config > space access in the top-level "ranges" instead of the "reg" property. > > I'd vote for moving it back, for consistency with the other port specific > properties that are now in the root node. Once you do that, the port > nodes can be removed completely, which is what I was aiming for with > the comments on the previous version. I'll move it back. > >> > +Required properties: > >> > +- device_type: Must be "pci" > >> > +- assigned-addresses: Address and size of the port configuration > >> > registers > >> > +- reg: Only the first four bytes are used to refer to the correct bus > >> > number > >> > + and device number. > >> > +- #address-cells: Must be 3 > >> > +- #size-cells: Must be 2 > >> > +- #interrupt-cells: Must be 1 > >> > +- interrupt-map-mask and interrupt-map: Standard PCI IRQ mapping > >> > properties > >> > + Please refer to the standard PCI bus binding document for a more > >> > detailed > >> > + explanation. > >> > >> Child nodes do not normally have interrupt-map properties. Isn't this > >> already covered by the interrupt-map in the parent? > >> > > > > I have one Intel 4 port ethernet card(:00:01) and MTK WLAN card > > (:00:02), probe message looks good to me. > > > > pci :00:01.0: fixup irq: got 224 > > pci :00:01.0: assigning IRQ 224 > > pci :00:02.0: fixup irq: got 225 > > pci :00:02.0: assigning IRQ 225 > > > > pci :01:00.0: fixup irq: got 224 > > pci :01:00.0: assigning IRQ 224 > > pci :01:00.1: fixup irq: got 224 > > pci :01:00.1: assigning IRQ 224 > > pci :01:00.2: fixup irq: got 224 > > pci :01:00.2: assigning IRQ 224 > > pci :01:00.3: fixup irq: got 224 > > pci :01:00.3: assigning IRQ 224 > > > > pci :02:00.0: fixup irq: got 225 > > pci :02:00.0: assigning IRQ 225 > > > > > > But child nodes without interrupt-map properties: > > It seems incorrect. > > > > pci :00:01.0: fixup irq: got 224 > > pci :00:01.0: assigning IRQ 224 > > pci :00:02.0: fixup irq: got 225 > > pci :00:02.0: assigning IRQ 225 > > > > pci :01:00.0: fixup irq: got 223 > > pci :01:00.0: assigning IRQ 223 > > Not entirely sure what happens here, but I guess the problem > is that the 'reg' portion of the parent interrupt-map refers to > the port devices, not the devices attached the devices behind > them. I agree with you. That's why I need additional interrupt-map properties to resolve IRQ correctly for the devices behind root ports. Not sure whether other platforms have similar case like me here. > On a related note, I see that you still list > > > +- interrupts: Three interrupt outputs of the controller. Must contain an > > + entry for each entry in the interrupt-names property. > > +- interrupt-names: Must include the following names > > + - "pcie-int0" > > + - "pcie-int1" > > + - "pcie-int2" > > This seems to be an artifact from the older version and should be > removed as the driver correctly ignores the properties now. Actually, everything works fine without these properties however when it loads we see a few weird error message: pcieport :00:01.0: Signaling PME with IRQ 232 pcieport :00:02.0: enabling device (0140 -> 0142) pcieport :00:02.0: enabling bus mastering irq 232: nobody cared (try booting with the "irqpoll" option) ... [] (pcie_pme_probe) from [] (pcie_port_probe_service +0x44/0x6c) (pcie_port_probe_service) from [] (driver_probe_device +0x280/0x470) ... (pcie_port_device_register) from [] (pcie_portdrv_probe +0x3c/0xb4) (pcie_portdrv_probe) from []
Re: [f2fs-dev] [PATCH 1/3] f2fs: use f2fs_submit_page_bio for ra_meta_pages
On 05/11, Chao Yu wrote: > Hi Jaegeuk, > > On 2017/5/11 7:48, Jaegeuk Kim wrote: > > This patch avoids to use f2fs_submit_merged_bio for read, which was the only > > read case. > > This makes f2fs losing the chance to merge multiple pages into one bio during > reading continuous physical blocks, it may cause potential performance > regression, how about using a local bio in ra_meta_pages to cache more pages? This is a readahead flow, which is asynchronous and not a performance critical flow. And, I expect blk_plug is still able to merge them. We're using this in readahead of node blocks as well. Thanks, > > Thanks, > > > > > Signed-off-by: Jaegeuk Kim> > --- > > fs/f2fs/checkpoint.c | 4 +--- > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c > > index ea9c317b5916..8d92f8249000 100644 > > --- a/fs/f2fs/checkpoint.c > > +++ b/fs/f2fs/checkpoint.c > > @@ -207,12 +207,10 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t > > start, int nrpages, > > } > > > > fio.page = page; > > - fio.old_blkaddr = fio.new_blkaddr; > > - f2fs_submit_page_mbio(); > > + f2fs_submit_page_bio(); > > f2fs_put_page(page, 0); > > } > > out: > > - f2fs_submit_merged_bio(sbi, META, READ); > > blk_finish_plug(); > > return blkno - start; > > } > >
Re: [f2fs-dev] [PATCH 1/3] f2fs: use f2fs_submit_page_bio for ra_meta_pages
On 05/11, Chao Yu wrote: > On 2017/5/11 9:24, Chao Yu wrote: > > Hi Jaegeuk, > > > > On 2017/5/11 7:48, Jaegeuk Kim wrote: > >> This patch avoids to use f2fs_submit_merged_bio for read, which was the > >> only > >> read case. > > > > This makes f2fs losing the chance to merge multiple pages into one bio > > during > > reading continuous physical blocks, it may cause potential performance > > regression, how about using a local bio in ra_meta_pages to cache more > > pages? > > BTW, need to remove sbi->read_io as well? Yup. > > Thanks, > > > > > Thanks, > > > >> > >> Signed-off-by: Jaegeuk Kim> >> --- > >> fs/f2fs/checkpoint.c | 4 +--- > >> 1 file changed, 1 insertion(+), 3 deletions(-) > >> > >> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c > >> index ea9c317b5916..8d92f8249000 100644 > >> --- a/fs/f2fs/checkpoint.c > >> +++ b/fs/f2fs/checkpoint.c > >> @@ -207,12 +207,10 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t > >> start, int nrpages, > >>} > >> > >>fio.page = page; > >> - fio.old_blkaddr = fio.new_blkaddr; > >> - f2fs_submit_page_mbio(); > >> + f2fs_submit_page_bio(); > >>f2fs_put_page(page, 0); > >>} > >> out: > >> - f2fs_submit_merged_bio(sbi, META, READ); > >>blk_finish_plug(); > >>return blkno - start; > >> } > >> > > > > > > . > >
Re: [f2fs-dev] [PATCH 1/3] f2fs: use f2fs_submit_page_bio for ra_meta_pages
On 05/11, Chao Yu wrote: > Hi Jaegeuk, > > On 2017/5/11 7:48, Jaegeuk Kim wrote: > > This patch avoids to use f2fs_submit_merged_bio for read, which was the only > > read case. > > This makes f2fs losing the chance to merge multiple pages into one bio during > reading continuous physical blocks, it may cause potential performance > regression, how about using a local bio in ra_meta_pages to cache more pages? This is a readahead flow, which is asynchronous and not a performance critical flow. And, I expect blk_plug is still able to merge them. We're using this in readahead of node blocks as well. Thanks, > > Thanks, > > > > > Signed-off-by: Jaegeuk Kim > > --- > > fs/f2fs/checkpoint.c | 4 +--- > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c > > index ea9c317b5916..8d92f8249000 100644 > > --- a/fs/f2fs/checkpoint.c > > +++ b/fs/f2fs/checkpoint.c > > @@ -207,12 +207,10 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t > > start, int nrpages, > > } > > > > fio.page = page; > > - fio.old_blkaddr = fio.new_blkaddr; > > - f2fs_submit_page_mbio(); > > + f2fs_submit_page_bio(); > > f2fs_put_page(page, 0); > > } > > out: > > - f2fs_submit_merged_bio(sbi, META, READ); > > blk_finish_plug(); > > return blkno - start; > > } > >
Re: [f2fs-dev] [PATCH 1/3] f2fs: use f2fs_submit_page_bio for ra_meta_pages
On 05/11, Chao Yu wrote: > On 2017/5/11 9:24, Chao Yu wrote: > > Hi Jaegeuk, > > > > On 2017/5/11 7:48, Jaegeuk Kim wrote: > >> This patch avoids to use f2fs_submit_merged_bio for read, which was the > >> only > >> read case. > > > > This makes f2fs losing the chance to merge multiple pages into one bio > > during > > reading continuous physical blocks, it may cause potential performance > > regression, how about using a local bio in ra_meta_pages to cache more > > pages? > > BTW, need to remove sbi->read_io as well? Yup. > > Thanks, > > > > > Thanks, > > > >> > >> Signed-off-by: Jaegeuk Kim > >> --- > >> fs/f2fs/checkpoint.c | 4 +--- > >> 1 file changed, 1 insertion(+), 3 deletions(-) > >> > >> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c > >> index ea9c317b5916..8d92f8249000 100644 > >> --- a/fs/f2fs/checkpoint.c > >> +++ b/fs/f2fs/checkpoint.c > >> @@ -207,12 +207,10 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t > >> start, int nrpages, > >>} > >> > >>fio.page = page; > >> - fio.old_blkaddr = fio.new_blkaddr; > >> - f2fs_submit_page_mbio(); > >> + f2fs_submit_page_bio(); > >>f2fs_put_page(page, 0); > >>} > >> out: > >> - f2fs_submit_merged_bio(sbi, META, READ); > >>blk_finish_plug(); > >>return blkno - start; > >> } > >> > > > > > > . > >
Re: [PATCH 0/3] GPU-DRM-Radeon: Fine-tuning for three function implementations
On 10/05/17 08:30 PM, Christian König wrote: > Am 10.05.2017 um 02:23 schrieb Michel Dänzer: >> On 03/05/17 09:46 PM, Christian König wrote: >>> Am 02.05.2017 um 22:04 schrieb SF Markus Elfring: From: Markus ElfringDate: Tue, 2 May 2017 22:00:02 +0200 Three update suggestions were taken into account from static source code analysis. Markus Elfring (3): Use seq_putc() in radeon_sa_bo_dump_debug_info() Use seq_puts() in radeon_debugfs_pm_info() Use seq_puts() in r100_debugfs_cp_csq_fifo() >>> Reviewed-by: Christian König >> Based on >> https://lists.freedesktop.org/archives/dri-devel/2017-May/140837.html >> and followups, I'm afraid we'll have to make sure Markus' patches have >> been tested adequately before applying them. > > I can't judge the background of that decision, but at least those tree > patches for radeon looked trivial to me. Which is part of the issue, see also https://lists.freedesktop.org/archives/dri-devel/2017-May/140694.html and other posts in that thread. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [PATCH 0/3] GPU-DRM-Radeon: Fine-tuning for three function implementations
On 10/05/17 08:30 PM, Christian König wrote: > Am 10.05.2017 um 02:23 schrieb Michel Dänzer: >> On 03/05/17 09:46 PM, Christian König wrote: >>> Am 02.05.2017 um 22:04 schrieb SF Markus Elfring: From: Markus Elfring Date: Tue, 2 May 2017 22:00:02 +0200 Three update suggestions were taken into account from static source code analysis. Markus Elfring (3): Use seq_putc() in radeon_sa_bo_dump_debug_info() Use seq_puts() in radeon_debugfs_pm_info() Use seq_puts() in r100_debugfs_cp_csq_fifo() >>> Reviewed-by: Christian König >> Based on >> https://lists.freedesktop.org/archives/dri-devel/2017-May/140837.html >> and followups, I'm afraid we'll have to make sure Markus' patches have >> been tested adequately before applying them. > > I can't judge the background of that decision, but at least those tree > patches for radeon looked trivial to me. Which is part of the issue, see also https://lists.freedesktop.org/archives/dri-devel/2017-May/140694.html and other posts in that thread. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [PATCH] f2fs: split bio cache
On 05/11, Chao Yu wrote: > On 2017/5/11 7:50, Jaegeuk Kim wrote: > > On 05/09, Chao Yu wrote: > >> Hi Jaegeuk, > >> > >> On 2017/5/9 5:23, Jaegeuk Kim wrote: > >>> Hi Chao, > >>> > >>> I can't see a strong reason to split meta from data/node and rename the > >>> existing > >>> function names. Instead, how about keeping the existing one while adding > >>> some > >>> page types to deal with log types? > >> > >> Hmm.. before write this patch, I have thought about this implementation > >> which > >> adds HOT_DATA/WARM_DATA/.. into page_type enum, but the final target is > >> making > >> different temperature log-header being with independent bio cache, io > >> lock, and > >> io list, eliminating interaction of different temperature IOs, also > >> expanding > >> f2fs' scalability when adding more log-headers. If we don't split meta from > >> data/node, it's a little bit hard to approach this. What do you think? > > > > I submitted clean-up patches. How about this on top of them? > > After splitting, bio caches is connected to log-header, so why not moving bio > cache into log-header (SM_I(sbi)->curseg_array)? after the merging: > - we could avoid redundant enum. e.g. temp_type, page_type::{DATA/NODE}, since > we have seg_type enum instead. > - once we add special log-header like journal log or random IO log making > DATA/NODE and HOT/WARM/COLD non-orthogonalization, we just need to change few > codes to adjust it. Well, I perfer to keep block allocation and IO control in a separate way as is. Moreover, I don't think performance gain would be large enough comparing to what we need to change both of whole flows. Thanks, > > How do you think? > > Thanks, > > > > > --- > > fs/f2fs/data.c | 51 > > + > > fs/f2fs/f2fs.h | 10 - > > fs/f2fs/gc.c| 2 ++ > > fs/f2fs/segment.c | 24 +++-- > > fs/f2fs/segment.h | 4 > > fs/f2fs/super.c | 21 --- > > include/trace/events/f2fs.h | 14 - > > 7 files changed, 102 insertions(+), 24 deletions(-) > > > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c > > index 06bb2042385e..49b7e3918484 100644 > > --- a/fs/f2fs/data.c > > +++ b/fs/f2fs/data.c > > @@ -282,21 +282,30 @@ static bool has_merged_page(struct f2fs_sb_info *sbi, > > struct inode *inode, > > nid_t ino, pgoff_t idx, enum page_type type) > > { > > enum page_type btype = PAGE_TYPE_OF_BIO(type); > > - struct f2fs_bio_info *io = >write_io[btype]; > > - bool ret; > > + enum temp_type temp; > > + struct f2fs_bio_info *io; > > + bool ret = false; > > > > - down_read(>io_rwsem); > > - ret = __has_merged_page(io, inode, ino, idx); > > - up_read(>io_rwsem); > > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > > + io = sbi->write_io[btype] + temp; > > + > > + down_read(>io_rwsem); > > + ret = __has_merged_page(io, inode, ino, idx); > > + up_read(>io_rwsem); > > + > > + /* TODO: use HOT temp only for meta pages now. */ > > + if (ret || btype == META) > > + break; > > + } > > return ret; > > } > > > > static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi, > > struct inode *inode, nid_t ino, pgoff_t idx, > > - enum page_type type) > > + enum page_type type, enum temp_type temp) > > { > > enum page_type btype = PAGE_TYPE_OF_BIO(type); > > - struct f2fs_bio_info *io = >write_io[btype]; > > + struct f2fs_bio_info *io = sbi->write_io[btype] + temp; > > > > down_write(>io_rwsem); > > > > @@ -316,17 +325,34 @@ static void __f2fs_submit_merged_write(struct > > f2fs_sb_info *sbi, > > up_write(>io_rwsem); > > } > > > > +static void __submit_merged_write_cond(struct f2fs_sb_info *sbi, > > + struct inode *inode, nid_t ino, pgoff_t idx, > > + enum page_type type, bool force) > > +{ > > + enum temp_type temp; > > + > > + if (!force && !has_merged_page(sbi, inode, ino, idx, type)) > > + return; > > + > > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > > + __f2fs_submit_merged_write(sbi, inode, ino, idx, type, temp); > > + > > + /* TODO: use HOT temp only for meta pages now. */ > > + if (type >= META) > > + break; > > + } > > +} > > + > > void f2fs_submit_merged_write(struct f2fs_sb_info *sbi, enum page_type > > type) > > { > > - __f2fs_submit_merged_write(sbi, NULL, 0, 0, type); > > + __submit_merged_write_cond(sbi, NULL, 0, 0, type, true); > > } > > > > void f2fs_submit_merged_write_cond(struct f2fs_sb_info *sbi, > > struct inode *inode, nid_t ino, pgoff_t idx, > > enum page_type type) > > { > > - if
Re: [PATCH] f2fs: split bio cache
On 05/11, Chao Yu wrote: > On 2017/5/11 7:50, Jaegeuk Kim wrote: > > On 05/09, Chao Yu wrote: > >> Hi Jaegeuk, > >> > >> On 2017/5/9 5:23, Jaegeuk Kim wrote: > >>> Hi Chao, > >>> > >>> I can't see a strong reason to split meta from data/node and rename the > >>> existing > >>> function names. Instead, how about keeping the existing one while adding > >>> some > >>> page types to deal with log types? > >> > >> Hmm.. before write this patch, I have thought about this implementation > >> which > >> adds HOT_DATA/WARM_DATA/.. into page_type enum, but the final target is > >> making > >> different temperature log-header being with independent bio cache, io > >> lock, and > >> io list, eliminating interaction of different temperature IOs, also > >> expanding > >> f2fs' scalability when adding more log-headers. If we don't split meta from > >> data/node, it's a little bit hard to approach this. What do you think? > > > > I submitted clean-up patches. How about this on top of them? > > After splitting, bio caches is connected to log-header, so why not moving bio > cache into log-header (SM_I(sbi)->curseg_array)? after the merging: > - we could avoid redundant enum. e.g. temp_type, page_type::{DATA/NODE}, since > we have seg_type enum instead. > - once we add special log-header like journal log or random IO log making > DATA/NODE and HOT/WARM/COLD non-orthogonalization, we just need to change few > codes to adjust it. Well, I perfer to keep block allocation and IO control in a separate way as is. Moreover, I don't think performance gain would be large enough comparing to what we need to change both of whole flows. Thanks, > > How do you think? > > Thanks, > > > > > --- > > fs/f2fs/data.c | 51 > > + > > fs/f2fs/f2fs.h | 10 - > > fs/f2fs/gc.c| 2 ++ > > fs/f2fs/segment.c | 24 +++-- > > fs/f2fs/segment.h | 4 > > fs/f2fs/super.c | 21 --- > > include/trace/events/f2fs.h | 14 - > > 7 files changed, 102 insertions(+), 24 deletions(-) > > > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c > > index 06bb2042385e..49b7e3918484 100644 > > --- a/fs/f2fs/data.c > > +++ b/fs/f2fs/data.c > > @@ -282,21 +282,30 @@ static bool has_merged_page(struct f2fs_sb_info *sbi, > > struct inode *inode, > > nid_t ino, pgoff_t idx, enum page_type type) > > { > > enum page_type btype = PAGE_TYPE_OF_BIO(type); > > - struct f2fs_bio_info *io = >write_io[btype]; > > - bool ret; > > + enum temp_type temp; > > + struct f2fs_bio_info *io; > > + bool ret = false; > > > > - down_read(>io_rwsem); > > - ret = __has_merged_page(io, inode, ino, idx); > > - up_read(>io_rwsem); > > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > > + io = sbi->write_io[btype] + temp; > > + > > + down_read(>io_rwsem); > > + ret = __has_merged_page(io, inode, ino, idx); > > + up_read(>io_rwsem); > > + > > + /* TODO: use HOT temp only for meta pages now. */ > > + if (ret || btype == META) > > + break; > > + } > > return ret; > > } > > > > static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi, > > struct inode *inode, nid_t ino, pgoff_t idx, > > - enum page_type type) > > + enum page_type type, enum temp_type temp) > > { > > enum page_type btype = PAGE_TYPE_OF_BIO(type); > > - struct f2fs_bio_info *io = >write_io[btype]; > > + struct f2fs_bio_info *io = sbi->write_io[btype] + temp; > > > > down_write(>io_rwsem); > > > > @@ -316,17 +325,34 @@ static void __f2fs_submit_merged_write(struct > > f2fs_sb_info *sbi, > > up_write(>io_rwsem); > > } > > > > +static void __submit_merged_write_cond(struct f2fs_sb_info *sbi, > > + struct inode *inode, nid_t ino, pgoff_t idx, > > + enum page_type type, bool force) > > +{ > > + enum temp_type temp; > > + > > + if (!force && !has_merged_page(sbi, inode, ino, idx, type)) > > + return; > > + > > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > > + __f2fs_submit_merged_write(sbi, inode, ino, idx, type, temp); > > + > > + /* TODO: use HOT temp only for meta pages now. */ > > + if (type >= META) > > + break; > > + } > > +} > > + > > void f2fs_submit_merged_write(struct f2fs_sb_info *sbi, enum page_type > > type) > > { > > - __f2fs_submit_merged_write(sbi, NULL, 0, 0, type); > > + __submit_merged_write_cond(sbi, NULL, 0, 0, type, true); > > } > > > > void f2fs_submit_merged_write_cond(struct f2fs_sb_info *sbi, > > struct inode *inode, nid_t ino, pgoff_t idx, > > enum page_type type) > > { > > - if
Re: [PATCH RT] futex/rtmutex: Cure RT double blocking issue
2017-05-09 23:11 GMT+08:00 Thomas Gleixner: > RT has a problem when the wait on a futex/rtmutex got interrupted by a > timeout or a signal. task->pi_blocked_on is still set when returning from > rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock > after this. > > If the hash bucket lock is contended then the > BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in > task_blocks_on_rt_mutex() will trigger. > > This can be avoided by clearing task->pi_blocked_on in the return path of > rt_mutex_wait_proxy_lock() which removes the task from the boosting chain > of the rtmutex. That's correct because the task is not longer blocked on > it. > > Signed-off-by: Thomas Gleixner > Reported-by: Engleder Gerhard > --- > kernel/locking/rtmutex.c | 17 + > 1 file changed, 17 insertions(+) > > --- a/kernel/locking/rtmutex.c > +++ b/kernel/locking/rtmutex.c > @@ -2380,6 +2380,7 @@ int rt_mutex_wait_proxy_lock(struct rt_m >struct hrtimer_sleeper *to, >struct rt_mutex_waiter *waiter) > { > + struct task_struct *tsk = current; > int ret; > > raw_spin_lock_irq(>wait_lock); > @@ -2389,6 +2390,22 @@ int rt_mutex_wait_proxy_lock(struct rt_m > /* sleep on the mutex */ > ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL); Why not check the ret value to avoid lock/unlock tsk->pi_lock when acquires the rt_mutex successfully? Regards, Wanpeng Li > > + /* > +* RT has a problem here when the wait got interrupted by a timeout > +* or a signal. task->pi_blocked_on is still set. The task must > +* acquire the hash bucket lock when returning from this function. > +* > +* If the hash bucket lock is contended then the > +* BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in > +* task_blocks_on_rt_mutex() will trigger. This can be avoided by > +* clearing task->pi_blocked_on which removes the task from the > +* boosting chain of the rtmutex. That's correct because the task > +* is not longer blocked on it. > +*/ > + raw_spin_lock(>pi_lock); > + tsk->pi_blocked_on = NULL; > + raw_spin_unlock(>pi_lock); > + > raw_spin_unlock_irq(>wait_lock); > > return ret;
Re: [PATCH RT] futex/rtmutex: Cure RT double blocking issue
2017-05-09 23:11 GMT+08:00 Thomas Gleixner : > RT has a problem when the wait on a futex/rtmutex got interrupted by a > timeout or a signal. task->pi_blocked_on is still set when returning from > rt_mutex_wait_proxy_lock(). The task must acquire the hash bucket lock > after this. > > If the hash bucket lock is contended then the > BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in > task_blocks_on_rt_mutex() will trigger. > > This can be avoided by clearing task->pi_blocked_on in the return path of > rt_mutex_wait_proxy_lock() which removes the task from the boosting chain > of the rtmutex. That's correct because the task is not longer blocked on > it. > > Signed-off-by: Thomas Gleixner > Reported-by: Engleder Gerhard > --- > kernel/locking/rtmutex.c | 17 + > 1 file changed, 17 insertions(+) > > --- a/kernel/locking/rtmutex.c > +++ b/kernel/locking/rtmutex.c > @@ -2380,6 +2380,7 @@ int rt_mutex_wait_proxy_lock(struct rt_m >struct hrtimer_sleeper *to, >struct rt_mutex_waiter *waiter) > { > + struct task_struct *tsk = current; > int ret; > > raw_spin_lock_irq(>wait_lock); > @@ -2389,6 +2390,22 @@ int rt_mutex_wait_proxy_lock(struct rt_m > /* sleep on the mutex */ > ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter, NULL); Why not check the ret value to avoid lock/unlock tsk->pi_lock when acquires the rt_mutex successfully? Regards, Wanpeng Li > > + /* > +* RT has a problem here when the wait got interrupted by a timeout > +* or a signal. task->pi_blocked_on is still set. The task must > +* acquire the hash bucket lock when returning from this function. > +* > +* If the hash bucket lock is contended then the > +* BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in > +* task_blocks_on_rt_mutex() will trigger. This can be avoided by > +* clearing task->pi_blocked_on which removes the task from the > +* boosting chain of the rtmutex. That's correct because the task > +* is not longer blocked on it. > +*/ > + raw_spin_lock(>pi_lock); > + tsk->pi_blocked_on = NULL; > + raw_spin_unlock(>pi_lock); > + > raw_spin_unlock_irq(>wait_lock); > > return ret;
Re: [PATCH] f2fs: split bio cache
On 2017/5/11 7:50, Jaegeuk Kim wrote: > On 05/09, Chao Yu wrote: >> Hi Jaegeuk, >> >> On 2017/5/9 5:23, Jaegeuk Kim wrote: >>> Hi Chao, >>> >>> I can't see a strong reason to split meta from data/node and rename the >>> existing >>> function names. Instead, how about keeping the existing one while adding >>> some >>> page types to deal with log types? >> >> Hmm.. before write this patch, I have thought about this implementation which >> adds HOT_DATA/WARM_DATA/.. into page_type enum, but the final target is >> making >> different temperature log-header being with independent bio cache, io lock, >> and >> io list, eliminating interaction of different temperature IOs, also expanding >> f2fs' scalability when adding more log-headers. If we don't split meta from >> data/node, it's a little bit hard to approach this. What do you think? > > I submitted clean-up patches. How about this on top of them? After splitting, bio caches is connected to log-header, so why not moving bio cache into log-header (SM_I(sbi)->curseg_array)? after the merging: - we could avoid redundant enum. e.g. temp_type, page_type::{DATA/NODE}, since we have seg_type enum instead. - once we add special log-header like journal log or random IO log making DATA/NODE and HOT/WARM/COLD non-orthogonalization, we just need to change few codes to adjust it. How do you think? Thanks, > > --- > fs/f2fs/data.c | 51 > + > fs/f2fs/f2fs.h | 10 - > fs/f2fs/gc.c| 2 ++ > fs/f2fs/segment.c | 24 +++-- > fs/f2fs/segment.h | 4 > fs/f2fs/super.c | 21 --- > include/trace/events/f2fs.h | 14 - > 7 files changed, 102 insertions(+), 24 deletions(-) > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c > index 06bb2042385e..49b7e3918484 100644 > --- a/fs/f2fs/data.c > +++ b/fs/f2fs/data.c > @@ -282,21 +282,30 @@ static bool has_merged_page(struct f2fs_sb_info *sbi, > struct inode *inode, > nid_t ino, pgoff_t idx, enum page_type type) > { > enum page_type btype = PAGE_TYPE_OF_BIO(type); > - struct f2fs_bio_info *io = >write_io[btype]; > - bool ret; > + enum temp_type temp; > + struct f2fs_bio_info *io; > + bool ret = false; > > - down_read(>io_rwsem); > - ret = __has_merged_page(io, inode, ino, idx); > - up_read(>io_rwsem); > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > + io = sbi->write_io[btype] + temp; > + > + down_read(>io_rwsem); > + ret = __has_merged_page(io, inode, ino, idx); > + up_read(>io_rwsem); > + > + /* TODO: use HOT temp only for meta pages now. */ > + if (ret || btype == META) > + break; > + } > return ret; > } > > static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi, > struct inode *inode, nid_t ino, pgoff_t idx, > - enum page_type type) > + enum page_type type, enum temp_type temp) > { > enum page_type btype = PAGE_TYPE_OF_BIO(type); > - struct f2fs_bio_info *io = >write_io[btype]; > + struct f2fs_bio_info *io = sbi->write_io[btype] + temp; > > down_write(>io_rwsem); > > @@ -316,17 +325,34 @@ static void __f2fs_submit_merged_write(struct > f2fs_sb_info *sbi, > up_write(>io_rwsem); > } > > +static void __submit_merged_write_cond(struct f2fs_sb_info *sbi, > + struct inode *inode, nid_t ino, pgoff_t idx, > + enum page_type type, bool force) > +{ > + enum temp_type temp; > + > + if (!force && !has_merged_page(sbi, inode, ino, idx, type)) > + return; > + > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > + __f2fs_submit_merged_write(sbi, inode, ino, idx, type, temp); > + > + /* TODO: use HOT temp only for meta pages now. */ > + if (type >= META) > + break; > + } > +} > + > void f2fs_submit_merged_write(struct f2fs_sb_info *sbi, enum page_type type) > { > - __f2fs_submit_merged_write(sbi, NULL, 0, 0, type); > + __submit_merged_write_cond(sbi, NULL, 0, 0, type, true); > } > > void f2fs_submit_merged_write_cond(struct f2fs_sb_info *sbi, > struct inode *inode, nid_t ino, pgoff_t idx, > enum page_type type) > { > - if (has_merged_page(sbi, inode, ino, idx, type)) > - __f2fs_submit_merged_write(sbi, inode, ino, idx, type); > + __submit_merged_write_cond(sbi, inode, ino, idx, type, false); > } > > void f2fs_flush_merged_writes(struct f2fs_sb_info *sbi) > @@ -369,7 +395,7 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio) > { > struct f2fs_sb_info *sbi = fio->sbi; > enum page_type btype =
Re: [PATCH] f2fs: split bio cache
On 2017/5/11 7:50, Jaegeuk Kim wrote: > On 05/09, Chao Yu wrote: >> Hi Jaegeuk, >> >> On 2017/5/9 5:23, Jaegeuk Kim wrote: >>> Hi Chao, >>> >>> I can't see a strong reason to split meta from data/node and rename the >>> existing >>> function names. Instead, how about keeping the existing one while adding >>> some >>> page types to deal with log types? >> >> Hmm.. before write this patch, I have thought about this implementation which >> adds HOT_DATA/WARM_DATA/.. into page_type enum, but the final target is >> making >> different temperature log-header being with independent bio cache, io lock, >> and >> io list, eliminating interaction of different temperature IOs, also expanding >> f2fs' scalability when adding more log-headers. If we don't split meta from >> data/node, it's a little bit hard to approach this. What do you think? > > I submitted clean-up patches. How about this on top of them? After splitting, bio caches is connected to log-header, so why not moving bio cache into log-header (SM_I(sbi)->curseg_array)? after the merging: - we could avoid redundant enum. e.g. temp_type, page_type::{DATA/NODE}, since we have seg_type enum instead. - once we add special log-header like journal log or random IO log making DATA/NODE and HOT/WARM/COLD non-orthogonalization, we just need to change few codes to adjust it. How do you think? Thanks, > > --- > fs/f2fs/data.c | 51 > + > fs/f2fs/f2fs.h | 10 - > fs/f2fs/gc.c| 2 ++ > fs/f2fs/segment.c | 24 +++-- > fs/f2fs/segment.h | 4 > fs/f2fs/super.c | 21 --- > include/trace/events/f2fs.h | 14 - > 7 files changed, 102 insertions(+), 24 deletions(-) > > diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c > index 06bb2042385e..49b7e3918484 100644 > --- a/fs/f2fs/data.c > +++ b/fs/f2fs/data.c > @@ -282,21 +282,30 @@ static bool has_merged_page(struct f2fs_sb_info *sbi, > struct inode *inode, > nid_t ino, pgoff_t idx, enum page_type type) > { > enum page_type btype = PAGE_TYPE_OF_BIO(type); > - struct f2fs_bio_info *io = >write_io[btype]; > - bool ret; > + enum temp_type temp; > + struct f2fs_bio_info *io; > + bool ret = false; > > - down_read(>io_rwsem); > - ret = __has_merged_page(io, inode, ino, idx); > - up_read(>io_rwsem); > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > + io = sbi->write_io[btype] + temp; > + > + down_read(>io_rwsem); > + ret = __has_merged_page(io, inode, ino, idx); > + up_read(>io_rwsem); > + > + /* TODO: use HOT temp only for meta pages now. */ > + if (ret || btype == META) > + break; > + } > return ret; > } > > static void __f2fs_submit_merged_write(struct f2fs_sb_info *sbi, > struct inode *inode, nid_t ino, pgoff_t idx, > - enum page_type type) > + enum page_type type, enum temp_type temp) > { > enum page_type btype = PAGE_TYPE_OF_BIO(type); > - struct f2fs_bio_info *io = >write_io[btype]; > + struct f2fs_bio_info *io = sbi->write_io[btype] + temp; > > down_write(>io_rwsem); > > @@ -316,17 +325,34 @@ static void __f2fs_submit_merged_write(struct > f2fs_sb_info *sbi, > up_write(>io_rwsem); > } > > +static void __submit_merged_write_cond(struct f2fs_sb_info *sbi, > + struct inode *inode, nid_t ino, pgoff_t idx, > + enum page_type type, bool force) > +{ > + enum temp_type temp; > + > + if (!force && !has_merged_page(sbi, inode, ino, idx, type)) > + return; > + > + for (temp = HOT; temp < NR_TEMP_TYPE; temp++) { > + __f2fs_submit_merged_write(sbi, inode, ino, idx, type, temp); > + > + /* TODO: use HOT temp only for meta pages now. */ > + if (type >= META) > + break; > + } > +} > + > void f2fs_submit_merged_write(struct f2fs_sb_info *sbi, enum page_type type) > { > - __f2fs_submit_merged_write(sbi, NULL, 0, 0, type); > + __submit_merged_write_cond(sbi, NULL, 0, 0, type, true); > } > > void f2fs_submit_merged_write_cond(struct f2fs_sb_info *sbi, > struct inode *inode, nid_t ino, pgoff_t idx, > enum page_type type) > { > - if (has_merged_page(sbi, inode, ino, idx, type)) > - __f2fs_submit_merged_write(sbi, inode, ino, idx, type); > + __submit_merged_write_cond(sbi, inode, ino, idx, type, false); > } > > void f2fs_flush_merged_writes(struct f2fs_sb_info *sbi) > @@ -369,7 +395,7 @@ int f2fs_submit_page_write(struct f2fs_io_info *fio) > { > struct f2fs_sb_info *sbi = fio->sbi; > enum page_type btype =
Re: [alsa-devel] [PATCH v2 3/3] ASoC: stm32: Add full duplex support to i2s (fwd)
Please check whether the lock taken on line 585 should be freed on line 598. The report also mentions line 652, although that is not shown. julia -- Forwarded message -- Date: Thu, 11 May 2017 09:49:21 +0800 From: kbuild test robot <fengguang...@intel.com> To: kbu...@01.org Cc: Julia Lawall <julia.law...@lip6.fr> Subject: Re: [alsa-devel] [PATCH v2 3/3] ASoC: stm32: Add full duplex support to i2s Hi olivier, [auto build test WARNING on asoc/for-next] [also build test WARNING on next-20170510] [cannot apply to v4.11] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/olivier-moysan/Add-I2S-driver/20170511-075335 base: https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next :: branch date: 2 hours ago :: commit date: 2 hours ago >> sound/soc/stm/stm32_i2s.c:598:3-9: preceding lock on line 585 >> sound/soc/stm/stm32_i2s.c:598:3-9: preceding lock on line 585 sound/soc/stm/stm32_i2s.c:652:3-9: preceding lock on line 585 git remote add linux-review https://github.com/0day-ci/linux git remote update linux-review git checkout 9f5663c4aa8961bf2a547683af787e606503376e vim +598 sound/soc/stm/stm32_i2s.c 1ebd36ae olivier moysan 2017-05-10 579 { 1ebd36ae olivier moysan 2017-05-10 580 struct stm32_i2s_data *i2s = snd_soc_dai_get_drvdata(cpu_dai); 1ebd36ae olivier moysan 2017-05-10 581 bool playback_flg = (substream->stream == SNDRV_PCM_STREAM_PLAYBACK); 9f5663c4 olivier moysan 2017-05-10 582 u32 cfg1_mask, ier; 1ebd36ae olivier moysan 2017-05-10 583 int ret; 1ebd36ae olivier moysan 2017-05-10 584 9f5663c4 olivier moysan 2017-05-10 @585 spin_lock(>lock_fd); 9f5663c4 olivier moysan 2017-05-10 586 1ebd36ae olivier moysan 2017-05-10 587 switch (cmd) { 1ebd36ae olivier moysan 2017-05-10 588 case SNDRV_PCM_TRIGGER_START: 1ebd36ae olivier moysan 2017-05-10 589 case SNDRV_PCM_TRIGGER_RESUME: 1ebd36ae olivier moysan 2017-05-10 590 case SNDRV_PCM_TRIGGER_PAUSE_RELEASE: 1ebd36ae olivier moysan 2017-05-10 591 /* Enable i2s */ 1ebd36ae olivier moysan 2017-05-10 592 dev_dbg(cpu_dai->dev, "start I2S\n"); 1ebd36ae olivier moysan 2017-05-10 593 1ebd36ae olivier moysan 2017-05-10 594 ret = regmap_update_bits(i2s->regmap, STM32_I2S_CR1_REG, 1ebd36ae olivier moysan 2017-05-10 595 I2S_CR1_SPE, I2S_CR1_SPE); 1ebd36ae olivier moysan 2017-05-10 596 if (ret < 0) { 1ebd36ae olivier moysan 2017-05-10 597 dev_err(cpu_dai->dev, "Error %d enabling I2S\n", ret); 1ebd36ae olivier moysan 2017-05-10 @598 return ret; 1ebd36ae olivier moysan 2017-05-10 599 } 1ebd36ae olivier moysan 2017-05-10 600 1ebd36ae olivier moysan 2017-05-10 601 ret = regmap_update_bits(i2s->regmap, STM32_I2S_CR1_REG, :: The code at line 598 was first introduced by commit :: 1ebd36ae636af5c130a9df4d9a921e4ed984a6e9 ASoC: stm32: Add I2S driver :: TO: olivier moysan <olivier.moy...@st.com> :: CC: 0day robot <fengguang...@intel.com> --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [alsa-devel] [PATCH v2 3/3] ASoC: stm32: Add full duplex support to i2s (fwd)
Please check whether the lock taken on line 585 should be freed on line 598. The report also mentions line 652, although that is not shown. julia -- Forwarded message -- Date: Thu, 11 May 2017 09:49:21 +0800 From: kbuild test robot To: kbu...@01.org Cc: Julia Lawall Subject: Re: [alsa-devel] [PATCH v2 3/3] ASoC: stm32: Add full duplex support to i2s Hi olivier, [auto build test WARNING on asoc/for-next] [also build test WARNING on next-20170510] [cannot apply to v4.11] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/olivier-moysan/Add-I2S-driver/20170511-075335 base: https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next :: branch date: 2 hours ago :: commit date: 2 hours ago >> sound/soc/stm/stm32_i2s.c:598:3-9: preceding lock on line 585 >> sound/soc/stm/stm32_i2s.c:598:3-9: preceding lock on line 585 sound/soc/stm/stm32_i2s.c:652:3-9: preceding lock on line 585 git remote add linux-review https://github.com/0day-ci/linux git remote update linux-review git checkout 9f5663c4aa8961bf2a547683af787e606503376e vim +598 sound/soc/stm/stm32_i2s.c 1ebd36ae olivier moysan 2017-05-10 579 { 1ebd36ae olivier moysan 2017-05-10 580 struct stm32_i2s_data *i2s = snd_soc_dai_get_drvdata(cpu_dai); 1ebd36ae olivier moysan 2017-05-10 581 bool playback_flg = (substream->stream == SNDRV_PCM_STREAM_PLAYBACK); 9f5663c4 olivier moysan 2017-05-10 582 u32 cfg1_mask, ier; 1ebd36ae olivier moysan 2017-05-10 583 int ret; 1ebd36ae olivier moysan 2017-05-10 584 9f5663c4 olivier moysan 2017-05-10 @585 spin_lock(>lock_fd); 9f5663c4 olivier moysan 2017-05-10 586 1ebd36ae olivier moysan 2017-05-10 587 switch (cmd) { 1ebd36ae olivier moysan 2017-05-10 588 case SNDRV_PCM_TRIGGER_START: 1ebd36ae olivier moysan 2017-05-10 589 case SNDRV_PCM_TRIGGER_RESUME: 1ebd36ae olivier moysan 2017-05-10 590 case SNDRV_PCM_TRIGGER_PAUSE_RELEASE: 1ebd36ae olivier moysan 2017-05-10 591 /* Enable i2s */ 1ebd36ae olivier moysan 2017-05-10 592 dev_dbg(cpu_dai->dev, "start I2S\n"); 1ebd36ae olivier moysan 2017-05-10 593 1ebd36ae olivier moysan 2017-05-10 594 ret = regmap_update_bits(i2s->regmap, STM32_I2S_CR1_REG, 1ebd36ae olivier moysan 2017-05-10 595 I2S_CR1_SPE, I2S_CR1_SPE); 1ebd36ae olivier moysan 2017-05-10 596 if (ret < 0) { 1ebd36ae olivier moysan 2017-05-10 597 dev_err(cpu_dai->dev, "Error %d enabling I2S\n", ret); 1ebd36ae olivier moysan 2017-05-10 @598 return ret; 1ebd36ae olivier moysan 2017-05-10 599 } 1ebd36ae olivier moysan 2017-05-10 600 1ebd36ae olivier moysan 2017-05-10 601 ret = regmap_update_bits(i2s->regmap, STM32_I2S_CR1_REG, :: The code at line 598 was first introduced by commit :: 1ebd36ae636af5c130a9df4d9a921e4ed984a6e9 ASoC: stm32: Add I2S driver :: TO: olivier moysan :: CC: 0day robot --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [PATCH v7 0/7] Introduce ZONE_CMA
Sorry for the late response. I was on a vacation. On Tue, May 02, 2017 at 03:32:29PM +0200, Michal Hocko wrote: > On Tue 02-05-17 13:01:32, Joonsoo Kim wrote: > > On Thu, Apr 27, 2017 at 05:06:36PM +0200, Michal Hocko wrote: > [...] > > > I see this point and I agree that using a specific zone might be a > > > _nicer_ solution in the end but you have to consider another aspects as > > > well. The main one I am worried about is a long term maintainability. > > > We are really out of page flags and consuming one for a rather specific > > > usecase is not good. Look at ZONE_DMA. I am pretty sure that almost > > > no sane HW needs 16MB zone anymore, yet we have hard time to get rid > > > of it and so we have that memory laying around unused all the time > > > and blocking one page flag bit. CMA falls into a similar category > > > AFAIU. I wouldn't be all that surprised if a future HW will not need CMA > > > allocations in few years, yet we will have to fight to get rid of it > > > like we do with ZONE_DMA. And not only that. We will also have to fight > > > finding page flags for other more general usecases in the meantime. > > > > This maintenance problem is inherent. This problem exists even if we > > uses MIGRATETYPE approach. We cannot remove many hooks for CMA if a > > future HW will not need CMA allocation in few years. The only > > difference is that one takes single zone bit only for CMA user and the > > other approach takes many hooks that we need to take care about it all > > the time. > > And I consider this a big difference. Because while hooks are not nice > they will affect CMA users (in a sense of bugs/performance etc.). While > an additional bit consumed will affect potential future and more generic > features. Because these hooks are so tricky and are spread on many places, bugs about these hooks can be made by *non-CMA* user and they hurt *CMA* user. These hooks could also delay non-CMA user's development speed since there are many hooks about CMA and understanding how CMA is managed is rather difficult. I think that this is a big maintenance overhead not only for CMA user but also for non-CMA user. So, I think that it can justify additional bit consumed. > > [...] > > > I believe that the overhead in the hot path is not such a big deal. We > > > have means to make it 0 when CMA is not used by jumplabels. I assume > > > that the vast majority of systems will not use CMA. Those systems which > > > use CMA should be able to cope with some slight overhead IMHO. > > > > Please don't underestimate number of CMA user. Most of android device > > uses CMA. So, there would be more devices using CMA than the server > > not using CMA. They also have a right to experience the best performance. > > This is not a fair comparison, though. Android development model is much > more faster and tend to not care about future maintainability at all. I > do not know about any android device that would run on a clean vanilla > kernel because vendors simply do not care enough (or have time) to put > the code into a proper shape to upstream it. I understand that this > model might work quite well for rapidly changing and moving mobile or > IoT segment but it is not the greatest fit to motivate the core kernel > subsystem development. We are not in the drivers space! > > [...] > > > This looks like a nice clean up. Those ifdefs are ugly as hell. One > > > could argue that some of that could be cleaned up by simply adding some > > > helpers (with a jump label to reduce the overhead), though. But is this > > > really strong enough reason to bring the whole zone in? I am not really > > > convinced to be honest. > > > > Please don't underestimate the benefit of this patchset. > > I have said that we need *more* hooks to fix all the problems. > > > > And, please think that this code removal is not only code removal but > > also concept removal. With this removing, we don't need to consider > > ALLOC_CMA for alloc_flags when calling zone_watermark_ok(). There are > > many bugs on it and it still remains. We don't need to consider > > pageblock migratetype when handling pageblock migratetype. We don't > > need to take a great care about calculating the number of CMA > > freepages. > > And all this can be isolated to CMA specific hooks with mostly minimum > impact to most users. I hear you saying that zone approach is more natural > and I would agree if we wouldn't have to care about the number of zones. I attach a solution about one more bit in page flags although I don't agree with your opinion that additional bit is no-go approach. Just note that we have already used three bits for zone encoding in some configuration due to ZONE_DEVICE. > > > > [...] > > > > > > > > Please do _not_ take this as a NAK from me. At least not at this > > > > > time. I > > > > > am still trying to understand all the consequences but my intuition > > > > > tells me that building on top of highmem like approach will turn out > >
Re: [PATCH v7 0/7] Introduce ZONE_CMA
Sorry for the late response. I was on a vacation. On Tue, May 02, 2017 at 03:32:29PM +0200, Michal Hocko wrote: > On Tue 02-05-17 13:01:32, Joonsoo Kim wrote: > > On Thu, Apr 27, 2017 at 05:06:36PM +0200, Michal Hocko wrote: > [...] > > > I see this point and I agree that using a specific zone might be a > > > _nicer_ solution in the end but you have to consider another aspects as > > > well. The main one I am worried about is a long term maintainability. > > > We are really out of page flags and consuming one for a rather specific > > > usecase is not good. Look at ZONE_DMA. I am pretty sure that almost > > > no sane HW needs 16MB zone anymore, yet we have hard time to get rid > > > of it and so we have that memory laying around unused all the time > > > and blocking one page flag bit. CMA falls into a similar category > > > AFAIU. I wouldn't be all that surprised if a future HW will not need CMA > > > allocations in few years, yet we will have to fight to get rid of it > > > like we do with ZONE_DMA. And not only that. We will also have to fight > > > finding page flags for other more general usecases in the meantime. > > > > This maintenance problem is inherent. This problem exists even if we > > uses MIGRATETYPE approach. We cannot remove many hooks for CMA if a > > future HW will not need CMA allocation in few years. The only > > difference is that one takes single zone bit only for CMA user and the > > other approach takes many hooks that we need to take care about it all > > the time. > > And I consider this a big difference. Because while hooks are not nice > they will affect CMA users (in a sense of bugs/performance etc.). While > an additional bit consumed will affect potential future and more generic > features. Because these hooks are so tricky and are spread on many places, bugs about these hooks can be made by *non-CMA* user and they hurt *CMA* user. These hooks could also delay non-CMA user's development speed since there are many hooks about CMA and understanding how CMA is managed is rather difficult. I think that this is a big maintenance overhead not only for CMA user but also for non-CMA user. So, I think that it can justify additional bit consumed. > > [...] > > > I believe that the overhead in the hot path is not such a big deal. We > > > have means to make it 0 when CMA is not used by jumplabels. I assume > > > that the vast majority of systems will not use CMA. Those systems which > > > use CMA should be able to cope with some slight overhead IMHO. > > > > Please don't underestimate number of CMA user. Most of android device > > uses CMA. So, there would be more devices using CMA than the server > > not using CMA. They also have a right to experience the best performance. > > This is not a fair comparison, though. Android development model is much > more faster and tend to not care about future maintainability at all. I > do not know about any android device that would run on a clean vanilla > kernel because vendors simply do not care enough (or have time) to put > the code into a proper shape to upstream it. I understand that this > model might work quite well for rapidly changing and moving mobile or > IoT segment but it is not the greatest fit to motivate the core kernel > subsystem development. We are not in the drivers space! > > [...] > > > This looks like a nice clean up. Those ifdefs are ugly as hell. One > > > could argue that some of that could be cleaned up by simply adding some > > > helpers (with a jump label to reduce the overhead), though. But is this > > > really strong enough reason to bring the whole zone in? I am not really > > > convinced to be honest. > > > > Please don't underestimate the benefit of this patchset. > > I have said that we need *more* hooks to fix all the problems. > > > > And, please think that this code removal is not only code removal but > > also concept removal. With this removing, we don't need to consider > > ALLOC_CMA for alloc_flags when calling zone_watermark_ok(). There are > > many bugs on it and it still remains. We don't need to consider > > pageblock migratetype when handling pageblock migratetype. We don't > > need to take a great care about calculating the number of CMA > > freepages. > > And all this can be isolated to CMA specific hooks with mostly minimum > impact to most users. I hear you saying that zone approach is more natural > and I would agree if we wouldn't have to care about the number of zones. I attach a solution about one more bit in page flags although I don't agree with your opinion that additional bit is no-go approach. Just note that we have already used three bits for zone encoding in some configuration due to ZONE_DEVICE. > > > > [...] > > > > > > > > Please do _not_ take this as a NAK from me. At least not at this > > > > > time. I > > > > > am still trying to understand all the consequences but my intuition > > > > > tells me that building on top of highmem like approach will turn out > >