Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
On 07.08.20 06:32, Andrew Morton wrote: > On Fri, 3 Jul 2020 18:28:23 +0530 Srikar Dronamraju > wrote: > >>> The memory hotplug changes that somehow because you can hotremove numa >>> nodes and therefore make the nodemask sparse but that is not a common >>> case. I am not sure what would happen if a completely new node was added >>> and its corresponding node was already used by the renumbered one >>> though. It would likely conflate the two I am afraid. But I am not sure >>> this is really possible with x86 and a lack of a bug report would >>> suggest that nobody is doing that at least. >>> >> >> JFYI, >> Satheesh copied in this mailchain had opened a bug a year on crash with vcpu >> hotplug on memoryless node. >> >> https://bugzilla.kernel.org/show_bug.cgi?id=202187 > > So... do we merge this patch or not? Seems that the overall view is > "risky but nobody is likely to do anything better any time soon"? I recall the issue Michal saw was "fix powerpc" vs. "break other architectures". @Michal how should we proceed? At least x86-64 won't be affected IIUC. -- Thanks, David / dhildenb
Re: [RESEND PATCH] media: atomisp: Replace trace_printk by pr_info
On Fri, Aug 7, 2020 at 2:28 PM Greg Kroah-Hartman wrote: > > On Fri, Aug 07, 2020 at 09:50:23AM +0800, Nicolas Boichat wrote: > > On Fri, Jul 24, 2020 at 8:41 PM Nicolas Boichat > > wrote: > > > > > > On Fri, Jul 10, 2020 at 3:03 PM Greg Kroah-Hartman > > > wrote: > > > > > > > > On Fri, Jul 10, 2020 at 02:45:29PM +0800, Nicolas Boichat wrote: > > > > > trace_printk should not be used in production code, replace it > > > > > call with pr_info. > > > > > > > > > > Signed-off-by: Nicolas Boichat > > > > > --- > > > > > Sent this before as part of a series (whose 4th patch was a > > > > > change that allows to detect such trace_printk), but maybe it's > > > > > easier to get individual maintainer attention by splitting it. > > > > > > > > Mauro should take this soon: > > > > > > > > Acked-by: Greg Kroah-Hartman > > > > > > Mauro: did you get a chance to look at this? (and the other similar > > > patch "media: camss: vfe: Use trace_printk for debugging only") > > > > Mauro: Another gentle ping. Thanks. > > It's the middle of the merge window, maintainers can't do anything until > after 5.9-rc1 is out, sorry. Huh, wait, looks like Mauro _did_ pick it (found it in this email "[GIT PULL for v5.8-rc7] media fixes"). My bad then, I was expecting an ack ,-) Thanks! > greg k-h
Re: [GIT] Networking
On Thu, Aug 6, 2020 at 11:23 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 5:32 PM John Stultz wrote: > > > > On Thu, Aug 6, 2020 at 4:17 PM Eric Dumazet wrote: > > > On 8/6/20 2:39 PM, John Stultz wrote: > > > > [ 19.709492] Unable to handle kernel access to user memory outside > > > > uaccess routines at virtual address 006f53337070 > > > > [ 19.726539] Mem abort info: > > > > [ 19.726544] ESR = 0x960f > > > > [ 19.741323] EC = 0x25: DABT (current EL), IL = 32 bits > > > > [ 19.741326] SET = 0, FnV = 0 > > > > [ 19.761185] EA = 0, S1PTW = 0 > > > > [ 19.761188] Data abort info: > > > > [ 19.761190] ISV = 0, ISS = 0x000f > > > > [ 19.761192] CM = 0, WnR = 0 > > > > [ 19.761199] user pgtable: 4k pages, 39-bit VAs, pgdp=00016e9e9000 > > > > [ 19.777584] [006f53337070] pgd=00016e99e003, > > > > p4d=00016e99e003, pud=00016e99e003, pmd=00016e99a003, > > > > pte=00e800016d3c7f53 > > > > [ 19.789205] Internal error: Oops: 960f [#1] PREEMPT SMP > > > > [ 19.789211] Modules linked in: > > > > [ 19.797153] CPU: 7 PID: 364 Comm: iptables-restor Tainted: G > > > > W 5.8.0-mainline-08255-gf9e74a8eb6f3 #3350 > > > > [ 19.797156] Hardware name: Thundercomm Dragonboard 845c (DT) > > > > [ 19.797161] pstate: a045 (NzCv daif +PAN -UAO BTYPE=--) > > > > [ 19.797177] pc : do_ipt_set_ctl+0x304/0x610 > > > > [ 19.807891] lr : do_ipt_set_ctl+0x50/0x610 > > > > [ 19.807894] sp : ffc0139bbba0 > > > > [ 19.807898] x29: ffc0139bbba0 x28: ff80f07a3800 > > > > [ 19.846468] x27: x26: > > > > [ 19.846472] x25: x24: 0698 > > > > [ 19.846476] x23: ffec8eb0cc80 x22: 0040 > > > > [ 19.846480] x21: b46f53337070 x20: ffec8eb0c000 > > > > [ 19.846484] x19: ffec8e9e9000 x18: > > > > [ 19.846487] x17: x16: > > > > [ 19.846491] x15: x14: > > > > [ 19.846495] x13: x12: > > > > [ 19.846501] x11: x10: > > > > [ 19.856005] x9 : x8 : > > > > [ 19.856008] x7 : ffec8e9e9d08 x6 : > > > > [ 19.856012] x5 : x4 : 0213 > > > > [ 19.856015] x3 : 0001ffdeffef x2 : 11ded3fb0bb85e00 > > > > [ 19.856019] x1 : 0027 x0 : 0080 > > > > [ 19.856024] Call trace: > > > > [ 19.866319] do_ipt_set_ctl+0x304/0x610 > > > > [ 19.866327] nf_setsockopt+0x64/0xa8 > > > > [ 19.866332] ip_setsockopt+0x21c/0x1710 > > > > [ 19.866338] raw_setsockopt+0x50/0x1b8 > > > > [ 19.866347] sock_common_setsockopt+0x50/0x68 > > > > [ 19.882672] __sys_setsockopt+0x120/0x1c8 > > > > [ 19.882677] __arm64_sys_setsockopt+0x30/0x40 > > > > [ 19.882686] el0_svc_common.constprop.3+0x78/0x188 > > > > [ 19.882691] do_el0_svc+0x80/0xa0 > > > > [ 19.882699] el0_sync_handler+0x134/0x1a0 > > > > [ 19.901555] el0_sync+0x140/0x180 > > > > [ 19.901564] Code: aa1503e0 97fffd3e 2a0003f5 1780 (a9401ea6) > > > > [ 19.901569] ---[ end trace 22010e9688ae248f ]--- > > > > [ 19.913033] Kernel panic - not syncing: Fatal exception > > > > [ 19.913042] SMP: stopping secondary CPUs > > > > [ 20.138885] Kernel Offset: 0x2c7d08 from 0xffc01000 > > > > [ 20.138887] PHYS_OFFSET: 0xfffa8000 > > > > [ 20.138894] CPU features: 0x0040002,2a80a218 > > > > [ 20.138898] Memory Limit: none > > > > > > > > I'll continue to work on bisecting this down further, but figured I'd > > > > share now as you or someone else might be able to tell whats wrong > > > > from the trace. > > > > > > > > > > Can you try at commit c2f12630c60ff33a9cafd221646053fc10ec59b6 > > > ("netfilter: switch nf_setsockopt to sockptr_t") > > > (and right before it) > > > > > > So I rebased my patches ontop of that commit, but I'm not seeing the > > crash there. I also hand applied your suggested patch when I did see > > the issue, but that didn't seem to fix it either. > > > > So far I've only narrowed it down to between > > 65ccbbda52288527b7c48087eb33bb0757975875..530fe9d433b9e60251bb8fdc5dddecbc486a50ef. > > But I'll keep rebase-bisecting it down. > > So I've finally rebase-bisected it down to: > a31edb2059ed ("net: improve the user pointer check in init_user_sockptr") > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a31edb2059ed4e498f9aa8230c734b59d0ad797a > > And reverting that from linus/HEAD (at least from this morning) seems > to avoid it. > > Seems like it is just adding extra checks on the data passed, so maybe > existing trouble from a different driver is the issue here, but it's > not really clear from the crash what might be wrong. > > Suggestions would be greatly appreciated! And while I'm back to being able to boot with the above reverted, wifi
Re: [PATCH 1/2] module: Correctly truncate sysfs sections output
On Thu, Aug 06, 2020 at 11:35:38PM -0700, Kees Cook wrote: > The only-root-readable /sys/module/$module/sections/$section files > did not truncate their output to the available buffer size. While most > paths into the kernfs read handlers end up using PAGE_SIZE buffers, > it's possible to get there through other paths (e.g. splice, sendfile). > Actually limit the output to the "count" passed into the read function, > and report it back correctly. *sigh* Ugh, never thought about that... Reviewed-by: Greg Kroah-Hartman
Re: [PATCH] staging: Add Mediatek High Frequency Manager Framework
On Fri, Aug 07, 2020 at 02:30:29PM +0800, hongxu.zhao wrote: > On Thu, 2020-08-06 at 12:53 +0200, Greg Kroah-Hartman wrote: > > On Thu, Aug 06, 2020 at 04:32:28PM +0800, hongxu.zhao wrote: > > > On Tue, 2020-08-04 at 10:11 +0200, Greg Kroah-Hartman wrote: > > > > On Tue, Aug 04, 2020 at 03:52:49PM +0800, hongxu.zhao wrote: > > > > > Add a new sensor framework into linux kernel which can support multi > > > > > client request sensor data. > > > > > There are the following features: > > > > > 1.Ringbuffer between manager and client; > > > > > 2.Kernel space user interface; > > > > > 3.User space user interface with syscall; > > > > > 4.Each client hang detect mechanism; > > > > > 5.Polling timer management in framework no need driver concern; > > > > > 6.Polling kthread work intergrated into a single kthread > > > > > worker to save system resources in framework no need driver > > > > > concern; > > > > > 7.Proc file system to show manager device and client details; > > > > > 8.Compitable with android and widely used in many mediatek > > > > > platform products; > > > > > > > > > > Change-Id: I6361cdc2d51de50f66eede7df099c4575e7ec473 > > > > > > > > Did you not run checkpatch.pl on this? :) > > > > > > > > No need for change-id here. > > > > > > > > But, most importantly, why is this in drivers/staging? What keeps it > > > > from being in the "real" part of the kernel? I need a TODO file in the > > > > directory of the driver listing what remains to be done and who is > > > > responsible for doing this work and reviewing patches. > > > > > > > > Can you resend this with that file added and the Change-id removed? > > > > > > > > Also, why not just use the IIO interface, why are you creating > > > > yet-another api for sensors? We already have 2, making a third seems > > > > like something that guarantees this will never be mergable to the > > > > correct part of the kernel. > > > > > > > > And finally, /proc/ is not for devices, that is what sysfs is for, > > > > please use that. > > > > > > I have modified checkpatch issue, but blocked by ARCH=alpha build error > > > and I can't reproduce this build error in mediatek environment. I need > > > spend some time setting up an environment to solve this problem and will > > > send you the latest patch together after solving the problem of alpha > > > build error. > > > > If you can't easily fix the alpha issue, you can just mark the driver as > > not able to be built there for now. > > > > > Firstly I want keep it in the real part of kernel and I send mail to > > > community to find the right maintainer, unfortunately, several emails > > > were not answered. > > > > Pointers to those emails on lore.kernel.org? > > I sended mail to linux-kernel@vger.kernel.org And that was it? You need to at least cc: people involved so we can see it, no one can keep up with lkml on its own. > > > Secondly I found iio upstream history it also started from staging at > > > the beginning, maybe staging is the best start until it become mature we > > > can move it to the real part of kernel. > > > > iio started in staging, but now that the infrastructure is out of it, > > there should not be any reason that new drivers be forced to go into > > staging too. You only want to do it this way if for some reason you can > > not get the work done on your own. > > Yes, I want started in staging and want more sensor driver(vendor > driver) added to high frequency manager framework. Again, don't add stuff to staging unless there is a good reason it needs to be there. > > > Actually, we have already assessed IIO subsystem, but the conclusion is > > > that it doesn't meet our requirement: > > > 1. iio doesn't have sensor manager in kernel space. > > > > What exactly do you mean by this? Who needs to be a "manager" here? > > >From our experience, mediatek have many platforms and different > customers have different requirement. > > Firstly customer requirement: >Kernel space | user space > > driver -- framework -- syscall -- client get data in process > > Face this demand, first architecture have no manager service. Why is a "requirement" divided up this way? Shouldn't the only requirement be that a specific userspace program get the needed information? Other than that, the architecture below it does not have to be a specific way, right? > Secondly customer requirement: > Kernel space | user space > - > driver -- framework -- syscall -- framework service --IPC -- client A > > |--IPC --client B > > Face this demand, second architecture we add a sensor manager framework > as service in user space, receive client's enable disable command and > dispatch data to those clients. > Then second architecture have a sensor manager in user space and through > ip
[PATCH net-next] vmxnet3: use correct tcp hdr length when packet is encapsulated
'Commit dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support")' added support for encapsulation offload. However, while calculating tcp hdr length, it does not take into account if the packet is encapsulated or not. This patch fixes this issue by using correct reference for inner tcp header. Fixes: dacce2be3312 ("vmxnet3: add geneve and vxlan tunnel offload support") Signed-off-by: Ronak Doshi Acked-by: Guolin Yang --- drivers/net/vmxnet3/vmxnet3_drv.c | 3 ++- drivers/net/vmxnet3/vmxnet3_int.h | 4 ++-- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c index ca395f9679d0..2818015324b8 100644 --- a/drivers/net/vmxnet3/vmxnet3_drv.c +++ b/drivers/net/vmxnet3/vmxnet3_drv.c @@ -886,7 +886,8 @@ vmxnet3_parse_hdr(struct sk_buff *skb, struct vmxnet3_tx_queue *tq, switch (protocol) { case IPPROTO_TCP: - ctx->l4_hdr_size = tcp_hdrlen(skb); + ctx->l4_hdr_size = skb->encapsulation ? inner_tcp_hdrlen(skb) : + tcp_hdrlen(skb); break; case IPPROTO_UDP: ctx->l4_hdr_size = sizeof(struct udphdr); diff --git a/drivers/net/vmxnet3/vmxnet3_int.h b/drivers/net/vmxnet3/vmxnet3_int.h index 5d2b062215a2..f99e3327a7b0 100644 --- a/drivers/net/vmxnet3/vmxnet3_int.h +++ b/drivers/net/vmxnet3/vmxnet3_int.h @@ -69,12 +69,12 @@ /* * Version numbers */ -#define VMXNET3_DRIVER_VERSION_STRING "1.5.0.0-k" +#define VMXNET3_DRIVER_VERSION_STRING "1.5.1.0-k" /* Each byte of this 32-bit integer encodes a version number in * VMXNET3_DRIVER_VERSION_STRING. */ -#define VMXNET3_DRIVER_VERSION_NUM 0x0105 +#define VMXNET3_DRIVER_VERSION_NUM 0x01050100 #if defined(CONFIG_PCI_MSI) /* RSS only makes sense if MSI-X is supported. */ -- 2.11.0
linux-next: Tree for Aug 7
Hi all, News: The merge window has opened, so please do not add any v5.10 related material to your linux-next included branches until after the merge window closes again. Changes since 20200806: My fixes tree contains: dbf24e30ce2e ("device_cgroup: Fix RCU list debugging warning") Non-merge commits (relative to Linus' tree): 3779 4266 files changed, 127778 insertions(+), 64714 deletions(-) I have created today's linux-next tree at git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (patches at http://www.kernel.org/pub/linux/kernel/next/ ). If you are tracking the linux-next tree using git, you should not use "git pull" to do so as that will try to merge the new linux-next release with the old one. You should use "git fetch" and checkout or reset to the new master. You can see which trees have been included by looking in the Next/Trees file in the source. There are also quilt-import.log and merge.log files in the Next directory. Between each merge, the tree was built with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a multi_v7_defconfig for arm and a native build of tools/perf. After the final fixups (if any), I do an x86_64 modules_install followed by builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit), ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc and sparc64 defconfig and htmldocs. And finally, a simple boot test of the powerpc pseries_le_defconfig kernel in qemu (with and without kvm enabled). Below is a summary of the state of the merge. I am currently merging 328 trees (counting Linus' and 85 trees of bug fix patches pending for the current merge release). Stats about the size of the tree over time can be seen at http://neuling.org/linux-next-size.html . Status of my local build tests will be at http://kisskb.ellerman.id.au/linux-next . If maintainers want to give advice about cross compilers/configs that work, we are always open to add more builds. Thanks to Randy Dunlap for doing many randconfig builds. And to Paul Gortmaker for triage and bug fixes. -- Cheers, Stephen Rothwell $ git checkout master $ git reset --hard stable Merging origin/master (d6efb3ac3e6c Merge tag 'tty-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty) Merging fixes/master (dbf24e30ce2e device_cgroup: Fix RCU list debugging warning) Merging kbuild-current/fixes (bcf876870b95 Linux 5.8) Merging arc-current/for-curr (11ba468877bb Linux 5.8-rc5) Merging arm-current/fixes (5c6360ee4a0e ARM: 8988/1: mmu: fix crash in EFI calls due to p4d typo in create_mapping_late()) Merging arm64-fixes/for-next/fixes (6a7389f0312f MAINTAINERS: Include drivers subdirs for ARM PMU PROFILING AND DEBUGGING entry) Merging arm-soc-fixes/arm/fixes (fe1d899f4212 ARM: dts: keystone-k2g-evm: fix rgmii phy-mode for ksz9031 phy) Merging uniphier-fixes/fixes (48778464bb7d Linux 5.8-rc2) Merging drivers-memory-fixes/fixes (b3a9e3b9622a Linux 5.8-rc1) Merging m68k-current/for-linus (382f429bb559 m68k: defconfig: Update defconfigs for v5.8-rc3) Merging powerpc-fixes/fixes (bcf876870b95 Linux 5.8) Merging s390-fixes/fixes (8e911bd8afe0 s390/test_unwind: fix possible memleak in test_unwind()) Merging sparc/master (0a95a6d1a4cd sparc: use for_each_child_of_node() macro) Merging fscrypt-current/for-stable (2b4eae95c736 fscrypt: don't evict dirty inodes after removing key) Merging net/master (8912fd6a61d7 net: hns3: fix spelling mistake "could'nt" -> "couldn't") Merging bpf/master (8912fd6a61d7 net: hns3: fix spelling mistake "could'nt" -> "couldn't") Merging ipsec/master (ac3a0c847296 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net) Merging netfilter/master (4203b19c2796 netfilter: flowtable: Set offload timeout when adding flow) Merging ipvs/master (eadede5f9362 Merge branch 'hns3-fixes') Merging wireless-drivers/master (1cfd3426ef98 ath10k: Fix NULL pointer dereference in AHB device probe) Merging mac80211/master (ac3a0c847296 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net) Merging rdma-fixes/for-rc (bcf876870b95 Linux 5.8) Merging sound-current/for-linus (fec9008828cd ALSA: usb-audio: Creative USB X-Fi Pro SB1095 volume knob support) Merging sound-asoc-fixes/for-linus (68122177749a Merge remote-tracking branch 'asoc/for-5.9' into asoc-linus) Merging regmap-fixes/for-linus (2b0f61e27f75 Merge remote-tracking branch 'regmap/for-5.8' into regmap-linus) Merging regulator-fixes/for-linus (e30c06f230a9 Merge remote-tracking branch 'regulator/for-5.9' into regulator-linus) Merging spi-fixes/for-linus (cdce7131f268 Merge remote-tracking branch 'spi/for-5.9' into spi-linus) Merging pci-current/for-linus (b361663c5a40 PCI/ASPM: Disable ASPM on ASMedia ASM10
Re: [PATCH] hwmon/pmbus: use simple i2c probe function
On Thu, 6 Aug 2020 14:48:58 -0700, Guenter Roeck wrote: > On 8/6/20 1:12 PM, Stephen Kitt wrote: > > On Thu, 6 Aug 2020 12:15:55 -0700, Guenter Roeck > > wrote: > >> On 8/6/20 9:16 AM, Stephen Kitt wrote: [...] > >> Also, I am not convinced that replacements such as > >> > >> - { "ipsps1", 0 }, > >> + { .name = "ipsps1" }, > >> > >> are an improvement. I would suggest to leave that alone for > >> consistency (and to make it easier to add more devices to the > >> various drivers if that happens in the future). > > > > From reading through all the drivers using id_table, it seems to me that > > we could do away with driver_data altogether and move all that to > > driver-local structures, in many cases covering more than just an id. By > > only initialising the elements of the structure that are really needed, I > > was hoping to (a) make it more obvious that driver_data isn’t used, and > > (b) allow removing it without touching all the code again. > > > > I don't see it as an improvement to replace a common data structure with > per-driver data structures. That sounds too much like "let's re-invent > the wheel over and over again". If that is where things are going, I'd > rather have it implemented everywhere else first. I am ok with the other > changes, but not with this. I agree, and I wasn’t intending on encouraging re-inventing the wheel in each driver. Let’s focus on probe_new for now... What did you mean by “to make it easier to add more devices to the various drivers if that happens in the future”? There are already many drivers with multiple devices but no driver_data, dropping the explicit driver_data initialisation doesn’t necessarily make it harder to add devices, does it? Regards, Stephen pgp4nCMrDbOak.pgp Description: OpenPGP digital signature
[PATCH 1/2] module: Correctly truncate sysfs sections output
The only-root-readable /sys/module/$module/sections/$section files did not truncate their output to the available buffer size. While most paths into the kernfs read handlers end up using PAGE_SIZE buffers, it's possible to get there through other paths (e.g. splice, sendfile). Actually limit the output to the "count" passed into the read function, and report it back correctly. *sigh* Reported-by: kernel test robot Link: https://lore.kernel.org/lkml/20200805002015.GE23458@shao2-debian Fixes: ed66f991bb19 ("module: Refactor section attr into bin attribute") Cc: sta...@vger.kernel.org Cc: Jessica Yu Signed-off-by: Kees Cook --- kernel/module.c | 22 +++--- 1 file changed, 19 insertions(+), 3 deletions(-) diff --git a/kernel/module.c b/kernel/module.c index aa183c9ac0a2..08c46084d8cc 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -1520,18 +1520,34 @@ struct module_sect_attrs { struct module_sect_attr attrs[]; }; +#define MODULE_SECT_READ_SIZE (3 /* "0x", "\n" */ + (BITS_PER_LONG / 4)) static ssize_t module_sect_read(struct file *file, struct kobject *kobj, struct bin_attribute *battr, char *buf, loff_t pos, size_t count) { struct module_sect_attr *sattr = container_of(battr, struct module_sect_attr, battr); + char bounce[MODULE_SECT_READ_SIZE + 1]; + size_t wrote; if (pos != 0) return -EINVAL; - return sprintf(buf, "0x%px\n", - kallsyms_show_value(file->f_cred) ? (void *)sattr->address : NULL); + /* +* Since we're a binary read handler, we must account for the +* trailing NUL byte that sprintf will write: if "buf" is +* too small to hold the NUL, or the NUL is exactly the last +* byte, the read will look like it got truncated by one byte. +* Since there is no way to ask sprintf nicely to not write +* the NUL, we have to use a bounce buffer. +*/ + wrote = scnprintf(bounce, sizeof(bounce), "0x%px\n", +kallsyms_show_value(file->f_cred) + ? (void *)sattr->address : NULL); + count = min(count, wrote); + memcpy(buf, bounce, count); + + return count; } static void free_sect_attrs(struct module_sect_attrs *sect_attrs) @@ -1580,7 +1596,7 @@ static void add_sect_attrs(struct module *mod, const struct load_info *info) goto out; sect_attrs->nsections++; sattr->battr.read = module_sect_read; - sattr->battr.size = 3 /* "0x", "\n" */ + (BITS_PER_LONG / 4); + sattr->battr.size = MODULE_SECT_READ_SIZE; sattr->battr.attr.mode = 0400; *(gattr++) = &(sattr++)->battr; } -- 2.25.1
[PATCH 0/2] module: Correctly truncate sysfs sections output
Hi, This fixes my sysfs module sections refactoring to take into account the case where the output buffer is not PAGE_SIZE. :( Thanks to 0day and trinity for noticing. I'll let this sit in -next for a few days and then send it to Linus. -Kees Kees Cook (2): module: Correctly truncate sysfs sections output selftests: splice: Check behavior of full and short splices kernel/module.c | 22 ++- tools/testing/selftests/splice/.gitignore | 1 + tools/testing/selftests/splice/Makefile | 4 +- tools/testing/selftests/splice/config | 1 + tools/testing/selftests/splice/settings | 1 + .../selftests/splice/short_splice_read.sh | 56 ++ tools/testing/selftests/splice/splice_read.c | 57 +++ 7 files changed, 137 insertions(+), 5 deletions(-) create mode 100644 tools/testing/selftests/splice/config create mode 100644 tools/testing/selftests/splice/settings create mode 100755 tools/testing/selftests/splice/short_splice_read.sh create mode 100644 tools/testing/selftests/splice/splice_read.c -- 2.25.1
[PATCH 2/2] selftests: splice: Check behavior of full and short splices
In order to help catch regressions in splice vs read behavior in certain special files, test a few with various different kinds of internal kernel helpers. Cc: Shuah Khan Signed-off-by: Kees Cook --- tools/testing/selftests/splice/.gitignore | 1 + tools/testing/selftests/splice/Makefile | 4 +- tools/testing/selftests/splice/config | 1 + tools/testing/selftests/splice/settings | 1 + .../selftests/splice/short_splice_read.sh | 56 ++ tools/testing/selftests/splice/splice_read.c | 57 +++ 6 files changed, 118 insertions(+), 2 deletions(-) create mode 100644 tools/testing/selftests/splice/config create mode 100644 tools/testing/selftests/splice/settings create mode 100755 tools/testing/selftests/splice/short_splice_read.sh create mode 100644 tools/testing/selftests/splice/splice_read.c diff --git a/tools/testing/selftests/splice/.gitignore b/tools/testing/selftests/splice/.gitignore index d5a2da428752..be8266f5d04c 100644 --- a/tools/testing/selftests/splice/.gitignore +++ b/tools/testing/selftests/splice/.gitignore @@ -1,2 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only default_file_splice_read +splice_read diff --git a/tools/testing/selftests/splice/Makefile b/tools/testing/selftests/splice/Makefile index e519b159b60d..541cd826d5a5 100644 --- a/tools/testing/selftests/splice/Makefile +++ b/tools/testing/selftests/splice/Makefile @@ -1,5 +1,5 @@ # SPDX-License-Identifier: GPL-2.0 -TEST_PROGS := default_file_splice_read.sh -TEST_GEN_PROGS_EXTENDED := default_file_splice_read +TEST_PROGS := default_file_splice_read.sh short_splice_read.sh +TEST_GEN_PROGS_EXTENDED := default_file_splice_read splice_read include ../lib.mk diff --git a/tools/testing/selftests/splice/config b/tools/testing/selftests/splice/config new file mode 100644 index ..058c928368b8 --- /dev/null +++ b/tools/testing/selftests/splice/config @@ -0,0 +1 @@ +CONFIG_TEST_LKM=m diff --git a/tools/testing/selftests/splice/settings b/tools/testing/selftests/splice/settings new file mode 100644 index ..89cedfc0d12b --- /dev/null +++ b/tools/testing/selftests/splice/settings @@ -0,0 +1 @@ +timeout=5 diff --git a/tools/testing/selftests/splice/short_splice_read.sh b/tools/testing/selftests/splice/short_splice_read.sh new file mode 100755 index ..7810d3589d9a --- /dev/null +++ b/tools/testing/selftests/splice/short_splice_read.sh @@ -0,0 +1,56 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +set -e + +ret=0 + +do_splice() +{ + filename="$1" + bytes="$2" + expected="$3" + + out=$(./splice_read "$filename" "$bytes" | cat) + if [ "$out" = "$expected" ] ; then + echo "ok: $filename $bytes" + else + echo "FAIL: $filename $bytes" + ret=1 + fi +} + +test_splice() +{ + filename="$1" + + full=$(cat "$filename") + two=$(echo "$full" | grep -m1 . | cut -c-2) + + # Make sure full splice has the same contents as a standard read. + do_splice "$filename" 4096 "$full" + + # Make sure a partial splice see the first two characters. + do_splice "$filename" 2 "$two" +} + +# proc_single_open(), seq_read() +test_splice /proc/$$/limits +# special open, seq_read() +test_splice /proc/$$/comm + +# proc_handler, proc_dointvec_minmax +test_splice /proc/sys/fs/nr_open +# proc_handler, proc_dostring +test_splice /proc/sys/kernel/modprobe +# proc_handler, special read +test_splice /proc/sys/kernel/version + +if ! [ -d /sys/module/test_module/sections ] ; then + modprobe test_module +fi +# kernfs, attr +test_splice /sys/module/test_module/coresize +# kernfs, binattr +test_splice /sys/module/test_module/sections/.init.text + +exit $ret diff --git a/tools/testing/selftests/splice/splice_read.c b/tools/testing/selftests/splice/splice_read.c new file mode 100644 index ..46dae6a25cfb --- /dev/null +++ b/tools/testing/selftests/splice/splice_read.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include + +int main(int argc, char *argv[]) +{ + int fd; + size_t size; + ssize_t spliced; + + if (argc < 2) { + fprintf(stderr, "Usage: %s INPUT [BYTES]\n", argv[0]); + return EXIT_FAILURE; + } + + fd = open(argv[1], O_RDONLY); + if (fd < 0) { + perror(argv[1]); + return EXIT_FAILURE; + } + + if (argc == 3) + size = atol(argv[2]); + else { + struct stat statbuf; + + if (fstat(fd, &statbuf) < 0) { + perror(argv[1]); + return EXIT_FAILURE; + } + + if (statbuf.st_size > INT_MAX) { + fprintf(stderr, "%s: Too big\n", argv[1]); + return EXIT_FAILURE; +
[tip:timers/core] BUILD SUCCESS 0099808553ad4f9c04ad7afd966f6d7f470f247f
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core branch HEAD: 0099808553ad4f9c04ad7afd966f6d7f470f247f x86: Select POSIX_CPU_TIMERS_TASK_WORK elapsed time: 724m configs tested: 112 configs skipped: 6 The following configs have been built successfully. More configs may be tested in the coming days. arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig arm mainstone_defconfig sh sh7770_generic_defconfig mips rbtx49xx_defconfig ia64 bigsur_defconfig arm simpad_defconfig sh polaris_defconfig powerpc64 defconfig mips mips_paravirt_defconfig microblaze mmu_defconfig sh rsk7201_defconfig powerpc mgcoge_defconfig powerpc64alldefconfig mips maltasmvp_defconfig arm axm55xx_defconfig xtensaxip_kc705_defconfig m68k m5475evb_defconfig powerpc holly_defconfig mips pnx8335_stb225_defconfig arm aspeed_g5_defconfig mips bmips_stb_defconfig mips tb0287_defconfig shdreamcast_defconfig m68k m5249evb_defconfig powerpc mpc885_ads_defconfig mips fuloong2e_defconfig sh secureedge5410_defconfig ia64 tiger_defconfig arm mv78xx0_defconfig mips loongson1b_defconfig sh apsh4a3a_defconfig mipsbcm47xx_defconfig arm colibri_pxa270_defconfig xtensa defconfig sh j2_defconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig arc allyesconfig nds32 allnoconfig c6x allyesconfig nios2 defconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig i386 allyesconfig sparcallyesconfig sparc defconfig i386defconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig powerpc defconfig x86_64 randconfig-a006-20200806 x86_64 randconfig-a001-20200806 x86_64 randconfig-a004-20200806 x86_64 randconfig-a005-20200806 x86_64 randconfig-a003-20200806 x86_64 randconfig-a002-20200806 i386 randconfig-a005-20200806 i386 randconfig-a004-20200806 i386 randconfig-a001-20200806 i386 randconfig-a002-20200806 i386 randconfig-a003-20200806 i386 randconfig-a006-20200806 i386 randconfig-a005-20200805 i386 randconfig-a004-20200805 i386 randconfig-a001-20200805 i386 randconfig-a003-20200805 i386 randconfig-a002-20200805 i386 randconfig-a006-20200805 i386 randconfig-a011-20200805 i386 randconfig-a012-20200805 i386 randconfig-a013-20200805 i386 randconfig-a014-20200805 i386 randconfig-a015-20200805 i386 randconfig-a016-20200805 i386 randconfig-a011-20200806 i386 randconfig-a012-20200806 i386 randconfig-a013-20200806 i386 randconfig-a015-20200806 i386 randconfig-a014-20200806 i386
[tip:perf/urgent] BUILD SUCCESS b55b3fdce3e554a6bbe8f8ca6a01a892d720e64e
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/urgent branch HEAD: b55b3fdce3e554a6bbe8f8ca6a01a892d720e64e hw_breakpoint: Remove unused __register_perf_hw_breakpoint() declaration elapsed time: 724m configs tested: 110 configs skipped: 6 The following configs have been built successfully. More configs may be tested in the coming days. arm defconfig arm64allyesconfig arm64 defconfig arm allyesconfig arm allmodconfig arm mainstone_defconfig sh sh7770_generic_defconfig mips rbtx49xx_defconfig ia64 bigsur_defconfig arm simpad_defconfig powerpc mgcoge_defconfig powerpc64alldefconfig mips maltasmvp_defconfig arm axm55xx_defconfig xtensaxip_kc705_defconfig m68k m5475evb_defconfig powerpc holly_defconfig mips pnx8335_stb225_defconfig arm aspeed_g5_defconfig mips bmips_stb_defconfig mips tb0287_defconfig shdreamcast_defconfig m68k m5249evb_defconfig powerpc mpc885_ads_defconfig mips fuloong2e_defconfig sh secureedge5410_defconfig ia64 tiger_defconfig arm mv78xx0_defconfig microblaze mmu_defconfig mips loongson1b_defconfig powerpc64 defconfig sh apsh4a3a_defconfig shhp6xx_defconfig mipsbcm47xx_defconfig arm colibri_pxa270_defconfig xtensa defconfig sh j2_defconfig ia64 allmodconfig ia64defconfig ia64 allyesconfig m68k allmodconfig m68kdefconfig m68k allyesconfig nios2 defconfig arc allyesconfig nds32 allnoconfig c6x allyesconfig nds32 defconfig nios2allyesconfig cskydefconfig alpha defconfig alphaallyesconfig xtensa allyesconfig h8300allyesconfig arc defconfig sh allmodconfig parisc defconfig s390 allyesconfig parisc allyesconfig s390defconfig sparcallyesconfig sparc defconfig i386defconfig i386 allyesconfig mips allyesconfig mips allmodconfig powerpc allyesconfig powerpc allmodconfig powerpc allnoconfig powerpc defconfig x86_64 randconfig-a006-20200806 x86_64 randconfig-a001-20200806 x86_64 randconfig-a004-20200806 x86_64 randconfig-a005-20200806 x86_64 randconfig-a003-20200806 x86_64 randconfig-a002-20200806 i386 randconfig-a005-20200806 i386 randconfig-a004-20200806 i386 randconfig-a001-20200806 i386 randconfig-a002-20200806 i386 randconfig-a003-20200806 i386 randconfig-a006-20200806 i386 randconfig-a005-20200805 i386 randconfig-a004-20200805 i386 randconfig-a001-20200805 i386 randconfig-a003-20200805 i386 randconfig-a002-20200805 i386 randconfig-a006-20200805 i386 randconfig-a011-20200805 i386 randconfig-a012-20200805 i386 randconfig-a013-20200805 i386 randconfig-a014-20200805 i386 randconfig-a015-20200805 i386 randconfig-a016-20200805 i386 randconfig-a011-20200806 i386 randconfig-a012-20200806 i386 randconfig-a013-20200806 i386 randconfig-a015-20200806 i386 randconfig-a014-20200806 i386 randconfig-a016-20200806 riscv
[GIT PULL] Mailbox changes for v5.9
Hi Linus, The following changes since commit 92ed301919932f13b9172e525674157e983d: Linux 5.8-rc7 (2020-07-26 14:14:06 -0700) are available in the Git repository at: git://git.linaro.org/landing-teams/working/fujitsu/integration.git tags/mailbox-v5.9 for you to fetch changes up to 884996986347dbe3b735cfa9bc041dd98a533796: mailbox: mediatek: cmdq: clear task in channel before shutdown (2020-08-03 23:56:38 -0500) - mediatek : add support for mt6779 gce shutdown cleanup and address shift support - qcom : add msm8994 apcs and sdm660 hmss compatibility - imx : mark PM funcs __maybe - pcc : put acpi table before bailout - misc: replace http with https links Alexander A. Klimov (1): mailbox: Replace HTTP links with HTTPS ones Dennis YC Hsieh (4): dt-binding: gce: add gce header file for mt6779 mailbox: cmdq: variablize address shift in platform mailbox: cmdq: support mt6779 gce platform definition mailbox: mediatek: cmdq: clear task in channel before shutdown Hanjun Guo (1): mailbox: pcc: Put the PCCT table for error path Konrad Dybcio (2): mailbox: qcom: Add sdm660 hmss compatible mailbox: qcom: Add msm8994 apcs compatible Nathan Chancellor (1): mailbox: imx: Mark PM functions as __maybe_unused .../devicetree/bindings/mailbox/mtk-gce.txt| 8 +- .../bindings/mailbox/qcom,apcs-kpss-global.yaml| 2 + drivers/mailbox/imx-mailbox.c | 8 +- drivers/mailbox/mtk-cmdq-mailbox.c | 97 +++-- drivers/mailbox/omap-mailbox.c | 2 +- drivers/mailbox/pcc.c | 9 +- drivers/mailbox/qcom-apcs-ipc-mailbox.c| 10 + drivers/mailbox/ti-msgmgr.c| 2 +- include/dt-bindings/gce/mt6779-gce.h | 222 + include/linux/mailbox/mtk-cmdq-mailbox.h | 2 + 10 files changed, 338 insertions(+), 24 deletions(-) create mode 100644 include/dt-bindings/gce/mt6779-gce.h
Re: [RESEND PATCH] media: atomisp: Replace trace_printk by pr_info
On Fri, Aug 07, 2020 at 09:50:23AM +0800, Nicolas Boichat wrote: > On Fri, Jul 24, 2020 at 8:41 PM Nicolas Boichat wrote: > > > > On Fri, Jul 10, 2020 at 3:03 PM Greg Kroah-Hartman > > wrote: > > > > > > On Fri, Jul 10, 2020 at 02:45:29PM +0800, Nicolas Boichat wrote: > > > > trace_printk should not be used in production code, replace it > > > > call with pr_info. > > > > > > > > Signed-off-by: Nicolas Boichat > > > > --- > > > > Sent this before as part of a series (whose 4th patch was a > > > > change that allows to detect such trace_printk), but maybe it's > > > > easier to get individual maintainer attention by splitting it. > > > > > > Mauro should take this soon: > > > > > > Acked-by: Greg Kroah-Hartman > > > > Mauro: did you get a chance to look at this? (and the other similar > > patch "media: camss: vfe: Use trace_printk for debugging only") > > Mauro: Another gentle ping. Thanks. It's the middle of the merge window, maintainers can't do anything until after 5.9-rc1 is out, sorry. greg k-h
Re: [PATCH v6 2/2] devicetree: hwmon: shtc1: Add sensirion,shtc1.yaml
Rob, sorry for late reply, I have my head in setup production install the dt-schema and rerun the check, with 2020-6 no warnings lets fix it. Chris On 13/7/2020 11:00 pm, Rob Herring wrote: On Sun, 12 Jul 2020 12:44:10 +0800, Chris Ruehl wrote: Add documentation for the newly added DTS support in the shtc1 driver. To align with the drivers logic to have high precision by default a boolean sensirion,low_precision is used to switch to low precision. Signed-off-by: Chris Ruehl --- .../bindings/hwmon/sensirion,shtc1.yaml | 57 +++ 1 file changed, 57 insertions(+) create mode 100644 Documentation/devicetree/bindings/hwmon/sensirion,shtc1.yaml My bot found errors running 'make dt_binding_check' on your patch: Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dts:24.13-26: Warning (reg_format): /example-0/i2c1/shtc3@70:reg: property has invalid length (4 bytes) (#address-cells == 2, #size-cells == 1) Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dt.yaml: Warning (pci_device_reg): Failed prerequisite 'reg_format' Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dt.yaml: Warning (pci_device_bus_num): Failed prerequisite 'reg_format' Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dt.yaml: Warning (simple_bus_reg): Failed prerequisite 'reg_format' Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dt.yaml: Warning (i2c_bus_reg): Failed prerequisite 'reg_format' Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dt.yaml: Warning (spi_bus_reg): Failed prerequisite 'reg_format' Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dts:22.20-26.13: Warning (avoid_default_addr_size): /example-0/i2c1/shtc3@70: Relying on default #address-cells value Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dts:22.20-26.13: Warning (avoid_default_addr_size): /example-0/i2c1/shtc3@70: Relying on default #size-cells value Documentation/devicetree/bindings/hwmon/sensirion,shtc1.example.dt.yaml: Warning (unique_unit_address): Failed prerequisite 'avoid_default_addr_size' See https://patchwork.ozlabs.org/patch/1327453 If you already ran 'make dt_binding_check' and didn't see the above error(s), then make sure dt-schema is up to date: pip3 install git+https://github.com/devicetree-org/dt-schema.git@master --upgrade Please check and re-submit. -- GTSYS Limited RFID Technology 9/F, Unit E, R07, Kwai Shing Industrial Building Phase 2, 42-46 Tai Lin Pai Road, Kwai Chung, N.T., Hong Kong Tel (852) 9079 9521 Disclaimer: https://www.gtsys.com.hk/email/classified.html
Re: Is anyone else getting a bad signature from kernel.org's 5.8 sources+Greg's sign?
On Wed, Aug 05, 2020 at 11:20:38PM -0400, David Niklas wrote: > On Wed, 5 Aug 2020 18:36:08 -0700 > Randy Dunlap wrote: > > > On 8/5/20 5:59 PM, David Niklas wrote: > > > Hello, > > > I downloaded the kernel sources from kernel.org using curl, then > > > opera, and finally lynx (to rule out an html parsing bug). I did the > > > same with the sign and I keep getting: > > > > > > % gpg2 --verify linux-5.8.tar.sign linux-5.8.tar.xz > > > gpg: Signature made Mon Aug 3 00:19:13 2020 EDT > > > gpg:using RSA key > > > 647F28654894E3BD457199BE38DBBDC86092693E gpg: BAD signature from > > > "Greg Kroah-Hartman " [unknown] > > > > > > I did refresh all the keys just in case. > > > I believe this is important so I'm addressing this to the signer and > > > only CC'ing the list. > > > > > > If I'm made some simple mistake, feel free to send SIG666 to my > > > terminal. I did re-read the man page just in case. > > > > It works successfully for me. > > > > > > from https://www.kernel.org/category/signatures.html:: > > > > > > If you get "BAD signature" > > > > If at any time you see "BAD signature" output from "gpg2 --verify", > > please first check the following first: > > > > Make sure that you are verifying the signature against the .tar > > version of the archive, not the compressed (.tar.xz) version. Make sure > > the the downloaded file is correct and not truncated or otherwise > > corrupted. > > > > If you repeatedly get the same "BAD signature" output, please email > > helpd...@kernel.org, so we can investigate the problem. > > > > > > > > Many thanks. I've never seen a signature done that way before, but I > understand why you would do it that way. That means other projects need to change as well :) And you are not alone, this comes up every release, no problems. greg k-h
Re: [PATCH] perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
Hi Jiri, On 8/7/2020 3:43 AM, Jiri Olsa wrote: On Wed, Aug 05, 2020 at 10:29:37AM +0800, Jin Yao wrote: We received an error report that perf-record caused 'Segmentation fault' on a newly system (e.g. on the new installed ubuntu). (gdb) backtrace #0 __read_once_size (size=4, res=, p=0x14) at /root/0-jinyao/acme/tools/include/linux/compiler.h:139 #1 atomic_read (v=0x14) at /root/0-jinyao/acme/tools/include/asm/../../arch/x86/include/asm/atomic.h:28 #2 refcount_read (r=0x14) at /root/0-jinyao/acme/tools/include/linux/refcount.h:65 #3 perf_mmap__read_init (map=map@entry=0x0) at mmap.c:177 #4 0x561ce5c0de39 in perf_evlist__poll_thread (arg=0x561ce68584d0) at util/sideband_evlist.c:62 #5 0x7fad78491609 in start_thread (arg=) at pthread_create.c:477 #6 0x7fad7823c103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 hum, I recall discussing the same issue, I thought it was already fixed :-\ in any case: Is it possible some patches have been posted but we missed? :) Acked-by: Jiri Olsa Thanks! Thanks Jin Yao thanks, jirka The root cause is, evlist__add_bpf_sb_event() just returns 0 if HAVE_LIBBPF_SUPPORT is not defined (inline function path). So it will not create a valid evsel for side-band event. But perf-record still creates BPF side band thread to process the side-band event, then the error happpens. We can reproduce this issue by removing the libelf-dev. e.g. 1. apt-get remove libelf-dev 2. perf record -a -- sleep 1 root@test:~# ./perf record -a -- sleep 1 perf: Segmentation fault Obtained 6 stack frames. ./perf(+0x28eee8) [0x5562d6ef6ee8] /lib/x86_64-linux-gnu/libc.so.6(+0x46210) [0x7fbfdc65f210] ./perf(+0x342e74) [0x5562d6faae74] ./perf(+0x257e39) [0x5562d6ebfe39] /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fbfdc990609] /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7fbfdc73b103] Segmentation fault (core dumped) To fix this issue, 1. We either install the missing libraries to let HAVE_LIBBPF_SUPPORT be defined. e.g. apt-get install libelf-dev and install other related libraries. 2. Use this patch to skip the side-band event setup if HAVE_LIBBPF_SUPPORT is not set. Signed-off-by: Jin Yao --- tools/perf/builtin-record.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index b6bdccd875bc..ae97f98e2753 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -1506,6 +1506,7 @@ static int record__synthesize(struct record *rec, bool tail) return err; } +#ifdef HAVE_LIBBPF_SUPPORT static int record__process_signal_event(union perf_event *event __maybe_unused, void *data) { struct record *rec = data; @@ -1550,6 +1551,12 @@ static int record__setup_sb_evlist(struct record *rec) return 0; } +#else +static int record__setup_sb_evlist(struct record *rec __maybe_unused) +{ + return 0; +} +#endif static int __cmd_record(struct record *rec, int argc, const char **argv) { -- 2.17.1
[RESEND 0/3] Venus - Handle race conditions in concurrency
The intention of this patchset is to handle race conditions during concurrency usecases like Multiple YouTube browser tabs(approx 50 plus tabs), graphics_Stress, WiFi ON/OFF, Bluetooth ON/OF, and reboot in parallel. --- Resending the fixes by describing more about the issue and correcting typo errors. Mansur Alisha Shaik (3): venus: core: handle race condititon for core ops venus: core: cancel pending work items in workqueue venus: handle use after free for iommu_map/iommu_unmap drivers/media/platform/qcom/venus/core.c | 6 +- drivers/media/platform/qcom/venus/firmware.c | 17 + drivers/media/platform/qcom/venus/hfi.c | 5 - 3 files changed, 22 insertions(+), 6 deletions(-) -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
[RESEND 2/3] venus: core: cancel pending work items in workqueue
In concurrency usecase and reboot scenario we are observing race condition and seeing NULL pointer dereference crash. In shutdown path and system recovery path we are destroying the same mutex hence seeing crash. This case is handled by mutex protection and cancel delayed work items in work queue. Below is the call trace for the crash Call trace: venus_remove+0xdc/0xec [venus_core] venus_core_shutdown+0x1c/0x34 [venus_core] platform_drv_shutdown+0x28/0x34 device_shutdown+0x154/0x1fc kernel_restart_prepare+0x40/0x4c kernel_restart+0x1c/0x64 Call trace: mutex_lock+0x34/0x60 venus_hfi_destroy+0x28/0x98 [venus_core] hfi_destroy+0x1c/0x28 [venus_core] venus_sys_error_handler+0x60/0x14c [venus_core] process_one_work+0x210/0x3d0 worker_thread+0x248/0x3f4 kthread+0x11c/0x12c ret_from_fork+0x10/0x18 Signed-off-by: Mansur Alisha Shaik --- drivers/media/platform/qcom/venus/core.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c index fe99c83..41a293e 100644 --- a/drivers/media/platform/qcom/venus/core.c +++ b/drivers/media/platform/qcom/venus/core.c @@ -312,6 +312,8 @@ static int venus_remove(struct platform_device *pdev) struct device *dev = core->dev; int ret; + cancel_delayed_work_sync(&core->work); + ret = pm_runtime_get_sync(dev); WARN_ON(ret < 0); @@ -329,7 +331,9 @@ static int venus_remove(struct platform_device *pdev) if (pm_ops->core_put) pm_ops->core_put(dev); + mutex_lock(&core->lock); hfi_destroy(core); + mutex_unlock(&core->lock); icc_put(core->video_path); icc_put(core->cpucfg_path); -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
[RESEND 3/3] venus: handle use after free for iommu_map/iommu_unmap
In concurrency usecase and reboot scenario we are trying to map fw.iommu_domain which is already unmapped during shutdown. This is causing NULL pointer dereference crash. This case is handled by adding necessary checks. Call trace: __iommu_map+0x4c/0x348 iommu_map+0x5c/0x70 venus_boot+0x184/0x230 [venus_core] venus_sys_error_handler+0xa0/0x14c [venus_core] process_one_work+0x210/0x3d0 worker_thread+0x248/0x3f4 kthread+0x11c/0x12c ret_from_fork+0x10/0x18 Signed-off-by: Mansur Alisha Shaik --- drivers/media/platform/qcom/venus/firmware.c | 17 + 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c index 8801a6a..c427e88 100644 --- a/drivers/media/platform/qcom/venus/firmware.c +++ b/drivers/media/platform/qcom/venus/firmware.c @@ -171,9 +171,14 @@ static int venus_shutdown_no_tz(struct venus_core *core) iommu = core->fw.iommu_domain; - unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped); - if (unmapped != mapped) - dev_err(dev, "failed to unmap firmware\n"); + if (core->fw.mapped_mem_size && iommu) { + unmapped = iommu_unmap(iommu, VENUS_FW_START_ADDR, mapped); + + if (unmapped != mapped) + dev_err(dev, "failed to unmap firmware\n"); + else + core->fw.mapped_mem_size = 0; + } return 0; } @@ -288,7 +293,11 @@ void venus_firmware_deinit(struct venus_core *core) iommu = core->fw.iommu_domain; iommu_detach_device(iommu, core->fw.dev); - iommu_domain_free(iommu); + + if (core->fw.iommu_domain) { + iommu_domain_free(iommu); + core->fw.iommu_domain = NULL; + } platform_device_unregister(to_platform_device(core->fw.dev)); } -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
[RESEND 1/3] venus: core: handle race condititon for core ops
For core ops we are having only write protect but there is no read protect, because of this in multthreading and concurrency, one CPU core is reading without wait which is causing the NULL pointer dereferece crash. one such scenario is as show below, where in one core core->ops becoming NULL and in another core calling core->ops->session_init(). CPU: core-7: Call trace: hfi_session_init+0x180/0x1dc [venus_core] vdec_queue_setup+0x9c/0x364 [venus_dec] vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common] vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2] v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem] v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem] v4l_reqbufs+0x4c/0x5c __video_do_ioctl+0x2b0/0x39c CPU: core-0: Call trace: venus_shutdown+0x98/0xfc [venus_core] venus_sys_error_handler+0x64/0x148 [venus_core] process_one_work+0x210/0x3d0 worker_thread+0x248/0x3f4 kthread+0x11c/0x12c Signed-off-by: Mansur Alisha Shaik --- drivers/media/platform/qcom/venus/core.c | 2 +- drivers/media/platform/qcom/venus/hfi.c | 5 - 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c index 203c653..fe99c83 100644 --- a/drivers/media/platform/qcom/venus/core.c +++ b/drivers/media/platform/qcom/venus/core.c @@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work) pm_runtime_get_sync(core->dev); hfi_core_deinit(core, true); - hfi_destroy(core); mutex_lock(&core->lock); + hfi_destroy(core); venus_shutdown(core); pm_runtime_put_sync(core->dev); diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c index a211eb9..2eeb31f 100644 --- a/drivers/media/platform/qcom/venus/hfi.c +++ b/drivers/media/platform/qcom/venus/hfi.c @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create); int hfi_session_init(struct venus_inst *inst, u32 pixfmt) { struct venus_core *core = inst->core; - const struct hfi_ops *ops = core->ops; + const struct hfi_ops *ops; int ret; if (inst->state != INST_UNINIT) @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt) inst->hfi_codec = to_codec_type(pixfmt); reinit_completion(&inst->done); + mutex_lock(&core->lock); + ops = core->ops; ret = ops->session_init(inst, inst->session_type, inst->hfi_codec); if (ret) return ret; + mutex_unlock(&core->lock); ret = wait_session_msg(inst); if (ret) return ret; -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples
Hi Peter, On 8/6/2020 7:00 PM, pet...@infradead.org wrote: On Thu, Aug 06, 2020 at 11:18:27AM +0200, pet...@infradead.org wrote: Suppose we have nested virt: L0-hv | G0/L1-hv | G1 And we're running in G0, then: - 'exclude_hv' would exclude L0 events - 'exclude_host' would ... exclude L1-hv events? - 'exclude_guest' would ... exclude G1 events? So in arch/x86/events/intel/core.c we have: static inline void intel_set_masks(struct perf_event *event, int idx) { struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); if (event->attr.exclude_host) __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask); if (event->attr.exclude_guest) __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask); if (event_is_checkpointed(event)) __set_bit(idx, (unsigned long *)&cpuc->intel_cp_status); } exclude_host is now set by guest (pmc_reprogram_counter, arch/x86/kvm/pmu.c). When enabling the event, we can check exclude_host to know if it's a guest. Otherwise we may need more flags in event->attr to indicate the status. which is, afaict, just plain wrong. Should that not be something like: if (!event->attr.exclude_host) __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask); if (!event->attr.exclude_guest) __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask); How can we know it's guest or host even if exclude_host is set in guest? Thanks Jin Yao Also, ARM64 seems to also implement this stuff, Mark, do you have any insight on how all this is 'supposed' to work?
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 10:58 PM Bjorn Andersson wrote: > > On Thu 06 Aug 19:48 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > > wrote: > > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > > wrote: > > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski > > > > > > wrote: > > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz > > > > > > > >> wrote: > > > > > > > >>> > > > > > > > >>> So this is where I bashfully admit I didn't get a chance to > > > > > > > >>> try this > > > > > > > >>> patch series out, as I had success with a much older version > > > > > > > >>> of > > > > > > > >>> Saravana's macro magic. > > > > > > > >>> > > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm > > > > > > > >>> seeing > > > > > > > >>> boot regressions on db845c. :( This is in the non-modular > > > > > > > >>> case, > > > > > > > >>> building the driver in. > > > > > > > >> Does that mean the modular version is working? Or you haven't > > > > > > > >> tried > > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I > > > > > > > >> don't > > > > > > > >> have the hardware, but it should be easy to guess this issue > > > > > > > >> looking > > > > > > > >> at the code delta. > > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started > > > > > > > > on -next > > > > > > > > around 20200727, but I didn't track it down as, well, there's > > > > > > > > less way > > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting > > > > > > > > does > > > > > > > > allow me to boot again. > > > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit > > > > > > > reverted > > > > > > > boots, however, module (on the c630 at least) doesn't boot > > > > > > > whether it's > > > > > > > a module or built-in. > > > > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > > more grace time to load. > > > > > > > > > > With the risk of me reading more into this than what you're saying, > > > > > please don't upstream anything that depend this parameter to be > > > > > increased. > > > > > > > > > > Compiling any of these drivers as module should not require the user > > > > > to > > > > > pass additional kernel command line parameters in order to get their > > > > > device to boot. > > > > > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > > avoid it. But the reality is that it is already required (at least in > > > > configurations heavily using modules) to give more time for modules > > > > loaded to resolve missing dependencies after init begins (due to > > > > changes in the driver core to fail loading after init so that optional > > > > dt links aren't eternally looked for). This was seen when trying to > > > > enable the qualcom clk drivers to modules. > > > > > > > > > > So to clarify what you're saying, any system that boots successfully > > > with the default options is a sign of pure luck - regardless of being > > > builtin or modules. > > > > > > > > > And there you have my exact argument against the deferred timeout magic > > > going on in the driver core. But as you know people insist that it's > > > more important to be able to boot some defunct system from NFS than a > > > properly configured one reliably. > > > > I'd agree, but the NFS case was in use before, and when the original > > deferred timeout/optional link handling stuff landed no one complained > > they were broken by it (at least at the point where it landed). > > I did object when this was proposed and I've objected for the last two > years, because we keep adding more and more subsystems to follow this > broken behavior. > > > Only later when we started enabling more lower-level core drivers as > > modules did the shortened dependency resolution time start to bite > > folks. My attempt to set the default to be 30 seconds helped there, > > but caused trouble and delays for the NFS case, and "don't break > > existing users" seemed to rule, so I set the default timeout back to > > 0. > > > > I can't argue with that and I'm at loss on how to turn this around. > > > > > It doesn't seem necessary in this case, but I suggested it here as > > > > I've got it enabled by default in my AOSP builds so that the > > > > module-heavy configs for GKI boot properly (even if Saravana's > > > > fw_devlink work is disabled). > > > > > > > > > > With all due respect, that's your downstream kernel, the upstream kernel > > > should not rely on luck, out-of-tree patches or kernel parameters. > > > > I agree that would be preferred.
Re: [GIT] Networking
On Thu, Aug 6, 2020 at 5:32 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 4:17 PM Eric Dumazet wrote: > > On 8/6/20 2:39 PM, John Stultz wrote: > > > [ 19.709492] Unable to handle kernel access to user memory outside > > > uaccess routines at virtual address 006f53337070 > > > [ 19.726539] Mem abort info: > > > [ 19.726544] ESR = 0x960f > > > [ 19.741323] EC = 0x25: DABT (current EL), IL = 32 bits > > > [ 19.741326] SET = 0, FnV = 0 > > > [ 19.761185] EA = 0, S1PTW = 0 > > > [ 19.761188] Data abort info: > > > [ 19.761190] ISV = 0, ISS = 0x000f > > > [ 19.761192] CM = 0, WnR = 0 > > > [ 19.761199] user pgtable: 4k pages, 39-bit VAs, pgdp=00016e9e9000 > > > [ 19.777584] [006f53337070] pgd=00016e99e003, > > > p4d=00016e99e003, pud=00016e99e003, pmd=00016e99a003, > > > pte=00e800016d3c7f53 > > > [ 19.789205] Internal error: Oops: 960f [#1] PREEMPT SMP > > > [ 19.789211] Modules linked in: > > > [ 19.797153] CPU: 7 PID: 364 Comm: iptables-restor Tainted: G > > > W 5.8.0-mainline-08255-gf9e74a8eb6f3 #3350 > > > [ 19.797156] Hardware name: Thundercomm Dragonboard 845c (DT) > > > [ 19.797161] pstate: a045 (NzCv daif +PAN -UAO BTYPE=--) > > > [ 19.797177] pc : do_ipt_set_ctl+0x304/0x610 > > > [ 19.807891] lr : do_ipt_set_ctl+0x50/0x610 > > > [ 19.807894] sp : ffc0139bbba0 > > > [ 19.807898] x29: ffc0139bbba0 x28: ff80f07a3800 > > > [ 19.846468] x27: x26: > > > [ 19.846472] x25: x24: 0698 > > > [ 19.846476] x23: ffec8eb0cc80 x22: 0040 > > > [ 19.846480] x21: b46f53337070 x20: ffec8eb0c000 > > > [ 19.846484] x19: ffec8e9e9000 x18: > > > [ 19.846487] x17: x16: > > > [ 19.846491] x15: x14: > > > [ 19.846495] x13: x12: > > > [ 19.846501] x11: x10: > > > [ 19.856005] x9 : x8 : > > > [ 19.856008] x7 : ffec8e9e9d08 x6 : > > > [ 19.856012] x5 : x4 : 0213 > > > [ 19.856015] x3 : 0001ffdeffef x2 : 11ded3fb0bb85e00 > > > [ 19.856019] x1 : 0027 x0 : 0080 > > > [ 19.856024] Call trace: > > > [ 19.866319] do_ipt_set_ctl+0x304/0x610 > > > [ 19.866327] nf_setsockopt+0x64/0xa8 > > > [ 19.866332] ip_setsockopt+0x21c/0x1710 > > > [ 19.866338] raw_setsockopt+0x50/0x1b8 > > > [ 19.866347] sock_common_setsockopt+0x50/0x68 > > > [ 19.882672] __sys_setsockopt+0x120/0x1c8 > > > [ 19.882677] __arm64_sys_setsockopt+0x30/0x40 > > > [ 19.882686] el0_svc_common.constprop.3+0x78/0x188 > > > [ 19.882691] do_el0_svc+0x80/0xa0 > > > [ 19.882699] el0_sync_handler+0x134/0x1a0 > > > [ 19.901555] el0_sync+0x140/0x180 > > > [ 19.901564] Code: aa1503e0 97fffd3e 2a0003f5 1780 (a9401ea6) > > > [ 19.901569] ---[ end trace 22010e9688ae248f ]--- > > > [ 19.913033] Kernel panic - not syncing: Fatal exception > > > [ 19.913042] SMP: stopping secondary CPUs > > > [ 20.138885] Kernel Offset: 0x2c7d08 from 0xffc01000 > > > [ 20.138887] PHYS_OFFSET: 0xfffa8000 > > > [ 20.138894] CPU features: 0x0040002,2a80a218 > > > [ 20.138898] Memory Limit: none > > > > > > I'll continue to work on bisecting this down further, but figured I'd > > > share now as you or someone else might be able to tell whats wrong > > > from the trace. > > > > > > > Can you try at commit c2f12630c60ff33a9cafd221646053fc10ec59b6 ("netfilter: > > switch nf_setsockopt to sockptr_t") > > (and right before it) > > > So I rebased my patches ontop of that commit, but I'm not seeing the > crash there. I also hand applied your suggested patch when I did see > the issue, but that didn't seem to fix it either. > > So far I've only narrowed it down to between > 65ccbbda52288527b7c48087eb33bb0757975875..530fe9d433b9e60251bb8fdc5dddecbc486a50ef. > But I'll keep rebase-bisecting it down. So I've finally rebase-bisected it down to: a31edb2059ed ("net: improve the user pointer check in init_user_sockptr") https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a31edb2059ed4e498f9aa8230c734b59d0ad797a And reverting that from linus/HEAD (at least from this morning) seems to avoid it. Seems like it is just adding extra checks on the data passed, so maybe existing trouble from a different driver is the issue here, but it's not really clear from the crash what might be wrong. Suggestions would be greatly appreciated! thanks -john
[GIT PULL]: dmaengine updates for v5.9-rc1
Hello Linus, Now that mail.kernel.org is back, time to send the pull request. Please pull to receive the below updates for dmaengine. Please note that SFR has reported conflicts with MAINTAINERS file update in this request, am sure that would be easy for you to manage :) The following changes since commit 87730ccbddcb48478b1b88e88b14e73424130764: dmaengine: ioat setting ioat timeout as module parameter (2020-07-06 14:49:34 +0530) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git tags/dmaengine-5.9-rc1 for you to fetch changes up to 00043a2689232631f39ebbf6719d545b1d799086: Merge branch 'topic/xilinx' into fixes (2020-08-07 11:13:37 +0530) dmaengine updates for v5.9-rc1 Core: - Support out of order dma completion - Support for repeating transaction New controllers: - Support for Actions S700 DMA engine - Renesas R8A774E1, r8a7742 controller binding - New driver for Xilinx DPDMA controller Others: - Support of out of order dma completion in idxd driver - W=1 warning cleanup of subsystem - Updates to ti-k3-dma, dw, idxd drivers Amit Singh Tomar (3): dt-bindings: dmaengine: convert Actions Semi Owl SoCs bindings to yaml dmaengine: Actions: get rid of bit fields from dma descriptor dmaengine: Actions: Add support for S700 DMA engine Andy Shevchenko (4): dmaengine: dw: Register ACPI DMA controller for PCI that has companion dmaengine: dw: Replace 'objs' by 'y' dmaengine: acpi: Drop double check for ACPI companion device dmaengine: dw: Don't include unneeded header to platform data header Dave Jiang (6): dmaengine: cookie bypass for out of order completion dmaengine: idxd: add leading / for sysfspath in ABI documentation dmaengine: idxd: move submission to sbitmap_queue dmaengine: idxd: add work queue drain support dmaengine: idxd: move idxd interrupt handling to mask instead of ignore dmaengine: idxd: add missing invalid flags field to completion Gustavo A. R. Silva (2): dmaengine: hisilicon: Use struct_size() in devm_kzalloc() dmaengine: ti: k3-udma: Use struct_size() in kzalloc() Hyun Kwon (1): dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver Koehrer Mathias (ETAS/EES-SL) (1): dmaengine: Extend NXP QDMA driver to check transmission errors Lad Prabhakar (3): dt-bindings: dmaengine: renesas,usb-dmac: Add binding for r8a7742 dt-bindings: dma: renesas,rcar-dmac: Document R8A774E1 bindings dt-bindings: dma: renesas,usb-dmac: Add binding for r8a774e1 Laurent Pinchart (3): dt: bindings: dma: xilinx: dpdma: DT bindings for Xilinx DPDMA dmaengine: Add support for repeating transactions dmaengine: xilinx: dpdma: Fix kerneldoc warning Lee Jones (17): dmaengine: mediatek: mtk-hsdma: Fix formatting in 'struct mtk_hsdma_pdesc' doc block dmaengine: of-dma: Fix misspellings/formatting issues in some function headers dmaengine: ep93xx_dma: Provide some missing struct attribute documentation dmaengine: mmp_pdma: Demote obvious misuse of kerneldoc to standard comment blocks dmaengine: pl330: Demote obvious misuse of kerneldoc to standard comment block dmaengine: ste_dma40: Supply 2 missing struct attribute descriptions dmaengine: altera-msgdma: Fix struct documentation blocks dmaengine: at_hdmac: Repair parameter misspelling and demote non-kerneldoc headers dmaengine: sun4i-dma: Demote obvious misuse of kerneldoc to standard comment blocks dmaengine: fsl-qdma: Fix 'struct fsl_qdma_format' formatting issue dmaengine: imx-sdma: Correct formatting issue and provide 2 new descriptions dmaengine: iop-adma: Function parameter documentation must adhere to correct formatting dmaengine: nbpfaxi: Provide some missing attribute docs and split out slave info dmaengine: xgene-dma: Provide descriptions for 'dev' and 'clk' in device's ddata dmaengine: mv_xor_v2: Supply some missing 'struct mv_xor_v2_device' attribute docs dmaengine: ioat: init: Correct misspelling of function parameter 'c' for channel dmaengine: ioat: Fix some parameter misspelling and provide description for phys_complete Lubomir Rintel (2): dmaengine: mmp_pdma: Do not warn when IRQ is shared by all chans dmaengine: mmp_tdma: share the IRQ line Ludovic Desroches (1): MAINTAINERS: dmaengine: Microchip: add Tudor Ambarus as co-maintainer Peter Ujfalusi (7): dmaengine: ti: k3-udma: Remove dma_sync_single calls for descriptors dmaengine: ti: k3-udma: Do not use ring_get_occ in udma_pop_from_ring dmaengine: ti: k3-udma: Use common defines for TCHANRT/RCHANRT registers dmaengine: ti: k3-udma-private: Use udma_read/write for register access dma
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu 06 Aug 19:48 PDT 2020, John Stultz wrote: > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > wrote: > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > wrote: > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski > > > > > wrote: > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz > > > > > > >> wrote: > > > > > > >>> > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try > > > > > > >>> this > > > > > > >>> patch series out, as I had success with a much older version of > > > > > > >>> Saravana's macro magic. > > > > > > >>> > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm > > > > > > >>> seeing > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > > >>> building the driver in. > > > > > > >> Does that mean the modular version is working? Or you haven't > > > > > > >> tried > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I > > > > > > >> don't > > > > > > >> have the hardware, but it should be easy to guess this issue > > > > > > >> looking > > > > > > >> at the code delta. > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started > > > > > > > on -next > > > > > > > around 20200727, but I didn't track it down as, well, there's > > > > > > > less way > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting > > > > > > > does > > > > > > > allow me to boot again. > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit > > > > > > reverted > > > > > > boots, however, module (on the c630 at least) doesn't boot whether > > > > > > it's > > > > > > a module or built-in. > > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > more grace time to load. > > > > > > > > With the risk of me reading more into this than what you're saying, > > > > please don't upstream anything that depend this parameter to be > > > > increased. > > > > > > > > Compiling any of these drivers as module should not require the user to > > > > pass additional kernel command line parameters in order to get their > > > > device to boot. > > > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > avoid it. But the reality is that it is already required (at least in > > > configurations heavily using modules) to give more time for modules > > > loaded to resolve missing dependencies after init begins (due to > > > changes in the driver core to fail loading after init so that optional > > > dt links aren't eternally looked for). This was seen when trying to > > > enable the qualcom clk drivers to modules. > > > > > > > So to clarify what you're saying, any system that boots successfully > > with the default options is a sign of pure luck - regardless of being > > builtin or modules. > > > > > > And there you have my exact argument against the deferred timeout magic > > going on in the driver core. But as you know people insist that it's > > more important to be able to boot some defunct system from NFS than a > > properly configured one reliably. > > I'd agree, but the NFS case was in use before, and when the original > deferred timeout/optional link handling stuff landed no one complained > they were broken by it (at least at the point where it landed). I did object when this was proposed and I've objected for the last two years, because we keep adding more and more subsystems to follow this broken behavior. > Only later when we started enabling more lower-level core drivers as > modules did the shortened dependency resolution time start to bite > folks. My attempt to set the default to be 30 seconds helped there, > but caused trouble and delays for the NFS case, and "don't break > existing users" seemed to rule, so I set the default timeout back to > 0. > I can't argue with that and I'm at loss on how to turn this around. > > > It doesn't seem necessary in this case, but I suggested it here as > > > I've got it enabled by default in my AOSP builds so that the > > > module-heavy configs for GKI boot properly (even if Saravana's > > > fw_devlink work is disabled). > > > > > > > With all due respect, that's your downstream kernel, the upstream kernel > > should not rely on luck, out-of-tree patches or kernel parameters. > > I agree that would be preferred. But kernel parameters are often there > for these sorts of cases where we can't always do the right thing. > As for out-of-tree patches, broken things don't get fixed until > out-of-tree patches are developed and upstreamed, and I know Saravana > is doing exactly that, and I hope his fw_devlink work helps f
Re: [PATCH v3] drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()
On Fri, Aug 07, 2020 at 11:00:11AM +0800, 何鑫 wrote: > Xin He 于2020年7月21日周二 下午6:17写道: > > > > From: Qi Liu > > > > We should put the reference count of the fence after calling > > virtio_gpu_cmd_submit(). So add the missing dma_fence_put(). > > > > Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit > > synchronization") > > Co-developed-by: Xin He > > Signed-off-by: Xin He > > Signed-off-by: Qi Liu > > Reviewed-by: Muchun Song > > --- > > > > changelog in v3: > > 1) Change the subject from "drm/virtio: fixed memory leak in > > virtio_gpu_execbuffer_ioctl()" to > >"drm/virtio: fix missing dma_fence_put() in > > virtio_gpu_execbuffer_ioctl()" > > 2) Rework the commit log > > > > changelog in v2: > > 1) Add a change description > > > > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > index 5df722072ba0..19c5bc01eb79 100644 > > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > > @@ -179,6 +179,7 @@ static int virtio_gpu_execbuffer_ioctl(struct > > drm_device *dev, void *data, > > > > virtio_gpu_cmd_submit(vgdev, buf, exbuf->size, > > vfpriv->ctx_id, buflist, out_fence); > > + dma_fence_put(&out_fence->f); > > virtio_gpu_notify(vgdev); > > return 0; > > > > -- > > 2.21.1 (Apple Git-122.3) > > > > cc Greg Why? $ ./scripts/get_maintainer.pl --file drivers/gpu/drm/virtio/virtgpu_ioctl.c David Airlie (maintainer:VIRTIO GPU DRIVER) Gerd Hoffmann (maintainer:VIRTIO GPU DRIVER) Daniel Vetter (maintainer:DRM DRIVERS) Sumit Semwal (maintainer:DMA BUFFER SHARING FRAMEWORK) "Christian König" (maintainer:DMA BUFFER SHARING FRAMEWORK) dri-de...@lists.freedesktop.org (open list:VIRTIO GPU DRIVER) virtualizat...@lists.linux-foundation.org (open list:VIRTIO GPU DRIVER) linux-kernel@vger.kernel.org (open list) linux-me...@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK) linaro-mm-...@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK)
Re: [PATCH] epic100: switch from 'pci_' to 'dma_' API
Le 06/08/2020 à 23:23, David Miller a écrit : From: Christophe JAILLET Date: Thu, 6 Aug 2020 22:19:35 +0200 The wrappers in include/linux/pci-dma-compat.h should go away. Christophe, the net-next tree is closed so I'd like to ask that you defer submitting these conversion patches until the net-next tree opens back up again. Thank you. Sure, sorry for the inconvenience. CJ
Re: drivers/staging/mt7621-pci/pci-mt7621.c:189:11: error: 'pci_generic_config_read' undeclared here (not in a function)
Hi, On Fri, Aug 7, 2020 at 1:51 AM kernel test robot wrote: > > Hi Sergio, > > First bad commit (maybe != root cause): > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 7b4ea9456dd3f73238408126ab00f1d906963d81 > commit: 3b2fa0c92686562ac0b8cf00c0326a45814f8e18 MIPS: ralink: enable PCI > support only if driver for mt7621 SoC is selected > date: 9 months ago > config: mips-randconfig-r005-20200807 (attached as .config) > compiler: mipsel-linux-gcc (GCC) 9.3.0 > reproduce (this is a W=1 build): > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > git checkout 3b2fa0c92686562ac0b8cf00c0326a45814f8e18 > # save the attached .config to linux build tree > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross > ARCH=mips > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All errors (new ones prefixed by >>): > > >> drivers/staging/mt7621-pci/pci-mt7621.c:189:11: error: > >> 'pci_generic_config_read' undeclared here (not in a function) > 189 | .read = pci_generic_config_read, > | ^~~ > >> drivers/staging/mt7621-pci/pci-mt7621.c:190:12: error: > >> 'pci_generic_config_write' undeclared here (not in a function) > 190 | .write = pci_generic_config_write, > |^~~~ [snip] PCI is not selected if CONFIG_SOC_MT7621 is not set which is the case for the attached kernel config... There was a time when the driver itself was depending on PCI (which for me seems to be the correct thing to do) but it was removed creating a regression for gnubee and ralink with pci based boards. This was done in 'c4d48cf5e2f0 ("MIPS: ralink: deactivate PCI support for SOC_MT7621")'. To try to fix this issue and being able again to properly compile the driver I sent the patch is mentioned here by the test robot in where HAVE_PCI was selected only in PCI_MT7621 SOC was selected... I was told in the mips kernel list that this way was not the best thing to do to fix the issue but the patch was accepted. I also asked about the original compile issue to the guy who removed first the 'depends PCI' stuff with no answer at all... So, what should I do to properly fix this? Thanks in advance for your time. Best regards, Sergio Paracuellos
drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: cast from restricted __le16
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 86cfccb66937dd6cbf26ed619958b9e587e6a115 commit: 00b2e16e006390069480e90478aa8b6e924996d7 mt76: mt7915: add TxBF capabilities date: 3 months ago config: sparc64-randconfig-s032-20200806 (attached as .config) compiler: sparc64-linux-gcc (GCC) 9.3.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.2-117-g8c7aee71-dirty git checkout 00b2e16e006390069480e90478aa8b6e924996d7 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=sparc64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot sparse warnings: (new ones prefixed by >>) >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 >> drivers/net/wireless/mediatek/mt76/mt7915/init.c:339:26: sparse: sparse: >> cast from restricted __le16 vim +339 drivers/net/wireless/mediatek/mt76/mt7915/init.c 296 297 elem->phy_cap_info[3] &= ~IEEE80211_HE_PHY_CAP3_SU_BEAMFORMER; 298 elem->phy_cap_info[4] &= ~IEEE80211_HE_PHY_CAP4_MU_BEAMFORMER; 299 300 c = IEEE80211_HE_PHY_CAP5_BEAMFORMEE_NUM_SND_DIM_UNDER_80MHZ_MASK | 301 IEEE80211_HE_PHY_CAP5_BEAMFORMEE_NUM_SND_DIM_ABOVE_80MHZ_MASK; 302 elem->phy_cap_info[5] &= ~c; 303 304 c = IEEE80211_HE_PHY_CAP6_TRIG_SU_BEAMFORMER_FB | 305 IEEE80211_HE_PHY_CAP6_TRIG_MU_BEAMFORMER_FB; 306 elem->phy_cap_info[6] &= ~c; 307 308 elem->phy_cap_info[7] &= ~IEEE80211_HE_PHY_CAP7_MAX_NC_MASK; 309 310 c = IEEE80211_HE_PHY_CAP2_NDP_4x_LTF_AND_3_2US | 311 IEEE80211_HE_PHY_CAP2_UL_MU_FULL_MU_MIMO | 312 IEEE80211_HE_PHY_CAP2_UL_MU_PARTIAL_MU_MIMO; 313 elem->phy_cap_info[2] |= c; 314 315 c = IEEE80211_HE_PHY_CAP4_SU_BEAMFORMEE | 316 IEEE80211_HE_PHY_CAP4_BEAMFORMEE_MAX_STS_UNDER_80MHZ_4 | 317 IEEE80211_HE_PHY_CAP4_BEAMFORMEE_MAX_STS_ABOVE_80MHZ_4; 318 elem->phy_cap_info[4] |= c; 319 320 /* do not support NG16 due to spec D4.0 changes subcarrier idx */ 321 c = IEEE80211_HE_PHY_CAP6_CODEBOOK_SIZE_42_SU | 322 IEEE80211_HE_PHY_CAP6_CODEBOOK_SIZE_75_MU; 323 324 if (vif == NL80211_IFTYPE_STATION) 325 c |= IEEE80211_HE_PHY_CAP6_PARTIAL_BANDWIDTH_DL_MUMIMO; 326 327 elem->phy_cap_info[6] |= c; 328 329 if (nss < 2) 330 return; 331 332 if (vif != NL80211_IFTYPE_AP) 333 return; 334 335 elem->phy_cap_info[3] |= IEEE80211_HE_PHY_CAP3_SU_BEAMFORMER; 336 elem->phy_cap_info[4] |= IEEE80211_HE_PHY_CAP4_MU_BEAMFORMER; 337 338 /* num_snd_dim */ > 339 c = (nss - 1) | (max_t(int, mcs->tx_mcs_160, 1) << 3); 340 elem->phy_cap_info[5] |= c; 341 342 c = IEEE80211_HE_PHY_CAP6_TRIG_SU_BEAMFORMER_FB | 343 IEEE80211_HE_PHY_CAP6_TRIG_MU_BEAMFORMER_FB; 344 elem->phy_cap_info[6] |= c; 345 346 /* the maximum cap is 4 x 3, (Nr, Nc) = (3, 2) */ 347 elem->phy_cap_info[7] |= min_t(int, nss - 1, 2) << 3; 348 } 349 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples
Hi Peter, On 8/6/2020 5:24 PM, pet...@infradead.org wrote: On Thu, Aug 06, 2020 at 11:18:27AM +0200, pet...@infradead.org wrote: On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote: +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs) +{ + struct pt_regs *sample_regs = regs; + + /* user only */ + if (!event->attr.exclude_kernel || !event->attr.exclude_hv || + !event->attr.exclude_host || !event->attr.exclude_guest) + return sample_regs; + Is this condition correct? Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It will go "return sample_regs" path. I'm not sure, I'm terminally confused on virt stuff. [A] Suppose we have nested virt: L0-hv | G0/L1-hv | G1 And we're running in G0, then: - 'exclude_hv' would exclude L0 events - 'exclude_host' would ... exclude L1-hv events? - 'exclude_guest' would ... exclude G1 events? [B] Then the next question is, if G0 is a host, does the L1-hv run in G0 userspace or G0 kernel space? I was assuming G0 userspace would not include anything L1 (kvm is a kernel module after all), but what do I know. @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open, if (err) return err; - if (!attr.exclude_kernel) { + if (!attr.exclude_kernel || !attr.exclude_callchain_kernel || + !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) { err = perf_allow_kernel(&attr); if (err) return err; I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel". But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest". Well, I'm very sure G0 userspace should never see L0 or G1 state, so exclude_hv and exclude_guest had better be true. On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right? Same as above, is G0 host state G0 userspace? So even exclude_kernel = 1 but exclude_host = 0, we will still go perf_allow_kernel path. Please correct me if my understanding is wrong. Yes, because with those permission checks in place it means you have permission to see kernel bits. So if I understand 'exclude_host' wrong -- a distinct possibility -- can we then pretty please have the above [A-B] corrected and put in a comment near perf_event_attr and the exclude_* comments changed to refer to that? In my previous mail, I explained what I understood for 'exclude_host', but not sure if it's correct. Needs more review comments. Thanks Jin Yao
Re: INFO: task hung in pipe_read (2)
Hello! On Sat, Aug 01, 2020 at 10:39:00AM -0700, Linus Torvalds wrote: > On Sat, Aug 1, 2020 at 8:30 AM Tetsuo Handa > wrote: > > > > Waiting for response at > > https://lkml.kernel.org/r/45a9b2c8-d0b7-8f00-5b30-0cfe3e028...@i-love.sakura.ne.jp > > . > > I think handle_userfault() should have a (shortish) timeout, and just > return VM_FAULT_RETRY. The 1sec timeout if applied only to kernel faults (not the case yet but it'd be enough to solve the hangcheck timer), will work perfectly for Android, but it will break qemu. [ 916.954313] INFO: task syz-executor.0:61593 blocked for more than 40 seconds. If you want to enforce a timeout, 40 seconds or something of the order of the hangcheck timer would be more reasonable. 1sec is of the same order of magnitude of latency that you'd get with an host kernel upgrade in place with kexec (with the guest memory being preserved in RAM) that you'd suffer occasionally from in most public clouds. So postcopy live migration should be allowed to take 1 sec latency and it shouldn't become a deal breaker, that results in the VM getting killed. > The code is overly complex anyway, because it predates the "just return > RETRY". > > And because we can't wait forever when the source of the fault is a > kernel exception, I think we should add some extra logic to just say > "if this is a retry, we've already done this once, just return an > error". Until the uffp-wp was merged recently, we never needed more than one VM_FAULT_RETRY to handle uffd-missing faults, you seem to want to go back to that which again would be fine for uffd-missing faults. I haven't had time to read and test the testcase properly yet, but at first glance from reading the hangcheck report it looks like there would be just one userfault? So I don't see an immediate connection. The change adding a 1sec timeout would definitely fix this issue, but it'll also break qemu and probably the vast majority of the users. > This is a TEST PATCH ONLY. I think we'll actually have to do something > like this, but I think the final version might need to allow a couple > of retries, rather than just give up after just one second. > > But for testing your case, this patch might be enough to at least show > that "yeah, this kind of approach works". > > Andrea? Comments? As mentioned, this is probably much too aggressive, > but I do think we need to limit the time that the kernel will wait for > page faults. Why is pipe preventing to SIGKILL the task that is blocked on the mutex_lock? Is there any good reason for it or it simply has margin for improvement regardless of the hangcheck report? It'd be great if we can look into that before looking into the uffd specific bits. The hangcheck timer would have zero issues with tasks that can be killed, if only the pipe code could be improved to use mutex_lock_killable. /* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */ if (t->state == TASK_UNINTERRUPTIBLE) check_hung_task(t, timeout); The hangcheck report is just telling us one task was in D state a little too long, but it wasn't fatal error and the kernel wasn't actually destabilized and the only malfunction reported is that a task was unkillable for too long. Now if it's impossible to improve the pipe code so it works better not just for uffd, there's still no reason to worry: we could disable uffd in the pipe context. For example ptrace opts-out of uffds, so that gdb doesn't get stuck if you read a pointer that should be handled by the process that is under debug. I hope it won't be necessary but it wouldn't be a major issue, certainly it wouldn't risk breaking qemu (and non-cooperative APIs are privileged so it could still skip the timeout). > Because userfaultfd has become a huge source of security holes as a > way to time kernel faults or delay them indefinitely. I assume you refer to the below: https://duasynt.com/blog/cve-2016-6187-heap-off-by-one-exploit https://blog.lizzie.io/using-userfaultfd.html These reports don't happen to mention CONFIG_SLAB_FREELIST_RANDOM=y which is already enabled by default in enterprise kernels for a reason and they don't mention the more recent CONFIG_SLAB_FREELIST_HARDENED and CONFIG_SHUFFLE_PAGE_ALLOCATOR. Can they test it with those options enabled again, does it still work so good or not anymore? That would be very helpful to know. Randomizing which is the next page that gets allocated is much more important than worrying about uffd because if you removed uffd you may still have other ways to temporarily stop the page fault depending on the setup. Example: https://bugs.chromium.org/p/project-zero/issues/detail?id=808 The above one doesn't use uffd, but it uses fuse. So is fuse also a source of security holes given they even use it for the exploit in a preferential way instead of uffd? "This can be done by abusing the writev() syscall and FUSE: The attacker mounts a FUSE filesystem that artificia
Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples
Hi Peter, On 8/6/2020 5:18 PM, pet...@infradead.org wrote: On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote: +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs) +{ + struct pt_regs *sample_regs = regs; + + /* user only */ + if (!event->attr.exclude_kernel || !event->attr.exclude_hv || + !event->attr.exclude_host || !event->attr.exclude_guest) + return sample_regs; + Is this condition correct? Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It will go "return sample_regs" path. I'm not sure, I'm terminally confused on virt stuff. Suppose we have nested virt: L0-hv | G0/L1-hv | G1 And we're running in G0, then: - 'exclude_hv' would exclude L0 events - 'exclude_host' would ... exclude L1-hv events? I think the exclude_host is generally set by guest (/arch/x86/kvm/pmu.c, pmc_reprogram_counter). If G0 is a host, if we set exclude_host in G0, I think we will not be able to count the events on G0. The appropriate usage is, G1 sets the exclude_host, then the events on G0 will not be collected by guest G1. That's my understanding for the usage of exclude_host. - 'exclude_guest' would ... exclude G1 events? Similarly, the appropriate usage is, the host (G0) sets the exclude_guest, then the events on G1 will not be collected by host G0. If G1 sets exclude_guest, since no guest is under G1, that's ineffective. Then the next question is, if G0 is a host, does the L1-hv run in G0 userspace or G0 kernel space? I'm not very sure. Maybe some in kernel, some in userspace(qemu)? Maybe some KVM experts can help to answer this question. I was assuming G0 userspace would not include anything L1 (kvm is a kernel module after all), but what do I know. I have tested following conditions in native environment (not in KVM guests), the result is not expected. /* user only */ if (!event->attr.exclude_kernel || !event->attr.exclude_hv || !event->attr.exclude_host || !event->attr.exclude_guest) return sample_regs; perf record -e cycles:u ./div perf report --stdio # Overhead Command Shared Object Symbol # ... ... # 49.51% div libc-2.27.so [.] __random_r 33.93% div libc-2.27.so [.] __random 8.13% div libc-2.27.so [.] rand 4.29% div div [.] main 4.14% div div [.] rand@plt 0.00% div [unknown] [k] 0xbd600cb0 0.00% div [unknown] [k] 0xbd600df0 0.00% div ld-2.27.so[.] _dl_relocate_object 0.00% div ld-2.27.so[.] _dl_start 0.00% div ld-2.27.so[.] _start 0xbd600cb0 and 0xbd600df0 are leaked kernel addresses. From debug, I can see: [ 6272.320258] jinyao: sanitize_sample_regs: event->attr.exclude_kernel = 1, event->attr.exclude_hv = 1, event->attr.exclude_host = 0, event->attr.exclude_guest = 0 So it goes "return sample_regs;" path. @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open, if (err) return err; - if (!attr.exclude_kernel) { + if (!attr.exclude_kernel || !attr.exclude_callchain_kernel || + !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) { err = perf_allow_kernel(&attr); if (err) return err; I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel". But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest". Well, I'm very sure G0 userspace should never see L0 or G1 state, so exclude_hv and exclude_guest had better be true. On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right? Same as above, is G0 host state G0 userspace? So even exclude_kernel = 1 but exclude_host = 0, we will still go perf_allow_kernel path. Please correct me if my understanding is wrong. Yes, because with those permission checks in place it means you have permission to see kernel bits. At the syscall entry, I also added some printk. Aug 7 03:37:40 kbl-ppc kernel: [ 854.688045] syscall: attr.exclude_kernel = 1, attr.exclude_callchain_kernel = 0, attr.exclude_hv = 0, attr.exclude_host = 0, attr.exclude_guest = 0 For my test case ("perf record -e cycles:u ./div"), the perf_allow_kernel() is also executed. Thanks Jin Yao
Re: [PATCH v6 11/11] bus: mhi: core: Introduce sysfs entries for MHI
On Mon, Jul 27, 2020 at 07:02:20PM -0700, Bhaumik Bhatt wrote: > Introduce sysfs entries to enable userspace clients the ability to read > the serial number and the OEM PK Hash values obtained from BHI. OEMs > need to read these device-specific hardware information values through > userspace for factory testing purposes and cannot be exposed via degbufs > as it may remain disabled for performance reasons. Also, update the > documentation for ABI to include these entries. > > Signed-off-by: Bhaumik Bhatt > --- > Documentation/ABI/stable/sysfs-bus-mhi | 21 ++ > MAINTAINERS| 1 + > drivers/bus/mhi/core/init.c| 53 > ++ > 3 files changed, 75 insertions(+) > create mode 100644 Documentation/ABI/stable/sysfs-bus-mhi > > diff --git a/Documentation/ABI/stable/sysfs-bus-mhi > b/Documentation/ABI/stable/sysfs-bus-mhi > new file mode 100644 > index 000..1d5d0d6 > --- /dev/null > +++ b/Documentation/ABI/stable/sysfs-bus-mhi > @@ -0,0 +1,21 @@ > +What:/sys/bus/mhi/devices/.../serialnumber > +Date:Jul 2020 > +KernelVersion: 5.8 > +Contact: Bhaumik Bhatt > +Description: The file holds the serial number of the client device obtained > + using a BHI (Boot Host Interface) register read after at least > + one attempt to power up the device has been done. If read > + without having the device power on at least once, the file will > + read all 0's. > +Users: Any userspace application or clients interested in > device info. I think you're not using tabs here and that's why it is showing mangled. Please use tabs as like other files. Thanks, Mani > + > +What:/sys/bus/mhi/devices/.../oem_pk_hash > +Date:Jul 2020 > +KernelVersion: 5.8 > +Contact: Bhaumik Bhatt > +Description: The file holds the OEM PK Hash value of the endpoint device > + obtained using a BHI (Boot Host Interface) register read after > + at least one attempt to power up the device has been done. If > + read without having the device power on at least once, the file > + will read all 0's. > +Users: Any userspace application or clients interested in > device info. > diff --git a/MAINTAINERS b/MAINTAINERS > index e64e5db..5e49316 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -11018,6 +11018,7 @@ M:Hemant Kumar > L: linux-arm-...@vger.kernel.org > S: Maintained > T: git git://git.kernel.org/pub/scm/linux/kernel/git/mani/mhi.git > +F: Documentation/ABI/stable/sysfs-bus-mhi > F: Documentation/mhi/ > F: drivers/bus/mhi/ > F: include/linux/mhi.h > diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c > index 972dbf0..c086ef2 100644 > --- a/drivers/bus/mhi/core/init.c > +++ b/drivers/bus/mhi/core/init.c > @@ -76,6 +76,56 @@ const char *to_mhi_pm_state_str(enum mhi_pm_state state) > return mhi_pm_state_str[index]; > } > > +static ssize_t serial_number_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + struct mhi_device *mhi_dev = to_mhi_device(dev); > + struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl; > + > + return snprintf(buf, PAGE_SIZE, "Serial Number: %u\n", > + mhi_cntrl->serial_number); > +} > +static DEVICE_ATTR_RO(serial_number); > + > +static ssize_t oem_pk_hash_show(struct device *dev, > + struct device_attribute *attr, > + char *buf) > +{ > + struct mhi_device *mhi_dev = to_mhi_device(dev); > + struct mhi_controller *mhi_cntrl = mhi_dev->mhi_cntrl; > + int i, cnt = 0; > + > + for (i = 0; i < ARRAY_SIZE(mhi_cntrl->oem_pk_hash); i++) > + cnt += snprintf(buf + cnt, PAGE_SIZE - cnt, > + "OEMPKHASH[%d]: 0x%x\n", i, > + mhi_cntrl->oem_pk_hash[i]); > + > + return cnt; > +} > +static DEVICE_ATTR_RO(oem_pk_hash); > + > +static struct attribute *mhi_sysfs_attrs[] = { > + &dev_attr_serial_number.attr, > + &dev_attr_oem_pk_hash.attr, > + NULL, > +}; > + > +static const struct attribute_group mhi_sysfs_group = { > + .attrs = mhi_sysfs_attrs, > +}; > + > +static int mhi_create_sysfs(struct mhi_controller *mhi_cntrl) > +{ > + return sysfs_create_group(&mhi_cntrl->mhi_dev->dev.kobj, > + &mhi_sysfs_group); > +} > + > +static void mhi_destroy_sysfs(struct mhi_controller *mhi_cntrl) > +{ > + sysfs_remove_group(&mhi_cntrl->mhi_dev->dev.kobj, &mhi_sysfs_group); > +} > + > /* MHI protocol requires the transfer ring to be aligned with ring length */ > static int mhi_alloc_aligned_ring(struct mhi_controller *mhi_cntrl, > struct mhi_ring *ring, > @@ -917,6 +967,8 @@ int
Re: [PATCH v3 2/2] dma-pool: Only allocate from CMA when in same memory zone
On Thu, Aug 06, 2020 at 08:47:55PM +0200, Nicolas Saenz Julienne wrote: > There is no guarantee to CMA's placement, so allocating a zone specific > atomic pool from CMA might return memory from a completely different > memory zone. To get around this double check CMA's placement before > allocating from it. As the builtbot pointed out, memblock_start_of_DRAM can't be used from non-__init code. But lookig at it I think throwing that in is bogus anyway, as cma_get_base returns a proper physical address already.
[PATCH] serial: qcom_geni_serial: Fix recent kdb hang
The commit e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work even if UART isn't console") worked pretty well and I've been doing a lot of debugging with it. However, recently I typed "dmesg" in kdb and then held the space key down to scroll through the pagination. My device hung. This was repeatable and I found that it was introduced with the aforementioned commit. It turns out that there are some strange boundary cases in geni where in some weird situations it will signal RX_LAST but then will put 0 in RX_LAST_BYTE. This means that the entire last FIFO entry is valid. This weird corner case is handled in qcom_geni_serial_handle_rx() where you can see that we only honor RX_LAST_BYTE if RX_LAST is set _and_ RX_LAST_BYTE is non-zero. If either of these is not true we use BYTES_PER_FIFO_WORD (4) for the size of the last FIFO word. Let's fix kgdb. While at it, also use the proper #define for 4. Fixes: e42d6c3ec0c7 ("serial: qcom_geni_serial: Make kgdb work even if UART isn't console") Signed-off-by: Douglas Anderson --- drivers/tty/serial/qcom_geni_serial.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c index 07b7b6b05b8b..e27077656939 100644 --- a/drivers/tty/serial/qcom_geni_serial.c +++ b/drivers/tty/serial/qcom_geni_serial.c @@ -361,11 +361,16 @@ static int qcom_geni_serial_get_char(struct uart_port *uport) return NO_POLL_CHAR; if (word_cnt == 1 && (status & RX_LAST)) + /* +* NOTE: If RX_LAST_BYTE_VALID is 0 it needs to be +* treated as if it was BYTES_PER_FIFO_WORD. +*/ private_data->poll_cached_bytes_cnt = (status & RX_LAST_BYTE_VALID_MSK) >> RX_LAST_BYTE_VALID_SHFT; - else - private_data->poll_cached_bytes_cnt = 4; + + if (private_data->poll_cached_bytes_cnt == 0) + private_data->poll_cached_bytes_cnt = BYTES_PER_FIFO_WORD; private_data->poll_cached_bytes = readl(uport->membase + SE_GENI_RX_FIFOn); -- 2.28.0.236.gb10cc79966-goog
Re: [PATCH v6 10/11] bus: mhi: core: Introduce APIs to allocate and free the MHI controller
On Mon, Jul 27, 2020 at 07:02:19PM -0700, Bhaumik Bhatt wrote: > Client devices should use the APIs provided to allocate and free > the MHI controller structure. This will help ensure that the > structure is zero-initialized and there are no false positives > with respect to reading any values such as the serial number or > the OEM PK hash. > > Signed-off-by: Bhaumik Bhatt Can you please also add the Suggested-by tag? Reviewed-by: Manivannan Sadhasivam Thanks, Mani > --- > drivers/bus/mhi/core/init.c | 16 > include/linux/mhi.h | 12 > 2 files changed, 28 insertions(+) > > diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c > index d2c0f6e..972dbf0 100644 > --- a/drivers/bus/mhi/core/init.c > +++ b/drivers/bus/mhi/core/init.c > @@ -959,6 +959,22 @@ void mhi_unregister_controller(struct mhi_controller > *mhi_cntrl) > } > EXPORT_SYMBOL_GPL(mhi_unregister_controller); > > +struct mhi_controller *mhi_alloc_controller(void) > +{ > + struct mhi_controller *mhi_cntrl; > + > + mhi_cntrl = kzalloc(sizeof(*mhi_cntrl), GFP_KERNEL); > + > + return mhi_cntrl; > +} > +EXPORT_SYMBOL_GPL(mhi_alloc_controller); > + > +void mhi_free_controller(struct mhi_controller *mhi_cntrl) > +{ > + kfree(mhi_cntrl); > +} > +EXPORT_SYMBOL_GPL(mhi_free_controller); > + > int mhi_prepare_for_power_up(struct mhi_controller *mhi_cntrl) > { > struct device *dev = &mhi_cntrl->mhi_dev->dev; > diff --git a/include/linux/mhi.h b/include/linux/mhi.h > index d15e9ce..a35d876 100644 > --- a/include/linux/mhi.h > +++ b/include/linux/mhi.h > @@ -530,6 +530,18 @@ struct mhi_driver { > #define to_mhi_device(dev) container_of(dev, struct mhi_device, dev) > > /** > + * mhi_alloc_controller - Allocate the MHI Controller structure > + * Allocate the mhi_controller structure using zero initialized memory > + */ > +struct mhi_controller *mhi_alloc_controller(void); > + > +/** > + * mhi_free_controller - Free the MHI Controller structure > + * Free the mhi_controller structure which was previously allocated > + */ > +void mhi_free_controller(struct mhi_controller *mhi_cntrl); > + > +/** > * mhi_register_controller - Register MHI controller > * @mhi_cntrl: MHI controller to register > * @config: Configuration to use for the controller > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project >
Re: [PATCH v6 07/11] bus: mhi: core: Introduce counters to track MHI device state transitions
On Mon, Jul 27, 2020 at 07:02:16PM -0700, Bhaumik Bhatt wrote: > Use counters to track MHI device state transitions such as those > to M0, M2, or M3 states. This can help in better debug, allowing > the user to see the number of transitions to a certain MHI state > when queried using debugfs entries or via other mechanisms. > > Signed-off-by: Bhaumik Bhatt Reviewed-by: Manivannan Sadhasivam A minor nit below... > --- > drivers/bus/mhi/core/pm.c | 4 > include/linux/mhi.h | 2 ++ > 2 files changed, 6 insertions(+) > > diff --git a/drivers/bus/mhi/core/pm.c b/drivers/bus/mhi/core/pm.c > index 27bb471..ce4d969 100644 > --- a/drivers/bus/mhi/core/pm.c > +++ b/drivers/bus/mhi/core/pm.c > @@ -256,6 +256,7 @@ int mhi_pm_m0_transition(struct mhi_controller *mhi_cntrl) > dev_err(dev, "Unable to transition to M0 state\n"); > return -EIO; > } > + mhi_cntrl->M0++; > > /* Wake up the device */ > read_lock_bh(&mhi_cntrl->pm_lock); > @@ -326,6 +327,8 @@ void mhi_pm_m1_transition(struct mhi_controller > *mhi_cntrl) > mhi_cntrl->dev_state = MHI_STATE_M2; > > write_unlock_irq(&mhi_cntrl->pm_lock); > + > + mhi_cntrl->M2++; > wake_up_all(&mhi_cntrl->state_event); > > /* If there are any pending resources, exit M2 immediately */ > @@ -362,6 +365,7 @@ int mhi_pm_m3_transition(struct mhi_controller *mhi_cntrl) > return -EIO; > } > > + mhi_cntrl->M3++; > wake_up_all(&mhi_cntrl->state_event); > > return 0; > diff --git a/include/linux/mhi.h b/include/linux/mhi.h > index a8379b3..e38de6d 100644 > --- a/include/linux/mhi.h > +++ b/include/linux/mhi.h > @@ -328,6 +328,7 @@ struct mhi_controller_config { > * @dev_state: MHI device state > * @dev_wake: Device wakeup count > * @pending_pkts: Pending packets for the controller > + * @M0, M2, M3, M3_fast: Counters to track number of device MHI state changes > * @transition_list: List of MHI state transitions > * @transition_lock: Lock for protecting MHI state transition list > * @wlock: Lock for protecting device wakeup > @@ -407,6 +408,7 @@ struct mhi_controller { > enum mhi_state dev_state; > atomic_t dev_wake; > atomic_t pending_pkts; > + u32 M0, M2, M3, M3_fast; Are you using M3_fast anywhere in current series? If not, please drop it for now. Thanks, Mani > struct list_head transition_list; > spinlock_t transition_lock; > spinlock_t wlock; > -- > The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, > a Linux Foundation Collaborative Project >
[GIT PULL] xen: branch for v5.9-rc1
Linus, Please git pull the following tag: git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git for-linus-5.9-rc1-tag xen: branch for v5.9-rc1 It contains the following: - two trivial comment fixes - A small series for the Xen balloon driver fixing some issues - A series of the Xen privcmd driver targeting elimination of using get_user_pages*() in this driver - A series for the Xen swiotlb driver cleaning it up and adding support for letting the kernel run as dom0 on Rpi4 Thanks. Juergen arch/arm/xen/mm.c| 34 +- arch/x86/include/asm/xen/hypercall.h | 2 +- drivers/xen/balloon.c| 26 +++- drivers/xen/privcmd.c| 32 +- drivers/xen/swiotlb-xen.c| 119 +-- include/uapi/xen/gntdev.h| 2 +- include/xen/page.h | 1 - include/xen/swiotlb-xen.h| 8 +-- 8 files changed, 119 insertions(+), 105 deletions(-) Boris Ostrovsky (1): swiotlb-xen: use vmalloc_to_page on vmalloc virt addresses Randy Dunlap (2): xen/gntdev: gntdev.h: drop a duplicated word xen: hypercall.h: fix duplicated word Roger Pau Monne (3): xen/balloon: fix accounting in alloc_xenballooned_pages error path xen/balloon: make the balloon wait interruptible Revert "xen/balloon: Fix crash when ballooning on x86 32 bit PAE" Souptick Joarder (3): xen/privcmd: Corrected error handling path xen/privcmd: Mark pages as dirty xen/privcmd: Convert get_user_pages*() to pin_user_pages*() Stefano Stabellini (10): swiotlb-xen: remove start_dma_addr swiotlb-xen: add struct device * parameter to xen_phys_to_bus swiotlb-xen: add struct device * parameter to xen_bus_to_phys swiotlb-xen: add struct device * parameter to xen_dma_sync_for_cpu swiotlb-xen: add struct device * parameter to xen_dma_sync_for_device swiotlb-xen: add struct device * parameter to is_xen_swiotlb_buffer swiotlb-xen: remove XEN_PFN_PHYS swiotlb-xen: introduce phys_to_dma/dma_to_phys translations xen/arm: introduce phys/dma translations in xen_dma_sync_for_* xen/arm: call dma_to_phys on the dma_addr_t parameter of dma_cache_maint
Re: [PATCH] ftrace: Fixup lockdep assert held of text_mutex
On Fri, Aug 7, 2020 at 12:01 PM Steven Rostedt wrote: > > On Fri, 7 Aug 2020 10:59:16 +0800 > Guo Ren wrote: > > > > > > This looks like a bug in the lockdep_assert_held() in whatever arch > > > (riscv) is running. > > Seems you think it's a bug of arch implementation with the wrong usage > > of text_mutex? > > > > Also @riscv maintainer, > > How about modifying it in riscv's code? we still need to solve it. > > > > > > diff --git a/arch/riscv/include/asm/ftrace.h > > b/arch/riscv/include/asm/ftrace.h > > index ace8a6e..fb266c3 100644 > > --- a/arch/riscv/include/asm/ftrace.h > > +++ b/arch/riscv/include/asm/ftrace.h > > @@ -23,6 +23,12 @@ static inline unsigned long > > ftrace_call_adjust(unsigned long addr) > > > > struct dyn_arch_ftrace { > > }; > > + > > +#ifdef CONFIG_DYNAMIC_FTRACE > > +struct dyn_ftrace; > > +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec); > > +#define ftrace_init_nop ftrace_init_nop > > +#endif > > #endif > > > > #ifdef CONFIG_DYNAMIC_FTRACE > > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > > index 2ff63d0..9e9f7c0 100644 > > --- a/arch/riscv/kernel/ftrace.c > > +++ b/arch/riscv/kernel/ftrace.c > > @@ -97,6 +97,17 @@ int ftrace_make_nop(struct module *mod, struct > > dyn_ftrace *rec, > > return __ftrace_modify_call(rec->ip, addr, false); > > } > > > > +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) > > +{ > > + int ret; > > + > > + mutex_lock(&text_mutex); > > + ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR); > > Looking at x86, we have the following code: > > static int ftrace_poke_late = 0; > > int ftrace_arch_code_modify_prepare(void) > __acquires(&text_mutex) > { > /* > * Need to grab text_mutex to prevent a race from module loading > * and live kernel patching from changing the text permissions while > * ftrace has it set to "read/write". > */ > mutex_lock(&text_mutex); > ftrace_poke_late = 1; > return 0; > } > > int ftrace_arch_code_modify_post_process(void) > __releases(&text_mutex) > { > /* > * ftrace_make_{call,nop}() may be called during > * module load, and we need to finish the text_poke_queue() > * that they do, here. > */ > text_poke_finish(); > ftrace_poke_late = 0; > mutex_unlock(&text_mutex); > return 0; > } > > And if ftrace_poke_late is not set, then ftrace_make_nop() does direct > modification (calls text_poke_early(), which is basically a memcpy). > > This path doesn't have any checks against text_mutex being held, > because it only happens at boot up. The solution is ok for me, but I want to get riscv maintainer's opinion before the next patch. @Paul Walmsley @Palmer Dabbelt > > > + mutex_unlock(&text_mutex); > > + > > + return ret; > > +} > > + > > int ftrace_update_ftrace_func(ftrace_func_t func) > > { > > int ret = __ftrace_modify_call((unsigned long)&ftrace_call, > > --- > > > > > > --- a/kernel/trace/ftrace.c > > > > +++ b/kernel/trace/ftrace.c > > > > @@ -26,6 +26,7 @@ > > > > #include > > > > #include > > > > #include > > > > +#include > > > > #include > > > > #include > > > > #include > > > > @@ -6712,9 +6713,11 @@ void __init ftrace_init(void) > > > > > > ftrace_init() is called before SMP is initialized. Nothing else should > > > be running here. That means grabbing a mutex is useless. > > I don't agree, ftrace_init are modifying kernel text, so we should > > give the lock of text_mutex to keep semantic consistency. > > > Did you test your patch on x86 with lockdep? Ah.., no :P > > ftrace_process_locs() grabs the ftrace_lock, which I believe is held > when text_mutex is taken in other locations. So this will probably not > work anyway. > > text_mutex isn't to be taken at the ftrace level. Yes, currently it seemed only to be used by kernel/kprobes.c. -- Best Regards Guo Ren ML: https://lore.kernel.org/linux-csky/
Re: [PATCH v2 37/41] cpufreq: s3c24xx: move low-level clk reg access into platform code
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > Rather than have the cpufreq drivers touch include the > common headers to get the constants, add a small indirection. > This is still not the proper way that would do this through > the common clk API, but it lets us kill off the header file > usage. > > Signed-off-by: Arnd Bergmann > [krzk: Rebase and fix -Wold-style-definition] > Signed-off-by: Krzysztof Kozlowski Acked-by: Viresh Kumar -- viresh
Re: [PATCH v2 36/41] cpufreq: s3c2412: use global s3c2412_cpufreq_setrefresh
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > There are two identical copies of the s3c2412_cpufreq_setrefresh > function: a static one in the cpufreq driver and a global > version in iotiming-s3c2412.c. > > As the function requires the use of a hardcoded register address > from a header that we want to not be visible to drivers, just > move the existing global function and add a declaration in > one of the cpufreq header files. > > Signed-off-by: Arnd Bergmann > Signed-off-by: Krzysztof Kozlowski > --- > drivers/cpufreq/s3c2412-cpufreq.c| 23 > include/linux/soc/samsung/s3c-cpufreq-core.h | 1 + > 2 files changed, 1 insertion(+), 23 deletions(-) Acked-by: Viresh Kumar -- viresh
[git pull] m68knommu changes for v5.9
Hi Linus, Please pull the m68knommu changes for v5.9. Regards Greg The following changes since commit 92ed301919932f13b9172e525674157e983d: Linux 5.8-rc7 (2020-07-26 14:14:06 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git tags/m68knommu-for-v5.9 for you to fetch changes up to fde87ebf1daa8d96e4412aa06536da4b55103e02: m68k: stmark2: enable edma support for dspi (2020-07-27 12:32:00 +1000) m68knommu: collection of fixes for v5.9 Fixes include: . cleanup compiler warnings (IO access functions and unused variables) . ColdFire v3 cache control fix . ColdFire MMU comment cleanup . switch to using asm-generic cmpxchg_local() . stmark platform updates Angelo Dureghello (2): m68k: stmark2: defconfig updates m68k: stmark2: enable edma support for dspi Greg Ungerer (5): m68knommu: __force type casts for raw IO access m68knommu: fix use of cpu_to_le() on IO access m68k: fix ColdFire mmu init compile warning m68knommu: fix overwriting of bits in ColdFire V3 cache control m68k: use asm-generic cmpxchg_local() Mike Rapoport (1): m68k: mcfmmu: remove stale part of comment about steal_context arch/m68k/coldfire/stmark2.c| 5 arch/m68k/configs/stmark2_defconfig | 47 + arch/m68k/include/asm/cmpxchg.h | 8 --- arch/m68k/include/asm/io_no.h | 20 arch/m68k/include/asm/m53xxacr.h| 6 ++--- arch/m68k/mm/mcfmmu.c | 6 - 6 files changed, 45 insertions(+), 47 deletions(-)
Re: [PATCH v2 35/41] ARM: s3c: remove cpufreq header dependencies
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > The cpufreq drivers are split between the machine directory > and the drivers/cpufreq directory. In order to share header > files after we convert s3c to multiplatform, those headers > have to live in a different global location. > > Move them to linux/soc/samsung/ in lack of a better place. > > Signed-off-by: Arnd Bergmann > Signed-off-by: Krzysztof Kozlowski > --- > arch/arm/mach-s3c24xx/common.c | 1 - > arch/arm/mach-s3c24xx/cpufreq-utils.c | 2 +- > arch/arm/mach-s3c24xx/iotiming-s3c2410.c | 2 +- > arch/arm/mach-s3c24xx/iotiming-s3c2412.c | 2 +- > arch/arm/mach-s3c24xx/mach-bast.c | 2 +- > arch/arm/mach-s3c24xx/mach-osiris-dvs.c| 2 +- > arch/arm/mach-s3c24xx/mach-osiris.c| 2 +- > arch/arm/mach-s3c24xx/pll-s3c2410.c| 4 ++-- > arch/arm/mach-s3c24xx/pll-s3c2440-1200.c | 4 ++-- > arch/arm/mach-s3c24xx/pll-s3c2440-16934400.c | 4 ++-- > arch/arm/mach-s3c24xx/s3c2410.c| 1 - > arch/arm/mach-s3c24xx/s3c2412.c| 1 - > arch/arm/mach-s3c24xx/s3c244x.c| 2 -- > arch/arm/mach-s3c64xx/s3c6400.c| 1 - > arch/arm/mach-s3c64xx/s3c6410.c| 2 +- > arch/arm/plat-samsung/include/plat/cpu.h | 9 - > drivers/cpufreq/s3c2410-cpufreq.c | 5 ++--- > drivers/cpufreq/s3c2412-cpufreq.c | 5 ++--- > drivers/cpufreq/s3c2440-cpufreq.c | 5 ++--- > drivers/cpufreq/s3c24xx-cpufreq-debugfs.c | 2 +- > drivers/cpufreq/s3c24xx-cpufreq.c | 5 ++--- > .../linux/soc/samsung/s3c-cpu-freq.h | 4 > .../linux/soc/samsung/s3c-cpufreq-core.h | 6 +- > include/linux/soc/samsung/s3c-pm.h | 10 ++ > 24 files changed, 41 insertions(+), 42 deletions(-) > rename arch/arm/plat-samsung/include/plat/cpu-freq.h => > include/linux/soc/samsung/s3c-cpu-freq.h (97%) > rename arch/arm/plat-samsung/include/plat/cpu-freq-core.h => > include/linux/soc/samsung/s3c-cpufreq-core.h (98%) Acked-by: Viresh Kumar -- viresh
Re: [PATCH v2 34/41] cpufreq: s3c24xx: split out registers
On 06-08-20, 20:20, Krzysztof Kozlowski wrote: > From: Arnd Bergmann > > Each of the cpufreq drivers uses a fixed set of register > bits, copy those definitions into the drivers to avoid > including mach/regs-clock.h. > > Signed-off-by: Arnd Bergmann > [krzk: Fix build by copying also S3C2410_LOCKTIME] > Signed-off-by: Krzysztof Kozlowski > > --- Acked-by: Viresh Kumar -- viresh
Re: [PATCH v3 4/5] mm: memcg: charge memcg percpu memory to the parent cgroup
On Thu, Aug 06, 2020 at 09:16:03PM -0700, Andrew Morton wrote: > On Wed, 29 Jul 2020 19:10:39 +0200 Michal Koutný wrote: > > > Hello. > > > > On Tue, Jun 23, 2020 at 11:45:14AM -0700, Roman Gushchin > > wrote: > > > Because the size of memory cgroup internal structures can dramatically > > > exceed the size of object or page which is pinning it in the memory, it's > > > not a good idea to simple ignore it. It actually breaks the isolation > > > between cgroups. > > No doubt about accounting the memory if it's significant amount. > > > > > Let's account the consumed percpu memory to the parent cgroup. > > Why did you choose charging to the parent of the created cgroup? > > > > Should the charge go the cgroup _that is creating_ the new memcg? > > > > One reason is that there are the throttling mechanisms for memory limits > > and those are better exercised when the actor and its memory artefact > > are the same cgroup, aren't they? Hi! In general, yes. But in this case I think it wouldn't be a good idea: most often cgroups are created by a centralized daemon (systemd), which is usually located in the root cgroup. Even if it's located not in the root cgroup, limiting it's memory will likely affect the whole system, even if only one specific limit was reached. If there is a containerized workload, which creates sub-cgroups, charging it's parent cgroup is perfectly effective. And the opposite, if we'll charge the cgroup of a process, who created a cgroup, we'll not cover the most common case: systemd creating cgroups for all services in the system. > > > > The second reason is based on the example Dlegation Containment > > (Documentation/admin-guide/cgroup-v2.rst) > > > > > For an example, let's assume cgroups C0 and C1 have been delegated to > > > user U0 who created C00, C01 under C0 and C10 under C1 as follows and > > > all processes under C0 and C1 belong to U0:: > > > > > > ~ - C0 - C00 > > > ~ cgroup~ \ C01 > > > ~ hierarchy ~ > > > ~ - C1 - C10 > > > > Thanks to permissions a task running in C0 creating a cgroup in C1 would > > deplete C1's supply victimizing tasks inside C1. Right, but it's quite unusual for tasks from one cgroup to create sub-cgroups in completely different cgroup. In this particular case there are tons of other ways how a task from C00 can hurt C1. > > These week-old issues appear to be significant. Roman? Or someone > else? Oh, I'm sorry, somehow I've missed this letter. Thank you for pointing at it! Thanks!
Re: [PATCH 2/3] mm/huge_memory.c: update tlb entry if pmd is changed
On Fri, 26 Jun 2020 13:43:06 +0530 "Aneesh Kumar K.V" wrote: > On 6/25/20 10:16 PM, Mike Kravetz wrote: > > On 6/25/20 5:01 AM, Aneesh Kumar K.V wrote: > >> Mike Kravetz writes: > >> > >>> On 6/24/20 2:26 AM, Bibo Mao wrote: > When set_pmd_at is called in function do_huge_pmd_anonymous_page, > new tlb entry can be added by software on MIPS platform. > > Here add update_mmu_cache_pmd when pmd entry is set, and > update_mmu_cache_pmd is defined as empty excepts arc/mips platform. > This patch has no negative effect on other platforms except arc/mips > system. > >>> > >>> I am confused by this comment. It appears that update_mmu_cache_pmd > >>> is defined as non-empty on arc, mips, powerpc and sparc architectures. > >>> Am I missing something? > >>> > >>> If those architectures do provide update_mmu_cache_pmd, then the previous > >>> patch and this one now call update_mmu_cache_pmd with the actual faulting > >>> address instead of the huge page aligned address. This was intentional > >>> for mips. However, are there any potential issues on the other > >>> architectures? > >>> I am no expert in any of those architectures. arc looks like it could be > >>> problematic as update_mmu_cache_pmd calls update_mmu_cache and then > >>> operates on (address & PAGE_MASK). That could now be different. > >>> > >> > >> Also we added update_mmu_cache_pmd to update a THP entry. That could be > >> different from a hugetlb entry on some architectures. If we need to do > >> hugetlb equivalent for update_mmu_cache, we should add a different > >> function. > > > > I do not know the mips architecture well enough or if the motivation for > > this patch was based on THP or hugetlb pages. However, it will change > > the address passed to update_mmu_cache_pmd from huge page aligned to the > > actual faulting address. Will such a change in the passed address impact > > the powerpc update_mmu_cache_pmd routine? > > > > Right now powerpc update_mmu_cache_pmd() is a dummy function. But I > agree we should audit arch to make sure such a change can work with > architectures. My comment was related to the fact that mmu cache update > w.r.t THP and hugetlb can be different on some platforms. So we may > want to avoid using the same function for both. So I'll assume that this patch is stalled until such an audit has taken place?
Re: [PATCH v8 0/4] scsi: ufs: Add Host Performance Booster Support
Avri, > Martin - Are you considering to merge the HPB feature eventually to > mainline kernel? I promise to take a look at the new series. But I can't say I'm a big fan of how this feature was defined in the spec. And - as discussed a couple of weeks ago - I would still like to see some supporting evidence from a realistic workload and not a synthetic benchmark. -- Martin K. Petersen Oracle Linux Engineering
Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
On Fri, 3 Jul 2020 18:28:23 +0530 Srikar Dronamraju wrote: > > The memory hotplug changes that somehow because you can hotremove numa > > nodes and therefore make the nodemask sparse but that is not a common > > case. I am not sure what would happen if a completely new node was added > > and its corresponding node was already used by the renumbered one > > though. It would likely conflate the two I am afraid. But I am not sure > > this is really possible with x86 and a lack of a bug report would > > suggest that nobody is doing that at least. > > > > JFYI, > Satheesh copied in this mailchain had opened a bug a year on crash with vcpu > hotplug on memoryless node. > > https://bugzilla.kernel.org/show_bug.cgi?id=202187 So... do we merge this patch or not? Seems that the overall view is "risky but nobody is likely to do anything better any time soon"?
Re: [PATCH 1/3] scripts/sorttable: Change section type of orc_lookup to SHT_PROGBITS
> On Aug 6, 2020, at 11:08 PM, Ingo Molnar wrote: > > > * changhuaixin wrote: > >> Hi, Ingo >> >> Another way to write SHT_PROGBITS is using elf_create_section to write >> orc_lookup table headers, when orc_unwind_ip table and orc_unwind table are >> written. Is this a better solution? >> >> diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c >> index 3f98dcfbc177..860d4dcec8e6 100644 >> --- a/tools/objtool/orc_gen.c >> +++ b/tools/objtool/orc_gen.c >> @@ -183,6 +183,10 @@ int create_orc_sections(struct objtool_file *file) >>u_sec = elf_create_section(file->elf, ".orc_unwind", >> sizeof(struct orc_entry), idx); >> >> + /* make flags of section orc_lookup right */ >> + if (!elf_create_section(file->elf, ".orc_lookup", sizeof(int), 0)) >> + return -1; >> + >>/* populate sections */ >>idx = 0; >>for_each_sec(file, sec) { > > Looks much nicer IMO. > > Mind turning this into a proper patch that does it plus reverts the > hack? > A new patchset is sent. Thanks, huaixin > Thanks, > > Ingo
Re: [PATCH] kernel: time: delete repeated words in comments
On Thu, Aug 6, 2020 at 8:32 PM Randy Dunlap wrote: > > Drop repeated words in kernel/time/. > {when, one, into} > Acked-by: John Stultz (I'm sure I'm to blame) thanks -john
[PATCH 3/3] x86/unwind/orc: Simplify unwind_init() for x86 boot
The ORC fast lookup table is built by scripts/sorttable tool. All that is left is setting lookup_num_blocks. Signed-off-by: Huaixin Chang Signed-off-by: Shile Zhang --- arch/x86/kernel/unwind_orc.c | 41 ++--- 1 file changed, 2 insertions(+), 39 deletions(-) diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c index ec88bbe08a32..29890389b4f6 100644 --- a/arch/x86/kernel/unwind_orc.c +++ b/arch/x86/kernel/unwind_orc.c @@ -264,48 +264,11 @@ void unwind_module_init(struct module *mod, void *_orc_ip, size_t orc_ip_size, void __init unwind_init(void) { - size_t orc_ip_size = (void *)__stop_orc_unwind_ip - (void *)__start_orc_unwind_ip; - size_t orc_size = (void *)__stop_orc_unwind - (void *)__start_orc_unwind; - size_t num_entries = orc_ip_size / sizeof(int); - struct orc_entry *orc; - int i; - - if (!num_entries || orc_ip_size % sizeof(int) != 0 || - orc_size % sizeof(struct orc_entry) != 0 || - num_entries != orc_size / sizeof(struct orc_entry)) { - orc_warn("WARNING: Bad or missing .orc_unwind table. Disabling unwinder.\n"); - return; - } - /* -* Note, the orc_unwind and orc_unwind_ip tables were already -* sorted at build time via the 'sorttable' tool. -* It's ready for binary search straight away, no need to sort it. +* All ORC tables are sorted and built via sorttable tool. Initialize +* lookup_num_blocks only. */ - - /* Initialize the fast lookup table: */ lookup_num_blocks = orc_lookup_end - orc_lookup; - for (i = 0; i < lookup_num_blocks-1; i++) { - orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, -num_entries, -LOOKUP_START_IP + (LOOKUP_BLOCK_SIZE * i)); - if (!orc) { - orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n"); - return; - } - - orc_lookup[i] = orc - __start_orc_unwind; - } - - /* Initialize the ending block: */ - orc = __orc_find(__start_orc_unwind_ip, __start_orc_unwind, num_entries, -LOOKUP_STOP_IP); - if (!orc) { - orc_warn("WARNING: Corrupt .orc_unwind table. Disabling unwinder.\n"); - return; - } - orc_lookup[lookup_num_blocks-1] = orc - __start_orc_unwind; - orc_init = true; } -- 2.14.4.44.g2045bb6
[PATCH 2/3] scripts/sorttable: Build ORC fast lookup table via sorttable tool
Since ORC tables are already sorted by sorttable tool, let us move building of fast lookup table into sorttable tool too. This saves us 6380us from boot time under Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz with 64 cores. Signed-off-by: Huaixin Chang Signed-off-by: Shile Zhang --- arch/x86/include/asm/orc_lookup.h | 16 -- arch/x86/include/asm/orc_types.h | 16 ++ arch/x86/kernel/vmlinux.lds.S | 2 +- scripts/sorttable.h| 96 +++--- tools/arch/x86/include/asm/orc_types.h | 16 ++ 5 files changed, 122 insertions(+), 24 deletions(-) diff --git a/arch/x86/include/asm/orc_lookup.h b/arch/x86/include/asm/orc_lookup.h index 241631282e43..c75eb1f82bdb 100644 --- a/arch/x86/include/asm/orc_lookup.h +++ b/arch/x86/include/asm/orc_lookup.h @@ -5,22 +5,6 @@ #ifndef _ORC_LOOKUP_H #define _ORC_LOOKUP_H -/* - * This is a lookup table for speeding up access to the .orc_unwind table. - * Given an input address offset, the corresponding lookup table entry - * specifies a subset of the .orc_unwind table to search. - * - * Each block represents the end of the previous range and the start of the - * next range. An extra block is added to give the last range an end. - * - * The block size should be a power of 2 to avoid a costly 'div' instruction. - * - * A block size of 256 was chosen because it roughly doubles unwinder - * performance while only adding ~5% to the ORC data footprint. - */ -#define LOOKUP_BLOCK_ORDER 8 -#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER) - #ifndef LINKER_SCRIPT extern unsigned int orc_lookup[]; diff --git a/arch/x86/include/asm/orc_types.h b/arch/x86/include/asm/orc_types.h index d25534940bde..b93c6a7b4da4 100644 --- a/arch/x86/include/asm/orc_types.h +++ b/arch/x86/include/asm/orc_types.h @@ -9,6 +9,22 @@ #include #include +/* + * This is a lookup table for speeding up access to the .orc_unwind table. + * Given an input address offset, the corresponding lookup table entry + * specifies a subset of the .orc_unwind table to search. + * + * Each block represents the end of the previous range and the start of the + * next range. An extra block is added to give the last range an end. + * + * The block size should be a power of 2 to avoid a costly 'div' instruction. + * + * A block size of 256 was chosen because it roughly doubles unwinder + * performance while only adding ~5% to the ORC data footprint. + */ +#define LOOKUP_BLOCK_ORDER 8 +#define LOOKUP_BLOCK_SIZE (1 << LOOKUP_BLOCK_ORDER) + /* * The ORC_REG_* registers are base registers which are used to find other * registers on the stack. diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S index 9a03e5b23135..75760e7f6319 100644 --- a/arch/x86/kernel/vmlinux.lds.S +++ b/arch/x86/kernel/vmlinux.lds.S @@ -29,7 +29,7 @@ #include #include #include -#include +#include #include #include diff --git a/scripts/sorttable.h b/scripts/sorttable.h index a2baa2fefb13..de9822f8ae8f 100644 --- a/scripts/sorttable.h +++ b/scripts/sorttable.h @@ -93,12 +93,50 @@ char g_err[ERRSTR_MAXSZ]; int *g_orc_ip_table; struct orc_entry *g_orc_table; +static unsigned long orc_ip_table_offset; pthread_t orc_sort_thread; +struct orc_sort_param { + size_t lookup_table_size; + unsigned int*orc_lookup_table; + unsigned long start_ip; + size_t text_size; + unsigned intorc_num_entries; +}; + static inline unsigned long orc_ip(const int *ip) { - return (unsigned long)ip + *ip; + return (unsigned long)ip + *ip + orc_ip_table_offset; +} + +static struct orc_entry *__orc_find(int *ip_table, struct orc_entry *u_table, + unsigned int num_entries, unsigned long ip) +{ + int *first = ip_table; + int *last = ip_table + num_entries - 1; + int *mid = first, *found = first; + + if (!num_entries) + return NULL; + + /* +* Do a binary range search to find the rightmost duplicate of a given +* starting address. Some entries are section terminators which are +* "weak" entries for ensuring there are no gaps. They should be +* ignored when they conflict with a real entry. +*/ + while (first <= last) { + mid = first + ((last - first) / 2); + + if (orc_ip(mid) <= ip) { + found = mid; + first = mid + 1; + } else + last = mid - 1; + } + + return u_table + (found - ip_table); } static int orc_sort_cmp(const void *_a, const void *_b) @@ -130,18 +168,24 @@ static void *sort_orctable(void *arg) int *idxs = NULL; int *tmp_orc_ip_table = NULL; struct orc_entry *tmp_orc_table = NULL; - unsigned int *orc_ip_size = (unsigned int *)arg; - unsigned int num_entries = *orc_ip_size / sizeof(int); +
[PATCH v2 0/3] Build ORC fast lookup table in scripts/sorttable tool
Move building of fast lookup table from boot to sorttable tool. This saves us 6380us boot time on Intel(R) Xeon(R) CPU E5-2682 v4 @ 2.50GHz with cores. It adds a little more than 7ms to boot time when testing on the same CPU. Changelog v2: 1. Write .orc_lookup section header via objtool 2. Move two ORC lookup table macro from orc_lookup.h into orc_types.h 3. Spell 'ORC' in capitalized fashion Huaixin Chang (3): objtool: Write .orc_lookup section header scripts/sorttable: Build ORC fast lookup table via sorttable tool x86/unwind/orc: Simplify unwind_init() for x86 boot arch/x86/include/asm/orc_lookup.h | 16 -- arch/x86/include/asm/orc_types.h | 16 ++ arch/x86/kernel/unwind_orc.c | 41 +-- arch/x86/kernel/vmlinux.lds.S | 2 +- scripts/sorttable.h| 96 +++--- tools/arch/x86/include/asm/orc_types.h | 16 ++ tools/objtool/orc_gen.c| 4 ++ 7 files changed, 128 insertions(+), 63 deletions(-) -- 2.14.4.44.g2045bb6
[PATCH 1/3] objtool: Write .orc_lookup section header
The purpose of this patch is to set sh_type to SHT_PROGBITS and remove write bits away from sh_flags. In order to write section header, just call elf_create_section() upon section orc_lookup with 0 entry written. Originally, section headers are as follows: [23] .orc_unwind_ipPROGBITS 8259f4b8 0179f4b8 00178bbc A 0 0 1 [24] .rela.orc_unwind_ RELA 11e57b58 008d4668 0018 I 7023 8 [25] .orc_unwind PROGBITS 82718074 01918074 0023519a A 0 0 1 [26] .orc_lookup NOBITS 8294d210 01b4d20e 00030038 WA 0 0 1 [27] .vvar PROGBITS 8297e000 01b7e000 1000 WA 0 0 16 Now, they are changed to: [23] .orc_unwind_ipPROGBITS 8259f4b8 0179f4b8 00178bbc A 0 0 1 [24] .rela.orc_unwind_ RELA 11e57b58 008d4668 0018 I 7023 8 [25] .orc_unwind PROGBITS 82718074 01918074 0023519a A 0 0 1 [26] .orc_lookup PROGBITS 8294d210 01b4d210 00030038 A 0 0 1 [27] .vvar PROGBITS 8297e000 01b7e000 1000 WA 0 0 16 Signed-off-by: Huaixin Chang --- tools/objtool/orc_gen.c | 4 1 file changed, 4 insertions(+) diff --git a/tools/objtool/orc_gen.c b/tools/objtool/orc_gen.c index 968f55e6dd94..2b2653979ad6 100644 --- a/tools/objtool/orc_gen.c +++ b/tools/objtool/orc_gen.c @@ -189,6 +189,10 @@ int create_orc_sections(struct objtool_file *file) u_sec = elf_create_section(file->elf, ".orc_unwind", sizeof(struct orc_entry), idx); + /* make flags of section orc_lookup right */ + if (!elf_create_section(file->elf, ".orc_lookup", sizeof(int), 0)) + return -1; + /* populate sections */ idx = 0; for_each_sec(file, sec) { -- 2.14.4.44.g2045bb6
security/integrity/platform_certs/keyring_handler.c:66:16: sparse: sparse: Using plain integer as NULL pointer
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: 86cfccb66937dd6cbf26ed619958b9e587e6a115 commit: ad723674d6758478829ee766e3f1a2a24d56236f x86/efi: move common keyring handler functions to new file date: 9 months ago config: x86_64-randconfig-s022-20200807 (attached as .config) compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.2-117-g8c7aee71-dirty git checkout ad723674d6758478829ee766e3f1a2a24d56236f # save the attached .config to linux build tree make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot sparse warnings: (new ones prefixed by >>) >> security/integrity/platform_certs/keyring_handler.c:66:16: sparse: sparse: >> Using plain integer as NULL pointer security/integrity/platform_certs/keyring_handler.c:79:16: sparse: sparse: Using plain integer as NULL pointer vim +66 security/integrity/platform_certs/keyring_handler.c 57 58 /* 59 * Return the appropriate handler for particular signature list types found in 60 * the UEFI db and MokListRT tables. 61 */ 62 __init efi_element_handler_t get_handler_for_db(const efi_guid_t *sig_type) 63 { 64 if (efi_guidcmp(*sig_type, efi_cert_x509_guid) == 0) 65 return add_to_platform_keyring; > 66 return 0; 67 } 68 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
Re: [PATCH v3 4/5] mm: memcg: charge memcg percpu memory to the parent cgroup
On Wed, 29 Jul 2020 19:10:39 +0200 Michal Koutný wrote: > Hello. > > On Tue, Jun 23, 2020 at 11:45:14AM -0700, Roman Gushchin wrote: > > Because the size of memory cgroup internal structures can dramatically > > exceed the size of object or page which is pinning it in the memory, it's > > not a good idea to simple ignore it. It actually breaks the isolation > > between cgroups. > No doubt about accounting the memory if it's significant amount. > > > Let's account the consumed percpu memory to the parent cgroup. > Why did you choose charging to the parent of the created cgroup? > > Should the charge go the cgroup _that is creating_ the new memcg? > > One reason is that there are the throttling mechanisms for memory limits > and those are better exercised when the actor and its memory artefact > are the same cgroup, aren't they? > > The second reason is based on the example Dlegation Containment > (Documentation/admin-guide/cgroup-v2.rst) > > > For an example, let's assume cgroups C0 and C1 have been delegated to > > user U0 who created C00, C01 under C0 and C10 under C1 as follows and > > all processes under C0 and C1 belong to U0:: > > > > ~ - C0 - C00 > > ~ cgroup~ \ C01 > > ~ hierarchy ~ > > ~ - C1 - C10 > > Thanks to permissions a task running in C0 creating a cgroup in C1 would > deplete C1's supply victimizing tasks inside C1. These week-old issues appear to be significant. Roman? Or someone else?
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
07.08.2020 07:06, Sowjanya Komatineni пишет: > > On 8/6/20 9:05 PM, Sowjanya Komatineni wrote: >> >> On 8/6/20 9:01 PM, Dmitry Osipenko wrote: >>> 07.08.2020 06:18, Sowjanya Komatineni пишет: On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: > On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: >> On 8/6/20 7:31 PM, Dmitry Osipenko wrote: >>> 06.08.2020 22:01, Sowjanya Komatineni пишет: >>> ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); >>> Could you please explain why the ACTIVE bit can't be polled >>> instead of >>> using the fixed delay? Doesn't ACTIVE bit represents the state of >>> the >>> busy FSM? >> Based on internal discussion, ACTIVE bit gets cleared when all >> enabled pads calibration is done (same time as when DONE set to 1). >> >> Will request HW designer to look into design and confirm exactly when >> ACTIVE bit gets cleared. >> >> Will get back on this. >> > Verified with HW designer. above is correct. ACTIVE bit update happens > same time as DONE bit. > > Active = !(DONE) > > In case of calibration logic waiting for LP-11 where done bit does not > get set, ACTIVE will still be 1 and on next start trigger new > calibration will start > Based on internal design check from designer, as long as its in waiting for LP-11 stage, next calibration request can be triggered again but ACTIVE bit we will see it at 1. So we should check for DONE bits to confirm if calibration is done or not. To start next calibration, it can take effect as long as its in wait for LP-11 mode. >>> I meant the start_calibration() will poll the ACTIVE bit (calibration >>> busy), while the finish_calibration() will poll the DONE bit >>> (calibration applied). >> >> ACTIVE bit can be 1 when previous calibration process does not see LP-11. >> >> So there is no need to use ACTIVE bit during start of calibration. >> >> At HW level, both ACTIVE and DONE bits get set at same time. >> >> So waiting for ACTIVE to be 0 during start calibration instead of >> *75uS will not work as ACTIVE bit will not become 0 after calibration >> sequence codes and it will get updated along with DONE bits only after >> applying results to pads which happens after seeing LP-11 on pads. >> > *typo fixed I see now, thank you.
[GIT PULL] xfs: new code for 5.9-rc1
Hi Linus, Please pull this large pile of new xfs code for 5.9. There are quite a few changes in this release, the most notable of which is that we've made inode flushing fully asynchronous, and we no longer block memory reclaim on this. Furthermore, we have fixed a long-standing bug in the quota code where soft limit warnings and inode limits were never tracked properly. Moving further down the line, the reflink control loops have been redesigned to behave more efficiently; and numerous small bugs have been fixed (see below). The xattr and quota code have been extensively refactored in preparation for more new features coming down the line. Finally, the behavior of DAX between ext4 and xfs has been stabilized, which gets us a step closer to removing the experimental tag from that feature. We have a few new contributors this time around. Welcome, all! This branch merges cleanly with master as of a few minutes ago, so please let me know if anything strange happens. I anticipate a second pull request next week for a few small bugfixes that have been trickling in, but this is it for big changes. --D The following changes since commit dcb7fd82c75ee2d6e6f9d8cc71c52519ed52e258: Linux 5.8-rc4 (2020-07-05 16:20:22 -0700) are available in the Git repository at: git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/xfs-5.9-merge-7 for you to fetch changes up to 818d5a91559ffe1e1f2095dcbbdb96c13fdb94ec: fs/xfs: Support that ioctl(SETXFLAGS/GETXFLAGS) can set/get inode DAX on XFS. (2020-07-28 20:28:20 -0700) New code for 5.9: - Fix some btree block pingponging problems when swapping extents - Redesign the reflink copy loop so that we only run one remapping operation per transaction. This helps us avoid running out of block reservation on highly deduped filesystems. - Take the MMAPLOCK around filemap_map_pages. - Make inode reclaim fully async so that we avoid stalling processes on flushing inodes to disk. - Reduce inode cluster buffer RMW cycles by attaching the buffer to dirty inodes so we won't let go of the cluster buffer when we know we're going to need it soon. - Add some more checks to the realtime bitmap file scrubber. - Don't trip false lockdep warnings in fs freeze. - Remove various redundant lines of code. - Remove unnecessary calls to xfs_perag_{get,put}. - Preserve I_VERSION state across remounts. - Fix an unmount hang due to AIL going to sleep with a non-empty delwri buffer list. - Fix an error in the inode allocation space reservation macro that caused regressions in generic/531. - Fix a potential livelock when dquot flush fails because the dquot buffer is locked. - Fix a miscalculation when reserving inode quota that could cause users to exceed a hardlimit. - Refactor struct xfs_dquot to use native types for incore fields instead of abusing the ondisk struct for this purpose. This will eventually enable proper y2038+ support, but for now it merely cleans up the quota function declarations. - Actually increment the quota softlimit warning counter so that soft failures turn into hard(er) failures when they exceed the softlimit warning counter limits set by the administrator. - Split incore dquot state flags into their own field and namespace, to avoid mixing them with quota type flags. - Create a new quota type flags namespace so that we can make it obvious when a quota function takes a quota type (user, group, project) as an argument. - Rename the ondisk dquot flags field to type, as that more accurately represents what we store in it. - Drop our bespoke memory allocation flags in favor of GFP_*. - Rearrange the xattr functions so that we no longer mix metadata updates and transaction management (e.g. rolling complex transactions) in the same functions. This work will prepare us for atomic xattr operations (itself a prerequisite for directory backrefs) in future release cycles. - Support FS_DAX_FL (aka FS_XFLAG_DAX) via GETFLAGS/SETFLAGS. Allison Collins (22): xfs: Add xfs_has_attr and subroutines xfs: Check for -ENOATTR or -EEXIST xfs: Factor out new helper functions xfs_attr_rmtval_set xfs: Pull up trans handling in xfs_attr3_leaf_flipflags xfs: Split apart xfs_attr_leaf_addname xfs: Refactor xfs_attr_try_sf_addname xfs: Pull up trans roll from xfs_attr3_leaf_setflag xfs: Factor out xfs_attr_rmtval_invalidate xfs: Pull up trans roll in xfs_attr3_leaf_clearflag xfs: Refactor xfs_attr_rmtval_remove xfs: Pull up xfs_attr_rmtval_invalidate xfs: Add helper function xfs_attr_node_shrink xfs: Remove unneeded xfs_trans_roll_inode calls xfs: Remove xfs_trans_roll in xfs_attr_node_removename xfs: Add helpers xfs_attr_is_shortform and xfs_attr_set_shortform xfs: Add helper function xfs_attr_leaf_mark_incomplete xfs
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 9:05 PM, Sowjanya Komatineni wrote: On 8/6/20 9:01 PM, Dmitry Osipenko wrote: 07.08.2020 06:18, Sowjanya Komatineni пишет: On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? Based on internal discussion, ACTIVE bit gets cleared when all enabled pads calibration is done (same time as when DONE set to 1). Will request HW designer to look into design and confirm exactly when ACTIVE bit gets cleared. Will get back on this. Verified with HW designer. above is correct. ACTIVE bit update happens same time as DONE bit. Active = !(DONE) In case of calibration logic waiting for LP-11 where done bit does not get set, ACTIVE will still be 1 and on next start trigger new calibration will start Based on internal design check from designer, as long as its in waiting for LP-11 stage, next calibration request can be triggered again but ACTIVE bit we will see it at 1. So we should check for DONE bits to confirm if calibration is done or not. To start next calibration, it can take effect as long as its in wait for LP-11 mode. I meant the start_calibration() will poll the ACTIVE bit (calibration busy), while the finish_calibration() will poll the DONE bit (calibration applied). ACTIVE bit can be 1 when previous calibration process does not see LP-11. So there is no need to use ACTIVE bit during start of calibration. At HW level, both ACTIVE and DONE bits get set at same time. So waiting for ACTIVE to be 0 during start calibration instead of *75uS will not work as ACTIVE bit will not become 0 after calibration sequence codes and it will get updated along with DONE bits only after applying results to pads which happens after seeing LP-11 on pads. *typo fixed
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 9:01 PM, Dmitry Osipenko wrote: 07.08.2020 06:18, Sowjanya Komatineni пишет: On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? Based on internal discussion, ACTIVE bit gets cleared when all enabled pads calibration is done (same time as when DONE set to 1). Will request HW designer to look into design and confirm exactly when ACTIVE bit gets cleared. Will get back on this. Verified with HW designer. above is correct. ACTIVE bit update happens same time as DONE bit. Active = !(DONE) In case of calibration logic waiting for LP-11 where done bit does not get set, ACTIVE will still be 1 and on next start trigger new calibration will start Based on internal design check from designer, as long as its in waiting for LP-11 stage, next calibration request can be triggered again but ACTIVE bit we will see it at 1. So we should check for DONE bits to confirm if calibration is done or not. To start next calibration, it can take effect as long as its in wait for LP-11 mode. I meant the start_calibration() will poll the ACTIVE bit (calibration busy), while the finish_calibration() will poll the DONE bit (calibration applied). ACTIVE bit can be 1 when previous calibration process does not see LP-11. So there is no need to use ACTIVE bit during start of calibration. At HW level, both ACTIVE and DONE bits get set at same time. So waiting for ACTIVE to be 0 during start calibration instead of 7uS will not work as ACTIVE bit will not become 0 after calibration sequence codes and it will get updated along with DONE bits only after applying results to pads which happens after seeing LP-11 on pads.
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
07.08.2020 06:18, Sowjanya Komatineni пишет: > > On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: >> >> On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: >>> >>> On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... > +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) > { > const struct tegra_mipi_soc *soc = device->mipi->soc; > unsigned int i; > @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct > tegra_mipi_device *device) > value |= MIPI_CAL_CTRL_START; > tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); > - mutex_unlock(&device->mipi->lock); > - clk_disable(device->mipi->clk); > + /* > + * Wait for min 72uS to let calibration logic finish calibration > + * sequence codes before waiting for pads idle state to apply the > + * results. > + */ > + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? >>> >>> Based on internal discussion, ACTIVE bit gets cleared when all >>> enabled pads calibration is done (same time as when DONE set to 1). >>> >>> Will request HW designer to look into design and confirm exactly when >>> ACTIVE bit gets cleared. >>> >>> Will get back on this. >>> >> Verified with HW designer. above is correct. ACTIVE bit update happens >> same time as DONE bit. >> >> Active = !(DONE) >> >> In case of calibration logic waiting for LP-11 where done bit does not >> get set, ACTIVE will still be 1 and on next start trigger new >> calibration will start >> > Based on internal design check from designer, as long as its in waiting > for LP-11 stage, next calibration request can be triggered again but > ACTIVE bit we will see it at 1. So we should check for DONE bits to > confirm if calibration is done or not. > > To start next calibration, it can take effect as long as its in wait for > LP-11 mode. I meant the start_calibration() will poll the ACTIVE bit (calibration busy), while the finish_calibration() will poll the DONE bit (calibration applied).
Re: [PATCH] ftrace: Fixup lockdep assert held of text_mutex
On Fri, 7 Aug 2020 10:59:16 +0800 Guo Ren wrote: > > > > This looks like a bug in the lockdep_assert_held() in whatever arch > > (riscv) is running. > Seems you think it's a bug of arch implementation with the wrong usage > of text_mutex? > > Also @riscv maintainer, > How about modifying it in riscv's code? we still need to solve it. > > > diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h > index ace8a6e..fb266c3 100644 > --- a/arch/riscv/include/asm/ftrace.h > +++ b/arch/riscv/include/asm/ftrace.h > @@ -23,6 +23,12 @@ static inline unsigned long > ftrace_call_adjust(unsigned long addr) > > struct dyn_arch_ftrace { > }; > + > +#ifdef CONFIG_DYNAMIC_FTRACE > +struct dyn_ftrace; > +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec); > +#define ftrace_init_nop ftrace_init_nop > +#endif > #endif > > #ifdef CONFIG_DYNAMIC_FTRACE > diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c > index 2ff63d0..9e9f7c0 100644 > --- a/arch/riscv/kernel/ftrace.c > +++ b/arch/riscv/kernel/ftrace.c > @@ -97,6 +97,17 @@ int ftrace_make_nop(struct module *mod, struct > dyn_ftrace *rec, > return __ftrace_modify_call(rec->ip, addr, false); > } > > +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) > +{ > + int ret; > + > + mutex_lock(&text_mutex); > + ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR); Looking at x86, we have the following code: static int ftrace_poke_late = 0; int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex) { /* * Need to grab text_mutex to prevent a race from module loading * and live kernel patching from changing the text permissions while * ftrace has it set to "read/write". */ mutex_lock(&text_mutex); ftrace_poke_late = 1; return 0; } int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex) { /* * ftrace_make_{call,nop}() may be called during * module load, and we need to finish the text_poke_queue() * that they do, here. */ text_poke_finish(); ftrace_poke_late = 0; mutex_unlock(&text_mutex); return 0; } And if ftrace_poke_late is not set, then ftrace_make_nop() does direct modification (calls text_poke_early(), which is basically a memcpy). This path doesn't have any checks against text_mutex being held, because it only happens at boot up. > + mutex_unlock(&text_mutex); > + > + return ret; > +} > + > int ftrace_update_ftrace_func(ftrace_func_t func) > { > int ret = __ftrace_modify_call((unsigned long)&ftrace_call, > --- > > > > --- a/kernel/trace/ftrace.c > > > +++ b/kernel/trace/ftrace.c > > > @@ -26,6 +26,7 @@ > > > #include > > > #include > > > #include > > > +#include > > > #include > > > #include > > > #include > > > @@ -6712,9 +6713,11 @@ void __init ftrace_init(void) > > > > ftrace_init() is called before SMP is initialized. Nothing else should > > be running here. That means grabbing a mutex is useless. > I don't agree, ftrace_init are modifying kernel text, so we should > give the lock of text_mutex to keep semantic consistency. Did you test your patch on x86 with lockdep? ftrace_process_locs() grabs the ftrace_lock, which I believe is held when text_mutex is taken in other locations. So this will probably not work anyway. text_mutex isn't to be taken at the ftrace level. -- Steve
Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks
On 2020/8/7 上午11:37, Jason Wang wrote: On 2020/8/7 上午3:18, Alex Dewar wrote: In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory is allocated to *in and *out, but only the values of in and out are null-checked (i.e. there is a missing dereference). Fix this. Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Alex Dewar Acked-by: Jason Wang Colin posted something similar: [PATCH][next] vdpa/mlx5: fix memory allocation failure checks And I think his fix is better since it prevent raw pointers to be freed. Thanks
Re: [PATCH][next] vdpa/mlx5: fix memory allocation failure checks
On 2020/8/7 上午12:08, Colin King wrote: From: Colin Ian King The memory allocation failure checking for in and out is currently checking if the pointers are valid rather than the contents of what they point to. Hence the null check on failed memory allocations is incorrect. Fix this by adding the missing indirection in the check. Also for the default case, just set the *in and *out to null as these don't have any thing allocated to kfree. Finally remove the redundant *in and *out check as these have been already done on each allocation in the case statement. Addresses-Coverity: ("Null pointer dereference") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Colin Ian King Acked-by: Jason Wang --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 13 ++--- 1 file changed, 6 insertions(+), 7 deletions(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 3ec44a4f0e45..55bc58e1dae9 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(*outlen, GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(qp_2rst_in, *in, opcode, cmd); @@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rst2init_qp_in, *in, opcode, cmd); @@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(init2rtr_qp_in, *in, opcode, cmd); @@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd); @@ -927,16 +927,15 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl MLX5_SET(qpc, qpc, rnr_retry, 7); break; default: - goto outerr; + goto outerr_nullify; } - if (!*in || !*out) - goto outerr; return; outerr: kfree(*in); kfree(*out); +outerr_nullify: *in = NULL; *out = NULL; }
Re: [BUG] crypto: hisilicon: accessing the data mapped to streaming DMA
On 2020/8/3 9:29, Jia-Ju Bai wrote: > > > On 2020/8/3 9:12, Zhou Wang wrote: >> On 2020/8/2 22:52, Jia-Ju Bai wrote: >>> In qm_qp_ctx_cfg(), "sqc" and "aeqc" are mapped to streaming DMA: >>>eqc_dma = dma_map_single(..., eqc, ...); >>>.. >>>aeqc_dma = dma_map_single(..., aeqc, ...); >> Only sqc, cqc will be configured in qm_qp_ctx_cfg. >> >>> Then "sqc" and "aeqc" are accessed at many places, such as: >>>eqc->base_l = cpu_to_le32(lower_32_bits(qm->eqe_dma)); >>>eqc->base_h = cpu_to_le32(upper_32_bits(qm->eqe_dma)); >>>.. >>>aeqc->base_l = cpu_to_le32(lower_32_bits(qm->aeqe_dma)); >>>aeqc->base_h = cpu_to_le32(upper_32_bits(qm->aeqe_dma)); >> There are sqc, cqc, eqc, aeqc, you seems misunderstand them. >> >>> These accesses may cause data inconsistency between CPU cache and hardware. >>> >>> I am not sure how to properly fix this problem, and thus I only report it. >> In qm_qp_ctx_cfg, sqc/cqc memory will be allocated and related mailbox will >> be sent >> to hardware. In qm_eq_ctx_cfg, eqc/aeqc related operations will be done. >> >> So there is no problem here :) > > Ah, sorry, I misunderstood qm_eq_ctx_cfg() and qm_qp_ctx_cfg(), because their > names are quite similar. > Now, I re-organize this report as follows: > > In qm_eq_ctx_cfg(), "eqc" and "aeqc" are mapped to streaming DMA: > eqc_dma = dma_map_single(..., eqc, ...); > .. > aeqc_dma = dma_map_single(..., aeqc, ...); > > Then "sqc" and "aeqc" are accessed at some places in qm_eq_ctx_cfg(), such as: > eqc->base_l = cpu_to_le32(lower_32_bits(qm->eqe_dma)); > eqc->base_h = cpu_to_le32(upper_32_bits(qm->eqe_dma)); > .. > aeqc->base_l = cpu_to_le32(lower_32_bits(qm->aeqe_dma)); > aeqc->base_h = cpu_to_le32(upper_32_bits(qm->aeqe_dma)); > > These accesses may cause data inconsistency between CPU cache and hardware. > > Besides, in qm_qp_ctx_cfg(), "sqc" and "cqc" are mapped to streaming DMA: > sqc_dma = dma_map_single(..., sqc, ...); > .. > cqc_dma = dma_map_single(..., cqc, ...); > > > Then "sqc" and "cqc" are at some places in qm_qp_ctx_cfg(), such as: > sqc->cq_num = cpu_to_le16(qp_id); > sqc->w13 = cpu_to_le16(QM_MK_SQC_W13(0, 1, qp->alg_type)); > .. > cqc->dw3 = cpu_to_le32(QM_MK_CQC_DW3_V2(4)); > cqc->w8 = 0; > > These accesses may cause data inconsistency between CPU cache and hardware. > > I think such problems (if they are real) can be fixed by finishing data > assignment before DMA mapping. Sorry for late. I got your idea, from the semantics of dma_map_single/dma_unmap_single, we should not mix CPU and device DMA accessing here. The reason of working well is our hardware is hardware CC. Will fix this later. Thanks, Zhou > > > Best wishes, > Jia-Ju Bai > > . >
Re: [PATCHv3] coresight: etm4x: Fix etm4_count race by moving cpuhp callbacks to init
Sorry for the noise. Please ignore previous comment. The change is in old patch set of my series. This change is good to go. On 2020-08-07 11:52, Tingwei Zhang wrote: On Wed, Jul 29, 2020 at 01:13:10PM +0800, Sai Prakash Ranjan wrote: etm4_count keeps track of number of ETMv4 registered and on some systems, a race is observed on etm4_count variable which can lead to multiple calls to cpuhp_setup_state_nocalls_cpuslocked(). This function internally calls cpuhp_store_callbacks() which prevents multiple registrations of callbacks for a given state and due to this race, it returns -EBUSY leading to ETM probe failures like below. coresight-etm4x: probe of 704.etm failed with error -16 This race can easily be triggered with async probe by setting probe type as PROBE_PREFER_ASYNCHRONOUS and with ETM power management property "arm,coresight-loses-context-with-cpu". Prevent this race by moving cpuhp callbacks to etm driver init since the cpuhp callbacks doesn't have to depend on the etm4_count and can be once setup during driver init. Similarly we move cpu_pm notifier registration to driver init and completely remove etm4_count usage. Also now we can use non cpuslocked version of cpuhp callbacks with this movement. Fixes: 9b6a3f3633a5 ("coresight: etmv4: Fix CPU power management setup in probe() function") Fixes: 58eb457be028 ("hwtracing/coresight-etm4x: Convert to hotplug state machine") Suggested-by: Suzuki K Poulose Signed-off-by: Sai Prakash Ranjan --- Changes in v3: * Minor cleanups from v2 and change to device_initcall (Stephen Boyd) * Move to non cpuslocked cpuhp callbacks and rename to etm_pm_setup() (Mike Leach) Changes in v2: * Rearrange cpuhp callbacks and move them to driver init (Suzuki K Poulose) --- drivers/hwtracing/coresight/coresight-etm4x.c | 65 +-- 1 file changed, 31 insertions(+), 34 deletions(-) diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c b/drivers/hwtracing/coresight/coresight-etm4x.c index 6d7d2169bfb2..fddfd93b9a7b 100644 --- a/drivers/hwtracing/coresight/coresight-etm4x.c +++ b/drivers/hwtracing/coresight/coresight-etm4x.c @@ -48,8 +48,6 @@ module_param(pm_save_enable, int, 0444); MODULE_PARM_DESC(pm_save_enable, "Save/restore state on power down: 1 = never, 2 = self-hosted"); -/* The number of ETMv4 currently registered */ -static int etm4_count; static struct etmv4_drvdata *etmdrvdata[NR_CPUS]; static void etm4_set_default_config(struct etmv4_config *config); static int etm4_set_event_filters(struct etmv4_drvdata *drvdata, @@ -1398,28 +1396,25 @@ static struct notifier_block etm4_cpu_pm_nb = { .notifier_call = etm4_cpu_pm_notify, }; -/* Setup PM. Called with cpus locked. Deals with error conditions and counts */ -static int etm4_pm_setup_cpuslocked(void) +/* Setup PM. Deals with error conditions and counts */ +static int __init etm4_pm_setup(void) { int ret; - if (etm4_count++) - return 0; - ret = cpu_pm_register_notifier(&etm4_cpu_pm_nb); if (ret) - goto reduce_count; + return ret; - ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING, - "arm/coresight4:starting", - etm4_starting_cpu, etm4_dying_cpu); + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING, + "arm/coresight4:starting", + etm4_starting_cpu, etm4_dying_cpu); if (ret) goto unregister_notifier; - ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN, - "arm/coresight4:online", - etm4_online_cpu, NULL); + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, + "arm/coresight4:online", + etm4_online_cpu, NULL); /* HP dyn state ID returned in ret on success */ if (ret > 0) { @@ -1428,21 +1423,15 @@ static int etm4_pm_setup_cpuslocked(void) } /* failed dyn state - remove others */ - cpuhp_remove_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING); + cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING); unregister_notifier: cpu_pm_unregister_notifier(&etm4_cpu_pm_nb); - -reduce_count: - --etm4_count; return ret; } -static void etm4_pm_clear(void) +static void __init etm4_pm_clear(void) { - if (--etm4_count != 0) - return; - cpu_pm_unregister_notifier(&etm4_cpu_pm_nb); cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING); if (hp_online) { @@ -1498,22 +1487,12 @@ static int etm4_probe(struct amba_device *adev, const struct amba_id *id) if (!desc.name) return -ENOMEM;
Re: [PATCH] vdpa/mlx5: Fix uninitialised variable in core/mr.c
On 2020/8/7 上午2:56, Alex Dewar wrote: If the kernel is unable to allocate memory for the variable dmr then err will be returned without being set. Set err to -ENOMEM in this case. Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code") Addresses-Coverity: ("Uninitialized variables") Signed-off-by: Alex Dewar Acked-by: Jason Wang --- drivers/vdpa/mlx5/core/mr.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/vdpa/mlx5/core/mr.c b/drivers/vdpa/mlx5/core/mr.c index f5dec0274133..ef1c550f8266 100644 --- a/drivers/vdpa/mlx5/core/mr.c +++ b/drivers/vdpa/mlx5/core/mr.c @@ -319,8 +319,10 @@ static int add_direct_chain(struct mlx5_vdpa_dev *mvdev, u64 start, u64 size, u8 while (size) { sz = (u32)min_t(u64, MAX_KLM_SIZE, size); dmr = kzalloc(sizeof(*dmr), GFP_KERNEL); - if (!dmr) + if (!dmr) { + err = -ENOMEM; goto err_alloc; + } dmr->start = st; dmr->end = st + sz; -- 2.28.0
Re: [PATCHv3] coresight: etm4x: Fix etm4_count race by moving cpuhp callbacks to init
On Wed, Jul 29, 2020 at 01:13:10PM +0800, Sai Prakash Ranjan wrote: > etm4_count keeps track of number of ETMv4 registered and on some systems, > a race is observed on etm4_count variable which can lead to multiple calls > to cpuhp_setup_state_nocalls_cpuslocked(). This function internally calls > cpuhp_store_callbacks() which prevents multiple registrations of callbacks > for a given state and due to this race, it returns -EBUSY leading to ETM > probe failures like below. > > coresight-etm4x: probe of 704.etm failed with error -16 > > This race can easily be triggered with async probe by setting probe type > as PROBE_PREFER_ASYNCHRONOUS and with ETM power management property > "arm,coresight-loses-context-with-cpu". > > Prevent this race by moving cpuhp callbacks to etm driver init since the > cpuhp callbacks doesn't have to depend on the etm4_count and can be once > setup during driver init. Similarly we move cpu_pm notifier registration > to driver init and completely remove etm4_count usage. Also now we can > use non cpuslocked version of cpuhp callbacks with this movement. > > Fixes: 9b6a3f3633a5 ("coresight: etmv4: Fix CPU power management setup in > probe() function") > Fixes: 58eb457be028 ("hwtracing/coresight-etm4x: Convert to hotplug state > machine") > Suggested-by: Suzuki K Poulose > Signed-off-by: Sai Prakash Ranjan > --- > > Changes in v3: > * Minor cleanups from v2 and change to device_initcall (Stephen Boyd) > * Move to non cpuslocked cpuhp callbacks and rename to etm_pm_setup() (Mike > Leach) > > Changes in v2: > * Rearrange cpuhp callbacks and move them to driver init (Suzuki K Poulose) > > --- > drivers/hwtracing/coresight/coresight-etm4x.c | 65 +-- > 1 file changed, 31 insertions(+), 34 deletions(-) > > diff --git a/drivers/hwtracing/coresight/coresight-etm4x.c > b/drivers/hwtracing/coresight/coresight-etm4x.c > index 6d7d2169bfb2..fddfd93b9a7b 100644 > --- a/drivers/hwtracing/coresight/coresight-etm4x.c > +++ b/drivers/hwtracing/coresight/coresight-etm4x.c > @@ -48,8 +48,6 @@ module_param(pm_save_enable, int, 0444); > MODULE_PARM_DESC(pm_save_enable, > "Save/restore state on power down: 1 = never, 2 = self-hosted"); > > -/* The number of ETMv4 currently registered */ > -static int etm4_count; > static struct etmv4_drvdata *etmdrvdata[NR_CPUS]; > static void etm4_set_default_config(struct etmv4_config *config); > static int etm4_set_event_filters(struct etmv4_drvdata *drvdata, > @@ -1398,28 +1396,25 @@ static struct notifier_block etm4_cpu_pm_nb = { > .notifier_call = etm4_cpu_pm_notify, > }; > > -/* Setup PM. Called with cpus locked. Deals with error conditions and > counts */ > -static int etm4_pm_setup_cpuslocked(void) > +/* Setup PM. Deals with error conditions and counts */ > +static int __init etm4_pm_setup(void) > { > int ret; > > - if (etm4_count++) > - return 0; > - > ret = cpu_pm_register_notifier(&etm4_cpu_pm_nb); > if (ret) > - goto reduce_count; > + return ret; > > - ret = > cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING, > -"arm/coresight4:starting", > -etm4_starting_cpu, > etm4_dying_cpu); > + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING, > + "arm/coresight4:starting", > + etm4_starting_cpu, etm4_dying_cpu); > > if (ret) > goto unregister_notifier; > > - ret = cpuhp_setup_state_nocalls_cpuslocked(CPUHP_AP_ONLINE_DYN, > -"arm/coresight4:online", > -etm4_online_cpu, NULL); > + ret = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN, > + "arm/coresight4:online", > + etm4_online_cpu, NULL); > > /* HP dyn state ID returned in ret on success */ > if (ret > 0) { > @@ -1428,21 +1423,15 @@ static int etm4_pm_setup_cpuslocked(void) > } > > /* failed dyn state - remove others */ > - cpuhp_remove_state_nocalls_cpuslocked(CPUHP_AP_ARM_CORESIGHT_STARTING); > + cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING); > > unregister_notifier: > cpu_pm_unregister_notifier(&etm4_cpu_pm_nb); > - > -reduce_count: > - --etm4_count; > return ret; > } > > -static void etm4_pm_clear(void) > +static void __init etm4_pm_clear(void) > { > - if (--etm4_count != 0) > - return; > - > cpu_pm_unregister_notifier(&etm4_cpu_pm_nb); > cpuhp_remove_state_nocalls(CPUHP_AP_ARM_CORESIGHT_STARTING); > if (hp_online) { > @@ -1498,22 +1487,12 @@ static int etm4_probe(struct amba_device *adev, > const struct amba_id *id) > if (!desc.name) > return -ENOMEM; > > - cpus_read_l
Re: [PATCH 1/2] vdpa: ifcvf: return err when fail to request config irq
On 2020/7/23 下午5:12, Jason Wang wrote: We ignore the err of requesting config interrupt, fix this. Fixes: e7991f376a4d ("ifcvf: implement config interrupt in IFCVF") Cc: Zhu Lingshan Signed-off-by: Jason Wang --- drivers/vdpa/ifcvf/ifcvf_main.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c index f5a60c14b979..ae7110955a44 100644 --- a/drivers/vdpa/ifcvf/ifcvf_main.c +++ b/drivers/vdpa/ifcvf/ifcvf_main.c @@ -76,6 +76,10 @@ static int ifcvf_request_irq(struct ifcvf_adapter *adapter) ret = devm_request_irq(&pdev->dev, irq, ifcvf_config_changed, 0, vf->config_msix_name, vf); + if (ret) { + IFCVF_ERR(pdev, "Failed to request config irq\n"); + return ret; + } for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++) { snprintf(vf->vring[i].msix_name, 256, "ifcvf[%s]-%d\n", Hi Michael: Any comments on this series? Thanks
Re: [PATCH] vdpa/mlx5: Fix erroneous null pointer checks
On 2020/8/7 上午3:18, Alex Dewar wrote: In alloc_inout() in net/mlx5_vnet.c, there are a few places where memory is allocated to *in and *out, but only the values of in and out are null-checked (i.e. there is a missing dereference). Fix this. Addresses-Coverity: ("CID 1496603: (REVERSE_INULL)") Fixes: 1a86b377aa21 ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices") Signed-off-by: Alex Dewar Acked-by: Jason Wang --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 3ec44a4f0e45..bcb6600c2839 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -867,7 +867,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(qp_2rst_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(*outlen, GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(qp_2rst_in, *in, opcode, cmd); @@ -879,7 +879,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rst2init_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rst2init_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rst2init_qp_in, *in, opcode, cmd); @@ -896,7 +896,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(init2rtr_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(init2rtr_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(init2rtr_qp_in, *in, opcode, cmd); @@ -914,7 +914,7 @@ static void alloc_inout(struct mlx5_vdpa_net *ndev, int cmd, void **in, int *inl *outlen = MLX5_ST_SZ_BYTES(rtr2rts_qp_out); *in = kzalloc(*inlen, GFP_KERNEL); *out = kzalloc(MLX5_ST_SZ_BYTES(rtr2rts_qp_out), GFP_KERNEL); - if (!in || !out) + if (!*in || !*out) goto outerr; MLX5_SET(rtr2rts_qp_in, *in, opcode, cmd);
[PATCH] kernel: time: delete repeated words in comments
Drop repeated words in kernel/time/. {when, one, into} Signed-off-by: Randy Dunlap Cc: John Stultz Cc: Thomas Gleixner --- kernel/time/alarmtimer.c |2 +- kernel/time/sched_clock.c |2 +- kernel/time/timekeeping.c |2 +- 3 files changed, 3 insertions(+), 3 deletions(-) --- linux-next-20200806.orig/kernel/time/alarmtimer.c +++ linux-next-20200806/kernel/time/alarmtimer.c @@ -192,7 +192,7 @@ static void alarmtimer_dequeue(struct al * When a alarm timer fires, this runs through the timerqueue to * see which alarms expired, and runs those. If there are more alarm * timers queued for the future, we set the hrtimer to fire when - * when the next future alarm timer expires. + * the next future alarm timer expires. */ static enum hrtimer_restart alarmtimer_fired(struct hrtimer *timer) { --- linux-next-20200806.orig/kernel/time/sched_clock.c +++ linux-next-20200806/kernel/time/sched_clock.c @@ -229,7 +229,7 @@ void __init generic_sched_clock_init(voi { /* * If no sched_clock() function has been provided at that point, -* make it the final one one. +* make it the final one. */ if (cd.actual_read_sched_clock == jiffy_sched_clock_read) sched_clock_register(jiffy_sched_clock_read, BITS_PER_LONG, HZ); --- linux-next-20200806.orig/kernel/time/timekeeping.c +++ linux-next-20200806/kernel/time/timekeeping.c @@ -2001,7 +2001,7 @@ static inline unsigned int accumulate_ns * logarithmic_accumulation - shifted accumulation of cycles * * This functions accumulates a shifted interval of cycles into - * into a shifted interval nanoseconds. Allows for O(log) accumulation + * a shifted interval nanoseconds. Allows for O(log) accumulation * loop. * * Returns the unconsumed cycles.
[PATCH] kernel: printk: delete repeated words in comments
Drop repeated words "the" in kernel/printk/. Signed-off-by: Randy Dunlap Cc: Petr Mladek Cc: Sergey Senozhatsky --- kernel/printk/printk.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- linux-next-20200806.orig/kernel/printk/printk.c +++ linux-next-20200806/kernel/printk/printk.c @@ -2461,7 +2461,7 @@ void console_unlock(void) * * console_trylock() is not able to detect the preemptive * context reliably. Therefore the value must be stored before -* and cleared after the the "again" goto label. +* and cleared after the "again" goto label. */ do_cond_resched = console_may_schedule; again: @@ -3374,7 +3374,7 @@ EXPORT_SYMBOL_GPL(kmsg_dump_get_line); * @len: length of line placed into buffer * * Start at the end of the kmsg buffer and fill the provided buffer - * with as many of the the *youngest* kmsg records that fit into it. + * with as many of the *youngest* kmsg records that fit into it. * If the buffer is large enough, all available kmsg records will be * copied with a single call. *
[PATCH] kernel: sched: delete repeated words in comments
Drop repeated words in kernel/sched/. {in, not} Signed-off-by: Randy Dunlap Cc: Mathieu Desnoyers Cc: "Paul E. McKenney" Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot --- kernel/sched/fair.c |2 +- kernel/sched/membarrier.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- linux-next-20200806.orig/kernel/sched/fair.c +++ linux-next-20200806/kernel/sched/fair.c @@ -5109,7 +5109,7 @@ static void do_sched_cfs_slack_timer(str /* * When a group wakes up we want to make sure that its quota is not already * expired/exceeded, otherwise it may be allowed to steal additional ticks of - * runtime as update_curr() throttling can not not trigger until it's on-rq. + * runtime as update_curr() throttling can not trigger until it's on-rq. */ static void check_enqueue_throttle(struct cfs_rq *cfs_rq) { --- linux-next-20200806.orig/kernel/sched/membarrier.c +++ linux-next-20200806/kernel/sched/membarrier.c @@ -229,7 +229,7 @@ static int sync_runqueues_membarrier_sta /* * For each cpu runqueue, if the task's mm match @mm, ensure that all -* @mm's membarrier state set bits are also set in in the runqueue's +* @mm's membarrier state set bits are also set in the runqueue's * membarrier state. This ensures that a runqueue scheduling * between threads which are users of @mm has its membarrier state * updated.
[PATCH] kernel: trace: delete repeated words in comments
Drop repeated words in kernel/trace/. {and, the, not} Signed-off-by: Randy Dunlap Cc: Steven Rostedt Cc: Ingo Molnar --- kernel/trace/ftrace.c |2 +- kernel/trace/trace.c |2 +- kernel/trace/trace_dynevent.c |2 +- kernel/trace/trace_events_synth.c |2 +- kernel/trace/tracing_map.c|2 +- 5 files changed, 5 insertions(+), 5 deletions(-) --- linux-next-20200806.orig/kernel/trace/ftrace.c +++ linux-next-20200806/kernel/trace/ftrace.c @@ -2402,7 +2402,7 @@ struct ftrace_ops direct_ops = { * * If the record has the FTRACE_FL_REGS set, that means that it * wants to convert to a callback that saves all regs. If FTRACE_FL_REGS - * is not not set, then it wants to convert to the normal callback. + * is not set, then it wants to convert to the normal callback. * * Returns the address of the trampoline to set to */ --- linux-next-20200806.orig/kernel/trace/trace.c +++ linux-next-20200806/kernel/trace/trace.c @@ -9243,7 +9243,7 @@ void ftrace_dump(enum ftrace_dump_mode o } /* -* We need to stop all tracing on all CPUS to read the +* We need to stop all tracing on all CPUS to read * the next buffer. This is a bit expensive, but is * not done often. We fill all what we can read, * and then release the locks again. --- linux-next-20200806.orig/kernel/trace/trace_dynevent.c +++ linux-next-20200806/kernel/trace/trace_dynevent.c @@ -402,7 +402,7 @@ void dynevent_arg_init(struct dynevent_a * whitespace, all followed by a separator, if applicable. After the * first arg string is successfully appended to the command string, * the optional @operator is appended, followed by the second arg and - * and optional @separator. If no separator was specified when + * optional @separator. If no separator was specified when * initializing the arg, a space will be appended. */ void dynevent_arg_pair_init(struct dynevent_arg_pair *arg_pair, --- linux-next-20200806.orig/kernel/trace/trace_events_synth.c +++ linux-next-20200806/kernel/trace/trace_events_synth.c @@ -1211,7 +1211,7 @@ __synth_event_trace_start(struct trace_e * ENABLED bit is set (which attaches the probe thus allowing * this code to be called, etc). Because this is called * directly by the user, we don't have that but we still need -* to honor not logging when disabled. For the the iterated +* to honor not logging when disabled. For the iterated * trace case, we save the enabed state upon start and just * ignore the following data calls. */ --- linux-next-20200806.orig/kernel/trace/tracing_map.c +++ linux-next-20200806/kernel/trace/tracing_map.c @@ -260,7 +260,7 @@ int tracing_map_add_var(struct tracing_m * to use cmp_fn. * * A key can be a subset of a compound key; for that purpose, the - * offset param is used to describe where within the the compound key + * offset param is used to describe where within the compound key * the key referenced by this key field resides. * * Return: The index identifying the field in the map and associated
[PATCH] kernel: locking: delete repeated words in comments
Drop repeated words in kernel/locking/. {it, no, the} Signed-off-by: Randy Dunlap Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Will Deacon --- kernel/locking/rtmutex.c |4 ++-- kernel/locking/rwsem.c |2 +- kernel/locking/semaphore.c |2 +- 3 files changed, 4 insertions(+), 4 deletions(-) --- linux-next-20200806.orig/kernel/locking/rtmutex.c +++ linux-next-20200806/kernel/locking/rtmutex.c @@ -1438,7 +1438,7 @@ rt_mutex_fasttrylock(struct rt_mutex *lo } /* - * Performs the wakeup of the the top-waiter and re-enables preemption. + * Performs the wakeup of the top-waiter and re-enables preemption. */ void rt_mutex_postunlock(struct wake_q_head *wake_q) { @@ -1833,7 +1833,7 @@ struct task_struct *rt_mutex_next_owner( * been started. * @waiter:the pre-initialized rt_mutex_waiter * - * Wait for the the lock acquisition started on our behalf by + * Wait for the lock acquisition started on our behalf by * rt_mutex_start_proxy_lock(). Upon failure, the caller must call * rt_mutex_cleanup_proxy_lock(). * --- linux-next-20200806.orig/kernel/locking/rwsem.c +++ linux-next-20200806/kernel/locking/rwsem.c @@ -1177,7 +1177,7 @@ rwsem_down_write_slowpath(struct rw_sema /* * If there were already threads queued before us and: -* 1) there are no no active locks, wake the front +* 1) there are no active locks, wake the front * queued process(es) as the handoff bit might be set. * 2) there are no active writers and some readers, the lock * must be read owned; so we try to wake any read lock --- linux-next-20200806.orig/kernel/locking/semaphore.c +++ linux-next-20200806/kernel/locking/semaphore.c @@ -119,7 +119,7 @@ EXPORT_SYMBOL(down_killable); * @sem: the semaphore to be acquired * * Try to acquire the semaphore atomically. Returns 0 if the semaphore has - * been acquired successfully or 1 if it it cannot be acquired. + * been acquired successfully or 1 if it cannot be acquired. * * NOTE: This return value is inverted from both spin_trylock and * mutex_trylock! Be careful about this when converting code.
[PATCH] kernel: events: delete repeated words in comments
Drop repeated words in kernel/events/. {if, the, that, with, time} Signed-off-by: Randy Dunlap Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo --- kernel/events/core.c|8 kernel/events/uprobes.c |2 +- 2 files changed, 5 insertions(+), 5 deletions(-) --- linux-next-20200806.orig/kernel/events/core.c +++ linux-next-20200806/kernel/events/core.c @@ -265,7 +265,7 @@ static void event_function_call(struct p if (!event->parent) { /* * If this is a !child event, we must hold ctx::mutex to -* stabilize the the event->ctx relation. See +* stabilize the event->ctx relation. See * perf_event_ctx_lock(). */ lockdep_assert_held(&ctx->mutex); @@ -1299,7 +1299,7 @@ static void put_ctx(struct perf_event_co * life-time rules separate them. That is an exiting task cannot fork, and a * spawning task cannot (yet) exit. * - * But remember that that these are parent<->child context relations, and + * But remember that these are parent<->child context relations, and * migration does not affect children, therefore these two orderings should not * interact. * @@ -1438,7 +1438,7 @@ static u64 primary_event_id(struct perf_ /* * Get the perf_event_context for a task and lock it. * - * This has to cope with with the fact that until it is locked, + * This has to cope with the fact that until it is locked, * the context could get moved to another task. */ static struct perf_event_context * @@ -2475,7 +2475,7 @@ static void perf_set_shadow_time(struct * But this is a bit hairy. * * So instead, we have an explicit cgroup call to remain -* within the time time source all along. We believe it +* within the time source all along. We believe it * is cleaner and simpler to understand. */ if (is_cgroup_event(event)) --- linux-next-20200806.orig/kernel/events/uprobes.c +++ linux-next-20200806/kernel/events/uprobes.c @@ -1735,7 +1735,7 @@ void uprobe_free_utask(struct task_struc } /* - * Allocate a uprobe_task object for the task if if necessary. + * Allocate a uprobe_task object for the task if necessary. * Called when the thread hits a breakpoint. * * Returns:
[PATCH] kernel: bpf: delete repeated words in comments
Drop repeated words in kernel/bpf/. {has, the} Signed-off-by: Randy Dunlap Cc: Alexei Starovoitov Cc: Daniel Borkmann Cc: net...@vger.kernel.org Cc: b...@vger.kernel.org --- kernel/bpf/core.c |2 +- kernel/bpf/verifier.c |2 +- 2 files changed, 2 insertions(+), 2 deletions(-) --- linux-next-20200806.orig/kernel/bpf/core.c +++ linux-next-20200806/kernel/bpf/core.c @@ -1966,7 +1966,7 @@ void bpf_prog_array_delete_safe(struct b * @index: the index of the program to replace * * Skips over dummy programs, by not counting them, when calculating - * the the position of the program to replace. + * the position of the program to replace. * * Return: * * 0 - Success --- linux-next-20200806.orig/kernel/bpf/verifier.c +++ linux-next-20200806/kernel/bpf/verifier.c @@ -8294,7 +8294,7 @@ static bool stacksafe(struct bpf_func_st if (old->stack[spi].slot_type[i % BPF_REG_SIZE] != cur->stack[spi].slot_type[i % BPF_REG_SIZE]) /* Ex: old explored (safe) state has STACK_SPILL in -* this stack slot, but current has has STACK_MISC -> +* this stack slot, but current has STACK_MISC -> * this verifier states are not equivalent, * return false to continue verification of this path */
答复: 答复: 答复: 答复: 答复: 答复: 答复: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR
Hi baolu, I understand what you mean is that you want to put the following processing code in the acpi_device_create_direct_mappings() into the probe_acpi_namespace_devices() ,right? If you mean it , I think it's OK. if (pn_dev == NULL) { acpi_device->bus->iommu_ops = &intel_iommu_ops; ret = iommu_probe_device(acpi_device); if (ret) { pr_err("acpi_device probe fail! ret:%d\n", ret); return ret; } return 0; } Best regards Felix cui-oc -邮件原件- 发件人: Lu Baolu 发送时间: 2020年8月7日 9:08 收件人: FelixCui-oc ; Joerg Roedel ; io...@lists.linux-foundation.org; linux-kernel@vger.kernel.org; David Woodhouse 抄送: baolu...@linux.intel.com; RaymondPang-oc ; CobeChen-oc 主题: Re: 答复: 答复: 答复: 答复: 答复: 答复: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR Hi Felix, On 2020/8/6 14:51, FelixCui-oc wrote: > Hi baolu, > >Sure. Before that, let me sync my understanding with you. You > have an acpi namespace device in ANDD table, it also shows up in the device > scope of a RMRR. > >Current code doesn't enumerate that device for the RMRR, hence > iommu_create_device_direct_mappings() doesn't work for this device. > > >At the same time, probe_acpi_namespace_devices() doesn't work > for this device, hence you want to add a home-made > >acpi_device_create_direct_mappings() helper. > > Your understanding is right. > But there is a problem that even if the namespace device in > rmrr is enumerated in the code, probe_acpi_namespace_devices() also doesn't > work for this device. > This is because the dev parameter of the > iommu_create_device_direct_mappings() is not the namespace device in RMRR. > The actual parameter passed in is the namespace device's > physical node device. > In iommu_create_device_direct_mappings(), the physical node > device passed in cannot match the namespace device in rmrr->device[],right? > We need acpi_device_create_direct_mappings() helper ? > > In addition, adev->physical_node_list is related to the __HID > of namespace device reported by the bios. > For example, if the __HID reported by the bios belongs to > acpi_pnp_device_ids[], adev->physical_node_list has no devices. > So in acpi_device_create_direct_mappings(), I added the case > that adev->physical_node_list is empty. Got you. Thanks! Have you ever tried to have probe_acpi_namespace_devices() handle the case of empty adev->physical_node_list at the same time? Best regards, baolu > > > Best regards > Felix cui > > > > > -邮件原件- > 发件人: Lu Baolu > 发送时间: 2020年8月6日 10:36 > 收件人: FelixCui-oc ; Joerg Roedel > ; io...@lists.linux-foundation.org; > linux-kernel@vger.kernel.org; David Woodhouse > 抄送: baolu...@linux.intel.com; RaymondPang-oc > ; CobeChen-oc > 主题: Re: 答复: 答复: 答复: 答复: 答复: [PATCH] iommu/vt-d:Add support for ACPI > device in RMRR > > Hi Felix, > > On 8/5/20 3:37 PM, FelixCui-oc wrote: >> Hi baolu, >> Let me talk about why acpi_device_create_direct_mappings() is >> needed and please tell me if there is an error. > > Sure. Before that, let me sync my understanding with you. You have an acpi > namespace device in ANDD table, it also shows up in the device scope of a > RMRR. Current code doesn't enumerate that device for the RMRR, hence > iommu_create_device_direct_mappings() doesn't work for this device. > > At the same time, probe_acpi_namespace_devices() doesn't work for this > device, hence you want to add a home-made > acpi_device_create_direct_mappings() helper. > > Did I get it right? > >> In the probe_acpi_namespace_devices() function, only the device >> in the addev->physical_node_list is probed, >> but we need to establish identity mapping for the namespace >> device in RMRR. These are two different devices. > > The namespace device has been probed and put in one drhd's device list. > Hence, it should be processed by probe_acpi_namespace_devices(). So the > question is why there are no devices in addev->physical_node_list? > >> Therefore, the namespace device in RMRR is not mapped in >> probe_acpi_namespace_devices(). >> acpi_device_create_direct_mappings() is to create direct >> mappings for namespace devices in RMRR. > > Best regards, > baolu >
[PATCH v4] x86/cpu: Use SERIALIZE in sync_core() when available
The SERIALIZE instruction gives software a way to force the processor to complete all modifications to flags, registers and memory from previous instructions and drain all buffered writes to memory before the next instruction is fetched and executed. Thus, it serves the purpose of sync_core(). Use it when available. Cc: Andy Lutomirski Cc: Cathy Zhang Cc: Dave Hansen Cc: Fenghua Yu Cc: "H. Peter Anvin" Cc: Kyung Min Park Cc: Peter Zijlstra Cc: "Ravi V. Shankar" Cc: Sean Christopherson Cc: linux-e...@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Tony Luck Suggested-by: Andy Lutomirski Signed-off-by: Ricardo Neri --- This is v4 from my three previous submission [1], [2], and [3]. The first three patches of the series have been merged in Linus' tree. Hence, I am submitting only this patch for review. [1]. https://lkml.org/lkml/2020/7/27/8 [2]. https://lkml.org/lkml/2020/8/4/1090 [3]. https://lkml.org/lkml/2020/8/6/808 Changes since v3: * Reworked comments in sync_core() for better readability. (Dave Hansen) * Reworked the implementation to align with the style in special_insns.h. No functional changes were introduced. (Tony Luck) Changes since v2: * Support serialize with static_cpu_has() instead of using alternative runtime patching directly. (Borislav Petkov) Changes since v1: * Support SERIALIZE using alternative runtime patching. (Peter Zijlstra, H. Peter Anvin) * Added a note to specify which version of binutils supports SERIALIZE. (Peter Zijlstra) * Verified that (::: "memory") is used. (H. Peter Anvin) --- arch/x86/include/asm/special_insns.h | 6 ++ arch/x86/include/asm/sync_core.h | 26 ++ 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 59a3e13204c3..5999b0b3dd4a 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -234,6 +234,12 @@ static inline void clwb(volatile void *__p) #define nop() asm volatile ("nop") +static inline void serialize(void) +{ + /* Instruction opcode for SERIALIZE; supported in binutils >= 2.35. */ + asm volatile(".byte 0xf, 0x1, 0xe8" ::: "memory"); +} + #endif /* __KERNEL__ */ #endif /* _ASM_X86_SPECIAL_INSNS_H */ diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h index fdb5b356e59b..089712777fd9 100644 --- a/arch/x86/include/asm/sync_core.h +++ b/arch/x86/include/asm/sync_core.h @@ -5,6 +5,7 @@ #include #include #include +#include #ifdef CONFIG_X86_32 static inline void iret_to_self(void) @@ -54,14 +55,23 @@ static inline void iret_to_self(void) static inline void sync_core(void) { /* -* There are quite a few ways to do this. IRET-to-self is nice -* because it works on every CPU, at any CPL (so it's compatible -* with paravirtualization), and it never exits to a hypervisor. -* The only down sides are that it's a bit slow (it seems to be -* a bit more than 2x slower than the fastest options) and that -* it unmasks NMIs. The "push %cs" is needed because, in -* paravirtual environments, __KERNEL_CS may not be a valid CS -* value when we do IRET directly. +* The SERIALIZE instruction is the most straightforward way to +* do this but it not universally available. +*/ + if (static_cpu_has(X86_FEATURE_SERIALIZE)) { + serialize(); + return; + } + + /* +* For all other processors, there are quite a few ways to do this +* IRET-to-self is nice because it works on every CPU, at any CPL +* (so it's compatible with paravirtualization), and it never exits +* to a hypervisor. The only down sides are that it's a bit slow +* (it seems to be a bit more than 2x slower than the fastest +* options) and that it unmasks NMIs. The "push %cs" is needed +* because, in paravirtual environments, __KERNEL_CS may not be a +* valid CS value when we do IRET directly. * * In case NMI unmasking or performance ever becomes a problem, * the next best option appears to be MOV-to-CR2 and an -- 2.17.1
Re: [PATCH v2 03/24] virtio: allow __virtioXX, __leXX in config space
On 2020/8/6 下午1:58, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 11:37:38AM +0800, Jason Wang wrote: On 2020/8/5 下午7:45, Michael S. Tsirkin wrote: #define virtio_cread(vdev, structname, member, ptr) \ do {\ might_sleep(); \ /* Must match the member's type, and be integer */ \ - if (!typecheck(typeofstructname*)0)->member)), *(ptr))) \ + if (!__virtio_typecheck(structname, member, *(ptr)))\ (*ptr) = 1; \ A silly question, compare to using set()/get() directly, what's the value of the accessors macro here? Thanks get/set don't convert to the native endian, I guess that's why drivers use cread/cwrite. It is also nice that there's type safety, checking the correct integer width is used. Yes, but this is simply because a macro is used here, how about just doing things similar like virtio_cread_bytes(): static inline void virtio_cread(struct virtio_device *vdev, unsigned int offset, void *buf, size_t len) And do the endian conversion inside? Thanks Then you lose type safety. It's very easy to have an le32 field and try to read it into a u16 by mistake. These macros are all about preventing bugs: and the whole patchset is about several bugs sparse found - that is what prompted me to make type checks more strict. Yes, but we need to do the macro with compiler extensions. I wonder whether or not the kernel has already had something since this request here is pretty common? Thanks
Re: [PATCH v17 14/21] mm/compaction: do page isolation first in compaction
在 2020/8/7 上午2:38, Alexander Duyck 写道: >> + >> isolate_abort: >> if (locked) >> spin_unlock_irqrestore(&pgdat->lru_lock, flags); >> + if (page) { >> + SetPageLRU(page); >> + put_page(page); >> + } >> >> /* >> * Updated the cached scanner pfn once the pageblock has been scanned > We should probably be calling SetPageLRU before we release the lru > lock instead of before. It might make sense to just call it before we > get here, similar to how you did in the isolate_fail_put case a few > lines later. Otherwise this seems to violate the rules you had set up > earlier where we were only going to be setting the LRU bit while > holding the LRU lock. Hi Alex, Set out of lock here should be fine. I never said we must set the bit in locking. And this page is get by get_page_unless_zero(), no warry on release. Thanks Alex
Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range
On 2020/8/6 下午8:29, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 03:03:55PM +0300, Eli Cohen wrote: On Wed, Aug 05, 2020 at 08:51:56AM -0400, Michael S. Tsirkin wrote: On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote: This patch introduce a config op to get valid iova range from the vDPA device. Signed-off-by: Jason Wang --- include/linux/vdpa.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 239db794357c..b7633ed2500c 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -41,6 +41,16 @@ struct vdpa_device { unsigned int index; }; +/** + * vDPA IOVA range - the IOVA range support by the device + * @start: start of the IOVA range + * @end: end of the IOVA range + */ +struct vdpa_iova_range { + u64 start; + u64 end; +}; + This is ambiguous. Is end in the range or just behind it? How about first/last? It is customary in the kernel to use start-end where end corresponds to the byte following the last in the range. See struct vm_area_struct vm_start and vm_end fields Exactly my point: include/linux/mm_types.h: unsigned long vm_end; /* The first byte after our end address in this case Jason wants it to be the last byte, not one behind. Ok, I somehow recall the reason :) See: struct iommu_domain_geometry { dma_addr_t aperture_start; /* First address that can be mapped */ dma_addr_t aperture_end; /* Last address that can be mapped */ bool force_aperture; /* DMA only allowed in mappable range? */ }; So what I proposed here is to be consistent with it. Thanks
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 8:14 PM, Sowjanya Komatineni wrote: On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? Based on internal discussion, ACTIVE bit gets cleared when all enabled pads calibration is done (same time as when DONE set to 1). Will request HW designer to look into design and confirm exactly when ACTIVE bit gets cleared. Will get back on this. Verified with HW designer. above is correct. ACTIVE bit update happens same time as DONE bit. Active = !(DONE) In case of calibration logic waiting for LP-11 where done bit does not get set, ACTIVE will still be 1 and on next start trigger new calibration will start Based on internal design check from designer, as long as its in waiting for LP-11 stage, next calibration request can be triggered again but ACTIVE bit we will see it at 1. So we should check for DONE bits to confirm if calibration is done or not. To start next calibration, it can take effect as long as its in wait for LP-11 mode.
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 8:10 PM, Sowjanya Komatineni wrote: On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* + * Wait for min 72uS to let calibration logic finish calibration + * sequence codes before waiting for pads idle state to apply the + * results. + */ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? Based on internal discussion, ACTIVE bit gets cleared when all enabled pads calibration is done (same time as when DONE set to 1). Will request HW designer to look into design and confirm exactly when ACTIVE bit gets cleared. Will get back on this. Verified with HW designer. above is correct. ACTIVE bit update happens same time as DONE bit. Active = !(DONE) In case of calibration logic waiting for LP-11 where done bit does not get set, ACTIVE will still be 1 and on next start trigger new calibration will start
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 8:09 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 8:02 PM Saravana Kannan wrote: > > On Thu, Aug 6, 2020 at 7:49 PM John Stultz wrote: > > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > > > wrote: > > > > With all due respect, that's your downstream kernel, the upstream kernel > > > > should not rely on luck, out-of-tree patches or kernel parameters. > > > > > > I agree that would be preferred. But kernel parameters are often there > > > for these sorts of cases where we can't always do the right thing. As > > > for out-of-tree patches, broken things don't get fixed until > > > out-of-tree patches are developed and upstreamed, and I know Saravana > > > is doing exactly that, and I hope his fw_devlink work helps fix it so > > > the module loading is not just a matter of luck. > > > > Btw, the only downstream fw_devlink change is setting itto =on (vs > > =permissive in upstream). > > I thought there was the clk_sync_state stuff as well? That's not needed to solve the module load ordering issues and deferred probe issues. That's only needed to keep clocks on till some of the modules are loaded and it depends on fw_devlink, but not really a part of fw_devlink IMHO. And yes, that's on my list of things to upstream. > > > Also I think Thierry's comments in the other thread today are also > > > good ideas for ways to better handle the optional dt link handling > > > (rather than using a timeout). > > > > Could you please give me a lore link to this thread? Just curious. > > Sure: https://lore.kernel.org/lkml/20200806135251.GB3351349@ulmo/ Thanks. -Saravana
转发: upstream test error: WARNING in do_epoll_wait
> >发件人: linux-kernel-ow...@vger.kernel.org >>代表 syzbot >发送时间: 2020年8月5日 15:19 >收件人: linux-fsde...@vger.kernel.org; linux-kernel@vger.kernel.org; >>syzkaller->b...@googlegroups.com; v...@zeniv.linux.org.uk >主题: upstream test error: WARNING in do_epoll_wait >Hello, >syzbot found the following issue on: >HEAD commit:4f30a60a Merge tag 'close-range-v5.9' of git://git.kernel... >git tree: upstream >console output: https://syzkaller.appspot.com/x/log.txt?x=14c5a7da90 >kernel config: https://syzkaller.appspot.com/x/.config?x=8bdd9944dedf0f16 >dashboard link: https://syzkaller.appspot.com/bug?extid=4429670d8213f5f26352 >compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project.git >>ca2dcbd030eadbf0aa9b660efe864ff08af6e18b) >IMPORTANT: if you fix the issue, please add the following tag to the commit: >Reported-by: syzbot+4429670d8213f5f26...@syzkaller.appspotmail.com >[ cut here ] >WARNING: CPU: 1 PID: 8728 at fs/eventpoll.c:1828 ep_poll fs/eventpoll.c:1828 >[inline] >WARNING: CPU: 1 PID: 8728 at fs/eventpoll.c:1828 do_epoll_wait+0x337/0x920 >>fs/eventpoll.c:2333 >Kernel panic - not syncing: panic_on_warn set ... >CPU: 1 PID: 8728 Comm: syz-fuzzer Not tainted 5.8.0-syzkaller #0 >Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >>Google 01/01/2011 >Call Trace: >__dump_stack lib/dump_stack.c:77 [inline] >dump_stack+0x16e/0x25d lib/dump_stack.c:118 >panic+0x20c/0x69a kernel/panic.c:231 >__warn+0x211/0x240 kernel/panic.c:600 >report_bug+0x153/0x1d0 lib/bug.c:198 >handle_bug+0x4d/0x90 arch/x86/kernel/traps.c:235 >exc_invalid_op+0x16/0x70 arch/x86/kernel/traps.c:255 >asm_exc_invalid_op+0x12/0x20 arch/x86/include/asm/idtentry.h:547 >RIP: 0010:ep_poll fs/eventpoll.c:1828 [inline] >RIP: 0010:do_epoll_wait+0x337/0x920 fs/eventpoll.c:2333 >Code: 41 be 01 00 00 00 31 c0 48 89 44 24 20 45 31 e4 e9 7f 01 00 00 e8 59 ab >c6 ff 41 bc f2 >ff ff ff e9 c8 03 00 00 e8 49 ab c6 ff <0f> 0b e9 58 fe ff ff >49 bf ff ff ff ff ff ff ff 7f e9 f0 fe ff >ff >RSP: 0018:c9e1fe28 EFLAGS: 00010293 >RAX: 81856297 RBX: 888120fafa00 RCX: 88811e196400 >RDX: RSI: RDI: >RBP: R08: 818560d8 R09: 88619eb7 >R10: R11: R12: 7000 >R13: 0080 R14: 0001 R15: 0003 >__do_sys_epoll_pwait fs/eventpoll.c:2364 [inline] >__se_sys_epoll_pwait fs/eventpoll.c:2350 [inline] >__x64_sys_epoll_pwait+0x92/0x150 fs/eventpoll.c:2350 >do_syscall_64+0x6a/0xe0 arch/x86/entry/common.c:384 >entry_SYSCALL_64_after_hwframe+0x44/0xa9 >RIP: 0033:0x469240 >Code: 0f 05 89 44 24 20 c3 cc cc cc 8b 7c 24 08 48 8b 74 24 10 8b 54 24 18 44 >8b 54 24 1c >49 c7 c0 00 00 00 00 b8 19 01 00 00 0f 05 <89> 44 24 20 c3 cc cc >cc cc cc cc cc cc cc cc cc >8b 7c 24 08 48 c7 >RSP: 002b:00c4b7f0 EFLAGS: 0246 ORIG_RAX: 0119 >RAX: ffda RBX: 0001 RCX: 00469240 >RDX: 0080 RSI: 00c4b840 RDI: 0003 >RBP: 00c4be40 R08: R09: >R10: 0001 R11: 0246 R12: 0003 >R13: 00c9cc00 R14: 00c00032c180 R15: >Kernel Offset: disabled >Rebooting in 86400 seconds.. In "ep_poll" func the lockdep_assert_irqs_enabled detected interrupt status, although before enter "ep_poll" func, irq is already enabled, but it was missed lockdep irq status set. --- a/kernel/entry/common.c +++ b/kernel/entry/common.c @@ -76,6 +76,7 @@ noinstr long syscall_enter_from_user_mode(struct pt_regs *regs, long syscall) instrumentation_begin(); local_irq_enable(); + lockdep_hardirqs_on(CALLER_ADDR0); ti_work = READ_ONCE(current_thread_info()->flags); if (ti_work & SYSCALL_ENTER_WORK) syscall = syscall_trace_enter(regs, syscall, ti_work); -- Can we should add lockdep_hardirqs_on? >--- >This report is generated by a bot. It may contain errors. >See https://goo.gl/tpsmEJ for more information about syzbot. >syzbot engineers can be reached at syzkal...@googlegroups.com. >syzbot will keep track of this issue. See: >https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
Re: [PATCH v9 08/10] gpu: host1x: mipi: Keep MIPI clock enabled and mutex locked till calibration done
On 8/6/20 7:31 PM, Dmitry Osipenko wrote: 06.08.2020 22:01, Sowjanya Komatineni пишет: ... +int tegra_mipi_start_calibration(struct tegra_mipi_device *device) { const struct tegra_mipi_soc *soc = device->mipi->soc; unsigned int i; @@ -381,12 +375,16 @@ int tegra_mipi_calibrate(struct tegra_mipi_device *device) value |= MIPI_CAL_CTRL_START; tegra_mipi_writel(device->mipi, value, MIPI_CAL_CTRL); - mutex_unlock(&device->mipi->lock); - clk_disable(device->mipi->clk); + /* +* Wait for min 72uS to let calibration logic finish calibration +* sequence codes before waiting for pads idle state to apply the +* results. +*/ + usleep_range(75, 80); Could you please explain why the ACTIVE bit can't be polled instead of using the fixed delay? Doesn't ACTIVE bit represents the state of the busy FSM? Based on internal discussion, ACTIVE bit gets cleared when all enabled pads calibration is done (same time as when DONE set to 1). Will request HW designer to look into design and confirm exactly when ACTIVE bit gets cleared. Will get back on this.
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 8:02 PM Saravana Kannan wrote: > On Thu, Aug 6, 2020 at 7:49 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > > wrote: > > > With all due respect, that's your downstream kernel, the upstream kernel > > > should not rely on luck, out-of-tree patches or kernel parameters. > > > > I agree that would be preferred. But kernel parameters are often there > > for these sorts of cases where we can't always do the right thing. As > > for out-of-tree patches, broken things don't get fixed until > > out-of-tree patches are developed and upstreamed, and I know Saravana > > is doing exactly that, and I hope his fw_devlink work helps fix it so > > the module loading is not just a matter of luck. > > Btw, the only downstream fw_devlink change is setting itto =on (vs > =permissive in upstream). I thought there was the clk_sync_state stuff as well? > > Also I think Thierry's comments in the other thread today are also > > good ideas for ways to better handle the optional dt link handling > > (rather than using a timeout). > > Could you please give me a lore link to this thread? Just curious. Sure: https://lore.kernel.org/lkml/20200806135251.GB3351349@ulmo/ thanks -john
Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range
On 2020/8/6 下午8:10, Eli Cohen wrote: On Wed, Jun 17, 2020 at 06:29:44AM +0300, Jason Wang wrote: This patch introduce a config op to get valid iova range from the vDPA device. Signed-off-by: Jason Wang --- include/linux/vdpa.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h index 239db794357c..b7633ed2500c 100644 --- a/include/linux/vdpa.h +++ b/include/linux/vdpa.h @@ -41,6 +41,16 @@ struct vdpa_device { unsigned int index; }; +/** + * vDPA IOVA range - the IOVA range support by the device + * @start: start of the IOVA range + * @end: end of the IOVA range + */ +struct vdpa_iova_range { + u64 start; + u64 end; +}; + What do you do with this information? Suppose some device tells you it supports some limited range, say, from 0x4000 to 0x8000. What does qemu do with this information? For qemu, when qemu will fail the vDPA device creation when: 1) vIOMMU is not enabled and GPA is out of this range 2) vIOMMU is enabled but it can't report such range to guest For other userspace application, it will know it can only use this range as its IOVA. Thanks
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 7:49 PM John Stultz wrote: > > On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson > wrote: > > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > > wrote: > > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski > > > > > wrote: > > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz > > > > > > >> wrote: > > > > > > >>> > > > > > > >>> So this is where I bashfully admit I didn't get a chance to try > > > > > > >>> this > > > > > > >>> patch series out, as I had success with a much older version of > > > > > > >>> Saravana's macro magic. > > > > > > >>> > > > > > > >>> But unfortunately, now that this has landed in mainline, I'm > > > > > > >>> seeing > > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > > >>> building the driver in. > > > > > > >> Does that mean the modular version is working? Or you haven't > > > > > > >> tried > > > > > > >> that yet? I'll wait for your reply before I try to fix it. I > > > > > > >> don't > > > > > > >> have the hardware, but it should be easy to guess this issue > > > > > > >> looking > > > > > > >> at the code delta. > > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started > > > > > > > on -next > > > > > > > around 20200727, but I didn't track it down as, well, there's > > > > > > > less way > > > > > > > to get debug output on the C630. > > > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting > > > > > > > does > > > > > > > allow me to boot again. > > > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit > > > > > > reverted > > > > > > boots, however, module (on the c630 at least) doesn't boot whether > > > > > > it's > > > > > > a module or built-in. > > > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > > more grace time to load. > > > > > > > > With the risk of me reading more into this than what you're saying, > > > > please don't upstream anything that depend this parameter to be > > > > increased. > > > > > > > > Compiling any of these drivers as module should not require the user to > > > > pass additional kernel command line parameters in order to get their > > > > device to boot. > > > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > > avoid it. But the reality is that it is already required (at least in > > > configurations heavily using modules) to give more time for modules > > > loaded to resolve missing dependencies after init begins (due to > > > changes in the driver core to fail loading after init so that optional > > > dt links aren't eternally looked for). This was seen when trying to > > > enable the qualcom clk drivers to modules. > > > > > > > So to clarify what you're saying, any system that boots successfully > > with the default options is a sign of pure luck - regardless of being > > builtin or modules. > > > > > > And there you have my exact argument against the deferred timeout magic > > going on in the driver core. But as you know people insist that it's > > more important to be able to boot some defunct system from NFS than a > > properly configured one reliably. > > I'd agree, but the NFS case was in use before, and when the original > deferred timeout/optional link handling stuff landed no one complained > they were broken by it (at least at the point where it landed). Only > later when we started enabling more lower-level core drivers as > modules did the shortened dependency resolution time start to bite > folks. My attempt to set the default to be 30 seconds helped there, > but caused trouble and delays for the NFS case, and "don't break > existing users" seemed to rule, so I set the default timeout back to > 0. > > > > It doesn't seem necessary in this case, but I suggested it here as > > > I've got it enabled by default in my AOSP builds so that the > > > module-heavy configs for GKI boot properly (even if Saravana's > > > fw_devlink work is disabled). > > > > > > > With all due respect, that's your downstream kernel, the upstream kernel > > should not rely on luck, out-of-tree patches or kernel parameters. > > I agree that would be preferred. But kernel parameters are often there > for these sorts of cases where we can't always do the right thing. As > for out-of-tree patches, broken things don't get fixed until > out-of-tree patches are developed and upstreamed, and I know Saravana > is doing exactly that, and I hope his fw_devlink work helps fix it so > the module loading is not just a matter of luck. Btw, the only downstream fw_devlink change is setting itto =on (vs =permissive in upstream). > Also I think Thierry's comments in the other thread today are also > goo
Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy
On 2020/8/6 下午6:00, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 03:27:38PM +0800, Jason Wang wrote: On 2020/8/6 下午1:53, Michael S. Tsirkin wrote: On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote: On 2020/8/5 下午7:40, Michael S. Tsirkin wrote: On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote: On 2020/8/4 上午5:00, Michael S. Tsirkin wrote: Some legacy guests just assume features are 0 after reset. We detect that config space is accessed before features are set and set features to 0 automatically. Note: some legacy guests might not even access config space, if this is reported in the field we might need to catch a kick to handle these. I wonder whether it's easier to just support modern device? Thanks Well hardware vendors are I think interested in supporting legacy guests. Limiting vdpa to modern only would make it uncompetitive. My understanding is that, IOMMU_PLATFORM is mandatory for hardware vDPA to work. Hmm I don't really see why. Assume host maps guest memory properly, VM does not have an IOMMU, legacy guest can just work. Yes, guest may not set IOMMU_PLATFORM. Care explaining what's wrong with this picture? The problem is virtio_vdpa, without IOMMU_PLATFORM it uses PA which can not work if IOMMU is enabled. Thanks So that's a virtio_vdpa limitation. Probably not, I think this goes back to the long debate of whether to use DMA API unconditionally. If we did that, we can support legacy virtio driver. The vDPA device needs to provide a DMA device and the virtio core and perform DMA API with that device which should work for all of the cases. But a big question is, do upstream care about out of tree virtio drivers? Thanks In the same way, if a device does not have an on-device iommu *and* is not behind an iommu, then vdpa can't bind to it. But this virtio_vdpa specific hack does not belong in a generic vdpa code. So it can only work for modern device ... Thanks
Re: [PATCH v3] drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()
Xin He 于2020年7月21日周二 下午6:17写道: > > From: Qi Liu > > We should put the reference count of the fence after calling > virtio_gpu_cmd_submit(). So add the missing dma_fence_put(). > > Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit > synchronization") > Co-developed-by: Xin He > Signed-off-by: Xin He > Signed-off-by: Qi Liu > Reviewed-by: Muchun Song > --- > > changelog in v3: > 1) Change the subject from "drm/virtio: fixed memory leak in > virtio_gpu_execbuffer_ioctl()" to >"drm/virtio: fix missing dma_fence_put() in virtio_gpu_execbuffer_ioctl()" > 2) Rework the commit log > > changelog in v2: > 1) Add a change description > > drivers/gpu/drm/virtio/virtgpu_ioctl.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > index 5df722072ba0..19c5bc01eb79 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c > @@ -179,6 +179,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device > *dev, void *data, > > virtio_gpu_cmd_submit(vgdev, buf, exbuf->size, > vfpriv->ctx_id, buflist, out_fence); > + dma_fence_put(&out_fence->f); > virtio_gpu_notify(vgdev); > return 0; > > -- > 2.21.1 (Apple Git-122.3) > cc Greg -- Xin He
Re: [PATCH] ftrace: Fixup lockdep assert held of text_mutex
On Thu, Aug 6, 2020 at 11:48 PM Steven Rostedt wrote: > > On Thu, 6 Aug 2020 14:50:54 + > guo...@kernel.org wrote: > > > From: Guo Ren > > > > The function ftrace_process_locs() will modify text code, so we > > should give a text_mutex lock. Because some arch's patch code > > will assert held of text_mutex even during start_kernel-> > > ftrace_init(). > > NAK. > > This looks like a bug in the lockdep_assert_held() in whatever arch > (riscv) is running. Seems you think it's a bug of arch implementation with the wrong usage of text_mutex? Also @riscv maintainer, How about modifying it in riscv's code? we still need to solve it. diff --git a/arch/riscv/include/asm/ftrace.h b/arch/riscv/include/asm/ftrace.h index ace8a6e..fb266c3 100644 --- a/arch/riscv/include/asm/ftrace.h +++ b/arch/riscv/include/asm/ftrace.h @@ -23,6 +23,12 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr) struct dyn_arch_ftrace { }; + +#ifdef CONFIG_DYNAMIC_FTRACE +struct dyn_ftrace; +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec); +#define ftrace_init_nop ftrace_init_nop +#endif #endif #ifdef CONFIG_DYNAMIC_FTRACE diff --git a/arch/riscv/kernel/ftrace.c b/arch/riscv/kernel/ftrace.c index 2ff63d0..9e9f7c0 100644 --- a/arch/riscv/kernel/ftrace.c +++ b/arch/riscv/kernel/ftrace.c @@ -97,6 +97,17 @@ int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, return __ftrace_modify_call(rec->ip, addr, false); } +int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) +{ + int ret; + + mutex_lock(&text_mutex); + ret = ftrace_make_nop(mod, rec, MCOUNT_ADDR); + mutex_unlock(&text_mutex); + + return ret; +} + int ftrace_update_ftrace_func(ftrace_func_t func) { int ret = __ftrace_modify_call((unsigned long)&ftrace_call, --- > > > > > backtrace log: > >assert by lockdep_assert_held(&text_mutex) > > 0 patch_insn_write (addr=0xffe010fc , > > insn=0xffe001203eb8, len=8) at arch/riscv/kernel/patch.c:63 > > 1 0xffe0002042ec in patch_text_nosync (addr=, > > insns=, len=) at arch/riscv/kernel/patch.c:93 > > 2 0xffe00020628e in __ftrace_modify_call (hook_pos=, > > target=, enable=) at > > arch/riscv/kernel/ftrace.c:68 > > 3 0xffe0002063c0 in ftrace_make_nop (mod=, > > rec=0xffe001221c70 , addr=18446743936272720288) at > > arch/riscv/kernel/ftrace.c:97 > > 4 0xffe0002b13f0 in ftrace_init_nop (rec=, > > mod=) at ./include/linux/ftrace.h:647 > > 5 ftrace_nop_initialize (rec=, mod=) at > > kernel/trace/ftrace.c:2619 > > 6 ftrace_update_code (new_pgs=, mod=) at > > kernel/trace/ftrace.c:3063 > > 7 ftrace_process_locs (mod=, start=, > > end=) at kernel/trace/ftrace.c:6154 > > 8 0xffe0b6e6 in ftrace_init () at kernel/trace/ftrace.c:6715 > > 9 0xffe01b48 in start_kernel () at init/main.c:888 > > 10 0xffe010a8 in _start_kernel () at arch/riscv/kernel/head.S:247 > > > > Signed-off-by: Guo Ren > > Cc: Steven Rostedt > > Cc: Ingo Molnar > > --- > > kernel/trace/ftrace.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c > > index 1903b80..4b48b88 100644 > > --- a/kernel/trace/ftrace.c > > +++ b/kernel/trace/ftrace.c > > @@ -26,6 +26,7 @@ > > #include > > #include > > #include > > +#include > > #include > > #include > > #include > > @@ -6712,9 +6713,11 @@ void __init ftrace_init(void) > > ftrace_init() is called before SMP is initialized. Nothing else should > be running here. That means grabbing a mutex is useless. I don't agree, ftrace_init are modifying kernel text, so we should give the lock of text_mutex to keep semantic consistency. > > -- Steve > > > > > > last_ftrace_enabled = ftrace_enabled = 1; > > > > + mutex_lock(&text_mutex); > > ret = ftrace_process_locs(NULL, > > __start_mcount_loc, > > __stop_mcount_loc); > > + mutex_unlock(&text_mutex); > > > > pr_info("ftrace: allocated %ld pages with %ld groups\n", > > ftrace_number_of_pages, ftrace_number_of_groups); > -- Best Regards Guo Ren ML: https://lore.kernel.org/linux-csky/
Re: [PATCH v3 2/4] irqchip/qcom-pdc: Switch to using IRQCHIP_PLATFORM_DRIVER helper macros
On Thu, Aug 6, 2020 at 6:42 PM Bjorn Andersson wrote: > On Thu 06 Aug 18:22 PDT 2020, John Stultz wrote: > > On Thu, Aug 6, 2020 at 5:43 PM Bjorn Andersson > > wrote: > > > On Wed 05 Aug 14:57 PDT 2020, John Stultz wrote: > > > > On Wed, Aug 5, 2020 at 2:47 PM Steev Klimaszewski > > > > wrote: > > > > > On 8/5/20 4:16 PM, Steev Klimaszewski wrote: > > > > > > On 8/5/20 3:19 PM, Saravana Kannan wrote: > > > > > >> On Wed, Aug 5, 2020 at 12:44 AM John Stultz > > > > > >> wrote: > > > > > >>> > > > > > >>> So this is where I bashfully admit I didn't get a chance to try > > > > > >>> this > > > > > >>> patch series out, as I had success with a much older version of > > > > > >>> Saravana's macro magic. > > > > > >>> > > > > > >>> But unfortunately, now that this has landed in mainline, I'm > > > > > >>> seeing > > > > > >>> boot regressions on db845c. :( This is in the non-modular case, > > > > > >>> building the driver in. > > > > > >> Does that mean the modular version is working? Or you haven't tried > > > > > >> that yet? I'll wait for your reply before I try to fix it. I don't > > > > > >> have the hardware, but it should be easy to guess this issue > > > > > >> looking > > > > > >> at the code delta. > > > > > > For what it's worth, I saw this too on the Lenovo C630 (started on > > > > > > -next > > > > > > around 20200727, but I didn't track it down as, well, there's less > > > > > > way > > > > > > to get debug output on the C630. > > > > > > > > > > > > In my testing, module or built-in doesn't matter, but reverting does > > > > > > allow me to boot again. > > > > > > > > > > > Actually - I spoke too soon - QCOM_PDC built-in with the commit > > > > > reverted > > > > > boots, however, module (on the c630 at least) doesn't boot whether > > > > > it's > > > > > a module or built-in. > > > > > > > > You may need to set deferred_probe_timeout=30 to give things a bit > > > > more grace time to load. > > > > > > With the risk of me reading more into this than what you're saying, > > > please don't upstream anything that depend this parameter to be > > > increased. > > > > > > Compiling any of these drivers as module should not require the user to > > > pass additional kernel command line parameters in order to get their > > > device to boot. > > > > So, ideally I agree, and Saravana's fw_devlink work should allow us to > > avoid it. But the reality is that it is already required (at least in > > configurations heavily using modules) to give more time for modules > > loaded to resolve missing dependencies after init begins (due to > > changes in the driver core to fail loading after init so that optional > > dt links aren't eternally looked for). This was seen when trying to > > enable the qualcom clk drivers to modules. > > > > So to clarify what you're saying, any system that boots successfully > with the default options is a sign of pure luck - regardless of being > builtin or modules. > > > And there you have my exact argument against the deferred timeout magic > going on in the driver core. But as you know people insist that it's > more important to be able to boot some defunct system from NFS than a > properly configured one reliably. I'd agree, but the NFS case was in use before, and when the original deferred timeout/optional link handling stuff landed no one complained they were broken by it (at least at the point where it landed). Only later when we started enabling more lower-level core drivers as modules did the shortened dependency resolution time start to bite folks. My attempt to set the default to be 30 seconds helped there, but caused trouble and delays for the NFS case, and "don't break existing users" seemed to rule, so I set the default timeout back to 0. > > It doesn't seem necessary in this case, but I suggested it here as > > I've got it enabled by default in my AOSP builds so that the > > module-heavy configs for GKI boot properly (even if Saravana's > > fw_devlink work is disabled). > > > > With all due respect, that's your downstream kernel, the upstream kernel > should not rely on luck, out-of-tree patches or kernel parameters. I agree that would be preferred. But kernel parameters are often there for these sorts of cases where we can't always do the right thing. As for out-of-tree patches, broken things don't get fixed until out-of-tree patches are developed and upstreamed, and I know Saravana is doing exactly that, and I hope his fw_devlink work helps fix it so the module loading is not just a matter of luck. Also I think Thierry's comments in the other thread today are also good ideas for ways to better handle the optional dt link handling (rather than using a timeout). thanks -john
Re: [PATCH] powerpc/signal: Move and simplify get_clean_sp()
Christoph Hellwig writes: > On Thu, Aug 06, 2020 at 08:50:20AM +, Christophe Leroy wrote: >> get_clean_sp() is only used in kernel/signal.c . Move it there. >> >> And GCC is smart enough to reduce the function when on PPC32, no >> need of a special PPC32 simple version. > > What about just open coding it in the only caller, which would seem even > cleaner? +1 cheers
Re: [PATCH] sched/core: add unlikely in group_has_capacity()
Yeah, because of the following two points, I also think the probability is 0%: a) the sd is protected by rcu lock, and load_balance() func is between rcu_read_lock() and rcu_read_unlock(). b) the sgs is a local variable. So in the group_classify(), the env->sd->imbalance_pct and the sgs will not be changed. May I remove the duplicate check from group_has_capacity() and resubmit a patch? Yours, Qi Zheng On 2020/8/6 下午10:45, Ingo Molnar wrote: * Qi Zheng wrote: 1. The group_has_capacity() function is only called in group_classify(). 2. Before calling the group_has_capacity() function, group_is_overloaded() will first judge the following formula, if it holds, the group_classify() will directly return the group_overloaded. (sgs->group_capacity * imbalance_pct) < (sgs->group_runnable * 100) Therefore, when the group_has_capacity() is called, the probability that the above formalu holds is very small. Hint compilers about that. Signed-off-by: Qi Zheng --- kernel/sched/fair.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 2ba8f230feb9..9074fd5e23b2 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8234,8 +8234,8 @@ group_has_capacity(unsigned int imbalance_pct, struct sg_lb_stats *sgs) if (sgs->sum_nr_running < sgs->group_weight) return true; - if ((sgs->group_capacity * imbalance_pct) < - (sgs->group_runnable * 100)) + if (unlikely((sgs->group_capacity * imbalance_pct) < + (sgs->group_runnable * 100))) return false; Isn't the probability that this second check will match around 0%? I.e. wouldn't the right fix be to remove the duplicate check from group_has_capacity(), because it's already been checked in group_classify()? Maybe while leaving a comment in place? Thanks, Ingo
Re: [GIT PULL] dlm updates for 5.9
The pull request you sent on Thu, 6 Aug 2020 11:45:07 -0500: > git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm.git dlm-5.9 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/86cfccb66937dd6cbf26ed619958b9e587e6a115 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] RESEND: thermal for v5.9-rc1
On Thu, Aug 6, 2020 at 1:19 PM Daniel Lezcano wrote: > > > - Add generic netlink support for userspace notifications: events, > temperature > and discovery commands (Daniel Lezcano) This is "default y". Why? The help text doesn't explain either. Please explain, or remove the default y. We don't add new features and then try to force people to use them by enabling them by default. "default y" is mainly for when something unconditional gets split up and becomes conditional (so now "default y" means that you don't break peoples setups when they don't even know what it is). Alternatively, "default y" is for things that are make peoples lives immeasurably better somehow, and it would be a crime to not enable it because it's _so_ wonderful. So far, I'm not convinced we've ever hit that second case. Convince me that the thermal layer is so magical that it really warrants it. Tell me why and how my life is improved by enabling it. Linus