RE: [RFC PATCH 13/22] x86/fpu/xstate: Expand dynamic user state area on first use
> From: Andy Lutomirski > > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c > > @@ -518,3 +518,40 @@ int fpu__exception_code(struct fpu *fpu, int trap_nr) .. > > +bool xfirstuse_event_handler(struct fpu *fpu) > > +{ > > + bool handled = false; > > + u64 event_mask; > > + > > + /* Check whether the first-use detection is running. */ > > + if (!static_cpu_has(X86_FEATURE_XFD) || !xfirstuse_enabled()) > > + return handled; > > + > MSR_IA32_XFD_ERR needs to be wired up in the exception handler, not in > some helper called farther down the stack xfirstuse_event_handler() is called directly from the IDTENTRY exc_device_not_available(): > > @@ -1028,6 +1028,9 @@ DEFINE_IDTENTRY(exc_device_not_available) > > { > > unsigned long cr0 = read_cr0(); > > > > + if (xfirstuse_event_handler(>thread.fpu)) > > + return; Are you suggesting we should instead open code it inside that routine? > But this raises an interesting point -- what happens if allocation > fails? I think that, from kernel code, we simply cannot support this > exception mechanism. If kernel code wants to use AMX (and that would > be very strange indeed), it should call x86_i_am_crazy_amx_begin() and > handle errors, not rely on exceptions. From user code, I assume we > send a signal if allocation fails. The XFD feature allows us to transparently expand the kernel context switch buffer for a user task, when that task first touches this associated hardware. It allows applications to operate as if the kernel had allocated the backing AMX context switch buffer at initialization time. However, since we expect only a sub-set of tasks to actually use AMX, we instead defer allocation until we actually see first use for that task, rather than allocating for all tasks. While we currently don't propose AMX use inside the kernel, it is conceivable that could be done in the future, just like AVX is used by the RAID code; and it would be done the same way -- kernel_fpu_begin()/kernel_fpu_end(). Such future Kernel AMX use would _not_ arm XFD, and would _not_ provoke this fault. (note that we context switch the XFD-armed state per task) vmalloc() does not fail, and does not return an error, and so there is no concept of returning a signal. If we got to the point where vmalloc() sleeps, then the system has bigger OOM issues, and the OOM killer would be on the prowl. If we were concerned about using vmalloc for a couple of pages in the task structure, Then we could implement a routine to harvest unused buffers and free them -- but that didn't seem worth the complexity. Note that this feature is 64-bit only. Thanks, -Len
Re: [PATCH] nvmet: fix uninitialized work for zero kato
Hit a warning: WARNING: CPU: 1 PID: 241 at kernel/workqueue.c:1627 __queue_delayed_work+0x6d/0x90 with trace: mod_delayed_work_on+0x59/0x90 nvmet_update_cc+0xee/0x100 [nvmet] nvmet_execute_prop_set+0x72/0x80 [nvmet] nvmet_tcp_try_recv_pdu+0x2f7/0x770 [nvmet_tcp] nvmet_tcp_io_work+0x63f/0xb2d [nvmet_tcp] ... This could be reproduced easily with a keep alive time 0: nvme connect -t tcp -n NQN -a ADDR -s PORT --keep-alive-tmo=0 The reason is: Starting an uninitialized work when initiator connects with zero kato. Althrough keep-alive timer is disabled during allocating a ctrl (fix in 0d3b6a8d213a), ka_work still has a chance to run (called by nvmet_start_ctrl to detect dead host). This should have a "Fixes:" tag. Initilize ka_work during allocating ctrl, and set a reasonable kato before scheduling ka_work. Signed-off-by: zhenwei pi --- drivers/nvme/target/core.c | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c index b7b63330b5ef..3c5b2b065476 100644 --- a/drivers/nvme/target/core.c +++ b/drivers/nvme/target/core.c @@ -19,6 +19,8 @@ struct workqueue_struct *buffered_io_wq; static const struct nvmet_fabrics_ops *nvmet_transports[NVMF_TRTYPE_MAX]; static DEFINE_IDA(cntlid_ida); +#define NVMET_DEFAULT_KATO 5 + /* * This read/write semaphore is used to synchronize access to configuration * information on a target system that will result in discovery log page @@ -385,6 +387,11 @@ static void nvmet_keep_alive_timer(struct work_struct *work) if (cmd_seen) { pr_debug("ctrl %d reschedule traffic based keep-alive timer\n", ctrl->cntlid); + + /* run once, trigger from nvmet_start_ctrl to detect dead link */ + if (!ctrl->kato) + return; + schedule_delayed_work(>ka_work, ctrl->kato * HZ); It will be better to just schedule/mod the ka_work if kato != 0, other changes in the patch aren't needed IMO. return; } @@ -403,15 +410,11 @@ static void nvmet_start_keep_alive_timer(struct nvmet_ctrl *ctrl) pr_debug("ctrl %d start keep-alive timer for %d secs\n", ctrl->cntlid, ctrl->kato); - INIT_DELAYED_WORK(>ka_work, nvmet_keep_alive_timer); schedule_delayed_work(>ka_work, ctrl->kato * HZ); } static void nvmet_stop_keep_alive_timer(struct nvmet_ctrl *ctrl) { - if (unlikely(ctrl->kato == 0)) - return; - pr_debug("ctrl %d stop keep-alive\n", ctrl->cntlid); cancel_delayed_work_sync(>ka_work); @@ -1107,6 +1110,8 @@ static inline u8 nvmet_cc_iocqes(u32 cc) static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl) { + u32 kato = ctrl->kato ? ctrl->kato : NVMET_DEFAULT_KATO; + The controller shouldn't have a default value, it should receive the desired value from the host. lockdep_assert_held(>lock); if (nvmet_cc_iosqes(ctrl->cc) != NVME_NVM_IOSQES || @@ -1126,7 +1131,7 @@ static void nvmet_start_ctrl(struct nvmet_ctrl *ctrl) * in case a host died before it enabled the controller. Hence, simply * reset the keep alive timer when the controller is enabled. */ - mod_delayed_work(system_wq, >ka_work, ctrl->kato * HZ); + mod_delayed_work(system_wq, >ka_work, kato * HZ); } static void nvmet_clear_ctrl(struct nvmet_ctrl *ctrl) @@ -1378,6 +1383,7 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn, /* keep-alive timeout in seconds */ ctrl->kato = DIV_ROUND_UP(kato, 1000); + INIT_DELAYED_WORK(>ka_work, nvmet_keep_alive_timer); ctrl->err_counter = 0; spin_lock_init(>error_lock);
Re: linux-next: build failure after merge of the vfio tree
Hi Diana, On Tue, 13 Oct 2020 18:56:07 +0300 Diana Craciun OSS wrote: > > Hi, > > How does it fail? What's the error? Sorry about that: drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c: In function 'vfio_fsl_mc_set_irq_trigger': drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c:121:8: error: implicit declaration of function 'fsl_mc_populate_irq_pool' [-Werror=implicit-function-declaration] 121 | ret = fsl_mc_populate_irq_pool(mc_cont, |^~~~ drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c:122:4: error: 'FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS' undeclared (first use in this function) 122 |FSL_MC_IRQ_POOL_MAX_TOTAL_IRQS); |^~ drivers/vfio/fsl-mc/vfio_fsl_mc.c: In function 'vfio_fsl_mc_release': drivers/vfio/fsl-mc/vfio_fsl_mc.c:178:9: error: implicit declaration of function 'dprc_reset_container' [-Werror=implicit-function-declaration] 178 | ret = dprc_reset_container(mc_cont->mc_io, 0, | ^~~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c:181:6: error: 'DPRC_RESET_OPTION_NON_RECURSIVE' undeclared (first use in this function) 181 | DPRC_RESET_OPTION_NON_RECURSIVE); | ^~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c:181:6: note: each undeclared identifier is reported only once for each function it appears in drivers/vfio/fsl-mc/vfio_fsl_mc.c:191:3: error: implicit declaration of function 'fsl_mc_cleanup_irq_pool' [-Werror=implicit-function-declaration] 191 | fsl_mc_cleanup_irq_pool(mc_cont); | ^~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c: In function 'vfio_fsl_mc_ioctl': drivers/vfio/fsl-mc/vfio_fsl_mc.c:316:9: error: 'DPRC_RESET_OPTION_NON_RECURSIVE' undeclared (first use in this function) 316 | DPRC_RESET_OPTION_NON_RECURSIVE); | ^~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c: In function 'vfio_fsl_mc_mmap_mmio': drivers/vfio/fsl-mc/vfio_fsl_mc.c:455:36: error: 'FSL_MC_REGION_CACHEABLE' undeclared (first use in this function) 455 | region_cacheable = (region.type & FSL_MC_REGION_CACHEABLE) && |^~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c:456:22: error: 'FSL_MC_REGION_SHAREABLE' undeclared (first use in this function) 456 | (region.type & FSL_MC_REGION_SHAREABLE); | ^~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c: In function 'vfio_fsl_mc_bus_notifier': drivers/vfio/fsl-mc/vfio_fsl_mc.c:522:9: error: 'struct fsl_mc_device' has no member named 'driver_override' 522 | mc_dev->driver_override = kasprintf(GFP_KERNEL, "%s", | ^~ drivers/vfio/fsl-mc/vfio_fsl_mc.c:524:14: error: 'struct fsl_mc_device' has no member named 'driver_override' 524 | if (!mc_dev->driver_override) | ^~ drivers/vfio/fsl-mc/vfio_fsl_mc.c: In function 'vfio_fsl_mc_init_device': drivers/vfio/fsl-mc/vfio_fsl_mc.c:561:8: error: implicit declaration of function 'dprc_setup' [-Werror=implicit-function-declaration] 561 | ret = dprc_setup(mc_dev); |^~ drivers/vfio/fsl-mc/vfio_fsl_mc.c:567:8: error: implicit declaration of function 'dprc_scan_container' [-Werror=implicit-function-declaration] 567 | ret = dprc_scan_container(mc_dev, false); |^~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c:576:2: error: implicit declaration of function 'dprc_remove_devices' [-Werror=implicit-function-declaration] 576 | dprc_remove_devices(mc_dev, NULL, 0); | ^~~ drivers/vfio/fsl-mc/vfio_fsl_mc.c:577:2: error: implicit declaration of function 'dprc_cleanup' [-Werror=implicit-function-declaration] 577 | dprc_cleanup(mc_dev); | ^~~~ -- Cheers, Stephen Rothwell pgpashWGkQyKb.pgp Description: OpenPGP digital signature
Re: [PATCH 2/4] Documentation/powercap/dtpm: Add documentation for dtpm
On 10/6/2020 6:20 AM, Daniel Lezcano wrote: The dynamic thermal and power management is a technique to dynamically adjust the power consumption of different devices in order to ensure a global thermal constraint. An userspace daemon is usually monitoring the temperature and the power to take immediate action on the device. The DTPM framework provides an unified API to userspace to act on the power. Document this framework. Signed-off-by: Daniel Lezcano --- Documentation/power/powercap/dtpm.rst | 222 ++ 1 file changed, 222 insertions(+) create mode 100644 Documentation/power/powercap/dtpm.rst diff --git a/Documentation/power/powercap/dtpm.rst b/Documentation/power/powercap/dtpm.rst new file mode 100644 index ..ce11cf183994 --- /dev/null +++ b/Documentation/power/powercap/dtpm.rst @@ -0,0 +1,222 @@ +== +Dynamic Thermal Power Management framework +== + +On the embedded world, the complexity of the SoC leads to an +increasing number of hotspots which need to be monitored and mitigated +as a whole in order to prevent the temperature to go above the +normative and legally stated 'skin temperature'. + +Another aspect is to sustain the performance for a given power budget, +for example virtual reality where the user can feel dizziness if the +performance is capped while a big CPU is processing something else. Or +reduce the battery charging because the dissipated power is too high +compared with the power consumed by other devices. + +The userspace is the most adequate place to dynamically act on the +different devices by limiting their power given an application +profile: it has the knowledge of the platform. + +The Dynamic Thermal Power Management (DTPM) is a technique acting on +the device power by limiting and/or balancing a power budget among +different devices. + +The DTPM framework provides an unified interface to act on the +device power. + +=== +1. Overview +=== + +The DTPM framework relies on the powercap framework to create the +powercap entries in the sysfs directory and implement the backend +driver to do the connection with the power manageable device. + +The DTPM is a tree representation describing the power constraints +shared between devices, not their physical positions. + +The nodes of the tree are a virtual description aggregating the power +characteristics of the children nodes and their power limitations. + +The leaves of the tree are the real power manageable devices. + +For instance: + + SoC + | + `-- pkg + | + |-- pd0 (cpu0-3) + | + `-- pd1 (cpu4-5) + +* The pkg power will be the sum of pd0 and pd1 power numbers. + + SoC (400mW - 3100mW) + | + `-- pkg (400mW - 3100mW) + | + |-- pd0 (100mW - 700mW) + | + `-- pd1 (300mW - 2400mW) + +* When the nodes are inserted in the tree, their power characteristics + are propagated to the parents. + + SoC (600mW - 5900mW) + | + |-- pkg (400mW - 3100mW) + || + ||-- pd0 (100mW - 700mW) + || + |`-- pd1 (300mW - 2400mW) + | + `-- pd2 (200mW - 2800mW) + +* Each node have a weight on a 2^10 basis reflecting the percentage of + power consumption along the siblings. + + SoC (w=1024) + | + |-- pkg (w=538) + || + ||-- pd0 (w=231) + || + |`-- pd1 (w=794) + | + `-- pd2 (w=486) + + Note the sum of weights at the same level are equal to 1024. + +* When a power limitation is applied to a node, then it is distributed + along the children given their weights. For example, if we set a + power limitation of 3200mW at the 'SoC' root node, the resulting + tree will be. + + SoC (w=1024) <--- power_limit = 3200mW + | + |-- pkg (w=538) --> power_limit = 1681mW + || + ||-- pd0 (w=231) --> power_limit = 378mW + || + |`-- pd1 (w=794) --> power_limit = 1303mW + | + `-- pd2 (w=486) --> power_limit = 1519mW + + +1.1 Flat description + + +A root node is created and it is the parent of all the nodes. This +description is the simplest one and it is supposed to give to +userspace a flat representation of all the devices supporting the +power limitation without any power limitation distribution. + + +1.2 Hierarchical description + + +The different devices supporting the power limitation are represented +hierarchically. There is one root node, all intermediate nodes are +grouping the child nodes which can be intermediate nodes also or real +devices. + +The intermediate nodes aggregate the power information and allows to +set the power limit given the weight of the nodes. + + +2. Userspace API + + +As stated in the overview, the DTPM framework is built on top of the +powercap framework. Thus the sysfs interface is the same, please refer +to the powercap
Re: [PATCH v2 22/24] ice: docs fix a devlink info that broke a table
On 10/13/2020 5:14 AM, Mauro Carvalho Chehab wrote: > Changeset 410d06879c01 ("ice: add the DDP Track ID to devlink info") > added description for a new devlink field, but forgot to add > one of its columns, causing it to break: > > .../Documentation/networking/devlink/ice.rst:15: WARNING: Error parsing > content block for the "list-table" directive: uniform two-level bullet list > expected, but row 11 does not contain the same number of items as row 1 (3 vs > 4). > > .. list-table:: devlink info versions implemented > :widths: 5 5 5 90 > ... > * - ``fw.app.bundle_id`` > - 0xc001 > - Unique identifier for the DDP package loaded in the device. Also > referred to as the DDP Track ID. Can be used to uniquely > identify > the specific DDP package. > > Add the type field to the ``fw.app.bundle_id`` row. > > Fixes: 410d06879c01 ("ice: add the DDP Track ID to devlink info") > Signed-off-by: Mauro Carvalho Chehab Yep, looks correct. Thanks for the fix! Reviewed-by: Jacob Keller > --- > Documentation/networking/devlink/ice.rst | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/Documentation/networking/devlink/ice.rst > b/Documentation/networking/devlink/ice.rst > index b165181d5d4d..a432dc419fa4 100644 > --- a/Documentation/networking/devlink/ice.rst > +++ b/Documentation/networking/devlink/ice.rst > @@ -70,6 +70,7 @@ The ``ice`` driver reports the following versions > that both the name (as reported by ``fw.app.name``) and version are > required to uniquely identify the package. > * - ``fw.app.bundle_id`` > + - running >- 0xc001 >- Unique identifier for the DDP package loaded in the device. Also > referred to as the DDP Track ID. Can be used to uniquely identify >
RE: [RFC PATCH 07/22] x86/fpu/xstate: Introduce helpers to manage an xstate area dynamically
> > From: Andy Lutomirski > > > + /* > > +* The caller may be under interrupt disabled condition. Ensure > > interrupt > > +* allowance before the memory allocation, which may involve with > > page faults. > > +*/ > > + local_irq_enable(); > ... you can't just enable IRQs here. If IRQs are off, they're off for a > reason. Secondly, if they're *on*, you just forgot that fact. Good catch. This is a trap handler from user-space and should never be called with irqs disabled, So the local_irq_enable() should be dead code. Chang suggested that he erroneously left it in from a previous implementation. > > + /* We need 64B aligned pointer, but vmalloc() returns a > > page-aligned address */ > > + state_ptr = vmalloc(newsz); > And allocating this state from vmalloc() seems questionable. Why are you > doing this? While the buffer needs to be virtually contiguous, it doesn't need to be physically contiguous, And so vmalloc() is less overhead than kmalloc() for this. Thanks, -Len
Re: [PATCH 1/2] x86: Remove led/gpio setup from pcengines platform driver
On 13/10/2020 09:48, Hans de Goede wrote: > On 10/12/20 9:39 PM, Enrico Weigelt, metux IT consult wrote: >> On 22.09.20 00:17, Ed W wrote: >>> Hi, I've been adding support for the PC Engines APU5 board, which is a >>> variant of the APU 2-4 >>> boards >>> with some nice features. The current platform driver for pcengines boards >>> has some redundant >>> features with regards to recent bios/firmware packages for the board as >>> they now set the ACPI >>> tables >>> to indicate GPIOs for keys and leds. >> >> NAK. Breaks existing userlands in the field (literally field), forcing >> users to fw upgrade is not an option (field roll would be realy expensive). > > Thank you Enrico, I was wondering the same (what about userspace breakage) > when I was looking at this patch. It is good to have confirmation that > userspace breakage is a real issue here. This isn't the whole story. The original naming was board specific. Then Enrico (not unreasonably - I actually prefer his naming) changed the naming to be non board specific. Then within 2 months PC Engines introduced ACPI based config using the old names. So if we are holding "userspace breakage" as the gold standard, then the original (also the current) names have actually been around longest and likely cause the least userspace breakage. Also, some other pieces of this module have already been removed (SIM Swap), so there is an existing precedent for "userspace breakage" and trimming down this platform driver. In big picture terms, changing the name of the LED device doesn't seem a huge concern to me... A udev rule can setup compatibility forwards/backwards quite trivially I think? Kind regards Ed W
Re: [PATCH 1/2] x86: Remove led/gpio setup from pcengines platform driver
On 12/10/2020 20:39, Enrico Weigelt, metux IT consult wrote: > On 22.09.20 00:17, Ed W wrote: >> Hi, I've been adding support for the PC Engines APU5 board, which is a >> variant of the APU 2-4 boards >> with some nice features. The current platform driver for pcengines boards >> has some redundant >> features with regards to recent bios/firmware packages for the board as they >> now set the ACPI tables >> to indicate GPIOs for keys and leds. > NAK. Breaks existing userlands in the field (literally field), forcing > users to fw upgrade is not an option (field roll would be realy expensive). But why are users "in the field" updating a kernel willy nilly without also updating the userland software that talks to it? Why is the kernel upgrade trivial, but the fw upgrade is not an option? Why not also update the app or setup a udev rule? I would understand if we were talking something fairly major, but it's the case of matching a filename that YOU changed from an old name to the current name and it's now changing back to the original name? >> So I've submitted a patch to eliminate this. It could be argued >> that it's useful to support older firmware versions, but there is also a >> 'leds-apu' driver which a) >> probably ought to be marked deprecated with a view to removing it and b) >> implements the leds even on >> antique firmware versions. > leds-apu is only for *OLD* apu1 - it does *not* work with v2/3/4, > completely different chipset. That's extremely disingenuous!! It USED to work for the APU2-4 except that YOU removed support for APU2-4 from that module!! >> In implementing the APU5 I changed some of the exported gpio names to make >> them more closely match >> functionality across all the boards. > For example APU2 vs APU4 both support >> 2x LTE options, but in >> different mpcie slots and this affects the numbering of options, but not the >> sense of them (so I >> renamed them based on the intention of the option). This is particularly >> true on APU5 which supports >> 3x LTE cards > Dont break existing userlands. But YOU already did that!! Look, the original situation was that: - up to July 2019 there was a kernel module that named the LEDs in the form apu2:green:1, etc. - In July 2019 you removed that and "broke all of userland by renaming the LEDs" (never break userland right?) - Then in Sept 2019, PCEngines released a new bios/firmware line which setup ACPI correctly to register these GPIOs with some default names along the lines of apu2:green:1 or similar. So now we are back to the original naming convention - This meant that from Sept 2019 your module created duplicate LEDs with a different set of names So the situationt boils down to: - LEDs have been named like "apu2:green:1" continuously, with a brief outage in Aug-Sep 2019. - You have introduced a new module which unnecessarily duplicates those LEDs for users of this board with a bios/firmware post Sep 2019. - Your new naming convention isn't the same as this historic naming convention Now don't get me wrong, I prefer your module naming, but you are very disingenuous to claim that I'm trying to break userspace when you already did it! Plus I see no future for the LED piece of your module given that it's done by ACPI in all modern bios? Do you really want a duplicate set of LEDs to exist forever in userspace? This doesn't seem to be workable? Lets be clear, the current situation is: - LED names change from "apu:green:1" to "apu2:green:1". - This can be trivially worked around with some symlinks and/or a udev rule - The historic name has been "apu2:green:1" in the original LED module and now in ACPI. I'm not wild about this naming convention, but it's been around longest. If one has to pick which userspace to break, then this seems to have precedence. - Your LED based SIM toggle HAS already gone. So you have another example of userspace being broken right there. (Seems that this rule isn't so concrete?). So you already need to (significantly?) adjust your userspace code - I'm not seeing how/why the LED change is such a blocker? >> Can I get some advice: It would be helpful if the kernel would export the >> GPIOs to user-space >> automatically since toggling SIM slots is fairly useful task in userspace. > This is planned to be moved to either an own subsystem or into rfkill > (which would have to be extended for such things). Can you send me a pointer to this planning? Is this something concrete or aspirational? I need something I can use right now for SIM swap. exporting GPIOs with known names seems no more evil than exporting LEDs with known names? Do you have any concrete suggestion for the here and now? Given that the LED based sim swap is already removed from the kernel, how are you planning to toggle SIM swap in userspace? Hans, can I ask you to look again at the history of this please. Bearing in mind the speed kernel stuff takes to get to end users, we are
Re: [PATCH] power: supply: bq25980: Fix uninitialized wd_reg_val and overrun
Hi Dan, On Tue, Oct 13, 2020 at 01:03:13PM -0500, Dan Murphy wrote: > On 10/9/20 7:12 AM, Dan Murphy wrote: > > Fix the issue when 'i' is equal to array size then array index over > > runs the array when checking for the watch dog value. > > > > This also fixes the uninitialized wd_reg_val if the for..loop was not > > successful in finding an appropriate match. > > Might want to pull this into next as well this is a 0-day bug fix Yes, merged now. I did not take it directly, since I had to rebase it first. Please always send power-supply patches based on the for-next branch, which already contained a fix for the uninitialized wd_reg_val. (also no need to Cc DT people for this patch :)) -- Sebastian signature.asc Description: PGP signature
Re: [PATCH 1/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC
On Tue, Oct 13, 2020 at 10:09:25PM +0100, Al Viro wrote: > On Tue, Oct 13, 2020 at 04:06:08PM +0200, Giuseppe Scrivano wrote: > > + spin_lock(_fds->file_lock); > > + fdt = files_fdtable(cur_fds); > > + cur_max = fdt->max_fds - 1; > > + max_fd = min(max_fd, cur_max); > > + while (fd <= max_fd) > > + __set_close_on_exec(fd++, fdt); > > + spin_unlock(_fds->file_lock); > > First of all, this is an atrocious way to set all bits > in a range. What's more, you don't want to set it for *all* Hm, good point. Would it make sense to just use the bitmap_set() proposal since the 3 to ~0 case is most common or to actually iterate based on the open fds? Christian
Re: [PATCH] vfio/platform: Replace spin_lock_irqsave by spin_lock in hard IRQ
On Tue, 13 Oct 2020 10:00:58 +0800 Tian Tao wrote: > It is redundant to do irqsave and irqrestore in hardIRQ context. But this function is also called from non-IRQ context. Thanks, Alex > Signed-off-by: Tian Tao > --- > drivers/vfio/platform/vfio_platform_irq.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/drivers/vfio/platform/vfio_platform_irq.c > b/drivers/vfio/platform/vfio_platform_irq.c > index c5b09ec..24fd6c5 100644 > --- a/drivers/vfio/platform/vfio_platform_irq.c > +++ b/drivers/vfio/platform/vfio_platform_irq.c > @@ -139,10 +139,9 @@ static int vfio_platform_set_irq_unmask(struct > vfio_platform_device *vdev, > static irqreturn_t vfio_automasked_irq_handler(int irq, void *dev_id) > { > struct vfio_platform_irq *irq_ctx = dev_id; > - unsigned long flags; > int ret = IRQ_NONE; > > - spin_lock_irqsave(_ctx->lock, flags); > + spin_lock(_ctx->lock); > > if (!irq_ctx->masked) { > ret = IRQ_HANDLED; > @@ -152,7 +151,7 @@ static irqreturn_t vfio_automasked_irq_handler(int irq, > void *dev_id) > irq_ctx->masked = true; > } > > - spin_unlock_irqrestore(_ctx->lock, flags); > + spin_unlock(_ctx->lock); > > if (ret == IRQ_HANDLED) > eventfd_signal(irq_ctx->trigger, 1);
Re: [PATCH v4 1/2] dt-bindings: power: Add the bq25790 dt bindings
Hi Dan, On Tue, Oct 13, 2020 at 01:03:52PM -0500, Dan Murphy wrote: > On 10/9/20 9:41 AM, Dan Murphy wrote: > > Add the bindings for the bq25790. > > Also any updates on this series? Sorry, It's not gonna make it into this merge window. I did not yet fully review it and merge window is already open. I can say you are at least leaking USB notifier on driver removal, since you never call usb_unregister_notifier() for that case (similar to Ricardo's submission). -- Sebastian signature.asc Description: PGP signature
Re: WARNING: can't access registers at asm_sysvec_reschedule_ipi
syzbot has found a reproducer for the following issue on: HEAD commit:64a632da net: fec: Fix phy_device lookup for phy_reset_aft.. git tree: net console output: https://syzkaller.appspot.com/x/log.txt?x=11580c8050 kernel config: https://syzkaller.appspot.com/x/.config?x=c06bcf3cc963d91c dashboard link: https://syzkaller.appspot.com/bug?extid=853f7009c5c271473926 compiler: gcc (GCC) 10.1.0-syz 20200507 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12c1bc6f90 IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+853f7009c5c271473...@syzkaller.appspotmail.com WARNING: can't access registers at asm_sysvec_reschedule_ipi+0x12/0x20 arch/x86/include/asm/idtentry.h:586
Re: Re: [PATCH v2] arm64: dts: allwinner: h6: add eMMC voltage property for Beelink GS1
Dne petek, 09. oktober 2020 ob 09:36:51 CEST je Maxime Ripard napisal(a): > On Thu, Oct 08, 2020 at 10:00:06PM +0200, Clément Péron wrote: > > Hi Maxime, > > > > Adding linux-sunxi and Jernej Skrabec to this discussion. > > > > On Thu, 8 Oct 2020 at 17:10, Maxime Ripard wrote: > > > > > > Hi Clément, > > > > > > On Mon, Oct 05, 2020 at 08:47:19PM +0200, Clément Péron wrote: > > > > On Mon, 5 Oct 2020 at 11:21, Maxime Ripard wrote: > > > > > > > > > > Hi Clément, > > > > > > > > > > On Sat, Oct 03, 2020 at 11:20:01AM +0200, Clément Péron wrote: > > > > > > Sunxi MMC driver can't distinguish at runtime what's the I/O voltage > > > > > > for HS200 mode. > > > > > > > > > > Unfortunately, that's not true (or at least, that's not related to your patch). > > > > > > > > > > > Add a property in the device-tree to notify MMC core about this > > > > > > configuration. > > > > > > > > > > > > Fixes: 089bee8dd119 ("arm64: dts: allwinner: h6: Introduce Beelink GS1 board") > > > > > > Signed-off-by: Clément Péron > > > > > > --- > > > > > > arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts | 1 + > > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > > > diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink- gs1.dts b/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts > > > > > > index 049c21718846..3f20d2c9 100644 > > > > > > --- a/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts > > > > > > +++ b/arch/arm64/boot/dts/allwinner/sun50i-h6-beelink-gs1.dts > > > > > > @@ -145,6 +145,7 @@ { > > > > > > vqmmc-supply = <_bldo2>; > > > > > > non-removable; > > > > > > cap-mmc-hw-reset; > > > > > > + mmc-hs200-1_8v; > > > > > > bus-width = <8>; > > > > > > status = "okay"; > > > > > > }; > > > > > > > > > > I'm not really sure what you're trying to fix here, but as far as MMC > > > > > goes, eMMC's can support io voltage of 3.3, 1.8 and 1.2V. Modes up until > > > > > HS DDR (50MHz in DDR) will use an IO voltage of 3.3V, higher speed modes > > > > > (HS200 and HS400) supporting 1.8V and 1.2V. > > > > > > > > Some users report that the eMMC is not working properly on their > > > > Beelink GS1 boards. > > > > > > > > > The mmc-hs200-1_8v property states that the MMC controller supports the > > > > > HS200 mode at 1.8V. Now, I can only assume that since BLDO2 is set up at > > > > > 1.8V then otherwise, the MMC core will rightfully decide to use the > > > > > highest supported mode. In this case, since the driver sets it, it would > > > > > be HS-DDR at 3.3V, which won't work with that fixed regulator. > > > > > > > > > > I can only assume that enabling HS200 at 1.8V only fixes the issue you > > > > > have because otherwise it would use HS-DDR at 3.3V, ie not actually > > > > > fixing the issue but sweeping it under the rug. > > > > > > > > > > Trying to add mmc-ddr-1_8v would be a good idea > > > > > > > > Thanks for the explanation, this is indeed the correct one. > > > > So It looks like the SDIO controller has an issue on some boards when > > > > using HS-DDR mode. > > > > > > > > Is this patch acceptable with the proper commit log? > > > > > > If HS-DDR works, yes, but I assume it doesn't? > > > > After discussing with Jernej about this issue, I understood that: > > - Automatic delay calibration is not implemented > > - We also miss some handling of DDR related bits in control register > > > > So none of H5/H6 boards should actually work. > > (Some 'lucky' boards seem to work enough to switch to HS200 mode...) > > > > To "fix" this the H5 disable the HS-DDR mode in sunxi mmc driver : > > https://github.com/torvalds/linux/blob/master/drivers/mmc/host/sunxi-mmc.c#L1409 > > I find it suspicious that some boards would have traces not good enough > for HS-DDR (50MHz in DDR) but would work fine in HS200 (200MHz in SDR). > If there's some mismatch on the traces, it will only be worse in HS200. FYI, similar situation is also with Tanix TX6 board. Mine works well in HS-DDR mode, but some people reported that it doesn't work for them. The only possible difference could be different eMMC IC. I'll try to confirm that. Anyway, I did some tests on OrangePi 3 board which also have eMMC. Both modes (HS-DDR and HS200) are supported and work well. Interesting observation is that speed test (hdparm -t) reported 80.58 MB/sec for HS-DDR mode and 43.40 MB/sec for HS200. As it can be seen here, HS-DDR is quicker by a factor of 2, but it should be the other way around. Reason for this is that both modes use same base clock and thus HS-DDR produces higher speed. If I change f_max to 150 MHz (max. per datasheet for SDR @ 1.8 V) then naturally HS200 mode is faster (124.63 MB/sec) as HS-DDR as it should be. This would be actually correct test for problematic boards but unfortunately I don't have it. I also hacked in support for HS400 (~143 MB/s) and this mode is the only one which really needs calibration on my board. Two observations from BSP
Re: [PATCH 1/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC
On Tue, Oct 13, 2020 at 11:04:21PM +0200, Rasmus Villemoes wrote: > On 13/10/2020 22.54, Christian Brauner wrote: > > On Tue, Oct 13, 2020 at 04:06:08PM +0200, Giuseppe Scrivano wrote: > > > > Hey Guiseppe, > > > > Thanks for the patch! > > > >> When the flag CLOSE_RANGE_CLOEXEC is set, close_range doesn't > >> immediately close the files but it sets the close-on-exec bit. > > > > Hm, please expand on the use-cases a little here so people know where > > and how this is useful. Keeping the rationale for a change in the commit > > log is really important. > > > > > I think I don't have quarrels with this patch in principle but I wonder > > if something like the following wouldn't be easier to follow: > > > > diff --git a/fs/file.c b/fs/file.c > > index 21c0893f2f1d..872a4098c3be 100644 > > --- a/fs/file.c > > +++ b/fs/file.c > > @@ -672,6 +672,32 @@ int __close_fd(struct files_struct *files, unsigned fd) > > } > > EXPORT_SYMBOL(__close_fd); /* for ksys_close() */ > > > > +static inline void __range_cloexec(struct files_struct *cur_fds, > > + unsigned int fd, unsigned max_fd) > > +{ > > + struct fdtable *fdt; > > + spin_lock(_fds->file_lock); > > + fdt = files_fdtable(cur_fds); > > + while (fd <= max_fd) > > + __set_close_on_exec(fd++, fdt); > (I should've warned that I just proposed this as a completely untested brainstorm.) > Doesn't that want to be > > bitmap_set(fdt->close_on_exec, fd, max_fd - fd + 1) > > to do word-at-a-time? I assume this would mostly be called with (3, ~0U) > as arguments or something like that. Yes, that is the common case. Thanks Rasmus, I was unaware we had that function. In that case I think we'd actually need sm like: spin_lock(_fds->file_lock); fdt = files_fdtable(cur_fds); cur_max = files_fdtable(cur_fds)->max_fds - 1; max_fd = min(max_fd, cur_max); bitmap_set(fdt->close_on_exec, fd, max_fd - fd + 1) so we retrieve max_fd with the spinlock held, I think. Christian
Re: [PATCH v1 02/15] perf report: output trace file name in raw trace dump
Hi, On 13.10.2020 22:54, Jiri Olsa wrote: > On Mon, Oct 12, 2020 at 11:54:24AM +0300, Alexey Budankov wrote: >> >> Output path of a trace file into raw dump (-D) @. >> Print offset of PERF_RECORD_COMPRESSED record instead of zero for >> decompressed records: >> 0x22...@perf.data [0x30]: event: 9 >> or >> 0x15c...@perf.data/data.7 [0x30]: event: 9 >> >> Signed-off-by: Alexey Budankov > > hi, > I'm getting: > > CC builtin-inject.o > builtin-inject.c: In function ‘cmd_inject’: > builtin-inject.c:850:18: error: initialization of ‘int (*)(struct > perf_session *, union perf_event *, u64, const char *)’ {aka ‘int (*)(struct > perf_session *, union perf_event *, long unsigned int, const char *)’} from > incompatible pointer type ‘int (*)(struct perf_session *, union perf_event *, > u64)’ {aka ‘int (*)(struct perf_session *, union perf_event *, long unsigned > int)’} [-Werror=incompatible-pointer-types] > 850 |.compressed = perf_event__repipe_op4_synth, > | ^~~~ > builtin-inject.c:850:18: note: (near initialization for > ‘inject.tool.compressed’) > > it's probably recent build id changes Looks like that's it. Fix is in v2 and follows: diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c index f3f965157d69..35c005b8da7f 100644 --- a/tools/perf/builtin-inject.c +++ b/tools/perf/builtin-inject.c @@ -106,7 +106,8 @@ static int perf_event__repipe_op2_synth(struct perf_session *session, static int perf_event__repipe_op4_synth(struct perf_session *session, union perf_event *event, - u64 data __maybe_unused) + u64 data __maybe_unused, + const char *str __maybe_unused) { return perf_event__repipe_synth(session->tool, event); } Thanks! Alexei
Re: [PATCH] MIPS: Add set_memory_node()
On Tue, Oct 13, 2020 at 11:19:43AM +0800, Jinyang He wrote: > Commit e7ae8d174eec ("MIPS: replace add_memory_region with memblock") this commit just changed code and doesn't change the problem you want to solve. > replaced add_memory_region(, , BOOT_MEM_RAM) with memblock_add(). But > it doesn't work well on some platforms which have NUMA like Loongson64. > Because memblock_add() calls memblock_add_range() and sets memory at > MAX_NUMNODES. As mm/memblock.c says, assign the region to a NUMA node > later by using memblock_set_node(). This patch provides a NUMA port so it says later, which doesn't have to be right after the memblock_add. I don't know why you need the whole mem=/memmap= game, but please do a for_each_memblock(...) memblock_set_node(...); somewhere in arch/mips/loongson64 to fix up the memory blocks as needed. Thomas. -- Crap can work. Given enough thrust pigs will fly, but it's not necessarily a good idea.[ RFC1925, 2.3 ]
Re: [PATCH 1/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC
On Tue, Oct 13, 2020 at 04:06:08PM +0200, Giuseppe Scrivano wrote: > + spin_lock(_fds->file_lock); > + fdt = files_fdtable(cur_fds); > + cur_max = fdt->max_fds - 1; > + max_fd = min(max_fd, cur_max); > + while (fd <= max_fd) > + __set_close_on_exec(fd++, fdt); > + spin_unlock(_fds->file_lock); First of all, this is an atrocious way to set all bits in a range. What's more, you don't want to set it for *all* bits - only for the ones present in open bitmap. It's probably harmless at the moment, but let's not create interesting surprises for the future.
Re: [GIT PULL] io_uring updates for 5.10-rc1
On Tue, Oct 13, 2020 at 01:49:01PM -0600, Jens Axboe wrote: > On 10/13/20 1:46 PM, Linus Torvalds wrote: > > On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe wrote: > >> > >> Here are the io_uring updates for 5.10. > > > > Very strange. My clang build gives a warning I've never seen before: > > > >/tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section > > attributes for .data..read_mostly > > > > and looking at what clang generates for the *.s file, it seems to be > > the "section" line in: > > > > .type io_op_defs,@object # @io_op_defs > > .section.data..read_mostly,"a",@progbits > > .p2align4 > > > > I think it's the combination of "const" and "__read_mostly". > > > > I think the warning is sensible: how can a piece of data be both > > "const" and "__read_mostly"? If it's "const", then it's not "mostly" > > read - it had better be _always_ read. > > > > I'm letting it go, and I've pulled this (gcc doesn't complain), but > > please have a look. > > Huh weird, I'll take a look. FWIW, the construct isn't unique across > the kernel. > > What clang are you using? > > -- > Jens Axboe > If const and non-const __read_mostly appeared in the same file, both gcc and clang would give errors.
Re: [PATCH 1/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC
On 13/10/2020 22.54, Christian Brauner wrote: > On Tue, Oct 13, 2020 at 04:06:08PM +0200, Giuseppe Scrivano wrote: > > Hey Guiseppe, > > Thanks for the patch! > >> When the flag CLOSE_RANGE_CLOEXEC is set, close_range doesn't >> immediately close the files but it sets the close-on-exec bit. > > Hm, please expand on the use-cases a little here so people know where > and how this is useful. Keeping the rationale for a change in the commit > log is really important. > > I think I don't have quarrels with this patch in principle but I wonder > if something like the following wouldn't be easier to follow: > > diff --git a/fs/file.c b/fs/file.c > index 21c0893f2f1d..872a4098c3be 100644 > --- a/fs/file.c > +++ b/fs/file.c > @@ -672,6 +672,32 @@ int __close_fd(struct files_struct *files, unsigned fd) > } > EXPORT_SYMBOL(__close_fd); /* for ksys_close() */ > > +static inline void __range_cloexec(struct files_struct *cur_fds, > +unsigned int fd, unsigned max_fd) > +{ > + struct fdtable *fdt; > + spin_lock(_fds->file_lock); > + fdt = files_fdtable(cur_fds); > + while (fd <= max_fd) > + __set_close_on_exec(fd++, fdt); Doesn't that want to be bitmap_set(fdt->close_on_exec, fd, max_fd - fd + 1) to do word-at-a-time? I assume this would mostly be called with (3, ~0U) as arguments or something like that. Rasmus
Re: [GIT PULL] io_uring updates for 5.10-rc1
On 10/13/20 2:49 PM, Rasmus Villemoes wrote: > On 13/10/2020 21.49, Jens Axboe wrote: >> On 10/13/20 1:46 PM, Linus Torvalds wrote: >>> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe wrote: Here are the io_uring updates for 5.10. >>> >>> Very strange. My clang build gives a warning I've never seen before: >>> >>>/tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section >>> attributes for .data..read_mostly >>> >>> and looking at what clang generates for the *.s file, it seems to be >>> the "section" line in: >>> >>> .type io_op_defs,@object # @io_op_defs >>> .section.data..read_mostly,"a",@progbits >>> .p2align4 >>> >>> I think it's the combination of "const" and "__read_mostly". >>> >>> I think the warning is sensible: how can a piece of data be both >>> "const" and "__read_mostly"? If it's "const", then it's not "mostly" >>> read - it had better be _always_ read. >>> >>> I'm letting it go, and I've pulled this (gcc doesn't complain), but >>> please have a look. >> >> Huh weird, I'll take a look. FWIW, the construct isn't unique across >> the kernel. > > Citation needed. There's lots of "pointer to const foo" stuff declared > as __read_mostly, but I can't find any objects that are themselves both > const and __read_mostly. Other than that io_op_defs and io_uring_fops now. You are right, they are all pointers, so not the same. I'll just revert the patch. -- Jens Axboe
Re: [PATCH 1/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC
On Tue, Oct 13, 2020 at 04:06:08PM +0200, Giuseppe Scrivano wrote: Hey Guiseppe, Thanks for the patch! > When the flag CLOSE_RANGE_CLOEXEC is set, close_range doesn't > immediately close the files but it sets the close-on-exec bit. Hm, please expand on the use-cases a little here so people know where and how this is useful. Keeping the rationale for a change in the commit log is really important. > > Signed-off-by: Giuseppe Scrivano > --- > fs/file.c| 56 ++-- > include/uapi/linux/close_range.h | 3 ++ > 2 files changed, 42 insertions(+), 17 deletions(-) > > diff --git a/fs/file.c b/fs/file.c > index 21c0893f2f1d..ad4ebee41e09 100644 > --- a/fs/file.c > +++ b/fs/file.c > @@ -672,6 +672,17 @@ int __close_fd(struct files_struct *files, unsigned fd) > } > EXPORT_SYMBOL(__close_fd); /* for ksys_close() */ > > +static unsigned int __get_max_fds(struct files_struct *cur_fds) > +{ > + unsigned int max_fds; > + > + rcu_read_lock(); > + /* cap to last valid index into fdtable */ > + max_fds = files_fdtable(cur_fds)->max_fds; > + rcu_read_unlock(); > + return max_fds; > +} > + > /** > * __close_range() - Close all file descriptors in a given range. > * > @@ -683,27 +694,23 @@ EXPORT_SYMBOL(__close_fd); /* for ksys_close() */ > */ > int __close_range(unsigned fd, unsigned max_fd, unsigned int flags) > { > - unsigned int cur_max; > + unsigned int cur_max = UINT_MAX; > struct task_struct *me = current; > struct files_struct *cur_fds = me->files, *fds = NULL; > > - if (flags & ~CLOSE_RANGE_UNSHARE) > + if (flags & ~(CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC)) > return -EINVAL; > > if (fd > max_fd) > return -EINVAL; > > - rcu_read_lock(); > - cur_max = files_fdtable(cur_fds)->max_fds; > - rcu_read_unlock(); > - > - /* cap to last valid index into fdtable */ > - cur_max--; > - > if (flags & CLOSE_RANGE_UNSHARE) { > int ret; > unsigned int max_unshare_fds = NR_OPEN_MAX; > > + /* cap to last valid index into fdtable */ > + cur_max = __get_max_fds(cur_fds) - 1; > + > /* >* If the requested range is greater than the current maximum, >* we're closing everything so only copy all file descriptors > @@ -724,16 +731,31 @@ int __close_range(unsigned fd, unsigned max_fd, > unsigned int flags) > swap(cur_fds, fds); > } > > - max_fd = min(max_fd, cur_max); > - while (fd <= max_fd) { > - struct file *file; > + if (flags & CLOSE_RANGE_CLOEXEC) { > + struct fdtable *fdt; > > - file = pick_file(cur_fds, fd++); > - if (!file) > - continue; > + spin_lock(_fds->file_lock); > + fdt = files_fdtable(cur_fds); > + cur_max = fdt->max_fds - 1; > + max_fd = min(max_fd, cur_max); > + while (fd <= max_fd) > + __set_close_on_exec(fd++, fdt); > + spin_unlock(_fds->file_lock); > + } else { > + /* Initialize cur_max if needed. */ > + if (cur_max == UINT_MAX) > + cur_max = __get_max_fds(cur_fds) - 1; The separation between how cur_fd is retrieved in the two branches makes the code more difficult to follow imho. Unless there's a clear reason why you've done it that way I would think that something like the patch I appended below might be a little clearer and easier to maintain(?). > + max_fd = min(max_fd, cur_max); > + while (fd <= max_fd) { > + struct file *file; > > - filp_close(file, cur_fds); > - cond_resched(); > + file = pick_file(cur_fds, fd++); > + if (!file) > + continue; > + > + filp_close(file, cur_fds); > + cond_resched(); > + } > } I think I don't have quarrels with this patch in principle but I wonder if something like the following wouldn't be easier to follow: diff --git a/fs/file.c b/fs/file.c index 21c0893f2f1d..872a4098c3be 100644 --- a/fs/file.c +++ b/fs/file.c @@ -672,6 +672,32 @@ int __close_fd(struct files_struct *files, unsigned fd) } EXPORT_SYMBOL(__close_fd); /* for ksys_close() */ +static inline void __range_cloexec(struct files_struct *cur_fds, + unsigned int fd, unsigned max_fd) +{ + struct fdtable *fdt; + spin_lock(_fds->file_lock); + fdt = files_fdtable(cur_fds); + while (fd <= max_fd) + __set_close_on_exec(fd++, fdt); + spin_unlock(_fds->file_lock); +} + +static inline void __range_close(struct files_struct *cur_fds, unsigned int fd, +unsigned max_fd) +{ + while (fd <=
Re: [PATCH RFC PKS/PMEM 24/58] fs/freevxfs: Utilize new kmap_thread()
On Tue, Oct 13, 2020 at 12:25:44PM +0100, Christoph Hellwig wrote: > > - kaddr = kmap(pp); > > + kaddr = kmap_thread(pp); > > memcpy(kaddr, vip->vii_immed.vi_immed + offset, PAGE_SIZE); > > - kunmap(pp); > > + kunmap_thread(pp); > > You only Cced me on this particular patch, which means I have absolutely > no idea what kmap_thread and kunmap_thread actually do, and thus can't > provide an informed review. Sorry the list was so big I struggled with who to CC and on which patches. > > That being said I think your life would be a lot easier if you add > helpers for the above code sequence and its counterpart that copies > to a potential hughmem page first, as that hides the implementation > details from most users. Matthew Wilcox and Al Viro have suggested similar ideas. https://lore.kernel.org/lkml/20201013205012.gi2046...@iweiny-desk2.sc.intel.com/ Ira
Re: [PATCH RFC PKS/PMEM 33/58] fs/cramfs: Utilize new kmap_thread()
On Tue, Oct 13, 2020 at 09:01:49PM +0100, Al Viro wrote: > On Tue, Oct 13, 2020 at 08:36:43PM +0100, Matthew Wilcox wrote: > > > static inline void copy_to_highpage(struct page *to, void *vfrom, unsigned > > int size) > > { > > char *vto = kmap_atomic(to); > > > > memcpy(vto, vfrom, size); > > kunmap_atomic(vto); > > } > > > > in linux/highmem.h ? > > You mean, like > static void memcpy_from_page(char *to, struct page *page, size_t offset, > size_t len) > { > char *from = kmap_atomic(page); > memcpy(to, from + offset, len); > kunmap_atomic(from); > } > > static void memcpy_to_page(struct page *page, size_t offset, const char > *from, size_t len) > { > char *to = kmap_atomic(page); > memcpy(to + offset, from, len); > kunmap_atomic(to); > } > > static void memzero_page(struct page *page, size_t offset, size_t len) > { > char *addr = kmap_atomic(page); > memset(addr + offset, 0, len); > kunmap_atomic(addr); > } > > in lib/iov_iter.c? FWIW, I don't like that "highpage" in the name and > highmem.h as location - these make perfect sense regardless of highmem; > they are normal memory operations with page + offset used instead of > a pointer... I was thinking along those lines as well especially because of the direction this patch set takes kmap(). Thanks for pointing these out to me. How about I lift them to a common header? But if not highmem.h where? Ira
Re: [GIT PULL] io_uring updates for 5.10-rc1
On 13/10/2020 21.49, Jens Axboe wrote: > On 10/13/20 1:46 PM, Linus Torvalds wrote: >> On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe wrote: >>> >>> Here are the io_uring updates for 5.10. >> >> Very strange. My clang build gives a warning I've never seen before: >> >>/tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section >> attributes for .data..read_mostly >> >> and looking at what clang generates for the *.s file, it seems to be >> the "section" line in: >> >> .type io_op_defs,@object # @io_op_defs >> .section.data..read_mostly,"a",@progbits >> .p2align4 >> >> I think it's the combination of "const" and "__read_mostly". >> >> I think the warning is sensible: how can a piece of data be both >> "const" and "__read_mostly"? If it's "const", then it's not "mostly" >> read - it had better be _always_ read. >> >> I'm letting it go, and I've pulled this (gcc doesn't complain), but >> please have a look. > > Huh weird, I'll take a look. FWIW, the construct isn't unique across > the kernel. Citation needed. There's lots of "pointer to const foo" stuff declared as __read_mostly, but I can't find any objects that are themselves both const and __read_mostly. Other than that io_op_defs and io_uring_fops now. But... there's something a little weird: $ grep read_most -- fs/io_uring.s .section.data..read_mostly,"a",@progbits $ readelf --wide -S fs/io_uring.o | grep read_most [32] .data..read_mostly PROGBITS 01b4e0 000188 00 WA 0 0 32 (this is with gcc/gas). So despite that .section directive not saying "aw", the section got the W flag anyway. There are lots of .section "__tracepoints_ptrs", "a" .pushsection .smp_locks,"a" in the .s file, and those sections do end up with just the A bit in the .o file. Does gas maybe somehow special-case a section name starting with .data? Rasmus
Re: [PATCH v3 4/5] media: dt-bindings: media: renesas,drif: Add r8a77965 support
Hi Fabrizio, Thank you for the patch. On Tue, Oct 13, 2020 at 04:01:49PM +0100, Fabrizio Castro wrote: > The r8a77965 (a.k.a. R-Car M3-N) device tree schema is > compatible with the already documented R-Car Gen3 devices. > > Document r8a77965 support within renesas,drif.yaml. > > Signed-off-by: Fabrizio Castro Reviewed-by: Laurent Pinchart > --- > v2->v3: > * New patch > > Documentation/devicetree/bindings/media/renesas,drif.yaml | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/Documentation/devicetree/bindings/media/renesas,drif.yaml > b/Documentation/devicetree/bindings/media/renesas,drif.yaml > index ae50b1448320..89445ddd598e 100644 > --- a/Documentation/devicetree/bindings/media/renesas,drif.yaml > +++ b/Documentation/devicetree/bindings/media/renesas,drif.yaml > @@ -53,6 +53,7 @@ properties: >- enum: > - renesas,r8a7795-drif# R-Car H3 > - renesas,r8a7796-drif# R-Car M3-W > +- renesas,r8a77965-drif # R-Car M3-N > - renesas,r8a77990-drif # R-Car E3 >- const: renesas,rcar-gen3-drif # Generic R-Car Gen3 compatible device > -- Regards, Laurent Pinchart
Re: [GIT PULL -v2] x86/asm updates for v5.10
On Tue, Oct 13, 2020 at 01:39:01PM -0700, Linus Torvalds wrote: > Actually, I think you forgot to push out the updated thing, I still > see the same contents of the pull. Blergh, that tag is still pointing to the old branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/tag/?h=x86_asm_for_v5.10 even though I wrote a new one and pushed it. Otherwise I wouldnt've been able to create the v2 pull request message. However, I used the same tag name and force-pushed and perhaps there it didn't do what I wanted it to do. Sorry about that - I'll recheck stuff like that in the future. > Which I guess is ok, since Uros has convinced me that the xorl > conversion is safe even for the byte cases. > > So I've pulled that unmodified branch. Aha, ok, sounds good too. Thx. -- Regards/Gruss, Boris. SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg --
Re: [PATCH v6 70/80] rcu/tree: docs: document bkvcache new members at struct kfree_rcu_cpu
Em Tue, 13 Oct 2020 09:34:04 -0700 "Paul E. McKenney" escreveu: > On Tue, Oct 13, 2020 at 01:54:25PM +0200, Mauro Carvalho Chehab wrote: > > Changeset 53c72b590b3a ("rcu/tree: cache specified number of objects") > > added new members for struct kfree_rcu_cpu, but didn't add the > > corresponding at the kernel-doc markup, as repoted when doing > > "make htmldocs": > > ./kernel/rcu/tree.c:3113: warning: Function parameter or member > > 'bkvcache' not described in 'kfree_rcu_cpu' > > ./kernel/rcu/tree.c:3113: warning: Function parameter or member > > 'nr_bkv_objs' not described in 'kfree_rcu_cpu' > > > > So, move the description for bkvcache to kernel-doc, and add a > > description for nr_bkv_objs. > > > > Fixes: 53c72b590b3a ("rcu/tree: cache specified number of objects") > > Signed-off-by: Mauro Carvalho Chehab > > Queued for review and testing, likely target v5.11. Hi Paul, I would prefer if we could get those on 5.10, if possible. We're aiming to have 5.10 free of docs warnings ;-) If you prefer, I can send those patches to Linus with your ack. Regards, Mauro > > Thanx, Paul > > > --- > > kernel/rcu/tree.c | 14 ++ > > 1 file changed, 6 insertions(+), 8 deletions(-) > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > index f78ee759af9c..03c54c3478b7 100644 > > --- a/kernel/rcu/tree.c > > +++ b/kernel/rcu/tree.c > > @@ -3022,6 +3022,12 @@ struct kfree_rcu_cpu_work { > > * @monitor_todo: Tracks whether a @monitor_work delayed work is pending > > * @initialized: The @rcu_work fields have been initialized > > * @count: Number of objects for which GP not started > > + * @bkvcache: > > + * A simple cache list that contains objects for reuse purpose. > > + * In order to save some per-cpu space the list is singular. > > + * Even though it is lockless an access has to be protected by the > > + * per-cpu lock. > > + * @nr_bkv_objs: number of allocated objects at @bkvcache. > > * > > * This is a per-CPU structure. The reason that it is not included in > > * the rcu_data structure is to permit this code to be extracted from > > @@ -3037,14 +3043,6 @@ struct kfree_rcu_cpu { > > bool monitor_todo; > > bool initialized; > > int count; > > - > > - /* > > -* A simple cache list that contains objects for > > -* reuse purpose. In order to save some per-cpu > > -* space the list is singular. Even though it is > > -* lockless an access has to be protected by the > > -* per-cpu lock. > > -*/ > > struct llist_head bkvcache; > > int nr_bkv_objs; > > }; > > -- > > 2.26.2 > > Thanks, Mauro
Re: [PATCH RFC PKS/PMEM 33/58] fs/cramfs: Utilize new kmap_thread()
On Tue, Oct 13, 2020 at 08:36:43PM +0100, Matthew Wilcox wrote: > On Tue, Oct 13, 2020 at 11:44:29AM -0700, Dan Williams wrote: > > On Fri, Oct 9, 2020 at 12:52 PM wrote: > > > > > > From: Ira Weiny > > > > > > The kmap() calls in this FS are localized to a single thread. To avoid > > > the over head of global PKRS updates use the new kmap_thread() call. > > > > > > Cc: Nicolas Pitre > > > Signed-off-by: Ira Weiny > > > --- > > > fs/cramfs/inode.c | 10 +- > > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > > > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c > > > index 912308600d39..003c014a42ed 100644 > > > --- a/fs/cramfs/inode.c > > > +++ b/fs/cramfs/inode.c > > > @@ -247,8 +247,8 @@ static void *cramfs_blkdev_read(struct super_block > > > *sb, unsigned int offset, > > > struct page *page = pages[i]; > > > > > > if (page) { > > > - memcpy(data, kmap(page), PAGE_SIZE); > > > - kunmap(page); > > > + memcpy(data, kmap_thread(page), PAGE_SIZE); > > > + kunmap_thread(page); > > > > Why does this need a sleepable kmap? This looks like a textbook > > kmap_atomic() use case. > > There's a lot of code of this form. Could we perhaps have: > > static inline void copy_to_highpage(struct page *to, void *vfrom, unsigned > int size) > { > char *vto = kmap_atomic(to); > > memcpy(vto, vfrom, size); > kunmap_atomic(vto); > } > > in linux/highmem.h ? Christoph had the same idea. I'll work on it. Ira
Re: [GIT PULL] x86/asm updates for v5.10
The pull request you sent on Mon, 12 Oct 2020 13:05:57 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > tags/x86_asm_for_v5.10 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/029f56db6ac248769f2c260bfaf3c3c0e23e904c Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [PATCH v6 68/80] nl80211: docs: add a description for s1g_cap parameter
On Tue, 2020-10-13 at 22:41 +0200, Mauro Carvalho Chehab wrote: > Em Tue, 13 Oct 2020 20:47:47 +0200 > Johannes Berg escreveu: > > > Thanks Mauro. > > > > > > On Tue, 2020-10-13 at 13:54 +0200, Mauro Carvalho Chehab wrote: > > > Changeset df78a0c0b67d ("nl80211: S1G band and channel definitions") > > > added a new parameter, but didn't add the corresponding kernel-doc > > > markup, as repoted when doing "make htmldocs": > > > > > > ./include/net/cfg80211.h:471: warning: Function parameter or member > > > 's1g_cap' not described in 'ieee80211_supported_band' > > > > > > Add a documentation for it. > > > > Should I take this through my tree, or is that part of a larger set > > that'll go somewhere else? > > Whatever works best for you ;-) > > If you don't pick it via your tree, I'm planning to send it > together with the other patches likely on Thursday. OK, please do, and here's an Acked-by: Johannes Berg I don't think I can get it this quickly through net-next at this point, and there's just no point if you have stuff to send anyway :-) Thanks! johannes
[PATCH 1/2] module: merge repetitive strings in module_sig_check()
The 'reason' variable in module_sig_check() points to 3 strings across the *switch* statement, all needlessly starting with the same text. Let's put as much of the starting text as we can into the pr_notice() call (this includes some rewording of the 1st message) -- it saves 37 bytes of object code (x86 gcc 10.2.1). Signed-off-by: Sergey Shtylyov --- kernel/module.c |9 + 1 file changed, 5 insertions(+), 4 deletions(-) Index: linux/kernel/module.c === --- linux.orig/kernel/module.c +++ linux/kernel/module.c @@ -2906,16 +2906,17 @@ static int module_sig_check(struct load_ * enforcing, certain errors are non-fatal. */ case -ENODATA: - reason = "Loading of unsigned module"; + reason = "no signature"; goto decide; case -ENOPKG: - reason = "Loading of module with unsupported crypto"; + reason = "unsupported crypto"; goto decide; case -ENOKEY: - reason = "Loading of module with unavailable key"; + reason = "unavailable key"; decide: if (is_module_sig_enforced()) { - pr_notice("%s: %s is rejected\n", info->name, reason); + pr_notice("%s: loading of module with %s is rejected\n", + info->name, reason); return -EKEYREJECTED; }
[PATCH 2/2] module: unindent comments in module_sig_check()
The way the comments in the *switch* statement in module_sig_check() are indented, it may seem they refer to the statements above them, not below. Align the comments with the *case* and *default* labels below them, fixing the comment style and adding article/dash, while at it... Signed-off-by: Sergey Shtylyov --- kernel/module.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) Index: linux/kernel/module.c === --- linux.orig/kernel/module.c +++ linux/kernel/module.c @@ -2901,10 +2901,11 @@ static int module_sig_check(struct load_ info->sig_ok = true; return 0; - /* We don't permit modules to be loaded into trusted kernels -* without a valid signature on them, but if we're not -* enforcing, certain errors are non-fatal. -*/ + /* +* We don't permit modules to be loaded into the trusted kernels +* without a valid signature on them, but if we're not enforcing, +* certain errors are non-fatal. +*/ case -ENODATA: reason = "no signature"; goto decide; @@ -2922,10 +2923,10 @@ static int module_sig_check(struct load_ return security_locked_down(LOCKDOWN_MODULE_SIGNATURE); - /* All other errors are fatal, including nomem, unparseable -* signatures and signature check failures - even if signatures -* aren't required. -*/ + /* +* All other errors are fatal, including nomem, unparseable signatures +* and signature check failures -- even if signatures aren't required. +*/ default: return err; }
Re: [PATCH v6 68/80] nl80211: docs: add a description for s1g_cap parameter
Em Tue, 13 Oct 2020 20:47:47 +0200 Johannes Berg escreveu: > Thanks Mauro. > > > On Tue, 2020-10-13 at 13:54 +0200, Mauro Carvalho Chehab wrote: > > Changeset df78a0c0b67d ("nl80211: S1G band and channel definitions") > > added a new parameter, but didn't add the corresponding kernel-doc > > markup, as repoted when doing "make htmldocs": > > > > ./include/net/cfg80211.h:471: warning: Function parameter or member > > 's1g_cap' not described in 'ieee80211_supported_band' > > > > Add a documentation for it. > > Should I take this through my tree, or is that part of a larger set > that'll go somewhere else? Whatever works best for you ;-) If you don't pick it via your tree, I'm planning to send it together with the other patches likely on Thursday. Thanks, Mauro
Re: [GIT PULL -v2] x86/asm updates for v5.10
On Tue, Oct 13, 2020 at 2:42 AM Borislav Petkov wrote: > > here's v2 of the x86/asm pull with only the __force_order patch so that > it can go in now. The other one will be sorted out when the matter has > been settled properly. Actually, I think you forgot to push out the updated thing, I still see the same contents of the pull. Which I guess is ok, since Uros has convinced me that the xorl conversion is safe even for the byte cases. So I've pulled that unmodified branch. Linus
[PATCH 0/2] module: some refactoring in module_sig_check()
Here are 2 patches against the 'modules-next' branch of Jessica Yu's 'linux.git' repo. I'm doing some little refactoring in module_sig_check()... [1/2] module: merge repetitive strings in module_sig_check() [2/2] module: unindent comments in module_sig_check()
Re: [PATCH 0/2] module: n module_sig_check()
Oops, hadn't finished the subject... :-/\
[PATCH 0/2] module: n module_sig_check()
Here are 2 patches against the 'modules-next' branch of Jessica Yu's 'linux.git' repo. I'm doing some little refactoring in module_sig_check()... [1/2] module: merge repetitive strings in module_sig_check() [2/2] module: unindent comments in module_sig_check()
Re: [GIT PULL] libata updates for 5.10-rc1
The pull request you sent on Mon, 12 Oct 2020 07:48:15 -0600: > git://git.kernel.dk/linux-block.git tags/libata-5.10-2020-10-12 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/79ec6d9cac46d59db9b006bc9cde2811ef365292 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [RFC PATCH 00/35] SEV-ES hypervisor support
Apologies, Sean. I thought I had replied to this but found it instead in my drafts folder... I've taken much of your feedback and incorporated that into the next version of the patches that I submitted and updated this response based on that, too. On 9/15/20 7:19 PM, Sean Christopherson wrote: > On Tue, Sep 15, 2020 at 12:22:05PM -0500, Tom Lendacky wrote: >> On 9/14/20 5:59 PM, Sean Christopherson wrote: >>> Given that we don't yet have publicly available KVM code for TDX, what if I >>> generate and post a list of ioctls() that are denied by either SEV-ES or >>> TDX, >>> organized by the denier(s)? Then for the ioctls() that are denied by one >>> and >>> not the other, we add a brief explanation of why it's denied? >>> >>> If that sounds ok, I'll get the list and the TDX side of things posted >>> tomorrow. >> >> That sounds good. > > TDX completely blocks the following ioctl()s: SEV-ES doesn't need to completely block these ioctls. SEV-SNP is likely to do more of that. SEV-ES will still allow interrupts to be injected, or registers to be retrieved (which will only contain what was provided in the GHCB exchange), etc. > > kvm_vcpu_ioctl_interrupt > kvm_vcpu_ioctl_smi > kvm_vcpu_ioctl_x86_setup_mce > kvm_vcpu_ioctl_x86_set_mce > kvm_vcpu_ioctl_x86_get_debugregs > kvm_vcpu_ioctl_x86_set_debugregs > kvm_vcpu_ioctl_x86_get_xsave > kvm_vcpu_ioctl_x86_set_xsave > kvm_vcpu_ioctl_x86_get_xcrs > kvm_vcpu_ioctl_x86_set_xcrs > kvm_arch_vcpu_ioctl_get_regs > kvm_arch_vcpu_ioctl_set_regs > kvm_arch_vcpu_ioctl_get_sregs > kvm_arch_vcpu_ioctl_set_sregs > kvm_arch_vcpu_ioctl_set_guest_debug > kvm_arch_vcpu_ioctl_get_fpu > kvm_arch_vcpu_ioctl_set_fpu Of the listed ioctls, really the only ones I've updated are: kvm_vcpu_ioctl_x86_get_xsave kvm_vcpu_ioctl_x86_set_xsave kvm_arch_vcpu_ioctl_get_sregs This allows reading of the tracking value registers kvm_arch_vcpu_ioctl_set_sregs This prevents setting of register values kvm_arch_vcpu_ioctl_set_guest_debug kvm_arch_vcpu_ioctl_get_fpu kvm_arch_vcpu_ioctl_set_fpu > > Looking through the code, I think kvm_arch_vcpu_ioctl_get_mpstate() and > kvm_arch_vcpu_ioctl_set_mpstate() should also be disallowed, we just haven't > actually done so. I haven't done anything with these either. > > There are also two helper functions that are "blocked". > dm_request_for_irq_injection() returns false if guest_state_protected, and > post_kvm_run_save() shoves dummy state. ... and these. > > TDX also selectively blocks/skips portions of other ioctl()s so that the > TDX code itself can yell loudly if e.g. .get_cpl() is invoked. The event > injection restrictions are due to direct injection not being allowed (except > for NMIs); all IRQs have to be routed through APICv (posted interrupts) and > exception injection is completely disallowed. For SEV-ES, we don't have those restrictions. > > kvm_vcpu_ioctl_x86_get_vcpu_events: > if (!vcpu->kvm->arch.guest_state_protected) > events->interrupt.shadow = > kvm_x86_ops.get_interrupt_shadow(vcpu); > > kvm_arch_vcpu_put: > if (vcpu->preempted && !vcpu->kvm->arch.guest_state_protected) > vcpu->arch.preempted_in_kernel = !kvm_x86_ops.get_cpl(vcpu); > > kvm_vcpu_ioctl_x86_set_vcpu_events: > u32 allowed_flags = KVM_VCPUEVENT_VALID_NMI_PENDING | > KVM_VCPUEVENT_VALID_SIPI_VECTOR | > KVM_VCPUEVENT_VALID_SHADOW | > KVM_VCPUEVENT_VALID_SMM | > KVM_VCPUEVENT_VALID_PAYLOAD; > > if (vcpu->kvm->arch.guest_state_protected) > allowed_flags = KVM_VCPUEVENT_VALID_NMI_PENDING; > > > kvm_arch_vcpu_ioctl_run: > if (vcpu->kvm->arch.guest_state_protected) > kvm_sync_valid_fields = KVM_SYNC_X86_EVENTS; > else > kvm_sync_valid_fields = KVM_SYNC_X86_VALID_FIELDS; > > > In addition to the more generic guest_state_protected, we also (obviously > tentatively) have a few other flags to deal with aspects of TDX that I'm > fairly certain don't apply to SEV-ES: > > tsc_immutable - KVM doesn't have write access to the TSC offset of the > guest. > > eoi_intercept_unsupported - KVM can't intercept EOIs (doesn't have access > to EOI bitmaps) and so can't support level > triggered interrupts, at least not without > extra pain. > > readonly_mem_unsupported - Secure EPT (analagous to SNP) requires RWX > permissions for all private/encrypted memory. > S-EPT isn't optional, so we get the joy of > adding this right off the bat... Yes, most of the above stuff doesn't apply to SEV-ES. Thanks, Tom >
Re: [PATCH v2 1/1] sched/uclamp: add SCHED_FLAG_UTIL_CLAMP_RESET flag to reset uclamp
Hi Yun, On 12/10/2020 18:31, Yun Hsiang wrote: > If the user wants to stop controlling uclamp and let the task inherit > the value from the group, we need a method to reset. > > Add SCHED_FLAG_UTIL_CLAMP_RESET flag to allow the user to reset uclamp via > sched_setattr syscall. before we decide on how to implement the 'uclamp user_defined reset' feature, could we come back to your use case in https://lkml.kernel.org/r/20201002053812.GA176142@ubuntu ? Lets just consider uclamp min for now. We have: (1) system-wide: # cat /proc/sys/kernel/sched_util_clamp_min 1024 (2) tg (hierarchy) with top-app's cpu.uclamp.min to ~200 (20% of 1024): # cat /sys/fs/cgroup/cpu/top-app/cpu.uclamp.min 20 (3) and 2 cfs tasks A and B in top-app: # cat /sys/fs/cgroup/cpu/top-app/tasks pid_A pid_B Then you set A and B's uclamp min to 100. A and B are now user_defined. A and B's effective uclamp min value is 100. Since the task uclamp min values (3) are less than (1) and (2), their uclamp min value is not affected by (1) or (2). If A doesn't want to control itself anymore, it can set its uclamp min to e.g. 300. Now A's effective uclamp min value is ~200, i.e. controlled by (2), the one of B stays 100. So the policy is: (a) If the user_defined task wants to control it's uclamp, use task uclamp value less than the tg (hierarchy) (and the system-wide) value. (b) If the user_defined task doesn't want to control it's uclamp anymore, use a uclamp value greater than or equal the tg (hierarchy) (and the system-wide) value. So where exactly is the use case which would require a 'uclamp user_defined reset' functionality?
Re: [PATCH v3 2/2] arm64: dts: rockchip: Add basic support for Kobol's Helios64
Hello Johan, On 10/13/20 7:34 PM, Johan Jonker wrote: > Part 1 of 2 missing here. Please complain to gmail then, given that patch 1 can be found on https://lore.kernel.org/linux-arm-kernel/20201013161340.720403-2-...@kleine-koenig.org/. > Submit all patches to all maintainers and mail lists. > Don't forget robh+dt ! I'm really surprised you insist here. In my experience Rob (with his dt-hat on) is not interested in specifics of the machine files and he already acked patch 1. > Add a little change log here. I assume you also didn't get the cover letter? >> +fault-led { > fault_led: led-0 {} > > My fault. > Change ones more... > # The first form is preferred, but fall back to just 'led' anywhere in the > # node name to at least catch some child nodes. > "(^led-[0-9a-f]$|led)": ok, the label isn't necessary, is it? >> +label = "helios64:green:status"; >> +gpios = < RK_PB4 GPIO_ACTIVE_HIGH>; > >> +linux,default-trigger = "none"; > > Don't use 'none' for mainline. Oh, I missed that. Thanks for your persistence. >> +default-state = "on"; >> +}; >> +}; >> + >> +vcc1v8_sys_s0: vcc1v8-sys-s0 { >> +compatible = "regulator-fixed"; >> +regulator-name = "vcc1v8_sys_s0"; >> +regulator-always-on; >> +regulator-boot-on; >> +regulator-min-microvolt = <180>; >> +regulator-max-microvolt = <180>; >> +vin-supply = <_sys_s3>; >> +}; >> + >> +vcc3v0_sd: vcc3v0-sd { >> +compatible = "regulator-fixed"; >> +regulator-name = "vcc3v0_sd"; > > Doesn't a sd card need a on/off gpio? > Could you check the schematics? Hmm, there is a GPIO. I didn't consider a problem there given that the machine logs [ 31.708706] vcc3v0_sd: disabling when there is no SD-card in the slot. Will investigate further. >> +pinctrl-names = "default"; >> +pinctrl-0 = <_int_l>; > > sort I would expect this is an exception from the sorting rule. $ git grep pinctrl-names linus/master:arch/arm64/boot/dts/ | wc -l 1905 $ git grep -A1 pinctrl-names linus/master:arch/arm64/boot/dts/ | \ grep pinctrl-0 | wc -l 1412 When grepping over arch/arm64/boot/dts/rockchip the numbers are 453 and 445 respectively. Will use pinctrl-names above pinctrl-0 consistently. >> +regulator-max-microvolt = <135>; >> +regulator-min-microvolt = <75>; > > > The rest has min above max. > Exception to the sort rule, not sure what Heiko wants, but keep it every > where the same. OK, most rockchip dts have min before max, will fix up. >> +i2c-scl-rising-time-ns = <160>; >> +i2c-scl-falling-time-ns = <30>; > > sort I consider it logical to have rise before fall. In 43 of 59 cases in arch/arm64/boot/dts/rockchip/ it is done this way. >> +vqmmc-supply = <_sys_s0>; >> +status = "okay"; >> +}; >> + >> + { >> +bus-width = <4>; >> +cap-sd-highspeed; > >> +cd-gpios = < RK_PA7 GPIO_ACTIVE_LOW>; > > see regulator? GPIO0_A7 is the card detect line. I don't understand your question. Is it the same as above, i.e. that it should be possible that the SD regulator can be disabled? >> +disable-wp; >> +pinctrl-0 = <_clk _cmd _cd _bus4>; >> +pinctrl-names = "default"; >> +vmmc-supply = <_sd>; >> +vqmmc-supply = <_sdio_s0>; >> +status = "okay"; >> +}; Best regards Uwe signature.asc Description: OpenPGP digital signature
Re: [PATCH v4 0/5] Unify NUMA implementation between ARM64 & RISC-V
On Mon, Oct 5, 2020 at 5:18 PM Atish Patra wrote: > > This series attempts to move the ARM64 numa implementation to common > code so that RISC-V can leverage that as well instead of reimplementing > it again. > > RISC-V specific bits are based on initial work done by Greentime Hu [1] but > modified to reuse the common implementation to avoid duplication. > > [1] https://lkml.org/lkml/2020/1/10/233 > > This series has been tested on qemu with numa enabled for both RISC-V & ARM64. > It would be great if somebody can test it on numa capable ARM64 hardware > platforms. > This patch series doesn't modify the maintainers list for the common code > (arch_numa) > as I am not sure if somebody from ARM64 community or Greg should take up the > maintainership. Ganapatrao was the original author of the arm64 version. > I would be happy to update that in the next revision once it is decided. > > # numactl --hardware > available: 2 nodes (0-1) > node 0 cpus: 0 1 2 3 > node 0 size: 486 MB > node 0 free: 470 MB > node 1 cpus: 4 5 6 7 > node 1 size: 424 MB > node 1 free: 408 MB > node distances: > node 0 1 > 0: 10 20 > 1: 20 10 > # numactl -show > policy: default > preferred node: current > physcpubind: 0 1 2 3 4 5 6 7 > cpubind: 0 1 > nodebind: 0 1 > membind: 0 1 > > The patches are also available at > https://github.com/atishp04/linux/tree/5.10_numa_unified_v4 > > For RISC-V, the following qemu series is a pre-requisite(already available in > upstream) > https://patchwork.kernel.org/project/qemu-devel/list/?series=303313 > > Testing: > RISC-V: > Tested in Qemu and 2 socket OmniXtend FPGA. > > ARM64: > 2 socket kunpeng920 (4 nodes around 250G a node) > Tested-by: Jonathan Cameron > > There may be some minor conflicts with Mike's cleanup series [2] depending on > the > order in which these two series are being accepted. I can rebase on top his > series > if required. > > [2] https://lkml.org/lkml/2020/8/18/754 > > Changes from v3->v4: > 1. Removed redundant duplicate header. > 2. Added Reviewed-by tags. > > Changes from v2->v3: > 1. Added Acked-by/Reviewed-by tags. > 2. Replaced asm/acpi.h with linux/acpi.h > 3. Defined arch_acpi_numa_init as static. > > Changes from v1->v2: > 1. Replaced ARM64 specific compile time protection with ACPI specific ones. > 2. Dropped common pcibus_to_node changes. Added required changes in RISC-V. > 3. Fixed few typos. > > Atish Patra (4): > numa: Move numa implementation to common code > arm64, numa: Change the numa init functions name to be generic > riscv: Separate memory init from paging init > riscv: Add numa support for riscv64 platform > > Greentime Hu (1): > riscv: Add support pte_protnone and pmd_protnone if > CONFIG_NUMA_BALANCING > > arch/arm64/Kconfig| 1 + > arch/arm64/include/asm/numa.h | 45 + > arch/arm64/kernel/acpi_numa.c | 13 - > arch/arm64/mm/Makefile| 1 - > arch/arm64/mm/init.c | 4 +- > arch/riscv/Kconfig| 31 +++- > arch/riscv/include/asm/mmzone.h | 13 + > arch/riscv/include/asm/numa.h | 8 +++ > arch/riscv/include/asm/pci.h | 14 ++ > arch/riscv/include/asm/pgtable.h | 21 > arch/riscv/kernel/setup.c | 11 - > arch/riscv/kernel/smpboot.c | 12 - > arch/riscv/mm/init.c | 10 +++- > drivers/base/Kconfig | 6 +++ > drivers/base/Makefile | 1 + > .../mm/numa.c => drivers/base/arch_numa.c | 30 ++-- > include/asm-generic/numa.h| 49 +++ > 17 files changed, 199 insertions(+), 71 deletions(-) > create mode 100644 arch/riscv/include/asm/mmzone.h > create mode 100644 arch/riscv/include/asm/numa.h > rename arch/arm64/mm/numa.c => drivers/base/arch_numa.c (95%) > create mode 100644 include/asm-generic/numa.h > > -- > 2.25.1 > > > ___ > linux-riscv mailing list > linux-ri...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv Ping ? -- Regards, Atish
Re: [PATCH v2 2/2] x86: kexec_file: print appropriate variable
It was <2020-04-30 czw 18:31>, when Łukasz Stelmach wrote: > The value of kbuf->memsz may be different than kbuf->bufsz after calling > kexec_add_buffer(). Hence both values should be logged. > > Fixes: ec2b9bfaac44e ("kexec_file: Change kexec_add_buffer to take kexec_buf > as argument.") > Fixes: 27f48d3e633be ("kexec-bzImage64: support for loading bzImage using > 64bit entry") > Fixes: dd5f726076cc7 ("kexec: support for kexec on panic using new system > call") > Cc: Vivek Goyal > Cc: Thiago Jung Bauermann > Signed-off-by: Łukasz Stelmach > --- > arch/x86/kernel/crash.c | 2 +- > arch/x86/kernel/kexec-bzimage64.c | 4 ++-- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c > index fd87b59452a3..d408e5b536fa 100644 > --- a/arch/x86/kernel/crash.c > +++ b/arch/x86/kernel/crash.c > @@ -420,7 +420,7 @@ int crash_load_segments(struct kimage *image) > } > image->arch.elf_load_addr = kbuf.mem; > pr_debug("Loaded ELF headers at 0x%lx bufsz=0x%lx memsz=0x%lx\n", > - image->arch.elf_load_addr, kbuf.bufsz, kbuf.bufsz); > + image->arch.elf_load_addr, kbuf.bufsz, kbuf.memsz); > > return ret; > } > diff --git a/arch/x86/kernel/kexec-bzimage64.c > b/arch/x86/kernel/kexec-bzimage64.c > index db6578d45157..420393c58a73 100644 > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -434,7 +434,7 @@ static void *bzImage64_load(struct kimage *image, char > *kernel, > goto out_free_params; > bootparam_load_addr = kbuf.mem; > pr_debug("Loaded boot_param, command line and misc at 0x%lx bufsz=0x%lx > memsz=0x%lx\n", > - bootparam_load_addr, kbuf.bufsz, kbuf.bufsz); > + bootparam_load_addr, kbuf.bufsz, kbuf.memsz); > > /* Load kernel */ > kbuf.buffer = kernel + kern16_size; > @@ -464,7 +464,7 @@ static void *bzImage64_load(struct kimage *image, char > *kernel, > initrd_load_addr = kbuf.mem; > > pr_debug("Loaded initrd at 0x%lx bufsz=0x%lx memsz=0x%lx\n", > - initrd_load_addr, initrd_len, initrd_len); > + initrd_load_addr, kbuf.bufsz, kbuf.memsz); > > setup_initrd(params, initrd_load_addr, initrd_len); > } Ping? Any chance this patch follows its arm64 counterpart into mainline? Kind regards, -- Łukasz Stelmach Samsung R Institute Poland Samsung Electronics signature.asc Description: PGP signature
Re: [PATCH] MIPS: Add set_memory_node()
On Tue, 13 Oct 2020, Jinyang He wrote: > Commit e7ae8d174eec ("MIPS: replace add_memory_region with memblock") > replaced add_memory_region(, , BOOT_MEM_RAM) with memblock_add(). But > it doesn't work well on some platforms which have NUMA like Loongson64. Please note this is not a full review, I haven't investigated the fitness for purpose of this change and instead just addressed one aspect of coding style. > diff --git a/arch/mips/include/asm/bootinfo.h > b/arch/mips/include/asm/bootinfo.h > index aa03b12..29e2d9c 100644 > --- a/arch/mips/include/asm/bootinfo.h > +++ b/arch/mips/include/asm/bootinfo.h > @@ -92,6 +92,10 @@ extern unsigned long mips_machtype; > > extern void detect_memory_region(phys_addr_t start, phys_addr_t sz_min, > phys_addr_t sz_max); > > +#ifdef CONFIG_NUMA > +extern void set_memory_node(phys_addr_t start, phys_addr_t size); > +#endif > + If anything this needs to be: #ifdef CONFIG_NUMA extern void set_memory_node(phys_addr_t start, phys_addr_t size); #else static inline void set_memory_node(phys_addr_t start, phys_addr_t size) {} #endif so as to avoid #ifdef clutter across call places. Maciej
Re: KASAN: use-after-free Read in fscache_alloc_cookie
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git f8eb8d1c6a853f617ca9ee233bb2d230401c5bdc
Re: [PATCH 07/23] wfx: add bus_sdio.c
Hello! On Monday 12 October 2020 12:46:32 Jerome Pouiller wrote: > +#define SDIO_VENDOR_ID_SILABS0x > +#define SDIO_DEVICE_ID_SILABS_WF200 0x1000 > +static const struct sdio_device_id wfx_sdio_ids[] = { > + { SDIO_DEVICE(SDIO_VENDOR_ID_SILABS, SDIO_DEVICE_ID_SILABS_WF200) }, Please move ids into common include file include/linux/mmc/sdio_ids.h where are all SDIO ids. Now all drivers have ids defined in that file. > + // FIXME: ignore VID/PID and only rely on device tree > + // { SDIO_DEVICE(SDIO_ANY_ID, SDIO_ANY_ID) }, What is the reason for ignoring vendor and device ids? > + { }, > +}; > +MODULE_DEVICE_TABLE(sdio, wfx_sdio_ids); > + > +struct sdio_driver wfx_sdio_driver = { > + .name = "wfx-sdio", > + .id_table = wfx_sdio_ids, > + .probe = wfx_sdio_probe, > + .remove = wfx_sdio_remove, > + .drv = { > + .owner = THIS_MODULE, > + .of_match_table = wfx_sdio_of_match, > + } > +}; > -- > 2.28.0 >
Re: Unbreakable loop in fuse_fill_write_pages()
On Tue, 2020-10-13 at 15:57 -0400, Vivek Goyal wrote: > Hmm..., So how do I reproduce it. Just run trinity as root and it will > reproduce after some time? Only need to run it as unprivileged user after mounting virtiofs on /tmp (trinity will need to create and use files there) as many as CPUs as possible. Also, make sure your guest's memory usage does not exceed the host's /dev/shm size. Otherwise, horrible things could happen. $ trinity -C 48 --arch 64 It might get coredump or exit due to some other unrelated reasons, so just keep retrying. It is best to apply your recent patch for the virtiofs false positive warning first, so it won't taint the kernel which will stop the trinity. Today, I had been able to reproduce it twice within half-hour each.
[GIT PULL] Kselftest fixes update for Linux 5.10-rc1
Hi Linus, Please pull the following Kselftest fixes update for Linux 5.10-rc1. This kselftest fixes update consists of a selftests harness fix to flush stdout before forking to avoid parent and child printing duplicates messages. This is evident when test output is redirected to a file. The second fix is a tools/ wide change to avoid comma separated statements from Joe Perches. This fix spans tools/lib, tools/power/cpupower, and selftests. diff is attached Please note that there is a conflict in tools/testing/selftests/vm/gup_test.c between commit: aa803771a80a ("tools: Avoid comma separated statements") from the kselftest-fixes tree and commit: 5c64830675a6 ("mm/gup_benchmark: rename to mm/gup_test") from the akpm tree. tools/testing/selftests/vm/gup_benchmark.c has been renamed in 5c64830675a6 from akpm tree. Stephen fixed this up in linux-next. thanks, -- Shuah The following changes since commit 5c1e4f7e9e49b6925b1fb5c507d2c614f3edb292: selftests/timers: Turn off timeout setting (2020-08-20 15:49:28 -0600) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest tags/linux-kselftest-fixes-5.10-rc1 for you to fetch changes up to aa803771a80aa2aa2d5cdd38434b369066fbb8fc: tools: Avoid comma separated statements (2020-10-02 10:36:36 -0600) linux-kselftest-fixes-5.10-rc1 This kselftest fixes update consists of a selftests harness fix to flush stdout before forking to avoid parent and child printing duplicates messages. This is evident when test output is redirected to a file. The second fix is a tools/ wide change to avoid comma separated statements from Joe Perches. This fix spans tools/lib, tools/power/cpupower, and selftests. Joe Perches (1): tools: Avoid comma separated statements Michael Ellerman (1): selftests/harness: Flush stdout before forking tools/lib/subcmd/help.c | 10 +- tools/power/cpupower/utils/cpufreq-set.c| 14 +- tools/testing/selftests/kselftest_harness.h | 5 + tools/testing/selftests/vm/gup_benchmark.c | 18 +- tools/testing/selftests/vm/userfaultfd.c| 296 +--- 5 files changed, 215 insertions(+), 128 deletions(-) diff --git a/tools/lib/subcmd/help.c b/tools/lib/subcmd/help.c index 2859f107abc8..bf02d62a3b2b 100644 --- a/tools/lib/subcmd/help.c +++ b/tools/lib/subcmd/help.c @@ -65,12 +65,14 @@ void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes) ci = cj = ei = 0; while (ci < cmds->cnt && ei < excludes->cnt) { cmp = strcmp(cmds->names[ci]->name, excludes->names[ei]->name); - if (cmp < 0) + if (cmp < 0) { cmds->names[cj++] = cmds->names[ci++]; - else if (cmp == 0) - ci++, ei++; - else if (cmp > 0) + } else if (cmp == 0) { + ci++; ei++; + } else if (cmp > 0) { + ei++; + } } while (ci < cmds->cnt) diff --git a/tools/power/cpupower/utils/cpufreq-set.c b/tools/power/cpupower/utils/cpufreq-set.c index 6ed82fba5aaa..7b2164e07057 100644 --- a/tools/power/cpupower/utils/cpufreq-set.c +++ b/tools/power/cpupower/utils/cpufreq-set.c @@ -99,13 +99,17 @@ static unsigned long string_to_frequency(const char *str) continue; if (str[cp] == '.') { - while (power > -1 && isdigit(str[cp+1])) - cp++, power--; + while (power > -1 && isdigit(str[cp+1])) { + cp++; + power--; + } } - if (power >= -1) /* not enough => pad */ + if (power >= -1) { /* not enough => pad */ pad = power + 1; - else /* to much => strip */ - pad = 0, cp += power + 1; + } else { /* too much => strip */ + pad = 0; + cp += power + 1; + } /* check bounds */ if (cp <= 0 || cp + pad > NORM_FREQ_LEN - 1) return 0; diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h index 4f78e4805633..f19804df244c 100644 --- a/tools/testing/selftests/kselftest_harness.h +++ b/tools/testing/selftests/kselftest_harness.h @@ -971,6 +971,11 @@ void __run_test(struct __fixture_metadata *f, ksft_print_msg(" RUN %s%s%s.%s ...\n", f->name, variant->name[0] ? "." : "", variant->name, t->name); + + /* Make sure output buffers are flushed before fork */ + fflush(stdout); + fflush(stderr); + t->pid = fork(); if (t->pid < 0) { ksft_print_msg("ERROR SPAWNING TEST CHILD\n"); diff --git a/tools/testing/selftests/vm/gup_benchmark.c b/tools/testing/selftests/vm/gup_benchmark.c index 43b4dfe161a2..d69f0eb0d8c0 100644 --- a/tools/testing/selftests/vm/gup_benchmark.c +++ b/tools/testing/selftests/vm/gup_benchmark.c @@ -105,12 +105,16 @@ int main(int argc, char **argv) gup.flags |= FOLL_WRITE; fd = open("/sys/kernel/debug/gup_benchmark", O_RDWR); - if (fd == -1) - perror("open"), exit(1); + if
Re: [PATCH] binder: fix UAF when releasing todo list
On Fri, Oct 09, 2020 at 04:24:55PM -0700, Todd Kjos wrote: > When releasing a thread todo list when tearing down > a binder_proc, the following race was possible which > could result in a use-after-free: > > 1. Thread 1: enter binder_release_work from binder_thread_release > 2. Thread 2: binder_update_ref_for_handle() -> binder_dec_node_ilocked() > 3. Thread 2: dec nodeA --> 0 (will free node) > 4. Thread 1: ACQ inner_proc_lock > 5. Thread 2: block on inner_proc_lock > 6. Thread 1: dequeue work (BINDER_WORK_NODE, part of nodeA) > 7. Thread 1: REL inner_proc_lock > 8. Thread 2: ACQ inner_proc_lock > 9. Thread 2: todo list cleanup, but work was already dequeued > 10. Thread 2: free node > 11. Thread 2: REL inner_proc_lock > 12. Thread 1: deref w->type (UAF) > > The problem was that for a BINDER_WORK_NODE, the binder_work element > must not be accessed after releasing the inner_proc_lock while > processing the todo list elements since another thread might be > handling a deref on the node containing the binder_work element > leading to the node being freed. > > Signed-off-by: Todd Kjos > --- Thanks! Acked-by: Christian Brauner
Re: [PATCH v2 2/4] time: make getboottime64 aware of time namespace
On Sat, Oct 10, 2020 at 12:19:14AM -0700, Andrei Vagin wrote: > On Fri, Oct 09, 2020 at 03:28:15PM +0200, Christian Brauner wrote: > > On Thu, Oct 08, 2020 at 07:39:42AM +0200, Michael Weiß wrote: > > > getboottime64() provides the time stamp of system boot. In case of > > > time namespaces, the offset to the boot time stamp was not applied > > > earlier. However, getboottime64 is used e.g., in /proc/stat to print > > > the system boot time to userspace. In container runtimes which utilize > > > time namespaces to virtualize boottime of a container, this leaks > > > information about the host system boot time. > > > > > > Therefore, we make getboottime64() to respect the time namespace offset > > > for boottime by subtracting the boottime offset. > > > > > > Signed-off-by: Michael Weiß > > > --- > > > kernel/time/timekeeping.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c > > > index 4c47f388a83f..67530cdb389e 100644 > > > --- a/kernel/time/timekeeping.c > > > +++ b/kernel/time/timekeeping.c > > > @@ -17,6 +17,7 @@ > > > #include > > > #include > > > #include > > > +#include > > > #include > > > #include > > > #include > > > @@ -2154,6 +2155,8 @@ void getboottime64(struct timespec64 *ts) > > > { > > > struct timekeeper *tk = _core.timekeeper; > > > ktime_t t = ktime_sub(tk->offs_real, tk->offs_boot); > > > + /* shift boot time stamp according to the timens offset */ > > > + t = timens_ktime_to_host(CLOCK_BOOTTIME, t); > > > > Note that getbootime64() is mostly used in net/sunrpc and I don't know > > if this change has any security implications for them. > > I would prefer to not patch kernel internal functions if they are used > not only to expose time to the userspace. > > I think when kernel developers sees the getboottime64 function, they > will expect that it returns the real time of kernel boot. They will > not expect that it is aware of time namespaces and a returned time will > depend on a task in which context it will be called. > > IMHO, as a minimum, we need to update the documentation for this function or > even adjust a function name. > > And I think we need to consider an option to not change getbootime64 and > apply a timens offset right in the show_stat(fs/proc/stat.c) function. This is why I asked about this since I assumed this would break someone's use-case. :) In any case, if I understand correctly then we want the same thing: just change fs/proc/stat.c i.e. have a a specific helper that applies the correct offset. Christian
Re: [PATCH v2 2/4] net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver
It was <2020-10-02 pią 22:36>, when Andrew Lunn wrote: >> +static u32 ax88796c_get_link(struct net_device *ndev) >> +{ >> +struct ax88796c_device *ax_local = to_ax88796c_device(ndev); >> + >> +mutex_lock(_local->spi_lock); >> + >> +phy_read_status(ndev->phydev); >> + >> +mutex_unlock(_local->spi_lock); > > Why do you take this mutux before calling phy_read_status()? The > phylib core will not be taking this mutex when it calls into the PHY > driver. This applies to all the calls you have with phy_ > I need to review the use of this mutex. Thanks for spotting. > There should not be any need to call phy_read_status(). phylib will do > this once per second, or after any interrupt from the PHY. so just use > > phydev->link > Using ethtool_op_get_link() >> +static void >> +ax88796c_get_regs(struct net_device *ndev, struct ethtool_regs *regs, void >> *_p) >> +{ >> +struct ax88796c_device *ax_local = to_ax88796c_device(ndev); >> +u16 *p = _p; >> +int offset, i; >> + >> +memset(p, 0, AX88796C_REGDUMP_LEN); >> + >> +for (offset = 0; offset < AX88796C_REGDUMP_LEN; offset += 2) { >> +if (!test_bit(offset / 2, ax88796c_no_regs_mask)) >> +*p = AX_READ(_local->ax_spi, offset); >> +p++; >> +} >> + >> +for (i = 0; i < AX88796C_PHY_REGDUMP_LEN / 2; i++) { >> +*p = phy_read(ax_local->phydev, i); >> +p++; > > Depending on the PHY, that can be dangerous. This is a built-in generic PHY. The chip has no lines to attach any other external one. > phylib could be busy doing things with the PHY. It could be looking at How does phylib prevent concurrent access to a PHY? > a different page for example. Different page? > miitool(1) can give you the same functionally without the MAC driver > doing anything, other than forwarding the IOCTL call on. No, I am afraid mii-tool is not able to dump registers. I am not insisting on dumping PHY registeres but I think it is nice to have them. Intel drivers do it. >> +int ax88796c_mdio_read(struct mii_bus *mdiobus, int phy_id, int loc) >> +{ >> +struct ax88796c_device *ax_local = mdiobus->priv; >> +int ret; >> + >> +AX_WRITE(_local->ax_spi, MDIOCR_RADDR(loc) >> +| MDIOCR_FADDR(phy_id) | MDIOCR_READ, P2_MDIOCR); >> + >> +ret = read_poll_timeout(AX_READ, ret, >> +(ret != 0), >> +0, jiffies_to_usecs(HZ / 100), false, >> +_local->ax_spi, P2_MDIOCR); >> +if (ret) >> +return -EBUSY; > > Return whatever read_poll_timeout() returned. It is probably > -ETIMEDOUT, but it could also be -EIO for example. Indeed it is -ETIMEDOUT. Returning ret. >> +ax88796c_mdio_write(struct mii_bus *mdiobus, int phy_id, int loc, u16 val) >> +{ >> +struct ax88796c_device *ax_local = mdiobus->priv; >> +int ret; >> + >> +AX_WRITE(_local->ax_spi, val, P2_MDIODR); >> + >> +AX_WRITE(_local->ax_spi, >> + MDIOCR_RADDR(loc) | MDIOCR_FADDR(phy_id) >> + | MDIOCR_WRITE, P2_MDIOCR); >> + >> +ret = read_poll_timeout(AX_READ, ret, >> +((ret & MDIOCR_VALID) != 0), 0, >> +jiffies_to_usecs(HZ / 100), false, >> +_local->ax_spi, P2_MDIOCR); >> +if (ret) >> +return -EIO; >> + >> +if (loc == MII_ADVERTISE) { >> +AX_WRITE(_local->ax_spi, (BMCR_FULLDPLX | BMCR_ANRESTART | >> + BMCR_ANENABLE | BMCR_SPEED100), P2_MDIODR); >> +AX_WRITE(_local->ax_spi, (MDIOCR_RADDR(MII_BMCR) | >> + MDIOCR_FADDR(phy_id) | MDIOCR_WRITE), >> + P2_MDIOCR); >> > > What is this doing? > Well… it turns autonegotiation when changing advertised link modes. But this is obvious. As to why this code is here, I will honestly say — I am not sure (Reminder: this is a vendor driver I am porting, I am more than happy to receive any comments, thank you). Apparently it is not required and I am willing to remove it. It could be of some use when the driver didn't use phylib. >> +ret = read_poll_timeout(AX_READ, ret, >> +((ret & MDIOCR_VALID) != 0), 0, >> +jiffies_to_usecs(HZ / 100), false, >> +_local->ax_spi, P2_MDIOCR); >> +if (ret) >> +return -EIO; >> +} >> + >> +return 0; >> +} > >> +static char *no_regs_list = "80018001,e1918001,8001a001,fc0d"; >> +unsigned long ax88796c_no_regs_mask[AX88796C_REGDUMP_LEN / (sizeof(unsigned >> long) * 8)]; >> + >> +module_param(comp, int, 0444); >> +MODULE_PARM_DESC(comp, "0=Non-Compression Mode, 1=Compression Mode"); >> + >> +module_param(msg_enable, int, 0444); >> +MODULE_PARM_DESC(msg_enable, "Message mask (see linux/netdevice.h for >> bitmap)"); > > No module parameters allowed, not in
Re: [PATCH v2 2/4] net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver
It was <2020-10-03 sob 14:59>, when Heiner Kallweit wrote: > On 02.10.2020 21:22, Łukasz Stelmach wrote: >> ASIX AX88796[1] is a versatile ethernet adapter chip, that can be >> connected to a CPU with a 8/16-bit bus or with an SPI. This driver >> supports SPI connection. >> >> The driver has been ported from the vendor kernel for ARTIK5[2] >> boards. Several changes were made to adapt it to the current kernel >> which include: >> >> + updated DT configuration, >> + clock configuration moved to DT, >> + new timer, ethtool and gpio APIs, >> + dev_* instead of pr_* and custom printk() wrappers, >> + removed awkward vendor power managemtn. >> >> [1] >> https://protect2.fireeye.com/v1/url?k=f34a6c6f-ae99377b-f34be720-0cc47a31ce4e-31468d469d6422a1=1=9cb90086-9be8-4db6-8a58-c5447926b709=https%3A%2F%2Fwww.asix.com.tw%2Fproducts.php%3Fop%3DpItemdetail%26PItemID%3D104%3B65%3B86%26PLine%3D65 >> [2] >> https://protect2.fireeye.com/v1/url?k=67412e7e-3a92756a-6740a531-0cc47a31ce4e-4192baccfff43dec=1=9cb90086-9be8-4db6-8a58-c5447926b709=https%3A%2F%2Fgit.tizen.org%2Fcgit%2Fprofile%2Fcommon%2Fplatform%2Fkernel%2Flinux-3.10-artik%2F >> >> The other ax88796 driver is for NE2000 compatible AX88796L chip. These >> chips are not compatible. Hence, two separate drivers are required. >> >> Signed-off-by: Łukasz Stelmach >> --- >> MAINTAINERS|6 + >> drivers/net/ethernet/Kconfig |1 + >> drivers/net/ethernet/Makefile |1 + >> drivers/net/ethernet/asix/Kconfig | 21 + >> drivers/net/ethernet/asix/Makefile |6 + >> drivers/net/ethernet/asix/ax88796c_ioctl.c | 241 + >> drivers/net/ethernet/asix/ax88796c_ioctl.h | 27 + >> drivers/net/ethernet/asix/ax88796c_main.c | 1041 >> drivers/net/ethernet/asix/ax88796c_main.h | 568 +++ >> drivers/net/ethernet/asix/ax88796c_spi.c | 111 +++ >> drivers/net/ethernet/asix/ax88796c_spi.h | 69 ++ >> 11 files changed, 2092 insertions(+) >> create mode 100644 drivers/net/ethernet/asix/Kconfig >> create mode 100644 drivers/net/ethernet/asix/Makefile >> create mode 100644 drivers/net/ethernet/asix/ax88796c_ioctl.c >> create mode 100644 drivers/net/ethernet/asix/ax88796c_ioctl.h >> create mode 100644 drivers/net/ethernet/asix/ax88796c_main.c >> create mode 100644 drivers/net/ethernet/asix/ax88796c_main.h >> create mode 100644 drivers/net/ethernet/asix/ax88796c_spi.c >> create mode 100644 drivers/net/ethernet/asix/ax88796c_spi.h >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index deaafb617361..654eb0127479 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -2822,6 +2822,12 @@ S:Maintained >> F: Documentation/hwmon/asc7621.rst >> F: drivers/hwmon/asc7621.c >> >> +ASIX AX88796C SPI ETHERNET ADAPTER >> +M: Łukasz Stelmach >> +S: Maintained >> +F: Documentation/devicetree/bindings/net/asix,ax99706c-spi.yaml >> +F: drivers/net/ethernet/asix/ax88796c_* >> + >> ASPEED PINCTRL DRIVERS >> M: Andrew Jeffery >> L: linux-asp...@lists.ozlabs.org (moderated for non-subscribers) >> diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig >> index de50e8b9e656..f3b218e45ea5 100644 >> --- a/drivers/net/ethernet/Kconfig >> +++ b/drivers/net/ethernet/Kconfig >> @@ -32,6 +32,7 @@ source "drivers/net/ethernet/apm/Kconfig" >> source "drivers/net/ethernet/apple/Kconfig" >> source "drivers/net/ethernet/aquantia/Kconfig" >> source "drivers/net/ethernet/arc/Kconfig" >> +source "drivers/net/ethernet/asix/Kconfig" >> source "drivers/net/ethernet/atheros/Kconfig" >> source "drivers/net/ethernet/aurora/Kconfig" >> source "drivers/net/ethernet/broadcom/Kconfig" >> diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile >> index f8f38dcb5f8a..9eb368d93607 100644 >> --- a/drivers/net/ethernet/Makefile >> +++ b/drivers/net/ethernet/Makefile >> @@ -18,6 +18,7 @@ obj-$(CONFIG_NET_XGENE) += apm/ >> obj-$(CONFIG_NET_VENDOR_APPLE) += apple/ >> obj-$(CONFIG_NET_VENDOR_AQUANTIA) += aquantia/ >> obj-$(CONFIG_NET_VENDOR_ARC) += arc/ >> +obj-$(CONFIG_NET_VENDOR_ASIX) += asix/ >> obj-$(CONFIG_NET_VENDOR_ATHEROS) += atheros/ >> obj-$(CONFIG_NET_VENDOR_AURORA) += aurora/ >> obj-$(CONFIG_NET_VENDOR_CADENCE) += cadence/ >> diff --git a/drivers/net/ethernet/asix/Kconfig >> b/drivers/net/ethernet/asix/Kconfig >> new file mode 100644 >> index ..7caa45607450 >> --- /dev/null >> +++ b/drivers/net/ethernet/asix/Kconfig >> @@ -0,0 +1,21 @@ >> +# >> +# Asix network device configuration >> +# >> + >> +config NET_VENDOR_ASIX >> +bool "Asix devices" >> +default y >> +help >> + If you have a network (Ethernet, non-USB, not NE2000 compatible) >> + interface based on a chip from ASIX, say Y. >> + >> +if NET_VENDOR_ASIX >> + >> +config SPI_AX88796C >> +tristate "Asix AX88796C-SPI support" >> +depends on SPI >> +depends on GPIOLIB >> +help >> + Say Y here if you intend to use ASIX
Re: [PATCH RFC PKS/PMEM 33/58] fs/cramfs: Utilize new kmap_thread()
On Tue, Oct 13, 2020 at 08:36:43PM +0100, Matthew Wilcox wrote: > static inline void copy_to_highpage(struct page *to, void *vfrom, unsigned > int size) > { > char *vto = kmap_atomic(to); > > memcpy(vto, vfrom, size); > kunmap_atomic(vto); > } > > in linux/highmem.h ? You mean, like static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t len) { char *from = kmap_atomic(page); memcpy(to, from + offset, len); kunmap_atomic(from); } static void memcpy_to_page(struct page *page, size_t offset, const char *from, size_t len) { char *to = kmap_atomic(page); memcpy(to + offset, from, len); kunmap_atomic(to); } static void memzero_page(struct page *page, size_t offset, size_t len) { char *addr = kmap_atomic(page); memset(addr + offset, 0, len); kunmap_atomic(addr); } in lib/iov_iter.c? FWIW, I don't like that "highpage" in the name and highmem.h as location - these make perfect sense regardless of highmem; they are normal memory operations with page + offset used instead of a pointer...
Re: Unbreakable loop in fuse_fill_write_pages()
On Tue, Oct 13, 2020 at 03:53:19PM -0400, Qian Cai wrote: > On Tue, 2020-10-13 at 14:58 -0400, Vivek Goyal wrote: > > > I am wondering if virtiofsd still alive and responding to requests? I > > see another task which is blocked on getdents() for more than 120s. > > > > [10580.142571][ T348] INFO: task trinity-c36:254165 blocked for more than > > 123 > > +seconds. > > [10580.143924][ T348] Tainted: G O 5.9.0-next-20201013+ #2 > > [10580.145158][ T348] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > > +disables this message. > > [10580.146636][ T348] task:trinity-c36 state:D stack:26704 pid:254165 > > ppid: > > +87180 flags:0x0004 > > [10580.148260][ T348] Call Trace: > > [10580.148789][ T348] __schedule+0x71d/0x1b50 > > [10580.149532][ T348] ? __sched_text_start+0x8/0x8 > > [10580.150343][ T348] schedule+0xbf/0x270 > > [10580.151044][ T348] schedule_preempt_disabled+0xc/0x20 > > [10580.152006][ T348] __mutex_lock+0x9f1/0x1360 > > [10580.152777][ T348] ? __fdget_pos+0x9c/0xb0 > > [10580.153484][ T348] ? mutex_lock_io_nested+0x1240/0x1240 > > [10580.154432][ T348] ? find_held_lock+0x33/0x1c0 > > [10580.155220][ T348] ? __fdget_pos+0x9c/0xb0 > > [10580.155934][ T348] __fdget_pos+0x9c/0xb0 > > [10580.156660][ T348] __x64_sys_getdents+0xff/0x230 > > > > May be virtiofsd crashed and hence no requests are completing leading > > to a hard lockup? > Virtiofsd is still working. Once this happened, I manually create a file on > the > guest (in virtiofs) and then I can see the content of it from the host. Hmm..., So how do I reproduce it. Just run trinity as root and it will reproduce after some time? Vivek
Re: [PATCH v1 02/15] perf report: output trace file name in raw trace dump
On Mon, Oct 12, 2020 at 11:54:24AM +0300, Alexey Budankov wrote: > > Output path of a trace file into raw dump (-D) @. > Print offset of PERF_RECORD_COMPRESSED record instead of zero for > decompressed records: > 0x22...@perf.data [0x30]: event: 9 > or > 0x15c...@perf.data/data.7 [0x30]: event: 9 > > Signed-off-by: Alexey Budankov hi, I'm getting: CC builtin-inject.o builtin-inject.c: In function ‘cmd_inject’: builtin-inject.c:850:18: error: initialization of ‘int (*)(struct perf_session *, union perf_event *, u64, const char *)’ {aka ‘int (*)(struct perf_session *, union perf_event *, long unsigned int, const char *)’} from incompatible pointer type ‘int (*)(struct perf_session *, union perf_event *, u64)’ {aka ‘int (*)(struct perf_session *, union perf_event *, long unsigned int)’} [-Werror=incompatible-pointer-types] 850 |.compressed = perf_event__repipe_op4_synth, | ^~~~ builtin-inject.c:850:18: note: (near initialization for ‘inject.tool.compressed’) it's probably recent build id changes jirka > --- > tools/perf/util/session.c | 75 +++ > tools/perf/util/tool.h| 3 +- > 2 files changed, 46 insertions(+), 32 deletions(-) > > diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c > index 4478ddae485b..6f09d506b2f6 100644 > --- a/tools/perf/util/session.c > +++ b/tools/perf/util/session.c > @@ -36,7 +36,8 @@ > > #ifdef HAVE_ZSTD_SUPPORT > static int perf_session__process_compressed_event(struct perf_session > *session, > - union perf_event *event, u64 > file_offset) > + union perf_event *event, u64 > file_offset, > + const char *file_path) > { > void *src; > size_t decomp_size, src_size; > @@ -58,6 +59,7 @@ static int perf_session__process_compressed_event(struct > perf_session *session, > } > > decomp->file_pos = file_offset; > + decomp->file_path = file_path; > decomp->mmap_len = mmap_len; > decomp->head = 0; > > @@ -98,7 +100,8 @@ static int perf_session__process_compressed_event(struct > perf_session *session, > static int perf_session__deliver_event(struct perf_session *session, > union perf_event *event, > struct perf_tool *tool, > -u64 file_offset); > +u64 file_offset, > +const char *file_path); > > static int perf_session__open(struct perf_session *session) > { > @@ -180,7 +183,8 @@ static int ordered_events__deliver_event(struct > ordered_events *oe, > ordered_events); > > return perf_session__deliver_event(session, event->event, > -session->tool, event->file_offset); > +session->tool, event->file_offset, > +event->file_path); > } > > struct perf_session *perf_session__new(struct perf_data *data, > @@ -452,7 +456,8 @@ static int process_stat_round_stub(struct perf_session > *perf_session __maybe_unu > > static int perf_session__process_compressed_event_stub(struct perf_session > *session __maybe_unused, > union perf_event *event > __maybe_unused, > -u64 file_offset > __maybe_unused) > +u64 file_offset > __maybe_unused, > +const char *file_path > __maybe_unused) > { > dump_printf(": unhandled!\n"); > return 0; > @@ -1227,13 +1232,14 @@ static void sample_read__printf(struct perf_sample > *sample, u64 read_format) > } > > static void dump_event(struct evlist *evlist, union perf_event *event, > -u64 file_offset, struct perf_sample *sample) > +u64 file_offset, struct perf_sample *sample, > +const char *file_path) > { > if (!dump_trace) > return; > > - printf("\n%#" PRIx64 " [%#x]: event: %d\n", > -file_offset, event->header.size, event->header.type); > + printf("\n%#" PRIx64 "@%s [%#x]: event: %d\n", > +file_offset, file_path, event->header.size, event->header.type); > > trace_event(event); > if (event->header.type == PERF_RECORD_SAMPLE && > evlist->trace_event_sample_raw) > @@ -1424,12 +1430,13 @@ static int machines__deliver_event(struct machines > *machines, > struct evlist *evlist, > union perf_event *event, > struct perf_sample *sample, > -
Re: Unbreakable loop in fuse_fill_write_pages()
On Tue, 2020-10-13 at 14:58 -0400, Vivek Goyal wrote: > I am wondering if virtiofsd still alive and responding to requests? I > see another task which is blocked on getdents() for more than 120s. > > [10580.142571][ T348] INFO: task trinity-c36:254165 blocked for more than 123 > +seconds. > [10580.143924][ T348] Tainted: G O5.9.0-next-20201013+ #2 > [10580.145158][ T348] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > +disables this message. > [10580.146636][ T348] task:trinity-c36 state:D stack:26704 pid:254165 > ppid: > +87180 flags:0x0004 > [10580.148260][ T348] Call Trace: > [10580.148789][ T348] __schedule+0x71d/0x1b50 > [10580.149532][ T348] ? __sched_text_start+0x8/0x8 > [10580.150343][ T348] schedule+0xbf/0x270 > [10580.151044][ T348] schedule_preempt_disabled+0xc/0x20 > [10580.152006][ T348] __mutex_lock+0x9f1/0x1360 > [10580.152777][ T348] ? __fdget_pos+0x9c/0xb0 > [10580.153484][ T348] ? mutex_lock_io_nested+0x1240/0x1240 > [10580.154432][ T348] ? find_held_lock+0x33/0x1c0 > [10580.155220][ T348] ? __fdget_pos+0x9c/0xb0 > [10580.155934][ T348] __fdget_pos+0x9c/0xb0 > [10580.156660][ T348] __x64_sys_getdents+0xff/0x230 > > May be virtiofsd crashed and hence no requests are completing leading > to a hard lockup? Virtiofsd is still working. Once this happened, I manually create a file on the guest (in virtiofs) and then I can see the content of it from the host.
Re: [GIT PULL] io_uring updates for 5.10-rc1
On Tue, Oct 13, 2020 at 12:49 PM Jens Axboe wrote: > > What clang are you using? I have a self-built clang version from their development tree, since I've been using it for the "asm goto with outputs" testing. But I bet this happens with just regular reasonably up-to-date clang too. Linus
Re: [GIT PULL] io_uring updates for 5.10-rc1
On 10/13/20 1:46 PM, Linus Torvalds wrote: > On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe wrote: >> >> Here are the io_uring updates for 5.10. > > Very strange. My clang build gives a warning I've never seen before: > >/tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section > attributes for .data..read_mostly > > and looking at what clang generates for the *.s file, it seems to be > the "section" line in: > > .type io_op_defs,@object # @io_op_defs > .section.data..read_mostly,"a",@progbits > .p2align4 > > I think it's the combination of "const" and "__read_mostly". > > I think the warning is sensible: how can a piece of data be both > "const" and "__read_mostly"? If it's "const", then it's not "mostly" > read - it had better be _always_ read. > > I'm letting it go, and I've pulled this (gcc doesn't complain), but > please have a look. Huh weird, I'll take a look. FWIW, the construct isn't unique across the kernel. What clang are you using? -- Jens Axboe
Re: [PATCH v2 12/24] drm/dp: fix a kernel-doc issue at drm_edid.c
wait, I think there's some confusion here. these patches have already been pushed On Tue, 2020-10-13 at 14:14 +0200, Mauro Carvalho Chehab wrote: > The name of the argument is different, causing those warnings: > > ./drivers/gpu/drm/drm_edid.c:3754: warning: Function parameter or member > 'video_code' not described in 'drm_display_mode_from_cea_vic' > ./drivers/gpu/drm/drm_edid.c:3754: warning: Excess function parameter > 'vic' description in 'drm_display_mode_from_cea_vic' > > Fixes: 7af655bce275 ("drm/dp: Add drm_dp_downstream_mode()") > Signed-off-by: Mauro Carvalho Chehab > --- > drivers/gpu/drm/drm_edid.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > index a82f37d44258..631125b46e04 100644 > --- a/drivers/gpu/drm/drm_edid.c > +++ b/drivers/gpu/drm/drm_edid.c > @@ -3741,7 +3741,7 @@ drm_add_cmdb_modes(struct drm_connector *connector, u8 > svd) > /** > * drm_display_mode_from_cea_vic() - return a mode for CEA VIC > * @dev: DRM device > - * @vic: CEA VIC of the mode > + * @video_code: CEA VIC of the mode > * > * Creates a new mode matching the specified CEA VIC. > * -- Sincerely, Lyude Paul (she/her) Software Engineer at Red Hat Note: I deal with a lot of emails and have a lot of bugs on my plate. If you've asked me a question, are waiting for a review/merge on a patch, etc. and I haven't responded in a while, please feel free to send me another email to check on my status. I don't bite!
[PATCH] tracing: Check return value of __create_val_fields() before using its result
From: "Steven Rostedt (VMware)" After having a typo for writing a histogram trigger. Wrote: echo 'hist:key=pid:ts=common_timestamp.usec' > events/sched/sched_waking/trigger Instead of: echo 'hist:key=pid:ts=common_timestamp.usecs' > events/sched/sched_waking/trigger and the following crash happened: BUG: kernel NULL pointer dereference, address: 0008 #PF: supervisor read access in kernel mode #PF: error_code(0x) - not-present page PGD 0 P4D 0 Oops: [#1] PREEMPT SMP PTI CPU: 4 PID: 1641 Comm: sh Not tainted 5.9.0-rc5-test+ #549 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:event_hist_trigger_func+0x70b/0x1ee0 Code: 24 08 89 d5 49 89 cc e9 8c 00 00 00 4c 89 f2 41 b9 00 10 00 00 4c 89 e1 44 89 ee 4c 89 ff e8 dc d3 ff ff 45 89 ea 4b 8b 14 d7 42 08 04 74 17 41 8b 8f c0 00 00 00 8d 71 01 41 89 b7 c0 00 00 RSP: 0018:959213d53db0 EFLAGS: 00010202 RAX: ffea RBX: RCX: 00084c04 RDX: RSI: df7326aefebd174c RDI: 00031080 RBP: 0002 R08: 0001 R09: 0001 R10: 0001 R11: 0046 R12: 959211dcf690 R13: 0001 R14: 95925a36e370 R15: 959251c89800 FS: 7fb9ea934740() GS:95925ab0() knlGS: CS: 0010 DS: ES: CR0: 80050033 CR2: 0008 CR3: c976c005 CR4: 001706e0 Call Trace: ? trigger_process_regex+0x78/0x110 trigger_process_regex+0xc5/0x110 event_trigger_write+0x71/0xd0 vfs_write+0xca/0x210 ksys_write+0x70/0xf0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fb9eaa29487 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 This was caused by accessing the hlist_data fields after the call to __create_val_fields() without checking if the creation succeed. Fixes: 63a1e5de3006 ("tracing: Save normal string variables") Signed-off-by: Steven Rostedt (VMware) --- kernel/trace/trace_events_hist.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index c74a7d157306..96c3f86b81c5 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -3687,7 +3687,7 @@ static int create_var_field(struct hist_trigger_data *hist_data, ret = __create_val_field(hist_data, val_idx, file, var_name, expr_str, flags); - if (hist_data->fields[val_idx]->flags & HIST_FIELD_FL_STRING) + if (!ret && hist_data->fields[val_idx]->flags & HIST_FIELD_FL_STRING) hist_data->fields[val_idx]->var_str_idx = hist_data->n_var_str++; return ret; -- 2.25.4
Re: sysfs filenames with spaces
On Tue, 2020-10-13 at 19:17 +0200, Pavel Machek wrote: > On Mon 2020-10-05 19:41:15, Joe Perches wrote: > > This doesn't seem like a great idea to me. > > > > For my system I've got: > > > > /sys/devices/platform/Fixed MDIO bus.0/ > > /sys/bus/platform/drivers/int3401 thermal/ > > /sys/bus/platform/drivers/int3403 thermal/ > > /sys/bus/platform/drivers/int3400 thermal/ > > /sys/bus/mdio_bus/drivers/Generic PHY/ > > /sys/bus/mdio_bus/drivers/Generic Clause 45 PHY/ > > /sys/bus/pnp/drivers/i8042 aux/ > > /sys/bus/pnp/drivers/i8042 kbd/ > > /sys/bus/i2c/drivers/CHT Whiskey Cove PMIC/ > > > > Could these filenames be avoided in the future or > > even renamed today? > > Does not look like great idea to me, either. Hmm. Is there filename > with "/" in it? :-) > > But I guess you'd need to cc relevant maintainers and that this is > going to be a bit of whack-a-mole. An option might be to convert any invalid filename via an alloc and substitution in sysfs_add_file and similar free in sysfs_remove_file. Emitting a logging message describing any new name would be useful too.
Re: [GIT PULL] x86/urgent for v5.10-rc1
The pull request you sent on Tue, 13 Oct 2020 19:36:15 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git > tags/x86_urgent_for_v5.10-rc1 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/857d64485e7c920364688a8a6dd0ffe5774327b6 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] io_uring updates for 5.10-rc1
On Mon, Oct 12, 2020 at 6:46 AM Jens Axboe wrote: > > Here are the io_uring updates for 5.10. Very strange. My clang build gives a warning I've never seen before: /tmp/io_uring-dd40c4.s:26476: Warning: ignoring changed section attributes for .data..read_mostly and looking at what clang generates for the *.s file, it seems to be the "section" line in: .type io_op_defs,@object # @io_op_defs .section.data..read_mostly,"a",@progbits .p2align4 I think it's the combination of "const" and "__read_mostly". I think the warning is sensible: how can a piece of data be both "const" and "__read_mostly"? If it's "const", then it's not "mostly" read - it had better be _always_ read. I'm letting it go, and I've pulled this (gcc doesn't complain), but please have a look. Linus
Re: [GIT PULL] io_uring updates for 5.10-rc1
The pull request you sent on Mon, 12 Oct 2020 07:46:45 -0600: > git://git.kernel.dk/linux-block.git tags/io_uring-5.10-2020-10-12 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/6ad4bf6ea1609fb539a62f10fca87ddbd53a0315 Thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/prtracker.html
Re: [PATCH] kcmp: add separate Kconfig symbol for kcmp syscall
On 10/07/2020 17.57, Matthew Wilcox wrote: > On Fri, Jul 10, 2020 at 09:56:31AM +0200, Rasmus Villemoes wrote: >> The ability to check open file descriptions for equality (without >> resorting to unreliable fstat() and fcntl(F_GETFL) comparisons) can be >> useful outside of the checkpoint/restore use case - for example, >> systemd uses kcmp() to deduplicate the per-service file descriptor >> store. >> >> Make it possible to have the kcmp() syscall without the full >> CONFIG_CHECKPOINT_RESTORE. > > If systemd is using it, is it even worth making it conditional any more? > Maybe for CONFIG_EXPERT builds, it could be de-selectable. > [hm, I dropped the ball, sorry for the necromancy] Well, first, I don't want to change any defaults here, if that is to be done, it should be a separate patch. Second, yes, systemd uses it for the de-duplication, and for that reason recommends CONFIG_CHECKPOINT_RESTORE (at least, according to their README) - but I'm not aware of any daemons that actually make use of systemd's file descriptor store, so it's not really something essential to every systemd-based system out there. It would be nice if systemd could change its recommendation to just CONFIG_KCMP_SYSCALL. But it's also useful for others, e.g. I have some code that wants to temporarily replace stdin/stdout/stderr with some other file descriptors, but needs to preserve the '0 is a dup of 1, or not' state (i.e., is the same struct file) - that cannot reliably be determined from fstat()/lseek(SEEK_CUR)/F_GETFL or whatever else one could throw at an fd. Rasmus
Re: [PATCH RFC V3 1/9] x86/pkeys: Create pkeys_common.h
On Tue, Oct 13, 2020 at 10:46:16AM -0700, Dave Hansen wrote: > On 10/9/20 12:42 PM, ira.we...@intel.com wrote: > > Protection Keys User (PKU) and Protection Keys Supervisor (PKS) work > > in similar fashions and can share common defines. > > Could we be a bit less abstract? PKS and PKU each have: > 1. A single control register > 2. The same number of keys > 3. The same number of bits in the register per key > 4. Access and Write disable in the same bit locations > > That means that we can share all the macros that synthesize and > manipulate register values between the two features. Sure. Done. > > > +++ b/arch/x86/include/asm/pkeys_common.h > > @@ -0,0 +1,11 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +#ifndef _ASM_X86_PKEYS_INTERNAL_H > > +#define _ASM_X86_PKEYS_INTERNAL_H > > + > > +#define PKR_AD_BIT 0x1 > > +#define PKR_WD_BIT 0x2 > > +#define PKR_BITS_PER_PKEY 2 > > + > > +#define PKR_AD_KEY(pkey) (PKR_AD_BIT << ((pkey) * PKR_BITS_PER_PKEY)) > > Now that this has moved away from its use-site, it's a bit less > self-documenting. Let's add a comment: > > /* > * Generate an Access-Disable mask for the given pkey. Several of these > * can be OR'd together to generate pkey register values. > */ Fair enough. done. > > Once that's in place, along with the updated changelog: > > Reviewed-by: Dave Hansen Thanks, Ira
Re: [PATCH v3 1/2] tracing: support "bool" type in synthetic trace events
On Fri, 9 Oct 2020, Axel Rasmussen wrote: > It's common [1] to define tracepoint fields as "bool" when they contain > a true / false value. Currently, defining a synthetic event with a > "bool" field yields EINVAL. It's possible to work around this by using > e.g. u8 (assuming sizeof(bool) is 1, and bool is unsigned; if either of > these properties don't match, you get EINVAL [2]). > > Supporting "bool" explicitly makes hooking this up easier and more > portable for userspace. > > [1]: grep -r "bool" include/trace/events/ > [2]: check_synth_field() in kernel/trace/trace_events_hist.c > > Acked-by: Michel Lespinasse > Signed-off-by: Axel Rasmussen Acked-by: David Rientjes
Re: [PATCH v3 2/2] mmap_lock: add tracepoints around lock acquisition
On Fri, 9 Oct 2020, Axel Rasmussen wrote: > The goal of these tracepoints is to be able to debug lock contention > issues. This lock is acquired on most (all?) mmap / munmap / page fault > operations, so a multi-threaded process which does a lot of these can > experience significant contention. > > We trace just before we start acquisition, when the acquisition returns > (whether it succeeded or not), and when the lock is released (or > downgraded). The events are broken out by lock type (read / write). > > The events are also broken out by memcg path. For container-based > workloads, users often think of several processes in a memcg as a single > logical "task", so collecting statistics at this level is useful. > > The end goal is to get latency information. This isn't directly included > in the trace events. Instead, users are expected to compute the time > between "start locking" and "acquire returned", using e.g. synthetic > events or BPF. The benefit we get from this is simpler code. > > Because we use tracepoint_enabled() to decide whether or not to trace, > this patch has effectively no overhead unless tracepoints are enabled at > runtime. If tracepoints are enabled, there is a performance impact, but > how much depends on exactly what e.g. the BPF program does. > > Signed-off-by: Axel Rasmussen Acked-by: David Rientjes
Re: [PATCH RFC PKS/PMEM 33/58] fs/cramfs: Utilize new kmap_thread()
On Tue, Oct 13, 2020 at 12:37 PM Matthew Wilcox wrote: > > On Tue, Oct 13, 2020 at 11:44:29AM -0700, Dan Williams wrote: > > On Fri, Oct 9, 2020 at 12:52 PM wrote: > > > > > > From: Ira Weiny > > > > > > The kmap() calls in this FS are localized to a single thread. To avoid > > > the over head of global PKRS updates use the new kmap_thread() call. > > > > > > Cc: Nicolas Pitre > > > Signed-off-by: Ira Weiny > > > --- > > > fs/cramfs/inode.c | 10 +- > > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > > > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c > > > index 912308600d39..003c014a42ed 100644 > > > --- a/fs/cramfs/inode.c > > > +++ b/fs/cramfs/inode.c > > > @@ -247,8 +247,8 @@ static void *cramfs_blkdev_read(struct super_block > > > *sb, unsigned int offset, > > > struct page *page = pages[i]; > > > > > > if (page) { > > > - memcpy(data, kmap(page), PAGE_SIZE); > > > - kunmap(page); > > > + memcpy(data, kmap_thread(page), PAGE_SIZE); > > > + kunmap_thread(page); > > > > Why does this need a sleepable kmap? This looks like a textbook > > kmap_atomic() use case. > > There's a lot of code of this form. Could we perhaps have: > > static inline void copy_to_highpage(struct page *to, void *vfrom, unsigned > int size) > { > char *vto = kmap_atomic(to); > > memcpy(vto, vfrom, size); > kunmap_atomic(vto); > } > > in linux/highmem.h ? Nice, yes, that could also replace the local ones in lib/iov_iter.c (memcpy_{to,from}_page())
Re: [PATCHSET RFC v3 0/6] Add support for TIF_NOTIFY_SIGNAL
On 10/12/20 11:27 AM, Miroslav Benes wrote: > On Sat, 10 Oct 2020, Jens Axboe wrote: > >> On 10/9/20 9:21 AM, Jens Axboe wrote: >>> On 10/9/20 2:01 AM, Miroslav Benes wrote: On Thu, 8 Oct 2020, Oleg Nesterov wrote: > On 10/05, Jens Axboe wrote: >> >> Hi, >> >> The goal is this patch series is to decouple TWA_SIGNAL based task_work >> from real signals and signal delivery. > > I think TIF_NOTIFY_SIGNAL can have more users. Say, we can move > try_to_freeze() from get_signal() to tracehook_notify_signal(), kill > fake_signal_wake_up(), and remove freezing() from recalc_sigpending(). > > Probably the same for TIF_PATCH_PENDING, klp_send_signals() can use > set_notify_signal() rather than signal_wake_up(). Yes, that was my impression from the patch set too, when I accidentally noticed it. Jens, could you CC our live patching ML when you submit v4, please? It would be a nice cleanup. >>> >>> Definitely, though it'd be v5 at this point. But we really need to get >>> all archs supporting TIF_NOTIFY_SIGNAL first. Once we have that, there's >>> a whole slew of cleanups that'll fall out naturally: >>> >>> - Removal of JOBCTL_TASK_WORK >>> - Removal of special path for TWA_SIGNAL in task_work >>> - TIF_PATCH_PENDING can be converted and then removed >>> - try_to_freeze() cleanup that Oleg mentioned >>> >>> And probably more I'm not thinking of right now :-) >> >> Here's the current series, I took a stab at converting all archs to >> support TIF_NOTIFY_SIGNAL so we have a base to build on top of. Most >> of them were straight forward, but I need someone to fixup powerpc, >> verify arm and s390. >> >> But it's a decent start I think, and means that we can drop various >> bits as is done at the end of the series. I could swap things around >> a bit and avoid having the intermediate step, but I envision that >> getting this in all archs will take a bit longer than just signing off >> on the generic/x86 bits. So probably best to keep the series as it is >> for now, and work on getting the arch bits verified/fixed/tested. >> >> https://git.kernel.dk/cgit/linux-block/log/?h=tif-task_work > > Thanks, Jens. > > Crude diff for live patching on top of the series is below. Tested only on > x86_64, but it passes the tests without an issue. Nice, thanks! I'm continuing to hone the series, what's really missing so far is arch review. Most conversions are straight forward, some I need folks to definitely take a look at (arm, s390). powerpc is also a bit hair right now, but I'm told that 5.10 will kill a TIF flag there, so that'll make it trivial once I rebase on that. Did a few more cleanups on top, series is in the same spot. I'll repost once the merge window settles down. -- Jens Axboe
Re: [PATCH RFC PKS/PMEM 33/58] fs/cramfs: Utilize new kmap_thread()
On Tue, Oct 13, 2020 at 11:44:29AM -0700, Dan Williams wrote: > On Fri, Oct 9, 2020 at 12:52 PM wrote: > > > > From: Ira Weiny > > > > The kmap() calls in this FS are localized to a single thread. To avoid > > the over head of global PKRS updates use the new kmap_thread() call. > > > > Cc: Nicolas Pitre > > Signed-off-by: Ira Weiny > > --- > > fs/cramfs/inode.c | 10 +- > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c > > index 912308600d39..003c014a42ed 100644 > > --- a/fs/cramfs/inode.c > > +++ b/fs/cramfs/inode.c > > @@ -247,8 +247,8 @@ static void *cramfs_blkdev_read(struct super_block *sb, > > unsigned int offset, > > struct page *page = pages[i]; > > > > if (page) { > > - memcpy(data, kmap(page), PAGE_SIZE); > > - kunmap(page); > > + memcpy(data, kmap_thread(page), PAGE_SIZE); > > + kunmap_thread(page); > > Why does this need a sleepable kmap? This looks like a textbook > kmap_atomic() use case. There's a lot of code of this form. Could we perhaps have: static inline void copy_to_highpage(struct page *to, void *vfrom, unsigned int size) { char *vto = kmap_atomic(to); memcpy(vto, vfrom, size); kunmap_atomic(vto); } in linux/highmem.h ?
Re: [PATCH] MAINTAINERS: jarkko.sakki...@linux.intel.com -> jar...@kernel.org
On Tue, 2020-10-13 at 22:25 +0300, Jarkko Sakkinen wrote: > On Tue, Oct 13, 2020 at 08:30:38AM -0700, Joe Perches wrote: > > On Tue, 2020-10-13 at 13:46 +0300, Jarkko Sakkinen wrote: > > > Use korg address as the main communications end point. Update the > > > corresponding M-entries. > > > > Maybe add an equivalent entry to .mailmap? > > Ugh, neither has @linux.intel.com. So, I'll insert these two lines: > > Jarkko Sakkinen > Jarkko Sakkinen I think a single line like works Jarkko Sakkinen Adding this to .mailmap gives: $ ./scripts/get_maintainer.pl -f drivers/char/tpm/tpm-sysfs.c Peter Huewe (maintainer:TPM DEVICE DRIVER) Jarkko Sakkinen (maintainer:TPM DEVICE DRIVER) Jason Gunthorpe (reviewer:TPM DEVICE DRIVER) linux-integr...@vger.kernel.org (open list:TPM DEVICE DRIVER) linux-kernel@vger.kernel.org (open list) even without the MAINTAINER file changes (though you should really do those too so people that read the file can use the proper address) --- .mailmap | 1 + 1 file changed, 1 insertion(+) diff --git a/.mailmap b/.mailmap index e4ccac4e2f88..1e14566a3d56 100644 --- a/.mailmap +++ b/.mailmap @@ -133,6 +133,7 @@ James Ketrenos Jan Glauber Jan Glauber Jan Glauber +Jarkko Sakkinen Jason Gunthorpe Jason Gunthorpe Jason Gunthorpe
Re: [PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
call_common+0xe8/0x218 [19611.952990][T1717146] INFO: Slab 0x99caaf22 objects=178 used=174 fp=0x006a64b0 flags=0x7fff800201 [19611.953004][T1717146] INFO: Object 0xf360132d @offset=30192 fp=0x [19611.953004][T1717146] [19611.953048][T1717146] Redzone acef7298: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb [19611.953080][T1717146] Object f360132d: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkk. [19611.953114][T1717146] Redzone 83758aaa: bb bb bb bb bb bb bb bb [19611.953146][T1717146] Padding cbb228a2: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a [19611.953189][T1717146] CPU: 16 PID: 1717146 Comm: trinity-c8 Tainted: GB W O 5.9.0-next-20201013 #1 [19611.953223][T1717146] Call Trace: [19611.953242][T1717146] [c000200022557800] [c064c208] dump_stack+0xec/0x144 (unreliable) [19611.953291][T1717146] [c000200022557840] [c0363688] print_trailer+0x278/0x2a0 [19611.953323][T1717146] [c0002000225578d0] [c035aa8c] free_debug_processing+0x57c/0x600 [19611.953356][T1717146] [c0002000225579b0] [c035af24] __slab_free+0x414/0x5b0 [19611.953391][T1717146] [c000200022557a80] [c035b55c] kfree+0x49c/0x510 [19611.953423][T1717146] [c000200022557b10] [c00432a0] pcibios_remove_bus+0x70/0x90 pci_irq_map_dispose at arch/powerpc/kernel/pci-common.c:456 (inlined by) pcibios_remove_bus at arch/powerpc/kernel/pci-common.c:461 [19611.953454][T1717146] [c000200022557b40] [c0677f94] pci_remove_bus+0xe4/0x110 [19611.953477][T1717146] [c000200022557b70] [c0678134] pci_remove_bus_device+0x74/0x170 [19611.953510][T1717146] [c000200022557bb0] [c0678120] pci_remove_bus_device+0x60/0x170 [19611.953543][T1717146] [c000200022557bf0] [c06782a4] pci_stop_and_remove_bus_device_locked+0x34/0x50 [19611.953567][T1717146] [c000200022557c20] [c0687690] remove_store+0xc0/0xe0 [19611.953599][T1717146] [c000200022557c70] [c06e5320] dev_attr_store+0x30/0x50 [19611.953621][T1717146] [c000200022557c90] [c04a53b8] sysfs_kf_write+0x68/0xb0 [19611.953652][T1717146] [c000200022557cd0] [c04a45e4] kernfs_fop_write+0x114/0x260 [19611.953684][T1717146] [c000200022557d20] [c03aff74] vfs_write+0xe4/0x260 [19611.953717][T1717146] [c000200022557d70] [c03b02a4] ksys_write+0x74/0x130 [19611.953762][T1717146] [c000200022557dc0] [c002a3e8] system_call_exception+0xf8/0x1d0 [19611.953795][T1717146] [c000200022557e20] [c000d0a8] system_call_common+0xe8/0x218 [19611.953821][T1717146] FIX kmalloc-16: Object at 0xf360132d not freed [19611.954111][T1717146] = [19611.954144][T1717146] BUG kmalloc-16 (Tainted: GB W O ): Wrong object count. Counter is 174 but counted were 176 [19611.954176][T1717146] - [19611.954176][T1717146] [19611.954221][T1717146] INFO: Slab 0x99caaf22 objects=178 used=174 fp=0x006a64b0 flags=0x7fff800201 [19611.954237][T1717146] CPU: 16 PID: 1717146 Comm: trinity-c8 Tainted: GB W O 5.9.0-next-20201013 #1 [19611.954269][T1717146] Call Trace: [19611.954286][T1717146] [c0002000225576f0] [c064c208] dump_stack+0xec/0x144 (unreliable) [19611.954329][T1717146] [c000200022557730] [c0363368] slab_err+0x78/0xb0 [19611.954364][T1717146] [c000200022557810] [c0359f94] on_freelist+0x364/0x390 [19611.954390][T1717146] [c0002000225578b0] [c035a798] free_debug_processing+0x288/0x600 [19611.954428][T1717146] [c000200022557990] [c035af24] __slab_free+0x414/0x5b0 [19611.954459][T1717146] [c000200022557a60] [c035b55c] kfree+0x49c/0x510 [19611.954507][T1717146] [c000200022557af0] [c02bd5a0] kfree_const+0x60/0x80 [19611.954540][T1717146] [c000200022557b10] [c06553ec] kobject_release+0x7c/0xd0 [19611.954562][T1717146] [c000200022557b50] [c06e66c0] put_device+0x20/0x40 [19611.954594][T1717146] [c000200022557b70] [c067820c] pci_remove_bus_device+0x14c/0x170 [19611.954627][T1717146] [c000200022557bb0] [c0678120] pci_remove_bus_device+0x60/0x170 [19611.954652][T1717146] [c000200022557bf0] [c06782a4] pci_stop_and_remove_bus_device_locked+0x34/0x50 [19611.954686][T1717146] [c000200022557c20] [c0687690] remove_store+0xc0/0xe0 [19611.954717][T1717146] [c000200022557c70] [c06e5320] dev_attr_store+0x30/0x50 [19611.954749][T1717146] [c000200022557c90] [c04a53b8] sysfs_kf_write+0x68/0xb0 [19611.954784][T1717146] [c000200022557cd0] [c04a45e4] kernfs_fop_write+0x114/0x260 [19611.954884][T1717146] [c000200022557d20] [c03aff74] vfs_write+0xe4/0x260 [19611.954972][T1717146] [c000200022557d70] [c
Re: [tip: locking/core] lockdep: Fix lockdep recursion
On Tue, Oct 13, 2020 at 09:26:50AM -0700, Paul E. McKenney wrote: > On Tue, Oct 13, 2020 at 01:25:44PM +0200, Peter Zijlstra wrote: > > On Tue, Oct 13, 2020 at 12:44:50PM +0200, Peter Zijlstra wrote: > > > On Tue, Oct 13, 2020 at 12:34:06PM +0200, Peter Zijlstra wrote: > > > > On Mon, Oct 12, 2020 at 02:28:12PM -0700, Paul E. McKenney wrote: > > > > > It is certainly an accident waiting to happen. Would something like > > > > > the following make sense? > > > > > > > > Sadly no. > > > > > > > > > > > > > > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > > > > index bfd38f2..52a63bc 100644 > > > > > --- a/kernel/rcu/tree.c > > > > > +++ b/kernel/rcu/tree.c > > > > > @@ -4067,6 +4067,7 @@ void rcu_cpu_starting(unsigned int cpu) > > > > > > > > > > rnp = rdp->mynode; > > > > > mask = rdp->grpmask; > > > > > + lockdep_off(); > > > > > raw_spin_lock_irqsave_rcu_node(rnp, flags); > > > > > WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask); > > > > > newcpu = !(rnp->expmaskinitnext & mask); > > > > > @@ -4086,6 +4087,7 @@ void rcu_cpu_starting(unsigned int cpu) > > > > > } else { > > > > > raw_spin_unlock_irqrestore_rcu_node(rnp, flags); > > > > > } > > > > > + lockdep_on(); > > > > > smp_mb(); /* Ensure RCU read-side usage follows above > > > > > initialization. */ > > > > > } > > > > > > > > This will just shut it up, but will not fix the actual problem of that > > > > spin-lock ending up in trace_lock_acquire() which relies on RCU which > > > > isn't looking. > > > > > > > > What we need here is to supress tracing not lockdep. Let me consider. > > > > > > We appear to have a similar problem with rcu_report_dead(), it's > > > raw_spin_unlock()s can end up in trace_lock_release() while we just > > > killed RCU. > > > > So we can deal with the explicit trace_*() calls like the below, but I > > really don't like it much. It also doesn't help with function tracing. > > This is really early/late in the hotplug cycle and should be considered > > entry, we shouldn't be tracing anything here. > > > > Paul, would it be possible to use a scheme similar to IRQ/NMI for > > hotplug? That seems to mostly rely on atomic ops, not locks. > > The rest of the rcu_node tree and the various grace-period/hotplug races > makes that question non-trivial. I will look into it, but I have no > reason for optimism. > > But there is only one way to find out... ;-) The aforementioned races get really ugly really fast. So I do not believe that a lockless approach is a strategy to win here. But why not use something sort of like a sequence counter, but adapted for local on-CPU use? This should quiet the diagnostics for the full time that RCU needs its locks. Untested patch below. Thoughts? Thanx, Paul diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 1d42909..5b06886 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1152,13 +1152,15 @@ bool rcu_lockdep_current_cpu_online(void) struct rcu_data *rdp; struct rcu_node *rnp; bool ret = false; + unsigned long seq; if (in_nmi() || !rcu_scheduler_fully_active) return true; preempt_disable_notrace(); rdp = this_cpu_ptr(_data); rnp = rdp->mynode; - if (rdp->grpmask & rcu_rnp_online_cpus(rnp)) + seq = READ_ONCE(rnp->ofl_seq) & ~0x1; + if (rdp->grpmask & rcu_rnp_online_cpus(rnp) || seq != READ_ONCE(rnp->ofl_seq)) ret = true; preempt_enable_notrace(); return ret; @@ -4065,6 +4067,8 @@ void rcu_cpu_starting(unsigned int cpu) rnp = rdp->mynode; mask = rdp->grpmask; + WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1); + WARN_ON_ONCE(!(rnp->ofl_seq & 0x1)); raw_spin_lock_irqsave_rcu_node(rnp, flags); WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask); newcpu = !(rnp->expmaskinitnext & mask); @@ -4084,6 +4088,8 @@ void rcu_cpu_starting(unsigned int cpu) } else { raw_spin_unlock_irqrestore_rcu_node(rnp, flags); } + WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1); + WARN_ON_ONCE(rnp->ofl_seq & 0x1); smp_mb(); /* Ensure RCU read-side usage follows above initialization. */ } @@ -4111,6 +4117,8 @@ void rcu_report_dead(unsigned int cpu) /* Remove outgoing CPU from mask in the leaf rcu_node structure. */ mask = rdp->grpmask; + WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1); + WARN_ON_ONCE(!(rnp->ofl_seq & 0x1)); raw_spin_lock(_state.ofl_lock); raw_spin_lock_irqsave_rcu_node(rnp, flags); /* Enforce GP memory-order guarantee. */ rdp->rcu_ofl_gp_seq = READ_ONCE(rcu_state.gp_seq); @@ -4123,6 +4131,8 @@
Re: [PATCH] mm: memcontrol: Remove unused mod_memcg_obj_state()
On Tue, 13 Oct 2020, Muchun Song wrote: > Since commit: > > 991e7673859e ("mm: memcontrol: account kernel stack per node") > > There is no user of the mod_memcg_obj_state(). This patch just remove > it. Also rework type of the idx parameter of the mod_objcg_state() > from int to enum node_stat_item. > > Signed-off-by: Muchun Song Acked-by: David Rientjes
Re: [PATCH v5 4/5] docs: counter: Document character device interface
On Tue, Oct 13, 2020 at 02:08:45PM -0500, David Lechner wrote: > On 10/13/20 1:58 PM, William Breathitt Gray wrote: > > On Mon, Oct 12, 2020 at 12:04:10PM -0500, David Lechner wrote: > >> On 10/8/20 7:28 AM, William Breathitt Gray wrote: > >>> On Thu, Oct 08, 2020 at 10:09:09AM +0200, Pavel Machek wrote: > Hi! > > > +int main(void) > > +{ > > +struct pollfd pfd = { .events = POLLIN }; > > +struct counter_event event_data[2]; > > + > > +pfd.fd = open("/dev/counter0", O_RDWR); > > + > > +ioctl(pfd.fd, COUNTER_SET_WATCH_IOCTL, watches); > > +ioctl(pfd.fd, COUNTER_SET_WATCH_IOCTL, watches + 1); > > +ioctl(pfd.fd, COUNTER_LOAD_WATCHES_IOCTL); > > + > > +for (;;) { > > +poll(, 1, -1); > > Why do poll, when you are doing blocking read? > > > +read(pfd.fd, event_data, sizeof(event_data)); > > Does your new chrdev always guarantee returning complete buffer? > > If so, should it behave like that? > > Best regards, > Pavel > -- > (english) http://www.livejournal.com/~pavelmachek > (cesky, pictures) > http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html > >>> > >>> I suppose you're right: a poll() should be redundant now with this > >>> version of the character device implementation because buffers will > >>> always return complete; so a blocking read() should achieve the same > >>> behavior that a poll() with read() would. > >>> > >>> I'll give some more time for additional feedback to come in for this > >>> version of the patchset, and then likely remove support for poll() in > >>> the v6 submission. > >>> > >>> William Breathitt Gray > >>> > >> > >> I hope that you mean that you will just remove it from the example > >> and not from the chardev. Otherwise it won't be possible to > >> integrate this with an event loop. > > > > Would you elaborate a bit further on this? My thought process is that > > because users must set the Counter Events they want to watch, and only > > those Counter Events show up in the character device node, a blocking > > read() would effectively behave the same as poll() with read(); if none > > of the Counter Events occur, the read() just blocks until one does, thus > > making the use of a poll() call redundant. > > > > William Breathitt Gray > > > > If the counter device was the only file descriptor being read, then yes > it wouldn't matter. But if we are using this in combination with other > file descriptors, then it is common to poll all of the file descriptors > using a single syscall to see which one is ready to read rather than > doing a non-blocking read on all of the file descriptors, which would > result in many unnecessary syscalls. Ah, that's a fair point, my original view was somewhat myopic there. I'll leave poll support in the Counter chrdev then and just simplify the documentation example code to not use it. William Breathitt Gray signature.asc Description: PGP signature
[PATCH 5/9] perf tools: Pass build_id object to dso__set_build_id
Passing build_id object to dso__set_build_id, so it's easier to initialize dos's build id object. Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/util/dso.c| 4 ++-- tools/perf/util/dso.h| 2 +- tools/perf/util/header.c | 4 +++- tools/perf/util/symbol-minimal.c | 2 +- tools/perf/util/symbol.c | 2 +- 5 files changed, 8 insertions(+), 6 deletions(-) diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c index 2f7f01ead9a1..4415ce83150b 100644 --- a/tools/perf/util/dso.c +++ b/tools/perf/util/dso.c @@ -1326,9 +1326,9 @@ void dso__put(struct dso *dso) dso__delete(dso); } -void dso__set_build_id(struct dso *dso, void *build_id) +void dso__set_build_id(struct dso *dso, struct build_id *bid) { - memcpy(dso->bid.data, build_id, sizeof(dso->bid.data)); + dso->bid = *bid; dso->has_build_id = 1; } diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h index eac004210b47..5a5678dbdaa5 100644 --- a/tools/perf/util/dso.h +++ b/tools/perf/util/dso.h @@ -260,7 +260,7 @@ bool dso__sorted_by_name(const struct dso *dso); void dso__set_sorted_by_name(struct dso *dso); void dso__sort_by_name(struct dso *dso); -void dso__set_build_id(struct dso *dso, void *build_id); +void dso__set_build_id(struct dso *dso, struct build_id *bid); bool dso__build_id_equal(const struct dso *dso, u8 *build_id); void dso__read_running_kernel_build_id(struct dso *dso, struct machine *machine); diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c index fe220f01fc94..21243adbb9fd 100644 --- a/tools/perf/util/header.c +++ b/tools/perf/util/header.c @@ -2082,8 +2082,10 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev, dso = machine__findnew_dso(machine, filename); if (dso != NULL) { char sbuild_id[SBUILD_ID_SIZE]; + struct build_id bid; - dso__set_build_id(dso, >build_id); + build_id__init(, bev->build_id, BUILD_ID_SIZE); + dso__set_build_id(dso, ); if (dso_space != DSO_SPACE__USER) { struct kmod_path m = { .name = NULL, }; diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c index dba6b9e5d64e..f9eb0bee7f15 100644 --- a/tools/perf/util/symbol-minimal.c +++ b/tools/perf/util/symbol-minimal.c @@ -349,7 +349,7 @@ int dso__load_sym(struct dso *dso, struct map *map __maybe_unused, dso->is_64_bit = ret; if (filename__read_build_id(ss->name, ) > 0) - dso__set_build_id(dso, bid.data); + dso__set_build_id(dso, ); return 0; } diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c index 369cbad09f0d..976632d0baa0 100644 --- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -1818,7 +1818,7 @@ int dso__load(struct dso *dso, struct map *map) is_regular_file(dso->long_name)) { __symbol__join_symfs(name, PATH_MAX, dso->long_name); if (filename__read_build_id(name, ) > 0) - dso__set_build_id(dso, bid.data); + dso__set_build_id(dso, ); } /* -- 2.26.2
[PATCH 7/9] perf tools: Add size to struct perf_record_header_build_id
We do not store size with build ids in perf data, but there's enough space to do it. Adding misc bit PERF_RECORD_MISC_BUILD_ID_SIZE to mark build id event with size. With this fix the dso with md5 build id will have correct build id data and will be usable for debuginfod processing if needed (coming in following patches). Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/lib/perf/include/perf/event.h | 12 +++- tools/perf/util/build-id.c | 8 +--- tools/perf/util/header.c| 10 +++--- 3 files changed, 23 insertions(+), 7 deletions(-) diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h index a6dbba6b9073..988c539bedb6 100644 --- a/tools/lib/perf/include/perf/event.h +++ b/tools/lib/perf/include/perf/event.h @@ -201,10 +201,20 @@ struct perf_record_header_tracing_data { __u32size; }; +#define PERF_RECORD_MISC_BUILD_ID_SIZE (1 << 15) + struct perf_record_header_build_id { struct perf_event_header header; pid_tpid; - __u8 build_id[24]; + union { + __u8 build_id[24]; + struct { + __u8 data[20]; + __u8 size; + __u8 reserved1__; + __u16reserved2__; + }; + }; char filename[]; }; diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c index b5648735f01f..8763772f1095 100644 --- a/tools/perf/util/build-id.c +++ b/tools/perf/util/build-id.c @@ -296,7 +296,7 @@ char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size, continue; \ else -static int write_buildid(const char *name, size_t name_len, u8 *build_id, +static int write_buildid(const char *name, size_t name_len, struct build_id *bid, pid_t pid, u16 misc, struct feat_fd *fd) { int err; @@ -307,7 +307,9 @@ static int write_buildid(const char *name, size_t name_len, u8 *build_id, len = PERF_ALIGN(len, NAME_ALIGN); memset(, 0, sizeof(b)); - memcpy(_id, build_id, BUILD_ID_SIZE); + memcpy(, bid->data, bid->size); + b.size = (u8) bid->size; + misc |= PERF_RECORD_MISC_BUILD_ID_SIZE; b.pid = pid; b.header.misc = misc; b.header.size = sizeof(b) + len; @@ -354,7 +356,7 @@ static int machine__write_buildid_table(struct machine *machine, in_kernel = pos->kernel || is_kernel_module(name, PERF_RECORD_MISC_CPUMODE_UNKNOWN); - err = write_buildid(name, name_len, pos->bid.data, machine->pid, + err = write_buildid(name, name_len, >bid, machine->pid, in_kernel ? kmisc : umisc, fd); if (err) break; diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c index 21243adbb9fd..8da3886f10a8 100644 --- a/tools/perf/util/header.c +++ b/tools/perf/util/header.c @@ -2083,8 +2083,12 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev, if (dso != NULL) { char sbuild_id[SBUILD_ID_SIZE]; struct build_id bid; + size_t size = BUILD_ID_SIZE; - build_id__init(, bev->build_id, BUILD_ID_SIZE); + if (bev->header.misc & PERF_RECORD_MISC_BUILD_ID_SIZE) + size = bev->size; + + build_id__init(, bev->data, size); dso__set_build_id(dso, ); if (dso_space != DSO_SPACE__USER) { @@ -2098,8 +2102,8 @@ static int __event_process_build_id(struct perf_record_header_build_id *bev, } build_id__sprintf(>bid, sbuild_id); - pr_debug("build id event received for %s: %s\n", -dso->long_name, sbuild_id); + pr_debug("build id event received for %s: %s [%lu]\n", +dso->long_name, sbuild_id, size); dso__put(dso); } -- 2.26.2
[PATCH 6/9] perf tools: Pass build_id object to dso__build_id_equal
Passing build_id object to dso__build_id_equal, so we can properly check build id with different size than sha1. Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/util/dso.c| 5 +++-- tools/perf/util/dso.h| 2 +- tools/perf/util/symbol-elf.c | 8 ++-- tools/perf/util/symbol.c | 2 +- 4 files changed, 11 insertions(+), 6 deletions(-) diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c index 4415ce83150b..ca965845b35e 100644 --- a/tools/perf/util/dso.c +++ b/tools/perf/util/dso.c @@ -1332,9 +1332,10 @@ void dso__set_build_id(struct dso *dso, struct build_id *bid) dso->has_build_id = 1; } -bool dso__build_id_equal(const struct dso *dso, u8 *build_id) +bool dso__build_id_equal(const struct dso *dso, struct build_id *bid) { - return memcmp(dso->bid.data, build_id, sizeof(dso->bid.data)) == 0; + return dso->bid.size == bid->size && + memcmp(dso->bid.data, bid->data, dso->bid.size) == 0; } void dso__read_running_kernel_build_id(struct dso *dso, struct machine *machine) diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h index 5a5678dbdaa5..f926c96bf230 100644 --- a/tools/perf/util/dso.h +++ b/tools/perf/util/dso.h @@ -261,7 +261,7 @@ void dso__set_sorted_by_name(struct dso *dso); void dso__sort_by_name(struct dso *dso); void dso__set_build_id(struct dso *dso, struct build_id *bid); -bool dso__build_id_equal(const struct dso *dso, u8 *build_id); +bool dso__build_id_equal(const struct dso *dso, struct build_id *bid); void dso__read_running_kernel_build_id(struct dso *dso, struct machine *machine); int dso__kernel_module_get_build_id(struct dso *dso, const char *root_dir); diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c index 97a55f162ea5..44dd86a4f25f 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -834,13 +834,17 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name, /* Always reject images with a mismatched build-id: */ if (dso->has_build_id && !symbol_conf.ignore_vmlinux_buildid) { u8 build_id[BUILD_ID_SIZE]; + struct build_id bid; + int size; - if (elf_read_build_id(elf, build_id, BUILD_ID_SIZE) < 0) { + size = elf_read_build_id(elf, build_id, BUILD_ID_SIZE); + if (size <= 0) { dso->load_errno = DSO_LOAD_ERRNO__CANNOT_READ_BUILDID; goto out_elf_end; } - if (!dso__build_id_equal(dso, build_id)) { + build_id__init(, build_id, size); + if (!dso__build_id_equal(dso, )) { pr_debug("%s: build id mismatch for %s.\n", __func__, name); dso->load_errno = DSO_LOAD_ERRNO__MISMATCHING_BUILDID; goto out_elf_end; diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c index 976632d0baa0..613885df 100644 --- a/tools/perf/util/symbol.c +++ b/tools/perf/util/symbol.c @@ -2136,7 +2136,7 @@ static char *dso__find_kallsyms(struct dso *dso, struct map *map) } if (sysfs__read_build_id("/sys/kernel/notes", ) == 0) - is_host = dso__build_id_equal(dso, bid.data); + is_host = dso__build_id_equal(dso, ); /* Try a fast path for /proc/kallsyms if possible */ if (is_host) { -- 2.26.2
[PATCH 8/9] perf tools: Align buildid list output for short build ids
With shorter md5 build ids we need to align their paths properly with other build ids: $ perf buildid-list 17f4e448cc746582ea1881528deb549f7fdb3fd5 [kernel.kallsyms] a50e350e97c43b4708d09bcd85ebfff7 .../tools/perf/buildid-ex-md5 1805c738c8f3ec0f47b7ea09080c28f34d18a82b /usr/lib64/ld-2.31.so Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/util/dso.c | 2 +- tools/perf/util/dso.h | 1 - tools/perf/util/dsos.c | 6 -- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c index ca965845b35e..55c11e854fe4 100644 --- a/tools/perf/util/dso.c +++ b/tools/perf/util/dso.c @@ -1369,7 +1369,7 @@ int dso__kernel_module_get_build_id(struct dso *dso, return 0; } -size_t dso__fprintf_buildid(struct dso *dso, FILE *fp) +static size_t dso__fprintf_buildid(struct dso *dso, FILE *fp) { char sbuild_id[SBUILD_ID_SIZE]; diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h index f926c96bf230..d8cb4f5680a4 100644 --- a/tools/perf/util/dso.h +++ b/tools/perf/util/dso.h @@ -362,7 +362,6 @@ struct dso *machine__findnew_kernel(struct machine *machine, const char *name, void dso__reset_find_symbol_cache(struct dso *dso); -size_t dso__fprintf_buildid(struct dso *dso, FILE *fp); size_t dso__fprintf_symbols_by_name(struct dso *dso, FILE *fp); size_t dso__fprintf(struct dso *dso, FILE *fp); diff --git a/tools/perf/util/dsos.c b/tools/perf/util/dsos.c index 87161e431830..183a81d5b2f9 100644 --- a/tools/perf/util/dsos.c +++ b/tools/perf/util/dsos.c @@ -287,10 +287,12 @@ size_t __dsos__fprintf_buildid(struct list_head *head, FILE *fp, size_t ret = 0; list_for_each_entry(pos, head, node) { + char sbuild_id[SBUILD_ID_SIZE]; + if (skip && skip(pos, parm)) continue; - ret += dso__fprintf_buildid(pos, fp); - ret += fprintf(fp, " %s\n", pos->long_name); + build_id__sprintf(>bid, sbuild_id); + ret += fprintf(fp, "%-40s %s\n", sbuild_id, pos->long_name); } return ret; } -- 2.26.2
Re: [PATCH] MAINTAINERS: jarkko.sakki...@linux.intel.com -> jar...@kernel.org
On Tue, Oct 13, 2020 at 08:30:38AM -0700, Joe Perches wrote: > On Tue, 2020-10-13 at 13:46 +0300, Jarkko Sakkinen wrote: > > Use korg address as the main communications end point. Update the > > corresponding M-entries. > > Maybe add an equivalent entry to .mailmap? Ugh, neither has @linux.intel.com. So, I'll insert these two lines: Jarkko Sakkinen Jarkko Sakkinen > > diff --git a/MAINTAINERS b/MAINTAINERS > [] > > -M: Jarkko Sakkinen > > +M: Jarkko Sakkinen Thanks. /Jarkko
[PATCH 9/9] perf tools: Add build id shell test
Adding test for build id cache that adds binary with sha1 and md5 build ids and verifies it's added properly. The test updates build id cache with perf record and perf buildid-cache -a. Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/tests/shell/buildid.sh | 101 ++ 1 file changed, 101 insertions(+) create mode 100755 tools/perf/tests/shell/buildid.sh diff --git a/tools/perf/tests/shell/buildid.sh b/tools/perf/tests/shell/buildid.sh new file mode 100755 index ..4861a20edee2 --- /dev/null +++ b/tools/perf/tests/shell/buildid.sh @@ -0,0 +1,101 @@ +#!/bin/sh +# build id cache operations +# SPDX-License-Identifier: GPL-2.0 + +# skip if there's no readelf +if ! [ -x "$(command -v readelf)" ]; then + echo "failed: no readelf, install binutils" + exit 2 +fi + +# skip if there's no compiler +if ! [ -x "$(command -v cc)" ]; then + echo "failed: no compiler, install gcc" + exit 2 +fi + +ex_md5=$(mktemp /tmp/perf.ex.MD5.XXX) +ex_sha1=$(mktemp /tmp/perf.ex.SHA1.XXX) + +echo 'int main(void) { return 0; }' | cc -Wl,--build-id=sha1 -o ${ex_sha1} -x c - +echo 'int main(void) { return 0; }' | cc -Wl,--build-id=md5 -o ${ex_md5} -x c - + +echo "test binaries: ${ex_sha1} ${ex_md5}" + +check() +{ + id=`readelf -n ${1} 2>/dev/null | grep 'Build ID' | awk '{print $3}'` + + echo "build id: ${id}" + + link=${build_id_dir}/.build-id/${id:0:2}/${id:2} + echo "link: ${link}" + + if [ ! -h $link ]; then + echo "failed: link ${link} does not exist" + exit 1 + fi + + file=${build_id_dir}/.build-id/${id:0:2}/`readlink ${link}`/elf + echo "file: ${file}" + + if [ ! -x $file ]; then + echo "failed: file ${file} does not exist" + exit 1 + fi + + diff ${file} ${1} + if [ $? -ne 0 ]; then + echo "failed: ${file} do not match" + exit 1 + fi + + echo "OK for ${1}" +} + +test_add() +{ + build_id_dir=$(mktemp -d /tmp/perf.debug.XXX) + perf="perf --buildid-dir ${build_id_dir}" + + ${perf} buildid-cache -v -a ${1} + if [ $? -ne 0 ]; then + echo "failed: add ${1} to build id cache" + exit 1 + fi + + check ${1} + + rm -rf ${build_id_dir} +} + +test_record() +{ + data=$(mktemp /tmp/perf.data.XXX) + build_id_dir=$(mktemp -d /tmp/perf.debug.XXX) + perf="perf --buildid-dir ${build_id_dir}" + + ${perf} record --buildid-all -o ${data} ${1} + if [ $? -ne 0 ]; then + echo "failed: record ${1}" + exit 1 + fi + + check ${1} + + rm -rf ${build_id_dir} + rm -rf ${data} +} + +# add binaries manual via perf buildid-cache -a +test_add ${ex_sha1} +test_add ${ex_md5} + +# add binaries via perf record post processing +test_record ${ex_sha1} +test_record ${ex_md5} + +# cleanup +rm ${ex_sha1} ${ex_md5} + +exit ${err} -- 2.26.2
[PATCH 3/9] perf tools: Pass build id object to sysfs__read_build_id
Passing build id object to sysfs__read_build_id function, so it can populate the size of the build_id object. Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/util/build-id.c | 6 +++--- tools/perf/util/dso.c| 6 ++ tools/perf/util/symbol-elf.c | 11 +-- tools/perf/util/symbol-minimal.c | 7 ++- tools/perf/util/symbol.c | 7 +++ tools/perf/util/symbol.h | 2 +- 6 files changed, 16 insertions(+), 23 deletions(-) diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c index 62b258aaa128..1c332e78bbc3 100644 --- a/tools/perf/util/build-id.c +++ b/tools/perf/util/build-id.c @@ -113,7 +113,7 @@ int build_id__sprintf(const u8 *build_id, int len, char *bf) int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id) { char notes[PATH_MAX]; - u8 build_id[BUILD_ID_SIZE]; + struct build_id bid; int ret; if (!root_dir) @@ -121,11 +121,11 @@ int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id) scnprintf(notes, sizeof(notes), "%s/sys/kernel/notes", root_dir); - ret = sysfs__read_build_id(notes, build_id, sizeof(build_id)); + ret = sysfs__read_build_id(notes, ); if (ret < 0) return ret; - return build_id__sprintf(build_id, sizeof(build_id), sbuild_id); + return build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); } int filename__sprintf_build_id(const char *pathname, char *sbuild_id) diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c index d2c1ed08c879..e87fa9a71d9f 100644 --- a/tools/perf/util/dso.c +++ b/tools/perf/util/dso.c @@ -1346,8 +1346,7 @@ void dso__read_running_kernel_build_id(struct dso *dso, struct machine *machine) if (machine__is_default_guest(machine)) return; sprintf(path, "%s/sys/kernel/notes", machine->root_dir); - if (sysfs__read_build_id(path, dso->bid.data, -sizeof(dso->bid.data)) == 0) + if (sysfs__read_build_id(path, >bid) == 0) dso->has_build_id = true; } @@ -1365,8 +1364,7 @@ int dso__kernel_module_get_build_id(struct dso *dso, "%s/sys/module/%.*s/notes/.note.gnu.build-id", root_dir, (int)strlen(name) - 1, name); - if (sysfs__read_build_id(filename, dso->bid.data, -sizeof(dso->bid.data)) == 0) + if (sysfs__read_build_id(filename, >bid) == 0) dso->has_build_id = true; return 0; diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c index 61d7c444e6f5..97a55f162ea5 100644 --- a/tools/perf/util/symbol-elf.c +++ b/tools/perf/util/symbol-elf.c @@ -595,13 +595,11 @@ int filename__read_build_id(const char *filename, struct build_id *bid) #endif // HAVE_LIBBFD_BUILDID_SUPPORT -int sysfs__read_build_id(const char *filename, void *build_id, size_t size) +int sysfs__read_build_id(const char *filename, struct build_id *bid) { + size_t size = sizeof(bid->data); int fd, err = -1; - if (size < BUILD_ID_SIZE) - goto out; - fd = open(filename, O_RDONLY); if (fd < 0) goto out; @@ -622,8 +620,9 @@ int sysfs__read_build_id(const char *filename, void *build_id, size_t size) break; if (memcmp(bf, "GNU", sizeof("GNU")) == 0) { size_t sz = min(descsz, size); - if (read(fd, build_id, sz) == (ssize_t)sz) { - memset(build_id + sz, 0, size - sz); + if (read(fd, bid->data, sz) == (ssize_t)sz) { + memset(bid->data + sz, 0, size - sz); + bid->size = sz; err = 0; break; } diff --git a/tools/perf/util/symbol-minimal.c b/tools/perf/util/symbol-minimal.c index 5173331ee6e4..dba6b9e5d64e 100644 --- a/tools/perf/util/symbol-minimal.c +++ b/tools/perf/util/symbol-minimal.c @@ -222,9 +222,8 @@ int filename__read_build_id(const char *filename, struct build_id *bid) return ret; } -int sysfs__read_build_id(const char *filename, void *build_id, size_t size __maybe_unused) +int sysfs__read_build_id(const char *filename, struct build_id *bid) { - struct build_id bid; int fd; int ret = -1; struct stat stbuf; @@ -246,9 +245,7 @@ int sysfs__read_build_id(const char *filename, void *build_id, size_t size __may if (read(fd, buf, buf_size) != (ssize_t) buf_size) goto out_free; - ret = read_build_id(buf, buf_size, , false); - if (ret > 0) - memcpy(build_id, bid.data, bid.size); + ret = read_build_id(buf, buf_size, bid, false); out_free: free(buf);
[PATCH 1/9] perf tools: Use build_id object in dso
Replace build_id byte array with struct build_id object and all the code that references it. The objective is to carry size together with build id array, so it's better to keep both together. This is preparatory change for following patches, and there's no functional change. Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/bench/inject-buildid.c | 4 ++-- tools/perf/builtin-buildid-cache.c | 2 +- tools/perf/builtin-inject.c| 4 ++-- tools/perf/util/annotate.c | 4 ++-- tools/perf/util/build-id.c | 6 +++--- tools/perf/util/build-id.h | 5 + tools/perf/util/dso.c | 18 +- tools/perf/util/dso.h | 2 +- tools/perf/util/dsos.c | 4 ++-- tools/perf/util/header.c | 2 +- tools/perf/util/map.c | 4 ++-- tools/perf/util/probe-event.c | 2 +- .../scripting-engines/trace-event-python.c | 2 +- tools/perf/util/symbol.c | 2 +- tools/perf/util/synthetic-events.c | 2 +- 15 files changed, 34 insertions(+), 29 deletions(-) diff --git a/tools/perf/bench/inject-buildid.c b/tools/perf/bench/inject-buildid.c index e9a11f4a1109..3d048a8139a7 100644 --- a/tools/perf/bench/inject-buildid.c +++ b/tools/perf/bench/inject-buildid.c @@ -79,12 +79,12 @@ static int add_dso(const char *fpath, const struct stat *sb __maybe_unused, int typeflag, struct FTW *ftwbuf __maybe_unused) { struct bench_dso *dso = [nr_dsos]; - unsigned char build_id[BUILD_ID_SIZE]; + struct build_id bid; if (typeflag == FTW_D || typeflag == FTW_SL) return 0; - if (filename__read_build_id(fpath, build_id, BUILD_ID_SIZE) < 0) + if (filename__read_build_id(fpath, bid.data, sizeof(bid.data)) < 0) return 0; dso->name = realpath(fpath, NULL); diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c index 39efa51d7fb3..a523c629f321 100644 --- a/tools/perf/builtin-buildid-cache.c +++ b/tools/perf/builtin-buildid-cache.c @@ -284,7 +284,7 @@ static bool dso__missing_buildid_cache(struct dso *dso, int parm __maybe_unused) pr_warning("Problems with %s file, consider removing it from the cache\n", filename); - } else if (memcmp(dso->build_id, build_id, sizeof(dso->build_id))) { + } else if (memcmp(dso->bid.data, build_id, sizeof(dso->bid.data))) { pr_warning("Problems with %s file, consider removing it from the cache\n", filename); } diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c index f3f965157d69..a35cdabe5bd4 100644 --- a/tools/perf/builtin-inject.c +++ b/tools/perf/builtin-inject.c @@ -522,8 +522,8 @@ static int dso__read_build_id(struct dso *dso) return 0; nsinfo__mountns_enter(dso->nsinfo, ); - if (filename__read_build_id(dso->long_name, dso->build_id, - sizeof(dso->build_id)) > 0) { + if (filename__read_build_id(dso->long_name, dso->bid.data, + sizeof(dso->bid.data)) > 0) { dso->has_build_id = true; } nsinfo__mountns_exit(); diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index fc17af7ba845..a016e1bd7b8d 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1578,8 +1578,8 @@ int symbol__strerror_disassemble(struct map_symbol *ms, int errnum, char *buf, s char *build_id_msg = NULL; if (dso->has_build_id) { - build_id__sprintf(dso->build_id, - sizeof(dso->build_id), bf + 15); + build_id__sprintf(dso->bid.data, + sizeof(dso->bid.data), bf + 15); build_id_msg = bf; } scnprintf(buf, buflen, diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c index 31207b6e2066..7da13ddb0d50 100644 --- a/tools/perf/util/build-id.c +++ b/tools/perf/util/build-id.c @@ -272,7 +272,7 @@ char *dso__build_id_filename(const struct dso *dso, char *bf, size_t size, if (!dso->has_build_id) return NULL; - build_id__sprintf(dso->build_id, sizeof(dso->build_id), sbuild_id); + build_id__sprintf(dso->bid.data, sizeof(dso->bid.data), sbuild_id); linkname = build_id_cache__linkname(sbuild_id, NULL, 0); if (!linkname) return NULL; @@ -355,7 +355,7 @@ static int machine__write_buildid_table(struct machine *machine, in_kernel = pos->kernel || is_kernel_module(name,
[PATCH 2/9] perf tools: Pass build_id object to filename__read_build_id
Passing build_id object to filename__read_build_id function, so it can populate the size of the build_id object. Changing filename__read_build_id code for both elf/non-elf code. Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/bench/inject-buildid.c | 2 +- tools/perf/builtin-buildid-cache.c | 25 +++--- tools/perf/builtin-inject.c| 4 +--- tools/perf/tests/pe-file-parsing.c | 10 - tools/perf/tests/sdt.c | 6 +++--- tools/perf/util/build-id.c | 8 +++ tools/perf/util/dsos.c | 3 +-- tools/perf/util/symbol-elf.c | 16 -- tools/perf/util/symbol-minimal.c | 34 +- tools/perf/util/symbol.c | 6 +++--- tools/perf/util/symbol.h | 3 ++- 11 files changed, 60 insertions(+), 57 deletions(-) diff --git a/tools/perf/bench/inject-buildid.c b/tools/perf/bench/inject-buildid.c index 3d048a8139a7..280227e3ffd7 100644 --- a/tools/perf/bench/inject-buildid.c +++ b/tools/perf/bench/inject-buildid.c @@ -84,7 +84,7 @@ static int add_dso(const char *fpath, const struct stat *sb __maybe_unused, if (typeflag == FTW_D || typeflag == FTW_SL) return 0; - if (filename__read_build_id(fpath, bid.data, sizeof(bid.data)) < 0) + if (filename__read_build_id(fpath, ) < 0) return 0; dso->name = realpath(fpath, NULL); diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c index a523c629f321..c3cb168d546d 100644 --- a/tools/perf/builtin-buildid-cache.c +++ b/tools/perf/builtin-buildid-cache.c @@ -174,19 +174,19 @@ static int build_id_cache__add_kcore(const char *filename, bool force) static int build_id_cache__add_file(const char *filename, struct nsinfo *nsi) { char sbuild_id[SBUILD_ID_SIZE]; - u8 build_id[BUILD_ID_SIZE]; + struct build_id bid; int err; struct nscookie nsc; nsinfo__mountns_enter(nsi, ); - err = filename__read_build_id(filename, _id, sizeof(build_id)); + err = filename__read_build_id(filename, ); nsinfo__mountns_exit(); if (err < 0) { pr_debug("Couldn't read a build-id in %s\n", filename); return -1; } - build_id__sprintf(build_id, sizeof(build_id), sbuild_id); + build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); err = build_id_cache__add_s(sbuild_id, filename, nsi, false, false); pr_debug("Adding %s %s: %s\n", sbuild_id, filename, @@ -196,21 +196,21 @@ static int build_id_cache__add_file(const char *filename, struct nsinfo *nsi) static int build_id_cache__remove_file(const char *filename, struct nsinfo *nsi) { - u8 build_id[BUILD_ID_SIZE]; char sbuild_id[SBUILD_ID_SIZE]; + struct build_id bid; struct nscookie nsc; int err; nsinfo__mountns_enter(nsi, ); - err = filename__read_build_id(filename, _id, sizeof(build_id)); + err = filename__read_build_id(filename, ); nsinfo__mountns_exit(); if (err < 0) { pr_debug("Couldn't read a build-id in %s\n", filename); return -1; } - build_id__sprintf(build_id, sizeof(build_id), sbuild_id); + build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); err = build_id_cache__remove_s(sbuild_id); pr_debug("Removing %s %s: %s\n", sbuild_id, filename, err ? "FAIL" : "Ok"); @@ -274,17 +274,16 @@ static int build_id_cache__purge_all(void) static bool dso__missing_buildid_cache(struct dso *dso, int parm __maybe_unused) { char filename[PATH_MAX]; - u8 build_id[BUILD_ID_SIZE]; + struct build_id bid; if (dso__build_id_filename(dso, filename, sizeof(filename), false) && - filename__read_build_id(filename, build_id, - sizeof(build_id)) != sizeof(build_id)) { + filename__read_build_id(filename, ) == -1) { if (errno == ENOENT) return false; pr_warning("Problems with %s file, consider removing it from the cache\n", filename); - } else if (memcmp(dso->bid.data, build_id, sizeof(dso->bid.data))) { + } else if (memcmp(dso->bid.data, bid.data, bid.size)) { pr_warning("Problems with %s file, consider removing it from the cache\n", filename); } @@ -300,14 +299,14 @@ static int build_id_cache__fprintf_missing(struct perf_session *session, FILE *f static int build_id_cache__update_file(const char *filename, struct nsinfo *nsi) { - u8 build_id[BUILD_ID_SIZE]; char sbuild_id[SBUILD_ID_SIZE]; + struct build_id bid; struct nscookie nsc; int err; nsinfo__mountns_enter(nsi, ); - err = filename__read_build_id(filename,
[PATCH 4/9] perf tools: Pass build_id object to build_id__sprintf
Passing build_id object to build_id__sprintf function, so it can operate with the proper size of build id. This will create proper md5 build id readable names, like following: a50e350e97c43b4708d09bcd85ebfff7 instead of: a50e350e97c43b4708d09bcd85ebfff7 Acked-by: Ian Rogers Signed-off-by: Jiri Olsa --- tools/perf/builtin-buildid-cache.c| 6 ++-- tools/perf/tests/sdt.c| 2 +- tools/perf/util/annotate.c| 3 +- tools/perf/util/build-id.c| 30 --- tools/perf/util/build-id.h| 3 +- tools/perf/util/dso.c | 6 ++-- tools/perf/util/header.c | 3 +- tools/perf/util/map.c | 4 +-- tools/perf/util/probe-event.c | 9 -- tools/perf/util/probe-finder.c| 8 +++-- .../scripting-engines/trace-event-python.c| 2 +- tools/perf/util/symbol.c | 2 +- 12 files changed, 43 insertions(+), 35 deletions(-) diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c index c3cb168d546d..a25411926e48 100644 --- a/tools/perf/builtin-buildid-cache.c +++ b/tools/perf/builtin-buildid-cache.c @@ -186,7 +186,7 @@ static int build_id_cache__add_file(const char *filename, struct nsinfo *nsi) return -1; } - build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); + build_id__sprintf(, sbuild_id); err = build_id_cache__add_s(sbuild_id, filename, nsi, false, false); pr_debug("Adding %s %s: %s\n", sbuild_id, filename, @@ -210,7 +210,7 @@ static int build_id_cache__remove_file(const char *filename, struct nsinfo *nsi) return -1; } - build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); + build_id__sprintf(, sbuild_id); err = build_id_cache__remove_s(sbuild_id); pr_debug("Removing %s %s: %s\n", sbuild_id, filename, err ? "FAIL" : "Ok"); @@ -314,7 +314,7 @@ static int build_id_cache__update_file(const char *filename, struct nsinfo *nsi) } err = 0; - build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); + build_id__sprintf(, sbuild_id); if (build_id_cache__cached(sbuild_id)) err = build_id_cache__remove_s(sbuild_id); diff --git a/tools/perf/tests/sdt.c b/tools/perf/tests/sdt.c index 3ef37f739203..ed76c693f65e 100644 --- a/tools/perf/tests/sdt.c +++ b/tools/perf/tests/sdt.c @@ -37,7 +37,7 @@ static int build_id_cache__add_file(const char *filename) return err; } - build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); + build_id__sprintf(, sbuild_id); err = build_id_cache__add_s(sbuild_id, filename, NULL, false, false); if (err < 0) pr_debug("Failed to add build id cache of %s\n", filename); diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index a016e1bd7b8d..6c8575e182ed 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1578,8 +1578,7 @@ int symbol__strerror_disassemble(struct map_symbol *ms, int errnum, char *buf, s char *build_id_msg = NULL; if (dso->has_build_id) { - build_id__sprintf(dso->bid.data, - sizeof(dso->bid.data), bf + 15); + build_id__sprintf(>bid, bf + 15); build_id_msg = bf; } scnprintf(buf, buflen, diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c index 1c332e78bbc3..b5648735f01f 100644 --- a/tools/perf/util/build-id.c +++ b/tools/perf/util/build-id.c @@ -37,6 +37,7 @@ #include #include +#include static bool no_buildid_cache; @@ -95,13 +96,13 @@ struct perf_tool build_id__mark_dso_hit_ops = { .ordered_events = true, }; -int build_id__sprintf(const u8 *build_id, int len, char *bf) +int build_id__sprintf(const struct build_id *build_id, char *bf) { char *bid = bf; - const u8 *raw = build_id; - int i; + const u8 *raw = build_id->data; + size_t i; - for (i = 0; i < len; ++i) { + for (i = 0; i < build_id->size; ++i) { sprintf(bid, "%02x", *raw); ++raw; bid += 2; @@ -125,7 +126,7 @@ int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id) if (ret < 0) return ret; - return build_id__sprintf(bid.data, sizeof(bid.data), sbuild_id); + return build_id__sprintf(, sbuild_id); } int filename__sprintf_build_id(const char *pathname, char *sbuild_id) @@ -137,7 +138,7 @@ int filename__sprintf_build_id(const char *pathname, char *sbuild_id) if (ret < 0) return ret; - return build_id__sprintf(bid.data, sizeof(bid.data),
Re: [PATCH v2 12/24] drm/dp: fix a kernel-doc issue at drm_edid.c
Reviewed-by: Lyude Paul Thanks for the fixes! I will go ahead and push 11 and 12 to drm-misc-next. On Tue, 2020-10-13 at 14:14 +0200, Mauro Carvalho Chehab wrote: > The name of the argument is different, causing those warnings: > > ./drivers/gpu/drm/drm_edid.c:3754: warning: Function parameter or member > 'video_code' not described in 'drm_display_mode_from_cea_vic' > ./drivers/gpu/drm/drm_edid.c:3754: warning: Excess function parameter > 'vic' description in 'drm_display_mode_from_cea_vic' > > Fixes: 7af655bce275 ("drm/dp: Add drm_dp_downstream_mode()") > Signed-off-by: Mauro Carvalho Chehab > --- > drivers/gpu/drm/drm_edid.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > index a82f37d44258..631125b46e04 100644 > --- a/drivers/gpu/drm/drm_edid.c > +++ b/drivers/gpu/drm/drm_edid.c > @@ -3741,7 +3741,7 @@ drm_add_cmdb_modes(struct drm_connector *connector, u8 > svd) > /** > * drm_display_mode_from_cea_vic() - return a mode for CEA VIC > * @dev: DRM device > - * @vic: CEA VIC of the mode > + * @video_code: CEA VIC of the mode > * > * Creates a new mode matching the specified CEA VIC. > * -- Sincerely, Lyude Paul (she/her) Software Engineer at Red Hat Note: I deal with a lot of emails and have a lot of bugs on my plate. If you've asked me a question, are waiting for a review/merge on a patch, etc. and I haven't responded in a while, please feel free to send me another email to check on my status. I don't bite!
[PATCHv2 0/9] perf tools: Add support for build id with different sizes
hi, currently we support only one storage size (20 bytes) for build ids. That fits for SHA1 build ids, but there can in theory be any size. The gcc linker supports also MD5, which is 16 bytes. Currently the MD5 build id will be stored in .debug cache with additional zeros, like: $ find ~/.debug .../.debug/.build-id/a5/0e350e97c43b4708d09bcd85ebfff7 ... .../buildid-ex-md5/a50e350e97c43b4708d09bcd85ebfff7 .../buildid-ex-md5/a50e350e97c43b4708d09bcd85ebfff7/elf .../buildid-ex-md5/a50e350e97c43b4708d09bcd85ebfff7/probes And the same reflected in buildid-list as well: $ perf buildid-list 17f4e448cc746582ea1881528deb549f7fdb3fd5 [kernel.kallsyms] a50e350e97c43b4708d09bcd85ebfff7 .../buildid-ex-md5 This will cause problems in future when we will ask debuginfod for binaries/debuginfo based on build id. This patchset is adding 'struct build_id' object, that holds the build id data and size and use it to store build ids and changes all related functions to use it. Also available in: git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git perf/build_id_size v2 changes: - rebased on current perf/core - updated test to build its own binaries [Namhyung] thanks, jirka --- Jiri Olsa (9): perf tools: Use build_id object in dso perf tools: Pass build_id object to filename__read_build_id perf tools: Pass build id object to sysfs__read_build_id perf tools: Pass build_id object to build_id__sprintf perf tools: Pass build_id object to dso__set_build_id perf tools: Pass build_id object to dso__build_id_equal perf tools: Add size to struct perf_record_header_build_id perf tools: Align buildid list output for short build ids perf tools: Add build id shell test tools/lib/perf/include/perf/event.h| 12 +++- tools/perf/bench/inject-buildid.c | 4 ++-- tools/perf/builtin-buildid-cache.c | 25 tools/perf/builtin-inject.c| 4 +--- tools/perf/tests/pe-file-parsing.c | 10 +- tools/perf/tests/sdt.c | 6 +++--- tools/perf/tests/shell/buildid.sh | 101 tools/perf/util/annotate.c | 3 +-- tools/perf/util/build-id.c | 48 +++--- tools/perf/util/build-id.h | 8 +++- tools/perf/util/dso.c | 23 ++ tools/perf/util/dso.h | 7 +++ tools/perf/util/dsos.c | 9 + tools/perf/util/header.c | 15 ++- tools/perf/util/map.c | 4 +--- tools/perf/util/probe-event.c | 9 ++--- tools/perf/util/probe-finder.c | 8 +--- tools/perf/util/scripting-engines/trace-event-python.c | 2 +- tools/perf/util/symbol-elf.c | 35 +++-- tools/perf/util/symbol-minimal.c | 31 +++--- tools/perf/util/symbol.c | 15 +++ tools/perf/util/symbol.h | 5 +++-- tools/perf/util/synthetic-events.c | 2 +- 23 files changed, 260 insertions(+), 126 deletions(-) create mode 100755 tools/perf/tests/shell/buildid.sh
Re: [2/2] drm/msm: Add support for GPU cooling
On 10/13/2020 11:10 PM, m...@chromium.org wrote: On Tue, Oct 13, 2020 at 07:23:34PM +0530, Akhil P Oommen wrote: On 10/12/2020 11:10 PM, m...@chromium.org wrote: On Mon, Oct 12, 2020 at 07:03:51PM +0530, Akhil P Oommen wrote: On 10/10/2020 12:06 AM, m...@chromium.org wrote: Hi Akhil, On Thu, Oct 08, 2020 at 10:39:07PM +0530, Akhil P Oommen wrote: Register GPU as a devfreq cooling device so that it can be passively cooled by the thermal framework. Signed-off-by: Akhil P Oommen --- drivers/gpu/drm/msm/msm_gpu.c | 13 - drivers/gpu/drm/msm/msm_gpu.h | 2 ++ 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c index 55d1648..93ffd66 100644 --- a/drivers/gpu/drm/msm/msm_gpu.c +++ b/drivers/gpu/drm/msm/msm_gpu.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include @@ -107,9 +108,18 @@ static void msm_devfreq_init(struct msm_gpu *gpu) if (IS_ERR(gpu->devfreq.devfreq)) { DRM_DEV_ERROR(>pdev->dev, "Couldn't initialize GPU devfreq\n"); gpu->devfreq.devfreq = NULL; + return; } devfreq_suspend_device(gpu->devfreq.devfreq); + + gpu->cooling = of_devfreq_cooling_register(gpu->pdev->dev.of_node, + gpu->devfreq.devfreq); + if (IS_ERR(gpu->cooling)) { + DRM_DEV_ERROR(>pdev->dev, + "Couldn't register GPU cooling device\n"); + gpu->cooling = NULL; + } } static int enable_pwrrail(struct msm_gpu *gpu) @@ -926,7 +936,6 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev, msm_devfreq_init(gpu); - Will remove this unintended change. gpu->aspace = gpu->funcs->create_address_space(gpu, pdev); if (gpu->aspace == NULL) @@ -1005,4 +1014,6 @@ void msm_gpu_cleanup(struct msm_gpu *gpu) gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu); msm_gem_address_space_put(gpu->aspace); } + + devfreq_cooling_unregister(gpu->cooling); Resources should be released in reverse order, otherwise the cooling device could use resources that have already been freed. Why do you think this is not the correct order? If you are thinking about devfreq struct, it is managed device resource. I did not check specifically if changing the frequency really uses any of the resources that are released previously, In any case it's not a good idea to allow other parts of the kernel to use a half initialized/torn down device. Even if it isn't a problem today someone could change the driver to use any of these resources (or add a new one) in a frequency change, without even thinking about the cooling device, just (rightfully) asuming that things are set up and torn down in a sane order. 'sane order' relative to what specifically here? Should we worry about freq change at this point because we have already disabled gpu runtime pm and devfreq? GPU runtime PM and the devfreq being disabled is not evident from the context of the function. You are probably right that it's not a problem in practice, but why give reason for doubts in the first place if this could be avoided by following a common practice? ___ dri-devel mailing list dri-de...@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel Other option I see is to create a managed device resource (devm) version of the devfreq_cooling_register API and use that. Is that what you are trying to suggest? -Akhil.
Re: [PATCH v2] vfio/fsl-mc: Fixed vfio-fsl-mc driver compilation on 32 bit
On Tue, 13 Oct 2020 18:06:51 +0300 Diana Craciun wrote: > The FSL_MC_BUS on which the VFIO-FSL-MC driver is dependent on > can be compiled on other architectures as well (not only ARM64) > including 32 bit architectures. > Include linux/io-64-nonatomic-hi-lo.h to make writeq/readq used > in the driver available on 32bit platforms. > > Signed-off-by: Diana Craciun > --- > v1 --> v2 > - Added prefix to patch description > > drivers/vfio/fsl-mc/vfio_fsl_mc.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c > b/drivers/vfio/fsl-mc/vfio_fsl_mc.c > index d009f873578c..80fc7f4ed343 100644 > --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c > +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c > @@ -13,6 +13,7 @@ > #include > #include > #include > +#include > > #include "vfio_fsl_mc_private.h" > Thanks, applied and pushed to next. Hopefully it's either this or the merge ordering biting us with linux-next. Thanks, Alex
Re: [PATCH v2 11/24] drm/dp: fix kernel-doc warnings at drm_dp_helper.c
Reviewed-by: Lyude Paul On Tue, 2020-10-13 at 14:14 +0200, Mauro Carvalho Chehab wrote: > As warned by kernel-doc: > > ./drivers/gpu/drm/drm_dp_helper.c:385: warning: Function parameter or > member 'type' not described in 'drm_dp_downstream_is_type' > ./drivers/gpu/drm/drm_dp_helper.c:886: warning: Function parameter or > member 'dev' not described in 'drm_dp_downstream_mode' > > Some function parameters weren't documented. > > Fixes: 38784f6f8805 ("drm/dp: Add helpers to identify downstream facing port > types") > Signed-off-by: Mauro Carvalho Chehab > --- > drivers/gpu/drm/drm_dp_helper.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/gpu/drm/drm_dp_helper.c b/drivers/gpu/drm/drm_dp_helper.c > index b1c71af88579..deeed73f4ed6 100644 > --- a/drivers/gpu/drm/drm_dp_helper.c > +++ b/drivers/gpu/drm/drm_dp_helper.c > @@ -374,6 +374,10 @@ static bool is_edid_digital_input_dp(const struct edid > *edid) > * drm_dp_downstream_is_type() - is the downstream facing port of certain > type? > * @dpcd: DisplayPort configuration data > * @port_cap: port capabilities > + * @type: port type to be checked. Can be: > + * %DP_DS_PORT_TYPE_DP, %DP_DS_PORT_TYPE_VGA, %DP_DS_PORT_TYPE_DVI, > + * %DP_DS_PORT_TYPE_HDMI, %DP_DS_PORT_TYPE_NON_EDID, > + * %DP_DS_PORT_TYPE_DP_DUALMODE or %DP_DS_PORT_TYPE_WIRELESS. > * > * Caveat: Only works with DPCD 1.1+ port caps. > * > @@ -870,6 +874,7 @@ EXPORT_SYMBOL(drm_dp_downstream_444_to_420_conversion); > > /** > * drm_dp_downstream_mode() - return a mode for downstream facing port > + * @dev: DRM device > * @dpcd: DisplayPort configuration data > * @port_cap: port capabilities > * -- Sincerely, Lyude Paul (she/her) Software Engineer at Red Hat Note: I deal with a lot of emails and have a lot of bugs on my plate. If you've asked me a question, are waiting for a review/merge on a patch, etc. and I haven't responded in a while, please feel free to send me another email to check on my status. I don't bite!
Re: linux-next: build failure after merge of the vfio tree
On Tue, 13 Oct 2020 18:56:07 +0300 Diana Craciun OSS wrote: > Hi, > > How does it fail? What's the error? > > Thanks, > Diana > > > On 10/13/2020 6:07 AM, Stephen Rothwell wrote: > > Hi all, > > > > After merging the vfio tree, today's linux-next build (x86_64 > > allmodconfig) failed like this: > > > > > > Caused by commit > > > >cc0ee20bd969 ("vfio/fsl-mc: trigger an interrupt via eventfd") > >ac93ab2bf69a ("vfio/fsl-mc: Add support for device reset") > > > > I have used the vfio tree from next-20201012 for today. Thanks, Stephen. Diana has posted a 32bit build fix which I've merged, maybe that was the error. Also Diana's series in my branch is currently dependent on fsl-bus support in GregKH's char-misc-next branch. Looking at the log from the successful build, I wonder if our branches are just in the wrong order (vfio/next processed on line 341, char-misc-next processed on 387). I don't know if you regularly re-order for this sort of thing, otherwise it should work out when Greg's branch gets merged, but testing sooner in next would be preferred. Thanks, Alex
Nouveau DRM failure on 5120x1440 screen with 5.8/5.9 kernel
I'm having a problem with both the 5.8 and 5.9 kernels using the nouveau DRM driver. I have a laptop with a VGA card (specs below) connected to a 5120x1440 screen. At boot time, the card correctly detects the screen, tries to allocate fbdev fb0, then the video hangs completely for 15-30 seconds until it goes blank. This used to work in Linux 5.7 and earlier, although it allocated a 3840x1080 fb instead of a 5120x1440. I've attached the full dmesg. I tried commands like video=DP-2:3840x1080 but it doesn't help. Linux 5.8 boots without hanging if the laptop is not connected to the 5120x1440 screen. PCI specs: 01:00.0 0300: 10de:0dfc (rev a1) 01:00.0 VGA compatible controller: NVIDIA Corporation GF108GLM [NVS 5200M] (rev a1) xrandr available resolutions reported (from Linux 5.7 using Xorg): Screen 0: minimum 320 x 200, current 5120 x 1440, maximum 16384 x 16384 LVDS-1 unknown connection (normal left inverted right x axis y axis) 1600x900 59.99 + 40.00 5120x1440 60.00 1360x1020 73.97 1152x864 59.97 1024x768 59.95 800x600 59.96 640x480 59.94 DP-1 disconnected (normal left inverted right x axis y axis) DP-2 connected primary 5120x1440+0+0 (normal left inverted right x axis y axis) 1200mm x 340mm panning 5120x1440+0+0 3840x1080 59.97 + 5120x1440 29.98* 2560x1080 60.0059.9459.98 1920x1080 60.0060.0050.0059.94 1920x1080i60.0050.0059.94 1600x1200 60.00 1280x1024 75.0260.02 1280x800 59.81 1152x864 75.00 1280x720 60.0050.0059.94 1024x768 75.0360.00 800x600 75.0060.32 720x576 50.00 720x480 60.0059.94 640x480 75.0060.0059.94 720x400 70.08 HDMI-1 disconnected (normal left inverted right x axis y axis) VGA-1 disconnected (normal left inverted right x axis y axis) I'm currently using 5120x1440@30. 60 Hz isn't available. But look below: xrandr resolutions from Linux 5.9 (even though screen is still blank): Screen 0: minimum 320 x 200, current 5120 x 1440, maximum 16384 x 16384 LVDS-1 unknown connection (normal left inverted right x axis y axis) 1600x900 59.99 + 40.00 5120x1440 60.00 1360x1020 73.97 1152x864 59.97 1024x768 59.95 800x600 59.96 640x480 59.94 DP-1 disconnected (normal left inverted right x axis y axis) DP-2 connected primary 5120x1440+0+0 (normal left inverted right x axis y axis) 1200mm x 340mm panning 5120x1440+0+0 5120x1440 59.98 + 29.98* 3840x1080 59.97 + 2560x1080 60.0059.9459.98 1920x1080 60.0060.0050.0059.94 1920x1080i60.0050.0059.94 1600x1200 60.00 1280x1024 75.0260.02 1280x800 59.81 1152x864 75.00 1280x720 60.0050.0059.94 1024x768 75.0360.00 800x600 75.0060.32 720x576 50.00 720x480 60.0059.94 640x480 75.0060.0059.94 720x400 70.08 HDMI-1 disconnected (normal left inverted right x axis y axis) VGA-1 disconnected (normal left inverted right x axis y axis) Let me know if you need additional debug information/etc. Thanks, -Byron microcode: microcode updated early to revision 0x21, date = 2019-02-13 Linux version 5.9.0 (r...@iss.comtime.lan) (gcc (Gentoo 10.2.0-r2 p3) 10.2.0, GNU ld (Gentoo 2.35.1 p1) 2.35.1) #2 SMP PREEMPT Tue Oct 13 14:54:36 EDT 2020 Command line: auto BOOT_IMAGE=cti ro root=801 cti KERNEL supported cpus: Intel GenuineIntel x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. BIOS-provided physical RAM map: BIOS-e820: [mem 0x-0x0009e7ff] usable BIOS-e820: [mem 0x0009e800-0x0009] reserved BIOS-e820: [mem 0x000e-0x000f] reserved BIOS-e820: [mem 0x0010-0xc9f09fff] usable BIOS-e820: [mem 0xc9f0a000-0xc9ff] reserved BIOS-e820: [mem 0xca00-0xca752fff] usable BIOS-e820: [mem 0xca753000-0xca7f] reserved BIOS-e820: [mem 0xca80-0xcafb1fff] usable BIOS-e820: [mem 0xcafb2000-0xcaff] ACPI data BIOS-e820: [mem 0xcb00-0xcc6fbfff] usable BIOS-e820: [mem 0xcc6fc000-0xcc7f] ACPI NVS BIOS-e820: [mem 0xcc80-0xcdda0fff] usable BIOS-e820: [mem 0xcdda1000-0xce7a4fff] reserved BIOS-e820: [mem 0xce7a5000-0xce7e7fff] ACPI NVS BIOS-e820: [mem 0xce7e8000-0xcf2b2fff] usable BIOS-e820: [mem 0xcf2b3000-0xcf7e] reserved BIOS-e820: [mem
Re: [PATCH 1/9] perf tools: Add build id shell test
On Tue, Oct 13, 2020 at 01:13:40PM -0300, Arnaldo Carvalho de Melo wrote: > Em Wed, Sep 30, 2020 at 07:15:04PM +0200, Jiri Olsa escreveu: > > Adding test for build id cache that adds binary > > with sha1 and md5 build ids and verifies it's > > added properly. > > > > The test updates build id cache with perf record > > and perf buildid-cache -a. > > > [root@five ~]# perf test "build id" > 82: build id cache operations : Skip > [root@five ~]# set -o vi > [root@five ~]# perf test -v "build id" > 82: build id cache operations : > --- start --- > test child forked, pid 88384 > failed: no test binaries > test child finished with -2 > end > build id cache operations: Skip > [root@five ~]# hm, output looks like older version of the test > > Also the other patches clashed with Namhyung's patch series, can you > please check? right, new api is different now > > I've just pushed what I have to acme/perf/core I rebased and pushed perf/build_id_size branch, also will post new version thanks, jirka
Re: [PATCH v2 06/24] blk-mq: docs: add kernel-doc description for a new struct member
On 10/13/20 6:14 AM, Mauro Carvalho Chehab wrote: > As reported by kernel-doc: > ./include/linux/blk-mq.h:267: warning: Function parameter or member > 'active_queues_shared_sbitmap' not described in 'blk_mq_tag_set' > > There is now a new member for struct blk_mq_tag_set. Add a > description for it, based on the commit that introduced it. Reviewed-by: Jens Axboe -- Jens Axboe
Re: Unbreakable loop in fuse_fill_write_pages()
On Tue, 2020-10-13 at 14:58 -0400, Vivek Goyal wrote: > I am wondering if virtiofsd still alive and responding to requests? I > see another task which is blocked on getdents() for more than 120s. > > [10580.142571][ T348] INFO: task trinity-c36:254165 blocked for more than 123 > +seconds. > [10580.143924][ T348] Tainted: G O5.9.0-next-20201013+ #2 > [10580.145158][ T348] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > +disables this message. > [10580.146636][ T348] task:trinity-c36 state:D stack:26704 pid:254165 > ppid: > +87180 flags:0x0004 > [10580.148260][ T348] Call Trace: > [10580.148789][ T348] __schedule+0x71d/0x1b50 > [10580.149532][ T348] ? __sched_text_start+0x8/0x8 > [10580.150343][ T348] schedule+0xbf/0x270 > [10580.151044][ T348] schedule_preempt_disabled+0xc/0x20 > [10580.152006][ T348] __mutex_lock+0x9f1/0x1360 > [10580.152777][ T348] ? __fdget_pos+0x9c/0xb0 > [10580.153484][ T348] ? mutex_lock_io_nested+0x1240/0x1240 > [10580.154432][ T348] ? find_held_lock+0x33/0x1c0 > [10580.155220][ T348] ? __fdget_pos+0x9c/0xb0 > [10580.155934][ T348] __fdget_pos+0x9c/0xb0 > [10580.156660][ T348] __x64_sys_getdents+0xff/0x230 > > May be virtiofsd crashed and hence no requests are completing leading > to a hard lockup? No, it was not crashed. After I had to forcibly close the guest, the virtiofsd daemon will exit normally. However, I can't tell exactly if the virtiofsd daemon was still functioning normally. I'll enable the debug and retry to see if there is anything interesting.
Re: [PATCH RESEND 1/1] perf build: Allow nested externs to enable BUILD_BUG() usage
Em Mon, Oct 12, 2020 at 08:59:36AM +1100, Stephen Rothwell escreveu: > Hi all, > > On Fri, 9 Oct 2020 14:41:11 +0200 Jiri Olsa wrote: > > > > On Fri, Oct 09, 2020 at 02:25:23PM +0200, Vasily Gorbik wrote: > > > Currently BUILD_BUG() macro is expanded to smth like the following: > > >do { > > >extern void __compiletime_assert_0(void) > > >__attribute__((error("BUILD_BUG failed"))); > > >if (!(!(1))) > > >__compiletime_assert_0(); > > >} while (0); > > > > > > If used in a function body this obviously would produce build errors > > > with -Wnested-externs and -Werror. > > > > > > To enable BUILD_BUG() usage in tools/arch/x86/lib/insn.c which perf > > > includes in intel-pt-decoder, build perf without -Wnested-externs. > > > > > > Reported-by: Stephen Rothwell > > > Signed-off-by: Vasily Gorbik > > > > that one applied nicely ;-) thanks > > > > Acked-by: Jiri Olsa > > I will apply that patch to the merge of the tip tree today (instead of > reverting the series I reverted in Friday) (unless I get an update of > the tip tree containing it, of course). Applied to perf/core that will go to Linus this week, maybe even today. - Arnaldo
Re: [PATCH v5 4/5] docs: counter: Document character device interface
On 10/13/20 1:58 PM, William Breathitt Gray wrote: On Mon, Oct 12, 2020 at 12:04:10PM -0500, David Lechner wrote: On 10/8/20 7:28 AM, William Breathitt Gray wrote: On Thu, Oct 08, 2020 at 10:09:09AM +0200, Pavel Machek wrote: Hi! +int main(void) +{ +struct pollfd pfd = { .events = POLLIN }; +struct counter_event event_data[2]; + +pfd.fd = open("/dev/counter0", O_RDWR); + +ioctl(pfd.fd, COUNTER_SET_WATCH_IOCTL, watches); +ioctl(pfd.fd, COUNTER_SET_WATCH_IOCTL, watches + 1); +ioctl(pfd.fd, COUNTER_LOAD_WATCHES_IOCTL); + +for (;;) { +poll(, 1, -1); Why do poll, when you are doing blocking read? +read(pfd.fd, event_data, sizeof(event_data)); Does your new chrdev always guarantee returning complete buffer? If so, should it behave like that? Best regards, Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html I suppose you're right: a poll() should be redundant now with this version of the character device implementation because buffers will always return complete; so a blocking read() should achieve the same behavior that a poll() with read() would. I'll give some more time for additional feedback to come in for this version of the patchset, and then likely remove support for poll() in the v6 submission. William Breathitt Gray I hope that you mean that you will just remove it from the example and not from the chardev. Otherwise it won't be possible to integrate this with an event loop. Would you elaborate a bit further on this? My thought process is that because users must set the Counter Events they want to watch, and only those Counter Events show up in the character device node, a blocking read() would effectively behave the same as poll() with read(); if none of the Counter Events occur, the read() just blocks until one does, thus making the use of a poll() call redundant. William Breathitt Gray If the counter device was the only file descriptor being read, then yes it wouldn't matter. But if we are using this in combination with other file descriptors, then it is common to poll all of the file descriptors using a single syscall to see which one is ready to read rather than doing a non-blocking read on all of the file descriptors, which would result in many unnecessary syscalls.
Re: [PATCH RESEND 1/1] perf build: Allow nested externs to enable BUILD_BUG() usage
Em Fri, Oct 09, 2020 at 02:41:11PM +0200, Jiri Olsa escreveu: > On Fri, Oct 09, 2020 at 02:25:23PM +0200, Vasily Gorbik wrote: > > Currently BUILD_BUG() macro is expanded to smth like the following: > >do { > >extern void __compiletime_assert_0(void) > >__attribute__((error("BUILD_BUG failed"))); > >if (!(!(1))) > >__compiletime_assert_0(); > >} while (0); > > > > If used in a function body this obviously would produce build errors > > with -Wnested-externs and -Werror. > > > > To enable BUILD_BUG() usage in tools/arch/x86/lib/insn.c which perf > > includes in intel-pt-decoder, build perf without -Wnested-externs. > > > > Reported-by: Stephen Rothwell > > Signed-off-by: Vasily Gorbik > > that one applied nicely ;-) thanks > > Acked-by: Jiri Olsa Thanks, applied. - Arnaldo