Re: Linux 5.3-rc8
On Sa, 14.09.19 09:52, Linus Torvalds (torva...@linux-foundation.org) wrote: > On Sat, Sep 14, 2019 at 9:35 AM Alexander E. Patrakov > wrote: > > > > Let me repeat: not -EINVAL, please. Please find some other error code, > > so that the application could sensibly distinguish between this case > > (low quality entropy is in the buffer) and the "kernel is too dumb" case > > (and no entropy is in the buffer). > > I'm not convinced we want applications to see that difference. > > The fact is, every time an application thinks it cares, it has caused > problems. I can just see systemd saying "ok, the kernel didn't block, > so I'll just do > >while (getrandom(x) == -ENOENTROPY) >sleep(1); > > instead. Which is still completely buggy garbage. > > The fact is, we can't guarantee entropy in general. It's probably > there is practice, particularly with user space saving randomness from > last boot etc, but that kind of data may be real entropy, but the > kernel cannot *guarantee* that it is. I am not expecting the kernel to guarantee entropy. I just expecting the kernel to not give me garbage knowingly. It's OK if it gives me garbage unknowingly, but I have a problem if it gives me trash all the time. There's benefit in being able to wait until the pool is initialized before we update the random seed stored on disk with a new one, and there's benefit in being able to wait until the pool is initialized before we let cryptsetup read a fresh, one-time key for dm-crypt from /dev/urandom. I fully understand that any such reporting for initialization is "best-effort", i.e. to the point where we don't know anything to the contrary, but at least give userspace that. Lennart -- Lennart Poettering, Berlin
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 10:05:21PM -0400, Theodore Y. Ts'o wrote: > You basically want to turn getrandom into /dev/urandom. And that's > how we got into the mess where 10% of the publically accessible ssh > keys could be guessed. Not exactly. This was an *API* issue that created this situation. The fact that you had a single random() call in the libc, either mapped to /dev/urandom or to /dev/random. By then many of us were used to rely on one or the other and finding systems where /dev/random was a symlink to /dev/urandom to avoid blocking was extremely common. In fact it was caused by the exact same situation: we try to enforce good random for everyone, it cannot work all the time and breaks programs which do not need such randoms, so the user breaks the trust on randomness by configuring the system so that randoms work all the time for the most common programs. And that's how you end up with SSH trusting a broken random generator without knowing it was misconfigured. Your getrandom() API does have the ability to fix this. In my opinion the best way to proceed is to consider that all those who don't care about randomness quality never block and that those who care can be sure they will either get good randoms or will know about it. Ideally calling getrandom() without any flag should be equivalent to what you have with /dev/urandom and be good enough to put a UUID on a file system. And calling it with "SECURE" or something like this will be the indication that it will not betray you and will only return good randoms (which is what GRND_RANDOM does in my opinion). The huge difference between getrandom() and /dev/*random here is that each application can decide what type of random to use without relying on what system-wide breakage was applied just for the sake of fixing another simple application. This could even help OpenSSL use two different calls for RAND_bytes() and RAND_pseudo_bytes(), instead of using the same call and blocking. Last but not least, I think we need to educate developers regarding random number consumption, asking "if you could produce only 16 bytes of random in your whole system's lifetime, where would you use them?". Entropy is extremely precious and yet the most poorly used resource. I almost wouldn't mind seeing GRND_RANDOM requiring a special capability since it does have a system-wide impact! Regards, Willy
Re: Linux 5.3-rc8
On Sa, 14.09.19 09:30, Linus Torvalds (torva...@linux-foundation.org) wrote: > > => src/random-seed/random-seed.c: > > /* > > * Let's make this whole job asynchronous, i.e. let's make > > * ourselves a barrier for proper initialization of the > > * random pool. > > */ > > k = getrandom(buf, buf_size, GRND_NONBLOCK); > > if (k < 0 && errno == EAGAIN && synchronous) { > > log_notice("Kernel entropy pool is not initialized yet, " > > "waiting until it is."); > > > > k = getrandom(buf, buf_size, 0); /* retry synchronously */ > > } > > Yeah, the above is yet another example of completely broken garbage. > > You can't just wait and block at boot. That is simply 100% > unacceptable, and always has been, exactly because that may > potentially mean waiting forever since you didn't do anything that > actually is likely to add any entropy. Oh man. Just spend 5min to understand the situation, before claiming this was garbage or that was garbage. The code above does not block boot. It blocks startup of services that explicit order themselves after the code above. There's only a few services that should do that, and the main system boots up just fine without waiting for this. Primary example for stuff that orders itself after the above, correctly: cryptsetup entries that specify /dev/urandom as password source (i.e. swap space and stuff, that wants a new key on every boot). If we don't wait for the initialized pool for cases like that the password for that swap space is not actually going to be random, and that defeats its purpose. Another example: the storing of an updated random seed file on disk. We should only do that if the seed on disk is actually properly random, i.e. comes from an initialized pool. Hence we wait for the pool to be initialized before reading the seed from the pool, and writing it to disk. I'd argue that doing things like this is not "garbage", like you say, but *necessary* to make this stuff safe and secure. And no, other stuff is not delayed for this (but there are bugs of course, some random services in 3rd party packages that set too agressive deps, but that needs to be fixed there, and not in the kernel). Anyway, I really don't appreciate your tone, and being sucked into messy LKML discussions. I generally stay away from LKML, and gah, you remind me why. Just tone it down, not everything you never bothered to understand is "garbage". And please don't break /dev/urandom again. The above code is the ony way I see how we can make /dev/urandom-derived swap encryption safe, and the only way I can see how we can sanely write a valid random seed to disk after boot. You guys changed semantics on /dev/urandom all the time in the past, don't break API again, thank you very much. Lennart
[PATCH v2 0/2] Amazon's Annapurna Labs Memory Controller EDAC
This series introduces support for Amazon's Annapurna Labs Memory Controller EDAC driver. Changes since v1: = - updated dt binding node name and added Rob Reviewed-By - removed auto selecting of this driver Talel Shenhar (2): dt-bindings: edac: al-mc-edac: Amazon's Annapurna Labs Memory Controller EDAC EDAC: al-mc-edac: Introduce Amazon's Annapurna Labs Memory Controller EDAC .../devicetree/bindings/edac/amazon,al-mc-edac.txt | 24 ++ MAINTAINERS| 7 + drivers/edac/Kconfig | 7 + drivers/edac/Makefile | 1 + drivers/edac/al_mc_edac.c | 382 + 5 files changed, 421 insertions(+) create mode 100644 Documentation/devicetree/bindings/edac/amazon,al-mc-edac.txt create mode 100644 drivers/edac/al_mc_edac.c -- 2.7.4
[PATCH v2 2/2] EDAC: al-mc-edac: Introduce Amazon's Annapurna Labs Memory Controller EDAC
The Amazon's Annapurna Labs Memory Controller EDAC supports ECC capability for error detection and correction (Single bit error correction, Double detection). This driver introduces EDAC driver for that capability. Signed-off-by: Talel Shenhar --- MAINTAINERS | 7 + drivers/edac/Kconfig | 7 + drivers/edac/Makefile | 1 + drivers/edac/al_mc_edac.c | 382 ++ 4 files changed, 397 insertions(+) create mode 100644 drivers/edac/al_mc_edac.c diff --git a/MAINTAINERS b/MAINTAINERS index e81e60b..4b91e21 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -751,6 +751,13 @@ F: drivers/tty/serial/altera_jtaguart.c F: include/linux/altera_uart.h F: include/linux/altera_jtaguart.h +AMAZON ANNAPURNA LABS MEMORY CONTROLLER EDAC +M: Talel Shenhar +M: Talel Shenhar +S: Maintained +F: Documentation/devicetree/bindings/edac/amazon,al-mc-edac.txt +F: drivers/edac/al_mc_edac.c + AMAZON ANNAPURNA LABS THERMAL MMIO DRIVER M: Talel Shenhar S: Maintained diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig index 200c04c..e8109c1 100644 --- a/drivers/edac/Kconfig +++ b/drivers/edac/Kconfig @@ -100,6 +100,13 @@ config EDAC_AMD64_ERROR_INJECTION In addition, there are two control files, inject_read and inject_write, which trigger the DRAM ECC Read and Write respectively. +config EDAC_AL_MC + bool "Amazon's Annapurna Lab EDAC Memory Controller" + depends on ARCH_ALPINE + help + Support for error detection and correction for Amazon's Annapurna + Labs Alpine chips which allows 1 bit correction and 2 bits detection. + config EDAC_AMD76X tristate "AMD 76x (760, 762, 768)" depends on PCI && X86_32 diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index 165ca65e..f6c3a40 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -22,6 +22,7 @@ obj-$(CONFIG_EDAC_GHES) += ghes_edac.o edac_mce_amd-y := mce_amd.o obj-$(CONFIG_EDAC_DECODE_MCE) += edac_mce_amd.o +obj-$(CONFIG_EDAC_AL_MC) += al_mc_edac.o obj-$(CONFIG_EDAC_AMD76X) += amd76x_edac.o obj-$(CONFIG_EDAC_CPC925) += cpc925_edac.o obj-$(CONFIG_EDAC_I5000) += i5000_edac.o diff --git a/drivers/edac/al_mc_edac.c b/drivers/edac/al_mc_edac.c new file mode 100644 index 000..f9763d4 --- /dev/null +++ b/drivers/edac/al_mc_edac.c @@ -0,0 +1,382 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + */ +#include +#include +#include +#include "edac_module.h" + +/* Registers Offset */ +#define AL_MC_MSTR 0x00 +#define AL_MC_ECC_CFG 0x70 +#define AL_MC_ECC_CLEAR0x7c +#define AL_MC_ECC_ERR_COUNT0x80 +#define AL_MC_ECC_CE_ADDR0 0x84 +#define AL_MC_ECC_CE_ADDR1 0x88 +#define AL_MC_ECC_UE_ADDR0 0xa4 +#define AL_MC_ECC_UE_ADDR1 0xa8 +#define AL_MC_ECC_CE_SYND0 0x8c +#define AL_MC_ECC_CE_SYND1 0x90 +#define AL_MC_ECC_CE_SYND2 0x94 +#define AL_MC_ECC_UE_SYND0 0xac +#define AL_MC_ECC_UE_SYND1 0xb0 +#define AL_MC_ECC_UE_SYND2 0xb4 + +/* Registers Fields */ +#define AL_MC_MSTR_DEV_CFG GENMASK(31, 30) +#define AL_MC_MSTR_RANKS GENMASK(27, 24) +#define AL_MC_MSTR_DATA_BUS_WIDTH GENMASK(13, 12) +#define AL_MC_MSTR_DDR4BIT(4) +#define AL_MC_MSTR_DDR3BIT(0) + +#define AL_MC_ECC_CFG_SCRUB_DISABLED BIT(4) +#define AL_MC_ECC_CFG_ECC_MODE GENMASK(2, 0) + +#define AL_MC_ECC_CLEAR_UE_COUNT BIT(3) +#define AL_MC_ECC_CLEAR_CE_COUNT BIT(2) +#define AL_MC_ECC_CLEAR_UE_ERR BIT(1) +#define AL_MC_ECC_CLEAR_CE_ERR BIT(0) + +#define AL_MC_ECC_ERR_COUNT_UE GENMASK(31, 16) +#define AL_MC_ECC_ERR_COUNT_CE GENMASK(15, 0) + +#define AL_MC_ECC_CE_ADDR0_RANKGENMASK(25, 24) +#define AL_MC_ECC_CE_ADDR0_ROW GENMASK(17, 0) + +#define AL_MC_ECC_CE_ADDR1_BG GENMASK(25, 24) +#define AL_MC_ECC_CE_ADDR1_BANKGENMASK(18, 16) +#define AL_MC_ECC_CE_ADDR1_COLUMN GENMASK(11, 0) + +#define AL_MC_ECC_UE_ADDR0_RANKGENMASK(25, 24) +#define AL_MC_ECC_UE_ADDR0_ROW GENMASK(17, 0) + +#define AL_MC_ECC_UE_ADDR1_BG GENMASK(25, 24) +#define AL_MC_ECC_UE_ADDR1_BANKGENMASK(18, 16) +#define AL_MC_ECC_UE_ADDR1_COLUMN GENMASK(11, 0) + +/* Registers Values */ +#define AL_MC_MSTR_DEV_CFG_X4 0 +#define AL_MC_MSTR_DEV_CFG_X8 1 +#define AL_MC_MSTR_DEV_CFG_X16 2 +#define AL_MC_MSTR_DEV_CFG_X32 3 +#define AL_MC_MSTR_RANKS_MAX 4 +#define AL_MC_MSTR_DATA_BUS_WIDTH_X64 0 + +#define DRV_NAME "al_mc_edac" +#define AL_MC_EDAC_MSG_MAX 256 +#define AL_MC_EDAC_MSG(message, buffer_size, type, \ + rank, row, bg, bank, column, syn0,
[PATCH v2 1/2] dt-bindings: edac: al-mc-edac: Amazon's Annapurna Labs Memory Controller EDAC
Document Amazon's Annapurna Labs Memory Controller EDAC SoC binding. Signed-off-by: Talel Shenhar Reviewed-by: Rob Herring --- .../devicetree/bindings/edac/amazon,al-mc-edac.txt | 24 ++ 1 file changed, 24 insertions(+) create mode 100644 Documentation/devicetree/bindings/edac/amazon,al-mc-edac.txt diff --git a/Documentation/devicetree/bindings/edac/amazon,al-mc-edac.txt b/Documentation/devicetree/bindings/edac/amazon,al-mc-edac.txt new file mode 100644 index 000..4f612e0 --- /dev/null +++ b/Documentation/devicetree/bindings/edac/amazon,al-mc-edac.txt @@ -0,0 +1,24 @@ +Amazon's Annapurna Labs Memory Controller EDAC + +EDAC node is defined to describe on-chip error detection and correction for +Amazon's Annapurna Labs Memory Controller. + +Required properties: +- compatible: Shall be "amazon,al-mc-edac". +- reg: DDR controller resource. + +Optional: +- interrupt-names: may include "ue", for uncorrectable errors, + and/or "ce", for correctable errors. +- interrupts: should contain the interrupts associated with the + interrupts names. + +Example: + +edac@f008 { + compatible = "amazon,al-mc-edac"; + reg = <0x0 0xf008 0x0 0x0001>; + interrupt-parent = <&amazon_al_system_fabric>; + interrupt-names = "ue"; + interrupts = <20 IRQ_TYPE_LEVEL_HIGH>; +}; -- 2.7.4
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 10:05:21PM -0400, Theodore Y. Ts'o wrote: > I'd be willing to let it take at least 2 minutes, since that's slow > enough to be annoying. It's an eternity, and prevents a backup system from being turned on in time to replace a dead system. In fact the main problem with this is that it destroys uptime on already configured systems for the sake of making sure a private SSH key is produce correctly. It turns out that if we instead give the info to this tool that the produced random is not strong, this only tool that requires good entropy will be able to ask the user to type something to add real entropy. But making the system wait forever will not bring any extra entropy because the services cannot start, it will not even receive network traffic and will not be able to collect entropy. Sorry Ted, but I've been hit by this already. It's a real problem to see a system not finish to boot after a crash when you know your systems have only 5 minutes of total downtime allowed per year (5 nines). And when the SSH keys, like the rest of the config, were supposed to be either synchronized from the network or pre-populated in a system image, nobody finds this a valid justification for an extended downtime. > Except the developer could (and *has) just ignored the warning, which > is what happened with /dev/urandom when it was accessed too early. That's why it's nice to have getrandom() return the error : it will for once allow the developer of the program to care depending on the program. Those proposing to choose the pieces to present in Tetris will not care, those trying to generate an SSH key will care and will have solid and well known fallbacks. And the rare ones who need good randoms and ignore the error will be the ones *responsible* for this, it will not be the kernel anymore giving bad random. BTW I was thinking that EAGAIN was semantically better than EINVAL to indicate that the same call should be done with blocking. Just my two cents, Willy
Re: [PATCH] ARM: dts: imx6dl: SolidRun: add phy node with 100Mb/s max-speed
Hi Andrew, On Tue, Sep 10 2019, Andrew Lunn wrote: > On Tue, Sep 10, 2019 at 06:55:07PM +0300, tinywrkb wrote: >> Cubox-i Solo/DualLite carrier board has 100Mb/s magnetics while the >> Atheros AR8035 PHY on the MicroSoM v1.3 CPU module is a 1GbE PHY device. >> >> Since commit 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in >> genphy_read_status") ethernet is broken on Cubox-i Solo/DualLite devices. > > Hi Tinywrkb > > You emailed lots of people, but missed the PHY maintainers :-( > > Are you sure this is the patch which broken it? Did you do a git > bisect. Tinywrkb confirmed to me in private communication that revert of 5502b218e001 fixes Ethernet for him on effected system. He also referred me to an old Cubox-i spec that lists 10/100 Ethernet only for i.MX6 Solo/DualLite variants of Cubox-i. It turns out that there was a plan to use a different 10/100 PHY for Solo/DualLite SOMs. This plan never materialized. All SolidRun i.MX6 SOMs use the same AR8035 PHY that supports 1Gb. Commit 5502b218e001 might be triggering a hardware issue on the affected Cubox-i. I could not reproduce the issue here with Cubox-i and a Dual SOM variant running v5.3-rc8. I have no Solo/DualLite variant handy at the moment. baruch -- http://baruch.siach.name/blog/ ~. .~ Tk Open Systems =}ooO--U--Ooo{= - bar...@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -
Re: [PATCH net-next v8 2/3] net: phy: add support for clause 37 auto-negotiation
On 9/14/19 7:17 AM, Andrew Lunn wrote: > On Mon, Sep 09, 2019 at 01:49:06PM -0700, Tao Ren wrote: >> From: Heiner Kallweit >> >> This patch adds support for clause 37 1000Base-X auto-negotiation. >> >> Signed-off-by: Heiner Kallweit >> Signed-off-by: Tao Ren >> Tested-by: René van Dorst > > Reviewed-by: Andrew Lunn > > Andrew Thanks a lot, Andrew. Cheers, Tao
Re: Re: [PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
On 13/09/2019 6:09,Valentin Schneider wrote: >> From 089dbf0216628ac6ae98742ab90725ca9c2bf201 Mon Sep 17 00:00:00 2001 >> From: >> Date: Tue, 10 Sep 2019 09:44:58 -0400 >> Subject: [PATCH] sched: fix migration to invalid cpu in >> __set_cpus_allowed_ptr >> >> reason: migration to invalid cpu in __set_cpus_allowed_ptr archive >> path: patches/euleros/sched >> > >The above should probably be trimmed from the log. > Thanks for reminding me. I will remove this part in a new patch. >> Oops occur when running qemu on arm64: >> Unable to handle kernel paging request at virtual address >> 08effe40 Internal error: Oops: 9607 [#1] SMP Process >> migration/0 (pid: 12, stack limit = 0x084e3736) >> pstate: 2085 (nzCv daIf -PAN -UAO) pc : >> __ll_sc___cmpxchg_case_acq_4+0x4/0x20 >> lr : move_queued_task.isra.21+0x124/0x298 >> ... >> Call trace: >> __ll_sc___cmpxchg_case_acq_4+0x4/0x20 >> __migrate_task+0xc8/0xe0 >> migration_cpu_stop+0x170/0x180 >> cpu_stopper_thread+0xec/0x178 >> smpboot_thread_fn+0x1ac/0x1e8 >> kthread+0x134/0x138 >> ret_from_fork+0x10/0x18 >> >> __set_cpus_allowed_ptr will choose an active dest_cpu in affinity mask >> to migrage the process if process is not currently running on any one >> of the CPUs specified in affinity mask.__set_cpus_allowed_ptr will >> choose an invalid dest_cpu(>= nr_cpu_ids, 1024 in my virtual machine) if >> CPUS in affinity mask are deactived by cpu_down after cpumask_intersects >> check. > >Right, cpumask_any_and() returns >= nr_cpu_ids when there isn't any valid CPU >bit set. > >> Cpumask_test_cpu of dest_cpu afterwards is overflow and may passes if >> corresponding bit is coincidentally set. > >Ouch. I was going to say the cpu_active_mask check from is_cpu_allowed() >should've stopped the whole thing there, but AFAICT there's no safeguard >against > nr_cpu_ids bit accesses. I see CONFIG_DEBUG_PER_CPU_MAPS should fire >a warning for such accesses, but we don't enable it by default. > >Would it make sense to do something like > > return test_bit(...) && cpu < nr_cpu_ids; > >for cpumask_test_cpu()? We still get the warn with the right config, but we >prevent sneaky mistakes like this one. And it seems it's not the only one >according to: > >-- >virtual patch >virtual report > >@nocheck@ >expression E; >identifier any_func =~ "^cpumask_any"; >position pos; >@@ > >E@pos = any_func(...); >... when != E >= nr_cpu_ids >when != E < nr_cpu_ids > >@script:python depends on nocheck && report@ p << nocheck.pos; @@ >coccilib.report.print_report(p[0], "Missing cpumask_any_*() return value >check!") >--- > >Some of those seem benign since they are on e.g. cpu_online_mask, some other >are somewhat dubious (e.g. deadline.c::dl_task_can_attach()). > >As a consequence, kernel will access a invalid rq address associate with the >invalid cpu in It's more thoughtful to add check in cpumask_test_cpu.It can solve this problem and can prevent other potential bugs.I will test it and resend a new patch. >> migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. >> Process as follows may trigger the Oops: >> 1) A process repeatedly bind itself to cpu0 and cpu1 in turn by >> calling sched_setaffinity >> 2) A shell script repeatedly "echo 0 > >> /sys/devices/system/cpu/cpu1/online" and "echo 1 > >> /sys/devices/system/cpu/cpu1/online" in turn >> 3) Oops appears if the invalid cpu is set in memory after tested cpumask. >> >> >> Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40 > >- This doesn't respect the 75 char per line limit >- Change IDs don't belong here (we're not using Gerrit) >- You're missing a Signed-off-by. > >You'll find all the guidelines you need for formatting patches in >Documentation/process/submitting-patches.rst. > Thanks for the guide.I will read it carefully before send next patch. >> --- >> kernel/sched/core.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c index >> 4b63fef..5181ea9 100644 >> --- a/kernel/sched/core.c >> +++ b/kernel/sched/core.c >> @@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct >> *p, >> if (cpumask_equal(&p->cpus_allowed, new_mask)) >> goto out; >> >> - if (!cpumask_intersects(new_mask, cpu_valid_mask)) { >> + dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask); >> + if (dest_cpu >= nr_cpu_ids) { >> ret = -EINVAL; >> goto out; >> } >> @@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct >> *p, >> if (cpumask_test_cpu(task_cpu(p), new_mask)) >> goto out; >> >> - dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask);> if >> (task_running(rq, p) || p->state == TASK_WAKING) { >> struct migration_arg arg = { p, dest_cpu }; >> /* Need help from migration thread: drop lock and wait. */ >> -- >> 1.8.5.6 >> Thansks for your review and advises.
Re: [PATCH v2] net: mdio: switch to using gpiod_get_optional()
On Sat, Sep 14, 2019 at 08:09:33PM +0300, Andy Shevchenko wrote: > On Fri, Sep 13, 2019 at 03:55:47PM -0700, Dmitry Torokhov wrote: > > The MDIO device reset line is optional and now that gpiod_get_optional() > > returns proper value when GPIO support is compiled out, there is no > > reason to use fwnode_get_named_gpiod() that I plan to hide away. > > > > Let's switch to using more standard gpiod_get_optional() and > > gpiod_set_consumer_name() to keep the nice "PHY reset" label. > > > > Also there is no reason to only try to fetch the reset GPIO when we have > > OF node, gpiolib can fetch GPIO data from firmwares as well. > > > > Reviewed-by: Andy Shevchenko Thanks Andy. > > But see comment below. > > > + mdiodev->reset_gpio = gpiod_get_optional(&mdiodev->dev, > > +"reset", GPIOD_OUT_LOW); > > + error = PTR_ERR_OR_ZERO(mdiodev->reset_gpio); > > + if (error) > > + return error; > > + > > > + if (mdiodev->reset_gpio) > > This is redundant check. I see that gpiod_* API handle NULL desc and usually return immediately, but frankly I am not that comfortable with it. I'm OK with functions that free/destroy objects that recognize NULL resources, but it is unusual for other types of APIs. Thanks. -- Dmitry
[RFC 4/4] counter: 104-quad-8: Update count_read/count_write/signal_read callbacks
The count_read and count_write callbacks pass unsigned long now, while the signal_read callback passes an enum counter_signal_value. Signed-off-by: William Breathitt Gray --- drivers/counter/104-quad-8.c | 33 ++--- 1 file changed, 10 insertions(+), 23 deletions(-) diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index 00b113f4b958..17e67a84777d 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -562,11 +562,10 @@ static const struct iio_chan_spec quad8_channels[] = { }; static int quad8_signal_read(struct counter_device *counter, - struct counter_signal *signal, struct counter_signal_read_value *val) + struct counter_signal *signal, enum counter_signal_value *val) { const struct quad8_iio *const priv = counter->priv; unsigned int state; - enum counter_signal_level level; /* Only Index signal levels can be read */ if (signal->id < 16) @@ -575,22 +574,19 @@ static int quad8_signal_read(struct counter_device *counter, state = inb(priv->base + QUAD8_REG_INDEX_INPUT_LEVELS) & BIT(signal->id - 16); - level = (state) ? COUNTER_SIGNAL_LEVEL_HIGH : COUNTER_SIGNAL_LEVEL_LOW; - - counter_signal_read_value_set(val, COUNTER_SIGNAL_LEVEL, &level); + *val = (state) ? COUNTER_SIGNAL_HIGH : COUNTER_SIGNAL_LOW; return 0; } static int quad8_count_read(struct counter_device *counter, - struct counter_count *count, struct counter_count_read_value *val) + struct counter_count *count, unsigned long *val) { const struct quad8_iio *const priv = counter->priv; const int base_offset = priv->base + 2 * count->id; unsigned int flags; unsigned int borrow; unsigned int carry; - unsigned long position; int i; flags = inb(base_offset + 1); @@ -598,36 +594,27 @@ static int quad8_count_read(struct counter_device *counter, carry = !!(flags & QUAD8_FLAG_CT); /* Borrow XOR Carry effectively doubles count range */ - position = (unsigned long)(borrow ^ carry) << 24; + *val = (unsigned long)(borrow ^ carry) << 24; /* Reset Byte Pointer; transfer Counter to Output Latch */ outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_CNTR_OUT, base_offset + 1); for (i = 0; i < 3; i++) - position |= (unsigned long)inb(base_offset) << (8 * i); - - counter_count_read_value_set(val, COUNTER_COUNT_POSITION, &position); + *val |= (unsigned long)inb(base_offset) << (8 * i); return 0; } static int quad8_count_write(struct counter_device *counter, - struct counter_count *count, struct counter_count_write_value *val) + struct counter_count *count, unsigned long val) { const struct quad8_iio *const priv = counter->priv; const int base_offset = priv->base + 2 * count->id; - int err; - unsigned long position; int i; - err = counter_count_write_value_get(&position, COUNTER_COUNT_POSITION, - val); - if (err) - return err; - /* Only 24-bit values are supported */ - if (position > 0xFF) + if (val > 0xFF) return -EINVAL; /* Reset Byte Pointer */ @@ -635,7 +622,7 @@ static int quad8_count_write(struct counter_device *counter, /* Counter can only be set via Preset Register */ for (i = 0; i < 3; i++) - outb(position >> (8 * i), base_offset); + outb(val >> (8 * i), base_offset); /* Transfer Preset Register to Counter */ outb(QUAD8_CTR_RLD | QUAD8_RLD_PRESET_CNTR, base_offset + 1); @@ -644,9 +631,9 @@ static int quad8_count_write(struct counter_device *counter, outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1); /* Set Preset Register back to original value */ - position = priv->preset[count->id]; + val = priv->preset[count->id]; for (i = 0; i < 3; i++) - outb(position >> (8 * i), base_offset); + outb(val >> (8 * i), base_offset); /* Reset Borrow, Carry, Compare, and Sign flags */ outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_FLAGS, base_offset + 1); -- 2.23.0
[RFC 1/4] counter: Simplify the count_read and count_write callbacks
The count_read and count_write callbacks are simplified to pass val as unsigned long rather than as an opaque data structure. The opaque counter_count_read_value and counter_count_write_value structures, counter_count_value_type enum, and relevant counter_count_read_value_set and counter_count_write_value_get functions, are removed as they are no longer used. Signed-off-by: William Breathitt Gray --- drivers/counter/counter.c | 66 +-- include/linux/counter.h | 43 +++-- 2 files changed, 12 insertions(+), 97 deletions(-) diff --git a/drivers/counter/counter.c b/drivers/counter/counter.c index 106bc7180cd8..1d08f1437b1b 100644 --- a/drivers/counter/counter.c +++ b/drivers/counter/counter.c @@ -246,60 +246,6 @@ void counter_signal_read_value_set(struct counter_signal_read_value *const val, } EXPORT_SYMBOL_GPL(counter_signal_read_value_set); -/** - * counter_count_read_value_set - set counter_count_read_value data - * @val: counter_count_read_value structure to set - * @type: property Count data represents - * @data: Count data - * - * This function sets an opaque counter_count_read_value structure with the - * provided Count data. - */ -void counter_count_read_value_set(struct counter_count_read_value *const val, - const enum counter_count_value_type type, - void *const data) -{ - switch (type) { - case COUNTER_COUNT_POSITION: - val->len = sprintf(val->buf, "%lu\n", *(unsigned long *)data); - break; - default: - val->len = 0; - } -} -EXPORT_SYMBOL_GPL(counter_count_read_value_set); - -/** - * counter_count_write_value_get - get counter_count_write_value data - * @data: Count data - * @type: property Count data represents - * @val: counter_count_write_value structure containing data - * - * This function extracts Count data from the provided opaque - * counter_count_write_value structure and stores it at the address provided by - * @data. - * - * RETURNS: - * 0 on success, negative error number on failure. - */ -int counter_count_write_value_get(void *const data, - const enum counter_count_value_type type, - const struct counter_count_write_value *const val) -{ - int err; - - switch (type) { - case COUNTER_COUNT_POSITION: - err = kstrtoul(val->buf, 0, data); - if (err) - return err; - break; - } - - return 0; -} -EXPORT_SYMBOL_GPL(counter_count_write_value_get); - struct counter_attr_parm { struct counter_device_attr_group *group; const char *prefix; @@ -788,13 +734,13 @@ static ssize_t counter_count_show(struct device *dev, const struct counter_count_unit *const component = devattr->component; struct counter_count *const count = component->count; int err; - struct counter_count_read_value val = { .buf = buf }; + unsigned long val; err = counter->ops->count_read(counter, count, &val); if (err) return err; - return val.len; + return sprintf(buf, "%lu\n", val); } static ssize_t counter_count_store(struct device *dev, @@ -806,9 +752,13 @@ static ssize_t counter_count_store(struct device *dev, const struct counter_count_unit *const component = devattr->component; struct counter_count *const count = component->count; int err; - struct counter_count_write_value val = { .buf = buf }; + unsigned long val; + + err = kstrtoul(buf, 0, &val); + if (err) + return err; - err = counter->ops->count_write(counter, count, &val); + err = counter->ops->count_write(counter, count, val); if (err) return err; diff --git a/include/linux/counter.h b/include/linux/counter.h index a061cdcdef7c..7e40796598a6 100644 --- a/include/linux/counter.h +++ b/include/linux/counter.h @@ -300,24 +300,6 @@ struct counter_signal_read_value { size_t len; }; -/** - * struct counter_count_read_value - Opaque Count read value - * @buf: string representation of Count read value - * @len: length of string in @buf - */ -struct counter_count_read_value { - char *buf; - size_t len; -}; - -/** - * struct counter_count_write_value - Opaque Count write value - * @buf: string representation of Count write value - */ -struct counter_count_write_value { - const char *buf; -}; - /** * struct counter_ops - Callbacks from driver * @signal_read: optional read callback for Signal attribute. The read @@ -328,15 +310,10 @@ struct counter_count_write_value { * signal_read callback. * @count_read:optional read callback for Count attribute. The read * value of the respect
[RFC 3/4] docs: driver-api: generic-counter: Update Count and Signal data types
Count data is now always represented as an unsigned integer, while Signal data is either SIGNAL_LOW or SIGNAL_HIGH. Signed-off-by: William Breathitt Gray --- Documentation/driver-api/generic-counter.rst | 22 +++- 1 file changed, 8 insertions(+), 14 deletions(-) diff --git a/Documentation/driver-api/generic-counter.rst b/Documentation/driver-api/generic-counter.rst index 8382f01a53e3..161652fc1025 100644 --- a/Documentation/driver-api/generic-counter.rst +++ b/Documentation/driver-api/generic-counter.rst @@ -39,10 +39,7 @@ There are three core components to a counter: COUNT - A Count represents the count data for a set of Signals. The Generic -Counter interface provides the following available count data types: - -* COUNT_POSITION: - Unsigned integer value representing position. +Counter interface represents the count data as an unsigned integer. A Count has a count function mode which represents the update behavior for the count data. The Generic Counter interface provides the following @@ -93,19 +90,16 @@ SIGNAL A Signal represents a counter input data; this is the input data that is evaluated by the counter to determine the count data; e.g. a quadrature signal output line of a rotary encoder. Not all counter devices provide -user access to the Signal data. - -The Generic Counter interface provides the following available signal -data types for when the Signal data is available for user access: +user access to the Signal data, so exposure is optional for drivers. -* SIGNAL_LEVEL: - Signal line state level. The following states are possible: +When the Signal data is available for user access, the Generic Counter +interface provides the following available signal values: - - SIGNAL_LEVEL_LOW: -Signal line is in a low state. +* SIGNAL_LOW: + Signal line is in a low state. - - SIGNAL_LEVEL_HIGH: -Signal line is in a high state. +* SIGNAL_HIGH: + Signal line is in a high state. A Signal may be associated with one or more Counts. -- 2.23.0
[RFC 2/4] counter: Simplify the signal_read callback
The signal_read callback is simplified to pass val as a counter_signal_val enum rather than as an opaque data structure. The opaque counter_signal_read_value structure and relevant counter_signal_read_value_set function are removed as they are no longer used. In addition, the counter_signal_level enum is replaced by the similar counter_signal_value enum. Signed-off-by: William Breathitt Gray --- drivers/counter/counter.c | 35 +++ include/linux/counter.h | 31 +-- 2 files changed, 12 insertions(+), 54 deletions(-) diff --git a/drivers/counter/counter.c b/drivers/counter/counter.c index 1d08f1437b1b..6a683d086008 100644 --- a/drivers/counter/counter.c +++ b/drivers/counter/counter.c @@ -220,32 +220,6 @@ ssize_t counter_device_enum_available_read(struct counter_device *counter, } EXPORT_SYMBOL_GPL(counter_device_enum_available_read); -static const char *const counter_signal_level_str[] = { - [COUNTER_SIGNAL_LEVEL_LOW] = "low", - [COUNTER_SIGNAL_LEVEL_HIGH] = "high" -}; - -/** - * counter_signal_read_value_set - set counter_signal_read_value data - * @val: counter_signal_read_value structure to set - * @type: property Signal data represents - * @data: Signal data - * - * This function sets an opaque counter_signal_read_value structure with the - * provided Signal data. - */ -void counter_signal_read_value_set(struct counter_signal_read_value *const val, - const enum counter_signal_value_type type, - void *const data) -{ - if (type == COUNTER_SIGNAL_LEVEL) - val->len = sprintf(val->buf, "%s\n", - counter_signal_level_str[*(enum counter_signal_level *)data]); - else - val->len = 0; -} -EXPORT_SYMBOL_GPL(counter_signal_read_value_set); - struct counter_attr_parm { struct counter_device_attr_group *group; const char *prefix; @@ -315,6 +289,11 @@ struct counter_signal_unit { struct counter_signal *signal; }; +static const char *const counter_signal_value_str[] = { + [COUNTER_SIGNAL_LOW] = "low", + [COUNTER_SIGNAL_HIGH] = "high" +}; + static ssize_t counter_signal_show(struct device *dev, struct device_attribute *attr, char *buf) { @@ -323,13 +302,13 @@ static ssize_t counter_signal_show(struct device *dev, const struct counter_signal_unit *const component = devattr->component; struct counter_signal *const signal = component->signal; int err; - struct counter_signal_read_value val = { .buf = buf }; + enum counter_signal_value val; err = counter->ops->signal_read(counter, signal, &val); if (err) return err; - return val.len; + return sprintf(buf, "%s\n", counter_signal_value_str[val]); } struct counter_name_unit { diff --git a/include/linux/counter.h b/include/linux/counter.h index 7e40796598a6..32fb4d8cc3fd 100644 --- a/include/linux/counter.h +++ b/include/linux/counter.h @@ -290,24 +290,16 @@ struct counter_device_state { const struct attribute_group **groups; }; -/** - * struct counter_signal_read_value - Opaque Signal read value - * @buf: string representation of Signal read value - * @len: length of string in @buf - */ -struct counter_signal_read_value { - char *buf; - size_t len; +enum counter_signal_value { + COUNTER_SIGNAL_LOW = 0, + COUNTER_SIGNAL_HIGH }; /** * struct counter_ops - Callbacks from driver * @signal_read: optional read callback for Signal attribute. The read * value of the respective Signal should be passed back via - * the val parameter. val points to an opaque type which - * should be set only by calling the - * counter_signal_read_value_set function from within the - * signal_read callback. + * the val parameter. * @count_read:optional read callback for Count attribute. The read * value of the respective Count should be passed back via * the val parameter. @@ -332,7 +324,7 @@ struct counter_signal_read_value { struct counter_ops { int (*signal_read)(struct counter_device *counter, struct counter_signal *signal, - struct counter_signal_read_value *val); + enum counter_signal_value *val); int (*count_read)(struct counter_device *counter, struct counter_count *count, unsigned long *val); int (*count_write)(struct counter_device *counter, @@ -452,19 +444,6 @@ struct counter_device { void *priv; }; -enum counter_signal_level { - COUNTER_SIGNAL_LEVEL_LOW = 0, - COUNTER_SIGNAL_LEVEL_HIGH -}; - -enum counter_sig
[RFC 0/4] counter: Simplify count_read/count_write/signal_read
I have a simplification for the the Counter subsystem callbacks, but I want to get some comments first before committing further. Since this is an RFC, I've included updates to the 104-QUAD-8 driver here for the sake of demonstration; if the comments received are positive, I'll submit the changes for the rest of the existing counter drivers as well. The changes in this patchset will not affect the userspace interface. Rather, these changes are intended to simplify the kernelspace Counter callbacks for counter device driver authors. The following main changes are proposed: * Retire the opaque counter_count_read_value/counter_count_write_value structures and simply represent count data as an unsigned integer. * Retire the opaque counter_signal_read_value structure and represent Signal data as a counter_signal_value enum. These changes should reduce some complexity and code in the use and implementation of the count_read, count_write, and signal_read callbacks. The opaque structures for Count data and Signal data were introduced originally in anticipation of supporting various representations of counter data (e.g. arbitrary-precision tallies, floating-point spherical coordinate positions, etc). However, with the counter device drivers that have appeared, it's become apparent that utilizing opaque structures in kernelspace is not the best approach to take. I believe it is best to let userspace applications decide how to interpret the count data they receive. There are a couple of reasons why it would be good to do so: * Users use their devices in unexpected ways. For example, a quadrature encoder counter device is typically used to keep track of the position of a motor, but a user could set the device in a pulse-direction mode and instead use it to count sporadic rising edges from an arbitrary signal line unrelated to positioning. Users should have the freedom to decide what their data represents. * Most counter devices represent data as unsigned integers anyway. For example, whether the device is a tally counter or position counter, the count data is represented to the user as an unsigned integer value. So specifying that one device is representing tallies while the other specifies positions does not provide much utility from an interface perspective. For these reasons, the count_read and count_write callbacks have been redefined to pass count data directly as unsigned long instead of passed via opaque structures: count_read(struct counter_device *counter, struct counter_count *count, unsigned long *val); count_write(struct counter_device *counter, struct counter_count *count, unsigned long val); Similarly, the signal_read is redefined to pass Signal data directly as a counter_signal_value enum instead of via an opaque structure: signal_read(struct counter_device *counter, struct counter_signal *signal, enum counter_signal_value *val); The counter_signal_value enum is simply the counter_signal_level enum redefined to remove the references to the Signal data "level" data type. William Breathitt Gray (4): counter: Simplify the count_read and count_write callbacks counter: Simplify the signal_read callback docs: driver-api: generic-counter: Update Count and Signal data types counter: 104-quad-8: Update count_read/count_write/signal_read callbacks Documentation/driver-api/generic-counter.rst | 22 ++-- drivers/counter/104-quad-8.c | 33 ++ drivers/counter/counter.c| 101 +++ include/linux/counter.h | 74 ++ 4 files changed, 42 insertions(+), 188 deletions(-) -- 2.23.0
[PATCH RFC v2] random: optionally block in getrandom(2) when the CRNG is uninitialized
getrandom() has been created as a new and more secure interface for pseudorandom data requests. Unlike /dev/urandom, it unconditionally blocks until the entropy pool has been properly initialized. While getrandom() has no guaranteed upper bound for its waiting time, user-space has been abusing it by issuing the syscall, from shared libraries no less, during the main system boot sequence. Thus, on certain setups where there is no hwrng (embedded), or the hwrng is not trusted by some users (intel RDRAND), or sometimes it's just broken (amd RDRAND), the system boot can be *reliably* blocked. The issue is further exaggerated by recent file-system optimizations, e.g. b03755ad6f33 (ext4: make __ext4_get_inode_loc plug), which merges directory lookup code inode table IO, and thus minimizes the number of disk interrupts and entropy during boot. After that commit, a blocked boot can be reliably reproduced on a Thinkpad E480 laptop with standard ArchLinux user-space. Thus, add an optional configuration option which stops getrandom(2) from blocking, but instead returns "best efforts" randomness, which might not be random or secure at all. This can be controlled via random.getrandom_block boot command line option, and the CONFIG_RANDOM_BLOCK can be used to set the default to be blocking. Since according to the Great Penguin, only incompetent system designers would value "security" ahead of "usability", the default is to be non-blocking. In addition, modify getrandom(2) to complain loudly with a kernel warning when some userspace process is erroneously calling getrandom(2) too early during the boot process. Link: https://lkml.kernel.org/r/CAHk-=wjyH910+JRBdZf_Y9G54c1M=lbf8nkxb6vjcm9xjln...@mail.gmail.com Link: https://lkml.kernel.org/r/20190912034421.GA2085@darwi-home-pc Link: https://lkml.kernel.org/r/20190911173624.gi2...@mit.edu Link: https://lkml.kernel.org/r/20180514003034.gi14...@thunk.org [ Modified by ty...@mit.edu to make the change of getrandom(2) to be non-blocking to be optional. ] Suggested-by: Linus Torvalds Signed-off-by: Ahmed S. Darwish Signed-off-by: Theodore Ts'o --- Here's my take on the patch. I really very strongly believe that the idea of making getrandom(2) non-blocking and to blindly assume that we can load up the buffer with "best efforts" randomness to be a terrible, terrible idea that is going to cause major security problems that we will potentially regret very badly. Linus Torvalds believes I am an incompetent systems designer. So let's do it both ways, and push the decision on the distributor and/or product manufacturer drivers/char/Kconfig | 33 +++-- drivers/char/random.c | 34 +- 2 files changed, 60 insertions(+), 7 deletions(-) diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig index 3e866885a405..337baeca5ebc 100644 --- a/drivers/char/Kconfig +++ b/drivers/char/Kconfig @@ -557,8 +557,6 @@ config ADI and SSM (Silicon Secured Memory). Intended consumers of this driver include crash and makedumpfile. -endmenu - config RANDOM_TRUST_CPU bool "Trust the CPU manufacturer to initialize Linux's CRNG" depends on X86 || S390 || PPC @@ -573,3 +571,34 @@ config RANDOM_TRUST_CPU has not installed a hidden back door to compromise the CPU's random number generation facilities. This can also be configured at boot with "random.trust_cpu=on/off". + +config RANDOM_BLOCK + bool "Block if getrandom is called before CRNG is initialized" + help + Say Y here if you want userspace programs which call + getrandom(2) before the Cryptographic Random Number + Generator (CRNG) is initialized to block until + secure random numbers are available. + + Say N if you believe usability is more important than + security, so if getrandom(2) is called before the CRNG is + initialized, it should not block, but instead return "best + effort" randomness which might not be very secure or random + at all; but at least the system boot will not be delayed by + minutes or hours. + + This can also be controlled at boot with + "random.getrandom_block=on/off". + + Ideally, systems would be configured with hardware random + number generators, and/or configured to trust CPU-provided + RNG's. In addition, userspace should generate cryptographic + keys only as late as possible, when they are needed, instead + of during early boot. (For non-cryptographic use cases, + such as dictionary seeds or MIT Magic Cookies, other + mechanisms such as /dev/urandom or random(3) may be more + appropropriate.) This config option controls what the + kernel should do as a fallback when the non-ideal case + presents itself. + +endmenu diff --git a/drivers/char/random.c b/drivers/char/random.c index 5d5ea4ce1442..243fb4
Re: [PATCH] mm/memblock: cleanup doc
On Thu, Sep 12, 2019 at 08:31:27PM +0800, Cao jin wrote: > fix typos for: > elaboarte -> elaborate > architecure -> architecture > compltes -> completes > > And, convert the markup :c:func:`foo` to foo() as kernel documentation > toolchain can recognize foo() as a function. > > Suggested-by: Mike Rapoport > Signed-off-by: Cao jin Reviewed-by: Mike Rapoport > --- > mm/memblock.c | 44 > 1 file changed, 20 insertions(+), 24 deletions(-) > > diff --git a/mm/memblock.c b/mm/memblock.c > index 7d4f61ae666a..c23b370cc49e 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -57,42 +57,38 @@ > * at build time. The region arrays for the "memory" and "reserved" > * types are initially sized to %INIT_MEMBLOCK_REGIONS and for the > * "physmap" type to %INIT_PHYSMEM_REGIONS. > - * The :c:func:`memblock_allow_resize` enables automatic resizing of > - * the region arrays during addition of new regions. This feature > - * should be used with care so that memory allocated for the region > - * array will not overlap with areas that should be reserved, for > - * example initrd. > + * The memblock_allow_resize() enables automatic resizing of the region > + * arrays during addition of new regions. This feature should be used > + * with care so that memory allocated for the region array will not > + * overlap with areas that should be reserved, for example initrd. > * > * The early architecture setup should tell memblock what the physical > - * memory layout is by using :c:func:`memblock_add` or > - * :c:func:`memblock_add_node` functions. The first function does not > - * assign the region to a NUMA node and it is appropriate for UMA > - * systems. Yet, it is possible to use it on NUMA systems as well and > - * assign the region to a NUMA node later in the setup process using > - * :c:func:`memblock_set_node`. The :c:func:`memblock_add_node` > - * performs such an assignment directly. > + * memory layout is by using memblock_add() or memblock_add_node() > + * functions. The first function does not assign the region to a NUMA > + * node and it is appropriate for UMA systems. Yet, it is possible to > + * use it on NUMA systems as well and assign the region to a NUMA node > + * later in the setup process using memblock_set_node(). The > + * memblock_add_node() performs such an assignment directly. > * > * Once memblock is setup the memory can be allocated using one of the > * API variants: > * > - * * :c:func:`memblock_phys_alloc*` - these functions return the > - * **physical** address of the allocated memory > - * * :c:func:`memblock_alloc*` - these functions return the **virtual** > - * address of the allocated memory. > + * * memblock_phys_alloc*() - these functions return the **physical** > + * address of the allocated memory > + * * memblock_alloc*() - these functions return the **virtual** address > + * of the allocated memory. > * > * Note, that both API variants use implict assumptions about allowed > * memory ranges and the fallback methods. Consult the documentation > - * of :c:func:`memblock_alloc_internal` and > - * :c:func:`memblock_alloc_range_nid` functions for more elaboarte > - * description. > + * of memblock_alloc_internal() and memblock_alloc_range_nid() > + * functions for more elaborate description. > * > - * As the system boot progresses, the architecture specific > - * :c:func:`mem_init` function frees all the memory to the buddy page > - * allocator. > + * As the system boot progresses, the architecture specific mem_init() > + * function frees all the memory to the buddy page allocator. > * > - * Unless an architecure enables %CONFIG_ARCH_KEEP_MEMBLOCK, the > + * Unless an architecture enables %CONFIG_ARCH_KEEP_MEMBLOCK, the > * memblock data structures will be discarded after the system > - * initialization compltes. > + * initialization completes. > */ > > #ifndef CONFIG_NEED_MULTIPLE_NODES > -- > 2.21.0 > > > -- Sincerely yours, Mike.
Why isn't Grsecurity being sued for its long standing GPL violations??
Hi, RMS and Bruce Perens; I noticed that recently Grsecurity's Brad Spengler (who sued you, Bruce, for speaking the truth), decided to "Flex" and basically advertise while chastising the linux community: news.ycombinator.com/item?id=20874470 Another poster then pointed out the history of Grsecurity's copyright violations (as a derivative work that is restricting redistribution), but said he didn't want to say to much, because he didn't want to get sued. He referenced your good write-up on the situation: perens.com/2017/06/28/warning-grsecurity-potential-contributory-infringement-risk-for-customers/ (Which has not changed) Why isn't Grsecurity being sued for it's copyright violations? They've been going on for years now. Clearly their scheme works: it can be shown to a court both the attempt to restrict redistribution was tendered (the agreement) and that said attempt has been successful. Also isn't Open Source Security simply an alter-ego of Brad Spengler? Him being the only employee? Couldn't the corporate veil be pierced and he be found personally liable for any damages?
[PATCH] usbip: vhci_hcd indicate failed message
If the return value of vhci_init_attr_group and sysfs_create_group is non-zero, which mean they failed to init attr_group and create sysfs group, so it would better add 'failed' message to indicate that. Fixes: 0775a9cbc694 ("usbip: vhci extension: modifications to vhci driver") Signed-off-by: Mao Wenan --- drivers/usb/usbip/vhci_hcd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c index 000ab7225717..dd54c95d2498 100644 --- a/drivers/usb/usbip/vhci_hcd.c +++ b/drivers/usb/usbip/vhci_hcd.c @@ -1185,12 +1185,12 @@ static int vhci_start(struct usb_hcd *hcd) if (id == 0 && usb_hcd_is_primary_hcd(hcd)) { err = vhci_init_attr_group(); if (err) { - pr_err("init attr group\n"); + pr_err("init attr group failed\n"); return err; } err = sysfs_create_group(&hcd_dev(hcd)->kobj, &vhci_attr_group); if (err) { - pr_err("create sysfs files\n"); + pr_err("create sysfs failed\n"); vhci_finish_attr_group(); return err; } -- 2.20.1
[PATCH] sched: fix migration to invalid cpu in __set_cpus_allowed_ptr
From: reason: migration to invalid cpu in __set_cpus_allowed_ptr archive path: patches/euleros/sched Oops occur when running qemu on arm64: Unable to handle kernel paging request at virtual address 08effe40 Internal error: Oops: 9607 [#1] SMP Process migration/0 (pid: 12, stack limit = 0x084e3736) pstate: 2085 (nzCv daIf -PAN -UAO) pc : __ll_sc___cmpxchg_case_acq_4+0x4/0x20 lr : move_queued_task.isra.21+0x124/0x298 ... Call trace: __ll_sc___cmpxchg_case_acq_4+0x4/0x20 __migrate_task+0xc8/0xe0 migration_cpu_stop+0x170/0x180 cpu_stopper_thread+0xec/0x178 smpboot_thread_fn+0x1ac/0x1e8 kthread+0x134/0x138 ret_from_fork+0x10/0x18 __set_cpus_allowed_ptr will choose an active dest_cpu in affinity mask to migrage the process if process is not currently running on any one of the CPUs specified in affinity mask.__set_cpus_allowed_ptr will choose an invalid dest_cpu(>= nr_cpu_ids, 1024 in my virtual machine) if CPUS in affinity mask are deactived by cpu_down after cpumask_intersects check.Cpumask_test_cpu of dest_cpu afterwards is overflow and may passes if corresponding bit is coincidentally set.As a consequence, kernel will access a invalid rq address associate with the invalid cpu in migration_cpu_stop->__migrate_task->move_queued_task and the Oops occurs. Process as follows may trigger the Oops: 1) A process repeatedly bind itself to cpu0 and cpu1 in turn by calling sched_setaffinity 2) A shell script repeatedly "echo 0 > /sys/devices/system/cpu/cpu1/online" and "echo 1 > /sys/devices/system/cpu/cpu1/online" in turn 3) Oops appears if the invalid cpu is set in memory after tested cpumask. Change-Id: I9c2f95aecd3da568991b7408397215f26c990e40 Signed-off-by: --- kernel/sched/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 4b63fef..5181ea9 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1112,7 +1112,8 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, if (cpumask_equal(&p->cpus_allowed, new_mask)) goto out; - if (!cpumask_intersects(new_mask, cpu_valid_mask)) { + dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask); + if (dest_cpu >= nr_cpu_ids) { ret = -EINVAL; goto out; } @@ -1133,7 +1134,6 @@ static int __set_cpus_allowed_ptr(struct task_struct *p, if (cpumask_test_cpu(task_cpu(p), new_mask)) goto out; - dest_cpu = cpumask_any_and(cpu_valid_mask, new_mask); if (task_running(rq, p) || p->state == TASK_WAKING) { struct migration_arg arg = { p, dest_cpu }; /* Need help from migration thread: drop lock and wait. */ -- 1.8.3.1
Re: [PATCH V7 2/3] arm64/mm: Hold memory hotplug lock while walking for kernel page table dump
On 3/9/19 7:45 pm, Anshuman Khandual wrote: > The arm64 page table dump code can race with concurrent modification of the > kernel page tables. When a leaf entries are modified concurrently, the dump > code may log stale or inconsistent information for a VA range, but this is > otherwise not harmful. > > When intermediate levels of table are freed, the dump code will continue to > use memory which has been freed and potentially reallocated for another > purpose. In such cases, the dump code may dereference bogus addresses, > leading to a number of potential problems. > > Intermediate levels of table may by freed during memory hot-remove, > which will be enabled by a subsequent patch. To avoid racing with > this, take the memory hotplug lock when walking the kernel page table. > > Acked-by: David Hildenbrand > Acked-by: Mark Rutland > Signed-off-by: Anshuman Khandual > --- > arch/arm64/mm/ptdump_debugfs.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/arch/arm64/mm/ptdump_debugfs.c b/arch/arm64/mm/ptdump_debugfs.c > index 064163f25592..b5eebc8c4924 100644 > --- a/arch/arm64/mm/ptdump_debugfs.c > +++ b/arch/arm64/mm/ptdump_debugfs.c > @@ -1,5 +1,6 @@ > // SPDX-License-Identifier: GPL-2.0 > #include > +#include > #include > > #include > @@ -7,7 +8,10 @@ > static int ptdump_show(struct seq_file *m, void *v) > { > struct ptdump_info *info = m->private; > + > + get_online_mems(); > ptdump_walk_pgd(m, info); > + put_online_mems(); Looks sane, BTW, checking other arches they might have the same race. Is there anything special about the arch? Acked-by: Balbir Singh
media: vimc-debayer: lock problem
Kernel is 5.3-rc8 on x86_64. Loading, unloading, and then loading the vimc-debayer module causes: [ 793.542496] [ cut here ] [ 793.542518] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 793.542536] WARNING: CPU: 3 PID: 2029 at ../kernel/locking/mutex.c:912 __mutex_lock+0xd7c/0x1480 [ 793.542559] Modules linked in: vimc_debayer(+) vimc_capture vimc_scaler vimc_sensor v4l2_tpg vimc v4l2_common ccm xt_tcpudp ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 af_packet xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat scsi_transport_iscsi nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables bpfilter coretemp hwmon intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass crct10dif_pclmul crc32_generic crc32_pclmul ghash_clmulni_intel msr ghash_generic gf128mul uvcvideo hid_generic btrfs videobuf2_vmalloc gcm videobuf2_memops xor videobuf2_v4l2 iTCO_wdt zstd_compress videobuf2_common videodev iTCO_vendor_support usbmouse usb_debug mei_hdcp usbserial xts raid6_pq mc usbhid libcrc32c crc32c_intel hid [ 793.542617] zstd_decompress iwldvm ctr snd_hda_codec_hdmi aesni_intel mac80211 aes_x86_64 snd_hda_codec_realtek crypto_simd snd_hda_codec_generic cryptd ledtrig_audio glue_helper libarc4 snd_hda_intel intel_cstate iwlwifi snd_hda_codec intel_uncore intel_rapl_perf snd_hda_core joydev snd_hwdep input_leds cfg80211 mousedev sdhci_pci toshiba_acpi cqhci sparse_keymap sr_mod snd_pcm uio_pdrv_genirq sdhci uio snd_timer wmi serio_raw pcspkr mmc_core cdrom rfkill led_class snd rtc_cmos mei_me industrialio evdev mei thermal lpc_ich mac_hid soundcore toshiba_haps ac battery sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua autofs4 [last unloaded: vimc_debayer] [ 793.542817] CPU: 3 PID: 2029 Comm: modprobe Not tainted 5.3.0-rc8 #3 [ 793.542829] Hardware name: TOSHIBA PORTEGE R835/Portable PC, BIOS Version 4.10 01/08/2013 [ 793.542845] RIP: 0010:__mutex_lock+0xd7c/0x1480 [ 793.542857] Code: d2 0f 85 35 05 00 00 44 8b 0d 50 7f b2 01 45 85 c9 0f 85 b1 f3 ff ff 48 c7 c6 60 15 89 9c 48 c7 c7 00 14 89 9c e8 b4 86 eb fd <0f> 0b e9 97 f3 ff ff 48 8b 85 88 fe ff ff 48 c1 e8 03 42 80 3c 28 [ 793.542882] RSP: 0018:88812a5eefd0 EFLAGS: 00010282 [ 793.542894] RAX: dc08 RBX: c1c0f7e0 RCX: 9a2b67f4 [ 793.542906] RDX: 13a3dfc5 RSI: 0004 RDI: 0246 [ 793.542919] RBP: 88812a5ef150 R08: fbfff3a3dfc5 R09: fbfff3a3dfc5 [ 793.542931] R10: 0001 R11: fbfff3a3dfc4 R12: [ 793.542944] R13: R14: dc00 R15: 9e700400 [ 793.542957] FS: 7f080c4a6b80() GS:88812aa0() knlGS: [ 793.542970] CS: 0010 DS: ES: CR0: 80050033 [ 793.542981] CR2: 7ffeab8adee8 CR3: 00010fd66005 CR4: 000606e0 [ 793.542993] Call Trace: [ 793.543007] ? do_raw_spin_unlock+0x54/0x220 [ 793.543026] ? media_device_register_entity+0x1fd/0x710 [mc] [ 793.543044] ? mutex_lock_io_nested+0x1380/0x1380 [ 793.543055] ? ida_alloc_range+0x5b7/0x6e0 [ 793.543068] ? ida_destroy+0x260/0x260 [ 793.543080] ? save_stack+0x21/0x90 [ 793.543090] ? __kasan_kmalloc.constprop.8+0xa7/0xd0 [ 793.543101] ? kasan_kmalloc+0x9/0x10 [ 793.543111] ? __kmalloc+0x11f/0x260 [ 793.543123] ? vimc_pads_init+0x33/0x130 [vimc] [ 793.543134] ? vimc_ent_sd_register+0x36/0x350 [vimc] [ 793.543147] ? vimc_sen_comp_bind+0x280/0x520 [vimc_sensor] [ 793.543160] ? component_bind_all+0x2ef/0xa60 [ 793.543171] ? vimc_comp_bind+0x74/0x630 [vimc] [ 793.543188] ? driver_probe_device+0xf0/0x3a0 [ 793.543199] ? device_driver_attach+0xec/0x120 [ 793.543212] mutex_lock_nested+0x16/0x20 [ 793.543223] ? mutex_lock_nested+0x16/0x20 [ 793.543236] media_device_register_entity+0x1fd/0x710 [mc] [ 793.543249] ? __x64_sys_finit_module+0x6e/0xb0 [ 793.543260] ? do_syscall_64+0xaa/0x380 [ 793.543271] ? check_object+0xb2/0x300 [ 793.543282] ? vimc_pads_init+0x33/0x130 [vimc] [ 793.543296] ? media_device_unregister_entity_notify+0xf0/0xf0 [mc] [ 793.543311] ? ___slab_alloc+0x5b3/0x600 [ 793.543321] ? ___slab_alloc+0x5b3/0x600 [ 793.543335] ? kasan_unpoison_shadow+0x35/0x50 [ 793.543346] ? __kasan_kmalloc.constprop.8+0xa7/0xd0 [ 793.543358] ? kasan_kmalloc+0x9/0x10 [ 793.543377] v4l2_device_register_subdev+0x261/0x620 [videodev] [ 793.543392] vimc_ent_sd_register+0x26b/0x350 [vimc] [ 793.543406] vimc_sen_comp_bind+0x280/0x520 [vimc_sensor] [ 793.543421] component_bind_all+0x2ef/0xa60 [ 793.543433] ? refcount_inc_checked+0xa/0x50 [ 793.543446] ? vimc_comp_unbind+0x60/0x60 [vimc] [ 793.543459] vimc_comp_bind+0x74/0x630 [vimc] [ 793.543471] ?
Re: [PATCH v3] powerpc/lockdep: fix a false positive warning
Hi Qian, Thank you for the patch! Yet something to improve: [auto build test ERROR on linus/master] [cannot apply to v5.3-rc8 next-20190904] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Qian-Cai/powerpc-lockdep-fix-a-false-positive-warning/20190910-053719 config: powerpc-ps3_defconfig (attached as .config) compiler: powerpc64-linux-gcc (GCC) 7.4.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.4.0 make.cross ARCH=powerpc If you fix the issue, kindly add following tag Reported-by: kbuild test robot All errors (new ones prefixed by >>): >> kernel/locking/lockdep.o:(.toc+0x0): undefined reference to `bss_hole_start' >> kernel/locking/lockdep.o:(.toc+0x8): undefined reference to `bss_hole_end' --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 7:05 PM Theodore Y. Ts'o wrote: > > I'd be willing to let it take at least 2 minutes, since that's slow > enough to be annoying. Have you ever met a real human being? A boot that blocks will result in people pressing the big red button in less than 30 seconds, unless it talks very much about _why_ it blocks and gives an estimate of how long. Please go out and actually interact with real people some day. Linus
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 06:10:47PM -0700, Linus Torvalds wrote: > > We could return 0 for success, and yet "the best we > > can do" could be really terrible. > > Yes. Which is why we should warn. I'm all in favor of warning. But people might just ignore the warning. We warn today about systemd trying to read from /dev/urandom too early, and that just gets ignored. > But we can't *block*. Because that just breaks people. Like shown in > this whole discussion. I'd be willing to let it take at least 2 minutes, since that's slow enough to be annoying. I'd be willing to to kill the process which tried to call getrandom too early. But I believe blocking is better than returning something potentially not random at all. I think failing "safe" is extremely important. And returning something not random which then gets used for a long-term private key is a disaster. You basically want to turn getrandom into /dev/urandom. And that's how we got into the mess where 10% of the publically accessible ssh keys could be guessed. I've tried that already, and we saw how that ended. > Why is warning different? Because hopefully it tells the only person > who can *do* something about it - the original maintainer or developer > of the user space tools - that they are doing something wrong and need > to fix their broken model. Except the developer could (and *has) just ignored the warning, which is what happened with /dev/urandom when it was accessed too early. Even when I drew some developers attention to the warning, at least one just said, "meh", and blew me off. Would a making it be noiser (e.g., a WARN_ON) make enough of a difference? I guess I'm just not convinced. > Blocking doesn't do that. Blocking only makes the system unusable. And > yes, some security people think "unusable == secure", but honestly, > those security people shouldn't do system design. They are the worst > kind of "technically correct" incompetent. Which is worse really depends on your point of view, and what the system might be controlling. If access to the system could cause a malicious attacker to trigger a nuclear bomb, failing safe is always going to be better. In other cases, maybe failing open is certainly more convenient. It certainly leaves the system more "usable". But how do we trade off "usable" with "insecure"? There are times when "unusable" is WAY better than "could risk life or human safety". Would you be willing to settle for a CONFIG option or a boot-command line option which controls whether we fail "safe" or fail "open" if someone calls getrandom(2) and there isn't enough entropy? Then each distribution and/or system integrator can decide whether "proper systems design" considers "usability" versus "must not fail insecurely" to be more important. - Ted
Re: [RFC v1 14/14] krsi: Pin arg pages only when needed
On 15-Sep 00:33, Yonghong Song wrote: > > > On 9/10/19 12:55 PM, KP Singh wrote: > > From: KP Singh > > > > Adds a callback which is called when a new program is attached > > to a hook. The callback registered by the process_exection hook > > checks if a program that has calls to a helper that requires pages to > > be pinned (eg. krsi_get_env_var). > > > > Signed-off-by: KP Singh > > --- > > include/linux/krsi.h | 1 + > > security/krsi/include/hooks.h | 5 ++- > > security/krsi/include/krsi_init.h | 7 > > security/krsi/krsi.c | 62 --- > > security/krsi/ops.c | 10 - > > 5 files changed, 77 insertions(+), 8 deletions(-) > > > > diff --git a/include/linux/krsi.h b/include/linux/krsi.h > > index c7d1790d0c1f..e443d0309764 100644 > > --- a/include/linux/krsi.h > > +++ b/include/linux/krsi.h > > @@ -7,6 +7,7 @@ > > > > #ifdef CONFIG_SECURITY_KRSI > > int krsi_prog_attach(const union bpf_attr *attr, struct bpf_prog *prog); > > +extern const struct bpf_func_proto krsi_get_env_var_proto; > > #else > > static inline int krsi_prog_attach(const union bpf_attr *attr, > >struct bpf_prog *prog) > > diff --git a/security/krsi/include/hooks.h b/security/krsi/include/hooks.h > > index e070c452b5de..38293125ff99 100644 > > --- a/security/krsi/include/hooks.h > > +++ b/security/krsi/include/hooks.h > > @@ -8,7 +8,7 @@ > >* > >* Format: > >* > > - * KRSI_HOOK_INIT(TYPE, NAME, LSM_HOOK, KRSI_HOOK_FN) > > + * KRSI_HOOK_INIT(TYPE, NAME, LSM_HOOK, KRSI_HOOK_FN, CALLBACK) > >* > >* KRSI adds one layer of indirection between the name of the hook and > > the name > >* it exposes to the userspace in Security FS to prevent the userspace > > from > > @@ -18,4 +18,5 @@ > > KRSI_HOOK_INIT(PROCESS_EXECUTION, > >process_execution, > >bprm_check_security, > > - krsi_process_execution) > > + krsi_process_execution, > > + krsi_process_execution_cb) > > diff --git a/security/krsi/include/krsi_init.h > > b/security/krsi/include/krsi_init.h > > index 6152847c3b08..99801d5b273a 100644 > > --- a/security/krsi/include/krsi_init.h > > +++ b/security/krsi/include/krsi_init.h > > @@ -31,6 +31,8 @@ struct krsi_ctx { > > }; > > }; > > > > +typedef int (*krsi_prog_attach_t) (struct bpf_prog_array *); > > + > > /* > >* The LSM creates one file per hook. > >* > > @@ -61,6 +63,11 @@ struct krsi_hook { > > * The eBPF programs that are attached to this hook. > > */ > > struct bpf_prog_array __rcu *progs; > > + /* > > +* The attach callback is called before a new program is attached > > +* to the hook and is passed the updated bpf_prog_array as an argument. > > +*/ > > + krsi_prog_attach_t attach_callback; > > }; > > > > extern struct krsi_hook krsi_hooks_list[]; > > diff --git a/security/krsi/krsi.c b/security/krsi/krsi.c > > index 00a7150c1b22..a4443d7aa150 100644 > > --- a/security/krsi/krsi.c > > +++ b/security/krsi/krsi.c > > @@ -5,15 +5,65 @@ > > #include > > #include > > #include > > +#include > > #include > > > > #include "krsi_init.h" > > > > +/* > > + * need_arg_pages is only updated in bprm_check_security_cb > > + * when a mutex on krsi_hook for bprm_check_security is already > > + * held. need_arg_pages avoids pinning pages when no program > > + * that needs them is attached to the hook. > > + */ > > +static bool need_arg_pages; > > + > > +/* > > + * Checks if the instruction is a BPF_CALL to an eBPF helper located > > + * at the given address. > > + */ > > +static inline bool bpf_is_call_to_func(struct bpf_insn *insn, > > + void *func_addr) > > +{ > > + u8 opcode = BPF_OP(insn->code); > > + > > + if (opcode != BPF_CALL) > > + return false; > > + > > + if (insn->src_reg == BPF_PSEUDO_CALL) > > + return false; > > + > > + /* > > +* The BPF verifier updates the value of insn->imm from the > > +* enum bpf_func_id to the offset of the address of helper > > +* from the __bpf_call_base. > > +*/ > > + return __bpf_call_base + insn->imm == func_addr; > > In how many cases, krsi program does not have krsi_get_env_var() helper? It depends, if the user does not choose to use log environment variables or use the the value as a part of their MAC policy, the pinning of the pages is not needed. Also, the pinning is needed since eBPF helpers cannot run a non-atomic context. It would not be needed if "sleepable eBPF" becomes a thing. - KP > > > +} > > + > > +static int krsi_process_execution_cb(struct bpf_prog_array *array) > > +{ > > + struct bpf_prog_array_item *item = array->items; > > + struct bpf_prog *p; > > + const struct bpf_func_proto *proto = &krsi_get_env_var_proto; > > + int i; > > + > > + while ((p = READ_ONCE(item->prog))) { > > + for (i = 0; i
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 6:00 PM Theodore Y. Ts'o wrote: > > That makes me even more worried. It's probably going to be OK for > modern x86 systems, since "best we can do" will include RDRAND > (whether or not it's trusted). But on systems without something like > RDRAND --- e.g., ARM --- the "best we can do" could potentially be > Really Bad. Again, look back at the Mining Your P's and Q's paper > from factorable.net. Yes. And they had that problem *because* the blocking interface was useless, and they didn't use it, and *because* nobody warned them about it. In other words, the whole disaster was exactly because blocking is wrong, and because blocking to get "secure" data is unacceptable. And the random people DIDN'T LEARN A SINGLE LESSON from that thing. Seriously. getrandom() introduced the same broken model as /dev/random had - and that then caused people to use /dev/urandom instead. And now it has shown itself to be broken _again_. And you still argue against the only sane model. Scream loudly that you're doing something wrong so that people can fix their broken garbage, but don't let people block, which is _also_ broken garbage. Seriously. Blocking is wrong. Blocking has _always_ been wrong. It was why /dev/random was useless, and it is now why the new getrandom() system call is showing itself useless. > We could return 0 for success, and yet "the best we > can do" could be really terrible. Yes. Which is why we should warn. But we can't *block*. Because that just breaks people. Like shown in this whole discussion. Why is warning different? Because hopefully it tells the only person who can *do* something about it - the original maintainer or developer of the user space tools - that they are doing something wrong and need to fix their broken model. Blocking doesn't do that. Blocking only makes the system unusable. And yes, some security people think "unusable == secure", but honestly, those security people shouldn't do system design. They are the worst kind of "technically correct" incompetent. > > > For 5.3, can we please consider my proposal in [1]? > > It may be the safest thing to do, but at that point we might as well > > just revert the ext4 change entirely. I'd rather do that, than have > > random filesystems start making random decisions based on crazy user > > space behavior. > > All we're doing is omitting the plug; Yes. Which we'll do by reverting that change. I agree that it's the safe thing to do for 5.3. We are not adding crazy workarounds for "getrandom()" bugs in some low-level filesystem. Either we fix getrandom() or we revert the change. We don't do some mis-designed "let's work around bugs in getrandom() in the ext4 filesystem with ad-hoc behavioral changes". Linus
Re: [Openipmi-developer] [PATCH 0/1] Fix race in ipmi timer cleanup
> > > > > {disable,enable}_si_irq() themselves are racy: > > > > static inline bool disable_si_irq(struct smi_info *smi_info) > > { > > if ((smi_info->io.irq) && (!smi_info->interrupt_disabled)) { > > smi_info->interrupt_disabled = true; > > > > Basically interrupt_disabled need to be atomic here to have any value, > > unless you ensure to have a spin lock around every access to it. > > It needs to be atomic, yes, but I think just adding the spinlock like > I suggested will work. You are right, the check for timer_running is > not necessary here, and I'm fine with removing it, but there are other > issues with interrupt_disabled (as you said) and with memory ordering > in the timer case. So even if you remove the timer running check, the > lock is still required here. It turns out you were right, all that really needs to be done is the del_timer_sync(). I've added your patch to my queue. Sorry for the trouble. Thanks, -corey
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 03:32:46PM -0700, Linus Torvalds wrote: > > Worse, it won't even accomplish something against an obstinant > > programmers. Someone who is going to change their program to sleep > > loop waiting for getrandom(2) to not return with an error can just as > > easily check for a buffer which is zero-filled, or an unchanged > > buffer, and then sleep loop on that. > > Again, no they can't. They'll get random data in the buffer. And > there is no way they can tell how much entropy that random data has. That makes me even more worried. It's probably going to be OK for modern x86 systems, since "best we can do" will include RDRAND (whether or not it's trusted). But on systems without something like RDRAND --- e.g., ARM --- the "best we can do" could potentially be Really Bad. Again, look back at the Mining Your P's and Q's paper from factorable.net. If we don't block, and we just return "the best we can do", and some insane userspace tries to generate a long-term private key (for SSH or TLS) in super-early boot, I think we owe them something beyond a big fat WARN_ON_ONCE. We could return 0 for success, and yet "the best we can do" could be really terrible. > > For 5.3, can we please consider my proposal in [1]? > > > > [1] https://lore.kernel.org/linux-ext4/20190914162719.ga19...@mit.edu/ > > Honestly, to me that looks *much* worse than just saying that we need > to stop allowing insane user mode boot programs make insane choices > that have no basis in reality. > > It may be the safest thing to do, but at that point we might as well > just revert the ext4 change entirely. I'd rather do that, than have > random filesystems start making random decisions based on crazy user > space behavior. All we're doing is omitting the plug; I disagree that it's really all that random. Honestly, I'd much rather just let distributions hang, and force them to fix it that way. That's *much* better than silently give them "the best we can do", which might be "not really random at all". The reality is that there will be some platforms where we will block for a very long time, given certain kernel configs and certain really stupid userspace decisions --- OR, we can open up a really massive security hole given stupid userspace decisions. Ext4 just got unlocky that a performance improvement patch happened to toggle one or two configurations from "working" to "not working". But just saying, "oh well" and returning something which might not really be random with a success code is SUCH A TERRIBLE IDEA, that if you really prefer that, I'll accept the ext4 revert, even though I don't think we should be penalizing all ext4 performance just because of a few distros being stupid. If the choice is between that and making some unsuspecting distributions being potentially completely insecure, it's no contest. I won't have that on my conscience. - Ted
Re: [PATCH 4.9 00/14] 4.9.193-stable review
Hi Greg, On 9/14/19 1:31 AM, Greg Kroah-Hartman wrote: On Sat, Sep 14, 2019 at 01:28:39AM -0700, Guenter Roeck wrote: On 9/13/19 6:06 AM, Greg Kroah-Hartman wrote: This is the start of the stable review cycle for the 4.9.193 release. There are 14 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know. Responses should be made by Sun 15 Sep 2019 01:03:32 PM UTC. Anything received after that time might be too late. Is it really only me seeing this ? drivers/vhost/vhost.c: In function 'translate_desc': include/linux/compiler.h:549:38: error: call to '__compiletime_assert_1879' declared with attribute error: BUILD_BUG_ON failed: sizeof(_s) > sizeof(long) i386:allyesconfig, mips:allmodconfig and others, everywhere including mainline. Culprit is commit a89db445fbd7f1 ("vhost: block speculation of translated descriptors"). Nope, I just got another report of this on 5.2.y. Problem is also in Linus's tree :( Please apply upstream commit 0d4a3f2abbef ("Revert "vhost: block speculation of translated descriptors") to v4.9.y-queue and later to fix the build problems. Or maybe just drop the offending commit from the stable release queues. Thanks, Guenter
Re: Re: [RFC v1 06/14] krsi: Implement eBPF operations, attachment and execution
On 9/14/19 5:56 PM, Yonghong Song wrote: > > > On 9/10/19 12:55 PM, KP Singh wrote: >> From: KP Singh >> >> A user space program can attach an eBPF program by: >> >> hook_fd = open("/sys/kernel/security/krsi/process_execution", O_RDWR) >> prog_fd = bpf(BPF_PROG_LOAD, ...) >> bpf(BPF_PROG_ATTACH, hook_fd, prog_fd) >> >> When such an attach call is received, the attachment logic looks up the >> dentry and appends the program to the bpf_prog_array. >> >> The BPF programs are stored in a bpf_prog_array and writes to the array >> are guarded by a mutex. The eBPF programs are executed as a part of the >> LSM hook they are attached to. If any of the eBPF programs return >> an error (-ENOPERM) the action represented by the hook is denied. >> >> Signed-off-by: KP Singh >> --- >>include/linux/krsi.h | 18 ++ >>kernel/bpf/syscall.c | 3 +- >>security/krsi/include/krsi_init.h | 51 +++ >>security/krsi/krsi.c | 13 +++- >>security/krsi/krsi_fs.c | 28 >>security/krsi/ops.c | 102 ++ >>6 files changed, 213 insertions(+), 2 deletions(-) >>create mode 100644 include/linux/krsi.h >> [...] >> >> +static inline int krsi_run_progs(enum krsi_hook_type t, struct krsi_ctx >> *ctx) >> +{ >> +struct bpf_prog_array_item *item; >> +struct bpf_prog *prog; >> +struct krsi_hook *h = &krsi_hooks_list[t]; >> +int ret, retval = 0; > > Reverse christmas tree style? > >> + >> +preempt_disable(); > > Do we need preempt_disable() here? From the following patches, I see perf_event_output() helper and per-cpu array usage. So, indeed preempt_disable() is needed. > >> +rcu_read_lock(); >> + >> +item = rcu_dereference(h->progs)->items; >> +while ((prog = READ_ONCE(item->prog))) { >> +ret = BPF_PROG_RUN(prog, ctx); >> +if (ret < 0) { >> +retval = ret; >> +goto out; >> +} >> +item++; >> +} >> + >> +out: >> +rcu_read_unlock(); >> +preempt_enable(); >> +return IS_ENABLED(CONFIG_SECURITY_KRSI_ENFORCE) ? retval : 0; >> +} >> + [...]
Re: [RFC v1 14/14] krsi: Pin arg pages only when needed
On 9/10/19 12:55 PM, KP Singh wrote: > From: KP Singh > > Adds a callback which is called when a new program is attached > to a hook. The callback registered by the process_exection hook > checks if a program that has calls to a helper that requires pages to > be pinned (eg. krsi_get_env_var). > > Signed-off-by: KP Singh > --- > include/linux/krsi.h | 1 + > security/krsi/include/hooks.h | 5 ++- > security/krsi/include/krsi_init.h | 7 > security/krsi/krsi.c | 62 --- > security/krsi/ops.c | 10 - > 5 files changed, 77 insertions(+), 8 deletions(-) > > diff --git a/include/linux/krsi.h b/include/linux/krsi.h > index c7d1790d0c1f..e443d0309764 100644 > --- a/include/linux/krsi.h > +++ b/include/linux/krsi.h > @@ -7,6 +7,7 @@ > > #ifdef CONFIG_SECURITY_KRSI > int krsi_prog_attach(const union bpf_attr *attr, struct bpf_prog *prog); > +extern const struct bpf_func_proto krsi_get_env_var_proto; > #else > static inline int krsi_prog_attach(const union bpf_attr *attr, > struct bpf_prog *prog) > diff --git a/security/krsi/include/hooks.h b/security/krsi/include/hooks.h > index e070c452b5de..38293125ff99 100644 > --- a/security/krsi/include/hooks.h > +++ b/security/krsi/include/hooks.h > @@ -8,7 +8,7 @@ >* >* Format: >* > - * KRSI_HOOK_INIT(TYPE, NAME, LSM_HOOK, KRSI_HOOK_FN) > + * KRSI_HOOK_INIT(TYPE, NAME, LSM_HOOK, KRSI_HOOK_FN, CALLBACK) >* >* KRSI adds one layer of indirection between the name of the hook and the > name >* it exposes to the userspace in Security FS to prevent the userspace from > @@ -18,4 +18,5 @@ > KRSI_HOOK_INIT(PROCESS_EXECUTION, > process_execution, > bprm_check_security, > -krsi_process_execution) > +krsi_process_execution, > +krsi_process_execution_cb) > diff --git a/security/krsi/include/krsi_init.h > b/security/krsi/include/krsi_init.h > index 6152847c3b08..99801d5b273a 100644 > --- a/security/krsi/include/krsi_init.h > +++ b/security/krsi/include/krsi_init.h > @@ -31,6 +31,8 @@ struct krsi_ctx { > }; > }; > > +typedef int (*krsi_prog_attach_t) (struct bpf_prog_array *); > + > /* >* The LSM creates one file per hook. >* > @@ -61,6 +63,11 @@ struct krsi_hook { >* The eBPF programs that are attached to this hook. >*/ > struct bpf_prog_array __rcu *progs; > + /* > + * The attach callback is called before a new program is attached > + * to the hook and is passed the updated bpf_prog_array as an argument. > + */ > + krsi_prog_attach_t attach_callback; > }; > > extern struct krsi_hook krsi_hooks_list[]; > diff --git a/security/krsi/krsi.c b/security/krsi/krsi.c > index 00a7150c1b22..a4443d7aa150 100644 > --- a/security/krsi/krsi.c > +++ b/security/krsi/krsi.c > @@ -5,15 +5,65 @@ > #include > #include > #include > +#include > #include > > #include "krsi_init.h" > > +/* > + * need_arg_pages is only updated in bprm_check_security_cb > + * when a mutex on krsi_hook for bprm_check_security is already > + * held. need_arg_pages avoids pinning pages when no program > + * that needs them is attached to the hook. > + */ > +static bool need_arg_pages; > + > +/* > + * Checks if the instruction is a BPF_CALL to an eBPF helper located > + * at the given address. > + */ > +static inline bool bpf_is_call_to_func(struct bpf_insn *insn, > +void *func_addr) > +{ > + u8 opcode = BPF_OP(insn->code); > + > + if (opcode != BPF_CALL) > + return false; > + > + if (insn->src_reg == BPF_PSEUDO_CALL) > + return false; > + > + /* > + * The BPF verifier updates the value of insn->imm from the > + * enum bpf_func_id to the offset of the address of helper > + * from the __bpf_call_base. > + */ > + return __bpf_call_base + insn->imm == func_addr; In how many cases, krsi program does not have krsi_get_env_var() helper? > +} > + > +static int krsi_process_execution_cb(struct bpf_prog_array *array) > +{ > + struct bpf_prog_array_item *item = array->items; > + struct bpf_prog *p; > + const struct bpf_func_proto *proto = &krsi_get_env_var_proto; > + int i; > + > + while ((p = READ_ONCE(item->prog))) { > + for (i = 0; i < p->len; i++) { > + if (bpf_is_call_to_func(&p->insnsi[i], proto->func)) > + need_arg_pages = true; > + } > + item++; > + } > + return 0; > +} > + > struct krsi_hook krsi_hooks_list[] = { > - #define KRSI_HOOK_INIT(TYPE, NAME, H, I) \ > + #define KRSI_HOOK_INIT(TYPE, NAME, H, I, CB) \ > [TYPE] = { \ > .h_type = TYPE, \ > .name = #NAME, \ > + .attach_callback = CB, \ > }, >
Re: [RFC v1 13/14] krsi: Provide an example to read and log environment variables
On 9/10/19 12:55 PM, KP Singh wrote: > From: KP Singh > > * The program takes the name of an environment variable as an > argument. > * An eBPF program is loaded and attached to the > process_execution hook. > * The name of the environment variable passed is updated in a > eBPF per-cpu map. > * The eBPF program uses the krsi_get_env_var helper to get the > the value of this variable and logs the result to the perf > events buffer. > * The user-space program listens to the perf events and prints > the values. > > Example execution: > > ./krsi LD_PRELOAD > > [p_pid=123] LD_PRELOAD is not set > [p_pid=456] LD_PRELOAD=/lib/bad.so > [p_pid=789] WARNING! LD_PRELOAD is set 2 times > [p_pid=789] LD_PRELOAD=/lib/decoy.so > [p_pid=789] LD_PRELOAD=/lib/bad.so > > In a separate session the following [1, 2, 3] exec system calls > are made where: > > [1, 2, 3] char *argv[] = {"/bin/ls", 0}; > > [1] char *envp = {0}; > [2] char *envp = {"LD_PRELOAD=/lib/bad.so", 0}; > [3] char *envp = {"LD_PRELOAD=/lib/decoy.so, "LD_PRELOAD=/lib/bad.so", > 0}; > > This example demonstrates that user-space is free to choose the format > in which the data is logged and can use very specific helpers like > krsi_get_env_var to populate only the data that is required. > > Signed-off-by: KP Singh > --- > MAINTAINERS| 3 + > samples/bpf/.gitignore | 1 + > samples/bpf/Makefile | 3 + > samples/bpf/krsi_helpers.h | 31 ++ > samples/bpf/krsi_kern.c| 52 ++ > samples/bpf/krsi_user.c| 202 + The KRSI does not depend on kernel headers. I would recommend to put this into tools/testing/selftests/bpf as a selftest. Selftest is the place to test regressions with bpf changes. The krsi_kern.c can be put into tools/testing/selftests/bpf/progs and krsi_user.c can be put into tools/testing/selftests/bpf/prog_tests. > 6 files changed, 292 insertions(+) > create mode 100644 samples/bpf/krsi_helpers.h > create mode 100644 samples/bpf/krsi_kern.c > create mode 100644 samples/bpf/krsi_user.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index 8e0364391d8b..ec378abb4c23 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -9005,6 +9005,9 @@ F: kernel/kprobes.c > KRSI SECURITY MODULE > M: KP Singh > S: Supported > +F: samples/bpf/krsi_helpers.h > +F: samples/bpf/krsi_kern.c > +F: samples/bpf/krsi_user.c > F: security/krsi/ > > KS0108 LCD CONTROLLER DRIVER > diff --git a/samples/bpf/.gitignore b/samples/bpf/.gitignore > index 74d31fd3c99c..6bbf5a04877f 100644 > --- a/samples/bpf/.gitignore > +++ b/samples/bpf/.gitignore > @@ -2,6 +2,7 @@ cpustat > fds_example > hbm > ibumad > +krsi > lathist > lwt_len_hist > map_perf_test > diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile > index 1d9be26b4edd..33d3bef17549 100644 > --- a/samples/bpf/Makefile > +++ b/samples/bpf/Makefile > @@ -8,6 +8,7 @@ hostprogs-y := test_lru_dist > hostprogs-y += sock_example > hostprogs-y += fds_example > hostprogs-y += sockex1 > +hostprogs-y += krsi > hostprogs-y += sockex2 > hostprogs-y += sockex3 > hostprogs-y += tracex1 > @@ -62,6 +63,7 @@ TRACE_HELPERS := > ../../tools/testing/selftests/bpf/trace_helpers.o > > fds_example-objs := fds_example.o > sockex1-objs := sockex1_user.o > +krsi-objs := krsi_user.o $(TRACE_HELPERS) > sockex2-objs := sockex2_user.o > sockex3-objs := bpf_load.o sockex3_user.o > tracex1-objs := bpf_load.o tracex1_user.o > @@ -113,6 +115,7 @@ hbm-objs := bpf_load.o hbm.o $(CGROUP_HELPERS) > # Tell kbuild to always build the programs > always := $(hostprogs-y) > always += sockex1_kern.o > +always += krsi_kern.o > always += sockex2_kern.o > always += sockex3_kern.o > always += tracex1_kern.o > diff --git a/samples/bpf/krsi_helpers.h b/samples/bpf/krsi_helpers.h > new file mode 100644 > index ..3007bfd6212e > --- /dev/null > +++ b/samples/bpf/krsi_helpers.h > @@ -0,0 +1,31 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +#ifndef _KRSI_HELPERS_H > +#define _KRSI_HELPERS_H > + > +#define __bpf_percpu_val_align __aligned(8) > + > +#define ENV_VAR_NAME_MAX_LEN 48 > +#define ENV_VAR_VAL_MAX_LEN 4096 > + > +#define MAX_CPUS 128 > + > +#define __LOWER(x) (x & 0x) > +#define __UPPER(x) (x >> 32) > + > +struct krsi_env_value { > + // The name of the environment variable. > + char name[ENV_VAR_NAME_MAX_LEN]; > + // The value of the environment variable (if set). > + char value[ENV_VAR_VAL_MAX_LEN]; > + // Indicates if an overflow occurred while reading the value of the > + // of the environment variable. This means that an -E2BIG was received > + // from the krsi_get_env_var helper. > + bool overflow; > + // The number of times the environment variable was set. > + __u32 time
Re: [RFC v1 12/14] krsi: Add an eBPF helper function to get the value of an env variable
On 9/10/19 12:55 PM, KP Singh wrote: > From: KP Singh This patch cannot apply cleanly. -bash-4.4$ git apply ~/p12.txt error: patch failed: include/uapi/linux/bpf.h:2715 error: include/uapi/linux/bpf.h: patch does not apply error: patch failed: tools/include/uapi/linux/bpf.h:2715 error: tools/include/uapi/linux/bpf.h: patch does not apply -bash-4.4$ > > The helper returns the value of the environment variable in the buffer > that is passed to it. If the var is set multiple times, the helper > returns all the values as null separated strings. > > If the buffer is too short for these values, the helper tries to fill it > the best it can and guarantees that the value returned in the buffer > is always null terminated. After the buffer is filled, the helper keeps > counting the number of times the environment variable is set in the > envp. > > The return value of the helper is an u64 value which carries two pieces > of information. > >* The upper 32 bits are a u32 value signifying the number of times > the environment variable is set in the envp. Not sure how useful this 'upper 32' bit value is. What user expected to do? Another option is to have upper 32 bits encode the required buffer size to hold all values. This may cause some kind of user space action, e.g., to replace the program with new program with larger per cpu map value size? >* The lower 32 bits are a s32 value signifying the number of bytes > written to the buffer or an error code. > > Since the value of the environment variable can be very long and exceed > what can be allocated on the BPF stack, a per-cpu array can be used > instead: > > struct bpf_map_def SEC("maps") env_map = { > .type = BPF_MAP_TYPE_PERCPU_ARRAY, > .key_size = sizeof(u32), > .value_size = 4096, > .max_entries = 1, > }; Could you use use map definition with SEC(".maps")? > > SEC("prgrm") > int bpf_prog1(void *ctx) > { > u32 map_id = 0; > u64 times_ret; > s32 ret; > char name[48] = "LD_PRELOAD"; Reverse Christmas tree coding style, here and other places? > > char *map_value = bpf_map_lookup_elem(&env_map, &map_id); > if (!map_value) > return 0; > > // Read the lower 32 bits for the return value > times_ret = krsi_get_env_var(ctx, name, 48, map_value, 4096); > ret = times_ret & 0x; > if (ret < 0) > return ret; > return 0; > } > > Signed-off-by: KP Singh > --- > include/uapi/linux/bpf.h | 42 ++- > security/krsi/ops.c | 129 ++ > tools/include/uapi/linux/bpf.h| 42 ++- > tools/testing/selftests/bpf/bpf_helpers.h | 3 + > 4 files changed, 214 insertions(+), 2 deletions(-) > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 32ab38f1a2fe..a4ef07956e07 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -2715,6 +2715,45 @@ union bpf_attr { >* **-EPERM** if no permission to send the *sig*. >* >* **-EAGAIN** if bpf program can try again. > + * > + * u64 krsi_get_env_var(void *ctx, char *name, char *buf, > + * size_t name_len, size_t buf_len) This signature is not the same as the later krsi_get_env_var(...) helper definition. BPF_CALL_5(krsi_get_env_var, struct krsi_ctx *, ctx, char *, name, u32, n_size, char *, dest, u32, size) > + * Description > + * This helper can be used as a part of the > + * process_execution hook of the KRSI LSM in > + * programs of type BPF_PROG_TYPE_KRSI. > + * > + * The helper returns the value of the environment > + * variable with the provided "name" for process that's > + * going to be executed in the passed buffer, "buf". If the var > + * is set multiple times, the helper returns all > + * the values as null separated strings. > + * > + * If the buffer is too short for these values, the helper > + * tries to fill it the best it can and guarantees that the value > + * returned in the buffer is always null terminated. > + * After the buffer is filled, the helper keeps counting the number > + * of times the environment variable is set in the envp. > + * > + * Return: > + * > + * The return value of the helper is an u64 value > + * which carries two pieces of information: > + * > + * The upper 32 bits are a u32 value signifying > + * the number of times the environment variable > + * is set in the envp. > + * The lower 32 bits are an s32 value signifying > + * the number of bytes written to the buffer or an error code: > + * > + * **-ENOMEM** if the kernel is unable to allocate memory > + * for pinni
Re: [PATCH v2] tracing: Be more clever when dumping hex in __print_hex()
On Fri, 13 Sep 2019 15:28:26 +0300 Andy Shevchenko wrote: > On Tue, Aug 06, 2019 at 11:33:52AM -0400, Steven Rostedt wrote: > > On Tue, 6 Aug 2019 18:15:43 +0300 > > Andy Shevchenko wrote: > > > > > Hex dump as many as 16 bytes at once in trace_print_hex_seq() > > > instead of byte-by-byte approach. > > > > + const char *fmt = concatenate ? "%*phN" : "%*ph"; > > > > > > + for (i = 0; i < buf_len; i += 16) > > > + trace_seq_printf(p, fmt, min(buf_len - i, 16), &buf[i]); > > > > Cute. > > > > I'll have to wrap my head around it a bit to make sure it's correct. > > Anything I need to update here? > Nope, thanks for the ping, I've been traveling quite crazy lately. I'll try to add this to the next merge window coming up shortly. -- Steve
Re: [HELP REQUESTED from the community] Was: Staging status of speakup
Hello, Okash Khawaja, le sam. 14 sept. 2019 22:08:35 +0100, a ecrit: > 2. We are still missing descriptions for i18n/ directory. I have added > filenames below. can someone can add description please: There are some descriptions in the "14.1. Files Under the i18n Subdirectory" section of spkguide.txt Samuel
Re: [GIT PULL] MMC fixes for v5.3-rc9
The pull request you sent on Fri, 13 Sep 2019 13:57:24 +0200: > git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc.git tags/mmc-v5.3-rc8 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/1c4c5e2528af0c803fb1171632074f4070229a75 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [GIT PULL] Urgent RISC-V fix for v5.3
The pull request you sent on Sat, 14 Sep 2019 06:52:48 -0700 (PDT): > git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git > tags/riscv/for-v5.3 has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/b03c036e6f96340dd311817c7b964dad183c4141 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [GIT PULL] Final batch of KVM changes for Linux 5.3.
The pull request you sent on Sat, 14 Sep 2019 22:31:41 +0200: > https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/1609d7604b847a9820e63393d1a3b6cac7286d40 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [PULL] vhost: a last minute revert
The pull request you sent on Sat, 14 Sep 2019 15:38:59 -0400: > https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/1f9c632cde0c3d781463a88ce430a8dd4a7c1a0e Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: [GIT] Networking
The pull request you sent on Fri, 13 Sep 2019 21:55:40 +0100 (WEST): > git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git refs/heads/master has been merged into torvalds/linux.git: https://git.kernel.org/torvalds/c/36024fcf8d28999f270908e75675d43b099ff7b3 Thank you! -- Deet-doot-dot, I am a bot. https://korg.wiki.kernel.org/userdoc/prtracker
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 3:24 PM Theodore Y. Ts'o wrote: > > > > Also, we might even want to just fill the buffer and return 0 at that > > > point, to make sure that even more broken user space doesn't then try > > > to sleep manually and turn it into a "I'll wait myself" loop. > > Ugh. This makes getrandom(2) unreliable for application programers, > in that it returns success, but with the buffer filled with something > which is definitely not random. Please, let's not. You misunderstand, The buffer would always be filled with "as random as we can make it". My "return zero" was for success, but Alexander pointed out that the return value is the length, not "zero for success". > Worse, it won't even accomplish something against an obstinant > programmers. Someone who is going to change their program to sleep > loop waiting for getrandom(2) to not return with an error can just as > easily check for a buffer which is zero-filled, or an unchanged > buffer, and then sleep loop on that. Again, no they can't. They'll get random data in the buffer. And there is no way they can tell how much entropy that random data has. Exactly the same way there is absolutely no way _we_ can tell how much entropy we have. > For 5.3, can we please consider my proposal in [1]? > > [1] https://lore.kernel.org/linux-ext4/20190914162719.ga19...@mit.edu/ Honestly, to me that looks *much* worse than just saying that we need to stop allowing insane user mode boot programs make insane choices that have no basis in reality. It may be the safest thing to do, but at that point we might as well just revert the ext4 change entirely. I'd rather do that, than h ave random filesystems start making random decisions based on crazy user space behavior. Linus
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 11:11:26PM +0200, Ahmed S. Darwish wrote: > > > I've sent an RFC patch at [1]. > > > > > > [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc > > > > Looks reasonable to me. Except I'd just make it simpler and make it a > > big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice(). > > Make it clear that it is a *bug* if user space thinks it should wait > > at boot time. So I'd really rather not make a change as fundamental as this so close to 5.3 being released. This sort of thing is subtle since essentially what we're trying to do is to work around broken userspace, and worse, in many cases, obstinent, determined userspace application progammers. We've told them to avoid trying to generate cryptographically secure random numbers for *years*. And they haven't listened. This is also a fairly major functional change which is likely to be very visible to userspace applications, and so it is likely to cause *some* kind of breakage. So if/when this breaks applications, are we going to then have to revert it? > > Also, we might even want to just fill the buffer and return 0 at that > > point, to make sure that even more broken user space doesn't then try > > to sleep manually and turn it into a "I'll wait myself" loop. Ugh. This makes getrandom(2) unreliable for application programers, in that it returns success, but with the buffer filled with something which is definitely not random. Please, let's not. Worse, it won't even accomplish something against an obstinant programmers. Someone who is going to change their program to sleep loop waiting for getrandom(2) to not return with an error can just as easily check for a buffer which is zero-filled, or an unchanged buffer, and then sleep loop on that. Again, remember we're trying to work around malicious human beings --- except instead trying to fight malicious attackers, we're trying to fight malicious userspace programmers. This is not a fight we can win. We can't make getrandom(2) idiot-proof, because idiots are too d*mned ingenious :-) For 5.3, can we please consider my proposal in [1]? [1] https://lore.kernel.org/linux-ext4/20190914162719.ga19...@mit.edu/ We can try to discuss different ways of working around broken/stupid userspace, but let's wait until after the LTS release, and ultimately, I still think we need to just try to get more randomness from hardware whichever way we can. Pretty much all x86 laptop/desktop have TPM's. So let's use that, in combination with RDRAND, and UEFI provided randomness, etc., etc., And if we want to put in a big WARN_ON_ONCE, sure. But we've tried not blocking before, and that way didn't end well[2], with almost 10% of all publically accessible SSH keys across the entire internet being shown to be week by an academic researcher. (This ruined my July 4th holidays in 2012 when I was working on patches to fix this on very short notice.) So let's *please* not be hasty with changes here. We're dealing with a complex systems that includes some very obstinent/strong personalities, including one which rhymes with Loettering [2] https://factorable.net - Ted
Re: Linux 5.3-rc8
Ahmed S. Darwish - 14.09.19, 23:11:26 CEST: > > Yeah, the above is yet another example of completely broken garbage. > > > > You can't just wait and block at boot. That is simply 100% > > unacceptable, and always has been, exactly because that may > > potentially mean waiting forever since you didn't do anything that > > actually is likely to add any entropy. > > ACK, the systemd commit which introduced that code also does: > >=> 26ded5570994 (random-seed: rework systemd-random-seed.service..) > [...] > --- a/units/systemd-random-seed.service.in > +++ b/units/systemd-random-seed.service.in > @@ -22,4 +22,9 @@ Type=oneshot > RemainAfterExit=yes > ExecStart=@rootlibexecdir@/systemd-random-seed load > ExecStop=@rootlibexecdir@/systemd-random-seed save >-TimeoutSec=30s >+ >+# This service waits until the kernel's entropy pool is >+# initialized, and may be used as ordering barrier for service >+# that require an initialized entropy pool. Since initialization >+# can take a while on entropy-starved systems, let's increase the >+# time-out substantially here. >+TimeoutSec=10min > > This 10min wait thing is really broken... it's basically "forever". I am so happy to use Sysvinit on my systems again. Depending on entropy for just booting a machine is broken¹. Of course regenerating SSH keys on boot, probably due to cloud-init replacing the old key after a VM has been cloned from template, may still be a challenge to handle well². I'd probably replace SSH keys in the background and restart the service then, but this may lead to spurious man in the middle warnings. [1] Debian Buster release notes: 5.1.4. Daemons fail to start or system appears to hang during boot https://www.debian.org/releases/stable/amd64/release-notes/ch-information.en.html#entropy-starvation [2] Openssh taking minutes to become available, booting takes half an hour ... because your server waits for a few bytes of randomness https://daniel-lange.com/archives/152-hello-buster.html Thanks, -- Martin
[PATCH] PCI: Move ATS declarations to linux/pci-ats.h
Move ATS function prototypes from include/linux/pci.h to include/linux/pci-ats.h as the ATS, PRI, and PASID interfaces are related, and are used only by the IOMMU drivers. This effecively reverts the change done in commit ff9bee895c4d ("PCI: Move ATS declarations to linux/pci.h so they're all together"). Also, remove surplus forward declaration of struct pci_ats from include/linux/pci.h, as it is no longer needed, since the struct pci_ats has been embedded directly into struct pci_dev in the commit d544d75ac96a ("PCI: Embed ATS info directly into struct pci_dev"). No functional changes intended. Signed-off-by: Krzysztof Wilczynski --- Related: https://lore.kernel.org/r/20190902211100.gh7...@google.com https://lore.kernel.org/r/20190724233848.73327-9-skunberg.kel...@gmail.com include/linux/pci-ats.h | 76 +++-- include/linux/pci.h | 14 2 files changed, 28 insertions(+), 62 deletions(-) diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h index 1ebb88e7c184..a2001673d445 100644 --- a/include/linux/pci-ats.h +++ b/include/linux/pci-ats.h @@ -4,74 +4,54 @@ #include -#ifdef CONFIG_PCI_PRI +#ifdef CONFIG_PCI_ATS +/* Address Translation Service */ +int pci_enable_ats(struct pci_dev *dev, int ps); +void pci_disable_ats(struct pci_dev *dev); +int pci_ats_queue_depth(struct pci_dev *dev); +int pci_ats_page_aligned(struct pci_dev *dev); +#else /* CONFIG_PCI_ATS */ +static inline int pci_enable_ats(struct pci_dev *d, int ps) +{ return -ENODEV; } +static inline void pci_disable_ats(struct pci_dev *d) { } +static inline int pci_ats_queue_depth(struct pci_dev *d) +{ return -ENODEV; } +static inline int pci_ats_page_aligned(struct pci_dev *dev) +{ return 0; } +#endif /* CONFIG_PCI_ATS */ +#ifdef CONFIG_PCI_PRI int pci_enable_pri(struct pci_dev *pdev, u32 reqs); void pci_disable_pri(struct pci_dev *pdev); void pci_restore_pri_state(struct pci_dev *pdev); int pci_reset_pri(struct pci_dev *pdev); - #else /* CONFIG_PCI_PRI */ - static inline int pci_enable_pri(struct pci_dev *pdev, u32 reqs) -{ - return -ENODEV; -} - -static inline void pci_disable_pri(struct pci_dev *pdev) -{ -} - -static inline void pci_restore_pri_state(struct pci_dev *pdev) -{ -} - +{ return -ENODEV; } +static inline void pci_disable_pri(struct pci_dev *pdev) { } +static inline void pci_restore_pri_state(struct pci_dev *pdev) { } static inline int pci_reset_pri(struct pci_dev *pdev) -{ - return -ENODEV; -} - +{ return -ENODEV; } #endif /* CONFIG_PCI_PRI */ #ifdef CONFIG_PCI_PASID - int pci_enable_pasid(struct pci_dev *pdev, int features); void pci_disable_pasid(struct pci_dev *pdev); void pci_restore_pasid_state(struct pci_dev *pdev); int pci_pasid_features(struct pci_dev *pdev); int pci_max_pasids(struct pci_dev *pdev); int pci_prg_resp_pasid_required(struct pci_dev *pdev); - -#else /* CONFIG_PCI_PASID */ - +#else /* CONFIG_PCI_PASID */ static inline int pci_enable_pasid(struct pci_dev *pdev, int features) -{ - return -EINVAL; -} - -static inline void pci_disable_pasid(struct pci_dev *pdev) -{ -} - -static inline void pci_restore_pasid_state(struct pci_dev *pdev) -{ -} - +{ return -EINVAL; } +static inline void pci_disable_pasid(struct pci_dev *pdev) { } +static inline void pci_restore_pasid_state(struct pci_dev *pdev) { } static inline int pci_pasid_features(struct pci_dev *pdev) -{ - return -EINVAL; -} - +{ return -EINVAL; } static inline int pci_max_pasids(struct pci_dev *pdev) -{ - return -EINVAL; -} - +{ return -EINVAL; } static inline int pci_prg_resp_pasid_required(struct pci_dev *pdev) -{ - return 0; -} +{ return 0; } #endif /* CONFIG_PCI_PASID */ - -#endif /* LINUX_PCI_ATS_H*/ +#endif /* LINUX_PCI_ATS_H */ diff --git a/include/linux/pci.h b/include/linux/pci.h index 56767f50ad96..5f2ae580bd19 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -284,7 +284,6 @@ struct irq_affinity; struct pcie_link_state; struct pci_vpd; struct pci_sriov; -struct pci_ats; struct pci_p2pdma; /* The pci_dev structure describes PCI devices */ @@ -1764,19 +1763,6 @@ static inline const struct pci_device_id *pci_match_id(const struct pci_device_i static inline bool pci_ats_disabled(void) { return true; } #endif /* CONFIG_PCI */ -#ifdef CONFIG_PCI_ATS -/* Address Translation Service */ -int pci_enable_ats(struct pci_dev *dev, int ps); -void pci_disable_ats(struct pci_dev *dev); -int pci_ats_queue_depth(struct pci_dev *dev); -int pci_ats_page_aligned(struct pci_dev *dev); -#else -static inline int pci_enable_ats(struct pci_dev *d, int ps) { return -ENODEV; } -static inline void pci_disable_ats(struct pci_dev *d) { } -static inline int pci_ats_queue_depth(struct pci_dev *d) { return -ENODEV; } -static inline int pci_ats_page_aligned(struct pci_dev *dev) { return 0; } -#endif - /* Include architecture-dependent settings and functions */ #include -- 2.23.0
Re: Linux 5.3-rc8
Hi, On Sat, Sep 14, 2019 at 09:30:19AM -0700, Linus Torvalds wrote: > On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish wrote: > > > > On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote: > > > > > > An alternative might be to make getrandom() just return an error > > > instead of waiting. Sure, fill the buffer with "as random as we can" > > > stuff, but then return -EINVAL because you called us too early. > > > > ACK, that's probably _the_ most sensible approach. Only caveat is > > the slight change in user-space API semantics though... > > > > For example, this breaks the just released systemd-random-seed(8) > > as it _explicitly_ requests blocking behvior from getrandom() here: > > > > Actually, I would argue that the "don't ever block, instead fill > buffer and return error instead" fixes this broken case. > > > => src/random-seed/random-seed.c: > > /* > > * Let's make this whole job asynchronous, i.e. let's make > > * ourselves a barrier for proper initialization of the > > * random pool. > > */ > > k = getrandom(buf, buf_size, GRND_NONBLOCK); > > if (k < 0 && errno == EAGAIN && synchronous) { > > log_notice("Kernel entropy pool is not initialized yet, " > > "waiting until it is."); > > > > k = getrandom(buf, buf_size, 0); /* retry synchronously */ > > } > > Yeah, the above is yet another example of completely broken garbage. > > You can't just wait and block at boot. That is simply 100% > unacceptable, and always has been, exactly because that may > potentially mean waiting forever since you didn't do anything that > actually is likely to add any entropy. > ACK, the systemd commit which introduced that code also does: => 26ded5570994 (random-seed: rework systemd-random-seed.service..) [...] --- a/units/systemd-random-seed.service.in +++ b/units/systemd-random-seed.service.in @@ -22,4 +22,9 @@ Type=oneshot RemainAfterExit=yes ExecStart=@rootlibexecdir@/systemd-random-seed load ExecStop=@rootlibexecdir@/systemd-random-seed save -TimeoutSec=30s + +# This service waits until the kernel's entropy pool is +# initialized, and may be used as ordering barrier for service +# that require an initialized entropy pool. Since initialization +# can take a while on entropy-starved systems, let's increase the +# time-out substantially here. +TimeoutSec=10min This 10min wait thing is really broken... it's basically "forever". > > if (k < 0) { > > log_debug_errno(errno, "Failed to read random data with " > > "getrandom(), falling back to " > > "/dev/urandom: %m"); > > At least it gets a log message. > > So I think the right thing to do is to just make getrandom() return > -EINVAL, and refuse to block. > > As mentioned, this has already historically been a huge issue on > embedded devices, and with disks turnign not just to NVMe but to > actual polling nvdimm/xpoint/flash, the amount of true "entropy" > randomness we can give at boot is very questionable. > ACK. Moreover, and as a result of all that, distributions are now officially *duct-taping* the problem: https://www.debian.org/releases/buster/amd64/release-notes/ch-information.en.html#entropy-starvation 5.1.4. Daemons fail to start or system appears to hang during boot Due to systemd needing entropy during boot and the kernel treating such calls as blocking when available entropy is low, the system may hang for minutes to hours until the randomness subsystem is sufficiently initialized (random: crng init done). "the system may hang for minuts to hours"... > We can (and will) continue to do a best-effort thing (including very > much using rdread and friends), but the whole "wait for entropy" > simply *must* stop. > > > I've sent an RFC patch at [1]. > > > > [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc > > Looks reasonable to me. Except I'd just make it simpler and make it a > big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice(). > Make it clear that it is a *bug* if user space thinks it should wait > at boot time. > > Also, we might even want to just fill the buffer and return 0 at that > point, to make sure that even more broken user space doesn't then try > to sleep manually and turn it into a "I'll wait myself" loop. > ACK, I'll send an RFC v2, returning buflen, and so on.. /me will enjoy the popcorn from all the to-be-reported WARN_ON()s on distribution mailing lists ;-) > Linus thanks, -- darwi http://darwish.chasingpointers.com
Re: [HELP REQUESTED from the community] Was: Staging status of speakup
On Mon, Sep 9, 2019 at 3:55 AM Gregory Nowak wrote: > > On Sun, Sep 08, 2019 at 10:43:02AM +0100, Okash Khawaja wrote: > > Sorry, I have only now got round to working on this. It's not complete > > yet but I have assimilated the feedback and converted subjective > > phrases, like "I think..." into objective statements or put them in > > TODO: so that someone else may verify. I have attached it to this > > email. > > I think bleeps needs a TODO, since we don't know what values it accepts, or > what difference those values make. Also, to keep things uniform, we > should replace my "don't know" for trigger_time with a TODO. Looks > good to me otherwise. Thanks. Great thanks. I have updated. I have two questions: 1. Is it okay for these descriptions to go inside Documentation/ABI/stable? They have been around since 2.6 (2010). Or would we prefer Documentation/ABI/testing/? 2. We are still missing descriptions for i18n/ directory. I have added filenames below. can someone can add description please: What: /sys/accessibility/speakup/i18n/announcements KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/chartab KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/ctl_keys KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/function_names KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/states KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/characters KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/colors KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/formatted KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO What: /sys/accessibility/speakup/i18n/key_names KernelVersion: 2.6 Contact:spea...@linux-speakup.org Description: TODO Thanks, Okash
[GIT PULL] Final batch of KVM changes for Linux 5.3.
Linus, The following changes since commit a7f89616b7376495424f682b6086e0c391a89a1d: Merge branch 'for-5.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup (2019-09-13 09:52:01 +0100) are available in the git repository at: https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus for you to fetch changes up to a9c20bb0206ae9384bd470a6832dd8913730add9: Merge tag 'kvm-s390-master-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master (2019-09-14 09:25:30 +0200) The main change here is a revert of reverts. We recently simplified some code that was thought unnecessary; however, since then KVM has grown quite a few cond_resched()s and for that reason the simplified code is prone to livelocks---one CPUs tries to empty a list of guest page tables while the others keep adding to them. This adds back the generation-based zapping of guest page tables, which was not unnecessary after all. On top of this, there is a fix for a kernel memory leak and a couple of s390 fixlets as well. Fuqian Huang (1): KVM: x86: work around leak of uninitialized stack contents Igor Mammedov (1): KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset() Paolo Bonzini (2): KVM: nVMX: handle page fault in vmread Merge tag 'kvm-s390-master-5.3-1' of git://git.kernel.org/.../kvms390/linux into kvm-master Sean Christopherson (1): KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot Thomas Huth (1): KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl arch/s390/kvm/interrupt.c | 10 arch/s390/kvm/kvm-s390.c| 4 +- arch/x86/include/asm/kvm_host.h | 2 + arch/x86/kvm/mmu.c | 101 +++- arch/x86/kvm/vmx/nested.c | 4 +- arch/x86/kvm/x86.c | 7 +++ 6 files changed, 124 insertions(+), 4 deletions(-)
Re: [PATCH] staging: r8188eu: replace rtw_malloc() with it's definition
On Sat, Sep 14, 2019 at 06:18:03PM +0300, Ivan Safonov wrote: > On 9/10/19 2:59 PM, Dan Carpenter wrote: > > On Sun, Sep 08, 2019 at 12:00:26PM +0300, Ivan Safonov wrote >> rtw_malloc > > prevents the use of kmemdup/kzalloc and others. > > > > > > Signed-off-by: Ivan Safonov > > > --- > > > drivers/staging/rtl8188eu/core/rtw_ap.c| 4 ++-- > > > drivers/staging/rtl8188eu/core/rtw_mlme_ext.c | 2 +- > > > .../staging/rtl8188eu/include/osdep_service.h | 3 --- > > > drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 18 +- > > > drivers/staging/rtl8188eu/os_dep/mlme_linux.c | 2 +- > > > .../staging/rtl8188eu/os_dep/osdep_service.c | 7 +-- > > > 6 files changed, 14 insertions(+), 22 deletions(-) > > > > > > diff --git a/drivers/staging/rtl8188eu/core/rtw_ap.c > > > b/drivers/staging/rtl8188eu/core/rtw_ap.c > > > index 51a5b71f8c25..c9c57379b7a2 100644 > > > --- a/drivers/staging/rtl8188eu/core/rtw_ap.c > > > +++ b/drivers/staging/rtl8188eu/core/rtw_ap.c > > > @@ -104,7 +104,7 @@ static void update_BCNTIM(struct adapter *padapter) > > > } > > > if (remainder_ielen > 0) { > > > - pbackup_remainder_ie = rtw_malloc(remainder_ielen); > > > + pbackup_remainder_ie = kmalloc(remainder_ielen, in_interrupt() > > > ? GFP_ATOMIC : GFP_KERNEL); > > > > ^ > > This stuff isn't right. It really should be checking if spinlocks are > > held or IRQs are disabled. And the only way to do that is by auditing > > the callers. > I hope to make these changes later as separate independent patches. > This patch do only one thing - remove rtw_malloc(). No, just do that in one step. regards, dan carpenter
Re: [PATCH] irqchip/sifive-plic: add irq_mask and irq_unmask
On Sat, 14 Sep 2019 12:42:32 PDT (-0700), charles.papon...@gmail.com wrote: I had issues with that plic driver. The current implementation wasn't usable with driver using level sensitive interrupt together with the IRQF_ONESHOT flag. Those null were producing crashes in the chained_irq_enter function. Filling them with dummy function fixed the issue. I'm not arguing it fixes a crash, the code Darius pointed to obviously doesn't check for NULL before calling these functions and will therefor crash. There is a bunch of other code that does check, though, so I guess my question is really: is the bug in the PLIC driver, or in this header? If we're not allowed to have these as NULL and there's nothing to do, then this is a reasonable patch. I'm just not capable of answering that question, as I'm not an irqchip maintainer :) On Sat, Sep 14, 2019 at 9:00 PM Palmer Dabbelt wrote: On Thu, 12 Sep 2019 14:40:34 PDT (-0700), Darius Rad wrote: > As per the existing comment, irq_mask and irq_unmask do not need > to do anything for the PLIC. However, the functions must exist > (the pointers cannot be NULL) as they are not optional, based on > the documentation (Documentation/core-api/genericirq.rst) as well > as existing usage (e.g., include/linux/irqchip/chained_irq.h). > > Signed-off-by: Darius Rad > --- > drivers/irqchip/irq-sifive-plic.c | 13 + > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c > index cf755964f2f8..52d5169f924f 100644 > --- a/drivers/irqchip/irq-sifive-plic.c > +++ b/drivers/irqchip/irq-sifive-plic.c > @@ -111,6 +111,13 @@ static void plic_irq_disable(struct irq_data *d) > plic_irq_toggle(cpu_possible_mask, d->hwirq, 0); > } > > +/* > + * There is no need to mask/unmask PLIC interrupts. They are "masked" > + * by reading claim and "unmasked" when writing it back. > + */ > +static void plic_irq_mask(struct irq_data *d) { } > +static void plic_irq_unmask(struct irq_data *d) { } > + > #ifdef CONFIG_SMP > static int plic_set_affinity(struct irq_data *d, >const struct cpumask *mask_val, bool force) > @@ -138,12 +145,10 @@ static int plic_set_affinity(struct irq_data *d, > > static struct irq_chip plic_chip = { > .name = "SiFive PLIC", > - /* > - * There is no need to mask/unmask PLIC interrupts. They are "masked" > - * by reading claim and "unmasked" when writing it back. > - */ > .irq_enable = plic_irq_enable, > .irq_disable= plic_irq_disable, > + .irq_mask = plic_irq_mask, > + .irq_unmask = plic_irq_unmask, > #ifdef CONFIG_SMP > .irq_set_affinity = plic_set_affinity, > #endif I can't find any other drivers in irqchip with empty irq_mask/irq_unmask. I'm not well versed in irqchip stuff, so I'll leave it up to the irqchip maintainers to comment on if this is the right way to do this. Either way, I'm assuming it'll go in through some the irqchip tree so Acked-by: Palmer Dabbelt just to make sure I don't get in the way if it is the right way to do it :). ___ linux-riscv mailing list linux-ri...@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-riscv
Re: [PATCH 0/5] tools/power/x86/intel-speed-select: New command and
On Sat, Sep 14, 2019 at 10:22 PM Srinivas Pandruvada wrote: > > On Sat, 2019-09-14 at 20:19 +0300, Andy Shevchenko wrote: > > On Sat, Sep 14, 2019 at 12:05:08AM -0700, Srinivas Pandruvada wrote: > > > This series contains some minor fixes, when firmware mask is > > > including > > > invalid CPU in the perf-profile mask. Also add some commands to > > > better manage core-power feature. > > > > Hmm... 150+ LOCs doesn't count to me as minor fixes. > > So, are you considering this a material for v5.4? > Sorry, I should be clear. It is for 5.4. I am trying to catch merge > window. None of the fixes are critical. The majority of the code is > added for new command features. > > What is your cut off for 5.4? I want to send some more features if > possible for 5.4. First PR already had been sent to Linus. -- With Best Regards, Andy Shevchenko
[PATCH v2 2/5] tools/power/x86/intel-speed-select: Allow online/offline based on tdp
Using enable core mask, do online offline CPUs. There is a new option --online|-o for set-config-level. Signed-off-by: Srinivas Pandruvada --- .../x86/intel-speed-select/isst-config.c | 60 +-- tools/power/x86/intel-speed-select/isst.h | 2 + 2 files changed, 58 insertions(+), 4 deletions(-) diff --git a/tools/power/x86/intel-speed-select/isst-config.c b/tools/power/x86/intel-speed-select/isst-config.c index 83ac72902b36..e505aaf2beef 100644 --- a/tools/power/x86/intel-speed-select/isst-config.c +++ b/tools/power/x86/intel-speed-select/isst-config.c @@ -38,6 +38,7 @@ static int fact_avx = 0xFF; static unsigned long long fact_trl; static int out_format_json; static int cmd_help; +static int force_online_offline; /* clos related */ static int current_clos = -1; @@ -138,14 +139,14 @@ int out_format_is_json(void) int get_physical_package_id(int cpu) { return parse_int_file( - 1, "/sys/devices/system/cpu/cpu%d/topology/physical_package_id", + 0, "/sys/devices/system/cpu/cpu%d/topology/physical_package_id", cpu); } int get_physical_core_id(int cpu) { return parse_int_file( - 1, "/sys/devices/system/cpu/cpu%d/topology/core_id", cpu); + 0, "/sys/devices/system/cpu/cpu%d/topology/core_id", cpu); } int get_physical_die_id(int cpu) @@ -165,6 +166,26 @@ int get_topo_max_cpus(void) return topo_max_cpus; } +static void set_cpu_online_offline(int cpu, int state) +{ + char buffer[128]; + int fd; + + snprintf(buffer, sizeof(buffer), +"/sys/devices/system/cpu/cpu%d/online", cpu); + + fd = open(buffer, O_WRONLY); + if (fd < 0) + err(-1, "%s open failed", buffer); + + if (state) + write(fd, "1\n", 2); + else + write(fd, "0\n", 2); + + close(fd); +} + #define MAX_PACKAGE_COUNT 8 #define MAX_DIE_PER_PACKAGE 2 static void for_each_online_package_in_set(void (*callback)(int, void *, void *, @@ -736,9 +757,34 @@ static void set_tdp_level_for_cpu(int cpu, void *arg1, void *arg2, void *arg3, ret = isst_set_tdp_level(cpu, tdp_level); if (ret) perror("set_tdp_level_for_cpu"); - else + else { isst_display_result(cpu, outf, "perf-profile", "set_tdp_level", ret); + if (force_online_offline) { + struct isst_pkg_ctdp_level_info ctdp_level; + int pkg_id = get_physical_package_id(cpu); + int die_id = get_physical_die_id(cpu); + + fprintf(stderr, "Option is set to online/offline\n"); + ctdp_level.core_cpumask_size = + alloc_cpu_set(&ctdp_level.core_cpumask); + isst_get_coremask_info(cpu, tdp_level, &ctdp_level); + if (ctdp_level.cpu_count) { + int i, max_cpus = get_topo_max_cpus(); + for (i = 0; i < max_cpus; ++i) { + if (pkg_id != get_physical_package_id(i) || die_id != get_physical_die_id(i)) + continue; + if (CPU_ISSET_S(i, ctdp_level.core_cpumask_size, ctdp_level.core_cpumask)) { + fprintf(stderr, "online cpu %d\n", i); + set_cpu_online_offline(i, 1); + } else { + fprintf(stderr, "offline cpu %d\n", i); + set_cpu_online_offline(i, 0); + } + } + } + } + } } static void set_tdp_level(void) @@ -747,6 +793,8 @@ static void set_tdp_level(void) fprintf(stderr, "Set Config TDP level\n"); fprintf(stderr, "\t Arguments: -l|--level : Specify tdp level\n"); + fprintf(stderr, + "\t Optional Arguments: -o | online : online/offline for the tdp level\n"); exit(0); } @@ -1319,6 +1367,7 @@ static void parse_cmd_args(int argc, int start, char **argv) static struct option long_options[] = { { "bucket", required_argument, 0, 'b' }, { "level", required_argument, 0, 'l' }, + { "online", required_argument, 0, 'o' }, { "trl-type", required_argument, 0, 'r' }, { "trl", required_argument, 0, 't' }, { "help", no_argument, 0, 'h' }, @@ -1335,7 +1384,7 @@ static void parse_cmd_args(int argc, int start, char **argv) option_index = start; optind = start + 1; - while ((opt
[PATCH v2 1/5] tools/power/x86/intel-speed-select: Fix high priority core mask over count
From: Youquan Song If the CPU package has the less logical CPU than topo_max_cpus, but un-present CPU's punit_cpu_core will be initiated to 0 and they will be count to core 0 Like below, there are only 10 high priority cores (20 logical CPUs) in the CPU package, but it count to 27 logic CPUs. ./intel-speed-select base-freq info -l 0 | grep mask high-priority-cpu-mask:7f000179,f000179f With the fix patch: ./intel-speed-select base-freq info -l 0 high-priority-cpu-mask:0179,f000179f Signed-off-by: Youquan Song Signed-off-by: Srinivas Pandruvada --- tools/power/x86/intel-speed-select/isst-config.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tools/power/x86/intel-speed-select/isst-config.c b/tools/power/x86/intel-speed-select/isst-config.c index 59753b3917bb..83ac72902b36 100644 --- a/tools/power/x86/intel-speed-select/isst-config.c +++ b/tools/power/x86/intel-speed-select/isst-config.c @@ -402,6 +402,9 @@ void set_cpu_mask_from_punit_coremask(int cpu, unsigned long long core_mask, int j; for (j = 0; j < topo_max_cpus; ++j) { + if (!CPU_ISSET_S(j, present_cpumask_size, present_cpumask)) + continue; + if (cpu_map[j].pkg_id == pkg_id && cpu_map[j].die_id == die_id && cpu_map[j].punit_cpu_core == i) { -- 2.17.2
[PATCH v2 4/5] tools/power/x86/intel-speed-select: Fix some debug prints
Fix wrong debug print for cpu, which is displayed as CLOS. Also avoid printing clos id, when user is specify clos as parameter. Signed-off-by: Srinivas Pandruvada --- tools/power/x86/intel-speed-select/isst-config.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/tools/power/x86/intel-speed-select/isst-config.c b/tools/power/x86/intel-speed-select/isst-config.c index c2892d86be36..889396d676cb 100644 --- a/tools/power/x86/intel-speed-select/isst-config.c +++ b/tools/power/x86/intel-speed-select/isst-config.c @@ -508,7 +508,7 @@ int isst_send_mbox_command(unsigned int cpu, unsigned char command, int write = 0; int clos_id, core_id, ret = 0; - debug_printf("CLOS %d\n", cpu); + debug_printf("CPU %d\n", cpu); if (parameter & BIT(MBOX_CMD_WRITE_BIT)) { value = req_data; @@ -1421,7 +1421,6 @@ static void parse_cmd_args(int argc, int start, char **argv) /* CLOS related */ case 'c': current_clos = atoi(optarg); - printf("clos %d\n", current_clos); break; case 'd': clos_desired = atoi(optarg); -- 2.17.2
[PATCH v2 0/5] tools/power/x86/intel-speed-select: New command and
This series contains some minor fixes, when firmware mask is including invalid CPU in the perf-profile mask. Also add some commands to better manage core-power feature. v2: Add exit(0) for invalid argument when -c/--cpu not specified for new clos commands. Fix online/offline based on TDP change to only do for local die or package. Srinivas Pandruvada (4): tools/power/x86/intel-speed-select: Allow online/offline based on tdp tools/power/x86/intel-speed-select: Format get-assoc information tools/power/x86/intel-speed-select: Fix some debug prints tools/power/x86/intel-speed-select: Extend core-power command set Youquan Song (1): tools/power/x86/intel-speed-select: Fix high priority core mask over count .../x86/intel-speed-select/isst-config.c | 118 -- .../power/x86/intel-speed-select/isst-core.c | 25 .../x86/intel-speed-select/isst-display.c | 51 tools/power/x86/intel-speed-select/isst.h | 9 +- 4 files changed, 190 insertions(+), 13 deletions(-) -- 2.17.2
[PATCH v2 3/5] tools/power/x86/intel-speed-select: Format get-assoc information
Format the get-assoc command output consistant with other commands. For example: Intel(R) Speed Select Technology Executing on CPU model:142[0x8e] package-0 die-0 cpu-0 get-assoc clos:0 Signed-off-by: Srinivas Pandruvada --- .../x86/intel-speed-select/isst-config.c | 14 +++ .../x86/intel-speed-select/isst-display.c | 23 +++ tools/power/x86/intel-speed-select/isst.h | 2 +- 3 files changed, 33 insertions(+), 6 deletions(-) diff --git a/tools/power/x86/intel-speed-select/isst-config.c b/tools/power/x86/intel-speed-select/isst-config.c index e505aaf2beef..c2892d86be36 100644 --- a/tools/power/x86/intel-speed-select/isst-config.c +++ b/tools/power/x86/intel-speed-select/isst-config.c @@ -1249,7 +1249,7 @@ static void get_clos_assoc_for_cpu(int cpu, void *arg1, void *arg2, void *arg3, if (ret) perror("isst_clos_get_assoc_status"); else - isst_display_result(cpu, outf, "core-power", "get-assoc", clos); + isst_clos_display_assoc_information(cpu, outf, clos); } static void get_clos_assoc(void) @@ -1259,13 +1259,17 @@ static void get_clos_assoc(void) fprintf(stderr, "\tSpecify targeted cpu id with [--cpu|-c]\n"); exit(0); } - if (max_target_cpus) - for_each_online_target_cpu_in_set(get_clos_assoc_for_cpu, NULL, - NULL, NULL, NULL); - else { + + if (!max_target_cpus) { fprintf(stderr, "Invalid target cpu. Specify with [-c|--cpu]\n"); + exit(0); } + + isst_ctdp_display_information_start(outf); + for_each_online_target_cpu_in_set(get_clos_assoc_for_cpu, NULL, + NULL, NULL, NULL); + isst_ctdp_display_information_end(outf); } static struct process_cmd_struct isst_cmds[] = { diff --git a/tools/power/x86/intel-speed-select/isst-display.c b/tools/power/x86/intel-speed-select/isst-display.c index df4aa99c4e92..bd7aaf27e4de 100644 --- a/tools/power/x86/intel-speed-select/isst-display.c +++ b/tools/power/x86/intel-speed-select/isst-display.c @@ -503,6 +503,29 @@ void isst_clos_display_information(int cpu, FILE *outf, int clos, format_and_print(outf, 1, NULL, NULL); } +void isst_clos_display_assoc_information(int cpu, FILE *outf, int clos) +{ + char header[256]; + char value[256]; + + snprintf(header, sizeof(header), "package-%d", +get_physical_package_id(cpu)); + format_and_print(outf, 1, header, NULL); + snprintf(header, sizeof(header), "die-%d", get_physical_die_id(cpu)); + format_and_print(outf, 2, header, NULL); + snprintf(header, sizeof(header), "cpu-%d", cpu); + format_and_print(outf, 3, header, NULL); + + snprintf(header, sizeof(header), "get-assoc"); + format_and_print(outf, 4, header, NULL); + + snprintf(header, sizeof(header), "clos"); + snprintf(value, sizeof(value), "%d", clos); + format_and_print(outf, 5, header, value); + + format_and_print(outf, 1, NULL, NULL); +} + void isst_display_result(int cpu, FILE *outf, char *feature, char *cmd, int result) { diff --git a/tools/power/x86/intel-speed-select/isst.h b/tools/power/x86/intel-speed-select/isst.h index 668f914d077f..48655d0dee2d 100644 --- a/tools/power/x86/intel-speed-select/isst.h +++ b/tools/power/x86/intel-speed-select/isst.h @@ -225,7 +225,7 @@ extern int isst_clos_associate(int cpu, int clos); extern int isst_clos_get_assoc_status(int cpu, int *clos_id); extern void isst_clos_display_information(int cpu, FILE *outf, int clos, struct isst_clos_config *clos_config); - +extern void isst_clos_display_assoc_information(int cpu, FILE *outf, int clos); extern int isst_read_reg(unsigned short reg, unsigned int *val); extern int isst_write_reg(int reg, unsigned int val); -- 2.17.2
[PATCH v2 5/5] tools/power/x86/intel-speed-select: Extend core-power command set
Add additional command to get the clos enable and priority type. The current info option is actually dumping per clos QOS config, so name the command appropriately to get-config. Signed-off-by: Srinivas Pandruvada --- .../x86/intel-speed-select/isst-config.c | 38 ++- .../power/x86/intel-speed-select/isst-core.c | 25 .../x86/intel-speed-select/isst-display.c | 28 ++ tools/power/x86/intel-speed-select/isst.h | 5 +++ 4 files changed, 95 insertions(+), 1 deletion(-) diff --git a/tools/power/x86/intel-speed-select/isst-config.c b/tools/power/x86/intel-speed-select/isst-config.c index 889396d676cb..6a54e165672d 100644 --- a/tools/power/x86/intel-speed-select/isst-config.c +++ b/tools/power/x86/intel-speed-select/isst-config.c @@ -1133,6 +1133,40 @@ static void dump_clos_config(void) isst_ctdp_display_information_end(outf); } +static void get_clos_info_for_cpu(int cpu, void *arg1, void *arg2, void *arg3, + void *arg4) +{ + int enable, ret, prio_type; + + ret = isst_clos_get_clos_information(cpu, &enable, &prio_type); + if (ret) + perror("isst_clos_get_info"); + else + isst_clos_display_clos_information(cpu, outf, enable, prio_type); +} + +static void dump_clos_info(void) +{ + if (cmd_help) { + fprintf(stderr, + "Print Intel Speed Select Technology core power information\n"); + fprintf(stderr, "\tSpecify targeted cpu id with [--cpu|-c]\n"); + exit(0); + } + + if (!max_target_cpus) { + fprintf(stderr, + "Invalid target cpu. Specify with [-c|--cpu]\n"); + exit(0); + } + + isst_ctdp_display_information_start(outf); + for_each_online_target_cpu_in_set(get_clos_info_for_cpu, NULL, + NULL, NULL, NULL); + isst_ctdp_display_information_end(outf); + +} + static void set_clos_config_for_cpu(int cpu, void *arg1, void *arg2, void *arg3, void *arg4) { @@ -1286,10 +1320,11 @@ static struct process_cmd_struct isst_cmds[] = { { "turbo-freq", "info", dump_fact_config }, { "turbo-freq", "enable", set_fact_enable }, { "turbo-freq", "disable", set_fact_disable }, - { "core-power", "info", dump_clos_config }, + { "core-power", "info", dump_clos_info }, { "core-power", "enable", set_clos_enable }, { "core-power", "disable", set_clos_disable }, { "core-power", "config", set_clos_config }, + { "core-power", "get-config", dump_clos_config }, { "core-power", "assoc", set_clos_assoc }, { "core-power", "get-assoc", get_clos_assoc }, { NULL, NULL, NULL } @@ -1491,6 +1526,7 @@ static void core_power_help(void) printf("\tenable\n"); printf("\tdisable\n"); printf("\tconfig\n"); + printf("\tget-config\n"); printf("\tassoc\n"); printf("\tget-assoc\n"); } diff --git a/tools/power/x86/intel-speed-select/isst-core.c b/tools/power/x86/intel-speed-select/isst-core.c index 0bf341ad9697..6dee5332c9d3 100644 --- a/tools/power/x86/intel-speed-select/isst-core.c +++ b/tools/power/x86/intel-speed-select/isst-core.c @@ -619,6 +619,31 @@ int isst_get_process_ctdp(int cpu, int tdp_level, struct isst_pkg_ctdp *pkg_dev) return 0; } +int isst_clos_get_clos_information(int cpu, int *enable, int *type) +{ + unsigned int resp; + int ret; + + ret = isst_send_mbox_command(cpu, CONFIG_CLOS, CLOS_PM_QOS_CONFIG, 0, 0, +&resp); + if (ret) + return ret; + + debug_printf("cpu:%d CLOS_PM_QOS_CONFIG resp:%x\n", cpu, resp); + + if (resp & BIT(1)) + *enable = 1; + else + *enable = 0; + + if (resp & BIT(2)) + *type = 1; + else + *type = 0; + + return 0; +} + int isst_pm_qos_config(int cpu, int enable_clos, int priority_type) { unsigned int req, resp; diff --git a/tools/power/x86/intel-speed-select/isst-display.c b/tools/power/x86/intel-speed-select/isst-display.c index bd7aaf27e4de..2e6e5fcdbd7c 100644 --- a/tools/power/x86/intel-speed-select/isst-display.c +++ b/tools/power/x86/intel-speed-select/isst-display.c @@ -503,6 +503,34 @@ void isst_clos_display_information(int cpu, FILE *outf, int clos, format_and_print(outf, 1, NULL, NULL); } +void isst_clos_display_clos_information(int cpu, FILE *outf, + int clos_enable, int type) +{ + char header[256]; + char value[256]; + + snprintf(header, sizeof(header), "package-%d", +get_physical_package_id(cpu)); + format_and_print(outf, 1, header, NULL); + snprintf(header, sizeof(header), "die-%d", get_physical_die_id(cpu)); + forma
Re: [PATCH] irqchip/sifive-plic: add irq_mask and irq_unmask
I had issues with that plic driver. The current implementation wasn't usable with driver using level sensitive interrupt together with the IRQF_ONESHOT flag. Those null were producing crashes in the chained_irq_enter function. Filling them with dummy function fixed the issue. On Sat, Sep 14, 2019 at 9:00 PM Palmer Dabbelt wrote: > > On Thu, 12 Sep 2019 14:40:34 PDT (-0700), Darius Rad wrote: > > As per the existing comment, irq_mask and irq_unmask do not need > > to do anything for the PLIC. However, the functions must exist > > (the pointers cannot be NULL) as they are not optional, based on > > the documentation (Documentation/core-api/genericirq.rst) as well > > as existing usage (e.g., include/linux/irqchip/chained_irq.h). > > > > Signed-off-by: Darius Rad > > --- > > drivers/irqchip/irq-sifive-plic.c | 13 + > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/irqchip/irq-sifive-plic.c > > b/drivers/irqchip/irq-sifive-plic.c > > index cf755964f2f8..52d5169f924f 100644 > > --- a/drivers/irqchip/irq-sifive-plic.c > > +++ b/drivers/irqchip/irq-sifive-plic.c > > @@ -111,6 +111,13 @@ static void plic_irq_disable(struct irq_data *d) > > plic_irq_toggle(cpu_possible_mask, d->hwirq, 0); > > } > > > > +/* > > + * There is no need to mask/unmask PLIC interrupts. They are "masked" > > + * by reading claim and "unmasked" when writing it back. > > + */ > > +static void plic_irq_mask(struct irq_data *d) { } > > +static void plic_irq_unmask(struct irq_data *d) { } > > + > > #ifdef CONFIG_SMP > > static int plic_set_affinity(struct irq_data *d, > >const struct cpumask *mask_val, bool force) > > @@ -138,12 +145,10 @@ static int plic_set_affinity(struct irq_data *d, > > > > static struct irq_chip plic_chip = { > > .name = "SiFive PLIC", > > - /* > > - * There is no need to mask/unmask PLIC interrupts. They are "masked" > > - * by reading claim and "unmasked" when writing it back. > > - */ > > .irq_enable = plic_irq_enable, > > .irq_disable= plic_irq_disable, > > + .irq_mask = plic_irq_mask, > > + .irq_unmask = plic_irq_unmask, > > #ifdef CONFIG_SMP > > .irq_set_affinity = plic_set_affinity, > > #endif > > I can't find any other drivers in irqchip with empty irq_mask/irq_unmask. I'm > not well versed in irqchip stuff, so I'll leave it up to the irqchip > maintainers to comment on if this is the right way to do this. Either way, > I'm > assuming it'll go in through some the irqchip tree so > > Acked-by: Palmer Dabbelt > > just to make sure I don't get in the way if it is the right way to do it :). > > ___ > linux-riscv mailing list > linux-ri...@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
[PULL] vhost: a last minute revert
So I made a mess of it. Sent a pull before making sure it works on 32 bit too. Hope it's not too late to revert. Will teach me to be way more careful in the near future. The following changes since commit 060423bfdee3f8bc6e2c1bac97de24d5415e2bc4: vhost: make sure log_num < in_num (2019-09-11 15:15:26 -0400) are available in the Git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus for you to fetch changes up to 0d4a3f2abbef73b9e5bb5f12213c275565473588: Revert "vhost: block speculation of translated descriptors" (2019-09-14 15:21:51 -0400) virtio: a last minute revert 32 bit build got broken by the latest defence in depth patch. Revert and we'll try again in the next cycle. Signed-off-by: Michael S. Tsirkin Michael S. Tsirkin (1): Revert "vhost: block speculation of translated descriptors" drivers/vhost/vhost.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-)
Re: [PATCH] vhost: Fix compile time error
On Sat, Sep 14, 2019 at 01:44:57AM -0700, Guenter Roeck wrote: > Building vhost on 32-bit targets results in the following error. > > drivers/vhost/vhost.c: In function 'translate_desc': > include/linux/compiler.h:549:38: error: > call to '__compiletime_assert_1879' declared with attribute error: > BUILD_BUG_ON failed: sizeof(_s) > sizeof(long) > > Fixes: a89db445fbd7 ("vhost: block speculation of translated descriptors") > Cc: Michael S. Tsirkin > Cc: Jason Wang > Signed-off-by: Guenter Roeck > --- > drivers/vhost/vhost.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c > index acabf20b069e..102a0c877007 100644 > --- a/drivers/vhost/vhost.c > +++ b/drivers/vhost/vhost.c > @@ -2074,7 +2074,7 @@ static int translate_desc(struct vhost_virtqueue *vq, > u64 addr, u32 len, > _iov->iov_base = (void __user *) > ((unsigned long)node->userspace_addr + >array_index_nospec((unsigned long)(addr - node->start), > - node->size)); > + (unsigned long)node->size)); Unfortunately this does not fix the case where size is actually 64 bit, e.g. a single node with VA 0, size 2^32 is how you cover the whole virtual address space. this is not how qemu uses it, but it's valid. I think it's best to just revert the patch for now. > s += size; > addr += size; > ++ret; > -- > 2.7.4
Re: [PATCH] pinctrl: at91-pio4: implement .get_multiple and .set_multiple
On Thu, Sep 05, 2019 at 04:13:04PM +0200, Alexandre Belloni wrote: > > Implement .get_multiple and .set_multiple to allow reading or setting > multiple pins simultaneously. Pins in the same bank will all be switched at > the same time, improving synchronization and performances. > > Signed-off-by: Alexandre Belloni Acked-by: Ludovic Desroches Thanks for this improvement. You can keep my ack for v3 as the changes should be the commit message only. I'll be off for three weeks. Regards Ludovic > --- > drivers/pinctrl/pinctrl-at91-pio4.c | 60 + > 1 file changed, 60 insertions(+) > > diff --git a/drivers/pinctrl/pinctrl-at91-pio4.c > b/drivers/pinctrl/pinctrl-at91-pio4.c > index d6de4d360cd4..488a302a60d4 100644 > --- a/drivers/pinctrl/pinctrl-at91-pio4.c > +++ b/drivers/pinctrl/pinctrl-at91-pio4.c > @@ -328,6 +328,35 @@ static int atmel_gpio_get(struct gpio_chip *chip, > unsigned offset) > return !!(reg & BIT(pin->line)); > } > > +static int atmel_gpio_get_multiple(struct gpio_chip *chip, unsigned long > *mask, > +unsigned long *bits) > +{ > + struct atmel_pioctrl *atmel_pioctrl = gpiochip_get_data(chip); > + unsigned int bank; > + > + bitmap_zero(bits, atmel_pioctrl->npins); > + > + for (bank = 0; bank < atmel_pioctrl->nbanks; bank++) { > + unsigned int word = bank; > + unsigned int offset = 0; > + unsigned int reg; > + > +#if ATMEL_PIO_NPINS_PER_BANK != BITS_PER_LONG > + word = BIT_WORD(bank * ATMEL_PIO_NPINS_PER_BANK); > + offset = bank * ATMEL_PIO_NPINS_PER_BANK % BITS_PER_LONG; > +#endif > + if (!mask[word]) > + continue; > + > + reg = atmel_gpio_read(atmel_pioctrl, bank, ATMEL_PIO_PDSR); > + bits[word] |= mask[word] & (reg << offset); > + > + pr_err("ABE: %d %08x\n", bank, bits[word]); > + } > + > + return 0; > +} > + > static int atmel_gpio_direction_output(struct gpio_chip *chip, unsigned > offset, > int value) > { > @@ -358,11 +387,42 @@ static void atmel_gpio_set(struct gpio_chip *chip, > unsigned offset, int val) >BIT(pin->line)); > } > > +static void atmel_gpio_set_multiple(struct gpio_chip *chip, unsigned long > *mask, > + unsigned long *bits) > +{ > + struct atmel_pioctrl *atmel_pioctrl = gpiochip_get_data(chip); > + unsigned int bank; > + > + for (bank = 0; bank < atmel_pioctrl->nbanks; bank++) { > + unsigned int bitmask; > + unsigned int word = bank; > + > +#if ATMEL_PIO_NPINS_PER_BANK != BITS_PER_LONG > + word = BIT_WORD(bank * ATMEL_PIO_NPINS_PER_BANK); > +#endif > + if (!mask[word]) > + continue; > + > + bitmask = mask[word] & bits[word]; > + atmel_gpio_write(atmel_pioctrl, bank, ATMEL_PIO_SODR, bitmask); > + > + bitmask = mask[word] & ~bits[word]; > + atmel_gpio_write(atmel_pioctrl, bank, ATMEL_PIO_CODR, bitmask); > + > +#if ATMEL_PIO_NPINS_PER_BANK != BITS_PER_LONG > + mask[word] >>= ATMEL_PIO_NPINS_PER_BANK; > + bits[word] >>= ATMEL_PIO_NPINS_PER_BANK; > +#endif > + } > +} > + > static struct gpio_chip atmel_gpio_chip = { > .direction_input= atmel_gpio_direction_input, > .get= atmel_gpio_get, > + .get_multiple = atmel_gpio_get_multiple, > .direction_output = atmel_gpio_direction_output, > .set= atmel_gpio_set, > + .set_multiple = atmel_gpio_set_multiple, > .to_irq = atmel_gpio_to_irq, > .base = 0, > }; > -- > 2.21.0 > >
Re: [PATCH v5 0/9] i2c: add support for filters
On Wed, Sep 11, 2019 at 10:24:14AM +0200, Eugen Hristev - M18282 wrote: > From: Eugen Hristev > > Hello, > > This series adds support for analog and digital filters for i2c controllers > > This series is based on the series: > [PATCH v2 0/9] i2c: at91: filters support for at91 SoCs > and later > [PATCH v4 0/9] i2c: add support for filters > and enhanced to add the bindings for all controllers plus an extra bindings > for the width of the spikes in nanoseconds (digital filters) and cut-off > frequency (analog filters) > > First, bindings are created for > 'i2c-analog-filter' > 'i2c-digital-filter' > 'i2c-digital-filter-width-ns' > 'i2c-analog-filter-cutoff-frequency' > > The support is added in the i2c core to retrieve filter width/cutoff frequency > and add it to the timings structure. > Next, the at91 driver is enhanced for supporting digital filter, advanced > digital filter (with selectable spike width) and the analog filter. > > Finally the device tree for two boards are modified to make use of the > new properties. > > This series is the result of the comments on the ML in the direction > requested: to make the bindings globally available for i2c drivers. > > Changes in v5: > - renamed i2c-filter-width-ns to i2c-digital-filter-width-ns as this > is applicable only to digital filter > - created new binding i2c-digital-filter-width-ns for analog filters. Acked-by: Ludovic Desroches for at91 stuff. You can keep it for the future if needed as long as changes mainly concerned the generic binding. Regards Ludovic > > Changes in v4: > - renamed i2c-ana-filter to i2c-analog-filter > - renamed i2c-dig-filter to i2c-digital-filter > > Changes in v3: > - made bindings global for i2c controllers and modified accordingly > - gave up PADFCDF bit because it's a lack in datasheet > - the computation on the width of the spike is based on periph clock as it > is done for hold time. > > Changes in v2: > - added device tree bindings and support for enable-ana-filt and > enable-dig-filt > - added the new properties to the DT for sama5d4_xplained/sama5d2_xplained > > Eugen Hristev (9): > dt-bindings: i2c: at91: add new compatible > dt-bindings: i2c: add bindings for i2c analog and digital filter > i2c: add support for filters optional properties > i2c: at91: add new platform support for sam9x60 > i2c: at91: add support for digital filtering > i2c: at91: add support for advanced digital filtering > i2c: at91: add support for analog filtering > ARM: dts: at91: sama5d2_xplained: add analog and digital filter for > i2c > ARM: dts: at91: sama5d4_xplained: add digital filter for i2c > > Documentation/devicetree/bindings/i2c/i2c-at91.txt | 3 +- > Documentation/devicetree/bindings/i2c/i2c.txt | 18 > arch/arm/boot/dts/at91-sama5d2_xplained.dts| 6 +++ > arch/arm/boot/dts/at91-sama5d4_xplained.dts| 1 + > drivers/i2c/busses/i2c-at91-core.c | 38 + > drivers/i2c/busses/i2c-at91-master.c | 49 > -- > drivers/i2c/busses/i2c-at91.h | 13 ++ > drivers/i2c/i2c-core-base.c| 6 +++ > include/linux/i2c.h| 6 +++ > 9 files changed, 136 insertions(+), 4 deletions(-) > > -- > 2.7.4 >
RE: [RFC] buildtar: add case for riscv architecture
On Sat, 14 Sep 2019 06:05:59 PDT (-0700), Anup Patel wrote: -Original Message- From: linux-kernel-ow...@vger.kernel.org On Behalf Of Palmer Dabbelt Sent: Saturday, September 14, 2019 6:30 PM To: m...@aurabindo.in Cc: Troy Benjegerdes ; Paul Walmsley ; a...@eecs.berkeley.edu; linux- ri...@lists.infradead.org; linux-kernel@vger.kernel.org; linux- kbu...@vger.kernel.org Subject: Re: [RFC] buildtar: add case for riscv architecture On Wed, 11 Sep 2019 05:54:07 PDT (-0700), m...@aurabindo.in wrote: > > >> None of the available RiscV platforms that I’m aware of use compressed images, unless there are some new bootloaders I haven’t seen yet. >> > > I noticed that default build image is Image.gz, which is why I thought its a good idea to copy it into the tarball. Does such a copy not make sense at this point ? Image.gz can't be booted directly: it's just Image that's been compressed with the standard gzip command. A bootloader would have to decompress that image before loading it into memory, which requires extra bootloader support. Contrast that with the zImage style images (which are vmlinuz on x86), which are self-extracting and therefor require no bootloader support. The examples for u-boot all use the "booti" command, which expects uncompressed images. Poking around I couldn't figure out a way to have u-boot decompress the images, but that applies to arm64 as well so I'm not sure if I'm missing something. If I was doing this, I'd copy over arch/riscv/boot/Image and call it "/boot/image-${KERNELRELEASE}", as calling it vmlinuz is a bit confusing to me because I'd expect vmlinuz to be a self-extracting compressed executable and not a raw gzip file. On the contrary, it is indeed possible to boot Image.gz directly using U-Boot booti command so this patch would be useful. Atish had got it working on U-Boot but he has deferred booti Image.gz support due to few more dependent changes. May be he can share more info. Oh, great. I guess it makes sense to just put both in the tarball, then, as users will still need to use the Image format for now. Regards, Anup
Re: [PATCH 0/5] tools/power/x86/intel-speed-select: New command and
On Sat, 2019-09-14 at 20:19 +0300, Andy Shevchenko wrote: > On Sat, Sep 14, 2019 at 12:05:08AM -0700, Srinivas Pandruvada wrote: > > This series contains some minor fixes, when firmware mask is > > including > > invalid CPU in the perf-profile mask. Also add some commands to > > better manage core-power feature. > > Hmm... 150+ LOCs doesn't count to me as minor fixes. > So, are you considering this a material for v5.4? Sorry, I should be clear. It is for 5.4. I am trying to catch merge window. None of the fixes are critical. The majority of the code is added for new command features. What is your cut off for 5.4? I want to send some more features if possible for 5.4. Thanks, Srinivas > > > > Srinivas Pandruvada (4): > > tools/power/x86/intel-speed-select: Allow online/offline based on > > tdp > > tools/power/x86/intel-speed-select: Format get-assoc information > > tools/power/x86/intel-speed-select: Fix some debug prints > > tools/power/x86/intel-speed-select: Extend core-power command set > > > > Youquan Song (1): > > tools/power/x86/intel-speed-select: Fix high priority core mask > > over > > count > > > > .../x86/intel-speed-select/isst-config.c | 108 > > -- > > .../power/x86/intel-speed-select/isst-core.c | 25 > > .../x86/intel-speed-select/isst-display.c | 51 + > > tools/power/x86/intel-speed-select/isst.h | 9 +- > > 4 files changed, 182 insertions(+), 11 deletions(-) > > > > -- > > 2.17.2 > > > >
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 10:09 AM Alexander E. Patrakov wrote: > > > Which means that we're all kinds of screwed. The whole "we guarantee > > entropy" model is broken. > > I agree here. Given that you suggested "to just fill the buffer and > return 0" in the previous mail (well, I think you really meant "return > buflen", otherwise ENOENTROPY == 0 and your previous objection applies), Right. The question remains when we should WARN_ON(), though. For example, if somebody did save entropy between boots, we probably should accept that - at least in the sense of not warning when they then ask for randomness data back. And if the hardware does have a functioning rdrand, we probably should accept that too - simply because not accepting it and warning sounds a bit too annoying. But we definitely *should* have a warning for people who build embedded devices that we can't see any reasonable amount of possible entropy. Those have definitely happened, and it's a serious and real security issue. > let's do just that. As a bonus, it saves applications from the complex > dance with retrying via /dev/urandom and finally brings a reliable API > (modulo old and broken kernels) to get random numbers (well, as random > as possible right now) without needing a file descriptor. Yeah, well, the question in the end always is "what is reliable". Waiting has definitely not been reliable, and has only ever caused problems. Returning an error (or some status while still doing a best effort) would be reasonable, but I really do think that people will mis-use that. We just have too much of a history of people having the mindset that they can just fall back to something better - like waiting - and they are always wrong. Just returning random data is the right thing, but we do need to make sure that system developers see a warning if they do something obviously wrong (so that the embedded people without even a real-time clock to initialize any bits of entropy AT ALL won't think that they can generate a system key on their router). Linus
Re: [PATCH] i2c: at91: Send bus clear command if SCL or SDA is down
On Wed, Sep 11, 2019 at 12:58:54PM +0300, Codrin Ciubotariu wrote: > After a transfer timeout, some faulty I2C slave devices might hold down > the SCL or the SDA pins. We can generate a bus clear command, hoping that > the slave might release the pins. > > Signed-off-by: Codrin Ciubotariu Acked-by: Ludovic Desroches I'll be off for three weeks so if there are minor changes, you can keep my ack. Thanks Ludovic > --- > drivers/i2c/busses/i2c-at91-master.c | 20 > drivers/i2c/busses/i2c-at91.h| 6 +- > 2 files changed, 25 insertions(+), 1 deletion(-) > > diff --git a/drivers/i2c/busses/i2c-at91-master.c > b/drivers/i2c/busses/i2c-at91-master.c > index a3fcc35ffd3b..5f544a16db96 100644 > --- a/drivers/i2c/busses/i2c-at91-master.c > +++ b/drivers/i2c/busses/i2c-at91-master.c > @@ -599,6 +599,26 @@ static int at91_do_twi_transfer(struct at91_twi_dev *dev) > at91_twi_write(dev, AT91_TWI_CR, > AT91_TWI_THRCLR | AT91_TWI_LOCKCLR); > } > + > + /* > + * After timeout, some faulty I2C slave devices might hold SCL/SDA down; > + * we can send a bus clear command, hoping that the pins will be > + * released > + */ > + if (!(dev->transfer_status & AT91_TWI_SDA) || > + !(dev->transfer_status & AT91_TWI_SCL)) { > + dev_dbg(dev->dev, > + "SDA/SCL are down; sending bus clear command\n"); > + if (dev->use_alt_cmd) { > + unsigned int acr; > + > + acr = at91_twi_read(dev, AT91_TWI_ACR); > + acr &= ~AT91_TWI_ACR_DATAL_MASK; > + at91_twi_write(dev, AT91_TWI_ACR, acr); > + } > + at91_twi_write(dev, AT91_TWI_CR, AT91_TWI_CLEAR); > + } > + > return ret; > } > > diff --git a/drivers/i2c/busses/i2c-at91.h b/drivers/i2c/busses/i2c-at91.h > index 499b506f6128..ffb870f3ffc6 100644 > --- a/drivers/i2c/busses/i2c-at91.h > +++ b/drivers/i2c/busses/i2c-at91.h > @@ -36,6 +36,7 @@ > #define AT91_TWI_SVDIS BIT(5) /* Slave Transfer Disable */ > #define AT91_TWI_QUICK BIT(6) /* SMBus quick command */ > #define AT91_TWI_SWRST BIT(7) /* Software Reset */ > +#define AT91_TWI_CLEAR BIT(15) /* Bus clear command */ > #define AT91_TWI_ACMEN BIT(16) /* Alternative Command Mode > Enable */ > #define AT91_TWI_ACMDIS BIT(17) /* Alternative Command Mode > Disable */ > #define AT91_TWI_THRCLR BIT(24) /* Transmit Holding Register > Clear */ > @@ -69,6 +70,8 @@ > #define AT91_TWI_NACK BIT(8) /* Not Acknowledged */ > #define AT91_TWI_EOSACC BIT(11) /* End Of Slave Access */ > #define AT91_TWI_LOCK BIT(23) /* TWI Lock due to Frame Errors > */ > +#define AT91_TWI_SCLBIT(24) /* TWI SCL status */ > +#define AT91_TWI_SDABIT(25) /* TWI SDA status */ > > #define AT91_TWI_INT_MASK \ > (AT91_TWI_TXCOMP | AT91_TWI_RXRDY | AT91_TWI_TXRDY | AT91_TWI_NACK \ > @@ -81,7 +84,8 @@ > #define AT91_TWI_THR0x0034 /* Transmit Holding Register */ > > #define AT91_TWI_ACR0x0040 /* Alternative Command Register > */ > -#define AT91_TWI_ACR_DATAL(len) ((len) & 0xff) > +#define AT91_TWI_ACR_DATAL_MASK GENMASK(15, 0) > +#define AT91_TWI_ACR_DATAL(len) ((len) & AT91_TWI_ACR_DATAL_MASK) > #define AT91_TWI_ACR_DIRBIT(8) > > #define AT91_TWI_FMR0x0050 /* FIFO Mode Register */ > -- > 2.20.1 >
Re: [PATCH] irqchip/sifive-plic: add irq_mask and irq_unmask
On Thu, 12 Sep 2019 14:40:34 PDT (-0700), Darius Rad wrote: As per the existing comment, irq_mask and irq_unmask do not need to do anything for the PLIC. However, the functions must exist (the pointers cannot be NULL) as they are not optional, based on the documentation (Documentation/core-api/genericirq.rst) as well as existing usage (e.g., include/linux/irqchip/chained_irq.h). Signed-off-by: Darius Rad --- drivers/irqchip/irq-sifive-plic.c | 13 + 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c index cf755964f2f8..52d5169f924f 100644 --- a/drivers/irqchip/irq-sifive-plic.c +++ b/drivers/irqchip/irq-sifive-plic.c @@ -111,6 +111,13 @@ static void plic_irq_disable(struct irq_data *d) plic_irq_toggle(cpu_possible_mask, d->hwirq, 0); } +/* + * There is no need to mask/unmask PLIC interrupts. They are "masked" + * by reading claim and "unmasked" when writing it back. + */ +static void plic_irq_mask(struct irq_data *d) { } +static void plic_irq_unmask(struct irq_data *d) { } + #ifdef CONFIG_SMP static int plic_set_affinity(struct irq_data *d, const struct cpumask *mask_val, bool force) @@ -138,12 +145,10 @@ static int plic_set_affinity(struct irq_data *d, static struct irq_chip plic_chip = { .name = "SiFive PLIC", - /* -* There is no need to mask/unmask PLIC interrupts. They are "masked" -* by reading claim and "unmasked" when writing it back. -*/ .irq_enable = plic_irq_enable, .irq_disable= plic_irq_disable, + .irq_mask = plic_irq_mask, + .irq_unmask = plic_irq_unmask, #ifdef CONFIG_SMP .irq_set_affinity = plic_set_affinity, #endif I can't find any other drivers in irqchip with empty irq_mask/irq_unmask. I'm not well versed in irqchip stuff, so I'll leave it up to the irqchip maintainers to comment on if this is the right way to do this. Either way, I'm assuming it'll go in through some the irqchip tree so Acked-by: Palmer Dabbelt just to make sure I don't get in the way if it is the right way to do it :).
Re: [RFC v1 09/14] krsi: Add a helper function for bpf_perf_event_output
On 9/10/19 12:55 PM, KP Singh wrote: > From: KP Singh > > This helper is mapped to the existing operation > BPF_FUNC_perf_event_output. > > An example usage of this function would be: > > #define BUF_SIZE 64; > > struct bpf_map_def SEC("maps") perf_map = { > .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, > .key_size = sizeof(int), > .value_size = sizeof(u32), > .max_entries = MAX_CPUS, > }; could you use a map definition similar to tools/testing/selftests/bpf/progs/test_perf_buffer.c? struct { __uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY); __uint(key_size, sizeof(int)); __uint(value_size, sizeof(u32)); } perf_map SEC(".maps"); > > SEC("krsi") > int bpf_prog1(void *ctx) > { > char buf[BUF_SIZE]; > int len; > u64 flags = BPF_F_CURRENT_CPU; > > /* some logic that fills up buf with len data*/ > len = fill_up_buf(buf); > if (len < 0) > return len; > if (len > BU) BUF_SIZE? > return 0; > > bpf_perf_event_output(ctx, &perf_map, flags, buf len); buf, len? > return 0; > } > > A sample program that showcases the use of bpf_perf_event_output is > added later. > > Signed-off-by: KP Singh > --- > security/krsi/ops.c | 22 ++ > 1 file changed, 22 insertions(+) > > diff --git a/security/krsi/ops.c b/security/krsi/ops.c > index a61508b7018f..57bd304a03f4 100644 > --- a/security/krsi/ops.c > +++ b/security/krsi/ops.c > @@ -111,6 +111,26 @@ static bool krsi_prog_is_valid_access(int off, int size, > return false; > } > > +BPF_CALL_5(krsi_event_output, void *, log, Maybe name the first argument as 'ctx' to follow typical helper convention? > +struct bpf_map *, map, u64, flags, void *, data, u64, size) > +{ > + if (unlikely(flags & ~(BPF_F_INDEX_MASK))) > + return -EINVAL; > + > + return bpf_event_output(map, flags, data, size, NULL, 0, NULL); > +} > + > +static const struct bpf_func_proto krsi_event_output_proto = { > + .func = krsi_event_output, > + .gpl_only = true, > + .ret_type = RET_INTEGER, > + .arg1_type = ARG_PTR_TO_CTX, > + .arg2_type = ARG_CONST_MAP_PTR, > + .arg3_type = ARG_ANYTHING, > + .arg4_type = ARG_PTR_TO_MEM, > + .arg5_type = ARG_CONST_SIZE_OR_ZERO, > +}; > + > static const struct bpf_func_proto *krsi_prog_func_proto(enum bpf_func_id >func_id, >const struct bpf_prog > @@ -121,6 +141,8 @@ static const struct bpf_func_proto > *krsi_prog_func_proto(enum bpf_func_id > return &bpf_map_lookup_elem_proto; > case BPF_FUNC_get_current_pid_tgid: > return &bpf_get_current_pid_tgid_proto; > + case BPF_FUNC_perf_event_output: > + return &krsi_event_output_proto; > default: > return NULL; > } >
Re: [PATCH v2 0/4] task: Making tasks on the runqueue rcu protected
On Sat, Sep 14, 2019 at 5:30 AM Eric W. Biederman wrote: > > I have reworked these patches one more time to make it clear that the > first 3 patches only fix task_struct so that it experiences a rcu grace > period after it leaves the runqueue for the last time. I remain a fan of these patches, and the added comment on the last one is I think a sufficient clarification of the issue. But it's patch 3 that makes me go "yeah, this is the right approach", because it just removes subtle code in favor of something that is understandable. Yes, most of the lines removed may be comments, and so it doesn't actually remove a lot of _code_, but I think the comments are a result of just how subtle and fragile our current approach is, and the new model not needing them as much is I think a real issue (rather than just Eric being less verbose in the new comments and removing lines of code that way). Can anybody see anything wrong with the series? Because I'd love to have it for 5.4, Linus
Re: [RESEND PATCH v3 3/3] arm64: dts: meson-g12b-ugoos-am6: add initial device-tree
Hi Christian, my nit-picks below On Fri, Sep 6, 2019 at 4:34 PM Christian Hewitt wrote: [...] > + spdif_dit: audio-codec-1 { > + #sound-dai-cells = <0>; > + compatible = "linux,spdif-dit"; > + status = "okay"; > + sound-name-prefix = "DIT"; > + }; please move it below sdio_pwrseq (or at least somewhere below the memory node) [...] > + vcc_3v3: regulator-vcc_3v3 { > + compatible = "regulator-fixed"; > + regulator-name = "VCC_3V3"; > + regulator-min-microvolt = <330>; > + regulator-max-microvolt = <330>; > + vin-supply = <&vddao_3v3>; > + regulator-always-on; > + /* FIXME: actually controlled by VDDCPU_B_EN */ can we add the enable GPIO here now that we know how to describe the VDDCPU_B regulator? [...] > + usb1_pow: regulator-usb1_pow { for consistency with the regulators above: regulator-usb1-pow [...] > + usb_pwr_en: regulator-usb_pwr_en { for consistency with the regulators above: regulator-usb-pwr-en [...] > + vddao_1v8: regulator-vddao_1v8 { for consistency with the regulators above: regulator-vddao-1v8 [... > + vddao_3v3: regulator-vddao_3v3 { for consistency with the regulators above: regulator-vddao-3v3 [...] > +&cpu0 { > + cpu-supply = <&vddcpu_b>; > + operating-points-v2 = <&cpu_opp_table_0>; > + clocks = <&clkc CLKID_CPU_CLK>; > + clock-latency = <5>; > +}; > + > +&cpu1 { > + cpu-supply = <&vddcpu_b>; > + operating-points-v2 = <&cpu_opp_table_0>; > + clocks = <&clkc CLKID_CPU_CLK>; > + clock-latency = <5>; > +}; > + > +&cpu100 { > + cpu-supply = <&vddcpu_a>; > + operating-points-v2 = <&cpub_opp_table_1>; > + clocks = <&clkc CLKID_CPUB_CLK>; > + clock-latency = <5>; > +}; > + > +&cpu101 { > + cpu-supply = <&vddcpu_a>; > + operating-points-v2 = <&cpub_opp_table_1>; > + clocks = <&clkc CLKID_CPUB_CLK>; > + clock-latency = <5>; > +}; > + > +&cpu102 { > + cpu-supply = <&vddcpu_a>; > + operating-points-v2 = <&cpub_opp_table_1>; > + clocks = <&clkc CLKID_CPUB_CLK>; > + clock-latency = <5>; > +}; > + > +&cpu103 { > + cpu-supply = <&vddcpu_a>; > + operating-points-v2 = <&cpub_opp_table_1>; > + clocks = <&clkc CLKID_CPUB_CLK>; > + clock-latency = <5>; > +}; (not limited to this patch: there's a lot of redundancy with the CPU nodes across the G12B .dts) [...] > +&sd_emmc_a { all nodes starting here should use alphabetical sorting Martin
Re: [PATCH v4 0/4] arm64: Add basic support for Amlogic A1 SoC Family
Hi Jianxin, On Thu, Sep 12, 2019 at 10:20 AM Jianxin Pan wrote: > > A1 is an application processor designed for smart audio and IoT applications, > with Dual core ARM Cortex-A35 CPU. Unlike the previous GXL and G12 series, > there is no Cortex-M3 AO CPU in it. it will be interesting to see which devices will use this SoC [...] > Jianxin Pan (4): > soc: amlogic: meson-gx-socinfo: Add A1 and A113L IDs > dt-bindings: arm: amlogic: add A1 bindings > dt-bindings: arm: amlogic: add Amlogic AD401 bindings > arm64: dts: add support for A1 based Amlogic AD401 for the whole series: Reviewed-by: Martin Blumenstingl
[PATCH v4] tpm_crb: fix fTPM on AMD Zen+ CPUs
From: Ivan Lazeev Bug link: https://bugzilla.kernel.org/show_bug.cgi?id=195657 cmd/rsp buffers are expected to be in the same ACPI region. For Zen+ CPUs BIOS's might report two different regions, some of them also report region sizes inconsistent with values from TPM registers. Memory configuration on ASRock x470 ITX: db0a-dc59efff : Reserved dc57e000-dc57efff : MSFT0101:00 dc582000-dc582fff : MSFT0101:00 Work around the issue by storing ACPI regions declared for the device in a list of struct crb_resource. Each entry holds a copy of the correcponding ACPI resourse (iores field) and a pointer to a possibly allocated with devm_ioremap_resource memory region (iobase field). This data was previously held for a single resource in struct crb_priv (iobase field) and local variable io_res in crb_map_io function. The list is used to find corresponding region for each buffer with crb_containing_resource, make the buffer size consistent with it's length and map it at most once, storing the pointer to allocated resource in iobase field of the entry. Signed-off-by: Ivan Lazeev --- Changes in v3: - make crb_containing_resource search for address only, because buffer sizes aren't trusted anyway - improve commit message Changes in v4: - rename struct crb_resource fields (style change) - improve commit message I believe that storing the data in a in list of struct crb_resource makes tracking of the resource allocation state explicit, aiding clarity. Whilst everything that worked before seems not to be broken, there is a possibility of allocating with crb_map_resource a resource that is not from ACPI table, and state of such resource is not tracked in the current solution. It might be good to track allocation of all resources, not just ones declared by ACPI, for complete correctness. However, as I see it now, it will complicate the code a bit more. Do you think the change should be made, or such situation is completely hypothetical? drivers/char/tpm/tpm_crb.c | 137 +++-- 1 file changed, 101 insertions(+), 36 deletions(-) diff --git a/drivers/char/tpm/tpm_crb.c b/drivers/char/tpm/tpm_crb.c index e59f1f91d7f3..b301f7fc4a73 100644 --- a/drivers/char/tpm/tpm_crb.c +++ b/drivers/char/tpm/tpm_crb.c @@ -91,7 +91,6 @@ enum crb_status { struct crb_priv { u32 sm; const char *hid; - void __iomem *iobase; struct crb_regs_head __iomem *regs_h; struct crb_regs_tail __iomem *regs_t; u8 __iomem *cmd; @@ -108,6 +107,12 @@ struct tpm2_crb_smc { u32 smc_func_id; }; +struct crb_resource { + struct resource iores; + void __iomem *iobase; + struct list_head list; +}; + static bool crb_wait_for_reg_32(u32 __iomem *reg, u32 mask, u32 value, unsigned long timeout) { @@ -432,23 +437,57 @@ static const struct tpm_class_ops tpm_crb = { .req_complete_val = CRB_DRV_STS_COMPLETE, }; +static void crb_free_resource_list(struct list_head *resources) +{ + struct crb_resource *cres, *tmp; + + list_for_each_entry_safe(cres, tmp, resources, list) + kfree(cres); +} + +static inline bool crb_resource_contains(const struct resource *io_res, +u64 address) +{ + return address >= io_res->start && address <= io_res->end; +} + +static struct crb_resource *crb_containing_resource( + const struct list_head *resources, u64 start) +{ + struct crb_resource *cres; + + list_for_each_entry(cres, resources, list) { + if (crb_resource_contains(&cres->iores, start)) + return cres; + } + + return NULL; +} + static int crb_check_resource(struct acpi_resource *ares, void *data) { - struct resource *io_res = data; + struct list_head *list = data; + struct crb_resource *cres; struct resource_win win; struct resource *res = &(win.res); if (acpi_dev_resource_memory(ares, res) || acpi_dev_resource_address_space(ares, &win)) { - *io_res = *res; - io_res->name = NULL; + cres = kzalloc(sizeof(*cres), GFP_KERNEL); + if (!cres) + return -ENOMEM; + + cres->iores = *res; + cres->iores.name = NULL; + + list_add_tail(&cres->list, list); } return 1; } -static void __iomem *crb_map_res(struct device *dev, struct crb_priv *priv, -struct resource *io_res, u64 start, u32 size) +static void __iomem *crb_map_res(struct device *dev, struct crb_resource *cres, +u64 start, u32 size) { struct resource new_res = { .start = start, @@ -460,10 +499,16 @@ static void __iomem *crb_map_res(struct device *dev, struct crb_priv *priv, if (start != new_res
Re: [PATCH 0/5] tools/power/x86/intel-speed-select: New command and
On Sat, Sep 14, 2019 at 12:05:08AM -0700, Srinivas Pandruvada wrote: > This series contains some minor fixes, when firmware mask is including > invalid CPU in the perf-profile mask. Also add some commands to > better manage core-power feature. Hmm... 150+ LOCs doesn't count to me as minor fixes. So, are you considering this a material for v5.4? > Srinivas Pandruvada (4): > tools/power/x86/intel-speed-select: Allow online/offline based on tdp > tools/power/x86/intel-speed-select: Format get-assoc information > tools/power/x86/intel-speed-select: Fix some debug prints > tools/power/x86/intel-speed-select: Extend core-power command set > > Youquan Song (1): > tools/power/x86/intel-speed-select: Fix high priority core mask over > count > > .../x86/intel-speed-select/isst-config.c | 108 -- > .../power/x86/intel-speed-select/isst-core.c | 25 > .../x86/intel-speed-select/isst-display.c | 51 + > tools/power/x86/intel-speed-select/isst.h | 9 +- > 4 files changed, 182 insertions(+), 11 deletions(-) > > -- > 2.17.2 > -- With Best Regards, Andy Shevchenko
Re: Linux 5.3-rc8
14.09.2019 21:52, Linus Torvalds пишет: On Sat, Sep 14, 2019 at 9:35 AM Alexander E. Patrakov wrote: Let me repeat: not -EINVAL, please. Please find some other error code, so that the application could sensibly distinguish between this case (low quality entropy is in the buffer) and the "kernel is too dumb" case (and no entropy is in the buffer). I'm not convinced we want applications to see that difference. The fact is, every time an application thinks it cares, it has caused problems. I can just see systemd saying "ok, the kernel didn't block, so I'll just do while (getrandom(x) == -ENOENTROPY) sleep(1); instead. Which is still completely buggy garbage. OK, I understand this viewpoint. But then still, -EINVAL is not the answer, because a hypothetical evil version of systemd will use -EINVAL as -ENOENTROPY (with flags == 0 and a reasonable buffer size, there is simply no other reason for the kernel to return -EINVAL). Yes I understand that this is a complete reverse of my previous argument. The fact is, we can't guarantee entropy in general. It's probably there is practice, particularly with user space saving randomness from last boot etc, but that kind of data may be real entropy, but the kernel cannot *guarantee* that it is. And people don't like us guaranteeing that rdrand/rdseed is "real entropy" either, since they don't trust the CPU hw either. Which means that we're all kinds of screwed. The whole "we guarantee entropy" model is broken. I agree here. Given that you suggested "to just fill the buffer and return 0" in the previous mail (well, I think you really meant "return buflen", otherwise ENOENTROPY == 0 and your previous objection applies), let's do just that. As a bonus, it saves applications from the complex dance with retrying via /dev/urandom and finally brings a reliable API (modulo old and broken kernels) to get random numbers (well, as random as possible right now) without needing a file descriptor. -- Alexander E. Patrakov smime.p7s Description: Криптографическая подпись S/MIME
Re: [PATCH v2] net: mdio: switch to using gpiod_get_optional()
On Fri, Sep 13, 2019 at 03:55:47PM -0700, Dmitry Torokhov wrote: > The MDIO device reset line is optional and now that gpiod_get_optional() > returns proper value when GPIO support is compiled out, there is no > reason to use fwnode_get_named_gpiod() that I plan to hide away. > > Let's switch to using more standard gpiod_get_optional() and > gpiod_set_consumer_name() to keep the nice "PHY reset" label. > > Also there is no reason to only try to fetch the reset GPIO when we have > OF node, gpiolib can fetch GPIO data from firmwares as well. > Reviewed-by: Andy Shevchenko But see comment below. > Signed-off-by: Dmitry Torokhov > --- > > Note this is an update to a patch titled "[PATCH 05/11] net: mdio: > switch to using fwnode_gpiod_get_index()" that no longer uses the new > proposed API and instead works with already existing ones. > > drivers/net/phy/mdio_bus.c | 22 +- > 1 file changed, 9 insertions(+), 13 deletions(-) > > diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c > index ce940871331e..2e29ab841b4d 100644 > --- a/drivers/net/phy/mdio_bus.c > +++ b/drivers/net/phy/mdio_bus.c > @@ -42,21 +42,17 @@ > > static int mdiobus_register_gpiod(struct mdio_device *mdiodev) > { > - struct gpio_desc *gpiod = NULL; > + int error; > > /* Deassert the optional reset signal */ > - if (mdiodev->dev.of_node) > - gpiod = fwnode_get_named_gpiod(&mdiodev->dev.of_node->fwnode, > -"reset-gpios", 0, GPIOD_OUT_LOW, > -"PHY reset"); > - if (IS_ERR(gpiod)) { > - if (PTR_ERR(gpiod) == -ENOENT || PTR_ERR(gpiod) == -ENOSYS) > - gpiod = NULL; > - else > - return PTR_ERR(gpiod); > - } > - > - mdiodev->reset_gpio = gpiod; > + mdiodev->reset_gpio = gpiod_get_optional(&mdiodev->dev, > + "reset", GPIOD_OUT_LOW); > + error = PTR_ERR_OR_ZERO(mdiodev->reset_gpio); > + if (error) > + return error; > + > + if (mdiodev->reset_gpio) This is redundant check. > + gpiod_set_consumer_name(mdiodev->reset_gpio, "PHY reset"); > return 0; > } > -- > 2.23.0.237.gc6a4ce50a0-goog > > > -- > Dmitry -- With Best Regards, Andy Shevchenko
Re: [RFC v1 06/14] krsi: Implement eBPF operations, attachment and execution
On 9/10/19 12:55 PM, KP Singh wrote: > From: KP Singh > > A user space program can attach an eBPF program by: > >hook_fd = open("/sys/kernel/security/krsi/process_execution", O_RDWR) >prog_fd = bpf(BPF_PROG_LOAD, ...) >bpf(BPF_PROG_ATTACH, hook_fd, prog_fd) > > When such an attach call is received, the attachment logic looks up the > dentry and appends the program to the bpf_prog_array. > > The BPF programs are stored in a bpf_prog_array and writes to the array > are guarded by a mutex. The eBPF programs are executed as a part of the > LSM hook they are attached to. If any of the eBPF programs return > an error (-ENOPERM) the action represented by the hook is denied. > > Signed-off-by: KP Singh > --- > include/linux/krsi.h | 18 ++ > kernel/bpf/syscall.c | 3 +- > security/krsi/include/krsi_init.h | 51 +++ > security/krsi/krsi.c | 13 +++- > security/krsi/krsi_fs.c | 28 > security/krsi/ops.c | 102 ++ > 6 files changed, 213 insertions(+), 2 deletions(-) > create mode 100644 include/linux/krsi.h > > diff --git a/include/linux/krsi.h b/include/linux/krsi.h > new file mode 100644 > index ..c7d1790d0c1f > --- /dev/null > +++ b/include/linux/krsi.h > @@ -0,0 +1,18 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +#ifndef _KRSI_H > +#define _KRSI_H > + > +#include > + > +#ifdef CONFIG_SECURITY_KRSI > +int krsi_prog_attach(const union bpf_attr *attr, struct bpf_prog *prog); > +#else > +static inline int krsi_prog_attach(const union bpf_attr *attr, > +struct bpf_prog *prog) > +{ > + return -EINVAL; > +} > +#endif /* CONFIG_SECURITY_KRSI */ > + > +#endif /* _KRSI_H */ > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > index f38a539f7e67..ab063ed84258 100644 > --- a/kernel/bpf/syscall.c > +++ b/kernel/bpf/syscall.c > @@ -4,6 +4,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -1950,7 +1951,7 @@ static int bpf_prog_attach(const union bpf_attr *attr) > ret = lirc_prog_attach(attr, prog); > break; > case BPF_PROG_TYPE_KRSI: > - ret = -EINVAL; > + ret = krsi_prog_attach(attr, prog); > break; > case BPF_PROG_TYPE_FLOW_DISSECTOR: > ret = skb_flow_dissector_bpf_prog_attach(attr, prog); > diff --git a/security/krsi/include/krsi_init.h > b/security/krsi/include/krsi_init.h > index 68755182a031..4e17ecacd4ed 100644 > --- a/security/krsi/include/krsi_init.h > +++ b/security/krsi/include/krsi_init.h > @@ -5,12 +5,29 @@ > > #include "krsi_fs.h" > > +#include > + > enum krsi_hook_type { > PROCESS_EXECUTION, > __MAX_KRSI_HOOK_TYPE, /* delimiter */ > }; > > extern int krsi_fs_initialized; > + > +struct krsi_bprm_ctx { > + struct linux_binprm *bprm; > +}; > + > +/* > + * krsi_ctx is the context that is passed to all KRSI eBPF > + * programs. > + */ > +struct krsi_ctx { > + union { > + struct krsi_bprm_ctx bprm_ctx; > + }; > +}; > + > /* >* The LSM creates one file per hook. >* > @@ -33,10 +50,44 @@ struct krsi_hook { >* The dentry of the file created in securityfs. >*/ > struct dentry *h_dentry; > + /* > + * The mutex must be held when updating the progs attached to the hook. > + */ > + struct mutex mutex; > + /* > + * The eBPF programs that are attached to this hook. > + */ > + struct bpf_prog_array __rcu *progs; > }; > > extern struct krsi_hook krsi_hooks_list[]; > > +static inline int krsi_run_progs(enum krsi_hook_type t, struct krsi_ctx *ctx) > +{ > + struct bpf_prog_array_item *item; > + struct bpf_prog *prog; > + struct krsi_hook *h = &krsi_hooks_list[t]; > + int ret, retval = 0; Reverse christmas tree style? > + > + preempt_disable(); Do we need preempt_disable() here? > + rcu_read_lock(); > + > + item = rcu_dereference(h->progs)->items; > + while ((prog = READ_ONCE(item->prog))) { > + ret = BPF_PROG_RUN(prog, ctx); > + if (ret < 0) { > + retval = ret; > + goto out; > + } > + item++; > + } > + > +out: > + rcu_read_unlock(); > + preempt_enable(); > + return IS_ENABLED(CONFIG_SECURITY_KRSI_ENFORCE) ? retval : 0; > +} > + > #define krsi_for_each_hook(hook) \ > for ((hook) = &krsi_hooks_list[0]; \ >(hook) < &krsi_hooks_list[__MAX_KRSI_HOOK_TYPE]; \ > diff --git a/security/krsi/krsi.c b/security/krsi/krsi.c > index 77d7e2f91172..d3a4a361c192 100644 > --- a/security/krsi/krsi.c > +++ b/security/krsi/krsi.c > @@ -1,6 +1,9 @@ > // SPDX-License-Identifier: GPL-2.0 > > #include > +#include > +#include > +#include > > #include "krsi_init.h" > > @@ -16,7 +19,15 @@ str
Re: [PATCH 1/2] x86,sched: Add support for frequency invariance
Hi Giovanni On Monday 09 Sep 2019 at 04:42:15 (+0200), Giovanni Gherdovich wrote: > +static inline long arch_scale_freq_capacity(int cpu) > +{ > + if (static_cpu_has(X86_FEATURE_APERFMPERF)) > + return per_cpu(arch_cpu_freq, cpu); So, if this is conditional, perhaps you could also add this check in an x86-specific implementation of arch_scale_freq_invariant() ? That would guide sugov in the right path (see get_next_freq()) if APERF/MPERF are unavailable. > + return 1024 /* SCHED_CAPACITY_SCALE */; > +} Thanks, Quentin
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 9:35 AM Alexander E. Patrakov wrote: > > Let me repeat: not -EINVAL, please. Please find some other error code, > so that the application could sensibly distinguish between this case > (low quality entropy is in the buffer) and the "kernel is too dumb" case > (and no entropy is in the buffer). I'm not convinced we want applications to see that difference. The fact is, every time an application thinks it cares, it has caused problems. I can just see systemd saying "ok, the kernel didn't block, so I'll just do while (getrandom(x) == -ENOENTROPY) sleep(1); instead. Which is still completely buggy garbage. The fact is, we can't guarantee entropy in general. It's probably there is practice, particularly with user space saving randomness from last boot etc, but that kind of data may be real entropy, but the kernel cannot *guarantee* that it is. And people don't like us guaranteeing that rdrand/rdseed is "real entropy" either, since they don't trust the CPU hw either. Which means that we're all kinds of screwed. The whole "we guarantee entropy" model is broken. Linus
Re: [PATCH 2/2] powerpc/83xx: map IMMR with a BAT.
Le 14/09/2019 à 16:34, Scott Wood a écrit : On Fri, 2019-08-23 at 12:50 +, Christophe Leroy wrote: On mpc83xx with a QE, IMMR is 2Mbytes. On mpc83xx without a QE, IMMR is 1Mbytes. Each driver will map a part of it to access the registers it needs. Some driver will map the same part of IMMR as other drivers. In order to reduce TLB misses, map the full IMMR with a BAT. Signed-off-by: Christophe Leroy --- arch/powerpc/platforms/83xx/misc.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/arch/powerpc/platforms/83xx/misc.c b/arch/powerpc/platforms/83xx/misc.c index f46d7bf3b140..1e395b01c535 100644 --- a/arch/powerpc/platforms/83xx/misc.c +++ b/arch/powerpc/platforms/83xx/misc.c @@ -18,6 +18,8 @@ #include #include +#include + #include "mpc83xx.h" static __be32 __iomem *restart_reg_base; @@ -145,6 +147,14 @@ void __init mpc83xx_setup_arch(void) if (ppc_md.progress) ppc_md.progress("mpc83xx_setup_arch()", 0); + if (!__map_without_bats) { + int immrsize = IS_ENABLED(CONFIG_QUICC_ENGINE) ? SZ_2M : SZ_1M; Any reason not to unconditionally make it 2M? After all, the kernel being built with CONFIG_QUICC_ENGINE doesn't mean that the hardware you're running on has it... Euh .. ok. I didn't see it that way, but you are right. Do you think it is not a problem to map 2M even when the quicc engine is not there ? Or should it check device tree instead ? Christophe
Re: Linux 5.3-rc8
14.09.2019 21:30, Linus Torvalds пишет: On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish wrote: On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote: An alternative might be to make getrandom() just return an error instead of waiting. Sure, fill the buffer with "as random as we can" stuff, but then return -EINVAL because you called us too early. ACK, that's probably _the_ most sensible approach. Only caveat is the slight change in user-space API semantics though... For example, this breaks the just released systemd-random-seed(8) as it _explicitly_ requests blocking behvior from getrandom() here: Actually, I would argue that the "don't ever block, instead fill buffer and return error instead" fixes this broken case. => src/random-seed/random-seed.c: /* * Let's make this whole job asynchronous, i.e. let's make * ourselves a barrier for proper initialization of the * random pool. */ k = getrandom(buf, buf_size, GRND_NONBLOCK); if (k < 0 && errno == EAGAIN && synchronous) { log_notice("Kernel entropy pool is not initialized yet, " "waiting until it is."); k = getrandom(buf, buf_size, 0); /* retry synchronously */ } Yeah, the above is yet another example of completely broken garbage. You can't just wait and block at boot. That is simply 100% unacceptable, and always has been, exactly because that may potentially mean waiting forever since you didn't do anything that actually is likely to add any entropy. if (k < 0) { log_debug_errno(errno, "Failed to read random data with " "getrandom(), falling back to " "/dev/urandom: %m"); At least it gets a log message. So I think the right thing to do is to just make getrandom() return -EINVAL, and refuse to block. Let me repeat: not -EINVAL, please. Please find some other error code, so that the application could sensibly distinguish between this case (low quality entropy is in the buffer) and the "kernel is too dumb" case (and no entropy is in the buffer). -- Alexander E. Patrakov smime.p7s Description: Криптографическая подпись S/MIME
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 8:02 AM Ahmed S. Darwish wrote: > > On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote: > > > > An alternative might be to make getrandom() just return an error > > instead of waiting. Sure, fill the buffer with "as random as we can" > > stuff, but then return -EINVAL because you called us too early. > > ACK, that's probably _the_ most sensible approach. Only caveat is > the slight change in user-space API semantics though... > > For example, this breaks the just released systemd-random-seed(8) > as it _explicitly_ requests blocking behvior from getrandom() here: > Actually, I would argue that the "don't ever block, instead fill buffer and return error instead" fixes this broken case. > => src/random-seed/random-seed.c: > /* > * Let's make this whole job asynchronous, i.e. let's make > * ourselves a barrier for proper initialization of the > * random pool. > */ > k = getrandom(buf, buf_size, GRND_NONBLOCK); > if (k < 0 && errno == EAGAIN && synchronous) { > log_notice("Kernel entropy pool is not initialized yet, " > "waiting until it is."); > > k = getrandom(buf, buf_size, 0); /* retry synchronously */ > } Yeah, the above is yet another example of completely broken garbage. You can't just wait and block at boot. That is simply 100% unacceptable, and always has been, exactly because that may potentially mean waiting forever since you didn't do anything that actually is likely to add any entropy. > if (k < 0) { > log_debug_errno(errno, "Failed to read random data with " > "getrandom(), falling back to " > "/dev/urandom: %m"); At least it gets a log message. So I think the right thing to do is to just make getrandom() return -EINVAL, and refuse to block. As mentioned, this has already historically been a huge issue on embedded devices, and with disks turnign not just to NVMe but to actual polling nvdimm/xpoint/flash, the amount of true "entropy" randomness we can give at boot is very questionable. We can (and will) continue to do a best-effort thing (including very much using rdread and friends), but the whole "wait for entropy" simply *must* stop. > I've sent an RFC patch at [1]. > > [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc Looks reasonable to me. Except I'd just make it simpler and make it a big WARN_ON_ONCE(), which is a lot harder to miss than pr_notice(). Make it clear that it is a *bug* if user space thinks it should wait at boot time. Also, we might even want to just fill the buffer and return 0 at that point, to make sure that even more broken user space doesn't then try to sleep manually and turn it into a "I'll wait myself" loop. Linus
Re: Linux 5.3-rc8
On Sat, Sep 14, 2019 at 11:25:09AM +0200, Ahmed S. Darwish wrote: > Unfortunately, it only made the early fast init faster, but didn't fix > the normal crng init blockage :-( Yeah, I see why; the original goal was to do the fast init so that using /dev/urandom, even before we were fully initialized, wouldn't be deadly. But then we still wanted 128 bits of estimated entropy the old fashioned way before we declare the CRNG initialized. There are a bunch of things that I think I want to do long-term, such as make CONFIG_RANDOM_TRUST_CPU the default, trying to get random entropy from the bootloader, etc. But none of this is something we should do in a hurry, especially this close before 5.4 drops. So I think I want to fix things this way, which is a bit a of a hack, but I think it's better than simply reverting commit b03755ad6f33. Ahmed, Linus, what do you think? - Ted >From f1a111bff3b996258410e51a3760fc39bbd7058f Mon Sep 17 00:00:00 2001 From: Theodore Ts'o Date: Sat, 14 Sep 2019 12:21:39 -0400 Subject: [PATCH] ext4: don't plug in __ext4_get_inode_loc if the CRNG is not initialized Unfortuantely commit b03755ad6f33 ("ext4: make __ext4_get_inode_loc plug") is so effective that on some systems, where RDRAND is not trusted, and the GNOME display manager is using getrandom(2) to get randomness for MIT Magic Cookie (which isn't really secure so using getrandom(2) is a bit of waste) in early boot on an Arch system is causing the boot to hang. Since this is causing problems, although arguably this is userspace's fault, let's not do it if the CRNG is not yet initialized. This is better than trying to tweak the random number generator right before 5.4 is released (I'm afraid we'll accidentally make it _too_ weak), and it's also better than simply completely reverting b03755ad6f33. We're effectively reverting it while the RNG is not yet initialized, to slow down the boot and make it less efficient, just to work around broken init setups. Fixes: b03755ad6f33 ("ext4: make __ext4_get_inode_loc plug") Signed-off-by: Theodore Ts'o --- fs/ext4/inode.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 4e271b509af1..41ad93f11b6d 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -4534,6 +4534,7 @@ static int __ext4_get_inode_loc(struct inode *inode, struct buffer_head *bh; struct super_block *sb = inode->i_sb; ext4_fsblk_tblock; + int be_inefficient = !rng_is_initialized(); struct blk_plug plug; int inodes_per_block, inode_offset; @@ -4541,7 +4542,6 @@ static int __ext4_get_inode_loc(struct inode *inode, if (inode->i_ino < EXT4_ROOT_INO || inode->i_ino > le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)) return -EFSCORRUPTED; - iloc->block_group = (inode->i_ino - 1) / EXT4_INODES_PER_GROUP(sb); gdp = ext4_get_group_desc(sb, iloc->block_group, NULL); if (!gdp) @@ -4623,7 +4623,8 @@ static int __ext4_get_inode_loc(struct inode *inode, * If we need to do any I/O, try to pre-readahead extra * blocks from the inode table. */ - blk_start_plug(&plug); + if (likely(!be_inefficient)) + blk_start_plug(&plug); if (EXT4_SB(sb)->s_inode_readahead_blks) { ext4_fsblk_t b, end, table; unsigned num; @@ -4654,7 +4655,8 @@ static int __ext4_get_inode_loc(struct inode *inode, get_bh(bh); bh->b_end_io = end_buffer_read_sync; submit_bh(REQ_OP_READ, REQ_META | REQ_PRIO, bh); - blk_finish_plug(&plug); + if (likely(!be_inefficient)) + blk_finish_plug(&plug); wait_on_buffer(bh); if (!buffer_uptodate(bh)) { EXT4_ERROR_INODE_BLOCK(inode, block, -- 2.23.0
Re: [RFC v1 05/14] krsi: Initialize KRSI hooks and create files in securityfs
On 9/10/19 12:55 PM, KP Singh wrote: > From: KP Singh > > The LSM creates files in securityfs for each hook registered with the > LSM. > > /sys/kernel/security/bpf/ > > The initialization of the hooks is done collectively in an internal > header "hooks.h" which results in: > > * Creation of a file for the hook in the securityfs. > * Allocation of a krsi_hook data structure which stores a pointer to the >dentry of the newly created file in securityfs. > * A pointer to the krsi_hook data structure is stored in the private >d_fsdata of dentry of the file created in securityFS. > > These files will later be used to specify an attachment target during > BPF_PROG_LOAD. > > Signed-off-by: KP Singh > --- > security/krsi/Makefile| 4 +- > security/krsi/include/hooks.h | 21 > security/krsi/include/krsi_fs.h | 19 +++ > security/krsi/include/krsi_init.h | 45 > security/krsi/krsi.c | 16 +- > security/krsi/krsi_fs.c | 88 +++ > 6 files changed, 191 insertions(+), 2 deletions(-) > create mode 100644 security/krsi/include/hooks.h > create mode 100644 security/krsi/include/krsi_fs.h > create mode 100644 security/krsi/include/krsi_init.h > create mode 100644 security/krsi/krsi_fs.c > > diff --git a/security/krsi/Makefile b/security/krsi/Makefile > index 660cc1f422fd..4586241f16e1 100644 > --- a/security/krsi/Makefile > +++ b/security/krsi/Makefile > @@ -1 +1,3 @@ > -obj-$(CONFIG_SECURITY_KRSI) := krsi.o ops.o > +obj-$(CONFIG_SECURITY_KRSI) := krsi.o krsi_fs.o ops.o > + > +ccflags-y := -I$(srctree)/security/krsi -I$(srctree)/security/krsi/include > diff --git a/security/krsi/include/hooks.h b/security/krsi/include/hooks.h > new file mode 100644 > index ..e070c452b5de > --- /dev/null > +++ b/security/krsi/include/hooks.h > @@ -0,0 +1,21 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +/* > + * The hooks for the KRSI LSM are declared in this file. > + * > + * This header MUST NOT be included directly and should > + * be only used to initialize the hooks lists. > + * > + * Format: > + * > + * KRSI_HOOK_INIT(TYPE, NAME, LSM_HOOK, KRSI_HOOK_FN) > + * > + * KRSI adds one layer of indirection between the name of the hook and the > name > + * it exposes to the userspace in Security FS to prevent the userspace from > + * breaking in case the name of the hook changes in the kernel or if there's > + * another LSM hook that maps better to the represented security behaviour. > + */ > +KRSI_HOOK_INIT(PROCESS_EXECUTION, > +process_execution, > +bprm_check_security, > +krsi_process_execution) > diff --git a/security/krsi/include/krsi_fs.h b/security/krsi/include/krsi_fs.h > new file mode 100644 > index ..38134661d8d6 > --- /dev/null > +++ b/security/krsi/include/krsi_fs.h > @@ -0,0 +1,19 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +#ifndef _KRSI_FS_H > +#define _KRSI_FS_H > + > +#include > +#include > +#include > + > +bool is_krsi_hook_file(struct file *f); > + > +/* > + * The name of the directory created in securityfs > + * > + * /sys/kernel/security/ > + */ > +#define KRSI_SFS_NAME "krsi" > + > +#endif /* _KRSI_FS_H */ > diff --git a/security/krsi/include/krsi_init.h > b/security/krsi/include/krsi_init.h > new file mode 100644 > index ..68755182a031 > --- /dev/null > +++ b/security/krsi/include/krsi_init.h > @@ -0,0 +1,45 @@ > +/* SPDX-License-Identifier: GPL-2.0 */ > + > +#ifndef _KRSI_INIT_H > +#define _KRSI_INIT_H > + > +#include "krsi_fs.h" > + > +enum krsi_hook_type { > + PROCESS_EXECUTION, > + __MAX_KRSI_HOOK_TYPE, /* delimiter */ > +}; > + > +extern int krsi_fs_initialized; > +/* > + * The LSM creates one file per hook. > + * > + * A pointer to krsi_hook data structure is stored in the > + * private fsdata of the dentry of the per-hook file created > + * in securityfs. > + */ > +struct krsi_hook { > + /* > + * The name of the security hook, a file with this name will be created > + * in the securityfs. > + */ > + const char *name; > + /* > + * The type of the LSM hook, the LSM uses this to index the list of the > + * hooks to run the eBPF programs that may have been attached. > + */ > + enum krsi_hook_type h_type; > + /* > + * The dentry of the file created in securityfs. > + */ > + struct dentry *h_dentry; > +}; > + > +extern struct krsi_hook krsi_hooks_list[]; > + > +#define krsi_for_each_hook(hook) \ > + for ((hook) = &krsi_hooks_list[0]; \ > + (hook) < &krsi_hooks_list[__MAX_KRSI_HOOK_TYPE]; \ > + (hook)++) > + > +#endif /* _KRSI_INIT_H */ > diff --git a/security/krsi/krsi.c b/security/krsi/krsi.c > index 9ce4f56fb78d..77d7e2f91172 100644 > --- a/security/krsi/krsi.c > +++ b/security/krsi/krsi.c > @@ -2,13 +2,27 @@ > > #include > > +#include "krsi_init.h" > + > +struct krsi_hook krsi_hooks_l
Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
On Sat, Sep 14, 2019 at 9:38 AM H. Nikolaus Schaller wrote: > > > > Am 14.09.2019 um 15:42 schrieb Adam Ford : > > > > On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller > > wrote: > >> > >> > >>> Am 13.09.2019 um 17:37 schrieb Adam Ford : > >>> > >>> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C > >>> depending on commercial or industrial temperature ratings. This > >>> patch expands the thermal information to the limits of 90 and 105 > >>> for alert and critical. > >>> > >>> For boards who never use industrial temperatures, these can be > >>> changed on their respective device trees with something like: > >>> > >>> &cpu_alert0 { > >>> temperature = <85000>; /* millicelsius */ > >>> }; > >>> > >>> &cpu_crit { > >>> temperature = <9>; /* millicelsius */ > >>> }; > >>> > >>> Signed-off-by: Adam Ford > >>> --- > >>> V2: Change the CPU reference to &cpu instead of &cpu0 > >>> > >>> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi > >>> b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi > >>> index 235ecfd61e2d..dfbd0cb0b00b 100644 > >>> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi > >>> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi > >>> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal { > >>> > >>> /* sensor ID */ > >>> thermal-sensors = <&bandgap 0>; > >>> + > >>> + cpu_trips: trips { > >>> + cpu_alert0: cpu_alert { > >>> + temperature = <9>; /* millicelsius */ > >>> + hysteresis = <2000>; /* millicelsius */ > >>> + type = "passive"; > >>> + }; > >>> + cpu_crit: cpu_crit { > >>> + temperature = <105000>; /* millicelsius */ > >>> + hysteresis = <2000>; /* millicelsius */ > >>> + type = "critical"; > >>> + }; > >>> + }; > >>> + > >>> + cpu_cooling_maps: cooling-maps { > >>> + map0 { > >>> + trip = <&cpu_alert0>; > >>> + cooling-device = > >>> + <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; > >>> + }; > >>> + }; > >>> }; > >>> -- > >>> 2.17.1 > >>> > >> > >> Here is my test log (GTA04A5 with DM3730CBP100). > >> "high-load" script is driving the NEON to full power > >> and would report calculation errors. > >> > >> There is no noise visible in the bandgap sensor data > >> induced by power supply fluctuations (log shows system > >> voltage while charging). > >> > > > > Great data! > > > >> root@letux:~# ./high-load -n2 > >> 100% load stress test for 1 cores running ./neon_loop2 > >> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz > >> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz > >> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz > >> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz > >> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz > >> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz > >> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz > >> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz > >> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz > >> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz > >> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz > >> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz > >> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz > >> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz > >> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz > >> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz > >> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz > >> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz > >> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz > >> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz > >> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz > >> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz > >> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz > >> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz > >> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz > >> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz > >> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz > >> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz > >> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz > >> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz > >> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz > >> ... > >> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz > >> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz > >> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz > >> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz > >> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz > >> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz > >> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz > >> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz > >> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz > >> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz > >> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz > >> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz > >> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1
Re: [RFC v1 04/14] krsi: Add support in libbpf for BPF_PROG_TYPE_KRSI
On 9/10/19 12:55 PM, KP Singh wrote: > From: KP Singh > > Update the libbpf library with functionality to load and > attach a program type BPF_PROG_TYPE_KRSI. > > Since the bpf_prog_load does not allow the specification of an > expected attach type, it's recommended to use bpf_prog_load_xattr and > set the expected attach type as KRSI. > > Signed-off-by: KP Singh > --- > tools/lib/bpf/libbpf.c| 4 > tools/lib/bpf/libbpf.h| 2 ++ > tools/lib/bpf/libbpf.map | 2 ++ > tools/lib/bpf/libbpf_probes.c | 1 + > 4 files changed, 9 insertions(+) > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index 2b57d7ea7836..3cc86bbc68cd 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -2676,6 +2676,7 @@ static bool bpf_prog_type__needs_kver(enum > bpf_prog_type type) > case BPF_PROG_TYPE_PERF_EVENT: > case BPF_PROG_TYPE_CGROUP_SYSCTL: > case BPF_PROG_TYPE_CGROUP_SOCKOPT: > + case BPF_PROG_TYPE_KRSI: > return false; > case BPF_PROG_TYPE_KPROBE: > default: > @@ -3536,6 +3537,7 @@ bool bpf_program__is_##NAME(const struct bpf_program > *prog) \ > } \ > > BPF_PROG_TYPE_FNS(socket_filter, BPF_PROG_TYPE_SOCKET_FILTER); > +BPF_PROG_TYPE_FNS(krsi, BPF_PROG_TYPE_KRSI); > BPF_PROG_TYPE_FNS(kprobe, BPF_PROG_TYPE_KPROBE); > BPF_PROG_TYPE_FNS(sched_cls, BPF_PROG_TYPE_SCHED_CLS); > BPF_PROG_TYPE_FNS(sched_act, BPF_PROG_TYPE_SCHED_ACT); > @@ -3590,6 +3592,8 @@ static const struct { > BPF_PROG_SEC("lwt_out", BPF_PROG_TYPE_LWT_OUT), > BPF_PROG_SEC("lwt_xmit",BPF_PROG_TYPE_LWT_XMIT), > BPF_PROG_SEC("lwt_seg6local", BPF_PROG_TYPE_LWT_SEG6LOCAL), > + BPF_APROG_SEC("krsi", BPF_PROG_TYPE_KRSI, > + BPF_KRSI), > BPF_APROG_SEC("cgroup_skb/ingress", BPF_PROG_TYPE_CGROUP_SKB, > BPF_CGROUP_INET_INGRESS), > BPF_APROG_SEC("cgroup_skb/egress", BPF_PROG_TYPE_CGROUP_SKB, > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h > index 5cbf459ece0b..8781d29b4035 100644 > --- a/tools/lib/bpf/libbpf.h > +++ b/tools/lib/bpf/libbpf.h > @@ -261,6 +261,7 @@ LIBBPF_API int bpf_program__set_sched_cls(struct > bpf_program *prog); > LIBBPF_API int bpf_program__set_sched_act(struct bpf_program *prog); > LIBBPF_API int bpf_program__set_xdp(struct bpf_program *prog); > LIBBPF_API int bpf_program__set_perf_event(struct bpf_program *prog); > +LIBBPF_API int bpf_program__set_krsi(struct bpf_program *prog); > LIBBPF_API void bpf_program__set_type(struct bpf_program *prog, > enum bpf_prog_type type); > LIBBPF_API void > @@ -275,6 +276,7 @@ LIBBPF_API bool bpf_program__is_sched_cls(const struct > bpf_program *prog); > LIBBPF_API bool bpf_program__is_sched_act(const struct bpf_program *prog); > LIBBPF_API bool bpf_program__is_xdp(const struct bpf_program *prog); > LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog); > +LIBBPF_API bool bpf_program__is_krsi(const struct bpf_program *prog); > > /* >* No need for __attribute__((packed)), all members of 'bpf_map_def' > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map > index f9d316e873d8..75b8fe419c11 100644 > --- a/tools/lib/bpf/libbpf.map > +++ b/tools/lib/bpf/libbpf.map > @@ -68,6 +68,7 @@ LIBBPF_0.0.1 { > bpf_prog_test_run_xattr; > bpf_program__fd; > bpf_program__is_kprobe; > + bpf_program__is_krsi; > bpf_program__is_perf_event; > bpf_program__is_raw_tracepoint; > bpf_program__is_sched_act; > @@ -85,6 +86,7 @@ LIBBPF_0.0.1 { > bpf_program__set_expected_attach_type; > bpf_program__set_ifindex; > bpf_program__set_kprobe; > + bpf_program__set_krsi; > bpf_program__set_perf_event; > bpf_program__set_prep; > bpf_program__set_priv; Please put the above two new API functions in version LIBBPF_0.0.5. > diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c > index ace1a0708d99..cc515a36794d 100644 > --- a/tools/lib/bpf/libbpf_probes.c > +++ b/tools/lib/bpf/libbpf_probes.c > @@ -102,6 +102,7 @@ probe_load(enum bpf_prog_type prog_type, const struct > bpf_insn *insns, > case BPF_PROG_TYPE_FLOW_DISSECTOR: > case BPF_PROG_TYPE_CGROUP_SYSCTL: > case BPF_PROG_TYPE_CGROUP_SOCKOPT: > + case BPF_PROG_TYPE_KRSI: > default: > break; > } >
Re: [PATCH v22 00/24] Intel SGX foundations
On 9/14/19 6:41 AM, Jarkko Sakkinen wrote: > > The proposed LSM hooks give the granularity to make yes/no decision > based on the > > * The origin of the source of the source for the enclave. > * The requested permissions for the added or mapped peage. > > The hooks to do these checks are provided for mmap() and EADD > operations. > > With just file permissions you can still limit mmap() by having a > privileged process to build the enclaves and pass the file descriptor > to the enclave user who can mmap() the enclave within the constraints > set by the enclave pages (their permissions refine the roof that you > can mmap() any memory range within an enclave). The LSM hooks are presumably fixing a problem that these patches introduce. What's that problem?
Re: [PATCH v4 2/2] net: phy: adin: implement Energy Detect Powerdown mode via phy-tunable
On Thu, Sep 12, 2019 at 07:28:12PM +0300, Alexandru Ardelean wrote: > +static int adin_set_edpd(struct phy_device *phydev, u16 tx_interval) > +{ > + u16 val; > + > + if (tx_interval == ETHTOOL_PHY_EDPD_DISABLE) > + return phy_clear_bits(phydev, ADIN1300_PHY_CTRL_STATUS2, > + (ADIN1300_NRG_PD_EN | ADIN1300_NRG_PD_TX_EN)); > + > + val = ADIN1300_NRG_PD_EN; > + > + switch (tx_interval) { > + case 1000: /* 1 second */ > + /* fallthrough */ > + case ETHTOOL_PHY_EDPD_DFLT_TX_MSECS: > + val |= ADIN1300_NRG_PD_TX_EN; > + /* fallthrough */ > + case ETHTOOL_PHY_EDPD_NO_TX: > + break; > + default: > + return -EINVAL; > + } > + > + return phy_modify(phydev, ADIN1300_PHY_CTRL_STATUS2, > + (ADIN1300_NRG_PD_EN | ADIN1300_NRG_PD_TX_EN), > + val); > +} > + > > + rc = adin_set_edpd(phydev, 1); > + if (rc < 0) > + return rc; Hi Alexandru Shouldn't this be adin_set_edpd(phydev, 1000); Andrew
Re: [PATCH v4 1/2] ethtool: implement Energy Detect Powerdown support via phy-tunable
On Thu, Sep 12, 2019 at 07:28:11PM +0300, Alexandru Ardelean wrote: > The `phy_tunable_id` has been named `ETHTOOL_PHY_EDPD` since it looks like > this feature is common across other PHYs (like EEE), and defining > `ETHTOOL_PHY_ENERGY_DETECT_POWER_DOWN` seems too long. > > The way EDPD works, is that the RX block is put to a lower power mode, > except for link-pulse detection circuits. The TX block is also put to low > power mode, but the PHY wakes-up periodically to send link pulses, to avoid > lock-ups in case the other side is also in EDPD mode. > > Currently, there are 2 PHY drivers that look like they could use this new > PHY tunable feature: the `adin` && `micrel` PHYs. > > The ADIN's datasheet mentions that TX pulses are at intervals of 1 second > default each, and they can be disabled. For the Micrel KSZ9031 PHY, the > datasheet does not mention whether they can be disabled, but mentions that > they can modified. > > The way this change is structured, is similar to the PHY tunable downshift > control: > * a `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` value is exposed to cover a default > TX interval; some PHYs could specify a certain value that makes sense > * `ETHTOOL_PHY_EDPD_NO_TX` would disable TX when EDPD is enabled > * `ETHTOOL_PHY_EDPD_DISABLE` will disable EDPD > > As noted by the `ETHTOOL_PHY_EDPD_DFLT_TX_MSECS` the interval unit is 1 > millisecond, which should cover a reasonable range of intervals: > - from 1 millisecond, which does not sound like much of a power-saver > - to ~65 seconds which is quite a lot to wait for a link to come up when >plugging a cable > > Signed-off-by: Alexandru Ardelean Reviewed-by: Andrew Lunn Andrew
Re: [PATCH 0/6] ARM, arm64: Remove arm_pm_restart()
On Mon, Jan 30, 2017 at 12:05:06PM +0100, Thierry Reding wrote: > From: Thierry Reding > > Hi everyone, > > This small series is preparatory work for a series that I'm working on > which attempts to establish a formal framework for system restart and > power off. > > Guenter has done a lot of good work in this area, but it never got > merged. I think this set is a valuable addition to the kernel because > it converts all odd providers to the established mechanism for restart. > > Since this is stretched across both 32-bit and 64-bit ARM, as well as > PSCI, and given the SoC/board level of functionality, I think it might > make sense to take this through the ARM SoC tree in order to simplify > the interdependencies. But it should also be possible to take patches > 1-4 via their respective trees this cycle and patches 5-6 through the > ARM and arm64 trees for the next cycle, if that's preferred. > We tried this twice now, and it seems to go nowhere. What does it take to get it applied ? Guenter > Thanks, > Thierry > > Guenter Roeck (6): > ARM: prima2: Register with kernel restart handler > ARM: xen: Register with kernel restart handler > drivers: firmware: psci: Register with kernel restart handler > ARM: Register with kernel restart handler > ARM64: Remove arm_pm_restart() > ARM: Remove arm_pm_restart() > > arch/arm/include/asm/system_misc.h | 1 - > arch/arm/kernel/reboot.c | 6 +- > arch/arm/kernel/setup.c | 20 ++-- > arch/arm/mach-prima2/rstc.c | 11 +-- > arch/arm/xen/enlighten.c | 13 +++-- > arch/arm64/include/asm/system_misc.h | 2 -- > arch/arm64/kernel/process.c | 7 +-- > drivers/firmware/psci.c | 11 +-- > 8 files changed, 49 insertions(+), 22 deletions(-) > > -- > 2.11.0
[PATCH] staging: bcm2835-audio: Fix draining behavior regression
The PCM draining behavior got broken since the recent refactoring, and this turned out to be the incorrect expectation of the firmware behavior regarding "draining". While I expected the "drain" flag at the stop operation would do processing the queued samples, it seems rather dropping the samples. As a quick fix, just drop the SNDRV_PCM_INFO_DRAIN_TRIGGER flag, so that the driver uses the normal PCM draining procedure. Also, put some caution comment to the function for future readers not to fall into the same pitfall. Fixes: d7ca3a71545b ("staging: bcm2835-audio: Operate non-atomic PCM ops") BugLink: https://github.com/raspberrypi/linux/issues/2983 Cc: sta...@vger.kernel.org Signed-off-by: Takashi Iwai --- drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c | 4 ++-- drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c | 1 + 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c index bc1eaa3a0773..826016c3431a 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-pcm.c @@ -12,7 +12,7 @@ static const struct snd_pcm_hardware snd_bcm2835_playback_hw = { .info = (SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID | -SNDRV_PCM_INFO_DRAIN_TRIGGER | SNDRV_PCM_INFO_SYNC_APPLPTR), +SNDRV_PCM_INFO_SYNC_APPLPTR), .formats = SNDRV_PCM_FMTBIT_U8 | SNDRV_PCM_FMTBIT_S16_LE, .rates = SNDRV_PCM_RATE_CONTINUOUS | SNDRV_PCM_RATE_8000_48000, .rate_min = 8000, @@ -29,7 +29,7 @@ static const struct snd_pcm_hardware snd_bcm2835_playback_hw = { static const struct snd_pcm_hardware snd_bcm2835_playback_spdif_hw = { .info = (SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_MMAP_VALID | -SNDRV_PCM_INFO_DRAIN_TRIGGER | SNDRV_PCM_INFO_SYNC_APPLPTR), +SNDRV_PCM_INFO_SYNC_APPLPTR), .formats = SNDRV_PCM_FMTBIT_S16_LE, .rates = SNDRV_PCM_RATE_CONTINUOUS | SNDRV_PCM_RATE_44100 | SNDRV_PCM_RATE_48000, diff --git a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c index 23fba01107b9..c6f9cf1913d2 100644 --- a/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c +++ b/drivers/staging/vc04_services/bcm2835-audio/bcm2835-vchiq.c @@ -289,6 +289,7 @@ int bcm2835_audio_stop(struct bcm2835_alsa_stream *alsa_stream) VC_AUDIO_MSG_TYPE_STOP, false); } +/* FIXME: this doesn't seem working as expected for "draining" */ int bcm2835_audio_drain(struct bcm2835_alsa_stream *alsa_stream) { struct vc_audio_msg m = { -- 2.16.4
Re: [PATCH net-next] net: dsa: b53: Add support for port_egress_floods callback
On Thu, Sep 12, 2019 at 08:28:39PM -0700, Florian Fainelli wrote: > Add support for configuring the per-port egress flooding control for > both Unicast and Multicast traffic. > > Signed-off-by: Florian Fainelli > --- > Beneditk, > > Do you mind re-testing, or confirming that this patch that I sent much > earlier does work correctly for you? Thanks! > > drivers/net/dsa/b53/b53_common.c | 33 > drivers/net/dsa/b53/b53_priv.h | 2 ++ > 2 files changed, 35 insertions(+) > > diff --git a/drivers/net/dsa/b53/b53_common.c > b/drivers/net/dsa/b53/b53_common.c > index 7d328a5f0161..ac2ec08a652b 100644 > --- a/drivers/net/dsa/b53/b53_common.c > +++ b/drivers/net/dsa/b53/b53_common.c > @@ -342,6 +342,13 @@ static void b53_set_forwarding(struct b53_device *dev, > int enable) > b53_read8(dev, B53_CTRL_PAGE, B53_SWITCH_CTRL, &mgmt); > mgmt |= B53_MII_DUMB_FWDG_EN; > b53_write8(dev, B53_CTRL_PAGE, B53_SWITCH_CTRL, mgmt); > + > + /* Look at B53_UC_FWD_EN and B53_MC_FWD_EN to decide whether > + * frames should be flooed or not. Hi Florian s/flooed/flooded Reviewed-by: Andrew Lunn Andrew
Re: [PATCH v2] net: mdio: switch to using gpiod_get_optional()
On Fri, Sep 13, 2019 at 03:55:47PM -0700, Dmitry Torokhov wrote: > The MDIO device reset line is optional and now that gpiod_get_optional() > returns proper value when GPIO support is compiled out, there is no > reason to use fwnode_get_named_gpiod() that I plan to hide away. > > Let's switch to using more standard gpiod_get_optional() and > gpiod_set_consumer_name() to keep the nice "PHY reset" label. > > Also there is no reason to only try to fetch the reset GPIO when we have > OF node, gpiolib can fetch GPIO data from firmwares as well. > > Signed-off-by: Dmitry Torokhov Reviewed-by: Andrew Lunn Andrew
Re: [PATCH] x86_64: new and improved memset()
On Sat, Sep 14, 2019 at 01:37:17PM +0200, Borislav Petkov wrote: > On Sat, Sep 14, 2019 at 01:33:45PM +0300, Alexey Dobriyan wrote: > > --- a/arch/x86/include/asm/string_64.h > > +++ b/arch/x86/include/asm/string_64.h > > @@ -15,7 +15,111 @@ extern void *memcpy(void *to, const void *from, size_t > > len); > > extern void *__memcpy(void *to, const void *from, size_t len); > > > > #define __HAVE_ARCH_MEMSET > > +#if defined(_ARCH_X86_BOOT) || defined(CONFIG_FORTIFY_SOURCE) > > void *memset(void *s, int c, size_t n); > > +#else > > +#include > > +#include > > + > > +/* Internal, do not use. */ > > +static __always_inline void memset0(void *s, size_t n) > > +{ > > + /* Internal, do not use. */ > > + void _memset0_mov(void); > > + void _memset0_rep_stosq(void); > > + void memset0_mov(void); > > + void memset0_rep_stosq(void); > > + void memset0_rep_stosb(void); > > + > > + if (__builtin_constant_p(n) && n == 0) { > > + } else if (__builtin_constant_p(n) && n == 1) { > > + *(uint8_t *)s = 0; > > + } else if (__builtin_constant_p(n) && n == 2) { > > + *(uint16_t *)s = 0; > > + } else if (__builtin_constant_p(n) && n == 4) { > > + *(uint32_t *)s = 0; > > + } else if (__builtin_constant_p(n) && n == 6) { > > + *(uint32_t *)s = 0; > > + *(uint16_t *)(s + 4) = 0; > > + } else if (__builtin_constant_p(n) && n == 8) { > > + *(uint64_t *)s = 0; > > + } else if (__builtin_constant_p(n) && (n & 7) == 0) { > > + alternative_call_2( > > + _memset0_mov, > > + _memset0_rep_stosq, X86_FEATURE_REP_GOOD, > > + memset0_rep_stosb, X86_FEATURE_ERMS, > > + ASM_OUTPUT2("=D" (s), "=c" (n)), > > + "D" (s), "c" (n) > > + : "rax", "cc", "memory" > > + ); > > + } else { > > + alternative_call_2( > > + memset0_mov, > > + memset0_rep_stosq, X86_FEATURE_REP_GOOD, > > + memset0_rep_stosb, X86_FEATURE_ERMS, > > + ASM_OUTPUT2("=D" (s), "=c" (n)), > > + "D" (s), "c" (n) > > + : "rax", "rsi", "cc", "memory" > > + ); > > + } > > +} > > + > > +/* Internal, do not use. */ > > +static __always_inline void memsetx(void *s, int c, size_t n) > > +{ > > + /* Internal, do not use. */ > > + void _memsetx_mov(void); > > + void _memsetx_rep_stosq(void); > > + void memsetx_mov(void); > > + void memsetx_rep_stosq(void); > > + void memsetx_rep_stosb(void); > > + > > + const uint64_t ccc = (uint8_t)c * 0x0101010101010101ULL; > > + > > + if (__builtin_constant_p(n) && n == 0) { > > + } else if (__builtin_constant_p(n) && n == 1) { > > + *(uint8_t *)s = ccc; > > + } else if (__builtin_constant_p(n) && n == 2) { > > + *(uint16_t *)s = ccc; > > + } else if (__builtin_constant_p(n) && n == 4) { > > + *(uint32_t *)s = ccc; > > + } else if (__builtin_constant_p(n) && n == 8) { > > + *(uint64_t *)s = ccc; > > + } else if (__builtin_constant_p(n) && (n & 7) == 0) { > > + alternative_call_2( > > + _memsetx_mov, > > + _memsetx_rep_stosq, X86_FEATURE_REP_GOOD, > > + memsetx_rep_stosb, X86_FEATURE_ERMS, > > + ASM_OUTPUT2("=D" (s), "=c" (n)), > > + "D" (s), "c" (n), "a" (ccc) > > + : "cc", "memory" > > + ); > > + } else { > > + alternative_call_2( > > + memsetx_mov, > > + memsetx_rep_stosq, X86_FEATURE_REP_GOOD, > > + memsetx_rep_stosb, X86_FEATURE_ERMS, > > + ASM_OUTPUT2("=D" (s), "=c" (n)), > > + "D" (s), "c" (n), "a" (ccc) > > + : "rsi", "cc", "memory" > > + ); > > + } > > +} > > + > > +static __always_inline void *memset(void *s, int c, size_t n) > > +{ > > + if (__builtin_constant_p(c)) { > > + if (c == 0) { > > + memset0(s, n); > > + } else { > > + memsetx(s, c, n); > > + } > > + return s; > > + } else { > > + return __builtin_memset(s, c, n); > > + } > > +} > > I'm willing to take something like that only when such complexity is > justified by numbers. I.e., I'm much more inclined to capping it under > 32 and 64 byte sizes and keeping it simple. OK. Those small lengths were indeed annoying. > > +ENTRY(_memset0_mov) > > + xor eax, eax > > +.globl _memsetx_mov > > +_memsetx_mov: > > + add rcx, rdi > > + cmp rdi, rcx > > + je 1f > > +2: > > + mov [rdi], rax > > + add rdi, 8 > > + cmp rdi, rcx > > + jne 2b > > +1: > > + ret > > +ENDPROC(_memset0_mov) > > +ENDPROC(_memsetx_mov) > > +EXPORT_SYMBOL(_memset0_mov) > > +EXPORT_SYMBOL(_memsetx_mov) > > + > > +ENTRY(mems
Re: [PATCH] staging: r8188eu: replace rtw_malloc() with it's definition
On 9/10/19 2:59 PM, Dan Carpenter wrote: On Sun, Sep 08, 2019 at 12:00:26PM +0300, Ivan Safonov wrote >> rtw_malloc prevents the use of kmemdup/kzalloc and others. Signed-off-by: Ivan Safonov --- drivers/staging/rtl8188eu/core/rtw_ap.c| 4 ++-- drivers/staging/rtl8188eu/core/rtw_mlme_ext.c | 2 +- .../staging/rtl8188eu/include/osdep_service.h | 3 --- drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 18 +- drivers/staging/rtl8188eu/os_dep/mlme_linux.c | 2 +- .../staging/rtl8188eu/os_dep/osdep_service.c | 7 +-- 6 files changed, 14 insertions(+), 22 deletions(-) diff --git a/drivers/staging/rtl8188eu/core/rtw_ap.c b/drivers/staging/rtl8188eu/core/rtw_ap.c index 51a5b71f8c25..c9c57379b7a2 100644 --- a/drivers/staging/rtl8188eu/core/rtw_ap.c +++ b/drivers/staging/rtl8188eu/core/rtw_ap.c @@ -104,7 +104,7 @@ static void update_BCNTIM(struct adapter *padapter) } if (remainder_ielen > 0) { - pbackup_remainder_ie = rtw_malloc(remainder_ielen); + pbackup_remainder_ie = kmalloc(remainder_ielen, in_interrupt() ? GFP_ATOMIC : GFP_KERNEL); ^ This stuff isn't right. It really should be checking if spinlocks are held or IRQs are disabled. And the only way to do that is by auditing the callers. I hope to make these changes later as separate independent patches. This patch do only one thing - remove rtw_malloc(). (The original rtw_malloc() implementation is buggy nonsense). regards, dan carpenter Ivan Safonov.
Re: Linux 5.3-rc8
On Thu, Sep 12, 2019 at 12:34:45PM +0100, Linus Torvalds wrote: > On Thu, Sep 12, 2019 at 9:25 AM Theodore Y. Ts'o wrote: > > > > Hmm, one thought might be GRND_FAILSAFE, which will wait up to two > > minutes before returning "best efforts" randomness and issuing a huge > > massive warning if it is triggered? > > Yeah, based on (by now) _years_ of experience with people mis-using > "get me random numbers", I think the sense of a new flag needs to be > "yeah, I'm willing to wait for it". > > Because most people just don't want to wait for it, and most people > don't think about it, and we need to make the default be for that > "don't think about it" crowd, with the people who ask for randomness > sources for a secure key having to very clearly and very explicitly > say "Yes, I understand that this can take minutes and can only be done > long after boot". > > Even then people will screw that up because they copy code, or some > less than gifted rodent writes a library and decides "my library is so > important that I need that waiting sooper-sekrit-secure random > number", and then people use that broken library by mistake without > realizing that it's not going to be reliable at boot time. > > An alternative might be to make getrandom() just return an error > instead of waiting. Sure, fill the buffer with "as random as we can" > stuff, but then return -EINVAL because you called us too early. > ACK, that's probably _the_ most sensible approach. Only caveat is the slight change in user-space API semantics though... For example, this breaks the just released systemd-random-seed(8) as it _explicitly_ requests blocking behvior from getrandom() here: => src/random-seed/random-seed.c: /* * Let's make this whole job asynchronous, i.e. let's make * ourselves a barrier for proper initialization of the * random pool. */ k = getrandom(buf, buf_size, GRND_NONBLOCK); if (k < 0 && errno == EAGAIN && synchronous) { log_notice("Kernel entropy pool is not initialized yet, " "waiting until it is."); k = getrandom(buf, buf_size, 0); /* retry synchronously */ } if (k < 0) { log_debug_errno(errno, "Failed to read random data with " "getrandom(), falling back to " "/dev/urandom: %m"); } else if ((size_t) k < buf_size) { log_debug("Short read from getrandom(), falling back to " "/dev/urandom: %m"); } else { getrandom_worked = true; } Nonetheless, a slightly broken systemd-random-seed, that was just released only 11 days ago (v243), is honestly much better than a *non-booting system*... I've sent an RFC patch at [1]. To handle the systemd case, I'll add the discussed "yeah, I'm willing to wait for it" flag (GRND_BLOCK) in v2. If this whole approach is going to be merged, and the slight ABI breakage is to be tolerated (hm?), I wonder how will systemd random-seed handle the semantics change though without doing ugly kernel version checks.. thanks, [1] https://lkml.kernel.org/r/20190914122500.GA1425@darwi-home-pc -- darwi http://darwish.chasingpointers.com
Re: [RFC v2 1/2] ARM: dts: omap3: Add cpu trips and cooling map for omap3 family
> Am 14.09.2019 um 15:42 schrieb Adam Ford : > > On Sat, Sep 14, 2019 at 4:20 AM H. Nikolaus Schaller > wrote: >> >> >>> Am 13.09.2019 um 17:37 schrieb Adam Ford : >>> >>> The OMAP3530, AM3517 and DM3730 all show thresholds of 90C and 105C >>> depending on commercial or industrial temperature ratings. This >>> patch expands the thermal information to the limits of 90 and 105 >>> for alert and critical. >>> >>> For boards who never use industrial temperatures, these can be >>> changed on their respective device trees with something like: >>> >>> &cpu_alert0 { >>> temperature = <85000>; /* millicelsius */ >>> }; >>> >>> &cpu_crit { >>> temperature = <9>; /* millicelsius */ >>> }; >>> >>> Signed-off-by: Adam Ford >>> --- >>> V2: Change the CPU reference to &cpu instead of &cpu0 >>> >>> diff --git a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi >>> b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi >>> index 235ecfd61e2d..dfbd0cb0b00b 100644 >>> --- a/arch/arm/boot/dts/omap3-cpu-thermal.dtsi >>> +++ b/arch/arm/boot/dts/omap3-cpu-thermal.dtsi >>> @@ -17,4 +17,25 @@ cpu_thermal: cpu_thermal { >>> >>> /* sensor ID */ >>> thermal-sensors = <&bandgap 0>; >>> + >>> + cpu_trips: trips { >>> + cpu_alert0: cpu_alert { >>> + temperature = <9>; /* millicelsius */ >>> + hysteresis = <2000>; /* millicelsius */ >>> + type = "passive"; >>> + }; >>> + cpu_crit: cpu_crit { >>> + temperature = <105000>; /* millicelsius */ >>> + hysteresis = <2000>; /* millicelsius */ >>> + type = "critical"; >>> + }; >>> + }; >>> + >>> + cpu_cooling_maps: cooling-maps { >>> + map0 { >>> + trip = <&cpu_alert0>; >>> + cooling-device = >>> + <&cpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; >>> + }; >>> + }; >>> }; >>> -- >>> 2.17.1 >>> >> >> Here is my test log (GTA04A5 with DM3730CBP100). >> "high-load" script is driving the NEON to full power >> and would report calculation errors. >> >> There is no noise visible in the bandgap sensor data >> induced by power supply fluctuations (log shows system >> voltage while charging). >> > > Great data! > >> root@letux:~# ./high-load -n2 >> 100% load stress test for 1 cores running ./neon_loop2 >> Sat Sep 14 09:05:50 UTC 2019 65° 4111mV 1000MHz >> Sat Sep 14 09:05:50 UTC 2019 67° 4005mV 1000MHz >> Sat Sep 14 09:05:52 UTC 2019 68° 4000mV 1000MHz >> Sat Sep 14 09:05:53 UTC 2019 68° 4000mV 1000MHz >> Sat Sep 14 09:05:55 UTC 2019 72° 3976mV 1000MHz >> Sat Sep 14 09:05:56 UTC 2019 72° 4023mV 1000MHz >> Sat Sep 14 09:05:57 UTC 2019 72° 3900mV 1000MHz >> Sat Sep 14 09:05:59 UTC 2019 73° 4029mV 1000MHz >> Sat Sep 14 09:06:00 UTC 2019 73° 3988mV 1000MHz >> Sat Sep 14 09:06:01 UTC 2019 73° 4005mV 1000MHz >> Sat Sep 14 09:06:03 UTC 2019 73° 4011mV 1000MHz >> Sat Sep 14 09:06:04 UTC 2019 73° 4117mV 1000MHz >> Sat Sep 14 09:06:06 UTC 2019 73° 4005mV 1000MHz >> Sat Sep 14 09:06:07 UTC 2019 75° 3994mV 1000MHz >> Sat Sep 14 09:06:08 UTC 2019 75° 3970mV 1000MHz >> Sat Sep 14 09:06:09 UTC 2019 75° 4046mV 1000MHz >> Sat Sep 14 09:06:11 UTC 2019 75° 4005mV 1000MHz >> Sat Sep 14 09:06:12 UTC 2019 75° 4023mV 1000MHz >> Sat Sep 14 09:06:14 UTC 2019 75° 3970mV 1000MHz >> Sat Sep 14 09:06:15 UTC 2019 75° 4011mV 1000MHz >> Sat Sep 14 09:06:16 UTC 2019 77° 4017mV 1000MHz >> Sat Sep 14 09:06:18 UTC 2019 77° 3994mV 1000MHz >> Sat Sep 14 09:06:19 UTC 2019 77° 3994mV 1000MHz >> Sat Sep 14 09:06:20 UTC 2019 77° 3988mV 1000MHz >> Sat Sep 14 09:06:22 UTC 2019 77° 4023mV 1000MHz >> Sat Sep 14 09:06:23 UTC 2019 77° 4023mV 1000MHz >> Sat Sep 14 09:06:24 UTC 2019 78° 4005mV 1000MHz >> Sat Sep 14 09:06:26 UTC 2019 78° 4105mV 1000MHz >> Sat Sep 14 09:06:27 UTC 2019 78° 4011mV 1000MHz >> Sat Sep 14 09:06:28 UTC 2019 78° 3994mV 1000MHz >> Sat Sep 14 09:06:30 UTC 2019 78° 4123mV 1000MHz >> ... >> Sat Sep 14 09:09:57 UTC 2019 88° 4082mV 1000MHz >> Sat Sep 14 09:09:59 UTC 2019 88° 4164mV 1000MHz >> Sat Sep 14 09:10:00 UTC 2019 88° 4058mV 1000MHz >> Sat Sep 14 09:10:01 UTC 2019 88° 4058mV 1000MHz >> Sat Sep 14 09:10:03 UTC 2019 88° 4082mV 1000MHz >> Sat Sep 14 09:10:04 UTC 2019 88° 4058mV 1000MHz >> Sat Sep 14 09:10:06 UTC 2019 88° 4146mV 1000MHz >> Sat Sep 14 09:10:07 UTC 2019 88° 4041mV 1000MHz >> Sat Sep 14 09:10:08 UTC 2019 88° 4035mV 1000MHz >> Sat Sep 14 09:10:10 UTC 2019 88° 4052mV 1000MHz >> Sat Sep 14 09:10:11 UTC 2019 88° 4087mV 1000MHz >> Sat Sep 14 09:10:12 UTC 2019 88° 4152mV 1000MHz >> Sat Sep 14 09:10:14 UTC 2019 88° 4070mV 1000MHz >> Sat Sep 14 09:10:15 UTC 2019 88° 4064mV 1000MHz >> Sat Sep 14 09:10:17 UTC 2019 88° 4170mV 1000MHz >> Sat Sep 14 09:10:18 UTC 2019 88° 4058mV 1000MHz >> Sat Sep 14 09:10:19 UTC 2019 88° 4187mV 1000MHz >> Sat Sep 14 09:10:21 UTC 2019 88° 4093mV 1000MHz >> Sat Sep 14 09:10:22 UTC
Re: [PATCH 2/2] powerpc/83xx: map IMMR with a BAT.
On Fri, 2019-08-23 at 12:50 +, Christophe Leroy wrote: > On mpc83xx with a QE, IMMR is 2Mbytes. > On mpc83xx without a QE, IMMR is 1Mbytes. > Each driver will map a part of it to access the registers it needs. > Some driver will map the same part of IMMR as other drivers. > > In order to reduce TLB misses, map the full IMMR with a BAT. > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/platforms/83xx/misc.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/arch/powerpc/platforms/83xx/misc.c > b/arch/powerpc/platforms/83xx/misc.c > index f46d7bf3b140..1e395b01c535 100644 > --- a/arch/powerpc/platforms/83xx/misc.c > +++ b/arch/powerpc/platforms/83xx/misc.c > @@ -18,6 +18,8 @@ > #include > #include > > +#include > + > #include "mpc83xx.h" > > static __be32 __iomem *restart_reg_base; > @@ -145,6 +147,14 @@ void __init mpc83xx_setup_arch(void) > if (ppc_md.progress) > ppc_md.progress("mpc83xx_setup_arch()", 0); > > + if (!__map_without_bats) { > + int immrsize = IS_ENABLED(CONFIG_QUICC_ENGINE) ? SZ_2M : > SZ_1M; Any reason not to unconditionally make it 2M? After all, the kernel being built with CONFIG_QUICC_ENGINE doesn't mean that the hardware you're running on has it... -Scott
Re: [PATCH v6 00/12] implement KASLR for powerpc/fsl_booke/32
On Tue, 2019-09-10 at 13:34 +0800, Jason Yan wrote: > Hi Scott, > > On 2019/8/28 12:05, Scott Wood wrote: > > On Fri, 2019-08-09 at 18:07 +0800, Jason Yan wrote: > > > This series implements KASLR for powerpc/fsl_booke/32, as a security > > > feature that deters exploit attempts relying on knowledge of the > > > location > > > of kernel internals. > > > > > > Since CONFIG_RELOCATABLE has already supported, what we need to do is > > > map or copy kernel to a proper place and relocate. > > > > Have you tested this with a kernel that was loaded at a non-zero > > address? I > > tried loading a kernel at 0x0400 (by changing the address in the > > uImage, > > and setting bootm_low to 0400 in U-Boot), and it works without > > CONFIG_RANDOMIZE and fails with. > > > > How did you change the load address of the uImage, by changing the > kernel config CONFIG_PHYSICAL_START or the "-a/-e" parameter of mkimage? > I tried both, but it did not work with or without CONFIG_RANDOMIZE. With mkimage. Did you set bootm_low in U-Boot as described above? Was CONFIG_RELOCATABLE set in the non-CONFIG_RANDOMIZE kernel? -Scott
Re: [PATCH for 5.3 2/3] rseq: Fix: Unregister rseq for CLONE_SETTLS
There is an ongoing discussion on the choice of flag we want to care about here. Therefore, please don't pull this patch until we reach an agreement. Thanks, Mathieu - On Sep 13, 2019, at 11:12 AM, Mathieu Desnoyers mathieu.desnoy...@efficios.com wrote: > It has been reported by Google that rseq is not behaving properly > with respect to clone when CLONE_VM is used without CLONE_THREAD. > It keeps the prior thread's rseq TLS registered when the TLS of the > thread has moved, so the kernel deals with the wrong TLS. > > The approach of clearing the per task-struct rseq registration > on clone with CLONE_THREAD flag is incomplete. It does not cover > the use-case of clone with CLONE_VM set, but without CLONE_THREAD. > > Looking more closely at each of the clone flags: > > - CLONE_THREAD, > - CLONE_VM, > - CLONE_SETTLS. > > It appears that the flag we really want to track is CLONE_SETTLS, which > moves the location of the TLS for the child, making the rseq > registration point to the wrong TLS. > > Suggested-by: "H . Peter Anvin" > Signed-off-by: Mathieu Desnoyers > Cc: Thomas Gleixner > Cc: Peter Zijlstra (Intel) > Cc: "Paul E. McKenney" > Cc: Boqun Feng > Cc: "H . Peter Anvin" > Cc: Paul Turner > Cc: Dmitry Vyukov > Cc: linux-...@vger.kernel.org > Cc: > --- > include/linux/sched.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index 9f51932bd543..76bf55b5cccf 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -1919,11 +1919,11 @@ static inline void rseq_migrate(struct task_struct *t) > > /* > * If parent process has a registered restartable sequences area, the > - * child inherits. Only applies when forking a process, not a thread. > + * child inherits. Unregister rseq for a clone with CLONE_SETTLS set. > */ > static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags) > { > - if (clone_flags & CLONE_THREAD) { > + if (clone_flags & CLONE_SETTLS) { > t->rseq = NULL; > t->rseq_sig = 0; > t->rseq_event_mask = 0; > -- > 2.17.1 -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com