Re: [PATCHv2 0/4] perf/core: Add support to exclude kernel mode PMU tracing

2021-03-04 Thread Andi Kleen
Sai Prakash Ranjan  writes:
>
> "Consider a system where disk contents are encrypted and the encryption
> key is set up by the user when mounting the file system. From that point
> on the encryption key resides in the kernel. It seems reasonable to
> expect that the disk encryption key be protected from exfiltration even
> if the system later suffers a root compromise (or even against insiders
> that have root access), at least as long as the attacker doesn't
> manage to compromise the kernel."

Normally disk encryption is in specialized work queues. It's total
overkill to restrict all of the kernel if you just want to restrict
those work queues.

I would suggest some more analysis where secrets are actually stored
and handled first.

-Andi


Re: [PATCH v3 1/2] dt-bindings: rng: bcm2835: document reset support

2021-03-04 Thread Rob Herring
On Thu, Mar 4, 2021 at 6:07 AM Nicolas Saenz Julienne
 wrote:
>
> Hi Alvaro,
>
> On Tue, 2021-02-23 at 18:00 +0100, Álvaro Fernández Rojas wrote:
> > Some devices may need to perform a reset before using the RNG, such as the
> > BCM6368.
> >
> > Signed-off-by: Álvaro Fernández Rojas 
> > ---
> >  v3: make resets required if brcm,bcm6368-rng.
> >  v2: document reset support.
> >
> >  .../devicetree/bindings/rng/brcm,bcm2835.yaml   | 17 +
> >  1 file changed, 17 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/rng/brcm,bcm2835.yaml 
> > b/Documentation/devicetree/bindings/rng/brcm,bcm2835.yaml
> > index c147900f9041..11c23e1f6988 100644
> > --- a/Documentation/devicetree/bindings/rng/brcm,bcm2835.yaml
> > +++ b/Documentation/devicetree/bindings/rng/brcm,bcm2835.yaml
> > @@ -37,6 +37,21 @@ required:
> >
> >
> >  additionalProperties: false
>
> I can't claim I fully understand all the meta stuff in shemas, so I generally
> just follow the patterns already available out there. That said, shoudln't 
> this
> be at the end, just before the examples? Maybe the cause of that odd warning
> you got there?

Yes, 'resets' needs to be defined under 'properties' as
additionalProperties can't 'see' into if/then schemas. The top level
needs to define everything and then if/then schema just add additional
constraints.

>
> > +if:
> > +  properties:
> > +compatible:
> > +  enum:
> > +- brcm,bcm6368-rng
> > +then:
> > +  properties:
> > +resets:
> > +  maxItems: 1
> > +  required:
> > +- resets
>
> I belive you can't really make a property required when the bindings for
> 'brcm,bcm6368-rng' were already defined. This will break the schema for those
> otherwise correct devicetrees.

Right, unless not having the property meant the device never would have worked.

Rob


Re: [PATCH] net: mellanox: mlx5: fix error return code in mlx5_fpga_device_start()

2021-03-04 Thread Heiner Kallweit
On 04.03.2021 15:18, Jia-Ju Bai wrote:
> When mlx5_is_fpga_lookaside() returns a non-zero value, no error 
> return code is assigned.
> To fix this bug, err is assigned with -EINVAL as error return code.
> 
To me it looks like the current behavior is intentional.
Did you verify that it's actually an error condition if the
function returns true? Please don't blindly trust such code checkers.

> Reported-by: TOTE Robot 
> Signed-off-by: Jia-Ju Bai 
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c
> index 2ce4241459ce..c9e6da97126f 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/fpga/core.c
> @@ -198,8 +198,10 @@ int mlx5_fpga_device_start(struct mlx5_core_dev *mdev)
>   mlx5_fpga_info(fdev, "FPGA card %s:%u\n", mlx5_fpga_name(fpga_id), 
> fpga_id);
>  
>   /* No QPs if FPGA does not participate in net processing */
> - if (mlx5_is_fpga_lookaside(fpga_id))
> + if (mlx5_is_fpga_lookaside(fpga_id)) {
> + err = -EINVAL;
>   goto out;
> + }
>  
>   mlx5_fpga_info(fdev, "%s(%d): image, version %u; SBU %06x:%04x version 
> %d\n",
>  mlx5_fpga_image_name(fdev->last_oper_image),
> 



Re: [PATCH] ipc/msg: add msgsnd_timed and msgrcv_timed syscall for system V message queue

2021-03-04 Thread Manfred Spraul

Hi Eric,


On 3/4/21 2:12 AM, Andrew Morton wrote:

On Tue, 23 Feb 2021 23:11:43 +0800 Eric Gao  wrote:


sometimes, we need the msgsnd or msgrcv syscall can return after a limited
time, so that the business thread do not be blocked here all the time. In
this case, I add the msgsnd_timed and msgrcv_timed syscall that with time
parameter, which has a unit of ms.

Please cc Manfred and Davidlohr on ipc/ changes.

The above is a very brief description for a new syscall!  Please go to
great lengths to tell us why this is considered useful - what are the
use cases?

Also, please fully describe the proposed syscall interface right here
in the changelog.  Please be prepared to later prepare a full manpage.


...
+SYSCALL_DEFINE5(msgsnd_timed, int, msqid, struct msgbuf __user *, msgp, 
size_t, msgsz,
+   int, msgflg, long, timeoutms)

Specifying the timeout in milliseconds is problematic - it's very
coarse.  See sys_epoll_pwait2()'s use of timespecs.


What about using an absolute timeout, like in mq_timedsend()?

That makes restart handling after signals far simpler.


> -   schedule();
> +
> +   /* sometimes, we need msgsnd syscall return after a given 
time */
> +   if (timeoutms <= 0) {
> +   schedule();
> +   } else {
> +   timeoutms = schedule_timeout(timeoutms);
> +   if (timeoutms == 0)
> +   timeoutflag = true;
> +   }

I wonder if this should be schedule_timeout_interruptible() or at least
schedule_timeout_killable() instead of schedule_timeout(). If it should,
this should probably be done as a separate change.
No. schedule_timeout_interruptible() just means that 
__set_current_state() is called before the schedule_timeout().


The __set_current_state() is done directly in msg.c, before dropping the 
lock.


--

    Manfred



RE: [RFC PATCH 2/5] char: rpmb: provide a user space interface

2021-03-04 Thread Winkler, Tomas


> 
> Winkler, Tomas  writes:
> 
> >> "Winkler, Tomas"  writes:
> >>
> >> >> The user space API is achieved via a number of synchronous IOCTLs.
> >> >>
> >> >>   * RPMB_IOC_VER_CMD - simple versioning API
> >> >>   * RPMB_IOC_CAP_CMD - query of underlying capabilities
> >> >>   * RPMB_IOC_PKEY_CMD - one time programming of access key
> >> >>   * RPMB_IOC_COUNTER_CMD - query the write counter
> >> >>   * RPMB_IOC_WBLOCKS_CMD - write blocks to device
> >> >>   * RPMB_IOC_RBLOCKS_CMD - read blocks from device
> >> >>
> >> >> The keys used for programming and writing blocks to the device are
> >> >> key_serial_t handles as provided by the keyctl() interface.
> >> >>
> >> >> [AJB: here there are two key differences between this and the
> >> >> original proposal. The first is the dropping of the sequence of
> >> >> preformated frames in favour of explicit actions. The second is
> >> >> the introduction of key_serial_t and the keyring API for
> >> >> referencing the key to use]
> >> >
> >> > Putting it gently I'm not sure this is good idea, from the security
> >> > point of
> >> view.
> >> > The key has to be possession of the one that signs the frames as
> >> > they are,
> >> it doesn't mean it is linux kernel keyring, it can be other party on
> >> different system.
> >> > With this approach you will make the other usecases not applicable.
> >> > It is less then trivial to move key securely from one system to another.
> >>
> >> OK I can understand the desire for such a use-case but it does
> >> constrain the interface on the kernel with access to the hardware to
> >> purely providing a pipe to the raw hardware while also having to
> >> expose the details of the HW to userspace.
> > This is the use case in Android. The key is in the "trusty" which
> > different os running in a virtual environment. The file storage
> > abstraction is implemented there. I'm not sure the point of
> > constraining the kernel, can you please elaborate on that.
> 
> Well the kernel is all about abstracting differences not baking in 
> assumptions.
> However can I ask a bit more about this security model?
> Is the secure enclave just a separate userspace process or is it in a separate
> virtual machine? Is it accessible at all by the kernel running the driver?

It's not an assumption this is working for few years already 
(https://source.android.com/security/trusty#application_services) 
The model is that you have a trusted environment (TEE)  in which can be in any 
of the form you described above.
And there is established agreement via the RPMB key that TEE is only entity 
that can produce content to be stored on RPBM,
The RPMB hardware also ensure that nobody can catch it in the middle and replay 
that storage event. 

My point is that interface you are suggesting is not covering all possible 
usages of RPMB, actually usages that are already in place.

> The fact that key id is passed down into the kernel doesn't have to imply the
> kernel does the final cryptographic operation. In the ARM world you could
> make a call to the secure world to do the operation for you. I note the
> keyctl() interface already has support for going to userspace to make queries
> of the keyring.  Maybe what is really needed is an abstraction for the kernel
> to delegate the MAC calculation to some other trusted process that also
> understands the keyid.

Sure but that you want need to make sure that the entity that creates the 
content has the right to use this specific key, so you will need to create 
another channel of trust. 
And this trust has to be established somewhere at the manufacturing time. 

> >
> > Also doesn't this break down after a PROGRAM_KEY event as
> >> the key will have had to traverse into the "untrusted" kernel?
> >
> > This is one in a life event of the card happening on the manufacturing
> > floor, maybe even not performed on Linux.
> 
> In an off list conversation it was suggested that maybe the PROGRAM_KEY
> ioctl should be disabled for locked down kernels to dissuade production use
> of the facility (it is handy for testing t

This is really protected by the hardware,  also once you are programming key 
your platform would be rather sealed already at least it's TEE environment, as 
this is the other part that knows the key.

> >> I wonder if virtio-rpmb may be of help here? You could wrap up up the
> >> front- end in the security domain that has the keys although I don't
> >> know how easy it would be for a backend to work with real hardware?
> >
> > I'm open to see any proposal, not sure I can wrap may head about it right
> now.
> >
> > Anyway I was about to send the new round of my code,  but let's come to
> common ground first.
> >
> 
> OK - I'll see what the others say.
> 
> --
> Alex Bennée


Re: [PATCH v2 1/3] regmap-irq: Add support for peripheral offsets

2021-03-04 Thread Mark Brown
On Thu, Mar 04, 2021 at 10:27:35AM -0800, Guru Das Srinagesh wrote:
> On Thu, Nov 12, 2020 at 07:33:12PM +, Mark Brown wrote:

> > supposed to do.  Nothing here says what POLARITY_HI and POLARITY_LO are,
> > how they interact or anything.

> The POLARITY_HI and POLARITY_LO registers were described very briefly in
> the cover letter. If an interrupt is already configured as either edge-
> or level-triggered, setting the corresponding bit for it in the
> POLARITY_HI register further configures it as rising-edge or level-high
> triggered (as the case may be), while setting the same bit in
> POLARITY_LO further configures it as falling-edge or level-low
> triggered. I could certainly add this information to the commit message
> as well.

So this is just a trigger type control that's in two discontiguous bits
possibly in different registers or something?  This doesn't sound like
anything generic with the API you're describing, if that's what it is
the interface should also handle things like four bits (one for each
type) or having the different values mapped differently within the two
bits that are spread out (eg, you could have one bit for polarity and
another for edge/level).

> > For the address offsets I'm not sure that this is the best way to
> > represent things.  It looks like the hardware this is trying to describe
> > is essentially a bunch of separate interrupt controllers that happen to
> > share an upstream interrupt

> Sorry but isn't this essentially the same as what the framework already knows 
> as
> the "sub-irq" concept, with the key difference that the register stride
> is not fixed? Everything else is the same (except for a couple of minor
> points noted below) - a main IRQ register that indicates sub-irq blocks
> that have unhandled interrupts, as well as interrupt handling and
> servicing.

Like I said in my original review it is extremely hard to tell from your
patch what you are trying to implement, and it's now been more than four
months since you sent it which isn't helping anything.  This means it is
also extremely hard to tell if the patch is doing the same thing as
sub_irq.

IIRC it appeared that there was no top level interrupt status register,
the point with sub_irq is that we don't need to read all the status
registers because there's a top level status register which says which
of them have signals in them (including the possibility that more than
one bit in the top level status might indicate the same leaf status
register).  If the device doesn't have that it doesn't have sub_irqs.
Note that sub_irq only affects status register reads, it doesn't affect
other things like acking or masking.

On the other hand if this *is* the same thing as sub_irqs then why is it
completely separate code and not sharing anything with it?

As I said at the time you need to split this into at least two distinct
patches with clear changelogs which explain what they are trying to
implement, I don't think it's useful to discuss this further without
that.  I can't give you any clearer advice on how to implement whatever
you are trying to implement without having some idea of what that is.

> > clearer if at least the implementation looked like this.  Instead of
> > having to check for this array of offsets at every use point (which is
> > going to be rarely used and hence prone to bugs)

> Well, using irq_reg_stride already does exactly this - calculating the
> right register to access at every use point, as an offset from the _base
> register (status, ack, type, et c.). Peripheral offsets would just be
> another way of calculating the right register, that's all. And we could
> have a macro as well.

The stride code is executed in all paths, it doesn't add conditional
statements all over the place.  This helps a lot, we know it's being run
all the time as opposed to being a lot of separate code paths that are
rarely run - the case without a stride is just a stride of 1.

> Sure, I can look into how this approach would look like, but given that
> the QCOM register arrangement of main vs sub-irq is essentially the same
> as what the framework currently understands, couldn't we simply have a
> macro to change the way the right register offset is calculated
> (irq_reg_stride vs. peripheral offsets)?

I'm not sure macros all over the place is going to be clearer than
conditional statements all over the place.  As with what you were saying
about sub_irq if you think the two things are equivalent then why is one
not implemented in terms of the other rather than doing conditional code
on every single use?

> Also, could you elaborate more on the genirq route? I'm not sure where
> to start looking to evaluate one route vs the other.

Register a separate regmap-irq for each of these perhiperals in your
list, using the same parent interrupt for all of them and setting
IRQF_SHARED.  They will then be handled like any other shared interrupt,
if the parent interrupt fires then genirq will go through and call 

Re: [PATCH 1/4] platform/x86: simatic-ipc: add main driver for Siemens devices

2021-03-04 Thread Henning Schild
Am Thu, 4 Mar 2021 12:11:12 +0200
schrieb Andy Shevchenko :

> On Thu, Mar 4, 2021 at 8:36 AM Henning Schild
>  wrote:
> >
> > From: Henning Schild 
> >
> > This mainly implements detection of these devices and will allow
> > secondary drivers to work on such machines.
> >
> > The identification is DMI-based with a vendor specific way to tell
> > them apart in a reliable way.
> >
> > Drivers for LEDs and Watchdogs will follow to make use of that
> > platform detection.  
> 
> > Signed-off-by: Gerd Haeussler 
> > Signed-off-by: Henning Schild   
> 
> The order is wrong taking into account the From line in the body. So,
> it's not clear who is the author, who is a co-developer, and who is
> the committer (one person may utilize few roles).
> Check for the rest of the series as well (basically this is the rule
> of thumb to recheck entire code for the comment you have got at any
> single place of it).

For some code Gerd is the author, and i am the co-Author. We even have
Jan in the mix at places. At least in copyright headers.

I will remain the committer for the three of us. And since i do not
know exactly what the problem is i will add only my Signed-off to avoid
confusion.

Please speak up it that would be wrong as well and point me to the docs
i missed. Or tell me how it should be done. 

> ...
> 
> > +config SIEMENS_SIMATIC_IPC
> > +   tristate "Siemens Simatic IPC Class driver"
> > +   depends on PCI
> > +   help
> > + This Simatic IPC class driver is the central of several
> > drivers. It
> > + is mainly used for system identification, after which
> > drivers in other
> > + classes will take care of driving specifics of those
> > machines.
> > + i.e. leds and watchdogs  
> 
> LEDs
> watchdog. (missed period and singular form)
> 
> Module name?
> 
> >  endif # X86_PLATFORM_DEVICES  
> 
> Not sure about the ordering of the section in Kconfig, but it is up to
> maintainers.
> 
> ...
> 
> > +# Siemens Simatic Industrial PCs
> > +obj-$(CONFIG_SIEMENS_SIMATIC_IPC)  += simatic-ipc.o  
> 
> Ditto.

I will check both again if a find a pattern, otherwise will wait for
maintainers to complain and hopefully suggest.

> ...
> 
> > + * Siemens SIMATIC IPC driver for LEDs  
> 
> It seems to be used in patch 4 which is not LED related. Also, why is
> it here if it's for LEDs?
> 
> ...
> 
> > + * This program is free software; you can redistribute it and/or
> > modify
> > + * it under the terms of the GNU General Public License version 2
> > as
> > + * published by the Free Software Foundation.  
> 
> Replace with SPDX
> 
> ...
> 
> > +#include 
> > +#include 
> > +#include   
> 
> Ordered?
> 
> > +#include   
> 
> ...
> 
> > +static int register_platform_devices(u32 station_id)
> > +{
> > +   int i;
> > +   u8 ledmode = SIMATIC_IPC_DEVICE_NONE;
> > +   u8 wdtmode = SIMATIC_IPC_DEVICE_NONE;  
> 
> Reversed xmas tree order?

I do not get this, it is almost easter egg order time. Please explain.

> > +   platform_data.devmode = SIMATIC_IPC_DEVICE_NONE;
> > +
> > +   for (i = 0; i < ARRAY_SIZE(device_modes); i++) {
> > +   if (device_modes[i].station_id == station_id) {
> > +   ledmode = device_modes[i].led_mode;
> > +   wdtmode = device_modes[i].wdt_mode;
> > +   break;
> > +   }
> > +   }
> > +
> > +   if (ledmode != SIMATIC_IPC_DEVICE_NONE) {
> > +   platform_data.devmode = ledmode;  
> 
> > +   ipc_led_platform_device =
> > platform_device_register_data
> > +   (NULL, KBUILD_MODNAME "_leds",
> > PLATFORM_DEVID_NONE,
> > +_data, sizeof(struct
> > simatic_ipc_platform));  
> 
> Strange indentation (second line).
> 
> > +   if (IS_ERR(ipc_led_platform_device))
> > +   return PTR_ERR(ipc_led_platform_device);  
> 
> > +   pr_debug(KBUILD_MODNAME ": device=%s created\n",
> > +ipc_led_platform_device->name);  
> 
> Utilize pr_fmt() instead of adding prefixes like this.
> 
> > +   }  
> 
> > +   if (wdtmode != SIMATIC_IPC_DEVICE_NONE) {
> > +   platform_data.devmode = wdtmode;
> > +   ipc_wdt_platform_device =
> > platform_device_register_data
> > +   (NULL, KBUILD_MODNAME "_wdt",
> > PLATFORM_DEVID_NONE,
> > +_data, sizeof(struct
> > simatic_ipc_platform));
> > +   if (IS_ERR(ipc_wdt_platform_device))
> > +   return PTR_ERR(ipc_wdt_platform_device);
> > +
> > +   pr_debug(KBUILD_MODNAME ": device=%s created\n",
> > +ipc_wdt_platform_device->name);
> > +   }  
> 
> Same comments as above.
> 
> > +   if (ledmode == SIMATIC_IPC_DEVICE_NONE &&
> > +   wdtmode == SIMATIC_IPC_DEVICE_NONE) {
> > +   pr_warn(KBUILD_MODNAME
> > +   ": unsupported IPC 

Re: [PATCH v3 24/32] KVM: arm64: Reserve memory for host stage 2

2021-03-04 Thread Will Deacon
On Tue, Mar 02, 2021 at 02:59:54PM +, Quentin Perret wrote:
> Extend the memory pool allocated for the hypervisor to include enough
> pages to map all of memory at page granularity for the host stage 2.
> While at it, also reserve some memory for device mappings.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/kvm/hyp/include/nvhe/mm.h | 23 ++-
>  arch/arm64/kvm/hyp/nvhe/setup.c  | 12 
>  arch/arm64/kvm/hyp/reserved_mem.c|  2 ++
>  3 files changed, 36 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/mm.h 
> b/arch/arm64/kvm/hyp/include/nvhe/mm.h
> index ac0f7fcffd08..411a35db949c 100644
> --- a/arch/arm64/kvm/hyp/include/nvhe/mm.h
> +++ b/arch/arm64/kvm/hyp/include/nvhe/mm.h
> @@ -53,7 +53,7 @@ static inline unsigned long 
> __hyp_pgtable_max_pages(unsigned long nr_pages)
>   return total;
>  }
>  
> -static inline unsigned long hyp_s1_pgtable_pages(void)
> +static inline unsigned long __hyp_pgtable_total_pages(void)
>  {
>   unsigned long res = 0, i;
>  
> @@ -63,9 +63,30 @@ static inline unsigned long hyp_s1_pgtable_pages(void)
>   res += __hyp_pgtable_max_pages(reg->size >> PAGE_SHIFT);
>   }
>  
> + return res;
> +}
> +
> +static inline unsigned long hyp_s1_pgtable_pages(void)
> +{
> + unsigned long res;
> +
> + res = __hyp_pgtable_total_pages();
> +
>   /* Allow 1 GiB for private mappings */
>   res += __hyp_pgtable_max_pages(SZ_1G >> PAGE_SHIFT);
>  
>   return res;
>  }
> +
> +static inline unsigned long host_s2_mem_pgtable_pages(void)
> +{
> + return __hyp_pgtable_total_pages() + 16;

Is this 16 due to the possibility of a concatenated pgd? If so, please add
a comment to that effect.

With that:

Acked-by: Will Deacon 

Will


Re: [PATCH v3 25/32] KVM: arm64: Sort the hypervisor memblocks

2021-03-04 Thread Will Deacon
On Tue, Mar 02, 2021 at 02:59:55PM +, Quentin Perret wrote:
> We will soon need to check if a Physical Address belongs to a memblock
> at EL2, so make sure to sort them so this can be done efficiently.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/kvm/hyp/reserved_mem.c | 19 +++
>  1 file changed, 19 insertions(+)

Acked-by: Will Deacon 

Will


[PATCH v3] amba: Remove deferred device addition

2021-03-04 Thread Saravana Kannan
The uevents generated for an amba device need PID and CID information
that's available only when the amba device is powered on, clocked and
out of reset. So, if those resources aren't available, the information
can't be read to generate the uevents. To workaround this requirement,
if the resources weren't available, the device addition was deferred and
retried periodically.

However, this deferred addition retry isn't based on resources becoming
available. Instead, it's retried every 5 seconds and causes arbitrary
probe delays for amba devices and their consumers.

Also, maintaining a separate deferred-probe like mechanism is
maintenance headache.

With this commit, instead of deferring the device addition, we simply
defer the generation of uevents for the device and probing of the device
(because drivers needs PID and CID to match) until the PID and CID
information can be read. This allows us to delete all the amba specific
deferring code and also avoid the arbitrary probing delays.

Cc: Rob Herring 
Cc: Ulf Hansson 
Cc: John Stultz 
Cc: Saravana Kannan 
Cc: Linus Walleij 
Cc: Sudeep Holla 
Cc: Nicolas Saenz Julienne 
Cc: Geert Uytterhoeven 
Cc: Marek Szyprowski 
Cc: Russell King 
Signed-off-by: Saravana Kannan 
---

v1 -> v2:
- Dropped RFC tag
- Complete rewrite to not use stub devices.
v2 -> v3:
- Flipped the if() condition for hard-coded periphids.
- Added a stub driver to handle the case where all amba drivers are
  modules loaded by uevents.
- Cc Marek after I realized I forgot to add him.

Marek,

Would you mind testing this? It looks okay with my limited testing.

-Saravana

 drivers/amba/bus.c | 329 +
 1 file changed, 151 insertions(+), 178 deletions(-)

diff --git a/drivers/amba/bus.c b/drivers/amba/bus.c
index 939ca220bf78..836d6d23bba3 100644
--- a/drivers/amba/bus.c
+++ b/drivers/amba/bus.c
@@ -149,11 +149,101 @@ static struct attribute *amba_dev_attrs[] = {
 };
 ATTRIBUTE_GROUPS(amba_dev);
 
+static int amba_read_periphid(struct amba_device *dev)
+{
+   u32 size;
+   void __iomem *tmp;
+   u32 pid, cid;
+   struct reset_control *rstc;
+   int i, ret;
+
+   /*
+* Dynamically calculate the size of the resource
+* and use this for iomap
+*/
+   size = resource_size(>res);
+   tmp = ioremap(dev->res.start, size);
+   if (!tmp)
+   return -ENOMEM;
+
+   ret = dev_pm_domain_attach(>dev, true);
+   if (ret)
+   goto err_pm;
+
+   ret = amba_get_enable_pclk(dev);
+   if (ret)
+   goto err_clk;
+
+   /*
+* Find reset control(s) of the amba bus and de-assert them.
+*/
+   rstc = of_reset_control_array_get_optional_shared(dev->dev.of_node);
+   if (IS_ERR(rstc)) {
+   ret = PTR_ERR(rstc);
+   if (ret != -EPROBE_DEFER)
+   dev_err(>dev, "can't get reset: %d\n",
+   ret);
+   goto err_reset;
+   }
+   reset_control_deassert(rstc);
+   reset_control_put(rstc);
+
+   /*
+* Read pid and cid based on size of resource
+* they are located at end of region
+*/
+   for (pid = 0, i = 0; i < 4; i++)
+   pid |= (readl(tmp + size - 0x20 + 4 * i) & 255) <<
+   (i * 8);
+   for (cid = 0, i = 0; i < 4; i++)
+   cid |= (readl(tmp + size - 0x10 + 4 * i) & 255) <<
+   (i * 8);
+
+   if (cid == CORESIGHT_CID) {
+   /* set the base to the start of the last 4k block */
+   void __iomem *csbase = tmp + size - 4096;
+
+   dev->uci.devarch =
+   readl(csbase + UCI_REG_DEVARCH_OFFSET);
+   dev->uci.devtype =
+   readl(csbase + UCI_REG_DEVTYPE_OFFSET) & 0xff;
+   }
+
+   amba_put_disable_pclk(dev);
+
+   if (cid == AMBA_CID || cid == CORESIGHT_CID) {
+   dev->periphid = pid;
+   dev->cid = cid;
+   }
+
+   if (!dev->periphid)
+   ret = -ENODEV;
+
+   return ret;
+
+err_reset:
+   amba_put_disable_pclk(dev);
+err_clk:
+   dev_pm_domain_detach(>dev, true);
+err_pm:
+   iounmap(tmp);
+   return ret;
+}
+
 static int amba_match(struct device *dev, struct device_driver *drv)
 {
struct amba_device *pcdev = to_amba_device(dev);
struct amba_driver *pcdrv = to_amba_driver(drv);
 
+   if (!pcdev->periphid) {
+   int ret = amba_read_periphid(pcdev);
+
+   if (ret)
+   return ret;
+   dev_set_uevent_suppress(dev, false);
+   kobject_uevent(>kobj, KOBJ_ADD);
+   }
+
/* When driver_override is set, only bind to the matching driver */
if (pcdev->driver_override)
return !strcmp(pcdev->driver_override, drv->name);
@@ -332,6 +422,42 @@ static int __init amba_init(void)
 
 

[PATCH] pinctrl: qcom: lpass lpi: use default pullup/strength values

2021-03-04 Thread Jonathan Marek
If these fields are not set in dts, the driver will use these variables
uninitialized to set the fields. Not only will it set garbage values for
these fields, but it can overflow into other fields and break those.

In the current sm8250 dts, the dmic01 entries do not have a pullup setting,
and might not work without this change.

Fixes: 6e261d1090d6 ("pinctrl: qcom: Add sm8250 lpass lpi pinctrl driver")
Signed-off-by: Jonathan Marek 
---
 drivers/pinctrl/qcom/pinctrl-lpass-lpi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c 
b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
index 369ee20a7ea95..2f19ab4db7208 100644
--- a/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
+++ b/drivers/pinctrl/qcom/pinctrl-lpass-lpi.c
@@ -392,7 +392,7 @@ static int lpi_config_set(struct pinctrl_dev *pctldev, 
unsigned int group,
  unsigned long *configs, unsigned int nconfs)
 {
struct lpi_pinctrl *pctrl = dev_get_drvdata(pctldev->dev);
-   unsigned int param, arg, pullup, strength;
+   unsigned int param, arg, pullup = LPI_GPIO_BIAS_DISABLE, strength = 2;
bool value, output_enabled = false;
const struct lpi_pingroup *g;
unsigned long sval;
-- 
2.26.1



Re: [PATCH v3 23/32] KVM: arm64: Make memcache anonymous in pgtable allocator

2021-03-04 Thread Will Deacon
On Tue, Mar 02, 2021 at 02:59:53PM +, Quentin Perret wrote:
> The current stage2 page-table allocator uses a memcache to get
> pre-allocated pages when it needs any. To allow re-using this code at
> EL2 which uses a concept of memory pools, make the memcache argument of
> kvm_pgtable_stage2_map() anonymous, and let the mm_ops zalloc_page()
> callbacks use it the way they need to.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 6 +++---
>  arch/arm64/kvm/hyp/pgtable.c | 4 ++--
>  2 files changed, 5 insertions(+), 5 deletions(-)

Acked-by: Will Deacon 

Will


Re: [PATCH 0/4] add device drivers for Siemens Industrial PCs

2021-03-04 Thread Henning Schild
Am Thu, 4 Mar 2021 12:20:22 +0200
schrieb Andy Shevchenko :

> On Thu, Mar 4, 2021 at 12:19 PM Andy Shevchenko
>  wrote:
> > On Thu, Mar 4, 2021 at 9:29 AM Henning Schild
> >  wrote:  
> 
> > I have given a few comments here and there, so please check the
> > entire series and address them in _all_ similar locations. As I
> > have noticed, I have different approach about P2SB code, I have to
> > give the series a dust and see if you can utilize it.
> >
> > I would like to be Cc'ed on the next version.  
> 
> One more thing, is it Apollo Lake based?

Not sure i can answer that, or what you even refer to. The whole series
is about a range of devices, some even have sub-models with differing
CPUs and Lakes

regards,
Henning



Re: [PATCH v3 22/32] KVM: arm64: Refactor __populate_fault_info()

2021-03-04 Thread Will Deacon
On Tue, Mar 02, 2021 at 02:59:52PM +, Quentin Perret wrote:
> Refactor __populate_fault_info() to introduce __get_fault_info() which
> will be used once the host is wrapped in a stage 2.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 37 ++---
>  1 file changed, 21 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h 
> b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 6c1f51f25eb3..1f017c9851bb 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -160,19 +160,9 @@ static inline bool __translate_far_to_hpfar(u64 far, u64 
> *hpfar)
>   return true;
>  }
>  
> -static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
> +static inline bool __get_fault_info(u64 esr, struct kvm_vcpu_fault_info 
> *fault)
>  {
> - u8 ec;
> - u64 esr;
> - u64 hpfar, far;
> -
> - esr = vcpu->arch.fault.esr_el2;
> - ec = ESR_ELx_EC(esr);
> -
> - if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
> - return true;
> -
> - far = read_sysreg_el2(SYS_FAR);
> + fault->far_el2 = read_sysreg_el2(SYS_FAR);
>  
>   /*
>* The HPFAR can be invalid if the stage 2 fault did not
> @@ -188,14 +178,29 @@ static inline bool __populate_fault_info(struct 
> kvm_vcpu *vcpu)
>   if (!(esr & ESR_ELx_S1PTW) &&
>   (cpus_have_final_cap(ARM64_WORKAROUND_834220) ||
>(esr & ESR_ELx_FSC_TYPE) == FSC_PERM)) {
> - if (!__translate_far_to_hpfar(far, ))
> + if (!__translate_far_to_hpfar(fault->far_el2, 
> >hpfar_el2))
>   return false;
>   } else {
> - hpfar = read_sysreg(hpfar_el2);
> + fault->hpfar_el2 = read_sysreg(hpfar_el2);
>   }
>  
> - vcpu->arch.fault.far_el2 = far;
> - vcpu->arch.fault.hpfar_el2 = hpfar;
> + return true;
> +}
> +
> +static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
> +{
> + u8 ec;
> + u64 esr;
> +
> + esr = vcpu->arch.fault.esr_el2;
> + ec = ESR_ELx_EC(esr);
> +
> + if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
> + return true;
> +
> + if (!__get_fault_info(esr, >arch.fault))
> + return false;
> +
>   return true;

Just return __get_fault_info(esr, >arch.fault); here.

With that:

Acked-by: Will Deacon 

Will


Re: [PATCH v3 20/32] KVM: arm64: Refactor kvm_arm_setup_stage2()

2021-03-04 Thread Will Deacon
On Tue, Mar 02, 2021 at 02:59:50PM +, Quentin Perret wrote:
> In order to re-use some of the stage 2 setup code at EL2, factor parts
> of kvm_arm_setup_stage2() out into separate functions.
> 
> No functional change intended.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 26 +
>  arch/arm64/kvm/hyp/pgtable.c | 32 +
>  arch/arm64/kvm/reset.c   | 42 +++-
>  3 files changed, 62 insertions(+), 38 deletions(-)

Looks much better than the big header in v2, thanks!

Acked-by: Will Deacon 

Will


Re: [perf] perf_fuzzer causes unchecked MSR access error

2021-03-04 Thread Liang, Kan




On 3/3/2021 3:22 PM, Vince Weaver wrote:

On Wed, 3 Mar 2021, Liang, Kan wrote:


We never use bit 58. It should be a new issue.


Actually, KVM uses it. They create a fake event called VLBR_EVENT, which 
uses bit 58. It's introduced from the commit 097e4311cda9 ("perf/x86: 
Add constraint to create guest LBR event without hw counter").


Since it's a fake event, it doesn't support PEBS. Perf should reject it 
if it sets the precise_ip.


The below patch should fix the MSR access error.

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5bac48d..1ea3c67 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3659,6 +3659,10 @@ static int intel_pmu_hw_config(struct perf_event 
*event)

return ret;

if (event->attr.precise_ip) {
+
+		if ((event->attr.config & INTEL_ARCH_EVENT_MASK) == 
INTEL_FIXED_VLBR_EVENT)

+   return -EINVAL;
+
 		if (!(event->attr.freq || (event->attr.wakeup_events && 
!event->attr.watermark))) {

event->hw.flags |= PERF_X86_EVENT_AUTO_RELOAD;
if (!(event->attr.sample_type &

Thanks,
Kan


Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Segher Boessenkool
On Thu, Mar 04, 2021 at 09:54:44AM -0800, Nick Desaulniers wrote:
> On Thu, Mar 4, 2021 at 9:42 AM Marco Elver  wrote:
> include/linux/compiler.h:246:
> prevent_tail_call_optimization
> 
> commit a9a3ed1eff36 ("x86: Fix early boot crash on gcc-10, third try")

That is much heavier than needed (an mb()).  You can just put an empty
inline asm after a call before a return, and that call cannot be
optimised to a sibling call: (the end of a function is an implicit
return:)

Instead of:

void g(void);
void f(int x)
if (x)
g();
}

Do:

void g(void);
void f(int x)
if (x)
g();
asm("");
}

This costs no extra instructions, and certainly not something as heavy
as an mb()!  It works without the "if" as well, of course, but with it
it is a more interesting example of a tail call.


Segher


Re: [GIT PULL] sound fixes for 5.12-rc2

2021-03-04 Thread pr-tracker-bot
The pull request you sent on Thu, 04 Mar 2021 10:49:59 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git 
> tags/sound-5.12-rc2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/43df5242af4ed67e8811257ab1bfe6a07e4a5858

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] KVM changes for 5.12-rc2

2021-03-04 Thread pr-tracker-bot
The pull request you sent on Wed,  3 Mar 2021 10:10:07 -0500:

> https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/cee407c5cc427a7d9b21ee964fbda613e368bdff

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] xen: branch for v5.12-rc2

2021-03-04 Thread pr-tracker-bot
The pull request you sent on Thu,  4 Mar 2021 12:00:53 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git 
> for-linus-5.12b-rc2-tag

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/c5a58f877ca645a3303f7a57476f2de837fdb97a

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH] PM: domains: Don't runtime resume devices at genpd_prepare()

2021-03-04 Thread Ulf Hansson
Runtime resuming a device upfront in the genpd_prepare() callback, to check
if there is a wakeup pending for it, seems like an unnecessary thing to do.
The PM core already manages these kind of things in a common way in
__device_suspend(), via calling pm_runtime_barrier() and
pm_wakeup_pending().

Therefore, let's simply drop this behaviour from genpd_prepare().

Note that, this change is applicable only for devices that are attached to
a genpd that has the GENPD_FLAG_ACTIVE_WAKEUP set (Renesas, Mediatek, and
Rockchip platforms). Moreover, a driver that needs to restore power for its
device to re-configure it for a system wakeup, may still call
pm_runtime_get_sync(), for example, to do this.

Signed-off-by: Ulf Hansson 
---
 drivers/base/power/domain.c | 36 
 1 file changed, 36 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 78c310d3179d..b6a782c31613 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1087,34 +1087,6 @@ static void genpd_sync_power_on(struct generic_pm_domain 
*genpd, bool use_lock,
genpd->status = GENPD_STATE_ON;
 }
 
-/**
- * resume_needed - Check whether to resume a device before system suspend.
- * @dev: Device to check.
- * @genpd: PM domain the device belongs to.
- *
- * There are two cases in which a device that can wake up the system from sleep
- * states should be resumed by genpd_prepare(): (1) if the device is enabled
- * to wake up the system and it has to remain active for this purpose while the
- * system is in the sleep state and (2) if the device is not enabled to wake up
- * the system from sleep states and it generally doesn't generate wakeup 
signals
- * by itself (those signals are generated on its behalf by other parts of the
- * system).  In the latter case it may be necessary to reconfigure the device's
- * wakeup settings during system suspend, because it may have been set up to
- * signal remote wakeup from the system's working state as needed by runtime 
PM.
- * Return 'true' in either of the above cases.
- */
-static bool resume_needed(struct device *dev,
- const struct generic_pm_domain *genpd)
-{
-   bool active_wakeup;
-
-   if (!device_can_wakeup(dev))
-   return false;
-
-   active_wakeup = genpd_is_active_wakeup(genpd);
-   return device_may_wakeup(dev) ? active_wakeup : !active_wakeup;
-}
-
 /**
  * genpd_prepare - Start power transition of a device in a PM domain.
  * @dev: Device to start the transition of.
@@ -1135,14 +1107,6 @@ static int genpd_prepare(struct device *dev)
if (IS_ERR(genpd))
return -EINVAL;
 
-   /*
-* If a wakeup request is pending for the device, it should be woken up
-* at this point and a system wakeup event should be reported if it's
-* set up to wake up the system from sleep states.
-*/
-   if (resume_needed(dev, genpd))
-   pm_runtime_resume(dev);
-
genpd_lock(genpd);
 
if (genpd->prepared_count++ == 0)
-- 
2.25.1



{standard input}:5625: Error: Unable to parse register name $fp

2021-03-04 Thread kernel test robot
Hi WANG,

FYI, the error/warning still remains.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   f69d02e37a85645aa90d18cacfff36dba370f797
commit: ec7a93188a75b57b9f704db6862e7137f01aa80b MIPS: emulate CPUCFG 
instruction on older Loongson64 cores
date:   10 months ago
config: mips-randconfig-p002-20210304 (attached as .config)
compiler: mips64el-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ec7a93188a75b57b9f704db6862e7137f01aa80b
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout ec7a93188a75b57b9f704db6862e7137f01aa80b
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=mips 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   In file included from include/linux/bits.h:23,
from include/linux/bitops.h:5,
from include/linux/kernel.h:12,
from arch/mips/kernel/cpu-probe.c:11:
   arch/mips/kernel/cpu-probe.c: In function 'decode_config5':
   include/linux/bits.h:26:28: warning: comparison of unsigned expression < 0 
is always false [-Wtype-limits]
  26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
 |^
   include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
  16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); 
})))
 |  ^
   include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
  39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
 |   ^~~
   arch/mips/kernel/cpu-probe.c:1028:20: note: in expansion of macro 'GENMASK'
1028 |if (asid_mask > GENMASK(max_mmid_width - 1, 0)) {
 |^~~
   include/linux/bits.h:26:40: warning: comparison of unsigned expression < 0 
is always false [-Wtype-limits]
  26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
 |^
   include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
  16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); 
})))
 |  ^
   include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
  39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
 |   ^~~
   arch/mips/kernel/cpu-probe.c:1028:20: note: in expansion of macro 'GENMASK'
1028 |if (asid_mask > GENMASK(max_mmid_width - 1, 0)) {
 |^~~
   include/linux/bits.h:26:28: warning: comparison of unsigned expression < 0 
is always false [-Wtype-limits]
  26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
 |^
   include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
  16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); 
})))
 |  ^
   include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
  39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
 |   ^~~
   arch/mips/kernel/cpu-probe.c:1031:17: note: in expansion of macro 'GENMASK'
1031 | asid_mask = GENMASK(max_mmid_width - 1, 0);
 | ^~~
   include/linux/bits.h:26:40: warning: comparison of unsigned expression < 0 
is always false [-Wtype-limits]
  26 |   __builtin_constant_p((l) > (h)), (l) > (h), 0)))
 |^
   include/linux/build_bug.h:16:62: note: in definition of macro 
'BUILD_BUG_ON_ZERO'
  16 | #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); 
})))
 |  ^
   include/linux/bits.h:39:3: note: in expansion of macro 'GENMASK_INPUT_CHECK'
  39 |  (GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
 |   ^~~
   arch/mips/kernel/cpu-probe.c:1031:17: note: in expansion of macro 'GENMASK'
1031 | asid_mask = GENMASK(max_mmid_width - 1, 0);
 | ^~~
   {standard input}: Assembler messages:
>> {standard input}:5625: Error: Unable to parse register name $fp
   {standard input}:5626: Error: Unable to parse register name $fp

---
0-DAY CI Kernel Test Service, Intel Corporation
https:

Re: [PATCH v3 16/32] KVM: arm64: Elevate hypervisor mappings creation at EL2

2021-03-04 Thread Will Deacon
On Tue, Mar 02, 2021 at 02:59:46PM +, Quentin Perret wrote:
> Previous commits have introduced infrastructure to enable the EL2 code
> to manage its own stage 1 mappings. However, this was preliminary work,
> and none of it is currently in use.
> 
> Put all of this together by elevating the mapping creation at EL2 when
> memory protection is enabled. In this case, the host kernel running
> at EL1 still creates _temporary_ EL2 mappings, only used while
> initializing the hypervisor, but frees them right after.
> 
> As such, all calls to create_hyp_mappings() after kvm init has finished
> turn into hypercalls, as the host now has no 'legal' way to modify the
> hypevisor page tables directly.
> 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_mmu.h |  2 +-
>  arch/arm64/kvm/arm.c | 87 +---
>  arch/arm64/kvm/mmu.c | 43 ++--
>  3 files changed, 120 insertions(+), 12 deletions(-)

[...]

> @@ -1489,13 +1497,14 @@ static void cpu_hyp_reinit(void)
>   
> kvm_init_host_cpu_context(_cpu_ptr_hyp_sym(kvm_host_data)->host_ctxt);
>  
>   cpu_hyp_reset();
> - cpu_set_hyp_vector();
>  
>   if (is_kernel_in_hyp_mode())
>   kvm_timer_init_vhe();
>   else
>   cpu_init_hyp_mode();
>  
> + cpu_set_hyp_vector();
> +
>   kvm_arm_init_debug();
>  
>   if (vgic_present)
> @@ -1691,18 +1700,59 @@ static void teardown_hyp_mode(void)
>   }
>  }
>  
> +static int do_pkvm_init(u32 hyp_va_bits)
> +{
> + void *per_cpu_base = kvm_ksym_ref(kvm_arm_hyp_percpu_base);
> + int ret;
> +
> + preempt_disable();
> + hyp_install_host_vector();

It's a shame we need this both here _and_ on the reinit path, but it looks
like it's necessary.

> + ret = kvm_call_hyp_nvhe(__pkvm_init, hyp_mem_base, hyp_mem_size,
> + num_possible_cpus(), kern_hyp_va(per_cpu_base),
> + hyp_va_bits);
> + preempt_enable();
> +
> + return ret;
> +}

[...]

>  /**
>   * Inits Hyp-mode on all online CPUs
>   */
>  static int init_hyp_mode(void)
>  {
> + u32 hyp_va_bits;
>   int cpu;
> - int err = 0;
> + int err = -ENOMEM;
> +
> + /*
> +  * The protected Hyp-mode cannot be initialized if the memory pool
> +  * allocation has failed.
> +  */
> + if (is_protected_kvm_enabled() && !hyp_mem_base)
> + return err;

This skips the error message you get on the out_err path.

> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 4d41d7838d53..9d331bf262d2 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -221,15 +221,39 @@ void free_hyp_pgds(void)
>   if (hyp_pgtable) {
>   kvm_pgtable_hyp_destroy(hyp_pgtable);
>   kfree(hyp_pgtable);
> + hyp_pgtable = NULL;
>   }
>   mutex_unlock(_hyp_pgd_mutex);
>  }
>  
> +static bool kvm_host_owns_hyp_mappings(void)
> +{
> + if (static_branch_likely(_protected_mode_initialized))
> + return false;
> +
> + /*
> +  * This can happen at boot time when __create_hyp_mappings() is called
> +  * after the hyp protection has been enabled, but the static key has
> +  * not been flipped yet.
> +  */
> + if (!hyp_pgtable && is_protected_kvm_enabled())
> + return false;
> +
> + WARN_ON(!hyp_pgtable);
> +
> + return true;

return !(WARN_ON(!hyp_pgtable) && is_protected_kvm_enabled());

?

Will


Re: [PATCH v2] sched: Optimize __calc_delta.

2021-03-04 Thread Sedat Dilek
On Thu, Mar 4, 2021 at 7:24 PM Sedat Dilek  wrote:
>
> On Thu, Mar 4, 2021 at 6:34 PM 'Nick Desaulniers' via Clang Built
> Linux  wrote:
> >
> > On Wed, Mar 3, 2021 at 2:48 PM Josh Don  wrote:
> > >
> > > From: Clement Courbet 
> > >
> > > A significant portion of __calc_delta time is spent in the loop
> > > shifting a u64 by 32 bits. Use `fls` instead of iterating.
> > >
> > > This is ~7x faster on benchmarks.
> > >
> > > The generic `fls` implementation (`generic_fls`) is still ~4x faster
> > > than the loop.
> > > Architectures that have a better implementation will make use of it. For
> > > example, on X86 we get an additional factor 2 in speed without dedicated
> > > implementation.
> > >
> > > On gcc, the asm versions of `fls` are about the same speed as the
> > > builtin. On clang, the versions that use fls are more than twice as
> > > slow as the builtin. This is because the way the `fls` function is
> > > written, clang puts the value in memory:
> > > https://godbolt.org/z/EfMbYe. This bug is filed at
> > > https://bugs.llvm.org/show_bug.cgi?id=49406.
> >
> > Hi Josh, Thanks for helping get this patch across the finish line.
> > Would you mind updating the commit message to point to
> > https://bugs.llvm.org/show_bug.cgi?id=20197?
> >
> > >
> > > ```
> > > name   cpu/op
> > > BM_Calc<__calc_delta_loop> 9.57ms ±12%
> > > BM_Calc<__calc_delta_generic_fls>  2.36ms ±13%
> > > BM_Calc<__calc_delta_asm_fls>  2.45ms ±13%
> > > BM_Calc<__calc_delta_asm_fls_nomem>1.66ms ±12%
> > > BM_Calc<__calc_delta_asm_fls64>2.46ms ±13%
> > > BM_Calc<__calc_delta_asm_fls64_nomem>  1.34ms ±15%
> > > BM_Calc<__calc_delta_builtin>  1.32ms ±11%
> > > ```
> > >
> > > Signed-off-by: Clement Courbet 
> > > Signed-off-by: Josh Don 
> > > ---
> > >  kernel/sched/fair.c  | 19 +++
> > >  kernel/sched/sched.h |  1 +
> > >  2 files changed, 12 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 8a8bd7b13634..a691371960ae 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -229,22 +229,25 @@ static void __update_inv_weight(struct load_weight 
> > > *lw)
> > >  static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct 
> > > load_weight *lw)
> > >  {
> > > u64 fact = scale_load_down(weight);
> > > +   u32 fact_hi = (u32)(fact >> 32);
> > > int shift = WMULT_SHIFT;
> > > +   int fs;
> > >
> > > __update_inv_weight(lw);
> > >
> > > -   if (unlikely(fact >> 32)) {
> > > -   while (fact >> 32) {
> > > -   fact >>= 1;
> > > -   shift--;
> > > -   }
> > > +   if (unlikely(fact_hi)) {
> > > +   fs = fls(fact_hi);
> > > +   shift -= fs;
> > > +   fact >>= fs;
> > > }
> > >
> > > fact = mul_u32_u32(fact, lw->inv_weight);
> > >
> > > -   while (fact >> 32) {
> > > -   fact >>= 1;
> > > -   shift--;
> > > +   fact_hi = (u32)(fact >> 32);
> > > +   if (fact_hi) {
> > > +   fs = fls(fact_hi);
> > > +   shift -= fs;
> > > +   fact >>= fs;
> > > }
> > >
> > > return mul_u64_u32_shr(delta_exec, fact, shift);
> > > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > > index 10a1522b1e30..714af71cf983 100644
> > > --- a/kernel/sched/sched.h
> > > +++ b/kernel/sched/sched.h
> > > @@ -36,6 +36,7 @@
> > >  #include 
> > >
> > >  #include 
> > > +#include 
> >
> > This hunk of the patch is curious.  I assume that bitops.h is needed
> > for fls(); if so, why not #include it in kernel/sched/fair.c?
> > Otherwise this potentially hurts compile time for all TUs that include
> > kernel/sched/sched.h.
> >
>
> I have v2 as-is in my custom patchset and booted right now on bare metal.
>
> As Nick points out moving the include makes sense to me.
> We have a lot of include at the wrong places increasing build-time.
>

I tried with the attached patch.

$ LC_ALL=C ll kernel/sched/fair.o
-rw-r--r-- 1 dileks dileks 1.2M Mar  4 20:11 kernel/sched/fair.o

- Sedat -
From afd45cd78c21960c6e937021f095e5f8f51fef7a Mon Sep 17 00:00:00 2001
From: Sedat Dilek 
Date: Thu, 4 Mar 2021 20:05:30 +0100
Subject: [PATCH] sched/fair: Move include after __calc_delta optimization
 change

Signed-off-by: Sedat Dilek 
---
 kernel/sched/fair.c  | 2 ++
 kernel/sched/sched.h | 1 -
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5fda1751fbd1..b9f10ae92e3f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -20,6 +20,8 @@
  *  Adaptive scheduling granularity, math enhancements by Peter Zijlstra
  *  Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra
  */
+#include 
+
 #include "sched.h"
 
 /*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 714af71cf983..10a1522b1e30 100644
--- 

Re: [PATCH v3] selinux: measure state and policy capabilities

2021-03-04 Thread Lakshmi Ramasubramanian

On 2/12/21 8:37 AM, Lakshmi Ramasubramanian wrote:

Hi Paul,


SELinux stores the configuration state and the policy capabilities
in kernel memory.  Changes to this data at runtime would have an impact
on the security guarantees provided by SELinux.  Measuring this data
through IMA subsystem provides a tamper-resistant way for
an attestation service to remotely validate it at runtime.

Measure the configuration state and policy capabilities by calling
the IMA hook ima_measure_critical_data().



I have addressed your comments on the v2 patch for selinux measurement 
using IMA. Could you please let me know if there are any other comments 
that I need to address in this patch?


Thanks for your review and help so far.

 -lakshmi



Signed-off-by: Lakshmi Ramasubramanian 
Suggested-by: Stephen Smalley 
Suggested-by: Paul Moore 
---
  security/selinux/ima.c | 87 --
  security/selinux/include/ima.h |  6 +++
  security/selinux/selinuxfs.c   |  6 +++
  security/selinux/ss/services.c |  2 +-
  4 files changed, 96 insertions(+), 5 deletions(-)

diff --git a/security/selinux/ima.c b/security/selinux/ima.c
index 03715893ff97..34d421861bfc 100644
--- a/security/selinux/ima.c
+++ b/security/selinux/ima.c
@@ -13,18 +13,83 @@
  #include "ima.h"
  
  /*

- * selinux_ima_measure_state - Measure hash of the SELinux policy
+ * selinux_ima_collect_state - Read selinux configuration settings
   *
- * @state: selinux state struct
+ * @state: selinux_state
   *
- * NOTE: This function must be called with policy_mutex held.
+ * On success returns the configuration settings string.
+ * On error, returns NULL.
   */
-void selinux_ima_measure_state(struct selinux_state *state)
+static char *selinux_ima_collect_state(struct selinux_state *state)
  {
+   const char *on = "=1;", *off = "=0;";
+   char *buf;
+   int buf_len, len, i, rc;
+
+   buf_len = strlen("initialized=0;enforcing=0;checkreqprot=0;") + 1;
+
+   len = strlen(on);
+   for (i = 0; i < __POLICYDB_CAPABILITY_MAX; i++)
+   buf_len += strlen(selinux_policycap_names[i]) + len;
+
+   buf = kzalloc(buf_len, GFP_KERNEL);
+   if (!buf)
+   return NULL;
+
+   rc = strscpy(buf, "initialized", buf_len);
+   WARN_ON(rc < 0);
+
+   rc = strlcat(buf, selinux_initialized(state) ? on : off, buf_len);
+   WARN_ON(rc >= buf_len);
+
+   rc = strlcat(buf, "enforcing", buf_len);
+   WARN_ON(rc >= buf_len);
+
+   rc = strlcat(buf, enforcing_enabled(state) ? on : off, buf_len);
+   WARN_ON(rc >= buf_len);
+
+   rc = strlcat(buf, "checkreqprot", buf_len);
+   WARN_ON(rc >= buf_len);
+
+   rc = strlcat(buf, checkreqprot_get(state) ? on : off, buf_len);
+   WARN_ON(rc >= buf_len);
+
+   for (i = 0; i < __POLICYDB_CAPABILITY_MAX; i++) {
+   rc = strlcat(buf, selinux_policycap_names[i], buf_len);
+   WARN_ON(rc >= buf_len);
+
+   rc = strlcat(buf, state->policycap[i] ? on : off, buf_len);
+   WARN_ON(rc >= buf_len);
+   }
+
+   return buf;
+}
+
+/*
+ * selinux_ima_measure_state_locked - Measure SELinux state and hash of policy
+ *
+ * @state: selinux state struct
+ */
+void selinux_ima_measure_state_locked(struct selinux_state *state)
+{
+   char *state_str = NULL;
void *policy = NULL;
size_t policy_len;
int rc = 0;
  
+	WARN_ON(!mutex_is_locked(>policy_mutex));

+
+   state_str = selinux_ima_collect_state(state);
+   if (!state_str) {
+   pr_err("SELinux: %s: failed to read state.\n", __func__);
+   return;
+   }
+
+   ima_measure_critical_data("selinux", "selinux-state",
+ state_str, strlen(state_str), false);
+
+   kfree(state_str);
+
/*
 * Measure SELinux policy only after initialization is completed.
 */
@@ -42,3 +107,17 @@ void selinux_ima_measure_state(struct selinux_state *state)
  
  	vfree(policy);

  }
+
+/*
+ * selinux_ima_measure_state - Measure SELinux state and hash of policy
+ *
+ * @state: selinux state struct
+ */
+void selinux_ima_measure_state(struct selinux_state *state)
+{
+   WARN_ON(mutex_is_locked(>policy_mutex));
+
+   mutex_lock(>policy_mutex);
+   selinux_ima_measure_state_locked(state);
+   mutex_unlock(>policy_mutex);
+}
diff --git a/security/selinux/include/ima.h b/security/selinux/include/ima.h
index d69c36611423..75ca92b4a462 100644
--- a/security/selinux/include/ima.h
+++ b/security/selinux/include/ima.h
@@ -15,10 +15,16 @@
  
  #ifdef CONFIG_IMA

  extern void selinux_ima_measure_state(struct selinux_state *selinux_state);
+extern void selinux_ima_measure_state_locked(
+   struct selinux_state *selinux_state);
  #else
  static inline void selinux_ima_measure_state(struct selinux_state 
*selinux_state)
  {
  }
+static inline void selinux_ima_measure_state_locked(
+   struct 

Re: [PATCH v2] dma-buf: system_heap: do not warn for costly allocation

2021-03-04 Thread John Stultz
On Wed, Feb 10, 2021 at 6:33 PM Minchan Kim  wrote:
>
> Dmabuf system_heap allocation logic starts with the highest necessary
> allocation order before falling back to lower orders. The requested
> order can be higher than PAGE_ALLOC_COSTLY_ODER and failures to
> allocate will flood dmesg with warnings. Such high-order allocations
> are not unexpected and are handled by the system_heap's allocation
> fallback mechanism.
> Prevent these warnings when allocating higher than
> PAGE_ALLOC_COSTLY_ODER pages using __GFP_NOWARN flag.
>
> Below is ION warning example I got but dmabuf system heap is nothing 
> different:
>
> [ 1233.911533][  T460] warn_alloc: 11 callbacks suppressed
> [ 1233.911539][  T460] allocator@2.0-s: page allocation failure: order:4, 
> mode:0x140dc2(GFP_HIGHUSER|__GFP_COMP|__GFP_ZERO), 
> nodemask=(null),cpuset=/,mems_allowed=0
> [ 1233.926235][  T460] Call trace:
> [ 1233.929370][  T460]  dump_backtrace+0x0/0x1d8
> [ 1233.933704][  T460]  show_stack+0x18/0x24
> [ 1233.937701][  T460]  dump_stack+0xc0/0x140
> [ 1233.941783][  T460]  warn_alloc+0xf4/0x148
> [ 1233.945862][  T460]  __alloc_pages_slowpath+0x9fc/0xa10
> [ 1233.951101][  T460]  __alloc_pages_nodemask+0x278/0x2c0
> [ 1233.956285][  T460]  ion_page_pool_alloc+0xd8/0x100
> [ 1233.961144][  T460]  ion_system_heap_allocate+0xbc/0x2f0
> [ 1233.966440][  T460]  ion_buffer_create+0x68/0x274
> [ 1233.971130][  T460]  ion_buffer_alloc+0x8c/0x110
> [ 1233.975733][  T460]  ion_dmabuf_alloc+0x44/0xe8
> [ 1233.980248][  T460]  ion_ioctl+0x100/0x320
> [ 1233.984332][  T460]  __arm64_sys_ioctl+0x90/0xc8
> [ 1233.988934][  T460]  el0_svc_common+0x9c/0x168
> [ 1233.993360][  T460]  do_el0_svc+0x1c/0x28
> [ 1233.997358][  T460]  el0_sync_handler+0xd8/0x250
> [ 1234.001989][  T460]  el0_sync+0x148/0x180
>
> Signed-off-by: Minchan Kim 
> ---
> * from v1 - 
> https://lore.kernel.org/lkml/20210210162632.3903128-1-minc...@kernel.org/
>  * better description - surenb
>  * use mid_order_gfp - john.stultz
>
>  drivers/dma-buf/heaps/system_heap.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/dma-buf/heaps/system_heap.c 
> b/drivers/dma-buf/heaps/system_heap.c
> index 29e49ac17251..e5f545ada587 100644
> --- a/drivers/dma-buf/heaps/system_heap.c
> +++ b/drivers/dma-buf/heaps/system_heap.c
> @@ -40,11 +40,16 @@ struct dma_heap_attachment {
> bool mapped;
>  };
>
> +#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
> +/*
> + * Avoid warning on order-4 allocation failures as we'll fall back to
> + * order-0 in that case.
> + */
> +#define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN)
>  #define HIGH_ORDER_GFP  (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \
> | __GFP_NORETRY) & ~__GFP_RECLAIM) \
> | __GFP_COMP)
> -#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP)
> -static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP};
> +static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP};
>  /*
>   * The selection of the orders used for allocation (1MB, 64K, 4K) is designed
>   * to match with the sizes often found in IOMMUs. Using order 4 pages instead

This looks good to me! Thanks for sending this and apologies for the
slow reply, the patch slipped by me!

Reviewed-by: John Stultz 

thanks again!
-john


Re: [PATCH 5/7] printk: Make %pS and friends print module build ID

2021-03-04 Thread Stephen Boyd
Quoting Matthew Wilcox (2021-03-04 09:00:52)
> On Mon, Mar 01, 2021 at 09:47:47AM -0800, Stephen Boyd wrote:
> > Example:
> > 
> >  WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 
> > lkdtm_WARNING+0x28/0x30 [lkdtm] (ed5019fdf5e53be37cb1ba7899292d7e143b259e)
> 
> Would the first 12 characters instead of all 40 make it more palatable
> without reducing its utility?

I can't seem to request debuginfo from debuginfod without the full 40
characters. It's not a git sha1 hash. 

> And I feel it should be within the [], so maybe this:
> 
> WARNING: CPU: 4 PID: 3255 at drivers/misc/lkdtm/bugs.c:83 
> lkdtm_WARNING+0x28/0x30 [lkdtm ed5019fdf5e5]
> 

Sure I could put the hex numbers inside the brackets. I suspect changing
%pS or updating the "Modules linked in:" line isn't going to be
palatable. I've decided to introduce another printk format %pT to print
the stacktrace and then updated the architecture code on arm64 and x86
to see how it goes. Other architecures can be updated if this is
acceptable. I'll send out a patch series in a little bit that also
updates the decode_stacktrace.sh script to parse this.


Re: [PATCH] KVM: arm64: Disable LTO in hyp

2021-03-04 Thread Marc Zyngier
On Thu, 04 Mar 2021 18:45:44 +,
Sami Tolvanen  wrote:
> 
> allmodconfig + CONFIG_LTO_CLANG_THIN=y fails to build due to following
> linker errors:
> 
>   ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21CC):

I assume this message is only an oddity, right? Because
__guest_enter() is as far as you can imagine from irqbypass.c...

>   relocation R_AARCH64_CONDBR19 out of range: 2031220 is not in
>   [-1048576, 1048575]; references hyp_panic
>   >>> defined in vmlinux.o
> 
>   ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21E0):
>   relocation R_AARCH64_ADR_PREL_LO21 out of range: 2031200 is not in
>   [-1048576, 1048575]; references hyp_panic
>   >>> defined in vmlinux.o
> 
> As LTO is not really necessary for the hypervisor code, disable it for
> the hyp directory to fix the build.

Can you shed some light on what the problem is exactly?

> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/1317
> Reported-by: Nathan Chancellor 
> Tested-by: Nathan Chancellor 
> Signed-off-by: Sami Tolvanen 
> ---
>  arch/arm64/kvm/hyp/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
> index 687598e41b21..e8116016e6a8 100644
> --- a/arch/arm64/kvm/hyp/Makefile
> +++ b/arch/arm64/kvm/hyp/Makefile
> @@ -11,3 +11,6 @@ subdir-ccflags-y := -I$(incdir) 
> \
>   $(DISABLE_STACKLEAK_PLUGIN)
>  
>  obj-$(CONFIG_KVM) += vhe/ nvhe/ pgtable.o
> +
> +# Disable LTO for the files in this directory
> +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS))
> 
> base-commit: f69d02e37a85645aa90d18cacfff36dba370f797

Can this be reduced to the nvhe part of the tree? The rest of the
hypervisor should support being built with LTO, I'd expect. Or am I
missing something more significant?

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 0/4] add device drivers for Siemens Industrial PCs

2021-03-04 Thread Henning Schild
Thanks Andy,

Am Thu, 4 Mar 2021 12:19:44 +0200
schrieb Andy Shevchenko :

> On Thu, Mar 4, 2021 at 9:29 AM Henning Schild
>  wrote:
> 
> > This series adds support for watchdogs and leds of several x86
> > devices from Siemens.
> >
> > It is structured with a platform driver that mainly does
> > identification of the machines. It might trigger loading of the
> > actual device drivers by attaching devices to the platform bus.
> >
> > The identification is vendor specific, parsing a special binary DMI
> > entry. The implementation of that platform identification is
> > applied on pmc_atom clock quirks in the final patch.
> >
> > It is all structured in a way that we can easily add more devices
> > and more platform drivers later. Internally we have some more code
> > for hardware monitoring, more leds, watchdogs etc. This will follow
> > some day.
> >
> > But the idea here is to share early, and hopefully not fail early.  
> 
> I have given a few comments here and there, so please check the entire
> series and address them in _all_ similar locations. As I have noticed,
> I have different approach about P2SB code, I have to give the series a
> dust and see if you can utilize it.

You did find some things that others found as well. SPDX vs blabla,
header ordering, some other style.
Some things are already done and will be in v2.

Other findings are new, and we will look into them. The only thing that
did stick out is P2SB, also was a point in internal pre-review.
Let us see what you have, i can include a patch of yours into the q.
But i am kind of afraid once it is there, several existing users should
be touched with it, and this series would come later. Or this series
comes first and you come later and clean up our "mess". Not sure i
would want to take over all P2SB unhiders, but with you on board it
will work.

> I would like to be Cc'ed on the next version.

Sure thing.

regards,
Henning


> 



Re: [RFC PATCH v2 00/13] Add futex2 syscall

2021-03-04 Thread André Almeida

Hi Ted,

Às 12:01 de 04/03/21, Theodore Ts'o escreveu:

On Wed, Mar 03, 2021 at 09:42:06PM -0300, André Almeida wrote:

  ** Performance

  - For comparing futex() and futex2() performance, I used the artificial
benchmarks implemented at perf (wake, wake-parallel, hash and
requeue). The setup was 200 runs for each test and using 8, 80, 800,
8000 for the number of threads, Note that for this test, I'm not using
patch 14 ("kernel: Enable waitpid() for futex2") , for reasons explained
at "The patchset" section.


How heavily contended where the benchmarks?  One of the benefits of
the original futex was that no system call was necessary in the happy
path when the lock is uncontended.  


futex2 has the same design in that aspect, no syscall is needed in the 
happy path. Did something in the cover letter gave the impression that 
is not the case? I would like to reword it to clarify this.



Especially on a non-NUMA system
(which are the far more common case), since that's where relying on a
single memory access was a huge win for the original futex.  I would
expect that futex2 will fare worse in this particular case, since it
requires a system call entry for all operations --- the question is
how large is the delta in this worst case (for futex2) and best case
(for futex) scenario.

Cheers,

- Ted



Thanks,
André


Re: [PATCH] platform/surface: aggregator: Make SSAM_DEFINE_SYNC_REQUEST_x define static functions

2021-03-04 Thread Hans de Goede
Hi,

On 3/4/21 8:05 PM, Maximilian Luz wrote:
> The SSAM_DEFINE_SYNC_REQUEST_x() macros are intended to reduce
> boiler-plate code for SSAM request definitions by defining a wrapper
> function for the specified request. The client device variants of those
> macros, i.e. SSAM_DEFINE_SYNC_REQUEST_CL_x() in particular rely on the
> multi-device (MD) variants, e.g.:
> 
> #define SSAM_DEFINE_SYNC_REQUEST_CL_R(name, rtype, spec...)   \
> SSAM_DEFINE_SYNC_REQUEST_MD_R(__raw_##name, rtype, spec)  \
> int name(struct ssam_device *sdev, rtype *ret)\
> { \
> return __raw_##name(sdev->ctrl, sdev->uid.target, \
> sdev->uid.instance, ret); \
> }
> 
> This now creates the problem that it is not possible to declare the
> generated functions static via
> 
> static SSAM_DEFINE_SYNC_REQUEST_CL_R(...)
> 
> as this will only apply to the function defined by the multi-device
> macro, i.e. SSAM_DEFINE_SYNC_REQUEST_MD_R(). Thus compiling with
> `-Wmissing-prototypes' rightfully complains that there is a 'static'
> keyword missing.
> 
> To solve this, make all SSAM_DEFINE_SYNC_REQUEST_x() macros define
> static functions. Non-client-device macros are also changed for
> consistency. In general, we expect those functions to be only used
> locally in the respective drivers for the corresponding interfaces, so
> having to define a wrapper function to be able to export this should be
> the odd case out.
> 
> Reported-by: kernel test robot 
> Fixes: 510c8114fc74 ("platform/surface: Add platform profile driver")
> Signed-off-by: Maximilian Luz 

Thank you for the quick fix.

I've applied this patch to my review-hans branch:
https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git/log/?h=review-hans

Note it will show up in my review-hans branch once I've pushed my
local branch there, which might take a while.

Once I've run some tests on this branch the patches there will be
added to the platform-drivers-x86/for-next branch and eventually
will be included in the pdx86 pull-request to Linus for the next
merge-window.

Regards,

Hans



> ---
>  .../driver-api/surface_aggregator/client.rst  |  4 +-
>  .../platform/surface/aggregator/controller.c  | 10 +--
>  .../surface/surface_aggregator_registry.c |  2 +-
>  .../surface/surface_platform_profile.c|  4 +-
>  include/linux/surface_aggregator/controller.h | 74 +--
>  include/linux/surface_aggregator/device.h | 31 
>  6 files changed, 63 insertions(+), 62 deletions(-)
> 
> diff --git a/Documentation/driver-api/surface_aggregator/client.rst 
> b/Documentation/driver-api/surface_aggregator/client.rst
> index 26d13085a117..e519d374c378 100644
> --- a/Documentation/driver-api/surface_aggregator/client.rst
> +++ b/Documentation/driver-api/surface_aggregator/client.rst
> @@ -248,7 +248,7 @@ This example defines a function
>  
>  .. code-block:: c
>  
> -   int __ssam_tmp_perf_mode_set(struct ssam_controller *ctrl, const __le32 
> *arg);
> +   static int __ssam_tmp_perf_mode_set(struct ssam_controller *ctrl, const 
> __le32 *arg);
>  
>  executing the specified request, with the controller passed in when calling
>  said function. In this example, the argument is provided via the ``arg``
> @@ -296,7 +296,7 @@ This invocation of the macro defines a function
>  
>  .. code-block:: c
>  
> -   int ssam_bat_get_sta(struct ssam_device *sdev, __le32 *ret);
> +   static int ssam_bat_get_sta(struct ssam_device *sdev, __le32 *ret);
>  
>  executing the specified request, using the device IDs and controller given
>  in the client device. The full list of such macros for client devices is:
> diff --git a/drivers/platform/surface/aggregator/controller.c 
> b/drivers/platform/surface/aggregator/controller.c
> index 5bcb59ed579d..aa6f37b4f46e 100644
> --- a/drivers/platform/surface/aggregator/controller.c
> +++ b/drivers/platform/surface/aggregator/controller.c
> @@ -1750,35 +1750,35 @@ EXPORT_SYMBOL_GPL(ssam_request_sync_with_buffer);
>  
>  /* -- Internal SAM requests. 
>  */
>  
> -static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_get_firmware_version, __le32, {
> +SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_get_firmware_version, __le32, {
>   .target_category = SSAM_SSH_TC_SAM,
>   .target_id   = 0x01,
>   .command_id  = 0x13,
>   .instance_id = 0x00,
>  });
>  
> -static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_off, u8, {
> +SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_off, u8, {
>   .target_category = SSAM_SSH_TC_SAM,
>   .target_id   = 0x01,
>   .command_id  = 0x15,
>   .instance_id = 0x00,
>  });
>  
> -static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_on, u8, {
> +SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_on, u8, {
>   .target_category = SSAM_SSH_TC_SAM,
>   

Re: [PATCH RFC] gcc-plugins: Handle GCC version mismatch for OOT modules

2021-03-04 Thread Linus Torvalds
On Thu, Mar 4, 2021 at 7:36 AM Masahiro Yamada  wrote:
>
> All the kernel-space objects are rebuilt
> when the compiler is upgraded.

I very much NAK'ed that one. Why did that go in?

Or maybe I NAK'ed another version of it (I think the one I NAK'ed was
from Josh), and didn't realize that there were multiple ones.

> Linus complaint about GCC plugins not being rebuilt.

Yes, and that was a separate complaint and not at all tied to the other objects.

Gcc plugins aren't kernel object files at all. They are very much like
dynamically loadable libraries to gcc itself.

   Linus


Re: [PATCH 5/7] printk: Make %pS and friends print module build ID

2021-03-04 Thread Stephen Boyd
Quoting Matthew Wilcox (2021-03-04 09:19:40)
> On Mon, Mar 01, 2021 at 09:43:19PM -0500, Steven Rostedt wrote:
> > On Mon,  1 Mar 2021 09:47:47 -0800
> > Stephen Boyd  wrote:
> > 
> > > The %pS printk format (among some others) is used to print kernel
> > > addresses symbolically. When the kernel prints an address inside of a
> > > module, the kernel prints the addresses' symbol name along with the
> > > module's name that contains the address. Let's make kernel stacktraces
> > > easier to identify on KALLSYMS builds by including the build ID of a
> > > module when we print the address.
> > 
> > Please no!
> > 
> > This kills the output of tracing with offset, and can possibly break
> > scripts. I don't want to look at traces like this!
> > 
> >   -0   [004] ..s1   353.842577: 
> > ipv4_conntrack_in+0x0/0x10 [nf_conntrack] 
> > (3b39eb771b2566331887f671c741f90bfba0b051) <-nf_hook_slow+0x40/0xb0
> 
> Would it make sense to only print the build-id if it differs from the
> build-id of the kernel which has loaded it?

No. The build-id of the module is different from the kernel that loaded
it all the time, so it would always be printed.


Re: [PATCH 1/2] Makefile: Remove '--gcc-toolchain' flag

2021-03-04 Thread Fāng-ruì Sòng
On Wed, Mar 3, 2021 at 3:07 PM Fangrui Song  wrote:
>
>
> On 2021-03-03, Masahiro Yamada wrote:
> >Hi.
> >
> >On Wed, Mar 3, 2021 at 6:44 AM Fangrui Song  wrote:
> >>
> >> Reviewed-by: Fangrui Song 
> >>
> >> Thanks for the clean-up!
> >> --gcc-toolchain= is an obsscure way searching for GCC installation 
> >> prefixes (--prefix).
> >> The logic is complex and different for different 
> >> distributions/architectures.
> >>
> >> If we specify --prefix= (-B) explicitly, --gcc-toolchain is not needed.
> >
> >
> >I tested this, and worked for me too.
> >
> >Before applying this patch, could you please
> >help me understand the logic?
> >
> >
> >
> >
> >I checked the manual
> >(https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-b-dir)
> >
> >
> >
> >-B, --prefix , --prefix=
> >Add  to search path for binaries and object files used implicitly
> >
> >--gcc-toolchain=, -gcc-toolchain 
> >Use the gcc toolchain at the given directory
> >
> >
> >Hmm, this description is too concise
> >to understand how it works...
> >
> >
> >
> >I use Ubuntu 20.10.
> >
> >I use distro's default clang
> >located in /usr/bin/clang.
> >
> >I place my aarch64 linaro toolchain in
> >/home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu-gcc,
> >which is not in my PATH environment.
> >
> >
> >
> >
> >From my some experiments,
> >
> >clang --target=aarch64-linux-gnu -no-integrated-as \
> >--prefix=/home/masahiro/tools/aarch64-linaro-7.5/bin/aarch64-linux-gnu-  ...
> >
> >works almost equivalent to
> >
> >PATH=/home/masahiro/tools/aarch64-linaro-7.5/bin:$PATH \
> >clang --target=aarch64-linux-gnu -no-integrated-as ...
> >
> >
> >Then, clang will pick up aarch64-linux-gnu-as
> >found in the search path.
> >
> >Is this correct?
> >
> >
> >On the other hand, I could not understand
> >what the purpose of --gcc-toolchain= is.
> >
> >
> >Even if I add --gcc-toolchain=/home/masahiro/tools/aarch64-linaro-7.5,
> >it does not make any difference, and it is completely useless.
> >
> >
> >I read the comment from stephenhines:
> >https://github.com/ClangBuiltLinux/linux/issues/78
> >
> >How could --gcc-toolchain be used
> >in a useful way?
>
> --gcc-toolchain was introduced in
> https://reviews.llvm.org/rG1af7c219c7113a35415651127f05cdf056b63110
> to provide a flexible alternative to autoconf configure-time 
> --with-gcc-toolchain (now cmake variable GCC_INSTALL_PREFIX).
>
> I agree the option is confusing, the documentation is poor, and probably very 
> few people understand what it does.
> I apologize that my previous reply is not particular correct.
> So the more correct answer is below:
>
>
> A --prefix option can specify either of
>
> 1) A directory (for something like /a/b/lib/gcc/arm-linux-androideabi, this 
> should be /a/b, the parent directory of 'lib')
> 2) A path fragment like /usr/bin/aarch64-linux-gnu-
>
> The directory values of the --prefix list and --gcc-toolchain are used to 
> detect GCC installation directories. The directory is used to fetch include 
> directories, system library directories and binutils directories (as, 
> objcopy).
> (See below, Linux kernel only needs the binutils executables, so the 
> include/library logic is really useless to us)
>
> The logic is around 
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Gnu.cpp#L1910
>
>Prefixes = --prefix/-B list (only the directory subset is effective)
>StringRef GCCToolchainDir = --gcc-toolchain= or CMake variable 
> GCC_INSTALL_PREFIX
>if (GCCToolchainDir != "") {
>  Prefixes.push_back(std::string(GCCToolchainDir));
>} else {
>  if (!D.SysRoot.empty()) {
>Prefixes.push_back(D.SysRoot);
>// Add D.SysRoot+"/usr" to Prefixes. Some distributions add more 
> directories.
>AddDefaultGCCPrefixes(TargetTriple, Prefixes, D.SysRoot);
>  }
>
>  // D.InstalledDir is the directory of the clang executable, e.g. /usr/bin
>  Prefixes.push_back(D.InstalledDir + "/..");
>
>  if (D.SysRoot.empty())
>AddDefaultGCCPrefixes(TargetTriple, Prefixes, D.SysRoot);
>}
>
>// Gentoo / ChromeOS specific logic.
>// I think this block is misplaced.
>if (GCCToolchainDir == "" || GCCToolchainDir == D.SysRoot + "/usr") {
>  ...
>}
>
>// Loop over the various components which exist and select the best GCC
>// installation available. GCC installs are ranked by version number.
>Version = GCCVersion::Parse("0.0.0");
>for (const std::string  : Prefixes) {
>  auto  = D.getVFS();
>  if (!VFS.exists(Prefix))
>continue;
>
>  // CandidateLibDirs is a subset of {/lib64, /lib32, /lib}.
>  for (StringRef Suffix : CandidateLibDirs) {
>const std::string LibDir = Prefix + Suffix.str();
>if (!VFS.exists(LibDir))
>  continue;
>bool GCCDirExists = VFS.exists(LibDir + "/gcc");
>bool GCCCrossDirExists = VFS.exists(LibDir + "/gcc-cross");
>
>// Precise match. Detect 

Re: [PATCH net] net: usb: cdc_ncm: don't spew notifications

2021-03-04 Thread Grant Grundler
On Wed, Jan 20, 2021 at 5:04 PM Jakub Kicinski  wrote:
>
> On Wed, 20 Jan 2021 03:38:32 + Hayes Wang wrote:
> > Grant Grundler 
> > > Sent: Wednesday, January 20, 2021 9:12 AM
> > > Subject: [PATCH net] net: usb: cdc_ncm: don't spew notifications
> > >
> > > RTL8156 sends notifications about every 32ms.
> > > Only display/log notifications when something changes.
> > >
> > > This issue has been reported by others:
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832472
> > > https://lkml.org/lkml/2020/8/27/1083
> > >
> > > Chrome OS cannot support RTL8156 until this is fixed.
> > >
> > > Signed-off-by: Grant Grundler 
> >
> > Reviewed-by: Hayes Wang 
>
> Applied, thanks!
>
> net should be merged back into net-next by the end of the day, so
> if the other patches depend on this one to apply cleanly please keep
> an eye and post after that happens. If there is no conflict you can
> just post them with [PATCH net-next] now.

Jakub, sorry, I "dropped the ball" on this one. I'll repost with
"net-next" a bit later today.

cheers,
grant


Re: [PATCH v5 08/16] rpmsg: glink: add sendto and trysendto ops

2021-03-04 Thread Mathieu Poirier
On Fri, Feb 19, 2021 at 12:14:53PM +0100, Arnaud Pouliquen wrote:
> Implement the sendto ops to support the future rpmsg_char update for the
> vitio backend support.

Add a new line, otherwise it is very easy to read.

> The use of sendto in rpmsg_char is needed as a destination address is
> requested at least by the virtio backend.

Same here and throughout the patchset.

> The glink implementation does not need a destination address so ignores it.
> 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/rpmsg/qcom_glink_native.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/drivers/rpmsg/qcom_glink_native.c 
> b/drivers/rpmsg/qcom_glink_native.c
> index d4e4dd482614..ae2c03b59c55 100644
> --- a/drivers/rpmsg/qcom_glink_native.c
> +++ b/drivers/rpmsg/qcom_glink_native.c
> @@ -1332,6 +1332,20 @@ static int qcom_glink_trysend(struct rpmsg_endpoint 
> *ept, void *data, int len)
>   return __qcom_glink_send(channel, data, len, false);
>  }
>  
> +static int qcom_glink_sendto(struct rpmsg_endpoint *ept, void *data, int 
> len, u32 dst)
> +{
> + struct glink_channel *channel = to_glink_channel(ept);
> +
> + return __qcom_glink_send(channel, data, len, true);
> +}
> +
> +static int qcom_glink_trysendto(struct rpmsg_endpoint *ept, void *data, int 
> len, u32 dst)
> +{
> + struct glink_channel *channel = to_glink_channel(ept);
> +
> + return __qcom_glink_send(channel, data, len, false);
> +}

Just rename send() to sendto() and trysend() to trysendto() and ignore the
destination address.  The same goes for the next patch.  I would fold patch 08
and 09 into 10 to help get the big picture. 

> +
>  /*
>   * Finds the device_node for the glink child interested in this channel.
>   */
> @@ -1364,7 +1378,9 @@ static const struct rpmsg_device_ops glink_device_ops = 
> {
>  static const struct rpmsg_endpoint_ops glink_endpoint_ops = {
>   .destroy_ept = qcom_glink_destroy_ept,
>   .send = qcom_glink_send,
> + .sendto = qcom_glink_sendto,
>   .trysend = qcom_glink_trysend,
> + .trysendto = qcom_glink_trysendto,
>  };
>  
>  static void qcom_glink_rpdev_release(struct device *dev)
> -- 
> 2.17.1
> 


[PATCH v3 10/11] kentry: Check that syscall entries and syscall exits match

2021-03-04 Thread Andy Lutomirski
If arch code calls the wrong kernel entry helpers, syscall entries and
exits can get out of sync.  Add a new field to task_struct to track the
syscall state and validate that it transitions correctly.

Signed-off-by: Andy Lutomirski 
---
 include/linux/sched.h |  4 
 init/init_task.c  |  8 
 kernel/entry/common.c | 24 +++-
 3 files changed, 35 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6e3a5eeec509..95d6d8686d98 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1368,6 +1368,10 @@ struct task_struct {
struct llist_head   kretprobe_instances;
 #endif
 
+#ifdef CONFIG_DEBUG_ENTRY
+   bool kentry_in_syscall;
+#endif
+
/*
 * New fields for task_struct should be added above here, so that
 * they are included in the randomized portion of task_struct.
diff --git a/init/init_task.c b/init/init_task.c
index 8a992d73e6fb..de4fdaac4e8b 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -212,6 +212,14 @@ struct task_struct init_task
 #ifdef CONFIG_SECCOMP
.seccomp= { .filter_count = ATOMIC_INIT(0) },
 #endif
+#ifdef CONFIG_DEBUG_ENTRY
+   /*
+* The init task, and kernel threads in general, are considered
+* to be "in a syscall".  This way they can execve() and then exit
+* the supposed syscall that they were in to go to user mode.
+*/
+   .kentry_in_syscall = true,
+#endif
 };
 EXPORT_SYMBOL(init_task);
 
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index f62934d761e3..9dea411de3b0 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -148,11 +148,21 @@ static long syscall_trace_enter(struct pt_regs *regs, 
long syscall,
 
 long kentry_syscall_begin(struct pt_regs *regs, long syscall)
 {
-   unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
+   unsigned long work;
+
+   if (IS_ENABLED(CONFIG_DEBUG_ENTRY)) {
+   DEBUG_ENTRY_WARN_ONCE(
+   current->kentry_in_syscall,
+   "entering syscall %ld while already in a syscall",
+   syscall);
+   current->kentry_in_syscall = true;
+   }
 
CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
lockdep_assert_irqs_enabled();
 
+   work = READ_ONCE(current_thread_info()->syscall_work);
+
if (work & SYSCALL_WORK_ENTER)
syscall = syscall_trace_enter(regs, syscall, work);
 
@@ -163,11 +173,16 @@ long kentry_syscall_begin(struct pt_regs *regs, long 
syscall)
 static __always_inline void __exit_to_user_mode(void)
 {
instrumentation_begin();
+
 #ifdef CONFIG_DEBUG_ENTRY
DEBUG_ENTRY_WARN_ONCE(this_cpu_read(kentry_cpu_depth) != 1,
  "__exit_to_user_mode called at wrong kentry cpu 
depth (%u)",
  this_cpu_read(kentry_cpu_depth));
+
+   DEBUG_ENTRY_WARN_ONCE(current->kentry_in_syscall,
+ "exiting to user mode while in syscall context");
 #endif
+
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare(CALLER_ADDR0);
instrumentation_end();
@@ -331,6 +346,13 @@ void kentry_syscall_end(struct pt_regs *regs)
 */
if (unlikely(work & SYSCALL_WORK_EXIT))
syscall_exit_work(regs, work);
+
+#ifdef CONFIG_DEBUG_ENTRY
+   DEBUG_ENTRY_WARN_ONCE(!current->kentry_in_syscall,
+ "exiting syscall %lu without entering first", nr);
+
+   current->kentry_in_syscall = 0;
+#endif
 }
 
 noinstr void kentry_enter_from_user_mode(struct pt_regs *regs)
-- 
2.29.2



Re: XDP socket rings, and LKMM litmus tests

2021-03-04 Thread Paul E. McKenney
On Thu, Mar 04, 2021 at 04:44:34PM +0100, maranget wrote:
> 
> 
> > On 3 Mar 2021, at 21:22, Alan Stern  wrote:
> > 
> >>> 
> >>> Local variables absolutely should be treated just like CPU registers, if 
> >>> possible.  In fact, the compiler has the option of keeping local 
> >>> variables stored in registers.
> >>> 
> >>> (Of course, things may get complicated if anyone writes a litmus test 
> >>> that uses a pointer to a local variable,  Especially if the pointer 
> >>> could hold the address of a local variable in one execution and a 
> >>> shared variable in another!  Or if the pointer is itself a shared 
> >>> variable and is dereferenced in another thread!)
> >> 
> >> Good point!  I did miss this complication.  ;-)
> > 
> > I suspect it wouldn't be so bad if herd7 disallowed taking addresses of 
> > local variables.
> 
> Herd7 does disallow taking addresses of local variables.

Good to know, and thank you!

> However, such  tests can still be run on machine, provided function bodies 
> are accepted by the C compiler.

True, but that would be outside of the LKMM proper, correct?

Thanx, Paul


[PATCH v3 07/11] kentry: Make entry/exit_to_user_mode() arm64-only

2021-03-04 Thread Andy Lutomirski
exit_to_user_mode() does part, but not all, of the exit-to-user-mode work.
It's used only by arm64, and arm64 should stop using it (hint, hint!).
Compile it out on other architectures to minimize the chance of error.

enter_from_user_mode() is a legacy way to spell
kentry_enter_from_user_mode().  It's also only used by arm64.  Give it
the same treatment.

Signed-off-by: Andy Lutomirski 
---
 include/linux/entry-common.h | 34 ++
 kernel/entry/common.c|  4 
 2 files changed, 10 insertions(+), 28 deletions(-)

diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index 5287c6c15a66..a75374f87258 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -97,26 +97,15 @@ static inline __must_check int 
arch_syscall_enter_tracehook(struct pt_regs *regs
 }
 #endif
 
+#ifdef CONFIG_ARM64
 /**
  * enter_from_user_mode - Establish state when coming from user mode
  *
- * Syscall/interrupt entry disables interrupts, but user mode is traced as
- * interrupts enabled. Also with NO_HZ_FULL RCU might be idle.
+ * Legacy variant of kentry_enter_from_user_mode().  Used only by arm64.
  *
- * 1) Tell lockdep that interrupts are disabled
- * 2) Invoke context tracking if enabled to reactivate RCU
- * 3) Trace interrupts off state
- *
- * Invoked from architecture specific syscall entry code with interrupts
- * disabled. The calling code has to be non-instrumentable. When the
- * function returns all state is correct and interrupts are still
- * disabled. The subsequent functions can be instrumented.
- *
- * This is invoked when there is architecture specific functionality to be
- * done between establishing state and enabling interrupts. The caller must
- * enable interrupts before invoking syscall_enter_from_user_mode_work().
  */
 void enter_from_user_mode(struct pt_regs *regs);
+#endif
 
 /**
  * kentry_syscall_begin - Prepare to invoke a syscall handler
@@ -261,25 +250,14 @@ static inline void arch_syscall_exit_tracehook(struct 
pt_regs *regs, bool step)
 }
 #endif
 
+#ifdef CONFIG_ARM64
 /**
  * exit_to_user_mode - Fixup state when exiting to user mode
  *
- * Syscall/interrupt exit enables interrupts, but the kernel state is
- * interrupts disabled when this is invoked. Also tell RCU about it.
- *
- * 1) Trace interrupts on state
- * 2) Invoke context tracking if enabled to adjust RCU state
- * 3) Invoke architecture specific last minute exit code, e.g. speculation
- *mitigations, etc.: arch_exit_to_user_mode()
- * 4) Tell lockdep that interrupts are enabled
- *
- * Invoked from architecture specific code when syscall_exit_to_user_mode()
- * is not suitable as the last step before returning to userspace. Must be
- * invoked with interrupts disabled and the caller must be
- * non-instrumentable.
- * The caller has to invoke syscall_exit_to_user_mode_work() before this.
+ * Does the latter part of irqentry_exit_to_user_mode().  Only used by arm64.
  */
 void exit_to_user_mode(void);
+#endif
 
 /**
  * kentry_syscall_end - Finish syscall processing
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 800ad406431b..4ba82c684189 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -25,10 +25,12 @@ static __always_inline void __enter_from_user_mode(struct 
pt_regs *regs)
instrumentation_end();
 }
 
+#ifdef CONFIG_ARM64
 void noinstr enter_from_user_mode(struct pt_regs *regs)
 {
__enter_from_user_mode(regs);
 }
+#endif
 
 static inline void syscall_enter_audit(struct pt_regs *regs, long syscall)
 {
@@ -106,10 +108,12 @@ static __always_inline void __exit_to_user_mode(void)
lockdep_hardirqs_on(CALLER_ADDR0);
 }
 
+#ifdef CONFIG_ARM64
 void noinstr exit_to_user_mode(void)
 {
__exit_to_user_mode();
 }
+#endif
 
 /* Workaround to allow gradual conversion of architecture code */
 void __weak arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal) { 
}
-- 
2.29.2



[PATCH v3 11/11] kentry: Verify kentry state in instrumentation_begin/end()

2021-03-04 Thread Andy Lutomirski
Calling instrumentation_begin() and instrumentation_end() when kentry
thinks the CPU is in user mode is an error.  Verify the kentry state when
instrumentation_begin/end() are called.

Add _nocheck() variants to skip verification to avoid WARN() generating
extra kentry warnings.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/traps.c |  4 ++--
 include/asm-generic/bug.h   |  8 
 include/linux/instrumentation.h | 25 -
 kernel/entry/common.c   |  7 +++
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index be924180005a..983e4be5fdcb 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -229,7 +229,7 @@ static noinstr bool handle_bug(struct pt_regs *regs)
/*
 * All lies, just get the WARN/BUG out.
 */
-   instrumentation_begin();
+   instrumentation_begin_nocheck();
/*
 * Since we're emulating a CALL with exceptions, restore the interrupt
 * state to what it was at the exception site.
@@ -242,7 +242,7 @@ static noinstr bool handle_bug(struct pt_regs *regs)
}
if (regs->flags & X86_EFLAGS_IF)
raw_local_irq_disable();
-   instrumentation_end();
+   instrumentation_end_nocheck();
 
return handled;
 }
diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 76a10e0dca9f..fc360c463a99 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -85,18 +85,18 @@ void warn_slowpath_fmt(const char *file, const int line, 
unsigned taint,
   const char *fmt, ...);
 #define __WARN()   __WARN_printf(TAINT_WARN, NULL)
 #define __WARN_printf(taint, arg...) do {  \
-   instrumentation_begin();\
+   instrumentation_begin_nocheck();\
warn_slowpath_fmt(__FILE__, __LINE__, taint, arg);  \
-   instrumentation_end();  \
+   instrumentation_end_nocheck();  \
} while (0)
 #else
 extern __printf(1, 2) void __warn_printk(const char *fmt, ...);
 #define __WARN()   __WARN_FLAGS(BUGFLAG_TAINT(TAINT_WARN))
 #define __WARN_printf(taint, arg...) do {  \
-   instrumentation_begin();\
+   instrumentation_begin_nocheck();\
__warn_printk(arg); \
__WARN_FLAGS(BUGFLAG_NO_CUT_HERE | BUGFLAG_TAINT(taint));\
-   instrumentation_end();  \
+   instrumentation_end_nocheck();  \
} while (0)
 #define WARN_ON_ONCE(condition) ({ \
int __ret_warn_on = !!(condition);  \
diff --git a/include/linux/instrumentation.h b/include/linux/instrumentation.h
index 93e2ad67fc10..cdf80454f92a 100644
--- a/include/linux/instrumentation.h
+++ b/include/linux/instrumentation.h
@@ -4,14 +4,21 @@
 
 #if defined(CONFIG_DEBUG_ENTRY) && defined(CONFIG_STACK_VALIDATION)
 
+extern void kentry_assert_may_instrument(void);
+
 /* Begin/end of an instrumentation safe region */
-#define instrumentation_begin() ({ \
-   asm volatile("%c0: nop\n\t" 
\
+#define instrumentation_begin_nocheck() ({ \
+   asm volatile("%c0: nop\n\t" \
 ".pushsection .discard.instr_begin\n\t"\
 ".long %c0b - .\n\t"   \
 ".popsection\n\t" : : "i" (__COUNTER__));  \
 })
 
+#define instrumentation_begin() ({ \
+   instrumentation_begin_nocheck();\
+   kentry_assert_may_instrument(); \
+})
+
 /*
  * Because instrumentation_{begin,end}() can nest, objtool validation considers
  * _begin() a +1 and _end() a -1 and computes a sum over the instructions.
@@ -43,15 +50,23 @@
  * To avoid this, have _end() be a NOP instruction, this ensures it will be
  * part of the condition block and does not escape.
  */
-#define instrumentation_end() ({   \
+#define instrumentation_end_nocheck() ({   \
asm volatile("%c0: nop\n\t" \
 ".pushsection .discard.instr_end\n\t"  \
 ".long %c0b - .\n\t"   \
 ".popsection\n\t" : : "i" (__COUNTER__));  \
 })
+
+#define instrumentation_end() ({ 

[PATCH v3 08/11] entry: Make CONFIG_DEBUG_ENTRY available outside x86

2021-03-04 Thread Andy Lutomirski
In principle, the generic entry code is generic, and the goal is to use it
in many architectures once it settles down more.  Move CONFIG_DEBUG_ENTRY
to the generic config so that it can be used in the generic entry code and
not just in arch/x86.

Disable it on arm64.  arm64 uses some but not all of the kentry
code, and trying to debug the resulting state machine will be painful.
arm64 can turn it back on when it starts using the entire generic
path.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/Kconfig.debug | 10 --
 lib/Kconfig.debug  | 11 +++
 2 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 80b57e7f4947..a5a52133730c 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -170,16 +170,6 @@ config CPA_DEBUG
help
  Do change_page_attr() self-tests every 30 seconds.
 
-config DEBUG_ENTRY
-   bool "Debug low-level entry code"
-   depends on DEBUG_KERNEL
-   help
- This option enables sanity checks in x86's low-level entry code.
- Some of these sanity checks may slow down kernel entries and
- exits or otherwise impact performance.
-
- If unsure, say N.
-
 config DEBUG_NMI_SELFTEST
bool "NMI Selftest"
depends on DEBUG_KERNEL && X86_LOCAL_APIC
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7937265ef879..76549c8afa8a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1411,6 +1411,17 @@ config CSD_LOCK_WAIT_DEBUG
 
 endmenu # lock debugging
 
+config DEBUG_ENTRY
+   bool "Debug low-level entry code"
+   depends on DEBUG_KERNEL
+   depends on !ARM64
+   help
+ This option enables sanity checks in the low-level entry code.
+ Some of these sanity checks may slow down kernel entries and
+ exits or otherwise impact performance.
+
+ If unsure, say N.
+
 config TRACE_IRQFLAGS
depends on TRACE_IRQFLAGS_SUPPORT
bool
-- 
2.29.2



[PATCH v3 09/11] kentry: Add debugging checks for proper kentry API usage

2021-03-04 Thread Andy Lutomirski
It's quite easy to mess up kentry calls.  Add debgging checks that kentry
transitions to and from user mode match up and that kentry_nmi_enter() and
kentry_nmi_exit() match up.

Checking full matching of kentry_enter() with kentry_exit() needs
per-task state.

Signed-off-by: Andy Lutomirski 
---
 kernel/entry/common.c | 81 ++-
 1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 4ba82c684189..f62934d761e3 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -11,9 +11,65 @@
 #define CREATE_TRACE_POINTS
 #include 
 
+/*
+ * kentry_cpu_depth is 0 in user mode, 1 in normal kernel mode, and
+ * 1 + n * KENTRY_DEPTH_NMI in kentry_nmi_enter() mode.  We can't
+ * use a percpu variable to match up kentry_enter() from kernel mode
+ * with the corresponding kentry_exit() because tasks may schedule inside
+ * nested kentry_enter() regions.
+ */
+#define KENTRY_CPU_DEPTH_NMI   1024UL
+
+#ifdef CONFIG_DEBUG_ENTRY
+
+/*
+ * Extra safe WARN_ONCE.  Per-arch optimized WARN_ONCE() implementations
+ * might go through the low-level entry and kentry code even before noticing
+ * that the warning already fired, which could result in recursive warnings.
+ * This carefully avoids any funny business once a given warning has fired.
+ */
+#define DEBUG_ENTRY_WARN_ONCE(condition, format...) ({ \
+   static bool __section(".data.once") __warned;   \
+   int __ret_warn_once = !!(condition);\
+   \
+   if (unlikely(__ret_warn_once && !READ_ONCE(__warned))) {\
+   WRITE_ONCE(__warned, true); \
+   WARN(1, format);\
+   }   \
+   unlikely(__ret_warn_once);  \
+})
+
+
+static DEFINE_PER_CPU(unsigned int, kentry_cpu_depth) = 1UL;
+
+static __always_inline void kentry_cpu_depth_add(unsigned int n)
+{
+   this_cpu_add(kentry_cpu_depth, n);
+}
+
+static void kentry_cpu_depth_check(unsigned int n)
+{
+   DEBUG_ENTRY_WARN_ONCE(this_cpu_read(kentry_cpu_depth) < n, "kentry 
depth underflow");
+}
+
+static __always_inline void kentry_cpu_depth_sub(unsigned int n)
+{
+   this_cpu_sub(kentry_cpu_depth, n);
+}
+#else
+
+#define DEBUG_ENTRY_WARN_ONCE(condition, format...) do {} while (0)
+
+static __always_inline void kentry_cpu_depth_add(unsigned int n) {}
+static void kentry_cpu_depth_check(unsigned int n) {}
+static __always_inline void kentry_cpu_depth_sub(unsigned int n) {}
+
+#endif
+
 /* See comment for enter_from_user_mode() in entry-common.h */
 static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
 {
+   kentry_cpu_depth_add(1);
arch_check_user_regs(regs);
lockdep_hardirqs_off(CALLER_ADDR0);
 
@@ -22,6 +78,14 @@ static __always_inline void __enter_from_user_mode(struct 
pt_regs *regs)
 
instrumentation_begin();
trace_hardirqs_off_finish();
+
+#ifdef CONFIG_DEBUG_ENTRY
+   DEBUG_ENTRY_WARN_ONCE(
+   this_cpu_read(kentry_cpu_depth) != 1,
+   "kentry: __enter_from_user_mode() called while kentry thought 
the CPU was in the kernel (%u)",
+   this_cpu_read(kentry_cpu_depth));
+#endif
+
instrumentation_end();
 }
 
@@ -99,6 +163,11 @@ long kentry_syscall_begin(struct pt_regs *regs, long 
syscall)
 static __always_inline void __exit_to_user_mode(void)
 {
instrumentation_begin();
+#ifdef CONFIG_DEBUG_ENTRY
+   DEBUG_ENTRY_WARN_ONCE(this_cpu_read(kentry_cpu_depth) != 1,
+ "__exit_to_user_mode called at wrong kentry cpu 
depth (%u)",
+ this_cpu_read(kentry_cpu_depth));
+#endif
trace_hardirqs_on_prepare();
lockdep_hardirqs_on_prepare(CALLER_ADDR0);
instrumentation_end();
@@ -106,6 +175,7 @@ static __always_inline void __exit_to_user_mode(void)
user_enter_irqoff();
arch_exit_to_user_mode();
lockdep_hardirqs_on(CALLER_ADDR0);
+   kentry_cpu_depth_sub(1);
 }
 
 #ifdef CONFIG_ARM64
@@ -360,7 +430,12 @@ noinstr void kentry_exit(struct pt_regs *regs, 
kentry_state_t state)
/* Check whether this returns to user mode */
if (user_mode(regs)) {
kentry_exit_to_user_mode(regs);
-   } else if (!regs_irqs_disabled(regs)) {
+   return;
+   }
+
+   kentry_cpu_depth_check(1);
+
+   if (!regs_irqs_disabled(regs)) {
/*
 * If RCU was not watching on entry this needs to be done
 * carefully and needs the same ordering of lockdep/tracing
@@ -399,6 +474,8 @@ kentry_state_t noinstr kentry_nmi_enter(struct pt_regs 
*regs)
 
irq_state.lockdep = lockdep_hardirqs_enabled();
 
+   kentry_cpu_depth_add(KENTRY_CPU_DEPTH_NMI);
+
__nmi_enter();
  

[PATCH v3 05/11] x86/entry: Convert ret_from_fork to C

2021-03-04 Thread Andy Lutomirski
ret_from_fork is written in asm, slightly differently, for x86_32 and
x86_64.  Convert it to C.

This is a straight conversion without any particular cleverness.  As a
further cleanup, the code that sets up the ret_from_fork argument registers
could be adjusted to put the arguments in the correct registers.

This will cause the ORC unwinder to find pt_regs even for kernel threads on
x86_64.  This seems harmless.

The 32-bit comment above the now-deleted schedule_tail_wrapper was
obsolete: the encode_frame_pointer mechanism (see copy_thread()) solves the
same problem more cleanly.

Cc: Josh Poimboeuf 
Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/common.c  | 23 ++
 arch/x86/entry/entry_32.S| 51 +---
 arch/x86/entry/entry_64.S| 33 +
 arch/x86/include/asm/switch_to.h |  2 +-
 arch/x86/kernel/process.c|  2 +-
 arch/x86/kernel/process_32.c |  2 +-
 arch/x86/kernel/unwind_orc.c |  2 +-
 7 files changed, 43 insertions(+), 72 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 95776f16c1cb..ef1c65938a6b 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -214,6 +214,29 @@ SYSCALL_DEFINE0(ni_syscall)
return -ENOSYS;
 }
 
+void ret_from_fork(struct task_struct *prev,
+  int (*kernel_thread_fn)(void *),
+  void *kernel_thread_arg,
+  struct pt_regs *user_regs);
+
+__visible void noinstr ret_from_fork(struct task_struct *prev,
+int (*kernel_thread_fn)(void *),
+void *kernel_thread_arg,
+struct pt_regs *user_regs)
+{
+   instrumentation_begin();
+
+   schedule_tail(prev);
+
+   if (kernel_thread_fn) {
+   kernel_thread_fn(kernel_thread_arg);
+   user_regs->ax = 0;
+   }
+
+   instrumentation_end();
+   syscall_exit_to_user_mode(user_regs);
+}
+
 #ifdef CONFIG_XEN_PV
 #ifndef CONFIG_PREEMPTION
 /*
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index df8c017e6161..7113d259727f 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -805,26 +805,6 @@ SYM_CODE_START(__switch_to_asm)
 SYM_CODE_END(__switch_to_asm)
 .popsection
 
-/*
- * The unwinder expects the last frame on the stack to always be at the same
- * offset from the end of the page, which allows it to validate the stack.
- * Calling schedule_tail() directly would break that convention because its an
- * asmlinkage function so its argument has to be pushed on the stack.  This
- * wrapper creates a proper "end of stack" frame header before the call.
- */
-.pushsection .text, "ax"
-SYM_FUNC_START(schedule_tail_wrapper)
-   FRAME_BEGIN
-
-   pushl   %eax
-   callschedule_tail
-   popl%eax
-
-   FRAME_END
-   ret
-SYM_FUNC_END(schedule_tail_wrapper)
-.popsection
-
 /*
  * A newly forked process directly context switches into this address.
  *
@@ -833,29 +813,14 @@ SYM_FUNC_END(schedule_tail_wrapper)
  * edi: kernel thread arg
  */
 .pushsection .text, "ax"
-SYM_CODE_START(ret_from_fork)
-   callschedule_tail_wrapper
-
-   testl   %ebx, %ebx
-   jnz 1f  /* kernel threads are uncommon */
-
-2:
-   /* When we fork, we trace the syscall return in the child, too. */
-   movl%esp, %eax
-   callsyscall_exit_to_user_mode
-   jmp .Lsyscall_32_done
-
-   /* kernel thread */
-1: movl%edi, %eax
-   CALL_NOSPEC ebx
-   /*
-* A kernel thread is allowed to return here after successfully
-* calling kernel_execve().  Exit to userspace to complete the execve()
-* syscall.
-*/
-   movl$0, PT_EAX(%esp)
-   jmp 2b
-SYM_CODE_END(ret_from_fork)
+SYM_CODE_START(asm_ret_from_fork)
+   movl%ebx, %edx
+   movl%edi, %ecx
+   pushl   %esp
+   callret_from_fork
+   addl$4, %esp
+   jmp .Lsyscall_32_done
+SYM_CODE_END(asm_ret_from_fork)
 .popsection
 
 SYM_ENTRY(__begin_SYSENTER_singlestep_region, SYM_L_GLOBAL, SYM_A_NONE)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index cad08703c4ad..0f7df8861ac1 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -273,35 +273,18 @@ SYM_FUNC_END(__switch_to_asm)
  * rax: prev task we switched from
  * rbx: kernel thread func (NULL for user thread)
  * r12: kernel thread arg
+ * rbp: encoded frame pointer for the fp unwinder
  */
 .pushsection .text, "ax"
-SYM_CODE_START(ret_from_fork)
-   UNWIND_HINT_EMPTY
-   movq%rax, %rdi
-   callschedule_tail   /* rdi: 'prev' task parameter */
-
-   testq   %rbx, %rbx  /* from kernel_thread? */
-   jnz 1f  /* kernel threads are uncommon 
*/
-
-2:
+SYM_CODE_START(asm_ret_from_fork)
   

[PATCH v3 06/11] kentry: Simplify the common syscall API

2021-03-04 Thread Andy Lutomirski
The new common syscall API had a large and confusing API surface.  Simplify
it.  Now there is exactly one way to use it.  It's a bit more verbose than
the old way for the simple x86_64 native case, but it's much easier to use
right, and the diffstat should speak for itself.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/common.c  | 57 +++-
 include/linux/entry-common.h | 86 ++--
 kernel/entry/common.c| 57 +++-
 3 files changed, 54 insertions(+), 146 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index ef1c65938a6b..8710b2300b8d 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -38,9 +38,12 @@
 #ifdef CONFIG_X86_64
 __visible noinstr void do_syscall_64(unsigned long nr, struct pt_regs *regs)
 {
-   nr = syscall_enter_from_user_mode(regs, nr);
-
+   kentry_enter_from_user_mode(regs);
+   local_irq_enable();
instrumentation_begin();
+
+   nr = kentry_syscall_begin(regs, nr);
+
if (likely(nr < NR_syscalls)) {
nr = array_index_nospec(nr, NR_syscalls);
regs->ax = sys_call_table[nr](regs);
@@ -52,8 +55,12 @@ __visible noinstr void do_syscall_64(unsigned long nr, 
struct pt_regs *regs)
regs->ax = x32_sys_call_table[nr](regs);
 #endif
}
+
+   kentry_syscall_end(regs);
+
+   local_irq_disable();
instrumentation_end();
-   syscall_exit_to_user_mode(regs);
+   kentry_exit_to_user_mode(regs);
 }
 #endif
 
@@ -83,33 +90,34 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs 
*regs)
 {
unsigned int nr = syscall_32_enter(regs);
 
+   kentry_enter_from_user_mode(regs);
+   local_irq_enable();
+   instrumentation_begin();
+
/*
 * Subtlety here: if ptrace pokes something larger than 2^32-1 into
 * orig_ax, the unsigned int return value truncates it.  This may
 * or may not be necessary, but it matches the old asm behavior.
 */
-   nr = (unsigned int)syscall_enter_from_user_mode(regs, nr);
-   instrumentation_begin();
-
+   nr = (unsigned int)kentry_syscall_begin(regs, nr);
do_syscall_32_irqs_on(regs, nr);
+   kentry_syscall_end(regs);
 
+   local_irq_disable();
instrumentation_end();
-   syscall_exit_to_user_mode(regs);
+   kentry_exit_to_user_mode(regs);
 }
 
 static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
 {
unsigned int nr = syscall_32_enter(regs);
+   bool ret;
int res;
 
-   /*
-* This cannot use syscall_enter_from_user_mode() as it has to
-* fetch EBP before invoking any of the syscall entry work
-* functions.
-*/
-   syscall_enter_from_user_mode_prepare(regs);
-
+   kentry_enter_from_user_mode(regs);
+   local_irq_enable();
instrumentation_begin();
+
/* Fetch EBP from where the vDSO stashed it. */
if (IS_ENABLED(CONFIG_X86_64)) {
/*
@@ -126,21 +134,23 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs 
*regs)
if (res) {
/* User code screwed up. */
regs->ax = -EFAULT;
-
-   instrumentation_end();
-   local_irq_disable();
-   irqentry_exit_to_user_mode(regs);
-   return false;
+   ret = false;
+   goto out;
}
 
/* The case truncates any ptrace induced syscall nr > 2^32 -1 */
-   nr = (unsigned int)syscall_enter_from_user_mode_work(regs, nr);
+   nr = (unsigned int)kentry_syscall_begin(regs, nr);
 
/* Now this is just like a normal syscall. */
do_syscall_32_irqs_on(regs, nr);
 
+   kentry_syscall_end(regs);
+   ret = true;
+
+out:
+   local_irq_disable();
instrumentation_end();
-   syscall_exit_to_user_mode(regs);
+   kentry_exit_to_user_mode(regs);
return true;
 }
 
@@ -233,8 +243,11 @@ __visible void noinstr ret_from_fork(struct task_struct 
*prev,
user_regs->ax = 0;
}
 
+   kentry_syscall_end(user_regs);
+
+   local_irq_disable();
instrumentation_end();
-   syscall_exit_to_user_mode(user_regs);
+   kentry_exit_to_user_mode(user_regs);
 }
 
 #ifdef CONFIG_XEN_PV
diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
index fd2d7c35670a..5287c6c15a66 100644
--- a/include/linux/entry-common.h
+++ b/include/linux/entry-common.h
@@ -119,31 +119,12 @@ static inline __must_check int 
arch_syscall_enter_tracehook(struct pt_regs *regs
 void enter_from_user_mode(struct pt_regs *regs);
 
 /**
- * syscall_enter_from_user_mode_prepare - Establish state and enable interrupts
- * @regs:  Pointer to currents pt_regs
- *
- * Invoked from architecture specific syscall entry code with interrupts
- * disabled. The calling code has to be non-instrumentable. When the
- * function returns all state 

[PATCH v3 03/11] x86/dumpstack: Remove unnecessary range check fetching opcode bytes

2021-03-04 Thread Andy Lutomirski
copy_from_user_nmi() validates that the pointer is in the user range,
so there is no need for an extra check in copy_code().

Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/dumpstack.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 299c20f0a38b..55cf3c8325c6 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -81,12 +81,6 @@ static int copy_code(struct pt_regs *regs, u8 *buf, unsigned 
long src,
/* The user space code from other tasks cannot be accessed. */
if (regs != task_pt_regs(current))
return -EPERM;
-   /*
-* Make sure userspace isn't trying to trick us into dumping kernel
-* memory by pointing the userspace instruction pointer at it.
-*/
-   if (__chk_range_not_ok(src, nbytes, TASK_SIZE_MAX))
-   return -EINVAL;
 
/*
 * Even if named copy_from_user_nmi() this can be invoked from
-- 
2.29.2



[PATCH v3 02/11] kentry: Rename irqentry to kentry

2021-03-04 Thread Andy Lutomirski
The common entry functions are mostly named irqentry, and this is
confusing.  They are used for syscalls, exceptions, NMIs and, yes, IRQs.
Call them kentry instead, since they are supposed to be usable for any
entry to the kernel.

This path doesn't touch the .irqentry section -- someone can contemplate
changing that later.  That code predates the common entry code.

Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/common.c |  8 ++---
 arch/x86/include/asm/idtentry.h | 28 +++
 arch/x86/kernel/cpu/mce/core.c  | 10 +++---
 arch/x86/kernel/kvm.c   |  6 ++--
 arch/x86/kernel/nmi.c   |  6 ++--
 arch/x86/kernel/traps.c | 28 +++
 arch/x86/mm/fault.c |  6 ++--
 include/linux/entry-common.h| 60 -
 kernel/entry/common.c   | 26 +++---
 9 files changed, 89 insertions(+), 89 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 8fdb4cb27efe..95776f16c1cb 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -264,9 +264,9 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct 
pt_regs *regs)
 {
struct pt_regs *old_regs;
bool inhcall;
-   irqentry_state_t state;
+   kentry_state_t state;
 
-   state = irqentry_enter(regs);
+   state = kentry_enter(regs);
old_regs = set_irq_regs(regs);
 
instrumentation_begin();
@@ -278,11 +278,11 @@ __visible noinstr void xen_pv_evtchn_do_upcall(struct 
pt_regs *regs)
inhcall = get_and_clear_inhcall();
if (inhcall && !WARN_ON_ONCE(state.exit_rcu)) {
instrumentation_begin();
-   irqentry_exit_cond_resched();
+   kentry_exit_cond_resched();
instrumentation_end();
restore_inhcall(inhcall);
} else {
-   irqentry_exit(regs, state);
+   kentry_exit(regs, state);
}
 }
 #endif /* CONFIG_XEN_PV */
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index f656aabd1545..3ac805d24289 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -40,8 +40,8 @@
  * The macro is written so it acts as function definition. Append the
  * body with a pair of curly brackets.
  *
- * irqentry_enter() contains common code which has to be invoked before
- * arbitrary code in the body. irqentry_exit() contains common code
+ * kentry_enter() contains common code which has to be invoked before
+ * arbitrary code in the body. kentry_exit() contains common code
  * which has to run before returning to the low level assembly code.
  */
 #define DEFINE_IDTENTRY(func)  \
@@ -49,12 +49,12 @@ static __always_inline void __##func(struct pt_regs *regs); 
\
\
 __visible noinstr void func(struct pt_regs *regs)  \
 {  \
-   irqentry_state_t state = irqentry_enter(regs);  \
+   kentry_state_t state = kentry_enter(regs);  \
\
instrumentation_begin();\
__##func (regs);\
instrumentation_end();  \
-   irqentry_exit(regs, state); \
+   kentry_exit(regs, state);   \
 }  \
\
 static __always_inline void __##func(struct pt_regs *regs)
@@ -96,12 +96,12 @@ static __always_inline void __##func(struct pt_regs *regs,  
\
 __visible noinstr void func(struct pt_regs *regs,  \
unsigned long error_code)   \
 {  \
-   irqentry_state_t state = irqentry_enter(regs);  \
+   kentry_state_t state = kentry_enter(regs);  \
\
instrumentation_begin();\
__##func (regs, error_code);\
instrumentation_end();  \
-   irqentry_exit(regs, state); \
+   kentry_exit(regs, state);   \
 }  \
\
 static __always_inline void __##func(struct pt_regs *regs, \
@@ -156,7 +156,7 @@ 

[PATCH v3 00/11] kentry: A stable bugfix and a bunch of improvements

2021-03-04 Thread Andy Lutomirski
I noticed a little bug in fast compat syscalls.  I got a bit carried away
fixing it.  This renames the irqentry stuff to kentry, improves (IMNSHO)
the API, and adds lots of debugging.

It also tweaks the unwinder wrt ret_from_fork and rewrites ret_from_fork
in C.  I did this because the kentry work involved a small change to
ret_from_fork, and adjusting the asm is a mess.  So C it is.

Changes from v1 and v2: Complete rewrite

Andy Lutomirski (11):
  x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls
  kentry: Rename irqentry to kentry
  x86/dumpstack: Remove unnecessary range check fetching opcode bytes
  x86/kthread,dumpstack: Set task_pt_regs->cs.RPL=3 for kernel threads
  x86/entry: Convert ret_from_fork to C
  kentry: Simplify the common syscall API
  kentry: Make entry/exit_to_user_mode() arm64-only
  entry: Make CONFIG_DEBUG_ENTRY available outside x86
  kentry: Add debugging checks for proper kentry API usage
  kentry: Check that syscall entries and syscall exits match
  kentry: Verify kentry state in instrumentation_begin/end()

 arch/x86/Kconfig.debug   |  10 --
 arch/x86/entry/common.c  |  85 ++
 arch/x86/entry/entry_32.S|  51 ++--
 arch/x86/entry/entry_64.S|  33 ++
 arch/x86/include/asm/idtentry.h  |  28 ++---
 arch/x86/include/asm/switch_to.h |   2 +-
 arch/x86/kernel/cpu/mce/core.c   |  10 +-
 arch/x86/kernel/dumpstack.c  |  10 +-
 arch/x86/kernel/kvm.c|   6 +-
 arch/x86/kernel/nmi.c|   6 +-
 arch/x86/kernel/process.c|  15 ++-
 arch/x86/kernel/process_32.c |   2 +-
 arch/x86/kernel/traps.c  |  32 ++---
 arch/x86/kernel/unwind_orc.c |   2 +-
 arch/x86/mm/fault.c  |   6 +-
 include/asm-generic/bug.h|   8 +-
 include/linux/entry-common.h | 180 
 include/linux/instrumentation.h  |  25 +++-
 include/linux/sched.h|   4 +
 init/init_task.c |   8 ++
 kernel/entry/common.c| 193 +--
 lib/Kconfig.debug|  11 ++
 22 files changed, 366 insertions(+), 361 deletions(-)

-- 
2.29.2



[PATCH v3 04/11] x86/kthread,dumpstack: Set task_pt_regs->cs.RPL=3 for kernel threads

2021-03-04 Thread Andy Lutomirski
For kernel threads, task_pt_regs is currently all zeros, a valid user state
(if kernel_execve() has been called), or some combination thereof during
execution of kernel_execve().  If a stack trace is printed, the unwinder
might get confused and treat task_pt_regs as a kernel state.  Indeed,
forcing a stack dump results in an attempt to display _kernel_ code bytes
from a bogus address at the very top of kernel memory.

Fix the confusion by setting cs=3 so that user_mode(task_pt_regs(...)) ==
true for kernel threads.

Also improve the error when failing to fetch code bytes to show which type
of fetch failed.  This makes it much easier to understand what's happening.

Cc: Josh Poimboeuf 
Signed-off-by: Andy Lutomirski 
---
 arch/x86/kernel/dumpstack.c |  4 ++--
 arch/x86/kernel/process.c   | 13 +
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 55cf3c8325c6..9b7b69bb12e5 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -128,8 +128,8 @@ void show_opcodes(struct pt_regs *regs, const char *loglvl)
/* No access to the user space stack of other tasks. Ignore. */
break;
default:
-   printk("%sCode: Unable to access opcode bytes at RIP 0x%lx.\n",
-  loglvl, prologue);
+   printk("%sCode: Unable to access %s opcode bytes at RIP 
0x%lx.\n",
+  loglvl, user_mode(regs) ? "user" : "kernel", prologue);
break;
}
 }
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 145a7ac0c19a..f6f16df04cb9 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -163,6 +163,19 @@ int copy_thread(unsigned long clone_flags, unsigned long 
sp, unsigned long arg,
/* Kernel thread ? */
if (unlikely(p->flags & PF_KTHREAD)) {
memset(childregs, 0, sizeof(struct pt_regs));
+
+   /*
+* Even though there is no real user state here, these
+* are were user regs belong, and kernel_execve() will
+* overwrite them with user regs.  Put an obviously
+* invalid value that nonetheless causes user_mode(regs)
+* to return true in CS.
+*
+* This also prevents the unwinder from thinking that there
+* is invalid kernel state at the top of the stack.
+*/
+   childregs->cs = 3;
+
kthread_frame_init(frame, sp, arg);
return 0;
}
-- 
2.29.2



Re: [PATCH v5 13/16] rpmsg: char: introduce __rpmsg_chrdev_create_eptdev function

2021-03-04 Thread Mathieu Poirier
On Fri, Feb 19, 2021 at 12:14:58PM +0100, Arnaud Pouliquen wrote:
> Introduce the __rpmsg_chrdev_create_eptdev internal function that returns
> the rpmsg_eptdev context structure.

Add newlines between paragraphs.

> This patch prepares the introduction of a RPMsg device for the
> char device. the RPMsg device will need a reference to the context.

s/the/The

s/RPMsg/RPMSG - throughout the patchset.

As a general note please be mindful of patch changelogs.  I often find myself
having to decipher the ideas being conveyed.

I am done reviewing this set.  There are things I will want to come back to but
the general goals behind the patchset are being achieved.

Thanks,
Mathieu

> 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/rpmsg/rpmsg_char.c | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> index 09ae1304837c..66dcb8845d6c 100644
> --- a/drivers/rpmsg/rpmsg_char.c
> +++ b/drivers/rpmsg/rpmsg_char.c
> @@ -328,8 +328,9 @@ int rpmsg_chrdev_eptdev_destroy(struct device *dev, void 
> *data)
>  }
>  EXPORT_SYMBOL(rpmsg_chrdev_eptdev_destroy);
>  
> -int rpmsg_chrdev_create_eptdev(struct rpmsg_device *rpdev, struct device 
> *parent,
> -struct rpmsg_channel_info chinfo)
> +static struct rpmsg_eptdev *__rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev,
> +  struct device *parent,
> +  struct 
> rpmsg_channel_info chinfo)
>  {
>   struct rpmsg_eptdev *eptdev;
>   struct device *dev;
> @@ -337,7 +338,7 @@ int rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev, struct device *parent
>  
>   eptdev = kzalloc(sizeof(*eptdev), GFP_KERNEL);
>   if (!eptdev)
> - return -ENOMEM;
> + return ERR_PTR(-ENOMEM);
>  
>   dev = >dev;
>   eptdev->rpdev = rpdev;
> @@ -381,7 +382,7 @@ int rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev, struct device *parent
>   put_device(dev);
>   }
>  
> - return ret;
> + return eptdev;
>  
>  free_ept_ida:
>   ida_simple_remove(_ept_ida, dev->id);
> @@ -391,7 +392,19 @@ int rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev, struct device *parent
>   put_device(dev);
>   kfree(eptdev);
>  
> - return ret;
> + return ERR_PTR(ret);
> +}
> +
> +int rpmsg_chrdev_create_eptdev(struct rpmsg_device *rpdev, struct device 
> *parent,
> +struct rpmsg_channel_info chinfo)
> +{
> + struct rpmsg_eptdev *eptdev;
> +
> + eptdev = __rpmsg_chrdev_create_eptdev(rpdev, >dev, chinfo);
> + if (IS_ERR(eptdev))
> + return PTR_ERR(eptdev);
> +
> + return 0;
>  }
>  EXPORT_SYMBOL(rpmsg_chrdev_create_eptdev);
>  
> -- 
> 2.17.1
> 


[PATCH] platform/surface: aggregator: Make SSAM_DEFINE_SYNC_REQUEST_x define static functions

2021-03-04 Thread Maximilian Luz
The SSAM_DEFINE_SYNC_REQUEST_x() macros are intended to reduce
boiler-plate code for SSAM request definitions by defining a wrapper
function for the specified request. The client device variants of those
macros, i.e. SSAM_DEFINE_SYNC_REQUEST_CL_x() in particular rely on the
multi-device (MD) variants, e.g.:

#define SSAM_DEFINE_SYNC_REQUEST_CL_R(name, rtype, spec...)   \
SSAM_DEFINE_SYNC_REQUEST_MD_R(__raw_##name, rtype, spec)  \
int name(struct ssam_device *sdev, rtype *ret)\
{ \
return __raw_##name(sdev->ctrl, sdev->uid.target, \
sdev->uid.instance, ret); \
}

This now creates the problem that it is not possible to declare the
generated functions static via

static SSAM_DEFINE_SYNC_REQUEST_CL_R(...)

as this will only apply to the function defined by the multi-device
macro, i.e. SSAM_DEFINE_SYNC_REQUEST_MD_R(). Thus compiling with
`-Wmissing-prototypes' rightfully complains that there is a 'static'
keyword missing.

To solve this, make all SSAM_DEFINE_SYNC_REQUEST_x() macros define
static functions. Non-client-device macros are also changed for
consistency. In general, we expect those functions to be only used
locally in the respective drivers for the corresponding interfaces, so
having to define a wrapper function to be able to export this should be
the odd case out.

Reported-by: kernel test robot 
Fixes: 510c8114fc74 ("platform/surface: Add platform profile driver")
Signed-off-by: Maximilian Luz 
---
 .../driver-api/surface_aggregator/client.rst  |  4 +-
 .../platform/surface/aggregator/controller.c  | 10 +--
 .../surface/surface_aggregator_registry.c |  2 +-
 .../surface/surface_platform_profile.c|  4 +-
 include/linux/surface_aggregator/controller.h | 74 +--
 include/linux/surface_aggregator/device.h | 31 
 6 files changed, 63 insertions(+), 62 deletions(-)

diff --git a/Documentation/driver-api/surface_aggregator/client.rst 
b/Documentation/driver-api/surface_aggregator/client.rst
index 26d13085a117..e519d374c378 100644
--- a/Documentation/driver-api/surface_aggregator/client.rst
+++ b/Documentation/driver-api/surface_aggregator/client.rst
@@ -248,7 +248,7 @@ This example defines a function
 
 .. code-block:: c
 
-   int __ssam_tmp_perf_mode_set(struct ssam_controller *ctrl, const __le32 
*arg);
+   static int __ssam_tmp_perf_mode_set(struct ssam_controller *ctrl, const 
__le32 *arg);
 
 executing the specified request, with the controller passed in when calling
 said function. In this example, the argument is provided via the ``arg``
@@ -296,7 +296,7 @@ This invocation of the macro defines a function
 
 .. code-block:: c
 
-   int ssam_bat_get_sta(struct ssam_device *sdev, __le32 *ret);
+   static int ssam_bat_get_sta(struct ssam_device *sdev, __le32 *ret);
 
 executing the specified request, using the device IDs and controller given
 in the client device. The full list of such macros for client devices is:
diff --git a/drivers/platform/surface/aggregator/controller.c 
b/drivers/platform/surface/aggregator/controller.c
index 5bcb59ed579d..aa6f37b4f46e 100644
--- a/drivers/platform/surface/aggregator/controller.c
+++ b/drivers/platform/surface/aggregator/controller.c
@@ -1750,35 +1750,35 @@ EXPORT_SYMBOL_GPL(ssam_request_sync_with_buffer);
 
 /* -- Internal SAM requests.  
*/
 
-static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_get_firmware_version, __le32, {
+SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_get_firmware_version, __le32, {
.target_category = SSAM_SSH_TC_SAM,
.target_id   = 0x01,
.command_id  = 0x13,
.instance_id = 0x00,
 });
 
-static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_off, u8, {
+SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_off, u8, {
.target_category = SSAM_SSH_TC_SAM,
.target_id   = 0x01,
.command_id  = 0x15,
.instance_id = 0x00,
 });
 
-static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_on, u8, {
+SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_display_on, u8, {
.target_category = SSAM_SSH_TC_SAM,
.target_id   = 0x01,
.command_id  = 0x16,
.instance_id = 0x00,
 });
 
-static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_d0_exit, u8, {
+SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_d0_exit, u8, {
.target_category = SSAM_SSH_TC_SAM,
.target_id   = 0x01,
.command_id  = 0x33,
.instance_id = 0x00,
 });
 
-static SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_d0_entry, u8, {
+SSAM_DEFINE_SYNC_REQUEST_R(ssam_ssh_notif_d0_entry, u8, {
.target_category = SSAM_SSH_TC_SAM,
.target_id   = 0x01,
.command_id  = 0x34,
diff --git a/drivers/platform/surface/surface_aggregator_registry.c 
b/drivers/platform/surface/surface_aggregator_registry.c
index 

[PATCH v3 01/11] x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls

2021-03-04 Thread Andy Lutomirski
On a 32-bit fast syscall that fails to read its arguments from user
memory, the kernel currently does syscall exit work but not
syscall entry work.  This confuses audit and ptrace.  For example:

$ ./tools/testing/selftests/x86/syscall_arg_fault_32
...
strace: pid 264258: entering, ptrace_syscall_info.op == 2
...

This is a minimal fix intended for ease of backporting.  A more
complete cleanup is coming.

Cc: sta...@vger.kernel.org
Fixes: 0b085e68f407 ("x86/entry: Consolidate 32/64 bit syscall entry")
Signed-off-by: Andy Lutomirski 
---
 arch/x86/entry/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 0904f5676e4d..8fdb4cb27efe 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -128,7 +128,8 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs 
*regs)
regs->ax = -EFAULT;
 
instrumentation_end();
-   syscall_exit_to_user_mode(regs);
+   local_irq_disable();
+   irqentry_exit_to_user_mode(regs);
return false;
}
 
-- 
2.29.2



Re: XDP socket rings, and LKMM litmus tests

2021-03-04 Thread Paul E. McKenney
On Thu, Mar 04, 2021 at 10:35:24AM -0500, Alan Stern wrote:
> On Wed, Mar 03, 2021 at 09:04:07PM -0800, Paul E. McKenney wrote:
> > On Wed, Mar 03, 2021 at 10:21:01PM -0500, Alan Stern wrote:
> > > On Wed, Mar 03, 2021 at 02:03:48PM -0800, Paul E. McKenney wrote:
> > > > On Wed, Mar 03, 2021 at 03:22:46PM -0500, Alan Stern wrote:
> 
> > > > > >  And I cannot immediately think of a situation where
> > > > > > this approach would break that would not result in a data race being
> > > > > > flagged.  Or is this yet another failure of my imagination?
> > > > > 
> > > > > By definition, an access to a local variable cannot participate in a 
> > > > > data race because all such accesses are confined to a single thread.
> > > > 
> > > > True, but its value might have come from a load from a shared variable.
> > > 
> > > Then that load could have participated in a data race.  But the store to 
> > > the local variable cannot.
> > 
> > Agreed.  My thought was that if the ordering from the initial (non-local)
> > load mattered, then that initial load must have participated in a
> > data race.  Is that true, or am I failing to perceive some corner case?
> 
> Ordering can matter even when no data race is involved.  Just think
> about how much of the memory model is concerned with ordering of
> marked accesses, which don't participate in data races unless there is
> a conflicting plain access somewhere.

Fair point.  Should I have instead said "then that initial load must
have run concurrently with a store to that same variable"?

Thanx, Paul


Re: Broken kretprobe stack traces

2021-03-04 Thread Daniel Xu
On Wed, Mar 3, 2021, at 6:18 PM, Daniel Xu wrote:
> On Wed, Mar 03, 2021 at 03:37:40PM -0500, Steven Rostedt wrote:
> > On Wed, 03 Mar 2021 12:13:08 -0800
> > "Daniel Xu"  wrote:
> > 
> > > On Wed, Mar 3, 2021, at 11:58 AM, Daniel Xu wrote:
> > > > On Wed, Mar 03, 2021 at 09:26:04AM -0500, Steven Rostedt wrote:  
> > > > > On Wed, 3 Mar 2021 13:48:28 +0900
> > > > > Masami Hiramatsu  wrote:
> > > > >   
> > > > > >   
> > > > > > > 
> > > > > > > I think (can't prove) this used to work:
> > > > > 
> > > > > Would be good to find out if it did.  
> > > > 
> > > > I'm installing some older kernels now to check. Will report back.  
> > > 
> > > Yep, works in 4.11. So there was a regression somewhere.
> > 
> > Care to bisect? ;-)
> 
> Took a while (I'll probably be typing "test_regression.sh" in my sleep
> tonight) but I've bisected it down to f95b23a112f1 ("Merge branch
> 'x86/urgent' into x86/asm, to pick up dependent fixes").
> 
> I think I saw the default option for stack unwinder change from frame
> pointers -> ORC so that may be the root cause. Not sure, though. Need to
> look more closely at the commits in the merge commit.
> 
> <...>
> 
> Daniel
>

Compiling with:

CONFIG_UNWINDER_ORC=n
CONFIG_UNWINDER_FRAME_POINTER=y

fixes the issues and leads me to believe stacktraces on kretprobes
never worked with ORC.

Josh, any chance you have an idea?

Thanks,
Daniel


Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-04 Thread Eric W. Biederman
Thomas Gleixner  writes:

> On Thu, Mar 04 2021 at 09:11, Sebastian Andrzej Siewior wrote:
>> On 2021-03-03 16:09:05 [-0600], Eric W. Biederman wrote:
>>> Sebastian Andrzej Siewior  writes:
>>> 
>>> > From: Thomas Gleixner 
>>> >
>>> > Allow realtime tasks to cache one sigqueue in task struct. This avoids an
>>> > allocation which can increase the latency or fail.
>>> > Ideally the sigqueue is cached after first successful delivery and will be
>>> > available for next signal delivery. This works under the assumption that 
>>> > the RT
>>> > task has never an unprocessed signal while a one is about to be queued.
>>> >
>>> > The caching is not used for SIGQUEUE_PREALLOC because this kind of 
>>> > sigqueue is
>>> > handled differently (and not used for regular signal delivery).
>>> 
>>> What part of this is about real time tasks?  This allows any task
>>> to cache a sigqueue entry.
>>
>> It is limited to realtime tasks (SCHED_FIFO/RR/DL):
>>
>> +static void __sigqueue_cache_or_free(struct sigqueue *q)
>> +{
>> …
>> +if (!task_is_realtime(current) || !sigqueue_add_cache(current, q))
>> +kmem_cache_free(sigqueue_cachep, q);
>> +}
>
> We could of course do the caching unconditionally for all tasks.

Is there any advantage to only doing this for realtime tasks?

If not it probably makes sense to do the caching for all tasks.

I am wondering if we want to count the cached sigqueue structure to the
users rt signal rlimit?

Eric


Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller

2021-03-04 Thread Jason Gunthorpe
On Thu, Mar 04, 2021 at 11:01:44AM -0800, Jacob Pan wrote:

> > For something like qemu I'd expect to put the qemu process in a cgroup
> > with 1 PASID. Who cares what qemu uses the PASID for, or how it was
> > allocated?
> 
> For vSVA, we will need one PASID per guest process. But that is up to the
> admin based on whether or how many SVA capable devices are directly
> assigned.

I hope the virtual IOMMU driver can communicate the PASID limit and
the cgroup machinery in the guest can know what the actual limit is.

I was thinking of a case where qemu is using a single PASID to setup
the guest kVA or similar

Jason


[PATCH v2 3/4] ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42 companion codec.

2021-03-04 Thread Vitaly Rodionov
In the case of CS8409 we do not have unsol events from NID's 0x24 and 0x34
where hs mic and hp are connected. Companion codec CS42L42 will generate
interrupt via gpio 4 to notify jack events. We have to overwrite standard
snd_hda_jack_unsol_event(), read CS42L42 jack detect status registers and
then notify status via generic snd_hda_jack_unsol_event() call.

Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505.

Signed-off-by: Vitaly Rodionov 
---
 sound/pci/hda/patch_cirrus.c | 304 ++-
 1 file changed, 302 insertions(+), 2 deletions(-)

diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index c95588681d53..0b8980240176 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -38,6 +39,15 @@ struct cs_spec {
/* for MBP SPDIF control */
int (*spdif_sw_put)(struct snd_kcontrol *kcontrol,
struct snd_ctl_elem_value *ucontrol);
+
+   unsigned int cs42l42_hp_jack_in:1;
+   unsigned int cs42l42_mic_jack_in:1;
+
+   struct mutex cs8409_i2c_mux;
+
+   /* verb exec op override */
+   int (*exec_verb)(struct hdac_device *dev, unsigned int cmd,
+unsigned int flags, unsigned int *res);
 };
 
 /* available models with CS420x */
@@ -1229,6 +1239,13 @@ static int patch_cs4213(struct hda_codec *codec)
 #define CS8409_CS42L42_SPK_PIN_NID 0x2c
 #define CS8409_CS42L42_AMIC_PIN_NID0x34
 #define CS8409_CS42L42_DMIC_PIN_NID0x44
+#define CS8409_CS42L42_DMIC_ADC_PIN_NID0x22
+
+#define CS42L42_HSDET_AUTO_DONE0x02
+#define CS42L42_HSTYPE_MASK0x03
+
+#define CS42L42_JACK_INSERTED  0x0C
+#define CS42L42_JACK_REMOVED   0x00
 
 #define GPIO3_INT (1 << 3)
 #define GPIO4_INT (1 << 4)
@@ -1429,6 +1446,7 @@ static const struct cs8409_i2c_param 
cs42l42_init_reg_seq[] = {
{ 0x1C03, 0xC0 },
{ 0x1105, 0x00 },
{ 0x1112, 0xC0 },
+   { 0x1101, 0x02 },
{} /* Terminator */
 };
 
@@ -1565,6 +1583,8 @@ static unsigned int cs8409_i2c_write(struct hda_codec 
*codec,
 /* Assert/release RTS# line to CS42L42 */
 static void cs8409_cs42l42_reset(struct hda_codec *codec)
 {
+   struct cs_spec *spec = codec->spec;
+
/* Assert RTS# line */
snd_hda_codec_write(codec,
codec->core.afg, 0, AC_VERB_SET_GPIO_DATA, 0);
@@ -1576,21 +1596,184 @@ static void cs8409_cs42l42_reset(struct hda_codec 
*codec)
/* wait ~10ms */
usleep_range(1, 15000);
 
-   /* Clear interrupts status */
+   mutex_lock(>cs8409_i2c_mux);
+
+   /* Clear interrupts, by reading interrupt status registers */
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1308, 1);
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1309, 1);
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130A, 1);
cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130F, 1);
 
+   mutex_unlock(>cs8409_i2c_mux);
+
+}
+
+/* Configure CS42L42 slave codec for jack autodetect */
+static int cs8409_cs42l42_enable_jack_detect(struct hda_codec *codec)
+{
+   struct cs_spec *spec = codec->spec;
+
+   mutex_lock(>cs8409_i2c_mux);
+
+   /* Set TIP_SENSE_EN for analog front-end of tip sense. */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b70, 0x0020, 1);
+   /* Clear WAKE# */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b71, 0x0001, 1);
+   /* Wait ~2.5ms */
+   usleep_range(2500, 3000);
+   /* Set mode WAKE# output follows the combination logic directly */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b71, 0x0020, 1);
+   /* Clear interrupts status */
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x130f, 1);
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1b7b, 1);
+   /* Enable interrupt */
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1320, 0x03, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b79, 0x00, 1);
+
+   mutex_unlock(>cs8409_i2c_mux);
+
+   return 0;
+}
+
+/* Enable and run CS42L42 slave codec jack auto detect */
+static void cs8409_cs42l42_run_jack_detect(struct hda_codec *codec)
+{
+   struct cs_spec *spec = codec->spec;
+
+   mutex_lock(>cs8409_i2c_mux);
+
+   /* Clear interrupts */
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1308, 1);
+   cs8409_i2c_read(codec, CS42L42_I2C_ADDR, 0x1b77, 1);
+
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1102, 0x87, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1f06, 0x86, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1b74, 0x07, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x131b, 0x01, 1);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x1120, 0x80, 1);
+   /* Wait ~110ms*/
+   usleep_range(11, 20);
+   cs8409_i2c_write(codec, CS42L42_I2C_ADDR, 0x111f, 0x77, 1);
+   

[PATCH v2 1/4] ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18

2021-03-04 Thread Vitaly Rodionov
In preparation to support Cirrus Logic CS8409 HDA bridge on new Dell platforms
it is nessasary to increase AUTO_CFG_MAX_INS and AUTO_CFG_NUM_INPUTS values.
Currently AUTO_CFG_MAX_INS is limited to 8, but Cirrus Logic HDA bridge CS8409
has 18 input pins, 16 ASP receivers and 2 DMIC inputs. We have to increase this
value to 18, so generic code can handle this correctly.

Tested on DELL Inspiron-3505, DELL Inspiron-3501, DELL Inspiron-3500

Signed-off-by: Vitaly Rodionov 
---
 sound/pci/hda/hda_auto_parser.h | 2 +-
 sound/pci/hda/hda_local.h   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/sound/pci/hda/hda_auto_parser.h b/sound/pci/hda/hda_auto_parser.h
index a22ca0e17a08..df63d66af1ab 100644
--- a/sound/pci/hda/hda_auto_parser.h
+++ b/sound/pci/hda/hda_auto_parser.h
@@ -27,7 +27,7 @@ enum {
 };
 
 #define AUTO_CFG_MAX_OUTS  HDA_MAX_OUTS
-#define AUTO_CFG_MAX_INS   8
+#define AUTO_CFG_MAX_INS   18
 
 struct auto_pin_cfg_item {
hda_nid_t pin;
diff --git a/sound/pci/hda/hda_local.h b/sound/pci/hda/hda_local.h
index 5beb8aa44ecd..317245a5585d 100644
--- a/sound/pci/hda/hda_local.h
+++ b/sound/pci/hda/hda_local.h
@@ -180,7 +180,7 @@ int snd_hda_create_spdif_in_ctls(struct hda_codec *codec, 
hda_nid_t nid);
 /*
  * input MUX helper
  */
-#define HDA_MAX_NUM_INPUTS 16
+#define HDA_MAX_NUM_INPUTS 36
 struct hda_input_mux_item {
char label[32];
unsigned int index;
-- 
2.25.1



[PATCH v2 0/4] ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec

2021-03-04 Thread Vitaly Rodionov
Dell's laptops Inspiron 3500, Inspiron 3501, Inspiron 3505 are using
Cirrus Logic CS8409 HDA bridge with CS42L42 companion codec.

The CS8409 is a multichannel HD audio routing controller.
CS8409 includes support for four channels of digital
microphone data and two bidirectional ASPs for up to 32
channels of TDM data or 4 channels of I2S data. The CS8409 is
intended to be used with a remote companion codec that implements
high performance analog functions in close physical
proximity to the end-equipment audio port or speaker driver.

The CS42L42 is a low-power audio codec with integrated MIPI
SoundWire interface or I2C/I2S/TDM interfaces designed
for portable applications. It provides a high-dynamic range,
stereo DAC for audio playback and a mono high-dynamic-range
ADC for audio capture

Changes since version 1:

ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18
* No change

ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42
companion codec.
* Removed redundant fields in fixup table
* Handle gpio via spec->gpio_dir, spec->gpio_data and spec->gpio_mask
* Moved cs8409_cs42l42_init() from patch 2, to handle resume correctly

ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42
companion codec.
* Run scripts/checkpatch.pl, fixed new warnings

ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control
* Moved control values to cache to avoid i2c read at each time.

Stefan Binding (1):
  ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control

Vitaly Rodionov (3):
  ALSA: hda/cirrus: Increase AUTO_CFG_MAX_INS from 8 to 18
  ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42
companion codec.
  ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42
companion codec.

 sound/pci/hda/hda_auto_parser.h |2 +-
 sound/pci/hda/hda_local.h   |2 +-
 sound/pci/hda/patch_cirrus.c| 1068 +++
 3 files changed, 1070 insertions(+), 2 deletions(-)

-- 
2.25.1



[PATCH v2 4/4] ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control

2021-03-04 Thread Vitaly Rodionov
From: Stefan Binding 

CS8409 does not support Volume Control for NIDs 0x24 (the Headphones),
or 0x34 (The Headset Mic).
However, CS42L42 codec does support gain control for both.
We can add support for Volume Controls, by writing the the CS42L42
regmap via i2c commands, using custom info, get and put volume
functions, saved in the control.

Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3500

Signed-off-by: Stefan Binding 
Signed-off-by: Vitaly Rodionov 
---
 sound/pci/hda/patch_cirrus.c | 201 ++-
 1 file changed, 198 insertions(+), 3 deletions(-)

diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index 0b8980240176..082420545ab7 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -21,6 +21,9 @@
 /*
  */
 
+#define CS42L42_HP_CH (2U)
+#define CS42L42_HS_MIC_CH (1U)
+
 struct cs_spec {
struct hda_gen_spec gen;
 
@@ -42,6 +45,9 @@ struct cs_spec {
 
unsigned int cs42l42_hp_jack_in:1;
unsigned int cs42l42_mic_jack_in:1;
+   unsigned int cs42l42_volume_init:1;
+   char cs42l42_hp_volume[CS42L42_HP_CH];
+   char cs42l42_hs_mic_volume[CS42L42_HS_MIC_CH];
 
struct mutex cs8409_i2c_mux;
 
@@ -1260,6 +1266,14 @@ static int patch_cs4213(struct hda_codec *codec)
 #define CIR_I2C_QWRITE 0x005D
 #define CIR_I2C_QREAD  0x005E
 
+#define CS8409_CS42L42_HP_VOL_REAL_MIN   (-63)
+#define CS8409_CS42L42_HP_VOL_REAL_MAX   (0)
+#define CS8409_CS42L42_AMIC_VOL_REAL_MIN (-97)
+#define CS8409_CS42L42_AMIC_VOL_REAL_MAX (12)
+#define CS8409_CS42L42_REG_HS_VOLUME_CHA (0x2301)
+#define CS8409_CS42L42_REG_HS_VOLUME_CHB (0x2303)
+#define CS8409_CS42L42_REG_AMIC_VOLUME   (0x1D03)
+
 struct cs8409_i2c_param {
unsigned int addr;
unsigned int reg;
@@ -1401,7 +1415,6 @@ static const struct cs8409_i2c_param 
cs42l42_init_reg_seq[] = {
{ 0x1010, 0xB0 },
{ 0x1D01, 0x00 },
{ 0x1D02, 0x06 },
-   { 0x1D03, 0x00 },
{ 0x1107, 0x01 },
{ 0x1009, 0x02 },
{ 0x1007, 0x03 },
@@ -1431,8 +1444,6 @@ static const struct cs8409_i2c_param 
cs42l42_init_reg_seq[] = {
{ 0x2901, 0x01 },
{ 0x1101, 0x0A },
{ 0x1102, 0x84 },
-   { 0x2301, 0x00 },
-   { 0x2303, 0x00 },
{ 0x2302, 0x3f },
{ 0x2001, 0x03 },
{ 0x1B75, 0xB6 },
@@ -1580,6 +1591,179 @@ static unsigned int cs8409_i2c_write(struct hda_codec 
*codec,
return retval;
 }
 
+static int cs8409_cs42l42_volume_info(struct snd_kcontrol *kcontrol,
+ struct snd_ctl_elem_info *uinfo)
+{
+   struct hda_codec *codec = snd_kcontrol_chip(kcontrol);
+   u16 nid = get_amp_nid(kcontrol);
+   u8 chs = get_amp_channels(kcontrol);
+
+   codec_dbg(codec, "%s() nid: %d\n", __func__, nid);
+   switch (nid) {
+   case CS8409_CS42L42_HP_PIN_NID:
+   uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER;
+   uinfo->count = chs == 3 ? 2 : 1;
+   uinfo->value.integer.min = CS8409_CS42L42_HP_VOL_REAL_MIN;
+   uinfo->value.integer.max = CS8409_CS42L42_HP_VOL_REAL_MAX;
+   break;
+   case CS8409_CS42L42_AMIC_PIN_NID:
+   uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER;
+   uinfo->count = chs == 3 ? 2 : 1;
+   uinfo->value.integer.min = CS8409_CS42L42_AMIC_VOL_REAL_MIN;
+   uinfo->value.integer.max = CS8409_CS42L42_AMIC_VOL_REAL_MAX;
+   break;
+   default:
+   break;
+   }
+   return 0;
+}
+
+static void cs8409_cs42l42_update_volume(struct hda_codec *codec)
+{
+   struct cs_spec *spec = codec->spec;
+
+   mutex_lock(>cs8409_i2c_mux);
+   spec->cs42l42_hp_volume[0] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+   CS8409_CS42L42_REG_HS_VOLUME_CHA, 1));
+   spec->cs42l42_hp_volume[1] = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+   CS8409_CS42L42_REG_HS_VOLUME_CHB, 1));
+   spec->cs42l42_hs_mic_volume[0] = -(cs8409_i2c_read(codec, 
CS42L42_I2C_ADDR,
+   CS8409_CS42L42_REG_AMIC_VOLUME, 1));
+   codec_dbg(codec, "%s() HP Volume: %d/%d, HS Mic Volume: %d\n", __func__,
+   spec->cs42l42_hp_volume[0], spec->cs42l42_hp_volume[1],
+   spec->cs42l42_hs_mic_volume[0]);
+   mutex_unlock(>cs8409_i2c_mux);
+   spec->cs42l42_volume_init = 1;
+}
+
+static int cs8409_cs42l42_volume_get(struct snd_kcontrol *kcontrol,
+struct snd_ctl_elem_value *ucontrol)
+{
+   struct hda_codec *codec = snd_kcontrol_chip(kcontrol);
+   struct cs_spec *spec = codec->spec;
+   hda_nid_t nid = get_amp_nid(kcontrol);
+   int chs = get_amp_channels(kcontrol);
+   long *valp = ucontrol->value.integer.value;
+
+   codec_dbg(codec, "%s() nid: %d\n", __func__, nid);
+   if (!spec->cs42l42_volume_init) {
+   

[PATCH v2 2/4] ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec.

2021-03-04 Thread Vitaly Rodionov
Dell's laptops Inspiron 3500, Inspiron 3501, Inspiron 3505 are using Cirrus 
Logic
CS8409 HDA bridge with CS42L42 companion codec.

The CS8409 is a multichannel HD audio routing controller.
CS8409 includes support for four channels of digital
microphone data and two bidirectional ASPs for up to 32
channels of TDM data or 4 channels of I2S data. The CS8409 is
intended to be used with a remote companion codec that implements
high performance analog functions in close physical
proximity to the end-equipment audio port or speaker driver.

The CS42L42 is a low-power audio codec with integrated MIPI
SoundWire interface or I2C/I2S/TDM interfaces designed
for portable applications. It provides a high-dynamic range,
stereo DAC for audio playback and a mono high-dynamic-range
ADC for audio capture

CS42L42 is connected to CS8409 HDA bridge via I2C and I2S.

CS8409  CS42L42
--- 
ASP1.A TX  -->  ASP_SDIN
ASP1.A RX  <--  ASP_SDOUT
GPIO5  -->  RST#
GPIO4  <--  INT#
GPIO3  <--  WAKE#
GPIO7  <->  I2C SDA
GPIO6  -->  I2C CLK

Tested on DELL Inspiron-3500, DELL Inspiron-3501, DELL Inspiron-3505

This patch will register CS8409 with sound card and create
input/output paths and two input devices, initialise CS42L42
companion codec and configure it for ASP TX/RX TDM mode,
24bit, 48kHz.

cat /proc/asound/pcm
00-00: CS8409 Analog : CS8409 Analog : playback 1 : capture 1
00-03: HDMI 0 : HDMI 0 : playback 1

dmesg
snd_hda_codec_cirrus hdaudioC0D0: autoconfig for CS8409: line_outs=1 
(0x2c/0x0/0x0/0x0/0x0) type:speaker
snd_hda_codec_cirrus hdaudioC0D0:speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
snd_hda_codec_cirrus hdaudioC0D0:hp_outs=1 (0x24/0x0/0x0/0x0/0x0)
snd_hda_codec_cirrus hdaudioC0D0:mono: mono_out=0x0
snd_hda_codec_cirrus hdaudioC0D0:inputs:
snd_hda_codec_cirrus hdaudioC0D0:  Internal Mic=0x44
snd_hda_codec_cirrus hdaudioC0D0:  Mic=0x34
input: HDA Intel PCH Headphone as 
/devices/pci:00/:00:1f.3/sound/card0/input8
input: HDA Intel PCH Headset Mic as 
/devices/pci:00/:00:1f.3/sound/card0/input9

Signed-off-by: Vitaly Rodionov 
---
 sound/pci/hda/patch_cirrus.c | 573 +++
 1 file changed, 573 insertions(+)

diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index f46204ab0b90..c95588681d53 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "hda_local.h"
@@ -1219,6 +1220,577 @@ static int patch_cs4213(struct hda_codec *codec)
return err;
 }
 
+/* Cirrus Logic CS8409 HDA bridge with
+ * companion codec CS42L42
+ */
+#define CS8409_VENDOR_NID 0x47
+
+#define CS8409_CS42L42_HP_PIN_NID  0x24
+#define CS8409_CS42L42_SPK_PIN_NID 0x2c
+#define CS8409_CS42L42_AMIC_PIN_NID0x34
+#define CS8409_CS42L42_DMIC_PIN_NID0x44
+
+#define GPIO3_INT (1 << 3)
+#define GPIO4_INT (1 << 4)
+#define GPIO5_INT (1 << 5)
+
+#define CS42L42_I2C_ADDR   (0x48 << 1)
+
+#define CIR_I2C_ADDR   0x0059
+#define CIR_I2C_DATA   0x005A
+#define CIR_I2C_CTRL   0x005B
+#define CIR_I2C_STATUS 0x005C
+#define CIR_I2C_QWRITE 0x005D
+#define CIR_I2C_QREAD  0x005E
+
+struct cs8409_i2c_param {
+   unsigned int addr;
+   unsigned int reg;
+};
+
+struct cs8409_cir_param {
+   unsigned int nid;
+   unsigned int cir;
+   unsigned int coeff;
+};
+
+enum {
+   CS8409_BULLSEYE,
+   CS8409_WARLOCK,
+   CS8409_CYBORG,
+   CS8409_VERBS,
+};
+
+/* Dell Inspiron models with cs8409/cs42l42 */
+static const struct hda_model_fixup cs8409_models[] = {
+   { .id = CS8409_BULLSEYE, .name = "bullseye" },
+   { .id = CS8409_WARLOCK, .name = "warlock" },
+   { .id = CS8409_CYBORG, .name = "cyborg" },
+   {}
+};
+
+/* Dell Inspiron platforms
+ * with cs8409 bridge and cs42l42 codec
+ */
+static const struct snd_pci_quirk cs8409_fixup_tbl[] = {
+   SND_PCI_QUIRK(0x1028, 0x0A11, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A12, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A23, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A24, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A25, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A29, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A2A, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0A2B, "Bullseye", CS8409_BULLSEYE),
+   SND_PCI_QUIRK(0x1028, 0x0AB0, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB2, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB1, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB3, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB4, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AB5, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0AD9, "Warlock", CS8409_WARLOCK),
+   SND_PCI_QUIRK(0x1028, 0x0ADA, 

Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 19:51, Mark Rutland  wrote:
> On Thu, Mar 04, 2021 at 07:22:53PM +0100, Marco Elver wrote:
> > On Thu, 4 Mar 2021 at 19:02, Mark Rutland  wrote:
> > > On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > > > On Thu, Mar 04, 2021 at 04:59PM +, Mark Rutland wrote:
> > > > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland  
> > > > > > wrote:
> > > > > > > [adding Mark Brown]
> > > > > > >
> > > > > > > The bigger problem here is that skipping is dodgy to begin with, 
> > > > > > > and
> > > > > > > this is still liable to break in some cases. One big concern is 
> > > > > > > that
> > > > > > > (especially with LTO) we cannot guarantee the compiler will not 
> > > > > > > inline
> > > > > > > or outline functions, causing the skipp value to be too large or 
> > > > > > > too
> > > > > > > small. That's liable to happen to callers, and in theory (though
> > > > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > > > stack_trace_save() could get outlined too.
> > > > > > >
> > > > > > > Unless we can get some strong guarantees from compiler folk such 
> > > > > > > that we
> > > > > > > can guarantee a specific function acts boundary for unwinding (and
> > > > > > > doesn't itself get split, etc), the only reliable way I can think 
> > > > > > > to
> > > > > > > solve this requires an assembly trampoline. Whatever we do is 
> > > > > > > liable to
> > > > > > > need some invasive rework.
> > > > > >
> > > > > > Will LTO and friends respect 'noinline'?
> > > > >
> > > > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > > > know whether they actually so.
> > > > >
> > > > > I suspect even with 'noinline' the compiler is permitted to outline
> > > > > portions of a function if it wanted to (and IIUC it could still make
> > > > > specialized copies in the absence of 'noclone').
> > > > >
> > > > > > One thing I also noticed is that tail calls would also cause the 
> > > > > > stack
> > > > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > > > disabled tail call optimizations).
> > > > >
> > > > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > > > trace A->C? ... or is A going missing too?
> > > >
> > > > Correct, it's just the A->C outcome.
> > >
> > > I'd assumed that those cases were benign, e.g. for livepatching what
> > > matters is what can be returned to, so B disappearing from the trace
> > > isn't a problem there.
> > >
> > > Is the concern debugability, or is there a functional issue you have in
> > > mind?
> >
> > For me, it's just been debuggability, and reliable test cases.
> >
> > > > > > Is there a way to also mark a function non-tail-callable?
> > > > >
> > > > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > > > function-local optimization options), but I don't expect there's any 
> > > > > way
> > > > > to mark a callee as not being tail-callable.
> > > >
> > > > I don't think this is reliable. It'd be
> > > > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > > > work if applied to the function we do not want to tail-call-optimize,
> > > > but would have to be applied to the function that does the tail-calling.
> > >
> > > Yup; that's what I meant then I said you could do that on the caller but
> > > not the callee.
> > >
> > > I don't follow why you'd want to put this on the callee, though, so I
> > > think I'm missing something. Considering a set of functions in different
> > > compilation units:
> > >
> > >   A->B->C->D->E->F->G->H->I->J->K
> >
> > I was having this problem with KCSAN, where the compiler would
> > tail-call-optimize __tsan_X instrumentation.
>
> Those are compiler-generated calls, right? When those are generated the
> compilation unit (and whatever it has included) might not have provided
> a prototype anyway, and the compiler has special knowledge of the
> functions, so it feels like the compiler would need to inhibit TCO here
> for this to be robust. For their intended usage subjecting them to TCO
> doesn't seem to make sense AFAICT.
>
> I suspect that compilers have some way of handling that; otherwise I'd
> expect to have heard stories of mcount/fentry calls getting TCO'd and
> causing problems. So maybe there's an easy fix there?

I agree, the compiler builtins should be handled by the compiler
directly, perhaps that was a bad example. But we also have "explicit
instrumentation", e.g. everything that's in .

> > This would mean that KCSAN runtime functions ended up in the trace,
> > but the function where the access happened would not. However, I don't
> > care about the runtime functions, and instead want to see the function
> > where the access happened. In that case, I'd like to just mark
> > __tsan_X and any other kcsan 

Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-04 Thread Eric W. Biederman
Sebastian Andrzej Siewior  writes:

> On 2021-03-03 16:09:05 [-0600], Eric W. Biederman wrote:
>> Sebastian Andrzej Siewior  writes:
>> 
>> > From: Thomas Gleixner 
>> >
>> > Allow realtime tasks to cache one sigqueue in task struct. This avoids an
>> > allocation which can increase the latency or fail.
>> > Ideally the sigqueue is cached after first successful delivery and will be
>> > available for next signal delivery. This works under the assumption that 
>> > the RT
>> > task has never an unprocessed signal while a one is about to be queued.
>> >
>> > The caching is not used for SIGQUEUE_PREALLOC because this kind of 
>> > sigqueue is
>> > handled differently (and not used for regular signal delivery).
>> 
>> What part of this is about real time tasks?  This allows any task
>> to cache a sigqueue entry.
>
> It is limited to realtime tasks (SCHED_FIFO/RR/DL):
>
> +static void __sigqueue_cache_or_free(struct sigqueue *q)
> +{
> …
> + if (!task_is_realtime(current) || !sigqueue_add_cache(current, q))
> + kmem_cache_free(sigqueue_cachep, q);
> +}

I see now.  I was looking for it somewhere in the allocation side.
Oleg's suggestion of simply adding a few additional lines to
__sigqueue_free would have made this stand out more.

A __sigqueue_free that takes the relevant task_struct instead of always
assuming current would be nice here.


>> Either the patch is buggy or the description is.  Overall caching one
>> sigqueue entry doesn't look insane. But it would help to have a clear
>> description of what is going on.
>
> Does this clear things up or is my logic somehow broken here?

No I just missed the task_is_realtime limitation.

Eric



Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller

2021-03-04 Thread Jacob Pan
Hi Jason,

On Thu, 4 Mar 2021 13:54:02 -0400, Jason Gunthorpe  wrote:

> On Thu, Mar 04, 2021 at 09:46:03AM -0800, Jacob Pan wrote:
> 
> > Right, I was assuming have three use cases of IOASIDs:
> > 1. host supervisor SVA (not a concern, just one init_mm to bind)
> > 2. host user SVA, either one IOASID per process or perhaps some private
> > IOASID for private address space
> > 3. VM use for guest SVA, each IOASID is bound to a guest process
> > 
> > My current cgroup proposal applies to #3 with IOASID_SET_TYPE_MM, which
> > is allocated by the new /dev/ioasid interface.
> > 
> > For #2, I was thinking you can limit the host process via PIDs cgroup?
> > i.e. limit fork. So the host IOASIDs are currently allocated from the
> > system pool with quota of chosen by iommu_sva_init() in my patch, 0
> > means unlimited use whatever is available.
> > https://lkml.org/lkml/2021/2/28/18  
> 
> Why do we need two pools?
> 
> If PASID's are limited then why does it matter how the PASID was
> allocated? Either the thing requesting it is below the limit, or it
> isn't.
> 
you are right. it should be tracked based on the process regardless it is
allocated by the user (/dev/ioasid) or indirectly by kernel drivers during
iommu_sva_bind_device(). Need to consolidate both 2 and 3 and
decouple cgroup and IOASID set.

> For something like qemu I'd expect to put the qemu process in a cgroup
> with 1 PASID. Who cares what qemu uses the PASID for, or how it was
> allocated?
> 
For vSVA, we will need one PASID per guest process. But that is up to the
admin based on whether or how many SVA capable devices are directly
assigned.

> Jason


Thanks,

Jacob


Re: [PATCH 3/4] ALSA: hda/cirrus: Add jack detect interrupt support from CS42L42 companion codec.

2021-03-04 Thread Vitaly Rodionov

On 04/03/2021 1:45 pm, Takashi Iwai wrote:

On Wed, 03 Mar 2021 19:29:58 +0100,
Vitaly Rodionov wrote:

@@ -1243,6 +1258,8 @@ static int patch_cs4213(struct hda_codec *codec)
  #define CIR_I2C_QWRITE0x005D
  #define CIR_I2C_QREAD 0x005E
  
+static struct mutex cs8409_i2c_mux;

Any reason that this must be the global mutex?  I suppose it can fit
in own spec->i2c_mutex instead?

No,  there is no reason to have global mutex, will move it out to spec.




+static void cs8409_cs42l42_cap_sync_hook(struct hda_codec *codec,
+struct snd_kcontrol *kcontrol,
+struct snd_ctl_elem_value *ucontrol)
+{
+   struct cs_spec *spec = codec->spec;
+   unsigned int curval, expval;
+   /* CS8409 DMIC Pin only allows the setting of the Stream Parameters in
+* Power State D0. When a headset is unplugged, and the path is 
switched to
+* the DMIC, the Stream is restarted with the new ADC, but this is done 
in
+* Power State D3. Restart the Stream now DMIC is in D0.
+*/
+   if (spec->gen.cur_adc == CS8409_CS42L42_DMIC_ADC_PIN_NID) {
+   curval = snd_hda_codec_read(codec, spec->gen.cur_adc,
+   0, AC_VERB_GET_CONV, 0);
+   expval = (spec->gen.cur_adc_stream_tag << 4) | 0;
+   if (curval != expval) {
+   codec_dbg(codec, "%s Restarting Stream after DMIC 
switch\n", __func__);
+   __snd_hda_codec_cleanup_stream(codec, 
spec->gen.cur_adc, 1);
+   snd_hda_codec_setup_stream(codec, spec->gen.cur_adc,
+  spec->gen.cur_adc_stream_tag, 0,
+  spec->gen.cur_adc_format);

Hrm, this looks a big scary.  We may need to reconsider how to handle
this better later, but it's OK as long as you've tested with this
code...


We have been thinking about this code, and we have some ideas , it was 
tested by us, DELL and Canonical and works.


But we would like to change it a bit later, and handle it in a more 
generic way.





+static int cs8409_cs42l42_init(struct hda_codec *codec)
+{
+   int ret = 0;
+
+   ret = snd_hda_gen_init(codec);
+
+   if (!ret) {
+   /* On Dell platforms with suspend D3 mode support we
+* have to re-initialise cs8409 bridge and companion
+* cs42l42 codec
+*/
+   snd_hda_sequence_write(codec, cs8409_cs42l42_init_verbs);
+   snd_hda_sequence_write(codec, cs8409_cs42l42_add_verbs);
+
+   cs8409_cs42l42_hw_init(codec);

Ah... the init stuff at resume appears finally here.  This part should
be in the second patch instead.

Yes, moving this into second patch.



+static int cs8409_cs42l42_exec_verb(struct hdac_device *dev,
+   unsigned int cmd, unsigned int flags, unsigned int *res)
+{
+   struct hda_codec *codec = container_of(dev, struct hda_codec, core);
+   struct cs_spec *spec = codec->spec;
+
+   unsigned int nid = 0;
+   unsigned int verb = 0;

The blank line above should be removed.


+   case CS8409_CS42L42_HP_PIN_NID:
+   if (verb == AC_VERB_GET_PIN_SENSE) {
+   *res = 
(spec->cs42l42_hp_jack_in)?AC_PINSENSE_PRESENCE:0;

The spaces are needed around operators.
Similar coding-style issues are seen other places.  Please try to run
scripts/checkpatch.pl.


Fixed, and checked with scripts/checkpatch.pl. Base has 19 warnings. 
This patch will not introduce any new.





thanks,

Takashi


Thanks,

Vitaly




Re: [RFC PATCH v2 00/13] Add futex2 syscall

2021-03-04 Thread André Almeida

Hi Peter,

Às 02:44 de 04/03/21, Peter Oskolkov escreveu:

On Wed, Mar 3, 2021 at 5:22 PM André Almeida  wrote:


Hi,

This patch series introduces the futex2 syscalls.

* FAQ

  ** "And what's about FUTEX_64?"

  By supporting 64 bit futexes, the kernel structure for futex would
  need to have a 64 bit field for the value, and that could defeat one of
  the purposes of having different sized futexes in the first place:
  supporting smaller ones to decrease memory usage. This might be
  something that could be disabled for 32bit archs (and even for
  CONFIG_BASE_SMALL).

  Which use case would benefit for FUTEX_64? Does it worth the trade-offs?


The ability to store a pointer value on 64bit platforms is an
important use case.
Imagine a simple producer/consumer scenario, with the producer updating
some shared memory data and waking the consumer. Storing the pointer
in the futex makes it so that only one shared memory location needs to be
accessed "atomically", etc. With two atomics synchronization becomes
more involved (= slower).



So the idea is to, instead of doing this:

T1:
atomic_set(_addr, buffer_addr);
atomic_set(, 0);
futex_wake(, 1);

T2:
consume(shm_addr);

To do that:

T1:
atomic_set(, buffer_addr);
futex_wake(, 1);

T2:
consume(futex);

Right?

I'll try to write a small test to see how the perf numbers looks like.


Re: A note on the 5.12-rc1 tag

2021-03-04 Thread Geert Uytterhoeven
Hi David,

On Thu, Mar 4, 2021 at 5:56 PM David Laight  wrote:
> > On Thu, Mar 4, 2021 at 1:59 PM Linus Torvalds
> >  wrote:
> > > And, as far as I know, all the normal distributions set things up with
> > > swap partitions, not files, because honestly, swapfiles tend to be
> > > slower and have various other complexity issues.
> >
> > Looks like this has changed in at least Ubuntu: my desktop machine,
> > which got Ubuntu 18.04LTS during initial installation, is using a (small)
> > swapfile instead of a swap partition.
>
> My older ubuntu (13.04) didn't have swap at all.

IIRC, the small swapfile was the default suggestion. I don't really
need swap (yummy, 53 GiB in buff/cache ;-)

> I had to add some when running multiple copies of the Altera
> fpga software started causing grief.
> That will be a file.

Or switch FPGA, and use yosys ;-)

> After all once you start swapping it is all horrid and slow.
> Swap to file may be slower, but apart from dumping out inactive
> pages you really don't want to be doing it - so it doesn't matter.
>
> Historically swap was a partition and larger than physical memory.
> This allows simple 'dump to swap' on panic (for many disk types).
> But I've not seen that support in linux.

I know.  We started with lots of small partitions, but nowadays the
distros wan't to install everything in a single[*] partition, even swap.

[*] Ignoring /boot/efi, which didn't exist in the good ol' days.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v2 2/2] mm/memcg: set memcg when split page

2021-03-04 Thread Shakeel Butt
On Wed, Mar 3, 2021 at 11:57 PM Zhou Guanghui  wrote:
>
> As described in the split_page function comment, for the non-compound
> high order page, the sub-pages must be freed individually. If the
> memcg of the fisrt page is valid, the tail pages cannot be uncharged
> when be freed.
>
> For example, when alloc_pages_exact is used to allocate 1MB continuous
> physical memory, 2MB is charged(kmemcg is enabled and __GFP_ACCOUNT is
> set). When make_alloc_exact free the unused 1MB and free_pages_exact
> free the applied 1MB, actually, only 4KB(one page) is uncharged.
>
> Therefore, the memcg of the tail page needs to be set when split page.
>
> Signed-off-by: Zhou Guanghui 

Reviewed-by: Shakeel Butt 


Re: [PATCH v2 1/2] mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg

2021-03-04 Thread Shakeel Butt
On Wed, Mar 3, 2021 at 11:55 PM Zhou Guanghui  wrote:
>
> Rename mem_cgroup_split_huge_fixup to split_page_memcg and explicitly
> pass in page number argument.
>
> In this way, the interface name is more common and can be used by
> potential users. In addition, the complete info(memcg and flag) of
> the memcg needs to be set to the tail pages.
>
> Signed-off-by: Zhou Guanghui 

Reviewed-by: Shakeel Butt 


Re: [PATCH v5 13/16] rpmsg: char: introduce __rpmsg_chrdev_create_eptdev function

2021-03-04 Thread Mathieu Poirier
On Fri, Feb 19, 2021 at 12:14:58PM +0100, Arnaud Pouliquen wrote:
> Introduce the __rpmsg_chrdev_create_eptdev internal function that returns
> the rpmsg_eptdev context structure.
> This patch prepares the introduction of a RPMsg device for the
> char device. the RPMsg device will need a reference to the context.
> 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/rpmsg/rpmsg_char.c | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> index 09ae1304837c..66dcb8845d6c 100644
> --- a/drivers/rpmsg/rpmsg_char.c
> +++ b/drivers/rpmsg/rpmsg_char.c
> @@ -328,8 +328,9 @@ int rpmsg_chrdev_eptdev_destroy(struct device *dev, void 
> *data)
>  }
>  EXPORT_SYMBOL(rpmsg_chrdev_eptdev_destroy);
>  
> -int rpmsg_chrdev_create_eptdev(struct rpmsg_device *rpdev, struct device 
> *parent,
> -struct rpmsg_channel_info chinfo)
> +static struct rpmsg_eptdev *__rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev,
> +  struct device *parent,
> +  struct 
> rpmsg_channel_info chinfo)
>  {
>   struct rpmsg_eptdev *eptdev;
>   struct device *dev;
> @@ -337,7 +338,7 @@ int rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev, struct device *parent
>  
>   eptdev = kzalloc(sizeof(*eptdev), GFP_KERNEL);
>   if (!eptdev)
> - return -ENOMEM;
> + return ERR_PTR(-ENOMEM);
>  
>   dev = >dev;
>   eptdev->rpdev = rpdev;
> @@ -381,7 +382,7 @@ int rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev, struct device *parent
>   put_device(dev);
>   }
>  
> - return ret;
> + return eptdev;
>  
>  free_ept_ida:
>   ida_simple_remove(_ept_ida, dev->id);
> @@ -391,7 +392,19 @@ int rpmsg_chrdev_create_eptdev(struct rpmsg_device 
> *rpdev, struct device *parent
>   put_device(dev);
>   kfree(eptdev);
>  
> - return ret;
> + return ERR_PTR(ret);
> +}
> +
> +int rpmsg_chrdev_create_eptdev(struct rpmsg_device *rpdev, struct device 
> *parent,
> +struct rpmsg_channel_info chinfo)
> +{
> + struct rpmsg_eptdev *eptdev;
> +
> + eptdev = __rpmsg_chrdev_create_eptdev(rpdev, >dev, chinfo);

Shouldn't the second argument to __rpmsg_chrdev_create_eptdev() be @parent?

> + if (IS_ERR(eptdev))
> + return PTR_ERR(eptdev);
> +
> + return 0;
>  }
>  EXPORT_SYMBOL(rpmsg_chrdev_create_eptdev);
>  
> -- 
> 2.17.1
> 


Re: [PATCH V3 XRT Alveo 03/18] fpga: xrt: xclbin file helper functions

2021-03-04 Thread Lizhi Hou

Hi Moritz,


On 03/02/2021 07:14 AM, Moritz Fischer wrote:


On Mon, Mar 01, 2021 at 04:25:37PM -0800, Lizhi Hou wrote:

Hi Tom,


On 02/28/2021 08:54 AM, Tom Rix wrote:

CAUTION: This message has originated from an External Source. Please use proper 
judgment and caution when opening attachments, clicking links, or responding to 
this email.


On 2/26/21 1:23 PM, Lizhi Hou wrote:

Hi Tom,



snip


I also do not see a pragma pack, usually this is set of 1 so the compiler does 
not shuffle elements, increase size etc.

This data structure is shared with other tools. And the structure is well 
defined with reasonable alignment. It is compatible with all compilers we have 
tested. So pragma pack is not necessary.

You can not have possibly tested all the configurations since the kernel 
supports many arches and compilers.

If the tested existing alignment is ok, pragma pack should be a noop on your 
tested configurations.

And help cover the untested configurations.

Got it. I will add pragma pack(1).

Please do not use pragma pack(), add __packed to the structs in
question.

Ok, I will use __packed.

Thanks,
Lizhi


- Moritz




Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Mark Rutland
On Thu, Mar 04, 2021 at 07:22:53PM +0100, Marco Elver wrote:
> On Thu, 4 Mar 2021 at 19:02, Mark Rutland  wrote:
> > On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > > On Thu, Mar 04, 2021 at 04:59PM +, Mark Rutland wrote:
> > > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland  
> > > > > wrote:
> > > > > > [adding Mark Brown]
> > > > > >
> > > > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > > > this is still liable to break in some cases. One big concern is that
> > > > > > (especially with LTO) we cannot guarantee the compiler will not 
> > > > > > inline
> > > > > > or outline functions, causing the skipp value to be too large or too
> > > > > > small. That's liable to happen to callers, and in theory (though
> > > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > > stack_trace_save() could get outlined too.
> > > > > >
> > > > > > Unless we can get some strong guarantees from compiler folk such 
> > > > > > that we
> > > > > > can guarantee a specific function acts boundary for unwinding (and
> > > > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > > > solve this requires an assembly trampoline. Whatever we do is 
> > > > > > liable to
> > > > > > need some invasive rework.
> > > > >
> > > > > Will LTO and friends respect 'noinline'?
> > > >
> > > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > > know whether they actually so.
> > > >
> > > > I suspect even with 'noinline' the compiler is permitted to outline
> > > > portions of a function if it wanted to (and IIUC it could still make
> > > > specialized copies in the absence of 'noclone').
> > > >
> > > > > One thing I also noticed is that tail calls would also cause the stack
> > > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > > disabled tail call optimizations).
> > > >
> > > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > > trace A->C? ... or is A going missing too?
> > >
> > > Correct, it's just the A->C outcome.
> >
> > I'd assumed that those cases were benign, e.g. for livepatching what
> > matters is what can be returned to, so B disappearing from the trace
> > isn't a problem there.
> >
> > Is the concern debugability, or is there a functional issue you have in
> > mind?
> 
> For me, it's just been debuggability, and reliable test cases.
> 
> > > > > Is there a way to also mark a function non-tail-callable?
> > > >
> > > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > > function-local optimization options), but I don't expect there's any way
> > > > to mark a callee as not being tail-callable.
> > >
> > > I don't think this is reliable. It'd be
> > > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > > work if applied to the function we do not want to tail-call-optimize,
> > > but would have to be applied to the function that does the tail-calling.
> >
> > Yup; that's what I meant then I said you could do that on the caller but
> > not the callee.
> >
> > I don't follow why you'd want to put this on the callee, though, so I
> > think I'm missing something. Considering a set of functions in different
> > compilation units:
> >
> >   A->B->C->D->E->F->G->H->I->J->K
> 
> I was having this problem with KCSAN, where the compiler would
> tail-call-optimize __tsan_X instrumentation.

Those are compiler-generated calls, right? When those are generated the
compilation unit (and whatever it has included) might not have provided
a prototype anyway, and the compiler has special knowledge of the
functions, so it feels like the compiler would need to inhibit TCO here
for this to be robust. For their intended usage subjecting them to TCO
doesn't seem to make sense AFAICT.

I suspect that compilers have some way of handling that; otherwise I'd
expect to have heard stories of mcount/fentry calls getting TCO'd and
causing problems. So maybe there's an easy fix there?

> This would mean that KCSAN runtime functions ended up in the trace,
> but the function where the access happened would not. However, I don't
> care about the runtime functions, and instead want to see the function
> where the access happened. In that case, I'd like to just mark
> __tsan_X and any other kcsan instrumentation functions as
> do-not-tail-call-optimize, which would solve the problem.

I understand why we don't want to TCO these calls, but given the calls
are implicitly generated, I strongly suspect it's better to fix the
implicit call generation to not be TCO'd to begin with.

> The solution today is that when you compile a kernel with KCSAN, every
> instrumented TU is compiled with -fno-optimize-sibling-calls. The
> better solution would be to just mark KCSAN runtime functions somehow,
> but 

Re: [tip: irq/core] genirq: Add IRQF_NO_AUTOEN for request_irq/nmi()

2021-03-04 Thread Thomas Gleixner
Dmitry,

On Thu, Mar 04 2021 at 11:57, Thomas Gleixner wrote:
> On Thu, Mar 04 2021 at 10:53, tip-bot wrote:
>
>> The following commit has been merged into the irq/core branch of tip:
>>
>> Commit-ID: e749df1bbd23f4472082210650514548d8a39e9b
>> Gitweb:
>> https://git.kernel.org/tip/e749df1bbd23f4472082210650514548d8a39e9b
>> Author:Barry Song 
>> AuthorDate:Wed, 03 Mar 2021 11:49:15 +13:00
>> Committer: Thomas Gleixner 
>> CommitterDate: Thu, 04 Mar 2021 11:47:52 +01:00
>>
>> genirq: Add IRQF_NO_AUTOEN for request_irq/nmi()
>
> this commit is immutable and I tagged it so you can pull it into your
> tree to add the input changes on top:
>
>git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
> irq-no-autoen-2021-03-04

Please hold on:

  
https://lkml.kernel.org/r/CAHk-=wgZjJ89jeH72TC3i6N%2Bz9WEY=3ysp8zr9narucsqca...@mail.gmail.com

I'll recreate a tag for you once rc2 is out.

Thanks,

tglx


Re: [PATCH] scsi: ibmvfc: Switch to using the new API kobj_to_dev()

2021-03-04 Thread Tyrel Datwyler
On 3/4/21 1:28 AM, Jiapeng Chong wrote:
> Fix the following coccicheck warnings:
> 
> ./drivers/scsi/ibmvscsi/ibmvfc.c:3483:60-61: WARNING opportunity for
> kobj_to_dev().
> 
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 

Acked-by: Tyrel Datwyler 

> ---
>  drivers/scsi/ibmvscsi/ibmvfc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c
> index 755313b..e5f1ca7 100644
> --- a/drivers/scsi/ibmvscsi/ibmvfc.c
> +++ b/drivers/scsi/ibmvscsi/ibmvfc.c
> @@ -3480,7 +3480,7 @@ static ssize_t ibmvfc_read_trace(struct file *filp, 
> struct kobject *kobj,
>struct bin_attribute *bin_attr,
>char *buf, loff_t off, size_t count)
>  {
> - struct device *dev = container_of(kobj, struct device, kobj);
> + struct device *dev = kobj_to_dev(kobj);
>   struct Scsi_Host *shost = class_to_shost(dev);
>   struct ibmvfc_host *vhost = shost_priv(shost);
>   unsigned long flags = 0;
> 



Re: [patch 2/7] drm/vmgfx: Replace kmap_atomic()

2021-03-04 Thread Zack Rusin


> On Mar 3, 2021, at 08:20, Thomas Gleixner  wrote:
> 
> From: Thomas Gleixner 
> 
> There is no reason to disable pagefaults and preemption as a side effect of
> kmap_atomic_prot().
> 
> Use kmap_local_page_prot() instead and document the reasoning for the
> mapping usage with the given pgprot.
> 
> Remove the NULL pointer check for the map. These functions return a valid
> address for valid pages and the return was bogus anyway as it would have
> left preemption and pagefaults disabled.
> 
> Signed-off-by: Thomas Gleixner 
> Cc: VMware Graphics 
> Cc: Roland Scheidegger 
> Cc: Zack Rusin 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-de...@lists.freedesktop.org
> ---
> drivers/gpu/drm/vmwgfx/vmwgfx_blit.c |   30 --
> 1 file changed, 12 insertions(+), 18 deletions(-)
> 
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_blit.c
> @@ -375,12 +375,12 @@ static int vmw_bo_cpu_blit_line(struct v
>   copy_size = min_t(u32, copy_size, PAGE_SIZE - src_page_offset);
> 
>   if (unmap_src) {
> - kunmap_atomic(d->src_addr);
> + kunmap_local(d->src_addr);
>   d->src_addr = NULL;
>   }
> 
>   if (unmap_dst) {
> - kunmap_atomic(d->dst_addr);
> + kunmap_local(d->dst_addr);
>   d->dst_addr = NULL;
>   }
> 
> @@ -388,12 +388,8 @@ static int vmw_bo_cpu_blit_line(struct v
>   if (WARN_ON_ONCE(dst_page >= d->dst_num_pages))
>   return -EINVAL;
> 
> - d->dst_addr =
> - kmap_atomic_prot(d->dst_pages[dst_page],
> -  d->dst_prot);
> - if (!d->dst_addr)
> - return -ENOMEM;
> -
> + d->dst_addr = 
> kmap_local_page_prot(d->dst_pages[dst_page],
> +d->dst_prot);
>   d->mapped_dst = dst_page;
>   }
> 
> @@ -401,12 +397,8 @@ static int vmw_bo_cpu_blit_line(struct v
>   if (WARN_ON_ONCE(src_page >= d->src_num_pages))
>   return -EINVAL;
> 
> - d->src_addr =
> - kmap_atomic_prot(d->src_pages[src_page],
> -  d->src_prot);
> - if (!d->src_addr)
> - return -ENOMEM;
> -
> + d->src_addr = 
> kmap_local_page_prot(d->src_pages[src_page],
> +d->src_prot);
>   d->mapped_src = src_page;
>   }
>   diff->do_cpy(diff, d->dst_addr + dst_page_offset,
> @@ -436,8 +428,10 @@ static int vmw_bo_cpu_blit_line(struct v
>  *
>  * Performs a CPU blit from one buffer object to another avoiding a full
>  * bo vmap which may exhaust- or fragment vmalloc space.
> - * On supported architectures (x86), we're using kmap_atomic which avoids
> - * cross-processor TLB- and cache flushes and may, on non-HIGHMEM systems
> + *
> + * On supported architectures (x86), we're using kmap_local_prot() which
> + * avoids cross-processor TLB- and cache flushes. kmap_local_prot() will
> + * either map a highmem page with the proper pgprot on HIGHMEM=y systems or
>  * reference already set-up mappings.
>  *
>  * Neither of the buffer objects may be placed in PCI memory
> @@ -500,9 +494,9 @@ int vmw_bo_cpu_blit(struct ttm_buffer_ob
>   }
> out:
>   if (d.src_addr)
> - kunmap_atomic(d.src_addr);
> + kunmap_local(d.src_addr);
>   if (d.dst_addr)
> - kunmap_atomic(d.dst_addr);
> + kunmap_local(d.dst_addr);
> 
>   return ret;
> }


Looks good. Thanks.

Reviewed-by: Zack Rusin 

z



Re: futex breakage in 4.9 stable branch

2021-03-04 Thread Thomas Gleixner
On Thu, Mar 04 2021 at 10:12, Mike Galbraith wrote:
> On Mon, 2021-03-01 at 18:29 +0100, Ben Hutchings wrote:
>
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -874,8 +874,12 @@ static void free_pi_state(struct futex_p
>* and has cleaned up the pi_state already
>*/
>   if (pi_state->owner) {
> + unsigned long flags;
> +
> + raw_spin_lock_irqsave(_state->pi_mutex.wait_lock, flags);
>   pi_state_update_owner(pi_state, NULL);
>   rt_mutex_proxy_unlock(_state->pi_mutex);
> + raw_spin_unlock_irqrestore(_state->pi_mutex.wait_lock, 
> flags);

This hunk is missing in 4.9 as well.

>   }
>
>   if (current->pi_state_cache)
> @@ -1406,7 +1410,7 @@ static int wake_futex_pi(u32 __user *uad
>   if (pi_state->owner != current)
>   return -EINVAL;
>
> - raw_spin_lock(_state->pi_mutex.wait_lock);
> + raw_spin_lock_irq(_state->pi_mutex.wait_lock);
>   new_owner = rt_mutex_next_owner(_state->pi_mutex);
>
>   /*

That looks correct.

Thanks,

tglx


Re: [PATCH v3 15/32] KVM: arm64: Prepare the creation of s1 mappings at EL2

2021-03-04 Thread Will Deacon
Hi Quentin,

On Tue, Mar 02, 2021 at 02:59:45PM +, Quentin Perret wrote:
> When memory protection is enabled, the EL2 code needs the ability to
> create and manage its own page-table. To do so, introduce a new set of
> hypercalls to bootstrap a memory management system at EL2.
> 
> This leads to the following boot flow in nVHE Protected mode:
> 
>  1. the host allocates memory for the hypervisor very early on, using
> the memblock API;
> 
>  2. the host creates a set of stage 1 page-table for EL2, installs the
> EL2 vectors, and issues the __pkvm_init hypercall;
> 
>  3. during __pkvm_init, the hypervisor re-creates its stage 1 page-table
> and stores it in the memory pool provided by the host;
> 
>  4. the hypervisor then extends its stage 1 mappings to include a
> vmemmap in the EL2 VA space, hence allowing to use the buddy
> allocator introduced in a previous patch;
> 
>  5. the hypervisor jumps back in the idmap page, switches from the
> host-provided page-table to the new one, and wraps up its
> initialization by enabling the new allocator, before returning to
> the host.
> 
>  6. the host can free the now unused page-table created for EL2, and
> will now need to issue hypercalls to make changes to the EL2 stage 1
> mappings instead of modifying them directly.
> 
> Note that for the sake of simplifying the review, this patch focuses on
> the hypervisor side of things. In other words, this only implements the
> new hypercalls, but does not make use of them from the host yet. The
> host-side changes will follow in a subsequent patch.
> 
> Credits to Will for __pkvm_init_switch_pgd.
> 
> Co-authored-by: Will Deacon 
> Signed-off-by: Will Deacon 
> Signed-off-by: Quentin Perret 
> ---
>  arch/arm64/include/asm/kvm_asm.h |   4 +
>  arch/arm64/include/asm/kvm_host.h|   7 +
>  arch/arm64/include/asm/kvm_hyp.h |   8 ++
>  arch/arm64/include/asm/kvm_pgtable.h |   2 +
>  arch/arm64/kernel/image-vars.h   |  16 +++
>  arch/arm64/kvm/hyp/Makefile  |   2 +-
>  arch/arm64/kvm/hyp/include/nvhe/mm.h |  71 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile |   4 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-init.S   |  31 +
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c   |  49 +++
>  arch/arm64/kvm/hyp/nvhe/mm.c | 173 
>  arch/arm64/kvm/hyp/nvhe/setup.c  | 195 +++
>  arch/arm64/kvm/hyp/pgtable.c |   2 -
>  arch/arm64/kvm/hyp/reserved_mem.c|  92 +
>  arch/arm64/mm/init.c |   3 +
>  15 files changed, 654 insertions(+), 5 deletions(-)

This mostly looks good to me, but in a patch this size I was bound to spot
a few niggles. It is _huge_!

> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-init.S 
> b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
> index c631e29fb001..bc56ea92b812 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-init.S
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-init.S
> @@ -244,4 +244,35 @@ alternative_else_nop_endif
>  
>  SYM_CODE_END(__kvm_handle_stub_hvc)
>  
> +SYM_FUNC_START(__pkvm_init_switch_pgd)
> + /* Turn the MMU off */
> + pre_disable_mmu_workaround
> + mrs x2, sctlr_el2
> + bic x3, x2, #SCTLR_ELx_M
> + msr sctlr_el2, x3
> + isb
> +
> + tlbialle2
> +
> + /* Install the new pgtables */
> + ldr x3, [x0, #NVHE_INIT_PGD_PA]
> + phys_to_ttbr x4, x3
> +alternative_if ARM64_HAS_CNP
> + orr x4, x4, #TTBR_CNP_BIT
> +alternative_else_nop_endif
> + msr ttbr0_el2, x4
> +
> + /* Set the new stack pointer */
> + ldr x0, [x0, #NVHE_INIT_STACK_HYP_VA]
> + mov sp, x0
> +
> + /* And turn the MMU back on! */
> + dsb nsh
> + isb
> + msr sctlr_el2, x2
> + ic  iallu
> + isb

Comparing with the new-fangled set_sctlr_el1 macro we have, this sequence
isn't quite right. Probably best to introduce set_sctlr_el2, and implement
that and the existing macro in terms of set_sctlr_elX or something like
that.

> +void __noreturn __pkvm_init_finalise(void)
> +{
> + struct kvm_host_data *host_data = this_cpu_ptr(_host_data);
> + struct kvm_cpu_context *host_ctxt = _data->host_ctxt;
> + unsigned long nr_pages, reserved_pages, pfn;
> + int ret;
> +
> + /* Now that the vmemmap is backed, install the full-fledged allocator */
> + pfn = hyp_virt_to_pfn(hyp_pgt_base);
> + nr_pages = hyp_s1_pgtable_pages();
> + reserved_pages = hyp_early_alloc_nr_used_pages();
> + ret = hyp_pool_init(, pfn, nr_pages, reserved_pages);
> + if (ret)
> + goto out;
> +
> + pkvm_pgtable_mm_ops.zalloc_page = hyp_zalloc_hyp_page;
> + pkvm_pgtable_mm_ops.phys_to_virt = hyp_phys_to_virt;
> + pkvm_pgtable_mm_ops.virt_to_phys = hyp_virt_to_phys;
> + pkvm_pgtable_mm_ops.get_page = hyp_get_page;
> + pkvm_pgtable_mm_ops.put_page = hyp_put_page;
> + pkvm_pgtable.mm_ops = _pgtable_mm_ops;

Can you do:

pkvm_pgtable_mm_ops = (struct 

Re: [PATCH 4/4] ALSA: hda/cirrus: Add Headphone and Headset MIC Volume Control

2021-03-04 Thread Vitaly Rodionov

On 04/03/2021 1:49 pm, Takashi Iwai wrote:

On Wed, 03 Mar 2021 19:29:59 +0100,
Vitaly Rodionov wrote:

+static int cs8409_cs42l42_volume_get(struct snd_kcontrol *kcontrol,
+struct snd_ctl_elem_value *ucontrol)
+{
+   struct hda_codec *codec = snd_kcontrol_chip(kcontrol);
+   hda_nid_t nid = get_amp_nid(kcontrol);
+   int chs = get_amp_channels(kcontrol);
+   long *valp = ucontrol->value.integer.value;
+   char vol = 0;
+
+   codec_dbg(codec, "%s() nid: %d\n", __func__, nid);
+   snd_hda_power_up(codec);
+   switch (nid) {
+   case CS8409_CS42L42_HP_PIN_NID:
+   mutex_lock(_i2c_mux);
+   if (chs & 1) {
+   vol = -(cs8409_i2c_read(codec, CS42L42_I2C_ADDR,
+   CS8409_CS42L42_REG_HS_VOLUME_CHA, 1));

Better to cache the values instead of i2c read at each time?
Then the unnecessary power up/down sequence can be avoided, too.

Yes, agree. Will be fixed in next version of patch.



thanks,

Takashi


Thank you,

Vitaly



[PATCH] KVM: arm64: Disable LTO in hyp

2021-03-04 Thread Sami Tolvanen
allmodconfig + CONFIG_LTO_CLANG_THIN=y fails to build due to following
linker errors:

  ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21CC):
  relocation R_AARCH64_CONDBR19 out of range: 2031220 is not in
  [-1048576, 1048575]; references hyp_panic
  >>> defined in vmlinux.o

  ld.lld: error: irqbypass.c:(function __guest_enter: .text+0x21E0):
  relocation R_AARCH64_ADR_PREL_LO21 out of range: 2031200 is not in
  [-1048576, 1048575]; references hyp_panic
  >>> defined in vmlinux.o

As LTO is not really necessary for the hypervisor code, disable it for
the hyp directory to fix the build.

Link: https://github.com/ClangBuiltLinux/linux/issues/1317
Reported-by: Nathan Chancellor 
Tested-by: Nathan Chancellor 
Signed-off-by: Sami Tolvanen 
---
 arch/arm64/kvm/hyp/Makefile | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
index 687598e41b21..e8116016e6a8 100644
--- a/arch/arm64/kvm/hyp/Makefile
+++ b/arch/arm64/kvm/hyp/Makefile
@@ -11,3 +11,6 @@ subdir-ccflags-y := -I$(incdir)   
\
$(DISABLE_STACKLEAK_PLUGIN)
 
 obj-$(CONFIG_KVM) += vhe/ nvhe/ pgtable.o
+
+# Disable LTO for the files in this directory
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS))

base-commit: f69d02e37a85645aa90d18cacfff36dba370f797
-- 
2.30.1.766.gb4fecdf3b7-goog



Re: [PATCH v5 16/16] rpmsg: char: return an error if device already open

2021-03-04 Thread Mathieu Poirier
On Fri, Feb 19, 2021 at 12:15:01PM +0100, Arnaud Pouliquen wrote:
> The rpmsg_create_ept function is invoked when the device is opened.
> As only one endpoint must be created per device. It is not possible to
> open the same device twice. But there is nothing to prevent multi open.

s/multi/multiple

> Return -EBUSY when device is already opened to have a generic error
> instead of relying on the back-end to potentially detect the error.
> 
> Without this patch for instance the GLINK driver return -EBUSY while
> the virtio bus return -ENOSPC.
> 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/rpmsg/rpmsg_char.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> index 8d3f9d6c20ad..4cd5b79559f0 100644
> --- a/drivers/rpmsg/rpmsg_char.c
> +++ b/drivers/rpmsg/rpmsg_char.c
> @@ -116,6 +116,9 @@ static int rpmsg_eptdev_open(struct inode *inode, struct 
> file *filp)
>   struct device *dev = >dev;
>   u32 addr = eptdev->chinfo.src;
>  
> + if (eptdev->ept)
> + return -EBUSY;
> +

It would be nice to return the same error code regardless of the backend but at
the same time I feel like it isn't the right place to do this.  I need to think
about this one but for now we can keep it.

>   get_device(dev);
>  
>   /*
> -- 
> 2.17.1
> 


Re: [RFC PATCH 05/18] virt/mshv: create partition ioctl

2021-03-04 Thread Nuno Das Neves
On 2/9/2021 5:15 AM, Vitaly Kuznetsov wrote:
> Nuno Das Neves  writes:
> 
>> Add MSHV_CREATE_PARTITION, which creates an fd to track a new partition.
>> Partition is not yet created in the hypervisor itself.
>> Introduce header files for userspace-facing hyperv structures.
>>
>> Co-developed-by: Lillian Grassin-Drake 
>> Signed-off-by: Lillian Grassin-Drake 
>> Signed-off-by: Nuno Das Neves 
>> ---
>>  Documentation/virt/mshv/api.rst |  12 ++
>>  arch/x86/include/asm/hyperv-tlfs.h  |   1 +
>>  arch/x86/include/uapi/asm/hyperv-tlfs.h | 124 
>>  include/asm-generic/hyperv-tlfs.h   |   1 +
>>  include/linux/mshv.h|  16 +++
>>  include/uapi/asm-generic/hyperv-tlfs.h  |  14 ++
>>  include/uapi/linux/mshv.h   |   7 +
>>  virt/mshv/mshv_main.c   | 179 +---
>>  8 files changed, 338 insertions(+), 16 deletions(-)
>>  create mode 100644 arch/x86/include/uapi/asm/hyperv-tlfs.h
>>  create mode 100644 include/uapi/asm-generic/hyperv-tlfs.h
>>
>> diff --git a/Documentation/virt/mshv/api.rst 
>> b/Documentation/virt/mshv/api.rst
>> index 82e32de48d03..ce651a1738e0 100644
>> --- a/Documentation/virt/mshv/api.rst
>> +++ b/Documentation/virt/mshv/api.rst
>> @@ -39,6 +39,9 @@ root partition can use mshv APIs to create guest 
>> partitions.
>>  
>>  The module is named mshv and can be configured with CONFIG_HYPERV_ROOT_API.
>>  
>> +The uapi header files you need are linux/mshv.h, asm/hyperv-tlfs.h, and
>> +asm-generic/hyperv-tlfs.h.
>> +
>>  Mshv is file descriptor-based, following a similar pattern to KVM.
>>  
>>  To get a handle to the mshv driver, use open("/dev/mshv").
>> @@ -60,3 +63,12 @@ if one of them matches.
>>  This /dev/mshv file descriptor will remain 'locked' to that version as long 
>> as
>>  it is open - this ioctl can only be called once per open.
>>  
>> +3.2 MSHV_CREATE_PARTITION
>> +-
>> +:Type: /dev/mshv ioctl
>> +:Parameters: struct mshv_create_partition
>> +:Returns: partition file descriptor, or -1 on failure
>> +
>> +This ioctl creates a guest partition, returning a file descriptor to use as 
>> a
>> +handle for partition ioctls.
>> +
>> diff --git a/arch/x86/include/asm/hyperv-tlfs.h 
>> b/arch/x86/include/asm/hyperv-tlfs.h
>> index 592c75e51e0f..4cd44ae9bffb 100644
>> --- a/arch/x86/include/asm/hyperv-tlfs.h
>> +++ b/arch/x86/include/asm/hyperv-tlfs.h
>> @@ -11,6 +11,7 @@
>>  
>>  #include 
>>  #include 
>> +#include 
>>  /*
>>   * The below CPUID leaves are present if 
>> VersionAndFeatures.HypervisorPresent
>>   * is set by CPUID(HvCpuIdFunctionVersionAndFeatures).
>> diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h 
>> b/arch/x86/include/uapi/asm/hyperv-tlfs.h
>> new file mode 100644
>> index ..72150c25ffe6
>> --- /dev/null
>> +++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
>> @@ -0,0 +1,124 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +#ifndef _UAPI_ASM_X86_HYPERV_TLFS_USER_H
>> +#define _UAPI_ASM_X86_HYPERV_TLFS_USER_H
>> +
>> +#include 
>> +
>> +#define HV_PARTITION_PROCESSOR_FEATURE_BANKS 2
>> +
>> +union hv_partition_processor_features {
>> +struct {
>> +__u64 sse3_support:1;
>> +__u64 lahf_sahf_support:1;
>> +__u64 ssse3_support:1;
>> +__u64 sse4_1_support:1;
>> +__u64 sse4_2_support:1;
>> +__u64 sse4a_support:1;
>> +__u64 xop_support:1;
>> +__u64 pop_cnt_support:1;
>> +__u64 cmpxchg16b_support:1;
>> +__u64 altmovcr8_support:1;
>> +__u64 lzcnt_support:1;
>> +__u64 mis_align_sse_support:1;
>> +__u64 mmx_ext_support:1;
>> +__u64 amd3dnow_support:1;
>> +__u64 extended_amd3dnow_support:1;
>> +__u64 page_1gb_support:1;
>> +__u64 aes_support:1;
>> +__u64 pclmulqdq_support:1;
>> +__u64 pcid_support:1;
>> +__u64 fma4_support:1;
>> +__u64 f16c_support:1;
>> +__u64 rd_rand_support:1;
>> +__u64 rd_wr_fs_gs_support:1;
>> +__u64 smep_support:1;
>> +__u64 enhanced_fast_string_support:1;
>> +__u64 bmi1_support:1;
>> +__u64 bmi2_support:1;
>> +__u64 hle_support_deprecated:1;
>> +__u64 rtm_support_deprecated:1;
>> +__u64 movbe_support:1;
>> +__u64 npiep1_support:1;
>> +__u64 dep_x87_fpu_save_support:1;
>> +__u64 rd_seed_support:1;
>> +__u64 adx_support:1;
>> +__u64 intel_prefetch_support:1;
>> +__u64 smap_support:1;
>> +__u64 hle_support:1;
>> +__u64 rtm_support:1;
>> +__u64 rdtscp_support:1;
>> +__u64 clflushopt_support:1;
>> +__u64 clwb_support:1;
>> +__u64 sha_support:1;
>> +__u64 x87_pointers_saved_support:1;
>> +

Re: [RFC PATCH 04/18] virt/mshv: request version ioctl

2021-03-04 Thread Nuno Das Neves
On 2/9/2021 5:11 AM, Vitaly Kuznetsov wrote:
> Nuno Das Neves  writes:
> 
>> Reserve ioctl number in userpsace-api/ioctl/ioctl-number.rst
>> Introduce MSHV_REQUEST_VERSION ioctl.
>> Introduce documentation for /dev/mshv in Documentation/virt/mshv
>>
>> Signed-off-by: Nuno Das Neves 
>> ---
>>  .../userspace-api/ioctl/ioctl-number.rst  |  2 +
>>  Documentation/virt/mshv/api.rst   | 62 +++
>>  include/linux/mshv.h  | 11 
>>  include/uapi/linux/mshv.h | 19 ++
>>  virt/mshv/mshv_main.c | 49 +++
>>  5 files changed, 143 insertions(+)
>>  create mode 100644 Documentation/virt/mshv/api.rst
>>  create mode 100644 include/linux/mshv.h
>>  create mode 100644 include/uapi/linux/mshv.h
>>
>> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst 
>> b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> index 55a2d9b2ce33..13a4d3ecafca 100644
>> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst
>> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
>> @@ -343,6 +343,8 @@ Code  Seq#Include File   
>> Comments
>>  0xB5  00-0F  uapi/linux/rpmsg.h  
>> 
>>  0xB6  alllinux/fpga-dfl.h
>>  0xB7  alluapi/linux/remoteproc_cdev.h
>> 
>> +0xB8  alluapi/linux/mshv.h   
>> Microsoft Hypervisor root partition APIs
>> + 
>> 
>>  0xC0  00-0F  linux/usb/iowarrior.h
>>  0xCA  00-0F  uapi/misc/cxl.h
>>  0xCA  10-2F  uapi/misc/ocxl.h
>> diff --git a/Documentation/virt/mshv/api.rst 
>> b/Documentation/virt/mshv/api.rst
>> new file mode 100644
>> index ..82e32de48d03
>> --- /dev/null
>> +++ b/Documentation/virt/mshv/api.rst
>> @@ -0,0 +1,62 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +=
>> +Microsoft Hypervisor Root Partition API Documentation
>> +=
>> +
>> +1. Overview
>> +===
>> +
>> +This document describes APIs for creating and managing guest virtual 
>> machines
>> +when running Linux as the root partition on the Microsoft Hypervisor.
>> +
>> +This API is not yet stable.
>> +
>> +2. Glossary/Terms
>> +=
>> +
>> +hv
>> +--
>> +Short for Hyper-V. This name is used in the kernel to describe interfaces to
>> +the Microsoft Hypervisor.
>> +
>> +mshv
>> +
>> +Short for Microsoft Hypervisor. This is the name of the userland API module
>> +described in this document.
>> +
>> +Partition
>> +-
>> +A virtual machine running on the Microsoft Hypervisor.
>> +
>> +Root Partition
>> +--
>> +The partition that is created and assumes control when the machine boots. 
>> The
>> +root partition can use mshv APIs to create guest partitions.
>> +
>> +3. API description
>> +==
>> +
>> +The module is named mshv and can be configured with CONFIG_HYPERV_ROOT_API.
>> +
>> +Mshv is file descriptor-based, following a similar pattern to KVM.
>> +
>> +To get a handle to the mshv driver, use open("/dev/mshv").
>> +
>> +3.1 MSHV_REQUEST_VERSION
>> +
>> +:Type: /dev/mshv ioctl
>> +:Parameters: pointer to a u32
>> +:Returns: 0 on success
>> +
>> +Before issuing any other ioctls, a MSHV_REQUEST_VERSION ioctl must be 
>> called to
>> +establish the interface version with the kernel module.
>> +
>> +The caller should pass the MSHV_VERSION as an argument.
>> +
>> +The kernel module will check which interface versions it supports and 
>> return 0
>> +if one of them matches.
>> +
>> +This /dev/mshv file descriptor will remain 'locked' to that version as long 
>> as
>> +it is open - this ioctl can only be called once per open.
>> +
> 
> KVM used to have KVM_GET_API_VERSION too but this turned out to be not
> very convenient so we use capabilities (KVM_CHECK_EXTENSION/KVM_ENABLE_CAP)
> instead.
> 

The goal of MSHV_REQUEST_VERSION is to support changes to APIs in the core set.
When we add new features/ioctls beyond the core we can use an 
extension/capability
approach like KVM.

>> diff --git a/include/linux/mshv.h b/include/linux/mshv.h
>> new file mode 100644
>> index ..a0982fe2c0b8
>> --- /dev/null
>> +++ b/include/linux/mshv.h
>> @@ -0,0 +1,11 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +#ifndef _LINUX_MSHV_H
>> +#define _LINUX_MSHV_H
>> +
>> +/*
>> + * Microsoft Hypervisor root partition driver for /dev/mshv
>> + */
>> +
>> +#include 
>> +
>> +#endif
>> diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
>> new file mode 100644
>> index ..dd30fc2f0a80
>> --- /dev/null
>> +++ b/include/uapi/linux/mshv.h
>> @@ -0,0 +1,19 @@
>> +/* 

Re: [PATCH 0/7] Add support for AEAD algorithms in Qualcomm Crypto Engine driver

2021-03-04 Thread Thara Gopinath
On Thu, 4 Mar 2021 at 00:30, Herbert Xu  wrote:
>
> On Thu, Feb 25, 2021 at 01:27:09PM -0500, Thara Gopinath wrote:
> > Enable support for AEAD algorithms in Qualcomm CE driver.  The first three
> > patches in this series are cleanups and add a few missing pieces required
> > to add support for AEAD algorithms.  Patch 4 introduces supported AEAD
> > transformations on Qualcomm CE.  Patches 5 and 6 implements the h/w
> > infrastructure needed to enable and run the AEAD transformations on
> > Qualcomm CE.  Patch 7 adds support to queue fallback algorithms in case of
> > unsupported special inputs.
> >
> > This series is dependant on https://lkml.org/lkml/2021/2/11/1052.
>
> Did this patch series pass the fuzz tests?

Hi Herbert,

Yes it did. The last patch adds fallback for unsupported cases and
this will make it pass the fuzz tests.

>
> Thanks,
> --
> Email: Herbert Xu 
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



-- 
Warm Regards
Thara


Re: [PATCH v5 15/16] rpmsg: char: no dynamic endpoint management for the default one

2021-03-04 Thread Mathieu Poirier
There has to be a capital letter at the start of the title:

rpmsg: char: No dynamic endpoint management for the default one

Please fix for all the patches.

On Fri, Feb 19, 2021 at 12:15:00PM +0100, Arnaud Pouliquen wrote:
> Do not dynamically manage the default endpoint. The ept address must
> not change.
> This update is needed to manage the RPMSG_CREATE_DEV_IOCTL. In this
> case a default endpoint is used and it's address must not change or
> been reused by another service.

The above is very difficult to understand.  I am not sure about introducing
RPMSG_CREATE_DEV_IOCTL in this patchset.  More on that in an upcoming comment.

> 
> Signed-off-by: Arnaud Pouliquen 
> ---
>  drivers/rpmsg/rpmsg_char.c | 28 +---
>  1 file changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/rpmsg/rpmsg_char.c b/drivers/rpmsg/rpmsg_char.c
> index c98b0e69679b..8d3f9d6c20ad 100644
> --- a/drivers/rpmsg/rpmsg_char.c
> +++ b/drivers/rpmsg/rpmsg_char.c
> @@ -114,14 +114,23 @@ static int rpmsg_eptdev_open(struct inode *inode, 
> struct file *filp)
>   struct rpmsg_endpoint *ept;
>   struct rpmsg_device *rpdev = eptdev->rpdev;
>   struct device *dev = >dev;
> + u32 addr = eptdev->chinfo.src;
>  
>   get_device(dev);
>  
> - ept = rpmsg_create_ept(rpdev, rpmsg_ept_cb, eptdev, eptdev->chinfo);
> - if (!ept) {
> - dev_err(dev, "failed to open %s\n", eptdev->chinfo.name);
> - put_device(dev);
> - return -EINVAL;
> + /*
> +  * The RPMsg device can has been created by a ns announcement. In this
> +  * case a default endpoint has been created. Reuse it to avoid to manage
> +  * a new address on each open close.
> +  */

Here too it is very difficult to understand because the comment
doesn't not describe what the code does.  The code creates an enpoint if it
has not been created, which means /dev/rpmsgX was created from the ioctl. 

> + ept = rpdev->ept;
> + if (!ept || addr != ept->addr) {
> + ept = rpmsg_create_ept(rpdev, rpmsg_ept_cb, eptdev, 
> eptdev->chinfo);
> + if (!ept) {
> + dev_err(dev, "failed to open %s\n", 
> eptdev->chinfo.name);
> + put_device(dev);
> + return -EINVAL;
> + }
>   }
>  
>   eptdev->ept = ept;
> @@ -133,12 +142,17 @@ static int rpmsg_eptdev_open(struct inode *inode, 
> struct file *filp)
>  static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
>  {
>   struct rpmsg_eptdev *eptdev = cdev_to_eptdev(inode->i_cdev);
> + struct rpmsg_device *rpdev = eptdev->rpdev;
>   struct device *dev = >dev;
>  
> - /* Close the endpoint, if it's not already destroyed by the parent */
> + /*
> +  * Close the endpoint, if it's not already destroyed by the parent and 
> it is not the
> +  * default one.
> +  */
>   mutex_lock(>ept_lock);
>   if (eptdev->ept) {
> - rpmsg_destroy_ept(eptdev->ept);
> + if (eptdev->ept != rpdev->ept)
> + rpmsg_destroy_ept(eptdev->ept);
>   eptdev->ept = NULL;
>   }
>   mutex_unlock(>ept_lock);
> -- 
> 2.17.1
> 


Re: [PATCH 2/4] ALSA: hda/cirrus: Add support for CS8409 HDA bridge and CS42L42 companion codec.

2021-03-04 Thread Vitaly Rodionov

On 04/03/2021 1:39 pm, Takashi Iwai wrote:
Thank you very much for quick response!

On Wed, 03 Mar 2021 19:29:57 +0100,
Vitaly Rodionov wrote:

+static const struct hda_verb cs8409_cs42l42_init_verbs[] = {
+   { 0x01, AC_VERB_SET_POWER_STATE, 0x },/* AFG: D0 */

I guess this power state change is superfluous.  The AFG node is
already powered up when the codec probe or init is called.

Yes, This should be removed



+   { 0x01, AC_VERB_SET_GPIO_DIRECTION, 0x0020 }, /* GPIO 5 out, 3,4 in */
+   { 0x01, AC_VERB_SET_GPIO_DATA, 0x },  /* GPIO  data 0 */
+   { 0x01, AC_VERB_SET_GPIO_MASK, 0x003f },  /* Enable GPIO */

Those are handled in spec->gpio_dir, gpio->data and gpio->mask
fields.

Will handle via spec fields.



+   { 0x01, AC_VERB_SET_GPIO_WAKE_MASK, 0x0018 }, /* WAKE from GPIO 3,4 */
+   { 0x47, AC_VERB_SET_PROC_STATE, 0x0001 }, /* Enable VPW processing  
*/
+   { 0x47, AC_VERB_SET_COEF_INDEX, 0x0002 }, /* Configure GPIO 6,7 */
+   { 0x47, AC_VERB_SET_PROC_COEF,  0x0080 }, /* I2C mode */
+   { 0x47, AC_VERB_SET_COEF_INDEX, 0x005b }, /* Set I2C bus speed */
+   { 0x47, AC_VERB_SET_PROC_COEF,  0x0200 }, /* 100kHz I2C_STO = 2 */

Those remaining verbs are good in the init verbs.  But I suppose they
have to be applied at the resume as well?  But...
This was added in the next patch, but you are right we should handle it 
here.



+static int cs8409_cs42l42_fixup(struct hda_codec *codec)
+{
+   int err = 0;
+   struct cs_spec *spec = codec->spec;
+   unsigned int pincap = 0;
+
+   /* Basic initial sequence for specific hw configuration */
+   snd_hda_sequence_write(codec, cs8409_cs42l42_init_verbs);

... it seems applied only at the fixup call at parsing?

Ditto about cs8409_cs42l42_hw_init(codec).

Yes, needs to be fixed. I will make v2 patch set.



thanks,

Takashi





Re: [PATCH v2 1/5] irqdomain: Introduce irq_domain_create_simple() API

2021-03-04 Thread Marc Zyngier
Andy,

On Thu, 04 Mar 2021 15:02:11 +,
Andy Shevchenko  wrote:
> 
> Linus Walleij pointed out that ird_domain_add_simple() gained
> additional functionality and can't be anymore replaced with
> a simple conditional. In preparation to upgrade GPIO library
> to use fwnode, introduce irq_domain_create_simple() API which is
> functional equivalent to the existing irq_domain_add_simple(),
> but takes a pointer to the struct fwnode_handle as a parameter.
> 
> While at it, amend documentation to mention irq_domain_create_*()
> functions where it makes sense.
> 
> Signed-off-by: Andy Shevchenko 

[...]

> diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
> index 42d196805f58..1d4a8e7c5d5f 100644
> --- a/include/linux/irqdomain.h
> +++ b/include/linux/irqdomain.h
> @@ -265,6 +265,11 @@ struct irq_domain *irq_domain_add_simple(struct 
> device_node *of_node,
>unsigned int first_irq,
>const struct irq_domain_ops *ops,
>void *host_data);
> +struct irq_domain *irq_domain_create_simple(struct fwnode_handle *fwnode,
> + unsigned int size,
> + unsigned int first_irq,
> + const struct irq_domain_ops *ops,
> + void *host_data);
>  struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
>unsigned int size,
>unsigned int first_irq,
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index 288151393a06..418548ea13cf 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -317,10 +317,20 @@ struct irq_domain *irq_domain_add_simple(struct 
> device_node *of_node,
>unsigned int first_irq,
>const struct irq_domain_ops *ops,
>void *host_data)
> +{
> + return irq_domain_create_simple(of_node_to_fwnode(of_node), size, 
> first_irq, ops, host_data);
> +}
> +EXPORT_SYMBOL_GPL(irq_domain_add_simple);

Please move this as an inline helper in linux/irqdomain.h, so that we
can drop this export altogether.

Thanks,

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH v2 1/3] regmap-irq: Add support for peripheral offsets

2021-03-04 Thread Guru Das Srinagesh
Hi Mark,

Sorry for the delay in my response.

On Thu, Nov 12, 2020 at 07:33:12PM +, Mark Brown wrote:
> It is difficult to follow what this change is supposed to do, in part
> because it looks like this is in fact two separate changes, one adding
> the _base feature and another adding the polarity feature.  These should
> each be in a separate patch if that is the case, and I think each needs
> a clearer changelog - I'm not entirely sure what the polarity feature is
> supposed to do.  Nothing here says what POLARITY_HI and POLARITY_LO are,
> how they interact or anything.

Sure, I can split this into two patches for easier review.

The POLARITY_HI and POLARITY_LO registers were described very briefly in
the cover letter. If an interrupt is already configured as either edge-
or level-triggered, setting the corresponding bit for it in the
POLARITY_HI register further configures it as rising-edge or level-high
triggered (as the case may be), while setting the same bit in
POLARITY_LO further configures it as falling-edge or level-low
triggered. I could certainly add this information to the commit message
as well.

> 
> For the address offsets I'm not sure that this is the best way to
> represent things.  It looks like the hardware this is trying to describe
> is essentially a bunch of separate interrupt controllers that happen to
> share an upstream interrupt

Sorry but isn't this essentially the same as what the framework already knows as
the "sub-irq" concept, with the key difference that the register stride
is not fixed? Everything else is the same (except for a couple of minor
points noted below) - a main IRQ register that indicates sub-irq blocks
that have unhandled interrupts, as well as interrupt handling and
servicing.

The two minor differences are:
  - type_buf handling in regmap_irq_set_type() for IRQ_TYPE_LEVEL_HIGH and
IRQ_TYPE_LEVEL_LOW
  - Two extra registers: POLARITY_HI and POLARITY_LO

> clearer if at least the implementation looked like this.  Instead of
> having to check for this array of offsets at every use point (which is
> going to be rarely used and hence prone to bugs)

Well, using irq_reg_stride already does exactly this - calculating the
right register to access at every use point, as an offset from the _base
register (status, ack, type, et c.). Peripheral offsets would just be
another way of calculating the right register, that's all. And we could
have a macro as well.

> we'd have a set of separate regmap-irqs and then we'd mostly only have
> to loop through them on handling, the bulk of the implementation
> wouldn't have to worry about this special case.
> 
> Historically genirq didn't support sharing threaded interrupts, if
> that's not changed we'd need to open code everything inside regmap-irq
> but it would be doable, or ideally genirq could grow this feature.  If
> it's done inside regmap you'd have a separate API that took an array of
> regmap-irq configurations instead of just one and then when an interrupt
> is delivered just loops through all of them handling it.  A quick scan
> through the interrupt code suggests it might be able to cope with shared
> IRQs now though which would make life easier.

Sure, I can look into how this approach would look like, but given that
the QCOM register arrangement of main vs sub-irq is essentially the same
as what the framework currently understands, couldn't we simply have a
macro to change the way the right register offset is calculated
(irq_reg_stride vs. peripheral offsets)?

Also, could you elaborate more on the genirq route? I'm not sure where
to start looking to evaluate one route vs the other.

Thank you.

Guru Das.


Re: [PATCH v2] sched: Optimize __calc_delta.

2021-03-04 Thread Sedat Dilek
On Thu, Mar 4, 2021 at 6:34 PM 'Nick Desaulniers' via Clang Built
Linux  wrote:
>
> On Wed, Mar 3, 2021 at 2:48 PM Josh Don  wrote:
> >
> > From: Clement Courbet 
> >
> > A significant portion of __calc_delta time is spent in the loop
> > shifting a u64 by 32 bits. Use `fls` instead of iterating.
> >
> > This is ~7x faster on benchmarks.
> >
> > The generic `fls` implementation (`generic_fls`) is still ~4x faster
> > than the loop.
> > Architectures that have a better implementation will make use of it. For
> > example, on X86 we get an additional factor 2 in speed without dedicated
> > implementation.
> >
> > On gcc, the asm versions of `fls` are about the same speed as the
> > builtin. On clang, the versions that use fls are more than twice as
> > slow as the builtin. This is because the way the `fls` function is
> > written, clang puts the value in memory:
> > https://godbolt.org/z/EfMbYe. This bug is filed at
> > https://bugs.llvm.org/show_bug.cgi?id=49406.
>
> Hi Josh, Thanks for helping get this patch across the finish line.
> Would you mind updating the commit message to point to
> https://bugs.llvm.org/show_bug.cgi?id=20197?
>
> >
> > ```
> > name   cpu/op
> > BM_Calc<__calc_delta_loop> 9.57ms ±12%
> > BM_Calc<__calc_delta_generic_fls>  2.36ms ±13%
> > BM_Calc<__calc_delta_asm_fls>  2.45ms ±13%
> > BM_Calc<__calc_delta_asm_fls_nomem>1.66ms ±12%
> > BM_Calc<__calc_delta_asm_fls64>2.46ms ±13%
> > BM_Calc<__calc_delta_asm_fls64_nomem>  1.34ms ±15%
> > BM_Calc<__calc_delta_builtin>  1.32ms ±11%
> > ```
> >
> > Signed-off-by: Clement Courbet 
> > Signed-off-by: Josh Don 
> > ---
> >  kernel/sched/fair.c  | 19 +++
> >  kernel/sched/sched.h |  1 +
> >  2 files changed, 12 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 8a8bd7b13634..a691371960ae 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -229,22 +229,25 @@ static void __update_inv_weight(struct load_weight 
> > *lw)
> >  static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct 
> > load_weight *lw)
> >  {
> > u64 fact = scale_load_down(weight);
> > +   u32 fact_hi = (u32)(fact >> 32);
> > int shift = WMULT_SHIFT;
> > +   int fs;
> >
> > __update_inv_weight(lw);
> >
> > -   if (unlikely(fact >> 32)) {
> > -   while (fact >> 32) {
> > -   fact >>= 1;
> > -   shift--;
> > -   }
> > +   if (unlikely(fact_hi)) {
> > +   fs = fls(fact_hi);
> > +   shift -= fs;
> > +   fact >>= fs;
> > }
> >
> > fact = mul_u32_u32(fact, lw->inv_weight);
> >
> > -   while (fact >> 32) {
> > -   fact >>= 1;
> > -   shift--;
> > +   fact_hi = (u32)(fact >> 32);
> > +   if (fact_hi) {
> > +   fs = fls(fact_hi);
> > +   shift -= fs;
> > +   fact >>= fs;
> > }
> >
> > return mul_u64_u32_shr(delta_exec, fact, shift);
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index 10a1522b1e30..714af71cf983 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -36,6 +36,7 @@
> >  #include 
> >
> >  #include 
> > +#include 
>
> This hunk of the patch is curious.  I assume that bitops.h is needed
> for fls(); if so, why not #include it in kernel/sched/fair.c?
> Otherwise this potentially hurts compile time for all TUs that include
> kernel/sched/sched.h.
>

I have v2 as-is in my custom patchset and booted right now on bare metal.

As Nick points out moving the include makes sense to me.
We have a lot of include at the wrong places increasing build-time.

- Sedat -

> >  #include 
> >  #include 
> >  #include 
> > --
> > 2.30.1.766.gb4fecdf3b7-goog
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers
>
> --
> You received this message because you are subscribed to the Google Groups 
> "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to clang-built-linux+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/clang-built-linux/CAKwvOdmijctJfM3gNfwEVjaQyp3LZkhnAwgsT7EBhsSBJyfLAA%40mail.gmail.com.


Re: [RFC v4] copy_file_range.2: Update cross-filesystem support for 5.12

2021-03-04 Thread Alejandro Colomar (man-pages)

Hi Darrick,

On 3/4/21 6:13 PM, Darrick J. Wong wrote:

On Thu, Mar 04, 2021 at 10:38:07AM +0100, Alejandro Colomar wrote:

+However, on some virtual filesystems,
+the call failed to copy, while still reporting success.


...success, or merely a short copy?


Okay.



(The rest looks reasonable (at least by c_f_r standards) to me.)


I'm curious, what does "c_f_r standards" mean? :)

Cheers,

Alex

--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/


Re: [RFC PATCH 01/18] x86/hyperv: convert hyperv statuses to linux error codes

2021-03-04 Thread Nuno Das Neves
On 2/9/2021 5:04 AM, Vitaly Kuznetsov wrote:
> Nuno Das Neves  writes:
> 
>> Return linux-friendly error codes from hypercall wrapper functions.
>> This will be needed in the mshv module.
>>
>> Signed-off-by: Nuno Das Neves 
>> ---
>>  arch/x86/hyperv/hv_proc.c | 30 ++---
>>  arch/x86/include/asm/mshyperv.h   |  1 +
>>  include/asm-generic/hyperv-tlfs.h | 32 +--
>>  3 files changed, 50 insertions(+), 13 deletions(-)
>>
>> diff --git a/arch/x86/hyperv/hv_proc.c b/arch/x86/hyperv/hv_proc.c
>> index 0fd972c9129a..8f86f8e86748 100644
>> --- a/arch/x86/hyperv/hv_proc.c
>> +++ b/arch/x86/hyperv/hv_proc.c
>> @@ -18,6 +18,30 @@
>>  #define HV_DEPOSIT_MAX_ORDER (8)
>>  #define HV_DEPOSIT_MAX (1 << HV_DEPOSIT_MAX_ORDER)
>>  
>> +int hv_status_to_errno(int hv_status)
>> +{
>> +switch (hv_status) {
>> +case HV_STATUS_SUCCESS:
>> +return 0;
>> +case HV_STATUS_INVALID_PARAMETER:
>> +case HV_STATUS_UNKNOWN_PROPERTY:
>> +case HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE:
>> +case HV_STATUS_INVALID_VP_INDEX:
>> +case HV_STATUS_INVALID_REGISTER_VALUE:
>> +case HV_STATUS_INVALID_LP_INDEX:
>> +return EINVAL;
>> +case HV_STATUS_ACCESS_DENIED:
>> +case HV_STATUS_OPERATION_DENIED:
>> +return EACCES;
>> +case HV_STATUS_NOT_ACKNOWLEDGED:
>> +case HV_STATUS_INVALID_VP_STATE:
>> +case HV_STATUS_INVALID_PARTITION_STATE:
>> +return EBADFD;
>> +}
>> +return ENOTRECOVERABLE;
>> +}
>> +EXPORT_SYMBOL_GPL(hv_status_to_errno);
>> +
>>  /*
>>   * Deposits exact number of pages
>>   * Must be called with interrupts enabled
>> @@ -99,7 +123,7 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 
>> num_pages)
>>  
>>  if (status != HV_STATUS_SUCCESS) {
>>  pr_err("Failed to deposit pages: %d\n", status);
>> -ret = status;
>> +ret = -hv_status_to_errno(status);
> 
> "-hv_status_to_errno" looks weird, could we just return
> '-EINVAL'/'-EACCES'/... from hv_status_to_errno() instead?
> 

Yes, good idea.

>>  goto err_free_allocations;
>>  }
>>  
>> @@ -155,7 +179,7 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 
>> apic_id)
>>  if (status != HV_STATUS_SUCCESS) {
>>  pr_err("%s: cpu %u apic ID %u, %d\n", __func__,
>> lp_index, apic_id, status);
>> -ret = status;
>> +ret = -hv_status_to_errno(status);
>>  }
>>  break;
>>  }
>> @@ -203,7 +227,7 @@ int hv_call_create_vp(int node, u64 partition_id, u32 
>> vp_index, u32 flags)
>>  if (status != HV_STATUS_SUCCESS) {
>>  pr_err("%s: vcpu %u, lp %u, %d\n", __func__,
>> vp_index, flags, status);
>> -ret = status;
>> +ret = -hv_status_to_errno(status);
>>  }
>>  break;
>>  }
>> diff --git a/arch/x86/include/asm/mshyperv.h 
>> b/arch/x86/include/asm/mshyperv.h
>> index cbee72550a12..eb75faa4d4c5 100644
>> --- a/arch/x86/include/asm/mshyperv.h
>> +++ b/arch/x86/include/asm/mshyperv.h
>> @@ -243,6 +243,7 @@ int hyperv_flush_guest_mapping_range(u64 as,
>>  int hyperv_fill_flush_guest_mapping_list(
>>  struct hv_guest_mapping_flush_list *flush,
>>  u64 start_gfn, u64 end_gfn);
>> +int hv_status_to_errno(int hv_status);
>>  
>>  extern bool hv_root_partition;
>>  
>> diff --git a/include/asm-generic/hyperv-tlfs.h 
>> b/include/asm-generic/hyperv-tlfs.h
>> index dd385c6a71b5..445244192fa4 100644
>> --- a/include/asm-generic/hyperv-tlfs.h
>> +++ b/include/asm-generic/hyperv-tlfs.h
>> @@ -181,16 +181,28 @@ enum HV_GENERIC_SET_FORMAT {
>>  #define HV_HYPERCALL_REP_START_MASK GENMASK_ULL(59, 48)
>>  
>>  /* hypercall status code */
>> -#define HV_STATUS_SUCCESS   0
>> -#define HV_STATUS_INVALID_HYPERCALL_CODE2
>> -#define HV_STATUS_INVALID_HYPERCALL_INPUT   3
>> -#define HV_STATUS_INVALID_ALIGNMENT 4
>> -#define HV_STATUS_INVALID_PARAMETER 5
>> -#define HV_STATUS_OPERATION_DENIED  8
>> -#define HV_STATUS_INSUFFICIENT_MEMORY   11
>> -#define HV_STATUS_INVALID_PORT_ID   17
>> -#define HV_STATUS_INVALID_CONNECTION_ID 18
>> -#define HV_STATUS_INSUFFICIENT_BUFFERS  19
>> +#define HV_STATUS_SUCCESS   0x0
>> +#define HV_STATUS_INVALID_HYPERCALL_CODE0x2
>> +#define HV_STATUS_INVALID_HYPERCALL_INPUT   0x3
>> +#define HV_STATUS_INVALID_ALIGNMENT 0x4
>> +#define HV_STATUS_INVALID_PARAMETER 0x5
>> +#define HV_STATUS_ACCESS_DENIED 0x6
>> +#define HV_STATUS_INVALID_PARTITION_STATE   0x7
>> +#define HV_STATUS_OPERATION_DENIED  0x8
>> +#define 

Re: [PATCH] mm: be more verbose for alloc_contig_range faliures

2021-03-04 Thread Minchan Kim
On Thu, Mar 04, 2021 at 10:11:35AM -0800, Minchan Kim wrote:
> On Thu, Mar 04, 2021 at 06:23:09PM +0100, David Hildenbrand wrote:
> > > > You want to debug something, so you try triggering it and capturing 
> > > > debug
> > > > data. There are not that many alloc_contig_range() users such that this
> > > > would really be an issue to isolate ...
> > > 
> > > cma_alloc uses alloc_contig_range and cma_alloc has lots of users.
> > > Even, it is expoerted by dmabuf so any userspace would trigger the
> > > allocation by their own. Some of them could be tolerant for the failure,
> > > rest of them could be critical. We should't expect it by limited kernel
> > > usecase.
> > 
> > Assume you are debugging allocation failures. You either collect the data
> > yourself or ask someone to send you that output. You care about any
> > alloc_contig_range() allocation failures that shouldn't happen, don't you?
> > 
> > > 
> > > > 
> > > > Strictly speaking: any allocation failure on ZONE_MOVABLE or CMA is
> > > > problematic (putting aside NORETRY logic and similar aside). So any such
> > > > page you hit is worth investigating and, therefore, worth getting 
> > > > logged for
> > > > debugging purposes.
> > > 
> > > If you believe the every alloc_contig_range failure is problematic
> > 
> > Every one where we should have guarantees I guess: ZONE_MOVABLE or
> > MIGRAT_CMA. On ZONE_NORMAL, there are no guarantees.
> 
> Indeed.

How about this?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 238d0fc232aa..489e557b9390 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8481,7 +8481,8 @@ static inline void dump_migrate_failure_pages(struct 
list_head *page_list)

 /* [start, end) must belong to a single zone. */
 static int __alloc_contig_migrate_range(struct compact_control *cc,
-   unsigned long start, unsigned long end)
+   unsigned long start, unsigned long end,
+   bool nofail)
 {
/* This function is based on compact_zone() from compaction.c. */
unsigned int nr_reclaimed;
@@ -8522,7 +8523,8 @@ static int __alloc_contig_migrate_range(struct 
compact_control *cc,
NULL, (unsigned long), cc->mode, 
MR_CONTIG_RANGE);
}
if (ret < 0) {
-   dump_migrate_failure_pages(>migratepages);
+   if (ret == -EBUSY && nofail)
+   dump_migrate_failure_pages(>migratepages);
putback_movable_pages(>migratepages);
return ret;
}
@@ -8610,7 +8612,9 @@ int alloc_contig_range(unsigned long start, unsigned long 
end,
 * allocated.  So, if we fall through be sure to clear ret so that
 * -EBUSY is not accidentally used or returned to caller.
 */
-   ret = __alloc_contig_migrate_range(, start, end);
+   ret = __alloc_contig_migrate_range(, start, end,
+   migratetype == CMA ||
+   zone_idx(cc.zone) == ZONE_MOVABLE);
if (ret && ret != -EBUSY)
goto done;
ret =0;



Re: [PATCH v1] powerpc: Include running function as first entry in save_stack_trace() and friends

2021-03-04 Thread Marco Elver
On Thu, 4 Mar 2021 at 19:02, Mark Rutland  wrote:
> On Thu, Mar 04, 2021 at 06:25:33PM +0100, Marco Elver wrote:
> > On Thu, Mar 04, 2021 at 04:59PM +, Mark Rutland wrote:
> > > On Thu, Mar 04, 2021 at 04:30:34PM +0100, Marco Elver wrote:
> > > > On Thu, 4 Mar 2021 at 15:57, Mark Rutland  wrote:
> > > > > [adding Mark Brown]
> > > > >
> > > > > The bigger problem here is that skipping is dodgy to begin with, and
> > > > > this is still liable to break in some cases. One big concern is that
> > > > > (especially with LTO) we cannot guarantee the compiler will not inline
> > > > > or outline functions, causing the skipp value to be too large or too
> > > > > small. That's liable to happen to callers, and in theory (though
> > > > > unlikely in practice), portions of arch_stack_walk() or
> > > > > stack_trace_save() could get outlined too.
> > > > >
> > > > > Unless we can get some strong guarantees from compiler folk such that 
> > > > > we
> > > > > can guarantee a specific function acts boundary for unwinding (and
> > > > > doesn't itself get split, etc), the only reliable way I can think to
> > > > > solve this requires an assembly trampoline. Whatever we do is liable 
> > > > > to
> > > > > need some invasive rework.
> > > >
> > > > Will LTO and friends respect 'noinline'?
> > >
> > > I hope so (and suspect we'd have more problems otherwise), but I don't
> > > know whether they actually so.
> > >
> > > I suspect even with 'noinline' the compiler is permitted to outline
> > > portions of a function if it wanted to (and IIUC it could still make
> > > specialized copies in the absence of 'noclone').
> > >
> > > > One thing I also noticed is that tail calls would also cause the stack
> > > > trace to appear somewhat incomplete (for some of my tests I've
> > > > disabled tail call optimizations).
> > >
> > > I assume you mean for a chain A->B->C where B tail-calls C, you get a
> > > trace A->C? ... or is A going missing too?
> >
> > Correct, it's just the A->C outcome.
>
> I'd assumed that those cases were benign, e.g. for livepatching what
> matters is what can be returned to, so B disappearing from the trace
> isn't a problem there.
>
> Is the concern debugability, or is there a functional issue you have in
> mind?

For me, it's just been debuggability, and reliable test cases.

> > > > Is there a way to also mark a function non-tail-callable?
> > >
> > > I think this can be bodged using __attribute__((optimize("$OPTIONS")))
> > > on a caller to inhibit TCO (though IIRC GCC doesn't reliably support
> > > function-local optimization options), but I don't expect there's any way
> > > to mark a callee as not being tail-callable.
> >
> > I don't think this is reliable. It'd be
> > __attribute__((optimize("-fno-optimize-sibling-calls"))), but doesn't
> > work if applied to the function we do not want to tail-call-optimize,
> > but would have to be applied to the function that does the tail-calling.
>
> Yup; that's what I meant then I said you could do that on the caller but
> not the callee.
>
> I don't follow why you'd want to put this on the callee, though, so I
> think I'm missing something. Considering a set of functions in different
> compilation units:
>
>   A->B->C->D->E->F->G->H->I->J->K

I was having this problem with KCSAN, where the compiler would
tail-call-optimize __tsan_X instrumentation. This would mean that
KCSAN runtime functions ended up in the trace, but the function where
the access happened would not. However, I don't care about the runtime
functions, and instead want to see the function where the access
happened. In that case, I'd like to just mark __tsan_X and any other
kcsan instrumentation functions as do-not-tail-call-optimize, which
would solve the problem.

The solution today is that when you compile a kernel with KCSAN, every
instrumented TU is compiled with -fno-optimize-sibling-calls. The
better solution would be to just mark KCSAN runtime functions somehow,
but permit tail calling other things. Although, I probably still want
to see the full trace, and would decide that having
-fno-optimize-sibling-calls is a small price to pay in a
debug-only-kernel to get complete traces.

> ... if K were marked in this way, and J was compiled with visibility of
> this, J would stick around, but J's callers might not, and so the a
> trace might see:
>
>   A->J->K
>
> ... do you just care about the final caller, i.e. you just need
> certainty that J will be in the trace?

Yes. But maybe it's a special problem that only sanitizers have.

> If so, we can somewhat bodge that by having K have an __always_inline
> wrapper which has a barrier() or similar after the real call to K, so
> the call couldn't be TCO'd.
>
> Otherwise I'd expect we'd probably need to disable TCO generally.

Thanks,
-- Marco


mm/page_alloc.c:3728:1: warning: stack frame size of 2064 bytes in function 'get_page_from_freelist'

2021-03-04 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   f69d02e37a85645aa90d18cacfff36dba370f797
commit: 1d91df85f399adbe4f318f3e74ac5a5d84c0ca7c mm/page_alloc: handle a 
missing case for memalloc_nocma_{save/restore} APIs
date:   5 months ago
config: powerpc-randconfig-r023-20210304 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 
eec7f8f7b1226be422a76542cb403d02538f453a)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install powerpc cross compiling tool for clang build
# apt-get install binutils-powerpc-linux-gnu
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1d91df85f399adbe4f318f3e74ac5a5d84c0ca7c
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 1d91df85f399adbe4f318f3e74ac5a5d84c0ca7c
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :239:1: note: expanded from here
   __do_insw
   ^
   arch/powerpc/include/asm/io.h:542:56: note: expanded from macro '__do_insw'
   #define __do_insw(p, b, n)  readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from mm/page_alloc.c:20:
   In file included from include/linux/highmem.h:10:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
   ^~~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :244:1: note: expanded from here
   __do_insl
   ^
   arch/powerpc/include/asm/io.h:543:56: note: expanded from macro '__do_insl'
   #define __do_insl(p, b, n)  readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
  ~^
   In file included from mm/page_alloc.c:20:
   In file included from include/linux/highmem.h:10:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/powerpc/include/asm/io.h:604:
   arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer 
arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
   DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
   ^~
   arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
'DEF_PCI_AC_NORET'
   __do_##name al; \
   ^~
   :249:1: note: expanded from here
   __do_outsb
   ^
   arch/powerpc/include/asm/io.h:544:58: note: expanded from macro '__do_outsb'
   #define __do_outsb(p, b, n) writesb((PCI_IO_ADDR)_IO_BASE+(p),(b),(n))
   ~^
   In file included from mm/page_alloc.c:20:
   In file included from include/linux/highmem.h:10:
   In file included from include/linux/hardirq.h:10:
   In file included from arch/powerpc/include/asm/hardirq.h:6:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file include

Re: [Patch v4 1/3] lib: Restrict cpumask_local_spread to houskeeping CPUs

2021-03-04 Thread Nitesh Narayan Lal


On 2/11/21 10:55 AM, Nitesh Narayan Lal wrote:
> On 2/6/21 7:43 PM, Nitesh Narayan Lal wrote:
>> On 2/5/21 5:23 PM, Thomas Gleixner wrote:
>>> On Thu, Feb 04 2021 at 14:17, Nitesh Narayan Lal wrote:
 On 2/4/21 2:06 PM, Marcelo Tosatti wrote:
>>> How about adding a new flag for isolcpus instead?
>>>
>> Do you mean a flag based on which we can switch the affinity mask to
>> housekeeping for all the devices at the time of IRQ distribution?
> Yes a new flag for isolcpus. HK_FLAG_IRQ_SPREAD or some better name.
 Does sounds like a nice idea to explore, lets see what Thomas thinks about 
 it.
> 
>
 When the affinity mask of the interrupt at the time when it is actually
 requested contains an isolated CPU then nothing prevents the kernel from
 steering it at an isolated CPU. But that has absolutely nothing to do
 with that spreading thingy.

 The only difference which this change makes is the fact that the
 affinity hint changes. Nothing else.

>> Thanks for the detailed explanation.
>>
>> Before I posted this patch, I was doing some debugging on a setup where I
>> was observing some latency issues due to the iavf IRQs that were pinned on
>> the isolated CPUs.
>>
>> Based on some initial traces I had this impression that the affinity hint
>> or cpumask_local_spread was somehow playing a role in deciding the affinity
>> mask of these IRQs. Although, that does look incorrect after going through
>> your explanation.
>> For some reason, with a kernel that had this patch when I tried creating
>> VFs iavf IRQs always ended up on the HK CPUs.
>>
>> The reasoning for the above is still not very clear to me. I will investigate
>> this further to properly understand this behavior.
>>
>>
> After a little more digging, I found out why cpumask_local_spread change
> affects the general/initial smp_affinity for certain device IRQs.
>
> After the introduction of the commit:
>
>     e2e64a932 genirq: Set initial affinity in irq_set_affinity_hint()
>

Continuing the conversation about the above commit and adding Jesse.
I was trying to understand the problem that the commit message explains
"The default behavior of the kernel is somewhat undesirable as all
requested interrupts end up on CPU0 after registration.", I have also been
trying to reproduce this behavior without the patch but I failed in doing
so, maybe because I am missing something here.

@Jesse Can you please explain? FWIU IRQ affinity should be decided based on
the default affinity mask.

The problem with the commit is that when we overwrite the affinity mask
based on the hinting mask we completely ignore the default SMP affinity
mask. If we do want to overwrite the affinity based on the hint mask we
should atleast consider the default SMP affinity.

-- 
Thanks
Nitesh



Re: [PATCH v4 05/10] coresight: syscfg: Add API to activate and enable configurations

2021-03-04 Thread Mike Leach
Hi Suzuki,

On Thu, 4 Mar 2021 at 16:49, Suzuki K Poulose  wrote:
>
> On 1/28/21 5:09 PM, Mike Leach wrote:
> > Configurations are first activated, then when any coresight device is
> > enabled, the active configurations are checked and any matching
> > one is enabled.
> >
> > This patch provides the activation / enable API.
> >
> > Signed-off-by: Mike Leach 
> > ---
> >   .../hwtracing/coresight/coresight-config.h|   2 +
> >   .../hwtracing/coresight/coresight-syscfg.c| 127 ++
> >   .../hwtracing/coresight/coresight-syscfg.h|  10 +-
> >   include/linux/coresight.h |   2 +
> >   4 files changed, 140 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/hwtracing/coresight/coresight-config.h 
> > b/drivers/hwtracing/coresight/coresight-config.h
> > index 98380b496046..26396b70c826 100644
> > --- a/drivers/hwtracing/coresight/coresight-config.h
> > +++ b/drivers/hwtracing/coresight/coresight-config.h
> > @@ -156,6 +156,7 @@ struct cscfg_config_feat_ref {
> >* @presets:Array of preset values.
> >* @id_ea:  Extended attribute for perf configid value
> >* @event_ea:   Extended attribute for perf event value
> > + * @active_cnt: ref count for activate on this configuration.
> >*/
> >   struct cscfg_config_desc {
> >   const char *name;
> > @@ -168,6 +169,7 @@ struct cscfg_config_desc {
> >   const u64 *presets; /* nr_presets * nr_total_params */
> >   struct dev_ext_attribute *id_ea;
> >   struct dev_ext_attribute *event_ea;
> > + atomic_t active_cnt;
> >   };
> >
> >   /**
> > diff --git a/drivers/hwtracing/coresight/coresight-syscfg.c 
> > b/drivers/hwtracing/coresight/coresight-syscfg.c
> > index a070f135eca3..d79cf5b36758 100644
> > --- a/drivers/hwtracing/coresight/coresight-syscfg.c
> > +++ b/drivers/hwtracing/coresight/coresight-syscfg.c
> > @@ -298,6 +298,7 @@ static int cscfg_load_config(struct cscfg_config_desc 
> > *cfg_desc)
> >   return err;
> >
> >   list_add(_desc->item, _mgr->data.config_desc_list);
> > + atomic_set(_desc->active_cnt, 0);
> >   return 0;
> >   }
> >
> > @@ -477,6 +478,131 @@ void cscfg_unregister_csdev(struct coresight_device 
> > *csdev)
> >   }
> >   EXPORT_SYMBOL_GPL(cscfg_unregister_csdev);
> >
> > +void cscfg_csdev_reset_feats(struct coresight_device *csdev)
> > +{
> > + struct cscfg_feature_csdev *feat;
> > +
> > + mutex_lock(_csdev_mutex);
> > + if (list_empty(>feature_csdev_list))
> > + goto unlock_exit;
> > +
> > + list_for_each_entry(feat, >feature_csdev_list, node)
> > + cscfg_reset_feat(feat);
> > +
> > +unlock_exit:
> > + mutex_unlock(_csdev_mutex);
> > +}
> > +EXPORT_SYMBOL_GPL(cscfg_csdev_reset_feats);
> > +
> > +/**
> > + * Mark a config descriptor as active.
> > + * This will be seen when csdev devices are activated in the system.
> > + *
> > + * Selection by hash value - generated from the configuration name when it
> > + * was loaded and added to the cs_etm/configurations file system for 
> > selection
> > + * by perf.
> > + *
> > + * @cfg_hash: Hash value of the selected configuration name.
> > + */
> > +int cscfg_activate_config(unsigned long cfg_hash)
> > +{
> > + struct cscfg_config_desc *curr_item, *match_item = 0;
>
> nit: s/0/NULL
>

OK.

> > +
> > + mutex_lock(_mutex);
> > +
> > + list_for_each_entry(curr_item, _mgr->data.config_desc_list, 
> > item) {
> > + if ((unsigned long)curr_item->id_ea->var == cfg_hash) {
> > + match_item = curr_item;
> > + atomic_inc(_mgr->data.sys_active_cnt);
> > + break;
> > + }
> > + }
> > + mutex_unlock(_mutex);
> > +
> > + if (!match_item)
> > + return -EINVAL;
> > +
> > + dev_dbg(to_device_cscfg(), "Activate config %s.\n", match_item->name);
> > +
> > + /* mark the descriptors as active so enable config will use them */
> > + mutex_lock(_csdev_mutex);
> > + atomic_inc(_item->active_cnt);
> > + mutex_unlock(_csdev_mutex);
>
> Is there a guarantee that this item is active and present in the list after
> we dropped the mutex above ? We could certainly nest the mutex as long as
> we follow the order everywhere to prevent such a race.
>

Although removal not supported in this set, the rule is that nothing
can be removed while any configuration is active (count on
sys_active_cnt). - but a comment here could be added.
That said, given this is an atomic, not sure that the mutex is
necessary (I think previous versions did more than just update the
count), & perhaps the increment should be moved to the main
list_for_each loop.

> > +
> > + return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(cscfg_activate_config);
> > +
> > +void cscfg_deactivate_config(unsigned long cfg_hash)
> > +{
> > + struct cscfg_config_desc *curr_item, *match_item = 0;
> > +
> > + mutex_lock(_mutex);
> > +
> > + list_for_each_entry(curr_item, 

Re: [PATCH 1/4] userfaultfd.2: Add UFFD_FEATURE_THREAD_ID docs

2021-03-04 Thread Alejandro Colomar (man-pages)

Hi Peter,

On 3/4/21 4:50 PM, Peter Xu wrote:

On Thu, Mar 04, 2021 at 10:22:18AM +0100, Alejandro Colomar (man-pages) wrote:

+.BR UFFD_FEATURE_THREAD_ID


This should use [.B] and not [.BR].
.BR is for alternate Bold and Roman.
.B is for bold.

(There are more appearances of this in the other patches.)


Yeah I got a bit confused when differenciating those two, since I also see
similar usage, e.g.:

.BR O_CLOEXEC


Yes, these are minor imperfections that got into the manual pages, and 
we don't remove them due to the churn that it would create (and 
possibility of introducing other bugs while doing such a big scripted 
change that couldn't be easily reviewed (thousands of lines)).  So as we 
still have those lines, they tend to confuse.




I'll fix all of them appeared in current patchset.  Let me know if you also
want me to fix all the existing uses of ".BR" too where ".B" would suffice.
Otherwise I won't touch them since I can't say they're wrong either (I think
most of them should generate the same output with either ".BR" or ".B" if
there's only one word?).


Our current non-written guidelines are:
We are fixing the existing ones as we modify code near them,
but leave untouched code that is far from what we are changing, even on 
the same page.


Thanks,

Alex

--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/


Re: [PATCH] mm: be more verbose for alloc_contig_range faliures

2021-03-04 Thread Minchan Kim
On Thu, Mar 04, 2021 at 06:23:09PM +0100, David Hildenbrand wrote:
> > > You want to debug something, so you try triggering it and capturing debug
> > > data. There are not that many alloc_contig_range() users such that this
> > > would really be an issue to isolate ...
> > 
> > cma_alloc uses alloc_contig_range and cma_alloc has lots of users.
> > Even, it is expoerted by dmabuf so any userspace would trigger the
> > allocation by their own. Some of them could be tolerant for the failure,
> > rest of them could be critical. We should't expect it by limited kernel
> > usecase.
> 
> Assume you are debugging allocation failures. You either collect the data
> yourself or ask someone to send you that output. You care about any
> alloc_contig_range() allocation failures that shouldn't happen, don't you?
> 
> > 
> > > 
> > > Strictly speaking: any allocation failure on ZONE_MOVABLE or CMA is
> > > problematic (putting aside NORETRY logic and similar aside). So any such
> > > page you hit is worth investigating and, therefore, worth getting logged 
> > > for
> > > debugging purposes.
> > 
> > If you believe the every alloc_contig_range failure is problematic
> 
> Every one where we should have guarantees I guess: ZONE_MOVABLE or
> MIGRAT_CMA. On ZONE_NORMAL, there are no guarantees.

Indeed.

> 
> > and there is no such realy example I menionted above in the world,
> > I am happy to put this chunk to support dynamic debugging.
> > Okay?
> > 
> > +#if defined(CONFIG_DYNAMIC_DEBUG) || \
> > +(defined(CONFIG_DYNAMIC_DEBUG_CORE) && 
> > defined(DYNAMIC_DEBUG_MODULE))
> > +static DEFINE_RATELIMIT_STATE(alloc_contig_ratelimit_state,
> > +   DEFAULT_RATELIMIT_INTERVAL, DEFAULT_RATELIMIT_BURST);
> > +int alloc_contig_ratelimit(void)
> > +{
> > +   return __ratelimit(_contig_ratelimit_state);
> > +}
> > +
> 
> ^ do we need ratelimiting with dynamic debugging enabled?

Main argument was debug message flooding. Even though we
play with dynamic debugging, the issue never disappear.

> 
> > +void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +   DEFINE_DYNAMIC_DEBUG_METADATA(descriptor,
> > +   "migrate failure");
> > +   if (DYNAMIC_DEBUG_BRANCH(descriptor) &&
> > +   alloc_contig_ratelimit()) {
> > +   struct page *page;
> > +
> > +   WARN(1, "failed callstack");
> > +   list_for_each_entry(page, page_list, lru)
> > +   dump_page(page, "migration failure");
> 
> Are all pages on the list guaranteed to be problematic, or only the first
> entry? I assume all.

All.

> 
> > +   }
> > +}
> > +#else
> > +static inline void dump_migrate_failure_pages(struct list_head *page_list)
> > +{
> > +}
> > +#endif
> > +
> >   /* [start, end) must belong to a single zone. */
> >   static int __alloc_contig_migrate_range(struct compact_control *cc,
> >  unsigned long start, unsigned long 
> > end)
> > @@ -8496,6 +8522,7 @@ static int __alloc_contig_migrate_range(struct 
> > compact_control *cc,
> >  NULL, (unsigned long), cc->mode, 
> > MR_CONTIG_RANGE);
> >  }
> >  if (ret < 0) {
> > +   dump_migrate_failure_pages(>migratepages);
> >  putback_movable_pages(>migratepages);
> >  return ret;
> >  }
> > 
> > 
> 
> If that's the way dynamic debugging is configured/enabled (still have to
> look into it) - yes, that goes into the right direction. As I said above,
> you should dump only where we have some kind of guarantees I assume.

Sure, let me wait for your review before sending next revision.
Thanks for the review!


Re: [PATCH v2] certs: Fix wrong kconfig option used for x509_revocation_list

2021-03-04 Thread Nathan Chancellor
On Thu, Mar 04, 2021 at 12:50:30PM -0500, Eric Snowberg wrote:
> Fix a build issue when x509_revocation_list is not defined.
> 
> $ make ARCH=x86_64 O=build64 all
> 
>  EXTRACT_CERTS   ../
> At main.c:154:
> - SSL error:0909006C:PEM routines:get_name:no start line: 
> crypto/pem/pem_lib.c:745
> extract-cert: ../: Is a directory
> make[2]: [../certs/Makefile:119: certs/x509_revocation_list] Error 1 (ignored)
> 
> When the new CONFIG_SYSTEM_REVOCATION_LIST was added [1], it was not carried
> into the code for preloading the revocation certificates [2].  Change from
> using the original CONFIG_SYSTEM_BLACKLIST_KEYRING  to the new
> CONFIG_SYSTEM_REVOCATION_LIST.
> 
> [1] 
> https://lore.kernel.org/keyrings/eda280f9-f72d-4181-93c7-cdbe95976...@oracle.com/T/#m562c1b27bf402190e7bb573ad20eff5b6310d08f
> [2] 
> https://lore.kernel.org/keyrings/eda280f9-f72d-4181-93c7-cdbe95976...@oracle.com/T/#m07e258bf019ccbac23820fad5192ceffa74fc6ab
> 
> Reported-by: Randy Dunlap 
> Signed-off-by: Eric Snowberg 

This seems to fix my build errors.

Tested-by: Nathan Chancellor 

> ---
> v2 changes:
>   Use the new config option for extract-cert
>   Use the new config option when building revocation_certificates.o
> ---
>  certs/Makefile| 5 +++--
>  certs/blacklist.c | 4 
>  scripts/Makefile  | 2 +-
>  3 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/certs/Makefile b/certs/Makefile
> index e3f4926fd21e..b6db52ebf0be 100644
> --- a/certs/Makefile
> +++ b/certs/Makefile
> @@ -4,7 +4,8 @@
>  #
>  
>  obj-$(CONFIG_SYSTEM_TRUSTED_KEYRING) += system_keyring.o 
> system_certificates.o common.o
> -obj-$(CONFIG_SYSTEM_BLACKLIST_KEYRING) += blacklist.o 
> revocation_certificates.o common.o
> +obj-$(CONFIG_SYSTEM_BLACKLIST_KEYRING) += blacklist.o common.o
> +obj-$(CONFIG_SYSTEM_REVOCATION_LIST) += revocation_certificates.o
>  ifneq ($(CONFIG_SYSTEM_BLACKLIST_HASH_LIST),"")
>  obj-$(CONFIG_SYSTEM_BLACKLIST_KEYRING) += blacklist_hashes.o
>  else
> @@ -105,7 +106,7 @@ $(obj)/signing_key.x509: scripts/extract-cert $(X509_DEP) 
> FORCE
>   $(call 
> if_changed,extract_certs,$(MODULE_SIG_KEY_SRCPREFIX)$(CONFIG_MODULE_SIG_KEY))
>  endif # CONFIG_MODULE_SIG
>  
> -ifeq ($(CONFIG_SYSTEM_BLACKLIST_KEYRING),y)
> +ifeq ($(CONFIG_SYSTEM_REVOCATION_LIST),y)
>  
>  $(eval $(call config_filename,SYSTEM_REVOCATION_KEYS))
>  
> diff --git a/certs/blacklist.c b/certs/blacklist.c
> index 723b19c96256..c9a435b15af4 100644
> --- a/certs/blacklist.c
> +++ b/certs/blacklist.c
> @@ -21,8 +21,10 @@
>  
>  static struct key *blacklist_keyring;
>  
> +#ifdef CONFIG_SYSTEM_REVOCATION_LIST
>  extern __initconst const u8 revocation_certificate_list[];
>  extern __initconst const unsigned long revocation_certificate_list_size;
> +#endif
>  
>  /*
>   * The description must be a type prefix, a colon and then an even number of
> @@ -225,6 +227,7 @@ static int __init blacklist_init(void)
>   */
>  device_initcall(blacklist_init);
>  
> +#ifdef CONFIG_SYSTEM_REVOCATION_LIST
>  /*
>   * Load the compiled-in list of revocation X.509 certificates.
>   */
> @@ -237,3 +240,4 @@ static __init int load_revocation_certificate_list(void)
>blacklist_keyring);
>  }
>  late_initcall(load_revocation_certificate_list);
> +#endif
> diff --git a/scripts/Makefile b/scripts/Makefile
> index 983b785f13cb..bd0718f7c493 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -11,7 +11,7 @@ hostprogs-always-$(CONFIG_ASN1) 
> += asn1_compiler
>  hostprogs-always-$(CONFIG_MODULE_SIG_FORMAT) += sign-file
>  hostprogs-always-$(CONFIG_SYSTEM_TRUSTED_KEYRING)+= extract-cert
>  hostprogs-always-$(CONFIG_SYSTEM_EXTRA_CERTIFICATE)  += insert-sys-cert
> - hostprogs-always-$(CONFIG_SYSTEM_BLACKLIST_KEYRING) += extract-cert
> +hostprogs-always-$(CONFIG_SYSTEM_REVOCATION_LIST)+= extract-cert
>  
>  HOSTCFLAGS_sorttable.o = -I$(srctree)/tools/include
>  HOSTCFLAGS_asn1_compiler.o = -I$(srctree)/include
> -- 
> 2.18.4
> 


Re: [PATCH V2] mmc: sdhci: Check for reset prior to DMA address unmap

2021-03-04 Thread Ulf Hansson
On Thu, 4 Mar 2021 at 16:16,  wrote:
>
> On 2021-03-04 19:19, Ulf Hansson wrote:
> > On Wed, 3 Mar 2021 at 09:32, Pradeep P V K 
> > wrote:
> >>
> >> From: Pradeep P V K 
> >>
> >> For data read commands, SDHC may initiate data transfers even before
> >> it
> >> completely process the command response. In case command itself fails,
> >> driver un-maps the memory associated with data transfer but this
> >> memory
> >> can still be accessed by SDHC for the already initiated data transfer.
> >> This scenario can lead to un-mapped memory access error.
> >>
> >> To avoid this scenario, reset SDHC (when command fails) prior to
> >> un-mapping memory. Resetting SDHC ensures that all in-flight data
> >> transfers are either aborted or completed. So we don't run into this
> >> scenario.
> >>
> >> Swap the reset, un-map steps sequence in sdhci_request_done().
> >>
> >> Changes since V1:
> >> - Added an empty line and fixed the comment style.
> >> - Retained the Acked-by signoff.
> >>
> >> Suggested-by: Veerabhadrarao Badiganti 
> >> Signed-off-by: Pradeep P V K 
> >> Acked-by: Adrian Hunter 
>
> Hi Uffe,
> >
> > Seems like it might be a good idea to tag this for stable? I did that,
> > but awaiting for your confirmation.
> >
> Yes, this fix is applicable for all stable starting from 4.9 (n/a for
> 4.4).
> Kindly go ahead.
>
> > So, applied for next, thanks!
> >
> > Kind regards
> > Uffe
> >
> Thanks and Regards,
> Pradeep

Thanks for confirming, I have updated the stable tag.

Kind regards
Uffe

>
> >
> >> ---
> >>  drivers/mmc/host/sdhci.c | 60
> >> +---
> >>  1 file changed, 31 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> >> index 646823d..130fd2d 100644
> >> --- a/drivers/mmc/host/sdhci.c
> >> +++ b/drivers/mmc/host/sdhci.c
> >> @@ -2998,6 +2998,37 @@ static bool sdhci_request_done(struct
> >> sdhci_host *host)
> >> }
> >>
> >> /*
> >> +* The controller needs a reset of internal state machines
> >> +* upon error conditions.
> >> +*/
> >> +   if (sdhci_needs_reset(host, mrq)) {
> >> +   /*
> >> +* Do not finish until command and data lines are
> >> available for
> >> +* reset. Note there can only be one other mrq, so it
> >> cannot
> >> +* also be in mrqs_done, otherwise host->cmd and
> >> host->data_cmd
> >> +* would both be null.
> >> +*/
> >> +   if (host->cmd || host->data_cmd) {
> >> +   spin_unlock_irqrestore(>lock, flags);
> >> +   return true;
> >> +   }
> >> +
> >> +   /* Some controllers need this kick or reset won't work
> >> here */
> >> +   if (host->quirks & SDHCI_QUIRK_CLOCK_BEFORE_RESET)
> >> +   /* This is to force an update */
> >> +   host->ops->set_clock(host, host->clock);
> >> +
> >> +   /*
> >> +* Spec says we should do both at the same time, but
> >> Ricoh
> >> +* controllers do not like that.
> >> +*/
> >> +   sdhci_do_reset(host, SDHCI_RESET_CMD);
> >> +   sdhci_do_reset(host, SDHCI_RESET_DATA);
> >> +
> >> +   host->pending_reset = false;
> >> +   }
> >> +
> >> +   /*
> >>  * Always unmap the data buffers if they were mapped by
> >>  * sdhci_prepare_data() whenever we finish with a request.
> >>  * This avoids leaking DMA mappings on error.
> >> @@ -3060,35 +3091,6 @@ static bool sdhci_request_done(struct
> >> sdhci_host *host)
> >> }
> >> }
> >>
> >> -   /*
> >> -* The controller needs a reset of internal state machines
> >> -* upon error conditions.
> >> -*/
> >> -   if (sdhci_needs_reset(host, mrq)) {
> >> -   /*
> >> -* Do not finish until command and data lines are
> >> available for
> >> -* reset. Note there can only be one other mrq, so it
> >> cannot
> >> -* also be in mrqs_done, otherwise host->cmd and
> >> host->data_cmd
> >> -* would both be null.
> >> -*/
> >> -   if (host->cmd || host->data_cmd) {
> >> -   spin_unlock_irqrestore(>lock, flags);
> >> -   return true;
> >> -   }
> >> -
> >> -   /* Some controllers need this kick or reset won't work
> >> here */
> >> -   if (host->quirks & SDHCI_QUIRK_CLOCK_BEFORE_RESET)
> >> -   /* This is to force an update */
> >> -   host->ops->set_clock(host, host->clock);
> >> -
> >> -   /* Spec says we should do both at the same time, but
> >> Ricoh
> >> -  controllers do not like that. */
> >> -   sdhci_do_reset(host, SDHCI_RESET_CMD);
> 

[PATCH 2/4] arm64: dts: qcom: sc7180: Add pompom rev3

2021-03-04 Thread Matthias Kaehlcke
The only kernel visible change with respect to rev2 is that pompom
rev3 changed the charger thermistor from a 47k to a 100k NTC to use
a thermistor which is supported by the PM6150 ADC driver.

Signed-off-by: Matthias Kaehlcke 
---

 .../dts/qcom/sc7180-trogdor-pompom-r2-lte.dts |  4 +-
 .../dts/qcom/sc7180-trogdor-pompom-r2.dts |  4 +-
 .../dts/qcom/sc7180-trogdor-pompom-r3-lte.dts | 14 ++
 .../dts/qcom/sc7180-trogdor-pompom-r3.dts | 46 +++
 4 files changed, 64 insertions(+), 4 deletions(-)
 create mode 100644 arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3-lte.dts
 create mode 100644 arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3.dts

diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2-lte.dts 
b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2-lte.dts
index 791d496ad046..00e187c08eb9 100644
--- a/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2-lte.dts
+++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2-lte.dts
@@ -9,6 +9,6 @@
 #include "sc7180-trogdor-lte-sku.dtsi"
 
 / {
-   model = "Google Pompom (rev2+) with LTE";
-   compatible = "google,pompom-sku0", "qcom,sc7180";
+   model = "Google Pompom (rev2) with LTE";
+   compatible = "google,pompom-rev2-sku0", "qcom,sc7180";
 };
diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2.dts 
b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2.dts
index 984d7337da78..2b2bd906321d 100644
--- a/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2.dts
+++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r2.dts
@@ -10,8 +10,8 @@
 #include "sc7180-trogdor-pompom.dtsi"
 
 / {
-   model = "Google Pompom (rev2+)";
-   compatible = "google,pompom", "qcom,sc7180";
+   model = "Google Pompom (rev2)";
+   compatible = "google,pompom-rev2", "qcom,sc7180";
 };
 
 _controller {
diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3-lte.dts 
b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3-lte.dts
new file mode 100644
index ..067cb75a011e
--- /dev/null
+++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3-lte.dts
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+/*
+ * Google Pompom board device tree source
+ *
+ * Copyright 2020 Google LLC.
+ */
+
+#include "sc7180-trogdor-pompom-r3.dts"
+#include "sc7180-trogdor-lte-sku.dtsi"
+
+/ {
+   model = "Google Pompom (rev3+) with LTE";
+   compatible = "google,pompom-sku0", "qcom,sc7180";
+};
diff --git a/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3.dts 
b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3.dts
new file mode 100644
index ..12d2d1e8e9e1
--- /dev/null
+++ b/arch/arm64/boot/dts/qcom/sc7180-trogdor-pompom-r3.dts
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+/*
+ * Google Pompom board device tree source
+ *
+ * Copyright 2021 Google LLC.
+ */
+
+/dts-v1/;
+
+#include "sc7180-trogdor-pompom.dtsi"
+
+/ {
+   model = "Google Pompom (rev3+)";
+   compatible = "google,pompom", "qcom,sc7180";
+};
+
+_controller {
+   function-row-physmap = <
+   MATRIX_KEY(0x00, 0x02, 0)   /* T1 */
+   MATRIX_KEY(0x03, 0x02, 0)   /* T2 */
+   MATRIX_KEY(0x02, 0x02, 0)   /* T3 */
+   MATRIX_KEY(0x01, 0x02, 0)   /* T4 */
+   MATRIX_KEY(0x03, 0x04, 0)   /* T5 */
+   MATRIX_KEY(0x02, 0x04, 0)   /* T6 */
+   MATRIX_KEY(0x01, 0x04, 0)   /* T7 */
+   MATRIX_KEY(0x02, 0x09, 0)   /* T8 */
+   MATRIX_KEY(0x01, 0x09, 0)   /* T9 */
+   MATRIX_KEY(0x00, 0x04, 0)   /* T10 */
+   >;
+   linux,keymap = <
+   MATRIX_KEY(0x00, 0x02, KEY_BACK)
+   MATRIX_KEY(0x03, 0x02, KEY_REFRESH)
+   MATRIX_KEY(0x02, 0x02, KEY_ZOOM)
+   MATRIX_KEY(0x01, 0x02, KEY_SCALE)
+   MATRIX_KEY(0x03, 0x04, KEY_SYSRQ)
+   MATRIX_KEY(0x02, 0x04, KEY_BRIGHTNESSDOWN)
+   MATRIX_KEY(0x01, 0x04, KEY_BRIGHTNESSUP)
+   MATRIX_KEY(0x02, 0x09, KEY_MUTE)
+   MATRIX_KEY(0x01, 0x09, KEY_VOLUMEDOWN)
+   MATRIX_KEY(0x00, 0x04, KEY_VOLUMEUP)
+
+   MATRIX_KEY(0x03, 0x09, KEY_SLEEP)   /* LOCK key */
+
+   CROS_STD_MAIN_KEYMAP
+   >;
+};
-- 
2.30.1.766.gb4fecdf3b7-goog



Re: [PATCH v3 17/21] tools/objtool: Convert to insn_decode()

2021-03-04 Thread Borislav Petkov
On Thu, Mar 04, 2021 at 06:49:08PM +0100, Peter Zijlstra wrote:
> This is going to have trivial rejects/fuzz against tip/objtool/core.

I was just wondering whether to you show you how I resolved :)

diff --cc tools/objtool/arch/x86/decode.c
index 431bafb881d4,8380d0b1d933..22a53ee322ea
--- a/tools/objtool/arch/x86/decode.c
+++ b/tools/objtool/arch/x86/decode.c
@@@ -104,14 -90,12 +104,14 @@@ int arch_decode_instruction(const struc
struct list_head *ops_list)
  {
struct insn insn;
-   int x86_64;
 -  int x86_64, sign, ret;
 -  unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
 -rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0,
 -modrm_reg = 0, sib = 0;
++  int x86_64, ret;
 +  unsigned char op1, op2,
 +rex = 0, rex_b = 0, rex_r = 0, rex_w = 0, rex_x = 0,
 +modrm = 0, modrm_mod = 0, modrm_rm = 0, modrm_reg = 0,
 +sib = 0, /* sib_scale = 0, */ sib_index = 0, sib_base = 0;
struct stack_op *op = NULL;
struct symbol *sym;
 +  u64 imm;
  
x86_64 = is_x86_64(elf);
if (x86_64 == -1)

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


<    1   2   3   4   5   6   7   8   9   10   >