Re: [PATCH 3/4] mm: /proc/sys/vm/stat_refresh skip checking known negative stats

2021-03-01 Thread Roman Gushchin
On Mon, Mar 01, 2021 at 02:08:17PM -0800, Hugh Dickins wrote:
> On Sun, 28 Feb 2021, Roman Gushchin wrote:
> > On Thu, Feb 25, 2021 at 03:14:03PM -0800, Hugh Dickins wrote:
> > > vmstat_refresh() can occasionally catch nr_zone_write_pending and
> > > nr_writeback when they are transiently negative.  The reason is partly
> > > that the interrupt which decrements them in test_clear_page_writeback()
> > > can come in before __test_set_page_writeback() got to increment them;
> > > but transient negatives are still seen even when that is prevented, and
> > > we have not yet resolved why (Roman believes that it is an unavoidable
> > > consequence of the refresh scheduled on each cpu).  But those stats are
> > > not buggy, they have never been seen to drift away from 0 permanently:
> > > so just avoid the annoyance of showing a warning on them.
> > > 
> > > Similarly avoid showing a warning on nr_free_cma: CMA users have seen
> > > that one reported negative from /proc/sys/vm/stat_refresh too, but it
> > > does drift away permanently: I believe that's because its incrementation
> > > and decrementation are decided by page migratetype, but the migratetype
> > > of a pageblock is not guaranteed to be constant.
> > > 
> > > Use switch statements so we can most easily add or remove cases later.
> > 
> > I'm OK with the code, but I can't fully agree with the commit log. I don't
> > think there is any mystery around negative values. Let me copy-paste the
> > explanation from my original patch:
> > 
> > These warnings* are generated by the vmstat_refresh() function, which
> > assumes that atomic zone and numa counters can't go below zero.  However,
> > on a SMP machine it's not quite right: due to per-cpu caching it can in
> > theory be as low as -(zone threshold) * NR_CPUs.
> > 
> > For instance, let's say all cma pages are in use and NR_FREE_CMA_PAGES
> > reached 0.  Then we've reclaimed a small number of cma pages on each CPU
> > except CPU0, so that most percpu NR_FREE_CMA_PAGES counters are slightly
> > positive (the atomic counter is still 0).  Then somebody on CPU0 consumes
> > all these pages.  The number of pages can easily exceed the threshold and
> > a negative value will be committed to the atomic counter.
> > 
> > * warnings about negative NR_FREE_CMA_PAGES
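The scenario quoted above can be reproduced with a small user-space model. This is only a sketch: the struct, the function names and the threshold of 8 are made up for illustration, and the code mimics the fold-when-over-threshold behaviour of per-cpu deltas rather than the real vmstat implementation.

```c
#include <assert.h>
#include <string.h>

#define NR_CPUS   4
#define THRESHOLD 8	/* made-up per-cpu fold threshold */

struct pcp_counter {
	long global;		/* stands in for the atomic zone counter */
	long pcp[NR_CPUS];	/* per-cpu deltas not yet folded in */
};

/* Apply delta on one cpu; fold into global only once |delta| exceeds THRESHOLD. */
static void pcp_counter_mod(struct pcp_counter *c, int cpu, long delta)
{
	long x;

	assert(cpu >= 0 && cpu < NR_CPUS);
	x = c->pcp[cpu] + delta;
	if (x > THRESHOLD || x < -THRESHOLD) {
		c->global += x;
		x = 0;
	}
	c->pcp[cpu] = x;
}

/* The "true" value: global plus every cached per-cpu delta. */
static long pcp_counter_sum(const struct pcp_counter *c)
{
	long sum = c->global;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->pcp[cpu];
	return sum;
}

/* CPUs 1..3 each free THRESHOLD pages, then CPU0 consumes all of them. */
static void demo_cma_underflow(struct pcp_counter *c)
{
	memset(c, 0, sizeof(*c));
	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		for (int i = 0; i < THRESHOLD; i++)
			pcp_counter_mod(c, cpu, 1);
	for (int i = 0; i < (NR_CPUS - 1) * THRESHOLD; i++)
		pcp_counter_mod(c, 0, -1);
}
```

In this model the frees on CPUs 1..3 never cross the threshold, so they stay cached, while CPU0's decrements are folded in twice. The global value ends up negative even though pcp_counter_sum() is back at zero.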
> 
> Hi Roman, thanks for your Acks on the others - and indeed this
> is the one on which disagreement was more to be expected.
> 
> I certainly wanted (and included below) a Link to your original patch;
> and even wondered whether to paste your description into mine.
> But I read it again and still have issues with it.
> 
> Mainly, it does not convey at all, that touching stat_refresh adds the
> per-cpu counts into the global atomics, resetting per-cpu counts to 0.
> Which does not invalidate your explanation: races might still manage
> to underflow; but it does take the "easily" out of "can easily exceed".

Hi Hugh!

It could be that "easily" simply comes from the scale (number of machines).

> 
> Since I don't use CMA on any machine, I cannot be sure, but it looked
> like a bad example to rely upon, because of its migratetype-based
> accounting.  If you use /proc/sys/vm/stat_refresh frequently enough,
> without suppressing the warning, I guess that uncertainty could be
> resolved by checking whether nr_free_cma is seen with negative value
> in consecutive refreshes - which would tend to support my migratetype
> theory - or only singly - which would support your raciness theory.
> 
> > 
> > Actually, the same is almost true for ANY other counter. What sets the CMA,
> > dirty and write pending counters apart is that they can reach 0 under
> > normal conditions. Other counters usually do not reach values small enough
> > to show negative values on a reasonably sized machine.
> 
> Looking through /proc/vmstat now, yes, I can see that there are fewer
> counters which hover near 0 than I had imagined: more have a positive
> bias, or are monotonically increasing.  And I'd be lying if I said I'd
> never seen any others than nr_writeback or nr_zone_write_pending caught
> negative.  But what are you asking for?  Should the patch be changed, to
> retry the refresh_vm_stats() before warning, if it sees any negative?
> Depends on how terrible one line in dmesg is considered!
> 
> > 
> > Does it make sense?
> 
> I'm not sure: you were not asking for the patch to be changed, but
> its commit log: and I better not say "Roman believes that it is an
> unavoidable consequence of the refresh scheduled on each cpu" if
> that's untrue (or unclear: now it reads to me as if we're accusing
> the refresh of messing things up, whereas it's the non-atomic nature
> of the refresh which leaves it vulnerable to races).

I think we both agree that for some counters going slightly into negative
is possible and isn't an indication of an error, if only they don't become
too negative. For other 

Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-01 Thread Dan Williams
On Mon, Mar 1, 2021 at 2:47 PM Dave Chinner  wrote:
>
> On Mon, Mar 01, 2021 at 12:55:53PM -0800, Dan Williams wrote:
> > On Sun, Feb 28, 2021 at 2:39 PM Dave Chinner  wrote:
> > >
> > > On Sat, Feb 27, 2021 at 03:40:24PM -0800, Dan Williams wrote:
> > > > On Sat, Feb 27, 2021 at 2:36 PM Dave Chinner  wrote:
> > > > > On Fri, Feb 26, 2021 at 02:41:34PM -0800, Dan Williams wrote:
> > > > > > On Fri, Feb 26, 2021 at 1:28 PM Dave Chinner  wrote:
> > > > > > > On Fri, Feb 26, 2021 at 12:59:53PM -0800, Dan Williams wrote:
> > > > > it points to, check if it points to the PMEM that is being removed,
> > > > > grab the page it points to, map that to the relevant struct page,
> > > > > run collect_procs() on that page, then kill the user processes that
> > > > > map that page.
> > > > >
> > > > > So why can't we walk the ptes, check the physical pages that they
> > > > > map to and if they map to a pmem page we go poison that
> > > > > page and that kills any user process that maps it.
> > > > >
> > > > > i.e. I can't see how unexpected pmem device unplug is any different
> > > > > to an MCE delivering a hwpoison event to a DAX mapped page.
> > > >
> > > > I guess the tradeoff is walking a long list of inodes vs walking a
> > > > large array of pages.
> > >
> > > Not really. You're assuming all a filesystem has to do is invalidate
> > > everything if a device goes away, and that's not true. Finding if an
> > > inode has a mapping that spans a specific device in a multi-device
> > > filesystem can be a lot more complex than that. Just walking inodes
> > > is easy - determining which inodes need invalidation is the hard
> > > part.
> >
> > That inode-to-device level of specificity is not needed for the same
> > reason that drop_caches does not need to be specific. If the wrong
> > page is unmapped a re-fault will bring it back, and re-fault will fail
> > for the pages that are successfully removed.
> >
> > > That's where ->corrupt_range() comes in - the filesystem is already
> > > set up to do reverse mapping from physical range to inode(s)
> > > offsets...
> >
> > Sure, but what is the need to get to that level of specificity with
> > the filesystem for something that should rarely happen in the course
> > of normal operation outside of a mistake?
>
> Dan, you made this mistake with the hwpoisoning code that we're
> trying to fix here. Hard coding a 1:1 physical address to
> inode/offset into the DAX mapping was a bad mistake. It's also one
> that should never have occurred because it's *obviously wrong* to
> filesystem developers and has been for a long time.

I admit that mistake. The traditional memory error handling model
assumptions around page->mapping were broken by DAX, I'm not trying to
repeat that mistake. I feel we're talking past each other on the
discussion of the proposals.

> Now we have the filesystem people providing a mechanism for the pmem
> devices to tell the filesystems about physical device failures so
> they can handle such failures correctly themselves. Having the
> device go away unexpectedly from underneath a mounted and active
> filesystem is a *device failure*, not an "unplug event".

It's the same difference to the physical page, all mappings to that
page need to be torn down. I'm happy to call an fs callback and let
each filesystem do what it wants with a "every pfn in this dax device
needs to be unmapped".

I'm looking at the ->corrupted_range() patches trying to map it to
this use case and I don't see how, for example a realtime-xfs over DM
over multiple PMEM gets the notification to the right place.
bd_corrupted_range() uses get_super() which gets the wrong answer for
both realtime-xfs and DM.

I'd flip that arrangement around and have the FS tell the block device
"if something happens to you, here is the super_block to notify". So
to me this looks like a fs_dax_register_super() helper that plumbs the
superblock through an arbitrary stack of block devices to the leaf
block-device that might want to send a notification up when a global
unmap operation needs to be performed.

I naively think that "for_each_inode()
unmap_mapping_range(&inode->i_mapping)" is sufficient as a generic
implementation; that does not preclude XFS from overriding that generic
implementation and handling it directly if it so chooses.
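
The registration idea could be sketched in user space roughly as follows. Every name here (toy_super, toy_bdev, toy_register_super and so on) is invented for illustration; none of it is an existing kernel interface, and the "unmap everything" handler is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LOWER 4

struct toy_super;
typedef void (*toy_unmap_all_fn)(struct toy_super *sb);

/* Stands in for a super_block; the handler may be generic or fs-specific. */
struct toy_super {
	toy_unmap_all_fn unmap_all;
	int notified;		/* number of failure notifications received */
};

/* Stands in for a block device that may be stacked on lower devices. */
struct toy_bdev {
	struct toy_bdev *lower[TOY_MAX_LOWER];
	int nr_lower;
	struct toy_super *holder;	/* whom to notify; only set on leaves */
};

/* Plumb sb down through an arbitrary stack to every leaf device. */
static void toy_register_super(struct toy_bdev *bdev, struct toy_super *sb)
{
	if (bdev->nr_lower == 0) {
		bdev->holder = sb;
		return;
	}
	for (int i = 0; i < bdev->nr_lower; i++)
		toy_register_super(bdev->lower[i], sb);
}

/* Called by a leaf device when it goes away underneath the filesystem. */
static void toy_leaf_failed(struct toy_bdev *leaf)
{
	assert(leaf->nr_lower == 0);
	if (leaf->holder && leaf->holder->unmap_all)
		leaf->holder->unmap_all(leaf->holder);
}

/* Generic handler: a real one would walk inodes and unmap their mappings. */
static void toy_generic_unmap_all(struct toy_super *sb)
{
	sb->notified++;
}

/* One filesystem on a two-leaf stacked device; the second leaf fails. */
static int toy_demo_notify(void)
{
	struct toy_super fs = { toy_generic_unmap_all, 0 };
	struct toy_bdev pmem0 = { { NULL }, 0, NULL };
	struct toy_bdev pmem1 = { { NULL }, 0, NULL };
	struct toy_bdev dm = { { &pmem0, &pmem1 }, 2, NULL };

	toy_register_super(&dm, &fs);
	toy_leaf_failed(&pmem1);
	return fs.notified;
}
```

A filesystem that wants finer-grained handling would install its own handler in place of toy_generic_unmap_all; the plumbing through the device stack stays the same.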

> The mistake you made was not understanding how filesystems work,
> nor actually asking filesystem developers what they actually needed.

You're going too far here, but that's off topic.

> You're doing the same thing here - you're telling us what you think
> the solution filesystems need is.

No, I'm not, I'm trying to understand tradeoffs. I apologize if this
is coming across as not listening.

> Please listen when we say "that is
> not sufficient" because we don't want to be backed into a corner
> that we have to fix ourselves again before we can enable some basic
> filesystem functionality that we should have been able to support on
> DAX from the start...

That's 

Re: [PATCH 05/25] x86/sgx: Introduce virtual EPC for use by KVM guests

2021-03-01 Thread Kai Huang
On Mon, 2021-03-01 at 08:21 -0800, Sean Christopherson wrote:
> On Mon, Mar 01, 2021, Kai Huang wrote:
> > +   /*
> > +* SECS pages are "pinned" by child pages, an unpinned once all
> 
> s/an/and

Thanks!

> 
> > +* children have been EREMOVE'd.  A child page in this instance
> > +* may have pinned an SECS page encountered in an earlier release(),
> > +* creating a zombie.  Since some children were EREMOVE'd above,
> > +* try to EREMOVE all zombies in the hopes that one was unpinned.
> > +*/
> > +   mutex_lock(&zombie_secs_pages_lock);
> > +   list_for_each_entry_safe(epc_page, tmp, &zombie_secs_pages, list) {
> > +   /*
> > +* Speculatively remove the page from the list of zombies,
> > +* if the page is successfully EREMOVE it will be added to
> > +* the list of free pages.  If EREMOVE fails, throw the page
> > +* on the local list, which will be spliced on at the end.
> > +*/
> > +   list_del(&epc_page->list);
> > +
> > +   if (sgx_vepc_free_page(epc_page))
> > +   list_add_tail(&epc_page->list, &secs_pages);
> > +   }
> > +
> > +   if (!list_empty(&secs_pages))
> > +   list_splice_tail(&secs_pages, &zombie_secs_pages);
> > +   mutex_unlock(&zombie_secs_pages_lock);
> > +
> > +   kfree(vepc);
> > +
> > +   return 0;
> > +}




Re: [PATCH 03/25] x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()

2021-03-01 Thread Kai Huang
On Mon, 2021-03-01 at 09:29 -0800, Sean Christopherson wrote:
> On Mon, Mar 01, 2021, Kai Huang wrote:
> > diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
> > index 7449ef33f081..a7dc86e87a09 100644
> > --- a/arch/x86/kernel/cpu/sgx/encl.c
> > +++ b/arch/x86/kernel/cpu/sgx/encl.c
> > @@ -381,6 +381,26 @@ const struct vm_operations_struct sgx_vm_ops = {
> >     .access = sgx_vma_access,
> >  };
> >  
> > 
> > 
> > 
> > +static void sgx_encl_free_epc_page(struct sgx_epc_page *epc_page)
> > +{
> > +   int ret;
> > +
> > +   WARN_ON_ONCE(epc_page->flags & SGX_EPC_PAGE_RECLAIMER_TRACKED);
> > +
> > +   ret = __eremove(sgx_get_epc_virt_addr(epc_page));
> > +   if (WARN_ONCE(ret, "EREMOVE returned %d (0x%x)", ret, ret)) {
> 
> This can be ENCLS_WARN, especially if you're printing a separate error message
> about leaking the page.  That being said, I'm not sure a separate error message
> is a good idea.  If other stuff gets dumped to the kernel log between the WARN
> and the pr_err_once(), it may not be clear to admins that the two events are
> directly connected.  It's even possible the prints could come from two different
> CPUs.

Good point. Thanks for educating me :)

> 
> Why not dump a short blurb in the WARN itself?  The error message can be thrown
> in a define if the line length is too obnoxious (it's ~109 chars if embedded
> directly).
> 
> #define EREMOVE_ERROR_MESSAGE \
>   "EREMOVE returned %d (0x%x).  EPC page leaked, reboot recommended."
> 
>   if (WARN_ONCE(ret, EREMOVE_ERROR_MESSAGE, ret, ret))

Will do it your way. Thanks!

> 
> > +   /*
> > +* Give a message to remind EPC page is leaked, and requires
> > +* machine reboot to get leaked pages back. This can be improved
> > +* in the future by adding stats of leaked pages, etc.
> > +*/
> > +   pr_err_once("EPC page is leaked. Require machine reboot to get leaked pages back.\n");
> > +   return;
> > +   }
> > +
> > +   sgx_free_epc_page(epc_page);
> > +}
> > +
> >  /**
> >   * sgx_encl_release - Destroy an enclave instance
> >   * @kref:  address of a kref inside &sgx_encl




Re: [PATCH v5 05/14] vfio/mdev: idxd: add basic mdev registration and helper functions

2021-03-01 Thread Jason Gunthorpe
On Mon, Mar 01, 2021 at 05:23:47PM -0700, Dave Jiang wrote:
> 
> So after looking at the code in vfio_pci_intrs.c, I agree that the set_irqs
> code between VFIO_PCI and this driver can be made in common. Given that Alex
> doesn't want a vfio_pci device embedded in the driver, 

idxd isn't a vfio_pci so it would be improper to do something like
that here anyhow.

> I think we'll need some sort of generic VFIO device that can be used
> from the vfio_pci side and vfio_mdev side to pass down in order to
> have common support library functions. 

Why do you need more layers?

Just make some helper functions to manage this and build them into
their own struct and function family. All this needs is some callback
for the end driver to hook in the raw device programming and some
entry points to direct the emulation access to the module.

It should be fully self contained and completely unrelated to vfio_pci
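
As a rough illustration of what such a self-contained helper family might look like, here is a user-space sketch. All names (irq_emu, irq_emu_ops, irq_emu_set_trigger) are hypothetical and exist neither in VFIO nor in the idxd driver; the single callback stands in for the end driver's raw device programming.

```c
#include <assert.h>
#include <stddef.h>

#define EMU_MAX_VECTORS 8

/* Callback the end driver supplies to program its own hardware. */
struct irq_emu_ops {
	int (*set_vector)(void *drvdata, unsigned int vector, int eventfd);
};

/* Helper state, owned by the end driver and independent of vfio_pci. */
struct irq_emu {
	const struct irq_emu_ops *ops;
	void *drvdata;
	int eventfd[EMU_MAX_VECTORS];	/* -1 means no trigger registered */
};

static void irq_emu_init(struct irq_emu *emu, const struct irq_emu_ops *ops,
			 void *drvdata)
{
	emu->ops = ops;
	emu->drvdata = drvdata;
	for (int i = 0; i < EMU_MAX_VECTORS; i++)
		emu->eventfd[i] = -1;
}

/* Register triggers for [start, start + count); a NULL fds unregisters them. */
static int irq_emu_set_trigger(struct irq_emu *emu, unsigned int start,
			       unsigned int count, const int *fds)
{
	if (start >= EMU_MAX_VECTORS || count > EMU_MAX_VECTORS - start)
		return -1;

	for (unsigned int i = 0; i < count; i++) {
		int fd = fds ? fds[i] : -1;
		int rc = emu->ops->set_vector(emu->drvdata, start + i, fd);

		if (rc < 0)
			return rc;
		emu->eventfd[start + i] = fd;
	}
	return 0;
}

/* Example end driver: only counts how many vectors it was asked to program. */
static int demo_set_vector(void *drvdata, unsigned int vector, int eventfd)
{
	(void)vector;
	(void)eventfd;
	++*(int *)drvdata;
	return 0;
}

static int demo_irq_emu(void)
{
	static const struct irq_emu_ops ops = { .set_vector = demo_set_vector };
	struct irq_emu emu;
	int programmed = 0;
	int fds[2] = { 10, 11 };

	irq_emu_init(&emu, &ops, &programmed);
	if (irq_emu_set_trigger(&emu, 1, 2, fds) < 0)
		return -1;
	return programmed;
}
```

The point of the shape is that the helper knows nothing about the bus or device type; the driver embeds struct irq_emu and provides one callback.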

Jason


Re: [PATCH v5 0/5] mm/hugetlb: Early cow on fork, and a few cleanups

2021-03-01 Thread Jason Gunthorpe
On Mon, Mar 01, 2021 at 04:28:46PM -0800, Andrew Morton wrote:
> On Mon, 1 Mar 2021 09:11:51 -0500 Peter Xu  wrote:
> 
> > On Wed, Feb 17, 2021 at 06:35:42PM -0500, Peter Xu wrote:
> > > v5:
> > > - patch 4: change "int cow" into "bool cow"
> > > - collect r-bs for Jason
> > 
> > Andrew,
> > 
> > > I just noticed 5.12-rc1 has released; is this series still possible to make it
> > > for 5.12, or needs to wait for 5.13?
> > 
> 
> It has taken a while to settle down.  What is the case for
> fast-tracking it into 5.12?

IIRC hugetlb users and fork and DMA will get the unexpected VA
corruption that triggered all this work.

Jason


Re: [PATCH v5 0/5] mm/hugetlb: Early cow on fork, and a few cleanups

2021-03-01 Thread Andrew Morton
On Mon, 1 Mar 2021 09:11:51 -0500 Peter Xu  wrote:

> On Wed, Feb 17, 2021 at 06:35:42PM -0500, Peter Xu wrote:
> > v5:
> > - patch 4: change "int cow" into "bool cow"
> > - collect r-bs for Jason
> 
> Andrew,
> 
> I just noticed 5.12-rc1 has released; is this series still possible to make it
> for 5.12, or needs to wait for 5.13?
> 

It has taken a while to settle down.  What is the case for
fast-tracking it into 5.12?


Re: [PATCH net] hv_netvsc: Fix validation in netvsc_linkstatus_callback()

2021-03-01 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon,  1 Mar 2021 19:25:30 +0100 you wrote:
> Contrary to the RNDIS protocol specification, certain (pre-Fe)
> implementations of Hyper-V's vSwitch did not account for the status
> buffer field in the length of an RNDIS packet; the bug was fixed in
> newer implementations.  Validate the status buffer fields using the
> length of the 'vmtransfer_page' packet (all implementations), that
> is known/validated to be less than or equal to the receive section
> size and not smaller than the length of the RNDIS message.
> 
> [...]

Here is the summary with links:
  - [net] hv_netvsc: Fix validation in netvsc_linkstatus_callback()
https://git.kernel.org/netdev/net/c/3946688edbc5

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH V3 XRT Alveo 03/18] fpga: xrt: xclbin file helper functions

2021-03-01 Thread Lizhi Hou

Hi Tom,


On 02/28/2021 08:54 AM, Tom Rix wrote:



On 2/26/21 1:23 PM, Lizhi Hou wrote:

Hi Tom,



snip


I also do not see a pragma pack, usually this is set to 1 so the compiler does 
not shuffle elements, increase size etc.

This data structure is shared with other tools. And the structure is well 
defined with reasonable alignment. It is compatible with all compilers we have 
tested. So pragma pack is not necessary.

You can not have possibly tested all the configurations since the kernel 
supports many arches and compilers.

If the tested existing alignment is ok, pragma pack should be a noop on your 
tested configurations.

And help cover the untested configurations.

Got it. I will add pragma pack(1).
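
The effect of packing can be checked directly. The struct below is a made-up header, not the real xclbin layout; it only shows how pragma pack(1) removes compiler-inserted padding, so that size and field offsets no longer depend on the target's alignment rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the compiler may pad after 'kind' and at the end. */
struct toy_hdr_natural {
	uint8_t  kind;
	uint64_t offset;
	uint32_t size;
};

/* Packed layout: fields are laid out back to back, 1 + 8 + 4 bytes. */
#pragma pack(push, 1)
struct toy_hdr_packed {
	uint8_t  kind;
	uint64_t offset;
	uint32_t size;
};
#pragma pack(pop)

static size_t toy_natural_size(void)
{
	return sizeof(struct toy_hdr_natural);
}

static size_t toy_packed_size(void)
{
	return sizeof(struct toy_hdr_packed);
}

/* Sanity check that can be called once at start-up. */
static void toy_check_layout(void)
{
	assert(toy_packed_size() == 13);
	assert(offsetof(struct toy_hdr_packed, offset) == 1);
	assert(toy_natural_size() >= toy_packed_size());
}
```

If a structure is already laid out without padding on the tested configurations, adding the pragma changes nothing there, which is the "noop" argument made above.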

Lizhi


Tom





Re: [PATCH v5 05/14] vfio/mdev: idxd: add basic mdev registration and helper functions

2021-03-01 Thread Dave Jiang



On 2/10/2021 4:59 PM, Jason Gunthorpe wrote:

On Fri, Feb 05, 2021 at 01:53:24PM -0700, Dave Jiang wrote:


<-- cut for brevity -->



+static int vdcm_idxd_set_msix_trigger(struct vdcm_idxd *vidxd,
+ unsigned int index, unsigned int start,
+ unsigned int count, uint32_t flags,
+ void *data)
+{
+   int i, rc = 0;
+
+   if (count > VIDXD_MAX_MSIX_ENTRIES - 1)
+   count = VIDXD_MAX_MSIX_ENTRIES - 1;
+
+   if (count == 0 && (flags & VFIO_IRQ_SET_DATA_NONE)) {
+   /* Disable all MSIX entries */
+   for (i = 0; i < VIDXD_MAX_MSIX_ENTRIES; i++) {
+   rc = msix_trigger_unregister(vidxd, i);
+   if (rc < 0)
+   return rc;
+   }
+   return 0;
+   }
+
+   for (i = 0; i < count; i++) {
+   if (flags & VFIO_IRQ_SET_DATA_EVENTFD) {
+   u32 fd = *(u32 *)(data + i * sizeof(u32));
+
+   rc = msix_trigger_register(vidxd, fd, i);
+   if (rc < 0)
+   return rc;
+   } else if (flags & VFIO_IRQ_SET_DATA_NONE) {
+   rc = msix_trigger_unregister(vidxd, i);
+   if (rc < 0)
+   return rc;
+   }
+   }
+   return rc;
+}
+
+static int idxd_vdcm_set_irqs(struct vdcm_idxd *vidxd, uint32_t flags,
+ unsigned int index, unsigned int start,
+ unsigned int count, void *data)
+{
+   int (*func)(struct vdcm_idxd *vidxd, unsigned int index,
+   unsigned int start, unsigned int count, uint32_t flags,
+   void *data) = NULL;
+   struct mdev_device *mdev = vidxd->vdev.mdev;
+   struct device *dev = mdev_dev(mdev);
+
+   switch (index) {
+   case VFIO_PCI_INTX_IRQ_INDEX:
+   dev_warn(dev, "intx interrupts not supported.\n");
+   break;
+   case VFIO_PCI_MSI_IRQ_INDEX:
+   dev_dbg(dev, "msi interrupt.\n");
+   switch (flags & VFIO_IRQ_SET_ACTION_TYPE_MASK) {
+   case VFIO_IRQ_SET_ACTION_MASK:
+   case VFIO_IRQ_SET_ACTION_UNMASK:
+   break;
+   case VFIO_IRQ_SET_ACTION_TRIGGER:
+   func = vdcm_idxd_set_msix_trigger;
This would be a good place to insert a common VFIO helper library to
take care of the MSI-X emulation for IMS.


Hi Jason,

So after looking at the code in vfio_pci_intrs.c, I agree that the 
set_irqs code between VFIO_PCI and this driver can be made in common. 
Given that Alex doesn't want a vfio_pci device embedded in the driver, I 
think we'll need some sort of generic VFIO device that can be used from 
the vfio_pci side and vfio_mdev side to pass down in order to have 
common support library functions. Do you have any thoughts on how to do 
this cleanly architecturally? Also, with vfio_pci common split [1] still 
being worked on, do you think we can defer the work on making the 
interrupt setup code common until the vfio_pci split work settles? Thanks!


[1]: https://lore.kernel.org/kvm/20210201162828.5938-1-mgurto...@nvidia.com/




Re: [PATCH v3 1/8] mm: Remove special swap entry functions

2021-03-01 Thread Alistair Popple
On Tuesday, 2 March 2021 4:46:42 AM AEDT Jason Gunthorpe wrote:
> 
> I wish you could come up with a more descriptive word than special
> here
> 
> What I understand is this is true when the swap_offset is a pfn?

Correct, and that points to a better name. Maybe is_pfn_swap_entry()? In which 
case adding a helper as Christoph suggested makes some more sense. Eg: 
pfn_swap_entry_to_page()
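
To make the naming concrete, here is a toy user-space model of a swap entry whose offset is a pfn for some entry types. The encoding, the type list and all toy_ names are assumptions made for illustration; the real swp_entry_t layout and the eventual helpers may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up entry types; only the first one refers to real swap space. */
enum toy_swp_type {
	TOY_SWAP_DEVICE = 0,
	TOY_MIGRATION_READ,
	TOY_MIGRATION_WRITE,
	TOY_DEVICE_PRIVATE,
};

#define TOY_TYPE_BITS	5
#define TOY_OFFSET_BITS	(sizeof(unsigned long) * 8 - TOY_TYPE_BITS)
#define TOY_OFFSET_MASK	((1UL << TOY_OFFSET_BITS) - 1)

typedef struct { unsigned long val; } toy_swp_entry_t;

/* Pack type into the top bits and offset into the remaining low bits. */
static toy_swp_entry_t toy_swp_entry(enum toy_swp_type type, unsigned long offset)
{
	toy_swp_entry_t e = {
		((unsigned long)type << TOY_OFFSET_BITS) | (offset & TOY_OFFSET_MASK)
	};
	return e;
}

static unsigned int toy_swp_type(toy_swp_entry_t e)
{
	return (unsigned int)(e.val >> TOY_OFFSET_BITS);
}

static unsigned long toy_swp_offset(toy_swp_entry_t e)
{
	return e.val & TOY_OFFSET_MASK;
}

/* True when the offset field holds a pfn rather than a swap slot. */
static bool toy_is_pfn_swap_entry(toy_swp_entry_t e)
{
	return toy_swp_type(e) != TOY_SWAP_DEVICE;
}

/* The single place where the pfn is extracted and rules can be checked. */
static unsigned long toy_pfn_swap_entry_to_pfn(toy_swp_entry_t e)
{
	assert(toy_is_pfn_swap_entry(e));
	return toy_swp_offset(e);
}
```

Funnelling every conversion through one helper also gives a natural home for a check such as the page-locked assertion that the removed migration_entry_to_page() carried.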

> > -static inline struct page *migration_entry_to_page(swp_entry_t entry)
> > -{
> > -   struct page *p = pfn_to_page(swp_offset(entry));
> > -   /*
> > -* Any use of migration entries may only occur while the
> > -* corresponding page is locked
> > -*/
> > -   BUG_ON(!PageLocked(compound_head(p)));
> > -   return p;
> 
> And this constraint has been completely lost?

Yes, sorry I should have called that out. I didn't think losing the check was 
a big deal, but I can add some checks to some of the call sites which would 
catch a page being incorrectly unlocked.

> A comment in front of the is_special_entry explaining all the rules
> would help a lot

Will add one.

> Transformation looks fine otherwise

Thanks.

 - Alistair
 
> Jason
> 






linux-next: build failure after merge of the powerpc-fixes tree

2021-03-01 Thread Stephen Rothwell
Hi all,

After merging the powerpc-fixes tree, today's linux-next build (powerpc
allyesconfig) failed like this:

drivers/net/ethernet/ibm/ibmvnic.c:5399:13: error: conflicting types for 'ibmvnic_remove'
 5399 | static void ibmvnic_remove(struct vio_dev *dev)
      |             ^~~~~~~~~~~~~~
drivers/net/ethernet/ibm/ibmvnic.c:81:12: note: previous declaration of 'ibmvnic_remove' was here
   81 | static int ibmvnic_remove(struct vio_dev *);
      |            ^~~~~~~~~~~~~~

Caused by commit

  1bdd1e6f9320 ("vio: make remove callback return void")

I have applied the following patch for today:

From: Stephen Rothwell 
Date: Tue, 2 Mar 2021 11:06:37 +1100
Subject: [PATCH] vio: fix for make remove callback return void

Signed-off-by: Stephen Rothwell 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index eb39318766f6..fe3201ba2034 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -78,7 +78,6 @@ MODULE_LICENSE("GPL");
 MODULE_VERSION(IBMVNIC_DRIVER_VERSION);
 
 static int ibmvnic_version = IBMVNIC_INITIAL_VERSION;
-static int ibmvnic_remove(struct vio_dev *);
 static void release_sub_crqs(struct ibmvnic_adapter *, bool);
 static int ibmvnic_reset_crq(struct ibmvnic_adapter *);
 static int ibmvnic_send_crq_init(struct ibmvnic_adapter *);
-- 
2.30.0

-- 
Cheers,
Stephen Rothwell




[PATCH] c6x: Remove stale symlink 'scripts/dtc/include-prefixes/c6x'

2021-03-01 Thread Victor Erminpour
Remove stale symlink 'scripts/dtc/include-prefixes/c6x'

Signed-off-by: Victor Erminpour 
---
 scripts/dtc/include-prefixes/c6x | 1 -
 1 file changed, 1 deletion(-)
 delete mode 120000 scripts/dtc/include-prefixes/c6x

diff --git a/scripts/dtc/include-prefixes/c6x b/scripts/dtc/include-prefixes/c6x
deleted file mode 120000
index 49ded4cae2be..000000000000
--- a/scripts/dtc/include-prefixes/c6x
+++ /dev/null
@@ -1 +0,0 @@
-../../../arch/c6x/boot/dts
\ No newline at end of file


[PATCH] NFS: fs_context: validate UDP retrans to prevent shift out-of-bounds

2021-03-01 Thread Randy Dunlap
Fix shift out-of-bounds in xprt_calc_majortimeo(). This is caused
by a garbage timeout (retrans) mount option being passed to nfs mount,
in this case from syzkaller.

If the protocol is XPRT_TRANSPORT_UDP, then 'retrans' is a shift
value for a 64-bit long integer, so 'retrans' cannot be >= 64.
If it is >= 64, fail the mount and return an error.

Fixes: 9954bf92c0cd ("NFS: Move mount parameterisation bits into their own file")
Reported-by: syzbot+ba2e91df8f7480941...@syzkaller.appspotmail.com
Reported-by: syzbot+f3a0fa110fd630ab5...@syzkaller.appspotmail.com
Signed-off-by: Randy Dunlap 
Cc: Trond Myklebust 
Cc: Anna Schumaker 
Cc: linux-...@vger.kernel.org
Cc: David Howells 
Cc: Al Viro 
Cc: sta...@vger.kernel.org
---
 fs/nfs/fs_context.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- lnx-512-rc1.orig/fs/nfs/fs_context.c
+++ lnx-512-rc1/fs/nfs/fs_context.c
@@ -974,6 +974,15 @@ static int nfs23_parse_monolithic(struct
   sizeof(mntfh->data) - mntfh->size);
 
/*
+* for proto == XPRT_TRANSPORT_UDP, which is what uses
+* to_exponential, implying shift: limit the shift value
+* to BITS_PER_LONG (majortimeo is unsigned long)
+*/
+   if (!(data->flags & NFS_MOUNT_TCP)) /* this will be UDP */
+   if (data->retrans >= 64) /* shift value is too large */
+   goto out_invalid_data;
+
+   /*
 * Translate to nfs_fs_context, which nfs_fill_super
 * can deal with.
 */
@@ -1073,6 +1082,9 @@ out_no_address:
 
 out_invalid_fh:
return nfs_invalf(fc, "NFS: invalid root filehandle");
+
+out_invalid_data:
+   return nfs_invalf(fc, "NFS: invalid binary mount data");
 }
 
 #if IS_ENABLED(CONFIG_NFS_V4)
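
The check added by the patch guards a left shift whose count comes from user-supplied mount data. A minimal user-space sketch of the same idea follows; the function names are invented, and the real code compares against the literal 64 rather than deriving the width of unsigned long as done here.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Width of unsigned long in bits on the build target. */
#define TOY_ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Shifting by >= the type width is undefined, so reject such counts. */
static bool toy_retrans_valid(unsigned int retrans)
{
	return retrans < TOY_ULONG_BITS;
}

/*
 * Exponential backoff: timeo << retrans, but only after validating the
 * shift count. Returns false and leaves *out untouched on bad input.
 */
static bool toy_major_timeout(unsigned long timeo, unsigned int retrans,
			      unsigned long *out)
{
	assert(out != NULL);
	if (!toy_retrans_valid(retrans))
		return false;
	*out = timeo << retrans;
	return true;
}
```

Rejecting the value where the mount options are parsed, as the patch does, keeps the later shift free of any per-call checks.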


RE: [PATCH 4.19 055/247] soc: aspeed: snoop: Add clock control logic

2021-03-01 Thread Yoo, Jae Hyun
> -Original Message-
> From: Joel Stanley 
> Sent: Monday, March 1, 2021 2:44 PM
> To: Greg Kroah-Hartman ; John Wang
> ; Yoo, Jae Hyun
> 
> Cc: Linux Kernel Mailing List ;
> sta...@vger.kernel.org; Vernon Mauery ;
> Sasha Levin 
> Subject: Re: [PATCH 4.19 055/247] soc: aspeed: snoop: Add clock control logic
> 
> On Mon, 1 Mar 2021 at 16:37, Greg Kroah-Hartman
>  wrote:
> >
> > From: Jae Hyun Yoo 
> >
> > [ Upstream commit 3f94cf15583be554df7aaa651b8ff8e1b68fbe51 ]
> >
> > If LPC SNOOP driver is registered ahead of lpc-ctrl module, LPC SNOOP
> > block will be enabled without heart beating of LCLK until lpc-ctrl
> > enables the LCLK. This issue causes improper handling on host
> > interrupts when the host sends interrupt in that time frame.
> > Then kernel eventually forcibly disables the interrupt with dumping
> > stack and printing a 'nobody cared this irq' message out.
> >
> > To prevent this issue, all LPC sub-nodes should enable LCLK
> > individually so this patch adds clock control logic into the LPC SNOOP
> > driver.
> 
> Jae, John; with this backported do we need to also provide a corresponding
> device tree change for the stable tree, otherwise this driver will no longer
> probe?

Right. The second patch
https://lore.kernel.org/linux-arm-kernel/20201208091748.1920-2-wangzhiqiang...@bytedance.com/
John submitted should be applied to the stable tree too, so that this module is
probed correctly.

> >
> > Fixes: 3772e5da4454 ("drivers/misc: Aspeed LPC snoop output using misc
> > chardev")
> > Signed-off-by: Jae Hyun Yoo 
> > Signed-off-by: Vernon Mauery 
> > Signed-off-by: John Wang 
> > Reviewed-by: Joel Stanley 
> > Link: https://lore.kernel.org/r/20201208091748.1920-1-wangzhiqiang.bj@bytedance.com
> > Signed-off-by: Joel Stanley 
> > Signed-off-by: Sasha Levin 
> > ---
> >  drivers/misc/aspeed-lpc-snoop.c | 30 +++++++++++++++++++++++++++---
> >  1 file changed, 27 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/misc/aspeed-lpc-snoop.c b/drivers/misc/aspeed-lpc-snoop.c
> > index c10be21a1663d..b4a776bf44bc5 100644
> > --- a/drivers/misc/aspeed-lpc-snoop.c
> > +++ b/drivers/misc/aspeed-lpc-snoop.c
> > @@ -15,6 +15,7 @@
> >   */
> >
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -71,6 +72,7 @@ struct aspeed_lpc_snoop_channel {
> >  struct aspeed_lpc_snoop {
> > struct regmap   *regmap;
> > int irq;
> > +   struct clk  *clk;
> > struct aspeed_lpc_snoop_channel chan[NUM_SNOOP_CHANNELS];
> >  };
> >
> > @@ -286,22 +288,42 @@ static int aspeed_lpc_snoop_probe(struct platform_device *pdev)
> > return -ENODEV;
> > }
> >
> > +   lpc_snoop->clk = devm_clk_get(dev, NULL);
> > +   if (IS_ERR(lpc_snoop->clk)) {
> > +   rc = PTR_ERR(lpc_snoop->clk);
> > +   if (rc != -EPROBE_DEFER)
> > +   dev_err(dev, "couldn't get clock\n");
> > +   return rc;
> > +   }
> > +   rc = clk_prepare_enable(lpc_snoop->clk);
> > +   if (rc) {
> > +   dev_err(dev, "couldn't enable clock\n");
> > +   return rc;
> > +   }
> > +
> > rc = aspeed_lpc_snoop_config_irq(lpc_snoop, pdev);
> > if (rc)
> > -   return rc;
> > +   goto err;
> >
> > rc = aspeed_lpc_enable_snoop(lpc_snoop, dev, 0, port);
> > if (rc)
> > -   return rc;
> > +   goto err;
> >
> > /* Configuration of 2nd snoop channel port is optional */
> > if (of_property_read_u32_index(dev->of_node, "snoop-ports",
> >    1, &port) == 0) {
> > rc = aspeed_lpc_enable_snoop(lpc_snoop, dev, 1, port);
> > -   if (rc)
> > +   if (rc) {
> > aspeed_lpc_disable_snoop(lpc_snoop, 0);
> > +   goto err;
> > +   }
> > }
> >
> > +   return 0;
> > +
> > +err:
> > +   clk_disable_unprepare(lpc_snoop->clk);
> > +
> > return rc;
> >  }
> >
> > @@ -313,6 +335,8 @@ static int aspeed_lpc_snoop_remove(struct platform_device *pdev)
> > aspeed_lpc_disable_snoop(lpc_snoop, 0);
> > aspeed_lpc_disable_snoop(lpc_snoop, 1);
> >
> > +   clk_disable_unprepare(lpc_snoop->clk);
> > +
> > return 0;
> >  }
> >
> > --
> > 2.27.0
> >
> >
> >


Upper bound mode for kernel timers

2021-03-01 Thread Josh Poimboeuf
Hi Thomas,

As discussed on IRC:

We had a report of a regression in the TCP keepalive timer.  The user
had a 3600s keepalive timer for preventing firewall disconnects (on a
3650s interval).  They observed keepalive timers coming in up to four
minutes late, causing unexpected disconnects.

The regression was observed to have come from the timer wheel rewrite
from almost five years ago:

  500462a9de65 ("timers: Switch to a non-cascading wheel")

As you mentioned, with a HZ of 1000, the granularity for a one-hour
timer is four minutes, which matches the seen behavior.
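
The four-minute figure can be derived with a little arithmetic. The sketch below is a simplified model of the timer wheel levels, assuming 64 buckets per level and a factor-of-8 granularity step between levels, as described in the comments of kernel/time/timer.c; the exact level boundaries in the kernel differ slightly, and the function names are invented.

```c
#include <assert.h>

#define LVL_CLK_SHIFT	3	/* each level is 8x coarser than the previous */
#define LVL_BITS	6	/* 64 buckets per level */
#define LVL_DEPTH	9	/* number of levels assumed for HZ > 100 */

/* Level n holds deltas below 2^(LVL_BITS + n * LVL_CLK_SHIFT) jiffies. */
static unsigned int wheel_level(unsigned long delta_jiffies)
{
	unsigned int lvl = 0;

	while (lvl < LVL_DEPTH - 1 &&
	       (delta_jiffies >> (LVL_BITS + lvl * LVL_CLK_SHIFT)) != 0)
		lvl++;
	return lvl;
}

/* Bucket width of a level, in jiffies. */
static unsigned long wheel_granularity(unsigned int lvl)
{
	assert(lvl < LVL_DEPTH);
	return 1UL << (lvl * LVL_CLK_SHIFT);
}
```

With HZ=1000 a 3600 s timeout is 3,600,000 jiffies, which lands in level 6 of this model; its bucket width is 262,144 jiffies, i.e. about 262 s or roughly 4.4 minutes.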

To "fix" it, the user can just lower the timeout value by four minutes,
but that's a workaround, because the keepalive timer isn't working as
advertised.

One potential fix would be an "upper bound mode" in the timer, i.e. give
the user a way to specify that the given 'expires' value is an upper
bound rather than a lower bound.

As you graciously offered, if you or Anna-Maria can implement that new
interface, we (Artem or I) can write up a patch to use it for the
keepalive timer.

-- 
Josh



Re: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-01 Thread Jason Gunthorpe
On Fri, Feb 26, 2021 at 06:18:29PM +1100, Alistair Popple wrote:

> +/**
> + * make_device_exclusive_range() - Mark a range for exclusive use by a device
> + * @mm: mm_struct of associated target process
> + * @start: start of the region to mark for exclusive device access
> + * @end: end address of region
> + * @pages: returns the pages which were successfully marked for exclusive access
> + *
> + * Returns: number of pages successfully marked for exclusive access
> + *
> + * This function finds the ptes mapping page(s) to the given address range and
> + * replaces them with special swap entries preventing userspace CPU access. On
> + * fault these entries are replaced with the original mapping after calling MMU
> + * notifiers.
> + */
> +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> + unsigned long end, struct page **pages)
> +{
> + long npages = (end - start) >> PAGE_SHIFT;
> + long i;
> +
> + npages = get_user_pages_remote(mm, start, npages,
> +FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
> +pages, NULL, NULL);
> + for (i = 0; i < npages; i++) {
> + if (!trylock_page(pages[i])) {
> + put_page(pages[i]);
> + pages[i] = NULL;
> + continue;
> + }
> +
> + if (!try_to_protect(pages[i])) {

Isn't this racy? get_user_pages returns the ptes at an instant in
time, they could have already been changed to something else?

I would think you'd want to switch to the swap entry atomically under
the PTLs?

Jason


[PATCH v2] docs: filesystem: Update smaps vm flag list to latest

2021-03-01 Thread Peter Xu
We've missed some documentation updates when adding new VM_* flags.  Add the
missing pieces so they'll be in sync now.

Signed-off-by: Peter Xu 
---
v2:
- rebase
---
 Documentation/filesystems/proc.rst | 4 
 1 file changed, 4 insertions(+)

diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 48fbfc336ebf..81bfe3c800cc 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -540,7 +540,9 @@ encoded manner. The codes are the following:
     ac    area is accountable
     nr    swap space is not reserved for the area
     ht    area uses huge tlb pages
+    sf    synchronous page fault
     ar    architecture specific flag
+    wf    wipe on fork
     dd    do not include area into core dump
     sd    soft dirty flag
     mm    mixed map area
@@ -549,6 +551,8 @@ encoded manner. The codes are the following:
     mg    mergable advise flag
     bt    arm64 BTI guarded page
     mt    arm64 MTE allocation tags are enabled
+    um    userfaultfd missing tracking
+    uw    userfaultfd wr-protect tracking
 =====
 
 Note that there is no guarantee that every flag and associated mnemonic will
-- 
2.26.2



Re: [PATCH] docs: filesystem: Update smaps vm flag list to latest

2021-03-01 Thread Peter Xu
On Mon, Mar 01, 2021 at 03:17:13PM -0700, Jonathan Corbet wrote:
> Peter Xu  writes:
> 
> > We've missed some documentation updates when adding new VM_* flags.  Add the
> > missing pieces so they'll be in sync now.
> >
> > Signed-off-by: Peter Xu 
> > ---
> >  Documentation/filesystems/proc.rst | 5 +
> >  1 file changed, 5 insertions(+)
> 
> So this patch doesn't apply; what version of the kernel did you generate
> it against?  Could you redo against current kernels, please?

Sure.  "mt" just got added, hence conflicted, but the rest are still missing.
Reposting.  Thanks,

-- 
Peter Xu



Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

2021-03-01 Thread Sami Tolvanen
On Mon, Mar 1, 2021 at 3:45 PM Steven Rostedt  wrote:
>
> On Mon, 1 Mar 2021 14:14:51 -0800
> Sami Tolvanen  wrote:
>
> > Basically, the problem is that ftrace_replace_code() expects to find
> > ideal_nops[NOP_ATOMIC5] here, which in this case is 66:66:66:66:90,
> > while objtool has replaced the __fentry__ call with 0f:1f:44:00:00.
> >
> > As ideal_nops changes depending on kernel config and hardware, when
> > CC_USING_NOP_MCOUNT is defined we could either change
> > ftrace_nop_replace() to always use P6_NOP5, or skip
> > ftrace_verify_code() in ftrace_replace_code() for
> > FTRACE_UPDATE_MAKE_CALL.
>
> So I hacked up the code to get -mnop-record to work on x86, and checked the
> vmlinux and it gives me:
>
> 81bc6120 :
> 81bc6120:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
> 81bc6125:   55  push   %rbp
> 81bc6126:   65 48 8b 2c 25 c0 7d 01 00  mov    %gs:0x17dc0,%rbp
> 81bc612b: R_X86_64_32S  current_task
> 81bc612f:   53  push   %rbx
> 81bc6130:   48 8b 45 18 mov    0x18(%rbp),%rax
>
>
> Which is the 0f:1f:44:00:00, and it works fine for me.
>
> Now, that could be because the ideal_nops[NOP_ATOMIC5] is the same, which
> would explain this.
>
> No, we should *not* change ftrace_nop_replace() to always use any P6_NOP5,
> as there was a reason we did this. Because not all nops are the same, and
> this gets called for *every* function that is traced.
>
> No, we should not skip ftrace_verify_code() *ever*. (/me was just
> referencing on twitter the scenario where ftrace bricked e1000e cards).
>
> This is probably why I never was much for the compiler conversion into nops,
> because it may choose the wrong one for the architecture.

Sure, makes sense. Should we just skip the conversion in objtool then
and let the kernel deal with it?

> What we could do, is if the nop chosen by the compiler is not the ideal
> nop, to go back and modify all the nops added by the compiler to the ideal
> one, which would keep it using the most efficient one.
>
> Or, add something like this:
>

[...]

> ret = ftrace_verify_code(rec->ip, old);
> +
> +   if (__is_defined(CC_USING_NOP_MCOUNT) && ret && old_nop) {
> +   /* Compiler could have put in P6_NOP5 */
> +   old = P6_NOP5;
> +   ret = ftrace_verify_code(rec->ip, old);
> +   }
> +

Wouldn't that still hit WARN(1) in the initial ftrace_verify_code()
call if ideal_nops doesn't match?

Sami


[PATCH v2 1/5] userfaultfd: support minor fault handling for shmem

2021-03-01 Thread Axel Rasmussen
Modify the userfaultfd register API to allow registering shmem VMAs in
minor mode. Modify the shmem mcopy implementation to support
UFFDIO_CONTINUE in order to resolve such faults.

Combine the shmem mcopy handler functions into a single
shmem_mcopy_atomic_pte, which takes a mode parameter. This matches how
the hugetlbfs implementation is structured, and lets us remove a good
chunk of boilerplate.

Signed-off-by: Axel Rasmussen 
---
 fs/userfaultfd.c |  6 +--
 include/linux/shmem_fs.h | 26 -
 include/uapi/linux/userfaultfd.h |  4 +-
 mm/memory.c  |  8 +--
 mm/shmem.c   | 92 +++-
 mm/userfaultfd.c | 27 +-
 6 files changed, 79 insertions(+), 84 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 14f92285d04f..9f3b8684cf3c 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1267,8 +1267,7 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma,
}
 
if (vm_flags & VM_UFFD_MINOR) {
-   /* FIXME: Add minor fault interception for shmem. */
-   if (!is_vm_hugetlb_page(vma))
+   if (!(is_vm_hugetlb_page(vma) || vma_is_shmem(vma)))
return false;
}
 
@@ -1941,7 +1940,8 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
/* report all available features and ioctls to userland */
uffdio_api.features = UFFD_API_FEATURES;
 #ifndef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
-   uffdio_api.features &= ~UFFD_FEATURE_MINOR_HUGETLBFS;
+   uffdio_api.features &=
+   ~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM);
 #endif
uffdio_api.ioctls = UFFD_API_IOCTLS;
ret = -EFAULT;
diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index d82b6f396588..f0919c3722e7 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* inode in-kernel data */
 
@@ -122,21 +123,16 @@ static inline bool shmem_file(struct file *file)
 extern bool shmem_charge(struct inode *inode, long pages);
 extern void shmem_uncharge(struct inode *inode, long pages);
 
+#ifdef CONFIG_USERFAULTFD
 #ifdef CONFIG_SHMEM
-extern int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
- struct vm_area_struct *dst_vma,
- unsigned long dst_addr,
- unsigned long src_addr,
- struct page **pagep);
-extern int shmem_mfill_zeropage_pte(struct mm_struct *dst_mm,
-   pmd_t *dst_pmd,
-   struct vm_area_struct *dst_vma,
-   unsigned long dst_addr);
-#else
-#define shmem_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma, dst_addr, \
-  src_addr, pagep)({ BUG(); 0; })
-#define shmem_mfill_zeropage_pte(dst_mm, dst_pmd, dst_vma, \
-dst_addr)  ({ BUG(); 0; })
-#endif
+int shmem_mcopy_atomic_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
+  struct vm_area_struct *dst_vma,
+  unsigned long dst_addr, unsigned long src_addr,
+  enum mcopy_atomic_mode mode, struct page **pagep);
+#else /* !CONFIG_SHMEM */
+#define shmem_mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr, \
+  src_addr, mode, pagep)({ BUG(); 0; })
+#endif /* CONFIG_SHMEM */
+#endif /* CONFIG_USERFAULTFD */
 
 #endif
diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
index bafbeb1a2624..47d9790d863d 100644
--- a/include/uapi/linux/userfaultfd.h
+++ b/include/uapi/linux/userfaultfd.h
@@ -31,7 +31,8 @@
   UFFD_FEATURE_MISSING_SHMEM | \
   UFFD_FEATURE_SIGBUS |\
   UFFD_FEATURE_THREAD_ID | \
-  UFFD_FEATURE_MINOR_HUGETLBFS)
+  UFFD_FEATURE_MINOR_HUGETLBFS |   \
+  UFFD_FEATURE_MINOR_SHMEM)
 #define UFFD_API_IOCTLS\
((__u64)1 << _UFFDIO_REGISTER | \
 (__u64)1 << _UFFDIO_UNREGISTER |   \
@@ -196,6 +197,7 @@ struct uffdio_api {
 #define UFFD_FEATURE_SIGBUS(1<<7)
 #define UFFD_FEATURE_THREAD_ID (1<<8)
 #define UFFD_FEATURE_MINOR_HUGETLBFS   (1<<9)
+#define UFFD_FEATURE_MINOR_SHMEM   (1<<10)
__u64 features;
 
__u64 ioctls;
diff --git a/mm/memory.c b/mm/memory.c
index c8e357627318..a1e5ff55027e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3929,9 +3929,11 @@ static vm_fault_t do_read_fault(struct vm_fault *vmf)
 * something).
 */
if (vma->vm_ops->map_pages && fault_around_bytes >> 

[PATCH v2 5/5] userfaultfd/selftests: exercise minor fault handling shmem support

2021-03-01 Thread Axel Rasmussen
Enable test_uffdio_minor for test_type == TEST_SHMEM, and modify the
test slightly to pass in / check for the right feature flags.
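
The feature check the test performs can be sketched on its own. This is
a minimal illustration, not part of the patch: the two flag values are
copied from the uapi hunk in patch 1/5, while the helper name is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the uapi hunk in patch 1/5 of this series. */
#define UFFD_FEATURE_MINOR_HUGETLBFS (1ULL << 9)
#define UFFD_FEATURE_MINOR_SHMEM     (1ULL << 10)

/* Hypothetical helper: true only if every requested bit was granted. */
static bool uffd_features_supported(uint64_t granted, uint64_t requested)
{
	return (granted & requested) == requested;
}
```

Comparing against the full requested mask, rather than testing for any
overlap, means a kernel that only reports the hugetlbfs bit is treated
as lacking shmem support and the test is skipped.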

Signed-off-by: Axel Rasmussen 
---
 tools/testing/selftests/vm/userfaultfd.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 5183ddb3080d..f31e9a4edc55 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -1410,7 +1410,7 @@ static int userfaultfd_minor_test(void)
void *expected_page;
char c;
struct uffd_stats stats = { 0 };
-   uint64_t features = UFFD_FEATURE_MINOR_HUGETLBFS;
+   uint64_t req_features, features_out;
 
if (!test_uffdio_minor)
return 0;
@@ -1418,10 +1418,18 @@ static int userfaultfd_minor_test(void)
printf("testing minor faults: ");
fflush(stdout);
 
-   if (uffd_test_ctx_clear() || uffd_test_ctx_init_ext(&features))
+   if (test_type == TEST_HUGETLB)
+   req_features = UFFD_FEATURE_MINOR_HUGETLBFS;
+   else if (test_type == TEST_SHMEM)
+   req_features = UFFD_FEATURE_MINOR_SHMEM;
+   else
+   return 1;
+
+   features_out = req_features;
+   if (uffd_test_ctx_clear() || uffd_test_ctx_init_ext(&features_out))
return 1;
-   /* If kernel reports the feature isn't supported, skip the test. */
-   if (!(features & UFFD_FEATURE_MINOR_HUGETLBFS)) {
+   /* If kernel reports required features aren't supported, skip test. */
+   if ((features_out & req_features) != req_features) {
printf("skipping test due to lack of feature support\n");
fflush(stdout);
return 0;
@@ -1431,7 +1439,7 @@ static int userfaultfd_minor_test(void)
uffdio_register.range.len = nr_pages * page_size;
uffdio_register.mode = UFFDIO_REGISTER_MODE_MINOR;
if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register)) {
-   fprintf(stderr, "register failure\n");
+   perror("register failure");
exit(1);
}
 
@@ -1695,6 +1703,7 @@ static void set_test_type(const char *type)
map_shared = true;
test_type = TEST_SHMEM;
uffd_test_ops = &shmem_uffd_test_ops;
+   test_uffdio_minor = true;
} else {
fprintf(stderr, "Unknown test type: %s\n", type); exit(1);
}
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 3/5] userfaultfd/selftests: create alias mappings in the shmem test

2021-03-01 Thread Axel Rasmussen
Previously, we just allocated two shm areas: area_src and area_dst. With
this commit, change this so we also allocate area_src_alias, and
area_dst_alias.

area_*_alias and area_* (respectively) point to the same underlying
physical pages, but are different VMAs. In a future commit in this
series, we'll leverage this setup to exercise minor fault handling
support for shmem, just like we do in the hugetlb_shared test.
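
To illustrate what "same underlying physical pages, but different VMAs"
means in practice, here is a small standalone sketch. It is not the
selftest code: the function name and the one-page size are chosen for
the example only. It maps one memfd range twice and checks that a write
through the first mapping is visible through the second.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Illustrative only: map the same memfd range twice and check that a
 * write through one mapping shows up in the other.
 * Returns 0 on success, -1 on any failure.
 */
static int alias_mapping_demo(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	char *area, *alias;
	int fd, ret = -1;

	fd = memfd_create("alias-demo", 0);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len))
		goto out_close;

	area = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (area == MAP_FAILED)
		goto out_close;
	alias = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (alias == MAP_FAILED)
		goto out_unmap;

	/* Two live mappings never share an address: these are distinct VMAs. */
	assert(area != alias);

	/* Same file offset, same backing pages: the write is visible via alias. */
	strcpy(area, "hello");
	if (strcmp(alias, "hello") == 0)
		ret = 0;

	munmap(alias, len);
out_unmap:
	munmap(area, len);
out_close:
	close(fd);
	return ret;
}
```

The selftest does the same thing at a larger scale, using shm_fd and a
per-area offset so that area_src and area_dst each get their own alias.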

Signed-off-by: Axel Rasmussen 
---
 tools/testing/selftests/vm/userfaultfd.c | 29 +---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 859398efb4fe..4a18590fe0f8 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -298,8 +298,9 @@ static int shmem_release_pages(char *rel_area)
 
 static void shmem_allocate_area(void **alloc_area)
 {
-   unsigned long offset =
-   alloc_area == (void **)&area_src ? 0 : nr_pages * page_size;
+   void *area_alias = NULL;
+   bool is_src = alloc_area == (void **)&area_src;
+   unsigned long offset = is_src ? 0 : nr_pages * page_size;
 
*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
   MAP_SHARED, shm_fd, offset);
@@ -308,12 +309,34 @@ static void shmem_allocate_area(void **alloc_area)
goto fail;
}
 
+   area_alias = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
+ MAP_SHARED, shm_fd, offset);
+   if (area_alias == MAP_FAILED) {
+   perror("mmap of memfd alias failed");
+   goto fail_munmap;
+   }
+
+   if (is_src)
+   area_src_alias = area_alias;
+   else
+   area_dst_alias = area_alias;
+
return;
 
+fail_munmap:
+   if (munmap(*alloc_area, nr_pages * page_size) < 0) {
+   perror("munmap of memfd failed\n");
+   exit(1);
+   }
 fail:
*alloc_area = NULL;
 }
 
+static void shmem_alias_mapping(__u64 *start, size_t len, unsigned long offset)
+{
+   *start = (unsigned long)area_dst_alias + offset;
+}
+
 struct uffd_test_ops {
unsigned long expected_ioctls;
void (*allocate_area)(void **alloc_area);
@@ -341,7 +364,7 @@ static struct uffd_test_ops shmem_uffd_test_ops = {
.expected_ioctls = SHMEM_EXPECTED_IOCTLS,
.allocate_area  = shmem_allocate_area,
.release_pages  = shmem_release_pages,
-   .alias_mapping = noop_alias_mapping,
+   .alias_mapping = shmem_alias_mapping,
 };
 
 static struct uffd_test_ops hugetlb_uffd_test_ops = {
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 2/5] userfaultfd/selftests: use memfd_create for shmem test type

2021-03-01 Thread Axel Rasmussen
This is a preparatory commit. In the future, we want to be able to set up
alias mappings for area_src and area_dst in the shmem test, like we do
in the hugetlb_shared test. With a VMA obtained via
mmap(MAP_ANONYMOUS | MAP_SHARED), it isn't clear how to do this.

So, mmap() with an fd, so we can create alias mappings. Use memfd_create
instead of actually passing in a tmpfs path like hugetlb does, since
it's more convenient / simpler to run, and works just as well.

Future commits will:

1. Set up the alias mappings.
2. Extend our tests to actually take advantage of this, to test new
   userfaultfd behavior being introduced in this series.

Also, a small fix in the area we're changing: when the hugetlb setup
fails in main(), pass in the right argv[] so we actually print out the
hugetlb file path.

Signed-off-by: Axel Rasmussen 
---
 tools/testing/selftests/vm/userfaultfd.c | 35 
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index f5ab5e0312e7..859398efb4fe 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -85,6 +85,7 @@ static bool test_uffdio_wp = false;
 static bool test_uffdio_minor = false;
 
 static bool map_shared;
+static int shm_fd;
 static int huge_fd;
 static char *huge_fd_off0;
 static unsigned long long *count_verify;
@@ -297,12 +298,20 @@ static int shmem_release_pages(char *rel_area)
 
 static void shmem_allocate_area(void **alloc_area)
 {
+   unsigned long offset =
+   alloc_area == (void **)&area_src ? 0 : nr_pages * page_size;
+
*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
-  MAP_ANONYMOUS | MAP_SHARED, -1, 0);
+  MAP_SHARED, shm_fd, offset);
if (*alloc_area == MAP_FAILED) {
-   fprintf(stderr, "shared memory mmap failed\n");
-   *alloc_area = NULL;
+   perror("mmap of memfd failed");
+   goto fail;
}
+
+   return;
+
+fail:
+   *alloc_area = NULL;
 }
 
 struct uffd_test_ops {
@@ -1672,15 +1681,31 @@ int main(int argc, char **argv)
usage();
huge_fd = open(argv[4], O_CREAT | O_RDWR, 0755);
if (huge_fd < 0) {
-   fprintf(stderr, "Open of %s failed", argv[3]);
+   fprintf(stderr, "Open of %s failed", argv[4]);
perror("open");
exit(1);
}
if (ftruncate(huge_fd, 0)) {
-   fprintf(stderr, "ftruncate %s to size 0 failed", argv[3]);
+   fprintf(stderr, "ftruncate %s to size 0 failed", argv[4]);
perror("ftruncate");
exit(1);
}
+   } else if (test_type == TEST_SHMEM) {
+   shm_fd = memfd_create(argv[0], 0);
+   if (shm_fd < 0) {
+   perror("memfd_create");
+   exit(1);
+   }
+   if (ftruncate(shm_fd, nr_pages * page_size * 2)) {
+   perror("ftruncate");
+   exit(1);
+   }
+   if (fallocate(shm_fd,
+ FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 0,
+ nr_pages * page_size * 2)) {
+   perror("fallocate");
+   exit(1);
+   }
}
printf("nr_pages: %lu, nr_pages_per_cpu: %lu\n",
   nr_pages, nr_pages_per_cpu);
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v2 4/5] userfaultfd/selftests: reinitialize test context in each test

2021-03-01 Thread Axel Rasmussen
Currently, the context (fds, mmap-ed areas, etc.) is global. Each test
mutates this state in some way, in some cases really "clobbering it"
(e.g., the events test mremap-ing area_dst over the top of area_src, or
the minor faults tests overwriting the count_verify values in the test
areas). We run the tests in a particular order, and each test is careful
to make the right assumptions about its starting state, etc.

But, this is fragile. It's better for a test's success or failure to not
depend on what some other prior test case did to the global state.

To that end, clear and reinitialize the test context at the start of
each test case, so whatever prior test cases did doesn't affect future
tests.

This is particularly relevant to this series because the events test's
mremap of area_dst screws up assumptions the minor fault test was
relying on. This wasn't a problem for hugetlb, as we don't mremap in
that case.

Signed-off-by: Axel Rasmussen 
---
 tools/testing/selftests/vm/userfaultfd.c | 249 ++-
 1 file changed, 151 insertions(+), 98 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 4a18590fe0f8..5183ddb3080d 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -89,7 +89,8 @@ static int shm_fd;
 static int huge_fd;
 static char *huge_fd_off0;
 static unsigned long long *count_verify;
-static int uffd, uffd_flags, finished, *pipefd;
+static int uffd = -1;
+static int uffd_flags, finished, *pipefd;
 static char *area_src, *area_src_alias, *area_dst, *area_dst_alias;
 static char *zeropage;
 pthread_attr_t attr;
@@ -376,6 +377,146 @@ static struct uffd_test_ops hugetlb_uffd_test_ops = {
 
 static struct uffd_test_ops *uffd_test_ops;
 
+static int userfaultfd_open(uint64_t *features)
+{
+   struct uffdio_api uffdio_api;
+
+   uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+   if (uffd < 0) {
+   fprintf(stderr,
+   "userfaultfd syscall not available in this kernel\n");
+   return 1;
+   }
+   uffd_flags = fcntl(uffd, F_GETFD, NULL);
+
+   uffdio_api.api = UFFD_API;
+   uffdio_api.features = *features;
+   if (ioctl(uffd, UFFDIO_API, &uffdio_api)) {
+   fprintf(stderr, "UFFDIO_API failed.\nPlease make sure to "
+   "run with either root or ptrace capability.\n");
+   return 1;
+   }
+   if (uffdio_api.api != UFFD_API) {
+   fprintf(stderr, "UFFDIO_API error: %" PRIu64 "\n",
+   (uint64_t)uffdio_api.api);
+   return 1;
+   }
+
+   *features = uffdio_api.features;
+   return 0;
+}
+
+static int uffd_test_ctx_init_ext(uint64_t *features)
+{
+   unsigned long nr, cpu;
+
+   uffd_test_ops->allocate_area((void **)&area_src);
+   if (!area_src)
+   return 1;
+   uffd_test_ops->allocate_area((void **)&area_dst);
+   if (!area_dst)
+   return 1;
+
+   if (uffd_test_ops->release_pages(area_src))
+   return 1;
+
+   if (uffd_test_ops->release_pages(area_dst))
+   return 1;
+
+   if (userfaultfd_open(features))
+   return 1;
+
+   count_verify = malloc(nr_pages * sizeof(unsigned long long));
+   if (!count_verify) {
+   perror("count_verify");
+   return 1;
+   }
+
+   for (nr = 0; nr < nr_pages; nr++) {
+   *area_mutex(area_src, nr) =
+   (pthread_mutex_t)PTHREAD_MUTEX_INITIALIZER;
+   count_verify[nr] = *area_count(area_src, nr) = 1;
+   /*
+* In the transition between 255 to 256, powerpc will
+* read out of order in my_bcmp and see both bytes as
+* zero, so leave a placeholder below always non-zero
+* after the count, to avoid my_bcmp to trigger false
+* positives.
+*/
+   *(area_count(area_src, nr) + 1) = 1;
+   }
+
+   pipefd = malloc(sizeof(int) * nr_cpus * 2);
+   if (!pipefd) {
+   perror("pipefd");
+   return 1;
+   }
+   for (cpu = 0; cpu < nr_cpus; cpu++) {
+   if (pipe2(&pipefd[cpu * 2], O_CLOEXEC | O_NONBLOCK)) {
+   perror("pipe");
+   return 1;
+   }
+   }
+
+   return 0;
+}
+
+static inline int uffd_test_ctx_init(uint64_t features)
+{
+   return uffd_test_ctx_init_ext(&features);
+}
+
+static inline int munmap_area(void **area)
+{
+   if (*area) {
+   if (munmap(*area, nr_pages * page_size)) {
+   perror("munmap");
+   return 1;
+   }
+   }
+
+   *area = NULL;
+   return 0;
+}
+
+static int uffd_test_ctx_clear(void)
+{
+   int ret = 0;
+   size_t i;
+
+   if (pipefd) {
+   for (i = 0; i < nr_cpus * 2; 

[PATCH v2 0/5] userfaultfd: support minor fault handling for shmem

2021-03-01 Thread Axel Rasmussen
Base
====

This series is based on top of my series which adds minor fault handling for
hugetlbfs [1]. (And, therefore, it is based on 5.12-rc1 and Peter Xu's series
for disabling huge pmd sharing as well.)

[1] 
https://lore.kernel.org/linux-fsdevel/20210301222728.176417-1-axelrasmus...@google.com/T/#t

Changelog
=========

v1->v2:
- For UFFDIO_CONTINUE, don't mess with page flags. Just use find_lock_page to
  get a locked page from the page cache, instead of doing __SetPageLocked.
  This fixes a VM_BUG_ON v1 hit when handling minor faults for THP-backed
  shmem (a tmpfs mounted with huge=always).

Overview
========

See my original series linked above for a detailed overview of minor fault
handling in general. The feature in this series works exactly like the
hugetlbfs version (from userspace's perspective).

I'm sending this as a separate series because:

- The original minor fault handling series has a full set of R-Bs, and seems
  close to being merged. So, it seems reasonable to start looking at this next
  step, which extends the basic functionality.

- shmem is different enough that this series may require some additional work
  before it's ready, and I don't want to delay the original series
  unnecessarily by bundling them together.

Use Case
========

In some cases it is useful to have VM memory backed by tmpfs instead of
hugetlbfs. So, this feature will be used to support the same VM live migration
use case described in my original series.

Additionally, Android folks (Lokesh Gidra) hope to
optimize the Android Runtime garbage collector using this feature:

"The plan is to use userfaultfd for concurrently compacting the heap. With
this feature, the heap can be shared-mapped at another location where the
GC-thread(s) could continue the compaction operation without the need to
invoke userfault ioctl(UFFDIO_COPY) each time. OTOH, if and when Java threads
get faults on the heap, UFFDIO_CONTINUE can be used to resume execution.
Furthermore, this feature enables updating references in the 'non-moving'
portion of the heap efficiently. Without this feature, unnecessary page
copying (ioctl(UFFDIO_COPY)) would be required."

Axel Rasmussen (5):
  userfaultfd: support minor fault handling for shmem
  userfaultfd/selftests: use memfd_create for shmem test type
  userfaultfd/selftests: create alias mappings in the shmem test
  userfaultfd/selftests: reinitialize test context in each test
  userfaultfd/selftests: exercise minor fault handling shmem support

 fs/userfaultfd.c |   6 +-
 include/linux/shmem_fs.h |  26 +-
 include/uapi/linux/userfaultfd.h |   4 +-
 mm/memory.c  |   8 +-
 mm/shmem.c   |  92 +++
 mm/userfaultfd.c |  27 +-
 tools/testing/selftests/vm/userfaultfd.c | 322 +++
 7 files changed, 295 insertions(+), 190 deletions(-)

--
2.30.1.766.gb4fecdf3b7-goog



Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

2021-03-01 Thread Steven Rostedt
On Mon, 1 Mar 2021 14:14:51 -0800
Sami Tolvanen  wrote:

> Basically, the problem is that ftrace_replace_code() expects to find
> ideal_nops[NOP_ATOMIC5] here, which in this case is 66:66:66:66:90,
> while objtool has replaced the __fentry__ call with 0f:1f:44:00:00.
> 
> As ideal_nops changes depending on kernel config and hardware, when
> CC_USING_NOP_MCOUNT is defined we could either change
> ftrace_nop_replace() to always use P6_NOP5, or skip
> ftrace_verify_code() in ftrace_replace_code() for
> FTRACE_UPDATE_MAKE_CALL.

So I hacked up the code to get -mnop-record to work on x86, and checked the
vmlinux and it gives me:

81bc6120 :
81bc6120:   0f 1f 44 00 00  nopl   0x0(%rax,%rax,1)
81bc6125:   55  push   %rbp
81bc6126:   65 48 8b 2c 25 c0 7d 01 00  mov    %gs:0x17dc0,%rbp
81bc612b: R_X86_64_32S  current_task
81bc612f:   53  push   %rbx
81bc6130:   48 8b 45 18 mov    0x18(%rbp),%rax


Which is the 0f:1f:44:00:00, and it works fine for me.

Now, that could be because the ideal_nops[NOP_ATOMIC5] is the same, which
would explain this.

No, we should *not* change ftrace_nop_replace() to always use any P6_NOP5,
as there was a reason we did this. Because not all nops are the same, and
this gets called for *every* function that is traced.

No, we should not skip ftrace_verify_code() *ever*. (/me was just
referencing on twitter the scenario where ftrace bricked e1000e cards).

This is probably why I never was much for the compiler conversion into nops,
because it may choose the wrong one for the architecture.

What we could do, is if the nop chosen by the compiler is not the ideal
nop, to go back and modify all the nops added by the compiler to the ideal
one, which would keep it using the most efficient one.

Or, add something like this:

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 7edbd5ee5ed4..aef3ea53f931 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -152,12 +152,19 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 {
unsigned long ip = rec->ip;
const char *new, *old;
+   int ret;
 
old = ftrace_nop_replace();
new = ftrace_call_replace(ip, addr);
 
/* Should only be called when module is loaded */
-   return ftrace_modify_code_direct(rec->ip, old, new);
+   ret = ftrace_modify_code_direct(rec->ip, old, new);
+   if (__is_defined(CC_USING_NOP_MCOUNT) && ret) {
+   /* Compiler could have put in P6_NOP5 */
+   old = P6_NOP5;
+   ret = ftrace_modify_code_direct(rec->ip, old, new);
+   }
+   return ret;
 }
 
 /*
@@ -199,6 +206,8 @@ void ftrace_replace_code(int enable)
int ret;
 
for_ftrace_rec_iter(iter) {
+   bool old_nop = false;
+
rec = ftrace_rec_iter_record(iter);
 
switch (ftrace_test_record(rec, enable)) {
@@ -208,6 +217,7 @@ void ftrace_replace_code(int enable)
 
case FTRACE_UPDATE_MAKE_CALL:
old = ftrace_nop_replace();
+   old_nop = true;
break;
 
case FTRACE_UPDATE_MODIFY_CALL:
@@ -217,6 +227,13 @@ void ftrace_replace_code(int enable)
}
 
ret = ftrace_verify_code(rec->ip, old);
+
+   if (__is_defined(CC_USING_NOP_MCOUNT) && ret && old_nop) {
+   /* Compiler could have put in P6_NOP5 */
+   old = P6_NOP5;
+   ret = ftrace_verify_code(rec->ip, old);
+   }
+
if (ret) {
ftrace_bug(ret, rec);
return;
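
The fallback idea in the diff above (accept the ideal nop, and if that
does not match, also accept the P6 nop the compiler may have emitted)
can be shown in isolation. This is a userspace sketch only: the helper
and array names are hypothetical, and the two byte sequences are the
ones quoted earlier in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NOP5_LEN 5

/* Byte sequences quoted earlier in this thread. */
static const unsigned char reported_ideal_nop5[NOP5_LEN] = {
	0x66, 0x66, 0x66, 0x66, 0x90
};
static const unsigned char p6_nop5[NOP5_LEN] = {
	0x0f, 0x1f, 0x44, 0x00, 0x00
};

/*
 * Hypothetical helper, not a kernel function: accept the call site if it
 * holds either the ideal nop or the P6 nop the compiler may have emitted.
 */
static bool site_is_known_nop(const unsigned char *site,
			      const unsigned char *ideal)
{
	if (memcmp(site, ideal, NOP5_LEN) == 0)
		return true;
	return memcmp(site, p6_nop5, NOP5_LEN) == 0;
}
```

Anything that is neither of the two expected sequences is still
rejected, so the verification step itself is kept.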


-- Steve


Re: [PATCH v2] KVM: x86: Revise guest_fpu xcomp_bv field

2021-03-01 Thread Sean Christopherson
On Thu, Feb 25, 2021, Jing Liu wrote:
> XCOMP_BV[63] field indicates that the save area is in the compacted
> format and XCOMP_BV[62:0] indicates the states that have space allocated
> in the save area, including both XCR0 and XSS bits enabled by the host
> kernel. Use xfeatures_mask_all for calculating xcomp_bv and reuse
> XCOMP_BV_COMPACTED_FORMAT defined by kernel.
> 
> Signed-off-by: Jing Liu 
> ---
>  arch/x86/kvm/x86.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1b404e4d7dd8..f115493f577d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4435,8 +4435,6 @@ static int kvm_vcpu_ioctl_x86_set_debugregs(struct kvm_vcpu *vcpu,
>   return 0;
>  }
>  
> -#define XSTATE_COMPACTION_ENABLED (1ULL << 63)
> -
>  static void fill_xsave(u8 *dest, struct kvm_vcpu *vcpu)
>  {
>   struct xregs_state *xsave = &vcpu->arch.guest_fpu->state.xsave;
> @@ -4494,7 +4492,8 @@ static void load_xsave(struct kvm_vcpu *vcpu, u8 *src)
>   /* Set XSTATE_BV and possibly XCOMP_BV.  */
>   xsave->header.xfeatures = xstate_bv;
>   if (boot_cpu_has(X86_FEATURE_XSAVES))
> - xsave->header.xcomp_bv = host_xcr0 | XSTATE_COMPACTION_ENABLED;
> + xsave->header.xcomp_bv = XCOMP_BV_COMPACTED_FORMAT |
> +  xfeatures_mask_all;

Doesn't fill_xsave also need to be updated?  Not with xfeatures_mask_all, but
to account for arch.ia32_xss?  I believe it's a nop with the current code, since
supported_xss is zero, but it should be fixed, no?
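
For anyone following along, the XCOMP_BV layout described in the
changelog (bit 63 is the compacted-format flag, bits 62:0 are the
feature mask) can be sketched as below. The helper names are
hypothetical, and feature_mask merely stands in for whatever mask the
caller uses, e.g. xfeatures_mask_all in the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 marks the compacted format, as the changelog describes. */
#define XCOMP_BV_COMPACTED_FORMAT (1ULL << 63)

/* Hypothetical helper: build an XCOMP_BV value from a feature mask. */
static uint64_t make_xcomp_bv(uint64_t feature_mask)
{
	return XCOMP_BV_COMPACTED_FORMAT | feature_mask;
}

/* Hypothetical helper: is the save area in compacted format? */
static bool xcomp_bv_is_compacted(uint64_t xcomp_bv)
{
	return (xcomp_bv & XCOMP_BV_COMPACTED_FORMAT) != 0;
}

/* Hypothetical helper: extract bits 62:0, the allocated state components. */
static uint64_t xcomp_bv_features(uint64_t xcomp_bv)
{
	return xcomp_bv & ~XCOMP_BV_COMPACTED_FORMAT;
}
```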

>  
>   /*
>* Copy each region from the non-compacted offset to the
> @@ -9912,9 +9911,6 @@ static void fx_init(struct kvm_vcpu *vcpu)
>   return;
>  
>   fpstate_init(&vcpu->arch.guest_fpu->state);
> - if (boot_cpu_has(X86_FEATURE_XSAVES))
> - vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
> - host_xcr0 | XSTATE_COMPACTION_ENABLED;

Ugh, this _really_ needs a comment in the changelog.  It took me a while to
realize fpstate_init() does exactly what the new fill_xave() is doing.

And isn't the code in load_xsave() redundant and can be removed?  Any code that
uses get_xsave_addr() would have a dependency on load_xsave() if it's not
redundant, and I can't see how that would work.

>  
>   /*
>* Ensure guest xcr0 is valid for loading
> -- 
> 2.18.4
> 


Re: [PATCH] sysctl: use min() helper for namecmp()

2021-03-01 Thread Kees Cook
On Sun, Feb 28, 2021 at 04:44:22PM +0900, Masahiro Yamada wrote:
> (CC: Andrew Morton)
> 
> A friendly reminder.
> 
> 
> This is just a minor clean-up.
> 
> If nobody picks it up,
> I hope perhaps Andrew Morton will do.
> 
> This patch:
> https://lore.kernel.org/patchwork/patch/1360092/
> 
> 
> 
> 
> 
> On Mon, Jan 4, 2021 at 5:33 PM Masahiro Yamada  wrote:
> >
> > Make it slightly more readable by using min().
> >
> > Signed-off-by: Masahiro Yamada 

Acked-by: Kees Cook 

Feel free to take this via your tree Masahiro. Thanks!

-Kees

> > ---
> >
> >  fs/proc/proc_sysctl.c | 7 +--
> >  1 file changed, 1 insertion(+), 6 deletions(-)
> >
> > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> > index 317899222d7f..86341c0f0c40 100644
> > --- a/fs/proc/proc_sysctl.c
> > +++ b/fs/proc/proc_sysctl.c
> > @@ -94,14 +94,9 @@ static void sysctl_print_dir(struct ctl_dir *dir)
> >
> >  static int namecmp(const char *name1, int len1, const char *name2, int len2)
> >  {
> > -   int minlen;
> > int cmp;
> >
> > -   minlen = len1;
> > -   if (minlen > len2)
> > -   minlen = len2;
> > -
> > -   cmp = memcmp(name1, name2, minlen);
> > +   cmp = memcmp(name1, name2, min(len1, len2));
> > if (cmp == 0)
> > cmp = len1 - len2;
> > return cmp;
> > --
> > 2.27.0
> >
> 
> 
> -- 
> Best Regards
> Masahiro Yamada

-- 
Kees Cook

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH net] net: dsa: tag_mtk: fix 802.1ad VLAN egress

2021-03-01 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Tue,  2 Mar 2021 00:01:59 +0800 you wrote:
> A different TPID bit is used for 802.1ad VLAN frames.
> 
> Reported-by: Ilario Gelmetti 
> Fixes: f0af34317f4b ("net: dsa: mediatek: combine MediaTek tag with VLAN tag")
> Signed-off-by: DENG Qingfang 
> ---
>  net/dsa/tag_mtk.c | 19 +--
>  1 file changed, 13 insertions(+), 6 deletions(-)

Here is the summary with links:
  - [net] net: dsa: tag_mtk: fix 802.1ad VLAN egress
https://git.kernel.org/netdev/net/c/9200f515c41f

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




[tip:x86/platform] BUILD SUCCESS 2430915f8291212f2bd2155176b817c34a18a2b1

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/platform
branch HEAD: 2430915f8291212f2bd2155176b817c34a18a2b1  x86/platform/uv: Fix indentation warning in Documentation/ABI/testing/sysfs-firmware-sgi_uv

elapsed time: 720m

configs tested: 95
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    defconfig
arm64    allyesconfig
arm      allyesconfig
arm      allmodconfig
arm      moxart_defconfig
m68k     q40_defconfig
powerpc  katmai_defconfig
alpha    defconfig
ia64     alldefconfig
powerpc  makalu_defconfig
powerpc  chrp32_defconfig
i386     allyesconfig
mips     jmr3927_defconfig
arc      nsim_700_defconfig
arm      nhk8815_defconfig
arm      zeus_defconfig
mips     cu1830-neo_defconfig
sh       rsk7269_defconfig
mips     mpc30x_defconfig
arm      versatile_defconfig
sparc    defconfig
sparc64  defconfig
sh       apsh4ad0a_defconfig
powerpc  canyonlands_defconfig
sh       sh7710voipgw_defconfig
mips     decstation_r4k_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
c6x      allyesconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
sparc    allyesconfig
i386     tinyconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386     randconfig-a006-20210228
i386     randconfig-a005-20210228
i386     randconfig-a004-20210228
i386     randconfig-a003-20210228
i386     randconfig-a001-20210228
i386     randconfig-a002-20210228
x86_64   randconfig-a013-20210301
x86_64   randconfig-a016-20210301
x86_64   randconfig-a015-20210301
x86_64   randconfig-a014-20210301
x86_64   randconfig-a012-20210301
x86_64   randconfig-a011-20210301
i386     randconfig-a016-20210301
i386     randconfig-a012-20210301
i386     randconfig-a014-20210301
i386     randconfig-a013-20210301
i386     randconfig-a011-20210301
i386     randconfig-a015-20210301
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
riscv    allmodconfig
x86_64   allyesconfig
x86_64   rhel-7.6-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-a006-20210301
x86_64   randconfig-a001-20210301
x86_64   randconfig-a004-20210301
x86_64   randconfig-a002-20210301
x86_64   randconfig-a005-20210301
x86_64   randconfig-a003-20210301

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [PATCH] spi: cadence-quadspi: add missing of_node_put

2021-03-01 Thread Mark Brown
On Mon, 15 Feb 2021 19:04:25 +0800, angkery wrote:
> Fix OF node leaks by calling of_node_put in
> for_each_available_child_of_node when the cycle returns.
> 
> Generated by: scripts/coccinelle/iterators/for_each_child.cocci

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/1] spi: cadence-quadspi: add missing of_node_put
  commit: 44233a5ba2511b85da3c055a0ab7c28976544e47

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] scsi: ufs: Fix incorrect ufshcd_state after ufshcd_reset_and_restore()

2021-03-01 Thread Asutosh Das

On Mon, Mar 01 2021 at 11:19 -0800, Adrian Hunter wrote:

If ufshcd_probe_hba() fails it sets ufshcd_state to UFSHCD_STATE_ERROR,
however, if it is called again, as it is within a loop in
ufshcd_reset_and_restore(), and succeeds, then it will not set the state
back to UFSHCD_STATE_OPERATIONAL unless the state was
UFSHCD_STATE_RESET.

That can result in the state being UFSHCD_STATE_ERROR even though
ufshcd_reset_and_restore() is successful and returns zero.

Fix by initializing the state to UFSHCD_STATE_RESET in the start of each
loop in ufshcd_reset_and_restore().  If there is an error,
ufshcd_reset_and_restore() will change the state to UFSHCD_STATE_ERROR,
otherwise ufshcd_probe_hba() will have set the state appropriately.

Fixes: 4db7a2360597 ("scsi: ufs: Fix concurrency of error handler and other error recovery paths")
Signed-off-by: Adrian Hunter 
---


Reviewed-by: Asutosh Das 


drivers/scsi/ufs/ufshcd.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 77161750c9fb..91a403afe038 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -7031,6 +7031,8 @@ static int ufshcd_reset_and_restore(struct ufs_hba *hba)
spin_unlock_irqrestore(hba->host->host_lock, flags);

do {
+   hba->ufshcd_state = UFSHCD_STATE_RESET;
+
/* Reset the attached device */
ufshcd_device_reset(hba);

--
2.17.1
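
To make the failure mode described in the commit message easier to see, here is a small userspace model. It is only a sketch under stated assumptions: the enum, struct, and function names are hypothetical stand-ins, not the driver's real definitions, and the probe step is reduced to "fail N times, then succeed". The `apply_fix` flag toggles the two-line change from the patch.

```c
/* Hypothetical, simplified stand-ins for the driver's state values. */
enum model_state { MODEL_RESET, MODEL_OPERATIONAL, MODEL_ERROR };

struct model_hba {
	enum model_state state;
	int failures_left;	/* how many probe attempts will still fail */
};

/* Models the probe step: on success it only promotes RESET to OPERATIONAL. */
static int model_probe(struct model_hba *hba)
{
	if (hba->failures_left > 0) {
		hba->failures_left--;
		hba->state = MODEL_ERROR;
		return -1;
	}
	if (hba->state == MODEL_RESET)
		hba->state = MODEL_OPERATIONAL;
	return 0;
}

/* Models the retry loop; apply_fix re-arms the state on every iteration. */
static int model_reset_and_restore(struct model_hba *hba, int retries, int apply_fix)
{
	int err;

	do {
		if (apply_fix)
			hba->state = MODEL_RESET;
		err = model_probe(hba);
	} while (err && --retries);

	if (err)
		hba->state = MODEL_ERROR;
	return err;
}
```

In this model, one failed attempt followed by a successful one returns zero either way, but without the fix the state is left at the error value.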



Re: seccomp: Delay filter activation

2021-03-01 Thread Kees Cook
On Mon, Mar 01, 2021 at 02:21:56PM +0100, Christian Brauner wrote:
> On Mon, Mar 01, 2021 at 12:09:09PM +0100, Christian Brauner wrote:
> > On Sat, Feb 20, 2021 at 01:31:57AM -0800, Sargun Dhillon wrote:
> > > We've run into a problem where attaching a filter can be quite messy
> > > business because the filter itself intercepts sendmsg, and other
> > > syscalls related to exfiltrating the listener FD. I believe that this
> > > problem set has been brought up before, and although there are
> > > "simpler" methods of exfiltrating the listener, like clone3 or
> > > > pidfd_getfd, these are still less than ideal.

I'm trying to make sure I understand: the target process would like to
have a filter attached that blocks sendmsg, but that would mean it has
no way to send the listener FD to its manager?

And you'd want to have listening working for sendmsg (otherwise you
could do it with two filters, I imagine)?

> > > int fd_filter = seccomp(SECCOMP_SET_MODE_FILTER, SECCOMP_FILTER_DETACHED, );
> > 
> > BARRIER_WAIT_SETUP_DONE;
> > 
> > int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, INT_TO_PTR(fd_listener));
> 
> > This obviously should've been something like:
> 
> struct seccomp_filter_attach {
>   union {
>   __s32 pidfd;
>   __s32 pid;
>   };
>   __u32 fd_filter;
> };
> 
> and then
> 
> int ret = seccomp(SECCOMP_ATTACH_FILTER, 0, seccomp_filter_attach);

Given the difficulty with TSYNC, I'm not excited about adding an
"apply this filter to another process" API. :)

The prior thread was here:
https://lore.kernel.org/lkml/20201029075841.GB29881@ircssh-2.c.rugged-nimbus-611.internal/

But I haven't had time to follow up. Both Andy and Sargun discuss filter
"replacement", but I'm not a fan of that, since I'd really like to keep
the "additive-only" property of seccomp.

So, I'm still back to wanting an answer to my questions at the end of
https://lore.kernel.org/lkml/202010281503.3D1FCFE0@keescook/

Namely, how to best indicate the point of execution where "delayed"
filters become applied?

If we require supporting the "2b" (launched oblivious target) case
(which I think we must), we need to signal it externally, or via an
automatic trip point.

Since synchronizing with an oblivious target is rather nasty (e.g.
involving ptrace or at least ptrace access checking), I'd rather create
a predefined trip point. Having it be "execve" limits the utility of
this feature for cooperating targets, though, so I think "apply on exec"
isn't great.

struct seccomp_filter_attach_trigger {
u64 nr;
unsigned char *filter;
};

seccomp(SECCOMP_ATTACH_FILTER_TRIGGER, 0, seccomp_filter_attach_trigger);

after "nr" is evaluated (but before it runs), seccomp installs the
filter.

And by "installs", I'm not sure if it needs to keep it in a queue, with
separate ref counting, or if it should be in the main filter stack, but
have an "alive" toggle, or what.
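
To illustrate the ordering I mean, here is a toy userspace model, not kernel code and not a proposed ABI. All names in it are hypothetical. It only shows that the triggering syscall is evaluated by the filters that were already active, and that the delayed filter is added (never swapped in) before the next syscall.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one delayed filter waiting on a trigger syscall. */
struct pending_filter_model {
	uint64_t nr;	/* syscall number that arms the filter */
	bool armed;
};

struct task_model {
	int active_filters;	/* filters currently evaluating syscalls */
	struct pending_filter_model pending;
};

/*
 * Model of syscall entry. Returns how many filters evaluated this call.
 * The trigger syscall itself is judged by the old filter set only; the
 * pending filter becomes active for every syscall after it.
 */
static int model_syscall_enter(struct task_model *t, uint64_t nr)
{
	int evaluated_by = t->active_filters;

	if (t->pending.armed && t->pending.nr == nr) {
		t->active_filters++;	/* additive-only: nothing is replaced */
		t->pending.armed = false;
	}
	return evaluated_by;
}
```

Whether the real thing would keep the pending filter in a queue or in the main stack with a toggle is exactly the open question; the model sidesteps it with a single slot.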

-- 
Kees Cook


Re: [PATCH] spi: rockchip: avoid objtool warning

2021-03-01 Thread Mark Brown
On Thu, 25 Feb 2021 13:55:34 +0100, Arnd Bergmann wrote:
> Building this file with clang leads to an unreachable code path
> causing a warning from objtool:
> 
> drivers/spi/spi-rockchip.o: warning: objtool: rockchip_spi_transfer_one()+0x2e0: sibling call from callable instruction with modified stack frame
> 
> Use BUG() instead of unreachable() to avoid the undefined behavior
> if it does happen.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/1] spi: rockchip: avoid objtool warning
  commit: d86e880f7a7c5b64a650146a1353f98750863f21

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] spi: atmel: Drop unused variable

2021-03-01 Thread Mark Brown
On Thu, 18 Feb 2021 15:28:40 +0200, Tudor Ambarus wrote:
> The DMA cap mask is no longer used since:
> commit 7758e390699f ("spi: atmel: remove compat for non DT board when requesting dma chan")
> Drop it now.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/1] spi: atmel: Drop unused variable
  commit: c5f754fd0a31d2c6f2f8d11f3db1427b5566f1e7

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] [v2] spi: rockchip: avoid objtool warning

2021-03-01 Thread Mark Brown
On Fri, 26 Feb 2021 15:00:48 +0100, Arnd Bergmann wrote:
> Building this file with clang leads to an unreachable code path
> causing a warning from objtool:
> 
> drivers/spi/spi-rockchip.o: warning: objtool: rockchip_spi_transfer_one()+0x2e0: sibling call from callable instruction with modified stack frame
> 
> Change the unreachable() into an error return that can be
> handled if it ever happens, rather than silently crashing
> the kernel.
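
As a rough illustration of the pattern the v2 description names, the sketch below replaces an "impossible" default case with an error return. It is a userspace approximation: the function name and the enum encodings are hypothetical, and the real driver code may be structured differently.

```c
#include <errno.h>

/* Hypothetical field encodings, for illustration only. */
enum frame_size_model { FRAME_4BIT, FRAME_8BIT, FRAME_16BIT };

/*
 * Map a word size to a frame-size encoding. An unexpected value is
 * reported to the caller instead of being declared unreachable, so the
 * compiler never has to emit a fall-off-the-end code path.
 */
static int pick_frame_size(int bits_per_word, enum frame_size_model *out)
{
	switch (bits_per_word) {
	case 4:
		*out = FRAME_4BIT;
		return 0;
	case 8:
		*out = FRAME_8BIT;
		return 0;
	case 16:
		*out = FRAME_16BIT;
		return 0;
	default:
		/* Should not happen, but is handled rather than undefined. */
		return -EINVAL;
	}
}
```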

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/1] spi: rockchip: avoid objtool warning
  commit: d86e880f7a7c5b64a650146a1353f98750863f21

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] spi: omap2-mcspi: Activate pinctrl idle state during runtime suspend

2021-03-01 Thread Mark Brown
On Mon, 22 Feb 2021 03:32:43 +0100, Alexander Sverdlin wrote:
> Set the (optional) idle pinctrl state during runtime suspend. This is the
> same scheme used in the PL022 driver and can help with HW designs sharing
> the SPI lines for different purposes.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/1] spi: omap2-mcspi: Activate pinctrl idle state during runtime suspend
  commit: 9923f8e3039ed0361c2476d5d3c5195c7f766504

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] qcom: spmi-regulator: Add support for ULT LV_P50 and ULT P300

2021-03-01 Thread Mark Brown
On Thu, 25 Feb 2021 22:35:13 +0100, Konrad Dybcio wrote:
> The ULT LV_P50 shares the same configuration as the other ULT LV_Pxxx
> and the ULT P300 shares the same as the other ULT Pxxx.
> 
> These two regulator types are found on PM8950 and its variants.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-next

Thanks!

[1/1] qcom: spmi-regulator: Add support for ULT LV_P50 and ULT P300
  commit: b15d870510c0a3910c9980ebceab885a390af60c

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] regulator: pf8x00: Use regulator_map_voltage_ascend for pf8x00_buck7_ops

2021-03-01 Thread Mark Brown
On Tue, 16 Feb 2021 14:01:28 +0800, Axel Lin wrote:
> The voltages in pf8x00_sw7_voltages are in ascending order, so use
> regulator_map_voltage_ascend.
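
For context, the reason an ascending table matters can be shown with a simplified userspace model of the lookup. This is a sketch, not the regulator core's implementation: the function name and signature are hypothetical, and the table values used with it below are made up rather than taken from the pf8x00 driver.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Simplified model of selector lookup over an ascending voltage table:
 * return the first selector whose voltage lies in [min_uV, max_uV].
 * Because the table is sorted, the scan can stop at the first entry
 * above max_uV instead of walking every selector.
 */
static int model_map_voltage_ascend(const int *table, size_t n,
				    int min_uV, int max_uV)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i] > max_uV)
			break;	/* ascending: nothing later can fit */
		if (table[i] >= min_uV)
			return (int)i;
	}
	return -EINVAL;
}
```

With a made-up table of 1.0 V, 1.1 V, 1.2 V and 1.5 V, a request for 1.05 V to 1.3 V would select index 1, and a request for 1.25 V to 1.4 V would fail.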

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-next

Thanks!

[1/1] regulator: pf8x00: Use regulator_map_voltage_ascend for pf8x00_buck7_ops
  commit: 6930ab7ac03c1be5d1944473cbf327c9d4d14ce4

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] [v2] Input: Add "Share" button to Microsoft Xbox One controller.

2021-03-01 Thread Chris Ye
Hi Cameron,
   I first thought of adding a new XTYPE, but then realized it is
still an Xbox One controller, just a model with an extra button, so adding
MAP_SHARE_BUTTON avoids the need for a new XTYPE.
I changed the name to "Microsoft Xbox One X pad" and removed the
{}; please review again, thanks!
Chris


On Sat, Feb 27, 2021 at 6:01 PM Cameron Gutman  wrote:
>
> On 2/24/21 11:32 PM, Chris Ye wrote:
> > Add "Share" button input capability and input event mapping for
> > Microsoft Xbox One controller.
> > Fixed Microsoft Xbox One controller share button not working under USB
> > connection.
> >
> > Signed-off-by: Chris Ye 
> > ---
> >  drivers/input/joystick/xpad.c | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c
> > index 9f0d07dcbf06..0c3374091aff 100644
> > --- a/drivers/input/joystick/xpad.c
> > +++ b/drivers/input/joystick/xpad.c
> > @@ -79,6 +79,7 @@
> >  #define MAP_DPAD_TO_BUTTONS  (1 << 0)
> >  #define MAP_TRIGGERS_TO_BUTTONS  (1 << 1)
> >  #define MAP_STICKS_TO_NULL   (1 << 2)
> > +#define MAP_SHARE_BUTTON (1 << 3)
> >  #define DANCEPAD_MAP_CONFIG  (MAP_DPAD_TO_BUTTONS |  \
> >   MAP_TRIGGERS_TO_BUTTONS | MAP_STICKS_TO_NULL)
> >
> > @@ -130,6 +131,7 @@ static const struct xpad_device {
> >   { 0x045e, 0x02e3, "Microsoft X-Box One Elite pad", 0, XTYPE_XBOXONE },
> >   { 0x045e, 0x02ea, "Microsoft X-Box One S pad", 0, XTYPE_XBOXONE },
> > >   { 0x045e, 0x0719, "Xbox 360 Wireless Receiver", MAP_DPAD_TO_BUTTONS, XTYPE_XBOX360W },
> > > + { 0x045e, 0x0b12, "Microsoft X-Box One X pad", MAP_SHARE_BUTTON, XTYPE_XBOXONE },
>
> Let's use 'Xbox' for new entries instead of 'X-Box'. There was an effort to
> standardize on 'Xbox' (which is what Microsoft uses), but changing device
> names can impact userspace which may use these names in mapping heuristics
> (SDL does this). We can at least not make the problem worse though.
>
> >   { 0x046d, 0xc21d, "Logitech Gamepad F310", 0, XTYPE_XBOX360 },
> >   { 0x046d, 0xc21e, "Logitech Gamepad F510", 0, XTYPE_XBOX360 },
> >   { 0x046d, 0xc21f, "Logitech Gamepad F710", 0, XTYPE_XBOX360 },
> > @@ -862,6 +864,8 @@ static void xpadone_process_packet(struct usb_xpad 
> > *xpad, u16 cmd, unsigned char
> >   /* menu/view buttons */
> >   input_report_key(dev, BTN_START,  data[4] & 0x04);
> >   input_report_key(dev, BTN_SELECT, data[4] & 0x08);
> > + if (xpad->mapping & MAP_SHARE_BUTTON)
> > + input_report_key(dev, KEY_RECORD, data[22] & 0x01);
> >
>
> I was worried adding a button to an existing supported gamepad like this
> might cause a breaking change to SDL's gamepad mapping for this gamepad,
> since SDL assigns each present button an index rather than using the keycodes
> directly (adding a new one could change the old indices). Fortunately, SDL
> always processes buttons in the BTN_GAMEPAD range first, so this new button
> ends up at the end of the list anyway.
>
>
> >   /* buttons A,B,X,Y */
> >   input_report_key(dev, BTN_A,data[4] & 0x10);
> > @@ -1669,9 +1673,12 @@ static int xpad_init_input(struct usb_xpad *xpad)
> >
> >   /* set up model-specific ones */
> >   if (xpad->xtype == XTYPE_XBOX360 || xpad->xtype == XTYPE_XBOX360W ||
> > - xpad->xtype == XTYPE_XBOXONE) {
> > + xpad->xtype == XTYPE_XBOXONE) {
> >   for (i = 0; xpad360_btn[i] >= 0; i++)
> > >   input_set_capability(input_dev, EV_KEY, xpad360_btn[i]);
> > + if (xpad->mapping & MAP_SHARE_BUTTON) {
> > + input_set_capability(input_dev, EV_KEY, KEY_RECORD);
> > + }
>
> Style nit: Drop the unneeded {} here
>
> >   } else {
> >   for (i = 0; xpad_btn[i] >= 0; i++)
> >   input_set_capability(input_dev, EV_KEY, xpad_btn[i]);
> >
> LGTM, other than the minor changes suggested above.
>
>
> Regards,
> Cameron
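
For anyone skimming the thread, the gating logic under review can be sketched in userspace as follows. The MAP_SHARE_BUTTON value, the packet offset 22, and the 0x01 mask come from the quoted diff; the function name and the length check are assumptions added for the sketch and are not part of the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAP_SHARE_BUTTON (1 << 3)

/*
 * Report whether the Share button is pressed in an input packet.
 * Devices without MAP_SHARE_BUTTON in their mapping never report it,
 * so existing pads keep their current button set.
 */
static bool model_share_pressed(unsigned int mapping,
				const unsigned char *data, size_t len)
{
	if (!(mapping & MAP_SHARE_BUTTON))
		return false;
	if (len <= 22)	/* defensive check: an assumption, not in the patch */
		return false;
	return (data[22] & 0x01) != 0;
}
```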


Re: [PATCH v2] regulator: add missing call to of_node_put()

2021-03-01 Thread Mark Brown
On Fri, 26 Feb 2021 09:39:35 +0800, Yang Li wrote:
> In one of the error paths of the for_each_child_of_node() loop,
> add missing call to of_node_put().
> 
> Fix the following coccicheck warning:
> ./drivers/regulator/scmi-regulator.c:343:1-23: WARNING: Function
> "for_each_child_of_node" should have of_node_put() before return around
> line 347.
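
The class of leak the coccicheck warning points at can be modelled in plain C. The sketch below is a userspace approximation with hypothetical names: the iterator holds a reference on the current child and drops it when advancing, so leaving the loop early without a put leaves that reference behind.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a refcounted device-tree node. */
struct node_model {
	int refcount;
	struct node_model *next;
};

static void model_node_put(struct node_model *n)
{
	if (n)
		n->refcount--;
}

/* Advance the iteration: take a reference on the next child, drop the previous. */
static struct node_model *model_next_child(struct node_model *first,
					   struct node_model *prev)
{
	struct node_model *next = prev ? prev->next : first;

	if (next)
		next->refcount++;
	model_node_put(prev);
	return next;
}

/* Walk the children; bail out on 'bad'. do_put selects the fixed behaviour. */
static int model_walk(struct node_model *first, const struct node_model *bad,
		      int do_put)
{
	struct node_model *child;

	for (child = model_next_child(first, NULL); child;
	     child = model_next_child(first, child)) {
		if (child == bad) {
			if (do_put)
				model_node_put(child);	/* the missing call */
			return -1;
		}
	}
	return 0;
}
```

A walk that runs to completion balances every reference on its own; only the early return needs the explicit put.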

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-next

Thanks!

[1/1] regulator: add missing call to of_node_put()
  commit: 755a74fc655ee95ce37bb0f552cbd39b52978a05

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH v2] regulator: pca9450: Clear PRESET_EN bit to fix BUCK1/2/3 voltage setting

2021-03-01 Thread Mark Brown
On Mon, 22 Feb 2021 12:52:20 +0100, Schrempf Frieder wrote:
> The driver uses the DVS registers PCA9450_REG_BUCKxOUT_DVS0 to set the
> voltage for the buck regulators 1, 2 and 3. This has no effect as the
> PRESET_EN bit is set by default and therefore the preset values are used
> instead, which are set to 850 mV.
> 
> To fix this we clear the PRESET_EN bit at time of initialization.
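
The effect described above can be modelled in a few lines of userspace C. This is a sketch under loud assumptions: the mask value and struct layout are hypothetical and do not come from the PCA9450 datasheet or driver; only the 850 mV preset figure is taken from the commit message.

```c
#include <stdint.h>

#define MODEL_PRESET_EN 0x80u	/* hypothetical bit position */

/* Hypothetical model of one buck regulator's control state. */
struct buck_model {
	uint8_t ctrl;
	int preset_uV;	/* value used while PRESET_EN is set */
	int dvs0_uV;	/* value programmed through the DVS0 register */
};

/* While PRESET_EN is set, writes to DVS0 have no visible effect. */
static int model_buck_output_uV(const struct buck_model *b)
{
	return (b->ctrl & MODEL_PRESET_EN) ? b->preset_uV : b->dvs0_uV;
}

/* Init-time fix: clear PRESET_EN and leave the other control bits alone. */
static void model_buck_init(struct buck_model *b)
{
	b->ctrl &= (uint8_t)~MODEL_PRESET_EN;
}
```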

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git for-next

Thanks!

[1/1] regulator: pca9450: Clear PRESET_EN bit to fix BUCK1/2/3 voltage setting
  commit: 66f9f2d5d94f374605d829b9e690e8cdc9d0d05d

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] ASoC: fsl_xcvr: move reset assert into runtime_resume

2021-03-01 Thread Mark Brown
On Mon, 22 Feb 2021 17:09:50 +0800, Shengjiu Wang wrote:
> Move reset assert into runtime_resume since we
> cannot rely on reset assert state when the device
> is brought out of suspend.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: fsl_xcvr: move reset assert into runtime_resume
  commit: 0f780e4bef4587f07060109040955d6b6aa179a2

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH 0/4] drop unneeded snd_soc_dai_set_drvdata

2021-03-01 Thread Mark Brown
On Sat, 13 Feb 2021 11:19:03 +0100, Julia Lawall wrote:
> snd_soc_dai_set_drvdata is not needed when the set data comes from
> snd_soc_dai_get_drvdata or dev_get_drvdata.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/4] ASoC: mmp-sspa: drop unneeded snd_soc_dai_set_drvdata
  commit: 131036ffae211a9cc3bfb053fadce87484e13fc5
[2/4] ASoC: mxs-saif: drop unneeded snd_soc_dai_set_drvdata
  commit: 7150186f1edb2fa94554be1bec26aa65a7df3388
[3/4] ASoC: sun4i-i2s: drop unneeded snd_soc_dai_set_drvdata
  commit: 0c34af2d5c9ba5103637c33c4f52d658172b991d
[4/4] ASoC: fsl: drop unneeded snd_soc_dai_set_drvdata
  commit: eb9db3066cdb57dbfd1fb3d85ca143ad5d719bfb

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] ASoC: Intel: boards: sof-wm8804: add check for PLL setting

2021-03-01 Thread Mark Brown
On Fri, 26 Feb 2021 18:56:53 +, Colin King wrote:
> Currently the return from snd_soc_dai_set_pll is not checked for
> failure, this is the only driver in the kernel that ignores this,
> so it probably should be added for sake of completeness.  Fix this
> by adding an error return check.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: Intel: boards: sof-wm8804: add check for PLL setting
  commit: e067855b814600248234a2a7283a7a9006e5aadc

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] sound: soc/uniphier: Simplify the return expression of uniphier_aio_startup

2021-03-01 Thread Mark Brown
On Wed, 24 Feb 2021 16:54:07 +0800, dingsen...@163.com wrote:
> Simplify the return expression in the aio-cpu.c.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] sound: soc/uniphier: Simplify the return expression of uniphier_aio_startup
  commit: e3fdb6288dd08d965dea4bf00186e20f79153b2b

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


[tip:locking/urgent] BUILD SUCCESS 8b97c027dfe4ba195be08fd0e18f716005763b8a

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/urgent
branch HEAD: 8b97c027dfe4ba195be08fd0e18f716005763b8a  static_call: Fix the module key fixup

elapsed time: 722m

configs tested: 95
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    defconfig
arm64    allyesconfig
arm      allyesconfig
arm      allmodconfig
arm      moxart_defconfig
m68k     q40_defconfig
powerpc  katmai_defconfig
alpha    defconfig
ia64     alldefconfig
powerpc  makalu_defconfig
powerpc  chrp32_defconfig
i386     allyesconfig
mips     jmr3927_defconfig
arc      nsim_700_defconfig
arm      nhk8815_defconfig
arm      zeus_defconfig
mips     cu1830-neo_defconfig
sh       rsk7269_defconfig
mips     mpc30x_defconfig
arm      versatile_defconfig
sparc    defconfig
sparc64  defconfig
sh       apsh4ad0a_defconfig
powerpc  canyonlands_defconfig
sh       sh7710voipgw_defconfig
mips     decstation_r4k_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
c6x      allyesconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
sparc    allyesconfig
i386     tinyconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386     randconfig-a006-20210228
i386     randconfig-a005-20210228
i386     randconfig-a004-20210228
i386     randconfig-a003-20210228
i386     randconfig-a001-20210228
i386     randconfig-a002-20210228
x86_64   randconfig-a013-20210301
x86_64   randconfig-a016-20210301
x86_64   randconfig-a015-20210301
x86_64   randconfig-a014-20210301
x86_64   randconfig-a012-20210301
x86_64   randconfig-a011-20210301
i386     randconfig-a016-20210301
i386     randconfig-a012-20210301
i386     randconfig-a014-20210301
i386     randconfig-a013-20210301
i386     randconfig-a011-20210301
i386     randconfig-a015-20210301
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
riscv    allmodconfig
x86_64   allyesconfig
x86_64   rhel-7.6-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-a006-20210301
x86_64   randconfig-a001-20210301
x86_64   randconfig-a004-20210301
x86_64   randconfig-a002-20210301
x86_64   randconfig-a005-20210301
x86_64   randconfig-a003-20210301

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [PATCH 0/4] ASoC: rt*: Constify static structs

2021-03-01 Thread Mark Brown
On Wed, 24 Feb 2021 22:19:14 +0100, Rikard Falkeborn wrote:
> Constify a number of static structs that are never modified in RealTek
> codecs. The most important patches are the first two, which constify
> snd_soc_dai_ops and sdw_slave_ops, both of which contain function pointers.
> The other two patches are for good measure, since I was already touching
> the code there.
> 
> When doing this, I discovered sound/soc/codecs/rt1016.c is not in a
> Makefile, so there is not really any way to build it (I added it locally to
> the Makefile to compile-test my changes). Is this expected or an oversight?
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/4] ASoC: rt*: Constify static struct sdw_slave_ops
  commit: 3ebb1b951880d3152547ac4018bfcce0fd7810bd
[2/4] ASoC: rt*: Constify static struct snd_soc_dai_ops
  commit: 84732dd4ff3ad28cc65eedfa3061fe3808e8469b
[3/4] ASoC: rt*: Constify static struct acpi_device_id
  commit: c85ca92c716bd04981ebcd2c67cd03f96748859e
[4/4] ASoc: rt5631: Constify static struct coeff_clk_div
  commit: 39f9eb61307061eed197eae651ef56cb3544f9b2

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH] ASoC: constify of_phandle_args in snd_soc_get_dai_name()

2021-03-01 Thread Mark Brown
On Sun, 21 Feb 2021 16:30:24 +0100, Krzysztof Kozlowski wrote:
> The pointer to of_phandle_args passed to snd_soc_get_dai_name() and
> of_xlate_dai_name() implementations is not modified.  Since it is being
> used only to translate passed OF node to a DAI name, it should not be
> modified, so mark it as const for correctness and safer code.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: constify of_phandle_args in snd_soc_get_dai_name()
  commit: 54928c5c63c83afd5a1c2a91802a9c37e9a4ff88

Thanks,
Mark


Re: [PATCH] ASoC: fsl_sai: Add pm qos cpu latency support

2021-03-01 Thread Mark Brown
On Mon, 22 Feb 2021 16:40:20 +0800, Shengjiu Wang wrote:
> On SoCs such as i.MX7ULP, cpuidle has some levels which
> may disable system/bus clocks, so need to add pm_qos to
> prevent cpuidle from entering low level idles and make sure
> system/bus clocks are enabled when sai is active.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: fsl_sai: Add pm qos cpu latency support
  commit: 6d85d770c171972c0f33f74b84bf0fedc111e89f

Thanks,
Mark


Re: [PATCH 0/9] ASoC: fsl: remove cppcheck warnings

2021-03-01 Thread Mark Brown
On Fri, 19 Feb 2021 17:29:28 -0600, Pierre-Louis Bossart wrote:
> Nothing critical and no functional changes.
> 
> The only change that needs attention is the 'fsl_ssi: remove
> unnecessary tests' patch, where variables are set to zero, then tested to
> set register fields. Either the tests are indeed redundant or the
> entire programming sequence is incorrect.
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/9] ASoC: fsl: fsl_asrc: remove useless assignment
  commit: ca289c2c70c131dc2d4a37e5f6f5c71acfc7cb8b
[2/9] ASoC: fsl: fsl_dma: remove unused variable
  commit: faff74679f510b9e469238b8ff610eb2b8ad5602
[3/9] ASoC: fsl: fsl_easrc: remove useless assignments
  commit: e80382fe721f71100cd49e209fbac260042a0106
[4/9] ASoC: fsl: fsl_esai: clarify expression
  commit: e7347520a4323fafea1df84abb29ae979c595931
[5/9] ASoC: fsl: fsl_ssi: remove unnecessary tests
  commit: e06a8f1a7c4ceb9f3f804bbe5e2fd25230bc91b1
[6/9] ASoC: fsl: imx-hdmi: remove unused structure members
  commit: 40e2c4450a34429b6343a7c8f80b4c6715bbd393
[7/9] ASoC: fsl: mpc5200: signed parameter in snprintf format
  commit: 5a6d43108095c2bb94947ccf3f53a7e71ae5774e
[8/9] ASoC: fsl: mpc8610: remove useless assignment
  commit: 3fb0dcec3e60466afd6a3d770c06a8a879160f68
[9/9] ASoC: fsl: p1022_ds: remove useless assignment
  commit: bafe21c9d01b3f39d26ff6271905c5c9ef00dc44

Thanks,
Mark


[tip:perf/urgent] BUILD SUCCESS a8abc881981762631a22568d5e4b2c0ce4aeb15c

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/urgent
branch HEAD: a8abc881981762631a22568d5e4b2c0ce4aeb15c  perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR

elapsed time: 721m

configs tested: 95
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    defconfig
arm64    allyesconfig
arm      allyesconfig
arm      allmodconfig
arm      moxart_defconfig
m68k     q40_defconfig
powerpc  katmai_defconfig
alpha    defconfig
ia64     alldefconfig
powerpc  makalu_defconfig
powerpc  chrp32_defconfig
i386     allyesconfig
mips     jmr3927_defconfig
arc      nsim_700_defconfig
arm      nhk8815_defconfig
arm      zeus_defconfig
mips     cu1830-neo_defconfig
sh       rsk7269_defconfig
mips     mpc30x_defconfig
arm      versatile_defconfig
sparc    defconfig
sparc64  defconfig
sh       apsh4ad0a_defconfig
powerpc  canyonlands_defconfig
sh       sh7710voipgw_defconfig
mips     decstation_r4k_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
c6x      allyesconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
sparc    allyesconfig
i386     tinyconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386     randconfig-a006-20210228
i386     randconfig-a005-20210228
i386     randconfig-a004-20210228
i386     randconfig-a003-20210228
i386     randconfig-a001-20210228
i386     randconfig-a002-20210228
x86_64   randconfig-a013-20210301
x86_64   randconfig-a016-20210301
x86_64   randconfig-a015-20210301
x86_64   randconfig-a014-20210301
x86_64   randconfig-a012-20210301
x86_64   randconfig-a011-20210301
i386     randconfig-a016-20210301
i386     randconfig-a012-20210301
i386     randconfig-a014-20210301
i386     randconfig-a013-20210301
i386     randconfig-a011-20210301
i386     randconfig-a015-20210301
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
riscv    allmodconfig
x86_64   allyesconfig
x86_64   rhel-7.6-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-a006-20210301
x86_64   randconfig-a001-20210301
x86_64   randconfig-a004-20210301
x86_64   randconfig-a002-20210301
x86_64   randconfig-a005-20210301
x86_64   randconfig-a003-20210301

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [PATCH][next] ASoC: codecs: lpass-rx-macro: remove redundant initialization of variable hph_pwr_mode

2021-03-01 Thread Mark Brown
On Mon, 15 Feb 2021 20:05:01 +, Colin King wrote:
> The variable hph_pwr_mode is being initialized with a value that is
> never read and it is being updated later with a new value.  The
> initialization is redundant and can be removed.

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/1] ASoC: codecs: lpass-rx-macro: remove redundant initialization of variable hph_pwr_mode
  commit: 7f7d1c4fce10ca68e87165898e6232353e4be1af

Thanks,
Mark


[tip:sched/urgent] BUILD SUCCESS fba111913e51a934eaad85734254eab801343836

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/urgent
branch HEAD: fba111913e51a934eaad85734254eab801343836  sched/membarrier: fix missing local execution of ipi_sync_rq_state()

elapsed time: 720m

configs tested: 95
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    defconfig
arm64    allyesconfig
arm      allyesconfig
arm      allmodconfig
arm      moxart_defconfig
m68k     q40_defconfig
powerpc  katmai_defconfig
alpha    defconfig
ia64     alldefconfig
powerpc  makalu_defconfig
powerpc  chrp32_defconfig
i386     allyesconfig
mips     jmr3927_defconfig
arc      nsim_700_defconfig
arm      nhk8815_defconfig
arm      zeus_defconfig
mips     cu1830-neo_defconfig
sh       rsk7269_defconfig
mips     mpc30x_defconfig
arm      versatile_defconfig
sparc    defconfig
sparc64  defconfig
sh       apsh4ad0a_defconfig
powerpc  canyonlands_defconfig
sh       sh7710voipgw_defconfig
mips     decstation_r4k_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
c6x      allyesconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
sparc    allyesconfig
i386     tinyconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386     randconfig-a006-20210228
i386     randconfig-a005-20210228
i386     randconfig-a004-20210228
i386     randconfig-a003-20210228
i386     randconfig-a001-20210228
i386     randconfig-a002-20210228
x86_64   randconfig-a013-20210301
x86_64   randconfig-a016-20210301
x86_64   randconfig-a015-20210301
x86_64   randconfig-a014-20210301
x86_64   randconfig-a012-20210301
x86_64   randconfig-a011-20210301
i386     randconfig-a016-20210301
i386     randconfig-a012-20210301
i386     randconfig-a014-20210301
i386     randconfig-a013-20210301
i386     randconfig-a011-20210301
i386     randconfig-a015-20210301
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
riscv    allmodconfig
x86_64   allyesconfig
x86_64   rhel-7.6-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-a006-20210301
x86_64   randconfig-a001-20210301
x86_64   randconfig-a004-20210301
x86_64   randconfig-a002-20210301
x86_64   randconfig-a005-20210301
x86_64   randconfig-a003-20210301

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[PATCH] [v3] Input: Add "Share" button to Microsoft Xbox One controller.

2021-03-01 Thread Chris Ye
Add "Share" button input capability and input event mapping for
Microsoft Xbox One controller.
Fixed Microsoft Xbox One controller share button not working under USB
connection.

Signed-off-by: Chris Ye 
---
 drivers/input/joystick/xpad.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c
index 9f0d07dcbf06..b51c0e381cc9 100644
--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -79,6 +79,7 @@
 #define MAP_DPAD_TO_BUTTONS        (1 << 0)
 #define MAP_TRIGGERS_TO_BUTTONS    (1 << 1)
 #define MAP_STICKS_TO_NULL         (1 << 2)
+#define MAP_SHARE_BUTTON           (1 << 3)
 #define DANCEPAD_MAP_CONFIG        (MAP_DPAD_TO_BUTTONS |  \
                                    MAP_TRIGGERS_TO_BUTTONS | MAP_STICKS_TO_NULL)
 
@@ -130,6 +131,7 @@ static const struct xpad_device {
{ 0x045e, 0x02e3, "Microsoft X-Box One Elite pad", 0, XTYPE_XBOXONE },
{ 0x045e, 0x02ea, "Microsoft X-Box One S pad", 0, XTYPE_XBOXONE },
{ 0x045e, 0x0719, "Xbox 360 Wireless Receiver", MAP_DPAD_TO_BUTTONS, 
XTYPE_XBOX360W },
+   { 0x045e, 0x0b12, "Microsoft Xbox One X pad", MAP_SHARE_BUTTON, 
XTYPE_XBOXONE },
{ 0x046d, 0xc21d, "Logitech Gamepad F310", 0, XTYPE_XBOX360 },
{ 0x046d, 0xc21e, "Logitech Gamepad F510", 0, XTYPE_XBOX360 },
{ 0x046d, 0xc21f, "Logitech Gamepad F710", 0, XTYPE_XBOX360 },
@@ -862,6 +864,8 @@ static void xpadone_process_packet(struct usb_xpad *xpad, 
u16 cmd, unsigned char
/* menu/view buttons */
input_report_key(dev, BTN_START,  data[4] & 0x04);
input_report_key(dev, BTN_SELECT, data[4] & 0x08);
+   if (xpad->mapping & MAP_SHARE_BUTTON)
+   input_report_key(dev, KEY_RECORD, data[22] & 0x01);
 
/* buttons A,B,X,Y */
input_report_key(dev, BTN_A,data[4] & 0x10);
@@ -1669,9 +1673,11 @@ static int xpad_init_input(struct usb_xpad *xpad)
 
/* set up model-specific ones */
if (xpad->xtype == XTYPE_XBOX360 || xpad->xtype == XTYPE_XBOX360W ||
-   xpad->xtype == XTYPE_XBOXONE) {
+   xpad->xtype == XTYPE_XBOXONE) {
for (i = 0; xpad360_btn[i] >= 0; i++)
input_set_capability(input_dev, EV_KEY, xpad360_btn[i]);
+   if (xpad->mapping & MAP_SHARE_BUTTON)
+   input_set_capability(input_dev, EV_KEY, KEY_RECORD);
} else {
for (i = 0; xpad_btn[i] >= 0; i++)
input_set_capability(input_dev, EV_KEY, xpad_btn[i]);
-- 
2.30.1.766.gb4fecdf3b7-goog
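
For readers following the patch above, the Share button test can be pictured in isolation. This is only a userspace sketch, not driver code: share_button_pressed() and the two SHARE_* constants are hypothetical names, and the length check is an addition for the sketch that the patch itself does not contain. The flag value, byte offset 22 and bit 0x01 are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_SHARE_BUTTON	(1 << 3)	/* mapping flag, value as in the patch */
#define SHARE_BYTE_OFFSET	22		/* packet byte carrying the Share bit */
#define SHARE_BIT_MASK		0x01		/* bit tested by the patch */

/*
 * Hypothetical helper: returns 1 if the Share button is reported as pressed,
 * 0 otherwise. Devices without MAP_SHARE_BUTTON in their mapping never
 * report it. The bounds check is an assumption added for this sketch.
 */
static int share_button_pressed(const unsigned char *data, size_t len,
				unsigned int mapping)
{
	assert(data != NULL);

	if (!(mapping & MAP_SHARE_BUTTON))
		return 0;
	if (len <= SHARE_BYTE_OFFSET)
		return 0;	/* packet too short to carry the Share byte */

	return (data[SHARE_BYTE_OFFSET] & SHARE_BIT_MASK) != 0;
}
```

In the driver the result is passed to input_report_key() with KEY_RECORD; the sketch only isolates the flag and bit test.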



Re: [PATCH net] net: l2tp: reduce log level when passing up invalid packets

2021-03-01 Thread Matthias Schiffer

On 2/23/21 10:47 AM, Tom Parkin wrote:

On  Mon, Feb 22, 2021 at 14:31:38 -0800, Jakub Kicinski wrote:

On Mon, 22 Feb 2021 17:40:16 +0100 Matthias Schiffer wrote:

This will not be sufficient for my usecase: To stay compatible with older
versions of fastd, I can't set the T flag in the first packet of the
handshake, as it won't be known whether the peer has a new enough fastd
version to understand packets that have this bit set. Luckily, the second
handshake byte is always 0 in fastd's protocol, so these packets fail the
tunnel version check and are passed to userspace regardless.

I'm aware that this usecase is far outside of the original intentions of the
code and can only be described as a hack, but I still consider this a
regression in the kernel, as it was working fine in the past, without
visible warnings.
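
To make the version check mentioned above concrete, here is a minimal userspace sketch of how the first 16 bits of an L2TP header can be split into the T flag and the version field, following the header layout in RFC 2661. The helper names are hypothetical and this is not the kernel's code; it only illustrates why a packet whose second byte is 0 carries version 0 and so cannot match a tunnel configured for version 2 or 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* First 16 bits of an L2TP header (RFC 2661): flags plus version. */
#define L2TP_FLAG_T	0x8000u	/* message type: 1 = control, 0 = data */
#define L2TP_VER_MASK	0x000Fu	/* protocol version, low four bits */

/* Hypothetical helper: read the big-endian flags/version word. */
static uint16_t l2tp_hdrflags(const uint8_t *pkt)
{
	assert(pkt != NULL);
	return (uint16_t)((pkt[0] << 8) | pkt[1]);
}

/*
 * Hypothetical helper: true if the packet is long enough to carry the
 * flags/version word and its version equals the tunnel's version.
 */
static bool l2tp_version_matches(const uint8_t *pkt, size_t len,
				 unsigned int tunnel_version)
{
	if (len < 2)
		return false;	/* too short, e.g. a zero-length UDP probe */

	return (l2tp_hdrflags(pkt) & L2TP_VER_MASK) == tunnel_version;
}
```

The version lives entirely in the low nibble of the second byte, so any packet with that byte set to 0 fails the comparison regardless of what the first byte contains.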
  


I'm sorry, but for the reasons stated above I disagree about it being
a regression.


Hmm, is it common for protocol implementations in the kernel to warn about
invalid packets they receive? While L2TP uses connected sockets and thus
usually no unrelated packets end up in the socket, a simple UDP port scan
originating from the configured remote address/port will trigger the "short
packet" warning now (nmap uses a zero-length payload for UDP scans by
default). Log spam caused by a malicious party might also be a concern.


Indeed, seems like appropriate counters would be a good fit here?
The prints are both potentially problematic for security and lossy.


Yes, I agree with this argument.



Sounds good, I'll send an updated patch adding a counter for invalid packets.
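
As a rough illustration of the counter approach agreed on above (not the actual follow-up patch), the sketch below counts invalid packets instead of logging them. The struct, its field names and tunnel_rx_check() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-tunnel statistics; the field names are illustrative only. */
struct tunnel_stats {
	uint64_t rx_packets;	/* packets accepted for further parsing */
	uint64_t rx_invalid;	/* packets rejected as too short */
};

/*
 * Hypothetical receive-side check: returns 0 if the packet is long enough
 * to parse and -1 otherwise. A rejected packet bumps a counter rather than
 * emitting a log line, so a flood of bad packets cannot spam the log.
 */
static int tunnel_rx_check(struct tunnel_stats *stats, size_t len,
			   size_t min_hdr_len)
{
	assert(stats != NULL);

	if (len < min_hdr_len) {
		stats->rx_invalid++;
		return -1;
	}

	stats->rx_packets++;
	return 0;
}
```

A counter loses nothing under load, whereas rate-limited prints drop information exactly when many invalid packets arrive.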

By now I've found another project affected by the kernel warnings:
https://github.com/wlanslovenija/tunneldigger/issues/160





[RFC PATCH v4 3/3] scheduler: Add cluster scheduler level for x86

2021-03-01 Thread Barry Song
From: Tim Chen 

There are x86 CPU architectures (e.g. Jacobsville) where L2 cache
is shared among a cluster of cores instead of being exclusive
to one single core.

To prevent oversubscription of L2 cache, load should be
balanced between such L2 clusters, especially for tasks with
no shared data.

Also with cluster scheduling policy where tasks are woken up
in the same L2 cluster, we will benefit from keeping tasks
related to each other and likely sharing data in the same L2
cluster.

Add CPU masks of CPUs sharing the L2 cache so we can build such
L2 cluster scheduler domain.

Signed-off-by: Tim Chen 
Signed-off-by: Barry Song 
---
 arch/x86/Kconfig|  8 
 arch/x86/include/asm/smp.h  |  7 +++
 arch/x86/include/asm/topology.h |  1 +
 arch/x86/kernel/cpu/cacheinfo.c |  1 +
 arch/x86/kernel/cpu/common.c|  3 +++
 arch/x86/kernel/smpboot.c   | 43 -
 6 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d3338a8..40110de 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1009,6 +1009,14 @@ config NR_CPUS
  This is purely to save memory: each supported CPU adds about 8KB
  to the kernel image.
 
+config SCHED_CLUSTER
+   bool "Cluster scheduler support"
+   default n
+   help
+Cluster scheduler support improves the CPU scheduler's decision
+making when dealing with machines that have clusters of CPUs
+sharing L2 cache. If unsure say N here.
+
 config SCHED_SMT
def_bool y if SMP
 
diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index c0538f8..9cbc4ae 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -16,7 +16,9 @@
 DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
 /* cpus sharing the last level cache: */
 DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
+DECLARE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_l2c_shared_map);
 DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_llc_id);
+DECLARE_PER_CPU_READ_MOSTLY(u16, cpu_l2c_id);
 DECLARE_PER_CPU_READ_MOSTLY(int, cpu_number);
 
 static inline struct cpumask *cpu_llc_shared_mask(int cpu)
@@ -24,6 +26,11 @@ static inline struct cpumask *cpu_llc_shared_mask(int cpu)
return per_cpu(cpu_llc_shared_map, cpu);
 }
 
+static inline struct cpumask *cpu_l2c_shared_mask(int cpu)
+{
+   return per_cpu(cpu_l2c_shared_map, cpu);
+}
+
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_cpu_to_apicid);
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u32, x86_cpu_to_acpiid);
 DECLARE_EARLY_PER_CPU_READ_MOSTLY(u16, x86_bios_cpu_apicid);
diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 9239399..2a11ccc 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -103,6 +103,7 @@ static inline void setup_node_to_cpumask_map(void) { }
 #include 
 
 extern const struct cpumask *cpu_coregroup_mask(int cpu);
+extern const struct cpumask *cpu_clustergroup_mask(int cpu);
 
 #define topology_logical_package_id(cpu)   (cpu_data(cpu).logical_proc_id)
 #define topology_physical_package_id(cpu)  (cpu_data(cpu).phys_proc_id)
diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 3ca9be4..0d03a71 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -846,6 +846,7 @@ void init_intel_cacheinfo(struct cpuinfo_x86 *c)
l2 = new_l2;
 #ifdef CONFIG_SMP
per_cpu(cpu_llc_id, cpu) = l2_id;
+   per_cpu(cpu_l2c_id, cpu) = l2_id;
 #endif
}
 
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 35ad848..fb08c73 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -78,6 +78,9 @@
 /* Last level cache ID of each logical CPU */
 DEFINE_PER_CPU_READ_MOSTLY(u16, cpu_llc_id) = BAD_APICID;
 
+/* L2 cache ID of each logical CPU */
+DEFINE_PER_CPU_READ_MOSTLY(u16, cpu_l2c_id) = BAD_APICID;
+
 /* correctly size the local cpu masks */
 void __init setup_cpu_local_masks(void)
 {
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 02813a7..c85ffa8 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -101,6 +101,8 @@
 
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_llc_shared_map);
 
+DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_l2c_shared_map);
+
 /* Per CPU bogomips and other parameters */
 DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
@@ -501,6 +503,21 @@ static bool match_llc(struct cpuinfo_x86 *c, struct 
cpuinfo_x86 *o)
return topology_sane(c, o, "llc");
 }
 
+static bool match_l2c(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
+{
+   int cpu1 = c->cpu_index, cpu2 = o->cpu_index;
+
+   /* Do not match if we do not have a valid APICID for cpu: */
+   if (per_cpu(cpu_l2c_id, cpu1) == BAD_APICID)
+   return false;
+
+   /* Do not match if 

RE: [PATCH v3 6/8] mm: Selftests for exclusive device memory

2021-03-01 Thread Ralph Campbell
> From: Alistair Popple 
> Sent: Thursday, February 25, 2021 11:19 PM
> To: linux...@kvack.org; nouv...@lists.freedesktop.org;
> bske...@redhat.com; a...@linux-foundation.org
> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; dri-
> de...@lists.freedesktop.org; John Hubbard ; Ralph
> Campbell ; jgli...@redhat.com; Jason Gunthorpe
> ; h...@infradead.org; dan...@ffwll.ch; Alistair Popple
> 
> Subject: [PATCH v3 6/8] mm: Selftests for exclusive device memory
> 
> Adds some selftests for exclusive device memory.
> 
> Signed-off-by: Alistair Popple 

One minor nit below, but you can add
Tested-by: Ralph Campbell 
Reviewed-by: Ralph Campbell 

> +static int dmirror_exclusive(struct dmirror *dmirror,
> +  struct hmm_dmirror_cmd *cmd)
> +{
> + unsigned long start, end, addr;
> + unsigned long size = cmd->npages << PAGE_SHIFT;
> + struct mm_struct *mm = dmirror->notifier.mm;
> + struct page *pages[64];
> + struct dmirror_bounce bounce;
> + unsigned long next;
> + int ret;
> +
> + start = cmd->addr;
> + end = start + size;
> + if (end < start)
> + return -EINVAL;
> +
> + /* Since the mm is for the mirrored process, get a reference first. */
> + if (!mmget_not_zero(mm))
> + return -EINVAL;
> +
> + mmap_read_lock(mm);
> + for (addr = start; addr < end; addr = next) {
> + int i, mapped;
> +
> + if (end < addr + (64 << PAGE_SHIFT))
> + next = end;
> + else
> + next = addr + (64 << PAGE_SHIFT);

I suggest using ARRAY_SIZE(pages) instead of '64' to make the meaning clear.
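
A small userspace sketch of what that suggestion could look like. ARRAY_SIZE() is redefined here as a stand-in for the kernel macro, PAGE_SHIFT is assumed to be 12, and next_chunk_end() is a hypothetical helper, not part of the patch. It also compares against end - addr instead of addr + step, which is a deliberate variation to sidestep overflow.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's ARRAY_SIZE() macro. */
#define ARRAY_SIZE(arr)	(sizeof(arr) / sizeof((arr)[0]))

#define PAGE_SHIFT	12	/* assumption: 4 KiB pages */

struct page;	/* opaque; only pointers to it are stored */

/*
 * Hypothetical helper: end address of the next chunk. The chunk covers at
 * most ARRAY_SIZE(pages) pages starting at addr and never extends past end.
 * The step is derived from the array, so resizing pages[] needs no other
 * edits.
 */
static unsigned long next_chunk_end(unsigned long addr, unsigned long end)
{
	struct page *pages[64];
	unsigned long step = ARRAY_SIZE(pages) << PAGE_SHIFT;

	assert(addr <= end);

	if (end - addr < step)
		return end;
	return addr + step;
}
```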



[RFC PATCH v4 2/3] scheduler: add scheduler level for clusters

2021-03-01 Thread Barry Song
ARM64 chip Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
cluster has 4 cpus. All clusters share L3 cache data, but each cluster
has local L3 tag. On the other hand, each cluster will share some
internal system bus. This means cache coherence overhead inside one
cluster is much less than the overhead across clusters.

This patch adds the sched_domain for clusters. On Kunpeng 920, without
this patch, domain0 of cpu0 would be MC with cpu0~cpu23; with this
patch, MC becomes domain1 and a new domain0 "CLS" covers cpu0-cpu3.

This helps spread unrelated tasks among clusters, which decreases
contention and improves throughput. For example, the stream benchmark
improves by around 4.3%~6.3% with this patch:

w/o patch:
numactl -N 0 /usr/lib/lmbench/bin/stream -P 12 -M 1024M -N 5
STREAM copy latency: 3.36 nanoseconds
STREAM copy bandwidth: 57072.50 MB/sec
STREAM scale latency: 3.40 nanoseconds
STREAM scale bandwidth: 56542.52 MB/sec
STREAM add latency: 5.10 nanoseconds
STREAM add bandwidth: 56482.83 MB/sec
STREAM triad latency: 5.14 nanoseconds
STREAM triad bandwidth: 56069.52 MB/sec

w/ patch:
$ numactl -N 0 /usr/lib/lmbench/bin/stream -P 12 -M 1024M -N 5
STREAM copy latency: 3.22 nanoseconds
STREAM copy bandwidth: 59660.96 MB/sec   ->  +4.5%
STREAM scale latency: 3.25 nanoseconds
STREAM scale bandwidth: 59002.29 MB/sec  ->  +4.3%
STREAM add latency: 4.80 nanoseconds
STREAM add bandwidth: 60036.62 MB/sec    ->  +6.3%
STREAM triad latency: 4.86 nanoseconds
STREAM triad bandwidth: 59228.30 MB/sec  ->  +5.6%
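
For anyone wanting to re-derive the percentages quoted next to the bandwidth figures, a minimal sketch of the arithmetic follows. pct_change() is a hypothetical helper, not something from the patch or from lmbench.

```c
#include <assert.h>

/*
 * Hypothetical helper: relative change of 'after' against 'before', in
 * percent. A positive result means 'after' is larger.
 */
static double pct_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}
```

Feeding it the copy bandwidths, pct_change(57072.50, 59660.96), gives roughly 4.5, and the add bandwidths, pct_change(56482.83, 60036.62), give roughly 6.3, in line with the figures above.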

On the other hand, while doing WAKE_AFFINE, this patch will try to find
a core in the target cluster before scanning the whole llc domain. So it
helps gather related tasks within one cluster.
We ran the hackbench command below with the -g parameter ranging from 2 to
14; for each value of g, we ran the command 10 times and took the average time:
$ numactl -N 0 hackbench -p -T -l 2 -g $1

hackbench reports the time needed to complete a certain number of message
transmissions between a certain number of tasks, for example:
$ numactl -N 0 hackbench -p -T -l 2 -g 10
Running in threaded mode with 10 groups using 40 file descriptors each
(== 400 tasks)
Each sender will pass 2 messages of 100 bytes
Time: 8.874

The below is the result of hackbench w/ and w/o the patch:
      g=2     4       6       8       10      12       14
w/o:  1.9596  4.0506  5.9654  8.0068  9.8147  11.4900  13.1163
w/ :  1.9362  3.9197  5.6570  7.1376  8.5263  10.0512  11.3256
              +3.3%   +5.2%   +10.9%  +13.2%  +12.8%   +13.7%
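
The WAKE_AFFINE behaviour described above (try the target's cluster before the rest of the LLC domain) can be pictured with a toy model. This is only a sketch under heavy assumptions: CPUs are bits in a 64-bit mask, and first_idle_in() and pick_idle_cpu() are hypothetical names that do not correspond to the functions touched in kernel/sched/fair.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model: bit N of a mask stands for CPU N. Returns the lowest-numbered
 * CPU that is both idle and inside span, or -1 if there is none.
 */
static int first_idle_in(uint64_t idle, uint64_t span)
{
	uint64_t candidates = idle & span;
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (candidates & (UINT64_C(1) << cpu))
			return cpu;

	return -1;
}

/*
 * Hypothetical selection: prefer an idle CPU in the target's cluster and
 * only fall back to the remaining CPUs of the LLC domain when the cluster
 * has none.
 */
static int pick_idle_cpu(uint64_t idle, uint64_t cluster, uint64_t llc)
{
	int cpu;

	assert((cluster & ~llc) == 0);	/* the cluster lies inside the LLC span */

	cpu = first_idle_in(idle, cluster);
	if (cpu >= 0)
		return cpu;

	return first_idle_in(idle, llc & ~cluster);
}
```

With cluster 0xF (cpu0-cpu3) and LLC span 0xFFFFFF (cpu0-cpu23), an idle mask with cpu2 and cpu8 set selects cpu2, while an idle mask with only cpu8 set falls back to cpu8.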

Signed-off-by: Barry Song 
---
-v4:
  * rebased to tip/sched/core with the latest unified code of select_idle_cpu
  * also added benchmark data of spreading unrelated tasks
  * avoided the iteration of sched_domain by moving to static_key (addressing
    Vincent's comment)

 arch/arm64/Kconfig |  7 +
 include/linux/sched/cluster.h  | 19 
 include/linux/sched/sd_flags.h |  9 ++
 include/linux/sched/topology.h |  7 +
 include/linux/topology.h   |  7 +
 kernel/sched/core.c| 18 
 kernel/sched/fair.c| 66 +-
 kernel/sched/sched.h   |  1 +
 kernel/sched/topology.c|  6 
 9 files changed, 126 insertions(+), 14 deletions(-)
 create mode 100644 include/linux/sched/cluster.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f39568b..158b0fa 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -971,6 +971,13 @@ config SCHED_MC
  making when dealing with multi-core CPU chips at a cost of slightly
  increased overhead in some places. If unsure say N here.
 
+config SCHED_CLUSTER
+   bool "Cluster scheduler support"
+   help
+ Cluster scheduler support improves the CPU scheduler's decision
+ making when dealing with machines that have clusters(sharing internal
+ bus or sharing LLC cache tag). If unsure say N here.
+
 config SCHED_SMT
bool "SMT scheduler support"
help
diff --git a/include/linux/sched/cluster.h b/include/linux/sched/cluster.h
new file mode 100644
index 000..ea6c475
--- /dev/null
+++ b/include/linux/sched/cluster.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_SCHED_CLUSTER_H
+#define _LINUX_SCHED_CLUSTER_H
+
+#include 
+
+#ifdef CONFIG_SCHED_CLUSTER
+extern struct static_key_false sched_cluster_present;
+
+static __always_inline bool sched_cluster_active(void)
+{
+   return static_branch_likely(&sched_cluster_present);
+}
+#else
+static inline bool sched_cluster_active(void) { return false; }
+
+#endif
+
+#endif
diff --git a/include/linux/sched/sd_flags.h b/include/linux/sched/sd_flags.h
index 34b21e9..fc3c894 100644
--- a/include/linux/sched/sd_flags.h
+++ b/include/linux/sched/sd_flags.h
@@ -100,6 +100,15 @@
 SD_FLAG(SD_SHARE_CPUCAPACITY, SDF_SHARED_CHILD | SDF_NEEDS_GROUPS)
 
 /*
+ * Domain members share CPU cluster resources (i.e. llc cache tags)
+ *

[RFC PATCH v4 0/3] scheduler: expose the topology of clusters and add cluster scheduler

2021-03-01 Thread Barry Song
ARM64 server chip Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
cluster has 4 cpus. All clusters share L3 cache data while each cluster has
local L3 tag. On the other hand, each cluster will share some internal system
bus. This means cache is much more affine inside one cluster than across
clusters.

[Diagram: Kunpeng 920 NUMA node. Each cluster holds four CPUs (CPU0-CPU3 in
the first cluster) and its own L3 tag; all clusters attach to one shared L3
data block.]


There is a similar need for clustering in x86. Some x86 cores share L2 caches,
similar to the clusters in Kunpeng 920 (e.g. on Jacobsville there are 6 clusters
of 4 Atom cores, each cluster sharing a separate L2, and 24 cores sharing L3).

Having a sched_domain for clusters will bring two aspects of improvement:
1. spreading unrelated tasks among clusters, which decreases the contention of
resources and improves the throughput.
Without a cluster sched_domain, unrelated tasks might be placed randomly:
[Diagram: task 1 and task 2 both sit in cluster1; cluster2 is empty.]

but with cluster sched_domain, they are likely to spread due to LB:
[Diagram, truncated in the archive: one task in each cluster.]

[RFC PATCH v4 1/3] topology: Represent clusters of CPUs within a die.

2021-03-01 Thread Barry Song
From: Jonathan Cameron 

Both ACPI and DT provide the ability to describe additional layers of
topology between that of individual cores and higher level constructs
such as the level at which the last level cache is shared.
In ACPI this can be represented in PPTT as a Processor Hierarchy
Node Structure [1] that is the parent of the CPU cores and in turn
has a parent Processor Hierarchy Node Structure representing
a higher level of topology.

For example Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
cluster has 4 cpus. All clusters share L3 cache data, but each cluster
has local L3 tag. On the other hand, each cluster shares some
internal system bus.

[Diagram garbled and truncated in the archive. It shows one cluster of four CPUs (CPU0 to CPU3) attached to its local L3 tag, repeated for each cluster, with every L3 tag connected to the shared L3 data.]

Re: [PATCH v6 08/12] fork: Clear PASID for new mm

2021-03-01 Thread Jacob Pan
Hi Fenghua,

On Thu, 25 Feb 2021 22:17:11 +, Fenghua Yu  wrote:

> Hi, Jean,
> 
> On Wed, Feb 24, 2021 at 11:19:27AM +0100, Jean-Philippe Brucker wrote:
> > Hi Fenghua,
> > 
> > [Trimmed the Cc list]
> > 
> > On Mon, Jul 13, 2020 at 04:48:03PM -0700, Fenghua Yu wrote:  
> > > When a new mm is created, its PASID should be cleared, i.e. the PASID
> > > is initialized to its init state 0 on both ARM and X86.  
> > 
> > I just noticed this patch was dropped in v7, and am wondering whether we
> > could still upstream it. Does x86 need a child with a new address space
> > (!CLONE_VM) to inherit the PASID of the parent?  That doesn't make much
> > sense with regard to IOMMU structures - same PASID indexing multiple
> > PGDs?  
> 
> You are right: x86 should clear mm->pasid when a new mm is created.
> This patch somehow got lost :(
> 
> > 
> > Currently iommu_sva_alloc_pasid() assumes mm->pasid is always
> > initialized to 0 and fails on forked tasks. I'm trying to figure out
> > how to fix this. Could we clear the pasid on fork or does it break the
> > x86 model?  
> 
> x86 calls ioasid_alloc() instead of iommu_sva_alloc_pasid(). So
We should consolidate at some point, there is no need to store pasid in two
places.

> functionality is not a problem without this patch on x86. But I think
I suspect x86 doesn't care because mm->pasid is not used unless bind_mm
is called. For fork children, even if mm->pasid is non-zero, it has no
effect since it is not loaded into the MSRs.
Perhaps you could also add a check or WARN_ON(!mm->pasid) in load_pasid()?

> we do need to have this patch in the kernel because PASID is per addr
> space and two addr spaces shouldn't have the same PASID.
> 
Agreed.

> Who will accept this patch?
> 
> Thanks.
> 
> -Fenghua


Thanks,

Jacob
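
The invariant agreed on above (a new address space must not inherit the parent's PASID) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct toy_mm, TOY_INIT_PASID and toy_dup_mm() are hypothetical names invented for illustration and are not the kernel's mm_struct or fork path.

```c
/* Toy model: duplicating an address space must reset its PASID. */
#define TOY_INIT_PASID 0U	/* assumption: 0 is the "no PASID" init state */

struct toy_mm {
	unsigned int pasid;	/* per-address-space ID, hypothetical field */
};

/* Put the PASID of an address space back into its init state. */
static void toy_mm_init_pasid(struct toy_mm *mm)
{
	mm->pasid = TOY_INIT_PASID;
}

/* Model of creating a child address space from a parent (!CLONE_VM). */
static struct toy_mm toy_dup_mm(const struct toy_mm *parent)
{
	struct toy_mm child = *parent;	/* copies the parent's PASID as well */

	toy_mm_init_pasid(&child);	/* so it has to be cleared explicitly */
	return child;
}
```

Without the toy_mm_init_pasid() call, parent and child would carry the same PASID while owning different page tables, which is the situation the thread wants to rule out.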


[PATCH] usb: serial: io_edgeport: fix memory leak in edge_startup

2021-03-01 Thread Pavel Skripkin
syzbot found a memory leak in edge_startup().
The problem was that when usb_submit_urb() returned an error,
nothing was cleaned up.

Reported-by: syzbot+59f777bdcbdd7eea5...@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin 
---
 drivers/usb/serial/io_edgeport.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/serial/io_edgeport.c b/drivers/usb/serial/io_edgeport.c
index a493670c06e6..68401adcffde 100644
--- a/drivers/usb/serial/io_edgeport.c
+++ b/drivers/usb/serial/io_edgeport.c
@@ -3003,26 +3003,32 @@ static int edge_startup(struct usb_serial *serial)
 				response = -ENODEV;
 			}
 
-			usb_free_urb(edge_serial->interrupt_read_urb);
-			kfree(edge_serial->interrupt_in_buffer);
-
-			usb_free_urb(edge_serial->read_urb);
-			kfree(edge_serial->bulk_in_buffer);
-
-			kfree(edge_serial);
-
-			return response;
+			goto error;
 		}
 
 		/* start interrupt read for this edgeport this interrupt will
 		 * continue as long as the edgeport is connected */
 		response = usb_submit_urb(edge_serial->interrupt_read_urb,
 								GFP_KERNEL);
-		if (response)
+		if (response) {
 			dev_err(ddev, "%s - Error %d submitting control urb\n",
 				__func__, response);
+
+			goto error;
+		}
 	}
 	return response;
+
+error:
+	usb_free_urb(edge_serial->interrupt_read_urb);
+	kfree(edge_serial->interrupt_in_buffer);
+
+	usb_free_urb(edge_serial->read_urb);
+	kfree(edge_serial->bulk_in_buffer);
+
+	kfree(edge_serial);
+
+	return response;
 }
 
 
-- 
2.25.1



RE: [PATCH v3 5/8] mm: Device exclusive memory access

2021-03-01 Thread Ralph Campbell
> From: Alistair Popple 
> Sent: Thursday, February 25, 2021 11:18 PM
> To: linux...@kvack.org; nouv...@lists.freedesktop.org;
> bske...@redhat.com; a...@linux-foundation.org
> Cc: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org; dri-
> de...@lists.freedesktop.org; John Hubbard ; Ralph
> Campbell ; jgli...@redhat.com; Jason Gunthorpe
> ; h...@infradead.org; dan...@ffwll.ch; Alistair Popple
> 
> Subject: [PATCH v3 5/8] mm: Device exclusive memory access
> 
> Some devices require exclusive write access to shared virtual memory (SVM)
> ranges to perform atomic operations on that memory. This requires CPU page
> tables to be updated to deny access whilst atomic operations are occurring.
> 
> In order to do this introduce a new swap entry type (SWP_DEVICE_EXCLUSIVE).
> When a SVM range needs to be marked for exclusive access by a device all page
> table mappings for the particular range are replaced with device exclusive
> swap entries. This causes any CPU access to the page to result in a fault.
> 
> Faults are resolved by replacing the faulting entry with the original
> mapping. This results in MMU notifiers being called which a driver uses to
> update access permissions such as revoking atomic access. After notifiers
> have been called the device will no longer have exclusive access to the
> region.
> 
> Signed-off-by: Alistair Popple 
> ---
>  Documentation/vm/hmm.rst |  15 
>  include/linux/rmap.h |   3 +
>  include/linux/swap.h |   4 +-
>  include/linux/swapops.h  |  44 ++-
>  mm/hmm.c |   5 ++
>  mm/memory.c  | 108 +-
>  mm/mprotect.c|   8 ++
>  mm/page_vma_mapped.c |   9 ++-
>  mm/rmap.c| 163 +++
>  9 files changed, 352 insertions(+), 7 deletions(-)
...
> +int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
> + unsigned long end, struct page **pages) {
> + long npages = (end - start) >> PAGE_SHIFT;
> + long i;

Nit: you should use unsigned long for 'i' and 'npages' to match start/end.
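
To make the nit concrete, here is a minimal userspace sketch of the same arithmetic with the count and the loop index kept in the type of the bounds. TOY_PAGE_SHIFT and toy_range_npages() are hypothetical names for illustration, and the 4 KiB page size is an assumption; this is not the code from the patch.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Count the pages in [start, end). The count and the loop index use
 * unsigned long, the same type as the address bounds, so there is no
 * signed/unsigned mix in the shift or in the loop comparison.
 */
static unsigned long toy_range_npages(unsigned long start, unsigned long end)
{
	unsigned long npages, i, visited = 0;

	assert(end >= start);	/* the range must not be inverted */
	npages = (end - start) >> TOY_PAGE_SHIFT;
	for (i = 0; i < npages; i++)
		visited++;	/* stand-in for per-page work */
	return visited;
}
```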


Re: [PATCH 4.19 055/247] soc: aspeed: snoop: Add clock control logic

2021-03-01 Thread Joel Stanley
On Mon, 1 Mar 2021 at 16:37, Greg Kroah-Hartman
 wrote:
>
> From: Jae Hyun Yoo 
>
> [ Upstream commit 3f94cf15583be554df7aaa651b8ff8e1b68fbe51 ]
>
> If LPC SNOOP driver is registered ahead of lpc-ctrl module, LPC
> SNOOP block will be enabled without heart beating of LCLK until
> lpc-ctrl enables the LCLK. This issue causes improper handling on
> host interrupts when the host sends interrupt in that time frame.
> Then kernel eventually forcibly disables the interrupt with
> dumping stack and printing a 'nobody cared this irq' message out.
>
> To prevent this issue, all LPC sub-nodes should enable LCLK
> individually so this patch adds clock control logic into the LPC
> SNOOP driver.

Jae, John: with this backported, do we also need to provide a
corresponding device tree change for the stable tree? Otherwise this
driver will no longer probe.

>
> Fixes: 3772e5da4454 ("drivers/misc: Aspeed LPC snoop output using misc chardev")
> Signed-off-by: Jae Hyun Yoo 
> Signed-off-by: Vernon Mauery 
> Signed-off-by: John Wang 
> Reviewed-by: Joel Stanley 
> Link: https://lore.kernel.org/r/20201208091748.1920-1-wangzhiqiang...@bytedance.com
> Signed-off-by: Joel Stanley 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/misc/aspeed-lpc-snoop.c | 30 +++++++++++++++++++++++++++---
>  1 file changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/misc/aspeed-lpc-snoop.c b/drivers/misc/aspeed-lpc-snoop.c
> index c10be21a1663d..b4a776bf44bc5 100644
> --- a/drivers/misc/aspeed-lpc-snoop.c
> +++ b/drivers/misc/aspeed-lpc-snoop.c
> @@ -15,6 +15,7 @@
>   */
>
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -71,6 +72,7 @@ struct aspeed_lpc_snoop_channel {
>  struct aspeed_lpc_snoop {
> struct regmap   *regmap;
> int irq;
> +   struct clk  *clk;
> struct aspeed_lpc_snoop_channel chan[NUM_SNOOP_CHANNELS];
>  };
>
> @@ -286,22 +288,42 @@ static int aspeed_lpc_snoop_probe(struct platform_device *pdev)
> return -ENODEV;
> }
>
> +   lpc_snoop->clk = devm_clk_get(dev, NULL);
> +   if (IS_ERR(lpc_snoop->clk)) {
> +   rc = PTR_ERR(lpc_snoop->clk);
> +   if (rc != -EPROBE_DEFER)
> +   dev_err(dev, "couldn't get clock\n");
> +   return rc;
> +   }
> +   rc = clk_prepare_enable(lpc_snoop->clk);
> +   if (rc) {
> +   dev_err(dev, "couldn't enable clock\n");
> +   return rc;
> +   }
> +
> rc = aspeed_lpc_snoop_config_irq(lpc_snoop, pdev);
> if (rc)
> -   return rc;
> +   goto err;
>
> rc = aspeed_lpc_enable_snoop(lpc_snoop, dev, 0, port);
> if (rc)
> -   return rc;
> +   goto err;
>
> /* Configuration of 2nd snoop channel port is optional */
> if (of_property_read_u32_index(dev->of_node, "snoop-ports",
>                                1, &port) == 0) {
> rc = aspeed_lpc_enable_snoop(lpc_snoop, dev, 1, port);
> -   if (rc)
> +   if (rc) {
> aspeed_lpc_disable_snoop(lpc_snoop, 0);
> +   goto err;
> +   }
> }
>
> +   return 0;
> +
> +err:
> +   clk_disable_unprepare(lpc_snoop->clk);
> +
> return rc;
>  }
>
> @@ -313,6 +335,8 @@ static int aspeed_lpc_snoop_remove(struct platform_device *pdev)
> aspeed_lpc_disable_snoop(lpc_snoop, 0);
> aspeed_lpc_disable_snoop(lpc_snoop, 1);
>
> +   clk_disable_unprepare(lpc_snoop->clk);
> +
> return 0;
>  }
>
> --
> 2.27.0
>
>
>


Re: [PATCH 5.10 000/661] 5.10.20-rc2 review

2021-03-01 Thread Florian Fainelli
On 3/1/21 11:37 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.10.20 release.
> There are 661 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 03 Mar 2021 19:34:53 +.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
> 	https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.20-rc2.gz
> or in the git tree and branch at:
> 	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:

Tested-by: Florian Fainelli 
-- 
Florian


Re: [PATCHv2 3/4] coresight: etm4x: Add support to exclude kernel mode tracing

2021-03-01 Thread Doug Anderson
Hi,

On Mon, Mar 1, 2021 at 11:05 AM Sai Prakash Ranjan
 wrote:
>
> On production systems with ETMs enabled, it is preferred to exclude
> kernel mode(NS EL1) tracing for security concerns and support only
> userspace(NS EL0) tracing. Perf subsystem interface uses the newly
> introduced kernel config CONFIG_EXCLUDE_KERNEL_PMU_TRACE to exclude
> kernel mode tracing, but there is an additional interface via sysfs
> for ETMs which also needs to be handled to exclude kernel
> mode tracing. So we use this same generic kernel config to handle
> the sysfs mode of tracing. This config is disabled by default and
> would not affect the current configuration which has both kernel and
> userspace tracing enabled by default.
>
> Tested-by: Denis Nikitin 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/hwtracing/coresight/coresight-etm4x-core.c  | 6 +-
>  drivers/hwtracing/coresight/coresight-etm4x-sysfs.c | 6 ++
>  2 files changed, 11 insertions(+), 1 deletion(-)

Not that I'm an expert in the perf subsystem, but the concern I had
with v1 is now addressed.  FWIW this seems fine to me now.

Reviewed-by: Douglas Anderson 


> --- a/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-etm4x-sysfs.c
> @@ -296,6 +296,12 @@ static ssize_t mode_store(struct device *dev,
> if (kstrtoul(buf, 16, &val))
> return -EINVAL;
>
> +   if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_PMU_TRACE) && (!(val & ETM_MODE_EXCL_KERN))) {
> +   dev_warn(dev,
> +   "Kernel mode tracing is not allowed, check your kernel config\n");

slight nit that I think your string needs to be indented by 1 space.  ;-)
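
For readers following the review, the condition in the quoted hunk can be modelled in plain C. This is a minimal userspace sketch, not the driver code: TOY_EXCLUDE_KERNEL_PMU_TRACE stands in for IS_ENABLED(CONFIG_EXCLUDE_KERNEL_PMU_TRACE), TOY_MODE_EXCL_KERN is a hypothetical bit position rather than the real ETM_MODE_EXCL_KERN value, and toy_mode_is_allowed() is an invented helper.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel config option and the mode bit. */
#define TOY_EXCLUDE_KERNEL_PMU_TRACE 1
#define TOY_MODE_EXCL_KERN (1UL << 30)

/*
 * With the config enabled, a requested mode is acceptable only if it
 * already excludes kernel mode tracing; with the config disabled,
 * every mode value is accepted.
 */
static bool toy_mode_is_allowed(unsigned long val)
{
	if (TOY_EXCLUDE_KERNEL_PMU_TRACE && !(val & TOY_MODE_EXCL_KERN))
		return false;
	return true;
}
```

In this model a mode value of 0 is rejected while any value with the exclude-kernel bit set is accepted, which matches the intent described in the commit message.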


Re: [PATCH v1 03/15] powerpc/uaccess: Remove __get/put_user_inatomic()

2021-03-01 Thread Daniel Axtens
Christophe Leroy  writes:

> Since commit 662bbcb2747c ("mm, sched: Allow uaccess in atomic with
> pagefault_disable()"), __get/put_user() can be used in atomic parts
> of the code, therefore the __get/put_user_inatomic() introduced
> by commit e68c825bb016 ("[POWERPC] Add inatomic versions of __get_user
> and __put_user") have become useless.

I spent some time chasing these macro definitions.

Let me see if I understand you.

__get_user(x, ptr) becomes __get_user_nocheck(..., true)
__get_user_inatomic() becomes __get_user_nosleep()

The difference between how __get_user_nosleep() and
__get_user_nocheck(..., true) operate is that __get_user_nocheck calls
might_fault() and __get_user_nosleep() does not.

If I understand the commit you reference and mm/memory.c, you're saying
that we can indeed call might_fault() when page faults are disabled,
because __might_fault() checks if page faults are disabled and does not
fire a warning if it is called with page faults disabled.

Therefore, it is safe to remove our _inatomic version that does not call
might_fault and just to call might_fault unconditionally.

Is that right?
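
The reasoning above can be checked against a small userspace model. It is a sketch that assumes might_fault() behaves as described in this message (it returns early when page faults are disabled and only otherwise complains about atomic context); every toy_* name is hypothetical and none of this is the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

static int toy_pagefault_disable_depth;	/* models the per-task disable counter */
static bool toy_in_atomic;		/* models "currently in atomic context" */
static int toy_warnings;		/* number of warnings the model has fired */

static void toy_pagefault_disable(void)
{
	toy_pagefault_disable_depth++;
}

static void toy_pagefault_enable(void)
{
	assert(toy_pagefault_disable_depth > 0);
	toy_pagefault_disable_depth--;
}

/* Model of the check: silent whenever page faults are disabled. */
static void toy_might_fault(void)
{
	if (toy_pagefault_disable_depth > 0)
		return;
	if (toy_in_atomic)
		toy_warnings++;
}
```

Under this model, calling toy_might_fault() in atomic context fires a warning only when page faults are still enabled, which is why an unconditional call would be harmless for the former _inatomic users as long as they run under pagefault_disable().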

I haven't checked changes you made to the various .c files in fine
detail but they appear to be entirely mechanical.

> powerpc is the only one having such functions. There is a real
> intention not to have to provide such _inatomic() helpers, see the
> comment in might_fault() in mm/memory.c introduced by
> commit 3ee1afa308f2 ("x86: some lock annotations for user
> copy paths, v2"):
>
>   /*
>    * it would be nicer only to annotate paths which are not under
>    * pagefault_disable, however that requires a larger audit and
>    * providing helpers like get_user_atomic.
>    */
>

I'm not fully sure I understand what you're saying in this part of the
commit message.

Kind regards,
Daniel

>
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/uaccess.h| 37 ---
>  arch/powerpc/kernel/align.c   | 32 
>  .../kernel/hw_breakpoint_constraints.c|  2 +-
>  arch/powerpc/kernel/traps.c   |  2 +-
>  4 files changed, 18 insertions(+), 55 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index a08c482b1315..01aea0df4dd0 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -53,11 +53,6 @@ static inline bool __access_ok(unsigned long addr, unsigned long size)
>  #define __put_user(x, ptr) \
>   __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
>  
> -#define __get_user_inatomic(x, ptr) \
> - __get_user_nosleep((x), (ptr), sizeof(*(ptr)))
> -#define __put_user_inatomic(x, ptr) \
> - __put_user_nosleep((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
> -
>  #ifdef CONFIG_PPC64
>  
>  #define ___get_user_instr(gu_op, dest, ptr)  \
> @@ -92,9 +87,6 @@ static inline bool __access_ok(unsigned long addr, unsigned long size)
>  #define __get_user_instr(x, ptr) \
>   ___get_user_instr(__get_user, x, ptr)
>  
> -#define __get_user_instr_inatomic(x, ptr) \
> - ___get_user_instr(__get_user_inatomic, x, ptr)
> -
>  extern long __put_user_bad(void);
>  
>  #define __put_user_size(x, ptr, size, retval)\
> @@ -141,20 +133,6 @@ __pu_failed: \
>   __pu_err;   \
>  })
>  
> -#define __put_user_nosleep(x, ptr, size) \
> -({   \
> - long __pu_err;  \
> - __typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
> - __typeof__(*(ptr)) __pu_val = (x);  \
> - __typeof__(size) __pu_size = (size);\
> - \
> - __chk_user_ptr(__pu_addr);  \
> - __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
> - \
> - __pu_err;   \
> -})
> -
> -
>  /*
>   * We don't tell gcc that we are accessing memory, but this is OK
>   * because we do not write to any memory gcc knows about, so there
> @@ -320,21 +298,6 @@ do { \
>   __gu_err;   \
>  })
>  
> -#define __get_user_nosleep(x, ptr, size) \
> -({   \
> - long __gu_err;  \
> - __long_type(*(ptr)) __gu_val;   \
> - __typeof__(*(ptr)) __user *__gu_addr = (ptr);   \
> - __typeof__(size) __gu_size = (size);\
> -   

Re: [PATCHv2 2/4] perf evsel: Print warning for excluding kernel mode instruction tracing

2021-03-01 Thread Doug Anderson
Hi,

On Mon, Mar 1, 2021 at 11:05 AM Sai Prakash Ranjan
 wrote:
>
> Add a warning message to check CONFIG_EXCLUDE_KERNEL_HW_ITRACE kernel
> config which excludes kernel mode instruction tracing to help perf tool
> users identify the perf event open failure when they attempt kernel mode
> tracing with this config enabled.
>
> Tested-by: Denis Nikitin 
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  tools/perf/util/evsel.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

I'm not really knowledgeable at all about the perf subsystem so my
review doesn't hold a lot of weight.  However, Sai's patch seems sane
to me.

Reviewed-by: Douglas Anderson 


Re: Question about the "EXPERIMENTAL" tag for dax in XFS

2021-03-01 Thread Dave Chinner
On Mon, Mar 01, 2021 at 12:55:53PM -0800, Dan Williams wrote:
> On Sun, Feb 28, 2021 at 2:39 PM Dave Chinner  wrote:
> >
> > On Sat, Feb 27, 2021 at 03:40:24PM -0800, Dan Williams wrote:
> > > On Sat, Feb 27, 2021 at 2:36 PM Dave Chinner  wrote:
> > > > On Fri, Feb 26, 2021 at 02:41:34PM -0800, Dan Williams wrote:
> > > > > > On Fri, Feb 26, 2021 at 1:28 PM Dave Chinner  wrote:
> > > > > > On Fri, Feb 26, 2021 at 12:59:53PM -0800, Dan Williams wrote:
> > > > it points to, check if it points to the PMEM that is being removed,
> > > > grab the page it points to, map that to the relevant struct page,
> > > > run collect_procs() on that page, then kill the user processes that
> > > > map that page.
> > > >
> > > > So why can't we walk the ptes, check the physical pages that they
> > > > map to and, if they map to a pmem page, go poison that
> > > > page so that any user process that maps it is killed?
> > > >
> > > > i.e. I can't see how unexpected pmem device unplug is any different
> > > > to an MCE delivering a hwpoison event to a DAX mapped page.
> > >
> > > I guess the tradeoff is walking a long list of inodes vs walking a
> > > large array of pages.
> >
> > Not really. You're assuming all a filesystem has to do is invalidate
> > everything if a device goes away, and that's not true. Finding if an
> > inode has a mapping that spans a specific device in a multi-device
> > filesystem can be a lot more complex than that. Just walking inodes
> > is easy - determining which inodes need invalidation is the hard
> > part.
> 
> That inode-to-device level of specificity is not needed for the same
> reason that drop_caches does not need to be specific. If the wrong
> page is unmapped a re-fault will bring it back, and re-fault will fail
> for the pages that are successfully removed.
> 
> > That's where ->corrupt_range() comes in - the filesystem is already
> > set up to do reverse mapping from physical range to inode(s)
> > offsets...
> 
> Sure, but what is the need to get to that level of specificity with
> the filesystem for something that should rarely happen in the course
> of normal operation outside of a mistake?

Dan, you made this mistake with the hwpoisoning code, and we're
trying to fix that here. Hard coding a 1:1 physical address to
inode/offset into the DAX mapping was a bad mistake. It's also one
that should never have occurred because it's *obviously wrong* to
filesystem developers and has been for a long time.

Now we have the filesystem people providing a mechanism for the pmem
devices to tell the filesystems about physical device failures so
they can handle such failures correctly themselves. Having the
device go away unexpectedly from underneath a mounted and active
filesystem is a *device failure*, not an "unplug event".

The mistake you made was not understanding how filesystems work,
nor actually asking filesystem developers what they actually needed.
You're doing the same thing here - you're telling us what you think
the solution filesystems need is. Please listen when we say "that is
not sufficient" because we don't want to be backed into a corner
that we have to fix ourselves again before we can enable some basic
filesystem functionality that we should have been able to support on
DAX from the start...

> > > There's likely always more pages than inodes, but perhaps it's more
> > > efficient to walk the 'struct page' array than sb->s_inodes?
> >
> > I really don't see why you seem to be telling us that invalidation is an
> > either/or choice. There's more ways to convert physical block
> > address -> inode file offset and mapping index than brute force
> > inode cache walks
> 
> Yes, but I was trying to map it to an existing mechanism and the
> internals of drop_pagecache_sb() are, in coarse terms, close to what
> needs to happen here.

No.

drop_pagecache_sb() is not a relevant model for telling a filesystem
that the block device underneath has gone away, nor for a device to
ensure that access protections that *are managed by the filesystem*
are enforced/revoked sanely.

drop_pagecache_sb() is a brute-force model for invalidating user
data mappings that the filesystem performs in response to such a
notification. It only needs this brute-force approach if it has no
other way to find active DAX mappings across the range of the device
that has gone away.

But this model doesn't work for direct mapped metadata, journals or
any other internal direct filesystem mappings that aren't referenced
by inodes that the filesystem might be using. The filesystem still
needs to invalidate all those mappings and prevent further access to
them, even from within the kernel itself.

Filesystems are way more complex than pure DAX devices, and hence
handle errors and failure events differently. Unlike DAX devices, we
have both internal and external references to the DAX device, and we
can have both external and internal direct maps.  Invalidating user
data mappings is all dax devices need to do on unplug, 

Re: [PATCHv2 1/4] perf/core: Add support to exclude kernel mode PMU tracing

2021-03-01 Thread Doug Anderson
Hi,

On Mon, Mar 1, 2021 at 11:05 AM Sai Prakash Ranjan
 wrote:
>
> Hardware assisted tracing families such as ARM Coresight and Intel PT
> provide rich tracing capabilities including instruction level
> tracing and accurate timestamps, which are very useful for profiling
> but also pose a significant security risk. One such example of
> security risk is when kernel mode tracing is not excluded and
> hardware assisted tracing can be used to analyze cryptographic code
> execution. In this case, even the root user must not be able to infer
> anything.
>
> To explain it more clearly in the words of a security team member
> (credits: Mattias Nissler),
>
> "Consider a system where disk contents are encrypted and the encryption
> key is set up by the user when mounting the file system. From that point
> on the encryption key resides in the kernel. It seems reasonable to
> expect that the disk encryption key be protected from exfiltration even
> if the system later suffers a root compromise (or even against insiders
> that have root access), at least as long as the attacker doesn't
> manage to compromise the kernel."
>
> Here the idea is to protect such important information from all users
> including root users, since root privileges do not have to mean full
> control over the kernel [1] and root compromise does not have to be
> the end of the world.
>
> But "Peter said even the regular counters can be used for full branch
> trace, the information isn't as accurate as PT and friends and not easier
> but is good enough to infer plenty". This would mean that a global tunable
> config for all kernel mode pmu tracing is more appropriate than the one
> targeting the hardware assisted instruction tracing.
>
> Currently we can exclude kernel mode tracing via the perf_event_paranoid
> sysctl, but it has the following limitations:
>
>  * No option to restrict kernel mode instruction tracing by the
>    root user.
>  * Not possible to restrict kernel mode instruction tracing when the
>    hardware assisted tracing IPs like ARM Coresight ETMs use an
>    additional interface via sysfs for tracing in addition to the perf
>    interface.
>
> So introduce a new config CONFIG_EXCLUDE_KERNEL_PMU_TRACE to exclude
> kernel mode pmu tracing which will be generic and applicable to all
> hardware tracing families and which can also be used with other
> interfaces like sysfs in case of ETMs.
>
> [1] https://lwn.net/Articles/796866/
>
> Suggested-by: Suzuki K Poulose 
> Suggested-by: Al Grant 
> Tested-by: Denis Nikitin 
> Link: https://lore.kernel.org/lkml/20201015124522.1876-1-saiprakash.ran...@codeaurora.org/
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  init/Kconfig | 11 +++
>  kernel/events/core.c |  3 +++
>  2 files changed, 14 insertions(+)

I'm not really knowledgeable at all about the perf subsystem so my
review doesn't hold a lot of weight.  However, Sai's patch seems sane
to me.

Reviewed-by: Douglas Anderson 


Re: [PATCHv2 4/4] coresight: etm3x: Add support to exclude kernel mode tracing

2021-03-01 Thread Doug Anderson
Hi,

On Mon, Mar 1, 2021 at 11:05 AM Sai Prakash Ranjan
 wrote:
>
> On production systems with ETMs enabled, it is preferred to exclude
> kernel mode(NS EL1) tracing for security concerns and support only
> userspace(NS EL0) tracing. Perf subsystem interface uses the newly
> introduced kernel config CONFIG_EXCLUDE_KERNEL_PMU_TRACE to exclude
> kernel mode tracing, but there is an additional interface
> via sysfs for ETMs which also needs to be handled to exclude kernel
> mode tracing. So we use this same generic kernel config to handle
> the sysfs mode of tracing. This config is disabled by default and
> would not affect the current configuration which has both kernel and
> userspace tracing enabled by default.
>
> Signed-off-by: Sai Prakash Ranjan 
> ---
>  drivers/hwtracing/coresight/coresight-etm3x-core.c  | 3 +++
>  drivers/hwtracing/coresight/coresight-etm3x-sysfs.c | 6 ++
>  2 files changed, 9 insertions(+)

Reviewed-by: Douglas Anderson 


> diff --git a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c
> index e8c7649f123e..f522fc2e01b3 100644
> --- a/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c
> +++ b/drivers/hwtracing/coresight/coresight-etm3x-sysfs.c
> @@ -116,6 +116,12 @@ static ssize_t mode_store(struct device *dev,
> if (ret)
> return ret;
>
> +   if (IS_ENABLED(CONFIG_EXCLUDE_KERNEL_PMU_TRACE) && (!(val & ETM_MODE_EXCL_KERN))) {
> +   dev_warn(dev,
> +   "Kernel mode tracing is not allowed, check your kernel config\n");

Same nit as in patch #3 that the above string should be indented by 1
more space.


[PATCH v9 4/6] userfaultfd: add UFFDIO_CONTINUE ioctl

2021-03-01 Thread Axel Rasmussen
This ioctl is how userspace ought to resolve "minor" userfaults. The
idea is, userspace is notified that a minor fault has occurred. It might
change the contents of the page using its second non-UFFD mapping, or
not. Then, it calls UFFDIO_CONTINUE to tell the kernel "I have ensured
the page contents are correct, carry on setting up the mapping".

Note that it doesn't make much sense to use UFFDIO_{COPY,ZEROPAGE} for
MINOR registered VMAs. ZEROPAGE maps the VMA to the zero page; but in
the minor fault case, we already have some pre-existing underlying page.
Likewise, UFFDIO_COPY isn't useful if we have a second non-UFFD mapping.
We'd just use memcpy() or similar instead.

It turns out hugetlb_mcopy_atomic_pte() already does very close to what
we want, if an existing page is provided via `struct page **pagep`. We
already special-case the behavior a bit for the UFFDIO_ZEROPAGE case, so
just extend that design: add an enum for the three modes of operation,
and make the small adjustments needed for the MCOPY_ATOMIC_CONTINUE
case. (Basically, look up the existing page, and avoid adding the
existing page to the page cache or calling set_page_huge_active() on
it.)

Reviewed-by: Peter Xu 
Signed-off-by: Axel Rasmussen 
---
 fs/userfaultfd.c | 67 
 include/linux/hugetlb.h  |  3 ++
 include/linux/userfaultfd_k.h| 18 +
 include/uapi/linux/userfaultfd.h | 21 +-
 mm/hugetlb.c | 40 ---
 mm/userfaultfd.c | 37 +++---
 6 files changed, 156 insertions(+), 30 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index ba35cafa8b0d..14f92285d04f 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1487,6 +1487,10 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
if (!(uffdio_register.mode & UFFDIO_REGISTER_MODE_WP))
ioctls_out &= ~((__u64)1 << _UFFDIO_WRITEPROTECT);
 
+   /* CONTINUE ioctl is only supported for MINOR ranges. */
+   if (!(uffdio_register.mode & UFFDIO_REGISTER_MODE_MINOR))
+   ioctls_out &= ~((__u64)1 << _UFFDIO_CONTINUE);
+
/*
 * Now that we scanned all vmas we can already tell
 * userland which ioctls methods are guaranteed to
@@ -1840,6 +1844,66 @@ static int userfaultfd_writeprotect(struct userfaultfd_ctx *ctx,
return ret;
 }
 
+static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
+{
+   __s64 ret;
+   struct uffdio_continue uffdio_continue;
+   struct uffdio_continue __user *user_uffdio_continue;
+   struct userfaultfd_wake_range range;
+
+   user_uffdio_continue = (struct uffdio_continue __user *)arg;
+
+   ret = -EAGAIN;
+   if (READ_ONCE(ctx->mmap_changing))
+   goto out;
+
+   ret = -EFAULT;
+   if (copy_from_user(&uffdio_continue, user_uffdio_continue,
+                      /* don't copy the output fields */
+                      sizeof(uffdio_continue) - (sizeof(__s64))))
+   goto out;
+
+   ret = validate_range(ctx->mm, &uffdio_continue.range.start,
+                        uffdio_continue.range.len);
+   if (ret)
+   goto out;
+
+   ret = -EINVAL;
+   /* double check for wraparound just in case. */
+   if (uffdio_continue.range.start + uffdio_continue.range.len <=
+   uffdio_continue.range.start) {
+   goto out;
+   }
+   if (uffdio_continue.mode & ~UFFDIO_CONTINUE_MODE_DONTWAKE)
+   goto out;
+
+   if (mmget_not_zero(ctx->mm)) {
+   ret = mcopy_continue(ctx->mm, uffdio_continue.range.start,
+                        uffdio_continue.range.len,
+                        &ctx->mmap_changing);
+   mmput(ctx->mm);
+   } else {
+   return -ESRCH;
+   }
+
+   if (unlikely(put_user(ret, &user_uffdio_continue->mapped)))
+   return -EFAULT;
+   if (ret < 0)
+   goto out;
+
+   /* len == 0 would wake all */
+   BUG_ON(!ret);
+   range.len = ret;
+   if (!(uffdio_continue.mode & UFFDIO_CONTINUE_MODE_DONTWAKE)) {
+   range.start = uffdio_continue.range.start;
+   wake_userfault(ctx, &range);
+   }
+   ret = range.len == uffdio_continue.range.len ? 0 : -EAGAIN;
+
+out:
+   return ret;
+}
+
 static inline unsigned int uffd_ctx_features(__u64 user_features)
 {
/*
@@ -1927,6 +1991,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd,
case UFFDIO_WRITEPROTECT:
ret = userfaultfd_writeprotect(ctx, arg);
break;
+   case UFFDIO_CONTINUE:
+   ret = userfaultfd_continue(ctx, arg);
+   break;
}
return ret;
 }
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 7b86bf809d7a..1d3246b31a41 100644
--- 

Re: [PATCH V6] x86/mm: Tracking linear mapping split events

2021-03-01 Thread Tejun Heo
Hello,

On Thu, Feb 18, 2021 at 03:57:44PM -0800, Saravanan D wrote:
> To help with debugging the sluggishness caused by TLB miss/reload,
> we introduce monotonic hugepage [direct mapped] split event counts since
> system state: SYSTEM_RUNNING to be displayed as part of
> /proc/vmstat in x86 servers
...
> Signed-off-by: Saravanan D 
> Acked-by: Tejun Heo 
> Acked-by: Johannes Weiner 
> Acked-by: Dave Hansen 

Andrew, do you mind picking this one up? It has enough acks and can go
through either mm or x86 tree.

Thank you.

-- 
tejun


Re: [PATCH v2 1/2] tty/serial: Add rx-tx-swap OF option to stm32-usart

2021-03-01 Thread Martin DEVERA

On 3/1/21 11:28 AM, Fabrice Gasnier wrote:

> On 2/27/21 5:41 PM, Martin Devera wrote:
> > STM32 F7/H7 usarts supports RX & TX pin swapping.
> > Add option to turn it on.
> > Tested on STM32MP157.
> >
> > Signed-off-by: Martin Devera
> > ---
> >   drivers/tty/serial/stm32-usart.c | 3 ++-
> >   drivers/tty/serial/stm32-usart.h | 1 +
> >   2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
> > index b3675cf25a69..3650c8798061 100644
> > --- a/drivers/tty/serial/stm32-usart.c
> > +++ b/drivers/tty/serial/stm32-usart.c
> > @@ -758,7 +758,7 @@ static void stm32_usart_set_termios(struct uart_port *port,
> > cr1 = USART_CR1_TE | USART_CR1_RE;
> > if (stm32_port->fifoen)
> > cr1 |= USART_CR1_FIFOEN;
> > -   cr2 = 0;
> > +   cr2 = stm32_port->swap ? USART_CR2_SWAP : 0;
>
> Hi Martin,
>
> Same could be done in the startup routine, that enables the port for
> reception (as described in Documentation/driver-api/serial/driver.rst)

Hello Fabrice,

I already incorporated all your comments but I'm struggling with the one above.

The code must be in stm32_usart_set_termios too, because CR2 is modified.
What is the reason to have it in startup()?
Is it because the USART can be started without calling set_termios at all, for
example to reuse the bootloader's last settings?

Thanks, Martin



[PATCH v9 5/6] userfaultfd: update documentation to describe minor fault handling

2021-03-01 Thread Axel Rasmussen
Reword / reorganize things a little bit into "lists", so new features /
modes / ioctls can sort of just be appended.

Describe how UFFDIO_REGISTER_MODE_MINOR and UFFDIO_CONTINUE can be used
to intercept and resolve minor faults. Make it clear that COPY and
ZEROPAGE are used for MISSING faults, whereas CONTINUE is used for MINOR
faults.
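
As a rough illustration of that pairing, the rule can be written as a small
lookup. This is only a sketch: the enum and function names below are
hypothetical and are not definitions from the kernel or the uapi headers.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; these are not uapi definitions. */
enum uffd_fault_type { UFFD_FAULT_MISSING, UFFD_FAULT_MINOR };
enum uffd_resolve_op { UFFD_OP_COPY, UFFD_OP_ZEROPAGE, UFFD_OP_CONTINUE };

/* Return true if the given ioctl is the documented way to resolve the fault. */
static bool uffd_op_resolves(enum uffd_resolve_op op, enum uffd_fault_type type)
{
	switch (op) {
	case UFFD_OP_COPY:
	case UFFD_OP_ZEROPAGE:
		/* Both install new page contents, so they answer MISSING faults. */
		return type == UFFD_FAULT_MISSING;
	case UFFD_OP_CONTINUE:
		/* The page already exists; only the mapping has to be set up. */
		return type == UFFD_FAULT_MINOR;
	}
	return false;
}
```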

Reviewed-by: Peter Xu 
Signed-off-by: Axel Rasmussen 
---
 Documentation/admin-guide/mm/userfaultfd.rst | 107 ---
 1 file changed, 66 insertions(+), 41 deletions(-)

diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index 65eefa66c0ba..3aa38e8b8361 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -63,36 +63,36 @@ the generic ioctl available.
 
 The ``uffdio_api.features`` bitmask returned by the ``UFFDIO_API`` ioctl
 defines what memory types are supported by the ``userfaultfd`` and what
-events, except page fault notifications, may be generated.
-
-If the kernel supports registering ``userfaultfd`` ranges on hugetlbfs
-virtual memory areas, ``UFFD_FEATURE_MISSING_HUGETLBFS`` will be set in
-``uffdio_api.features``. Similarly, ``UFFD_FEATURE_MISSING_SHMEM`` will be
-set if the kernel supports registering ``userfaultfd`` ranges on shared
-memory (covering all shmem APIs, i.e. tmpfs, ``IPCSHM``, ``/dev/zero``,
-``MAP_SHARED``, ``memfd_create``, etc).
-
-The userland application that wants to use ``userfaultfd`` with hugetlbfs
-or shared memory need to set the corresponding flag in
-``uffdio_api.features`` to enable those features.
-
-If the userland desires to receive notifications for events other than
-page faults, it has to verify that ``uffdio_api.features`` has appropriate
-``UFFD_FEATURE_EVENT_*`` bits set. These events are described in more
-detail below in `Non-cooperative userfaultfd`_ section.
-
-Once the ``userfaultfd`` has been enabled the ``UFFDIO_REGISTER`` ioctl should
-be invoked (if present in the returned ``uffdio_api.ioctls`` bitmask) to
-register a memory range in the ``userfaultfd`` by setting the
+events, except page fault notifications, may be generated:
+
+- The ``UFFD_FEATURE_EVENT_*`` flags indicate that various other events
+  other than page faults are supported. These events are described in more
+  detail below in the `Non-cooperative userfaultfd`_ section.
+
+- ``UFFD_FEATURE_MISSING_HUGETLBFS`` and ``UFFD_FEATURE_MISSING_SHMEM``
+  indicate that the kernel supports ``UFFDIO_REGISTER_MODE_MISSING``
+  registrations for hugetlbfs and shared memory (covering all shmem APIs,
+  i.e. tmpfs, ``IPCSHM``, ``/dev/zero``, ``MAP_SHARED``, ``memfd_create``,
+  etc) virtual memory areas, respectively.
+
+- ``UFFD_FEATURE_MINOR_HUGETLBFS`` indicates that the kernel supports
+  ``UFFDIO_REGISTER_MODE_MINOR`` registration for hugetlbfs virtual memory
+  areas.
+
+The userland application should set the feature flags it intends to use
+when invoking the ``UFFDIO_API`` ioctl, to request that those features be
+enabled if supported.
+
+Once the ``userfaultfd`` API has been enabled the ``UFFDIO_REGISTER``
+ioctl should be invoked (if present in the returned ``uffdio_api.ioctls``
+bitmask) to register a memory range in the ``userfaultfd`` by setting the
 uffdio_register structure accordingly. The ``uffdio_register.mode``
 bitmask will specify to the kernel which kind of faults to track for
-the range (``UFFDIO_REGISTER_MODE_MISSING`` would track missing
-pages). The ``UFFDIO_REGISTER`` ioctl will return the
+the range. The ``UFFDIO_REGISTER`` ioctl will return the
 ``uffdio_register.ioctls`` bitmask of ioctls that are suitable to resolve
 userfaults on the range registered. Not all ioctls will necessarily be
-supported for all memory types depending on the underlying virtual
-memory backend (anonymous memory vs tmpfs vs real filebacked
-mappings).
+supported for all memory types (e.g. anonymous memory vs. shmem vs.
+hugetlbfs), or all types of intercepted faults.
 
 Userland can use the ``uffdio_register.ioctls`` to manage the virtual
 address space in the background (to add or potentially also remove
@@ -100,21 +100,46 @@ memory from the ``userfaultfd`` registered range). This means a userfault
 could be triggering just before userland maps in the background the
 user-faulted page.
 
-The primary ioctl to resolve userfaults is ``UFFDIO_COPY``. That
-atomically copies a page into the userfault registered range and wakes
-up the blocked userfaults
-(unless ``uffdio_copy.mode & UFFDIO_COPY_MODE_DONTWAKE`` is set).
-Other ioctl works similarly to ``UFFDIO_COPY``. They're atomic as in
-guaranteeing that nothing can see an half copied page since it'll
-keep userfaulting until the copy has finished.
+Resolving Userfaults
+--------------------
+
+There are three basic ways to resolve userfaults:
+
+- ``UFFDIO_COPY`` atomically copies some existing page contents from
+  userspace.
+
+- ``UFFDIO_ZEROPAGE`` atomically zeros the new page.
+

Re: [PATCH v2 1/4] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

2021-03-01 Thread Sean Christopherson
On Wed, Feb 03, 2021, Like Xu wrote:
> @@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
>   return true;
>  }
>  
> +/*
> + * Check if the requested depth values is supported
> + * based on the bits [0:7] of the guest cpuid.1c.eax.
> + */
> +static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
> +{
> + struct kvm_cpuid_entry2 *best;
> +
> + best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
> + if (depth && best)

> + return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));

I believe this will generate undefined behavior if depth > 64.  Or if depth < 8.
And I believe this check also needs to enforce that depth is a multiple of 8.

   For each bit n set in this field, the IA32_LBR_DEPTH.DEPTH value 8*(n+1) is
   supported.

Thus it's impossible for 0-7, 9-15, etc... to be legal depths.
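
A minimal sketch of a check with those properties follows. It is not the code
from the patch: supported_mask is an assumed stand-in for
CPUID.(EAX=1CH):EAX[7:0], the function name is made up, and the upper bound of
64 simply follows from the field being 8 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: supported_mask stands in for CPUID.(EAX=1CH):EAX[7:0].
 * Bit n set means the depth value 8 * (n + 1) is supported.
 */
static bool lbr_depth_is_valid(uint8_t supported_mask, uint64_t depth)
{
	/* Legal depths are 8, 16, ..., 64: nonzero multiples of 8 up to 64. */
	if (depth == 0 || depth > 64 || depth % 8 != 0)
		return false;

	/* After the range check the shift count is always in 0..7. */
	return (supported_mask & (1u << (depth / 8 - 1))) != 0;
}
```

Rejecting out-of-range and non-multiple values first means the shift is never
evaluated with a negative or oversized count.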


> +
> + return false;
> +}
> +



Re: [PATCH 1/1] docs: arm: /chosen node parameters

2021-03-01 Thread Jonathan Corbet
Heinrich Schuchardt  writes:

> Add missing items to table of parameters set in the /chosen node by the EFI
> stub.
>
> Signed-off-by: Heinrich Schuchardt 
> ---
>  Documentation/arm/uefi.rst | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/Documentation/arm/uefi.rst b/Documentation/arm/uefi.rst
> index f732f957421f..9b0b5e458a1e 100644
> --- a/Documentation/arm/uefi.rst
> +++ b/Documentation/arm/uefi.rst
> @@ -64,4 +64,11 @@ linux,uefi-mmap-desc-size   32-bit   Size in bytes of each 
> entry in the UEFI
>   memory map.
>
>  linux,uefi-mmap-desc-ver32-bit   Version of the mmap descriptor format.
> +
> +linux,initrd-start  64-bit   Physical start address of an initrd
> +
> +linux,initrd-end64-bit   Physical end address of an initrd
> +
> +kaslr-seed  64-bit   Entropy used to randomize the kernel 
> image
> + base address location.
>  ==  ==   
> ===

Applied, thanks.

jon


Re: [PATCH v1 01/15] powerpc/uaccess: Remove __get_user_allowed() and unsafe_op_wrap()

2021-03-01 Thread Segher Boessenkool
On Tue, Mar 02, 2021 at 09:02:54AM +1100, Daniel Axtens wrote:
> Checkpatch does have one check that is relevant:
> 
> CHECK: Macro argument reuse 'p' - possible side-effects?
> #36: FILE: arch/powerpc/include/asm/uaccess.h:482:
> +#define unsafe_get_user(x, p, e) do { \
> + if (unlikely(__get_user_nocheck((x), (p), sizeof(*(p)), false)))\
> + goto e; \
> +} while (0)

sizeof (of something other than a VLA) does not evaluate its operand.
The checkpatch warning is incorrect (well, it does say "possible" --
it just didn't find a possible problem here).

You can write
  bla = sizeof *p++;
and p is *not* incremented.
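
A small standalone check of that claim, as a sketch (the function name is made
up for illustration, and some compilers will warn about the unevaluated side
effect, which is exactly the point):

```c
#include <stddef.h>

/* Return 1 if p is unchanged after being used as a sizeof operand. */
static int sizeof_leaves_pointer_alone(void)
{
	int array[4] = { 0 };
	int *p = array;

	/* The operand is not a VLA, so it is not evaluated: no increment. */
	size_t size = sizeof *p++;

	return p == array && size == sizeof(int);
}
```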


Segher


Re: [PATCH] mmc: Try power cycling card if command request times out

2021-03-01 Thread Marten Lindahl
Hi Adrian!

Thank you for your comments!

On Mon, Mar 01, 2021 at 11:40:03AM +0100, Adrian Hunter wrote:
> On 1/03/21 10:50 am, Ulf Hansson wrote:
> > + Adrian
> > 
> > On Tue, 16 Feb 2021 at 23:43, Mårten Lindahl wrote:
> >>
> >> Sometimes SD cards that has been run for a long time enters a state
> >> where it cannot by itself be recovered, but needs a power cycle to be
> >> operational again. Card status analysis has indicated that the card can
> >> end up in a state where all external commands are ignored by the card
> >> since it is halted by data timeouts.
> >>
> >> If the card has been heavily used for a long time it can be weared out,
> >> and should typically be replaced. But on some tests, it shows that the
> >> card can still be functional after a power cycle, but as it requires an
> >> operator to do it, the card can remain in a non-operational state for a
> >> long time until the problem has been observed by the operator.
> >>
> >> This patch adds function to power cycle the card in case it does not
> >> respond to a command, and then resend the command if the power cycle
> >> was successful. This procedure will be tested 1 time before giving up,
> >> and resuming host operation as normal.
> > 
> > I assume the context above is all about the ioctl interface?
> > 
> > So, when the card enters this non functional state, have you tried
> > just reading a block through the regular I/O interface. Does it
> > trigger a power cycle of the card - and then makes it functional
> > again?
> > 
> >>
> >> Signed-off-by: Mårten Lindahl 
> >> ---
> >> Please note: This might not be the way we want to handle these cases,
> >> but at least it lets us start the discussion. In which cases should the
> >> mmc framework deal with error messages like ETIMEDOUT, and in which
> >> cases should it be handled by userspace?
> >> The mmc framework tries to recover a failed block request
> >> (mmc_blk_mq_rw_recovery) which may end up in a HW reset of the card.
> >> Would it be an idea to act in a similar way when an ioctl times out?
> > 
> > Maybe, it's a good idea to allow the similar reset for ioctls as we do
> > for regular I/O requests. My concern with this though, is that we
> > might allow user space to trigger a HW resets a bit too easily - and
> > that could damage the card.
> > 
> > Did you consider this?
> > 
> >>
> >>  drivers/mmc/core/block.c | 20 ++--
> >>  1 file changed, 18 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
> >> index 42e27a298218..d007b2af64d6 100644
> >> --- a/drivers/mmc/core/block.c
> >> +++ b/drivers/mmc/core/block.c
> >> @@ -976,6 +976,7 @@ static inline void mmc_blk_reset_success(struct mmc_blk_data *md, int type)
> >>   */
> >>  static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
> >>  {
> >> +   int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
> >> struct mmc_queue_req *mq_rq;
> >> struct mmc_card *card = mq->card;
> >> struct mmc_blk_data *md = mq->blkdata;
> >> @@ -983,7 +984,7 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
> >> bool rpmb_ioctl;
> >> u8 **ext_csd;
> >> u32 status;
> >> -   int ret;
> >> +   int ret, retry = 1;
> >> int i;
> >>
> >> mq_rq = req_to_mmc_queue_req(req);
> >> @@ -994,9 +995,24 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
> >> case MMC_DRV_OP_IOCTL_RPMB:
> 
> SD cards do not have RPMB.  Did you mean eMMC?
> 

No, you are right. This action should be excluded from 'case MMC_DRV_OP_IOCTL_RPMB'.

> 
> >> idata = mq_rq->drv_op_data;
> >> for (i = 0, ret = 0; i < mq_rq->ioc_count; i++) {
> >> +cmd_do:
> >> ret = __mmc_blk_ioctl_cmd(card, md, idata[i]);
> >> -   if (ret)
> >> +   if (ret == -ETIMEDOUT) {
> >> +   dev_warn(mmc_dev(card->host),
> >> +"error %d sending command\n", ret);
> >> +cmd_reset:
> >> +   mmc_blk_reset_success(md, type);
> 
> mmc_blk_reset_success() is called upon success, not failure.  The reset will
> not be attempted twice in a row, for a given type, without a "success" in
> between.
> 

Ok, yes I see. This line and the cmd_reset label should be removed, and if
mmc_blk_reset fails we should break, not retry.
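
The control flow I have in mind would be roughly the following. This is only a
sketch outside the kernel: send_cmd_fn and reset_fn are hypothetical stand-ins
for __mmc_blk_ioctl_cmd() and mmc_blk_reset(), and struct fake_card exists
purely so the flow can be exercised.

```c
#include <errno.h>

/* Hypothetical stand-ins for __mmc_blk_ioctl_cmd() and mmc_blk_reset(). */
typedef int (*send_cmd_fn)(void *ctx);
typedef int (*reset_fn)(void *ctx);

/* Send a command; on timeout, reset once and resend only if the reset worked. */
static int send_with_one_power_cycle(send_cmd_fn send, reset_fn reset, void *ctx)
{
	int ret = send(ctx);

	if (ret != -ETIMEDOUT)
		return ret;	/* success, or an error a reset would not help */

	if (reset(ctx))
		return ret;	/* reset failed: break out, do not retry */

	return send(ctx);	/* exactly one retry after a successful reset */
}

/* Fake card used only to illustrate the flow. */
struct fake_card {
	int timeouts_left;	/* how many sends will still time out */
	int reset_ok;		/* nonzero if a reset succeeds */
	int sends;		/* number of commands issued so far */
};

static int fake_send(void *ctx)
{
	struct fake_card *card = ctx;

	card->sends++;
	if (card->timeouts_left > 0) {
		card->timeouts_left--;
		return -ETIMEDOUT;
	}
	return 0;
}

static int fake_reset(void *ctx)
{
	struct fake_card *card = ctx;

	return card->reset_ok ? 0 : -EIO;
}
```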

Kind regards
Mårten

> >> +   if (retry--) {
> >> +   dev_warn(mmc_dev(card->host),
> >> +"power cycling card\n");
> >> +   if (mmc_blk_reset
> >> +   (md, card->host, type))
> >> +   goto cmd_reset;
> >> +  

[PATCH v9 6/6] userfaultfd/selftests: add test exercising minor fault handling

2021-03-01 Thread Axel Rasmussen
Fix a dormant bug in userfaultfd_events_test(), where we did
`return faulting_process(0)` instead of `exit(faulting_process(0))`.
This caused the forked process to keep running, trying to execute any
further test cases after the events test in parallel with the "real"
process.

Add a simple test case which exercises minor faults. In short, it does
the following:

1. "Sets up" an area (area_dst) and a second shared mapping to the same
   underlying pages (area_dst_alias).

2. Register one of these areas with userfaultfd, in minor fault mode.

3. Start a second thread to handle any minor faults.

4. Populate the underlying pages with the non-UFFD-registered side of
   the mapping. Basically, memset() each page with some arbitrary
   contents.

5. Then, using the UFFD-registered mapping, read all of the page
   contents, asserting that the contents match expectations (we expect
   the minor fault handling thread can modify the page contents before
   resolving the fault).

The minor fault handling thread, upon receiving an event, flips all the
bits (~) in that page, just to prove that it can modify it in some
arbitrary way. Then it issues a UFFDIO_CONTINUE ioctl, to setup the
mapping and resolve the fault. The reading thread should wake up and see
this modification.
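
The page transformation and the reader's expectation can be sketched in
isolation as below. The helper names are hypothetical and are not the ones
used in the selftest; the userfaultfd plumbing (reading the event, issuing
UFFDIO_CONTINUE) is deliberately left out.

```c
#include <stddef.h>

/* Invert every bit of a page-sized buffer, as the fault handler does. */
static void flip_page_bits(unsigned char *page, size_t page_size)
{
	for (size_t i = 0; i < page_size; i++)
		page[i] = (unsigned char)~page[i];
}

/* Return 1 if every byte equals the complement of the original fill byte. */
static int page_matches_flipped(const unsigned char *page, size_t page_size,
				unsigned char fill)
{
	unsigned char expected = (unsigned char)~fill;

	for (size_t i = 0; i < page_size; i++) {
		if (page[i] != expected)
			return 0;
	}
	return 1;
}
```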

Currently the minor fault test is only enabled in hugetlb_shared mode,
as this is the only configuration the kernel feature supports.

Reviewed-by: Peter Xu 
Signed-off-by: Axel Rasmussen 
---
 tools/testing/selftests/vm/userfaultfd.c | 164 ++-
 1 file changed, 158 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/vm/userfaultfd.c b/tools/testing/selftests/vm/userfaultfd.c
index 92b8ec423201..f5ab5e0312e7 100644
--- a/tools/testing/selftests/vm/userfaultfd.c
+++ b/tools/testing/selftests/vm/userfaultfd.c
@@ -81,6 +81,8 @@ static volatile bool test_uffdio_copy_eexist = true;
 static volatile bool test_uffdio_zeropage_eexist = true;
 /* Whether to test uffd write-protection */
 static bool test_uffdio_wp = false;
+/* Whether to test uffd minor faults */
+static bool test_uffdio_minor = false;
 
 static bool map_shared;
 static int huge_fd;
@@ -96,6 +98,7 @@ struct uffd_stats {
int cpu;
unsigned long missing_faults;
unsigned long wp_faults;
+   unsigned long minor_faults;
 };
 
 /* pthread_mutex_t starts at page offset 0 */
@@ -153,17 +156,19 @@ static void uffd_stats_reset(struct uffd_stats *uffd_stats,
uffd_stats[i].cpu = i;
uffd_stats[i].missing_faults = 0;
uffd_stats[i].wp_faults = 0;
+   uffd_stats[i].minor_faults = 0;
}
 }
 
 static void uffd_stats_report(struct uffd_stats *stats, int n_cpus)
 {
int i;
-   unsigned long long miss_total = 0, wp_total = 0;
+   unsigned long long miss_total = 0, wp_total = 0, minor_total = 0;
 
for (i = 0; i < n_cpus; i++) {
miss_total += stats[i].missing_faults;
wp_total += stats[i].wp_faults;
+   minor_total += stats[i].minor_faults;
}
 
printf("userfaults: %llu missing (", miss_total);
@@ -172,6 +177,9 @@ static void uffd_stats_report(struct uffd_stats *stats, int n_cpus)
printf("\b), %llu wp (", wp_total);
for (i = 0; i < n_cpus; i++)
printf("%lu+", stats[i].wp_faults);
+   printf("\b), %llu minor (", minor_total);
+   for (i = 0; i < n_cpus; i++)
+   printf("%lu+", stats[i].minor_faults);
printf("\b)\n");
 }
 
@@ -328,7 +336,7 @@ static struct uffd_test_ops shmem_uffd_test_ops = {
 };
 
 static struct uffd_test_ops hugetlb_uffd_test_ops = {
-   .expected_ioctls = UFFD_API_RANGE_IOCTLS_BASIC,
+   .expected_ioctls = UFFD_API_RANGE_IOCTLS_BASIC & ~(1 << _UFFDIO_CONTINUE),
.allocate_area  = hugetlb_allocate_area,
.release_pages  = hugetlb_release_pages,
.alias_mapping = hugetlb_alias_mapping,
@@ -362,6 +370,22 @@ static void wp_range(int ufd, __u64 start, __u64 len, bool wp)
}
 }
 
+static void continue_range(int ufd, __u64 start, __u64 len)
+{
+   struct uffdio_continue req;
+
+   req.range.start = start;
+   req.range.len = len;
+   req.mode = 0;
+
+   if (ioctl(ufd, UFFDIO_CONTINUE, &req)) {
+   fprintf(stderr,
+   "UFFDIO_CONTINUE failed for address 0x%" PRIx64 "\n",
+   (uint64_t)start);
+   exit(1);
+   }
+}
+
 static void *locking_thread(void *arg)
 {
unsigned long cpu = (unsigned long) arg;
@@ -569,8 +593,32 @@ static void uffd_handle_page_fault(struct uffd_msg *msg,
}
 
if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) {
+   /* Write protect page faults */
wp_range(uffd, msg->arg.pagefault.address, page_size, false);
stats->wp_faults++;
+   } else if (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_MINOR) {
+   

[PATCH v9 3/6] userfaultfd: hugetlbfs: only compile UFFD helpers if config enabled

2021-03-01 Thread Axel Rasmussen
For background, mm/userfaultfd.c provides a general mcopy_atomic
implementation. But some types of memory (i.e., hugetlb and shmem) need
a slightly different implementation, so they provide their own helpers
for this. In other words, userfaultfd is the only caller of these
functions.

This patch achieves two things:

1. Don't spend time compiling code which will end up never being
referenced anyway (a small build time optimization).

2. In patches later in this series, we extend the signature of these
helpers with UFFD-specific state (a mode enumeration). Once this
happens, we *have to* either not compile the helpers, or unconditionally
define the UFFD-only state (which seems messier to me). This includes
the declarations in the headers, as otherwise they'd yield warnings
about implicitly defining the type of those arguments.

Reviewed-by: Mike Kravetz 
Reviewed-by: Peter Xu 
Signed-off-by: Axel Rasmussen 
---
 include/linux/hugetlb.h | 4 
 mm/hugetlb.c| 2 ++
 2 files changed, 6 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index c0b10f0c7f23..7b86bf809d7a 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -134,11 +134,13 @@ void hugetlb_show_meminfo(void);
 unsigned long hugetlb_total_pages(void);
 vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, unsigned int flags);
+#ifdef CONFIG_USERFAULTFD
 int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte,
struct vm_area_struct *dst_vma,
unsigned long dst_addr,
unsigned long src_addr,
struct page **pagep);
+#endif /* CONFIG_USERFAULTFD */
 bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
struct vm_area_struct *vma,
vm_flags_t vm_flags);
@@ -310,6 +312,7 @@ static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
BUG();
 }
 
+#ifdef CONFIG_USERFAULTFD
 static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
pte_t *dst_pte,
struct vm_area_struct *dst_vma,
@@ -320,6 +323,7 @@ static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
BUG();
return 0;
 }
+#endif /* CONFIG_USERFAULTFD */
 
 static inline pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr,
unsigned long sz)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 61fd15185f0a..4422dab8db9a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4618,6 +4618,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
return ret;
 }
 
+#ifdef CONFIG_USERFAULTFD
 /*
  * Used by userfaultfd UFFDIO_COPY.  Based on mcopy_atomic_pte with
  * modifications for huge pages.
@@ -4748,6 +4749,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
put_page(page);
goto out;
 }
+#endif /* CONFIG_USERFAULTFD */
 
 static void record_subpages_vmas(struct page *page, struct vm_area_struct *vma,
 int refs, struct page **pages,
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v9 2/6] userfaultfd: disable huge PMD sharing for MINOR registered VMAs

2021-03-01 Thread Axel Rasmussen
As the comment says: for the MINOR fault use case, although the page
might be present and populated in the other (non-UFFD-registered) half
of the mapping, it may be out of date, and we explicitly want userspace
to get a minor fault so it can check and potentially update the page's
contents.

Huge PMD sharing would prevent these faults from occurring for
suitably aligned areas, so disable it upon UFFD registration.

Reviewed-by: Peter Xu 
Reviewed-by: Mike Kravetz 
Signed-off-by: Axel Rasmussen 
---
 include/linux/userfaultfd_k.h | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 0390e5ac63b3..e060d5f77cc5 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -56,12 +56,19 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
 }
 
 /*
- * Never enable huge pmd sharing on uffd-wp registered vmas, because uffd-wp
- * protect information is per pgtable entry.
+ * Never enable huge pmd sharing on some uffd registered vmas:
+ *
+ * - VM_UFFD_WP VMAs, because write protect information is per pgtable entry.
+ *
+ * - VM_UFFD_MINOR VMAs, because otherwise we would never get minor faults for
+ *   VMAs which share huge pmds. (If you have two mappings to the same
+ *   underlying pages, and fault in the non-UFFD-registered one with a write,
+ *   with huge pmd sharing this would *also* setup the second UFFD-registered
+ *   mapping, and we'd not get minor faults.)
  */
 static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
 {
-   return vma->vm_flags & VM_UFFD_WP;
+   return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
 }
 
 static inline bool userfaultfd_missing(struct vm_area_struct *vma)
-- 
2.30.1.766.gb4fecdf3b7-goog



[PATCH v9 0/6] userfaultfd: add minor fault handling

2021-03-01 Thread Axel Rasmussen
Base
====

This series is based on v5.12-rc1. Additionally, this series depends on
Peter Xu's series to allow disabling huge pmd sharing.

[1] https://lore.kernel.org/patchwork/cover/1382204/

Changelog
=========

v8->v9:
- Removed an unneeded double !! from a VM_BUG_ON check in handle_userfault.
- Introduced a handle_userfault helper in hugetlb.c, to reduce repetition.
- Rebased to v5.12-rc1, which has Mike's hugetlb changes which originally
  motivated rebasing onto akpm's tree (so, it also applies cleanly to akpm's
  tree).

v7->v8:
- Check CONFIG_HAVE_ARCH_USERFAULTFD_MINOR instead of commenting in
  userfaultfd_register.
- Remove redundant "ret = -EINVAL;" in userfaultfd_register.
- Revert removing trailing \ in include/trace/events/mmflags.h.
- Don't set "*pagep = NULL" in the is_continue case in
  hugetlb_mcopy_atomic_pte.

v6->v7:
- Based upon discussion, switched back to the VM_* flags approach which was used
  in v5, instead of implementing this as an API feature. Switched to using a
  high bit (instead of brokenly conflicting with VM_LOCKED), which implies
  introducing CONFIG_HAVE_ARCH_USERFAULTFD_MINOR and selecting it only on 64-bit
  architectures (x86_64 and arm64 for now).

v5->v6:
- Fixed the condition guarding a second case where we unlock_page() in
  hugetlb_mcopy_atomic_pte().
- Significantly refactored how minor registration works. Because there are no
  VM_* flags available to use, it has to be a userfaultfd API feature, rather
  than a registration mode. This has a few knock on consequences worth calling
  out:
- userfaultfd_minor() can no longer be inline, because we have to inspect
  the userfaultfd_ctx, which is only defined in fs/userfaultfd.c. This means
  slightly more overhead (1 function call) on all hugetlbfs minor faults.
- vma_can_userfault() no longer changes. It seems valid to me to create an
  FD with the minor fault feature enabled, and then register e.g. some
  non-hugetlbfs region in MISSING mode, fully expecting to not get any minor
  faults for it, alongside some other region which you *do* want minor
  faults for. So, at registration time, either should be accepted.
- Since I'm no longer adding a new registration mode, I'm no longer
  introducing __VM_UFFD_FLAGS or UFFD_API_REGISTER_MODES, and all the
  related cleanups have been reverted.

v4->v5:
- Typo fix in the documentation update.
- Removed comment in vma_can_userfault. The same information is better covered
  in the documentation update, so the comment is unnecessary (and slightly
  confusing as written).
- Reworded comment for MCOPY_ATOMIC_CONTINUE mode.
- For non-shared CONTINUE, only make the PTE(s) non-writable, don't change flags
  on the VMA.
- In hugetlb_mcopy_atomic_pte, always unlock the page in MCOPY_ATOMIC_CONTINUE,
  even if we don't have VM_SHARED.
- In hugetlb_mcopy_atomic_pte, introduce "bool is_continue" to make that kind of
  mode check more terse.
- Merged two nested if()s into a single expression in __mcopy_atomic_hugetlb.
- Moved "return -EINVAL if MCOPY_CONTINUE isn't supported for this vma type" up
  one level, into __mcopy_atomic.
- Rebased onto linux-next/akpm, instead of the latest 5.11 RC. Resolved
  conflicts with Mike's recent hugetlb changes.

v3->v4:
- Relaxed restriction for minor registration to allow any hugetlb VMAs, not
  just those with VM_SHARED. Fixed setting VM_WRITE flag in a CONTINUE ioctl
  for non-VM_SHARED VMAs.
- Reordered if() branches in hugetlb_mcopy_atomic_pte, so the conditions are
  simpler and easier to read.
- Reverted most of the mfill_atomic_pte change (the anon / shmem path). Just
  return -EINVAL for CONTINUE, and set zeropage = (mode ==
  MCOPY_ATOMIC_ZEROPAGE), so we can keep the delta small.
- Split out adding #ifdef CONFIG_USERFAULTFD to a separate patch (instead of
  lumping it together with adding UFFDIO_CONTINUE).
- Fixed signature of hugetlb_mcopy_atomic_pte for !CONFIG_HUGETLB_PAGE
  (signature must be the same in either case).
- Rebased onto a newer version of Peter's patches to disable huge PMD sharing.

v2->v3:
- Added #ifdef CONFIG_USERFAULTFD around hugetlb helper functions, to fix build
  errors when building without CONFIG_USERFAULTFD set.

v1->v2:
- Fixed a bug in the hugetlb_mcopy_atomic_pte retry case. We now plumb in the
  enum mcopy_atomic_mode, so we can differentiate between the three cases this
  function needs to handle:
  1) We're doing a COPY op, and need to allocate a page, add to cache, etc.
  2) We're doing a COPY op, but allocation in this function failed previously;
 we're in the retry path. The page was allocated, but not e.g. added to page
 cache, so that still needs to be done.
  3) We're doing a CONTINUE op, we need to look up an existing page instead of
 allocating a new one.
- Rebased onto a newer version of Peter's patches to disable huge PMD sharing,
  which fixes syzbot complaints on some non-x86 architectures.
- Moved __VM_UFFD_FLAGS into 

[PATCH v9 1/6] userfaultfd: add minor fault registration mode

2021-03-01 Thread Axel Rasmussen
This feature allows userspace to intercept "minor" faults. By "minor"
faults, I mean the following situation:

Let there exist two mappings (i.e., VMAs) to the same page(s). One of
the mappings is registered with userfaultfd (in minor mode), and the
other is not. Via the non-UFFD mapping, the underlying pages have
already been allocated & filled with some contents. The UFFD mapping
has not yet been faulted in; when it is touched for the first time,
this results in what I'm calling a "minor" fault. As a concrete
example, when working with hugetlbfs, we have huge_pte_none(), but
find_lock_page() finds an existing page.

This commit adds the new registration mode, and sets the relevant flag
on the VMAs being registered. In the hugetlb fault path, if we find
that we have huge_pte_none(), but find_lock_page() does indeed find an
existing page, then we have a "minor" fault, and if the VMA has the
userfaultfd registration flag, we call into userfaultfd to handle it.
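
To make the decision concrete, here is a simplified model of it in plain C. It
is a sketch under stated assumptions, not the code in mm/hugetlb.c: the struct,
enum and function names are hypothetical, and the boolean fields only stand in
for the results of huge_pte_none(), find_lock_page() and the VMA's
registration flags.

```c
#include <stdbool.h>

/* Hypothetical outcome labels; not kernel definitions. */
enum hugetlb_fault_kind {
	HUGETLB_FAULT_KERNEL,		/* handled by the kernel as usual */
	HUGETLB_FAULT_UFFD_MISSING,	/* delivered to userfaultfd as MISSING */
	HUGETLB_FAULT_UFFD_MINOR,	/* delivered to userfaultfd as MINOR */
};

/* Hypothetical snapshot of the state the fault path inspects. */
struct hugetlb_fault_model {
	bool pte_none;		/* models huge_pte_none() */
	bool page_in_cache;	/* models find_lock_page() finding a page */
	bool vma_missing;	/* VMA registered in MISSING mode */
	bool vma_minor;		/* VMA registered in MINOR mode */
};

static enum hugetlb_fault_kind
classify_hugetlb_fault(const struct hugetlb_fault_model *s)
{
	/* A populated PTE is neither a missing nor a minor fault. */
	if (!s->pte_none)
		return HUGETLB_FAULT_KERNEL;

	/* Empty PTE but the page already exists: this is the minor case. */
	if (s->page_in_cache)
		return s->vma_minor ? HUGETLB_FAULT_UFFD_MINOR
				    : HUGETLB_FAULT_KERNEL;

	/* Empty PTE and no page at all: the pre-existing missing case. */
	return s->vma_missing ? HUGETLB_FAULT_UFFD_MISSING
			      : HUGETLB_FAULT_KERNEL;
}
```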

This is implemented as a new registration mode, instead of an API
feature. This is because the alternative implementation has significant
drawbacks [1].

However, doing it this way requires we allocate a VM_* flag for the new
registration mode. On 32-bit systems, there are no unused bits, so this
feature is only supported on architectures with
CONFIG_ARCH_USES_HIGH_VMA_FLAGS. When attempting to register a VMA in
MINOR mode on 32-bit architectures, we return -EINVAL.

[1] https://lore.kernel.org/patchwork/patch/1380226/

Reviewed-by: Peter Xu 
Signed-off-by: Axel Rasmussen 
---
 arch/arm64/Kconfig   |  1 +
 arch/x86/Kconfig |  1 +
 fs/proc/task_mmu.c   |  3 ++
 fs/userfaultfd.c | 78 ++-
 include/linux/mm.h   |  7 +++
 include/linux/userfaultfd_k.h| 15 +-
 include/trace/events/mmflags.h   |  7 +++
 include/uapi/linux/userfaultfd.h | 15 +-
 init/Kconfig |  5 ++
 mm/hugetlb.c | 79 +---
 10 files changed, 149 insertions(+), 62 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1f212b47a48a..ce6044273ef1 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -208,6 +208,7 @@ config ARM64
select SWIOTLB
select SYSCTL_EXCEPTION_TRACE
select THREAD_INFO_IN_TASK
+   select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD
help
  ARM 64-bit (AArch64) Linux support.
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 2792879d398e..7f71b71ed372 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -164,6 +164,7 @@ config X86
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
select HAVE_ARCH_USERFAULTFD_WP if X86_64 && USERFAULTFD
+   select HAVE_ARCH_USERFAULTFD_MINOR  if X86_64 && USERFAULTFD
select HAVE_ARCH_VMAP_STACK if X86_64
select HAVE_ARCH_WITHIN_STACK_FRAMES
select HAVE_ASM_MODVERSIONS
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3cec6fbef725..e1c9095ebe70 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -661,6 +661,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
[ilog2(VM_PKEY_BIT4)]   = "",
 #endif
 #endif /* CONFIG_ARCH_HAS_PKEYS */
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
+   [ilog2(VM_UFFD_MINOR)]  = "ui",
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
};
size_t i;
 
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index e5ce3b4e6c3d..ba35cafa8b0d 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -197,24 +197,21 @@ static inline struct uffd_msg userfault_msg(unsigned long address,
msg_init();
msg.event = UFFD_EVENT_PAGEFAULT;
msg.arg.pagefault.address = address;
+   /*
+* These flags indicate why the userfault occurred:
+* - UFFD_PAGEFAULT_FLAG_WP indicates a write protect fault.
+* - UFFD_PAGEFAULT_FLAG_MINOR indicates a minor fault.
+* - Neither of these flags being set indicates a MISSING fault.
+*
+* Separately, UFFD_PAGEFAULT_FLAG_WRITE indicates it was a write
+* fault. Otherwise, it was a read fault.
+*/
if (flags & FAULT_FLAG_WRITE)
-   /*
-* If UFFD_FEATURE_PAGEFAULT_FLAG_WP was set in the
-* uffdio_api.features and UFFD_PAGEFAULT_FLAG_WRITE
-* was not set in a UFFD_EVENT_PAGEFAULT, it means it
-* was a read fault, otherwise if set it means it's
-* a write fault.
-*/
msg.arg.pagefault.flags |= UFFD_PAGEFAULT_FLAG_WRITE;
if (reason & VM_UFFD_WP)
-   /*
-* If UFFD_FEATURE_PAGEFAULT_FLAG_WP was set in the
-* uffdio_api.features and UFFD_PAGEFAULT_FLAG_WP was
-* not set in a 

Re: [PATCH 1/2] fs: eventpoll: fix comments & kernel-doc notation

2021-03-01 Thread Jonathan Corbet
Randy Dunlap  writes:

> Use the documented kernel-doc format for function Return: descriptions.
> Begin constant values in kernel-doc comments with '%'.
>
> Remove kernel-doc "/**" from 2 functions that are not documented with
> kernel-doc notation.
>
> Fix typos, punctuation, & grammar.
>
> Also fix a few kernel-doc warnings:
>
> ../fs/eventpoll.c:1883: warning: Function parameter or member 'ep' not 
> described in 'ep_loop_check_proc'
> ../fs/eventpoll.c:1883: warning: Excess function parameter 'priv' description 
> in 'ep_loop_check_proc'
> ../fs/eventpoll.c:1932: warning: Function parameter or member 'ep' not 
> described in 'ep_loop_check'
> ../fs/eventpoll.c:1932: warning: Excess function parameter 'from' description 
> in 'ep_loop_check'
>
> Signed-off-by: Randy Dunlap 
> Cc: Jonathan Corbet 
> Cc: linux-...@vger.kernel.org
> Cc: Andrew Morton 
> Cc: Alexander Viro 
> ---
> Jon: Al says that he is OK with one of you merging this fs/
>  (only comments) patch.
>
>  fs/eventpoll.c |   52 +++
>  1 file changed, 26 insertions(+), 26 deletions(-)

Both patches applied, thanks.

jon


Re: [PATCH 5.4 000/338] 5.4.102-rc2 review

2021-03-01 Thread Florian Fainelli
On 3/1/21 11:47 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.102 release.
> There are 338 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Wed, 03 Mar 2021 19:43:25 +.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.102-rc2.gz
> or in the git tree and branch at:
>   
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git 
> linux-5.4.y
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h

On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:

Tested-by: Florian Fainelli 
-- 
Florian


Re: [PATCH] Documentation: ioctl: add entry for nsfs.h

2021-03-01 Thread Jonathan Corbet
Randy Dunlap  writes:

> All userspace ioctl major/magic numbers should be documented in
> Documentation/userspace-api/ioctl/ioctl-number.rst, so add
> the entry for nsfs.h.
>
> Signed-off-by: Randy Dunlap 
> Cc: Andrey Vagin 
> Cc: Serge Hallyn 
> Cc: Eric W. Biederman 
> Cc: linux-...@vger.kernel.org
> Cc: Jonathan Corbet 
> ---
> Feel free to modify the patch as needed.
>
> Probably don't need to backport:
> # Fixes: 6786741dbf99 ("nsfs: add ioctl to get an owning user namespace for 
> ns file descriptor")
>
>  Documentation/userspace-api/ioctl/ioctl-number.rst |1 +
>  1 file changed, 1 insertion(+)

Applied (rather belatedly, sorry).

Thanks,

jon


Re: [PATCH v1 02/15] powerpc/uaccess: Define ___get_user_instr() for ppc32

2021-03-01 Thread Daniel Axtens
Hi Christophe,

> +#else /* !CONFIG_PPC64 */
> +#define ___get_user_instr(gu_op, dest, ptr)  \
> + gu_op((dest).val, (u32 __user *)(ptr))
> +#endif /* CONFIG_PPC64 */
>  
>  #define get_user_instr(x, ptr) \
>   ___get_user_instr(get_user, x, ptr)
> @@ -91,18 +95,6 @@ static inline bool __access_ok(unsigned long addr, unsigned long size)
>  #define __get_user_instr_inatomic(x, ptr) \
>   ___get_user_instr(__get_user_inatomic, x, ptr)
>  
> -#else /* !CONFIG_PPC64 */
> -#define get_user_instr(x, ptr) \
> - get_user((x).val, (u32 __user *)(ptr))
> -
> -#define __get_user_instr(x, ptr) \
> - __get_user_nocheck((x).val, (u32 __user *)(ptr), sizeof(u32), true)
> -
> -#define __get_user_instr_inatomic(x, ptr) \
> - __get_user_nosleep((x).val, (u32 __user *)(ptr), sizeof(u32))
> -
> -#endif /* CONFIG_PPC64 */

The previous version of __get_user_instr called __get_user_nocheck,
this version calls __get_user. Likewise __get_user_instr_inatomic called
__get_user_nosleep and now it calls __get_user_inatomic. I was confused
by this until I chased the macro definitions and realised that both
names refer to the same thing:

#define __get_user(x, ptr) \
__get_user_nocheck((x), (ptr), sizeof(*(ptr)), true)

#define __get_user_inatomic(x, ptr) \
__get_user_nosleep((x), (ptr), sizeof(*(ptr)))

(I don't think you need to do anything here, I'm just documenting what I
considered while reviewing your patch.)
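
The equivalence can be checked outside the kernel with a minimal user-space
sketch. Everything prefixed demo_ is a hypothetical stand-in, not the powerpc
uaccess code; only the shape of the two macro layers follows the definitions
quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Arguments seen by the most recent low-level call (demo bookkeeping only). */
static size_t demo_last_size;
static bool demo_last_allow;

/* Hypothetical stand-in for the low-level accessor: copies one u32. */
static int demo_get_user_nocheck_fn(u32 *dst, const u32 *src, size_t size,
				    bool allow)
{
	assert(dst && src);
	demo_last_size = size;
	demo_last_allow = allow;
	*dst = *src;
	return 0;
}

#define demo_get_user_nocheck(x, ptr, size, allow) \
	demo_get_user_nocheck_fn(&(x), (ptr), (size), (allow))

/* Same shape as the __get_user() definition quoted above. */
#define demo_get_user(x, ptr) \
	demo_get_user_nocheck((x), (ptr), sizeof(*(ptr)), true)

/* Old ppc32 form: explicit size and flag. */
#define demo_get_user_instr_old(x, ptr) \
	demo_get_user_nocheck((x), (const u32 *)(ptr), sizeof(u32), true)

/* New form: goes through the wrapper; sizeof(*(ptr)) is sizeof(u32) here. */
#define demo_get_user_instr_new(x, ptr) \
	demo_get_user((x), (const u32 *)(ptr))
```

Because the pointer is cast to a u32 pointer before it reaches the wrapper,
sizeof(*(ptr)) evaluates to sizeof(u32) and both forms end up making the same
low-level call.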

As such:
Reviewed-by: Daniel Axtens 

Kind regards,
Daniel


> -
>  extern long __put_user_bad(void);
>  
>  #define __put_user_size(x, ptr, size, retval)\
> -- 
> 2.25.0


Re: [PATCH] docs: networking: bonding.rst Fix a typo in bonding.rst

2021-03-01 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon,  1 Mar 2021 21:28:23 +0900 you wrote:
> This patch fixes a spelling typo in bonding.rst.
> 
> Signed-off-by: Masanari Iida 
> ---
>  Documentation/networking/bonding.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Here is the summary with links:
  - docs: networking: bonding.rst Fix a typo in bonding.rst
https://git.kernel.org/netdev/net/c/2353db75c3db

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [PATCH] Documentation: Replace more lkml.org links with lore

2021-03-01 Thread Jonathan Corbet
Kees Cook  writes:

> As started by commit 05a5f51ca566 ("Documentation: Replace lkml.org
> links with lore"), replace a few more scattered lkml.org links with
> lore to better use a single source that's more likely to stay available
> long-term.
>
> Signed-off-by: Kees Cook 
> ---
>  CREDITS| 2 +-
>  tools/scripts/Makefile.include | 3 ++-
>  2 files changed, 3 insertions(+), 2 deletions(-)

I've (rather belatedly) applied this, thanks.

jon


[tip:timers/urgent] BUILD SUCCESS 05f7fcc675f50001a30b8938c05d11ca9f599f8c

2021-03-01 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 
timers/urgent
branch HEAD: 05f7fcc675f50001a30b8938c05d11ca9f599f8c  hrtimer: Update 
softirq_expires_next correctly after __hrtimer_get_next_event()

elapsed time: 731m

configs tested: 124
configs skipped: 2

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm64     allyesconfig
arm       allyesconfig
arm       allmodconfig
arm       defconfig
arm64     defconfig
arm       moxart_defconfig
m68k      q40_defconfig
powerpc   katmai_defconfig
alpha     defconfig
ia64      alldefconfig
powerpc   makalu_defconfig
sh        se7724_defconfig
mips      xway_defconfig
arm       realview_defconfig
mips      vocore2_defconfig
powerpc   mpc832x_rdb_defconfig
powerpc   walnut_defconfig
m68k      mvme16x_defconfig
arm       vexpress_defconfig
powerpc   chrp32_defconfig
i386      allyesconfig
mips      jmr3927_defconfig
arc       nsim_700_defconfig
arm       nhk8815_defconfig
arm       sama5_defconfig
powerpc   maple_defconfig
sh        alldefconfig
sh        kfr2r09_defconfig
powerpc   mpc834x_itxgp_defconfig
riscv     alldefconfig
arm       spitz_defconfig
powerpc   warp_defconfig
xtensa    common_defconfig
arm       neponset_defconfig
sh        magicpanelr2_defconfig
arm       zeus_defconfig
mips      cu1830-neo_defconfig
sh        rsk7269_defconfig
mips      mpc30x_defconfig
arm       versatile_defconfig
nios2     alldefconfig
powerpc   ebony_defconfig
powerpc   mpc8313_rdb_defconfig
riscv     nommu_virt_defconfig
powerpc   mpc834x_mds_defconfig
sparc     defconfig
sparc64   defconfig
sh        apsh4ad0a_defconfig
powerpc   canyonlands_defconfig
sh        sh7710voipgw_defconfig
mips      decstation_r4k_defconfig
ia64      allmodconfig
ia64      defconfig
ia64      allyesconfig
m68k      allmodconfig
m68k      defconfig
m68k      allyesconfig
nios2     defconfig
arc       allyesconfig
nds32     allnoconfig
c6x       allyesconfig
nds32     defconfig
nios2     allyesconfig
csky      defconfig
alpha     allyesconfig
xtensa    allyesconfig
h8300     allyesconfig
arc       defconfig
sh        allmodconfig
parisc    defconfig
s390      allyesconfig
s390      allmodconfig
parisc    allyesconfig
s390      defconfig
sparc     allyesconfig
i386      tinyconfig
i386      defconfig
mips      allyesconfig
mips      allmodconfig
powerpc   allyesconfig
powerpc   allmodconfig
powerpc   allnoconfig
i386      randconfig-a006-20210228
i386      randconfig-a005-20210228
i386      randconfig-a004-20210228
i386      randconfig-a003-20210228
i386      randconfig-a001-20210228
i386      randconfig-a002-20210228
i386      randconfig-a005-20210301
i386      randconfig-a003-20210301
i386      randconfig-a002-20210301
i386      randconfig-a004-20210301
i386      randconfig-a006-20210301
i386      randconfig-a001-20210301
x86_64    randconfig-a013-20210301
x86_64    randconfig-a016-20210301
x86_64    randconfig-a015-20210301
x86_64    randconfig-a014-20210301
x86_64    randconfig-a012-20210301
x86_64    randconfig-a011

Re: [PATCH v1] microblaze: tag highmem_setup() with __meminit

2021-03-01 Thread Oscar Salvador
On Mon, Mar 01, 2021 at 12:47:49PM +0100, David Hildenbrand wrote:
> With commit a0cd7a7c4bc0 ("mm: simplify free_highmem_page() and
> free_reserved_page()") the kernel test robot complains about a warning:
> 
> WARNING: modpost: vmlinux.o(.text.unlikely+0x23ac): Section mismatch in
>   reference from the function highmem_setup() to the function
>   .meminit.text:memblock_is_reserved()
> 
> This has been broken ever since microblaze added highmem support,
> because memblock_is_reserved() was already tagged with "__init" back then -
> most probably the function always got inlined, so we never stumbled over
> it.

It might be good to point out that we need __meminit instead of __init
because the microblaze platform does not define CONFIG_ARCH_KEEP_MEMBLOCK,
so __init_memblock falls back to __meminit.

(I had to go and look as I was puzzled :-) )

Reviewed-by: Oscar Salvador 

> 
> Reported-by: kernel test robot 
> Fixes: 2f2f371f8907 ("microblaze: Highmem support")
> Cc: Andrew Morton 
> Cc: Michal Simek 
> Cc: Mike Rapoport 
> Cc: Andrew Morton 
> Cc: Thomas Gleixner 
> Cc: Arvind Sankar 
> Cc: Ira Weiny 
> Cc: Randy Dunlap 
> Cc: Oscar Salvador 
> Cc: Anshuman Khandual 
> Signed-off-by: David Hildenbrand 
> ---
>  arch/microblaze/mm/init.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/microblaze/mm/init.c b/arch/microblaze/mm/init.c
> index 181e48782e6c..05cf1fb3f5ff 100644
> --- a/arch/microblaze/mm/init.c
> +++ b/arch/microblaze/mm/init.c
> @@ -52,7 +52,7 @@ static void __init highmem_init(void)
>   pkmap_page_table = virt_to_kpte(PKMAP_BASE);
>  }
>  
> -static void highmem_setup(void)
> +static void __meminit highmem_setup(void)
>  {
>   unsigned long pfn;
>  
> -- 
> 2.29.2
> 
> 

-- 
Oscar Salvador
SUSE L3


Re: [PATCH] docs: filesystem: Update smaps vm flag list to latest

2021-03-01 Thread Jonathan Corbet
Peter Xu  writes:

> We've missed a few documentation updates when adding new VM_* flags.  Add the missing
> pieces so they'll be in sync now.
>
> Signed-off-by: Peter Xu 
> ---
>  Documentation/filesystems/proc.rst | 5 +
>  1 file changed, 5 insertions(+)

So this patch doesn't apply; what version of the kernel did you generate
it against?  Could you redo against current kernels, please?

Thanks,

jon


Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

2021-03-01 Thread Sami Tolvanen
On Sun, Feb 28, 2021 at 11:25 PM kernel test robot
 wrote:
>
>
> Greeting,
>
> FYI, we noticed the following commit (built with clang-13):
>
> commit: 6dafca97803309c3cb5148d449bfa711e41ddef2 ("x86, build: use objtool 
> mcount")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

Thanks for the report, I'm able to reproduce the warning.

> [4.764496] [ ftrace bug ]
> [4.764847] ftrace failed to modify
> [4.764852] do_sys_open (kbuild/src/consumer/fs/open.c:1186)
> [4.765483]  actual:   0f:1f:44:00:00
> [4.765784] Setting ftrace call site to call ftrace function
> [4.766193] ftrace record flags: 5001
> [4.766490]  (1) R
> [4.766490]  expected tramp: 81037af0
> [4.766959] [ cut here ]

Basically, the problem is that ftrace_replace_code() expects to find
ideal_nops[NOP_ATOMIC5] here, which in this case is 66:66:66:66:90,
while objtool has replaced the __fentry__ call with 0f:1f:44:00:00.

As ideal_nops changes depending on kernel config and hardware, when
CC_USING_NOP_MCOUNT is defined we could either change
ftrace_nop_replace() to always use P6_NOP5, or skip
ftrace_verify_code() in ftrace_replace_code() for
FTRACE_UPDATE_MAKE_CALL.
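
To make the two options concrete, here is a minimal user-space sketch. The
byte sequences are the ones from the report above; the demo_ helpers are
hypothetical stand-ins and not the actual ftrace functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define DEMO_NOP5_SIZE 5

/* The NOP ftrace expected on this machine, per the report. */
static const unsigned char demo_expected_nop[DEMO_NOP5_SIZE] = {
	0x66, 0x66, 0x66, 0x66, 0x90
};

/* The bytes objtool actually left at the __fentry__ call site. */
static const unsigned char demo_objtool_nop[DEMO_NOP5_SIZE] = {
	0x0f, 0x1f, 0x44, 0x00, 0x00
};

/* Stand-in for the verification step: do the call-site bytes match? */
static bool demo_verify_code(const unsigned char *site,
			     const unsigned char *expected)
{
	assert(site && expected);
	return memcmp(site, expected, DEMO_NOP5_SIZE) == 0;
}

/*
 * Option 1: pass the fixed objtool sequence as 'expected'.
 * Option 2: set 'skip_verify' when turning the NOP into a call.
 */
static bool demo_can_patch(const unsigned char *site,
			   const unsigned char *expected, bool skip_verify)
{
	return skip_verify || demo_verify_code(site, expected);
}
```

With the hardware-dependent expectation the check fails, which is the warning
seen here; either option lets the call site be patched.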

Steven, Peter, any thoughts?

Sami


Re: [PATCH v3 03/25] mm/vmstat: Add folio stat wrappers

2021-03-01 Thread Matthew Wilcox
On Mon, Mar 01, 2021 at 04:17:39PM -0500, Zi Yan wrote:
> On 28 Jan 2021, at 2:03, Matthew Wilcox (Oracle) wrote:
> > Allow page counters to be more readily modified by callers which have
> > a folio.  Name these wrappers with 'stat' instead of 'state' as requested
>
> Shouldn’t we change the stats with folio_nr_pages(folio) here? And all
> changes below. Otherwise one folio is always counted as a single page.

That's a good point.  Looking through the changes in my current folio
tree (which doesn't get as far as the thp tree did; ie doesn't yet allocate
multi-page folios, so hasn't been tested with anything larger than a
single page), the callers are ...

@@ -2698,3 +2698,3 @@ int clear_page_dirty_for_io(struct page *page)
-   if (TestClearPageDirty(page)) {
-   dec_lruvec_page_state(page, NR_FILE_DIRTY);
-   dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
+   if (TestClearFolioDirty(folio)) {
+   dec_lruvec_folio_stat(folio, NR_FILE_DIRTY);
+   dec_zone_folio_stat(folio, NR_ZONE_WRITE_PENDING);
@@ -2432,3 +2433,3 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
-   __inc_lruvec_page_state(page, NR_FILE_DIRTY);
-   __inc_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-   __inc_node_page_state(page, NR_DIRTIED);
+   __inc_lruvec_folio_stat(folio, NR_FILE_DIRTY);
+   __inc_zone_folio_stat(folio, NR_ZONE_WRITE_PENDING);
+   __inc_node_folio_stat(folio, NR_DIRTIED);
@@ -891 +890 @@ noinline int __add_to_page_cache_locked(struct page *page,
-   __inc_lruvec_page_state(page, NR_FILE_PAGES);
+   __inc_lruvec_folio_stat(folio, NR_FILE_PAGES);
@@ -2759,2 +2759,2 @@ int test_clear_page_writeback(struct page *page)
-   dec_zone_page_state(page, NR_ZONE_WRITE_PENDING);
-   inc_node_page_state(page, NR_WRITTEN);
+   dec_zone_folio_stat(folio, NR_ZONE_WRITE_PENDING);
+   inc_node_folio_stat(folio, NR_WRITTEN);

I think it's clear from this that I haven't found all the places
that I need to change yet ;-)

Looking at the places I did change in the thp tree, there are changes
like this:

@@ -860,27 +864,30 @@ noinline int __add_to_page_cache_locked(struct page *page,
-   if (!huge)
-   __inc_lruvec_page_state(page, NR_FILE_PAGES);
+   if (!huge) {
+   __mod_lruvec_page_state(page, NR_FILE_PAGES, nr);
+   if (nr > 1)
+   __mod_node_page_state(page_pgdat(page),
+   NR_FILE_THPS, nr);
+   }

... but I never did do some of the changes which the above changes imply
are needed.  So the thp tree probably had all kinds of bad statistics
that I never noticed.

So ... at least some of the users are definitely going to want to
cache the 'nr_pages' and use it multiple times, including calling
__mod_node_folio_state(), but others should do what you suggested.
Thanks!  I'll make that change.
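
As a rough illustration of the difference, here is a user-space sketch with
simplified, hypothetical types (demo_folio and the demo_ counters are
stand-ins, not the kernel's struct folio or vmstat interfaces):

```c
#include <assert.h>

/* Hypothetical, simplified folio: spans 2^order pages. */
struct demo_folio {
	unsigned int order;
};

static long demo_nr_file_pages;
static long demo_nr_file_thps;

static long demo_folio_nr_pages(const struct demo_folio *folio)
{
	return 1L << folio->order;
}

/* "inc" style wrapper: always adds 1, so a multi-page folio is undercounted. */
static void demo_inc_file_pages(const struct demo_folio *folio)
{
	(void)folio;
	demo_nr_file_pages += 1;
}

/* "mod" style wrappers: the caller passes the page count explicitly. */
static void demo_mod_file_pages(long nr)
{
	assert(nr != 0);
	demo_nr_file_pages += nr;
}

static void demo_mod_file_thps(long nr)
{
	assert(nr != 0);
	demo_nr_file_thps += nr;
}

/* Caller that caches nr_pages once and reuses it for several counters. */
static void demo_account_added_folio(const struct demo_folio *folio)
{
	long nr = demo_folio_nr_pages(folio);

	demo_mod_file_pages(nr);
	if (nr > 1)
		demo_mod_file_thps(nr);
}
```

For an order-2 folio the "inc" wrapper accounts one page while the cached-nr
caller accounts four, which is the undercount Zi Yan pointed out.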


AW: [PATCH 0/8] USB Audio Gadget part 2: Feedback endpoint, Volume/Mute support

2021-03-01 Thread Johannes Freyberger
Hi Ruslan,

thanks a lot for your quick answer.

> -----Original Message-----
> From: Ruslan Bilovol
> Sent: Monday, 1 March 2021 22:34
> To: Johannes Freyberger 
> Cc: Felipe Balbi ; Jonathan Corbet ;
> Greg Kroah-Hartman ; Glenn Schmottlach
> ; linux-...@vger.kernel.org; linux-
> ker...@vger.kernel.org; Linux USB 
> Subject: Re: [PATCH 0/8] USB Audio Gadget part 2: Feedback endpoint,
> Volume/Mute support
> 
> Hi Johannes,
> 
> On Mon, Mar 1, 2021 at 6:49 PM Johannes Freyberger
>  wrote:
> >
> > Hi Ruslan,
> >
> > thanks for all your efforts to make the USB Audio Gadget work in Win10
> > using UAC2. Meanwhile I managed to apply and compile your previous
> > modifications and now my Raspberry PI shows up in the Windows Device
> > Manager as a valid
> > UAC2 audio device. Unfortunately it still doesn't work to transfer any
> > audio as it seems the audio endpoints or the topology is not working.
> 
> Are you testing my previous version of the patches on some older kernel?
> 
> Just for records - these two patch sets (part 1 and part 2) are based on 
> Greg's
> usb-next branch (commit b5a12546e779d4f5586f58e60e0ef5070a833a64
> which is based on v5.11-rc5 tag). I retested them today with a BBB board and
> it works fine under Win 10. Also I rebased these two patchsets today against
> latest Greg's usb-next branch which is now Linus's v5.12-rc1 tag and again it
> works fine under Win10 - both Volume/Mute controls and audio streaming.
> 
> These patches have been tested previously on Raspberry PI 4 running v5.9
> and v5.10 stable kernels. The only issues I've seen were because of
> Raspberry's DWC2 DMA issue in the driver that I described in this cover
> letter.
> However if you disable volume/mute controls, it won't affect you.
> 
> > I checked it
> > with some tools and found one providing some information on the USB
> > part (it's called UVCview.exe and is part of the Windows Driver Kit).
> > Here's the output which I hope can give some hints on the problems
> > still existing in this driver:
> 
> From the output below I see UAC2 descriptors are completely screwed up
> (or UVCview.exe doesn't show them correctly). Windows is very strict to the
> descriptors and doesn't allow devices to start in case of any issues.
> So if it appears as a valid UAC2 device in Device Manager, most likely
> UVCview.exe doesn't decode UAC2 descriptors well.
> 

You are right, they really look screwed up. Meanwhile I found another similar 
tool which also knows Audio 2.0 and here everything looks fine ( 
https://www.uwe-sieber.de/usbtreeview.html#download )

> Could you please also apply these patches to the latest kernel (v5.12-rc1) and
> test?

Yes, I'd like to do this and I want to apologize for my newbie questions in 
advance. But I have to admit I'm rather new to Linux, Kernel compiling etc. and 
I followed the description on 
https://www.raspberrypi.org/documentation/linux/kernel/building.md and then 
applied your patches - partially I had to do some modifications by hand as the 
sources had changed. The version I downloaded via "git clone --depth=1 
https://github.com/raspberrypi/linux" seems to be Linux 5.10.17-v7l. And I 
cannot see the version you mention at 
https://github.com/raspberrypi/linux/branches . Where can I get the version 
v5.12-rc1 for these tests?

> 
> Thanks,
> Ruslan
> 

Thanks to you for helping beginners like me,
best regards,
Johannes

> >
> >   ---===>Device Information<===--- English product name:
> > "Linux USB Audio Gadget"
> >
> > ConnectionStatus:
> > Current Config Value:  0x01  -> Device Bus Speed: High
> > Device Address:0x0F
> > Open Pipes:   0
> > *!*ERROR:  No open pipes!
> >
> >   ===>Device Descriptor<===
> > bLength:   0x12
> > bDescriptorType:   0x01
> > bcdUSB:  0x0200
> > bDeviceClass:  0xEF  -> This is a Multi-interface
> > Function Code Device
> > bDeviceSubClass:   0x02  -> This is the Common Class Sub
> > Class
> > bDeviceProtocol:   0x01  -> This is the Interface
> > Association Descriptor protocol
> > bMaxPacketSize0:   0x40 = (64) Bytes
> > idVendor:0x1D6B = The Linux Foundation
> > idProduct:   0x0101
> > bcdDevice:   0x0510
> > iManufacturer: 0x01
> >  English (United States)  "Linux 5.10.17-v7l-R3LAY_TEST+ with
> > fe98.usb"
> > iProduct:  0x02
> >  English (United States)  "Linux USB Audio Gadget"
> > iSerialNumber: 0x00
> > bNumConfigurations:0x01
> >
> >   ===>Configuration Descriptor<===
> > bLength:   0x09
> > bDescriptorType:   0x02
> > wTotalLength:0x00E2  -> Validated
> > bNumInterfaces:0x03
> > 

Re: [PATCH] Documentation/submitting-patches: Extend commit message layout description

2021-03-01 Thread Jonathan Corbet
Borislav Petkov  writes:

> From: Borislav Petkov 
> Subject: [PATCH] Documentation/submitting-patches: Extend commit message 
> layout description
>
> Add more blurb about the level of detail that should be contained in a
> patch's commit message. Extend and make more explicit what text should
> be added under the --- line. Extend examples and split into more easily
> palatable paragraphs.
>
> This has been partially carved out from a tip subsystem handbook
> patchset by Thomas Gleixner:
>
>   https://lkml.kernel.org/r/20181107171010.421878...@linutronix.de
>
> and incorporates follow-on comments.
>
> Signed-off-by: Borislav Petkov 
> ---
>
> /me sends the next generic topic blurb.
>
>  Documentation/process/submitting-patches.rst | 89 
>  1 file changed, 56 insertions(+), 33 deletions(-)

Applied, with one tweak:

> +If there are four patches in a patch series the individual patches may
> +be numbered like this: 1/4, 2/4, 3/4, 4/4. This assures that developers
> +understand the order in which the patches should be applied and that
> +they have reviewed or applied all of the patches in the patch series.
>  
>  A couple of example Subjects::
>  
>  Subject: [PATCH 2/5] ext2: improve scalability of bitmap searching
>  Subject: [PATCH v2 01/27] x86: fix eflags tracking
> +Subject: [PATCH v2] sub/sys: Condensed patch summary
> +Subject: [PATCH v2 M/N] sub/sys: Condensed patch summary

It's no longer "a couple" so I made this "Here are some good example
Subjects".

Thanks,

jon


Re: [PATCH v3 0/3] docs: arm: Improvements to Marvell SoC documentation

2021-03-01 Thread Jonathan Corbet
Lubomir Rintel  writes:

> Hi,
>
> please consider applying the patches chained to this message.
>
> The objective is to deal with a large number of dead links to
> material that often comes in handy in marvel.rst, and to improve some details
> along the way.
>
> Compared to v2, the patches "[PATCH v2 2/5] docs: arm: marvell: fix 38x
> functional spec link" and "[PATCH v2 5/5] docs: arm: marvell: rename
> marvel.rst to marvell.rst" have been removed, because analogous patches
> have already been applied. Also, more dead links have been removed,
> reducing the count of links removed in [PATCH v3 1/3] to one.
> Detailed changelogs in individual patches.

I've applied parts 1 and 3; since there is evidently an archive link for
the one killed in part 2, I left that out.

Thanks,

jon

