Re: [PATCH v2 03/24] virtio: allow __virtioXX, __leXX in config space

2020-08-05 Thread Michael S. Tsirkin
On Thu, Aug 06, 2020 at 11:37:38AM +0800, Jason Wang wrote:
> 
> On 2020/8/5 下午7:45, Michael S. Tsirkin wrote:
> > > > #define virtio_cread(vdev, structname, member, ptr)		\
> > > > 	do {							\
> > > > 		might_sleep();					\
> > > > 		/* Must match the member's type, and be integer */	\
> > > > -		if (!typecheck(typeof((((structname*)0)->member)), *(ptr))) \
> > > > +		if (!__virtio_typecheck(structname, member, *(ptr)))	\
> > > > 			(*ptr) = 1;				\
> > > A silly question,  compare to using set()/get() directly, what's the value
> > > of the accessors macro here?
> > > 
> > > Thanks
> > get/set don't convert to the native endian, I guess that's why
> > drivers use cread/cwrite. It is also nice that there's type
> > safety, checking the correct integer width is used.
> 
> 
> Yes, but this is simply because a macro is used here, how about just doing
> things similar like virtio_cread_bytes():
> 
> static inline void virtio_cread(struct virtio_device *vdev,
>                   unsigned int offset,
>                   void *buf, size_t len)
> 
> 
> And do the endian conversion inside?
> 
> Thanks
> 

Then you lose type safety. It's very easy to have an le32 field
and try to read it into a u16 by mistake.

These macros are all about preventing bugs: and the whole patchset
is about several bugs sparse found - that is what prompted me to make
type checks more strict.


> > 



[PATCH 1/2] exfat: add NameLength check when extracting name

2020-08-05 Thread Tetsuhiro Kohada
The current implementation doesn't check NameLength when extracting
the name from Name dir-entries, so the extracted name may be incorrect
(no null-termination, insufficient Name dir-entries, etc.).
Add a NameLength check when extracting the name from Name dir-entries
so that the correct name is extracted.
And, change to get the information of file/stream-ext dir-entries
via the member variables of exfat_entry_set_cache.

** This patch depends on:
  '[PATCH v3] exfat: integrates dir-entry getting and validation'.

Signed-off-by: Tetsuhiro Kohada 
---
 fs/exfat/dir.c | 81 --
 1 file changed, 39 insertions(+), 42 deletions(-)

diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
index 91cdbede0fd1..545bb73b95e9 100644
--- a/fs/exfat/dir.c
+++ b/fs/exfat/dir.c
@@ -28,16 +28,15 @@ static int exfat_extract_uni_name(struct exfat_dentry *ep,
 
 }
 
-static void exfat_get_uniname_from_ext_entry(struct super_block *sb,
-   struct exfat_chain *p_dir, int entry, unsigned short *uniname)
+static int exfat_get_uniname_from_name_entries(struct exfat_entry_set_cache *es,
+   struct exfat_uni_name *uniname)
 {
-   int i;
-   struct exfat_entry_set_cache *es;
+   int n, l, i;
struct exfat_dentry *ep;
 
-   es = exfat_get_dentry_set(sb, p_dir, entry, ES_ALL_ENTRIES);
-   if (!es)
-   return;
+   uniname->name_len = es->de_stream->name_len;
+   if (uniname->name_len == 0)
+   return -EIO;
 
/*
 * First entry  : file entry
@@ -45,14 +44,15 @@ static void exfat_get_uniname_from_ext_entry(struct super_block *sb,
 * Third entry  : first file-name entry
 * So, the index of first file-name dentry should start from 2.
 */
-
-   i = 2;
-   while ((ep = exfat_get_validated_dentry(es, i++, TYPE_NAME))) {
-   exfat_extract_uni_name(ep, uniname);
-   uniname += EXFAT_FILE_NAME_LEN;
+   for (l = 0, n = 2; l < uniname->name_len; n++) {
+   ep = exfat_get_validated_dentry(es, n, TYPE_NAME);
+   if (!ep)
+   return -EIO;
+   for (i = 0; l < uniname->name_len && i < EXFAT_FILE_NAME_LEN; i++, l++)
+           uniname->name[l] = le16_to_cpu(ep->dentry.name.unicode_0_14[i]);
}
-
-   exfat_free_dentry_set(es, false);
+   uniname->name[l] = 0;
+   return 0;
 }
 
 /* read a directory entry from the opened directory */
@@ -63,6 +63,7 @@ static int exfat_readdir(struct inode *inode, struct exfat_dir_entry *dir_entry)
sector_t sector;
struct exfat_chain dir, clu;
struct exfat_uni_name uni_name;
+   struct exfat_entry_set_cache *es;
struct exfat_dentry *ep;
struct super_block *sb = inode->i_sb;
struct exfat_sb_info *sbi = EXFAT_SB(sb);
@@ -114,47 +115,43 @@ static int exfat_readdir(struct inode *inode, struct exfat_dir_entry *dir_entry)
return -EIO;
 
type = exfat_get_entry_type(ep);
-   if (type == TYPE_UNUSED) {
-   brelse(bh);
+   brelse(bh);
+
+   if (type == TYPE_UNUSED)
break;
-   }
 
-   if (type != TYPE_FILE && type != TYPE_DIR) {
-   brelse(bh);
+   if (type != TYPE_FILE && type != TYPE_DIR)
continue;
-   }
 
-   dir_entry->attr = le16_to_cpu(ep->dentry.file.attr);
+   es = exfat_get_dentry_set(sb, &dir, dentry, ES_ALL_ENTRIES);
+   if (!es)
+   return -EIO;
+
+   dir_entry->attr = le16_to_cpu(es->de_file->attr);
exfat_get_entry_time(sbi, &dir_entry->crtime,
-   ep->dentry.file.create_tz,
-   ep->dentry.file.create_time,
-   ep->dentry.file.create_date,
-   ep->dentry.file.create_time_cs);
+   es->de_file->create_tz,
+   es->de_file->create_time,
+   es->de_file->create_date,
+   es->de_file->create_time_cs);
exfat_get_entry_time(sbi, &dir_entry->mtime,
-   ep->dentry.file.modify_tz,
-   ep->dentry.file.modify_time,
-   ep->dentry.file.modify_date,
-   ep->dentry.file.modify_time_cs);
+   es->de_file->modify_tz,
+   es->de_file->modify_time,
+  

[PATCH 2/2] exfat: unify name extraction

2020-08-05 Thread Tetsuhiro Kohada
Name extraction in exfat_find_dir_entry() also doesn't check NameLength,
so the extracted name may be incorrect.
Replace it with name extraction via exfat_entry_set_cache and
exfat_get_uniname_from_name_entries(), like exfat_readdir().
And, remove the now-unused functions/parameters.

** This patch depends on:
  '[PATCH v3] exfat: integrates dir-entry getting and validation'.

Signed-off-by: Tetsuhiro Kohada 
---
 fs/exfat/dir.c  | 161 ++--
 fs/exfat/exfat_fs.h |   2 +-
 fs/exfat/namei.c|   4 +-
 3 files changed, 38 insertions(+), 129 deletions(-)

diff --git a/fs/exfat/dir.c b/fs/exfat/dir.c
index 545bb73b95e9..c9715c7a55a1 100644
--- a/fs/exfat/dir.c
+++ b/fs/exfat/dir.c
@@ -10,24 +10,6 @@
 #include "exfat_raw.h"
 #include "exfat_fs.h"
 
-static int exfat_extract_uni_name(struct exfat_dentry *ep,
-   unsigned short *uniname)
-{
-   int i, len = 0;
-
-   for (i = 0; i < EXFAT_FILE_NAME_LEN; i++) {
-   *uniname = le16_to_cpu(ep->dentry.name.unicode_0_14[i]);
-   if (*uniname == 0x0)
-   return len;
-   uniname++;
-   len++;
-   }
-
-   *uniname = 0x0;
-   return len;
-
-}
-
 static int exfat_get_uniname_from_name_entries(struct exfat_entry_set_cache *es,
struct exfat_uni_name *uniname)
 {
@@ -869,13 +851,6 @@ struct exfat_entry_set_cache *exfat_get_dentry_set(struct super_block *sb,
return NULL;
 }
 
-enum {
-   DIRENT_STEP_FILE,
-   DIRENT_STEP_STRM,
-   DIRENT_STEP_NAME,
-   DIRENT_STEP_SECD,
-};
-
 /*
  * return values:
  *   >= 0  : return dir entiry position with the name in dir
@@ -885,13 +860,12 @@ enum {
  */
 int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei,
struct exfat_chain *p_dir, struct exfat_uni_name *p_uniname,
-   int num_entries, unsigned int type)
+   int num_entries)
 {
-   int i, rewind = 0, dentry = 0, end_eidx = 0, num_ext = 0, len;
-   int order, step, name_len = 0;
+   int i, rewind = 0, dentry = 0, end_eidx = 0, num_ext = 0;
+   int name_len = 0;
int dentries_per_clu, num_empty = 0;
unsigned int entry_type;
-   unsigned short *uniname = NULL;
struct exfat_chain clu;
struct exfat_hint *hint_stat = &ei->hint_stat;
struct exfat_hint_femp candi_empty;
@@ -909,27 +883,33 @@ int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei,
 
candi_empty.eidx = EXFAT_HINT_NONE;
 rewind:
-   order = 0;
-   step = DIRENT_STEP_FILE;
while (clu.dir != EXFAT_EOF_CLUSTER) {
i = dentry & (dentries_per_clu - 1);
for (; i < dentries_per_clu; i++, dentry++) {
struct exfat_dentry *ep;
struct buffer_head *bh;
+   struct exfat_entry_set_cache *es;
+   struct exfat_uni_name uni_name;
+   u16 name_hash;
 
if (rewind && dentry == end_eidx)
goto not_found;
 
+   /* skip secondary dir-entries in previous dir-entry set */
+   if (num_ext) {
+   num_ext--;
+   continue;
+   }
+
ep = exfat_get_dentry(sb, &clu, i, &bh, NULL);
if (!ep)
return -EIO;
 
entry_type = exfat_get_entry_type(ep);
+   brelse(bh);
 
if (entry_type == TYPE_UNUSED ||
entry_type == TYPE_DELETED) {
-   step = DIRENT_STEP_FILE;
-
num_empty++;
if (candi_empty.eidx == EXFAT_HINT_NONE &&
num_empty == 1) {
@@ -954,7 +934,6 @@ int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei,
}
}
 
-   brelse(bh);
if (entry_type == TYPE_UNUSED)
goto not_found;
continue;
@@ -963,80 +942,38 @@ int exfat_find_dir_entry(struct super_block *sb, struct exfat_inode_info *ei,
num_empty = 0;
candi_empty.eidx = EXFAT_HINT_NONE;
 
-   if (entry_type == TYPE_FILE || entry_type == TYPE_DIR) {
-   step = DIRENT_STEP_FILE;
-   if (type == TYPE_ALL || type == entry_type) {
-   num_ext = 

RE: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-05 Thread Doug Smythies
On 2020.08.05 09:56 Rafael J. Wysocki wrote:

> v6 -> v7:
>* Cosmetic changes in store_energy_performance_prefernce() to reduce the
>  LoC number and make it a bit easier to read.  No intentional functional
>  impact.

??
V7 is identical to V6.

Diff:

$ diff hwppassive-v6-2-2.patch hwppassive-v7-2-2.patch
2c2
< Sent: August 4, 2020 8:11 AM
---
> Sent: August 5, 2020 9:56 AM
5c5
< Subject: [PATCH v6] cpufreq: intel_pstate: Implement passive mode with HWP enabled
---
> Subject: [PATCH v7] cpufreq: intel_pstate: Implement passive mode with HWP enabled
76a77,81
>
> v6 -> v7:
>* Cosmetic changes in store_energy_performance_prefernce() to reduce the
>  LoC number and make it a bit easier to read.  No intentional functional
>  impact.

... Doug




Re: [PATCH 4/4] vhost: vdpa: report iova range

2020-08-05 Thread Michael S. Tsirkin
On Thu, Aug 06, 2020 at 11:29:16AM +0800, Jason Wang wrote:
> 
> On 2020/8/5 下午8:58, Michael S. Tsirkin wrote:
> > On Wed, Jun 17, 2020 at 11:29:47AM +0800, Jason Wang wrote:
> > > This patch introduces a new ioctl for vhost-vdpa device that can
> > > report the iova range by the device. For device that depends on
> > > platform IOMMU, we fetch the iova range via DOMAIN_ATTR_GEOMETRY. For
> > > devices that has its own DMA translation unit, we fetch it directly
> > > from vDPA bus operation.
> > > 
> > > Signed-off-by: Jason Wang 
> > > ---
> > >   drivers/vhost/vdpa.c | 27 +++
> > >   include/uapi/linux/vhost.h   |  4 
> > >   include/uapi/linux/vhost_types.h |  5 +
> > >   3 files changed, 36 insertions(+)
> > > 
> > > diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> > > index 77a0c9fb6cc3..ad23e66cbf57 100644
> > > --- a/drivers/vhost/vdpa.c
> > > +++ b/drivers/vhost/vdpa.c
> > > > @@ -332,6 +332,30 @@ static long vhost_vdpa_set_config_call(struct vhost_vdpa *v, u32 __user *argp)
> > >   return 0;
> > >   }
> > > +
> > > > +static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
> > > +{
> > > + struct iommu_domain_geometry geo;
> > > + struct vdpa_device *vdpa = v->vdpa;
> > > + const struct vdpa_config_ops *ops = vdpa->config;
> > > + struct vhost_vdpa_iova_range range;
> > > + struct vdpa_iova_range vdpa_range;
> > > +
> > > + if (!ops->set_map && !ops->dma_map) {
> > Why not just check if (ops->get_iova_range) directly?
> 
> 
> Because set_map || dma_ops is a hint that the device has its own DMA
> translation logic.
> 
> Device without get_iova_range does not necessarily meant it use IOMMU
> driver.
> 
> Thanks

OK let's add some code comments please, and check get_iova_range
is actually there before calling.

> 
> > 
> > 
> > 
> > 
> > > > + iommu_domain_get_attr(v->domain, DOMAIN_ATTR_GEOMETRY, &geo);
> > > + range.start = geo.aperture_start;
> > > + range.end = geo.aperture_end;
> > > + } else {
> > > + vdpa_range = ops->get_iova_range(vdpa);
> > > + range.start = vdpa_range.start;
> > > + range.end = vdpa_range.end;
> > > + }
> > > +
> > > > + return copy_to_user(argp, &range, sizeof(range));
> > > +
> > > +}
> > > +
> > > >   static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
> > >  void __user *argp)
> > >   {
> > > > @@ -442,6 +466,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
> > >   case VHOST_VDPA_SET_CONFIG_CALL:
> > >   r = vhost_vdpa_set_config_call(v, argp);
> > >   break;
> > > + case VHOST_VDPA_GET_IOVA_RANGE:
> > > + r = vhost_vdpa_get_iova_range(v, argp);
> > > + break;
> > >   default:
> > > >   r = vhost_dev_ioctl(&v->vdev, cmd, argp);
> > >   if (r == -ENOIOCTLCMD)
> > > diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
> > > index 0c2349612e77..850956980e27 100644
> > > --- a/include/uapi/linux/vhost.h
> > > +++ b/include/uapi/linux/vhost.h
> > > @@ -144,4 +144,8 @@
> > >   /* Set event fd for config interrupt*/
> > >   #define VHOST_VDPA_SET_CONFIG_CALL  _IOW(VHOST_VIRTIO, 0x77, int)
> > > +
> > > +/* Get the valid iova range */
> > > +#define VHOST_VDPA_GET_IOVA_RANGE	_IOW(VHOST_VIRTIO, 0x78, \
> > > +  struct vhost_vdpa_iova_range)
> > >   #endif
> > > diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
> > > index 669457ce5c48..4025b5a36177 100644
> > > --- a/include/uapi/linux/vhost_types.h
> > > +++ b/include/uapi/linux/vhost_types.h
> > > @@ -127,6 +127,11 @@ struct vhost_vdpa_config {
> > >   __u8 buf[0];
> > >   };
> > > +struct vhost_vdpa_iova_range {
> > > + __u64 start;
> > > + __u64 end;
> > > +};
> > > +
> > 
> > Pls document fields. And I think first/last is a better API ...
> > 
> > >   /* Feature bits */
> > >   /* Log all write descriptors. Can be changed while device is active. */
> > >   #define VHOST_F_LOG_ALL 26
> > > -- 
> > > 2.20.1



RE: [PATCH] cpufreq: intel_pstate: Implement passive mode with HWP enabled

2020-08-05 Thread Doug Smythies
On 2020.08.03 10:09 Rafael J. Wysocki wrote:
> On Sunday, August 2, 2020 5:17:39 PM CEST Doug Smythies wrote:
> > On 2020.07.19 04:43 Rafael J. Wysocki wrote:
> > > On Fri, Jul 17, 2020 at 3:37 PM Doug Smythies  wrote:
> > > > On 2020.07.16 05:08 Rafael J. Wysocki wrote:
> > > > > On Wed, Jul 15, 2020 at 10:39 PM Doug Smythies  
> > > > > wrote:
> > > > >> On 2020.07.14 11:16 Rafael J. Wysocki wrote:
> > > > >> >
> > > > >> > From: Rafael J. Wysocki 
> > > > >> ...
> > > > >> > Since the passive mode hasn't worked with HWP at all, and it is 
> > > > >> > not going to
> > > > >> > be the default for HWP systems anyway, I don't see any drawbacks 
> > > > >> > related to making
> > > > >> > this change, so I would consider this as 5.9 material unless there 
> > > > >> > are any
> > > > >> > serious objections.
> > > > >>
> > > > >> Good point.
> > > >
> > > > Actually, for those users that default to passive mode upon boot,
> > > > this would mean they would find themselves using this.
> > > > Also, it isn't obvious, from the typical "what driver and what governor"
> > > > inquiry.
> > >
> > > So the change in behavior is that after this patch
> > > intel_pstate=passive doesn't imply no_hwp any more.
> > >
> > > That's a very minor difference though and I'm not aware of any adverse
> > > effects it can cause on HWP systems anyway.
> >
> > My point was, that it will now default to something where
> > testing has not been completed.
> >
> > > The "what governor" is straightforward in the passive mode: that's
> > > whatever cpufreq governor has been selected.
> >
> > I think you might have missed my point.
> > From the normal methods of inquiry one does not know
> > if HWP is being used or not. Why? Because with
> > or without HWP one gets the same answers under:
> >
> > /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver
> > /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
> 
> Yes, but this is also the case in the active mode, isn't it?

Yes, fair enough.
But we aren't changing what it means by default
between kernel 5.8 and 5.9-rc1.

... Doug




Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range

2020-08-05 Thread Michael S. Tsirkin
On Thu, Aug 06, 2020 at 11:25:11AM +0800, Jason Wang wrote:
> 
> On 2020/8/5 下午8:51, Michael S. Tsirkin wrote:
> > On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote:
> > > This patch introduce a config op to get valid iova range from the vDPA
> > > device.
> > > 
> > > Signed-off-by: Jason Wang
> > > ---
> > >   include/linux/vdpa.h | 14 ++
> > >   1 file changed, 14 insertions(+)
> > > 
> > > diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > > index 239db794357c..b7633ed2500c 100644
> > > --- a/include/linux/vdpa.h
> > > +++ b/include/linux/vdpa.h
> > > @@ -41,6 +41,16 @@ struct vdpa_device {
> > >   unsigned int index;
> > >   };
> > > +/**
> > > + * vDPA IOVA range - the IOVA range supported by the device
> > > + * @start: start of the IOVA range
> > > + * @end: end of the IOVA range
> > > + */
> > > +struct vdpa_iova_range {
> > > + u64 start;
> > > + u64 end;
> > > +};
> > > +
> > This is ambiguous. Is end in the range or just behind it?
> 
> 
> In the range.

OK I guess we can treat it as a bugfix and merge after rc1,
but pls add a bit more in the commit log about what's
currently broken.

> 
> > How about first/last?
> 
> 
> Sure.
> 
> Thanks
> 
> 
> > 
> > 
> > 



Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy

2020-08-05 Thread Michael S. Tsirkin
On Thu, Aug 06, 2020 at 11:23:05AM +0800, Jason Wang wrote:
> 
> On 2020/8/5 下午7:40, Michael S. Tsirkin wrote:
> > On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote:
> > > On 2020/8/4 上午5:00, Michael S. Tsirkin wrote:
> > > > Some legacy guests just assume features are 0 after reset.
> > > > We detect that config space is accessed before features are
> > > > set and set features to 0 automatically.
> > > > Note: some legacy guests might not even access config space, if this is
> > > > reported in the field we might need to catch a kick to handle these.
> > > I wonder whether it's easier to just support modern device?
> > > 
> > > Thanks
> > Well hardware vendors are I think interested in supporting legacy
> > guests. Limiting vdpa to modern only would make it uncompetitive.
> 
> 
> My understanding is that, IOMMU_PLATFORM is mandatory for hardware vDPA to
> work.

Hmm I don't really see why. Assume host maps guest memory properly,
VM does not have an IOMMU, legacy guest can just work.

Care explaining what's wrong with this picture?


> So it can only work for modern device ...
> 
> Thanks
> 
> 
> > 
> > 
> > 



Re: [PATCH] jbd2: fix incorrect code style

2020-08-05 Thread tytso
On Sat, Jul 18, 2020 at 08:57:37AM -0400, Xianting Tian wrote:
> Remove unnecessary blank.
> 
> Signed-off-by: Xianting Tian 

Thanks, applied.

- Ted


Re: [RFC PATCH] mm: silence soft lockups from unlock_page

2020-08-05 Thread Hugh Dickins
On Mon, 27 Jul 2020, Greg KH wrote:
> 
> Linus just pointed me at this thread.
> 
> If you could run:
>   echo -n 'module xhci_hcd =p' > /sys/kernel/debug/dynamic_debug/control
> and run the same workload to see if anything shows up in the log when
> xhci crashes, that would be great.

Thanks, I tried that, and indeed it did have a story to tell:

ep 0x81 - asked for 16 bytes, 10 bytes untransferred
ep 0x81 - asked for 16 bytes, 10 bytes untransferred
ep 0x81 - asked for 16 bytes, 10 bytes untransferred
   a very large number of lines like the above, then
Cancel URB d81602f7, dev 4, ep 0x0, starting at offset 0xfffd42c0
// Ding dong!
ep 0x81 - asked for 16 bytes, 10 bytes untransferred
Stopped on No-op or Link TRB for slot 1 ep 0
xhci_drop_endpoint called for udev 5bc07fa6
drop ep 0x81, slot id 1, new drop flags = 0x8, new add flags = 0x0
add ep 0x81, slot id 1, new drop flags = 0x8, new add flags = 0x8
xhci_check_bandwidth called for udev 5bc07fa6
// Ding dong!
Successful Endpoint Configure command
Cancel URB 6b77d490, dev 4, ep 0x81, starting at offset 0x0
// Ding dong!
Stopped on No-op or Link TRB for slot 1 ep 2
Removing canceled TD starting at 0x0 (dma).
list_del corruption: prev(8fdb4de7a130)->next should be 8fdb41697f88,
   but is 6b6b6b6b6b6b6b6b; next(8fdb4de7a130)->prev is 6b6b6b6b6b6b6b6b.
[ cut here ]
kernel BUG at lib/list_debug.c:53!
RIP: 0010:__list_del_entry_valid+0x8e/0xb0
Call Trace:
 
 handle_cmd_completion+0x7d4/0x14f0 [xhci_hcd]
 xhci_irq+0x242/0x1ea0 [xhci_hcd]
 xhci_msi_irq+0x11/0x20 [xhci_hcd]
 __handle_irq_event_percpu+0x48/0x2c0
 handle_irq_event_percpu+0x32/0x80
 handle_irq_event+0x4a/0x80
 handle_edge_irq+0xd8/0x1b0
 handle_irq+0x2b/0x50
 do_IRQ+0xb6/0x1c0
 common_interrupt+0x90/0x90
 

Info provided for your interest, not expecting any response.
The list_del info in there is non-standard, from a patch of mine:
I find hashed addresses in debug output less than helpful.

> 
> Although if you are using an "older version" of the driver, there's not
> much I can suggest except update to a newer one :)

Yes, I was reluctant to post any info, since really the ball is at our
end of the court, not yours. I did have a go at bringing in the latest
xhci driver instead, but quickly saw that was not a sensible task for
me. And I did scan the git log of xhci changes (especially xhci-ring.c
changes): thought I saw a likely relevant and easily applied fix commit,
but in fact it made no difference here.

I suspect it's in part a hardware problem, but driver not recovering
correctly. I've replaced the machine (but also noticed that the same
crash has occasionally been seen on other machines). I'm sure it has
no relevance to this unlock_page() thread, though it's quite possible
that it's triggered under stress, and Linus's changes allowed greater
stress.

Hugh


Re: [PATCH v10 2/5] powerpc/vdso: Prepare for switching VDSO to generic C implementation.

2020-08-05 Thread Christophe Leroy

Hi,

On 08/05/2020 06:40 PM, Segher Boessenkool wrote:

Hi!

On Wed, Aug 05, 2020 at 04:40:16PM +, Christophe Leroy wrote:

It cannot optimise it because it does not know shift < 32.  The code
below is incorrect for shift equal to 32, fwiw.


Is there a way to tell it ?


Sure, for example the &31 should work (but it doesn't, with the GCC
version you used -- which version is that?)


GCC 10.1




What does the compiler do for just

static __always_inline u64 vdso_shift_ns(u64 ns, unsigned long shift)
{
	return ns >> (shift & 31);
}



Worse:


I cannot make heads or tails of all that branch spaghetti, sorry.


  73c:  55 8c 06 fe     clrlwi  r12,r12,27
  740:  7f c8 f0 14     addc    r30,r8,r30
  744:  7c c6 4a 14     add     r6,r6,r9
  748:  7c c6 e1 14     adde    r6,r6,r28
  74c:  34 6c ff e0     addic.  r3,r12,-32
  750:  41 80 00 70     blt     7c0 <__c_kernel_clock_gettime+0x114>


This branch is always true.  Hrm.


As a standalone function:

With your suggestion:

06ac :
 6ac:   54 a5 06 fe clrlwi  r5,r5,27
 6b0:   35 25 ff e0 addic.  r9,r5,-32
 6b4:   41 80 00 10 blt 6c4 
 6b8:   7c 64 4c 30 srw r4,r3,r9
 6bc:   38 60 00 00 li  r3,0
 6c0:   4e 80 00 20 blr
 6c4:   54 69 08 3c rlwinm  r9,r3,1,0,30
 6c8:   21 45 00 1f subfic  r10,r5,31
 6cc:   7c 84 2c 30 srw r4,r4,r5
 6d0:   7d 29 50 30 slw r9,r9,r10
 6d4:   7c 63 2c 30 srw r3,r3,r5
 6d8:   7d 24 23 78 or  r4,r9,r4
 6dc:   4e 80 00 20 blr


With the version as is in my series:

06ac :
 6ac:   21 25 00 20 subfic  r9,r5,32
 6b0:   7c 69 48 30 slw r9,r3,r9
 6b4:   7c 84 2c 30 srw r4,r4,r5
 6b8:   7d 24 23 78 or  r4,r9,r4
 6bc:   7c 63 2c 30 srw r3,r3,r5
 6c0:   4e 80 00 20 blr


Christophe


Re: [PATCH 2/2] Add a new sysctl knob: unprivileged_userfaultfd_user_mode_only

2020-08-05 Thread Michael S. Tsirkin
On Wed, Aug 05, 2020 at 05:43:02PM -0700, Nick Kralevich wrote:
> On Fri, Jul 24, 2020 at 6:40 AM Michael S. Tsirkin  wrote:
> >
> > On Thu, Jul 23, 2020 at 05:13:28PM -0700, Nick Kralevich wrote:
> > > On Thu, Jul 23, 2020 at 10:30 AM Lokesh Gidra  
> > > wrote:
> > > > From the discussion so far it seems that there is a consensus that
> > > > patch 1/2 in this series should be upstreamed in any case. Is there
> > > > anything that is pending on that patch?
> > >
> > > That's my reading of this thread too.
> > >
> > > > > > Unless I'm mistaken that you can already enforce bit 1 of the second
> > > > > > parameter of the userfaultfd syscall to be set with seccomp-bpf, 
> > > > > > this
> > > > > > would be more a question to the Android userland team.
> > > > > >
> > > > > > The question would be: does it ever happen that a seccomp filter 
> > > > > > isn't
> > > > > > already applied to unprivileged software running without
> > > > > > SYS_CAP_PTRACE capability?
> > > > >
> > > > > Yes.
> > > > >
> > > > > Android uses selinux as our primary sandboxing mechanism. We do use
> > > > > seccomp on a few processes, but we have found that it has a
> > > > > surprisingly high performance cost [1] on arm64 devices so turning it
> > > > > on system wide is not a good option.
> > > > >
> > > > > [1] 
> > > > > https://lore.kernel.org/linux-security-module/20200606.3F7109A@keescook/T/#m82ace19539ac595682affabdf652c0ffa5d27dad
> > >
> > > As Jeff mentioned, seccomp is used strategically on Android, but is
> > > not applied to all processes. It's too expensive and impractical when
> > > simpler implementations (such as this sysctl) can exist. It's also
> > > significantly simpler to test a sysctl value for correctness as
> > > opposed to a seccomp filter.
> >
> > Given that selinux is already used system-wide on Android, what is wrong
> > with using selinux to control userfaultfd as opposed to seccomp?
> 
> Userfaultfd file descriptors will be generally controlled by SELinux.
> You can see the patchset at
> https://lore.kernel.org/lkml/20200401213903.182112-3-dan...@google.com/
> (which is also referenced in the original commit message for this
> patchset). However, the SELinux patchset doesn't include the ability
> to control FAULT_FLAG_USER / UFFD_USER_MODE_ONLY directly.
> 
> SELinux already has the ability to control who gets CAP_SYS_PTRACE,
> which combined with this patch, is largely equivalent to direct
> UFFD_USER_MODE_ONLY checks. Additionally, with the SELinux patch
> above, movement of userfaultfd file descriptors can be mediated by
> SELinux, preventing one process from acquiring userfaultfd descriptors
> of other processes unless allowed by security policy.
> 
> It's an interesting question whether finer-grain SELinux support for
> controlling UFFD_USER_MODE_ONLY should be added. I can see some
> advantages to implementing this. However, we don't need to decide that
> now.
>
> Kernel security checks generally break down into DAC (discretionary
> access control) and MAC (mandatory access control) controls. Most
> kernel security features check via both of these mechanisms. Security
> attributes of the system should be settable without necessarily
> relying on an LSM such as SELinux. This patch follows the same basic
> model -- system wide control of a hardening feature is provided by the
> unprivileged_userfaultfd_user_mode_only sysctl (DAC), and if needed,
> SELinux support for this can also be implemented on top of the DAC
> controls.
> 
> This DAC/MAC split has been successful in several other security
> features. For example, the ability to map at page zero is controlled
> in DAC via the mmap_min_addr sysctl [1], and via SELinux via the
> mmap_zero access vector [2]. Similarly, access to the kernel ring
> buffer is controlled both via DAC as the dmesg_restrict sysctl [3], as
> well as the SELinux syslog_read [2] check. Indeed, the dmesg_restrict
> sysctl is very similar to this patch -- it introduces a capability
> (CAP_SYSLOG, CAP_SYS_PTRACE) check on access to a sensitive resource.
> 
> If we want to ensure that a security feature will be well tested and
> vetted, it's important to not limit its use to LSMs only. This ensures
> that kernel and application developers will always be able to test the
> effects of a security feature, without relying on LSMs like SELinux.
> It also ensures that all distributions can enable this security
> mitigation should it be necessary for their unique environments,
> without introducing an SELinux dependency. And this patch does not
> preclude an SELinux implementation should it be necessary.
> 
> Even if we decide to implement fine-grain SELinux controls on
> UFFD_USER_MODE_ONLY, we still need this patch. We shouldn't make this
> an either/or choice between SELinux and this patch. Both are
> necessary.
> 
> -- Nick
> 
> [1] https://wiki.debian.org/mmap_min_addr
> [2] https://selinuxproject.org/page/NB_ObjectClassesPermissions
> [3] 

[PATCH] tty: synclink_gt: switch from 'pci_' to 'dma_' API

2020-08-05 Thread Christophe JAILLET
The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'alloc_desc()' and 'alloc_bufs()', GFP_KERNEL
can be used because it is only called from a probe function and no lock is
acquired.
The call chain is:
   init_one(the probe function)
  --> device_init
 --> alloc_dma_bufs
--> alloc_desc
--> alloc_bufs

@@
@@
-PCI_DMA_BIDIRECTIONAL
+DMA_BIDIRECTIONAL

@@
@@
-PCI_DMA_TODEVICE
+DMA_TO_DEVICE

@@
@@
-PCI_DMA_FROMDEVICE
+DMA_FROM_DEVICE

@@
@@
-PCI_DMA_NONE
+DMA_NONE

@@
expression e1, e2, e3;
@@
-pci_alloc_consistent(e1, e2, e3)
+dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-pci_zalloc_consistent(e1, e2, e3)
+dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-pci_free_consistent(e1, e2, e3, e4)
+dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_map_single(e1, e2, e3, e4)
+dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_unmap_single(e1, e2, e3, e4)
+dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-pci_map_page(e1, e2, e3, e4, e5)
+dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-pci_unmap_page(e1, e2, e3, e4)
+dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_map_sg(e1, e2, e3, e4)
+dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_unmap_sg(e1, e2, e3, e4)
+dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_dma_sync_single_for_device(e1, e2, e3, e4)
+dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-pci_dma_mapping_error(e1, e2)
+dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-pci_set_dma_mask(e1, e2)
+dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-pci_set_consistent_dma_mask(e1, e2)
+dma_set_coherent_mask(&e1->dev, e2)

Signed-off-by: Christophe JAILLET 
---
If needed, see post from Christoph Hellwig on the kernel-janitors ML:
   https://marc.info/?l=kernel-janitors&m=158745678307186&w=4
---
 drivers/tty/synclink_gt.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/tty/synclink_gt.c b/drivers/tty/synclink_gt.c
index b794177ccfb9..1edf06653148 100644
--- a/drivers/tty/synclink_gt.c
+++ b/drivers/tty/synclink_gt.c
@@ -3341,8 +3341,8 @@ static int alloc_desc(struct slgt_info *info)
unsigned int pbufs;
 
/* allocate memory to hold descriptor lists */
-   info->bufs = pci_zalloc_consistent(info->pdev, DESC_LIST_SIZE,
-  >bufs_dma_addr);
+   info->bufs = dma_alloc_coherent(>pdev->dev, DESC_LIST_SIZE,
+   >bufs_dma_addr, GFP_KERNEL);
if (info->bufs == NULL)
return -ENOMEM;
 
@@ -3384,7 +3384,8 @@ static int alloc_desc(struct slgt_info *info)
 static void free_desc(struct slgt_info *info)
 {
if (info->bufs != NULL) {
-	pci_free_consistent(info->pdev, DESC_LIST_SIZE, info->bufs, info->bufs_dma_addr);
+	dma_free_coherent(&info->pdev->dev, DESC_LIST_SIZE,
+			  info->bufs, info->bufs_dma_addr);
info->bufs  = NULL;
info->rbufs = NULL;
info->tbufs = NULL;
@@ -3395,7 +3396,9 @@ static int alloc_bufs(struct slgt_info *info, struct slgt_desc *bufs, int count)
 {
 	int i;
 	for (i=0; i < count; i++) {
-		if ((bufs[i].buf = pci_alloc_consistent(info->pdev, DMABUFSIZE, &bufs[i].buf_dma_addr)) == NULL)
+		bufs[i].buf = dma_alloc_coherent(&info->pdev->dev, DMABUFSIZE,
+						 &bufs[i].buf_dma_addr, GFP_KERNEL);
+		if (!bufs[i].buf)
+   if (!bufs[i].buf)
return -ENOMEM;
bufs[i].pbuf  = cpu_to_le32((unsigned int)bufs[i].buf_dma_addr);
}
@@ -3408,7 +3411,8 @@ static void free_bufs(struct slgt_info *info, struct slgt_desc *bufs, int count)
 	for (i=0; i < count; i++) {
 		if (bufs[i].buf == NULL)
 			continue;
-		pci_free_consistent(info->pdev, DMABUFSIZE, bufs[i].buf, bufs[i].buf_dma_addr);
+		dma_free_coherent(&info->pdev->dev, DMABUFSIZE, bufs[i].buf,
+				  bufs[i].buf_dma_addr);

Re: [PATCH 06/18] fsinfo: Add a uniquifier ID to struct mount [ver #21]

2020-08-05 Thread Ian Kent
On Wed, 2020-08-05 at 20:33 +0100, Matthew Wilcox wrote:
> On Wed, Aug 05, 2020 at 04:30:10PM +0100, David Howells wrote:
> > Miklos Szeredi  wrote:
> > 
> > > idr_alloc_cyclic() seems to be a good template for doing the
> > > lower
> > > 32bit allocation, and we can add code to increment the high 32bit
> > > on
> > > wraparound.
> > > 
> > > Lots of code uses idr_alloc_cyclic() so I guess it shouldn't be
> > > too
> > > bad in terms of memory use or performance.
> > 
> > It's optimised for shortness of path and trades memory for
> > performance.  It's
> > currently implemented using an xarray, so memory usage is dependent
> > on the
> > sparseness of the tree.  Each node in the tree is 576 bytes and in
> > the worst
> > case, each one node will contain one mount - and then you have to
> > backfill the
> > ancestry, though for lower memory costs.
> > 
> > Systemd makes life more interesting since it sets up a whole load
> > of
> > propagations.  Each mount you make may cause several others to be
> > created, but
> > that would likely make the tree more efficient.
> 
> I would recommend using xa_alloc and ignoring the ID assigned from
> xa_alloc.  Looking up by unique ID is then a matter of iterating
> every
> mount (xa_for_each()) looking for a matching unique ID in the mount
> struct.  That's O(n) search, but it's faster than a linked list, and
> we
> don't have that many mounts in a system.

How many is not many, 5000, 1, I agree that 3 plus is fairly
rare, even for the autofs direct mount case I hope the implementation
here will help to fix.

Ian



Re: [PATCH V2] venus: core: add shutdown callback for venus

2020-08-05 Thread kernel test robot
Hi Mansur,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linuxtv-media/master]
[also build test WARNING on v5.8 next-20200805]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Mansur-Alisha-Shaik/venus-core-add-shutdown-callback-for-venus/20200806-114716
base:   git://linuxtv.org/media_tree.git master
config: sh-allmodconfig (attached as .config)
compiler: sh4-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=sh 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

   In file included from include/linux/device.h:15,
from include/linux/node.h:18,
from include/linux/cpu.h:17,
from include/linux/of_device.h:5,
from drivers/media/platform/qcom/venus/core.c:11:
   drivers/media/platform/qcom/venus/core.c: In function 'venus_core_shutdown':
>> drivers/media/platform/qcom/venus/core.c:351:23: warning: too many arguments for format [-Wformat-extra-args]
  351 |   dev_warn(core->dev, "shutdown failed \n", ret);
      |                       ^~~~
    include/linux/dev_printk.h:19:22: note: in definition of macro 'dev_fmt'
   19 | #define dev_fmt(fmt) fmt
      |                      ^~~
    drivers/media/platform/qcom/venus/core.c:351:3: note: in expansion of macro 'dev_warn'
  351 |   dev_warn(core->dev, "shutdown failed \n", ret);
      |   ^~~~

vim +351 drivers/media/platform/qcom/venus/core.c

   343  
   344  static void venus_core_shutdown(struct platform_device *pdev)
   345  {
   346  struct venus_core *core = platform_get_drvdata(pdev);
   347  int ret;
   348  
   349  ret = venus_remove(pdev);
   350  if(ret)
 > 351  dev_warn(core->dev, "shutdown failed \n", ret);
   352  }
   353  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




RE: [PATCH] kprobes: fix compiler warning for !CONFIG_KPROBES_ON_FTRACE

2020-08-05 Thread John Fastabend
Muchun Song wrote:
> Fix compiler warning(as show below) for !CONFIG_KPROBES_ON_FTRACE.
> 
> kernel/kprobes.c: In function 'kill_kprobe':
> kernel/kprobes.c:1116:33: warning: statement with no effect
> [-Wunused-value]
>  1116 | #define disarm_kprobe_ftrace(p) (-ENODEV)
>   | ^
> kernel/kprobes.c:2154:3: note: in expansion of macro
> 'disarm_kprobe_ftrace'
>  2154 |   disarm_kprobe_ftrace(p);
> 
> Link: https://lore.kernel.org/r/20200805142136.0331f...@canb.auug.org.au
> 
> Reported-by: Stephen Rothwell 
> Fixes: 0cb2f1372baa ("kprobes: Fix NULL pointer dereference at 
> kprobe_ftrace_handler")
> Signed-off-by: Muchun Song 
> ---

Acked-by: John Fastabend 


Re: ext4: fix spelling typos in ext4_mb_initialize_context

2020-08-05 Thread tytso
On Wed, Jul 15, 2020 at 11:00:44AM +0800, brookxu wrote:
> Fix spelling typos in ext4_mb_initialize_context.
> 
> Signed-off-by: Chunguang Xu 

Thanks, applied.

- Ted


Re: [PATCH 1/2] sched/topology: Allow archs to override cpu_smt_mask

2020-08-05 Thread Michael Ellerman
pet...@infradead.org writes:
> On Tue, Aug 04, 2020 at 05:40:07PM +0530, Srikar Dronamraju wrote:
>> * pet...@infradead.org  [2020-08-04 12:45:20]:
>> 
>> > On Tue, Aug 04, 2020 at 09:03:06AM +0530, Srikar Dronamraju wrote:
>> > > cpu_smt_mask tracks topology_sibling_cpumask. This would be good for
>> > > most architectures. One of the users of cpu_smt_mask(), would be to
>> > > identify idle-cores. On Power9, a pair of cores can be presented by the
>> > > firmware as a big-core for backward compatibility reasons.
>> > > 
>> > > In order to maintain userspace backward compatibility with previous
>> > > versions of processor, (since Power8 had SMT8 cores), Power9 onwards 
>> > > there
>> > > is option to the firmware to advertise a pair of SMT4 cores as a fused
>> > > cores (referred to as the big_core mode in the Linux Kernel). On Power9
>> > > this pair shares the L2 cache as well. However, from the scheduler's 
>> > > point
>> > > of view, a core should be determined by SMT4. The load-balancer already
>> > > does this. Hence allow PowerPc architecture to override the default
>> > > cpu_smt_mask() to point to the SMT4 cores in a big_core mode.
>> > 
>> > I'm utterly confused.
>> > 
>> > Why can't you set your regular siblings mask to the smt4 thing? Who
>> > cares about the compat stuff, I thought that was an LPAR/AIX thing.

To be clear this stuff is all for when we're running on the PowerVM machines,
ie. as LPARs.

That brings with it a bunch of problems, such as existing software that
has been developed/configured for Power8 and expects to see SMT8.

We also allow LPARs to be live migrated from Power8 to Power9 (and back), so
maintaining the illusion of SMT8 is considered a requirement to make that work.

>> There are no technical challenges to set the sibling mask to SMT4.
>> This is for Linux running on PowerVM. When these Power9 boxes are sold /
>> marketed as X core boxes (where X stand for SMT8 cores).  Since on PowerVM
>> world, everything is in SMT8 mode, the device tree properties still mark
>> the system to be running on 8 thread core. There are a number of utilities
>> like ppc64_cpu that directly read from device-tree. They would get core
>> count and thread count which is SMT8 based.
>> 
>> If the sibling_mask is set to small core, then same user when looking at
>
> FWIW, I find the small/big core naming utterly confusing vs the
> big/little naming from ARM. When you say small, I'm thinking of
> asymmetric cores, not SMT4/SMT8.

Yeah I agree the naming is confusing.

Let's call them "SMT4 cores" and "SMT8 cores"?

>> output from lscpu and other utilities that look at sysfs will start seeing
>> 2x number of cores to what he had provisioned and what the utilities from
>> the device-tree show. This can gets users confused.
>
> One will report SMT8 and the other SMT4, right? So only users that
> cannot read will be confused, but if you can't read, confusion is
> guaranteed anyway.

It's partly users, but also software that would see different values depending
on where it looks.

> Also, by exposing the true (SMT4) topology to userspace, userspace
> applications could behave better -- for those few that actually parse
> the topology information.

Agreed, though as you say there aren't that many that actually use the low-level
topology information.

>> So to keep the device-tree properties, utilities depending on device-tree,
>> sysfs and utilities depending on sysfs on the same page, userspace are only
>> exposed as SMT8.
>
> I'm not convinced it makes sense to lie to userspace just to accomodate
> a few users that cannot read.

The problem is we are already lying to userspace, because firmware lies to us.

ie. the firmware on these systems shows us an SMT8 core, and so current kernels
show SMT8 to userspace. I don't think we can realistically change that fact now,
as these systems are already out in the field.

What this patch tries to do is undo some of the mess, and at least give the
scheduler the right information.

cheers


Re: [PATCH v5 4/4] clk: qcom: lpass: Add support for LPASS clock controller for SC7180

2020-08-05 Thread Taniya Das

Hi Stephen,

On 8/6/2020 1:54 AM, Stephen Boyd wrote:

Quoting Taniya Das (2020-07-24 09:07:58)

+
+static struct clk_rcg2 core_clk_src = {
+   .cmd_rcgr = 0x1d000,
+   .mnd_width = 8,
+   .hid_width = 5,
+   .parent_map = lpass_core_cc_parent_map_2,
+   .clkr.hw.init = &(struct clk_init_data){
+   .name = "core_clk_src",


Any chance this can get a better name? Something with LPASS prefix?



These are the exact clock names from the hardware plan.


+   .parent_data = &(const struct clk_parent_data){
+   .fw_name = "bi_tcxo",
+   },
+   .num_parents = 1,
+   .ops = &clk_rcg2_ops,
+   },
+};
+

[...]

+
+static struct clk_branch lpass_audio_core_sysnoc_mport_core_clk = {
+   .halt_reg = 0x23000,
+   .halt_check = BRANCH_HALT,
+   .hwcg_reg = 0x23000,
+   .hwcg_bit = 1,
+   .clkr = {
+   .enable_reg = 0x23000,
+   .enable_mask = BIT(0),
+   .hw.init = &(struct clk_init_data){
+   .name = "lpass_audio_core_sysnoc_mport_core_clk",
+   .parent_data = &(const struct clk_parent_data){
+   .hw = &core_clk_src.clkr.hw,
+   },
+   .num_parents = 1,
+   .flags = CLK_SET_RATE_PARENT,
+   .ops = &clk_branch2_ops,
+   },
+   },
+};
+
+static struct clk_regmap *lpass_core_cc_sc7180_clocks[] = {
+   [EXT_MCLK0_CLK_SRC] = &ext_mclk0_clk_src.clkr,
+   [LPAIF_PRI_CLK_SRC] = &lpaif_pri_clk_src.clkr,
+   [LPAIF_SEC_CLK_SRC] = &lpaif_sec_clk_src.clkr,
+   [CORE_CLK_SRC] = &core_clk_src.clkr,


And all of these, can they have LPASS_ prefix on the defines? Seems
like we're missing a namespace otherwise.



These are generated as they are in the HW plan. Do you still think I 
should update them?



+   [LPASS_AUDIO_CORE_EXT_MCLK0_CLK] = &lpass_audio_core_ext_mclk0_clk.clkr,
+   [LPASS_AUDIO_CORE_LPAIF_PRI_IBIT_CLK] =
+   &lpass_audio_core_lpaif_pri_ibit_clk.clkr,
+   [LPASS_AUDIO_CORE_LPAIF_SEC_IBIT_CLK] =
+   &lpass_audio_core_lpaif_sec_ibit_clk.clkr,
+   [LPASS_AUDIO_CORE_SYSNOC_MPORT_CORE_CLK] =
+   &lpass_audio_core_sysnoc_mport_core_clk.clkr,
+   [LPASS_LPAAUDIO_DIG_PLL] = &lpass_lpaaudio_dig_pll.clkr,
+   [LPASS_LPAAUDIO_DIG_PLL_OUT_ODD] = &lpass_lpaaudio_dig_pll_out_odd.clkr,
+};
+


--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation.

--


Re: [PATCH v6 15/18] nitro_enclaves: Add Makefile for the Nitro Enclaves driver

2020-08-05 Thread Paraschiv, Andra-Irina




On 05/08/2020 17:23, kernel test robot wrote:


Hi Andra,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on linus/master v5.8 next-20200805]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Andra-Paraschiv/Add-support-for-Nitro-Enclaves/20200805-171942
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
bcf876870b95592b52519ed4aafcf9d95999bc9c
config: arm64-allyesconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
 wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
 chmod +x ~/bin/make.cross
 # save the attached .config to linux build tree
 COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm64


Removed, for now, the dependency on ARM64 arch. x86 is currently 
supported, with Arm to come afterwards.


Thanks,
Andra



If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

drivers/virt/nitro_enclaves/ne_misc_dev.c: In function 'ne_setup_cpu_pool':

drivers/virt/nitro_enclaves/ne_misc_dev.c:245:46: error: 'smp_num_siblings' undeclared (first use in this function); did you mean 'cpu_sibling'?
  245 |  ne_cpu_pool.avail_cores_size = nr_cpu_ids / smp_num_siblings;
      |                                              ^~~~
      |                                              cpu_sibling
drivers/virt/nitro_enclaves/ne_misc_dev.c:245:46: note: each undeclared identifier is reported only once for each function it appears in
drivers/virt/nitro_enclaves/ne_misc_dev.c: In function 'ne_enclave_ioctl':
drivers/virt/nitro_enclaves/ne_misc_dev.c:928:54: error: 'smp_num_siblings' undeclared (first use in this function)
  928 |   if (vcpu_id >= (ne_enclave->avail_cpu_cores_size * smp_num_siblings)) {
      |                                                      ^~~~

vim +245 drivers/virt/nitro_enclaves/ne_misc_dev.c

7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  130
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  131  /**
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  132   * ne_setup_cpu_pool() - Set the NE CPU pool after handling sanity checks such
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  133   *			  as not sharing CPU cores with the primary / parent VM
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  134   *			  or not using CPU 0, which should remain available for
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  135   *			  the primary / parent VM. Offline the CPUs from the
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  136   *			  pool after the checks passed.
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  137   * @ne_cpu_list:	The CPU list used for setting NE CPU pool.
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  138   *
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  139   * Context: Process context.
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  140   * Return:
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  141   * * 0 on success.
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  142   * * Negative return value on failure.
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  143   */
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  144  static int ne_setup_cpu_pool(const char *ne_cpu_list)
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  145  {
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  146 	int core_id = -1;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  147 	unsigned int cpu = 0;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  148 	cpumask_var_t cpu_pool = NULL;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  149 	unsigned int cpu_sibling = 0;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  150 unsigned int i = 0;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  151 int numa_node = -1;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  152 int rc = -EINVAL;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  153
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  154 if (!ne_cpu_list)
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  155 return 0;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  156
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  157 	if (!zalloc_cpumask_var(&cpu_pool, GFP_KERNEL))
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  158 		return -ENOMEM;
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  159
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  160 	mutex_lock(&ne_cpu_pool.mutex);
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  161
7d5c9a7dfa51e60 Andra Paraschiv 2020-08-05  162 	rc = cpulist_parse(ne_cpu_list, cpu_pool);
7d5c9a7dfa51e60 Andra Paraschi

[PATCH] dt-bindings: sound: Convert NXP spdif to json-schema

2020-08-05 Thread Anson Huang
Convert the NXP SPDIF binding to DT schema format using json-schema.

Signed-off-by: Anson Huang 
---
 .../devicetree/bindings/sound/fsl,spdif.txt|  68 -
 .../devicetree/bindings/sound/fsl,spdif.yaml   | 108 +
 2 files changed, 108 insertions(+), 68 deletions(-)
 delete mode 100644 Documentation/devicetree/bindings/sound/fsl,spdif.txt
 create mode 100644 Documentation/devicetree/bindings/sound/fsl,spdif.yaml

diff --git a/Documentation/devicetree/bindings/sound/fsl,spdif.txt b/Documentation/devicetree/bindings/sound/fsl,spdif.txt
deleted file mode 100644
index e1365b0..000
--- a/Documentation/devicetree/bindings/sound/fsl,spdif.txt
+++ /dev/null
@@ -1,68 +0,0 @@
-Freescale Sony/Philips Digital Interface Format (S/PDIF) Controller
-
-The Freescale S/PDIF audio block is a stereo transceiver that allows the
-processor to receive and transmit digital audio via an coaxial cable or
-a fibre cable.
-
-Required properties:
-
-  - compatible : Compatible list, should contain one of the following
- compatibles:
- "fsl,imx35-spdif",
- "fsl,vf610-spdif",
- "fsl,imx6sx-spdif",
-
-  - reg: Offset and length of the register set for the 
device.
-
-  - interrupts : Contains the spdif interrupt.
-
-  - dmas   : Generic dma devicetree binding as described in
- Documentation/devicetree/bindings/dma/dma.txt.
-
-  - dma-names  : Two dmas have to be defined, "tx" and "rx".
-
-  - clocks : Contains an entry for each entry in clock-names.
-
-  - clock-names: Includes the following entries:
-   "core"The core clock of spdif controller.
-   "rxtx<0-7>"   Clock source list for tx and rx clock.
- This clock list should be identical to the source
- list connecting to the spdif clock mux in "SPDIF
- Transceiver Clock Diagram" of SoC reference manual.
- It can also be referred to TxClk_Source bit of
- register SPDIF_STC.
-   "spba"The spba clock is required when SPDIF is placed as a
- bus slave of the Shared Peripheral Bus and when two
- or more bus masters (CPU, DMA or DSP) try to access
- it. This property is optional depending on the SoC
- design.
-
-Optional properties:
-
-   - big-endian: If this property is absent, the native endian 
mode
- will be in use as default, or the big endian mode
- will be in use for all the device registers.
-
-Example:
-
-spdif: spdif@2004000 {
-   compatible = "fsl,imx35-spdif";
-   reg = <0x02004000 0x4000>;
-   interrupts = <0 52 0x04>;
-   dmas = <&sdma 14 18 0>,
-  <&sdma 15 18 0>;
-   dma-names = "rx", "tx";
-
-   clocks = <&clks 197>, <&clks 3>,
-  <&clks 197>, <&clks 107>,
-  <&clks 0>, <&clks 118>,
-  <&clks 62>, <&clks 139>,
-  <&clks 0>;
-   clock-names = "core", "rxtx0",
-   "rxtx1", "rxtx2",
-   "rxtx3", "rxtx4",
-   "rxtx5", "rxtx6",
-   "rxtx7";
-
-   big-endian;
-};
diff --git a/Documentation/devicetree/bindings/sound/fsl,spdif.yaml b/Documentation/devicetree/bindings/sound/fsl,spdif.yaml
new file mode 100644
index 000..819f37f
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/fsl,spdif.yaml
@@ -0,0 +1,108 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/sound/fsl,spdif.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale Sony/Philips Digital Interface Format (S/PDIF) Controller
+
+maintainers:
+  - Shengjiu Wang 
+
+description: |
+  The Freescale S/PDIF audio block is a stereo transceiver that allows the
+  processor to receive and transmit digital audio via an coaxial cable or
+  a fibre cable.
+
+properties:
+  compatible:
+enum:
+  - fsl,imx35-spdif
+  - fsl,vf610-spdif
+  - fsl,imx6sx-spdif
+
+  reg:
+maxItems: 1
+
+  interrupts:
+maxItems: 1
+
+  dmas:
+items:
+  - description: DMA controller phandle and request line for RX
+  - description: DMA controller phandle and request line for TX
+
+  dma-names:
+items:
+  - const: rx
+  - const: tx
+
+  clocks:
+items:
+  - description: The core clock of spdif controller.
+  - description: Clock for tx0 and rx0.
+  - description: Clock for tx1 and rx1.
+  - description: Clock for tx2 and rx2.
+  - description: Clock for tx3 and rx3.
+  - description: Clock for tx4 and rx4.
+  - description: Clock for tx5 and rx5.
+  - description: Clock for tx6 and rx6.
+  - description: Clock for tx7 and rx7.

Re: [RFC PATCH] mm: silence soft lockups from unlock_page

2020-08-05 Thread Hugh Dickins
Nice to see the +130.0% this morning.

I got back on to this on Monday, here's some follow-up.

On Sun, 26 Jul 2020, Hugh Dickins wrote:
> 
> The comparison runs have not yet completed (except for the one started
> early), but they have all got past the most interesting tests, and it's
> clear that they do not have the "failure"s seen with your patches.
> 
> From that I can only conclude that your patches make a difference.
> 
> I've deduced nothing useful from the logs, will have to leave that
> to others here with more experience of them.  But my assumption now
> is that you have successfully removed one bottleneck, so the tests
> get somewhat further and now stick in the next bottleneck, whatever
> that may be.  Which shows up as "failure", where the unlock_page()
> wake_up_page_bit() bottleneck had allowed the tests to proceed in
> a more serially sedate way.

Yes, that's still how it appears to me. The test failures, all
of them, came from fork() returning ENOSPC, which originated from
alloc_pid()'s idr_alloc_cyclic(). I did try doubling our already
large pid_max, that did not work out well, there are probably good
reasons for it to be where it is and I was wrong to dabble with it.
I also tried an rcu_barrier() and retry when getting -ENOSPC there,
thinking maybe RCU was not freeing them up fast enough, but that
didn't help either.

I think (but didn't quite make the effort to double-check with
an independent count) it was simply running out of pids: that your
change speeds up the forking enough, that exiting could not quite keep
up (SIGCHLD is SIG_IGNed); whereas before your change, the unlock_page()
in do_wp_page(), on a PageAnon stack page, slowed the forking down enough
when heavily contended.

(I think we could improve the checks there, to avoid taking page lock in
more cases; but I don't know if that would help any real-life workload -
I see now that Michal's case is do_read_fault() not do_wp_page().)

And FWIW a further speedup there is the opposite of what these tests
are wanting: for the moment I've enabled a delay to get them passing
as before.

Something I was interested to realize in looking at this: trylock_page()
on a contended lock is now much less likely to jump the queue and
succeed than before, since your lock holder hands off the page lock to
the next holder: much smaller window than waiting for the next to wake
to take it. Nothing wrong with that, but effect might be seen somewhere.

> 
> The xhci handle_cmd_completion list_del bugs (on an older version
> of the driver): weird, nothing to do with page wakeups, I'll just
> have to assume that it's some driver bug exposed by the greater
> stress allowed down, and let driver people investigate (if it
> still manifests) when we take in your improvements.

Complete red herring. I'll give Greg more info in response to his
mail, and there may be an xhci bug in there; but when I looked back,
found I'd come across the same bug back in October, and find that
occasionally it's been seen in our fleet. Yes, it's odd that your
change coincided with it becoming more common on that machine
(which I've since replaced by another), yes it's funny that it's
in __list_del_entry_valid(), which is exactly where I got crashes
on pages with your initial patch; but it's just a distraction.

> 
> One nice thing from the comparison runs without your patches:
> watchdog panic did crash one of those with exactly the unlock_page()
> wake_up_page_bit() softlockup symptom we've been fighting, that did
> not appear with your patches.  So although the sample size is much
> too small to justify a conclusion, it does tend towards confirming
> your changes.
> 
> Thank you for your work on this! And I'm sure you'd have preferred
> some hard data back, rather than a diary of my mood swings, but...
> we do what we can.
> 
> Hugh


Re: [PATCH v2 2/2] dma-pool: Only allocate from CMA when in same memory zone

2020-08-05 Thread Christoph Hellwig
On Tue, Aug 04, 2020 at 11:43:15AM +0200, Nicolas Saenz Julienne wrote:
> > Second I don't see the need (and actually some harm) in preventing 
> > GFP_KERNEL
> > allocations from dipping into lower CMA areas - something that we did 
> > support
> > before 5.8 with the single pool.
> 
> My thinking is the least we pressure CMA the better, it's generally scarce, 
> and
> it'll not grow as the atomic pools grow. As far as harm is concerned, we now
> check addresses for correctness, so we shouldn't run into problems.
> 
> There is a potential case for architectures defining a default CMA but not
> defining DMA zones where this could be problematic. But isn't that just plain
> abusing CMA? If you need low memory allocations, you should be defining DMA
> zones.

The latter is pretty much what I expect, as we only support the default and
per-device DMA CMAs.


[PATCH 9/9] scsi: ufs: Properly release resources if a task is aborted successfully

2020-08-05 Thread Can Guo
In the current UFS task abort hook, namely ufshcd_abort(), if a task is
aborted successfully, the clock scaling busy time statistics are not updated
and, most importantly, clk_gating.active_reqs is not decreased, which makes
clk_gating.active_reqs stay above zero forever, so clock gating would
never happen. To fix it, instead of releasing resources "manually", use
the existing function __ufshcd_transfer_req_compl(). This also eliminates
racing of scsi_dma_unmap() against the real completion in the IRQ handler path.

Signed-off-by: Can Guo 
CC: Stanley Chu 
Reviewed-by: Stanley Chu 

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index b2947ab..9541fc7 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -6636,11 +6636,8 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
goto out;
}
 
-   scsi_dma_unmap(cmd);
-
spin_lock_irqsave(host->host_lock, flags);
-   ufshcd_outstanding_req_clear(hba, tag);
-   hba->lrb[tag].cmd = NULL;
+   __ufshcd_transfer_req_compl(hba, (1UL << tag));
spin_unlock_irqrestore(host->host_lock, flags);
 
 out:
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.



[PATCH 1/9] scsi: ufs: Add checks before setting clk-gating states

2020-08-05 Thread Can Guo
Clock gating features can be turned on/off selectively, which means the
clk-gating state information is only meaningful if the feature is enabled.
This change makes sure that we only look at the clk-gating state if it is
enabled.

Signed-off-by: Can Guo 
Reviewed-by: Avri Altman 
Reviewed-by: Hongwu Su 
Reviewed-by: Stanley Chu 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 3076222..5acb38c 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -1839,6 +1839,8 @@ static void ufshcd_init_clk_gating(struct ufs_hba *hba)
if (!ufshcd_is_clkgating_allowed(hba))
return;
 
+   hba->clk_gating.state = CLKS_ON;
+
hba->clk_gating.delay_ms = 150;
	INIT_DELAYED_WORK(&hba->clk_gating.gate_work, ufshcd_gate_work);
	INIT_WORK(&hba->clk_gating.ungate_work, ufshcd_ungate_work);
@@ -2541,7 +2543,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
err = SCSI_MLQUEUE_HOST_BUSY;
goto out;
}
-   WARN_ON(hba->clk_gating.state != CLKS_ON);
+   WARN_ON(ufshcd_is_clkgating_allowed(hba) &&
+   (hba->clk_gating.state != CLKS_ON));
 
	lrbp = &hba->lrb[tag];
 
@@ -8326,8 +8329,11 @@ static int ufshcd_suspend(struct ufs_hba *hba, enum ufs_pm_op pm_op)
/* If link is active, device ref_clk can't be switched off */
__ufshcd_setup_clocks(hba, false, true);
 
-   hba->clk_gating.state = CLKS_OFF;
-   trace_ufshcd_clk_gating(dev_name(hba->dev), hba->clk_gating.state);
+   if (ufshcd_is_clkgating_allowed(hba)) {
+   hba->clk_gating.state = CLKS_OFF;
+   trace_ufshcd_clk_gating(dev_name(hba->dev),
+   hba->clk_gating.state);
+   }
 
/* Put the host controller in low power mode if possible */
ufshcd_hba_vreg_set_lpm(hba);
@@ -8467,6 +8473,11 @@ static int ufshcd_resume(struct ufs_hba *hba, enum ufs_pm_op pm_op)
if (hba->clk_scaling.is_allowed)
ufshcd_suspend_clkscaling(hba);
ufshcd_setup_clocks(hba, false);
+   if (ufshcd_is_clkgating_allowed(hba)) {
+   hba->clk_gating.state = CLKS_OFF;
+   trace_ufshcd_clk_gating(dev_name(hba->dev),
+   hba->clk_gating.state);
+   }
 out:
hba->pm_op_in_progress = 0;
if (ret)
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.



[PATCH 7/9] scsi: ufs: Move dumps in IRQ handler to error handler

2020-08-05 Thread Can Guo
Sometimes dumps in the IRQ handler are heavy enough to cause system
stability issues; move them to the error handler and only print basic host
registers here.

Signed-off-by: Can Guo 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 6a10003..a79fbbd 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5696,6 +5696,19 @@ static void ufshcd_err_handler(struct work_struct *work)
UFSHCD_UIC_DL_TCx_REPLAY_ERROR
needs_reset = true;
 
+   if (hba->saved_err & (INT_FATAL_ERRORS | UIC_ERROR |
+ UFSHCD_UIC_HIBERN8_MASK)) {
+   bool pr_prdt = !!(hba->saved_err & SYSTEM_BUS_FATAL_ERROR);
+
+   spin_unlock_irqrestore(hba->host->host_lock, flags);
+   ufshcd_print_host_state(hba);
+   ufshcd_print_pwr_info(hba);
+   ufshcd_print_host_regs(hba);
+   ufshcd_print_tmrs(hba, hba->outstanding_tasks);
+   ufshcd_print_trs(hba, hba->outstanding_reqs, pr_prdt);
+   spin_lock_irqsave(hba->host->host_lock, flags);
+   }
+
/*
 * if host reset is required then skip clearing the pending
 * transfers forcefully because they will get cleared during
@@ -5915,18 +5928,12 @@ static irqreturn_t ufshcd_check_errors(struct ufs_hba *hba)
 
/* dump controller state before resetting */
if (hba->saved_err & (INT_FATAL_ERRORS | UIC_ERROR)) {
-   bool pr_prdt = !!(hba->saved_err &
-   SYSTEM_BUS_FATAL_ERROR);
-
			dev_err(hba->dev, "%s: saved_err 0x%x saved_uic_err 0x%x\n",
__func__, hba->saved_err,
hba->saved_uic_err);
-
-   ufshcd_print_host_regs(hba);
+   ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE,
+"host_regs: ");
ufshcd_print_pwr_info(hba);
-   ufshcd_print_tmrs(hba, hba->outstanding_tasks);
-   ufshcd_print_trs(hba, hba->outstanding_reqs,
-   pr_prdt);
}
ufshcd_schedule_eh_work(hba);
retval |= IRQ_HANDLED;
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.



[PATCH 6/9] scsi: ufs: Recover hba runtime PM error in error handler

2020-08-05 Thread Can Guo
Current error handler cannot work well or recover hba runtime PM errors if
ufshcd_suspend/resume has failed due to UFS errors, e.g. a hibern8 enter/exit
error or an SSU cmd error. When this happens, the error handler may fail doing
a full reset and restore because it always assumes that powers, IRQs and
clocks are ready after pm_runtime_get_sync() returns, but actually they are
not if ufshcd_resume fails [1]. Besides, if ufshcd_suspend/resume fails due to
a UFS error, the runtime PM framework saves the error value to
dev.power.runtime_error. After that, hba dev runtime suspend/resume is not
invoked anymore unless runtime_error is cleared [2].

In case ufshcd_suspend/resume fails due to UFS errors: for scenario [1], the
error handler cannot assume anything about pm_runtime_get_sync(), meaning it
should explicitly turn ON powers, IRQs and clocks again. To get hba runtime PM
working again for scenario [2], the error handler can clear runtime_error by
calling pm_runtime_set_active() if the full reset and restore succeeds. And,
more importantly, if pm_runtime_set_active() returns no error, which means
runtime_error has been cleared, we also need to resume those scsi devices
under the hba in case any of them has failed to be resumed due to the hba
runtime resume failure. This is to unblock blk_queue_enter() in case there are
bios waiting inside it.
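The recovery flow for scenario [2] can be sketched as a plain-C userspace model. The structs and names below are invented for illustration; in the driver, the real calls are pm_runtime_set_active() and per-device resumes of the SCSI request queues.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a device's runtime PM bookkeeping. */
struct model_dev {
	int runtime_error;	/* mirrors dev.power.runtime_error */
	int active;		/* 1 = RPM_ACTIVE */
};

/* Model of pm_runtime_set_active(): clears the stored runtime error. */
static int model_set_active(struct model_dev *d)
{
	d->runtime_error = 0;
	d->active = 1;
	return 0;
}

/*
 * Model of the error handler's PM recovery: clear the host's runtime
 * error, then resume every child so that I/O blocked in
 * blk_queue_enter() can make progress again.
 */
static void model_recover_pm_error(struct model_dev *host,
				   struct model_dev *children, size_t n)
{
	size_t i;

	if (model_set_active(host) != 0)
		return;
	for (i = 0; i < n; i++) {
		children[i].runtime_error = 0;
		children[i].active = 1;
	}
}
```

The key point the model captures is the ordering: the children are only resumed once the host's own runtime error has been cleared successfully.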

Signed-off-by: Can Guo 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 2604016..6a10003 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "ufshcd.h"
 #include "ufs_quirks.h"
 #include "unipro.h"
@@ -229,6 +230,10 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba);
 static int ufshcd_change_power_mode(struct ufs_hba *hba,
 struct ufs_pa_layer_attr *pwr_mode);
 static void ufshcd_schedule_eh_work(struct ufs_hba *hba);
+static int ufshcd_setup_hba_vreg(struct ufs_hba *hba, bool on);
+static int ufshcd_setup_vreg(struct ufs_hba *hba, bool on);
+static inline int ufshcd_config_vreg_hpm(struct ufs_hba *hba,
+struct ufs_vreg *vreg);
 static int ufshcd_wb_buf_flush_enable(struct ufs_hba *hba);
 static int ufshcd_wb_buf_flush_disable(struct ufs_hba *hba);
 static int ufshcd_wb_ctrl(struct ufs_hba *hba, bool enable);
@@ -5553,6 +5558,84 @@ static inline void ufshcd_schedule_eh_work(struct ufs_hba *hba)
}
 }
 
+static void ufshcd_err_handling_prepare(struct ufs_hba *hba)
+{
+   pm_runtime_get_sync(hba->dev);
+   if (pm_runtime_suspended(hba->dev)) {
+   /*
+* Don't assume anything of pm_runtime_get_sync(), if
+* resume fails, irq and clocks can be OFF, and powers
+* can be OFF or in LPM.
+*/
+   ufshcd_setup_hba_vreg(hba, true);
+   ufshcd_enable_irq(hba);
+   ufshcd_setup_vreg(hba, true);
+   ufshcd_config_vreg_hpm(hba, hba->vreg_info.vccq);
+   ufshcd_config_vreg_hpm(hba, hba->vreg_info.vccq2);
+   ufshcd_hold(hba, false);
+   if (!ufshcd_is_clkgating_allowed(hba))
+   ufshcd_setup_clocks(hba, true);
+   ufshcd_release(hba);
+   ufshcd_vops_resume(hba, UFS_RUNTIME_PM);
+   } else {
+   ufshcd_hold(hba, false);
+   if (hba->clk_scaling.is_allowed) {
+   cancel_work_sync(&hba->clk_scaling.suspend_work);
+   cancel_work_sync(&hba->clk_scaling.resume_work);
+   ufshcd_suspend_clkscaling(hba);
+   }
+   }
+}
+
+static void ufshcd_err_handling_unprepare(struct ufs_hba *hba)
+{
+   ufshcd_release(hba);
+   if (hba->clk_scaling.is_allowed)
+   ufshcd_resume_clkscaling(hba);
+   pm_runtime_put(hba->dev);
+}
+
+static inline bool ufshcd_err_handling_should_stop(struct ufs_hba *hba)
+{
+   return (hba->ufshcd_state == UFSHCD_STATE_ERROR ||
+   (!(hba->saved_err || hba->saved_uic_err || hba->force_reset ||
+   ufshcd_is_link_broken(hba))));
+}
+
+#ifdef CONFIG_PM
+static void ufshcd_recover_pm_error(struct ufs_hba *hba)
+{
+   struct Scsi_Host *shost = hba->host;
+   struct scsi_device *sdev;
+   struct request_queue *q;
+   int ret;
+
+   /*
+* Set RPM status of hba device to RPM_ACTIVE,
+* this also clears its runtime error.
+*/
+   ret = pm_runtime_set_active(hba->dev);
+   /*
+* If hba device had runtime error, we also need to resume those
+* scsi devices under hba in case any of them has failed to be
+* resumed due to hba runtime resume failure. This is to unblock
+* blk_queue_enter in case there are bios waiting inside it.
+*/
+   if (!ret) {
+   list_for_each_entry(sdev, &shost->__devices, siblings) {
+

[PATCH 5/9] scsi: ufs: Fix concurrency of error handler and other error recovery paths

2020-08-05 Thread Can Guo
Error recovery can be invoked from multiple paths, including hibern8
enter/exit (from ufshcd_link_recovery), ufshcd_eh_host_reset_handler and
eh_work scheduled from IRQ context. Ultimately, these paths are trying to
invoke ufshcd_reset_and_restore, in either sync or async manner.

Having both sync and async manners at the same time causes several problems:

- If link recovery happens during ungate work, ufshcd_hold() would be
  called recursively. Although commit 53c12d0ef6fcb
  ("scsi: ufs: fix error recovery after the hibern8 exit failure") [1]
  fixed a deadlock due to recursive calls of ufshcd_hold() by adding a
  check of eh_in_progress into ufshcd_hold, this check allows eh_work to
  run in parallel while link recovery is running.

- Similar concurrency can also happen when error recovery is invoked from
  ufshcd_eh_host_reset_handler and ufshcd_link_recovery.

- Concurrency can even happen between eh_works. eh_work, currently queued
  on system_wq, is allowed to have multiple instances running in parallel,
  but we don't have proper protection for that.

If any of the above concurrency scenarios happens, error recovery would fail
and lead the UFS device and host into bad states. To fix the concurrency
problem, this change queues eh_work on a single-threaded workqueue and removes
the link recovery calls from the hibern8 enter/exit path. Meanwhile, make
eh_host_reset_handler use eh_work instead of calling
ufshcd_reset_and_restore. This unifies the UFS error recovery mechanism.

In addition, according to the UFSHCI JEDEC spec, hibern8 enter/exit errors
occur when the link is broken. This essentially applies to any power mode
change operation (since they all use PACP_PWR cmds in the UniPro layer). So,
in this change, if a power mode change operation (including AH8 enter/exit)
fails, mark the link state as UIC_LINK_BROKEN_STATE and schedule eh_work. In
this case, the error handler needs to do a full reset and restore to recover
the link back to active. Until the link state is recovered to active,
ufshcd_uic_pwr_ctrl simply returns -ENOLINK to avoid more errors.
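The -ENOLINK gating described in the last paragraph can be sketched as a small userspace model. The enum and function names here are invented for illustration; the real function is ufshcd_uic_pwr_ctrl() operating on hba state.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the UIC link state machine. */
enum model_link_state {
	MODEL_LINK_OFF,
	MODEL_LINK_ACTIVE,
	MODEL_LINK_HIBERN8,
	MODEL_LINK_BROKEN,
};

/*
 * Model of the gating: a failed power mode change marks the link
 * BROKEN, and further requests are refused with -ENOLINK until error
 * recovery restores the link.
 */
static int model_uic_pwr_ctrl(enum model_link_state *link, int cmd_ok)
{
	if (*link == MODEL_LINK_BROKEN)
		return -ENOLINK;	/* avoid piling up more UIC errors */
	if (!cmd_ok) {
		*link = MODEL_LINK_BROKEN;	/* PACP_PWR cmd failed */
		return -EIO;		/* caller schedules eh_work */
	}
	return 0;
}
```

Only a full reset and restore in the error handler moves the state back to ACTIVE, at which point power mode change requests are accepted again.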

Signed-off-by: Can Guo 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufs-sysfs.c b/drivers/scsi/ufs/ufs-sysfs.c
index 2d71d23..02d379f00 100644
--- a/drivers/scsi/ufs/ufs-sysfs.c
+++ b/drivers/scsi/ufs/ufs-sysfs.c
@@ -16,6 +16,7 @@ static const char *ufschd_uic_link_state_to_string(
case UIC_LINK_OFF_STATE:return "OFF";
case UIC_LINK_ACTIVE_STATE: return "ACTIVE";
case UIC_LINK_HIBERN8_STATE:return "HIBERN8";
+   case UIC_LINK_BROKEN_STATE: return "BROKEN";
default:return "UNKNOWN";
}
 }
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 71c650f..2604016 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -228,6 +228,7 @@ static int ufshcd_scale_clks(struct ufs_hba *hba, bool scale_up);
 static irqreturn_t ufshcd_intr(int irq, void *__hba);
 static int ufshcd_change_power_mode(struct ufs_hba *hba,
 struct ufs_pa_layer_attr *pwr_mode);
+static void ufshcd_schedule_eh_work(struct ufs_hba *hba);
 static int ufshcd_wb_buf_flush_enable(struct ufs_hba *hba);
 static int ufshcd_wb_buf_flush_disable(struct ufs_hba *hba);
 static int ufshcd_wb_ctrl(struct ufs_hba *hba, bool enable);
@@ -1571,11 +1572,6 @@ int ufshcd_hold(struct ufs_hba *hba, bool async)
spin_lock_irqsave(hba->host->host_lock, flags);
hba->clk_gating.active_reqs++;
 
-   if (ufshcd_eh_in_progress(hba)) {
-   spin_unlock_irqrestore(hba->host->host_lock, flags);
-   return 0;
-   }
-
 start:
switch (hba->clk_gating.state) {
case CLKS_ON:
@@ -1653,6 +1649,7 @@ static void ufshcd_gate_work(struct work_struct *work)
struct ufs_hba *hba = container_of(work, struct ufs_hba,
clk_gating.gate_work.work);
unsigned long flags;
+   int ret;
 
spin_lock_irqsave(hba->host->host_lock, flags);
/*
@@ -1679,8 +1676,11 @@ static void ufshcd_gate_work(struct work_struct *work)
 
/* put the link into hibern8 mode before turning off clocks */
if (ufshcd_can_hibern8_during_gating(hba)) {
-   if (ufshcd_uic_hibern8_enter(hba)) {
+   ret = ufshcd_uic_hibern8_enter(hba);
+   if (ret) {
hba->clk_gating.state = CLKS_ON;
+   dev_err(hba->dev, "%s: hibern8 enter failed %d\n",
+   __func__, ret);
trace_ufshcd_clk_gating(dev_name(hba->dev),
hba->clk_gating.state);
goto out;
@@ -1725,11 +1725,10 @@ static void __ufshcd_release(struct ufs_hba *hba)
 
hba->clk_gating.active_reqs--;
 
-   if (hba->clk_gating.active_reqs || hba->clk_gating.is_suspended
-   || hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL
-   || 

[PATCH 8/9] scsi: ufs: Fix a racing problem btw error handler and runtime PM ops

2020-08-05 Thread Can Guo
The current IRQ handler blocks scsi requests before scheduling eh_work. When
the error handler then calls pm_runtime_get_sync(), and ufshcd_suspend/resume
sends a scsi cmd (most likely the SSU cmd), pm_runtime_get_sync() will never
return: the scsi requests are blocked, so ufshcd_suspend/resume is blocked by
the scsi cmd. Some changes and code re-arrangement can be made to resolve it.

o In the queuecommand path, the hba->ufshcd_state check and
  ufshcd_send_command should stay under the same spin lock. This is to make
  sure that no more commands leak into the doorbell after hba->ufshcd_state
  is changed.
o Don't block scsi requests before the error handler starts to run; let the
  error handler block scsi requests when it is ready to start error recovery.
o Don't let the scsi layer keep requeuing the scsi cmds sent from hba runtime
  PM ops; either let them pass or fail them. Let them pass if eh_work is
  scheduled due to non-fatal errors. Fail them if eh_work is scheduled due to
  fatal errors, otherwise the cmds may eventually time out since UFS is in a
  bad state, which gets the error handler blocked for too long. If we fail the
  scsi cmds sent from hba runtime PM ops, the hba runtime PM ops fail too, but
  that does not hurt since the error handler can recover hba runtime PM errors.
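The per-state dispositions described above can be summarized as a small decision table. This is a hypothetical userspace sketch of the logic, not the driver's code; the state and verdict names are invented.

```c
#include <assert.h>

/* Hypothetical mirror of the host states relevant to queuecommand. */
enum model_state {
	ST_OPERATIONAL,
	ST_EH_SCHEDULED_NON_FATAL,
	ST_EH_SCHEDULED_FATAL,
	ST_RESET,
	ST_ERROR,
};

enum model_verdict {
	V_DISPATCH,	/* ring the doorbell */
	V_REQUEUE,	/* SCSI_MLQUEUE_HOST_BUSY */
	V_FAIL,		/* complete the cmd with an error */
};

static enum model_verdict model_queuecommand(enum model_state s,
					     int pm_op_in_progress)
{
	switch (s) {
	case ST_OPERATIONAL:
	case ST_EH_SCHEDULED_NON_FATAL:
		return V_DISPATCH;	/* non-fatal recovery: let cmds pass */
	case ST_EH_SCHEDULED_FATAL:
		if (pm_op_in_progress)
			return V_FAIL;	/* eh recovers the PM error anyway */
		/* fall through */
	case ST_RESET:
		return V_REQUEUE;
	case ST_ERROR:
	default:
		return V_FAIL;
	}
}
```

Failing (rather than requeuing) the PM-op cmd in the fatal case is what breaks the deadlock between pm_runtime_get_sync() in the error handler and the blocked SSU cmd.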

Signed-off-by: Can Guo 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index a79fbbd..b2947ab 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -126,7 +126,8 @@ enum {
UFSHCD_STATE_RESET,
UFSHCD_STATE_ERROR,
UFSHCD_STATE_OPERATIONAL,
-   UFSHCD_STATE_EH_SCHEDULED,
+   UFSHCD_STATE_EH_SCHEDULED_FATAL,
+   UFSHCD_STATE_EH_SCHEDULED_NON_FATAL,
 };
 
 /* UFSHCD error handling flags */
@@ -2515,34 +2516,6 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
if (!down_read_trylock(>clk_scaling_lock))
return SCSI_MLQUEUE_HOST_BUSY;
 
-   spin_lock_irqsave(hba->host->host_lock, flags);
-   switch (hba->ufshcd_state) {
-   case UFSHCD_STATE_OPERATIONAL:
-   break;
-   case UFSHCD_STATE_EH_SCHEDULED:
-   case UFSHCD_STATE_RESET:
-   err = SCSI_MLQUEUE_HOST_BUSY;
-   goto out_unlock;
-   case UFSHCD_STATE_ERROR:
-   set_host_byte(cmd, DID_ERROR);
-   cmd->scsi_done(cmd);
-   goto out_unlock;
-   default:
-   dev_WARN_ONCE(hba->dev, 1, "%s: invalid state %d\n",
-   __func__, hba->ufshcd_state);
-   set_host_byte(cmd, DID_BAD_TARGET);
-   cmd->scsi_done(cmd);
-   goto out_unlock;
-   }
-
-   /* if error handling is in progress, don't issue commands */
-   if (ufshcd_eh_in_progress(hba)) {
-   set_host_byte(cmd, DID_ERROR);
-   cmd->scsi_done(cmd);
-   goto out_unlock;
-   }
-   spin_unlock_irqrestore(hba->host->host_lock, flags);
-
hba->req_abort_count = 0;
 
err = ufshcd_hold(hba, true);
@@ -2578,11 +2551,51 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
/* Make sure descriptors are ready before ringing the doorbell */
wmb();
 
-   /* issue command to the controller */
spin_lock_irqsave(hba->host->host_lock, flags);
+   switch (hba->ufshcd_state) {
+   case UFSHCD_STATE_OPERATIONAL:
+   case UFSHCD_STATE_EH_SCHEDULED_NON_FATAL:
+   break;
+   case UFSHCD_STATE_EH_SCHEDULED_FATAL:
+   /*
+* pm_runtime_get_sync() is used at error handling preparation
+* stage. If a scsi cmd, e.g. the SSU cmd, is sent from hba's
+* PM ops, it can never be finished if we let SCSI layer keep
+* retrying it, which gets err handler stuck forever. Neither
+* can we let the scsi cmd pass through, because UFS is in bad
+* state, the scsi cmd may eventually time out, which will get
+* err handler blocked for too long. So, just fail the scsi cmd
+* sent from PM ops, err handler can recover PM error anyways.
+*/
+   if (hba->pm_op_in_progress) {
+   hba->force_reset = true;
+   set_host_byte(cmd, DID_BAD_TARGET);
+   goto out_compl_cmd;
+   }
+   fallthrough;
+   case UFSHCD_STATE_RESET:
+   err = SCSI_MLQUEUE_HOST_BUSY;
+   goto out_compl_cmd;
+   case UFSHCD_STATE_ERROR:
+   set_host_byte(cmd, DID_ERROR);
+   goto out_compl_cmd;
+   default:
+   dev_WARN_ONCE(hba->dev, 1, "%s: invalid state %d\n",
+   __func__, hba->ufshcd_state);
+   set_host_byte(cmd, DID_BAD_TARGET);
+   goto out_compl_cmd;
+   }
ufshcd_send_command(hba, tag);
-out_unlock:
   

[PATCH 4/9] scsi: ufs: Add some debug infos to ufshcd_print_host_state

2020-08-05 Thread Can Guo
The last interrupt status and its timestamp are very helpful when debugging
system stability issues, e.g. IRQ starvation, so add them to
ufshcd_print_host_state. Meanwhile, UFS device information like the model name
and its FW version also comes in handy during debugging. In addition, this
change cleans up some prints in ufshcd_print_host_regs, as similar prints are
already available in ufshcd_print_host_state.

Signed-off-by: Can Guo 
Reviewed-by: Avri Altman 
Reviewed-by: Hongwu Su 
Reviewed-by: Asutosh Das 
Reviewed-by: Stanley Chu 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 5acb38c..71c650f 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -411,15 +411,6 @@ static void ufshcd_print_err_hist(struct ufs_hba *hba,
 static void ufshcd_print_host_regs(struct ufs_hba *hba)
 {
ufshcd_dump_regs(hba, 0, UFSHCI_REG_SPACE_SIZE, "host_regs: ");
-   dev_err(hba->dev, "hba->ufs_version = 0x%x, hba->capabilities = 0x%x\n",
-   hba->ufs_version, hba->capabilities);
-   dev_err(hba->dev,
-   "hba->outstanding_reqs = 0x%x, hba->outstanding_tasks = 0x%x\n",
-   (u32)hba->outstanding_reqs, (u32)hba->outstanding_tasks);
-   dev_err(hba->dev,
-   "last_hibern8_exit_tstamp at %lld us, hibern8_exit_cnt = %d\n",
-   ktime_to_us(hba->ufs_stats.last_hibern8_exit_tstamp),
-   hba->ufs_stats.hibern8_exit_cnt);
 
ufshcd_print_err_hist(hba, &hba->ufs_stats.pa_err, "pa_err");
ufshcd_print_err_hist(hba, &hba->ufs_stats.dl_err, "dl_err");
@@ -438,8 +429,6 @@ static void ufshcd_print_host_regs(struct ufs_hba *hba)
ufshcd_print_err_hist(hba, &hba->ufs_stats.host_reset, "host_reset");
ufshcd_print_err_hist(hba, &hba->ufs_stats.task_abort, "task_abort");
 
-   ufshcd_print_clk_freqs(hba);
-
ufshcd_vops_dbg_register_dump(hba);
 }
 
@@ -499,6 +488,8 @@ static void ufshcd_print_tmrs(struct ufs_hba *hba, unsigned long bitmap)
 
 static void ufshcd_print_host_state(struct ufs_hba *hba)
 {
+   struct scsi_device *sdev_ufs = hba->sdev_ufs_device;
+
dev_err(hba->dev, "UFS Host state=%d\n", hba->ufshcd_state);
dev_err(hba->dev, "outstanding reqs=0x%lx tasks=0x%lx\n",
hba->outstanding_reqs, hba->outstanding_tasks);
@@ -511,12 +502,24 @@ static void ufshcd_print_host_state(struct ufs_hba *hba)
dev_err(hba->dev, "Auto BKOPS=%d, Host self-block=%d\n",
hba->auto_bkops_enabled, hba->host->host_self_blocked);
dev_err(hba->dev, "Clk gate=%d\n", hba->clk_gating.state);
+   dev_err(hba->dev,
+   "last_hibern8_exit_tstamp at %lld us, hibern8_exit_cnt=%d\n",
+   ktime_to_us(hba->ufs_stats.last_hibern8_exit_tstamp),
+   hba->ufs_stats.hibern8_exit_cnt);
+   dev_err(hba->dev, "last intr at %lld us, last intr status=0x%x\n",
+   ktime_to_us(hba->ufs_stats.last_intr_ts),
+   hba->ufs_stats.last_intr_status);
dev_err(hba->dev, "error handling flags=0x%x, req. abort count=%d\n",
hba->eh_flags, hba->req_abort_count);
-   dev_err(hba->dev, "Host capabilities=0x%x, caps=0x%x\n",
-   hba->capabilities, hba->caps);
+   dev_err(hba->dev, "hba->ufs_version=0x%x, Host capabilities=0x%x, caps=0x%x\n",
+   hba->ufs_version, hba->capabilities, hba->caps);
dev_err(hba->dev, "quirks=0x%x, dev. quirks=0x%x\n", hba->quirks,
hba->dev_quirks);
+   if (sdev_ufs)
+   dev_err(hba->dev, "UFS dev info: %.8s %.16s rev %.4s\n",
+   sdev_ufs->vendor, sdev_ufs->model, sdev_ufs->rev);
+
+   ufshcd_print_clk_freqs(hba);
 }
 
 /**
@@ -5951,6 +5954,8 @@ static irqreturn_t ufshcd_intr(int irq, void *__hba)
 
spin_lock(hba->host->host_lock);
intr_status = ufshcd_readl(hba, REG_INTERRUPT_STATUS);
+   hba->ufs_stats.last_intr_status = intr_status;
+   hba->ufs_stats.last_intr_ts = ktime_get();
 
/*
 * There could be max of hba->nutrs reqs in flight and in worst case
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index b2ef18f..b7f54af 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -409,6 +409,8 @@ struct ufs_err_reg_hist {
 
 /**
  * struct ufs_stats - keeps usage/err statistics
+ * @last_intr_status: record the last interrupt status.
+ * @last_intr_ts: record the last interrupt timestamp.
  * @hibern8_exit_cnt: Counter to keep track of number of exits,
  * reset this after link-startup.
  * @last_hibern8_exit_tstamp: Set time after the hibern8 exit.
@@ -428,6 +430,9 @@ struct ufs_err_reg_hist {
  * @tsk_abort: tracks task abort events
  */
 struct ufs_stats {
+   u32 last_intr_status;
+   ktime_t last_intr_ts;
+
u32 hibern8_exit_cnt;
ktime_t last_hibern8_exit_tstamp;
 
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora 

[PATCH 3/9] scsi: ufs-qcom: Remove testbus dump in ufs_qcom_dump_dbg_regs

2020-08-05 Thread Can Guo
Dumping testbus registers is heavy enough to cause stability issues
sometimes; just remove the dumps for now.

Signed-off-by: Can Guo 
Reviewed-by: Hongwu Su 
Reviewed-by: Avri Altman 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index 823eccf..6b75338 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -1630,44 +1630,12 @@ int ufs_qcom_testbus_config(struct ufs_qcom_host *host)
return 0;
 }
 
-static void ufs_qcom_testbus_read(struct ufs_hba *hba)
-{
-   ufshcd_dump_regs(hba, UFS_TEST_BUS, 4, "UFS_TEST_BUS ");
-}
-
-static void ufs_qcom_print_unipro_testbus(struct ufs_hba *hba)
-{
-   struct ufs_qcom_host *host = ufshcd_get_variant(hba);
-   u32 *testbus = NULL;
-   int i, nminor = 256, testbus_len = nminor * sizeof(u32);
-
-   testbus = kmalloc(testbus_len, GFP_KERNEL);
-   if (!testbus)
-   return;
-
-   host->testbus.select_major = TSTBUS_UNIPRO;
-   for (i = 0; i < nminor; i++) {
-   host->testbus.select_minor = i;
-   ufs_qcom_testbus_config(host);
-   testbus[i] = ufshcd_readl(hba, UFS_TEST_BUS);
-   }
-   print_hex_dump(KERN_ERR, "UNIPRO_TEST_BUS ", DUMP_PREFIX_OFFSET,
-   16, 4, testbus, testbus_len, false);
-   kfree(testbus);
-}
-
 static void ufs_qcom_dump_dbg_regs(struct ufs_hba *hba)
 {
ufshcd_dump_regs(hba, REG_UFS_SYS1CLK_1US, 16 * 4,
 "HCI Vendor Specific Registers ");
 
-   /* sleep a bit intermittently as we are dumping too much data */
ufs_qcom_print_hw_debug_reg_all(hba, NULL, ufs_qcom_dump_regs_wrapper);
-   udelay(1000);
-   ufs_qcom_testbus_read(hba);
-   udelay(1000);
-   ufs_qcom_print_unipro_testbus(hba);
-   udelay(1000);
 }
 
 /**
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.



[PATCH 2/9] ufs: ufs-qcom: Fix race conditions caused by func ufs_qcom_testbus_config

2020-08-05 Thread Can Guo
If ufs_qcom_dump_dbg_regs() calls ufs_qcom_testbus_config() from the
ufshcd_suspend/resume and/or clk gate/ungate context, pm_runtime_get_sync()
and ufshcd_hold() will cause race conditions. Fix this by removing the
unnecessary calls to pm_runtime_get_sync() and ufshcd_hold().

Signed-off-by: Can Guo 
Reviewed-by: Hongwu Su 
Reviewed-by: Avri Altman 
Reviewed-by: Bean Huo 

diff --git a/drivers/scsi/ufs/ufs-qcom.c b/drivers/scsi/ufs/ufs-qcom.c
index d0d7552..823eccf 100644
--- a/drivers/scsi/ufs/ufs-qcom.c
+++ b/drivers/scsi/ufs/ufs-qcom.c
@@ -1614,9 +1614,6 @@ int ufs_qcom_testbus_config(struct ufs_qcom_host *host)
 */
}
mask <<= offset;
-
-   pm_runtime_get_sync(host->hba->dev);
-   ufshcd_hold(host->hba, false);
ufshcd_rmwl(host->hba, TEST_BUS_SEL,
(u32)host->testbus.select_major << 19,
REG_UFS_CFG1);
@@ -1629,8 +1626,6 @@ int ufs_qcom_testbus_config(struct ufs_qcom_host *host)
 * committed before returning.
 */
mb();
-   ufshcd_release(host->hba);
-   pm_runtime_put_sync(host->hba->dev);
 
return 0;
 }
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
Foundation Collaborative Project.



Re: [PATCH] Replace HTTP links with HTTPS ones: Ext4

2020-08-05 Thread tytso
On Mon, Jul 06, 2020 at 09:03:39PM +0200, Alexander A. Klimov wrote:
> Rationale:
> Reduces attack surface on kernel devs opening the links for MITM
> as HTTPS traffic is much harder to manipulate.

Thanks, applied.

- Ted


Re: [PATCH v9 8/9] scsi: ufs: Fix a racing problem btw error handler and runtime PM ops

2020-08-05 Thread Can Guo

On 2020-08-05 09:31, Martin K. Petersen wrote:

Can,


Current IRQ handler blocks scsi requests before scheduling eh_work,
when error handler calls pm_runtime_get_sync, if ufshcd_suspend/resume
sends a scsi cmd, most likely the SSU cmd, since scsi requests are
blocked, pm_runtime_get_sync() will never return because
ufshcd_suspend/reusme is blocked by the scsi cmd. Some changes and
code re-arrangement can be made to resolve it.


  CC [M]  drivers/scsi/ufs/ufshcd.o
drivers/scsi/ufs/ufshcd.c: In function ‘ufshcd_queuecommand’:
drivers/scsi/ufs/ufshcd.c:2570:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
 2570 |   if (hba->pm_op_in_progress) {
  |  ^
drivers/scsi/ufs/ufshcd.c:2575:2: note: here
 2575 |  case UFSHCD_STATE_RESET:
  |  ^~~~
cc1: all warnings being treated as errors
make[3]: *** [scripts/Makefile.build:280: drivers/scsi/ufs/ufshcd.o] 
Error 1

make[2]: *** [scripts/Makefile.build:497: drivers/scsi/ufs] Error 2
make[1]: *** [scripts/Makefile.build:497: drivers/scsi] Error 2
make: *** [Makefile:1764: drivers] Error 2


Thanks Martin, will fix it in next version.

Can Guo.


drivers/usb/host/ehci.h:743:17: sparse: sparse: incorrect type in argument 1 (different address spaces)

2020-08-05 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   fffe3ae0ee84e25d2befe2ae59bc32aa2b6bc77b
commit: 670d0a4b10704667765f7d18f7592993d02783aa sparse: use identifiers to 
define address spaces
date:   7 weeks ago
config: mips-randconfig-s031-20200806 (attached as .config)
compiler: mips-linux-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.2-117-g8c7aee71-dirty
git checkout 670d0a4b10704667765f7d18f7592993d02783aa
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=mips 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

   drivers/usb/host/ehci-hcd.c: note: in included file:
   drivers/usb/host/ehci-q.c:1389:27: sparse: sparse: incorrect type in 
assignment (different base types) @@ expected restricted __hc32 [usertype] 
old_current @@ got int @@
   drivers/usb/host/ehci-q.c:1389:27: sparse: expected restricted __hc32 
[usertype] old_current
   drivers/usb/host/ehci-q.c:1389:27: sparse: got int
   drivers/usb/host/ehci-hcd.c: note: in included file:
   drivers/usb/host/ehci-mem.c:188:24: sparse: sparse: incorrect type in 
assignment (different base types) @@ expected restricted __hc32 [usertype] 
*periodic @@ got restricted __le32 [usertype] * @@
   drivers/usb/host/ehci-mem.c:188:24: sparse: expected restricted __hc32 
[usertype] *periodic
   drivers/usb/host/ehci-mem.c:188:24: sparse: got restricted __le32 
[usertype] *
   drivers/usb/host/ehci-hcd.c:566:27: sparse: sparse: incorrect type in 
assignment (different base types) @@ expected restricted __hc32 [usertype] 
old_current @@ got int @@
   drivers/usb/host/ehci-hcd.c:566:27: sparse: expected restricted __hc32 
[usertype] old_current
   drivers/usb/host/ehci-hcd.c:566:27: sparse: got int
   drivers/usb/host/ehci-hcd.c: note: in included file:
>> drivers/usb/host/ehci.h:743:17: sparse: sparse: incorrect type in argument 1 
>> (different address spaces) @@ expected void const volatile [noderef] 
>> __iomem *mem @@ got unsigned int * @@
>> drivers/usb/host/ehci.h:743:17: sparse: expected void const volatile 
>> [noderef] __iomem *mem
   drivers/usb/host/ehci.h:743:17: sparse: got unsigned int *
   drivers/usb/host/ehci.h:743:17: sparse: sparse: cast to restricted __be32
   drivers/usb/host/ehci-hcd.c: note: in included file (through 
arch/mips/include/asm/mmiowb.h, include/linux/spinlock.h, 
include/linux/seqlock.h, ...):
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   drivers/usb/host/ehci-hcd.c: note: in included file:
>> drivers/usb/host/ehci.h:743:17: sparse: sparse: incorrect type in argument 1 
>> (different address spaces) @@ expected void const volatile [noderef] 
>> __iomem *mem @@ got unsigned int * @@
>> drivers/usb/host/ehci.h:743:17: sparse: expected void const volatile 
>> [noderef] __iomem *mem
   drivers/usb/host/ehci.h:743:17: sparse: got unsigned int *
   drivers/usb/host/ehci.h:743:17: sparse: sparse: cast to restricted __be32
   drivers/usb/host/ehci-hcd.c: note: in included file (through 
arch/mips/include/asm/mmiowb.h, include/linux/spinlock.h, 
include/linux/seqlock.h, ...):
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   arch/mips/include/asm/io.h:354:1: sparse: sparse: cast to restricted __le32
   drivers/usb/host/ehci-hcd.c: note: in included file:
>> drivers/usb/host/ehci.h:743:17: sparse: sparse: incorrect type in argument 1 
>> (different address spaces) @@ expected void const volatile [noderef] 
>> __iomem *mem @@ got unsigned int * @@
>> drivers/usb/host/ehci.h:743:17: sparse: expected void const volatile 
>> [noderef] __iomem *mem
   drivers/usb/host/ehci.h:743:17: sparse: got unsigned int *
   drivers/usb/host/ehci.h:743:17: sparse: sparse: cast to restricted __be32
   drivers/usb/host/ehci-hcd.c: note: in included 

[PATCH v4] PCI: Reduce warnings on possible RW1C corruption

2020-08-05 Thread Mark Tomlinson
For hardware that only supports 32-bit writes to PCI config space there is
the possibility of clearing RW1C (write-one-to-clear) bits. A rate-limited
message was introduced by commit fb2659230120, but rate-limiting is not the
best choice here. Some devices may not show the warnings they should if
another device has just produced a bunch of warnings. Also, the number
of messages can be a nuisance on devices which are otherwise working
fine.

This patch changes the ratelimit to a single warning per bus. This
ensures no bus is 'starved' of emitting a warning and also that there
isn't a continuous stream of warnings. It would be preferable to have a
warning per device, but the pci_dev structure is not available here, and
a lookup from devfn would be far too slow.
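To make the RW1C hazard concrete, here is a userspace model with an invented register layout: a 2-byte write emulated as a 32-bit read-modify-write writes back whatever status bits happened to be set at read time, and the hardware treats those written ones as clear requests.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical 32-bit config register whose high 16 bits are RW1C
 * (write-one-to-clear) status bits; the layout is for illustration.
 */
#define RW1C_MASK 0xffff0000u

static void model_write32(uint32_t *reg, uint32_t val)
{
	uint32_t status = *reg & RW1C_MASK;

	status &= ~(val & RW1C_MASK);	/* ones written clear status bits */
	*reg = status | (val & ~RW1C_MASK);	/* low half is plain RW */
}

/*
 * A 2-byte write emulated with a 32-bit read-modify-write, as
 * pci_generic_config_write32() must do on such hardware.
 */
static void model_write16_via_rmw(uint32_t *reg, unsigned int where,
				  uint16_t val)
{
	unsigned int shift = (where & 0x3) * 8;
	uint32_t mask = 0xffffu << shift;
	uint32_t tmp = (*reg & ~mask) | ((uint32_t)val << shift);

	model_write32(reg, tmp);	/* writes back the RW1C bits it read */
}
```

Starting from a register with pending status bits, a 16-bit write to the low half inadvertently clears all of them, which is exactly the corruption the warning is about.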

Suggested-by: Bjorn Helgaas 
Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
Signed-off-by: Mark Tomlinson 
---
changes in v4:
 - Use bitfield rather than bool to save memory (was meant to be in v3).

 drivers/pci/access.c | 9 ++---
 include/linux/pci.h  | 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 79c4a2ef269a..b452467fd133 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -160,9 +160,12 @@ int pci_generic_config_write32(struct pci_bus *bus, unsigned int devfn,
 * write happen to have any RW1C (write-one-to-clear) bits set, we
 * just inadvertently cleared something we shouldn't have.
 */
-   dev_warn_ratelimited(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
-size, pci_domain_nr(bus), bus->number,
-PCI_SLOT(devfn), PCI_FUNC(devfn), where);
+   if (!bus->unsafe_warn) {
+   dev_warn(&bus->dev, "%d-byte config write to %04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
+size, pci_domain_nr(bus), bus->number,
+PCI_SLOT(devfn), PCI_FUNC(devfn), where);
+   bus->unsafe_warn = 1;
+   }
 
mask = ~(((1 << (size * 8)) - 1) << ((where & 0x3) * 8));
tmp = readl(addr) & mask;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 34c1c4f45288..85211a787f8b 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -626,6 +626,7 @@ struct pci_bus {
struct bin_attribute*legacy_io; /* Legacy I/O for this bus */
struct bin_attribute*legacy_mem;/* Legacy mem */
unsigned intis_added:1;
+   unsigned int unsafe_warn:1;  /* warned about RW1C config write */
 };
 
 #define to_pci_bus(n)  container_of(n, struct pci_bus, dev)
-- 
2.28.0



[PATCH] softirq: add irq off checking for __raise_softirq_irqoff

2020-08-05 Thread Jiafei Pan
__raise_softirq_irqoff() updates the per-CPU mask of pending softirqs. It
needs to be called with interrupts disabled so that the update is atomic;
otherwise it can be interrupted by a hardware interrupt, the per-CPU softirq
pending mask gets corrupted, and unexpected issues follow: for example, the
hrtimer softirq can be lost, so the soft hrtimer never expires and is never
handled.

Add a check for disabled interrupts here to warn when the function is called
in an irqs-enabled context.
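The lost-update scenario described above can be made deterministic in a userspace sketch by simulating the interrupt at the worst possible point, between the load and the store of the pending mask (all names here are invented):

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the per-CPU softirq pending mask. */
static uint32_t model_pending;

/* Runs "inside" the simulated hardware interrupt. */
static void model_raise_from_irq(unsigned int nr)
{
	model_pending |= 1u << nr;
}

/*
 * A non-atomic read-modify-write of the mask, interrupted between the
 * load and the store: the bit raised by the interrupt is overwritten.
 */
static uint32_t model_raise_preempted(unsigned int nr)
{
	uint32_t snap = model_pending;		/* load ... */

	model_raise_from_irq(7);		/* IRQ raises softirq 7 */
	model_pending = snap | (1u << nr);	/* ... store: bit 7 lost */
	return model_pending;
}
```

With interrupts disabled around the read-modify-write, the interleaving shown here cannot happen, which is what the WARN_ON_ONCE() is asserting.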

Signed-off-by: Jiafei Pan 
---
 kernel/softirq.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index bf88d7f62433..11f61e54a3ae 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -481,6 +481,11 @@ void raise_softirq(unsigned int nr)
 
 void __raise_softirq_irqoff(unsigned int nr)
 {
+   /* This function can only be called in irq-disabled context;
+    * otherwise or_softirq_pending() can be interrupted by a hardware
+    * interrupt and the pending-mask update may be lost.
+    */
+   WARN_ON_ONCE(!irqs_disabled());
trace_softirq_raise(nr);
or_softirq_pending(1UL << nr);
 }
-- 
2.17.1



[PATCH] leds: Add an optional property named 'sdb-gpios'

2020-08-05 Thread Grant Feng
The chip enters hardware shutdown when the SDB pin is pulled low.
The chip releases hardware shutdown when the SDB pin is pulled high.

Signed-off-by: Grant Feng 
---
 Documentation/devicetree/bindings/leds/leds-is31fl32xx.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/Documentation/devicetree/bindings/leds/leds-is31fl32xx.txt 
b/Documentation/devicetree/bindings/leds/leds-is31fl32xx.txt
index 926c2117942c..94f02827fd83 100644
--- a/Documentation/devicetree/bindings/leds/leds-is31fl32xx.txt
+++ b/Documentation/devicetree/bindings/leds/leds-is31fl32xx.txt
@@ -15,6 +15,8 @@ Required properties:
 - reg: I2C slave address
 - address-cells : must be 1
 - size-cells : must be 0
+- sdb-gpios : (optional)
+  Specifier of the GPIO connected to the SDB pin.
 
 LED sub-node properties:
 - reg : LED channel number (1..N)
@@ -31,6 +33,7 @@ is31fl3236: led-controller@3c {
reg = <0x3c>;
#address-cells = <1>;
#size-cells = <0>;
+   sdb-gpios = < 11 GPIO_ACTIVE_HIGH>;
 
led@1 {
reg = <1>;
-- 
2.17.1




RE: [EXT] Re: [PATCH v4 2/2] net: dsa: ocelot: Add support for QinQ Operation

2020-08-05 Thread Hongbo Wang
> On 8/3/2020 11:36 PM, Hongbo Wang wrote:
> >>> + if (vlan->proto == ETH_P_8021AD) {
> >>> + ocelot->enable_qinq = true;
> >>> + ocelot_port->qinq_mode = true;
> >>> + }
> >>  ...
> >>> + if (vlan->proto == ETH_P_8021AD) {
> >>> + ocelot->enable_qinq = false;
> >>> + ocelot_port->qinq_mode = false;
> >>> + }
> >>> +
> >>
> >> I don't understand how this can work just by using a boolean to track
> >> the state.
> >>
> >> This won't work properly if you are handling multiple QinQ VLAN entries.
> >>
> >> Also, I need Andrew and Florian to review and ACK the DSA layer
> >> changes that add the protocol value to the device notifier block.
> >
> > Hi David,
> > Thanks for reply.
> >
> > When setting bridge's VLAN protocol to 802.1AD by the command "ip link
> > set br0 type bridge vlan_protocol 802.1ad", it will call
> > dsa_slave_vlan_rx_add(dev, proto, vid) for every port in the bridge,
> > the parameter vid is port's pvid 1, if pvid's proto is 802.1AD, I will
> > enable switch's enable_qinq, and the related port's qinq_mode,
> >
> > When there are multiple QinQ VLAN entries, If one VLAN's proto is 802.1AD,
> I will enable switch and the related port into QinQ mode.
> 
> The enabling appears fine, the problem is the disabling, the first 802.1AD 
> VLAN
> entry that gets deleted will lead to the port and switch no longer being in 
> QinQ
> mode, and this does not look intended.
> --
> Florian

When I tried to add a reference counter, I found that:
1.
the command "ip link set br0 type bridge vlan_protocol 802.1ad" call path is:
br_changelink -> __br_vlan_set_proto -> vlan_vid_add -> ... -> 
ndo_vlan_rx_add_vid -> dsa_slave_vlan_rx_add_vid(dev, proto, vid) -> 
felix_vlan_add

dsa_slave_vlan_rx_add_vid can pass correct protocol and vid(1) to ocelot driver.

vlan_vid_add is in net/8021q/vlan_core.c, it maintains a vid_list that stores 
the map of vid and protocol,
the function vlan_vid_info_get can read the map.

but when deleting bridge using "ip link del dev br0 type bridge", the call path 
is:
br_dev_delete -> ... -> br_switchdev_port_vlan_del -> ... -> 
dsa_slave_port_obj_del -> dsa_slave_vlan_del -> ... -> felix_vlan_del

br_switchdev_port_vlan_del is in net/bridge/br_switchdev.c; it doesn't have
the list mapping vid to protocol,
so it can't pass the correct protocol corresponding with the vid to the
ocelot driver.

2.
In the ocelot QinQ case, the switch port linked to the customer has different
actions from the port for the ISP:

uplink: Customer LAN(CTAG) -> swp0(vlan_aware:0 pop_cnt:0) -> swp1(add STAG) -> 
ISP MAN(STAG + CTAG)
downlink: ISP MAN(STAG + CTAG) -> swp1(vlan_aware:1 pop_cnt:1, pop STAG) -> 
swp0(only CTAG) -> Customer LAN

The different actions are described in "4.3.3 Provider Bridges and Q-in-Q
Operation" in VSC99599_1_00_TS.pdf.

So I need a standard command to set swp0 and swp1 to different modes,
but "ip link set br0 type bridge vlan_protocol 802.1ad" will set all ports into
the same mode, which is not my intent.

3.
I thought of some ways to resolve the above issue:
a. br_switchdev_port_vlan_del will pass the default value ETH_P_8021Q, but
don't care about it in felix_vlan_del.
b. In felix_vlan_add and felix_vlan_del, enable or disable the switch's
enable_qinq only when vid is the ocelot_port's pvid.
c. Maybe I can use devlink to set swp0 and swp1 into different modes.
d. Let br_switchdev_port_vlan_del call vlan_vid_info_get to get the protocol
for a vid, but vlan_vid_info_get is static in vlan_core.c, so this needs
adding related functions in br_switchdev.c.

Any comments are welcome!

Thanks
Hongbo



Re: [PATCH v4 01/12] ASoC: qcom: Add common array to initialize soc based core clocks

2020-08-05 Thread Rohit Kumar

Thanks Stephen for reviewing.

On 8/6/2020 6:01 AM, Stephen Boyd wrote:

Quoting Rohit kumar (2020-07-22 03:31:44)

From: Ajit Pandey 

LPASS variants have their own SoC-specific clocks that need to be
enabled for MI2S audio support. Added a common variable in drvdata to
initialize such clocks using the bulk clk API. The clock names are
defined in variant-specific data and need to be fetched during init.

Why not just get all the clks and not even care about the names of them?
Use devm_clk_bulk_get_all() for that, unless some clks need to change
rates?


There is the ahbix clk, which needs its clock rate to be set. Please check
the below patch in the series for reference:

[PATCH v5 02/12] ASoC: qcom: lpass-cpu: Move ahbix clk to platform 
specific function


Thanks,

Rohit

--
Qualcomm INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of the Code Aurora Forum, hosted by the Linux Foundation.



Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings

2020-08-05 Thread Roman Gushchin
On Wed, Aug 05, 2020 at 08:01:33PM -0700, Hugh Dickins wrote:
> On Mon, 3 Aug 2020, Roman Gushchin wrote:
> > On Fri, Jul 31, 2020 at 07:17:05PM -0700, Hugh Dickins wrote:
> > > On Fri, 31 Jul 2020, Roman Gushchin wrote:
> > > > On Thu, Jul 30, 2020 at 09:06:55PM -0700, Hugh Dickins wrote:
> > > > > 
> > > > > Though another alternative did occur to me overnight: we could
> > > > > scrap the logged warning, and show "nr_whatever -53" as output
> > > > > from /proc/sys/vm/stat_refresh: that too would be acceptable
> > > > > to me, and you redirect to /dev/null.
> > > > 
> > > > It sounds like a good idea to me. Do you want me to prepare a patch?
> > > 
> > > Yes, if you like that one best, please do prepare a patch - thanks!
> > 
> > Hi Hugh,
> > 
> > I mastered a patch (attached below), but honestly I can't say I like it.
> > The resulting interface is confusing: we don't generally use sysctls to
> > print debug data and/or warnings.
> 
> Since you confessed to not liking it yourself, I paid it very little
> attention.  Yes, when I made that suggestion, I wasn't really thinking
> of how stat_refresh is a /proc/sys/vm sysctl thing; and I'm not at all
> sure how issuing output from a /proc file intended for input works out
> (perhaps there are plenty of good examples, and you followed one, but
> it smells fishy to me now).
> 
> > 
> > I thought about treating a write to this sysctls as setting the threshold,
> > so that "echo 0 > /proc/sys/vm/stat_refresh" would warn on all negative
> > entries, and "cat /proc/sys/vm/stat_refresh" would use the default threshold
> > as in my patch. But this breaks  to some extent the current ABI, as passing
> > an incorrect value will result in -EINVAL instead of passing (as now).
> 
> I expect we could handle that well enough, by more lenient validation
> of the input; though my comment above on output versus input sheds doubt.
> 
> > 
> > Overall I still think we shouldn't warn on any values inside the possible
> > range, as it's not an indication of any kind of error. The only reason
> > why we see some values going negative and some not, is that some of them
> > are updated more frequently than others, and some are bouncing around
> > zero, while other can't reach zero too easily (like the number of free 
> > pages).
> 
> We continue to disagree on that (and it amuses me that you who are so
> sure they can be ignored, cannot ignore them; whereas I who am so curious
> to investigate them, have not actually found the time to do so in years).
> It was looking as if nothing could satisfy us both, but...

I can only repeat my understanding here: with the current implementation
the measured number can vary in range of
  (true_value - zone_threshold * NR_CPUS,
   true_value + zone_threshold * NR_CPUS).
zone_threshold depends on the size of a zone and the number of CPUs,
but cannot exceed 125.

Of course, the measured numbers are most likely distributed somewhere
close to the real number, and reaching the distant ends of this range is
unlikely. But it's a question of probability.

So if the true value is close to 0, there is a high chance of getting
negative measured numbers. The bigger the value, the lower these
chances. And if it's bigger than the maximal drift, the chances are 0.

So we can be sure that a measured value can't go negative only if we know
for sure that the true number is bigger than zone_threshold * NR_CPUS.

You can, probably, say that if the chances of getting a negative value
are really, really low, it's better to spawn a warning rather than miss
a potential error. I'd happily agree, if we had a nice formula
to calculate the tolerance for a given probability. But if we treat
all negative numbers as warnings, we'll just end up with a lot of false
warnings.

> 
> > 
> > Actually, if someone wants to ensure that numbers are accurate,
> > we have to temporarily set the threshold to 0, then flush the percpu data
> > and only then check atomics. In the current design flushing percpu data
> > matters for only slowly updated counters, as all others will run away while
> > we're waiting for the flush. So if we're targeting some slowly updating
> > counters, maybe we should warn only on them being negative, Idk.
> 
> I was going to look into that angle, though it would probably add a little
> unjustifiable overhead to fast paths, and be rejected on that basis.

I'd expect it. What I think can be acceptable is to have different tolerance
for different counters, if there is a good reason to have more precise values
for some counters.
I did a similar thing in the "new slab controller" patchset for memcg
slab statistics, which required a different threshold because they are measured
in bytes (all other metrics were historically in pages).

> 
> But in going to do so, came up against an earlier comment of yours, of
> which I had misunderstood the significance. I had said and you replied:
> 
> > > nr_zone_write_pending: yes, I've looked at our machines, and see 

Re: [PATCH v17 01/21] mm/vmscan: remove unnecessary lruvec adding

2020-08-05 Thread Alex Shi



On 2020/7/25 8:59 PM, Alex Shi wrote:
> We don't have to add a freeable page into lru and then remove from it.
> This change saves a couple of actions and makes the moving more clear.
> 
> The SetPageLRU needs to be kept here for list intergrity.
> Otherwise:
>  #0 mave_pages_to_lru  #1 release_pages
>if (put_page_testzero())
>  if !put_page_testzero
>  !PageLRU //skip lru_lock
>list_add(&page->lru,)
>list_add(&page->lru,) //corrupt

The race comments should be corrected to this:
/*
 * The SetPageLRU needs to be kept here for list integrity.
 * Otherwise:
 *   #0 move_pages_to_lru        #1 release_pages
 *                               if !put_page_testzero
 *   if (put_page_testzero())
 *                               !PageLRU //skip lru_lock
 *   SetPageLRU()
 *   list_add(&page->lru,)
 *                               list_add(&page->lru,)
 */

> 
> [a...@linux-foundation.org: coding style fixes]
> Signed-off-by: Alex Shi 
> Cc: Andrew Morton 
> Cc: Johannes Weiner 
> Cc: Tejun Heo 
> Cc: Matthew Wilcox 
> Cc: Hugh Dickins 
> Cc: linux...@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  mm/vmscan.c | 37 -
>  1 file changed, 24 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 749d239c62b2..ddb29d813d77 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1856,26 +1856,29 @@ static unsigned noinline_for_stack 
> move_pages_to_lru(struct lruvec *lruvec,
>   while (!list_empty(list)) {
>   page = lru_to_page(list);
>   VM_BUG_ON_PAGE(PageLRU(page), page);
> + list_del(&page->lru);
>   if (unlikely(!page_evictable(page))) {
> - list_del(&page->lru);
>   spin_unlock_irq(&pgdat->lru_lock);
>   putback_lru_page(page);
>   spin_lock_irq(&pgdat->lru_lock);
>   continue;
>   }
> - lruvec = mem_cgroup_page_lruvec(page, pgdat);
>  
> + /*
> +  * The SetPageLRU needs to be kept here for list intergrity.
> +  * Otherwise:
> +  *   #0 mave_pages_to_lru #1 release_pages
> +  *if (put_page_testzero())
> +  *   if !put_page_testzero
> +  *  !PageLRU //skip lru_lock
>  *list_add(&page->lru,)
>  * list_add(&page->lru,) //corrupt
> +  */

/*
 * The SetPageLRU needs to be kept here for list integrity.
 * Otherwise:
 *   #0 move_pages_to_lru        #1 release_pages
 *                               if !put_page_testzero
 *   if (put_page_testzero())
 *                               !PageLRU //skip lru_lock
 *   SetPageLRU()
 *   list_add(&page->lru,)
 *                               list_add(&page->lru,)
 */

>   SetPageLRU(page);
> - lru = page_lru(page);
>  
> - nr_pages = hpage_nr_pages(page);
> - update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
> - list_move(&page->lru, &lruvec->lists[lru]);
> -
> - if (put_page_testzero(page)) {
> + if (unlikely(put_page_testzero(page))) {
>   __ClearPageLRU(page);
>   __ClearPageActive(page);
> - del_page_from_lru_list(page, lruvec, lru);
>  
>   if (unlikely(PageCompound(page))) {
>   spin_unlock_irq(&pgdat->lru_lock);
> @@ -1883,11 +1886,19 @@ static unsigned noinline_for_stack 
> move_pages_to_lru(struct lruvec *lruvec,
>   spin_lock_irq(&pgdat->lru_lock);
>   } else
>   list_add(&page->lru, &pages_to_free);
> - } else {
> - nr_moved += nr_pages;
> - if (PageActive(page))
> - workingset_age_nonresident(lruvec, nr_pages);
> +
> + continue;
>   }
> +
> + lruvec = mem_cgroup_page_lruvec(page, pgdat);
> + lru = page_lru(page);
> + nr_pages = hpage_nr_pages(page);
> +
> + update_lru_size(lruvec, lru, page_zonenum(page), nr_pages);
> + list_add(&page->lru, &lruvec->lists[lru]);
> + nr_moved += nr_pages;
> + if (PageActive(page))
> + workingset_age_nonresident(lruvec, nr_pages);
>   }
>  
>   /*
> 


Re: [PATCH] venus: core: add shutdown callback for venus

2020-08-05 Thread mansur

Hi Sai,


On 2020-06-24 12:17, Sai Prakash Ranjan wrote:

Hi Mansur,

On 2020-06-13 16:03, Mansur Alisha Shaik wrote:

After the SMMU translation is disabled in the
arm-smmu shutdown callback during reboot, if
any subsystems are still alive then the IOVAs
they are using will become PAs on the bus,
which may lead to a crash.

Below are the consumers of smmu from venus
arm-smmu: consumer: aa0.video-codec supplier=1500.iommu
arm-smmu: consumer: video-firmware.0 supplier=1500.iommu

So implement a shutdown callback, which detaches the iommu maps.

Change-Id: I0f0f331056e0b84b92f1d86f66618d4b1caaa24a
Signed-off-by: Mansur Alisha Shaik 
---
 drivers/media/platform/qcom/venus/core.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/media/platform/qcom/venus/core.c
b/drivers/media/platform/qcom/venus/core.c
index 30d4b9e..acf798c 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -371,6 +371,14 @@ static int venus_remove(struct platform_device 
*pdev)

return ret;
 }

+static void venus_core_shutdown(struct platform_device *pdev)
+{
+   int ret;
+
+   ret = venus_remove(pdev);
+   WARN_ON(ret < 0);


I don't think you should warn here, it's the shutdown path and you can't
do anything with this WARN, unlike the remove callback where you have to
be sure to clean up properly so that you are able to reload the module.
But if you still want a hint about this failure, then just add a dev_err()
to indicate the failure instead of a big stack trace spamming the kernel log.




Posted a V2 version adding dev_warn() on shutdown failure instead of
WARN_ON().

V2 version : https://lore.kernel.org/patchwork/patch/1284693/


Thanks,
Sai


---
Thanks,
Mansur


[PATCH V2] venus: core: add shutdown callback for venus

2020-08-05 Thread Mansur Alisha Shaik
After the SMMU translation is disabled in the
arm-smmu shutdown callback during reboot, if
any subsystems are still alive then the IOVAs
they are using will become PAs on the bus,
which may lead to a crash.

Below are the consumers of smmu from venus
arm-smmu: consumer: aa0.video-codec supplier=1500.iommu
arm-smmu: consumer: video-firmware.0 supplier=1500.iommu

So implement a shutdown callback, which detaches the iommu maps.

Change-Id: I0f0f331056e0b84b92f1d86f66618d4b1caaa24a
Signed-off-by: Mansur Alisha Shaik 
---
 drivers/media/platform/qcom/venus/core.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/media/platform/qcom/venus/core.c 
b/drivers/media/platform/qcom/venus/core.c
index 203c653..92aac06 100644
--- a/drivers/media/platform/qcom/venus/core.c
+++ b/drivers/media/platform/qcom/venus/core.c
@@ -341,6 +341,16 @@ static int venus_remove(struct platform_device *pdev)
return ret;
 }
 
+static void venus_core_shutdown(struct platform_device *pdev)
+{
+   struct venus_core *core = platform_get_drvdata(pdev);
+   int ret;
+
+   ret = venus_remove(pdev);
+   if (ret)
+   dev_warn(core->dev, "shutdown failed: %d\n", ret);
+}
+
 static __maybe_unused int venus_runtime_suspend(struct device *dev)
 {
struct venus_core *core = dev_get_drvdata(dev);
@@ -592,6 +602,7 @@ static struct platform_driver qcom_venus_driver = {
.of_match_table = venus_dt_match,
.pm = _pm_ops,
},
+   .shutdown = venus_core_shutdown,
 };
 module_platform_driver(qcom_venus_driver);
 
-- 
2.7.4



Re: [PATCH 0/2] locking/qspinlock: Break qspinlock_types.h header loop

2020-08-05 Thread Vineet Gupta
On 7/30/20 12:50 AM, Herbert Xu wrote:
> On Thu, Jul 30, 2020 at 10:47:16AM +0300, Andy Shevchenko wrote:
>> We may ask Synopsys folks to look at this as well.
>> Vineet, any ideas if we may unify ATOMIC64_INIT() across the architectures?
> I don't think there is any technical difficulty.  The custom
> atomic64_t simply adds an alignment requirement so the initialisor
> remains the same.

Exactly so.

FWIW, the alignment requirement exists because the ARC ABI allows 64-bit
data to be 32-bit aligned, since the hardware deals fine with 4-byte
alignment for the non-atomic double load/store LDD/STD instructions. The
64-bit alignment is, however, required for the atomic double load/store
LLOCKD/SCONDD instructions, hence the definition of the ARC atomic64_t.

-Vineet


Re: [PATCH v2 03/24] virtio: allow __virtioXX, __leXX in config space

2020-08-05 Thread Jason Wang



On 2020/8/5 7:45 PM, Michael S. Tsirkin wrote:

   #define virtio_cread(vdev, structname, member, ptr)  \
do {\
might_sleep();  \
/* Must match the member's type, and be integer */  \
-   if (!typecheck(typeof((((structname*)0)->member)), *(ptr))) \
+   if (!__virtio_typecheck(structname, member, *(ptr)))\
(*ptr) = 1; \

A silly question,  compare to using set()/get() directly, what's the value
of the accessors macro here?

Thanks

get/set don't convert to the native endian, I guess that's why
drivers use cread/cwrite. It is also nice that there's type
safety, checking the correct integer width is used.



Yes, but this is simply because a macro is used here, how about just 
doing things similar like virtio_cread_bytes():


static inline void virtio_cread(struct virtio_device *vdev,
                                unsigned int offset,
                                void *buf, size_t len)


And do the endian conversion inside?

Thanks








Re: linux-next: manual merge of the hmm tree with the kvm-ppc tree

2020-08-05 Thread Stephen Rothwell
Hi all,

On Thu, 30 Jul 2020 19:16:10 +1000 Stephen Rothwell  
wrote:
>
> Today's linux-next merge of the hmm tree got a conflict in:
> 
>   arch/powerpc/kvm/book3s_hv_uvmem.c
> 
> between commit:
> 
>   f1b87ea8784b ("KVM: PPC: Book3S HV: Move kvmppc_svm_page_out up")
> 
> from the kvm-ppc tree and commit:
> 
>   5143192cd410 ("mm/migrate: add a flags parameter to migrate_vma")
> 
> from the hmm tree.
> 
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
> 
> diff --cc arch/powerpc/kvm/book3s_hv_uvmem.c
> index 0d49e3425a12,6850bd04bcb9..
> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
> @@@ -496,94 -253,14 +496,95 @@@ unsigned long kvmppc_h_svm_init_start(s
>   return ret;
>   }
>   
>  -unsigned long kvmppc_h_svm_init_done(struct kvm *kvm)
>  +/*
>  + * Provision a new page on HV side and copy over the contents
>  + * from secure memory using UV_PAGE_OUT uvcall.
>  + * Caller must hold kvm->arch.uvmem_lock.
>  + */
>  +static int __kvmppc_svm_page_out(struct vm_area_struct *vma,
>  +unsigned long start,
>  +unsigned long end, unsigned long page_shift,
>  +struct kvm *kvm, unsigned long gpa)
>   {
>  -if (!(kvm->arch.secure_guest & KVMPPC_SECURE_INIT_START))
>  -return H_UNSUPPORTED;
>  +unsigned long src_pfn, dst_pfn = 0;
>  +struct migrate_vma mig;
>  +struct page *dpage, *spage;
>  +struct kvmppc_uvmem_page_pvt *pvt;
>  +unsigned long pfn;
>  +int ret = U_SUCCESS;
>   
>  -kvm->arch.secure_guest |= KVMPPC_SECURE_INIT_DONE;
>  -pr_info("LPID %d went secure\n", kvm->arch.lpid);
>  -return H_SUCCESS;
>  +memset(&mig, 0, sizeof(mig));
>  +mig.vma = vma;
>  +mig.start = start;
>  +mig.end = end;
>  +mig.src = &src_pfn;
>  +mig.dst = &dst_pfn;
> - mig.src_owner = &kvmppc_uvmem_pgmap;
> ++mig.pgmap_owner = &kvmppc_uvmem_pgmap;
> ++mig.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
>  +
>  +/* The requested page is already paged-out, nothing to do */
>  +if (!kvmppc_gfn_is_uvmem_pfn(gpa >> page_shift, kvm, NULL))
>  +return ret;
>  +
>  +ret = migrate_vma_setup(&mig);
>  +if (ret)
>  +return -1;
>  +
>  +spage = migrate_pfn_to_page(*mig.src);
>  +if (!spage || !(*mig.src & MIGRATE_PFN_MIGRATE))
>  +goto out_finalize;
>  +
>  +if (!is_zone_device_page(spage))
>  +goto out_finalize;
>  +
>  +dpage = alloc_page_vma(GFP_HIGHUSER, vma, start);
>  +if (!dpage) {
>  +ret = -1;
>  +goto out_finalize;
>  +}
>  +
>  +lock_page(dpage);
>  +pvt = spage->zone_device_data;
>  +pfn = page_to_pfn(dpage);
>  +
>  +/*
>  + * This function is used in two cases:
>  + * - When HV touches a secure page, for which we do UV_PAGE_OUT
>  + * - When a secure page is converted to shared page, we *get*
>  + *   the page to essentially unmap the device page. In this
>  + *   case we skip page-out.
>  + */
>  +if (!pvt->skip_page_out)
>  +ret = uv_page_out(kvm->arch.lpid, pfn << page_shift,
>  +  gpa, 0, page_shift);
>  +
>  +if (ret == U_SUCCESS)
>  +*mig.dst = migrate_pfn(pfn) | MIGRATE_PFN_LOCKED;
>  +else {
>  +unlock_page(dpage);
>  +__free_page(dpage);
>  +goto out_finalize;
>  +}
>  +
>  +migrate_vma_pages(&mig);
>  +
>  +out_finalize:
>  +migrate_vma_finalize(&mig);
>  +return ret;
>  +}
>  +
>  +static inline int kvmppc_svm_page_out(struct vm_area_struct *vma,
>  +  unsigned long start, unsigned long end,
>  +  unsigned long page_shift,
>  +  struct kvm *kvm, unsigned long gpa)
>  +{
>  +int ret;
>  +
>  +mutex_lock(&kvm->arch.uvmem_lock);
>  +ret = __kvmppc_svm_page_out(vma, start, end, page_shift, kvm, gpa);
>  +mutex_unlock(&kvm->arch.uvmem_lock);
>  +
>  +return ret;
>   }
>   
>   /*
> @@@ -744,7 -400,20 +745,8 @@@ static int kvmppc_svm_page_in(struct vm
>   mig.end = end;
>   mig.src = &src_pfn;
>   mig.dst = &dst_pfn;
> + mig.flags = MIGRATE_VMA_SELECT_SYSTEM;
>   
>  -/*
>  - * We come here with mmap_lock write lock held just for
>  - * ksm_madvise(), otherwise we only need read mmap_lock.
>  - * Hence downgrade to read lock once ksm_madvise() is done.
>  - */
>  -ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
>  -  MADV_UNMERGEABLE, &vma->vm_flags);
>  -mmap_write_downgrade(kvm->mm);
>  -*downgrade = true;
>  -if (ret)
>  -

Re: [PATCH 4/4] vhost: vdpa: report iova range

2020-08-05 Thread Jason Wang



On 2020/8/5 下午8:58, Michael S. Tsirkin wrote:

On Wed, Jun 17, 2020 at 11:29:47AM +0800, Jason Wang wrote:

This patch introduces a new ioctl for vhost-vdpa device that can
report the iova range by the device. For device that depends on
platform IOMMU, we fetch the iova range via DOMAIN_ATTR_GEOMETRY. For
devices that has its own DMA translation unit, we fetch it directly
from vDPA bus operation.

Signed-off-by: Jason Wang 
---
  drivers/vhost/vdpa.c | 27 +++
  include/uapi/linux/vhost.h   |  4 
  include/uapi/linux/vhost_types.h |  5 +
  3 files changed, 36 insertions(+)

diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
index 77a0c9fb6cc3..ad23e66cbf57 100644
--- a/drivers/vhost/vdpa.c
+++ b/drivers/vhost/vdpa.c
@@ -332,6 +332,30 @@ static long vhost_vdpa_set_config_call(struct vhost_vdpa 
*v, u32 __user *argp)
  
  	return 0;

  }
+
+static long vhost_vdpa_get_iova_range(struct vhost_vdpa *v, u32 __user *argp)
+{
+   struct iommu_domain_geometry geo;
+   struct vdpa_device *vdpa = v->vdpa;
+   const struct vdpa_config_ops *ops = vdpa->config;
+   struct vhost_vdpa_iova_range range;
+   struct vdpa_iova_range vdpa_range;
+
+   if (!ops->set_map && !ops->dma_map) {

Why not just check if (ops->get_iova_range) directly?



Because set_map || dma_map is a hint that the device has its own DMA
translation logic.


A device without get_iova_range does not necessarily mean it uses the
IOMMU driver.


Thanks








+   iommu_domain_get_attr(v->domain,
+ DOMAIN_ATTR_GEOMETRY, &geo);
+   range.start = geo.aperture_start;
+   range.end = geo.aperture_end;
+   } else {
+   vdpa_range = ops->get_iova_range(vdpa);
+   range.start = vdpa_range.start;
+   range.end = vdpa_range.end;
+   }
+
+   return copy_to_user(argp, &range, sizeof(range));
+
+}
+
  static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, unsigned int cmd,
   void __user *argp)
  {
@@ -442,6 +466,9 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
case VHOST_VDPA_SET_CONFIG_CALL:
r = vhost_vdpa_set_config_call(v, argp);
break;
+   case VHOST_VDPA_GET_IOVA_RANGE:
+   r = vhost_vdpa_get_iova_range(v, argp);
+   break;
default:
r = vhost_dev_ioctl(&v->vdev, cmd, argp);
if (r == -ENOIOCTLCMD)
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 0c2349612e77..850956980e27 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -144,4 +144,8 @@
  
  /* Set event fd for config interrupt*/

  #define VHOST_VDPA_SET_CONFIG_CALL_IOW(VHOST_VIRTIO, 0x77, int)
+
+/* Get the valid iova range */
+#define VHOST_VDPA_GET_IOVA_RANGE  _IOW(VHOST_VIRTIO, 0x78, \
+struct vhost_vdpa_iova_range)
  #endif
diff --git a/include/uapi/linux/vhost_types.h b/include/uapi/linux/vhost_types.h
index 669457ce5c48..4025b5a36177 100644
--- a/include/uapi/linux/vhost_types.h
+++ b/include/uapi/linux/vhost_types.h
@@ -127,6 +127,11 @@ struct vhost_vdpa_config {
__u8 buf[0];
  };
  
+struct vhost_vdpa_iova_range {

+   __u64 start;
+   __u64 end;
+};
+


Pls document fields. And I think first/last is a better API ...


  /* Feature bits */
  /* Log all write descriptors. Can be changed while device is active. */
  #define VHOST_F_LOG_ALL 26
--
2.20.1




[GIT PULL] erofs fixes for 5.9-rc1

2020-08-05 Thread Gao Xiang
Hi Linus,

Could you consider this pull request for 5.9-rc1?

This cycle mainly addresses an issue with some extended inodes at
designated locations, which can hardly be generated by the current mkfs
but needs to be handled at runtime anyway. The others are quite trivial ones.

All commits have been tested and have been in linux-next as well.
This merges cleanly with master.

Thanks,
Gao Xiang

The following changes since commit 92ed301919932f13b9172e525674157e983d:

  Linux 5.8-rc7 (2020-07-26 14:14:06 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs.git 
tags/erofs-for-5.9-rc1

for you to fetch changes up to 0e62ea33ac12ebde876b67eca113630805191a66:

  erofs: remove WQ_CPU_INTENSIVE flag from unbound wq's (2020-08-03 21:04:46 
+0800)


Changes since last update:

 - use HTTPS links instead of insecure HTTP ones;

 - fix crossing page boundary on specific extended inodes;

 - remove useless WQ_CPU_INTENSIVE flag for unbound wq;

 - minor cleanup.


Alexander A. Klimov (1):
  erofs: Replace HTTP links with HTTPS ones

Gao Xiang (3):
  erofs: fix extended inode could cross boundary
  erofs: fold in used-once helper erofs_workgroup_unfreeze_final()
  erofs: remove WQ_CPU_INTENSIVE flag from unbound wq's

 fs/erofs/compress.h |   2 +-
 fs/erofs/data.c |   2 +-
 fs/erofs/decompressor.c |   2 +-
 fs/erofs/dir.c  |   2 +-
 fs/erofs/erofs_fs.h |   2 +-
 fs/erofs/inode.c| 123 +++-
 fs/erofs/internal.h |   2 +-
 fs/erofs/namei.c|   2 +-
 fs/erofs/super.c|   2 +-
 fs/erofs/utils.c|  16 ++-
 fs/erofs/xattr.c|   2 +-
 fs/erofs/xattr.h|   2 +-
 fs/erofs/zdata.c|   6 +--
 fs/erofs/zdata.h|   2 +-
 fs/erofs/zmap.c |   2 +-
 fs/erofs/zpvec.h|   2 +-
 16 files changed, 100 insertions(+), 71 deletions(-)



Re: [PATCH 3/4] vdpa: get_iova_range() is mandatory for device specific DMA translation

2020-08-05 Thread Jason Wang



On 2020/8/5 8:55 PM, Michael S. Tsirkin wrote:

On Wed, Jun 17, 2020 at 11:29:46AM +0800, Jason Wang wrote:

In order to let userspace work correctly, get_iova_range() is a must
for the device that has its own DMA translation logic.

I guess you mean for a device.

However, in the absence of this op, I don't see what is wrong with just
assuming the device can access any address.



It's just to be safe; if you want, we can assume any address without this op.





Signed-off-by: Jason Wang 
---
  drivers/vdpa/vdpa.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/drivers/vdpa/vdpa.c b/drivers/vdpa/vdpa.c
index de211ef3738c..ab7af978ef70 100644
--- a/drivers/vdpa/vdpa.c
+++ b/drivers/vdpa/vdpa.c
@@ -82,6 +82,10 @@ struct vdpa_device *__vdpa_alloc_device(struct device 
*parent,
if (!!config->dma_map != !!config->dma_unmap)
goto err;
  
+	if ((config->dma_map || config->set_map) &&

+   !config->get_iova_range)
+   goto err;
+
err = -ENOMEM;
vdev = kzalloc(size, GFP_KERNEL);
if (!vdev)

What about devices using an IOMMU for translation?
IOMMUs generally have a limited IOVA range too, right?



See patch 4 which query the IOMMU geometry in this case:

+        iommu_domain_get_attr(v->domain,
+                  DOMAIN_ATTR_GEOMETRY, &geo);
+        range.start = geo.aperture_start;
+        range.end = geo.aperture_end;

Thanks







--
2.20.1




Re: WARNING in rxrpc_recvmsg

2020-08-05 Thread syzbot
syzbot suspects this issue was fixed by commit:

commit 65550098c1c4db528400c73acf3e46bfa78d9264
Author: David Howells 
Date:   Tue Jul 28 23:03:56 2020 +

rxrpc: Fix race between recvmsg and sendmsg on immediate call failure

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10bd3bcc90
start commit:   7cc2a8ea Merge tag 'block-5.8-2020-07-01' of git://git.ker..
git tree:   upstream
kernel config:  https://syzkaller.appspot.com/x/.config?x=7be693511b29b338
dashboard link: https://syzkaller.appspot.com/bug?extid=1a68d5c4e74edea44294
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=17a5022f10
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=150932a710

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: rxrpc: Fix race between recvmsg and sendmsg on immediate call failure

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


Re: [PATCH 1/4] vdpa: introduce config op to get valid iova range

2020-08-05 Thread Jason Wang



On 2020/8/5 8:51 PM, Michael S. Tsirkin wrote:

On Wed, Jun 17, 2020 at 11:29:44AM +0800, Jason Wang wrote:

This patch introduce a config op to get valid iova range from the vDPA
device.

Signed-off-by: Jason Wang
---
  include/linux/vdpa.h | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
index 239db794357c..b7633ed2500c 100644
--- a/include/linux/vdpa.h
+++ b/include/linux/vdpa.h
@@ -41,6 +41,16 @@ struct vdpa_device {
unsigned int index;
  };
  
+/**

+ * vDPA IOVA range - the IOVA range supported by the device
+ * @start: start of the IOVA range
+ * @end: end of the IOVA range
+ */
+struct vdpa_iova_range {
+   u64 start;
+   u64 end;
+};
+

This is ambiguous. Is end in the range or just behind it?



In the range.



How about first/last?



Sure.

Thanks










Re: [PATCH v2 22/24] vdpa_sim: fix endian-ness of config space

2020-08-05 Thread Jason Wang



On 2020/8/5 8:06 PM, Michael S. Tsirkin wrote:

On Wed, Aug 05, 2020 at 02:21:07PM +0800, Jason Wang wrote:

On 2020/8/4 上午5:00, Michael S. Tsirkin wrote:

VDPA sim accesses config space as native endian - this is
wrong since it's a modern device and actually uses LE.

It only supports modern guests so we could punt and
just force LE, but let's use the full virtio APIs since people
tend to copy/paste code, and this is not data path anyway.

Signed-off-by: Michael S. Tsirkin
---
   drivers/vdpa/vdpa_sim/vdpa_sim.c | 31 ++-
   1 file changed, 26 insertions(+), 5 deletions(-)

diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index a9bc5e0fb353..fa05e065ff69 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -24,6 +24,7 @@
   #include 
   #include 
   #include 
+#include 
   #include 
   #include 
   #include 
@@ -72,6 +73,23 @@ struct vdpasim {
u64 features;
   };
+/* TODO: cross-endian support */
+static inline bool vdpasim_is_little_endian(struct vdpasim *vdpasim)
+{
+   return virtio_legacy_is_little_endian() ||
+   (vdpasim->features & (1ULL << VIRTIO_F_VERSION_1));
+}
+
+static inline u16 vdpasim16_to_cpu(struct vdpasim *vdpasim, __virtio16 val)
+{
+   return __virtio16_to_cpu(vdpasim_is_little_endian(vdpasim), val);
+}
+
+static inline __virtio16 cpu_to_vdpasim16(struct vdpasim *vdpasim, u16 val)
+{
+   return __cpu_to_virtio16(vdpasim_is_little_endian(vdpasim), val);
+}
+
   static struct vdpasim *vdpasim_dev;
   static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
@@ -306,7 +324,6 @@ static const struct vdpa_config_ops vdpasim_net_config_ops;
   static struct vdpasim *vdpasim_create(void)
   {
-   struct virtio_net_config *config;
struct vdpasim *vdpasim;
struct device *dev;
int ret = -ENOMEM;
@@ -331,10 +348,7 @@ static struct vdpasim *vdpasim_create(void)
if (!vdpasim->buffer)
goto err_iommu;
-   config = &vdpasim->config;
-   config->mtu = 1500;
-   config->status = VIRTIO_NET_S_LINK_UP;
-   eth_random_addr(config->mac);
+   eth_random_addr(vdpasim->config.mac);
vringh_set_iotlb(&vdpasim->vqs[0].vring, vdpasim->iommu);
vringh_set_iotlb(&vdpasim->vqs[1].vring, vdpasim->iommu);
@@ -448,6 +462,7 @@ static u64 vdpasim_get_features(struct vdpa_device *vdpa)
   static int vdpasim_set_features(struct vdpa_device *vdpa, u64 features)
   {
struct vdpasim *vdpasim = vdpa_to_sim(vdpa);
+   struct virtio_net_config *config = &vdpasim->config;
/* DMA mapping must be done by driver */
if (!(features & (1ULL << VIRTIO_F_ACCESS_PLATFORM)))
@@ -455,6 +470,12 @@ static int vdpasim_set_features(struct vdpa_device *vdpa, 
u64 features)
vdpasim->features = features & vdpasim_features;
+   /* We only know whether guest is using the legacy interface here, so
+* that's the earliest we can set config fields.
+*/

We check whether or not ACCESS_PLATFORM is set before which is probably a
hint that only modern device is supported. So I wonder just force LE and
fail if VERSION_1 is not set is better?

Thanks

So how about I add a comment along the lines of

/*
  * vdpasim ATM requires VIRTIO_F_ACCESS_PLATFORM, so we don't need to
  * support legacy guests. Keep transitional device code around for
  * the benefit of people who might copy-and-paste this into transitional
  * device code.
  */



That's fine.

Thanks


Re: [PATCH v2 19/24] vdpa: make sure set_features in invoked for legacy

2020-08-05 Thread Jason Wang



On 2020/8/5 下午7:40, Michael S. Tsirkin wrote:

On Wed, Aug 05, 2020 at 02:14:07PM +0800, Jason Wang wrote:

On 2020/8/4 上午5:00, Michael S. Tsirkin wrote:

Some legacy guests just assume features are 0 after reset.
We detect that config space is accessed before features are
set and set features to 0 automatically.
Note: some legacy guests might not even access config space, if this is
reported in the field we might need to catch a kick to handle these.

I wonder whether it's easier to just support modern device?

Thanks

Well hardware vendors are I think interested in supporting legacy
guests. Limiting vdpa to modern only would make it uncompetitive.



My understanding is that, IOMMU_PLATFORM is mandatory for hardware vDPA 
to work. So it can only work for modern device ...


Thanks



Re: Is anyone else getting a bad signature from kernel.org's 5.8 sources+Greg's sign?

2020-08-05 Thread David Niklas
On Wed, 5 Aug 2020 18:36:08 -0700
Randy Dunlap  wrote:

> On 8/5/20 5:59 PM, David Niklas wrote:
> > Hello,
> > I downloaded the kernel sources from kernel.org using curl, then
> > opera, and finally lynx (to rule out an html parsing bug). I did the
> > same with the sign and I keep getting:
> > 
> > %  gpg2 --verify linux-5.8.tar.sign linux-5.8.tar.xz
> > gpg: Signature made Mon Aug  3 00:19:13 2020 EDT
> > gpg:using RSA key
> > 647F28654894E3BD457199BE38DBBDC86092693E gpg: BAD signature from
> > "Greg Kroah-Hartman " [unknown]
> > 
> > I did refresh all the keys just in case.
> > I believe this is important so I'm addressing this to the signer and
> > only CC'ing the list.
> > 
> > If I'm made some simple mistake, feel free to send SIG666 to my
> > terminal. I did re-read the man page just in case.  
> 
> It works successfully for me.
> 
> 
> from https://www.kernel.org/category/signatures.html::
> 
> 
> If you get "BAD signature"
> 
> If at any time you see "BAD signature" output from "gpg2 --verify",
> please check the following first:
> 
> Make sure that you are verifying the signature against the .tar
> version of the archive, not the compressed (.tar.xz) version. Make sure
> that the downloaded file is correct and not truncated or otherwise
> corrupted.
> 
> If you repeatedly get the same "BAD signature" output, please email
> helpd...@kernel.org, so we can investigate the problem.
> 
> 
> 

Many thanks. I've never seen a signature done that way before, but I
understand why you would do it that way.

David


Re: [GIT PULL] LEDs changes for v5.9-rc1

2020-08-05 Thread pr-tracker-bot
The pull request you sent on Wed, 5 Aug 2020 23:33:29 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git/ 
> tags/leds-5.9-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/e4a7b2dc35d9582c253cf5e6d6c3605aabc7284d

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH RESEND v1 06/11] perf mem: Support Arm SPE events

2020-08-05 Thread Leo Yan
This patch is to add Arm SPE events for perf memory profiling.  It
supports three Arm SPE events:

  - spe-load: memory event for only recording memory load ops;
  - spe-store: memory event for only recording memory store ops;
  - spe-ldst: memory event for recording memory load and store ops.

Signed-off-by: Leo Yan 
---
 tools/perf/arch/arm64/util/Build|  2 +-
 tools/perf/arch/arm64/util/mem-events.c | 46 +
 2 files changed, 47 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/util/mem-events.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index 5c13438c7bd4..cb18442e840f 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -8,4 +8,4 @@ perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
 perf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
  ../../arm/util/auxtrace.o \
  ../../arm/util/cs-etm.o \
- arm-spe.o
+ arm-spe.o mem-events.o
diff --git a/tools/perf/arch/arm64/util/mem-events.c 
b/tools/perf/arch/arm64/util/mem-events.c
new file mode 100644
index ..f23128db54fb
--- /dev/null
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "map_symbol.h"
+#include "mem-events.h"
+
+#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
+
+static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+   E("spe-load",   "arm_spe_0/ts_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0"),
+   E("spe-store",  "arm_spe_0/ts_enable=1,load_filter=0,store_filter=1/", "arm_spe_0"),
+   E("spe-ldst",   "arm_spe_0/ts_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0"),
+};
+
+static char mem_ld_name[100];
+static char mem_st_name[100];
+static char mem_ldst_name[100];
+
+struct perf_mem_event *perf_mem_events__ptr(int i)
+{
+   if (i >= PERF_MEM_EVENTS__MAX)
+   return NULL;
+
+   return &perf_mem_events[i];
+}
+
+char *perf_mem_events__name(int i)
+{
+   struct perf_mem_event *e = perf_mem_events__ptr(i);
+
+   if (i >= PERF_MEM_EVENTS__MAX)
+   return NULL;
+
+   if (i == PERF_MEM_EVENTS__LOAD) {
+   scnprintf(mem_ld_name, sizeof(mem_ld_name),
+ e->name, perf_mem_events__loads_ldlat);
+   return mem_ld_name;
+   }
+
+   if (i == PERF_MEM_EVENTS__STORE) {
+   scnprintf(mem_st_name, sizeof(mem_st_name), e->name);
+   return mem_st_name;
+   }
+
+   scnprintf(mem_ldst_name, sizeof(mem_ldst_name),
+ e->name, perf_mem_events__loads_ldlat);
+   return mem_ldst_name;
+}
-- 
2.17.1



[PATCH RESEND v1 05/11] perf mem: Support AUX trace

2020-08-05 Thread Leo Yan
Perf memory profiling doesn't support AUX trace data, so the tool cannot
receive the samples synthesized from hardware tracing data.  The Arm64
platform doesn't provide PMU events for memory load and store, but
Armv8's SPE is a good candidate for memory profiling: the hardware
tracer can record memory accesses with physical and virtual addresses
for different cache levels, and it also reports remote accesses and TLB
operations.

To allow the perf memory tool to support AUX trace, this patch adds
the aux callbacks for the session structure.  It passes the predefined
synth options (like llc, flc, remote_access, tlb, etc.) to notify the
tracing decoder to generate the corresponding samples.  This patch also
invokes the standard API perf_event__process_attr() to register sample
IDs into the evlist.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-mem.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index a7204634893c..6c8b5e956a4a 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -7,6 +7,7 @@
 #include "perf.h"
 
 #include 
+#include "util/auxtrace.h"
 #include "util/trace-event.h"
 #include "util/tool.h"
 #include "util/session.h"
@@ -249,6 +250,15 @@ static int process_sample_event(struct perf_tool *tool,
 
 static int report_raw_events(struct perf_mem *mem)
 {
+   struct itrace_synth_opts itrace_synth_opts = {
+   .set = true,
+   .flc = true,/* First level cache samples */
+   .llc = true,/* Last level cache samples */
+   .tlb = true,/* TLB samples */
+   .remote_access = true,  /* Remote access samples */
+   .default_no_sample = true,
+   };
+
struct perf_data data = {
.path  = input_name,
.mode  = PERF_DATA_MODE_READ,
@@ -261,6 +271,8 @@ static int report_raw_events(struct perf_mem *mem)
if (IS_ERR(session))
return PTR_ERR(session);
 
+   session->itrace_synth_opts = &itrace_synth_opts;
+
if (mem->cpu_list) {
ret = perf_session__cpu_bitmap(session, mem->cpu_list,
   mem->cpu_bitmap);
@@ -394,6 +406,19 @@ parse_mem_ops(const struct option *opt, const char *str, 
int unset)
return ret;
 }
 
+static int process_attr(struct perf_tool *tool __maybe_unused,
+   union perf_event *event,
+   struct evlist **pevlist)
+{
+   int err;
+
+   err = perf_event__process_attr(tool, event, pevlist);
+   if (err)
+   return err;
+
+   return 0;
+}
+
 int cmd_mem(int argc, const char **argv)
 {
struct stat st;
@@ -405,8 +430,12 @@ int cmd_mem(int argc, const char **argv)
.comm   = perf_event__process_comm,
.lost   = perf_event__process_lost,
.fork   = perf_event__process_fork,
+   .attr   = process_attr,
.build_id   = perf_event__process_build_id,
.namespaces = perf_event__process_namespaces,
+   .auxtrace_info  = perf_event__process_auxtrace_info,
+   .auxtrace   = perf_event__process_auxtrace,
+   .auxtrace_error = perf_event__process_auxtrace_error,
.ordered_events = true,
},
.input_name  = "perf.data",
-- 
2.17.1



[PATCH RESEND v1 04/11] perf mem: Only initialize memory event for recording

2020-08-05 Thread Leo Yan
It's needless to initialize memory events for perf reporting, so only
initialize memory events for perf recording.  This change allows perf
data to be parsed across platforms, e.g. the perf tool can output
reports even when the machine doesn't support any memory events.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-mem.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index bd4229ca3685..a7204634893c 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -78,6 +78,11 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
OPT_END()
};
 
+   if (perf_mem_events__init()) {
+   pr_err("failed: memory events not supported\n");
+   return -1;
+   }
+
argc = parse_options(argc, argv, options, record_mem_usage,
 PARSE_OPT_KEEP_UNKNOWN);
 
@@ -436,11 +441,6 @@ int cmd_mem(int argc, const char **argv)
NULL
};
 
-   if (perf_mem_events__init()) {
-   pr_err("failed: memory events not supported\n");
-   return -1;
-   }
-
argc = parse_options_subcommand(argc, argv, mem_options, 
mem_subcommands,
mem_usage, PARSE_OPT_KEEP_UNKNOWN);
 
-- 
2.17.1



[PATCH RESEND v1 11/11] perf arm-spe: Set sample's data source field

2020-08-05 Thread Leo Yan
The sample structure contains the field 'data_src', which carries the
detailed info for data operations, e.g. whether the operation is a load
or a store, on which cache level it hit, whether it's snooping or a
remote access, etc.  In the end, 'data_src' is parsed by the perf memory
tool to display human-readable strings.

This patch fills the 'data_src' field in the synthesized samples based
on the record type.  The supported types are: Level 1 dcache miss,
Level 1 dcache hit, last level cache miss, last level cache access,
TLB miss, TLB hit, and remote access to another socket.

Note, the current perf tool can display statistics for L1/L2/L3 caches
but it doesn't support a 'last level cache'.  To fit into the current
implementation, the 'data_src' field uses the L3 cache for the last
level cache.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 87 +++
 1 file changed, 79 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 74308a72b000..3114f059fc2f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -259,7 +259,7 @@ arm_spe_deliver_synth_event(struct arm_spe *spe,
 }
 
 static int arm_spe__synth_mem_sample(struct arm_spe_queue *speq,
-u64 spe_events_id)
+u64 spe_events_id, u64 data_src)
 {
struct arm_spe *spe = speq->spe;
struct arm_spe_record *record = &speq->decoder->record;
@@ -272,6 +272,7 @@ static int arm_spe__synth_mem_sample(struct arm_spe_queue 
*speq,
sample.stream_id = spe_events_id;
sample.addr = record->addr;
sample.phys_addr = record->phys_addr;
+   sample.data_src = data_src;
 
return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
@@ -293,21 +294,74 @@ static int arm_spe__synth_branch_sample(struct 
arm_spe_queue *speq,
return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
 
+static u64 arm_spe__synth_data_source(const struct arm_spe_record *record,
+ int type)
+{
+   union perf_mem_data_src data_src = { 0 };
+
+   if (record->op == ARM_SPE_LD)
+   data_src.mem_op = PERF_MEM_OP_LOAD;
+   else
+   data_src.mem_op = PERF_MEM_OP_STORE;
+
+   switch (type) {
+   case ARM_SPE_L1D_MISS:
+   data_src.mem_lvl_num = PERF_MEM_LVLNUM_L1;
+   data_src.mem_lvl = PERF_MEM_LVL_MISS | PERF_MEM_LVL_L1;
+   break;
+   case ARM_SPE_L1D_ACCESS:
+   data_src.mem_lvl_num = PERF_MEM_LVLNUM_L1;
+   data_src.mem_lvl = PERF_MEM_LVL_HIT | PERF_MEM_LVL_L1;
+   break;
+   case ARM_SPE_LLC_MISS:
+   data_src.mem_lvl_num = PERF_MEM_LVLNUM_L3;
+   data_src.mem_lvl = PERF_MEM_LVL_MISS | PERF_MEM_LVL_L3;
+   break;
+   case ARM_SPE_LLC_ACCESS:
+   data_src.mem_lvl_num = PERF_MEM_LVLNUM_L3;
+   data_src.mem_lvl = PERF_MEM_LVL_HIT | PERF_MEM_LVL_L3;
+   break;
+   case ARM_SPE_TLB_MISS:
+   data_src.mem_dtlb = PERF_MEM_TLB_WK | PERF_MEM_TLB_MISS;
+   break;
+   case ARM_SPE_TLB_ACCESS:
+   data_src.mem_dtlb = PERF_MEM_TLB_WK | PERF_MEM_TLB_HIT;
+   break;
+   case ARM_SPE_REMOTE_ACCESS:
+   data_src.mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
+   data_src.mem_lvl = PERF_MEM_LVL_HIT | PERF_MEM_LVL_REM_CCE1;
+   break;
+   default:
+   break;
+   }
+
+   return data_src.val;
+}
+
 static int arm_spe_sample(struct arm_spe_queue *speq)
 {
const struct arm_spe_record *record = &speq->decoder->record;
struct arm_spe *spe = speq->spe;
+   u64 data_src;
int err;
 
if (spe->sample_flc) {
if (record->type & ARM_SPE_L1D_MISS) {
-   err = arm_spe__synth_mem_sample(speq, spe->l1d_miss_id);
+   data_src = arm_spe__synth_data_source(record,
+ ARM_SPE_L1D_MISS);
+
+   err = arm_spe__synth_mem_sample(speq, spe->l1d_miss_id,
+   data_src);
if (err)
return err;
}
 
if (record->type & ARM_SPE_L1D_ACCESS) {
-   err = arm_spe__synth_mem_sample(speq, spe->l1d_access_id);
+   data_src = arm_spe__synth_data_source(record,
+ ARM_SPE_L1D_ACCESS);
+
+   err = arm_spe__synth_mem_sample(speq, spe->l1d_access_id,
+   data_src);
if (err)
return err;
}
@@ -315,13 +369,21 @@ static int arm_spe_sample(struct arm_spe_queue 

[PATCH RESEND v1 03/11] perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE

2020-08-05 Thread Leo Yan
The existing architectures that support perf memory profiling usually
provide two types of hardware events: load and store, so to profile
memory for both load and store operations, the tool uses these two
events at the same time.  But this is not valid for an AUX tracing
event: the same event can be used with different configurations for
memory operation filtering, e.g. the event can be used to trace only
memory loads, only memory stores, or both memory loads and stores.

This patch introduces a new event, PERF_MEM_EVENTS__LOAD_STORE, which
is used to support an event that can record both memory load and store
operations.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-mem.c | 11 +--
 tools/perf/util/mem-events.h |  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9a7df8d01296..bd4229ca3685 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -19,8 +19,9 @@
 #include "util/symbol.h"
 #include 
 
-#define MEM_OPERATION_LOAD 0x1
-#define MEM_OPERATION_STORE0x2
+#define MEM_OPERATION_LOAD 0x1
+#define MEM_OPERATION_STORE0x2
+#define MEM_OPERATION_LOAD_STORE   0x4
 
 struct perf_mem {
struct perf_tooltool;
@@ -97,6 +98,11 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
e->record = true;
}
 
+   if (mem->operation & MEM_OPERATION_LOAD_STORE) {
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
+   e->record = true;
+   }
+
e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
if (e->record)
rec_argv[i++] = "-W";
@@ -326,6 +332,7 @@ struct mem_mode {
 static const struct mem_mode mem_modes[]={
MEM_OPT("load", MEM_OPERATION_LOAD),
MEM_OPT("store", MEM_OPERATION_STORE),
+   MEM_OPT("ldst", MEM_OPERATION_LOAD_STORE),
MEM_END
 };
 
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 726a9c8103e4..5ef178278909 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -28,6 +28,7 @@ struct mem_info {
 enum {
PERF_MEM_EVENTS__LOAD,
PERF_MEM_EVENTS__STORE,
+   PERF_MEM_EVENTS__LOAD_STORE,
PERF_MEM_EVENTS__MAX,
 };
 
-- 
2.17.1



[PATCH RESEND v1 08/11] perf arm-spe: Save memory addresses in packet

2020-08-05 Thread Leo Yan
This patch is to save virtual and physical memory addresses in packet,
the address info can be used for generating memory samples.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 4 
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 93e063f22be5..373dc2d1cf06 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -162,6 +162,10 @@ static int arm_spe_read_record(struct arm_spe_decoder 
*decoder)
decoder->record.from_ip = ip;
else if (idx == SPE_ADDR_PKT_HDR_INDEX_BRANCH)
decoder->record.to_ip = ip;
+   else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT)
+   decoder->record.addr = ip;
+   else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS)
+   decoder->record.phys_addr = ip;
break;
case ARM_SPE_COUNTER:
break;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index a5111a8d4360..5acddfcffbd1 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -47,6 +47,8 @@ struct arm_spe_record {
u64 from_ip;
u64 to_ip;
u64 timestamp;
+   u64 addr;
+   u64 phys_addr;
 };
 
 struct arm_spe_insn;
-- 
2.17.1



[PATCH RESEND v1 10/11] perf arm-spe: Fill address info for memory samples

2020-08-05 Thread Leo Yan
Since the Arm SPE backend decoder now passes virtual and physical
address info through the packet, these addresses can be filled into
the synthesized samples and finally used for memory profiling.

To support memory related samples, this patch splits sample generation
into two functions:
  - arm_spe__synth_mem_sample() synthesizes memory accessing and
TLB related samples;
  - arm_spe__synth_branch_sample() synthesizes branch samples, which
are mainly for branch miss prediction.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 52 +++
 1 file changed, 31 insertions(+), 21 deletions(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index c2cf5058648f..74308a72b000 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -235,7 +235,6 @@ static void arm_spe_prep_sample(struct arm_spe *spe,
sample->cpumode = arm_spe_cpumode(spe, sample->ip);
sample->pid = speq->pid;
sample->tid = speq->tid;
-   sample->addr = record->to_ip;
sample->period = 1;
sample->cpu = speq->cpu;
 
@@ -259,18 +258,37 @@ arm_spe_deliver_synth_event(struct arm_spe *spe,
return ret;
 }
 
-static int
-arm_spe_synth_spe_events_sample(struct arm_spe_queue *speq,
-   u64 spe_events_id)
+static int arm_spe__synth_mem_sample(struct arm_spe_queue *speq,
+u64 spe_events_id)
 {
struct arm_spe *spe = speq->spe;
+   struct arm_spe_record *record = &speq->decoder->record;
+   union perf_event *event = speq->event_buf;
+   struct perf_sample sample = { 0 };
+
+   arm_spe_prep_sample(spe, speq, event, &sample);
+
+   sample.id = spe_events_id;
+   sample.stream_id = spe_events_id;
+   sample.addr = record->addr;
+   sample.phys_addr = record->phys_addr;
+
+   return arm_spe_deliver_synth_event(spe, speq, event, &sample);
+}
+
+static int arm_spe__synth_branch_sample(struct arm_spe_queue *speq,
+   u64 spe_events_id)
+{
+   struct arm_spe *spe = speq->spe;
+   struct arm_spe_record *record = &speq->decoder->record;
union perf_event *event = speq->event_buf;
-   struct perf_sample sample = { .ip = 0, };
+   struct perf_sample sample = { 0 };
 
arm_spe_prep_sample(spe, speq, event, &sample);
 
sample.id = spe_events_id;
sample.stream_id = spe_events_id;
+   sample.addr = record->to_ip;
 
return arm_spe_deliver_synth_event(spe, speq, event, &sample);
 }
@@ -283,15 +301,13 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
if (spe->sample_flc) {
if (record->type & ARM_SPE_L1D_MISS) {
-   err = arm_spe_synth_spe_events_sample(
-   speq, spe->l1d_miss_id);
+   err = arm_spe__synth_mem_sample(speq, spe->l1d_miss_id);
if (err)
return err;
}
 
if (record->type & ARM_SPE_L1D_ACCESS) {
-   err = arm_spe_synth_spe_events_sample(
-   speq, spe->l1d_access_id);
+   err = arm_spe__synth_mem_sample(speq, spe->l1d_access_id);
if (err)
return err;
}
@@ -299,15 +315,13 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
if (spe->sample_llc) {
if (record->type & ARM_SPE_LLC_MISS) {
-   err = arm_spe_synth_spe_events_sample(
-   speq, spe->llc_miss_id);
+   err = arm_spe__synth_mem_sample(speq, spe->llc_miss_id);
if (err)
return err;
}
 
if (record->type & ARM_SPE_LLC_ACCESS) {
-   err = arm_spe_synth_spe_events_sample(
-   speq, spe->llc_access_id);
+   err = arm_spe__synth_mem_sample(speq, spe->llc_access_id);
if (err)
return err;
}
@@ -315,31 +329,27 @@ static int arm_spe_sample(struct arm_spe_queue *speq)
 
if (spe->sample_tlb) {
if (record->type & ARM_SPE_TLB_MISS) {
-   err = arm_spe_synth_spe_events_sample(
-   speq, spe->tlb_miss_id);
+   err = arm_spe__synth_mem_sample(speq, spe->tlb_miss_id);
if (err)
return err;
}
 
if (record->type & ARM_SPE_TLB_ACCESS) {
-   err = arm_spe_synth_spe_events_sample(
-   speq, spe->tlb_access_id);
+   err = arm_spe__synth_mem_sample(speq, spe->tlb_access_id);

Re: [git pull] drm next for 5.9-rc1

2020-08-05 Thread pr-tracker-bot
The pull request you sent on Thu, 6 Aug 2020 11:07:02 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-next-2020-08-06

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/8186749621ed6b8fc42644c399e8c755a2b6f630

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH RESEND v1 09/11] perf arm-spe: Store operation types in packet

2020-08-05 Thread Leo Yan
This patch stores the operation type in the packet structure; it can
be used by the frontend to generate memory access info for samples.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 11 +++
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h |  6 ++
 2 files changed, 17 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 373dc2d1cf06..cba394784b0d 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -172,6 +172,17 @@ static int arm_spe_read_record(struct arm_spe_decoder 
*decoder)
case ARM_SPE_CONTEXT:
break;
case ARM_SPE_OP_TYPE:
+   /*
+* When operation type packet header's class equals 1,
+* the payload's least significant bit (LSB) indicates
+* the operation type: load/swap or store.
+*/
+   if (idx == 1) {
+   if (payload & 0x1)
+   decoder->record.op = ARM_SPE_ST;
+   else
+   decoder->record.op = ARM_SPE_LD;
+   }
break;
case ARM_SPE_EVENTS:
if (payload & BIT(EV_L1D_REFILL))
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index 5acddfcffbd1..f23188282ef0 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -41,9 +41,15 @@ enum arm_spe_sample_type {
ARM_SPE_REMOTE_ACCESS   = 1 << 7,
 };
 
+enum arm_spe_op_type {
+   ARM_SPE_LD  = 1 << 0,
+   ARM_SPE_ST  = 1 << 1,
+};
+
 struct arm_spe_record {
enum arm_spe_sample_type type;
int err;
+   u32 op;
u64 from_ip;
u64 to_ip;
u64 timestamp;
-- 
2.17.1



[PATCH RESEND v1 07/11] perf arm-spe: Enable attribution PERF_SAMPLE_DATA_SRC

2020-08-05 Thread Leo Yan
This patch enables the attribute PERF_SAMPLE_DATA_SRC for the perf
data; when decoding the tracing data, it tells the tool that the data
contains memory samples.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 3882a5360ada..c2cf5058648f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -803,7 +803,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct 
perf_session *session)
attr.type = PERF_TYPE_HARDWARE;
attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
-   PERF_SAMPLE_PERIOD;
+   PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
if (spe->timeless_decoding)
attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
else
-- 
2.17.1



[PATCH RESEND v1 01/11] perf mem: Search event name with more flexible path

2020-08-05 Thread Leo Yan
The perf tool searches for memory event names under the folder
'/sys/devices/cpu/events/'; this limits the selection of memory
profiling events to those under this folder.  Thus it's impossible to
use any other event as a memory event if it is not under this specific
folder, e.g. the Arm SPE hardware event is not located in
'/sys/devices/cpu/events/' so it cannot be enabled for memory profiling.

This patch changes the search folder from '/sys/devices/cpu/events/' to
'/sys/devices', which gives the flexibility to find events that can be
used for memory profiling.

Signed-off-by: Leo Yan 
---
 tools/perf/util/mem-events.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index ea0af0bc4314..35c8d175a9d2 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -18,8 +18,8 @@ unsigned int perf_mem_events__loads_ldlat = 30;
 #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
 
 struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
-   E("ldlat-loads","cpu/mem-loads,ldlat=%u/P", "mem-loads"),
-   E("ldlat-stores",   "cpu/mem-stores/P", "mem-stores"),
+   E("ldlat-loads","cpu/mem-loads,ldlat=%u/P", "cpu/events/mem-loads"),
+   E("ldlat-stores",   "cpu/mem-stores/P", "cpu/events/mem-stores"),
 };
 #undef E
 
@@ -93,7 +93,7 @@ int perf_mem_events__init(void)
struct perf_mem_event *e = &perf_mem_events[j];
struct stat st;
 
-   scnprintf(path, PATH_MAX, "%s/devices/cpu/events/%s",
+   scnprintf(path, PATH_MAX, "%s/devices/%s",
  mnt, e->sysfs_name);
 
if (!stat(path, &st))
-- 
2.17.1



[PATCH RESEND v1 00/11] perf mem: Support AUX trace and Arm SPE

2020-08-05 Thread Leo Yan
This patch set is to support AUX trace and Arm SPE as the first enabled
hardware tracing for Perf memory tool.

Patches 01 ~ 04 are preparation patches which mainly resolve an issue
with memory events: the existing code hard-codes the memory events
based on the x86 and PowerPC architectures, so these patches extend
the code to support more flexible memory event names, and introduce
weak functions to allow every architecture to define its own memory
event structure and to return the event pointer and name respectively.

Patch 05 is used to extend Perf memory tool to support AUX trace.

Patches 06 ~ 11 support Arm SPE with the perf memory tool.  First they
register SPE events as memory events, then they extend the SPE packet
to pass address info and operation types, and also set the 'data_src'
field to allow the tool to display readable strings in the result.

This patch set has been tested on the ARMv8 Hisilicon D06 platform.  I
noted that the 'data object' cannot be displayed properly; this should
be another issue that needs to be checked separately.  Below is the
testing result:

# Samples: 73  of event 'l1d-miss'
# Total weight : 73
# Sort order   : 
local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead   Samples  Local Weight  Memory access Symbol
  Shared Object  Data Symbol
   Data Object Snoop TLB access 
 Locked
#         
..  .  
  
..    ..  ..
#
 2.74% 2  0 L1 or L1 miss [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027aacb08a8   
 [unknown]   N/A   N/A  
   No
 2.74% 2  0 L1 or L1 miss [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027be6488a8   
 [unknown]   N/A   N/A  
   No
 2.74% 2  0 L1 or L1 miss [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027c432f8a8   
 [unknown]   N/A   N/A  
   No
 1.37% 1  0 L1 or L1 miss [k] 
__arch_copy_to_user [kernel.kallsyms]  [k] 0x0027a65352a0   
 [unknown]   N/A   N/A  
   No
 1.37% 1  0 L1 or L1 miss [k] 
__d_lookup_rcu  [kernel.kallsyms]  [k] 0x0027d3cbf468   
 [unknown]   N/A   N/A  
   No
 1.37% 1  0 L1 or L1 miss [k] 
__d_lookup_rcu  [kernel.kallsyms]  [k] 0x0027d8f44490   
 [unknown]   N/A   N/A  
   No
 [...]


# Samples: 101  of event 'l1d-access'
# Total weight : 101
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead  Samples  Local Weight  Memory access  Symbol                              Shared Object      Data Symbol            Data Object             Snoop  TLB access  Locked
# ........  .......  ............  .............  ..................................  .................  .....................  ......................  .....  ..........  ......
#
   2.97%    3        0             L1 or L1 hit   [k] perf_event_mmap                 [kernel.kallsyms]  [k] perf_swevent+0x5c  [kernel.kallsyms].data  N/A    N/A         No
   1.98%    2        0             L1 or L1 hit   [k] kmem_cache_alloc                [kernel.kallsyms]  [k] 0x2027af40e3d0     [unknown]               N/A    N/A         No
   1.98%    2        0             L1 or L1 hit   [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027aacb08a8     [unknown]               N/A    N/A         No
   1.98%    2        0             L1 or L1 hit   [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027be6488a8     [unknown]               N/A    N/A         No
   1.98%    2        0             L1 or L1 hit   [k] perf_iterate_ctx.constprop.151

[PATCH RESEND v1 02/11] perf mem: Introduce weak function perf_mem_events__ptr()

2020-08-05 Thread Leo Yan
Different architectures might use different events or different event
parameters for memory profiling.  This patch introduces the weak function
perf_mem_events__ptr(), which returns the architecture-specific memory
event.

After the function perf_mem_events__ptr() is introduced, the variable
'perf_mem_events' can be accessed through this new function; so mark
the variable as 'static', which allows each architecture to
define its own memory event array.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-c2c.c | 18 --
 tools/perf/builtin-mem.c | 21 ++---
 tools/perf/util/mem-events.c | 26 +++---
 tools/perf/util/mem-events.h |  2 +-
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 5938b100eaf4..88e68f36aa62 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2914,6 +2914,7 @@ static int perf_c2c__record(int argc, const char **argv)
int ret;
bool all_user = false, all_kernel = false;
bool event_set = false;
+   struct perf_mem_event *e;
struct option options[] = {
OPT_CALLBACK('e', "event", &event_set, "event",
 "event selector. Use 'perf mem record -e list' to list 
available events",
@@ -2941,11 +2942,15 @@ static int perf_c2c__record(int argc, const char **argv)
rec_argv[i++] = "record";
 
if (!event_set) {
-   perf_mem_events[PERF_MEM_EVENTS__LOAD].record  = true;
-   perf_mem_events[PERF_MEM_EVENTS__STORE].record = true;
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   e->record = true;
+
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+   e->record = true;
}
 
-   if (perf_mem_events[PERF_MEM_EVENTS__LOAD].record)
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   if (e->record)
rec_argv[i++] = "-W";
 
rec_argv[i++] = "-d";
@@ -2953,12 +2958,13 @@ static int perf_c2c__record(int argc, const char **argv)
rec_argv[i++] = "--sample-cpu";
 
for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-   if (!perf_mem_events[j].record)
+   e = perf_mem_events__ptr(j);
+   if (!e->record)
continue;
 
-   if (!perf_mem_events[j].supported) {
+   if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
-  perf_mem_events[j].name);
+  perf_mem_events__name(j));
free(rec_argv);
return -1;
}
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 3523279af6af..9a7df8d01296 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -64,6 +64,7 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
const char **rec_argv;
int ret;
bool all_user = false, all_kernel = false;
+   struct perf_mem_event *e;
struct option options[] = {
OPT_CALLBACK('e', "event", &mem->operation, "event",
 "event selector. use 'perf mem record -e list' to list 
available events",
@@ -86,13 +87,18 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
 
rec_argv[i++] = "record";
 
-   if (mem->operation & MEM_OPERATION_LOAD)
-   perf_mem_events[PERF_MEM_EVENTS__LOAD].record = true;
+   if (mem->operation & MEM_OPERATION_LOAD) {
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   e->record = true;
+   }
 
-   if (mem->operation & MEM_OPERATION_STORE)
-   perf_mem_events[PERF_MEM_EVENTS__STORE].record = true;
+   if (mem->operation & MEM_OPERATION_STORE) {
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+   e->record = true;
+   }
 
-   if (perf_mem_events[PERF_MEM_EVENTS__LOAD].record)
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   if (e->record)
rec_argv[i++] = "-W";
 
rec_argv[i++] = "-d";
@@ -101,10 +107,11 @@ static int __cmd_record(int argc, const char **argv, 
struct perf_mem *mem)
rec_argv[i++] = "--phys-data";
 
for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-   if (!perf_mem_events[j].record)
+   e = perf_mem_events__ptr(j);
+   if (!e->record)
continue;
 
-   if (!perf_mem_events[j].supported) {
+   if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
   perf_mem_events__name(j));
free(rec_argv);
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 35c8d175a9d2..7a5a0d699e27 100644
--- a/tools/perf/util/mem-events.c
+++ 

drivers/net/ethernet/xilinx/ll_temac_main.c:93:2: warning: Non-boolean value returned from function returning bool

2020-08-05 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   fffe3ae0ee84e25d2befe2ae59bc32aa2b6bc77b
commit: e8b6c54f6d57822e228027d41a1edb317034a08c net: xilinx: temac: Relax 
Kconfig dependencies
date:   4 months ago
compiler: ia64-linux-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


cppcheck warnings: (new ones prefixed by >>)

>> drivers/net/ethernet/xilinx/ll_temac_main.c:93:2: warning: Non-boolean value 
>> returned from function returning bool [returnNonBoolInBooleanFunction]
return temac_ior(lp, XTE_RDY0_OFFSET) & XTE_RDY0_HARD_ACS_RDY_MASK;
^
>> drivers/net/ethernet/xilinx/ll_temac_main.c:469:44: warning: Shifting signed 
>> 32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
 temac_indirect_out32(lp, XTE_AFM_OFFSET, XTE_AFM_EPPRM_MASK);
  ^
   drivers/net/ethernet/xilinx/ll_temac_main.c:505:8: warning: Shifting signed 
32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
& XTE_AFM_EPPRM_MASK) {
  ^
   drivers/net/ethernet/xilinx/ll_temac_main.c:579:10: warning: Shifting signed 
32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
 .m_or =XTE_AFM_EPPRM_MASK,
^
   drivers/net/ethernet/xilinx/ll_temac_main.c:637:44: warning: Shifting signed 
32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
temac_indirect_out32(lp, XTE_RXC1_OFFSET, XTE_RXC1_RXRST_MASK);
  ^
   drivers/net/ethernet/xilinx/ll_temac_main.c:639:52: warning: Shifting signed 
32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
while (temac_indirect_in32(lp, XTE_RXC1_OFFSET) & XTE_RXC1_RXRST_MASK) {
  ^
   drivers/net/ethernet/xilinx/ll_temac_main.c:649:43: warning: Shifting signed 
32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
temac_indirect_out32(lp, XTE_TXC_OFFSET, XTE_TXC_TXRST_MASK);
 ^
   drivers/net/ethernet/xilinx/ll_temac_main.c:651:51: warning: Shifting signed 
32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
while (temac_indirect_in32(lp, XTE_TXC_OFFSET) & XTE_TXC_TXRST_MASK) {
 ^
   drivers/net/ethernet/xilinx/ll_temac_main.c:725:33: warning: Shifting signed 
32-bit value by 31 bits is undefined behaviour [shiftTooManyBitsSigned]
 case SPEED_1000: mii_speed |= XTE_EMCFG_LINKSPD_1000; break;
   ^

vim +93 drivers/net/ethernet/xilinx/ll_temac_main.c

92744989533cbe drivers/net/ll_temac_main.c Grant Likely 2009-04-25  90  
1bd33bf0fe6d30 drivers/net/ethernet/xilinx/ll_temac_main.c Esben Haabendal 2019-05-23  91  static bool hard_acs_rdy(struct temac_local *lp)
1bd33bf0fe6d30 drivers/net/ethernet/xilinx/ll_temac_main.c Esben Haabendal 2019-05-23  92  {
1bd33bf0fe6d30 drivers/net/ethernet/xilinx/ll_temac_main.c Esben Haabendal 2019-05-23 @93   return temac_ior(lp, XTE_RDY0_OFFSET) & XTE_RDY0_HARD_ACS_RDY_MASK;
1bd33bf0fe6d30 drivers/net/ethernet/xilinx/ll_temac_main.c Esben Haabendal 2019-05-23  94  }
1bd33bf0fe6d30 drivers/net/ethernet/xilinx/ll_temac_main.c Esben Haabendal 2019-05-23  95  

:: The code at line 93 was first introduced by commit
:: 1bd33bf0fe6d3012410db0302187199871b510a0 net: ll_temac: Prepare indirect register access for multicast support

:: TO: Esben Haabendal 
:: CC: David S. Miller 

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [PATCH v2] mm: vmstat: fix /proc/sys/vm/stat_refresh generating false warnings

2020-08-05 Thread Hugh Dickins
On Mon, 3 Aug 2020, Roman Gushchin wrote:
> On Fri, Jul 31, 2020 at 07:17:05PM -0700, Hugh Dickins wrote:
> > On Fri, 31 Jul 2020, Roman Gushchin wrote:
> > > On Thu, Jul 30, 2020 at 09:06:55PM -0700, Hugh Dickins wrote:
> > > > 
> > > > Though another alternative did occur to me overnight: we could
> > > > scrap the logged warning, and show "nr_whatever -53" as output
> > > > from /proc/sys/vm/stat_refresh: that too would be acceptable
> > > > to me, and you redirect to /dev/null.
> > > 
> > > It sounds like a good idea to me. Do you want me to prepare a patch?
> > 
> > Yes, if you like that one best, please do prepare a patch - thanks!
> 
> Hi Hugh,
> 
> I mastered a patch (attached below), but honestly I can't say I like it.
> The resulting interface is confusing: we don't generally use sysctls to
> print debug data and/or warnings.

Since you confessed to not liking it yourself, I paid it very little
attention.  Yes, when I made that suggestion, I wasn't really thinking
of how stat_refresh is a /proc/sys/vm sysctl thing; and I'm not at all
sure how issuing output from a /proc file intended for input works out
(perhaps there are plenty of good examples, and you followed one, but
it smells fishy to me now).

> 
> I thought about treating a write to this sysctl as setting the threshold,
> so that "echo 0 > /proc/sys/vm/stat_refresh" would warn on all negative
> entries, and "cat /proc/sys/vm/stat_refresh" would use the default threshold
> as in my patch. But this breaks  to some extent the current ABI, as passing
> an incorrect value will result in -EINVAL instead of passing (as now).

I expect we could handle that well enough, by more lenient validation
of the input; though my comment above on output versus input sheds doubt.

> 
> Overall I still think we shouldn't warn on any values inside the possible
> range, as it's not an indication of any kind of error. The only reason
> why we see some values going negative and some not, is that some of them
> are updated more frequently than others, and some are bouncing around
> zero, while other can't reach zero too easily (like the number of free pages).

We continue to disagree on that (and it amuses me that you who are so
sure they can be ignored, cannot ignore them; whereas I who am so curious
to investigate them, have not actually found the time to do so in years).
It was looking as if nothing could satisfy us both, but...

> 
> Actually, if someone wants to ensure that numbers are accurate,
> we have to temporarily set the threshold to 0, then flush the percpu data
> and only then check atomics. In the current design flushing percpu data
> matters for only slowly updated counters, as all others will run away while
> we're waiting for the flush. So if we're targeting some slowly updating
> counters, maybe we should warn only on them being negative, Idk.

I was going to look into that angle, though it would probably add a little
unjustifiable overhead to fast paths, and be rejected on that basis.

But in going to do so, came up against an earlier comment of yours, of
which I had misunderstood the significance. I had said and you replied:

> > nr_zone_write_pending: yes, I've looked at our machines, and see that
> > showing up for us too (-49 was the worst I saw).  Not at all common,
> > but seen.  And not followed by increasingly worse numbers, so a state
> > that corrects itself.  nr_dirty too (fewer instances, bigger numbers);
> > but never nr_writeback, which you'd expect to go along with those.
> 
> NR_DIRTY and NR_WRITEBACK are node counters, so we don't check them?

Wow. Now I see what you were pointing out: when v4.8's 75ef71840539
("mm, vmstat: add infrastructure for per-node vmstats") went in, it
missed updating vmstat_refresh() to check all the NR_VM_NODE_STAT items.

And I've never noticed, and have interpreted its silence on those items
as meaning they're all good (and the nr_dirty ones I mentioned above,
must have been from residual old kernels, hence the fewer instances).
I see the particularly tricky NR_ISOLATED ones are in that category.
Maybe they are all good, but I have been mistaken.

I shall certainly want to reintroduce those stats to checking for
negatives, even if it's in a patch that never earns your approval,
and just ends up kept internal for debugging.  But equally certainly,
I must not suddenly reintroduce that checking without gaining some
experience of it (and perhaps getting as irritated as you by more
transient negatives).

I said earlier that I'd prefer you to rip out all that checking for
negatives, rather than retaining it with the uselessly over-generous
125 * nr_cpus leeway.  Please, Roman, would you send Andrew a patch
doing that, to replace the patch in this thread?  Or if you prefer,
I can do so.

Thanks,
Hugh


RE: [PATCH] ath10k: Fix the size used in a 'dma_free_coherent()' call in an error handling path

2020-08-05 Thread Rakesh Pillai



> -Original Message-
> From: Christophe JAILLET 
> Sent: Sunday, August 2, 2020 5:52 PM
> To: kv...@codeaurora.org; da...@davemloft.net; k...@kernel.org;
> pill...@codeaurora.org
> Cc: ath...@lists.infradead.org; linux-wirel...@vger.kernel.org;
> net...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-
> janit...@vger.kernel.org; Christophe JAILLET
> 
> Subject: [PATCH] ath10k: Fix the size used in a 'dma_free_coherent()' call
in
> an error handling path
> 
> Update the size used in 'dma_free_coherent()' in order to match the one
> used in the corresponding 'dma_alloc_coherent()'.
> 
> Fixes: 1863008369ae ("ath10k: fix shadow register implementation for
> WCN3990")
> Signed-off-by: Christophe JAILLET 
> ---
> This patch looks obvious to me, but commit 1863008369ae looks also simple.
> So it is surprising that such a "typo" slipped in.

Reviewed-by: Rakesh Pillai  

> ---
>  drivers/net/wireless/ath/ath10k/ce.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/ath/ath10k/ce.c
> b/drivers/net/wireless/ath/ath10k/ce.c
> index 294fbc1e89ab..e6e0284e4783 100644
> --- a/drivers/net/wireless/ath/ath10k/ce.c
> +++ b/drivers/net/wireless/ath/ath10k/ce.c
> @@ -1555,7 +1555,7 @@ ath10k_ce_alloc_src_ring(struct ath10k *ar,
> unsigned int ce_id,
>   ret = ath10k_ce_alloc_shadow_base(ar, src_ring, nentries);
>   if (ret) {
>   dma_free_coherent(ar->dev,
> -   (nentries * sizeof(struct ce_desc_64) +
> +   (nentries * sizeof(struct ce_desc) +
>  CE_DESC_RING_ALIGN),
> src_ring->base_addr_owner_space_unaligned,
> base_addr);
> --
> 2.25.1




cma_alloc(), add sleep-and-retry for temporary page pinning

2020-08-05 Thread Chris Goldsworthy
On mobile devices, failure to allocate from a CMA area constitutes a
functional failure.  Sometimes during CMA allocations, we have observed
that pages in a CMA area allocated through alloc_pages(), that we're trying
to migrate away to make room for a CMA allocation, are temporarily pinned.
This temporary pinning can occur when a process that owns the pinned page
is being forked (the example is explained further in the commit text).
This patch addresses this issue by adding a sleep-and-retry loop in
cma_alloc().  There's another example we know of similar to the above that
occurs during exit_mmap() (in zap_pte_range() specifically), but I need to
determine if this is still relevant today.



[PATCH] mm: cma: retry allocations in cma_alloc

2020-08-05 Thread Chris Goldsworthy
CMA allocations will fail if 'pinned' pages are in a CMA area, since we
cannot migrate pinned pages. The _refcount of a struct page being greater
than _mapcount for that page can cause pinning for anonymous pages.  This
is because try_to_unmap(), which (1) is called in the CMA allocation path,
and (2) decrements both _refcount and _mapcount for a page, will stop
unmapping a page from VMAs once the _mapcount for a page reaches 0.  This
implies that after try_to_unmap() has finished successfully for a page
where _recount > _mapcount, that _refcount will be greater than 0.  Later
in the CMA allocation path in migrate_page_move_mapping(), we will have one
more reference count than intended for anonymous pages, meaning the
allocation will fail for that page.

One example of where _refcount can be greater than _mapcount for a page we
would not expect to be pinned is inside of copy_one_pte(), which is called
during a fork. For ptes for which pte_present(pte) == true, copy_one_pte()
will increment the _refcount field followed by the  _mapcount field of a
page. If the process doing copy_one_pte() is context switched out after
incrementing _refcount but before incrementing _mapcount, then the page
will be temporarily pinned.

So, inside of cma_alloc(), instead of giving up when alloc_contig_range()
returns -EBUSY after having scanned a whole CMA-region bitmap, perform
retries with sleeps to give the system an opportunity to unpin any pinned
pages.

Signed-off-by: Chris Goldsworthy 
Co-developed-by: Susheel Khiani 
Signed-off-by: Susheel Khiani 
Co-developed-by: Vinayak Menon 
Signed-off-by: Vinayak Menon 
---
 mm/cma.c | 24 ++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 7f415d7..7b85fe6 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "cma.h"
@@ -418,6 +419,8 @@ struct page *cma_alloc(struct cma *cma, size_t count, 
unsigned int align,
size_t i;
struct page *page = NULL;
int ret = -ENOMEM;
+   int num_attempts = 0;
+   int max_retries = 5;
 
if (!cma || !cma->count || !cma->bitmap)
return NULL;
@@ -442,8 +445,25 @@ struct page *cma_alloc(struct cma *cma, size_t count, 
unsigned int align,
bitmap_maxno, start, bitmap_count, mask,
offset);
if (bitmap_no >= bitmap_maxno) {
-   mutex_unlock(&cma->lock);
-   break;
+   if ((num_attempts < max_retries) && (ret == -EBUSY)) {
+   mutex_unlock(&cma->lock);
+
+   /*
+* Page may be momentarily pinned by some other
+* process which has been scheduled out, e.g.
+* in exit path, during unmap call, or process
+* fork and so cannot be freed there. Sleep
+* for 100ms and retry the allocation.
+*/
+   start = 0;
+   ret = -ENOMEM;
+   msleep(100);
+   num_attempts++;
+   continue;
+   } else {
+   mutex_unlock(&cma->lock);
+   break;
+   }
}
bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
/*
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH v3 2/4] fpga: dfl: map feature mmio resources in their own feature drivers

2020-08-05 Thread Xu Yilun
On Wed, Aug 05, 2020 at 08:15:27PM +0800, Wu, Hao wrote:
> > Subject: [PATCH v3 2/4] fpga: dfl: map feature mmio resources in their own
> > feature drivers
> >
> > +static int dfl_binfo_prepare(struct build_feature_devs_info *binfo,
> > + resource_size_t start, resource_size_t len)
> > +{
> > +struct device *dev = binfo->dev;
> > +void __iomem *ioaddr;
> > +
> > +if (!devm_request_mem_region(dev, start, len, dev_name(dev))) {
> > +dev_err(dev, "request region fail, start:%pa, len:%pa\n",
> > +, );
> > +return -EBUSY;
> > +}
> > +
> > +ioaddr = devm_ioremap(dev, start, len);
> > +if (!ioaddr) {
> > +dev_err(dev, "ioremap region fail, start:%pa, len:%pa\n",
> > +, );
> > +devm_release_mem_region(dev, start, len);
> 
> as it's devm_request_mem_region, do we still need to release it here?

Yes, I could delete it.

> > @@ -869,24 +935,24 @@ static int parse_feature_private(struct
> > build_feature_devs_info *binfo,
> >   *
> >   * @binfo: build feature devices information.
> >   * @dfl: device feature list to parse
> 
> Remove this line.

Yes.

Thanks,
Yilun.

> 
> Other place looks good to me.
> 
> Thanks
> Hao 


[PATCH v1 04/11] perf mem: Only initialize memory event for recording

2020-08-05 Thread Leo Yan
It's needless to initialize memory events for perf reporting, so only
initialize memory events for perf recording.  This change allows parsing
perf data across platforms, e.g. the perf tool can output reports even if
the machine doesn't enable any memory events.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-mem.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index bd4229ca3685..a7204634893c 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -78,6 +78,11 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
OPT_END()
};
 
+   if (perf_mem_events__init()) {
+   pr_err("failed: memory events not supported\n");
+   return -1;
+   }
+
argc = parse_options(argc, argv, options, record_mem_usage,
 PARSE_OPT_KEEP_UNKNOWN);
 
@@ -436,11 +441,6 @@ int cmd_mem(int argc, const char **argv)
NULL
};
 
-   if (perf_mem_events__init()) {
-   pr_err("failed: memory events not supported\n");
-   return -1;
-   }
-
argc = parse_options_subcommand(argc, argv, mem_options, 
mem_subcommands,
mem_usage, PARSE_OPT_KEEP_UNKNOWN);
 
-- 
2.17.1



[PATCH v1 09/11] perf arm-spe: Store operation types in packet

2020-08-05 Thread Leo Yan
This patch stores operation types in the packet structure; this can
be used by the frontend to generate memory access info for samples.

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 11 +++
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h |  6 ++
 2 files changed, 17 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 373dc2d1cf06..cba394784b0d 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -172,6 +172,17 @@ static int arm_spe_read_record(struct arm_spe_decoder 
*decoder)
case ARM_SPE_CONTEXT:
break;
case ARM_SPE_OP_TYPE:
+   /*
+* When operation type packet header's class equals 1,
+* the payload's least significant bit (LSB) indicates
+* the operation type: load/swap or store.
+*/
+   if (idx == 1) {
+   if (payload & 0x1)
+   decoder->record.op = ARM_SPE_ST;
+   else
+   decoder->record.op = ARM_SPE_LD;
+   }
break;
case ARM_SPE_EVENTS:
if (payload & BIT(EV_L1D_REFILL))
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index 5acddfcffbd1..f23188282ef0 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -41,9 +41,15 @@ enum arm_spe_sample_type {
ARM_SPE_REMOTE_ACCESS   = 1 << 7,
 };
 
+enum arm_spe_op_type {
+   ARM_SPE_LD  = 1 << 0,
+   ARM_SPE_ST  = 1 << 1,
+};
+
 struct arm_spe_record {
enum arm_spe_sample_type type;
int err;
+   u32 op;
u64 from_ip;
u64 to_ip;
u64 timestamp;
-- 
2.17.1



Re: [PATCH RFC] sched/fair: simplify the work when reweighting entity

2020-08-05 Thread 蒋彪


> On Aug 6, 2020, at 12:21 AM, Dietmar Eggemann  
> wrote:
> 
> On 04/08/2020 09:12, Jiang Biao wrote:
>> If a se is on_rq when reweighting entity, all we need should be
>> updating the load of cfs_rq, other dequeue/enqueue works could be
>> redundant, such as,
>> * account_numa_dequeue/account_numa_enqueue
>> * list_del/list_add from/into cfs_tasks
>> * nr_running--/nr_running++
> 
> I think this could make sense. Have you spotted a code path where this
> gives you a change?
> 
> I guess only for a task on the rq, so: entity_is_task(se) && se->on_rq
Yes, you're right. No other code path I spotted except what you list below.

> 
>> Just simplify the work. Could be helpful for the hot path.
> 
> IMHO hotpath is update_cfs_group() -> reweight_entity() but this is only
> called for '!entity_is_task(se)'.
> 
> See
> 
> 3290 if (!gcfs_rq)
> 3291 return;
> 
> in update_cfs_group().
Yes, It is.
But the *nr_running--/nr_running++* work is still redundant for the
'!entity_is_task(se)' hot path. :)
Besides, I guess we could simplify the logic and make it cleaner and
more readable with this patch.
If it could make sense to you, would you mind if I resend the patch
with the commit log amended?

> 
> The 'entity_is_task(se)' case is
> 
> set_load_weight(struct task_struct *p, ...) -> reweight_task(p, ...) ->
> reweight_entity(..., >se, ...)
> 
> but here !se->on_rq.
Yes, indeed.

Thanks a lot for your comments.
Regards,
Jiang

> 
>> Signed-off-by: Jiang Biao 
>> ---
>> kernel/sched/fair.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 04fa8dbcfa4d..18a8fc7bd0de 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -3086,7 +3086,7 @@ static void reweight_entity(struct cfs_rq *cfs_rq, 
>> struct sched_entity *se,
>>  /* commit outstanding execution time */
>>  if (cfs_rq->curr == se)
>>  update_curr(cfs_rq);
>> -account_entity_dequeue(cfs_rq, se);
>> +update_load_sub(&cfs_rq->load, se->load.weight);
>>  }
>>  dequeue_load_avg(cfs_rq, se);
>> 
>> @@ -3102,7 +3102,7 @@ static void reweight_entity(struct cfs_rq *cfs_rq, 
>> struct sched_entity *se,
>> 
>>  enqueue_load_avg(cfs_rq, se);
>>  if (se->on_rq)
>> -account_entity_enqueue(cfs_rq, se);
>> +update_load_add(&cfs_rq->load, se->load.weight);
>> 
>> }
> 



[PATCH v1 02/11] perf mem: Introduce weak function perf_mem_events__ptr()

2020-08-05 Thread Leo Yan
Different architectures might use different events or different event
parameters for memory profiling.  This patch introduces the weak function
perf_mem_events__ptr(), which returns the architecture-specific memory
event.

After the function perf_mem_events__ptr() is introduced, the variable
'perf_mem_events' can be accessed through this new function; so mark
the variable as 'static', which allows each architecture to
define its own memory event array.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-c2c.c | 18 --
 tools/perf/builtin-mem.c | 21 ++---
 tools/perf/util/mem-events.c | 26 +++---
 tools/perf/util/mem-events.h |  2 +-
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 5938b100eaf4..88e68f36aa62 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2914,6 +2914,7 @@ static int perf_c2c__record(int argc, const char **argv)
int ret;
bool all_user = false, all_kernel = false;
bool event_set = false;
+   struct perf_mem_event *e;
struct option options[] = {
OPT_CALLBACK('e', "event", &event_set, "event",
 "event selector. Use 'perf mem record -e list' to list 
available events",
@@ -2941,11 +2942,15 @@ static int perf_c2c__record(int argc, const char **argv)
rec_argv[i++] = "record";
 
if (!event_set) {
-   perf_mem_events[PERF_MEM_EVENTS__LOAD].record  = true;
-   perf_mem_events[PERF_MEM_EVENTS__STORE].record = true;
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   e->record = true;
+
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+   e->record = true;
}
 
-   if (perf_mem_events[PERF_MEM_EVENTS__LOAD].record)
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   if (e->record)
rec_argv[i++] = "-W";
 
rec_argv[i++] = "-d";
@@ -2953,12 +2958,13 @@ static int perf_c2c__record(int argc, const char **argv)
rec_argv[i++] = "--sample-cpu";
 
for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-   if (!perf_mem_events[j].record)
+   e = perf_mem_events__ptr(j);
+   if (!e->record)
continue;
 
-   if (!perf_mem_events[j].supported) {
+   if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
-  perf_mem_events[j].name);
+  perf_mem_events__name(j));
free(rec_argv);
return -1;
}
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 3523279af6af..9a7df8d01296 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -64,6 +64,7 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
const char **rec_argv;
int ret;
bool all_user = false, all_kernel = false;
+   struct perf_mem_event *e;
struct option options[] = {
OPT_CALLBACK('e', "event", &mem->operation, "event",
 "event selector. use 'perf mem record -e list' to list 
available events",
@@ -86,13 +87,18 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
 
rec_argv[i++] = "record";
 
-   if (mem->operation & MEM_OPERATION_LOAD)
-   perf_mem_events[PERF_MEM_EVENTS__LOAD].record = true;
+   if (mem->operation & MEM_OPERATION_LOAD) {
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   e->record = true;
+   }
 
-   if (mem->operation & MEM_OPERATION_STORE)
-   perf_mem_events[PERF_MEM_EVENTS__STORE].record = true;
+   if (mem->operation & MEM_OPERATION_STORE) {
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+   e->record = true;
+   }
 
-   if (perf_mem_events[PERF_MEM_EVENTS__LOAD].record)
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+   if (e->record)
rec_argv[i++] = "-W";
 
rec_argv[i++] = "-d";
@@ -101,10 +107,11 @@ static int __cmd_record(int argc, const char **argv, 
struct perf_mem *mem)
rec_argv[i++] = "--phys-data";
 
for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-   if (!perf_mem_events[j].record)
+   e = perf_mem_events__ptr(j);
+   if (!e->record)
continue;
 
-   if (!perf_mem_events[j].supported) {
+   if (!e->supported) {
pr_err("failed: event '%s' not supported\n",
   perf_mem_events__name(j));
free(rec_argv);
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 35c8d175a9d2..7a5a0d699e27 100644
--- a/tools/perf/util/mem-events.c
+++ 

[PATCH v1 03/11] perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE

2020-08-05 Thread Leo Yan
The existing architectures that support perf memory profiling usually
provide two types of hardware events: load and store, so to profile
memory for both load and store operations, the tool uses these two
events at the same time.  But this does not hold for an AUX tracing
event: the same event can be used with different configurations for
memory operation filtering, e.g. the event can be configured to trace
only memory loads, only memory stores, or both memory loads and
stores.

This patch introduces a new event PERF_MEM_EVENTS__LOAD_STORE, which is
used to support the event which can record both memory load and store
operations.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-mem.c | 11 +--
 tools/perf/util/mem-events.h |  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9a7df8d01296..bd4229ca3685 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -19,8 +19,9 @@
 #include "util/symbol.h"
 #include 
 
-#define MEM_OPERATION_LOAD 0x1
-#define MEM_OPERATION_STORE0x2
+#define MEM_OPERATION_LOAD 0x1
+#define MEM_OPERATION_STORE0x2
+#define MEM_OPERATION_LOAD_STORE   0x4
 
 struct perf_mem {
struct perf_tooltool;
@@ -97,6 +98,11 @@ static int __cmd_record(int argc, const char **argv, struct 
perf_mem *mem)
e->record = true;
}
 
+   if (mem->operation & MEM_OPERATION_LOAD_STORE) {
+   e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
+   e->record = true;
+   }
+
e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
if (e->record)
rec_argv[i++] = "-W";
@@ -326,6 +332,7 @@ struct mem_mode {
 static const struct mem_mode mem_modes[]={
MEM_OPT("load", MEM_OPERATION_LOAD),
MEM_OPT("store", MEM_OPERATION_STORE),
+   MEM_OPT("ldst", MEM_OPERATION_LOAD_STORE),
MEM_END
 };
 
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 726a9c8103e4..5ef178278909 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -28,6 +28,7 @@ struct mem_info {
 enum {
PERF_MEM_EVENTS__LOAD,
PERF_MEM_EVENTS__STORE,
+   PERF_MEM_EVENTS__LOAD_STORE,
PERF_MEM_EVENTS__MAX,
 };
 
-- 
2.17.1



Re: Reply: Reply: Reply: Reply: Reply: [PATCH] iommu/vt-d: Add support for ACPI device in RMRR

2020-08-05 Thread Lu Baolu

Hi Felix,

On 8/5/20 3:37 PM, FelixCui-oc wrote:

Hi baolu,
Let me talk about why acpi_device_create_direct_mappings() is 
needed and please tell me if there is an error.


Sure. Before that, let me sync my understanding with you. You have an
acpi namespace device in ANDD table, it also shows up in the device
scope of a RMRR. Current code doesn't enumerate that device for the
RMRR, hence iommu_create_device_direct_mappings() doesn't work for this
device.

At the same time, probe_acpi_namespace_devices() doesn't work for this
device, hence you want to add a home-made
acpi_device_create_direct_mappings() helper.

Did I get it right?


In the probe_acpi_namespace_devices() function, only the device in 
the addev->physical_node_list is probed,
but we need to establish identity mapping for the namespace 
device in RMRR. These are two different devices.


The namespace device has been probed and put in one drhd's device list.
Hence, it should be processed by probe_acpi_namespace_devices(). So the
question is why there are no devices in addev->physical_node_list?


Therefore, the namespace device in RMRR is not mapped in 
probe_acpi_namespace_devices().
acpi_device_create_direct_mappings() is to create direct 
mappings for namespace devices in RMRR.


Best regards,
baolu


[PATCH v1 06/11] perf mem: Support Arm SPE events

2020-08-05 Thread Leo Yan
This patch adds Arm SPE events for perf memory profiling.  It supports
three Arm SPE events:

  - spe-load: memory event for only recording memory load ops;
  - spe-store: memory event for only recording memory store ops;
  - spe-ldst: memory event for recording memory load and store ops.
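A minimal sketch of how the '%u' placeholder in an event template like spe-load above gets filled with the load-latency threshold, mirroring what scnprintf() does in perf_mem_events__name() (buffer size and helper name are illustrative):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char event_buf[100];

/* Format an event template, filling the min_latency placeholder with
 * the configured ldlat value. */
static const char *format_event_name(const char *tmpl, unsigned int ldlat)
{
    snprintf(event_buf, sizeof(event_buf), tmpl, ldlat);
    return event_buf;
}
```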

Signed-off-by: Leo Yan 
---
 tools/perf/arch/arm64/util/Build|  2 +-
 tools/perf/arch/arm64/util/mem-events.c | 46 +
 2 files changed, 47 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/util/mem-events.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index 5c13438c7bd4..cb18442e840f 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -8,4 +8,4 @@ perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
 perf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
  ../../arm/util/auxtrace.o \
  ../../arm/util/cs-etm.o \
- arm-spe.o
+ arm-spe.o mem-events.o
diff --git a/tools/perf/arch/arm64/util/mem-events.c 
b/tools/perf/arch/arm64/util/mem-events.c
new file mode 100644
index ..f23128db54fb
--- /dev/null
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "map_symbol.h"
+#include "mem-events.h"
+
+#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
+
+static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+   E("spe-load",   "arm_spe_0/ts_enable=1,load_filter=1,store_filter=0,min_latency=%u/", "arm_spe_0"),
+   E("spe-store",  "arm_spe_0/ts_enable=1,load_filter=0,store_filter=1/", "arm_spe_0"),
+   E("spe-ldst",   "arm_spe_0/ts_enable=1,load_filter=1,store_filter=1,min_latency=%u/", "arm_spe_0"),
+};
+
+static char mem_ld_name[100];
+static char mem_st_name[100];
+static char mem_ldst_name[100];
+
+struct perf_mem_event *perf_mem_events__ptr(int i)
+{
+   if (i >= PERF_MEM_EVENTS__MAX)
+   return NULL;
+
+   return &perf_mem_events[i];
+}
+
+char *perf_mem_events__name(int i)
+{
+   struct perf_mem_event *e = perf_mem_events__ptr(i);
+
+   if (i >= PERF_MEM_EVENTS__MAX)
+   return NULL;
+
+   if (i == PERF_MEM_EVENTS__LOAD) {
+   scnprintf(mem_ld_name, sizeof(mem_ld_name),
+ e->name, perf_mem_events__loads_ldlat);
+   return mem_ld_name;
+   }
+
+   if (i == PERF_MEM_EVENTS__STORE) {
+   scnprintf(mem_st_name, sizeof(mem_st_name), e->name);
+   return mem_st_name;
+   }
+
+   scnprintf(mem_ldst_name, sizeof(mem_ldst_name),
+ e->name, perf_mem_events__loads_ldlat);
+   return mem_ldst_name;
+}
-- 
2.17.1



[PATCH v1 07/11] perf arm-spe: Enable attribution PERF_SAMPLE_DATA_SRC

2020-08-05 Thread Leo Yan
This patch enables the attribute PERF_SAMPLE_DATA_SRC for the perf
data; when decoding the tracing data, it tells the tool that the data
contains memory samples.
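The sample_type change can be pictured with the UAPI bit values (constants copied from include/uapi/linux/perf_event.h; the helper is a sketch, not the driver code):

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/perf_event.h. */
#define PERF_SAMPLE_IP       (1U << 0)
#define PERF_SAMPLE_TID      (1U << 1)
#define PERF_SAMPLE_TIME     (1U << 2)
#define PERF_SAMPLE_PERIOD   (1U << 8)
#define PERF_SAMPLE_DATA_SRC (1U << 15)

/* Mirror of the attr.sample_type update in arm_spe_synth_events():
 * OR in the synthesized-sample bits, drop TIME for timeless decoding. */
static uint64_t synth_sample_type(uint64_t sample_type, int timeless)
{
    sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
                   PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
    if (timeless)
        sample_type &= ~(uint64_t)PERF_SAMPLE_TIME;
    return sample_type;
}
```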

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 3882a5360ada..c2cf5058648f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -803,7 +803,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct 
perf_session *session)
attr.type = PERF_TYPE_HARDWARE;
attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
-   PERF_SAMPLE_PERIOD;
+   PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
if (spe->timeless_decoding)
attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
else
-- 
2.17.1



[PATCH v1 08/11] perf arm-spe: Save memory addresses in packet

2020-08-05 Thread Leo Yan
This patch saves the virtual and physical memory addresses in the
packet; the address info can be used for generating memory samples.
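The dispatch on the address-packet index can be sketched like this (enum and field names abbreviated from the SPE_ADDR_PKT_HDR_INDEX_* constants; a sketch of the routing, not the decoder itself):

```c
#include <assert.h>
#include <stdint.h>

enum { IDX_INSTRUCTION, IDX_BRANCH, IDX_DATA_VIRT, IDX_DATA_PHYS };

struct spe_record {
    uint64_t from_ip, to_ip, addr, phys_addr;
};

/* Route one decoded address packet into the record, as
 * arm_spe_read_record() does for each address packet index. */
static void store_address(struct spe_record *r, int idx, uint64_t ip)
{
    if (idx == IDX_INSTRUCTION)
        r->from_ip = ip;
    else if (idx == IDX_BRANCH)
        r->to_ip = ip;
    else if (idx == IDX_DATA_VIRT)
        r->addr = ip;
    else if (idx == IDX_DATA_PHYS)
        r->phys_addr = ip;
}
```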

Signed-off-by: Leo Yan 
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 4 
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 93e063f22be5..373dc2d1cf06 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -162,6 +162,10 @@ static int arm_spe_read_record(struct arm_spe_decoder 
*decoder)
decoder->record.from_ip = ip;
else if (idx == SPE_ADDR_PKT_HDR_INDEX_BRANCH)
decoder->record.to_ip = ip;
+   else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT)
+   decoder->record.addr = ip;
+   else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS)
+   decoder->record.phys_addr = ip;
break;
case ARM_SPE_COUNTER:
break;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h 
b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index a5111a8d4360..5acddfcffbd1 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -47,6 +47,8 @@ struct arm_spe_record {
u64 from_ip;
u64 to_ip;
u64 timestamp;
+   u64 addr;
+   u64 phys_addr;
 };
 
 struct arm_spe_insn;
-- 
2.17.1



[PATCH v1 05/11] perf mem: Support AUX trace

2020-08-05 Thread Leo Yan
Perf memory profiling doesn't support aux trace data, so the tool cannot
receive the synthesized samples from hardware tracing data.  The Arm64
platform doesn't support PMU events for memory load and store, but
Armv8's SPE is a good candidate for memory profiling: the hardware
tracer can record memory access operations with physical and virtual
addresses for different cache levels, and it also records statistics for
remote access and TLB operations.

To allow the perf memory tool to support AUX trace, this patch adds the
aux callbacks for the session structure.  It passes the predefined synth
options (like llc, flc, remote_access, tlb, etc.) to notify the tracing
decoder to generate the corresponding samples.  This patch also invokes
the standard API perf_event__process_attr() to register sample IDs into
the evlist.

Signed-off-by: Leo Yan 
---
 tools/perf/builtin-mem.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index a7204634893c..6c8b5e956a4a 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -7,6 +7,7 @@
 #include "perf.h"
 
 #include 
+#include "util/auxtrace.h"
 #include "util/trace-event.h"
 #include "util/tool.h"
 #include "util/session.h"
@@ -249,6 +250,15 @@ static int process_sample_event(struct perf_tool *tool,
 
 static int report_raw_events(struct perf_mem *mem)
 {
+   struct itrace_synth_opts itrace_synth_opts = {
+   .set = true,
+   .flc = true,/* First level cache samples */
+   .llc = true,/* Last level cache samples */
+   .tlb = true,/* TLB samples */
+   .remote_access = true,  /* Remote access samples */
+   .default_no_sample = true,
+   };
+
struct perf_data data = {
.path  = input_name,
.mode  = PERF_DATA_MODE_READ,
@@ -261,6 +271,8 @@ static int report_raw_events(struct perf_mem *mem)
if (IS_ERR(session))
return PTR_ERR(session);
 
+   session->itrace_synth_opts = &itrace_synth_opts;
+
if (mem->cpu_list) {
ret = perf_session__cpu_bitmap(session, mem->cpu_list,
   mem->cpu_bitmap);
@@ -394,6 +406,19 @@ parse_mem_ops(const struct option *opt, const char *str, 
int unset)
return ret;
 }
 
+static int process_attr(struct perf_tool *tool __maybe_unused,
+   union perf_event *event,
+   struct evlist **pevlist)
+{
+   int err;
+
+   err = perf_event__process_attr(tool, event, pevlist);
+   if (err)
+   return err;
+
+   return 0;
+}
+
 int cmd_mem(int argc, const char **argv)
 {
struct stat st;
@@ -405,8 +430,12 @@ int cmd_mem(int argc, const char **argv)
.comm   = perf_event__process_comm,
.lost   = perf_event__process_lost,
.fork   = perf_event__process_fork,
+   .attr   = process_attr,
.build_id   = perf_event__process_build_id,
.namespaces = perf_event__process_namespaces,
+   .auxtrace_info  = perf_event__process_auxtrace_info,
+   .auxtrace   = perf_event__process_auxtrace,
+   .auxtrace_error = perf_event__process_auxtrace_error,
.ordered_events = true,
},
.input_name  = "perf.data",
-- 
2.17.1



[PATCH v1 01/11] perf mem: Search event name with more flexible path

2020-08-05 Thread Leo Yan
The perf tool searches for memory event names under the folder
'/sys/devices/cpu/events/'; this limits the selection of a memory
profiling event to events under that folder.  Thus it's impossible to
use any other event as a memory event if it is not under this specific
folder, e.g. it cannot support Arm SPE hardware tracing for memory
profiling.

This patch changes the search folder from '/sys/devices/cpu/events/' to
'/sys/devices', so it gives the flexibility to find events which can be
used for memory profiling.
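The effect of the widened search root can be sketched with a small path builder (a sketch; the mount point and names are illustrative). With the format widened to "%s/devices/%s", the sysfs_name itself decides the subtree, so both the core-PMU path and an SPE PMU resolve:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs path the tool stats, as perf_mem_events__init()
 * does after this patch. */
static const char *event_path(char *buf, size_t sz,
                              const char *mnt, const char *sysfs_name)
{
    snprintf(buf, sz, "%s/devices/%s", mnt, sysfs_name);
    return buf;
}
```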

Signed-off-by: Leo Yan 
---
 tools/perf/util/mem-events.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index ea0af0bc4314..35c8d175a9d2 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -18,8 +18,8 @@ unsigned int perf_mem_events__loads_ldlat = 30;
 #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
 
 struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
-   E("ldlat-loads","cpu/mem-loads,ldlat=%u/P", "mem-loads"),
-   E("ldlat-stores",   "cpu/mem-stores/P", "mem-stores"),
+   E("ldlat-loads","cpu/mem-loads,ldlat=%u/P", "cpu/events/mem-loads"),
+   E("ldlat-stores",   "cpu/mem-stores/P", "cpu/events/mem-stores"),
 };
 #undef E
 
@@ -93,7 +93,7 @@ int perf_mem_events__init(void)
	struct perf_mem_event *e = &perf_mem_events[j];
struct stat st;
 
-   scnprintf(path, PATH_MAX, "%s/devices/cpu/events/%s",
+   scnprintf(path, PATH_MAX, "%s/devices/%s",
  mnt, e->sysfs_name);
 
	if (!stat(path, &st))
-- 
2.17.1



[PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE

2020-08-05 Thread Leo Yan
This patch set supports AUX trace and Arm SPE as the first enabled
hardware tracing for the perf memory tool.

Patches 01 ~ 04 are preparation patches which mainly resolve the issue
that the existing code hard-codes the memory events based on the x86 and
PowerPC architectures; they extend the code to support more flexible
memory event names, and introduce weak functions so every architecture
can define its own memory events structure and return the event pointer
and name respectively.
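The weak-function idea mentioned here can be sketched with GCC/Clang weak symbols (the function name is illustrative, not the actual tool API): a generic definition is used unless an arch file provides a strong definition with the same name, which the linker then prefers.

```c
#include <assert.h>
#include <string.h>

/* Generic (weak) definition.  An arch-specific translation unit could
 * define a non-weak arch_mem_event_name() to override this one. */
__attribute__((weak)) const char *arch_mem_event_name(int i)
{
    (void)i;
    return "generic";
}
```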

Patch 05 extends the perf memory tool to support AUX trace.

Patches 06 ~ 11 support Arm SPE with the perf memory tool.  Firstly they
register SPE events for memory events, then they extend the SPE packet
to pass address info and operation types, and also set the 'data_src'
field so the tool can display readable strings in the result.

This patch set has been tested on the ARMv8 Hisilicon D06 platform.  I
noted that the 'data object' cannot be displayed properly; this should
be a separate issue, so it needs to be checked separately.  Below is the
testing result:

# Samples: 73  of event 'l1d-miss'
# Total weight : 73
# Sort order   : 
local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead   Samples  Local Weight  Memory access Symbol
  Shared Object  Data Symbol
   Data Object Snoop TLB access 
 Locked
#         
..  .  
  
..    ..  ..
#
 2.74% 2  0 L1 or L1 miss [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027aacb08a8   
 [unknown]   N/A   N/A  
   No
 2.74% 2  0 L1 or L1 miss [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027be6488a8   
 [unknown]   N/A   N/A  
   No
 2.74% 2  0 L1 or L1 miss [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027c432f8a8   
 [unknown]   N/A   N/A  
   No
 1.37% 1  0 L1 or L1 miss [k] 
__arch_copy_to_user [kernel.kallsyms]  [k] 0x0027a65352a0   
 [unknown]   N/A   N/A  
   No
 1.37% 1  0 L1 or L1 miss [k] 
__d_lookup_rcu  [kernel.kallsyms]  [k] 0x0027d3cbf468   
 [unknown]   N/A   N/A  
   No
 1.37% 1  0 L1 or L1 miss [k] 
__d_lookup_rcu  [kernel.kallsyms]  [k] 0x0027d8f44490   
 [unknown]   N/A   N/A  
   No
 [...]


# Samples: 101  of event 'l1d-access'
# Total weight : 101
# Sort order   : 
local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead   Samples  Local Weight  Memory access Symbol
  Shared Object  Data Symbol
   Data Object Snoop TLB access 
 Locked
#         
..  .  
  
..    ..  ..
#
 2.97% 3  0 L1 or L1 hit  [k] 
perf_event_mmap [kernel.kallsyms]  [k] perf_swevent+0x5c
 [kernel.kallsyms].data  N/A   N/A  
   No
 1.98% 2  0 L1 or L1 hit  [k] 
kmem_cache_alloc[kernel.kallsyms]  [k] 0x2027af40e3d0   
 [unknown]   N/A   N/A  
   No
 1.98% 2  0 L1 or L1 hit  [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027aacb08a8   
 [unknown]   N/A   N/A  
   No
 1.98% 2  0 L1 or L1 hit  [k] 
perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0x2027be6488a8   
 [unknown]   N/A   N/A  
   No
 1.98% 2  0 L1 or L1 hit  [k] 
perf_iterate_ctx.constprop.151  

[PATCH] ARM: dts: am335x: add common dtsi for MOXA UC-8100 series

2020-08-05 Thread 陳昭勳
Add am335x-moxa-uc-8100-common.dtsi for many products of MOXA UC-8100
series, and remove common nodes from am335x-moxa-uc-8100-me-t.dts.

Signed-off-by: Johnson Chen 
---
 .../boot/dts/am335x-moxa-uc-8100-common.dtsi  | 427 ++
 .../arm/boot/dts/am335x-moxa-uc-8100-me-t.dts | 404 +
 2 files changed, 428 insertions(+), 403 deletions(-)
 create mode 100644 arch/arm/boot/dts/am335x-moxa-uc-8100-common.dtsi

diff --git a/arch/arm/boot/dts/am335x-moxa-uc-8100-common.dtsi 
b/arch/arm/boot/dts/am335x-moxa-uc-8100-common.dtsi
new file mode 100644
index ..98d8ed4ad967
--- /dev/null
+++ b/arch/arm/boot/dts/am335x-moxa-uc-8100-common.dtsi
@@ -0,0 +1,427 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 MOXA Inc. - https://www.moxa.com/
+ *
+ * Author: Johnson Chen 
+ */
+
+#include "am33xx.dtsi"
+
+/ {
+
+   cpus {
+   cpu@0 {
+   cpu0-supply = <_reg>;
+   };
+   };
+
+   vbat: vbat-regulator {
+   compatible = "regulator-fixed";
+   };
+
+   /* Power supply provides a fixed 3.3V @3A */
+   vmmcsd_fixed: vmmcsd-regulator {
+ compatible = "regulator-fixed";
+ regulator-name = "vmmcsd_fixed";
+ regulator-min-microvolt = <330>;
+ regulator-max-microvolt = <330>;
+ regulator-boot-on;
+   };
+
+   buttons: push_button {
+   compatible = "gpio-keys";
+   };
+
+};
+
+_pinmux {
+   pinctrl-names = "default";
+   pinctrl-0 = <_pins>;
+
+   minipcie_pins: pinmux_minipcie {
+   pinctrl-single,pins = <
+   AM33XX_PADCONF(AM335X_PIN_LCD_PCLK, PIN_INPUT_PULLDOWN, 
MUX_MODE7)  /* lcd_pclk.gpio2_24 */
+   AM33XX_PADCONF(AM335X_PIN_LCD_AC_BIAS_EN, 
PIN_INPUT_PULLDOWN, MUX_MODE7)/* lcd_ac_bias_en.gpio2_25 */
+   AM33XX_PADCONF(AM335X_PIN_LCD_VSYNC, 
PIN_INPUT_PULLDOWN, MUX_MODE7) /* lcd_vsync.gpio2_22  Power off PIN*/
+   >;
+   };
+
+   push_button_pins: pinmux_push_button {
+   pinctrl-single,pins = <
+   AM33XX_PADCONF(AM335X_PIN_MCASP0_AHCLKX, 
PIN_INPUT_PULLDOWN, MUX_MODE7) /* mcasp0_ahcklx.gpio3_21 */
+   >;
+   };
+
+   i2c0_pins: pinmux_i2c0_pins {
+   pinctrl-single,pins = <
+   AM33XX_PADCONF(AM335X_PIN_I2C0_SDA, PIN_INPUT_PULLUP, 
MUX_MODE0)
+   AM33XX_PADCONF(AM335X_PIN_I2C0_SCL, PIN_INPUT_PULLUP, 
MUX_MODE0)
+   >;
+   };
+
+
+   i2c1_pins: pinmux_i2c1_pins {
+   pinctrl-single,pins = <
+   AM33XX_PADCONF(AM335X_PIN_UART0_CTSN, PIN_INPUT_PULLUP, 
MUX_MODE3)  /* uart0_ctsn.i2c1_sda */
+   AM33XX_PADCONF(AM335X_PIN_UART0_RTSN, PIN_INPUT_PULLUP, 
MUX_MODE3)  /* uart0_rtsn.i2c1_scl */
+   >;
+   };
+
+   uart0_pins: pinmux_uart0_pins {
+   pinctrl-single,pins = <
+   AM33XX_PADCONF(AM335X_PIN_UART0_RXD, PIN_INPUT_PULLUP, 
MUX_MODE0)
+   AM33XX_PADCONF(AM335X_PIN_UART0_TXD, 
PIN_OUTPUT_PULLDOWN, MUX_MODE0)
+   >;
+   };
+
+   uart1_pins: pinmux_uart1_pins {
+   pinctrl-single,pins = <
+   AM33XX_PADCONF(AM335X_PIN_UART1_CTSN, PIN_INPUT, 
MUX_MODE0)
+   AM33XX_PADCONF(AM335X_PIN_UART1_RTSN, 
PIN_OUTPUT_PULLDOWN, MUX_MODE0)
+   AM33XX_PADCONF(AM335X_PIN_UART1_RXD, PIN_INPUT_PULLUP, 
MUX_MODE0)
+   AM33XX_PADCONF(AM335X_PIN_UART1_TXD, PIN_OUTPUT, 
MUX_MODE0)
+   >;
+   };
+
+   uart2_pins: pinmux_uart2_pins {
+   pinctrl-single,pins = <
+   AM33XX_PADCONF(AM335X_PIN_LCD_DATA14, PIN_INPUT, 
MUX_MODE6) /* lcd_data14.uart5_ctsn */
+   AM33XX_PADCONF(AM335X_PIN_LCD_DATA15, 
PIN_OUTPUT_PULLDOWN, MUX_MODE6)  /* lcd_data15.uart5_rtsn */
+   AM33XX_PADCONF(AM335X_PIN_LCD_DATA9, PIN_INPUT_PULLUP, 
MUX_MODE4) /* lcd_data9.uart5_rxd */
+   AM33XX_PADCONF(AM335X_PIN_LCD_DATA8, PIN_OUTPUT, 
MUX_MODE4) /* lcd_data8.uart5_txd */
+   >;
+   };
+
+   cpsw_default: cpsw_default {
+   pinctrl-single,pins = <
+   /* Slave 1 */
+   AM33XX_PADCONF(AM335X_PIN_MII1_CRS, PIN_INPUT_PULLDOWN, 
MUX_MODE1)
+   AM33XX_PADCONF(AM335X_PIN_MII1_RX_ER, PIN_INPUT_PULLUP, 
MUX_MODE1)
+   AM33XX_PADCONF(AM335X_PIN_MII1_TX_EN, 
PIN_OUTPUT_PULLDOWN, MUX_MODE1)
+   AM33XX_PADCONF(AM335X_PIN_MII1_TXD1, 
PIN_OUTPUT_PULLDOWN, MUX_MODE1)
+   AM33XX_PADCONF(AM335X_PIN_MII1_TXD0, 
PIN_OUTPUT_PULLDOWN, MUX_MODE1)
+   AM33XX_PADCONF(AM335X_PIN_MII1_RXD1, 

Re: [PATCH 1/2] net: tls: add compat for get/setsockopt

2020-08-05 Thread kernel test robot
Hi Rouven,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.8]
[cannot apply to next-20200805]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Rouven-Czerwinski/net-tls-add-compat-for-get-setsockopt/20200806-040123
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
ecfd7940b8641da6e41ca94eba36876dc2ba827b
config: i386-randconfig-s002-20200805 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.2-117-g8c7aee71-dirty
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

   net/tls/tls_main.c: In function 'tls_compat_getsockopt':
>> net/tls/tls_main.c:459:23: error: 'struct proto' has no member named 
>> 'compat_getsockopt'
 459 |   return ctx->sk_proto->compat_getsockopt(sk, level, optname,
 |   ^~
   net/tls/tls_main.c: In function 'tls_compat_setsockopt':
>> net/tls/tls_main.c:632:23: error: 'struct proto' has no member named 
>> 'compat_setsockopt'
 632 |   return ctx->sk_proto->compat_setsockopt(sk, level, optname,
 |   ^~
   At top level:
   net/tls/tls_main.c:626:12: warning: 'tls_compat_setsockopt' defined but not 
used [-Wunused-function]
 626 | static int tls_compat_setsockopt(struct sock *sk, int level, int 
optname,
 |^
   net/tls/tls_main.c:453:12: warning: 'tls_compat_getsockopt' defined but not 
used [-Wunused-function]
 453 | static int tls_compat_getsockopt(struct sock *sk, int level, int 
optname,
 |^

vim +459 net/tls/tls_main.c

   452  
   453  static int tls_compat_getsockopt(struct sock *sk, int level, int 
optname,
   454   char __user *optval, int __user 
*optlen)
   455  {
   456  struct tls_context *ctx = tls_get_ctx(sk);
   457  
   458  if (level != SOL_TLS)
 > 459  return ctx->sk_proto->compat_getsockopt(sk, level, 
 > optname,
   460  optval, optlen);
   461  
   462  return do_tls_getsockopt(sk, optname, optval, optlen);
   463  }
   464  
   465  static int do_tls_setsockopt_conf(struct sock *sk, char __user *optval,
   466unsigned int optlen, int tx)
   467  {
   468  struct tls_crypto_info *crypto_info;
   469  struct tls_crypto_info *alt_crypto_info;
   470  struct tls_context *ctx = tls_get_ctx(sk);
   471  size_t optsize;
   472  int rc = 0;
   473  int conf;
   474  
   475  if (!optval || (optlen < sizeof(*crypto_info))) {
   476  rc = -EINVAL;
   477  goto out;
   478  }
   479  
    480  if (tx) {
    481  crypto_info = &ctx->crypto_send.info;
    482  alt_crypto_info = &ctx->crypto_recv.info;
    483  } else {
    484  crypto_info = &ctx->crypto_recv.info;
    485  alt_crypto_info = &ctx->crypto_send.info;
    486  }
   487  
   488  /* Currently we don't support set crypto info more than one 
time */
   489  if (TLS_CRYPTO_INFO_READY(crypto_info)) {
   490  rc = -EBUSY;
   491  goto out;
   492  }
   493  
   494  rc = copy_from_user(crypto_info, optval, sizeof(*crypto_info));
   495  if (rc) {
   496  rc = -EFAULT;
   497  goto err_crypto_info;
   498  }
   499  
   500  /* check version */
   501  if (crypto_info->version != TLS_1_2_VERSION &&
   502  crypto_info->version != TLS_1_3_VERSION) {
   503  rc = -EINVAL;
   504  goto err_crypto_info;
   505  }
   506  
   507  /* Ensure that TLS version and ciphers are same in both 
directions */
   508  if (TLS_CRYPTO_INFO_READY(alt_crypto_info)) {
   509  if (alt_crypto_info->version != crypto_info->version ||
   510  alt_crypto_info->cipher_type != 
crypto_info->cipher_type) {
   511  rc = -EINVAL;
   512  goto err_crypto_info;
   513  }
   514  }
   515  
   516  switch (crypto_info->cipher_type) {
   517  case TLS_CIPHER_AES_GCM_128:
   518  optsize =

[PATCH] nfc: enforce CAP_NET_RAW for raw sockets

2020-08-05 Thread Qingyu Li
When creating a raw AF_NFC socket, CAP_NET_RAW needs to be checked first.
Signed-off-by: Qingyu Li 
---
 net/nfc/rawsock.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index ba5ffd3badd3..c1302b689a98 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -332,8 +332,11 @@ static int rawsock_create(struct net *net, struct socket 
*sock,
if ((sock->type != SOCK_SEQPACKET) && (sock->type != SOCK_RAW))
return -ESOCKTNOSUPPORT;

-   if (sock->type == SOCK_RAW)
+   if (sock->type == SOCK_RAW){
+   if (!capable(CAP_NET_RAW))
+   return -EPERM;
		sock->ops = &rawsock_raw_ops;
+   }
else
		sock->ops = &rawsock_ops;





Re: [git pull] drm next for 5.9-rc1

2020-08-05 Thread Dave Airlie
On Thu, 6 Aug 2020 at 11:07, Dave Airlie  wrote:
>
> Hi Linus,
>
> This the main drm pull request for 5.9-rc1.
>
> New xilinx displayport driver, AMD support for two new GPUs (more
> header files), i915 initial support for RocketLake and some work on
> their DG1 (discrete chip).
>
> The core also grew some lockdep annotations to try and constrain what
> drivers do with dma-fences, and added some documentation on why the
> idea of indefinite fences doesn't work.
>
> The long list is below.
>
> I did a test merge into your tree and only had two minor conflicts, so
> I think you should be able to take care of it fine.

I should say I did a test merge yesterday, but you likely pulled more trees,

https://lore.kernel.org/dri-devel/20200806115140.6aa46...@canb.auug.org.au/T/#t

So there was an unfortunate miscommunication and one patch went two
ways, in future Jason and Ben will coordinate better.

Dave.


Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples

2020-08-05 Thread Jin, Yao

Hi Peter,

On 8/5/2020 8:44 PM, pet...@infradead.org wrote:

On Wed, Aug 05, 2020 at 10:15:26AM +0800, Jin, Yao wrote:

Hi Peter,

On 8/4/2020 7:49 PM, pet...@infradead.org wrote:

On Fri, Jul 31, 2020 at 10:56:17AM +0800, Jin Yao wrote:

@@ -6973,7 +6973,8 @@ static struct perf_callchain_entry __empty_callchain = { 
.nr = 0, };
   struct perf_callchain_entry *
   perf_callchain(struct perf_event *event, struct pt_regs *regs)
   {
-   bool kernel = !event->attr.exclude_callchain_kernel;
+   bool kernel = !event->attr.exclude_callchain_kernel &&
+ !event->attr.exclude_kernel;


This seems weird; how can we get there. Also it seems to me that if you
have !exclude_callchain_kernel you already have permission for kernel
bits, so who cares.



In perf tool, exclude_callchain_kernel is set to 1 when perf-record only
collects the user callchains and exclude_kernel is set to 1 when events are
configured to run in user space.

So if an event is configured to run in user space, that should make sense we
don't need it's kernel callchains.

But it seems to me there is no code logic in perf tool which can make sure
!exclude_callchain_kernel -> !exclude_kernel.

Jiri, Arnaldo, is my understanding correct?


What the perf tool does or does not do is irrelevant. It is a valid,
(albeit slightly silly) configuration to have:

exclude_kernel && !exclude_callchain_kernel

You're now saying that when you configure things like this you're not
allowed kernel IPs, that's wrong I think.

Also, !exclude_callchain_kernel should require privilidge, whcih needs
fixing, see below.



I see you add the '!exclude_callchain_kernel' check before perf_allow_kernel() at syscall entry in the code below.


So if we allow kernel callchain collection, that means we allow kernel addresses by default. That makes sense, thanks!



So the new code looks like:

if (event->attr.exclude_kernel && !user_mode(regs)) {
if (!(current->flags & PF_KTHREAD)) {
regs_fake = task_pt_regs(current);
if (!regs_fake)
instruction_pointer_set(regs, -1L);
} else {
instruction_pointer_set(regs, -1L);
}


Again:

if (!(current->flags & PF_KTHREAD))
regs_fake = task_pt_regs(current);

if (!regs_fake)
instruction_pointer_set(regs, -1L);

Is much simpler and more readable.



Yes, agree. Your code is much simpler and better.


+   if ((header->misc & PERF_RECORD_MISC_CPUMODE_MASK) ==
+PERF_RECORD_MISC_KERNEL) {
+   header->misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
+   header->misc |= PERF_RECORD_MISC_USER;
+   }


Why the conditional? At this point it had better be unconditionally
user, no?

headers->misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
headers->misc |= PERF_RECORD_MISC_USER;



#define PERF_RECORD_MISC_CPUMODE_MASK   (7 << 0)
#define PERF_RECORD_MISC_CPUMODE_UNKNOWN(0 << 0)
#define PERF_RECORD_MISC_KERNEL (1 << 0)
#define PERF_RECORD_MISC_USER   (2 << 0)
#define PERF_RECORD_MISC_HYPERVISOR (3 << 0)
#define PERF_RECORD_MISC_GUEST_KERNEL   (4 << 0)
#define PERF_RECORD_MISC_GUEST_USER (5 << 0)

If we unconditionally set user, it will reset for hypervisor, guest
kernel and guest_user.
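The point can be checked with a tiny helper over the UAPI values quoted above (a sketch of the conditional update, not kernel code): only a host-kernel sample is rewritten to user mode, while hypervisor and guest cpumodes pass through unchanged.

```c
#include <assert.h>

#define PERF_RECORD_MISC_CPUMODE_MASK (7 << 0)
#define PERF_RECORD_MISC_KERNEL       (1 << 0)
#define PERF_RECORD_MISC_USER         (2 << 0)
#define PERF_RECORD_MISC_GUEST_KERNEL (4 << 0)

/* Rewrite only host-kernel samples to user mode; the conditional is
 * what preserves the hypervisor/guest cpumodes. */
static unsigned int fixup_cpumode(unsigned int misc)
{
    if ((misc & PERF_RECORD_MISC_CPUMODE_MASK) == PERF_RECORD_MISC_KERNEL) {
        misc &= ~PERF_RECORD_MISC_CPUMODE_MASK;
        misc |= PERF_RECORD_MISC_USER;
    }
    return misc;
}
```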


At the same time :u had better not get any of those either. Which seems
to suggest we're going about this wrong.

Also, if we call this before perf_misc_flags() we don't need to fix it
up.

How's this?

---
  kernel/events/core.c | 38 +-
  1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7c436d705fbd..3e4e328b521a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6988,23 +6988,49 @@ perf_callchain(struct perf_event *event, struct pt_regs 
*regs)
return callchain ?: &__empty_callchain;
  }
  
+/*

+ * Due to interrupt latency (skid), we may enter the kernel before taking the
+ * PMI, even if the PMU is configured to only count user events. To avoid
+ * leaking kernel addresses, use task_pt_regs(), when available.
+ */
+static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct 
pt_regs *regs)
+{
+   struct pt_regs *sample_regs = regs;
+
+   /* user only */
+   if (!event->attr.exclude_kernel || !event->attr.exclude_hv ||
+   !event->attr.exclude_host   || !event->attr.exclude_guest)
+   return sample_regs;
+


Is this condition correct?

Say we are counting a user event on the host: exclude_kernel = 1 and exclude_host = 0. It will take the "return sample_regs" path.



+   if (sample_regs(regs))
+   return sample_regs;
+


In your another mail, you said it should be:

if (user_regs(regs))
return sample_regs;


+   if (!(current->flags & 

Re: Patch "KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts" has been added to the 4.4-stable tree

2020-08-05 Thread yangerkun

Hi,

I'm not familiar with KVM, and I have a question about this patch.
Backporting this patch, 3204be4109ad ("KVM: arm64: Make vcpu_cp1x() work
on Big Endian hosts"), without 52f6c4f02164 ("KVM: arm64: Change 32-bit
handling of VM system registers") seems not right, does it?


Thanks,
Kun.

On 2020/6/16 18:56, gre...@linuxfoundation.org wrote:


This is a note to let you know that I've just added the patch titled

 KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts

to the 4.4-stable tree which can be found at:
 
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
  kvm-arm64-make-vcpu_cp1x-work-on-big-endian-hosts.patch
and it can be found in the queue-4.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let  know about it.



From 3204be4109ad681523e3461ce64454c79278450a Mon Sep 17 00:00:00 2001

From: Marc Zyngier 
Date: Tue, 9 Jun 2020 08:40:35 +0100
Subject: KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts

From: Marc Zyngier 

commit 3204be4109ad681523e3461ce64454c79278450a upstream.

AArch32 CP1x registers are overlayed on their AArch64 counterparts
in the vcpu struct. This leads to an interesting problem as they
are stored in their CPU-local format, and thus a CP1x register
doesn't "hit" the lower 32bit portion of the AArch64 register on
a BE host.

To workaround this unfortunate situation, introduce a bias trick
in the vcpu_cp1x() accessors which picks the correct half of the
64bit register.
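The bias trick can be illustrated host-independently: treat the 64-bit register as two 32-bit words, with the low word at index 0 on a little-endian host and at index 1 on a big-endian host, and XOR the index with the bias (a sketch; the two layouts are constructed by hand rather than by overlaying a real u64):

```c
#include <assert.h>
#include <stdint.h>

/* vcpu_cp1x()-style access: on a big-endian host the bias is 1, so the
 * XOR redirects the index to the word holding the low 32 bits. */
static uint32_t cp_read(const uint32_t *copro, int r, int cpx_bias)
{
    return copro[r ^ cpx_bias];
}
```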

Cc: sta...@vger.kernel.org
Reported-by: James Morse 
Tested-by: James Morse 
Acked-by: James Morse 
Signed-off-by: Marc Zyngier 
Signed-off-by: Greg Kroah-Hartman 

---
  arch/arm64/include/asm/kvm_host.h |6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -178,8 +178,10 @@ struct kvm_vcpu_arch {
   * CP14 and CP15 live in the same array, as they are backed by the
   * same system registers.
   */
-#define vcpu_cp14(v,r) ((v)->arch.ctxt.copro[(r)])
-#define vcpu_cp15(v,r) ((v)->arch.ctxt.copro[(r)])
+#define CPx_BIAS   IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)
+
+#define vcpu_cp14(v,r) ((v)->arch.ctxt.copro[(r) ^ CPx_BIAS])
+#define vcpu_cp15(v,r) ((v)->arch.ctxt.copro[(r) ^ CPx_BIAS])
  
  #ifdef CONFIG_CPU_BIG_ENDIAN

  #define vcpu_cp15_64_high(v,r)vcpu_cp15((v),(r))


Patches currently in stable-queue which might be from m...@kernel.org are

queue-4.4/kvm-arm64-make-vcpu_cp1x-work-on-big-endian-hosts.patch

.





Re: [PATCH v4] ext4: fix direct I/O read error

2020-08-05 Thread 姜迎
Sorry, I will fix this error on 4.4 and 4.9, and then send a patch for
4.4 and 4.9, thanks!

Sent from my iPhone

> On Aug 6, 2020, at 9:19 AM, Sasha Levin  wrote:
> 
> On Wed, Aug 05, 2020 at 10:51:07AM +0200, Jan Kara wrote:
>> Note to stable tree maintainers (summary from the rather long changelog):
>> This is a non-upstream patch. It will not go upstream because the problem
>> there has been fixed by converting ext4 to use iomap infrastructure.
>> However that change is out of scope for stable kernels and this is a
>> minimal fix for the problem that has hit real-world applications so I think
>> it would be worth it to include the fix in stable trees. Thanks.
> 
> How far back should it go? It breaks the build on 4.9 and 4.4 but the
> fix for the breakage is trivial.
> 
> It does however suggest that this fix wasn't tested on 4.9 or 4.4, so
> I'd like to clarify it here before fixing it up (or dropping it).
> 
> -- 
> Thanks,
> Sasha



Re: [PATCH] kbuild: Add dtc flag test

2020-08-05 Thread Masahiro Yamada
On Thu, Aug 6, 2020 at 2:59 AM Elliot Berman  wrote:
>
> Host dtc may not support the same flags as kernel's copy of dtc. Test
> if dtc supports each flag when the dtc comes from host.
>
> Signed-off-by: Elliot Berman 


I think this supports only the newer external DTC,
but not older ones.

This feature is intended to test the upstream DTC
before resyncing in-kernel scripts/dtc/.






> ---
>  scripts/Makefile.lib | 34 ++
>  1 file changed, 22 insertions(+), 12 deletions(-)
>
> diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
> index 841ac03..2722a67 100644
> --- a/scripts/Makefile.lib
> +++ b/scripts/Makefile.lib
> @@ -274,25 +274,35 @@ quiet_cmd_gzip = GZIP    $@
>
>  # DTC
>  # ---
> +ifeq ("$(origin DTC)", "command line")
> +PHONY += $(DTC)
> +dtc-option = $(call try-run, $(DTC) $1 -v,$1)
> +else
> +# Just add the flag. DTC is compiled later as a prerequisite, so there's no dtc
> +# to test the flag against. This is okay because we're not testing flags which
> +# aren't supported by in-kernel dtc to begin with.
> +dtc-option = $1
> +endif
> +
>  DTC ?= $(objtree)/scripts/dtc/dtc
> -DTC_FLAGS += -Wno-interrupt_provider
> +DTC_FLAGS += $(call dtc-option,-Wno-interrupt_provider)
>
>  # Disable noisy checks by default
>  ifeq ($(findstring 1,$(KBUILD_EXTRA_WARN)),)
> -DTC_FLAGS += -Wno-unit_address_vs_reg \
> -   -Wno-unit_address_format \
> -   -Wno-avoid_unnecessary_addr_size \
> -   -Wno-alias_paths \
> -   -Wno-graph_child_address \
> -   -Wno-simple_bus_reg \
> -   -Wno-unique_unit_address \
> -   -Wno-pci_device_reg
> +DTC_FLAGS += $(call dtc-option,-Wno-unit_address_vs_reg) \
> +   $(call dtc-option,-Wno-unit_address_format) \
> +   $(call dtc-option,-Wno-avoid_unnecessary_addr_size) \
> +   $(call dtc-option,-Wno-alias_paths) \
> +   $(call dtc-option,-Wno-graph_child_address) \
> +   $(call dtc-option,-Wno-simple_bus_reg) \
> +   $(call dtc-option,-Wno-unique_unit_address) \
> +   $(call dtc-option,-Wno-pci_device_reg)
>  endif
>
>  ifneq ($(findstring 2,$(KBUILD_EXTRA_WARN)),)
> -DTC_FLAGS += -Wnode_name_chars_strict \
> -   -Wproperty_name_chars_strict \
> -   -Winterrupt_provider
> +DTC_FLAGS += $(call dtc-option,-Wnode_name_chars_strict) \
> +   $(call dtc-option,-Wproperty_name_chars_strict) \
> +   $(call dtc-option,-Winterrupt_provider)
>  endif
>
>  DTC_FLAGS += $(DTC_FLAGS_$(basetarget))
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


-- 
Best Regards
Masahiro Yamada


[PATCH] sound: pci: delete repeated words in comments

2020-08-05 Thread Randy Dunlap
Drop duplicated words in sound/pci/.
{and, the, at}

Signed-off-by: Randy Dunlap 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: alsa-de...@alsa-project.org
Cc: Clemens Ladisch 
---
 sound/pci/cs46xx/cs46xx_lib.c   |2 +-
 sound/pci/cs46xx/dsp_spos_scb_lib.c |2 +-
 sound/pci/hda/hda_codec.c   |2 +-
 sound/pci/hda/hda_generic.c |2 +-
 sound/pci/hda/patch_sigmatel.c  |2 +-
 sound/pci/ice1712/prodigy192.c  |2 +-
 sound/pci/oxygen/xonar_dg.c |2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

--- linux-next-20200805.orig/sound/pci/cs46xx/cs46xx_lib.c
+++ linux-next-20200805/sound/pci/cs46xx/cs46xx_lib.c
@@ -766,7 +766,7 @@ static void snd_cs46xx_set_capture_sampl
rate = 48000 / 9;
 
/*
-*  We can not capture at at rate greater than the Input Rate (48000).
+*  We can not capture at a rate greater than the Input Rate (48000).
 *  Return an error if an attempt is made to stray outside that limit.
 */
if (rate > 48000)
--- linux-next-20200805.orig/sound/pci/cs46xx/dsp_spos_scb_lib.c
+++ linux-next-20200805/sound/pci/cs46xx/dsp_spos_scb_lib.c
@@ -1716,7 +1716,7 @@ int cs46xx_iec958_pre_open (struct snd_c
struct dsp_spos_instance * ins = chip->dsp_spos_instance;
 
if ( ins->spdif_status_out & DSP_SPDIF_STATUS_OUTPUT_ENABLED ) {
-   /* remove AsynchFGTxSCB and and PCMSerialInput_II */
+   /* remove AsynchFGTxSCB and PCMSerialInput_II */
cs46xx_dsp_disable_spdif_out (chip);
 
/* save state */
--- linux-next-20200805.orig/sound/pci/hda/hda_codec.c
+++ linux-next-20200805/sound/pci/hda/hda_codec.c
@@ -3428,7 +3428,7 @@ EXPORT_SYMBOL_GPL(snd_hda_set_power_save
  * @nid: NID to check / update
  *
  * Check whether the given NID is in the amp list.  If it's in the list,
- * check the current AMP status, and update the the power-status according
+ * check the current AMP status, and update the power-status according
  * to the mute status.
  *
  * This function is supposed to be set or called from the check_power_status
--- linux-next-20200805.orig/sound/pci/hda/hda_generic.c
+++ linux-next-20200805/sound/pci/hda/hda_generic.c
@@ -813,7 +813,7 @@ static void activate_amp_in(struct hda_c
}
 }
 
-/* sync power of each widget in the the given path */
+/* sync power of each widget in the given path */
 static hda_nid_t path_power_update(struct hda_codec *codec,
   struct nid_path *path,
   bool allow_powerdown)
--- linux-next-20200805.orig/sound/pci/hda/patch_sigmatel.c
+++ linux-next-20200805/sound/pci/hda/patch_sigmatel.c
@@ -838,7 +838,7 @@ static int stac_auto_create_beep_ctls(st
static const struct snd_kcontrol_new beep_vol_ctl =
HDA_CODEC_VOLUME(NULL, 0, 0, 0);
 
-   /* check for mute support for the the amp */
+   /* check for mute support for the amp */
if ((caps & AC_AMPCAP_MUTE) >> AC_AMPCAP_MUTE_SHIFT) {
const struct snd_kcontrol_new *temp;
if (spec->anabeep_nid == nid)
--- linux-next-20200805.orig/sound/pci/ice1712/prodigy192.c
+++ linux-next-20200805/sound/pci/ice1712/prodigy192.c
@@ -32,7 +32,7 @@
  *   Experimentally I found out that only a combination of
  *   OCKS0=1, OCKS1=1 (128fs, 64fs output) and ice1724 -
  *   VT1724_MT_I2S_MCLK_128X=0 (256fs input) yields correct
- *   sampling rate. That means the the FPGA doubles the
+ *   sampling rate. That means that the FPGA doubles the
  *   MCK01 rate.
  *
  * Copyright (c) 2003 Takashi Iwai 
--- linux-next-20200805.orig/sound/pci/oxygen/xonar_dg.c
+++ linux-next-20200805/sound/pci/oxygen/xonar_dg.c
@@ -29,7 +29,7 @@
  *   GPIO 4 <- headphone detect
  *   GPIO 5 -> enable ADC analog circuit for the left channel
  *   GPIO 6 -> enable ADC analog circuit for the right channel
- *   GPIO 7 -> switch green rear output jack between CS4245 and and the first
+ *   GPIO 7 -> switch green rear output jack between CS4245 and the first
  * channel of CS4361 (mechanical relay)
  *   GPIO 8 -> enable output to speakers
  *


[PATCH] sound: isa: delete repeated words in comments

2020-08-05 Thread Randy Dunlap
Drop duplicated words in sound/isa/.
{be, bit}

Signed-off-by: Randy Dunlap 
Cc: Jaroslav Kysela 
Cc: Takashi Iwai 
Cc: alsa-de...@alsa-project.org
---
 sound/isa/cs423x/cs4236_lib.c |2 +-
 sound/isa/es18xx.c|2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

--- linux-next-20200805.orig/sound/isa/cs423x/cs4236_lib.c
+++ linux-next-20200805/sound/isa/cs423x/cs4236_lib.c
@@ -39,7 +39,7 @@
  * D7: consumer serial port enable (CS4237B,CS4238B)
  * D6: channels status block reset (CS4237B,CS4238B)
  * D5: user bit in sub-frame of digital audio data (CS4237B,CS4238B)
- * D4: validity bit bit in sub-frame of digital audio data (CS4237B,CS4238B)
+ * D4: validity bit in sub-frame of digital audio data (CS4237B,CS4238B)
  * 
 *  C5  lower channel status (digital serial data description) (CS4237B,CS4238B)
  * D7-D6: first two bits of category code
--- linux-next-20200805.orig/sound/isa/es18xx.c
+++ linux-next-20200805/sound/isa/es18xx.c
@@ -955,7 +955,7 @@ static int snd_es18xx_info_mux(struct sn
case 0x1887:
case 0x1888:
return snd_ctl_enum_info(uinfo, 1, 5, texts5Source);
-   case 0x1869: /* DS somewhat contradictory for 1869: could be be 5 or 8 */
+   case 0x1869: /* DS somewhat contradictory for 1869: could be 5 or 8 */
case 0x1879:
return snd_ctl_enum_info(uinfo, 1, 8, texts8Source);
default:


Re: [PATCH] drivers/net/wan/lapbether: Added needed_headroom and a skb->len check

2020-08-05 Thread Xie He
I'm sorry I forgot to include the "net" prefix again. I remembered
"PATCH" but not "net" this time. I'll try to remember both next time.
If requested I can resend the patch with the correct prefix. Sorry.


Re: PROBLEM: IO lockup on reiserfs FS.

2020-08-05 Thread Metztli Information Technology
 
On Wed, Aug 5, 2020 at 5:01 PM  wrote:

> On Wed, 5 Aug 2020 12:51:41 -0700
> Linus Torvalds  wrote:
> > On Wed, Aug 5, 2020 at 9:53 AM  wrote:
> > >
> > > It's been over 1 week since I sent this into the reiserfs-devel
> > > mailing list. I'm escalating this as the kernel docs recommend.
> > > I'm still willing to help debug and test a fix for this problem.  
> > 
> > The thing is, you're using an ancient 4.14 kernel, 
> 
> Sorry, I didn't realize kernel development went that fast.
> I did try to go to the 5.X series, but the AMDGPU drivers don't work on
> my SI card anymore (I need to bisect which takes time and many re-boots
> to find the problematic commit).
> I'll try the Radeon-SI driver and see if I can reproduce this reliably.
> 
> > and a filesystem
> > that isn't really maintained any more. You'll find very few people
> > interested in trying to debug that combination.
> > 
> > You *might* have more luck with a more modern kernel, but even then
> > ... reiserfs?
> > 
> >               Linus
> > 
> 
> Why does no one (I've met others who share a similar sentiment) like
> reiserfs?
Could be because 'others' are all 'virtuous' individuals, employed by 
'virtuous' corporations, headquartered at 'virtuous' western countries, whose 
'virtuous' governments are able to finance the finest hasbara...er, propaganda, 
that a corporatocracy...er, 'democracy', can buy.

> I'm not looking for a fight; I'm incredulous. It's a great FS
> that survives oops-es, power failures, and random crashes very very well.
> It's the only FLOSS FS with tail packing.
On a more sober note, Reiser4, Software Framework Release Number (SFRN) 4.0.2, 
is stable and supersedes the features you appreciate in reiserfs, as Edward 
stated in his subsequent reply.
> 
> Thanks,
> David


Best Professional Regards.

-- 
Jose R R
http://metztli.it
-
Download Metztli Reiser4: Debian Buster w/ Linux 5.5.19 AMD64
-
feats ZSTD compression https://sf.net/projects/metztli-reiser4/
---
Official current Reiser4 resources: https://reiser4.wiki.kernel.org/


Re: [PATCH v8 5/8] powerpc/vdso: Prepare for switching VDSO to generic C implementation.

2020-08-05 Thread Michael Ellerman
Segher Boessenkool  writes:
> On Wed, Aug 05, 2020 at 04:24:16PM +1000, Michael Ellerman wrote:
>> Christophe Leroy  writes:
>> > Indeed, 32-bit doesn't have a redzone, so I believe it needs a stack 
>> > frame whenever it has anything to save.
>> 
>> Yeah OK that would explain it.
>> 
>> > Here is what I have in libc.so:
>> >
>> > 000fbb60 <__clock_gettime>:
>> > fbb60: 94 21 ff e0 stwu    r1,-32(r1)
>
> This is the *only* place where you can use a negative offset from r1:
> in the stwu to extend the stack (set up a new stack frame, or make the
> current one bigger).

(You're talking about 32-bit code here right?)

>> > diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h 
>> > b/arch/powerpc/include/asm/vdso/gettimeofday.h
>> > index a0712a6e80d9..0b6fa245d54e 100644
>> > --- a/arch/powerpc/include/asm/vdso/gettimeofday.h
>> > +++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
>> > @@ -10,6 +10,7 @@
>> > .cfi_startproc
>> >	PPC_STLU	r1, -STACK_FRAME_OVERHEAD(r1)
>> >	mflr	r0
>> > +	PPC_STLU	r1, -STACK_FRAME_OVERHEAD(r1)
>> > .cfi_register lr, r0
>> 
>> The cfi_register should come directly after the mflr I think.
>
> That is the idiomatic way to write it, and most obviously correct.  But
> as long as the value in LR at function entry is available in multiple
> places (like, in LR and in R0 here), it is fine to use either for
> unwinding.  Sometimes you can use this to optimise the unwind tables a
> bit -- not really worth it in hand-written code, it's more important to
> make it legible ;-)

OK. Because LR still holds the LR value until it's clobbered later, by
which point the cfi_register has taken effect.

But yeah I think for readability it's best to keep the cfi_register next
to the mflr.

>> >> There's also no code to load/restore the TOC pointer on BE, which I
>> >> think we'll need to handle.
>> >
>> > I see no code in the generated vdso64.so doing anything with r2, but if 
>> > you think that's needed, just let's do it:
>> 
>> Hmm, true.
>> 
>> The compiler will use the toc for globals (and possibly also for large
>> constants?)
>
> And anything else it bloody well wants to, yeah :-)

Haha yeah OK.

>> AFAIK there's no way to disable use of the toc, or make it a build error
>> if it's needed.
>
> Yes.
>
>> At the same time it's much safer for us to just save/restore r2, and
>> probably in the noise performance wise.
>
> If you want a function to be able to work with ABI-compliant code safely
> (in all cases), you'll have to make it itself ABI-compliant as well,
> yes :-)

True. Except this is the VDSO which has previously been a bit wild west
as far as ABI goes :)

>> So yeah we should probably do as below.
>
> [ snip ]
>
> Looks good yes.

Thanks for reviewing.

cheers

