Re: [PATCH] eventfs: Process deletion of dentry more thoroughly

2023-10-31 Thread Steven Rostedt
On Wed, 1 Nov 2023 02:25:53 +
Al Viro  wrote:

> On Tue, Oct 31, 2023 at 02:47:03PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (Google)" 
> > 
> > Looking at how dentry is removed via the tracefs system, I found that
> > eventfs does not do everything that it did under tracefs. The tracefs
> > removal of a dentry calls simple_recursive_removal() that does a lot more
> > than a simple d_invalidate().  
> 
> Umm...  Is there any reason not to use simple_recursive_removal() there?

Hmm, I may be able to (I'm still a newbie with understanding of the vfs).

I did it this way thinking that a dentry may exist in the children but not
at a higher level, but I don't think that can be the case. This creates
dentries and inodes dynamically when they are referenced. The eventfs_inode
maps to each directory (the files of a directory are created from the
information from the eventfs_inode).

My thought process for doing it this way was if a child created a dentry
but the parent did not. But I don't think that can happen, right? So all I
may need to do is to check if the ei->dentry exists for the ei that is
being deleted, and after marking it and all its children as "freed", I can
then call simple_recursive_removal() on the top ei->dentry if it exists, as
that will guarantee to get all the dentries of any of the children that
exist. Right?

-- Steve



Re: [PATCH] eventfs: Process deletion of dentry more thoroughly

2023-10-31 Thread Al Viro
On Tue, Oct 31, 2023 at 02:47:03PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> Looking at how dentry is removed via the tracefs system, I found that
> eventfs does not do everything that it did under tracefs. The tracefs
> removal of a dentry calls simple_recursive_removal() that does a lot more
> than a simple d_invalidate().

Umm...  Is there any reason not to use simple_recursive_removal() there?



Re: [PATCH for-next] tracing/kprobes: Add symbol counting check when module loads

2023-10-31 Thread Google
Hi,

On Tue, 31 Oct 2023 23:24:43 +0200
Francis Laniel  wrote:

> > @@ -729,17 +744,55 @@ static int count_mod_symbols(void *data, const char
> > *name, unsigned long unused) return 0;
> >  }
> > 
> > -static unsigned int number_of_same_symbols(char *func_name)
> > +static unsigned int number_of_same_symbols(const char *mod, const char
> > *func_name) {
> > struct sym_count_ctx ctx = { .count = 0, .name = func_name };
> > 
> > -   kallsyms_on_each_match_symbol(count_symbols, func_name, &ctx.count);
> > +   if (!mod)
> > +   kallsyms_on_each_match_symbol(count_symbols, func_name, 
> &ctx.count);
> > 
> > -   module_kallsyms_on_each_symbol(NULL, count_mod_symbols, &ctx);
> > +   module_kallsyms_on_each_symbol(mod, count_mod_symbols, &ctx);
> 
> I may be missing something here or reviewing too quickly.
> Wouldn't this function return count to be 0 if func_name is only part of the 
> module named mod?

No, please read below.

> Indeed, if the function is not in kernel symbol, 
> kallsyms_on_each_match_symbol() will not loop.
> And, by giving mod to module_kallsyms_on_each_symbol(), the corresponding 
> module will be skipped, so count_mob_symbols() would not be called.
> Hence, we would have 0 as count, which would lead to ENOENT later.

Would you mean the case func_name is on the specific module?
If 'mod' is specified, module_kallsyms_on_each_symbol() only loops on
symbols in the module names 'mod'.

int module_kallsyms_on_each_symbol(const char *modname,
   int (*fn)(void *, const char *, unsigned 
long),
   void *data)
{
struct module *mod;
unsigned int i;
int ret = 0;

mutex_lock(&module_mutex);
list_for_each_entry(mod, &modules, list) {
struct mod_kallsyms *kallsyms;

if (mod->state == MODULE_STATE_UNFORMED)
continue;

if (modname && strcmp(modname, mod->name))
continue;
...

So with above change, 'if mod is not specified, search the symbols in kernel 
and all modules. If mod is sepecified, search the symbol on the specific
module'.

Thus, "if func_name is only part of the module named mod", the
module_kallsyms_on_each_symbol() will count the 'func_name' in 'mod' module
correctly.

Thank you,


Thank you,

-- 
Masami Hiramatsu (Google) 



[PATCH v5 6/7] eventfs: Delete eventfs_inode when the last dentry is freed

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

There exists a race between holding a reference of an eventfs_inode dentry
and the freeing of the eventfs_inode. If user space has a dentry held long
enough, it may still be able to access the dentry's eventfs_inode after it
has been freed.

To prevent this, have he eventfs_inode freed via the last dput() (or via
RCU if the eventfs_inode does not have a dentry).

This means reintroducing the eventfs_inode del_list field at a temporary
place to put the eventfs_inode. It needs to mark it as freed (via the
list) but also must invalidate the dentry immediately as the return from
eventfs_remove_dir() expects that they are. But the dentry invalidation
must not be called under the eventfs_mutex, so it must be done after the
eventfs_inode is marked as free (put on a deletion list).

Cc: sta...@vger.kernel.org
Fixes: 5bdcd5f5331a2 ("eventfs: Implement removal of meta data from eventfs")
Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 184 +--
 fs/tracefs/internal.h|   2 +
 2 files changed, 84 insertions(+), 102 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 87a8aaeda231..827ca152cfbe 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -85,8 +85,7 @@ static int eventfs_set_attr(struct mnt_idmap *idmap, struct 
dentry *dentry,
 
mutex_lock(&eventfs_mutex);
ei = dentry->d_fsdata;
-   /* The LSB is set when the eventfs_inode is being freed */
-   if (((unsigned long)ei & 1UL) || ei->is_freed) {
+   if (ei->is_freed) {
/* Do not allow changes if the event is about to be removed. */
mutex_unlock(&eventfs_mutex);
return -ENODEV;
@@ -276,35 +275,17 @@ static void free_ei(struct eventfs_inode *ei)
 void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry 
*dentry)
 {
struct tracefs_inode *ti_parent;
-   struct eventfs_inode *ei_child, *tmp;
struct eventfs_inode *ei;
int i;
 
/* The top level events directory may be freed by this */
if (unlikely(ti->flags & TRACEFS_EVENT_TOP_INODE)) {
-   LIST_HEAD(ef_del_list);
-
mutex_lock(&eventfs_mutex);
-
ei = ti->private;
-
-   /* Record all the top level files */
-   list_for_each_entry_srcu(ei_child, &ei->children, list,
-lockdep_is_held(&eventfs_mutex)) {
-   list_add_tail(&ei_child->del_list, &ef_del_list);
-   }
-
/* Nothing should access this, but just in case! */
ti->private = NULL;
-
mutex_unlock(&eventfs_mutex);
 
-   /* Now safely free the top level files and their children */
-   list_for_each_entry_safe(ei_child, tmp, &ef_del_list, del_list) 
{
-   list_del(&ei_child->del_list);
-   eventfs_remove_dir(ei_child);
-   }
-
free_ei(ei);
return;
}
@@ -319,14 +300,6 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, 
struct dentry *dentry)
if (!ei)
goto out;
 
-   /*
-* If ei was freed, then the LSB bit is set for d_fsdata.
-* But this should not happen, as it should still have a
-* ref count that prevents it. Warn in case it does.
-*/
-   if (WARN_ON_ONCE((unsigned long)ei & 1))
-   goto out;
-
/* This could belong to one of the files of the ei */
if (ei->dentry != dentry) {
for (i = 0; i < ei->nr_entries; i++) {
@@ -336,6 +309,8 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, 
struct dentry *dentry)
if (WARN_ON_ONCE(i == ei->nr_entries))
goto out;
ei->d_children[i] = NULL;
+   } else if (ei->is_freed) {
+   free_ei(ei);
} else {
ei->dentry = NULL;
}
@@ -962,13 +937,79 @@ struct eventfs_inode *eventfs_create_events_dir(const 
char *name, struct dentry
return ERR_PTR(-ENOMEM);
 }
 
+static LLIST_HEAD(free_list);
+
+static void eventfs_workfn(struct work_struct *work)
+{
+struct eventfs_inode *ei, *tmp;
+struct llist_node *llnode;
+
+   llnode = llist_del_all(&free_list);
+llist_for_each_entry_safe(ei, tmp, llnode, llist) {
+   /* This dput() matches the dget() from unhook_dentry() */
+   for (int i = 0; i < ei->nr_entries; i++) {
+   if (ei->d_children[i])
+   dput(ei->d_children[i]);
+   }
+   /* This should only get here if it had a dentry */
+   if (!WARN_ON_ONCE(!ei->dentry))
+   dput(ei->dentry);
+}
+}
+
+static DECLARE_WORK(eventfs_work, eventfs_workfn);
+
 static void free_rcu_ei(struct rcu_

[PATCH v5 7/7] eventfs: Remove special processing of dput() of events directory

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The top level events directory is no longer special with regards to how it
should be delete. Remove the extra processing for it in
eventfs_set_ei_status_free().

Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 19 ++-
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 827ca152cfbe..7cf8f5ebaae7 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -274,28 +274,11 @@ static void free_ei(struct eventfs_inode *ei)
  */
 void eventfs_set_ei_status_free(struct tracefs_inode *ti, struct dentry 
*dentry)
 {
-   struct tracefs_inode *ti_parent;
struct eventfs_inode *ei;
int i;
 
-   /* The top level events directory may be freed by this */
-   if (unlikely(ti->flags & TRACEFS_EVENT_TOP_INODE)) {
-   mutex_lock(&eventfs_mutex);
-   ei = ti->private;
-   /* Nothing should access this, but just in case! */
-   ti->private = NULL;
-   mutex_unlock(&eventfs_mutex);
-
-   free_ei(ei);
-   return;
-   }
-
mutex_lock(&eventfs_mutex);
 
-   ti_parent = get_tracefs(dentry->d_parent->d_inode);
-   if (!ti_parent || !(ti_parent->flags & TRACEFS_EVENT_INODE))
-   goto out;
-
ei = dentry->d_fsdata;
if (!ei)
goto out;
@@ -920,6 +903,8 @@ struct eventfs_inode *eventfs_create_events_dir(const char 
*name, struct dentry
inode->i_op = &eventfs_root_dir_inode_operations;
inode->i_fop = &eventfs_file_operations;
 
+   dentry->d_fsdata = ei;
+
/* directory inodes start off with i_nlink == 2 (for "." entry) */
inc_nlink(inode);
d_instantiate(dentry, inode);
-- 
2.42.0



[PATCH v5 3/7] eventfs: Test for ei->is_freed when accessing ei->dentry

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The eventfs_inode (ei) is protected by SRCU, but the ei->dentry is not. It
is protected by the eventfs_mutex. Anytime the eventfs_mutex is released,
and access to the ei->dentry needs to be done, it should first check if
ei->is_freed is set under the eventfs_mutex. If it is, then the ei->dentry
is invalid and must not be used. The ei->dentry must only be accessed
under the eventfs_mutex and after checking if ei->is_freed is set.

When the ei is being freed, it will (under the eventfs_mutex) set is_freed
and at the same time move the dentry to a free list to be cleared after
the eventfs_mutex is released. This means that any access to the
ei->dentry must check first if ei->is_freed is set, because if it is, then
the dentry is on its way to be freed.

Also add comments to describe this better.

Link: 
https://lore.kernel.org/all/CA+G9fYt6pY+tMZEOg=soeywqoe19fgp3ur15sgowkdk+_x8...@mail.gmail.com/
Link: 
https://lore.kernel.org/all/CA+G9fYuDP3hVQ3t7FfrBAjd_WFVSurMgCepTxunSJf=MTe=6...@mail.gmail.com/

Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Reported-by: Linux Kernel Functional Testing 
Reported-by: Naresh Kamboju 
Reported-by: Beau Belgrave 
Reviewed-by: Masami Hiramatsu (Google) 
Tested-by: Linux Kernel Functional Testing 
Tested-by: Naresh Kamboju 
Tested-by: Beau Belgrave 
Signed-off-by: Steven Rostedt (Google) 
---
Changes since v4: https://lkml.kernel.org/r/20231031193428.347017...@goodmis.org

- Rebased to this series

 fs/tracefs/event_inode.c | 45 ++--
 fs/tracefs/internal.h|  3 ++-
 2 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 0331d9bd568b..b28f240bbb6d 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -24,7 +24,20 @@
 #include 
 #include "internal.h"
 
+/*
+ * eventfs_mutex protects the eventfs_inode (ei) dentry. Any access
+ * to the ei->dentry must be done under this mutex and after checking
+ * if ei->is_freed is not set. The ei->dentry is released under the
+ * mutex at the same time ei->is_freed is set. If ei->is_freed is set
+ * then the ei->dentry is invalid.
+ */
 static DEFINE_MUTEX(eventfs_mutex);
+
+/*
+ * The eventfs_inode (ei) itself is protected by SRCU. It is released from
+ * its parent's list and will have is_freed set (under eventfs_mutex).
+ * After the SRCU grace period is over, the ei may be freed.
+ */
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
 static struct dentry *eventfs_root_lookup(struct inode *dir,
@@ -239,6 +252,10 @@ create_file_dentry(struct eventfs_inode *ei, struct dentry 
**e_dentry,
bool invalidate = false;
 
mutex_lock(&eventfs_mutex);
+   if (ei->is_freed) {
+   mutex_unlock(&eventfs_mutex);
+   return NULL;
+   }
/* If the e_dentry already has a dentry, use it */
if (*e_dentry) {
/* lookup does not need to up the ref count */
@@ -312,6 +329,8 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
struct eventfs_inode *ei_child;
struct tracefs_inode *ti;
 
+   lockdep_assert_held(&eventfs_mutex);
+
/* srcu lock already held */
/* fill parent-child relation */
list_for_each_entry_srcu(ei_child, &ei->children, list,
@@ -325,6 +344,7 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
 
 /**
  * create_dir_dentry - Create a directory dentry for the eventfs_inode
+ * @pei: The eventfs_inode parent of ei.
  * @ei: The eventfs_inode to create the directory for
  * @parent: The dentry of the parent of this directory
  * @lookup: True if this is called by the lookup code
@@ -332,12 +352,17 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
  * This creates and attaches a directory dentry to the eventfs_inode @ei.
  */
 static struct dentry *
-create_dir_dentry(struct eventfs_inode *ei, struct dentry *parent, bool lookup)
+create_dir_dentry(struct eventfs_inode *pei, struct eventfs_inode *ei,
+ struct dentry *parent, bool lookup)
 {
bool invalidate = false;
struct dentry *dentry = NULL;
 
mutex_lock(&eventfs_mutex);
+   if (pei->is_freed || ei->is_freed) {
+   mutex_unlock(&eventfs_mutex);
+   return NULL;
+   }
if (ei->dentry) {
/* If the dentry already has a dentry, use it */
dentry = ei->dentry;
@@ -440,7 +465,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
 */
mutex_lock(&eventfs_mutex);
ei = READ_ONCE(ti->private);
-   if (ei)
+   if (ei && !ei->is_freed)
ei_dentry = READ_ONCE(ei->dentry);
mutex_unlock(&eventfs_mutex);
 
@@ -454,7 +479,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
if (strcmp(ei_child->name, name) != 0)
continue;
ret = simple_look

[PATCH v5 4/7] eventfs: Save ownership and mode

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

Now that inodes and dentries are created on the fly, they are also
reclaimed on memory pressure. Since the ownership and file mode are saved
in the inode, if they are freed, any changes to the ownership and mode
will be lost.

To counter this, if the user changes the permissions or ownership, save
them, and when creating the inodes again, restore those changes.

Signed-off-by: Steven Rostedt (Google) 
---
Changes since v4: https://lkml.kernel.org/r/20231031193428.558586...@goodmis.org

- Rebased to this series

 fs/tracefs/event_inode.c | 148 +++
 fs/tracefs/internal.h|  16 +
 2 files changed, 151 insertions(+), 13 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index b28f240bbb6d..d1683bf6d316 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -40,6 +40,15 @@ static DEFINE_MUTEX(eventfs_mutex);
  */
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
+/* Mode is unsigned short, use the upper bits for flags */
+enum {
+   EVENTFS_SAVE_MODE   = BIT(16),
+   EVENTFS_SAVE_UID= BIT(17),
+   EVENTFS_SAVE_GID= BIT(18),
+};
+
+#define EVENTFS_MODE_MASK  (EVENTFS_SAVE_MODE - 1)
+
 static struct dentry *eventfs_root_lookup(struct inode *dir,
  struct dentry *dentry,
  unsigned int flags);
@@ -47,8 +56,89 @@ static int dcache_dir_open_wrapper(struct inode *inode, 
struct file *file);
 static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx);
 static int eventfs_release(struct inode *inode, struct file *file);
 
+static void update_attr(struct eventfs_attr *attr, struct iattr *iattr)
+{
+   unsigned int ia_valid = iattr->ia_valid;
+
+   if (ia_valid & ATTR_MODE) {
+   attr->mode = (attr->mode & ~EVENTFS_MODE_MASK) |
+   (iattr->ia_mode & EVENTFS_MODE_MASK) |
+   EVENTFS_SAVE_MODE;
+   }
+   if (ia_valid & ATTR_UID) {
+   attr->mode |= EVENTFS_SAVE_UID;
+   attr->uid = iattr->ia_uid;
+   }
+   if (ia_valid & ATTR_GID) {
+   attr->mode |= EVENTFS_SAVE_GID;
+   attr->gid = iattr->ia_gid;
+   }
+}
+
+static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
+   struct iattr *iattr)
+{
+   const struct eventfs_entry *entry;
+   struct eventfs_inode *ei;
+   const char *name;
+   int ret;
+
+   mutex_lock(&eventfs_mutex);
+   ei = dentry->d_fsdata;
+   /* The LSB is set when the eventfs_inode is being freed */
+   if (((unsigned long)ei & 1UL) || ei->is_freed) {
+   /* Do not allow changes if the event is about to be removed. */
+   mutex_unlock(&eventfs_mutex);
+   return -ENODEV;
+   }
+
+   /* Preallocate the children mode array if necessary */
+   if (!(dentry->d_inode->i_mode & S_IFDIR)) {
+   if (!ei->entry_attrs) {
+   ei->entry_attrs = kzalloc(sizeof(*ei->entry_attrs) * 
ei->nr_entries,
+ GFP_KERNEL);
+   if (!ei->entry_attrs) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   }
+   }
+
+   ret = simple_setattr(idmap, dentry, iattr);
+   if (ret < 0)
+   goto out;
+
+   /*
+* If this is a dir, then update the ei cache, only the file
+* mode is saved in the ei->m_children, and the ownership is
+* determined by the parent directory.
+*/
+   if (dentry->d_inode->i_mode & S_IFDIR) {
+   update_attr(&ei->attr, iattr);
+
+   } else {
+   name = dentry->d_name.name;
+
+   for (int i = 0; i < ei->nr_entries; i++) {
+   entry = &ei->entries[i];
+   if (strcmp(name, entry->name) == 0) {
+   update_attr(&ei->entry_attrs[i], iattr);
+   break;
+   }
+   }
+   }
+ out:
+   mutex_unlock(&eventfs_mutex);
+   return ret;
+}
+
 static const struct inode_operations eventfs_root_dir_inode_operations = {
.lookup = eventfs_root_lookup,
+   .setattr= eventfs_set_attr,
+};
+
+static const struct inode_operations eventfs_file_inode_operations = {
+   .setattr= eventfs_set_attr,
 };
 
 static const struct file_operations eventfs_file_operations = {
@@ -59,10 +149,30 @@ static const struct file_operations 
eventfs_file_operations = {
.release= eventfs_release,
 };
 
+static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, 
umode_t mode)
+{
+   if (!attr) {
+   inode->i_mode = mode;
+   return;
+   }
+
+   if (attr->mod

[PATCH v5 5/7] eventfs: Hold eventfs_mutex when calling callback functions

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The callback function that is used to create inodes and dentries is not
protected by anything and the data that is passed to it could become
stale. After eventfs_remove_dir() is called by the tracing system, it is
free to remove the events that are associated to that directory.
Unfortunately, that means the callbacks must not be called after that.

 CPU0   CPU1
    
 eventfs_root_lookup() {
 eventfs_remove_dir() {
  mutex_lock(&event_mutex);
  ei->is_freed = set;
  mutex_unlock(&event_mutex);
 }
 kfree(event_call);

for (...) {
  entry = &ei->entries[i];
  r = entry->callback() {
  call = data;  // call == event_call above
  if (call->flags ...)

 [ USE AFTER FREE BUG ]

The safest way to protect this is to wrap the callback with:

 mutex_lock(&eventfs_mutex);
 if (!ei->is_freed)
 r = entry->callback();
 else
 r = -1;
 mutex_unlock(&eventfs_mutex);

This will make sure that the callback will not be called after it is
freed. But now it needs to be known that the callback is called while
holding internal eventfs locks, and that it must not call back into the
eventfs / tracefs system. There's no reason it should anyway, but document
that as well.

Link: 
https://lore.kernel.org/all/CA+G9fYu9GOEbD=rR5eMR-=HJ8H6rMsbzDC2ZY5=y50wpwae...@mail.gmail.com/

Reported-by: Linux Kernel Functional Testing 
Reported-by: Naresh Kamboju 
Tested-by: Linux Kernel Functional Testing 
Tested-by: Naresh Kamboju 
Signed-off-by: Steven Rostedt (Google) 
---
Changes since v1/v4: 
https://lore.kernel.org/linux-trace-kernel/20231030114047.759c7...@gandalf.local.home

- Rebased to this series (which is was v4)

 fs/tracefs/event_inode.c | 22 ++--
 include/linux/tracefs.h  | 43 
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index d1683bf6d316..87a8aaeda231 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -615,7 +615,13 @@ static struct dentry *eventfs_root_lookup(struct inode 
*dir,
entry = &ei->entries[i];
if (strcmp(name, entry->name) == 0) {
void *cdata = data;
-   r = entry->callback(name, &mode, &cdata, &fops);
+   mutex_lock(&eventfs_mutex);
+   /* If ei->is_freed, then the event itself may be too */
+   if (!ei->is_freed)
+   r = entry->callback(name, &mode, &cdata, &fops);
+   else
+   r = -1;
+   mutex_unlock(&eventfs_mutex);
if (r <= 0)
continue;
ret = simple_lookup(dir, dentry, flags);
@@ -749,7 +755,13 @@ static int dcache_dir_open_wrapper(struct inode *inode, 
struct file *file)
void *cdata = data;
entry = &ei->entries[i];
name = entry->name;
-   r = entry->callback(name, &mode, &cdata, &fops);
+   mutex_lock(&eventfs_mutex);
+   /* If ei->is_freed, then the event itself may be too */
+   if (!ei->is_freed)
+   r = entry->callback(name, &mode, &cdata, &fops);
+   else
+   r = -1;
+   mutex_unlock(&eventfs_mutex);
if (r <= 0)
continue;
d = create_file_dentry(ei, i, parent, name, mode, cdata, fops, 
false);
@@ -819,6 +831,10 @@ static int dcache_readdir_wrapper(struct file *file, 
struct dir_context *ctx)
  *   data = A pointer to @data, and the callback may replace it, which will
  * cause the file created to pass the new data to the open() call.
  *   fops = the fops to use for the created file.
+ *
+ * NB. @callback is called while holding internal locks of the eventfs
+ * system. The callback must not call any code that might also call into
+ * the tracefs or eventfs system or it will risk creating a deadlock.
  */
 struct eventfs_inode *eventfs_create_dir(const char *name, struct 
eventfs_inode *parent,
 const struct eventfs_entry *entries,
@@ -878,6 +894,8 @@ struct eventfs_inode *eventfs_create_dir(const char *name, 
struct eventfs_inode
  * @data: The default data to pass to the files (an entry may override it).
  *
  * This function creates the top of the trace event directory.
+ *
+ * See eventfs_create_dir() for use of @entries.
  */
 struct eventfs_inode *eventfs_create_events_dir(const char *name, struct 
dentry *parent,

[PATCH v5 2/7] eventfs: Have a free_ei() that just frees the eventfs_inode

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

As the eventfs_inode is freed in two different locations, make a helper
function free_ei() to make sure all the allocated fields of the
eventfs_inode is freed.

This requires renaming the existing free_ei() which is called by the srcu
handler to free_rcu_ei() and have free_ei() just do the freeing, where
free_rcu_ei() will call it.

Signed-off-by: Steven Rostedt (Google) 
---
Changse since v4: 
https://lore.kernel.org/all/20231031193428.133533...@goodmis.org/T/#u

- Rebased to this patch series

 fs/tracefs/event_inode.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 2c2c75b2ad73..0331d9bd568b 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -129,6 +129,13 @@ static struct dentry *create_dir(const char *name, struct 
dentry *parent)
return eventfs_end_creating(dentry);
 }
 
+static void free_ei(struct eventfs_inode *ei)
+{
+   kfree_const(ei->name);
+   kfree(ei->d_children);
+   kfree(ei);
+}
+
 /**
  * eventfs_set_ei_status_free - remove the dentry reference from an 
eventfs_inode
  * @ti: the tracefs_inode of the dentry
@@ -168,9 +175,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, 
struct dentry *dentry)
eventfs_remove_dir(ei_child);
}
 
-   kfree_const(ei->name);
-   kfree(ei->d_children);
-   kfree(ei);
+   free_ei(ei);
return;
}
 
@@ -784,13 +789,11 @@ struct eventfs_inode *eventfs_create_events_dir(const 
char *name, struct dentry
return ERR_PTR(-ENOMEM);
 }
 
-static void free_ei(struct rcu_head *head)
+static void free_rcu_ei(struct rcu_head *head)
 {
struct eventfs_inode *ei = container_of(head, struct eventfs_inode, 
rcu);
 
-   kfree_const(ei->name);
-   kfree(ei->d_children);
-   kfree(ei);
+   free_ei(ei);
 }
 
 /**
@@ -883,7 +886,7 @@ void eventfs_remove_dir(struct eventfs_inode *ei)
for (i = 0; i < ei->nr_entries; i++)
unhook_dentry(&ei->d_children[i], &dentry_list);
unhook_dentry(&ei->dentry, &dentry_list);
-   call_srcu(&eventfs_srcu, &ei->rcu, free_ei);
+   call_srcu(&eventfs_srcu, &ei->rcu, free_rcu_ei);
}
mutex_unlock(&eventfs_mutex);
 
-- 
2.42.0



[PATCH v5 0/7] eventfs: Fixing dynamic creation

2023-10-31 Thread Steven Rostedt
I found an issue with using a union between the rcu list head and
the "is_freed" boolean (word). That is, rcu list is a single link
list where the last element has a NULL pointer. That means, if
the eventfs_inode is the last element (which it likely will be)
it will not have its flag set to is_free.

Instead use a bit from nr_entries to be the is_freed flag.

Changes since v4: 
https://lore.kernel.org/linux-trace-kernel/20231031193109.018322...@goodmis.org/

- Updated on top of the change to separate out is_freed from the union

- Add another fix to only delete the eventfs_inode after the last
  dentry has cleared its reference (there's more races there without
  doing that)

- Make the top level eventfs directory the same as the rest in
  being removed.

Steven Rostedt (Google) (7):
  eventfs: Remove "is_freed" union with rcu head
  eventfs: Have a free_ei() that just frees the eventfs_inode
  eventfs: Test for ei->is_freed when accessing ei->dentry
  eventfs: Save ownership and mode
  eventfs: Hold eventfs_mutex when calling callback functions
  eventfs: Delete eventfs_inode when the last dentry is freed
  eventfs: Remove special processing of dput() of events directory


 fs/tracefs/event_inode.c | 431 +++
 fs/tracefs/internal.h|  27 ++-
 include/linux/tracefs.h  |  43 +
 3 files changed, 353 insertions(+), 148 deletions(-)



[PATCH v5 1/7] eventfs: Remove "is_freed" union with rcu head

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The eventfs_inode->is_freed was a union with the rcu_head with the
assumption that when it was on the srcu list the head would contain a
pointer which would make "is_freed" true. But that was a wrong assumption
as the rcu head is a single link list where the last element is NULL.

Instead, split the nr_entries integer so that "is_freed" is one bit and
the nr_entries is the next 31 bits. As there shouldn't be more than 10
(currently there's at most 5 to 7 depending on the config), this should
not be a problem.

Cc: sta...@vger.kernel.org
Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions")
Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 2 ++
 fs/tracefs/internal.h| 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 754885dfe71c..2c2c75b2ad73 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -824,6 +824,8 @@ static void eventfs_remove_rec(struct eventfs_inode *ei, 
struct list_head *head,
eventfs_remove_rec(ei_child, head, level + 1);
}
 
+   ei->is_freed = 1;
+
list_del_rcu(&ei->list);
list_add_tail(&ei->del_list, head);
 }
diff --git a/fs/tracefs/internal.h b/fs/tracefs/internal.h
index 64fde9490f52..c7d88aaa949f 100644
--- a/fs/tracefs/internal.h
+++ b/fs/tracefs/internal.h
@@ -23,6 +23,7 @@ struct tracefs_inode {
  * @d_parent:   pointer to the parent's dentry
  * @d_children: The array of dentries to represent the files when created
  * @data:  The private data to pass to the callbacks
+ * @is_freed:  Flag set if the eventfs is on its way to be freed
  * @nr_entries: The number of items in @entries
  */
 struct eventfs_inode {
@@ -38,14 +39,13 @@ struct eventfs_inode {
 * Union - used for deletion
 * @del_list:   list of eventfs_inode to delete
 * @rcu:eventfs_inode to delete in RCU
-* @is_freed:   node is freed if one of the above is set
 */
union {
struct list_headdel_list;
struct rcu_head rcu;
-   unsigned long   is_freed;
};
-   int nr_entries;
+   unsigned intis_freed:1;
+   unsigned intnr_entries:31;
 };
 
 static inline struct tracefs_inode *get_tracefs(const struct inode *inode)
-- 
2.42.0



Re: [PATCH for-next] tracing/kprobes: Add symbol counting check when module loads

2023-10-31 Thread Francis Laniel
Hi!


Le dimanche 29 octobre 2023, 05:10:46 EET Masami Hiramatsu (Google) a écrit :
> From: Masami Hiramatsu (Google) 
> 
> Check the number of probe target symbols in the target module when
> the module is loaded. If the probe is not on the unique name symbols
> in the module, it will be rejected at that point.
> 
> Note that the symbol which has a unique name in the target module,
> it will be accepted even if there are same-name symbols in the
> kernel or other modules,

I am wondering about symbols which are only part of one specific module.
I wrote a comment about it below, please let me know what you think about it.

> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  kernel/trace/trace_kprobe.c |  112
> ++- 1 file changed, 68
> insertions(+), 44 deletions(-)
> 
> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index e834f149695b..90cf2219adb4 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -670,6 +670,21 @@ static int register_trace_kprobe(struct trace_kprobe
> *tk) return ret;
>  }
> 
> +static int validate_module_probe_symbol(const char *modname, const char
> *symbol); +
> +static int register_module_trace_kprobe(struct module *mod, struct
> trace_kprobe *tk) +{
> + const char *p;
> + int ret = 0;
> +
> + p = strchr(trace_kprobe_symbol(tk), ':');
> + if (p)
> + ret = validate_module_probe_symbol(module_name(mod), p++);
> + if (!ret)
> + ret = register_trace_kprobe(tk);
> + return ret;
> +}
> +
>  /* Module notifier call back, checking event on the module */
>  static int trace_kprobe_module_callback(struct notifier_block *nb,
>  unsigned long val, void *data)
> @@ -688,7 +703,7 @@ static int trace_kprobe_module_callback(struct
> notifier_block *nb, if (trace_kprobe_within_module(tk, mod)) {
>   /* Don't need to check busy - this should have gone. */
>   __unregister_trace_kprobe(tk);
> - ret = __register_trace_kprobe(tk);
> + ret = register_module_trace_kprobe(mod, tk);
>   if (ret)
>   pr_warn("Failed to re-register probe %s on %s: 
%d\n",
>   trace_probe_name(&tk->tp),
> @@ -729,17 +744,55 @@ static int count_mod_symbols(void *data, const char
> *name, unsigned long unused) return 0;
>  }
> 
> -static unsigned int number_of_same_symbols(char *func_name)
> +static unsigned int number_of_same_symbols(const char *mod, const char
> *func_name) {
>   struct sym_count_ctx ctx = { .count = 0, .name = func_name };
> 
> - kallsyms_on_each_match_symbol(count_symbols, func_name, &ctx.count);
> + if (!mod)
> + kallsyms_on_each_match_symbol(count_symbols, func_name, 
&ctx.count);
> 
> - module_kallsyms_on_each_symbol(NULL, count_mod_symbols, &ctx);
> + module_kallsyms_on_each_symbol(mod, count_mod_symbols, &ctx);

I may be missing something here or reviewing too quickly.
Wouldn't this function return count to be 0 if func_name is only part of the 
module named mod?
Indeed, if the function is not in kernel symbol, 
kallsyms_on_each_match_symbol() will not loop.
And, by giving mod to module_kallsyms_on_each_symbol(), the corresponding 
module will be skipped, so count_mob_symbols() would not be called.
Hence, we would have 0 as count, which would lead to ENOENT later.

>   return ctx.count;
>  }
> 
> +static int validate_module_probe_symbol(const char *modname, const char
> *symbol) +{
> + unsigned int count = number_of_same_symbols(modname, symbol);
> +
> + if (count > 1) {
> + /*
> +  * Users should use ADDR to remove the ambiguity of
> +  * using KSYM only.
> +  */
> + return -EADDRNOTAVAIL;
> + } else if (count == 0) {
> + /*
> +  * We can return ENOENT earlier than when register the
> +  * kprobe.
> +  */
> + return -ENOENT;
> + }
> + return 0;
> +}
> +
> +static int validate_probe_symbol(char *symbol)
> +{
> + char *mod = NULL, *p;
> + int ret;
> +
> + p = strchr(symbol, ':');
> + if (p) {
> + mod = symbol;
> + symbol = p + 1;
> + *p = '\0';
> + }
> + ret = validate_module_probe_symbol(mod, symbol);
> + if (p)
> + *p = ':';
> + return ret;
> +}
> +
>  static int __trace_kprobe_create(int argc, const char *argv[])
>  {
>   /*
> @@ -859,6 +912,14 @@ static int __trace_kprobe_create(int argc, const char
> *argv[]) trace_probe_log_err(0, BAD_PROBE_ADDR);
>   goto parse_error;
>   }
> + ret = validate_probe_symbol(symbol);
> + if (ret) {
> + if (ret == -EADDRNOTAVAIL)
> + trace_probe_log_err(0, NON_UNIQ_SYMBO

[PATCH v4 2/3] eventfs: Test for ei->is_freed when accessing ei->dentry

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The eventfs_inode (ei) is protected by SRCU, but the ei->dentry is not. It
is protected by the eventfs_mutex. Anytime the eventfs_mutex is released,
and access to the ei->dentry needs to be done, it should first check if
ei->is_freed is set under the eventfs_mutex. If it is, then the ei->dentry
is invalid and must not be used. The ei->dentry must only be accessed
under the eventfs_mutex and after checking if ei->is_freed is set.

When the ei is being freed, it will (under the eventfs_mutex) set is_freed
and at the same time move the dentry to a free list to be cleared after
the eventfs_mutex is released. This means that any access to the
ei->dentry must check first if ei->is_freed is set, because if it is, then
the dentry is on its way to be freed.

Also add comments to describe this better.

Link: 
https://lore.kernel.org/all/CA+G9fYt6pY+tMZEOg=soeywqoe19fgp3ur15sgowkdk+_x8...@mail.gmail.com/
Link: 
https://lore.kernel.org/all/CA+G9fYuDP3hVQ3t7FfrBAjd_WFVSurMgCepTxunSJf=MTe=6...@mail.gmail.com/

Cc: Mark Rutland 
Cc: "Arnd Bergmann" 
Cc: "Ajay Kaher" 
Cc: Andrew Morton 
Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Reported-by: Linux Kernel Functional Testing 
Reported-by: Naresh Kamboju 
Reported-by: Beau Belgrave 
Reviewed-by: Masami Hiramatsu (Google) 
Tested-by: Linux Kernel Functional Testing 
Tested-by: Naresh Kamboju 
Tested-by: Beau Belgrave 
Signed-off-by: Steven Rostedt (Google) 
---
Changes since v2: 
https://lore.kernel.org/linux-trace-kernel/20231028164650.4f5ea...@rorschach.local.home/

- Separated out free_ei() into its own patch

 fs/tracefs/event_inode.c | 50 +++-
 fs/tracefs/internal.h|  3 ++-
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 76323f48d5f0..362a2dba82c2 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -24,7 +24,20 @@
 #include 
 #include "internal.h"
 
+/*
+ * eventfs_mutex protects the eventfs_inode (ei) dentry. Any access
+ * to the ei->dentry must be done under this mutex and after checking
+ * if ei->is_freed is not set. The ei->dentry is released under the
+ * mutex at the same time ei->is_freed is set. If ei->is_freed is set
+ * then the ei->dentry is invalid.
+ */
 static DEFINE_MUTEX(eventfs_mutex);
+
+/*
+ * The eventfs_inode (ei) itself is protected by SRCU. It is released from
+ * its parent's list and will have is_freed set (under eventfs_mutex).
+ * After the SRCU grace period is over, the ei may be freed.
+ */
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
 static struct dentry *eventfs_root_lookup(struct inode *dir,
@@ -239,6 +252,10 @@ create_file_dentry(struct eventfs_inode *ei, struct dentry 
**e_dentry,
bool invalidate = false;
 
mutex_lock(&eventfs_mutex);
+   if (ei->is_freed) {
+   mutex_unlock(&eventfs_mutex);
+   return NULL;
+   }
/* If the e_dentry already has a dentry, use it */
if (*e_dentry) {
/* lookup does not need to up the ref count */
@@ -312,6 +329,8 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
struct eventfs_inode *ei_child;
struct tracefs_inode *ti;
 
+   lockdep_assert_held(&eventfs_mutex);
+
/* srcu lock already held */
/* fill parent-child relation */
list_for_each_entry_srcu(ei_child, &ei->children, list,
@@ -325,6 +344,7 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
 
 /**
  * create_dir_dentry - Create a directory dentry for the eventfs_inode
+ * @pei: The eventfs_inode parent of ei.
  * @ei: The eventfs_inode to create the directory for
  * @parent: The dentry of the parent of this directory
  * @lookup: True if this is called by the lookup code
@@ -332,12 +352,17 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
  * This creates and attaches a directory dentry to the eventfs_inode @ei.
  */
 static struct dentry *
-create_dir_dentry(struct eventfs_inode *ei, struct dentry *parent, bool lookup)
+create_dir_dentry(struct eventfs_inode *pei, struct eventfs_inode *ei,
+ struct dentry *parent, bool lookup)
 {
bool invalidate = false;
struct dentry *dentry = NULL;
 
mutex_lock(&eventfs_mutex);
+   if (pei->is_freed || ei->is_freed) {
+   mutex_unlock(&eventfs_mutex);
+   return NULL;
+   }
if (ei->dentry) {
/* If the dentry already has a dentry, use it */
dentry = ei->dentry;
@@ -440,7 +465,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,
 */
mutex_lock(&eventfs_mutex);
ei = READ_ONCE(ti->private);
-   if (ei)
+   if (ei && !ei->is_freed)
ei_dentry = READ_ONCE(ei->dentry);
mutex_unlock(&eventfs_mutex);
 
@@ -454,7 +479,7 @@ static struct dentry *eventfs_root_lookup(struct inode *dir,

[PATCH v4 3/3] eventfs: Save ownership and mode

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

Now that inodes and dentries are created on the fly, they are also
reclaimed on memory pressure. Since the ownership and file mode are saved
in the inode, if they are freed, any changes to the ownership and mode
will be lost.

To counter this, if the user changes the permissions or ownership, save
them, and when creating the inodes again, restore those changes.

Cc: Masami Hiramatsu 
Cc: Mark Rutland 
Cc: Ajay Kaher 
Signed-off-by: Steven Rostedt (Google) 
---
Changes since v3: 
https://lore.kernel.org/linux-trace-kernel/20231031113627.2edfa...@gandalf.local.home

- Just rebased as part of this patch series

 fs/tracefs/event_inode.c | 148 +++
 fs/tracefs/internal.h|  16 +
 2 files changed, 151 insertions(+), 13 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 362a2dba82c2..81e7ccef2783 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -40,6 +40,15 @@ static DEFINE_MUTEX(eventfs_mutex);
  */
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
+/* Mode is unsigned short, use the upper bits for flags */
+enum {
+   EVENTFS_SAVE_MODE   = BIT(16),
+   EVENTFS_SAVE_UID= BIT(17),
+   EVENTFS_SAVE_GID= BIT(18),
+};
+
+#define EVENTFS_MODE_MASK  (EVENTFS_SAVE_MODE - 1)
+
 static struct dentry *eventfs_root_lookup(struct inode *dir,
  struct dentry *dentry,
  unsigned int flags);
@@ -47,8 +56,89 @@ static int dcache_dir_open_wrapper(struct inode *inode, 
struct file *file);
 static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx);
 static int eventfs_release(struct inode *inode, struct file *file);
 
+static void update_attr(struct eventfs_attr *attr, struct iattr *iattr)
+{
+   unsigned int ia_valid = iattr->ia_valid;
+
+   if (ia_valid & ATTR_MODE) {
+   attr->mode = (attr->mode & ~EVENTFS_MODE_MASK) |
+   (iattr->ia_mode & EVENTFS_MODE_MASK) |
+   EVENTFS_SAVE_MODE;
+   }
+   if (ia_valid & ATTR_UID) {
+   attr->mode |= EVENTFS_SAVE_UID;
+   attr->uid = iattr->ia_uid;
+   }
+   if (ia_valid & ATTR_GID) {
+   attr->mode |= EVENTFS_SAVE_GID;
+   attr->gid = iattr->ia_gid;
+   }
+}
+
+static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
+   struct iattr *iattr)
+{
+   const struct eventfs_entry *entry;
+   struct eventfs_inode *ei;
+   const char *name;
+   int ret;
+
+   mutex_lock(&eventfs_mutex);
+   ei = dentry->d_fsdata;
+   /* The LSB is set when the eventfs_inode is being freed */
+   if (((unsigned long)ei & 1UL) || ei->is_freed) {
+   /* Do not allow changes if the event is about to be removed. */
+   mutex_unlock(&eventfs_mutex);
+   return -ENODEV;
+   }
+
+   /* Preallocate the children mode array if necessary */
+   if (!(dentry->d_inode->i_mode & S_IFDIR)) {
+   if (!ei->entry_attrs) {
+   ei->entry_attrs = kzalloc(sizeof(*ei->entry_attrs) * 
ei->nr_entries,
+ GFP_KERNEL);
+   if (!ei->entry_attrs) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   }
+   }
+
+   ret = simple_setattr(idmap, dentry, iattr);
+   if (ret < 0)
+   goto out;
+
+   /*
+* If this is a dir, then update the ei cache, only the file
+* mode is saved in the ei->m_children, and the ownership is
+* determined by the parent directory.
+*/
+   if (dentry->d_inode->i_mode & S_IFDIR) {
+   update_attr(&ei->attr, iattr);
+
+   } else {
+   name = dentry->d_name.name;
+
+   for (int i = 0; i < ei->nr_entries; i++) {
+   entry = &ei->entries[i];
+   if (strcmp(name, entry->name) == 0) {
+   update_attr(&ei->entry_attrs[i], iattr);
+   break;
+   }
+   }
+   }
+ out:
+   mutex_unlock(&eventfs_mutex);
+   return ret;
+}
+
 static const struct inode_operations eventfs_root_dir_inode_operations = {
.lookup = eventfs_root_lookup,
+   .setattr= eventfs_set_attr,
+};
+
+static const struct inode_operations eventfs_file_inode_operations = {
+   .setattr= eventfs_set_attr,
 };
 
 static const struct file_operations eventfs_file_operations = {
@@ -59,10 +149,30 @@ static const struct file_operations 
eventfs_file_operations = {
.release= eventfs_release,
 };
 
+static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, 
umode_t mode)
+{
+   if (!attr

[PATCH v4 1/3] eventfs: Have a free_ei() that just frees the eventfs_inode

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

As the eventfs_inode is freed in two different locations, make a helper
function free_ei() to make sure all the allocated fields of the
eventfs_inode is freed.

This requires renaming the existing free_ei() which is called by the srcu
handler to free_rcu_ei() and have free_ei() just do the freeing, where
free_rcu_ei() will call it.

Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 754885dfe71c..76323f48d5f0 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -129,6 +129,13 @@ static struct dentry *create_dir(const char *name, struct 
dentry *parent)
return eventfs_end_creating(dentry);
 }
 
+static void free_ei(struct eventfs_inode *ei)
+{
+   kfree_const(ei->name);
+   kfree(ei->d_children);
+   kfree(ei);
+}
+
 /**
  * eventfs_set_ei_status_free - remove the dentry reference from an 
eventfs_inode
  * @ti: the tracefs_inode of the dentry
@@ -168,9 +175,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, 
struct dentry *dentry)
eventfs_remove_dir(ei_child);
}
 
-   kfree_const(ei->name);
-   kfree(ei->d_children);
-   kfree(ei);
+   free_ei(ei);
return;
}
 
@@ -784,13 +789,11 @@ struct eventfs_inode *eventfs_create_events_dir(const 
char *name, struct dentry
return ERR_PTR(-ENOMEM);
 }
 
-static void free_ei(struct rcu_head *head)
+static void free_rcu_ei(struct rcu_head *head)
 {
struct eventfs_inode *ei = container_of(head, struct eventfs_inode, 
rcu);
 
-   kfree_const(ei->name);
-   kfree(ei->d_children);
-   kfree(ei);
+   free_ei(ei);
 }
 
 /**
@@ -881,7 +884,7 @@ void eventfs_remove_dir(struct eventfs_inode *ei)
for (i = 0; i < ei->nr_entries; i++)
unhook_dentry(&ei->d_children[i], &dentry_list);
unhook_dentry(&ei->dentry, &dentry_list);
-   call_srcu(&eventfs_srcu, &ei->rcu, free_ei);
+   call_srcu(&eventfs_srcu, &ei->rcu, free_rcu_ei);
}
mutex_unlock(&eventfs_mutex);
 
-- 
2.42.0



[PATCH v4 0/3] eventfs: Fixing dynamic creation

2023-10-31 Thread Steven Rostedt
This is really just putting together two eventfs patches that were
stepping on each other in conflicts. I decided to rip out the free_ei()
part into its own patch (as that was what was confliciting.

This series is just a combination of:

  
https://lore.kernel.org/linux-trace-kernel/2023103325.50087...@gandalf.local.home/

and

 
https://lore.kernel.org/linux-trace-kernel/20231031113627.2edfa...@gandalf.local.home/

With the stripped out free_ei() patch.

Steven Rostedt (Google) (3):
  eventfs: Have a free_ei() that just frees the eventfs_inode
  eventfs: Test for ei->is_freed when accessing ei->dentry
  eventfs: Save ownership and mode


 fs/tracefs/event_inode.c | 217 +--
 fs/tracefs/internal.h|  19 -
 2 files changed, 208 insertions(+), 28 deletions(-)



[PATCH] tracing: Have the user copy of synthetic event address use correct context

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

A synthetic event is created by the synthetic event interface that can
read both user or kernel address memory. In reality, it reads any
arbitrary memory location from within the kernel. If the address space is
in USER (where CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE is set) then
it uses strncpy_from_user_nofault() to copy strings otherwise it uses
strncpy_from_kernel_nofault().

But since both functions use the same variable there's no annotation to
what that variable is (ie. __user). This makes sparse complain.

Quiet sparse by typecasting the strncpy_from_user_nofault() variable to
a __user pointer.

Cc: sta...@vger.kernel.org
Fixes: 0934ae9977c2 ("tracing: Fix reading strings from synthetic events");
Reported-by: kernel test robot 
Closes: 
https://lore.kernel.org/oe-kbuild-all/202311010013.fm8wtxa5-...@intel.com/
Signed-off-by: Steven Rostedt (Google) 
---
 kernel/trace/trace_events_synth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_events_synth.c 
b/kernel/trace/trace_events_synth.c
index 14cb275a0bab..846e02c0fb59 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -452,7 +452,7 @@ static unsigned int trace_string(struct synth_trace_event 
*entry,
 
 #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
if ((unsigned long)str_val < TASK_SIZE)
-   ret = strncpy_from_user_nofault(str_field, str_val, 
STR_VAR_LEN_MAX);
+   ret = strncpy_from_user_nofault(str_field, (const void 
__user *)str_val, STR_VAR_LEN_MAX);
else
 #endif
ret = strncpy_from_kernel_nofault(str_field, str_val, 
STR_VAR_LEN_MAX);
-- 
2.42.0




[PATCH] eventfs: Process deletion of dentry more thoroughly

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

Looking at how dentry is removed via the tracefs system, I found that
eventfs does not do everything that it did under tracefs. The tracefs
removal of a dentry calls simple_recursive_removal() that does a lot more
than a simple d_invalidate().

Have the same done on eventfs dentry:

 1. Set S_DEAD for directories
 2. Call clear_nlink() on the dentry inode
 3. Call any notifiers about the dentry removal

Cc: sta...@vger.kernel.org
Fixes: 5bdcd5f5331a2 ("eventfs: Implement removal of meta data from eventfs")
Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 4d2da7480e5f..ab807edaf538 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -856,6 +856,7 @@ static void unhook_dentry(struct dentry **dentry, struct 
dentry **list)
*dentry = NULL;
}
 }
+
 /**
  * eventfs_remove_dir - remove eventfs dir or file from list
  * @ei: eventfs_inode to be removed.
@@ -868,6 +869,7 @@ void eventfs_remove_dir(struct eventfs_inode *ei)
LIST_HEAD(ei_del_list);
struct dentry *dentry_list = NULL;
struct dentry *dentry;
+   struct inode *inode;
int i;
 
if (!ei)
@@ -891,7 +893,28 @@ void eventfs_remove_dir(struct eventfs_inode *ei)
ptr = (unsigned long)dentry->d_fsdata & ~1UL;
dentry_list = (struct dentry *)ptr;
dentry->d_fsdata = NULL;
+
+   inode = dentry->d_inode;
+   inode_lock(inode);
+   if (d_is_dir(dentry))
+   inode->i_flags |= S_DEAD;
+   clear_nlink(inode);
+   inode_unlock(inode);
+
+   inode = dentry->d_parent->d_inode;
+   inode_lock(inode);
+
+   /* Remove its visibility */
d_invalidate(dentry);
+   if (d_is_dir(dentry))
+   fsnotify_rmdir(inode, dentry);
+   else
+   fsnotify_unlink(inode, dentry);
+
+   if (d_is_dir(dentry))
+   drop_nlink(inode);
+   inode_unlock(inode);
+
mutex_lock(&eventfs_mutex);
/* dentry should now have at least a single reference */
WARN_ONCE((int)d_count(dentry) < 1,
-- 
2.42.0




Re: [PATCH v2 2/3] remoteproc: qcom: pas: make region assign more generic

2023-10-31 Thread Neil Armstrong

Hi,

On 30/10/2023 14:10, Mukesh Ojha wrote:



On 10/30/2023 3:33 PM, Neil Armstrong wrote:

The current memory region assign only supports a single
memory region.

But new platforms introduces more regions to make the
memory requirements more flexible for various use cases.
Those new platforms also shares the memory region between the
DSP and HLOS.

To handle this, make the region assign more generic in order
to support more than a single memory region and also permit
setting the regions permissions as shared.

Signed-off-by: Neil Armstrong 
---
  drivers/remoteproc/qcom_q6v5_pas.c | 102 -
  1 file changed, 66 insertions(+), 36 deletions(-)

diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
b/drivers/remoteproc/qcom_q6v5_pas.c
index 913a5d2068e8..4829fd26e17d 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -33,6 +33,8 @@
  #define ADSP_DECRYPT_SHUTDOWN_DELAY_MS    100
+#define MAX_ASSIGN_COUNT 2
+
  struct adsp_data {
  int crash_reason_smem;
  const char *firmware_name;
@@ -51,6 +53,9 @@ struct adsp_data {
  int ssctl_id;
  int region_assign_idx;
+    int region_assign_count;
+    bool region_assign_shared;
+    int region_assign_vmid;
  };
  struct qcom_adsp {
@@ -87,15 +92,18 @@ struct qcom_adsp {
  phys_addr_t dtb_mem_phys;
  phys_addr_t mem_reloc;
  phys_addr_t dtb_mem_reloc;
-    phys_addr_t region_assign_phys;
+    phys_addr_t region_assign_phys[MAX_ASSIGN_COUNT];
  void *mem_region;
  void *dtb_mem_region;
  size_t mem_size;
  size_t dtb_mem_size;
-    size_t region_assign_size;
+    size_t region_assign_size[MAX_ASSIGN_COUNT];
  int region_assign_idx;
-    u64 region_assign_perms;
+    int region_assign_count;
+    bool region_assign_shared;
+    int region_assign_vmid;
+    u64 region_assign_perms[MAX_ASSIGN_COUNT];
  struct qcom_rproc_glink glink_subdev;
  struct qcom_rproc_subdev smd_subdev;
@@ -590,37 +598,52 @@ static int adsp_alloc_memory_region(struct qcom_adsp 
*adsp)
  static int adsp_assign_memory_region(struct qcom_adsp *adsp)
  {
-    struct reserved_mem *rmem = NULL;
-    struct qcom_scm_vmperm perm;
+    struct qcom_scm_vmperm perm[MAX_ASSIGN_COUNT];
+    unsigned int perm_size = 1;


AFAICS, not need of initialization.


Indeed, removed




  struct device_node *node;
-    int ret;
+    int offset, ret;


Nit: one variable per line.


Done




  if (!adsp->region_assign_idx)


Not related to this patch..
Should not this be valid only for > 1 ?


I don't understand, only region_assign_idx > 1 triggers a memory_assign,
and this check discards configurations with region_assign_idx == 0 as
expected.





  return 0;
-    node = of_parse_phandle(adsp->dev->of_node, "memory-region", 
adsp->region_assign_idx);
-    if (node)
-    rmem = of_reserved_mem_lookup(node);
-    of_node_put(node);
-    if (!rmem) {
-    dev_err(adsp->dev, "unable to resolve shareable memory-region\n");
-    return -EINVAL;
-    }
+    for (offset = 0; offset < adsp->region_assign_count; ++offset) {
+    struct reserved_mem *rmem = NULL;
+
+    node = of_parse_phandle(adsp->dev->of_node, "memory-region",
+    adsp->region_assign_idx + offset);
+    if (node)
+    rmem = of_reserved_mem_lookup(node);
+    of_node_put(node);
+    if (!rmem) {
+    dev_err(adsp->dev, "unable to resolve shareable memory-region index 
%d\n",
+    offset);
+    return -EINVAL; > +    }




-    perm.vmid = QCOM_SCM_VMID_MSS_MSA;
-    perm.perm = QCOM_SCM_PERM_RW;
+    if (adsp->region_assign_shared)  {
+    perm[0].vmid = QCOM_SCM_VMID_HLOS;
+    perm[0].perm = QCOM_SCM_PERM_RW;
+    perm[1].vmid = adsp->region_assign_vmid;
+    perm[1].perm = QCOM_SCM_PERM_RW;
+    perm_size = 2;
+    } else {
+    perm[0].vmid = adsp->region_assign_vmid;
+    perm[0].perm = QCOM_SCM_PERM_RW;
+    perm_size = 1;
+    }
-    adsp->region_assign_phys = rmem->base;
-    adsp->region_assign_size = rmem->size;
-    adsp->region_assign_perms = BIT(QCOM_SCM_VMID_HLOS);
+    adsp->region_assign_phys[offset] = rmem->base;
+    adsp->region_assign_size[offset] = rmem->size;
+    adsp->region_assign_perms[offset] = BIT(QCOM_SCM_VMID_HLOS);


Do we need array for this, is this changing ?


We need to keep region_assign_perms for unassign, but for the other 2 we would
need to duplicate the code from adsp_assign_memory_region into
adsp_unassign_memory_region.




-    ret = qcom_scm_assign_mem(adsp->region_assign_phys,
-  adsp->region_assign_size,
-  &adsp->region_assign_perms,
-  &perm, 1);
-    if (ret < 0) {
-    dev_err(adsp->dev, "assign memory failed\n");
-    return ret;
+    ret = qcom_scm_assign_mem(adsp->region_assign_phys[offset],
+  adsp->region_assign_

Re: [PATCH for-next] tracing/kprobes: Add symbol counting check when module loads

2023-10-31 Thread Andrii Nakryiko
On Sat, Oct 28, 2023 at 8:10 PM Masami Hiramatsu (Google)
 wrote:
>
> From: Masami Hiramatsu (Google) 
>
> Check the number of probe target symbols in the target module when
> the module is loaded. If the probe is not on the unique name symbols
> in the module, it will be rejected at that point.
>
> Note that the symbol which has a unique name in the target module,
> it will be accepted even if there are same-name symbols in the
> kernel or other modules,
>
> Signed-off-by: Masami Hiramatsu (Google) 
> ---
>  kernel/trace/trace_kprobe.c |  112 
> ++-
>  1 file changed, 68 insertions(+), 44 deletions(-)
>

LGTM.

Acked-by: Andrii Nakryiko 


> diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
> index e834f149695b..90cf2219adb4 100644
> --- a/kernel/trace/trace_kprobe.c
> +++ b/kernel/trace/trace_kprobe.c
> @@ -670,6 +670,21 @@ static int register_trace_kprobe(struct trace_kprobe *tk)
> return ret;
>  }
>
> +static int validate_module_probe_symbol(const char *modname, const char 
> *symbol);
> +
> +static int register_module_trace_kprobe(struct module *mod, struct 
> trace_kprobe *tk)
> +{
> +   const char *p;
> +   int ret = 0;
> +
> +   p = strchr(trace_kprobe_symbol(tk), ':');
> +   if (p)
> +   ret = validate_module_probe_symbol(module_name(mod), p++);
> +   if (!ret)
> +   ret = register_trace_kprobe(tk);
> +   return ret;
> +}
> +
>  /* Module notifier call back, checking event on the module */
>  static int trace_kprobe_module_callback(struct notifier_block *nb,
>unsigned long val, void *data)
> @@ -688,7 +703,7 @@ static int trace_kprobe_module_callback(struct 
> notifier_block *nb,
> if (trace_kprobe_within_module(tk, mod)) {
> /* Don't need to check busy - this should have gone. 
> */
> __unregister_trace_kprobe(tk);
> -   ret = __register_trace_kprobe(tk);
> +   ret = register_module_trace_kprobe(mod, tk);
> if (ret)
> pr_warn("Failed to re-register probe %s on 
> %s: %d\n",
> trace_probe_name(&tk->tp),
> @@ -729,17 +744,55 @@ static int count_mod_symbols(void *data, const char 
> *name, unsigned long unused)
> return 0;
>  }
>
> -static unsigned int number_of_same_symbols(char *func_name)
> +static unsigned int number_of_same_symbols(const char *mod, const char 
> *func_name)
>  {
> struct sym_count_ctx ctx = { .count = 0, .name = func_name };
>
> -   kallsyms_on_each_match_symbol(count_symbols, func_name, &ctx.count);
> +   if (!mod)
> +   kallsyms_on_each_match_symbol(count_symbols, func_name, 
> &ctx.count);
>
> -   module_kallsyms_on_each_symbol(NULL, count_mod_symbols, &ctx);
> +   module_kallsyms_on_each_symbol(mod, count_mod_symbols, &ctx);
>
> return ctx.count;
>  }
>
> +static int validate_module_probe_symbol(const char *modname, const char 
> *symbol)
> +{
> +   unsigned int count = number_of_same_symbols(modname, symbol);
> +
> +   if (count > 1) {
> +   /*
> +* Users should use ADDR to remove the ambiguity of
> +* using KSYM only.
> +*/
> +   return -EADDRNOTAVAIL;
> +   } else if (count == 0) {
> +   /*
> +* We can return ENOENT earlier than when register the
> +* kprobe.
> +*/
> +   return -ENOENT;
> +   }
> +   return 0;
> +}
> +
> +static int validate_probe_symbol(char *symbol)
> +{
> +   char *mod = NULL, *p;
> +   int ret;
> +
> +   p = strchr(symbol, ':');
> +   if (p) {
> +   mod = symbol;
> +   symbol = p + 1;
> +   *p = '\0';
> +   }
> +   ret = validate_module_probe_symbol(mod, symbol);
> +   if (p)
> +   *p = ':';
> +   return ret;
> +}
> +
>  static int __trace_kprobe_create(int argc, const char *argv[])
>  {
> /*
> @@ -859,6 +912,14 @@ static int __trace_kprobe_create(int argc, const char 
> *argv[])
> trace_probe_log_err(0, BAD_PROBE_ADDR);
> goto parse_error;
> }
> +   ret = validate_probe_symbol(symbol);
> +   if (ret) {
> +   if (ret == -EADDRNOTAVAIL)
> +   trace_probe_log_err(0, NON_UNIQ_SYMBOL);
> +   else
> +   trace_probe_log_err(0, BAD_PROBE_ADDR);
> +   goto parse_error;
> +   }
> if (is_return)
> ctx.flags |= TPARG_FL_RETURN;
> ret = kprobe_on_func_entry(NULL, symbol, offset);
> @@ -871,31 +932,6 @@ static int __trace_kprobe_create(int argc, co

Re: [PATCH] tracing: Have trace_event_file have ref counters

2023-10-31 Thread Beau Belgrave
On Tue, Oct 31, 2023 at 12:24:53PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> The following can crash the kernel:
> 
>  # cd /sys/kernel/tracing
>  # echo 'p:sched schedule' > kprobe_events
>  # exec 5>>events/kprobes/sched/enable
>  # > kprobe_events
>  # exec 5>&-
> 
> The above commands:
> 
>  1. Change directory to the tracefs directory
>  2. Create a kprobe event (doesn't matter what one)
>  3. Open bash file descriptor 5 on the enable file of the kprobe event
>  4. Delete the kprobe event (removes the files too)
>  5. Close the bash file descriptor 5
> 
> The above causes a crash!
> 
>  BUG: kernel NULL pointer dereference, address: 0028
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x) - not-present page
>  PGD 0 P4D 0
>  Oops:  [#1] PREEMPT SMP PTI
>  CPU: 6 PID: 877 Comm: bash Not tainted 
> 6.5.0-rc4-test-8-g2c6b6b1029d4-dirty #186
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> 1.16.2-debian-1.16.2-1 04/01/2014
>  RIP: 0010:tracing_release_file_tr+0xc/0x50
> 
> What happens here is that the kprobe event creates a trace_event_file
> "file" descriptor that represents the file in tracefs to the event. It
> maintains state of the event (is it enabled for the given instance?).
> Opening the "enable" file gets a reference to the event "file" descriptor
> via the open file descriptor. When the kprobe event is deleted, the file is
> also deleted from the tracefs system which also frees the event "file"
> descriptor.
> 
> But as the tracefs file is still opened by user space, it will not be
> totally removed until the final dput() is called on it. But this is not
> true with the event "file" descriptor that is already freed. If the user
> does a write to or simply closes the file descriptor it will reference the
> event "file" descriptor that was just freed, causing a use-after-free bug.
> 
> To solve this, add a ref count to the event "file" descriptor as well as a
> new flag called "FREED". The "file" will not be freed until the last
> reference is released. But the FREE flag will be set when the event is
> removed to prevent any more modifications to that event from happening,
> even if there's still a reference to the event "file" descriptor.
> 
> Link: 
> https://lore.kernel.org/linux-trace-kernel/2023103131.1e705...@gandalf.local.home/
> 

I ran this patch with your repro and via the user_events ftrace self
test repro. Both worked correctly, thanks!

Tested-by: Beau Belgrave 

Thanks,
-Beau

> Cc: sta...@vger.kernel.org
> Fixes: f5ca233e2e66d ("tracing: Increase trace array ref count on enable and 
> filter files")
> Reported-by: Beau Belgrave 
> Signed-off-by: Steven Rostedt (Google) 
> ---
>  include/linux/trace_events.h   |  4 
>  kernel/trace/trace.c   | 15 +++
>  kernel/trace/trace.h   |  3 +++
>  kernel/trace/trace_events.c| 31 ++
>  kernel/trace/trace_events_filter.c |  3 +++
>  5 files changed, 52 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
> index 12207dc6722d..696f8dc4aa53 100644
> --- a/include/linux/trace_events.h
> +++ b/include/linux/trace_events.h
> @@ -492,6 +492,7 @@ enum {
>   EVENT_FILE_FL_TRIGGER_COND_BIT,
>   EVENT_FILE_FL_PID_FILTER_BIT,
>   EVENT_FILE_FL_WAS_ENABLED_BIT,
> + EVENT_FILE_FL_FREED_BIT,
>  };
>  
>  extern struct trace_event_file *trace_get_event_file(const char *instance,
> @@ -630,6 +631,7 @@ extern int __kprobe_event_add_fields(struct dynevent_cmd 
> *cmd, ...);
>   *  TRIGGER_COND  - When set, one or more triggers has an associated filter
>   *  PID_FILTER- When set, the event is filtered based on pid
>   *  WAS_ENABLED   - Set when enabled to know to clear trace on module removal
> + *  FREED - File descriptor is freed, all fields should be 
> considered invalid
>   */
>  enum {
>   EVENT_FILE_FL_ENABLED   = (1 << EVENT_FILE_FL_ENABLED_BIT),
> @@ -643,6 +645,7 @@ enum {
>   EVENT_FILE_FL_TRIGGER_COND  = (1 << EVENT_FILE_FL_TRIGGER_COND_BIT),
>   EVENT_FILE_FL_PID_FILTER= (1 << EVENT_FILE_FL_PID_FILTER_BIT),
>   EVENT_FILE_FL_WAS_ENABLED   = (1 << EVENT_FILE_FL_WAS_ENABLED_BIT),
> + EVENT_FILE_FL_FREED = (1 << EVENT_FILE_FL_FREED_BIT),
>  };
>  
>  struct trace_event_file {
> @@ -671,6 +674,7 @@ struct trace_event_file {
>* caching and such. Which is mostly OK ;-)
>*/
>   unsigned long   flags;
> + atomic_tref;/* ref count for opened files */
>   atomic_tsm_ref; /* soft-mode reference counter */
>   atomic_ttm_ref; /* trigger-mode reference counter */
>  };
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 2539cfc20a97..9aebf904ff97 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -4978,6 +4978,20 @@ int tracing_open_file

[PATCH] eventfs: Remove extra dget() in eventfs_create_events_dir()

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The creation of the top events directory does a dget() at the end of the
creation in eventfs_create_events_dir() with a comment saying the final
dput() will happen when it is removed. The problem is that a dget() is
already done on the dentry when it was created with tracefs_start_creating()!
The dget() now just causes a memory leak of that dentry.

Remove the extra dget() as the final dput() in the deletion of the events
directory actually matches the one in tracefs_start_creating().

Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Signed-off-by: Steven Rostedt (Google) 
---
 fs/tracefs/event_inode.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index ae2172814bdc..d051b889147d 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -952,9 +952,6 @@ struct eventfs_inode *eventfs_create_events_dir(const char 
*name, struct dentry
fsnotify_mkdir(dentry->d_parent->d_inode, dentry);
tracefs_end_creating(dentry);
 
-   /* Will call dput when the directory is removed */
-   dget(dentry);
-
return ei;
 
  fail:
-- 
2.42.0




[PATCH] tracing: Have trace_event_file have ref counters

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The following can crash the kernel:

 # cd /sys/kernel/tracing
 # echo 'p:sched schedule' > kprobe_events
 # exec 5>>events/kprobes/sched/enable
 # > kprobe_events
 # exec 5>&-

The above commands:

 1. Change directory to the tracefs directory
 2. Create a kprobe event (doesn't matter what one)
 3. Open bash file descriptor 5 on the enable file of the kprobe event
 4. Delete the kprobe event (removes the files too)
 5. Close the bash file descriptor 5

The above causes a crash!

 BUG: kernel NULL pointer dereference, address: 0028
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x) - not-present page
 PGD 0 P4D 0
 Oops:  [#1] PREEMPT SMP PTI
 CPU: 6 PID: 877 Comm: bash Not tainted 
6.5.0-rc4-test-8-g2c6b6b1029d4-dirty #186
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.16.2-debian-1.16.2-1 04/01/2014
 RIP: 0010:tracing_release_file_tr+0xc/0x50

What happens here is that the kprobe event creates a trace_event_file
"file" descriptor that represents the file in tracefs to the event. It
maintains state of the event (is it enabled for the given instance?).
Opening the "enable" file gets a reference to the event "file" descriptor
via the open file descriptor. When the kprobe event is deleted, the file is
also deleted from the tracefs system which also frees the event "file"
descriptor.

But as the tracefs file is still opened by user space, it will not be
totally removed until the final dput() is called on it. But this is not
true with the event "file" descriptor that is already freed. If the user
does a write to or simply closes the file descriptor it will reference the
event "file" descriptor that was just freed, causing a use-after-free bug.

To solve this, add a ref count to the event "file" descriptor as well as a
new flag called "FREED". The "file" will not be freed until the last
reference is released. But the FREE flag will be set when the event is
removed to prevent any more modifications to that event from happening,
even if there's still a reference to the event "file" descriptor.

Link: 
https://lore.kernel.org/linux-trace-kernel/2023103131.1e705...@gandalf.local.home/

Cc: sta...@vger.kernel.org
Fixes: f5ca233e2e66d ("tracing: Increase trace array ref count on enable and 
filter files")
Reported-by: Beau Belgrave 
Signed-off-by: Steven Rostedt (Google) 
---
 include/linux/trace_events.h   |  4 
 kernel/trace/trace.c   | 15 +++
 kernel/trace/trace.h   |  3 +++
 kernel/trace/trace_events.c| 31 ++
 kernel/trace/trace_events_filter.c |  3 +++
 5 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 12207dc6722d..696f8dc4aa53 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -492,6 +492,7 @@ enum {
EVENT_FILE_FL_TRIGGER_COND_BIT,
EVENT_FILE_FL_PID_FILTER_BIT,
EVENT_FILE_FL_WAS_ENABLED_BIT,
+   EVENT_FILE_FL_FREED_BIT,
 };
 
 extern struct trace_event_file *trace_get_event_file(const char *instance,
@@ -630,6 +631,7 @@ extern int __kprobe_event_add_fields(struct dynevent_cmd 
*cmd, ...);
  *  TRIGGER_COND  - When set, one or more triggers has an associated filter
  *  PID_FILTER- When set, the event is filtered based on pid
  *  WAS_ENABLED   - Set when enabled to know to clear trace on module removal
+ *  FREED - File descriptor is freed, all fields should be considered 
invalid
  */
 enum {
EVENT_FILE_FL_ENABLED   = (1 << EVENT_FILE_FL_ENABLED_BIT),
@@ -643,6 +645,7 @@ enum {
EVENT_FILE_FL_TRIGGER_COND  = (1 << EVENT_FILE_FL_TRIGGER_COND_BIT),
EVENT_FILE_FL_PID_FILTER= (1 << EVENT_FILE_FL_PID_FILTER_BIT),
EVENT_FILE_FL_WAS_ENABLED   = (1 << EVENT_FILE_FL_WAS_ENABLED_BIT),
+   EVENT_FILE_FL_FREED = (1 << EVENT_FILE_FL_FREED_BIT),
 };
 
 struct trace_event_file {
@@ -671,6 +674,7 @@ struct trace_event_file {
 * caching and such. Which is mostly OK ;-)
 */
unsigned long   flags;
+   atomic_tref;/* ref count for opened files */
atomic_tsm_ref; /* soft-mode reference counter */
atomic_ttm_ref; /* trigger-mode reference counter */
 };
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2539cfc20a97..9aebf904ff97 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4978,6 +4978,20 @@ int tracing_open_file_tr(struct inode *inode, struct 
file *filp)
if (ret)
return ret;
 
+   mutex_lock(&event_mutex);
+
+   /* Fail if the file is marked for removal */
+   if (file->flags & EVENT_FILE_FL_FREED) {
+   trace_array_put(file->tr);
+   ret = -ENODEV;
+   } else {
+   event_file_get(file);
+   }
+
+   mutex_unlock(&event_mutex);
+  

[PATCH v3] eventfs: Save ownership and mode

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

Now that inodes and dentries are created on the fly, they are also
reclaimed on memory pressure. Since the ownership and file mode are saved
in the inode, if they are freed, any changes to the ownership and mode
will be lost.

To counter this, if the user changes the permissions or ownership, save
them, and when creating the inodes again, restore those changes.

Signed-off-by: Steven Rostedt (Google) 
---
Changes since v2: 
https://lore.kernel.org/linux-trace-kernel/20231030120433.390fa...@gandalf.local.home/

- Rebased on top of 
https://lore.kernel.org/linux-trace-kernel/2023103325.50087...@gandalf.local.home/

- Fixed some more comments and re-arranged declarations.

 fs/tracefs/event_inode.c | 148 +++
 fs/tracefs/internal.h|  16 +
 2 files changed, 151 insertions(+), 13 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index a62cb2298350..7f1358956c3a 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -33,6 +33,15 @@
  */
 static DEFINE_MUTEX(eventfs_mutex);
 
+/* Mode is unsigned short, use the upper bits for flags */
+enum {
+   EVENTFS_SAVE_MODE   = BIT(16),
+   EVENTFS_SAVE_UID= BIT(17),
+   EVENTFS_SAVE_GID= BIT(18),
+};
+
+#define EVENTFS_MODE_MASK  (EVENTFS_SAVE_MODE - 1)
+
 /*
  * The eventfs_inode (ei) itself is protected by SRCU. It is released from
  * its parent's list and will have is_freed set (under eventfs_mutex).
@@ -47,8 +56,89 @@ static int dcache_dir_open_wrapper(struct inode *inode, 
struct file *file);
 static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx);
 static int eventfs_release(struct inode *inode, struct file *file);
 
+static void update_attr(struct eventfs_attr *attr, struct iattr *iattr)
+{
+   unsigned int ia_valid = iattr->ia_valid;
+
+   if (ia_valid & ATTR_MODE) {
+   attr->mode = (attr->mode & ~EVENTFS_MODE_MASK) |
+   (iattr->ia_mode & EVENTFS_MODE_MASK) |
+   EVENTFS_SAVE_MODE;
+   }
+   if (ia_valid & ATTR_UID) {
+   attr->mode |= EVENTFS_SAVE_UID;
+   attr->uid = iattr->ia_uid;
+   }
+   if (ia_valid & ATTR_GID) {
+   attr->mode |= EVENTFS_SAVE_GID;
+   attr->gid = iattr->ia_gid;
+   }
+}
+
+static int eventfs_set_attr(struct mnt_idmap *idmap, struct dentry *dentry,
+   struct iattr *iattr)
+{
+   const struct eventfs_entry *entry;
+   struct eventfs_inode *ei;
+   const char *name;
+   int ret;
+
+   mutex_lock(&eventfs_mutex);
+   ei = dentry->d_fsdata;
+   /* The LSB is set when the eventfs_inode is being freed */
+   if (((unsigned long)ei & 1UL) || ei->is_freed) {
+   /* Do not allow changes if the event is about to be removed. */
+   mutex_unlock(&eventfs_mutex);
+   return -ENODEV;
+   }
+
+   /* Preallocate the children mode array if necessary */
+   if (!(dentry->d_inode->i_mode & S_IFDIR)) {
+   if (!ei->entry_attrs) {
+   ei->entry_attrs = kzalloc(sizeof(*ei->entry_attrs) * 
ei->nr_entries,
+ GFP_KERNEL);
+   if (!ei->entry_attrs) {
+   ret = -ENOMEM;
+   goto out;
+   }
+   }
+   }
+
+   ret = simple_setattr(idmap, dentry, iattr);
+   if (ret < 0)
+   goto out;
+
+   /*
+* If this is a dir, then update the ei cache, only the file
+* mode is saved in the ei->m_children, and the ownership is
+* determined by the parent directory.
+*/
+   if (dentry->d_inode->i_mode & S_IFDIR) {
+   update_attr(&ei->attr, iattr);
+
+   } else {
+   name = dentry->d_name.name;
+
+   for (int i = 0; i < ei->nr_entries; i++) {
+   entry = &ei->entries[i];
+   if (strcmp(name, entry->name) == 0) {
+   update_attr(&ei->entry_attrs[i], iattr);
+   break;
+   }
+   }
+   }
+ out:
+   mutex_unlock(&eventfs_mutex);
+   return ret;
+}
+
 static const struct inode_operations eventfs_root_dir_inode_operations = {
.lookup = eventfs_root_lookup,
+   .setattr= eventfs_set_attr,
+};
+
+static const struct inode_operations eventfs_file_inode_operations = {
+   .setattr= eventfs_set_attr,
 };
 
 static const struct file_operations eventfs_file_operations = {
@@ -59,10 +149,30 @@ static const struct file_operations 
eventfs_file_operations = {
.release= eventfs_release,
 };
 
+static void update_inode_attr(struct inode *inode, struct eventfs_attr *attr, 
umode_t mode)
+{
+   if (!attr) {
+

[PATCH v3] eventfs: Test for ei->is_freed when accessing ei->dentry

2023-10-31 Thread Steven Rostedt
From: "Steven Rostedt (Google)" 

The eventfs_inode (ei) is protected by SRCU, but the ei->dentry is not. It
is protected by the eventfs_mutex. Anytime the eventfs_mutex is released,
and access to the ei->dentry needs to be done, it should first check if
ei->is_freed is set under the eventfs_mutex. If it is, then the ei->dentry
is invalid and must not be used. The ei->dentry must only be accessed
under the eventfs_mutex and after checking if ei->is_freed is set.

When the ei is being freed, it will (under the eventfs_mutex) set is_freed
and at the same time move the dentry to a free list to be cleared after
the eventfs_mutex is released. This means that any access to the
ei->dentry must check first if ei->is_freed is set, because if it is, then
the dentry is on its way to be freed.

Also add comments to describe this better.

Link: 
https://lore.kernel.org/all/CA+G9fYt6pY+tMZEOg=soeywqoe19fgp3ur15sgowkdk+_x8...@mail.gmail.com/
Link: 
https://lore.kernel.org/all/CA+G9fYuDP3hVQ3t7FfrBAjd_WFVSurMgCepTxunSJf=MTe=6...@mail.gmail.com/

Fixes: 5790b1fb3d672 ("eventfs: Remove eventfs_file and just use eventfs_inode")
Reported-by: Linux Kernel Functional Testing 
Reported-by: Naresh Kamboju 
Reported-by: Beau Belgrave 
Reviewed-by: Masami Hiramatsu (Google) 
Signed-off-by: Steven Rostedt (Google) 
---
Changes since v2: 
https://lore.kernel.org/linux-trace-kernel/20231028164650.4f5ea...@rorschach.local.home

- Since I added free_ei() separately, might as well use it for where it frees 
the
  top eventfs_inode too.

 fs/tracefs/event_inode.c | 69 
 fs/tracefs/internal.h|  3 +-
 2 files changed, 57 insertions(+), 15 deletions(-)

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index 4d2da7480e5f..a62cb2298350 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -24,7 +24,20 @@
 #include 
 #include "internal.h"
 
+/*
+ * eventfs_mutex protects the eventfs_inode (ei) dentry. Any access
+ * to the ei->dentry must be done under this mutex and after checking
+ * if ei->is_freed is not set. The ei->dentry is released under the
+ * mutex at the same time ei->is_freed is set. If ei->is_freed is set
+ * then the ei->dentry is invalid.
+ */
 static DEFINE_MUTEX(eventfs_mutex);
+
+/*
+ * The eventfs_inode (ei) itself is protected by SRCU. It is released from
+ * its parent's list and will have is_freed set (under eventfs_mutex).
+ * After the SRCU grace period is over, the ei may be freed.
+ */
 DEFINE_STATIC_SRCU(eventfs_srcu);
 
 static struct dentry *eventfs_root_lookup(struct inode *dir,
@@ -129,6 +142,13 @@ static struct dentry *create_dir(const char *name, struct 
dentry *parent)
return eventfs_end_creating(dentry);
 }
 
+static void free_ei(struct eventfs_inode *ei)
+{
+   kfree_const(ei->name);
+   kfree(ei->d_children);
+   kfree(ei);
+}
+
 /**
  * eventfs_set_ei_status_free - remove the dentry reference from an 
eventfs_inode
  * @ti: the tracefs_inode of the dentry
@@ -168,9 +188,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, 
struct dentry *dentry)
eventfs_remove_dir(ei_child);
}
 
-   kfree_const(ei->name);
-   kfree(ei->d_children);
-   kfree(ei);
+   free_ei(ei);
return;
}
 
@@ -234,6 +252,10 @@ create_file_dentry(struct eventfs_inode *ei, struct dentry 
**e_dentry,
bool invalidate = false;
 
mutex_lock(&eventfs_mutex);
+   if (ei->is_freed) {
+   mutex_unlock(&eventfs_mutex);
+   return NULL;
+   }
/* If the e_dentry already has a dentry, use it */
if (*e_dentry) {
/* lookup does not need to up the ref count */
@@ -307,6 +329,8 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
struct eventfs_inode *ei_child;
struct tracefs_inode *ti;
 
+   lockdep_assert_held(&eventfs_mutex);
+
/* srcu lock already held */
/* fill parent-child relation */
list_for_each_entry_srcu(ei_child, &ei->children, list,
@@ -320,6 +344,7 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
 
 /**
  * create_dir_dentry - Create a directory dentry for the eventfs_inode
+ * @pei: The eventfs_inode parent of ei.
  * @ei: The eventfs_inode to create the directory for
  * @parent: The dentry of the parent of this directory
  * @lookup: True if this is called by the lookup code
@@ -327,12 +352,17 @@ static void eventfs_post_create_dir(struct eventfs_inode 
*ei)
  * This creates and attaches a directory dentry to the eventfs_inode @ei.
  */
 static struct dentry *
-create_dir_dentry(struct eventfs_inode *ei, struct dentry *parent, bool lookup)
+create_dir_dentry(struct eventfs_inode *pei, struct eventfs_inode *ei,
+ struct dentry *parent, bool lookup)
 {
bool invalidate = false;
struct dentry *dentry = NULL;
 
mutex_lock(&eventfs_mutex

[PATCH v5 1/4] ARM: dts: qcom: samsung-matisse-common: Add initial common device tree

2023-10-31 Thread Stefan Hansson
According to the dts from the kernel source code released by Samsung,
matissewifi and matisselte only have minor differences in hardware, so
use a shared dtsi to reduce duplicated code. Additionally, this should
make adding support for matisse3g easier should someone want to do that
at a later point.

As such, add a common device tree for all matisse devices by Samsung
based on the matissewifi dts. Support for matisselte will be introduced
in a later patch in this series and will use the common dtsi as well.

Signed-off-by: Stefan Hansson 
Reviewed-by: Krzysztof Kozlowski 
---
 .../qcom-apq8026-samsung-matisse-wifi.dts | 595 +++---
 ... qcom-msm8226-samsung-matisse-common.dtsi} |  66 --
 2 files changed, 76 insertions(+), 585 deletions(-)
 rewrite arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts (85%)
 copy arch/arm/boot/dts/qcom/{qcom-apq8026-samsung-matisse-wifi.dts => 
qcom-msm8226-samsung-matisse-common.dtsi} (86%)

diff --git a/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts 
b/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
dissimilarity index 85%
index f516e0426bb9..b0da51f67539 100644
--- a/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
+++ b/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
@@ -1,519 +1,76 @@
-// SPDX-License-Identifier: BSD-3-Clause
-/*
- * Copyright (c) 2022, Matti Lehtimäki 
- */
-
-/dts-v1/;
-
-#include 
-#include "qcom-msm8226.dtsi"
-#include "qcom-pm8226.dtsi"
-
-/delete-node/ &adsp_region;
-/delete-node/ &smem_region;
-
-/ {
-   model = "Samsung Galaxy Tab 4 10.1";
-   compatible = "samsung,matisse-wifi", "qcom,apq8026";
-   chassis-type = "tablet";
-
-   aliases {
-   mmc0 = &sdhc_1; /* SDC1 eMMC slot */
-   mmc1 = &sdhc_2; /* SDC2 SD card slot */
-   display0 = &framebuffer0;
-   };
-
-   chosen {
-   #address-cells = <1>;
-   #size-cells = <1>;
-   ranges;
-
-   stdout-path = "display0";
-
-   framebuffer0: framebuffer@320 {
-   compatible = "simple-framebuffer";
-   reg = <0x0320 0x80>;
-   width = <1280>;
-   height = <800>;
-   stride = <(1280 * 3)>;
-   format = "r8g8b8";
-   };
-   };
-
-   gpio-hall-sensor {
-   compatible = "gpio-keys";
-
-   event-hall-sensor {
-   label = "Hall Effect Sensor";
-   gpios = <&tlmm 110 GPIO_ACTIVE_LOW>;
-   linux,input-type = ;
-   linux,code = ;
-   debounce-interval = <15>;
-   linux,can-disable;
-   wakeup-source;
-   };
-   };
-
-   gpio-keys {
-   compatible = "gpio-keys";
-   autorepeat;
-
-   key-home {
-   label = "Home";
-   gpios = <&tlmm 108 GPIO_ACTIVE_LOW>;
-   linux,code = ;
-   debounce-interval = <15>;
-   };
-
-   key-volume-down {
-   label = "Volume Down";
-   gpios = <&tlmm 107 GPIO_ACTIVE_LOW>;
-   linux,code = ;
-   debounce-interval = <15>;
-   };
-
-   key-volume-up {
-   label = "Volume Up";
-   gpios = <&tlmm 106 GPIO_ACTIVE_LOW>;
-   linux,code = ;
-   debounce-interval = <15>;
-   };
-   };
-
-   i2c-backlight {
-   compatible = "i2c-gpio";
-   sda-gpios = <&tlmm 20 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
-   scl-gpios = <&tlmm 21 (GPIO_ACTIVE_HIGH|GPIO_OPEN_DRAIN)>;
-
-   pinctrl-0 = <&backlight_i2c_default_state>;
-   pinctrl-names = "default";
-
-   i2c-gpio,delay-us = <4>;
-
-   #address-cells = <1>;
-   #size-cells = <0>;
-
-   backlight@2c {
-   compatible = "ti,lp8556";
-   reg = <0x2c>;
-
-   dev-ctrl = /bits/ 8 <0x80>;
-   init-brt = /bits/ 8 <0x3f>;
-
-   pwms = <&backlight_pwm 0 10>;
-   pwm-names = "lp8556";
-
-   rom-a0h {
-   rom-addr = /bits/ 8 <0xa0>;
-   rom-val = /bits/ 8 <0x44>;
-   };
-
-   rom-a1h {
-   rom-addr = /bits/ 8 <0xa1>;
-   rom-val = /bits/ 8 <0x6c>;
-   };
-
-   rom-a5h {
-   rom-addr = /bits/ 8 <0xa5>;
-   rom-val = /bits

[PATCH v5 3/4] ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 10.1 LTE (SM-T535)

2023-10-31 Thread Stefan Hansson
Add a device tree for the Samsung Galaxy Tab 4 10.1 (SM-T535) LTE tablet
based on the MSM8926 platform.

The common dtsi is also modified to describe the widest constraints,
which required modifications to the matisse-wifi dts.

Signed-off-by: Stefan Hansson 
Reviewed-by: Krzysztof Kozlowski 
---
 arch/arm/boot/dts/qcom/Makefile   |  1 +
 .../qcom-apq8026-samsung-matisse-wifi.dts |  8 
 .../qcom-msm8226-samsung-matisse-common.dtsi  |  4 +-
 .../qcom/qcom-msm8926-samsung-matisselte.dts  | 37 +++
 4 files changed, 48 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts

diff --git a/arch/arm/boot/dts/qcom/Makefile b/arch/arm/boot/dts/qcom/Makefile
index a3d293e40820..cab35eeb30f6 100644
--- a/arch/arm/boot/dts/qcom/Makefile
+++ b/arch/arm/boot/dts/qcom/Makefile
@@ -34,6 +34,7 @@ dtb-$(CONFIG_ARCH_QCOM) += \
qcom-msm8916-samsung-serranove.dtb \
qcom-msm8926-microsoft-superman-lte.dtb \
qcom-msm8926-microsoft-tesla.dtb \
+   qcom-msm8926-samsung-matisselte.dtb \
qcom-msm8960-cdp.dtb \
qcom-msm8960-samsung-expressatt.dtb \
qcom-msm8974-lge-nexus5-hammerhead.dtb \
diff --git a/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts 
b/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
index b0da51f67539..3be02bdbb919 100644
--- a/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
+++ b/arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts
@@ -66,6 +66,14 @@ touchscreen@4a {
};
 };
 
+&pm8226_l3 {
+   regulator-max-microvolt = <1337500>;
+};
+
+&pm8226_s4 {
+   regulator-max-microvolt = <180>;
+};
+
 &tlmm {
tsp_en1_default_state: tsp-en1-default-state {
pins = "gpio73";
diff --git a/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi 
b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
index ef98d88927ca..28317ce79e97 100644
--- a/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
+++ b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
@@ -230,7 +230,7 @@ pm8226_s3: s3 {
 
pm8226_s4: s4 {
regulator-min-microvolt = <180>;
-   regulator-max-microvolt = <180>;
+   regulator-max-microvolt = <220>;
};
 
pm8226_s5: s5 {
@@ -250,7 +250,7 @@ pm8226_l2: l2 {
 
pm8226_l3: l3 {
regulator-min-microvolt = <75>;
-   regulator-max-microvolt = <1337500>;
+   regulator-max-microvolt = <135>;
regulator-always-on;
};
 
diff --git a/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts 
b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
new file mode 100644
index ..d0e1bc39f8ef
--- /dev/null
+++ b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2022, Matti Lehtimäki 
+ * Copyright (c) 2023, Stefan Hansson 
+ */
+
+/dts-v1/;
+
+#include "qcom-msm8226-samsung-matisse-common.dtsi"
+
+/ {
+   model = "Samsung Galaxy Tab 4 10.1 LTE";
+   compatible = "samsung,matisselte", "qcom,msm8926", "qcom,msm8226";
+   chassis-type = "tablet";
+
+   reg_tsp_3p3v: regulator-tsp-3p3v {
+   compatible = "regulator-fixed";
+   regulator-name = "tsp_3p3v";
+   regulator-min-microvolt = <330>;
+   regulator-max-microvolt = <330>;
+
+   gpio = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+   enable-active-high;
+
+   pinctrl-names = "default";
+   pinctrl-0 = <&tsp_en1_default_state>;
+   };
+};
+
+&tlmm {
+   tsp_en1_default_state: tsp-en1-default-state {
+   pins = "gpio32";
+   function = "gpio";
+   drive-strength = <2>;
+   bias-disable;
+   };
+};
-- 
2.41.0



[PATCH v5 2/4] dt-bindings: arm: qcom: Add Samsung Galaxy Tab 4 10.1 LTE

2023-10-31 Thread Stefan Hansson
This documents Samsung Galaxy Tab 4 10.1 LTE (samsung,matisselte)
which is a tablet by Samsung based on the MSM8926 SoC.

Signed-off-by: Stefan Hansson 
Reviewed-by: Krzysztof Kozlowski 
---
 Documentation/devicetree/bindings/arm/qcom.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/arm/qcom.yaml 
b/Documentation/devicetree/bindings/arm/qcom.yaml
index 88b84035e7b1..242ffe89c6c6 100644
--- a/Documentation/devicetree/bindings/arm/qcom.yaml
+++ b/Documentation/devicetree/bindings/arm/qcom.yaml
@@ -196,6 +196,7 @@ properties:
   - enum:
   - microsoft,superman-lte
   - microsoft,tesla
+  - samsung,matisselte
   - const: qcom,msm8926
   - const: qcom,msm8226
 
-- 
2.41.0



[PATCH v5 4/4] ARM: dts: qcom: samsung-matisse-common: Add UART

2023-10-31 Thread Stefan Hansson
This was not enabled in the matisse-wifi tree. Without this, it is not
possible to use the USB port for serial debugging via a "Carkit debug
cable".

Signed-off-by: Stefan Hansson 
Reviewed-by: Krzysztof Kozlowski 
Reviewed-by: Konrad Dybcio 
---
 .../boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi| 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi 
b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
index 28317ce79e97..dc63b91f94bc 100644
--- a/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
+++ b/arch/arm/boot/dts/qcom/qcom-msm8226-samsung-matisse-common.dtsi
@@ -219,6 +219,10 @@ muic: usb-switch@25 {
};
 };
 
+&blsp1_uart3 {
+   status = "okay";
+};
+
 &rpm_requests {
regulators {
compatible = "qcom,rpm-pm8226-regulators";
-- 
2.41.0



[PATCH v5 0/4] Add samsung-matisselte and common matisse dtsi

2023-10-31 Thread Stefan Hansson
This series adds a common samsung-matisse dtsi and reworks
samsung-matisse-wifi to use it, and introduces samsung-matisselte. I
choose matisselte over matisse-lte as this is how most other devices
(klte, s3ve3g) do it and it is the codename that Samsung gave the
device. See individual commits for more information.

---
Changes since v1:

 - Rebased on latest linux-next
 - Added qcom,msm8226 compatible to matisselte inspired by recent Lumia
   830 patch. This is done as in v1, the patch was rejected because I
   included the msm8226 dtsi despite not marking matisselte as
   compatible with msm8226, and I was not sure how to resolve that. As
   such, I'm copying what was done in the Lumia 830 (microsoft-tesla)
   patch given that it was accepted.

Changes since v2:

 - Updated commit message for UART patch to explain why it was added.
 - Gave more flags to git to provide a hopefully more readable patch.

Changes since v3:

 - Collect tags.
 - Remove spurious copyright notice.
 - Miscellaneous fixes following review feedback.

Changes since v4:

 - Collect tags.
 - Enable SD card support on matisselte.

Stefan Hansson (4):
  ARM: dts: qcom: samsung-matisse-common: Add initial common device tree
  dt-bindings: arm: qcom: Add Samsung Galaxy Tab 4 10.1 LTE
  ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 10.1 LTE
(SM-T535)
  ARM: dts: qcom: samsung-matisse-common: Add UART

 .../devicetree/bindings/arm/qcom.yaml |   1 +
 arch/arm/boot/dts/qcom/Makefile   |   1 +
 .../qcom-apq8026-samsung-matisse-wifi.dts | 603 +++---
 ... qcom-msm8226-samsung-matisse-common.dtsi} |  68 +-
 .../qcom/qcom-msm8926-samsung-matisselte.dts  |  37 ++
 5 files changed, 126 insertions(+), 584 deletions(-)
 rewrite arch/arm/boot/dts/qcom/qcom-apq8026-samsung-matisse-wifi.dts (85%)
 copy arch/arm/boot/dts/qcom/{qcom-apq8026-samsung-matisse-wifi.dts => 
qcom-msm8226-samsung-matisse-common.dtsi} (86%)
 create mode 100644 arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts

-- 
2.41.0



Re: [PATCH v3 3/4] ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 10.1 LTE (SM-T535)

2023-10-31 Thread Stefan Hansson




On 2023-10-31 12:08, Konrad Dybcio wrote:

On 25.10.2023 10:37, Stefan Hansson wrote:

Add a device tree for the Samsung Galaxy Tab 4 10.1 (SM-T535) LTE tablet
based on the MSM8926 platform.

Signed-off-by: Stefan Hansson 
---
  arch/arm/boot/dts/qcom/Makefile   |  1 +
  .../qcom/qcom-msm8926-samsung-matisselte.dts  | 36 +++
  2 files changed, 37 insertions(+)
  create mode 100644 arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts

diff --git a/arch/arm/boot/dts/qcom/Makefile b/arch/arm/boot/dts/qcom/Makefile
index a3d293e40820..cab35eeb30f6 100644
--- a/arch/arm/boot/dts/qcom/Makefile
+++ b/arch/arm/boot/dts/qcom/Makefile
@@ -34,6 +34,7 @@ dtb-$(CONFIG_ARCH_QCOM) += \
qcom-msm8916-samsung-serranove.dtb \
qcom-msm8926-microsoft-superman-lte.dtb \
qcom-msm8926-microsoft-tesla.dtb \
+   qcom-msm8926-samsung-matisselte.dtb \
qcom-msm8960-cdp.dtb \
qcom-msm8960-samsung-expressatt.dtb \
qcom-msm8974-lge-nexus5-hammerhead.dtb \
diff --git a/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts 
b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
new file mode 100644
index ..6e25b1a74ce5
--- /dev/null
+++ b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2022, Matti Lehtimäki 
+ * Copyright (c) 2023, Stefan Hansson 
+ */
+
+/dts-v1/;
+
+#include "qcom-msm8226-samsung-matisse-common.dtsi"
+
+/ {
+   model = "Samsung Galaxy Tab 4 10.1 LTE";
+   compatible = "samsung,matisselte", "qcom,msm8926", "qcom,msm8226";
+   chassis-type = "tablet";
+};
+
+&pm8226_l3 {
+   regulator-max-microvolt = <135>;
+};
+
+&pm8226_s4 {
+   regulator-max-microvolt = <220>;
+};
+
+®_tsp_3p3v {
+   gpio = <&tlmm 32 GPIO_ACTIVE_HIGH>;
+};
+
+&sdhc_2 {
+   /* SD card fails to probe with error -110 */
+   status = "disabled";

Can you give us some logs?


I tested it again just now, and it worked without issues. Maybe I used a 
defective SD card to test it or hadn't inserted it properly. I'll send 
another revision fixing this.



Konrad


Stefan


Re: [PATCH v3 3/4] ARM: dts: qcom: Add support for Samsung Galaxy Tab 4 10.1 LTE (SM-T535)

2023-10-31 Thread Konrad Dybcio
On 25.10.2023 10:37, Stefan Hansson wrote:
> Add a device tree for the Samsung Galaxy Tab 4 10.1 (SM-T535) LTE tablet
> based on the MSM8926 platform.
> 
> Signed-off-by: Stefan Hansson 
> ---
>  arch/arm/boot/dts/qcom/Makefile   |  1 +
>  .../qcom/qcom-msm8926-samsung-matisselte.dts  | 36 +++
>  2 files changed, 37 insertions(+)
>  create mode 100644 arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
> 
> diff --git a/arch/arm/boot/dts/qcom/Makefile b/arch/arm/boot/dts/qcom/Makefile
> index a3d293e40820..cab35eeb30f6 100644
> --- a/arch/arm/boot/dts/qcom/Makefile
> +++ b/arch/arm/boot/dts/qcom/Makefile
> @@ -34,6 +34,7 @@ dtb-$(CONFIG_ARCH_QCOM) += \
>   qcom-msm8916-samsung-serranove.dtb \
>   qcom-msm8926-microsoft-superman-lte.dtb \
>   qcom-msm8926-microsoft-tesla.dtb \
> + qcom-msm8926-samsung-matisselte.dtb \
>   qcom-msm8960-cdp.dtb \
>   qcom-msm8960-samsung-expressatt.dtb \
>   qcom-msm8974-lge-nexus5-hammerhead.dtb \
> diff --git a/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts 
> b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
> new file mode 100644
> index ..6e25b1a74ce5
> --- /dev/null
> +++ b/arch/arm/boot/dts/qcom/qcom-msm8926-samsung-matisselte.dts
> @@ -0,0 +1,36 @@
> +// SPDX-License-Identifier: BSD-3-Clause
> +/*
> + * Copyright (c) 2022, Matti Lehtimäki 
> + * Copyright (c) 2023, Stefan Hansson 
> + */
> +
> +/dts-v1/;
> +
> +#include "qcom-msm8226-samsung-matisse-common.dtsi"
> +
> +/ {
> + model = "Samsung Galaxy Tab 4 10.1 LTE";
> + compatible = "samsung,matisselte", "qcom,msm8926", "qcom,msm8226";
> + chassis-type = "tablet";
> +};
> +
> +&pm8226_l3 {
> + regulator-max-microvolt = <135>;
> +};
> +
> +&pm8226_s4 {
> + regulator-max-microvolt = <220>;
> +};
> +
> +®_tsp_3p3v {
> + gpio = <&tlmm 32 GPIO_ACTIVE_HIGH>;
> +};
> +
> +&sdhc_2 {
> + /* SD card fails to probe with error -110 */
> + status = "disabled";
Can you give us some logs?

Konrad


Re: [PATCH v3 4/4] ARM: dts: qcom: samsung-matisse-common: Add UART

2023-10-31 Thread Konrad Dybcio
On 25.10.2023 10:37, Stefan Hansson wrote:
> This was not enabled in the matisse-wifi tree. Without this, it is not
> possible to use the USB port for serial debugging via a "Carkit debug
> cable".
> 
> Signed-off-by: Stefan Hansson 
> ---
Reviewed-by: Konrad Dybcio 

Konrad


Re: [PATCH 9/9] arm64: dts: qcom: qcm6490-fairphone-fp5: Enable WiFi

2023-10-31 Thread Konrad Dybcio
On 31.10.2023 11:31, Luca Weiss wrote:
> On Mon Oct 30, 2023 at 8:26 PM CET, Konrad Dybcio wrote:
>> On 27.10.2023 16:20, Luca Weiss wrote:
>>> Now that the WPSS remoteproc is enabled, enable wifi so we can use it.
>>>
>>> Signed-off-by: Luca Weiss 
>>> ---
>>>  arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts | 4 
>>>  1 file changed, 4 insertions(+)
>>>
>>> diff --git a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts 
>>> b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
>>> index d65eef30091b..e7e20f73cbe6 100644
>>> --- a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
>>> +++ b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
>>> @@ -713,3 +713,7 @@ &venus {
>>> firmware-name = "qcom/qcm6490/fairphone5/venus.mbn";
>>> status = "okay";
>>>  };
>>> +
>>> +&wifi {
>>> +   status = "okay";
>> qcom,ath11k-calibration-variant?
> 
> What value would I put there for my device? Based on existing usages
> (mostly for ath10k) I'd say "Fairphone_5"?
> 
> And you mean I should add this property in dts before even looking into
> the firmware/calibration side of it?
This is basically a "compatible" for the board file, I think Fairphone_5
makes sense here, perhaps Dmitry can confirm

Konrad


Re: [PATCH 8/9] arm64: dts: qcom: qcm6490-fairphone-fp5: Enable various remoteprocs

2023-10-31 Thread Luca Weiss
On Mon Oct 30, 2023 at 8:26 PM CET, Konrad Dybcio wrote:
> On 27.10.2023 16:20, Luca Weiss wrote:
> > Enable the ADSP, CDSP, MPSS and WPSS that are found on the SoC.
> > 
> > Signed-off-by: Luca Weiss 
> > ---
> >  arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts | 20 
> > 
> >  1 file changed, 20 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts 
> > b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> > index cc092735ce17..d65eef30091b 100644
> > --- a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> > +++ b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> > @@ -490,6 +490,26 @@ &qupv3_id_1 {
> > status = "okay";
> >  };
> >  
> > +&remoteproc_adsp {
> > +   firmware-name = "qcom/qcm6490/fairphone5/adsp.mdt";
> > +   status = "okay";
> > +};
> > +
> > +&remoteproc_cdsp {
> > +   firmware-name = "qcom/qcm6490/fairphone5/cdsp.mdt";
> > +   status = "okay";
> > +};
> > +
> > +&remoteproc_mpss {
> > +   firmware-name = "qcom/qcm6490/fairphone5/modem.mdt";
> > +   status = "okay";
> > +};
> > +
> > +&remoteproc_wpss {
> > +   firmware-name = "qcom/qcm6490/fairphone5/wpss.mdt";
> mbn?

Downstream ships mdt but if preferred I can change to .mbn and use
pil-squasher. Not sure what's the correct thing nowadays :)

Regards
Luca

>
> Konrad



Re: [PATCH 9/9] arm64: dts: qcom: qcm6490-fairphone-fp5: Enable WiFi

2023-10-31 Thread Luca Weiss
On Mon Oct 30, 2023 at 8:26 PM CET, Konrad Dybcio wrote:
> On 27.10.2023 16:20, Luca Weiss wrote:
> > Now that the WPSS remoteproc is enabled, enable wifi so we can use it.
> > 
> > Signed-off-by: Luca Weiss 
> > ---
> >  arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts | 4 
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts 
> > b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> > index d65eef30091b..e7e20f73cbe6 100644
> > --- a/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> > +++ b/arch/arm64/boot/dts/qcom/qcm6490-fairphone-fp5.dts
> > @@ -713,3 +713,7 @@ &venus {
> > firmware-name = "qcom/qcm6490/fairphone5/venus.mbn";
> > status = "okay";
> >  };
> > +
> > +&wifi {
> > +   status = "okay";
> qcom,ath11k-calibration-variant?

What value would I put there for my device? Based on existing usages
(mostly for ath10k) I'd say "Fairphone_5"?

And you mean I should add this property in dts before even looking into
the firmware/calibration side of it?

Regards
Luca

>
> Konrad



Re: [PATCH v7 2/3] mm/memory_hotplug: split memmap_on_memory requests across memblocks

2023-10-31 Thread David Hildenbrand

On 31.10.23 03:14, Verma, Vishal L wrote:

On Mon, 2023-10-30 at 11:20 +0100, David Hildenbrand wrote:

On 26.10.23 00:44, Vishal Verma wrote:



[..]


@@ -2146,11 +2186,69 @@ void try_offline_node(int nid)
   }
   EXPORT_SYMBOL(try_offline_node);
   
-static int __ref try_remove_memory(u64 start, u64 size)

+static void __ref remove_memory_blocks_and_altmaps(u64 start, u64 size)
   {
-   struct memory_block *mem;
-   int rc = 0, nid = NUMA_NO_NODE;
+   unsigned long memblock_size = memory_block_size_bytes();
 struct vmem_altmap *altmap = NULL;
+   struct memory_block *mem;
+   u64 cur_start;
+   int rc;
+
+   /*
+    * For memmap_on_memory, the altmaps could have been added on
+    * a per-memblock basis. Loop through the entire range if so,
+    * and remove each memblock and its altmap.
+    */


/*
   * altmaps where added on a per-memblock basis; we have to process
   * each individual memory block.
   */


+   for (cur_start = start; cur_start < start + size;
+    cur_start += memblock_size) {
+   rc = walk_memory_blocks(cur_start, memblock_size, &mem,
+   test_has_altmap_cb);
+   if (rc) {
+   altmap = mem->altmap;
+   /*
+    * Mark altmap NULL so that we can add a debug
+    * check on memblock free.
+    */
+   mem->altmap = NULL;
+   }


Simpler (especially, we know that there must be an altmap):

mem = find_memory_block(pfn_to_section_nr(cur_start));
altmap = mem->altmap;
mem->altmap = NULL;

I think we might be able to remove test_has_altmap_cb() then.


+
+   remove_memory_block_devices(cur_start, memblock_size);
+
+   arch_remove_memory(cur_start, memblock_size, altmap);
+
+   /* Verify that all vmemmap pages have actually been freed. */
+   if (altmap) {


There must be an altmap, so this can be done unconditionally.


Hi David,


Hi!



All other comments make sense, making those changes now.

However for this one, does the WARN() below go away then?

I was wondering if maybe arch_remove_memory() is responsible for
freeing the altmap here, and at this stage we're just checking if that
happened. If it didn't WARN and then free it.


I think that has to stay, to make sure arch_remove_memory() did the 
right thing and we don't -- by BUG -- still have some altmap pages in 
use after they should have been completely freed.




I drilled down the path, and I don't see altmap actually getting freed
in vmem_altmap_free(), but I wasn't sure if  was meant
to free it as altmap->alloc went down to 0.



vmem_altmap_free() does the "altmap->alloc -= nr_pfns", which is called 
when arch_remove_memory() frees the vmemmap pages and detects that they 
actually come from the altmap reserve and not from the buddy/earlyboot 
allocator.


Freeing an altmap is just unaccounting it in the altmap structure; and 
here we make sure that we are actually back down to 0 and don't have 
some weird altmap freeing BUG in arch_remove_memory().


--
Cheers,

David / dhildenb




Re: [PATCH] tracing/kprobes: Fix the order of argument descriptions

2023-10-31 Thread Mukesh Ojha




On 10/31/2023 9:43 AM, Yujie Liu wrote:

The order of descriptions should be consistent with the argument list of
the function, so "kretprobe" should be the second one.

int __kprobe_event_gen_cmd_start(struct dynevent_cmd *cmd, bool kretprobe,
  const char *name, const char *loc, ...)

Fixes: 2a588dd1d5d6 ("tracing: Add kprobe event command generation functions")
Suggested-by: Mukesh Ojha 
Signed-off-by: Yujie Liu 


Thanks.

Reviewed-by: Mukesh Ojha 

-Mukesh


---
  kernel/trace/trace_kprobe.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index e834f149695b..47812aa16bb5 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1020,9 +1020,9 @@ EXPORT_SYMBOL_GPL(kprobe_event_cmd_init);
  /**
   * __kprobe_event_gen_cmd_start - Generate a kprobe event command from arg 
list
   * @cmd: A pointer to the dynevent_cmd struct representing the new event
+ * @kretprobe: Is this a return probe?
   * @name: The name of the kprobe event
   * @loc: The location of the kprobe event
- * @kretprobe: Is this a return probe?
   * @...: Variable number of arg (pairs), one pair for each field
   *
   * NOTE: Users normally won't want to call this function directly, but