Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy

2024-01-31 Thread Steven Rostedt
On Wed, 31 Jan 2024 22:21:27 -0500
Steven Rostedt  wrote:

> We (Linus and I) got it wrong. It originally had:
> 
>   d_add(dentry, NULL);
>   [..]
>   return NULL;

OK, so I changed that function to this:

static struct dentry *eventfs_root_lookup(struct inode *dir,
  struct dentry *dentry,
  unsigned int flags)
{
struct eventfs_inode *ei_child;
struct tracefs_inode *ti;
struct eventfs_inode *ei;
const char *name = dentry->d_name.name;

ti = get_tracefs(dir);
if (!(ti->flags & TRACEFS_EVENT_INODE))
return ERR_PTR(-EIO);

mutex_lock(_mutex);

ei = ti->private;
if (!ei || ei->is_freed)
goto out;

list_for_each_entry(ei_child, >children, list) {
if (strcmp(ei_child->name, name) != 0)
continue;
if (ei_child->is_freed)
goto out;
lookup_dir_entry(dentry, ei, ei_child);
goto out;
}

for (int i = 0; i < ei->nr_entries; i++) {
void *data;
umode_t mode;
const struct file_operations *fops;
const struct eventfs_entry *entry = >entries[i];

if (strcmp(name, entry->name) != 0)
continue;

data = ei->data;
if (entry->callback(name, , , ) <= 0)
goto out;

lookup_file_dentry(dentry, ei, i, mode, data, fops);
goto out;
}
 out:
mutex_unlock(_mutex);
return NULL;
}

And it passes the make kprobe test. I'll send out a v3 of this patch, and
remove the inc_nlink(dentry->d_parent->d_inode) and the fsnotify as
separate patches as that code was there before Linus touched it.

Thanks,

-- Steve



Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy

2024-01-31 Thread Steven Rostedt
On Thu, 1 Feb 2024 03:02:05 +
Al Viro  wrote:

> > We had a problem here with just returning NULL. It leaves the negative
> > dentry around and doesn't get refreshed.  
> 
> Why would that dentry stick around?  And how would anyone find
> it, anyway, when it's not hashed?

We (Linus and I) got it wrong. It originally had:

d_add(dentry, NULL);
[..]
return NULL;

and it caused the:


  # ls events/kprobes/sched/
ls: cannot access 'events/kprobes/sched/': No such file or directory

  # echo 'p:sched schedule' >> /sys/kernel/tracing/kprobe_events 
  # ls events/kprobes/sched/
ls: cannot access 'events/kprobes/sched/': No such file or directory

I just changed the code to simply return NULL, and it had no issues:

  # ls events/kprobes/sched/
ls: cannot access 'events/kprobes/sched/': No such file or directory

  # echo 'p:sched schedule' >> /sys/kernel/tracing/kprobe_events 
  # ls events/kprobes/sched/
enable  filter  format  hist  hist_debug  id  inject  trigger

But then I added the: d_add(dentry, NULL); that we originally had, and then
it caused the issue again.

So it wasn't the returning NULL that was causing a problem, it was calling
the d_add(dentry, NULL); that was.

I'll update the patch.

-- Steve



Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy

2024-01-31 Thread Al Viro
On Wed, Jan 31, 2024 at 09:26:42PM -0500, Steven Rostedt wrote:

> > Huh?  Just return NULL and be done with that - you'll get an
> > unhashed negative dentry and let the caller turn that into
> > -ENOENT...
> 
> We had a problem here with just returning NULL. It leaves the negative
> dentry around and doesn't get refreshed.

Why would that dentry stick around?  And how would anyone find
it, anyway, when it's not hashed?

> I did this:
> 
>  # cd /sys/kernel/tracing
>  # ls events/kprobes/sched/
> ls: cannot access 'events/kprobes/sched/': No such file or directory
>  # echo 'p:sched schedule' >> kprobe_events
>  # ls events/kprobes/sched/
> ls: cannot access 'events/kprobes/sched/': No such file or directory
> 
> When it should have been:
> 
>  # ls events/kprobes/sched/
> enable  filter  format  hist  hist_debug  id  inject  trigger
> 
> Leaving the negative dentry there will have it fail when the directory
> exists the next time.

Then you have something very deeply fucked up.  NULL or ERR_PTR(-ENOENT)
from ->lookup() in the last component of open() would do exactly the
same thing: dput() whatever had been passed to ->lookup() and fail
open(2) with -ENOENT.



Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy

2024-01-31 Thread Steven Rostedt
On Thu, 1 Feb 2024 00:27:19 +
Al Viro  wrote:

> On Wed, Jan 31, 2024 at 01:49:22PM -0500, Steven Rostedt wrote:
> 
> > @@ -329,32 +320,29 @@ static struct dentry *create_file(const char *name, 
> > umode_t mode,
> >  
> > ti = get_tracefs(inode);
> > ti->flags |= TRACEFS_EVENT_INODE;
> > -   d_instantiate(dentry, inode);
> > +
> > +   d_add(dentry, inode);
> > fsnotify_create(dentry->d_parent->d_inode, dentry);  
> 
> Seriously?  stat(2), have it go away from dcache on memory pressure,
> lather, rinse, repeat...  Won't *snotify get confused by the stream
> of creations of the same thing, with not a removal in sight?
> 

That looks to be cut and paste from the old create in tracefs. I don't know
of a real use case for that. I think we could possibly delete it without
anyone noticing.


> > -   return eventfs_end_creating(dentry);
> > +   return dentry;
> >  };
> >  


> > @@ -371,11 +359,14 @@ static struct dentry *create_dir(struct eventfs_inode 
> > *ei, struct dentry *parent
> > /* Only directories have ti->private set to an ei, not files */
> > ti->private = ei;
> >  
> > +   dentry->d_fsdata = ei;
> > +ei->dentry = dentry;   // Remove me!
> > +
> > inc_nlink(inode);
> > -   d_instantiate(dentry, inode);
> > +   d_add(dentry, inode);
> > inc_nlink(dentry->d_parent->d_inode);  
> 
> What will happen when that thing gets evicted from dcache,
> gets looked up again, and again, and...?
> 
> > fsnotify_mkdir(dentry->d_parent->d_inode, dentry);  
> 
> Same re snotify confusion...

Yeah, again, I think it's useless. Doing that is more useless than taring
the tracefs directory ;-)

> 
> > -   return eventfs_end_creating(dentry);
> > +   return dentry;
> >  }
> >  
> >  static void free_ei(struct eventfs_inode *ei)
> > @@ -425,7 +416,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode 
> > *ti, struct dentry *dentry)
> >  }
> >  


> > @@ -607,79 +462,55 @@ static struct dentry *eventfs_root_lookup(struct 
> > inode *dir,
> >   struct dentry *dentry,
> >   unsigned int flags)
> >  {
> > -   const struct file_operations *fops;
> > -   const struct eventfs_entry *entry;
> > struct eventfs_inode *ei_child;
> > struct tracefs_inode *ti;
> > struct eventfs_inode *ei;
> > -   struct dentry *ei_dentry = NULL;
> > -   struct dentry *ret = NULL;
> > -   struct dentry *d;
> > const char *name = dentry->d_name.name;
> > -   umode_t mode;
> > -   void *data;
> > -   int idx;
> > -   int i;
> > -   int r;
> > +   struct dentry *result = NULL;
> >  
> > ti = get_tracefs(dir);
> > if (!(ti->flags & TRACEFS_EVENT_INODE))  
> 
>   Can that ever happen?  I mean, why set ->i_op to something that
> has this for ->lookup() on a directory without TRACEFS_EVENT_INODE in
> its inode?  It's not as if you ever removed that flag...

That's been there mostly as paranoia. Should probably be switched to:

if (WARN_ON_ONCE(!(ti->flags & TRACEFS_EVENT_INODE)))


> 
> > -   return NULL;
> > -
> > -   /* Grab srcu to prevent the ei from going away */
> > -   idx = srcu_read_lock(_srcu);
> > +   return ERR_PTR(-EIO);
> >  
> > -   /*
> > -* Grab the eventfs_mutex to consistent value from ti->private.
> > -* This s
> > -*/
> > mutex_lock(_mutex);
> > -   ei = READ_ONCE(ti->private);
> > -   if (ei && !ei->is_freed)
> > -   ei_dentry = READ_ONCE(ei->dentry);
> > -   mutex_unlock(_mutex);
> > -
> > -   if (!ei || !ei_dentry)
> > -   goto out;
> >  
> > -   data = ei->data;
> > +   ei = ti->private;
> > +   if (!ei || ei->is_freed)
> > +   goto enoent;
> >  
> > -   list_for_each_entry_srcu(ei_child, >children, list,
> > -srcu_read_lock_held(_srcu)) {
> > +   list_for_each_entry(ei_child, >children, list) {
> > if (strcmp(ei_child->name, name) != 0)
> > continue;
> > -   ret = simple_lookup(dir, dentry, flags);
> > -   if (IS_ERR(ret))
> > -   goto out;
> > -   d = create_dir_dentry(ei, ei_child, ei_dentry);
> > -   dput(d);
> > +   if (ei_child->is_freed)
> > +   goto enoent;  
> 
> Out of curiosity - can that happen now?  You've got exclusion with
> eventfs_remove_rec(), so you shouldn't be able to catch the moment
> between setting ->is_freed and removal from the list...

Yeah, that's from when we just used SRCU. If anything, it too should just
add a WARN_ON_ONCE() to it.

> 
> > +   lookup_dir_entry(dentry, ei, ei_child);
> > goto out;
> > }
> >  
> > -   for (i = 0; i < ei->nr_entries; i++) {
> > -   entry = >entries[i];
> > -   if (strcmp(name, entry->name) == 0) {
> > -   void *cdata = data;
> > -   mutex_lock(_mutex);
> > -   /* If ei->is_freed, then the event itself may be too */
> > -   if 

Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy

2024-01-31 Thread Al Viro
On Wed, Jan 31, 2024 at 01:49:22PM -0500, Steven Rostedt wrote:

> @@ -329,32 +320,29 @@ static struct dentry *create_file(const char *name, 
> umode_t mode,
>  
>   ti = get_tracefs(inode);
>   ti->flags |= TRACEFS_EVENT_INODE;
> - d_instantiate(dentry, inode);
> +
> + d_add(dentry, inode);
>   fsnotify_create(dentry->d_parent->d_inode, dentry);

Seriously?  stat(2), have it go away from dcache on memory pressure,
lather, rinse, repeat...  Won't *snotify get confused by the stream
of creations of the same thing, with not a removal in sight?

> - return eventfs_end_creating(dentry);
> + return dentry;
>  };
>  
>  /**
> - * create_dir - create a dir in the tracefs filesystem
> + * lookup_dir_entry - look up a dir in the tracefs filesystem
> + * @dentry: the directory to look up
>   * @ei: the eventfs_inode that represents the directory to create
> - * @parent: parent dentry for this file.
>   *
> - * This function will create a dentry for a directory represented by
> + * This function will look up a dentry for a directory represented by
>   * a eventfs_inode.
>   */
> -static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry 
> *parent)
> +static struct dentry *lookup_dir_entry(struct dentry *dentry,
> + struct eventfs_inode *pei, struct eventfs_inode *ei)
>  {
>   struct tracefs_inode *ti;
> - struct dentry *dentry;
>   struct inode *inode;
>  
> - dentry = eventfs_start_creating(ei->name, parent);
> - if (IS_ERR(dentry))
> - return dentry;
> -
>   inode = tracefs_get_inode(dentry->d_sb);
>   if (unlikely(!inode))
> - return eventfs_failed_creating(dentry);
> + return ERR_PTR(-ENOMEM);
>  
>   /* If the user updated the directory's attributes, use them */
>   update_inode_attr(dentry, inode, >attr,
> @@ -371,11 +359,14 @@ static struct dentry *create_dir(struct eventfs_inode 
> *ei, struct dentry *parent
>   /* Only directories have ti->private set to an ei, not files */
>   ti->private = ei;
>  
> + dentry->d_fsdata = ei;
> +ei->dentry = dentry; // Remove me!
> +
>   inc_nlink(inode);
> - d_instantiate(dentry, inode);
> + d_add(dentry, inode);
>   inc_nlink(dentry->d_parent->d_inode);

What will happen when that thing gets evicted from dcache,
gets looked up again, and again, and...?

>   fsnotify_mkdir(dentry->d_parent->d_inode, dentry);

Same re snotify confusion...

> - return eventfs_end_creating(dentry);
> + return dentry;
>  }
>  
>  static void free_ei(struct eventfs_inode *ei)
> @@ -425,7 +416,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, 
> struct dentry *dentry)
>  }
>  
>  /**
> - * create_file_dentry - create a dentry for a file of an eventfs_inode
> + * lookup_file_dentry - create a dentry for a file of an eventfs_inode
>   * @ei: the eventfs_inode that the file will be created under
>   * @idx: the index into the d_children[] of the @ei
>   * @parent: The parent dentry of the created file.
> @@ -438,157 +429,21 @@ void eventfs_set_ei_status_free(struct tracefs_inode 
> *ti, struct dentry *dentry)
>   * address located at @e_dentry.
>   */
>  static struct dentry *
> -create_file_dentry(struct eventfs_inode *ei, int idx,
> -struct dentry *parent, const char *name, umode_t mode, void 
> *data,
> +lookup_file_dentry(struct dentry *dentry,
> +struct eventfs_inode *ei, int idx,
> +umode_t mode, void *data,
>  const struct file_operations *fops)
>  {
>   struct eventfs_attr *attr = NULL;
>   struct dentry **e_dentry = >d_children[idx];
> - struct dentry *dentry;
> -
> - WARN_ON_ONCE(!inode_is_locked(parent->d_inode));
>  
> - mutex_lock(_mutex);
> - if (ei->is_freed) {
> - mutex_unlock(_mutex);
> - return NULL;
> - }
> - /* If the e_dentry already has a dentry, use it */
> - if (*e_dentry) {
> - dget(*e_dentry);
> - mutex_unlock(_mutex);
> - return *e_dentry;
> - }
> -
> - /* ei->entry_attrs are protected by SRCU */
>   if (ei->entry_attrs)
>   attr = >entry_attrs[idx];
>  
> - mutex_unlock(_mutex);
> -
> - dentry = create_file(name, mode, attr, parent, data, fops);
> -
> - mutex_lock(_mutex);
> -
> - if (IS_ERR_OR_NULL(dentry)) {
> - /*
> -  * When the mutex was released, something else could have
> -  * created the dentry for this e_dentry. In which case
> -  * use that one.
> -  *
> -  * If ei->is_freed is set, the e_dentry is currently on its
> -  * way to being freed, don't return it. If e_dentry is NULL
> -  * it means it was already freed.
> -  */
> - if (ei->is_freed) {
> - dentry = NULL;
> - } else {
> - dentry = *e_dentry;
> -