Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy
On Wed, 31 Jan 2024 22:21:27 -0500 Steven Rostedt wrote: > We (Linus and I) got it wrong. It originally had: > > d_add(dentry, NULL); > [..] > return NULL; OK, so I changed that function to this: static struct dentry *eventfs_root_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags) { struct eventfs_inode *ei_child; struct tracefs_inode *ti; struct eventfs_inode *ei; const char *name = dentry->d_name.name; ti = get_tracefs(dir); if (!(ti->flags & TRACEFS_EVENT_INODE)) return ERR_PTR(-EIO); mutex_lock(&eventfs_mutex); ei = ti->private; if (!ei || ei->is_freed) goto out; list_for_each_entry(ei_child, &ei->children, list) { if (strcmp(ei_child->name, name) != 0) continue; if (ei_child->is_freed) goto out; lookup_dir_entry(dentry, ei, ei_child); goto out; } for (int i = 0; i < ei->nr_entries; i++) { void *data; umode_t mode; const struct file_operations *fops; const struct eventfs_entry *entry = &ei->entries[i]; if (strcmp(name, entry->name) != 0) continue; data = ei->data; if (entry->callback(name, &mode, &data, &fops) <= 0) goto out; lookup_file_dentry(dentry, ei, i, mode, data, fops); goto out; } out: mutex_unlock(&eventfs_mutex); return NULL; } And it passes the make kprobe test. I'll send out a v3 of this patch, and remove the inc_nlink(dentry->d_parent->d_inode) and the fsnotify as separate patches as that code was there before Linus touched it. Thanks, -- Steve
Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy
On Thu, 1 Feb 2024 03:02:05 + Al Viro wrote: > > We had a problem here with just returning NULL. It leaves the negative > > dentry around and doesn't get refreshed. > > Why would that dentry stick around? And how would anyone find > it, anyway, when it's not hashed? We (Linus and I) got it wrong. It originally had: d_add(dentry, NULL); [..] return NULL; and it caused the: # ls events/kprobes/sched/ ls: cannot access 'events/kprobes/sched/': No such file or directory # echo 'p:sched schedule' >> /sys/kernel/tracing/kprobe_events # ls events/kprobes/sched/ ls: cannot access 'events/kprobes/sched/': No such file or directory I just changed the code to simply return NULL, and it had no issues: # ls events/kprobes/sched/ ls: cannot access 'events/kprobes/sched/': No such file or directory # echo 'p:sched schedule' >> /sys/kernel/tracing/kprobe_events # ls events/kprobes/sched/ enable filter format hist hist_debug id inject trigger But then I added the: d_add(dentry, NULL); that we originally had, and then it caused the issue again. So it wasn't the returning NULL that was causing a problem, it was calling the d_add(dentry, NULL); that was. I'll update the patch. -- Steve
Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy
On Wed, Jan 31, 2024 at 09:26:42PM -0500, Steven Rostedt wrote: > > Huh? Just return NULL and be done with that - you'll get an > > unhashed negative dentry and let the caller turn that into > > -ENOENT... > > We had a problem here with just returning NULL. It leaves the negative > dentry around and doesn't get refreshed. Why would that dentry stick around? And how would anyone find it, anyway, when it's not hashed? > I did this: > > # cd /sys/kernel/tracing > # ls events/kprobes/sched/ > ls: cannot access 'events/kprobes/sched/': No such file or directory > # echo 'p:sched schedule' >> kprobe_events > # ls events/kprobes/sched/ > ls: cannot access 'events/kprobes/sched/': No such file or directory > > When it should have been: > > # ls events/kprobes/sched/ > enable filter format hist hist_debug id inject trigger > > Leaving the negative dentry there will have it fail when the directory > exists the next time. Then you have something very deeply fucked up. NULL or ERR_PTR(-ENOENT) from ->lookup() in the last component of open() would do exactly the same thing: dput() whatever had been passed to ->lookup() and fail open(2) with -ENOENT.
Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy
On Thu, 1 Feb 2024 00:27:19 + Al Viro wrote: > On Wed, Jan 31, 2024 at 01:49:22PM -0500, Steven Rostedt wrote: > > > @@ -329,32 +320,29 @@ static struct dentry *create_file(const char *name, > > umode_t mode, > > > > ti = get_tracefs(inode); > > ti->flags |= TRACEFS_EVENT_INODE; > > - d_instantiate(dentry, inode); > > + > > + d_add(dentry, inode); > > fsnotify_create(dentry->d_parent->d_inode, dentry); > > Seriously? stat(2), have it go away from dcache on memory pressure, > lather, rinse, repeat... Won't *snotify get confused by the stream > of creations of the same thing, with not a removal in sight? > That looks to be cut and paste from the old create in tracefs. I don't know of a real use case for that. I think we could possibly delete it without anyone noticing. > > - return eventfs_end_creating(dentry); > > + return dentry; > > }; > > > > @@ -371,11 +359,14 @@ static struct dentry *create_dir(struct eventfs_inode > > *ei, struct dentry *parent > > /* Only directories have ti->private set to an ei, not files */ > > ti->private = ei; > > > > + dentry->d_fsdata = ei; > > +ei->dentry = dentry; // Remove me! > > + > > inc_nlink(inode); > > - d_instantiate(dentry, inode); > > + d_add(dentry, inode); > > inc_nlink(dentry->d_parent->d_inode); > > What will happen when that thing gets evicted from dcache, > gets looked up again, and again, and...? > > > fsnotify_mkdir(dentry->d_parent->d_inode, dentry); > > Same re snotify confusion... Yeah, again, I think it's useless. Doing that is more useless than taring the tracefs directory ;-) > > > - return eventfs_end_creating(dentry); > > + return dentry; > > } > > > > static void free_ei(struct eventfs_inode *ei) > > @@ -425,7 +416,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode > > *ti, struct dentry *dentry) > > } > > > > @@ -607,79 +462,55 @@ static struct dentry *eventfs_root_lookup(struct > > inode *dir, > > struct dentry *dentry, > > unsigned int flags) > > { > > - const struct file_operations *fops; > > - const struct eventfs_entry *entry; > > struct eventfs_inode *ei_child; > > struct tracefs_inode *ti; > > struct eventfs_inode *ei; > > - struct dentry *ei_dentry = NULL; > > - struct dentry *ret = NULL; > > - struct dentry *d; > > const char *name = dentry->d_name.name; > > - umode_t mode; > > - void *data; > > - int idx; > > - int i; > > - int r; > > + struct dentry *result = NULL; > > > > ti = get_tracefs(dir); > > if (!(ti->flags & TRACEFS_EVENT_INODE)) > > Can that ever happen? I mean, why set ->i_op to something that > has this for ->lookup() on a directory without TRACEFS_EVENT_INODE in > its inode? It's not as if you ever removed that flag... That's been there mostly as paranoia. Should probably be switched to: if (WARN_ON_ONCE(!(ti->flags & TRACEFS_EVENT_INODE))) > > > - return NULL; > > - > > - /* Grab srcu to prevent the ei from going away */ > > - idx = srcu_read_lock(&eventfs_srcu); > > + return ERR_PTR(-EIO); > > > > - /* > > -* Grab the eventfs_mutex to consistent value from ti->private. > > -* This s > > -*/ > > mutex_lock(&eventfs_mutex); > > - ei = READ_ONCE(ti->private); > > - if (ei && !ei->is_freed) > > - ei_dentry = READ_ONCE(ei->dentry); > > - mutex_unlock(&eventfs_mutex); > > - > > - if (!ei || !ei_dentry) > > - goto out; > > > > - data = ei->data; > > + ei = ti->private; > > + if (!ei || ei->is_freed) > > + goto enoent; > > > > - list_for_each_entry_srcu(ei_child, &ei->children, list, > > -srcu_read_lock_held(&eventfs_srcu)) { > > + list_for_each_entry(ei_child, &ei->children, list) { > > if (strcmp(ei_child->name, name) != 0) > > continue; > > - ret = simple_lookup(dir, dentry, flags); > > - if (IS_ERR(ret)) > > - goto out; > > - d = create_dir_dentry(ei, ei_child, ei_dentry); > > - dput(d); > > + if (ei_child->is_freed) > > + goto enoent; > > Out of curiosity - can that happen now? You've got exclusion with > eventfs_remove_rec(), so you shouldn't be able to catch the moment > between setting ->is_freed and removal from the list... Yeah, that's from when we just used SRCU. If anything, it too should just add a WARN_ON_ONCE() to it. > > > + lookup_dir_entry(dentry, ei, ei_child); > > goto out; > > } > > > > - for (i = 0; i < ei->nr_entries; i++) { > > - entry = &ei->entries[i]; > > - if (strcmp(name, entry->name) == 0) { > > - void *cdata = data; > > - mutex_lock(&eventfs_mutex); > > - /* If ei->is_freed, then the event
Re: [PATCH v2 4/7] tracefs: dentry lookup crapectomy
On Wed, Jan 31, 2024 at 01:49:22PM -0500, Steven Rostedt wrote: > @@ -329,32 +320,29 @@ static struct dentry *create_file(const char *name, > umode_t mode, > > ti = get_tracefs(inode); > ti->flags |= TRACEFS_EVENT_INODE; > - d_instantiate(dentry, inode); > + > + d_add(dentry, inode); > fsnotify_create(dentry->d_parent->d_inode, dentry); Seriously? stat(2), have it go away from dcache on memory pressure, lather, rinse, repeat... Won't *snotify get confused by the stream of creations of the same thing, with not a removal in sight? > - return eventfs_end_creating(dentry); > + return dentry; > }; > > /** > - * create_dir - create a dir in the tracefs filesystem > + * lookup_dir_entry - look up a dir in the tracefs filesystem > + * @dentry: the directory to look up > * @ei: the eventfs_inode that represents the directory to create > - * @parent: parent dentry for this file. > * > - * This function will create a dentry for a directory represented by > + * This function will look up a dentry for a directory represented by > * a eventfs_inode. > */ > -static struct dentry *create_dir(struct eventfs_inode *ei, struct dentry > *parent) > +static struct dentry *lookup_dir_entry(struct dentry *dentry, > + struct eventfs_inode *pei, struct eventfs_inode *ei) > { > struct tracefs_inode *ti; > - struct dentry *dentry; > struct inode *inode; > > - dentry = eventfs_start_creating(ei->name, parent); > - if (IS_ERR(dentry)) > - return dentry; > - > inode = tracefs_get_inode(dentry->d_sb); > if (unlikely(!inode)) > - return eventfs_failed_creating(dentry); > + return ERR_PTR(-ENOMEM); > > /* If the user updated the directory's attributes, use them */ > update_inode_attr(dentry, inode, &ei->attr, > @@ -371,11 +359,14 @@ static struct dentry *create_dir(struct eventfs_inode > *ei, struct dentry *parent > /* Only directories have ti->private set to an ei, not files */ > ti->private = ei; > > + dentry->d_fsdata = ei; > +ei->dentry = dentry; // Remove me! > + > inc_nlink(inode); > - d_instantiate(dentry, inode); > + d_add(dentry, inode); > inc_nlink(dentry->d_parent->d_inode); What will happen when that thing gets evicted from dcache, gets looked up again, and again, and...? > fsnotify_mkdir(dentry->d_parent->d_inode, dentry); Same re snotify confusion... > - return eventfs_end_creating(dentry); > + return dentry; > } > > static void free_ei(struct eventfs_inode *ei) > @@ -425,7 +416,7 @@ void eventfs_set_ei_status_free(struct tracefs_inode *ti, > struct dentry *dentry) > } > > /** > - * create_file_dentry - create a dentry for a file of an eventfs_inode > + * lookup_file_dentry - create a dentry for a file of an eventfs_inode > * @ei: the eventfs_inode that the file will be created under > * @idx: the index into the d_children[] of the @ei > * @parent: The parent dentry of the created file. > @@ -438,157 +429,21 @@ void eventfs_set_ei_status_free(struct tracefs_inode > *ti, struct dentry *dentry) > * address located at @e_dentry. > */ > static struct dentry * > -create_file_dentry(struct eventfs_inode *ei, int idx, > -struct dentry *parent, const char *name, umode_t mode, void > *data, > +lookup_file_dentry(struct dentry *dentry, > +struct eventfs_inode *ei, int idx, > +umode_t mode, void *data, > const struct file_operations *fops) > { > struct eventfs_attr *attr = NULL; > struct dentry **e_dentry = &ei->d_children[idx]; > - struct dentry *dentry; > - > - WARN_ON_ONCE(!inode_is_locked(parent->d_inode)); > > - mutex_lock(&eventfs_mutex); > - if (ei->is_freed) { > - mutex_unlock(&eventfs_mutex); > - return NULL; > - } > - /* If the e_dentry already has a dentry, use it */ > - if (*e_dentry) { > - dget(*e_dentry); > - mutex_unlock(&eventfs_mutex); > - return *e_dentry; > - } > - > - /* ei->entry_attrs are protected by SRCU */ > if (ei->entry_attrs) > attr = &ei->entry_attrs[idx]; > > - mutex_unlock(&eventfs_mutex); > - > - dentry = create_file(name, mode, attr, parent, data, fops); > - > - mutex_lock(&eventfs_mutex); > - > - if (IS_ERR_OR_NULL(dentry)) { > - /* > - * When the mutex was released, something else could have > - * created the dentry for this e_dentry. In which case > - * use that one. > - * > - * If ei->is_freed is set, the e_dentry is currently on its > - * way to being freed, don't return it. If e_dentry is NULL > - * it means it was already freed. > - */ > - if (ei->is_freed) { > - dentry = NULL; > - } else { > -