On Fri, 01 May 2020 16:04:41 +0200 Christian Schoenebeck <qemu_...@crudebyte.com> wrote:
> On Donnerstag, 30. April 2020 15:30:49 CEST Greg Kurz wrote:
> > > > I agree that a client that issues concurrent readdir requests on
> > > > the same fid is probably asking for troubles, but this is
> > > > permitted by the spec. Whether we should detect such conditions
> > > > and warn or even fail is discussion for another thread.
> > > >
> > > > The locking is only needed to avoid concurrent accesses to the
> > > > dirent structure returned by readdir(), otherwise we could return
> > > > partially overwritten file names to the client. It must be done
> > > > for each individual call to readdir(), but certainly not for
> > > > multiple calls.
> > >
> > > Yeah, that would resolve this issue more appropriately for
> > > 9p2000.L, since Treaddir specifies an offset, but for 9p2000.u the
> > > result of a concurrent read on a directory would still be
> > > undefined.
> >
> > The bad client behavior you want to tackle has nothing to do with
> > the locking itself. Since all the code in 9p.c runs serialized in
> > the context of the QEMU main loop, concurrent readdir requests could
> > easily be detected up-front with a simple flag in the fid structure.
>
> Well, it's fine with me. I don't really see an issue here right now.
> But that all the code was serialized is not fully true. Most of the
> 9p.c code is still heavily dispatching between main thread and worker
> threads back and forth. And for that reason the order of request
> processing might change quite arbitrarily in between. Just keep that
> in mind.

Just to make things clear: the code in 9p.c is ALWAYS exclusively run
by the main thread. Only the code called under v9fs_co_run_in_worker()
is dispatched on worker threads. So, yes, the order of individual
backend operations may change, but the start of a new client request
is necessarily serialized with the completion of pending ones, which
is the only thing we care about actually.
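[Editor's note: to illustrate the point above, here is a hypothetical,
self-contained sketch of the "simple flag in the fid structure" idea.
The names (FidState, readdir_in_flight, readdir_begin/end) are
illustrative, not QEMU's actual code. Because every request handler in
9p.c starts on the serialized main loop, a plain bool is enough; no
locking is needed to maintain the flag itself.]

```c
#include <stdbool.h>

/*
 * Hypothetical sketch, not QEMU code: per-fid state carrying a flag
 * that detects a client issuing a second Treaddir on a fid whose
 * previous one is still in flight.
 */
typedef struct FidState {
    int fid;
    bool readdir_in_flight;   /* set while a Treaddir is being served */
} FidState;

/* Returns 0 if the readdir may proceed, -1 if one is already pending. */
static int readdir_begin(FidState *fidp)
{
    if (fidp->readdir_in_flight) {
        return -1;  /* concurrent readdir on the same fid: warn or fail */
    }
    fidp->readdir_in_flight = true;
    return 0;
}

/* Clears the flag when the request completes, on success or error. */
static void readdir_end(FidState *fidp)
{
    fidp->readdir_in_flight = false;
}
```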
> > > > > +
> > > > > +    /* save the directory position */
> > > > > +    saved_dir_pos = s->ops->telldir(&s->ctx, &fidp->fs);
> > > > > +    if (saved_dir_pos < 0) {
> > > > > +        err = saved_dir_pos;
> > > > > +        goto out;
> > > > > +    }
> > > > > +
> > > > > +    while (true) {
> > > > > +        /* get directory entry from fs driver */
> > > > > +        err = do_readdir(pdu, fidp, &dent);
> > > > > +        if (err || !dent) {
> > > > > +            break;
> > > > > +        }
> > > > > +
> > > > > +        /*
> > > > > +         * stop this loop as soon as it would exceed the allowed
> > > > > +         * maximum response message size for the directory entries
> > > > > +         * collected so far, because anything beyond that size would
> > > > > +         * need to be discarded by 9p controller (main thread / top
> > > > > +         * half) anyway
> > > > > +         */
> > > > > +        v9fs_string_init(&name);
> > > > > +        v9fs_string_sprintf(&name, "%s", dent->d_name);
> > > > > +        len = v9fs_readdir_response_size(&name);
> > > > > +        v9fs_string_free(&name);
> > > > > +        if (size + len > maxsize) {
> > > > > +            /* this is not an error case actually */
> > > > > +            break;
> > > > > +        }
> > > > > +
> > > > > +        /* append next node to result chain */
> > > > > +        if (!e) {
> > > > > +            *entries = e = g_malloc0(sizeof(V9fsDirEnt));
> > > > > +        } else {
> > > > > +            e = e->next = g_malloc0(sizeof(V9fsDirEnt));
> > > > > +        }
> > > > > +        e->dent = g_malloc0(sizeof(struct dirent));
> > > >
> > > > So we're allocating a bunch of stuff here...
> > > >
> > > > > +        memcpy(e->dent, dent, sizeof(struct dirent));
> > > > > +
> > > > > +        /* perform a full stat() for directory entry if requested by caller */
> > > > > +        if (dostat) {
> > > > > +            err = s->ops->name_to_path(
> > > > > +                &s->ctx, &fidp->path, dent->d_name, &path
> > > > > +            );
> > > > > +            if (err < 0) {
> > > > >                  err = -errno;
> > > > > -            } else {
> > > > > -                *dent = entry;
> > > > > -                err = 0;
> > > > > +                break;
> > > >
> > > > ...
> > > > but we're erroring out there and it seems that we're leaking
> > > > all the entries that have been allocated so far.
> > >
> > > No, they are not leaking actually.
> > >
> > > You are right that they are not deallocated in do_readdir_many(),
> > > but that's intentional: in the new implementation of
> > > v9fs_do_readdir() you see that v9fs_free_dirents(entries) is
> > > *always* called at the very end of the function, no matter if
> > > success or any error. That's one of the measures to simplify
> > > overall code as much as possible.
> >
> > Hmm... I still don't quite like the idea of having an erroring
> > function asking for extra cleanup. I suggest you come up with an
> > idempotent version of v9fs_free_dirents(), move it to codir.c (I
> > also prefer locality of calls to g_malloc and g_free in the same
> > unit), make it extern and call it both on the error path of
> > v9fs_co_readdir_many() and in v9fs_do_readdir().
>
> I understand your position of course, but I still don't find that to
> be a good move.
>
> My veto here has a reason: your requested change would prevent an
> application that I had in mind for future purpose actually: allowing
> "greedy" fetching

Are you telling me that this series has some kind of hidden agenda
related to a possible future change ?!?

> of directory entries, that is, returning all successfully read
> entries while some of the entries might have errored for some
> reason. Think about a directory where one entry is another device
> which errors for some reason; then a user probably still would want
> to see the other entries at least. I know that

Hrm... if dostat is such a weak requirement that callers might want to
simply ignore the missing stat, then readdir_many() shouldn't error
out in the first place.

> purpose would still need some minor adjustments, but no changes to
> the current structure of this function actually.
>
> But to also make this clear: there is no "extra" cleanup right now.
> v9fs_free_dirents(entries) is *always* called by the caller, no
> matter if success or error, allocated list or NULL. It couldn't be
> more simple. A user of this function must call
> v9fs_free_dirents(entries) at a certain point anyway.
>
> If you have a bad feeling about it, I can make it more clear by an
> API doc comment on v9fs_co_readdir_many() if you want, like e.g.
> "You @b MUST @b ALWAYS call v9fs_free_dirents() after using this
> function, both on success and error.".

No, I just cannot ack the case of a function that returns partially
valid data and an error at the same time. Doesn't look sane to me.

> > > I think you mean the case when there's an error inside the
> > > if (dostat) {} block: the comments on struct V9fsDirEnt already
> > > suggest that the "st" member is optional and may be NULL. So if
> > > there's an error inside if (dostat) {} then the caller still has
> > > a valid "dent" field at least, and it's up to the caller whether
> > > or not it's a problem for its purpose that "st" is empty. For
> > > that reason I would not move the block forward.
> >
> > Hrm... the comment is:
> >
> >     /*
> >      * optional (may be NULL): A full stat of each directory entry
> >      * is just done if explicitly told to fs driver.
> >      */
> >
> > I don't read that it is optional for the fs driver to populate
> > "st" if this was required by the caller.
>
> Well, if you find that ambiguous, I could add an additional sentence
> "It might also be NULL if stat failed.".
>
> > Also, looking at the next patch I see that the condition for
> > calling stat() is V9FS_REMAP_INODES and the code explicitly
> > requires "st" to be available in this case.
>
> Yes, but there is also an if (!st) { ... } in the subsequent patch
> already. So I don't see an issue here.
>
> Changing the order in readdir_many() would prevent the future
> applications of that function that I described. Or let's say, the
> order would need to be reverted back again later on.
>
> Best regards,
> Christian Schoenebeck