On Fri, Jan 29, 2021 at 07:37:03AM +0200, Lennert Buytenhek wrote: > > > > One open question is whether IORING_OP_GETDENTS64 should be more like > > > > pread(2) and allow passing in a starting offset to read from the > > > > directory from. (This would require some more surgery in fs/readdir.c.) > > > > > > Since directories are seekable this ought to work. > > > Modulo horrid issues with 32bit file offsets. > > > > The incremental patch below does this. (It doesn't apply cleanly on > > top of v1 of the IORING_OP_GETDENTS patch as I have other changes in > > my tree -- I'm including it just to illustrate the changes that would > > make this work.) > > > > This change seems to work, and makes IORING_OP_GETDENTS take an > > explicitly specified directory offset (instead of using the file's > > ->f_pos), making it more like pread(2) [...] > > ...but the fact that this patch avoids taking file->f_pos_lock (as this > proposed version of IORING_OP_GETDENTS avoids using file->f_pos) means > that ->iterate_shared() can then be called concurrently on the same > struct file, which breaks the mutual exclusion guarantees provided here. > > If possible, I'd like to keep the ability to explicitly pass in a > directory offset to start reading from into IORING_OP_GETDENTS, so > perhaps we can simply satisfy the mutual exclusion requirement by > taking ->f_pos_lock by hand -- but then I do need to check that it's OK > for ->iterate{,_shared}() to be called successively with discontinuous > offsets without ->llseek() being called in between. > > (If that's not OK, then we can either have IORING_OP_GETDENTS just > always start reading at ->f_pos like before (which might then require > adding a IORING_OP_GETDENTS2 at some point in the future if we'll > ever want to change that), or we could have IORING_OP_GETDENTS itself > call ->llseek() for now whenever necessary.)
Having IORING_OP_GETDENTS seek to sqe->off if needed seems easy enough to implement, and it simplifies the other code as well, so I'll send out a v2 RFC shortly that does this.