On Fri, Sep 6, 2013 at 8:01 PM, Al Viro <v...@zeniv.linux.org.uk> wrote: > > * plain seqretry loop (d_lookup(), is_subdir(), autofs4_getpath(), > ceph_misc_build_path(), [cifs] build_path_from_dentry(), nfs_path(), > [audit] handle_path()) > * try seqretry once, then switch to write_seqlock() (the things > that got unified into d_walk()) > * try seqretry three times, then switch to write_seqlock() (d_path() > and friends) > * several pure write_seqlock() users (d_move(), d_set_mounted(), > d_materialize_unique()) > > The last class is not a problem - these we want as writers. I really don't > like the way the rest is distributed - if nothing else, nfs_path() and > friends are in exactly the same situation as d_path(). Moreover, why > the distinction between "try once" and "try thrice"? > > _If_ we fold the second and the third groups together (and probably have > a bunch from the first one join that), we at least get something > understandable, but the I really wonder if seqlock has the right calling > conventions for that (and at least I'd like to fold the "already got > writelock" > flag into seq - we do have a spare bit there). > > Comments?
So I think we could make a more complicated data structure that looks something like this: struct seqlock_retry { unsigned int seq_no; int state; }; and pass that around. Gcc should do pretty well, especially if we inline things (but even if not, small structures that fit in 64 bytes generate reasonable code even on 32-bit targets, because gcc knows about using two registers for passing data around).. Then you can make "state" have a retry counter in it, and have a negative value mean "I hold the lock for writing". Add a couple of helper functions, and you can fairly easily handle the mixed "try for reading first, then fall back to writing". That said, __d_lookup() still shows up as very performance-critical on some loads (symlinks in particular cause us to fall out of the RCU cases) so I'd like to keep that using the simple pure read case. I don't believe you can livelock it, as mentioned. But the other ones might well be worth moving to a "fall back to write-locking after <n> tries" model. They might all traverse user-specified paths of fairly arbitrary depth, no? So this "seqlock_retry" thing wouldn't _replace_ bare seqlocks, it would just be a helper thing for this kind of behavior where we want to normally do things with just the read-lock, but want to guarantee that we don't live-lock. Sounds reasonable? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/