Thanks for this; GitHub works for review right now :-) I am occupied till Wednesday, but will review the changes by the end of this week. A glance at the changes looks good to me.
A few tests which can be run for validation are:

  tests/bugs/shard/bug-1696136-lru-limit-equals-deletion-rate.t
  tests/features/fuse-lru-limit.t
  tests/bugs/shard/shard-inode-refcount-test.t

Ideally, run the full regression with `./run-tests.sh -c`.

Regards,
Amar

On Mon, Nov 4, 2019 at 9:21 AM Changwei Ge <c...@linux.alibaba.com> wrote:
> Hi Amar,
>
> On 2019/10/31 6:30 PM, Amar Tumballi wrote:
> > On Wed, Oct 30, 2019 at 4:32 PM Xavi Hernandez <jaher...@redhat.com> wrote:
> > > Hi Changwei,
> > >
> > > On Tue, Oct 29, 2019 at 7:56 AM Changwei Ge <c...@linux.alibaba.com> wrote:
> > > > Hi,
> > > >
> > > > I am currently working on reducing inode_[un]ref() locking
> > > > contention by getting rid of the inode table lock, and using only
> > > > the inode lock to protect the inode REF. I have already discussed a
> > > > couple of rounds with several GlusterFS developers via email and
> > > > Gerrit, and have a basic understanding of the major logic involved.
> > > >
> > > > Currently, an inode's REF can reach ZERO and the inode can then be
> > > > reused by increasing it back to ONE. This is, IMO, why we have to
> > > > do so much work on the inode table during REF/UNREF: it makes
> > > > inode_[un]ref() and inode table and dentry (alias) searches hard to
> > > > run concurrently.
> > > >
> > > > So my question is: in what cases, and how, can we find an inode
> > > > whose REF is ZERO?
> > > >
> > > > As GlusterFS stores the inode's memory address in kernel/FUSE, can
> > > > we conclude that only fuse_ino_to_inode() can bring back a REF=0
> > > > inode?
> >
> > Xavi's answer below provides some insights. At the same time, assuming
> > that only fuse_ino_to_inode() can bring an inode back from the ref=0
> > state (for now) is a good start.
> >
> > > Yes, when an inode gets refs = 0, it means that gluster code is not
> > > using it anywhere, so it cannot be referenced again unless the
> > > kernel sends new requests on the same inode. Once refs=0 and
> > > nlookup=0, the inode can be destroyed.
> > > Inode code is quite complex right now and I haven't had time to
> > > investigate this further, but I think we could simplify inode
> > > management significantly (especially unref) if we add a reference
> > > when nlookup becomes > 0, and remove a reference when nlookup
> > > becomes 0 again. Maybe with this approach we could avoid the inode
> > > table lock in many cases. However, we need to make sure we correctly
> > > handle the invalidation logic to keep the inode table size under
> > > control.
> >
> > My suggestion is: don't wait for a complete solution before posting
> > the patch. Let us get a chance to look at work-in-progress patches, so
> > we can have discussions on the code itself. It would help us reach
> > better solutions sooner.
>
> Agree.
>
> I have almost implemented my draft design for this experiment. The
> immature code has been pushed to my personal GlusterFS repo [1].
>
> Right now it is a single large patch; I will split it into smaller
> patches when I push it to Gerrit, for review convenience. If you prefer
> that I push it to Gerrit for early review and discussion, I can do
> that :-). But I am still doing some debugging.
>
> My work includes:
>
> 1. Move logic unrelated to inode refing/unrefing out of
>    `__inode_[un]ref()`, and hence reduce their arguments.
> 2. Add a dedicated 'ref_lock' to the inode to keep ref/unref atomic.
> 3. As `inode_table::active_size` is only used for debugging purposes,
>    convert it to an atomic variable.
> 4. Factor out inode pruning.
> 5. In order to let inode search and grep run concurrently, first take
>    an RDLOCK and then convert it to a WRLOCK if necessary.
> 6. The inode table lock is not necessary for inode ref/unref unless we
>    have to move the inode between table lists.
>
> etc...
>
> Any comments, ideas, and suggestions are kindly welcomed.
> Thanks,
> Changwei
>
> [1]:
> https://github.com/changweige/glusterfs/commit/d7226d2458281212af19ec8c2ca3d8c8caae1330
>
> > > Regards,
> > >
> > > Xavi
> > >
> > > > Thanks,
> > > > Changwei
> > > > _______________________________________________
> > > >
> > > > Community Meeting Calendar:
> > > >
> > > > APAC Schedule -
> > > > Every 2nd and 4th Tuesday at 11:30 AM IST
> > > > Bridge: https://bluejeans.com/118564314
> > > >
> > > > NA/EMEA Schedule -
> > > > Every 1st and 3rd Tuesday at 01:00 PM EDT
> > > > Bridge: https://bluejeans.com/118564314
> > > >
> > > > Gluster-devel mailing list
> > > > Gluster-devel@gluster.org
> > > > https://lists.gluster.org/mailman/listinfo/gluster-devel