So, the way this is supposed to work is that taking a ref when the refcount is 1 is always an LRU_REQ_INITIAL ref, which moves the entry to the MRU. After that, further refs don't move it around in the queue; they just increment the refcount. This should hold because mdcache_new_entry() and mdcache_find_keyed() both take an INITIAL ref, and all other refs require you to already have a pointer to the entry (and therefore a ref).
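
To make the intended behaviour concrete, here is a minimal single-threaded sketch. It is a toy model, not the Ganesha code: struct entry, struct queue, enum ref_kind, q_remove(), q_push_mru() and toy_ref() are all hypothetical names invented for illustration, and the real code does this under the lane's QLOCK.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in Ganesha. */
enum ref_kind { REF_INITIAL, REF_PLAIN };

struct entry {
	int refcnt;                 /* 1 means only the queue itself holds a ref */
	struct entry *prev, *next;  /* prev is toward the LRU, next toward the MRU */
};

struct queue {
	struct entry *lru;          /* head: next candidate for reaping */
	struct entry *mru;          /* tail: most recently used */
};

/* Unlink e from the queue, fixing up the head/tail pointers as needed. */
static void q_remove(struct queue *q, struct entry *e)
{
	if (e->prev)
		e->prev->next = e->next;
	else
		q->lru = e->next;
	if (e->next)
		e->next->prev = e->prev;
	else
		q->mru = e->prev;
	e->prev = e->next = NULL;
}

/* Append e at the MRU end of the queue. */
static void q_push_mru(struct queue *q, struct entry *e)
{
	e->prev = q->mru;
	e->next = NULL;
	if (q->mru)
		q->mru->next = e;
	else
		q->lru = e;
	q->mru = e;
}

/* An INITIAL ref bumps the count and moves the entry to the MRU;
 * any other ref only bumps the count and leaves the position alone. */
static void toy_ref(struct queue *q, struct entry *e, enum ref_kind kind)
{
	e->refcnt++;
	if (kind == REF_INITIAL) {
		q_remove(q, e);
		q_push_mru(q, e);
	}
}
```

In this model, an entry sitting at the LRU with refcnt > 1 can only arise if some ref was taken that was not an INITIAL ref, which is the situation described in the quoted message below.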

Can you try something, since you have a reproducer? It seems that, with 1.7 million files, 17 lanes may be a bit low. Can you try with something ridiculously large, like 1023, and see if that makes a difference?

I suspect we'll have to add logic to move further down the lanes if futility hits.
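
Roughly what I have in mind, as a sketch only. This is not lru_reap_impl(); struct entry, struct lane, toy_reap() and the max_depth parameter are hypothetical names for illustration, and locking is left out entirely. With max_depth of 1 it only looks at the head of each lane, so a pinned head makes the whole reap fail; a larger max_depth lets it walk past pinned entries.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in Ganesha. */
struct entry {
	int refcnt;           /* 1 means idle: only the queue holds a ref */
	struct entry *next;   /* next entry toward the MRU end */
};

struct lane {
	struct entry *lru;    /* head of this lane's queue */
};

/* Look at up to max_depth entries from the LRU end of each lane.
 * Unlink and return the first idle entry found, or NULL if every
 * entry examined is still referenced. */
static struct entry *toy_reap(struct lane *lanes, size_t nlanes,
			      size_t max_depth)
{
	for (size_t i = 0; i < nlanes; i++) {
		/* link always points at the pointer that leads to *link */
		struct entry **link = &lanes[i].lru;

		for (size_t d = 0; d < max_depth && *link != NULL; d++) {
			struct entry *e = *link;

			if (e->refcnt == 1) {
				*link = e->next;   /* unlink from the lane */
				e->next = NULL;
				return e;
			}
			link = &e->next;           /* skip the pinned entry */
		}
	}
	return NULL;
}
```

How deep to go, and whether to do it always or only once futility kicks in, is the open design question; the sketch just takes the depth as a parameter.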

Daniel

On 04/02/2018 12:30 PM, Pradeep wrote:
We discussed this a while ago. I'm running into this again with 2.6.0. Here is a snapshot of the lru_state (I set the max entries to 10):

{entries_hiwat = 200000, entries_used = 1772870, chunks_hiwat = 100000, chunks_used = 16371, lru_reap_l1 = 8116842, lru_reap_l2 = 1637334, lru_reap_failed = 1637334, attr_from_cache = 31917512, attr_from_cache_for_client = 5975849, fds_system_imposed = 1048576, fds_hard_limit = 1038090, fds_hiwat = 943718, fds_lowat = 524288, futility = 0, per_lane_work = 50, biggest_window = 419430, prev_fd_count = 0, prev_time = 1522647830, caching_fds = true}

As you can see, it has grown well beyond the limit set (1.7 million vs. 200K max size). lru_reap_failed indicates the number of times the reap failed from L1 and L2. I'm wondering what can cause the reap to fail once it reaches a steady state. It appears to me that the entry at the LRU (head of the queue) is actually being used (refcnt > 1) and there are entries further down the queue with refcnt == 1, but those are not being looked at. My understanding is that if an entry is accessed, it must move to the MRU (tail of the queue). Any idea why the entry at the LRU can have a refcnt > 1?

This can happen if the refcnt is incremented without holding QLOCK: if lru_reap_impl() is called at the same time from another thread, it will skip the first entry and return NULL. This was done in _mdcache_lru_ref(), which could cause the refcnt on the head of the queue to be incremented while some other thread looks at it holding QLOCK. I tried moving the increment/dequeue in _mdcache_lru_ref() inside QLOCK, but that did not help.

Also, if "get_ref()" is called for the entry at the LRU for some reason, it will just increment the refcnt and return. I think the assumption is that by the time "get_ref()" is called, the entry is supposed to be out of the LRU.


Thanks,
Pradeep

