On Mon 11-02-19 13:59:24, Linux Upstream wrote:
> >
> >> Signed-off-by: Chintan Pandya <chintan.pan...@oneplus.com>
> >
> > NAK.
> >
> > This is bound to regress some stuff. Now agreed that using non-atomic
> > ops is tricky, but many are in places where we 'know' there can't be
> > concurrency.
> >
> > If you can show any single one is wrong, we can fix that one, but we're
> > not going to blanket remove all this just because.
>
> Not quite familiar with the stack below, but from the crash dump I found
> that this was another stack running on some other CPU at the same time,
> which also updates the page cache LRU and manipulates locks.
>
> [84415.344577] [20190123_21:27:50.786264]@1 preempt_count_add+0xdc/0x184
> [84415.344588] [20190123_21:27:50.786276]@1 workingset_refault+0xdc/0x268
> [84415.344600] [20190123_21:27:50.786288]@1 add_to_page_cache_lru+0x84/0x11c
> [84415.344612] [20190123_21:27:50.786301]@1 ext4_mpage_readpages+0x178/0x714
> [84415.344625] [20190123_21:27:50.786313]@1 ext4_readpages+0x50/0x60
> [84415.344636] [20190123_21:27:50.786324]@1 __do_page_cache_readahead+0x16c/0x280
> [84415.344646] [20190123_21:27:50.786334]@1 filemap_fault+0x41c/0x588
> [84415.344655] [20190123_21:27:50.786343]@1 ext4_filemap_fault+0x34/0x50
> [84415.344664] [20190123_21:27:50.786353]@1 __do_fault+0x28/0x88
>
> Not entirely sure if it's racing with the crashing stack or if it simply
> overrides the bit set by case 2 (mentioned in 0/2).

So this is interesting. Looking at __add_to_page_cache_locked(), nothing
seems to prevent __SetPageLocked(page) in add_to_page_cache_lru() from
being reordered into __add_to_page_cache_locked(), after the page is
actually added to the xarray. So that one particular instance might benefit
from an atomic SetPageLocked() or a barrier somewhere between
__SetPageLocked() and the actual addition of the entry into the xarray.

								Honza
-- 
Jan Kara <j...@suse.com>
SUSE Labs, CR
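
To illustrate the ordering concern above, here is a minimal userspace
sketch using C11 atomics. It is an analogy only, not kernel code: the
demo_* names, the single-slot "xarray" and the flag bit are all
hypothetical stand-ins, and it makes no claim about which ordering the
real xarray store path already provides. It shows a plain (non-atomic)
flag store followed by a release store that publishes the page, so that a
reader which finds the page is guaranteed to also see the flag.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PG_locked bit; not the kernel definition. */
#define DEMO_PG_LOCKED (1UL << 0)

/* Hypothetical stand-in for struct page: just a plain flags word. */
struct demo_page {
	unsigned long flags;
};

/* Stand-in for a single xarray slot that lockless readers may look up. */
static _Atomic(struct demo_page *) demo_slot;

/*
 * Publisher: set the lock bit with a plain store (in the spirit of
 * __SetPageLocked()), then publish the page. The release store keeps the
 * flags store from being reordered after the publication.
 */
static void demo_add_page(struct demo_page *page)
{
	page->flags |= DEMO_PG_LOCKED;	/* non-atomic read-modify-write */
	atomic_store_explicit(&demo_slot, page, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store above, so a
 * reader that observes the page also observes the lock bit.
 */
static bool demo_lookup_sees_locked(void)
{
	struct demo_page *page =
		atomic_load_explicit(&demo_slot, memory_order_acquire);

	if (page == NULL)
		return false;
	return (page->flags & DEMO_PG_LOCKED) != 0;
}
```

With a relaxed store in demo_add_page() instead, a reader on another CPU
could in principle find the page before the flag store becomes visible,
which is the kind of window described above.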