On Thu 22-04-21 14:51:42, Andreas Gruenbacher wrote:
> Hi Jan,
>
> On Thu, Apr 22, 2021 at 1:26 PM Jan Kara <[email protected]> wrote:
> > I am looking into how GFS2 protects against races between hole punching and
> > things like page fault or readahead and AFAICT it seems it does not. In
> > particular is there anything that protects against a race like:
> >
> > CPU1 CPU2
> > gfs2_fallocate()
> > __gfs2_punch_hole()
> > truncate_pagecache_range()
> > gfs2_fault()
> > - faults in old data into page
> > cache
> > punch_hole()
> >
> > And now we have stale data in the page cache (data corruption). If
> > gfs2_page_mkwrite() sneaked in that window as well, we might be even racing
> > with writeback and are possibly corrupting the filesystem on disk. Is there
> > anything I'm missing?
>
> This is controlled by the inode glock (ip->i_gl). Readers are holding
> a shared lock while writers are holding an exclusive lock, similar to
> an rwlock. Those locks take effect cluster wide (via DLM locks) and
> down to individual tasks locally. The only exception are resource
> group glocks, which use the LM_FLAG_NODE_SCOPE flag to indicate that
> exclusive locks should be shared locally. In that case, rgd->rd_mutex
> provides the exclusion among local tasks.
OK, thanks for explanation! I missed that GFS2 glocks are task local. But
then I have another question. We have the following:
gfs2_file_read_iter()
grabs inode glock in shared mode
generic_file_read_iter()
filemap_read()
copy_page_to_iter()
<page fault>
grabs mmap_sem
gfs2_fault()
- tries to grab inode glock again -> possible recursive locking
And even if the locking isn't recursive, you have glock->mmap_sem and
mmap_sem->glock lock orderings so ABBA deadlocks are possible AFAICT.
And there's a similar problem with the write path as well, just the lock is
grabbed exclusively.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR