On Thu 24-05-18 11:38:27, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.

Just one objection: Why does stable care about this (and the previous
patch)? I've checked the stable queue and I don't see anything that would
have these patches as a prerequisite. And on their own, they are only
cleanups without substantial gains.

                                                                Honza

> 
> ------------------
> 
> From: Mel Gorman <mgor...@techsingularity.net>
> 
> commit ebded02788b5d7c7600f8cff26ae07896d568649 upstream.
> 
> In the generic read paths the kernel looks up a page in the page cache
> and if it's up to date, it is used.  If not, the page lock is acquired
> to wait for IO to complete and then check the page.  If multiple
> processes are waiting on IO, they all serialise against the lock and
> duplicate the checks.  This is unnecessary.
> 
> The page lock in itself does not give any guarantees to the callers
> about the page state as it can be immediately truncated or reclaimed
> after the page is unlocked.  It's sufficient to wait_on_page_locked and
> then continue if the page is up to date on wakeup.
> 
> It is possible that a truncated but up-to-date page is returned but the
> reference taken during read prevents it disappearing underneath the
> caller and the data is still valid if PageUptodate.
> 
> The overall impact is small as even if processes serialise on the lock,
> the lock section is tiny once the IO is complete.  Profiles indicated
> that unlock_page and friends are generally a tiny portion of a
> read-intensive workload.  An artificial test was created that had
> instances of dd access a cache-cold file on an ext4 filesystem and
> measure how long the read took.
> 
> paralleldd
>                                     4.4.0                 4.4.0
>                                   vanilla             avoidlock
> Amean    Elapsd-1          5.28 (  0.00%)        5.15 (  2.50%)
> Amean    Elapsd-4          5.29 (  0.00%)        5.17 (  2.12%)
> Amean    Elapsd-7          5.28 (  0.00%)        5.18 (  1.78%)
> Amean    Elapsd-12         5.20 (  0.00%)        5.33 ( -2.50%)
> Amean    Elapsd-21         5.14 (  0.00%)        5.21 ( -1.41%)
> Amean    Elapsd-30         5.30 (  0.00%)        5.12 (  3.38%)
> Amean    Elapsd-48         5.78 (  0.00%)        5.42 (  6.21%)
> Amean    Elapsd-79         6.78 (  0.00%)        6.62 (  2.46%)
> Amean    Elapsd-110        9.09 (  0.00%)        8.99 (  1.15%)
> Amean    Elapsd-128       10.60 (  0.00%)       10.43 (  1.66%)
> 
> The impact is small but intuitively, it makes sense to avoid unnecessary
> calls to lock_page.
> 
> Signed-off-by: Mel Gorman <mgor...@techsingularity.net>
> Reviewed-by: Jan Kara <j...@suse.cz>
> Cc: Hugh Dickins <hu...@google.com>
> Signed-off-by: Andrew Morton <a...@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
> Signed-off-by: Mel Gorman <mgor...@suse.de>
> Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>
> 
> ---
>  mm/filemap.c |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 49 insertions(+)
> 
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1581,6 +1581,15 @@ find_page:
>                                       index, last_index - index);
>               }
>               if (!PageUptodate(page)) {
> +                     /*
> +                      * See comment in do_read_cache_page on why
> +                      * wait_on_page_locked is used to avoid unnecessarily
> +                      * serialisations and why it's safe.
> +                      */
> +                     wait_on_page_locked_killable(page);
> +                     if (PageUptodate(page))
> +                             goto page_ok;
> +
>                       if (inode->i_blkbits == PAGE_CACHE_SHIFT ||
>                                       !mapping->a_ops->is_partially_uptodate)
>                               goto page_not_up_to_date;
> @@ -2253,12 +2262,52 @@ filler:
>       if (PageUptodate(page))
>               goto out;
>  
> +     /*
> +      * Page is not up to date and may be locked due one of the following
> +      * case a: Page is being filled and the page lock is held
> +      * case b: Read/write error clearing the page uptodate status
> +      * case c: Truncation in progress (page locked)
> +      * case d: Reclaim in progress
> +      *
> +      * Case a, the page will be up to date when the page is unlocked.
> +      *    There is no need to serialise on the page lock here as the page
> +      *    is pinned so the lock gives no additional protection. Even if the
> +      *    the page is truncated, the data is still valid if PageUptodate as
> +      *    it's a race vs truncate race.
> +      * Case b, the page will not be up to date
> +      * Case c, the page may be truncated but in itself, the data may still
> +      *    be valid after IO completes as it's a read vs truncate race. The
> +      *    operation must restart if the page is not uptodate on unlock but
> +      *    otherwise serialising on page lock to stabilise the mapping gives
> +      *    no additional guarantees to the caller as the page lock is
> +      *    released before return.
> +      * Case d, similar to truncation. If reclaim holds the page lock, it
> +      *    will be a race with remove_mapping that determines if the mapping
> +      *    is valid on unlock but otherwise the data is valid and there is
> +      *    no need to serialise with page lock.
> +      *
> +      * As the page lock gives no additional guarantee, we optimistically
> +      * wait on the page to be unlocked and check if it's up to date and
> +      * use the page if it is. Otherwise, the page lock is required to
> +      * distinguish between the different cases. The motivation is that we
> +      * avoid spurious serialisations and wakeups when multiple processes
> +      * wait on the same page for IO to complete.
> +      */
> +     wait_on_page_locked(page);
> +     if (PageUptodate(page))
> +             goto out;
> +
> +     /* Distinguish between all the cases under the safety of the lock */
>       lock_page(page);
> +
> +     /* Case c or d, restart the operation */
>       if (!page->mapping) {
>               unlock_page(page);
>               page_cache_release(page);
>               goto repeat;
>       }
> +
> +     /* Someone else locked and filled the page in a very small window */
>       if (PageUptodate(page)) {
>               unlock_page(page);
>               goto out;
> 
> 
-- 
Jan Kara <j...@suse.com>
SUSE Labs, CR

Reply via email to