Hi Ben, Al,

On 09/10/2013 12:02 AM, Benjamin LaHaise wrote:
> Hi Al, Gu,
>
> I've added this patch to my tree at git://git.kvack.org/~bcrl/aio-next.git
> to fix the get_user_pages() issue introduced by Gu's changes in the page
> migration patch.  Thanks Al for spotting this.

Thanks very much for spotting and fixing this issue.

Best regards,
Gu

>
> -ben
>
> commit d6c355c7dabcd753a75bc77d150d36328a355267
> Author: Benjamin LaHaise <b...@kvack.org>
> Date:   Mon Sep 9 11:57:59 2013 -0400
>
>     aio: fix race in ring buffer page lookup introduced by page migration
>     support
>
>     Prior to the introduction of page migration support in "fs/aio: Add support
>     to aio ring pages migration" / 36bc08cc01709b4a9bb563b35aa530241ddc63e3,
>     mapping of the ring buffer pages was done via get_user_pages() while
>     retaining mmap_sem held for write.  This avoided possible races with
>     userland racing an munmap() or mremap().  The page migration patch,
>     however, switched to using mm_populate() to prime the page mapping.
>     mm_populate() cannot be called with mmap_sem held.
>
>     Instead of dropping the mmap_sem, revert to the old behaviour and simply
>     drop the use of mm_populate() since get_user_pages() will cause the pages
>     to get mapped anyways.  Thanks to Al Viro for spotting this issue.
>
>     Signed-off-by: Benjamin LaHaise <b...@kvack.org>
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 6e26755..f4a27af 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -307,16 +307,25 @@ static int aio_setup_ring(struct kioctx *ctx)
>  		aio_free_ring(ctx);
>  		return -EAGAIN;
>  	}
> -	up_write(&mm->mmap_sem);
> -
> -	mm_populate(ctx->mmap_base, populate);
>
>  	pr_debug("mmap address: 0x%08lx\n", ctx->mmap_base);
> +
> +	/* We must do this while still holding mmap_sem for write, as we
> +	 * need to be protected against userspace attempting to mremap()
> +	 * or munmap() the ring buffer.
> +	 */
>  	ctx->nr_pages = get_user_pages(current, mm, ctx->mmap_base, nr_pages,
>  				       1, 0, ctx->ring_pages, NULL);
> +
> +	/* Dropping the reference here is safe as the page cache will hold
> +	 * onto the pages for us.  It is also required so that page migration
> +	 * can unmap the pages and get the right reference count.
> +	 */
>  	for (i = 0; i < ctx->nr_pages; i++)
>  		put_page(ctx->ring_pages[i]);
>
> +	up_write(&mm->mmap_sem);
> +
>  	if (unlikely(ctx->nr_pages != nr_pages)) {
>  		aio_free_ring(ctx);
>  		return -EAGAIN;
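For readers piecing the hunks together, the resulting order of operations in aio_setup_ring() after this patch is: look up the ring pages and drop the extra references while mmap_sem is still held for write, and only then release the semaphore. This is a schematic sketch reconstructed from the diff above, not the verbatim kernel source; the surrounding setup (do_mmap_pgoff() populating ctx->mmap_base, error paths, etc.) is elided:

```c
/* Schematic: aio_setup_ring() locking order after the fix.
 * mmap_sem is taken for write before the ring mapping is created
 * (elided here) and is NOT dropped before get_user_pages(), so
 * userspace cannot race in with mremap()/munmap() on the ring.
 */
ctx->nr_pages = get_user_pages(current, mm, ctx->mmap_base, nr_pages,
			       1, 0, ctx->ring_pages, NULL);

/* Safe to drop these references: the page cache still holds the
 * pages, and page migration needs to see the expected refcount. */
for (i = 0; i < ctx->nr_pages; i++)
	put_page(ctx->ring_pages[i]);

/* Only now is it safe to let userspace at the mapping. */
up_write(&mm->mmap_sem);
```

The key design point, per the commit message, is that mm_populate() became unnecessary once get_user_pages() runs under the semaphore, since get_user_pages() faults the pages in as a side effect of looking them up.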