[PATCHv1, RFC 28/33] ext4: handle huge pages in __ext4_block_zero_page_range()

2016-07-25 Thread Kirill A. Shutemov
As the function handles zeroing a range only within one block, the required changes are trivial: just remove the assumption on page size. Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git

[PATCHv1, RFC 20/33] thp: introduce hpage_size() and hpage_mask()

2016-07-25 Thread Kirill A. Shutemov
Introduce new helpers which return size/mask of the page: HPAGE_PMD_SIZE/HPAGE_PMD_MASK if the page is PageTransHuge() and PAGE_SIZE/PAGE_MASK otherwise. Signed-off-by: Kirill A. Shutemov --- include/linux/huge_mm.h | 16 1 file changed, 16
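
A minimal userspace sketch of what such helpers could look like, going only by the description above. The `sketch_` names, the `struct sketch_page` stand-in, and the x86-64 shift values (12 and 21) are assumptions for illustration; the real helpers take a `struct page` and test PageTransHuge().

```c
#include <stdbool.h>

/* Assumed x86-64 values; the kernel derives these per architecture. */
#define SKETCH_PAGE_SHIFT       12
#define SKETCH_PAGE_SIZE        (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK        (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_HPAGE_PMD_SHIFT  21
#define SKETCH_HPAGE_PMD_SIZE   (1UL << SKETCH_HPAGE_PMD_SHIFT)
#define SKETCH_HPAGE_PMD_MASK   (~(SKETCH_HPAGE_PMD_SIZE - 1))

/* Hypothetical stand-in for struct page; it only models the THP flag. */
struct sketch_page {
    bool trans_huge; /* stands in for PageTransHuge(page) */
};

/* Size in bytes covered by the page: huge-page size for a THP, base size otherwise. */
static inline unsigned long hpage_size_sketch(const struct sketch_page *page)
{
    return page->trans_huge ? SKETCH_HPAGE_PMD_SIZE : SKETCH_PAGE_SIZE;
}

/* Mask that rounds an address or offset down to the start of the page. */
static inline unsigned long hpage_mask_sketch(const struct sketch_page *page)
{
    return page->trans_huge ? SKETCH_HPAGE_PMD_MASK : SKETCH_PAGE_MASK;
}
```

With these in place, a caller can round an offset down with `offset & hpage_mask_sketch(page)` without first checking which kind of page it holds.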

[PATCHv1, RFC 21/33] fs: make block_read_full_page() be able to read huge page

2016-07-25 Thread Kirill A. Shutemov
The approach is straightforward: for compound pages we read out the whole huge page. For a huge page we cannot keep the array of buffer head pointers on the stack (it's 4096 pointers on x86-64), so 'arr' is allocated with kmalloc() for huge pages. Signed-off-by: Kirill A. Shutemov
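
The 4096 figure is consistent with a 2 MiB huge page divided into 512-byte blocks, which would be 32 KiB of pointers at 8 bytes each. The sketch below illustrates that arithmetic and the stack-versus-heap choice in userspace C. It is not the patch itself: the `sketch_` names are hypothetical, the sizes are assumed x86-64 values, and malloc() stands in for kmalloc().

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MIN_BLOCK_SIZE   512UL               /* assumed smallest block size */
#define SKETCH_PAGE_SIZE        4096UL              /* assumed x86-64 base page */
#define SKETCH_HPAGE_SIZE       (2UL * 1024 * 1024) /* assumed x86-64 huge page */
#define SKETCH_MAX_BUF_PER_PAGE (SKETCH_PAGE_SIZE / SKETCH_MIN_BLOCK_SIZE)

/* Worst-case number of buffer pointers needed for a page of the given size. */
static size_t max_bufs_sketch(size_t page_size)
{
    return page_size / SKETCH_MIN_BLOCK_SIZE;
}

/*
 * Pick the array to use: the caller's small on-stack array when it is large
 * enough, otherwise a heap allocation. *allocated tells the caller whether
 * free() is needed afterwards. Returns NULL if the allocation fails.
 */
static void **pick_buf_array_sketch(size_t page_size, void **stack_arr,
                                    int *allocated)
{
    size_t n = max_bufs_sketch(page_size);
    void **arr;

    *allocated = 0;
    if (n <= SKETCH_MAX_BUF_PER_PAGE)
        return stack_arr;

    arr = malloc(n * sizeof(*arr));
    if (arr)
        *allocated = 1;
    return arr;
}
```

A caller would declare `void *on_stack[SKETCH_MAX_BUF_PER_PAGE]`, pass it in, and call free() on the result only when `allocated` comes back as 1.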

[PATCHv1, RFC 26/33] ext4: make ext4_writepage() work on huge pages

2016-07-25 Thread Kirill A. Shutemov
Change ext4_writepage() and the underlying ext4_bio_write_page(). This basically removes the assumption on page size and infers it from struct page instead. Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 10 +- fs/ext4/page-io.c | 11 +-- 2 files

[PATCHv1, RFC 22/33] fs: make block_write_{begin,end}() be able to handle huge pages

2016-07-25 Thread Kirill A. Shutemov
It's more or less straightforward. Most changes are around getting the offset/len within the page right and zeroing out the desired part of the page. Signed-off-by: Kirill A. Shutemov --- fs/buffer.c | 53 +++-- 1 file

[PATCHv1, RFC 33/33] ext4, vfs: add huge= mount option

2016-07-25 Thread Kirill A. Shutemov
The same four values as in the tmpfs case. Signed-off-by: Kirill A. Shutemov --- fs/ext4/ext4.h | 5 + fs/ext4/inode.c | 26 +- fs/ext4/super.c | 19 +++ 3 files changed, 45 insertions(+), 5 deletions(-) diff --git

[PATCHv1, RFC 11/33] thp: allow splitting non-shmem file-backed THPs

2016-07-25 Thread Kirill A. Shutemov
split_huge_page() is ready to handle file-backed huge pages; we only need to remove one guarding VM_BUG_ON_PAGE(). Signed-off-by: Kirill A. Shutemov --- mm/huge_memory.c | 1 - 1 file changed, 1 deletion(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index

[PATCHv1, RFC 29/33] ext4: handle huge pages in ext4_da_write_end()

2016-07-25 Thread Kirill A. Shutemov
Call ext4_da_should_update_i_disksize() for the head page, with the offset relative to the head page. Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index

[PATCHv1, RFC 00/33] ext4: support of huge pages

2016-07-25 Thread Kirill A. Shutemov
Here's the first version of my patchset, which is intended to bring huge pages to ext4. It's not yet ready for applying or serious use, but it's good enough to show the approach. The basics are the same as with tmpfs[1], which is in the -mm tree, and ext4 is built on top of it. The main difference is that we need to

[PATCHv1, RFC 31/33] WIP: ext4: handle writeback with huge pages

2016-07-25 Thread Kirill A. Shutemov
Modify mpage_map_and_submit_buffers() to do writeback with huge pages. This is somewhat unstable. I have a hard time seeing the full picture yet. More work is required. Not-yet-signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 40

[PATCHv1, RFC 04/33] radix-tree: Add radix_tree_split

2016-07-25 Thread Kirill A. Shutemov
From: Matthew Wilcox This new function splits a larger multiorder entry into smaller entries (potentially multi-order entries). These entries are initialised to RADIX_TREE_RETRY to ensure that RCU walkers who see this state aren't confused. The caller should then call

[PATCHv1, RFC 19/33] mm: make write_cache_pages() work on huge pages

2016-07-25 Thread Kirill A. Shutemov
We write back a whole huge page at a time. Let's adjust the iteration accordingly. Signed-off-by: Kirill A. Shutemov --- include/linux/mm.h | 1 + include/linux/pagemap.h | 1 + mm/page-writeback.c | 17 - 3 files changed, 14 insertions(+), 5

[PATCHv1, RFC 12/33] truncate: make sure invalidate_mapping_pages() can discard huge pages

2016-07-25 Thread Kirill A. Shutemov
invalidate_inode_page() has an expectation about the page_count() of the page: if it's not 2 (one for the caller, one for the radix-tree), the page will not be dropped. That condition is almost never met for THPs, because tail pages are pinned to the pagevec. Let's drop them before calling invalidate_inode_page().

[PATCHv1, RFC 16/33] filemap: handle huge pages in filemap_fdatawait_range()

2016-07-25 Thread Kirill A. Shutemov
We write back a whole huge page at a time. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 5 + 1 file changed, 5 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index ad73b99c5ba7..3d46db277e73 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -371,9

[PATCHv1, RFC 24/33] truncate: make truncate_inode_pages_range() aware about huge pages

2016-07-25 Thread Kirill A. Shutemov
As with shmem_undo_range(), truncate_inode_pages_range() removes a huge page if it is fully within the range. A partial truncate of a huge page zeroes out that part of the THP. Unlike with shmem, this doesn't prevent us from having holes in the middle of a huge page: we can still skip writeback of untouched buffers. With

[PATCHv1, RFC 32/33] mm, fs, ext4: expand use of page_mapping() and page_to_pgoff()

2016-07-25 Thread Kirill A. Shutemov
With huge pages in page cache we see tail pages in more code paths. This patch replaces direct access to struct page fields with macros which can handle tail pages properly. Signed-off-by: Kirill A. Shutemov --- fs/buffer.c | 2 +- fs/ext4/inode.c |

[PATCHv1, RFC 18/33] HACK: block: bump BIO_MAX_PAGES

2016-07-25 Thread Kirill A. Shutemov
We are going to do IO a huge page at a time. For x86-64, that's 512 pages, so we need to double the current BIO_MAX_PAGES. To be portable to other architectures we need a more generic solution. Signed-off-by: Kirill A. Shutemov --- include/linux/bio.h | 2 +- 1 file
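
The 512 figure follows from a 2 MiB huge page divided into 4 KiB base pages. The sketch below works through that arithmetic and shows one hypothetical way to express the limit without hard-coding an x86-64 number. It is not what the patch does: the `sketch_` names are invented, the shift values are assumed x86-64 ones, and the old limit of 256 is inferred from "double" in the description.

```c
/* Assumed x86-64 values; other architectures use different shifts. */
#define SKETCH_PAGE_SHIFT        12
#define SKETCH_HPAGE_PMD_SHIFT   21

/* Number of base pages in one huge page: 2^(21 - 12) = 512. */
#define SKETCH_HPAGE_PMD_NR \
    (1U << (SKETCH_HPAGE_PMD_SHIFT - SKETCH_PAGE_SHIFT))

/* Inferred previous limit: doubling it gives the 512 pages mentioned above. */
#define SKETCH_OLD_BIO_MAX_PAGES 256U

/*
 * Hypothetical generic rule: a bio must be able to hold at least one whole
 * huge page, but the limit never shrinks below the old value.
 */
static unsigned int bio_max_pages_sketch(unsigned int old_max,
                                         unsigned int hpage_nr)
{
    return hpage_nr > old_max ? hpage_nr : old_max;
}
```

On the assumed x86-64 values this yields 512; an architecture whose huge page spans fewer base pages than the old limit would keep the old limit.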

[PATCHv1, RFC 06/33] radix-tree: Handle multiorder entries being deleted by replace_clear_tags

2016-07-25 Thread Kirill A. Shutemov
From: Matthew Wilcox radix_tree_replace_clear_tags() can be called with NULL as the replacement value; in this case we need to delete sibling entries which point to the slot. Signed-off-by: Matthew Wilcox Signed-off-by: Kirill A. Shutemov

[PATCHv1, RFC 08/33] Revert "radix-tree: implement radix_tree_maybe_preload_order()"

2016-07-25 Thread Kirill A. Shutemov
This reverts commit 356e1c23292a4f63cfdf1daf0e0ddada51f32de8. After conversion of huge tmpfs to multi-order entries, we don't need this anymore. Signed-off-by: Kirill A. Shutemov --- include/linux/radix-tree.h | 1 - lib/radix-tree.c | 74

Re: [PATCH v2 1/1] block: fix blk_queue_split() resource exhaustion

2016-07-25 Thread Jeff Moyer
Eric Wheeler writes: > [+cc Mikulas Patocka, Jeff Moyer; Do either of you have any input on Lars' > commentary related to patchwork #'s 9204125 and 7398411 and BZ#119841? ] Sorry, I don't have any time to look at this right now. Cheers, Jeff > > On Tue, 19 Jul