Re: [PATCH 00/24] huge tmpfs: an alternative approach to THPageCache

2015-03-22 Thread Hugh Dickins
On Mon, 23 Feb 2015, Kirill A. Shutemov wrote:
> 
> I scanned through the patches to get a general idea of how it works.

Thanks!

> I'm not
> sure that I will have the time and willpower to do proper code-digging before
> the summit. I found a few bugs in my patchset which I want to troubleshoot
> first.

Yes, I agree that should take priority.

> 
> One thing I'm not really comfortable with is introducing yet another way
> to couple pages together. It's less risky in the short term than my approach
> -- fewer existing codepaths affected -- but it raises the maintenance cost
> later. Not sure it's what we want.

Yes, I appreciate your reluctance to add another way of achieving the
same thing.  I still believe that compound pages were a wrong direction
for THP; but until I've posted an implementation of anon THP my way,
and you've posted an implementation of huge tmpfs your way, it's going
to be hard to compare the advantages and disadvantages of each, to
decide between them.

And (as we said at LSF/MM) we each have a priority to attend to before
that: I need to support page migration, and recovery of hugeness after
swap; and you your bugfixes.  (The only bug I've noticed in mine since
posting, a consequence of developing on an earlier release then not
reauditing pmd_trans, is that I need to relax your VM_BUG_ON_VMA in
mm/mremap.c move_page_tables().)

For now, huge tmpfs is giving us useful "transparent hugetlbfs"
functionality, and we're happy to continue developing it that way;
but can switch it over to compound pages, if they win the argument
without sacrificing too much.

> 
> After Johannes' work which added exceptional entries to the normal page cache
> I hoped to see the shmem/tmpfs implementation moving toward the generic page
> cache. But this patchset is a step in the other direction -- it makes
> shmem/tmpfs even more special-cased. :(

Well, Johannes's use for the exceptional entries was rather different
from tmpfs's.  I think tmpfs will always be a special case, and one
especially entitled to huge pages, and that does not distress me at
all - though I wasn't deaf to Chris Mason asking for huge pages too.

(I do wonder if Boaz and persistent memory and the dynamic 4k struct
pages discussion will overtake and re-inform both of our designs.)

> 
> Do you have any insights on how this approach applies to real filesystems?
> I don't think there's any show stopper, but better to ask early ;)

The not-quite-a-show-stopper is my use of page->private, as Konstantin
observes in other mail: I'll muse on that a little in replying to him.

Aside from the page->private issue, the changes outside of shmem.c
should be easily applicable to other filesystems, and some of them
perhaps already useful to you.

But frankly I've given next to no thought as to how easily the code
added in shmem.c could be moved out and used for others: tmpfs was
where we wanted it.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 00/24] huge tmpfs: an alternative approach to THPageCache

2015-02-23 Thread Kirill A. Shutemov
On Fri, Feb 20, 2015 at 07:49:16PM -0800, Hugh Dickins wrote:
> I warned last month that I have been working on "huge tmpfs":
> an implementation of Transparent Huge Page Cache in tmpfs,
> for those who are tired of the limitations of hugetlbfs.
> 
> Here's a fully working patchset, against v3.19 so that you can give it
> a try against a stable base.  I've not yet studied how well it applies
> to current git: probably lots of easily resolved clashes with nonlinear
> removal.  Against mmotm, the rmap.c differences looked nontrivial.
> 
> Fully working?  Well, at present page migration just keeps away from
> these teams of pages.  And once memory pressure has disbanded a team
> to swap it out, there is nothing to put it together again later on,
> to restore the original hugepage performance.  Those must follow,
> but no thought yet (khugepaged? maybe).
> 
> Yes, I realize there's nothing yet under Documentation, nor fs/proc
> beyond meminfo, nor other debug/visibility files: must follow, but
> I've cared more to provide the basic functionality.
> 
> I don't expect to update this patchset in the next few weeks: now that
> it's posted, my priority is to look at other people's work before LSF/MM;
> and in particular, of course, your (Kirill's) THP refcounting redesign.

I scanned through the patches to get a general idea of how it works. I'm not
sure that I will have the time and willpower to do proper code-digging before
the summit. I found a few bugs in my patchset which I want to troubleshoot
first.

One thing I'm not really comfortable with is introducing yet another way
to couple pages together. It's less risky in the short term than my approach
-- fewer existing codepaths affected -- but it raises the maintenance cost
later. Not sure it's what we want.

After Johannes' work which added exceptional entries to the normal page cache
I hoped to see the shmem/tmpfs implementation moving toward the generic page
cache. But this patchset is a step in the other direction -- it makes
shmem/tmpfs even more special-cased. :(

Do you have any insights on how this approach applies to real filesystems?
I don't think there's any show stopper, but better to ask early ;)

-- 
 Kirill A. Shutemov


[PATCH 00/24] huge tmpfs: an alternative approach to THPageCache

2015-02-20 Thread Hugh Dickins
I warned last month that I have been working on "huge tmpfs":
an implementation of Transparent Huge Page Cache in tmpfs,
for those who are tired of the limitations of hugetlbfs.

Here's a fully working patchset, against v3.19 so that you can give it
a try against a stable base.  I've not yet studied how well it applies
to current git: probably lots of easily resolved clashes with nonlinear
removal.  Against mmotm, the rmap.c differences looked nontrivial.

Fully working?  Well, at present page migration just keeps away from
these teams of pages.  And once memory pressure has disbanded a team
to swap it out, there is nothing to put it together again later on,
to restore the original hugepage performance.  Those must follow,
but no thought yet (khugepaged? maybe).

Yes, I realize there's nothing yet under Documentation, nor fs/proc
beyond meminfo, nor other debug/visibility files: must follow, but
I've cared more to provide the basic functionality.

I don't expect to update this patchset in the next few weeks: now that
it's posted, my priority is to look at other people's work before LSF/MM;
and in particular, of course, your (Kirill's) THP refcounting redesign.

01 mm: update_lru_size warn and reset bad lru_size
02 mm: update_lru_size do the __mod_zone_page_state
03 mm: use __SetPageSwapBacked and don't ClearPageSwapBacked
04 mm: make page migration's newpage handling more robust
05 tmpfs: preliminary minor tidyups
06 huge tmpfs: prepare counts in meminfo, vmstat and SysRq-m
07 huge tmpfs: include shmem freeholes in available memory counts
08 huge tmpfs: prepare huge=N mount option and /proc/sys/vm/shmem_huge
09 huge tmpfs: try to allocate huge pages, split into a team
10 huge tmpfs: avoid team pages in a few places
11 huge tmpfs: shrinker to migrate and free underused holes
12 huge tmpfs: get_unmapped_area align and fault supply huge page
13 huge tmpfs: extend get_user_pages_fast to shmem pmd
14 huge tmpfs: extend vma_adjust_trans_huge to shmem pmd
15 huge tmpfs: rework page_referenced_one and try_to_unmap_one
16 huge tmpfs: fix problems from premature exposure of pagetable
17 huge tmpfs: map shmem by huge page pmd or by page team ptes
18 huge tmpfs: mmap_sem is unlocked when truncation splits huge pmd
19 huge tmpfs: disband split huge pmds on race or memory failure
20 huge tmpfs: use Unevictable lru with variable hpage_nr_pages()
21 huge tmpfs: fix Mlocked meminfo, tracking huge and unhuge mlocks
22 huge tmpfs: fix Mapped meminfo, tracking huge and unhuge mappings
23 kvm: plumb return of hva when resolving page fault.
24 kvm: teach kvm to map page teams as huge pages.
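
Several of the patches above (09, 12, 17) turn on the relationship between
a huge page and the "team" of small pages it is split into.  The arithmetic
behind that can be sketched as below.  This is only an illustration under
stated assumptions, not code from the patchset: the 4 KiB and 2 MiB sizes
are the usual x86_64 values, and every SKETCH_/sketch_ name is hypothetical.

```c
#include <stdint.h>

/* Assumed x86_64 sizes; other architectures and configs differ. */
#define SKETCH_PAGE_SIZE   4096ULL
#define SKETCH_HPAGE_SIZE  (2ULL * 1024 * 1024)

/* Number of small pages that make up one huge page (one "team"). */
#define SKETCH_TEAM_PAGES  (SKETCH_HPAGE_SIZE / SKETCH_PAGE_SIZE)

/* Round an address up to the next huge page boundary (hypothetical helper). */
static uint64_t sketch_hpage_align_up(uint64_t addr)
{
	return (addr + SKETCH_HPAGE_SIZE - 1) & ~(SKETCH_HPAGE_SIZE - 1);
}

/* Position of the small page containing addr within its huge page. */
static uint64_t sketch_team_index(uint64_t addr)
{
	return (addr & (SKETCH_HPAGE_SIZE - 1)) / SKETCH_PAGE_SIZE;
}
```

With these assumed sizes a team has 512 members, and an address must sit
on a 2 MiB boundary before a single huge pmd could cover it.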

 arch/mips/mm/gup.c             |   17
 arch/powerpc/mm/pgtable_64.c   |    7
 arch/s390/mm/gup.c             |   22
 arch/sparc/mm/gup.c            |   22
 arch/x86/kvm/mmu.c             |  171 +++-
 arch/x86/kvm/paging_tmpl.h     |    6
 arch/x86/mm/gup.c              |   17
 drivers/base/node.c            |   20
 drivers/char/mem.c             |   23
 fs/proc/meminfo.c              |   17
 include/linux/huge_mm.h        |   18
 include/linux/kvm_host.h       |    2
 include/linux/memcontrol.h     |   11
 include/linux/mempolicy.h      |    6
 include/linux/migrate.h        |    3
 include/linux/mm.h             |   95 +-
 include/linux/mm_inline.h      |   24
 include/linux/mm_types.h       |    1
 include/linux/mmzone.h         |    5
 include/linux/page-flags.h     |    6
 include/linux/pageteam.h       |  289 +++
 include/linux/shmem_fs.h       |   21
 include/trace/events/migrate.h |    3
 ipc/shm.c                      |    6
 kernel/sysctl.c                |   12
 mm/balloon_compaction.c        |   10
 mm/compaction.c                |    6
 mm/filemap.c                   |   10
 mm/gup.c                       |   22
 mm/huge_memory.c               |  281 ++-
 mm/internal.h                  |   25
 mm/memcontrol.c                |   42 -
 mm/memory-failure.c            |    8
 mm/memory.c                    |  227 +++--
 mm/migrate.c                   |  139 +--
 mm/mlock.c                     |  181 ++--
 mm/mmap.c                      |   17
 mm/page-writeback.c            |    2
 mm/page_alloc.c                |   13
 mm/rmap.c                      |  207 +++--
 mm/shmem.c                     | 1235 +--
 mm/swap.c                      |    5
 mm/swap_state.c                |    3
 mm/truncate.c                  |    2
 mm/vmscan.c                    |   76 +
 mm/vmstat.c                    |    3
 mm/zswap.c                     |    3
 virt/kvm/kvm_main.c            |   24
 48 files changed, 2790 insertions(+), 575 deletions(-)