Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/21/2016 8:22 PM, Matthew Wilcox wrote: On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote: On 4/21/2016 4:21 PM, Mike Kravetz wrote: Might want to keep the future possibility of PUD_SIZE THP in mind? Yes, this is why the func name does not say 'pmd'. It can be extended to support PUD_SIZE in future. Sure ... but what does that look like? I think it should look a little like this: Yes, I had something similar in mind, too. Do you want me to use this version without the call with PUD_SIZE? unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len, loff_t off, unsigned long flags, unsigned long size); { unsigned long addr; loff_t off_end = off + len; loff_t off_align = round_up(off, size); unsigned long len_size; if ((off_end <= off_align) || ((off_end - off_align) < size)) return NULL; len_size = len + size; if ((len_size < len) || (off + len_size) < off) return NULL; addr = current->mm->get_unmapped_area(filp, NULL, len_size, off >> PAGE_SHIFT, flags); if (IS_ERR_VALUE(addr)) return NULL; addr += (off - addr) & (size - 1); return addr; } unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off = (loff_t)pgoff << PAGE_SHIFT; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE); if (addr) return addr; addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE); if (addr) return addr; out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } By the way, I added an extra check here, when we add len and size (PMD_SIZE in the original), we need to make sure that doesn't wrap. NB: I'm not even compiling these suggestions, just throwing them out here as ideas to be criticised. Yes, I agree with the extra check. Thanks for pointing this out. Also, len_size is a stupid name, but I can't think of a better one. How about len_pad? Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/21/2016 8:22 PM, Matthew Wilcox wrote: On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote: On 4/21/2016 4:21 PM, Mike Kravetz wrote: Might want to keep the future possibility of PUD_SIZE THP in mind? Yes, this is why the func name does not say 'pmd'. It can be extended to support PUD_SIZE in future. Sure ... but what does that look like? I think it should look a little like this: Yes, I had something similar in mind, too. Do you want me to use this version without the call with PUD_SIZE? unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len, loff_t off, unsigned long flags, unsigned long size); { unsigned long addr; loff_t off_end = off + len; loff_t off_align = round_up(off, size); unsigned long len_size; if ((off_end <= off_align) || ((off_end - off_align) < size)) return NULL; len_size = len + size; if ((len_size < len) || (off + len_size) < off) return NULL; addr = current->mm->get_unmapped_area(filp, NULL, len_size, off >> PAGE_SHIFT, flags); if (IS_ERR_VALUE(addr)) return NULL; addr += (off - addr) & (size - 1); return addr; } unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off = (loff_t)pgoff << PAGE_SHIFT; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE); if (addr) return addr; addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE); if (addr) return addr; out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } By the way, I added an extra check here, when we add len and size (PMD_SIZE in the original), we need to make sure that doesn't wrap. NB: I'm not even compiling these suggestions, just throwing them out here as ideas to be criticised. Yes, I agree with the extra check. Thanks for pointing this out. Also, len_size is a stupid name, but I can't think of a better one. How about len_pad? Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote: > On 4/21/2016 4:21 PM, Mike Kravetz wrote: > >Might want to keep the future possibility of PUD_SIZE THP in mind? > > Yes, this is why the func name does not say 'pmd'. It can be extended to > support > PUD_SIZE in future. Sure ... but what does that look like? I think it should look a little like this: unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len, loff_t off, unsigned long flags, unsigned long size); { unsigned long addr; loff_t off_end = off + len; loff_t off_align = round_up(off, size); unsigned long len_size; if ((off_end <= off_align) || ((off_end - off_align) < size)) return NULL; len_size = len + size; if ((len_size < len) || (off + len_size) < off) return NULL; addr = current->mm->get_unmapped_area(filp, NULL, len_size, off >> PAGE_SHIFT, flags); if (IS_ERR_VALUE(addr)) return NULL; addr += (off - addr) & (size - 1); return addr; } unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off = (loff_t)pgoff << PAGE_SHIFT; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE); if (addr) return addr; addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE); if (addr) return addr; out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } By the way, I added an extra check here, when we add len and size (PMD_SIZE in the original), we need to make sure that doesn't wrap. NB: I'm not even compiling these suggestions, just throwing them out here as ideas to be criticised. Also, len_size is a stupid name, but I can't think of a better one.
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Thu, Apr 21, 2016 at 07:43:39PM -0400, Toshi Kani wrote: > On 4/21/2016 4:21 PM, Mike Kravetz wrote: > >Might want to keep the future possibility of PUD_SIZE THP in mind? > > Yes, this is why the func name does not say 'pmd'. It can be extended to > support > PUD_SIZE in future. Sure ... but what does that look like? I think it should look a little like this: unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len, loff_t off, unsigned long flags, unsigned long size); { unsigned long addr; loff_t off_end = off + len; loff_t off_align = round_up(off, size); unsigned long len_size; if ((off_end <= off_align) || ((off_end - off_align) < size)) return NULL; len_size = len + size; if ((len_size < len) || (off + len_size) < off) return NULL; addr = current->mm->get_unmapped_area(filp, NULL, len_size, off >> PAGE_SHIFT, flags); if (IS_ERR_VALUE(addr)) return NULL; addr += (off - addr) & (size - 1); return addr; } unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off = (loff_t)pgoff << PAGE_SHIFT; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ addr = __thp_get_unmapped_area(filp, len, off, flags, PUD_SIZE); if (addr) return addr; addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE); if (addr) return addr; out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } By the way, I added an extra check here, when we add len and size (PMD_SIZE in the original), we need to make sure that doesn't wrap. NB: I'm not even compiling these suggestions, just throwing them out here as ideas to be criticised. Also, len_size is a stupid name, but I can't think of a better one.
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/21/2016 4:21 PM, Mike Kravetz wrote: On 04/21/2016 12:06 AM, Matthew Wilcox wrote: On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: How about moving the function (as is) to mm/huge_memory.c, rename it to get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h when TRANSPARENT_HUGEPAGE is unset? Great idea. Perhaps it should look something like this? unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { Might want to keep the future possibility of PUD_SIZE THP in mind? Yes, this is why the func name does not say 'pmd'. It can be extended to support PUD_SIZE in future. Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/21/2016 4:21 PM, Mike Kravetz wrote: On 04/21/2016 12:06 AM, Matthew Wilcox wrote: On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: How about moving the function (as is) to mm/huge_memory.c, rename it to get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h when TRANSPARENT_HUGEPAGE is unset? Great idea. Perhaps it should look something like this? unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { Might want to keep the future possibility of PUD_SIZE THP in mind? Yes, this is why the func name does not say 'pmd'. It can be extended to support PUD_SIZE in future. Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/21/2016 3:06 AM, Matthew Wilcox wrote: On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: How about moving the function (as is) to mm/huge_memory.c, rename it to get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h when TRANSPARENT_HUGEPAGE is unset? Great idea. Perhaps it should look something like this? Yes, it looks good! I will use it. :-) unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off, off_end, off_pmd; unsigned long len_pmd, addr_pmd; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ off = (loff_t)pgoff << PAGE_SHIFT; off_end = off + len; off_pmd = round_up(off, PMD_SIZE); /* pmd-aligned start offset */ if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE)) goto out; len_pmd = len + PMD_SIZE; if ((off + len_pmd) < off) goto out; addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd, pgoff, flags); if (!IS_ERR_VALUE(addr_pmd)) { addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1); return addr_pmd; } out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } - I deleted the check for filp == NULL. It can't be NULL ... this is a file_operation ;-) Right. - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)? The length is padded with an extra-PMD size so that any assigned address 'addr_pmd' can be aligned by PMD. IOW, it does not make an assumption that addr_pmd is aligned by the length. - I'm still in two minds about passing 'addr' to the first call to get_unmapped_area() instead of NULL. When 'addr' is specified, we need to use 'len' since user may be managing free VMA range by itself. So, I think falling back with the original args is correct. Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/21/2016 3:06 AM, Matthew Wilcox wrote: On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: How about moving the function (as is) to mm/huge_memory.c, rename it to get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h when TRANSPARENT_HUGEPAGE is unset? Great idea. Perhaps it should look something like this? Yes, it looks good! I will use it. :-) unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off, off_end, off_pmd; unsigned long len_pmd, addr_pmd; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ off = (loff_t)pgoff << PAGE_SHIFT; off_end = off + len; off_pmd = round_up(off, PMD_SIZE); /* pmd-aligned start offset */ if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE)) goto out; len_pmd = len + PMD_SIZE; if ((off + len_pmd) < off) goto out; addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd, pgoff, flags); if (!IS_ERR_VALUE(addr_pmd)) { addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1); return addr_pmd; } out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } - I deleted the check for filp == NULL. It can't be NULL ... this is a file_operation ;-) Right. - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)? The length is padded with an extra-PMD size so that any assigned address 'addr_pmd' can be aligned by PMD. IOW, it does not make an assumption that addr_pmd is aligned by the length. - I'm still in two minds about passing 'addr' to the first call to get_unmapped_area() instead of NULL. When 'addr' is specified, we need to use 'len' since user may be managing free VMA range by itself. So, I think falling back with the original args is correct. Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 04/21/2016 12:06 AM, Matthew Wilcox wrote: > On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: >> How about moving the function (as is) to mm/huge_memory.c, rename it to >> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h >> when TRANSPARENT_HUGEPAGE is unset? > > Great idea. Perhaps it should look something like this? > > unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, > unsigned long len, unsigned long pgoff, unsigned long flags) > { Might want to keep the future possibility of PUD_SIZE THP in mind? -- Mike Kravetz > loff_t off, off_end, off_pmd; > unsigned long len_pmd, addr_pmd; > > if (addr) > goto out; > if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) > goto out; > /* Kirill, please fill in the right condition here for THP pagecache > */ > > off = (loff_t)pgoff << PAGE_SHIFT; > off_end = off + len; > off_pmd = round_up(off, PMD_SIZE); /* pmd-aligned start offset */ > > if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE)) > goto out; > > len_pmd = len + PMD_SIZE; > if ((off + len_pmd) < off) > goto out; > > addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd, > pgoff, flags); > if (!IS_ERR_VALUE(addr_pmd)) { > addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1); > return addr_pmd; > } > out: > return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); > } > > - I deleted the check for filp == NULL. It can't be NULL ... this is a >file_operation ;-) > - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)? > - I'm still in two minds about passing 'addr' to the first call to >get_unmapped_area() instead of NULL. > ___ > Linux-nvdimm mailing list > linux-nvd...@lists.01.org > https://lists.01.org/mailman/listinfo/linux-nvdimm >
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 04/21/2016 12:06 AM, Matthew Wilcox wrote: > On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: >> How about moving the function (as is) to mm/huge_memory.c, rename it to >> get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h >> when TRANSPARENT_HUGEPAGE is unset? > > Great idea. Perhaps it should look something like this? > > unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, > unsigned long len, unsigned long pgoff, unsigned long flags) > { Might want to keep the future possibility of PUD_SIZE THP in mind? -- Mike Kravetz > loff_t off, off_end, off_pmd; > unsigned long len_pmd, addr_pmd; > > if (addr) > goto out; > if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) > goto out; > /* Kirill, please fill in the right condition here for THP pagecache > */ > > off = (loff_t)pgoff << PAGE_SHIFT; > off_end = off + len; > off_pmd = round_up(off, PMD_SIZE); /* pmd-aligned start offset */ > > if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE)) > goto out; > > len_pmd = len + PMD_SIZE; > if ((off + len_pmd) < off) > goto out; > > addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd, > pgoff, flags); > if (!IS_ERR_VALUE(addr_pmd)) { > addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1); > return addr_pmd; > } > out: > return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); > } > > - I deleted the check for filp == NULL. It can't be NULL ... this is a >file_operation ;-) > - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)? > - I'm still in two minds about passing 'addr' to the first call to >get_unmapped_area() instead of NULL. > ___ > Linux-nvdimm mailing list > linux-nvd...@lists.01.org > https://lists.01.org/mailman/listinfo/linux-nvdimm >
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: > How about moving the function (as is) to mm/huge_memory.c, rename it to > get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h > when TRANSPARENT_HUGEPAGE is unset? Great idea. Perhaps it should look something like this? unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off, off_end, off_pmd; unsigned long len_pmd, addr_pmd; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ off = (loff_t)pgoff << PAGE_SHIFT; off_end = off + len; off_pmd = round_up(off, PMD_SIZE); /* pmd-aligned start offset */ if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE)) goto out; len_pmd = len + PMD_SIZE; if ((off + len_pmd) < off) goto out; addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd, pgoff, flags); if (!IS_ERR_VALUE(addr_pmd)) { addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1); return addr_pmd; } out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } - I deleted the check for filp == NULL. It can't be NULL ... this is a file_operation ;-) - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)? - I'm still in two minds about passing 'addr' to the first call to get_unmapped_area() instead of NULL.
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Wed, Apr 20, 2016 at 11:10:25PM -0400, Toshi Kani wrote: > How about moving the function (as is) to mm/huge_memory.c, rename it to > get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h > when TRANSPARENT_HUGEPAGE is unset? Great idea. Perhaps it should look something like this? unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { loff_t off, off_end, off_pmd; unsigned long len_pmd, addr_pmd; if (addr) goto out; if (IS_DAX(filp->f_mapping->host) && !IS_ENABLED(CONFIG_FS_DAX_PMD)) goto out; /* Kirill, please fill in the right condition here for THP pagecache */ off = (loff_t)pgoff << PAGE_SHIFT; off_end = off + len; off_pmd = round_up(off, PMD_SIZE); /* pmd-aligned start offset */ if ((off_end <= off_pmd) || ((off_end - off_pmd) < PMD_SIZE)) goto out; len_pmd = len + PMD_SIZE; if ((off + len_pmd) < off) goto out; addr_pmd = current->mm->get_unmapped_area(filp, NULL, len_pmd, pgoff, flags); if (!IS_ERR_VALUE(addr_pmd)) { addr_pmd += (off - addr_pmd) & (PMD_SIZE - 1); return addr_pmd; } out: return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); } - I deleted the check for filp == NULL. It can't be NULL ... this is a file_operation ;-) - Why is len_pmd len + PMD_SIZE instead of round_up(len, PMD_SIZE)? - I'm still in two minds about passing 'addr' to the first call to get_unmapped_area() instead of NULL.
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/19/2016 2:23 PM, Matthew Wilcox wrote: On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote: On Fri 15-04-16 22:05:31, Andrew Morton wrote: On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kaniwrote: When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page size. This feature relies on both mmap virtual address and FS block (i.e. physical address) to be aligned by the pmd page size. Users can use mkfs options to specify FS to align block allocations. However, aligning mmap address requires code changes to existing applications for providing a pmd-aligned address to mmap(). For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. It calls mmap() with a NULL address, which needs to be changed to provide a pmd-aligned address for testing with DAX pmd mappings. Changing all applications that call mmap() with NULL is undesirable. This patch-set extends filesystems to align an mmap address for a DAX file so that unmodified applications can use DAX pmd mappings. Matthew sounded unconvinced about the need for this patchset, but I must say that : The point is that we do not need to modify existing applications for using : DAX PMD mappings. : : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). : https://github.com/caius/fio/blob/master/engines/mmap.c : : With this change, unmodified fio can be used for testing with DAX PMD : mappings. There are many examples like this, and I do not think we want : to modify all applications that we want to evaluate/test with. sounds pretty convincing? And if we go ahead with this, it looks like 4.7 material to me - it affects ABI and we want to get that stabilized asap. What do people think? So I think Mathew didn't question the patch set as a whole. I think we all agree that we should align the virtual address we map to so that PMD mappings can be used. What Mathew was questioning was whether we really need to play tricks when logical offset in the file where mmap is starting is not aligned (and similarly for map length). Whether allowing PMD mappings for unaligned file offsets is worth the complication is IMO a valid question. I was questioning the approach as a whole ... since we have userspace already doing this in the form of NVML, do we really need the kernel to do this for us? Now, a further wrinkle. We have two competing patch sets (from Kirill and Hugh) which are going to give us THP for page cache filesystems. I would suggest that this is not DAX functionality but rather VFS functionality to opportunistically align all mmaps on files which are reasonably likely to be able to use THP. I hadn't thought about this until earlier today, and I'm sorry I didn't raise it further. Perhaps we can do a lightning session on this later today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself) are here. How about moving the function (as is) to mm/huge_memory.c, rename it to get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h when TRANSPARENT_HUGEPAGE is unset? Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On 4/19/2016 2:23 PM, Matthew Wilcox wrote: On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote: On Fri 15-04-16 22:05:31, Andrew Morton wrote: On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani wrote: When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page size. This feature relies on both mmap virtual address and FS block (i.e. physical address) to be aligned by the pmd page size. Users can use mkfs options to specify FS to align block allocations. However, aligning mmap address requires code changes to existing applications for providing a pmd-aligned address to mmap(). For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. It calls mmap() with a NULL address, which needs to be changed to provide a pmd-aligned address for testing with DAX pmd mappings. Changing all applications that call mmap() with NULL is undesirable. This patch-set extends filesystems to align an mmap address for a DAX file so that unmodified applications can use DAX pmd mappings. Matthew sounded unconvinced about the need for this patchset, but I must say that : The point is that we do not need to modify existing applications for using : DAX PMD mappings. : : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). : https://github.com/caius/fio/blob/master/engines/mmap.c : : With this change, unmodified fio can be used for testing with DAX PMD : mappings. There are many examples like this, and I do not think we want : to modify all applications that we want to evaluate/test with. sounds pretty convincing? And if we go ahead with this, it looks like 4.7 material to me - it affects ABI and we want to get that stabilized asap. What do people think? So I think Mathew didn't question the patch set as a whole. I think we all agree that we should align the virtual address we map to so that PMD mappings can be used. What Mathew was questioning was whether we really need to play tricks when logical offset in the file where mmap is starting is not aligned (and similarly for map length). Whether allowing PMD mappings for unaligned file offsets is worth the complication is IMO a valid question. I was questioning the approach as a whole ... since we have userspace already doing this in the form of NVML, do we really need the kernel to do this for us? Now, a further wrinkle. We have two competing patch sets (from Kirill and Hugh) which are going to give us THP for page cache filesystems. I would suggest that this is not DAX functionality but rather VFS functionality to opportunistically align all mmaps on files which are reasonably likely to be able to use THP. I hadn't thought about this until earlier today, and I'm sorry I didn't raise it further. Perhaps we can do a lightning session on this later today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself) are here. How about moving the function (as is) to mm/huge_memory.c, rename it to get_hugepage_unmapped_area(), which is defined to NULL in huge_mm.h when TRANSPARENT_HUGEPAGE is unset? Thanks, -Toshi
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote: > On Fri 15-04-16 22:05:31, Andrew Morton wrote: > > On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kaniwrote: > > > > > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page > > > size. This feature relies on both mmap virtual address and FS > > > block (i.e. physical address) to be aligned by the pmd page size. > > > Users can use mkfs options to specify FS to align block allocations. > > > However, aligning mmap address requires code changes to existing > > > applications for providing a pmd-aligned address to mmap(). > > > > > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. > > > It calls mmap() with a NULL address, which needs to be changed to > > > provide a pmd-aligned address for testing with DAX pmd mappings. > > > Changing all applications that call mmap() with NULL is undesirable. > > > > > > This patch-set extends filesystems to align an mmap address for > > > a DAX file so that unmodified applications can use DAX pmd mappings. > > > > Matthew sounded unconvinced about the need for this patchset, but I > > must say that > > > > : The point is that we do not need to modify existing applications for using > > : DAX PMD mappings. > > : > > : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). > > : https://github.com/caius/fio/blob/master/engines/mmap.c > > : > > : With this change, unmodified fio can be used for testing with DAX PMD > > : mappings. There are many examples like this, and I do not think we want > > : to modify all applications that we want to evaluate/test with. > > > > sounds pretty convincing? > > > > > > And if we go ahead with this, it looks like 4.7 material to me - it > > affects ABI and we want to get that stabilized asap. What do people > > think? > > So I think Mathew didn't question the patch set as a whole. I think we all > agree that we should align the virtual address we map to so that PMD > mappings can be used. What Mathew was questioning was whether we really > need to play tricks when logical offset in the file where mmap is starting > is not aligned (and similarly for map length). Whether allowing PMD > mappings for unaligned file offsets is worth the complication is IMO a > valid question. I was questioning the approach as a whole ... since we have userspace already doing this in the form of NVML, do we really need the kernel to do this for us? Now, a further wrinkle. We have two competing patch sets (from Kirill and Hugh) which are going to give us THP for page cache filesystems. I would suggest that this is not DAX functionality but rather VFS functionality to opportunistically align all mmaps on files which are reasonably likely to be able to use THP. I hadn't thought about this until earlier today, and I'm sorry I didn't raise it further. Perhaps we can do a lightning session on this later today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself) are here.
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote: > On Fri 15-04-16 22:05:31, Andrew Morton wrote: > > On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani wrote: > > > > > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page > > > size. This feature relies on both mmap virtual address and FS > > > block (i.e. physical address) to be aligned by the pmd page size. > > > Users can use mkfs options to specify FS to align block allocations. > > > However, aligning mmap address requires code changes to existing > > > applications for providing a pmd-aligned address to mmap(). > > > > > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. > > > It calls mmap() with a NULL address, which needs to be changed to > > > provide a pmd-aligned address for testing with DAX pmd mappings. > > > Changing all applications that call mmap() with NULL is undesirable. > > > > > > This patch-set extends filesystems to align an mmap address for > > > a DAX file so that unmodified applications can use DAX pmd mappings. > > > > Matthew sounded unconvinced about the need for this patchset, but I > > must say that > > > > : The point is that we do not need to modify existing applications for using > > : DAX PMD mappings. > > : > > : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). > > : https://github.com/caius/fio/blob/master/engines/mmap.c > > : > > : With this change, unmodified fio can be used for testing with DAX PMD > > : mappings. There are many examples like this, and I do not think we want > > : to modify all applications that we want to evaluate/test with. > > > > sounds pretty convincing? > > > > > > And if we go ahead with this, it looks like 4.7 material to me - it > > affects ABI and we want to get that stabilized asap. What do people > > think? > > So I think Mathew didn't question the patch set as a whole. I think we all > agree that we should align the virtual address we map to so that PMD > mappings can be used. What Mathew was questioning was whether we really > need to play tricks when logical offset in the file where mmap is starting > is not aligned (and similarly for map length). Whether allowing PMD > mappings for unaligned file offsets is worth the complication is IMO a > valid question. I was questioning the approach as a whole ... since we have userspace already doing this in the form of NVML, do we really need the kernel to do this for us? Now, a further wrinkle. We have two competing patch sets (from Kirill and Hugh) which are going to give us THP for page cache filesystems. I would suggest that this is not DAX functionality but rather VFS functionality to opportunistically align all mmaps on files which are reasonably likely to be able to use THP. I hadn't thought about this until earlier today, and I'm sorry I didn't raise it further. Perhaps we can do a lightning session on this later today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself) are here.
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Fri 15-04-16 22:05:31, Andrew Morton wrote: > On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kaniwrote: > > > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page > > size. This feature relies on both mmap virtual address and FS > > block (i.e. physical address) to be aligned by the pmd page size. > > Users can use mkfs options to specify FS to align block allocations. > > However, aligning mmap address requires code changes to existing > > applications for providing a pmd-aligned address to mmap(). > > > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. > > It calls mmap() with a NULL address, which needs to be changed to > > provide a pmd-aligned address for testing with DAX pmd mappings. > > Changing all applications that call mmap() with NULL is undesirable. > > > > This patch-set extends filesystems to align an mmap address for > > a DAX file so that unmodified applications can use DAX pmd mappings. > > Matthew sounded unconvinced about the need for this patchset, but I > must say that > > : The point is that we do not need to modify existing applications for using > : DAX PMD mappings. > : > : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). > : https://github.com/caius/fio/blob/master/engines/mmap.c > : > : With this change, unmodified fio can be used for testing with DAX PMD > : mappings. There are many examples like this, and I do not think we want > : to modify all applications that we want to evaluate/test with. > > sounds pretty convincing? > > > And if we go ahead with this, it looks like 4.7 material to me - it > affects ABI and we want to get that stabilized asap. What do people > think? So I think Mathew didn't question the patch set as a whole. I think we all agree that we should align the virtual address we map to so that PMD mappings can be used. What Mathew was questioning was whether we really need to play tricks when logical offset in the file where mmap is starting is not aligned (and similarly for map length). Whether allowing PMD mappings for unaligned file offsets is worth the complication is IMO a valid question. Honza -- Jan Kara SUSE Labs, CR
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Fri 15-04-16 22:05:31, Andrew Morton wrote: > On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani wrote: > > > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page > > size. This feature relies on both mmap virtual address and FS > > block (i.e. physical address) to be aligned by the pmd page size. > > Users can use mkfs options to specify FS to align block allocations. > > However, aligning mmap address requires code changes to existing > > applications for providing a pmd-aligned address to mmap(). > > > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. > > It calls mmap() with a NULL address, which needs to be changed to > > provide a pmd-aligned address for testing with DAX pmd mappings. > > Changing all applications that call mmap() with NULL is undesirable. > > > > This patch-set extends filesystems to align an mmap address for > > a DAX file so that unmodified applications can use DAX pmd mappings. > > Matthew sounded unconvinced about the need for this patchset, but I > must say that > > : The point is that we do not need to modify existing applications for using > : DAX PMD mappings. > : > : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). > : https://github.com/caius/fio/blob/master/engines/mmap.c > : > : With this change, unmodified fio can be used for testing with DAX PMD > : mappings. There are many examples like this, and I do not think we want > : to modify all applications that we want to evaluate/test with. > > sounds pretty convincing? > > > And if we go ahead with this, it looks like 4.7 material to me - it > affects ABI and we want to get that stabilized asap. What do people > think? So I think Mathew didn't question the patch set as a whole. I think we all agree that we should align the virtual address we map to so that PMD mappings can be used. What Mathew was questioning was whether we really need to play tricks when logical offset in the file where mmap is starting is not aligned (and similarly for map length). Whether allowing PMD mappings for unaligned file offsets is worth the complication is IMO a valid question. Honza -- Jan Kara SUSE Labs, CR
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kaniwrote: > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page > size. This feature relies on both mmap virtual address and FS > block (i.e. physical address) to be aligned by the pmd page size. > Users can use mkfs options to specify FS to align block allocations. > However, aligning mmap address requires code changes to existing > applications for providing a pmd-aligned address to mmap(). > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. > It calls mmap() with a NULL address, which needs to be changed to > provide a pmd-aligned address for testing with DAX pmd mappings. > Changing all applications that call mmap() with NULL is undesirable. > > This patch-set extends filesystems to align an mmap address for > a DAX file so that unmodified applications can use DAX pmd mappings. Matthew sounded unconvinced about the need for this patchset, but I must say that : The point is that we do not need to modify existing applications for using : DAX PMD mappings. : : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). : https://github.com/caius/fio/blob/master/engines/mmap.c : : With this change, unmodified fio can be used for testing with DAX PMD : mappings. There are many examples like this, and I do not think we want : to modify all applications that we want to evaluate/test with. sounds pretty convincing? And if we go ahead with this, it looks like 4.7 material to me - it affects ABI and we want to get that stabilized asap. What do people think?
Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani wrote: > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page > size. This feature relies on both mmap virtual address and FS > block (i.e. physical address) to be aligned by the pmd page size. > Users can use mkfs options to specify FS to align block allocations. > However, aligning mmap address requires code changes to existing > applications for providing a pmd-aligned address to mmap(). > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1]. > It calls mmap() with a NULL address, which needs to be changed to > provide a pmd-aligned address for testing with DAX pmd mappings. > Changing all applications that call mmap() with NULL is undesirable. > > This patch-set extends filesystems to align an mmap address for > a DAX file so that unmodified applications can use DAX pmd mappings. Matthew sounded unconvinced about the need for this patchset, but I must say that : The point is that we do not need to modify existing applications for using : DAX PMD mappings. : : For instance, fio with "ioengine=mmap" performs I/Os with mmap(). : https://github.com/caius/fio/blob/master/engines/mmap.c : : With this change, unmodified fio can be used for testing with DAX PMD : mappings. There are many examples like this, and I do not think we want : to modify all applications that we want to evaluate/test with. sounds pretty convincing? And if we go ahead with this, it looks like 4.7 material to me - it affects ABI and we want to get that stabilized asap. What do people think?