The Scheduling Scalability page is at:
http://lse.sourceforge.net/scheduling/
If you are interested in this work, please join the lse-tech
mailing list at:
http://sourceforge.net/projects/lse
--
Mike Kravetz [EMAIL PROTECTED]
IBM Linux Technology Center
d from the run-queue.
Now, what usually happens is that wake_up_process_synchronous or
wake_up_process will add the task back to the run-queue as soon
as the scheduler drops the run-queue lock. Therefore, this does
not seem to cause any problems.
I'm curious, is this behavior by design OR are
://sourceforge.net/projects/lse
Thanks,
--
Mike Kravetz [EMAIL PROTECTED]
IBM Linux Technology Center
15450 SW Koll Parkway
Beaverton, OR 97006-6063 (503)578-3494
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" i
scheduler patches
located at:
http://lse.sourceforge.net/scheduling/
I would be interested in your observations.
--
Mike Kravetz [EMAIL PROTECTED]
IBM Linux Technology Center
veloped
a 'token passing' benchmark which attempts to address these issues
(called reflex at the above site). However, I would really like
to get a pointer to a community acceptable workload/benchmark for
these low thread cases.
--
Mike Kravetz [EMAIL PROTECTED]
IBM
e. However, at this point one could argue that
we have moved away from a 'realistic' low task count system load.
> lmbench's lat_ctx for example, and other tools in lmbench trigger various
> scheduler workloads as well.
Thanks, I'll add these to our list.
--
Mike Kravetz
i-queue patch I developed, the
scheduler always attempts to make the same global scheduling decisions
as the current scheduler.
--
Mike Kravetz [EMAIL PROTECTED]
IBM Linux Technology Center
ons, load balancing algorithms take considerable effort
to get working in a reasonably well-performing manner.
>
> Could you make a port of your thing on recent kernels?
There is a 2.4.2 patch on the web page. I'll put out a 2.4.3 patch
as soon as I get some time.
--
Mike Kravetz
On Mon, Apr 04, 2005 at 10:50:09AM -0700, Dave Hansen wrote:
diff -puN mm/Kconfig~A6-mm-Kconfig mm/Kconfig
--- memhotplug/mm/Kconfig~A6-mm-Kconfig 2005-04-04 09:04:48.0 -0700
+++ memhotplug-dave/mm/Kconfig 2005-04-04 10:15:23.0 -0700
@@ -0,0 +1,25 @@
> +choice
> + prompt
and FLAT for others.
--
Signed-off-by: Mike Kravetz <[EMAIL PROTECTED]>
diff -Naupr linux-2.6.12-rc2-mm1/arch/ppc64/Kconfig
linux-2.6.12-rc2-mm1.work/arch/ppc64/Kconfig
--- linux-2.6.12-rc2-mm1/arch/ppc64/Kconfig 2005-04-05 18:44:57.0
+
+++ linux-2.6.12-rc2-mm1.work/arch
On Thu, Feb 17, 2005 at 04:03:53PM -0800, Dave Hansen wrote:
> The attached patch
Just tried to compile this and noticed that there is no definition
of valid_section_nr(), referenced in sparse_init.
--
Mike
On Thu, Mar 10, 2005 at 02:36:13AM -0800, Andrew Morton wrote:
>
> This patch causes the non-numa G5 to oops very early in boot in
> smp_call_function().
>
OK - Let me take a look.
--
Mike
On Fri, Mar 11, 2005 at 07:51:38PM +1100, Paul Mackerras wrote:
>
> Anyway, the ultimate reason seems to be that the numa.c code is
> assuming that an address value and a size value occupy the same number
> of cells. On the G5 we have #address-cells = 2 but #size-cells = 1.
> Previously this
this on a machine known to break with
the previous version (such as G5).
--
Signed-off-by: Mike Kravetz <[EMAIL PROTECTED]>
diff -Naupr linux-2.6.11/arch/ppc64/mm/numa.c
linux-2.6.11.work/arch/ppc64/mm/numa.c
--- linux-2.6.11/arch/ppc64/mm/numa.c 2005-03-02 07:38:38.0 +
+++ linux-
and OpenPower 720.
--
Signed-off-by: Mike Kravetz <[EMAIL PROTECTED]>
diff -Naupr linux-2.6.11.4/arch/ppc64/mm/numa.c
linux-2.6.11.4.work/arch/ppc64/mm/numa.c
--- linux-2.6.11.4/arch/ppc64/mm/numa.c 2005-03-16 00:09:31.0 +
+++ linux-2.6.11.4.work/arch/ppc64/mm/numa.c 2005-03-16
On Wed, Mar 23, 2005 at 11:11:10PM +1100, Michael Ellerman wrote:
>
> Can you test this on your 720 or whatever it was? And if anyone else
> has an interesting NUMA machine they can test it on I'd love to hear
> about it!
>
I've tested this with various config options on my 720. Appears to
On Thu, Aug 04, 2005 at 03:19:52PM -0700, Christoph Lameter wrote:
> This code already exist in the memory hotplug code base and Ray already
> had a working implementation for page migration. The migration code will
> also be necessary in order to relocate pages with ECC single bit failures
>
On Wed, Dec 13, 2006 at 07:20:57PM +0100, Arnd Bergmann wrote:
> After a lot of debugging in spufs, I found that a crash that we encountered
> on Cell actually was caused by a change in the memory management.
>
> The patch that caused it is archived in http://lkml.org/lkml/2006/11/1/43,
> and
I've been trying to track down some unexpected realtime latencies and
believe one source is a bug in the wakeup code. Specifically, this is
within the try_to_wake_up() routine. Within this routine there is the
following code segment:
/*
* If a newly woken up RT task cannot
On Tue, Oct 02, 2007 at 07:06:32AM +0200, Ingo Molnar wrote:
> * Mike Kravetz <[EMAIL PROTECTED]> wrote:
> >
> > My observations/debugging/conclusions are based on an earlier version
> > of the code. It appears the same code/issue still exists in the most
> > v
On Tue, Oct 02, 2007 at 07:06:32AM +0200, Ingo Molnar wrote:
> Index: linux-rt-rebase.q/kernel/sched.c
> ===
> --- linux-rt-rebase.q.orig/kernel/sched.c
> +++ linux-rt-rebase.q/kernel/sched.c
> @@ -1819,6 +1819,13 @@ out_set_cpu:
>
Hi Ingo,
After applying the fix to try_to_wake_up() I was still seeing some large
latencies for realtime tasks. Some debug code pointed out two additional
causes of these latencies. I have put fixes into my 'old' kernel and the
scheduler related latencies have gone away. I'm pretty confident
On Fri, Oct 05, 2007 at 07:15:48PM -0700, Mike Kravetz wrote:
> After applying the fix to try_to_wake_up() I was still seeing some large
> latencies for realtime tasks.
I've been looking for places in the code where reschedule IPIs should
be sent in the case of 'overload' to redistribute Re
On Mon, Oct 08, 2007 at 11:04:12PM -0400, Steven Rostedt wrote:
> On Mon, Oct 08, 2007 at 11:45:23AM -0700, Mike Kravetz wrote:
> > Are these accurate statements? I'll start working on a reliable delivery
> > mechanism for RealTime scheduling. But, I just want to make sure tha
On Mon, Oct 08, 2007 at 10:46:21PM -0400, Steven Rostedt wrote:
> Mike,
>
> Can you attach your Signed-off-by to this patch, please.
>
>
> On Fri, Oct 05, 2007 at 07:15:48PM -0700, Mike Kravetz wrote:
> > Hi Ingo,
> >
> > After applying the fix to try_to_wak
On Tue, Oct 09, 2007 at 01:59:37PM -0400, Steven Rostedt wrote:
> This has been compile tested (and no more ;-)
>
> The idea here is when we find a situation that we just scheduled in an
> RT task and we either pushed a lesser RT task away or more than one RT
> task was scheduled on this CPU
On Tue, Oct 09, 2007 at 04:50:47PM -0400, Steven Rostedt wrote:
> > I did something like this a while ago for another scheduling project.
> > A couple 'possible' optimizations to think about are:
> > 1) Only scan the remote runqueues once and keep a local copy of the
> >remote priorities for
On Wed, Oct 10, 2007 at 10:49:35AM -0400, Gregory Haskins wrote:
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 3e75c62..b7f7a96 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -1869,7 +1869,8 @@ out_activate:
>* extra locking in this particular case, because
>
On Wed, Oct 10, 2007 at 07:50:52AM -0400, Steven Rostedt wrote:
> On Tue, Oct 09, 2007 at 11:49:53AM -0700, Mike Kravetz wrote:
> > The more I try understand the IPI handling the more confused I get. :(
> > At first I was concerned about an IPI happening in the middle of the
> &g
On 03/13/2018 02:14 PM, Andrew Morton wrote:
> On Fri, 9 Mar 2018 14:47:31 -0800 Mike Kravetz
> wrote:
>
>> start_isolate_page_range() is used to set the migrate type of a
>> set of pageblocks to MIGRATE_ISOLATE while attempting to start
>> a migration operation
t system (and some other changes).
To me, this seems like a step in the wrong direction. But, I could
be totally wrong and perhaps self tests should primarily target the
host system header files.
--
Mike Kravetz
nge() fail if already
isolated" should handle this situation IF we decide to expose
alloc_gigantic_page (which I do not suggest).
--
Mike Kravetz
On 02/15/2018 12:39 PM, Reinette Chatre wrote:
> On 2/14/2018 10:31 AM, Reinette Chatre wrote:
>> On 2/14/2018 10:12 AM, Mike Kravetz wrote:
>>> On 02/13/2018 07:46 AM, Reinette Chatre wrote:
>>>> Adding MM maintainers to v2 to share the new MM change (patch
On 02/12/2018 02:20 PM, Mike Kravetz wrote:
> start_isolate_page_range() is used to set the migrate type of a
> page block to MIGRATE_ISOLATE while attempting to start a
> migration operation. It is assumed that only one thread is
> attempting such an operation, and due to the li
4.
>
> There is a regression on arm32 in libhugetlbfs/truncate_above_4GB-2M-32
> that also exists in 4.14 and mainline. We'll investigate the root cause
> and report upstream in mainline. I suspect the cause is "hugetlbfs:
> check for pgoff value overflow", but have not ver
On 03/28/2018 12:06 PM, Mike Kravetz wrote:
> On 03/28/2018 11:44 AM, Dan Rue wrote:
>> On Tue, Mar 27, 2018 at 06:26:40PM +0200, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 4.15.14 release.
>>> There are 105 patches in this
r than 4GB on 32 bit kernels.
The above is in the commit message. 63489f8e8211 has been sent upstream
and to stable, so cc'ing stable here as well.
I would appreciate some more eyes on this code. There have been several
fixes and we keep running into issues.
Mike Kravetz (1):
hugetlbfs: fix bu
y: Dan Rue
Signed-off-by: Mike Kravetz
---
fs/hugetlbfs/inode.c | 22 +-
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index b9a254dcc0e7..8450a1d75dfa 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
On 03/28/2018 09:16 PM, Mike Kravetz wrote:
> Commit 63489f8e8211 ("hugetlbfs: check for pgoff value overflow")
> introduced a regression in 32 bit kernels. When creating the mask
> to check vm_pgoff, it incorrectly specified that the size of a loff_t
> was the size of
via shmget/shmat have their vm_ops replaced. Therefore, this
split callout is never made.
The shm vm_ops do indirectly call the original vm_ops routines as needed.
Therefore, I would suggest a patch something like the following instead.
If we move forward with the patch, we should include Laurent
On 03/20/2018 02:26 PM, Mike Kravetz wrote:
> Thanks Laurent!
>
> This bug was introduced by 31383c6865a5. Dan's changes for 31383c6865a5
> seem pretty straight forward. It simply replaces an explicit check when
> splitting a vma to a new vm_ops split callout. Unfortunately, map
->split() to
vm_operations_struct")
Signed-off-by: Mike Kravetz
Reported by: Laurent Dufour
Tested-by: Laurent Dufour
Acked-by: Michal Hocko
Cc: sta...@vger.kernel.org
---
Changes in v2
* Updated commit message
* Cc stable
ipc/shm.c | 12
1 file changed, 12 insertions(+
On 03/21/2018 01:56 PM, Andrew Morton wrote:
> On Wed, 21 Mar 2018 09:13:14 -0700 Mike Kravetz
> wrote:
>>
>> +static int shm_split(struct vm_area_struct *vma, unsigned long addr)
>> +{
>> +struct file *file = vma->vm_file;
>> +struct
On 03/07/2018 08:25 PM, Mike Kravetz wrote:
> On 03/07/2018 05:35 PM, Yisheng Xie wrote:
>> However, region_chg makes me a little puzzled: when its return value < 0,
>> sometimes
>> adds_in_progress is added, as in this case, while sometimes it is not. so wh
fix to this code was incomplete and did not
take the remap_file_pages system call into account.
Fixes: 045c7a3f53d9 ("hugetlbfs: fix offset overflow in hugetlbfs mmap")
Cc:
Reported-by: Nic Losby
Signed-off-by: Mike Kravetz
---
Changes in v2
* Use bitmask for overflow check as suggested b
On 03/08/2018 02:15 PM, Andrew Morton wrote:
> On Thu, 8 Mar 2018 13:05:02 -0800 Mike Kravetz
> wrote:
>
>> A vma with vm_pgoff large enough to overflow a loff_t type when
>> converted to a byte offset can be passed via the remap_file_pages
>> system call. The
fix to this code was incomplete and did not
take the remap_file_pages system call into account.
Fixes: 045c7a3f53d9 ("hugetlbfs: fix offset overflow in hugetlbfs mmap")
Cc:
Reported-by: Nic Losby
Signed-off-by: Mike Kravetz
---
Changes in v3
* Use a simpler mask computation as suggested by
functionality.
Signed-off-by: Mike Kravetz
---
Changes in v2
* Updated commit message and comments as suggested by Andrew Morton
mm/page_alloc.c | 8
mm/page_isolation.c | 18 +-
2 files changed, 21 insertions(+), 5 deletions(-)
diff --git a/mm/page_alloc.c b/mm
On 03/16/2018 03:17 AM, Michal Hocko wrote:
> On Thu 08-03-18 16:27:26, Mike Kravetz wrote:
>
> OK, looks good to me. Hairy but seems to be the easiest way around this.
> Acked-by: Michal Hocko
>
>> +/*
>> + * Mask used when checking the page offset value passed
llocation? case you are trying to move away from. Sorry, I have not been
following development of this feature.
If you would have to create a device to accept a user buffer, could you
perhaps use the same device to create/hand out a contiguous mapping?
--
Mike Kravetz
h->surplus_huge_pages_node[page_to_nid(page)]++;
> }
>
> out_unlock:
I thought we had this corrected in a previous version of the patch.
My apologies for not looking more closely at this version.
FWIW,
Reviewed-by: Mike Kravetz
--
Mike Kravetz
25.6%, the
> IPC (instruction per cycle) increased from 0.3 to 0.37, and the time
> spent in user space is reduced ~19.3%
Since this patch only addresses hugetlbfs huge pages, I would suggest
making that more explicit in the commit message. Other than that, the
changes look fine to me.
>
On 05/03/2018 05:09 PM, TSUKADA Koutaro wrote:
> On 2018/05/03 11:33, Mike Kravetz wrote:
>> On 05/01/2018 11:54 PM, TSUKADA Koutaro wrote:
>>> On 2018/05/02 13:41, Mike Kravetz wrote:
>>>> What is the reason for not charging pages at allocation/reserve time? I
ksft_skip
We now KNOW that we are running as root because of the check above. We
can delete this test, and rely on the later check to determine if the
number of huge pages was actually increased.
How about this instead (untested)?
Signed-off-by: Mike Kravetz
diff --git a/tools/testing/selftests/
n it.
>
> In addition, return skip code when not enough huge pages are available to
> run the test.
>
> Kselftest framework SKIP code is 4 and the framework prints appropriate
> messages to indicate that the test is skipped.
>
> Signed-off-by: Shuah Khan (Samsung OSG)
Tha
DONE
> ok 1..2 selftests: memfd: run_fuse_test.sh [PASS]
> selftests: memfd: run_hugetlbfs_test.sh
>
> Please run memfd with hugetlbfs test as root
> not ok 1..3 selftests: memfd: run_hugetlbfs_test.sh [SKIP]
>
> Signed-off-by: Shuah Khan (Samsung OSG)
Thanks for all your
not charged to a memcg. memcg charges in other
code paths seem to happen at huge page allocation time.
--
Mike Kravetz
>
> The page charged to memcg will finally be uncharged at free_huge_page.
>
> Modification of memcontrol.c is for updating of statistical information
>
On 04/21/2018 09:16 AM, Vlastimil Babka wrote:
> On 04/17/2018 04:09 AM, Mike Kravetz wrote:
>> find_alloc_contig_pages() is a new interface that attempts to locate
>> and allocate a contiguous range of pages. It is provided as a more
>> convenient interface than alloc
On 05/01/2018 11:54 PM, TSUKADA Koutaro wrote:
> On 2018/05/02 13:41, Mike Kravetz wrote:
>> What is the reason for not charging pages at allocation/reserve time? I am
>> not an expert in memcg accounting, but I would think the pages should be
>> charged at allocation tim
urce Director Technology Cache Pseudo-Locking.
Mike Kravetz (4):
mm: change type of free_contig_range(nr_pages) to unsigned long
mm: check for proper migrate type during isolation
mm: add find_alloc_contig_pages() interface
mm/hugetlb: use find_alloc_contig_pages() to allocate gigantic pa
as there are two primary
users. Contiguous range allocation which wants to enforce migration
type checking. Memory offline (hotplug) which is not concerned about
type checking.
Signed-off-by: Mike Kravetz
---
include/linux/page-isolation.h | 8 +++-
mm/memory_hotplug.c| 2
Use the new find_alloc_contig_pages() interface for the allocation of
gigantic pages and remove associated code in hugetlb.c.
Signed-off-by: Mike Kravetz
---
mm/hugetlb.c | 87 +---
1 file changed, 6 insertions(+), 81 deletions(-)
diff
an unsigned int.
However, this should be changed to an unsigned long to be consistent
with other page counts.
Signed-off-by: Mike Kravetz
---
include/linux/gfp.h | 2 +-
mm/cma.c| 2 +-
mm/hugetlb.c| 2 +-
mm/page_alloc.c | 6 +++---
4 files changed, 6 insertions(+), 6
is employed
if possible. There is no guarantee that the routine will succeed.
So, the user must be prepared for failure and have a fall back plan.
Signed-off-by: Mike Kravetz
---
include/linux/gfp.h | 12 +
mm/page_alloc.c | 136 +++-
2
a to consider?
That gets back to Michal's question of a specific use case or generic
optimization. Unless code is simple (as in this patch), seems like we should
hold off on considering additional optimizations unless there is a specific
use case.
I'm still OK with this change.
--
Mike Kravetz
hich generates heavy cache
> pressure. At the same time, the cache miss rate reduced from ~36.3%
> to ~25.6%, the IPC (instruction per cycle) increased from 0.3 to 0.37,
> and the time spent in user space is reduced ~19.3%.
>
Agree with Michal that commit message looks better.
I we
sub-pages are to be copied. IIUC, you
added the same algorithm for sub-page ordering to copy_huge_page()
that was previously added to clear_huge_page(). Correct? If so,
then perhaps a common helper could be used by both the clear and copy
huge page routines. It would also make maintenance easier.
--
Mike Kravetz
On 05/18/2018 02:12 AM, Vlastimil Babka wrote:
> On 05/04/2018 01:29 AM, Mike Kravetz wrote:
>> free_contig_range() is currently defined as:
>> void free_contig_range(unsigned long pfn, unsigned nr_pages);
>> change to,
>> void free_contig_range(unsigned long
. hugetlb.c and hugetlb.h are
not 100% hugetlbfs, but a majority of their content is hugetlbfs
related.
Signed-off-by: Mike Kravetz
---
MAINTAINERS | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 9051a9ca24a2..c7a5eb074eb1 100644
lar to the OpenBSD minherit syscall with MAP_INHERIT_ZERO:
>
> https://man.openbsd.org/minherit.2
>
> Reported-by: Florian Weimer
> Reported-by: Colm MacCártaigh
> Signed-off-by: Rik van Riel
My primary concern with the first suggested patch was trying to define
semantics if MADV
ted behaviour of this function.
> This is to set clear semantics for architecture specific implementations
> of huge_pte_offset().
>
> Signed-off-by: Punit Agrawal
> Cc: Catalin Marinas
> Cc: Naoya Horiguchi
> Cc: Steve Capper
> Cc: Will Deacon
> Cc: Kirill A. Shutem
ence well enough to know if it would be
possible for driver code to make CMA reservations. But, it looks doubtful.
--
Mike Kravetz
Add the flag VM_CONTIG to vma structure to identify vmas which are
backed by contiguous memory allocations. This flag is not propagated
to child processes, so be sure to clear at fork time.
Signed-off-by: Mike Kravetz
---
include/linux/mm.h | 1 +
kernel/fork.c | 2 +-
2 files changed, 2
-by: Mike Kravetz
---
include/uapi/asm-generic/mman.h | 1 +
mm/mmap.c | 94 +
2 files changed, 95 insertions(+)
diff --git a/include/uapi/asm-generic/mman.h b/include/uapi/asm-generic/mman.h
index 7162cd4cca73..e8046b4c4ac4 100644
. Also, the allocations should probably
be done outside mmap_sem but that was the easiest place to do it in
this quick and easy POC.
I just wanted to throw out some code to get further ideas. It is far
from complete.
Mike Kravetz (3):
mm/map_contig: Add VM_CONTIG flag to vma struct
mm
When populating mappings backed by contiguous memory allocations
(VM_CONTIG), use the preallocated pages instead of allocating new.
Signed-off-by: Mike Kravetz
---
mm/memory.c | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/mm/memory.c b/mm/memory.c
index
On 10/12/2017 07:37 AM, Michal Hocko wrote:
> On Wed 11-10-17 18:46:11, Mike Kravetz wrote:
>> Add new MAP_CONTIG flag to mmap system call. Check for flag in normal
>> mmap flag processing. If present, pre-allocate a contiguous set of
>> pages to back the mapping. The
for these encodings.
Put common definitions in a single header file. The primary uapi
header files for mmap and shm will use these definitions as a basis
for definitions specific to those system calls.
Signed-off-by: Mike Kravetz
---
include/uapi/asm-generic/hugetlb_encode.h | 34
header file, and add to user (uapi/linux/shm.h) header file. Add
definitions for all known huge page size encodings as in mmap.
[1]https://lkml.org/lkml/2017/3/8/548
Mike Kravetz (3):
mm:hugetlb: Define system call hugetlb size encodings in single file
mm: arch: Consolidate mmap hugetlb
). Include definitions for all known huge page
sizes. Use the generic encoding definitions in hugetlb_encode.h
as the basis for these definitions.
Signed-off-by: Mike Kravetz
---
arch/alpha/include/uapi/asm/mman.h | 11 ---
arch/mips/include/uapi/asm/mman.h | 11 ---
arch
Use the common definitions from hugetlb_encode.h header file for
encoding hugetlb size definitions in shmget system call flags.
In addition, move these definitions from the internal (kernel) to
user (uapi) header file.
Suggested-by: Matthew Wilcox
Signed-off-by: Mike Kravetz
---
include/linux
) with some kludges to
use the pages at fault time. It is really ugly, which is why I am not
sharing the code. Hoping for some comments/suggestions.
[1] https://www.linuxplumbersconf.org/2017/ocw/proposals/4669
--
Mike Kravetz
On 10/04/2017 04:54 AM, Michal Nazarewicz wrote:
> On Tue, Oct 03 2017, Mike Kravetz wrote:
>> At Plumbers this year, Guy Shattah and Christoph Lameter gave a presentation
>> titled 'User space contiguous memory allocation for DMA' [1]. The slides
>> point out the performanc
On 10/04/2017 06:49 AM, Anshuman Khandual wrote:
> On 10/04/2017 05:26 AM, Mike Kravetz wrote:
>> At Plumbers this year, Guy Shattah and Christoph Lameter gave a presentation
>> titled 'User space contiguous memory allocation for DMA' [1]. The slides
>> point out the
e
populated at mmap time, and the pages locked. Therefore, there should
be no swap or migration.
--
Mike Kravetz
dex 5624918154db..1c08f0136667 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -813,7 +813,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm,
> struct task_struct *p,
init_rwsem(&mm->mmap_sem);
INIT_LIST_HEAD(&mm->mmlist);
> mm->core_state
-allocate pages for their use, and
this 'might' be something useful for contiguous allocations as well.
I wonder if going down the path of a separate device/filesystem/etc for
contiguous allocations might be a better option. It would keep the
implementation somewhat separate. However, I would then be afraid that
we end up with another 'separate/special vm' as in the case of hugetlbfs
today.
--
Mike Kravetz
On 10/16/2017 11:07 AM, Michal Hocko wrote:
> On Mon 16-10-17 10:43:38, Mike Kravetz wrote:
>> Just to be clear, the posix standard talks about a typed memory object.
>> The suggested implementation has one create a connection to the memory
>> object to receive a fd, then use
On 10/16/2017 02:03 PM, Laura Abbott wrote:
> On 10/16/2017 01:32 PM, Mike Kravetz wrote:
>> On 10/16/2017 11:07 AM, Michal Hocko wrote:
>>> On Mon 16-10-17 10:43:38, Mike Kravetz wrote:
>>>> Just to be clear, the posix standard talks about a typed memory object.
>
= hugetlb_vmtruncate(inode, attr->ia_size);
>
Thanks for noticing.
I would hope the compiler is smarter than the code and optimizes this away.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
upported(), which is only there if
ARCH_ENABLE_HUGEPAGE_MIGRATION is defined. IIUC, this functionality
was added for powerpc. Yet, powerpc does not define
ARCH_ENABLE_HUGEPAGE_MIGRATION (unless I am missing something).
--
Mike Kravetz
the cgroup limit', the migration
may fail because of this.
I like your new code below as it explicitly takes reserve and cgroup
accounting out of the picture for migration. Let me think about it
for another day before providing a Reviewed-by.
--
Mike Kravetz
>> I don't think this is a bu
On 12/20/2017 04:26 PM, Andrew Morton wrote:
> On Wed, 20 Dec 2017 16:10:51 +0100 Michal Hocko wrote:
>
>> On Wed 20-12-17 15:15:50, Marc-André Lureau wrote:
>>> Hi
>>>
>>> On Wed, Nov 15, 2017 at 4:13 AM, Mike Kravetz
>>> wrote:
>>>&
page = alloc_fresh_huge_page_node(h, node);
> - if (page) {
> - ret = 1;
> + page = __hugetlb_alloc_buddy_huge_page(h, gfp_mask,
> + node, nodes_allowed);
I don't have the greatest understanding of
porary(hpage);
> + ClearPageHugeTemporary(new_hpage);
> + }
> }
>
> unlock_page(hpage);
>
I'm still trying to wrap my head around all the different scenarios.
In general, this new code only 'kicks in' if there is not a free
pre-allocated huge page for migration. Right?
So, if there are free huge pages they are 'consumed' during migration
and the number of available pre-allocated huge pages is reduced? Or,
is that not exactly how it works? Or does it depend in the purpose
of the migration?
The only reason I ask is because this new method of allocating a surplus
page (if successful) results in no decrease of available huge pages.
Perhaps all migrations should attempt to allocate surplus pages and not
impact the pre-allocated number of available huge pages.
Or, perhaps I am just confused. :)
--
Mike Kravetz
ed, if only to prevent future breakage or someone copy-pasting this
>> code.
>>
>> Fixes: 70c3547e36f5c ("hugetlbfs: add hugetlbfs_fallocate()")
>>
>> cc: Eric Biggers
>> cc: Mike Kravetz
>>
>> Signed-off-by: Nadav Amit
>> ---
&
On 12/13/2017 11:40 PM, Michal Hocko wrote:
> On Wed 13-12-17 15:35:33, Mike Kravetz wrote:
>> On 12/04/2017 06:01 AM, Michal Hocko wrote:
> [...]
>>> Before migration
>>> /sys/devices/system/node/node0/hugepages/hugepages-2048kB/free_hugepages:0
>>> /
On 12/13/2017 11:50 PM, Michal Hocko wrote:
> On Wed 13-12-17 16:45:55, Mike Kravetz wrote:
>> On 12/04/2017 06:01 AM, Michal Hocko wrote:
>>> From: Michal Hocko
>>>
>>> alloc_surplus_huge_page increases the pool size and the number of
>>> surplus
ir excessive prefix underscores to make names shorter
>
This patch will need to be modified to take into account the incremental
diff to patch 4 in this series. Other than that, the changes look good.
Reviewed-by: Mike Kravetz
--
Mike Kravetz
> Signed-off-by: Michal Hocko
> ---
> m
On 12/20/2017 11:28 PM, Michal Hocko wrote:
> On Wed 20-12-17 14:43:03, Mike Kravetz wrote:
>> On 12/20/2017 01:53 AM, Michal Hocko wrote:
>>> On Wed 20-12-17 05:33:36, Naoya Horiguchi wrote:
>>>> I have one comment on the code path from mbind(2).
>>>