On Wed, 21 Nov 2018, Michal Hocko wrote:
> On Mon 19-11-18 21:44:41, Hugh Dickins wrote:
> [...]
> > [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
> >
> > We have all assumed that it is essential to hold a page reference while
> > waiting on a page lock: partly to guarantee
On Mon 19-11-18 21:44:41, Hugh Dickins wrote:
[...]
> [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
>
> We have all assumed that it is essential to hold a page reference while
> waiting on a page lock: partly to guarantee that there is still a struct
> page when
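For context, condensed from 4.20-era mm/migrate.c (details trimmed): before
this patch, __migration_entry_wait() slept with the extra page reference held
for the whole wait:

	page = migration_entry_to_page(entry);
	if (!get_page_unless_zero(page))
		goto out;			/* page already freed: just refault */
	pte_unmap_unlock(ptep, ptl);
	wait_on_page_locked(page);		/* sleeps with the extra ref held */
	put_page(page);
	return;

The patch replaces the wait-then-put pair with a helper that gives the
reference back before sleeping, so migration's reference-count check can
succeed while the waiter sleeps:

	pte_unmap_unlock(ptep, ptl);
	put_and_wait_on_page_locked(page);	/* drops the ref, then waits */
	return;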
On Tue, 20 Nov 2018, Hugh Dickins wrote:
> On Tue, 20 Nov 2018, Vlastimil Babka wrote:
> > >
> > > 		finish_wait(q, wait);
> >
> > ... the code continues by:
> >
> > 	if (thrashing) {
> > 		if (!PageSwapBacked(page))
> >
> > So maybe we should not set 'thrashing' true
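Vlastimil's concern: once the reference is dropped before sleeping, the page
may already be freed when the waiter wakes, so the thrashing epilogue must not
dereference page flags. A condensed sketch of the shape the fix took in
wait_on_page_bit_common() (mm/filemap.c of that era): sample PageSwapBacked()
up front, while a reference is still held, and use only the saved value after
the wait:

	bool thrashing = false;
	bool delayacct = false;
	unsigned long pflags;

	if (bit_nr == PG_locked &&
	    !PageUptodate(page) && PageWorkingset(page)) {
		if (!PageSwapBacked(page)) {	/* sampled while a ref is held */
			delayacct_thrashing_start();
			delayacct = true;
		}
		psi_memstall_enter(&pflags);
		thrashing = true;
	}

	/* ... wait loop; with the DROP behavior our page reference is gone ... */
	finish_wait(q, wait);

	if (thrashing) {
		if (delayacct)			/* no PageSwapBacked(page) here */
			delayacct_thrashing_end();
		psi_memstall_leave(&pflags);
	}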
On Tue, 20 Nov 2018, Baoquan He wrote:
> On 11/20/18 at 02:38pm, Vlastimil Babka wrote:
> > On 11/20/18 6:44 AM, Hugh Dickins wrote:
> > > [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
> > >
> > > We have all assumed that it is essential to hold a page reference while
> > >
On Tue, 20 Nov 2018, Vlastimil Babka wrote:
> On 11/20/18 6:44 AM, Hugh Dickins wrote:
> > [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
> >
> > We have all assumed that it is essential to hold a page reference while
> > waiting on a page lock: partly to guarantee that there is
On 11/20/18 at 03:05pm, Michal Hocko wrote:
> > Yes, I applied Hugh's patch 8 hours ago, then our QE Ping operated on
> > that machine, after many times of hot removing/adding, the endless
> > looping during migration is not seen any more. The test result for
> > Hugh's patch is positive. I even
On Tue 20-11-18 21:58:03, Baoquan He wrote:
> Hi,
>
> On 11/20/18 at 02:38pm, Vlastimil Babka wrote:
> > On 11/20/18 6:44 AM, Hugh Dickins wrote:
> > > [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
> > >
> > > We have all assumed that it is essential to hold a page reference
Hi,
On 11/20/18 at 02:38pm, Vlastimil Babka wrote:
> On 11/20/18 6:44 AM, Hugh Dickins wrote:
> > [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
> >
> > We have all assumed that it is essential to hold a page reference while
> > waiting on a page lock: partly to guarantee that
On 11/20/18 6:44 AM, Hugh Dickins wrote:
> [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
>
> We have all assumed that it is essential to hold a page reference while
> waiting on a page lock: partly to guarantee that there is still a struct
> page when MEMORY_HOTREMOVE is
On Tue, 20 Nov 2018, Baoquan He wrote:
> On 11/19/18 at 09:59pm, Michal Hocko wrote:
> > On Mon 19-11-18 12:34:09, Hugh Dickins wrote:
> > > I'm glad that I delayed, what I had then (migration_waitqueue instead
> > > of using page_waitqueue) was not wrong, but what I've been using the
> > > last
On 11/19/18 at 09:59pm, Michal Hocko wrote:
> On Mon 19-11-18 12:34:09, Hugh Dickins wrote:
> > I'm glad that I delayed, what I had then (migration_waitqueue instead
> > of using page_waitqueue) was not wrong, but what I've been using the
> > last couple of months is rather better (and can be put
On Mon 19-11-18 12:34:09, Hugh Dickins wrote:
> On Mon, 19 Nov 2018, Michal Hocko wrote:
> > On Mon 19-11-18 15:10:16, Michal Hocko wrote:
> > [...]
> > > In other words. Why cannot we do the following?
> >
> > Baoquan, this is certainly not the right fix but I would be really
> > curious whether
On Mon, 19 Nov 2018, Michal Hocko wrote:
> On Mon 19-11-18 15:10:16, Michal Hocko wrote:
> [...]
> > In other words. Why cannot we do the following?
>
> Baoquan, this is certainly not the right fix but I would be really
> curious whether it makes the problem go away.
>
> > diff --git
On Mon 19-11-18 15:10:16, Michal Hocko wrote:
[...]
> In other words. Why cannot we do the following?
Baoquan, this is certainly not the right fix but I would be really
curious whether it makes the problem go away.
> diff --git a/mm/migrate.c b/mm/migrate.c
> index f7e4bfdc13b7..7ccab29bcf9a
On Mon 19-11-18 17:48:35, Vlastimil Babka wrote:
> On 11/19/18 5:46 PM, Vlastimil Babka wrote:
> > On 11/19/18 5:46 PM, Michal Hocko wrote:
> >> On Mon 19-11-18 17:36:21, Vlastimil Babka wrote:
> >>>
> >>> So what protects us from locking a page whose refcount dropped to zero?
> >>> and is being
On 11/19/18 5:46 PM, Vlastimil Babka wrote:
> On 11/19/18 5:46 PM, Michal Hocko wrote:
>> On Mon 19-11-18 17:36:21, Vlastimil Babka wrote:
>>>
>>> So what protects us from locking a page whose refcount dropped to zero?
>>> and is being freed? The checks in freeing path won't be happy about a
>>>
On 11/19/18 5:46 PM, Michal Hocko wrote:
> On Mon 19-11-18 17:36:21, Vlastimil Babka wrote:
>>
>> So what protects us from locking a page whose refcount dropped to zero?
>> and is being freed? The checks in freeing path won't be happy about a
>> stray lock.
>
> Nothing really prevents that. But
On Mon 19-11-18 17:36:21, Vlastimil Babka wrote:
> On 11/19/18 3:10 PM, Michal Hocko wrote:
> > On Mon 19-11-18 13:51:21, Michal Hocko wrote:
> >> On Mon 19-11-18 13:40:33, Michal Hocko wrote:
> >>> How are
> >>> we supposed to converge when the swapin code waits for the migration to
> >>> finish
On 11/19/18 3:10 PM, Michal Hocko wrote:
> On Mon 19-11-18 13:51:21, Michal Hocko wrote:
>> On Mon 19-11-18 13:40:33, Michal Hocko wrote:
>>> How are
>>> we supposed to converge when the swapin code waits for the migration to
>>> finish with the reference count elevated?
Indeed this looks wrong.
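The convergence problem is mechanical: migration succeeds only when it holds
the sole remaining references to the page. Condensed from
migrate_page_move_mapping() in 4.20-era mm/migrate.c:

	/* migration bails out if anyone else still holds a reference */
	if (!page_ref_freeze(page, expected_count))
		return -EAGAIN;

A swapin thread parked in migration_entry_wait() with its own reference pinned
therefore guarantees -EAGAIN on every pass: the waiter waits for migration to
finish, while migration waits for the waiter's reference to go away.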
On Mon 19-11-18 13:51:21, Michal Hocko wrote:
> On Mon 19-11-18 13:40:33, Michal Hocko wrote:
> > On Mon 19-11-18 18:52:02, Baoquan He wrote:
> > [...]
> >
> > There are a few stacks directly in the offline path but those should be
> > OK.
> > The real culprit seems to be the swap in code
> >
> >
On Mon 19-11-18 13:40:33, Michal Hocko wrote:
> On Mon 19-11-18 18:52:02, Baoquan He wrote:
> [...]
>
> There are a few stacks directly in the offline path but those should be
> OK.
> The real culprit seems to be the swap in code
>
> > [ +1.734416] CPU: 255 PID: 5558 Comm: stress Tainted: G
On Mon 19-11-18 18:52:02, Baoquan He wrote:
[...]
There are a few stacks directly in the offline path but those should be
OK.
The real culprit seems to be the swap in code
> [ +1.734416] CPU: 255 PID: 5558 Comm: stress Tainted: G L
> 4.20.0-rc2+ #7
> [ +0.007927] Hardware name:
On 11/16/18 at 10:14am, Michal Hocko wrote:
> Could you try to apply this debugging patch on top please? It will dump
> stack trace for each reference count elevation for one page that fails
> to migrate after multiple passes.
Thanks, I applied it and fixed two code issues. The dmesg has been sent to
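The debugging patch itself is elided above; per its description, the idea is
to single out one page that repeatedly fails to migrate and dump a stack trace
on every reference-count elevation. A hypothetical sketch of that kind of
instrumentation (migrate_debug_page and page_ref_debug() are illustrative
names, not mainline symbols):

	static struct page *migrate_debug_page;	/* set by the migration retry path */

	static inline void page_ref_debug(struct page *page)
	{
		if (unlikely(page == READ_ONCE(migrate_debug_page))) {
			pr_info("ref elevated on page %px count=%d\n",
				page, page_count(page));
			dump_stack();	/* who is taking the reference? */
		}
	}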
On Fri 16-11-18 09:24:33, Baoquan He wrote:
> On 11/15/18 at 03:32pm, Michal Hocko wrote:
> > On Thu 15-11-18 21:38:40, Baoquan He wrote:
> > > On 11/15/18 at 02:19pm, Michal Hocko wrote:
> > > > On Thu 15-11-18 21:12:11, Baoquan He wrote:
> > > > > On 11/15/18 at 09:30am, Michal Hocko wrote:
> >
On 11/15/18 at 03:32pm, Michal Hocko wrote:
> On Thu 15-11-18 21:38:40, Baoquan He wrote:
> > On 11/15/18 at 02:19pm, Michal Hocko wrote:
> > > On Thu 15-11-18 21:12:11, Baoquan He wrote:
> > > > On 11/15/18 at 09:30am, Michal Hocko wrote:
> > > [...]
> > > > > It would be also good to find out
On Thu 15-11-18 21:38:40, Baoquan He wrote:
> On 11/15/18 at 02:19pm, Michal Hocko wrote:
> > On Thu 15-11-18 21:12:11, Baoquan He wrote:
> > > On 11/15/18 at 09:30am, Michal Hocko wrote:
> > [...]
> > > > It would be also good to find out whether this is fs specific. E.g. does
> > > > it make any
On Thu 15-11-18 21:23:42, Baoquan He wrote:
> On 11/15/18 at 02:19pm, Michal Hocko wrote:
> > On Thu 15-11-18 21:12:11, Baoquan He wrote:
> > > On 11/15/18 at 09:30am, Michal Hocko wrote:
> > [...]
> > > > It would be also good to find out whether this is fs specific. E.g. does
> > > > it make any
On 11/15/18 at 02:19pm, Michal Hocko wrote:
> On Thu 15-11-18 21:12:11, Baoquan He wrote:
> > On 11/15/18 at 09:30am, Michal Hocko wrote:
> [...]
> > > It would be also good to find out whether this is fs specific. E.g. does
> > > it make any difference if you use a different one for your stress
>
On Thu 15-11-18 21:12:11, Baoquan He wrote:
> On 11/15/18 at 09:30am, Michal Hocko wrote:
[...]
> > It would be also good to find out whether this is fs specific. E.g. does
> > it make any difference if you use a different one for your stress
> > testing?
>
> Created a ramdisk and put stress bin
On 11/15/18 at 09:30am, Michal Hocko wrote:
> On Thu 15-11-18 15:53:56, Baoquan He wrote:
> > On 11/15/18 at 08:30am, Michal Hocko wrote:
> > > On Thu 15-11-18 13:10:34, Baoquan He wrote:
> > > > On 11/14/18 at 04:00pm, Michal Hocko wrote:
> > > > > On Wed 14-11-18 22:52:50, Baoquan He wrote:
> >
On 15.11.18 10:52, Baoquan He wrote:
> On 11/15/18 at 10:42am, David Hildenbrand wrote:
>> I am wondering why it is always the last memory block of that device
>> (and even that node). Coincidence?
>
> I remember one or two times it was the last 6G or 4G which stalled there,
> the size of memory
On 11/15/18 at 10:42am, David Hildenbrand wrote:
> I am wondering why it is always the last memory block of that device
> (and even that node). Coincidence?
I remember one or two times it was the last 6G or 4G which stalled there;
the size of a memory block is 2G. But most of the time it's the last memory
On 15.11.18 09:30, Michal Hocko wrote:
> On Thu 15-11-18 15:53:56, Baoquan He wrote:
>> On 11/15/18 at 08:30am, Michal Hocko wrote:
>>> On Thu 15-11-18 13:10:34, Baoquan He wrote:
On 11/14/18 at 04:00pm, Michal Hocko wrote:
> On Wed 14-11-18 22:52:50, Baoquan He wrote:
>> On 11/14/18
On Thu 15-11-18 15:53:56, Baoquan He wrote:
> On 11/15/18 at 08:30am, Michal Hocko wrote:
> > On Thu 15-11-18 13:10:34, Baoquan He wrote:
> > > On 11/14/18 at 04:00pm, Michal Hocko wrote:
> > > > On Wed 14-11-18 22:52:50, Baoquan He wrote:
> > > > > On 11/14/18 at 10:01am, Michal Hocko wrote:
> >
On 11/15/18 at 08:30am, Michal Hocko wrote:
> On Thu 15-11-18 13:10:34, Baoquan He wrote:
> > On 11/14/18 at 04:00pm, Michal Hocko wrote:
> > > On Wed 14-11-18 22:52:50, Baoquan He wrote:
> > > > On 11/14/18 at 10:01am, Michal Hocko wrote:
> > > > > I have seen an issue when the migration cannot
On Thu 15-11-18 13:10:34, Baoquan He wrote:
> On 11/14/18 at 04:00pm, Michal Hocko wrote:
> > On Wed 14-11-18 22:52:50, Baoquan He wrote:
> > > On 11/14/18 at 10:01am, Michal Hocko wrote:
> > > > I have seen an issue when the migration cannot make a forward progress
> > > > because of a glibc page
On 11/14/18 at 04:00pm, Michal Hocko wrote:
> On Wed 14-11-18 22:52:50, Baoquan He wrote:
> > On 11/14/18 at 10:01am, Michal Hocko wrote:
> > > I have seen an issue when the migration cannot make a forward progress
> > > because of a glibc page with a reference count bumping up and down. Most
> >
On Wed 14-11-18 22:52:50, Baoquan He wrote:
> On 11/14/18 at 10:01am, Michal Hocko wrote:
> > I have seen an issue when the migration cannot make a forward progress
> > because of a glibc page with a reference count bumping up and down. Most
> > probable explanation is the faultaround code. I am
On 11/14/18 at 10:01am, Michal Hocko wrote:
> I have seen an issue when the migration cannot make a forward progress
> because of a glibc page with a reference count bumping up and down. Most
> probable explanation is the faultaround code. I am working on this and
> will post a patch soon. In any
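Faultaround is a plausible suspect because it takes a short-lived reference on
every page cache page it maps. Condensed from 4.20-era filemap_map_pages() in
mm/filemap.c (retry and error handling trimmed):

	xas_for_each(&xas, page, end_pgoff) {
		if (!page_cache_get_speculative(page))	/* transient reference */
			goto next;
		/* ... recheck the page, map it into the page table ... */
		put_page(page);				/* reference dropped again */
	}

Each such bump that overlaps migration's reference-count check fails that
pass; on a hot page such as glibc text, the bumps can recur faster than
migration retries.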
On Wed 14-11-18 10:48:09, David Hildenbrand wrote:
> On 14.11.18 10:41, Michal Hocko wrote:
> > On Wed 14-11-18 10:25:57, David Hildenbrand wrote:
> >> On 14.11.18 10:00, Baoquan He wrote:
> >>> Hi David,
> >>>
> >>> On 11/14/18 at 09:18am, David Hildenbrand wrote:
> Code seems to be waiting
[Cc Vladimir]
On Wed 14-11-18 15:09:09, Baoquan He wrote:
> Hi,
>
> Tested memory hotplug on a bare metal system, hot removing always
> triggers a lockup. It usually takes several hot plug/unplug cycles, then the hot
> removing will hang there at the last block. Surely with memory pressure
> added by
On 14.11.18 10:41, Michal Hocko wrote:
> On Wed 14-11-18 10:25:57, David Hildenbrand wrote:
>> On 14.11.18 10:00, Baoquan He wrote:
>>> Hi David,
>>>
>>> On 11/14/18 at 09:18am, David Hildenbrand wrote:
Code seems to be waiting for the mem_hotplug_lock in read.
We hold mem_hotplug_lock
On Wed 14-11-18 10:25:57, David Hildenbrand wrote:
> On 14.11.18 10:00, Baoquan He wrote:
> > Hi David,
> >
> > On 11/14/18 at 09:18am, David Hildenbrand wrote:
> >> Code seems to be waiting for the mem_hotplug_lock in read.
> >> We hold mem_hotplug_lock in write whenever we
On Wed 14-11-18 10:22:31, David Hildenbrand wrote:
> >>
> >> The real question is, however, why offlining of the last block doesn't
> >> succeed. In __offline_pages() we basically have an endless loop (while
> >> holding the mem_hotplug_lock in write). Now I consider this piece of
> >> code very
>>> Failing on ENOMEM is a questionable thing. I haven't seen that happening
>>> wildly but if it is a case then I wouldn't be opposed.
>>>
You mentioned memory pressure: if our host is under memory pressure we
can easily trigger running into an endless loop there, because we
On 14.11.18 10:00, Baoquan He wrote:
> Hi David,
>
> On 11/14/18 at 09:18am, David Hildenbrand wrote:
>> Code seems to be waiting for the mem_hotplug_lock in read.
>> We hold mem_hotplug_lock in write whenever we online/offline/add/remove
>> memory. There are two ways to trigger offlining of
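For reference, the locking described here, roughly as in 4.20-era
mm/memory_hotplug.c (condensed): writers take mem_hotplug_lock exclusively
through mem_hotplug_begin(), while paths that must keep memory from vanishing
take it shared:

	/* writer side: online/offline/add/remove memory */
	mem_hotplug_begin();		/* mem_hotplug_lock taken for write */
	/* ... the whole __offline_pages() loop runs under this lock ... */
	mem_hotplug_done();

	/* reader side, elsewhere in mm: */
	get_online_mems();		/* mem_hotplug_lock taken for read */
	/* ... touch memory that must not be offlined underneath us ... */
	put_online_mems();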
>>
>> The real question is, however, why offlining of the last block doesn't
>> succeed. In __offline_pages() we basically have an endless loop (while
>> holding the mem_hotplug_lock in write). Now I consider this piece of
>> code very problematic (we should automatically fail after X
>>
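The loop in question, condensed from 4.20-era __offline_pages()
(mm/memory_hotplug.c): success and a pending signal are its only exits, so a
single unmigratable page keeps it, and the write-held mem_hotplug_lock,
spinning indefinitely:

repeat:
	ret = -EINTR;
	if (signal_pending(current))
		goto failed_removal;	/* the only way out short of success */

	cond_resched();
	lru_add_drain_all();

	pfn = scan_movable_pages(start_pfn, end_pfn);
	if (pfn) {			/* still some movable pages left */
		do_migrate_range(pfn, end_pfn);
		goto repeat;		/* retry regardless of how migration fared */
	}

	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
	if (offlined_pages < 0)
		goto repeat;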
On Wed 14-11-18 09:18:09, David Hildenbrand wrote:
> On 14.11.18 08:09, Baoquan He wrote:
> > Hi,
> >
> > Tested memory hotplug on a bare metal system, hot removing always
> > triggers a lockup. It usually takes several hot plug/unplug cycles, then the hot
> > removing will hang there at the last block.
Hi David,
On 11/14/18 at 09:18am, David Hildenbrand wrote:
> Code seems to be waiting for the mem_hotplug_lock in read.
> We hold mem_hotplug_lock in write whenever we online/offline/add/remove
> memory. There are two ways to trigger offlining of memory:
>
> 1. Offlining via "cat offline >