Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-10 Thread P. Christeas
On Saturday 08 November 2014, Vlastimil Babka wrote:
> From fbf8eb0bcd2897090312e23da6a31bad9cc6b337 Mon Sep 17 00:00:00 2001
> 
> From: Vlastimil Babka 
> Date: Sat, 8 Nov 2014 22:20:43 +0100
> Subject: [PATCH] mm, compaction: prevent endless loop in migrate scanner

After 30 hours of uptime, I also mark this test as PASSED.

:)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-10 Thread Joonsoo Kim
On Mon, Nov 10, 2014 at 08:53:38AM +0100, Vlastimil Babka wrote:
> On 11/10/2014 07:07 AM, Joonsoo Kim wrote:
> >On Sat, Nov 08, 2014 at 11:18:37PM +0100, Vlastimil Babka wrote:
> >>On 11/08/2014 02:11 PM, P. Christeas wrote:
> >>
> >>Hi,
> >>
> >>I think I finally found the cause by staring into the code... CCing
> >>people from all 4 separate threads I know about this issue.
> >>The problem with finding the cause was that the first report I got from
> >>Markus was about isolate_freepages_block() overhead, and later Norbert
> >>reported that reverting a patch for isolate_freepages* helped. But the
> >>problem seems to be that although the loop in isolate_migratepages exits
> >>because the scanners almost meet (they are within same pageblock), they
> >>don't truly meet, therefore compact_finished() decides to continue, but
> >>isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
> >>made this situation possible, as free scanner pfn can now point to the
> >>middle of a pageblock.
> >
> >Indeed.
> >
> >>
> >>So I hope the attached patch will fix the soft-lockup issues in
> >>compact_zone. Please apply on 3.18-rc3 or later without any other reverts,
> >>and test. It probably won't help Markus and his isolate_freepages_block()
> >>overhead though...
> >
> >Yes, I found this bug too, but, it can't explain
> >isolate_freepages_block() overhead. Anyway, I can't find another bug
> >related to isolate_freepages_block(). :/
> 
> Thanks for checking.
> 
> >>Thanks,
> >>Vlastimil
> >>
> >>--8<--
> >>From fbf8eb0bcd2897090312e23da6a31bad9cc6b337 Mon Sep 17 00:00:00 2001
> >>From: Vlastimil Babka 
> >>Date: Sat, 8 Nov 2014 22:20:43 +0100
> >>Subject: [PATCH] mm, compaction: prevent endless loop in migrate scanner
> >>
> >>---
> >>  mm/compaction.c | 8 ++--
> >>  1 file changed, 6 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/mm/compaction.c b/mm/compaction.c
> >>index ec74cf0..1b7a1be 100644
> >>--- a/mm/compaction.c
> >>+++ b/mm/compaction.c
> >>@@ -1029,8 +1029,12 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
> >> 	}
> >>
> >> 	acct_isolated(zone, cc);
> >>-	/* Record where migration scanner will be restarted */
> >>-	cc->migrate_pfn = low_pfn;
> >>+	/*
> >>+	 * Record where migration scanner will be restarted. If we end up in
> >>+	 * the same pageblock as the free scanner, make the scanners fully
> >>+	 * meet so that compact_finished() terminates compaction.
> >>+	 */
> >>+	cc->migrate_pfn = (end_pfn <= cc->free_pfn) ? low_pfn : cc->free_pfn;
> >>
> >> 	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
> >> }
> >
> >IMHO, proper fix is not to change this logic, but, to change decision
> >logic in compact_finished() and in compact_zone(). Maybe helper
> >function would be good for readability.
> 
> OK but I would think that to fix 3.18 ASAP and not introduce more
> regressions, go with the patch above first, as it is the minimal fix
> and people already test it. Then we can implement your suggestion
> later as a cleanup for 3.19?

Yeap. Agreed.

Thanks.


Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread Vlastimil Babka

On 11/10/2014 07:07 AM, Joonsoo Kim wrote:
>On Sat, Nov 08, 2014 at 11:18:37PM +0100, Vlastimil Babka wrote:
>>On 11/08/2014 02:11 PM, P. Christeas wrote:
>>
>>Hi,
>>
>>I think I finally found the cause by staring into the code... CCing
>>people from all 4 separate threads I know about this issue.
>>The problem with finding the cause was that the first report I got from
>>Markus was about isolate_freepages_block() overhead, and later Norbert
>>reported that reverting a patch for isolate_freepages* helped. But the
>>problem seems to be that although the loop in isolate_migratepages exits
>>because the scanners almost meet (they are within same pageblock), they
>>don't truly meet, therefore compact_finished() decides to continue, but
>>isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
>>made this situation possible, as free scanner pfn can now point to the
>>middle of a pageblock.
>
>Indeed.
>
>>
>>So I hope the attached patch will fix the soft-lockup issues in
>>compact_zone. Please apply on 3.18-rc3 or later without any other reverts,
>>and test. It probably won't help Markus and his isolate_freepages_block()
>>overhead though...
>
>Yes, I found this bug too, but, it can't explain
>isolate_freepages_block() overhead. Anyway, I can't find another bug
>related to isolate_freepages_block(). :/

Thanks for checking.

>>Thanks,
>>Vlastimil
>>
>>--8<--
>>From fbf8eb0bcd2897090312e23da6a31bad9cc6b337 Mon Sep 17 00:00:00 2001
>>From: Vlastimil Babka
>>Date: Sat, 8 Nov 2014 22:20:43 +0100
>>Subject: [PATCH] mm, compaction: prevent endless loop in migrate scanner
>>
>>---
>>  mm/compaction.c | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>>diff --git a/mm/compaction.c b/mm/compaction.c
>>index ec74cf0..1b7a1be 100644
>>--- a/mm/compaction.c
>>+++ b/mm/compaction.c
>>@@ -1029,8 +1029,12 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
>> 	}
>>
>> 	acct_isolated(zone, cc);
>>-	/* Record where migration scanner will be restarted */
>>-	cc->migrate_pfn = low_pfn;
>>+	/*
>>+	 * Record where migration scanner will be restarted. If we end up in
>>+	 * the same pageblock as the free scanner, make the scanners fully
>>+	 * meet so that compact_finished() terminates compaction.
>>+	 */
>>+	cc->migrate_pfn = (end_pfn <= cc->free_pfn) ? low_pfn : cc->free_pfn;
>>
>> 	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
>> }
>
>IMHO, proper fix is not to change this logic, but, to change decision
>logic in compact_finished() and in compact_zone(). Maybe helper
>function would be good for readability.

OK but I would think that to fix 3.18 ASAP and not introduce more
regressions, go with the patch above first, as it is the minimal fix and
people already test it. Then we can implement your suggestion later as a
cleanup for 3.19?

Vlastimil

>Thanks.





Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread Joonsoo Kim
On Sat, Nov 08, 2014 at 11:18:37PM +0100, Vlastimil Babka wrote:
> On 11/08/2014 02:11 PM, P. Christeas wrote:
> > On Thursday 06 November 2014, Vlastimil Babka wrote:
> >>> On Wednesday 05 November 2014, Vlastimil Babka wrote:
> >>>> Can you please try the following patch?
> >>>> -compaction_defer_reset(zone, order, false);
> >> Oh and did I ask in this thread for /proc/zoneinfo yet? :)
> > 
> > Using that same kernel[1], got again into a race, gathered a few more data.
> > 
> > This time, I had 1x "urpmq" process [2] hung at 100% CPU , when "kwin" got 
> > apparently blocked (100% CPU, too) trying to resize a GUI window. I suppose 
> > the resizing operation would mean heavy memory alloc/free.
> > 
> > The rest of the system was responsive, I could easily get a console, login, 
> > gather the files.. Then, I have *killed* -9 the "urpmq" process, which solved
> > the race and my system is still alive! "kwin" is still running, returned to
> > regular CPU load.
> >
> > Attached is traces from SysRq+l (pressed a few times, wanted to "snapshot" the
> > stack) and /proc/zoneinfo + /proc/vmstat
> >
> > Bisection is not yet meaningful, IMHO, because I cannot be sure that "good"
> > points are really free from this issue. I'd estimate that each test would take
> > +3days, unless I really find a deterministic way to reproduce the issue .
> 
> Hi,
> 
> I think I finally found the cause by staring into the code... CCing
> people from all 4 separate threads I know about this issue.
> The problem with finding the cause was that the first report I got from
> Markus was about isolate_freepages_block() overhead, and later Norbert
> reported that reverting a patch for isolate_freepages* helped. But the
> problem seems to be that although the loop in isolate_migratepages exits
> because the scanners almost meet (they are within same pageblock), they
> don't truly meet, therefore compact_finished() decides to continue, but
> isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
> made this situation possible, as free scanner pfn can now point to the
> middle of a pageblock.

Indeed.

> 
> So I hope the attached patch will fix the soft-lockup issues in
> compact_zone. Please apply on 3.18-rc3 or later without any other reverts,
> and test. It probably won't help Markus and his isolate_freepages_block()
> overhead though...

Yes, I found this bug too, but, it can't explain
isolate_freepages_block() overhead. Anyway, I can't find another bug
related to isolate_freepages_block(). :/

> Thanks,
> Vlastimil
> 
> --8<--
> From fbf8eb0bcd2897090312e23da6a31bad9cc6b337 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka 
> Date: Sat, 8 Nov 2014 22:20:43 +0100
> Subject: [PATCH] mm, compaction: prevent endless loop in migrate scanner
> 
> ---
>  mm/compaction.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index ec74cf0..1b7a1be 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1029,8 +1029,12 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
>   }
>  
>   acct_isolated(zone, cc);
> - /* Record where migration scanner will be restarted */
> - cc->migrate_pfn = low_pfn;
> + /* 
> +  * Record where migration scanner will be restarted. If we end up in
> +  * the same pageblock as the free scanner, make the scanners fully
> +  * meet so that compact_finished() terminates compaction.
> +  */
> + cc->migrate_pfn = (end_pfn <= cc->free_pfn) ? low_pfn : cc->free_pfn;
>  
>   return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
>  }

IMHO, proper fix is not to change this logic, but, to change decision
logic in compact_finished() and in compact_zone(). Maybe helper
function would be good for readability.

Thanks.
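
One possible shape for the helper suggested above is sketched here. It is an illustration only, not code from this thread: struct toy_cc, toy_scanners_met() and the pageblock order of 9 are assumptions made for the example. The idea is to treat the scanners as met once they share a pageblock, instead of requiring free_pfn <= migrate_pfn exactly.

```c
#include <stdbool.h>

/* Assumed pageblock order for the sketch (512 pages per pageblock). */
#define TOY_PAGEBLOCK_ORDER 9

/* Hypothetical stand-in for the scanner positions kept in struct compact_control. */
struct toy_cc {
	unsigned long migrate_pfn;	/* migration scanner position */
	unsigned long free_pfn;		/* free scanner position, may be mid-pageblock */
};

/*
 * Hypothetical helper: compare pageblock indices rather than raw pfns, so
 * that "almost met" (same pageblock) already counts as met.
 */
static bool toy_scanners_met(const struct toy_cc *cc)
{
	return (cc->free_pfn >> TOY_PAGEBLOCK_ORDER)
		<= (cc->migrate_pfn >> TOY_PAGEBLOCK_ORDER);
}
```

With such a check used in both compact_finished() and compact_zone(), a free scanner sitting in the middle of the pageblock that the migration scanner has reached would end the run without adjusting cc->migrate_pfn.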


Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread Hillf Danton
> >
> > I guess this one would mitigate against Vlastimil's migration scanner issue,
> > wouldn't it?
> 
Nope, I wanted to see if free pages are low enough.

> Please no, that's a wrong fix. The purpose of compaction is to make the
> high-order watermark meet, not give up.
> 
Yupe, have to spin.

--- a/mm/compaction.c   Sun Nov  9 12:02:59 2014
+++ b/mm/compaction.c   Mon Nov 10 11:12:07 2014
@@ -1074,6 +1074,8 @@ static int compact_finished(struct zone 
 	watermark = low_wmark_pages(zone);
 	watermark += (1 << cc->order);
 
+	if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
+		return COMPACT_SKIPPED;
 	if (!zone_watermark_ok(zone, cc->order, watermark, 0, 0))
 		return COMPACT_CONTINUE;
 
--
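
For a sense of scale, the threshold used in the hunk above is low_wmark_pages(zone) + (1 << cc->order). The sketch below only does that arithmetic. The toy_* names are hypothetical, order 9 (a 2 MB huge page with 4 KB base pages) is an assumption suggested by do_huge_pmd_anonymous_page in the posted backtrace, and the order-0 comparison is deliberately simplified relative to zone_watermark_ok(), which also accounts for lowmem reserves.

```c
#include <stdbool.h>

/* Threshold as computed in compact_finished(): low watermark plus one block of the requested order. */
static unsigned long toy_compaction_watermark(unsigned long low_wmark, unsigned int order)
{
	return low_wmark + (1UL << order);
}

/* Simplified order-0 check: plain free-page count against the threshold. */
static bool toy_order0_watermark_ok(unsigned long free_pages, unsigned long watermark)
{
	return free_pages > watermark;
}
```

Plugging in the DMA32 values from the /proc/zoneinfo snapshot P. Christeas posted on 2014-11-08 (low 14006, free 51824) gives a threshold of 14006 + 512 = 14518 pages, well below the free count, so under these assumptions the added order-0 check would not have returned COMPACT_SKIPPED for that snapshot.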




Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread Norbert Preining
Hi Vlastimil, hi all,

On Sun, 09 Nov 2014, Vlastimil Babka wrote:
> I don't want to send untested fix, and wasn't able to reproduce the bug
> myself. I think Norbert could do it rather quickly so I hope he can tell
> us soon.

Sorry, weekend means I am away from my laptop for extended times,
and I wanted to give it a bit of stress testing.

No problems till now, no hangs, all working as expected with
your latest patch.

Thanks a lot

Norbert


PREINING, Norbert   http://www.preining.info
JAIST, Japan TeX Live & Debian Developer
GPG: 0x860CDC13   fp: F7D8 A928 26E3 16A1 9FA0  ACF0 6CAC A448 860C DC13



Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread Vlastimil Babka
On 11/09/2014 09:27 AM, Pavel Machek wrote:
> Hi!
> 
>>>> Oh and did I ask in this thread for /proc/zoneinfo yet? :)
>>>
>>> Using that same kernel[1], got again into a race, gathered a few more data.
>>>
>>> This time, I had 1x "urpmq" process [2] hung at 100% CPU , when "kwin" got 
>>> apparently blocked (100% CPU, too) trying to resize a GUI window. I suppose 
>>> the resizing operation would mean heavy memory alloc/free.
>>>
>>> The rest of the system was responsive, I could easily get a console, login, 
>>> gather the files.. Then, I have *killed* -9 the "urpmq" process, which solved
>>> the race and my system is still alive! "kwin" is still running, returned to
>>> regular CPU load.
>>>
>>> Attached is traces from SysRq+l (pressed a few times, wanted to "snapshot" the
>>> stack) and /proc/zoneinfo + /proc/vmstat
>>>
>>> Bisection is not yet meaningful, IMHO, because I cannot be sure that "good"
>>> points are really free from this issue. I'd estimate that each test would take
>>> +3days, unless I really find a deterministic way to reproduce the issue .
>>
>> Hi,
>>
>> I think I finally found the cause by staring into the code... CCing
>> people from all 4 separate threads I know about this issue.
>> The problem with finding the cause was that the first report I got from
>> Markus was about isolate_freepages_block() overhead, and later Norbert
>> reported that reverting a patch for isolate_freepages* helped. But the
>> problem seems to be that although the loop in isolate_migratepages exits
>> because the scanners almost meet (they are within same pageblock), they
>> don't truly meet, therefore compact_finished() decides to continue, but
>> isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
>> made this situation possible, as free scanner pfn can now point to the
>> middle of a pageblock.
> 
> Ok, it seems it happened second time now, again shortly after
> resume. I guess I should apply your patch after all.

Thanks.

> (Or... instead it should go to Linus ASAP -- it fixes a known problem
> that is affecting people, and we want it in soon in case it is not
> a complete fix.)

I don't want to send untested fix, and wasn't able to reproduce the bug
myself. I think Norbert could do it rather quickly so I hope he can tell
us soon.

> Dmesg is in the attachment, perhaps it helps.
>   Pavel

It looks the same as before, so no surprises there, which is good.



Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread Vlastimil Babka
On 11/09/2014 09:22 AM, P. Christeas wrote:
> On Sunday 09 November 2014, Hillf Danton wrote:
>> -return COMPACT_CONTINUE;
>> +return COMPACT_SKIPPED;
> 
> I guess this one would mitigate against Vlastimil's migration scanner issue, 
> wouldn't it?

Please no, that's a wrong fix. The purpose of compaction is to make the
high-order watermark meet, not give up.

> In that case, I should wait a bit[1] to try the first patch, then revert, try 
> yours and (hopefully) have some results.

I hope my patch will be enough.

> Then, apply both.
> 
> [1] trying to push the vm by loading memory-hungry apps and random load.

Maybe tools/testing/selftests/vm/transhuge-stress could help.



Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread Pavel Machek
Hi!

> >> Oh and did I ask in this thread for /proc/zoneinfo yet? :)
> > 
> > Using that same kernel[1], got again into a race, gathered a few more data.
> > 
> > This time, I had 1x "urpmq" process [2] hung at 100% CPU , when "kwin" got 
> > apparently blocked (100% CPU, too) trying to resize a GUI window. I suppose 
> > the resizing operation would mean heavy memory alloc/free.
> > 
> > The rest of the system was responsive, I could easily get a console, login, 
> > gather the files.. Then, I have *killed* -9 the "urpmq" process, which solved
> > the race and my system is still alive! "kwin" is still running, returned to
> > regular CPU load.
> >
> > Attached is traces from SysRq+l (pressed a few times, wanted to "snapshot" the
> > stack) and /proc/zoneinfo + /proc/vmstat
> >
> > Bisection is not yet meaningful, IMHO, because I cannot be sure that "good"
> > points are really free from this issue. I'd estimate that each test would take
> > +3days, unless I really find a deterministic way to reproduce the issue .
> 
> Hi,
> 
> I think I finally found the cause by staring into the code... CCing
> people from all 4 separate threads I know about this issue.
> The problem with finding the cause was that the first report I got from
> Markus was about isolate_freepages_block() overhead, and later Norbert
> reported that reverting a patch for isolate_freepages* helped. But the
> problem seems to be that although the loop in isolate_migratepages exits
> because the scanners almost meet (they are within same pageblock), they
> don't truly meet, therefore compact_finished() decides to continue, but
> isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
> made this situation possible, as free scanner pfn can now point to the
> middle of a pageblock.

Ok, it seems it happened second time now, again shortly after
resume. I guess I should apply your patch after all.

(Or... instead it should go to Linus ASAP -- it fixes a known problem
that is affecting people, and we want it in soon in case it is not
a complete fix.)

Dmesg is in the attachment, perhaps it helps.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


delme.gz
Description: application/gzip


Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-09 Thread P. Christeas
On Sunday 09 November 2014, Hillf Danton wrote:
> - return COMPACT_CONTINUE;
> + return COMPACT_SKIPPED;

I guess this one would mitigate against Vlastimil's migration scanner issue, 
wouldn't it?

In that case, I should wait a bit[1] to try the first patch, then revert, try 
yours and (hopefully) have some results.

Then, apply both.

[1] trying to push the vm by loading memory-hungry apps and random load.


Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-08 Thread Hillf Danton
> > Can you please try the following patch?
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -1325,13 +1325,6 @@ unsigned long try_to_compact_pages(struct zonelist
> > -   compaction_defer_reset(zone, order, false);
> 
> NACK :(
> 
> I just got again into a state that some task was spinning out of control, and
> blocking the rest of the desktop.
> 
Would you please try this diff (against 3.18-rc3) if there is no other progress?

--- a/mm/compaction.c   Sun Nov  9 12:02:59 2014
+++ b/mm/compaction.c   Sun Nov  9 12:07:30 2014
@@ -1070,12 +1070,12 @@ static int compact_finished(struct zone 
 	if (cc->order == -1)
 		return COMPACT_CONTINUE;
 
-	/* Compaction run is not finished if the watermark is not met */
+	/* Compaction run is skipped if the watermark is not met */
 	watermark = low_wmark_pages(zone);
 	watermark += (1 << cc->order);
 
 	if (!zone_watermark_ok(zone, cc->order, watermark, 0, 0))
-		return COMPACT_CONTINUE;
+		return COMPACT_SKIPPED;
 
 	/* Direct compactor: Is a suitable page free? */
 	for (order = cc->order; order < MAX_ORDER; order++) {
--

> You will see me trying a few things, apparently the first OOM managed to
> unblock something, but a few seconds later the system "stepped" on some other
> blocking task.
> 
> See attached log, it may only give you some hint; the problem could well be in
> some other part of the kernel.
> 
> In the meanwhile, I'm pulling linus/master ...
> 
> SysRq : Show backtrace of all active CPUs
> sending NMI to all CPUs:
> NMI backtrace for cpu 1
> CPU: 1 PID: 13544 Comm: python Not tainted 3.18.0-rc3+ #46
> Hardware name: Acer TravelMate 5720/Columbia, BIOS V1.34 04/15/2008
> task: 88000c78ee40 ti: 88000e5f8000 task.ti: 88000e5f8000
> RIP: 0010:[]  [] delay_tsc+0x28/0xa2
> RSP: :8800bf303b28  EFLAGS: 0002
> RAX: 6bd322e8 RBX: 2710 RCX: 0007
> RDX: 021d RSI: 8151623e RDI: 8152fea5
> RBP: 8800bf303b48 R08: 0400 R09: 
> R10: 0046 R11: 0046 R12: 00185ac0
> R13: 0001 R14: 0001 R15: 81668f90
> FS:  7f1570ed1700() GS:8800bf30() knlGS:
> CS:  0010 DS:  ES:  CR0: 8005003b
> CR2: 7f7cd966b000 CR3: 740c9000 CR4: 07e0
> Stack:
>  2710 0003 006c 0001
>  8800bf303b58 811df814 8800bf303b68 811df83d
>  8800bf303b88 81025de1 80010002 816692b0
> Call Trace:
>  
> 
>  [] __delay+0xa/0xc
>  [] __const_udelay+0x27/0x29
>  [] arch_trigger_all_cpu_backtrace+0xa8/0xd2
>  [] sysrq_handle_showallcpus+0xe/0x10
>  [] __handle_sysrq+0x94/0x126
>  [] sysrq_filter+0xee/0x287
>  [] input_to_handler+0x5e/0xcb
>  [] input_pass_values.part.3+0x76/0x134
>  [] input_handle_event+0x457/0x46d
>  [] input_event+0x55/0x6f
>  [] input_sync+0xf/0x11
>  [] atkbd_interrupt+0x4d5/0x595
>  [] serio_interrupt+0x43/0x7d
>  [] i8042_interrupt+0x292/0x2a8
>  [] ? tick_sched_do_timer+0x33/0x33
>  [] handle_irq_event_percpu+0x44/0x19f
>  [] handle_irq_event+0x3c/0x5c
>  [] ? apic_eoi+0x18/0x1a
>  [] handle_edge_irq+0x95/0xae
>  [] handle_irq+0x158/0x16d
>  [] ? get_parent_ip+0xe/0x3e
>  [] do_IRQ+0x58/0xda
>  [] common_interrupt+0x6a/0x6a
>  
> 
>  [] ? rcu_read_unlock_sched_notrace+0x17/0x17
>  [] ? compact_zone+0x2a8/0x4b2
>  [] compact_zone_order+0x4c/0x5f
>  [] try_to_compact_pages+0xc4/0x1d6
>  [] __alloc_pages_direct_compact+0x61/0x1bf
>  [] __alloc_pages_nodemask+0x409/0x799
>  [] ? anon_vma_prepare+0xf5/0x12c
>  [] do_huge_pmd_anonymous_page+0x13c/0x255
>  [] ? mmap_region+0x171/0x458
>  [] handle_mm_fault+0x112/0x808
>  [] __do_page_fault+0x27a/0x358
>  [] ? do_mmap_pgoff+0x2b8/0x306
>  [] ? vm_mmap_pgoff+0x82/0xaa
>  [] ? SyS_mmap_pgoff+0x183/0x1cf
>  [] do_page_fault+0xc/0xe
>  [] page_fault+0x22/0x30
> Code: ff 5d c3 55 48 89 e5 41 56 41 55 41 54 41 89 fc bf 01 00 00 00 53 e8 f7 
> 6f e7 ff e8 9a 9c 00 00 41 89 c5 0f 1f 00 0f ae e8 0f 31 <89> c3 0f
> 1f 00 0f ae e8 0f 31 48 c1 e2 20 89 c0 48 09 c2 41 89
> NMI backtrace for cpu 0
> CPU: 0 PID: 13788 Comm: net_applet Not tainted 3.18.0-rc3+ #46
> Hardware name: Acer TravelMate 5720/Columbia, BIOS V1.34 04/15/2008
> task: 8800067a3720 ti: 88000e20c000 task.ti: 88000e20c000
> RIP: 0010:[]  [] compact_zone+0x3c4/0x4b2
> RSP: :88000e20fa18  EFLAGS: 0202
> RAX:  RBX: 8168be40 RCX: 0008
> RDX: 0380 RSI: 0009 RDI: 8168be40
> RBP: 88000e20fa78 R08:  R09: fef5
> R10: 0038 R11: 8168be40 R12: 000bf800
> R13: 000bf600 R14: 88000

Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-08 Thread Vlastimil Babka
On 11/08/2014 02:11 PM, P. Christeas wrote:
> On Thursday 06 November 2014, Vlastimil Babka wrote:
>>> On Wednesday 05 November 2014, Vlastimil Babka wrote:
>>>> Can you please try the following patch?
>>>> -  compaction_defer_reset(zone, order, false);
>> Oh and did I ask in this thread for /proc/zoneinfo yet? :)
> 
> Using that same kernel[1], got again into a race, gathered a few more data.
> 
> This time, I had 1x "urpmq" process [2] hung at 100% CPU , when "kwin" got 
> apparently blocked (100% CPU, too) trying to resize a GUI window. I suppose 
> the resizing operation would mean heavy memory alloc/free.
> 
> The rest of the system was responsive, I could easily get a console, login, 
> gather the files.. Then, I have *killed* -9 the "urpmq" process, which solved 
> the race and my system is still alive! "kwin" is still running, returned to 
> regular CPU load.
> 
> Attached is traces from SysRq+l (pressed a few times, wanted to "snapshot" the
> stack) and /proc/zoneinfo + /proc/vmstat
>
> Bisection is not yet meaningful, IMHO, because I cannot be sure that "good"
> points are really free from this issue. I'd estimate that each test would take
> +3days, unless I really find a deterministic way to reproduce the issue .

Hi,

I think I finally found the cause by staring into the code... CCing
people from all 4 separate threads I know about this issue.
The problem with finding the cause was that the first report I got from
Markus was about isolate_freepages_block() overhead, and later Norbert
reported that reverting a patch for isolate_freepages* helped. But the
problem seems to be that although the loop in isolate_migratepages exits
because the scanners almost meet (they are within same pageblock), they
don't truly meet, therefore compact_finished() decides to continue, but
isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
made this situation possible, as free scanner pfn can now point to the
middle of a pageblock.

So I hope the attached patch will fix the soft-lockup issues in
compact_zone. Please apply on 3.18-rc3 or later without any other reverts,
and test. It probably won't help Markus and his isolate_freepages_block()
overhead though...

Thanks,
Vlastimil

--8<--
From fbf8eb0bcd2897090312e23da6a31bad9cc6b337 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Sat, 8 Nov 2014 22:20:43 +0100
Subject: [PATCH] mm, compaction: prevent endless loop in migrate scanner

---
 mm/compaction.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ec74cf0..1b7a1be 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1029,8 +1029,12 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
 	}
 
 	acct_isolated(zone, cc);
-	/* Record where migration scanner will be restarted */
-	cc->migrate_pfn = low_pfn;
+	/*
+	 * Record where migration scanner will be restarted. If we end up in
+	 * the same pageblock as the free scanner, make the scanners fully
+	 * meet so that compact_finished() terminates compaction.
+	 */
+	cc->migrate_pfn = (end_pfn <= cc->free_pfn) ? low_pfn : cc->free_pfn;
 
 	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
 }
-- 
2.1.2
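
The failure mode described in this mail can be reproduced in a toy user-space model. The sketch below is an illustration only, not kernel code: struct toy_cc, the toy_* functions, the 512-page pageblock size and the pfn values are all assumptions made for the example. It models one pass of the migration scanner and the "scanners met" test, and counts passes until the run terminates.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pageblock size for the sketch (512 pages); not taken from the thread. */
#define TOY_PAGEBLOCK_NR_PAGES 512UL

/* Hypothetical stand-in for the scanner positions kept in struct compact_control. */
struct toy_cc {
	unsigned long migrate_pfn;	/* migration scanner, moves upwards */
	unsigned long free_pfn;		/* free scanner, may sit mid-pageblock */
};

/* Toy model of the tail of isolate_migratepages(): scan at most one pageblock. */
static void toy_isolate_migratepages(struct toy_cc *cc, bool with_fix)
{
	unsigned long low_pfn = cc->migrate_pfn;
	/* End of the pageblock containing low_pfn. */
	unsigned long end_pfn =
		(low_pfn / TOY_PAGEBLOCK_NR_PAGES + 1) * TOY_PAGEBLOCK_NR_PAGES;

	/* A block is scanned only if it lies entirely below the free scanner. */
	if (end_pfn <= cc->free_pfn)
		low_pfn = end_pfn;	/* pretend the whole block was scanned */

	if (with_fix)
		cc->migrate_pfn = (end_pfn <= cc->free_pfn) ? low_pfn : cc->free_pfn;
	else
		cc->migrate_pfn = low_pfn;
}

/* Toy model of the termination test: the scanners must truly meet. */
static bool toy_scanners_met(const struct toy_cc *cc)
{
	return cc->free_pfn <= cc->migrate_pfn;
}

/* Returns the number of passes until the scanners meet, or -1 if max_iters is hit. */
static int toy_compact_zone(unsigned long free_pfn, bool with_fix, int max_iters)
{
	struct toy_cc cc = { .migrate_pfn = 0, .free_pfn = free_pfn };
	int iters = 0;

	assert(max_iters > 0);
	while (!toy_scanners_met(&cc)) {
		if (iters == max_iters)
			return -1;	/* no forward progress: the endless loop */
		toy_isolate_migratepages(&cc, with_fix);
		iters++;
	}
	return iters;
}
```

With free_pfn = 1636 (100 pages into the fourth pageblock), toy_compact_zone(1636, false, 1000) hits the pass limit and returns -1, while toy_compact_zone(1636, true, 1000) terminates. A pageblock-aligned free_pfn such as 1536 terminates either way, which matches the observation that the problem only became possible once the free scanner could point into the middle of a pageblock.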




Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-08 Thread P. Christeas
On Thursday 06 November 2014, Vlastimil Babka wrote:
> > On Wednesday 05 November 2014, Vlastimil Babka wrote:
> >> Can you please try the following patch?
> >> -  compaction_defer_reset(zone, order, false);
> Oh and did I ask in this thread for /proc/zoneinfo yet? :)

Using that same kernel[1], got again into a race, gathered a few more data.

This time, I had 1x "urpmq" process [2] hung at 100% CPU , when "kwin" got 
apparently blocked (100% CPU, too) trying to resize a GUI window. I suppose 
the resizing operation would mean heavy memory alloc/free.

The rest of the system was responsive, I could easily get a console, login, 
gather the files.. Then, I have *killed* -9 the "urpmq" process, which solved 
the race and my system is still alive! "kwin" is still running, returned to 
regular CPU load.

Attached is traces from SysRq+l (pressed a few times, wanted to "snapshot" the 
stack) and /proc/zoneinfo + /proc/vmstat

Bisection is not yet meaningful, IMHO, because I cannot be sure that "good" 
points are really free from this issue. I'd estimate that each test would take 
+3days, unless I really find a deterministic way to reproduce the issue .


Thank you, again.


[1] linus's didn't have any -mm changes, so I haven't compiled anything yet. 
This means it also contains the "- compaction_defer_reset()" change

[2] urpmq is a Mandrake distro Perl script for querying the RPM database. It 
does some disk I/O , loads data into allocated Perl structs and sorts that, 
FYI.
Node 0, zone  DMA
  pages free 3055
min  58
low  72
high 87
scanned  0
spanned  4095
present  3998
managed  3977
nr_free_pages 3055
nr_alloc_batch 15
nr_inactive_anon 295
nr_active_anon 132
nr_inactive_file 134
nr_active_file 198
nr_unevictable 0
nr_mlock 0
nr_anon_pages 388
nr_mapped 84
nr_file_pages 386
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 59
nr_slab_unreclaimable 32
nr_page_table_pages 57
nr_kernel_stack 2
nr_unstable  0
nr_bounce 0
nr_vmscan_write 1301
nr_vmscan_immediate_reclaim 272
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 19
nr_dirtied   21715
nr_written   20583
nr_pages_scanned 0
workingset_refault 7169
workingset_activate 1604
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma  0
protection: (0, 2984, 2984, 2984)
  pagesets
cpu: 0
  count: 0
  high:  0
  batch: 1
  vm stats threshold: 4
cpu: 1
  count: 0
  high:  0
  batch: 1
  vm stats threshold: 4
  all_unreclaimable: 0
  start_pfn: 1
  inactive_ratio: 1
Node 0, zone DMA32
  pages free 51824
min  11205
low  14006
high 16807
scanned  0
spanned  779984
present  779984
managed  764335
nr_free_pages 51824
nr_alloc_batch 42
nr_inactive_anon 108284
nr_active_anon 388047
nr_inactive_file 28047
nr_active_file 95328
nr_unevictable 0
nr_mlock 0
nr_anon_pages 72
nr_mapped 78178
nr_file_pages 186535
nr_dirty 236
nr_writeback 0
nr_slab_reclaimable 53697
nr_slab_unreclaimable 11297
nr_page_table_pages 18188
nr_kernel_stack 483
nr_unstable  0
nr_bounce 0
nr_vmscan_write 423678
nr_vmscan_immediate_reclaim 2915
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 50993
nr_dirtied   10098257
nr_written   8809535
nr_pages_scanned 0
workingset_refault 5683710
workingset_activate 1087302
workingset_nodereclaim 1664
nr_anon_transparent_hugepages 334
nr_free_cma  0
protection: (0, 0, 0, 0)
  pagesets
cpu: 0
  count: 155
  high:  186
  batch: 31
  vm stats threshold: 24
cpu: 1
  count: 49
  high:  186
  batch: 31
  vm stats threshold: 24
  all_unreclaimable: 0
  start_pfn: 4096
  inactive_ratio: 4

/proc/vmstat:
nr_free_pages 24041
nr_alloc_batch 1364
nr_inactive_anon 108048
nr_active_anon 397021
nr_inactive_file 42071
nr_active_file 102045
nr_unevictable 0
nr_mlock 0
nr_anon_pages 453175
nr_mapped 79221
nr_file_pages 208686
nr_dirty 977
nr_writeback 0
nr_slab_reclaimable 54008
nr_slab_unreclaimable 11475
nr_page_table_pages 19820
nr_kernel_stack 488
nr_unstable 0
nr_bounce 0
nr_vmscan_write 425540
nr_vmscan_immediate_reclaim 3187
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 50631
nr_dirtied 10151224
nr_written 8851175
nr_pages_scanned 0
workingset_refault 5711048
workingset_activate 1090895
workingset_nodereclaim 1664
nr_anon_transparent_hugepages 331
nr_free_cma 0
nr_dirty_threshold 29656
nr_dirty_background_threshold 14828
pgpgin 26370697
pgpgout 36940756
psw

Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-06 Thread Vlastimil Babka
On 11/06/2014 08:23 PM, P. Christeas wrote:
> On Wednesday 05 November 2014, Vlastimil Babka wrote:
>> Can you please try the following patch?
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1325,13 +1325,6 @@ unsigned long try_to_compact_pages(struct zonelist
>> -compaction_defer_reset(zone, order, false);
> 
> NACK :(

Sigh.

> I just got again into a state that some task was spinning out of control, and 
> blocking the rest of the desktop.

Well this is similar to reports [1] and [2], except that [1] points to
isolate_freepages_block() and your traces only go as deep as compact_zone, which
probably inlines isolate_migratepages*. I would expect that it cannot inline
isolate_freepages*, though, due to invocation via pointer.

> You will see me trying a few things, apparently the first OOM managed to 
> unblock something, but a few seconds later the system "stepped" on some other 
> blocking task.
> 
> See attached log, it may only give you some hint; the problem could well be 
> in 
> some other part of the kernel.

Well I doubt that but I'd like to be surprised :)

> In the meanwhile, I'm pulling linus/master ...

Could you perhaps bisect the most suspicious part? It's not a lot of commits
and you seem to be reproducing this quite easily?

commit 447f05bb488bff4282088259b04f47f0f9f76760 should be good
commit 6d7ce55940b6ecd463ca044ad241f0122d913293 should be bad

If that's true, then bisection should find the cause rather quickly.

Oh and did I ask in this thread for /proc/zoneinfo yet? :)

Thanks.

> kcrash.log
> 

[1]
http://article.gmane.org/gmane.linux.kernel.mm/124451/match=isolate_freepages_block+very+high+intermittent+overhead

[2] https://lkml.org/lkml/2014/11/4/904
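
To put "rather quickly" in numbers: bisecting N candidate commits needs roughly log2(N) test builds. The sketch below simulates that on a made-up linear history. The commit names, the history length, and the `is_bad` predicate are all hypothetical stand-ins, and real `git bisect` also has to cope with merges, which this ignores.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in a list ordered oldest to newest.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    Returns (first_bad_commit, number_of_tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad], tests


# Hypothetical history: c00 is the good endpoint, c65 the bad one,
# with 64 untested commits in between.
history = ["c%02d" % i for i in range(66)]
culprit_index = 41  # made up for the demo
culprit, tests = bisect_first_bad(
    history, lambda commit: int(commit[1:]) >= culprit_index
)
print(culprit, tests)
```

Each test halves the remaining range, so even a few hundred commits would come down to under ten rebuild-and-reproduce cycles, provided the hang reproduces reliably enough to classify each step.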



Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-06 Thread P. Christeas
On Wednesday 05 November 2014, Vlastimil Babka wrote:
> Can you please try the following patch?
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1325,13 +1325,6 @@ unsigned long try_to_compact_pages(struct zonelist
> - compaction_defer_reset(zone, order, false);

NACK :(

I just got again into a state that some task was spinning out of control, and 
blocking the rest of the desktop.

You will see me trying a few things, apparently the first OOM managed to 
unblock something, but a few seconds later the system "stepped" on some other 
blocking task.

See attached log, it may only give you some hint; the problem could well be in 
some other part of the kernel.

In the meanwhile, I'm pulling linus/master ...

SysRq : Show backtrace of all active CPUs
sending NMI to all CPUs:
NMI backtrace for cpu 1
CPU: 1 PID: 13544 Comm: python Not tainted 3.18.0-rc3+ #46
Hardware name: AcerTravelMate 5720/Columbia   , BIOS V1.34   04/15/2008
task: 88000c78ee40 ti: 88000e5f8000 task.ti: 88000e5f8000
RIP: 0010:[]  [] delay_tsc+0x28/0xa2
RSP: :8800bf303b28  EFLAGS: 0002
RAX: 6bd322e8 RBX: 2710 RCX: 0007
RDX: 021d RSI: 8151623e RDI: 8152fea5
RBP: 8800bf303b48 R08: 0400 R09: 
R10: 0046 R11: 0046 R12: 00185ac0
R13: 0001 R14: 0001 R15: 81668f90
FS:  7f1570ed1700() GS:8800bf30() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7f7cd966b000 CR3: 740c9000 CR4: 07e0
Stack:
 2710 0003 006c 0001
 8800bf303b58 811df814 8800bf303b68 811df83d
 8800bf303b88 81025de1 80010002 816692b0
Call Trace:
  

 [] __delay+0xa/0xc
 [] __const_udelay+0x27/0x29
 [] arch_trigger_all_cpu_backtrace+0xa8/0xd2
 [] sysrq_handle_showallcpus+0xe/0x10
 [] __handle_sysrq+0x94/0x126
 [] sysrq_filter+0xee/0x287
 [] input_to_handler+0x5e/0xcb
 [] input_pass_values.part.3+0x76/0x134
 [] input_handle_event+0x457/0x46d
 [] input_event+0x55/0x6f
 [] input_sync+0xf/0x11
 [] atkbd_interrupt+0x4d5/0x595
 [] serio_interrupt+0x43/0x7d
 [] i8042_interrupt+0x292/0x2a8
 [] ? tick_sched_do_timer+0x33/0x33
 [] handle_irq_event_percpu+0x44/0x19f
 [] handle_irq_event+0x3c/0x5c
 [] ? apic_eoi+0x18/0x1a
 [] handle_edge_irq+0x95/0xae
 [] handle_irq+0x158/0x16d
 [] ? get_parent_ip+0xe/0x3e
 [] do_IRQ+0x58/0xda
 [] common_interrupt+0x6a/0x6a
  

 [] ? rcu_read_unlock_sched_notrace+0x17/0x17
 [] ? compact_zone+0x2a8/0x4b2
 [] compact_zone_order+0x4c/0x5f
 [] try_to_compact_pages+0xc4/0x1d6
 [] __alloc_pages_direct_compact+0x61/0x1bf
 [] __alloc_pages_nodemask+0x409/0x799
 [] ? anon_vma_prepare+0xf5/0x12c
 [] do_huge_pmd_anonymous_page+0x13c/0x255
 [] ? mmap_region+0x171/0x458
 [] handle_mm_fault+0x112/0x808
 [] __do_page_fault+0x27a/0x358
 [] ? do_mmap_pgoff+0x2b8/0x306
 [] ? vm_mmap_pgoff+0x82/0xaa
 [] ? SyS_mmap_pgoff+0x183/0x1cf
 [] do_page_fault+0xc/0xe
 [] page_fault+0x22/0x30
Code: ff 5d c3 55 48 89 e5 41 56 41 55 41 54 41 89 fc bf 01 00 00 00 53 e8 f7 6f e7 ff e8 9a 9c 00 00 41 89 c5 0f 1f 00 0f ae e8 0f 31 <89> c3 0f 1f 00 0f ae e8 0f 31 48 c1 e2 20 89 c0 48 09 c2 41 89 
NMI backtrace for cpu 0
CPU: 0 PID: 13788 Comm: net_applet Not tainted 3.18.0-rc3+ #46
Hardware name: AcerTravelMate 5720/Columbia   , BIOS V1.34   04/15/2008
task: 8800067a3720 ti: 88000e20c000 task.ti: 88000e20c000
RIP: 0010:[]  [] compact_zone+0x3c4/0x4b2
RSP: :88000e20fa18  EFLAGS: 0202
RAX:  RBX: 8168be40 RCX: 0008
RDX: 0380 RSI: 0009 RDI: 8168be40
RBP: 88000e20fa78 R08:  R09: fef5
R10: 0038 R11: 8168be40 R12: 000bf800
R13: 000bf600 R14: 88000e20fa98 R15: 1600
FS:  7ff9cbe92700() GS:8800bf20() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 7ff3a52f CR3: 0af17000 CR4: 07f0
Stack:
 88000e20fa18 ea0002fd 0020 8800067a3720
 0004 88000e20faa8  
 0009 88000e20fccc 0002 
Call Trace:
 [] compact_zone_order+0x4c/0x5f
 [] try_to_compact_pages+0xc4/0x1d6
 [] __alloc_pages_direct_compact+0x61/0x1bf
 [] __alloc_pages_nodemask+0x409/0x799
 [] ? unlock_page+0x1f/0x23
 [] do_huge_pmd_wp_page+0x127/0x4eb
 [] handle_mm_fault+0x151/0x808
 [] __do_page_fault+0x27a/0x358
 [] ? do_page_fault+0xc/0xe
 [] ? page_fault+0x22/0x30
 [] ? __put_user_4+0x20/0x30
 [] do_page_fault+0xc/0xe
 [] page_fault+0x22/0x30
Code: 8b 7b 08 44 89 e6 ff 13 48 83 c3 10 48 83 3b 00 eb eb 41 83 7e 40 01 4d 8
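
One way to see which frames "appear consistently" across the per-CPU backtraces is to extract the symbol names and intersect them. The sketch below does that for short excerpts of the two traces above. The regular expression and the helper names are assumptions made for this illustration; frames marked `?` (which the kernel prints for unreliable stack entries) are skipped.

```python
import re

# Matches e.g. " [] ? compact_zone+0x2a8/0x4b2"; group 1 is the optional
# "?" marker, group 2 is the symbol name.
FRAME_RE = re.compile(
    r"\[[^\]]*\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)

CPU1_TRACE = """\
 [] ? compact_zone+0x2a8/0x4b2
 [] compact_zone_order+0x4c/0x5f
 [] try_to_compact_pages+0xc4/0x1d6
 [] __alloc_pages_direct_compact+0x61/0x1bf
 [] __alloc_pages_nodemask+0x409/0x799
 [] ? anon_vma_prepare+0xf5/0x12c
 [] do_huge_pmd_anonymous_page+0x13c/0x255
"""

CPU0_TRACE = """\
 [] compact_zone_order+0x4c/0x5f
 [] try_to_compact_pages+0xc4/0x1d6
 [] __alloc_pages_direct_compact+0x61/0x1bf
 [] __alloc_pages_nodemask+0x409/0x799
 [] ? unlock_page+0x1f/0x23
 [] do_huge_pmd_wp_page+0x127/0x4eb
"""


def reliable_frames(trace_text):
    """Return symbol names of frames that are not marked with '?'."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.search(line)
        if match and match.group(1) is None:
            frames.append(match.group(2))
    return frames


def common_frames(traces):
    """Frames present in every trace, in the order of the first trace."""
    per_trace = [reliable_frames(text) for text in traces]
    shared = set(per_trace[0]).intersection(*per_trace[1:])
    return [frame for frame in per_trace[0] if frame in shared]


print(common_frames([CPU1_TRACE, CPU0_TRACE]))
```

For these excerpts the shared frames are the direct-compaction path from `__alloc_pages_nodemask` down to `compact_zone_order`, while the two huge-page fault entry points differ.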

Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-05 Thread P. Christeas
On Wednesday 05 November 2014, Vlastimil Babka wrote:
> I see. I've tried to reproduce such issues with 3.18-rc3 but wasn't
> successful. But I noticed a possible issue that could lead to your problem.
> Can you please try the following patch?

OK, I can give it a try.

FYI, the "stability canary" is still alive: my system has been up for 28 hours, under 
avg. load >=3 all this time, HEAD=980d0d51b1c9617a4

/me goes busy fire-proofing your patch...





Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-05 Thread Vlastimil Babka
On 11/04/2014 10:36 AM, P. Christeas wrote:
> On Tuesday 04 November 2014, Vlastimil Babka wrote:
>> Please do keep testing (and see below what we need), and don't try
>> another tree - it's 3.18 we need to fix!
> Let me apologize/warn you about the poor quality of this report (and debug 
> data).
> It is on a system meant for everyday desktop usage, not kernel development. 
> Thus, it is tuned to be "slightly" debuggable ; mostly for performance.
> 
>> I'm not sure what you mean by "race" here and your snippet is
>> unfortunately just a small portion of the output ...
> 
> It is a shot in the dark. System becomes non-responsive (narrowed to desktop 
> apps waiting each other, or the X+kwin blocking), I can feel the CPU heating 
> and /sometimes/ disk I/O.
> 
> No BUG, Oops or any kernel message. (is printk level 4 adequate? )
> 
> Then, I try to drop to a console and collect as much data as possible with 
> SysRq.
> 
> The snippet I'd sent you is from all-cpus-backtrace (l), trying to see which 
> traces appear consistently during the lockup. There is also the huge traces 
> of 
> "task-states" (t), but I reckon they are too noisy.
> That trace also matches the usage profile, because AFAICG[uess] the issue 
> appears when allocating during I/O load. 
> 
> After turning on full-preemption, I have been able to terminate/kill all 
> tasks 
> and continue with same kernel but new userspace.
> 
>> OK so the process is not dead due to the problem? That probably rules
>> out some kinds of errors but we still need the full output. Thanks in
>> advance. 
>> I'm not aware of this, CCing lkml for wider coverage.
> 
> Thank you. As I've told in the first mail, this is an early report of 
> possible 
> 3.18 regression. I'm trying to narrow down the case and make it reproducible 
> or get a good trace.

I see. I've tried to reproduce such issues with 3.18-rc3 but wasn't successful.
But I noticed a possible issue that could lead to your problem.
Can you please try the following patch?

8<---
>From fe9c963cc665cdab50abb41f3babb5b19d08fab1 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka 
Date: Wed, 5 Nov 2014 14:19:18 +0100
Subject: [PATCH] mm, compaction: do not reset deferred compaction
 optimistically

In try_to_compact_pages() we reset deferred compaction for a zone where we
think compaction has succeeded. Although this action does not reset the
counters affecting deferred compaction period, just bumping the deferred order
means that another compaction attempt will be able to pass the check in
compaction_deferred() and proceed with compaction.

This is a problem when try_to_compact_pages() thinks compaction was successful
just because the watermark check is missing proper classzone_idx parameter,
but then the allocation attempt itself will fail due to its watermark check
having the proper value. Although __alloc_pages_direct_compact() will re-defer
compaction in such case, this happens only in the case of sync compaction.
Async compaction will leave the zone open for another compaction attempt which
may reset the deferred order again. This could possibly explain what
P. Christeas reported - a system where many processes include the following
backtrace:

[] preempt_schedule_irq+0x3c/0x59
[] retint_kernel+0x20/0x30
[] ? __zone_watermark_ok+0x77/0x85
[] zone_watermark_ok+0x1a/0x1c
[] compact_zone+0x215/0x4b2
[] compact_zone_order+0x4c/0x5f
[] try_to_compact_pages+0xc4/0x1e8
[] __alloc_pages_direct_compact+0x61/0x1bf
[] __alloc_pages_nodemask+0x409/0x799
[] new_slab+0x5f/0x21c

The issue has been made visible by commit 53853e2d2bfb ("mm, compaction: defer
each zone individually instead of preferred zone"), since before the commit,
deferred compaction for fallback zones (where classzone_idx matters) was not
considered separately.

Although work is underway to fix the underlying zone watermark check mismatch,
this patch fixes the immediate problem by removing the optimistic defer reset
completely. Its usefulness is questionable anyway, since if the allocation
really succeeds, a full defer reset (including the period counters) follows.

Fixes: 53853e2d2bfb748a8b5aa2fd1de15699266865e0
Reported-by: P. Christeas 
Signed-off-by: Vlastimil Babka 
---
 mm/compaction.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index ec74cf0..f0335f9 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1325,13 +1325,6 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
 				      alloc_flags)) {
 			*candidate_zone = zone;
 			/*
-			 * We think the allocation will succeed in this zone,
-			 * but it is not certain, hence the false. The caller
-			 * will repeat this with true if allocation indeed
-			 * succeeds in this zone.
-   
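
The deferral bookkeeping that the commit message refers to can be modelled in userspace to show why the optimistic reset matters. The Python below is a simplified model reconstructed from memory of mm/compaction.c around 3.18, not the kernel source itself. The field names follow the kernel's, but the `Zone` class, the value of `COMPACT_MAX_DEFER_SHIFT`, and the demo sequence are assumptions for illustration and may differ in detail from the real code.

```python
COMPACT_MAX_DEFER_SHIFT = 6  # assumed value, recalled from kernel headers


class Zone:
    """Only the three deferral fields; everything else is omitted."""

    def __init__(self):
        self.compact_considered = 0
        self.compact_defer_shift = 0
        self.compact_order_failed = 0


def defer_compaction(zone, order):
    """Compaction failed: lengthen the period during which it is skipped."""
    zone.compact_considered = 0
    zone.compact_defer_shift = min(zone.compact_defer_shift + 1,
                                   COMPACT_MAX_DEFER_SHIFT)
    if order < zone.compact_order_failed:
        zone.compact_order_failed = order


def compaction_deferred(zone, order):
    """True if compaction for this order should be skipped for now."""
    if order < zone.compact_order_failed:
        return False  # orders below the failed order are never deferred
    defer_limit = 1 << zone.compact_defer_shift
    zone.compact_considered = min(zone.compact_considered + 1, defer_limit)
    return zone.compact_considered < defer_limit


def compaction_defer_reset(zone, order, alloc_success):
    """alloc_success=False models the optimistic reset the patch removes."""
    if alloc_success:
        zone.compact_considered = 0
        zone.compact_defer_shift = 0
    if order >= zone.compact_order_failed:
        zone.compact_order_failed = order + 1


ORDER = 9  # huge-page order on x86_64, matching the THP faults in the traces
zone = Zone()
for _ in range(3):
    defer_compaction(zone, ORDER)          # three failed attempts in a row

deferred_before = compaction_deferred(zone, ORDER)
compaction_defer_reset(zone, ORDER, alloc_success=False)
deferred_after = compaction_deferred(zone, ORDER)
print(deferred_before, deferred_after, zone.compact_defer_shift)
```

In this model the optimistic reset leaves `compact_defer_shift` untouched, yet the bumped `compact_order_failed` is enough to let the next attempt through immediately, which is the behaviour the commit message describes.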

Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-04 Thread P. Christeas
On Tuesday 04 November 2014, Vlastimil Babka wrote:
> Please do keep testing (and see below what we need), and don't try
> another tree - it's 3.18 we need to fix!
Let me apologize/warn you about the poor quality of this report (and debug 
data).
It is on a system meant for everyday desktop usage, not kernel development. 
Thus, it is tuned to be "slightly" debuggable ; mostly for performance.

> I'm not sure what you mean by "race" here and your snippet is
> unfortunately just a small portion of the output ...

It is a shot in the dark. System becomes non-responsive (narrowed to desktop 
apps waiting each other, or the X+kwin blocking), I can feel the CPU heating 
and /sometimes/ disk I/O.

No BUG, Oops or any kernel message. (is printk level 4 adequate? )

Then, I try to drop to a console and collect as much data as possible with 
SysRq.

The snippet I'd sent you is from all-cpus-backtrace (l), trying to see which 
traces appear consistently during the lockup. There is also the huge traces of 
"task-states" (t), but I reckon they are too noisy.
That trace also matches the usage profile, because AFAICG[uess] the issue 
appears when allocating during I/O load. 

After turning on full-preemption, I have been able to terminate/kill all tasks 
and continue with same kernel but new userspace.

> OK so the process is not dead due to the problem? That probably rules
> out some kinds of errors but we still need the full output. Thanks in
> advance. 
> I'm not aware of this, CCing lkml for wider coverage.

Thank you. As I've told in the first mail, this is an early report of possible 
3.18 regression. I'm trying to narrow down the case and make it reproducible 
or get a good trace.

Attached is my current .config




config-3.18.gz
Description: application/gzip


Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

2014-11-04 Thread Vlastimil Babka

On 11/04/2014 08:26 AM, P. Christeas wrote:

> TL;DR: I'm testing Linus's 3.18-rcX in my desktop (x86_64, full load),
> experiencing mm races about every day. Current -rc starves the canary of
> stability
>
> Will keep testing (should I try some -mm tree, please? ) , provide you
> feedback about the issue.


Hello,

Please do keep testing (and see below what we need), and don't try 
another tree - it's 3.18 we need to fix!



> Not an active kernel-developer.
>
> Long:
>
> Since 26 Oct. upgraded my everything-on-it laptop to new distro (systemd-based,
> all new glibc etc.) and switched from 3.17 to 3.18-pre . First time in
> years, kernel got unstable.
>
> This machine is occasionally under heavy load, doing I/O and serving random
> desktop applications. (machine is Intel x86_64, dual core, mechanical SATA
> disk).
> Now, I have a race about once a day, have narrowed them down (guess) to:
>
>  [] preempt_schedule_irq+0x3c/0x59
>  [] retint_kernel+0x20/0x30
>  [] ? __zone_watermark_ok+0x77/0x85
>  [] zone_watermark_ok+0x1a/0x1c
>  [] compact_zone+0x215/0x4b2
>  [] compact_zone_order+0x4c/0x5f
>  [] try_to_compact_pages+0xc4/0x1e8
>  [] __alloc_pages_direct_compact+0x61/0x1bf
>  [] __alloc_pages_nodemask+0x409/0x799
>  [] new_slab+0x5f/0x21c
> ...


I'm not sure what you mean by "race" here and your snippet is 
unfortunately just a small portion of the output which could be a BUG, 
OOPS, lockdep, soft-lockup, hard lockup and possibly many other things. But 
the backtrace itself is not enough; please send the whole error output. 
It should start and end with something like:

------------[ cut here ]------------

Thanks in advance.


> Sometimes is a less critical process, that I can safely kill, otherwise I have
> to drop everything and reboot.


OK so the process is not dead due to the problem? That probably rules 
out some kinds of errors but we still need the full output. Thanks in 
advance.



> Unless you are already aware of this case, please accept this feedback.
> I'm pulling from Linus, should I also try some of your trees for an early
> solution?


I'm not aware of this, CCing lkml for wider coverage.
