> On Tue, Nov 17, 2009 at 09:18:11PM +0900, KOSAKI Motohiro wrote:
> > > On Tue, Nov 17, 2009 at 08:03:21PM +0900, KOSAKI Motohiro wrote:
> > > > I'm sorry for the long delay.
> > > > 
> > > > > On Sat, Nov 14, 2009 at 06:34:23PM +0900, KOSAKI Motohiro wrote:
> > > > > > 2009/11/14 Mel Gorman <[email protected]>:
> > > > > > > On Sat, Nov 14, 2009 at 03:00:57AM +0900, KOSAKI Motohiro wrote:
> > > > > > >> > On Fri, Nov 13, 2009 at 07:43:09PM +0900, KOSAKI Motohiro 
> > > > > > >> > wrote:
> > > > > > >> > > > After kswapd balances all zones in a pgdat, it goes to 
> > > > > > >> > > > sleep. In the event
> > > > > > >> > > > of no IO congestion, kswapd can go to sleep very shortly 
> > > > > > >> > > > after the high
> > > > > > >> > > > watermark was reached. If there are a constant stream of 
> > > > > > >> > > > allocations from
> > > > > > >> > > > parallel processes, it can mean that kswapd went to sleep 
> > > > > > >> > > > too quickly and
> > > > > > >> > > > the high watermark is not being maintained for sufficient 
> > > > > > >> > > > length time.
> > > > > > >> > > >
> > > > > > >> > > > This patch makes kswapd go to sleep as a two-stage 
> > > > > > >> > > > process. It first
> > > > > > >> > > > tries to sleep for HZ/10. If it is woken up by another 
> > > > > > >> > > > process or the
> > > > > > >> > > > high watermark is no longer met, it's considered a 
> > > > > > >> > > > premature sleep and
> > > > > > >> > > > kswapd continues work. Otherwise it goes fully to sleep.
> > > > > > >> > > >
> > > > > > >> > > > This adds more counters to distinguish between fast and 
> > > > > > >> > > > slow breaches of
> > > > > > >> > > > watermarks. A "fast" premature sleep is one where the low 
> > > > > > >> > > > watermark was
> > > > > > >> > > > hit in a very short time after kswapd going to sleep. A 
> > > > > > >> > > > "slow" premature
> > > > > > >> > > > sleep indicates that the high watermark was breached after 
> > > > > > >> > > > a very short
> > > > > > >> > > > interval.
> > > > > > >> > > >
> > > > > > >> > > > Signed-off-by: Mel Gorman <[email protected]>
> > > > > > >> > >
> > > > > > >> > > Why do you submit this patch to mainline? this is debugging 
> > > > > > >> > > patch
> > > > > > >> > > no more and no less.
> > > > > > >> > >
> > > > > > >> >
> > > > > > >> > Do you mean the stats part? The stats are included until such 
> > > > > > >> > time as the page
> > > > > > >> > allocator failure reports stop or are significantly reduced. 
> > > > > > >> > In the event a
> > > > > > >> > report is received, the value of the counters help determine 
> > > > > > >> > if kswapd was
> > > > > > >> > struggling or not. They should be removed once this mess is 
> > > > > > >> > ironed out.
> > > > > > >> >
> > > > > > >> > If there is a preference, I can split out the stats part and 
> > > > > > >> > send it to
> > > > > > >> > people with page allocator failure reports for retesting.
> > > > > > >>
> > > > > > >> I'm sorry my last mail didn't have enough explanation.
> > > > > > >> This stats help to solve this issue. I agreed. but after solving 
> > > > > > >> this issue,
> > > > > > >> I don't imagine administrator how to use this stats. if 
> > > > > > >> KSWAPD_PREMATURE_FAST or
> > > > > > >> KSWAPD_PREMATURE_SLOW significantly increased, what should admin 
> > > > > > >> do?
> > > > > > >
> > > > > > > One possible workaround would be to raise min_free_kbytes while a 
> > > > > > > fix is
> > > > > > > being worked on.
> > > > > > 
> > > > > > Please correct me, if I said wrong thing.
> > > > > 
> > > > > You didn't.
> > > > > 
> > > > > > if I was admin, I don't watch this stats because kswapd frequently
> > > > > > wakeup doesn't mean any trouble. instead I watch number of 
> > > > > > allocation
> > > > > > failure.
> > > > > 
> > > > > The stats are not tracking when kswapd wakes up. It helps track how
> > > > > quickly the high or low watermarks are going under once kswapd tries 
> > > > > to
> > > > > go back to sleep.
> > > > 
> > > > Umm, honestly I'm still puzlled. probably we need go back one step at 
> > > > once.
> > > > kswapd wake up when memory amount less than low watermark and sleep
> > > > when memory amount much than high watermask. We need to know 
> > > > GFP_ATOMIC failure sign.
> > > > 
> > > > My point is, kswapd wakeup only happen after kswapd sleeping. but if 
> > > > the system is
> > > > under heavy pressure and memory amount go up and down between low 
> > > > watermark
> > > > and high watermark, this stats don't increase at all. IOW, this stats 
> > > > is strong related to
> > > > high watermark.
> > > > 
> > > 
> > > Yes, this is true but as long as kswapd is awake and doing its job, it
> > > will continue taking direction on what order it should be reclaiming from
> > > processes that failed the low_watermark test.  The GFP_ATOMIC allocations
> > > will be allowed to go under this low watermark but will have informed 
> > > kswapd
> > > what order it should be reclaiming at so it stays working.
> > > 
> > > A stat that increases between the low and high watermark would indicate
> > > that memory pressure is there or that the reclaim algorithm is not
> > > working as expected but that's checking for a different problem.
> > > 
> > > What I was looking at was  kswapd going to sleep and the low or min 
> > > watermarks
> > > being hit very quickly after that so that kswapd pre-emptively kicks in
> > > before allocations start failing again.
> > > 
> > > > Probaby, min watermark or low watermark are more useful for us.
> > > 




> > > Why? kswapd is awake between those points.
> > 
> > What's difference below (1) and (2)?
> > 
> > 1. kswapd() run 100ms and sleep 10ms and run 100ms.
> > 2. kswapd() run 200ms
> 
> Because prior to the patch, once kswapd went to sleep, it would
> not wake again until the low watermark was reached. There appeared
> to be a timing issue where congestion_wait() would block kswapd
> just long enough before checking the high watermark to mean that
> it stayed awake. This wasn't happening hence the approach of
> briefly-sleep-after-high-watermark-is-reached-and-double-check-watermarks-are-ok.

You are right.

> > (1) represent amount memory go beyond high-watermark very shortly and go 
> > below
> > low-watermark right after . (2) represent amount memory neared 
> > high-watermark closely, but don't touched,
> > and probably go blow low-watermark right after.
> > It's almost same memory pressure. but (1) increase 
> > KSWAPD_HIGH_WMARK_HIT_QUICKLY and
> > (2) don't increase any stat.
> > 
> > Thus, We can't think KSWAPD_HIGH_WMARK_HIT_QUICKLY indicate memory pressure.
> > 
> 
> It indicates mild memory pressure because the high watermark was only
> reached for a short period of time.
> 
> > > > # of called wake_all_kswapd() is related to low watermark. and It's 
> > > > conteniously
> > > > increase although the system have strong memroy pressure. I'm ok.
> > > > KSWAPD_NO_CONGESTION_WAIT is related to min watermark. I'm ok too..
> > > > # of page allocation failure is related to  min watermark too. I'm ok 
> > > > too.
> > > > 
> > > > IOW, I only dislike this stat stop increase strong memory pressure 
> > > > (above explanation).
> > > > Can you please tell me why you think kswapd slept time is so important?
> > > 
> > > I don't think the amount of time it has slept is important. I think it's
> > > important to know if the system is getting back into watermark trouble 
> > > very
> > > shortly after kswapd reached the high watermark.
> > 
> > Probably, My last mail doesn't take kindly explanation.
> > My point is, beyond high-watermark or not doesn't indicate any meaningful 
> > phenomenon.
> > 
> > Then, I'd prefer low or min-watermark related stats.
> 
> What would that have to do with the kswapd-briefly-sleep logic? i.e. the
> stats you are suggesting, while meaningful, are for looking at a
> different type of problem. If you want those stats, then I should revert
> the stats part of this patch altogether. Does that then mean you also
> want the patch that makes kswapd double check watermarks to be dropped
> or just have no associated stats to see what it's doing in the event of
> allocation failure?

Yes, you talked right thing.

After awhile thinking (and one night good sleeping), I conclude I should not 
enforce
my personal preference any more. I'm sorry. 

Honestly, I haven't understand this stats usage. but the stats is not core
piece in this patch concept.  we sholdn't get stuck small issue.

then, I'll review your latest patch soon.

Thanks.




--
To unsubscribe from this list: send the line "unsubscribe kernel-testers" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to