On Wed, Oct 15, 2014 at 11:56:33PM -0600, Justin T. Gibbs wrote:
> avg pointed out the rate limiting code in vm_pageout_scan() during discussion 
> about PR 187594.  While it certainly can contribute to the problems discussed 
> in that PR, a bigger problem is that it can allow the OOM killer to be 
> triggered even though there is plenty of reclaimable memory available in the 
> system.  Any load that can consume enough pages within the polling interval 
> to hit the v_free_min threshold (e.g. multiple 'dd if=/dev/zero 
> of=/file/on/zfs') can make this happen.
> 
> The product I'm working on does not have swap configured and treats any OOM 
> trigger as fatal, so it is very obvious when this happens. :-)
> 
> I've tried several things to mitigate the problem.  The first was to ignore 
> rate limiting for pass 2.  However, even though ZFS is guaranteed to receive 
> some feedback prior to OOM being declared, my testing showed that a trivial 
> load (a couple dd operations) could still consume enough of the reclaimed 
> space to leave the system below its target at the end of pass 2.  After 
> removing the rate limiting entirely, I've so far been unable to kill the 
> system via a ZFS-induced load.
> 
> I understand the motivation behind the rate limiting, but the current 
> implementation seems too simplistic to be safe.  The documentation for the 
> Solaris slab allocator provides good motivation for their approach of using a 
> "sliding average" to rein in temporary bursts of usage without unduly 
> harming efficient service for the recorded steady-state memory demand.  
> Regardless of the approach taken, I believe that the OOM killer must be a 
> last resort and shouldn't be called when there are caches that can be culled.
> 
> One other thing I've noticed in my testing with ZFS is that it needs feedback 
> and a little time to react to memory pressure.  Calling its lowmem handler 
> just once isn't enough for it to limit in-flight writes so it can avoid reuse 
> of pages that it just freed up.  But, it doesn't take too long to react 
> (> 1sec in the profiling I've done).  Is there a way in vm_pageout_scan() that 
> we can better record that progress is being made (pages were freed in the 
> pass, even if some/all of them were consumed again) and allow more passes 
> before the OOM killer is invoked in this case?
> 
> --
> Justin
https://docs.freebsd.org/cgi/getmsg.cgi?fetch=103436+0+/usr/local/www/db/text/2014/freebsd-hackers/20141012.freebsd-hackers
might have some relevance.
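
To make the question about recording progress concrete, here is a rough
userland sketch of the idea: only declare OOM after several consecutive
passes that freed nothing, rather than whenever a pass ends below target.
This is not the vm_pageout_scan() code. The struct, the function names, and
the max_stalled_passes knob are all hypothetical, and a real change would
also need an upper bound on total passes so a busy writer cannot postpone
OOM forever.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping; none of these names exist in the kernel. */
struct oom_tracker {
	int passes_without_progress;	/* consecutive passes that freed nothing */
	int max_stalled_passes;		/* stalled passes tolerated before OOM */
};

static void
oom_tracker_init(struct oom_tracker *t, int max_stalled_passes)
{
	t->passes_without_progress = 0;
	t->max_stalled_passes = max_stalled_passes;
}

/*
 * Record the outcome of one scan pass.
 *
 * pages_freed:   pages reclaimed during this pass, even if they were
 *                consumed again before the pass ended.
 * page_shortage: pages still missing from the target at the end of the pass.
 *
 * Returns true only when the target was missed and no pages were freed
 * for max_stalled_passes passes in a row.
 */
static bool
oom_tracker_should_kill(struct oom_tracker *t, long pages_freed,
    long page_shortage)
{
	if (page_shortage <= 0 || pages_freed > 0) {
		/* Target met, or progress was made: reset the stall count. */
		t->passes_without_progress = 0;
		return (false);
	}
	t->passes_without_progress++;
	return (t->passes_without_progress >= t->max_stalled_passes);
}
```

With max_stalled_passes set to 3, a pass that frees pages but still ends
short of the target resets the counter, so a cache that is still shrinking
gets more time to react. Only three consecutive passes with nothing freed
would trigger the OOM path.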