Re: Call for comments - raising vfs.ufs.dirhash_reclaimage?

2013-10-08 Thread Peter Holm
On Mon, Oct 07, 2013 at 07:34:24PM +0200, Davide Italiano wrote:
  What would perhaps be better than a hardcoded reclaim age would be to use
  an LRU-type approach and perhaps set a target percent to reclaim.  That is,
  suppose you were to reclaim the oldest 10% of hashes on each lowmem call
  (and make the '10%' the tunable value).  Then you will always make some
  amount of progress in a low memory situation (and if the situation remains
  dire you will eventually empty the entire cache), but the effective maximum
  age will be more dynamic.  Right now if you haven't touched UFS in 5 seconds
  it throws the entire thing out on the first lowmem event.  The LRU-approach
  would only throw the oldest 10% out on the first call, but eventually throw
  it all out if the situation remains dire.
 
  --
  John Baldwin
  ___
  freebsd...@freebsd.org mailing list
  http://lists.freebsd.org/mailman/listinfo/freebsd-fs
  To unsubscribe, send any mail to freebsd-fs-unsubscr...@freebsd.org
 
 I liked your idea more than what's available in HEAD right now and I
 implemented it.
 http://people.freebsd.org/~davide/review/ufs_direclaimage.diff
 I was unsure what kind of heuristic I should choose to select which
 (10% of) entries should be evicted so I just removed the first 10%
 ones from the head of the ufs_dirhash list (which should be the
 oldest).
 The code keeps rescanning the cache until 10% (or the percentage set
 via SYSCTL) of the entries are freed, but we can probably discuss whether
 this limit could be relaxed to just a single scan over the list.
 Unfortunately I don't have a test case to prove the effectiveness (or
 non-effectiveness) of the approach, but I think either Ivan or Peter
 might be able to give it a spin.
 

I gave this patch a spin for 12 hours without finding any problems.
I can do more testing at a later time, if you want to.

- Peter
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org
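
The percentage-based LRU reclaim discussed in this thread can be sketched
in userland C roughly as follows. This is only an illustration under stated
assumptions, not the code from the linked diff: struct cache_entry,
struct lru_cache, lru_insert() and lru_lowmem() are hypothetical names, and
the real dirhash code also has to deal with locking and entries that are in
use, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one cached hash; not the kernel's struct dirhash. */
struct cache_entry {
	struct cache_entry *next;	/* next-newer entry */
	size_t bytes;			/* memory held by this entry */
};

/* List kept in LRU order: head is the oldest entry, tail the newest. */
struct lru_cache {
	struct cache_entry *head;
	struct cache_entry *tail;
	size_t count;
};

/* Append a freshly used entry at the tail.  Returns 0, or -1 on ENOMEM. */
static int
lru_insert(struct lru_cache *c, size_t bytes)
{
	struct cache_entry *e;

	assert(c != NULL);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return (-1);
	e->next = NULL;
	e->bytes = bytes;
	if (c->tail != NULL)
		c->tail->next = e;
	else
		c->head = e;
	c->tail = e;
	c->count++;
	return (0);
}

/*
 * Low-memory handler: free the oldest `percent` of entries, rounding up so
 * that a non-empty cache always makes some progress.  Returns bytes freed.
 */
static size_t
lru_lowmem(struct lru_cache *c, unsigned percent)
{
	size_t target, freed = 0;

	assert(c != NULL);
	if (percent > 100)
		percent = 100;
	target = (c->count * percent + 99) / 100;
	while (target-- > 0 && c->head != NULL) {
		struct cache_entry *e = c->head;

		c->head = e->next;
		if (c->head == NULL)
			c->tail = NULL;
		freed += e->bytes;
		free(e);
		c->count--;
	}
	return (freed);
}
```

Calling lru_lowmem(&cache, 10) on every low-memory event evicts only the
oldest tenth each time, yet repeated calls drain the cache completely if
the shortage persists, which is the behaviour John describes.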


Re: Memory reserves or lack thereof

2012-11-12 Thread Peter Holm
On Mon, Nov 12, 2012 at 03:36:38PM +0200, Konstantin Belousov wrote:
 On Sun, Nov 11, 2012 at 03:40:24PM -0600, Alan Cox wrote:
  On Sat, Nov 10, 2012 at 7:20 AM, Konstantin Belousov kostik...@gmail.com wrote:
  
   On Fri, Nov 09, 2012 at 07:10:04PM +, Sears, Steven wrote:
I have a memory subsystem design question that I'm hoping someone can
   answer.
   
I've been looking at a machine that is completely out of memory, as in
   
 v_free_count = 0,
 v_cache_count = 0,
   
I wondered how a machine could completely run out of memory like this,
   especially after finding a lack of interrupt storms or other pathologies
   that would tend to overcommit memory. So I started investigating.
   
Most allocators come down to vm_page_alloc(), which has this guard:
   
  if ((curproc == pageproc) && (page_req != VM_ALLOC_INTERRUPT)) {
  page_req = VM_ALLOC_SYSTEM;
  };
   
  if (cnt.v_free_count + cnt.v_cache_count > cnt.v_free_reserved ||
  (page_req == VM_ALLOC_SYSTEM &&
  cnt.v_free_count + cnt.v_cache_count >
   cnt.v_interrupt_free_min) ||
  (page_req == VM_ALLOC_INTERRUPT &&
  cnt.v_free_count + cnt.v_cache_count > 0)) {
   
The key observation is if VM_ALLOC_INTERRUPT is set, it will allocate
   every last page.
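
The tiered check quoted above can be modelled in userland C roughly like
this. It is a sketch for illustration, not the kernel source: struct
page_counts, enum req_class and may_allocate() are hypothetical names, and
any threshold values used with it are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request classes mirroring the three cases in the guard. */
enum req_class { REQ_NORMAL, REQ_SYSTEM, REQ_INTERRUPT };

/* Hypothetical snapshot of the counters the guard looks at. */
struct page_counts {
	unsigned free_count;		/* pages on the free queue */
	unsigned cache_count;		/* pages on the cache queue */
	unsigned free_reserved;		/* floor for ordinary requests */
	unsigned interrupt_free_min;	/* floor for system requests */
};

/* Returns true if a request of class `req` would be granted a page. */
static bool
may_allocate(const struct page_counts *pc, enum req_class req)
{
	unsigned avail;

	assert(pc != NULL);
	avail = pc->free_count + pc->cache_count;
	if (avail > pc->free_reserved)
		return (true);		/* plenty left: anyone may allocate */
	if (req == REQ_SYSTEM && avail > pc->interrupt_free_min)
		return (true);		/* system requests dig deeper */
	if (req == REQ_INTERRUPT && avail > 0)
		return (true);		/* interrupt requests take the last page */
	return (false);
}
```

With one page left and both floors above one, only REQ_INTERRUPT still
succeeds, which is how the counters can reach zero.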
   
From the name one might expect VM_ALLOC_INTERRUPT to be somewhat rare,
   perhaps only used from interrupt threads. Not so, see kmem_malloc() or
   uma_small_alloc() which both contain this mapping:
   
  if ((flags & (M_NOWAIT|M_USE_RESERVE)) == M_NOWAIT)
  pflags = VM_ALLOC_INTERRUPT | VM_ALLOC_WIRED;
  else
  pflags = VM_ALLOC_SYSTEM | VM_ALLOC_WIRED;
   
Note that M_USE_RESERVE has been deprecated and is used in just a
   handful of places. Also note that lots of code paths come through these
   routines.
   
What this means is essentially _any_ allocation using M_NOWAIT will
   bypass whatever reserves have been held back and will take every last page
   available.
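
A small userland model makes the consequence of that mapping explicit.
The SK_ names and bit values below are hypothetical stand-ins chosen for
the sketch; they are not the kernel's definitions.

```c
#include <assert.h>

/* Illustrative flag bits; the values are assumptions, not the kernel's. */
#define SK_M_NOWAIT		0x1
#define SK_M_WAITOK		0x2
#define SK_M_USE_RESERVE	0x4
#define SK_M_MASK		0x7

enum sk_class { SK_ALLOC_SYSTEM, SK_ALLOC_INTERRUPT };

/* Same test as the quoted mapping: M_NOWAIT without M_USE_RESERVE. */
static enum sk_class
sk_class_for(int flags)
{
	assert((flags & ~SK_M_MASK) == 0);	/* reject unknown bits */
	if ((flags & (SK_M_NOWAIT | SK_M_USE_RESERVE)) == SK_M_NOWAIT)
		return (SK_ALLOC_INTERRUPT);
	return (SK_ALLOC_SYSTEM);
}
```

In this model a plain SK_M_NOWAIT request lands in the interrupt class,
while SK_M_WAITOK and SK_M_NOWAIT | SK_M_USE_RESERVE both land in the
system class.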
   
There is no documentation stating M_NOWAIT has this side effect of
   essentially being privileged, so any innocuous piece of code that can't
   block will use it. And of course M_NOWAIT is literally used all over.
   
It looks to me like the design goal of the BSD allocators is on
   recovery; it will give all pages away knowing it can recover.
   
Am I missing anything? I would have expected some small number of pages
   to be held in reserve just in case. And I didn't expect M_NOWAIT to be a
   sort of back door for grabbing memory.
   
  
   Your analysis is right, there is nothing to add or correct.
   This is the reason to strongly prefer M_WAITOK.
  
  
  Agreed.  Once upon a time, before SMPng, M_NOWAIT was rarely used.  It was
  well understood that it should only be used by interrupt handlers.
  
  The trouble is that M_NOWAIT conflates two orthogonal things.  The obvious
  being that the allocation shouldn't sleep.  The other being how far we're
  willing to deplete the cache/free page queues.
  
  When fine-grained locking got sprinkled throughout the kernel, we all too
  often found ourselves wanting to do allocations without the possibility of
  blocking.  So, M_NOWAIT became commonplace, where it wasn't before.
  
  This had the unintended consequence of introducing a lot of memory
  allocations in the top-half of the kernel, i.e., non-interrupt handling
  code, that were digging deep into the cache/free page queues.
  
  Also, ironically, in today's kernel an M_NOWAIT | M_USE_RESERVE
  allocation is less likely to succeed than an M_NOWAIT allocation.
  However, prior to FreeBSD 7.x, M_NOWAIT couldn't allocate a cached page; it
  could only allocate a free page.  M_USE_RESERVE said that it was ok to allocate
  a cached page even though M_NOWAIT was specified.  Consequently, the system
  wouldn't dig as far into the free page queue if M_USE_RESERVE was
  specified, because it was allowed to reclaim a cached page.
  
  In conclusion, I think it's time that we change M_NOWAIT so that it doesn't
  dig any deeper into the cache/free page queues than M_WAITOK does and
  reintroduce a M_USE_RESERVE-like flag that says dig deep into the
  cache/free page queues.  The trouble is that we then need to identify all
  of those places that are implicitly depending on the current behavior of
  M_NOWAIT also digging deep into the cache/free page queues so that we can
  add an explicit M_USE_RESERVE.
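
One way to picture the proposal is as two independent properties derived
from the flags. The sketch below is only one reading of it; the thread does
not specify an implementation, and the SK_ names, bit values and struct
sk_policy are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; absence of SK_M_NOWAIT stands for M_WAITOK. */
#define SK_M_NOWAIT		0x1	/* caller cannot sleep */
#define SK_M_USE_RESERVE	0x2	/* caller may dig into the reserve */
#define SK_M_MASK		0x3

struct sk_policy {
	bool may_sleep;		/* allocation is allowed to block */
	bool dig_deep;		/* allocation may deplete the reserve */
};

/* Behaviour described in the thread: M_NOWAIT alone implies digging deep. */
static struct sk_policy
sk_current(int flags)
{
	struct sk_policy p;

	assert((flags & ~SK_M_MASK) == 0);
	p.may_sleep = (flags & SK_M_NOWAIT) == 0;
	p.dig_deep = (flags & (SK_M_NOWAIT | SK_M_USE_RESERVE)) == SK_M_NOWAIT;
	return (p);
}

/* Proposed behaviour: each flag controls exactly one property. */
static struct sk_policy
sk_proposed(int flags)
{
	struct sk_policy p;

	assert((flags & ~SK_M_MASK) == 0);
	p.may_sleep = (flags & SK_M_NOWAIT) == 0;
	p.dig_deep = (flags & SK_M_USE_RESERVE) != 0;
	return (p);
}
```

Every caller for which sk_current() and sk_proposed() disagree is a place
that would need auditing before such a change, since it may be relying on
the deeper drain implicitly.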
  
  Alan
  
  P.S. I suspect that we should also increase the size of the page reserve
  that is kept for VM_ALLOC_INTERRUPT allocations in vm_page_alloc*().  How
  many legitimate users of a new M_USE_RESERVE-like flag in today's kernel
  could actually be satisfied by two pages?
 
 I am almost sure that most people who use the M_NOWAIT flag do not
 know about the 'allow the deeper drain of free queue' effect. As such, I believe
 we should flip the meaning of 

Re: Kernel threads inherit CPU affinity from random sibling

2012-01-29 Thread Peter Holm
On Sat, Jan 28, 2012 at 02:39:17PM +0100, Attilio Rao wrote:
 2012/1/28 Attilio Rao atti...@freebsd.org:
  2012/1/28 Ryan Stone ryst...@gmail.com:
  On Fri, Jan 27, 2012 at 10:41 PM, Attilio Rao atti...@freebsd.org wrote:
  I think what you found out is very sensitive.
  However, the patch is not correct as you cannot call
  cpuset_setthread() with thread_lock held.
 
  Whoops!  I actually discovered that for myself and had already fixed
  it, but apparently I included an old version of the patch in the
  email.
 
  Hence this is my fix:
  http://www.freebsd.org/~attilio/cpuset_root.patch
 
  Oh, I do like this better.  I tried something similar myself but
  abandoned it because I misread how sched_affinity() was implemented by
  4BSD (I had gotten the impression that once TSF_AFFINITY is set it
  could never be cleared).
 
  Do you have a pathological test-case for it? Are you going to test the patch?
 
 BTW, I've just now updated the patch in order to remove an added white
 line and s/priority/affinity in comments.
 

I've tested this patch with the threaded test scenarios I have, for
14 hours, without finding any issues.

- Peter


Seeking testers for change to lib/libc/gen/fts.c

1999-08-26 Thread Peter Holm




PROBLEM:
find dumps core on extremely long path names. See also
kern/12855

CAUSE:
fts.c does not handle realloc of buffer space correctly.

FIX:
Upgrade fts.c from OpenBSD version 1.9 to 1.20.  Version 1.20
also incorporates the fix for the case where fts_open is used
with option FTS_NOCHDIR and the full path entry of type FTS_DP
is returned with a trailing '/' if the final directory is empty.
Thanks to Todd Miller mill...@openbsd.org

The patch is available at http://www.freebsd.org/~pho/fts.diff
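
The post does not say which realloc mistake fts.c made, so the following is
only a generic sketch of growing a path buffer safely, not the fix from
fts.diff. struct pathbuf and pathbuf_append() are hypothetical names. It
guards against two common pitfalls: losing the old buffer when realloc
fails, and keeping raw pointers into a buffer that may move.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable path buffer; positions are kept as offsets. */
struct pathbuf {
	char *base;	/* start of buffer, may move on growth */
	size_t cap;	/* allocated size in bytes */
	size_t len;	/* bytes used, excluding the terminating NUL */
};

/* Append "/name" to the path.  Returns 0 on success, -1 on failure. */
static int
pathbuf_append(struct pathbuf *pb, const char *name)
{
	size_t nlen, need, ncap;
	char *nbase;

	assert(pb != NULL && name != NULL);
	nlen = strlen(name);
	need = pb->len + nlen + 2;	/* '/' + name + NUL */
	if (need > pb->cap) {
		ncap = pb->cap != 0 ? pb->cap : 16;
		while (ncap < need) {
			if (ncap > SIZE_MAX / 2)
				return (-1);	/* doubling would overflow */
			ncap *= 2;
		}
		/* Use a temporary so pb->base survives a failed realloc. */
		nbase = realloc(pb->base, ncap);
		if (nbase == NULL)
			return (-1);
		pb->base = nbase;
		pb->cap = ncap;
	}
	pb->base[pb->len++] = '/';
	memcpy(pb->base + pb->len, name, nlen + 1);
	pb->len += nlen;
	return (0);
}
```

A caller that needs to remember where a component starts should store the
offset (for example pb.len before the append) and recompute pb.base + offset
after every call, because the buffer may have moved.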

--
Peter Holm





To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message



NFS V3 and mkdir bug

1999-08-07 Thread Peter Holm
current# gdb -k -s kernel.debug -e /var/crash/kernel.6 -c /var/crash/vmcore.6
IdlePTD 3932160
initial pcb at 33cfc0
panicstr: ffs_valloc: dup alloc
panic messages:
---
panic: ffs_valloc: dup alloc

---
#0  boot (howto=256) at ../../kern/kern_shutdown.c:291
291   dumppcb.pcb_cr3 = rcr3();
(kgdb) bt
#0  boot (howto=256) at ../../kern/kern_shutdown.c:291
#1  0xc016710d in panic (fmt=0xc02f69c1 "ffs_valloc: dup alloc")
    at ../../kern/kern_shutdown.c:505
#2  0xc0224103 in ffs_valloc (pvp=0xc8744a80, mode=16888, cred=0xc0b94384,
    vpp=0xc85d8a04) at ../../ufs/ffs/ffs_alloc.c:605
#3  0xc0236353 in ufs_mkdir (ap=0xc85d8bc4) at ../../ufs/ufs/ufs_vnops.c:1307
#4  0xc02374a1 in ufs_vnoperate (ap=0xc85d8bc4)
    at ../../ufs/ufs/ufs_vnops.c:2316
#5  0xc01cc26d in nfsrv_mkdir (nfsd=0xc0b94300, slp=0xc09e4600,
    procp=0xc7c05de0, mrq=0xc85d8dc4) at vnode_if.h:611
#6  0xc01da76e in nfssvc_nfsd (nsd=0xc85d8e80, argp=0x8071bc0 "", p=0xc7c05de0)
    at ../../nfs/nfs_syscalls.c:650
#7  0xc01da08f in nfssvc (p=0xc7c05de0, uap=0xc85d8f80)
    at ../../nfs/nfs_syscalls.c:346
#8  0xc026d496 in syscall (frame={tf_fs = 47, tf_es = 47, tf_ds = 47,
      tf_edi = 4, tf_esi = 1, tf_ebp = -1077944892, tf_isp = -933392428,
      tf_ebx = 0, tf_edx = -1077944336, tf_ecx = 0, tf_eax = 155,
      tf_trapno = 12, tf_err = 2, tf_eip = 134517008, tf_cs = 31,
      tf_eflags = 646, tf_esp = -1077945284, tf_ss = 47})
    at ../../i386/i386/trap.c:1056
#9  0xc025e526 in Xint0x80_syscall ()
#10 0x80480e9 in ?? ()
(kgdb) quit
current# exit

Any suggestions as to where to investigate?

Regards
--
Peter Holm | mailto:pe...@holm.cc | http://login.dknet.dk/~pho/






