Re: somewhat important inteldrm fix

2014-02-05 Thread David Coppa
On Tue, Feb 4, 2014 at 11:55 PM, Mark Kettenis mark.kette...@xs4all.nl wrote:
 Running the updated xf86-video-intel driver uncovered a bug in the
 kernel drm code.  The page fault handler wasn't handling some of the
 possible errors correctly.  This made the X server die with a SIGSEGV.
 The diff below brings things closer to what Linux does, and seems to
 fix the crashes I was seeing.  A bit more testing would be welcome
 though.  Note that this needs the commit to drmP.h I just made, which
 might not have made it to all the anoncvs servers yet.

I saw the X server crash yesterday, while watching a video in
fullscreen using minitube.
I will apply your diff and report back asap...

ciao,
David



exp() / expl() on Linux and OpenBSD (expl() bug?)

2014-02-05 Thread David Coppa

Hi!

I hit this problem while working on updating math/R from version
2.15.3 to the latest version (3.0.2).

It started happening since upstream switched from double functions
to C99 long double functions (expl, fabsl, ...), during the R-3
development cycle.

Take the following reduced test-case, adapted from what R's code
does:

---8<---

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void) {
  double theta = 1;
  long double lambda, pr, pr2;

  lambda = (0.5*theta);
  pr = exp(-lambda);
  pr2 = expl(-lambda);

  printf("theta == %g, pr == %Lg, pr2 == %Lg\n", theta, pr, pr2);
  exit(0);
}

---8<---

This produces the following output on Linux (x86_64):

theta == 1, pr == 0.606531, pr2 == 0.606531

While on OpenBSD -current amd64:

theta == 1, pr == 0.606531, pr2 == nan

And indeed R-3's testsuite fails with the error message
"NaNs produced":

Warning in pchisq(1e-300, df = 0, ncp = lam) : NaNs produced
> stopifnot(all.equal(p00, exp(-lam/2)),
+   all.equal(p.0, exp(-lam/2)))
Error: all.equal(p.0, exp(-lam/2)) is not TRUE
Execution halted

Is this a bug in our expl() ?

Ciao,
David



Re: exp() / expl() on Linux and OpenBSD (expl() bug?)

2014-02-05 Thread Mark Kettenis
 Date: Wed, 5 Feb 2014 01:57:33 -0700
 From: David Coppa dco...@openbsd.org
 
 Hi!
 
 I hit this problem while working on updating math/R from version
 2.15.3 to the latest version (3.0.2).
 
 It started happening since upstream switched from double functions
 to C99 long double functions (expl, fabsl, ...), during the R-3
 development cycle.
 
 Take the following reduced test-case, adapted from what R's code
 does:
 
 ---8<---
 
 #include <stdio.h>
 #include <stdlib.h>
 #include <math.h>
 
 int main(void) {
   double theta = 1;
   long double lambda, pr, pr2;
 
   lambda = (0.5*theta);
   pr = exp(-lambda);
   pr2 = expl(-lambda);
 
   printf("theta == %g, pr == %Lg, pr2 == %Lg\n", theta, pr, pr2);
   exit(0);
 }
 
 ---8<---
 
 This produces the following output on Linux (x86_64):
 
 theta == 1, pr == 0.606531, pr2 == 0.606531
 
 While on OpenBSD -current amd64:
 
 theta == 1, pr == 0.606531, pr2 == nan
 
 And indeed R-3's testsuite fails with the error message
 "NaNs produced":
 
 Warning in pchisq(1e-300, df = 0, ncp = lam) : NaNs produced
 > stopifnot(all.equal(p00, exp(-lam/2)),
 +   all.equal(p.0, exp(-lam/2)))
 Error: all.equal(p.0, exp(-lam/2)) is not TRUE
 Execution halted
 
 Is this a bug in our expl() ?

Yes.



Re: exp() / expl() on Linux and OpenBSD (expl() bug?)

2014-02-05 Thread Mark Kettenis
 Date: Wed, 5 Feb 2014 01:57:33 -0700
 From: David Coppa dco...@openbsd.org
 
 
 Hi!
 
 I hit this problem while working on updating math/R from version
 2.15.3 to the latest version (3.0.2).
 
 It started happening since upstream switched from double functions
 to C99 long double functions (expl, fabsl, ...), during the R-3
 development cycle.
 
 Take the following reduced test-case, adapted from what R's code
 does:
 
 ---8<---
 
 #include <stdio.h>
 #include <stdlib.h>
 #include <math.h>
 
 int main(void) {
   double theta = 1;
   long double lambda, pr, pr2;
 
   lambda = (0.5*theta);
   pr = exp(-lambda);
   pr2 = expl(-lambda);
 
   printf("theta == %g, pr == %Lg, pr2 == %Lg\n", theta, pr, pr2);
   exit(0);
 }
 
 ---8<---
 
 This produces the following output on Linux (x86_64):
 
 theta == 1, pr == 0.606531, pr2 == 0.606531
 
 While on OpenBSD -current amd64:
 
 theta == 1, pr == 0.606531, pr2 == nan
 
 And indeed R-3's testsuite fails with the error message
 "NaNs produced":
 
 Warning in pchisq(1e-300, df = 0, ncp = lam) : NaNs produced
 > stopifnot(all.equal(p00, exp(-lam/2)),
 +   all.equal(p.0, exp(-lam/2)))
 Error: all.equal(p.0, exp(-lam/2)) is not TRUE
 Execution halted
 
 Is this a bug in our expl() ?

Oh, btw, the quad-precision code used on sparc64 gets this right.  So
the bug is probably somewhere in src/lib/libm/src/ld80.
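
To see which long double format a given platform uses, and so which code
path is likely involved, one option is to look at LDBL_MANT_DIG from
<float.h>. This is only a sketch: the helper name and the descriptive
strings are made up here, and the mapping covers just the common formats.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: describe a long double format by mantissa width. */
const char *
long_double_format(int mant_dig)
{
	assert(mant_dig > 0);
	switch (mant_dig) {
	case 53:
		return "binary64 (long double is the same as double)";
	case 64:
		return "x87 80-bit extended precision";
	case 113:
		return "binary128 quad precision";
	default:
		return "unknown";
	}
}

/* Report the format of the platform this was compiled for. */
void
print_long_double_format(void)
{
	printf("LDBL_MANT_DIG = %d: %s\n", LDBL_MANT_DIG,
	    long_double_format(LDBL_MANT_DIG));
}
```

amd64 would be expected to report the 80-bit format and sparc64 the quad
format, which matches the split described above.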



Re: somewhat important inteldrm fix

2014-02-05 Thread Mark Kettenis
 From: David Coppa dco...@gmail.com
 Date: Wed, 5 Feb 2014 09:01:45 +0100
 
 On Tue, Feb 4, 2014 at 11:55 PM, Mark Kettenis mark.kette...@xs4all.nl 
 wrote:
  Running the updated xf86-video-intel driver uncovered a bug in the
  kernel drm code.  The page fault handler wasn't handling some of the
  possible errors correctly.  This made the X server die with a SIGSEGV.
  The diff below brings things closer to what Linux does, and seems to
  fix the crashes I was seeing.  A bit more testing would be welcome
  though.  Note that this needs the commit to drmP.h I just made, which
  might not have made it to all the anoncvs servers yet.
 
 I saw the X server crash yesterday, while watching a video in
 fullscreen using minitube.
 I will apply your diff and report back asap...

Since matthieu@ confirmed it fixes his problems as well, I've
committed the diff.  So it should show up on your favourite anoncvs
mirror shortly.



Re: somewhat important inteldrm fix

2014-02-05 Thread David Coppa
On Wed, Feb 5, 2014 at 11:43 AM, Mark Kettenis mark.kette...@xs4all.nl wrote:
 From: David Coppa dco...@gmail.com
 Date: Wed, 5 Feb 2014 09:01:45 +0100

 On Tue, Feb 4, 2014 at 11:55 PM, Mark Kettenis mark.kette...@xs4all.nl 
 wrote:
  Running the updated xf86-video-intel driver uncovered a bug in the
  kernel drm code.  The page fault handler wasn't handling some of the
  possible errors correctly.  This made the X server die with a SIGSEGV.
  The diff below brings things closer to what Linux does, and seems to
  fix the crashes I was seeing.  A bit more testing would be welcome
  though.  Note that this needs the commit to drmP.h I just made, which
  might not have made it to all the anoncvs servers yet.

 I saw the X server crash yesterday, while watching a video in
 fullscreen using minitube.
 I will apply your diff and report back asap...

 Since matthieu@ confirmed it fixes his problems as well, I've
 committed the diff.  So it should show up on your favourite anoncvs
 mirror shortly.

OK, thanks.

cheers,
david



Re: help needed from someone with an sk(4)

2014-02-05 Thread Henning Brauer
* David Higgs hig...@gmail.com [2014-01-25 18:25]:
 On Jan 25, 2014, at 12:48 AM, David Higgs hig...@gmail.com wrote:
 
 On Fri, Jan 24, 2014 at 4:24 AM, Henning Brauer
 lists-openbsdt...@bsws.de wrote:
 
 * Henning Brauer lists-openbsdt...@bsws.de [2014-01-24 05:50]:
 
 i need this tested on an sk(4).
 I don't have that hardware at all.
 
 
 this gets rid of a slight little bit more.
 
 
 Resurrected an old box, kernel compile w/ patch is underway.  Should
 be able to provide feedback tomorrow.  I'm kinda new to this - do I
 need to exercise forwarding, do some speed tests, or is it sufficient
 to just make sure that host-based usage doesn't break?  Snapshot dmesg
 below.
 
 
 No problems seen with host usage, after downloading a couple files w/ FTP
 and updating source via CVS.

awesome, thanks for verifying!

a mistake would have shown up VERY early and clearly.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services GmbH, http://bsws.de, Full-Service ISP
Secure Hosting, Mail and DNS Services. Dedicated Servers, Root to Fully Managed
Henning Brauer Consulting, http://henningbrauer.com/



Re: somewhat important inteldrm fix

2014-02-05 Thread janis
 Running the updated xf86-video-intel driver uncovered a bug in the
 kernel drm code.  The page fault handler wasn't handling some of the
 possible errors correctly.  This made the X server die with a SIGSEGV.
 The diff below brings things closer to what Linux does, and seems to
 fix the crashes I was seeing.  A bit more testing would be welcome
 though.  Note that this needs the commit to drmP.h I just made, which
 might not have made it to all the anoncvs servers yet.
 
 
 Index: i915_gem.c
 ===
 RCS file: /cvs/src/sys/dev/pci/drm/i915/i915_gem.c,v
 retrieving revision 1.68
 diff -u -p -r1.68 i915_gem.c
 --- i915_gem.c2 Feb 2014 10:54:10 -   1.68
 +++ i915_gem.c4 Feb 2014 22:47:53 -
 @@ -1522,6 +1522,7 @@ i915_gem_fault(struct drm_gem_object *ge
   NULL, NULL);
   DRM_UNLOCK();
   dev_priv->entries--;
 + pmap_update(ufi->orig_map->pmap);
   uvm_wait("intelflt");
   return (VM_PAGER_REFAULT);
   }
 @@ -1533,18 +1534,42 @@ unlock:
   DRM_UNLOCK();
   dev_priv->entries--;
   pmap_update(ufi->orig_map->pmap);
 - if (ret == -EIO) {
 +
 + switch (ret) {
 + case -EIO:
 + /* If this -EIO is due to a gpu hang, give the reset code a
 +  * chance to clean up the mess. Otherwise return the proper
 +  * SIGBUS. */
 + if (!atomic_read(&dev_priv->mm.wedged))
 + return VM_PAGER_ERROR;
 + case -EAGAIN:
 + /* Give the error handler a chance to run and move the
 +  * objects off the GPU active list. Next time we service the
 +  * fault, we should be able to transition the page into the
 +  * GTT without touching the GPU (and so avoid further
 +  * EIO/EGAIN). If the GPU is wedged, then there is no issue
 +  * with coherency, just lost writes.
 +  */
 +#if 0
 + set_need_resched();
 +#endif
 + case 0:
 + case -ERESTART:
 + case -EINTR:
 + case -EBUSY:
   /*
 -  * EIO means we're wedged, so upon resetting the gpu we'll
 -  * be alright and can refault. XXX only on resettable chips.
 +  * EBUSY is ok: this just means that another thread
 +  * already did the job.
*/
 - ret = VM_PAGER_REFAULT;
 - } else if (ret) {
 - ret = VM_PAGER_ERROR;
 - } else {
 - ret = VM_PAGER_OK;
 + return VM_PAGER_OK;
 + case -ENOMEM:
 + return VM_PAGER_ERROR;
 + case -ENOSPC:
 + return VM_PAGER_ERROR;
 + default:
 + WARN_ONCE(ret, "unhandled error in i915_gem_fault: %i\n", ret);
 + return VM_PAGER_ERROR;
   }
 - return ret;
  }
  
  /**
 
Thanks, this works well for me too -- no X segfaults anymore.
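
The error handling in the diff boils down to a switch that maps a negative
errno value onto one of two pager results. A minimal userland sketch of that
mapping follows, assuming stand-in names: enum pager_result and
fault_result() are hypothetical, and the kernel-only -ERESTART case is left
out because userland headers do not portably define it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's VM_PAGER_* results. */
enum pager_result { PAGER_OK, PAGER_ERROR };

/*
 * ret is 0 or a negative errno value; wedged is nonzero when the GPU is
 * hung and waiting for a reset.
 */
enum pager_result
fault_result(int ret, int wedged)
{
	assert(ret <= 0);
	switch (ret) {
	case -EIO:
		/* An -EIO that is not caused by a hang is a real error. */
		if (!wedged)
			return PAGER_ERROR;
		/* FALLTHROUGH */
	case -EAGAIN:
	case 0:
	case -EINTR:
	case -EBUSY:
		/* Success, or a transient condition. */
		return PAGER_OK;
	case -ENOMEM:
	case -ENOSPC:
	default:
		return PAGER_ERROR;
	}
}
```

The point of the fallthrough is that a wedged GPU is treated like the other
transient cases, while anything unrecognised is reported as an error instead
of being silently accepted.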



fail to boot snapshot 5.5 on MBPro8,2

2014-02-05 Thread Sven-Volker Nowarra
Hi,

I tried to install 5.5 snapshots from 2 Feb and 3 Feb onto my MacBook Pro 8,2
laptop; both failed. I then tried an older snapshot from
spacehopper.org/mirrmon, which claimed to be 12 days old. It failed as well.

The MacBook Pro runs perfectly well with 5.4, and the 5.5 bsd.rd lets me go
through the installation. After a reboot (bsd.mp), the system hangs after the
line where the disk should be mounted, where the nvram/clock message normally
appears (that is, before going into userland). The screen then quickly turns a
steady white/grey, so the blue kernel lines can't be read anymore.

I then tried to boot with bsd -c, using an external USB keyboard, but it hangs
as well (so no difference from the internal keyboard). It looks like I have
two flickering prompts (underlines) on the screen.

In this state I cannot provide any dmesg :-(

The first line after boot -c shows the "booting hd0a:bsd:
7635180+1660460+1097336..." line, and then the "entry point at" line. The
kernel's blue lines say:

kbc: cmd word write error
[ using 926384 bytes of bsd ELF symbol table ]
Copyright (c) 1982, 1986, 1989, 1991, 1993
  The Regents of the University ...

OpenBSD 5.5-beta (GENERIC.MP) #284: Mon Feb  3 07:57:32 MST 2014
  t...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
RTC BIOS diagnostic error 
ffclock_battery,ROM_cksum,config_unit,memory_size,fixed_disk,invalid_time
real mem = 4185079808 (3991MB)
avail mem = 4065452032 (3877MB)
User Kernel Config
UKC

The errors on the RTC BIOS line also appear in the fully functional 5.4
version. But I couldn't find the line "kbc: cmd word write error" in other
dmesgs on OpenBSD-tech.

As mentioned, no dmesg is possible. Anyone else with MBPro issues?

Any idea where to get old snapshots?
I could try to build the kernel from scratch: I successfully built 5.4, but
building -current shows the same behaviour as the snapshots. Any ideas where
to go from here?
(No serial port, obviously.)

Volker



Use explicit_bzero in login_*

2014-02-05 Thread Arto Jonsson
Index: login_chpass/login_chpass.c
===
RCS file: /cvs/src/libexec/login_chpass/login_chpass.c,v
retrieving revision 1.16
diff -u -p -r1.16 login_chpass.c
--- login_chpass/login_chpass.c 4 Dec 2012 02:24:47 -   1.16
+++ login_chpass/login_chpass.c 5 Feb 2014 15:44:26 -
@@ -208,7 +208,7 @@ yp_chpass(char *username)
pwd_gensalt(salt, sizeof(salt), lc, 'y') == 0)
strlcpy(salt, "xx", sizeof(salt));
crypt(p, salt);
-   memset(p, 0, strlen(p));
+   explicit_bzero(p, strlen(p));
}
warnx("YP passwd database unchanged.");
exit(1);
Index: login_lchpass/login_lchpass.c
===
RCS file: /cvs/src/libexec/login_lchpass/login_lchpass.c,v
retrieving revision 1.14
diff -u -p -r1.14 login_lchpass.c
--- login_lchpass/login_lchpass.c   4 Dec 2012 02:24:47 -   1.14
+++ login_lchpass/login_lchpass.c   5 Feb 2014 15:44:27 -
@@ -136,7 +136,7 @@ main(int argc, char *argv[])
exit(1);
 
salt = crypt(p, salt);
-   memset(p, 0, strlen(p));
+   explicit_bzero(p, strlen(p));
if (!pwd || strcmp(salt, pwd->pw_passwd) != 0)
exit(1);
 
Index: login_passwd/login.c
===
RCS file: /cvs/src/libexec/login_passwd/login.c,v
retrieving revision 1.10
diff -u -p -r1.10 login.c
--- login_passwd/login.c1 Jun 2012 01:43:19 -   1.10
+++ login_passwd/login.c5 Feb 2014 15:44:27 -
@@ -158,7 +158,7 @@ main(int argc, char **argv)
 #endif
 
if (password != NULL)
-   memset(password, 0, strlen(password));
+   explicit_bzero(password, strlen(password));
if (ret != AUTH_OK)
fprintf(back, BI_REJECT "\n");
 
Index: login_passwd/login_passwd.c
===
RCS file: /cvs/src/libexec/login_passwd/login_passwd.c,v
retrieving revision 1.9
diff -u -p -r1.9 login_passwd.c
--- login_passwd/login_passwd.c 9 Mar 2006 19:14:10 -   1.9
+++ login_passwd/login_passwd.c 5 Feb 2014 15:44:27 -
@@ -62,7 +62,7 @@ pwd_login(char *username, char *password
 
salt = crypt(password, salt);
plen = strlen(password);
-   memset(password, 0, plen);
+   explicit_bzero(password, plen);
 
/*
 * Authentication fails if the user does not exist in the password
Index: login_tis/login_tis.c
===
RCS file: /cvs/src/libexec/login_tis/login_tis.c,v
retrieving revision 1.11
diff -u -p -r1.11 login_tis.c
--- login_tis/login_tis.c   4 Dec 2012 02:24:47 -   1.11
+++ login_tis/login_tis.c   5 Feb 2014 15:44:27 -
@@ -394,8 +394,8 @@ tis_getkey(struct tis_connection *tc)
}
DES_string_to_key(key, cblock);
error = DES_set_key(cblock, tc->keysched);
-   memset(key, 0, len);
-   memset(cblock, 0, sizeof(cblock));
+   explicit_bzero(key, len);
+   explicit_bzero(cblock, sizeof(cblock));
free(tbuf);
return (error);
 }
@@ -507,10 +507,10 @@ tis_recv(struct tis_connection *tc, u_ch
len, ks, iv, DES_DECRYPT);
if (strlcpy(buf, tbuf, bufsiz) >= bufsiz) {
syslog(LOG_ERR, "unencrypted data too large to store");
-   memset(tbuf, 0, sizeof(tbuf));
+   explicit_bzero(tbuf, sizeof(tbuf));
return (-1);
}
-   memset(tbuf, 0, sizeof(tbuf));
+   explicit_bzero(tbuf, sizeof(tbuf));
}
return (len);
 }
@@ -656,7 +656,7 @@ tis_authorize(struct tis_connection *tc,
syslog(LOG_ERR, "unexpected response from authsrv: %s", obuf);
resp = error;
}
-   memset(buf, 0, sizeof(buf));
+   explicit_bzero(buf, sizeof(buf));
 
return (resp);
 }
@@ -684,10 +684,10 @@ tis_verify(struct tis_connection *tc, co
if (strncmp(buf, "ok", 2) == 0) {
if (buf[2] != '\0')
strlcpy(ebuf, buf + 3, TIS_BUFSIZ);
-   memset(buf, 0, sizeof(buf));
+   explicit_bzero(buf, sizeof(buf));
return (0);
}
strlcpy(ebuf, buf, TIS_BUFSIZ);
-   memset(buf, 0, sizeof(buf));
+   explicit_bzero(buf, sizeof(buf));
return (-1);
 }
Index: login_yubikey/login_yubikey.c
===
RCS file: /cvs/src/libexec/login_yubikey/login_yubikey.c,v
retrieving revision 1.8
diff -u -p -r1.8 login_yubikey.c
--- login_yubikey/login_yubikey.c   27 Nov 2013 21:25:25 -  1.8
+++ login_yubikey/login_yubikey.c   5 Feb 2014 15:44:27 -
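
Some background on why the memset() calls are being replaced: when a buffer
is cleared and then never read again, the compiler is allowed to treat the
memset() as a dead store and drop it, which would leave the password in
memory. explicit_bzero(3) is documented as not being optimised away. For
systems without it, a common fallback idiom is to write through a volatile
pointer. The sketch below shows that idiom; wipe() and is_wiped() are
hypothetical names and are not part of this diff.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical fallback for explicit_bzero(3): the stores go through a
 * volatile pointer so the compiler does not discard them as dead.
 */
void
wipe(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0)
		*p++ = 0;
}

/* Returns 1 if every byte of buf is zero, 0 otherwise. */
int
is_wiped(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

Note that clearing strlen(p) bytes, as several hunks above do, only covers
the string up to its terminating NUL and not the rest of the buffer.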

Re: ok to kill stdio.h in strsep.c?

2014-02-05 Thread Stefan Sperling
On Sat, Jan 25, 2014 at 01:49:24AM -0500, Jean-Philippe Ouellet wrote:
 It appeared in revision 1.3 (Update from lite2.)
 
 It's the only one in the string family that has it, and nothing from it
 is used.

I think this change is fine.
I'll commit this soon if I don't hear objections.

 Index: strsep.c
 ===
 RCS file: /cvs/src/lib/libc/string/strsep.c,v
 retrieving revision 1.6
 diff -u -p -r1.6 strsep.c
 --- strsep.c8 Aug 2005 08:05:37 -   1.6
 +++ strsep.c25 Jan 2014 06:41:18 -
 @@ -30,7 +30,6 @@
   */
 
  #include <string.h>
 -#include <stdio.h>
 
  /*
   * Get next token from string *stringp, where tokens are possibly-empty
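
For reference, a caller of strsep(3) does not need <stdio.h> either. A small
sketch, where count_tokens() is a hypothetical helper written for this
example; it also shows the "possibly-empty" tokens mentioned in the comment
above.

```c
#define _DEFAULT_SOURCE	/* exposes strsep() on glibc in strict ISO C mode */
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: split s in place on any byte in delims and count
 * the tokens, empty ones included.
 */
int
count_tokens(char *s, const char *delims)
{
	char *tok;
	int n = 0;

	assert(s != NULL && delims != NULL);
	while ((tok = strsep(&s, delims)) != NULL)
		n++;
	return n;
}
```

With the writable string "a,,b" and the delimiter ",", this yields three
tokens: "a", an empty token, and "b".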



Re: ok to kill stdio.h in strsep.c?

2014-02-05 Thread Kenneth Westerback
ok krw@

On 5 February 2014 13:15, Stefan Sperling s...@openbsd.org wrote:
 On Sat, Jan 25, 2014 at 01:49:24AM -0500, Jean-Philippe Ouellet wrote:
 It appeared in revision 1.3 (Update from lite2.)

 It's the only one in the string family that has it, and nothing from it
 is used.

 I think this change is fine.
 I'll commit this soon if I don't hear objections.

 Index: strsep.c
 ===
 RCS file: /cvs/src/lib/libc/string/strsep.c,v
 retrieving revision 1.6
 diff -u -p -r1.6 strsep.c
 --- strsep.c8 Aug 2005 08:05:37 -   1.6
 +++ strsep.c25 Jan 2014 06:41:18 -
 @@ -30,7 +30,6 @@
   */

  #include <string.h>
 -#include <stdio.h>

  /*
   * Get next token from string *stringp, where tokens are possibly-empty




quick fix for uvm deadlocks

2014-02-05 Thread Ted Unangst
We are missing back pressure channels from uvm to the buf cache. The
buf cache will happily sit on 9000 free pages while uvm churns around
trying to scavenge up one more page.

Fixing this is beyond the scope of a simple diff, but here's something
that seems to help in a lot of the common cases, particularly the "pla
deadlock detected" spin cycle that people see.

If we're out of memory, kick the buf cache into releasing some
clean pages. The buf cache may eventually find itself sorely in need
of memory and unable to get it, but this is better than nothing. I've
deliberately saved the back pressure until we're already on the "about to
die" path to minimize regressions. uvm won't steal back memory unless
it absolutely has to.

Index: kern/vfs_bio.c
===
RCS file: /cvs/src/sys/kern/vfs_bio.c,v
retrieving revision 1.154
diff -u -p -r1.154 vfs_bio.c
--- kern/vfs_bio.c  25 Jan 2014 04:23:31 -  1.154
+++ kern/vfs_bio.c  5 Feb 2014 22:08:07 -
@@ -305,6 +305,26 @@ bufadjust(int newbufpages)
splx(s);
 }
 
+int
+buf_nukeclean(void)
+{
+   struct buf *bp;
+   int n;
+
+   printf("nuking clean bufs\n");
+   n = 0;
+   while ((bp = TAILQ_FIRST(&bufqueues[BQ_CLEAN])) && n++ < 10) {
+   bremfree(bp);
+   if (bp->b_vp) {
+   RB_REMOVE(buf_rb_bufs,
+   &bp->b_vp->v_bufs_tree, bp);
+   brelvp(bp);
+   }
+   buf_put(bp);
+   }
+   return (n);
+}
+
 /*
  * Make the buffer cache back off from cachepct.
  */
Index: sys/buf.h
===
RCS file: /cvs/src/sys/sys/buf.h,v
retrieving revision 1.93
diff -u -p -r1.93 buf.h
--- sys/buf.h   21 Nov 2013 01:16:52 -  1.93
+++ sys/buf.h   5 Feb 2014 22:04:09 -
@@ -312,6 +312,8 @@ void buf_fix_mapping(struct buf *, vsize
 void   buf_alloc_pages(struct buf *, vsize_t);
 void   buf_free_pages(struct buf *);
 
+int    buf_nukeclean(void);
+
 
 void   minphys(struct buf *bp);
 int    physio(void (*strategy)(struct buf *), dev_t dev, int flags,
Index: uvm/uvm_pdaemon.c
===
RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
retrieving revision 1.64
diff -u -p -r1.64 uvm_pdaemon.c
--- uvm/uvm_pdaemon.c   30 May 2013 16:29:46 -  1.64
+++ uvm/uvm_pdaemon.c   5 Feb 2014 22:04:15 -
@@ -116,7 +116,7 @@ uvm_wait(const char *wmsg)
 * check for page daemon going to sleep (waiting for itself)
 */
 
-   if (curproc == uvm.pagedaemon_proc) {
+   if (curproc == uvm.pagedaemon_proc && buf_nukeclean() == 0) {
/*
 * now we have a problem: the pagedaemon wants to go to
 * sleep until it frees more memory.   but how can it
Index: uvm/uvm_pmemrange.c
===
RCS file: /cvs/src/sys/uvm/uvm_pmemrange.c,v
retrieving revision 1.36
diff -u -p -r1.36 uvm_pmemrange.c
--- uvm/uvm_pmemrange.c 29 Jan 2013 19:55:48 -  1.36
+++ uvm/uvm_pmemrange.c 5 Feb 2014 22:07:37 -
@@ -22,6 +22,7 @@
 #include <sys/malloc.h>
 #include <sys/proc.h>  /* XXX for atomic */
 #include <sys/kernel.h>
+#include <sys/buf.h>
 
 /*
  * 2 trees: addr tree and size tree.
@@ -1883,6 +1884,13 @@ uvm_wait_pla(paddr_t low, paddr_t high, 
const char *wmsg = "pmrwait";
 
if (curproc == uvm.pagedaemon_proc) {
+   uvm_unlock_fpageq();
+   if (buf_nukeclean() != 0) {
+   uvm_lock_fpageq();
+   return 0;
+   }
+   uvm_lock_fpageq();
+
/*
 * XXX detect pagedaemon deadlock - see comment in
 * uvm_wait(), as this is exactly the same issue.
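
The core of buf_nukeclean() is "take a bounded number of entries off the head
of the clean queue and report how many went away". A userland sketch of that
idea with the <sys/queue.h> TAILQ macros follows. struct cbuf,
fill_clean_queue() and release_clean() are hypothetical stand-ins, not the
kernel's struct buf or any function in this diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a cached buffer. */
struct cbuf {
	TAILQ_ENTRY(cbuf) link;
};
TAILQ_HEAD(cbuf_queue, cbuf);

/* Append count freshly allocated buffers to the clean queue. */
void
fill_clean_queue(struct cbuf_queue *q, int count)
{
	struct cbuf *bp;
	int i;

	for (i = 0; i < count; i++) {
		bp = malloc(sizeof(*bp));
		assert(bp != NULL);
		TAILQ_INSERT_TAIL(q, bp, link);
	}
}

/*
 * Release up to limit buffers from the head of the clean queue and return
 * how many were released; 0 tells the caller nothing could be freed.
 */
int
release_clean(struct cbuf_queue *q, int limit)
{
	struct cbuf *bp;
	int n = 0;

	assert(q != NULL && limit >= 0);
	while (n < limit && (bp = TAILQ_FIRST(q)) != NULL) {
		TAILQ_REMOVE(q, bp, link);
		free(bp);
		n++;
	}
	return n;
}
```

The sketch checks the limit before touching the queue, so the return value
is exactly the number of buffers released. The callers in the diff only
distinguish zero from nonzero.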




Re: quick fix for uvm deadlocks

2014-02-05 Thread Bob Beck
On Wed, Feb 5, 2014 at 3:17 PM, Ted Unangst t...@tedunangst.com wrote:
 We are missing back pressure channels from uvm to the buf cache. The
 buf cache will happily sit on 9000 free pages while uvm churns around
 trying to scavenge up one more page.

Indeed, those are its minimums (I presume in your case) and are exactly
the amount of memory that uvm would never have even seen under the model
of the static cache. So I don't agree with your statement "we are
missing back pressure channels from the uvm to the buf cache".

It looks to me like the situation you are talking about is that the
buffer cache has already backed off to its minimum (which is used to
ensure things like avoiding deadlocks in the buffer cache on delayed
writes and fun stuff like that).

Or are you in a situation here where the cache has *not* backed off?



 Fixing this is beyond the scope of a simple diff, but here's something
 that seems to help in a lot of the common cases, particularly the pla
 deadlock detected spin cycle that people see.

 If we're out of memory, kick the buf cache into releasing some
 clean pages. The buf cache may eventually find itself sorely in need
 of memory and unable to get it, but this is better than nothing. I've
 deliberately saved the back pressure until we're already on the about to
 die path to minimize regressions. uvm won't steal back memory unless
 it absolutely has to.

And for the reasons you say, I think this has great potential to move
the deadlock into the buffer cache on small memory machines when we get
the entire cache filled with delwri...

Sure, we can make the minimums smaller, but in the end we still have to
fix the page daemon or we are just delaying the inevitable or moving the
deadlock to other subsystems.





 Index: kern/vfs_bio.c
 ===
 RCS file: /cvs/src/sys/kern/vfs_bio.c,v
 retrieving revision 1.154
 diff -u -p -r1.154 vfs_bio.c
 --- kern/vfs_bio.c  25 Jan 2014 04:23:31 -  1.154
 +++ kern/vfs_bio.c  5 Feb 2014 22:08:07 -
 @@ -305,6 +305,26 @@ bufadjust(int newbufpages)
 splx(s);
  }

 +int
 +buf_nukeclean(void)
 +{
 +   struct buf *bp;
 +   int n;
 +
 +   printf("nuking clean bufs\n");
 +   n = 0;
 +   while ((bp = TAILQ_FIRST(&bufqueues[BQ_CLEAN])) && n++ < 10) {
 +   bremfree(bp);
 +   if (bp->b_vp) {
 +   RB_REMOVE(buf_rb_bufs,
 +   &bp->b_vp->v_bufs_tree, bp);
 +   brelvp(bp);
 +   }
 +   buf_put(bp);
 +   }
 +   return (n);
 +}
 +
  /*
   * Make the buffer cache back off from cachepct.
   */
 Index: sys/buf.h
 ===
 RCS file: /cvs/src/sys/sys/buf.h,v
 retrieving revision 1.93
 diff -u -p -r1.93 buf.h
 --- sys/buf.h   21 Nov 2013 01:16:52 -  1.93
 +++ sys/buf.h   5 Feb 2014 22:04:09 -
 @@ -312,6 +312,8 @@ void buf_fix_mapping(struct buf *, vsize
  void   buf_alloc_pages(struct buf *, vsize_t);
  void   buf_free_pages(struct buf *);

 +int    buf_nukeclean(void);
 +

  void   minphys(struct buf *bp);
  int    physio(void (*strategy)(struct buf *), dev_t dev, int flags,
 Index: uvm/uvm_pdaemon.c
 ===
 RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
 retrieving revision 1.64
 diff -u -p -r1.64 uvm_pdaemon.c
 --- uvm/uvm_pdaemon.c   30 May 2013 16:29:46 -  1.64
 +++ uvm/uvm_pdaemon.c   5 Feb 2014 22:04:15 -
 @@ -116,7 +116,7 @@ uvm_wait(const char *wmsg)
  * check for page daemon going to sleep (waiting for itself)
  */

 -   if (curproc == uvm.pagedaemon_proc) {
 +   if (curproc == uvm.pagedaemon_proc && buf_nukeclean() == 0) {
 /*
  * now we have a problem: the pagedaemon wants to go to
  * sleep until it frees more memory.   but how can it
 Index: uvm/uvm_pmemrange.c
 ===
 RCS file: /cvs/src/sys/uvm/uvm_pmemrange.c,v
 retrieving revision 1.36
 diff -u -p -r1.36 uvm_pmemrange.c
 --- uvm/uvm_pmemrange.c 29 Jan 2013 19:55:48 -  1.36
 +++ uvm/uvm_pmemrange.c 5 Feb 2014 22:07:37 -
 @@ -22,6 +22,7 @@
  #include <sys/malloc.h>
  #include <sys/proc.h>  /* XXX for atomic */
  #include <sys/kernel.h>
 +#include <sys/buf.h>

  /*
   * 2 trees: addr tree and size tree.
 @@ -1883,6 +1884,13 @@ uvm_wait_pla(paddr_t low, paddr_t high,
 const char *wmsg = "pmrwait";

 if (curproc == uvm.pagedaemon_proc) {
 +   uvm_unlock_fpageq();
 +   if (buf_nukeclean() != 0) {
 +   uvm_lock_fpageq();
 +   return 0;
 +   }
 +   uvm_lock_fpageq();
 +
 /*
  * XXX detect pagedaemon deadlock - see comment in
  * uvm_wait(), as this is exactly the same 

Re: quick fix for uvm deadlocks

2014-02-05 Thread Ted Unangst
On Wed, Feb 05, 2014 at 17:53, Bob Beck wrote:
 On Wed, Feb 5, 2014 at 3:17 PM, Ted Unangst t...@tedunangst.com wrote:
 We are missing back pressure channels from uvm to the buf cache. The
 buf cache will happily sit on 9000 free pages while uvm churns around
 trying to scavenge up one more page.

 Or are you in a situation here where the cache has *not* backed off?

Talked to Bob and hashed out better ideas of the problem. The page
daemon does tell the buffer cache to make some room, but...

If you have a huge mmap file, the pdaemon will try to flush it out via
VOP_WRITE, which circles back via ffs into buf_get, which eats those
previously freed pages, and then some, as the pagedaemon continues
pushing more and more of the mmap file out.

We discussed some other changes and fixes that this situation has
clearly highlighted, but here's a slightly revised diff. It now uses
the correct bufbackoff() function to communicate uvm's needs. Any
other fix is rather precarious for this release, but as stated before,
this keeps the change to the deadlock paths. You were already dead,
but now you have a second chance.

(We don't currently use the pmemrange argument; we'll have to adjust
accordingly when the bufcache becomes range aware.)

Index: uvm_pdaemon.c
===
RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
retrieving revision 1.64
diff -u -p -r1.64 uvm_pdaemon.c
--- uvm_pdaemon.c   30 May 2013 16:29:46 -  1.64
+++ uvm_pdaemon.c   6 Feb 2014 03:09:53 -
@@ -117,6 +117,8 @@ uvm_wait(const char *wmsg)
 */
 
if (curproc == uvm.pagedaemon_proc) {
+   if (bufbackoff(NULL, 4) == 0)
+   return;
/*
 * now we have a problem: the pagedaemon wants to go to
 * sleep until it frees more memory.   but how can it
Index: uvm_pmemrange.c
===
RCS file: /cvs/src/sys/uvm/uvm_pmemrange.c,v
retrieving revision 1.36
diff -u -p -r1.36 uvm_pmemrange.c
--- uvm_pmemrange.c 29 Jan 2013 19:55:48 -  1.36
+++ uvm_pmemrange.c 6 Feb 2014 03:10:32 -
@@ -22,6 +22,7 @@
 #include <sys/malloc.h>
 #include <sys/proc.h>  /* XXX for atomic */
 #include <sys/kernel.h>
+#include <sys/mount.h>
 
 /*
  * 2 trees: addr tree and size tree.
@@ -1883,6 +1884,13 @@ uvm_wait_pla(paddr_t low, paddr_t high, 
const char *wmsg = "pmrwait";
 
if (curproc == uvm.pagedaemon_proc) {
+   uvm_unlock_fpageq();
+   if (bufbackoff(NULL, atop(size)) == 0) {
+   uvm_lock_fpageq();
+   return 0;
+   }
+   uvm_lock_fpageq();
+
/*
 * XXX detect pagedaemon deadlock - see comment in
 * uvm_wait(), as this is exactly the same issue.
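
The uvm_wait_pla() hunk follows a familiar shape: drop the free page queue
lock, call into the other subsystem, and take the lock again on every return
path. Below is a userland sketch of that shape under loud assumptions:
struct pagestate, cache_backoff() and reclaim_or_fail() are hypothetical, a
plain flag stands in for the real lock, and the rule that the callee must run
unlocked is a guess at why the diff unlocks first (presumably because
releasing pages needs that lock itself).

```c
#include <assert.h>

/* Hypothetical model state; a flag stands in for the fpageq lock. */
struct pagestate {
	int fpageq_locked;	/* 1 while the lock is held */
	long cache_pages;	/* pages the cache could still give back */
	long free_pages;	/* pages on the free list */
};

/*
 * Models a back-off request: returns 0 if some pages were released,
 * -1 if the cache had nothing to give.
 */
int
cache_backoff(struct pagestate *ps, long npages)
{
	assert(!ps->fpageq_locked);	/* assumed: called without the lock */
	if (ps->cache_pages == 0)
		return -1;
	if (npages > ps->cache_pages)
		npages = ps->cache_pages;
	ps->cache_pages -= npages;
	ps->free_pages += npages;
	return 0;
}

/*
 * Entered and left with the lock held. Returns 0 if memory was
 * reclaimed, -1 if the caller has to fall back to its deadlock path.
 */
int
reclaim_or_fail(struct pagestate *ps, long npages)
{
	int ok;

	assert(ps->fpageq_locked);
	ps->fpageq_locked = 0;		/* drop before calling out */
	ok = (cache_backoff(ps, npages) == 0);
	ps->fpageq_locked = 1;		/* re-take on every return path */
	return ok ? 0 : -1;
}
```

Taking the lock back before both returns mirrors the two
uvm_lock_fpageq() calls in the hunk, so the caller sees the same lock state
whether or not any pages were freed.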



Re: quick fix for uvm deadlocks

2014-02-05 Thread Bob Beck
Yes, this is much better, although I think this problem related to
big mmaps has been with us for a while.

It also appears to avoid the problem with the offending test programs on
my machines.

I'm ok with that going in for the moment, although I want some of your
time to look at that nasty shit we talked about :)  I think
we can make that a lot better with some NOCACHE..


On Wed, Feb 5, 2014 at 9:03 PM, Ted Unangst t...@tedunangst.com wrote:
 On Wed, Feb 05, 2014 at 17:53, Bob Beck wrote:
 On Wed, Feb 5, 2014 at 3:17 PM, Ted Unangst t...@tedunangst.com wrote:
 We are missing back pressure channels from uvm to the buf cache. The
 buf cache will happily sit on 9000 free pages while uvm churns around
 trying to scavenge up one more page.

 Or are you in a situation here where the cache has *not* backed off?

 Talked to Bob and hashed out better ideas of the problem. The page
 daemon does tell the buffer cache to make some room, but...

 If you have a huge mmap file, the pdaemon will try to flush it out via
 VOP_WRITE, which circles back via ffs into buf_get, which eats those
 previously freed pages, and then some, as the pagedaemon continues
 pushing more and more of the mmap file out.

 We discussed some other changes and fixes that this situation has
 clearly highlighted, but here's a slightly revised diff. It now uses
 the correct bufbackoff() function to communicate uvm's needs. Any
 other fix is rather precarious for this release, but as stated before,
 this keeps the change to the deadlock paths. You were already dead,
 but now you have a second chance.

 (We don't currently use the pmemrange argument; we'll have to adjust
 accordingly when the bufcache becomes range aware.)

 Index: uvm_pdaemon.c
 ===
 RCS file: /cvs/src/sys/uvm/uvm_pdaemon.c,v
 retrieving revision 1.64
 diff -u -p -r1.64 uvm_pdaemon.c
 --- uvm_pdaemon.c   30 May 2013 16:29:46 -  1.64
 +++ uvm_pdaemon.c   6 Feb 2014 03:09:53 -
 @@ -117,6 +117,8 @@ uvm_wait(const char *wmsg)
  */

 if (curproc == uvm.pagedaemon_proc) {
 +   if (bufbackoff(NULL, 4) == 0)
 +   return;
 /*
  * now we have a problem: the pagedaemon wants to go to
  * sleep until it frees more memory.   but how can it
 Index: uvm_pmemrange.c
 ===
 RCS file: /cvs/src/sys/uvm/uvm_pmemrange.c,v
 retrieving revision 1.36
 diff -u -p -r1.36 uvm_pmemrange.c
 --- uvm_pmemrange.c 29 Jan 2013 19:55:48 -  1.36
 +++ uvm_pmemrange.c 6 Feb 2014 03:10:32 -
 @@ -22,6 +22,7 @@
  #include <sys/malloc.h>
  #include <sys/proc.h>  /* XXX for atomic */
  #include <sys/kernel.h>
 +#include <sys/mount.h>

  /*
   * 2 trees: addr tree and size tree.
 @@ -1883,6 +1884,13 @@ uvm_wait_pla(paddr_t low, paddr_t high,
 const char *wmsg = "pmrwait";

 if (curproc == uvm.pagedaemon_proc) {
 +   uvm_unlock_fpageq();
 +   if (bufbackoff(NULL, atop(size)) == 0) {
 +   uvm_lock_fpageq();
 +   return 0;
 +   }
 +   uvm_lock_fpageq();
 +
 /*
  * XXX detect pagedaemon deadlock - see comment in
  * uvm_wait(), as this is exactly the same issue.