Re: CVS commit: src/sys/kern

2014-07-31 Thread Tyler Retzlaff
On Thu, Jul 31, 2014 at 08:28:59PM +, matthew green wrote:
 Module Name:  src
 Committed By: mrg
 Date: Thu Jul 31 20:28:59 UTC 2014
 
 Modified Files:
   src/sys/kern: uipc_socket.c
 
 Log Message:
 call -pr_abort(so) now instead of generic PRU_ABORT.
 fixes kern/49056, and appears to remove the only missed PRU_ABORT call.

this was the right thing and after re-checking i agree it was the only
missed call for pr_generic() req = PRU_ABORT.

thanks!

 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.229 -r1.230 src/sys/kern/uipc_socket.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
 


Re: CVS commit: src/sys/kern

2014-07-25 Thread Maxime Villard
Le 25/07/2014 10:25, David A. Holland a écrit :
 
 Module Name:  src
 Committed By: dholland
 Date: Fri Jul 25 08:25:47 UTC 2014
 
 Modified Files:
   src/sys/kern: syscalls.master vfs_syscalls.c
 
 Log Message:
 Add fdiscard and posix_fallocate syscalls.
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.269 -r1.270 src/sys/kern/syscalls.master
 cvs rdiff -u -r1.487 -r1.488 src/sys/kern/vfs_syscalls.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
 

I think 'error' instead of 'result' is better; it makes clear that it's
an error code, and it's consistent with the other functions in the file.


Re: CVS commit: src/sys/kern

2014-07-25 Thread David Holland
On Fri, Jul 25, 2014 at 11:21:47AM +0200, Maxime Villard wrote:
   Log Message:
   Add fdiscard and posix_fallocate syscalls.
  
  I think 'error' instead of 'result' is better; it makes clear that it's
  an error code, and it's consistent with the other functions in the file.

!

If you care that much, change it...

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/kern

2014-07-25 Thread Maxime Villard
Le 25/07/2014 16:47, David Holland a écrit :
 
 On Fri, Jul 25, 2014 at 11:21:47AM +0200, Maxime Villard wrote:
Log Message:
Add fdiscard and posix_fallocate syscalls.
   
   I think 'error' instead of 'result' is better; it makes clear that it's
   an error code, and it's consistent with the other functions in the file.
 
 !
 
 If you care that much, change it...
 

Also,


http://pubs.opengroup.org/onlinepubs/009695399/functions/posix_fallocate.html

indicates that

The posix_fallocate() function shall fail if:
[...]
[EINVAL]
The len argument was zero or the offset argument was less than zero.

but AFAICT there's no len == 0 check.

4718if (pos  0 || len  0 || len  OFF_T_MAX - pos) {
4719return EINVAL;
4720}


Re: CVS commit: src/sys/kern

2014-07-03 Thread Maxime Villard
 Maxime Villard m...@netbsd.org wrote:
  |Module Name:  src
  |Committed By: maxv
  |Date: Tue Jun 24 07:28:23 UTC 2014
  |
  |Modified Files:
  | src/sys/kern: subr_kmem.c
  |
  |Log Message:
  |KMEM_REDZONE+KMEM_POISON is supposed to detect buffer overflows. But it only
  |poisons memory after kmem_roundup_size(), which means that if an overflow
  |occurs in the page padding, it won't be detected.
  |
  |Fix this by making KMEM_REDZONE independent from KMEM_POISON and making it
  |put a 2-byte pattern at the end of each requested buffer, and check it when
  |freeing memory to ensure the caller hasn't written outside \
  |the requested area.
 
 Interesting.
 Having no idea of kernel programming i blindly assume that those
 pages are somehow isolated against preceeding pages, so that no
 checks of the lower bound are necessary or even useful, and of
 course checking wether the address as such can be safely accessed
 is also not necessary / done differently.
 
 But, whereas i really think it is a smart idea to use
 a mathematically verifieable pattern, how can you be sure that the
 pattern doesn't generate values which are extremely common,
 especially at E-O-B, such as '\0'?  Shouldn't at least 0 be
 replaced with a different value?

That was in my TODO list, it's fixed in r1.59.


Re: CVS commit: src/sys/kern

2014-07-03 Thread Steffen Nurpmeso
Maxime Villard m...@m00nbsd.net wrote:
 |That was in my TODO list, it's fixed in r1.59.

Oh please, did you hear me complain?
It is NetBSD in the end.
(Cute :-)

--steffen


Re: CVS commit: src/sys/kern

2014-07-03 Thread Maxime Villard
Le 03/07/2014 16:47, Steffen Nurpmeso a écrit :
 
 Maxime Villard m...@m00nbsd.net wrote:
  |Le 03/07/2014 15:59, Steffen Nurpmeso a écrit :
  | 
  | Maxime Villard m...@m00nbsd.net wrote:
  ||That was in my TODO list, it's fixed in r1.59.
  | 
  | Oh please, did you hear me complain?
  | It is NetBSD in the end.
  | (Cute :-)
  |
  |What do you mean?
 
 Scratch the first two sentences of mine.
 

I'm not sure I understand what you mean, but if I understand what is to be
understood (or what you would like me to understand), then I would give the
following answer - which fully answers your question, with the same allusive
tone:

Is there a reference ID in the mail you received from me?

Understand what is to be understood. And if you want to keep fucking around,
then let's discuss privately, because I'm not sure people will be interested
in this discussion.


Re: CVS commit: src/sys/kern

2014-06-24 Thread Steffen Nurpmeso
Maxime Villard m...@netbsd.org wrote:
 |Module Name:  src
 |Committed By: maxv
 |Date: Tue Jun 24 07:28:23 UTC 2014
 |
 |Modified Files:
 | src/sys/kern: subr_kmem.c
 |
 |Log Message:
 |KMEM_REDZONE+KMEM_POISON is supposed to detect buffer overflows. But it only
 |poisons memory after kmem_roundup_size(), which means that if an overflow
 |occurs in the page padding, it won't be detected.
 |
 |Fix this by making KMEM_REDZONE independent from KMEM_POISON and making it
 |put a 2-byte pattern at the end of each requested buffer, and check it when
 |freeing memory to ensure the caller hasn't written outside \
 |the requested area.

Interesting.
Having no idea of kernel programming i blindly assume that those
pages are somehow isolated against preceeding pages, so that no
checks of the lower bound are necessary or even useful, and of
course checking wether the address as such can be safely accessed
is also not necessary / done differently.

But, whereas i really think it is a smart idea to use
a mathematically verifieable pattern, how can you be sure that the
pattern doesn't generate values which are extremely common,
especially at E-O-B, such as '\0'?  Shouldn't at least 0 be
replaced with a different value?

(I always used fixed patterns for that, e.g., for S-nail i have
a mix of regular and irregular bit patterns:

   if (__xl.p_ui8p[0] != 0xDE) __i |= 10;\
   if (__xl.p_ui8p[1] != 0xAA) __i |= 11;\
   if (__xl.p_ui8p[2] != 0x55) __i |= 12;\
   if (__xl.p_ui8p[3] != 0xAD) __i |= 13;\
   if (__xl.p_ui8p[4] != 0xBE) __i |= 14;\
   if (__xl.p_ui8p[5] != 0x55) __i |= 15;\
   if (__xl.p_ui8p[6] != 0xAA) __i |= 16;\
   if (__xl.p_ui8p[7] != 0xEF) __i |= 17;\
   if (__i != 0) {\
  (BAD) = TRU1;\
  alert(%p: corrupt lower canary: 0x%02X: %s, line %u,\
 __xl.p_p, __i, mdbg_file, mdbg_line);\
   }\

though i guess havin 0xAA first is even better here.)

--steffen


Re: CVS commit: src/sys/kern

2014-06-15 Thread J. Hannken-Illjes
On 14 Jun 2014, at 18:12, Joerg Sonnenberger jo...@netbsd.org wrote:

 Module Name:  src
 Committed By: joerg
 Date: Sat Jun 14 16:12:34 UTC 2014
 
 Modified Files:
   src/sys/kern: vfs_cache.c
 
 Log Message:
 Make the stat mutex a leaf. XXX Use atomic counters.

This cannot work.

We now run cache_lookup_entry() without a cpu_lock and without
namecache_lock so the list may change while it gets traversed.

See the comment on top of cache_lookup_entry().

Please revert your changes to vfs_cache.c.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



re: CVS commit: src/sys/kern

2014-04-27 Thread David Brownlee
I'm away from my machine,  but I believe the simple test for this is to
build.sh a kernel and see if nm shows pool_head
On 26 Apr 2014 20:11, matthew green m...@eterna.com.au wrote:


 David Brownlee writes:
  Module Name:  src
  Committed By: abs
  Date: Sat Apr 26 16:30:05 UTC 2014
 
  Modified Files:
src/sys/kern: subr_pool.c
 
  Log Message:
  Ensure pool_head is non static - for vmstat -i

 did this use to work?

 does it work if you build your kernel with GCC 4.5?

 it was recently noticed that GDB is no longer able to see
 static symbols in netbsd.gdb images, and the preliminary
 testing i've done so far for amd64 shows that GCC 4.5 works.


 .mrg.



re: CVS commit: src/sys/kern

2014-04-26 Thread matthew green

David Brownlee writes:
 Module Name:  src
 Committed By: abs
 Date: Sat Apr 26 16:30:05 UTC 2014
 
 Modified Files:
   src/sys/kern: subr_pool.c
 
 Log Message:
 Ensure pool_head is non static - for vmstat -i

did this use to work?

does it work if you build your kernel with GCC 4.5?

it was recently noticed that GDB is no longer able to see
static symbols in netbsd.gdb images, and the preliminary
testing i've done so far for amd64 shows that GCC 4.5 works.


.mrg.


Re: CVS commit: src/sys/kern

2014-04-16 Thread J. Hannken-Illjes

On 15 Apr 2014, at 20:18, Taylor R Campbell 
campbell+netbsd-source-change...@mumble.net wrote:

   Date: Tue, 15 Apr 2014 09:50:45 +
   From: Juergen Hannken-Illjes hann...@netbsd.org
 
   Fix a deadlock where one thread exits, enters fstrans_lwp_dtor()
   and wants fstrans_lock.  This thread holds the proc_lock.
 
 This sounds fishy.  lwp_exit does not hold proc_lock while calling
 lwp_finispecific, so there are no invariants covered by proc_lock that
 the lwp_specific destructors can rely on.  I'm inclined to say that it
 is a bug for exit1 to hold proc_lock when it calls lwp_finispecific
 (and proc_finispecific).  Can we just release it before and re-acquire
 it after calling lwp/proc_finispecific?
 
   Another thread holds fstrans_lock and runs pserialize_perform().
   As the first thread holds the proc_lock, timeouts are blocked and
   the second thread blocks forever in kpause().
 
 This also sounds fishy.  How does T1's holding proc_lock cause T2 to
 block forever in kpause?  I think I'm missing something in this
 analysis.  kpause doesn't take proc_lock, does it?

Kpause() registers a callout and sleeps.

On the next clock tick the softclk thread calls timer_intr() which
wants the proc_lock.  Now we successfully stopped all softclk threads.

The callout for kpause() will not fire - deadlock/

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: CVS commit: src/sys/kern

2014-04-15 Thread Taylor R Campbell
   Date: Tue, 15 Apr 2014 09:50:45 +
   From: Juergen Hannken-Illjes hann...@netbsd.org

   Fix a deadlock where one thread exits, enters fstrans_lwp_dtor()
   and wants fstrans_lock.  This thread holds the proc_lock.

This sounds fishy.  lwp_exit does not hold proc_lock while calling
lwp_finispecific, so there are no invariants covered by proc_lock that
the lwp_specific destructors can rely on.  I'm inclined to say that it
is a bug for exit1 to hold proc_lock when it calls lwp_finispecific
(and proc_finispecific).  Can we just release it before and re-acquire
it after calling lwp/proc_finispecific?

   Another thread holds fstrans_lock and runs pserialize_perform().
   As the first thread holds the proc_lock, timeouts are blocked and
   the second thread blocks forever in kpause().

This also sounds fishy.  How does T1's holding proc_lock cause T2 to
block forever in kpause?  I think I'm missing something in this
analysis.  kpause doesn't take proc_lock, does it?


re: CVS commit: src/sys/kern

2014-04-13 Thread matthew green

  btw, why do you keep adding 4 space idented { } ?
 
 To narrow local var scopes.  They should go away eventually...

please do not commit these to the tree, keep them as local changes.


.mrg.


Re: CVS commit: src/sys/kern

2014-04-12 Thread Christos Zoulas
In article 20140412150856.b6ca...@cvs.netbsd.org,
Masao Uebayashi source-changes-d@NetBSD.org wrote:
-=-=-=-=-=-
-  /* setup new registers and do misc. setup. */
-  (*epp-ep_esch-es_emul-e_setregs)(l, epp, data-ed_newstack);
+{

No need for braces, we use c99 in the kernel.

christos



Re: CVS commit: src/sys/kern

2014-04-12 Thread Nick Hudson

On 04/12/14 16:08, Masao Uebayashi wrote:

Module Name:src
Committed By:   uebayasi
Date:   Sat Apr 12 15:08:56 UTC 2014

Modified Files:
src/sys/kern: kern_exec.c

Log Message:
execve_runproc: Correct thinko in Rev. 1.386; the new SP always points to
after (higher adderss) argc/argv/env/aux/strings regardless of stack growing
direction .


Doesn't this mean that argc, etc will get overwritten on _rtld_start 
call for stack grows down machines?


SP needs to point to the first available stack address. On stack grows 
down machines this is a lower value than argc, etc. On stack grows up 
machines this is a higher value.


btw, why do you keep adding 4 space idented { } ?

Nick



Re: CVS commit: src/sys/kern

2014-04-12 Thread Masao Uebayashi
On Sun, Apr 13, 2014 at 12:33 AM, Nick Hudson sk...@netbsd.org wrote:
 On 04/12/14 16:08, Masao Uebayashi wrote:

 Module Name:src
 Committed By:   uebayasi
 Date:   Sat Apr 12 15:08:56 UTC 2014

 Modified Files:
 src/sys/kern: kern_exec.c

 Log Message:
 execve_runproc: Correct thinko in Rev. 1.386; the new SP always points to
 after (higher adderss) argc/argv/env/aux/strings regardless of stack
 growing
 direction .

 Doesn't this mean that argc, etc will get overwritten on _rtld_start call
 for stack grows down machines?

 SP needs to point to the first available stack address. On stack grows down
 machines this is a lower value than argc, etc. On stack grows up machines
 this is a higher value.

http://www.netbsd.org/~uebayasi/execve-stack-growdown.pdf
http://www.netbsd.org/~uebayasi/execve-stack-growup.pdf

Initial SP points to STACK_GROW(minsaddr, ssize).  rtld can use the
given stack freely (contradicting obsolete comments found in
kern_exec.c).

stack-growing-up rtld has to figure out argc.  This is possible by
(vaddr_t)ps_argvstr - (vaddr_t)sizeof(argc).  Note that argc on stack
is long (== char *), not int.  hppa's rtld assumes int, which has to
be fixed for hppa64.

http://nxr.netbsd.org/xref/src/libexec/ld.elf_so/arch/hppa/rtld_start.S#88

 btw, why do you keep adding 4 space idented { } ?

To narrow local var scopes.  They should go away eventually...


Re: CVS commit: src/sys/kern

2014-03-06 Thread Alan Barrett

On Wed, 05 Mar 2014, Andreas Gustafsson wrote:
Changing the types of existing sysctl variables breaks both 
source and binary compatibility and should not be done lightly;


Changing the types without sufficient care can break source 
and binary compatibility.  With sufficient care, compatibility 
can be maintained, and I thought that dsl had attempted to do 
that.  However, I think that a change like that should have been 
discussed first.


that's why we have both both hw.physmem and hw.physmem64, for 
example.


I think that it was a mistake (made several years ago) to 
introduce hw.physmem64, and that hw.physmem should have been 
extended to allow larger values, in a backward compatible way, 
very much like whay dsl has attempted to do now.  Some details 
might be wrong, and it should have been discussed first, but the 
principle of returning what the caller expects seems reasonable to 
me.


I believe the harm caused by the incompatible type change far 
outweighs the cost of a few added lines of code, and would like 
the original types to be restored.


I agree that an incompatible type change would be harmful.  If 
that problem exists, then it could be fixed either by changing the 
types back, or by fixing the compatibility.  Without knowing the 
reason for the type change, I don't know which of those options I 
prefer in the long term.  For the short term, read on.



2. I also object to the change of kern_syctl.c 1.247.

This change attempts to work around the problems caused by 
the changes to the variable types by making sysctl() return 
different types depending on the value of the *oldlenp argument.


IMO, this is a bad idea.  The *oldlenp argument does *not* 
specify the size of the data type expected by the caller, but 
rather the size of a buffer.  The sysctl() API allows the caller 
to pass a buffer larger than the variable being read, and 
conversely, guarantees that passing a buffer that is too small 
results in ENOMEM.


Both of these aspects of the API are now broken: reading a 
4-byte CTLTYPE_INT variable now works for any buffer size = 
4 *except* 8, and attempting to read an 8-byte CTLTYPE_QUAD 
variable into a buffer of less than 8 bytes is now guaranteed to 
yield ENOMEM *except* if the buffer size happens to be 4.  IMO, 
this behavior violates both the letter of the sysctl() man page 
and the principle of least astonishment.


Also, the work-around is ineffective in the case of a caller 
that allocates the buffer dynamically using the size given by an 
initial sysctl() call with oldp = NULL.


Yes, I think you are right about the details, and we should 
probably revert the change, at least until a design is discussed 
that meets all reasonable requirements for compatibility.


--apb (Alan Barrett)


Re: CVS commit: src/sys/kern

2014-03-06 Thread David Laight
On Wed, Mar 05, 2014 at 06:04:02PM +0200, Andreas Gustafsson wrote:
 
 2. I also object to the change of kern_syctl.c 1.247.
 
 This change attempts to work around the problems caused by the changes
 to the variable types by making sysctl() return different types
 depending on the value of the *oldlenp argument.
 
 IMO, this is a bad idea.  The *oldlenp argument does *not* specify the
 size of the data type expected by the caller, but rather the size of a
 buffer.  The sysctl() API allows the caller to pass a buffer larger
 than the variable being read, and conversely, guarantees that passing
 a buffer that is too small results in ENOMEM.
 
 Both of these aspects of the API are now broken: reading a 4-byte
 CTLTYPE_INT variable now works for any buffer size = 4 *except* 8,

That wasn't the intent of the change.
The intent was that if the size was 8 then the code would return
a numeric value of size 8, otherwise the size would be chnaged to
4 and/or ENOMEM (stupid errno choice) returned.

 and attempting to read an 8-byte CTLTYPE_QUAD variable into a buffer
 of less than 8 bytes is now guaranteed to yield ENOMEM *except* if the
 buffer size happens to be 4.

A request to read a CTLTYPE_QUAD variable into a buffer that is shorter
than 8 bytes has always been a programming error.
The intent of the change was to relax that is the length happened to be 4.

 IMO, this behavior violates both the
 letter of the sysctl() man page and the principle of least astonishment.

I'm not sure about the latter.
I run 'sysctl -a' and find the name of the sysctl I'm interested in.
The result is a small number so I pass the address and size of a integer
variable and then print the result.
(Or the value is rather large and I think it might exceed 2^31 so I
use an int64.)
The 'principle of least astonishment' would mean that I get the value
that 'sysctl -a' printed.

On a BE system I have to be extremely careful with the return values
from sysctl() or I see garbage.

Note that code calling systctl() has to either know whether the value
it is expecting is a string, structure, or number, or use the API calls
that expose the kernel internals in order to find out.

 Also, the work-around is ineffective in the case of a caller that
 allocates the buffer dynamically using the size given by an initial
 sysctl() call with oldp = NULL.

Code that does that for a numeric value will be quite happy with
either a 32bit of 64bit result.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2014-03-05 Thread Andreas Gustafsson
On Feb 27, David Laight said:
 Module Name:  src
 Committed By: dsl
 Date: Thu Feb 27 22:50:52 UTC 2014
 
 Modified Files:
   src/sys/kern: kern_sysctl.c
 
 Log Message:
 Allow CTLTYPE_INT and CTLTYPE_QUAD to be read and written as either 4 or 8
   byte values regardless of the type.
 64bit writes to 32bit variables must be valid (signed) values.
 32bit reads of large values return -1.
 Amongst other things this should fix libm's code that reads machdep.sse
   as a 32bit int, but I'd changed it to 64bit (to common up some code).
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.246 -r1.247 src/sys/kern/kern_sysctl.c

In private correspondence with David, I have objected to this commit
as well as his earlier change to the type of the machdep.sse
sysctl variable.

Regrettably, David and I have not been able to reach an agreement, and
he has requested that I ask for opinions on a public list.  So...

1. I object to the earlier change of the following sysctl variables
from type CTLTYPE_INT to type CTLTYPE_QUAD:

machdep.fpu_present
machdep.osfxsr
machdep.sse
machdep.sse2
machdep.biosbasemem
machdep.biosextmem

My understanding of David's rationale for changing them is that it
facilitated sharing code with the machdep.xsave_features variable
which actually needs to be of type CTLTYPE_QUAD.  By my reckoning, 
the savings from this code sharing amount to eight lines of code.

Changing the types of existing sysctl variables breaks both source 
and binary compatibility and should not be done lightly; that's
why we have both both hw.physmem and hw.physmem64, for example.
Now, the types of multiple variables have been incompatibly changed
for no reason other than saving a few lines of code.

I believe the harm caused by the incompatible type change far
outweighs the cost of a few added lines of code, and would like 
the original types to be restored.  I am attaching a patch to do so;
it also changes the newly added variables machdep.fpu_save and
machdep.fpu_save_size to CTLTYPE_INT, but if David really thinks
those should be CTLTYPE_QUAD, I will not argue that point.


2. I also object to the change of kern_syctl.c 1.247.

This change attempts to work around the problems caused by the changes
to the variable types by making sysctl() return different types
depending on the value of the *oldlenp argument.

IMO, this is a bad idea.  The *oldlenp argument does *not* specify the
size of the data type expected by the caller, but rather the size of a
buffer.  The sysctl() API allows the caller to pass a buffer larger
than the variable being read, and conversely, guarantees that passing
a buffer that is too small results in ENOMEM.

Both of these aspects of the API are now broken: reading a 4-byte
CTLTYPE_INT variable now works for any buffer size = 4 *except* 8,
and attempting to read an 8-byte CTLTYPE_QUAD variable into a buffer
of less than 8 bytes is now guaranteed to yield ENOMEM *except* if the
buffer size happens to be 4.  IMO, this behavior violates both the
letter of the sysctl() man page and the principle of least astonishment.

Also, the work-around is ineffective in the case of a caller that
allocates the buffer dynamically using the size given by an initial
sysctl() call with oldp = NULL.

If the original types of the sysctl variables are restored, this
work-around will no longer serve a purpose, and I'm asking for it
to be removed.

Opinions?
-- 
Andreas Gustafsson, g...@netbsd.org

Index: x86_machdep.c
===
RCS file: /cvsroot/src/sys/arch/x86/x86/x86_machdep.c,v
retrieving revision 1.63
diff -u -r1.63 x86_machdep.c
--- x86_machdep.c   23 Feb 2014 22:38:40 -  1.63
+++ x86_machdep.c   3 Mar 2014 18:41:06 -
@@ -1056,7 +1056,17 @@
 }
 
 static void
-const_sysctl(struct sysctllog **clog, const char *name, u_quad_t value, int 
tag)
+const_sysctl_int(struct sysctllog **clog, const char *name, int value, int tag)
+{
+   sysctl_createv(clog, 0, NULL, NULL,
+  CTLFLAG_PERMANENT | CTLFLAG_IMMEDIATE,
+  CTLTYPE_INT, name, NULL,
+  NULL, (u_quad_t)value, NULL, 0,
+  CTL_MACHDEP, tag, CTL_EOL);
+}
+
+static void
+const_sysctl_quad(struct sysctllog **clog, const char *name, u_quad_t value, 
int tag)
 {
sysctl_createv(clog, 0, NULL, NULL,
   CTLFLAG_PERMANENT | CTLFLAG_IMMEDIATE,
@@ -1115,17 +1125,17 @@
   CTL_MACHDEP, CTL_CREATE, CTL_EOL);
 
/* None of these can ever change once the system has booted */
-   const_sysctl(clog, fpu_present, i386_fpu_present, CPU_FPU_PRESENT);
-   const_sysctl(clog, osfxsr, i386_use_fxsave, CPU_OSFXSR);
-   const_sysctl(clog, sse, i386_has_sse, CPU_SSE);
-   const_sysctl(clog, sse2, i386_has_sse2, CPU_SSE2);
-
-   const_sysctl(clog, fpu_save, x86_fpu_save, CTL_CREATE);
-   const_sysctl(clog, fpu_save_size, 

Re: CVS commit: src/sys/kern

2014-03-05 Thread Christos Zoulas
In article 21271.19186.176517.832...@guava.gson.org,
Andreas Gustafsson  g...@netbsd.org wrote:
On Feb 27, David Laight said:
 Module Name: src
 Committed By:dsl
 Date:Thu Feb 27 22:50:52 UTC 2014
 
 Modified Files:
  src/sys/kern: kern_sysctl.c
 
 Log Message:
 Allow CTLTYPE_INT and CTLTYPE_QUAD to be read and written as either 4 or 8
   byte values regardless of the type.
 64bit writes to 32bit variables must be valid (signed) values.
 32bit reads of large values return -1.
 Amongst other things this should fix libm's code that reads machdep.sse
   as a 32bit int, but I'd changed it to 64bit (to common up some code).
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.246 -r1.247 src/sys/kern/kern_sysctl.c

In private correspondence with David, I have objected to this commit
as well as his earlier change to the type of the machdep.sse
sysctl variable.

Regrettably, David and I have not been able to reach an agreement, and
he has requested that I ask for opinions on a public list.  So...

1. I object to the earlier change of the following sysctl variables
from type CTLTYPE_INT to type CTLTYPE_QUAD:

machdep.fpu_present
machdep.osfxsr
machdep.sse
machdep.sse2
machdep.biosbasemem
machdep.biosextmem

My understanding of David's rationale for changing them is that it
facilitated sharing code with the machdep.xsave_features variable
which actually needs to be of type CTLTYPE_QUAD.  By my reckoning, 
the savings from this code sharing amount to eight lines of code.

Changing the types of existing sysctl variables breaks both source 
and binary compatibility and should not be done lightly; that's
why we have both both hw.physmem and hw.physmem64, for example.
Now, the types of multiple variables have been incompatibly changed
for no reason other than saving a few lines of code.

I believe the harm caused by the incompatible type change far
outweighs the cost of a few added lines of code, and would like 
the original types to be restored.  I am attaching a patch to do so;
it also changes the newly added variables machdep.fpu_save and
machdep.fpu_save_size to CTLTYPE_INT, but if David really thinks
those should be CTLTYPE_QUAD, I will not argue that point.


2. I also object to the change of kern_syctl.c 1.247.

This change attempts to work around the problems caused by the changes
to the variable types by making sysctl() return different types
depending on the value of the *oldlenp argument.

IMO, this is a bad idea.  The *oldlenp argument does *not* specify the
size of the data type expected by the caller, but rather the size of a
buffer.  The sysctl() API allows the caller to pass a buffer larger
than the variable being read, and conversely, guarantees that passing
a buffer that is too small results in ENOMEM.

Both of these aspects of the API are now broken: reading a 4-byte
CTLTYPE_INT variable now works for any buffer size = 4 *except* 8,
and attempting to read an 8-byte CTLTYPE_QUAD variable into a buffer
of less than 8 bytes is now guaranteed to yield ENOMEM *except* if the
buffer size happens to be 4.  IMO, this behavior violates both the
letter of the sysctl() man page and the principle of least astonishment.

Also, the work-around is ineffective in the case of a caller that
allocates the buffer dynamically using the size given by an initial
sysctl() call with oldp = NULL.

If the original types of the sysctl variables are restored, this
work-around will no longer serve a purpose, and I'm asking for it
to be removed.

Opinions?

I think you are correct.

christos



Re: CVS commit: src/sys/kern

2014-03-01 Thread David Laight
On Sat, Mar 01, 2014 at 08:31:42AM +0200, Alan Barrett wrote:
 On Thu, 27 Feb 2014, David Laight wrote:
 Modified Files:
  src/sys/kern: kern_sysctl.c
 
 +case CTLTYPE_INT:
 +/* Allow for 64bit read of 32bit value */
 +if (*oldlenp == sizeof (uint64_t)) {
 +qval = *(int *)d;
 +d_out = qval;
 +sz =  sizeof (uint64_t);
 +}
 +break;
 +case CTLTYPE_QUAD:
 +/* Allow for 32bit read of 64bit value */
 +if (*oldlenp == sizeof (int)) {
 +qval = *(uint64_t *)d;
 +ival = qval  0x1 ? qval : 0x;
 +d_out = ival;
 +sz =  sizeof (int);
 +}
 +break;
 
 This will fail if int is not a 32-bit type, because it uses hardcoded
 masks instead of adapting to the actual size.

It goes wrong if 'int' is larger than 'uint64_t' as well!
I'll fix it for 64bit int (modulo any signed v unsigned warnings).

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2014-02-28 Thread Alan Barrett

On Thu, 27 Feb 2014, David Laight wrote:

Modified Files:
src/sys/kern: kern_sysctl.c



+   case CTLTYPE_INT:
+   /* Allow for 64bit read of 32bit value */
+   if (*oldlenp == sizeof (uint64_t)) {
+   qval = *(int *)d;
+   d_out = qval;
+   sz =  sizeof (uint64_t);
+   }
+   break;
+   case CTLTYPE_QUAD:
+   /* Allow for 32bit read of 64bit value */
+   if (*oldlenp == sizeof (int)) {
+   qval = *(uint64_t *)d;
+   ival = qval  0x1 ? qval : 0x;
+   d_out = ival;
+   sz =  sizeof (int);
+   }
+   break;


This will fail if int is not a 32-bit type, because it uses hardcoded
masks instead of adapting to the actual size.

--apb (Alan Barrett)


Re: CVS commit: src/sys/kern

2014-01-12 Thread Martin Husemann
On Sun, Jan 12, 2014 at 03:17:43AM +, David Holland wrote:
 On Sun, Jan 12, 2014 at 02:13:34PM +1100, matthew green wrote:
   i do not want this for my serial console systems, or my systems
   that drop to ddb on crashes before (manual) rebooting.
 
 Surely it should be set up so it skips the delay if ddb is active?

Isn't it better to have the console driver set an optional flush hook
and do this magic [or even an explicit hypervisor call] inside xencons?

Martin


re: CVS commit: src/sys/kern

2014-01-11 Thread matthew green

 Module Name:  src
 Committed By: joerg
 Date: Sun Jan 12 00:29:15 UTC 2014
 
 Modified Files:
   src/sys/kern: subr_prf.c
 
 Log Message:
 Revert, breaks the build due to missing rumpns_delay in librump.so.

thanks.

when this comes back, i'd rather it defaulted to zero but was a
sysctl controllable value.

i do not want this for my serial console systems, or my systems
that drop to ddb on crashes before (manual) rebooting.


.mrg.


Re: CVS commit: src/sys/kern

2014-01-11 Thread David Holland
On Sun, Jan 12, 2014 at 02:13:34PM +1100, matthew green wrote:
   Modified Files:
  src/sys/kern: subr_prf.c
   
   Log Message:
   Revert, breaks the build due to missing rumpns_delay in librump.so.
  
  thanks.
  
  when this comes back, i'd rather it defaulted to zero but was a
  sysctl controllable value.
  
  i do not want this for my serial console systems, or my systems
  that drop to ddb on crashes before (manual) rebooting.

Surely it should be set up so it skips the delay if ddb is active?

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/kern

2014-01-11 Thread Christos Zoulas
On Jan 12,  3:17am, dholland-sourcechan...@netbsd.org (David Holland) wrote:
-- Subject: Re: CVS commit: src/sys/kern

| On Sun, Jan 12, 2014 at 02:13:34PM +1100, matthew green wrote:
|Modified Files:
| src/sys/kern: subr_prf.c
|
|Log Message:
|Revert, breaks the build due to missing rumpns_delay in librump.so.
|   
|   thanks.
|   
|   when this comes back, i'd rather it defaulted to zero but was a
|   sysctl controllable value.
|   
|   i do not want this for my serial console systems, or my systems
|   that drop to ddb on crashes before (manual) rebooting.
| 
| Surely it should be set up so it skips the delay if ddb is active?

Good idea.

christos


Re: CVS commit: src/sys/kern

2013-12-31 Thread Christos Zoulas
In article 20131231171938.ga26...@netbsd.org,
David Holland  dholland-sourcechan...@netbsd.org wrote:
On Tue, Dec 24, 2013 at 09:47:04AM -0500, Christos Zoulas wrote:
  Modified Files:
  src/sys/kern: kern_exec.c
  
  Log Message:
  replace strcpy with copystr and remove useless strcpy (Maxime Villard)

...how is this a step forward?

The comment is enforced.

(and that second strcpy is not useless, I put it there for a reason)

Can you explain the reason? It would be nice to have a comment about
things like this that are non-obvious.

christos



Re: CVS commit: src/sys/kern

2013-08-27 Thread Thor Lancelot Simon
On Tue, Aug 27, 2013 at 02:01:36PM +, Taylor R Campbell wrote:
 
 This is a stop-gap on a stop-gap on a stop-gap; we really ought to
 back out all of these stop-gaps, make bcm2835_rng call rnd_add_data
 asynchronously to work around the original symptom, and design a real
 solution when we have time to sort this mess out properly.

Having slept on it -- you're right.  Most of this series of changes
should be backed out (a few parts are worth keeping, like the tweaks
to the softint initialization).

In particular, the shortcut calls to rnd_process_events should go back
into the softint schedule routines, and be removed from the extract
functions.

Then, I think, as a better interim solution, we should:

1) Make the hardware RNGs loop, synchronously adding entropy
   until they've added a pool-full, at startup.

2) Make all the users of the callback interface add entropy
   asynchronously at startup, so early in kernel startup
   other RNGs are not initialized from a near-empty pool.

Thor


Re: CVS commit: src/sys/kern

2013-07-25 Thread Joerg Sonnenberger
On Tue, Jul 23, 2013 at 08:17:24AM +0100, David Laight wrote:
 On Mon, Jul 22, 2013 at 07:49:09AM +0100, Alexander Nasonov wrote:
  Taylor R Campbell wrote:
   This shouldn't be necessary: snprintf guarantees null termination.
   Did you observe a pool name without null termination in pool_init in
   the wild?
  
  I misremembered how snprintf works then. I didn't observe it in the wild
  but some dtrace code uses long vmem names (dtrace_aggid_%d).
 
 You might have remembered how the microsoft version of snprintf()
 (doesn't) work!

You are thinking about _snprintf :)

Joerg


Re: CVS commit: src/sys/kern

2013-07-22 Thread Alexander Nasonov
Taylor R Campbell wrote:
 This shouldn't be necessary: snprintf guarantees null termination.
 Did you observe a pool name without null termination in pool_init in
 the wild?

I misremembered how snprintf works then. I didn't observe it in the wild
but some dtrace code uses long vmem names (dtrace_aggid_%d).

Alex


Re: CVS commit: src/sys/kern

2013-04-21 Thread David Laight
On Sat, Apr 20, 2013 at 02:17:17PM -0700, Matt Thomas wrote:
 
 On Apr 20, 2013, at 11:04 AM, Christos Zoulas chris...@netbsd.org wrote:
 
  Module Name:src
  Committed By:   christos
  Date:   Sat Apr 20 18:04:41 UTC 2013
  
  Modified Files:
  src/sys/kern: kern_exec.c
  
  Log Message:
  don't attempt to load elf64 on 32 bit machines
 
 if the load address doesn't fit in 4GB.
 
 I've run elf64(n64) on mips n32 kernel with no problems.

Isn't an n32 kernel a 64bit one that uses 32bit pointers?
A little unusual.

Doesn't that require a COMPAT_64 layer in the kernel?
Since user pointers will be 64bit and kernel ones only 32.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2013-04-21 Thread Matt Thomas

On Apr 21, 2013, at 2:15 AM, David Laight da...@l8s.co.uk wrote:

 On Sat, Apr 20, 2013 at 02:17:17PM -0700, Matt Thomas wrote:
 
 On Apr 20, 2013, at 11:04 AM, Christos Zoulas chris...@netbsd.org wrote:
 
 Module Name:src
 Committed By:   christos
 Date:   Sat Apr 20 18:04:41 UTC 2013
 
 Modified Files:
 src/sys/kern: kern_exec.c
 
 Log Message:
 don't attempt to load elf64 on 32 bit machines
 
 if the load address doesn't fit in 4GB.
 
 I've run elf64(n64) on mips n32 kernel with no problems.
 
 Isn't an n32 kernel a 64bit one that uses 32bit pointers?
 A little unusual.
 
 Doesn't that require a COMPAT_64 layer in the kernel?
 Since user pointers will be 64bit and kernel ones only 32.

I cheated and used objcopy to convert a N32 binary to N64.



Re: CVS commit: src/sys/kern

2013-04-20 Thread Matt Thomas

On Apr 20, 2013, at 11:04 AM, Christos Zoulas chris...@netbsd.org wrote:

 Module Name:  src
 Committed By: christos
 Date: Sat Apr 20 18:04:41 UTC 2013
 
 Modified Files:
   src/sys/kern: kern_exec.c
 
 Log Message:
 don't attempt to load elf64 on 32 bit machines

if the load address doesn't fit in 4GB.

I've run elf64(n64) on mips n32 kernel with no problems.



Re: CVS commit: src/sys/kern

2013-04-01 Thread Christos Zoulas
In article 20130401123134.dbf3b17...@cvs.netbsd.org,
Martin Husemann source-changes-d@NetBSD.org wrote:
-  KASSERT(*timo  0);

This should really not be failing since itimespecfix uses tick to correct
the minimal timeout.

christos



Re: CVS commit: src/sys/kern

2013-04-01 Thread Martin Husemann
On Mon, Apr 01, 2013 at 02:28:02PM +, Christos Zoulas wrote:
 In article 20130401123134.dbf3b17...@cvs.netbsd.org,
 Martin Husemann source-changes-d@NetBSD.org wrote:
 -KASSERT(*timo  0);
 
 This should really not be failing since itimespecfix uses tick to correct
 the minimal timeout.

It does not for ts_sec == 0 and ts_nsec == 0.

Martin


Re: CVS commit: src/sys/kern

2013-04-01 Thread Christos Zoulas
On Apr 1,  4:32pm, mar...@duskware.de (Martin Husemann) wrote:
-- Subject: Re: CVS commit: src/sys/kern

| On Mon, Apr 01, 2013 at 02:28:02PM +, Christos Zoulas wrote:
|  In article 20130401123134.dbf3b17...@cvs.netbsd.org,
|  Martin Husemann source-changes-d@NetBSD.org wrote:
|  -  KASSERT(*timo  0);
|  
|  This should really not be failing since itimespecfix uses tick to correct
|  the minimal timeout.
| 
| It does not for ts_sec == 0 and ts_nsec == 0.

Probably cleaner to check for it there? Anyway, the question is really if
we should bother to sleep at all for 0, 0 or shortcut the return. I think
we should shortcut the return.

christos


Re: CVS commit: src/sys/kern

2013-04-01 Thread Martin Husemann
On Mon, Apr 01, 2013 at 11:36:57AM -0400, Christos Zoulas wrote:
 Probably cleaner to check for it there?

Maybe, but I wasn't sure about the other callers. There is similar
code in itimerfix() and changing it there symetrically made a lot
of things fail imediately, so I went for the simple fix.

 Anyway, the question is really if
 we should bother to sleep at all for 0, 0 or shortcut the return. I think
 we should shortcut the return.

Shortcut is what I did, or do you mean something else?

Martin


Re: CVS commit: src/sys/kern

2013-04-01 Thread Christos Zoulas
On Apr 1,  6:20pm, mar...@duskware.de (Martin Husemann) wrote:
-- Subject: Re: CVS commit: src/sys/kern

| On Mon, Apr 01, 2013 at 11:36:57AM -0400, Christos Zoulas wrote:
|  Probably cleaner to check for it there?
| 
| Maybe, but I wasn't sure about the other callers. There is similar
| code in itimerfix() and changing it there symetrically made a lot
| of things fail imediately, so I went for the simple fix.
| 
|  Anyway, the question is really if
|  we should bother to sleep at all for 0, 0 or shortcut the return. I think
|  we should shortcut the return.
| 
| Shortcut is what I did, or do you mean something else?

I meant fix it in itim{er,spec}fix. Yes, I will move the check out again,
because 0 means disable the timer.


christos


Re: CVS commit: src/sys/kern

2013-03-12 Thread David Laight
On Tue, Mar 12, 2013 at 01:31:21PM +1100, matthew green wrote:
 
   Modified Files:
 src/sys/kern: subr_pool.c
   
   Log Message:
   In pool_cache_put_slow(), pool_get() can block (it does mutex_enter()),
   so we need to retry if curlwp took a context switch during the call.
  
  I didn't think mutex_enter() blocked - isn't it a spinlock.
  Which means that if things are going wrong they can go wrong
  even if the mutex is available immediately.
 
 netbsd kernel mutexes can be either, but this one is a spinlock yeah.

Hmmm I thought that internal test had gone with lockmgr :-)
My guess is that every call site knows whether sleeping is allowed,
so it could be a property of the mutex type.
Then there would be no confusion about whether the code might sleep.

I know the uncontended path is now common to both (and fast) - unlike
lockmgr which took a couple of 100 instructions to decide what to do.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2013-03-11 Thread David Laight
On Mon, Mar 11, 2013 at 09:37:54PM +, Antti Kantee wrote:
 Module Name:  src
 Committed By: pooka
 Date: Mon Mar 11 21:37:54 UTC 2013
 
 Modified Files:
   src/sys/kern: subr_pool.c
 
 Log Message:
 In pool_cache_put_slow(), pool_get() can block (it does mutex_enter()),
 so we need to retry if curlwp took a context switch during the call.

I didn't think mutex_enter() blocked - isn't it a spinlock.
Which means that if things are going wrong they can go wrong
even if the mutex is available immediately.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2013-03-11 Thread YAMAMOTO Takashi
hi,

 Module Name:  src
 Committed By: pooka
 Date: Mon Mar 11 21:37:54 UTC 2013
 
 Modified Files:
   src/sys/kern: subr_pool.c
 
 Log Message:
 In pool_cache_put_slow(), pool_get() can block (it does mutex_enter()),
 so we need to retry if curlwp took a context switch during the call.
 Otherwise, CPU-local invariants can get screwed up:
 
 panic: kernel diagnostic assertion cur-pcg_avail == cur-pcg_size 
 failed
 
 This is (was) very easy to reproduce by just running:
 
   while : ; do RUMP_NCPU=32 ./a.out ; done
 
 where a.out only calls rump_init().  But, any situation there's contention
 and a pool doesn't have emptygroups would do.

depending on mutex_init's arguments (type and ipl), a mutex can be
spin or adaptive.
rump mutex implementation should honor the behaviour, i guess.

YAMAMOTO Takashi

 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.199 -r1.200 src/sys/kern/subr_pool.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


Re: CVS commit: src/sys/kern

2013-03-11 Thread Antti Kantee

On 11.03.2013 23:56, YAMAMOTO Takashi wrote:

Module Name:src
Committed By:   pooka
Date:   Mon Mar 11 21:37:54 UTC 2013

Modified Files:
src/sys/kern: subr_pool.c

Log Message:
In pool_cache_put_slow(), pool_get() can block (it does mutex_enter()),
so we need to retry if curlwp took a context switch during the call.
Otherwise, CPU-local invariants can get screwed up:

 panic: kernel diagnostic assertion cur-pcg_avail == cur-pcg_size failed

This is (was) very easy to reproduce by just running:

   while : ; do RUMP_NCPU=32 ./a.out ; done

where a.out only calls rump_init().  But, any situation there's contention
and a pool doesn't have emptygroups would do.


depending on mutex_init's arguments (type and ipl), a mutex can be
spin or adaptive.
rump mutex implementation should honor the behaviour, i guess.


Oh bummer, forgot that mutex_enter() can be a spin mutex entry.  Yea, 
I'll fix that soon and convert the otherwise harmless back-off to a 
KASSERT() -- I found some similar invariant failures with a websearch, 
so it might be a common problem.


  - antti


re: CVS commit: src/sys/kern

2013-03-11 Thread matthew green

  Modified Files:
  src/sys/kern: subr_pool.c
  
  Log Message:
  In pool_cache_put_slow(), pool_get() can block (it does mutex_enter()),
  so we need to retry if curlwp took a context switch during the call.
 
 I didn't think mutex_enter() blocked - isn't it a spinlock.
 Which means that if things are going wrong they can go wrong
 even if the mutex is available immediately.

netbsd kernel mutexes can be either, but this one is a spinlock yeah.


.mrg.


re: CVS commit: src/sys/kern

2013-02-19 Thread matthew green

 Module Name:  src
 Committed By: martin
 Date: Tue Feb 19 11:20:17 UTC 2013
 
 Modified Files:
   src/sys/kern: subr_xcall.c
 
 Log Message:
 Oops, accidently did not commit this part of pooka's change.

please update the commit log to reference the original commit,
rather than the above which will be much less obvious to someone
looking in a few years time.

thanks,


.mrg.


Re: CVS commit: src/sys/kern

2013-01-30 Thread Martin Husemann
On Wed, Jan 30, 2013 at 08:07:14AM +, David Laight wrote:
 On Tue, Jan 29, 2013 at 11:00:31PM +, Lars Heidieker wrote:
  Module Name:src
  Committed By:   para
  Date:   Tue Jan 29 23:00:31 UTC 2013
  
  Modified Files:
  src/sys/kern: kern_sysctl.c
  
  Log Message:
  fix the sysctl_setup_func typedef
 
 -typedef void (*sysctl_setup_func)(struct sysctllog **);
 +typedef void sysctl_setup_func(struct sysctllog **);
 
 IIRC you are only supposed to be able to typedef pointers to functions.

You can use typdef for a function instead of a function pointer, but then
you can only declare args/variables as sysctl_setup_func*.

This does not look like an improvement to me.

Martin


Re: CVS commit: src/sys/kern

2013-01-30 Thread Valeriy E. Ushakov
On Wed, Jan 30, 2013 at 09:36:56 +, Martin Husemann wrote:

 On Wed, Jan 30, 2013 at 08:07:14AM +, David Laight wrote:
  On Tue, Jan 29, 2013 at 11:00:31PM +, Lars Heidieker wrote:
   Module Name:  src
   Committed By: para
   Date: Tue Jan 29 23:00:31 UTC 2013
   
   Modified Files:
 src/sys/kern: kern_sysctl.c
   
   Log Message:
   fix the sysctl_setup_func typedef
  
  -typedef void (*sysctl_setup_func)(struct sysctllog **);
  +typedef void sysctl_setup_func(struct sysctllog **);
  
  IIRC you are only supposed to be able to typedef pointers to functions.
 
 You can use typdef for a function instead of a function pointer, but then
 you can only declare args/variables as sysctl_setup_func*.
 
 This does not look like an improvement to me.

__link_set* macros add that extra pointer to:

#define __link_set_decl(set, ptype) \
extern ptype * const __start_link_set_##set[] __dso_hidden; \
extern ptype * const __stop_link_set_##set[] __dso_hidden   \

-uwe


Re: CVS commit: src/sys/kern

2013-01-30 Thread Warner Losh

On Jan 30, 2013, at 1:07 AM, David Laight wrote:

 On Tue, Jan 29, 2013 at 11:00:31PM +, Lars Heidieker wrote:
 Module Name: src
 Committed By:para
 Date:Tue Jan 29 23:00:31 UTC 2013
 
 Modified Files:
  src/sys/kern: kern_sysctl.c
 
 Log Message:
 fix the sysctl_setup_func typedef
 
 -typedef void (*sysctl_setup_func)(struct sysctllog **);
 +typedef void sysctl_setup_func(struct sysctllog **);
 
 IIRC you are only supposed to be able to typedef pointers to functions.

It is totally legal, and has been for three decades or so...

 The extra level of indirection (that caused the horrid casting)
 is elsewhere.
 (I wasn't at all sure the previous 'fix' was right.)
 
   David
 
 -- 
 David Laight: da...@l8s.co.uk



Re: CVS commit: src/sys/kern

2013-01-29 Thread David Laight
On Tue, Jan 29, 2013 at 11:00:31PM +, Lars Heidieker wrote:
 Module Name:  src
 Committed By: para
 Date: Tue Jan 29 23:00:31 UTC 2013
 
 Modified Files:
   src/sys/kern: kern_sysctl.c
 
 Log Message:
 fix the sysctl_setup_func typedef

-typedef void (*sysctl_setup_func)(struct sysctllog **);
+typedef void sysctl_setup_func(struct sysctllog **);

IIRC you are only supposed to be able to typedef pointers to functions.

The extra level of indirection (that caused the horrid casting)
is elsewhere.
(I wasn't at all sure the previous 'fix' was right.)

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2012-12-30 Thread J. Hannken-Illjes
On Dec 29, 2012, at 10:56 PM, Christos Zoulas chris...@netbsd.org wrote:

 Always call brelse() on error. Otherwise a possible error from bread() will
 cause the buffer to stay lock and we end up blocking forever in
 VOP_CLOSE-spec_close-vinvalbuf-bbysy since the buffer is marked busy
 but there is no I/O pending.
 This caused my laptop to hang on boot_findwedge because:
findroot: unable to read block 358331527 of dev dk0 (22)

Thanks.  Just applied the same to breadn().

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: CVS commit: src/sys/kern

2012-12-30 Thread Christos Zoulas
On Dec 30, 10:21am, hann...@eis.cs.tu-bs.de (J. Hannken-Illjes) wrote:
-- Subject: Re: CVS commit: src/sys/kern

| Thanks.  Just applied the same to breadn().

Heh, I missed that!

Thanks,

christos


Re: CVS commit: src/sys/kern

2012-10-21 Thread David Holland
On Sat, Oct 20, 2012 at 09:33:03AM +0100, David Laight wrote:
  rename(a/b, a/c) and rename(a/c/x, a/b/y) will deadlock.
 
 Surely it just converts rename(a/c/x, a/b/y) into
 rename(a/c/x, a/c/y) which isn't quite the intended operation.
   
   No, it will (or can) deadlock. The problem is that for the second
   (cross-directory) rename, as far as it can tell a/c and a/b are
   incommensurate, so it locks in an arbtrary order (which for
   reasons of internal convenience is a/b first then a/c) but the
   same-dir rename locks a/b first then a/c.
  
  Why does the same-dir rename need to lock either a/b or a/c ?
  Surely it is just an operation on the 'file' contents of a itself.

No. And stop calling me Shirley. :-p

If they're directories, it has to update the .. entry of the source
dir and truncate the target dir, both of which operations definitely
require locking. You can't tell if they're directories without looking
them up, which currently involves locking them. Even if they aren't
directories, you still need to unlink and maybe thereby truncate the
target file, and many file systems, including e.g. ffs, need to
manipulate the link count of the source file as well. This too
requires locking.

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/kern

2012-10-20 Thread David Laight
On Sat, Oct 20, 2012 at 04:06:13AM +, David Holland wrote:
 On Fri, Oct 19, 2012 at 07:58:34AM +0100, David Laight wrote:
Module Name: src
Committed By:riastradh
Date:Fri Oct 19 02:07:23 UTC 2012

Modified Files:
 src/sys/kern: vfs_syscalls.c

Log Message:
No, we can't elide the fs-wide rename lock for same-directory rename.

rename(a/b, a/c) and rename(a/c/x, a/b/y) will deadlock.
   
   Surely it just converts rename(a/c/x, a/b/y) into
   rename(a/c/x, a/c/y) which isn't quite the intended operation.
 
 No, it will (or can) deadlock. The problem is that for the second
 (cross-directory) rename, as far as it can tell a/c and a/b are
 incommensurate, so it locks in an arbtrary order (which for
 reasons of internal convenience is a/b first then a/c) but the
 same-dir rename locks a/b first then a/c.

Why does the same-dir rename need to lock either a/b or a/c ?
Surely it is just an operation on the 'file' contents of a itself.

Clearly that can only be done if you detect the same-dir-ness of
the rename by analysing the filename paths, but that is likely the
common case.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2012-10-20 Thread Taylor R Campbell
   Date: Sat, 20 Oct 2012 04:06:13 +
   From: David Holland dholland-sourcechan...@netbsd.org

   On Fri, Oct 19, 2012 at 07:58:34AM +0100, David Laight wrote:
  Committed By:   riastradh
  Date:   Fri Oct 19 02:07:23 UTC 2012
  
  rename(a/b, a/c) and rename(a/c/x, a/b/y) will deadlock.
 
 Surely it just converts rename(a/c/x, a/b/y) into
 rename(a/c/x, a/c/y) which isn't quite the intended operation.

   No, it will (or can) deadlock. The problem is that for the second
   (cross-directory) rename, as far as it can tell a/c and a/b are
   incommensurate, so it locks in an arbtrary order (which for
   reasons of internal convenience is a/b first then a/c) but the
   same-dir rename locks a/b first then a/c.

We could avoid this case by breaking ties using vnode address.

 Maybe convert the fs-wide rename lock into a rw-lock and only require
 read access for a same-directory rename.

   That will not help.

It seems pretty clear to me that an rwlock would avoid this particular
case too.

However, it's not clear to me that either of these approaches yields a
deadlock-free result.  The proof that our current rename locking is
deadlock-free is easy:  Except for rename, everything holds at most
two vnode locks at once, in parent-to-child order, and nothing holds
locks on pairs of incommensurate vnodes.  Rename locks in ancestor-to-
descendant order when the order exists, which includes parent-to-child
order; and is the only operation that can lock sets of incommensurate
nodes, and it does them only within directories that nobody else can
have locked simultaneously, so if there's only one rename operation in
flight the lock order it uses on incommensurate nodes doesn't matter.


Re: CVS commit: src/sys/kern

2012-10-19 Thread David Laight
On Fri, Oct 19, 2012 at 02:07:23AM +, Taylor R Campbell wrote:
 Module Name:  src
 Committed By: riastradh
 Date: Fri Oct 19 02:07:23 UTC 2012
 
 Modified Files:
   src/sys/kern: vfs_syscalls.c
 
 Log Message:
 No, we can't elide the fs-wide rename lock for same-directory rename.
 
 rename(a/b, a/c) and rename(a/c/x, a/b/y) will deadlock.

Surely it just converts rename(a/c/x, a/b/y) into rename(a/c/x, a/c/y)
which isn't quite the intended operation.

Maybe convert the fs-wide rename lock into a rw-lock and only require
read access for a same-directory rename.

That might only be useful in the common case where the two directory
paths are the same actual character - when only one VOP_LOOKUP()
is needed at all and the child inodes don't even need looking at.

I suspect the DNLC (and any other caches) will need both rename ops done
atomically though.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2012-10-19 Thread David Holland
On Fri, Oct 19, 2012 at 07:58:34AM +0100, David Laight wrote:
   Module Name:   src
   Committed By:  riastradh
   Date:  Fri Oct 19 02:07:23 UTC 2012
   
   Modified Files:
  src/sys/kern: vfs_syscalls.c
   
   Log Message:
   No, we can't elide the fs-wide rename lock for same-directory rename.
   
   rename(a/b, a/c) and rename(a/c/x, a/b/y) will deadlock.
  
  Surely it just converts rename(a/c/x, a/b/y) into
  rename(a/c/x, a/c/y) which isn't quite the intended operation.

No, it will (or can) deadlock. The problem is that for the second
(cross-directory) rename, as far as it can tell a/c and a/b are
incommensurate, so it locks in an arbtrary order (which for
reasons of internal convenience is a/b first then a/c) but the
same-dir rename locks a/b first then a/c.

  Maybe convert the fs-wide rename lock into a rw-lock and only require
  read access for a same-directory rename.

That will not help.

  That might only be useful in the common case where the two directory
  paths are the same actual character - when only one VOP_LOOKUP()
  is needed at all and the child inodes don't even need looking at.

...?

  I suspect the DNLC (and any other caches) will need both rename ops done
  atomically though.

That's not the issue.

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/kern

2012-10-19 Thread David Holland
On Sat, Oct 20, 2012 at 04:06:13AM +, David Holland wrote:
Maybe convert the fs-wide rename lock into a rw-lock and only require
read access for a same-directory rename.
  
  That will not help.

Erm. I can read, honest.

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/kern

2012-10-01 Thread Christos Zoulas
In article 20120930114944.5b4cb17...@cvs.netbsd.org,
Michael van Elst source-changes-d@NetBSD.org wrote:
-=-=-=-=-=-

Module Name:   src
Committed By:  mlelstv
Date:  Sun Sep 30 11:49:44 UTC 2012

Modified Files:
   src/sys/kern: tty.c

Log Message:
Provide consistent locking around getc() in ttread(). This is necessary
to prevent crashes in MPSAFE tty drivers like ucom.


Breaks ptyfs pty's: subyte() - ttread() crash.
Just running ssh on a machine crashes it.

christos



Re: CVS commit: src/sys/kern

2012-07-13 Thread Izumi Tsutsui
christos@ wrote:

 Module Name:  src
 Committed By: christos
 Date: Fri Jul 13 16:15:49 UTC 2012
 
 Modified Files:
   src/sys/kern: subr_disk_mbr.c
 
 Log Message:
 revert previous; the problem was off by one in the bios device comparison
 in x86_autoconf.c

Sorry, the off by one is my fault in rev 1.35, but it has been there
before 5.0 and fixing x86_autoconf.c doesn't make root on cd0a work
without this workaround at least on my Optiplex 760.

I think the problem in subr_disk_mbr.c is that it makes
/dev/cd0a unconfigured if any unexpected error occurs during
reading media because kernel recognizes boot device as cd0 properly:
http://mail-index.NetBSD.org/current-users/2012/06/13/msg020407.html

---
Izumi Tsutsui


Re: CVS commit: src/sys/kern

2012-07-13 Thread Christos Zoulas
On Jul 14,  9:20am, tsut...@ceres.dti.ne.jp (Izumi Tsutsui) wrote:
-- Subject: Re: CVS commit: src/sys/kern

| christos@ wrote:
| 
|  Module Name:src
|  Committed By:   christos
|  Date:   Fri Jul 13 16:15:49 UTC 2012
|  
|  Modified Files:
|  src/sys/kern: subr_disk_mbr.c
|  
|  Log Message:
|  revert previous; the problem was off by one in the bios device comparison
|  in x86_autoconf.c
| 
| Sorry, the off by one is my fault in rev 1.35, but it has been there
| before 5.0 and fixing x86_autoconf.c doesn't make root on cd0a work
| without this workaround at least on my Optiplex 760.
| 
| I think the problem in subr_disk_mbr.c is that it makes
| /dev/cd0a unconfigured if any unexpected error occurs during
| reading media because kernel recognizes boot device as cd0 properly:
| http://mail-index.NetBSD.org/current-users/2012/06/13/msg020407.html

Ok, so do we need to put this back for now, or should someone determine
what the unexpected media error is?

christos


Re: CVS commit: src/sys/kern

2012-07-13 Thread Izumi Tsutsui
christos@ wrote:

 On Jul 14,  9:20am, tsut...@ceres.dti.ne.jp (Izumi Tsutsui) wrote:
 -- Subject: Re: CVS commit: src/sys/kern
 
 | christos@ wrote:
 | 
 |  Module Name:  src
 |  Committed By: christos
 |  Date: Fri Jul 13 16:15:49 UTC 2012
 |  
 |  Modified Files:
 |src/sys/kern: subr_disk_mbr.c
 |  
 |  Log Message:
 |  revert previous; the problem was off by one in the bios device comparison
 |  in x86_autoconf.c
 | 
 | Sorry, the off by one is my fault in rev 1.35, but it has been there
 | before 5.0 and fixing x86_autoconf.c doesn't make root on cd0a work
 | without this workaround at least on my Optiplex 760.
 | 
 | I think the problem in subr_disk_mbr.c is that it makes
 | /dev/cd0a unconfigured if any unexpected error occurs during
 | reading media because kernel recognizes boot device as cd0 properly:
 | http://mail-index.NetBSD.org/current-users/2012/06/13/msg020407.html
 
 Ok, so do we need to put this back for now,

I think so.

 or should someone determine
 what the unexpected media error is?

It's just a guess because bootloader also fails to load boot.cfg
in installcd. (I can test if someone provides debug code though)

Anyway I don't think it's a good idea to fake a disklabel
by reading data structures in target media in subr_disk_mbr.c.
It looks inconsistent with cdgetdefaultlabel() in sys/dev/scsipi/cd.c
and I don't see reason why UDF is in RAW_PART (not partition a:)
if we can detect FS type by reading media.

Did we have any discussion about the design?
If not, we should put the workaround (that disables override)
for 6.0.

---
Izumi Tsutsui


Re: CVS commit: src/sys/kern

2012-07-13 Thread Michael L. Hitch

On Sat, 14 Jul 2012, Izumi Tsutsui wrote:


It's just a guess because bootloader also fails to load boot.cfg
in installcd. (I can test if someone provides debug code though)


  The boot.cfg fails because the read() of the file doesn't return any 
data (at least on my machine).  The open() was sucessful, and the fstat() 
returns the proper files size.  I don't know if the read() returns a 0 or 
error yet;  I probably won't be able to continue testing until Monday.


  I had removed the change in subr_disk_mbr.c, so that wasn't needed in my 
case, but who knows about other systems.


Mike

--
Michael L. Hitchmhi...@montana.edu
Computer Consultant
Information Technology Center
Montana State UniversityBozeman, MT USA


Re: CVS commit: src/sys/kern

2012-06-04 Thread Erik Fair
This  the immediate predecessor are definitely a netbsd-5 pull up items.

Erik f...@netbsd.org

On Jun 3, 2012, at 09:23, David Laight wrote:

 Module Name:  src
 Committed By: dsl
 Date: Sun Jun  3 16:23:44 UTC 2012
 
 Modified Files:
   src/sys/kern: vfs_bio.c
 
 Log Message:
 Use separate temporaries for the 'int' percentage and the 'long'
  water marks.
 Previous paniced on sparc64 due to a misaligned copy.
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.238 -r1.239 src/sys/kern/vfs_bio.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.
 



Re: CVS commit: src/sys/kern

2012-06-04 Thread David Laight
On Mon, Jun 04, 2012 at 12:02:37PM -0700, Erik Fair wrote:
 This  the immediate predecessor are definitely a netbsd-5 pull up items.

Just looked at a -5 tree.
I think only the vfs_bio change really matters.
(and the src/sys/dev/bluetooth/bcsp.c one - bool v int, but I suspect
the -5 tree has a few more of those).

The acpi ones won't matter because acpi is LE only.
Unfortunately I don't think a cvs patch will apply ...

 
 On Jun 3, 2012, at 09:23, David Laight wrote:
 
  Module Name:src
  Committed By:   dsl
  Date:   Sun Jun  3 16:23:44 UTC 2012
  
  Modified Files:
  src/sys/kern: vfs_bio.c
  
  Log Message:
  Use separate temporaries for the 'int' percentage and the 'long'
   water marks.
  Previous paniced on sparc64 due to a misaligned copy.
  
  
  To generate a diff of this commit:
  cvs rdiff -u -r1.238 -r1.239 src/sys/kern/vfs_bio.c
  
  Please note that diffs are not public domain; they are subject to the
  copyright notices on the relevant files.
  
 

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2012-05-16 Thread Jukka Ruohonen
On Wed, May 16, 2012 at 09:41:11AM +, Martin Husemann wrote:
 Module Name:  src
 Committed By: martin
 Date: Wed May 16 09:41:11 UTC 2012
 
 Modified Files:
   src/sys/kern: sys_pipe.c
 
 Log Message:
 Make sure we can deliver two file descriptors for pipe2() before we set
 up anything special (like close on exec).
 Fixes PR kern/46457.

Thanks for fixing this. There is at least one other critical, 100 %
reproducable, bug that allows non-root users to crash the system; PR
kern/38889 (see also tests/lib/libc/sys/t_mmap; mmap_block).

- Jukka.




Re: CVS commit: src/sys/kern

2012-05-16 Thread Martin Husemann
On Wed, May 16, 2012 at 04:49:06PM +0300, Jukka Ruohonen wrote:
 Thanks for fixing this. There is at least one other critical, 100 %
 reproducable, bug that allows non-root users to crash the system; PR
 kern/38889 (see also tests/lib/libc/sys/t_mmap; mmap_block).

I'm investigating...
But: I don't see how that could work for non-root users.

Martin


Re: CVS commit: src/sys/kern

2012-02-20 Thread tsugutomo . enami
Christos Zoulas chris...@netbsd.org writes:

 To generate a diff of this commit:
 cvs rdiff -u -r1.341 -r1.342 src/sys/kern/kern_exec.c

+   kmem_free(fa-fae, sizeof(*fa-fae));

Two bugs here. fa-fae isn't allocated if (original) fa-len is zero.
And the size of allocation is `sizeof(*fa-fae) * fa-len' (again,
original fa-len).

enami.


Re: CVS commit: src/sys/kern

2012-02-20 Thread tsugutomo . enami
Christos Zoulas chris...@netbsd.org writes:

 Modified Files:
   src/sys/kern: kern_exec.c

 Log Message:
 fix fae free'ing, from enami.

The fa-len in the posix_spawn_fa_free() might no longer be equal to the
value used when fa-fae is allocated.  So you can't use it as is.

That's why I said 'original fa-len' in the last mail.  

enami.


Re: CVS commit: src/sys/kern

2012-02-15 Thread YAMAMOTO Takashi
hi,

 Module Name:  src
 Committed By: martin
 Date: Wed Feb 15 11:59:30 UTC 2012
 
 Modified Files:
   src/sys/kern: kern_exit.c
 
 Log Message:
 Fix fallout from the new tests exercising all error paths: do not deactivate
 the pmap of a vmspace-less child of a posix spawn operation that never
 made it to userland.

please take a look at uvm_proc_exit().  it borrows proc0.p_vmspace
for exiting process.

while which of these methods is cleaner is probably a matter of taste,
these two cases should use the same method.

YAMAMOTO Takashi

 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.235 -r1.236 src/sys/kern/kern_exit.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


Re: CVS commit: src/sys/kern

2012-02-04 Thread Mindaugas Rasiukevicius
y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) wrote:
 hi,
 
  Module Name:src
  Committed By:   rmind
  Date:   Mon Jan 30 21:05:40 UTC 2012
  
  Modified Files:
  src/sys/kern: subr_kmem.c
  
  Log Message:
  Fix for KMEM_GUARD; do not use it from interrupt context.
 
 kmem_zalloc still seems broken and anyway the test looks too fragile.
 how about simply moving the #ifdef blocks to callers?

It was just a quick workaround to make it run.  I have changed KMGUARD
such that it could be called from interrupt context.  Might need some
tuning as memory consumption is even more significant, but that can be
done later.  KMGUARD already discovered 4 bugs.

 
 YAMAMOTO Takashi

-- 
Mindaugas


Re: CVS commit: src/sys/kern

2012-01-31 Thread David Young
On Tue, Jan 31, 2012 at 07:11:38PM +, Alexander Nasonov wrote:
 Module Name:  src
 Committed By: alnsn
 Date: Tue Jan 31 19:11:38 UTC 2012
 
 Modified Files:
   src/sys/kern: subr_pcq.c
 
 Log Message:
 Replace offsetof(pcq_t, pcq_items[nitems]) with sizeof(pcq_t) + sizeof(void 
 *[nitems]).

Yuck. The offsetof() way was much more readable.  Please put it back
the old way.  If you have to, provide and use a runtime_offsetof() that
DTRT.

Dave

-- 
David Young
dyo...@pobox.comUrbana, IL(217) 721-9981


Re: CVS commit: src/sys/kern

2012-01-31 Thread Alexander Nasonov
David Young wrote:
 Yuck. The offsetof() way was much more readable.

Yes, it's more readable but it's not standard C99.
And it's confusing to use offsetof when you want to use sizeof.

 Please put it back the old way.
 If you have to, provide and use a runtime_offsetof() that
 DTRT.

DTRT TWW ;-)

What about something like this (untested)?

/*
 * Return a size of a structure s with flexible-array member m
 * with n elements.
 */
#define sizeof_fam(s, m, n) (sizeof(s) + sizeof(((s *)NULL)-m[0]) * (n))


and use like this:

sizeof_fam(struct pcq, pcq_items, nitems);

Alex


Re: CVS commit: src/sys/kern

2012-01-31 Thread Joerg Sonnenberger
On Tue, Jan 31, 2012 at 07:32:52PM +, Alexander Nasonov wrote:
 What about something like this (untested)?
 
 /*
  * Return a size of a structure s with flexible-array member m
  * with n elements.
  */
 #define sizeof_fam(s, m, n) (sizeof(s) + sizeof(((s *)NULL)-m[0]) * (n))

That's still not necessarily optimal, depending on the padding rules of
the platform. You want to do offsetof(s, m[0]) + n * sizeof((s*)NULL-m[0]).

Joerg


Re: CVS commit: src/sys/kern

2012-01-31 Thread Alexander Nasonov
Joerg Sonnenberger wrote:
 On Tue, Jan 31, 2012 at 07:32:52PM +, Alexander Nasonov wrote:
  #define sizeof_fam(s, m, n) (sizeof(s) + sizeof(((s *)NULL)-m[0]) * (n))
 
 That's still not necessarily optimal, depending on the padding rules of
 the platform. You want to do offsetof(s, m[0]) + n * sizeof((s*)NULL-m[0]).

I'm aware of this but I was merely following examples from the standard.

More specifically I was following DR 282 (flexible array members 
struct padding) which was incorporated into TC2.

http://std.dkuug.dk/jtc1/sc22/wg14/www/docs/dr_282.htm

I can add a comment about padding to sizeof_fam. If there is no padding,
my and your expression should return same values.

Alex


Re: CVS commit: src/sys/kern

2012-01-31 Thread David Laight
On Tue, Jan 31, 2012 at 07:50:26PM +, Alexander Nasonov wrote:
 Joerg Sonnenberger wrote:
  On Tue, Jan 31, 2012 at 07:32:52PM +, Alexander Nasonov wrote:
   #define sizeof_fam(s, m, n) (sizeof(s) + sizeof(((s *)NULL)-m[0]) * (n))
  
  That's still not necessarily optimal, depending on the padding rules of
  the platform. You want to do offsetof(s, m[0]) + n * sizeof((s*)NULL-m[0]).
 
 I'm aware of this but I was merely following examples from the standard.
 
 More specifically I was following DR 282 (flexible array members 
 struct padding) which was incorporated into TC2.
 
 http://std.dkuug.dk/jtc1/sc22/wg14/www/docs/dr_282.htm
 
 I can add a comment about padding to sizeof_fam. If there is no padding,
 my and your expression should return same values.

And if there is padding, and it is after the last field (and a few
other clauses) your scheme will allocate a longer buffer than needed.

The advantage of Joerg's is that you don't have to 'know' the type of
the member.

Sudden barin explosion - how about (untested):
#define sizeof_var_struct(s, m, c) \
offsetof(s, m[0]) + (c) * (offsetof(s, m[1]) -  offsetof(s, m[0]))

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2012-01-31 Thread Alexander Nasonov
Joerg Sonnenberger wrote:
 That's still not necessarily optimal, depending on the padding rules of
 the platform. You want to do offsetof(s, m[0]) + n * sizeof((s*)NULL-m[0]).

Using m[0] inside offsetof is non-standard but I think this will work:

offsetof(s, m) + n * sizeof((s*)NULL-m[0]).

Alex


Re: CVS commit: src/sys/kern

2012-01-31 Thread David Laight
On Tue, Jan 31, 2012 at 09:51:17PM +, Alexander Nasonov wrote:
 Joerg Sonnenberger wrote:
  That's still not necessarily optimal, depending on the padding rules of
  the platform. You want to do offsetof(s, m[0]) + n * sizeof((s*)NULL-m[0]).
 
 Using m[0] inside offsetof is non-standard but I think this will work:
 
 offsetof(s, m) + n * sizeof((s*)NULL-m[0]).

I don't believe there is a problem using m[0], just m[non-constant-expr].

OTOH 'sizeof (s *)0-m[0]' might be deemed invalid (because it has
an inferred dereference of NULL.

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2012-01-31 Thread Alexander Nasonov
David Laight wrote:
 I don't believe there is a problem using m[0], just m[non-constant-expr].

Practically speaking, m[0] is not a problem but the standard says

... to the structure member (designated by member-designator), ...
 ^^

 OTOH 'sizeof (s *)0-m[0]' might be deemed invalid (because it has
 an inferred dereference of NULL.

Oh yes. I can change it to:

extern void *foo(void);

and then sizeof(((s *)foo())-m[0]) but NULL should work too.

So, if we reached a consensus on this, where should I add the macro?

Alex



Re: CVS commit: src/sys/kern

2012-01-31 Thread Warner Losh

On Jan 31, 2012, at 3:53 PM, Alexander Nasonov wrote:
 David Laight wrote:
 I don't believe there is a problem using m[0], just m[non-constant-expr].
 
 Practically speaking, m[0] is not a problem but the standard says
 
 ... to the structure member (designated by member-designator), ...
 ^^
 
 OTOH 'sizeof (s *)0-m[0]' might be deemed invalid (because it has
 an inferred dereference of NULL.
 
 Oh yes. I can change it to:
 
 extern void *foo(void);
 
 and then sizeof(((s *)foo())-m[0]) but NULL should work too.
 
 So, if we reached a consensus on this, where should I add the macro?

There's no NULL dereference, so it is fine (completely standards conforming).  
The sizeof operator is a compile time construct that evaluates its argument for 
its type only, so no dereference is happening, so any value for the pointer is 
OK.  In the absence of a local variable, 0 is just as valid as anything else.

Warner



Re: CVS commit: src/sys/kern

2012-01-31 Thread Valeriy E. Ushakov
On Tue, Jan 31, 2012 at 22:53:37 +, Alexander Nasonov wrote:

 David Laight wrote:
  I don't believe there is a problem using m[0], just m[non-constant-expr].
 
 Practically speaking, m[0] is not a problem but the standard says
 
 ... to the structure member (designated by member-designator), ...
  ^^

Well, the standard also says that:

   The type and member designator shall be such that given

   static type t;

   then  the  expression (t.member-designator) evaluates to an
   address constant.  (If the specified member is a  bit-field,
   the behavior is undefined.)

Note that the standard doesn't actually bother to define
memeber-designator formally, but see 6.7.8 Initialization.  We can
haggle about the finer points of specific wording used in the relevant
paragraphs, but I think the intended meaning is pretty clear.

-uwe


Re: CVS commit: src/sys/kern

2012-01-30 Thread YAMAMOTO Takashi
hi,

 Module Name:  src
 Committed By: rmind
 Date: Mon Jan 30 21:05:40 UTC 2012
 
 Modified Files:
   src/sys/kern: subr_kmem.c
 
 Log Message:
 Fix for KMEM_GUARD; do not use it from interrupt context.

kmem_zalloc still seems broken and anyway the test looks too fragile.
how about simply moving the #ifdef blocks to callers?

YAMAMOTO Takashi

 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.40 -r1.41 src/sys/kern/subr_kmem.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


Re: CVS commit: src/sys/kern

2012-01-22 Thread David Laight
On Sun, Jan 22, 2012 at 03:48:51AM +, Mindaugas Rasiukevicius wrote:
 Module Name:  src
 Committed By: rmind
 Date: Sun Jan 22 03:48:51 UTC 2012
 
 Modified Files:
   src/sys/kern: kern_fileassoc.c
 
 Log Message:
 fileassoc_file_delete: pre-check whether fileassoc was used and thus avoid
 acquiring kernel-lock, which damages sys_unlink() performance.

Erm... looking at the file the locking in there looks decidedly dubious.

1) There doesn't seem to be any locking on the hash table.
2) It isn't clear why the KERNEL_LOCK was acquired in one specific path.
3) If fileassoc_file_delete() is expected to remove all references for
   a vnode, something external must have forced the state of the vnode.
   (otherwise the stuff might be added - inc. global init - while this
   code is being called.

OTOH I've actually NFI what the code in this file is for!

David

-- 
David Laight: da...@l8s.co.uk


Re: CVS commit: src/sys/kern

2012-01-22 Thread Mindaugas Rasiukevicius
David Laight da...@l8s.co.uk wrote:
 On Sun, Jan 22, 2012 at 03:48:51AM +, Mindaugas Rasiukevicius wrote:
  Module Name:src
  Committed By:   rmind
  Date:   Sun Jan 22 03:48:51 UTC 2012
  
  Modified Files:
  src/sys/kern: kern_fileassoc.c
  
  Log Message:
  fileassoc_file_delete: pre-check whether fileassoc was used and thus
  avoid acquiring kernel-lock, which damages sys_unlink() performance.
 
 Erm... looking at the file the locking in there looks decidedly dubious.
 
 1) There doesn't seem to be any locking on the hash table.
 2) It isn't clear why the KERNEL_LOCK was acquired in one specific path.
 3) If fileassoc_file_delete() is expected to remove all references for
a vnode, something external must have forced the state of the vnode.
(otherwise the stuff might be added - inc. global init - while this
code is being called.

Yes, locking issues are known in this code (see e.g. PR/35351) and I would
say fileassoc(9) should be disabled by default while this is resolved.

However, I do not really have much interest in fixing fileassoc(9), so my
only concern was to fix performance degradation of unlink(2) due to it.

-- 
Mindaugas


Re: CVS commit: src/sys/kern

2011-12-14 Thread David Holland
On Thu, Dec 15, 2011 at 12:05:18AM +, Jared D. McNeill wrote:
  Modified Files:
   src/sys/kern: kern_lwp.c
  
  Log Message:
  In lwp_create, copy l_sigstk from the template. Previously
  sigaltstack wasn't inherited across fork (but SA_ONSTACK handlers
  was), causing the child process to crash when the handler is
  invoked. ok martin@, christos@

oog. Thanks for finding and fixing that...

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/kern

2011-12-06 Thread Christos Zoulas
On Dec 6,  4:32pm, nona...@gmail.com (NONAKA Kimihiro) wrote:
-- Subject: Re: CVS commit: src/sys/kern

|  Thanks for fixing it. Now, I am more interested why the printf fired,
|  and under what conditions I can get it to fire again.
| 
| nonaka@koharu:[-]$ mount
| /dev/wd0a on / type ffs (log, NFS exported, local)
| tmpfs on /tmp type tmpfs (local)
| kernfs on /kern type kernfs (local)
| procfs on /proc type procfs (local)
| ptyfs on /dev/pts type ptyfs (local)
| tmpfs on /usr/pkg/emul/linux/dev/shm type tmpfs (local)
| /home/source/NetBSD/src on /usr/src type null (local)
| /home/source/NetBSD/pkgsrc on /usr/pkgsrc type null (local)
| /home/source/NetBSD/xsrc on /usr/xsrc type null (local)
| 
| # fired (e.g. mail/fetchmail)
| nonaka@koharu:[-]$ cd /usr/pkgsrc/mail/fetchmail
| nonaka@koharu:[-]$ make
| 
| # not fired
| nonaka@koharu:[-]$ cd /home/source/NetBSD/pkgsrc/mail/fetchmail
| nonaka@koharu:[-]$ make

Great, thanks!

christos


Re: CVS commit: src/sys/kern

2011-12-05 Thread NONAKA Kimihiro
Hi,

2011/12/5 Christos Zoulas chris...@zoulas.com:

 | and good catch, I checked to see if the file had changed before
 | posting in current-users, and looked straight through it. :-/

 Thanks for fixing it. Now, I am more interested why the printf fired,
 and under what conditions I can get it to fire again.

nonaka@koharu:[-]$ mount
/dev/wd0a on / type ffs (log, NFS exported, local)
tmpfs on /tmp type tmpfs (local)
kernfs on /kern type kernfs (local)
procfs on /proc type procfs (local)
ptyfs on /dev/pts type ptyfs (local)
tmpfs on /usr/pkg/emul/linux/dev/shm type tmpfs (local)
/home/source/NetBSD/src on /usr/src type null (local)
/home/source/NetBSD/pkgsrc on /usr/pkgsrc type null (local)
/home/source/NetBSD/xsrc on /usr/xsrc type null (local)

# fired (e.g. mail/fetchmail)
nonaka@koharu:[-]$ cd /usr/pkgsrc/mail/fetchmail
nonaka@koharu:[-]$ make

# not fired
nonaka@koharu:[-]$ cd /home/source/NetBSD/pkgsrc/mail/fetchmail
nonaka@koharu:[-]$ make

-- 
NONAKA Kimihiro


Re: CVS commit: src/sys/kern

2011-12-04 Thread Bernd Ernesti
On Thu, Nov 24, 2011 at 02:55:22PM -0500, Christos Zoulas wrote:
 Module Name:  src
 Committed By: christos
 Date: Thu Nov 24 19:55:22 UTC 2011
 
 Modified Files:
   src/sys/kern: kern_exec.c
 
 Log Message:
 fix incomplete statement.
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.331 -r1.332 src/sys/kern/kern_exec.c

Hmm, why did you change two '#ifdef notyet' to '#ifndef notyet'?

See the mail from NONAKA Kimihiro on current-users where he now get
'Cannot get path for pid' messages due too this change.

Bernd



Re: CVS commit: src/sys/kern

2011-12-04 Thread David Holland
On Sun, Dec 04, 2011 at 11:22:54AM +0100, Bernd Ernesti wrote:
   Modified Files:
  src/sys/kern: kern_exec.c
   
   Log Message:
   fix incomplete statement.
  
  Hmm, why did you change two '#ifdef notyet' to '#ifndef notyet'?

By mistake, I assume, probably because he's been running with that
code enabled.

I reverted that part.

and good catch, I checked to see if the file had changed before
posting in current-users, and looked straight through it. :-/

-- 
David A. Holland
dholl...@netbsd.org


Re: CVS commit: src/sys/kern

2011-12-04 Thread Christos Zoulas
In article 20111204102254.ga16...@arresum.veego.de,
Bernd Ernesti  net...@lists.veego.de wrote:
On Thu, Nov 24, 2011 at 02:55:22PM -0500, Christos Zoulas wrote:
 Module Name: src
 Committed By:christos
 Date:Thu Nov 24 19:55:22 UTC 2011
 
 Modified Files:
  src/sys/kern: kern_exec.c
 
 Log Message:
 fix incomplete statement.
 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.331 -r1.332 src/sys/kern/kern_exec.c

Hmm, why did you change two '#ifdef notyet' to '#ifndef notyet'?

See the mail from NONAKA Kimihiro on current-users where he now get
'Cannot get path for pid' messages due too this change.

I did not mean to commit that, and it should be reverted.

christos



Re: CVS commit: src/sys/kern

2011-12-04 Thread Christos Zoulas
On Dec 4,  3:13pm, dholland-sourcechan...@netbsd.org (David Holland) wrote:
-- Subject: Re: CVS commit: src/sys/kern

| On Sun, Dec 04, 2011 at 11:22:54AM +0100, Bernd Ernesti wrote:
|Modified Files:
| src/sys/kern: kern_exec.c
|
|Log Message:
|fix incomplete statement.
|   
|   Hmm, why did you change two '#ifdef notyet' to '#ifndef notyet'?
| 
| By mistake, I assume, probably because he's been running with that
| code enabled.

That is correct ;-)

| I reverted that part.
| 
| and good catch, I checked to see if the file had changed before
| posting in current-users, and looked straight through it. :-/

Thanks for fixing it. Now, I am more interested why the printf fired,
and under what conditions I can get it to fire again.

christos


re: CVS commit: src/sys/kern

2011-11-30 Thread matthew green

 Module Name:  src
 Committed By: mbalmer
 Date: Wed Nov 30 10:27:46 UTC 2011
 
 Modified Files:
   src/sys/kern: kern_ktrace.c
 
 Log Message:
 Only return values when there was no error.

what's the motivation for this?  does it match what *all* syscalls
do?  ie, not set anything even on all error cases?


.mrg.


Re: CVS commit: src/sys/kern

2011-11-21 Thread Joerg Sonnenberger
On Mon, Nov 21, 2011 at 01:44:38PM +, Izumi Tsutsui wrote:
 Module Name:  src
 Committed By: tsutsui
 Date: Mon Nov 21 13:44:38 UTC 2011
 
 Modified Files:
   src/sys/kern: subr_cprng.c
 
 Log Message:
 Include MD machine/cpu_counter.h only if defined(__HAVE_CPU_COUNTER).
 
 XXX: Why not timecounter(9) but deprecated cpu_counter32() and microtime(9)?

microtime(9) is part of timecounter(9). There is no real replacement for
cpu_counter32, which is doing a lot less processing than the
alternatives.

Joerg


Re: CVS commit: src/sys/kern

2011-11-21 Thread Izumi Tsutsui
  Modified Files:
  src/sys/kern: subr_cprng.c
  
  Log Message:
  Include MD machine/cpu_counter.h only if defined(__HAVE_CPU_COUNTER).
  
  XXX: Why not timecounter(9) but deprecated cpu_counter32() and microtime(9)?
 
 microtime(9) is part of timecounter(9).

With extra overhead?

 There is no real replacement for
 cpu_counter32, which is doing a lot less processing than the
 alternatives.

It seems used to get a random number and
I think MI getbintime(9) is simple and enough.

---
Izumi Tsutsui


Re: CVS commit: src/sys/kern

2011-11-20 Thread J. Hannken-Illjes

On Nov 19, 2011, at 5:11 PM, Christos Zoulas wrote:

 Module Name:  src
 Committed By: christos
 Date: Sat Nov 19 16:11:24 UTC 2011
 
 Modified Files:
   src/sys/kern: cnmagic.c
 
 Log Message:
 PR/45633: Christian Biere: Don't access byte after NUL when setting magic.

This cannot work:

if (c == '\0')
return EINVAL;
switch (c) {
case '\0':
...
return 0;   == This statement looks unreachable.

All attempts to set cnmagic result in EINVAL.

--
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: CVS commit: src/sys/kern

2011-11-20 Thread Christos Zoulas
On Nov 20, 10:02am, hann...@eis.cs.tu-bs.de (J. Hannken-Illjes) wrote:
-- Subject: Re: CVS commit: src/sys/kern

| 
| On Nov 19, 2011, at 5:11 PM, Christos Zoulas wrote:
| 
|  Module Name:src
|  Committed By:   christos
|  Date:   Sat Nov 19 16:11:24 UTC 2011
|  
|  Modified Files:
|  src/sys/kern: cnmagic.c
|  
|  Log Message:
|  PR/45633: Christian Biere: Don't access byte after NUL when setting magic.
| 
| This cannot work:
| 
|   if (c == '\0')
|   return EINVAL;
|   switch (c) {
|   case '\0':
|   ...
|   return 0;   == This statement looks unreachable.
| 
| All attempts to set cnmagic result in EINVAL.

The patch was amended. Do you still have an issue?

christos


Re: CVS commit: src/sys/kern

2011-11-20 Thread J. Hannken-Illjes

On Nov 20, 2011, at 6:10 PM, Christos Zoulas wrote:

 On Nov 20, 10:02am, hann...@eis.cs.tu-bs.de (J. Hannken-Illjes) wrote:
 -- Subject: Re: CVS commit: src/sys/kern
 
 | 
 | On Nov 19, 2011, at 5:11 PM, Christos Zoulas wrote:
 | 
 |  Module Name:  src
 |  Committed By: christos
 |  Date: Sat Nov 19 16:11:24 UTC 2011
 |  
 |  Modified Files:
 |src/sys/kern: cnmagic.c
 |  
 |  Log Message:
 |  PR/45633: Christian Biere: Don't access byte after NUL when setting magic.
 | 
 | This cannot work:
 | 
 | if (c == '\0')
 | return EINVAL;
 | switch (c) {
 | case '\0':
 | ...
 | return 0;   == This statement looks unreachable.
 | 
 | All attempts to set cnmagic result in EINVAL.
 
 The patch was amended. Do you still have an issue?

No, Rev. 1.13 works fine -- sorry for the noise ...

--
Juergen Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig (Germany)



Re: CVS commit: src/sys/kern

2011-11-02 Thread Mindaugas Rasiukevicius
Hello Juergen,

Juergen Hannken-Illjes hann...@netbsd.org wrote:
 Log Message:
 The path getnewvnode()-getcleanvnode()-vclean()-VOP_LOCK() will panic
 if the vnode we want to clean is a layered vnode and the caller already
 locked its lower vnode.
 
 Change getnewvnode() to always allocate a fresh vnode and add a helper
 thread (vdrain) to keep the number of allocated vnodes within
 desiredvnodes.
 
 Rename getcleanvnode() to cleanvnode() and let it take a vnode from the
 lists, clean and free it.

 ...
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.11 -r1.12 src/sys/kern/vfs_vnode.c

Not that I object to the change, but such approach was already discussed in
the past on tech-kern@ and rejected.  See the follow ups of this thread:

http://mail-index.netbsd.org/tech-kern/2009/08/17/msg005861.html

This significantly changes the dynamics of vnode cache and reclamation,
apart from making it single-threaded.  Given the current locking scheme
with its issues (and thus reclamation complexity), the change is probably
good enough for the medium term.  However, in the long term, I would say
that getnewvnode() behaviour should be restored.

-- 
Mindaugas


Re: CVS commit: src/sys/kern

2011-10-24 Thread Jean-Yves Migeon

On 24.10.2011 03:53, YAMAMOTO Takashi wrote:

Log Message:
Turn a workqueue(9) name into an array in the struct workqueue, rather
than a const char *. This avoids keeping a reference to a string
owned by caller (string could be allocated on stack).


what needs it?


Me, in a not-so-distant future. The purpose of this change is to allow 
passing strings that could be constructed that way:


char name[16];
...
snprintf(name, sizeof(name), xbdback%d.%d, domid, instance);
workqueue_create(foo, name, funcfoo, NULL, PRI_NONE, SPL_BIO, 0);

I'd like to construct workqueues based on a context; the old code does 
not permit that, unless you build and store the string somewhere, 
forcing the caller to *know* that it only keeps a pointer and does not 
copy the content. This will get misused, sooner or later.


--
Jean-Yves Migeon
jeanyves.mig...@free.fr


Re: CVS commit: src/sys/kern

2011-10-23 Thread YAMAMOTO Takashi
hi,

 Module Name:  src
 Committed By: jym
 Date: Sun Oct 23 21:41:23 UTC 2011
 
 Modified Files:
   src/sys/kern: subr_workqueue.c
 
 Log Message:
 Turn a workqueue(9) name into an array in the struct workqueue, rather
 than a const char *. This avoids keeping a reference to a string
 owned by caller (string could be allocated on stack).

what needs it?

YAMAMOTO Takashi

 
 
 To generate a diff of this commit:
 cvs rdiff -u -r1.31 -r1.32 src/sys/kern/subr_workqueue.c
 
 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.


Re: CVS commit: src/sys/kern

2011-10-05 Thread Alan Barrett

On Thu, 06 Oct 2011, matthew green wrote:

src/sys/kern: kern_synch.c

Log Message:
Print WARNING: negative runtime; monotonic clock has gone backwards\n
using log(LOG_WARNING, ...), not just printf(...).

From PR 45421 by Greg Woods.


i object to this change.

this is a serious error and indicates a real bug.  i think having
it hidden away in just the post-boot dmesg is a poor idea.


I was confused about how log(9) worked.  I have reverted the change.

Perhaps we could allow TOCONS to be passed as a flag to log(9).

--apb (Alan Barrett)


<    1   2   3   4   5   6   7   >