Re: Making dhcpcd work on diskless clients
On Sun, 08 Feb 2015, Roy Marples wrote: since this problem goes away if we make all of dhcpcd in-memory first, possibly what happens here is that with i386 or amd64, the layout is such that we don't ever try to fault in code during the small period of time that the route is missing. I don't fully understand what you are saying. Some part of the code that you are trying to run may not be in memory, so you may encounter a page fault when you try to run it. Responding to the page fault involves reading information from the file system. On a diskless client, reading from the file system actually involves transferring data over the network from the remote file server. If you try to read from the file server at a time when the routing table does not contain a usable route to the file server, then you lose. But do you have an idea of how this can be fixed then without dhcpcd having to learn the routing table at load time? Do you currently use RTM_DELETE and RTM_ADD? Can you use RTM_CHANGE instead? --apb (Alan Barrett)
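One way to read "make all of dhcpcd in-memory first" is to wire the process's pages with mlockall(2), so that no code has to be paged in over the network while the route is missing. The following is only a sketch of that idea; the helper name is hypothetical and nothing here is taken from dhcpcd itself.

```c
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, not taken from dhcpcd: ask the kernel to keep all
 * current and future pages of this process resident, so that running code
 * does not need to be paged in from the (network) file system later.
 * Returns 0 on success, -1 on failure (for example, insufficient
 * privilege or a low RLIMIT_MEMLOCK).
 */
int
lock_process_memory(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) == -1) {
		perror("mlockall");
		return -1;
	}
	return 0;
}
```

Whether this is preferable to avoiding the window entirely (for example with RTM_CHANGE) is a separate question.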
Re: Reuse strtonum(3) and reallocarray(3) from OpenBSD
On Sat, 29 Nov 2014, Kamil Rytarowski wrote: My proposition is to add a new header in src/sys/sys/overflow.h (/usr/include/sys/overflow.h) with the following content: operator_XaddY_overflow() operator_XsubY_overflow() operator_XmulY_overflow() X = optional s (signed) Y = optional l,ll, etc [* see comment] OK, so you have told us the names of the proposed functions. But what are their semantics, and why would they be useful? Last but not least please stop enforcing programmers' fancy to produce this kind of art: https://github.com/ivmai/bdwgc/commit/83231d0ab5ed60015797c3d1ad9056295ac3b2bb :-) Please don't assume that people reading your email messages have convenient internet access. It's fine to give URLs that expand on what you have said, but if you give the URL without any explanation then I have no idea what you are talking about. --apb (Alan Barrett)
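For readers wondering what semantics such functions might have: one plausible reading is "perform the operation, and report whether the mathematically correct result fits in the type". The sketch below shows that reading for an unsigned multiply. The name and signature are illustrative assumptions and are not the interface proposed in the quoted message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical example of an overflow-checked multiply; the name and
 * signature are illustrative only and are not the proposed interface.
 * Returns true if a * b does not fit in size_t; otherwise stores the
 * product in *res and returns false.
 */
bool
ex_umul_overflow(size_t a, size_t b, size_t *res)
{
	if (a != 0 && b > SIZE_MAX / a)
		return true;
	*res = a * b;
	return false;
}
```

A check like this is what an allocator wrapper in the style of reallocarray(3) needs before computing nmemb * size.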
Re: posix_madvise(2) should fail with ENOMEM for invalid adresses range
On Sun, 23 Nov 2014, Nicolas Joly wrote: According the OpenGroup online document for posix_madvise[1], it should fail with ENOMEM for invalid addresses ranges : [...] But we currently fail with EINVAL (returned value from range_check() function). In general, when POSIX doesn't make sense, NetBSD should not need to follow what POSIX says. In this case, it probably doesn't matter much. --apb (Alan Barrett)
Re: CTLTYPE_UINT?
On Sat, 04 Oct 2014, Justin Cormack wrote: I agree about being explicit with the 32 bitness, but using S64 and U64 as the 64 bit names to be consistent with FreeBSD might make sense. The S64 and U64 names are fine. I'd also add S32 and U32. long types seems best avoided if possible, you can see the temptation to use them for memory amounts, but you could be running on 32 bit userspace on a 64 bit kernel. One of the reasons that I like user/kernel interfaces to use types with explicit bit width is to simplify running 32-bit userland on 64-bit kernels. If you use a type whose actual size changes between 32 bits and 64 bits, then the kernel has to have a compatibility layer to copy and adjust the data, and tools like kdump or ktruss should also translate (it's a bug that they don't do so today). If you use a type that's always 64 bits, then it's much easier to deal with. Occasionally, an argument for run-time efficiency in a 32-bit userland will outweigh this argument for ease of coding, and then a type whose size changes should be used. --apb (Alan Barrett)
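The point about explicit widths can be illustrated with two hypothetical interface structures (neither exists in NetBSD). The one using long changes size between a 32-bit and a 64-bit userland, so the kernel would need a translation layer; the one using uint64_t does not.

```c
#include <stdint.h>

/*
 * Hypothetical user/kernel interface structures, for illustration only.
 * The first changes size between ILP32 and LP64 because of 'long';
 * the second has the same size on both.
 */
struct ex_meminfo_long {
	long total;	/* 4 bytes on ILP32, 8 bytes on LP64 */
	long avail;
};

struct ex_meminfo_fixed {
	uint64_t total;	/* always 8 bytes */
	uint64_t avail;
};

/* Compile-time check that the fixed-width struct has the expected size. */
_Static_assert(sizeof(struct ex_meminfo_fixed) == 16,
    "fixed-width struct must be 16 bytes on every ABI");
```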
Re: CTLTYPE_UINT?
On Fri, 03 Oct 2014, Justin Cormack wrote: Back in the sysctl discussion a while back, core group said: http://mail-index.netbsd.org/tech-kern/2014/03/26/msg016779.html a) What types are needed? Currently, CTLTYPE_INT is a signed 32-bit type, and CTLTYPE_QUAD is an unsigned 64-bit type. Perhaps all four possible combinations of signed/unsigned and 32 bits/64 bits should be supported. If you add new sysctl types, please use names that describe the size and signedness. For example, rename CTLTYPE_INT to CTLTYPE_INT32, keep CTLTYPE_INT as a backward compatible alias for CTLTYPE_INT32, and add CTLTYPE_UINT32. Similarly, rename CTLTYPE_QUAD to CTLTYPE_UINT64, keep CTLTYPE_QUAD as an alias, and add CTLTYPE_INT64. Please don't add a CTLTYPE_UINT with no indication of its size. A survey of what other OSes do would also be useful. --apb (Alan Barrett)
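The renaming scheme suggested above might look roughly like this in a header. This is a sketch only: the EX_ prefix is added to avoid clashing with the real header, and the numeric values are placeholders rather than a statement about <sys/sysctl.h>.

```c
/*
 * Illustrative only: the EX_ prefix and the numeric values are
 * placeholders for this sketch, not the definitions in <sys/sysctl.h>.
 */
#define EX_CTLTYPE_INT32	2	/* signed 32-bit */
#define EX_CTLTYPE_UINT64	4	/* unsigned 64-bit */
#define EX_CTLTYPE_UINT32	7	/* new: unsigned 32-bit */
#define EX_CTLTYPE_INT64	8	/* new: signed 64-bit */

/* Backward-compatible aliases, so that existing source keeps compiling. */
#define EX_CTLTYPE_INT		EX_CTLTYPE_INT32
#define EX_CTLTYPE_QUAD		EX_CTLTYPE_UINT64
```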
Re: Unification of common date/time macros
On Mon, 22 Sep 2014, Robert Elz wrote:
| My proposition is [...]
|
| #define SECSPERMIN 60L
| #define MINSPERHOUR 60L
| #define HOURSPERDAY 24L
| #define DAYSPERWEEK 7L
| #define DAYSPERNYEAR 365L
| #define DAYSPERLYEAR 366L
| #define SECSPERHOUR (SECSPERMIN * MINSPERHOUR)
| #define SECSPERDAY (SECSPERHOUR * HOURSPERDAY)
| #define MONSPERYEAR 12L
| #define EPOCH_YEAR 1970L

Why are they all to be long? The only one that has even the slightest potential for that need (and which is currently defined as long for the userspace definitions) is SECSPERDAY, and that's only to cope with the possibility that int is 16 bits (which I don't think NetBSD supports at all, since there is no pdp11 port - but is kept that way for API consistency.)

kre's analysis is correct. I'd just define them all as plain numbers, without any L, U, or UL suffix. I'd probably also use 3600 and 86400 for SECSPERHOUR and SECSPERDAY, to avoid surprises in the arithmetic. For an example of an unwanted surprise, consider (SECSPERHOUR * HOURSPERDAY) or ((60 * 60) * 24) on a machine with 16-bit ints: the desired result of 86400 is too large to represent in 16 bits, which causes undefined behaviour. NetBSD doesn't support any machines with 16-bit int, but this is the sort of code where it's easy to accommodate such machines, so we might as well do it. --apb (Alan Barrett)
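A sketch of the suffix-free definitions suggested above. It relies on the C rule that an unsuffixed decimal constant takes the first of int, long, long long that can represent it, so the literal 86400 would automatically be a long where int is 16 bits, whereas the product (60 * 60) * 24 would be evaluated in int.

```c
/* Plain constants, with no L, U, or UL suffix. */
#define SECSPERMIN	60
#define MINSPERHOUR	60
#define HOURSPERDAY	24
#define SECSPERHOUR	3600	/* written out, not SECSPERMIN * MINSPERHOUR */
#define SECSPERDAY	86400	/* written out, not SECSPERHOUR * HOURSPERDAY */

/*
 * 32767 is the largest value a 16-bit int can hold.  SECSPERHOUR fits,
 * SECSPERDAY does not, which is why the literal form matters for it.
 */
_Static_assert(SECSPERHOUR <= 32767, "fits in a 16-bit int");
_Static_assert(SECSPERDAY > 32767, "does not fit in a 16-bit int");
```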
Re: build kernel from source
On Sun, 21 Sep 2014, bycn82 wrote: *Command**:* cd /usr/src; ./build.sh -O /usr/obj -U -j 8 tools kernel=NB6 modules distribution sets I see you didn't pass -T ${TOOLDIR} option to build.sh *Result:* configure: creating ./config.status config.status: creating host-mkdep chmod +x host-mkdep # install /tooldir.NetBSD-6.1.4-amd64/bin/nbhost-mkdep mkdir -p /tooldir.NetBSD-6.1.4-amd64/bin /usr/src/tools/binstall/xinstall -c -r -m 555 host-mkdep /tooldir.NetBSD-6.1.4-amd64/bin/nbhost-mkdep make: exec(/usr/src/tools/binstall/xinstall) failed (No such file or directory) *** Error code 1 The install line suggests that some part of the build thinks that the TOOLDIR is /tooldir.NetBSD-6.1.4-amd64, which makes no sense, and the invocation of /usr/src/tools/binstall/xinstall suggests that it thinks you are not using an OBJDIR, which also makes no sense. Were there any error or warning messages printed by build.sh before it got to === Updated makewrapper: ? What TOOLDIR path did build.sh print? _*Try to build the npfctl command*_ If you can't build tools, nothing else is going to work, but ... # cd /usr/src/usr.sbin/npf/npfctl/ # pwd /usr/src/usr.sbin/npf/npfctl # make # lex npfctl/npf_scan.c /usr/src/tooldir.NetBSD-6.1.4-amd64/bin/nblex-onpf_scan.c npf_scan.l make: exec(/usr/src/tooldir.NetBSD-6.1.4-amd64/bin/nblex) failed (No such file or directory) *** Error code 1 ... now it seems to think that your TOOLDIR is /usr/src/tooldir.NetBSD-6.1.4-amd64, not the /tooldir.NetBSD-6.1.4-amd64 that appeared earlier. How to change the NetBSD6.1.4? I am using the current development version of source! The host platform name is embedded in the default TOOLDIR name, so it's fine for it to say NetBSD-6.1.4. --apb (Alan Barrett)
Re: build kernel from source
On Mon, 22 Sep 2014, bycn82 wrote: *Now I am building the npfctl, and I met below** * -Wsign-compare -Wformat=2 -Werror -I/usr/src/usr.sbin/npf/npfctl --sysroot=/ -c /usr/src/usr.sbin/npf/npfctl/npf_bpf_comp.c In file included from /usr/src/usr.sbin/npf/npfctl/npf_bpf_comp.c:56:0: /usr/src/usr.sbin/npf/npfctl/npf_bpf_comp.c: In function 'npfctl_bpf_table': /usr/src/usr.sbin/npf/npfctl/npf_bpf_comp.c:610:21: error: 'BPF_COP' undeclared (first use in this function) BPF_STMT(BPF_MISC+BPF_COP, NPF_COP_TABLE), ^ /usr/src/usr.sbin/npf/npfctl/npf_bpf_comp.c:610:21: note: each undeclared identifier is reported only once for each function it appears in *** Error code 1 Stop. make: stopped in /usr/src/usr.sbin/npf/npfctl I suspect that you have a corrupted source tree. Please take this to current-users, not tech-kern. *Can someone told me which header has the declaration of the BPF_COP ? ** **I found below only.* # grep -R BPF_COP /usr/src /usr/src/doc/CHANGES.prev:kernel: Add BPF coprocessor support (BPF_COP/BPF_COPX instructions). It's in src/sys/net/bpf.h. If grep didn't find it then you have an incomplete source tree. Please fix that, and then if you still have build problems, ask in current-users. --apb (Alan Barrett)
Re: Testing 7.0 Beta: FFS still very slow when creating files
On Tue, 26 Aug 2014, Robert Elz wrote: | memcmp is only supposed to provide the correct sign, not | the difference. | true, but that's not what memcmp(9) says. This is a normal problem with man pages - they're written to document what the code actually does, then interpreted as a specification of what the code is required to do. Man pages should be the former, the latter is the job of standards docs. Often, there are no standards docs, and the man page has to serve as both a specification of the parts of the interface that users can depend on, and documentation of what the code actually does. For example, it's possible to document returns -ve, 0, or +ve in one part of the man page, as an interface specification, and returns the difference in another part of the man page, as an implementation note. If anything needs changing, it would be to make it more clear that the man pages should not be interpreted as an interface specification, but as a statement of what the implementations actually do - not to be interpreted as a promise that they will always do that - for what can be relied upon a reference should be made to the relevant standard (which can be POSIX (or IEEE for C, or anyone else), or POSIX (etc) as amended by NetBSD, or a NetBSD private standard for stuff that either isn't documented by anyone else's standards doc, or where NetBSD's version has simply decided to be different. In cases where there really is a standard that can be referred to, that might work, but I like to have all the information in one place. If it's easy for the NetBSD man page to say both what's promised, and what is actually done, then I would like it to do so. I think that this helps both people using the interface and people changing the implementation. --apb (Alan Barrett)
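Code that wants to stay within the promised part of the interface can reduce the result to its sign before using it. A small userland sketch using memcmp(3); the helper name is made up for this example.

```c
#include <string.h>

/*
 * Reduce a memcmp(3) result to -1, 0, or +1, so that callers depend only
 * on the sign, which is what the C standard promises, and not on the
 * magnitude, which is an implementation detail.
 */
int
ex_memcmp_sign(const void *a, const void *b, size_t len)
{
	int r = memcmp(a, b, len);

	return (r > 0) - (r < 0);
}
```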
Re: fdiscard error cases
On Sun, 27 Jul 2014, David Holland wrote: It was pointed out that it would be well to distinguish devices that don't currently support discard, but theoretically should (because they're disks) from devices where it makes no sense (e.g. ttys). This is probably a good idea.

For fdiscard, I think the following errno values are likely to be relevant:

ENOSYS: The operation is not supported at all. e.g. a kernel module has not been loaded, or a build option was not enabled.
ENOTTY: It doesn't make sense to ask this driver layer to perform this operation. e.g. disk operation on a file or non-disk device.
ENODEV: It does make sense to ask, but this device (or this driver layer) doesn't support this operation. e.g. this device or file system doesn't implement discard. This doesn't distinguish between not supported by driver and not supported by underlying hardware.
EINVAL: The arguments don't make sense. e.g. null pointer, or an invalid combination of flags, or a length or offset out of bounds.
EPERM: You don't have permission, but perhaps a process with different credentials might succeed.
EACCES: You don't have permission, but the problem is not in your credentials, the problem is in the way the object was opened. e.g. write on file opened with O_RDONLY.

Another option is to add a new errno for Operation not implemented on this object or the like, to be a bit clearer about the distinction between not appropriate and not implemented and maintain the distinction between not implemented at the syscall level and not implemented on a particular backend entity. But, adding errnos is not something to do lightly...

I think the ENOTTY/ENODEV distinction is enough. Many existing drivers or subsystems use ENODEV where I think ENOTTY or EINVAL or some other error would be more appropriate, but if you are careful to make the distinction for your new syscalls then you can document it. --apb (Alan Barrett)
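The distinctions above can be sketched as a small decision function. Everything here is hypothetical: the enum, the function name, and the order of the checks are assumptions made for illustration, not part of any NetBSD interface.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical classification of the object behind a file descriptor. */
enum ex_objkind {
	EX_OBJ_TTY,		/* discard makes no sense here */
	EX_OBJ_DISK_NODISCARD,	/* a disk, but discard is not implemented */
	EX_OBJ_DISK_DISCARD	/* a disk that implements discard */
};

/*
 * Illustrative errno selection for a discard-like operation, following
 * the ENOTTY/ENODEV/EINVAL/EACCES distinctions described above.
 * Returns 0 or an errno value.  None of these names exist in NetBSD.
 */
int
ex_discard_check(enum ex_objkind kind, bool writable, long long off,
    long long len)
{
	if (off < 0 || len <= 0)
		return EINVAL;	/* the arguments don't make sense */
	if (kind == EX_OBJ_TTY)
		return ENOTTY;	/* it makes no sense to ask */
	if (kind == EX_OBJ_DISK_NODISCARD)
		return ENODEV;	/* sensible request, but unsupported */
	if (!writable)
		return EACCES;	/* problem is in how the object was opened */
	return 0;
}
```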
More detailed build information in kernels
I have some private patches that append arbitrary additional information to the kernel version string. Essentially, I pass BUILDINFO="multi-line message here" in the environment (through build.sh and the make wrapper), and then a modified version of src/sys/conf/newvers.sh appends it to the value of the version variable in the vers.c file that's compiled into the kernel. I also add the information to /etc/release. The additional information is exposed in sysctl kern.version, and in {struct utsname}.version as returned by uname(3), and in the output from uname(1) -v. I use this feature to add information about the source tree and build date, so I see information like this: $ sysctl kern.version kern.version = NetBSD 6.99.47 (APB) fossil repository: apb-local-src.fossil fossil tags: local fossil commit: 449e51b700 (2014-07-19 15:41:14 UTC) fossil comment: merge src from trunk as of 2014-07-19 00:00 UTC I imagine that it would be useful for official builds to include some sort of official statement here. The multi-line BUILDINFO strings are truncated and folded to a single line by uname(3), which is unhelpful, so I am inclined to store them in a new kernel variable, exposed via a new sysctl node, instead of appending to the existing kernel version variable. Then the new information would not be exposed by uname(3) or uname(1). Comments? --apb (Alan Barrett)
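For reference, this is how a program sees the version string through uname(3), which is the path on which the folding to a single line happens. A minimal sketch; the function name is made up for this example.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Print the version string reported by uname(3); this is the same field
 * that "uname -v" shows.  Returns 0 on success, -1 on failure.
 */
int
ex_print_kernel_version(void)
{
	struct utsname u;

	if (uname(&u) == -1) {
		perror("uname");
		return -1;
	}
	printf("%s\n", u.version);
	return 0;
}
```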
Re: Vnode API change: add global vnode cache
On Tue, 08 Apr 2014, Mindaugas Rasiukevicius wrote: Nothing [in NetBSD] really uses intern. Perhaps not a great naming, but other subsystems usually just use get. Yes, that's a good argument for just using get. --apb (Alan Barrett)
Re: Vnode API change: add global vnode cache
On Mon, 07 Apr 2014, Mindaugas Rasiukevicius wrote: Taylor R Campbell campbell+netbsd-tech-k...@mumble.net wrote: What is intern? `Intern' means `lookup, or create and insert if not there'. The point being is that I do not find it meaningful/intuitive. Many other systems just use get(). If you want more accurate name, I suggest conget() or something more meaningful. I would find conget confusing, while finding intern clear. Essentially, to intern a string or an external representation of an object, means to create an internal representation of the string or object, or to find an already existing internal representation of an identical object, and (usually) to return a reference to that internal representation. The wikipedia article at https://en.wikipedia.org/wiki/String_interning is focused on strings, but other objects can also be interned. --apb (Alan Barrett)
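A minimal sketch of interning for strings, to make the "lookup, or create and insert if not there" behaviour concrete. It uses a linear list for brevity, is not thread-safe, and has nothing to do with the vnode cache code under discussion; all names are made up.

```c
#include <stdlib.h>
#include <string.h>

/* One interned string; entries are kept on a simple linked list. */
struct ex_interned {
	struct ex_interned *next;
	char *str;
};

static struct ex_interned *ex_intern_list;

/*
 * Look up 's'; if an identical string is already interned, return the
 * existing copy, otherwise create one, insert it, and return it.
 * Returns NULL if memory allocation fails.
 */
const char *
ex_intern(const char *s)
{
	struct ex_interned *e;
	size_t len;

	for (e = ex_intern_list; e != NULL; e = e->next)
		if (strcmp(e->str, s) == 0)
			return e->str;

	e = malloc(sizeof(*e));
	if (e == NULL)
		return NULL;
	len = strlen(s) + 1;
	e->str = malloc(len);
	if (e->str == NULL) {
		free(e);
		return NULL;
	}
	memcpy(e->str, s, len);
	e->next = ex_intern_list;
	ex_intern_list = e;
	return e->str;
}
```

Because identical strings share one internal copy, two interned strings can be compared by pointer instead of with strcmp(3).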
Core statement on sysctl 32-bit/64-bit changes
The NetBSD core group has considered the sysctl changes made by David Laight on 23 and 27 Feb 2014 (see http://mail-index.netbsd.org/source-changes/2014/02/23/msg051946.html, http://mail-index.netbsd.org/source-changes/2014/02/27/msg052131.html, and http://mail-index.netbsd.org/source-changes/2014/03/01/msg052200.html), and objections raised by Andreas Gustafsson (see http://mail-index.netbsd.org/source-changes-d/2014/03/05/msg006587.html, http://mail-index.netbsd.org/tech-kern/2014/03/05/msg016706.html, and subsequent discussion in the source-changes-d and tech-kern lists). We note that the following sysctl nodes (for the i386 and amd64 ports) have been changed from CTLTYPE_INT (a 32-bit signed integer) to CTLTYPE_QUAD (a 64-bit unsigned integer): machdep.fpu_present machdep.osfxsr machdep.sse machdep.sse2 machdep.biosbasemem machdep.biosextmem Both binary and source code compatibility are important for NetBSD. If the types of sysctl variables are changed, then we would want it to be done in such a way that old code continues to work. We note that there has been an attempt to provide compatibility, by allowing 32-bit sysctl variables to be read into 64-bit buffers, and vice versa, but we are concerned that this mechanism was introduced without prior discussion, and there may be cases where compatibility is lost. Several of the affected sysctl variables appear to be essentially boolean in nature, and there appears to be no good reason for the variables to be exposed to userland as 64-bit values. It's not clear why variables that are logically boolean (such as machdep.fpu_present) were ever defined as CTLTYPE_INT instead of CTLTYPE_BOOL, but it does seem clear that changing them to CTLTYPE_QUAD is not an improvement. Two of the variables, machdep.biosbasemem and machdep.biosextmem, represent memory sizes, and it is possible that 32 bits might not be large enough for them. 
For these variables, changing the size to 64 bits, or adding new 64-bit variables in parallel with the old variables, may be appropriate. In the past, new 64-bit sysctl variables hw.physmem64 and hw.usermem64 were introduced in parallel to the older 32-bit variables hw.physmem and hw.usermem. We have heard suggestions that the same should be done for machdep.biosbasemem and machdep.biosextmem, if 32 bits is not sufficient for those variables. We have also heard suggestions that increasing the size of the variables would be preferable to adding new variables with different names, provided that it was done in a compatible way.

The core group now recommends as follows:

1. In the short term, the affected sysctl variables should be changed back to their original type, and the compatibility code should be removed.

2. sysctl variables should not be wider than necessary. If there is no need for the existing variables to be made wider than 32 bits, then they should not be made wider than 32 bits.

3. If some existing variables need to be made wider, then consideration should be given to either:
   a) adding new 64-bit sysctl variables in parallel to the existing 32-bit variables (such as adding a new machdep.biosextmem64 in parallel to the existing machdep.biosextmem); or
   b) adding new infrastructure for 32-bit/64-bit compatibility, and using that infrastructure.

4. If new infrastructure is considered, to allow reading 64-bit sysctl variables into 32-bit buffers, then the design and implementation should be discussed in public. Some considerations that we would like to see addressed are:
   a) What types are needed? Currently, CTLTYPE_INT is a signed 32-bit type, and CTLTYPE_QUAD is an unsigned 64-bit type. Perhaps all four possible combinations of signed/unsigned and 32 bits/64 bits should be supported.
   b) Should the ability to read values with a different size apply to all sysctl variables, or only to those defined in a special way? For example, there could be a new CTLFLAG_COMPAT32 flag that allows reading a 64-bit value into a 32-bit buffer.
   c) What is the appropriate error return when a 64-bit value is too large to fit in a user-provided 32-bit buffer?
   d) Will old code still work without change?
   e) Will new userland code be able to detect the presence of wider sysctl variables with 32-bit compatibility?
   f) Is coordination with other projects using the sysctl(3) or sysctl(9) interface needed?
   g) Are the new interfaces adequately documented?

-- Alan Barrett, on behalf of the NetBSD core group
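To make consideration (c) concrete, here is a sketch of the narrowing step that any such compatibility layer would need. The helper name is hypothetical, and ERANGE is used purely as a placeholder; which error should be returned is exactly the open question raised in the statement.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: narrow an unsigned 64-bit value to a signed
 * 32-bit destination.  ERANGE is only a placeholder; the statement above
 * leaves the choice of error as an open question.
 * Returns 0 on success, or the placeholder error if the value does not
 * fit, in which case *out is left untouched.
 */
int
ex_narrow_u64_to_i32(uint64_t value, int32_t *out)
{
	if (value > (uint64_t)INT32_MAX)
		return ERANGE;
	*out = (int32_t)value;
	return 0;
}
```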
Re: DIOCGDISKINFO support for vnd
On Tue, 11 Mar 2014, Patrick Welche wrote: The attached trivial patch allows vnd(4) to support generic disk ioctls. The only one in kern/subr_disk.c at the moment is DIOCGDISKINFO.

Before:
$ ./vndtest /dev/vnd0a
vndtest: DIOCGDISKINFO: Inappropriate ioctl for device

After:
$ ./vndtest /dev/vnd0a
size of /dev/vnd0a: 524288 bytes

That's good, but ...

default:
- return ENOTTY;
+ error = disk_ioctl(&vnd->sc_dkdev, cmd, data, flag, l);
+ if (error == EPASSTHROUGH)
+ return ENOTTY;
+ else
+ return error;

I think there's no need to translate EPASSTHROUGH to ENOTTY here; that translation will be done by sys_ioctl() before returning to userland. Also, several other disk drivers have their ioctl handlers call disk_ioctl early (see fdioctl, wdioctl, sdioctl, dkioctl, raidioctl, among others), and it's not clear why vndioctl doesn't do that. --apb (Alan Barrett)
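The layering described above can be modelled in user space as follows. This is a toy sketch: the EX_ names, the command numbers, and the numeric pass-through value are placeholders, and the functions only mimic the roles of disk_ioctl(), a driver ioctl handler, and sys_ioctl().

```c
#include <errno.h>

/*
 * Stand-in for the kernel-internal EPASSTHROUGH value; the name and the
 * number are placeholders for this user-space sketch.
 */
#define EX_EPASSTHROUGH	(-4)

#define EX_CMD_GENERIC	1	/* handled by the generic disk layer */
#define EX_CMD_DRIVER	2	/* handled by the driver itself */

/* Generic layer: handle what it knows, pass everything else through. */
static int
ex_generic_ioctl(int cmd)
{
	return (cmd == EX_CMD_GENERIC) ? 0 : EX_EPASSTHROUGH;
}

/* Driver handler: try the generic layer first, then its own commands. */
static int
ex_driver_ioctl(int cmd)
{
	int error = ex_generic_ioctl(cmd);

	if (error != EX_EPASSTHROUGH)
		return error;
	if (cmd == EX_CMD_DRIVER)
		return 0;
	return EX_EPASSTHROUGH;	/* no translation at this level */
}

/* Top level: the only place where pass-through becomes ENOTTY. */
int
ex_sys_ioctl(int cmd)
{
	int error = ex_driver_ioctl(cmd);

	return (error == EX_EPASSTHROUGH) ? ENOTTY : error;
}
```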
Re: Closing a serial device takes one second
On Fri, 07 Feb 2014, Marc Balmer wrote: Am 06.02.14 17:18, schrieb Marc Balmer: fd = open("/dev/dty03", O_RDWR); /* returns immediately */ close(fd); /* returns after one second */ So it is clear now that the delay is there for a specific application case: A modem on a tty line that hangs up when DTR is gone for one second. For probably all other use cases the delay is not necessary. Yes, for many use cases, the delay is not necessary. If you have one of those use cases then you can clear the termios(4) HUPCL flag. For use cases where the delay is desired, it would be better if the delay was neither unconditionally inserted at close time, nor unconditionally inserted at open time, but rather if the delay was inserted only when necessary, such as when a close (with HUPCL set) is followed very soon by an open. In conclusion, the delay probably comes from times long gone when people used dial-in modems. For modern serial applications they are more a nuisance. Is it time to switch the default to no delay? I think this idea should be rephrased in terms of changing the default termios(4) flags. --apb (Alan Barrett)
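Clearing HUPCL from a program looks roughly like this, using the standard tcgetattr(3) and tcsetattr(3) calls. The function name is made up for this sketch; whether clearing the flag also removes the one-second delay on a given driver is as described in the message above, not something this code verifies.

```c
#include <termios.h>

/*
 * Clear HUPCL on an open tty so that the last close does not drop the
 * modem control lines.  Returns 0 on success, -1 on failure with errno
 * set (for example ENOTTY if fd is not a terminal).
 */
int
ex_clear_hupcl(int fd)
{
	struct termios t;

	if (tcgetattr(fd, &t) == -1)
		return -1;
	t.c_cflag &= ~(tcflag_t)HUPCL;
	return tcsetattr(fd, TCSANOW, &t);
}
```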
Re: amd64 kernel, i386 userland
On Fri, 24 Jan 2014, Alan Barrett wrote: I have successfully used magic symlinks (see symlink(7)) to allow i386 and amd64 to use different instances of /dev. The basic scheme is: Build a kernel with options MAGICLINKS, or arrange to run sysctl -w vfs.generic.magiclinks=1 very early in /etc/rc. Putting the setting in /etc/sysctl.conf will probably be too late. mkdir /dev.i386 mkdir /dev.amd64 copy the i386 version of MAKEDEV to /dev.i386/MAKEDEV copy the amd64 version of MAKEDEV to /dev.amd64/MAKEDEV ( cd /dev.i386 sh ./MAKEDEV all ) ( cd /dev.amd64 sh ./MAKEDEV all ) mv /dev /dev.old ln -sf dev.@machine /dev reboot. If it works then rm -rf /dev.old. Oh, I forgot to address the issue of booting without options MAGICLINKS in the kernel. No matter how early in /etc/rc you try to put the sysctl -w vfs.generic.magiclinks=1 command, init(8) will want to open /dev/console earlier than that. So you either have to enable magiclinks in the kernel (so it's already enabled before init(8) starts), or you have to arrange for /dev/console to work even before magiclinks are enabled via the sysctl command. Adding a symlink from /dev.@machine to dev.i386 works for this (taking dev.@machine literally instead of as a magic expansion): ln -s dev.i386 /dev.@machine This is good enough for /dev/console and /dev/null, because the amd64 and i386 versions of those device nodes are identical. --apb (Alan Barrett)
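The fallback above works because the text of a symlink is stored as written; any magic expansion happens at lookup time in the kernel, and only when magiclinks are enabled. The sketch below creates such a link in a temporary directory and reads the stored text back. The function name and the temporary path are made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Create a symlink named "dev" pointing at "dev.@machine" inside a fresh
 * temporary directory, then read the link text back into buf.
 * Returns 0 on success, -1 on failure.
 */
int
ex_magiclink_text(char *buf, size_t buflen)
{
	char dir[] = "/tmp/magiclink.XXXXXX";
	char path[sizeof(dir) + sizeof("/dev")];
	ssize_t n;
	int rv = -1;

	if (buflen == 0 || mkdtemp(dir) == NULL)
		return -1;
	snprintf(path, sizeof(path), "%s/dev", dir);
	if (symlink("dev.@machine", path) == 0) {
		n = readlink(path, buf, buflen - 1);
		if (n >= 0) {
			buf[n] = '\0';	/* readlink does not terminate */
			rv = 0;
		}
		unlink(path);
	}
	rmdir(dir);
	return rv;
}
```

readlink(2) returns the literal text dev.@machine regardless of the magiclinks setting, which is why a real directory entry with that literal name can serve as the pre-sysctl fallback.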
Re: amd64 kernel, i386 userland
(0x801b, 0x3b) wd3m: device (0x801c, 0x3c) wd3n: device (0x801d, 0x3d) wd3o: device (0x801e, 0x3e) wd3p: device (0x801f, 0x3f) wd4a: device (0x20, 0x40) wd4b: device (0x21, 0x41) wd4c: device (0x22, 0x42) wd4d: device (0x23, 0x43) wd4e: device (0x24, 0x44) wd4f: device (0x25, 0x45) wd4g: device (0x26, 0x46) wd4h: device (0x27, 0x47) wd4i: device (0x8020, 0x48) wd4j: device (0x8021, 0x49) wd4k: device (0x8022, 0x4a) wd4l: device (0x8023, 0x4b) wd4m: device (0x8024, 0x4c) wd4n: device (0x8025, 0x4d) wd4o: device (0x8026, 0x4e) wd4p: device (0x8027, 0x4f) wd5a: device (0x28, 0x50) wd5b: device (0x29, 0x51) wd5c: device (0x2a, 0x52) wd5d: device (0x2b, 0x53) wd5e: device (0x2c, 0x54) wd5f: device (0x2d, 0x55) wd5g: device (0x2e, 0x56) wd5h: device (0x2f, 0x57) wd5i: device (0x8028, 0x58) wd5j: device (0x8029, 0x59) wd5k: device (0x802a, 0x5a) wd5l: device (0x802b, 0x5b) wd5m: device (0x802c, 0x5c) wd5n: device (0x802d, 0x5d) wd5o: device (0x802e, 0x5e) wd5p: device (0x802f, 0x5f) wd6a: device (0x30, 0x60) wd6b: device (0x31, 0x61) wd6c: device (0x32, 0x62) wd6d: device (0x33, 0x63) wd6e: device (0x34, 0x64) wd6f: device (0x35, 0x65) wd6g: device (0x36, 0x66) wd6h: device (0x37, 0x67) wd6i: device (0x8030, 0x68) wd6j: device (0x8031, 0x69) wd6k: device (0x8032, 0x6a) wd6l: device (0x8033, 0x6b) wd6m: device (0x8034, 0x6c) wd6n: device (0x8035, 0x6d) wd6o: device (0x8036, 0x6e) wd6p: device (0x8037, 0x6f) wd7a: device (0x38, 0x70) wd7b: device (0x39, 0x71) wd7c: device (0x3a, 0x72) wd7d: device (0x3b, 0x73) wd7e: device (0x3c, 0x74) wd7f: device (0x3d, 0x75) wd7g: device (0x3e, 0x76) wd7h: device (0x3f, 0x77) wd7i: device (0x8038, 0x78) wd7j: device (0x8039, 0x79) wd7k: device (0x803a, 0x7a) wd7l: device (0x803b, 0x7b) wd7m: device (0x803c, 0x7c) wd7n: device (0x803d, 0x7d) wd7o: device (0x803e, 0x7e) wd7p: device (0x803f, 0x7f) wsfont: device (0x5100, 0x5600) xbd0i: device (0x80008e00, 0x8e08) xbd0j: device (0x80008e01, 0x8e09) xbd0k: device (0x80008e02, 0x8e0a) xbd0l: 
device (0x80008e03, 0x8e0b) xbd0m: device (0x80008e04, 0x8e0c) xbd0n: device (0x80008e05, 0x8e0d) xbd0o: device (0x80008e06, 0x8e0e) xbd0p: device (0x80008e07, 0x8e0f) xbd1a: device (0x8e08, 0x8e10) xbd1b: device (0x8e09, 0x8e11) xbd1c: device (0x8e0a, 0x8e12) xbd1d: device (0x8e0b, 0x8e13) xbd1e: device (0x8e0c, 0x8e14) xbd1f: device (0x8e0d, 0x8e15) xbd1g: device (0x8e0e, 0x8e16) xbd1h: device (0x8e0f, 0x8e17) xbd1i: device (0x80008e08, 0x8e18) xbd1j: device (0x80008e09, 0x8e19) xbd1k: device (0x80008e0a, 0x8e1a) xbd1l: device (0x80008e0b, 0x8e1b) xbd1m: device (0x80008e0c, 0x8e1c) xbd1n: device (0x80008e0d, 0x8e1d) xbd1o: device (0x80008e0e, 0x8e1e) xbd1p: device (0x80008e0f, 0x8e1f) xbd2a: device (0x8e10, 0x8e20) xbd2b: device (0x8e11, 0x8e21) xbd2c: device (0x8e12, 0x8e22) xbd2d: device (0x8e13, 0x8e23) xbd2e: device (0x8e14, 0x8e24) xbd2f: device (0x8e15, 0x8e25) xbd2g: device (0x8e16, 0x8e26) xbd2h: device (0x8e17, 0x8e27) xbd2i: device (0x80008e10, 0x8e28) xbd2j: device (0x80008e11, 0x8e29) xbd2k: device (0x80008e12, 0x8e2a) xbd2l: device (0x80008e13, 0x8e2b) xbd2m: device (0x80008e14, 0x8e2c) xbd2n: device (0x80008e15, 0x8e2d) xbd2o: device (0x80008e16, 0x8e2e) xbd2p: device (0x80008e17, 0x8e2f) xbd3a: device (0x8e18, 0x8e30) xbd3b: device (0x8e19, 0x8e31) xbd3c: device (0x8e1a, 0x8e32) xbd3d: device (0x8e1b, 0x8e33) xbd3e: device (0x8e1c, 0x8e34) xbd3f: device (0x8e1d, 0x8e35) xbd3g: device (0x8e1e, 0x8e36) xbd3h: device (0x8e1f, 0x8e37) xbd3i: device (0x80008e18, 0x8e38) xbd3j: device (0x80008e19, 0x8e39) xbd3k: device (0x80008e1a, 0x8e3a) xbd3l: device (0x80008e1b, 0x8e3b) xbd3m: device (0x80008e1c, 0x8e3c) xbd3n: device (0x80008e1d, 0x8e3d) xbd3o: device (0x80008e1e, 0x8e3e) xbd3p: device (0x80008e1f, 0x8e3f) --apb (Alan Barrett)
Re: amd64 kernel, i386 userland
On Sat, 25 Jan 2014, Emmanuel Dreyfus wrote: Alan Barrett a...@cequrux.com wrote: I see the following differences from this mtree comparison: As I said, if your only filesystem is root on raid0a and your swap is on sd0b/wd0b, you boot to multiuser without touching /dev Perhaps you are thinking of some other scenario, but I am talking about the scenario that exists if you follow the steps in my first message, and do not follow the steps in my second message, and use a kernel that does not have options MAGICLINKS. In such a case, any attempt to cd /dev or open /dev/console will fail, because /dev will be a dangling symlink. It doesn't matter how similar the i386 and amd64 versions of /dev are; if /dev is a dangling symlink then nothing really works. Try it yourself: mkdir /dev.i386 mkdir /dev.amd64 copy the i386 version of MAKEDEV to /dev.i386/MAKEDEV copy the amd64 version of MAKEDEV to /dev.amd64/MAKEDEV ( cd /dev.i386 sh ./MAKEDEV all ) ( cd /dev.amd64 sh ./MAKEDEV all ) mv /dev /dev.old ln -sf dev.@machine /dev and then boot a kernel that does *NOT* have options MAGICLINKS. /dev will be a dangling symlink, and /dev/console will not be found, and init(8)'s attempt to (cd /dev/ sh ./MAKEDEV init) will fail. --apb (Alan Barrett)
Re: amd64 kernel, i386 userland
On Sat, 25 Jan 2014, Thor Lancelot Simon wrote: Perhaps you are thinking of some other scenario, but I am talking about the scenario that exists if you follow the steps in my first message, and do not follow the steps in my second message, and use a kernel that does not have options MAGICLINKS. In such a case, any attempt to cd /dev or open /dev/console will fail, because /dev will be a dangling symlink. init should detect this (doesn't it already?) and mount a tmpfs /dev. init detects the absence of /dev/console, and tries to mount a tmpfs /dev, but the first step in that process is chdir("/dev"), which fails when /dev is a dangling symlink. --apb (Alan Barrett)
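The failure mode is easy to reproduce outside of init(8). The sketch below makes a dangling symlink in a temporary directory and tries to chdir(2) into it; the function name and paths are made up, and this is not init's code.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Create a dangling symlink in a fresh temporary directory and try to
 * chdir(2) into it.  Returns the errno from the failed chdir, 0 if chdir
 * unexpectedly succeeded, or -1 if the setup itself failed.
 */
int
ex_chdir_dangling(void)
{
	char dir[] = "/tmp/dangling.XXXXXX";
	char path[sizeof(dir) + sizeof("/dev")];
	int rv = -1;

	if (mkdtemp(dir) == NULL)
		return -1;
	snprintf(path, sizeof(path), "%s/dev", dir);
	if (symlink("dev.nonexistent", path) == 0) {
		rv = (chdir(path) == -1) ? errno : 0;
		unlink(path);
	}
	rmdir(dir);
	return rv;
}
```

The chdir fails with ENOENT because the link target does not exist, which is the same situation as /dev pointing at an unexpanded dev.@machine.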
Re: amd64 kernel, i386 userland
On Fri, 24 Jan 2014, matthew green wrote: i386 and amd64 do NOT have compatible /dev. if you boot an amd64 kernel, make sure you run an amd64 MAKEDEV in /dev. I have successfully used magic symlinks (see symlink(7)) to allow i386 and amd64 to use different instances of /dev. The basic scheme is: Build a kernel with options MAGICLINKS, or arrange to run sysctl -w vfs.generic.magiclinks=1 very early in /etc/rc. Putting the setting in /etc/sysctl.conf will probably be too late. mkdir /dev.i386 mkdir /dev.amd64 copy the i386 version of MAKEDEV to /dev.i386/MAKEDEV copy the amd64 version of MAKEDEV to /dev.amd64/MAKEDEV ( cd /dev.i386 sh ./MAKEDEV all ) ( cd /dev.amd64 sh ./MAKEDEV all ) mv /dev /dev.old ln -sf dev.@machine /dev reboot. If it works then rm -rf /dev.old. --apb (Alan Barrett)
Re: amd64 kernel, i386 userland
On Tue, 21 Jan 2014, Emmanuel Dreyfus wrote: In order to have more RAM without reinstalling everything, using an amd64 kernel on top of an i386 userland seems appealing. netbsd32 emulation works fine, and the machine boots to multiuser without a hassle. But there are minor problems, with binaries that use ioctl to talk with a kernel subsystem: I spotted ipf and raidctl. If there are particular ioctls that don't have proper netbsd32 compat equivalents, they can be fixed on a case by case basis. We'd need to start with a list of the problematic ioctls, either by noticing what fails, or by systematically searching for ioctls whose data includes fields that change size between 32-bit and 64-bit systems. I think that we should also avoid adding such problematic kernel interfaces in future, by using fixed width types wherever possible. Things would be much easier if the kernel searched /emul/netbsd64 before / for native binaries. Of course such a behavior cannot be made default because of the performance penalty. But a compile time option would be nice without causing any performance harm to people that do not want it. I have also sometimes wished for that. --apb (Alan Barrett)
Re: Vnode API cleanup pass 2a
On Wed, 15 Jan 2014, Taylor R Campbell wrote: For that matter, why new machinery for this versioning stuff at all? Why not just rename the vop from mkdir to mkdir_v2? That would take care of both struct vop_mkdir_v2_args and VOP_MKDIR_V2. Am I missing something? That would make the calling code ugly. --apb (Alan Barrett)
Re: qsort_r
On Mon, 09 Dec 2013, Mouse wrote: I actually don't see anything that promises that a pointer to a function type may be converted to a pointer to void, nor back again (except, in each direction, when the original pointer is nil), much less promising anything about the results if it is done. But I haven't read over the whole thing recently enough to be sure there isn't such a promise hiding somewhere. Sorry, I did not express myself clearly enough. C does not promise that function pointers can be converted to or from void* pointers, but I believe that all existing NetBSD implementations do allow such conversions. --apb (Alan Barrett)
Re: qsort_r
On Sun, 08 Dec 2013, David Holland wrote: My irritation with not being able to pass a data pointer through qsort() boiled over just now. Apparently Linux and/or GNU has a qsort_r() that supports this; so, following is a patch that gives us a compatible qsort_r() plus mergesort_r(), and heapsort_r(). Apparently FreeBSD [1] and GNU [2] have incompatible versions of qsort_r, passing the extra 'thunk' or 'data' argument in a different position. [1]: FreeBSD qsort_r http://www.manpagez.com/man/3/qsort_r/ [2]: Linux qsort_r http://man7.org/linux/man-pages/man3/qsort.3.html If we have to pick one, let's pick the FreeBSD version. I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. I'd probably duplicate the code via CPP, to trade time for space, but your way is fine. Note that the thunks use an extra struct to hold the function pointer; this is to satisfy C standards pedantry about void pointers vs. function pointers, and if we decide not to care it could be simplified. That adds more run-time overhead. Could you make it conditional on whether it's really necessary? All existing NetBSD platforms can convert back and forth between void * and function pointers without any trouble. --apb (Alan Barrett)
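The thunk arrangement discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: sketch_qsort_r() is a hypothetical stand-in (a small insertion sort) using the FreeBSD-style argument order, and struct cmp_holder is an invented name for the "extra struct" that carries the function pointer through a void * argument. It is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical FreeBSD-style comparator: the extra argument comes first. */
typedef int (*cmp_r_fn)(void *thunk, const void *a, const void *b);
typedef int (*cmp_fn)(const void *a, const void *b);

/* Stand-in for qsort_r(): a simple insertion sort, for illustration only. */
static void
sketch_qsort_r(void *base, size_t nmemb, size_t size, void *thunk,
    cmp_r_fn cmp)
{
	unsigned char *p = base;
	unsigned char tmp[64];

	assert(size <= sizeof(tmp));	/* sketch limitation, not a real API rule */
	for (size_t i = 1; i < nmemb; i++) {
		size_t j = i;

		memcpy(tmp, p + i * size, size);
		while (j > 0 && cmp(thunk, p + (j - 1) * size, tmp) > 0) {
			memcpy(p + j * size, p + (j - 1) * size, size);
			j--;
		}
		memcpy(p + j * size, tmp, size);
	}
}

/* The extra struct: carries a function pointer through a void * argument. */
struct cmp_holder {
	cmp_fn cmp;
};

/* Adapter that unpacks the holder and calls the plain comparator. */
static int
cmp_thunk(void *thunk, const void *a, const void *b)
{
	const struct cmp_holder *h = thunk;

	return h->cmp(a, b);
}

/* Plain qsort-style entry point built on top of the _r variant. */
static void
sketch_qsort(void *base, size_t nmemb, size_t size, cmp_fn cmp)
{
	struct cmp_holder h = { cmp };

	sketch_qsort_r(base, nmemb, size, &h, cmp_thunk);
}

/* Example comparator for int elements. */
static int
cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}
```

Calling sketch_qsort(v, n, sizeof(v[0]), cmp_int) sorts an int array through the thunk. The run-time cost mentioned in the reply is the extra indirect call in cmp_thunk() plus the load from the holder; passing the function pointer directly as the void * would save the load on platforms where that conversion works.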
Re: qsort_r
On Sun, 08 Dec 2013, Mouse wrote: Is just casting the function pointers safe in C? No. As soon as you call through a pointer to the wrong type you're off in nasal demon territory. (Loosely put; I'd have to look up the exact wording - there is a little wiggle room, but, if I've understood the subject of the discussion correctly, not enough.) You can't call through a function pointer of the wrong type, but you can cast from one type to another. I think that's enough, provided that void * is large enough to be converted to and from a function pointer. If you can find me a description of what NetBSD assumes beyond what C promises, I can have a stab at answering that question. There is no such list. That's a bug in NetBSD's documentation. --apb (Alan Barrett)
Re: [patch] put ptrdiff_t in the kernel and create sys/stddef.h
On Wed, 04 Dec 2013, David Holland wrote: (*) A complete scheme for doing it right removes all the _BSD_FOO_T_ drivel and ifdefs scattered in userland headers in favor of: - a single header file that defines all the needed types prefixed with __, which can be included anywhere; - in userland, include-guarded header files akin to sys/null.h that define single or common groups of the names without the __ prefixes, e.g. types/size_t.h; - including these header files in the proper places, such as in standard userland header files like stddef.h; - in the kernel, a single header file that defines all the types without the __, that is or is exposed to sys/types.h but does not affect userland. Yes, that's one way of doing it right. Until such time as somebody does it right, please follow the pattern of what's done already. --apb (Alan Barrett)
Re: [patch] changing lua_Number to int64_t
On Sun, 17 Nov 2013, Mouse wrote: sizeof returns the number of bytes used to store an object. This is only loosely related to the number of data bits in the object; the latter is no more than sizeof the object times CHAR_BIT, but it may be lower. Also, using an exact-width type assumes that the hardware/compiler in question _has_ such a type. Yes, that's true of C. It's possible that lua, NetBSD, or the combination of the two is willing to write off portability to machines where one or both of those potential portability issues becomes actual. But that seems to be asking for trouble to me; history is full of "but nobody will ever want to port this to one of _those_" that come back to bite people. NetBSD already assumes that char is exactly 8 bits, and that integer types with exactly 16, 32, and 64 bits exist. Adding more instances of the same assumptions doesn't seem like a big problem to me. If there's ever a need to port to a machine where those assumptions do not hold, then we can worry about it at that time, but I suspect that it will be possible to change to using things like int_least64_t (for a type with no less than 64 bits) instead of int64_t (for a type with exactly 64 bits). --apb (Alan Barrett)
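The distinction between storage bits, int64_t and int_least64_t can be sketched as below. The macro and function names are invented for illustration; the only facts relied on are from C99: int_least64_t always exists, while int64_t (and hence INT64_MAX) is provided only when an exact 64-bit type is available.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Storage bits: an upper bound on value bits, since padding bits may exist. */
#define STORAGE_BITS(type) (sizeof(type) * CHAR_BIT)

/* Returns 1 if the implementation provides the exact-width type int64_t. */
static int
have_exact_int64(void)
{
#ifdef INT64_MAX
	return 1;
#else
	return 0;
#endif
}

/* int_least64_t is always available and holds at least 64 bits. */
static int_least64_t
largest_least64(void)
{
	return INT_LEAST64_MAX;
}
```

On every current NetBSD port have_exact_int64() would be expected to return 1; code that had to move to a machine without an exact 64-bit type could switch to int_least64_t and keep the same value range or better.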
Re: pulse-per-second API status
On Fri, 01 Nov 2013, Greg Troxel wrote: But if NetBSD enables PPS on ucom, there's going to be an expectation that it is good enough for stratum-1 timekeeping, like PPS on real serial ports. I don't think there's any such expectation created. [...] People who expect the same as serial PPS are confused, and we are not responsible for that. I think that PPS on a device with very high interrupt latency is sufficiently similar to PPS on a device with low interrupt latency that it deserves to have the same API. I don't think it even needs a sysctl to enable it. I think that it just needs careful documentation, in ucom(4) and wherever we document the PPS API. Maybe the documentation for applications like ntpd should also warn against using PPS on USB interfaces. --apb (Alan Barrett)
Re: pulse-per-second API status
On Fri, 01 Nov 2013, paul_kon...@dell.com wrote: I don't know this API. But my first reaction when I saw the designation PPS is to think of GPS timekeeping boxes and other precision frequency sources that have a PPS output. On those devices, the PPS output is divided down from the main oscillator frequency, i.e., you can expect accuracies of 10^-9 for modest price crystal oscillators, 10^-10 to 10^-12 for higher end stuff -- and jitter in the nanosecond range or better. It seems rather confusing to have another interface that goes by the same name but has specs 6 or more orders of magnitude worse. How about a different name that avoids this confusion? It's exactly the same interface. Something in the external timekeeping box is hooked up to one of the modem control lines on a serial port; the modem control line is hooked up to an interrupt (or something like an interrupt); the interrupt is hooked up to something in the kernel that records the time that the interrupt occurred. The difference is only one of interrupt latency. With plain old serial ports, the modem control line can be hooked up to a CPU interrupt pin using low-latency electronics. With USB, if I have understood correctly, the interrupt is faked by some sort of polling interface with much higher latency and jitter. --apb (Alan Barrett)
Re: zero-length symlinks
On Fri, 01 Nov 2013, David Holland wrote: rmind@ points out that it's possible to create zero-length symlinks. As zero-length symlinks aren't sensible, this should probably be prohibited. Does anyone see any reason they shouldn't be? Symlink names should satisfy all the rules for file system object names, so zero-length names should not be allowed. Symlink targets are just strings. They are usually used to store path names, but they can also be used to store arbitrary strings that can be read via readlink(2). NetBSD's malloc implementation uses /etc/malloc.conf in this way, and I don't see a reason to prohibit it from using "" (the empty string). POSIX says "The string pointed to by path1 shall be treated only as a character string and shall not be validated as a pathname." http://pubs.opengroup.org/onlinepubs/9699919799/functions/symlink.html --apb (Alan Barrett)
Re: module path message
On Tue, 29 Oct 2013, John Nemeth wrote: The default path for module loading is: /stand/amd64-xen/6.99.25/modules I suggest exposing the path via sysctl, and printing the sysctl mib name in the message, something like kern.module.path=/stand/amd64-xen/6.99.25/modules --apb (Alan Barrett)
Re: changing KASSERT()'s definition for non-diag kernels
On Sun, 20 Oct 2013, matthew green wrote: as part of the GCC 4.8 preparation work, we're seeing many new warnings where variables are only used inside KASSERT(), but the non-diag kernel builds trigger errors. my solution, rather than marking these variables with __USE(), is to change KASSERT() into a real function that consumes its arguments, but is still an empty function. That seems sensible to me. More generally, a lot of our existing macros can be rewritten as static inline functions, now that we require a C99 compiler. note that there is a re-direction to force the input to KASSERT() to be an integer type, as it is called with all sorts of types of input (pointers, values, boolean expressions..) The KASSERT macro can be invoked with anything that has a truth value as its first argument. Casting that to int seems reasonable, but perhaps using (!!(e)) to convert any type to a truth value would be clearer and less likely to trigger compiler warnings about casting non-numeric types to int. --apb (Alan Barrett)
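A rough sketch of the idea, assuming invented names (sketch_kassert_nop, SKETCH_KASSERT) rather than the real sys/systm.h definitions: in a non-DIAGNOSTIC build the assertion becomes a call to an empty static inline function, and (!!(e)) turns a pointer, integer or boolean expression into 0 or 1 without a cast.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical non-DIAGNOSTIC stand-in: consumes its argument, does nothing. */
static inline void
sketch_kassert_nop(bool cond)
{
	(void)cond;
}

/* !!(e) maps any scalar (pointer, integer, boolean expression) to 0 or 1. */
#define SKETCH_KASSERT(e) sketch_kassert_nop(!!(e))

/* A variable used only inside the assertion no longer looks unused. */
static int
sketch_use(const char *p, int n)
{
	int len = n * 2;	/* referenced only by the assertion below */

	SKETCH_KASSERT(p);			/* pointer argument */
	SKETCH_KASSERT(len >= n || n < 0);	/* boolean expression */
	return n;
}
```

One difference from an empty macro is worth noting: here the argument expression is actually evaluated, so any side effects in it would occur, although a compiler is free to discard side-effect-free expressions.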
Re: Why do we need lua in-tree again? Yet another call for actual evidence, please. (was Re: Moving Lua source codes)
On Sat, 19 Oct 2013, Marc Balmer wrote: The inclusion and use of Lua in base, for use in userland and the kernel, [...] has, last but not least, core's blessing. Would you please either present some evidence for that claim, or stop making the claim. To the best of my knowledge, userland Lua was approved by core in 2010, but kernel Lua has never been approved by core. Can we now please stop this useless discussion? People will continue to ask questions until they receive some satisfactory answers. --apb (Alan Barrett)
Re: Why do we need lua in-tree again? Yet another call for actual evidence, please. (was Re: Moving Lua source codes)
On Fri, 18 Oct 2013, Lourival Vieira Neto wrote: I have to point out that interesting work is commonly used as a sort of euphemism to refer to highly experimental work with unclear future. Yes. But I'm talking about interesting *user* work. I'm not claiming that they should be in the kernel. I'm just saying that, IMHO, we should incorporate a small device driver that facilitates this kind of development (outside the tree). You seem to want the lua device driver to be inside the tree, to facilitate experimental work outside the tree. Other people have asked why the lua(4) device driver itself can't be developed outside the tree (with a view to importing it later, if it ever proves to be more than an experiment), and I have seen no good answer to that. --apb (Alan Barrett)
Re: Why do we need lua in-tree again? Yet another call for actual evidence, please. (was Re: Moving Lua source codes)
On Sat, 19 Oct 2013, Marc Balmer wrote: And now to give you a practical example what I personally do with lua(4) right now: In the past I wrote several tty line disciplines to decode various serial formats. Now I have a need for that again. Doing this in C is of course possible, but I want something more dynamic. So I wrote a tty line discipline that uses Lua to do all the decoding. That works like a charm: Load the script, test, change the script and reload. Really practical. I will release this code once I sorted out a few remaining details. And in the course of this work, I also found deficiencies in slattach(8). In previous work I used Lua to create a software gpio device, a modified version of gpiosim(4) that uses a Lua script to mimic a real device. Also handy. Thank you. Those seem like useful examples. --apb (Alan Barrett)
Re: Moving Lua source codes
On Tue, 15 Oct 2013, Marc Balmer wrote: Well, you are in contradiction to our guide, which under http://www.netbsd.org/releases/release-map.html#current states that NetBSD-current is the main development branch. NetBSD-current is the main development branch for things that we know we want, and that we are prepared to support for a long time, and that mostly work. If any of those tests fail, then I'd say that the code should not be in -current, but could be in a branch or in pkgsrc or in some third party tree. In the case of kernel Lua, some people are not convinced that we want it, and some people are not convinced that the API is stable enough that we should commit to long-term compatibility for it. Although I think that developing in the main -current tree is acceptable (especially if users are told not to expect as much future compatibility as for most other parts of NetBSD), I would have preferred to see development in a branch. It's certainly not the clear-cut "no need for a branch" situation that you seem to think. --apb (Alan Barrett)
Re: Sending ATA commands?
On Sun, 11 Aug 2013, Mouse wrote: What does your support do? Does it let you write over the host protected area? Does it let you extract what's in there? Yes and yes. It simply removes the protection, letting the host see the HPA as what it really is: more space appended to the space advertised to HPA-unaware software. I don't really like silently appending the host protected area to the unprotected part of the disk. Exposing something that is supposed to be hidden could have unexpected consequences. I think I'd prefer to present the HPA as a separate device (an ld(4) device, as others have suggested, would be fine), and add some ioctls and atactl commands to query and adjust the sizes of the ordinary and HPA parts of the disk. --apb (Alan Barrett)
Re: marking kern_assert(9) as __dead, and recursive panics
On Sun, 10 Feb 2013, Alan Barrett wrote: * Remove the panicstr test from kern_assert() in sys/lib/libkern/kern_assert.c, so that KASSERT, KASSERTMSG and friends do not degenerate to no-ops after a panic. I don't know a reason for making all kernel asserts degenerate to no-ops, but I imagine that it might have been a workaround for problems with recursive panics, and I propose to address recursive panics directly (see below). I can also imagine that there are particular kernel asserts that need to degenerate to no-ops after a panic, and I suggest explicitly rewriting them in terms of (panicstr != NULL || other tests). I have not attempted to identify such asserts. People have informed me that, when debugging a kernel after a panic, they often want to call functions that may hit assertion failures, and the particular asserts cannot reasonably be identified in advance, so it's useful for all kernel asserts to degenerate to no-ops after a panic. I will produce a revised proposal that retains this feature which people obviously want. My current ideas are to print a message about the fact that the assertion failure was ignored (instead of silently ignoring the assertion failure), and to use ifdefs to allow static analysers to behave as if the assertion failures are never ignored. --apb (Alan Barrett)
Re: fixing compat_12 getdents
also, EINVAL doesn't seem like a great error code for this condition. it's not an input parameter that's causing the error, but rather that the required output format cannot express the data to be returned. I think solaris uses EOVERFLOW for this kind of situation, and ERANGE doesn't seem too bad either. any opinions on that? There's also E2BIG, but I don't think it fits. ERANGE is documented in terms of the available space, while EOVERFLOW is documented in terms of a numeric result. So perhaps EOVERFLOW for "integer is too large to fit in N bits", and ERANGE for "string is too long to fit in N bytes"? Or vice versa? Somebody(TM) should go through the errno(2) documentation and make the descriptions more generic, and add guidance for choosing which code to return. --apb (Alan Barrett)
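One of the two assignments floated above (EOVERFLOW for a number that does not fit, ERANGE for a string that does not fit) could look like the sketch below. The function names and the 32-bit field are hypothetical; this is not the compat_12 getdents code, and the thread leaves open whether the assignment should be the other way round.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Narrow a 64-bit value into a 32-bit output field, or report EOVERFLOW. */
static int
sketch_narrow_ino(uint64_t ino, uint32_t *out)
{
	if (ino > UINT32_MAX)
		return EOVERFLOW;	/* integer too large to fit in 32 bits */
	*out = (uint32_t)ino;
	return 0;
}

/* Copy a name into a fixed-size output buffer, or report ERANGE. */
static int
sketch_copy_name(const char *name, char *buf, size_t buflen)
{
	size_t len = strlen(name);

	if (len >= buflen)
		return ERANGE;		/* string too long to fit in buflen bytes */
	memcpy(buf, name, len + 1);
	return 0;
}
```

In both cases the input is valid and only the output format is too small, which is the reason EINVAL reads as misleading here.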
Re: KNF and the C preprocessor
On Mon, 10 Dec 2012, David Young wrote: What do people think about setting stricter guidelines for using the C preprocessor than the guidelines from the past? Maybe. The C preprocessor MUST NOT be used for 1. In-line code: 'static inline' subroutines are virtually always better than macros. I disagree with this one. If you tone it down to "SHOULD NOT" or "prefer static inline functions where appropriate" then I might agree, but MUST NOT is way too strict. Sometimes the C standard mandates the use of macros, and I would not want to violate that simply to comply with your MUST NOT requirement. For example, the ctype(3) API must be provided by extern functions, and may also be provided by macros; I don't see how the macros could be replaced by static inline functions without breaking something. The first breakage that springs to mind is that adding parentheses around a name like (isalpha) is supposed to prevent it from being interpreted as a macro, so you get the function instead; but there's no analogous way to prevent something from being interpreted as a static inline function so you get the external function instead. 2. Configuration management: use the compiler and linker to a greater extent than the C preprocessor to configure your program for your execution environment, your chosen compilation options, et cetera. Again here, MUST NOT is way too strict. While I don't like *.c files littered with ifdefs, I think it's OK for *.c files to contain many macro invocations, and it's OK for header files to contain many ifdefs, but both these would be outlawed by your MUST NOT requirement. --apb (Alan Barrett)
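The parenthesised-name rule mentioned for ctype(3) can be shown in a few lines. A function-like macro is expanded only when its name is immediately followed by an opening parenthesis, so (isalpha)(c) and a function pointer both reach the real extern function. The wrapper names below are invented for illustration.

```c
#include <assert.h>
#include <ctype.h>

/* May expand a macro version of isalpha, if the implementation has one. */
static int
via_macro_or_function(int c)
{
	return isalpha(c) != 0;
}

/* Parentheses around the name suppress any function-like macro. */
static int
via_function(int c)
{
	return (isalpha)(c) != 0;
}

/* Taking the address also requires the real extern function to exist. */
static int
via_pointer(int c)
{
	int (*fn)(int) = isalpha;

	return fn(c) != 0;
}
```

All three give the same answer for valid input; the point is only which definition is used. If isalpha were supplied solely as a static inline function in the header, there would be no comparable spelling that selects the external function instead.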
Re: core statement on fexecve, O_EXEC, and O_SEARCH
The fexecve function could be implemented entirely in libc, via execve(2) on a file name of the form /proc/self/fd/N. Any security concerns around fexecve() also apply to exec of /proc/self/fd/N. I gave a try to this approach. There is an unexpected issue: The descriptor is probably already closed on exec before the syscall tries to use it. I believe that we should not fix that without a proper design of how all the parts will work together. Some questions that I would like to see answered are: Should it be possible to exec a fd only if a special flag was used in the open(2) call? Should the file's executability be checked at open time or at exec time, or both, or does it depend on open flags or on what happened to the fd in between open and exec? Should the record of the fact that the fd may be eligible for exec be erased when the fd is passed from one process to another? Always or only sometimes? How can fds obtained from procfs be made to follow the rules? --apb (Alan Barrett)
Re: FFS write coalescing
On Mon, 03 Dec 2012, Chuck Silvers wrote: the genfs code also never writes clean pages to disk, even though for RAID5 storage it would likely be more efficient to write clean pages that are in the same stripe as dirty pages if that would avoid issuing partial-stripe writes. (which is basically another way of saying what david said.) Perhaps there should be a way for block devices to report at least three block sizes: a) smallest possible block size (512 for almost all disks) b) smallest efficient block size and alignment (4k for modern disks, stripe size for raid) c) largest possible size (a device and bus-dependent variant of MAXPHYS) Then the file system could use (b) to know when it's a good idea to combine dirty and clean pages into the same write. --apb (Alan Barrett)
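As a sketch of how a file system might use the three sizes suggested above: the struct and function names are hypothetical (no such interface exists in the source), and the policy shown is simply "widen a dirty range to (b)-aligned boundaries unless the result would exceed (c)".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device report of the three block sizes. */
struct sketch_blksizes {
	uint32_t min_size;	/* (a) smallest possible block size */
	uint32_t opt_size;	/* (b) smallest efficient size and alignment */
	uint32_t max_size;	/* (c) largest possible transfer size */
};

/*
 * Widen the byte range [*off, *off + *len) so that it starts and ends on
 * (b)-aligned boundaries.  Returns 1 and updates the range on success, or
 * 0 (range untouched) if the widened range would exceed (c).
 */
static int
sketch_widen_write(const struct sketch_blksizes *bs, uint64_t *off,
    uint64_t *len)
{
	uint64_t start = *off - (*off % bs->opt_size);
	uint64_t end = *off + *len;
	uint64_t rem = end % bs->opt_size;

	if (rem != 0)
		end += bs->opt_size - rem;
	if (end - start > bs->max_size)
		return 0;
	*off = start;
	*len = end - start;
	return 1;
}
```

With sizes of 512, 4096 and 65536 bytes, a 512-byte dirty range at offset 4608 widens to the 4096-byte block at offset 4096; the extra bytes are the clean pages that would be written along with the dirty one.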
core statement on fexecve, O_EXEC, and O_SEARCH
The NetBSD core group has considered adding the fexecve(2) or fexecve(3) syscall or function, and adding new O_EXEC and O_SEARCH open(2) flags. These new features may be useful, but their security properties are not well understood. The core group is of the opinion that these new features should not be added to NetBSD until there is a design that discusses their security properties, the way they interact with each other and existing features, and addresses the security concerns. Designs that are slightly incompatible with other operating systems or with POSIX need not be ruled out; for example, it may be reasonable to make fexecve() fail if the fd was not opened with certain flags, or to automatically clear certain flags when the fd is passed from one process to another. The fexecve function could be implemented entirely in libc, via execve(2) on a file name of the form /proc/self/fd/N. Any security concerns around fexecve() also apply to exec of /proc/self/fd/N. If necessary, the open(2) syscall could be versioned so that O_RDONLY is no longer defined as zero. -- Alan Barrett, on behalf of core
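The libc-only approach mentioned in the statement might be sketched as follows. The names sketch_fd_path and sketch_fexecve are invented, the sketch assumes procfs is mounted on /proc, and it deliberately ignores the open questions raised in the thread, including the report that a close-on-exec descriptor may already be closed by the time the path is looked up.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Format "/proc/self/fd/N" into buf; returns 0 on success, -1 if it won't fit. */
static int
sketch_fd_path(int fd, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen, "/proc/self/fd/%d", fd);

	return (n > 0 && (size_t)n < buflen) ? 0 : -1;
}

/* Hypothetical libc-level fexecve: exec the file behind fd via procfs. */
static int
sketch_fexecve(int fd, char *const argv[], char *const envp[])
{
	char path[64];

	if (fd < 0 || sketch_fd_path(fd, path, sizeof(path)) != 0)
		return -1;
	/* Only returns on failure, like execve(2) itself. */
	return execve(path, argv, envp);
}
```

Because the work is done by an ordinary execve(2) on a procfs path, nothing here checks how the descriptor was opened, which is why any security concern about fexecve() applies equally to exec of /proc/self/fd/N.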
Re: [PATCH] POSIX extended API set 2
On Sun, 11 Nov 2012, Emmanuel Dreyfus wrote: Taylor R Campbell campbell+netbsd-tech-k...@mumble.net wrote: I know this is a bike shed, and I'm sorry to be the one to bring it up, but can we use the names chmodat, chownat, &c., for our native system calls, and just use libc aliases or _BLAH_SOURCE nonsense or something for the ridiculous `f' prefix on fchmodat, fchownat, &c.? What is the goal? You want to write userland code using chmodat() instead of fchmodat()? I want the names to follow a clear and easily-documented pattern.

Takes a name   Takes a fd, not a name   Takes a name and an at fd
               (prepend f)              (append at)
------------   ----------------------   -------------------------
open           - (fopen is different)   openat
link           -                        linkat
unlink         -                        unlinkat
rename         -                        renameat
chdir          fchdir                   chdirat
mkdir          fmkdir                   mkdirat
mkfifo         fmkfifo                  mkfifoat
utimens        futimens                 utimensat
chmod          fchmod                   chmodat (not fchmodat)
chown          fchown                   chownat (not fchownat)
stat           fstat                    statat (not fstatat)
access         -                        accessat (not faccessat)

However, I also want the inconsistent POSIX names to be provided. I don't know a good way of satisfying both goals. --apb (Alan Barrett)
Re: pass-through linux ioctl for mfi(4)
On Wed, 19 Sep 2012, Manuel Bouyer wrote: Here's an updated patch, which checks the size before malloc in mfifioctl(), and I also removed a debug printf in compat_linux. I intend to commit this next weekend. Are these pass-through ioctl commands denied at securelevel = 1? --apb (Alan Barrett)
Core statement on directory naming for kernel modules
Core statement on directory naming for kernel modules -- July 2012 The NetBSD core group has noted concerns about the name of the directory used for kernel modules. At present, the kernel loads modules from the directory /stand/${MODULE_MACHINE}/${VERSION}/modules (e.g. /stand/amd64/6.99.4/modules). There have been several objections to this use of the /stand directory, and several suggestions for alternatives. On 8 July 2012, David Holland presented this summary of the proposals, and objections to them: /boot is wrong because modules are not used only or even primarily at boot. /lib/modules is wrong because modules are not link libraries. /libdata/modules is wrong because modules are not data. /libexec/modules is wrong because modules are not programs. /modules is wrong because it adds a new toplevel directory. /stand/modules is wrong because modules are not used without the kernel. There have also been proposals for more radical changes, including: Keeping both the kernel and its modules together in a directory. A detailed description was posted by Luke Mewburn http://mail-index.NetBSD.org/current-users/2009/05/10/msg009372.html. Keeping both the kernel and its modules together in a tar archive. Keeping both the kernel and its modules together in an ELF executable. The core group is of the opinion that it is too late for such major changes to be included in NetBSD-6. Accordingly, we think that the existing scheme should be retained, without changes to either directory names or more fundamental aspects, for the NetBSD-6 release. Changes to either the directory names, or more fundamental aspects of the scheme, or both, may be made in the future. The core group would also like to see the following changes in the near future: Implementation of the scheme described by Luke Mewburn in http://mail-index.NetBSD.org/current-users/2009/05/10/msg009372.html to allow a kernel and its modules to be kept together. 
Changes to config(1) to extend the existing notion of whether or not an option is built-in to the kernel, to three states: built-in, not built-in but loadable as a module, entirely excluded and not even loadable as a module. Alan Barrett, on behalf of core
Re: raid1: unable to open device, error = 16
On Thu, 26 Jul 2012, matthew green wrote: library functions like opendisk() will look for a file-path of the given name before trying other names. that tends to make them use the block dev when you want the char dev. eg, compare the ktrace for newfs raid1d when you are in/not in /dev. I think that opendisk should do something like this: if (path contains a slash) { use specified path, do not search in any way } else { try /dev/[r]foo try /dev/[r]fooX, where X is c or d depending on kern.rawpartition } If the user wants to open a file in the cwd, then let the user pass ./foo instead of foo to opendisk. --apb (Alan Barrett)
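The lookup order proposed above can be sketched as a function that lists the candidate paths without opening anything. The names are hypothetical and this is not the libutil opendisk(3) code; in particular, trying the raw (r-prefixed) node before the block node within each step is an assumption, since the message writes only "/dev/[r]foo".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAXCAND 4
#define SKETCH_PATHLEN 64

/*
 * Fill cand[] with the paths to try, in order, and return how many there
 * are.  rawpart is the raw partition letter ('c' or 'd'), which the real
 * code would obtain from kern.rawpartition.
 */
static int
sketch_disk_candidates(const char *name, char rawpart,
    char cand[SKETCH_MAXCAND][SKETCH_PATHLEN])
{
	if (strchr(name, '/') != NULL) {
		/* Path contains a slash: use it as given, no searching. */
		snprintf(cand[0], SKETCH_PATHLEN, "%s", name);
		return 1;
	}
	snprintf(cand[0], SKETCH_PATHLEN, "/dev/r%s", name);
	snprintf(cand[1], SKETCH_PATHLEN, "/dev/%s", name);
	snprintf(cand[2], SKETCH_PATHLEN, "/dev/r%s%c", name, rawpart);
	snprintf(cand[3], SKETCH_PATHLEN, "/dev/%s%c", name, rawpart);
	return 4;
}
```

Under this scheme a bare name such as raid1 never matches a file in the current directory; a user who wants that passes ./raid1, which contains a slash and is used verbatim.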
Re: File systems on 4k sector devices?
On Thu, 07 Jun 2012, Michael van Elst wrote: WAPBL is an exception, the position and size of the journal is stored in terms of physical disk sectors. So if I use dd(1) or equivalent to copy the data from a disk with 512-byte sectors to a disk with 4096-byte sectors, and if the data includes a WAPBL file system, then it won't work? Can we fix this? Can we fix it before next week, when I plan to upgrade my laptop's disk to one with 4kB sectors? --apb (Alan Barrett)
Re: Time to remove COMPAT_386BSD_MBRPART
On Sun, 04 Mar 2012, David Holland wrote: Now that netbsd-6 has been branched, I think it's time to remove COMPAT_386BSD_MBRPART. This entails both the kernel code and some code in disklabel(8) and sysinst. The kernel code has been disabled by default for years; the disklabel(8) code was overlooked at the time and disabled about a year ago (in both current and -5) after several reports of trashed FreeBSD partitions were traced to it. Does anyone object? It's been a good long time since the partition ID changed. No objection, but please can it be added to Features that will be removed in the next version in the netbsd-6 release notes. --apb (Alan Barrett)
Re: extattr namespaces
On Fri, 10 Feb 2012, YAMAMOTO Takashi wrote: how about the following mapping?
xattr name string -> ufs on-disk
system.foo -> SYSTEM foo
others.bar -> USER others.bar
Looks reasonable, but then which of the following?
a) user.user.baz -> USER user.baz
b) user.baz -> USER user.baz
c) user.baz -> USER baz
d) baz -> USER baz
(I suggest b and d) --apb (Alan Barrett)
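Taking the proposed mapping together with options (b) and (d), the rule reduces to: strip a leading "system." and use the SYSTEM namespace, otherwise keep the whole string in the USER namespace. A sketch with invented names (not the actual extattr code) follows; it does not consider edge cases such as a name that is exactly "system.".

```c
#include <assert.h>
#include <string.h>

enum sketch_ns { SKETCH_NS_USER, SKETCH_NS_SYSTEM };

/*
 * Map an xattr name string to an on-disk namespace and attribute name.
 * *ondisk points into xname; nothing is copied.
 */
static enum sketch_ns
sketch_map_xattr(const char *xname, const char **ondisk)
{
	static const char sys[] = "system.";
	size_t n = sizeof(sys) - 1;

	if (strncmp(xname, sys, n) == 0) {
		*ondisk = xname + n;	/* system.foo -> SYSTEM foo */
		return SKETCH_NS_SYSTEM;
	}
	*ondisk = xname;		/* everything else -> USER, unchanged */
	return SKETCH_NS_USER;
}
```

With this rule user.baz and baz are distinct USER attributes, which keeps the mapping one-to-one; options (a) and (c) would each need a second prefix rule.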
Re: Implementing mount_union(8) into vfs (for -o union)?
On Sat, 28 Jan 2012, Julian Fagir wrote: I've just been trying to mount a tmpfs over a read-only root file system. Unfortunately, this won't work just by mounting a tmpfs with option union over the root file system. You'd have to create a tmpfs, and mount that one with mount_union(8) over the root file system, which is again not possible. I read your message twice and I still don't know what you mean. Could you give examples of the commands that you use, and the errors. --apb (Alan Barrett)
Re: fifo and [acm]time
On Mon, 26 Dec 2011, Taylor R Campbell wrote: Is one inode update per minute enough to be a significant issue? It means the disk must continue spinning and, e.g., will continue to draw power from a laptop battery to do so, even when the system is functionally idle. I think that's a more general problem. It would be nice if all updates to atime/mtime/ctime could be buffered in memory (not committed to stable storage) until either the disk happens to be spinning anyway, or the amount of memory wasted in buffering the updates is too large, or the updates are forced using a mechanism like fsync(2) or sync(2). I even want syslogd's writes to /var/log/messages to be buffered until the disk happens to be spinning anyway. --apb (Alan Barrett)
Re: Patch: new random pseudodevice
On Fri, 09 Dec 2011, Thor Lancelot Simon wrote: An attacker who can break AES might be able to predict the future output of _one_ instance of the generator. An attacker who can break AES and recover the key and defeat the backtracking resistance designed into CTR_DRBG *might* be able to recover the prior outputs of the generator for that user. An attacker who can do all these things *and* recover earlier entropy-pool output from later entropy-pool output (that is, do exactly what would have had to be done to break the old design) can recover keys provided by the generator to other users. If he happens to know when exactly they were produced (time is an input to the algorithm), etc. Fair enough, but you still seem to be talking about how good a CSPRNG it is, whereas my concern is that it's pseudorandom, not random. How many different bit streams of length 2^31 can be produced by a generator that has a 128-bit key? I think it's 2^128 different pseudorandom bit streams of length 2^31. If they were truly random, then there would be 2^(2^31) of them. I still think it's not appropriate for /dev/random to output pseudorandom bits (even cryptographically secure pseudorandom bits) when it has historically output random bits (or at least attempted to output random bits, modulo bugs, design mistakes, etc.). --apb (Alan Barrett)
Re: Patch: new random pseudodevice
On Fri, 09 Dec 2011, Thor Lancelot Simon wrote: On Fri, Dec 09, 2011 at 12:14:40PM -0500, Thor Lancelot Simon wrote: Let me put it this way: before, you may have thought you were getting some kind of true randomness. You weren't. Now, you still aren't, but at least what sits between you and the entropy source is a lot more clear, and a lot better analyzed. I am not knowledgeable enough to comment on that, so I'll take your word for it. However, when applications use /dev/random, we could consider a request to be a single read from the device. This also has the appealing property that it aligns with how the underlying generator (CTR_DRBG) counts requests. That way, in practice, each read from /dev/random would get a fresh AES key -- and most application reads from /dev/random, which may block, are very small. I think that, in practice, that is about as close to meeting the expectations of the application authors as possible. I like that idea. --apb (Alan Barrett)
Re: Patch: new random pseudodevice
On Fri, 09 Dec 2011, Pawel Jakub Dawidek wrote: You are aware of the fact that 99.99% of computers don't have true random number generators and the bits you claim that are random are not random at all? They try to be unpredictable. I believe that there is a truly random component to air turbulence inside mechanical disk drives, and that some of the randomness can be harvested in timing measurements. I believe that there is a truly random component to the relationship between two uncoupled oscillators, and that some of that randomness can be harvested in timing measurements. I believe that there is a truly random component to the noise produced by an amplifier, and that some of that randomness can be harvested by an A/D converter. I believe that most computers have hardware capable of exploiting some of this randomness. I believe that this randomness is of thermodynamic and quantum origin, that it's difficult to estimate how many bits of entropy are theoretically present, and even more difficult to estimate how many bits of entropy are actually harvested. CSPRNG have two roles: turn few almost unpredictable bits that your machine can gather into many cryptographically secure pseudo-random bits and to hide those almost unpredictable bits from consumers. Yes. Returning gathered entropy directly is very, very risky. Yes. --apb (Alan Barrett)
Re: Patch: new random pseudodevice
On Thu, 08 Dec 2011, Thor Lancelot Simon wrote: The urandom device node will key the generator and output data even if the kernel entropy pool estimates that it does not have enough bits to provide an AES-128 key with full entropy. The random device node will block until sufficient bits are available from the pool to key the generator. So, /dev/urandom will never block, and each opened file descriptor from /dev/random may block the first time you read or select from it, but will not block again until it is re-keyed after 2^31 bits (or is it bytes?) of output have been generated? The previous /dev/random implementation would never give out more data than the estimated entropy in the pool, so callers could think that they were getting the highest quality possible. Callers will now get 2^31 bits of output and consume only 128 bits of entropy from the pool, so they may think that they are getting lower quality output. I have this naive idea that trying to get out more than you put in is cheating, and I think it's fine for /dev/urandom to cheat, but I am not happy about /dev/random cheating. Please could you explain where I have misunderstood. --apb (Alan Barrett)
Re: secmodel_register(9) API
On Mon, 05 Dec 2011, Elad Efrat wrote: Personally I don't care if this stays or not. All I can say is that I have not seen a single argument worthy of consideration against it and I would strongly recommend to leave it in. When you want to introduce a new feature, you should provide arguments in favour of the new feature, not merely say there are no good arguments against it. This is especially important in the case of features that have non-trivial security impact. --apb (Alan Barrett)
Re: language bindings (fs-independent quotas)
On Fri, 18 Nov 2011, Manuel Bouyer wrote: Assuming that there's no need to handle fields with embedded spaces, perl's split() function will DTRT. No, it does not because there are fields that can be empty. The common way of dealing with that is to have a placeholder like "-" for empty fields. By the way, I still haven't figured out how to test any of this quota stuff. quotaon / followed by edquota -f / does nothing (no error message, and no useful result). Using the device name /dev/cgd1a instead of the file system name / does not help. What are you trying to do? I am just trying to enable quotas so that I can test some of the quota-related commands. quotaon won't do anything if / doesn't have the userquota or groupquota keyword in the fstab, and you have to run quotacheck before quotaon. This is for ufs-quota1. I don't see that in the quotaon(8) man page. The filesystems specified must have entries in /etc/fstab and be mounted. I have that. quotaon expects each filesystem to have quota files named quota.user and quota.group which are located at the root of the associated file system. These defaults may be overridden in /etc/fstab. By default both user and group quotas are enabled. I interpreted that as by default, quotaon will just work. Anyway, when I run quotacheck, it complains: $ sudo quotacheck / quotacheck: / not found in /etc/fstab I do have an entry for / in /etc/fstab: from_mount / ffs rw,log 1 0 This is in a chroot, and the actual device name is /dev/cgd1a, but fstab doesn't know that. For ufs-quota2, quotas are enabled at newfs time, or with tunefs (with the latter this has to be done on a read-only mounted filesystem, and you have to run fsck before mounting R/W). quotaon won't do anything for ufs-quota2. The quotaon(8) man page does not say that it's only for some file system types, and does not refer to newfs, tunefs, or fsck. --apb (Alan Barrett)
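The point about empty fields and placeholders in the message above can be sketched in a few lines. This is only an illustration, written in Python rather than Perl; the column layout (user, soft limit, hard limit, grace period) and the helper name parse_row are made up for the example and are not the proposed quota format.

```python
def parse_row(line, placeholder="-"):
    """Split a whitespace-separated row; map the placeholder back to ''."""
    return ["" if field == placeholder else field for field in line.split()]


# Hypothetical columns: user, soft limit, hard limit, grace period.
row_with_gap = "alice 1000  30d"     # hard limit left blank
row_with_dash = "alice 1000 - 30d"   # hard limit marked empty with "-"

# A plain whitespace split silently drops the empty column, so the
# grace period ends up in the hard-limit position.
print(row_with_gap.split())

# With a placeholder, every row keeps the same number of columns.
print(parse_row(row_with_dash))
```

The same idea carries over to Perl's split on whitespace: the placeholder keeps the column count fixed, and the reader maps it back to an empty value.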
Re: language bindings (fs-independent quotas)
On Fri, 18 Nov 2011, David Holland wrote: The proposed standard format for quotas is an ordinary columnar text file. The reason language bindings came up is that Manuel was complaining, somewhat oddly, that it's hard to handle these in Perl. Assuming that there's no need to handle fields with embedded spaces, perl's split() function will DTRT. And actually, language bindings are probably a good thing anyway; if you have an installation with 50,000 users and you want to frob their quotas from a Perl script, forking 50,000 edquota processes is probably not the best approach. Oh my, I missed the part of the edquota man page where it says a temporary file is created for each user. Why can't it just create a single temporary file with a text table of all quotas? By the way, I still haven't figured out how to test any of this quota stuff. quotaon / followed by edquota -f / does nothing (no error message, and no useful result). Using the device name /dev/cgd1a instead of the file system name / does not help. --apb (Alan Barrett)
Re: fsync, rdiff-backup, wapbl, and WD Elements 1T drive
Matthew Mondor wrote: Greg Troxel g...@ir.bbn.com wrote: So, I'm inclined to patch rdiff-backup not to fsync, since it seems excessive, and the backup is toast if the machine crashes before it is finished -- in that case rdiff-backup just rolls back. Opinions? I also wonder why fsync would be used for every file, especially if you consider a whole run a single transaction, even more so if using snapshots (although you don't mention using them). If rdiff-backup was easily able to roll back after a crash, then I'd probably agree with the above. But it's expensive to roll back (you have to compare the actual data in the files, without assuming that {same size, same mtime} implies same data). The current state of ffs+wapbl is that, if the system crashes and the log is replayed, then files that had been written shortly before the crash end up with whatever old data happened to be in the underlying disk blocks, but new metadata indicating that the size and timestamps are all up to date. I think that this violates traditional unix file system semantics, but the people who worked on wapbl don't seem to think it's a problem. Anyway, the new metadata with old data tends to make rsync (and probably rdiff-backup) think that the file is up to date, and so not copy it again next time (unless you perform an expensive comparison of all the data, not just the metadata). I have patched rsync to issue fdatasync(2) calls frequently, to mitigate this problem in my own usage. It does slow it down, but nowhere near as dramatically as you report. (I use NetBSD-current.) --apb (Alan Barrett)
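The failure mode described above, new metadata over old data fooling a {same size, same mtime} check, can be illustrated with a small sketch. It does not reproduce a crash or the wapbl behaviour; it only fabricates two files with equal size and mtime but different contents, and shows that a metadata-only check accepts them while a full comparison does not. The helper names are invented for the example, and os.fsync stands in for the fdatasync(2) calls mentioned in the message.

```python
import filecmp
import os
import tempfile


def write_durably(path, data):
    """Write data and ask the OS to push it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def quick_check_same(a, b):
    """Metadata-only check: same size and same mtime (whole seconds)."""
    sa, sb = os.stat(a), os.stat(b)
    same_size = sa.st_size == sb.st_size
    same_mtime = sa.st_mtime_ns // 10**9 == sb.st_mtime_ns // 10**9
    return same_size and same_mtime


def full_check_same(a, b):
    """Expensive check: compare the actual file contents."""
    return filecmp.cmp(a, b, shallow=False)


def demo():
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src")
        dst = os.path.join(d, "dst")
        write_durably(src, b"new data")
        write_durably(dst, b"old junk")   # same length, different bytes
        stamp = 1_300_000_000 * 10**9     # arbitrary fixed timestamp
        os.utime(src, ns=(stamp, stamp))
        os.utime(dst, ns=(stamp, stamp))
        return quick_check_same(src, dst), full_check_same(src, dst)


print(demo())
```

The quick check is what makes incremental copies cheap, which is why stale data behind fresh metadata is costly to detect after the fact.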
Re: fs-independent quotas
On Wed, 19 Oct 2011, David Holland wrote: - the quota key is: the quota *class* the id the quota *type* - the quota value is: the configured hard limit the configured soft limit the configured grace period the current usage the current grace expiry time (if any) This seems sensible. 1. A file system type can have or not have support for quotas. If there is no support for quotas, nothing else works. 2. Any given filesystem volume may have or not have quota data on it. This is the filesystem code's problem and irrelevant to the FS-independent logic. 3. Any given filesystem volume may be mounted with or without quotas enabled. If quotas are not enabled, quota information is not available and the quota utilities will not be able to do anything. 4. Once mounted, quotas can be either on or off. As far as the FS-independent code is concerned, quotas being off means only that they aren't enforced; that is, with quotas off operations that increase usage do not fail with EDQUOT. When quotas are off, quota information can still be inspected or updated. I don't like the names on and off at level 4. They are too vague, and too easily confused with enabled or not enabled at level 3. I'd suggest these names: 1. supported or not supported by the file system format 2. present or not present in the file system backing store 3. enabled or not enabled in the mounted file system 4. enforced or not enforced for the system/user/group/file system/??? I think you might want a fs-independent API to ask the file system whether or not quotas are supported or present. I suppose getschemaname answers the present? question, but I don't see anything that would help a user interface choose whether to display a message saying quotas not supported, tough luck or quotas not enabled, would you like to enable quotas now? --apb (Alan Barrett)
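The four levels and the suggested names can be modelled as a small sketch, which also shows how a user interface might pick between the two messages mentioned above. Everything here is hypothetical: the enum, the function names and the extra message strings are invented for illustration and are not part of any proposed API. Only the two quoted messages come from the text.

```python
from enum import Enum


class QuotaState(Enum):
    NOT_SUPPORTED = "not supported by the file system format"
    NOT_PRESENT = "not present in the file system backing store"
    NOT_ENABLED = "not enabled in the mounted file system"
    NOT_ENFORCED = "enabled but not enforced"
    ENFORCED = "enabled and enforced"


def quota_state(supported, present, enabled, enforced):
    """Collapse the four levels into the first one that fails."""
    if not supported:
        return QuotaState.NOT_SUPPORTED
    if not present:
        return QuotaState.NOT_PRESENT
    if not enabled:
        return QuotaState.NOT_ENABLED
    return QuotaState.ENFORCED if enforced else QuotaState.NOT_ENFORCED


def ui_message(state):
    """Pick a user-facing message; only the first and third are from the text."""
    messages = {
        QuotaState.NOT_SUPPORTED: "quotas not supported, tough luck",
        QuotaState.NOT_PRESENT: "no quota data on this volume",
        QuotaState.NOT_ENABLED:
            "quotas not enabled, would you like to enable quotas now?",
        QuotaState.NOT_ENFORCED: "quotas are tracked but not enforced",
        QuotaState.ENFORCED: "quotas are enforced",
    }
    return messages[state]


print(ui_message(quota_state(True, True, False, False)))
```

The sketch makes the gap concrete: without a way to ask about levels 1 and 2 separately from level 3, the interface cannot tell the first three states apart.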
Re: RFC: SEEK_DATA/SEEK_HOLE implementation version 2
On Wed, 17 Aug 2011, Reinoud Zandijk wrote: after getting stuck in the 1st implementation in the rump/puffs/refuse jungle i started a new version that is more in line with the Solaris implementation and is far less invasive. Basically the system call forwards the requests using ioctl's just like Solaris and, as it turns out, also FreeBSD with their ZFS import. For simplicity and to reduce compat stuff i've used the same ioctls FreeBSD defines. FreeBSD's support is limited though; only ZFS handles them. The ioctl names are not documented yet. So, if I am reverse engineering the code correctly, the design is like this: There are no new VOP calls. There are two new ioctls, FIOSEEKDATA and FIOSEEKHOLE. Each file system may provide its own implementation. If the underlying file system doesn't support them, then they fail. There are two new lseek 'whence' flags, SEEK_DATA and SEEK_HOLE. The kernel's lseek implementation forwards them to the underlying file system using VOP_IOCTL(FIOSEEKDATA) and VOP_IOCTL(FIOSEEKHOLE). If the ioctl fails, then lseek implements the fallback behaviour of treating the file as a single data region followed by a hole after the end of file. I think that it would be better to implement the fallback behaviour in the vfs layer rather than in the lseek syscall. --apb (Alan Barrett)
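The fallback behaviour described above, one data region followed by a hole at end of file, is simple enough to sketch. This is a user-space model in Python, not kernel code: the constants and the function fallback_seek are invented for the example, and the use of ENXIO for offsets at or beyond end of file is an assumption based on the usual Solaris-style semantics rather than something stated in the message.

```python
import errno

# Stand-ins for the two new lseek 'whence' values (illustrative only).
SEEK_DATA = "SEEK_DATA"
SEEK_HOLE = "SEEK_HOLE"


def fallback_seek(whence, offset, file_size):
    """Model a file as one data region [0, file_size) and a hole at EOF."""
    if offset < 0:
        raise OSError(errno.EINVAL, "negative offset")
    if offset >= file_size:
        # Assumed behaviour: nothing to find at or past end of file.
        raise OSError(errno.ENXIO, "offset at or beyond end of file")
    if whence == SEEK_DATA:
        return offset        # already inside the single data region
    if whence == SEEK_HOLE:
        return file_size     # the only hole starts at end of file
    raise ValueError("unsupported whence: %r" % (whence,))


print(fallback_seek(SEEK_DATA, 10, 100))
print(fallback_seek(SEEK_HOLE, 10, 100))
```

Wherever this logic lives, callers of the ioctls and callers of lseek see the same answers only if both paths go through it, which is the argument for placing it below the syscall.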
Re: RFC: SEEK_DATA/SEEK_HOLE implementation version 2
On Mon, 03 Oct 2011, Reinoud Zandijk wrote: On Mon, Oct 03, 2011 at 08:33:06AM +0200, Alan Barrett wrote: I think that it would be better to implement the fallback behaviour in the vfs layer rather than in the lseek syscall. I tried that before and it was in my original patch. I changed the VOP_SEEK() to accept the other two `whence' argument values. VOP_SEEK()'s prototype had to be extended resulting in severe compatibility issues with puffs/rump/(re)fuse etc. resulting in a HUGE patchset. Also, externally maintained code like ZFS had to be changed. Your original patch did that in VOP_SEEK, yes. I think that was a bad idea, and that's not what I am suggesting. When I suggested implementing the fallback behaviour in the vfs layer, I meant in the vfs layer's handling of the new FIOSEEKHOLE and FIOSEEKDATA ioctls. This would mean that users of the new lseek flags, and users of the new ioctls, would both get the fallback behaviour that, if the underlying file system doesn't know better, a file appears to have a single data region followed by a hole after EOF. Does this answer your question? Not really, but I see that my suggestion was unclear. I hope it's more clear now. --apb (Alan Barrett)
Re: A simple cpufreq(9)
On Sun, 25 Sep 2011, Jukka Ruohonen wrote: So here is a quick draft for the first iteration with the cpuctl(8). If there are issues, speak now, otherwise I'll proceed with something based on this. You forgot to include the documentation. --apb (Alan Barrett)
Re: core's decision on modular kernels
On Wed, 21 Sep 2011, Martin S. Weber wrote: On Wed, Sep 21, 2011 at 07:55:38AM +0200, Alan Barrett wrote: - A port's MONOLITHIC kernel should include features that traditionally would have been present in a non-modular GENERIC kernel, and it may or may not include options MODULAR, at the portmaster's discretion. Huh? Would it be possible please to get a more detailed rationale behind allowing options MODULAR in a MONOLITHIC kernel, if all ports using modules already offer MODULAR and GENERIC? The main difference between MODULAR and MONOLITHIC would be that MONOLITHIC has built-in support for almost everything considered stable and useful, whereas MODULAR might expect to load a lot of modules at run time. MONOLITHIC might still not have absolutely everything built-in, and options MODULAR allows it to load additional modules at run time, if the portmaster decides that this would be useful. I use a MONOLITHIC kernel with options MODULAR to allow loading of a module that contains the root file system as an md(4) image. --apb (Alan Barrett)
core's decision on modular kernels
Dear NetBSD users,

The NetBSD core group has discussed the questions presented to us about the situation with modules and modular kernels. We understand that there are problems with modularization on all the platforms, especially on amd64, and we have seen a lot of breakage due to them in the past years. As core we believe that ultimately the ability to build modular kernels is the way to go and that by reverting a lot of the modularization on head we limit its testing making it harder to become mainstream. On the other hand, we should always provide a safe way for people to build and release kernels.

On the positive side:
- Modules can speed up kernel development because they eliminate many reboots by simply loading and unloading the module during each development cycle.
- Modules can conserve kernel memory in memory shortage situations.
- Modules can be used to add/remove/replace functionality on the fly.

On the negative side:
- Many of our modules are half baked (don't work correctly as modules, don't specify the right dependencies, or cannot be unloaded).
- Our module separation is not good (try compiling a kernel with only COMPAT_30 and all the rest of the compat code as modules; for now all that works is the all or nothing approach).
- Modules don't work on all platforms. Some platforms don't have a need for them because their hardware is fixed, but modules could still be used for software features (compat code, emulations).
- We don't have an easy way to group a kernel and its associated modules together, so that it's possible to have multiple bootable kernels, and multiple associated sets of modules, even if the kernels all share the same version number.
- We don't have a stable kernel ABI so that modules are reusable across different kernel versions.
- We don't have a way to tell from the kernel config file whether a feature can be used in a module form or not. (Perhaps comments or additional config(1) syntax could be used for this.)

Accordingly, we propose the following policy for the immediate future. We expect that it will be appropriate to re-evaluate this policy as the state of modular support changes later.

- All ports using modules should provide all three of MODULAR, MONOLITHIC, and GENERIC kernels.
- At the portmaster's discretion, options MODULAR may be made the default by adding it to the port's std.machine configuration file. (A kernel without the MODULAR option cannot load any modules, not even through the modload(8) command.)
- A port's MONOLITHIC kernel should include features that traditionally would have been present in a non-modular GENERIC kernel, and it may or may not include options MODULAR, at the portmaster's discretion.
- A port's MODULAR kernel may lack many built-in features, expecting them to be loaded from modules at run time. However, all features that are necessary for the standard MODULAR kernel to boot and work reasonably must be built-in. This includes:
  * common file systems, including all file systems that can be the root file system, and also including nullfs and tmpfs;
  * disk devices that can contain the root file system;
  * common network devices;
  * exec support for the native ELF format, and for scripts (not necessarily for a.out, ECOFF, or compat formats);
  * core dump support.
  Users or developers may of course comment out relevant lines if they want to load these items as modules.
- The GENERIC kernel should be based on either MODULAR or MONOLITHIC, using an include directive. The GENERIC kernel should include options MODULAR, even if it is based on a MONOLITHIC kernel that does not include options MODULAR.
- A port may not set GENERIC = MODULAR if it lacks an easy way to group a kernel and its associated modules together. Because no existing ports have this feature, no existing ports may set GENERIC = MODULAR.

Alan Barrett
On behalf of the NetBSD core group
Re: RFC: New security model secmodel_securechroot(9)
On Sat, 09 Jul 2011, Aleksey Cheusov wrote: · Adding and enabling a ppp(4) interface is not allowed. · Adding and enabling a sl(4) interface is not allowed. · Adding and enabling a strip(4) interface is not allowed. · Adding and enabling a tun(4) interface is not allowed. · Adding and enabling a bcsp(4) device is not allowed. · Adding and enabling a btuart(4) device is not allowed. Can this be generalised to adding and enabling any kind of network interface is not allowed? --apb (Alan Barrett)
Re: mutexes, locks and so on...
Please could somebody on the "eat your CAS whether you like it or not" side of the fence explain why the following idea would not work: On Sat, 13 Nov 2010, der Mouse wrote: Consider this hypothetical: x86 does #define ATOMIC_OPS_USE_CAS and defines a CAS(); MI code notices this and defines all the higher-level primitives (if that's not too much of an oxymoron) in terms of CAS(). ppc, arm, all the arches sufficiently modern to have CAS, likewise. Arches without a sufficiently general CAS[%] do not define ATOMIC_OPS_USE_CAS and provide their own implementations of mutexes, spinlocks, whatever. --apb (Alan Barrett)
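The idea of defining higher-level primitives in terms of a single CAS() can be sketched as follows. This is a user-space simulation in Python, not kernel code: the Cell class fakes hardware compare-and-swap with an internal lock, and atomic_add, SpinLock and run_counter_demo are hypothetical names invented for the example. Only the layering is the point: everything above Cell uses nothing but load() and cas().

```python
import threading


class Cell:
    """One word of memory with a simulated compare-and-swap."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()  # stands in for hardware atomicity

    def load(self):
        with self._guard:
            return self._value

    def cas(self, expected, new):
        """Store new only if the cell still holds expected."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def atomic_add(cell, delta):
    """Atomic add built only from load() and cas(): retry on conflict."""
    while True:
        old = cell.load()
        if cell.cas(old, old + delta):
            return old + delta


class SpinLock:
    """Spin lock built only from cas(): 0 means free, 1 means held."""

    def __init__(self):
        self._cell = Cell(0)

    def try_acquire(self):
        return self._cell.cas(0, 1)

    def acquire(self):
        while not self.try_acquire():
            pass  # busy-wait until the holder releases

    def release(self):
        self._cell.cas(1, 0)


def run_counter_demo(n_threads, n_iters):
    """Several threads increment one shared cell via atomic_add."""
    counter = Cell(0)

    def worker():
        for _ in range(n_iters):
            atomic_add(counter, 1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.load()


print(run_counter_demo(4, 500))
```

In the hypothetical from the message, an architecture without a general CAS would skip this layer and supply its own versions of the lock and counter primitives instead.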
Re: XIP
On Mon, 25 Oct 2010, Masao Uebayashi wrote: I think the uebayasi-xip branch is ready to be merged. This branch implements a preliminary support of eXecute-In-Place; execute programs directly from memory-mappable devices without copying files into RAM. This benefits mainly resource restricted embedded systems to save RAM consumption. Would memory disks (such as md(4)) also benefit from XIP, or do they already do something to avoid having multiple copies of the same data? --apb (Alan Barrett)
Re: XIP
On Tue, 26 Oct 2010, Alan Barrett wrote: Would memory disks (such as md(4)) also benefit from XIP, or do they already do something to avoid having multiple copies of the same data? Never mind. I see you discuss this in section 11.6 of the paper. --apb (Alan Barrett)
Re: 16 year old bug [with non-contiguous netmasks]
On Mon, 23 Aug 2010, Christoph Egger wrote: [OpenBSD] commit message: Fix a 16 year old bug in the sorting routine for non-contiguous netmasks. I suggest removing support for non-contiguous netmasks. They are unusable with CIDR (introduced in 1993 in RFCs 1517, 1518, and 1519). Even RFC 950 (August 1985) recommended that subnet bits should be contiguous. --apb (Alan Barrett)
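For readers unsure what separates a contiguous netmask from a non-contiguous one, here is a minimal sketch of the test for IPv4. The function names are invented for the example. A mask is contiguous when its 1 bits form a single run from the most significant bit, which is the only shape a CIDR prefix length can express.

```python
import ipaddress


def is_contiguous_netmask(mask):
    """True if the 32-bit mask is a run of 1 bits followed only by 0 bits."""
    inverted = ~mask & 0xFFFFFFFF
    # A contiguous mask inverts to 2**k - 1, and x & (x + 1) == 0
    # holds exactly for numbers of that form.
    return (inverted & (inverted + 1)) == 0


def mask_from_text(text):
    """Convert dotted-quad text such as '255.255.255.0' to an integer."""
    return int(ipaddress.IPv4Address(text))


for text in ("255.255.255.0", "255.255.240.0", "255.0.255.0"):
    print(text, is_contiguous_netmask(mask_from_text(text)))
```

A mask such as 255.0.255.0 fails the test, so it has no /n equivalent and cannot be written in CIDR notation.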
deprecating #define'd sysctl OIDs
On Sun, 15 Aug 2010, Jean-Yves Migeon wrote: It might make sense to add comments near all existing lists of hard-wired sysctl OID values asking people not to add more of them. Shall it be added for all other archs then? I assume that they can all benefit from the dynamic sysctl(9) interface? If we do this at all, then we should do it for all lists of sysctl OID values. Several of them are in sys/sysctl.h, and I am sure there are more scattered around. I don't see the point of doing it only for CPU_* definitions. All three of the sysctl(3), sysctl(7), and sysctl(9) man pages could also be improved, to make it more clear that new code can (should?) use dynamic allocation instead of #define'd OID values. --apb (Alan Barrett)
Re: Adding 'i386_use_pae' variable, and expose it through sysctl
On Sun, 15 Aug 2010, Thor Lancelot Simon wrote: You can't do it for existing OIDs, that breaks binary compatibility. Yes, obviously. My suggestion was about adding comments and documentation to discourage new OIDs from being added in the old way. --apb (Alan Barrett)
Re: Forcing a serial console for the kernel
On Sun, 28 Mar 2010, STEPHEN JONES, W0TTY wrote: If your system has serial BIOS, it is probably hiding the first serial port from the bootblocks so they don't automatically detect it. This is a change you need to make to the bootblocks -- not the kernel. Try installboot (possibly with -e depending on your application) -o console=com0 -o ioaddr=0x3f8 -o speed=9600. The ioaddr= option forces the bootblocks to detect the serial port even though the BIOS claims it's not there. Unfortunately this is a 2.0 (GOJU.RYU.COM) system which does not seem to have the ioaddr option for installboot. If you installed or upgraded from a CD, then the installboot command on the CD will have the options you need. If you built netbsd-5 from source, then ${TOOLDIR}/bin/nbinstallboot will be a version of installboot that runs under the existing system but supports the options found in netbsd-5's installboot. You'll also need new boot blocks, from ${DESTDIR}/usr/share/mdec (or from a netbsd-5 install CD). --apb (Alan Barrett)
[no subject]
On Mon, 15 Mar 2010, Aleksej Saushev wrote: While here, can anyone enlighten us how one boots NetBSD so that it looks for modules in non-default directory? You can't, and the people who want NetBSD to move to modular kernels don't seem to care. Until this problem is fixed, I will try to avoid using modular kernels. --apb (Alan Barrett)