Re: Kernel work that needs help?
Maybe have a look at our Problem Report (PR) database.

 - Hubert

> On 07.04.2024 at 22:00, Ice Cream wrote:
>
> I would like to contribute to development and am looking
> for ongoing kernel dev work that needs help.
>
> I worked on one of the projects (fullfs) from the project list
> and am looking for more. Most projects on the list are years
> old or not with clear definition or not of my general interest.
>
> I would like to join projects that are being actively worked
> on and have well defined milestones for someone relatively
> inexperienced. So, if you're working on something awesome
> then let me know :-)
>
> -- IC
Re: "Boot this kernel once" functionality? (amd64)
On Wed, 16 Sep 2020, Anthony Mallet wrote:
> I was wondering how easy that would be to add a "boot once" feature to

FWIW: the Amiga port had a /dev/reload in its very early days, to which you could cp a kernel, which would then be booted. I'm not sure this is easy to do on a PC platform.

 - Hubert
Documenting the kern.sched sysctl tree [patch]
Hi,

currently the kern.sched sysctl tree is not documented at all. The patch below changes this. Changes, updates and suggestions welcome!

Here's what the formatted text looks like, for easier feedback:

kern.sched (dynamic)
     Influence the scheduling of LWPs, their prioritization and how they
     are distributed on and moved between CPUs.

     Third level name              Type      Changeable
     kern.sched.cacheht_time       integer   yes
     kern.sched.balance_period     integer   yes
     kern.sched.average_weight     integer   yes
     kern.sched.min_catch          integer   yes
     kern.sched.timesoftints       integer   yes
     kern.sched.kpreempt_pri       integer   yes
     kern.sched.upreempt_pri       integer   yes
     kern.sched.maxts              integer   yes
     kern.sched.mints              integer   yes
     kern.sched.name               string    no
     kern.sched.rtts               integer   no
     kern.sched.pri_min            integer   no
     kern.sched.pri_max            integer   no

     The variables are as follows:

     kern.sched.cacheht_time (dynamic)
             Cache hotness time in which a LWP is kept on one particular
             CPU and not moved to another CPU. This reduces the overhead
             of flushing and reloading caches. Defaults to 3ms. Needs to
             be given in ``hz'' units, see mstohz(9).

     kern.sched.balance_period (dynamic)
             Interval in which the CPU queues are checked for
             re-balancing. Defaults to 300ms. Needs to be given in
             ``hz'' units, see mstohz(9).

     kern.sched.average_weight (dynamic)
             Can be used to influence how likely LWPs can be migrated
             from one CPU's queue of LWPs that are ready to run to a
             different, idle CPU. The value gives the percentage for
             weighting the average count of migratable threads from the
             past against the current number of migratable threads.
             Small values give more weight to the past, big values more
             weight to the current situation. Defaults to 50 and must be
             between 0 and 100.

     kern.sched.min_catch (dynamic)
             Minimum count of migratable (runnable) threads for catching
             (stealing) from another CPU. Defaults to 1 but can be
             increased to decrease the chance of thread migration
             between CPUs.

     kern.sched.timesoftints (dynamic)
             This switch enables tracking of CPU time for soft
             interrupts as part of a LWP's real execution time. Set to a
             non-zero value to enable, and see ps(1) for printing CPU
             times.

     kern.sched.kpreempt_pri (dynamic)
             Minimum priority to trigger kernel preemption.

     kern.sched.upreempt_pri (dynamic)
             Minimum priority to trigger user preemption.

     kern.sched.maxts (dynamic)
             Scheduler specific maximal time quantum (in milliseconds).
             Must be set to a value larger than ``mints'' and between 10
             and ``hz'' as given by the ``kern.clockrate'' sysctl.
             Provided by the M2 scheduler.

     kern.sched.mints (dynamic)
             Scheduler specific minimal time quantum (in milliseconds).
             Must be set to a value smaller than ``maxts'' and between 1
             and ``hz'' as given by the ``kern.clockrate'' sysctl.
             Provided by the M2 scheduler.

     kern.sched.name (dynamic)
             Scheduler name. Provided both by the M2 and the 4BSD
             scheduler.

     kern.sched.rtts (dynamic)
             Fixed scheduler specific round-robin time quantum in
             milliseconds. Provided both by the M2 and the 4BSD
             scheduler.

     kern.sched.pri_min (dynamic)
             Minimal POSIX real-time priority. See sched(3).

     kern.sched.pri_max (dynamic)
             Maximal POSIX real-time priority. See sched(3).

 - Hubert

Index: sysctl.7
===
RCS file: /cvsroot/src/share/man/man7/sysctl.7,v
retrieving revision 1.104
diff -u -r1.104 sysctl.7
--- sysctl.7 17 Nov 2016 01:22:00 - 1.104
+++ sysctl.7 1 Jan 2017 22:19:46 -
@@ -355,7 +355,7 @@
 .It kern.rtc_off
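The ``hz'' units required by kern.sched.cacheht_time and kern.sched.balance_period can be illustrated with a small conversion sketch. This is only an assumption-laden illustration: ms_to_ticks() is a hypothetical helper using a plain truncating formula, not the kernel's mstohz(9), which may differ in rounding and overflow handling.

```c
#include <assert.h>

/*
 * Hypothetical helper (not the kernel's mstohz(9)): convert milliseconds
 * to clock ticks for a tick rate of `hz` ticks per second.
 * The result is truncated toward zero.
 */
unsigned int
ms_to_ticks(unsigned int ms, unsigned int hz)
{
	assert(hz > 0);
	/* Widen before multiplying so large ms values do not overflow. */
	return (unsigned int)(((unsigned long long)ms * hz) / 1000u);
}
```

Under this formula and hz=100, 300ms maps to 30 ticks, while 3ms truncates to 0 ticks; the hz value in effect is the one reported by the kern.clockrate sysctl.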
Re: CPUs and processes not evenly assigned?!
> On 22.12.2016 at 11:09, Michael van Elst wrote:
>
>> What are we waiting for here, how should we go on?
>> Would you want to commit the latest patch and allow it to get more wide-spread
>> testing in -current?
>
> anyone who has time to _test_ it?
>
> Would be nice if someone looks at performance for different use cases
> and possibly different settings of the knobs :)

What use cases exactly do you have in mind?
I guess we can spec out things and then see if we can find someone to help with the tests.

It would be a pity to have a solution sitting around while our operating system still lacks it because we can't move things forward. :)

 - Hubert
Re: CPUs and processes not evenly assigned?!
Hi,

> On 26.11.2016 at 11:09, Hubert Feyrer wrote:
>
> I see a lot of ground for more research here, determining the right number of
> bits and A and B. To sum up our options at this point:
>
> a) leave the situation as-is and wait for research to get a perfect formula
> b) commit the patch we have and wait for the research to be done
>
> Given that the existing patch in PR kern/43561 and PR kern/51615 does improve
> the current situation, I'd vote for option "b".
>
> Any takers? Any objections?

What are we waiting for here, how should we go on?
Would you want to commit the latest patch and allow it to get more wide-spread testing in -current?

 - Hubert
Re: CPUs and processes not evenly assigned?!
On Sun, 27 Nov 2016, Michael van Elst wrote:
> I've made a new patch:
> http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/kern_runq.diff

FWIW, I've made a test run with two CPUs, and with my original test case of two concurrent CPU-hogs, this works fine (just like the previous tests).

> -> it uses 8 guard bits (multiply by 256 instead of 2).
> -> it lets you configure the moving average factor in percent (default is 50% like now) with a sysctl.

I tried playing with these, but I didn't see a difference. I didn't fully get the meaning of this. As far as I understand, one extreme (1? 99?) gives much weight to the previous value and avoids switching short-handedly, the other one puts weight on newly occurring processes on a CPU and favors switching. But I'm not sure which is which, and if this is true at all.

Yet another magic knob for our scheduler ;)

Where to go from here?

 - Hubert
Re: CPUs and processes not evenly assigned?!
On Fri, 11 Nov 2016, Michael van Elst wrote:
> Since we don't have floating point the computation should be done in fixed
> point arithmetic, e.g.
>
>     r_avgcount = (A * r_avgcount + B * INT2FIX(r_mcount)) / (A + B);
>
> With the current A=B=1 you get alpha=0.5, but other values are thinkable to
> make the balancer decide on the short term thread count or an even longer
> term moving average.
>
> Using one fractional bit for INT2FIX by multiplying by two might not be
> enough.

I see a lot of ground for more research here, determining the right number of bits and A and B. To sum up our options at this point:

a) leave the situation as-is and wait for research to get a perfect formula
b) commit the patch we have and wait for the research to be done

Given that the existing patch in PR kern/43561 and PR kern/51615 does improve the current situation, I'd vote for option "b".

Any takers? Any objections?

 - Hubert
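The fixed-point formula quoted above can be sketched in plain userland C. This is a minimal illustration and not the kernel code: FIX_SHIFT, FIX2INT() and avg_update() are names assumed here (the mail only names INT2FIX), and 8 fractional bits are an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define FIX_SHIFT	8	/* assumed number of fractional bits */
#define INT2FIX(x)	((uint32_t)(x) << FIX_SHIFT)
#define FIX2INT(x)	((uint32_t)(x) >> FIX_SHIFT)

/*
 * Weighted moving average in fixed point:
 *   avg' = (A * avg + B * INT2FIX(mcount)) / (A + B)
 * `avg` stays in fixed point between calls; `mcount` is a plain count.
 */
uint32_t
avg_update(uint32_t avg, uint32_t mcount, uint32_t a, uint32_t b)
{
	assert(a + b > 0);
	return (a * avg + b * INT2FIX(mcount)) / (a + b);
}
```

With A=B=1 and a steady count of 1, the stored average goes 0, 128, 192, ... in units of 1/256, so it stays distinguishable from an idle queue. A larger A weights the history more heavily, a larger B the current count.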
Re: Audio - In kernel audio mixing
On Wed, 23 Nov 2016, Nathanial Sloss wrote:
> ftp.NetBSD.org/pub/NetBSD/misc/nat/nextaudio5-kern.gz

ftp://ftp.NetBSD.org/pub/NetBSD/misc/nat/nextaudio5-kern.diff.gz

 - Hubert
Re: CPUs and processes not evenly assigned?!
On Thu, 10 Nov 2016, Michael van Elst wrote:
>>> Can you try this?
>> Tried it, and it seems to help with both 2 and 4 CPUs. Codebase: -current
> Great, so the problem was just the rounding error.

Yes.

I've also played with it a bit more, and the "stealing" of processes between CPUs seems to decrease as the factor applied to r_mcount grows. E.g. with the 4* in the patch below I see less movement of processes on a 4-CPU system.

Maybe this should be made a sysctl or depend on the number of CPUs?

 - Hubert

--- sys/kern/kern_runq.c.orig 2016-11-10 21:11:10.0 +
+++ sys/kern/kern_runq.c
@@ -534,7 +534,9 @@ sched_balance(void *nocallout)
 		ci_rq = ci->ci_schedstate.spc_sched_info;
 
 		/* Average count of the threads */
-		ci_rq->r_avgcount = (ci_rq->r_avgcount + ci_rq->r_mcount) >> 1;
+		ci_rq->r_avgcount = (ci_rq->r_avgcount
+			+ 4*ci_rq->r_mcount
+			) /2;
 
 		/* Look for CPU with the highest average */
 		if (ci_rq->r_avgcount > highest) {

 - Hubert
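To illustrate the kind of rounding error mentioned above: with pure integer arithmetic the original averaging step truncates, so a steady count of one migratable thread never lifts the average above zero. The sketch below is a userland illustration only; avg_orig() and avg_scaled() are hypothetical stand-ins for the two expressions in the patch, not kernel functions.

```c
/* Original expression: truncating integer average of old value and count. */
unsigned int
avg_orig(unsigned int avg, unsigned int mcount)
{
	return (avg + mcount) >> 1;
}

/* Patched expression with a scaling factor on mcount (4 in the patch). */
unsigned int
avg_scaled(unsigned int avg, unsigned int mcount, unsigned int factor)
{
	return (avg + factor * mcount) / 2;
}
```

Starting from 0 with a count of 1, avg_orig() stays at 0 forever, while avg_scaled() with factor 4 moves to 2 and then settles at 3.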
Re: CPUs and processes not evenly assigned?!
On Wed, 9 Nov 2016, Michael van Elst wrote:
> Can you try this?

Tried it, and it seems to help with both 2 and 4 CPUs. Codebase: -current

top(1):

load averages:  2.00,  2.00,  1.96;    up 0+01:21:47    13:38:27
25 processes: 1 runnable, 22 sleeping, 2 on CPU
CPU0 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
CPU1 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
Memory: 48M Act, 6728K Wired, 11M Exec, 31M File, 636M Free
Swap: 512M Total, 512M Free

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
  909 feyrer    28    0    11M 1260K CPU/1     78:00 99.02% 99.02% sh
   66 feyrer    28    0    11M 1260K RUN/1     77:59 99.02% 99.02% sh
  713 feyrer    43    0    16M 1860K CPU/0      0:00  0.00%  0.00% top

 - Hubert
Re: CPUs and processes not evenly assigned?!
On Sat, 5 Nov 2016, Hubert Feyrer wrote:
> Is this expected behaviour? Definitely surprised me! :)

FWIW, the same behaviour seems to happen on both netbsd-7/amd64 and -current/amd64, both as of today.

I've put a screenshot here that shows the issue:
http://www.feyrer.de/Misc/priv/bad-scheduling-7.0_STABLE+7.99.42.png
(The right side is 7.0_STABLE, the left is -current, as can be seen at the top.)

 - Hubert
CPUs and processes not evenly assigned?!
Hi,

I had to run a CPU-bound program today on a 2 CPU Xen (AWS) instance. I expected each process to run on its own CPU, but what really happened is that both processes were fighting for the first CPU, leaving the second one idle.

NetBSD 7.0/xen with DOMU from Amazon AWS, more details here:
http://www.feyrer.de/NetBSD/blog.html/nb_20161105_1754.html

Is this expected behaviour? Definitely surprised me! :)

 - Hubert
Re: A Library for Converting Data to and from C Structs for Lua
On Sun, 17 Nov 2013, Marc Balmer wrote:
> I plan to import it and to make it available to both lua(1) and lua(4)

I wonder if we really need to get all this into NetBSD, instead of moving it to pkgsrc somehow.

 - Hubert
Re: zero-length symlinks
On Sat, 2 Nov 2013, David Holland wrote:
>> I think "not sensible" is not a good enough reason to prohibit
>> something.
> Yeah yeah, but still nowadays we don't allow adding hard links to
> directories. So while that's a valid premise, it's not universal.

FWIW, the idea behind not allowing hard links to directories is that ".." wouldn't be unique any more. I don't see such a problem with a symlink pointing to "".

 - Hubert
Re: zero-length symlinks
> Does anyone see any reason they shouldn't be?

If it ain't broken don't fix it?

 Hubert
Re: Graceful USB disk detach/reattach
Hi,

On 10.04.2013 at 22:00, Ujjwal Thaakar wrote:

> Hi, my name is Ujjwal Thaakar and I'm a 3rd year CS student from India.
> This year I'm applying for GSoC and am interested in implementing the project
> mentioned in the subject. I wanted to know whether it is feasible for me to
> apply for something like this since I'm an undergrad. I do have good
> programming skills but little experience in kernel development. Currently I'm
> working on implementing sigpid on Minix3 and this is the only experience I
> have with kernel development.

This is a hard question for us to answer, as we do not know you. Please take the time to read through the following web page, and answer the questions there. With your answers to them, there is a fair chance that we can give you a realistic answer.

Guidelines to apply for a project:
http://wiki.netbsd.org/projects/application/

 - Hubert
Re: how we document kernel software architecture
On Thu, 28 Feb 2013, Jochen Kunz wrote:
>> I wonder if it belongs in section 9, or as a document in the source
>> tree, or perhaps just some huge comments.
> In an ideal world it would be covered by The NetBSD Guide:
> http://www.netbsd.org/docs/guide/en/
> The chapters "VI. NetBSD User Land Programming" and "VII. NetBSD Kernel
> Programming" seem to be the logical follow-on to chapter "V. Building
> the system". The Guide is the right place to draw the big picture,
> leaving the details to the man pages.

FWIW, there's also the NetBSD Internals Guide:
http://www.netbsd.org/docs/internals/en/index.html

 - Hubert
Re: netbsd internals
>>> Could someone get me a link that gives pointers to the above?

Probably not too recent, but maybe good for a start:
http://www.netbsd.org/docs/internals/en/

 - Hubert
Re: Question about i386 CDs (bootxx_cd9660/boot-big.fs)
On Sun, 29 Jan 2012, Evgeniy Ivanov wrote:
> What is used in i386 CDs? bootxx_cd9660 + boot from stand, right?

Yes. See the prepare-target in src/distrib/common/Makefile.bootcd

 - Hubert
Re: Question about i386 CDs (bootxx_cd9660/boot-big.fs)
On Sat, 28 Jan 2012, Evgeniy Ivanov wrote:
> I experiment with NetBSD boot stuff for i386. Main question is what is
> boot-big.fs and how does it differ from bootxx_cd9660?
>
> With this
> 'mkisofs -o test1.iso -b bootxx_cd9660 -no-emul-boot -c boot.catalog -l -J -R -allow-leading-dots ./cdreleasefiles/'
> and cdreleasefiles containing:
>   boot
>   boot.cfg
>   bootxx_cd9660
>   modules
> I'm able to boot. Though boot doesn't read boot.cfg, it loads a kernel
> from modules without issues (I have to go to the prompt).
>
> When I use boot-big.fs instead I get a nice colorful useless picture on
> booting. Without "-no-emul-boot" (like in the manual) I fail to create an
> image:
> Size of boot image is 7200 sectors -> genisoimage: Error - boot image
> 'cdreleasefiles/boot-big.fs' has not an allowable size.

boot-big.fs is a complete 2.88MB floppy image that has everything needed to boot NetBSD. bootxx_cd9660 is just a bootloader for the ISO 9660 format.

 - Hubert
WARNING: couldn't open cd9660 (after loading kernel, before loading ramdisk), or: how do I put a file-system into the kernel (not as module)
I'm toying with -current, and wonder what the following error when booting from an ISO means (i386):

> load miniroot.kmod
> boot
6122760+454896+292596 [370528+368073]=0x743260
>WARNING: couldn't open cd9660 (/stand/i386/5.99.56/modules/cd9660/cd9660.kmod)
Loading miniroot.kmod
WARNING: 1 module failed to load
Copyright (c) 1996, 1997, ...

I have the problem that -current doesn't read my boot.cfg at all (read returns 0 bytes; the ISO seems to have the right file, as indicated by some debug printfs).

I don't understand why the kernel (or who?) wants to load cd9660.kmod. Is there a way to build the CD9660 file-system into the kernel, while keeping the rest modular (or at least loading the ramdisk)? How does one hardwire modules into the kernel in -current kernel configs?

 - Hubert
Re: bumping ARG_MAX
On Mon, 14 Nov 2011, Simon Burge wrote:
> I think I like the Linux idea of a portion of stack size best

What is the stack size? Is it what ulimit(1) gives me? The 2kB there seem pretty small for the problem at hand, and I can raise it to 64kB at most.

 - Hubert
Re: bumping ARG_MAX
On Sun, 13 Nov 2011, David Holland wrote:
> thoughts?

Here's an interesting comparison:
http://www.in-ulm.de/~mascheck/various/argmax/#results

 - Hubert
nfs_lookup() panic, again
Hello David,

we talked about an NFS panic in nfs_lookup() some time ago:

panic: kernel diagnostic assertion "dvp != *vpp" failed: file ".../sys/nfs/nfs_vnops.c", line 826

The setup is NetBSD-current/i386 in VMware Fusion with sources mounted via NFS from a Mac OS X NFS server. The panic occurs repeatedly when doing a full build.

I've had a look at ddb and gdb, but can't really make a lot of sense of that:

http://www.feyrer.de/Misc/priv/nfs-panic/nfs_lookup-panic-msg+vnode.png has the panic message (from ddb "dmesg"), and an attempt to dig into the vnodes pointed to by dvp and *vpp. The source of nfs_vnops.c was modified for those printfs, it's at www.feyrer.de/Misc/priv/nfs-panic/nfs_vnops.c

I also have a stack backtrace from ddb, which is at
http://www.feyrer.de/Misc/priv/nfs-panic/nfs_lookup-panic-bt.png

I've tried to look into the kernel crash dump with gdb, but that has the stack messed up, see the screenshot at
http://www.feyrer.de/Misc/priv/nfs-panic/nfs_lookup-panic-gdb.png

Do you (or anyone else on tech-kern?) have an idea on where to go next? I'm not familiar with vnodes...

Thanks!

 - Hubert
Re: Using coccinelle for (quick?) syntax fixing
On Thu, 12 Aug 2010, Bernd Ernesti wrote:
>> if (dev_priv == NULL) {
>>         DRM_ERROR("called with no initialization\n");
>> -       DRM_SPINUNLOCK(&dev_priv->cs.cs_mutex);
>> ...
> Hmm, you didn't mention why you are doing that in your initial mail.

Use of a pointer after determining it's NULL, in this case dev_priv (it _was_ in one of the prior mails).

 - Hubert
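The bug class being fixed can be sketched in plain C. Everything below is a hypothetical stand-in for illustration (struct priv, its lock_held member and finish_work() are made up here); the actual DRM code is not reproduced.

```c
#include <stddef.h>

/* Hypothetical stand-in for the driver's private state. */
struct priv {
	int lock_held;
};

/*
 * Fixed pattern: once the pointer is known to be NULL, bail out without
 * touching any of its members. The buggy pattern would access a member
 * such as p->lock_held inside the NULL branch, dereferencing NULL.
 */
int
finish_work(struct priv *p)
{
	if (p == NULL) {
		/* Do not touch p->... here: there is no object. */
		return -1;
	}
	p->lock_held = 0;	/* safe: p was checked above */
	return 0;
}
```

The point of removing the unlock call from the NULL branch is the same: the lock lives inside the object that was just found to be missing.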
system hangs with slow disk - how does FS locking work?
Hi,

I have a file system on very slow storage (eeprom-based). Access to the underlying eeprom happens in chunks of a few hundred bytes, yet when I access the storage and copy some kB of data to it, the whole system hangs for many seconds.

The system uses a kernel thread to write the data passed from the eeprom-"disk"-driver to the eeprom. Communication between the driver and the kernel thread is synchronized with a mutex variable, and the kernel thread is marked as MP_SAFE.

I'm trying to understand where the hanging of the system comes from: My assumption is that the "big" data block written is divided into several small chunks (disk blocks, 512 bytes), which will then be written to disk sequentially, with the system preempting if there's other work to do. The preemption would further delay the writes, but write speed is not an issue. Hanging of the whole system's userland while I/O is going on is, though.

I haven't dived into the file system code (yet), but maybe someone familiar with it can give me an answer to the following question: Is it possible that the kernel / file system layer locks the whole system until *all* blocks are written to disk, instead of doing writes in small chunks?

FWIW, I currently use a msdos filesystem on the storage (due to less overhead than FFS), codebase is NetBSD 5.

Thanks!

 - Hubert
Re: [gsoc] syscall/libc fuzzer proposal
On Sat, 20 Mar 2010, Mateusz Kocielski wrote:
> ...your ideas?

Reminds me of 1991's crashme: http://crashme.codeplex.com/

The idea sounds more like a research project to me...

 - Hubert
Re: Proposal for adding fsx(8) to base system
On Sun, 24 Jan 2010, o...@linbsd.org wrote:
> Fsx is a filesystem exerciser that is used to stress filesystem code.
> I would like to propose importing fsx into the base systems, or perhaps
> pkgsrc. The intent is to import
> ftp://ftp.netbsd.org/pub/NetBSD/misc/ober/fsx/ to src/usr.sbin.

Sounds like a case for pkgsrc/benchmark to me.

 - Hubert
Re: Fastest dump device
On Mon, 11 Jan 2010, Edgar Fuß wrote:
> What's the fastest type of device NetBSD can dump to? On sd, it dumps
> about 3MB/s, making 4GB take ~20 minutes. Some sort of flash device
> would be nice.

Probably not really answering the question, but: NetBSD 5.0/i386's release announcement mentions sparse kernel core dumps. I wonder if that may help if you're on !i386...

 - Hubert