Re: Kernel work that needs help?

2024-04-07 Thread Hubert Feyrer
Maybe have a look at our Problem Report (PR) database.

 - Hubert


> On 07.04.2024 at 22:00, Ice Cream wrote:
> 
> I would like to contribute to development and am looking
> for ongoing kernel dev work that needs help.
> 
> I worked on one of the projects (fullfs) from the project list
> and am looking for more. Most projects on the list are years
> old, lack a clear definition, or are outside my general interest.
> 
> I would like to join projects that are being actively worked
> on and have well defined milestones for someone relatively
> inexperienced. So, if you're working on something awesome
> then let me know :-)
> 
> -- IC



Re: "Boot this kernel once" functionality? (amd64)

2020-09-16 Thread Hubert Feyrer



On Wed, 16 Sep 2020, Anthony Mallet wrote:

I was wondering how easy that would be to add a "boot once" feature to


FWIW: the Amiga port had a /dev/reload in its very early days, to which you 
could cp a kernel, which would then be booted. I'm not sure this is 
easy to do on a PC platform.



 - Hubert


Documenting the kern.sched sysctl tree [patch]

2017-01-01 Thread Hubert Feyrer


Hi,

currently the kern.sched sysctl tree is not documented at all.
The patch below changes this. Changes, updates and suggestions welcome!

Here's what the formatted text looks like, for easier feedback:

 kern.sched (dynamic)
 Influence the scheduling of LWPs, their prioritization and how they
 are distributed on and moved between CPUs.

   Third level name             Type       Changeable
   kern.sched.cacheht_time      integer    yes
   kern.sched.balance_period    integer    yes
   kern.sched.average_weight    integer    yes
   kern.sched.min_catch         integer    yes
   kern.sched.timesoftints      integer    yes
   kern.sched.kpreempt_pri      integer    yes
   kern.sched.upreempt_pri      integer    yes
   kern.sched.maxts             integer    yes
   kern.sched.mints             integer    yes
   kern.sched.name              string     no
   kern.sched.rtts              integer    no
   kern.sched.pri_min           integer    no
   kern.sched.pri_max           integer    no

 The variables are as follows:

 kern.sched.cacheht_time (dynamic)
 Cache hotness time during which an LWP is kept on one particular
 CPU and not moved to another CPU. This reduces the
 overhead of flushing and reloading caches.  Defaults to
 3ms.  Needs to be given in ``hz'' units, see mstohz(9).

 kern.sched.balance_period (dynamic)
 Interval in which the CPU queues are checked for re-bal-
 ancing.  Defaults to 300ms.  Needs to be given in ``hz''
 units, see mstohz(9).

 kern.sched.average_weight (dynamic)
 Can be used to influence how likely it is that LWPs are migrated
 from one CPU's queue of LWPs that are ready to run to a
 different, idle CPU.  The value gives the percentage for
 weighting the average count of migratable threads from
 the past against the current number of migratable
 threads.  Small values give more weight to the past, big values
 more weight to the current situation.  Defaults to 50 and
 must be between 0 and 100.

 kern.sched.min_catch (dynamic)
 Minimum count of migratable (runnable) threads for catching
 (stealing) from another CPU.  Defaults to 1 but can
 be increased to decrease the chance of thread migration
 between CPUs.

 kern.sched.timesoftints (dynamic)
 This switch enables tracking of CPU time for
 soft interrupts as part of an LWP's real execution time.
 Set to a non-zero value to enable, and see ps(1) for
 printing CPU times.

 kern.sched.kpreempt_pri (dynamic)
 Minimum priority to trigger kernel preemption.

 kern.sched.upreempt_pri (dynamic)
 Minimum priority to trigger user preemption.

 kern.sched.maxts (dynamic)
 Scheduler specific maximal time quantum (in millisec-
 onds).  Must be set to a value larger than ``mints'' and
 between 10 and ``hz'' as given by the ``kern.clockrate''
 sysctl.  Provided by the M2 scheduler.

 kern.sched.mints (dynamic)
 Scheduler specific minimal time quantum (in millisec-
 onds).  Must be set to a value smaller than ``maxts'' and
 between 1 and ``hz'' as given by the ``kern.clockrate''
 sysctl.  Provided by the M2 scheduler.

 kern.sched.name (dynamic)
 Scheduler name.  Provided both by the M2 and the 4BSD
 scheduler.

 kern.sched.rtts (dynamic)
 Fixed scheduler specific round-robin time quantum in mil-
 liseconds.  Provided both by the M2 and the 4BSD sched-
 uler.

 kern.sched.pri_min (dynamic)
 Minimal POSIX real-time priority.  See sched(3).

 kern.sched.pri_max (dynamic)
 Maximal POSIX real-time priority.  See sched(3).
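
For readers unfamiliar with ``hz'' units, here is a rough sketch of the conversion from milliseconds to clock ticks. It is not the kernel's mstohz(9), whose rounding may differ; the helper name ms_to_ticks and the round-up behaviour are assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert milliseconds to clock ticks for a given
 * tick rate `hz` (ticks per second).  Rounds up so that a non-zero
 * interval never becomes zero ticks.  Illustration only, not the
 * kernel's mstohz(9).
 */
static uint64_t
ms_to_ticks(uint64_t ms, uint64_t hz)
{
	return (ms * hz + 999) / 1000;
}
```

With hz=100, the 300ms default of balance_period corresponds to 30 ticks under this sketch, and the 3ms default of cacheht_time rounds up to a single tick.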


 - Hubert


Index: sysctl.7
===
RCS file: /cvsroot/src/share/man/man7/sysctl.7,v
retrieving revision 1.104
diff -u -r1.104 sysctl.7
--- sysctl.7 17 Nov 2016 01:22:00 -  1.104
+++ sysctl.7 1 Jan 2017 22:19:46 -
@@ -355,7 +355,7 @@
 .It kern.rtc_off

Re: CPUs and processes not evenly assigned?!

2016-12-23 Thread Hubert Feyrer

> On 22.12.2016 at 11:09, Michael van Elst wrote:
>> What are we waiting for here, how should we go on?
>> Would you want to commit the latest patch and allow it to get more widespread 
>> testing in -current?
> 
> anyone who has time to _test_ it?
> 
> Would be nice if someone looks at performance for different use cases
> and possibly different settings of the knobs :)

What use cases exactly do you have in mind?
I guess we can spec out things and then see if we can find someone to help with 
the tests.

It would be a pity to have a solution sitting around while our 
operating system still lacks it because we can't move things forward. :)

 - Hubert


Re: CPUs and processes not evenly assigned?!

2016-12-22 Thread Hubert Feyrer
Hi,

> On 26.11.2016 at 11:09, Hubert Feyrer wrote:
> I see a lot of ground for more research here, determining the right number of 
> bits and the values of A and B. To sum up our options at this point:
> 
> a) leave the situation as-is and wait for research to get a perfect formula
> b) commit the patch we have and wait for the research to be done
> 
> Given that the existing patch in PR kern/43561 and PR kern/51615 does improve 
> the current situation, I'd vote for option "b".
> 
> Any takers? Any objections?

What are we waiting for here, how should we go on?
Would you want to commit the latest patch and allow it to get more widespread 
testing in -current?


 - Hubert

Re: CPUs and processes not evenly assigned?!

2016-12-03 Thread Hubert Feyrer

On Sun, 27 Nov 2016, Michael van Elst wrote:

I've made a new patch:
http://ftp.netbsd.org/pub/NetBSD/misc/mlelstv/kern_runq.diff


FWIW, I've made a test run with two CPUs, and with my original test case 
of two concurrent CPU-hogs, this works fine (just like the previous 
tests).




-> it uses 8 guard bits (multiply by 256 instead of 2).
-> it lets you configure the moving average factor in percent
  (default is 50% like now) with a sysctl.


I tried playing with these, but I didn't see a difference. I didn't fully 
understand what the factor means: as far as I understand, one extreme (1? 99?) 
gives much weight to the previous value and avoids switching hastily, 
while the other puts weight on newly occurring processes on 
a CPU and favors switching. But I'm not sure which is which, or whether this 
is true at all. Yet another magic knob for our scheduler ;)


Where to go from here?


 - Hubert


Re: CPUs and processes not evenly assigned?!

2016-11-26 Thread Hubert Feyrer

On Fri, 11 Nov 2016, Michael van Elst wrote:

Since we don't have floating point the computation should be done in
fixed point arithmetic, e.g.

r_avgcount = (A * r_avgcount + B * INT2FIX(r_mcount)) / (A + B);

With the current A=B=1 you get alpha=0.5, but other values are conceivable
to make the balancer decide on the short term thread count or an even
longer term moving average.

Using one fractional bit for INT2FIX by multiplying by two might not
be enough.
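
To make the quoted formula concrete, here is a minimal userland sketch of one update step. The choice of 8 fractional bits and the function name avg_update are assumptions for illustration; this is not the code from the patch.

```c
#include <stdint.h>

#define FRAC_BITS	8	/* assumed number of fractional bits */
#define INT2FIX(x)	((uint32_t)(x) << FRAC_BITS)

/*
 * One update step of the weighted moving average from the quoted formula:
 *   avg = (A * avg + B * INT2FIX(mcount)) / (A + B)
 * `avg` is kept in fixed point, `mcount` is a plain integer thread count.
 * A weights the old average, B weights the new sample.
 */
static uint32_t
avg_update(uint32_t avg, uint32_t mcount, uint32_t a, uint32_t b)
{
	return (a * avg + b * INT2FIX(mcount)) / (a + b);
}
```

With A=B=1 and a steady r_mcount of 1, the average goes from 0 to 128 to 192, i.e. 0.5 and 0.75 in fixed point, approaching 1.0 (256). A larger A relative to B slows this down, which is the longer term moving average mentioned above.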


I see a lot of ground for more research here, determining the right number of 
bits and the values of A and B. To sum up our options at this point:


a) leave the situation as-is and wait for research to get a perfect formula
b) commit the patch we have and wait for the research to be done

Given that the existing patch in PR kern/43561 and PR kern/51615 does 
improve the current situation, I'd vote for option "b".


Any takers? Any objections?


  - Hubert


Re: Audio - In kernel audio mixing

2016-11-23 Thread Hubert Feyrer

On Wed, 23 Nov 2016, Nathanial Sloss wrote:

ftp.NetBSD.org/pub/NetBSD/misc/nat/nextaudio5-kern.gz


ftp://ftp.NetBSD.org/pub/NetBSD/misc/nat/nextaudio5-kern.diff.gz


 - Hubert


Re: CPUs and processes not evenly assigned?!

2016-11-10 Thread Hubert Feyrer

On Thu, 10 Nov 2016, Michael van Elst wrote:

Can you try this?



Tried it, and it seems to help with both 2 and 4 CPUs.
Codebase: -current


Great, so the problem was just the rounding error.


Yes. I've also played with it a bit more, and the "stealing" of processes 
between CPUs seems to decrease as the factor applied to r_mcount grows; e.g. 
with the 4* in the patch below I see less movement of processes on a 
4-CPU system.


Maybe this should be made a sysctl or depend on the number of CPUs?


 - Hubert



--- sys/kern/kern_runq.c.orig   2016-11-10 21:11:10.0 +
+++ sys/kern/kern_runq.c
@@ -534,7 +534,9 @@ sched_balance(void *nocallout)
 		ci_rq = ci->ci_schedstate.spc_sched_info;

 		/* Average count of the threads */
-		ci_rq->r_avgcount = (ci_rq->r_avgcount + ci_rq->r_mcount) >> 1;
+		ci_rq->r_avgcount = (ci_rq->r_avgcount +
+			4*ci_rq->r_mcount
+			) /2;

 		/* Look for CPU with the highest average */
 		if (ci_rq->r_avgcount > highest) {
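
The rounding effect can be reproduced outside the kernel. The following is a userland sketch, not kernel code: it repeats the original update and the scaled variant from the patch with a constant r_mcount, starting from an average of zero. The function names are made up for the example.

```c
/* Original update: plain integers, halving truncates toward zero. */
static unsigned
avg_orig(unsigned avg, unsigned mcount)
{
	return (avg + mcount) >> 1;
}

/* Variant from the patch: r_mcount is scaled by 4 before averaging. */
static unsigned
avg_scaled(unsigned avg, unsigned mcount)
{
	return (avg + 4 * mcount) / 2;
}

/* Apply an update function `steps` times with a constant mcount. */
static unsigned
iterate(unsigned (*update)(unsigned, unsigned), unsigned mcount, int steps)
{
	unsigned avg = 0;

	for (int i = 0; i < steps; i++)
		avg = update(avg, mcount);
	return avg;
}
```

With a single migratable thread the original update never leaves 0, because (0 + 1) >> 1 is 0, so the CPU looks idle to the balancer. The scaled variant settles at 3 instead.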


 - Hubert


Re: CPUs and processes not evenly assigned?!

2016-11-10 Thread Hubert Feyrer

On Wed, 9 Nov 2016, Michael van Elst wrote:

Can you try this?


Tried it, and it seems to help with both 2 and 4 CPUs.
Codebase: -current

top(1):

load averages:  2.00,  2.00,  1.96;   up 0+01:21:47   13:38:27
25 processes: 1 runnable, 22 sleeping, 2 on CPU
CPU0 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
CPU1 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
Memory: 48M Act, 6728K Wired, 11M Exec, 31M File, 636M Free
Swap: 512M Total, 512M Free

  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
  909 feyrer    28    0    11M 1260K CPU/1  78:00 99.02% 99.02% sh
   66 feyrer    28    0    11M 1260K RUN/1  77:59 99.02% 99.02% sh
  713 feyrer    43    0    16M 1860K CPU/0   0:00  0.00%  0.00% top


 - Hubert


Re: CPUs and processes not evenly assigned?!

2016-11-08 Thread Hubert Feyrer


On Sat, 5 Nov 2016, Hubert Feyrer wrote:

Is this expected behaviour? Definitely surprised me! :)


FWIW, it seems the same behaviour happens on both netbsd-7/amd64 as of 
today as well as on -current/amd64, also from today.


I've put a screenshot here that shows the issue:
http://www.feyrer.de/Misc/priv/bad-scheduling-7.0_STABLE+7.99.42.png
(Right side is 7.0_STABLE, left is -current as can be seen on the top)


 - Hubert



CPUs and processes not evenly assigned?!

2016-11-05 Thread Hubert Feyrer


Hi,

I had to run a CPU-bound program today on a 2 CPU Xen (AWS) instance.
I expected each process to run on its own CPU, but what really 
happened is that both processes were fighting for the first CPU, leaving 
the second one idle.


NetBSD 7.0/xen with DOMU from Amazon AWS, more details here:
http://www.feyrer.de/NetBSD/blog.html/nb_20161105_1754.html

Is this expected behaviour? Definitely surprised me! :)


 - Hubert


Re: A Library for Converting Data to and from C Structs for Lua

2013-11-17 Thread Hubert Feyrer

On Sun, 17 Nov 2013, Marc Balmer wrote:

I plan to import it and to make it available to both lua(1) and lua(4)


I wonder if we really need to get all this into NetBSD,
instead of moving it to pkgsrc somehow.


 - Hubert


Re: zero-length symlinks

2013-11-03 Thread Hubert Feyrer

On Sat, 2 Nov 2013, David Holland wrote:

> I think "not sensible" is not a good enough reason to prohibit
> something.

Yeah yeah, but still nowadays we don't allow adding hard links to
directories. So while that's a valid premise, it's not universal.


FWIW, the idea behind not allowing hard links to directories is that ".." 
would no longer be unique. I don't see such a problem with a symlink 
pointing to "".



 - Hubert


Re: zero-length symlinks

2013-11-02 Thread Hubert Feyrer

>  Does anyone see any reason they shouldn't be?

If it ain't broken don't fix it?


Hubert



Re: Graceful USB disk detach/reattach

2013-04-11 Thread Hubert Feyrer
Hi,

On 10.04.2013 at 22:00, Ujjwal Thaakar wrote:
> Hi, my name is Ujjwal Thaakar and I'm a 3rd year CS student from India.
> This year I'm applying for GSoC and am interested in implementing the project 
> mentioned in the subject. I wanted to know whether it is feasible for me to 
> apply for something like this since I'm an undergrad. I do have good 
> programming skills but little experience in kernel development. Currently I'm 
> working on implementing sigpid on Minix3 and this is the only experience I 
> have with kernel development.

This is a hard question for us to answer, as we do not know you.
Please take the time to read through the following web page and answer the 
questions there.
With your answers, we have a fair chance of giving you a realistic 
answer.

Guidelines to apply for a project: http://wiki.netbsd.org/projects/application/


- Hubert





Re: how we document kernel software architecture

2013-02-28 Thread Hubert Feyrer

On Thu, 28 Feb 2013, Jochen Kunz wrote:

I wonder if it belongs in section 9, or as a document in the source
tree, or perhaps just some huge comments.

In an ideal world it would be covered by The NetBSD Guide:
http://www.netbsd.org/docs/guide/en/
The chapters "VI. NetBSD User Land Programming" and "VII. NetBSD Kernel
Programming" seem to be the logical follow-on to chapter "V. Building
the system". The Guide is the right place to draw the big picture,
leaving the details to the man pages.


FWIW, there's also the NetBSD Internals Guide:
http://www.netbsd.org/docs/internals/en/index.html


 - Hubert


Re: netbsd internals

2012-07-19 Thread Hubert Feyrer

>>> Could someone get me a link that gives pointers to the above?

Probably not too recent, but maybe good for a start:

http://www.netbsd.org/docs/internals/en/


 - Hubert



Re: Question about i386 CDs (bootxx_cd9660/boot-big.fs)

2012-01-29 Thread Hubert Feyrer

On Sun, 29 Jan 2012, Evgeniy Ivanov wrote:

What is used in i386 CDs? bootxx_cd9660 + boot from stand, right?


yes. See the prepare-target in src/distrib/common/Makefile.bootcd


 - Hubert


Re: Question about i386 CDs (bootxx_cd9660/boot-big.fs)

2012-01-28 Thread Hubert Feyrer

On Sat, 28 Jan 2012, Evgeniy Ivanov wrote:

I'm experimenting with the NetBSD boot code for i386. My main question is what
boot-big.fs is and how it differs from bootxx_cd9660.
With 'mkisofs -o test1.iso -b bootxx_cd9660 -no-emul-boot -c
boot.catalog -l -J -R -allow-leading-dots ./cdreleasefiles/', where
cdreleasefiles contains:
boot  boot.cfg  bootxx_cd9660  modules
I'm able to boot. However, boot doesn't read boot.cfg, although it loads a
kernel from modules without issues (I have to go to the prompt).

When I use boot-big.fs instead I get a nice colorful useless picture
on booting. Without "-no-emul-boot" (as in the manual) I fail to create
an image: Size of boot image is 7200 sectors -> genisoimage: Error -
boot image 'cdreleasefiles/boot-big.fs' has not an allowable size.


boot-big.fs is a complete 2.88MB floppy image that has everything needed to 
boot NetBSD.


bootxx_cd9660 is just a bootloader for the ISO 9660 format.
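
For illustration, the size check behind the quoted genisoimage error can be sketched as follows. El Torito floppy emulation accepts only images of exactly 1.2MB, 1.44MB or 2.88MB; anything else has to be booted with -no-emul-boot. The function name is made up for this example and is not part of mkisofs or genisoimage.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if an image of `sectors` 512-byte sectors matches one of
 * the floppy sizes usable with El Torito floppy emulation:
 * 1.2MB (2400 sectors), 1.44MB (2880) or 2.88MB (5760).
 */
static bool
floppy_emul_size_ok(uint32_t sectors)
{
	return sectors == 2400 || sectors == 2880 || sectors == 5760;
}
```

The 7200 sectors reported in the quoted error are 3600KiB, which matches none of the three sizes; an exact 2.88MB image would be 5760 sectors.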


 - Hubert


WARNING: couldn't open cd9660 (after loading kernel, before loading ramdisk), or: how do I put a file-system into the kernel (not as module)

2011-11-22 Thread Hubert Feyrer


I'm toying with -current, and wonder what the following error when booting 
from an ISO means (i386):


> load miniroot.kmod
> boot
6122760+454896+292596 [370528+368073]=0x743260
>WARNING: couldn't open cd9660 
(/stand/i386/5.99.56/modules/cd9660/cd9660.kmod)
Loading miniroot.kmod
WARNING: 1 module failed to load
Copyright (c) 1996, 1997, ...


My problem is that -current doesn't read my boot.cfg at all (read returns 
0 bytes; the ISO seems to have the right file, as indicated by some debug 
printfs). I don't understand why the kernel (or who?) wants to load 
cd9660.kmod.


Is there a way to build the CD9660 file-system into the kernel, while 
keeping the rest modular (or at least loading the ramdisk)?

How does one hardwire modules into the kernel in -current kernel configs?


 - Hubert


Re: bumping ARG_MAX

2011-11-13 Thread Hubert Feyrer

On Mon, 14 Nov 2011, Simon Burge wrote:

I think I like the Linux idea of a portion of stack size best


What is the stack size?
Is it what ulimit(1) gives me? The 2kB there seem pretty small for the 
problem at hand, and I can raise it to 64kB at most.
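
As a sketch of what "a portion of stack size" could mean in code: the soft stack limit of a process can be read with getrlimit(2), and Linux's execve(2) documents one quarter of it as the limit for arguments and environment. The helper names below are hypothetical and not an existing NetBSD interface.

```c
#include <stdint.h>
#include <sys/resource.h>

/*
 * Hypothetical policy helper: derive an argument-space limit as one
 * quarter of the stack size limit, the fraction documented for Linux.
 */
static uint64_t
arg_limit_from_stack(uint64_t stack_bytes)
{
	return stack_bytes / 4;
}

/*
 * Query the current soft stack limit of this process in bytes.
 * Returns 0 on error or when the limit is unlimited.
 */
static uint64_t
current_stack_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0 || rl.rlim_cur == RLIM_INFINITY)
		return 0;
	return (uint64_t)rl.rlim_cur;
}
```

Under this policy a stack limit of 8MB would allow 2MB of arguments; which fraction would suit NetBSD is exactly the open question in this thread.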



 - Hubert


Re: bumping ARG_MAX

2011-11-13 Thread Hubert Feyrer

On Sun, 13 Nov 2011, David Holland wrote:

thoughts?


Here's an interesting comparison:
http://www.in-ulm.de/~mascheck/various/argmax/#results


 - Hubert


nfs_lookup() panic, again

2011-10-08 Thread Hubert Feyrer


Hello David,

we talked about an NFS panic in nfs_lookup() some time ago:

panic: kernel diagnostic assertion "dvp != *vpp" failed: file
".../sys/nfs/nfs_vnops.c", line 826

The setup is NetBSD-current/i386 in VMware Fusion with sources 
mounted via NFS from a Mac OS X NFS server. The panic occurs repeatedly when 
doing a full build.


I've had a look at ddb and gdb, but can't really make a lot of sense of 
that:


http://www.feyrer.de/Misc/priv/nfs-panic/nfs_lookup-panic-msg+vnode.png 
has the panic message (from ddb "dmesg"), and an attempt to dig into the 
vnode pointed to by dvp and *vpp. The source of nfs_vnops.c was modified 
for those printfs, it's at

www.feyrer.de/Misc/priv/nfs-panic/nfs_vnops.c

I also have a stack backtrace from ddb, which is at
http://www.feyrer.de/Misc/priv/nfs-panic/nfs_lookup-panic-bt.png

I've tried to look into the kernel crash dump with gdb, but that has the 
stack messed up, see screenshot at 
http://www.feyrer.de/Misc/priv/nfs-panic/nfs_lookup-panic-gdb.png


Do you (or anyone else on tech-kern?) have an idea on where to go next? 
I'm not familiar with vnodes... Thanks!



 - Hubert



Re: Using coccinelle for (quick?) syntax fixing

2010-08-11 Thread Hubert Feyrer

On Thu, 12 Aug 2010, Bernd Ernesti wrote:

if (dev_priv == NULL) {
DRM_ERROR("called with no initialization\n");
-   DRM_SPINUNLOCK(&dev_priv->cs.cs_mutex);

...

Hmm, you didn't mention why you are doing that in your initial mail.


Use of a pointer after determining it's NULL, in this case dev_priv
(it _was_ in one of the prior mails).
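
A minimal sketch of the pattern, with stand-in types instead of the real DRM structures and macros (struct drm_priv, cs_unlock and do_work are invented for the example):

```c
#include <stddef.h>

struct cs_state { int locked; };		/* stand-in for the real lock */
struct drm_priv { struct cs_state cs; };	/* stand-in for dev_priv's type */

static void
cs_unlock(struct cs_state *cs)
{
	cs->locked = 0;
}

/*
 * Returns -1 if called without initialization, 0 otherwise.
 * The NULL branch must not touch dev_priv at all; the line removed in
 * the quoted diff passed &dev_priv->cs to the unlock inside this branch.
 */
static int
do_work(struct drm_priv *dev_priv)
{
	if (dev_priv == NULL)
		return -1;	/* nothing to unlock here */

	dev_priv->cs.locked = 1;
	/* ... work under the lock ... */
	cs_unlock(&dev_priv->cs);
	return 0;
}
```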


 - Hubert


systems hangs with slow disk - how does FS locking work?

2010-06-29 Thread Hubert Feyrer


Hi,

I have a file system on very slow storage (eeprom-based).
Access to the underlying eeprom happens in chunks of a few hundred bytes, 
yet when I copy a few kB of data to the storage, the 
whole system hangs for many seconds.


The system uses a kernel thread to write the data passed from the 
eeprom-"disk"-driver to the eeprom. Communication between the driver and 
the kernel thread is synchronized with a mutex, and the kernel thread 
is marked as MP_SAFE.


I'm trying to understand where the hanging of the system comes from:

My assumption is that the "big" data block written is divided into several 
small chunks (disk blocks, 512 bytes), which are then written to disk 
sequentially, with the system preempting if there's other work to do. 
The preemption would further delay the writes, but write speed is not an 
issue.
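
To illustrate the assumption, here is a userland sketch of writing a buffer in 512-byte chunks, with a point between chunks where other work could run. It is not the driver or file system code; write_chunked, chunk_writer and count_bytes are names invented for this example.

```c
#include <stddef.h>

#define CHUNK_SIZE	512	/* assumed disk block size from the text */

/* Callback that writes one chunk; returns 0 on success. */
typedef int (*chunk_writer)(const unsigned char *buf, size_t len, void *ctx);

/*
 * Write `len` bytes in chunks of at most CHUNK_SIZE bytes.  Returns the
 * number of chunks written, or -1 if the writer reported an error.
 */
static long
write_chunked(const unsigned char *buf, size_t len, chunk_writer w, void *ctx)
{
	long chunks = 0;

	while (len > 0) {
		size_t n = len < CHUNK_SIZE ? len : CHUNK_SIZE;

		if (w(buf, n, ctx) != 0)
			return -1;
		/* a real driver could let other work run at this point */
		buf += n;
		len -= n;
		chunks++;
	}
	return chunks;
}

/* Example writer: only counts the bytes it is handed. */
static int
count_bytes(const unsigned char *buf, size_t len, void *ctx)
{
	(void)buf;
	*(size_t *)ctx += len;
	return 0;
}
```

If the layers above hand down the data like this, only one chunk is in flight at a time and the rest of the system should get a chance to run between chunks.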


Hanging of the whole system's userland while I/O is going on is, though. I 
haven't dived into the file system code (yet), but maybe someone familiar 
can give me an answer to the following question:


Is it possible that the kernel / file system layer locks the whole
system until *all* blocks are written to disk, instead of
doing writes in small chunks?

FWIW, I currently use a msdos filesystem on the storage (due to less 
overhead than FFS), codebase is NetBSD 5.


Thanks!


 - Hubert


Re: [gsoc] syscall/libc fuzzer proposal

2010-03-20 Thread Hubert Feyrer

On Sat, 20 Mar 2010, Mateusz Kocielski wrote:

...your ideas?


Reminds me of 1991's crashme: http://crashme.codeplex.com/

The idea sounds more like a research project to me...


 - Hubert


Re: Proposal for adding fsx(8) to base system

2010-01-24 Thread Hubert Feyrer

On Sun, 24 Jan 2010, o...@linbsd.org wrote:

Fsx is a filesystem exerciser that is used to stress filesystem code.
I would like to propose importing fsx into the base system, or perhaps pkgsrc.
The intent is to import ftp://ftp.netbsd.org/pub/NetBSD/misc/ober/fsx/ to 
src/usr.sbin.


Sounds like a case for pkgsrc/benchmark to me.


 - Hubert


Re: Fastest dump device

2010-01-11 Thread Hubert Feyrer

On Mon, 11 Jan 2010, Edgar Fuß wrote:

What's the fastest type of device NetBSD can dump to?
On sd, it dumps about 3MB/s, making 4GB take ~20 minutes.
Some sort of flash device would be nice.


Probably not really answering the question, but:
NetBSD 5.0/i386's release announcement mentions sparse kernel core dumps.
I wonder if that may help if you're on !i386...


 - Hubert