VOP_PUTPAGE ignores mount_nfs -o soft,intr

2015-06-19 Thread Emmanuel Dreyfus
Hi

I have encountered a bug in the NetBSD NFS client. Despite a mount with
-o intr,soft, we can hit a situation where a process remains hung in the
kernel because the NFS server is gone.

This happens when the ioflush does its duty, with the following code path:
sync_fsync / nfs_sync / VOP_FSYNC / nfs_fsync / nfs_flush / VOP_PUTPAGES

VOP_PUTPAGES has flags = PGO_ALLPAGES|PGO_FREE. It then goes through
genfs_putpages and genfs_do_putpages, and gets stuck in:

	/* Wait for output to complete. */
	if (!wasclean && !async && vp->v_numoutput != 0) {
		while (vp->v_numoutput != 0)
			cv_wait(&vp->v_cv, slock);
	}

This cv_wait() is timeout-less and uninterruptible. ioflush will
sleep there forever, holding the vnode lock. Any other process doing
I/O on the filesystem will sleep in tstile waiting for the vnode
lock with this path:
sys_write / dofilewrite / vn_write / vn_lock / VOP_LOCK / rw_enter

This is another timeout-less and uninterruptible wait, this time for
the vnode lock, which means -o intr,soft is not honoured. If the NFS
server does not come back, the only way out is reboot -n. Even
umount -f -R will hang in tstile.

How can we fix it? 

1) ioflush should not sleep forever awaiting I/O completion for
an NFS mount if it was mounted with -o soft. A PGO_SOFT
flag could be added to VOP_PUTPAGES so that cv_timedwait() is used
instead of cv_wait(), but how can we get the timeout? Should we introduce
a VOP_PUTPAGES2 with an additional argument? Use a sane default? Get it
from the filesystem using a new VFS_GETTIMEOUT method (or a more general
VFS_GETMNTINFO which would be able to query different kinds of information)?

2) Honouring -o intr seems to require either the introduction of a 
real nfs_lock (currently it is genfs_lock), or a change to genfs_lock.

The goal is to create an interruptible sleep for vp->v_lock. How can
this be achieved? We have no rw_(try)enter_sig, should we introduce it?
Or should we loop in an interruptible sleep, retrying at regular
intervals? And how could the -o soft timeout be honoured here?
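
The second option above (loop, sleep briefly, retry) can be sketched in
userland C. This is only an illustration under stated assumptions, not
kernel code: a POSIX rwlock stands in for vp->v_lock, the flag
intr_pending is a hypothetical stand-in for "a signal is pending", and
lock_enter_sig() is an invented name, not an existing NetBSD function.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <time.h>

/* Hypothetical stand-in for "a signal is pending for this thread". */
static volatile sig_atomic_t intr_pending;

/*
 * Try to take the write lock, polling every poll_ms milliseconds.
 * Returns 0 on success, EINTR if intr_pending is set while waiting,
 * and ETIMEDOUT once about total_ms have elapsed without the lock.
 */
static int
lock_enter_sig(pthread_rwlock_t *lk, long poll_ms, long total_ms)
{
	struct timespec ts = { poll_ms / 1000, (poll_ms % 1000) * 1000000L };
	long waited = 0;

	assert(lk != NULL && poll_ms > 0);

	for (;;) {
		if (pthread_rwlock_trywrlock(lk) == 0)
			return 0;		/* got the lock */
		if (intr_pending)
			return EINTR;		/* honour "-o intr" */
		if (waited >= total_ms)
			return ETIMEDOUT;	/* honour "-o soft" */
		nanosleep(&ts, NULL);
		waited += poll_ms;
	}
}
```

Polling wastes wakeups and is not fair to waiters, which is one
argument for a real rw_enter_sig-style primitive instead.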

Last question: is there any hope of getting this fixed in netbsd-7, or has the
VFS interface changed too much?

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: bottom half

2015-06-19 Thread Edgar Fuß
> "Runs on kernel stack in kernel space" is not the same thing as the Linux
> concept of bottom half. :-)
I don't know what the Linux (or VMS or Windows) concept of "bottom half" is.
I thought I knew what the BSD concept of kernel halves is.

> I don't know what the figured referred to is, 
Figure 3.1 Run-time structure of the kernel

> but the text quoted do not say "bottom half" at least...
The text describes what "bottom half of the kernel" in the figure means.

To elaborate, as I seem to have been too cryptic in my references: I learned
the terms top and bottom half from "The Design and Implementation of the
4.4BSD Operating System" by McKusick, Bostic, Karels and Quarterman. The
text on page 51 explains "The bottom half of the kernel comprises routines
that are invoked to handle hardware interrupts." and the figure 3.1 above
explains "Never scheduled, cannot block. Runs on kernel stack in kernel
address space."

Now it may well be that things have changed, my understanding is out-dated 
and NetBSD handles interrupts in a conceptually different way from 4.4BSD.

But if things have not changed, I think documentation of NetBSD's locking
mechanisms (thanks go to Kamil for writing this!) may well use the terms
the definitive reference on 4.4BSD uses, no matter what penguin addicts use
the term for. In fact I think it should because I can't stand people picking
up well-established terms, re-defining them and then refusing to accept
the traditional definition. Ever tried to talk to someone who grew up with
git about what a patch is?


Re: bottom half

2015-06-19 Thread Johnny Billquist

On 2015-06-19 11:45, Edgar Fuß wrote:
>> "Runs on kernel stack in kernel space" is not the same thing as the Linux
>> concept of bottom half. :-)
> I don't know what the Linux (or VMS or Windows) concept of "bottom half" is.
> I thought I knew what the BSD concept of kernel halves is.


I can't comment on Windows - I have no idea.
VMS uses a very different solution, so it doesn't make much sense to talk
about that here (or at least the parts that I know, which might be
outdated).


If I remember Linux right, they have a fixed list of registered bottom
half handlers (which of course ran out of space a long time ago), and
then through some tricks extended it to be more general, but in essence
the bottom half is the part of the device driver that runs after an
interrupt to complete an I/O request. The top half also runs in
the kernel, but in the context of a process that does the I/O, and the
top half blocks until the I/O completes. And the bottom half is the part
that unblocks the top half again. And each driver has its own bottom and
top halves. One major point of the bottom halves is that when running
the bottom half, interrupts are not blocked. A device driver normally does
only a minimal amount of work in the interrupt handler itself, and then
defers the rest of the work to the bottom half code, which will run at
some later time.


Of course, I could be remembering this all wrong, and it might be 
outdated as well. So take what I write with a grain of salt. Or rather, 
read up on it in a Linux book instead.


I would say the bottom half concept in Linux is close to the softint 
stuff in NetBSD. But I might be wrong on that one, as I don't remember 
all the details of that either right now.


And the text you refer to in the 4.4BSD book is then obviously not
describing the same concept as the Linux bottom halves, as the Linux
bottom halves do not handle the hardware interrupt itself, and they can
be interrupted by anything.


Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: bottom half

2015-06-19 Thread Greg Troxel

Taylor R Campbell  writes:

> What I meant when I said that to Kamil is that we don't have any
> formalized notion called `top half' and `bottom half'.  We have hard
> interrupt handlers which are supposed to have small bounded latency,
> and we have soft interrupt handlers and kernel threads at lower
> priorities to which hard interrupt handlers defer long computations
> and I/O.

The notion of top and bottom half is historical in BSD from long ago
(2BSD even).  As far as "formal" goes, I think it's just the usual BSD
problem of an undocumented design.

Since the original, two things have happened:

  we need locking, because preemption of interrupts only works for
  mutual exclusion on a single CPU

  we have softints, which are not bottom half and not top half, but in
  between.  This is not really because they are preemptible (traditional
  disk interrupt handlers were preemptible by serial drivers) but
  because they can sleep.

Because of this transition from halves to thirds, and because of
confusion with Linux, the term bottom half may now be best avoided.

Overall, the whole subject of locking in the kernel is too hard for new
people to figure out, and I think it's great that Kamil is writing an
overview for it.





Re: bottom half

2015-06-19 Thread Edgar Fuß
> as the Linux bottom halves do not handle the hardware interrupt 
> itself, and they can be interrupted by anything.
Oh well. So they use a well-established terminology to mean something
different from what it originally meant. Sigh.

Thanks for the explanation.


Re: bottom half

2015-06-19 Thread Edgar Fuß
> Because of this transition from halves to thirds,
OK, I understand.

> and because of []confusion with Linux, 
Sigh.

> the term bottom half may now be best avoided.
OK.

> it's great that Kamil is writing an overview for it.
YES.


Re: bottom half

2015-06-19 Thread Johnny Billquist

On 2015-06-19 14:27, Edgar Fuß wrote:
>> as the Linux bottom halves do not handle the hardware interrupt
>> itself, and they can be interrupted by anything.
> Oh well. So they use a well-established terminology to mean something
> different from what it originally meant. Sigh.
>
> Thanks for the explanation.


Hey. It is Linux. What did you expect? :-)

Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: bottom half

2015-06-19 Thread Eduardo Horvath
On Fri, 19 Jun 2015, Johnny Billquist wrote:

> On 2015-06-19 11:45, Edgar Fuß wrote:
> > > "Runs on kernel stack in kernel space" is not the same thing as the Linux
> > > concept of bottom half. :-)
> > I don't know what the Linux (or VMS or Windows) concept of "bottom half" is.
> > I thought I knew what the BSD concept of kernel halves is.
> 
> I can't comment on Windows - I have no idea.
> VMS uses a very different solution, so it doesn't make much sense to talk about
> that here (or at least the parts that I know, which might be outdated).
> 
> If I remember Linux right, they have a fixed list of registered bottom half
> handlers (which of course ran out of space a long time ago), and then through
> some tricks extended it to be more general, but in essence the bottom half is
> the part of the device driver that runs after an interrupt to complete an I/O
> request. The top half also runs in the kernel, but in the context of
> a process that does the I/O, and the top half blocks until the I/O completes.
> And the bottom half is the part that unblocks the top half again. And each
> driver has its own bottom and top halves. One major point of the bottom halves
> is that when running the bottom half, interrupts are not blocked. A device
> driver normally does only a minimal amount of work in the interrupt handler
> itself, and then defers the rest of the work to the bottom half code, which
> will run at some later time.
> 
> Of course, I could be remembering this all wrong, and it might be outdated as
> well. So take what I write with a grain of salt. Or rather, read up on it in a
> Linux book instead.
> 
> I would say the bottom half concept in Linux is close to the softint stuff in
> NetBSD. But I might be wrong on that one, as I don't remember all the details
> of that either right now.

From what I remember the BSD book talks about the "top half" and "bottom
half" of the driver, not just interrupt dispatch the way Linux does.

On Linux, the top half of the interrupt handler runs in interrupt context,
either preempting a kernel thread on the kernel stack or on a separate 
interrupt stack depending on the particular architecture.

The bottom half of an interrupt handler is basically a softint that runs 
on a kernel stack.

In general, you register a top half handler to acknowledge the interrupt.  
If the interrupt has any notable amount of processing to do or needs to 
fiddle with locks, the top half schedules a bottom half interrupt to do 
that.  
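
The split described above (a short handler that acknowledges the
interrupt and schedules deferred work, and a later handler that does the
processing) can be shown with a small single-threaded C simulation. All
names here (fake_dev, hard_intr, soft_intr) are hypothetical and chosen
for illustration; this is not the Linux or NetBSD interrupt API.

```c
#include <assert.h>
#include <stddef.h>

#define NPENDING 8

/* Hypothetical device state shared by the two handlers. */
struct fake_dev {
	int	pending[NPENDING];	/* events recorded by the hard handler */
	size_t	npending;		/* number of recorded events */
	int	soft_scheduled;		/* set when deferred work is requested */
	long	total;			/* result of the deferred processing */
};

/* Hard handler: record the event, request deferred work, and return. */
static void
hard_intr(struct fake_dev *d, int event)
{
	assert(d != NULL);
	if (d->npending < NPENDING)
		d->pending[d->npending++] = event;
	d->soft_scheduled = 1;
}

/* Deferred handler: runs later and does the longer processing. */
static void
soft_intr(struct fake_dev *d)
{
	assert(d != NULL);
	for (size_t i = 0; i < d->npending; i++)
		d->total += d->pending[i];
	d->npending = 0;
	d->soft_scheduled = 0;
}
```

In a real kernel the two handlers run at different priority levels and
the shared state needs locking; the simulation leaves both out.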

In addition to that you can have code that runs in a kernel thread, and 
you can have code that runs in the kernel in process context as the result 
of a system call.

It's been a while since I fiddled with interrupts on NetBSD, but ISTR we
now schedule an interrupt thread to do all of the processing so there is
no equivalent of the Linux interrupt top half and interrupt bottom half.
Or is that Solaris?  I forget.

Eduardo

Re: VOP_PUTPAGE ignores mount_nfs -o soft,intr

2015-06-19 Thread Christos Zoulas
In article <20150619083656.gt19...@homeworld.netbsd.org>,
Emmanuel Dreyfus   wrote:
>Hi
>
>I have encountered a bug in the NetBSD NFS client. Despite a mount with
>-o intr,soft, we can hit a situation where a process remains hung in the
>kernel because the NFS server is gone.
>
>This happens when the ioflush does its duty, with the following code path:
>sync_fsync / nfs_sync / VOP_FSYNC / nfs_fsync / nfs_flush / VOP_PUTPAGES
>
>VOP_PUTPAGES has flags = PGO_ALLPAGES|PGO_FREE. It then goes through
>genfs_putpages and genfs_do_putpages, and gets stuck in:
>
>	/* Wait for output to complete. */
>	if (!wasclean && !async && vp->v_numoutput != 0) {
>		while (vp->v_numoutput != 0)
>			cv_wait(&vp->v_cv, slock);
>	}
>
>This cv_wait() is timeout-less and uninterruptible. ioflush will
>sleep there forever, holding the vnode lock. Any other process doing
>I/O on the filesystem will sleep in tstile waiting for the vnode
>lock with this path:
>sys_write / dofilewrite / vn_write / vn_lock / VOP_LOCK / rw_enter

Yes, but ioflush is not a user process... An interruptible mount
means that a user process can interrupt a syscall doing an NFS
operation. No other operating system I know of takes this to mean
that you can unmount the filesystem or make delayed writes abort
and fail.

Having said that, yes it is a problem that you need to reboot
because an NFS server is gone, and we should make umount -f work
properly in that case. I don't think that we should introduce umount
-l (like Linux) unless there is a compelling reason to do so.

christos



KGDB/i386 broken/supposed to work?

2015-06-19 Thread Timo Buhrmester
I'm failing to get KGDB on i386 working for kernel debugging over a serial 
(nullmodem) link, as described in http://www.netbsd.org/docs/kernel/kgdb.html

The TARGET (to-be-debugged) system has two serial ports, com0 is the boot 
console, com1 is what I set KGDB to operate on.
The REMOTE (debugger) system uses its com0 port to connect to the target's com1.

Using a GENERIC kernel with only the modifications required to enable KGDB (see 
bottom for config diff), I get the following behavior on the TARGET machine:
| > boot netbsd -d
| 15741968+590492+466076 [689568+730405]=0x1161fd4
| kernel text is mapped with 4 large pages and 5 normal pages
| Loaded initial symtab at 0xc110750c, strtab at 0xc11afaac, # entries 43075
| kgdb waiting...fatal breakpoint trap in supervisor mode
| trap type 1 code 0 eip c02a6744 cs 8 eflags 202 cr2 0 ilevel 8 esp c1265ea0
| curlwp 0xc1078900 pid 0 lid 1 lowest kstack 0xc12632c0

There is no delay between ``kgdb waiting...'' and ``fatal breakpoint trap in 
supervisor mode''.
I'm not sure whether or not this is the expected behavior, because eip c02a6744 
is in the `breakpoint` function so that would make sense; but the documentation 
makes it sound like it should just say ``kgdb waiting...''.


On the REMOTE (debugger) machine (serial port tty00) I get/do:
| # gdb -q netbsd.gdb
| Reading symbols from netbsd.gdb...done.
| (gdb) set remotebaud 38400 
| Warning: command 'set remotebaud' is deprecated.
| Use 'set serial baud'.
|
| (gdb) set serial baud 38400
| (gdb) set remotebreak 1
| Warning: command 'set remotebreak' is deprecated.
| Use 'set remote interrupt-sequence'.
|
| (gdb) set remote interrupt-sequence Ctrl-C 
| (gdb) set remotetimeout 5 
| (gdb) target remote /dev/tty00
| Remote debugging using /dev/tty00
| Ignoring packet error, continuing...
| warning: unrecognized item "timeout" in "qSupported" response
| Ignoring packet error, continuing...
| Ignoring packet error, continuing...
| Bogus trace status reply from target: timeout
| (gdb)

..which I presume is due to the target already having ceased execution.


Both machines run the same, recent -current build (7.99.18) on i386.
I have verified that the serial connection works in both directions, using a 
non-KGDB GENERIC kernel.
I have also verified that kgdb is actually in the kernel and using the right 
port (com1) when booting the KGDB kernel without -d:
| com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
| com0: console
| com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
| com1: kgdb


The difference between GENERIC and my KGDB-enabled version of it:
-#options 	DEBUG		# expensive debugging checks/support
+options 	DEBUG		# expensive debugging checks/support
 #options 	LOCKDEBUG	# expensive locking checks/support
 #options 	KMEMSTATS	# kernel memory statistics (vmstat -m)
-options 	DDB		# in-kernel debugger
+#options 	DDB		# in-kernel debugger
 #options 	DDB_ONPANIC=1	# see also sysctl(7): `ddb.onpanic'
-options 	DDB_HISTORY_SIZE=512	# enable history editing in DDB
+#options 	DDB_HISTORY_SIZE=512	# enable history editing in DDB
 #options 	DDB_VERBOSE_HELP
-#options 	KGDB		# remote debugger
-#options 	KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x3f8,KGDB_DEVRATE=9600
-#makeoptions 	DEBUG="-g"	# compile full symbol table
+options 	KGDB		# remote debugger
+options 	KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x2f8,KGDB_DEVRATE=38400
+makeoptions 	DEBUG="-g"	# compile full symbol table
 #options 	SYSCALL_STATS	# per syscall counts
 #options 	SYSCALL_TIMES	# per syscall times
 #options 	SYSCALL_TIMES_HASCOUNTER	# use 'broken' rdtsc (soekris)


Any idea whether a) KGDB is tested/supposed to work and b) what I might be 
doing wrong?
Is there any other relevant information I missed that would be useful to 
provide?


Timo Buhrmester


Re: KGDB/i386 broken/supposed to work?

2015-06-19 Thread Christos Zoulas
In article <20150619201302.GA243@frozen.localdomain>,
Timo Buhrmester   wrote:
>I'm failing to get KGDB on i386 working for kernel debugging over a
>serial (nullmodem) link, as described in
>http://www.netbsd.org/docs/kernel/kgdb.html
>
>The TARGET (to-be-debugged) system has two serial ports, com0 is the
>boot console, com1 is what I set KGDB to operate on.
>The REMOTE (debugger) system uses its com0 port to connect to the
>target's com1.
>
>Using a GENERIC kernel with only the modifications required to enable
>KGDB (see bottom for config diff), I get the following behavior on the
>TARGET machine:
>| > boot netbsd -d
>| 15741968+590492+466076 [689568+730405]=0x1161fd4
>| kernel text is mapped with 4 large pages and 5 normal pages
>| Loaded initial symtab at 0xc110750c, strtab at 0xc11afaac, # entries 43075
>| kgdb waiting...fatal breakpoint trap in supervisor mode
>| trap type 1 code 0 eip c02a6744 cs 8 eflags 202 cr2 0 ilevel 8 esp c1265ea0
>| curlwp 0xc1078900 pid 0 lid 1 lowest kstack 0xc12632c0
>
>There is no delay between ``kgdb waiting...'' and ``fatal breakpoint
>trap in supervisor mode''.
>I'm not sure whether or not this is the expected behavior, because eip
>c02a6744 is in the `breakpoint` function so that would make sense; but
>the documentation makes it sound like it should just say ``kgdb
>waiting...''.
>
>
>On the REMOTE (debugger) machine (serial port tty00) I get/do:
>| # gdb -q netbsd.gdb
>| Reading symbols from netbsd.gdb...done.
>| (gdb) set remotebaud 38400 
>| Warning: command 'set remotebaud' is deprecated.
>| Use 'set serial baud'.
>|
>| (gdb) set serial baud 38400
>| (gdb) set remotebreak 1
>| Warning: command 'set remotebreak' is deprecated.
>| Use 'set remote interrupt-sequence'.
>|
>| (gdb) set remote interrupt-sequence Ctrl-C 
>| (gdb) set remotetimeout 5 
>| (gdb) target remote /dev/tty00
>| Remote debugging using /dev/tty00
>| Ignoring packet error, continuing...
>| warning: unrecognized item "timeout" in "qSupported" response
>| Ignoring packet error, continuing...
>| Ignoring packet error, continuing...
>| Bogus trace status reply from target: timeout
>| (gdb)
>
>..which I presume is due to the target already having ceased execution.
>
>
>Both machines run the same, recent -current build (7.99.18) on i386.
>I have verified that the serial connection works in both directions,
>using a non-KGDB GENERIC kernel.
>I have also verified that kgdb is actually in the kernel and using the
>right port (com1) when booting the KGDB kernel without -d:
>| com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
>| com0: console
>| com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
>| com1: kgdb
>
>
>The difference between GENERIC and my KGDB-enabled version of it:
>-#options 	DEBUG		# expensive debugging checks/support
>+options 	DEBUG		# expensive debugging checks/support
> #options 	LOCKDEBUG	# expensive locking checks/support
> #options 	KMEMSTATS	# kernel memory statistics (vmstat -m)
>-options 	DDB		# in-kernel debugger
>+#options 	DDB		# in-kernel debugger
> #options 	DDB_ONPANIC=1	# see also sysctl(7): `ddb.onpanic'
>-options 	DDB_HISTORY_SIZE=512	# enable history editing in DDB
>+#options 	DDB_HISTORY_SIZE=512	# enable history editing in DDB
> #options 	DDB_VERBOSE_HELP
>-#options 	KGDB		# remote debugger
>-#options 	KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x3f8,KGDB_DEVRATE=9600
>-#makeoptions 	DEBUG="-g"	# compile full symbol table
>+options 	KGDB		# remote debugger
>+options 	KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x2f8,KGDB_DEVRATE=38400
>+makeoptions 	DEBUG="-g"	# compile full symbol table
> #options 	SYSCALL_STATS	# per syscall counts
> #options 	SYSCALL_TIMES	# per syscall times
> #options 	SYSCALL_TIMES_HASCOUNTER	# use 'broken' rdtsc (soekris)
>
>
>Any idea whether a) KGDB is tested/supposed to work and b) what I might
>be doing wrong?
>Is there any other relevant information I missed that would be useful
>to provide?
>

No, but the explanation is that support for it has probably rotted out.
I would file a PR so this information is not lost.

christos



Re: VOP_PUTPAGE ignores mount_nfs -o soft,intr

2015-06-19 Thread David Holland
On Fri, Jun 19, 2015 at 05:42:45PM +, Christos Zoulas wrote:
 > >This cv_wait() is timeout-less and uninterruptible. ioflush will
 > >sleep there forever, holding the vnode lock. Any other process doing
 > >I/O on the filesystem will sleep in tstile waiting for the vnode
 > >lock with this path: 
 > >sys_write / dofilewrite / vn_write / vn_lock / VOP_LOCK / rw_enter
 > 
 > Yes, but ioflush is not a user process... An interruptible mount
 > means that a user process can interrupt a syscall doing an NFS
 > operation. No other operating system I know of, takes this to mean
 > that you can unmount the filesystem or make delayed writes abort
 > and fail.

Sure. But it also doesn't mean that there should be cases where I/O to
the filesystem hangs uninterruptibly.

Nothing is supposed to hang in tstile; therefore, this wait is
incorrect...

 > Having said that, yes it is a problem that you need to reboot
 > because an NFS server is gone, and we should make umount -f work
 > properly in that case. I don't think that we should introduce umount
 > -l (like linux) unless there is a compelling reason to do so.

I would say we want umount -l, but it's both not trivial and not a
solution to this problem.

-- 
David A. Holland
dholl...@netbsd.org


Re: VOP_PUTPAGE ignores mount_nfs -o soft,intr

2015-06-19 Thread Christos Zoulas
In article <20150619215901.ga22...@netbsd.org>,
David Holland   wrote:
>On Fri, Jun 19, 2015 at 05:42:45PM +, Christos Zoulas wrote:
> > >This cv_wait() is timeout-less and uninterruptible. ioflush will
> > >sleep there forever, holding the vnode lock. Any other process doing
> > >I/O on the filesystem will sleep in tstile waiting for the vnode
> > >lock with this path: 
> > >sys_write / dofilewrite / vn_write / vn_lock / VOP_LOCK / rw_enter
> > 
> > Yes, but ioflush is not a user process... An interruptible mount
> > means that a user process can interrupt a syscall doing an NFS
> > operation. No other operating system I know of, takes this to mean
> > that you can unmount the filesystem or make delayed writes abort
> > and fail.
>
>Sure. But it also doesn't mean that there should be cases where I/O to
>the filesystem hangs uninterruptibly.
>
>Nothing is supposed to hang in tstile; therefore, this wait is
>incorrect...

Ok, what is it supposed to do? Does it fail? Give up? Get interrupted
and keep looping?

> > Having said that, yes it is a problem that you need to reboot
> > because an NFS server is gone, and we should make umount -f work
> > properly in that case. I don't think that we should introduce umount
> > -l (like linux) unless there is a compelling reason to do so.
>
>I would say we want umount -l, but it's both not trivial and not a
>solution to this problem.

My understanding is that umount -l hides the mount, but does not deallocate
the resources it can't. So it does not look that hard to me.

christos



Re: New manpage: locking(9)

2015-06-19 Thread Kamil Rytarowski
On 18.06.2015 22:33, Christos Zoulas wrote:
> In article 
> ,
> Kamil Rytarowski  wrote:
>> -=-=-=-=-=-
>>
>> I'm attaching a proposition of locking(9).
>>
>> It was inspired by:
>> http://leaf.dragonflybsd.org/cgi/web-man?command=locking&section=ANY
>> https://www.freebsd.org/cgi/man.cgi?query=locking%289%29
>>
>> And by this page:
>> http://www.feyrer.de/NetBSD/bx/blosxom.cgi/nb_20080409_0027.html
>>
>> I included some extra notes about the kernel design and contexts:
>> - thread context vs softirq context vs hardirq context,
>> - process vs kernel thread (LWP),
>> - top kernel half vs bottom kernel half.
>>
>> These details might be off topic, but I need them to understand the
>> overall design and the internal flow.
> 
> That's very nice. I would like to include information on what is the
> typical use for each one and also which ones are obsolete. I also think
> that *tsleep should be included in the docs (at least saying that it has
> been replaced by condvars).
> 

I will do it.

I was told that the mb(9) interface is deprecated, but I don't know why.
And indeed, I see it used on only a few archs.

$ grep -r 'mb_write()' .
./arch/hppa/include/mutex.h:mb_write();

$ grep -r 'mb_memory()' .
./arch/alpha/include/mutex.h:#define MUTEX_GIVE(mtx) mb_memory()
./arch/hppa/include/mutex.h:mb_memory();
./arch/m68k/include/mutex.h:#define MUTEX_GIVE(mtx) mb_memory()
./arch/mips/include/lock.h: mb_memory();
./arch/mips/include/lock.h: mb_memory();
./arch/powerpc/include/mutex.h:#define MUTEX_GIVE(mtx) mb_memory()

$ grep -r 'mb_read()' .
./arch/alpha/include/mutex.h:#define MUTEX_RECEIVE(mtx) mb_read()
./arch/m68k/include/mutex.h:#define MUTEX_RECEIVE(mtx) mb_read()
./arch/mips/include/lock.h: mb_read();
./arch/mips/include/lock.h: mb_read();
./arch/powerpc/include/mutex.h:#define MUTEX_RECEIVE(mtx) mb_read()
./arch/sparc/include/lock.h:mb_read();
./arch/sparc64/include/mutex.h:#define MUTEX_RECEIVE(mtx) mb_read()
./arch/sparc64/include/rwlock.h:#define RW_RECEIVE(rw) mb_read()



Re: VOP_PUTPAGE ignores mount_nfs -o soft,intr

2015-06-19 Thread David Holland
On Sat, Jun 20, 2015 at 12:30:28AM +, Christos Zoulas wrote:
 > >Sure. But it also doesn't mean that there should be cases where I/O to
 > >the filesystem hangs uninterruptibly.
 > >
 > >Nothing is supposed to hang in tstile; therefore, this wait is
 > >incorrect...
 > 
 > Ok, what is it supposed to do? Does it fail? Give up? Get interrupted
 > and keep looping?

I don't know; maybe time out and release/retake the vnode lock so
other threads can run?
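
A bounded wait of that kind can be sketched in userland C with POSIX
condition variables. The sketch rests on several assumptions: struct
fake_vnode and wait_output_bounded() are hypothetical names, the mutex
plays the role of the interlock (slock) passed to cv_wait(), and the
separate vnode lock that would also have to be released and retaken is
not modeled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the vnode fields used by the wait loop. */
struct fake_vnode {
	pthread_mutex_t	lock;		/* plays the role of slock */
	pthread_cond_t	cv;		/* plays the role of v_cv */
	int		numoutput;	/* plays the role of v_numoutput */
};

/*
 * Wait for numoutput to reach zero, giving up after max_tries waits of
 * at most slice_ms each.  Called with vp->lock held; returns with it
 * held.  Returns 0 once output has drained, ETIMEDOUT otherwise.
 */
static int
wait_output_bounded(struct fake_vnode *vp, long slice_ms, int max_tries)
{
	assert(vp != NULL && slice_ms >= 0);

	for (int tries = 0; vp->numoutput != 0; tries++) {
		struct timespec abs;

		if (tries >= max_tries)
			return ETIMEDOUT;
		/* Build an absolute deadline slice_ms from now. */
		clock_gettime(CLOCK_REALTIME, &abs);
		abs.tv_sec += slice_ms / 1000;
		abs.tv_nsec += (slice_ms % 1000) * 1000000L;
		if (abs.tv_nsec >= 1000000000L) {
			abs.tv_sec++;
			abs.tv_nsec -= 1000000000L;
		}
		/* Drops vp->lock while asleep, retakes it before returning. */
		(void)pthread_cond_timedwait(&vp->cv, &vp->lock, &abs);
	}
	return 0;
}
```

What the caller should do on ETIMEDOUT (fail the write, retry later, or
keep looping) is exactly the open question in this thread; the sketch
only shows that the wait itself can be bounded.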

 > > > Having said that, yes it is a problem that you need to reboot
 > > > because an NFS server is gone, and we should make umount -f work
 > > > properly in that case. I don't think that we should introduce umount
 > > > -l (like linux) unless there is a compelling reason to do so.
 > >
 > >I would say we want umount -l, but it's both not trivial and not a
 > >solution to this problem.
 > 
 > My understanding that umount -l hides the mount, but does not deallocate
 > resources it can't. So it does not look that hard to me.

It requires splitting struct mount into two pieces, one for the
filesystem and one for the mount of the filesystem. This isn't
entirely trivial.

-- 
David A. Holland
dholl...@netbsd.org


Re: New manpage: locking(9)

2015-06-19 Thread Kamil Rytarowski
On 19.06.2015 01:06, Paul Goyette wrote:
> Great to have this.
> 
> I've attached a diff file with some minor wording/grammar changes.
> 

Thank you.

When I make progress in my research I will post a new version for review.

I will add notes about the halves anyway, as I can learn about them from the
Design and Implementation of the 4.4BSD Operating System book. The same
term was used in the follow-up book for FreeBSD (from 2003) and in the 2nd
FreeBSD edition from 2014/2015.

I see no reason to capitulate and drop the original naming, refreshed
for the current kernel design in favor of some invented linuxism.

I will write more information about contexts and interrupt handling.

> WRT to the table of "applicability", I'm not sure I like having it say
> 
> mutex(9)    yes    depends    depends
> 
> Can we maybe specify the dependency?  Perhaps
> 
> mutex(9)    yes    ???        spin-mutex only
> 
> 

Right, I will make it more clear.

I can rename softirq and hardirq to their full names, as these abbreviations
aren't used in NetBSD. What do you think?