Andrew J Caines wrote:
> > I'm currently developing a jail-management solution - I use a
> > readonly mount_null for central software-management of the jails.
> > The manpage was written May 1, 1995 - is using this tool still dangerous
> 
> I have used it for read-only mounts since way back and have not had any
> problems, including brief periods of high I/O.
> 
> I'd have reservations allowing unique data on a read-write mount, however
> I just did a few quick and simple tests of reads and writes on a rw null
> mount on my 4.8-RC box with no apparent problem.

R/O is fine.

R/W is a problem because there are explicit coherency problems
when stacking vnodes.  That's because each vnode has an associated
"struct vm_object *v_object" which is the backing store object.

Because of this, when you stack vnodes it's possible, as a
result of mmap'ed I/O, that the top object in the stack will not
have the same data as the backing object in the underlying FS.
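
To make the failure mode concrete, here is a toy user-space model,
not kernel code. The toy_* names and the one-page "object" are
invented for illustration; the only thing taken from the real
system is the idea that each stacked vnode has its own backing
object, so a store through a mapping of the upper vnode need not
reach the lower one.

```c
#include <assert.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical stand-in for a vm_object: one cached page of file data. */
struct toy_object {
    char page[TOY_PAGE_SIZE];
    int  valid;               /* nonzero once the page has been filled */
};

/* Hypothetical stand-in for a vnode; lower is NULL for the bottom layer. */
struct toy_vnode {
    struct toy_object obj;    /* this layer's own backing object */
    struct toy_vnode *lower;  /* vnode this layer is stacked on */
};

/* Fault the page in: on first use, fill this layer's cache from below. */
static char *toy_fault(struct toy_vnode *vp)
{
    if (!vp->obj.valid && vp->lower != NULL) {
        memcpy(vp->obj.page, toy_fault(vp->lower), TOY_PAGE_SIZE);
        vp->obj.valid = 1;
    }
    return vp->obj.page;
}

/* An mmap-style store: dirties only the object of the mapped vnode. */
static void toy_mmap_store(struct toy_vnode *vp, size_t off, char c)
{
    assert(off < TOY_PAGE_SIZE);
    toy_fault(vp)[off] = c;
}

/* Nonzero when the upper layer and the layer below hold the same data. */
static int toy_coherent(struct toy_vnode *upper)
{
    assert(upper->lower != NULL);
    return memcmp(toy_fault(upper), toy_fault(upper->lower),
                  TOY_PAGE_SIZE) == 0;
}
```

In this model the two layers agree right after the first fault, and
stop agreeing after a single toy_mmap_store() on the upper vnode,
because nothing copies the change down.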

The nullfs code tries to avoid this (see null_getvobject()), but
there are certain places where a non-unified VM and buffer cache
implementation would previously have enforced explicit
coherency.  For this to work, you effectively need to
put back the explicit coherency cycles that were removed in
the VM and buffer cache unification.  Actually, it was this set
of changes that made LFS stop working on FreeBSD, as well.

One place where this is obviously problematic is the first call
to VOP_GETVOBJECT() in vinvalbuf() in /sys/kern/vfs_subr.c (see
the "XXX" block comment before the "do" loop).

Basically, to clean this up, you would need to implement both
getpages() and putpages() so that they use the read/write path
and do explicit copies between the upper and lower objects.
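
As a rough sketch of that shape, again as a toy user-space model
and not the kernel's VOP_GETPAGES()/VOP_PUTPAGES() interface: the
toy_* names and the single fixed-size page are assumptions made
for illustration. The point is only that the upper layer fills and
flushes its page by going through the lower layer's read and write
routines, copying explicitly each time.

```c
#include <assert.h>
#include <string.h>

#define TOY_PAGE_SIZE 16

/* Hypothetical layer: one cached page plus a pointer to the layer below. */
struct toy_layer {
    char page[TOY_PAGE_SIZE];   /* this layer's copy of the page */
    struct toy_layer *lower;    /* NULL for the bottom file system */
};

/* Read path of a layer: copy its page out to the caller's buffer. */
static void toy_read(struct toy_layer *lp, char *buf)
{
    assert(lp != NULL);
    memcpy(buf, lp->page, TOY_PAGE_SIZE);
}

/* Write path of a layer: copy the caller's buffer into its page. */
static void toy_write(struct toy_layer *lp, const char *buf)
{
    assert(lp != NULL);
    memcpy(lp->page, buf, TOY_PAGE_SIZE);
}

/* getpages for the stacking layer: refill the upper page by an
 * explicit copy through the lower layer's read path. */
static void toy_getpages(struct toy_layer *up)
{
    toy_read(up->lower, up->page);
}

/* putpages for the stacking layer: push the upper page down by an
 * explicit copy through the lower layer's write path. */
static void toy_putpages(struct toy_layer *up)
{
    toy_write(up->lower, up->page);
}
```

With this arrangement a store to the upper page reaches the lower
layer only when toy_putpages() runs, and a change made below shows
up above only after the next toy_getpages(). Those two calls are
the explicit coherency points.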

Technically, you'd think that VOP_GETVOBJECT() would be
enough to take care of this.  That is almost true for a
linear mapping of upper pages to lower pages, but definitely
not true for a translation mapping, an offset mapping, or a
scatter mapping.  Even in the linear case, there are still
explicit references to vp->v_object in various places (e.g.
vlrureclaim() and vop_stdcreatevobject()) that should instead
be calling VOP_GETVOBJECT().
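
The difference between the two access styles can be shown with a
small user-space model. It borrows the field name v_object from
the real struct vnode, but everything else (the toy_* names, the
function-pointer layout, the "generation" field) is invented for
illustration and does not match the kernel's actual
VOP_GETVOBJECT() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object; generation just lets us tell objects apart. */
struct toy_object {
    int generation;
};

/* Hypothetical vnode with a per-layer "get object" operation. */
struct toy_vnode {
    struct toy_object *v_object;   /* this vnode's own object */
    struct toy_vnode *lower;       /* NULL for the bottom layer */
    struct toy_object *(*getvobject)(struct toy_vnode *);
};

/* Default operation for an ordinary FS: the vnode's own object. */
static struct toy_object *toy_std_getvobject(struct toy_vnode *vp)
{
    return vp->v_object;
}

/* Operation for a null-style layer: forward to the layer below. */
static struct toy_object *toy_null_getvobject(struct toy_vnode *vp)
{
    assert(vp->lower != NULL);
    return vp->lower->getvobject(vp->lower);
}

/* What callers should use instead of reading vp->v_object directly. */
static struct toy_object *toy_getvobject(struct toy_vnode *vp)
{
    return vp->getvobject(vp);
}
```

A caller that goes through toy_getvobject() on the upper vnode
always lands on the bottom layer's object; a caller that reads
upper.v_object directly gets whatever the upper layer happens to
hold, which is the case the text above is complaining about.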

As long as you don't do R/W, though, read-through coherency
is pretty much guaranteed, provided the underlying FS that
is being mounted over is also R/O.  The second condition matters
because there are no notifications up the stack for changes to
the underlying FS; any cached data in the upper layer v_object,
if referenced directly by one of those routines instead of
through the underlying v_object, could therefore be stale.

-- Terry
