Module Name:    src
Committed By:   dholland
Date:           Fri Nov 20 07:20:21 UTC 2015

Modified Files:
        src/doc/roadmaps: storage

Log Message:
Update the storage roadmap. Please review/comment...


To generate a diff of this commit:
cvs rdiff -u -r1.9 -r1.10 src/doc/roadmaps/storage

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.

Modified files:

Index: src/doc/roadmaps/storage
diff -u src/doc/roadmaps/storage:1.9 src/doc/roadmaps/storage:1.10
--- src/doc/roadmaps/storage:1.9	Sat Jan 14 22:06:16 2012
+++ src/doc/roadmaps/storage	Fri Nov 20 07:20:21 2015
@@ -1,174 +1,358 @@
-$NetBSD: storage,v 1.9 2012/01/14 22:06:16 agc Exp $
+$NetBSD: storage,v 1.10 2015/11/20 07:20:21 dholland Exp $
 
 NetBSD Storage Roadmap
 ======================
 
 This is a small roadmap document, and deals with the storage and file
-systems side of the operating system.
+systems side of the operating system. It discusses elements, projects,
+and goals that are under development or under discussion; and it is
+divided into three categories based on perceived priority.
+
+The following elements, projects, and goals are considered strategic
+priorities for the project:
+
+ 1. Improving iscsi
+ 2. nfsv4 support
+ 3. A better journaling file system solution
+ 4. Getting zfs working for real
+ 5. Seamless full-disk encryption
+
+The following elements, projects, and goals are not strategic
+priorities but are still important undertakings worth doing:
+
+ 6. lfs64
+ 7. Per-process namespaces
+ 8. lvm tidyup
+ 9. Flash translation layer
+ 10. Shingled disk support
+ 11. ext3/ext4 support
+ 12. Port hammer from Dragonfly
+ 13. afs maintenance
+ 14. execute-in-place
+
+The following elements, projects, and goals are perhaps less pressing;
+this doesn't mean one shouldn't work on them but the expected payoff
+is perhaps less than for other things:
 
-The following elements and projects are pencilled in for 6.0, but
-please do not rely on them being there.
+ 15. coda maintenance
 
-Features that will be in 6.0:
-2. logical volume management
-3. a native port of Sun's ZFS
-4. ReFUSE, perfuse and pud
-6. Support for flash devices - NAND, and flash file system
-7. rump extensions
-9. in-kernel iSCSI initiator
-10. RAIDframe parity map
-11. quota system re-work
 
-Features that are planned for future releases:
-1. devfs/udevfsd
-5. web-based management tools for storage subsystems
-8. virtualised disks in userland
-12. lfs renovation
+Explanations
+============
 
-We'll continue to update this roadmap as features and dates get firmed up.
-
-Some explanations
-=================
-
-1. udevfsd
-----------
-
-There has always been discussion over devfs, and experience with it
-seems mixed (to be kind). At the same time, carrying around a whole
-populated /dev seems quite possible and effective, but maybe a bit
-unwieldy. jmcneill's udevfsd addresses this in a different way, and
-is currently in othersrc/external/bsd/udevfsd. Not planned for 6.0
-right now.
-
-Responsible: jmcneill
-
-2. Logical Volume Management
-----------------------------
-
-Based on the Linux lvm2 and devmapper software, with a new kernel component
-for NetBSD written. Merged in 5.99.5 sources, will be in 6.0.
-
-Responsible: haad, martin
-
-3. Native port of Sun's ZFS
----------------------------
-
-Two Summer of Code projects have been held, concentrating on the
-provision of ZFS support for NetBSD.  Mostly completed by haad, and
-building on ver's work, this is the port of Sun's ZFS, with
-modifications to make it compile on NetBSD by ad@, and based on the
-Sun code for the block layer. Discussions are still taking place to
-get the design right for support for the openat(2) system call family,
-and the correct architecture for reclaiming vnodes.
-
-The ZFS source code has been committed to the repository.
-
-Responsible: haad, ad, ver
-
-4. ReFUSE, perfuse and pud
---------------------------
-
-FUSE has two interfaces, the normal high-level one, and a lower-level
-interface which is closer to the way standard file systems operate. 
-manu's perfuse adds the low-level functionality in the same way that
-ReFUSE adds the high-level functionality.  In addition, there is the
-"pass to userspace device" framework added by pooka as part of rump. 
-All 3 will be in 6.0.
-
-Responsible: pooka, manu, agc
-
-5. Web-based Management tools for Storage Subsystems
-----------------------------------------------------
-
-Standard tools for managing the storage subsystems that NetBSD
-provides, using a standard web-server as the basic user interface on
-the storage device, allowing remote management by a standard web
-browser.  CIM and related functionality are interesting datapoints in
-this space, although credentials and authentication are always
-challenges in this space. Will not make it into 6.0
-
-Responsible: agc
-
-6. Support for flash devices - NAND, and flash file system
-----------------------------------------------------------
-
-ahoka has contributed many things in this area, including
-flash(4), flash(9), flashctl(8) and nand(9).  In addition, the
-University of Szeged has contributed chfs,
-http://en.wikipedia.org/wiki/CHFS, which is described as "the first
-open source flash specific file system written for NetBSD".  All of
-these will be in 6.0.
+1. Improving iscsi
+------------------
 
-Responsible: ahoka
+Both the existing iscsi target and initiator are fairly bad code, and
+neither works terribly well. Fixing this is fairly important as iscsi
+is where it's at for remote block devices. Note that there appears to
+be no compelling reason to move the target to the kernel or otherwise
+make major architectural changes.
+
+ - As of November 2015 nobody is known to be working on this.
+ - There is currently no clear timeframe or release target.
+ - Contact agc for further information.
+
+
+2. nfsv4 support
+----------------
+
+nfsv4 is at this point the de facto standard for FS-level (as opposed
+to block-level) network volumes in production settings. The legacy nfs
+code currently in NetBSD only supports nfsv2 and nfsv3.
+
+The intended plan is to port FreeBSD's nfsv4 code, which also includes
+nfsv2 and nfsv3 support, and eventually transition to it completely,
+dropping our current nfs code. (Which is kind of a mess.) So far the
+only step that has been taken is to import the code from FreeBSD. The
+next step is to update that import (since it was done a while ago now)
+and then work on getting it to configure and compile.
+
+ - As of November 2015 nobody is working on this, and a volunteer to
+   take charge is urgently needed.
+ - There is no clear timeframe or release target, although having an
+   experimental version ready for -8 would be great.
+ - Contact dholland for further information.
+
+
+3. A better journaling file system solution
+-------------------------------------------
+
+WAPBL, the journaling FFS that NetBSD rolled out some time back, has a
+critical problem: it does not address the historic ffs behavior of
+allowing stale on-disk data to leak into user files in crashes. And
+because it runs faster, this happens more often and with more data.
+This situation is both a correctness and a security liability. Fixing
+it has turned out to be difficult. It is not really clear what the
+best option at this point is:
+
++ Fixing WAPBL (e.g. to flush newly allocated/newly written blocks to
+disk early) has been examined by several people who know the code base
+and judged difficult. Still, it might be the best way forward.
+
++ There is another journaling FFS; the Harvard one done by Margo
+Seltzer's group some years back. We have a copy of this, but as it was
+written in BSD/OS circa 1999 it needs a lot of merging, and then will
+undoubtedly also need a certain amount of polishing to be ready for
+production use. It does record-based rather than block-based
+journaling and does not share the stale data problem.
+
++ We could bring back softupdates (in the softupdates-with-journaling
+form found today in FreeBSD) -- this code is even more complicated
+than the softupdates code we removed back in 2009, and it's not clear
+that it's any more robust either. However, it would solve the stale
+data problem if someone wanted to port it over. It isn't clear that
+this would be any less work than getting the Harvard journaling FFS
+running... or than writing a whole new file system either.
+
++ We could write a whole new journaling file system. (That is, not
+FFS. Doing a new journaling FFS implementation is probably not
+sensible relative to merging the Harvard journaling FFS.) This is a
+big project.
+
+Right now it is not clear which of these avenues is the best way
+forward. Given the general manpower shortage, it may be that the best
+way is whatever looks best to someone who wants to work on the
+problem.
+
+ - As of November 2015 nobody is working on fixing WAPBL. There has
+   been some interest in the Harvard journaling FFS but no significant
+   progress. Nobody is known to be working on or particularly
+   interested in porting softupdates-with-journaling. And, while
+   dholland has been mumbling for some time about a plan for a
+   specific new file system to solve this problem, there isn't any
+   realistic prospect of significant progress on that in the
+   foreseeable future, and nobody else is known to have or be working
+   on even that much.
+ - There is no clear timeframe or release target; but given that WAPBL
+   has been disabled by default for new installs in -7 this problem
+   can reasonably be said to have become critical.
+ - Contact joerg or martin regarding WAPBL; contact dholland regarding
+   the Harvard journaling FFS.
+
+
+4. Getting zfs working for real
+-------------------------------
+
+ZFS has been almost working for years now. It is high time we got it
+really working. One of the things this entails is updating the ZFS
+code, as what we have is rather old. The Illumos version is probably
+what we want for this.
+
+ - There has been intermittent work on zfs, but as of November 2015
+   nobody is known to be actively working on it.
+ - There is no clear timeframe or release target.
+ - Contact riastradh or ?? for further information.
 
-7. RUMP Extensions
-------------------
 
-Rump support has been in NetBSD for 2 releases now, and continues to be
-developed actively. Recent additions have included cgd support, and smbfs
-client support.
+5. Seamless full-disk encryption
+--------------------------------
 
-Responsible: pooka
+(This is only sort of a storage issue.) We have cgd, and it is
+believed to still be cryptographically suitable, at least for the time
+being. However, we don't have any of the following things:
 
++ An easy way to install a machine with full-disk encryption. It
+should really just be a checkbox item in sysinst, or not much more
+than that.
 
-8. Virtualised disks in Userland
---------------------------------
++ Ideally, also an easy way to turn on full-disk encryption for a
+machine that's already been installed, though this is harder.
 
-For better support of virtualization, a library which provides a consistent 
-view of virtualized disk images has been developed by jmcneill. This will
-not make it into 6.0, although some extra functionality for reading vmdk
-images is available in othersrc/external/bsd/vmdk.
++ A good story for booting off a disk that is otherwise encrypted;
+obviously one cannot encrypt the bootblocks, but it isn't clear where
+in boot the encrypted volume should take over, or how to make a best
+effort at protecting the unencrypted elements needed to boot. (At
+least, in the absence of something like UEFI secure boot combined with
+a cryptographic oracle to sign your bootloader image so UEFI will
+accept it.) There's also the question of how one runs cgdconfig(8) and
+where the cgdconfig binary comes from.
 
-Responsible: jmcneill, agc
++ A reasonable way to handle volume passphrases. MacOS apparently uses
+login passwords for this (or as passphrases for secondary keys, or
+something) and this seems to work well enough apart from the somewhat
+surreal experience of sometimes having to log in twice. However, it
+will complicate the bootup story.
 
+Given the increasing regulatory-level importance of full-disk
+encryption, this is at least a de facto requirement for using NetBSD
+on laptops in many circumstances.
 
-9. In-kernel iSCSI Initiator
-----------------------------
+ - As of November 2015 nobody is known to be working on this.
+ - There is no clear timeframe or release target.
+ - Contact dholland for further information.
 
-NetBSD has had a userland implementation of an iSCSI initiator since
-NetBSD 4.99.35, based on ReFUSE.  Wasabi Systems kindly contributed their
-kernel-based iSCSI initiator, and it will be in 6.0.
 
-Responsible: riz, agc
+6. lfs64
+--------
 
+LFS currently only supports volumes up to 2 TB. As LFS is of interest
+for use on shingled disks (which are larger than 2 TB) and also for
+use on disk arrays (ditto) this is something of a problem. A 64-bit
+version of LFS for large volumes is in the works.
 
-10. RAIDframe parity map
-------------------------
+ - As of November 2015 dholland is working on this.
+ - It is close to being ready for at least experimental use and is
+   expected to be in 8.0.
+ - Responsible: dholland
 
-Jed Davis successfully completed a Summer of Code project to implement
-parity map zones for RAIDframe.  Parity mapping drastically reduces
-the amount of time spent rewriting parity after an unclean shutdown by
-keeping better track of which regions might have had outstanding
-writes.  Enabled by default; can be disabled on a per-set basis, or
-tuned, with the new raidctl(8) commands.
 
-Merged in 5.99.22 sources, and will be in 6.0.  A separate set of
-patches is available for NetBSD-5.
+7. Per-process namespaces
+-------------------------
 
-Responsible: jld
+Support for per-process variation of the file system namespace enables
+a number of things; more flexible chroots, for example, and also
+potentially more efficient pkgsrc builds. dholland thought up a
+somewhat hackish but low-footprint way to implement this.
 
+ - As of November 2015 dholland is working on this.
+ - It is scheduled to be in 8.0.
+ - Responsible: dholland
 
-11. quota system re-work
-------------------------
 
-The quota system has been re-worked by bouyer, and is in -current
-right now.  dholland is updating and modifying this rework so that it
-is a more generalised solution, with better features for security. 
-This is expected to be in 6.0, although there is a lot of work to
-complete.
+8. lvm tidyup
+-------------
 
-Responsible: bouyer, dholland
+[agc says someone should look at our lvm stuff; XXX fill this in]
 
+ - As of November 2015 nobody is known to be working on this.
+ - There is no clear timeframe or release target.
+ - Contact agc for further information.
 
-12. LFS renovation
-------------------
 
-LFS had been de-emphasised in the time period leading up to the
-5.0 release, but is undergoing some rework by perseant, and dholland
-has some contributions in this area too.
+9. Flash translation layer
+--------------------------
 
-Responsible: perseant, dholland
+SSDs ship with firmware called a "flash translation layer" that
+arbitrates between the block device that software expects to see and
+the raw flash chips. FTLs handle wear leveling, lifetime management,
+and also internal caching, striping, and other performance concerns.
+While NetBSD has a file system for raw flash (chfs), it seems that,
+given the things NetBSD is often used for, it ought to come with a
+flash translation layer as well.
+
+Note that this is an area where writing your own is probably a bad
+plan; it is a complicated area with a lot of prior art that's also
+reportedly full of patent mines. There are a couple of open FTL
+implementations that we might be able to import.
+
+ - As of November 2015 nobody is known to be working on this.
+ - There is no clear timeframe or release target.
+ - Contact dholland for further information.
+
+
+10. Shingled disk support
+-------------------------
+
+Shingled disks (or more technically, disks with "shingled magnetic
+recording" or SMR) can only write whole tracks at once. Thus, to
+operate effectively they require translation support similar to the
+flash translation layers found in SSDs. The nature and structure of
+shingle translation layers is still being researched; however, at some
+point we will want to support these things in NetBSD.
+
+ - As of November 2015 one of dholland's coworkers is looking at this.
+ - There is no clear timeframe or release target.
+ - Contact dholland for further information.
+
+
+11. ext3/ext4 support
+---------------------
+
+We would like to be able to read and write Linux ext3fs and ext4fs
+volumes. (We can already read clean ext3fs volumes as they're the same
+as ext2fs, modulo volume features our ext2fs code does not support;
+but we can't write them.)
+
+Ideally someone would write ext3 and/or ext4 code, whether integrated
+with or separate from the ext2 code we already have. It might also
+make sense to port or wrap the Linux ext3 or ext4 code so it can be
+loaded as a GPL'd kernel module; it isn't clear if that would be more
+or less work than doing an implementation.
+
+Note however that implementing ext3 has already defeated several
+people; this is a harder project than it looks.
+
+ - As of November 2015 nobody is known to be working on this.
+ - There is no clear timeframe or release target.
+ - Contact ?? for further information.
+
+
+12. Port hammer from Dragonfly
+------------------------------
+
+While the motivation for and role of hammer aren't perhaps super
+persuasive, it would still be good to have it. Porting it from
+Dragonfly is probably not that painful (compared to, say, zfs) but as
+the Dragonfly and NetBSD VFS layers have diverged in different
+directions from the original 4.4BSD, it may not be entirely trivial
+either.
+
+ - As of November 2015 nobody is known to be working on this.
+ - There is no clear timeframe or release target.
+ - There probably isn't any particular person to contact; for VFS
+   concerns contact dholland or hannken.
+
+
+13. afs maintenance
+-------------------
+
+AFS needs periodic care and feeding to continue working as NetBSD
+changes, because the kernel-level bits aren't kept in the NetBSD tree
+and don't get updated with other things. This is an ongoing issue that
+always seems to need more manpower than it gets. It might make sense
+to import some of the kernel AFS code, or maybe even just some of the
+glue layer that it uses, in order to keep it more current.
+
+ - jakllsch sometimes works on this.
+ - We would like every release to have working AFS by the time it's
+   released.
+ - Contact jakllsch or gendalia about AFS; for VFS concerns contact
+   dholland or hannken.
+
+
+14. execute-in-place
+--------------------
+
+It is likely that the future includes non-volatile storage (so-called
+"nvram") that looks like RAM from the perspective of software. Most
+importantly: the storage is memory-mapped rather than looking like a
+disk controller. There are a number of things NetBSD ought to have to
+be ready for this, of which probably the most important is
+"execute-in-place": when an executable is run from such storage, and
+mapped into user memory with mmap, the storage hardware pages should
+be able to appear directly in user memory. Right now they get
+gratuitously copied into RAM, which is slow and wasteful. There are
+also other reasons (e.g. embedded device ROMs) to want execute-in-
+place support.
+
+Note that at the implementation level this is a UVM issue rather than
+strictly a storage issue.
+
+Also note that one does not need access to nvram hardware to work on
+this issue; given the performance profiles touted for nvram
+technologies, a plain RAM disk like md(4) is sufficient both
+structurally and for performance analysis.
+
+ - As of November 2015 nobody is known to be working on this. Some
+   time back, uebayasi wrote some preliminary patches, but they were
+   rejected by the UVM maintainers.
+ - There is no clear timeframe or release target.
+ - Contact dholland for further information.
+
+
+15. coda maintenance
+--------------------
+
+Coda only sort of works. [And I think it's behind relative to
+upstream, or something of the sort; XXX fill this in.] Also the code
+appears to have an ugly incestuous relationship with FFS. This should
+really be cleaned up. That or maybe it's time to remove Coda.
+
+ - As of November 2015 nobody is known to be working on this.
+ - There is no clear timeframe or release target.
+ - There isn't anyone in particular to contact.
 
 
 Alistair Crooks, David Holland
-Sat Jan 14 05:52:37 PST 2012
+Fri Nov 20 02:17:53 EST 2015
