Roland Mainz wrote:
...
delay the project proposal until it is clear that Sun actually
releases the patches for their work. Starting from scratch without
help from Sun will be much harder.
Just a quick note to say I haven't dropped this on the floor. Please give
me some time to look into this. Nobody else on the VM team wanted to help
look into this right now
Sounds a little bit bitter... ;-(
... is there any reason for that ?
Nope. The 64K project wasn't done by the VM team, it was actually done by
the high-end platform group. Right now the other folks on the VM team (as
well as myself for that matter) are too busy fighting fires to spend any
serious cycles on this, though we'd like to, and the folks who were
involved in 64K simply don't have any interest in working on this in the open.
I managed to pull up a webrev yesterday and took a peek; there are about
7,000 lines of code change spread over 122 files. The rest of the changes
were committed over the course of Solaris 10 development as bugfixes so
they are already in place. The bulk of the changes are in UFS and in the
vnode layer. There's also a big whack to the nexus layer since, as I
mentioned, it also uses paged I/O.
The VOP_GETPAGE() and VOP_PUTPAGE() interfaces were added in the page
cache unification which I believe (without looking) dates back to SunOS
4.x. The original vnode interface specification from srk was based on the
buffer cache.
The major problem with these interfaces is that they expose PAGESIZE to
filesystems. UFS in particular is problematic because (as I've said before)
it assumes that PAGESIZE <= MAXBSIZE.
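
To make that assumption concrete, here is a minimal sketch in C. It is not
taken from the UFS source: the helper name blocks_per_page() and the
SKETCH_MAXBSIZE constant are made up for illustration, and the 8K value is
an assumption based on the usual MAXBSIZE definition.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value: largest filesystem block size UFS handles (8K). */
#define SKETCH_MAXBSIZE 8192u

/*
 * Hypothetical helper: number of maximum-size filesystem blocks needed
 * to back a single page. Code that assumes PAGESIZE <= MAXBSIZE is in
 * effect assuming this never exceeds 1.
 */
static size_t
blocks_per_page(size_t pagesize)
{
	assert(pagesize > 0);
	return ((pagesize + SKETCH_MAXBSIZE - 1) / SKETCH_MAXBSIZE);
}
```

With a 4K or 8K page the helper returns 1, so a page never spans more than
one block. With a 64K page it returns 8, which is the case the existing
code paths were not written for.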
The 64K project modified VOP_GETPAGE() and VOP_PUTPAGE() as well as the
direct I/O code to support partial page I/O. It's a pile of code, probably
80% or more of the changes.
Since extensive file system changes are necessary one way or the other to
pull this off, others and I would prefer to take the tack (since we
have to go there eventually ANYWAY) of blowing up VOP_*PAGE(), and
introducing a new VOP_IO() interface which does not depend on PAGESIZE at
all. Instead of a page_t, it would use a base/bounds pair attached to a
different data structure which is associated with the I/O transaction.
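
As a rough illustration of the base/bounds idea, a sketch in C follows.
Nothing here is an existing interface: io_range_t, its field names, and
io_range_next() are all hypothetical, and a real VOP_IO() design would
carry much more state. The point is only that the filesystem can walk a
byte range using its own block size without ever seeing PAGESIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical I/O transaction descriptor: a byte range, no page_t. */
typedef struct io_range {
	uint64_t ior_base;	/* starting byte offset in the file */
	uint64_t ior_bound;	/* one past the last byte (exclusive) */
} io_range_t;

/*
 * Hypothetical filesystem-side helper: carve the next piece out of the
 * range so that it never crosses a filesystem block boundary. Stores
 * the piece's offset in *offp and returns its length, or returns 0 when
 * the range is exhausted. Only the filesystem's own block size appears
 * here, not PAGESIZE.
 */
static uint64_t
io_range_next(io_range_t *r, uint64_t fs_bsize, uint64_t *offp)
{
	uint64_t left, room, len;

	assert(fs_bsize > 0 && r->ior_base <= r->ior_bound);
	left = r->ior_bound - r->ior_base;
	if (left == 0)
		return (0);
	/* Bytes remaining before the next block boundary. */
	room = fs_bsize - (r->ior_base % fs_bsize);
	len = (left < room) ? left : room;
	*offp = r->ior_base;
	r->ior_base += len;
	return (len);
}
```

For a range covering bytes 1000 through 19999 and an 8K block size, the
helper hands back pieces of 7192, 8192, and 3616 bytes, then 0.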
To do this correctly requires revisiting the layering. Currently if I call
read() or write(), say, the system call goes directly into the VOP() call,
and the file system goes into the VM layer which goes BACK into the file
system with further VOP() calls. If everything went through the VM first,
and then into the filesystem, then things like PAGESIZE would be
completely hidden from the filesystem, and we can change the base pagesize
to whatever we wish, or even make it different for different processes!
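
A toy sketch of that layering, in C, might look like the following. All of
the names (vm_read(), fs_io_fn, fs_record(), fs_log_t) are hypothetical and
the real call path would be far more involved; the sketch only shows the
VM layer owning the page size and the filesystem being handed plain byte
ranges.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical filesystem entry point: sees byte ranges only. */
typedef void (*fs_io_fn)(uint64_t off, uint64_t len, void *arg);

/*
 * Hypothetical VM-layer entry point for read(). The VM rounds the
 * request out to whole pages of whatever size this caller uses (must
 * be a power of two), then asks the filesystem for that byte range.
 */
static void
vm_read(uint64_t pagesize, uint64_t off, uint64_t len, fs_io_fn fs,
    void *arg)
{
	uint64_t mask, start, end;

	assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
	if (len == 0)
		return;
	mask = pagesize - 1;
	start = off & ~mask;			/* round down to page start */
	end = (off + len + mask) & ~mask;	/* round up to page end */
	fs(start, end - start, arg);
}

/* Example filesystem callback: records the range it was asked for. */
typedef struct fs_log {
	uint64_t off;
	uint64_t len;
} fs_log_t;

static void
fs_record(uint64_t off, uint64_t len, void *arg)
{
	fs_log_t *l = arg;

	l->off = off;
	l->len = len;
}
```

Calling vm_read() with the same callback and a page size of 8K or 64K
changes only the range the filesystem is asked for; the filesystem code
itself is untouched, which is what would let the page size vary per
process.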
and I'm buried in other stuff through next week.
Ok...
As you can see from the discussion thread, I think there is a lot more to
this than throwing the code over the wall: we also need to decide whether
the previous approach was the right one, and enumerate other possible
angles of attack. Refusing to initiate the project without the code seems
to me to be putting the cart before the horse, because it implicitly
assumes that their approach was the best way to solve the problem, or was
even tractable.
- Eric
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org