Roland Mainz wrote:
...
delay the project proposal until it is clear that Sun actually
releases the patches for their work. Starting from scratch without
help from Sun will be much harder.
Just a quick note to say I haven't dropped this on the floor. Please give
me some time to look into this. Nobody else on the VM team wanted to help
look into this right now

Sounds a little bit bitter... ;-(
... is there any reason for that ?

Nope. The 64K project wasn't done by the VM team; it was actually done by the high-end platform group. Right now the other folks on the VM team (as well as myself, for that matter) are too busy fighting fires to spend any serious cycles on this, though we'd like to, and the folks who were involved in 64K simply don't have any interest in working on this in the open.

I managed to pull up a webrev yesterday and took a peek; there are about 7,000 lines of code changes spread over 122 files. The rest of the changes were committed over the course of Solaris 10 development as bugfixes, so they are already in place. The bulk of the changes are in UFS and in the vnode layer. There's also a big whack to the nexus layer since, as I mentioned, it also uses paged I/O.

The VOP_GETPAGE() and VOP_PUTPAGE() interfaces were added in the page cache unification which I believe (without looking) dates back to SunOS 4.x. The original vnode interface specification from srk was based on the buffer cache.

The major problem with these interfaces is that they expose PAGESIZE to filesystems. UFS in particular is problematic because (as I've said before) it assumes that PAGESIZE <= MAXBSIZE.
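To make the PAGESIZE <= MAXBSIZE assumption concrete, here is a minimal sketch of the arithmetic. The function names are made up for illustration and the sizes in the usage note are example values, not taken from any particular header.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: does one page fit inside one file system block?
 * This is the condition the text says UFS assumes (PAGESIZE <= MAXBSIZE).
 */
static int
page_fits_in_block(size_t pagesize, size_t bsize)
{
	return (pagesize <= bsize);
}

/*
 * Hypothetical helper: how many file system blocks one page spans,
 * assuming the page starts on a block boundary. The result is 1
 * whenever the page fits in a block, and greater than 1 once the
 * page is larger than the block.
 */
static size_t
blocks_spanned_by_page(size_t pagesize, size_t bsize)
{
	return ((pagesize + bsize - 1) / bsize);
}
```

With an 8K block size as an example, an 8K page spans one block, while a 64K page spans eight. Code written on the assumption that a page never crosses a block boundary then has to deal with partial pages.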

The 64K project modified VOP_GETPAGE() and VOP_PUTPAGE() as well as the direct I/O code to support partial page I/O. It's a pile of code, probably 80% or more of the changes.

Since extensive file system changes are necessary one way or the other to pull this off, others and I would prefer to take the tack (since we have to go there eventually ANYWAY) of blowing up VOP_*PAGE() and introducing a new VOP_IO() interface which does not depend on PAGESIZE at all. Instead of a page_t, it would use a base/bounds pair attached to a different data structure which is associated with the I/O transaction.
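A rough sketch of what a base/bounds I/O descriptor could look like follows. Nothing here is real Solaris code: io_req_t, its fields, and io_req_split() are hypothetical names, and treating the bound as "one past the last byte" is an assumption. The only point is that the file system can carve up the request by its own block size without PAGESIZE appearing anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical I/O descriptor: a byte range plus a data buffer. */
typedef struct io_req {
	uint64_t ior_base;	/* first byte offset in the file */
	uint64_t ior_bound;	/* one past the last byte (assumed) */
	char *ior_buf;		/* data for [ior_base, ior_bound) */
} io_req_t;

/*
 * Split a request into pieces that do not cross file system block
 * boundaries. Writes at most max_out pieces into out[] and returns
 * the number written. Only the file system's own block size is used;
 * no page size is involved.
 */
static size_t
io_req_split(const io_req_t *req, uint64_t bsize, io_req_t *out,
    size_t max_out)
{
	size_t n = 0;
	uint64_t start = req->ior_base;

	if (bsize == 0)
		return (0);

	while (start < req->ior_bound && n < max_out) {
		/* End of the block containing 'start'. */
		uint64_t end = (start / bsize + 1) * bsize;

		if (end > req->ior_bound)
			end = req->ior_bound;

		out[n].ior_base = start;
		out[n].ior_bound = end;
		out[n].ior_buf = (req->ior_buf == NULL) ? NULL :
		    req->ior_buf + (start - req->ior_base);
		n++;
		start = end;
	}
	return (n);
}
```

For example, a request covering bytes 1000 through 19999 with an 8K block size comes back as three pieces: [1000, 8192), [8192, 16384) and [16384, 20000).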

To do this correctly requires revisiting the layering. Currently, if I call read() or write(), say, the system call goes directly into the VOP() call, and the file system goes into the VM layer, which goes BACK into the file system with further VOP() calls. If everything went through the VM first, and then into the filesystem, then things like PAGESIZE would be completely hidden from the filesystem, and we could change the base pagesize to whatever we wish, or even make it different for different processes!

and I'm buried in other stuff through next week.

Ok...

As you can see from the discussion thread, I think there is a lot more to this than throwing the code over the wall; there is also deciding whether the previous approach was the right one, and enumerating other possible angles of attack. Not initiating the project without the code seems to me to be putting the cart before the horse, because it implicitly assumes that their approach was the best approach to solving the problem or was even tractable.

- Eric
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss@opensolaris.org
