Re: State of the Reiser4 FS
Avuton Olrich wrote:

On 3/15/06, Hans Reiser [EMAIL PROTECTED] wrote:

Avuton Olrich wrote:

I just saw a thread on the LKML a minute ago asking about the state of getting the patch into vanilla linux. I read Andrew Morton's post about a month ago stating that it could happen soon, but was unlikely due to there not actually being a need for it to go into mainline (no major distro default, etc...).

Can you supply a reference to this post? The only distro which is not influenced by performance numbers when selecting a filesystem is RedHat, and most of the rest are just waiting to be sure that politics will not kill reiser4 inclusion. I am sure I can come up with a "we will support it if you let it in" petition of distros if such a silliness is needed.

I was referring to this post: http://marc.theaimsgroup.com/?l=linux-kernel&m=113775878722100&w=2

Thanks for all the answers

-- avuton -- Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Oh, well, the overall tone of that email is not all that negative. We will work on the 4k at a time issue, overcome that issue technically, and then after that is resolved deal with generating desire for a filesystem that is 2x (reiser4.0) to 4x (4.1alpha with compression) faster.
Re: State of the Reiser4 FS
On 10:29 Wed 15 Mar, Hans Reiser wrote:

Tell the mosix guys we would be willing to cooperate with them regarding their problem.

If it was that easy... The problem for openMosix is that most devices fetch data in 4k blocks via copy_from_user(). For migrated processes, openMosix intercepts these calls and forwards them to the node which currently hosts the process. This forwarding yields a high latency penalty. Obviously there are two ways to get rid of this problem:

* modify _every_ Linux device driver to use a _a_lot_more_than_4k_at_a_time_ approach, or
* implement a second read-ahead buffer which fetches large blocks via the network in the background and answers calls to copy_from_user() directly from the local buffer

In my _very_ humble opinion the first approach would be much nicer, but after you guys had so much trouble with just your filesystem, I don't see that one coming, not at all. So I think the long term strategy for oM will be the second, double buffering approach. At least I couldn't think of any other realistic, feasible way.

BTW: how are you guys planning to solve this 4k issue? Will you revert to small blocks or will you pretend to perform 4k transfers and assemble those in the background to, again, process large chunks at once? If yes, wouldn't this seriously increase CPU usage due to (most likely) unnecessary data duplication?

Regards
-Andreas
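A minimal sketch of the second, double buffering approach from the list above, for illustration only. It is not openMosix code: `ReadAheadBuffer`, `fetch_remote` and the chunk size are hypothetical names and values, and the fetch here is synchronous rather than running in the background as the message proposes. The point is only that many 4k requests can be answered from one large network transfer.

```python
class ReadAheadBuffer:
    """Serve small reads from a locally cached large chunk."""

    def __init__(self, fetch_remote, chunk_size=1 << 20):
        # fetch_remote is a hypothetical callable: (offset, length) -> bytes.
        self.fetch_remote = fetch_remote
        self.chunk_size = chunk_size
        self.chunk_start = None
        self.chunk = b""
        self.remote_calls = 0  # counts expensive network round trips

    def _covers(self, offset):
        return (self.chunk_start is not None
                and self.chunk_start <= offset < self.chunk_start + len(self.chunk))

    def _fill(self, offset):
        # Fetch the whole chunk-aligned block that contains `offset`.
        self.chunk_start = offset - (offset % self.chunk_size)
        self.chunk = self.fetch_remote(self.chunk_start, self.chunk_size)
        self.remote_calls += 1

    def read(self, offset, length):
        out = bytearray()
        while length > 0:
            if not self._covers(offset):
                self._fill(offset)
            rel = offset - self.chunk_start
            piece = self.chunk[rel:rel + length]
            if not piece:
                break  # past the end of the remote data
            out += piece
            offset += len(piece)
            length -= len(piece)
        return bytes(out)


# Usage: four 4k reads over 16 KiB of data, served from 8 KiB chunks.
data = bytes(range(256)) * 64


def fetch_remote(offset, length):
    return data[offset:offset + length]


buf = ReadAheadBuffer(fetch_remote, chunk_size=8192)
pages = [buf.read(off, 4096) for off in range(0, len(data), 4096)]
```

With these numbers the four 4k reads cost two remote fetches instead of four; a larger chunk size reduces the round trips further, at the price of more local memory.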
Re: State of the Reiser4 FS
Jonathan Briggs wrote:

On Tue, 2006-03-14 at 23:14 -0800, Hans Reiser wrote:

[snip] They claim that if we don't use the ext3 code in our fs then they will be forced to shoulder an extra burden to maintain our code. We are not allowed to specify that they should not maintain our code at all. I need to read more Kafka I think, it is hard for me to understand it all.

Err, this actually does make a lot of sense Hans. The mainline Linux Kernel code is maintained by everyone that can convince Linus or a sub-maintainer to accept their patch. In order to

I am the reiserfs/reiser4 sub-maintainer. So, if reiser4 works well, and is faster than any other Linux FS, and it is, maintaining it over time is for me to worry about, not them.
Re: State of the Reiser4 FS
On Mar 15, 2006 20:27 +0100, Andreas Schäfer wrote:

If it was that easy... The problem for openMosix is that most devices fetch data in 4k blocks via copy_from_user(). For migrated processes, openMosix intercepts these calls and forwards them to the node which currently hosts the process. This forwarding yields a high latency penalty. Obviously there are two ways to get rid of this problem:

* modify _every_ Linux device driver to use a _a_lot_more_than_4k_at_a_time_ approach or
* implement a second read ahead buffer which fetches large blocks via the network in the background and answers calls to copy_from_user() directly from the local buffer

Or you can use a network filesystem like Lustre that handles this itself ;-). Sadly, though, it has to do both of these to get good performance, via {sub,per}version of the VFS/VM.

Clients do delayed-write (writeback cache, with write credits from the server to account for space) to avoid small RPCs. They also do large amounts of readahead (in large chunks) to improve reads for applications and the VM that breaks up all reads into 4kB chunks. Servers also do batch block allocation and then large direct writes instead of going through the VFS/VM. There are still a number of device drivers that break up bios into chunks smaller than 1MB, and that hurts performance.

Having a generic delayed/batch allocation mechanism is definitely the right way to go, and from my reading of linux-fsdevel this is underway by some folks at IBM. Since we have to support customers dating back to 2.4.21 it will be a while before we can move over to the newer APIs, once they are available.

BTW: how are you guys planning to solve this 4k issue? Will you revert to small blocks or will you pretend to perform 4k transfers and assemble those in the background to, again, process large chunks at once? If yes, wouldn't this seriously increase CPU usage due to (most likely) unnecessary data duplication?

It doesn't result in data duplication, per se, since the pages are copied into kernel space only once. What it does mean is that there needs to be a duplication of infrastructure in order to reassemble and track all of these pages.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
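To make the delayed-write idea concrete, here is a rough sketch of a client-side writeback cache that coalesces contiguous small writes into large RPCs. It is an illustration under stated assumptions, not Lustre's implementation: `WritebackCache`, `send_rpc` and the 16 KiB RPC size are hypothetical, and the server-granted write credits mentioned in the message are not modelled.

```python
class WritebackCache:
    """Coalesce contiguous small writes into large RPCs."""

    def __init__(self, send_rpc, rpc_size=1 << 20):
        # send_rpc is a hypothetical callable: (offset, data) -> None.
        self.send_rpc = send_rpc
        self.rpc_size = rpc_size
        self.start = 0              # file offset of the first pending byte
        self.pending = bytearray()  # dirty data not yet sent

    def write(self, offset, data):
        # A non-contiguous write ends the current run of dirty data.
        if self.pending and offset != self.start + len(self.pending):
            self.flush()
        if not self.pending:
            self.start = offset
        self.pending += data
        # Send full-sized RPCs as soon as enough data has accumulated.
        while len(self.pending) >= self.rpc_size:
            self.send_rpc(self.start, bytes(self.pending[:self.rpc_size]))
            del self.pending[:self.rpc_size]
            self.start += self.rpc_size

    def flush(self):
        # Push out whatever is left, even if smaller than one RPC.
        if self.pending:
            self.send_rpc(self.start, bytes(self.pending))
            self.pending.clear()


# Usage: nine 4k writes become three RPCs instead of nine.
rpcs = []
cache = WritebackCache(lambda off, d: rpcs.append((off, len(d))), rpc_size=16384)
for i in range(9):
    cache.write(i * 4096, b"x" * 4096)
cache.flush()
```

The extra bookkeeping (`start`, `pending`, the contiguity check) is a small-scale version of the "duplication of infrastructure" the message refers to: the data is held once, but something has to track and reassemble the pages.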
Re: State of the Reiser4 FS
On 12:43 Wed 15 Mar, Hans Reiser wrote:

I am the reiserfs/reiser4 sub-maintainer. So, if reiser4 works well, and is faster than any other Linux FS, and it is, maintaining it over time is for me to worry about, not them.

I feel this thread is about to trail off to shores we all know too well. AFAICS we do have two completely different issues here:

* The core maintainers want the whole code to adhere to certain standards. This doesn't have anything to do with performance etc. It's just for the fact that this standard is a sign of both reliability and maintainability (even for the unlikely case that Namesys would disappear)
* Reiser4 doesn't adhere to some of these standards because they don't make much sense from a performance (and design) point of view.

I think the short term solution should be to adapt Reiser4 to the standard, but in the long run keep bugging the Linux people to change some paradigms (as one of Linux' core advantages has always been the ability and willingness to throw decayed code overboard).

When you think about it, both POVs do make sense. It's just so sad this whole debate has become much more a political than a style debate.

-Andreas
Re: State of the Reiser4 FS
Hello

On Tue, 2006-03-14 at 02:41 -0800, Avuton Olrich wrote:

Hello, I just saw a thread on the LKML a minute ago asking about the state of getting the patch into vanilla linux. I read Andrew Morton's post about a month ago stating that it could happen soon, but was unlikely due to there not actually being a need for it to go into mainline (no major distro default, etc...).

I was wondering, myself, earlier in the day what the state of the patch was, if anything further has been said about getting it into mainline. Was also wondering if there was a lot of work going into it right now, or are people tied up doing other things? If anyone has time for an answer it'd be appreciated by everyone I'm sure,

AFAIK, the most recent reason why reiser4 does not get included is that reiser4 developers have to change reiser4 to use generic code to implement read/write. AFAICS, reiser4 developers do not work on that.
Re: State of the Reiser4 FS
Clemens Eisserer wrote:

AFAIK, the most recent reason why reiser4 does not get included is that reiser4 developers have to change reiser4 to use generic code to implement read/write. AFAICS, reiser4 developers do not work on that.

Has this really become a reason to not include reiser4 into mainline?

Yes, this is the official reason.

I also don't see a reason for that - at least it would bind reiser4 more closely to linux, making ports to other OS harder.

You are entirely correct. It is an interesting social phenomenon that we must do this, yes? Using the ext3 (err, generic) code makes it much harder to license and port reiser4.

Furthermore if it would decrease performance it's simply no way to go.

What we are currently doing is rewriting the reiser4 read and write code to not operate 4k at a time. The design specification was that it was supposed to do as much as possible once per write, and as little as possible every 4k. Unfortunately, when I reviewed our code the design specification had not been adhered to. After the reiser4 code adheres to the reiser4 design specification, it will be possible to argue that the reiser4 design specification is technically superior, and the generic code should change.

I generally believe that the per 4k approach used throughout the linux kernel is not as CPU efficient as sending larger groups of pages through the layers all at once. In other words, there is a reason we have bios, and we need to learn the lesson from them that they teach us, and abstract it into a general design approach.

We must make reiser4 adhere to the reiser4 design specification before we can deal with their demand that we change the generic code so that it does what reiser4 does. I have no desire to touch their code, but they require it. Generally speaking, they don't really like any feature existing in reiser4 that is not in their code, and ask that we add it to their code before reiser4 is allowed to have it. They call the ext3 code the generic code. They claim that if we don't use the ext3 code in our fs then they will be forced to shoulder an extra burden to maintain our code. We are not allowed to specify that they should not maintain our code at all. I need to read more Kafka I think, it is hard for me to understand it all.

(btw. I think this could be a way to generate some revenue - I think there is demand for a modern fs which is supported by both windows and linux).

There are so many ways to generate revenue by spending revenue I don't have in my pocket right now. Forgive me, yes, someday we should do that and will do that.

lg Clemens

If any of you users want to see a reiser4, you have to strenuously clamor for it to go into mainline, or you simply will not get it. Namesys cannot survive indefinitely with it not going into the kernel. This is a political issue, and viewing it as otherwise is simply naive. It is sad, I chose Linux over BSD to develop for because BSD used to be like this.

Hans
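The "as much as possible once per write, and as little as possible every 4k" goal described above can be illustrated with a toy operation count. This is a hedged sketch, not reiser4 or kernel code: `setup` stands in for whatever work could in principle be done once per write (the thread does not enumerate it), and both function names are hypothetical.

```python
from collections import Counter

PAGE_SIZE = 4096


def pages_for(nbytes):
    # Number of 4k pages needed to hold nbytes (rounded up).
    return -(-nbytes // PAGE_SIZE)


def write_per_page(nbytes, ops):
    # Per-4k style: the setup work is repeated for every page.
    for _ in range(pages_for(nbytes)):
        ops["setup"] += 1
        ops["copy"] += 1


def write_batched(nbytes, ops):
    # Once-per-write style: setup happens once, only the copy is per page.
    ops["setup"] += 1
    for _ in range(pages_for(nbytes)):
        ops["copy"] += 1


per_page, batched = Counter(), Counter()
write_per_page(1 << 20, per_page)
write_batched(1 << 20, batched)
```

For a 1 MiB write (256 pages) the per-page variant performs the setup step 256 times and the batched variant once, while both copy 256 pages. How much CPU time that saves in practice depends on how expensive the repeated step really is, which this sketch does not measure.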