Re: creating live virtual files by concatenation
On Sat, 25 Feb 2006, Maciej Soltysiak wrote:

> Code files, DNS zones, configuration files, HTML code. We are still
> dealing with lots of text files today.

You say it like it's a bad thing, but in truth I suspect people often
deal with text files because they're EASY to manipulate through
scripts, etc.

--
All Rights Reversed
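The point about scripts is easy to demonstrate with standard tools; a
minimal sketch (the file name, key, and values are made up):

```shell
# any line-oriented text format yields to the same handful of tools
printf 'loglevel=info\nversion=1.0\n' > app.conf
sed -i 's/^version=.*/version=1.1/' app.conf   # bump one field in place
grep '^version=' app.conf                      # -> version=1.1
```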
Re: creating live virtual files by concatenation
On Sat, 25 Feb 2006, Peter Foldiak wrote:

> sub-file corresponding to a key-range. Writing a chapter should
> change the book that the chapter is part of. That is what would make
> it really valuable. Of course it would have all sorts of implications
> (e.g. for metadata for each part) that need to be thought about, but
> it could be done properly, I think.

What happens if you read the first 10kB of a file, and one of the
chapters behind your read cursor grows? Do you read part of the same
data again when you continue reading? Does the read cursor
automatically advance? Your idea changes the way userspace expects
files to behave...

--
All Rights Reversed
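The cursor question can be made concrete with ordinary files: a read
offset is just a byte count, so if data is inserted behind the cursor
(simulated here by rewriting the file, since plain filesystems cannot
insert), a resuming reader re-reads bytes it has already consumed. A
small sketch with made-up contents:

```shell
printf 'AAAABBBB' > book               # two "chapters": AAAA and BBBB
dd if=book bs=1 count=4 2>/dev/null    # reader consumes 4 bytes: AAAA
printf 'AAXXAABBBB' > book             # chapter one grows behind the cursor
dd if=book bs=1 skip=4 2>/dev/null     # resume at offset 4: AABBBB
                                       # the trailing AA is read twice
```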
Re: reiser4 plugins
On Tue, 21 Jun 2005, David Masover wrote:

> The point is, this was in the kernel for quite awhile, and it was so
> ugly that someone would rather be fucked with a chainsaw. If
> something that bad can make it in the kernel and stay for awhile
> because it worked, and no one wanted to replace it

I would like to think we could learn from the mistakes made in the
past, instead of repeating them. Ugly code often is so ugly people
don't *want* to fix it, so merging ugly code is often a big mistake.

--
The Theory of Escalating Commitment: "The cost of continuing mistakes
is borne by others, while the cost of admitting mistakes is borne by
yourself."
		-- Joseph Stiglitz, Nobel Laureate in Economics
Re: silent semantic changes with reiser4
On Thu, 26 Aug 2004, Denis Vlasenko wrote:

> I like cat a b. You can keep your progress. cat a b does not preserve
> the following file properties even on standard UNIX filesystems:
> name, owner, group, permissions. Losing permissions is one thing.
> Annoying, mostly.

However, actually losing file data during such a copy is nothing short
of a disaster, IMHO. In my opinion we shouldn't merge
file-as-a-directory semantics into the kernel until we figure out how
to fix the backup/restore problem and keep the standard unix utilities
working.

--
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.
		- Brian W. Kernighan
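The metadata point is easy to verify: the output of cat is a brand-new
inode whose permissions come from the umask, not from either input. A
minimal demonstration (file names made up):

```shell
umask 022
echo alpha > a
echo beta  > b
chmod 640 a            # give input a non-default permissions
cat a b > c            # c is a new file; a's metadata never travels
stat -c '%a' a         # -> 640
stat -c '%a' c         # -> 644 (from the umask), not 640
```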
Re: silent semantic changes with reiser4
On Thu, 26 Aug 2004, Jamie Lokier wrote:

> Christophe Saout wrote:
> > And if you read test.compound (the main stream) you get a special
> > format that contains all the components. You can copy that single
> > stream of bytes to another (reiser4) fs and then access
> > test.compound/test.txt again.
> >
> > (To Rik especially), this is the design which more or less
> > satisfies lots of different goals at once.
>
> And if an unaware application reads the compound file and then writes
> it out again, does the filesystem interpret the contents and create
> the other streams?

Unless I overlook something (please tell me what), the scheme just
proposed requires filesystems to look at the content of the files
being written out, in order to make the streams work.
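The objection can be sketched with tar as a stand-in for the compound
format (an assumption for illustration; reiser4's actual stream layout
is not specified here). A byte-for-byte copy of the main stream keeps
the components recoverable, but only because a format-aware tool
re-parses the content, and that parse is exactly the content
inspection the filesystem would have to perform on every write:

```shell
mkdir -p parts
echo hello > parts/test.txt
tar -cf test.compound -C parts test.txt   # "main stream" holding components
cat test.compound > copy.compound         # an unaware app copies bytes only
tar -xOf copy.compound test.txt           # recovering test.txt requires
                                          # parsing the stream's format
```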
Re: silent semantic changes with reiser4
On Thu, 26 Aug 2004, Linus Torvalds wrote:

> For example, you _could_ probably (but hey, maybe tar tries to strip
> slashes off the end of filenames, so this might not work due to silly
> reasons like that) back up a compound file with
>
>	tar cvf file.tar file file/

So you'd have both a file and a directory that just happen to have the
same name? How would this work in the dcache?
Re: silent semantic changes with reiser4
On Thu, 26 Aug 2004, Linus Torvalds wrote:

> On Thu, 26 Aug 2004, Rik van Riel wrote:
> > So you'd have both a file and a directory that just happen to have
> > the same name? How would this work in the dcache?
>
> There would be only one entry in the dcache. The lookup will select
> whether it opens the file or the directory based on O_DIRECTORY (and
> usage, of course - if it's in the middle of a path, it obviously
> needs to be opened as a directory regardless).

Hmmm, I just straced cp /bin/bash /tmp. One line stood out as a
potential problem:

	open("/tmp/bash", O_WRONLY|O_CREAT|O_LARGEFILE, 0100755) = 4

What do we do with O_CREAT? Do we always allow both a directory and a
file to be created with the same name? Does this create a new class of
symlink-attack style security holes?
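On today's filesystems the ambiguity cannot arise, because open() with
O_CREAT on a name that already exists as a directory fails with
EISDIR; the file-as-directory proposal would have to define what that
open means. A quick sketch (the directory name is made up):

```shell
mkdir -p bash_demo                 # a directory already owns the name
# shell redirection performs open(O_CREAT|O_WRONLY), which must fail:
sh -c ': > bash_demo' 2>/dev/null \
  || echo "open(O_CREAT) refused: name is a directory (EISDIR)"
```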
Re: silent semantic changes with reiser4
On Thu, 26 Aug 2004, Linus Torvalds wrote:

> So /tmp/bash is _not_ two different things. It is _one_ entity, that
> contains both a standard data stream (the file part) _and_ pointers
> to other named streams (the directory part).

OK, that makes sense.

> Hey, think of it as a wave-particle duality. Both modes exist at the
> same time, and cannot be separated from each other. Which one you see
> depends entirely on your experiment, ie how you open the file.

Guess I'm scared again now. We need to make sure that backup programs
don't fall victim to the uncertainty principle ;)
Re: Using fs views to isolate untrusted processes: I need an assistant architect in the USA for Phase I of a DARPA funded linux kernel project
On Sun, 1 Aug 2004, Hans Reiser wrote:

> You can think of this as chroot on steroids.

Sounds like what you want is pretty much the namespace stuff that has
been in the kernel since the early 2.4 days. No need to replicate VFS
functionality inside the filesystem.
[reiserfs-list] Re: Note describing poor dcache utilization under high memory pressure
On Mon, 28 Jan 2002, Linus Torvalds wrote:

> I am, for example, very interested to see if Rik can get the overhead
> of the rmap stuff down low enough that it's not a noticeable hit
> under non-VM-pressure. I'm looking at the issue of doing COW on the
> page tables (which really is a separate issue), because it might make
> it more palatable to go with the rmap approach.

I'd be interested to know exactly how much overhead -rmap is causing
for both page faults and fork (but I'm sure one of the regular
benchmarkers can figure that one out while I fix the RSS limit
stuff ;))

About page table COW ... I've thought about it a lot, and it wouldn't
surprise me if the 4 MB granularity of page tables is too large to be
of real benefit, since the classic path of fork+exec would _still_ get
all 3 page tables of the typical process copied. OTOH, it wouldn't
surprise me at all if it was a win ;))

kind regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
	-- Microsoft's "Competing with Linux" document

http://www.surriel.com/		http://distro.conectiva.com/
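The 4 MB figure is the 32-bit x86 arithmetic (assumptions: 4 KB pages,
1024 PTEs per page table). COW at that granularity saves little
because the text+data, the shared libraries, and the stack of a
typical i386 process sit in three widely separated regions, so three
page tables get copied on fork+exec regardless. Back-of-envelope:

```shell
page_kb=4; ptes_per_table=1024
echo "$(( page_kb * ptes_per_table / 1024 )) MB per page table"
# three touched regions on i386: text/data (near 0x08048000),
# libraries (near 0x40000000), stack (below 0xc0000000)
# -> 3 page tables copied even if each region is tiny
```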
[reiserfs-list] Re: Note describing poor dcache utilization under high memory pressure
On Mon, 28 Jan 2002, Linus Torvalds wrote:

> On Mon, 28 Jan 2002, Rik van Riel wrote:
> > I'd be interested to know exactly how much overhead -rmap is
> > causing for both page faults and fork (but I'm sure one of the
> > regular benchmarkers can figure that one out while I fix the RSS
> > limit stuff ;))
>
> I doubt it is noticeable on page faults (the cost of maintaining the
> list at COW should be basically zero compared to all the other
> costs), but I've seen several people reporting fork() overheads of
> ~300% or so.

Dave McCracken has tested with applications of different sizes and has
found fork() speed differences of 10% for small applications, up to
400% for a 10 MB (IIRC) program. This was with some debugging code
enabled, however... (some of the debugging code I've only disabled
now)

> Which is not that surprising, considering that most of the fork
> overhead by _far_ is the work to copy the page tables, and rmap makes
> them three times larger or so.

For dense page tables they'll be 3 times larger, but for a page table
which is only 10% occupied (eg. bash, with 1.5 MB spread over
executable+data, libraries and stack) the space overhead is much
smaller. The amount of RAM touched in fork() is mostly tripled though,
if the program is completely resident, because fork() follows VMA
boundaries.

> And I agree that COW'ing the page tables may not actually help. But
> it might be worth it even _without_ rmap, so it's worth a look.

Absolutely, this is something to try...

> (Also, I'd like to understand why some people report so much better
> times on dbench, and some people report so much _worse_ times with
> dbench. Admittedly dbench is a horrible benchmark, but still.. Is it
> just the elevator breakage, or is it rmap itself?)

We're still looking into this. William Irwin is running a nice script
to see if the settings in /proc/sys/vm/bdflush have an observable
influence on dbench.

Another thing which could have to do with decreased dbench and
increased tiobench performance is drop-behind vs. use-once. It turns
out drop-behind is better able to sustain IO streams of different
speeds, and can fit more IO streams in the same amount of cache
(people running very heavily loaded ftp or web download servers can
find a difference here).

For the interested parties, I've put some text and pictures of this
phenomenon online at:

	http://linux-mm.org/wiki/moin.cgi/StreamingIo

It basically comes down to the fact that use-once degrades into FIFO,
which isn't too efficient when different programs do IO at different
speeds. I'm not sure how this is supposed to affect dbench, but it
could have an influence...

regards,

Rik
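The "three times larger or so" figure above can be reconstructed with
back-of-envelope numbers (assumptions: a 4-byte i386 PTE plus roughly
8 bytes of pte_chain reverse-mapping state per mapped page; the real
-rmap structures varied between versions):

```shell
pte_bytes=4       # i386 page table entry
rmap_bytes=8      # assumed per-page pte_chain node overhead
echo "$(( (pte_bytes + rmap_bytes) / pte_bytes ))x the per-page cost"
# fork() must copy (or touch) all of this for a fully resident program,
# which is where the reported 300-400% fork() overhead comes from
```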