On 10/6/06, Erik Trimble <[EMAIL PROTECTED]> wrote:
David Dyer-Bennet wrote:
> On 10/6/06, Nicolas Williams <[EMAIL PROTECTED]> wrote:
>
>> > >Maybe Erik would find it confusing. I know I would find it
>> > >_annoying_.
>> >
>> > Then leave it set to 1 version
>>
>> Per-directory? Per-filesystem?
>
> Whatever. What's the actual issue here?
>
> I don't recall that on TOPS-20 it was possible to not version. What
> you could do is set your logout.cmd file to purge your space down to
> one copy when you logged out.

But see, that assumes you have a logout-type functionality to use. Which indeed is possible for command-line usage, but then only in a very limited way. During a typical session, I access almost 20 NFS-mounted directories. And anyone using autofs/automount trees gets even more. You're saying that my logout script has to know about all of them to keep things clean? That's unrealistic. And that still doesn't solve the problem of people who use SAMBA or NFS from machines which don't have an interactive shell logout system (i.e. Windows).
Seems entirely realistic to me that your logout script would know about the things you routinely use. People who don't log into any system are more of a problem, though. Various things come to mind, like having a default number of files (so it doesn't expand without limits), and maybe a regular cron job; but I've never worked in an environment doing versioning for non-login users over the network, so they're all theory, no idea how they'd work in practice.
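For what it's worth, the "purge down to N copies" idea (whether run from a logout script or a cron job) can be sketched roughly as below. This is only an illustration: the `name;N` naming convention, the `VERSIONED` pattern, and the `purge_old_versions` helper are all assumptions made up for the sketch, not anything ZFS or TOPS-20 actually provides.

```python
import os
import re
from collections import defaultdict

# Assumed naming convention for this sketch only: "<name>;<version>".
VERSIONED = re.compile(r"^(?P<base>.+);(?P<ver>\d+)$")


def purge_old_versions(directory, keep=1):
    """Remove all but the `keep` highest-numbered versions of each file.

    Returns the sorted list of file names that were deleted.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    # Group the versioned entries by base name, remembering the real name.
    versions = defaultdict(list)
    for entry in os.listdir(directory):
        match = VERSIONED.match(entry)
        if match and os.path.isfile(os.path.join(directory, entry)):
            versions[match.group("base")].append((int(match.group("ver")), entry))

    removed = []
    for found in versions.values():
        # Everything except the last `keep` entries counts as an old version.
        for _number, entry in sorted(found)[:-keep]:
            os.remove(os.path.join(directory, entry))
            removed.append(entry)
    return sorted(removed)
```

A cron job could call `purge_old_versions` on each directory a user routinely works in; files without a version suffix are left alone.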
> This worked fine for the users I knew; even on a system that didn't
> have as much as a gigabyte of disk storage total to support a few
> dozen software engineers.
>

The problem is we are comparing apples to oranges in user bases here. TOPS-20 systems had a couple of dozen users (or, at most, a few hundred). VMS only slightly more. UNIX/POSIX systems have 10s of thousands. Plus, the number of files being created under typical modern systems is at least two (and probably three or four) orders of magnitude greater. I've got 100,000 files under /usr in Solaris, and almost 1,000 under my home directory. And I don't have anything significant in my /home (no source code, no build/test trees, just misc business stuff). What is manageable with a few files quickly becomes unwieldy with more than a few dozen.
I have to ask again -- is this theory? Or have you actually worked on a versioning filesystem? And specifically on TOPS-20? (I remember, vaguely, that people found VMS versioning MUCH less comfortable to work with than TOPS-20, and I don't know at this distance if that was just because it was different, or because of subtle UI differences). I don't think the number of files under /usr is relevant; how often do you edit them by hand? I'd expect an installation procedure to clean up old versions when it was done installing new software; but if not, a simple purge would settle the matter. I don't recall my directories having many fewer files then than now. I have more *directories* now, but the number of files in a directory is set by human issues and by development process issues, not by disk space available.
This is what Nico and I are talking about: if you turn on file versioning automatically (even for just a directory, and not a whole filesystem), the number of files being created explodes geometrically.
I don't see it; new versions are created *when you do something* to a file; not from the file just sitting there. And the number of files I poke in a day, again, isn't controlled much by the disk space available, it's controlled by *my time*, and so has stayed more constant over the years.
>> > The above should be simple to do however -- a program does an open of
>> > a file name "foo.bar". ZFS / the file system routine would use the
>> > most recent version by default if no version info is given.
>>
>> How can version information be given without changing the APIs or
>> putting the version number/string into the file name?
>
> The version number is part of the file name in all the examples I know
> about. I'd find it useless without that; it has to be a real part of
> the filesystem, usable by everybody, not a special addon accessible
> only with one or two dedicated applications.
>
>> Putting the version number/string into the file name is hard for me to
>> accept. It's what would lead to polluting my directories.
>
> Set your ls default to not show versions. Isn't the problem then
> solved? Maybe add that option to the GUI filesystem explorer as well.
>

But this requires modifying all the relevant apps, which is the same amount of work as modifying them to use a new FV API. It's not transparent to the end-user.
I think the relevant apps are very different in the two cases. File listing tools are much rarer than file using tools, and in my case you only need to modify the file listing tools. In your case, you have to modify every single file using tool.
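To make the two behaviours under discussion concrete (an open of "foo.bar" picking the most recent version, and a listing tool that hides versions by default), here is a small sketch. It assumes a hypothetical `name;N` convention; `resolve` and `list_names` are invented names for illustration, and a real implementation would live in the filesystem rather than in user code.

```python
import os
import re

# Assumed naming convention for this sketch only: "<name>;<version>".
VERSIONED = re.compile(r"^(?P<base>.+);(?P<ver>\d+)$")


def resolve(directory, name):
    """Map a requested name to an on-disk path.

    An explicit "name;N" is used as given; a bare name resolves to the
    highest-numbered version present in the directory.
    """
    if VERSIONED.match(name):
        return os.path.join(directory, name)

    best = None  # (version number, actual entry name)
    for entry in os.listdir(directory):
        match = VERSIONED.match(entry)
        if match and match.group("base") == name:
            candidate = (int(match.group("ver")), entry)
            if best is None or candidate > best:
                best = candidate
    if best is None:
        raise FileNotFoundError(name)
    return os.path.join(directory, best[1])


def list_names(directory, show_versions=False):
    """List a directory, collapsing versions to one name unless asked."""
    entries = sorted(os.listdir(directory))
    if show_versions:
        return entries
    names = set()
    for entry in entries:
        match = VERSIONED.match(entry)
        names.add(match.group("base") if match else entry)
    return sorted(names)
```

Only `list_names` is specific to listing tools; every other program would go through something like `resolve` without needing to know versions exist.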
> In practice, it never was a problem that I noticed, or that other
> people noticed. And remember that this was on slower systems with
> smaller screens and often rather slower screen update.
>
> Do you not like the idea based on theory, or did you actually use
> TOPS-20 for a while and find the versioning troublesome?
>

Putting the file version number as part of the file name breaks things. Apps unaware of the special significance of this format will tend to write similar names, which can screw everything royally. Example: say we use <file>;<version>. In emacs, if I edit FOO;2, it will write out a temp file "FOO;2~". So, how does the FS deal with this the next time it needs to create a new version?
Whatever. None of the choices are a disaster. None of them "break" anything. I essentially never have to look at these, any version of them, so it doesn't matter very much what their names are. Possibly some clever definitions of how things are handled could make the results cleaner, and that's worth looking at, but the worst results I can imagine from this scenario are unimportant, they don't hurt anything.
The problem lies in that under VMS, the ';' was a special character, and unusable in normal naming. I suspect a similar situation exists under TOPS-20. No such luck in a POSIX filesystem - all printable (and many unprintable) characters are valid for use in filenames. So you _CAN'T_ use them to delineate File Versioning, without risking blowing the entire scheme when some random app decides to either use your FV marker for its own needs, or something similar to the emacs case above.
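The naming concern can be illustrated with a tiny parser. This is a sketch under an assumed rule (a trailing `;` followed by digits marks a version); `split_version` is a hypothetical helper, not part of any real filesystem.

```python
import re

# Assumed rule for this sketch only: a trailing ";<digits>" is a version.
VERSIONED = re.compile(r"^(?P<base>.+);(?P<ver>\d+)$")


def split_version(name):
    """Split a name into (base, version); version is None if absent."""
    match = VERSIONED.match(name)
    if match:
        return match.group("base"), int(match.group("ver"))
    return name, None


# A well-formed versioned name parses as intended.
print(split_version("FOO;2"))        # ('FOO', 2)

# An emacs-style backup of a specific version no longer looks versioned,
# so it becomes a brand-new base name that happens to contain ';'.
print(split_version("FOO;2~"))       # ('FOO;2~', None)

# A name that an unaware app chose for its own reasons is misread as
# version 2024 of "report".
print(split_version("report;2024"))  # ('report', 2024)
```

Whether either case matters in practice is exactly what is being argued here; the sketch only shows what the parser would see.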
This is theory again. In practice, there aren't such schemes in use anywhere I can find. If there are, yes, some file-versioning schemes would break them, and those apps would have to be updated. A theoretically clean approach is desirable, but an approach that actually works is more important. An approach that requires programs to be updated before they can use file versioning doesn't, by my standards, "work"; I wouldn't be able to use it with the files and applications it's valuable to me for any time soon. When you talk about a new API for versioning -- how do you envision information being conveyed from the command lines of programs to this new API? Isn't it likely that it would end up becoming a part of file name syntax, and changing the rules about allowable characters in filenames? And in that case, you can make the whole change in the "open" and "link" calls, and get the same end effect.
>> > one UI is the command line shell
>>
>> Indeed! And command-line tools, like ls(1), find(1), etc...
>>
>> What I'm saying is that I'd like to be able to keep multiple versions of
>> my files without "echo *" or "ls" showing them to me by default.
>
> And I find that completely unacceptable; useless. The whole point of
> putting versioning in the filesystem is that that makes it accessible
> to all programs.
>

But, because of the explosion in the number of files, you CAN'T automatically show all versions. Users will NEVER accept this. The only clean way to do this is to show file versions only upon request. Not by default.
Is this theory, or do you have some experience to support it? You say "can't"; I'm not at all worried about it, myself. I've worked in these environments, and liked it very much. I've watched new people get introduced to them. People like this when they see it well-implemented. I don't accept your assertion that directories people edit files in have more files in them today than they used to, in general. I also don't accept the assertion that the number of extra versions scales with the number of files in the directory -- it scales with the number of files you re-write in the directory, which is limited more by human working speed and time in the day, not by number of files there.
>> > >What if an application deals in multiple files?
>> >
>> > so?
>>
>> So, file versions aren't useful unless the application explicitly
>> tells the OS when to make them.
>
> File versions are created when a file is created. In the scenario
> where, today, an existing file would be overwritten (deleted), instead
> the old file is kept and the new file is given the version number +1
> of the old file.
>
>> Similarly with applications that keep files open but keep writing
>> transactions in ways that the OS can't isolate without input from the
>> app. E.g., databases. fsync(2) helps here, but lots and lots of
>> fsync(2)s would result in no useful versioning.
>
> None of those are candidates for file versioning, and a darned good
> thing, too.

Honestly, as far as file versioning goes, the time to make a new version is when calling open() with the appropriate arguments to allow for append or modification. You obviously don't want to create a new version if you are only opening a file for read-only access, and changing version on fsync() is ludicrous, and doing it on close() doesn't distinguish between a file that has been modified and one that hasn't.
Yes, versioning is a file-create feature.
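The "new version on create" rule can be sketched in user space as follows. This is only a model of the semantics, assuming a hypothetical `name;N` convention; `write_new_version` is an invented helper, and a real filesystem would do this inside the create/open path rather than in application code.

```python
import os
import re

# Assumed naming convention for this sketch only: "<name>;<version>".
VERSIONED = re.compile(r"^(?P<base>.+);(?P<ver>\d+)$")


def write_new_version(directory, name, data):
    """Store `data` as the next version of `name`; older versions stay.

    Where a plain create would overwrite the existing file, this writes
    "<name>;<highest + 1>" instead and returns the new path.
    """
    highest = 0
    for entry in os.listdir(directory):
        match = VERSIONED.match(entry)
        if match and match.group("base") == name:
            highest = max(highest, int(match.group("ver")))

    path = os.path.join(directory, f"{name};{highest + 1}")
    # Mode "x" refuses to open an existing file, so nothing is clobbered.
    with open(path, "xb") as handle:
        handle.write(data)
    return path
```

Each call corresponds to one explicit save; repeated writes through a single long-lived open file would still land in one version under this model.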
Given this, we're back into the problem FV is supposed to solve. It is entirely possible for an editor to keep open a file for a long time, periodically writing out your changes without issuing a new open().
You describe this as a problem, but *I* see it as the exact thing that makes file versioning useful. It DOESN'T save random magically chosen moments; it saves exactly all the versions that *you*, the user, saved at some point of the editing session.
Word with auto-save turned off is a prime example. Given this, you've only created a new version when you first load the document, and all your intermediary changes are lost, since it only saves the document on close().
You're forgetting that the user, unless he's stupid, will save regularly during the editing session.
Thus, in order to get benefits from FV, your editor must issue periodic close() and open() commands on the same file, as you edit, all without your intervention. Exactly how many editors do this? I have no idea. So, the only way to enable FV is to require the user to periodically push the "Save" button. How is that any different from the current situation?
It is completely and utterly different from the current situation. In the current situation, when I type the "save" command *I am deleting a previous version*. That's dangerous, because people don't think of it as performing a destructive operation, and hence don't give it the care and consideration they give to an explicit "rm". And that's precisely what file versioning fixes; saving a file is no longer a destructive operation.
--
David Dyer-Bennet, <mailto:[EMAIL PROTECTED]>, <http://www.dd-b.net/dd-b/>
RKBA: <http://www.dd-b.net/carry/>
Pics: <http://www.dd-b.net/dd-b/SnapshotAlbum/>
Dragaera/Steven Brust: <http://dragaera.info/>