Re: Authoring a versioning plugin

2006-01-12 Thread Jonathan Briggs
On Wed, 2006-01-11 at 22:44 -0800, Hans Reiser wrote:
> Hans Reiser wrote:
> >  I am skeptical that having it occur with every
> >write is desirable actually.
> >  
> >
> Consider the case where you type cat file1 >> file2.  This will produce
> a version of file2 for every 4k that is in file1, because (well I didn't
> look at the bash source, but I would guess) it appends in 4k incremental
> writes rather than one big write.  Versioning on file close makes more
> sense
[snip]

Not that my opinion means anything. :-) But I agree with Hans that file
close is the place to create the new version.  The plugin should track
the writes (and mmap flushes) between file open and close, then on file
close it can process everything into a reverse binary diff to save
permanently.
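
A minimal sketch of that idea, in userspace Python rather than plugin code, and under loud assumptions: every name here is hypothetical, the whole file is snapshotted at open (a real plugin would track only the written ranges), and the "reverse diff" is nothing more than the differing middle region between the old and new contents.

```python
def reverse_delta(old: bytes, new: bytes):
    """Return (start, new_len, old_bytes): enough to rebuild `old` from `new`."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, len(new) - prefix - suffix, old[prefix:len(old) - suffix]


def apply_reverse(new: bytes, delta) -> bytes:
    """Rebuild the older contents from the newer contents plus one delta."""
    start, new_len, old_bytes = delta
    return new[:start] + old_bytes + new[start + new_len:]


class VersionedFile:
    """Toy in-memory file: one new version per open/close pair, not per write."""

    def __init__(self, data: bytes = b""):
        self.data = bytearray(data)
        self.history = []        # reverse deltas, newest last
        self._at_open = None     # assumption: full snapshot taken at open

    def open(self):
        self._at_open = bytes(self.data)

    def write(self, offset: int, chunk: bytes):
        if offset > len(self.data):
            # Writing past the end: pad the hole with zero bytes.
            self.data.extend(b"\0" * (offset - len(self.data)))
        self.data[offset:offset + len(chunk)] = chunk

    def close(self):
        new = bytes(self.data)
        if self._at_open is not None and new != self._at_open:
            self.history.append(reverse_delta(self._at_open, new))
        self._at_open = None

    def previous(self) -> bytes:
        return apply_reverse(bytes(self.data), self.history[-1])
```

With this shape, the "cat file1 >> file2" case yields a single history entry no matter how many 4k writes happen between open and close, and the same delta format covers files that grow or shrink.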
-- 
Jonathan Briggs <[EMAIL PROTECTED]>
eSoft, Inc.




Re: Authoring a versioning plugin

2006-01-12 Thread Peter van Hardenberg
Hi Yoanis, good to see you're still pursuing this.

On January 11, 2006 02:59 pm, Yoanis Gil Delgado wrote:
> This are the intentions:
> To write a versioning plugin that will allows the file system user to
> easily revert the files under versioning to a some previous state.  The
> plugin will allow to revert the file state, based on revisions number and
> date modifications(and not sure about this one). There will be a special
> pseudo file named "previous" that will return the previous version of the
> file. The final result should allow to the the following actions:
>
> $ echo 1 > myfile.txt   (let's say we make this command at Wed Jan 11 16:53:55)
> $ echo 2 > myfile.txt   (let's say we make this command at Wed Jan 11 16:54:57)
> $ echo 3 >> myfile.txt  (let's say we make this command at Wed Jan 11 16:55:59)
>
> Suppose you want the latest version, then you type:
> $ cat myfile/.../previous
>  Some other content
> Or you want the n-th version, then you type:
> $ cat myfile/.../1
>  Some content
> $ cat myfile/.../2
>  Some other content
> $ cat myfile/.../3

This is going to clutter the ... directory rather a lot. Instead of adding 
more files into "..." (which, by the way, is completely obscure) I would 
suggest you create a new pseudo directory.

Perhaps:
$ cat myfile/.^4/history/previous
$ cat myfile/.^4/history/version/1
still not quite right, but at least it contains a bit more information about 
what the "1" refers to.

>  Some other content
>  Some more content
> $
> Or the version nearest to some date, then you type:
> $ cat myfile.txt/.../Wed\ Jan\ 11\ 16:50
>  Some other content

There are already userspace tools which can determine the file creation date. 
Just use those, instead of dealing with date parsing in kernel-space. Date 
parsing is a way, WAY more subtle problem than you want to deal with. To see 
a group that has spent some time on it, check out the Date::Parser for Perl.

Using "grep" or "find", or "ls" or whatever other tool will accomplish this in 
a much more thorough and Unix-consistent way and also save you a pile of 
coding time. Believe me, you're going to need it.
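
As a rough userspace sketch of what that could look like, assuming (hypothetically) that each stored version shows up as a plain file in some version directory and that its mtime is the time the version was made: the function name and the directory layout are illustrative only, and the date string is parsed by the caller, outside the kernel.

```python
import os
from datetime import datetime


def nearest_version(version_dir: str, when: datetime) -> str:
    """Return the path of the entry whose mtime is closest to `when`.

    Assumption: one regular file per stored version, mtime = version time.
    """
    target = when.timestamp()
    entries = [e for e in os.scandir(version_dir) if e.is_file()]
    if not entries:
        raise FileNotFoundError(f"no versions found in {version_dir}")
    best = min(entries, key=lambda e: abs(e.stat().st_mtime - target))
    return best.path
```

A caller would turn "Wed Jan 11 16:50" into a datetime with whatever parser they trust (for example datetime.strptime with an explicit format) and pass the result in.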

>
> Also , there will be an special attribute named under_versioning(or
> something like that), that will tell if the file is under versioning. This
> plugin will not track directories version, although it's a future plan(I
> think this should be mixed with some undelete plugin).
>

I imagine that attribute should be
$ echo "1" > myfile.txt//plugins/versioning
or
$ echo "everywrite" > myfile.txt//plugins/versioning

Unfortunately, my experience is that you cannot use "echo" to change the data 
in the plugins/* pseudoplugins, even when it should be legal to do so. I just 
had a little ruby script that looked roughly like this:

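# pseudofile holds the path to the plugins/* pseudofile.
# Open it for writing; a bare open defaults to read-only.
File.open(pseudofile, 'w') do |f|
  f.write('newplugin')
end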

Never had the time to figure out why that was necessary, but there it is. 
(There is a comment on the plugin-wiki gotchas section.)

> I'm planning to use a delta techniques for versioning storage (delta
> compression). The versioning will be at the write level. The versions will
> be saved in a special directory under the filesystem. I think the hard part
> is the one related to detecting the changes (a COW it's a possible
> solution, but i think it's to expensive). I'm thinking a possible solution
> will be detecting the bytes changing in each write and archiving then as
> the difference.  This introduce some problems like :
> 1-) What happens if the file shrinks?
> 2-) What happens if the file grows ?
>
> I will send another email with a solution to this problems.

This will not be easy, I look forward to seeing your solution.

>
> I've also plans to extent the documentantion of plugins creation in reiser4
> with the experiences of this project. I'll be working in this plugin for
> more than 4 months. If you're interested you're welcome to the the
> team(just me right now :D )

Well, I have my own fish to fry, but I hope you will document your experiences 
on the Reiser4 programmer's wiki, currently housed at:
http://pvh.ca/trac/wiki/reiser4

There is lots of important information there for new Reiser4 plugin 
developers, and it will continue to grow as time goes by.

> Well... I think this is all (for now  :D ). Please let me know what you
> think.

I would second Hans' suggestion about a "/version/snapshot" file which 
would essentially act like a "cvs commit" on that file. I'd suggest that 
there be two similar versioning plugins, one which automatically versions 
after each write, and one which only does it when explicitly asked to. See 
the fibration plugin type for an example of this.

-p

-- 
Peter van Hardenberg ([EMAIL PROTECTED])
Victoria, BC, Canada


Re: Authoring a versioning plugin

2006-01-12 Thread Bedros Hanounik
I think versioning plugin is a great idea and I bet there're many people
like me waiting for such a plugin. However, I have few questions;

what happens when I delete a file? should I loose all history of the file
with such action?

if there's an undelete plugin, what kind of hooks needed so undelete
recovers the full state of the file with history.

another concern is backup; if I backup the file or the entire directory
(or drive), is it transparent to the backup app, or something extra
needed to be done to backup the history of the file?

if you store all the history in  a sub direcotry let's say .rev and make
it generic (and hence visible) to everyone, the above problems will go
away.

for example filename.ext deltas could be stored in
.rev/filename-rev-date-time.delta with base rev in
.rev/filename-rev-date-time.ext

correct me if I'm missing something, because I don't know the plugin
mechanism of reiser4.

-B

On 1/12/06, Jonathan Briggs <[EMAIL PROTECTED]> wrote:
> On Wed, 2006-01-11 at 22:44 -0800, Hans Reiser wrote:
> > Hans Reiser wrote:
> > >  I am skeptical that having it occur with every
> > >write is desirable actually.
> > >
> > Consider the case where you type cat file1 >> file2.  This will produce
> > a version of file2 for every 4k that is in file1, because (well I didn't
> > look at the bash source, but I would guess) it appends in 4k incremental
> > writes rather than one big write.  Versioning on file close makes more
> > sense
> [snip]
>
> Not that my opinion means anything. :-) But I agree with Hans that file
> close is the place to create the new version.  The plugin should track
> the writes (and mmap flushes) between file open and close, then on file
> close it can process everything into a reverse binary diff to save
> permanently.
> --
> Jonathan Briggs <[EMAIL PROTECTED]>
> eSoft, Inc.


Re: Authoring a versioning plugin

2006-01-12 Thread Yoanis Gil Delgado
On Thursday 12 January 2006 02:34 pm, you wrote:
> On Thursday 12 January 2006 01:44 am, you wrote:
> > Hans Reiser wrote:
> > >  
> > >
> > >  I am skeptical that having it occur with every
> > >write is desirable actually.
> > >  
> >
> > Consider the case where you type cat file1 >> file2.  This will produce
> > a version of file2 for every 4k that is in file1, because (well I didn't
> > look at the bash source, but I would guess) it appends in 4k incremental
> > writes rather than one big write.  Versioning on file close makes more
> > sense, but I suggest manual control using the /checkin pseudofile,
> > and then we can reasonably make it the default plugin for the whole FS
> > (write it so that it calls the other plugins so that when we change the
> > other plugins we don't need to change your code to do it).  People who
> > don't want versioning will simply never touch the checkin pseudofile.
> > Make sure that for that case there is just an if statement condition
> > check as overhead, and there will be no reason to not make versioning
> > the default plugin that happens to do nothing unless you use the checkin
> > pseudofile at least once during the life of the file.
> >
> > hmm, maybe /snap is better than /checkin ?  Well, let's decide
> > that once the code is written;-)
> >
> > Do you agree with my points here?
>
 Yes I agree with your points. Still, i will like that some files have auto
 versioning.


Re: Authoring a versioning plugin

2006-01-12 Thread Yoanis Gil Delgado
On Thursday 12 January 2006 03:02 pm, you wrote:

Please remember the plugin is in an early design phase, and the answers
can change, but right now this is what I think:

> > I think versioning plugin is a great idea and I bet there're many people
> > like me waiting for such a plugin. However, I have few questions;
> >
> > what happens when I delete a file? should I loose all history of the file
> > with such action?
>
 I think the history should  go away too, since history will be stored as
deltas.
>
> > if there's an undelete plugin, what kind of hooks needed so undelete
> > recovers the full state of the file with history.
>
The undelete plugin is future work. Right now I'm thinking that if there
is a mechanism to track down the versioning info of a file no matter which
directory it is located in, then this should not be a problem.

> > another concern is backup; if I backup the file or the entire directory
> > (or drive), is it transparent to the backup app, or something extra
> > needed to be done to backup the history of the file?
>
I don't know right now. I will come back with more answers in the next few days.
>
> > if you store all the history in  a sub direcotry let's say .rev and make
> > it generic (and hence visible) to everyone, the above problems will go
> > away.
> >
> > for example filename.ext deltas could be stored in  .rev/filename-
> > rev-date-time.delta with base rev in .rev/filename-rev-date-time.ext
>
But what happens if I type:
$ mv filename.ext ../
Then the entire file revision tree must be copied. That's why I mentioned the
idea of a mechanism to track down the versioning info of a file.
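
One way to picture such a tracking mechanism, purely as a sketch: key the history on the file's identity instead of its path, so a rename needs no copying. The class below is hypothetical, and (st_dev, st_ino) is only a userspace stand-in for whatever object identifier the filesystem itself would use.

```python
import os


class HistoryIndex:
    """Maps a file's identity (device, inode) to its list of stored versions."""

    def __init__(self):
        self._by_identity = {}

    @staticmethod
    def _identity(path: str):
        st = os.stat(path)
        return (st.st_dev, st.st_ino)

    def record(self, path: str, version_label: str) -> None:
        """Remember that `version_label` belongs to the file currently at `path`."""
        self._by_identity.setdefault(self._identity(path), []).append(version_label)

    def versions(self, path: str) -> list:
        """Look up versions by identity, so the answer survives a rename."""
        return list(self._by_identity.get(self._identity(path), []))
```

A move within the same filesystem keeps the inode, so the lookup still succeeds after "mv filename.ext ../". A move across devices changes the identity, and that case would still need the history to be carried along explicitly.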
>
> > correct me if I'm missing something, because I don't know the plugin
> > mechanism of reiser4.
Thanks a lot for the questions.


Re: Authoring a versioning plugin

2006-01-12 Thread Yoanis Gil Delgado
On Thursday 12 January 2006 02:39 pm, you wrote:
> On Thursday 12 January 2006 01:14 pm, you wrote:
> > > I'm planning to use a delta techniques for versioning storage (delta
> > > compression). The versioning will be at the write level. The versions
> > > will be saved in a special directory under the filesystem. I think the
> > > hard part is the one related to detecting the changes (a COW it's a
> > > possible solution, but i think it's to expensive). I'm thinking a
> > > possible solution will be detecting the bytes changing in each write
> > > and archiving then as the difference.  This introduce some problems
> > > like :
> > > 1-) What happens if the file shrinks?
> > > 2-) What happens if the file grows ?
> > >
> > > I will send another email with a solution to this problems.
> >
> > This will not be easy, I look forward to seeing your solution.
>
Well, this is one of the interesting parts of the project. I will not start
from scratch, since there is previous work in this area (the delta file
system, for example, although it's a little old).


Re: Authoring a versioning plugin

2006-01-12 Thread Mike Benoit
On Thu, 2006-01-12 at 15:05 -0500, Yoanis Gil Delgado wrote:
> On Thursday 12 January 2006 02:34 pm, you wrote:
> > On Thursday 12 January 2006 01:44 am, you wrote:
> > > Hans Reiser wrote:
> > > >  
> > > >
> > > >  I am skeptical that having it occur with every
> > > >write is desirable actually.
> > > >  
> > >
> > > Consider the case where you type cat file1 >> file2.  This will produce
> > > a version of file2 for every 4k that is in file1, because (well I didn't
> > > look at the bash source, but I would guess) it appends in 4k incremental
> > > writes rather than one big write.  Versioning on file close makes more
> > > sense, but I suggest manual control using the /checkin pseudofile,
> > > and then we can reasonably make it the default plugin for the whole FS
> > > (write it so that it calls the other plugins so that when we change the
> > > other plugins we don't need to change your code to do it).  People who
> > > don't want versioning will simply never touch the checkin pseudofile.
> > > Make sure that for that case there is just an if statement condition
> > > check as overhead, and there will be no reason to not make versioning
> > > the default plugin that happens to do nothing unless you use the checkin
> > > pseudofile at least once during the life of the file.
> > >
> > > hmm, maybe /snap is better than /checkin ?  Well, let's decide
> > > that once the code is written;-)
> > >
> > > Do you agree with my points here?
> >
>  Yes I agree with your points. Still, i will like that some files have auto
>  versioning.

Would you be able to enable auto versioning for an entire directory,
including all new files created in it? For instance I would like to
enable auto versioning on the /etc/ directory, so I can always track
changes to config files.

Also I assume it will track which UID makes the change?

-- 
Mike Benoit <[EMAIL PROTECTED]>




Re: Authoring a versioning plugin

2006-01-12 Thread David Masover

Bedros Hanounik wrote:
> I think versioning plugin is a great idea and I bet there're many people
> like me waiting for such a plugin. However, I have few questions;
>
> what happens when I delete a file? should I loose all history of the
> file with such action?

Depends.  Delete has always been a modification of the directory, not 
the file.  So, if you were versioning an entire directory tree 
(recursively, of course), then deleting a single file within that tree 
would be a modification to its parent directory.  Until the 
changelog/history/old versions of the parent directory all disappear, 
there will be a link to the file, meaning the file still behaves exactly 
as it always did, it's just that instead of only being linked to in


foo/bar

it's now in

foo/.../version/1234/bar

[...]

> if you store all the history in  a sub direcotry let's say .rev and make
> it generic (and hence visible) to everyone, the above problems will go
> away.


Only now you have some new problems:

What if someone wants to make a .rev for something else?  '...' was 
carefully selected, and is still debated.  Now you're suggesting each 
plugin be able to have its own file/directory in every normal directory?


NO.  Will not work.

What might work is creating a '.metadata-archive' directory, which has 
hardlinks to everything tar is allowed to backup from '...' -- that way, 
we only need to come up with one more unique name than we already have.


Technically, not even that -- we could hide a directory with any name we 
like inside '...', and let tar archive '...', but not, say, '.../notar', 
but that's a hassle for users.  Anyone have a better name for 
'.metadata-archive', though?


You also have another problem:  You've just broken 'rm -r' and 'mv' 
across partitions/devices.  Consider:  Am I allowed to remove 
'.metadata-archive'?  If yes, then what happens when I look for it 
again?  Is it magically still there?  And if no, how am I supposed to 
ever be able to rmdir a directory that's technically not empty?  Should 
that just magically work?


I suspect that we'd be fine if we just say that removing it should 
pretend to work (return an OK status), and rmdir should actually work, 
but it should always show up in the directory listing, because I can't 
think of why a program would step through the directory listing, 
unlinking each file, and then get another directory listing, to make 
sure they're all gone, before it tried 'rmdir'-ing the directory.  It 
makes no sense -- you try to 'rmdir', and you complain to the user if 
"directory not empty" because someone stuck a new file in the dir while 
you were trying to delete it.
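
For reference, the removal pattern described above can be sketched like this. It is an illustration of the pattern only, not a claim about how any particular rm implementation works, and the function name is made up.

```python
import os


def remove_tree(path: str) -> None:
    """Unlink every entry from one listing, then rmdir; never list twice."""
    for name in os.listdir(path):
        child = os.path.join(path, name)
        if os.path.isdir(child) and not os.path.islink(child):
            remove_tree(child)
        else:
            os.unlink(child)
    # No second listing here: rmdir itself reports "directory not empty"
    # (as an OSError) if something is still inside the directory.
    os.rmdir(path)
```

Under the proposed semantics, unlinking the always-present entry would report success and the final rmdir would succeed, so a loop of this shape would not notice anything unusual.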


> for example filename.ext deltas could be stored in  .rev/filename-
> rev-date-time.delta with base rev in .rev/filename-rev-date-time.ext


I don't like your semantics.  I just don't.

It makes sense to use file extensions for fibration, because you can 
guess that '.o' files are similar to each other.  Same with '.c' files. 
 But fibration is completely transparent, and just affects performance, 
so it's ok to guess there.


When it comes to actually affecting semantics, file extensions have 
always been irrelevant except to programs that care, and those programs 
only care about their own extensions.


For instance, say I have

foo.tar.bz2

Obviously, this way looks bad:

foo.tar-rev-date-time.delta
foo.tar-rev-date-time.bz2

So do we do it this way?

foo-rev-date-time.delta
foo-rev-date-time.tar.bz2

No, because then what if you have this?

foo.01.12.2006.tar.bz2

Obviously, this is wrong:

foo-rev-date-time.delta
foo-rev-date-time.01.12.2006.tar.bz2

And this is just as wrong as it was before:

foo.01.12.2006.tar-rev-date-time.delta
foo.01.12.2006.tar-rev-date-time.bz2

And for that matter, what if I have these?

foo.tbz2
foo.zip

Both become foo.delta.
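
The collision is easy to reproduce in a small sketch. It assumes the scheme strips only the last extension and that the rev and date-time parts are filled in as shown; both helper names and the stamp format are hypothetical.

```python
import os


def delta_name(filename: str, rev: int, stamp: str) -> str:
    """Name of the delta file: the last extension is replaced by .delta."""
    stem, _ext = os.path.splitext(filename)
    return f".rev/{stem}-{rev}-{stamp}.delta"


def base_name(filename: str, rev: int, stamp: str) -> str:
    """Name of the base revision: the last extension is kept at the end."""
    stem, ext = os.path.splitext(filename)
    return f".rev/{stem}-{rev}-{stamp}{ext}"
```

Here delta_name("foo.tbz2", 1, s) and delta_name("foo.zip", 1, s) return the same string for any stamp s, and base_name("foo.tar.bz2", 1, s) splits the name as foo.tar-1-s.bz2, which is the awkward form shown earlier.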

And just a minor thing, but you're also making the limit on filenames 
just that much more restrictive.


Instead, I suggest we do it this way:
foo/.../version/

with symlinks to some handy versions of
foo/.../version/
foo/.../version/

This should also work if foo is a directory.  So to actually publish a 
given revision, you do


tar -cjSf foo-1234.tar.bz2 foo/.../1234

Now, what about forking?  If we had good copy-on-write support, we could 
just do something like


cp --cow foo/.../1234 foo-fork

Obviously, this would use the same underlying mechanism of copy-on-write 
that is used for the versioning system, so there's no need for the 
system itself to know about forks.


What about merges?  I don't know, but my knowledge of existing 
versioning systems is pretty limited...


Re: Authoring a versioning plugin

2006-01-12 Thread David Masover

Peter van Hardenberg wrote:
> Hi Yoanis, good to see you're still pursuing this.
>
> On January 11, 2006 02:59 pm, Yoanis Gil Delgado wrote:
>
> I would second Hans' suggestion about a "/version/snapshot" file which
> would essentially act like a "cvs commit" on that file. I'd suggest that
> there be two similar versioning plugins, one which automatically versions
> after each write, and one which only does it when explicitly asked to. See
> the fibration plugin type for an example of this.


Sounds good.  I'd propose a third:  auto-versioning with optional 
commits.  Every commit nukes all previous auto-versions and adds a 
long-term version.  That is:


The file

foo/.../version/1234

would be the version before

foo/.../version/auto/1

And if you committed

foo/.../version/auto/5678

it would become

foo/.../version/1235

and

foo/.../version/auto/*

would be nuked.
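
The bookkeeping for that scheme might look roughly like the sketch below. Everything in it is an assumption made for illustration: the class and method names are invented, contents are held in memory, and auto numbering is taken to restart at 1 after each commit.

```python
class VersionStore:
    """Toy model of version/<n> (committed) and version/auto/<n> (automatic)."""

    def __init__(self, last_committed: int = 0):
        self.committed = {}   # number -> contents, i.e. version/<n>
        self.auto = {}        # number -> contents, i.e. version/auto/<n>
        self._last_committed = last_committed
        self._next_auto = 1

    def auto_version(self, contents: bytes) -> int:
        """Record an automatic version and return its number."""
        number = self._next_auto
        self.auto[number] = contents
        self._next_auto += 1
        return number

    def commit(self, auto_number: int) -> int:
        """Promote one auto version to a long-term version, drop the rest."""
        contents = self.auto[auto_number]
        self._last_committed += 1
        self.committed[self._last_committed] = contents
        self.auto.clear()
        self._next_auto = 1
        return self._last_committed
```

Starting from VersionStore(last_committed=1234), committing any auto version produces version 1235 and leaves the auto set empty, which matches the example above.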



That way, you can protect yourself from doing something extremely 
stupid, such as "rm file", without having to go back to a manual 
version, while at the same time having a sane set of manual versions 
(where you know you didn't do something *that* stupid) to keep your disk 
usage sane, and to make it easier to go back and find something that 
genuinely was a previous version, and not just an "oops, the cat stepped 
on the keyboard and nuked all my changes" version.




Re: Authoring a versioning plugin

2006-01-12 Thread Bedros Hanounik
David,

I appreciate your criticism, but we're not in a flame war. I never
claimed to be an FS expert. Take it easy; you don't have to beat my
suggestion to death. There's no perfect solution, and all feedbacks, no
matter how idiotic or simple may seem, help making a better final
solution.

my suggestions were burst of the moment, I didn't give 'em much thoughts;
however, all the problems you found could be fixed. Again, I'm not the FS
expert here.

-B

On 1/12/06, David Masover <[EMAIL PROTECTED]> wrote:
> Bedros Hanounik wrote:
> > I think versioning plugin is a great idea and I bet there're many people
> > like me waiting for such a plugin. However, I have few questions;
> [snip]


Re: Authoring a versioning plugin

2006-01-12 Thread Yoanis Gil Delgado
On Thursday 12 January 2006 06:56 pm, you wrote:

 > David,
 >
 > I appreciate your criticism, but we're not in a flame war. I never
 > claimed to be an FS expert. Take it easy; you don't have to beat my
 > suggestion to death. There's no perfect solution, and all feedbacks, no
 > matter how idiotic or simple may seem, help making a better final
 > solution.
 >
 > my suggestions were burst of the moment, I didn't give 'em much thoughts;
 > however, all the problems you found could be fixed. Again, I'm not the FS
 > expert here.
 >
 > -B

Yes, I agree with you, Bedros, but I don't think David wanted to beat your
suggestion to death. Your suggestions make me think about things I had not
foreseen. As you say, the idea is to find a good solution.