Re: (valid) criticisms of Git addressed

2008-08-27 Thread Juliano F. Ravasi
martin f krafft wrote:
> If you emptied your inbox, why keep it around? I expect the tools
> I use to recreate empty directories as needed.

Yes, but some programs don't expect their directories to disappear. The
mail example was just that, an example.

> There is one thing to be said in favour of in-filesystem metadata,
> such as .gitattributes — conflicts in those are no different than
> conflicts in content files, and all of the standard and advanced
> conflict resolution mechanisms (merge drivers, git-rerere, etc.) can
> be used for those just as well. Surely, this could be remedied by
> exposing the metadata layer as files in the event of conflicts, but
> that would be a hack in my world, and likely come with other
> problems.

In this case it is very similar to Subversion. When conflicts happen in
properties, Subversion acts as if their containers (files or
directories) were directories, and the properties were files (except
that properties can't be copied or renamed). Then everything just works
the same way it works for files themselves.

In a sense, Svn properties are "small files inside files".

> This has not happened to me before, or well, it's not bitten me.
> Do you mean something like:

No... I mean, for binary files. Most binary formats we use today are
compressed, and the smallest change causes the "avalanche effect" that
makes the end file completely different than the original. It is
virtually impossible for Git to detect such changes. See this example:

# Create and commit test draft of image:

~/tmp/playground% convert -font DejaVu-Sans-Book -pointsize 72 \
    label:Test draft.png
~/tmp/playground% git add draft.png
~/tmp/playground% git commit -m "First version of image."
Created initial commit 9325950: First version of image.
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 draft.png

# Change a single pixel in image and rename it to its final name:

~/tmp/playground% convert draft.png -draw 'point 1,1' final.png
~/tmp/playground% rm draft.png
~/tmp/playground% git add final.png

# Commit and check:

~/tmp/playground% git commit -a -m "Final version of image."
Created commit ff1506c: Final version of image.
 2 files changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 draft.png
 create mode 100644 final.png
~/tmp/playground% git log -M --follow final.png
commit ff1506c6c6e99773c989fc61e8c0e9d73a0cf2db
Author: Juliano F. Ravasi <...>
Date:   Wed Aug 27 15:29:50 2008 -0300

    Final version of image.


See? Just a single-pixel change to an image, together with a rename from
draft.png to final.png, broke the history, because Git doesn't record
this information. It depends on heuristics that may be valid for text
files, but not for other kinds of file.

Any human looking at both draft.png and final.png will see clearly that
they are almost the same file, and that it makes complete sense for them
to share history (since final.png was created based on draft.png). But
Git is not smart enough to look inside images to check that the only
difference between them is a single pixel... and it wasn't designed for
this purpose.
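
A rough illustration of the avalanche effect described above, as a Python
sketch rather than anything Git itself runs. It assumes zlib (the same
DEFLATE compression that PNG uses) as a stand-in for the image format, and
a crude position-by-position byte comparison as a stand-in for a similarity
heuristic. The helper name and the sample data are made up for the example.

```python
import zlib


def positional_similarity(a: bytes, b: bytes) -> float:
    """Fraction of byte positions at which a and b agree (crude stand-in metric)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    same = sum(x == y for x, y in zip(a, b))
    return same / longest


# Hypothetical "document": a compressible, text-like payload.
original = b"".join(b"row %d: the quick brown fox\n" % i for i in range(2000))

# Flip the case of one letter near the start, like the single-pixel edit above.
edited = bytearray(original)
edited[11] ^= 0x20
edited = bytes(edited)

raw_similarity = positional_similarity(original, edited)
compressed_similarity = positional_similarity(
    zlib.compress(original), zlib.compress(edited)
)

print(f"uncompressed: {raw_similarity:.4f}")
print(f"compressed:   {compressed_similarity:.4f}")
```

The exact figures depend on the data and the compressor, so none are quoted
here. The point is only that a one-byte edit leaves the raw bytes almost
identical, while the compressed streams agree at a smaller fraction of
positions. For a PNG, the bytes Git gets to compare are the already
compressed ones.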

> If you store the encoding along with the filename, you'll run into
> a whole slew of other issues when transcoding.
> 
> My solution to this is just to have UTF-8 everywhere. I am all too
> glad to have waved goodbye to all those encoding nightmares that
> were iso8859-* and ascii.

Yes, but unfortunately, there are many issues that push people to keep
using legacy encodings. They are legacy, but not obsolete. Tons of
Portuguese-localized systems still rely on ISO-8859-1, tons of
Japanese-localized systems still rely on Shift-JIS or EUC-JP, and so
on... It is not simple to just convert everything to UTF-8; it is
something that must be planned, tested, etc.

> I think this is a feature. If we keep adding backwards-compatibility
> layers to tools, we not only make them bigger, more error-prone, and
> harder to maintain, but we also slow down the transition to better
> times.

You have a point. But even if Git embraces UTF-8 and recommends it to
everyone, it should at least detect and reject any input that is not
normalized UTF-8, so that you don't end up with things like two files
with the same name in the repository, or names that have no valid
Unicode interpretation (which is needed when porting to Windows and
Mac OS X).
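
To make the "two files with the same name" problem concrete, here is a
minimal Python sketch of the kind of check meant above. It is not something
Git provides; the function name is hypothetical, and picking NFC as the
normalization form is an assumption made only for the example.

```python
import unicodedata


def is_nfc_utf8(name: bytes) -> bool:
    """Return True if name is valid UTF-8 and already in NFC form."""
    try:
        text = name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return unicodedata.normalize("NFC", text) == text


precomposed = "caf\u00e9.txt".encode("utf-8")  # e-acute as a single code point
decomposed = "cafe\u0301.txt".encode("utf-8")  # 'e' followed by a combining accent
latin1 = "caf\u00e9.txt".encode("iso-8859-1")  # legacy single-byte encoding

for label, raw in [
    ("precomposed", precomposed),
    ("decomposed", decomposed),
    ("latin1", latin1),
]:
    print(label, raw, is_nfc_utf8(raw))
```

All three byte strings display as the same file name to a user, yet they
are three different arrays of bytes, so a tool that stores names verbatim
treats them as three distinct files.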

Regards,

-- 
Juliano F. Ravasi ·· http://juliano.info/
5105 46CC B2B7 F0CD 5F47 E740 72CA 54F4 DF37 9E96

"A candle loses nothing by lighting another candle." -- Erin Majors

* NOTE: Don't try to reach me through this address, use "contact@" instead.
___
vcs-home mailing list
vcs-home@lists.madduck.net
http://lists.madduck.net/listinfo/vcs-home

Re: (valid) criticisms of Git addressed

2008-08-27 Thread martin f krafft
also sprach Juliano F. Ravasi <[EMAIL PROTECTED]> [2008.08.27.1958 +0100]:
> > If you emptied your inbox, why keep it around? I expect the tools
> > I use to recreate empty directories as needed.
> 
> Yes, but some programs don't expect their directories to disappear.

Then I think those programmes are buggy. Empty directories have no
meaning, I think.

> See? Just a single-pixel change of an image, together with
> a rename from draft.png to final.png broke the history, because
> Git doesn't record this information.

Sure, it's ugly, but if you edit and rename in two separate commits,
it works fine. So it is possible and should be fixed.

In any case, I usually put renames into their own commits, so that's
why I have not been bitten by this yet.
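
One way to see why the two-commit approach works: Git names a blob by
hashing its content, not its path, so a commit that only renames a file
leaves the blob id unchanged and the rename can be matched exactly, with
no similarity guessing. The Python sketch below computes blob ids the way
`git hash-object` does in a SHA-1 repository; the file names and contents
are placeholders, not the real images.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 object id Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical working tree snapshots: path -> file content.
before = {"draft.png": b"pretend image bytes"}
renamed_only = {"final.png": b"pretend image bytes"}
renamed_and_edited = {"final.png": b"pretend image bytez"}

old_id = git_blob_id(before["draft.png"])

# Same content under a new path: the id is unchanged.
print(old_id == git_blob_id(renamed_only["final.png"]))

# Content edited as well: the id changes, so only a heuristic can link them.
print(old_id == git_blob_id(renamed_and_edited["final.png"]))
```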

-- 
martin | http://madduck.net/ | http://two.sentenc.es/
 
never trust an operating system
for which you do not have the source.
   -- source unknown
 
spamtraps: [EMAIL PROTECTED]



(valid) criticisms of Git addressed (was: I am using fsvs and just found svnhome in google...)

2008-08-27 Thread martin f krafft
also sprach Juliano F. Ravasi <[EMAIL PROTECTED]> [2008.08.27.0302 +0100]:
> 1. Both don't see directories. Just because a directory is empty
> doesn't mean that it doesn't exist. If I delete all files in
> a directory, it doesn't mean that that directory ceased to exist.
> Some programs don't expect this. For example, file all messages
> from your Inbox, and commit. Then you pull from the other machine,
> and your Inbox directory just disappears.

If you emptied your inbox, why keep it around? I expect the tools
I use to recreate empty directories as needed.

This is apart from the fact that I prefer to use IMAP for
synchronising mail, since it's a better tool (made) for the task and
can do stuff (like flags) better than a mail-agnostic tool, such as
a content tracker.

> 2. Both lack proper ways to store metadata. This is evidenced by
> the need to pollute your directories with .gitignore,
> .gitattributes and .gitmodules (for Git), and .hgignore,
> .hgbranches and .hgtags (for Mercurial). All this information is
> treated and versioned as part of the contents of the repository,
> when it should not be. It is the plumbing of the VCS that gets
> exposed and mixed with the user files.

This is a very valid point and I wish Git had a metadata layer. I've
tried to bring up the issue with the developers, but they're not
interested in making Git more generic ("it's used to track the linux
kernel sources, if you use it for anything else, you are on your
own").

But keep in mind that Git, Mercurial & Co. are first-generation (if
you are willing to place arch into the zeroth generation, and see
Monotone in a league of its own). We are surely going to see new
tools which pick up on these issues in the future.

There is one thing to be said in favour of in-filesystem metadata,
such as .gitattributes — conflicts in those are no different than
conflicts in content files, and all of the standard and advanced
conflict resolution mechanisms (merge drivers, git-rerere, etc.) can
be used for those just as well. Surely, this could be remedied by
exposing the metadata layer as files in the event of conflicts, but
that would be a hack in my world, and likely come with other
problems.

> In the case of Git, .gitattributes is a huge misfeature. The
> attributes stored in it are user-edited, and are not attached to
> the actual files. If you move files around, you suddenly lose your
> attributes until you fix the attributes file.

Agreed.
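
A small Python sketch of the problem, for illustration only. The rule and
the paths are hypothetical, and `fnmatchcase` is just a rough approximation
of how attribute patterns are matched against paths, not Git's actual
matching logic.

```python
from fnmatch import fnmatchcase

# Hypothetical attribute rules, keyed by path pattern as in a .gitattributes file.
rules = {"docs/*.odt": "binary"}


def attributes_for(path: str) -> list:
    """Attributes whose pattern matches path (fnmatch as a rough approximation)."""
    return [attr for pattern, attr in rules.items() if fnmatchcase(path, pattern)]


# The rule applies at the original location.
print(attributes_for("docs/report.odt"))

# After the file is moved, the same rule no longer matches it.
print(attributes_for("archive/report.odt"))
```

The attribute is tied to the pattern, not to the file, so it stays behind
when the file moves until someone edits the rules by hand.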

> 4. Git only: lack of real rename/copy support. This affects not
> only vcs-home, but also general SCM use; it is just worse for
> vcs-home. There is a much bigger likelihood of having binary files
> in your home than common source code projects (what Git was
> designed to manage). Images, text documents, spreadsheets,
> presentations, compressed files, etc... For example, you change
> a single character in an ODF document and save. You will see that
> the file is completely different binary-wise, not easy to track.

Git doesn't track the file, it tracks the content. If you make small
changes, it's likely that the two blobs will be compressed to little
more than the size of one in the store. When it comes to keeping the
association (e.g. log messages), Git uses commit history anyway to
figure this out.

> If you also rename before the commit, you just lose the
> connection, and there is nothing you can do about it.

This has not happened to me before, or well, it's not bitten me.

Do you mean something like:

+lapse:~/.tmp/cdt.VLULwrjW|master|% echo foo > testfile
fatal: No HEAD commit to compare with (yet)
+lapse:~/.tmp/cdt.VLULwrjW|master|% git add testfile
fatal: No HEAD commit to compare with (yet)
+lapse:~/.tmp/cdt.VLULwrjW|master|% git commit -m'add testfile(foo)'
Created initial commit 3ff1892: add testfile(foo)
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 testfile
+lapse:~/.tmp/cdt.VLULwrjW|master|% echo bar >| testfile  
changes on filesystem:
 testfile |2 +-
+lapse:~/.tmp/cdt.VLULwrjW|master|% git mv testfile someotherfile
cached/staged changes:
 someotherfile |1 +
 testfile  |1 -
+lapse:~/.tmp/cdt.VLULwrjW|master|% git commit -m'moved the file to someotherfile(bar)'
Created commit 78b242f: moved the file to someotherfile(bar)
 2 files changed, 1 insertions(+), 1 deletions(-)
 create mode 100644 someotherfile
 delete mode 100644 testfile
+lapse:~/.tmp/cdt.VLULwrjW|master|% git log -- someotherfile 
commit 78b242f2d5df1ebe96e25e2dc6c69eb1c135cbb2
Author: martin f. krafft <[EMAIL PROTECTED]>
Date:   Wed Aug 27 13:55:37 2008 +0100

    moved the file to someotherfile(bar)

> 5. Git doesn't actually support Unicode filenames (neither does
> Mercurial). Both just store whatever the file name is in the
> filesystem directly into the repository, as just an array of
> bytes. You won't notice this unless you create files with names
> containing characters beyond the ASCII set, and use different
> encodings in different computers. This also causes pro