Re: [PATCH] doc: Modify git-add doc to say "staging area"
On December 14, 2017 1:50:00 PM EST, Junio C Hamano wrote: >I agree with that. I do not consider the proposed change "good". Why is "index" better? It is a confusing name, one that has many other unrelated meanings. In particular, many projects managed by git also have an index, but few have a staging area. Also, the phrase "staging area" is already in use, so this is not a new term (e.g., git-staging). --- David A. Wheeler
Re: [PATCH] doc: Modify git-add doc to say "staging area"
On December 13, 2017 7:54:04 AM EST, "Ævar Arnfjörð Bjarmason" wrote: >After your patch the majority of the docs will still talk about >"index", is this part of some larger series, perhaps it would be good >to see it all at once... Yes, this would be part of a larger series. I'm happy to do the work, but I don't want to do it if it's just going to be rejected. The work is very straightforward; in almost all cases you simply replace the word "index" with the phrase "staging area". The change is similar for the word "cache". So I'm not sure what seeing it all at once would do for anybody. Are there one or two other files that you would like to see transformed as an example? If you're just looking for a sense of it, that should be enough. --- David A. Wheeler
Re: [PATCH] doc: Modify git-add doc to say "staging area"
On Wed, 13 Dec 2017 09:02:42 -0800, Junio C Hamano <gits...@pobox.com> wrote: > .. But that is not the only thing the index does. When "git merge" > finds conflicting changes, it adds the contents for common, our and > their variants to the index for the path. This is quite different > from how you use the index "as staging area"; the index is being > used as the "merging area". When "git clean" wants to see which > paths it finds on the filesystem are not of interest, it consults > the index, which acts as the list of paths that are of interest. If the phrase "staging area" is consistently used *instead* of index, there's no problem. E.g., "git clean consults the staging area" conveys exactly the same information as "git clean consults the index" when index == staging area. The term "index" has too many *other* meanings. --- David A. Wheeler
Re: [PATCH] doc: Modify git-add doc to say "staging area"
On December 13, 2017 12:40:12 AM EST, Jacob Keller wrote: >I know we've used various terms for this concept across a lot of the >documentation. However, I was under the impression that we most >explicitly used "index" rather than "staging area". I think "staging area" is the better term. It focuses on its purpose, and it is also less confusing ("index" and "cache" have other meanings in many of the repos managed by git). --- David A. Wheeler
[PATCH] doc: Modify git-add doc to say "staging area"
Change the documentation of git-add so that it consistently uses the phrase "staging area". The current git documentation uses inconsistent terminology ("index", "cache", and "staging area"). This commit switches git-add's documentation to consistently use the phrase "staging area", which is higher-level and should be less confusing for new users. Signed-off-by: David A. Wheeler <dwhee...@dwheeler.com> --- Documentation/git-add.txt | 104 -- 1 file changed, 54 insertions(+), 50 deletions(-) diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt index d50fa339d..927a152b0 100644 --- a/Documentation/git-add.txt +++ b/Documentation/git-add.txt @@ -3,7 +3,7 @@ git-add(1) NAME -git-add - Add file contents to the index +git-add - Add file contents to the staging area SYNOPSIS @@ -15,23 +15,24 @@ SYNOPSIS DESCRIPTION --- -This command updates the index using the current content found in -the working tree, to prepare the content staged for the next commit. -It typically adds the current content of existing paths as a whole, +This command updates the staging area using the current content found +in the working tree. +This command typically adds the current content of existing paths as a whole, but with some options it can also be used to add content with only part of the changes made to the working tree files applied, or remove paths that do not exist in the working tree anymore. -The "index" holds a snapshot of the content of the working tree, and it -is this snapshot that is taken as the contents of the next commit. Thus -after making any changes to the working tree, and before running -the commit command, you must use the `add` command to add any new or -modified files to the index. +The staging area (historically called the "index" or "cache") +holds a snapshot of the content of the working tree, and it +is this snapshot that is taken by default as the contents of the next commit. 
+Thus after making any changes to the working tree, and before running +the commit command, you can use the `add` command to add any new or +modified files to the staging area. This command can be performed multiple times before a commit. It only adds the content of the specified file(s) at the time the add command is run; if you want subsequent changes included in the next commit, then -you must run `git add` again to add the new content to the index. +you must run `git add` again to add the new content to the staging area. The `git status` command can be used to obtain a summary of which files have changes that are staged for the next commit. @@ -45,7 +46,9 @@ be used to add ignored files with the `-f` (force) option. Please see linkgit:git-commit[1] for alternative ways to add content to a commit. - +For example, you can use the git commit `-a` option to first automatically +add to the staging area all the files that have been have been +modified or deleted in the working tree. OPTIONS --- @@ -53,7 +56,7 @@ OPTIONS Files to add content from. Fileglobs (e.g. `*.c`) can be given to add all matching files. Also a leading directory name (e.g. `dir` to add `dir/file1` - and `dir/file2`) can be given to update the index to + and `dir/file2`) can be given to update the staging area to match the current state of the directory as a whole (e.g. specifying `dir` will record not just a file `dir/file1` modified in the working tree, a file `dir/file2` added to @@ -81,16 +84,16 @@ in linkgit:gitglossary[7]. -i:: --interactive:: Add modified contents in the working tree interactively to - the index. Optional path arguments may be supplied to limit + the staging area. Optional path arguments may be supplied to limit operation to a subset of the working tree. See ``Interactive mode'' for details. -p:: --patch:: - Interactively choose hunks of patch between the index and the - work tree and add them to the index. 
This gives the user a chance + Interactively choose hunks of patch between the staging area and the + work tree and add them to the staging area. This gives the user a chance to review the difference before adding modified contents to the - index. + staging area. + This effectively runs `add --interactive`, but bypasses the initial command menu and directly jumps to the `patch` subcommand. @@ -98,20 +101,20 @@ See ``Interactive mode'' for details. -e:: --edit:: - Open the diff vs. the index in an editor and let the user + Open the diff vs. the staging area in an editor and let the user edit it. After the editor was closed, adjust the hunk headers - and apply the patch to the index. + and apply the patch to the staging area. + The intent of this opt
[PATCH] Expand documentation describing --signoff
Modify various document (man page) files to explain in more detail what --signoff means. This was inspired by https://lwn.net/Articles/669976/ where paulj noted, "adding [the] '-s' argument to [a] git commit doesn't really mean you have even heard of the DCO...". Extending git's documentation will make it easier to argue that developers understood --signoff when they use it. Signed-off-by: David A. Wheeler <dwhee...@dwheeler.com> --- Documentation/git-am.txt | 1 + Documentation/git-cherry-pick.txt | 1 + Documentation/git-commit.txt | 6 +- Documentation/git-format-patch.txt | 1 + Documentation/git-revert.txt | 1 + 5 files changed, 9 insertions(+), 1 deletion(-) diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt index 452c1fe..13cdd7f 100644 --- a/Documentation/git-am.txt +++ b/Documentation/git-am.txt @@ -35,6 +35,7 @@ OPTIONS --signoff:: Add a `Signed-off-by:` line to the commit message, using the committer identity of yourself. + See the signoff option in linkgit:git-commit[1] for more information. -k:: --keep:: diff --git a/Documentation/git-cherry-pick.txt b/Documentation/git-cherry-pick.txt index 77da29a..6154e57 100644 --- a/Documentation/git-cherry-pick.txt +++ b/Documentation/git-cherry-pick.txt @@ -100,6 +100,7 @@ effect to your index in a row. -s:: --signoff:: Add Signed-off-by line at the end of the commit message. + See the signoff option in linkgit:git-commit[1] for more information. -S[]:: --gpg-sign[=]:: diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt index 7f34a5b..9ec6b3c 100644 --- a/Documentation/git-commit.txt +++ b/Documentation/git-commit.txt @@ -154,7 +154,11 @@ OPTIONS -s:: --signoff:: Add Signed-off-by line by the committer at the end of the commit - log message. + log message. 
The meaning of a signoff depends on the project, + but it typically certifies that committer has + the rights to submit this work under the same license and + agrees to a Developer Certificate of Origin + (see http://developercertificate.org/ for more information). -n:: --no-verify:: diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt index e3cdaeb..b149e09 100644 --- a/Documentation/git-format-patch.txt +++ b/Documentation/git-format-patch.txt @@ -109,6 +109,7 @@ include::diff-options.txt[] --signoff:: Add `Signed-off-by:` line to the commit message, using the committer identity of yourself. + See the signoff option in linkgit:git-commit[1] for more information. --stdout:: Print all commits to the standard output in mbox format, diff --git a/Documentation/git-revert.txt b/Documentation/git-revert.txt index b15139f..573616a 100644 --- a/Documentation/git-revert.txt +++ b/Documentation/git-revert.txt @@ -89,6 +89,7 @@ effect to your index in a row. -s:: --signoff:: Add Signed-off-by line at the end of the commit message. + See the signoff option in linkgit:git-commit[1] for more information. --strategy=:: Use the given merge strategy. Should only be used once. -- 2.5.3 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] clone: Warn if clone lacks LICENSE or COPYING file
On Sun, 22 Mar 2015 18:56:52 +0100, Ævar Arnfjörð Bjarmason <ava...@gmail.com> wrote:
> However perhaps an interesting generalization of this would be something like a post-clone hook, obviously you couldn't store that in .git/hooks/ like other githooks(5) since there's no repo yet, but having it configured via the user/system config might be an interesting feature.

Would that be acceptable to the wider group?

--- David A. Wheeler
Re: [PATCH] clone: Warn if clone lacks LICENSE or COPYING file
Junio C Hamano:
> An open source hosting site can help better by checking at the project creation time, because the people who interact with that interface are solely in the position to set and publish licensing terms.

That doesn't help with the many projects that have *already* been created. E.g., GitHub has a license chooser now, but didn't for years, and it's still optional. Also, repos stored as shared filesystems don't do that kind of checking.

More importantly, focusing on the hosting site doesn't warn people who *clone* from repos. The people who take on legal risks are often not the posters, but the people who clone *from* the sites. Thus, *they* are the ones who need the warning, and git is in an especially good spot to detect the issue.

> The general consumer who are cloning and fetching do not have direct control over this, and the only thing they could do to nudge the publishers is with an out-of-line communication...

That's an option, but another option is to NOT use it. Often people have no idea there's an issue, and in their rush and lack of warning they forget to check the basics.

> An approach that checks only the top-level directory for fixed filename pattern would not be an effective way to protect the cloners, either.

I disagree; I think it's remarkably effective. *Many* projects do this, including git itself. After all, many humans need to find out the licensing basics too; having a simple convention for *finding* it helps humans and tools alike. It's not even limited to open source software; developers of proprietary materials (software or not) *also* typically want to declare licensing.

Sure, the top-level licensing text might be incomplete, but having that information provides a big help, and it's what most people rely on anyway. Indeed, a *lack* of this is a sign of trouble, which is exactly what warnings are good for.

--- David A. Wheeler

(P.S. I posted this previously but it seems to have failed for some reason, so I'm resending this in a different way.)
[PATCH] clone: Warn if clone lacks LICENSE or COPYING file
Warn cloners if there is no LICENSE* or COPYING* file that makes
the license clear. This is a useful warning, because if there is no
license somewhere, then local copyright laws (which forbid many uses)
and terms of service apply - and the cloner may not be expecting that.
Many projects accidentally omit a license, so this is common enough to note.

For more info on the issue, feel free to see:
http://choosealicense.com/no-license/
http://www.wired.com/2013/07/github-licenses/
https://twitter.com/stephenrwalli/status/247597785069789184

Signed-off-by: David A. Wheeler <dwhee...@dwheeler.com>
---
 builtin/clone.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 9572467..9863b04 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -748,6 +748,41 @@ static void dissociate_from_references(void)
 		die_errno(_("cannot unlink temporary alternates file"));
 }
 
+static int starts_with_ignore_case(const char *str, const char *prefix)
+{
+	for (; ; str++, prefix++)
+		if (!*prefix)
+			return 1;
+		else if (tolower(*str) != tolower(*prefix))
+			return 0;
+}
+
+static int contains_license(void)
+{
+	DIR *dir = opendir("."); /* Examine current directory for license. */
+	struct dirent *e;
+	struct stat st;
+	int ret = 0;
+
+	if (!dir)
+		return 0;
+
+	while ((e = readdir(dir)) != NULL)
+		if (starts_with_ignore_case(e->d_name, "license") ||
+		    starts_with_ignore_case(e->d_name, "copyright")) {
+			if (stat(e->d_name, &st))
+				continue;
+			if (st.st_size > 1) {
+				ret = 1;
+				break;
+			}
+		}
+
+	closedir(dir);
+	return ret;
+}
+
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
 	int is_bundle = 0, is_local;
@@ -1016,6 +1051,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_REPO;
 	err = checkout();
 
+	if (!option_no_checkout && !contains_license())
+		warning(_("Repository has no LICENSE or COPYING file with content."));
+
 	strbuf_release(&reflog_msg);
 	strbuf_release(&branch_top);
 	strbuf_release(&key);
-- 
2.1.4
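The file-name test at the heart of this patch can be tried in isolation. What follows is only a minimal sketch, not the code from the patch: names_include_license() is a hypothetical helper that works on a list of names instead of reading a directory, the two prefixes are taken from the LICENSE* and COPYING* patterns named in the commit message, and the cast to unsigned char is an addition (passing a negative char value to tolower() is undefined).

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if `str` begins with `prefix`, ignoring ASCII case. */
static int starts_with_ignore_case(const char *str, const char *prefix)
{
	for (; *prefix; str++, prefix++) {
		/* Cast first: tolower() on a negative char value is undefined. */
		if (tolower((unsigned char)*str) != tolower((unsigned char)*prefix))
			return 0;
	}
	return 1;
}

/*
 * Hypothetical helper (not in the patch): return 1 if any of the `n`
 * names looks like a license file, 0 otherwise. The prefix list is an
 * assumption based on the commit message.
 */
int names_include_license(const char *const *names, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (starts_with_ignore_case(names[i], "license") ||
		    starts_with_ignore_case(names[i], "copying"))
			return 1;
	return 0;
}
```

Keeping the name matching apart from the directory walk makes it easy to test the matching rule without creating files; the size check on the matching file would still be needed on top of this.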
[PATCH] clone: Warn if LICENSE or COPYING file lacking and !clone.skiplicensecheck
Warn cloners if there is no LICENSE* or COPYING* file that makes
the license clear. This is a useful warning, because if there is no
license somewhere, then local copyright laws (which forbid many uses)
and terms of service apply - and the cloner may not be expecting that.
Many projects accidentally omit a license, so this is common enough to note.

You can disable this warning by setting "clone.skiplicensecheck" to "true".

For more info on the issue, feel free to see:
http://choosealicense.com/no-license/
http://www.wired.com/2013/07/github-licenses/
https://twitter.com/stephenrwalli/status/247597785069789184

Signed-off-by: David A. Wheeler <dwhee...@dwheeler.com>
---
 builtin/clone.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/builtin/clone.c b/builtin/clone.c
index 9572467..a3e8584 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -50,6 +50,7 @@ static int option_progress = -1;
 static struct string_list option_config;
 static struct string_list option_reference;
 static int option_dissociate;
+static int skip_license_check;
 
 static int opt_parse_reference(const struct option *opt, const char *arg, int unset)
 {
@@ -748,6 +749,44 @@ static void dissociate_from_references(void)
 		die_errno(_("cannot unlink temporary alternates file"));
 }
 
+static int starts_with_ignore_case(const char *str, const char *prefix)
+{
+	for (; ; str++, prefix++)
+		if (!*prefix)
+			return 1;
+		else if (tolower(*str) != tolower(*prefix))
+			return 0;
+}
+
+static int missing_license(void)
+{
+	DIR *dir = opendir("."); /* Examine current directory for license. */
+	struct dirent *e;
+	struct stat st;
+	int ret = 0;
+
+	if (!dir)
+		return 0; /* Empty directory, no need for license. */
+
+	while ((e = readdir(dir)) != NULL) {
+		if (starts_with_ignore_case(e->d_name, "license") ||
+		    starts_with_ignore_case(e->d_name, "copyright")) {
+			if (stat(e->d_name, &st) || st.st_size < 2)
+				continue;
+			ret = 0;
+			break;
+		}
+		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, "..") ||
+		    !strcmp(e->d_name, ".git"))
+			continue;
+		ret = 1; /* Non-empty directory */
+	}
+
+	closedir(dir);
+	return ret;
+}
+
+
 int cmd_clone(int argc, const char **argv, const char *prefix)
 {
 	int is_bundle = 0, is_local;
@@ -1016,6 +1055,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	junk_mode = JUNK_LEAVE_REPO;
 	err = checkout();
 
+	git_config_get_bool("clone.skiplicensecheck", &skip_license_check);
+	if (!option_no_checkout && !skip_license_check &&
+	    missing_license())
+		warning(_("Repository has no LICENSE or COPYING file with content."));
+
 	strbuf_release(&reflog_msg);
 	strbuf_release(&branch_top);
 	strbuf_release(&key);
-- 
2.3.3.221.g33aa87e.dirty
Re: [PATCH] #!/bin/sh --> #!/usr/bin/env bash
Alecs King wrote:
> And as for bash, only gitdiff-do and gitlog.sh 'explicitly' use bash instead of /bin/sh. On most Linux distros, /bin/sh is just a symbolic link to bash. But not on some others. I found gitlsobj.sh could not work using a plain /bin/sh on fbsd. To make life easier, i think it might be better if we all explicitly use bash for all shell scripts.

H. Peter Anvin wrote:
> How about #!/bin/bash (build from .in files if you feel it necessary to support systems which don't have bash in /bin) instead of doubling the number of execs?

If # of execs is that critical, it probably should not be in bash anyway. OpenBSD (at least 3.1)'s bash appears to be in /usr/local/bin/bash, NOT /bin/bash. I'd go with the /bin/env solution for now; it maximizes the "it just works" factor, and when it comes time for .in files much of the cogito code (at least) will probably be rewritten in Perl, and anything performance-sensitive will be in C.

--- David A. Wheeler
[PATCH] Eliminate use of mktemp's -t option
It turns out that mktemp's -t option, while useful, isn't available on many systems (Mandrake Red Hat Linux 9 at least, and probably piles of others). So, here's a portability patch that removes all use of mktemp's -t option. Unlike the quick hack I posted earlier, this should be clean everywhere (assuming you have mktemp). This is a patch against git-pasky 0.6.3. This is my first attempt to _post_ a patch using git itself, and I'm not entirely sure how you want it. Let me know if you have a problem with it! --- David A. Wheeler commit 5f926b684025b83e34386bf8e4ef30a97e2ae5ec tree 61059575269ed1027cfb66543251e182f87d1064 parent dd69ca5f806c8b10bb29ecb8d77c88be007c981c author David A. Wheeler [EMAIL PROTECTED] 1114138972 -0400 committer David A. Wheeler [EMAIL PROTECTED] 1114138972 -0400 Eliminated use of mktemp's -t option; older mktemps don't support it. Index: README === --- 6a612d42afdba20fd2150e319a689ed451b010e4/README (mode:100644 sha1:a71b5fbdbdac0bf2e2d021e422b9f49dd5481165) +++ 61059575269ed1027cfb66543251e182f87d1064/README (mode:100644 sha1:80952e2f67b28f64c10cfb913df375a5dd244cd9) @@ -141,7 +141,7 @@ C compiler bash basic shell environment (sed, grep, textutils, ...) - mktemp 1.5+ (Mandrake users beware!) + mktemp diff, patch libssl rsync Index: gitapply.sh === --- 6a612d42afdba20fd2150e319a689ed451b010e4/gitapply.sh (mode:100755 sha1:7703809dc0743c6e4c1fa5b7d922a4efc16b4276) +++ 61059575269ed1027cfb66543251e182f87d1064/gitapply.sh (mode:100755 sha1:14a13ff23cff2a80f9a44c053002f837fec13e2c) @@ -8,9 +8,13 @@ # # Takes the diff on stdin. 
-gonefile=$(mktemp -t gitapply.XX) -todo=$(mktemp -t gitapply.XX) -patchfifo=$(mktemp -t gitapply.XX) +if [ -z $TMPDIR]; then + TMPDIR=/tmp +fi + +gonefile=$(mktemp $TMPDIR/gitapply.XX) +todo=$(mktemp $TMPDIR/gitapply.XX) +patchfifo=$(mktemp $TMPDIR/gitapply.XX) rm $patchfifo mkfifo -m 600 $patchfifo show-files --deleted $gonefile Index: gitcommit.sh === --- 6a612d42afdba20fd2150e319a689ed451b010e4/gitcommit.sh (mode:100755 sha1:a13bef2c84492ed75679d7d52bb710df35544f8a) +++ 61059575269ed1027cfb66543251e182f87d1064/gitcommit.sh (mode:100755 sha1:ee777605dccdc9737cf743f4f8c96b9bacd97f10) @@ -16,6 +16,9 @@ exit 1 } +if [ -z $TMPDIR]; then + TMPDIR=/tmp +fi [ -s .git/blocked ] die committing blocked: $(cat .git/blocked) @@ -67,7 +70,7 @@ fi echo Enter commit message, terminated by ctrl-D on a separate line: -LOGMSG=$(mktemp -t gitci.XX) +LOGMSG=$(mktemp $TMPDIR/gitci.XX) if [ $merging ]; then echo -n 'Merge with ' $LOGMSG echo -n 'Merge with ' @@ -111,7 +114,7 @@ if [ ! $customfiles ]; then rm -f .git/add-queue .git/rm-queue else - greptmp=$(mktemp -t gitci.XX) + greptmp=$(mktemp $TMPDIR/gitci.XX) for file in $customfiles; do if [ -s .git/add-queue ]; then fgrep -vx $file .git/add-queue $greptmp Index: gitdiff-do === --- 6a612d42afdba20fd2150e319a689ed451b010e4/gitdiff-do (mode:100755 sha1:218dfabeb4a5dcbd2cf58bd6f672f385690ec397) +++ 61059575269ed1027cfb66543251e182f87d1064/gitdiff-do (mode:100755 sha1:caf20ae034b8dc9f88922ee9f82809bb32a56231) @@ -32,7 +32,11 @@ [ $labelapp ] label=$label ($labelapp) } -diffdir=$(mktemp -d -t gitdiff.XX) +if [ -z $TMPDIR]; then + TMPDIR=/tmp +fi + +diffdir=$(mktemp -d $TMPDIR/gitdiff.XX) diffdir1=$diffdir/$id1 diffdir2=$diffdir/$id2 mkdir $diffdir1 $diffdir2 Index: gitdiff.sh === --- 6a612d42afdba20fd2150e319a689ed451b010e4/gitdiff.sh (mode:100755 sha1:195c3b9962c764855ec6168a78babf5867ea3046) +++ 61059575269ed1027cfb66543251e182f87d1064/gitdiff.sh (mode:100755 sha1:278511a3f491ed7d5e41bbd642acfd9b5a1d8257) @@ -80,6 +80,10 @@ shift fi 
+if [ -z $TMPDIR]; then + TMPDIR=/tmp +fi + if [ $parent ]; then id2=$id1 id1=$(parent-id $id2 | head -n 1) @@ -88,7 +92,7 @@ if [ $id2 = ]; then if [ $id1 != ]; then - export GIT_INDEX_FILE=$(mktemp -t gitdiff.XX) + export GIT_INDEX_FILE=$(mktemp $TMPDIR/gitdiff.XX) read-tree $(gitXnormid.sh $id1) update-cache --refresh fi Index: gitmerge.sh === --- 6a612d42afdba20fd2150e319a689ed451b010e4/gitmerge.sh (mode:100755 sha1:683755729b6f689ea43c692712fad6e51eeac354) +++ 61059575269ed1027cfb66543251e182f87d1064/gitmerge.sh (mode:100755 sha1:1c733bbdb9fe54c41787d962d0f55bb5f67d4c63) @@ -19,6 +19,10 @@ exit 1
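The scripts in this patch all apply the same rule: use $TMPDIR if it is set, otherwise fall back to /tmp, and then hand mktemp a full path template instead of relying on -t. A minimal sketch of that rule follows, written in C only so that it can be tested on its own; tmp_template() is a hypothetical helper and not part of git-pasky, and the six trailing X characters are an assumption chosen to match what mkstemp(3) requires.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: write "<tmpdir>/<prefix>.XXXXXX" into buf, where
 * <tmpdir> is $TMPDIR if set and non-empty, else "/tmp".
 * Returns 0 on success, -1 if the result does not fit in `len` bytes.
 */
int tmp_template(char *buf, size_t len, const char *prefix)
{
	const char *dir = getenv("TMPDIR");
	int n;

	if (!dir || !*dir)
		dir = "/tmp";	/* same fallback the scripts use */

	n = snprintf(buf, len, "%s/%s.XXXXXX", dir, prefix);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would typically pass the resulting buffer to mkstemp(3), which replaces the trailing X characters and creates the file.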
Re: ia64 git pull
Petr Baudis [EMAIL PROTECTED] writes:
> Still, why would you escape it? My shell will not take # as a comment start if it is immediately after an alphanumeric character.

I guess there MIGHT be some command shell implementation that stupidly _DID_ accept # as a comment character, even immediately after an alphanumeric. If that's true, then using # there would be a pain for portability.

But I think that's highly improbable. A quick peek at the Single Unix Specification as posted by the Open Group seems to say that, according to the standards, that's NOT okay:
http://www.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02
Basically, the command shell is supposed to tokenize, and # only means "comment" if it's at the beginning of a token.

And as far as I can tell, it's not an issue in practice either. I did a few quick tests on Fedora Core 3 and OpenBSD. On Fedora Core 3, I can say that bash, ash, and csh all do NOT consider # as a comment start if an alpha precedes it. The same is true for OpenBSD /bin/sh, /bin/csh, and /bin/rksh. If such different shells do the same thing (this stuff isn't even legal C-shell text!), it's likely others do too.

--- David A. Wheeler
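The tokenizing rule described above can be illustrated with a few lines of code. This is only a rough sketch of the idea, not a shell parser: comment_start() is a hypothetical function, it treats only spaces and tabs as token separators, and it deliberately ignores quoting, backslash escapes, and operators such as ';'.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the index at which a comment starts in
 * `line`, or -1 if there is none. A '#' counts as a comment start only
 * when it is the first character of a token, i.e. at the start of the
 * line or right after a space or tab.
 */
long comment_start(const char *line)
{
	int at_token_start = 1;
	size_t i;

	for (i = 0; line[i]; i++) {
		if (line[i] == '#' && at_token_start)
			return (long)i;
		at_token_start = (line[i] == ' ' || line[i] == '\t');
	}
	return -1;
}
```

Under this rule a '#' in the middle of a word, as in "foo#bar", is ordinary text, which matches the behaviour reported for the shells tested above.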
Re: Change pull to _only_ download, and git update=pull+merge?
Daniel Barkalow wrote:
> See, I don't think you ever want to just pull. You want to pull-and-do-something, but the something could be any operation...

In a _logical_ sense that's true; I'd only want to pull data if I intended to (possibly) do something with it. But as a _practical_ matter, I can see lots of reasons for doing a pull as a separate operation. One is disconnected operation; I may want to pull the data now, to prepare for disconnection, and then work later while disconnected. Another is using lots of data compared to the pipe size; if I have a dial-in modem, or I want the history of the linux kernel since 0.0.1, I might want to pull and go away/go to sleep for the night. I might use cron/at to automatically pull at 3am from some interesting branches. The next day, I could then pull again to update just what changed, and/or do the operation I intended to do if the operation auto-pulls the missing data.

> I'm actually getting suspicious that the right thing is to hide pull in the id scheme. That is, instead of saying "linus" to refer to the linus head that you currently have, you say "+linus" to refer to the head Linus has on his server currently, and this will cause you to download anything necessary to perform the operation with the resulting value.

That's an interesting idea. I'll have to think about that.

What command would you suggest for the common case of "update with current track"? I've proposed "git update [NAME]". "git merge" with update-from-current-track as the default seems unclear, and I worry that I might accidentally press RETURN too soon and merge with the wrong thing. And I like the idea of "git update" doing the same thing (essentially) as "cvs update" and "svn update"; LOTS of people know what "update" does, so using the same command name for one of the most common operations smooths the transition. (GNU Arch's "tla update" is almost, though not exactly, the same too.)

I still think it's important to have a very simple command that updates your current branch with a tracked branch (because it's common to stay in sync with a master branch), and a way to just download the data without doing things with it YET (because you want to do things in stages). The commands "update" and "pull" come to mind when thinking that way, though as long as the commands are simple and clear that's a good thing. (I think it's a GOOD idea to use the same commands as CVS and Subversion when the results are essentially the same, just because so many people are already familiar with them, but only where it makes sense.)

--- David A. Wheeler
Re: [script] ge: export commits as patches
Petr Baudis wrote:
> Dear diary, on Tue, Apr 19, 2005 at 03:48:43PM CEST, I got a letter where Ingo Molnar [EMAIL PROTECTED] told me that...
>> is there any 'export commit as patch' support in git-pasky?
>
> Nice idea. I will add it, probably as 'git patch'.

Eek! It's a nice idea, and it'd be great as a subcommand. But PLEASE don't name it "patch". I already know what "patch" does: "patch" ACCEPTS a patch... it doesn't create one ;-). How about naming it "aspatch" or "asdiff" instead? Or something else (good names, anyone?).

<soapbox_to_choir>Good externally-viewed names are critical... good command names that are similar to what people already know can really help make the tool a joy to use.</soapbox_to_choir>

--- David A. Wheeler
Re: [2/4] Sorting commits by date
Petr Baudis wrote: [Re: Daniel Barkalow [EMAIL PROTECTED]'s patch]
> Note that you are breaking gcc-2.95 compatibility when using declarator in the middle of a block. Not that it might be a necessarily bad thing ;-) (although I still use gcc-2.95 a lot), just to ring a bell so that it doesn't slip through unnoticed and we can decide on a policy regarding this.

I, at least, would REALLY like to see _highly_ portable C code; I'm looking at git as a potential long-term useful SCM tool for LOTS of projects, and if you're going to write C, it'd be nice to just write it portably to start with. There's certainly no crisis in using separate declarators.

In fact, in the LONG term I'd like to see the shell code replaced with code that easily runs everywhere (Windows, etc.), again, for portability's sake. I think it would be unwise to do that right now; the shell is an excellent prototyping tool. But once things have settled down and there's been some experience with the tools, the pieces could be slowly recoded. (Yes, I know of and use Cygwin. And I prefer Python over Perl, but I'm really uninterested in language wars.)

--- David A. Wheeler
Re: Storing permissions
Linus Torvalds wrote:
> On Sat, 16 Apr 2005, Paul Jackson wrote:
>> Morten wrote:
>>> It makes some sense in principle, but without storing what they mean (i.e., group==?) it certainly makes no sense.
>>
>> There's no "they" there. I think Martin's proposal, to which I agreed, was to store a _single_ bit. If any of the execute permissions of the incoming file are set, then the bit is stored ON, else it is stored OFF. On 'checkout', if the bit is ON, then the file permission is set mode 0777 (modulo umask), else it is set mode 0666 (modulo umask).
>
> I think I agree. Anybody willing to send me a patch? One issue is that if done the obvious way it's an incompatible change, and old tree objects won't be valid any more. It might be ok to just change the "compare cache" check to only care about a few bits, though: S_IXUSR and S_IFDIR.

There's a minor reason to write out ALL the perm bit data, but only care about a few bits coming back in: some people use SCM systems as a generalized backup system, so you can back up your system to an arbitrary known state in the past (e.g., "Change my /etc files to the state I was at just before I installed that *#@ program!"). For more on this, see:
http://www.onlamp.com/pub/a/onlamp/2005/01/06/svn_homedir.html

If you store all the bits, then you CAN restore things more exactly the way they were. This is imperfect, since it doesn't cover more exotic permission values from SELinux, xattrs, whatever. For some, that's enough.

Yeah, I know, not the main purpose of git. But what the heck, I _like_ flexible infrastructures.

--- David A. Wheeler
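The single-bit proposal quoted above is small enough to write down directly. The following is a minimal sketch of that rule only, not git's actual code; stored_exec_bit() and checkout_mode() are hypothetical names, and the umask is passed in as a parameter so the functions have no side effects.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: collapse an incoming file mode to the one stored
 * bit. ON (1) if any execute permission is set, else OFF (0).
 */
int stored_exec_bit(mode_t mode)
{
	return (mode & (S_IXUSR | S_IXGRP | S_IXOTH)) != 0;
}

/*
 * Hypothetical helper: mode to apply on checkout for a given stored bit,
 * i.e. 0777 or 0666 "modulo umask".
 */
mode_t checkout_mode(int exec_bit, mode_t mask)
{
	mode_t base = exec_bit ? 0777 : 0666;

	return base & ~mask;
}
```

With the common umask of 022 this yields 0755 for files stored with the bit ON and 0644 for files stored with it OFF. Storing every permission bit, as suggested for the backup use case, would replace stored_exec_bit() with the full mode.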
Re: Storing permissions
Linus Torvalds wrote: On Sun, 17 Apr 2005, David A. Wheeler wrote: There's a minor reason to write out ALL the perm bit data, but only care about a few bits coming back in: Some people use SCM systems as a generalized backup system Yes. I was actually thinking about having system config files in a git repository when I started it, since I noticed how nicely it would do exactly that. However, since the mode bits also end up being part of the name of the tree object (ie they are most certainly part of the hash), it's really basically impossible to only care about one bit but writing out many bits: it's the same issue of having multiple identical blocks with different names. ... One solution is to tell git with a command line flag and/or config file entry that for this repo, I want you to honor all bits. That should be easy enough to add at some point, and then you really get what you want. Yes, I thought of that too. And I agree, that should do the job. My real concern is I'm looking at the early design of the storage format so that it's POSSIBLE to extend git in obvious ways. As long as it's possible later, then that's a great thing. ... Also, I made a design decision that git only cares about non-dotfiles. Git literally never sees or looks at _anything_ that starts with a ".". I think that's absolutely the right thing to do for an SCM (if you hide your files, I really don't think you should expect the SCM to see it), but it's obviously not the right thing for a backup thing. Again, a command line flag or config file entry could change that in the future, if desired. So this is a decision that could be changed later... the best kind of decision :-). --- David A. Wheeler
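The point about mode bits being part of the tree object's name can be illustrated with a toy example. The sketch below is NOT git's real tree format; `toy_tree_id` and its entry layout are assumptions made only to show that when the mode is hashed along with the name and blob id, two trees that differ only in permission bits get different names.

```python
import hashlib

def toy_tree_id(entries):
    """Hash a list of (mode, name, blob_hex) entries.

    Simplified stand-in for a tree object, not git's on-disk format.
    """
    h = hashlib.sha1()
    for mode, name, blob_hex in sorted(entries, key=lambda e: e[1]):
        h.update(f"{mode:o} {name}\0".encode("utf-8"))  # mode is hashed too
        h.update(bytes.fromhex(blob_hex))
    return h.hexdigest()

blob = hashlib.sha1(b"hello\n").hexdigest()           # same content both times
tree_a = toy_tree_id([(0o100644, "README", blob)])    # rw-r--r--
tree_b = toy_tree_id([(0o100664, "README", blob)])    # rw-rw-r--
print(tree_a != tree_b)  # True: identical content, different tree names
```

This is why "write all bits, compare only a few" does not work as-is: identical content would end up stored under several different names.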
Re: Parseable commit header
Stefan-W. Hahn wrote: Hi, after playing a while with git-pasky it is a crap to interpret the date of commit logs. Though it was a good idea to put the date in a parseable format (seconds since), but the format of the commit itself is not good parseable. Should be: ... Committer-Dater: 1113684324 +0200 I'm probably coming in late to the game, but exactly why is seconds-since-epoch format used instead of a format more easily understood by humans? Yes, I know tools can easily convert that, but you're already using an ASCII format; why not just record it in a format that's easily eyeballed like ISO's yyyymmddThhmmss [timezone]? E.g.: 20050417T171520 +0200 or some such? I'm SURE that people will mention things like "the patch I posted on April 17, 2005", and having the patch format record times that way, directly, would be convenient to the poor slobs^H^H^H^H^H developers who come later. Yes, a tool can handle the conversion, but choosing formats so a tool is unneeded for simple stuff is often better! --- David A. Wheeler
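For readers who want to see the two date representations side by side, here is a small Python sketch that converts the seconds-since-epoch value quoted above into the compact ISO-style form suggested in the reply. The helper name `to_iso_basic` is hypothetical, and the sketch assumes the timezone is always given as a sign followed by four digits (for example +0200).

```python
from datetime import datetime, timedelta, timezone

def to_iso_basic(seconds: int, tz: str) -> str:
    """Convert epoch seconds plus a +hhmm/-hhmm offset to 'yyyymmddThhmmss +hhmm'."""
    sign = -1 if tz[0] == "-" else 1
    offset = sign * timedelta(hours=int(tz[1:3]), minutes=int(tz[3:5]))
    local = datetime.fromtimestamp(seconds, timezone(offset))
    return local.strftime("%Y%m%dT%H%M%S") + " " + tz

# The committer date quoted in the message above.
print(to_iso_basic(1113684324, "+0200"))  # 20050416T224524 +0200
```

The conversion is trivial for a tool, which is the original poster's argument; the counter-argument in the reply is that the second form needs no tool at all when someone is reading the raw commit.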
Re: Yet another base64 patch
Paul Jackson wrote: David wrote: My list would be: ext2, ext3, NFS, and Windows' NTFS (stupid short filenames, case-insensitive/case-preserving). I'm no mind reader, but I'd bet a pretty penny that what you have in mind and what Linus has in mind have no overlaps in their solution sets. Sadly, I lack the mind reading ability as well. Our goals are, I suspect, somewhat different. Linus wants to build a tool that meets his specific needs (managing kernel development), and he has particular requirements (such as fast simple merging when working at large scales). In contrast, I'm hoping for a more general OSS/FS SCM tool that many others can use as well. But I think there's heavy overlap in the solution space. The Linux kernel project is, to my knowledge, the largest project using a truly distributed SCM process. Anyone else who is considering a distributed SCM process would at _least_ want to think about how the Linux kernel project works, and if they're doing so, they might also want to reuse the development tools. I'm just taking a peek, and looking for situations where a design decision is irrelevant for his purposes, but a particular direction would be of particular help to other projects. I'm more worried about the storage format; if the code doesn't support some particular feature but it could be added later without great pain, no big deal. If something would imply a complete rewrite, that's undesirable. --- David A. Wheeler
Re: Re-done kernel archive - real one?
On Sun, 17 Apr 2005, Russell King wrote: One thing which definitely needs to be considered is - what character encoding are the comments to be stored as? ... I replied: I would _heartily_ recommend moving towards UTF-8 as the internal charset for all comments. Petr said: Not that the plumbing should actually _care_ at all; anyone who uses it should take the care, so this is more of a social thing. The _plumbing_ shouldn't care, but the stuff above needs to know how to interpret the stuff that the plumbing produces. Russell King said: Except, I believe, MicroEMACS, which both Linus and myself use. As far as I know, there aren't any patches to make it UTF-8 compliant. Since plain ASCII is a subset of UTF-8, as long as MicroEMACS users only create ASCII comments, then the comments you create in MicroEMACS will still be UTF-8. No big deal. For reading comments, if the text is almost entirely plain ASCII, you could just ignore the problem and have the occasional character scramble. If you need more, you'll need a tool that's more internationalized or a working iconv, but if that's important you'd be motivated. Again, I'm looking for more generalized solutions, where non-English comments are more common than in Linux kernel code. --- David A. Wheeler
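The claim that ASCII-only comments are already valid UTF-8, and that an ASCII-only reader merely scrambles the occasional character, can be checked with a few lines of Python. The comment strings below are made up for illustration, and `is_plain_ascii` is a hypothetical helper.

```python
def is_plain_ascii(data: bytes) -> bool:
    """True if every byte is below 0x80, i.e. the data is 7-bit ASCII."""
    return all(b < 0x80 for b in data)

ascii_comment = "Fix off-by-one in cache lookup"
mixed_comment = "Größe prüfen"  # hypothetical non-English comment

ascii_bytes = ascii_comment.encode("utf-8")
mixed_bytes = mixed_comment.encode("utf-8")

# ASCII text yields exactly the same bytes under either encoding.
same = ascii_bytes == ascii_comment.encode("ascii")

# An ASCII-only reader garbles only the non-ASCII characters; the rest survives.
garbled = mixed_bytes.decode("ascii", errors="replace")

print(same, is_plain_ascii(ascii_bytes), is_plain_ascii(mixed_bytes))
print(garbled)
```

So an editor without UTF-8 support can still write valid comments as long as it sticks to ASCII, and reading mostly-ASCII comments degrades gracefully rather than failing.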
Re: Yet another base64 patch
Paul Jackson wrote: Earlier, hpa wrote: The base64 version has 2^12 subdirectories instead of 2^8 (I just used 2 characters as the hash key just like the hex version.) Later, hpa wrote: Ultimately the question is: do we care about old (broken) filesystems? I'd imagine we care a little - just not a lot. Some people (e.g., me) would really like for git to be more forgiving of nasty filesystems, so that git can be used very widely. I.e., be forgiving about case insensitivity, poor performance or problems with a large # of files in a directory, etc. You're already working to make sure git handles filenames with spaces and i18n filenames, a common failing of many other SCM systems. If git is used for Linux kernel development and nothing else, it's still a success. But it'd be even better from my point of view if git was a useful tool for MANY other projects. I think there are advantages, even if you only plan to use git for the kernel, to making git easier to use for other projects. By making git less sensitive to the filesystem, you'll attract more (non-kernel-dev) users, some of whom will become new git developers who add cool new functionality. As noted in my SCM survey (http://www.dwheeler.com/essays/scm.html), I think SCM Windows support is really important to a lot of OSS projects. Many OSS projects, even if they start Unix/Linux only, spin off a Windows port, and it's painful if their SCM can't run on Windows then. Problems running on NFS filesystems have caused problems with GNU Arch users (there are workarounds, but now you need to learn about workarounds instead of things just working). If nothing else, look at the history of other SCM projects: all too many have undergone radical and painful surgeries so that they can be more portable to various filesystems. It's a trade-off, I know. --- David A. Wheeler
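The 2^8 versus 2^12 figures quoted above, and the case-insensitivity concern raised in the reply, can be worked through in a short Python sketch. The alphabet used here is the standard base64 one, which is an assumption: the patch under discussion may use a different character set, and a real on-disk scheme would have to replace "/" since it cannot appear in a file name.

```python
import string

HEX_ALPHABET = "0123456789abcdef"
# Assumption: the standard base64 alphabet, used only for counting.
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def fanout(alphabet: str, prefix_len: int = 2) -> int:
    """Number of possible subdirectory names for a prefix of prefix_len characters."""
    return len(alphabet) ** prefix_len

def fanout_case_insensitive(alphabet: str, prefix_len: int = 2) -> int:
    """Distinct names left once the filesystem treats 'A' and 'a' as the same."""
    folded = {ch.lower() for ch in alphabet}
    return len(folded) ** prefix_len

print(fanout(HEX_ALPHABET))                   # 256  (2^8)
print(fanout(B64_ALPHABET))                   # 4096 (2^12)
print(fanout_case_insensitive(HEX_ALPHABET))  # 256
print(fanout_case_insensitive(B64_ALPHABET))  # 1444
```

On a case-insensitive filesystem the hex layout is unaffected, while a base64 layout would see distinct prefixes such as "Ab" and "aB" collide, which is one example of the kind of filesystem sensitivity the message argues against.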
Re: SHA1 hash safety
Paul Jackson wrote: what I'm talking about is the chance that somewhere, sometime there will be two different documents that end up with the same hash I have vastly greater chance of a file colliding due to hardware or software glitch than a random message digest collision of two legitimate documents. The probability of an accidental overlap for SHA-1 for two different files is absurdly remote; it's just not worth worrying about. However, the possibility of an INTENTIONAL overlap is a completely different matter. I think the hash algorithm should change in the future; I have a proposal below. Someone has ALREADY broken into a server to modify the Linux kernel code, so the idea of an attack on kernel code is not an idle fantasy. MD5 is dead, and SHA-1's work factor has already been sufficiently broken that people have already been told to "walk to the exits" (i.e., DO NOT USE SHA-1 for new programs like git). The fact that blobs are compressed first, with a length header in front, _may_ make it harder to attack. But maybe not. I haven't checked for this case, but most decompression algorithms I know of have a "don't change" mode that essentially just copies the data behind it. If the one used in git has such a mode (I bet it does!), an attacker could use that mode to make it MUCH easier to create an attack vector than it would appear at first. Now the attacker just needs to create a collision (hmmm, where was that paper?). Remember, you don't need to run a hash algorithm over an entire file; you can precompute to near the end, and then try your iterations from there. A little hardware (inc. FPGAs) would speed the attack. Of course, that assumes you actually check everything to make sure that an attacker can't slip in something different. After each rsync, are all new files' hash values checked? Do they uncompress to the right length? Do they have excess data after the decompression? 
I'm hoping that sort of input-checking (since the data might be from an attacker, if indirectly!) is already going on, though I haven't reviewed the git source code. While the jury's still out, the current belief by most folks I talk to is that SHA-1 variants with more bits, such as SHA-256, are the way to go now. The SHA-1 attack simply reduces the work factor (it's not a COMPLETE break), so adding more bits is believed to increase the work factor enough to counter it. Adding more information to the hash can make attacking even harder. Here's one idea: whenever that hash algorithm switch occurs, create a new hash value like this: SHA-256 + uncompressed-length, where SHA-256 is computed just like SHA-1 is now, e.g., SHA-256(file) where file = typecode + length + compressed data. Leave the internal format as-is (with the length embedded as well). This means that an attacker has to come up with an attack that creates the same length uncompressed, yet has the same hash of the compressed result. That's harder to do. Length is also really, really cheap to compute :-). That also might help convince the "what happens if there's an accidental collision" crowd: now, if the file lengths are different, you're GUARANTEED that the hash values are different, though that's not the best reason to do that. One reason to think about switching sooner rather than later is that it'd be really nice if the object store also included signatures, so that in one fell swoop you could check who signed what (and thus you could later on CONFIRM with much more certainty who REALLY submitted a given change... say if it was clearly malicious). If you switch hash algorithms, the signatures might not work, depending on how you do it. --- David A. Wheeler
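The "SHA-256 + uncompressed-length" idea and the input checks asked about above (does the hash match, does the data uncompress to the right length, is there excess data after the compressed stream) can be sketched in Python as follows. This is an illustration under stated assumptions, not git's actual object format: the header layout, the "-" separator in the identifier, and the names `make_object` and `verify_object` are all hypothetical.

```python
import hashlib
import zlib

def _header(typecode: str, length: int) -> bytes:
    # Assumed layout: "<typecode> <length>\0", placed in front of the compressed data.
    return f"{typecode} {length}\0".encode("ascii")

def make_object(typecode: str, content: bytes):
    """Return (object_id, compressed) where object_id = SHA-256 hex + '-' + length."""
    compressed = zlib.compress(content)
    digest = hashlib.sha256(_header(typecode, len(content)) + compressed).hexdigest()
    return f"{digest}-{len(content)}", compressed

def verify_object(typecode: str, compressed: bytes, object_id: str) -> bool:
    """Check hash, uncompressed length, and absence of trailing data."""
    digest, _, length_text = object_id.rpartition("-")
    if not length_text.isdigit():
        return False
    claimed_length = int(length_text)
    decomp = zlib.decompressobj()
    try:
        content = decomp.decompress(compressed)
    except zlib.error:
        return False
    if not decomp.eof or decomp.unused_data:   # truncated stream or excess data
        return False
    if len(content) != claimed_length:         # wrong uncompressed length
        return False
    expected = hashlib.sha256(_header(typecode, claimed_length) + compressed).hexdigest()
    return expected == digest

oid, data = make_object("blob", b"hello world\n")
print(oid.endswith("-12"))                            # True: length is part of the id
print(verify_object("blob", data, oid))               # True
print(verify_object("blob", data + b"junk", oid))     # False: excess data detected
```

Because the length is part of the identifier, two objects of different uncompressed length can never share an identifier, which is the guarantee mentioned in the message; the hash still does the real work against same-length forgeries.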
Re: Storing permissions
Paul Jackson wrote: Junio wrote: Sounds like svn I have no idea what svn is. svn = common abbreviation for Subversion, a widely-used centralized SCM tool intentionally similar to CVS. --- David A. Wheeler