Re: [RFC v2] blame: new option --prefer-first to better handle merged cherry-picks
* Junio C Hamano gits...@pobox.com [140113 23:31]: I read the updated documentation three times but it still does not answer any of my questions I had in $gmane/239888, the most important part of which was: Yeah, the cherry-picked one will introduce the same change as the one that was cherry-picked, so if you look at the end result and ask where did _this_ line come from?, there are two equally plausible candidates, as blame output can give only one answer to each line. I still do not see why the one that is picked with the new option is better. Because: - it will blame the modifications of merged cherry-picked commit to only one commit. Without the option parts of the modification will be reported as coming from the one, parts will be reported to be from the other. With the option only one of those two commits is reported as the origin at the same time and not both. - it is more predictable which commit is blamed, so if one is interested in where some commit was introduced first into a mainline, one gets this information, and not somtimes a different one due to unrelated reasons. To put it another way, why/when would an end user choose to use this option? If the result of using this option is always better than without, why/when would an end user choose not to use this option? While the result is more consistent and more predictable in the case of merged cherry picks, it is also slower in every case. Usually speed will be more important than this exactness, especially as the result will not differ for the common case (if there are no cherry-picked commits merged or when those commits do not touch any files that are otherwise only modified in the merged branch). Bernhard R. Link -- F8AC 04D5 0B9B 064B 3383 C3DA AFFC 96D1 151D FFDC -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC v2] blame: new option --prefer-first to better handle merged cherry-picks
Allows to disable the git blame optimization of assuming that if there is a parent of a merge commit that has the exactly same file content, then only this parent is to be looked at. This optimization, while being faster in the usual case, means that in the case of cherry-picks the blamed commit depends on which other commits touched a file. If for example one commit A modified both files b and c. And there are commits B and C, B only modifies file b and C only modifies file c (so that no conflicts happen), and assume A is cherry-picked as A' and the two branches then merged: --o-B---A \ \ ---C---A'--M--- Then without this new option git blame blames the A|A' changes of file b to A while blaming the changes of c to A'. With the new option --prefer-first it blames both changes to the same commit and to the one more on the left side of the graph. Signed-off-by: Bernhard R. Link brl...@debian.org --- Documentation/blame-options.txt | 6 ++ builtin/blame.c | 7 +-- 2 files changed, 11 insertions(+), 2 deletions(-) Differences to first round: rename option and describe the effect instead of the implementation in documentation. diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt index 0cebc4f..b2e7fb8 100644 --- a/Documentation/blame-options.txt +++ b/Documentation/blame-options.txt @@ -48,6 +48,12 @@ include::line-range-format.txt[] Show the result incrementally in a format designed for machine consumption. +--prefer-first:: + If a line was introduced by two commits (for example via + a merged cherry-pick), prefer the commit that was + first merged in the history of always following the + first parent. + --encoding=encoding:: Specifies the encoding used to output author names and commit summaries. Setting it to `none` makes blame diff --git a/builtin/blame.c b/builtin/blame.c index 4916eb2..8ea34cf 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -45,6 +45,7 @@ static int incremental; static int xdl_opts; static int abbrev = -1; static int no_whole_file_rename; +static int prefer_first; static enum date_mode blame_date_mode = DATE_ISO8601; static size_t blame_date_width; @@ -1248,7 +1249,8 @@ static void pass_blame(struct scoreboard *sb, struct origin *origin, int opt) porigin = find(sb, p, origin); if (!porigin) continue; - if (!hashcmp(porigin-blob_sha1, origin-blob_sha1)) { + if (!prefer_first + !hashcmp(porigin-blob_sha1, origin-blob_sha1)) { pass_whole_blame(sb, origin, porigin); origin_decref(porigin); goto finish; @@ -2247,7 +2249,8 @@ int cmd_blame(int argc, const char **argv, const char *prefix) static const char *contents_from = NULL; static const struct option options[] = { OPT_BOOL(0, incremental, incremental, N_(Show blame entries as we find them, incrementally)), - OPT_BOOL('b', NULL, blank_boundary, N_(Show blank SHA-1 for boundary commits (Default: off))), + OPT_BOOL(0, prefer-first, prefer_first, N_(Prefer blaming commits merged earlier)), + OPT_BOOL('b', NULL, blank_boundary, N_(Show blank SHA-1 for boundary commits (Default: ff))), OPT_BOOL(0, root, show_root, N_(Do not treat root commits as boundaries (Default: off))), OPT_BOOL(0, show-stats, show_stats, N_(Show work cost statistics)), OPT_BIT(0, score-debug, output_option, N_(Show output score for blame entries), OUTPUT_SHOW_SCORE), -- 1.8.5.1 Bernhard R. Link -- F8AC 04D5 0B9B 064B 3383 C3DA AFFC 96D1 151D FFDC -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC] blame: new option to better handle merged cherry-picks
* Junio C Hamano gits...@pobox.com [140102 21:29]: This optimization, while being faster in the usual case, means that in the case of cherry-picks the blamed commit depends on which other commits touched a file. If for example one commit A modified both files b and c. And there are commits B and C, B only modifies file b and C only modifies file c (so that no conflicts happen), and assume A is cherry-picked as A' and the two branches then merged: --o-B---A \ \ ---C---A'--M--- So the contents of b at M is as the same as in A, so following 'b' will see A and B changed that path, which is correct. The contents of c at M is? It is different from A because at A c lacks the change made to it at C. The merged result at M would match C in A', no? So following 'c' will see A' and C changed that path, no? So what is wrong about it? It's not wrong (that's why I do not suggest to change the default behaviour), but it's inconsistent and can be a bit confusing to have either the one or the other commit blamed depending on whether some file was touched or not. The history I'm a bit more concerned is something like (with ... being unrelated commits not touching B or C): --o-...---A--...---B---...-- \\ ---...---A'--...---C---...---M--- Here having B or C touching b or c determines which of A or A' is blamed for which part of the patch. It's even enough to have: --...---A'--...---B---...-- /\ ---o---...---A-----M--- To have the A/A' changes of c to be attributed to A while the b changes are attributed to A'. I.e. you have a master branch that has commit A, which is also cherry-picked to some previously forked side-branch. Once that side-branch is merged back, parts of the change are attributed to A' if they are in a file that is not touched otherwise in the main branch. Also, when handling a merge, we have to handle parents sequencially, checking the difference between M with its first parent first, and then passing blame for the remaining common lines to the remaining parents. If you flip the order of parents of M when you merge A and A' in your original history, and with your patch, what would you see when you blame c? Wouldn't it notice that M:c is identical to c in its first parent (now A') and pass the whole blame to A' anyway with or without your change? When giving git-blame the new option introduced with my patch, only the order of parents determines which commit is blamed. Without the option (i.e. the currently only possible behaviour) which commit is blamed depends what else touches other parts of the file. If both branches make modifications to the file (or if there is any merge conflict resolution in the merge) then the bahaviour with or without the option are the same. But in the example with one commit B touching also b and one commit C touching also c, there is (without the new option) always one part of the cherry-picked commit is blamed on the original and one on the cherry-picked, no matter how you order the parents. (While by having your mainline always the most leftward parent, with the the new option you always get those commit blamed that is the first one this was introduced to mainline.) Bernhard R. Link -- F8AC 04D5 0B9B 064B 3383 C3DA AFFC 96D1 151D FFDC -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git:// protocol over SSL/TLS
* Sergey Sharybin sergey@gmail.com [131227 15:25]: Security in this case is about being sure everyone gets exactly the same repository as stored on the server, without any modifications to the sources cased by MITM. Note that ssl (and thus https) only helps here against a resource-less man-in-the-middle. Getting catch-all CA-signed certificates is said to no longer available to everyone as easily as it was some years ago, but unless you allow only one private CA (and even there clients often fail) you still should assume everyone resourceful enough to still be able to do MITM. Bernhard R. Link -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git-daemon: needs /root/.config/git/config?
* Ian Kumlien po...@vapor.com [130605 13:31]: Yes, i agree, it's suboptimal but I for one would use getpwuid to get the home directory of the executing user to avoid this - though i don't know how portable it is (or if there is any other issues) It's not only suboptimal but simply wrong. getpwuid gives at best the initial home directory, and even there it is only a guess. (If you are looking for some home directory of a different user it might be a good guess). But using getpwuid(getuid())-pw_dir if HOME is set is a serious mistake, as you throw out the good value for some almost but not quite totally unrelated value. Bernhard R. Link -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: English/German terminology, git.git's de.po, and pro-git
* Ralf Thielow ralf.thie...@gmail.com [130522 17:17]: remote branch = Remote-Branch remote-tracking branch = Remote-Tracking-Branch upstream branch= Upstream-Branch Yes. What's the main reason for using Branch in the German text? Consistency with the commands, or assumed familiarity of the term within the target audience? Zweig is available. I think it's at the same level as Commit and a well known SCM-term. Users (even beginners) who know Commit and Tag do also know Branch. And I think it sounds better in combination with Remote-, Remote-Tracking- and Upstream- which are english words. Additionally Zweig might be a bit misleading. A branch is not part of the trees. It is called branch because in other VCSes the commits build a tree and a any commit outside of the main branch of that tree is part of exactly one different branch (so the head of that branch and the branch are synonymns). With git the commits are no longer a tree, so a git-branch is no branch and does not describe the whole branch of the tree of commits but is just a names pointer into the graph of commits. As it lost all meanings of the original word branch, translating it with a translation of the original English word might more confusion than helping anyone. (same for push). In other messages, the translation is in the same message as the command itself. I think it's OK when we just use fetch and push when the command is meant (as it's done for pull, e.g. in error messages), and the translation when the messages tell what the command is doing (e.g. help messages). So it would depends on the message whether we translate the word or not. This would apply to other terms that are commands, too, like clean or revert. I'd not call it OK. It's the only sane possibility. If you speak about the magic keyword you have to give the command line, you won't translate it, of course[1]. (The obvious interesting case is where the English text plays with the command name having a meaning as word itself. Here the translation will have to diverge to differentiate between both (or sacrifice one of them, where it is not important)). [1] Unlike you want to introduce a translated command line interface, like Depp anfordere Herkunft Original instead of git fetch origin master diff = Differenz delta = Differenz (or Delta) patch = Patch apply = anwenden diffstat = (leave it as it is) hunk = Bereich IMHO Kontext is better if you use a German word. Technically the context is something else, but in a German text IMHO it fits nicer when explaining to the user where he/she can select the n-th hunk. Not sure if German users would know what hunk means, in case we leave it untranslated. And I'm not sure if I would understand Kontext. I tend to leave it untranslated. Anyone found a German translation of the Patch manpage? Translating the English word-play here, I'd suggest Block or Patch-Block. paths = Pfade symbolic link = symbolische Verknüfung path = Pfad link = Verknüpfung In the filesystem a Link is a Verweis in Unix, not a Verknüpfung (that are usually the pseudo-links Windows supports). Bernhard R. Link -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: possible Improving diff algoritm
* Javier Domingo javier...@gmail.com [121214 13:20]: I think the idea of being preferable to have a blank line at the end of the added/deleted block is key in this case. For symmetry I'd suggest to make it preferable to have blank lines at the end or the beginning. { old + } + + { + new } vs { old } + + { + new + } is just the same case in blue. (Although empty lines alone feels not quite optimal, it is at least a good start). Bernhard R. Link -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html