Re: [RFC v2] blame: new option --prefer-first to better handle merged cherry-picks

2014-01-13 Thread Bernhard R. Link
* Junio C Hamano gits...@pobox.com [140113 23:31]:
 I read the updated documentation three times but it still does not
 answer any of my questions I had in $gmane/239888, the most
 important part of which was:

 Yeah, the cherry-picked one will introduce the same change as
 the one that was cherry-picked, so if you look at the end result
 and ask where did _this_ line come from?, there are two
 equally plausible candidates, as blame output can give only
 one answer to each line.  I still do not see why the one that is
 picked with the new option is better.

Because:
  - it will blame the modifications of merged cherry-picked commit
to only one commit. Without the option parts of the modification
will be reported as coming from the one, parts will be reported
to be from the other. With the option only one of those two commits
is reported as the origin at the same time and not both.
  - it is more predictable which commit is blamed, so if one is
interested in where some commit was introduced first into a
mainline, one gets this information, and not somtimes a different
one due to unrelated reasons.

 To put it another way, why/when would an end user choose to use this
 option?  If the result of using this option is always better than
 without, why/when would an end user choose not to use this option?

While the result is more consistent and more predictable in the case
of merged cherry picks, it is also slower in every case. Usually speed
will be more important than this exactness, especially as the result
will not differ for the common case (if there are no cherry-picked
commits merged or when those commits do not touch any files that are
otherwise only modified in the merged branch).

Bernhard R. Link
-- 
F8AC 04D5 0B9B 064B 3383  C3DA AFFC 96D1 151D FFDC
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC v2] blame: new option --prefer-first to better handle merged cherry-picks

2014-01-12 Thread Bernhard R. Link
Allows to disable the git blame optimization of assuming that if there is a
parent of a merge commit that has the exactly same file content, then
only this parent is to be looked at.

This optimization, while being faster in the usual case, means that in
the case of cherry-picks the blamed commit depends on which other commits
touched a file.

If for example one commit A modified both files b and c. And there are
commits B and C, B only modifies file b and C only modifies file c
(so that no conflicts happen), and assume A is cherry-picked as A'
and the two branches then merged:

--o-B---A
   \ \
---C---A'--M---

Then without this new option git blame blames the A|A' changes of
file b to A while blaming the changes of c to A'.
With the new option --prefer-first it blames both changes to the
same commit and to the one more on the left side of the graph.

Signed-off-by: Bernhard R. Link brl...@debian.org
---
 Documentation/blame-options.txt | 6 ++
 builtin/blame.c | 7 +--
 2 files changed, 11 insertions(+), 2 deletions(-)

 Differences to first round: rename option and describe the effect
 instead of the implementation in documentation.

diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt
index 0cebc4f..b2e7fb8 100644
--- a/Documentation/blame-options.txt
+++ b/Documentation/blame-options.txt
@@ -48,6 +48,12 @@ include::line-range-format.txt[]
Show the result incrementally in a format designed for
machine consumption.
 
+--prefer-first::
+   If a line was introduced by two commits (for example via
+   a merged cherry-pick), prefer the commit that was
+   first merged in the history of always following the
+   first parent.
+
 --encoding=encoding::
Specifies the encoding used to output author names
and commit summaries. Setting it to `none` makes blame
diff --git a/builtin/blame.c b/builtin/blame.c
index 4916eb2..8ea34cf 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -45,6 +45,7 @@ static int incremental;
 static int xdl_opts;
 static int abbrev = -1;
 static int no_whole_file_rename;
+static int prefer_first;
 
 static enum date_mode blame_date_mode = DATE_ISO8601;
 static size_t blame_date_width;
@@ -1248,7 +1249,8 @@ static void pass_blame(struct scoreboard *sb, struct 
origin *origin, int opt)
porigin = find(sb, p, origin);
if (!porigin)
continue;
-   if (!hashcmp(porigin-blob_sha1, origin-blob_sha1)) {
+   if (!prefer_first 
+   !hashcmp(porigin-blob_sha1, origin-blob_sha1)) {
pass_whole_blame(sb, origin, porigin);
origin_decref(porigin);
goto finish;
@@ -2247,7 +2249,8 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
static const char *contents_from = NULL;
static const struct option options[] = {
OPT_BOOL(0, incremental, incremental, N_(Show blame entries 
as we find them, incrementally)),
-   OPT_BOOL('b', NULL, blank_boundary, N_(Show blank SHA-1 for 
boundary commits (Default: off))),
+   OPT_BOOL(0, prefer-first, prefer_first, N_(Prefer blaming 
commits merged earlier)),
+   OPT_BOOL('b', NULL, blank_boundary, N_(Show blank SHA-1 for 
boundary commits (Default: ff))),
OPT_BOOL(0, root, show_root, N_(Do not treat root commits 
as boundaries (Default: off))),
OPT_BOOL(0, show-stats, show_stats, N_(Show work cost 
statistics)),
OPT_BIT(0, score-debug, output_option, N_(Show output score 
for blame entries), OUTPUT_SHOW_SCORE),
-- 
1.8.5.1

Bernhard R. Link
-- 
F8AC 04D5 0B9B 064B 3383  C3DA AFFC 96D1 151D FFDC
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] blame: new option to better handle merged cherry-picks

2014-01-02 Thread Bernhard R. Link
* Junio C Hamano gits...@pobox.com [140102 21:29]:
  This optimization, while being faster in the usual case, means that in
  the case of cherry-picks the blamed commit depends on which other commits
  touched a file.
 
  If for example one commit A modified both files b and c. And there are
  commits B and C, B only modifies file b and C only modifies file c
  (so that no conflicts happen), and assume A is cherry-picked as A'
  and the two branches then merged:
 
  --o-B---A
 \ \
  ---C---A'--M---

 So the contents of b at M is as the same as in A, so following 'b'
 will see A and B changed that path, which is correct.

 The contents of c at M is?  It is different from A because at A c
 lacks the change made to it at C.  The merged result at M would
 match C in A', no?  So following 'c' will see A' and C changed that
 path, no?

 So what is wrong about it?

It's not wrong (that's why I do not suggest to change the default
behaviour), but it's inconsistent and can be a bit confusing to
have either the one or the other commit blamed depending on whether
some file was touched or not.
The history I'm a bit more concerned is something like (with ...
being unrelated commits not touching B or C):

 --o-...---A--...---B---...--
\\
 ---...---A'--...---C---...---M---


Here having B or C touching b or c determines which of A or A' is
blamed for which part of the patch.

It's even enough to have:

   --...---A'--...---B---...--
  /\
 ---o---...---A-----M---

To have the A/A' changes of c to be attributed to A while the b changes
are attributed to A'. I.e. you have a master branch that has commit A,
which is also cherry-picked to some previously forked side-branch.
Once that side-branch is merged back, parts of the change are attributed
to A' if they are in a file that is not touched otherwise in the main
branch.


 Also, when handling a merge, we have to handle parents sequencially,
 checking the difference between M with its first parent first, and
 then passing blame for the remaining common lines to the remaining
 parents.  If you flip the order of parents of M when you merge A and
 A' in your original history, and with your patch, what would you
 see when you blame c?  Wouldn't it notice that M:c is identical to c
 in its first parent (now A') and pass the whole blame to A' anyway
 with or without your change?

When giving git-blame the new option introduced with my patch, only
the order of parents determines which commit is blamed. Without
the option (i.e. the currently only possible behaviour) which commit
is blamed depends what else touches other parts of the file.
If both branches make modifications to the file (or if there is
any merge conflict resolution in the merge) then the bahaviour with
or without the option are the same.

But in the example with one commit B touching also b and one commit C
touching also c, there is (without the new option) always one part
of the cherry-picked commit is blamed on the original and one on the
cherry-picked, no matter how you order the parents.
(While by having your mainline always the most leftward parent, with
the the new option you always get those commit blamed that is the
first one this was introduced to mainline.)

Bernhard R. Link
-- 
F8AC 04D5 0B9B 064B 3383  C3DA AFFC 96D1 151D FFDC
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git:// protocol over SSL/TLS

2013-12-27 Thread Bernhard R. Link
* Sergey Sharybin sergey@gmail.com [131227 15:25]:
 Security in this case is about being sure everyone gets exactly the
 same repository as stored on the server, without any modifications to
 the sources cased by MITM.

Note that ssl (and thus https) only helps here against a resource-less
man-in-the-middle. Getting catch-all CA-signed certificates is said to
no longer available to everyone as easily as it was some years ago, but
unless you allow only one private CA (and even there clients often fail)
you still should assume everyone resourceful enough to still be able to
do MITM.

Bernhard R. Link
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git-daemon: needs /root/.config/git/config?

2013-06-09 Thread Bernhard R. Link
* Ian Kumlien po...@vapor.com [130605 13:31]:
 Yes, i agree, it's suboptimal but I for one would use getpwuid to get
 the home directory of the executing user to avoid this - though i don't
 know how portable it is (or if there is any other issues)

It's not only suboptimal but simply wrong. getpwuid gives at best the
initial home directory, and even there it is only a guess. (If you are
looking for some home directory of a different user it might be a good
guess). But using getpwuid(getuid())-pw_dir if HOME is set is a serious
mistake, as you throw out the good value for some almost but not quite
totally unrelated value.

Bernhard R. Link
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: English/German terminology, git.git's de.po, and pro-git

2013-05-23 Thread Bernhard R. Link
* Ralf Thielow ralf.thie...@gmail.com [130522 17:17]:
  remote branch  = Remote-Branch
  remote-tracking branch = Remote-Tracking-Branch
  upstream branch= Upstream-Branch
 
  Yes. What's the main reason for using Branch in the German text? 
  Consistency
  with the commands, or assumed familiarity of the term within the target
  audience? Zweig is available.
 
 
 I think it's at the same level as Commit and a well known SCM-term. Users
 (even beginners) who know Commit and Tag do also know Branch. And
 I think it sounds better in combination with Remote-, Remote-Tracking- and
 Upstream- which are english words.

Additionally Zweig might be a bit misleading. A branch is not part of
the trees. It is called branch because in other VCSes the commits
build a tree and a any commit outside of the main branch of that tree is
part of exactly one different branch (so the head of that branch and the
branch are synonymns). With git the commits are no longer a tree, so a
git-branch is no branch and does not describe the whole branch of the
tree of commits but is just a names pointer into the graph of commits.
As it lost all meanings of the original word branch, translating it
with a translation of the original English word might more confusion
than helping anyone.

 (same for push). In other messages, the translation is in the same message
 as the command itself. I think it's OK when we just use fetch and push
 when the command is meant (as it's done for pull, e.g. in error messages),
 and the translation when the messages tell what the command is doing (e.g. 
 help
 messages). So it would depends on the message whether we translate the word
 or not. This would apply to other terms that are commands, too, like
 clean or revert.

I'd not call it OK. It's the only sane possibility. If you speak
about the magic keyword you have to give the command line, you won't
translate it, of course[1]. (The obvious interesting case is where the
English text plays with the command name having a meaning as word
itself. Here the translation will have to diverge to differentiate
between both (or sacrifice one of them, where it is not important)).

[1] Unlike you want to introduce a translated command line interface,
like Depp anfordere Herkunft Original instead of git fetch origin master

  diff   = Differenz
  delta  = Differenz (or Delta)
  patch  = Patch
  apply  = anwenden
  diffstat   = (leave it as it is)
  hunk   = Bereich
 
  IMHO Kontext is better if you use a German word. Technically the context 
  is
  something else, but in a German text IMHO it fits nicer when explaining to 
  the
  user where he/she can select the n-th hunk.
 

 Not sure if German users would know what hunk means, in case we
 leave it untranslated. And I'm not sure if I would understand Kontext.
 I tend to leave it untranslated.

Anyone found a German translation of the Patch manpage? Translating the
English word-play here, I'd suggest Block or Patch-Block.

  paths  = Pfade
 
  symbolic link = symbolische Verknüfung
  path = Pfad
  link = Verknüpfung

In the filesystem a Link is a Verweis in Unix, not a Verknüpfung
(that are usually the pseudo-links Windows supports).

Bernhard R. Link
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: possible Improving diff algoritm

2012-12-14 Thread Bernhard R. Link
* Javier Domingo javier...@gmail.com [121214 13:20]:
 I think the idea of being preferable to have a blank line at the end
 of the added/deleted block is key in this case.

For symmetry I'd suggest to make it preferable to have blank lines
at the end or the beginning.

  {
  old
+ }
+
+ {
+ new
  }

vs

  {
  old
  }
+
+ {
+ new
+ }

is just the same case in blue.
(Although empty lines alone feels not quite optimal, it is at least a
good start).

Bernhard R. Link
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html