Re: [PATCH] git submodule foreach: Skip eval for more than one argument

2014-03-04 Thread Matthijs Kooijman
Hey Johan,

> Ok, so IINM, Anders' original commit was about making "git submodule
> foreach <command>" behave more like "<command>" (from a naive user's
> perspective),
Ok, that makes sense.

> while you rather expect to insert quotes/escapes to finely control
> exactly when shell interpretation happens.
Well, I mostly expect that the $name and $path that git submodule makes
available to each command invocation can actually be used by the
command.

> Aren't these POVs mutually incompatible? Is the only 'real' solution
> to forbid multiple arguments, and force everybody to quote the entire
> command?
Yes, I think you're right that they're mutually exclusive. Specifically,
if you expect git submodule foreach <command> to behave like <command>,
that means you expect the (interactive) shell to do all the
interpolation, word-splitting, etc. If so, you can't then later still do
interpolation (of course, you could do sed magic to just replace $name
and $path, etc., but that's broken).

> I don't particularly care which way it goes, as long as (a) the common
> case behaves as most users would expect, (b) the uncommon/complicated
> case is still _possible_ (though not necessarily simple), and (c) we
> don't break a sizable number of existing users.
Well, if you call submodule directly, you can now just put everything in
a single command and get $name interpolation.

As I mentioned, I couldn't do this because I was using a git alias.
However, a bit of fiddling showed a solution to that using a shell
function:

[alias]
each = "!f(){ git submodule foreach --quiet \"echo \\$name $*\";}; f"

This uses a shell function to collect all alias arguments and then uses
$* to expand them again into the single submodule foreach argument. Note
that $* is expanded when evaluating the alias, while \\$name is expanded
later inside submodule.
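
The two expansion stages described above can be imitated without git. This is only a sketch: alias_body is a made-up stand-in for the alias function f, and a plain eval with a hand-set $name stands in for what git submodule foreach does with a single argument.

```shell
# Stage 1: what the alias function hands to "git submodule foreach".
# $* is expanded here; \$name stays a literal "$name".
alias_body() {
    printf '%s\n' "echo \$name $*"
}

alias_body one two              # prints: echo $name one two

# Stage 2: what foreach does with that single argument. eval and a
# hand-set $name are assumptions standing in for git-submodule.sh.
name=test
eval "$(alias_body one two)"    # prints: test one two
```

The first call shows that the alias arguments are already in place while $name is still unexpanded; the second shows $name being filled in only when the string is evaluated.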

This suggests that with the current code, the more complicated cases are
still possible. There is one catch in this approach, in that the
original word splitting is not preserved ($* expands to just the
unquoted arguments as a single word). I'm not sure if this is fixable
($@ expands to multiple quoted words, but then foreach sees multiple
arguments and doesn't do the eval). One would need to escape the output
of $@ somehow (e.g., add \ before ", but that would become terribly
complicated I expect...).
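
The catch with word splitting can be shown in isolation. In this sketch count_args, with_star, with_at and show_star are hypothetical helpers, not part of git; they only report what a callee such as foreach would receive.

```shell
# count_args reports how many arguments it received.
count_args() { echo "$#"; }

# Same shape as the alias: everything collapsed into one string.
with_star() { count_args "echo \$name $*"; }
# The "$@" variant keeps the words apart.
with_at()   { count_args echo '$name' "$@"; }

with_star "a b" c   # prints 1: one argument, so foreach would eval it
with_at   "a b" c   # prints 4: several arguments, so foreach would not eval

# With $* the boundary between "a b" and c is no longer visible.
show_star() { printf '%s\n' "$*"; }
show_star "a b" c   # prints: a b c
```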


Perhaps an explicit --eval switch to git submodule makes sense for
complete control? If it has a corresponding --no-eval, you can even
pass a single-argument command without evalling, while still keeping the
current "least surprise" approach as the default?

Whatever behaviour is settled for, it should be documented in the
submodule manpage (which I think is not the case now).

Gr.

Matthijs


signature.asc
Description: Digital signature


Re: [PATCH] git submodule foreach: Skip eval for more than one argument

2014-03-04 Thread Matthijs Kooijman
On Tue, Mar 04, 2014 at 03:53:24PM +0100, Johan Herland wrote:
> On Tue, Mar 4, 2014 at 2:51 PM, Matthijs Kooijman  wrote:
> > matthijs@grubby:~/test$ git submodule foreach echo '$name'
> > Entering 'test'
> > $name
> 
> jherland@beta ~/test$ echo '$name'
> $name
> 
> What would you expect echo '$name' to do?
If I run git submodule foreach echo '$name', then my shell eats the
single quotes (which are only to prevent my shell from interpreting
$name). git submodule will see $name, so it will run echo $name, not
echo '$name'.

> What happens if you use double instead of single quotes?
Then my shell eats up the double quotes _and_ replaces $name with
nothing, so I can't expect git submodule to replace it with the
submodule name then :-)
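
What the calling shell passes on in each case can be checked with a small helper. This is a sketch only: show_args is a hypothetical function that prints each argument it receives in angle brackets, and $name is set to the empty string to mimic an interactive shell where it is not defined.

```shell
# Print every received argument on its own line, wrapped in <>.
show_args() { printf '<%s>\n' "$@"; }

name=""                     # assumption: no $name in the calling shell

show_args echo '$name'      # prints <echo> and <$name>
show_args echo "$name"      # prints <echo> and <>
```

With single quotes the literal string $name survives to the callee; with double quotes it has already been replaced by nothing before the callee runs.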

Does that help to clarify what I mean?

Gr.

Matthijs




Re: [PATCH] git submodule foreach: Skip eval for more than one argument

2014-03-04 Thread Matthijs Kooijman
Hey folks,

On Thu, Sep 26, 2013 at 04:10:15PM -0400, Anders Kaseorg wrote:
> ‘eval "$@"’ created an extra layer of shell interpretation, which was
> probably not expected by a user who passed multiple arguments to git
> submodule foreach:

It seems this patch has broken the use of $name, $path, etc. inside the
command run by foreach (when it contains more than one argument):


matthijs@grubby:~/test$ git --version
git version 1.9.0
matthijs@grubby:~/test$ git submodule foreach echo '$name'
Entering 'test'
$name

But it works on the single-argument version:

matthijs@grubby:~/test$ git submodule foreach 'echo $name'
Entering 'test'
test

And it used to work in older versions:

matthijs@login:~/test$ git --version
git version 1.7.5.4
matthijs@login:~/test$ git submodule foreach 'echo $name'
Entering 'test'
test
matthijs@login:~/test$ git submodule foreach echo '$name'
Entering 'test'
test


I'm not sure how to fix this exactly. Adding "export" for the variables in
git-submodule.sh seems obvious but doesn't seem to be a complete solution. This
makes the variables available in the environment of any commands called (so git
submodule foreach sh -c 'echo $name' works), but the git submodule foreach echo '$name'
above still doesn't work, since the "$@" used does not do any substitution, it
just executes $@ as a commandline unmodified. Ideally, you would do variable
substitution, but not word splitting, but I'm not sure how to do that. Also,
you'd still need one more layer of backslash escapes, which is probably what
this commit wanted to prevent...
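
The difference between the two code paths can be reproduced outside git. The following sketch assumes nothing beyond a POSIX shell; new_foreach and old_foreach are made-up names that copy only the eval logic from the quoted patch, with $name set by hand.

```shell
# Logic after the patch: eval only when there is exactly one argument.
new_foreach() {
    name=test
    if [ $# -eq 1 ]
    then
        eval "$1"
    else
        "$@"
    fi
}

# Logic before the patch: always eval.
old_foreach() {
    name=test
    eval "$@"
}

new_foreach 'echo $name'    # prints: test
new_foreach echo '$name'    # prints: $name
old_foreach echo '$name'    # prints: test
```

These three results line up with the 1.9.0 and 1.7.5.4 transcripts above, minus the "Entering" lines.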

Note that saying "you should use the single argument version if you need
those variables" doesn't seem possible in all cases. In particular, I'm
creating an alias that calls git submodule foreach, where the alias
contains part of the command and the rest of command comes from
arguments to the alias, meaning we always have at least two arguments...

Finally, the new behaviour (i.e., eval with one argument, directly
execute with multiple) is not documented in the manpage, but it seems
relevant enough to need documentation?

Gr.

Matthijs

> 
> $ git grep "'"
> [searches for single quotes]
> $ git submodule foreach git grep "'"
> Entering '[submodule]'
> /usr/lib/git-core/git-submodule: 1: eval: Syntax error: Unterminated quoted 
> string
> Stopping at '[submodule]'; script returned non-zero status.
> 
> To fix this, if the user passed more than one argument, just execute
> "$@" directly instead of passing it to eval.
> 
> Signed-off-by: Anders Kaseorg 
> ---
>  git-submodule.sh | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/git-submodule.sh b/git-submodule.sh
> index c17bef1..3381864 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -545,7 +545,12 @@ cmd_foreach()
>   sm_path=$(relative_path "$sm_path") &&
>   # we make $path available to scripts ...
>   path=$sm_path &&
> - eval "$@" &&
> + if [ $# -eq 1 ]
> + then
> + eval "$1"
> + else
> + "$@"
> + fi &&
>   if test -n "$recursive"
>   then
>   cmd_foreach "--recursive" "$@"
> -- 
> 1.8.4
> 
> 
> 




git submodule manpage does not document --checkout

2014-02-25 Thread Matthijs Kooijman
Hi,

it seems git submodule supports --checkout, which is also mentioned
indirectly in the manpage. However, the option itself is not mentioned
in the synopsis or detailed option list.

Gr.

Matthijs




Re: [RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-10-21 Thread Matthijs Kooijman
Hi Duy,

I saw your patch series got accepted in git master a while back, great!
Since I hope to be using the fixed behaviour soon, what was the plan for
including it? Am I correct in thinking that git master will become 1.8.5
in a while? Would this series perhaps be considered for backporting to
1.8.4.x?

Gr.

Matthijs




Automatically filling in git send-email arguments based on an existing e-mail

2013-10-02 Thread Matthijs Kooijman
Hi folks,

sometimes when I send a patch, I want to send it as a reply to an existing
e-mail, using pretty much the same recipient list. Currently, I have to:

 - copy-paste the message id for --in-reply-to header
 - copy one address for --to
 - copy the other addresses for the --cc's

Since I can't just chuck a list of addresses into --cc and I need to
quote every one because of the <> and spaces in there, this feels like
it's more tedious than needed.

It seems like there should be a way to just copy paste the headers from
the original e-mail into the stdin of git send-email or a wrapper script
and let it sort things out from there.
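
A rough idea of what such a wrapper could do, as a sketch only: headers_to_args is a hypothetical function, not an existing git tool. It assumes unfolded header lines, no commas inside display names, that the original sender becomes --to, and that the original To and Cc recipients all become --cc. It prints one git send-email option per line.

```shell
# Read e-mail headers on stdin, print one send-email option per line.
headers_to_args() {
    while IFS= read -r line
    do
        case $line in
        Message-I[Dd]:*)
            printf '%s\n' "--in-reply-to=${line#*: }" ;;
        From:*)
            printf '%s\n' "--to=${line#*: }" ;;
        To:*|Cc:*)
            # One --cc per comma-separated address, trimmed.
            printf '%s\n' "${line#*: }" | tr ',' '\n' |
            sed -e 's/^ *//' -e 's/ *$//' -e 's/^/--cc=/' ;;
        '')
            break ;;    # blank line: end of the header block
        esac
    done
}
```

Because the addresses contain spaces and <>, a caller would have to read this output line by line rather than rely on plain word splitting.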


Is there any interest in something like this? Does anyone else perhaps
already have such a script lying around?

(After writing this mail, I just noticed "[PATCH] git-send-email: two
new options: to-cover, cc-cover", which could help a bit to simplify
things, but not quite as far as I'm proposing here...)

Gr.

Matthijs




Re: [PATCH] git-svn: Configure a prompt callback for gnome_keyring.

2013-08-29 Thread Matthijs Kooijman
Hi folks,

any chance this patch can be merged?

Gr.

Matthijs

On Tue, Jun 18, 2013 at 06:38:10PM +0200, Matthijs Kooijman wrote:
> This allows git-svn to prompt for a keyring unlock password when
> the needed gnome keyring is locked.
> 
> This requires changes in the subversion perl bindings which have been
> committed to svn trunk (r1241554 and some followup commits) and are
> first available in the 1.8.0 release.
> ---
>  perl/Git/SVN/Prompt.pm |  5 +
>  perl/Git/SVN/Ra.pm | 13 +
>  2 files changed, 18 insertions(+)
> 
> diff --git a/perl/Git/SVN/Prompt.pm b/perl/Git/SVN/Prompt.pm
> index e940b08..faeda01 100644
> --- a/perl/Git/SVN/Prompt.pm
> +++ b/perl/Git/SVN/Prompt.pm
> @@ -23,6 +23,11 @@ sub simple {
>   $SVN::_Core::SVN_NO_ERROR;
>  }
>  
> +sub gnome_keyring_unlock {
> + my ($keyring, $pool) = @_;
> + _read_password("Password for '$keyring' GNOME keyring: ", undef);
> +}
> +
>  sub ssl_server_trust {
>   my ($cred, $realm, $failures, $cert_info, $may_save, $pool) = @_;
>   $may_save = undef if $_no_auth_cache;
> diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
> index 75ecc42..38ed0cb 100644
> --- a/perl/Git/SVN/Ra.pm
> +++ b/perl/Git/SVN/Ra.pm
> @@ -104,6 +104,19 @@ sub new {
>   }
>   } # no warnings 'once'
>  
> + # Allow git-svn to show a prompt for opening up a gnome-keyring, if needed.
> + if (defined(&SVN::Core::auth_set_gnome_keyring_unlock_prompt_func)) {
> + my $keyring_callback = SVN::Core::auth_set_gnome_keyring_unlock_prompt_func(
> + $baton,
> + \&Git::SVN::Prompt::gnome_keyring_unlock
> + );
> + # Keep a reference to this callback, to prevent the function
> + # (reference) from being garbage collected.  We just add it to
> + # the callbacks value, which are also used only to prevent the
> + # garbage collector from eating stuff.
> + $callbacks = [$callbacks, $keyring_callback]
> + }
> +
>   my $self = SVN::Ra->new(url => $url, auth => $baton,
> config => $config,
> pool => SVN::Pool->new,
> -- 
> 1.8.3.rc1
> 
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Add testcase for needless objects during a shallow fetch

2013-08-28 Thread Matthijs Kooijman
This is a testcase that checks for a problem where, during a specific
shallow fetch where the client does not have any commits that are a
successor of the new shallow root (i.e., the fetch creates a new
detached piece of history), the server would simply send over _all_
objects, instead of taking into account the objects already present in
the client.

The actual problem was fixed by a recent patch series by Nguyễn Thái
Ngọc Duy already.

Signed-off-by: Matthijs Kooijman 
---
 t/t5500-fetch-pack.sh | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index fd2598e..a022d65 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -393,6 +393,17 @@ test_expect_success 'fetch in shallow repo unreachable shallow objects' '
git fsck --no-dangling
)
 '
+test_expect_success 'fetch creating new shallow root' '
+   (
+   git clone "file://$(pwd)/." shallow10 &&
+   git commit --allow-empty -m empty &&
+   cd shallow10 &&
+   git fetch --depth=1 --progress 2> actual &&
+   # This should fetch only the empty commit, no tree or
+   # blob objects
+   grep "remote: Total 1" actual
+   )
+'
 
 test_expect_success 'setup tests for the --stdin parameter' '
for head in C D E F
-- 
1.8.4.rc1



Re: [RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-08-28 Thread Matthijs Kooijman
Hi Duy,

> I thought a bit but my thoughts often get stuck if I don't write them
> down in form of code :-) so this is what I got so far. 4/6 is a good
> thing in my opinion, but I might overlook something. 6/6 is about this
> thread.

The series looks good to me, though I don't know enough about the code
to do detailed analysis.

In any case, I agree that 4/6 is a good change, it removes a bunch of
similar code for the shallow special case (which is now no longer a
completely separate special case).

The total series also seems to actually fix the problem I reported. I'll
resend the testcase from my original patch as well, which now passes
with your series applied.

Thanks for diving into this!

Gr.

Matthijs


Re: [PATCH 4/6] upload-pack: delegate rev walking in shallow fetch to pack-objects

2013-08-28 Thread Matthijs Kooijman
Hi Nguy,

On Fri, Aug 16, 2013 at 04:52:05PM +0700, Nguyễn Thái Ngọc Duy wrote:
> upload-pack has a special rev walking code for shallow recipients. It
> works almost like the similar code in pack-objects except:
> 
> 1. in upload-pack, graft points could be added for deepening
> 
> 2. also when the repository is deepened, the shallow point will be
>moved further away from the tip, but the old shallow point will be
>marked as edge to produce more efficient packs. See 6523078 (make
>shallow repository deepening more network efficient - 2009-09-03)
> 
> pass the file to pack-objects via --shallow-file. This will override
> $GIT_DIR/shallow and give pack-objects the exact repository shape that
> upload-pack has.
> 
> mark edge commits by revision command arguments. Even if old shallow
> points are passed as "--not" revisions as in this patch, they will not
> be picked up by mark_edges_uninteresting() because this function looks
> up to parents for edges, while in this case the edge is the children,
> in the opposite direction. This will be fixed in the next patch when
> all given uninteresting commits are marked as edges.
This says "the next patch" but it really refers to 6/6, not 5/6. Patch
6/6 has the same problem (it says "previous patch"). Perhaps patches 4
and 5 should just be swapped?

Gr.

Matthijs


Re: [RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-08-12 Thread Matthijs Kooijman
Hi Duy,

> OK. Matthijs, do you want to make a patch for it?
I'm willing, but:
 - I don't understand the code and all of your comments well enough yet
   to start coding right away (though I haven't actually invested enough
   time in this yet, either).
 - I'll be on vacation for the next two weeks.

When I get back, I'll re-read this thread properly and reply where I
don't follow it. Feel free to continue discussing the plan until then,
of course :-)

Gr.

Matthijs


Re: [RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-08-07 Thread Matthijs Kooijman
Hi Junio,

I haven't got a reply to my mail yet. Could you have a look, so I can
update and resubmit my patch?

On Fri, Jul 12, 2013 at 09:11:57AM +0200, Matthijs Kooijman wrote:
> > [administrivia: you seem to have mail-followup-to that points at you
> > and the list; is that really needed???]
> > In your discussion (including the comment), you talk about "shallow
> > root" (I think that is the same as what we call "shallow boundary"),
> I think so, yes. I mean to refer to the commits referenced in
> .git/shallow, that have their parents "hidden".
Could you confirm that I got the terms right here (or is the shallow
boundary the first hidden commit?)

> > but in this added block, there is nothing that checks CLIENT_SHALLOW
> > or SHALLOW flags to special case that.
> >
> > Is it a good idea to unconditionally do this for all "have"
> > revisions?
> That's what I meant in my mail with "applying the fix unconditionally" -
> there is probably some check needed (I discussed a few options in the
> mail as well).
>
> Note that this entire do_rev_list function is only called when there are
> shallow revisions involved, so there is also a basic "only when shallow"
> check in place.

My proposal was to only apply the fix for all have revisions when the
previous history traversal came across some shallow boundary commits. If
this happens, then that shallow boundary commit will be a "new" one and
it will have prevented the history traversal from finding the full list
of relevant "have" commits. In this case, we should just use all "have"
commits instead.

Now, looking at the code, I see a few options for detecting this case:

 1 Modify mark_edges_uninteresting to return a boolean (or have an
   output argument) if any of the commits in the list of commits to find
   (not the edges) is a shallow boundary.
 2 Modify mark_edges_uninteresting to have a "show_shallow" argument
   that gets called for every shallow boundary. The show_shallow
   function passed would then simply keep a boolean if it is passed at
   least once.
 3 Add another loop over the commits _after_ the call to
   mark_edges_uninteresting, that simply looks for any shallow boundary
   commit.

The last option seems sensible to me, since it prevents modifying the
somewhat generic mark_edges_uninteresting function for this specific
usecase. On the other hand, it does mean that the list of commits is
looped twice, not sure what that means for performance.

Before I go and implement one of these, which option seems best to you?

Gr.

Matthijs


Re: [RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-07-12 Thread Matthijs Kooijman
Hi Junio,

> [administrivia: you seem to have mail-followup-to that points at you
> and the list; is that really needed???]
I'm not subscribed to the list, so yes :-)

> > This happens when a client issues a fetch with a depth bigger or equal
> > to the number of commits the server is ahead of the client.
> 
> Do you mean "smaller" (not "bigger")?
Yes, I meant smaller (reworded this first sentence a few times and then messed
up :-)

> > diff --git a/upload-pack.c b/upload-pack.c
> > index 59f43d1..5885f33 100644
> > --- a/upload-pack.c
> > +++ b/upload-pack.c
> > @@ -122,6 +122,14 @@ static int do_rev_list(int in, int out, void *user_data)
> > if (prepare_revision_walk(&revs))
> > die("revision walk setup failed");
> > mark_edges_uninteresting(revs.commits, &revs, show_edge);
> > +   /* In case we create a new shallow root, make sure that
> > +* we don't send over objects that the client already has just
> > +* because their "have" revisions are no longer reachable from
> > +* the shallow root. */
> > +   for (i = 0; i < have_obj.nr; i++) {
> > +   struct commit *commit = (struct commit *)have_obj.objects[i].item;
> > +   mark_tree_uninteresting(commit->tree);
> > +   }
> 
> Hmph.
> 
> In your discussion (including the comment), you talk about "shallow
> root" (I think that is the same as what we call "shallow boundary"),
I think so, yes. I mean to refer to the commits referenced in
.git/shallow, that have their parents "hidden".

> but in this added block, there is nothing that checks CLIENT_SHALLOW
> or SHALLOW flags to special case that.
>
> Is it a good idea to unconditionally do this for all "have"
> revisions?
That's what I meant in my mail with "applying the fix unconditionally" -
there is probably some check needed (I discussed a few options in the
mail as well).

Note that this entire do_rev_list function is only called when there are
shallow revisions involved, so there is also a basic "only when shallow"
check in place.

> Also there is another loop that iterates over "have" revisions just
> above the precontext.  I wonder if this added code belongs in that
> loop.
I think we could add it there, yes. On the other hand, if we only want
to execute this code when there are shallow boundaries in the list of
revisions to send (as I suggested in my previous mail), then we can't
move this code up.

Gr.

Matthijs


[RFC PATCH] During a shallow fetch, prevent sending over unneeded objects

2013-07-11 Thread Matthijs Kooijman
Hi folks,

while playing with shallow fetches, I've found that in some
circumstances running git fetch with --depth can return too many objects
(in particular, _all_ the objects for the requested revisions are
returned, even when some of those objects are already known to the
client).

This happens when a client issues a fetch with a depth bigger or equal
to the number of commits the server is ahead of the client. In this
case, the revisions to be sent over will be completely detached from any
revisions the client already has (history-wise), causing the server to
effectively ignore all objects the client has (as advertised using its
have lines) and just send over _all_ objects (needed for the revisions
it is sending over).

I've traced this down to the way do_rev_list in upload-pack.c works. If
I've pored over the code enough to understand it, this is what happens:
 - The new shallow roots are made into graft points without parents.
 - The "want" commits are added to the pending list (revs->pending)
 - The "have" commits are marked uninteresting and added to the pending list
 - prepare_revision_walk is called, which adds everything from the
   pending list into the commit list (revs->commits)
 - limit_list is called, which traverses the history of each interesting
   commit in the commit list (i.e., all want revisions), up to but excluding
   the first uninteresting commit (i.e. a have revision). The result of
   this is the new commit list.

   This means the commit list now contains all commits that the client
   wants, up to (excluding) any commits he already has or up to
   (including) any (new) shallow roots.
 - mark_edges_uninteresting is called, which marks the tree of every
   parent of each edge in the commit list as uninteresting (in practice,
   this marks the tree of each uninteresting parent, since those are by
   definition the only kinds of revisions that can be beyond the edge).
 - All trees and blobs that are referenced by trees in the commit list
   but are not marked as uninteresting, are passed to git-pack-objects
   to put into the pack.

Normally, the list of commits to send over is connected to the
client's existing commits (which are marked as uninteresting). This
means that only the trees of those uninteresting ("have") commits that
are actually (direct) predecessors of the commits to send over are
marked as uninteresting. This is probably useful, since it prevents
having to go over all trees the client has (for other branches, for
example) and instead limits to the trees that are the most likely to
contain duplicate (or similar, for delta-ing) objects.

However, in the "detached shallow fetch" case, this assumption is no
longer valid. There will be no uninteresting commits as parents for
the commit list, since all edge commits will be shallow roots (hence
have no parents).  Ideally, one would find out which of the "detached"
"have" revisions are the closest to the new shallow roots, but with the
current code these shallow roots have their parents cut off long before
this code even runs, so this is probably not feasible.

Instead, what we can do in this case, is simply mark the trees of all
"have" commits as uninteresting. This prevents all objects that are
contained in the "have" commits themselves from being sent to the
client, which can be a big win for bigger repositories. Marking them all
is probably more work than strictly needed, but is easy to implement.

I have created a mockup patch which does this, and also adds a test case
demonstrating the problem. Right now, the above fix is applied always,
even in cases where it isn't needed.

Looking at the code, I think it would be good to let
mark_edges_uninteresting look for shallow roots in the commit list (or
perhaps just add another loop over the commit list inside do_rev_list)
and only apply the fix if any shallow roots are in the commit list
(meaning at least a part of the history to send over is detached from
the client's current history). I haven't implemented this yet, wanting to
get some feedback first.

Also, I'm not quite sure how this fits in with the concept of "thin
packs". There might be some opportunities missing here as well, though
git-pack-objects is called without --thin when shallow roots are
involved. I think this is related to the "-" prefixed commit sha's that
are sent to git-pack-objects, but I couldn't find any documentation on
what the - prefix is supposed to mean.

(On a somewhat related note, show_commit in upload-pack.c checks the
BOUNDARY flag, but AFAICS the revs->boundary flag is never set, so
BOUNDARY cannot ever be set in this case either?)

How does this patch look?

Gr.

Matthijs

---
 t/t5500-fetch-pack.sh | 11 +++
 upload-pack.c |  8 
 2 files changed, 19 insertions(+)

diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index fd2598e..a022d65 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -393,6 +393,17 @@ test_expect_success 'fetch in shal

[PATCH 2/3] upload-pack: Introduce new "fixed-off-by-one-depth" server feature

2013-07-11 Thread Matthijs Kooijman
Commit 682c7d2 (upload-pack: fix off-by-one depth calculation in shallow
clone) changed the meaning of the fetch depth sent over the wire to mean
the total number of commits to return, instead of the number of commits
beyond the first. However, when this change is deployed on some servers
but not others, this can cause a client to behave differently based on
the server version, which is unexpected.

To prevent this, the new, fixed, depth behaviour is advertised as a server
feature and the old behaviour is restored when the feature is not
requested by the client.

Signed-off-by: Matthijs Kooijman 
---
 upload-pack.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 127e59a..59f43d1 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -46,6 +46,7 @@ static unsigned int timeout;
 static int use_sideband;
 static int advertise_refs;
 static int stateless_rpc;
+static int fixed_depth;
 
 static void reset_timeout(void)
 {
@@ -633,6 +634,8 @@ static void receive_needs(void)
no_progress = 1;
if (parse_feature_request(features, "include-tag"))
use_include_tag = 1;
+   if (parse_feature_request(features, "fixed-off-by-one-depth"))
+   fixed_depth = 1;
 
o = parse_object(sha1_buf);
if (!o)
@@ -669,10 +672,14 @@ static void receive_needs(void)
struct object *object = shallows.objects[i].item;
object->flags |= NOT_SHALLOW;
}
-   else
+   else {
+   /* Emulate off-by-one bug in older versions */
+   if (!fixed_depth)
+   depth++;
backup = result =
get_shallow_commits(&want_obj, depth,
SHALLOW, NOT_SHALLOW);
+   }
while (result) {
struct object *object = &result->item->object;
if (!(object->flags & (CLIENT_SHALLOW|NOT_SHALLOW))) {
@@ -738,7 +745,7 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 {
static const char *capabilities = "multi_ack thin-pack side-band"
" side-band-64k ofs-delta shallow no-progress"
-   " include-tag multi_ack_detailed";
+   " include-tag multi_ack_detailed fixed-off-by-one-depth";
const char *refname_nons = strip_namespace(refname);
unsigned char peeled[20];
 
-- 
1.8.3.rc1



[PATCH 3/3] fetch-pack: Request fixed-off-by-one-depth when available

2013-07-11 Thread Matthijs Kooijman
This server feature changes the meaning of the fetch depth, allowing
fetching only a single revision instead of at least two as before. To
make sure the behaviour only depends on the client version, the depth
value sent over the wire is corrected depending on whether the server has
the fix.

There is one corner case: A server without the fix cannot send less than
2 commits, so when --depth=1 is specified a warning is shown and 2
commits are fetched instead of 1.

Signed-off-by: Matthijs Kooijman 
---
 fetch-pack.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index abe5ffb..799b2c1 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -39,6 +39,7 @@ static int marked;
 
 static struct commit_list *rev_list;
 static int non_common_revs, multi_ack, use_sideband, allow_tip_sha1_in_want;
+static int fixed_depth;
 
 static void rev_list_push(struct commit *commit, int mark)
 {
@@ -327,6 +328,7 @@ static int find_common(struct fetch_pack_args *args,
if (prefer_ofs_delta)   strbuf_addstr(&c, " ofs-delta");
if (agent_supported)strbuf_addf(&c, " agent=%s",
git_user_agent_sanitized());
+   if (fixed_depth)strbuf_addstr(&c, " fixed-off-by-one-depth");
packet_buf_write(&req_buf, "want %s%s\n", remote_hex, c.buf);
strbuf_release(&c);
} else
@@ -342,8 +344,23 @@ static int find_common(struct fetch_pack_args *args,
 
if (is_repository_shallow())
write_shallow_commits(&req_buf, 1);
-   if (args->depth > 0)
-   packet_buf_write(&req_buf, "deepen %d", args->depth);
+   if (args->depth > 0) {
+   if (!fixed_depth && args->depth == 1)
+   warning("Server does not support depth=1, using depth=2 instead");
+   if (!fixed_depth && args->depth > 1) {
+   /* Old server that interprets "deepen 1" as
+  "give me tip + 1 extra commit" */
+   packet_buf_write(&req_buf, "deepen %d", args->depth - 1);
+   } else if (!fixed_depth && args->depth == 1) {
+   /* Old servers cannot handle depth=1 (deepen=0
+  means don't change depth / full depth). */
+   packet_buf_write(&req_buf, "deepen 1");
+   } else {
+   /* New server, send depth as-is */
+   packet_buf_write(&req_buf, "deepen %d", args->depth);
+   }
+   }
+
packet_buf_flush(&req_buf);
state_len = req_buf.len;
 
@@ -874,6 +891,11 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
fprintf(stderr, "Server supports ofs-delta\n");
} else
prefer_ofs_delta = 0;
+   if (server_supports("fixed-off-by-one-depth")) {
+   if (args->verbose)
+   fprintf(stderr, "Server has fixed meaning of depth value\n");
+   fixed_depth = 1;
+   }
 
if ((agent_feature = server_feature_value("agent", &agent_len))) {
agent_supported = 1;
-- 
1.8.3.rc1

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/3] upload-pack: Remove a piece of dead code

2013-07-11 Thread Matthijs Kooijman
Commit 682c7d2 (upload-pack: fix off-by-one depth calculation in shallow
clone) introduced a new check in get_shallow_commits to decide when to
stop traversing the history and mark the current commit as a shallow
root.

With this new check in place, the old check can no longer be true, since
the first check always fires first. This commit removes that check,
making the code a bit simpler again.

Signed-off-by: Matthijs Kooijman 
---
 shallow.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/shallow.c b/shallow.c
index cbe2526..8a9c96d 100644
--- a/shallow.c
+++ b/shallow.c
@@ -110,17 +110,12 @@ struct commit_list *get_shallow_commits(struct object_array *heads, int depth,
continue;
*pointer = cur_depth;
}
-   if (cur_depth < depth) {
-   if (p->next)
-   add_object_array(&p->item->object,
-   NULL, &stack);
-   else {
-   commit = p->item;
-   cur_depth = *(int *)commit->util;
-   }
-   } else {
-   commit_list_insert(p->item, &result);
-   p->item->object.flags |= shallow_flag;
+   if (p->next)
+   add_object_array(&p->item->object,
+   NULL, &stack);
+   else {
+   commit = p->item;
+   cur_depth = *(int *)commit->util;
}
}
}
-- 
1.8.3.rc1



Re: [PATCH] git clone depth of 0 not possible.

2013-07-11 Thread Matthijs Kooijman
Hi Junio,

> While implementing the above, I noticed my fix now introduced an
> off-by-one error the other way. When investigating, I found this commit:
> 
>   commit 682c7d2f1a2d1a5443777237450505738af2ff1a
>   Author: Nguyễn Thái Ngọc Duy 
>   Date:   Fri Jan 11 16:05:47 2013 +0700
> 
>   upload-pack: fix off-by-one depth calculation in shallow clone
> 
>   get_shallow_commits() is used to determine the cut points at a given
>   depth (i.e. the number of commits in a chain that the user likes to
>   get). However we count current depth up to the commit "commit" but we
>   do the cutting at its parents (i.e. current depth + 1). This makes
>   upload-pack always return one commit more than requested. This patch
>   fixes it.
> 
>   Signed-off-by: Nguyễn Thái Ngọc Duy 
>   Signed-off-by: Junio C Hamano 
> 
> Which actually seems to fix the off-by-one bug that is described in this
> thread, but without going through the hoops of preserving current
> behaviour for older git versions (that is, it makes behaviour dependent
> on server version instead of client version).
> 
> Does this mean the discussion in this thread is meaningless, or is that
> commit not intended to be the final fix?
Looking more closely, I also see that the above change is already
released in 1.8.2 versions. Given that, I don't think it makes sense to
still try to provide this capability to get backward-compatible
behaviour, since this would cause an off-by-one error the other way when
talking to 1.8.2.x servers...

However, since I pretty much finished the code for this, I'll send over
the patches and let you decide whether to include them or not. If you
want to include them but they need to be changed in some way, just let
me know.

The first patch of the series should be merged regardless.

Gr.

Matthijs


Re: [PATCH] git clone depth of 0 not possible.

2013-07-09 Thread Matthijs Kooijman
Hi Junio,

> Doing it "correctly" (in the shorter term) would involve:
> 
>  - adding a capability on the sending side "fixed-off-by-one-depth"
>to the protocol, and teaching the sending side to advertise the
>capability;
>
>  - teaching the requestor that got --depth=N from the end user to
>pay attention to the new capability in such a way that:
> 
>- when talking to an old sender (i.e. without the off-by-one
>  fix), send N-1 for N greater than 1.  Punt on N==1;
> 
>- when talking to a fixed sender, ask to enable the capability,
>  and send N as is (including N==1).
> 
>  - teaching the sending side to see if the new behaviour to fix
>off-by-one is asked by the requestor, and stop at the correct
>number of commits, not oversending one more.  Otherwise retain
>the old behaviour.

While implementing the above, I noticed my fix now introduced an
off-by-one error the other way. When investigating, I found this commit:

commit 682c7d2f1a2d1a5443777237450505738af2ff1a
Author: Nguyễn Thái Ngọc Duy 
Date:   Fri Jan 11 16:05:47 2013 +0700

upload-pack: fix off-by-one depth calculation in shallow clone

get_shallow_commits() is used to determine the cut points at a given
depth (i.e. the number of commits in a chain that the user likes to
get). However we count current depth up to the commit "commit" but we
do the cutting at its parents (i.e. current depth + 1). This makes
upload-pack always return one commit more than requested. This patch
fixes it.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Junio C Hamano 

Which actually seems to fix the off-by-one bug that is described in this
thread, but without going through the hoops of preserving current
behaviour for older git versions (that is, it makes behaviour dependent
on server version instead of client version).

Does this mean the discussion in this thread is meaningless, or is that
commit not intended to be the final fix?

In any case, IIUC that particular patch makes a piece of the existing
code dead, which needs to be removed.

Gr.

Matthijs


Re: [PATCH 2/2] git-svn: Configure a prompt callback for gnome_keyring.

2013-06-18 Thread Matthijs Kooijman
Hi folks,

On Fri, Apr 27, 2012 at 08:28:40AM +, Eric Wong wrote:
> Matthijs Kooijman  wrote:
> > This allows git-svn to prompt for a keyring unlock password, when a
> > the needed gnome keyring is locked.
> > 
> > This requires changes in the subversion perl bindings which have been
> > committed to trunk (1241554 and some followup commits) and should be
> > available with the (as of yet unreleased) 1.8.0 release.
> 
> I'm a hesitant to use/depend on unreleased functionality in SVN.
> 
> Is there a chance the API could change before the release.  Also,
> what kind of tests do the SVN guys do on the Perl bindings + GNOME?
> I'm especially concerned since we just worked around segfault
> bugs in the other patch.
> 
> Can we put this on hold until somebody can test the 1.8.0 release?

After over a year, Subversion has finally started with 1.8.0 release
candidates. I've rebased this patch and successfully tested it against
1.8.0-rc3.

I'll send the updated patch over as a reply to this mail.


As for testing, it took a bit of messing to get all the paths correct,
so I'll document what I did here.

I used the 1.8.0-rc3 tarball and ran:

subversion-1.8.0-rc3$ ./configure --prefix=/usr/local/svn
subversion-1.8.0-rc3$ make all install
subversion-1.8.0-rc3$ make swig-pl install-swig-pl PREFIX=/usr/local/svn

I took a git master checkout with the patch applied and ran:

git$ make prefix=/usr/local/git all install

Then, inside some git-svn clone, I ran:

$ PERL5LIB=/usr/local/svn/local/lib/perl/5.14.2/ LD_LIBRARY_PATH=/usr/local/svn/lib/ /usr/local/git/bin/git svn rebase
Password for 'default' GNOME keyring:
Current branch master is up to date.

When removing the PERL5LIB and LD_LIBRARY_PATH variables to run against
the system version of subversion (1.6.17 here), I get an authorization
failure as before:

$ /usr/local/git/bin/git svn rebase
Authorization failed: OPTIONS of 'http://svn.example.org': authorization failed: Could not authenticate to server: rejected Basic challenge (http://example.org) at /usr/local/git/share/perl/5.14.2/Git/SVN.pm line 717

Gr.

Matthijs


[PATCH] git-svn: Configure a prompt callback for gnome_keyring.

2013-06-18 Thread Matthijs Kooijman
This allows git-svn to prompt for a keyring unlock password when the
needed GNOME keyring is locked.

This requires changes in the subversion perl bindings which have been
committed to svn trunk (r1241554 and some followup commits) and are
first available in the 1.8.0 release.
---
 perl/Git/SVN/Prompt.pm |  5 +
 perl/Git/SVN/Ra.pm | 13 +
 2 files changed, 18 insertions(+)

diff --git a/perl/Git/SVN/Prompt.pm b/perl/Git/SVN/Prompt.pm
index e940b08..faeda01 100644
--- a/perl/Git/SVN/Prompt.pm
+++ b/perl/Git/SVN/Prompt.pm
@@ -23,6 +23,11 @@ sub simple {
$SVN::_Core::SVN_NO_ERROR;
 }
 
+sub gnome_keyring_unlock {
+   my ($keyring, $pool) = @_;
+   _read_password("Password for '$keyring' GNOME keyring: ", undef);
+}
+
 sub ssl_server_trust {
my ($cred, $realm, $failures, $cert_info, $may_save, $pool) = @_;
$may_save = undef if $_no_auth_cache;
diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
index 75ecc42..38ed0cb 100644
--- a/perl/Git/SVN/Ra.pm
+++ b/perl/Git/SVN/Ra.pm
@@ -104,6 +104,19 @@ sub new {
}
} # no warnings 'once'
 
+   # Allow git-svn to show a prompt for opening up a gnome-keyring, if needed.
+   if (defined(&SVN::Core::auth_set_gnome_keyring_unlock_prompt_func)) {
+   my $keyring_callback = SVN::Core::auth_set_gnome_keyring_unlock_prompt_func(
+   $baton,
+   \&Git::SVN::Prompt::gnome_keyring_unlock
+   );
+   # Keep a reference to this callback, to prevent the function
+   # (reference) from being garbage collected.  We just add it to
+   # the callbacks value, which are also used only to prevent the
+   # garbage collector from eating stuff.
+   $callbacks = [$callbacks, $keyring_callback]
+   }
+
my $self = SVN::Ra->new(url => $url, auth => $baton,
  config => $config,
  pool => SVN::Pool->new,
-- 
1.8.3.rc1



Re: [PATCH] git clone depth of 0 not possible.

2013-05-30 Thread Matthijs Kooijman
Hi Junio,

On Tue, May 28, 2013 at 10:04:46AM -0700, Junio C Hamano wrote:
> Matthijs Kooijman  writes:
> 
> > Did you consider how to implement this? Looking at the code, it seems
> > the "deepen" parameter in the wire protocol now means:
> >  - 0: Do not change anything about the shallowness (i.e., fetch
> >everything from the shallow root to the tip).
> >  - > 0: Create new shallow commits at depth commits below the tip (so
> >depth == 1 means tip and one below).
> >  - INFINITE_DEPTH (0x7fffffff): Remove all shallowness and fetch
> >complete history.
> >
> > Given this, I'm not sure how one can express "fetch the tip and nothing
> > below that", since depth == 0 already has a different meaning.
> 
> Doing it "correctly" (in the shorter term) would involve:

Given the suggestion below, I take it you don't like what Jonathan proposed
(changing the meaning of the deepen parameter in the protocol so that
the server effectively decides how to interpret --depth)?

>  - adding a capability on the sending side "fixed-off-by-one-depth"
>to the protocol, and teaching the sending side to advertise the
>capability;
>
>  - teaching the sending side to see if the new behaviour to fix
>off-by-one is asked by the requestor, and stop at the correct
>number of commits, not oversending one more.  Otherwise retain
>the old behaviour.
We can implement these two in current git already, since they only
add to the protocol, not break it in an incompatible manner, right?

>  - teaching the requestor that got --depth=N from the end user to
>pay attention to the new capability in such a way that:
> 
>- when talking to an old sender (i.e. without the off-by-one
>  fix), send N-1 for N greater than 1.  Punt on N==1;
> 
>- when talking to a fixed sender, ask to enable the capability,
>  and send N as is (including N==1).
And these should wait for git2, since they change the meaning of the
--depth parameter? Or is this change ok for current git as well?

What do you mean by "punt" exactly? Show an error to the user, saying
only depth >= 2 is supported?
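
One possible reading of the requestor-side rule quoted above can be sketched in Python. This is a small model, not git's implementation: the function name `deepen_to_send` is hypothetical, and "punt" is assumed here to mean falling back to the smallest depth an old sender can express rather than raising an error.

```python
def deepen_to_send(requested_depth: int, server_fixed: bool) -> int:
    """Return the number to put in the "deepen" line for --depth=N.

    server_fixed is True when the sender advertised the
    "fixed-off-by-one-depth" capability.
    """
    if requested_depth < 1:
        raise ValueError("--depth must be a positive integer")
    if server_fixed:
        # Fixed sender: N means exactly N commits, so send it as is.
        return requested_depth
    if requested_depth == 1:
        # Old sender cannot express "tip only" ("deepen 0" means
        # "do not change shallowness"), so ask for tip + 1 commit.
        return 1
    # Old sender treats "deepen N" as "tip + N commits": ask for one less.
    return requested_depth - 1
```

With this reading, --depth=5 becomes "deepen 4" for an old sender and "deepen 5" for a fixed one, while --depth=1 against an old sender still yields two commits.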

> In the longer term, I think we should introduce a better deepening
> mechanism.  Cf.
Even when there will be a better deepening mechanism, the above is still
useful (passing --depth=1 serves to get just a single commit without
history, which is a distinct use case from deepening the history of an
existing shallow repository). In other words, I think the "improved
deepening" and "fixed depth" should be complementary features.

Gr.

Matthijs


Re: [PATCH] git clone depth of 0 not possible.

2013-05-28 Thread Matthijs Kooijman
Hi Jonathan,

> > Did you consider how to implement this? Looking at the code, it seems
> > the "deepen" parameter in the wire protocol now means:
> >  - 0: Do not change anything about the shallowness (i.e., fetch
> >everything from the shallow root to the tip).
> >  - > 0: Create new shallow commits at depth commits below the tip (so
> >depth == 1 means tip and one below).
> >  - INFINITE_DEPTH (0x7fffffff): Remove all shallowness and fetch
> >complete history.
> >
> > Given this, I'm not sure how one can express "fetch the tip and nothing
> > below that", since depth == 0 already has a different meaning.
> 
> If I remember correctly, what we discussed is just changing the
> protocol to "5 means a depth of 5".

The mail from Junio I replied to said:
> >> As long as we do not change the meaning of the "shallow" count
> >> going over the wire

Which seems to conflict with your suggestion. Or are the "shallow count"
and the "depth" different things?

> The client already trusts what the server provides.
In other words: we won't break existing clients if we suddenly send back
one less commit than before, since the client just sends over what it
wants and then assumes that whatever it gets back is really what it
wanted?

Gr.

Matthijs


Re: [PATCH] git clone depth of 0 not possible.

2013-05-28 Thread Matthijs Kooijman
Hi Junio,

I'm interested in getting a "fetch only the tip commit" feature into git, and
I'll probably look into creating a patch for this.

> >>> Sounds buggy.  Would anything break if we were to make --depth=1 mean
> >>> "1 deep, including the tip commit"?
> >>
> >> As long as we do not change the meaning of the "shallow" count going
> >> over the wire (i.e. the number we receive from the user will be
> >> fudged, so that user's "depth 1" that used to mean "the tip and one
> >> behind it" is expressed as "depth 2" at the end-user level, and we
> >> send over the wire the number that corresponded to the old "depth
> >> 1"), I do not think anything will break, and then --depth=0 may
> >> magically start meaning "only the tip; its immediate parents will
> >> not be transferred and recorded as the shallow boundary in the
> >> receiving repository".
> >
> > I'd rather we reserve 0 for unlimited fetch, something we haven't done
> > so far [1]. And because "unlimited clone" with --depth does not make
> > sense, --depth=0 should be rejected by git-clone.
> 
> I actually was thinking about changing --depth=1 to mean "the tip,
> with zero commits behind it" (and that was consistent with my
> description of "fudging"), but ended up saying "--depth=0" by
> mistake.  I too think "--depth=0" or "--depth<0" does not make
> sense, so we are in agreement.

Did you consider how to implement this? Looking at the code, it seems
the "deepen" parameter in the wire protocol now means:
 - 0: Do not change anything about the shallowness (i.e., fetch
   everything from the shallow root to the tip).
 - > 0: Create new shallow commits at depth commits below the tip (so
   depth == 1 means tip and one below).
 - INFINITE_DEPTH (0x7fffffff): Remove all shallowness and fetch
   complete history.

Given this, I'm not sure how one can express "fetch the tip and nothing
below that", since depth == 0 already has a different meaning.
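
The problem can be made concrete with a small Python model of the "deepen" semantics as described in the list above. It assumes a fresh clone of a linear history and an unfixed sender; the function name `commits_sent` is hypothetical and nothing here is taken from git's source.

```python
def commits_sent(chain_length: int, deepen: int) -> int:
    """Commits a fresh clone receives from a linear history of
    chain_length commits, under the old "deepen" semantics."""
    if deepen == 0:
        # 0 means "do not change shallowness": a fresh clone gets everything.
        return chain_length
    # deepen > 0 means "tip plus `deepen` commits below it". A very large
    # value (such as INFINITE_DEPTH) simply exceeds the history length.
    return min(chain_length, deepen + 1)


# No value of "deepen" yields exactly one commit for a 10-commit history.
tip_only_possible = any(commits_sent(10, d) == 1 for d in range(0, 20))
```

In this model `tip_only_possible` is false: the smallest transfer is two commits, which is the gap the thread is trying to close.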

Of course, one could use depth == 1 in this case to receive two
commits and then drop one, but this would seem a bit pointless to me
(especially if the commit below the tip is very different from the tip
leading to a lot of useless data transfer).

Or did I misunderstand something here?

Gr.

Matthijs


Re: Lines missing from git diff-tree -p -c output?

2013-05-15 Thread Matthijs Kooijman
Hi Junio,

> Could you explain why you think it hides the real problem, and what
> kind of future enhancement may break it?
I think the difference is mostly in the locality of the fix. In my
proposed patch, the no_pre_delete flag is never set on an interesting
line because it is checked in the line before it. In your patch, it
never happens because the control flow guarantees the "context" lines
before each change must be uninteresting.

The net effect is of course identical, but I'm arguing that depending on
the control flow and some code a dozen lines down is easier to break than
depending on a previous line.
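
The "checked in the line before it" approach can be illustrated with a Python sketch of the leading-context loop, before and after the proposed change. The bit values and function names are hypothetical stand-ins; only the flag names `mark` and `no_pre_delete` come from the thread.

```python
MARK = 1 << 0           # line is interesting and will be shown
NO_PRE_DELETE = 1 << 1  # deletions recorded before this line are hidden


def paint_leading_context_old(flags, j, i):
    """Original loop: every painted line also gets NO_PRE_DELETE."""
    for k in range(j, i):
        flags[k] |= MARK | NO_PRE_DELETE


def paint_leading_context_fixed(flags, j, i):
    """Proposed loop: only previously uninteresting lines get NO_PRE_DELETE."""
    for k in range(j, i):
        if not (flags[k] & MARK):
            flags[k] |= NO_PRE_DELETE
        flags[k] |= MARK
```

Given a line that is already marked at index `j`, the old loop hides its deletion, while the fixed loop leaves that line's flags untouched and only paints the uninteresting lines after it.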

Having said that: I'm not sure if the difference is significant enough
to convince me in either direction.



However, thinking about this a bit more (and getting sidetracked on a
completely separate issue/question), I wonder why the coalescing-hunks
code is there in the first place? e.g., why not leave out these lines?

if (k < j + context) {
/* k is interesting and [j,k) are not, but
 * paint them interesting because the gap is small.
 */
while (j < k)
sline[j++].flag |= mark;
i = k;
goto again;
}

If the "context" lines before and after each group of changes are
painted interesting, then these lines in between will also be painted
interesting. Of course, this could cause some lines to be painted as
interesting twice and it needs my fix for the no_pre_delete thing, but
it would work just as well?

However, I can imagine that this code is present to prevent painting
lines twice, which would of course be a bit of a performance loss. But
if this really was the motivation, why is the first if not something
like:

if (k <= j + 2 * context) {

Since IIUC, the current code can still paint a few context lines twice
when they are exactly "context" lines apart, once by the "paint before"
and once by the "paint after" code (which is also what happens in my bug
example, I think). The above should "fix" that as well (the first part
of the test suite hasn't complained so far).
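
The claim that separate before/after painting already covers small gaps can be checked with a simplified Python model. It is not a port of give_context(): it just paints `context` lines around each changed line, and the names `paint_context` and `gap_fully_painted` are hypothetical.

```python
def paint_context(changed, n_lines, context):
    """Indices shown when each changed line drags along `context`
    lines before and after it."""
    painted = set()
    for c in changed:
        lo = max(0, c - context)
        hi = min(n_lines - 1, c + context)
        painted.update(range(lo, hi + 1))
    return painted


def gap_fully_painted(a, b, n_lines, context):
    """True if every line strictly between changes a < b is painted."""
    painted = paint_context([a, b], n_lines, context)
    return all(k in painted for k in range(a + 1, b))
```

In this model, two changes separated by at most 2 * context unchanged lines end up in one contiguous painted region without any explicit merge step; one more unchanged line leaves a hole between the hunks.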

Gr.

Matthijs


[PATCH] combine-diff.c: Fix output when changes are exactly 3 lines apart

2013-05-15 Thread Matthijs Kooijman
When a deletion is followed by exactly 3 (or whatever the number of
context lines) unchanged lines, followed by another change, the combined
diff output would hide the first deletion, resulting in a malformed
diff.

This happened because the 3 lines before each change are painted
interesting, but also marked as no_pre_delete to prevent showing deletes
that were previously marked as uninteresting. This behaviour was
introduced in c86fbe53 (diff -c/--cc: do not include uninteresting
deletion before leading context). However, as a side effect, this could
also mark deletes that were already interesting as no_pre_delete. This
would happen only if the delete was exactly 3 lines away from the next
change, since lines farther away would not be touched by the "paint
three lines before the change" code and lines closer would be painted
by the "merge two adjacent hunks" code instead, which does not set the
no_pre_delete flag.

This commit fixes this problem by only setting the no_pre_delete flag
for changes that were previously uninteresting.

Signed-off-by: Matthijs Kooijman 
---
 combine-diff.c   |  7 +--
 t/t4038-diff-combined.sh | 47 +++
 2 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 77d7872..3e8bb17 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -518,8 +518,11 @@ static int give_context(struct sline *sline, unsigned long cnt, int num_parent)
unsigned long k;
 
/* Paint a few lines before the first interesting line. */
-   while (j < i)
-   sline[j++].flag |= mark | no_pre_delete;
+   while (j < i) {
+   if (!(sline[j].flag & mark))
+   sline[j].flag |= no_pre_delete;
+   sline[j++].flag |= mark;
+   }
 
again:
/* we know up to i is to be included.  where does the
diff --git a/t/t4038-diff-combined.sh b/t/t4038-diff-combined.sh
index 1261dbb..a23ca7e 100755
--- a/t/t4038-diff-combined.sh
+++ b/t/t4038-diff-combined.sh
@@ -353,4 +353,51 @@ test_expect_failure 'combine diff coalesce three parents' '
compare_diff_patch expected actual
 '
 
+# Test for a bug reported at
+# http://thread.gmane.org/gmane.comp.version-control.git/224410
+# where delete lines were missing from combined diff output when they
+# occurred exactly before the context lines of a later change.
+test_expect_success 'combine diff missing delete bug' '
+   git commit -m initial --allow-empty &&
+   cat <<-\EOF >test &&
+   1
+   2
+   3
+   4
+   EOF
+   git add test &&
+   git commit -a -m side1 &&
+   git checkout -B side1 &&
+   git checkout HEAD^ &&
+   cat <<-\EOF >test &&
+   0
+   1
+   2
+   3
+   4modified
+   EOF
+   git commit -a -m side2 &&
+   git branch -f side2 &&
+   test_must_fail git merge --no-commit side1 &&
+   cat <<-\EOF >test &&
+   1
+   2
+   3
+   4modified
+   EOF
+   git add test &&
+   git commit -a -m merge &&
+   git diff-tree -c -p HEAD >actual.tmp &&
+   sed -e "1,/^@@@/d" < actual.tmp >actual &&
+   tr -d Q <<-\EOF >expected &&
+   - 0
+ 1
+ 2
+ 3
+-4
++4modified
+   EOF
+   compare_diff_patch expected actual
+'
+
 test_done
-- 
1.8.3.rc1



Re: Lines missing from git diff-tree -p -c output?

2013-05-15 Thread Matthijs Kooijman
Hi Junio,

> I think the coalescing of two adjacent hunks into one is painting
> leading lines "interesting to show context but not worth showing
> deletion before it" incorrectly.
Yup, that seems to be the case.

> Does this patch fix the issue?

Yes, it fixes the issue. However, I think that this patch actually hides
the real problem (in a way that will always work with the current code,
though).

I had come up with a different fix myself (similar to the one I sent to
the list as a followup, but that one still had a bug), which I think
might be better. In any case, it includes a testcase for this bug which
seems good to include.

I'll send my patch as a followup in a minute, feel free to use it
entirely or only partially.

Gr.

Matthijs


Re: Lines missing from git diff-tree -p -c output?

2013-05-15 Thread Matthijs Kooijman
Hi folks,

> $ git diff-tree -p -c HEAD
> d945a51b6ca22e6e8e550c53980d026f11b05158
> diff --combined file
> index 3404f54,0eab113..e8c8c18
> --- a/file
> +++ b/file
> @@@ -1,7 -1,5 +1,6 @@@
>  +LEFT
>   BASE2
>   BASE3
>   BASE4
> - BASE5
> + BASE5MODIFIED
>   BASE6

I found the spot in the code where this is going wrong: there is an
incorrectly set "no_pre_delete" flag for the context lines before each
hunk. Since a patch says more than a thousand words, here's what I think
will fix this problem:

diff --git a/combine-diff.c b/combine-diff.c
index 77d7872..d36bfcf 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -518,8 +518,11 @@ static int give_context(struct sline *sline, unsigned long cnt, int num_parent)
unsigned long k;
 
/* Paint a few lines before the first interesting line. */
-   while (j < i)
-   sline[j++].flag |= mark | no_pre_delete;
+   while (j < i) {
+   if (!(sline[j++].flag & mark))
+   sline[j++].flag |= no_pre_delete;
+   sline[j++].flag |= mark;
+   }
 
again:
/* we know up to i is to be included.  where does the

I'll see if I can write up a testcase and then submit this as a proper
patch, but I wanted to at least send this over now lest someone wastes
time coming to the same conclusion as I did.

Gr.

Matthijs


Lines missing from git diff-tree -p -c output?

2013-05-15 Thread Matthijs Kooijman
Hi folks,

while trying to parse git diff-tree output, I found out that in some
cases it appears to generate an incorrect diff (AFAICT). I originally
found this in a 5-way merge commit in the Linux kernel, but managed to
reduce this to something a lot more manageable (an ordinary 2-way merge
on a 6-line file).

To start with the wrong-ness, this is the diff generated:

$ git diff-tree -p -c HEAD
d945a51b6ca22e6e8e550c53980d026f11b05158
diff --combined file
index 3404f54,0eab113..e8c8c18
--- a/file
+++ b/file
@@@ -1,7 -1,5 +1,6 @@@
 +LEFT
  BASE2
  BASE3
  BASE4
- BASE5
+ BASE5MODIFIED
  BASE6

Here, the header claims that the first head has 7 lines, but there really are
only 6 (5 lines of context and one delete line). The numbers for the other
heads are correct. In the original diff, the difference was bigger
(first head was stated to have 28 lines, while the output was similar to
the above).

To find out what's going on, we can look at the -m output, which is
correct (or look at the original file contents at the end of this mail).

$ git diff-tree -m -p HEAD
d945a51b6ca22e6e8e550c53980d026f11b05158
diff --git a/file b/file
index 3404f54..e8c8c18 100644
--- a/file
+++ b/file
@@ -1,7 +1,6 @@
 LEFT
-BASE1
 BASE2
 BASE3
 BASE4
-BASE5
+BASE5MODIFIED
 BASE6
d945a51b6ca22e6e8e550c53980d026f11b05158
diff --git a/file b/file
index 0eab113..e8c8c18 100644
--- a/file
+++ b/file
@@ -1,3 +1,4 @@
+LEFT
 BASE2
 BASE3
 BASE4

As you can see here, first head added "LEFT", and the second head removed
"BASE1" and modified "BASE5". In the -c diff-tree output above, this removal of
"BASE1" is not shown, but it is counted in the number of lines, causing this
breakage.


Note that to trigger this behaviour, the number of context lines between the
BASE1 and BASE5 must be _exactly_ 3; more or fewer prevents this bug from
occurring. Also, the "LEFT" line introduced does not seem to be
essential, but there needed to be some change from both sides in order
to generate a diff at all.

I haven't looked into the code, though I might give that a go later.
Anyone got any clue why this is happening? Is this really a bug, or am I
misunderstanding here?

To recreate the above situation, you can use the following commands:

git init
cat > file < file < file < file 

Re: Git does not handle changing inode numbers well

2012-08-08 Thread Matthijs Kooijman
Hi Junio,

> - if (ce->ce_ino != (unsigned int) st->st_ino)
> + if (trust_inum && ce->ce_ino != (unsigned int) st->st_ino)
>   changed |= INODE_CHANGED;

I just tried this with 1.7.10 (that is, I deleted these two lines to
mimic trust_inum being false) and it indeed fixes my problem.

(I probably won't be implementing the full patch, though; I've
already figured out how to fix my filesystem instead)

Gr.

Matthijs


signature.asc
Description: Digital signature


Re: Git does not handle changing inode numbers well

2012-08-08 Thread Matthijs Kooijman
> So, let's see if I can fix my filesystem now ;-)
For anyone interested: turns out passing -o noforget makes fuse keep a
persistent path -> inode mapping (at the cost of memory usage, of
course).

However, it also turns out that fuse wasn't my problem: It was the aufs
mount that was overlaid over my fuse mount (this was on a Debian live
system), which sets the noxino option that prevents aufs from keeping
persistent inode numbers.

To get git status working as expected, I had to both remove noxino from
the aufs mount and add noforget to the underlying fuse mount.

Gr.

Matthijs




Git does not handle changing inode numbers well

2012-08-08 Thread Matthijs Kooijman
(Please CC me, I'm not on the list)

Hi folks,

I've spent some time debugging an issue and I'd like to share the
results. The conclusion of my debugging is that git does not currently
handle changing inode numbers on files well.

I have a custom Fuse filesystem, and fuse dynamically allocates inode
numbers to paths, but keeps a limited cache of inode -> name mappings,
causing the inodes to change over time.

Now of course, you'll probably say, "it's the filesystem's fault, git
can't be expected to cope with that". You'll be right of course, but
since I already spent the time digging into this and figuring out what
goes on inside git in this case, I thought I might as well share the
analysis, just in case someone sees an easy fix in here, or in case
someone else stumbles upon this problem as well.

So, the actual problem I was seeing is that running "git status" showed
all symlinks as "modified", even though they really were identical
between the working copy, index and HEAD. Interestingly enough this only
happened when running "git status" without further arguments, when
running on a subdirectory, it would show no changes as expected.

I compared the output of stat to a hexdump of the index file and found
that everything matched, except for the inode numbers. I originally
thought I was misinterpreting what I saw, but gdb confirmed that it was
indeed the inode numbers that git observed as different.

Now, I could have stopped here and started trying to fix my filesystem
instead. But it was still weird that this problem only existed for
symlinks and that normal files acted as expected. So I dug in a bit
deeper, hoping to find some way to make this work for symlinks as well.

So, here's what happens (IIUC):
 - cmd_status calls refresh_index, which calls refresh_cache_ent for
   every entry in the index.
 - refresh_cache_ent notices that the inode number has changed (for both
   symlinks and regular files) and compares the file / symlink contents.
 - refresh_cache_ent sees the content hasn't changed, so it calls
   fill_stat_cache_info to update the stat info.
 - fill_stat_cache_info sets the CE_UPTODATE flag on the entry, but only
   if it is a regular file.
 - cmd_status calls wt_status_collect which calls
   wt_status_collect_changes_worktree which calls run_diff_files.
 - run_diff_files skips regular files, because of the CE_UPTODATE flag.
   For symlinks, however, it checks the stat info and notices that the
   inode number has changed (again). It does not do a content check at
   this point, but instead just outputs the file as "modified".


It turned out that the reason running "git status" on a subdirectory did
appear to work, was that the number of files in the subdir wasn't big
enough to overflow the inode number cache fuse keeps, so that numbers
didn't change in this case (the problem _did_ occur when trying a bigger
subdirectory).

So, it seems that git just doesn't cope well with changing inode numbers
because it checks the content in a first pass in refresh_index, but only
checks the stat info in the second pass in run_diff_files. The reason it
does work for regular files is the CE_UPTODATE optimization introduced in
eadb5831: Avoid running lstat(2) on the same cache entry.
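
The two passes described above can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions, not git's code: `Entry`, `refresh_pass`, `diff_pass` and `unstable_inodes` are hypothetical names, file content is assumed unchanged throughout, and only the inode number stands in for the full stat info.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Entry:
    path: str
    is_symlink: bool
    cached_inode: int       # inode number recorded in the index
    uptodate: bool = False  # stands in for the per-entry up-to-date flag


def refresh_pass(entries, inode_of):
    """First pass: on a stat mismatch the content is compared (assumed
    equal here) and the cached stat info is refreshed."""
    for e in entries:
        current = inode_of(e.path)
        if e.cached_inode != current:
            e.cached_inode = current
        if not e.is_symlink:
            e.uptodate = True  # only regular files get the flag


def diff_pass(entries, inode_of):
    """Second pass: flagged entries are skipped, the rest are judged
    on stat info alone."""
    modified = []
    for e in entries:
        if e.uptodate:
            continue
        if e.cached_inode != inode_of(e.path):
            modified.append(e.path)
    return modified


def unstable_inodes():
    """Filesystem model that hands out a new inode number on every lookup."""
    counter = itertools.count(1000)
    return lambda path: next(counter)
```

Running both passes over one regular file and one symlink with `unstable_inodes()` reports only the symlink as modified, while a lookup that returns stable numbers reports nothing, matching the behaviour seen with "git status".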

So, let's see if I can fix my filesystem now ;-)

Gr.

Matthijs

