Re: [PATCH] doc: clarify triangular workflow

2017-12-02 Thread Junio C Hamano
Timothee Albertin  writes:

> diff --git a/Documentation/gitworkflows.txt b/Documentation/gitworkflows.txt
> index 02569d0..21f6dc8 100644
> --- a/Documentation/gitworkflows.txt
> +++ b/Documentation/gitworkflows.txt
> @@ -407,8 +407,8 @@ follows.
>  `git pull  `
>  =
>  
> -Occasionally, the maintainer may get merge conflicts when he tries to
> -pull changes from downstream.  In this case, he can ask downstream to
> +Occasionally, the maintainers may get merge conflicts when they try to
> +pull changes from downstream.  In this case, they can ask downstream to
>  do the merge and resolve the conflicts themselves (perhaps they will
>  know better how to resolve them).  It is one of the rare cases where
>  downstream 'should' merge from upstream.

The document starts with

This document attempts to write down and motivate some of the
workflow elements used for `git.git` itself.  Many ideas apply
in general, though the full workflow is rarely required for
smaller projects with fewer people involved.

and makes me wonder (note: I am not involved in writing any of the
existing text in this document) how much material foreign to the
actual workflow used for `git.git` should go in here.  Having
multiple maintainers at the same time is not a workflow element that
we have ever used, for example, so I am not sure about the change in
the above paragraph.

> +TRIANGULAR WORKFLOW
> +---

I really hate to say this.  Before I commented on the last round
that tried to add this section, I didn't read the original closely
enough.

The thing is, the document does already cover the triangular workflow in the
"Merge workflow" section (you may need to already know what you are
reading to realize that fact, though).  The text in the existing
"Merge workflow" section where the requestor pushes to somewhere for the
maintainer to pull from may not be immediately obvious, and it may
be worthwhile to improve it, but I find it highly misleading to add
an entirely new section as if it is describing yet another separate
workflow that is different from anything that is already described
in the document.  It is not.

A replacement of the entire section (but I'd recommend keeping the
"Merge workflow" title, which contrasts well with the other "Patch
workflow" that follows), or a separate document that is referred to
with "see that other one for a lengthier description" by the
existing "Merge workflow" section, or somewhere in between, might be
a more acceptable organization, though.


Re: [PATCH v6 1/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-12-02 Thread Junio C Hamano
Jeff King  writes:

> ... Eventually our fill() will block trying to get data that is
> not there. On an Apache server, the webserver would know there is
> nothing left to send us and close() the pipe, and we'd get EOF.
> But on IIS, I think the pipe remains open and we'd just block
> indefinitely trying to read().

Ah, yeah, under that scenario, trusting content-length and trying to
read, waiting for input that would never come, will be a problem,
and it would probably want to get documented.
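The idea behind respecting CONTENT_LENGTH can be sketched in plain
shell; `read_body` below is a made-up helper name, not anything in
http-backend:

```shell
# Read-to-EOF blocks until the server closes the pipe; honoring
# CONTENT_LENGTH instead consumes exactly that many bytes and returns,
# regardless of whether the writer ever closes its end.
read_body() {
    head -c "$CONTENT_LENGTH"
}

CONTENT_LENGTH=5
body=$(printf 'helloEXTRA-BYTES-WE-MUST-NOT-WAIT-FOR' | read_body)
echo "$body"   # prints "hello"
```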



Re: [PATCH] Add a sample hook which saves push certs as notes

2017-12-02 Thread Junio C Hamano
Todd Zullinger  writes:

> (I also noticed the tests which use $GIT_PUSH_CERT, like t5534, use
> 'cat-file blob ...' rather than 'cat-file -p ...'.  I don't know if
> that's much safer/better than letting cat-file guess the object type
> in the hook.

The '-p' option is meant for human consumption and we promise that
the output from it _will_ change if it makes sense at the UI level.

In a script like this, you do care about the exact byte sequence.
So that is a more important reason why you should say "blob" not
"-p".
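A minimal sketch of the difference, in a throwaway repository (the
certificate content here is made up):

```shell
# In a script, name the object type explicitly: "cat-file blob" yields
# the exact stored bytes (or errors out on a type mismatch), while "-p"
# guesses the type and may pretty-print for human consumption.
git init -q cert-demo && cd cert-demo
blob=$(printf 'certificate version 0.1\n' | git hash-object -w --stdin)
git cat-file -t "$blob"        # sanity-check the type first
git cat-file blob "$blob"      # exact byte sequence, safe for scripts
```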

>> +# Verify that the ref update matches that in push certificate.
>> +if [[ $push_cert == *$oval" "$nval" "$ref* ]]; then

I am not sure what this expression is trying to do in the first
place.  The push certificate blob may well contain these three
values, but it has a lot more in it than that.

A post-receive is run after all the receive processing is done, so
its failing cannot abort the transfer.  I wonder how two almost
simultaneous pushes to the same ref, which would not fail normally
without this new hook script, would behave.  One receive updates the
tip from A to B and then starts running this script, while the other
receive updates the tip from B to C and then starts running another
copy of the script.  They both want to update the notes database,
but there can be only one winner in the race for its tip.

What happens then?  Don't we need to be running a script like this
from a hook mechanism that runs under a lock or something?
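One way to get that serialization, sketched with a crude mkdir(2)-based
lock; the lock path is made up, and flock(1) may be preferable where it
is available:

```shell
# Serialize concurrent hook invocations: mkdir is atomic, so only one
# receive at a time gets to run the guarded command.
lockdir=pushcert-notes.lock

with_lock() {
    while ! mkdir "$lockdir" 2>/dev/null
    do
        sleep 1    # another receive holds the lock; wait our turn
    done
    "$@"
    rc=$?
    rmdir "$lockdir"
    return $rc
}

# In the hook's while-read loop one would then run, e.g.:
# with_lock git notes --ref=pushcerts add -f -m "$push_cert" "$nval"
```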


Re: Unify annotated and non-annotated tags

2017-12-02 Thread Junio C Hamano
anatoly techtonik  writes:

>> Git is not doomed to preserve anything forever. We've gradually broken
>> backwards compatibility for a few core things like these.
>>
>> However, just as a bystander reading this thread I haven't seen any
>> compelling reason for why these should be removed. You initially had
>> questions about how to extract info about them, which you got answers
>> to.
>>
>> So what reasons remain for why they need to be removed?
>
> To reduce complexity and prior knowledge when dealing with Git tags.

Then you are barking up the wrong tree, I would have to say.  If you
do not think non-annotated tags serve any good purpose in the
context of _a_ project you are involved in, and they are making your
life miserable because some tags are annotated and others are not,
suggest to _that_ project to refrain from using non-annotated tags.

Projects that do use non-annotated tags have chosen to use them for
reasons of their own.  It may be that these tags could be annotated
tags instead for some projects, and depending on their tooling there
might be no downside in switching.  Or they may have chosen to use
non-annotated tags for reasons that you do not know.



Re: "git describe" documentation and behavior mismatch

2017-12-02 Thread Junio C Hamano
Daniel Knittl-Frank  writes:

> An interesting fact (and intentional behavior?) is that describing a
> commit with only a lightweight tag will properly display the tags/
> prefix. I assume this is because the annotated tags only store the
> tagname without any ref namespace, which is then picked up by git
> describe and displayed.

I suspect that the "see if the name recorded in the tag object
matches the name of the ref that stores the tag after refs/tags/"
code is not just verifying what it claims to verify (which may be
good) but is also unintentionally affecting the output (i.e. "--all"
promises that the prefix tags/ should be shown).  Perhaps the code
needs to be fixed if that is the case.




Re: [PATCH] hashmap: adjust documentation to reflect reality

2017-12-02 Thread Junio C Hamano
Jeff King  writes:

> My second suggestion (which I'm on the fence about) is: would it better
> to just say "see t/helper/test-hashmap.c for a representative example?"

I also had the same thought.  It is rather unwieldy to ask people to
lift code from comment text, and it is also hard to keep such
commented-out code up to date.

> I'm all for code examples in documentation, but this one is quite
> complex. The code in test-hashmap.c is not much more complex, and is at
> least guaranteed to compile and run.

Yup.  Exactly.

> It doesn't show off how to combine a flex-array with a hashmap as
> well, but I'm not sure how important that is. So I could go either
> way.

Likewise.

In any case, keeping a bad example as-is is worse than replacing it
with a corrected one, so I do not mind taking this patch as an
immediate first step, regardless of whether we later decide to
remove it and refer to an external file that has a real example that
will be easier to maintain and use.

Thanks.


Re: [PATCH v4 1/4] Makefile: generate Perl header from template file

2017-12-02 Thread Junio C Hamano
Johannes Sixt  writes:

>> +sed -e 's=@@PATHSEP@@=$(pathsep)=g' \
>
> This doesn't work, unfortunately. When $(pathsep) is ';', we get an
> incomplete sed expression because ';' is also a command separator in
> the sed language.

It is correct that ';' can be and does get used in place of LF when
writing a script on a single line, but even then, as part of a
string argument to 's' command (and also others), there is no need
to quote ';' or otherwise treat it specially, as the commands
know what their syntax is (e.g. 's=string=replacement=' after seeing
the first '=' knows that it needs to find one unquoted '=' to find
the end of the first argument, and another to find the end of the
replacement string, and ';' seen during that scanning would not have
any special meaning).

If your sed is so broken that it does not satisfy the above
expectation, t6023 would not work for you, I would guess.

t/t6023-merge-file.sh:sed -e "s/deerit.\$/deerit;/" -e "s/me;\$/me./" < new5.txt > new6.txt
t/t6023-merge-file.sh:sed -e "s/deerit.\$/deerit,/" -e "s/me;\$/me,/" < new5.txt > new7.txt
t/t6023-merge-file.sh:sed -e 's/deerit./&/' -e "s/locavit,/locavit;/" < new6.txt | tr '%' '\012' > new8.txt
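The claim is easy to check with any POSIX sed; a ';' inside the
s=…=…= arguments is literal, while a ';' after the flags separates
commands:

```shell
# The first command rewrites "a" to "x;y" (the ';' is part of the
# replacement string); the ';' after the "g" flag then separates it
# from a second s command that rewrites "y" to "z".
printf 'a\n' | sed -e 's=a=x;y=g;s=y=z=g'   # prints "x;z"
```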



Re: [PATCH v4 2/2] launch_editor(): indicate that Git waits for user input

2017-12-02 Thread Junio C Hamano
Jeff King  writes:

> I tried to think of ways this "show a message and then delete it" could
> go wrong. It should work OK with editors that just do curses-like
> things, taking over the terminal and then restoring it at the end.
> ...

I do not think it is worth special-casing "dumb" terminals the way
this round of the patches does.  If it so much disturbs reviewers
that "\e[K" may not work everywhere, we can do without the "then
delete it" part.  It was merely trying to be extra nice; the more
important part of the "feature" is to be noticeable, and I do think
that not showing anything on "dumb" terminals, only because the
message cannot be retracted, is putting the cart before the horse.

Especially since people are now hiding this behind an advise.*
thing, I think it is OK to show a message, and even to waste a line.

> An even worse case (and yes, this is really reaching) is:
>
>   $ GIT_EDITOR='echo one; printf "two\\r"; vim' git commit
>   hint: Waiting for your editor input...one
>   Aborting commit due to empty commit message.
>
> There we ate the "two" line.

Yes, I would have to agree that this one is reaching, as there isn't
any valid reason other than "the editor then wanted to do \e[K
later" for it to end its last line with CR.  So our eating that line
is not a problem.


Re: [PATCH v6 1/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-12-02 Thread Jeff King
On Sat, Dec 02, 2017 at 05:02:37PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >   3. For large inputs (like incoming packfiles), we connect the
> >  descriptor directly to index-pack or unpack-objects, and they try
> >  to read to EOF.
> >
> >  For a well-formed pack, I _think_ this would work OK. We'd see the
> >  end of the pack and quit (there's a check for garbage at the end of
> >  the pack, but it triggers only for the non-pipe case).
> >
> >  For a truncated input, we'd hang forever rather than report an
> >  error.
> 
> Hmm.  index-pack and/or unpack-objects would be fed directly from
> the incoming pipe, they are not told how many bytes to expect (by
> design), so they try to read to EOF, which may come after the end of
> the pack-stream data, and they write the remaining junk to their
> standard output IIRC.
> 
> For a well-formed pack, the above is what should happen.

Yeah, there should be zero bytes of "remaining junk" in the normal
well-formed case. And as long as the webserver does not mind us asking
to read() a few extra bytes, we are fine (it tells us there are no more
bytes available right now). The original problem report with IIS was
that it would hang trying to read() that final EOF, and I don't
think that would happen here.

I wouldn't be surprised if there are webservers or situations where that
extra read() behaves badly (e.g., because it is connected directly to
the client socket and the client is trying to pipeline requests or
something).

> I am having trouble trying to come up with a case where the input
> stream is mangled and we cannot detect where the end of the
> pack-stream is without reading more than we will be fed through the
> pipe, and yet we do not trigger a "we tried to read because the data
> we received so far is incomplete, and got an EOF" error.
> 
> Wouldn't "early EOF" trigger in the fill() helper that these two
> programs have (but not share X-<)?

I think the original problem was the opposite of "early EOF": the other
side of the pipe never gives us EOF at all. So imagine the pack is
mangled so that the zlib stream of an object never ends, and just keeps
asking for more data. Eventually our fill() will block trying to get
data that is not there. On an Apache server, the webserver would know
there is nothing left to send us and close() the pipe, and we'd get EOF.
But on IIS, I think the pipe remains open and we'd just block
indefinitely trying to read().

I don't have such a setup to test on, and it's possible I'm
mis-interpreting or mis-remembering the discussion around the original
patch.

-Peff


Re: bug deleting "unmerged" branch (2.12.3)

2017-12-02 Thread Junio C Hamano
"Philip Oakley"  writes:

> I think it was that currently you are on M, and neither A nor B are
> ancestors (i.e. merged) of M.
>
> As Junio said:- "branch -d" protects branches that are yet to be
> merged to the **current branch**.

Actually, I think people loosened this over time and removal of
branch X is not rejected even if the range HEAD..X is not empty, as
long as X is marked to integrate with/build on something else with
branch.X.{remote,merge} and the range X@{upstream}..X is empty.

So the emphasis you added on "current branch" above is a bit of a
white lie.
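The base protection is easy to demonstrate in a throwaway repository
(a sketch; with no branch.topic.{remote,merge} configured, the check is
against the current branch):

```shell
git init -q branch-demo && cd branch-demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m base
git checkout -q -b topic
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m work
git checkout -q -                   # back on the initial branch
if ! git branch -d topic 2>/dev/null
then
    echo 'refused: topic is not merged into the current branch'
fi
```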


Re: [PATCH v3 2/2] Doc/check-ref-format: clarify information about @{-N} syntax

2017-12-02 Thread Junio C Hamano
Kaartic Sivaraam  writes:

> When the N-th previous thing checked out syntax (@{-N}) is used
> with '--branch' option of check-ref-format the result might not
> always be a valid branch name (see NOTE below). This is because
> @{-N} is used to refer to the N-th last checked out "thing" which
> might be any commit (sometimes a branch). The documentation thus
> does a wrong thing by promoting it as the "previous branch syntax".
>
> So, correctly state @{-N} is the syntax for specifying "N-th last
> thing checked out" and also state that the result of using @{-N}
> might also result in a "commit hash".
>
> NOTE: Though a commit-hash is a "syntactically" valid branch name,
> it is generally not considered as one for the use cases of
> "git check-ref-format --branch". That's because a user who does
> "git check-ref-format --branch @{-$N}" would except the output
> to be a "existing" branch name apart from it being syntactically
> valid.

s/except/expect/ I suspect.  But I do not think this description is
correct.  "check-ref-format --branch @{-1}", when you come from the
detached HEAD state, happily reports success, so it does not hold
that it is "generally not considered".

Unless you are saying "check-ref-format --branch" is buggy, that is.
If so, we shouldn't be casting that buggy behaviour in stone by
documenting, but should be fixing it.

But because this patch is about documenting, the farthest you can go
is to say that "check-ref-format --branch only checks if the name is
syntactically valid, and if you came from a detached HEAD, or if you
came from a branch that you deleted since then, the command will say
'yes, that looks like a good branch name to use'.  That may not
match what you expect, but that is the way it is.  Get used to it
and that is why we document that behaviour here."

And the paragraph that leads to this NOTE and this NOTE itself are
both misleading from that point of view.  The result *is* always a
valid branch name, but it may not name a branch that currently
exists or ever existed.
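That behaviour can be seen in a throwaway repository (a sketch):

```shell
git init -q crf-demo && cd crf-demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m base
git checkout -q --detach              # detach HEAD at the commit
git checkout -q -b temp               # @{-1} now names the detached state
git check-ref-format --branch '@{-1}' # succeeds, printing the commit's
                                      # object name, not a branch name
```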

Taking the above together, perhaps.

When the N-th previous thing checked out syntax (@{-N}) is used
with '--branch' option of check-ref-format the result may not be
the name of a branch that currently exists or ever existed.
This is because @{-N} is used to refer to the N-th last checked
out "thing", which might be any commit (sometimes a branch) in
the detached HEAD state. The documentation thus does a wrong
thing by promoting it as the "previous branch syntax".

State that @{-N} is the syntax for specifying "N-th last thing
checked out" and also state that the result of using @{-N} might
also result in a commit object name.

> diff --git a/Documentation/git-check-ref-format.txt 
> b/Documentation/git-check-ref-format.txt
> index cf0a0b7df..5ddb562d0 100644
> --- a/Documentation/git-check-ref-format.txt
> +++ b/Documentation/git-check-ref-format.txt
> @@ -78,17 +78,20 @@ reference name expressions (see linkgit:gitrevisions[7]):
>  . at-open-brace `@{` is used as a notation to access a reflog entry.
>  
>  With the `--branch` option, the command takes a name and checks if
> -it can be used as a valid branch name (e.g. when creating a new
> -branch).  The rule `git check-ref-format --branch $name` implements
> +it can be used as a valid branch name e.g. when creating a new branch
> +(except for one exception related to the previous checkout syntax
> +noted below). The rule `git check-ref-format --branch $name` implements

And "except for" is also wrong here.  40-hex still *IS* a valid
branch name; it is just it may not be what we expect.  So perhaps
rephrase it to something like "(but be cautious when using the
previous checkout syntax that may refer to a detached HEAD state)".

>  may be stricter than what `git check-ref-format refs/heads/$name`
>  says (e.g. a dash may appear at the beginning of a ref component,
>  but it is explicitly forbidden at the beginning of a branch name).
>  When run with `--branch` option in a repository, the input is first
> -expanded for the ``previous branch syntax''
> -`@{-n}`.  For example, `@{-1}` is a way to refer the last branch you
> -were on.  This option should be used by porcelains to accept this
> -syntax anywhere a branch name is expected, so they can act as if you
> -typed the branch name.
> +expanded for the ``previous checkout syntax''
> +`@{-n}`.  For example, `@{-1}` is a way to refer the last thing that
> +was checkout using "git checkout" operation. This option should be

s/was checkout/was checked out/;

> +used by porcelains to accept this syntax anywhere a branch name is
> +expected, so they can act as if you typed the branch name. As an
> +exception note that, the ``previous checkout operation'' might result
> +in a commit hash when the N-th last thing checked out was not a branch.

s/a commit hash/a commit object name/;



Re: [PATCH v2 2/2] Doc/check-ref-format: clarify information about @{-N} syntax

2017-12-02 Thread Junio C Hamano
Kaartic Sivaraam  writes:

>> I have a mild suspicion that "git checkout -B @{-1}" would want to
>> error out instead of creating a valid new branch whose name is
>> 40-hex that happen to be the name of the commit object you were
>> detached at previously.
>
> I thought this the other way round. Rather than letting the callers
> error out when @{-N} didn't expand to a branch name, I thought we
> should not be expanding @{-N} syntax for "check-ref-format --branch" at
> all to make a "stronger guarantee" that the result is "always" a valid
> branch name. Then I thought it might be too restrictive and didn't
> mention it. So, I dunno.
>
>
>> I am not sure if "check-ref-format --branch" should the same; it is
>> more about the syntax and the 40-hex _is_ valid there, so...
>
> I'm not sure what you were trying to say here, sorry.

The "am not sure if" comes from this question: should these two be
equivalent?

$ git check-ref-format --branch @{-1}
$ git check-ref-format --branch $(git rev-parse --verify @{-1})

If they should be equivalent (and I think the current implementation
says they are), then the answer to "if ... should do the same?"
becomes "no, we should not error out".

Stepping back a bit, the mild suspicion above says

$ git checkout HEAD^0
... do things ...
$ git checkout -b temp
... do more things ...
$ git checkout -B @{-1}

that creates a new branch whose name is the 40-hex of the commit that
happens to be where we started the whole dance *is* a bug.  No sane
user expects that to happen, and the last step "checkout -B @{-1}"
should result in an error instead [*1*].

I was wondering if "git check-ref-format --branch @{-1}", when used
in place of "checkout -B @{-1}" in the above sequence, should or
should not fail.  It really boils down to this question: what @{-1}
expands to and what the user wants to do with it.

In the context of getting a revision (i.e. "rev-parse @{-1}") where
we are asking what the object name is, the value of the detached
HEAD we were on previously is a valid answer we are "expanding @{-1}
to".  If we were on a concrete branch and @{-1} yields a concrete
branch name, then rev-parse would turn that into an object name, and
in the end, in either case, the object name is what we wanted to
get.  So we do not want to error this out.

But a user of "git check-ref-format --branch" is not asking about
the object name ("git rev-parse" would have been used otherwise).
Which argues for erroring out "check-ref-format --branch @{-1}" if
we were not on a branch in the previous state, i.e. declaring that
the first two commands in this message are not equivalent.


[Footnote]

*1* "It should instead detach HEAD at that commit if @{-1} refers to
a detached HEAD state" is not a good suggestion (we wouldn't
have "-B" if a mere branch switching is desired).



Re: [PATCH 4/5] rebase -i: learn to abbreviate command names

2017-12-02 Thread Junio C Hamano
Johannes Schindelin  writes:

> I am a fan of not relying too heavily on compiler optimization and e.g.
> extract code from loops when it does not need to be evaluated every single
> iteration. In this case:
>
>   const char *pick = abbreviate_commands ? "p" : "pick";
>   ...
>   strbuf_addf(, "%s %s ", pick,
>   oid_to_hex(>object.oid));

I would have called that variable "pick_cmd", not just "pick"; this
preference is minor enough that I would probably reject a patch to
rename from one to the other if the above were already part of the
existing codebase.

I find the code suggested above easier to follow, simply
because it expresses clearly the flow of thought and that flow of
thought matches how I personally think: we decide how this command
is spelled in the output upfront, and then use that same spelling
consistently throughout the loop.

I do not think it matters performance-wise either way, but I value
how easy it is to follow the code for humans, and it matters much
more in the longer run.  If a compiler does a poor job, we can
eventually notice and help it to produce better code that still does
what we wanted it to do (or it may not be performance critical and
we may not even notice).  If the code is hard to follow, on the
other hand, what we wanted it to do in the first place becomes
harder to figure out.
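The shape of the suggestion, transposed to shell (all names made up):

```shell
# Decide the spelling once, before the loop, then use that same
# spelling consistently throughout the loop body.
abbreviate_commands=true
if test "$abbreviate_commands" = true
then
    pick_cmd=p
else
    pick_cmd=pick
fi

for oid in 1111111 2222222
do
    printf '%s %s\n' "$pick_cmd" "$oid"
done
```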


Re: [PATCH v6 1/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-12-02 Thread Junio C Hamano
Jeff King  writes:

>   3. For large inputs (like incoming packfiles), we connect the
>  descriptor directly to index-pack or unpack-objects, and they try
>  to read to EOF.
>
>  For a well-formed pack, I _think_ this would work OK. We'd see the
>  end of the pack and quit (there's a check for garbage at the end of
>  the pack, but it triggers only for the non-pipe case).
>
>  For a truncated input, we'd hang forever rather than report an
>  error.

Hmm.  index-pack and/or unpack-objects would be fed directly from
the incoming pipe, they are not told how many bytes to expect (by
design), so they try to read to EOF, which may come after the end of
the pack-stream data, and they write the remaining junk to their
standard output IIRC.

For a well-formed pack, the above is what should happen.

I am having trouble trying to come up with a case where the input
stream is mangled and we cannot detect where the end of the
pack-stream is without reading more than we will be fed through the
pipe, and yet we do not trigger a "we tried to read because the data
we received so far is incomplete, and got an EOF" error.

Wouldn't "early EOF" trigger in the fill() helper that these two
programs have (but not share X-<)?



Re: [PATCH v6 0/2] http-backend: respect CONTENT_LENGTH as specified by rfc3875

2017-12-02 Thread Junio C Hamano
Max Kirillov  writes:

> On Mon, Nov 27, 2017 at 01:02:10PM +0900, Junio C Hamano wrote:
>> To recap (other than the typofix in the proposed log message), here
>> is what I would have as SQUASH??? on top of (or interspersed with)
>> v6.
>
> Thank you. I'll update it a bit later. May/should I add
> "Signed-off-by:" from you?

As you'd be dropping the new helper, I think you will not use a
major part of that message.  But even if you would, a "Helped-by:"
before your sign-off should be enough for a small "help" like the
one in the message you are responding to.

Thanks.


Re: [PATCH v6 2/2] t5560-http-backend-noserver.sh: add CONTENT_LENGTH cases

2017-12-02 Thread Junio C Hamano
Max Kirillov  writes:

> If nobody objects changing the user-visible behavior, I'll
> consider using this.

What user-visible behaviour?  A user tries to use a new test helper
introduced by the previous round and does not find it?  That's OK.

> PS: I'll respond to your other reply a bit later.

Thanks.  


Re: [PATCH] Add a sample hook which saves push certs as notes

2017-12-02 Thread Todd Zullinger

Hi Shikher,

I'm not familiar with push certs, but I did notice some general issues
in the sample hook.  I hope they're helpful.

Shikher Verma wrote:

index 0..b4366e43f
--- /dev/null
+++ b/templates/hooks--post-receive.sample
+#!/bin/sh

...

+if test -z GIT_PUSH_CERT ; then
+exit 0
+fi


The $ is missing from GIT_PUSH_CERT.  test -z GIT_PUSH_CERT will
always be false. :)

The variable should also be quoted.  Not all sh implementations accept
a missing argument to test -z, as bash does.

More minor, Documentation/CodingGuidelines suggests placing 'then' on
a new line:

   if test -z "$GIT_PUSH_CERT"
   then
   exit 0
   fi

(There is plenty of code that doesn't follow that, so I don't know how
strong that preference is.)

This could also be written as:

   test -z "$GIT_PUSH_CERT" && exit 0

I don't know if there's any general preference to shorten it in git's
code or not.


+push_cert=$(git cat-file -p  $GIT_PUSH_CERT)


Very minor: there's an extra space before the variable here.

(I also noticed the tests which use $GIT_PUSH_CERT, like t5534, use
'cat-file blob ...' rather than 'cat-file -p ...'.  I don't know if
that's much safer/better than letting cat-file guess the object type
in the hook.  I have no idea if there's a chance that "$GIT_PUSH_CERT"
has some unexpected, non-blob object type.)


+while read oval nval ref
+do
+   # Verify that the ref update matches that in push certificate.
+   if [[ $push_cert == *$oval" "$nval" "$ref* ]]; then


[[ isn't portable across all the sh implementations git strives to
support, as far as I know.

The minor point about 'then' on new line is applicable here too.  It
would also better match the outer 'while' loop.


+   # add the push cert as note (namespaced pushcerts) to nval.
+   git notes --ref=pushcerts add -m "$push_cert" $nval -f
+   fi
+done


--
Todd
~~
Learn from the mistakes of others--you can never live long enough to
make them all yourself.
   -- John Luther



Re: git status always modifies index?

2017-12-02 Thread Junio C Hamano
Jonathan Nieder  writes:

> I am worried that the project is not learning from what happened here.
> ...
> Fair enough, though that feels like overengineering.  But I *still*
> don't see what that has to do with the name "no-optional-locks".  When
> is a lock *optional*?  And how am I supposed to discover this option?
>
> This also came up during review, and I am worried that this review
> feedback is being ignored.  In other words, I have no reason to
> believe it won't happen again.

I too would like to see this part explained a bit better.


Re: What's cooking in git.git (Nov 2017, #08; Tue, 28)

2017-12-02 Thread Junio C Hamano
Jeff Hostetler  writes:

> On 11/28/2017 8:17 PM, Junio C Hamano wrote:
>> [Cooking]
>>
>>
>> * jh/object-filtering (2017-11-22) 6 commits
>>(merged to 'next' on 2017-11-27 at e5008c3b28)
>>   + pack-objects: add list-objects filtering
>>   + rev-list: add list-objects filtering support
>>   + list-objects: filter objects in traverse_commit_list
>>   + oidset: add iterator methods to oidset
>>   + oidmap: add oidmap iterator methods
>>   + dir: allow exclusions from blob in addition to file
>>   (this branch is used by jh/fsck-promisors and jh/partial-clone.)
>>
>>   In preparation for implementing narrow/partial clone, the object
>>   walking machinery has been taught a way to tell it to "filter" some
>>   objects from enumeration.
>>
>>   Will merge to 'master'.
>
> Could we wait on this.  I'd like to send up a V6 that
> addresses the assert() questions raised.  And Jonathan
> Nieder had a few suggestions that I want to address.

Thanks for stopping me.

As with any topic that is already in 'next', please send in
necessary updates as patches that apply on top of the existing
patches (in other words, we no longer "reroll"; rather, we now go
"incremental").



Re: Re: Unify annotated and non-annotated tags

2017-12-02 Thread Philip Oakley

From: "anatoly techtonik" 

comment at end - Philip

On Fri, Nov 24, 2017 at 1:24 PM, Ævar Arnfjörð Bjarmason
 wrote:
On Fri, Nov 24, 2017 at 10:52 AM, anatoly techtonik  
wrote:

On Thu, Nov 23, 2017 at 6:08 PM, Randall S. Becker
 wrote:

On 2017-11-23 02:31 (GMT-05:00) anatoly techtonik wrote

Subject: Re: Unify annotated and non-annotated tags
On Sat, Nov 11, 2017 at 5:06 AM, Junio C Hamano  
wrote:

Igor Djordjevic  writes:


If you would like to mimic output of "git show-ref", repeating
commits for each tag pointing to it and showing full tag name as
well, you could do something like this, for example:

  for tag in $(git for-each-ref --format="%(refname)" refs/tags)
  do
  printf '%s %s\n' "$(git rev-parse $tag^0)" "$tag"
  done


Hope that helps a bit.


If you use for-each-ref's --format option, you could do something
like (pardon a long line):

git for-each-ref --format='%(if)%(*objectname)%(then)%(*objectname)%(else)%(objectname)%(end) %(refname)' refs/tags


without any loop, I would think.
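The quoted one-liner can be exercised in a throwaway repository; both
kinds of tag end up printed with the commit's object name (a sketch):

```shell
git init -q tag-demo && cd tag-demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m base
git tag light                        # non-annotated: ref points at the commit
git -c user.name=demo -c user.email=demo@example.com \
    tag -a ann -m 'annotated'        # annotated: ref points at a tag object
git for-each-ref \
    --format='%(if)%(*objectname)%(then)%(*objectname)%(else)%(objectname)%(end) %(refname)' \
    refs/tags
```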

Thanks. That helps.
So my proposal is to get rid of non-annotated tags, so to get all
tags with commits that they point to, one would use:
git for-each-ref --format='%(*objectname) %(refname)' refs/tags
For so-called non-annotated tags just leave the message empty.
I don't see why anyone would need non-annotated tags though.


I have seen non-annotated tags used in automations (not necessarily well 
written ones) that create tags as a record of automation activity. I am 
not sure we should be writing off the concept of unannotated tags 
entirely. This may cause breakage based on existing expectations of how 
tags work at present. My take is that tags should include whodunnit, 
even if it's just the version of the automation being used, but I don't 
always get to have my wishes fulfilled. In essence, whatever behaviour a 
non-annotated tag has now may need to be emulated in future even if 
reconciliation happens. An option to preserve empty tag compatibility 
with pre-2.16 behaviour, perhaps? Sadly, I cannot supply examples of 
this usage based on a human memory page-fault and NDAs.


Are there any windows for backward compatibility breaks, or git is
doomed to preserve it forever?
Automation without support won't survive for long, and people who rely
on that, like Chromium team, usually hard set the version used.


Git is not doomed to preserve anything forever. We've gradually broken
backwards compatibility for a few core things like these.

However, just as a bystander reading this thread I haven't seen any
compelling reason for why these should be removed. You initially had
questions about how to extract info about them, which you got answers
to.

So what reasons remain for why they need to be removed?


To reduce complexity and prior knowledge when dealing with Git tags.

For example, http://readthedocs.io/ site contains a lot of broken
"Edit on GitHub" links, for example - 
http://git-memo.readthedocs.io/en/stable/


And it appeared that the reason for that is discrepancy between git
annotated and non-annotated tags. The pull request that fixes the issue
after it was researched and understood is simple
https://github.com/rtfd/readthedocs.org/pull/3302

However, while looking through linked issues and PRs, one can try to
imagine how many days it took for people to come up with the solution,
which came from this thread.
--
anatoly t.





So if I understand correctly, the hope is that `git show-ref --tags` could 
get an alternate option `--all-tags` [proper option name required...] such 
that the user would not have to develop the rather over-complicated 
expression that used a newish capability of a different command.


Would that be right?

Or at least update the man page docs to clarify the annotated vs 
non-annotated tags issue (many SO questions!).


And indicate whether the --dereference and/or --hash options would do the 
trick! - maybe the "^{}" appended would be part of the problem (and need 
that new option "--objectreference").
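A sketch of how close `--dereference` already gets, modulo the "^{}" suffix; the post-processing pipeline is only one illustrative way to normalise the output (scratch repository, recent git assumed):

```shell
cd "$(mktemp -d)" && git init -q .
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m base
git tag light
git -c user.name=t -c user.email=t@example.com tag -a -m msg heavy
# -d/--dereference appends an extra "^{}" line per annotated tag carrying the
# peeled commit id; lightweight tags already show the commit directly:
git show-ref --tags -d
# Stripping the "^{}" suffix and keeping the last value per ref yields one
# commit per tag, annotated or not:
git show-ref --tags -d | sed 's/\^{}//' |
  awk '{seen[$2]=$1} END {for (t in seen) print seen[t], t}'
```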


Philip 



Re: Re: Unify annotated and non-annotated tags

2017-12-02 Thread anatoly techtonik
On Fri, Nov 24, 2017 at 1:24 PM, Ævar Arnfjörð Bjarmason
 wrote:
> On Fri, Nov 24, 2017 at 10:52 AM, anatoly techtonik  
> wrote:
>> On Thu, Nov 23, 2017 at 6:08 PM, Randall S. Becker
>>  wrote:
>>> On 2017-11-23 02:31 (GMT-05:00) anatoly techtonik wrote
Subject: Re: Unify annotated and non-annotated tags
On Sat, Nov 11, 2017 at 5:06 AM, Junio C Hamano  wrote:
> Igor Djordjevic  writes:
>
>> If you would like to mimic output of "git show-ref", repeating
>> commits for each tag pointing to it and showing full tag name as
>> well, you could do something like this, for example:
>>
>>   for tag in $(git for-each-ref --format="%(refname)" refs/tags)
>>   do
>>   printf '%s %s\n' "$(git rev-parse $tag^0)" "$tag"
>>   done
>>
>>
>> Hope that helps a bit.
>
> If you use for-each-ref's --format option, you could do something
> like (pardon a long line):
>
> git for-each-ref 
> --format='%(if)%(*objectname)%(then)%(*objectname)%(else)%(objectname)%(end)
>  %(refname)' refs/tags
>
> without any loop, I would think.
Thanks. That helps.
So my proposal is to get rid of non-annotated tags, so to get all
tags with commits that they point to, one would use:
git for-each-ref --format='%(*objectname) %(refname)' refs/tags
For so-called non-annotated tags just leave the message empty.
I don't see why anyone would need non-annotated tags though.
>>>
>>> I have seen non-annotated tags used in automations (not necessarily well 
>>> written ones) that create tags as a record of automation activity. I am not 
>>> sure we should be writing off the concept of unannotated tags entirely. 
>>> This may cause breakage based on existing expectations of how tags work at 
>>> present. My take is that tags should include whodunnit, even if it's just 
>>> the version of the automation being used, but I don't always get to have my 
>>> wishes fulfilled. In essence, whatever behaviour a non-annotated tag has 
>>> now may need to be emulated in future even if reconciliation happens. An 
>>> option to preserve empty tag compatibility with pre-2.16 behaviour, 
>>> perhaps? Sadly, I cannot supply examples of this usage based on a human 
>>> memory page-fault and NDAs.
>>
>> Are there any windows for backward compatibility breaks, or git is
>> doomed to preserve it forever?
>> Automation without support won't survive for long, and people who rely
>> on that, like Chromium team, usually hard set the version used.
>
> Git is not doomed to preserve anything forever. We've gradually broken
> backwards compatibility for a few core things like these.
>
> However, just as a bystander reading this thread I haven't seen any
> compelling reason for why these should be removed. You initially had
> questions about how to extract info about them, which you got answers
> to.
>
> So what reasons remain for why they need to be removed?

To reduce complexity and prior knowledge when dealing with Git tags.

For example, http://readthedocs.io/ site contains a lot of broken
"Edit on GitHub" links, for example - http://git-memo.readthedocs.io/en/stable/

And it appeared that the reason for that is discrepancy between git
annotated and non-annotated tags. The pull request that fixes the issue
after it was researched and understood is simple
https://github.com/rtfd/readthedocs.org/pull/3302

However, while looking through linked issues and PRs, one can try to
imagine how many days it took for people to come up with the solution,
which came from this thread.
-- 
anatoly t.


Re: Re: bug deleting "unmerged" branch (2.12.3)

2017-12-02 Thread Philip Oakley

From: "Ulrich Windl" 
To: 
Cc: 
Sent: Wednesday, November 29, 2017 8:32 AM
Subject: Antw: Re: bug deleting "unmerged" branch (2.12.3)





"Ulrich Windl"  writes:


I think if more than one branches are pointing to the same commit,
one should be allowed to delete all but the last one without
warning. Do you agree?


That comes from a viewpoint that the only purpose "branch -d" exists
in addition to "branch -D" is to protect objects from "gc".  Those
who added the safety feature may have shared that view originally,
but it turns out that it protects another important thing you are
forgetting.

Imagine that two topics, 'topicA' and 'topicB', were independently
forked from 'master', and then later we wanted to add a feature that
depends on these two topics.  Since the 'feature' forked, there may
have been other developments, and we ended up in this topology:

---o---o---o---o---o---M
    \   \
     \   o---A---o---F
      \         /
       o---o---o---o---B

where A, B and F are the tips of 'topicA', 'topicB' and 'feature'
branches right now [*1*].

Now imagine we are on 'master' and just made 'topicB' graduate.  We
would have this topology.

---o---o---o---o---o---o---M
    \   \                 /
     \   o---A---o---F   /
      \         /       /
       o---o---o---o---B

While we do have 'topicA' and 'feature' branches still in flight,
we are done with 'topicB'.  Even though the tip of 'topicA' is
reachable from the tip of 'feature', the fact that the branch points
at 'A' is still relevant.  If we lose that information right now,
we'd have to go find it when we (1) want to further enhance the
topic by checking out and building on 'topicA', and (2) want to
finally get 'topicA' graduate to 'master'.

Because removal of a topic (in this case 'topicB') is often done
after a merge of that topic is made into an integration branch,
"branch -d" that protects branches that are yet to be merged to the
current branch catches you if you said "branch -d topic{A,B}" (or
other equivalent forms, most likely you'd have a script that spits
out list of branches and feed it to "xargs branch -d").

So, no, I do not agree.


Hi!

I can follow your argumentation, but I fail to see that your branches A 
and B point to the same commit (which is what I was talking about). So my 
situation would be:


o---o  A,B

I still think I could safely remove either A or B, even when the branch 
(identified by the commit, not by the name) is unmerged. What did I miss?


I think it was that currently you are on M, and neither A nor B are 
ancestors (i.e. merged) of M.


As Junio said:- "branch -d" protects branches that are yet to be merged to 
the **current branch**.


[I said the same in another part of the thread. The question now would be 
what needs changing? the error/warning message, the docs, something else?]




Regards,
Ulrich




[Footnotes]

*1* Since the 'feature' started developing, there were a few commits
added to 'topicB' but because the feature does not depend on
these enhancements to that topic, B is ahead of the commit that
was originally merged with the tip of 'topicA' to form the
'feature' branch.
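The refusal under discussion is easy to reproduce in a scratch repository (recent git assumed; branch names are illustrative):

```shell
cd "$(mktemp -d)" && git init -q .
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m base
git checkout -q -b topic
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m work
git checkout -q -                   # back to the initial branch
git branch tmp topic                # tmp and topic point at the same commit
git branch -d tmp || echo refused   # "-d" refuses: tmp is not merged into HEAD
git branch -D tmp                   # forced deletion still works
```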






Re: Antw: Re: bug deleting "unmerged" branch (2.12.3)

2017-12-02 Thread Philip Oakley

Hi Ulrich

From: "Johannes Schindelin" 
To: "Ulrich Windl" 
Cc: 
Sent: Wednesday, November 29, 2017 12:27 PM
Subject: Re: Antw: Re: bug deleting "unmerged" branch (2.12.3)



Hi Ulrich,

On Wed, 29 Nov 2017, Ulrich Windl wrote:


> On Tue, 28 Nov 2017, Ulrich Windl wrote:
>
>> During a rebase that turned out to be heavier than expected 8-( I
>> decided to keep the old branch by creating a temporary branch name to
>> the commit of the branch to rebase (which was still the old commit ID
>> at that time).
>>
>> When done rebasing, I attached a new name to the new (rebased)
>> branch, deleted the old name (pointing at the same rebase commit),
>> then recreated the old branch from the temporary branch name (created
>> to remember the commit id).
>>
>> When I wanted to delete the temporary branch (which is of no use
>> now), I got a message that the branch is unmerged.
>
> This is actually as designed, at least for performance reasons (it is
> not exactly cheap to figure out whether a given commit is contained in
> any other branch).
>
>> I think if more than one branches are pointing to the same commit,
>> one should be allowed to delete all but the last one without warning.
>> Do you agree?
>
> No, respectfully disagree, because I have found myself with branches
> pointing to the same commit, even if the branches served different
> purposes. I really like the current behavior where you can delete a
> branch with `git branch -d` as long as it is contained in its upstream
> branch.

I'm not talking about the intention of a branch, but of the state of a
branch: If multiple branches point (not "contain") the same commit, they
are equivalent (besides the name) at that moment.


I did a poor job of explaining myself, please let me try again. I'll give
you one concrete example:

Recently, while working on some topic, I stumbled over a bug and committed
a bug fix, then committed that and branched off a new branch to remind
myself to rebase the bug fix and contribute it.

At that point, those branches were at the same revision, but distinctly
not equivalent (except in just one, very narrow sense of the word, which I
would argue is the wrong interpretation in this context).

Sadly, I was called away at that moment to take care of something
completely different. Even if I had not been, the worktree with the first
branch would still have been at that revision for a longer time, as I had
to try out a couple of changes before I could commit.

This is just one example where the idea backfires that you can safely
delete one of two branches that happen to point at the same commit at the
same time.

I am sure that you possess vivid enough of an imagination to come up with
plenty more examples where that is the case.


As no program can predict the future or the intentions of the user, it
should be safe to delete the branch, because it can easily be recreated
(from the remaining branches pointing to the same commit).


Yes, no program can predict the future (at least *accurately*).

No, it is not safe to delete that branch. Especially if you take the
current paradigm of "it is safe to delete a branch if it is up-to-date
with, or at least fast-forwardable to, its upstream branch" into account.

And no, a branch cannot easily be recreated from the remaining branches in
the future, as branches can have different reflogs (and they are lost when
deleting the branch).


It shouldn't need a lot of computational power to find out when multiple
branches point to the same commit.


Sure, that test can even be scripted easily by using the `git for-each-ref
--points-at=` command.
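A sketch of that scripted check: before deleting a branch, list any *other* branch still pointing at the same commit (scratch repository; branch names are illustrative):

```shell
cd "$(mktemp -d)" && git init -q .
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m base
git branch tmp                      # a second name for the same commit
# Any name other than tmp in this list means deleting tmp loses no commit
# (though, as noted above, it may still lose tmp's reflog and its meaning):
others=$(git for-each-ref refs/heads --points-at=tmp \
           --format='%(refname:short)' | grep -vx tmp)
echo "$others"
```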

By the way, if you are still convinced that my argument is flawed and that
it should be considered safe to delete a branch if any other branch points
to the same revision, I encourage you to work on a patch to make it so.

For maximum chance of getting included, you would want to guard this
behind a new config setting, say, branch.deleteRedundantIsSafe, parse it
here:

https://github.com/git/git/blob/v2.15.1/config.c#L1260-L1288

or here:

https://github.com/git/git/blob/v2.15.1/builtin/branch.c#L78-L97



I'd agree that it is easy to misinterpret the message. After close reading 
of the thread, Junio put his finger on the scenario with:


-  "branch -d" protects branches that are yet to be merged to the 
**current** branch.   (my emphasis)


Maybe the error message could say that (what exactly was the error 
message?),

or the documentation be improved to clarify.



document it here:

https://github.com/git/git/blob/v2.15.1/Documentation/git-branch.txt

and here:

https://github.com/git/git/blob/v2.15.1/Documentation/config.txt#L969

and handle it here:

https://github.com/git/git/blob/v2.15.1/builtin/branch.c#L185-L288

(look for the places where `force` is used, likely just before the call to
`check_branch_commit()`).

The way you'd want to handle it is most likely by 

Re: [PATCH v5 07/10] introduce fetch-object: fetch one promisor object

2017-12-02 Thread Christian Couder
On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler  wrote:
> From: Jonathan Tan 

> +void fetch_object(const char *remote_name, const unsigned char *sha1)
> +{
> +   struct remote *remote;
> +   struct transport *transport;
> +   struct ref *ref;
> +
> +   remote = remote_get(remote_name);
> +   if (!remote->url[0])
> +   die(_("Remote with no URL"));
> +   transport = transport_get(remote, remote->url[0]);
> +
> +   ref = alloc_ref(sha1_to_hex(sha1));
> +   hashcpy(ref->old_oid.hash, sha1);
> +   transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
> +   transport_set_option(transport, TRANS_OPT_NO_HAVES, "1");
> +   transport_fetch_refs(transport, ref);
> +}

I think it would be interesting to return what transport_fetch_refs()
returns, so that a caller could know if an error happened.


Re: [add-default-config] add --default option to git config.

2017-12-02 Thread Philip Oakley

From: "Soukaina NAIT HMID" 

From: Soukaina NAIT HMID 




From a cursory read, there does need to be a bit more explanation.


I see you also add a --color description and code, and don't say what the 
problem being solved is.


If it is tricky to explain, then a two-patch series may tease apart the 
issues. Perhaps add the --color option first (noting you'll use it in the 
next patch), then a second patch that explains the --default problem.


The patch title should be something like "[PATCH 1/n] config: add --default 
option"


You may also want to explain the test rationale, and maybe split them if 
appropriate.
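For illustration, the kind of usage this feature enables; a sketch only — it uses `--default` with an explicit `--get`, and `--type=color` as one possible spelling of the type selector (the patch under review spells it `--color`):

```shell
cd "$(mktemp -d)" && git init -q .
# When the key is unset, --default supplies the fallback for --get:
git config --get --default fallback does.not.exist    # prints "fallback"
# Combined with the color type, an unset color key still yields a usable
# ANSI escape sequence rather than an error:
git config --get --default red --type=color does.not.exist | od -c | head -n 1
```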


--
Philip



Signed-off-by: Soukaina NAIT HMID 
---
Documentation/git-config.txt |   4 ++
builtin/config.c |  34 -
config.c |  10 +++
config.h |   1 +
t/t1300-repo-config.sh   | 161 +++

5 files changed, 209 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 4edd09fc6b074..5d5cd58fdae37 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -179,6 +179,10 @@ See also <>.
 specified user.  This option has no effect when setting the
 value (but you can use `git config section.variable ~/`
 from the command line to let your shell do the expansion).
+--color::
+ Find the color configured for `name` (e.g. `color.diff.new`) and
+ output it as the ANSI color escape sequence to the standard
+ output.

-z::
--null::
diff --git a/builtin/config.c b/builtin/config.c
index d13daeeb55927..5e5b998b7c892 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -30,6 +30,7 @@ static int end_null;
static int respect_includes_opt = -1;
static struct config_options config_options;
static int show_origin;
+static const char *default_value;

#define ACTION_GET (1<<0)
#define ACTION_GET_ALL (1<<1)
@@ -52,6 +53,8 @@ static int show_origin;
#define TYPE_INT (1<<1)
#define TYPE_BOOL_OR_INT (1<<2)
#define TYPE_PATH (1<<3)
+#define TYPE_COLOR (1<<4)
+

static struct option builtin_config_options[] = {
 OPT_GROUP(N_("Config file location")),
@@ -80,11 +83,13 @@ static struct option builtin_config_options[] = {
 OPT_BIT(0, "int", &types, N_("value is decimal number"), TYPE_INT),
 OPT_BIT(0, "bool-or-int", &types, N_("value is --bool or --int"), TYPE_BOOL_OR_INT),
 OPT_BIT(0, "path", &types, N_("value is a path (file or directory name)"), TYPE_PATH),
+ OPT_BIT(0, "color", &types, N_("find the color configured"), TYPE_COLOR),

 OPT_GROUP(N_("Other")),
 OPT_BOOL('z', "null", &end_null, N_("terminate values with NUL byte")),
 OPT_BOOL(0, "name-only", &omit_values, N_("show variable names only")),
 OPT_BOOL(0, "includes", &respect_includes_opt, N_("respect include directives on lookup")),
 OPT_BOOL(0, "show-origin", &show_origin, N_("show origin of config (file, standard input, blob, command line)")),
+ OPT_STRING(0, "default", &default_value, N_("default-value"), N_("sets default value when no value is returned from config")),

 OPT_END(),
};

@@ -159,6 +164,13 @@ static int format_config(struct strbuf *buf, const char *key_, const char *value_)
 return -1;
 strbuf_addstr(buf, v);
 free((char *)v);
+ }
+ else if (types == TYPE_COLOR) {
+ char *v = xmalloc(COLOR_MAXLEN);
+ if (git_config_color(&v, key_, value_) < 0)
+ return -1;
+ strbuf_addstr(buf, v);
+ free((char *)v);
 } else if (value_) {
 strbuf_addstr(buf, value_);
 } else {
@@ -244,8 +256,16 @@ static int get_value(const char *key_, const char *regex_)

 config_with_options(collect_config, &values,
 &given_config_source, &config_options);

- ret = !values.nr;
+ if (!values.nr && default_value && types) {
+ struct strbuf *item;
+ ALLOC_GROW(values.items, values.nr + 1, values.alloc);
+ item = &values.items[values.nr++];
+ if(format_config(item, key_, default_value) < 0){
+ values.nr = 0;
+ }
+ }

+ ret = !values.nr;
 for (i = 0; i < values.nr; i++) {
 struct strbuf *buf = values.items + i;
 if (do_all || i == values.nr - 1)
@@ -268,6 +288,7 @@ static int get_value(const char *key_, const char *regex_)

 return ret;
}

+
static char *normalize_value(const char *key, const char *value)
{
 if (!value)
@@ -281,6 +302,17 @@ static char *normalize_value(const char *key, const char *value)

 * when retrieving the value.
 */
 return xstrdup(value);
+ if (types == TYPE_COLOR)
+ {
+ char *v = xmalloc(COLOR_MAXLEN);
+ if (git_config_color(&v, key, value) == 0)
+ {
+ free((char *)v);
+ return xstrdup(value);
+ }
+ free((char *)v);
+ die("cannot parse color '%s'", value);
+ }
 if (types == TYPE_INT)
 return xstrfmt("%"PRId64, git_config_int64(key, value));
 if (types == TYPE_BOOL)
diff --git a/config.c b/config.c
index 903abf9533b18..5c5daffeb6723 100644
--- a/config.c
+++ b/config.c
@@ -16,6 +16,7 @@
#include "string-list.h"
#include "utf8.h"
#include "dir.h"
+#include "color.h"

struct config_source {
 struct config_source *prev;
@@ -990,6 +991,15 @@ int git_config_pathname(const char **dest, const 

Re: [PATCH] l10n: update de_DE translation

2017-12-02 Thread Ralf Thielow
Robert Abel  wrote:
> Der-, die- and dasselbe and their declensions are spelt as one word in German.
> 

Thanks!

> Signed-off-by: Robert Abel 
> ---
>  po/de.po | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/po/de.po b/po/de.po
> index a05aca5f3..400262625 100644
> --- a/po/de.po
> +++ b/po/de.po
> @@ -2925,7 +2925,7 @@ msgstr "  (benutzen Sie \"git branch --unset-upstream\" 
> zum Beheben)\n"
>  #: remote.c:2083
>  #, c-format
>  msgid "Your branch is up to date with '%s'.\n"
> -msgstr "Ihr Branch ist auf dem selben Stand wie '%s'.\n"
> +msgstr "Ihr Branch ist auf demselben Stand wie '%s'.\n"
>  
>  #: remote.c:2087
>  #, c-format
> @@ -10874,7 +10874,7 @@ msgstr "Kann nicht überschreiben"
>  
>  #: builtin/mv.c:239
>  msgid "multiple sources for the same target"
> -msgstr "mehrere Quellen für das selbe Ziel"
> +msgstr "mehrere Quellen für dasselbe Ziel"
>  
>  #: builtin/mv.c:241
>  msgid "destination directory does not exist"
> @@ -11827,7 +11827,7 @@ msgstr ""
>  "\n"
>  "git push %s HEAD:%s\n"
>  "\n"
> -"Um auf den Branch mit dem selben Namen im Remote-Repository zu versenden,\n"
> +"Um auf den Branch mit demselben Namen im Remote-Repository zu versenden,\n"
>  "benutzen Sie:\n"
>  "\n"
>  "git push %s %s\n"
> -- 
> 2.15.1
> 


Re: git-p4: cloning with a change number does not import all files

2017-12-02 Thread Patrick Rouleau
On Sat, Dec 2, 2017 at 12:55 PM, Luke Diamand  wrote:
> I think I've sort of stumbled across something like the problem you've
> described in the past. Perhaps the files you need have been deleted
> and then re-integrated or some such.
>
> Would you be able to take a look at some files with this problem and
> see if you can spot what's happened to it ("p4 changes" and perhaps
> "p4 changes -i").

Sorry, but these commands only show the date and an extract of the commit
message.

>
> One thing that can certainly happen is that Perforce gets *very*
> confused if you start getting too clever with symlinked directories,
> and git-p4 can only do so much in the face of this. But it may be
> something else.

Overall, the depot history is very "messy". It contains a lot of projects. The
project on which I work has reached v5 and it has been branched from v4.
From p4v's "show graph", I can see there was a 'main' branch at some point,
but it doesn't exist anymore.

One of the files missing from my clone only has 2 revisions and it was created
in the v5 branch. It was created at 608436 and modified at 608816. I cloned
from 610443, mainly because we added a feature branch and I want to access
it from git too (610443 is one month before the branch creation).

I will play a little bit with p4 to see if I can locate where the problem comes
from and maybe hack git-p4 to make it more verbose.

Thanks,
P. Rouleau


Re: [PATCH v5 08/10] sha1_file: support lazily fetching missing objects

2017-12-02 Thread Christian Couder
On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler  wrote:
> From: Jonathan Tan 

> diff --git a/sha1_file.c b/sha1_file.c
> index 10c3a00..fc7718a 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -29,6 +29,7 @@
>  #include "mergesort.h"
>  #include "quote.h"
>  #include "packfile.h"
> +#include "fetch-object.h"
>
>  const unsigned char null_sha1[GIT_MAX_RAWSZ];
>  const struct object_id null_oid;
> @@ -1144,6 +1145,8 @@ static int sha1_loose_object_info(const unsigned char *sha1,
> return (status < 0) ? status : 0;
>  }
>
> +int fetch_if_missing = 1;
> +
>  int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi, unsigned flags)
>  {
> static struct object_info blank_oi = OBJECT_INFO_INIT;
> @@ -1152,6 +1155,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
> const unsigned char *real = (flags & OBJECT_INFO_LOOKUP_REPLACE) ?
> lookup_replace_object(sha1) :
> sha1;
> +   int already_retried = 0;
>
> if (!oi)
> oi = &blank_oi;
> @@ -1176,28 +1180,36 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
> }
> }
>
> -   if (!find_pack_entry(real, &e)) {
> -   /* Most likely it's a loose object. */
> -   if (!sha1_loose_object_info(real, oi, flags))
> -   return 0;
> +retry:
> +   if (find_pack_entry(real, ))
> +   goto found_packed;
>
> -   /* Not a loose object; someone else may have just packed it. */
> -   if (flags & OBJECT_INFO_QUICK) {
> -   return -1;
> -   } else {
> -   reprepare_packed_git();
> -   if (!find_pack_entry(real, &e))
> -   return -1;
> -   }
> +   /* Most likely it's a loose object. */
> +   if (!sha1_loose_object_info(real, oi, flags))
> +   return 0;
> +
> +   /* Not a loose object; someone else may have just packed it. */
> +   reprepare_packed_git();
> +   if (find_pack_entry(real, &e))
> +   goto found_packed;
> +
> +   /* Check if it is a missing object */
> +   if (fetch_if_missing && repository_format_partial_clone &&
> +   !already_retried) {
> +   fetch_object(repository_format_partial_clone, real);
> +   already_retried = 1;
> +   goto retry;
> }
>
> +   return -1;
> +
> +found_packed:
> if (oi == &blank_oi)
> /*
>  * We know that the caller doesn't actually need the
>  * information below, so return early.
>  */
> return 0;
> -
> rtype = packed_object_info(e.p, e.offset, oi);
> if (rtype < 0) {
> mark_bad_packed_object(e.p, real);

The above is adding 2 different gotos into this function while there
are quite simple ways to avoid them. See
https://public-inbox.org/git/cap8ufd2thrj7+rxmismutuopxqv4wm7azsejpd_fheoycp+...@mail.gmail.com/
and the follow up email in the thread.


Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

From: "Jeff Hostetler" 
Sent: Friday, December 01, 2017 5:23 PM

On 11/30/2017 6:43 PM, Philip Oakley wrote:

From: "Vitaly Arbuzov" 

[...]

comments below..


On Thu, Nov 30, 2017 at 9:01 AM, Vitaly Arbuzov  wrote:

Hey Jeff,

It's great, I didn't expect that anyone is actively working on this.
I'll check out your branch, meanwhile do you have any design docs that
describe these changes or can you define high level goals that you
want to achieve?

On Thu, Nov 30, 2017 at 6:24 AM, Jeff Hostetler 
wrote:



On 11/29/2017 10:16 PM, Vitaly Arbuzov wrote:

[...]




I have, for separate reasons, been _thinking_ about the issue ($dayjob is
in defence, so a similar partition would be useful).

The changes would almost certainly need to be server side (as well as
client side), as it is the server that decides what is sent over the wire
in the pack files, which would need to be a 'narrow' pack file.


Yes, there will need to be both client and server changes.
In the current 3 part patch series, the client sends a "filter_spec"
to the server as part of the fetch-pack/upload-pack protocol.
If the server chooses to honor it, upload-pack passes the filter_spec
to pack-objects to build an "incomplete" packfile omitting various
objects (currently blobs).  Proprietary servers will need similar
changes to support this feature.

Discussing this feature in the context of the defense industry
makes me a little nervous.  (I used to be in that area.)


I'm viewing the desire for codebase partitioning from a soft layering of 
risk view (perhaps a more UK than USA approach ;-)



What we have in the code so far may be a nice start, but
probably doesn't have the assurances that you would need
for actual deployment.  But it's a start


True. I need to get some of my collegues more engaged...





If we had such a feature then all we would need on top is a separate
tool that builds the right "sparse" scope for the workspace based on
paths that developer wants to work on.

In the world where more and more companies are moving towards large
monorepos this improvement would provide a good way of scaling git to
meet this demand.


The 'companies' problem is that it tends to force a client-server,
always-on, on-line mentality. I'm also wanting the original DVCS off-line
capability to still be available, with _user_ control, in a generic sense,
of what they have locally available (including files/directories they have
not yet looked at, but expect to have). IIUC Jeff's work is that on-line
view, without the off-line capability.

I'd commented early in the series at [1,2,3].


Yes, this does tend to lead towards an always-online mentality.
However, there are 2 parts:
[a] dynamic object fetching for missing objects, such as during a
random command like diff or blame or merge.  We need this
regardless of usage -- because we can't always predict (or
dry-run) every command the user might run in advance.


Making something "useful" happen here when off-line is an obvious goal.


[b] batch fetch mode, such as using partial-fetch to match your
sparse-checkout so that you always have the blobs of interest
to you.  And assuming you don't wander outside of this subset
of the tree, you should be able to work offline as usual.
If you can work within the confines of [b], you wouldn't need to
always be online.


I feel this is the area that does need to ensure an off-line capability, to 
avoid any perception of the much-maligned 'Embrace, extend, and extinguish' 
through accidental lockout.


I don't think this should be viewed as a type of sparse checkout - it's just 
a checkout of what you have (under the hood it could use the same code 
though).




We might also add a part [c] with explicit commands to back-fill or
alter your incomplete view of the ODB (as I explained in response
to the "git diff  " comment later in this thread).



At its core, my idea was to use the object store to hold markers for the
'not yet fetched' objects (mainly trees and blobs). These would be in a 
known fixed format, and have the same effect (conceptually) as the 
sub-module markers - they _confirm_ the oid, yet say 'not here, try 
elsewhere'.


We do have something like this.  Jonathan can explain better than I, but
basically, we denote possibly incomplete packfiles from partial clones
and fetches as "promisor" and have special rules in the code to assert
that a missing blob referenced from a "promisor" packfile is OK and can
be fetched later if necessary from the "promising" remote.


The remote interaction is one area that may need thought, especially in a 
triangle workflow, of which there are a few.




The main problem with markers or other lists of missing objects is
that it has scale problems for large repos.  Suppose I have 100M
blobs in my repo.  If I do a blob:none clone, I'd have 100M missing
blobs that would need tracking.  If I then do a batch fetch of the
blobs 

Re: git-p4: cloning with a change number does not import all files

2017-12-02 Thread Luke Diamand
On 2 December 2017 at 15:35, Patrick Rouleau  wrote:
> On Fri, Dec 1, 2017 at 7:45 AM, Lars Schneider  
> wrote:
>> Oh, I am with you. However, I only used git-p4 for a very short time in the
>> way you want to use it. Therefore, I don't have much experience in that kind
>> of usage pattern. I was able to convice my management to move all source to
>> Git and I used git-p4 to migrate the repositories ;-)
>>
>> Here is a talk on the topic if you want to learn more:
>> https://www.youtube.com/watch?v=QNixDNtwYJ0
>>
>> Cheers,
>> Lars
>
> Sadly, there is no way I convince the company to switch to git. They
> have acquired
> many companies in the past years and they have standardized around Perforce.
> It is in part because they want access control and they need to store
> a lot of binary
> files (including big VM images).
>
> I keep this video close to explain why I like git:
> https://www.youtube.com/watch?v=o4PFDKIc2fs

I feel your pain.

I think I've sort of stumbled across something like the problem you've
described in the past. Perhaps the files you need have been deleted
and then re-integrated or some such.

Would you be able to take a look at some files with this problem and
see if you can spot what's happened to it ("p4 changes" and perhaps
"p4 changes -i").

One thing that can certainly happen is that Perforce gets *very*
confused if you start getting too clever with symlinked directories,
and git-p4 can only do so much in the face of this. But it may be
something else.

Thanks
Luke


Re: [PATCH v5 10/10] gc: do not repack promisor packfiles

2017-12-02 Thread Christian Couder
On Tue, Nov 21, 2017 at 10:07 PM, Jeff Hostetler  wrote:
> From: Jonathan Tan 

>  pack_as_from_promisor () {
> HASH=$(git -C repo pack-objects .git/objects/pack/pack) &&
> >repo/.git/objects/pack/pack-$HASH.promisor
> +   echo $HASH
>  }

Now the exit code of the above function will always be the exit code
of "echo $HASH".
It seems to me that it would be better to add "&&" at the end of the
line above the "echo $HASH".
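That masking is easy to demonstrate in isolation; the sketch below uses `false` as a stand-in for a failing pack-objects invocation:

```shell
# A helper whose last command is an unconditional echo always exits 0,
# so an earlier failure in the body is lost; chaining with && preserves it.
masked () {
	false          # stands in for a failing "git pack-objects"
	echo done
}
chained () {
	false &&
	echo done
}
masked >/dev/null; echo "masked exit: $?"        # prints "masked exit: 0"
chained >/dev/null || echo "chained exit: $?"    # prints "chained exit: 1"
```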


Re: [PATCH] send-email: extract email-parsing code into a subroutine

2017-12-02 Thread Nathan PAYRE
I found a mistake in my signed-off-by, please replace
 by 

Excuse me.

2017-12-02 18:02 GMT+01:00 Payre Nathan :
> From: Nathan Payre 
>
> The existing code mixes parsing of email headers with regular
> expressions and actual code. Extract the parsing code into a new
> subroutine 'parse_header_line()'. This improves the code readability
> and makes parse_header_line reusable in other places.
>
> Signed-off-by: Nathan Payre 
> Signed-off-by: Matthieu Moy 
> Signed-off-by: Timothee Albertin 
> Signed-off-by: Daniel Bensoussan 
> ---
>
> This patch is a first step to implement a new feature.
> See new feature discussion here: 
> https://public-inbox.org/git/20171030223444.5052-1-nathan.pa...@etu.univ-lyon1.fr/
>
>  git-send-email.perl | 106 
> +++-
>  1 file changed, 80 insertions(+), 26 deletions(-)
>
> diff --git a/git-send-email.perl b/git-send-email.perl
> index 2208dcc21..98c2e461c 100755
> --- a/git-send-email.perl
> +++ b/git-send-email.perl
> @@ -715,41 +715,64 @@ EOT3
> if (!defined $compose_encoding) {
> $compose_encoding = "UTF-8";
> }
> -   while(<$c>) {
> +
> +   my %parsed_email;
> +   while (<$c>) {
> next if m/^GIT:/;
> -   if ($in_body) {
> -   $summary_empty = 0 unless (/^\n$/);
> -   } elsif (/^\n$/) {
> -   $in_body = 1;
> -   if ($need_8bit_cte) {
> +   parse_header_line($_, \%parsed_email);
> +   if (/^\n$/i) {
> +   while (my $row = <$c>) {
> +   if (!($row =~ m/^GIT:/)) {
> +   $parsed_email{'body'} = 
> $parsed_email{'body'} . $row;
> +   }
> +   }
> +   }
> +   }
> +   if ($parsed_email{'from'}) {
> +   $sender = $parsed_email{'from'};
> +   }
> +   if ($parsed_email{'in_reply_to'}) {
> +   $initial_reply_to = $parsed_email{'in_reply_to'};
> +   }
> +   if ($parsed_email{'subject'}) {
> +   $initial_subject = $parsed_email{'subject'};
> +   print $c2 "Subject: " .
> +   quote_subject($parsed_email{'subject'}, 
> $compose_encoding) .
> +   "\n";
> +   }
> +   if ($parsed_email{'mime-version'}) {
> +   $need_8bit_cte = 0;
> +   }
> +   if ($need_8bit_cte) {
> +   if ($parsed_email{'content-type'}) {
> +   print $c2 "MIME-Version: 1.0\n",
> +"Content-Type: 
> $parsed_email{'content-type'};",
> +"Content-Transfer-Encoding: 8bit\n";
> +   } else {
> print $c2 "MIME-Version: 1.0\n",
>  "Content-Type: text/plain; ",
> -  "charset=$compose_encoding\n",
> +"charset=$compose_encoding\n",
>  "Content-Transfer-Encoding: 8bit\n";
> }
> -   } elsif (/^MIME-Version:/i) {
> -   $need_8bit_cte = 0;
> -   } elsif (/^Subject:\s*(.+)\s*$/i) {
> -   $initial_subject = $1;
> -   my $subject = $initial_subject;
> -   $_ = "Subject: " .
> -   quote_subject($subject, $compose_encoding) .
> -   "\n";
> -   } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
> -   $initial_reply_to = $1;
> -   next;
> -   } elsif (/^From:\s*(.+)\s*$/i) {
> -   $sender = $1;
> -   next;
> -   } elsif (/^(?:To|Cc|Bcc):/i) {
> -   print __("To/Cc/Bcc fields are not interpreted yet, 
> they have been ignored\n");
> -   next;
> -   }
> -   print $c2 $_;
> }
> +   if ($parsed_email{'body'}) {
> +   $summary_empty = 0;
> +   print $c2 "\n$parsed_email{'body'}\n";
> +   }
> +
> close $c;
> close $c2;
>
> +   open $c2, "<", $compose_filename . ".final"
> +   or die sprintf(__("Failed to open %s.final: %s"), 
> $compose_filename, $!);
> +
> +   print "affichage : \n";
> +   while (<$c2>) {
> +   print $_;
> +   }
> +
> +   close $c2;
> +
> if ($summary_empty) {
> print __("Summary email is empty, skipping it\n");
>  

[PATCH] send-email: extract email-parsing code into a subroutine

2017-12-02 Thread Payre Nathan
From: Nathan Payre 

The existing code mixes parsing of email headers with regular
expressions and actual code. Extract the parsing code into a new
subroutine 'parse_header_line()'. This improves the code readability
and makes parse_header_line reusable in other places.

Signed-off-by: Nathan Payre 
Signed-off-by: Matthieu Moy 
Signed-off-by: Timothee Albertin 
Signed-off-by: Daniel Bensoussan 
---

This patch is a first step to implement a new feature.
See new feature discussion here: 
https://public-inbox.org/git/20171030223444.5052-1-nathan.pa...@etu.univ-lyon1.fr/

 git-send-email.perl | 106 +++-
 1 file changed, 80 insertions(+), 26 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2208dcc21..98c2e461c 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -715,41 +715,64 @@ EOT3
if (!defined $compose_encoding) {
$compose_encoding = "UTF-8";
}
-   while(<$c>) {
+
+   my %parsed_email;
+   while (<$c>) {
next if m/^GIT:/;
-   if ($in_body) {
-   $summary_empty = 0 unless (/^\n$/);
-   } elsif (/^\n$/) {
-   $in_body = 1;
-   if ($need_8bit_cte) {
+   parse_header_line($_, \%parsed_email);
+   if (/^\n$/i) {
+   while (my $row = <$c>) {
+   if (!($row =~ m/^GIT:/)) {
+   $parsed_email{'body'} = 
$parsed_email{'body'} . $row;
+   }
+   }
+   }
+   }
+   if ($parsed_email{'from'}) {
+   $sender = $parsed_email{'from'};
+   }
+   if ($parsed_email{'in_reply_to'}) {
+   $initial_reply_to = $parsed_email{'in_reply_to'};
+   }
+   if ($parsed_email{'subject'}) {
+   $initial_subject = $parsed_email{'subject'};
+   print $c2 "Subject: " .
+   quote_subject($parsed_email{'subject'}, 
$compose_encoding) .
+   "\n";
+   }
+   if ($parsed_email{'mime-version'}) {
+   $need_8bit_cte = 0;
+   }
+   if ($need_8bit_cte) {
+   if ($parsed_email{'content-type'}) {
+   print $c2 "MIME-Version: 1.0\n",
+"Content-Type: 
$parsed_email{'content-type'};",
+"Content-Transfer-Encoding: 8bit\n";
+   } else {
print $c2 "MIME-Version: 1.0\n",
 "Content-Type: text/plain; ",
-  "charset=$compose_encoding\n",
+"charset=$compose_encoding\n",
 "Content-Transfer-Encoding: 8bit\n";
}
-   } elsif (/^MIME-Version:/i) {
-   $need_8bit_cte = 0;
-   } elsif (/^Subject:\s*(.+)\s*$/i) {
-   $initial_subject = $1;
-   my $subject = $initial_subject;
-   $_ = "Subject: " .
-   quote_subject($subject, $compose_encoding) .
-   "\n";
-   } elsif (/^In-Reply-To:\s*(.+)\s*$/i) {
-   $initial_reply_to = $1;
-   next;
-   } elsif (/^From:\s*(.+)\s*$/i) {
-   $sender = $1;
-   next;
-   } elsif (/^(?:To|Cc|Bcc):/i) {
-   print __("To/Cc/Bcc fields are not interpreted yet, 
they have been ignored\n");
-   next;
-   }
-   print $c2 $_;
}
+   if ($parsed_email{'body'}) {
+   $summary_empty = 0;
+   print $c2 "\n$parsed_email{'body'}\n";
+   }
+
close $c;
close $c2;
 
+   open $c2, "<", $compose_filename . ".final"
+   or die sprintf(__("Failed to open %s.final: %s"), 
$compose_filename, $!);
+
+   print "affichage : \n";
+   while (<$c2>) {
+   print $_;
+   }
+
+   close $c2;
+
if ($summary_empty) {
print __("Summary email is empty, skipping it\n");
$compose = -1;
@@ -792,6 +815,37 @@ sub ask {
return;
 }
 
+sub parse_header_line {
+   my $lines = shift;
+   my $parsed_line = shift;
+
+   foreach (split(/\n/, $lines)) {
+   if (/^From:\s*(.+)$/i) {
+   $parsed_line->{'from'} = $1;
+   } elsif (/^To:\s*(.+)$/i) {
+   $parsed_line->{'to'} = [ parse_address_line($1) ];
+


Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

Hi Jonathan,

Thanks for the outline. It has helped clarify some points and shows the
very similar alignments.


The one thing I wasn't clear about is the "promised" objects/remote. Is that 
"promisor" remote a fixed entity, or could it be one of many remotes that 
could be a "provider"? (sort of like fetching sub-modules...)


Philip

From: "Jonathan Nieder" 
Sent: Friday, December 01, 2017 2:51 AM

Hi Vitaly,

Vitaly Arbuzov wrote:


I think it would be great if we agree at a high level on the desired
user experience, so let me put a few possible use cases here.


I think one thing this thread is pointing to is a lack of overview
documentation about how the 'partial clone' series currently works.
The basic components are:

1. extending git protocol to (1) allow fetching only a subset of the
   objects reachable from the commits being fetched and (2) later,
   going back and fetching the objects that were left out.

   We've also discussed some other protocol changes, e.g. to allow
   obtaining the sizes of un-fetched objects without fetching the
   objects themselves

2. extending git's on-disk format to allow having some objects not be
   present but only be "promised" to be obtainable from a remote
   repository.  When running a command that requires those objects,
   the user can choose to have it either (a) error out ("airplane
   mode") or (b) fetch the required objects.

   It is still possible to work fully locally in such a repo, make
   changes, get useful results out of "git fsck", etc.  It is kind of
   similar to the existing "shallow clone" feature, except that there
   is a more straightforward way to obtain objects that are outside
   the "shallow" clone when needed on demand.

3. improving everyday commands to require fewer objects.  For
   example, if I run "git log -p", then I want to see the history of
   most files but I don't necessarily want to download large binary
   files just to print 'Binary files differ' for them.

   And by the same token, we might want to have a mode for commands
   like "git log -p" to default to restricting to a particular
   directory, instead of downloading files outside that directory.

   There are some fundamental changes to make in this category ---
   e.g. modifying the index format to not require entries for files
   outside the sparse checkout, to avoid having to download the
   trees for them.

The overall goal is to make git scale better.

The existing patches do (1) and (2), though it is possible to do more
in those categories. :)  We have plans to work on (3) as well.

These are overall changes that happen at a fairly low level in git.
They mostly don't require changes command-by-command.
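As a concrete (if rough) sketch of how (1) and (2) surface on the command
line, here is the flow against a throwaway local repository. The `--filter`
spelling follows the in-flight series and may differ in what finally ships:

```shell
# Build a small "server" repo, then clone it while omitting all blobs;
# history and trees are local, missing blobs are fetched on demand.
work=$(mktemp -d) && cd "$work"
git init -q src
git -C src config user.email you@example.com
git -C src config user.name "You"
echo hello > src/README
git -C src add README && git -C src commit -qm init
git -C src config uploadpack.allowfilter true     # server-side opt-in
git clone -q --filter=blob:none "file://$work/src" demo
git -C demo log --oneline                         # works fully locally
git -C demo show HEAD:README                      # prints "hello"
```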

Thanks,
Jonathan 






Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

From: "Jeff Hostetler" 
Sent: Friday, December 01, 2017 2:30 PM

On 11/30/2017 8:51 PM, Vitaly Arbuzov wrote:

I think it would be great if we agree at a high level on the desired
user experience, so let me put a few possible use cases here.

1. Init and fetch into a new repo with a sparse list.
Preconditions: origin blah exists and has a lot of folders inside of
src including "bar".
Actions:
git init foo && cd foo
git config core.sparseAll true # New flag to activate all sparse
operations by default so you don't need to pass options to each
command.
echo "src/bar" > .git/info/sparse-checkout
git remote add origin blah
git pull origin master
Expected results: foo contains src/bar folder and nothing else,
objects that are unrelated to this tree are not fetched.
Notes: This should work same when fetch/merge/checkout operations are
used in the right order.


With the current patches (parts 1,2,3) we can pass a blob-ish
to the server during a clone that refers to a sparse-checkout
specification.


I hadn't appreciated this capability. I see it as important, and it should
be available both ways, so that a .gitNarrow spec can be imposed from the
server side, as well as by the requester.


It could also be used to assist in the 'precious/secret' blob problem, so 
that AWS keys are never pushed, nor available for fetching!



   There's a bit of a chicken-n-egg problem getting
things set up.  So if we assume your team would create a series
of "known enlistments" under version control, then you could


s/enlistments/entitlements/ I presume?


just reference one by <ref>:<path> during your clone.  The
server can lookup that blob and just use it.

git clone --filter=sparse:oid=master:templates/bar URL

And then the server will filter-out the unwanted blobs during
the clone.  (The current version only filters blobs; you still
get full commits and trees.  That will be revisited later.)


I'm for the idea that only the in-hierarchy trees should be sent.
It should also be possible that the server replies that it is only sending a
narrow clone, with the given (accessible?) spec.




On the client side, the partial clone installs local config
settings into the repo so that subsequent fetches default to
the same filter criteria as used in the clone.


I don't currently have provision to send a full sparse-checkout
specification to the server during a clone or fetch.  That
seemed like too much to try to squeeze into the protocols.
We can revisit this later if there is interest, but it wasn't
critical for the initial phase.

Agreed. I think it should be somewhere 'visible' to the user, but could be
set up by the server admin / repo maintainer if they don't have write access.
But there could still be the catch-22 - maybe one starts with a
<toptree>:<path> pair to define an origin point (it's not as refined as a
.gitNarrow spec file, but it is definitive). The toptree option could even
allow sub-tree clones.. maybe..






2. Add a file and push changes.
Preconditions: all steps above followed.
touch src/bar/baz.txt && git add -A && git commit -m "added a file"
git push origin master
Expected results: changes are pushed to remote.


I don't believe partial clone and/or partial fetch will cause
any changes for push.


I suspect that pushes could be rejected if the user 'pretends' to modify
files or trees outside their area. It does need the user to be able to spoof
part of a tree they don't have, so an upstream / remote would immediately
know it was a spoof, but locally the narrow clone doesn't have enough detail
about the 'bad' oid. It would be right to reject such attempts!






3. Clone a repo with a sparse list as a filter.
Preconditions: same as for #1
Actions:
echo "src/bar" > /tmp/blah-sparse-checkout
git clone --sparse /tmp/blah-sparse-checkout blah # Clone should be
the only command that would requires specific option key being passed.
Expected results: same as for #1 plus /tmp/blah-sparse-checkout is
copied into .git/info/sparse-checkout


I presume clone and fetch are treated equivalently here.



There are 2 independent concepts here: clone and checkout.
Currently, there isn't any automatic linkage of the partial clone to
the sparse-checkout settings, so you could do something like this:

I see an implicit link in that clearly one cannot check out (inflate/populate)
a file/directory that one does not have in the object store. But that does not
imply the reverse linkage. The regular sparse checkout should be available
independently of the local clone being a narrow one.



git clone --no-checkout --filter=sparse:oid=master:templates/bar URL
git cat-file ... templates/bar >.git/info/sparse-checkout
git config core.sparsecheckout true
git checkout ...

I've been focused on the clone/fetch issues and have not looked
into the automation to couple them.
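Filled out with plumbing that exists today (no `--filter`), that uncoupled
clone-then-sparse sequence looks like the following; the `src/bar` layout is
invented for illustration:

```shell
# Clone without checking out, restrict the checkout to src/bar, then
# populate the worktree honouring the sparse patterns.
work=$(mktemp -d) && cd "$work"
git init -q blah
git -C blah config user.email you@example.com
git -C blah config user.name "You"
mkdir -p blah/src/bar blah/src/other
echo bar >blah/src/bar/f.txt && echo other >blah/src/other/g.txt
git -C blah add -A && git -C blah commit -qm init
git clone -q --no-checkout "$work/blah" foo
cd foo
echo "src/bar/" >.git/info/sparse-checkout
git config core.sparsecheckout true
git read-tree -mu HEAD        # materialises only src/bar
ls src                        # only "bar" is present
```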



I foresee that large files and certain files need to be filterable for 
fetch-clone, and that might not be (backward) compatible with the 

Re: [PATCH v4 3/4] RUNTIME_PREFIX relocatable Git

2017-12-02 Thread Dan Jacques
My initial set of replies is inline here:
https://public-inbox.org/git/20171129223807.91343-1-...@google.com/T/#m1ff5cda787eaea4a4015a8f00a8be5966122c73a

I put together this solution for the I18N module. It exports "localedir" to
the Perl header and injects the correct value into Git::I18N. It also
suppresses the emission of the full ++LOCALEDIR++ when in runtime prefix
mode.

One downside of this approach is that, since the point of resolution happens
at the beginning of the script execution, external users of the runtime
prefix Git::I18N module will not be able to resolve the locale directory.
I think this is okay, though, since RUNTIME_PREFIX_PERL is decidedly *not*
intended for system installation and is not trying to offer system module
import portability.

After looking a bit at your $ENV{GIT_EXEC_PATH} optimization suggestion, I
ended up favoring the current usage of FindBin::Bin for a few reasons:

- Encoding the relative GITEXECDIR into the script is more consistent with
  how "exec_cmd.c" does its own self-resolution.
- The resolution code is a bit more readable, both in terms of what it's
  doing and where it starts from.
- It makes the scripts individually executable, rather than having to go
  through the Git wrapper. This is, minimally, useful for testing.

I admittedly don't know much about FindBin, so if it has a downside that
could outweigh this I'm more than happy to reevaluate.

With respect to testing, I did find "GIT_TEST_INSTALLED". Unfortunately,
it doesn't seem to work against the full test suite even on master, and
individual tests, especially Perl and locale, set their own path
expectations. I think including installation-relative testing here would
take a fair amount of work.

Note that this is a replacement for [PATCH v4 3/4], so if you're trying to
apply it, you will need to build it on top of the preceding patches.

Many thanks,
-Dan

---
 Makefile   | 59 +++---
 perl/Git/I18N.pm   |  2 +-
 perl/Makefile  | 17 ++
 perl/header_runtime_prefix.pl.template | 31 ++
 t/test-lib.sh  | 13 +---
 5 files changed, 107 insertions(+), 15 deletions(-)
 create mode 100644 perl/header_runtime_prefix.pl.template

diff --git a/Makefile b/Makefile
index 80904f8b0..558bd3ebb 100644
--- a/Makefile
+++ b/Makefile
@@ -425,6 +425,10 @@ all::
 #
 # to say "export LESS=FRX (and LV=-c) if the environment variable
 # LESS (and LV) is not set, respectively".
+#
+# Define RUNTIME_PREFIX_PERL to configure Git's PERL commands to locate Git
+# support libraries relative to their filesystem path instead of hard-coding
+# it.
 
 GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -462,6 +466,7 @@ ARFLAGS = rcs
 #   mandir
 #   infodir
 #   htmldir
+#   localedir
 #   perllibdir
 # This can help installing the suite in a relocatable way.
 
@@ -483,9 +488,12 @@ lib = lib
 # DESTDIR =
 pathsep = :
 
+gitexecdir_relative = $(patsubst $(prefix)/%,%,$(gitexecdir))
 mandir_relative = $(patsubst $(prefix)/%,%,$(mandir))
 infodir_relative = $(patsubst $(prefix)/%,%,$(infodir))
+localedir_relative = $(patsubst $(prefix)/%,%,$(localedir))
 htmldir_relative = $(patsubst $(prefix)/%,%,$(htmldir))
+perllibdir_relative = $(patsubst $(prefix)/%,%,$(perllibdir))
 
 export prefix bindir sharedir sysconfdir gitwebdir localedir
 
@@ -1718,11 +1726,13 @@ bindir_relative_SQ = $(subst ','\'',$(bindir_relative))
 mandir_relative_SQ = $(subst ','\'',$(mandir_relative))
 infodir_relative_SQ = $(subst ','\'',$(infodir_relative))
 localedir_SQ = $(subst ','\'',$(localedir))
+localedir_relative_SQ = $(subst ','\'',$(localedir_relative))
 gitexecdir_SQ = $(subst ','\'',$(gitexecdir))
+gitexecdir_relative_SQ = $(subst ','\'',$(gitexecdir_relative))
 template_dir_SQ = $(subst ','\'',$(template_dir))
 htmldir_relative_SQ = $(subst ','\'',$(htmldir_relative))
 prefix_SQ = $(subst ','\'',$(prefix))
-perllibdir_SQ = $(subst ','\'',$(perllibdir))
+perllibdir_relative_SQ = $(subst ','\'',$(perllibdir_relative))
 gitwebdir_SQ = $(subst ','\'',$(gitwebdir))
 
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
@@ -1965,14 +1975,52 @@ perl/PM.stamp: GIT-PERL-DEFINES FORCE
$(RM) $@+
 
 PERL_HEADER_TEMPLATE = perl/header_fixed_prefix.pl.template
+perl_localedir = $(localedir)
 
 PERL_DEFINES := $(PERL_PATH_SQ) $(PERLLIB_EXTRA_SQ)
-PERL_DEFINES += $(NO_PERL_MAKEMAKER)
-PERL_DEFINES += $(perllibdir)
+PERL_DEFINES += $(NO_PERL_MAKEMAKER) $(RUNTIME_PREFIX_PERL)
+
+# Support Perl runtime prefix. In this mode, a different header is installed
+# into Perl scripts. This header expects both the scripts and their support
+# library to be installed relative to $(prefix), and resolves the path to
+# the Perl libraries (perllibdir) from the executable's current path
+# (gitexecdir).
+#
+# This configuration requires both $(perllibdir) and $(gitexecdir) to be
+# relative paths, and will error if 

Re: git-p4: cloning with a change number does not import all files

2017-12-02 Thread Patrick Rouleau
On Fri, Dec 1, 2017 at 7:45 AM, Lars Schneider  wrote:
> Oh, I am with you. However, I only used git-p4 for a very short time in the
> way you want to use it. Therefore, I don't have much experience in that kind
> of usage pattern. I was able to convince my management to move all source to
> Git and I used git-p4 to migrate the repositories ;-)
>
> Here is a talk on the topic if you want to learn more:
> https://www.youtube.com/watch?v=QNixDNtwYJ0
>
> Cheers,
> Lars

Sadly, there is no way I can convince the company to switch to git. They
have acquired many companies in the past years and they have standardized
around Perforce. It is in part because they want access control and they
need to store a lot of binary files (including big VM images).

I keep this video close to explain why I like git:
https://www.youtube.com/watch?v=o4PFDKIc2fs

Regards,
P. Rouleau


Re: How hard would it be to implement sparse fetching/pulling?

2017-12-02 Thread Philip Oakley

From: "Vitaly Arbuzov" 
Sent: Friday, December 01, 2017 1:27 AM

Jonathan, thanks for references, that is super helpful, I will follow
your suggestions.


Philip, I agree that keeping the original DVCS off-line capability is an
important point. Ideally this feature should work even with remotes
that are located on the local disk.

And with any other remote (even to the extent that the other remote
may indicate it has no capability - sorry, go away..).
E.g. one ought to be able to have/create a Github narrow fork of only the
git.git/Documentation repo, and interact with that. (How much nicer if it were
git.git/Documentation/ManPages/, to ease the exclusion of RelNotes/, howto/
and technical/.)



Which part of Jeff's work do you think wouldn't work offline after
repo initialization is done and sparse fetch is performed? All the
stuff that I've seen seems to be quite usable without GVFS.

I think it's that initial download that may be different, and what is 
expected of it. In my case, one may never connect to that server again, yet 
still be able to work both off-line and with other remotes (push and pull as 
per capabilities). Below I note that I'd only fetch the needed trees, not 
all of them. Also one needs to fetch a complete (pre-defined) subset, rather 
than an on-demand subset.



I'm not sure if we need to store markers/tombstones on the client,
what problem does it solve?

The part the markers hope to solve is something I hadn't said: they should
also show in the work tree, so that users can see what is missing and
where.


Importantly, I would also trim the directory (tree) structure so only the
direct hierarchy of those files the user sees is visible, though at each
level they would see side directory names (which are embedded in the
hierarchical tree objects). (IIUC Jeff H's scheme downloads *all* trees, not
just a few.)


It would mean that users can create a complete fresh tree and commit that
can be merged and picked onto the upstream tree from the _directory worktree
alone_, because the oids of all the parts are listed in the worktree. The
actual objects for the missing oids being available in the appropriate
upstream.


It also means the index can be deleted, and with only the local narrow pack
files and the current worktree the index can be recreated at the current
sparseness level. (I'm hoping I've understood the dispersal of data
between index and narrow packs correctly here ;-)


--
Philip

On Thu, Nov 30, 2017 at 3:43 PM, Philip Oakley  wrote:

From: "Vitaly Arbuzov" 


Found some details here: https://github.com/jeffhostetler/git/pull/3

Looking at commits I see that you've done a lot of work already,
including packing, filtering, fetching, cloning etc.
What are some areas that aren't complete yet? Do you need any help
with implementation?



comments below..



On Thu, Nov 30, 2017 at 9:01 AM, Vitaly Arbuzov  wrote:


Hey Jeff,

It's great, I didn't expect that anyone is actively working on this.
I'll check out your branch, meanwhile do you have any design docs that
describe these changes or can you define high level goals that you
want to achieve?

On Thu, Nov 30, 2017 at 6:24 AM, Jeff Hostetler 
wrote:




On 11/29/2017 10:16 PM, Vitaly Arbuzov wrote:



Hi guys,

I'm looking for ways to improve fetch/pull/clone time for large git
(mono)repositories with unrelated source trees (that span across
multiple services).
I've found sparse checkout approach appealing and helpful for most of
client-side operations (e.g. status, reset, commit, etc.)
The problem is that there is no feature like sparse fetch/pull in git,
this means that ALL objects in unrelated trees are always fetched.
It may take a lot of time for large repositories and results in some
practical scalability limits for git.
This forced some large companies like Facebook and Google to move to
Mercurial as they were unable to improve client-side experience with
git while Microsoft has developed GVFS, which seems to be a step back
to CVCS world.

I want to get feedback (from more experienced git users than I am)
on what it would take to implement sparse fetching/pulling.
(Downloading only objects related to the sparse-checkout list.)
Are there any issues with missing hashes?
Are there any fundamental problems why it can't be done?
Can we get away with only client-side changes or would it require
special features on the server side?



I have, for separate reasons, been _thinking_ about the issue ($dayjob is in
defence, so a similar partition would be useful).

The changes would almost certainly need to be server side (as well as
client side), as it is the server that decides what is sent over the wire in
the pack files, which would need to be a 'narrow' pack file.


If we had such a feature then all we would need on top is a separate
tool that builds the right "sparse" scope for the workspace based 

[PATCH v2] git-gui: Prevent double UTF-8 conversion

2017-12-02 Thread Łukasz Stelmach
Convert the author's name from the UTF-8 (or any other) encoding in the
load_last_commit function the same way the commit message is converted.
Amending commits in git-gui without such conversion breaks UTF-8
strings. For example, "\305\201ukasz" (as written by git cat-file) becomes
"\303\205\302\201ukasz" in an amended commit.

Signed-off-by: Łukasz Stelmach 
---
 git-gui/lib/commit.tcl | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-gui/lib/commit.tcl b/git-gui/lib/commit.tcl
index 83620b7cb..f820c24bf 100644
--- a/git-gui/lib/commit.tcl
+++ b/git-gui/lib/commit.tcl
@@ -34,9 +34,7 @@ You are currently in the middle of a merge that has not been fully completed.  Y
 				lappend parents [string range $line 7 end]
 			} elseif {[string match {encoding *} $line]} {
 				set enc [string tolower [string range $line 9 end]]
-			} elseif {[regexp "author (.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} {
-				set commit_author [list name $name email $email date $time]
-			}
+			} elseif {[regexp "author (.*)\\s<(.*)>\\s(\\d.*$)" $line all name email time]} { }
 		}
 		set msg [read $fd]
 		close $fd
@@ -44,7 +42,9 @@ You are currently in the middle of a merge that has not been fully completed.  Y
 		set enc [tcl_encoding $enc]
 		if {$enc ne {}} {
 			set msg [encoding convertfrom $enc $msg]
+			set name [encoding convertfrom $enc $name]
 		}
+		set commit_author [list name $name email $email date $time]
 		set msg [string trim $msg]
 	} err]} {
 		error_popup [strcat [mc "Error loading commit data for amend:"] "\n\n$err"]
-- 
2.11.0
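The double conversion is easy to reproduce outside Tcl: take the two UTF-8
bytes of "Ł" (0xC5 0x81), misread them as ISO-8859-1, and encode the result
as UTF-8 again:

```shell
# "Ł" is C5 81 in UTF-8; re-encoding those bytes as if they were
# ISO-8859-1 yields C3 85 C2 81, i.e. the "\303\205\302\201" mangling
# seen in the amended commits.
printf '\305\201' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1
```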



Re: [PATCH] git-gui: Prevent double UTF-8 conversion

2017-12-02 Thread Łukasz Stelmach
On Tue, 28 Nov 2017 15:35:21 +0100, Johannes Schindelin wrote:
> On Tue, 28 Nov 2017, Łukasz Stelmach wrote:

> > With encoding on the file descriptor set to "binary" Tcl (8.6 in my
> > case) does double conversion which breaks e.g. author name in amended
> > commits.
>
> Is the problem in question occurring when a commit message contains
> non-ASCII characters encoded in UTF-8 and you click the "Amend Last
> Commit" radio button on the right side above the "Commit Message" text
> box?

No, only with the author data. That means there is a difference in how
Tcl handles input with "read" (the message) and "gets" (headers,
including the author header).

Indeed! A few lines below, the message is converted from UTF-8 and the
author is not. So the right thing to do is to convert the author too, and
not change the file descriptor settings.

I am going back to the drawing board. Thanks. 

-- 
It was very nice.  --- Rurku. --- ...
>Łukasz<--- It's good that you're listening to me.





[PATCH] Add a sample hook which saves push certs as notes

2017-12-02 Thread Shikher Verma
hooks--post-receive.sample: If a push cert is present, add it as a git
note to the topmost commit of the updated ref.

Signed-off-by: Shikher Verma 
---
 templates/hooks--post-receive.sample | 38 
 1 file changed, 38 insertions(+)
 create mode 100755 templates/hooks--post-receive.sample

diff --git a/templates/hooks--post-receive.sample b/templates/hooks--post-receive.sample
new file mode 100755
index 0..b4366e43f
--- /dev/null
+++ b/templates/hooks--post-receive.sample
@@ -0,0 +1,38 @@
+#!/bin/sh
+#
+# An example hook script to store push certificates as notes.
+#
+# To enable this hook, rename this file to "post-receive".
+#
+# The stdin of the hook will be one line for each updated ref:
+#   <old-value> <new-value> <ref-name>
+#
+# For each updated ref this script will:
+# 1. Verify that the ref update matches that in the push certificate.
+# 2. Add the push cert as a note (namespace pushcerts) to <new-value>.
+#
+# If this hook is enabled on the server then clients can prevent
+# git metadata tampering, by using signed pushes and 
+# doing the following while fetching :
+# 1. fetch the git notes (of namespace pushcerts) from server.
+# $ git fetch origin refs/notes/pushcerts:refs/notes/pushcerts
+# 2. Check that the fetched ref's top most commit has a note
+# containing a push certificate.
+# 3. Verify the validity of the push certificate in the note and 
+# check that the ref update matches that in push certificate.
+#
+
+if test -z "$GIT_PUSH_CERT" ; then
+exit 0
+fi
+
+push_cert=$(git cat-file -p "$GIT_PUSH_CERT")
+
+while read oval nval ref
+do
+	# Verify that the ref update matches that in the push certificate.
+	case "$push_cert" in *"$oval $nval $ref"*)
+		# add the push cert as note (namespace pushcerts) to nval.
+		git notes --ref=pushcerts add -m "$push_cert" "$nval" -f
+	;; esac
+done
-- 
2.15.0
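The pushcerts namespace the hook writes to behaves like any other notes ref,
so the client-side steps in the header comment come down to ordinary
fetch/notes commands. A minimal local sketch, using a stand-in string since
the real certificate blob only exists server-side via $GIT_PUSH_CERT:

```shell
# Attach a stand-in certificate as a note in the pushcerts namespace
# and read it back, mirroring what the hook does on the server.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "You"
echo demo >file && git add file && git commit -qm demo
git notes --ref=pushcerts add -m "fake push certificate" HEAD
git notes --ref=pushcerts show HEAD      # prints "fake push certificate"
# A client verifying against a server that runs this hook would first:
#   git fetch origin refs/notes/pushcerts:refs/notes/pushcerts
```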