[PATCH 0/5] consistent setup code for external commands

2016-06-30 Thread Jeff King
On Fri, Jul 01, 2016 at 12:07:15AM -0400, Jeff King wrote:

> Interesting. It's failing on the assert(argv0_path) in system_path().
> 
> That's part of the RUNTIME_PREFIX code which is built only on Windows,
> so this is a Windows-specific issue.
> 
> I can guess the reason that argv0_path is not set is that
> credential-store has its own main() function and does not call
> git_extract_argv0_path(). It never mattered before, but as of v2.8.0,
> one of the library functions it calls wants to use system_path(), which
> assumes that the argv0 stuff has been set up.
> 
> I'm preparing some patches to fix this case (and some other related
> ones).

Here they are:

  [1/5]: add an extra level of indirection to main()
  [2/5]: common-main: call git_extract_argv0_path()
  [3/5]: common-main: call sanitize_stdfds()
  [4/5]: common-main: call restore_sigpipe_to_default()
  [5/5]: common-main: call git_setup_gettext()

The first patch is refactoring so we can fix this problem once, rather
than in the dozens of programs that need it.

The second should fix the problem you saw. It's unfortunate that the
tests didn't detect it; there's some discussion of that in the commit
message.

Patches 3-5 are other places where we can use the refactoring to be more
consistent and remove boilerplate code.

I almost did a patch 6 moving trace_command_performance(), but I'm not
sure if people would care or not. Running "git foo" already covers that,
even for an external command, because the git wrapper waits until the
sub-process finishes. Setting it up in each sub-program would mean:

  - you get it for dashed externals you run directly, too

  - for external programs run as "git foo", you get _two_ times. One for
the time the actual sub-program ran, and one with the overhead of
the git wrapper process.

I'm not sure if people actually care about that distinction, or
whether the extra number would simply be annoying.

So I punted. It's easy to move it over later if anybody cares.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 1/5] t9300: factor out portable "head -c" replacement

2016-06-30 Thread Eric Sunshine
On Thursday, June 30, 2016, Jeff King  wrote:
> In shell scripts it is sometimes useful to be able to read
> exactly N bytes from a pipe. Doing this portably turns out
> to be surprisingly difficult.
>
> We want a solution that:
>
>   - is portable
>
>   - never reads more than N bytes due to buffering (which
> would mean those bytes are not available to the next
> program to read from the same pipe)
>
>   - handles partial reads by looping until N bytes or read

s/or/are/

> (or we see EOF)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git-credentials-store.exe crash

2016-06-30 Thread Jeff King
On Thu, Jun 30, 2016 at 03:24:41PM -0600, d...@ucar.edu wrote:

> Carson:part2: git push
> This application has requested the Runtime to terminate it in an unusual
> way.
> Please contact the application's support team for more information.
> A s s e r t i o n   f a i l e d !
> 
>  P r o g r a m :   C : \ t o o l s \ G i t \ m i n g w 3 2 \ l i b e x e c \
> g i
>  t - c o r e \ g i t - c r e d e n t i a l - s t o r e . e x e
>  F i l e :   e x e c _ c m d . c ,   L i n e   2 3
> 
>  E x p r e s s i o n :   a r g v 0 _ p a t h
>  Everything up-to-date

Interesting. It's failing on the assert(argv0_path) in system_path().

That's part of the RUNTIME_PREFIX code which is built only on Windows,
so this is a Windows-specific issue.

I can guess the reason that argv0_path is not set is that
credential-store has its own main() function and does not call
git_extract_argv0_path(). It never mattered before, but as of v2.8.0,
one of the library functions it calls wants to use system_path(), which
assumes that the argv0 stuff has been set up.

I'm preparing some patches to fix this case (and some other related
ones).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: topological index field for commit objects

2016-06-30 Thread Jeff King
On Thu, Jun 30, 2016 at 11:12:52AM -0700, Linus Torvalds wrote:

> I do think that it's ok to cache generation numbers somewhere if there
> is an algorithm that can make use of them, but every time this comes
> up, it's just not been important enough to make a big deal and a new
> incompatible object format for it. The committer date is preexisting
> and has existing pseudo-generation-number usage, so..improving on the
> quality of it sounds like a good idea.

If you are OK with a cache, I don't think one needs to change the object
format at all. It can be computed on the fly, and is purely a local
optimization.

> The first step should probably be to make fsck warn about the existing
> cases of "commit has older date than parents". Something like the
> attached patch, perhaps?

I have mixed feelings on this, because it forces the user to confront
the issue at a time that's potentially very far from when it actually
happened (and often when it is not their fault).

I expect most people don't run fsck at all under normal circumstances,
so it is only when they have a problem of some sort that they would see
this warning at all. It would kick in and prevent objects being
transferred when things like receive.fsckObjects are configured, but
it's not on by default. GitHub does enable it, so pushing there is often
the first notification people get about any kind of problem.

This is a general problem with all fsck-driven warnings (people may
fetch without realizing they're getting breakage, and such objects may
get years of history built on top). But I think it can be even worse
with timestamps, because the broken state may not even be recognizable
when you first fetch the troublesome object.

E.g., imagine somebody else has their clock set forward, and you fetch
from them. Their object by itself is not broken. It is only when you
want to commit on top of it, with the correct clock, that the broken
state is created (and then, we cannot know whether it is your clock or
the original committer's clock that is bad).

So I think it would be more productive to put a check like this in "git
commit" rather than (or perhaps in addition to) fsck. That prevents
us creating the broken relationship, but it does still mean the user may
have to to go back and tell the original committer that their clock was
broken.

You could also have the fsck check look not only for out-of-order
commits, but also commits in the future (from the perspective of the
receiver). That would reject such broken commits before they even hit
your repository (though again, it is unclear in such a case if the
commit is broken or the clock of the checker).

> +static void fsck_commit_date(struct fsck_options *options, struct commit 
> *commit)
> +{
> + struct commit_list *p;
> +
> + for (p = commit->parents; p; p = p->next) {
> + struct commit *parent = p->item;
> + if (commit->date < parent->date)
> + report(options, >object, 
> FSCK_MSG_DATE_ORDERING, "Bad commit date ordering with parent");
> + }
> +}

I didn't test it, but I suspect this won't work reliably, as we do not
always have the parents parsed during an fsck check (e.g., during
"index-pack --strict").

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: topological index field for commit objects

2016-06-30 Thread Mike Hommey
On Thu, Jun 30, 2016 at 11:12:52AM -0700, Linus Torvalds wrote:
> On Thu, Jun 30, 2016 at 3:30 AM, Jakub Narębski  wrote:
> >
> > P.S. Having Git ensure that committerdate (as an epoch) is greater
> > than committerdates of its parents at the commit creation time (with
> > providing warning about time skew, perhaps not doing it if skew is
> > too large) would be not costly. No need to add anything, and it would
> > help future queries to be faster, I think.
> 
> So I think git should check the committer date (ie verify that the
> commitetr date of a new commit is always equal to or more recent than
> all the parents).
> 
> But we should probably allow some slop for when machines are out of
> sync (think build farms etc automation that may do automated commits,
> but the clocks on different machines may be off by a few seconds). We
> already do things like that in some of the use of committer dates -
> see for example CUTOFF_DATE_SLOP)
> 
> And we should probably allow the user to override the refusal to
> create new commits with bogus dates - think importing repositories etc
> where you may have reasons to just say "that's just how it was".
> 
> And it will take years for that to percolate out to all users, so it's
> not like it will fix things immediately, but then some time from now,
> we can consider commit dates to be more reliably generation numbers.

This is an assumption that might work for git repositories, but not for
foreign repositories imported in git (think e.g. mercurial repositories
accessed with a remote helper)

Mike
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()

2016-06-30 Thread Ben Peart
Duy Nguyen  gmail.com> writes:

> 
> On Thu, Jun 30, 2016 at 7:55 PM, Ben Peart  gmail.com> wrote:
> > David Turner  novalis.org> writes:
> >
> >>
> >> Hiding watchman behind index-helper means you need both daemons. You
> >> can't run watchman alone. Not so good. But on the other hand, 'git'
> >> binary is not linked to watchman/json libraries, which is good for
> >> packaging. Core git package will run fine without watchman-related
> >> packages. If they need watchman, they can install git-index-helper and
> >> dependencies.
> >>
> >
> > Have you considered splitting index-helper and watchman apart?  Using
> > Watchman to not lstat unchanged entries is a huge perf win with very
> > large repos.
> 
> On large repos (i.e. lots of files/dirs on worktree), the cost of
> reading index will increase proportionally. Yes lstat costs, but I
> suspect index reading (integrity verification actually) may cost more,
> especially on platforms with cheap lstat like linux. On these repos
> you really want to enable all four: index-helper (with watchman),
> split-index (I still need to work out pruning on split-index) and
> untracked cache. There's still a lot more to make git run fast on
> large repos though.
> 

I've found (at least on Windows) that as the repo size gets larger, the
time to read the index becomes a much smaller percentage of the overall
time.  I just captured some perf traces of git status on a large repo we
have.  Of that, 92.5% was spent in git!read_directory and only 4.0% was 
spent in git!read_index.  Of that 4%, 2.6% was git!glk_SHA1_Update.

Given there were no dirty files, Watchman would have made a huge 
improvement in the overall time but index helper would have had
relatively little impact.  We've noticed this same pattern in all our
repos which is what is driving our interest in the Watchman model and
also shows why index-helper is of less interest.

> > It would also be interesting to make the Watchman backend replaceable by
> > using an extensible API.  This has the benefit of not having to link the
> > 'git' binary to the watchman/json libraries.
> 
> 'git' binary is not linked to watchman libraries. git-index-helper is
> a separate binary, by design. In theory you can create a
> 'git-index-helper' replacement binary with something other than
> watchman. I think David documented the protocol well (it may change in
> the future though and we are not prepared for capability progression)
> 
> > Is there any pattern already in git for accomplishing this?

While the current design hides watchman behind index-helper, if we were
to change that model so they were independent, we would be interested
in doing it in such a way that provided some abstraction so that it 
could be replaced with another file watching daemon.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: topological index field for commit objects

2016-06-30 Thread Jakub Narębski
W dniu 2016-06-30 o 20:12, Linus Torvalds pisze:
> On Thu, Jun 30, 2016 at 3:30 AM, Jakub Narębski  wrote:
>>
>> P.S. Having Git ensure that committerdate (as an epoch) is greater
>> than committerdates of its parents at the commit creation time (with
>> providing warning about time skew, perhaps not doing it if skew is
>> too large) would be not costly. No need to add anything, and it would
>> help future queries to be faster, I think.
> 
> So I think git should check the committer date (ie verify that the
> committer date of a new commit is always equal to or more recent than
> all the parents).
> 
> But we should probably allow some slop for when machines are out of
> sync (think build farms etc automation that may do automated commits,
> but the clocks on different machines may be off by a few seconds). We
> already do things like that in some of the use of committer dates -
> see for example CUTOFF_DATE_SLOP).

The problem with this idea is that the clock skew might be in two
directions, but it does fix/help only one.  If committer's clock
lags behind true time, the automatic bump to have committerdate of
a new commit be greater than committerdates of all its parent is quite
helpful.

However, if some developer has his or her clock misconfigured so that
it gives time in the future, this feature would not help.  Commits from
such developer (from such machine) would screw it up for others.

That's why I think some limit is needed.  For example, if for some
reason commit was created with committer date 3 days in the future,
we would probably do not want for other developers to have to lie
about when they created their commit.


Or did you mean that if the date for new commit is in the past
of committerdates of its parents, its all right when it is within
slop?

> 
> And we should probably allow the user to override the refusal to
> create new commits with bogus dates - think importing repositories etc
> where you may have reasons to just say "that's just how it was".

Right. I was thinking about (ab)using --no-verify flag, or perhaps
prompting user if the terminal is interactive. But perhaps a separate
flag / environment variable would be better...

> And it will take years for that to percolate out to all users, so it's
> not like it will fix things immediately, but then some time from now,
> we can consider commit dates to be more reliably generation numbers.
[...]

That's the idea of the proposal.

-- 
Jakub Narębski


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git-credentials-store.exe crash

2016-06-30 Thread d...@ucar.edu

Since I cannot submit a github issue, I will try here.



 - Which version of Git for Windows are you using? 32-bit or 64-bit? 
Include the

   output of `git version` as well.
git version 2.9.0.windows.1
32-bit

 - Which version of Windows are you running? 32-bit or 64-bit?
64-bit windows 7

 - What options did you set as part of the installation? Or did you 
choose the

   defaults?
no git credentials manager

 - Any other interesting things about your environment that might be 
related

   to the issue you're seeing?
Unknown

### Details

 - Which terminal/shell are you running Git from? e.g 
Bash/CMD/PowerShell/other


 cygwin bash

 - What commands did you run to trigger this issue? If you can provide a
   [Minimal, Complete, and Verifiable 
example](http://stackoverflow.com/help/mcve)

   this will help us understand the issue.

I kept getting complaints by git about credentials lock existing.
deleted it and then git-credentials-store crashed as follows.
```
Carson:part2: git push
fatal: unable to get credential storage lock: File exists
Everything up-to-date
Carson:part2: ls ~/.git*
/cygdrive/c/Users/dmh/.gitconfig* 
/cygdrive/c/Users/dmh/.gitignore*

/cygdrive/c/Users/dmh/.git-credentials*   /cygdrive/c/Users/dmh/.gitk*
/cygdrive/c/Users/dmh/.git-credentials.lock*
Carson:part2: rm ~/.git-credentials.lock
Carson:part2: git push
This application has requested the Runtime to terminate it in an unusual 
way.

Please contact the application's support team for more information.
A s s e r t i o n   f a i l e d !

 P r o g r a m :   C : \ t o o l s \ G i t \ m i n g w 3 2 \ l i b e x 
e c \ g i

 t - c o r e \ g i t - c r e d e n t i a l - s t o r e . e x e
 F i l e :   e x e c _ c m d . c ,   L i n e   2 3

 E x p r e s s i o n :   a r g v 0 _ p a t h
 Everything up-to-date
Carson:part2:

```
 - What did you expect to occur after running these commands?
completion with no errors

 - What actually happened instead?

 git-credentials-store failed.

 - If the problem was occurring with a specific repository, can you 
provide the

   URL to that repository to help us with testing?
url = https://github.com/Unidata/thredds.git

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Problem with --shallow-submodules option

2016-06-30 Thread Istvan Zakar
Hi,

Thanks for the clarification, it makes sense now.

Thanks,
Istvan


On 30 June 2016 at 22:57, Stefan Beller  wrote:
> On Thu, Jun 30, 2016 at 6:27 AM, Istvan Zakar  wrote:
>> Hello,
>>
>> Thanks for your answers. I tested it after the changes were made on
>> the git server, and it seems to be working. But some other issue came
>> up.
>>
>> We have quite many submodules in our project so I did some comaprision:
>>
>> If I do a clone with these parameters:
>> --jobs 20 --recurse-submodules
>>
>> The clone lasts ~53 seconds, and the total size of the folder is around 2 GB.
>>
>> If I add the shallow-submodules option, the size of the folder will be
>> a bit below 1GB, so the size decreased as I expected, but the time of
>> the clone itself increased to 90 seconds. It seems the last step of
>> the command, checking out the submodules is executed one-by-one, and
>> not in parallel, so it seems at this step the jobs parameter does not
>> have effect.
>>
>> Is it intentional, or there is some option I missed?
>
> It was intentional at the time of submitting the patches.
> The checkout phase is a bit complicated as it combines the
> newly cloned submodules as well as the submodules to incrementally
> fetch into one bucket and treats them the same.
>
> And for submodules that were fetched incrementally you may run into problems
> when combining that with the local state (e.g. rebase or merge configured in
> `submodule..update` or passed on the command line), which requires
> human interaction (resolving the merge conflict), which we want to present one
> at a time to the user.
>
> The handling for the user is not quite clear, when to stop, see:
> 15ffb7cde48b73b3d5ce259443db7d2e0ba13750 (submodule update: continue
> when a checkout fails)
> 877449c136539cf8b9b4ed9cfe33a796b7b93f93 (git-submodule.sh: clarify
> the "should we die now" logic)
>
> So we want to die as soon as we see a merge conflict or other
> error that is likely to require some human interaction.
> To do that properly we need to have complicated logic or just update
> one submodule at a time.
>
> For initial checkouts we know that there will be no merge conflicts, i.e.
> it will be a "checkout -f" (with an implicit must_die_on_failure=no)
> So we could run all checkouts of submodules in parallel, too. We'd
> just need to write the patch for that.
>
> As the cloning is already done in parallel, we can hook into the initial
> checkout there easily. I'd build that on top of [1], creating a similar 
> commit.
> In the successful case of `update_clone_task_finished` (the case with
> `!result`  -> return 0;) we would need to add the checkout command to
> the queue instead of just finishing.
>
> [1] 
> https://github.com/gitster/git/commit/665b35eccd39fefd714cb5c332277a6b94fd9386
>
>
>>
>> I'm using git 2.9.0 on client side.
>>
>> Thanks,
>>Istvan
>>
>> ps: if I update the submodules with --depth 1 parameter in parallel
>> using xargs it lasts about 18 seconds, so it's a workaround for this
>> issue, but it would be nice to do it with a single command.
>>
>>
>>
>>
>> On 22 June 2016 at 17:31, Fredrik Gustafsson  wrote:
>>> On Mon, Jun 20, 2016 at 01:06:39PM +, Istvan Zakar wrote:
 I'm working on a relatively big project with many submodules. During
 cloning for testing I tried to decrease the amount of data need to be
 fetched from the server by using --shallow-submodules option in the clone
 command. It seems to check out the tip of the remote repo, and if it's not
 the commit registered in the superproject the submodule update fails
 (obviously). Can I somehow tell to fetch that exact commit I need for my
 superproject?
>>>
>>> Maybe. http://stackoverflow.com/questions/2144406/git-shallow-submodules
>>> gives a good overview of this problem.
>>>
>>> git fetches a branch and is shallow from that branch, which might be an
>>> other sha1 than the one the submodule points to, (as you say). This
>>> is/was one of the drawbacks with this method. However the since git 2.8,
>>> git will try to fetch the sha1 direct (and not the branch). So then it
>>> will work, if(!), the server supports direct access to sha1. This was
>>> previously not allowed due to security concerns (if I recall correctly).
>>>
>>> So the answer is, yes this will work if you've a recent version of git
>>> and support on the server side for doing this. Unfortunately I'm not
>>> sure which git version is needed on the server side for this to work.
>>>
>>> --
>>> Fredrik Gustafsson
>>>
>>> phone: +46 733-608274
>>> e-mail: iv...@iveqy.com
>>> website: http://www.iveqy.com
>> --
>> To unsubscribe from this list: send the line "unsubscribe git" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More 

Re: Problem with --shallow-submodules option

2016-06-30 Thread Stefan Beller
On Thu, Jun 30, 2016 at 6:27 AM, Istvan Zakar  wrote:
> Hello,
>
> Thanks for your answers. I tested it after the changes were made on
> the git server, and it seems to be working. But some other issue came
> up.
>
> We have quite many submodules in our project so I did some comaprision:
>
> If I do a clone with these parameters:
> --jobs 20 --recurse-submodules
>
> The clone lasts ~53 seconds, and the total size of the folder is around 2 GB.
>
> If I add the shallow-submodules option, the size of the folder will be
> a bit below 1GB, so the size decreased as I expected, but the time of
> the clone itself increased to 90 seconds. It seems the last step of
> the command, checking out the submodules is executed one-by-one, and
> not in parallel, so it seems at this step the jobs parameter does not
> have effect.
>
> Is it intentional, or there is some option I missed?

It was intentional at the time of submitting the patches.
The checkout phase is a bit complicated as it combines the
newly cloned submodules as well as the submodules to incrementally
fetch into one bucket and treats them the same.

And for submodules that were fetched incrementally you may run into problems
when combining that with the local state (e.g. rebase or merge configured in
`submodule..update` or passed on the command line), which requires
human interaction (resolving the merge conflict), which we want to present one
at a time to the user.

The handling for the user is not quite clear, when to stop, see:
15ffb7cde48b73b3d5ce259443db7d2e0ba13750 (submodule update: continue
when a checkout fails)
877449c136539cf8b9b4ed9cfe33a796b7b93f93 (git-submodule.sh: clarify
the "should we die now" logic)

So we want to die as soon as we see a merge conflict or other
error that is likely to require some human interaction.
To do that properly we need to have complicated logic or just update
one submodule at a time.

For initial checkouts we know that there will be no merge conflicts, i.e.
it will be a "checkout -f" (with an implicit must_die_on_failure=no)
So we could run all checkouts of submodules in parallel, too. We'd
just need to write the patch for that.

As the cloning is already done in parallel, we can hook into the initial
checkout there easily. I'd build that on top of [1], creating a similar commit.
In the successful case of `update_clone_task_finished` (the case with
`!result`  -> return 0;) we would need to add the checkout command to
the queue instead of just finishing.

[1] 
https://github.com/gitster/git/commit/665b35eccd39fefd714cb5c332277a6b94fd9386


>
> I'm using git 2.9.0 on client side.
>
> Thanks,
>Istvan
>
> ps: if I update the submodules with --depth 1 parameter in parallel
> using xargs it lasts about 18 seconds, so it's a workaround for this
> issue, but it would be nice to do it with a single command.
>
>
>
>
> On 22 June 2016 at 17:31, Fredrik Gustafsson  wrote:
>> On Mon, Jun 20, 2016 at 01:06:39PM +, Istvan Zakar wrote:
>>> I'm working on a relatively big project with many submodules. During
>>> cloning for testing I tried to decrease the amount of data need to be
>>> fetched from the server by using --shallow-submodules option in the clone
>>> command. It seems to check out the tip of the remote repo, and if it's not
>>> the commit registered in the superproject the submodule update fails
>>> (obviously). Can I somehow tell to fetch that exact commit I need for my
>>> superproject?
>>
>> Maybe. http://stackoverflow.com/questions/2144406/git-shallow-submodules
>> gives a good overview of this problem.
>>
>> git fetches a branch and is shallow from that branch, which might be an
>> other sha1 than the one the submodule points to, (as you say). This
>> is/was one of the drawbacks with this method. However the since git 2.8,
>> git will try to fetch the sha1 direct (and not the branch). So then it
>> will work, if(!), the server supports direct access to sha1. This was
>> previously not allowed due to security concerns (if I recall correctly).
>>
>> So the answer is, yes this will work if you've a recent version of git
>> and support on the server side for doing this. Unfortunately I'm not
>> sure which git version is needed on the server side for this to work.
>>
>> --
>> Fredrik Gustafsson
>>
>> phone: +46 733-608274
>> e-mail: iv...@iveqy.com
>> website: http://www.iveqy.com
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 4/4] doc: clarify that `^r1` will exclude `r1` itself

2016-06-30 Thread Philip Oakley
Signed-off-by: Philip Oakley 
---
 Documentation/revisions.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
index 131060c..87be9c4 100644
--- a/Documentation/revisions.txt
+++ b/Documentation/revisions.txt
@@ -246,7 +246,7 @@ The '{caret}' (caret) notation
 ~~~
 To exclude commits reachable from a commit, a prefix '{caret}'
 notation is used.  E.g. '{caret}r1 r2' means commits reachable
-from 'r2' but exclude the ones reachable from 'r1'.
+from 'r2' but exclude 'r1' and those reachable from 'r1'.
 
 The '..' (two-dot) range notation
 ~
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 3/4] doc: give headings for the two and three dot notations

2016-06-30 Thread Philip Oakley
While there, also break out the other shorthand notations and
add a title for the revision range summary (which also appears
in git-rev-parse, so keep it mixed case).

Signed-off-by: Philip Oakley 
---
 Documentation/revisions.txt | 23 +--
 1 file changed, 17 insertions(+), 6 deletions(-)

diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
index 19314e3..131060c 100644
--- a/Documentation/revisions.txt
+++ b/Documentation/revisions.txt
@@ -242,35 +242,46 @@ specifying a single revision with the notation described 
in the
 previous section means the set of commits reachable from that
 commit, following the commit ancestry chain.
 
+The '{caret}' (caret) notation
+~~~
 To exclude commits reachable from a commit, a prefix '{caret}'
 notation is used.  E.g. '{caret}r1 r2' means commits reachable
 from 'r2' but exclude the ones reachable from 'r1'.
 
-This set operation appears so often that there is a shorthand
+The '..' (two-dot) range notation
+~
+The '{caret}r1 r2' set operation appears so often that there is a shorthand
 for it.  When you have two commits 'r1' and 'r2' (named according
 to the syntax explained in SPECIFYING REVISIONS above), you can ask
 for commits that are reachable from r2 excluding those that are reachable
 from r1 by '{caret}r1 r2' and it can be written as 'r1..r2'.
 
+The '...' (three dot) Symmetric Difference notation
+~~~
 A similar notation 'r1\...r2' is called symmetric difference
 of 'r1' and 'r2' and is defined as
 'r1 r2 --not $(git merge-base --all r1 r2)'.
 It is the set of commits that are reachable from either one of
 'r1' or 'r2' but not from both.
 
-In these two shorthands, you can omit one end and let it default to HEAD.
+In these two shorthand notations, you can omit one end and let it default to 
HEAD.
 For example, 'origin..' is a shorthand for 'origin..HEAD' and asks "What
 did I do since I forked from the origin branch?"  Similarly, '..origin'
 is a shorthand for 'HEAD..origin' and asks "What did the origin do since
 I forked from them?"  Note that '..' would mean 'HEAD..HEAD' which is an
 empty range that is both reachable and unreachable from HEAD.
 
+Additional '{caret}' Shorthand notations
+
 Two other shorthands for naming a set that is formed by a commit
-and its parent commits exist.  The 'r1{caret}@' notation means all
-parents of 'r1'.  'r1{caret}!' includes commit 'r1' but excludes
-all of its parents.
+and its parent commits exist.
 
-To summarize:
+The 'r1{caret}@' notation means all parents of 'r1'.
+
+'r1{caret}!' includes commit 'r1' but excludes all of its parents.
+
+Revision Range Summary
+--
 
 ''::
Include commits that are reachable from (i.e. ancestors of)
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 1/4] doc: use 'symmetric difference' consistently

2016-06-30 Thread Philip Oakley
Signed-off-by: Philip Oakley 
---
 Documentation/gitk.txt | 2 +-
 Documentation/rev-list-options.txt | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/gitk.txt b/Documentation/gitk.txt
index 6ade002..6c3eb15 100644
--- a/Documentation/gitk.txt
+++ b/Documentation/gitk.txt
@@ -70,7 +70,7 @@ linkgit:git-rev-list[1] for a complete list.
 
 --left-right::
 
-   Mark which side of a symmetric diff a commit is reachable
+   Mark which side of a symmetric difference a commit is reachable
from.  Commits from the left side are prefixed with a `<`
symbol and those from the right with a `>` symbol.
 
diff --git a/Documentation/rev-list-options.txt 
b/Documentation/rev-list-options.txt
index 4f009d4..6dc0bb0 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -225,7 +225,7 @@ excluded from the output.
 
 --left-only::
 --right-only::
-   List only commits on the respective side of a symmetric range,
+   List only commits on the respective side of a symmetric difference,
i.e. only those which would be marked `<` resp. `>` by
`--left-right`.
 +
@@ -766,7 +766,7 @@ ifdef::git-rev-list[]
 endif::git-rev-list[]
 
 --left-right::
-   Mark which side of a symmetric diff a commit is reachable from.
+   Mark which side of a symmetric difference a commit is reachable from.
Commits from the left side are prefixed with `<` and those from
the right with `>`.  If combined with `--boundary`, those
commits are prefixed with `-`.
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 0/4] Name for A..B ranges?

2016-06-30 Thread Philip Oakley
This is the re-roll of the po/range-doc (2016-06-27) 3 commits

The order is slightly re-arranged, and an additional patch clarifying
that ^r1 excludes r1 itself being added.

The heading have been tweaked.

Discussion: $gmane/297908
previous patch series $gmane/298223

Philip Oakley (4):
  doc: use 'symmetric difference' consistently
  doc: show the actual left, right, and boundary marks
  doc: give headings for the two and three dot notations
  doc: clarify that `^r1` will exclude `r1` itself

 Documentation/gitk.txt |  2 +-
 Documentation/pretty-formats.txt   |  2 +-
 Documentation/rev-list-options.txt |  4 ++--
 Documentation/revisions.txt| 25 ++---
 4 files changed, 22 insertions(+), 11 deletions(-)

-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 2/4] doc: show the actual left, right, and boundary marks

2016-06-30 Thread Philip Oakley
Signed-off-by: Philip Oakley 
---
Found while checking the 'symmetric difference' documentation
---
 Documentation/pretty-formats.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 29b19b9..10719e1 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -166,7 +166,7 @@ endif::git-rev-list[]
   respecting the `auto` settings of the former if we are going to a
   terminal). `auto` alone (i.e. `%C(auto)`) will turn on auto coloring
   on the next placeholders until the color is switched again.
-- '%m': left, right or boundary mark
+- '%m': left (`<`), right (`>`) or boundary (`-`) mark
 - '%n': newline
 - '%%': a raw '%'
 - '%x00': print a byte from a hex code
-- 
2.8.4.windows.1.3.ge328a54

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: --abbrev-commit gives longer hash than necessary

2016-06-30 Thread Bryan Turner
Steffen,

Git commands generally have a 7 character minimum by default when
abbreviating hashes, even if fewer characters are still (currently)
unique. Per the documentation:

   core.abbrev
   Set the length object names are abbreviated to. If unspecified,
   many commands abbreviate to 7 hexdigits, which may not be enough
   for abbreviated object names to stay unique for sufficiently long
   time.

You may be able to set core.abbrev to a smaller value to get even
shorter hashes, but the shorter you go the more likely you are to find
once-unique hashes no longer are, as your repository grows over time.
The default isn't just about what's likely to be unique now; it's
about what's likely to stay unique for a period of time.

Hope this helps!
Bryan Turner

On Thu, Jun 30, 2016 at 12:38 PM, Steffen Nurpmeso  wrote:
> Hello, for your possible interest.
>
> For some time (currently with 2.9.0) know see that a single commit
> gives a longer hash than necessary, even though there is no
> ambiguity:
>
>   ?0[steffen@wales ]$ git longca|
>   awk 'BEGIN{l7=0;l8=0}\
> /^[[:alnum:]]{7} /{++l7;next}\
> /^[[:alnum:]]{8} /{++l8;print}\
>   END{print "L7 " l7 " L8 " l8}'
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   L7 3364 L8 1
>
> So it is only this single commit.. but why?
>
>   ?0[steffen@wales ]$ git long1 786d0c9
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   ?0[steffen@wales ]$ git long1 786d0c
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   ?0[steffen@wales ]$ git long1 786d0
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>   ?0[steffen@wales ]$ git long1 786d
>   786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
> handlers..
>
> Not really ambiguous:
>
>   ?0[steffen@wales ]$ git long|cut -f1 -d' '|grep ^786
>   786d0c9c
>   786f219
>
> Ciao!
>
> --steffen
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--abbrev-commit gives longer hash than necessary

2016-06-30 Thread Steffen Nurpmeso
Hello, for your possible interest.

For some time (currently with 2.9.0) know see that a single commit
gives a longer hash than necessary, even though there is no
ambiguity:

  ?0[steffen@wales ]$ git longca|
  awk 'BEGIN{l7=0;l8=0}\
/^[[:alnum:]]{7} /{++l7;next}\
/^[[:alnum:]]{8} /{++l8;print}\
  END{print "L7 " l7 " L8 " l8}'
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  L7 3364 L8 1

So it is only this single commit.. but why?

  ?0[steffen@wales ]$ git long1 786d0c9
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  ?0[steffen@wales ]$ git long1 786d0c
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  ?0[steffen@wales ]$ git long1 786d0
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..
  ?0[steffen@wales ]$ git long1 786d
  786d0c9c [mimepipe.2] send.c:sendpart(): force iconv(3)+ for TEXT part 
handlers..

Not really ambiguous:

  ?0[steffen@wales ]$ git long|cut -f1 -d' '|grep ^786
  786d0c9c
  786f219

Ciao!

--steffen
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()

2016-06-30 Thread Duy Nguyen
On Thu, Jun 30, 2016 at 7:55 PM, Ben Peart  wrote:
> David Turner  novalis.org> writes:
>
>>
>> Hiding watchman behind index-helper means you need both daemons. You
>> can't run watchman alone. Not so good. But on the other hand, 'git'
>> binary is not linked to watchman/json libraries, which is good for
>> packaging. Core git package will run fine without watchman-related
>> packages. If they need watchman, they can install git-index-helper and
>> dependencies.
>>
>
> Have you considered splitting index-helper and watchman apart?  Using
> Watchman to not lstat unchanged entries is a huge perf win with very
> large repos.

On large repos (i.e. lots of files/dirs on worktree), the cost of
reading index will increase proportionally. Yes lstat costs, but I
suspect index reading (integrity verification actually) may cost more,
especially on platforms with cheap lstat like linux. On these repos
you really want to enable all four: index-helper (with watchman),
split-index (I still need to work out pruning on split-index) and
untracked cache. There's still a lot more to make git run fast on
large repos though.

> It would also be interesting to make the Watchman backend replaceable by
> using an extensible API.  This has the benefit of not having to link the
> 'git' binary to the watchman/json libraries.

'git' binary is not linked to watchman libraries. git-index-helper is
a separate binary, by design. In theory you can create a
'git-index-helper' replacement binary with something other than
watchman. I think David documented the protocol well (it may change in
the future though and we are not prepared for capability progression)

> Is there any pattern already in git for accomplishing this?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: topological index field for commit objects

2016-06-30 Thread Linus Torvalds
On Thu, Jun 30, 2016 at 3:30 AM, Jakub Narębski  wrote:
>
> P.S. Having Git ensure that committerdate (as an epoch) is greater
> than committerdates of its parents at the commit creation time (with
> providing warning about time skew, perhaps not doing it if skew is
> too large) would be not costly. No need to add anything, and it would
> help future queries to be faster, I think.

So I think git should check the committer date (ie verify that the
commitetr date of a new commit is always equal to or more recent than
all the parents).

But we should probably allow some slop for when machines are out of
sync (think build farms etc automation that may do automated commits,
but the clocks on different machines may be off by a few seconds). We
already do things like that in some of the use of committer dates -
see for example CUTOFF_DATE_SLOP)

And we should probably allow the user to override the refusal to
create new commits with bogus dates - think importing repositories etc
where you may have reasons to just say "that's just how it was".

And it will take years for that to percolate out to all users, so it's
not like it will fix things immediately, but then some time from now,
we can consider commit dates to be more reliably generation numbers.

I do think that it's ok to cache generation numbers somewhere if there
is an algorithm that can make use of them, but every time this comes
up, it's just not been important enough to make a big deal and a new
incompatible object format for it. The committer date is preexisting
and has existing pseudo-generation-number usage, so..improving on the
quality of it sounds like a good idea.

The first step should probably be to make fsck warn about the existing
cases of "commit has older date than parents". Something like the
attached patch, perhaps?

  Linus
 fsck.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/fsck.c b/fsck.c
index 05315451c56f..722b65371d38 100644
--- a/fsck.c
+++ b/fsck.c
@@ -60,6 +60,7 @@
FUNC(NULL_SHA1, WARN) \
FUNC(ZERO_PADDED_FILEMODE, WARN) \
FUNC(NUL_IN_COMMIT, WARN) \
+   FUNC(DATE_ORDERING, WARN) \
/* infos (reported as warnings, but ignored by default) */ \
FUNC(BAD_TAG_NAME, INFO) \
FUNC(MISSING_TAGGER_ENTRY, INFO)
@@ -604,6 +605,17 @@ static int fsck_ident(const char **ident, struct object 
*obj, struct fsck_option
return 0;
 }
 
+static void fsck_commit_date(struct fsck_options *options, struct commit 
*commit)
+{
+   struct commit_list *p;
+
+   for (p = commit->parents; p; p = p->next) {
+   struct commit *parent = p->item;
+   if (commit->date < parent->date)
+   report(options, >object, 
FSCK_MSG_DATE_ORDERING, "Bad commit date ordering with parent");
+   }
+}
+
 static int fsck_commit_buffer(struct commit *commit, const char *buffer,
unsigned long size, struct fsck_options *options)
 {
@@ -650,6 +662,7 @@ static int fsck_commit_buffer(struct commit *commit, const 
char *buffer,
return err;
}
}
+   fsck_commit_date(options, commit);
author_count = 0;
while (skip_prefix(buffer, "author ", )) {
author_count++;


Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()

2016-06-30 Thread Ben Peart
David Turner  novalis.org> writes:

> 
> Hiding watchman behind index-helper means you need both daemons. You
> can't run watchman alone. Not so good. But on the other hand, 'git'
> binary is not linked to watchman/json libraries, which is good for
> packaging. Core git package will run fine without watchman-related
> packages. If they need watchman, they can install git-index-helper and
> dependencies.
> 

Have you considered splitting index-helper and watchman apart?  Using 
Watchman to not lstat unchanged entries is a huge perf win with very 
large repos.

It would also be interesting to make the Watchman backend replaceable by
using an extensible API.  This has the benefit of not having to link the
'git' binary to the watchman/json libraries.  Is there any pattern 
already in git for accomplishing this?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-06-30 Thread Torsten Bögershausen
On 29.06.16 18:14, Junio C Hamano wrote:
> tbo...@web.de writes:
> 
>> From: Torsten Bögershausen 
>>
>> The following didn't work as expected:
> 
> Sorry for being slow (not in response but in understanding), but
> let's examine the expectation first.

Thanks for the patience.
There is one detail missing in my description:
The check if a file in the working tree is clean:

static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
{
int match = -1;
int fd = open(ce->name, O_RDONLY);

if (fd >= 0) {
unsigned char sha1[20];
unsigned flags = HASH_USE_SHA_NOT_PATH;
--
Remove the "new feature":

git diff  read-cache.c
diff --git a/read-cache.c b/read-cache.c
index dd98b36..1f951ea 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -170,7 +170,7 @@ static int ce_compare_data(const struct cache_entry *ce, 
struct stat *st)
 
if (fd >= 0) {
unsigned char sha1[20];
-   unsigned flags = HASH_USE_SHA_NOT_PATH;
+   unsigned flags = 0;//HASH_USE_SHA_NOT_PATH;
memcpy(sha1, ce->sha1, sizeof(sha1));
if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
match = hashcmp(sha1, ce->sha1);

The the problem can be re-produced, and debugged with help of t6038:

not ok 5 - Merge addition of text=auto eol=CRLF


> 
>>  - In a middle of a merge
>>  - merge.renormalize is true,
> 
> gitattributes(5) tells us that this would make a "virtual check-out
> and check-in of all stages", i.e. each of the three blob objects
> involved is run through convert_to_working_tree(), and the result is
> run through convert_to_git().  Now, what are these two operations
> told to do?
> 
>>  - .gitattributes = "* text=auto"
>>  - core.eol = crlf
> 
> git-config(1) tells us that a text file will be checked out with
> CRLF line endings, so convert_to_working_tree() would work that
> way.  Even though core.eol entry in that manual page does not tell
> us anything, presumably convert_to_git() is expected to turn it back
> to LF line endings.
> 
Yes, the git-config(1) may need improvements, but that can go as a later path.

>> Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
>> with LF "first line\nsame line\n".
I may have written:
Merge a blob with CRLF "first line\r\nsame line\r\n" 
(which is checked out as
"first line\r\nsame line\r\n" 
in the working tree, and therefore clean)
and a blob with LF "first line\nsame line\n".

> 
> The former, when renormalized, should (we are discussing the
> expectation, not necessarily what is done by today's code) be
> processed like this:
> 
>  * Pretend as if blob "first line\r\nsame line\r\n" is in the stage
>#0 of the index at the path;
>  * Pretend as if the user said "git checkout" that path;
>  * Pretend as if the user then said "git add" that path.
> 
> The checkout process hopefully does not blindly turn LF into CRLF,
> making it "first line \r\r\nsame line\r\rn".  Instead the (virtual)
> working tree file will have "first line\r\nsame line\r\n".
Yes
> 
> Then "git add" should turn that CRLF into LF, which would give us
> "first line\nsame line\n", but because the "safer autocrlf" rule
> prevents it from happening.  The (real) index already has CR in it
> in the blob in stage #2, so the check-in process would (this is not
> describing the ideal, but what is done by today's code) disable
> CRLF->LF conversion.
> 
> Is that the problem you are trying to solve?
Not sure, if that problem isn't already solved.

> 
> If that is the case, I do not see how "don't use stage #2 of the
> real index; use the blob being renormalized instead" would help.
> The blob being renormalized may have CR in it, triggering that
> "safer autocrlf" rule and cause you the same trouble, no?
> 
> To me, it appears that if you consider that the "safer autocrlf" is
> a sensible thing, you _must_ expect that the renormalization would
> not work at all, in the scenario you presented.  Also,
> 
>> The expected result of the merge is "first line\nsame line\n".
> 
> if you expect this, to me, it appears that you need to reject the
> "safer autocrlf", at least during renormalization, as a sensible
> thing.  And if you _are_ to disable the "safer autocrlf" thing, then
> it does not matter what is currently in the index--the conversion
> can happen solely based on the data being converted and the
> configuration and attribute settings.
> 
> So I still do not see why you want to pass "no do not use stage #0
> or stage #2; use this blob instead".  Shouldn't you be just passing
> a bit "don't do the safer autocrlf thing in this convert_to_git()
> call" from renormalization codepath without doing anything else?
> 

This is my understanding:
- git checkout and git add are working as expected:
  LF in blob gives CRLF in the working tree at checkout (if attr says "auto")
  CRLF in blob gives CRLF in the working tree at checkout.
  

[PATCH v2] t5541: become resilient to GETTEXT_POISON

2016-06-30 Thread Vasco Almeida
Use test_i18n* functions for testing text already marked for
translation.

Signed-off-by: Vasco Almeida 
---
Fix typo on v1.
Forgot to mention that tests with TTY prerequisite were skipped, I don't
know how to run them.

Notes:
Incremental update for va/i18n-even-more (merged to 'next' on 2016-06-28 at
5919dfa).

I don't know how I didn't catch this one.

 t/t5541-http-push-smart.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index fd7d06b..ca6becf 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -119,7 +119,7 @@ test_expect_success 'rejected update prints status' '
git commit -m dev2 &&
test_must_fail git push origin dev2 2>act &&
sed -e "/^remote: /s/ *$//" cmp &&
-   test_cmp exp cmp
+   test_i18ncmp exp cmp
 '
 rm -f "$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update"
 
@@ -219,7 +219,7 @@ test_expect_success TTY 'push shows progress when stderr is 
a tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit noisy &&
test_terminal git push >output 2>&1 &&
-   grep "^Writing objects" output
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success TTY 'push --quiet silences status and progress' '
@@ -233,16 +233,16 @@ test_expect_success TTY 'push --no-progress silences 
progress but not status' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit no-progress &&
test_terminal git push --no-progress >output 2>&1 &&
-   grep "^To http" output &&
-   ! grep "^Writing objects"
+   test_i18ngrep "^To http" output &&
+   test_i18ngrep ! "^Writing objects"
 '
 
 test_expect_success 'push --progress shows progress to non-tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit progress &&
git push --progress >output 2>&1 &&
-   grep "^To http" output &&
-   grep "^Writing objects" output
+   test_i18ngrep "^To http" output &&
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success 'http push gives sane defaults to reflog' '
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] log: decorate HEAD -> branch with the same color for branch and HEAD

2016-06-30 Thread Duy Nguyen
On Thu, Jun 30, 2016 at 6:39 PM, Nguyễn Thái Ngọc Duy  wrote:
> Commit 76c61fb (log: decorate HEAD with branch name under
> --decorate=full, too - 2015-05-13)

.. and I got the commit wrong. It should be 51ff0f2 (log: decorate
HEAD with branch name - 2015-03-10)

adds "HEAD -> branch" decoration to
> show current branch vs detached HEAD. The sign of whether HEAD is
> detached or not is "->" (vs ", "). It's too subtle for my poor
> eyes. If color is used, we can make the branch name's color the same
> as HEAD to visually emphasize that it's the current branch.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] log: decorate HEAD -> branch with the same color for branch and HEAD

2016-06-30 Thread Nguyễn Thái Ngọc Duy
Commit 76c61fb (log: decorate HEAD with branch name under
--decorate=full, too - 2015-05-13) adds "HEAD -> branch" decoration to
show current branch vs detached HEAD. The sign of whether HEAD is
detached or not is "->" (vs ", "). It's too subtle for my poor
eyes. If color is used, we can make the branch name's color the same
as HEAD to visually emphasize that it's the current branch.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 For the original discussion of 76c61fb see
 http://thread.gmane.org/gmane.comp.version-control.git/263922

 log-tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/log-tree.c b/log-tree.c
index 78a5381..440e7cc 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -267,7 +267,7 @@ void format_decorations_extended(struct strbuf *sb,
strbuf_addstr(sb, color_commit);
strbuf_addstr(sb, " -> ");
strbuf_addstr(sb, color_reset);
-   strbuf_addstr(sb, decorate_get_color(use_color, 
current_and_HEAD->type));
+   strbuf_addstr(sb, decorate_get_color(use_color, 
DECORATION_REF_HEAD));
show_name(sb, current_and_HEAD);
}
strbuf_addstr(sb, color_reset);
-- 
2.8.2.531.gd073806

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] t5541: become resilient to GETTEXT_POISON

2016-06-30 Thread Vasco Almeida
Use test_i18n* functions for testing text already marked for
translation.

Signed-off-by: Vasco Almeida 
---
Incremental update for va/i18n-even-more (merged to 'next' on 2016-06-28 at
5919dfa).

I don't know how I didn't catch this one.

 t/t5541-http-push-smart.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index fd7d06b..97748af 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -119,7 +119,7 @@ test_expect_success 'rejected update prints status' '
git commit -m dev2 &&
test_must_fail git push origin dev2 2>act &&
sed -e "/^remote: /s/ *$//" cmp &&
-   test_cmp exp cmp
+   test_i18ncmp exp cmp
 '
 rm -f "$HTTPD_DOCUMENT_ROOT_PATH/test_repo.git/hooks/update"
 
@@ -219,7 +219,7 @@ test_expect_success TTY 'push shows progress when stderr is 
a tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit noisy &&
test_terminal git push >output 2>&1 &&
-   grep "^Writing objects" output
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success TTY 'push --quiet silences status and progress' '
@@ -233,16 +233,16 @@ test_expect_success TTY 'push --no-progress silences 
progress but not status' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit no-progress &&
test_terminal git push --no-progress >output 2>&1 &&
-   grep "^To http" output &&
-   ! grep "^Writing objects"
+   test_i18ngrep ! "^To http" output &&
+   test_i18ngrep ! "^Writing objects"
 '
 
 test_expect_success 'push --progress shows progress to non-tty' '
cd "$ROOT_PATH"/test_repo_clone &&
test_commit progress &&
git push --progress >output 2>&1 &&
-   grep "^To http" output &&
-   grep "^Writing objects" output
+   test_i18ngrep "^To http" output &&
+   test_i18ngrep "^Writing objects" output
 '
 
 test_expect_success 'http push gives sane defaults to reflog' '
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/5] Better ref summary alignment in "git fetch"

2016-06-30 Thread Duy Nguyen
On Mon, Jun 27, 2016 at 8:43 PM, Jeff King  wrote:
> I tried it on my most-horrible example case, and the results were...just
> OK. Because the variable-length part of each line comes first, the
> alignment code means that the "origin/$" bit of every line gets bumped
> out. And if you have a single large branch name, then everybody gets
> bumped out very far, even to the point of wrapping. E.g., I get
> something like (with fetch.output=compact, obviously):
>
>   From ...
>* [new branch]  branch1  -> origin/$
>* [new branch]  branch2  -> origin/$
>* [new branch]  some-really-long-branch-name -> origin/$
>+ 1234abc..5678def  branch3  -> origin/$ (forced
> update)
>* [new branch]  branch4  -> origin/$
>
> I've shrunk it a bit to fit in the email; my actual "long" name was much
> larger. And the average length for the shorter ones is, too, but the
> overall effect is the same; almost every line has a huge run of
> whitespace. And some lines wrap that would not have even under the
> normal, duplicated scheme.

I was about to resend with s/\$/*/g and ignored this issue (with a
note) then it occurred to me we can simply ignore these long lines
from column width calculation. Yeah such a line may still be wrapped
around, but it will not push every other line to the far right. We
already have code for that in adjust_refcol_width()

max= term_columns();
...
/*
* rough estimation to see if the output line is too long and
* should not be counted (we can't do precise calculation
* anyway because we don't know if the error explanation part
* will be printed in update_local_ref)
*/
if (21 /* flag and summary */ + rlen + 4 /* -> */ + llen >= max)
return;
...

we can limit max to, like, term_columns() / 2 (or 2/3 of
term_columns). There's no perfect number, some people will still find
the output ugly _often_. But hopefully the majority won't. What do you
think?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] fixup! worktree: add "lock" command

2016-06-30 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Torsten, this seems to fix the symlink problem for me. How many times
 have I got similar reports from you and still managed to forget ...

 t/t2028-worktree-move.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t2028-worktree-move.sh b/t/t2028-worktree-move.sh
index 68d3fe8..8298aaf 100755
--- a/t/t2028-worktree-move.sh
+++ b/t/t2028-worktree-move.sh
@@ -9,8 +9,8 @@ test_expect_success 'setup' '
git worktree add source &&
git worktree list --porcelain | grep "^worktree" >actual &&
cat <<-EOF >expected &&
-   worktree $TRASH_DIRECTORY
-   worktree $TRASH_DIRECTORY/source
+   worktree $(pwd)
+   worktree $(pwd)/source
EOF
test_cmp expected actual
 '
-- 
2.8.2.531.gd073806

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/12] nd/icase updates

2016-06-30 Thread Duy Nguyen
On Mon, Jun 27, 2016 at 4:53 PM, Junio C Hamano  wrote:
>> diff --git a/grep.c b/grep.c
>> index cb058a5..92587a8 100644
>> --- a/grep.c
>> +++ b/grep.c
>> @@ -432,15 +432,8 @@ static void compile_regexp(struct grep_pat *p, struct 
>> grep_opt *opt)
>>   icase  = opt->regflags & REG_ICASE || p->ignore_case;
>>   ascii_only = !has_non_ascii(p->pattern);
>>
>> + if (opt->fixed || is_fixed(p->pattern, p->patternlen))
>>   p->fixed = !icase || ascii_only;
>>   else
>>   p->fixed = 0;
>>
>> @@ -449,6 +442,9 @@ static void compile_regexp(struct grep_pat *p, struct 
>> grep_opt *opt)
>>   kwsincr(p->kws, p->pattern, p->patternlen);
>>   kwsprep(p->kws);
>>   return;
>> + } else if (opt->fixed) {
>> + compile_fixed_regexp(p, opt);
>> + return;
>>   }
>
> This if/elseif/else cascade made a lot simpler and while the
> discussion is fresh in my brain it makes sense, but it may deserve a
> bit of commenting.
>
> And while attempting to do so, I found one possible issue there.
>
> Can't p->ignore_case be true even when opt->regflags does not have
> REG_ICASE?  The user never asked us to do a regexp match in such a
> case, and the logical place to compensate for that case would be
> inside compile_fixed_regexp(), where we use regexp engine behind
> user's back for our convenience, I would think.
>
> In the current code, compile_fixed_regexp() is only called when we
> want ICASE, but hardcoding that assumption to it unnecessarily robs
> flexibility (and the function name does not tell us it is only for
> icase in the first place), so I taught it to do the REG_ICASE thing
> only when opt->ignore_case is set.
>
> How does this look?
>
>
>  grep.c | 23 ++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/grep.c b/grep.c
> index 92587a8..3a3a9f4 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -407,17 +407,21 @@ static int is_fixed(const char *s, size_t len)
>  static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt)
>  {
> struct strbuf sb = STRBUF_INIT;
> int err;
> +   int regflags;
>
> basic_regex_quote_buf(, p->pattern);
> -   err = regcomp(>regexp, sb.buf, opt->regflags & ~REG_EXTENDED);
> +   regflags = opt->regflags & ~REG_EXTENDED;
> +   if (opt->ignore_case)
> +   regflags |= REG_ICASE;
> +   err = regcomp(>regexp, sb.buf, regflags);

Makes sense. But then if opt->ignore_case is false and regflags
happens to have REG_ICASE set, should we clear it as well?

The rest looks good (after your comment fixup). I see you already have
all the changes in your SQUASH??? commit. Do you want me to resend or
you will just squash this in locally?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFD] Place to document magic pathspecs like ":/" and pathspec handling

2016-06-30 Thread Duy Nguyen
Hi Jakub,

Where have you been all these years? :D

On Thu, Jun 30, 2016 at 11:42 AM, Jakub Narębski  wrote:
> Nowadays we have gitcli(7) manual page, but perhaps
> it would be better to create a separate manpage for issues related
> to pathspec handling (of which ":/" is only one part)... but then
> what should it be named?
>
> So we could describe how Git handles pathspecs and pathspec magic
> in the new manual page named gitpathspec(7), or gitpaths(7). The
> former has the advantage of the name being identical to the entry
> in gitglossary(7). The latter has the probable advantage of being
> easier for the Git novice to find,

git-pathspec(7) is a great idea. It bugs me that all the pathspec
details are hidden in that glossary file (though I didn't know it was
also available via "git help glossary", which probably just reinforces
its invisibility). The closet place I could think of was git(1) but
that page is already very long.

> and that it could be used to
> gather various ways to generate list of files in Git (files in
> the working area, files in the staging area aka the index, files
> in the revision / tree object, changed files, etc.);

You mean the list of commands in this man page? OK. Another thing we
could do is the reverse direction, add gitpathspec(7) as a reference
to all commands that may need it.

> the pathspec
> in strict sense is about the input. Well, we could have 'manpage
> alias' of gitpaths to gitpathspec, or vice versa.

>  Sidenote: I wonder if people (especially novices) have problem
>  finding relevant documentation, and if adding something like
>  "git apropos " command ("apropos", or "man -k"), or
>  add the '--apropos' option to "git help" would be useful...
>  and if it would be easy to create.

I have that problem sometimes and I don't think I can call myself a
noob anymore. I usually need to do some grepping. So yeah "git
apropos" would be great.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v13 04/20] index-helper: new daemon for caching index and related stuff

2016-06-30 Thread Duy Nguyen
On Thu, Jun 30, 2016 at 3:06 PM, Johannes Schindelin
 wrote:
> Even when NO_MMAP is empty, there might be no Unix sockets available (such
> as is the case on Windows). In any case, you really only want to skip
> these tests when index-helper is not available, so would you mind
> squashing this patch in?
>
> -- snipsnap --
> From: Johannes Schindelin 
> Subject: [PATCH] fixup! index-helper: new daemon for caching index and related
>  stuff
>
> ---
>  t/t7900-index-helper.sh | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/t/t7900-index-helper.sh b/t/t7900-index-helper.sh
> index 6c9b4dd..12b5bf7 100755
> --- a/t/t7900-index-helper.sh
> +++ b/t/t7900-index-helper.sh
> @@ -10,8 +10,10 @@ Testing git index-helper
>
>  . ./test-lib.sh
>
> -test -n "$NO_MMAP" && {
> -   skip_all='skipping index-helper tests: no mmap'
> +git index-helper -h 2>/dev/null
> +test $? = 129 ||

So when NO_MMAP is set, "git index-helper -h" will set $? to 1. And we
correctly skip the tests as well. It's a bit subtle though. How about
"git help -a|grep index-helper"?

> +{
> +   skip_all='skipping index-helper tests: no index-helper executable'
> test_done
>  }
>
> --
> 2.9.0.270.g810e421
>



-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Warm greatings

2016-06-30 Thread NASOLY STORE
Greetings from NASOLY STORE Supply , We are keen to placing an order for so
me items . Kindly let me know the kinds of credit card you take for payment
 .

Will be waiting to read back from you via : allenlescot...@gmail.com
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] diff compaction heuristic: favor shortest neighboring blank lines

2016-06-30 Thread Michael Haggerty
On 06/23/2016 07:10 PM, Michael Haggerty wrote:
> On 06/17/2016 06:09 PM, Stefan Beller wrote:
>> I think before spending more time on discussing and implementing new
>> (hopefully better) heuristics, I'd want to step back and try to be a bit more
>> systematic, i.e. I'll want to collect lots of test cases in different 
>> languages
>> and use cases. A mini test suite, maybe?
> 
> Stefan,
> 
> I've also been playing around with diff heuristics and also trying to
> find some good test data. Have you put together some test cases yet?

I put quite of work into tooling to evaluate diff heuristics, and just
published it on GitHub:

https://github.com/mhagger/diff-slider-tools

The README there is hopefully enough to let people get started using it
to test their own favorite heuristics.

!!!
Please be careful! The scripts are not hardened against strangely-named
files!
!!!

I've run a bunch of tests of the heuristic that I described in the
previous email against a bunch of repositories (to try to get a sampling
of popular languages): ant, bugzilla, couchdb, docker, git, ipython,
junit, nodejs, phpmyadmin, test-more, test-unit, xmonad.

I haven't done a lot of systematic analysis yet, nor tried to optimize
the heuristic systematically, but by eye the results look very good.
Some constructs (C preprocessor directives, Shell here documents)
sometimes confuse it because of their weird indentation, and of course
code with inconsistent indentation or strangely-placed blank lines
doesn't get the best results. But in a large majority of cases where it
disagrees with the standard Git or with `git --compaction-heuristic`, it
does a better job.

Below are a few examples to show off some typical results. I promise I
tried not to cherry pick only flattering examples.

In the output below, `|` mark the highest and lowest possible
positioning of the "slider" (a group of added or deleted lines whose
exact identity is ambiguous). Current Git, by default, always chooses
the lowest positioning. "c" is where Git master with
`--compaction-heuristic` chooses to position the slider, and `i` is the
choice of my indentation-based algorithm. The numbers are just the shift
relative to the standard shift.

Michael

The algorithm does pretty well with XML and HTML:

> fef3ea39f8bd474add292bb6437df6cbd22e1ba7:contributors.xml 
> a394a0bdf8e6240dc09023a8260059c57c158a00:contributors.xml + 1624
> 
>>Toni
>>  
>>  
>>Vincent
>>Legoll
>   -2 | >  
>   -1 |   i >  
>0 || ci >Vincent
>  || ci >Privat
>   | ci >  
>   | c  >  
>>Vimil
>>Saju
>>  
>>  
>>Vitold
>>Sedyshev
> 

> 789422e131b6c2c003d94f394169a64297e986c6:manual/Tasks/signjar.html 
> 7f51882300a8e40ff675867e055061867ba6c8bd:manual/Tasks/signjar.html + 151
> 
>>  
>>  
>>tsacert
>>alias in the keystore for a timestamp 
> authority for
>>timestamped JAR files in Java1.5+
>   -3 | >No
>   -2 | >  
>   -1 |   i >  
>0 || ci >tsaproxyhost
>  || ci >proxy host to be used when connecting to 
> TSA server
>  || ci >No
>  || ci >  
>  || ci >  
>  || ci >getTsaproxyport
>  || ci >proxy port to be used when connecting to 
> TSA server
>   | ci >No
>   | ci >  
>   | c  >  
>>executable
>>Specify a particular 
> jarsigner executable
>>  to use in place of the default binary (found in the 
> same JDK as
>>  Apache Ant is running in).
>>  Must support the same command line options as the Sun 
> JDK
>>  jarsigner command.
> 

Java:

> de5b4058b9bb039cdb17082fc543098de598ece2:src/main/org/apache/tools/ant/taskdefs/optional/ssh/AbstractSshMessage.java
>  
> 3fe363e2120342f6de2b219c3bd74efe9aea2803:src/main/org/apache/tools/ant/taskdefs/optional/ssh/AbstractSshMessage.java
>  + 208
> 
>> * @return true if the verbose attribute is set
>> * @since Ant 1.6.2
>> */
>>protected final boolean getVerbose() {
>>return verbose;
>   -3 | >}
>   -2 | >
>   -1 |  ci >/**
>0 || ci > * Is the compressed attribute set.
>  || ci > * @return true if the compressed attribute is set
>  

Re: Problem with --shallow-submodules option

2016-06-30 Thread Istvan Zakar
Hello,

Thanks for your answers. I tested it after the changes were made on
the git server, and it seems to be working. But some other issue came
up.

We have quite many submodules in our project so I did some comaprision:

If I do a clone with these parameters:
--jobs 20 --recurse-submodules

The clone lasts ~53 seconds, and the total size of the folder is around 2 GB.

If I add the shallow-submodules option, the size of the folder will be
a bit below 1GB, so the size decreased as I expected, but the time of
the clone itself increased to 90 seconds. It seems the last step of
the command, checking out the submodules is executed one-by-one, and
not in parallel, so it seems at this step the jobs parameter does not
have effect.

Is it intentional, or there is some option I missed?

I'm using git 2.9.0 on client side.

Thanks,
   Istvan

ps: if I update the submodules with --depth 1 parameter in parallel
using xargs it lasts about 18 seconds, so it's a workaround for this
issue, but it would be nice to do it with a single command.




On 22 June 2016 at 17:31, Fredrik Gustafsson  wrote:
> On Mon, Jun 20, 2016 at 01:06:39PM +, Istvan Zakar wrote:
>> I'm working on a relatively big project with many submodules. During
>> cloning for testing I tried to decrease the amount of data need to be
>> fetched from the server by using --shallow-submodules option in the clone
>> command. It seems to check out the tip of the remote repo, and if it's not
>> the commit registered in the superproject the submodule update fails
>> (obviously). Can I somehow tell to fetch that exact commit I need for my
>> superproject?
>
> Maybe. http://stackoverflow.com/questions/2144406/git-shallow-submodules
> gives a good overview of this problem.
>
> git fetches a branch and is shallow from that branch, which might be an
> other sha1 than the one the submodule points to, (as you say). This
> is/was one of the drawbacks with this method. However the since git 2.8,
> git will try to fetch the sha1 direct (and not the branch). So then it
> will work, if(!), the server supports direct access to sha1. This was
> previously not allowed due to security concerns (if I recall correctly).
>
> So the answer is, yes this will work if you've a recent version of git
> and support on the server side for doing this. Unfortunately I'm not
> sure which git version is needed on the server side for this to work.
>
> --
> Fredrik Gustafsson
>
> phone: +46 733-608274
> e-mail: iv...@iveqy.com
> website: http://www.iveqy.com
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v13 04/20] index-helper: new daemon for caching index and related stuff

2016-06-30 Thread Johannes Schindelin
Hi Dave,

On Sun, 26 Jun 2016, David Turner wrote:

> diff --git a/t/t7900-index-helper.sh b/t/t7900-index-helper.sh
> new file mode 100755
> index 000..114c112
> --- /dev/null
> +++ b/t/t7900-index-helper.sh
> @@ -0,0 +1,23 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2016, Twitter, Inc
> +#
> +
> +test_description='git-index-helper
> +
> +Testing git index-helper
> +'
> +
> +. ./test-lib.sh
> +
> +test -n "$NO_MMAP" && {
> + skip_all='skipping index-helper tests: no mmap'
> + test_done
> +}

Even when NO_MMAP is empty, there might be no Unix sockets available (such
as is the case on Windows). In any case, you really only want to skip
these tests when index-helper is not available, so would you mind
squashing this patch in?

-- snipsnap --
From: Johannes Schindelin 
Subject: [PATCH] fixup! index-helper: new daemon for caching index and related
 stuff

---
 t/t7900-index-helper.sh | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/t7900-index-helper.sh b/t/t7900-index-helper.sh
index 6c9b4dd..12b5bf7 100755
--- a/t/t7900-index-helper.sh
+++ b/t/t7900-index-helper.sh
@@ -10,8 +10,10 @@ Testing git index-helper
 
 . ./test-lib.sh
 
-test -n "$NO_MMAP" && {
-   skip_all='skipping index-helper tests: no mmap'
+git index-helper -h 2>/dev/null
+test $? = 129 ||
+{
+   skip_all='skipping index-helper tests: no index-helper executable'
test_done
 }
 
-- 
2.9.0.270.g810e421

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 1/3] t7810-grep.sh: fix duplicated test name

2016-06-30 Thread Charles Bailey
Signed-off-by: Charles Bailey 
---
 t/t7810-grep.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 1e72971..c4302ed 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -353,7 +353,7 @@ test_expect_success 'grep -l -C' '
 cat >expected 

[PATCH v3 0/3] Grepping with intent to add

2016-06-30 Thread Charles Bailey
So I've got back around to this topic again.

I've applied fixes to the tests as suggested by Eric and Junio.

I came up with a test case that demonstrates a difference between the
additional fix that Duy suggested and the alternative that Junio
suggested.

I've kept Duy's fix because I think it makes more sense, although it's
a sufficiently obscure case that I don't feel strongly that it's
definitely the best behavior.

The fix ensures that if you have a file which is both "intend to add"
and "assume unchanged" that it is not listed if you "grep -L" for for
something. In effect, we are applying the "contents indeterminate" state
of the index to the working tree file.

Charles Bailey (3):
  t7810-grep.sh: fix duplicated test name
  t7810-grep.sh: fix a whitespace inconsistency
  grep: fix grepping for "intent to add" files

 builtin/grep.c  |  4 ++--
 t/t7810-grep.sh | 62 +++--
 2 files changed, 62 insertions(+), 4 deletions(-)

-- 
2.8.2.311.gee88674

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 2/3] t7810-grep.sh: fix a whitespace inconsistency

2016-06-30 Thread Charles Bailey
Signed-off-by: Charles Bailey 
---
 t/t7810-grep.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index c4302ed..6e6eaa4 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -175,7 +175,7 @@ do
 
test_expect_success "grep -c $L (no /dev/null)" '
! git grep -c test $H | grep /dev/null
-'
+   '
 
test_expect_success "grep --max-depth -1 $L" '
{
-- 
2.8.2.311.gee88674

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 3/3] grep: fix grepping for "intent to add" files

2016-06-30 Thread Charles Bailey
From: Charles Bailey 

This reverts commit 4d5520053 (grep: make it clear i-t-a entries are
ignored, 2015-12-27) and adds an alternative fix to maintain the -L
--cached behavior.

4d5520053 caused 'git grep' to no longer find matches in new files in
the working tree where the corresponding index entry had the "intent to
add" bit set, despite the fact that these files are tracked.

The content in the index of a file for which the "intent to add" bit is
set is considered indeterminate and not empty. For most grep queries we
want these to behave the same, however for -L --cached (files without a
match) we don't want to respond positively for "intent to add" files as
their contents are indeterminate. This is in contrast to files with
empty contents in the index (no lines implies no matches for any grep
query expression) which should be reported in the output of a grep -L
--cached invocation.

Add tests to cover this case and a few related cases which previously
lacked coverage.

Helped-by: Nguyễn Thái Ngọc Duy 
Signed-off-by: Charles Bailey 
---
 builtin/grep.c  |  4 ++--
 t/t7810-grep.sh | 58 +
 2 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 462e607..ae73831 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -386,7 +386,7 @@ static int grep_cache(struct grep_opt *opt, const struct 
pathspec *pathspec, int
 
for (nr = 0; nr < active_nr; nr++) {
const struct cache_entry *ce = active_cache[nr];
-   if (!S_ISREG(ce->ce_mode) || ce_intent_to_add(ce))
+   if (!S_ISREG(ce->ce_mode))
continue;
if (!ce_path_match(ce, pathspec, NULL))
continue;
@@ -396,7 +396,7 @@ static int grep_cache(struct grep_opt *opt, const struct 
pathspec *pathspec, int
 * cache version instead
 */
if (cached || (ce->ce_flags & CE_VALID) || 
ce_skip_worktree(ce)) {
-   if (ce_stage(ce))
+   if (ce_stage(ce) || ce_intent_to_add(ce))
continue;
hit |= grep_sha1(opt, ce->sha1, ce->name, 0, ce->name);
}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 6e6eaa4..c18e954 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -1364,4 +1364,62 @@ test_expect_success 'grep --color -e A --and -e B -p 
with context' '
test_cmp expected actual
 '
 
+test_expect_success 'grep can find things only in the work tree' '
+   : >work-tree-only &&
+   git add work-tree-only &&
+   test_when_finished "git rm -f work-tree-only" &&
+   echo "find in work tree" >work-tree-only &&
+   git grep --quiet "find in work tree" &&
+   test_must_fail git grep --quiet --cached "find in work tree" &&
+   test_must_fail git grep --quiet "find in work tree" HEAD
+'
+
+test_expect_success 'grep can find things only in the work tree (i-t-a)' '
+   echo "intend to add this" >intend-to-add &&
+   git add -N intend-to-add &&
+   test_when_finished "git rm -f intend-to-add" &&
+   git grep --quiet "intend to add this" &&
+   test_must_fail git grep --quiet --cached "intend to add this" &&
+   test_must_fail git grep --quiet "intend to add this" HEAD
+'
+
+test_expect_success 'grep does not search work tree with assume unchanged' '
+   echo "intend to add this" >intend-to-add &&
+   git add -N intend-to-add &&
+   git update-index --assume-unchanged intend-to-add &&
+   test_when_finished "git rm -f intend-to-add" &&
+   test_must_fail git grep --quiet "intend to add this" &&
+   test_must_fail git grep --quiet --cached "intend to add this" &&
+   test_must_fail git grep --quiet "intend to add this" HEAD
+'
+
+test_expect_success 'grep can find things only in the index' '
+   echo "only in the index" >cache-this &&
+   git add cache-this &&
+   rm cache-this &&
+   test_when_finished "git rm --cached cache-this" &&
+   test_must_fail git grep --quiet "only in the index" &&
+   git grep --quiet --cached "only in the index" &&
+   test_must_fail git grep --quiet "only in the index" HEAD
+'
+
+test_expect_success 'grep does not report i-t-a with -L --cached' '
+   echo "intend to add this" >intend-to-add &&
+   git add -N intend-to-add &&
+   test_when_finished "git rm -f intend-to-add" &&
+   git ls-files | grep -v "^intend-to-add\$" >expected &&
+   git grep -L --cached "nonexistent_string" >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success 'grep does not report i-t-a and assume unchanged with -L' '
+   echo "intend to add this" >intend-to-add-assume-unchanged &&
+   git add -N intend-to-add-assume-unchanged &&
+   test_when_finished "git rm -f intend-to-add-assume-unchanged" &&

Re: git tag --contains for cherry-picks

2016-06-30 Thread Jakub Narębski
W dniu 2016-06-30 o 08:22, Jeff King pisze:
> On Wed, Jun 29, 2016 at 12:48:33PM +0100, Laszlo Papp wrote:
> 
>> Old releases are maintained with important bug fixes or even new features
>> in our case. It sometimes means that we need to cherry-pick commits across
>> branches, like from master to a specific release branch.
>>
>> Cherry-picking changes the hash of the commit, therefore, this may no
>> longer work for cherry-picks:
>>
>> git tag --contains
>>
>> I am thinking of having something like:
>>
>> git tag --contains-follow
>>
>> which would follow cherry-picks. I am not sure how easily and/or
>> efficiently this can be implemented, but my gut feeling is that in the vast
>> majority of the cases, the content check would bail out already at the
>> "subject line".
> 
> Git generally considers commits "equivalent" based on the patch-id, whic
> his a sha1 of the diff (modulo some canonicalization).

The problem with patch based equivalence is that for cherry-picking
on old release branches you might need to modify a patch for it to
apply - and then patch-id might not detect it.

[...]
> Of course there are other ways of determining commit equivalence. You
> could find ones with duplicate commit messages, or duplicate subjects,
> or whatever. But if you have a cherry-picking workflow, I suspect the
> easiest thing may be to simply use "git cherry-pick -x", which will
> write the sha1 of the original commit into the cherry-picked commit
> message. You can then use that to correlate directly.

"git log --grep=" would help there (or even match the specific
message that "git cherry-pick -x" adds).

-- 
Jakub Narębski

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: topological index field for commit objects

2016-06-30 Thread Jakub Narębski
W dniu 2016-06-30 o 00:00, Jeff King pisze:
> On Wed, Jun 29, 2016 at 11:49:35PM +0200, Jakub Narębski wrote:
> 
>>> So this is the ideal case for generation numbers (the worst cases are
>>> when the things you are looking for are in branchy, close history where
>>> the generation numbers don't tell you much; but in such cases the
>>> walking is usually not too bad).
>>
>> There are other approaches (special indices) that help reachability
>> queries beside "generation number".
> 
> Yes, though generation numbers can help with more questions (e.g., "what
> is the merge base").

Well, those special indices usually need generation number anyway. For
example FELINE indices^1 (from "Reachability Queries in Very Large Graphs",
http://openproceedings.org/EDBT/2014/paper_166.pdf, that I found when
trying to find more about compressed bitmap indices used by Git) also
utilize generation numbers to speed up reachability analysis. The idea
behind FELINE is to associate every vertex (commit) with a unique ordered
pair of natural integers (i.e. two more numbers, in addition to the level
aka generation number), so that partial order over N^2 is superset of
partial order of graph (DAG of revisions). It can answer in O(1) that
nodes are not connected, and cut search space if they are.

BTW. some references in this paper ("Related work" section) use PWAH
compression scheme - perhaps it would work with EWAH that Git uses?

1. FELINE = Fast rEfined onLINE search ... or fun with acronyms

>> By the way, what should happen if you add a replacement (in the git-replace
>> meaning) that creates a shortcut, therefore invalidating generation numbers,
>> at least in strict sense - committerdate as generation number would be still
>> good, I think?
> 
> This is one of the open questions. My older patches turned them off when
> replacements and grafts are in effect.

Well, if you store the cache of generation numbers in the packfile, or in
the index of the packfile, or in the bitmap file, or in separate bitmap-like
file, generating them on repack, then of course any grafts or replacements
invalidate them... though for low level commands (like object counting)
replacements are transparent -- or rather they are (and can be) treated as
any other ref for reachability analysis.

Well, if there are no grafts, you could still use them for doing
"git --no-replace-objects log ...", isn't it?

SIDENOTE: should grafts be deprecated and later removed, now that Git has
superior alternative in the form of git-replace, and a simple way to convert
grafts to replacements?

Anyway, if file with mapping from revisions to its generation numbers
were stored outside of packfiles, it could simply be updated / rewritten
when adding new replacement. You could use one cache for no-replace,
and one for replace. Though I do wonder if it would be possible to do
such rewrite fast... well, fast enough assuming that adding replacements
is rare.

Answering my own question:
>> committerdate as generation number would be still good, I think?
No, it wouldn't:

  pre replace:

1 <-- 2 <-- 3 <-- 5 <--
   \
\- 4 <-- 6 <-- 7 <-- 8

  post replace

   1 <-- 2  \-- 3' <-- 5
  \  \
   \-- 4 <-- 6 <-- 7 <-- 8 <--\

Either the replacement commit would have generation number correct, but
its child wouldn't, or its child would have correct generation number but
the replacement wouldn't.
 
>>> I have patches that generate and store the numbers at pack time, similar
>>> to the way we do the reachability bitmaps.

Ah, so those cached generation numbers are generated and stored at pack
time. Where you store them: is it a separate file? Bitmap file? Packfile?

>>>They're not production ready,
>>> but they could probably be made so without too much effort. You wouldn't
>>> have ready-made generation numbers for commits since the last full
>>> repack, but you can compute them incrementally based on what you do have
>>> at a cost linear to the unpacked commits (this is the same for bitmaps).
>>
>> Do Git use EWAH / EWOK bitmaps for reachability analysis, or is it still
>> limited to object counting?
> 
> At GitHub we are using them for --contains analysis, along with mass
> ahead/behind (e.g., as in https://github.com/gitster/git/branches). My
> plan is to send patches upstream, but they need some cleanup first.

That would be nice to have, please.

Er, is mass ahead/behind something that can be plugged into Git
(e.g. for "git branch -v -v"), or is it something GitHub-specific?


P.S. Having Git ensure that committerdate (as an epoch) is greater
than committerdates of its parents at the commit creation time (with
providing warning about time skew, perhaps not doing it if skew is
too large) would be not costly. No need to add anything, and it would
help future queries to be faster, I think.

-- 
Jakub Narębski


--
To unsubscribe from this 

Re: [RFC/PATCH v2 00/10] Add initial experimental external ODB support

2016-06-30 Thread Christian Couder
On Wed, Jun 29, 2016 at 10:01 PM, Eric Wong  wrote:
> Christian Couder  wrote:
>> Design discussion about performance
>> ~~~
>>
>> Yeah, it is not efficient to fork/exec a command to just read or write
>> one object to or from the external ODB. Batch calls and/or using a
>> daemon and/or RPC should be used instead to be able to store regular
>> objects in an external ODB. But for now the external ODB would be all
>> about really big files, where the cost of a fork+exec should not
>> matter much. If we later want to extend usage of external ODBs, yeah
>> we will probably need to design other mechanisms.
>
> I would also investigate switching run_command to use vfork+exec
> or posix_spawn for performance (keeping in mind vfork
> caveats documented at https://ewontfix.com/7/ )

Thanks Eric for this idea and the test!
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8 33/41] write_or_die: use warning() instead of fprintf(stderr, ...)

2016-06-30 Thread Christian Couder
On Tue, Jun 28, 2016 at 11:39 PM, Junio C Hamano  wrote:
> Christian Couder  writes:
>
>> @@ -98,8 +97,7 @@ int write_or_whine_pipe(int fd, const void *buf, size_t 
>> count, const char *msg)
>>  int write_or_whine(int fd, const void *buf, size_t count, const char *msg)
>>  {
>>   if (write_in_full(fd, buf, count) < 0) {
>> - fprintf(stderr, "%s: write error (%s)\n",
>> - msg, strerror(errno));
>> + warning("%s: write error (%s)\n", msg, strerror(errno));
>>   return 0;
>>   }
>
> I do not think you call write_or_whine() at all.  As another topic
> in flight removes the last caller of this function, this hunk is
> very much unwelcome.  The only effect of it is to force me resolve
> unnecessary merge conflicts.

Ok, sorry about that. I will remove the hunk.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] builtin/apply: include declaration of cmd_apply()

2016-06-30 Thread Christian Couder
Hi Ramsay,

On Thu, Jun 30, 2016 at 3:47 AM, Ramsay Jones
 wrote:
>
> Hi Christian,
>
> If you need to re-roll your 'cc/apply-am' branch, could you please
> squash this into the relevant patch. Commit 95a3b0ba ("apply: move
> libified code from builtin/apply.c to apply.{c,h}", 22-04-2016)
> removed this '#include "builtin.h"' line, which causes sparse to
> issue a warning.

Ok, I will squash this.

Thanks,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFD] Place to document magic pathspecs like ":/" and pathspec handling

2016-06-30 Thread Jakub Narębski
W dniu 2016-06-29 o 23:28, Junio C Hamano pisze:
> Jakub Narębski  writes:
> 
>> But I think it is not the best place to keep this documentation.
> 
> All true.  In case it was not obvious, I didn't mean to say "Here
> you find the information, shut up."  It was "here is a pointer if
> you didn't find it, so that you can use it as a starting point to
> make a better documentation."
> 
> An analogous entity in the world model of Git that appears
> everywhere and is not limited to a single command is the revision
> set calculus.  Where do we describe it and how do we make it
> discoverable?  Can the new way to describe pathspec and pathspec
> magic mimic the way it is done for the revisions?

Is that a trick question? :-P

The revision set calculus is described as a standalone documentation
in the gitrevisions(7) manpage (titled "specifying revisions and
[revision] ranges for Git"... well, the "[revision]" isn't there).
This documentation is also accessible via `git help revisions`,
which is cool.  It is referenced in the "Symbolic Identifiers"
section of the git(1) manual page.

Long time ago the description of revision set calculus was hidden
in the manpage for the plumbing command git-rev-parse, and is still
included in it (the common part is, via "include::revisions.txt[]").

As I wrote:

 Nowadays we have gitcli(7) manual page, but perhaps
 it would be better to create a separate manpage for issues related
 to pathspec handling (of which ":/" is only one part)... but then
 what should it be named?

So we could describe how Git handles pathspecs and pathspec magic
in the new manual page named gitpathspec(7), or gitpaths(7). The
former has the advantage of the name being identical to the entry
in gitglossary(7). The latter has the probable advantage of being
easier for the Git novice to find, and that it could be used to
gather various ways to generate list of files in Git (files in
the working area, files in the staging area aka the index, files
in the revision / tree object, changed files, etc.); the pathspec
in strict sense is about the input. Well, we could have 'manpage
alias' of gitpaths to gitpathspec, or vice versa.

 Sidenote: I wonder if people (especially novices) have problem
 finding relevant documentation, and if adding something like 
 "git apropos " command ("apropos", or "man -k"), or
 add the '--apropos' option to "git help" would be useful...
 and if it would be easy to create.

The description from gitglossary would be a good start, as would
parts of gitcli and relevant RelNotes. It would need to be linked
from git(1) manpage, and probably also from gitcli.

Now only to start it...
-- 
Jakub Narębski

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Jeff King
On Thu, Jun 30, 2016 at 10:42:37AM +0200, Johannes Schindelin wrote:

> > > The vast majority of error messages in Git's source code which report a
> > > bug use the convention to prefix the message with "BUG:".
> > 
> > Good thing to do.
> > 
> > But if we were to review and apply a 200+ line patch, I wonder if we
> > want to go one step further to allow us to write
> > 
> > BUG("killed-file %s not found", name);
> > 
> > instead.
> 
> If the idea is to make it easier to find, I would wager a guess that
> 'die("BUG:' would be just as good a search term. Even better, I think,
> because 'BUG' would also match comments.

I have been tempted to switch to BUG(), because it would make it easy to
call abort() and get a coredump (and therefore a stack trace). On the
other hand:

  - we could always trigger such behavior in die() by looking for "BUG:" in
the output string. :)

  - it's also sometimes useful to get a stack trace from a regular
non-bug die(). So maybe something optional like:

  if (git_env_bool("GIT_ABORT_ON_DIE", 0))
  abort();

would be helpful (since you have to turn it on ahead of time, you
could also just run the program under gdb, of course; however, I
sometimes find that it's hard to get gdb where you want it because
git spawns so many sub-programs. Or maybe I just need to get better
at using gdb's child-following options).

The other thing BUG() would get us is that we could turn it into a macro
(on systems with vararg macros) and report things like __FILE__ and
__LINE__.  In practice, though our BUG messages are unique enough that
there is no problem finding the source.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-06-30 Thread Jeff King
The ustar format only has room for 11 (or 12, depending on
some implementations) octal digits for the size and mtime of
each file. For values larger than this, we have to add pax
extended headers to specify the real data, and git does not
yet know how to do so.

Before fixing that, let's start off with some test
infrastructure, as designing portable and efficient tests
for this is non-trivial.

We want to use the system tar to check our output (because
what we really care about is interoperability), but we can't
rely on it:

  1. being able to read pax headers

  2. being able to handle huge sizes or mtimes

  3. supporting a "t" format we can parse

So as a prerequisite, we can feed the system tar a reference
tarball to make sure it can handle these features. The
reference tar here was created with:

  dd if=/dev/zero seek=64G bs=1 count=1 of=huge
  touch -d @68719476737 huge
  tar cf - --format=pax |
  head -c 2048

using GNU tar. Note that this is not a complete tarfile, but
it's enough to contain the headers we want to examine.

Likewise, we need to convince git that it has a 64GB blob to
output. Running "git add" on that 64GB file takes many
minutes of CPU, and even compressed, the result is 64MB. So
again, I pre-generated that loose object, and then took only
the first 2k of it. That should be enough to generate 2MB of
data before hitting an inflate error, which is plenty for us
to generate the tar header (and then die of SIGPIPE while
streaming the rest out).

The tests are split so that we test as much as we can even
with an uncooperative system tar. This actually catches the
current breakage (which is that we die("BUG") trying to
write the ustar header) on every system, and then on systems
where we can, we go farther and actually verify the result.

Helped-by: Robin H. Johnson 
Signed-off-by: Jeff King 
---
 t/t5000-tar-tree.sh  |  74 +++
 t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a | Bin 0 -> 2048 bytes
 t/t5000/huge-and-future.tar  | Bin 0 -> 2048 bytes
 3 files changed, 74 insertions(+)
 create mode 100644 t/t5000/19f9c8273ec45a8938e6999cb59b3ff66739902a
 create mode 100644 t/t5000/huge-and-future.tar

diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 4b68bba..950bdd3 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -319,4 +319,78 @@ test_expect_success 'catch non-matching pathspec' '
test_must_fail git archive -v HEAD -- "*.abc" >/dev/null
 '
 
+# Pull the size and date of each entry in a tarfile using the system tar.
+#
+# We'll pull out only the year from the date; that avoids any question of
+# timezones impacting the result (as long as we keep our test times away from a
+# year boundary; our reference times are all in August).
+#
+# The output of tar_info is expected to be " ", both in decimal. It
+# ignores the return value of tar. We have to do this, because some of our test
+# input is only partial (the real data is 64GB in some cases).
+tar_info () {
+   "$TAR" tvf "$1" |
+   awk '{
+   split($4, date, "-")
+   print $3 " " date[1]
+   }'
+}
+
+# See if our system tar can handle a tar file with huge sizes and dates far in
+# the future, and that we can actually parse its output.
+#
+# The reference file was generated by GNU tar, and the magic time and size are
+# both octal 010001, which overflows normal ustar fields.
+test_lazy_prereq TAR_HUGE '
+   echo "68719476737 4147" >expect &&
+   tar_info "$TEST_DIRECTORY"/t5000/huge-and-future.tar >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'set up repository with huge blob' '
+   obj_d=19 &&
+   obj_f=f9c8273ec45a8938e6999cb59b3ff66739902a &&
+   obj=${obj_d}${obj_f} &&
+   mkdir -p .git/objects/$obj_d &&
+   cp "$TEST_DIRECTORY"/t5000/$obj .git/objects/$obj_d/$obj_f &&
+   rm -f .git/index &&
+   git update-index --add --cacheinfo 100644,$obj,huge &&
+   git commit -m huge
+'
+
+# We expect git to die with SIGPIPE here (otherwise we
+# would generate the whole 64GB).
+test_expect_failure 'generate tar with huge size' '
+   {
+   git archive HEAD
+   echo $? >exit-code
+   } | test_copy_bytes 4096 >huge.tar &&
+   echo 141 >expect &&
+   test_cmp expect exit-code
+'
+
+test_expect_failure TAR_HUGE 'system tar can read our huge size' '
+   echo 68719476737 >expect &&
+   tar_info huge.tar | cut -d" " -f1 >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'set up repository with far-future commit' '
+   rm -f .git/index &&
+   echo content >file &&
+   git add file &&
+   GIT_COMMITTER_DATE="@68719476737 +" \
+   git commit -m "tempori parendum"
+'
+
+test_expect_failure 'generate tar with future mtime' '
+   git archive HEAD >future.tar
+'
+
+test_expect_failure TAR_HUGE 'system tar can read our 

[PATCH v4 6/5] t5000: use test_match_signal

2016-06-30 Thread Jeff King
On Thu, Jun 30, 2016 at 05:06:14AM -0400, Jeff King wrote:

> The one thing that isn't fixed is the use of "141" to test for sigpipe
> death. That should use test_match_signal, but that topic just got
> re-rolled, too.

And here's what the patch for that looks like (which can be applied if
this topic and jk/test-match-signal are merged).

-- >8 --
Subject: t5000: use test_match_signal

We are testing for sigpipe death from git, and doing so
portably requires using our test helper.

Signed-off-by: Jeff King 
---

Of course this does not help Windows at all, because we removed the "3"
check from jk/test-match-signal. So this new test will probably need to
be dealt with by Windows folks, one way or another (either extending
test_match_signal, or just skipping the exit code check under the MINGW
prereq).

 t/t5000-tar-tree.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 96d208d..6950d7d 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -365,8 +365,7 @@ test_expect_success 'generate tar with huge size' '
git archive HEAD
echo $? >exit-code
} | test_copy_bytes 4096 >huge.tar &&
-   echo 141 >expect &&
-   test_cmp expect exit-code
+   test_match_signal 13 "$(cat exit-code)"
 '
 
 test_expect_success TAR_HUGE 'system tar can read our huge size' '
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 1/5] t9300: factor out portable "head -c" replacement

2016-06-30 Thread Jeff King
In shell scripts it is sometimes useful to be able to read
exactly N bytes from a pipe. Doing this portably turns out
to be surprisingly difficult.

We want a solution that:

  - is portable

  - never reads more than N bytes due to buffering (which
would mean those bytes are not available to the next
program to read from the same pipe)

  - handles partial reads by looping until N bytes or read
(or we see EOF)

  - is resilient to stray signals giving us EINTR while
trying to read (even though we don't send them, things
like SIGWINCH could cause apparently-random failures)

Some possible solutions are:

  - "head -c" is not portable, and implementations may
buffer (though GNU head does not)

  - "read -N" is a bash-ism, and thus not portable

  - "dd bs=$n count=1" does not handle partial reads. GNU dd
has iflags=fullblock, but that is not portable

  - "dd bs=1 count=$n" fixes the partial read problem (all
reads are 1-byte, so there can be no partial response).
It does make a lot of write() calls, but for our tests
that's unlikely to matter.  It's fairly portable. We
already use it in our tests, and it's unlikely that
implementations would screw up any of our criteria. The
most unknown one would be signal handling.

  - perl can do a sysread() loop pretty easily. On my Linux
system, at least, it seems to restart the read() call
automatically. If that turns out not to be portable,
though, it would be easy for us to handle it.

That makes the perl solution the least bad (because we
conveniently omitted "length of code" as a criterion).
It's also what t9300 is currently using, so we can just pull
the implementation from there.

Signed-off-by: Jeff King 
---
 t/t9300-fast-import.sh  | 23 +++
 t/test-lib-functions.sh | 14 ++
 2 files changed, 17 insertions(+), 20 deletions(-)

diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index 74d740d..2e0ba3e 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -7,23 +7,6 @@ test_description='test git fast-import utility'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/diff-lib.sh ;# test-lib chdir's into trash
 
-# Print $1 bytes from stdin to stdout.
-#
-# This could be written as "head -c $1", but IRIX "head" does not
-# support the -c option.
-head_c () {
-   perl -e '
-   my $len = $ARGV[1];
-   while ($len > 0) {
-   my $s;
-   my $nread = sysread(STDIN, $s, $len);
-   die "cannot read: $!" unless defined($nread);
-   print $s;
-   $len -= $nread;
-   }
-   ' - "$1"
-}
-
 verify_packs () {
for p in .git/objects/pack/*.pack
do
@@ -2481,7 +2464,7 @@ test_expect_success PIPE 'R: copy using cat-file' '
 
read blob_id type size <&3 &&
echo "$blob_id $type $size" >response &&
-   head_c $size >blob <&3 &&
+   test_copy_bytes $size >blob <&3 &&
read newline <&3 &&
 
cat <<-EOF &&
@@ -2524,7 +2507,7 @@ test_expect_success PIPE 'R: print blob mid-commit' '
EOF
 
read blob_id type size <&3 &&
-   head_c $size >actual <&3 &&
+   test_copy_bytes $size >actual <&3 &&
read newline <&3 &&
 
echo
@@ -2559,7 +2542,7 @@ test_expect_success PIPE 'R: print staged blob within 
commit' '
echo "cat-blob $to_get" &&
 
read blob_id type size <&3 &&
-   head_c $size >actual <&3 &&
+   test_copy_bytes $size >actual <&3 &&
read newline <&3 &&
 
echo deleteall
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 48884d5..90856d6 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -961,3 +961,17 @@ test_env () {
done
)
 }
+
+# Read up to "$1" bytes (or to EOF) from stdin and write them to stdout.
+test_copy_bytes () {
+   perl -e '
+   my $len = $ARGV[1];
+   while ($len > 0) {
+   my $s;
+   my $nread = sysread(STDIN, $s, $len);
+   die "cannot read: $!" unless defined($nread);
+   print $s;
+   $len -= $nread;
+   }
+   ' - "$1"
+}
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 4/5] archive-tar: write extended headers for far-future mtime

2016-06-30 Thread Jeff King
The ustar format represents timestamps as seconds since the
epoch, but only has room to store 11 octal digits.  To
express anything larger, we need to use an extended header.
This is exactly the same case we fixed for the size field in
the previous commit, and the solution here follows the same
pattern.

This is even mentioned as an issue in f2f0267 (archive-tar:
use xsnprintf for trivial formatting, 2015-09-24), but since
it only affected things far in the future, it wasn't deemed
worth dealing with. But note that my calculations claiming
thousands of years were off there; because our xsnprintf
produces a NUL byte, we only have until the year 2242 to fix
this.

Given that this is just around the corner (geologically
speaking, anyway), and because it's easy to fix, let's just
make it work. Unlike the previous fix for "size", where we
had to write an individual extended header for each file, we
can write one global header (since we have only one mtime
for the whole archive).

There's a slight bit of trickiness there. We may already be
writing a global header with a "comment" field for the
commit sha1. So we need to write our new field into the same
header. To do this, we push the decision of whether to write
such a header down into write_global_extended_header(),
which will now assemble the header as it sees fit, and will
return early if we have nothing to write (in practice, we'll
only have a large mtime if it comes from a commit, but this
makes it also work if you set your system clock ahead such
that time() returns a huge value).

Note that we don't (and never did) handle negative
timestamps (i.e., before 1970). This would probably not be
too hard to support in the same way, but since git does not
support negative timestamps at all, I didn't bother here.

After writing the extended header, we munge the timestamp in
the ustar headers to the maximum-allowable size. This is
wrong, but it's the least-wrong thing we can provide to a
tar implementation that doesn't understand pax headers (it's
also what GNU tar does).

Helped-by: René Scharfe 
Signed-off-by: Jeff King 
---
 archive-tar.c   | 19 ---
 t/t5000-tar-tree.sh |  4 ++--
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index 57a1540..d671bc3 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -22,8 +22,11 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  * This is the max value that a ustar size header can specify, as it is fixed
  * at 11 octal digits. POSIX specifies that we switch to extended headers at
  * this size.
+ *
+ * Likewise for the mtime (which happens to use a buffer of the same size).
  */
 #define USTAR_MAX_SIZE 0777UL
+#define USTAR_MAX_MTIME 0777UL
 
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
@@ -324,7 +327,18 @@ static int write_global_extended_header(struct 
archiver_args *args)
unsigned int mode;
int err = 0;
 
-   strbuf_append_ext_header(_header, "comment", sha1_to_hex(sha1), 40);
+   if (sha1)
+   strbuf_append_ext_header(_header, "comment",
+sha1_to_hex(sha1), 40);
+   if (args->time > USTAR_MAX_MTIME) {
+   strbuf_append_ext_header_uint(_header, "mtime",
+ args->time);
+   args->time = USTAR_MAX_MTIME;
+   }
+
+   if (!ext_header.len)
+   return 0;
+
memset(, 0, sizeof(header));
*header.typeflag = TYPEFLAG_GLOBAL_HEADER;
mode = 0100666;
@@ -409,8 +423,7 @@ static int write_tar_archive(const struct archiver *ar,
 {
int err = 0;
 
-   if (args->commit_sha1)
-   err = write_global_extended_header(args);
+   err = write_global_extended_header(args);
if (!err)
err = write_archive_entries(args, write_tar_entry);
if (!err)
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 93c2d34..96d208d 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -383,11 +383,11 @@ test_expect_success 'set up repository with far-future 
commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_failure 'generate tar with future mtime' '
+test_expect_success 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_failure TAR_HUGE 'system tar can read our future mtime' '
+test_expect_success TAR_HUGE 'system tar can read our future mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-06-30 Thread Jeff King
The ustar format has a fixed-length field for the size of
each file entry which is supposed to contain up to 11 bytes
of octal-formatted data plus a NUL or space terminator.

These means that the largest size we can represent is
0777, or 1 byte short of 8GB. The correct solution
for a larger file, according to POSIX.1-2001, is to add an
extended pax header, similar to how we handle long
filenames. This patch does that, and writes zero for the
size field in the ustar header (the last bit is not
mentioned by POSIX, but it matches how GNU tar behaves with
--format=pax).

This should be a strict improvement over the current
behavior, which is to die in xsnprintf with a "BUG".
However, there's some interesting history here.

Prior to f2f0267 (archive-tar: use xsnprintf for trivial
formatting, 2015-09-24), we silently overflowed the "size"
field. The extra bytes ended up in the "mtime" field of the
header, which was then immediately written itself,
overwriting our extra bytes. What that means depends on how
many bytes we wrote.

If the size was 64GB or greater, then we actually overflowed
digits into the mtime field, meaning our value was
effectively right-shifted by those lost octal digits. And
this patch is again a strict improvement over that.

But if the size was between 8GB and 64GB, then our 12-byte
field held all of the actual digits, and only our NUL
terminator overflowed. According to POSIX, there should be a
NUL or space at the end of the field. However, GNU tar seems
to be lenient here, and will correctly parse a size up 64GB
(minus one) from the field. So sizes in this range might
have just worked, depending on the implementation reading
the tarfile.

This patch is mostly still an improvement there, as the 8GB
limit is specifically mentioned in POSIX as the correct
limit. But it's possible that it could be a regression
(versus the pre-f2f0267 state) if all of the following are
true:

  1. You have a file between 8GB and 64GB.

  2. Your tar implementation _doesn't_ know about pax
 extended headers.

  3. Your tar implementation _does_ parse 12-byte sizes from
 the ustar header without a delimiter.

It's probably not worth worrying about such an obscure set
of conditions, but I'm documenting it here just in case.

Helped-by: René Scharfe 
Signed-off-by: Jeff King 
---
 archive-tar.c   | 31 +--
 t/t5000-tar-tree.sh |  4 ++--
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index cb99df2..57a1540 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -18,6 +18,13 @@ static int tar_umask = 002;
 static int write_tar_filter_archive(const struct archiver *ar,
struct archiver_args *args);
 
+/*
+ * This is the max value that a ustar size header can specify, as it is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL
+
 /* writes out the whole block, but only if it is full */
 static void write_if_needed(void)
 {
@@ -137,6 +144,20 @@ static void strbuf_append_ext_header(struct strbuf *sb, 
const char *keyword,
strbuf_addch(sb, '\n');
 }
 
+/*
+ * Like strbuf_append_ext_header, but for numeric values.
+ */
+static void strbuf_append_ext_header_uint(struct strbuf *sb,
+ const char *keyword,
+ uintmax_t value)
+{
+   char buf[40]; /* big enough for 2^128 in decimal, plus NUL */
+   int len;
+
+   len = xsnprintf(buf, sizeof(buf), "%"PRIuMAX, value);
+   strbuf_append_ext_header(sb, keyword, buf, len);
+}
+
 static unsigned int ustar_header_chksum(const struct ustar_header *header)
 {
const unsigned char *p = (const unsigned char *)header;
@@ -208,7 +229,7 @@ static int write_tar_entry(struct archiver_args *args,
struct ustar_header header;
struct strbuf ext_header = STRBUF_INIT;
unsigned int old_mode = mode;
-   unsigned long size;
+   unsigned long size, size_in_header;
void *buffer;
int err = 0;
 
@@ -267,7 +288,13 @@ static int write_tar_entry(struct archiver_args *args,
memcpy(header.linkname, buffer, size);
}
 
-   prepare_header(args, , mode, size);
+   size_in_header = size;
+   if (S_ISREG(mode) && size > USTAR_MAX_SIZE) {
+   size_in_header = 0;
+   strbuf_append_ext_header_uint(_header, "size", size);
+   }
+
+   prepare_header(args, , mode, size_in_header);
 
if (ext_header.len > 0) {
err = write_extended_header(args, sha1, ext_header.buf,
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 950bdd3..93c2d34 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -360,7 +360,7 @@ test_expect_success 'set up repository with huge blob' '
 
 # We expect git to die with SIGPIPE here (otherwise 

[PATCH v4 5/5] archive-tar: drop return value

2016-06-30 Thread Jeff King
We never do any error checks, and so never return anything
but "0". Let's just drop this to simplify the code.

Signed-off-by: Jeff King 
---
 archive-tar.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/archive-tar.c b/archive-tar.c
index d671bc3..7ea4e90 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -319,13 +319,12 @@ static int write_tar_entry(struct archiver_args *args,
return err;
 }
 
-static int write_global_extended_header(struct archiver_args *args)
+static void write_global_extended_header(struct archiver_args *args)
 {
const unsigned char *sha1 = args->commit_sha1;
struct strbuf ext_header = STRBUF_INIT;
struct ustar_header header;
unsigned int mode;
-   int err = 0;
 
if (sha1)
strbuf_append_ext_header(_header, "comment",
@@ -337,7 +336,7 @@ static int write_global_extended_header(struct 
archiver_args *args)
}
 
if (!ext_header.len)
-   return 0;
+   return;
 
memset(, 0, sizeof(header));
*header.typeflag = TYPEFLAG_GLOBAL_HEADER;
@@ -347,7 +346,6 @@ static int write_global_extended_header(struct 
archiver_args *args)
write_blocked(, sizeof(header));
write_blocked(ext_header.buf, ext_header.len);
strbuf_release(_header);
-   return err;
 }
 
 static struct archiver **tar_filters;
@@ -423,9 +421,8 @@ static int write_tar_archive(const struct archiver *ar,
 {
int err = 0;
 
-   err = write_global_extended_header(args);
-   if (!err)
-   err = write_archive_entries(args, write_tar_entry);
+   write_global_extended_header(args);
+   err = write_archive_entries(args, write_tar_entry);
if (!err)
write_trailer();
return err;
-- 
2.9.0.317.g65b4e7c
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v4 0/5] friendlier handling of overflows in archive-tar

2016-06-30 Thread Jeff King
This is a re-roll of the jk/big-and-future-archive-tar topic. It
addresses all but one of the review comments, and I hope should be
pretty polished.

The changes are:

  - the dependency on bunzip2 is dropped; instead, we just provide a
partial object for the 64GB blob. See the first commit message for
details.

  - the portable "head -c" replacement from t9300 has been factored out,
and we use it in the new tests

  - symbolic constants for the giant octal numbers (with a comment
warning that the values are set by posix)

  - the comments for tar_info() and the lazy-prereq were split so the
two aren't mashed together

  - uses awk in tar_info() instead of "sed | cut"

  - extra simplification in the final commit, as suggested by review

  - typo and awkwardness fixes in the commit messages

The one thing that isn't fixed is the use of "141" to test for sigpipe
death. That should use test_match_signal, but that topic just got
re-rolled, too.

  [1/5]: t9300: factor out portable "head -c" replacement
  [2/5]: t5000: test tar files that overflow ustar headers
  [3/5]: archive-tar: write extended headers for file sizes >= 8GB
  [4/5]: archive-tar: write extended headers for far-future mtime
  [5/5]: archive-tar: drop return value

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Johannes Schindelin
Hi Hannes,

On Thu, 30 Jun 2016, Johannes Sixt wrote:

> Am 29.06.2016 um 13:36 schrieb Johannes Schindelin:
> > @@ -955,9 +955,8 @@ static struct merge_file_info merge_file_1(struct
> > merge_options *o,
> >
> >  if (!sha_eq(a->sha1, b->sha1))
> > result.clean = 0;
> > -   } else {
> > -   die(_("unsupported object type in the tree"));
> > -   }
> > +   } else
> > +   die(_("BUG: unsupported object type in the tree"));
> 
> Would it perhaps make sense to remove the _() markup (here and a few more
> instances in this patch)? It's simpler for developers to find the code
> location when a "BUG:" is reported untranslated.

I would agree, but the purpose of this patch was not to fix that, but to
fix the inconsistency of the message.

Maybe as an add-on patch, with *all* 'die(_("BUG:' instances converted?
That would be even more outside the purview of my patch series than
touching the bug reports outside merge-recursive.c, though.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 9/9] am: make a direct call to merge_recursive

2016-06-30 Thread Johannes Schindelin
Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > From: Junio C Hamano 
> 
> Did I write this thing?

Yes, you did. It was db05d6194d3f9ea9e64163944961d5f6e85302be as part of
pu@{2016-06-15}.

> Having two sets of numbers that illustrated that this is not really
> a useful optimization in the bigger picture looks vaguely familiar
> (e.g. $gmane/279417), but the numbers are different.

I beg to differ. While the performance improvement is not huge, even your
toy experiment shows it is significant, i.e. noticeable.

> > It feels *slightly* wrong to submit your own patch to review,
> > however, please keep in mind that
> >
> > 1) I changed the patch (o.gently does not exist anymore, so I do
> >not set it), and
> >
> > 2) I added my own timings performed on Windows.
> 
> It probably is much less confusing if you take the authorship,
> possibly with a passing reference to whatever I wrote as the source
> of inspiration in the log message, or something.  I do not think I
> deserve a credit in this 9-patch series.

The patch is an almost verbatim copy of db05d6194, I only removed the
now-obsolete "gently" setting, is all.

That means that I feel really, really uneasy about claiming authorship
because I did not write it.

If you really want me to, I will take custody of this patch and rewrite
the commit message as well using --reset--author, of course.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Johannes Schindelin
Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > The vast majority of error messages in Git's source code which report a
> > bug use the convention to prefix the message with "BUG:".
> 
> Good thing to do.
> 
> But if we were to review and apply a 200+ line patch, I wonder if we
> want to go one step further to allow us to write
> 
> BUG("killed-file %s not found", name);
> 
> instead.

If the idea is to make it easier to find, I would wager a guess that
'die("BUG:' would be just as good a search term. Even better, I think,
because 'BUG' would also match comments.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/9] Report bugs consistently

2016-06-30 Thread Johannes Schindelin
Hi Eric,

On Wed, 29 Jun 2016, Eric Sunshine wrote:

> On Wed, Jun 29, 2016 at 7:36 AM, Johannes Schindelin
>  wrote:
> > The vast majority of error messages in Git's source code which report a
> > bug use the convention to prefix the message with "BUG:".
> > [...]
> > Signed-off-by: Johannes Schindelin 
> > ---
> > diff --git a/merge-recursive.c b/merge-recursive.c
> > @@ -1853,7 +1852,7 @@ int merge_trees(struct merge_options *o,
> > -   die(_("Unprocessed path??? %s"),
> > +   die(_("BUG: unprocessed path??? %s"),
> 
> This and others downcase the first word (which is consistent with
> modern practice)...
> 
> > diff --git a/sha1_file.c b/sha1_file.c
> > @@ -795,7 +795,7 @@ void close_all_packs(void)
> > -   die("BUG! Want to close pack marked 
> > 'do-not-close'");
> > +   die("BUG: Want to close pack marked 
> > 'do-not-close'");
> 
> ...but this one neglects to.

Thanks! Will be fixed as part of v2.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] t3404: add a test for the --gpg-sign option

2016-06-30 Thread Johannes Schindelin
Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > To keep the time t3404 requires short (in this developer's Windows
> > setup, this single test already takes a painful 8 minutes to pass),
> > we avoid a full-blown GPG test and cop out by verifying the message
> > displayed to the user upon an 'edit' command.
> 
> This is a tangent, but I wonder if we should be solving it by
> parallelizing the tests even more.  If the script for example
> can be split into part1 and part2 that share the same set-up
> that is prepared by the very first test, we could split this
> into three files (common one that is dot-sourced by two actual
> test scripts that have part1 and part2).

Sure, that would work around the problem in the short run.

In the long run, the only real solution would be to stop relying on shell
scripting so much, because the biggest performance hit on Windows stems
from the fact that our test suite is a big honking shell script.

> > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> > index c7ea8ba..4c96075 100755
> > --- a/t/t3404-rebase-interactive.sh
> > +++ b/t/t3404-rebase-interactive.sh
> > @@ -1281,4 +1281,11 @@ test_expect_success 'editor saves as CR/LF' '
> > )
> >  '
> >  
> > +EPIPHANY="'"
> 
> Why?  If you really need a variable, SQ is probably the way this
> codebase typically spell it.

Have you ever watched the movie "Hook"?

> > +test_expect_success 'rebase -i --gpg-sign=' '
> > +   set_fake_editor &&
> > +   FAKE_LINES="edit 1" git rebase -i --gpg-sign=\" HEAD^ >out 2>err &&
> > +   grep "$EPIPHANY-S\"$EPIPHANY" err
> 
> I am not sure what is going on here.  You are asking to sign using
> the keyid of a single double-quote, and expecting the message that
> says
> 
>   You can amend the commit now, with
> 
>   git commit --amend '-S"'

Precisely. That way, I not only verified that the key ID was correctly
passed around, but also that it is quoted properly.

>   ...
> 
> that has a substring '-S"' in it to ensure that the codepath to
> parse --gpg-sign= on the command line of "rebase", and to the
> message we see here are working correctly, without actually checking
> if GPG is invoked at all, or if it is invoked the key given by the
> option is correctly passed to the invocation?

Exactly. I want to test --gpg-sign even when there is no gpg executable
available.

> If so, can't you do that without confusing users by using keyid "?
> IOW, wouldn't using --gpg-sign=me work equally well?

Not really, because it is much easier to get quoting of "me" right than of
"\"".

I guess I could change the test to pass --gpg-sign="\"S I Gner\"".

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] t3404: fix another typo

2016-06-30 Thread Johannes Schindelin
Hi Junio,

On Wed, 29 Jun 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > On Wed, 29 Jun 2016, Junio C Hamano wrote:
> >
> >> Johannes Schindelin  writes:
> >> 
> >> > The past tense of "to run" is "run", not "ran".
> >> 
> >> Actually, past tense of the verb "to run" is "ran" ;-) The reason
> >> why this patch is still correct is because this is writing in
> >> passive voice, where you want "be + past participle", and the past
> >> participle of the verb "to run" is "run".
> >
> > Embarrassing! ;-)
> >
> > Well, I shall fix up the commit message thusly:
> >
> > t3404: fix another typo
> >
> > The passive form of "to run" is "run", not "ran".
> 
> If your convention of referring to a verb is to show its infinitive
> form, i.e. "to run", then you would probably want to say
> 
>   The passive from of "to run" is "to be run", not "to be
>   ran".
> 
> But I think we do not need any of this if we just retitled it to
> 
>   Subject: t3404: fix a grammo (commands are ran -> commands are run)
> 
> without any body.

Done. Will be part of v2.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: preview: What's cooking in git.git (Jun 2016, #10; Tue, 28)

2016-06-30 Thread Jeff King
On Wed, Jun 29, 2016 at 07:41:36AM +0200, Johannes Sixt wrote:

> Am 29.06.2016 um 03:43 schrieb Jeff King:
> > Another is to just put the posix/ksh schemes into the helper function,
> > and let Windows people sort it out later if they want to.
> 
> Let's do this.

OK, here's a replacement for the first patch in test-match-signal (the
others do not need touched at all). It punts on Windows entirely.

It does retain the existing check for "3" in t0005 (though I note in
04422c7 you skipped the next test entirely in MINGW, and a similar
argument could perhaps apply here).

I suspect that supporting Windows in test_match_sigpipe would require
at least:

  - checking for code 3 to cover cases where we raise (either because of
check_pipe(), or because we are doing the pop-and-raise thing in our
cleanup handlers)

  - checking for 128 when we want sigpipe, in case we get EPIPE and it
caused a die().

  - figuring out what the heck happens when you "kill -15" a git process
and mapping that

But those are all just guesses (and we'd probably want to hide them
behind "if test_have_prerequisite MINGW" to avoid making the tests less
robust elsewhere).

This patch should change nothing at all for Windows, so at least we are
not making things worse.

-- >8 --
Subject: [PATCH] tests: factor portable signal check out of t0005

In POSIX shells, a program which exits due to a signal
generally has an exit code of 128 plus the signal number.
However, ksh uses 256 plus the signal number.  We've
accounted for that in t0005, but not in other tests.  Let's
pull out the logic so we can use it elsewhere.

It would be nice for debugging if this additionally printed
errors to stderr, like our other test_* helpers. But we're
going to need to use it in other places besides the innards
of a test_expect block. So let's leave it as generic as
possible.

Note that we also leave the magic "3" for Windows out of the
generic helper. This is an artifact of the way we use
raise() to kill ourselves in test-sigchain.c, and will not
necessarily apply to all programs. So it's better to keep it
out of the helper, to reduce the chance of confusing it with
a real call to exit(3).

Signed-off-by: Jeff King 
---
 t/t0005-signals.sh  | 13 +++--
 t/test-lib-functions.sh | 15 +++
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/t/t0005-signals.sh b/t/t0005-signals.sh
index e7f27eb..95f8c05 100755
--- a/t/t0005-signals.sh
+++ b/t/t0005-signals.sh
@@ -11,12 +11,13 @@ EOF
 
 test_expect_success 'sigchain works' '
{ test-sigchain >actual; ret=$?; } &&
-   case "$ret" in
-   143) true ;; # POSIX w/ SIGTERM=15
-   271) true ;; # ksh w/ SIGTERM=15
- 3) true ;; # Windows
- *) false ;;
-   esac &&
+   {
+   # Signal death by raise() on Windows acts like exit(3),
+   # regardless of the signal number. So we must allow that
+   # as well as the normal signal check.
+   test_match_signal 15 "$ret" ||
+   test "$ret" = 3
+   } &&
test_cmp expect actual
 '
 
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 48884d5..15ef3f8 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -961,3 +961,18 @@ test_env () {
done
)
 }
+
+# Returns true if the numeric exit code in "$2" represents the expected signal
+# in "$1". Signals should be given numerically.
+test_match_signal () {
+   if test "$2" = "$((128 + $1))"
+   then
+   # POSIX
+   return 0
+   elif test "$2" = "$((256 + $1))"
+   then
+   # ksh
+   return 0
+   fi
+   return 1
+}
-- 
2.9.0.317.g65b4e7c

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] t/Makefile: add a rule to re-run previously-failed tests

2016-06-30 Thread Jeff King
On Wed, Jun 29, 2016 at 09:02:37AM +0200, Johannes Schindelin wrote:

> It is the most convenient way to determine which tests failed after
> running the entire test suite, in parallel, to look for left-over "trash
> directory.t*" subdirectories in the t/ subdirectory.

As Junio noted, this doesn't work with --root. I have sometimes used:

  grep 'failed [^0]' test-results/*

for this purpose.

> This patch automates the process of determinig which tests failed
> previously and re-running them; It turned out to be quite convenient
> when trying to squash bugs that crept in during rebases.

I suspect your response will be "perl tools on Windows are too painful
to use", but the "prove" tool which comes with perl can do this and more
(e.g., running the failed tests first, and then following up with the
others to double-check), and our test suite supports it quite well.

  $ grep -B1 PROVE config.mak
  # run tests in parallel, with slow ones first to keep pipelines full
  GIT_PROVE_OPTS = -j16 --state=slow,save

  $ cd t
  $ make prove
  ... reports some test failed ...
  $ prove --state=failed
  ... re-runs just the failed test ...

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/7] literal formatting in documentation

2016-06-30 Thread Jeff King
On Tue, Jun 28, 2016 at 01:40:08PM +0200, Matthieu Moy wrote:

> This should address all comments from Peff.

Thanks. The review is mind-numbing enough that I didn't do a complete
read-through again, but just spot-checked a few places. This version
looks good to me.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: preview: What's cooking in git.git (Jun 2016, #10; Tue, 28)

2016-06-30 Thread Jeff King
On Wed, Jun 29, 2016 at 01:47:31PM +0200, Johannes Schindelin wrote:

> On Tue, 28 Jun 2016, Junio C Hamano wrote:
> 
> > * jk/ansi-color (2016-06-23) 7 commits
> >   (merged to 'next' on 2016-06-28 at 354989c)
> >  + color: support strike-through attribute
> >  + color: support "italic" attribute
> >  + color: allow "no-" for negating attributes
> >  + color: refactor parse_attr
> >  + add skip_prefix_mem helper
> >  + doc: refactor description of color format
> >  + color: fix max-size comment
> > 
> >  The output coloring scheme learned two new attributes, italic and
> >  strike, in addition to existing bold, reverse, etc.
> > 
> >  Will merge to 'master'.
> 
> Please note that those "colors" do not work on Windows, at least as far as
> I know, I only skimmed the code in set_attr():
> 
>   https://github.com/git/git/blob/v2.9.0/compat/winansi.c#L175-L314
> 
> ... and it looks as if italic is plainly unsupported, and strike-through
> is not handled.

I suspect winansi doesn't handle 256-color or 24-bit color modes either,
and those are also not supported on all terminals. All of the color
output is subject to the user's terminal supporting it.  It might be
that we should make a more clear disclaimer in the documentation.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git tag --contains for cherry-picks

2016-06-30 Thread Jeff King
On Wed, Jun 29, 2016 at 12:48:33PM +0100, Laszlo Papp wrote:

> Old releases are maintained with important bug fixes or even new features
> in our case. It sometimes means that we need to cherry-pick commits across
> branches, like from master to a specific release branch.
> 
> Cherry-picking changes the hash of the commit, therefore, this may no
> longer work for cherry-picks:
> 
> git tag --contains
> 
> I am thinking of having something like:
> 
> git tag --contains-follow
> 
> which would follow cherry-picks. I am not sure how easily and/or
> efficiently this can be implemented, but my gut feeling is that in the vast
> majority of the cases, the content check would bail out already at the
> "subject line".

Git generally considers commits "equivalent" based on the patch-id, whic
his a sha1 of the diff (modulo some canonicalization).

So you could ask right now for the patch-id of a particular commit:

  git show $commit | git patch-id >needle

and then the patch-id of all tagged commits:

  git log -p --tags | git patch-id | sort >haystack

Each line will have the patch-id followed by the commit id. You can then
correlate them (join is part of GNU coreutils):

  join needle haystack | cut -d' ' -f2- >synonyms

That gives you a list of synonym commits, which you can use to ask git
which tags contain any of them:

  git tag $(for i in $(cat synonyms); do echo "--contains $i"; done)

The big downside is that generating the haystack is expensive (it has to
do a diff on every commit). But if this is something you do a lot, you
can save the haystack and incrementally update it with new commits.


Of course there are other ways of determining commit equivalence. You
could find ones with duplicate commit messages, or duplicate subjects,
or whatever. But if you have a cherry-picking workflow, I suspect the
easiest thing may be to simply use "git cherry-pick -x", which will
write the sha1 of the original commit into the cherry-picked commit
message. You can then use that to correlate directly.

So there's no specific "--contains-follow" as you want, but the tools
are there to do it (and more flexibly and efficiently, depending on your
needs). That doesn't necessarily mean it's not a good idea to make the
simple case easier, though.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4] Refactor recv_sideband()

2016-06-30 Thread Lukas Fleischer
On Wed, 29 Jun 2016 at 18:40:16, Junio C Hamano wrote:
> Lukas, can you see what is in 'pu' after I push out today's
> integration result in several hours and tell us if you like the
> result of the SQUASH??? change?

Looks good to me. Thank you both for working on this. Note that you
should amend the last paragraph of the original commit message when you
squash Nicos patch into mine.

Oh, and one more detail: I wonder why we still use fwrite(), now that we
know we can use xwrite() which guarantees atomicity. Is there a reason
for that?

Regards,
Lukas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html