[PATCH v2 2/3] config: respect `pager.config` in list/get-mode only

2018-02-21 Thread Martin Ågren
Similar to de121ffe5 (tag: respect `pager.tag` in list-mode only,
2017-08-02), use the DELAY_PAGER_CONFIG-mechanism to only respect
`pager.config` when we are listing or "get"ing config.

We have several getters and some are guaranteed to give at most one line
of output. Paging all getters including those could be convenient from a
documentation point-of-view. The downside would be that a misconfigured
or not so modern pager might wait for user interaction before
terminating. Let's instead respect the config for precisely those
getters which may produce more than one line of output.

`--get-urlmatch` may or may not produce multiple lines of output,
depending on the exact usage. Let's not try to recognize the two modes,
but instead make `--get-urlmatch` always respect the config. Analyzing
the detailed usage might be trivial enough here, but could establish a
precedent that we will never be able to enforce throughout the codebase
and that will just open a can of worms.

This fixes the failing test added in the previous commit. Also adapt the
test for whether `git config foo.bar bar` and `git config --get foo.bar`
respects `pager.config`.

Signed-off-by: Martin Ågren 
---
 Documentation/git-config.txt |  5 +
 t/t7006-pager.sh | 10 +-
 builtin/config.c | 10 ++
 git.c|  2 +-
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 14da5fc157..249090ac84 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -233,6 +233,11 @@ See also <>.
using `--file`, `--global`, etc) and `on` when searching all
config files.
 
+CONFIGURATION
+-
+`pager.config` is only respected when listing configuration, i.e., when
+using `--list` or any of the `--get-*` which may return multiple results.
+
 [[FILES]]
 FILES
 -
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index a46a079339..95bd26f0b2 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -245,13 +245,13 @@ test_expect_success TTY 'git branch --set-upstream-to 
ignores pager.branch' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git config respects pager.config when setting' '
+test_expect_success TTY 'git config ignores pager.config when setting' '
rm -f paginated.out &&
test_terminal git -c pager.config config foo.bar bar &&
-   test -e paginated.out
+   ! test -e paginated.out
 '
 
-test_expect_failure TTY 'git config --edit ignores pager.config' '
+test_expect_success TTY 'git config --edit ignores pager.config' '
rm -f paginated.out editor.used &&
write_script editor <<-\EOF &&
touch editor.used
@@ -261,10 +261,10 @@ test_expect_failure TTY 'git config --edit ignores 
pager.config' '
test -e editor.used
 '
 
-test_expect_success TTY 'git config --get respects pager.config' '
+test_expect_success TTY 'git config --get ignores pager.config' '
rm -f paginated.out &&
test_terminal git -c pager.config config --get foo.bar &&
-   test -e paginated.out
+   ! test -e paginated.out
 '
 
 test_expect_success TTY 'git config --get-urlmatch defaults to not paging' '
diff --git a/builtin/config.c b/builtin/config.c
index ab5f95476e..a732d9b56c 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -48,6 +48,13 @@ static int show_origin;
 #define ACTION_GET_COLORBOOL (1<<14)
 #define ACTION_GET_URLMATCH (1<<15)
 
+/*
+ * The actions "ACTION_LIST | ACTION_GET_*" which may produce more than
+ * one line of output and which should therefore be paged.
+ */
+#define PAGING_ACTIONS (ACTION_LIST | ACTION_GET_ALL | \
+   ACTION_GET_REGEXP | ACTION_GET_URLMATCH)
+
 #define TYPE_BOOL (1<<0)
 #define TYPE_INT (1<<1)
 #define TYPE_BOOL_OR_INT (1<<2)
@@ -594,6 +601,9 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
usage_with_options(builtin_config_usage, 
builtin_config_options);
}
 
+   if (actions & PAGING_ACTIONS)
+   setup_auto_pager("config", 0);
+
if (actions == ACTION_LIST) {
check_argc(argc, 0, 0);
if (config_with_options(show_all_config, NULL,
diff --git a/git.c b/git.c
index c870b9719c..e5c9b8729d 100644
--- a/git.c
+++ b/git.c
@@ -389,7 +389,7 @@ static struct cmd_struct commands[] = {
{ "column", cmd_column, RUN_SETUP_GENTLY },
{ "commit", cmd_commit, RUN_SETUP | NEED_WORK_TREE },
{ "commit-tree", cmd_commit_tree, RUN_SETUP },
-   { "config", cmd_config, RUN_SETUP_GENTLY },
+   { "config", cmd_config, RUN_SETUP_GENTLY | DELAY_PAGER_CONFIG },
{ "count-objects", cmd_count_objects, RUN_SETUP },
{ "credential", cmd_credential, RUN_SETUP_GENTLY },
{ "describe", cmd_describe, RUN_SETUP },
-- 
2.16.2.246.ga4ee8f



Re: [PATCH 00/36] object_id part 12

2018-02-21 Thread Junio C Hamano
"brian m. carlson"  writes:

> This is the twelfth in a series of patches to convert from unsigned char
> [20] to struct object_id.  This series is based on next.
>
> Included in this series are conversions for find_unique_abbrev and
> lookup_replace_object, as well as parts of the sha1_file code.
>
> Conflicts with pu are average in number but minor, mostly because of the
> type_name conversion.  None of them are tricky, except that the
> introduction of get_tree_entry_if_blob requires a conversion of that
> function.

And the reason why this is based on 'next' is...?  Which topic(s) do
we have to wait for until we can queue this series, in other words?

Thanks for working on this, though.


Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-21 Thread Lars Schneider

> On 16 Feb 2018, at 19:55, Junio C Hamano  wrote:
> 
> Jeff King  writes:
> 
>> So a full proposal would support both cases: "check this out in the
>> local platform's preferred encoding" and "always check this out in
>> _this_ encoding". And Lars's proposal is just the second half of that.
> 
> Actually, what you seem to take as a whole is just half of the
> story.  The other half that is an ability to say "what is in the
> repository for this path is stored in this encoding".  I agree that
> "check it out in this encoding" is a useful thing to have, and using
> the in-tree .gitattributes as a place to state the project-wide
> preference may be OK (and .git/info/attributes should be able to
> override it if needed -- this probably deserves to be added to a
> test somewhere by this series).

Good call! I'll add this test case!


> Luckily, lack of 'in-repository-encoding' attribute is not a show
> stopper for this series.  A later topic could start with "earlier,
> in order to make use of w-t-e attribute, you had to have your
> contents in UTF-8.  Teach the codepath to honor a new attribute that
> tells in what encoding the blob contents are stored." without having
> to be a part of this topic.

I have the impression that this is the purpose of the already existing 
"encoding" attribute, no? AFAIK this attribute is only respected by 
gitk, though. A future series could make Git respect this attribute too.


- Lars



Re: [PATCH 2/2] ref-filter: get rid of goto

2018-02-21 Thread Junio C Hamano
Olga Telezhnaya  writes:

> Get rid of goto command in ref-filter for better readability.
>
> Signed-off-by: Olga Telezhnaia 
> Mentored-by: Christian Couder 
> Mentored by: Jeff King 
> ---
>  ref-filter.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

It looks like this is the same change as the bottom-most change on
your "cat-file --batch" series (and is obviously correct).

I am puzzled by your intention---are you re-organizing and rebooting
the series?  Either 'Yes' or 'No' is an acceptable answer, and so is
anything else.  I just want to know what you want to happen to the
merge conflicts if I queued this while still keeping your "cat-file
--batch" thing I have on 'pu').

>
> diff --git a/ref-filter.c b/ref-filter.c
> index 83ffd84affe52..28df6e21fb996 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1494,11 +1494,11 @@ static void populate_value(struct ref_array_item *ref)
>   for (i = 0; i < used_atom_cnt; i++) {
>   struct atom_value *v = >value[i];
>   if (v->s == NULL)
> - goto need_obj;
> + break;
>   }
> - return;
> + if (used_atom_cnt <= i)
> + return;
>  
> - need_obj:
>   get_object(ref, >objectname, 0, );
>  
>   /*
>
> --
> https://github.com/git/git/pull/460


Re: [PATCH 1/2] ref-filter: get rid of duplicate code

2018-02-21 Thread Junio C Hamano
Olga Telezhnaya  writes:

> Make one function from 2 duplicate pieces and invoke it twice.
>
> Signed-off-by: Olga Telezhnaia 
> Mentored-by: Christian Couder 
> Mentored by: Jeff King 
> ---
>  ref-filter.c | 45 +
>  1 file changed, 21 insertions(+), 24 deletions(-)

Nice code reduction, removing 2 instances of 11-line section and
replacing with a 17-line function ;-)

This looks related to your earlier changes for "cat-file --batch"
but I do not recall seeing this refactoring.  It seems to be done
correctly.

Good spotting.

> diff --git a/ref-filter.c b/ref-filter.c
> index f9e25aea7a97e..83ffd84affe52 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1354,15 +1354,31 @@ static const char *get_refname(struct used_atom 
> *atom, struct ref_array_item *re
>   return show_ref(>u.refname, ref->refname);
>  }
>  
> +static void get_object(struct ref_array_item *ref, const struct object_id 
> *oid,
> +int deref, struct object **obj)
> +{
> + int eaten;
> + unsigned long size;
> + void *buf = get_obj(oid, obj, , );
> + if (!buf)
> + die(_("missing object %s for %s"),
> + oid_to_hex(oid), ref->refname);
> + if (!*obj)
> + die(_("parse_object_buffer failed on %s for %s"),
> + oid_to_hex(oid), ref->refname);
> +
> + grab_values(ref->value, deref, *obj, buf, size);
> + if (!eaten)
> + free(buf);
> +}
> +
>  /*
>   * Parse the object referred by ref, and grab needed value.
>   */
>  static void populate_value(struct ref_array_item *ref)
>  {
> - void *buf;
>   struct object *obj;
> - int eaten, i;
> - unsigned long size;
> + int i;
>   const struct object_id *tagged;
>  
>   ref->value = xcalloc(used_atom_cnt, sizeof(struct atom_value));
> @@ -1483,17 +1499,7 @@ static void populate_value(struct ref_array_item *ref)
>   return;
>  
>   need_obj:
> - buf = get_obj(>objectname, , , );
> - if (!buf)
> - die(_("missing object %s for %s"),
> - oid_to_hex(>objectname), ref->refname);
> - if (!obj)
> - die(_("parse_object_buffer failed on %s for %s"),
> - oid_to_hex(>objectname), ref->refname);
> -
> - grab_values(ref->value, 0, obj, buf, size);
> - if (!eaten)
> - free(buf);
> + get_object(ref, >objectname, 0, );
>  
>   /*
>* If there is no atom that wants to know about tagged
> @@ -1514,16 +1520,7 @@ static void populate_value(struct ref_array_item *ref)
>* is not consistent with what deref_tag() does
>* which peels the onion to the core.
>*/
> - buf = get_obj(tagged, , , );
> - if (!buf)
> - die(_("missing object %s for %s"),
> - oid_to_hex(tagged), ref->refname);
> - if (!obj)
> - die(_("parse_object_buffer failed on %s for %s"),
> - oid_to_hex(tagged), ref->refname);
> - grab_values(ref->value, 1, obj, buf, size);
> - if (!eaten)
> - free(buf);
> + get_object(ref, tagged, 1, );
>  }
>  
>  /*
>
> --
> https://github.com/git/git/pull/460


Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Junio C Hamano
Matt McCutchen  writes:

> ... may be an important optimization.)  If the line endings are changed
> without changing the size or post-conversion content, then no unstaged
> change is reported.  It does not appear that git saves the pre-
> conversion content.

Correct.  The cached-stat information is meant to be compared with
the size on the filesystem.  Based on your observation, it seems
that what you are seeing is not specific to safe-crlf thing.

If you reconfigure anything that affects the checkout conversion
codepath, including the "smudge" filter, an entry in the index that
used to be up-to-date will still have cached-stat info like
timestamp and size that match the on-disk file, even though if you
_were_ to check it out afresh out of the index, the reconfigured
checkout codepath may produce different file contents on-disk.  A
consequence of this is that you may cause Git to still say that the
path is clean, even though it is no longer true.

There is no single "right" solution out of this situation, as it
depends on the reason why you made such a reconfiguration in the
first place.

 - If the reason is because you found that what is stored in the
   index is correct but their contents are checked out incorrectly
   (e.g. both the index and the working tree files end their lines
   with LF, but you want your working tree files to be converted to
   CRLF, and you futzed with .gitattributes or core.crlf), then you
   would want to "correct" it by checking them out, bypassing the
   "if the cached-stat information says we already have the matching
   contents on disk, do not write the file out" optimization.

 - If the reason is the other way around, then you would want to
   "correct" the indexed contents by rehashing what you have on
   disk.  Perhaps a recently added "git add --renormalize" is what
   you are looking for.





Re: [PATCH v2 4/9] t3701: don't hard code sha1 hash values

2018-02-21 Thread Junio C Hamano
Phillip Wood  writes:

> Keeping the permission bits makes sense (I'd not thought of them when
> I created the patch) as we want to check that the file has the correct
> permissions. As for the all-zero object name, is it really worth
> leaving it in - if a file has been created or deleted then we'll still
> have /dev/null as the file name for one side or the other and the diff
> lines will show it as well. As these tests are just to check the state
> of the index then I'm not sure the hash values add anything. How do
> you feel about a filter like
>
> sed "/^   index/s/ [0-9a-f][0-9a-f]*\.\.[0-9a-f][0-9a-f]*/x/"

Something like that, with perhaps the single 'x' replaced with
something more normal looking and the all-zero thing special cased,
was what I had in mind.

Special casing all-zero matters.  I know that the current code uses
all-zero on the missing side.  The thing is, tests are about
protecting the correctness we currently have, and we want to catch
the next idiot that breaks the code to stop showing all-zero when
talking about the missing side.


Re: [PATCH] t/known-leaky: add list of known-leaky test scripts

2018-02-21 Thread Junio C Hamano
Martin Ågren  writes:

> On 19 February 2018 at 22:29, Jeff King  wrote:
> ...
>> Or alternatively, we could just not bother with checking this into the
>> repository, and it becomes a local thing for people interested in
>> leak-testing. What's the value in having a shared known-leaky list,
>> especially if we don't expect most people to run it.
>
> This sums up my feeling about this.

Even though keeping track of list of known-leaky tests may not be so
interesting, we can still salvage useful pieces from the discussion
and make them available to developers, e.g.  something like

prove --dry --state=failed |
perl -lne '/^(t[0-9]{4})-.*\.sh$/ and print $1' | sort >$@+
if cmp >/dev/null $@ $@+; then rm $@+; else mv $@+ $@; fi

could be made into a target to stash away the list of failing tests
after a test run?



RE: Stackdump from stash save on Windows 10 64-bit

2018-02-21 Thread Johannes Schindelin
Hi Tim,

I re-Cc:ed the Git mailing list, as I do not like the pressure of being
the only one you ask for help.

On Wed, 21 Feb 2018, Tim Mayo wrote:

>   > Can you please test with v2.16.2
> 
> Is there a built version somewhere .. or do I have to build it myself?

Yes, official Git for Windows versions are always available from
https://gitforwindows.org/: either click the big Download button (which
will download the installer right away), or click on the version in the
upper left (if you want to choose which flavor of Git for Windows you want
to download).

Ciao,
Johannes


Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee

On 2/20/2018 5:57 PM, Jeff King wrote:

On Tue, Feb 20, 2018 at 05:12:50PM -0500, Derrick Stolee wrote:


In rev-list, the "--header" option outputs a value and expects the buffer to
be cached. It outputs the header info only if get_cached_commit_buffer()
returns a non-null buffer, giving incorrect output. If it called
get_commit_buffer() instead, it would immediately call
get_cached_commit_buffer() and on failure actually load the buffer.

This has not been a problem before, since the buffer was always loaded at
some point for each commit (and saved because of the save_commit_buffer
global).

I propose to make get_cached_commit_buffer() static to commit.c and convert
all callers to get_commit_buffer(). Is there any reason to _not_ do this? It
seems that there is no functional or performance change.

That helper was added in 152ff1cceb (provide helpers to access the
commit buffer, 2014-06-10). I think interesting part is the final
paragraph:

   Note that we also need to add a "get_cached" variant which
   returns NULL when we do not have a cached buffer. At first
   glance this seems to defeat the purpose of "get", which is
   to always provide a return value. However, some log code
   paths actually use the NULL-ness of commit->buffer as a
   boolean flag to decide whether to try printing the
   commit. At least for now, we want to continue supporting
   that use.

So I think a conversion to get_commit_buffer() would break the callers
that use the boolean nature for something useful. Unfortunately the
author did not enumerate exactly what those uses are, so we'll have to
dig. :)

My guess is that it has to do with showing some commits twice, since we
would normally have the buffer available the first time we hit the
commit, and then free it in finish_commit().

If we blame that rev-list line (and then walk back over a bunch of
uninteresting commits via parent re-blaming), it comes from 3131b71301
(Add "--show-all" revision walker flag for debugging, 2008-02-09).


Thanks for doing this digging. I appreciate the breadcrumbs, too, so I 
can do a better job of digging next time.



So there it is. It does show commits multiple times, but suppresses the
verbose header after the first showing. If we do something like this:

   git rev-list --show-all --pretty --boundary c93150cfb0^-

you'll see some boundary commits that _don't_ have their pretty headers
shown. And with your proposed patch, we'd show them again. To keep the
same behavior we need to store that "we've already seen this" boolean
somewhere else (e.g., in an object flag; possibly SEEN, but that might
run afoul of other logic).


What confuses me about this behavior is that the OID is still shown on 
the repeat (and in the case of `git log --oneline` will not actually 
have a line break between two short-OIDs). I don't believe this behavior 
is something to preserve.


You are right that we definitely don't want to show the full content twice.


It looks like the call in log-tree comes from the same commit, and
serves the same purpose.

Aside from storing the boolean "did we show it" in another way, the
other option is to simply change the behavior and accept that we might
pretty-print the commit twice. This is a backwards-incompatible change,
but I'm not sure if anybody would care. According to that commit,
--show-all was added explicitly for debugging, and it has never been
documented.  I couldn't find any reference to people actually using it
on the list (a grep of the whole archive turns up 32 messages, most of
which are just it appearing in context; the only person mentioning its
actual use was Linus in 2008.


Unless I am misunderstanding, the current behavior on a repeated commit 
is already incorrect: some amount of output occurs before checking the 
buffer, so the output includes repeated records but with formatting that 
violates the expectation. By doing the simple change of swapping 
get_cached_commit_buffer() with get_commit_buffer(), we correct that 
format violation but have duplicate copies.


The most-correct thing to do (in my opinion) is to put the requirement 
of "no repeats" into the revision walk logic and stop having the 
formatting methods expect them. Then, however we change this boolean 
setting of "we have seen this before" it will not require the formatting 
methods to change.


I can start working on a patch to move the duplicate-removal logic into 
revision.c instead of these three callers:


builtin/rev-list.c: if (revs->verbose_header && 
get_cached_commit_buffer(commit, NULL)) {

log-tree.c: if (!get_cached_commit_buffer(commit, NULL))
object.c:   if (!get_cached_commit_buffer(commit, 
NULL)) {


But this caller seems pretty important in pretty.c:

    /*
 * Otherwise, we still want to munge the encoding header in the
 * result, which will be done by modifying the buffer. If we
 * are using a fresh copy, we can reuse it. But if 

Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Matt McCutchen
On Wed, 2018-02-21 at 08:53 +0100, Torsten Bögershausen wrote:
> I don't hava a pointer, but what should happen ?
> 2 warnings for 2 "git add" should be OK, I think.
> 
> 1 warning is part of the optimization, that Git does to handle
> hundrets and thousands of files efficciently.
> 
> Is the 1/2 warning  real live problem  ?

As I've suggested, my opinion is that the nondeterministic second
warning can result in significant user confusion and should be avoided.
 (If it were always shown, I'd be less concerned.)  We'll see what the
decision-makers think.

Matt


Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Matt McCutchen
On Tue, 2018-02-20 at 08:42 -0500, Matt McCutchen wrote:
> In either case, if "git update-index --refresh" (or "git status") is
> run before "git add", then "git add" does not print the warning.  On
> the other hand, if line endings in the working tree file are changed,
> then git shows the file as having an unstaged change, even though the
> content that would be added to the index after CRLF conversion is
> identical.  So it seems that git remembers the pre-conversion file
> content and uses it for "git update-index --refresh" and would just
> need to use it for "git add" as well.

On further testing, this analysis is wrong.  What I was seeing is that
if the size of the working tree file has changed, git reports an
unstaged change.  (I suppose that reporting an unstaged change in this
case without checking whether the post-conversion content has changed
may be an important optimization.)  If the line endings are changed
without changing the size or post-conversion content, then no unstaged
change is reported.  It does not appear that git saves the pre-
conversion content.

Thus, if it were possible to create a file that doesn't need a safecrlf
warning, add it to the index, and then modify it so that it does need a
safecrlf warning without changing the size or post-conversion content,
we would have a bug where no warning is shown in the case where "git
status" is run before the second "git add".  I believe this bug can't
occur in the particular case of CRLF conversion without other filters
because the file that doesn't need a safecrlf warning has a unique
minimum (LF) or maximum (CRLF) size, though I presume it could occur
with custom filters.  My proposal would then be that "git add" should
not show a safecrlf warning if the size and post-conversion content
haven't changed; it would merely bring "git add" to parity with the
potential bug in the "git status" case.

Matt


Re: Why git-revert doesn't invoke the pre-commit and the commit-msg hooks?

2018-02-21 Thread Gustavo Chaves
2018-02-20 15:00 GMT-03:00 Junio C Hamano :
> It would make more sense (if we were to add
> an option to run any hook we currently do not run to the command) to
> run pre-revert/revert-msg hooks instead, and then people who happen
> to want to do the same thing in these hooks what they do for
> ordinary commits can just call their pre-commit/commit-msg hooks
> from there, perhaps.

I like this idea very much as it doesn't break a long standing behaviour and
simply introduces a new feature.

-- 
Gustavo.


[ANNOUNCE] Git Rev News edition 36

2018-02-21 Thread Christian Couder
Hi everyone,

The 36th edition of Git Rev News is now published:

  https://git.github.io/rev_news/2018/02/21/edition-36/

Thanks a lot to all the contributors!

Enjoy,
Christian, Jakub, Markus and Gabriel.


Re: [PATCH v2 4/9] t3701: don't hard code sha1 hash values

2018-02-21 Thread Phillip Wood

On 20/02/18 17:39, Junio C Hamano wrote:

Phillip Wood  writes:


From: Phillip Wood 

Purge the index lines from diffs so we're not hard coding sha1 hash
values in the expected output.


The motivation of this patch is clear, but all-zero object name for
missing side of deletion or creation patch should not change when we
transition to any hash function.  Neither the permission bits shown
in the output (and whether the index line has the bits are shown on
it in the first place, i.e. the index line of a creation patch does
not, whilethe one in a modification patch does).

So I am a bit ambivalent about this change.

Perhaps have a filter that redacts, instead of removes, selected
pieces of information that are likely to change while hash
transition, and use that consistently to filter both the expected
output and the actual output before comparing?

Keeping the permission bits makes sense (I'd not thought of them when I 
created the patch) as we want to check that the file has the correct 
permissions. As for the all-zero object name, is it really worth leaving 
it in - if a file has been created or deleted then we'll still have 
/dev/null as the file name for one side or the other and the diff lines 
will show it as well. As these tests are just to check the state of the 
index then I'm not sure the hash values add anything. How do you feel 
about a filter like


sed "/^index/s/ [0-9a-f][0-9a-f]*\.\.[0-9a-f][0-9a-f]*/x/"

Best Wishes

Phillip


Re: [PATCH v2 5/9] t3701: add failing test for pathological context lines

2018-02-21 Thread Phillip Wood

On 19/02/18 11:29, Phillip Wood wrote:

From: Phillip Wood 

When a hunk is skipped by add -i the offsets of subsequent hunks are
not adjusted to account for any missing insertions due to the skipped
hunk. Most of the time this does not matter as apply uses the context
lines to apply the subsequent hunks in the correct place, however in
pathological cases the context lines will match at the now incorrect
offset and the hunk will be applied in the wrong place. The offsets of
hunks following an edited hunk that has had the number of insertions
or deletions changed also need to be updated in the same way. Add
failing tests to demonstrate this.

Signed-off-by: Phillip Wood 
---

Notes:
 changes since v1:
  - changed edit test to use the existing fake editor and to strip
the hunk header and some context lines as well as deleting an
insertion
  - style fixes

  t/t3701-add-interactive.sh | 31 +++
  1 file changed, 31 insertions(+)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
70748393f28c93f4bc5d43b07bd96bd208aba3e9..06c4747f506a1b05ccad0f9387e1fbd69cfd7784
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -511,4 +511,35 @@ test_expect_success 'status ignores dirty submodules 
(except HEAD)' '
! grep dirty-otherwise output
  '
  
+test_expect_success 'set up pathological context' '

+   git reset --hard &&
+   test_write_lines a a a a a a a a a a a >a &&
+   git add a &&
+   git commit -m a &&
+   test_write_lines c b a a a a a a a b a a a a >a &&
+   test_write_lines a a a a a a a b a a a a >expected-1 &&
+   test_write_lines   b a a a a a a a b a a a a >expected-2 &&
+   # check editing can cope with missing header and deleted context lines
+   # as well as changes to other lines
+   test_write_lines +b " a" >patch
+'
+
+test_expect_failure 'add -p works with pathological context lines' '
+   git reset &&
+   printf "%s\n" n y |
+   git add -p &&
+   git cat-file blob :a >actual &&
+   test_cmp expected-1 actual
+'
+
+test_expect_failure 'add -p patch editing works with pathological context 
lines' '
+   git reset &&
+   test_set_editor "$FAKE_EDITOR" &&

In light of the discussion of patch 2, I think this line should be deleted



+   # n q q below is in case edit fails
+   printf "%s\n" e yn q q |
+   git add -p &&
+   git cat-file blob :a >actual &&
+   test_cmp expected-2 actual
+'
+
  test_done





Re: [PATCH v2 3/9] t3701: use test_write_lines and write_script

2018-02-21 Thread Phillip Wood

On 20/02/18 17:19, Junio C Hamano wrote:

Eric Sunshine  writes:


  test_expect_success 'setup fake editor' '
-   echo "#!$SHELL_PATH" >fake_editor.sh &&
-   cat >>fake_editor.sh <<-\EOF &&
+   FAKE_EDITOR="$(pwd)/fake-editor.sh" &&
+   write_script "$FAKE_EDITOR" <<-\EOF &&
 mv -f "$1" oldpatch &&
 mv -f patch "$1"
 EOF
-   chmod a+x fake_editor.sh &&
-   test_set_editor "$(pwd)/fake_editor.sh"
+   test_set_editor "$FAKE_EDITOR"
  '


The very first thing that test_set_editor() does is set FAKE_EDITOR to
the value of $1, so it is confusing to see it getting setting it here
first; the reader has to spend extra brain cycles wondering if
something non-obvious is going on that requires this manual
assignment. Perhaps drop the assignment altogether and just write
literal "fake_editor.sh" in the couple places it's needed (as was done
in the original code)?


Yeah, I think $(pwd)/ prefix is needed at the final step (i.e. as
the first argument to test_set_editor) for this to be a faithful
rewrite but it is distracting having to write it anywhere else.

Other than that, this looks like a quite straight-forward cleanup.

Thanks, both.  Here is what I'd be queuing tentatively.


That looks good, thanks for fixing it up

Phillip


  t/t3701-add-interactive.sh | 33 +
  1 file changed, 5 insertions(+), 28 deletions(-)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 39c7423069..4a369fcb51 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -87,13 +87,8 @@ test_expect_success 'setup expected' '
EOF
  '
  
-test_expect_success 'setup fake editor' '

-   >fake_editor.sh &&
-   chmod a+x fake_editor.sh &&
-   test_set_editor "$(pwd)/fake_editor.sh"
-'
-
  test_expect_success 'dummy edit works' '
+   test_set_editor : &&
(echo e; echo a) | git add -p &&
git diff > diff &&
test_cmp expected diff
@@ -110,12 +105,10 @@ test_expect_success 'setup patch' '
  '
  
  test_expect_success 'setup fake editor' '

-   echo "#!$SHELL_PATH" >fake_editor.sh &&
-   cat >>fake_editor.sh <<-\EOF &&
+   write_script "fake_editor.sh" <<-\EOF &&
mv -f "$1" oldpatch &&
mv -f patch "$1"
EOF
-   chmod a+x fake_editor.sh &&
test_set_editor "$(pwd)/fake_editor.sh"
  '
  
@@ -302,18 +295,12 @@ test_expect_success 'deleting an empty file' '
  
  test_expect_success 'split hunk setup' '

git reset --hard &&
-   for i in 10 20 30 40 50 60
-   do
-   echo $i
-   done >test &&
+   test_write_lines 10 20 30 40 50 60 >test &&
git add test &&
test_tick &&
git commit -m test &&
  
-	for i in 10 15 20 21 22 23 24 30 40 50 60

-   do
-   echo $i
-   done >test
+   test_write_lines 10 15 20 21 22 23 24 30 40 50 60 >test
  '
  
  test_expect_success 'split hunk "add -p (edit)"' '

@@ -334,17 +321,7 @@ test_expect_success 'split hunk "add -p (edit)"' '
  '
  
  test_expect_failure 'split hunk "add -p (no, yes, edit)"' '

-   cat >test <<-\EOF &&
-   5
-   10
-   20
-   21
-   30
-   31
-   40
-   50
-   60
-   EOF
+   test_write_lines 5 10 20 21 30 31 40 50 60 >test &&
git reset &&
# test sequence is s(plit), n(o), y(es), e(dit)
# q n q q is there to make sure we exit at the end.





Re: Git should preserve modification times at least on request

2018-02-21 Thread Jacob Keller
On Tue, Feb 20, 2018 at 2:05 PM, Peter Backes  wrote:
> Hi Jeff,
>
> On Tue, Feb 20, 2018 at 04:16:34PM -0500, Jeff King wrote:
>> I think there are some references buried somewhere in that wiki, but did
>> you look at any of the third-party tools that store file metadata
>> alongside the files in the repository? E.g.:
>>
>>   https://etckeeper.branchable.com/
>>
>> or
>>
>>   https://github.com/przemoc/metastore
>>
>> I didn't see either of those mentioned in this thread (though I also do
>> not have personal experience with them, either).
>>
>> Modification times are a subset of the total metadata you might care
>> about, so they are solving a much more general problem. Which may also
>> partially answer your question about why this isn't built into git. The
>> general problem gets much bigger when you start wanting to carry things
>> like modes (which git doesn't actually track; we really only care about
>> the executable bit) or extended attributes (acls, etc).
>
> I know about those, but that's not what I am looking for. Those tools
> serve entirely different purposes, ie., tracking file system changes.
> I, however, am specifically interested in version control.
>
> In version control, the user checks out his own copy of the tree for
> working. For this purpose, it is thus pointless to track ownership,
> permissions (except for the x bit), xattrs, or any other metadata. In
> fact, it can be considered the wrong thing to do.
>
> The modification time, however, is special. It clearly has its place in
> version control. It tells us when the last modification was actually
> done to the file. I am often working on some feature, and one part is
> finished and is lying around, but I am still working on other parts in
> other files. Then, maybe after some weeks, the other parts are
> finished. Now, when committing, the information about modification time
> is lost. Maybe some weeks later I want to figure out when I last
> modified those files that were committed. But that information is now
> gone, at least in the git repository. Sure, I could do lots of WIP
> commits, but this would clutter up the history unneccessarly and I
> would have lots of versions that might not even compile, let alone run.

You could have git figure this out by the commit time of the last
commit which modified a file. This gets a bit weird for cherry-picks
or other things like rebase, but that should get what you want.

If you only ever need this information sometimes, you can look it up
by doing something like:

git log -1 --pretty="%cd" -- 

That should show the commit time of the latest commit which touches
that file, which is "essentially" the modify time of the file in terms
of  the version control history.

Obviously, this wouldn't work if you continually amend a change of
multiple files, since git wouldn't track the files separately, and
this only really shows you the time of the last commit.

However, in "version control" sense, this *is* the last time a file
was modified, since it doesn't really care about the stuff that
happens outside of version control.

I'm not really sure if this is enough for you, or if you really want
to store the actual mtime for some reason? (I think you can likely
solve your problem in some other way though).

>
> As far as I remember, bitkeeper had this distinction between checkins
> and commits. You could check in a file at any time, and any number of
> times, and then group all those checkins together with a commit. Git
> seems to have avoided this principle, or have kept it only
> rudimentarily via git add (but git add cannot add more than one version
> of the same file). Perhaps for simplificiation of use, perhaps for
> simplification of implementation, I don't know.
>

You can do lots of commits on a branch and then one merge commit to
merge it into the main line. This is a common strategy used by many
people.

Thanks,
Jake

> I assume, if it were not for the build tool issues, git would have
> tracked mtime from the very start.
>

Maybe. Personally, I would hate having my mtime not be "the time I
checked the file out", since this is intuitive to me at this point.
I'm sure if I lived in a different world I'd be used to that way also,
though.

The build issue *is* important though, because many build systems rely
on the mtime to figure out what to rebuild, and a complete rebuild
isn't a good idea for very large projects.

Thanks,
Jake

> Best wishes
> Peter
> --
> Peter Backes, r...@helen.plasma.xg8.de


test bare repository for unit tests

2018-02-21 Thread Basin Ilya
Hi.
I want to the test-repo-git under 
https://github.com/apache/maven-wagon/tree/master/wagon-providers/wagon-scm/src/test/resources/
just like test-repo-cvs and test-repo-svn

Which configuation options would suit that?
I think core.compression 0 for human readable diffs.
also, I need to force loose, gc after each push.


<    1   2