date:20180215

[PATCH] subtree: hide GPG signatures in calls to log

2018-02-15 Thread Stephen R Guglielmo

This fixes `add` and `pull` for GPG signed objects.

Signed-off-by: Stephen R Guglielmo 
---
 contrib/subtree/git-subtree.sh | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/subtree/git-subtree.sh b/contrib/subtree/git-subtree.sh
index dec085a23..9594ca4b5 100755
--- a/contrib/subtree/git-subtree.sh
+++ b/contrib/subtree/git-subtree.sh
@@ -297,7 +297,7 @@ find_latest_squash () {
 main=
 sub=
 git log --grep="^git-subtree-dir: $dir/*\$" \
---pretty=format:'START %H%n%s%n%n%b%nEND%n' HEAD |
+--no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' HEAD |
 while read a b junk
 do
 debug "$a $b $junk"
@@ -341,7 +341,7 @@ find_existing_splits () {
 main=
 sub=
 git log --grep="^git-subtree-dir: $dir/*\$" \
---pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |
+--no-show-signature --pretty=format:'START %H%n%s%n%n%b%nEND%n' $revs |
 while read a b junk
 do
 case "$a" in
@@ -382,7 +382,7 @@ copy_commit () {
 # We're going to set some environment vars here, so
 # do it in a subshell to get rid of them safely later
 debug copy_commit "{$1}" "{$2}" "{$3}"
-git log -1 --pretty=format:'%an%n%ae%n%aD%n%cn%n%ce%n%cD%n%B' "$1" |
+git log --no-show-signature -1
--pretty=format:'%an%n%ae%n%aD%n%cn%n%ce%n%cD%n%B' "$1" |
 (
 read GIT_AUTHOR_NAME
 read GIT_AUTHOR_EMAIL
@@ -462,8 +462,8 @@ squash_msg () {
 oldsub_short=$(git rev-parse --short "$oldsub")
 echo "Squashed '$dir/' changes from $oldsub_short..$newsub_short"
 echo
-git log --pretty=tformat:'%h %s' "$oldsub..$newsub"
-git log --pretty=tformat:'REVERT: %h %s' "$newsub..$oldsub"
+git log --no-show-signature --pretty=tformat:'%h %s' "$oldsub..$newsub"
+git log --no-show-signature --pretty=tformat:'REVERT: %h %s'
"$newsub..$oldsub"
 else
 echo "Squashed '$dir/' content from commit $newsub_short"
 fi
@@ -475,7 +475,7 @@ squash_msg () {

 toptree_for_commit () {
 commit="$1"
-git log -1 --pretty=format:'%T' "$commit" -- || exit $?
+git rev-parse --verify "$commit^{tree}" || exit $?
 }

 subtree_for_commit () {
-- 
2.16.1

[PATCH] test-lib.sh: unset XDG_CACHE_HOME

2018-02-15 Thread Genki Sky

git respects XDG_CACHE_HOME for the credential cache. So, we should
unset XDG_CACHE_HOME for the test environment, lest a user's custom one
cause failure in the test.

For example, t/t0301-credential-cache.sh expects a default directory
to be used if it hasn't explicitly set XDG_CACHE_HOME.

Signed-off-by: Genki Sky 
---

Notes:

  This is the XDG_CACHE_HOME version of 5adf84ebb ("test-lib.sh: unset
  XDG_CONFIG_HOME", 2012-07-24).

 t/test-lib.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 9af19055b..001ef6b64 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -116,6 +116,7 @@ unset VISUAL EMAIL LANGUAGE COLUMNS $("$PERL_PATH" -e '
my @vars = grep(/^GIT_/ && !/^GIT_($ok)/o, @env);
print join("\n", @vars);
 ')
+unset XDG_CACHE_HOME
 unset XDG_CONFIG_HOME
 unset GITPERLLIB
 GIT_AUTHOR_EMAIL=aut...@example.com
--
2.16.1.195.g373da842b

Re: [PATCH v16 Part II 7/8] bisect--helper: `bisect_start` shell function partially in C

2018-02-15 Thread SZEDER Gábor

> diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
> index ab0580ce0089a..4ac175c49e80c 100644
> --- a/builtin/bisect--helper.c
> +++ b/builtin/bisect--helper.c

> + /*
> +  * Check if we are bisecting
> +  */
> + if (!is_empty_or_missing_file(git_path_bisect_start())) {
> + /* Reset to the rev from where we started */
> + strbuf_read_file(_head, git_path_bisect_start(), 0);
> + strbuf_trim(_head);
> + if (!no_checkout) {
> + struct argv_array argv = ARGV_ARRAY_INIT;
> +
> + argv_array_pushl(, "checkout", start_head.buf,
> +  "--", NULL);
> + if (run_command_v_opt(argv.argv, RUN_GIT_CMD)) {
> + error(_("checking out '%s' failed. Try 'git "
> + "bisect start '."),
> +   start_head.buf);
> + goto fail;
> + }
> + }
> + } else {
> + /* Get the rev from where we start. */
> + if (!get_oid(head, _oid) &&
> + !starts_with(head, "refs/heads/")) {
> + strbuf_reset(_head);
> + strbuf_addstr(_head, sha1_to_hex(head_oid.hash));

Please use oid_to_hex(_oid) instead of sha1_to_hex().

[RFC PATCH] cherry: teach git cherry to respect pathspec

2018-02-15 Thread Rasmus Villemoes

This is a very naive attempt at teaching git cherry to restrict
attention to a given pathspec. Very much RFC, hence no documentation
update or test cases for now.

The first problem is the existing signature, with between 0 and 3
arguments. In order to preserve having defaults for limit, head,
upstream, I decided to make use of -- mandatory. The alternative seems
to be to only take a pathspec as arguments 4+, forcing the users to
write out the default values for head and/or limit (and how does one
actually write the default "no limit"?).

At first I just tried parse_pathspec directly to _data, but
that didn't seem to have any effect, and from staring at revision.c for
a while, it seems that one has to go through setup_revisions(). But that
fits well with insisting on the -- above, since we then have an argv[]
that ensures we never hit the handle_revision_arg() or
handle_revision_pseudo_opt() calls.

The motivation for this is that I'm in the process of switching a BSP
from a vendor kernel to one much closer to mainline's master branch. I
do need to apply/port some out-of-tree patches from the vendor kernel,
but it's much more managable if one can do a per-subsystem survey of
what might need porting. So I naively started by doing

  git cherry -v linus/master LSDK-17.12-V4.9 v4.9.62 -- drivers/mtd/spi-nor/

assuming that would Just Work. I was somewhat surprised that this didn't
give any output, but digging into the source revealed that (1) there
isn't actually any pathspec handling and (2) any argc other than 1..3 is
treated as 0 - something which is probably also worth fixing.

With this patch, the command above does give something reasonable, and
t3500-cherry.sh also seems to pass.

Signed-off-by: Rasmus Villemoes 
---
 builtin/log.c | 33 +++--
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/builtin/log.c b/builtin/log.c
index 46b4ca13e..2d4534b5c 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -1890,6 +1890,8 @@ int cmd_cherry(int argc, const char **argv, const char 
*prefix)
const char *head = "HEAD";
const char *limit = NULL;
int verbose = 0, abbrev = 0;
+   int pathspec_argc = 0;
+   int i;
 
struct option options[] = {
OPT__ABBREV(),
@@ -1897,17 +1899,27 @@ int cmd_cherry(int argc, const char **argv, const char 
*prefix)
OPT_END()
};
 
-   argc = parse_options(argc, argv, prefix, options, cherry_usage, 0);
+   argc = parse_options(argc, argv, prefix, options, cherry_usage,
+PARSE_OPT_KEEP_ARGV0 | PARSE_OPT_KEEP_UNKNOWN |
+PARSE_OPT_KEEP_DASHDASH);
+
+   for (i = 1; i < argc; ++i) {
+   if (!strcmp(argv[i], "--")) {
+   pathspec_argc = 1 + argc - i;
+   argc = i;
+   break;
+   }
+   }
 
switch (argc) {
+   case 4:
+   limit = argv[3];
+   /* FALLTHROUGH */
case 3:
-   limit = argv[2];
+   head = argv[2];
/* FALLTHROUGH */
case 2:
-   head = argv[1];
-   /* FALLTHROUGH */
-   case 1:
-   upstream = argv[0];
+   upstream = argv[1];
break;
default:
current_branch = branch_get(NULL);
@@ -1921,6 +1933,15 @@ int cmd_cherry(int argc, const char **argv, const char 
*prefix)
}
 
init_revisions(, prefix);
+   if (pathspec_argc) {
+   /*
+* hack: this reuses the element of argv before "--"
+* as argv[0] - setup_revisions assumes argv[0] is
+* present and ignores it, and starts its search for
+* "--" at argv[1].
+*/
+   setup_revisions(pathspec_argc, [argc-1], , NULL);
+   }
revs.max_parents = 1;
 
if (add_pending_commit(head, , 0))
-- 
2.15.1

Re: [PATCH v7 25/31] merge-recursive: apply necessary modifications for directory renames

2018-02-15 Thread SZEDER Gábor

On Wed, Jan 31, 2018 at 12:25 AM, Elijah Newren  wrote:
> This commit hooks together all the directory rename logic by making the
> necessary changes to the rename struct, it's dst_entry, and the
> diff_filepair under consideration.
>
> Signed-off-by: Elijah Newren 
> ---
>  merge-recursive.c   | 187 
> +++-
>  t/t6043-merge-rename-directories.sh |  50 +-
>  2 files changed, 211 insertions(+), 26 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 38dc0eefaf..7c78dc2dc1 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c

> @@ -641,6 +643,27 @@ static int update_stages(struct merge_options *opt, 
> const char *path,
> return 0;
>  }
>
> +static int update_stages_for_stage_data(struct merge_options *opt,
> +   const char *path,
> +   const struct stage_data *stage_data)
> +{
> +   struct diff_filespec o, a, b;
> +
> +   o.mode = stage_data->stages[1].mode;
> +   oidcpy(, _data->stages[1].oid);
> +
> +   a.mode = stage_data->stages[2].mode;
> +   oidcpy(, _data->stages[2].oid);
> +
> +   b.mode = stage_data->stages[3].mode;
> +   oidcpy(, _data->stages[3].oid);
> +
> +   return update_stages(opt, path,
> +is_null_sha1(o.oid.hash) ? NULL : ,
> +is_null_sha1(a.oid.hash) ? NULL : ,
> +is_null_sha1(b.oid.hash) ? NULL : );

Please use is_null_oid() etc. instead of is_null_sha1().

Re: [PATCH] merge: allow fast-forward when merging a tracked tag

2018-02-15 Thread Eric Sunshine

On Thu, Feb 15, 2018 at 5:45 PM, Junio C Hamano  wrote:
> [...]
> Update the default (again) for "git merge" that merges a tag object
> to (1) --no-ff (i.e. create a merge commit even when side branch
> fast forwards) if the tag being merged is not at its expected place
> in refs/tags/ hierarchy and (2) --ff (i.e. allow fast-forward update
> when able) otherwise.
>
> Signed-off-by: Junio C Hamano 
> ---
> diff --git a/builtin/merge.c b/builtin/merge.c
> @@ -1125,6 +1126,42 @@ static struct commit_list *collect_parents(struct 
> commit *head_commit,
> +static int merging_a_throwaway_tag(struct commit *commit)
> +{
> +   const char *tag_ref;
> +   struct object_id oid;
> +
> +   /* Are we merging a tag? */
> +   if (!merge_remote_util(commit) ||
> +   !merge_remote_util(commit)->obj ||
> +   merge_remote_util(commit)->obj->type != OBJ_TAG)
> +   return 0;
> +
> +   /*
> +* Now we know we are merging a tag object.  Are we downstream
> +* and following the tags from upstream?  If so, we must have
> +* the tag object pointed at by "refs/tags/$T" where $T is the
> +* tagname recorded in the tag object.  We want to allow such
> +* a "just to catch up" merge to fast-forward.
> +*/
> +   tag_ref = xstrfmt("refs/tags/%s",
> + ((struct tag 
> *)merge_remote_util(commit)->obj)->tag);

xstrfmt() allocates a new string...

> +   if (!read_ref(tag_ref, ) &&
> +   !oidcmp(, _remote_util(commit)->obj->oid))
> +   return 0;

...which is leaked here...

> +
> +   /*
> +* Otherwise, we are playing an integrator's role, making a
> +* merge with a throw-away tag from a contributor with
> +* something like "git pull $contributor $signed_tag".
> +* We want to forbid such a merge from fast-forwarding
> +* by default; otherwise we would not keep the signature
> +* anywhere.
> +*/
> +   return 1;

...and here.

> +}

[PATCH v3] worktree: add: fix 'post-checkout' not knowing new worktree location

2018-02-15 Thread Eric Sunshine

Although "git worktree add" learned to run the 'post-checkout' hook in
ade546be47 (worktree: invoke post-checkout hook, 2017-12-07), it
neglected to change to the directory of the newly-created worktree
before running the hook. Instead, the hook runs within the directory
from which the "git worktree add" command itself was invoked, which
effectively neuters the hook since it knows nothing about the new
worktree directory.

Further, ade546be47 failed to sanitize the environment before running
the hook, which means that user-assigned values of GIT_DIR and
GIT_WORK_TREE could mislead the hook about the location of the new
worktree. In the case of "git worktree add" being run from a bare
repository, the GIT_DIR="." assigned by Git itself leaks into the hook's
environment and breaks Git commands; this is so even when the working
directory is correctly changed to the new worktree before the hook runs
since ".", relative to the new worktree directory, does not point at the
bare repository.

Fix these problems by (1) changing to the new worktree's directory
before running the hook, and (2) sanitizing the environment of GIT_DIR
and GIT_WORK_TREE so hooks can't be confused by misleading values.

Enhance the t2025 'post-checkout' tests to verify that the hook is
indeed run within the correct directory and that Git commands invoked by
the hook compute Git-dir and top-level worktree locations correctly.

While at it, also add two new tests: (1) verify that the hook is run
within the correct directory even when the new worktree is created from
a sibling worktree (as opposed to the main worktree); (2) verify that
the hook is provided with correct context when the new worktree is
created from a bare repository (test provided by Lars Schneider).

Implementation Notes:

Rather than sanitizing the environment of GIT_DIR and GIT_WORK_TREE, an
alternative would be to set them explicitly, as is already done for
other Git commands run internally by "git worktree add". This patch opts
instead to sanitize the environment in order to clearly document that
the worktree is fully functional by the time the hook is run, thus does
not require special environmental overrides.

The hook is run manually, rather than via run_hook_le(), since it needs
to change the working directory to that of the worktree, and
run_hook_le() does not provide such functionality. As this is a one-off
case, adding 'run_hook' overloads which allow the directory to be set
does not seem warranted at this time.

Reported-by: Lars Schneider 
Signed-off-by: Eric Sunshine 
---

This is a re-roll of [1] which fixes "git worktree add" to provide
proper context to the 'post-checkout' hook so that the hook knows the
location of the newly-created worktree.

Changes since v2:

* Fix crash due to missing NULL-terminator on 'env' list passed to
  run_command().

[1]: 
https://public-inbox.org/git/20180215191841.40848-1-sunsh...@sunshineco.com/

builtin/worktree.c  | 20 ---
 t/t2025-worktree-add.sh | 54 ++---
 2 files changed, 62 insertions(+), 12 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 7cef5b120b..f69f862947 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -345,9 +345,23 @@ static int add_worktree(const char *path, const char 
*refname,
 * Hook failure does not warrant worktree deletion, so run hook after
 * is_junk is cleared, but do return appropriate code when hook fails.
 */
-   if (!ret && opts->checkout)
-   ret = run_hook_le(NULL, "post-checkout", oid_to_hex(_oid),
- oid_to_hex(>object.oid), "1", NULL);
+   if (!ret && opts->checkout) {
+   const char *hook = find_hook("post-checkout");
+   if (hook) {
+   const char *env[] = { "GIT_DIR", "GIT_WORK_TREE", NULL 
};
+   cp.git_cmd = 0;
+   cp.no_stdin = 1;
+   cp.stdout_to_stderr = 1;
+   cp.dir = path;
+   cp.env = env;
+   cp.argv = NULL;
+   argv_array_pushl(, absolute_path(hook),
+oid_to_hex(_oid),
+oid_to_hex(>object.oid),
+"1", NULL);
+   ret = run_command();
+   }
+   }
 
argv_array_clear(_env);
strbuf_release();
diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh
index 2b95944973..d0d2e4f7ec 100755
--- a/t/t2025-worktree-add.sh
+++ b/t/t2025-worktree-add.sh
@@ -451,32 +451,68 @@ test_expect_success 'git worktree --no-guess-remote 
option overrides config' '
 '
 
 post_checkout_hook () {
-   test_when_finished "rm -f .git/hooks/post-checkout" &&
-   mkdir -p .git/hooks &&
-   write_script .git/hooks/post-checkout <<-\EOF
-

Re: [PATCH v2] worktree: add: fix 'post-checkout' not knowing new worktree location

2018-02-15 Thread Eric Sunshine

On Thu, Feb 15, 2018 at 4:27 PM, Eric Sunshine  wrote:
> On Thu, Feb 15, 2018 at 12:52:11PM -0800, Junio C Hamano wrote:
>> This seems to segfault on me, without leaving hook.actual anywhere.
>
> I'm unable to reproduce the segfault, but I'm guessing it's because
> I'm a dummy. Can you squash in the following and retry?
>
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> -   const char *env[] = { "GIT_DIR", "GIT_WORK_TREE" };
> +   const char *env[] = { "GIT_DIR", "GIT_WORK_TREE", 
> NULL };
>
> If that fixes it, can you squash it locally or should I re-send?

Okay, I was able to reproduce the crash on FreeBSD (but not MacOS or
Linux), and the above change does indeed fix it. I'll send v3 in a
moment to address it.

[PATCH] merge: allow fast-forward when merging a tracked tag

2018-02-15 Thread Junio C Hamano

Long time ago at fab47d05 ("merge: force edit and no-ff mode when
merging a tag object", 2011-11-07), "git merge" was made to always
create a merge commit when merging a tag, even when the side branch
being merged is a descendant of the current branch.

This default is good for merges made by upstream maintainers to
integrate work signed by downstream contributors, but will leave
pointless no-ff merges when downstream contributors pull a newer
release tag to make their long-running topic branches catch up with
the upstream.  When there is no local work left on the topic, such a
merge should simply fast-forward to the commit pointed at by the
release tag.

Update the default (again) for "git merge" that merges a tag object
to (1) --no-ff (i.e. create a merge commit even when side branch
fast forwards) if the tag being merged is not at its expected place
in refs/tags/ hierarchy and (2) --ff (i.e. allow fast-forward update
when able) otherwise.

Signed-off-by: Junio C Hamano 
---

> There are a few fallouts in the testsuite if we go this route.  I am
> not quite decided if I like the approach.

This time with a few tests for the new default, and minimum
adjustement to the documentation.

 Documentation/merge-options.txt |  3 ++-
 builtin/merge.c | 42 +
 t/t6200-fmt-merge-msg.sh|  2 +-
 t/t7600-merge.sh| 38 -
 4 files changed, 78 insertions(+), 7 deletions(-)

diff --git a/Documentation/merge-options.txt b/Documentation/merge-options.txt
index 3888c3ff85..63a3fc0954 100644
--- a/Documentation/merge-options.txt
+++ b/Documentation/merge-options.txt
@@ -35,7 +35,8 @@ set to `no` at the beginning of them.
 --no-ff::
Create a merge commit even when the merge resolves as a
fast-forward.  This is the default behaviour when merging an
-   annotated (and possibly signed) tag.
+   annotated (and possibly signed) tag that is not stored in
+   its natural place in 'refs/tags/' hierarchy.
 
 --ff-only::
Refuse to merge and exit with a non-zero status unless the
diff --git a/builtin/merge.c b/builtin/merge.c
index 30264cfd7c..45c7916505 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -33,6 +33,7 @@
 #include "sequencer.h"
 #include "string-list.h"
 #include "packfile.h"
+#include "tag.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -1125,6 +1126,42 @@ static struct commit_list *collect_parents(struct commit 
*head_commit,
return remoteheads;
 }
 
+static int merging_a_throwaway_tag(struct commit *commit)
+{
+   const char *tag_ref;
+   struct object_id oid;
+
+   /* Are we merging a tag? */
+   if (!merge_remote_util(commit) ||
+   !merge_remote_util(commit)->obj ||
+   merge_remote_util(commit)->obj->type != OBJ_TAG)
+   return 0;
+
+   /*
+* Now we know we are merging a tag object.  Are we downstream
+* and following the tags from upstream?  If so, we must have
+* the tag object pointed at by "refs/tags/$T" where $T is the
+* tagname recorded in the tag object.  We want to allow such
+* a "just to catch up" merge to fast-forward.
+*/
+   tag_ref = xstrfmt("refs/tags/%s",
+ ((struct tag *)merge_remote_util(commit)->obj)->tag);
+
+   if (!read_ref(tag_ref, ) &&
+   !oidcmp(, _remote_util(commit)->obj->oid))
+   return 0;
+
+   /*
+* Otherwise, we are playing an integrator's role, making a
+* merge with a throw-away tag from a contributor with
+* something like "git pull $contributor $signed_tag".
+* We want to forbid such a merge from fast-forwarding
+* by default; otherwise we would not keep the signature
+* anywhere.
+*/
+   return 1;
+}
+
 int cmd_merge(int argc, const char **argv, const char *prefix)
 {
struct object_id result_tree, stash, head_oid;
@@ -1322,10 +1359,7 @@ int cmd_merge(int argc, const char **argv, const char 
*prefix)
oid_to_hex(>object.oid));
setenv(buf.buf, merge_remote_util(commit)->name, 1);
strbuf_reset();
-   if (fast_forward != FF_ONLY &&
-   merge_remote_util(commit) &&
-   merge_remote_util(commit)->obj &&
-   merge_remote_util(commit)->obj->type == OBJ_TAG)
+   if (fast_forward != FF_ONLY && merging_a_throwaway_tag(commit))
fast_forward = FF_NO;
}
 
diff --git a/t/t6200-fmt-merge-msg.sh b/t/t6200-fmt-merge-msg.sh
index 2e2fb0e957..a54a52aaa4 100755
--- a/t/t6200-fmt-merge-msg.sh
+++ b/t/t6200-fmt-merge-msg.sh
@@ -512,7 +512,7 @@ test_expect_success 'merge-msg with "merging" an annotated 
tag' '
 
test_when_finished "git reset --hard" &&
annote=$(git rev-parse annote) &&
-

Re: [PATCH 3/3] grep: avoid one strdup() per file

2018-02-15 Thread Jeff King

On Thu, Feb 15, 2018 at 10:56:15PM +0100, Rasmus Villemoes wrote:

> There is only one instance of grep_source_init(GREP_SOURCE_FILE), and in
> that case the path and identifier arguments are equal - not just as
> strings, but the same pointer is passed. So we can save some time and
> memory by reusing the gs->path = xstrdup_or_null(path) we have already
> done as gs->identifier, and changing grep_source_clear accordingly
> to avoid a double free.

IMHO this special case is not really worth it, unless we can show that
those few bytes saved somehow make a measurable difference in either
peak memory or execution speed.

-Peff

Re: [PATCH 1/3] grep: move grep_source_init outside critical section

2018-02-15 Thread Jeff King

On Thu, Feb 15, 2018 at 10:56:13PM +0100, Rasmus Villemoes wrote:

> grep_source_init typically does three strdup()s, and in the threaded
> case, the call from add_work() happens while holding grep_mutex.
> 
> We can thus reduce the time we hold grep_mutex by moving the
> grep_source_init() call out of add_work(), and simply have add_work()
> copy the initialized structure to the available slot in the todo
> array.
> 
> This also simplifies the prototype of add_work(), since it no longer
> needs to duplicate all the parameters of grep_source_init(). In the
> callers of add_work(), we get to reduce the amount of code duplicated in
> the threaded and non-threaded cases slightly (avoiding repeating the
> "GREP_SOURCE_OID, pathbuf.buf, path, oid" argument list); a subsequent
> cleanup patch will make that even more so.

I think this makes sense. It does blur the memory ownership lines of the
grep_source, though. Can we make that more clear with a comment here:

> + grep_source_init(, GREP_SOURCE_OID, pathbuf.buf, path, oid);
> +
>  #ifndef NO_PTHREADS
>   if (num_threads) {
> - add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
> + add_work(opt, );
>   strbuf_release();
>   return 0;
>   } else

like:

  /* leak grep_source, whose fields are now owned by add_work() */

or something? We could even memset() it back to all-zeroes to avoid an
accidental call to grep_source_clear(), but that's probably unnecessary
if we have a comment.

-Peff

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-15 Thread Jeff King

On Thu, Feb 15, 2018 at 12:03:06PM -0800, Junio C Hamano wrote:

> And from that point of view, perhaps w-t-e attribute is somewhat
> misdesigned.
> 
> In general, an attribute is about the project's contents in the
> manner independent of platform or environment.  You define "this
> file is a C source" or "this file has JPEG image" there.  What exact
> program you use to present diffs between the two versions of such a
> file (external diff command) or what exact program you use to
> extract the textual representations (textconv filter) is environment
> and platform dependent and is left to the configuration mechanism
> for each repository.
> 
> To be in line with the above design principle, I think the attribute
> ought to be "the in-tree contents of this path is encoded in ..."
> whose values could be things like UTF-8, ShiftJIS, etc.  What
> external encoding the paths should be checked out is not a
> project-wide matter, especially when talking about cross platform
> projects.  Perhaps a project in Japanese language wants to check
> out its contents in EUC-jp on Unices and in ShiftJIS on DOS derived
> systems.  The participants all need to know what in-repository
> encoding is used, which is a sensible use of attributes.  They also
> need to know what the recommended external encoding to be used in
> the working tree is for their platforms, but that is more like what
> Makefile variable to set for their platforms, etc., and is not a
> good match to the attributes system.

While I agree what you're saying philosophically here, I suspect you'd
still need another attribute for "no really, this needs to be checked
out as encoding X". The same way we treat line endings as a platform
decision, but we still need to have `eol=crlf` for those files which
really, no matter what platform you're on, have external tools depending
on them to have some particular line ending.

So a full proposal would support both cases: "check this out in the
local platform's preferred encoding" and "always check this out in
_this_ encoding". And Lars's proposal is just the second half of that.

But I'm not sure anybody even really cares about the first part; I don't
think we've seen anybody actually ask for it.

-Peff

Re: [PATCH] make hg-to-git compatible with python2.x and 3.x

2018-02-15 Thread Junio C Hamano

Hervé Beraud  writes:

> ---

Thanks for posting, but this is way under-justified, even for
something in contrib/ area.  

Are all changes in this patch necessary to "make it compatible with
both Python 2 and 3", or are some parts do not contribute directly
to the objective but are good changes to suit your personal taste,
match BCP styles, etc.?  I am guessing it is the latter (e.g. with
Python 3, you cannot use print without invoking it as a function,
but os.system() can still accept a literal command line string, so
introducing variable "rm_f" that gets assigned a literal string only
once to pass it to os.system() does not have anything to do with the
Python2to3 transition), and if that is the case, it makes sense to
split this into at least 2 patches, i.e. prelimiary clean-up,
followed by changes required to run it with Python3.  It also is OK
to make it a 3-patch series, in which case you would limit the
preliminary clean-up to minimum uncontroversial changes and follow
up with an optional update to suit personal taste at the very end.

Also, the submission needs to be signed-off for us to be able to
use.  Please see Documentation/SubmittingPatches for details.

Thanks.

>  contrib/hg-to-git/hg-to-git.py | 140 
> -
>  1 file changed, 83 insertions(+), 57 deletions(-)
>
> diff --git a/contrib/hg-to-git/hg-to-git.py b/contrib/hg-to-git/hg-to-git.py
> ...
>  os.system('git ls-files -x .hg --others | git update-index --add 
> --stdin')
>  # delete removed files
> -os.system('git ls-files -x .hg --deleted | git update-index --remove 
> --stdin')
> +rm_f = 'git ls-files -x .hg --deleted | git update-index --remove 
> --stdin'
> +os.system(rm_f)
>  ... 
> @@ -247,7 +273,7 @@ def getgitenv(user, date):
>  # write the state for incrementals
>  if state:
>  if verbose:
> -print 'Writing state'
> +print('Writing state')
>  f = open(state, 'w')
>  pickle.dump(hgvers, f)
>  
>
> --
> https://github.com/git/git/pull/458

Re: [PATCH 0/3] a few grep patches

2018-02-15 Thread Brandon Williams

On 02/15, Rasmus Villemoes wrote:
> I believe the first two should be ok, but I'm not sure what I myself
> think of the third one. Perhaps the saving is not worth the
> complexity, but it does annoy my optimization nerve to see all the
> unnecessary duplicated work being done.

I agree, the first two seem like good changes to me though I don't know
if i like the complexity that the third introduces.

> 
> Rasmus Villemoes (3):
>   grep: move grep_source_init outside critical section
>   grep: simplify grep_oid and grep_file
>   grep: avoid one strdup() per file
> 
>  builtin/grep.c | 25 -
>  grep.c |  8 ++--
>  2 files changed, 18 insertions(+), 15 deletions(-)
> 
> -- 
> 2.15.1
> 

-- 
Brandon Williams

[PATCH 1/3] grep: move grep_source_init outside critical section

2018-02-15 Thread Rasmus Villemoes

grep_source_init typically does three strdup()s, and in the threaded
case, the call from add_work() happens while holding grep_mutex.

We can thus reduce the time we hold grep_mutex by moving the
grep_source_init() call out of add_work(), and simply have add_work()
copy the initialized structure to the available slot in the todo
array.

This also simplifies the prototype of add_work(), since it no longer
needs to duplicate all the parameters of grep_source_init(). In the
callers of add_work(), we get to reduce the amount of code duplicated in
the threaded and non-threaded cases slightly (avoiding repeating the
"GREP_SOURCE_OID, pathbuf.buf, path, oid" argument list); a subsequent
cleanup patch will make that even more so.

Signed-off-by: Rasmus Villemoes 
---
 builtin/grep.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 3ca4ac80d..4a4f15172 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -92,8 +92,7 @@ static pthread_cond_t cond_result;
 
 static int skip_first_line;
 
-static void add_work(struct grep_opt *opt, enum grep_source_type type,
-const char *name, const char *path, const void *id)
+static void add_work(struct grep_opt *opt, const struct grep_source *gs)
 {
grep_lock();
 
@@ -101,7 +100,7 @@ static void add_work(struct grep_opt *opt, enum 
grep_source_type type,
pthread_cond_wait(_write, _mutex);
}
 
-   grep_source_init([todo_end].source, type, name, path, id);
+   todo[todo_end].source = *gs;
if (opt->binary != GREP_BINARY_TEXT)
grep_source_load_driver([todo_end].source);
todo[todo_end].done = 0;
@@ -317,6 +316,7 @@ static int grep_oid(struct grep_opt *opt, const struct 
object_id *oid,
 const char *path)
 {
struct strbuf pathbuf = STRBUF_INIT;
+   struct grep_source gs;
 
if (opt->relative && opt->prefix_length) {
quote_path_relative(filename + tree_name_len, opt->prefix, 
);
@@ -325,18 +325,18 @@ static int grep_oid(struct grep_opt *opt, const struct 
object_id *oid,
strbuf_addstr(, filename);
}
 
+   grep_source_init(, GREP_SOURCE_OID, pathbuf.buf, path, oid);
+
 #ifndef NO_PTHREADS
if (num_threads) {
-   add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
+   add_work(opt, );
strbuf_release();
return 0;
} else
 #endif
{
-   struct grep_source gs;
int hit;
 
-   grep_source_init(, GREP_SOURCE_OID, pathbuf.buf, path, oid);
strbuf_release();
hit = grep_source(opt, );
 
@@ -348,24 +348,25 @@ static int grep_oid(struct grep_opt *opt, const struct 
object_id *oid,
 static int grep_file(struct grep_opt *opt, const char *filename)
 {
struct strbuf buf = STRBUF_INIT;
+   struct grep_source gs;
 
if (opt->relative && opt->prefix_length)
quote_path_relative(filename, opt->prefix, );
else
strbuf_addstr(, filename);
 
+   grep_source_init(, GREP_SOURCE_FILE, buf.buf, filename, filename);
+
 #ifndef NO_PTHREADS
if (num_threads) {
-   add_work(opt, GREP_SOURCE_FILE, buf.buf, filename, filename);
+   add_work(opt, );
strbuf_release();
return 0;
} else
 #endif
{
-   struct grep_source gs;
int hit;
 
-   grep_source_init(, GREP_SOURCE_FILE, buf.buf, filename, 
filename);
strbuf_release();
hit = grep_source(opt, );
 
-- 
2.15.1

[PATCH 0/3] a few grep patches

2018-02-15 Thread Rasmus Villemoes

I believe the first two should be ok, but I'm not sure what I myself
think of the third one. Perhaps the saving is not worth the
complexity, but it does annoy my optimization nerve to see all the
unnecessary duplicated work being done.

Rasmus Villemoes (3):
  grep: move grep_source_init outside critical section
  grep: simplify grep_oid and grep_file
  grep: avoid one strdup() per file

 builtin/grep.c | 25 -
 grep.c |  8 ++--
 2 files changed, 18 insertions(+), 15 deletions(-)

-- 
2.15.1

[PATCH 2/3] grep: simplify grep_oid and grep_file

2018-02-15 Thread Rasmus Villemoes

In the NO_PTHREADS or !num_threads case, this doesn't change
anything. In the threaded case, note that grep_source_init duplicates
its third argument, so there is no need to keep [path]buf.buf alive
across the call of add_work().

Signed-off-by: Rasmus Villemoes 
---
 builtin/grep.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 4a4f15172..593f48d59 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -326,18 +326,17 @@ static int grep_oid(struct grep_opt *opt, const struct 
object_id *oid,
}
 
grep_source_init(, GREP_SOURCE_OID, pathbuf.buf, path, oid);
+   strbuf_release();
 
 #ifndef NO_PTHREADS
if (num_threads) {
add_work(opt, );
-   strbuf_release();
return 0;
} else
 #endif
{
int hit;
 
-   strbuf_release();
hit = grep_source(opt, );
 
grep_source_clear();
@@ -356,18 +355,17 @@ static int grep_file(struct grep_opt *opt, const char 
*filename)
strbuf_addstr(, filename);
 
grep_source_init(, GREP_SOURCE_FILE, buf.buf, filename, filename);
+   strbuf_release();
 
 #ifndef NO_PTHREADS
if (num_threads) {
add_work(opt, );
-   strbuf_release();
return 0;
} else
 #endif
{
int hit;
 
-   strbuf_release();
hit = grep_source(opt, );
 
grep_source_clear();
-- 
2.15.1

[PATCH 3/3] grep: avoid one strdup() per file

2018-02-15 Thread Rasmus Villemoes

There is only one instance of grep_source_init(GREP_SOURCE_FILE), and in
that case the path and identifier arguments are equal - not just as
strings, but the same pointer is passed. So we can save some time and
memory by reusing the gs->path = xstrdup_or_null(path) we have already
done as gs->identifier, and changing grep_source_clear accordingly
to avoid a double free.

Signed-off-by: Rasmus Villemoes 
---
 grep.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/grep.c b/grep.c
index 3d7cd0e96..b1532b1b6 100644
--- a/grep.c
+++ b/grep.c
@@ -1972,7 +1972,8 @@ void grep_source_init(struct grep_source *gs, enum 
grep_source_type type,
 
switch (type) {
case GREP_SOURCE_FILE:
-   gs->identifier = xstrdup(identifier);
+   gs->identifier = identifier == path ?
+   gs->path : xstrdup(identifier);
break;
case GREP_SOURCE_OID:
gs->identifier = oiddup(identifier);
@@ -1986,7 +1987,10 @@ void grep_source_init(struct grep_source *gs, enum 
grep_source_type type,
 void grep_source_clear(struct grep_source *gs)
 {
FREE_AND_NULL(gs->name);
-   FREE_AND_NULL(gs->path);
+   if (gs->path == gs->identifier)
+   gs->path = NULL;
+   else
+   FREE_AND_NULL(gs->path);
FREE_AND_NULL(gs->identifier);
grep_source_clear_data(gs);
 }
-- 
2.15.1

Re: Line ending normalization doesn't work as expected

2018-02-15 Thread Robert Dailey

On Thu, Feb 15, 2018 at 1:16 PM, Junio C Hamano  wrote:
> I think the message you are referring to is a tangent that discusses
> how it was done in the old world, with issues that come from the
> fact that with such an approach the paths are first removed from the
> index and then added afresh to the index, which can lose cases and
> executable bits when working on a filesystem that does not retain
> enough information.
>
> The way in the new world is to use "add --renormalize" which was
> added at 9472935d ("add: introduce "--renormalize"", 2017-11-16), I
> think.

Oh I didn't realize someone actually did it. If so, that's awesome.
Thanks Junio!

Re: [PATCH v2] worktree: add: fix 'post-checkout' not knowing new worktree location

2018-02-15 Thread Junio C Hamano

Eric Sunshine  writes:

> I'm a dummy. Can you squash in the following and retry?

I haven't tried, but that does look like the right thing to do,
whether it is the root cause of the fault I am seeing.

Thanks.

>
> --- >8 ---
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index 604a0292b0..f69f862947 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -348,7 +348,7 @@ static int add_worktree(const char *path, const char 
> *refname,
>   if (!ret && opts->checkout) {
>   const char *hook = find_hook("post-checkout");
>   if (hook) {
> - const char *env[] = { "GIT_DIR", "GIT_WORK_TREE" };
> + const char *env[] = { "GIT_DIR", "GIT_WORK_TREE", NULL 
> };
>   cp.git_cmd = 0;
>   cp.no_stdin = 1;
>   cp.stdout_to_stderr = 1;
> --- >8 ---
>
> If that fixes it, can you squash it locally or should I re-send?

Re: [PATCH 8/8] perl: hard-depend on the File::{Temp,Spec} modules

2018-02-15 Thread Junio C Hamano

Ævar Arnfjörð Bjarmason  writes:

> On Wed, Feb 14 2018, Jonathan Nieder jotted:
>
>> Ævar Arnfjörð Bjarmason wrote:
>>
>>> --- a/perl/Git.pm
>>> +++ b/perl/Git.pm
>>> @@ -1324,8 +1324,9 @@ sub _temp_cache {
>>>  }
>>>
>>>  sub _verify_require {
>>> -   eval { require File::Temp; require File::Spec; };
>>> -   $@ and throw Error::Simple($@);
>>> +   require File::Temp;
>>> +   require File::Spec;
>>> +   return;
>>
>> Same question as in the other patches: any reason not to simplify by
>> using 'use' at the top of the file instead?
>
> I was just going for the minimal change, but yeah, that makes
> sense. Will do that in v2.

Yes, if we are making them hard dependency, unless a major part of
the perl/Git.pm can be used without hitting _verify_require in the
current code, there is no point using "require only if we use it"
pattern.  As _verify_require is used even in the Git::repository
constructor, I think it makes more sense to have "use" upfront for
these two modules.

Re: [PATCH v2] worktree: add: fix 'post-checkout' not knowing new worktree location

2018-02-15 Thread Eric Sunshine

On Thu, Feb 15, 2018 at 12:52:11PM -0800, Junio C Hamano wrote:
> Eric Sunshine  writes:
> >  test_expect_success '"add" invokes post-checkout hook (branch)' '
> > post_checkout_hook &&
> > -   printf "%s %s 1\n" $_z40 $(git rev-parse HEAD) >hook.expect &&
> > +   {
> > +   echo $_z40 $(git rev-parse HEAD) 1 &&
> > +   echo $(pwd)/.git/worktrees/gumby &&
> > +   echo $(pwd)/gumby
> > +   } >hook.expect &&
> > git worktree add gumby &&
> > -   test_cmp hook.expect hook.actual
> > +   test_cmp hook.expect gumby/hook.actual
> >  '
> 
> This seems to segfault on me, without leaving hook.actual anywhere.

I'm unable to reproduce the segfault, but I'm guessing it's because
I'm a dummy. Can you squash in the following and retry?

--- >8 ---
diff --git a/builtin/worktree.c b/builtin/worktree.c
index 604a0292b0..f69f862947 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -348,7 +348,7 @@ static int add_worktree(const char *path, const char 
*refname,
if (!ret && opts->checkout) {
const char *hook = find_hook("post-checkout");
if (hook) {
-   const char *env[] = { "GIT_DIR", "GIT_WORK_TREE" };
+   const char *env[] = { "GIT_DIR", "GIT_WORK_TREE", NULL 
};
cp.git_cmd = 0;
cp.no_stdin = 1;
cp.stdout_to_stderr = 1;
--- >8 ---

If that fixes it, can you squash it locally or should I re-send?

Re: [PATCH 3/8] perl: generalize the Git::LoadCPAN facility

2018-02-15 Thread Todd Zullinger

[I dropped bbour...@slb.com from the Cc: list, as it bounced
on my previous reply.]

Ævar Arnfjörð Bjarmason wrote:
> That makes sense, I'll incorporate that in a re-roll. I like
> NO_PERL_CPAN_FALLBACKS or just NO_CPAN_FALLBACKS better.

Either is an improvement.  Starting with NO_PERL_ seems
like a slightly better bikeshed color. :)

> I'd really like to find some solution that works differently though,
> because with this approach we'll run the full test suite against a
> system where our fallbacks will be in place (although if the OS
> distributor has done as promised we won't use them), and then just
> remove this at 'make install' time, also meaning we'll re-gen it before
> running 'make install' again, only to rm it again.
> 
> The former issue we could deal with by munging the Git::LoadCPAN file so
> it knows about NO_PERL_CPAN_FALLBACKS, and will always refuse to use the
> fallbacks if that's set. That's a good idea anyway, because right now if
> you e.g. uninstall Error.pm on Debian (which strips the CPAN fallbacks)
> you get a cryptic "BUG: ..." message, it should instead say "we couldn't
> get this module the OS promised we'd have" or something to that effect.

Teaching Git::LoadCPAN to never fallback sounds like a good
idea.  At least then if the packager intended to avoid the
fallbacks and didn't get it right the error message could be
more useful.

Hopefully that's not a common problem for packagers though.
(And adding the Makefile knob was intended to help make it
easier for packagers to achieve this common goal.)

> The latter is trickier, I don't see an easy way to coerce the Makefile
> into not copying the FromCPAN directory without going back to a
> hardcoded list again, the easiest thing is probably to turn that:
> 
> $(TAR) cf - .)
> 
> Into:
> 
> $(TAR) cf - $(find ... -not )
> 
> Or something like that to get all the stuff that isn't the Git/FromCPAN
> directory.
> 
> Other suggestions most welcome.

What about moving perl/Git/FromCPAN to perl/FromCPAN and
then including perl/FromCPAN in LIB_PERL{,_GEN} only if
NO_PERL_CPAN_FALLBACKS is unset?

 LIB_PERL := $(wildcard perl/Git.pm perl/Git/*.pm perl/Git/*/*.pm 
perl/Git/*/*/*.pm)
+ifndef NO_PERL_CPAN_FALLBACKS
+LIB_PERL += $(wildcard perl/FromCPAN/*.pm perl/FromCPAN/*/*.pm)
+endif
 LIB_PERL_GEN := $(patsubst perl/%.pm,perl/build/lib/%.pm,$(LIB_PERL))

I haven't tested that at all, so it could be broken in many
ways.

Thanks,

-- 
Todd
~~
The nice thing about egotists is that they don't talk about other
people.
-- Lucille S. Harper

Do you need funding? (credit..)

2018-02-15 Thread MUND-WORTH FS

Do you need funding * Business loans * Personal loans without stress and quick 
approval? We offer loan at low
interest rate,for more information contact us via email.

Chris J.G
Financier/Mund-Worth FS

Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-15 Thread Stefan Beller

On Thu, Feb 15, 2018 at 12:42 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> This is a real take on the first part of the recent RFC[1].
>
> For the patches remaining in this series, The scope is about right
> and the size is more manageable.

ok, thanks.

>  With topics already on 'master',
> they have some interactions:
>
>  - ot/mru-on-list & gs/retire-mru
>
>The packed_git MRU has been greatly simplified by using list API
>directly.
>
>  - cc/sha1-file-name
>
>The function signature of sha1_file_name() has been updated.
>
> I could certainly carry evil merge resolutions going forward (these
> changes need to be made to object-store.h that has no mechanical
> conflicts), but it may be less distracting for later readers of the
> series if we rebase it on top of a more recent master.

I was debating when to reroll this series as I want to make the
change that Duy proposed, moving the 'ignore_env' into the
object store as well.

I'll rebase on top of the latest master or these topicc while at it.

Thanks for the heads up.
Stefan

Re: [PATCH v2] worktree: add: fix 'post-checkout' not knowing new worktree location

2018-02-15 Thread Junio C Hamano

Eric Sunshine  writes:

>  test_expect_success '"add" invokes post-checkout hook (branch)' '
>   post_checkout_hook &&
> - printf "%s %s 1\n" $_z40 $(git rev-parse HEAD) >hook.expect &&
> + {
> + echo $_z40 $(git rev-parse HEAD) 1 &&
> + echo $(pwd)/.git/worktrees/gumby &&
> + echo $(pwd)/gumby
> + } >hook.expect &&
>   git worktree add gumby &&
> - test_cmp hook.expect hook.actual
> + test_cmp hook.expect gumby/hook.actual
>  '

This seems to segfault on me, without leaving hook.actual anywhere.

Re: [PATCH 4/8] perl: update our ancient copy of Error.pm

2018-02-15 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 14 2018, Jonathan Nieder jotted:

> Ævar Arnfjörð Bjarmason wrote:
>
>> The Error.pm shipped with Git as a fallback if there was no Error.pm
>> on the system was released in April 2006, there's been dozens of
>> releases since then, the latest at August 7, 2017, let's update to
>> that.
>
> Comma splices:
>  s/, there's/. There's/
>  s/, let's/. Let's/
>
> The one piece of information I was curious about that this (quite clear,
> thank you) commit message is missing is what changed in the intervening
> time.  Is this just about keeping up with upstream to make it easy to
> keep up later, or has upstream made any useful changes?  E.g. did any
> API or behaviors get better?

Will note that in v2, nothing we need changed, this is just a matter of
keeping up, and e.g. if I run Git on Debian I get a few month old
Error.pm, but if I build Git from source I get one that's more than a
decade old.

> Related: do we have to worry about in-tree users taking advantage of
> improved API and packagers forgetting to add a dependency on the new
> version?  Do we declare the minimal required Error.pm version somewhere
> (e.g. in the INSTALL file)?

We don't, but if that situation arises we could just start doing 'use
 ' to declare it, see "perldoc -f use", but right now
we work on seemingly every version of these modules ever released.

Re: [PATCH 6/8] git-send-email: unconditionally use Net::{SMTP,Domain}

2018-02-15 Thread Ævar Arnfjörð Bjarmason


On Wed, Feb 14 2018, Jonathan Nieder jotted:

> Ævar Arnfjörð Bjarmason wrote:
>
>> The Net::SMTP and Net::Domain were both first released with perl
>> v5.7.3, since my d48b284183 ("perl: bump the required Perl version to
>> 5.8 from 5.6.[21]", 2010-09-24) we've depended on 5.8, so there's no
>> reason to conditionally require this anymore.
>>
>> This conditional loading was initially added in
>> 87840620fd ("send-email: only 'require' instead of 'use' Net::SMTP",
>> 2006-06-01) for Net::SMTP and 134550fe21 ("git-send-email.perl - try
>> to give real name of the calling host to HELO/EHLO", 2010-03-14) for
>> Net::Domain, both of which predate the hard dependency on 5.8.
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  git-send-email.perl | 24 +++-
>>  1 file changed, 11 insertions(+), 13 deletions(-)
>>
>> diff --git a/git-send-email.perl b/git-send-email.perl
>> index 85bb6482f2..69bd443245 100755
>> --- a/git-send-email.perl
>> +++ b/git-send-email.perl
>> @@ -1143,10 +1143,9 @@ sub valid_fqdn {
>>  sub maildomain_net {
>>  my $maildomain;
>>
>> -if (eval { require Net::Domain; 1 }) {
>> -my $domain = Net::Domain::domainname();
>> -$maildomain = $domain if valid_fqdn($domain);
>> -}
>> +require Net::Domain;
>> +my $domain = Net::Domain::domainname();
>> +$maildomain = $domain if valid_fqdn($domain);
>
> Now that we indeed require the module, any reason not to 'use' it?
> E.g. is it particularly expensive to load?
>
> I haven't checked the assertions above about minimal perl versions
> including these modules, but I assume they're true. :)  So this looks
> like a good change.

FWIW this is easily found out for any given module by running `corelist
` on a system with perl installed:

$ corelist File::Spec
Data for 2017-01-14
File::Spec was first released with perl 5.00405

Re: [PATCH 8/8] perl: hard-depend on the File::{Temp,Spec} modules

2018-02-15 Thread Ævar Arnfjörð Bjarmason


On Wed, Feb 14 2018, Jonathan Nieder jotted:

> Ævar Arnfjörð Bjarmason wrote:
>
>> --- a/perl/Git.pm
>> +++ b/perl/Git.pm
>> @@ -1324,8 +1324,9 @@ sub _temp_cache {
>>  }
>>
>>  sub _verify_require {
>> -eval { require File::Temp; require File::Spec; };
>> -$@ and throw Error::Simple($@);
>> +require File::Temp;
>> +require File::Spec;
>> +return;
>
> Same question as in the other patches: any reason not to simplify by
> using 'use' at the top of the file instead?

I was just going for the minimal change, but yeah, that makes
sense. Will do that in v2.

Re: [PATCH 00/26] Moving global state into the repository object (part 1)

2018-02-15 Thread Junio C Hamano

Stefan Beller  writes:

> This is a real take on the first part of the recent RFC[1].

For the patches remaining in this series, The scope is about right
and the size is more manageable.  With topics already on 'master',
they have some interactions:

 - ot/mru-on-list & gs/retire-mru

   The packed_git MRU has been greatly simplified by using list API
   directly.

 - cc/sha1-file-name

   The function signature of sha1_file_name() has been updated.

I could certainly carry evil merge resolutions going forward (these
changes need to be made to object-store.h that has no mechanical
conflicts), but it may be less distracting for later readers of the
series if we rebase it on top of a more recent master.

Re: [PATCH 3/8] perl: generalize the Git::LoadCPAN facility

2018-02-15 Thread Ævar Arnfjörð Bjarmason


On Thu, Feb 15 2018, Todd Zullinger jotted:

> Hi Ævar,
>
> Ævar Arnfjörð Bjarmason wrote:
>> +Git::LoadCPAN - Wrapper for loading modules from the CPAN (OS) or Git's own 
>> copy
>> +
>> +=head1 DESCRIPTION
>> +
>> +The Perl code in Git depends on some modules from the CPAN, but we
>> +don't want to make those a hard requirement for anyone building from
>> +source.
>> +
>> +Therefore the L namespace shipped with Git contains
>> +wrapper modules like C that will first
>> +attempt to load C from the OS, and if that doesn't work
>> +will fall back on C shipped with Git
>> +itself.
>> +
>> +Usually OS's will not ship with Git's Git::FromCPAN tree at all,
>> +preferring to use their own packaging of CPAN modules instead.
>
> This is something I wondered about.  What's the recommended
> method to ensure git packaged for an OS/distribution doesn't
> ever use the fallbacks?  Remove $perllibdir/Git/FromCPAN
> after make install?
>
> If so, would it be useful to add a Makefile knob to not
> install the FromCPAN bits, which may be generally useful to
> packagers?
>
> Something like the following, perhaps?
>
> (I'd feel bad suggesting this without a patch, after all the
> work you've already done to simplify and improve the perl
> bits.)
>
>  8< 
> From: Todd Zullinger 
> Date: Wed, 14 Feb 2018 23:00:30 -0500
> Subject: [PATCH] Makefile: add NO_PERL_CPAN to disable fallback module install
>
> As noted in perl/Git/LoadCPAN.pm, operating system packages often don't
> want to ship Git::FromCPAN tree at all, preferring to use their own
> packaging of CPAN modules instead.  Allow such packagers to set
> NO_PERL_CPAN to easily avoid installing these fallback perl CPAN
> modules.
>
> Signed-off-by: Todd Zullinger 
> ---
>  Makefile | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index 5bcd83ddf3..c4e035e5bf 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -296,6 +296,9 @@ all::
>  #
>  # Define NO_PERL if you do not want Perl scripts or libraries at all.
>  #
> +# Define NO_PERL_CPAN if you do not want to install fallbacks for perl CPAN
> +# modules.
> +#
>  # Define PYTHON_PATH to the path of your Python binary (often /usr/bin/python
>  # but /usr/bin/python2.7 on some platforms).
>  #
> @@ -2572,6 +2575,7 @@ ifndef NO_GETTEXT
>  endif
>  ifndef NO_PERL
>   $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perllibdir_SQ)'
> + test -z "$(NO_PERL_CPAN)" || rm -rf perl/build/lib/Git/FromCPAN
>   (cd perl/build/lib && $(TAR) cf - .) | \
>   (cd '$(DESTDIR_SQ)$(perllibdir_SQ)' && umask 022 && $(TAR) xof -)
>   $(MAKE) -C gitweb install
> --
> 2.16.1
>
> I don't particularly like NO_PERL_CPAN, but I'm confident
> someone else will suggest an obviously better name.
>
> I thought about moving the 'rm -rf Git/FromCPAN' after the
> tar/untar, to keep the files in place for the tests.  No
> tests seem to rely on those local files, so I stuck with
> removing them before.  That diff was:
>
> --- a/Makefile
> +++ b/Makefile
> @@ -2574,6 +2574,7 @@
>   $(INSTALL) -d -m 755 '$(DESTDIR_SQ)$(perllibdir_SQ)'
>   (cd perl/build/lib && $(TAR) cf - .) | \
>   (cd '$(DESTDIR_SQ)$(perllibdir_SQ)' && umask 022 && $(TAR) xof -)
> + test -n "$(NO_PERL_CPAN)" && rm -rf 
> '$(DESTDIR_SQ)$(perllibdir_SQ)'/Git/FromCPAN
>   $(MAKE) -C gitweb install
>  endif
>  ifndef NO_TCLTK

That makes sense, I'll incorporate that in a re-roll. I like
NO_PERL_CPAN_FALLBACKS or just NO_CPAN_FALLBACKS better.

I'd really like to find some solution that works differently though,
because with this approach we'll run the full test suite against a
system where our fallbacks will be in place (although if the OS
distributor has done as promised we won't use them), and then just
remove this at 'make install' time, also meaning we'll re-gen it before
running 'make install' again, only to rm it again.

The former issue we could deal with by munging the Git::LoadCPAN file so
it knows about NO_PERL_CPAN_FALLBACKS, and will always refuse to use the
fallbacks if that's set. That's a good idea anyway, because right now if
you e.g. uninstall Error.pm on Debian (which strips the CPAN fallbacks)
you get a cryptic "BUG: ..." message, it should instead say "we couldn't
get this module the OS promised we'd have" or something to that effect.

The latter is trickier, I don't see an easy way to coerce the Makefile
into not copying the FromCPAN directory without going back to a
hardcoded list again, the easiest thing is probably to turn that:

$(TAR) cf - .)

Into:

$(TAR) cf - $(find ... -not )

Or something like that to get all the stuff that isn't the Git/FromCPAN
directory.

Other suggestions most welcome.

Re: [PATCH 5/8] perl: update our copy of Mail::Address

2018-02-15 Thread Ævar Arnfjörð Bjarmason


On Thu, Feb 15 2018, Matthieu Moy jotted:

> "Jonathan Nieder"  wrote:
>
>> Ævar Arnfjörð Bjarmason wrote:
>>
>> > Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
>> > 23, 2018). This should be a trivial update[1] but it seems the version
>> > Matthieu Moy imported in bd869f67b9 ("send-email: add and use a local
>> > copy of Mail::Address", 2018-01-05) doesn't correspond to any 2.19
>> > version found on the CPAN. From the comment at the top of the file it
>> > looks like some OS version with the POD stripped, and with different
>> > indentation.
>>
>> Were there changes other than the POD stripping?
>
> No.
>
> I should have mentionned it in the commit message, but the one I took was
> from:
>
>   http://cpansearch.perl.org/src/MARKOV/MailTools-2.19/lib/Mail/Address.pm
>
> i.e. following the "source" link from:
>
>   http://search.cpan.org/~markov/MailTools-2.19/lib/Mail/Address.pod
>
> The link name suggested it was the actual source code but indeed it's a
> pre-processed file with the POD stripped.
>
> It would make sense to indicate explicitly where this file is from in
> this commit's message to avoid having the same discussion next time someone
> upgrades the package.

I see that I'm being a complete idiot and added the *.pod file in the
distro instead of the *.pm file, I got it from metacpan and didn't check
it carefully enough (and only tested with system Error.pm removed, not
Mail::Address), sorry. So yours was the right version. Will fix in a
re-roll.

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-15 Thread Junio C Hamano

lars.schnei...@autodesk.com writes:

> -- Git clients that do not support the `working-tree-encoding` attribute
> -  will checkout the respective files UTF-8 encoded and not in the
> -  expected encoding. Consequently, these files will appear different
> -  which typically causes trouble. This is in particular the case for
> -  older Git versions and alternative Git implementations such as JGit
> -  or libgit2 (as of February 2018).
> +- Third party Git implementations that do not support the
> +  `working-tree-encoding` attribute will checkout the respective files
> +  UTF-8 encoded and not in the expected encoding. Consequently, these
> +  files will appear different which typically causes trouble. This is
> +  in particular the case for older Git versions and alternative Git
> +  implementations such as JGit or libgit2 (as of February 2018).

I know somebody found "clients" misleading in the original, but the
ones that do not understand w-t-e do not have to be third party
reimplementations and imitations.  All existing Git implementations,
including ours, don't.

One thing I find more problematic is that the above places *too*
much stress on the UTF-8 centric worldview.  It is perfectly valid
to store your text contents encoded in ShiftJIS and check them out
as-is, with or without this patch.  It is grossly misleading to say
that older versions of Git will check them out in UTF-8.  "will
checkout these files as-is without encoding conversion" is a better
way to say it, probably.

Also notice that even in the world with w-t-e, such a project won't
benefit from w-t-e at all.  After all, they have been happy using
ShiftJIS in repository and using the same encoding on the working
tree, and because w-t-e assumes that everybody should be using UTF-8
internally, such a project cannot take advantage of the new
mechanism.

And from that point of view, perhaps w-t-e attribute is somewhat
misdesigned.

In general, an attribute is about the project's contents in the
manner independent of platform or environment.  You define "this
file is a C source" or "this file has JPEG image" there.  What exact
program you use to present diffs between the two versions of such a
file (external diff command) or what exact program you use to
extract the textual representations (textconv filter) is environment
and platform dependent and is left to the configuration mechanism
for each repository.

To be in line with the above design principle, I think the attribute
ought to be "the in-tree contents of this path is encoded in ..."
whose values could be things like UTF-8, ShiftJIS, etc.  What
external encoding the paths should be checked out is not a
project-wide matter, especially when talking about cross platform
projects.  Perhaps a project in Japanese language wants to check
out its contents in EUC-jp on Unices and in ShiftJIS on DOS derived
systems.  The participants all need to know what in-repository
encoding is used, which is a sensible use of attributes.  They also
need to know what the recommended external encoding to be used in
the working tree is for their platforms, but that is more like what
Makefile variable to set for their platforms, etc., and is not a
good match to the attributes system.

[PATCH v2] worktree: add: fix 'post-checkout' not knowing new worktree location

2018-02-15 Thread Eric Sunshine

Although "git worktree add" learned to run the 'post-checkout' hook in
ade546be47 (worktree: invoke post-checkout hook, 2017-12-07), it
neglected to change to the directory of the newly-created worktree
before running the hook. Instead, the hook runs within the directory
from which the "git worktree add" command itself was invoked, which
effectively neuters the hook since it knows nothing about the new
worktree directory.

Further, ade546be47 failed to sanitize the environment before running
the hook, which means that user-assigned values of GIT_DIR and
GIT_WORK_TREE could mislead the hook about the location of the new
worktree. In the case of "git worktree add" being run from a bare
repository, the GIT_DIR="." assigned by Git itself leaks into the hook's
environment and breaks Git commands; this is so even when the working
directory is correctly changed to the new worktree before the hook runs
since ".", relative to the new worktree directory, does not point at the
bare repository.

Fix these problems by (1) changing to the new worktree's directory
before running the hook, and (2) sanitizing the environment of GIT_DIR
and GIT_WORK_TREE so hooks can't be confused by misleading values.

Enhance the t2025 'post-checkout' tests to verify that the hook is
indeed run within the correct directory and that Git commands invoked by
the hook compute Git-dir and top-level worktree locations correctly.

While at it, also add two new tests: (1) verify that the hook is run
within the correct directory even when the new worktree is created from
a sibling worktree (as opposed to the main worktree); (2) verify that
the hook is provided with correct context when the new worktree is
created from a bare repository (test provided by Lars Schneider).

Implementation Notes:

Rather than sanitizing the environment of GIT_DIR and GIT_WORK_TREE, an
alternative would be to set them explicitly, as is already done for
other Git commands run internally by "git worktree add". This patch opts
instead to sanitize the environment in order to clearly document that
the worktree is fully functional by the time the hook is run, thus does
not require special environmental overrides.

The hook is run manually, rather than via run_hook_le(), since it needs
to change the working directory to that of the worktree, and
run_hook_le() does not provide such functionality. As this is a one-off
case, adding 'run_hook' overloads which allow the directory to be set
does not seem warranted at this time.

Reported-by: Lars Schneider 
Signed-off-by: Eric Sunshine 
---

This is a re-roll of [1] which fixes "git worktree add" to provide
proper context to the 'post-checkout' hook so that the hook knows the
location of the newly-created worktree. Thanks to Junio for his review
comments and to Lars for pointing out bare repository failure and for
providing a test case.

Changes since v1:

* Sanitize environment so user-assigned GIT_DIR and GIT_WORK_TREE don't
  confuse hook. (Junio)

* Fix failure of Git commands run by hook when "git worktree add" is
  invoked from within a bare repository. (Lars)

* Run hook manually in builtin/worktree.c rather than adding new
  'run_hook' overloads to 'run-command' which allow the directory to be
  set. This one-off case doesn't warrant adding 'run_hook' overloads at
  this time.

No interdiff since almost everything changed.

[1]: 
https://public-inbox.org/git/20180212031526.40039-1-sunsh...@sunshineco.com/

builtin/worktree.c  | 20 ---
 t/t2025-worktree-add.sh | 54 ++---
 2 files changed, 62 insertions(+), 12 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index 7cef5b120b..604a0292b0 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -345,9 +345,23 @@ static int add_worktree(const char *path, const char 
*refname,
 * Hook failure does not warrant worktree deletion, so run hook after
 * is_junk is cleared, but do return appropriate code when hook fails.
 */
-   if (!ret && opts->checkout)
-   ret = run_hook_le(NULL, "post-checkout", oid_to_hex(_oid),
- oid_to_hex(>object.oid), "1", NULL);
+   if (!ret && opts->checkout) {
+   const char *hook = find_hook("post-checkout");
+   if (hook) {
+   const char *env[] = { "GIT_DIR", "GIT_WORK_TREE" };
+   cp.git_cmd = 0;
+   cp.no_stdin = 1;
+   cp.stdout_to_stderr = 1;
+   cp.dir = path;
+   cp.env = env;
+   cp.argv = NULL;
+   argv_array_pushl(, absolute_path(hook),
+oid_to_hex(_oid),
+oid_to_hex(>object.oid),
+"1", NULL);
+   ret = run_command();
+

Re: Line ending normalization doesn't work as expected

2018-02-15 Thread Junio C Hamano

Robert Dailey  writes:

> On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
>> Torsten Bögershausen  writes:
>>
 $ git rm -r --cached . && git add .
>>>
>>> (Both should work)
>>>
>>> To be honest, from the documentation, I can't figure out the difference 
>>> between
>>> $ git read-tree --empty
>>> and
>>> $ git rm -r --cached .
>>>
>>> Does anybody remember the discussion, why we ended up with read-tree ?
>> ...
>
> Sorry to bring this old thread back to life, but I did notice that
> this causes file modes to reset back to 644 (from 755) on Windows
> version of Git. Is there a way to `$ git read-tree --empty && git add
> .` without mucking with file permissions?

I think the message you are referring to is a tangent that discusses
how it was done in the old world, with issues that come from the
fact that with such an approach the paths are first removed from the
index and then added afresh to the index, which can lose cases and
executable bits when working on a filesystem that does not retain
enough information.

The way in the new world is to use "add --renormalize" which was
added at 9472935d ("add: introduce "--renormalize"", 2017-11-16), I
think.

Re: [PATCH 1/1] Documentation/git-status: clarify status table for porcelain mode

2018-02-15 Thread Junio C Hamano

Stefan Beller  writes:

> It is possible to have the output ' A' from 'git status --porcelain'
> by adding a file using the '--intend-to-add' flag.  Make this clear by
> adding the pattern in the table of the documentation.
>
> However the mode 'DM' (deleted in the index, modified in the working tree)
> is not possible in the non-merge case in which the file only shows
> as 'D ' (and adding it back to the worktree would show an additional line

Correct.  Thanks, will queue.

Re: [PATCH 1/2] parse-options: expand $HOME on filename options

2018-02-15 Thread Junio C Hamano

Ævar Arnfjörð Bjarmason  writes:

> In general I'm mildly negative on adding this, for every user like Doron
> who'll be less confused by a hack like this, you'll have other users
> who'll be confused about git inexplicably working with ~ in the middle
> of strings, even though;
>
> $ echo git init --template ~/path
> git init --template /home/avar/path
> $ echo git init --template=~/path
> git init --template=~/path
>
> I think it makes more sense to just leave such expansion to the shell,
> and not try to magically expand it after the fact, since it's both
> confusing (user: why does this work with git and not this other
> program?), and as shown above changes existing semantics.

The above certainly is a reasonable argument.

> I think this way lies madness, and it's better to just avoid it.

Re: Line ending normalization doesn't work as expected

2018-02-15 Thread Robert Dailey

On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
> Torsten Bögershausen  writes:
>
>>> $ git rm -r --cached . && git add .
>>
>> (Both should work)
>>
>> To be honest, from the documentation, I can't figure out the difference 
>> between
>> $ git read-tree --empty
>> and
>> $ git rm -r --cached .
>>
>> Does anybody remember the discussion, why we ended up with read-tree ?
>
> We used to use neither, and considered it fine to "rm .git/index" if
> you wanted to empty the on-disk index file in the old world.  In the
> modern world, folks want you to avoid touching filesystem directly
> and instead want you to use Git tools, and the above are two obvious
> ways to do so.
>
> "git read-tree" (without any parameter, i.e. "read these 0 trees and
> populate the index with it") and its modern and preferred synonym
> "git read-tree --empty" (i.e. "I am giving 0 trees and I know the
> sole effect of this command is to empty the index.") are more direct
> ways to express "I want the index emptied" between the two.
>
> The other one, "git rm -r --cached .", in the end gives you the same
> state because it tells Git to "iterate over all the entries in the
> index, find the ones that match pathspec '.', and remove them from
> the index.".  It is not wrong per-se, but conceptually it is a bit
> roundabout way to say that "I want the index emptied", I would
> think.
>
> I wouldn't be surprised if the "rm -r --cached ." were a lot slower,
> due to the overhead of having to do the pathspec filtering that ends
> up to be a no-op, but there shouldn't be a difference in the end
> result.

Sorry to bring this old thread back to life, but I did notice that
this causes file modes to reset back to 644 (from 755) on Windows
version of Git. Is there a way to `$ git read-tree --empty && git add
.` without mucking with file permissions?

Re: [PATCH v3 20/23] ref-filter: unifying formatting of cat-file opts

2018-02-15 Thread Junio C Hamano

Оля Тележная   writes:

>>> - else if (deref)
>>> + } else if (!strcmp(name, "objectsize:disk")) {
>>> + if (cat_file_info.is_cat_file) {
>>> + v->value = cat_file_info.disk_size;
>>> + v->s = xstrfmt("%"PRIuMAX, 
>>> (uintmax_t)v->value);
>>> + }
>>> + } else if (deref)
>>
>> Why do we care about is_cat_file here. Shouldn't:
>>
>>   git for-each-ref --format='%(objectsize:disk)'
>>
>> work? I.e., shouldn't the cat_file_info.disk_size variable be held
>> somewhere in a used_atom struct?
>
> At that point - no.
> I think it sounds like other separate task to add this functionality
> and to test it properly.

What does "that point" refer to?  This point at 20th patch in the
23-patch series it is not premature, but it will become feasible in
later steps?

As Peff already said in his review on earlier steps like 4/23 and
7/23, I too found the series quite confusing and felt as if I was
watching somebody stumbling in all directions in the dark in the
earlier steps in the series before deciding to go in one direction.

Re: [PATCH v3 11/14] commit: integrate commit graph with commit parsing

2018-02-15 Thread Junio C Hamano

Derrick Stolee  writes:

> +struct object_id *get_nth_commit_oid(struct commit_graph *g,
> +  uint32_t n,
> +  struct object_id *oid)
> +{
> + hashcpy(oid->hash, g->chunk_oid_lookup + g->hash_len * n);
> + return oid;
> +}

This looks like a rather klunky API to me.  

It seems that many current callers in this series (not limited to
this step but in later patches in the series) discard the returned
value.

I would understand the API a lot better if the function returned
"const struct object_id *" that points into the copy of the oid the
graph structure keeps (and the caller can do hashcpy() if it wants
to).

That would allow the API to later check for errors when the caller
gives 'n' that is too large by returning a NULL, for example.

> +static struct commit_list **insert_parent_or_die(struct commit_graph *g,
> +int pos,
> +struct commit_list **pptr)
> +{
> + struct commit *c;
> + struct object_id oid;
> + get_nth_commit_oid(g, pos, );
> + c = lookup_commit();

Re: [PATCH v3 04/14] commit-graph: implement write_commit_graph()

2018-02-15 Thread Derrick Stolee


On 2/15/2018 1:19 PM, Junio C Hamano wrote:

Derrick Stolee  writes:


+struct packed_oid_list {
+   struct object_id **list;
+   int nr;
+   int alloc;
+};

What is the typical access pattern for this data structure?  If it
is pretty much "allocate and grow as we find more", then a dynamic
array of struct (rather than a dynamic array of pointers to struct)
would be a lot more appropriate.  IOW

struct packed_oid_list {
struct object_id *list;
int nr, alloc;
};

The version in the posted patch has to pay malloc overhead plus an
extra pointer for each object id in the list; unless you often
replace elements in the list randomly and/or you borrow object ID
field in other existing data structure whose lifetime is longer than
this list by pointing at it, I do not see how the extra indirection
is worth it.



The pattern used in write_commit_graph() is to append OIDs to the list 
as we discover them and then sort in lexicographic order. The sort then 
only swaps pointers.


I can switch this to sort the 'struct object_id' elements themselves, if 
that is a better pattern.


Thanks,
-Stolee

Re: [PATCH v3 04/14] commit-graph: implement write_commit_graph()

2018-02-15 Thread Junio C Hamano

Derrick Stolee  writes:

> +struct packed_oid_list {
> + struct object_id **list;
> + int nr;
> + int alloc;
> +};

What is the typical access pattern for this data structure?  If it
is pretty much "allocate and grow as we find more", then a dynamic
array of struct (rather than a dynamic array of pointers to struct)
would be a lot more appropriate.  IOW

struct packed_oid_list {
struct object_id *list;
int nr, alloc;
};

The version in the posted patch has to pay malloc overhead plus an
extra pointer for each object id in the list; unless you often
replace elements in the list randomly and/or you borrow object ID
field in other existing data structure whose lifetime is longer than
this list by pointing at it, I do not see how the extra indirection
is worth it.

[PATCH v7 0/7] convert: add support for different encodings

2018-02-15 Thread lars . schneider

From: Lars Schneider 

Hi,

Patches 1-4, 6 are preparation and helper functions.
Patch 5,7 are the actual change.

This series depends on Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is already
in master.

Changes since v6:

* use consistent casing for core.checkRoundtripEncoding (Junio)
* fix gibberish in commit message (Junio)
* improve documentation (Torsten)
* improve advise messages (Torsten)


Thanks,
Lars

  RFC: 
https://public-inbox.org/git/bdb9b884-6d17-4be3-a83c-f67e2afa2...@gmail.com/
   v1: 
https://public-inbox.org/git/20171211155023.1405-1-lars.schnei...@autodesk.com/
   v2: 
https://public-inbox.org/git/2017122915.39680-1-lars.schnei...@autodesk.com/
   v3: 
https://public-inbox.org/git/20180106004808.77513-1-lars.schnei...@autodesk.com/
   v4: 
https://public-inbox.org/git/20180120152418.52859-1-lars.schnei...@autodesk.com/
   v5: https://public-inbox.org/git/20180129201855.9182-1-tbo...@web.de/
   v6: 
https://public-inbox.org/git/20180209132830.55385-1-lars.schnei...@autodesk.com/


Base Ref:
Web-Diff: https://github.com/larsxschneider/git/commit/2b94bec353
Checkout: git fetch https://github.com/larsxschneider/git encoding-v7 && git 
checkout 2b94bec353


### Interdiff (v6..v7):

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index ea5a9509c6..10cb37795d 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -291,19 +291,20 @@ the content is reencoded back to the specified encoding.
 Please note that using the `working-tree-encoding` attribute may have a
 number of pitfalls:

-- Git clients that do not support the `working-tree-encoding` attribute
-  will checkout the respective files UTF-8 encoded and not in the
-  expected encoding. Consequently, these files will appear different
-  which typically causes trouble. This is in particular the case for
-  older Git versions and alternative Git implementations such as JGit
-  or libgit2 (as of February 2018).
+- Third party Git implementations that do not support the
+  `working-tree-encoding` attribute will checkout the respective files
+  UTF-8 encoded and not in the expected encoding. Consequently, these
+  files will appear different which typically causes trouble. This is
+  in particular the case for older Git versions and alternative Git
+  implementations such as JGit or libgit2 (as of February 2018).

 - Reencoding content to non-UTF encodings can cause errors as the
   conversion might not be UTF-8 round trip safe. If you suspect your
-  encoding to not be round trip safe, then add it to 
`core.checkRoundtripEncoding`
-  to make Git check the round trip encoding (see linkgit:git-config[1]).
-  SHIFT-JIS (Japanese character set) is known to have round trip issues
-  with UTF-8 and is checked by default.
+  encoding to not be round trip safe, then add it to
+  `core.checkRoundtripEncoding` to make Git check the round trip
+  encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
+  set) is known to have round trip issues with UTF-8 and is checked by
+  default.

 - Reencoding content requires resources that might slow down certain
   Git operations (e.g 'git checkout' or 'git add').
@@ -327,7 +328,7 @@ explicitly define the line endings with `eol` if the 
`working-tree-encoding`
 attribute is used to avoid ambiguity.

 
-*.proj working-tree-encoding=UTF-16LE text eol=CRLF
+*.proj text working-tree-encoding=UTF-16LE eol=CRLF
 

 You can get a list of all available encodings on your platform with the
diff --git a/convert.c b/convert.c
index 71dffc7167..398cd9cf7b 100644
--- a/convert.c
+++ b/convert.c
@@ -352,29 +352,29 @@ static int encode_to_git(const char *path, const char 
*src, size_t src_len,

if (has_prohibited_utf_bom(enc->name, src, src_len)) {
const char *error_msg = _(
-   "BOM is prohibited for '%s' if encoded as %s");
+   "BOM is prohibited in '%s' if encoded as %s");
+   /*
+* This advise is shown for UTF-??BE and UTF-??LE encodings.
+* We truncate the encoding name to 6 chars with %.6s to cut
+* off the last two "byte order" characters.
+*/
const char *advise_msg = _(
-   "You told Git to treat '%s' as %s. A byte order mark "
-   "(BOM) is prohibited with this encoding. Either use "
-   "%.6s as working tree encoding or remove the BOM from 
the "
-   "file.");
-
-   advise(advise_msg, path, enc->name, enc->name, enc->name);
+   "The file '%s' contains a byte order mark (BOM). "
+   "Please use %.6s as working-tree-encoding.");
+   advise(advise_msg, path, enc->name);
if

[PATCH v7 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-02-15 Thread lars . schneider

From: Lars Schneider 

Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.

Remove the unnecessary assignment.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index 1df674e919..55b7daeb35 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -781,7 +781,6 @@ char *xstrdup_tolower(const char *string)
result = xmallocz(len);
for (i = 0; i < len; i++)
result[i] = tolower(string[i]);
-   result[i] = '\0';
return result;
 }
 
-- 
2.16.1

[PATCH v7 2/7] strbuf: add xstrdup_toupper()

2018-02-15 Thread lars . schneider

From: Lars Schneider 

Create a copy of an existing string and make all characters upper case.
Similar xstrdup_tolower().

This function is used in a subsequent commit.

Signed-off-by: Lars Schneider 
---
 strbuf.c | 12 
 strbuf.h |  1 +
 2 files changed, 13 insertions(+)

diff --git a/strbuf.c b/strbuf.c
index 55b7daeb35..b635f0bdc4 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -784,6 +784,18 @@ char *xstrdup_tolower(const char *string)
return result;
 }
 
+char *xstrdup_toupper(const char *string)
+{
+   char *result;
+   size_t len, i;
+
+   len = strlen(string);
+   result = xmallocz(len);
+   for (i = 0; i < len; i++)
+   result[i] = toupper(string[i]);
+   return result;
+}
+
 char *xstrvfmt(const char *fmt, va_list ap)
 {
struct strbuf buf = STRBUF_INIT;
diff --git a/strbuf.h b/strbuf.h
index 14c8c10d66..df7ced53ed 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -607,6 +607,7 @@ __attribute__((format (printf,2,3)))
 extern int fprintf_ln(FILE *fp, const char *fmt, ...);
 
 char *xstrdup_tolower(const char *);
+char *xstrdup_toupper(const char *);
 
 /**
  * Create a newly allocated string using printf format. You can do this easily
-- 
2.16.1

[PATCH v7 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-02-15 Thread lars . schneider

From: Lars Schneider 

Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#bom10

Signed-off-by: Lars Schneider 
---
 utf8.c | 24 
 utf8.h |  9 +
 2 files changed, 33 insertions(+)

diff --git a/utf8.c b/utf8.c
index 2c27ce0137..914881cd1f 100644
--- a/utf8.c
+++ b/utf8.c
@@ -538,6 +538,30 @@ char *reencode_string_len(const char *in, int insz,
 }
 #endif
 
+static int has_bom_prefix(const char *data, size_t len,
+ const char *bom, size_t bom_len)
+{
+   return (len >= bom_len) && !memcmp(data, bom, bom_len);
+}
+
+static const char utf16_be_bom[] = {0xFE, 0xFF};
+static const char utf16_le_bom[] = {0xFF, 0xFE};
+static const char utf32_be_bom[] = {0x00, 0x00, 0xFE, 0xFF};
+static const char utf32_le_bom[] = {0xFF, 0xFE, 0x00, 0x00};
+
+int has_prohibited_utf_bom(const char *enc, const char *data, size_t len)
+{
+   return (
+ (!strcmp(enc, "UTF-16BE") || !strcmp(enc, "UTF-16LE")) &&
+ (has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
+  has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
+   ) || (
+ (!strcmp(enc, "UTF-32BE") || !strcmp(enc, "UTF-32LE")) &&
+ (has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
+  has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
+   );
+}
+
 /*
  * Returns first character length in bytes for multi-byte `text` according to
  * `encoding`.
diff --git a/utf8.h b/utf8.h
index 6bbcf31a83..4711429af9 100644
--- a/utf8.h
+++ b/utf8.h
@@ -70,4 +70,13 @@ typedef enum {
 void strbuf_utf8_align(struct strbuf *buf, align_type position, unsigned int 
width,
   const char *s);
 
+/*
+ * Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
+ * or UTF-32LE a BOM must not be used [1]. The function returns true if
+ * this is the case.
+ *
+ * [1] http://unicode.org/faq/utf_bom.html#bom10
+ */
+int has_prohibited_utf_bom(const char *enc, const char *data, size_t len);
+
 #endif
-- 
2.16.1

[PATCH v7 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-15 Thread lars . schneider

From: Lars Schneider 

If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.

The Unicode standard instructs to assume big-endian if there in no BOM
for UTF-16/32 [1][2]. However, the W3C/WHATWG encoding standard used
in HTML5 recommends to assume little-endian to "deal with deployed
content" [3]. Strictly requiring a BOM seems to be the safest option
for content in Git.

This function is used in a subsequent commit.

[1] http://unicode.org/faq/utf_bom.html#gen6
[2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
 Section 3.10, D98, page 132
[3] https://encoding.spec.whatwg.org/#utf-16le

Signed-off-by: Lars Schneider 
---
 utf8.c | 13 +
 utf8.h | 16 
 2 files changed, 29 insertions(+)

diff --git a/utf8.c b/utf8.c
index 914881cd1f..5113d26e56 100644
--- a/utf8.c
+++ b/utf8.c
@@ -562,6 +562,19 @@ int has_prohibited_utf_bom(const char *enc, const char 
*data, size_t len)
);
 }
 
+int is_missing_required_utf_bom(const char *enc, const char *data, size_t len)
+{
+   return (
+  !strcmp(enc, "UTF-16") &&
+  !(has_bom_prefix(data, len, utf16_be_bom, sizeof(utf16_be_bom)) ||
+has_bom_prefix(data, len, utf16_le_bom, sizeof(utf16_le_bom)))
+   ) || (
+  !strcmp(enc, "UTF-32") &&
+  !(has_bom_prefix(data, len, utf32_be_bom, sizeof(utf32_be_bom)) ||
+has_bom_prefix(data, len, utf32_le_bom, sizeof(utf32_le_bom)))
+   );
+}
+
 /*
  * Returns first character length in bytes for multi-byte `text` according to
  * `encoding`.
diff --git a/utf8.h b/utf8.h
index 4711429af9..62f86fba64 100644
--- a/utf8.h
+++ b/utf8.h
@@ -79,4 +79,20 @@ void strbuf_utf8_align(struct strbuf *buf, align_type 
position, unsigned int wid
  */
 int has_prohibited_utf_bom(const char *enc, const char *data, size_t len);
 
+/*
+ * If the endianness is not defined in the encoding name, then we
+ * require a BOM. The function returns true if a required BOM is missing.
+ *
+ * The Unicode standard instructs to assume big-endian if there
+ * in no BOM for UTF-16/32 [1][2]. However, the W3C/WHATWG
+ * encoding standard used in HTML5 recommends to assume
+ * little-endian to "deal with deployed content" [3].
+ *
+ * [1] http://unicode.org/faq/utf_bom.html#gen6
+ * [2] http://www.unicode.org/versions/Unicode10.0.0/ch03.pdf
+ * Section 3.10, D98, page 132
+ * [3] https://encoding.spec.whatwg.org/#utf-16le
+ */
+int is_missing_required_utf_bom(const char *enc, const char *data, size_t len);
+
 #endif
-- 
2.16.1

[PATCH v7 5/7] convert: add 'working-tree-encoding' attribute

2018-02-15 Thread lars . schneider

From: Lars Schneider 

Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as most Git web front ends do not
visualize the content.

Add an attribute to tell Git what encoding the user has defined for a
given file. If the content is added to the index, then Git converts the
content to a canonical UTF-8 representation. On checkout Git will
reverse the conversion.

Signed-off-by: Lars Schneider 
---
 Documentation/gitattributes.txt  |  66 
 convert.c| 157 -
 convert.h|   1 +
 sha1_file.c  |   2 +-
 t/t0028-working-tree-encoding.sh | 210 +++
 5 files changed, 434 insertions(+), 2 deletions(-)
 create mode 100755 t/t0028-working-tree-encoding.sh

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 30687de81a..5ec179d631 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -272,6 +272,72 @@ few exceptions.  Even though...
   catch potential problems early, safety triggers.
 
 
+`working-tree-encoding`
+^^^
+
+Git recognizes files encoded with ASCII or one of its supersets (e.g.
+UTF-8 or ISO-8859-1) as text files.  All other encodings are usually
+interpreted as binary and consequently built-in Git text processing
+tools (e.g. 'git diff') as well as most Git web front ends do not
+visualize the content.
+
+In these cases you can tell Git the encoding of a file in the working
+directory with the `working-tree-encoding` attribute. If a file with this
+attribute is added to Git, then Git reencodes the content from the
+specified encoding to UTF-8. Finally, Git stores the UTF-8 encoded
+content in its internal data structure (called "the index"). On checkout
+the content is reencoded back to the specified encoding.
+
+Please note that using the `working-tree-encoding` attribute may have a
+number of pitfalls:
+
+- Third party Git implementations that do not support the
+  `working-tree-encoding` attribute will checkout the respective files
+  UTF-8 encoded and not in the expected encoding. Consequently, these
+  files will appear different which typically causes trouble. This is
+  in particular the case for older Git versions and alternative Git
+  implementations such as JGit or libgit2 (as of February 2018).
+
+- Reencoding content requires resources that might slow down certain
+  Git operations (e.g 'git checkout' or 'git add').
+
+Use the `working-tree-encoding` attribute only if you cannot store a file
+in UTF-8 encoding and if you want Git to be able to process the content
+as text.
+
+As an example, use the following attributes if your '*.proj' files are
+UTF-16 encoded with byte order mark (BOM) and you want Git to perform
+automatic line ending conversion based on your platform.
+
+
+*.proj text working-tree-encoding=UTF-16
+
+
+Use the following attributes if your '*.proj' files are UTF-16 little
+endian encoded without BOM and you want Git to use Windows line endings
+in the working directory. Please note, it is highly recommended to
+explicitly define the line endings with `eol` if the `working-tree-encoding`
+attribute is used to avoid ambiguity.
+
+
+*.proj text working-tree-encoding=UTF-16LE eol=CRLF
+
+
+You can get a list of all available encodings on your platform with the
+following command:
+
+
+iconv --list
+
+
+If you do not know the encoding of a file, then you can use the `file`
+command to guess the encoding:
+
+
+file foo.proj
+
+
+
 `ident`
 ^^^
 
diff --git a/convert.c b/convert.c
index b976eb968c..d20c341b6d 100644
--- a/convert.c
+++ b/convert.c
@@ -7,6 +7,7 @@
 #include "sigchain.h"
 #include "pkt-line.h"
 #include "sub-process.h"
+#include "utf8.h"
 
 /*
  * convert.c - convert a file when checking it out and checking it in.
@@ -265,6 +266,110 @@ static int will_convert_lf_to_crlf(size_t len, struct 
text_stat *stats,
 
 }
 
+static struct encoding {
+   const char *name;
+   struct encoding *next;
+} *encoding, **encoding_tail;
+static const char *default_encoding = "UTF-8";
+
+static int encode_to_git(const char *path, const char *src, size_t src_len,
+struct strbuf *buf, struct encoding *enc, int 
conv_flags)
+{
+   char *dst;
+   int dst_len;
+
+   /*
+* No encoding is specified or there is nothing to encode.
+* Tell the caller that the content was not modified.
+*/
+   if (!enc || (src && !src_len))
+   return 0;
+
+

[PATCH v7 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-02-15 Thread lars . schneider

From: Lars Schneider 

UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are known to have round trip issues [1].

Add 'core.checkRoundTripEncoding', which contains a comma separated
list of encodings, to define for what encodings Git should check the
conversion round trip if they are used in the 'working-tree-encoding'
attribute.

Set SHIFT-JIS as default value for 'core.checkRoundTripEncoding'.

[1] 
https://support.microsoft.com/en-us/help/170559/prb-conversion-problem-between-shift-jis-and-unicode

Signed-off-by: Lars Schneider 
---
 Documentation/config.txt |  6 
 Documentation/gitattributes.txt  |  8 +
 config.c |  5 +++
 convert.c| 74 
 convert.h|  1 +
 environment.c|  1 +
 t/t0028-working-tree-encoding.sh | 41 ++
 7 files changed, 136 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0e25b2c92b..d7a56054a5 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -530,6 +530,12 @@ core.autocrlf::
This variable can be set to 'input',
in which case no output conversion is performed.
 
+core.checkRoundtripEncoding::
+   A comma separated list of encodings that Git performs UTF-8 round
+   trip checks on if they are used in an `working-tree-encoding`
+   attribute (see linkgit:gitattributes[5]). The default value is
+   `SHIFT-JIS`.
+
 core.symlinks::
If false, symbolic links are checked out as small plain files that
contain the link text. linkgit:git-update-index[1] and
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 5ec179d631..10cb37795d 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -298,6 +298,14 @@ number of pitfalls:
   in particular the case for older Git versions and alternative Git
   implementations such as JGit or libgit2 (as of February 2018).
 
+- Reencoding content to non-UTF encodings can cause errors as the
+  conversion might not be UTF-8 round trip safe. If you suspect your
+  encoding to not be round trip safe, then add it to
+  `core.checkRoundtripEncoding` to make Git check the round trip
+  encoding (see linkgit:git-config[1]). SHIFT-JIS (Japanese character
+  set) is known to have round trip issues with UTF-8 and is checked by
+  default.
+
 - Reencoding content requires resources that might slow down certain
   Git operations (e.g 'git checkout' or 'git add').
 
diff --git a/config.c b/config.c
index 1f003fbb90..d0ada9fcd4 100644
--- a/config.c
+++ b/config.c
@@ -1172,6 +1172,11 @@ static int git_default_core_config(const char *var, 
const char *value)
return 0;
}
 
+   if (!strcmp(var, "core.checkroundtripencoding")) {
+   check_roundtrip_encoding = xstrdup(value);
+   return 0;
+   }
+
if (!strcmp(var, "core.notesref")) {
notes_ref_name = xstrdup(value);
return 0;
diff --git a/convert.c b/convert.c
index c4e2fd5fa5..398cd9cf7b 100644
--- a/convert.c
+++ b/convert.c
@@ -289,6 +289,39 @@ static void trace_encoding(const char *context, const char 
*path,
strbuf_release();
 }
 
+static int check_roundtrip(const char* enc_name)
+{
+   /*
+* check_roundtrip_encoding contains a string of space and/or
+* comma separated encodings (eg. "UTF-16, ASCII, CP1125").
+* Search for the given encoding in that string.
+*/
+   const char *found = strcasestr(check_roundtrip_encoding, enc_name);
+   const char *next = found + strlen(enc_name);
+   int len = strlen(check_roundtrip_encoding);
+   return (found && (
+   /*
+* check that the found encoding is at the
+* beginning of check_roundtrip_encoding or
+* that it is prefixed with a space or comma
+*/
+   found == check_roundtrip_encoding || (
+   found > check_roundtrip_encoding &&
+   (*(found-1) == ' ' || *(found-1) == ',')
+   )
+   ) && (
+   /*
+* check that the found encoding is at the
+* end of check_roundtrip_encoding or
+* that it is suffixed with a space or comma
+*/
+   next == check_roundtrip_encoding + len || (
+   next < check_roundtrip_encoding + len &&
+   (*next == ' ' || *next == ',')
+

[PATCH v7 6/7] convert: add tracing for 'working-tree-encoding' attribute

2018-02-15 Thread lars . schneider

From: Lars Schneider 

Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.

Signed-off-by: Lars Schneider 
---
 convert.c| 25 +
 t/t0028-working-tree-encoding.sh |  2 ++
 2 files changed, 27 insertions(+)

diff --git a/convert.c b/convert.c
index d20c341b6d..c4e2fd5fa5 100644
--- a/convert.c
+++ b/convert.c
@@ -266,6 +266,29 @@ static int will_convert_lf_to_crlf(size_t len, struct 
text_stat *stats,
 
 }
 
+static void trace_encoding(const char *context, const char *path,
+  const char *encoding, const char *buf, size_t len)
+{
+   static struct trace_key coe = TRACE_KEY_INIT(WORKING_TREE_ENCODING);
+   struct strbuf trace = STRBUF_INIT;
+   int i;
+
+   strbuf_addf(, "%s (%s, considered %s):\n", context, path, 
encoding);
+   for (i = 0; i < len && buf; ++i) {
+   strbuf_addf(
+   ,"| \e[2m%2i:\e[0m %2x \e[2m%c\e[0m%c",
+   i,
+   (unsigned char) buf[i],
+   (buf[i] > 32 && buf[i] < 127 ? buf[i] : ' '),
+   ((i+1) % 8 && (i+1) < len ? ' ' : '\n')
+   );
+   }
+   strbuf_addchars(, '\n', 1);
+
+   trace_strbuf(, );
+   strbuf_release();
+}
+
 static struct encoding {
const char *name;
struct encoding *next;
@@ -325,6 +348,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
error(error_msg, path, enc->name);
}
 
+   trace_encoding("source", path, enc->name, src, src_len);
dst = reencode_string_len(src, src_len, default_encoding, enc->name,
  _len);
if (!dst) {
@@ -340,6 +364,7 @@ static int encode_to_git(const char *path, const char *src, 
size_t src_len,
else
error(msg, path, enc->name, default_encoding);
}
+   trace_encoding("destination", path, default_encoding, dst, dst_len);
 
strbuf_attach(buf, dst, dst_len, dst_len + 1);
return 1;
diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh
index f9ce3e5ef5..01789ae1b8 100755
--- a/t/t0028-working-tree-encoding.sh
+++ b/t/t0028-working-tree-encoding.sh
@@ -4,6 +4,8 @@ test_description='working-tree-encoding conversion via 
gitattributes'
 
 . ./test-lib.sh
 
+GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING
+
 test_expect_success 'setup test repo' '
git config core.eol lf &&
 
-- 
2.16.1

Re: [PATCH] Makefile: generate Git(3pm) as dependency of the 'doc' and 'man' targets

2018-02-15 Thread Junio C Hamano

Ævar Arnfjörð Bjarmason  writes:

> On Thu, Feb 15 2018, SZEDER Gábor jotted:
>
>> Since commit 20d2a30f8f (Makefile: replace perl/Makefile.PL with
>> simple make rules, 2017-12-10), the Git(3pm) man page is only
>> generated as an indirect dependency of the 'install-doc' and
>> 'install-man' Makefile targets.  Consequently, if someone runs 'make
>> man && sudo make install-man' (or their 'doc' counterparts), then
>> Git(3pm) will be generated as root, and the resulting root-owned files
>> and directories will in turn cause the next user-run 'make clean' to
>> fail.  This was not an issue in the past, because Git(3pm) was
>> generated when 'make all' descended into 'perl/', which is usually not
>> run as root.
>>
>> List Git(3pm) as a dependency of the 'doc' and 'man' Makefile targets,
>> too, so it gets generated by targets that are usually built as
>> ordinary users.
>>
>> While at it, add 'install-man-perl' to the list of .PHONY targets.
>
> Thanks for the fixup of my crappy 'make' skills. I tested this before
> the patch and it indeed has the problem you describe, and this fixes
> it. Thanks! CC-ing Junio because I think it makes sense to pick this up
> as-is.
>
> Reviewed-by: Ævar Arnfjörð Bjarmason 

Thanks.  

It is better late than never to have fixes like this, but I am a bit
disturbed to notice that for the last few batches, we see fixes to
topics after they land 'master', not while they are in 'next'.

Will apply.

Re: [PATCH 2/2] apply: handle Subversion diffs with /dev/null gracefully

2018-02-15 Thread Junio C Hamano

Johannes Schindelin  writes:

>   } else {
> - if (!starts_with(line, "/dev/null\n"))
> + if (!is_dev_null(line))
>   return error(_("git apply: bad git-diff - expected 
> /dev/null on line %d"), state->linenr);
>   }

Yup.  This seems to be the last explicit/manual check with the
string "/dev/null" (instead of using is_dev_null(), which is how it
should be and already is done in codepaths that guesses -p value and
decides if it is a creation or a deletion patch).

Looks good.  Will queue.

> diff --git a/t/t4135-apply-weird-filenames.sh 
> b/t/t4135-apply-weird-filenames.sh
> index b14b8085786..c7c688fcc4b 100755
> --- a/t/t4135-apply-weird-filenames.sh
> +++ b/t/t4135-apply-weird-filenames.sh
> @@ -100,7 +100,7 @@ deleted file mode 100644
>  -
>  EOF
>  
> -test_expect_failure 'apply handles a diff generated by Subversion' '
> +test_expect_success 'apply handles a diff generated by Subversion' '
>   >Makefile &&
>   git apply -p2 diff-from-svn &&
>   test_path_is_missing Makefile

Re: [PATCH] t6300-for-each-ref: fix "more than one quoting style" tests

2018-02-15 Thread SZEDER Gábor

On Thu, Feb 15, 2018 at 5:52 PM, Jeff King  wrote:
> On Thu, Feb 15, 2018 at 05:39:28PM +0100, SZEDER Gábor wrote:
>
>> On Tue, Feb 13, 2018 at 11:22 PM, Jeff King  wrote:
>>
>> >>  for i in "--perl --shell" "-s --python" "--python --tcl" "--tcl --perl"; 
>> >> do
>> >>   test_expect_success "more than one quoting style: $i" "
>> >> - git for-each-ref $i 2>&1 | (read line &&
>> >> - case \$line in
>> >> - \"error: more than one quoting style\"*) : happy;;
>> >> - *) false
>> >> - esac)
>> >> + test_must_fail git for-each-ref $i 2>err &&
>> >> + grep '^error: more than one quoting style' err
>> >
>> > I suspect in the long run this ought to be test_i18ngrep, but since it's
>> > not localized yet, it makes sense to stop here with this patch.
>>
>> I thought 'git for-each-ref' is plumbing and that means that it
>> shouldn't be localized.
>
> I always assumed stderr was mostly fair game, even for plumbing. At any
> rate, I'm willing to ignore the issue until somebody actually proposes a
> patch to translate it.

Ah, OK, will keep that in mind.

Anyway, the first GETTEXT_POISON build will fail if that error string
ever gets translated but the test doesn't get updated.

Re: [PATCH 0/6] Yet another approach to handling empty snapshots

2018-02-15 Thread Johannes Schindelin

Hi Michael,

On Wed, 24 Jan 2018, Michael Haggerty wrote:

> This patch series fixes the handling of empty packed-refs snapshots
> (i.e., those with `snapshot->buf` and friends equal to `NULL`), partly
> by changing `snapshot` to store a pointer to the start of the
> post-header `packed-refs` content instead of `header_len`. It makes a
> couple of other improvements as well.
> 
> I'm not sure whether I like this approach better than the alternative
> of always setting `snapshot->buf` to a non-NULL value, by allocating a
> length-1 bit of RAM if necessary. The latter is less intrusive, though
> even if that approach is taken, I think patches 01, 02, and 04 from
> this patch series would be worthwhile improvements.

Thank you for Cc:ing me on this patch series. I tried to find some time to
review it, I really did, but failed. As I saw that others already had a
good look at it, I will just archive the mail thread.

I hope you do not mind!

Ciao,
Dscho

Re: [PATCH] t6300-for-each-ref: fix "more than one quoting style" tests

2018-02-15 Thread Jeff King

On Thu, Feb 15, 2018 at 05:39:28PM +0100, SZEDER Gábor wrote:

> On Tue, Feb 13, 2018 at 11:22 PM, Jeff King  wrote:
> 
> >>  for i in "--perl --shell" "-s --python" "--python --tcl" "--tcl --perl"; 
> >> do
> >>   test_expect_success "more than one quoting style: $i" "
> >> - git for-each-ref $i 2>&1 | (read line &&
> >> - case \$line in
> >> - \"error: more than one quoting style\"*) : happy;;
> >> - *) false
> >> - esac)
> >> + test_must_fail git for-each-ref $i 2>err &&
> >> + grep '^error: more than one quoting style' err
> >
> > I suspect in the long run this ought to be test_i18ngrep, but since it's
> > not localized yet, it makes sense to stop here with this patch.
> 
> I thought 'git for-each-ref' is plumbing and that means that it
> shouldn't be localized.

I always assumed stderr was mostly fair game, even for plumbing. At any
rate, I'm willing to ignore the issue until somebody actually proposes a
patch to translate it.

-Peff

I await your response for more close discussions.

2018-02-15 Thread Mr Jeremiah

Hello dear ,
I am Mr. Jeremiah Bworo , nationality of Ethiopia but currently In Benin 
Republic as a Principal Director with Federal Ministry Of Works and Housing 
here .

I will be delighted to invest with you there in your country into lucrative 
investments under your management as far as you could promise me of your 
country's economic stability . 

I await your response for more close discussions.

Regards,
Mr. Jeremiah Bworo
Principal director
Ministry of Works and Housing 
Benin Republic.

Re: [PATCH 5/8] rebase: introduce the --recreate-merges option

2018-02-15 Thread Johannes Schindelin

Hi,

On Thu, 15 Feb 2018, Sergey Organov wrote:

> Johannes Schindelin  writes:
> 
> > On Tue, 13 Feb 2018, Sergey Organov wrote:
> >
> >> Johannes Schindelin  writes:
> >> 
> >> > The wording is poor either way, but you are also not a native speaker so
> >> > we have to rely on, say, Eric to help us out here.
> >> 
> >> Likely, but why didn't you keep original wording from --preserve-merges?
> >> Do you feel it's somehow poor either?
> >
> > Yes, I felt it is poor, especially when --recreate-merges is present, that
> > is indeed why I changed it.
> 
> So, how about this (yeah, I noticed the option now got arguments, but
> please, tweak this to the new implementation yourself):
> 
> --recreate-merges::
>   Recreate merge commits instead of flattening the history. Merge
>   conflict resolutions or manual amendments to merge commits are
>   not preserved. 
> 
> -p::
> --preserve-merges::
>   (deprecated) This option is similar to --recreate-merges. It has
> no proper support for interactive mode and thus is deprecated.
> Use '--recreate-merges' instead.

I still don't like either.

I want something different there: descriptions that are a bit more
self-contained, and only describe the differences to -i or
--preserve-merges in a second paragraph.

Don't worry about it, though, I don't think you or me are capable of a
good explanation. I will ask some native speakers I trust.

Ciao,
Johannes

I await your response for more close discussions.

2018-02-15 Thread Mr Jeremiah

Hello dear ,
I am Mr. Jeremiah Bworo , nationality of Ethiopia but currently In Benin 
Republic as a Principal Director with Federal Ministry Of Works and Housing 
here .

I will be delighted to invest with you there in your country into lucrative 
investments under your management as far as you could promise me of your 
country's economic stability . 

I await your response for more close discussions.

Regards,
Mr. Jeremiah Bworo
Principal director
Ministry of Works and Housing 
Benin Republic.

Re: [PATCH 5/8] rebase: introduce the --recreate-merges option

2018-02-15 Thread Johannes Schindelin

Hi,

On Thu, 15 Feb 2018, Sergey Organov wrote:

> Johannes Schindelin  writes:
> 
> [...]
> 
> >> > I'd not argue this way myself. If there are out-of-git-tree non-human
> >> > users that accept and tweak todo _generated_ by current "git rebase -p"
> >> > _command_, I also vote for a new option.
> >> >
> >> 
> >> To be fair, I have not seen anything that actually reads the todo list
> >> and tweaks it in such a manner. The closest example is the git garden
> >> shears script, which simply replaces the todo list.
> >> 
> >> It's certainly *possible* that such a script would exist though,
> >
> > We actually know of such scripts.
> 
> Please consider to explain this in the description of the change. I
> believe readers deserve an explanation of why you decided to invent new
> option instead of fixing the old one, even if it were only a suspicion,
> more so if it is confidence.

I considered.

And since even the absence of this use case would *still* not be a
convincing case against keeping --preserve-merges backwards-compatible, I
will not mention it.

Just saying that --preserve-merges is not changed, in order to keep
backwards-compatibility, is plenty enough.

It probably already convinced the Git maintainer, who is very careful
about backwards-compatibility, and rightfully so.

Ciao,
Johannes

Re: [PATCH] t6300-for-each-ref: fix "more than one quoting style" tests

2018-02-15 Thread SZEDER Gábor

On Tue, Feb 13, 2018 at 11:22 PM, Jeff King  wrote:

>>  for i in "--perl --shell" "-s --python" "--python --tcl" "--tcl --perl"; do
>>   test_expect_success "more than one quoting style: $i" "
>> - git for-each-ref $i 2>&1 | (read line &&
>> - case \$line in
>> - \"error: more than one quoting style\"*) : happy;;
>> - *) false
>> - esac)
>> + test_must_fail git for-each-ref $i 2>err &&
>> + grep '^error: more than one quoting style' err
>
> I suspect in the long run this ought to be test_i18ngrep, but since it's
> not localized yet, it makes sense to stop here with this patch.

I thought 'git for-each-ref' is plumbing and that means that it
shouldn't be localized.

Targeted B2B Companies Emails List

2018-02-15 Thread sally . grant


Hi,

I was wondering if you would be interested in targeting a customized list  
of your competitors End Users Install Base for your upcoming Marketing  
Strategy.


• ERP- JD Edwards, Infor Baan, SAP, Exact Software, NetSuite, PeopleSoft,  
etc.
• CRM- Salesforce, MS Dynamics, NetSuite, Siebel, Teradata, Epicor, Infor,  
CDC Software, etc.
• Engineering Software- Autodesk, Siemens PLM, Adobe, AutoCAD, MAYA, Revit,  
Solid works, PTC, MADCAD, etc.

• Cloud Computing- Amazon, Rack Space, Google APPS, Hyper-V, NetApp, etc.
• Storage application - NetApp, EMC, Citrix, HP, Brocade, DELL, etc.
• Security Software- Symantec, McAfee, IBM, Riverbed, Tabberg, Commvault,  
Juniper Networks, F5, etc.

• Networking- Brocade, Symantec, Avaya, Cisco, Shoretel, etc.
• Medical Software- NextGen, All Scripts, EMR, McKesson, Practice Fusion,  
eClinical Works, etc.
• Accounting Software- Sage, Peachtree, Timberline, MS Dynamics, NetSuite,  
Deltek, Lawson, QuickBooks, etc.
• Business Intelligence- SAP Business Objects, Microstratergy, Tibco,  
Microsoft BI, QlikTech, Information Builders, etc.
We provide data across the globe - North America, EMEA, Asia Pacific and  
LATAM. We provide the decision Makers contacts like CIO, CTO, CISO, IT VP,  
IT Director, IT manager, IT head, etc.


Please review and let me know if you are looking for any type of list and  
we can service you.


If you are interested, let me know your targeted geography so that I will  
get back to you with the counts and more information.


Await your response!
Thanks,
Sally Grant
Marketing Executive

 If you are not interested in receiving further emails, please answer back  
with "overlook" in the title.

Re: Uninstall Git

2018-02-15 Thread Konstantin Khomoutov

On Thu, Feb 15, 2018 at 01:01:14PM +0100, Jan Nguyen wrote:

> I would like to ask how can I uninstall Git?

Please follow the answers given to this question at
https://apple.stackexchange.com/search?q=%5Bgit%5D+uninstall

Re: git-rebase --undo-skip proposal

2018-02-15 Thread Johannes Schindelin

Hi Jake,

On Wed, 14 Feb 2018, Jacob Keller wrote:

> On Wed, Feb 14, 2018 at 5:50 PM, Psidium Guajava  wrote:
> > 2018-02-14 22:53 GMT-02:00 Johannes Schindelin :
> >> Now, when is the next possible time you can call `git rebase --undo-skip`?
> >
> > What if the scope were reduced from `--undo-skip` to `--undo-last-skip`?
> > Also, further reduce the scope to only allow `--undo-last-skip` during
> > a ongoing rebase, not caring about a finished one?
> >
> > But, this could be so niche that I have doubts if this would ever be used;
> 
> I think this is too niche to actually be useful in practice,
> especially since figuring out exactly what to replay might be tricky..
> I suppose it could keep track of where in the rebase the last skip was
> used, and then just go back to rebase and stop there again? That
> sounds like just redoing the rebase is easier.. (ie: use the reflog to
> go back to before the rebase started, and then re-run it again and
> don't skip this time).

Yes, and having a record of what commits were skipped would probably be
very helpful not only in this instance.

Maybe also a record of what commits caused merge conflicts, which a
mapping between original and new commit (which does get a bit tricky with
fixup!/squash! commits, though).

Ciao,
Dscho

Re: git-rebase --undo-skip proposal

2018-02-15 Thread Johannes Schindelin

Hi Jake,

On Wed, 14 Feb 2018, Jacob Keller wrote:

> On Wed, Feb 14, 2018 at 4:53 PM, Johannes Schindelin
>  wrote:
> >
> > On Wed, 14 Feb 2018, Psidium Guajava wrote:
> >
> >> On 2018-02-13 18:42 GMT-02:00 Stefan Beller  wrote:
> >> > On Tue, Feb 13, 2018 at 12:22 PM, Psidium Guajava
> >> >  wrote: I think this could also be done with
> >> > "git rebase --edit-todo", which brings up the right file in your
> >> > editor.
> >>
> >> Yeah that'd would only work if one started a rebase as a interactive
> >> one, not am or merge.
> >
> > I agree that the original proposal was clearly for the non-interactive
> > rebase (it talked about .git/rebase-apply/).
> >
> > The biggest problem with the feature request is not how useful it
> > would be: I agree it would be useful. The biggest problem is that it
> > is a little bit of an ill-defined problem.
> >
> > Imagine that you are rebasing 30 patches. Now, let's assume that patch
> > #7 causes a merge conflict, and you mistakenly call `git rebase
> > --skip`.
> >
> > Now, when is the next possible time you can call `git rebase
> > --undo-skip`?  It could be after a merge conflict in #8. Or in
> > interactive rebase, after a `pick` that was turned into an `edit`. Or,
> > and this is also entirely possible, after the rebase finished with #30
> > without problems and the rebase is actually no longer in progress.
> >
> > So I do not think that there is a way, in general, to implement this
> > feature. Even if you try to remember the state where a `--skip` was
> > called, you would remember it in the .git/rebase-apply/ (or
> > .git/rebase-merge/) directory, which is cleaned up after the rebase
> > concluded successfully. So even then the information required to
> > implement the feature would not necessarily be there, still, when it
> > would be needed.
> 
> Instead of an "--undo-skip", what if we ask the question of what the
> user actually wants?

Heh, I thought in this instance, --undo-skip was what the user wanted ;-)

> Generally I'd assume that the user wishes to go back to the rebase and
> "pick" the commit back in.

Right, and then replay whatever series of commits was picked since the one
that was skipped.

What *could* be done is to save a copy of the current todo list (with the
skipped commit put back in, before the current first line) and save that
together with `git rev-parse HEAD`.

This should make it possible to implement `--undo-skip` for as long as
the rebase did not finish.

Theoretically, you could even save the commit name of the skipped commit
somewhere else than $GIT_DIR/rebase-apply/ (or $GIT_DIR/rebase-merge/),
say, as worktree-local `refs/rebase/skipped`, and then a `git rebase
--undo-skip` could detect the absence of $GIT_DIR/rebase-*/ and fall back
to `git cherry-pick refs/rebase/skipped`.

You'd have to take pains to handle that ref in gc, and to record when the
user edited the todo list via `git rebase --edit-todo` after skipping that
commit (and warn loudly upon/prevent --undo-skip) because those todo list
changes would then be lost.

That's just one way how this feature could be implemented.

It does strike me as awfully specific, though. And it would still only
extend to the latest `git rebase --skip`.

So I am not sure whether we really would want to go this direction, or
whether we can maybe come up with something (probably based on your
suggestion to give the user enough information) that would allow many more
scenarios than just --undo-skip.

> So what if we just make "git rebase --skip" more verbose so that it
> more clearly spells out which commit is being skipped? Possibly even
> as extra lines of "the following patches were skipped during the
> rebase" after it completes?
> 
> Then it's up to the user to determine what to do with those commits,
> and there are many tools they could use to solve it, "git rebase -i,
> git cherry-pick, git reflog to restore to a previous and re-run the
> rebase, etc".

I think this is a saner direction, as it will probably allow more
scenarios to be addressed than just undoing the latest `git rebase
--skip`.

Ciao,
Dscho

Uninstall Git

2018-02-15 Thread Jan Nguyen

Hello there, 

I would like to ask how can I uninstall Git?

Kind regards

Jan

[PATCH] make hg-to-git compatible with python2.x and 3.x

2018-02-15 Thread Hervé Beraud

---
 contrib/hg-to-git/hg-to-git.py | 140 -
 1 file changed, 83 insertions(+), 57 deletions(-)

diff --git a/contrib/hg-to-git/hg-to-git.py b/contrib/hg-to-git/hg-to-git.py
index de3f81667ed97..9b0842c3883dc 100755
--- a/contrib/hg-to-git/hg-to-git.py
+++ b/contrib/hg-to-git/hg-to-git.py
@@ -18,14 +18,19 @@
 along with this program; if not, see .
 """
 
-import os, os.path, sys
-import tempfile, pickle, getopt
+import os
+import os.path
+import sys
+import tempfile
+import pickle
+import getopt
+import textwrap
 import re
 
 if sys.hexversion < 0x0203:
-   # The behavior of the pickle module changed significantly in 2.3
-   sys.stderr.write("hg-to-git.py: requires Python 2.3 or later.\n")
-   sys.exit(1)
+# The behavior of the pickle module changed significantly in 2.3
+sys.stderr.write("hg-to-git.py: requires Python 2.3 or later.\n")
+sys.exit(1)
 
 # Maps hg version -> git version
 hgvers = {}
@@ -38,29 +43,31 @@
 # Number of new changesets converted from hg
 hgnewcsets = 0
 
-#--
+# -
+
 
 def usage():
 
-print """\
-%s: [OPTIONS] 
+print(textwrap.dedent("""\
+{0}: [OPTIONS] 
 
-options:
--s, --gitstate=FILE: name of the state to be saved/read
- for incrementals
--n, --nrepack=INT:   number of changesets that will trigger
- a repack (default=0, -1 to deactivate)
--v, --verbose:   be verbose
+options:
+-s, --gitstate=FILE: name of the state to be saved/read
+ for incrementals
+-n, --nrepack=INT:   number of changesets that will trigger
+ a repack (default=0, -1 to deactivate)
+-v, --verbose:   be verbose
 
-required:
-hgprj:  name of the HG project to import (directory)
-""" % sys.argv[0]
+required:
+hgprj:  name of the HG project to import (directory)
+""").format(sys.argv[0]))
+
+# -
 
-#--
 
 def getgitenv(user, date):
 env = ''
-elems = re.compile('(.*?)\s+<(.*)>').match(user)
+elems = re.compile(r'(.*?)\s+<(.*)>').match(user)
 if elems:
 env += 'export GIT_AUTHOR_NAME="%s" ;' % elems.group(1)
 env += 'export GIT_COMMITTER_NAME="%s" ;' % elems.group(1)
@@ -76,14 +83,16 @@ def getgitenv(user, date):
 env += 'export GIT_COMMITTER_DATE="%s" ;' % date
 return env
 
-#--
+# -
+
 
 state = ''
 opt_nrepack = 0
 verbose = False
 
 try:
-opts, args = getopt.getopt(sys.argv[1:], 's:t:n:v', ['gitstate=', 
'tempdir=', 'nrepack=', 'verbose'])
+options = ['gitstate=', 'tempdir=', 'nrepack=', 'verbose']
+opts, args = getopt.getopt(sys.argv[1:], 's:t:n:v', options)
 for o, a in opts:
 if o in ('-s', '--gitstate'):
 state = a
@@ -94,7 +103,7 @@ def getgitenv(user, date):
 verbose = True
 if len(args) != 1:
 raise Exception('params')
-except:
+except Exception:
 usage()
 sys.exit(1)
 
@@ -104,37 +113,38 @@ def getgitenv(user, date):
 if state:
 if os.path.exists(state):
 if verbose:
-print 'State does exist, reading'
+print('State does exist, reading')
 f = open(state, 'r')
 hgvers = pickle.load(f)
 else:
-print 'State does not exist, first run'
+print('State does not exist, first run')
 
 sock = os.popen('hg tip --template "{rev}"')
 tip = sock.read()
 if sock.close():
 sys.exit(1)
 if verbose:
-print 'tip is', tip
+print('tip is', tip)
 
 # Calculate the branches
 if verbose:
-print 'analysing the branches...'
+print('analysing the branches...')
 hgchildren["0"] = ()
 hgparents["0"] = (None, None)
 hgbranch["0"] = "master"
 for cset in range(1, int(tip) + 1):
 hgchildren[str(cset)] = ()
-prnts = os.popen('hg log -r %d --template "{parents}"' % 
cset).read().strip().split(' ')
+prnts = os.popen(
+'hg log -r %d --template "{parents}"' % cset).read().strip().split(' ')
 prnts = map(lambda x: x[:x.find(':')], prnts)
 if prnts[0] != '':
 parent = prnts[0].strip()
 else:
 parent = str(cset - 1)
-hgchildren[parent] += ( str(cset), )
+hgchildren[parent] += (str(cset), )
 if len(prnts) > 1:
 mparent = prnts[1].strip()
-hgchildren[mparent] += ( str(cset), )
+hgchildren[mparent] += (str(cset), )
 else:
 mparent = None
 
@@ -148,59 +158,65 @@ def getgitenv(user, date):

Re: [PATCH v3 21/23] for-each-ref: tests for new atoms added

2018-02-15 Thread Оля Тележная

2018-02-15 8:57 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Add tests for new formatting atoms: rest, deltabase, objectsize:disk.
>> rest means nothing and we expand it into empty string.
>> We need this atom for cat-file command.
>> Have plans to support deltabase and objectsize:disk further
>> (as it is done in cat-file), now also expand it to empty string.
>
> I'm glad that you're adding tests, but I'm not sure it's a good idea to
> add tests checking for the thing we know to be wrong. If anything, you
> could be adding test_expect_failure looking for the _right_ thing, and
> accept that it does not yet work.

OK, I will try to fix that when other parts of the patch will look
good enough. I am not sure at this point that we will add my code to
the main version of the project, so these tests were written actually
for checking current functionality better before sending the code for
the review.

>
> -Peff

Re: [PATCH v3 20/23] ref-filter: unifying formatting of cat-file opts

2018-02-15 Thread Оля Тележная

2018-02-15 8:56 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> cat-file options are now filled by general logic.
>
> Yay.
>
> One puzzling thing:
>
>> diff --git a/ref-filter.c b/ref-filter.c
>> index 8d104b567eb7c..5781416cf9126 100644
>> --- a/ref-filter.c
>> +++ b/ref-filter.c
>> @@ -824,8 +824,12 @@ static void grab_common_values(struct atom_value *val, 
>> int deref, struct object
>>   else if (!strcmp(name, "objectsize")) {
>>   v->value = sz;
>>   v->s = xstrfmt("%lu", sz);
>> - }
>> - else if (deref)
>> + } else if (!strcmp(name, "objectsize:disk")) {
>> + if (cat_file_info.is_cat_file) {
>> + v->value = cat_file_info.disk_size;
>> + v->s = xstrfmt("%"PRIuMAX, 
>> (uintmax_t)v->value);
>> + }
>> + } else if (deref)
>
> Why do we care about is_cat_file here. Shouldn't:
>
>   git for-each-ref --format='%(objectsize:disk)'
>
> work? I.e., shouldn't the cat_file_info.disk_size variable be held
> somewhere in a used_atom struct?

At that point - no.
I think it sounds like other separate task to add this functionality
and to test it properly.

>
> -Peff

Re: [PATCH v3 16/23] ref-filter: make cat_file_info independent

2018-02-15 Thread Оля Тележная

2018-02-15 8:53 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Remove connection between expand_data variable
>> in cat-file and in ref-filter.
>> It will help further to get rid of using expand_data in cat-file.
>
> I have to admit I'm confused at this point about what is_cat_file is
> for, or even why we need cat_file_data. Shouldn't these items be handled
> by their matching ref-filter atoms at this point?

We discussed that earlier outside of mailing list, and I even tried to
implement that idea and spent a couple of days to prove that it's not
possible.
The problem is that the list of atoms is made dynamically, and we
can't store pointers to any values in each atom. That's why we need
separate cat_file_info variable that is outside of main atom list.
We also need is_cat_file because we still have some part of logic that
is different for cat-file and for all other commands, and sometimes we
need to know that information.

>
> -Peff

Re: [PATCH v3 15/23] cat-file: move skip_object_info into ref-filter

2018-02-15 Thread Оля Тележная

2018-02-15 8:51 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Move logic related to skip_object_info into ref-filter,
>> so that cat-file does not use that field at all.
>
> I think this is going the wrong way. ref-filter should always do as
> little work as possible to fulfill the request. So it should skip the
> object_info call whenever it can. And then any callers who want to make
> sure that the object exists can do so (as long as the formatting code
> tells them whether it looked up the object or not).
>
> And then ref-filter doesn't have to know about this skip_object_info
> flag at all.

Your message looks contradictory to me.
I agree that ref-filter should do as least as it's possible, and that
is the main reason why I put this code there. Moreover, I think that
it's a good idea to implement that variable not only for cat-file, but
for all ref-filter callers. And I think that it's a task of ref-filter
to check whether the object exists or not (or - whether the ref is
valid or not). But I am not sure that I need to solve that moment in
current patch. It sounds like another separate task.

>
> -Peff

Re: [PATCH v3 14/23] ref_filter: add is_atom_used function

2018-02-15 Thread Оля Тележная

2018-02-15 8:49 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Delete all items related to split_on_whitespace from ref-filter
>> and add new function for handling the logic.
>> Now cat-file could invoke that function to implementing its logic.
>
> OK, this is a good direction. I think in a more compact series we'd
> avoid moving the split-on-whitespace bits over to ref-filter in the
> first place, and have two commits:
>
>  - one early in the series adding is_atom_used()
>
>  - one late in the series switching cat-file over to is_atom_used() as
>part of the conversion to ref-filter
>
>> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
>> index 6db57e3533806..3a49b55a1cc2e 100644
>> --- a/builtin/cat-file.c
>> +++ b/builtin/cat-file.c
>> @@ -382,8 +382,7 @@ static int batch_objects(struct batch_options *opt)
>>  {
>>   struct strbuf buf = STRBUF_INIT;
>>   struct expand_data data;
>> - int save_warning;
>> - int retval = 0;
>> + int save_warning, is_rest, retval = 0;
>
> Try to avoid reformatting existing code that you're not otherwise
> touching, as it makes the diff noisier. Just adding "int is_rest" would
> make this easier to review.
>
> I also think the variable name should probably still be
> "split_on_whitespace". It's set based on whether we saw a "%(rest)"
> atom, but ultimately we'll use it to decide whether to split.

OK, I will fix that.

>
> -Peff

Re: [PATCH v3 13/23] ref-filter: get rid of mark_atom_in_object_info()

2018-02-15 Thread Оля Тележная

2018-02-15 8:45 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Remove mark_atom_in_object_info() and create same logic
>> in terms of ref-filter style.
>
> This one is definitely a step in the right direction. In fact, this is
> what I would have expected to see in the beginning. It seems like this
> first half of the series really would benefit from being squashed into a
> few commits. I.e., I'd have expected to see the whole series looking
> something like:
>
>   - preparatory cleanup of ref-filter
>
>   - teach ref-filter any new atoms (or atom options) that cat-file knows
> about but it doesn't
>
>   - convert cat-file to use ref-filter
>
> Most of what I've seen so far is basically that second step, but strung
> out along a bunch of commits. Can we compact it into a few commits that
> all make clear forward progress (by using "rebase -i" with "squash")?

I am afraid that I don't really see any 100% improvements by squashing
some of the commits. I tried to do that several times and I squashed
some of them, but now for me it looks like nothing else could be
squashed so that commits still look close to making atomic changes (I
am sure that it's super important).

>
> -Peff

Re: [PATCH v3 09/23] cat-file: start use ref_array_item struct

2018-02-15 Thread Оля Тележная

2018-02-15 8:40 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Moving from using expand_data to ref_array_item structure.
>> That helps us to reuse functions from ref-filter easier.
>
> This one feels weird. The point of a ref_array_item is for the caller to
> feed data into the ref-filter formatting code (usually that data comes
> from an earlier call to filter_refs(), but in the new cat-file we'd
> presumably feed single items).
>
> But here we're adding a bunch of fields for items that we'd expect the
> format code to compute itself.

It would be changed later, it's just the addition of new structure
that we have never used in cat-file before.

>
> -Peff

Re: [PATCH v3 08/23] ref-filter: reuse parse_ref_filter_atom()

2018-02-15 Thread Оля Тележная

2018-02-15 8:37 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Continue migrating formatting logic from cat-file to ref-filter.
>> Reuse parse_ref_filter_atom() for unifying all processes in ref-filter
>> and further removing of mark_atom_in_object_info().
>
> OK, now it looks we're moving in a good direction.
>
> One thing that puzzles me:
>
>> @@ -401,20 +420,14 @@ static int is_atom(const char *atom, const char *s, 
>> int slen)
>>  static void mark_atom_in_object_info(const char *atom, int len,
>>   struct expand_data *data)
>>  {
>> - if (is_atom("objectname", atom, len))
>> - ; /* do nothing */
>> - else if (is_atom("objecttype", atom, len))
>> + if (is_atom("objecttype", atom, len))
>>   data->info.typep = >type;
>>   else if (is_atom("objectsize", atom, len))
>>   data->info.sizep = >size;
>> - else if (is_atom("objectsize:disk", atom, len))
>> - data->info.disk_sizep = >disk_size;
>>   else if (is_atom("rest", atom, len))
>>   data->split_on_whitespace = 1;
>>   else if (is_atom("deltabase", atom, len))
>>   data->info.delta_base_sha1 = data->delta_base_oid.hash;
>> - else
>> - die("unknown format element: %.*s", len, atom);
>>  }
>
> Why do some of these atoms go away and not others?

I deleted "objectname" because we were doing nothing there;
"objectsize:disk" because we have its own parser function;
"die" because ref-filter has its own checker whether the atom is valid or not.
I left all others because I haven't supported them at that point. This
whole function will be removed later.

> It seems like we're
> now relying on ref-filter to parse some of the common ones using its
> existing atom-parser. But wouldn't it have objecttype and objectsize
> already, then?

We haven't migrated enough to ref-filter at this point and we can't
reuse general ref-filter logic about filling the fields. So, we still
need to have our own function for doing that. Anyway, as I said
earlier, we will reach that status in the end of the patch: this
function would be deleted and we will use general ref-filter logic.

>
> -Peff

[no subject]

2018-02-15 Thread Vanesa Ali

hola Mi nombre es Vanessa Ali. a France Nationality, soy viuda,
actualmente hospitalizada debido a una enfermedad de cáncer. Mientras
tanto, he decidido donar mi fondo para usted como una persona
confiable que usará este dinero sabiamente, € 2,800,000 Millones de
Euros. para ayudar a los pobres y menos privilegiados. Entonces, si
está dispuesto a aceptar esta oferta y hacer exactamente lo que le
instruiré, vuelva a consultarme para obtener más detalles. Sra.
Vanessa Ali

Re: git-rebase --undo-skip proposal

2018-02-15 Thread Paul Tan

Hi Gabriel,

On Wed, Feb 14, 2018 at 4:42 AM, Stefan Beller  wrote:
> On Tue, Feb 13, 2018 at 12:22 PM, Psidium Guajava  wrote:
>>
>> Also, a little unrelated with this issue:
>> 5. What happened to the rewrite of rebase in C [2]? I couldn't find
>> any information after 2016.
>>
>> [1] https://public-inbox.org/git/201311011522.44631.tho...@koch.ro/
>> [2] 
>> https://public-inbox.org/git/1457779597-6918-1-git-send-email-pyoka...@gmail.com/
>
> cc'd Paul Tan, maybe he recalls the situation.

It was discarded in favor of Johannes' rebase-helper approach, and I
think parts of it are already in master. There's probably room for
help there.

I haven't had time to keep track of Git development, hence my
inactivity. Sorry about that.

Regards,
Paul

Re: [PATCH v3 04/23] ref-filter: make valid_atom as function parameter

2018-02-15 Thread Оля Тележная

2018-02-15 8:23 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Make valid_atom as a function parameter,
>> there could be another variable further.
>> Need that for further reusing of formatting logic in cat-file.c.
>>
>> We do not need to allow users to pass their own valid_atom variable in
>> global functions like verify_ref_format() because in the end we want to
>> have same set of valid atoms for all commands. But, as a first step
>> of migrating, I create further another version of valid_atom
>> for cat-file.
>
> Hmm. So I see where you're going here, but I think in the end we'd want
> to have a single valid_atom list again, and we wouldn't need this.
>
> And indeed, it looks like this goes away in patch 17. Can we
> reorganize/rebase the series so that we avoid dead-end directions like
> this?

I tried to do that, but in my opinion it's easier to understand
everything in current version. I need to squash too many items so that
commits do not look atomic, and it's really hard to understand what is
going on. Now we have that helper commit that will be cancelled later,
and other logic is more clear for readers.

>
> -Peff

Re: [PATCH v3 02/23] ref-filter: add return value to some functions

2018-02-15 Thread Оля Тележная

2018-02-15 8:16 GMT+03:00 Jeff King :
> On Mon, Feb 12, 2018 at 08:08:54AM +, Olga Telezhnaya wrote:
>
>> Add return flag to format_ref_array_item(), show_ref_array_item(),
>> get_ref_array_info() and populate_value() for further using.
>> Need it to handle situations when item is broken but we can not invoke
>> die() because we are in batch mode and all items need to be processed.
>
> OK. The source of these errors would eventually be calls in
> populate_value(), but we don't flag any errors there yet (well, we do,
> but they all end up in die() for now). So I'd expect to see later in the
> series those die() calls converted to errors (I haven't looked further
> yet; just making a note to myself).
>
>> --- a/ref-filter.c
>> +++ b/ref-filter.c
>> @@ -1356,8 +1356,9 @@ static const char *get_refname(struct used_atom *atom, 
>> struct ref_array_item *re
>>
>>  /*
>>   * Parse the object referred by ref, and grab needed value.
>> + * Return 0 if everything was successful, -1 otherwise.
>>   */
>
> We discussed off-list the concept that the caller may want to know one
> of three outcomes:
>
>   - we completed the request, having accessed the object
>   - we completed the request, but it didn't require accessing any
> objects
>   - an error occurred accessing the object
>
> Since callers like "cat-file" would need to check has_sha1_file()
> manually in the second case. Should this return value actually be an
> enum, which would make it easier to convert later to a tri-state?

I decided not to implement this particular scenario because all other
callers are waiting that everything will be printed inside ref-filter.
We just add support for cat-file there. I don't think that I need to
re-think all printing process and move printing logic to all other
callers so that cat-file will behave fine. In my opinion, in the final
version cat-file must accept all ref-filter logic parts and adapt to
them.

>
>> -static void populate_value(struct ref_array_item *ref)
>> +static int populate_value(struct ref_array_item *ref)
>>  {
>>   void *buf;
>>   struct object *obj;
>> @@ -1482,7 +1483,7 @@ static void populate_value(struct ref_array_item *ref)
>>   }
>>   }
>>   if (used_atom_cnt <= i)
>> - return;
>> + return 0;
>
> Most of these conversions are obviously correct, because they just turn
> a void return into one with a value. But this one is trickier:
>
>> @@ -2138,9 +2144,10 @@ void format_ref_array_item(struct ref_array_item 
>> *info,
>>   ep = strchr(sp, ')');
>>   if (cp < sp)
>>   append_literal(cp, sp, );
>> - get_ref_atom_value(info,
>> -parse_ref_filter_atom(format, sp + 2, ep),
>> -);
>> + if (get_ref_atom_value(info,
>> +parse_ref_filter_atom(format, sp + 2, 
>> ep),
>> +))
>> + return -1;
>>   atomv->handler(atomv, );
>>   }
>
> since it affects the control flow. Might we be skipping any necessary
> cleanup in the function if we see an error?
>
> It looks like we may have called push_stack_element(), but we'd never
> get to the end of the function where we call pop_stack_element(),
> causing us to leak.

Agree,  I will fix this.

>
> -Peff

Re: [PATCH 5/8] perl: update our copy of Mail::Address

2018-02-15 Thread Matthieu Moy

"Jonathan Nieder"  wrote:

> Ævar Arnfjörð Bjarmason wrote:
>
> > Update our copy of Mail::Address from 2.19 (Aug 22, 2017) to 2.20 (Jan
> > 23, 2018). This should be a trivial update[1] but it seems the version
> > Matthieu Moy imported in bd869f67b9 ("send-email: add and use a local
> > copy of Mail::Address", 2018-01-05) doesn't correspond to any 2.19
> > version found on the CPAN. From the comment at the top of the file it
> > looks like some OS version with the POD stripped, and with different
> > indentation.
>
> Were there changes other than the POD stripping?

No.

I should have mentionned it in the commit message, but the one I took was
from:

  http://cpansearch.perl.org/src/MARKOV/MailTools-2.19/lib/Mail/Address.pm

i.e. following the "source" link from:

  http://search.cpan.org/~markov/MailTools-2.19/lib/Mail/Address.pod

The link name suggested it was the actual source code but indeed it's a
pre-processed file with the POD stripped.

It would make sense to indicate explicitly where this file is from in
this commit's message to avoid having the same discussion next time someone
upgrades the package.

-- 
Matthieu Moy
https://matthieu-moy.fr/

Re: [PATCH] Makefile: generate Git(3pm) as dependency of the 'doc' and 'man' targets

2018-02-15 Thread Ævar Arnfjörð Bjarmason


On Thu, Feb 15 2018, SZEDER Gábor jotted:

> Since commit 20d2a30f8f (Makefile: replace perl/Makefile.PL with
> simple make rules, 2017-12-10), the Git(3pm) man page is only
> generated as an indirect dependency of the 'install-doc' and
> 'install-man' Makefile targets.  Consequently, if someone runs 'make
> man && sudo make install-man' (or their 'doc' counterparts), then
> Git(3pm) will be generated as root, and the resulting root-owned files
> and directories will in turn cause the next user-run 'make clean' to
> fail.  This was not an issue in the past, because Git(3pm) was
> generated when 'make all' descended into 'perl/', which is usually not
> run as root.
>
> List Git(3pm) as a dependency of the 'doc' and 'man' Makefile targets,
> too, so it gets generated by targets that are usually built as
> ordinary users.
>
> While at it, add 'install-man-perl' to the list of .PHONY targets.

Thanks for the fixup of my crappy 'make' skills. I tested this before
the patch and it indeed has the problem you describe, and this fixes
it. Thanks! CC-ing Junio because I think it makes sense to pick this up
as-is.

Reviewed-by: Ævar Arnfjörð Bjarmason

80 matches

Mail list logo