Re: [PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Junio C Hamano
Jacob Keller  writes:

> +void show_submodule_diff(FILE *f, const char *path,
> + const char *line_prefix,
> + unsigned char one[20], unsigned char two[20],
> + unsigned dirty_submodule, const char *meta,
> + const char *reset)
> +{
> + struct child_process cp = CHILD_PROCESS_INIT;
> + struct strbuf sb = STRBUF_INIT;
> + struct strbuf submodule_git_dir = STRBUF_INIT;
> + const char *git_dir, *message = NULL;
> + int create_dirty_diff = 0;
> + FILE *diff;
> + char c;
> +
> + if ((dirty_submodule & DIRTY_SUBMODULE_UNTRACKED) ||
> + (dirty_submodule & DIRTY_SUBMODULE_MODIFIED))
> + create_dirty_diff = 1;
> +
> + strbuf_addf(&sb, "%s%sSubmodule %s %s..", line_prefix, meta, path,
> + find_unique_abbrev(one, DEFAULT_ABBREV));
> + strbuf_addf(&sb, "%s:%s\n", !create_dirty_diff ?
> + find_unique_abbrev(two, DEFAULT_ABBREV) : "", reset);
> + fwrite(sb.buf, sb.len, 1, f);
> +
> + strbuf_addf(&submodule_git_dir, "%s/.git", path);
> + git_dir = read_gitfile(submodule_git_dir.buf);
> + if (!git_dir)
> + git_dir = submodule_git_dir.buf;
> + if (!is_directory(git_dir)) {
> + strbuf_reset(&submodule_git_dir);
> + strbuf_addf(&submodule_git_dir, ".git/modules/%s", path);
> + git_dir = submodule_git_dir.buf;
> + }
> +
> + cp.git_cmd = 1;
> + if (!create_dirty_diff)
> + cp.dir = git_dir;
> + else
> + cp.dir = path;
> + cp.out = -1;
> + cp.no_stdin = 1;
> + argv_array_push(&cp.args, "diff");
> + argv_array_pushf(&cp.args, "--src-prefix=a/%s/", path);
> + argv_array_pushf(&cp.args, "--dst-prefix=b/%s/", path);

I think this is wrong.  Imagine when the caller gave you prefixes
that are different from a/ and b/ (think: the extreme case is that
you are a sub-sub-module, and the caller is recursing into you with
its own prefix, perhaps set to a/sub and b/sub).  Use the prefixes
you got for a/ and b/ instead of hardcoding them and you'd be fine,
I'd guess.

> + argv_array_push(&cp.args, sha1_to_hex(one));
> +
> + /*
> +  * If the submodule has untracked or modified contents, diff between
> +  * the old base and the current tip. This had the advantage of showing
> +  * the full difference of the submodule contents.
> +  */
> + if (!create_dirty_diff)
> + argv_array_push(&cp.args, sha1_to_hex(two));
> +
> + if (start_command(&cp))
> + die("Could not run 'git diff' command in submodule %s", path);
> +
> + diff = fdopen(cp.out, "r");
> +
> + c = fgetc(diff);
> + while (c != EOF) {
> + fputc(c, f);
> + c = fgetc(diff);
> + }
> +
> + fclose(diff);
> + finish_command(&cp);

I do not think you need to do this.  If "f" is already a FILE * to
which the output must go, then instead of reading from the pipe and
writing it, you can just let the "diff" spit out its output to the
same file descriptor, by doing something like:

fflush(f);
cp.out = dup(fileno(f));
... other setup ...
run_command(&cp);

no?  I do not offhand recall run_command() closes cp.out after done,
and if it doesn't you may have to close it yourself after the above
sequence.

While I personally do not want to see code to access submodule's
object store by temporarily adding it as alternates, the "show
left-right log between the commits" code already does so, so I
should point out that running "diff" internally without spawning it
as a subprocess shouldn't be too hard.  The diff API is reasonably
libified and there shouldn't be additional "libification" preparation
work required to do this, if you really wanted to.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Eric Wong
Josh Triplett  wrote:
> On Tue, Aug 09, 2016 at 06:28:00PM +, Eric Wong wrote:
> > Some of these problems I hope public-inbox (or something like
> > it) can fix and turn the tide towards email, again.
> 
> This really seems like the dichotomy that drives people towards central
> services like GitHub or GitLab.  We need an alternative that doesn't
> involve email, or at the very least, doesn't require people to use email
> directly.  Half of the pain in the process comes from coaxing email
> clients that don't treat mail text as sacrosanct to leave it alone and
> not mangle it.  (Some of that would go away if we accepted attachments
> with inline disposition, but not all of it.  All of it would go away if
> the submission process just involved "git push" to an appropriate
> location.)

I don't mind patches as attachments and did some work a few
months ago to ensure they're individually downloadable in the
public-inbox WWW interface (along with full mboxrd messages)[1].

Fwiw, attachments are preferred in perl5-porters, and it might
be acceptable on LKML, even.  Not my call, though.

Having a push/pull-only workflow would still require some sort
of messaging system to notify others.  Ideally that message
would have the output of "git request-pull" to ensure people are
on the same page; but I'd prefer patches (either attachments or
inline) continue to be sent anyways in case the server is down
or the reader is offline or on a machine without git.

[1] see Brian's (who is new, here) initial email for diff-highlight:
https://public-inbox.org/git/20160728162712.ga29...@tci.corp.yp.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Josh Triplett
On Tue, Aug 09, 2016 at 06:28:00PM +, Eric Wong wrote:
> Some of these problems I hope public-inbox (or something like
> it) can fix and turn the tide towards email, again.

This really seems like the dichotomy that drives people towards central
services like GitHub or GitLab.  We need an alternative that doesn't
involve email, or at the very least, doesn't require people to use email
directly.  Half of the pain in the process comes from coaxing email
clients that don't treat mail text as sacrosanct to leave it alone and
not mangle it.  (Some of that would go away if we accepted attachments
with inline disposition, but not all of it.  All of it would go away if
the submission process just involved "git push" to an appropriate
location.)

- Josh Triplett
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 5:23 PM, Jacob Keller  wrote:
>
> +static int prepare_submodule_diff(struct strbuf *buf, const char *path,
> +   unsigned char one[20], unsigned char two[20])
> +{

This is not used any more, but the child is run directly below?

> +   strbuf_addf(&submodule_git_dir, "%s/.git", path);
> +   git_dir = read_gitfile(submodule_git_dir.buf);
> +   if (!git_dir)
> +   git_dir = submodule_git_dir.buf;
> +   if (!is_directory(git_dir)) {
> +   strbuf_reset(&submodule_git_dir);
> +   strbuf_addf(&submodule_git_dir, ".git/modules/%s", path);
> +   git_dir = submodule_git_dir.buf;
> +   }

This pattern seems familar, do we have a function for that?
(get a submodules git dir? As I was mainly working on the helper
I do not know off hand)

> +   if (start_command(&cp))
> +   die("Could not run 'git diff' command in submodule %s", path);
> +
> +   diff = fdopen(cp.out, "r");

diff is a FILE* pointer. cp.out is a standard int file descriptor

Maybe you'd want a similar thing as strbuf_getwholeline_fd()
just instead of adding it to the strbuf, adding it to `f` ?

(Which is what you have here? I just wonder about the buffer size,
as I think reading 1 by 1 is not beneficial)

> +
> +   c = fgetc(diff);
> +   while (c != EOF) {
> +   fputc(c, f);
> +   c = fgetc(diff);
> +   }
> +
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Josh Triplett
On Tue, Aug 09, 2016 at 06:57:05AM -0400, Jeff King wrote:
> On Tue, Aug 09, 2016 at 10:11:30AM +0200, Michael J Gruber wrote:
> 
> > Maybe two more points of input for the discussion:
> > 
> > off-line capabilities
> > =
> > 
> > The current workflow here seems to work best when you are subscribed to
> > the git-ml and have your own (local, maybe selective) copy of git-ml in
> > your (text-based) MUA (mutt, (al)pine, emacs, ...) that can jump right
> > into git-am and such directly. I'm not sure how important the "off-line"
> > aspect of that is for some of us, and how that could be replicated on
> > GitHub - while PRs and such may be Git based behind the scenes there
> > seems to be no way to clone that info and work from a local clone.
> > (Don't know if GitLab is more open.)
> 
> You can pull it all out via GitHub's HTTP API, but the question is what
> format you would use to store it locally (and which tools you would then
> use to play with it).
> 
> I haven't really tried this lately, though, so I don't know if there is
> information that the API would be missing.
> 
> I do have a dream of writing a tool that sucks in GitHub PRs to a fake
> email thread, lets me make my responses inline in an editor, and then
> pushes it back up as PR comments (finding the right positions based on
> the quoted context).

You might try https://github.com/joeyh/github-backup
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC v2] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Jacob Keller
From: Jacob Keller 

For projects which have frequent updates to submodules it is often
useful to be able to see a submodule update commit as a difference.
Teach diff's --submodule= a new "diff" format which will execute a diff
for the submodule between the old and new commit, and display it as
a standard diff. This allows users to easily see what changed in
a submodule without having to dig into the submodule themselves.

Signed-off-by: Jacob Keller 
---
- v2
* Use fgetc and fputc... I tried to use xread but somehow screwed it up
  and this worked fine.

Use src-prefix and dst-prefix.

Use the dirty_submodule flags to indicate whether we should check a diff
between two sources or not. This way we check "current checkout" if
there is any dirty changes so the user sees the entire diff.

 Documentation/diff-config.txt  |  3 +-
 Documentation/diff-options.txt |  6 ++-
 diff.c | 21 --
 diff.h |  2 +-
 submodule.c| 95 ++
 submodule.h|  5 +++
 6 files changed, 125 insertions(+), 7 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index d5a5b17d5088..f5d693afad6c 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -123,7 +123,8 @@ diff.suppressBlankEmpty::
 diff.submodule::
Specify the format in which differences in submodules are
shown.  The "log" format lists the commits in the range like
-   linkgit:git-submodule[1] `summary` does.  The "short" format
+   linkgit:git-submodule[1] `summary` does.  The "diff" format shows the
+   diff between the beginning and end of the range. The "short" format
format just shows the names of the commits at the beginning
and end of the range.  Defaults to short.
 
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 705a87394200..b17348407805 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -215,8 +215,10 @@ any of those replacements occurred.
the commits in the range like linkgit:git-submodule[1] `summary` does.
Omitting the `--submodule` option or specifying `--submodule=short`,
uses the 'short' format. This format just shows the names of the commits
-   at the beginning and end of the range.  Can be tweaked via the
-   `diff.submodule` configuration variable.
+   at the beginning and end of the range. When `--submodule=diff` is
+   given, the 'diff' format is used. This format shows the diff between
+   the old and new submodule commmit from the perspective of the
+   submodule.  Can be tweaked via the `diff.submodule` configuration 
variable.
 
 --color[=]::
Show colored diff.
diff --git a/diff.c b/diff.c
index b43d3dd2ecb7..8d620c5d2250 100644
--- a/diff.c
+++ b/diff.c
@@ -130,12 +130,18 @@ static int parse_dirstat_params(struct diff_options 
*options, const char *params
 
 static int parse_submodule_params(struct diff_options *options, const char 
*value)
 {
-   if (!strcmp(value, "log"))
+   if (!strcmp(value, "log")) {
DIFF_OPT_SET(options, SUBMODULE_LOG);
-   else if (!strcmp(value, "short"))
+   DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+   } else if (!strcmp(value, "diff")) {
+   DIFF_OPT_SET(options, SUBMODULE_DIFF);
DIFF_OPT_CLR(options, SUBMODULE_LOG);
-   else
+   } else if (!strcmp(value, "short")) {
+   DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+   DIFF_OPT_CLR(options, SUBMODULE_LOG);
+   } else {
return -1;
+   }
return 0;
 }
 
@@ -2310,6 +2316,15 @@ static void builtin_diff(const char *name_a,
two->dirty_submodule,
meta, del, add, reset);
return;
+   } else if (DIFF_OPT_TST(o, SUBMODULE_DIFF) &&
+   (!one->mode || S_ISGITLINK(one->mode)) &&
+   (!two->mode || S_ISGITLINK(two->mode))) {
+   show_submodule_diff(o->file, one->path ? one->path : two->path,
+   line_prefix,
+   one->oid.hash, two->oid.hash,
+   two->dirty_submodule,
+   meta, reset);
+   return;
}
 
if (DIFF_OPT_TST(o, ALLOW_TEXTCONV)) {
diff --git a/diff.h b/diff.h
index 125447be09eb..a80f3e5bafe9 100644
--- a/diff.h
+++ b/diff.h
@@ -69,7 +69,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct 
diff_options *opt, void *data)
 #define DIFF_OPT_FIND_COPIES_HARDER  (1 <<  6)
 #define DIFF_OPT_FOLLOW_RENAMES  (1 <<  7)
 #define DIFF_OPT_RENAME_EMPTY(1 <<  8)
-/* (1 <<  9) unused */
+#define DIFF_OPT_SUBMODULE_DIFF  (1 <<  9)
 #define DIFF_OPT_HAS_CHANGES (1 << 10)
 #define DIFF_OPT_QUICK   (1 << 11)

[PATCH] http-backend: buffer headers before sending

2016-08-09 Thread Eric Wong
Avoid waking up the readers for unnecessary context switches for
each line of header data being written, as all the headers are
written in short succession.

It is unlikely any HTTP/1.x server would want to read a CGI
response one-line-at-a-time and trickle each to the client.
Instead, I'd expect HTTP servers want to minimize syscall and
TCP/IP framing overhead by trying to send all of its response
headers in a single syscall or even combining the headers and
first chunk of the body with MSG_MORE or writev.

Verified by strace-ing response parsing on the CGI side.

Signed-off-by: Eric Wong 
---
  I admit I only noticed this because I was being lazy when
  implementing the reader-side on an HTTP server by making
  a single read(2) call :x

 http-backend.c | 220 ++---
 1 file changed, 116 insertions(+), 104 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index 0d59499..adc8c8c 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -75,55 +75,57 @@ static void format_write(int fd, const char *fmt, ...)
write_or_die(fd, buffer, n);
 }
 
-static void http_status(unsigned code, const char *msg)
+static void http_status(struct strbuf *hdr, unsigned code, const char *msg)
 {
-   format_write(1, "Status: %u %s\r\n", code, msg);
+   strbuf_addf(hdr, "Status: %u %s\r\n", code, msg);
 }
 
-static void hdr_str(const char *name, const char *value)
+static void hdr_str(struct strbuf *hdr, const char *name, const char *value)
 {
-   format_write(1, "%s: %s\r\n", name, value);
+   strbuf_addf(hdr, "%s: %s\r\n", name, value);
 }
 
-static void hdr_int(const char *name, uintmax_t value)
+static void hdr_int(struct strbuf *hdr, const char *name, uintmax_t value)
 {
-   format_write(1, "%s: %" PRIuMAX "\r\n", name, value);
+   strbuf_addf(hdr, "%s: %" PRIuMAX "\r\n", name, value);
 }
 
-static void hdr_date(const char *name, unsigned long when)
+static void hdr_date(struct strbuf *hdr, const char *name, unsigned long when)
 {
const char *value = show_date(when, 0, DATE_MODE(RFC2822));
-   hdr_str(name, value);
+   hdr_str(hdr, name, value);
 }
 
-static void hdr_nocache(void)
+static void hdr_nocache(struct strbuf *hdr)
 {
-   hdr_str("Expires", "Fri, 01 Jan 1980 00:00:00 GMT");
-   hdr_str("Pragma", "no-cache");
-   hdr_str("Cache-Control", "no-cache, max-age=0, must-revalidate");
+   hdr_str(hdr, "Expires", "Fri, 01 Jan 1980 00:00:00 GMT");
+   hdr_str(hdr, "Pragma", "no-cache");
+   hdr_str(hdr, "Cache-Control", "no-cache, max-age=0, must-revalidate");
 }
 
-static void hdr_cache_forever(void)
+static void hdr_cache_forever(struct strbuf *hdr)
 {
unsigned long now = time(NULL);
-   hdr_date("Date", now);
-   hdr_date("Expires", now + 31536000);
-   hdr_str("Cache-Control", "public, max-age=31536000");
+   hdr_date(hdr, "Date", now);
+   hdr_date(hdr, "Expires", now + 31536000);
+   hdr_str(hdr, "Cache-Control", "public, max-age=31536000");
 }
 
-static void end_headers(void)
+static void end_headers(struct strbuf *hdr)
 {
-   write_or_die(1, "\r\n", 2);
+   strbuf_add(hdr, "\r\n", 2);
+   write_or_die(1, hdr->buf, hdr->len);
+   strbuf_release(hdr);
 }
 
-__attribute__((format (printf, 1, 2)))
-static NORETURN void not_found(const char *err, ...)
+__attribute__((format (printf, 2, 3)))
+static NORETURN void not_found(struct strbuf *hdr, const char *err, ...)
 {
va_list params;
 
-   http_status(404, "Not Found");
-   hdr_nocache();
-   end_headers();
+   http_status(hdr, 404, "Not Found");
+   hdr_nocache(hdr);
+   end_headers(hdr);
 
va_start(params, err);
if (err && *err)
@@ -132,14 +134,14 @@ static NORETURN void not_found(const char *err, ...)
exit(0);
 }
 
-__attribute__((format (printf, 1, 2)))
-static NORETURN void forbidden(const char *err, ...)
+__attribute__((format (printf, 2, 3)))
+static NORETURN void forbidden(struct strbuf *hdr, const char *err, ...)
 {
va_list params;
 
-   http_status(403, "Forbidden");
-   hdr_nocache();
-   end_headers();
+   http_status(hdr, 403, "Forbidden");
+   hdr_nocache(hdr);
+   end_headers(hdr);
 
va_start(params, err);
if (err && *err)
@@ -148,21 +150,23 @@ static NORETURN void forbidden(const char *err, ...)
exit(0);
 }
 
-static void select_getanyfile(void)
+static void select_getanyfile(struct strbuf *hdr)
 {
if (!getanyfile)
-   forbidden("Unsupported service: getanyfile");
+   forbidden(hdr, "Unsupported service: getanyfile");
 }
 
-static void send_strbuf(const char *type, struct strbuf *buf)
+static void send_strbuf(struct strbuf *hdr,
+   const char *type, struct strbuf *buf)
 {
-   hdr_int(content_length, buf->len);
-   hdr_str(content_type, type);
-   end_headers();
+   hdr_int(hdr, content_length, buf->len);
+ 

Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 4:10 PM, Stefan Beller  wrote:

> xread does use a poll() for you so it is not active polling,
> but only reading when data is available.

s/active polling/ spinning/

>
>
>>>
>>> When not checked out, we can invoke the diff command
>>> in .git/modules/ as that is the git dir of the submodule,
>>> i.e. operating diff with a bare repo?
>>
>> We can actually do this every time. How would you pass that in a
>> child_process? I don't think it's "dir" but instead passing
>> "--git-dir" somehow?

I think it doesn't matter. You can still use .dir.
(That would be equivalent to `cd $GIT_DIR && git diff sha1..sha1`
which works just as well, even in the submodule case)
Alternatively you could do
argv_array_pushf(cp.env_array, "GIT_DIR=%s", ...)

So I would drop the -C here and just use the .dir attribute.

> git -C $GIT_DIR diff --relative ${superprojects ce->name}
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 3:56 PM, Jacob Keller  wrote:

>>> +
>>> +   if (strbuf_read(buf, cp.out, 0) < 0)
>>
>> So we keep the whole diff in memory
>> I don't know much about the diff machinery, but I thought
>> the rest of the diff machinery just streams it out?
>
> Yea, but I can't figure out how to do that. Is there an easy way to
> stream chunks from the pipe straight into the file?

Maybe, roughly:

cp.stdout = -1;
start_command(&cp);
do {
int r = xread(cp.stdout, buf, MAX_IO_SIZE);
} while (r >=0);
finish_command(&cp);

xread does use a poll() for you so it is not active polling,
but only reading when data is available.


>>
>> When not checked out, we can invoke the diff command
>> in .git/modules/ as that is the git dir of the submodule,
>> i.e. operating diff with a bare repo?
>
> We can actually do this every time. How would you pass that in a
> child_process? I don't think it's "dir" but instead passing
> "--git-dir" somehow?

git -C $GIT_DIR diff --relative ${superprojects ce->name}

with $GIT_DIR = ${SUPERPROJECTS GITDIR + modules// with name to be looked
as submodule_from_path(ce->name) and then taking `name` from the
submodule struct)


Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 1/9] t7408: modernize style

2016-08-09 Thread Eric Sunshine
On Tue, Aug 9, 2016 at 1:39 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>> I originally thought that it may be easier to have 2 patches.
>> This first one is very gentle and "obviously correct" as it only changes
>> formatting and drops the directory changes.
>>
>> The second patch goes for renaming ans subtle style issues,
>> combining some tests, so it is more likely to break.
>>
>> After this review, I consider using just one patch and do it all
>> at once to not confuse the readers as otherwise I should reword
>> the commit message with the rationale of doing it in two patches.
>
> FWIW, I would think your split between 1/ and 2/ were very sensible,
> and have a slight preference for keeping them separate.

The review comment about renaming "current" to "actual" was made
without having looked yet at patch 2. Having now seen patch 2, I agree
with Junio that the existing split is preferable.

So, only the review comment about dropping space after '>' remains relevant.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Jacob Keller
On Tue, Aug 9, 2016 at 3:50 PM, Stefan Beller  wrote:
> On Tue, Aug 9, 2016 at 3:32 PM, Jacob Keller  wrote:
>> From: Jacob Keller 
>>
>> For projects which have frequent updates to submodules it is often
>> useful to be able to see a submodule update commit as a difference.
>> Teach diff's --submodule= a new "diff" format which will execute a diff
>> for the submodule between the old and new commit, and display it as
>> a standard diff. This allows users to easily see what changed in
>> a submodule without having to dig into the submodule themselves.
>>
>> Signed-off-by: Jacob Keller 
>> ---
>> There are no tests yet. Stefan suggested the use of child_process,
>
> Well you said "I just want this one command but don't know where to put it "
> so it's natural to suggest using child_process.  ;)

Right this is a straight forward way..

>
>> but I
>> really believe there has to be some way of getting the diff without
>> having to run a sub process (as this means we also can't do the diff
>> without a checked out submodule). It doesn't properly handle options
>> either, so a better solution would be much appreciated.
>
> Oh right, we would need to codify all options again (and all future options,
> too)


That's why I'd prefer if we had a way to do this via builtins, because
it would make option passing easier.

>
>>
>> I also would prefer if the diff actually prefixed the files with
>> "path/to/submodule" so that it was obvious where the file was located in
>> the directory.
>>
>> Suggestions welcome.
>>
>> +
>> +   if (start_command(&cp))
>
> I wonder if we want to stay single-execution here,
> i.e. if we rather want to use run_processes_parallel
> for all the submodules at the same time?
>
> Though then non deterministic ordering may be an issue.

There is no easy way to do this, we're given it one module at a time
and it would be a huge re-write to make it go in parallel. I think
that's the wrong way to think about this.

>
>> +   return -1;
>> +
>> +   if (strbuf_read(buf, cp.out, 0) < 0)
>
> So we keep the whole diff in memory
> I don't know much about the diff machinery, but I thought
> the rest of the diff machinery just streams it out?

Yea, but I can't figure out how to do that. Is there an easy way to
stream chunks from the pipe straight into the file?

>
>> +
>> +void show_submodule_diff(FILE *f, const char *path,
>> +   const char *line_prefix,
>> +   unsigned char one[20], unsigned char two[20],
>> +   unsigned dirty_submodule, const char *meta,
>> +   const char *reset)
>> +{
>> +   struct strbuf buf = STRBUF_INIT;
>> +   struct strbuf sb = STRBUF_INIT;
>> +   const char *message = NULL;
>> +
>> +   if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
>> +   fprintf(f, "%sSubmodule %s contains untracked content\n",
>> +   line_prefix, path);
>> +   if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
>> +   fprintf(f, "%sSubmodule %s contains modified content\n",
>> +   line_prefix, path);
>> +
>> +   if (!hashcmp(one, two)) {
>> +   strbuf_release(&sb);
>> +   return;
>> +   }
>> +
>> +   if (add_submodule_odb(path))
>> +   message = "(not checked out)";
>
> When not checked out, we can invoke the diff command
> in .git/modules/ as that is the git dir of the submodule,
> i.e. operating diff with a bare repo?

We can actually do this every time. How would you pass that in a
child_process? I don't think it's "dir" but instead passing
"--git-dir" somehow?

Thanks,
Jake

>
> Thanks,
> Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 3:32 PM, Jacob Keller  wrote:
> From: Jacob Keller 
>
> For projects which have frequent updates to submodules it is often
> useful to be able to see a submodule update commit as a difference.
> Teach diff's --submodule= a new "diff" format which will execute a diff
> for the submodule between the old and new commit, and display it as
> a standard diff. This allows users to easily see what changed in
> a submodule without having to dig into the submodule themselves.
>
> Signed-off-by: Jacob Keller 
> ---
> There are no tests yet. Stefan suggested the use of child_process,

Well you said "I just want this one command but don't know where to put it "
so it's natural to suggest using child_process.  ;)

> but I
> really believe there has to be some way of getting the diff without
> having to run a sub process (as this means we also can't do the diff
> without a checked out submodule). It doesn't properly handle options
> either, so a better solution would be much appreciated.

Oh right, we would need to codify all options again (and all future options,
too)

>
> I also would prefer if the diff actually prefixed the files with
> "path/to/submodule" so that it was obvious where the file was located in
> the directory.
>
> Suggestions welcome.
>
> +
> +   if (start_command(&cp))

I wonder if we want to stay single-execution here,
i.e. if we rather want to use run_processes_parallel
for all the submodules at the same time?

Though then non deterministic ordering may be an issue.

> +   return -1;
> +
> +   if (strbuf_read(buf, cp.out, 0) < 0)

So we keep the whole diff in memory
I don't know much about the diff machinery, but I thought
the rest of the diff machinery just streams it out?

> +
> +void show_submodule_diff(FILE *f, const char *path,
> +   const char *line_prefix,
> +   unsigned char one[20], unsigned char two[20],
> +   unsigned dirty_submodule, const char *meta,
> +   const char *reset)
> +{
> +   struct strbuf buf = STRBUF_INIT;
> +   struct strbuf sb = STRBUF_INIT;
> +   const char *message = NULL;
> +
> +   if (dirty_submodule & DIRTY_SUBMODULE_UNTRACKED)
> +   fprintf(f, "%sSubmodule %s contains untracked content\n",
> +   line_prefix, path);
> +   if (dirty_submodule & DIRTY_SUBMODULE_MODIFIED)
> +   fprintf(f, "%sSubmodule %s contains modified content\n",
> +   line_prefix, path);
> +
> +   if (!hashcmp(one, two)) {
> +   strbuf_release(&sb);
> +   return;
> +   }
> +
> +   if (add_submodule_odb(path))
> +   message = "(not checked out)";

When not checked out, we can invoke the diff command
in .git/modules/ as that is the git dir of the submodule,
i.e. operating diff with a bare repo?

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Junio C Hamano
Jacob Keller  writes:

> +static int prepare_submodule_diff(struct strbuf *buf, const char *path,
> + unsigned char one[20], unsigned char two[20])
> +{
> + struct child_process cp = CHILD_PROCESS_INIT;
> + cp.git_cmd = 1;
> + cp.dir = path;
> + cp.out = -1;
> + cp.no_stdin = 1;
> + argv_array_push(&cp.args, "diff");
> + argv_array_push(&cp.args, sha1_to_hex(one));
> + argv_array_push(&cp.args, sha1_to_hex(two));
> +
> + if (start_command(&cp))
> + return -1;
> +
> + if (strbuf_read(buf, cp.out, 0) < 0)
> + return -1;
> +
> + if (finish_command(&cp))
> + return -1;
> +
> + return 0;
> +}

It is a good idea to keep the submodule data isolated from the main
process by going through run-command/start-command interface (I
think the call to add_submodule_odb() should be rethought and
removed if possible).

I however wonder if you want to use src/dst-prefix to make the path
to the submodule appear there?  That is, if your superproject has a
submodule at "dir/" and two versions of the submodule changes a file
"doc/README", wouldn't you rather want to see

diff --git a/dir/doc/README b/dir/doc/README

instead of comparison between a/doc/README and b/doc/README?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC] diff: add SUBMODULE_DIFF format to display submodule diff

2016-08-09 Thread Jacob Keller
From: Jacob Keller 

For projects which have frequent updates to submodules it is often
useful to be able to see a submodule update commit as a difference.
Teach diff's --submodule= a new "diff" format which will execute a diff
for the submodule between the old and new commit, and display it as
a standard diff. This allows users to easily see what changed in
a submodule without having to dig into the submodule themselves.

Signed-off-by: Jacob Keller 
---
There are no tests yet. Stefan suggested the use of child_process, but I
really believe there has to be some way of getting the diff without
having to run a sub process (as this means we also can't do the diff
without a checked out submodule). It doesn't properly handle options
either, so a better solution would be much appreciated.

I also would prefer if the diff actually prefixed the files with
"path/to/submodule" so that it was obvious where the file was located in
the directory.

Suggestions welcome.

 Documentation/diff-config.txt  |  3 +-
 Documentation/diff-options.txt |  6 ++--
 diff.c | 21 +++--
 diff.h |  2 +-
 submodule.c| 67 ++
 submodule.h|  5 
 6 files changed, 97 insertions(+), 7 deletions(-)

diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index d5a5b17d5088..f5d693afad6c 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -123,7 +123,8 @@ diff.suppressBlankEmpty::
 diff.submodule::
Specify the format in which differences in submodules are
shown.  The "log" format lists the commits in the range like
-   linkgit:git-submodule[1] `summary` does.  The "short" format
+   linkgit:git-submodule[1] `summary` does.  The "diff" format shows the
+   diff between the beginning and end of the range. The "short" format
format just shows the names of the commits at the beginning
and end of the range.  Defaults to short.
 
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 705a87394200..b17348407805 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -215,8 +215,10 @@ any of those replacements occurred.
the commits in the range like linkgit:git-submodule[1] `summary` does.
Omitting the `--submodule` option or specifying `--submodule=short`,
uses the 'short' format. This format just shows the names of the commits
-   at the beginning and end of the range.  Can be tweaked via the
-   `diff.submodule` configuration variable.
+   at the beginning and end of the range. When `--submodule=diff` is
+   given, the 'diff' format is used. This format shows the diff between
+   the old and new submodule commmit from the perspective of the
+   submodule.  Can be tweaked via the `diff.submodule` configuration 
variable.
 
 --color[=]::
Show colored diff.
diff --git a/diff.c b/diff.c
index b43d3dd2ecb7..74d43931c8df 100644
--- a/diff.c
+++ b/diff.c
@@ -130,12 +130,18 @@ static int parse_dirstat_params(struct diff_options 
*options, const char *params
 
 static int parse_submodule_params(struct diff_options *options, const char 
*value)
 {
-   if (!strcmp(value, "log"))
+   if (!strcmp(value, "log")) {
DIFF_OPT_SET(options, SUBMODULE_LOG);
-   else if (!strcmp(value, "short"))
+   DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+   else if (!strcmp(value, "diff")) {
+   DIFF_OPT_SET(options, SUBMODULE_DIFF);
DIFF_OPT_CLR(options, SUBMODULE_LOG);
-   else
+   } else if (!strcmp(value, "short")) {
+   DIFF_OPT_CLR(options, SUBMODULE_DIFF);
+   DIFF_OPT_CLR(options, SUBMODULE_LOG);
+   } else {
return -1;
+   }
return 0;
 }
 
@@ -2310,6 +2316,15 @@ static void builtin_diff(const char *name_a,
two->dirty_submodule,
meta, del, add, reset);
return;
+   } else if (DIFF_OPT_TST(o, SUBMODULE_DIFF) &&
+   (!one->mode || S_ISGITLINK(one->mode)) &&
+   (!two->mode || S_ISGITLINK(two->mode))) {
+   show_submodule_diff(o->file, one->path ? one->path : two->path,
+   line_prefix,
+   one->oid.hash, two->oid.hash,
+   two->dirty_submodule,
+   meta, reset);
+   return;
}
 
if (DIFF_OPT_TST(o, ALLOW_TEXTCONV)) {
diff --git a/diff.h b/diff.h
index 125447be09eb..a80f3e5bafe9 100644
--- a/diff.h
+++ b/diff.h
@@ -69,7 +69,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct 
diff_options *opt, void *data)
 #define DIFF_OPT_FIND_COPIES_HARDER  (1 <<  6)
 #define DIFF_OPT_FOLLOW_RENAMES  (1 <<  7)
 #define DIFF_OPT_RENAME_EMPTY(

Re: [PATCH v2 7/7] pack-objects: use mru list when iterating over packs

2016-08-09 Thread Junio C Hamano
Jeff King  writes:

> Here's the code to do the cycle-breaking. Aside from the "hacky" bit,
> it's quite simple.  I added a new state enum to object_entry to handle
> the graph traversal. Since it only needs 2 bits, I _assume_ a compiler
> can fit it in with the bitfields above (or at the very least give it its
> own single byte so we just use what would otherwise be struct padding).
> But I didn't check; if it turns out not to be the case we can easily
> emulate it with two bitfields.  The write_object() check abuses the
> "idx.offset" field to keep the same state, but we could convert it to
> use these flags if we care.

> @@ -1516,6 +1577,13 @@ static void get_object_details(void)
>   entry->no_try_delta = 1;
>   }
>  
> + /*
> +  * This must happen in a second pass, since we rely on the delta
> +  * information for the whole list being completed.
> +  */
> + for (i = 0; i < to_pack.nr_objects; i++)
> + break_delta_cycles(&to_pack.objects[i]);
> +
>   free(sorted_by_offset);
>  }

A potential cycle can only come from reusing deltas across packs in
an unstable order, that happens way before we do the find_delta()
thing, so this is a good place to have the new call.  While reading
break_delta_cycles(), I was wondering if what it does is safe under
multi-threading but there is no need to worry.

The recursiveness of break-delta-cycles is not too bad for the same
reason why it is OK to recurse in check_delta_limit(), I would guess?

This is not new with this change, but I am not quite sure what in
the current code prevents us from busting the delta limit for reused
ones, though.

> I think my preference is to clean up the "hacky" bit of this patch, and
> then apply the earlier MRU patch on top of it (which takes my repack
> from 44 minutes to 5 minutes for this particular test set).

Yup, with something like this to break the delta chain _and_ allow
an object to go through the usual deltify machinery, I'd say the MRU
patch is a wonderful thing to have.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 2:45 PM, Junio C Hamano  wrote:
> becomes easily doable (i.e. subsequent "submodule update" can realize
> that the submodule does not have alternates but it could borrow from
> the submodule in the other-super-project-location).

I would suggest to postpone this to a later time once the need arises.

I rather imagine that once you clone from "other-super-project-location"
and get the warning about no alternates borrowing, you can decide if you want
to set it up and link the submodule manually to that alternate, or if you just
don't care about this one repository being duplicated.

For now I plan to introduce 2 options, as these are 2 different things, that
can be extended independently of each other:

submodule.alternateLocation::
Specifies how the submodules obtain alternates when submodules are
cloned. Possible values are `no`, `superproject`.
By default `no` is assumed, which doesn't add references. When the
value is set to `superproject` the submodule to be cloned computes
its alternates location relative to the superprojects alternate.

submodule.alternateErrorStrategy::
Specifies how to treat errors with the alternates for a submodule
as computed via `submodule.alternateLocation`. Possible values are
`ignore`, `info`, `die`.

> *1* Rather, I meant: clone has a very intimate knowledge on what and
> what cannot be borrowed from and it is not just "is there a
> directory?", so "git submodule update --init" is not in a good
> position to decide if it wants to add --reference to the
> invocation of "git clone" it makes internally, and introducing
> an "if-able" variant to "clone" is one way to relieve it from
> having to make that decision.

That is how I first understood the design as well.

>
> I could have suggested an alternative: because the submodule
> machinery is gettting moved to C the "update --init" code that
> drives the internal invocation of "git clone" could share the
> the logic in "git clone --reference" that determines if a local
> repository can be used as an alternate by small refactoring of
> builtin/clone.c::add_one_reference().

I see. I might actually need to do this anyway as the helper is a place
to act on the submodule.alternateErrorStrategy.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Donation!!!

2016-08-09 Thread Susanne Klatten
 Dear Recipient

Making charitable donation of $ 50 M to 13 People as my philanthropic gesture 
to the world. Interested individuals should contact me with these Code 
DON/CH2016'.

Email: susannee.ha...@aol.com

Susanne Hanne Ursula Klatten
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> At the time of cloning you may run
>
>   git clone --recursive --reference 
> or
>   git clone --recursive --reference-if-able 
> or
>   git clone --recursive

That's an interesting tangent.  I never meant "if-able" to be an
end-user visible option [*1*], but now you mention it, I do not see
a reason why "clone --reference-if-able" of the top-level project
cannot be used together with "--recursive".

> Then later when we run
> git submodule update
> we have this option to know if a submodule alternate should be treated
> as optional or required referenced as the existence
> of the superprojects alternate (as a boolean indicator) is not enough of
> an indicator what the user later wants.

A tangent that comes to my mind after reading this is if letting
"if-able" just skip (with or without warning) and forget about it
once a clone is made is what we really want.

Suppose the "other-super-project-location" repository did not have a
clone of a submodule when you create a new clone of the superproject
using it as a reference.  The submodule will be made a full clone,
but after that happens, other-super-project-location can get
interested in the submodule and can acquire its own clone.

At that point, our clone of the submodule _could_ add the submodule
in the other-super-project-location as its alternate and lose the
duplicate objects that it could borrow from there by repacking, but
the suggested "clone with if-able, and forget about it after a clone
is made" would not allow us to do this.  I do not know if a
real-world use of submodules want the ability to do so, or it is
unnecessary.  I suspect with the recording of "you were told that
borrowing from the same location as the superproject is OK", this
becomes easily doable (i.e. subsequent "submodule update" can realize
that the submodule does not have alternates but it could borrow from
the submodule in the other-super-project-location).


[Footnote]

*1* Rather, I meant: clone has a very intimate knowledge on what and
what cannot be borrowed from and it is not just "is there a
directory?", so "git submodule update --init" is not in a good
position to decide if it wants to add --reference to the
invocation of "git clone" it makes internally, and introducing
an "if-able" variant to "clone" is one way to relieve it from
having to make that decision.

I could have suggested an alternative: because the submodule
machinery is gettting moved to C the "update --init" code that
drives the internal invocation of "git clone" could share the
the logic in "git clone --reference" that determines if a local
repository can be used as an alternate by small refactoring of
builtin/clone.c::add_one_reference().
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] submodule--helper: use parallel processor correctly.

2016-08-09 Thread Stefan Beller
When developing another patch series I had a temporary state in which
git-clone would segfault, when the call was prepared in
prepare_to_clone_next_submodule. This lead to the call failing, i.e. in
`update_clone_task_finished` the task was scheduled to be tried again.
The second call to prepare_to_clone_next_submodule would return 0, as the
segfaulted clone did create the .git file already, such that was not
considered to need to be cloned again. I was seeing the "BUG: ce was
a submodule before?\n" message, which was the correct behavior at the
time as my local code was buggy. When trying to debug this failure, I
tried to use printing messages into the strbuf that is passed around,
but these messages were never printed as the die(..) doesn't
flush the `err` strbuf.

When implementing the die() in 665b35ecc (2016-06-09, "submodule--helper:
initial clone learns retry logic"), I considered this condition to be
a severe condition, which should lead to an immediate abort as we do not
trust ourselves any more. However the queued messages in `err` are valuable
so let's not toss them out by immediately dying, but a graceful return.

Another thing to note: The error message itself was misleading. A return
value of 0 doesn't indicate the passed in `ce` is not a submodule any more,
but just that we do not consider cloning it any more.

Signed-off-by: Stefan Beller 
---

 Here is the last patch of the series as an independant bug fix applicable
 on master.
 
 Thanks,
 Stefan

 builtin/submodule--helper.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6f6d67a..bd7cce6 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -748,8 +748,12 @@ static int update_clone_get_next_task(struct child_process 
*child,
if (index < suc->failed_clones_nr) {
int *p;
ce = suc->failed_clones[index];
-   if (!prepare_to_clone_next_submodule(ce, child, suc, err))
-   die("BUG: ce was a submodule before?");
+   if (!prepare_to_clone_next_submodule(ce, child, suc, err)) {
+   suc->current ++;
+   strbuf_addf(err, "BUG: submodule considered for 
cloning,"
+   "doesn't need cloning any more?\n");
+   return 0;
+   }
p = xmalloc(sizeof(*p));
*p = suc->current;
*idx_task_cb = p;
-- 
2.9.2.583.gd6329be.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0027 racy?

2016-08-09 Thread Torsten Bögershausen
>   git -c core.autocrlf=$crlf add $fname >"${pfx}_$f.err" 2>&1
> 
> would make more sense. We _know_ that we have to do convert_to_git() in
> that step because the content is changed. And then you can ignore the
> warnings from "git commit" (which are racy), or you can simply commit as
> a whole later, as some other loops do.
> 
> But like Dscho, I do not actually understand what this test is checking.
> The function is called commit_chk_wrnNNO(), so perhaps you really are
> interested in what "commit" has to say. But IMHO that is not an
> interesting test. We know that if it has to read the content from disk,
> it will call convert_to_git(), which is the exact same code path used by
> "git add". So I do not understand what it is accomplishing to make a
> commit at all here.

It seems as if the test has been written without understanding the raciness.
It should commit files with different line endings on top of
a file with mixed line endings.
The warning should be checked (and here "git add" can be used,
or the file can be commited directly).
I'm not sure why the test ended up in doing both.

However, doing it the right way triggers a bug in convert.c,
(some warnings are missing, so I need some days to come up
with a proper patch)

Thanks for reporting & digging.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 11:44 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> The way I understood and implemented it is
>>
>> here is a path, try to use it as an alternate; if that is not
>> an alternate, it's fine too; maybe warn about it, but carry
>> on with the operation.
>
> My expectation is without "maybe warn about it".  And not adding it
> as an alternate (because it is not usable) is just "doing as I was
> told to do", nothing noteworthy.  Because "do it only if you can" is
> an explicit instruction to the doer that the caller does not care
> about the outcome, I'd think there shouldn't be any warning, whether
> it is used or not.  At the same time, if the caller wants to know
> the outcome, and using if-able as a way to say "I prefer to see it
> happen, but you do not have to panic if you can't", I'd think it is
> OK to give "info:" to report which of the two possible paths was
> taken in either case.  Throwing a "warning:" only when it didn't do
> so does not make much sense to me.

I think this whole thing needs a new config variable to be set.

At the time of cloning you may run

  git clone --recursive --reference 
or
  git clone --recursive --reference-if-able 
or
  git clone --recursive

The first two should trickle down to the submodules, i.e.
in the first case we maybe want to run

git config submodule.alternate=required-superproject &&
git submodule update --init --recurse

In the second case
git config submodule.alternate=optional-superproject &&
git submodule update --init --recurse

And in the third case we maybe only want to record
git config submodule.alternate=none # implicit by not setting it as well
git submodule update --init --recurse


Then later when we run
git submodule update
we have this option to know if a submodule alternate should be treated
as optional or required referenced as the existence
of the superprojects alternate (as a boolean indicator) is not enough of
an indicator what the user later wants.


>>
>> I see. This way the user is most informed.
>
> Yes, and if we were to go that route, "clone" without "-v" should
> not report which of the two it took.  It is OK for "submodule" to
> look at what "clone" left as the result and do more intelligent
> reporting that is better suited for the context of the command.
>
>> The current implementation
>> of "submodule update --init" for start from scratch yields for example:
>>
>> Submodule 'foo' () registered for path 'foo'
>> Submodule 'hooks' () registered for path 'hooks'
>> Cloning into '/home/sbeller/super/foo'...
>> Cloning into '/home/sbeller/super/hooks'...
>> warning: Not using proposed alternate
>> /home/sbeller/super-reference/.git/modules/hooks/
>> Submodule path 'foo': checked out '7b41f3a413b46140b050ae5324cbbcdd467d2b3a'
>> Submodule path 'hooks': checked out 
>> '3acc14d10d26678eae6489038fe0d4dad644a9b4'
>>
>> so before this series we had 3 lines per submodule, and with this series
>> we get a 4th line for the unusual case.
>>
>> You would prefer to see always 4 lines per submodule?
>
> If the user says "--recursive --reference", then the user probably
> deserves to be notified.  As I said (eh, rather, as I wanted to say
> but failed to say so), I do not have a strong preference either way.

With a new config option we can trigger the notifications based on that as well.

"required superproject" would die when the submodule alternate is not there,
and the "optional superproject" would always state if it found an alternate.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] pack-objects: Teach --use-bitmap-index codepath to respect --local, --honor-pack-keep and --incremental

2016-08-09 Thread Junio C Hamano
Kirill Smelkov  writes:

> On Tue, Aug 09, 2016 at 09:52:18AM -0700, Junio C Hamano wrote:
>> "touch A" forcess the readers wonder "does the timestamp of A
>> matter, and if so in what way?" and "does any later test care what
>> is _in_ A, and if so in what way?"  Both of them is wasting their
>> time when there is no reason why "touch" should have been used. 
>
> I see, thanks for explaining. I used to read it a bit the other way;

Surely ">A" may invite "Hmm, is it important that A gets empty?", so
the choice between the two is not so black-and-white.  It just is
that "touch" has a more specific "update the timestamp while keeping
its contents intact" meaning, compared to ">A", which _could_ be
read as "make it empty and update its mtime" but most people would
not (i.e. "update its mtime" is a side effect for any modification).

> Ok, makes sense. Both patches adjusted and will be reposted.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] i18n: git-stash: mark message for translation

2016-08-09 Thread Junio C Hamano
Vasco Almeida  writes:

> Signed-off-by: Vasco Almeida 
> ---
>  git-stash.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-stash.sh b/git-stash.sh
> index 22fb8bc..9cbd682 100755
> --- a/git-stash.sh
> +++ b/git-stash.sh
> @@ -265,7 +265,7 @@ save_stash () {
>   create_stash "$stash_msg" $untracked
>   store_stash -m "$stash_msg" -q $w_commit ||
>   die "$(gettext "Cannot save the current status")"
> - say Saved working directory and index state "$stash_msg"
> + say "$(eval_gettext "Saved working directory and index state 
> \$stash_msg")"
>  
>   if test -z "$patch_mode"
>   then

There is another "say" without gettext in pop_stash function.  Don't
you want to do that one, too?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] i18n: archive: mark errors for translation

2016-08-09 Thread Junio C Hamano
Vasco Almeida  writes:

> Signed-off-by: Vasco Almeida 
> ---
>  archive.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/archive.c b/archive.c
> index 42df974..dde1ab4 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -458,11 +458,11 @@ static int parse_archive_args(int argc, const char 
> **argv,
>   argc = parse_options(argc, argv, NULL, opts, archive_usage, 0);
>  
>   if (remote)
> - die("Unexpected option --remote");
> + die(_("Unexpected option --remote"));
>   if (exec)
> - die("Option --exec can only be used together with --remote");
> + die(_("Option --exec can only be used together with --remote"));
>   if (output)
> - die("Unexpected option --output");
> + die(_("Unexpected option --output"));
> ...
> - die("Unknown archive format '%s'", format);
> + die(_("Unknown archive format '%s'"), format);
> ...
> - die("Argument not supported for format '%s': -%d",
> + die(_("Argument not supported for format '%s': -%d"),
>   format, compression_level);

Hmm, this function is called by write_archive(), which can be called
by the upload-archive process running on the remote end, whose
locale certainly is different from that of your local environment.

If I do not read English and got one of these messages from the
remote end, I can copy that into a search engine to read more about
the error, but if I got it translated into, say, Portuguese, I'd
have a (slightly) harder time dealing with the error, I would think.

Having said that, I expect that sites that expect internatinal
audience to come would run these services in C locale, and other
sites that target audiences in a single locale would choose to use
their favourite single locale, so probably we do not have to worry
about it.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2 v7] pack-objects: use reachability bitmap index when generating non-stdout pack

2016-08-09 Thread Kirill Smelkov
Starting from 6b8fda2d (pack-objects: use bitmaps when packing objects)
if a repository has bitmap index, pack-objects can nicely speedup
"Counting objects" graph traversal phase. That however was done only for
case when resultant pack is sent to stdout, not written into a file.

The reason here is for on-disk repack by default we want:

- to produce good pack (with bitmap index not-yet-packed objects are
  emitted to pack in suboptimal order).

- to use more robust pack-generation codepath (avoiding possible
  bugs in bitmap code and possible bitmap index corruption).

Jeff King further explains:

The reason for this split is that pack-objects tries to determine how
"careful" it should be based on whether we are packing to disk or to
stdout. Packing to disk implies "git repack", and that we will likely
delete the old packs after finishing. We want to be more careful (so
as not to carry forward a corruption, and to generate a more optimal
pack), and we presumably run less frequently and can afford extra CPU.
Whereas packing to stdout implies serving a remote via "git fetch" or
"git push". This happens more frequently (e.g., a server handling many
fetching clients), and we assume the receiving end takes more
responsibility for verifying the data.

But this isn't always the case. One might want to generate on-disk
packfiles for a specialized object transfer. Just using "--stdout" and
writing to a file is not optimal, as it will not generate the matching
pack index.

So it would be useful to have some way of overriding this heuristic:
to tell pack-objects that even though it should generate on-disk
files, it is still OK to use the reachability bitmaps to do the
traversal.

So we can teach pack-objects to use bitmap index for initial object
counting phase when generating resultant pack file too:

- if we care it is not activated under git-repack:

  See above about repack robustness and not forward-carrying corruption.

- if we know bitmap index generation is not enabled for resultant pack:

  Current code has singleton bitmap_git so cannot work simultaneously
  with two bitmap indices.

  We also want to avoid (at least with current implementation)
  generating bitmaps off of bitmaps. The reason here is: when generating
  a pack, not-yet-packed objects will be emitted into pack in
  suboptimal order and added to tail of the bitmap as "extended entries".
  When the resultant pack + some new objects in associated repository
  are in turn used to generate another pack with bitmap, the situation
  repeats: new objects are again not emitted optimally and just added to
  bitmap tail - not in recency order.

  So the pack badness can grow over time when at each step we have
  bitmapped pack + some other objects. That's why we want to avoid
  generating bitmaps off of bitmaps, not to let pack badness grow.

- if we keep pack reuse enabled still only for "send-to-stdout" case:

  Because on pack reuse raw entries are directly written out to destination
  pack by write_reused_pack() bypassing needed for pack index generation
  bookkeeping done by regular codepath in write_one() and friends.

This way for pack-objects -> file we get nice speedup:

erp5.git[1] (~230MB) extracted from ~ 5GB lab.nexedi.com backup
repository managed by git-backup[2] via

time echo 0186ac99 | git pack-objects --revs erp5pack

before:  37.2s
after:   26.2s

And for `git repack -adb` packed git.git

time echo 5c589a73 | git pack-objects --revs gitpack

before:   7.1s
after:3.6s

i.e. it can be 30% - 50% speedup for pack extraction.

git-backup extracts many packs on repositories restoration. That was my
initial motivation for the patch.

[1] https://lab.nexedi.com/nexedi/erp5
[2] https://lab.nexedi.com/kirr/git-backup

NOTE

Jeff also suggests that pack.useBitmaps was probably a mistake to
introduce originally. This way we are not adding another config point,
but instead just always default to-file pack-objects not to use bitmap
index: Tools which need to generate on-disk packs with using bitmap, can
pass --use-bitmap-index explicitly. And git-repack does never pass
--use-bitmap-index, so this way we can be sure regular on-disk repacking
remains robust.

NOTE2

`git pack-objects --stdout >file.pack` + `git index-pack file.pack` is much 
slower
than `git pack-objects file.pack`. Extracting erp5.git pack from
lab.nexedi.com backup repository:

$ time echo 0186ac99 | git pack-objects --stdout --revs 
>erp5pack-stdout.pack

real0m22.309s
user0m21.148s
sys 0m0.932s

$ time git index-pack erp5pack-stdout.pack

real0m50.873s   <-- more than 2 times slower than time to generate pack 
itself!
user0m49.300s
sys 0m1.360s

So the time for

`pack-object --stdout >file.pack` + `index-pack file.pack`  is  72s,

while

`pack-objects file.pack` which does both pack and index is  27s.

And even

`pac

[PATCH 1/2 v5] pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use

2016-08-09 Thread Kirill Smelkov
Since 6b8fda2d (pack-objects: use bitmaps when packing objects) there
are two codepaths in pack-objects: with & without using bitmap
reachability index.

However add_object_entry_from_bitmap(), despite its non-bitmapped
counterpart add_object_entry(), in no way does check for whether --local
or --honor-pack-keep or --incremental should be respected. In
non-bitmapped codepath this is handled in want_object_in_pack(), but
bitmapped codepath has simply no such checking at all.

The bitmapped codepath however was allowing to pass in all those options
and with bitmap indices still being used under such conditions -
potentially giving wrong output (e.g. including objects from non-local or
.keep'ed pack).

We can easily fix this by noting the following: when an object comes to
add_object_entry_from_bitmap() it can come for two reasons:

1. entries coming from main pack covered by bitmap index, and
2. object coming from, possibly alternate, loose or other packs.

"2" can be already handled by want_object_in_pack() and to cover
"1" we can teach want_object_in_pack() to expect that *found_pack can be
non-NULL, meaning calling client already found object's pack entry.

In want_object_in_pack() we care to start the checks from already found
pack, if we have one, this way determining the answer right away
in case neither --local nor --honour-pack-keep are active. In
particular, as p5310-pack-bitmaps.sh shows, we do not do harm to
served-with-bitmap clones performance-wise:

Test  56dfeb62  this tree
-
5310.2: repack to disk9.63(8.67+0.33)   9.47(8.55+0.28) -1.7%
5310.3: simulated clone   2.07(2.17+0.12)   2.03(2.14+0.12) -1.9%
5310.4: simulated fetch   0.78(1.03+0.02)   0.76(1.00+0.03) -2.6%
5310.6: partial bitmap1.97(2.43+0.15)   1.92(2.36+0.14) -2.5%

with all differences strangely showing we are a bit faster now, but
probably all being within noise.

And in the general case we care not to have duplicate
find_pack_entry_one(*found_pack) calls. Worst what can happen is we can
call want_found_object(*found_pack) -- newly introduced helper for
checking whether we want object -- twice, but since want_found_object()
is very lightweight it does not make any difference.

I appreciate help and discussing this change with Junio C Hamano and
Jeff King.

Signed-off-by: Kirill Smelkov 
---
 builtin/pack-objects.c  | 93 +++--
 t/t5310-pack-bitmaps.sh | 92 
 2 files changed, 152 insertions(+), 33 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c4c2a3c..b1007f2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -944,13 +944,44 @@ static int have_duplicate_entry(const unsigned char *sha1,
return 1;
 }
 
+static int want_found_object(int exclude, struct packed_git *p)
+{
+   if (exclude)
+   return 1;
+   if (incremental)
+   return 0;
+
+   /*
+* When asked to do --local (do not include an object that appears in a
+* pack we borrow from elsewhere) or --honor-pack-keep (do not include
+* an object that appears in a pack marked with .keep), finding a pack
+* that matches the criteria is sufficient for us to decide to omit it.
+* However, even if this pack does not satisfy the criteria, we need to
+* make sure no copy of this object appears in _any_ pack that makes us
+* to omit the object, so we need to check all the packs. Signal that by
+* returning -1 to the caller.
+*/
+   if (!ignore_packed_keep &&
+   (!local || !have_non_local_packs))
+   return 1;
+
+   if (local && !p->pack_local)
+   return 0;
+   if (ignore_packed_keep && p->pack_local && p->pack_keep)
+   return 0;
+
+   /* we don't know yet; keep looking for more packs */
+   return -1;
+}
+
 /*
  * Check whether we want the object in the pack (e.g., we do not want
  * objects found in non-local stores if the "--local" option was used).
  *
- * As a side effect of this check, we will find the packed version of this
- * object, if any. We therefore pass out the pack information to avoid having
- * to look it up again later.
+ * If the caller already knows an existing pack it wants to take the object
+ * from, that is passed in *found_pack and *found_offset; otherwise this
+ * function finds if there is any pack that has the object and returns the pack
+ * and its offset in these variables.
  */
 static int want_object_in_pack(const unsigned char *sha1,
   int exclude,
@@ -958,15 +989,30 @@ static int want_object_in_pack(const unsigned char *sha1,
   off_t *found_offset)
 {
struct packed_git *p;
+   int want;
 
if (!exclude && local && has_loose_object_n

Re: [PATCH 1/2] pack-objects: Teach --use-bitmap-index codepath to respect --local, --honor-pack-keep and --incremental

2016-08-09 Thread Kirill Smelkov
On Tue, Aug 09, 2016 at 09:52:18AM -0700, Junio C Hamano wrote:
> Kirill Smelkov  writes:
> 
> > Would you please explain why we should not use touch if we do not care
> > about timestamps? Simply style?
> 
> To help readers.
> 
> "touch A" forcess the readers wonder "does the timestamp of A
> matter, and if so in what way?" and "does any later test care what
> is _in_ A, and if so in what way?"  Both of them is wasting their
> time when there is no reason why "touch" should have been used. 

I see, thanks for explaining. I used to read it a bit the other way;
maybe it is just an environment difference.


> > diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> > index cce95d8..44914ac 100755
> > --- a/t/t5310-pack-bitmaps.sh
> > +++ b/t/t5310-pack-bitmaps.sh
> > @@ -8,16 +8,15 @@ objpath () {
> >  }
> >  
> >  # show objects present in pack ($1 should be associated *.idx)
> > -packobjects () {
> > -   git show-index <$1 | cut -d' ' -f2
> > +pack_list_objects () {
> > +   git show-index <"$1" | cut -d' ' -f2
> >  }
> 
> pack-list-objects still sounds as if you are packing "list objects",
> though.  If you are listing packed objects (or objects in a pack),
> list-packed-objects (or list-objects-in-pack) reads clearer and more
> to the point, at least to me.

Ok, let it be list_packed_objects().


> > -# hasany pattern-file content-file
> > +# has_any pattern-file content-file
> >  # tests whether content-file has any entry from pattern-file with entries 
> > being
> >  # whole lines.
> > -hasany () {
> > -   # NOTE `grep -f` is not portable
> > -   git grep --no-index -qFf $1 $2
> > +has_any () {
> > +   grep -qFf "$1" "$2"
> 
> Omitting "-q" would help those who have to debug breakage in this
> test or the code that this test checks.  What test_expect_success
> outputs is not shown by default, and running the test script with
> "-v" would show them as a debugging aid.

Ok, makes sense. Both patches adjusted and will be reposted.

Thanks,
Kirill
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Aug 09, 2016 at 10:34:11AM -0700, Junio C Hamano wrote:
>
>> It may be a good UI that is optimized for drive-by contributors.  It
>> is just that it is not very well suited (compared to mailing list
>> discussions) to conduct high-volume exchange of ideas and changes
>> efficiently.
>
> I think that's something to ponder; can we have a workflow where
> drive-by contributors can use something that has a lower learning/setup
> curve, but long-term contributors might opt for something more powerful?
>
> I think SubmitGit is a step in that direction.

Yes, agreed 100% with that.  The author of the tool must be praised
by getting added to the Cc: line in this discussion ;-)

> It does still require
> switching to the mailing list for subsequent conversation, though. It
> would be interesting to see something like SubmitGit that puts its own
> email in the "From", and that processes email replies into PR comments,
> and then subsequent PR comments into emails (i.e., part of my "dream tool"
> from earlier). It's not clear to me whether the result would just end up
> being irritating for both sides to use (because it doesn't _quite_
> conform to the norms of each format). But it would be fun to find out.

Perhaps.  I do not know if I like that second and subsequent steps
for SubmitGit, but its first step as currently deployed I am very
happy with.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #03; Mon, 8)

2016-08-09 Thread Junio C Hamano
Elijah Newren  writes:

> So, I think the series looks good.  Sorry that I didn't spot any errors.

That's nothing to be sorry about--it is an excellent news.

Thanks for helping.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 11:50:51AM -0700, Stefan Beller wrote:

> > Could you share your mutt set up pleaaase? I've been wanting this for
> > a long time, but never used mutt long enough to bother with a proper
> > setup like this (I blame gmail).
> 
> 
> That is my complaint^H^H^H^H position, too.
> I always wanted to switch to a more powerful
> setup than git-send-email for sending /gmail for reading,
> but I could not convince myself the steep learning/setup curve
> is worth it eventually as it is "not broken enough" to do the change
> right now.

I think I may have shared it before, but here is the script I use to
send emails. It dumps you in mutt, and then I have:

  macro index,pager b ":set edit_headers=yes:set 
edit_headers=no"

to send the message ("b" is for "bounce", which I think may be another
Pine-ism).

-- >8 --
#!/bin/sh

upstream_branch() {
  current=`git symbolic-ref HEAD`
  upstream=`git for-each-ref --format='%(upstream)' "$current"`
  if test -n "$upstream"; then
echo $upstream
  else
echo origin
  fi
}

get_reply_headers() {
  perl -ne '
if (defined $opt) {
  if (/^\s+(.*)/) {
$val .= " $1";
next;
  }
  print "--$opt=", quotemeta($val), " ";
  $opt = $val = undef;
}
if (/^(cc|to):\s*(.*)/i) {
  $opt = lc($1);
  $val = $2;
}
elsif (/^message-id:\s*(.*)/i) {
  $opt = "in-reply-to";
  $val = $1;
}
elsif (/^subject:\s*\[PATCH v(\d+)/i) {
  print "-v$1 ";
}
elsif (/^$/) {
  last;
}
  '
}

has_nonoption=
for i in "$@"; do
  case "$i" in
-[0-9]) has_nonoption=yes ;;
-*) ;;
 *) has_nonoption=yes
  esac
done

git rev-parse || exit 1

: ${REPLY:=$HOME/patch}
test -e "$REPLY" && eval "set -- `get_reply_headers <\"$REPLY\"` \"\$@\""
test "$has_nonoption" = "yes" || set -- "$@" `upstream_branch`

git format-patch -s --stdout --from "$@" >.mbox
if test -t 1; then
  mutt -e 'set sort=mailbox-order' -f .mbox
else
  perl -lne '
if (/^Subject: (.*)/) {
  $subject = $1;
}
elsif ($subject && /^\s+(.*)/) {
  $subject .= " $1";
}
elsif ($subject) {
  print $subject;
  $subject = undef;
}
  ' .mbox |
  sed -e 's/\[PATCH /[/' \
  -e 's/]/]:/' \
  -e 's/^/  /'
fi
rm -f .mbox
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 08:43:59PM +0200, Duy Nguyen wrote:

> On Tue, Aug 9, 2016 at 1:37 PM, Jeff King  wrote:
> >That's (relatively) easy for me to script via mutt (grab
> >these patches, apply them).
> 
> Could you share your mutt set up pleaaase? I've been wanting this for
> a long time, but never used mutt long enough to bother with a proper
> setup like this (I blame gmail).

It's actually pretty simple. The relevant config from my .muttrc is:

   macro pager,index D 'rm -f $HOME/patch'
   macro pager,index A '~/patch'

I use "~/patch" as a rendezvous point, and then "git am ~/patch" from my
other terminal. That avoids mutt having to know which repo to apply to,
and keeps the "am" process in its own terminal (which is handy if it
runs into conflicts, for example).

So generally I would "D" to clear out the contents of ~/patch, and then
"A" whichever patches I want to apply. I often use mutt's aggregate
selection for that. My bindings are:

  bind index \; tag-pattern
  bind index a tag-prefix

which I think come from pine (which I used for many years before
switching to mutt probably 15 years ago). I don't recall the default
keybindings.

Anyway, you can either tag using a pattern (with ";"), or tag mails
individually (using "t", the default), and then "a-A" to apply the "A"
to all of them (if you are in the habit of tagging all of them and then
doing "A" in one swoop, you could also get rid of the separate "D"
command and just make "A" imply it).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #03; Mon, 8)

2016-08-09 Thread Elijah Newren
Hi,

On Mon, Aug 8, 2016 at 3:32 PM, Junio C Hamano  wrote:
> * js/am-3-merge-recursive-direct (2016-08-01) 16 commits
>   (merged to 'next' on 2016-08-05 at dc1c9bb)
>  + merge-recursive: flush output buffer even when erroring out
>  + merge_trees(): ensure that the callers release output buffer
>  + merge-recursive: offer an option to retain the output in 'obuf'
>  + merge-recursive: write the commit title in one go
>  + merge-recursive: flush output buffer before printing error messages
>  + am -3: use merge_recursive() directly again
>  + merge-recursive: switch to returning errors instead of dying
>  + merge-recursive: handle return values indicating errors
>  + merge-recursive: allow write_tree_from_memory() to error out
>  + merge-recursive: avoid returning a wholesale struct
>  + merge_recursive: abort properly upon errors
>  + prepare the builtins for a libified merge_recursive()
>  + merge-recursive: clarify code in was_tracked()
>  + die(_("BUG")): avoid translating bug messages
>  + die("bug"): report bugs consistently
>  + t5520: verify that `pull --rebase` shows the helpful advice when failing
>
>  "git am -3" calls "git merge-recursive" when it needs to fall back
>  to a three-way merge; this call has been turned into an internal
>  subroutine call instead of spawning a separate subprocess.
>
>  Will merge to 'master'.
>
>  Eyes from other people are highly appreciated, as my eyes (and the
>  original author's, too) have rotten by staring many rerolls of the
>  same topic and are not effective in spotting errors.

I gave it a whirl and read over all these patches.  I read them first
trying to see if there was any difference in behavior for cases where
the old code wouldn't hit a die().  As far as I can tell, the only
place that could change the non-die() paths was the early return added
when a filed couldn't be removed in remove_file_from_cache (inside the
handle_change_delete() function):

-   remove_file_from_cache(path);
-   update_file(o, 0, o_oid, o_mode, renamed ? renamed : path);
+   ret = remove_file_from_cache(path);
+   if (!ret)
+   ret = update_file(o, 0, o_oid, o_mode,
+ renamed ? renamed : path);

However, remove_file_from_cache() unconditionally returns 0 and
appears to have done so since it was introduced in 2005.  That seems a
bit odd (Is the function just buggy?  Should it be transformed into a
void function to make it clear that there's no point checking return
values?)  Anyway, even if remove_file_from_cache() was modified to
return a nonzero result when its argument wasn't already in the index,
I believe that wouldn't matter here because path must be in the index
in order to reach this point of the code (if it somehow wasn't in the
index, that probably would have been a die()-worthy condition in the
old code that we just didn't bother checking for).

After my reading, I'm fairly confident that things that worked before
without hitting a die() should have the same behavior after this set
of patches (modulo the intentional changes like output buffering).

I then re-read in another pass for when the code would have hit a
die() previously, and in all those cases Johannes is returning as
early as possible and avoiding falling through to subsequent code,
which is probably the most reasonable thing to do.  There might be a
few cases where a few extra steps could be taken, but it's safer to
just return early.

So, I think the series looks good.  Sorry that I didn't spot any errors.

Elijah
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 11:43 AM, Duy Nguyen  wrote:
> On Tue, Aug 9, 2016 at 1:37 PM, Jeff King  wrote:
>>That's (relatively) easy for me to script via mutt (grab
>>these patches, apply them).
>
> Could you share your mutt set up pleaaase? I've been wanting this for
> a long time, but never used mutt long enough to bother with a proper
> setup like this (I blame gmail).


That is my complaint^H^H^H^H position, too.
I always wanted to switch to a more powerful
setup than git-send-email for sending /gmail for reading,
but I could not convince myself the steep learning/setup curve
is worth it eventually as it is "not broken enough" to do the change
right now.

My experiments with mutts, have left these lines in my
~/.muttrc

> # use shift + A to apply a patch in the working dir
> # macro index A ":unset pipe_decode\n|git am -3\n:set pipe_decode\n"
> # macro pager A ":unset pipe_decode\n|git am -3\n:set pipe_decode\n"
>
> macro index A ":set folder='.'\n:copy-message\n"

(IIRC they were broken for many patches, but I got applying
one patch to work. Which sucks for long email series.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> The way I understood and implemented it is
>
> here is a path, try to use it as an alternate; if that is not
> an alternate, it's fine too; maybe warn about it, but carry
> on with the operation.

My expectation is without "maybe warn about it".  And not adding it
as an alternate (because it is not usable) is just "doing as I was
told to do", nothing noteworthy.  Because "do it only if you can" is
an explicit instruction to the doer that the caller does not care
about the outcome, I'd think there shouldn't be any warning, whether
it is used or not.  At the same time, if the caller wants to know
the outcome, and using if-able as a way to say "I prefer to see it
happen, but you do not have to panic if you can't", I'd think it is
OK to give "info:" to report which of the two possible paths was
taken in either case.  Throwing a "warning:" only when it didn't do
so does not make much sense to me.

> The way you write this response I think you have a slightly
> different understanding of what the -if-able ought to do?
>
> When --reference is given, only the superproject should
> borrow and the -if-able, may allow submodules to also borrow?

I have no idea what this sentence means.

>> I have a very strong opinion what should happen when the end-user
>> facing command is "git submodule", but if I have to choose, I would
>> prefer to see "git submodule update" tell what it did with "info:"
>> either case, i.e. "info: borrowing from ... because the superproject
>> borrows from there", and "info: not borrowing from ... even though
>> the superproject borrows from there".
>
> I see. This way the user is most informed.

Yes, and if we were to go that route, "clone" without "-v" should
not report which of the two it took.  It is OK for "submodule" to
look at what "clone" left as the result and do more intelligent
reporting that is better suited for the context of the command.

> The current implementation
> of "submodule update --init" for start from scratch yields for example:
>
> Submodule 'foo' () registered for path 'foo'
> Submodule 'hooks' () registered for path 'hooks'
> Cloning into '/home/sbeller/super/foo'...
> Cloning into '/home/sbeller/super/hooks'...
> warning: Not using proposed alternate
> /home/sbeller/super-reference/.git/modules/hooks/
> Submodule path 'foo': checked out '7b41f3a413b46140b050ae5324cbbcdd467d2b3a'
> Submodule path 'hooks': checked out '3acc14d10d26678eae6489038fe0d4dad644a9b4'
>
> so before this series we had 3 lines per submodule, and with this series
> we get a 4th line for the unusual case.
>
> You would prefer to see always 4 lines per submodule?

If the user says "--recursive --reference", then the user probably
deserves to be notified.  As I said (eh, rather, as I wanted to say
but failed to say so), I do not have a strong preference either way.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Duy Nguyen
On Tue, Aug 9, 2016 at 1:37 PM, Jeff King  wrote:
>That's (relatively) easy for me to script via mutt (grab
>these patches, apply them).

Could you share your mutt set up pleaaase? I've been wanting this for
a long time, but never used mutt long enough to bother with a proper
setup like this (I blame gmail).
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Spelling fixes

2016-08-09 Thread Ville Skyttä
On Tue, Aug 9, 2016 at 9:19 PM, Junio C Hamano  wrote:
[...]
> There are two "commited" you seem to have missed, though,
>
> t/t3420-rebase-autostash.sh:echo uncommited-content >file0 &&
> t/t3420-rebase-autostash.sh:echo uncommited-content >expected &&
>
> which I'll queue on top of this patch to be later squashed (i.e. no
> need to resend the whole thing only to add these two).

Thanks. https://github.com/client9/misspell/pull/61 :)

Also, there's SSTATE_TRANSFERING->SSTATE_TRANSFERRING in
transport-helper.c, maybe you can squash that one in as well if you
think it's fine?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Duy Nguyen
On Tue, Aug 9, 2016 at 8:36 PM, Duy Nguyen  wrote:
> On Tue, Aug 9, 2016 at 1:20 AM, Michael Haggerty  wrote:
>> Could you elaborate why you would expect quality and/or quantity of
>> reviews to suffer? I'm really curious, and I'd be happy to pass your
>> feedback along to my colleagues.
>
> Since I have been using github at work for a couple months, I do have
> a few complaints if it will ever become the way of reviewing things in
> git. Some of these may be covered by other people already (I haven't
> read all new mails in this thread yet)

Another super nit thing: use monospace font for commit messages, or at
least have an option for that.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Duy Nguyen
On Tue, Aug 9, 2016 at 1:20 AM, Michael Haggerty  wrote:
> Could you elaborate why you would expect quality and/or quantity of
> reviews to suffer? I'm really curious, and I'd be happy to pass your
> feedback along to my colleagues.

Since I have been using github at work for a couple months, I do have
a few complaints if it will ever become the way of reviewing things in
git. Some of these may be covered by other people already (I haven't
read all new mails in this thread yet)

 - Github PRs seem to encourage seeing changes as a whole, not a
separate commits. Or at least it's not so convenient to view separate
commits (e.g. not easy to go to the next commit)

 - The ability to show all outdated comments, in case I want to search
through them.

 - I have a feeling that commits in PRs are sorted by authordate, not
in topological order. The order of commits being committed is
important sometimes.

 - Not showing leading spaces mixing with TABs, or trailing spaces

 - I would love to have all patches numbered so can refer to them as
1/7, 2/5... instead of just short sha1 (and I think you have the
ability to refer to "1/7 of iteration 2", see next bullet point)

 -I guess you can manage multiple iterations of a topic with one
iteration per PR, then linking them together. It would be nicer to
somehow bake the iteration concept directly to a PR so we can switch
between them, or do interdiff. I know, this is more of a improvement
request than complaint because ML is not really better.

 - Offline support would be very nice. I'm only most of the time, but
sometimes I do work on git stuff offline.

 - We lose the integration with ML, I think. Sometimes the user
reports a bug here, then we reply back with a patch. With github, I
guess we reply back with a PR number, then further discussion may go
there, some discussion may still be on ML.

> * It is easy to summon somebody else into the review conversation by
> @-mentioning them. That person immediately can see the whole history of
> the PR. (This is an improvement on the status quo, where a new entrant
> to a conversation might have to dig through the email trash or an email
> archive to see emails that they overlooked before they were added to the
> CC list.)

On the other hand, we can't just CC anybody anymore because we don't
know if they have a github account (or the account name for that
matter). Or does github allow @-ing email addresses too? We record
people's email address, not github account names.

> * It is possible to search old issues and PRs for additional context.
> (Granted, the history of the project from its ML era would have to be
> searched separately.)

To me searching in email is still better. Maybe I haven't fully
explored github search capabilities
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 6/9] clone: implement optional references

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> Thinking about that we may want to not rely on such a hack,
> but make it clearer.

But that is far from sufficient, isn't it?

You'd need to bypass "not a local repository", "shallow" and "is
grafted" anyway, so in that sense, that hack is not doing much.
If modules/X (or modules/X/) does not exist, it won't pass these
checks, so burdening the readers with slash trickery is not really
worth their time.



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] completion: add completion for --submodule=* diff option

2016-08-09 Thread Jacob Keller
From: Jacob Keller 

Teach git-completion.bash to complete --submodule= for git commands
which take diff options. Also teach completion for git-log to support
--diff-algorithms as well.

Signed-off-by: Jacob Keller 
---
 contrib/completion/git-completion.bash | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index e1899ee53613..d81ee688c3b7 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1207,6 +1207,8 @@ _git_describe ()
 
 __git_diff_algorithms="myers minimal patience histogram"
 
+__git_diff_submodule_formats="log short"
+
 __git_diff_common_options="--stat --numstat --shortstat --summary
--patch-with-stat --name-only --name-status --color
--no-color --color-words --no-renames --check
@@ -1222,6 +1224,7 @@ __git_diff_common_options="--stat --numstat --shortstat 
--summary
--dirstat --dirstat= --dirstat-by-file
--dirstat-by-file= --cumulative
--diff-algorithm=
+   --submodule --submodule=
 "
 
 _git_diff ()
@@ -1233,6 +1236,10 @@ _git_diff ()
__gitcomp "$__git_diff_algorithms" "" 
"${cur##--diff-algorithm=}"
return
;;
+   --submodule=*)
+   __gitcomp "$__git_diff_submodule_formats" "" 
"${cur##--submodule=}"
+   return
+   ;;
--*)
__gitcomp "--cached --staged --pickaxe-all --pickaxe-regex
--base --ours --theirs --no-index
@@ -1496,6 +1503,14 @@ _git_log ()
__gitcomp "full short no" "" "${cur##--decorate=}"
return
;;
+   --diff-algorithm=*)
+   __gitcomp "$__git_diff_algorithms" "" 
"${cur##--diff-algorithm=}"
+   return
+   ;;
+   --submodule=*)
+   __gitcomp "$__git_diff_submodule_formats" "" 
"${cur##--submodule=}"
+   return
+   ;;
--*)
__gitcomp "
$__git_log_common_options
@@ -2459,6 +2474,10 @@ _git_show ()
__gitcomp "$__git_diff_algorithms" "" 
"${cur##--diff-algorithm=}"
return
;;
+   --submodule=*)
+   __gitcomp "$__git_diff_submodule_formats" "" 
"${cur##--submodule=}"
+   return
+   ;;
--*)
__gitcomp "--pretty= --format= --abbrev-commit --oneline
--show-signature
-- 
2.9.2.701.gf965a18

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] completion: complete --delete, --move, and --remotes for git branch

2016-08-09 Thread Junio C Hamano
Ville Skyttä  writes:

> Signed-off-by: Ville Skyttä 
> ---
>  contrib/completion/git-completion.bash | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash 
> b/contrib/completion/git-completion.bash
> index 6a187bc..76abbd1 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -1008,8 +1008,8 @@ _git_branch ()
>   while [ $c -lt $cword ]; do
>   i="${words[c]}"
>   case "$i" in
> - -d|-m)  only_local_ref="y" ;;
> - -r) has_r="y" ;;
> + -d|--delete|-m|--move)  only_local_ref="y" ;;
> + -r|--remotes)   has_r="y" ;;
>   esac
>   ((c++))
>   done

Sounds sensible; we already had "-d" but not its fully-spelled
variant, and you are adding it (together with its friends).

Will queue.

> @@ -1023,7 +1023,7 @@ _git_branch ()
>   --color --no-color --verbose --abbrev= --no-abbrev
>   --track --no-track --contains --merged --no-merged
>   --set-upstream-to= --edit-description --list
> - --unset-upstream
> + --unset-upstream --delete --move --remotes
>   "
>   ;;
>   *)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Eric Wong
Michael Haggerty  wrote:
> On 08/04/2016 05:58 PM, Johannes Schindelin wrote:
> > [...]
> > Even requiring every contributor to register with GitHub would be too much
> > of a limitation, I would wager.
> > [...]

> * Discussion of pull requests can be done either
>   * via the website (super easy for beginners but powerful for
> experienced users),
>   * by setting up email notifications for your account and replying to
> those emails, or
>   * via an API.
>   Such discussion is all in markdown, which supports light formatting,
> hyperlinks, and @-mentions.



> Disclaimer: I work for GitHub, but in this email I'm speaking for myself.
> 
> Michael
> 
> [1] I concede that people who refuse on ideological grounds to use
> proprietary software will find this step insurmountable. Perhaps we
> could come up with a workaround for such people.

I'm one of those ideological people and I don't see an
acceptable workaround.  GitHub already has misfeatures designed
to lock people in into centralized messaging:

* pull request feature doesn't work for self-hosted repos
  (this disincentivizes people from running and improving
   git-daemon/git-http-backend/etc...)

* "noreply" email addresses

* @-mentions you wrote about

* custom email notifications

This is a problem with Gitlab, Redmine, etc, too:
they cannot interoperate with each other.

At least for now, large proprietary mail providers like Gmail
still interoperate with whatever Free Software SMTP software I
run.  I dread the day when that is no longer true.

Some of these problems I hope public-inbox (or something like
it) can fix and turn the tide towards email, again.  In
contrast, public-inbox is designed to push decentralization:

* "reply" links are instructions for "git send-email" which
  encourage reply-to-all (this applies to what Jeff said
  about vger going down, I noticed it, too)

* anybody can clone the code + repo, replicate the
  instances, and tweak it to their needs.

* public-inbox.org/git/$MESSAGE_ID/t.atom allows subscriptions
  to Atom feeds without any registration or user-tracking

* Message-IDs are exposed for proper threading and interop

* low-bandwidth, Tor-friendly design to encourage deployments
  even behind NATs and firewalls.

Anyways, my optimistic side might interpret your advocacy as
GitHub already feeling threatened by public-inbox.  I certainly
wouldn't expect it at this stage, but I certainly hope it will
be the case one day :)


Disclaimer: I've always been willing to risk a lifetime of
unemployment for ideology.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] document git-repack interaction of pack.threads and pack.windowMemory

2016-08-09 Thread Junio C Hamano
Michael Stahl  writes:

> Signed-off-by: Michael Stahl 
> ---
>  Documentation/git-pack-objects.txt | 2 +-
>  Documentation/git-repack.txt   | 6 +-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-pack-objects.txt 
> b/Documentation/git-pack-objects.txt
> index 19cdcd0..0b655a5 100644
> --- a/Documentation/git-pack-objects.txt
> +++ b/Documentation/git-pack-objects.txt
> @@ -105,7 +105,7 @@ base-name::
>   advantage of the large window for the smaller objects.  The
>   size can be suffixed with "k", "m", or "g".
>   `--window-memory=0` makes memory usage unlimited, which is the
> - default.
> + default, unless the config variable `pack.windowMemory` is set.

That does not go far enough as it does not say what happens when the
variable is set (i.e. "the variable serves as the default in that
case").

Perhaps "...makes memory usage unlimited.  The default is taken from
the `pack.windowMemory` configuration variable." would make it better,
as it is described that its default is "unlimited".

> diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
> index c597523..300455b 100644
> --- a/Documentation/git-repack.txt
> +++ b/Documentation/git-repack.txt
> @@ -101,7 +101,11 @@ other objects in that pack they already have locally.
>   advantage of the large window for the smaller objects.  The
>   size can be suffixed with "k", "m", or "g".
>   `--window-memory=0` makes memory usage unlimited, which is the
> - default.
> + default, unless the config variable `pack.windowMemory` is set.
> + Note that the actual memory usage will be multiplied
> + by the number of threads used by linkgit:git-pack-objects[1],
> + which is lacking a corresponding git-repack flag but can be
> + set via the config variable `pack.threads`.

I think the same comment applies to the first line.  The next two
lines are definite improvement, but I do not think the last two
lines are necessary.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 6/9] clone: implement optional references

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 10:54 AM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> If I am reading the implementation of real_path_internal()
>> correctly, the most relevant reason that an "if-able" repository
>> cannot be used causes real_path_if_valid() to return NULL.
>
> Oops, too many proofreading and rephrasing.  Please insert "I do not
> think that" before "the most relevant".

Side note:
  It took me a while to mentally process and reread this.
  It would have been easier with just
  s/the most relevant reason/I do not think that the most relevant reason/
  but that is just me being used to s/ expressions by now.

I think our emails nearly crossed. In the other thread I wondered if
you had different expectations on the -if-able part than I do.

For the expected use case I do expect that the missing path is the
most relevant thing (the submodule is not checked out).

Of course the submodule may be cloned shallow as it was recommended
in the .gitmodules file, so we should also care about that use case.


> Let's say your superproject borrows from ~/w/super, whose submodule
> repositories (if cloned) are directly in "~/w/super/.git/modules/".
> When you clone a submodule X for your superproject, you allow it to
> borrow from "~/w/super/.git/modules/X" if there is one available.

which is why we need to pass in /X/ with an ending slash.
See the comment in 8/9.

+   /*
+* We need to end the new path with '/' to mark it as a dir,
+* otherwise a submodule name containing '/' will be broken
+* as the last part of a missing submodule reference would
+* be taken as a file name.
+*/
+   strbuf_addf(&sb, "modules/%s/", sas->submodule_name);

Thinking about that we may want to not rely on such a hack,
but make it clearer.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Spelling fixes

2016-08-09 Thread Junio C Hamano
Ville Skyttä  writes:

> Signed-off-by: Ville Skyttä 
> ---
>  Documentation/RelNotes/2.3.10.txt   | 2 +-
>  Documentation/RelNotes/2.4.10.txt   | 2 +-
>  Documentation/RelNotes/2.5.4.txt| 2 +-
>  Documentation/RelNotes/2.6.1.txt| 2 +-
>  Documentation/git-remote-fd.txt | 2 +-
>  Documentation/gitattributes.txt | 2 +-
>  Documentation/gitmodules.txt| 2 +-
>  contrib/hooks/multimail/README  | 4 ++--
>  contrib/mw-to-git/.perlcriticrc | 2 +-
>  contrib/mw-to-git/git-remote-mediawiki.perl | 2 +-
>  contrib/subtree/t/t7900-subtree.sh  | 2 +-
>  git-p4.py   | 2 +-
>  sha1_file.c | 2 +-
>  t/README| 2 +-
>  t/t1006-cat-file.sh | 2 +-
>  t/t3101-ls-tree-dirname.sh  | 2 +-
>  t/t6018-rev-list-glob.sh| 2 +-
>  t/t6030-bisect-porcelain.sh | 2 +-
>  t/t7001-mv.sh   | 8 
>  t/t7810-grep.sh | 2 +-
>  t/t9401-git-cvsserver-crlf.sh   | 2 +-
>  upload-pack.c   | 2 +-
>  22 files changed, 26 insertions(+), 26 deletions(-)

Wow, that's a lot of typos.  I asked "git show --word-diff" what got
changed, which told me this:

 

accidentlyaccidentally
commited  committed
dependancydependency
emtpy empty
existance existence
explicitely   explicitly
git-upload-achive git-upload-archive
hierachy  hierarchy
intialinitial
mulitple  multiple
non-existant  non-existent
precendence.  precedence.
priviledged   privileged
programatically   programmatically
psuedo-binary pseudo-binary
soemwhere somewhere
successfull   successful
unkownunknown
usefull   useful
writting  writing

and then looked at all lines in the patch.  They all looked
reasonable.

There are two "commited" you seem to have missed, though,

t/t3420-rebase-autostash.sh:echo uncommited-content >file0 &&
t/t3420-rebase-autostash.sh:echo uncommited-content >expected &&

which I'll queue on top of this patch to be later squashed (i.e. no
need to resend the whole thing only to add these two).

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 10:47 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> "you did ask me to use alternates once and for all when setting up the
>> superproject: now for this added submodule I don't find the alternate;
>> That is strange?"
>
> Absolutely.  I do not think you should expect a user to remember if
> s/he used alternates when getting a copy of the superproject long
> time ago.  Giving "info: using from ..." is good; giving "warning:
> not using from ..." is probably irritating, I would think.
>
> Stepping back a bit, when the user says --reference-if-able, does it
> warrant any warning either way?

Stepping another step back, let's discuss the expectations of the
new option "--reference-if-able".

The way I understood and implemented it is

here is a path, try to use it as an alternate; if that is not
an alternate, it's fine too; maybe warn about it, but carry
on with the operation.

With such a semantics you can just add the --reference-if-able
to the submodules that try to borrow from the other superprojects
submodule.

I expected the "--reference-if-able" not necessarily be
used by the end user. It is a convenient way for our scripted
use, as the -if-able is just a check if the path exists and nothing
else.

We could check if the git dir exists inthe helper for the
submodule command, such that we only pass --reference
as we are certain the alternate exists; we could have a switch
in the helper --on-missing-submodule-alternate=[die,info,silent]

The way you write this response I think you have a slightly
different understanding of what the -if-able ought to do?

When --reference is given, only the superproject should
borrow and the -if-able, may allow submodules to also borrow?

> I.e. "we made it borrow from there,
> so be careful not to trash that one" may be just as warning-worthy
> (if not even more) as "you said we can borrow from there if there is
> anything to borrow, but it turns out there isn't any, so the result
> is complete stand-alone."  It feels as if we can go without any
> warning at least from "git clone --reference-if-able", unless "-v"
> option is given.

But when git clone is not issueing a warning/info, who is responsible for
that? As you noted the superproject may be setup some time ago and
the user forgot they used references and want to use references for this
new submodule. So the helper would need to do that?

>
> I have a very strong opinion what should happen when the end-user
> facing command is "git submodule", but if I have to choose, I would
> prefer to see "git submodule update" tell what it did with "info:"
> either case, i.e. "info: borrowing from ... because the superproject
> borrows from there", and "info: not borrowing from ... even though
> the superproject borrows from there".

I see. This way the user is most informed. The current implementation
of "submodule update --init" for start from scratch yields for example:

Submodule 'foo' () registered for path 'foo'
Submodule 'hooks' () registered for path 'hooks'
Cloning into '/home/sbeller/super/foo'...
Cloning into '/home/sbeller/super/hooks'...
warning: Not using proposed alternate
/home/sbeller/super-reference/.git/modules/hooks/
Submodule path 'foo': checked out '7b41f3a413b46140b050ae5324cbbcdd467d2b3a'
Submodule path 'hooks': checked out '3acc14d10d26678eae6489038fe0d4dad644a9b4'

so before this series we had 3 lines per submodule, and with this series
we get a 4th line for the unusual case.

You would prefer to see always 4 lines per submodule?
Is one extra line (25% more output) a reasonable tradeoff for the
reference feature?

I dunno; I guess we could argue either way.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 7/7] pack-objects: use mru list when iterating over packs

2016-08-09 Thread Junio C Hamano
Jeff King  writes:

> I ran the repack again with stock git (no MRU patch), and the number of
> objects in the delta phase jumped to 3087880, around 56,000 more than
> with the MRU patch. So that's probably where the magic "3%" is coming
> from.  By looking at the smaller packs first, we are more likely to find
> copies of objects which _aren't_ deltas, and then consider them for
> delta compression. And that compression tends to find a better match
> than reusing what was in the big pack, and we end up with a slightly
> smaller output pack.

Yeah, that is a very plausible explanation.

> So why are the deltas we find so much better than what is in the big
> pack?
>
> There's something rather subtle going on, and it has to do with the fact
> that the existing pack was _not_ made by a stock version of git.

;-)

> I may have mentioned before that I have some patches which restrict the
> delta selection process. The goal is that if you have several "islands"
> of refs (in my case, refs corresponding to particular forks of a
> repository), it's desirable that you don't make a delta against a base
> that is not reachable from all of the same islands. 

That also explains it.  It is expected (small) price to pay to
ensure the islands are standalone.

> I also suspect that one of the tricks I tried, simply reversing the pack
> order (so the big pack is at the front, but the order is stable) would
> work very well for this case. But as I mentioned before, I prefer the
> MRU strategy because it's less susceptible to pathological cases.

Yup, excellent.  Thanks for working on this.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Junio C Hamano
Junio C Hamano  writes:

> Stefan Beller  writes:
>
>> "you did ask me to use alternates once and for all when setting up the
>> superproject: now for this added submodule I don't find the alternate;
>> That is strange?"
>
> Absolutely.  I do not think you should expect a user to remember if
> s/he used alternates when getting a copy of the superproject long
> time ago.  Giving "info: using from ..." is good; giving "warning:
> not using from ..." is probably irritating, I would think.
>
> Stepping back a bit, when the user says --reference-if-able, does it
> warrant any warning either way?  I.e. "we made it borrow from there,
> so be careful not to trash that one" may be just as warning-worthy
> (if not even more) as "you said we can borrow from there if there is
> anything to borrow, but it turns out there isn't any, so the result
> is complete stand-alone."  It feels as if we can go without any
> warning at least from "git clone --reference-if-able", unless "-v"
> option is given.
>
> I have a very strong opinion what should happen when the end-user

Too many proof-reading and rephrasing.  "I DON'T have a very strong
opinion" is what I wanted to say.  Sorry for the lack of last-time
proof-reading.

> facing command is "git submodule", but if I have to choose, I would
> prefer to see "git submodule update" tell what it did with "info:"
> either case, i.e. "info: borrowing from ... because the superproject
> borrows from there", and "info: not borrowing from ... even though
> the superproject borrows from there".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 6/9] clone: implement optional references

2016-08-09 Thread Junio C Hamano
Junio C Hamano  writes:

> If I am reading the implementation of real_path_internal()
> correctly, the most relevant reason that an "if-able" repository
> cannot be used causes real_path_if_valid() to return NULL.

Oops, too many proofreading and rephrasing.  Please insert "I do not
think that" before "the most relevant".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 10:34:11AM -0700, Junio C Hamano wrote:

> >The threading in GitHub comments and pull requests is also not great.
> >Each PR or issue is its own thread, but it's totally flat inside.
> >There are no sub-threads to organize discussion, and it's sometimes
> >hard to see what people are replying to.
> 
> It may be a good UI that is optimized for drive-by contributors.  It
> is just that it is not very well suited (compared to mailing list
> discussions) to conduct high-volume exchange of ideas and changes
> efficiently.

I think that's something to ponder; can we have a workflow where
drive-by contributors can use something that has a lower learning/setup
curve, but long-term contributors might opt for something more powerful?

I think SubmitGit is a step in that direction. It does still require
switching to the mailing list for subsequent conversation, though. It
would be interesting to see something like SubmitGit that puts its own
email in the "From", and that processes email replies into PR comments,
and then subsequent PR comments into emails (i.e., part of my "dream tool"
from earlier). It's not clear to me whether the result would just end up
being irritating for both sides to use (because it doesn't _quite_
conform to the norms of each format). But it would be fun to find out.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> "you did ask me to use alternates once and for all when setting up the
> superproject: now for this added submodule I don't find the alternate;
> That is strange?"

Absolutely.  I do not think you should expect a user to remember if
s/he used alternates when getting a copy of the superproject long
time ago.  Giving "info: using from ..." is good; giving "warning:
not using from ..." is probably irritating, I would think.

Stepping back a bit, when the user says --reference-if-able, does it
warrant any warning either way?  I.e. "we made it borrow from there,
so be careful not to trash that one" may be just as warning-worthy
(if not even more) as "you said we can borrow from there if there is
anything to borrow, but it turns out there isn't any, so the result
is complete stand-alone."  It feels as if we can go without any
warning at least from "git clone --reference-if-able", unless "-v"
option is given.

I have a very strong opinion what should happen when the end-user
facing command is "git submodule", but if I have to choose, I would
prefer to see "git submodule update" tell what it did with "info:"
either case, i.e. "info: borrowing from ... because the superproject
borrows from there", and "info: not borrowing from ... even though
the superproject borrows from there".


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 7/7] pack-objects: use mru list when iterating over packs

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 10:04:11AM -0400, Jeff King wrote:

> Here's the code to do the cycle-breaking.
> [...]
> It seems to perform well, and it does break the cycles (the later check
> during the write does not kick in, and we get no warnings). I didn't dig
> into the fates of specific objects, but any cycles should be added to
> the delta-compression phase.

Curious about that last part, I did the repack both with and without
this patch. We should be able to see the difference in the progress
meter from the "Compressing" phase. And indeed, it goes from 3030778
objects to 3031056, an increase of 278 objects. Which just so happens to
be exactly the number of "warning: recursive delta detected" lines that
are printed without the patch.

Not surprising, but a nice confirmation that everything is working as
expected.  But while looking at this, I managed to unravel what's been
going on this whole time.

I ran the repack again with stock git (no MRU patch), and the number of
objects in the delta phase jumped to 3087880, around 56,000 more than
with the MRU patch. So that's probably where the magic "3%" is coming
from.  By looking at the smaller packs first, we are more likely to find
copies of objects which _aren't_ deltas, and then consider them for
delta compression. And that compression tends to find a better match
than reusing what was in the big pack, and we end up with a slightly
smaller output pack.

So why are the deltas we find so much better than what is in the big
pack?

There's something rather subtle going on, and it has to do with the fact
that the existing pack was _not_ made by a stock version of git.

I may have mentioned before that I have some patches which restrict the
delta selection process. The goal is that if you have several "islands"
of refs (in my case, refs corresponding to particular forks of a
repository), it's desirable that you don't make a delta against a base
that is not reachable from all of the same islands. Otherwise, when
somebody fetches or clones the fork, you have to throw away the delta
and find one from scratch. If this happens a lot, it gets expensive.
Unsurprisingly, the packfiles produced with the island-restrictions
turned on are slightly worse than normal packfiles (because we reject
some delta possibilities that might be better).

When I did my testing, I did not enable the delta islands, because I
didn't want them to influence the results. But of course their effect
could still be seen in the deltas in the existing packfile, because it
came off our production servers (because I wanted a real-world test
case). So any pack ordering which ends up reusing _fewer_ of those
deltas is going to be a net win in size, because it's going to find new
deltas without being hampered by the island restrictions. And that's
exactly what happened in my tests; the stock pack ordering finds more
objects in the newer packs, has fewer deltas to reuse, and is able to
find better unrestricted deltas.

Unfortunately I don't have a real-world test case that's been packed up
until now with stock git. So the next-best thing I can do is re-run my
tests with the island restrictions turned on, so that newly-found deltas
do not have an unfair advantage. And here are the sizes for each case
after doing so:

  7848512375 
before.git/objects/pack/pack-2f5878f6569cd6682e7084318de311ed26d19af5.pack
  7851778788 
mru.git/objects/pack/pack-7694766c660eaa0e27e4e51a77bd5c457c3d2f1d.pack
  7851165480 
cycle-break.git/objects/pack/pack-cfaceb3423993c72e4ac588c30c94da2d087145a.pack

The MRU case _is_ slightly bigger, but only by 0.04%. And then adding
the cycle-breaking on top of that reclaims a little bit of space,
because rather than throwing out those 287 deltas at the write phase,
it's able to add those objects back to the delta search and find new
deltas for them.

So there. Mystery solved, and I feel confident in saying that the MRU
patches _don't_ represent a real regression in pack size. We do need to
deal with the cycles, of course, but it's less an issue of size and more
one of correctness (well, the pack is correct either way, but still, it
seems more correct to realize during the correct phase that we cannot
reuse those deltas).

I also suspect that one of the tricks I tried, simply reversing the pack
order (so the big pack is at the front, but the order is stable) would
work very well for this case. But as I mentioned before, I prefer the
MRU strategy because it's less susceptible to pathological cases.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Junio C Hamano
Jeff King  writes all what I wanted to say, and a lot
more, so I don't have to say much ;-)

>  - I really like the flow of having conversations next to patches. I can
>look at the index of the mailing list folder and see what people are
>talking about, how big the threads are, etc, at a glance. Moving
>between messages and threads involve single keystrokes.
>
>Similarly, having local storage is _fast_. I think GitHub is fine for
>a web app. But when I'm reading a high-volume mailing list, I really
>want to flip around quickly. If there's even 500ms to get to the next
>message or thread, it feels clunky and slow to me. Obviously I spend
>more than 500ms _reading_ most messages (though for some I see the
>first paragraph and immediately jump away). It's just the latency
>when I've decided I'm done with one thing and want to move to the
>next.

Viewing threads in a threaded mail client to help prioritizing
various topics being discussed is what I value the most and I am
not sure how I can be as efficient with the pull-request page.

>The threading in GitHub comments and pull requests is also not great.
>Each PR or issue is its own thread, but it's totally flat inside.
>There are no sub-threads to organize discussion, and it's sometimes
>hard to see what people are replying to.

It may be a good UI that is optimized for drive-by contributors.  It
is just that it is not very well suited (compared to mailing list
discussions) to conduct high-volume exchange of ideas and changes
efficiently.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 1/9] t7408: modernize style

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> On Tue, Aug 9, 2016 at 8:51 AM, Junio C Hamano  wrote:
>> Junio C Hamano  writes:
>>
>>> Eric Sunshine  writes:
>>>
>
> I originally thought that it may be easier to have 2 patches.
> This first one is very gentle and "obviously correct" as it only changes
> formatting and drops the directory changes.
>
> The second patch goes for renaming ans subtle style issues,
> combining some tests, so it is more likely to break.
>
> After this review, I consider using just one patch and do it all
> at once to not confuse the readers as otherwise I should reword
> the commit message with the rationale of doing it in two patches.

FWIW, I would think your split between 1/ and 2/ were very sensible,
and have a slight preference for keeping them separate.

If you already have squashed, I do not insist you to split it again;
it is not a big deal either way.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 1/9] t7408: modernize style

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 8:51 AM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> Eric Sunshine  writes:
>>

I originally thought that it may be easier to have 2 patches.
This first one is very gentle and "obviously correct" as it only changes
formatting and drops the directory changes.

The second patch goes for renaming ans subtle style issues,
combining some tests, so it is more likely to break.

After this review, I consider using just one patch and do it all
at once to not confuse the readers as otherwise I should reword
the commit message with the rationale of doing it in two patches.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 8:49 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> v3:
>>
>> Thanks to Junios critial questions regarding the design, I took a step back
>> to look at the bigger picture.
>>
>> --super-reference sounds confusing. (what is the super referring to?)
>> So drop that approach.
>>
>> Instead we'll compute where the reference might be in the superproject scope
>> and ask the submodule clone operation to consider an optional reference.
>> If the referenced alternate is not there, we'll just warn about it and
>> carry on.
>
> I like the general direction, but I hope that the warning comes only
> when the user said "--reference" on the command line (i.e. "you
> asked me to use reference, but for this submodule I couldn't find a
> usable one").

No it comes all the time now, as there is no difference between
"clone --recursive" and later running "submodule update" manually.

Assume a new submodule is added and you forget to update your
main alternate all your other clones are borrowing from. If there is
no warning you create a real clone for the new submodule all the time,
which is what you wanted to avoid in the first place?

>  If the implementation allows the same mechanism to
> help later "submodule init && submodule update" borrow from the
> submodule repositories of the superproject the current superproject
> borrows from (i.e. no explicit "--reference" on the command line
> when doing init/update), I would think the case that needs warning
> is "you didn't explicitly ask me to borrow objects, but I found one
> we could, so I did it anyway without being asked", and it is not a
> warning-worthy condition if we didn't find a cloned submodule in the
> repository our superproject borrows from.

"you did ask me to use alternates once and for all when setting up the
superproject: now for this added submodule I don't find the alternate;
That is strange?"

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 2/9] t7408: merge short tests, factor out testing method

2016-08-09 Thread Stefan Beller
On Tue, Aug 9, 2016 at 9:41 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> +# The tests up to this point, and repositories created by them
>> +# (A, B, super and super/sub), are about setting up the stage
>> +# forsubsequent tests and meant to be kept throughout the
>
> s/forsub/for sub/;

ok.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Junio C Hamano
Johannes Schindelin  writes:

> On Mon, 8 Aug 2016, Junio C Hamano wrote:
>
>> Unless a patch is about an area you are super familiar with so that you
>> know what is beyond the context of the patch to be able to judge if the
>> change is good in the context of the file being touched, it is always
>> hard to review from inside a mail reader.
>> 
>> Running "git am" is a good first step to review such a patch, as that
>> lets you view the resulting code with the full power of Git.  As you
>> gain experience on the codebase, you'll be able to spot more problems
>> while in your mail reader.
>
> I am glad that you agree that the requirement to manually transform the
> patches back into Git (where they had been originally to begin with) is
> cumbersome. This is the first time that I see you admit it ;-)

I was about to apologize for writing a statement that can be
misread, but I do not think what I wrote can be misinterpreted, even
if a reader deliberately tries to twist the words s/he reads, to
lead to such a conclusion, so I won't.

I merely said that reviewing a change in an unfamiliar area is
harder (not "cumbersome", but "needs understanding first") with a
patch, and it is easier to see changes in context by applying (which
is an easy, not "cumbersome", process).

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] commit-slab.h: avoid duplicated global static variables

2016-08-09 Thread Junio C Hamano
Johannes Sixt  writes:

>   * like function declarations.  I.e., the expansion of
>   *
>   *define_commit_slab(indegree, int);
>   *
> - * ends in 'static int stat_indegreerealloc;'.  This would otherwise
> + * ends in 'struct indegree;'.  This would otherwise
>   * be a syntax error according (at least) to ISO C.  It's hard to
>   * catch because GCC silently parses it by default.
>   */

I briefly wondered what "otherwise" and "it" in "silently parses it"
refer to; it is having a hanging semicolon after the definition of
the *_peek() function, i.e.

...
static int *indegree_peek(...)
{
return indegree_at_peek(s, c, 0);
}
;

so all is fine.

Needless to say, using "struct slabname;" is indeed a much nicer
solution you found to the original issue.  Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] pack-objects: Teach --use-bitmap-index codepath to respect --local, --honor-pack-keep and --incremental

2016-08-09 Thread Junio C Hamano
Kirill Smelkov  writes:

> Would you please explain why we should not use touch if we do not care
> about timestamps? Simply style?

To help readers.

"touch A" forcess the readers wonder "does the timestamp of A
matter, and if so in what way?" and "does any later test care what
is _in_ A, and if so in what way?"  Both of them is wasting their
time when there is no reason why "touch" should have been used. 

> diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> index cce95d8..44914ac 100755
> --- a/t/t5310-pack-bitmaps.sh
> +++ b/t/t5310-pack-bitmaps.sh
> @@ -8,16 +8,15 @@ objpath () {
>  }
>  
>  # show objects present in pack ($1 should be associated *.idx)
> -packobjects () {
> - git show-index <$1 | cut -d' ' -f2
> +pack_list_objects () {
> + git show-index <"$1" | cut -d' ' -f2
>  }

pack-list-objects still sounds as if you are packing "list objects",
though.  If you are listing packed objects (or objects in a pack),
list-packed-objects (or list-objects-in-pack) reads clearer and more
to the point, at least to me.

> -# hasany pattern-file content-file
> +# has_any pattern-file content-file
>  # tests whether content-file has any entry from pattern-file with entries 
> being
>  # whole lines.
> -hasany () {
> - # NOTE `grep -f` is not portable
> - git grep --no-index -qFf $1 $2
> +has_any () {
> + grep -qFf "$1" "$2"

Omitting "-q" would help those who have to debug breakage in this
test or the code that this test checks.  What test_expect_success
outputs is not shown by default, and running the test script with
"-v" would show them as a debugging aid.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 2/9] t7408: merge short tests, factor out testing method

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> +# The tests up to this point, and repositories created by them
> +# (A, B, super and super/sub), are about setting up the stage
> +# forsubsequent tests and meant to be kept throughout the

s/forsub/for sub/;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 9/9] submodule--helper: use parallel processor correctly.

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> When implementing the die() in 665b35ecc (2016-06-09, "submodule--helper:
> initial clone learns retry logic"), I considered this condition to be
> a severe condition, which should lead to an immediate abort as we do not
> trust ourselves any more. However the queued messages in `err` are valuable
> so let's not toss them out by immediately dieing, but a graceful return.

I think you'll be rerolling this series at least once (if only to
correct for 6/9), so perhaps split this fix into a preparatory fix
that can go earlier to 'next' and further that the remainder of the
series depend on?

>
> Another thing to note: The error message itself was missleading. A return
> value of 0 doesn't indicate the passed in `ce` is not a submodule any more,
> but just that we do not consider cloning it any more.
>
> Signed-off-by: Stefan Beller 
> ---
>  builtin/submodule--helper.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index fc14843..3e40f99 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -815,8 +815,12 @@ static int update_clone_get_next_task(struct 
> child_process *child,
>   if (index < suc->failed_clones_nr) {
>   int *p;
>   ce = suc->failed_clones[index];
> - if (!prepare_to_clone_next_submodule(ce, child, suc, err))
> - die("BUG: ce was a submodule before?");
> + if (!prepare_to_clone_next_submodule(ce, child, suc, err)) {
> + suc->current ++;

s/current /current/;

> + strbuf_addf(err, "BUG: submodule considered for 
> cloning,"
> + "doesn't need cloning any more?\n");
> + return 0;
> + }
>   p = xmalloc(sizeof(*p));
>   *p = suc->current;
>   *idx_task_cb = p;
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 6/9] clone: implement optional references

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> @@ -283,11 +286,22 @@ static void strip_trailing_slashes(char *dir)
>  static int add_one_reference(struct string_list_item *item, void *cb_data)
>  {
>   char *ref_git;
> - const char *repo;
> + const char *repo, *ref_git_s;
> + int *required = cb_data;
>   struct strbuf alternate = STRBUF_INIT;
>  
> - /* Beware: read_gitfile(), real_path() and mkpath() return static 
> buffer */
> - ref_git = xstrdup(real_path(item->string));
> + ref_git_s = *required ?
> + real_path(item->string) :
> + real_path_if_valid(item->string);
> + if (!ref_git_s) {
> + warning(_("Not using proposed alternate %s"), item->string);

If I am reading the implementation of real_path_internal()
correctly, the most relevant reason that an "if-able" repository
cannot be used causes real_path_if_valid() to return NULL.

Let's say your superproject borrows from ~/w/super, whose submodule
repositories (if cloned) are directly in "~/w/super/.git/modules/".
When you clone a submodule X for your superproject, you allow it to
borrow from "~/w/super/.git/modules/X" if there is one available.

I think both real_path() and real_path_if_valid() happily returns
the real path to "~/w/super/.git/modules/" with "X" concatenated at
the end, as long as that 'modules' directory exists, even when there
is no X inside.

Using if_valid() is still necessary to avoid a totally bogus path to
cause real_path() to barf.  I am just saying that this warning is
not so interesting, and a real check, "do we have a repository
there?  If not, don't add it as an alternate" must be somewhere
else.

> + return 0;
> + } else
> + /*
> +  * Beware: read_gitfile(), real_path() and mkpath()
> +  * return static buffer
> +  */
> + ref_git = xstrdup(ref_git_s);
>  
>   repo = read_gitfile(ref_git);
>   if (!repo)
> @@ -304,7 +318,8 @@ static int add_one_reference(struct string_list_item 
> *item, void *cb_data)
>   } else if (!is_directory(mkpath("%s/objects", ref_git))) {
>   struct strbuf sb = STRBUF_INIT;
>   if (get_common_dir(&sb, ref_git))
> - die(_("reference repository '%s' as a linked checkout 
> is not supported yet."),
> + die(_("reference repository '%s' as a "
> +   "linked checkout is not supported yet."),
>   item->string);

And curiously, this is not such a check.  This is an unrelated change.

>   die(_("reference repository '%s' is not a local repository."),
>   item->string);

You need to arrange this die() not to trigger when you are asked to
borrow from there if there is one to borrow from.  I see a few other
die() following this check to avoid using shallow repositories and
repositories with grafts, but I think they all should turn into a
"Nah, this is unusable, and I was asked to borrow _only_ if the
repository is usable, so I won't use it."

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 1/9] t7408: modernize style

2016-08-09 Thread Junio C Hamano
Junio C Hamano  writes:

> Eric Sunshine  writes:
>
>> On Tue, Aug 9, 2016 at 12:08 AM, Stefan Beller  wrote:
>>> t7408: modernize style
>>>
>>> Signed-off-by: Stefan Beller 
>>> ---
>>> diff --git a/t/t7408-submodule-reference.sh b/t/t7408-submodule-reference.sh
>>> @@ -8,74 +8,76 @@ test_description='test clone --reference'
>>> +test_expect_success 'that reference gets used with add' '
>>> +   (
>>> +   cd super/sub &&
>>> +   echo "0 objects, 0 kilobytes" > expected &&
>>
>> Since you're modernizing the style, perhaps drop the space after '>'
>> here and elsewhere.
>>
>>> +   git count-objects > current &&
>>> +   diff expected current
>>
>> Modern practice is to call this 'actual' rather than 'current', but
>> perhaps that's outside the scope of this patch(?).
>
> Not just that, modern practice is to use "test_cmp" not "diff".

Sorry, hit the "send" too soon, before writing "But all of that
comes in the next step."

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv3 1/9] t7408: modernize style

2016-08-09 Thread Junio C Hamano
Eric Sunshine  writes:

> On Tue, Aug 9, 2016 at 12:08 AM, Stefan Beller  wrote:
>> t7408: modernize style
>>
>> Signed-off-by: Stefan Beller 
>> ---
>> diff --git a/t/t7408-submodule-reference.sh b/t/t7408-submodule-reference.sh
>> @@ -8,74 +8,76 @@ test_description='test clone --reference'
>> +test_expect_success 'that reference gets used with add' '
>> +   (
>> +   cd super/sub &&
>> +   echo "0 objects, 0 kilobytes" > expected &&
>
> Since you're modernizing the style, perhaps drop the space after '>'
> here and elsewhere.
>
>> +   git count-objects > current &&
>> +   diff expected current
>
> Modern practice is to call this 'actual' rather than 'current', but
> perhaps that's outside the scope of this patch(?).

Not just that, modern practice is to use "test_cmp" not "diff".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 0/6] git clone: Marry --recursive and --reference

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> v3:
>
> Thanks to Junios critial questions regarding the design, I took a step back
> to look at the bigger picture. 
>
> --super-reference sounds confusing. (what is the super referring to?)
> So drop that approach.
>
> Instead we'll compute where the reference might be in the superproject scope
> and ask the submodule clone operation to consider an optional reference.
> If the referenced alternate is not there, we'll just warn about it and
> carry on.

I like the general direction, but I hope that the warning comes only
when the user said "--reference" on the command line (i.e. "you
asked me to use reference, but for this submodule I couldn't find a
usable one").  If the implementation allows the same mechanism to
help later "submodule init && submodule update" borrow from the
submodule repositories of the superproject the current superproject
borrows from (i.e. no explicit "--reference" on the command line
when doing init/update), I would think the case that needs warning
is "you didn't explicitly ask me to borrow objects, but I found one
we could, so I did it anyway without being asked", and it is not a
warning-worthy condition if we didn't find a cloned submodule in the
repository our superproject borrows from.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Aug 2016, #03; Mon, 8)

2016-08-09 Thread Junio C Hamano
Stefan Beller  writes:

> On Mon, Aug 8, 2016 at 3:32 PM, Junio C Hamano  wrote:
>>
>> * sb/submodule-clone-rr (2016-08-06) 6 commits
>>  - clone: reference flag is used for submodules as well
>>  - submodule update: add super-reference flag
>>  - submodule--helper update-clone: allow multiple references
>>  - submodule--helper module-clone: allow multiple references
>>  - t7408: merge short tests, factor out testing method
>>  - t7408: modernize style
>>
>>  Waiting for review discussion to settle.
>>  cf. <20160806012318.17968-1-sbel...@google.com>
>
> I will reroll today with a totally different approach.

OK.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 01/40] apply: make some names more specific

2016-08-09 Thread stefan.naewe
Am 08.08.2016 um 23:02 schrieb Christian Couder:
> To prepare for some structs and constants being moved from
> builtin/apply.c to apply.h, we should give them some more
> specific names to avoid possible name collisions in th global

s/th/the/

> namespace.
> 
> Signed-off-by: Christian Couder 
> ---
>  builtin/apply.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/builtin/apply.c b/builtin/apply.c
> index 1a488f9..ab8f0bd 100644
> --- a/builtin/apply.c
> +++ b/builtin/apply.c
> @@ -21,7 +21,7 @@
>  #include "ll-merge.h"
>  #include "rerere.h"
>  
> -enum ws_error_action {
> +enum apply_ws_error_action {
>   nowarn_ws_error,
>   warn_on_ws_error,
>   die_on_ws_error,
> @@ -29,7 +29,7 @@ enum ws_error_action {
>  };
>  
>  
> -enum ws_ignore {
> +enum apply_ws_ignore {
>   ignore_ws_none,
>   ignore_ws_change
>  };
> @@ -45,8 +45,8 @@ enum ws_ignore {
>   * See also "struct string_list symlink_changes" in "struct
>   * apply_state".
>   */
> -#define SYMLINK_GOES_AWAY 01
> -#define SYMLINK_IN_RESULT 02
> +#define APPLY_SYMLINK_GOES_AWAY 01
> +#define APPLY_SYMLINK_IN_RESULT 02
>  
>  struct apply_state {
>   const char *prefix;
> @@ -110,8 +110,8 @@ struct apply_state {
>   struct string_list fn_table;
>  
>   /* These control whitespace errors */
> - enum ws_error_action ws_error_action;
> - enum ws_ignore ws_ignore_action;
> + enum apply_ws_error_action ws_error_action;
> + enum apply_ws_ignore ws_ignore_action;
>   const char *whitespace_option;
>   int whitespace_error;
>   int squelch_whitespace_errors;
> @@ -3750,11 +3750,11 @@ static void prepare_symlink_changes(struct 
> apply_state *state, struct patch *pat
>   if ((patch->old_name && S_ISLNK(patch->old_mode)) &&
>   (patch->is_rename || patch->is_delete))
>   /* the symlink at patch->old_name is removed */
> - register_symlink_changes(state, patch->old_name, 
> SYMLINK_GOES_AWAY);
> + register_symlink_changes(state, patch->old_name, 
> APPLY_SYMLINK_GOES_AWAY);
>  
>   if (patch->new_name && S_ISLNK(patch->new_mode))
>   /* the symlink at patch->new_name is created or remains 
> */
> - register_symlink_changes(state, patch->new_name, 
> SYMLINK_IN_RESULT);
> + register_symlink_changes(state, patch->new_name, 
> APPLY_SYMLINK_IN_RESULT);
>   }
>  }
>  
> @@ -3769,9 +3769,9 @@ static int path_is_beyond_symlink_1(struct apply_state 
> *state, struct strbuf *na
>   break;
>   name->buf[name->len] = '\0';
>   change = check_symlink_changes(state, name->buf);
> - if (change & SYMLINK_IN_RESULT)
> + if (change & APPLY_SYMLINK_IN_RESULT)
>   return 1;
> - if (change & SYMLINK_GOES_AWAY)
> + if (change & APPLY_SYMLINK_GOES_AWAY)
>   /*
>* This cannot be "return 0", because we may
>* see a new one created at a higher level.
> 

Stefan
-- 

/dev/random says: Don't be so humble, you're not that great. -Golda Meir
python -c "print 
'73746566616e2e6e616577654061746c61732d656c656b74726f6e696b2e636f6d'.decode('hex')"
 
GPG Key fingerprint = 2DF5 E01B 09C3 7501 BCA9  9666 829B 49C5 9221 27AF

Re: [PATCH v10 28/40] builtin/apply: rename option parsing functions

2016-08-09 Thread stefan.naewe
Am 08.08.2016 um 23:03 schrieb Christian Couder:
> As these functions are going to be part of the libified
> apply api, let's give them a name that is more specific
> to the apply api.

s/api/API/

> 
> Signed-off-by: Christian Couder 
> ---
>  builtin/apply.c | 40 
>  1 file changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/builtin/apply.c b/builtin/apply.c
> index ad403f8..429fe44 100644
> --- a/builtin/apply.c
> +++ b/builtin/apply.c
> @@ -4588,16 +4588,16 @@ static int apply_patch(struct apply_state *state,
>   return res;
>  }
>  
> -static int option_parse_exclude(const struct option *opt,
> - const char *arg, int unset)
> +static int apply_option_parse_exclude(const struct option *opt,
> +   const char *arg, int unset)
>  {
>   struct apply_state *state = opt->value;
>   add_name_limit(state, arg, 1);
>   return 0;
>  }
>  
> -static int option_parse_include(const struct option *opt,
> - const char *arg, int unset)
> +static int apply_option_parse_include(const struct option *opt,
> +   const char *arg, int unset)
>  {
>   struct apply_state *state = opt->value;
>   add_name_limit(state, arg, 0);
> @@ -4605,9 +4605,9 @@ static int option_parse_include(const struct option 
> *opt,
>   return 0;
>  }
>  
> -static int option_parse_p(const struct option *opt,
> -   const char *arg,
> -   int unset)
> +static int apply_option_parse_p(const struct option *opt,
> + const char *arg,
> + int unset)
>  {
>   struct apply_state *state = opt->value;
>   state->p_value = atoi(arg);
> @@ -4615,8 +4615,8 @@ static int option_parse_p(const struct option *opt,
>   return 0;
>  }
>  
> -static int option_parse_space_change(const struct option *opt,
> -  const char *arg, int unset)
> +static int apply_option_parse_space_change(const struct option *opt,
> +const char *arg, int unset)
>  {
>   struct apply_state *state = opt->value;
>   if (unset)
> @@ -4626,8 +4626,8 @@ static int option_parse_space_change(const struct 
> option *opt,
>   return 0;
>  }
>  
> -static int option_parse_whitespace(const struct option *opt,
> -const char *arg, int unset)
> +static int apply_option_parse_whitespace(const struct option *opt,
> +  const char *arg, int unset)
>  {
>   struct apply_state *state = opt->value;
>   state->whitespace_option = arg;
> @@ -4636,8 +4636,8 @@ static int option_parse_whitespace(const struct option 
> *opt,
>   return 0;
>  }
>  
> -static int option_parse_directory(const struct option *opt,
> -   const char *arg, int unset)
> +static int apply_option_parse_directory(const struct option *opt,
> + const char *arg, int unset)
>  {
>   struct apply_state *state = opt->value;
>   strbuf_reset(&state->root);
> @@ -4755,13 +4755,13 @@ int cmd_apply(int argc, const char **argv, const char 
> *prefix)
>   struct option builtin_apply_options[] = {
>   { OPTION_CALLBACK, 0, "exclude", &state, N_("path"),
>   N_("don't apply changes matching the given path"),
> - 0, option_parse_exclude },
> + 0, apply_option_parse_exclude },
>   { OPTION_CALLBACK, 0, "include", &state, N_("path"),
>   N_("apply changes matching the given path"),
> - 0, option_parse_include },
> + 0, apply_option_parse_include },
>   { OPTION_CALLBACK, 'p', NULL, &state, N_("num"),
>   N_("remove  leading slashes from traditional diff 
> paths"),
> - 0, option_parse_p },
> + 0, apply_option_parse_p },
>   OPT_BOOL(0, "no-add", &state.no_add,
>   N_("ignore additions made by the patch")),
>   OPT_BOOL(0, "stat", &state.diffstat,
> @@ -4793,13 +4793,13 @@ int cmd_apply(int argc, const char **argv, const char 
> *prefix)
>   N_("ensure at least  lines of context 
> match")),
>   { OPTION_CALLBACK, 0, "whitespace", &state, N_("action"),
>   N_("detect new or modified lines that have whitespace 
> errors"),
> - 0, option_parse_whitespace },
> + 0, apply_option_parse_whitespace },
>   { OPTION_CALLBACK, 0, "ignore-space-change", &state, NULL,
>   N_("ignore changes in whitespace when finding context"),
> - PARSE_OPT_NOARG, option_parse_space_change },
> + PARSE_OPT_NOARG, apply_option_parse_spa

Re: [PATCH 2/2] commit-slab.h: avoid duplicated global static variables

2016-08-09 Thread Johannes Sixt
BTW, these are all instances of duplicated global static variables that 
can be found in a standard Linux build.


How I found them? I waded through the error messages produced by compiling 
the code base as C++ code for the fun of it (basically CFLAGS='-x c++ 
-fpermissive').


-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] commit-slab.h: avoid duplicated global static variables

2016-08-09 Thread Johannes Sixt
The gigantic define_commit_slab() macro repeats the definition of a
static variable that occurs earlier in the macro text. The purpose of
the repeated definition at the end of the macro is that it takes the
semicolon that occurs where the macro is used.

We cannot just remove the first definition of the variable because it
is referenced elsewhere in the macro text, and defining the macro later
would produce undefined identifier errors. We cannot have a "forward"
declaration, either. (This works only with "extern" global variables.)

The solution is to use a declaration of a struct that is already defined
earlier. This language construct can serve the same purpose as the
duplicated static variable definition, but without the confusion.

Signed-off-by: Johannes Sixt 
---
 commit-slab.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/commit-slab.h b/commit-slab.h
index f84b449..006a50b 100644
--- a/commit-slab.h
+++ b/commit-slab.h
@@ -126,16 +126,16 @@ static MAYBE_UNUSED elemtype *slabname## _peek(struct 
slabname *s,\
return slabname##_at_peek(s, c, 0); \
 }  \
\
-static int stat_ ##slabname## realloc
+struct slabname
 
 /*
- * Note that this seemingly redundant second declaration is required
+ * Note that this redundant forward declaration is required
  * to allow a terminating semicolon, which makes instantiations look
  * like function declarations.  I.e., the expansion of
  *
  *define_commit_slab(indegree, int);
  *
- * ends in 'static int stat_indegreerealloc;'.  This would otherwise
+ * ends in 'struct indegree;'.  This would otherwise
  * be a syntax error according (at least) to ISO C.  It's hard to
  * catch because GCC silently parses it by default.
  */
-- 
2.9.2.935.gccae72a.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] config.c: avoid duplicated global static variables

2016-08-09 Thread Johannes Sixt
Repeating the definition of a static variable seems to be valid in C.
Nevertheless, it is bad style because it can cause confusion, definitely
when it becomes necessary to change the type.

d64ec16 (git config: reorganize to use parseopt, 2009-02-21) added two
static variables near the top of the file config.c without removing the
definitions of the two variables that occurs later in the file.

The two variables were needed earlier in the file in the newly
introduced parseopt structure. These references were removed later in
d0e08d6 (config: fix parsing of "git config --get-color some.key -1",
2014-11-20).

Remove the redundant, younger, definitions near the top of the file and
keep the original definitions that occur later.

Signed-off-by: Johannes Sixt 
---
 Why the heck are duplicated static variables not an error in C?
 Since they aren't (as it seems), this and the next patch are just
 a matter of taste.

 builtin/config.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/builtin/config.c b/builtin/config.c
index a991a53..6cbf733 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -25,7 +25,6 @@ static char term = '\n';
 static int use_global_config, use_system_config, use_local_config;
 static struct git_config_source given_config_source;
 static int actions, types;
-static const char *get_color_slot, *get_colorbool_slot;
 static int end_null;
 static int respect_includes = -1;
 static int show_origin;
-- 
2.9.2.935.gccae72a.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 7/7] pack-objects: use mru list when iterating over packs

2016-08-09 Thread Jeff King
On Mon, Aug 08, 2016 at 10:16:32AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >> It worries me a lot to lose the warning unconditionally, though.
> >> That's the (only) coal-mine canary that lets us notice a problem
> >> when we actually start hitting that last-ditch cycle breaking too
> >> often.
> >
> > The dedicated cycle-detection will lose that warning, too (it doesn't
> > touch that code, but it's effectively checking the same thing earlier).
> >
> > I agree it's unfortunate to lose. On the other hand, it is being lost
> > because we are correctly handling the cycles, and there is nothing to
> > warn about. So it ceases to be a problem, and starts being a normal,
> > acceptable thing.
> 
> That unfortunately is beside the point.  The existing cycle breaking
> was meant to be a last ditch effort to avoid producing a broken pack
> (i.e. a suboptimal pack without cycle is better than unusable pack
> with delta cycle), while letting us know that we found a case where
> the remainder of the pack building machinery does not function well
> without it (so that we know we broke something when we tweaked the
> machinery without intending to break it).  Squelching the warnings
> feels similar to "we see too many valgrind warnings, so let's stop
> running valgrind"; I was hoping there would be a solution more like
> "instead of not running, let's teach valgrind this and that codepath
> is OK".

I don't think there is a way to do "this code path is OK", exactly.
Though by putting cycle-breaking at the check_object() stage, the
warning at the write_object() stage would continue to ensure that we
don't introduce any new cycles with our delta search. So that does have
some value.

Here's the code to do the cycle-breaking. Aside from the "hacky" bit,
it's quite simple.  I added a new state enum to object_entry to handle
the graph traversal. Since it only needs 2 bits, I _assume_ a compiler
can fit it in with the bitfields above (or at the very least give it its
own single byte so we just use what would otherwise be struct padding).
But I didn't check; if it turns out not to be the case we can easily
emulate it with two bitfields.  The write_object() check abuses the
"idx.offset" field to keep the same state, but we could convert it to
use these flags if we care.

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c91d54a..07b6fea 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1499,6 +1499,67 @@ static int pack_offset_sort(const void *_a, const void 
*_b)
(a->in_pack_offset > b->in_pack_offset);
 }
 
+/*
+ * Follow the chain of deltas from this entry onward, throwing away any links
+ * that cause us to hit a cycle (as determined by the DFS state flags in
+ * the entries).
+ */
+static void break_delta_cycles(struct object_entry *entry)
+{
+   /* If it's not a delta, it can't be part of a cycle. */
+   if (!entry->delta) {
+   entry->dfs_state = DFS_DONE;
+   return;
+   }
+
+   switch (entry->dfs_state) {
+   case DFS_NONE:
+   /*
+* This is the first time we've seen the object. We become part
+* of the active potential cycle and recurse.
+*/
+   entry->dfs_state = DFS_ACTIVE;
+   break_delta_cycles(entry->delta);
+   entry->dfs_state = DFS_DONE;
+   break;
+
+   case DFS_DONE:
+   /* object already examined, and not part of a cycle */
+   break;
+
+   case DFS_ACTIVE:
+   /*
+* We found a cycle that needs broken. We have to not only
+* drop our entry->delta link, but we need to remove
+* ourselves from the delta_sibling chain of our base.
+*/
+   {
+   struct object_entry **p = &entry->delta->delta_child;
+   while (*p) {
+   if (*p == entry)
+   *p = (*p)->delta_sibling;
+   else
+   p = &(*p)->delta_sibling;
+   }
+   }
+   entry->delta = NULL;
+
+   /*
+* XXX This is hacky. We need to figure out our real size (not
+* the delta size). check_object() already does this, so let's
+* just re-run it, but telling it not to reuse any deltas. This
+* probably should just be a single function to track down the
+* size from the delta (or even just sha1_object_info(),
+* though that is a little less efficient because we already
+* know which pack we're in).
+*/
+   reuse_delta = 0;
+   check_object(entry);
+   reuse_delta = 1;
+   break;
+   }
+}
+
 static void get_object_details(void)
 {

Re: t0027 racy?

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 12:59:58PM +, Torsten Bögershausen wrote:

> Thanks for the explanation, so there are 2 chances for a race.
> 
> I assume that the suggested "touch" will fix race#2 in most cases.
> In my understanding, the change of the file size will be more reliable:
>  
> 
> diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
> index 2860d2d..9933a9b 100755
> --- a/t/t0027-auto-crlf.sh
> +++ b/t/t0027-auto-crlf.sh
> @@ -120,6 +120,7 @@ commit_chk_wrnNNO () {
> cp $f $fname &&
> printf Z >>"$fname" &&
> git -c core.autocrlf=$crlf add $fname 2>/dev/null &&
> +   printf Z >>"$fname" &&
> git -c core.autocrlf=$crlf commit -m "commit_$fname" $fname 
> >"${pfx}_$f.err" 2>&1
> done
> ---
> Does anybody agree ?

I think the mtime change is reliable. We know that the mtime on the file
will be greater than or equal to the index mtime (because it happened
afterwards), so git will always look at the on-disk contents.

With your change, "git commit" will also always re-read the file from
disk, because it actually has new content (and you provide the filename
on the command line, so it is stage-and-commit, not just
"commit-the-index").

So either is fine.

> And, by the way, the convert warning may be issued twice, once in
> "git add" and once in "git commit".

Yes, but you only save it from "git commit", so we can ignore what
happens from "add" here. But that's why I wondered if:

  git -c core.autocrlf=$crlf add $fname >"${pfx}_$f.err" 2>&1

would make more sense. We _know_ that we have to do convert_to_git() in
that step because the content is changed. And then you can ignore the
warnings from "git commit" (which are racy), or you can simply commit as
a whole later, as some other loops do.

But like Dscho, I do not actually understand what this test is checking.
The function is called commit_chk_wrnNNO(), so perhaps you really are
interested in what "commit" has to say. But IMHO that is not an
interesting test. We know that if it has to read the content from disk,
it will call convert_to_git(), which is the exact same code path used by
"git add". So I do not understand what it is accomplishing to make a
commit at all here.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0027 racy?

2016-08-09 Thread Torsten Bögershausen
On Tue, Aug 09, 2016 at 07:49:38AM -0400, Jeff King wrote:
> On Tue, Aug 09, 2016 at 11:33:37AM +, Torsten Bögershausen wrote:
> 
> > > The second one seems plausible, given the history of issues with
> > > changing CRLF settings for an existing checkout. I'm not sure if it
> > > would be feasible to reset the index completely before each tested
> > > command, but that would probably solve it.
> > The content of the file has been changed (we appended the letter 'Z' to it,
> > so even if mtime is the same, st.st_size should differ.
> > And it seems as if the commit is triggered, see below.
> 
> I don't think I made myself clear. It's not a question of whether there
> is something to commit. It's that when git asks the index "what is the
> sha1 of the content at this path?", the index may be able to answer
> directly (the file is up-to-date, so we return the index value), or it
> may have to go to the filesystem and read the file content. It is this
> latter which triggers convert_to_git(), which is what generates the
> message in question.
> 
> For a more stripped-down example, try:
> 
>   git add foo
>   git commit -m msg
> 
> versus:
> 
>   git add foo
>   sleep 1
>   git commit -m msg
> 
> In the latter case, we should not generally need convert_to_git() in the
> "commit" step. It was already done by "git add", and we reuse the cached
> result.
> 
> Whereas in the first one, we may run into the racy-index problem and
> have to re-read the file to be on the safe side.
> 
> -Peff

Thanks for the explanation, so there are 2 chances for a race.

I assume that the suggested "touch" will fix race#2 in most cases.
In my understanding, the change of the file size will be more reliable:
 

diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index 2860d2d..9933a9b 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -120,6 +120,7 @@ commit_chk_wrnNNO () {
cp $f $fname &&
printf Z >>"$fname" &&
git -c core.autocrlf=$crlf add $fname 2>/dev/null &&
+   printf Z >>"$fname" &&
git -c core.autocrlf=$crlf commit -m "commit_$fname" $fname 
>"${pfx}_$f.err" 2>&1
done
---
Does anybody agree ?
And, by the way, the convert warning may be issued twice, once in
"git add" and once in "git commit".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Johannes Schindelin
Hi Michael,

On Tue, 9 Aug 2016, Michael Haggerty wrote:

> On 08/04/2016 05:58 PM, Johannes Schindelin wrote:
> > [...]
> > Even requiring every contributor to register with GitHub would be too
> > much of a limitation, I would wager.
> > [...]
> 
> Is it *really* so insane to consider moving collaboration on the Git
> project to GitHub or some other similar platform?

Speaking for myself, I do prefer GitHub's UI to mail, by a lot. Not only
because it is more focused on the code, but because it integrates so
nicely with Git, which email distinctly does not.

So I personally would not have the least bit of a problem to switch to
GitHub (that's indeed what Git for Windows did, getting substantially more
contributions than we would otherwise have).

And of course I use the email notifications quite a bit. They are really
convenient: I get my updates via my mail program, still, and the
discussion I want to participate in is just one click away.

The reason why I stated that GitHub is out of the question is that I
expected resistance against it. But you are right: I should not have ruled
it out so categorically, it is not at all my call to make.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Thunderbird woes, was: Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Michael J Gruber
Jakub Narębski venit, vidit, dixit 09.08.2016 10:24:
> On 9 August 2016 at 10:11, Michael J Gruber  wrote:
> 
>> My own setup
>> 
>>
>> My usual MUA is Thunderbird because of its integration with calendars
>> and address books. I usually read and post to mailing lists via
>> nntp/gmane, that works best for me for several reasons (e.g. archive
>> available when I need it).
>>
>> For git-ml, I had to learn early on to answer by e-mail to git-ml rather
>> than by nntp-reply because proper nntp-replies somehow didn't meet the
>> expectations of the e-mail users (double copies because of the cc-policy
>> or such, I don't remember).
> 
> I use either KNode or Thunderbird with NNTP/Gmane, and I don't have
> any problems when doing "Reply All" even if I forget to remove nntp-reply.
> You should do reply-all anyway, because not everyone is subscribed,
> and not everyone uses nntp-ml.

There used to be a problem (many years ago), so let's see...

> 
>> I use git sendemail even for small throw-in patches because the git-ml
>> does not allow attachments but wants patches (files) as in-line text,
>> and Thunderbird possibly reformats text (because it's text, you know).
> 
> For some strange reason Thunderbird used as NNTP reader does not
> screw up with whitespace, while Thunderbird used as email client does.
> Al least it did last time I forgot to create new email in its NNTP reader.

You mean, "format fl[ao]wed" is not used when posting via NNTP? That
still wouldn't help with reply all.

> Note that git-format-patch has Thunderbird subsection in the
> "MUA specific hints" section.

I know, but that changes config (esp. wrap+flowed) for all e-mails. I
would be nice if using an external editor would turn off TB's mangling.

>> When I want to try out a proposed patch I have to "save message" and run
>> git-am because patches don't come as file attachments on the git-ml
>> (can't use "save/open attachment"+ git-apply) nor a PR (can't use git
>> fetch nor view in browser).
> 
> Inline are preferred over attachment because it is easier to review
> and comment on online patches. Anyway, it is the same amount of
> steps, and git-am preserves commit message.

I know why we do that, but attachments are there for a reason (stable
transport), and sending attached patches would be much easier for many
MUAs. It just shows that for nowadays usage of e-mail (not withstanding
Junio's historic remarks), using in-line text as a stable transport is
really "very special", to put it mildly.

>>If it's a series, I have to do that 
>> for each
>> invididual patch, which usually means I simply don't (or rely on Junio
>> doing it and fetch his xy/topic branch).
> 
> I think you can save-all on the whole thread...

I'll definitely try, thanks.

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] document git-repack interaction of pack.threads and pack.windowMemory

2016-08-09 Thread Michael Stahl
Signed-off-by: Michael Stahl 
---
 Documentation/git-pack-objects.txt | 2 +-
 Documentation/git-repack.txt   | 6 +-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 19cdcd0..0b655a5 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -105,7 +105,7 @@ base-name::
advantage of the large window for the smaller objects.  The
size can be suffixed with "k", "m", or "g".
`--window-memory=0` makes memory usage unlimited, which is the
-   default.
+   default, unless the config variable `pack.windowMemory` is set.
 
 --max-pack-size=::
Maximum size of each output pack file. The size can be suffixed with
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index c597523..300455b 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -101,7 +101,11 @@ other objects in that pack they already have locally.
advantage of the large window for the smaller objects.  The
size can be suffixed with "k", "m", or "g".
`--window-memory=0` makes memory usage unlimited, which is the
-   default.
+   default, unless the config variable `pack.windowMemory` is set.
+   Note that the actual memory usage will be multiplied
+   by the number of threads used by linkgit:git-pack-objects[1],
+   which is lacking a corresponding git-repack flag but can be
+   set via the config variable `pack.threads`.
 
 --max-pack-size=::
Maximum size of each output pack file. The size can be suffixed with
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0027 racy?

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 11:33:37AM +, Torsten Bögershausen wrote:

> > The second one seems plausible, given the history of issues with
> > changing CRLF settings for an existing checkout. I'm not sure if it
> > would be feasible to reset the index completely before each tested
> > command, but that would probably solve it.
> The content of the file has been changed (we appended the letter 'Z' to it,
> so even if mtime is the same, st.st_size should differ.
> And it seems as if the commit is triggered, see below.

I don't think I made myself clear. It's not a question of whether there
is something to commit. It's that when git asks the index "what is the
sha1 of the content at this path?", the index may be able to answer
directly (the file is up-to-date, so we return the index value), or it
may have to go to the filesystem and read the file content. It is this
latter which triggers convert_to_git(), which is what generates the
message in question.

For a more stripped-down example, try:

  git add foo
  git commit -m msg

versus:

  git add foo
  sleep 1
  git commit -m msg

In the latter case, we should not generally need convert_to_git() in the
"commit" step. It was already done by "git add", and we reuse the cached
result.

Whereas in the first one, we may run into the racy-index problem and
have to re-read the file to be on the safe side.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Johannes Schindelin
Hi Junio,

On Mon, 8 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > I think you both got it wrong. The original citizens were the mail
> > clients that allowed you to communicate with other human beings.
> > ... It is our usage to transport machine-readable content (and not
> > as an attachment!) that is the intruder.
> 
> It is not "its is our usage".
> 
> You are too young to remember or too old to remember the history, or
> you are knowingly distorting it.  The original users of "patch" and
> "diff" expected that e-mail to be a medium to safely exchange
> changes to programs among themselves.

If you are saying that transporting patches via email was the original
purpose of email, then it is not exactly I who is misremembering history.

But that is not what you meant, I believe. You probably wanted to point
out that the Git developers are not the first ones to abuse the medium
known as email that way. And you are correct, of course. And I never
claimed anything else. I just said that the problem is our usage of emails
as a means to transport byte-exact content intended primarily to be
consumed by a program instead of a human. It does not matter whether
others did that before us. It is the problem we face right now, that is
the important part of my message.

And even if it seems as if you are eagerly defending this system, I do not
believe even a microsecond that you think it is a good system. I believe
that you, too, would welcome a better review/contribution system that is
easier to use, more welcoming to new users, less error-prone and less
time-wasting than the current, email-based one, just like you jumped on
Git as a better SCM when it came around, from whatever inadequate system
you came from.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Johannes Schindelin
Hi Junio,

On Mon, 8 Aug 2016, Junio C Hamano wrote:

> Lars Schneider  writes:
> 
> > 4.) Reviewing patches is super hard for me because my email client
> > does not support patch color highlighting and I can't easily expand
> > context or look at the history of code touched by the patch (e.g via
> > git blame). I tried to setup Alpine but I wasn't happy with the
> > interface either. I like patches with a GitHub URL for review but then
> > I need to find the right line in the original email to write a
> > comment.
> 
> Unless a patch is about an area you are super familiar with so that you
> know what is beyond the context of the patch to be able to judge if the
> change is good in the context of the file being touched, it is always
> hard to review from inside a mail reader.
> 
> Running "git am" is a good first step to review such a patch, as that
> lets you view the resulting code with the full power of Git.  As you
> gain experience on the codebase, you'll be able to spot more problems
> while in your mail reader.

I am glad that you agree that the requirement to manually transform the
patches back into Git (where they had been originally to begin with) is
cumbersome. This is the first time that I see you admit it ;-)

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0027 racy?

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 01:33:04PM +0200, Johannes Schindelin wrote:

> On Mon, 8 Aug 2016, Jeff King wrote:
> 
> > I got failure within about 30 seconds on t0027 (though 5 minutes? Yeesh.
> > It runs in 9s on my laptop. I weep for you).
> 
> Yep. That is the price I (and all other Git for Windows developers) pay
> for the decision to implement Git's entire test suite in Shell script,
> including expensive tests with over a 1,000 test cases such as t0027.

To be fair, it does not run unless you set GIT_TEST_LONG. Even _I_ don't
do that by default.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 01:20:05AM +0200, Michael Haggerty wrote:

> > but I
> > do not think it is sane to expect that the same quality and quantity
> > of reviews as we do here can be maintained with that thing.
> 
> Could you elaborate why you would expect quality and/or quantity of
> reviews to suffer? I'm really curious, and I'd be happy to pass your
> feedback along to my colleagues.

Having done a lot of review here on the mailing list, as well as in
GitHub PRs, I vastly prefer the mailing list.

Here's a random list of things that I think I would miss:

 - I really like the flow of having the commit message and diff dumped
   in my editor. I'm very efficient at slicing and dicing text, omitting
   uninteresting quoted bits, etc.

   Web text boxes feel like a straitjacket. I do have a browser plugin
   that opens them in vim. That helps, but it breaks the flow (I make a
   comment, save the file, click "comment", then read to the next place,
   click "+", then start a new vim instance for that comment).  Besides
   the tedium of clicking around, it loses the "unit" size of a single
   email, where I may make many comments, go back and revise earlier
   comments after reading more of the patch, etc.

 - I really like the flow of having conversations next to patches. I can
   look at the index of the mailing list folder and see what people are
   talking about, how big the threads are, etc, at a glance. Moving
   between messages and threads involve single keystrokes.

   Similarly, having local storage is _fast_. I think GitHub is fine for
   a web app. But when I'm reading a high-volume mailing list, I really
   want to flip around quickly. If there's even 500ms to get to the next
   message or thread, it feels clunky and slow to me. Obviously I spend
   more than 500ms _reading_ most messages (though for some I see the
   first paragraph and immediately jump away). It's just the latency
   when I've decided I'm done with one thing and want to move to the
   next.

 - For that matter, GitHub doesn't really have a good tool for random
   conversations. There are issues, which you can vaguely use like a
   thread, but it doesn't quite feel the same.

   I think part of it is that I can view the mailing list both as a
   series of threads _and_ as a stream of messages. So sometimes I mark
   a thread as "read", and then see the next day that there are a ton of
   new messages on it. Maybe those are uninteresting (and it's a single
   keystroke to mark the thread again), but maybe that's a hint that
   there's interesting discussion going on.

   The threading in GitHub comments and pull requests is also not great.
   Each PR or issue is its own thread, but it's totally flat inside.
   There are no sub-threads to organize discussion, and it's sometimes
   hard to see what people are replying to.

 - When I move between a discussion and patches on the list and my local
   git checkout, it's important to do so with minimal fuss. Which means
   I want to use _context_ in my workflow. If I'm reading a thread, I
   want there to be a keystroke for "jump to this thread in my
   checkout". That's (relatively) easy for me to script via mutt (grab
   these patches, apply them). It's a bit harder in the browser (the
   best I've got is to copy-paste the URL to a script that pulls out the
   PR number, then fetches and checks it out).

 - A sort-of feature: the mailing list is actually fairly decentralized,
   because of the "reply-to-all" convention. I don't know if anybody
   else noticed, but vger seemed to be down Friday evening and Saturday
   morning (at least my messages to the list got 400 SMTP codes, and no
   new messages were delivered to me). But I still had some
   conversations going with people, because our messages were mailed
   directly (and the list eventually caught up).

   Now that probably doesn't matter for GitHub, which seems to have
   fairly reasonable uptime. It would matter if we picked a centralized
   tool that didn't.

There are probably more, but I've run out of ranting steam for now. :)

> Here are some factors that I think will *improve* reviews:

I was going to respond point-by-point to a few of these, but I think I
covered most of it above. In short, I agree with many of the benefits
you list. In most cases, I've already reaped those benefits for my own
workflow (e.g., my "git am" workflow is pretty efficient now). But not
everybody has done so, and it's a lot to ask of casual contributors.

> Given that I work for GitHub, I'm uncomfortable doing any more advocacy
> here. If people have concrete questions, I'd be happy to answer them, on
> the list or in private.

Hopefully I provided some counterpoint. ;)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0027 racy?

2016-08-09 Thread Torsten Bögershausen
On Tue, Aug 09, 2016 at 02:51:10AM -0400, Jeff King wrote:
> On Mon, Aug 08, 2016 at 08:32:24PM +, Torsten Bögershausen wrote:
> 
> > > The verbose output is not very exciting, though:
> > > 
> > >   expecting success: 
> > >   check_warning "$lfwarn" ${pfx}_LF.err
> > > 
> > >   --- NNO_attr_auto_aeol_crlf_false_LF.err.expect 2016-08-08 
> > > 15:26:37.061701392 +
> > >   +++ NNO_attr_auto_aeol_crlf_false_LF.err.actual 2016-08-08 
> > > 15:26:37.061701392 +
> > >   @@ -1 +0,0 @@
> > >   -warning: LF will be replaced by CRLF
> > >   not ok 114 - commit NNO files crlf=false attr=auto LF
> > [...]
> > The warning is missing, but should be there:
> > 
> > The file has LF, and after commit and a new checkout these LF will
> > be convertet into CRLF.
> > 
> > So why isn't the warning there (but here on my oldish machines)
> 
> To be clear, the warning _is_ there when I just run t0027 by itself, and
> the test passes.  It's only under heavy load that it isn't. So it's a
> race condition either in the test script or in git itself.
> 
> Usually race conditions like these are due to one of:
> 
>   - git dying from SIGPIPE before it has a chance to output the command.
> But I don't see any pipes being used in the test script.
> 
>   - index raciness causing us to avoid reading file content. For
> example, if you do:
> 
>   echo foo >bar
>   git add bar
> 
> Then _usually_ "bar" and the index will have the same mtime. And
> therefore subsequent commands that need to refresh the index will
> re-read the content of "bar", because they cannot tell from the stat
> information if we have the latest version of "bar" in the index or
> not (it could have been written after the index update, but in the
> same second).
> 
> But on a slow or heavily loaded system (or if you simply get unlucky
> in crossing the boundary to a new second), they'll have different
> mtimes. And therefore git knows it can skip reading the content from
> the filesystem.
> 
> So if your test relies on git actually re-converting the file
> content, it would sometimes randomly fail.
> 
> The second one seems plausible, given the history of issues with
> changing CRLF settings for an existing checkout. I'm not sure if it
> would be feasible to reset the index completely before each tested
> command, but that would probably solve it.
The content of the file has been changed (we appended the letter 'Z' to it,
so even if mtime is the same, st.st_size should differ.
And it seems as if the commit is triggered, see below.
> 

(I got the stress script working; no I can reproduce it nicely)

Is it possible to vote for a 3rd option ?
I added some more warnings, to have always some output in stderr. 

This is the good case, this test case passed:
cat NNO_attr_auto_aeol_crlf_true_LF.err 

warning: check_safe_crlf in NNO_attr_auto_aeol_crlf_true_LF.txt 2
warning: LF was here in NNO_attr_auto_aeol_crlf_true_LF.txt..
warning: LF will be replaced by CRLF in NNO_attr_auto_aeol_crlf_true_LF.txt.
The file will have its original line endings in your working directory.
warning: check_safe_crlf in NNO_attr__aeol_crlf_true_CRLF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_crlf_true_CRLF_mix_LF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_crlf_true_LF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_lf_true_CRLF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_lf_true_CRLF_mix_LF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_lf_true_LF.txt 0
warning: check_safe_crlf in NNO_attr_auto_aeol_lf_true_LF.txt 0
[master 2decee0] commit_NNO_attr_auto_aeol_crlf_true_LF.txt
 Author: A U Thor 
 1 file changed, 2 insertions(+), 2 deletions(-)
---
And this one failed, the same code base, but a different file:
cat commit_NNO_attr_auto_aeol_crlf_input_LF.

[master ce2045a] commit_NNO_attr_auto_aeol_crlf_input_LF.txt
 Author: A U Thor 
 1 file changed, 2 insertions(+), 2 deletions(-)

---
Both had been generated with a patched convert.c:
git diff convert.c
index 67d69b5..ae64a58 100644
--- a/convert.c
+++ b/convert.c
@@ -191,6 +191,7 @@ static enum eol output_eol(enum crlf_action crlf_action)
 static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
 struct text_stat *stats, enum safe_crlf checksafe)
 {
+   warning("check_safe_crlf in %s %d", path, (int)checksafe);
if (!checksafe)
return;
 
@@ -210,6 +211,7 @@ static void check_safe_crlf(const char *path, enum 
crlf_action crlf_action,
 * CRLFs would be added by checkout:
 * check if we have "naked" LFs
 */
+   warning("LF was here in %s..", path);
if (stats->lonelf) {
if (checksafe == SAFE_CRLF_WARN)
warning("LF will be replaced by CRLF in %s.\nThe

-

In the failing case,

Re: t0027 racy?

2016-08-09 Thread Johannes Schindelin
Hi Peff,

On Mon, 8 Aug 2016, Jeff King wrote:

> I got failure within about 30 seconds on t0027 (though 5 minutes? Yeesh.
> It runs in 9s on my laptop. I weep for you).

Yep. That is the price I (and all other Git for Windows developers) pay
for the decision to implement Git's entire test suite in Shell script,
including expensive tests with over a 1,000 test cases such as t0027.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: t0027 racy?

2016-08-09 Thread Johannes Schindelin
Hi Peff,

On Tue, 9 Aug 2016, Jeff King wrote:

> And indeed, this seems to fix it for me (at least it has been running in
> a 16-way loop for 5 minutes, whereas before it died after 30 seconds or
> so):
> 
> diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
> index 2860d2d..9f057ff 100755
> --- a/t/t0027-auto-crlf.sh
> +++ b/t/t0027-auto-crlf.sh
> @@ -120,6 +120,7 @@ commit_chk_wrnNNO () {
>   cp $f $fname &&
>   printf Z >>"$fname" &&
>   git -c core.autocrlf=$crlf add $fname 2>/dev/null &&
> + touch $fname && # ensure index raciness
>   git -c core.autocrlf=$crlf commit -m "commit_$fname" $fname 
> >"${pfx}_$f.err" 2>&1
>   done

I will test this patch as soon as my poor machine has less work.

> I'm not sure if there is a more elegant solution, though (for instance,
> why not collect the output from "git add", which should have the same
> warning, I would think).

That should work, too. I am a bit overwhelmed with t0027, though, I do not
really understand what is going on. It tests *so many* things, and in no
test case is it clear to me what it tests and what it expects let alone
why those expectations make sense.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2 v6] pack-objects: use reachability bitmap index when generating non-stdout pack

2016-08-09 Thread Kirill Smelkov
Starting from 6b8fda2d (pack-objects: use bitmaps when packing objects)
if a repository has bitmap index, pack-objects can nicely speedup
"Counting objects" graph traversal phase. That however was done only for
case when resultant pack is sent to stdout, not written into a file.

The reason here is for on-disk repack by default we want:

- to produce good pack (with bitmap index not-yet-packed objects are
  emitted to pack in suboptimal order).

- to use more robust pack-generation codepath (avoiding possible
  bugs in bitmap code and possible bitmap index corruption).

Jeff King further explains:

The reason for this split is that pack-objects tries to determine how
"careful" it should be based on whether we are packing to disk or to
stdout. Packing to disk implies "git repack", and that we will likely
delete the old packs after finishing. We want to be more careful (so
as not to carry forward a corruption, and to generate a more optimal
pack), and we presumably run less frequently and can afford extra CPU.
Whereas packing to stdout implies serving a remote via "git fetch" or
"git push". This happens more frequently (e.g., a server handling many
fetching clients), and we assume the receiving end takes more
responsibility for verifying the data.

But this isn't always the case. One might want to generate on-disk
packfiles for a specialized object transfer. Just using "--stdout" and
writing to a file is not optimal, as it will not generate the matching
pack index.

So it would be useful to have some way of overriding this heuristic:
to tell pack-objects that even though it should generate on-disk
files, it is still OK to use the reachability bitmaps to do the
traversal.

So we can teach pack-objects to use bitmap index for initial object
counting phase when generating resultant pack file too:

- if we care it is not activated under git-repack:

  See above about repack robustness and not forward-carrying corruption.

- if we know bitmap index generation is not enabled for resultant pack:

  Current code has singleton bitmap_git so cannot work simultaneously
  with two bitmap indices.

  We also want to avoid (at least with current implementation)
  generating bitmaps off of bitmaps. The reason here is: when generating
  a pack, not-yet-packed objects will be emitted into pack in
  suboptimal order and added to tail of the bitmap as "extended entries".
  When the resultant pack + some new objects in associated repository
  are in turn used to generate another pack with bitmap, the situation
  repeats: new objects are again not emitted optimally and just added to
  bitmap tail - not in recency order.

  So the pack badness can grow over time when at each step we have
  bitmapped pack + some other objects. That's why we want to avoid
  generating bitmaps off of bitmaps, not to let pack badness grow.

- if we keep pack reuse enabled still only for "send-to-stdout" case:

  Because on pack reuse raw entries are directly written out to destination
  pack by write_reused_pack() bypassing needed for pack index generation
  bookkeeping done by regular codepath in write_one() and friends.

This way for pack-objects -> file we get nice speedup:

erp5.git[1] (~230MB) extracted from ~ 5GB lab.nexedi.com backup
repository managed by git-backup[2] via

time echo 0186ac99 | git pack-objects --revs erp5pack

before:  37.2s
after:   26.2s

And for `git repack -adb` packed git.git

time echo 5c589a73 | git pack-objects --revs gitpack

before:   7.1s
after:3.6s

i.e. it can be 30% - 50% speedup for pack extraction.

git-backup extracts many packs on repositories restoration. That was my
initial motivation for the patch.

[1] https://lab.nexedi.com/nexedi/erp5
[2] https://lab.nexedi.com/kirr/git-backup

NOTE

Jeff also suggests that pack.useBitmaps was probably a mistake to
introduce originally. This way we are not adding another config point,
but instead just always default to-file pack-objects not to use bitmap
index: Tools which need to generate on-disk packs with using bitmap, can
pass --use-bitmap-index explicitly. And git-repack does never pass
--use-bitmap-index, so this way we can be sure regular on-disk repacking
remains robust.

NOTE2

`git pack-objects --stdout >file.pack` + `git index-pack file.pack` is much 
slower
than `git pack-objects file.pack`. Extracting erp5.git pack from
lab.nexedi.com backup repository:

$ time echo 0186ac99 | git pack-objects --stdout --revs 
>erp5pack-stdout.pack

real0m22.309s
user0m21.148s
sys 0m0.932s

$ time git index-pack erp5pack-stdout.pack

real0m50.873s   <-- more than 2 times slower than time to generate pack 
itself!
user0m49.300s
sys 0m1.360s

So the time for

`pack-object --stdout >file.pack` + `index-pack file.pack`  is  72s,

while

`pack-objects file.pack` which does both pack and index is  27s.

And even

`pac

[PATCH 1/2 v4] pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use

2016-08-09 Thread Kirill Smelkov
Since 6b8fda2d (pack-objects: use bitmaps when packing objects) there
are two codepaths in pack-objects: with & without using bitmap
reachability index.

However add_object_entry_from_bitmap(), despite its non-bitmapped
counterpart add_object_entry(), in no way does check for whether --local
or --honor-pack-keep or --incremental should be respected. In
non-bitmapped codepath this is handled in want_object_in_pack(), but
bitmapped codepath has simply no such checking at all.

The bitmapped codepath however was allowing to pass in all those options
and with bitmap indices still being used under such conditions -
potentially giving wrong output (e.g. including objects from non-local or
.keep'ed pack).

We can easily fix this by noting the following: when an object comes to
add_object_entry_from_bitmap() it can come for two reasons:

1. entries coming from main pack covered by bitmap index, and
2. object coming from, possibly alternate, loose or other packs.

"2" can be already handled by want_object_in_pack() and to cover
"1" we can teach want_object_in_pack() to expect that *found_pack can be
non-NULL, meaning calling client already found object's pack entry.

In want_object_in_pack() we care to start the checks from already found
pack, if we have one, this way determining the answer right away
in case neither --local nor --honour-pack-keep are active. In
particular, as p5310-pack-bitmaps.sh shows, we do not do harm to
served-with-bitmap clones performance-wise:

Test  56dfeb62  this tree
-
5310.2: repack to disk9.63(8.67+0.33)   9.47(8.55+0.28) -1.7%
5310.3: simulated clone   2.07(2.17+0.12)   2.03(2.14+0.12) -1.9%
5310.4: simulated fetch   0.78(1.03+0.02)   0.76(1.00+0.03) -2.6%
5310.6: partial bitmap1.97(2.43+0.15)   1.92(2.36+0.14) -2.5%

with all differences strangely showing we are a bit faster now, but
probably all being within noise.

And in the general case we care not to have duplicate
find_pack_entry_one(*found_pack) calls. Worst what can happen is we can
call want_found_object(*found_pack) -- newly introduced helper for
checking whether we want object -- twice, but since want_found_object()
is very lightweight it does not make any difference.

I appreciate help and discussing this change with Junio C Hamano and
Jeff King.

Signed-off-by: Kirill Smelkov 
---
 builtin/pack-objects.c  | 93 +++--
 t/t5310-pack-bitmaps.sh | 92 
 2 files changed, 152 insertions(+), 33 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c4c2a3c..b1007f2 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -944,13 +944,44 @@ static int have_duplicate_entry(const unsigned char *sha1,
return 1;
 }
 
+static int want_found_object(int exclude, struct packed_git *p)
+{
+   if (exclude)
+   return 1;
+   if (incremental)
+   return 0;
+
+   /*
+* When asked to do --local (do not include an object that appears in a
+* pack we borrow from elsewhere) or --honor-pack-keep (do not include
+* an object that appears in a pack marked with .keep), finding a pack
+* that matches the criteria is sufficient for us to decide to omit it.
+* However, even if this pack does not satisfy the criteria, we need to
+* make sure no copy of this object appears in _any_ pack that makes us
+* to omit the object, so we need to check all the packs. Signal that by
+* returning -1 to the caller.
+*/
+   if (!ignore_packed_keep &&
+   (!local || !have_non_local_packs))
+   return 1;
+
+   if (local && !p->pack_local)
+   return 0;
+   if (ignore_packed_keep && p->pack_local && p->pack_keep)
+   return 0;
+
+   /* we don't know yet; keep looking for more packs */
+   return -1;
+}
+
 /*
  * Check whether we want the object in the pack (e.g., we do not want
  * objects found in non-local stores if the "--local" option was used).
  *
- * As a side effect of this check, we will find the packed version of this
- * object, if any. We therefore pass out the pack information to avoid having
- * to look it up again later.
+ * If the caller already knows an existing pack it wants to take the object
+ * from, that is passed in *found_pack and *found_offset; otherwise this
+ * function finds if there is any pack that has the object and returns the pack
+ * and its offset in these variables.
  */
 static int want_object_in_pack(const unsigned char *sha1,
   int exclude,
@@ -958,15 +989,30 @@ static int want_object_in_pack(const unsigned char *sha1,
   off_t *found_offset)
 {
struct packed_git *p;
+   int want;
 
if (!exclude && local && has_loose_object_n

Re: t0027 racy?

2016-08-09 Thread Johannes Schindelin
Hi Torsten,

On Mon, 8 Aug 2016, Torsten Bögershausen wrote:

> On 2016-08-08 17.05, Johannes Schindelin wrote:
> > 
> > I remember that you did a ton of work on t0027. Now I see problems,
> > and not only that the entire script now takes a whopping 4 minutes 20
> > seconds to run on my high-end Windows machine.
> > 
> > It appears that t0027 fails randomly for me, in seemingly random
> > places.  Sometimes all 1388 cases pass. Sometimes "29 - commit NNO
> > files crlf=true attr=auto LF" fails. Sometimes it is "24 - commit NNO
> > files crlf=false attr=auto LF". Sometimes it is "114 - commit NNO
> > files crlf=false attr=auto LF", and sometimes "111 - commit NNO files
> > attr=auto aeol=lf crlf=false CRLF_mix_LF".
> > 
> > When I run it with -i -v -x --tee, it passes every single time (taking
> > over 5 minutes, just to make things worse)...
> > 
> > Any idea about any possible races?
>
> Just to double-check: I assume that you have this
> commit ded2444ad8ab8128cae2b91b8efa57ea2dd8c7a5
> Author: Torsten Bögershausen 
> Date:   Mon Apr 25 18:56:27 2016 +0200
> 
> t0027: make commit_chk_wrnNNO() reliable
> in your tree ?

I tested this with multiple branches, but yes, the one I tested most was
the shears/pu branch of git-for-windows/git (which has all
Windows-specific patches of our master branch rebased on top of pu). I
also tested with the pu branch as of yesterday.

> Is there a special pattern ?

No. Just "make -j15 DEVELOPER=1 -k test".

> Did you
> a) Update the machine ?

Yep, it's up-to-date. Windows 10 Anniversary Update.

> b) Update Git code base ?

As I said, several.

> Is it only the NNO tests that fail ?

As I said, the failures are random, I just picked the 4 most recent ones.

> Did they ever pass ?

As I said, if I run with -i -v -x --tee, everything passes.

> I see only "commit NNO files" in you report, they belong to
> check_warning(), which is called around line 126 in t0027.

I believe this is true. Some race, probably, leading to the commit *not*
refreshing the files. Or some such, this is just a guess on my side.

> How reproducible is the problem ?

Not very. That is, about half of the time t0027 passes even *without* -i
-v -x --tee. And when it fails, it is anybody's guess which case fails.

> If you add
> exit 0
> After the last "commit_chk_wrnNNO" line (line 418),
> does
> ls -l crlf*.err
> give you any hint ?

No. It simply does not contain that warning that is expected.

Ciao,
Dscho

Re: [PATCH v5] pack-objects: teach it to use reachability bitmap index when generating non-stdout pack too

2016-08-09 Thread Kirill Smelkov
On Mon, Aug 08, 2016 at 01:53:20PM -0700, Junio C Hamano wrote:
> Kirill Smelkov  writes:
> 
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index bc1c433..4ba0c4a 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -2244,6 +2244,9 @@ pack.useBitmaps::
> > to stdout (e.g., during the server side of a fetch). Defaults to
> > true. You should not generally need to turn this off unless
> > you are debugging pack bitmaps.
> > ++
> > +*NOTE*: when packing to file (e.g., on repack) the default is always not 
> > to use
> > +   pack bitmaps.
> 
> This is a bit hard to read and understand.
> 
> The patched result starts with "When true, git will use bitmap when
> packing to stdout", i.e. when packing to file, git will not.  So
> this *NOTE* is repeating the same thing.  The reader is made to
> wonder "Why does it need to repeat the same thing?  Does this mean
> when the variable is set, a pack sent to a disk uses the bitmap?"
> 
> I think what you actually do in the code is to make the variable
> affect _only_ the standard-output case, and users need a command
> line option if they want to use bitmap when writing to a file (the
> code to do so looks correctly done).

Yes it is this way how it is programmed. But I've added the note because
it is very implicit to me that "When true, git will use bitmap when
packing to stdout" means 1) the default for packing-to-file is different
and 2) there is no way to set the default for packing-to-file. That's
why I added the explicit info.

And especially since the config name "pack.useBitmaps" does not contain
"stdout" at all it can be very confusing to people looking at this the
first time (at least it was so this way for me). Also please recall you
wondering why 6b8fda2d added bitmap support only for to-stdout case not
even mentioning about why it is done only for that case and not for
to-file case).

I do not insist on the note however - I only thought it is better to
have it - so if you prefer we go without it - let us drop this note.

Will send v6 as reply to this mail with below interdiff.

Thanks,
Kirill

 8<  (interdiff)

--- b/Documentation/config.txt
+++ a/Documentation/config.txt
@@ -2246,9 +2246,6 @@
to stdout (e.g., during the server side of a fetch). Defaults to
true. You should not generally need to turn this off unless
you are debugging pack bitmaps.
-+
-*NOTE*: when packing to file (e.g., on repack) the default is always not to use
-   pack bitmaps.
 
 pack.writeBitmaps (deprecated)::
This is a deprecated synonym for `repack.writeBitmaps`.
diff -u b/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
--- b/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -219,8 +219,8 @@
# verify equivalent packs are generated with/without using bitmap index
packasha1=$(git pack-objects --no-use-bitmap-index --all packa 
packa.objects &&
-   git show-index packb.objects &&
+   pack_list_objects packa.objects &&
+   pack_list_objects packb.objects &&
test_cmp packa.objects packb.objects
 '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] pack-objects: Teach --use-bitmap-index codepath to respect --local, --honor-pack-keep and --incremental

2016-08-09 Thread Kirill Smelkov
Junio, first of all thanks for feedback,

On Mon, Aug 08, 2016 at 12:26:33PM -0700, Junio C Hamano wrote:
> Kirill Smelkov  writes:
[...]
> > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> > index c4c2a3c..e06c1bf 100644
> > --- a/builtin/pack-objects.c
> > +++ b/builtin/pack-objects.c
> > @@ -944,13 +944,45 @@ static int have_duplicate_entry(const unsigned char 
> > *sha1,
> > return 1;
> >  }
> >  
> > +static int want_found_object(int exclude, struct packed_git *p)
> > +{
> > +   if (exclude)
> > +   return 1;
> > +   if (incremental)
> > +   return 0;
> > +
> > +   /*
> > +* When asked to do --local (do not include an
> > +* object that appears in a pack we borrow
> > +* from elsewhere) or --honor-pack-keep (do not
> > +* include an object that appears in a pack marked
> > +* with .keep), we need to make sure no copy of this
> > +* object come from in _any_ pack that causes us to
> > +* omit it, and need to complete this loop.  When
> > +* neither option is in effect, we know the object
> > +* we just found is going to be packed, so break
> > +* out of the search loop now.
> > +*/
> 
> The blame is mine, but "no copy of this object appears in _any_ pack"
> would be more correct and easier to read.
> 
> This code is no longer in a search loop; its caller is.  Further
> rephrasing is needed.  "When asked to do ...these things..., finding
> a pack that matches the criteria is sufficient for us to decide to
> omit it.  However, even if this pack does not satisify the criteria,
> we need to make sure no copy of this object appears in _any_ pack
> that makes us to omit the object, so we need to check all the packs.
> Signal that by returning -1 to the caller." or something along that
> line.

Ok, I've rephrased it your way. Thanks for advising.

> >  /*
> >   * Check whether we want the object in the pack (e.g., we do not want
> >   * objects found in non-local stores if the "--local" option was used).
> >   *
> > - * As a side effect of this check, we will find the packed version of this
> > - * object, if any. We therefore pass out the pack information to avoid 
> > having
> > - * to look it up again later.
> > + * As a side effect of this check, if object's pack entry was not already 
> > found,
> > + * we will find the packed version of this object, if any. We therefore 
> > pass
> > + * out the pack information to avoid having to look it up again later.
> 
> The reasoning leading to "We therefore" is understandable, but "pass
> out the pack information" is not quite.  Is this meant to explain
> the fact that *found_pack and *found_offset are in-out parameters?
> 
> The explanation to justify why *found_pack and *found_offset that
> used to be out parameters are made in-out parameters belongs to the
> log message.  We do not want this in-code comment to explain the
> updated code relative to what the code used to do; that is not
> useful to those who read the code for the first time in the context
> of the committed state.
> 
> /* 
>  * Check whether we want to pack the object in the pack (e.g. ...).
>  *
>  * If the caller already knows an existing pack it wants to
>  * take the object from, that is passed in *found_pack and
>  * *found_offset; otherwise this function finds if there is
>  * any pack that has the object and returns the pack and its
>  * offset in these variables.
>  */

The "pass out the pack information ..." is not my text - I only added
"if object's pack entry was not already found" in the middle of the
sentence and rewrapped this paragraph. The "pass out the pack
information ..." comes from ce2bc424 (pack-objects: split
add_object_entry; 2013-12-21)

I agree your text is more clear and it is better to adjust the comments.

> > diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> > index 3893afd..e71caa4 100755
> > --- a/t/t5310-pack-bitmaps.sh
> > +++ b/t/t5310-pack-bitmaps.sh
> > @@ -7,6 +7,19 @@ objpath () {
> > echo ".git/objects/$(echo "$1" | sed -e 's|\(..\)|\1/|')"
> >  }
> >  
> > +# show objects present in pack ($1 should be associated *.idx)
> > +packobjects () {
> > +   git show-index <$1 | cut -d' ' -f2
> > +}
> 
> That is a misleading name for a helper function that produces a list
> of objects that were packed.  "list_packed_objects", perhaps.

I agree it is ambiguous wrt `git pack-objects` and sorry for choosing
not good name from the start. I'm changing it to pack_list_objects().
( personally I would use pack_obj_list a-la git-rev-list, but let's try
  not to create another review step because of abbreviate vs not-abbreviate )

> > +# hasany pattern-file content-file
> > +# tests whether content-file has any entry from pattern-file with entries 
> > being
> > +# whole lines.
> > +hasany () {
> > +   # NOTE `grep -f` is not portable
> > +   git grep --no-index -qFf $1 $2
> > +}
> 
> I doubt "grep -f patter

Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 10:11:30AM +0200, Michael J Gruber wrote:

> Maybe two more points of input for the discussion:
> 
> off-line capabilities
> =
> 
> The current workflow here seems to work best when you are subscribed to
> the git-ml and have your own (local, maybe selective) copy of git-ml in
> your (text-based) MUA (mutt, (al)pine, emacs, ...) that can jump right
> into git-am and such directly. I'm not sure how important the "off-line"
> aspect of that is for some of us, and how that could be replicated on
> GitHub - while PRs and such may be Git based behind the scenes there
> seems to be no way to clone that info and work from a local clone.
> (Don't know if GitLab is more open.)

You can pull it all out via GitHub's HTTP API, but the question is what
format you would use to store it locally (and which tools you would then
use to play with it).

I haven't really tried this lately, though, so I don't know if there is
information that the API would be missing.

I do have a dream of writing a tool that sucks in GitHub PRs to a fake
email thread, lets me make my responses inline in an editor, and then
pushes it back up as PR comments (finding the right positions based on
the quoted context).

> For git-ml, I had to learn early on to answer by e-mail to git-ml rather
> than by nntp-reply because proper nntp-replies somehow didn't meet the
> expectations of the e-mail users (double copies because of the cc-policy
> or such, I don't remember).

At least some people's workflows seem to send two copies to the list.
For instance, Jakub's <2bfd9cf5-a9fa-7650-21e9-9ceb9cc34...@gmail.com>
got delivered to me via the list twice. Once directly from gmail with:

  To: Oleg Taranenko ,
  Junio C Hamano 
  Cc: git@vger.kernel.org

and once via gmane with:

  To: git@vger.kernel.org
  Cc: git@vger.kernel.org

It's like this with all of his messages (sorry I can't point to the
duplicates in an archive; they have the same message-id, so public-inbox
treats them as a single unit).

Replying to the second one breaks the usual "cc-everybody" rule. Sending
duplicates means everybody sees it twice (3 times if they're on the cc
list!), and the second copy still has the bogus headers (so people
replying need to pick the right one).

> I use git sendemail even for small throw-in patches because the git-ml
> does not allow attachments but wants patches (files) as in-line text,
> and Thunderbird possibly reformats text (because it's text, you know).

I wonder if this is something we could change. I do not personally have
any problem with attached patches. "git am" knows how to apply them, and
mutt is smart enough to show text/* by default, and to include it in
quoted text on reply. So the output of "git format-patch --attach" works
fine for me. But it may not be as nice in other MUAs, and we have to
care about all of the other reviewers.

> When I want to try out a proposed patch I have to "save message" and run
> git-am because patches don't come as file attachments on the git-ml
> (can't use "save/open attachment"+ git-apply) nor a PR (can't use git
> fetch nor view in browser). If it's a series, I have to do that for each
> invididual patch, which usually means I simply don't (or rely on Junio
> doing it and fetch his xy/topic branch).

So you would like the opposite of my dream tool, I think: something that
takes mailing list conversations and turns them into PRs.

(My real dream is actually to have a bidirectional version of the tool,
so that everybody can use whatever interface they like, and nobody has
to care about somebody else's preferences).

> And more often than not, patches from series do not appear in sequence,
> not threaded on top of the cover letter (in the gmane nntp version of
> git-ml), and it usually happens for the same people again and again,
> which tells me it's a git sendemail config issue and not gmane.

Just a guess, but I suspect this is caused by people who use "rebase -i"
to rearrange patches. When format-patch writes out the patches, it uses
the author date as the "Date" field, which means it may be out of order.
I think send-email will always write out a new, monotonically increasing
date. But I suspect other workflows (e.g., imap-send and then mailing
from a MUA) blindly re-use that date.

> Suggestion
> ==
> 
> Maybe the current gmane woes are a good reason to try out something
> different for a month or two, with copies to the git-ml, and with the
> default being to revert back to git-ml after that and discuss what we've
> learned. As a result we may improve our workflow here, get GitHub to
> improve, and maybe switch or not. Either way we could profit from that.

I think public-inbox is a nice step forward on the reading side (it's a
lot easier to get raw patches out of it, for example). But it doesn't
help much with sending (and sending is a tricky subject; anytime you
promise to send mail on behalf of somebody, you're going to attract
spammers).

-Peff
-

Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Jeff King
On Tue, Aug 09, 2016 at 10:17:22AM +0100, Richard Ipsum wrote:

> > In the very unlikely event that github is shut down, how do we get all
> > review comments out of it, assuming that we will use pull requests for
> > review?
> 
> For what it's worth this is exactly why I think it would be worthwhile for git
> to establish a common review format, services like Github/Gitlab could then
> start storing reviews and comments in the git repo rather than in a separate
> sql database.

I doubt that the "rather than" part will ever happen. Git does not make
a very good database, and certainly not when you want to do things that
cut across repositories (like, say, efficiently get all review comments
made by one user).

It would be nice to have a common interchange format, though. In theory
that could feed into (and out of) a more efficient representation on the
backend of the site. It doesn't _have_ to be git-based, but it would be
nice if it was.

Somebody asked elsewhere "what happens if GitHub goes away?". And the
answer is that you can already get all of that data out in a
programmatic way, via the API. But since there's no common interchange
format, you'd be stuck writing a conversion to whatever format your new
destination uses.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Michael Haggerty
On 08/09/2016 06:22 AM, Duy Nguyen wrote:
> On Tue, Aug 9, 2016 at 12:20 AM, Michael Haggerty  
> wrote:
>> On 08/04/2016 05:58 PM, Johannes Schindelin wrote:
>>> [...]
>>> Even requiring every contributor to register with GitHub would be too much
>>> of a limitation, I would wager.
>>> [...]
>>
>> Is it *really* so insane to consider moving collaboration on the Git
>> project to GitHub or some other similar platform?
> 
> In the very unlikely event that github is shut down, how do we get all
> review comments out of it, assuming that we will use pull requests for
> review?

I don't have any experience with these tools, but a quick search turns
up the following possibilities (among others):

* github-backup (by Joey Hess): https://github.com/joeyh/github-backup
* python-github-backup: https://github.com/josegonzalez/python-github-backup
* BackHub (commercial service): https://backhub.co/
* Import GitHub project into GitLab:
http://docs.gitlab.com/ce/workflow/importing/import_projects_from_github.html

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] completion: complete --delete, --move, and --remotes for git branch

2016-08-09 Thread Ville Skyttä
Signed-off-by: Ville Skyttä 
---
 contrib/completion/git-completion.bash | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/contrib/completion/git-completion.bash 
b/contrib/completion/git-completion.bash
index 6a187bc..76abbd1 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -1008,8 +1008,8 @@ _git_branch ()
while [ $c -lt $cword ]; do
i="${words[c]}"
case "$i" in
-   -d|-m)  only_local_ref="y" ;;
-   -r) has_r="y" ;;
+   -d|--delete|-m|--move)  only_local_ref="y" ;;
+   -r|--remotes)   has_r="y" ;;
esac
((c++))
done
@@ -1023,7 +1023,7 @@ _git_branch ()
--color --no-color --verbose --abbrev= --no-abbrev
--track --no-track --contains --merged --no-merged
--set-upstream-to= --edit-description --list
-   --unset-upstream
+   --unset-upstream --delete --move --remotes
"
;;
*)
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Richard Ipsum
On Tue, Aug 09, 2016 at 06:22:21AM +0200, Duy Nguyen wrote:
> On Tue, Aug 9, 2016 at 12:20 AM, Michael Haggerty  
> wrote:
> > On 08/04/2016 05:58 PM, Johannes Schindelin wrote:
> >> [...]
> >> Even requiring every contributor to register with GitHub would be too much
> >> of a limitation, I would wager.
> >> [...]
> >
> > Is it *really* so insane to consider moving collaboration on the Git
> > project to GitHub or some other similar platform?
> 
> In the very unlikely event that github is shut down, how do we get all
> review comments out of it, assuming that we will use pull requests for
> review?

For what it's worth this is exactly why I think it would be worthwhile for git
to establish a common review format, services like Github/Gitlab could then
start storing reviews and comments in the git repo rather than in a separate
sql database.

Gerrit is already doing this with notedb, which literally gives you a
git log of a review. Admittedly with Gerrit the change metadata
sits in a separate git repo, still,
this is much better than the current situation with
Github and Gitlab in my opinion.

I apologise once again if my comments here are somewhat unrelated,
but I feel there is at least some overlap, since the existence of a
common review format for git could potentially make Github/Gitlab a more
attractive option, if Github/Gitlab chose to adopt such a format.

Really I think that reviews shouldn't be stored on mailing lists,
and they shouldn't be stored in sql databases,
they should be stored in git.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Spelling fixes

2016-08-09 Thread Ville Skyttä
Signed-off-by: Ville Skyttä 
---
 Documentation/RelNotes/2.3.10.txt   | 2 +-
 Documentation/RelNotes/2.4.10.txt   | 2 +-
 Documentation/RelNotes/2.5.4.txt| 2 +-
 Documentation/RelNotes/2.6.1.txt| 2 +-
 Documentation/git-remote-fd.txt | 2 +-
 Documentation/gitattributes.txt | 2 +-
 Documentation/gitmodules.txt| 2 +-
 contrib/hooks/multimail/README  | 4 ++--
 contrib/mw-to-git/.perlcriticrc | 2 +-
 contrib/mw-to-git/git-remote-mediawiki.perl | 2 +-
 contrib/subtree/t/t7900-subtree.sh  | 2 +-
 git-p4.py   | 2 +-
 sha1_file.c | 2 +-
 t/README| 2 +-
 t/t1006-cat-file.sh | 2 +-
 t/t3101-ls-tree-dirname.sh  | 2 +-
 t/t6018-rev-list-glob.sh| 2 +-
 t/t6030-bisect-porcelain.sh | 2 +-
 t/t7001-mv.sh   | 8 
 t/t7810-grep.sh | 2 +-
 t/t9401-git-cvsserver-crlf.sh   | 2 +-
 upload-pack.c   | 2 +-
 22 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/Documentation/RelNotes/2.3.10.txt 
b/Documentation/RelNotes/2.3.10.txt
index 9d425d8..20c2d2c 100644
--- a/Documentation/RelNotes/2.3.10.txt
+++ b/Documentation/RelNotes/2.3.10.txt
@@ -7,7 +7,7 @@ Fixes since v2.3.9
  * xdiff code we use to generate diffs is not prepared to handle
extremely large files.  It uses "int" in many places, which can
overflow if we have a very large number of lines or even bytes in
-   our input files, for example.  Cap the input size to soemwhere
+   our input files, for example.  Cap the input size to somewhere
around 1GB for now.
 
  * Some protocols (like git-remote-ext) can execute arbitrary code
diff --git a/Documentation/RelNotes/2.4.10.txt 
b/Documentation/RelNotes/2.4.10.txt
index 8621199..702d8d4 100644
--- a/Documentation/RelNotes/2.4.10.txt
+++ b/Documentation/RelNotes/2.4.10.txt
@@ -7,7 +7,7 @@ Fixes since v2.4.9
  * xdiff code we use to generate diffs is not prepared to handle
extremely large files.  It uses "int" in many places, which can
overflow if we have a very large number of lines or even bytes in
-   our input files, for example.  Cap the input size to soemwhere
+   our input files, for example.  Cap the input size to somewhere
around 1GB for now.
 
  * Some protocols (like git-remote-ext) can execute arbitrary code
diff --git a/Documentation/RelNotes/2.5.4.txt b/Documentation/RelNotes/2.5.4.txt
index a5e8477..b8a2f93 100644
--- a/Documentation/RelNotes/2.5.4.txt
+++ b/Documentation/RelNotes/2.5.4.txt
@@ -7,7 +7,7 @@ Fixes since v2.5.4
  * xdiff code we use to generate diffs is not prepared to handle
extremely large files.  It uses "int" in many places, which can
overflow if we have a very large number of lines or even bytes in
-   our input files, for example.  Cap the input size to soemwhere
+   our input files, for example.  Cap the input size to somewhere
around 1GB for now.
 
  * Some protocols (like git-remote-ext) can execute arbitrary code
diff --git a/Documentation/RelNotes/2.6.1.txt b/Documentation/RelNotes/2.6.1.txt
index 1e51363..f37ea89 100644
--- a/Documentation/RelNotes/2.6.1.txt
+++ b/Documentation/RelNotes/2.6.1.txt
@@ -7,7 +7,7 @@ Fixes since v2.6
  * xdiff code we use to generate diffs is not prepared to handle
extremely large files.  It uses "int" in many places, which can
overflow if we have a very large number of lines or even bytes in
-   our input files, for example.  Cap the input size to soemwhere
+   our input files, for example.  Cap the input size to somewhere
around 1GB for now.
 
  * Some protocols (like git-remote-ext) can execute arbitrary code
diff --git a/Documentation/git-remote-fd.txt b/Documentation/git-remote-fd.txt
index e700baf..80afca8 100644
--- a/Documentation/git-remote-fd.txt
+++ b/Documentation/git-remote-fd.txt
@@ -17,7 +17,7 @@ fetch, push or archive.
 
 If only  is given, it is assumed to be a bidirectional socket connected
 to remote Git server (git-upload-pack, git-receive-pack or
-git-upload-achive). If both  and  are given, they are assumed
+git-upload-archive). If both  and  are given, they are assumed
 to be pipes connected to a remote Git server ( being the inbound pipe
 and  being the outbound pipe.
 
diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 34db3e2..807577a 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -133,7 +133,7 @@ Set to string value "auto"::
When `text` is set to "auto", the path is marked for automatic
end-of-line conversion.  If Git decides that the content is
text, its line endings are converted to LF on checkin.
-   When the file has been commited with CRLF, no conversion is done.
+   When the file has

Thunderbird woes, was: Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Jakub Narębski
On 9 August 2016 at 10:11, Michael J Gruber  wrote:

> My own setup
> 
>
> My usual MUA is Thunderbird because of its integration with calendars
> and address books. I usually read and post to mailing lists via
> nntp/gmane, that works best for me for several reasons (e.g. archive
> available when I need it).
>
> For git-ml, I had to learn early on to answer by e-mail to git-ml rather
> than by nntp-reply because proper nntp-replies somehow didn't meet the
> expectations of the e-mail users (double copies because of the cc-policy
> or such, I don't remember).

I use either KNode or Thunderbird with NNTP/Gmane, and I don't have
any problems when doing "Reply All" even if I forget to remove nntp-reply.
You should do reply-all anyway, because not everyone is subscribed,
and not everyone uses nntp-ml.

> I use git sendemail even for small throw-in patches because the git-ml
> does not allow attachments but wants patches (files) as in-line text,
> and Thunderbird possibly reformats text (because it's text, you know).

For some strange reason Thunderbird used as NNTP reader does not
screw up with whitespace, while Thunderbird used as email client does.
Al least it did last time I forgot to create new email in its NNTP reader.

Note that git-format-patch has Thunderbird subsection in the
"MUA specific hints" section.

> When I want to try out a proposed patch I have to "save message" and run
> git-am because patches don't come as file attachments on the git-ml
> (can't use "save/open attachment"+ git-apply) nor a PR (can't use git
> fetch nor view in browser).

Inline are preferred over attachment because it is easier to review
and comment on online patches. Anyway, it is the same amount of
steps, and git-am preserves commit message.

>If it's a series, I have to do that 
> for each
> invididual patch, which usually means I simply don't (or rely on Junio
> doing it and fetch his xy/topic branch).

I think you can save-all on the whole thread...

Best,
-- 
Jakub Narębski
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: patch submission process, was Re: [PATCH v6 06/16] merge_recursive: abort properly upon errors

2016-08-09 Thread Michael J Gruber
Michael Haggerty venit, vidit, dixit 09.08.2016 01:20:
> Given that I work for GitHub, I'm uncomfortable doing any more advocacy
> here. If people have concrete questions, I'd be happy to answer them, on
> the list or in private.

You're doing a great job differentiating between your roles as a member
of the Git devel community and as a GitHub employee, so please keep the
discussion here.

Maybe two more points of input for the discussion:

off-line capabilities
=

The current workflow here seems to work best when you are subscribed to
the git-ml and have your own (local, maybe selective) copy of git-ml in
your (text-based) MUA (mutt, (al)pine, emacs, ...) that can jump right
into git-am and such directly. I'm not sure how important the "off-line"
aspect of that is for some of us, and how that could be replicated on
GitHub - while PRs and such may be Git based behind the scenes there
seems to be no way to clone that info and work from a local clone.
(Don't know if GitLab is more open.)

My own setup


My usual MUA is Thunderbird because of its integration with calendars
and address books. I usually read and post to mailing lists via
nntp/gmane, that works best for me for several reasons (e.g. archive
available when I need it).

For git-ml, I had to learn early on to answer by e-mail to git-ml rather
than by nntp-reply because proper nntp-replies somehow didn't meet the
expectations of the e-mail users (double copies because of the cc-policy
or such, I don't remember).

I use git sendemail even for small throw-in patches because the git-ml
does not allow attachments but wants patches (files) as in-line text,
and Thunderbird possibly reformats text (because it's text, you know).

When I want to try out a proposed patch I have to "save message" and run
git-am because patches don't come as file attachments on the git-ml
(can't use "save/open attachment"+ git-apply) nor a PR (can't use git
fetch nor view in browser). If it's a series, I have to do that for each
invididual patch, which usually means I simply don't (or rely on Junio
doing it and fetch his xy/topic branch).

And more often than not, patches from series do not appear in sequence,
not threaded on top of the cover letter (in the gmane nntp version of
git-ml), and it usually happens for the same people again and again,
which tells me it's a git sendemail config issue and not gmane.

So really, everytime I interact with the git-ml I think about switching
to mutt or such just for git-ml, even though over time I have gotten
used to the number of hoops that I have to jump through if I want to
interact with git-ml.

Suggestion
==

Maybe the current gmane woes are a good reason to try out something
different for a month or two, with copies to the git-ml, and with the
default being to revert back to git-ml after that and discuss what we've
learned. As a result we may improve our workflow here, get GitHub to
improve, and maybe switch or not. Either way we could profit from that.

Michael
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: storing cover letter of a patch series?

2016-08-09 Thread Michael J Gruber
Junio C Hamano venit, vidit, dixit 08.08.2016 19:42:
> Duy Nguyen  writes:
> 
>> git-notes was mentioned in this thread back in 2015, but I think it's
>> discarded because of the argument that's part of the cover letter was
>> not meant to be kept permanently.
> 
> I do not think the reason why we didn't think the notes mechanism
> was a good match was mainly because the cover letter material was
> about a branch as a whole, which does not have a good counter-part
> in Git at the conceptual level.  "A branch is just a moving pointer
> that points at one commit that happens to be at the tip" is not a
> perfect match to "I am holding these N patches to achieve X and I am
> constantly adding, rewinding and rebuilding".  The notes mechanism
> gives an easy way to describe the former (i.e. annotate one commit,
> and let various commands to move that notes as you rewind and
> rebuild) but not the latter (i.e. "branch.description" configuration
> is the best match, but that is just a check-box feature and does not
> make any serious attempt to be part of a version-control system).
> 
>> But I think we can still use it as a
>> local/temporary place for cover letter instead of the empty commit at
>> the topic's tip. It is a mark of the beginning of commit, it does not
>> require rewriting history when you update the cover letter, and
>> git-merge can be taught to pick it up when you're ready to set it in
>> stone.
> 
> That depends on what you exactly mean by "the beginning of".  Do you
> mean the first commit that is on the topic?  Then that still requires
> you to move it around when the topic is rebuilt.  If you mean the
> commit on the mainline the topic forks from, then of course that
> would not work, as you can fork multiple topics at the same commit.

Well, my idea back then was: attach notes to refs rather than commits,
in this case to the branch ref. That would solve both the "branch moves"
as well as the "cover letter refers to the whole branch/topic" issues.

In fact, I had an implementation that I had been rebasing and using for
quite some time, but it never became popular, and branch.description
landed in-tree. [short version: notes attached to (virtual) refname
objects (virtual blobs - not stored, but "existing" for (fsck, notes
prune and the like)]

The notes based approach suffered from the old notes deficiency: we
don't have good simple tooling for sharing notes; really, we don't have
good tooling for dealing with any remote refs besides branches (read:
ref namespace reorg project is stalled), unless you set up specific
refspecs.


OTOH, branch.description is inherently local, too, and can't even be
transported after setting up refspecs or such.

Also, notes trees have a history, so you would gain a log on your cover
letter edits; again, our tooling around that notes feature is
sub-optimal, that is: the feature is there, the ui could improve.

Michael

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >