Re: [PATCH] revision.c: Correctly dereference interesting_cache
Hi, Stefan Beller wrote: This was introduced at b6e8a3b5 (2015-04-17, limit_list: avoid quadratic behavior from still_interesting), which also introduced the check a few lines before, which already dereferences `interesting_cache`. So at this point `interesting_cache` is guaranteed to be not NULL. The above is the rationale for the coverity warning, but it does not explain why this change is safe. The code is called referencing the address of a local variable, so `interesting_cache` can actually never be NULL and trigger a segmentation fault by dereferencing it a few lines before this. I'm having trouble parsing this sentence. Do you mean that limit_list() only calls still_interesting() (and thus, indirectly, everybody_uninteresting()), with the second parameter equal to the address of the local interesting_cache variable, so it can never be NULL? That makes sense, but I had to look at the code and reread the above sentence a few times before I understood. Do you know what this code is trying to check for? What does it mean for *interesting_cache to be NULL? Should there be if (!interesting_cache) die(BUG: interesting_cache == NULL); checks at the top of still_interesting and everybody_uninteresting to futureproof this? What does the *interesting_cache variable represent, anyway? This code seems to be underdocumented. Thanks and hope that helps, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 06/19] fsck: Report the ID of the error/warning
Hi Junio, On 2015-06-19 21:28, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Some legacy code has objects with non-fatal fsck issues; To enable the user to ignore those issues, let's print out the ID (e.g. when encountering missingemail, the user might want to call `git config --add receive.fsck.missingemail=warn`). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 16 t/t1450-fsck.sh | 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 8e6faa8..0b3e18f 100644 --- a/fsck.c +++ b/fsck.c @@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) } } +static void append_msg_id(struct strbuf *sb, const char *msg_id) +{ +for (;;) { +char c = *(msg_id)++; + +if (!c) +break; +if (c != '_') +strbuf_addch(sb, tolower(c)); +} + +strbuf_addstr(sb, : ); +} + __attribute__((format (printf, 4, 5))) static int report(struct fsck_options *options, struct object *object, enum fsck_msg_id id, const char *fmt, ...) @@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object, struct strbuf sb = STRBUF_INIT; int msg_type = fsck_msg_type(id, options), result; +append_msg_id(sb, msg_id_info[id].id_string); Nice. The append function can be made a bit more context sensitive to upcase a char immediately after _ to make it easier to cut and paste into git config and keep the result readable, I think. git config --add receive.fsck.missingEmail=warn Okay. I camelCased the IDs; it is a bit sore on my eyes in the command-line output, and the config variables are case-insensitive, anyway, but your wish is my command... I changed it locally, it will be part of v7. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly
Johannes Schindelin johannes.schinde...@gmx.de writes: When fsck_tag() identifies a problem with the commit, it should try to make it possible to continue checking the commit object, in case the user wants to demote the detected errors to mere warnings. I agree with that. But if FSCK_MSG_BAD_OBJECT_SHA1 is an ignorable error, why should we still have a conditional goto done here? Shouldn't we be parsing the object the same way regardless? Just like fsck_commit(), there are certain problems that could hide other issues with the same tag object. For example, if the 'type' line is not encountered in the correct position, the 'tag' line – if there is any – would not be handled at all. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fsck.c b/fsck.c index 0cfa4d0..21e3052 100644 --- a/fsck.c +++ b/fsck.c @@ -640,7 +640,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data, } if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') { ret = report(options, tag-object, FSCK_MSG_BAD_OBJECT_SHA1, invalid 'object' line format - bad sha1); - goto done; + if (ret) + goto done; } buffer += 41; -- To unsubscribe from this list: send the line unsubscribe git in
Re: co-authoring commits
On Fri, Jun 19, 2015 at 8:18 PM, Jakub Narębski jna...@gmail.com wrote: On 2015-06-18 at 23:25, Tuncer Ayaz wrote: On Thu, Jun 18, 2015 at 12:52 AM, Theodore Ts'o wrote: On Wed, Jun 17, 2015 at 10:26:32PM +0200, Tuncer Ayaz wrote: [...] One could imagine some frankly, quite rare example where there is a team of people who votes on each commit before it gets sent out and where everyone is equal and there is no hierarchy. In that case, perhaps you could set the from field to a mailing list address. But honestly, how often is that *all* of the authors are completely equal[1]? For that case something like patchwork, phabricator, or gerrit seems to be the logical tool to use, and should ideally leave a trace of approvals and such in the resulting commit message(s). If the patch management tool takes care of merging the commit(s), it can be harder to misattribute signed-off/reviewed-by/etc, which is a good thing. Doesn't Gerrit (at least) use trailer-like structured *notes* in the 'reviews' category (i.e. refs/notes/reviews ref) to store information about review process? Don't remember if it does specifically, but I'm sure it can be configured to. I know Phabricator appends a lot upon doing the commit. I'll have to check it the next time I happen to use Gerrit. You could of course use multiple (everybody makes their own) commits, where you risk breaking bisectability and avoid the need for equal co-authorship support. In pair programming such intermediate commits will quite often be fixups, and when you attempt to squash the fixups for bisectability's sake, you may get a desire for co-authorship of the resulting commit. Hmmm... I didn't think about the problem of attributing authorship for squashed commits. Though here multiple 'author' headers, or Strictly speaking, in a live pair programming session usually the one currently typing will be corrected by the other dev, and the roles will switch when the typist changes. multiline 'author' header would be a better match than 'coauthor' header (which itself doesn't need, I think, the date filed, or does it?) What makes sense to me is that the date is decoupled from the list of authors and there is one date only. I think of it as an expanded version of the previously single-entry author field, where the limit has been lifted. With my purely git-user hat on, that is. So, yes, a multiline author header sounds like a solution to me. Maybe it should be sorted alphabetically upon commit regardless of the order of --author ONE --author TWO. [This is sent from Thunderbird news, so it should be all right] This is fine, the other one was broken. Out of curiosity what's the difference between Thunderbird email and news? -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
On Fri, Jun 19, 2015 at 01:53:01PM -0700, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Can you think of a name for the option that is as short as `--quick` but means the same as `--connectivity-only`? No I can't. I think `--connectivity-only` is a very good name that is unfortunately a mouthful, I agree that we need a name that is as short as `--x` that means the same as `--connectivity-only`. I do not think `--quick` is that word; it does not mean such a thing. How about `--linkage` or `--links`? -- Scott Schmit -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly
Hi Junio, On 2015-06-19 22:18, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: When fsck_tag() identifies a problem with the commit, it should try to make it possible to continue checking the commit object, in case the user wants to demote the detected errors to mere warnings. I agree with that. But if FSCK_MSG_BAD_OBJECT_SHA1 is an ignorable error, why should we still have a conditional goto done here? Shouldn't we be parsing the object the same way regardless? Same reason as I mentioned before: in `git receive-pack --strict` we want to fail early, there is no sense to keep going when we know that nothing of the rest will make it to the server. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: co-authoring commits
On Fri, Jun 19, 2015 at 11:11 PM, Tuncer Ayaz tuncer.a...@gmail.com wrote: On Fri, Jun 19, 2015 at 8:18 PM, Jakub Narębski jna...@gmail.com wrote: [This is sent from Thunderbird news, so it should be all right] This is fine, the other one was broken. Out of curiosity what's the difference between Thunderbird email and news? One was sent as Reply All from the news interface (nntp://news.gmane.org), one was sent as Reply All from the email interface (Gmail account). Damned if I know why the difference... -- Jakub Narebski -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
Junio C Hamano gits...@pobox.com writes: What would be the end-user experience if you stopped at the first error? You see an error, add an fsck.msg-id = ignore and rerun, only to find another error and rinse and repeat? Wouldn't you rather see all of them and add the ignore to cover them in one go? I actually see a really good reason to *keep* the current behavior: one of the most prominent users of this code path is `git receive-pack --strict`. It is used heavily by GitHub to ensure at least a certain level of validity of pushed objects. Now, for this use case it is easy to see that you want to stop *as soon as an error was encountered*. And as GitHub sponsors my work on this patch series, my main aim is to support their use case. While I understand that use case, I do not think stopping after showing three more errors in a single commit would make much difference in the bigger picture. I actually changed my mind. The above talks about the value given to the end user by noticing as many errors in a single object, but I'd think fsck.msg-id is pretty much useless as a tool to keep using a repository with malformed object in its history. When you are told object d6602ec is bad (that's the v0.99 tag that does not have tagger field), you would never want to say in this repository, any tag without tagger is allowed, because you would still want to catch and prevent future breakages of the same kind in new tags. And the way to do so is to say I know the object d6602ec is bad, so do not report breakage you find to me by using the skip-list. For that use case, showing _all_ errors (or warnings for that matter) does not add any value. So let's stop at the first error as your patch did. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors
Hi Junio, On 2015-06-19 22:22, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: #define FSCK_FATAL -1 +#define FSCK_INFO -2 #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ @@ -50,15 +51,16 @@ FUNC(ZERO_PADDED_DATE, ERROR) \ /* warnings */ \ FUNC(BAD_FILEMODE, WARN) \ -FUNC(BAD_TAG_NAME, WARN) \ FUNC(EMPTY_NAME, WARN) \ FUNC(FULL_PATHNAME, WARN) \ FUNC(HAS_DOT, WARN) \ FUNC(HAS_DOTDOT, WARN) \ FUNC(HAS_DOTGIT, WARN) \ -FUNC(MISSING_TAGGER_ENTRY, WARN) \ FUNC(NULL_SHA1, WARN) \ -FUNC(ZERO_PADDED_FILEMODE, WARN) +FUNC(ZERO_PADDED_FILEMODE, WARN) \ +/* infos (reported as warnings, but ignored by default) */ \ +FUNC(BAD_TAG_NAME, INFO) \ +FUNC(MISSING_TAGGER_ENTRY, INFO) Exactly the same comment as 12/19 applies to this change; not only complaints but also result makes sense part. And my explanation is the same ;-) At 02/19 time, it would just puzzle me, as a reader, to see special treatment without any good reason. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
Ahh, I didn't see that they were not grouped by object types, features or any meaningful axis. That explains it (i.e. I can now understand why the original list was ordered differently from the final order). On Fri, Jun 19, 2015 at 2:09 PM, Johannes Schindelin johannes.schinde...@gmx.de wrote: Hi Junio, On 2015-06-19 22:21, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Some kinds of errors are intrinsically unrecoverable (e.g. errors while uncompressing objects). It does not make sense to allow demoting them to mere warnings. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 14 -- t/t5504-fetch-receive-strict.sh | 11 +++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 21e3052..a4fbce3 100644 --- a/fsck.c +++ b/fsck.c @@ -9,7 +9,12 @@ #include refs.h #include utf8.h +#define FSCK_FATAL -1 + #define FOREACH_MSG_ID(FUNC) \ +/* fatal errors */ \ +FUNC(NUL_IN_HEADER, FATAL) \ +FUNC(UNTERMINATED_HEADER, FATAL) \ /* errors */ \ FUNC(BAD_DATE, ERROR) \ FUNC(BAD_DATE_OVERFLOW, ERROR) \ @@ -39,11 +44,9 @@ FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ FUNC(MULTIPLE_AUTHORS, ERROR) \ -FUNC(NUL_IN_HEADER, ERROR) \ FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ FUNC(TREE_NOT_SORTED, ERROR) \ FUNC(UNKNOWN_TYPE, ERROR) \ -FUNC(UNTERMINATED_HEADER, ERROR) \ I think the end result very much makes a good sense, but why didn't this list enumerate the errors in the above final order from the beginning in 02/19? Because they are alphabetically ordered, within message type categories, that is; this helped me develop with more ease (you do not want to know how many hundreds of times I ran an interactive rebase on all of these patches...). And from the point of a development story (which a patch series is), it would puzzle me, as a reader, if those two out of all the others were in front in 02/19, when they are no different from the others at that stage. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: Some kinds of errors are intrinsically unrecoverable (e.g. errors while uncompressing objects). It does not make sense to allow demoting them to mere warnings. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 14 -- t/t5504-fetch-receive-strict.sh | 11 +++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 21e3052..a4fbce3 100644 --- a/fsck.c +++ b/fsck.c @@ -9,7 +9,12 @@ #include refs.h #include utf8.h +#define FSCK_FATAL -1 + #define FOREACH_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER, FATAL) \ + FUNC(UNTERMINATED_HEADER, FATAL) \ /* errors */ \ FUNC(BAD_DATE, ERROR) \ FUNC(BAD_DATE_OVERFLOW, ERROR) \ @@ -39,11 +44,9 @@ FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(NUL_IN_HEADER, ERROR) \ FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ FUNC(TREE_NOT_SORTED, ERROR) \ FUNC(UNKNOWN_TYPE, ERROR) \ - FUNC(UNTERMINATED_HEADER, ERROR) \ I think the end result very much makes a good sense, but why didn't this list enumerate the errors in the above final order from the beginning in 02/19? -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 09/19] fsck: Handle multiple authors in commits specially
Hi Junio, On 2015-06-19 22:16, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: err = fsck_ident(buffer, commit-object, options); if (err) return err; +while (skip_prefix(buffer, author , buffer)) { +err = report(options, commit-object, FSCK_MSG_MULTIPLE_AUTHORS, invalid format - multiple 'author' lines); +if (err) +return err; +err = fsck_ident(buffer, commit-object, options); +if (err) +return err; +} Hmph, naively I would have expected that you wouldn't need an extra call to fsck_ident() here, and instead would see something like this: author_count = 0; while (skip_prefix(author )) { author_count++; ... do the existing check as-is ... } if (author_count 1) err |= report(missing author); else if (author_count 1) err |= report(multiple authors); Good idea! I fixed this in my branch and it will be part of v7. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
Johannes Schindelin johannes.schinde...@gmx.de writes: + if (strcmp(var, receive.fsck.skiplist) == 0) { + const char *path = is_absolute_path(value) ? + value : git_path(%s, value); This either absolute or inside $GIT_DIR looks somewhat strange to me. Shouldn't we mimick what git config --path does instead? -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
Hi Junio, On 2015-06-19 22:21, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Some kinds of errors are intrinsically unrecoverable (e.g. errors while uncompressing objects). It does not make sense to allow demoting them to mere warnings. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 14 -- t/t5504-fetch-receive-strict.sh | 11 +++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 21e3052..a4fbce3 100644 --- a/fsck.c +++ b/fsck.c @@ -9,7 +9,12 @@ #include refs.h #include utf8.h +#define FSCK_FATAL -1 + #define FOREACH_MSG_ID(FUNC) \ +/* fatal errors */ \ +FUNC(NUL_IN_HEADER, FATAL) \ +FUNC(UNTERMINATED_HEADER, FATAL) \ /* errors */ \ FUNC(BAD_DATE, ERROR) \ FUNC(BAD_DATE_OVERFLOW, ERROR) \ @@ -39,11 +44,9 @@ FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ FUNC(MULTIPLE_AUTHORS, ERROR) \ -FUNC(NUL_IN_HEADER, ERROR) \ FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ FUNC(TREE_NOT_SORTED, ERROR) \ FUNC(UNKNOWN_TYPE, ERROR) \ -FUNC(UNTERMINATED_HEADER, ERROR) \ I think the end result very much makes a good sense, but why didn't this list enumerate the errors in the above final order from the beginning in 02/19? Because they are alphabetically ordered, within message type categories, that is; this helped me develop with more ease (you do not want to know how many hundreds of times I ran an interactive rebase on all of these patches...). And from the point of a development story (which a patch series is), it would puzzle me, as a reader, if those two out of all the others were in front in 02/19, when they are no different from the others at that stage. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 06/19] fsck: Report the ID of the error/warning
That ... can be made ... was not my wish but more like the way the code is structured it is possible for somebody to do such a thing easily, well done compliment ;-) The message names will have to be shown somewhere in the documentation, and in Documentation/ we try to use camelCase to show the word boundary; it would be better to match that, as this output is meant to be used there. On Fri, Jun 19, 2015 at 2:34 PM, Johannes Schindelin johannes.schinde...@gmx.de wrote: Hi Junio, On 2015-06-19 21:28, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Some legacy code has objects with non-fatal fsck issues; To enable the user to ignore those issues, let's print out the ID (e.g. when encountering missingemail, the user might want to call `git config --add receive.fsck.missingemail=warn`). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 16 t/t1450-fsck.sh | 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 8e6faa8..0b3e18f 100644 --- a/fsck.c +++ b/fsck.c @@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) } } +static void append_msg_id(struct strbuf *sb, const char *msg_id) +{ +for (;;) { +char c = *(msg_id)++; + +if (!c) +break; +if (c != '_') +strbuf_addch(sb, tolower(c)); +} + +strbuf_addstr(sb, : ); +} + __attribute__((format (printf, 4, 5))) static int report(struct fsck_options *options, struct object *object, enum fsck_msg_id id, const char *fmt, ...) @@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object, struct strbuf sb = STRBUF_INIT; int msg_type = fsck_msg_type(id, options), result; +append_msg_id(sb, msg_id_info[id].id_string); Nice. The append function can be made a bit more context sensitive to upcase a char immediately after _ to make it easier to cut and paste into git config and keep the result readable, I think. git config --add receive.fsck.missingEmail=warn Okay. I camelCased the IDs; it is a bit sore on my eyes in the command-line output, and the config variables are case-insensitive, anyway, but your wish is my command... I changed it locally, it will be part of v7. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
Johannes Schindelin johannes.schinde...@gmx.de writes: When fsck_commit() identifies a problem with the commit, it should try to make it possible to continue checking the commit object, in case the user wants to demote the detected errors to mere warnings. That makes sense. Note that some problems are too problematic to simply ignore. For example, when the header lines are mixed up, we punt after encountering an incorrect line. Therefore, demoting certain warnings to errors can hide other problems. Example: demoting the missingauthor error to a warning would hide a problematic committer line. Is this a warning to end-users (which should be better in the doc), or because some of them are too problematic to ignore that forgot to add the explanation hence we do not keep going in this code (which should be in the log message if that is what is going on)? I notice that there are many instances of if (object does not pass some test) return report(...); that do not do err = report(); if (err) return; in this function after applying this patch. I think that answers the above question. The answer is because some are too problematic, even after this patch, we give up parsing the remainder of the commit object once we hit certain errors, leaving some other errors that appear later in the object undetected. I think that is a sensible design decision, but the proposed log message forgets to say so. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 28 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index 9faaf53..9fe9f48 100644 --- a/fsck.c +++ b/fsck.c @@ -534,12 +534,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, if (!skip_prefix(buffer, tree , buffer)) return report(options, commit-object, FSCK_MSG_MISSING_TREE, invalid format - expected 'tree' line); - if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') - return report(options, commit-object, FSCK_MSG_BAD_TREE_SHA1, invalid 'tree' line format - bad sha1); + if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') { + err = report(options, commit-object, FSCK_MSG_BAD_TREE_SHA1, invalid 'tree' line format - bad sha1); + if (err) + return err; + } I do not think this if (err) return err; that uses the return value of report(), makes sense. As all the errors that use this pattern are isolated ones that does not break parsing of the remainder (e.g. author ident had an extra in it may break author but that does not prevent us from checking committer ). Your report() switches its return value based on the user setting; specifically, it returns 0 if the user tells us to ignore/skip or warn. Which means that the user will see all warnings, but we stop at the first error. Shouldn't we continue regardless of the end-user setting in order to show errors on other fields, too? -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Hi Paul, On 2015-06-19 18:15, Paul Tan wrote: On Fri, Jun 19, 2015 at 11:09 PM, Junio C Hamano gits...@pobox.com wrote: The primary thing I care about is to discourage callers of the API element am_state from touching these fields with strbuf functions. If it is char * then the would think twice before modifying (which would involve allocating the new buffer, forming the new value and assigning into it). With the fields left as strbuf after they are read or formed by the API (by reading by the API implementation from $GIT_DIR/rebase-apply/), it is too tempting to do strbuf_add(), strbuf_trim(), etc., without restoring the value to the original saying I'm the last user of it anyway--that is the sloppiness we can prevent by not leaving it as strbuf. I don't think this is a good deterrent. If the code wanted to, it could just use strbuf_attach()/strbuf_detach() as well, no? Sadly, I am a bit too busy with some loose Git for Windows ends, so I haven't had the chance to look at your code. But I still wanted to throw this idea out there: would it maybe possible to avoid having those values as global variables, and instead pass them as const char * parameters to the respective functions? That should help resolve the concerns of both sides because it would allow us to keep the strbuf instances, just as local variables. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors
Johannes Schindelin johannes.schinde...@gmx.de writes: #define FSCK_FATAL -1 +#define FSCK_INFO -2 #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ @@ -50,15 +51,16 @@ FUNC(ZERO_PADDED_DATE, ERROR) \ /* warnings */ \ FUNC(BAD_FILEMODE, WARN) \ - FUNC(BAD_TAG_NAME, WARN) \ FUNC(EMPTY_NAME, WARN) \ FUNC(FULL_PATHNAME, WARN) \ FUNC(HAS_DOT, WARN) \ FUNC(HAS_DOTDOT, WARN) \ FUNC(HAS_DOTGIT, WARN) \ - FUNC(MISSING_TAGGER_ENTRY, WARN) \ FUNC(NULL_SHA1, WARN) \ - FUNC(ZERO_PADDED_FILEMODE, WARN) + FUNC(ZERO_PADDED_FILEMODE, WARN) \ + /* infos (reported as warnings, but ignored by default) */ \ + FUNC(BAD_TAG_NAME, INFO) \ + FUNC(MISSING_TAGGER_ENTRY, INFO) Exactly the same comment as 12/19 applies to this change; not only complaints but also result makes sense part. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
Johannes Schindelin johannes.schinde...@gmx.de writes: This option avoids unpacking each and all objects, and just verifies the connectivity. That sounds like marketing ;-) Wow this does not unpack unnecessarily, wait, it needs to unpack and parse 3 out of 4 kinds of objects? Jokes aside, given that you should regularly repack your repository anyway, I do not think it is such a big downside that this mode misses a corrupt objects, and the 1 out of 4 kinds of objects, i.e. blobs, occupy major part of the repository storage, so this new mode probably makes sense. diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index 922c346..2863a8a 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' ' test_must_fail git -C missing fsck ' +test_expect_success 'fsck --quick' ' + rm -rf quick + git init quick + ( + cd quick + touch empty + git add empty + test_commit empty + empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 + rm -f $empty + echo invalid $empty + test_must_fail git fsck --strict + git fsck --strict --quick + tree=$(git rev-parse HEAD:) + suffix=${tree#??} + tree=.git/objects/${tree%$suffix}/$suffix + rm -f $tree + echo invalid $tree + test_must_fail git fsck --strict --quick + ) +' + test_done I see a few impedance mismatch here. For --quick, I would have expected that the addition would be in t/perf/, not here. Also the fact that quickness comes by cheating on blobs is an implementation detail; in the future, perhaps somebody may come up with a way to do a quick fsck while making sure blob corruption is also detected. The new test that expects --quick to ignore a corrupt blob forbids such a progress. If the option name was --ignore-corrupt-blob, then the above change is 100% justified, though. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
Hi Junio, On 2015-06-19 22:12, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Note that some problems are too problematic to simply ignore. For example, when the header lines are mixed up, we punt after encountering an incorrect line. Therefore, demoting certain warnings to errors can hide other problems. Example: demoting the missingauthor error to a warning would hide a problematic committer line. Is this a warning to end-users (which should be better in the doc), or because some of them are too problematic to ignore that forgot to add the explanation hence we do not keep going in this code (which should be in the log message if that is what is going on)? It was intended to offer the explanation for the design decision you commented on later: I notice that there are many instances of if (object does not pass some test) return report(...); that do not do err = report(); if (err) return; in this function after applying this patch. I think that answers the above question. The answer is because some are too problematic, even after this patch, we give up parsing the remainder of the commit object once we hit certain errors, leaving some other errors that appear later in the object undetected. I think that is a sensible design decision, but the proposed log message forgets to say so. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 28 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index 9faaf53..9fe9f48 100644 --- a/fsck.c +++ b/fsck.c @@ -534,12 +534,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, if (!skip_prefix(buffer, tree , buffer)) return report(options, commit-object, FSCK_MSG_MISSING_TREE, invalid format - expected 'tree' line); -if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') -return report(options, commit-object, FSCK_MSG_BAD_TREE_SHA1, invalid 'tree' line format - bad sha1); +if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') { +err = report(options, commit-object, FSCK_MSG_BAD_TREE_SHA1, invalid 'tree' line format - bad sha1); +if (err) +return err; +} I do not think this if (err) return err; that uses the return value of report(), makes sense. As all the errors that use this pattern are isolated ones that does not break parsing of the remainder (e.g. author ident had an extra in it may break author but that does not prevent us from checking committer ). Your report() switches its return value based on the user setting; specifically, it returns 0 if the user tells us to ignore/skip or warn. Which means that the user will see all warnings, but we stop at the first error. Shouldn't we continue regardless of the end-user setting in order to show errors on other fields, too? I can make that happen, but please note that this is a change of behavior: we always stopped upon the first error. It was my intention not to change behavior in that way without a proper reason, and I saw none. I actually see a really good reason to *keep* the current behavior: one of the most prominent users of this code path is `git receive-pack --strict`. It is used heavily by GitHub to ensure at least a certain level of validity of pushed objects. Now, for this use case it is easy to see that you want to stop *as soon as an error was encountered*. And as GitHub sponsors my work on this patch series, my main aim is to support their use case. Having said that, I agree that it could actually make sense for `git fsck` to show all errors, or at least to have an option to do so. But that is a story for another night ;-) Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
Johannes Schindelin johannes.schinde...@gmx.de writes: Can you think of a name for the option that is as short as `--quick` but means the same as `--connectivity-only`? No I can't. I think `--connectivity-only` is a very good name that is unfortunately a mouthful, I agree that we need a name that is as short as `--x` that means the same as `--connectivity-only`. I do not think `--quick` is that word; it does not mean such a thing. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
Johannes Schindelin johannes.schinde...@gmx.de writes: I do not think this if (err) return err; that uses the return value of report(), makes sense. As all the errors that use this pattern are isolated ones that does not break parsing of the remainder (e.g. author ident had an extra in it may break author but that does not prevent us from checking committer ). Your report() switches its return value based on the user setting; specifically, it returns 0 if the user tells us to ignore/skip or warn. Which means that the user will see all warnings, but we stop at the first error. Shouldn't we continue regardless of the end-user setting in order to show errors on other fields, too? I can make that happen, but please note that this is a change of behavior: we always stopped upon the first error. Yeah, and we always died when we saw error, without giving users an option to turn it down. So? It was my intention not to change behavior in that way without a proper reason, and I saw none. What would be the end-user experience if you stopped at the first error? You see an error, add an fsck.msg-id = ignore and rerun, only to find another error and rinse and repeat? Wouldn't you rather see all of them and add the ignore to cover them in one go? I actually see a really good reason to *keep* the current behavior: one of the most prominent users of this code path is `git receive-pack --strict`. It is used heavily by GitHub to ensure at least a certain level of validity of pushed objects. Now, for this use case it is easy to see that you want to stop *as soon as an error was encountered*. And as GitHub sponsors my work on this patch series, my main aim is to support their use case. While I understand that use case, I do not think stopping after showing three more errors in a single commit would make much difference in the bigger picture. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 09/19] fsck: Handle multiple authors in commits specially
Johannes Schindelin johannes.schinde...@gmx.de writes: err = fsck_ident(buffer, commit-object, options); if (err) return err; + while (skip_prefix(buffer, author , buffer)) { + err = report(options, commit-object, FSCK_MSG_MULTIPLE_AUTHORS, invalid format - multiple 'author' lines); + if (err) + return err; + err = fsck_ident(buffer, commit-object, options); + if (err) + return err; + } Hmph, naively I would have expected that you wouldn't need an extra call to fsck_ident() here, and instead would see something like this: author_count = 0; while (skip_prefix(author )) { author_count++; ... do the existing check as-is ... } if (author_count 1) err |= report(missing author); else if (author_count 1) err |= report(multiple authors); -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] revision.c: Correctly dereference interesting_cache
On Fri, Jun 19, 2015 at 12:01:23PM -0700, Stefan Beller wrote: This was introduced at b6e8a3b5 (2015-04-17, limit_list: avoid quadratic behavior from still_interesting), which also introduced the check a few lines before, which already dereferences `interesting_cache`. So at this point `interesting_cache` is guaranteed to be not NULL. The code is called referencing the address of a local variable, so `interesting_cache` can actually never be NULL and trigger a segmentation fault by dereferencing it a few lines before this. Yeah, I agree it can never be NULL here or we would have already segfaulted. Thanks for digging into it. I think the right thing is to check for `*interesting_cache` as that can become NULL actually. I don't think this is right. We have found the interesting commit, so we want to write it into the cache unconditionally, not only if there is nothing else in the cache (we know if we got here that either there was no current cache item, or it already became UNINTERESTING). I think it is simply a misguided defensive measure to make sure that the caller passed in a cache slot to us. But there is only one caller, and they always pass a cache, so the first part of the function was lazy and not defensive. while (list) { struct commit *commit = list-item; list = list-next; if (commit-object.flags UNINTERESTING) continue; - if (interesting_cache) + if (*interesting_cache) *interesting_cache = commit; So I think the right solution is just to drop the conditional entirely. The current code is not wrong (it is always a noop). What you have here actually misbehaves; it does not update the cache slot when it has become UNINTERESTING. That does not produce wrong results, but it loses the benefit of the cache in some cases. -Peff -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v2 12/12] refs: move the remaining ref module declarations to refs.h
On 06/15/2015 08:35 PM, Jeff King wrote: On Mon, Jun 15, 2015 at 11:13:22AM -0700, Junio C Hamano wrote: @@ -78,15 +170,15 @@ typedef int each_ref_fn(const char *refname, * modifies the reference also returns a nonzero value to immediately * stop the iteration. */ -extern int head_ref(each_ref_fn, void *); +extern int head_ref(each_ref_fn fn, void *cb_data); For example, between these two, what did we gain? Because of their types, it already was clear what the two parameters are in the original, without noisewords like fn (which obviously stands for a function, but that is clear due to each_ref_fn). I think the real benefit of naming parameters is that you can talk about fn and cb_data by name in the docstring[1]. Of course we do not do that here, so we could clearly wait until if-and-when we do so. But I do not think it is a big deal for our style guide to say we always name parameters in declarations, and to bring things in line there (though perhaps it should be a separate patch in that case). I also like the idea that all parameters in declarations should be named. It is true that sometimes the purpose of a parameter is easy to guess from its type without a name. But giving them names make them easier to talk about in docstrings, in commit messages, or even on the mailing list when reviewing patches or discussing bugs. Moreover, I don't see the value of wasting mental energy wondering hmmm, in this case is the meaning sufficiently obvious from the type? every time I write a function declaration. It is easier to have a consistent policy of putting them in. Finally, when I'm inventing new functions (which I admit isn't the case here), I usually write the declaration first and then copy-paste it to the C file as the first step in writing the definition. If I name the parameters in the declaration then (a) I don't have to go back and edit them at the definition and (b) if parameter names are needed later at the declaration (e.g., for a docstring), I don't have to edit the declaration again, cross-referencing back to the C file to make sure I name them consistently. I will split the add names in declarations changes into a separate patch. Also, that way Junio can ignore the patch if he still disagrees :-) -extern int for_each_glob_ref_in(each_ref_fn, const char *pattern, const char* prefix, void *); +extern int for_each_glob_ref_in(each_ref_fn fn, const char *pattern, const char* prefix, void *cb_data); Likewise for addition of fn and cb_data. If you really want to make unrelated changes to this file, what you should fix is to update const char* prefix to const char *prefix ;-) IMHO they are in the same boat (style fixes), and I would be happy to see both improved. :) I will fix this too. Michael -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v2 08/12] initial_ref_transaction_commit(): function for initial ref creation
On 06/15/2015 08:53 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: +struct ref_transaction *t; +struct strbuf err = STRBUF_INIT; + +t = ref_transaction_begin(err); +if (!t) +die(err.buf); Yikes, and sorry for sending three messages without consolidating against the same patch, but die(%s, err.buf); because extern NORETURN void die(const char *err, ...) __attribute__((format (printf, 1, 2))); in its declaration causes -Werror=format-security to barf. Likewise for a few other instances of the same construct all in the same file. Thanks for catching this. I'll fix it in the re-roll (and also add that gcc option to my config.mak for the future). Michael -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
Johannes Schindelin johannes.schinde...@gmx.de writes: Jokes aside, given that you should regularly repack your repository anyway, I do not think it is such a big downside that this mode misses a corrupt objects, and the 1 out of 4 kinds of objects, i.e. blobs, occupy major part of the repository storage, so this new mode probably makes sense. It actually makes a ton of sense as a kind of light-weight check ;-) Yes, didn't I agree that it makes sense already? The meaning of quick that I was thinking of was not the same as fast, but more like just a quick check. As in quick dirty ;-) Sure. The point is not to ignore corrupt blobs, by the way, it is to check the connectivity only, and save substantial amounts of time doing so. Yeah, I understand that; after all, that is exactly why I said make sure it does not notice corrupt blobs is not a good test. On the other hand, you are also checking that it notices a broken tree (which makes it impossible to complete connectivity check), which is a good test. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 17/19] fsck: Introduce `git fsck --quick`
Scott Schmit i.g...@comcast.net writes: On Fri, Jun 19, 2015 at 01:53:01PM -0700, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: Can you think of a name for the option that is as short as `--quick` but means the same as `--connectivity-only`? No I can't. I think `--connectivity-only` is a very good name that is unfortunately a mouthful, I agree that we need a name that is as short as `--x` that means the same as `--connectivity-only`. I do not think `--quick` is that word; it does not mean such a thing. How about `--linkage` or `--links`? Even though link may be shorter than connectivity, the real difficulty is to come up with a phrase that conveys the -only part of the fully spelled name, which is more important, without spending 5 letters that take to say -only. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] format-patch: introduce format.outputDirectory configuration
Hello Jeff and Junio, Thank you for feedback and help. I think also I need to add yet another test which tests case when configuration option is set and -o passed. I'll make changes and resend the patch. Thank you. 2015-06-19 10:14 GMT+06:00 Jeff King p...@peff.net: On Thu, Jun 18, 2015 at 02:46:36PM -0700, Junio C Hamano wrote: If I were designing from scratch, I would consider making -o - output to stdout, and letting it override a previous -o (or vice versa). We could still do that (and make --stdout an alias for that), but I don't know if it is worth the trouble (it does change the behavior for anybody who wanted a directory called -, but IMHO it is more likely to save somebody a headache than create one). I agree with later -o should override an earlier one, but I do not necessarily agree with '-o -' should be --stdout, for a simple reason that -o foo is not --stdout foo. Good point. At any rate, that was all in my designing from scratch hypothetical, so it is doubly not worth considering. Perhaps something like this to replace builtin/ part of Alexander's patch? [...] @@ -1337,6 +1342,9 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) die (_(--subject-prefix and -k are mutually exclusive.)); rev.preserve_subject = keep_subject; + if (!output_directory !use_stdout) + output_directory = config_output_directory; + Yeah, I think that is the sanest way to do it given the constraints. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks
On Fri, Jun 19, 2015 at 01:32:23AM -0400, Jeff King wrote: The least-work thing may actually be teaching the separate diff-highlight script to strip out the colorizing and re-add it by offset. OK, here is that patch. It seems to work OK, and should preserve existing colors produced by git anywhere within the input. It should also serve as a stepping stone to using the same technique inside git itself. I am not planning to do that anytime soon, though. I took a look at the word-diff code earlier today, and my brain partially melted. I'm sure it's possible, but I've already spent enough time and brain cells on this for now. I think this patch does need some more polishing, or more heuristics, though. Some of the output is really fantastic. Here's a simple one from 69f9a6e54: -* link:v2.4.2/git.html[documentation for release 2.4.2] +* link:v2.4.3/git.html[documentation for release 2.4.3] The angle brackets show what is highlighted, but that really doesn't do it justice. I encourage people to apply the patch and git show --color $commit | contrib/diff-highlight/diff-highlight to see for yourselves (I'm hesitant to assume that sending ANSI color codes via email will end up readable for anyone). What's interesting about that example (that didn't work before) is that we can find multiple changes and highlight them independently (whereas before, we would have highlighted everything between them, too). Here's a better code example from 9cc2b07a7: -int parse_commit(struct commit *item) +int parse_commit_gently(struct commit *item, int quiet_on_missing) It highlights the change in the function name and the matching change to the parameter list. And from a few lines below in the same diff: - return error(Could not read %s, + return quiet_on_missing ? -1 : + error(Could not read %s, This demonstrates it working on multiple lines, and in particular one where the matching content actually crosses a line boundary. So that's all the good news. Here's some bad news, from f0e7f11d054: - return offset1 - offset2; + return offset1 offset2 ? -1 : + offset1 offset2 ? 1 : + 0; I don't think adding angle brackets to this one would do it justice, but you can look for yourself. What happens is that we match offset1, but then the - from the pre-image matches the first character of -1 from the post-image. And all of offset2 ? is highlighted, then not -, then 1 :, and so on. The result is rather confusing to read. I think that can probably be fixed by smarter tokenizing of what we feed to the per-hunk diff (i.e., -1 should be a full token). It also does a weird thing where the offset1 from the second post-image line is _not_ highlighted, and it should be (even though it's ugly, it is the correct thing to do). I think what is happening is that git's existing colorizing is getting in the way. It does not say turn on green; OK, now turn off green. It says turn on green; OK, now reset all colors. So at the beginning of each post-image line, we will accidentally turn off any highlighting carried over from the previous line. That's probably OK in general, because highlighting across lines is a good indication that it's not helpful. But it is a little hacky. And here's some more bad news. If you look at the diff for this patch itself, it's terribly unreadable (the regular diff already is pretty bad, but the highlights make it much worse). There are big chunks where we take away 5 or 10 lines from the old code, and replace them with totally unrelated lines. We end up highlighting almost the entire thing, except for spaces and punctuation. We might be able to solve this with a percentage heuristic similar to the one Patrick proposed. It's not really interesting to highlight unless we're doing it on probably 20% or less of the diff (where 20% is a number I just made up). I also have a feeling that we might get better results if we don't feed the delimiters into the diff algorithm. I think the word-diff code actually tokenizes foo bar baz into the list (foo, bar, baz) and diffs only that. I think it then _loses_ the original whitespace, but here we would need to record and restore it. We might be able to use the same mechanism as for stripping and restoring the colors. I'm not sure. Anyway, here's the patch. It's rather messy, so I'd recommend just reading the resulting code; I wish there was a good way to convince the diff engine to produce larger hunks. --- contrib/diff-highlight/diff-highlight | 202 +++--- 1 file changed, 86 insertions(+), 116 deletions(-) diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight index ffefc31..1525ccc 100755 --- a/contrib/diff-highlight/diff-highlight +++ b/contrib/diff-highlight/diff-highlight @@ -3,6 +3,7 @@ use 5.008; use warnings FATAL = 'all'; use strict; +use Algorithm::Diff; # Highlight
Re: [PATCH/RFC v4 07/10] send-email: reduce dependancies impact on parse_address_line
Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr writes: Matthieu Moy matthieu@grenoble-inp.fr writes: Cool. Then almost all the work is done to get an automated test. Next step would be to add the tests itself in the code. I would do that by adding a hidden --selfcheck option to git send-email that would compare Mail::Address-parse($string); and split_addrs($string); for all your testcases, and die if they do not match. Then calling it from the testsuite would be trivial. Ok, are there such --selfcheck options elsewhere? Not as far as I know. If I understand it right, you want to put the tests inside the git-send-email script. I don't feel really good about that but I guess it's hard to test it otherwise... Hmm, actually there is, I didn't look at the right places yesterday. git-send-email.perl already does 'use Git;', and there's already a set of unit-tests for Git.pm: t9700-perl-git.sh, which calls perl $TEST_DIRECTORY/t9700/test.pl. So, you can just add your code as a function in Git.pm and unit-tests in t/t9700/test.pl. Also what will we do with the failing tests? Just discard them? I think there's two sort of failing test: - When output provided by parse_address_ without Mail::Address is better or has no impact at all on the code. Such as: I'm not sure we can be better as long as we do use Mail::Address when available. Any difference is potentially harmfull for the user because it means that Git will have different behavior on different machines. Perhaps this is an argument to use your version unconditionally and drop Mail::Address actually. But you can still test that with is(parse_address_(...), Doe, Jane, description); (possibly not calling Mail::Address) http://search.cpan.org/~exodist/Test-Simple-1.001014/lib/Test/More.pm The cases where Mail::Address and your version give the same result can be tested with a foreach loop calling is(parse_address_(...), Mail::Address(...), ...); - When we don't really care about the output, because the user entry is wrong, and we just expect the script to be aborted somehow... We don't need to test that. ... but if you already have the tests, you can keep them as known failure. See the TODO: BLOCK section of the doc of Test::More. I can do that on top of your series if you don't have time. Time will become a problem soon, but I think I can handle it unless you really want to do it ! If you have time, just do it. Thanks, -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Bug report: Unhandled Exception
Hi Gary, On 2015-06-11 21:21, Gary England wrote: Using git version 1.9.2-preview20140411, in Git Bash for Windows, performing a git pull --rebase, received an unhandled exception. Please note that the newest 1.9.x release is 1.9.5. Could you re-test with that version, please? Or maybe you want to give the 2nd release candidate of Git for Windows 2.x a try: https://git-for-windows.github.io/#download Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 09/19] fsck: Handle multiple authors in commits specially
This problem has been detected in the wild, and is the primary reason to introduce an option to demote certain fsck errors to warnings. Let's offer to ignore this particular problem specifically. Technically, we could handle such repositories by setting receive.fsck.msg-id to missingcommitter=warn, but that could hide missing tree objects in the same commit because we cannot continue verifying any commit object after encountering a missing committer line, while we can continue in the case of multiple author lines. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 9 + 1 file changed, 9 insertions(+) diff --git a/fsck.c b/fsck.c index 9fe9f48..0cfa4d0 100644 --- a/fsck.c +++ b/fsck.c @@ -38,6 +38,7 @@ FUNC(MISSING_TREE, ERROR) \ FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ + FUNC(MULTIPLE_AUTHORS, ERROR) \ FUNC(NUL_IN_HEADER, ERROR) \ FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ FUNC(TREE_NOT_SORTED, ERROR) \ @@ -571,6 +572,14 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, err = fsck_ident(buffer, commit-object, options); if (err) return err; + while (skip_prefix(buffer, author , buffer)) { + err = report(options, commit-object, FSCK_MSG_MULTIPLE_AUTHORS, invalid format - multiple 'author' lines); + if (err) + return err; + err = fsck_ident(buffer, commit-object, options); + if (err) + return err; + } if (!skip_prefix(buffer, committer , buffer)) return report(options, commit-object, FSCK_MSG_MISSING_COMMITTER, invalid format - expected 'committer' line); err = fsck_ident(buffer, commit-object, options); -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 13/19] fsck: Optionally ignore specific fsck issues completely
An fsck issue in a legacy repository might be so common that one would like not to bother the user with mentioning it at all. With this change, that is possible by setting the respective message type to ignore. This change abuses the missingemail=warn test to verify that ignore is also accepted and works correctly. And while at it, it makes sure that multiple options work, too (they are passed to unpack-objects or index-pack as a comma-separated list via the --strict=... command-line option). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 5 + fsck.h | 1 + t/t5504-fetch-receive-strict.sh | 9 - 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/fsck.c b/fsck.c index a4fbce3..b728646 100644 --- a/fsck.c +++ b/fsck.c @@ -137,6 +137,8 @@ static int parse_msg_type(const char *str, int len) return FSCK_ERROR; else if (!substrcmp(str, len, warn)) return FSCK_WARN; + else if (!substrcmp(str, len, ignore)) + return FSCK_IGNORE; else die(Unknown fsck message type: '%.*s', len, str); @@ -220,6 +222,9 @@ static int report(struct fsck_options *options, struct object *object, struct strbuf sb = STRBUF_INIT; int msg_type = fsck_msg_type(id, options), result; + if (msg_type == FSCK_IGNORE) + return 0; + if (msg_type == FSCK_FATAL) msg_type = FSCK_ERROR; diff --git a/fsck.h b/fsck.h index 738c9df..7e49372 100644 --- a/fsck.h +++ b/fsck.h @@ -3,6 +3,7 @@ #define FSCK_ERROR 1 #define FSCK_WARN 2 +#define FSCK_IGNORE 3 struct fsck_options; diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh index 0d64229..cb077b7 100755 --- a/t/t5504-fetch-receive-strict.sh +++ b/t/t5504-fetch-receive-strict.sh @@ -133,7 +133,14 @@ test_expect_success 'push with receive.fsck.missingemail=warn' ' git --git-dir=dst/.git config \ receive.fsck.missingemail warn git push --porcelain dst bogus act 21 - grep missingemail act + grep missingemail act + git --git-dir=dst/.git branch -D bogus + git --git-dir=dst/.git config --add \ + receive.fsck.missingemail ignore + git --git-dir=dst/.git config --add \ + receive.fsck.baddate warn + git push --porcelain dst bogus act 21 + test_must_fail grep missingemail act ' test_expect_success \ -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 10/19] fsck: Make fsck_tag() warn-friendly
When fsck_tag() identifies a problem with the commit, it should try to make it possible to continue checking the commit object, in case the user wants to demote the detected errors to mere warnings. Just like fsck_commit(), there are certain problems that could hide other issues with the same tag object. For example, if the 'type' line is not encountered in the correct position, the 'tag' line – if there is any – would not be handled at all. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/fsck.c b/fsck.c index 0cfa4d0..21e3052 100644 --- a/fsck.c +++ b/fsck.c @@ -640,7 +640,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data, } if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') { ret = report(options, tag-object, FSCK_MSG_BAD_OBJECT_SHA1, invalid 'object' line format - bad sha1); - goto done; + if (ret) + goto done; } buffer += 41; -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH 3/3] Add filter-objects command
From: Charles Bailey cbaile...@bloomberg.net filter-objects is a command to scan all objects in the object database for the repository and print the ids of those which match the given criteria. The current supported criteria are object type and the minimum size of the object. The guiding use case is to scan repositories quickly for large objects which may cause performance issues for users. The list of objects can then be used to guide some future remediating action. Signed-off-by: Charles Bailey cbaile...@bloomberg.net --- Documentation/git-filter-objects.txt | 38 +++ Makefile | 1 + builtin.h| 1 + builtin/filter-objects.c | 73 git.c| 1 + t/t8100-filter-objects.sh| 67 + 6 files changed, 181 insertions(+) create mode 100644 Documentation/git-filter-objects.txt create mode 100644 builtin/filter-objects.c create mode 100755 t/t8100-filter-objects.sh diff --git a/Documentation/git-filter-objects.txt b/Documentation/git-filter-objects.txt new file mode 100644 index 000..c10ca01 --- /dev/null +++ b/Documentation/git-filter-objects.txt @@ -0,0 +1,38 @@ +git-filter-objects(1) += + +NAME + +git-filter-objects - Scan through all objects in the repository and print those +matching a given filter + + +SYNOPSIS + +[verse] +'git filter-objects' [-t type | --type=type] [--min-size=size] + [-v|--verbose] + +DESCRIPTION +--- +Scans all objects in a repository - including any unreachable objects - and +print out the ids of all matching objects. If `--verbose` is specified then +the object type and size is printed out as well as its id. + +OPTIONS +--- +-t:: +--type:: + Only list objects whose type matches type. + +--min-size:: + Only list objects whose size exceeds size bytes. + +-v:: +--verbose:: + Output in the followin format instead of just printing object ids: + sha1 SP type SP size + +GIT +--- +Part of the linkgit:git[1] suite diff --git a/Makefile b/Makefile index 149f1c7..a7c017f 100644 --- a/Makefile +++ b/Makefile @@ -842,6 +842,7 @@ BUILTIN_OBJS += builtin/diff.o BUILTIN_OBJS += builtin/fast-export.o BUILTIN_OBJS += builtin/fetch-pack.o BUILTIN_OBJS += builtin/fetch.o +BUILTIN_OBJS += builtin/filter-objects.o BUILTIN_OBJS += builtin/fmt-merge-msg.o BUILTIN_OBJS += builtin/for-each-ref.o BUILTIN_OBJS += builtin/fsck.o diff --git a/builtin.h b/builtin.h index b87df70..5a15693 100644 --- a/builtin.h +++ b/builtin.h @@ -62,6 +62,7 @@ extern int cmd_diff_tree(int argc, const char **argv, const char *prefix); extern int cmd_fast_export(int argc, const char **argv, const char *prefix); extern int cmd_fetch(int argc, const char **argv, const char *prefix); extern int cmd_fetch_pack(int argc, const char **argv, const char *prefix); +extern int cmd_filter_objects(int argc, const char **argv, const char *prefix); extern int cmd_fmt_merge_msg(int argc, const char **argv, const char *prefix); extern int cmd_for_each_ref(int argc, const char **argv, const char *prefix); extern int cmd_format_patch(int argc, const char **argv, const char *prefix); diff --git a/builtin/filter-objects.c b/builtin/filter-objects.c new file mode 100644 index 000..c40d621 --- /dev/null +++ b/builtin/filter-objects.c @@ -0,0 +1,73 @@ +#include cache.h +#include builtin.h +#include revision.h +#include parse-options.h + +#include stdio.h + +static int req_type; +static unsigned long min_size; +static int verbose; + +static int check_object(const unsigned char *sha1) +{ + unsigned long size; + int type = sha1_object_info(sha1, size); + + if (type 0) + return -1; + + if (size = min_size (!req_type || type == req_type)) { + if (verbose) + printf(%s %s %lu\n, sha1_to_hex(sha1), typename(type), size); + else + printf(%s\n, sha1_to_hex(sha1)); + } + + return 0; +} + +static int check_loose_object(const unsigned char *sha1, + const char *path, + void *data) +{ + return check_object(sha1); +} + +static int check_packed_object(const unsigned char *sha1, + struct packed_git *pack, + uint32_t pos, + void *data) +{ + return check_object(sha1); +} + +static char *opt_type; +static struct option builtin_filter_objects_options[] = { + OPT_ULONG(0, min-size, min_size, minimum size of object to show), + OPT_STRING('t', type, opt_type, NULL, type of objects to show), + OPT__VERBOSE(verbose, show object type and size), + OPT_END() +}; + +int cmd_filter_objects(int argc, const char **argv, const char *prefix) +{ + struct packed_git *p; + + argc =
Re: [PATCH/WIP v3 03/31] am: implement skeletal builtin am
On Fri, Jun 19, 2015 at 4:26 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: @@ -0,0 +1,28 @@ +/* + * Builtin git am + * + * Based on git-am.sh by Junio C Hamano. + */ +#include cache.h +#include builtin.h +#include exec_cmd.h + +int cmd_am(int argc, const char **argv, const char *prefix) +{ + /* + * FIXME: Once all the features of git-am.sh have been re-implemented + * in builtin/am.c, this preamble can be removed. + */ It's not broken, so FIXME is not quite appropriate (and that is why I sent you NEEDSWORK). OK. Also mention that the entry in the commands[] array needs RUN_SETUP | NEED_WORK_TREE added, I think. OK. + if (!getenv(_GIT_USE_BUILTIN_AM)) { + const char *path = mkpath(%s/git-am, git_exec_path()); + + if (sane_execvp(path, (char **)argv) 0) + die_errno(could not exec %s, path); + } else { + prefix = setup_git_directory(); + trace_repo_setup(prefix); + setup_work_tree(); + } + + return 0; +} diff --git a/git.c b/git.c index e7a7713..a671535 100644 --- a/git.c +++ b/git.c @@ -370,6 +370,7 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv) static struct cmd_struct commands[] = { { add, cmd_add, RUN_SETUP | NEED_WORK_TREE }, + { am, cmd_am, NO_SETUP }, NO_SETUP is for things like init and clone that start without a repository and then work in the one that they create. I think imitating archive or diff is more appropriate. Ah OK. I thought that handle_alias() in git.c would chdir() and set GIT_DIR because it called setup_git_directory_gently(), and thus requires NO_SETUP to restore the initial environment, but turns out it chdir()s back to the original directory, and sets GIT_DIR appropriately, so we're OK. Thanks, Paul -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 00/19] Make git-pull a builtin
On Fri, Jun 19, 2015 at 4:13 AM, Junio C Hamano gits...@pobox.com wrote: I didn't look carefully, but does that mean 04/19 has the what if you start from a subdirectory and are still using the scripted one? issue we discussed recently for am? It does, but git-pull.sh does not care about the original working directory, no? Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] format-patch: introduce format.outputDirectory configuration
2015-06-19 3:46 GMT+06:00 Junio C Hamano gits...@pobox.com: I agree with later -o should override an earlier one, but I do not necessarily agree with '-o -' should be --stdout, for a simple reason that -o foo is not --stdout foo. Perhaps something like this to replace builtin/ part of Alexander's patch? @@ -1337,6 +1342,9 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) die (_(--subject-prefix and -k are mutually exclusive.)); rev.preserve_subject = keep_subject; + if (!output_directory !use_stdout) + output_directory = config_output_directory; + But there is following condition above: if (!use_stdout) output_directory = set_outdir(prefix, output_directory); After which output_directory will be ./ everytime and + if (!output_directory !use_stdout) + output_directory = config_output_directory; + will not work here. What if we remove if (output_directory) {} and update it as: if (!use_stdout) { if (!config_output_directory !output_directory) output_directory = set_outdir(prefix, output_directory); else if (config_output_directory) output_directory = config_output_directory; if (mkdir(output_directory, 0777) 0 errno != EEXIST) die_errno(_(Could not create directory '%s'), output_directory); } else setup_pager(); ? -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 03/19] fsck: Provide a function to parse fsck message IDs
These functions will be used in the next commits to allow the user to ask fsck to handle specific problems differently, e.g. demoting certain errors to warnings. The upcoming `fsck_set_msg_types()` function has to handle partial strings because we would like to be able to parse, say, 'missingemail=warn,missingtaggerentry=warn' command line parameters (which will be passed by receive-pack to index-pack and unpack-objects). To make the parsing robust, we generate strings from the enum keys, and using these keys, we match up strings without dashes case-insensitively to the corresponding enum values. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 30 -- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index ab24618..da5717c 100644 --- a/fsck.c +++ b/fsck.c @@ -63,15 +63,41 @@ enum fsck_msg_id { }; #undef MSG_ID -#define MSG_ID(id, msg_type) { FSCK_##msg_type }, +#define STR(x) #x +#define MSG_ID(id, msg_type) { STR(id), FSCK_##msg_type }, static struct { + const char *id_string; int msg_type; } msg_id_info[FSCK_MSG_MAX + 1] = { FOREACH_MSG_ID(MSG_ID) - { -1 } + { NULL, -1 } }; #undef MSG_ID +static int parse_msg_id(const char *text, int len) +{ + int i, j; + + if (len 0) + len = strlen(text); + + for (i = 0; i FSCK_MSG_MAX; i++) { + const char *key = msg_id_info[i].id_string; + /* match id_string case-insensitively, without underscores. */ + for (j = 0; j len; j++) { + char c = *(key++); + if (c == '_') + c = *(key++); + if (toupper(text[j]) != c) + break; + } + if (j == len !*key) + return i; + } + + return -1; +} + static int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options) { -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
There are legacy repositories out there whose older commits and tags have issues that prevent pushing them when 'receive.fsckObjects' is set. One real-life example is a commit object that has been hand-crafted to list two authors. Often, it is not possible to fix those issues without disrupting the work with said repositories, yet it is still desirable to perform checks by setting `receive.fsckObjects = true`. This commit is the first step to allow demoting specific fsck issues to mere warnings. The `fsck_set_msg_types()` function added by this commit parses a list of settings in the form: missingemail=warn,badname=warn,... Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by git fsck so far, but other call paths (e.g. git index-pack --strict) error out *always* no matter what type was specified. Therefore, we need to take extra care to set all message types to FSCK_ERROR by default in those cases. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 82 +++--- fsck.h | 10 ++-- 2 files changed, 87 insertions(+), 5 deletions(-) diff --git a/fsck.c b/fsck.c index da5717c..8c3caff 100644 --- a/fsck.c +++ b/fsck.c @@ -103,13 +103,85 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, { int msg_type; - msg_type = msg_id_info[msg_id].msg_type; - if (options-strict msg_type == FSCK_WARN) - msg_type = FSCK_ERROR; + assert(msg_id = 0 msg_id FSCK_MSG_MAX); + + if (options-msg_type) + msg_type = options-msg_type[msg_id]; + else { + msg_type = msg_id_info[msg_id].msg_type; + if (options-strict msg_type == FSCK_WARN) + msg_type = FSCK_ERROR; + } return msg_type; } +static inline int substrcmp(const char *string, int len, const char *match) +{ + int match_len = strlen(match); + if (match_len != len) + return -1; + return memcmp(string, match, len); +} + +static int parse_msg_type(const char *str, int len) +{ + if (len 0) + len = strlen(str); + + if (!substrcmp(str, len, error)) + return FSCK_ERROR; + else if (!substrcmp(str, len, warn)) + return FSCK_WARN; + else + die(Unknown fsck message type: '%.*s', + len, str); +} + +void fsck_set_msg_type(struct fsck_options *options, + const char *msg_id, int msg_id_len, + const char *msg_type, int msg_type_len) +{ + int id = parse_msg_id(msg_id, msg_id_len), type; + + if (id 0) + die(Unhandled message id: %.*s, msg_id_len, msg_id); + type = parse_msg_type(msg_type, msg_type_len); + + if (!options-msg_type) { + int i; + int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX); + for (i = 0; i FSCK_MSG_MAX; i++) + msg_type[i] = fsck_msg_type(i, options); + options-msg_type = msg_type; + } + + options-msg_type[id] = type; +} + +void fsck_set_msg_types(struct fsck_options *options, const char *values) +{ + while (*values) { + int len = strcspn(values, ,|), equal; + + if (!len) { + values++; + continue; + } + + for (equal = 0; equal len; equal++) + if (values[equal] == '=' || values[equal] == ':') + break; + + if (equal == len) + die(Missing '=': '%.*s', len, values); + + fsck_set_msg_type(options, values, equal, + values + equal + 1, len - equal - 1); + values += len; + } +} + __attribute__((format (printf, 4, 5))) static int report(struct fsck_options *options, struct object *object, enum fsck_msg_id id, const char *fmt, ...) @@ -599,6 +671,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct object *obj, int msg_type, const char *message) { + if (msg_type == FSCK_WARN) { + warning(object %s: %s, sha1_to_hex(obj-sha1), message); + return 0; + } error(object %s: %s, sha1_to_hex(obj-sha1), message); return 1; } diff --git a/fsck.h b/fsck.h index f6f268a..edb4540 100644 --- a/fsck.h +++ b/fsck.h @@ -6,6 +6,11 @@ struct fsck_options; +void fsck_set_msg_type(struct fsck_options *options, + const char *msg_id, int msg_id_len, + const char *msg_type, int msg_type_len); +void fsck_set_msg_types(struct fsck_options *options, const char *values); + /* * callback function for fsck_walk * type is the expected type of the object or OBJ_ANY @@ -25,10 +30,11 @@ struct fsck_options { fsck_walk_func walk; fsck_error
[PATCH v6 17/19] fsck: Introduce `git fsck --quick`
This option avoids unpacking each and all objects, and just verifies the connectivity. In particular with large repositories, this speeds up the operation, at the expense of missing corrupt blobs and ignoring unreachable objects, if any. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- Documentation/git-fsck.txt | 7 ++- builtin/fsck.c | 7 ++- t/t1450-fsck.sh| 22 ++ 3 files changed, 34 insertions(+), 2 deletions(-) diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt index 25c431d..b98fb43 100644 --- a/Documentation/git-fsck.txt +++ b/Documentation/git-fsck.txt @@ -10,7 +10,7 @@ SYNOPSIS [verse] 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs] -[--[no-]full] [--strict] [--verbose] [--lost-found] +[--[no-]full] [--quick] [--strict] [--verbose] [--lost-found] [--[no-]dangling] [--[no-]progress] [object*] DESCRIPTION @@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs object pools. This is now default; you can turn it off with --no-full. +--quick:: + Check only the connectivity of tags, commits and tree objects. By + avoiding to unpack blobs, this speeds up the operation, at the + expense of missing corrupt objects. + --strict:: Enable more strict checking, namely to catch a file mode recorded with g+w bit set, which was created by older diff --git a/builtin/fsck.c b/builtin/fsck.c index 6de9f3e..75fcb5f 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -23,6 +23,7 @@ static int show_tags; static int show_unreachable; static int include_reflogs = 1; static int check_full = 1; +static int quick; static int check_strict; static int keep_cache_objects; static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT; @@ -181,6 +182,8 @@ static void check_reachable_object(struct object *obj) if (!(obj-flags HAS_OBJ)) { if (has_sha1_pack(obj-sha1)) return; /* it is in pack - forget about it */ + if (quick has_sha1_file(obj-sha1)) + return; printf(missing %s %s\n, typename(obj-type), sha1_to_hex(obj-sha1)); errors_found |= ERROR_REACHABLE; return; @@ -623,6 +626,7 @@ static struct option fsck_opts[] = { OPT_BOOL(0, cache, keep_cache_objects, N_(make index objects head nodes)), OPT_BOOL(0, reflogs, include_reflogs, N_(make reflogs head nodes (default))), OPT_BOOL(0, full, check_full, N_(also consider packs and alternate objects)), + OPT_BOOL(0, quick, quick, N_(check only connectivity)), OPT_BOOL(0, strict, check_strict, N_(enable more strict checking)), OPT_BOOL(0, lost-found, write_lost_and_found, N_(write dangling objects in .git/lost-found)), @@ -659,7 +663,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) git_config(fsck_config, NULL); fsck_head_link(); - fsck_object_dir(get_object_directory()); + if (!quick) + fsck_object_dir(get_object_directory()); prepare_alt_odb(); for (alt = alt_odb_list; alt; alt = alt-next) { diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index 922c346..2863a8a 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -431,4 +431,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' ' test_must_fail git -C missing fsck ' +test_expect_success 'fsck --quick' ' + rm -rf quick + git init quick + ( + cd quick + touch empty + git add empty + test_commit empty + empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 + rm -f $empty + echo invalid $empty + test_must_fail git fsck --strict + git fsck --strict --quick + tree=$(git rev-parse HEAD:) + suffix=${tree#??} + tree=.git/objects/${tree%$suffix}/$suffix + rm -f $tree + echo invalid $tree + test_must_fail git fsck --strict --quick + ) +' + test_done -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 15/19] fsck: Document the new receive.fsck.msg-id options
Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- Documentation/config.txt | 14 ++ 1 file changed, 14 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index 3e37b93..306ab7a 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2205,6 +2205,20 @@ receive.fsckObjects:: Defaults to false. If not set, the value of `transfer.fsckObjects` is used instead. +receive.fsck.msg-id:: + When `receive.fsckObjects` is set to true, errors can be switched + to warnings and vice versa by configuring the `receive.fsck.msg-id` + setting where the `msg-id` is the fsck message ID and the value + is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes + the error/warning with the message ID, e.g. missingemail: invalid + author/committer line - missing email means that setting + `receive.fsck.missingemail = ignore` will hide that issue. ++ +This feature is intended to support working with legacy repositories +which would not pass pushing when `receive.fsckObjects = true`, allowing +the host to accept repositories with certain known issues but still catch +other issues. + receive.unpackLimit:: If the number of objects received in a push is below this limit then the objects will be unpacked into loose object -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 19/19] fsck: support ignoring objects in `git fsck` via fsck.skiplist
Identical to support in `git receive-pack for the config option `receive.fsck.skiplist`, we now support ignoring given objects in `git fsck` via `fsck.skiplist` altogether. This is extremely handy in case of legacy repositories where it would cause more pain to change incorrect objects than to live with them (e.g. a duplicate 'author' line in an early commit object). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- Documentation/config.txt | 7 +++ builtin/fsck.c | 10 ++ 2 files changed, 17 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index 5f45115..5aba63a 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1261,6 +1261,13 @@ that setting `fsck.missingemail = ignore` will hide that issue. This feature is intended to support working with legacy repositories which cannot be repaired without disruptive changes. +fsck.skipList:: + The path to a sorted list of object names (i.e. one SHA-1 per + line) that are known to be broken in a non-fatal way and should + be ignored. This feature is useful when an established project + should be accepted despite early commits containing errors that + can be safely ignored such as invalid committer email addresses. + gc.aggressiveDepth:: The depth parameter used in the delta compression algorithm used by 'git gc --aggressive'. This defaults diff --git a/builtin/fsck.c b/builtin/fsck.c index 75fcb5f..ce538ac 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -54,6 +54,16 @@ static int fsck_config(const char *var, const char *value, void *cb) return 0; } + if (strcmp(var, fsck.skiplist) == 0) { + const char *path = is_absolute_path(value) ? + value : git_path(%s, value); + struct strbuf sb = STRBUF_INIT; + strbuf_addf(sb, skiplist=%s, path); + fsck_set_msg_types(fsck_obj_options, sb.buf); + strbuf_release(sb); + return 0; + } + return git_default_config(var, value, cb); } -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 18/19] fsck: git receive-pack: support excluding objects from fsck'ing
The optional new config option `receive.fsck.skiplist` specifies the path to a file listing the names, i.e. SHA-1s, one per line, of objects that are to be ignored by `git receive-pack` when `receive.fsckObjects = true`. This is extremely handy in case of legacy repositories where it would cause more pain to change incorrect objects than to live with them (e.g. a duplicate 'author' line in an early commit object). The intended use case is for server administrators to inspect objects that are reported by `git push` as being too problematic to enter the repository, and to add the objects' SHA-1 to a (preferably sorted) file when the objects are legitimate, i.e. when it is determined that those problematic objects should be allowed to enter the server. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- Documentation/config.txt| 7 ++ builtin/receive-pack.c | 8 ++ fsck.c | 54 + fsck.h | 1 + t/t5504-fetch-receive-strict.sh | 12 + 5 files changed, 82 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index 41fd460..5f45115 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2230,6 +2230,13 @@ which would not pass pushing when `receive.fsckObjects = true`, allowing the host to accept repositories with certain known issues but still catch other issues. +receive.fsck.skipList:: + The path to a sorted list of object names (i.e. one SHA-1 per + line) that are known to be broken in a non-fatal way and should + be ignored. This feature is useful when an established project + should be accepted despite early commits containing errors that + can be safely ignored such as invalid committer email addresses. + receive.unpackLimit:: If the number of objects received in a push is below this limit then the objects will be unpacked into loose object diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 3afe8f8..80574f9 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -117,6 +117,14 @@ static int receive_pack_config(const char *var, const char *value, void *cb) return 0; } + if (strcmp(var, receive.fsck.skiplist) == 0) { + const char *path = is_absolute_path(value) ? + value : git_path(%s, value); + strbuf_addf(fsck_msg_types, %cskiplist=%s, + fsck_msg_types.len ? ',' : '=', path); + return 0; + } + if (skip_prefix(var, receive.fsck., var)) { if (is_valid_msg_type(var, value)) strbuf_addf(fsck_msg_types, %c%s=%s, diff --git a/fsck.c b/fsck.c index dedad01..f80b508 100644 --- a/fsck.c +++ b/fsck.c @@ -8,6 +8,7 @@ #include fsck.h #include refs.h #include utf8.h +#include sha1-array.h #define FSCK_FATAL -1 #define FSCK_INFO -2 @@ -122,6 +123,43 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, return msg_type; } +static void init_skiplist(struct fsck_options *options, const char *path) +{ + static struct sha1_array skiplist = SHA1_ARRAY_INIT; + int sorted, fd; + char buffer[41]; + unsigned char sha1[20]; + + if (options-skiplist) + sorted = options-skiplist-sorted; + else { + sorted = 1; + options-skiplist = skiplist; + } + + fd = open(path, O_RDONLY); + if (fd 0) + die(Could not open skip list: %s, path); + for (;;) { + int result = read_in_full(fd, buffer, sizeof(buffer)); + if (result 0) + die_errno(Could not read '%s', path); + if (!result) + break; + if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') + die(Invalid SHA-1: %s, buffer); + sha1_array_append(skiplist, sha1); + if (sorted skiplist.nr 1 + hashcmp(skiplist.sha1[skiplist.nr - 2], + sha1) 0) + sorted = 0; + } + close(fd); + + if (sorted) + skiplist.sorted = 1; +} + static inline int substrcmp(const char *string, int len, const char *match) { int match_len = strlen(match); @@ -193,6 +231,18 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) if (values[equal] == '=' || values[equal] == ':') break; + if (!substrcmp(values, equal, skiplist)) { + char *path = xstrndup(values + equal + 1, + len - equal - 1); + + if (equal == len) + die(skiplist requires a path); + init_skiplist(options,
Re: [msysGit] crlf issues
Hi Rusi, Cc:ing the main Git list, as some of this has been discussed at the GitMerge in Paris, and some participants of that discussion might want to join this thread. On 2015-06-19 14:55, Rusi Mody wrote: Trying to setup git. Majority folks are MS VS users but also some *nixers. My current understanding/analysis of the situation: 1. autocrlf is broken -- keep it false 2. safecrlf is a cure (to some extent) -- keep it true 3. core.eol can be left native (or is crlf on windows/lf on *nix preferred??) And of course most crucial: 4. Make sure ALL file types are covered in .gitattributes; eg *.txt text *.cs text : : *.pdf binary etc Still when we get a merge conflict the file (created by git with conflict markers) shows ^M in vi ie I assume its unclean line endings. Any tips what to do? The most important preliminary question is: how cross platform is your project? If there is no risk of any developer working on a non-Windows platform, you could simply commit CRLF and be done with it. As to autocrlf: I actually had some good experience with setting it to `input` when working with other developers. Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in
Re: Bug report: Unhandled Exception
Hi Gary, On 2015-06-19 16:09, Gary England wrote: Will do. It's unlikely that I'll be able to reproduce the error. But, in the event that it happens again, I'll let you know. For the record, the problem usually lies with a certain implementation detail on 32-bit only (if you know the `rebase.exe` tool -- confusingly completely unrelated to `git rebase`! -- you will know the root cause). That's why it is typically a sporadic problem only. Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 00/19] Introduce an internal API to interact with the fsck machinery
At the moment, the git-fsck's integrity checks are targeted toward the end user, i.e. the error messages are really just messages, intended for human consumption. Under certain circumstances, some of those errors should be allowed to be turned into mere warnings, though, because the cost of fixing the issues might well be larger than the cost of carrying those flawed objects. For example, when an already-public repository contains a commit object with two authors for years, it does not make sense to force the maintainer to rewrite the history, affecting all contributors negatively by forcing them to update. This branch introduces an internal fsck API to be able to turn some of the errors into warnings, and to make it easier to call the fsck machinery from elsewhere in general. I am proud to report that this work has been sponsored by GitHub. Changes since v5: - The BAD_* names have been renamed to INVALID_* names Interdiff below the diffstat. Johannes Schindelin (19): fsck: Introduce fsck options fsck: Introduce identifiers for fsck messages fsck: Provide a function to parse fsck message IDs fsck: Offer a function to demote fsck errors to warnings fsck (receive-pack): Allow demoting errors to warnings fsck: Report the ID of the error/warning fsck: Make fsck_ident() warn-friendly fsck: Make fsck_commit() warn-friendly fsck: Handle multiple authors in commits specially fsck: Make fsck_tag() warn-friendly fsck: Add a simple test for receive.fsck.msg-id fsck: Disallow demoting grave fsck errors to warnings fsck: Optionally ignore specific fsck issues completely fsck: Allow upgrading fsck warnings to errors fsck: Document the new receive.fsck.msg-id options fsck: Support demoting errors to warnings fsck: Introduce `git fsck --quick` fsck: git receive-pack: support excluding objects from fsck'ing fsck: support ignoring objects in `git fsck` via fsck.skiplist Documentation/config.txt| 39 +++ Documentation/git-fsck.txt | 7 +- builtin/fsck.c | 75 -- builtin/index-pack.c| 13 +- builtin/receive-pack.c | 25 +- builtin/unpack-objects.c| 16 +- fsck.c | 553 +++- fsck.h | 31 ++- t/t1450-fsck.sh | 37 ++- t/t5302-pack-index.sh | 2 +- t/t5504-fetch-receive-strict.sh | 51 11 files changed, 686 insertions(+), 163 deletions(-) diff --git a/fsck.c b/fsck.c index 9b8981e..f80b508 100644 --- a/fsck.c +++ b/fsck.c @@ -19,17 +19,17 @@ FUNC(UNTERMINATED_HEADER, FATAL) \ /* errors */ \ FUNC(BAD_DATE, ERROR) \ + FUNC(BAD_DATE_OVERFLOW, ERROR) \ FUNC(BAD_EMAIL, ERROR) \ FUNC(BAD_NAME, ERROR) \ + FUNC(BAD_OBJECT_SHA1, ERROR) \ FUNC(BAD_PARENT_SHA1, ERROR) \ + FUNC(BAD_TAG_OBJECT, ERROR) \ FUNC(BAD_TIMEZONE, ERROR) \ + FUNC(BAD_TREE, ERROR) \ FUNC(BAD_TREE_SHA1, ERROR) \ - FUNC(DATE_OVERFLOW, ERROR) \ + FUNC(BAD_TYPE, ERROR) \ FUNC(DUPLICATE_ENTRIES, ERROR) \ - FUNC(INVALID_OBJECT_SHA1, ERROR) \ - FUNC(INVALID_TAG_OBJECT, ERROR) \ - FUNC(INVALID_TREE, ERROR) \ - FUNC(INVALID_TYPE, ERROR) \ FUNC(MISSING_AUTHOR, ERROR) \ FUNC(MISSING_COMMITTER, ERROR) \ FUNC(MISSING_EMAIL, ERROR) \ @@ -46,8 +46,8 @@ FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(NOT_SORTED, ERROR) \ FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ + FUNC(TREE_NOT_SORTED, ERROR) \ FUNC(UNKNOWN_TYPE, ERROR) \ FUNC(ZERO_PADDED_DATE, ERROR) \ /* warnings */ \ @@ -60,7 +60,7 @@ FUNC(NULL_SHA1, WARN) \ FUNC(ZERO_PADDED_FILEMODE, WARN) \ /* infos (reported as warnings, but ignored by default) */ \ - FUNC(INVALID_TAG_NAME, INFO) \ + FUNC(BAD_TAG_NAME, INFO) \ FUNC(MISSING_TAGGER_ENTRY, INFO) #define MSG_ID(id, msg_type) FSCK_MSG_##id, @@ -525,7 +525,7 @@ static int fsck_tree(struct tree *item, struct fsck_options *options) if (has_dup_entries) retval += report(options, item-object, FSCK_MSG_DUPLICATE_ENTRIES, contains duplicate file entries); if (not_properly_sorted) - retval += report(options, item-object, FSCK_MSG_NOT_SORTED, not properly sorted); + retval += report(options, item-object, FSCK_MSG_TREE_NOT_SORTED, not properly sorted); return retval; } @@ -580,7 +580,7 @@ static int fsck_ident(const char **ident, struct object *obj, struct fsck_option if (*p == '0' p[1] != ' ') return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, invalid author/committer line - zero-padded date); if (date_overflows(strtoul(p, end, 10))) - return report(options, obj, FSCK_MSG_DATE_OVERFLOW, invalid author/committer line -
Re: [PATCH v2 05/12] delete_refs(): improve error message
On 06/15/2015 08:29 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: Subject: Re: [PATCH v2 05/12] delete_refs(): improve error message I'd call this make error message more generic. Change the error message from Could not remove branch %s to could not remove reference %s * s/branch/reference/. This change makes sense even for the existing caller, which uses the function to delete remote-tracking branches. and replace this bullet with something like: * Originally 'branch -d' was the only caller of this, but as part of the refs API, we will allow it to be called on refs that is not a branch or a remote-tracking branch. as calling a remote-tracking branch a 'branch' is not incorrect per-se. True, but branch refs/remotes/origin/foo is not how we would usually refer to that reference. Usually we would say reference refs/remotes/origin/foo or remote-tracking branch origin/foo. But I will make approximately the change that you suggest. What would count as true improvement is ... * Convert it to lower case, as per our usual convention. ... this one. If somebody eventually chooses to make the message finer grained by switching on the prefix refs/{what}, so that the user can differentiate between branches, remote-tracking branches, tags, etc., that would also count as improvement as well. I thought about proposing a function that changes a reference name into its prose equivalent, but I'm pretty sure that the result would not be internationalizable. Probably such a function would have to look like enum refname_kind { REFNAME_KIND_BRANCH, REFNAME_KIND_TAG, REFNAME_KIND_REMOTE, ...(?), REFNAME_KIND_OTHER}; enum refname_kind shorten_refname(const char *refname, const char **short_refname); and callers would still need logic to pick the correct (internationalized) formatting string based on the return value of the function. Someday... Michael -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 02/19] fsck: Introduce identifiers for fsck messages
Instead of specifying whether a message by the fsck machinery constitutes an error or a warning, let's specify an identifier relating to the concrete problem that was encountered. This is necessary for upcoming support to be able to demote certain errors to warnings. In the process, simplify the requirements on the calling code: instead of having to handle full-blown varargs in every callback, we now send a string buffer ready to be used by the callback. We could use a simple enum for the message IDs here, but we want to guarantee that the enum values are associated with the appropriate message types (i.e. error or warning?). Besides, we want to introduce a parser in the next commit that maps the string representation to the enum value, hence we use the slightly ugly preprocessor construct that is extensible for use with said parser. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- builtin/fsck.c | 26 +++- fsck.c | 201 + fsck.h | 5 +- 3 files changed, 154 insertions(+), 78 deletions(-) diff --git a/builtin/fsck.c b/builtin/fsck.c index 981dca5..fff38fe 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -46,33 +46,23 @@ static int show_dangling = 1; #define DIRENT_SORT_HINT(de) ((de)-d_ino) #endif -static void objreport(struct object *obj, const char *severity, - const char *err, va_list params) +static void objreport(struct object *obj, const char *msg_type, + const char *err) { - fprintf(stderr, %s in %s %s: , - severity, typename(obj-type), sha1_to_hex(obj-sha1)); - vfprintf(stderr, err, params); - fputs(\n, stderr); + fprintf(stderr, %s in %s %s: %s\n, + msg_type, typename(obj-type), sha1_to_hex(obj-sha1), err); } -__attribute__((format (printf, 2, 3))) -static int objerror(struct object *obj, const char *err, ...) +static int objerror(struct object *obj, const char *err) { - va_list params; - va_start(params, err); errors_found |= ERROR_OBJECT; - objreport(obj, error, err, params); - va_end(params); + objreport(obj, error, err); return -1; } -__attribute__((format (printf, 3, 4))) -static int fsck_error_func(struct object *obj, int type, const char *err, ...) +static int fsck_error_func(struct object *obj, int type, const char *message) { - va_list params; - va_start(params, err); - objreport(obj, (type == FSCK_WARN) ? warning : error, err, params); - va_end(params); + objreport(obj, (type == FSCK_WARN) ? warning : error, message); return (type == FSCK_WARN) ? 0 : 1; } diff --git a/fsck.c b/fsck.c index d83b811..ab24618 100644 --- a/fsck.c +++ b/fsck.c @@ -9,6 +9,98 @@ #include refs.h #include utf8.h +#define FOREACH_MSG_ID(FUNC) \ + /* errors */ \ + FUNC(BAD_DATE, ERROR) \ + FUNC(BAD_DATE_OVERFLOW, ERROR) \ + FUNC(BAD_EMAIL, ERROR) \ + FUNC(BAD_NAME, ERROR) \ + FUNC(BAD_OBJECT_SHA1, ERROR) \ + FUNC(BAD_PARENT_SHA1, ERROR) \ + FUNC(BAD_TAG_OBJECT, ERROR) \ + FUNC(BAD_TIMEZONE, ERROR) \ + FUNC(BAD_TREE, ERROR) \ + FUNC(BAD_TREE_SHA1, ERROR) \ + FUNC(BAD_TYPE, ERROR) \ + FUNC(DUPLICATE_ENTRIES, ERROR) \ + FUNC(MISSING_AUTHOR, ERROR) \ + FUNC(MISSING_COMMITTER, ERROR) \ + FUNC(MISSING_EMAIL, ERROR) \ + FUNC(MISSING_GRAFT, ERROR) \ + FUNC(MISSING_NAME_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_OBJECT, ERROR) \ + FUNC(MISSING_PARENT, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_DATE, ERROR) \ + FUNC(MISSING_SPACE_BEFORE_EMAIL, ERROR) \ + FUNC(MISSING_TAG, ERROR) \ + FUNC(MISSING_TAG_ENTRY, ERROR) \ + FUNC(MISSING_TAG_OBJECT, ERROR) \ + FUNC(MISSING_TREE, ERROR) \ + FUNC(MISSING_TYPE, ERROR) \ + FUNC(MISSING_TYPE_ENTRY, ERROR) \ + FUNC(NUL_IN_HEADER, ERROR) \ + FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ + FUNC(TREE_NOT_SORTED, ERROR) \ + FUNC(UNKNOWN_TYPE, ERROR) \ + FUNC(UNTERMINATED_HEADER, ERROR) \ + FUNC(ZERO_PADDED_DATE, ERROR) \ + /* warnings */ \ + FUNC(BAD_FILEMODE, WARN) \ + FUNC(BAD_TAG_NAME, WARN) \ + FUNC(EMPTY_NAME, WARN) \ + FUNC(FULL_PATHNAME, WARN) \ + FUNC(HAS_DOT, WARN) \ + FUNC(HAS_DOTDOT, WARN) \ + FUNC(HAS_DOTGIT, WARN) \ + FUNC(MISSING_TAGGER_ENTRY, WARN) \ + FUNC(NULL_SHA1, WARN) \ + FUNC(ZERO_PADDED_FILEMODE, WARN) + +#define MSG_ID(id, msg_type) FSCK_MSG_##id, +enum fsck_msg_id { + FOREACH_MSG_ID(MSG_ID) + FSCK_MSG_MAX +}; +#undef MSG_ID + +#define MSG_ID(id, msg_type) { FSCK_##msg_type }, +static struct { + int msg_type; +} msg_id_info[FSCK_MSG_MAX + 1] = { + FOREACH_MSG_ID(MSG_ID) + { -1 } +}; +#undef MSG_ID + +static int fsck_msg_type(enum fsck_msg_id msg_id, + struct fsck_options *options) +{ +
[PATCH v6 05/19] fsck (receive-pack): Allow demoting errors to warnings
For example, missing emails in commit and tag objects can be demoted to mere warnings with git config receive.fsck.missingemail=warn The value is actually a comma-separated list. In case that the same key is listed in multiple receive.fsck.msg-id lines in the config, the latter configuration wins (this can happen for example when both $HOME/.gitconfig and .git/config contain message type settings). As git receive-pack does not actually perform the checks, it hands off the setting to index-pack or unpack-objects in the form of an optional argument to the --strict option. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- builtin/index-pack.c | 4 builtin/receive-pack.c | 17 +++-- builtin/unpack-objects.c | 5 + fsck.c | 8 fsck.h | 1 + 5 files changed, 33 insertions(+), 2 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 87ae9ba..98e14fe 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -1633,6 +1633,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix) } else if (!strcmp(arg, --strict)) { strict = 1; do_fsck_object = 1; + } else if (skip_prefix(arg, --strict=, arg)) { + strict = 1; + do_fsck_object = 1; + fsck_set_msg_types(fsck_options, arg); } else if (!strcmp(arg, --check-self-contained-and-connected)) { strict = 1; check_self_contained_and_connected = 1; diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 94d0571..3afe8f8 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -19,6 +19,7 @@ #include tag.h #include gpg-interface.h #include sigchain.h +#include fsck.h static const char receive_pack_usage[] = git receive-pack git-dir; @@ -36,6 +37,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED; static enum deny_action deny_delete_current = DENY_UNCONFIGURED; static int receive_fsck_objects = -1; static int transfer_fsck_objects = -1; +static struct strbuf fsck_msg_types = STRBUF_INIT; static int receive_unpack_limit = -1; static int transfer_unpack_limit = -1; static int advertise_atomic_push = 1; @@ -115,6 +117,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb) return 0; } + if (skip_prefix(var, receive.fsck., var)) { + if (is_valid_msg_type(var, value)) + strbuf_addf(fsck_msg_types, %c%s=%s, + fsck_msg_types.len ? ',' : '=', var, value); + else + warning(Skipping unknown msg id '%s', var); + return 0; + } + if (strcmp(var, receive.fsckobjects) == 0) { receive_fsck_objects = git_config_bool(var, value); return 0; @@ -1490,7 +1501,8 @@ static const char *unpack(int err_fd, struct shallow_info *si) if (quiet) argv_array_push(child.args, -q); if (fsck_objects) - argv_array_push(child.args, --strict); + argv_array_pushf(child.args, --strict%s, + fsck_msg_types.buf); child.no_stdout = 1; child.err = err_fd; child.git_cmd = 1; @@ -1508,7 +1520,8 @@ static const char *unpack(int err_fd, struct shallow_info *si) argv_array_pushl(child.args, index-pack, --stdin, hdr_arg, keep_arg, NULL); if (fsck_objects) - argv_array_push(child.args, --strict); + argv_array_pushf(child.args, --strict%s, + fsck_msg_types.buf); if (fix_thin) argv_array_push(child.args, --fix-thin); child.out = -1; diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c index 6d17040..7cc086f 100644 --- a/builtin/unpack-objects.c +++ b/builtin/unpack-objects.c @@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix) strict = 1; continue; } + if (skip_prefix(arg, --strict=, arg)) { + strict = 1; + fsck_set_msg_types(fsck_options, arg); + continue; + } if (starts_with(arg, --pack_header=)) { struct pack_header *hdr; char *c; diff --git a/fsck.c b/fsck.c index 8c3caff..8e6faa8 100644 --- a/fsck.c
[PATCH v6 06/19] fsck: Report the ID of the error/warning
Some legacy code has objects with non-fatal fsck issues; To enable the user to ignore those issues, let's print out the ID (e.g. when encountering missingemail, the user might want to call `git config --add receive.fsck.missingemail=warn`). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 16 t/t1450-fsck.sh | 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 8e6faa8..0b3e18f 100644 --- a/fsck.c +++ b/fsck.c @@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) } } +static void append_msg_id(struct strbuf *sb, const char *msg_id) +{ + for (;;) { + char c = *(msg_id)++; + + if (!c) + break; + if (c != '_') + strbuf_addch(sb, tolower(c)); + } + + strbuf_addstr(sb, : ); +} + __attribute__((format (printf, 4, 5))) static int report(struct fsck_options *options, struct object *object, enum fsck_msg_id id, const char *fmt, ...) @@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object, struct strbuf sb = STRBUF_INIT; int msg_type = fsck_msg_type(id, options), result; + append_msg_id(sb, msg_id_info[id].id_string); + va_start(ap, fmt); strbuf_vaddf(sb, fmt, ap); result = options-error_func(object, msg_type, sb.buf); diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index cfb32b6..d6d3b13 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name missing tagger' ' git fsck --tags 2out cat expect -EOF - warning in tag $tag: invalid '\''tag'\'' name: wrong name format - warning in tag $tag: invalid format - expected '\''tagger'\'' line + warning in tag $tag: badtagname: invalid '\''tag'\'' name: wrong name format + warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line EOF test_cmp expect out ' -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly
When fsck_ident() identifies a problem with the ident, it should still advance the pointer to the next line so that fsck can continue in the case of a mere warning. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 49 +++-- 1 file changed, 27 insertions(+), 22 deletions(-) diff --git a/fsck.c b/fsck.c index 0b3e18f..9faaf53 100644 --- a/fsck.c +++ b/fsck.c @@ -479,40 +479,45 @@ static int require_end_of_header(const void *data, unsigned long size, static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options) { + const char *p = *ident; char *end; - if (**ident == '') + *ident = strchrnul(*ident, '\n'); + if (**ident == '\n') + (*ident)++; + + if (*p == '') return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, invalid author/committer line - missing space before email); - *ident += strcspn(*ident, \n); - if (**ident == '') + p += strcspn(p, \n); + if (*p == '') return report(options, obj, FSCK_MSG_BAD_NAME, invalid author/committer line - bad name); - if (**ident != '') + if (*p != '') return report(options, obj, FSCK_MSG_MISSING_EMAIL, invalid author/committer line - missing email); - if ((*ident)[-1] != ' ') + if (p[-1] != ' ') return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, invalid author/committer line - missing space before email); - (*ident)++; - *ident += strcspn(*ident, \n); - if (**ident != '') + p++; + p += strcspn(p, \n); + if (*p != '') return report(options, obj, FSCK_MSG_BAD_EMAIL, invalid author/committer line - bad email); - (*ident)++; - if (**ident != ' ') + p++; + if (*p != ' ') return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, invalid author/committer line - missing space before date); - (*ident)++; - if (**ident == '0' (*ident)[1] != ' ') + p++; + if (*p == '0' p[1] != ' ') return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, invalid author/committer line - zero-padded date); - if (date_overflows(strtoul(*ident, end, 10))) + if (date_overflows(strtoul(p, end, 10))) return report(options, obj, FSCK_MSG_BAD_DATE_OVERFLOW, invalid author/committer line - date causes integer overflow); - if (end == *ident || *end != ' ') + if ((end == p || *end != ' ')) return report(options, obj, FSCK_MSG_BAD_DATE, invalid author/committer line - bad date); - *ident = end + 1; - if ((**ident != '+' **ident != '-') || - !isdigit((*ident)[1]) || - !isdigit((*ident)[2]) || - !isdigit((*ident)[3]) || - !isdigit((*ident)[4]) || - ((*ident)[5] != '\n')) + p = end + 1; + if ((*p != '+' *p != '-') || + !isdigit(p[1]) || + !isdigit(p[2]) || + !isdigit(p[3]) || + !isdigit(p[4]) || + (p[5] != '\n')) return report(options, obj, FSCK_MSG_BAD_TIMEZONE, invalid author/committer line - bad time zone); - (*ident) += 6; + p += 6; return 0; } -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 08/19] fsck: Make fsck_commit() warn-friendly
When fsck_commit() identifies a problem with the commit, it should try to make it possible to continue checking the commit object, in case the user wants to demote the detected errors to mere warnings. Note that some problems are too problematic to simply ignore. For example, when the header lines are mixed up, we punt after encountering an incorrect line. Therefore, demoting certain warnings to errors can hide other problems. Example: demoting the missingauthor error to a warning would hide a problematic committer line. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 28 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index 9faaf53..9fe9f48 100644 --- a/fsck.c +++ b/fsck.c @@ -534,12 +534,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, if (!skip_prefix(buffer, tree , buffer)) return report(options, commit-object, FSCK_MSG_MISSING_TREE, invalid format - expected 'tree' line); - if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') - return report(options, commit-object, FSCK_MSG_BAD_TREE_SHA1, invalid 'tree' line format - bad sha1); + if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') { + err = report(options, commit-object, FSCK_MSG_BAD_TREE_SHA1, invalid 'tree' line format - bad sha1); + if (err) + return err; + } buffer += 41; while (skip_prefix(buffer, parent , buffer)) { - if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') - return report(options, commit-object, FSCK_MSG_BAD_PARENT_SHA1, invalid 'parent' line format - bad sha1); + if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') { + err = report(options, commit-object, FSCK_MSG_BAD_PARENT_SHA1, invalid 'parent' line format - bad sha1); + if (err) + return err; + } buffer += 41; parent_line_count++; } @@ -548,11 +554,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer, if (graft) { if (graft-nr_parent == -1 !parent_count) ; /* shallow commit */ - else if (graft-nr_parent != parent_count) - return report(options, commit-object, FSCK_MSG_MISSING_GRAFT, graft objects missing); + else if (graft-nr_parent != parent_count) { + err = report(options, commit-object, FSCK_MSG_MISSING_GRAFT, graft objects missing); + if (err) + return err; + } } else { - if (parent_count != parent_line_count) - return report(options, commit-object, FSCK_MSG_MISSING_PARENT, parent objects missing); + if (parent_count != parent_line_count) { + err = report(options, commit-object, FSCK_MSG_MISSING_PARENT, parent objects missing); + if (err) + return err; + } } if (!skip_prefix(buffer, author , buffer)) return report(options, commit-object, FSCK_MSG_MISSING_AUTHOR, invalid format - expected 'author' line); -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 11/19] fsck: Add a simple test for receive.fsck.msg-id
Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- t/t5504-fetch-receive-strict.sh | 21 + 1 file changed, 21 insertions(+) diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh index 69ee13c..3f7e96a 100755 --- a/t/t5504-fetch-receive-strict.sh +++ b/t/t5504-fetch-receive-strict.sh @@ -115,4 +115,25 @@ test_expect_success 'push with transfer.fsckobjects' ' test_cmp exp act ' +cat bogus-commit \EOF +tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904 +author Bugs Bunny 1234567890 + +committer Bugs Bunny b...@bun.ni 1234567890 + + +This commit object intentionally broken +EOF + +test_expect_success 'push with receive.fsck.missingemail=warn' ' + commit=$(git hash-object -t commit -w --stdin bogus-commit) + git push . $commit:refs/heads/bogus + rm -rf dst + git init dst + git --git-dir=dst/.git config receive.fsckobjects true + test_must_fail git push --porcelain dst bogus + git --git-dir=dst/.git config \ + receive.fsck.missingemail warn + git push --porcelain dst bogus act 21 + grep missingemail act +' + test_done -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 12/19] fsck: Disallow demoting grave fsck errors to warnings
Some kinds of errors are intrinsically unrecoverable (e.g. errors while uncompressing objects). It does not make sense to allow demoting them to mere warnings. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 14 -- t/t5504-fetch-receive-strict.sh | 11 +++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 21e3052..a4fbce3 100644 --- a/fsck.c +++ b/fsck.c @@ -9,7 +9,12 @@ #include refs.h #include utf8.h +#define FSCK_FATAL -1 + #define FOREACH_MSG_ID(FUNC) \ + /* fatal errors */ \ + FUNC(NUL_IN_HEADER, FATAL) \ + FUNC(UNTERMINATED_HEADER, FATAL) \ /* errors */ \ FUNC(BAD_DATE, ERROR) \ FUNC(BAD_DATE_OVERFLOW, ERROR) \ @@ -39,11 +44,9 @@ FUNC(MISSING_TYPE, ERROR) \ FUNC(MISSING_TYPE_ENTRY, ERROR) \ FUNC(MULTIPLE_AUTHORS, ERROR) \ - FUNC(NUL_IN_HEADER, ERROR) \ FUNC(TAG_OBJECT_NOT_TAG, ERROR) \ FUNC(TREE_NOT_SORTED, ERROR) \ FUNC(UNKNOWN_TYPE, ERROR) \ - FUNC(UNTERMINATED_HEADER, ERROR) \ FUNC(ZERO_PADDED_DATE, ERROR) \ /* warnings */ \ FUNC(BAD_FILEMODE, WARN) \ @@ -157,6 +160,10 @@ void fsck_set_msg_type(struct fsck_options *options, die(Unhandled message id: %.*s, msg_id_len, msg_id); type = parse_msg_type(msg_type, msg_type_len); + if (type != FSCK_ERROR msg_id_info[id].msg_type == FSCK_FATAL) + die(Cannot demote %.*s to %.*s, msg_id_len, msg_id, + msg_type_len, msg_type); + if (!options-msg_type) { int i; int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX); @@ -213,6 +220,9 @@ static int report(struct fsck_options *options, struct object *object, struct strbuf sb = STRBUF_INIT; int msg_type = fsck_msg_type(id, options), result; + if (msg_type == FSCK_FATAL) + msg_type = FSCK_ERROR; + append_msg_id(sb, msg_id_info[id].id_string); va_start(ap, fmt); diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh index 3f7e96a..0d64229 100755 --- a/t/t5504-fetch-receive-strict.sh +++ b/t/t5504-fetch-receive-strict.sh @@ -136,4 +136,15 @@ test_expect_success 'push with receive.fsck.missingemail=warn' ' grep missingemail act ' +test_expect_success \ + 'receive.fsck.unterminatedheader=warn triggers error' ' + rm -rf dst + git init dst + git --git-dir=dst/.git config receive.fsckobjects true + git --git-dir=dst/.git config \ + receive.fsck.unterminatedheader warn + test_must_fail git push --porcelain dst HEAD act 21 + grep Cannot demote unterminatedheader act +' + test_done -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 14/19] fsck: Allow upgrading fsck warnings to errors
The 'invalid tag name' and 'missing tagger entry' warnings can now be upgraded to errors by specifying `invalidtagname` and `missingtaggerentry` in the receive.fsck.msg-id config setting. Incidentally, the missing tagger warning is now really shown as a warning (as opposed to being reported with the error: prefix, as it used to be the case before this commit). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c| 24 +--- t/t5302-pack-index.sh | 2 +- 2 files changed, 18 insertions(+), 8 deletions(-) diff --git a/fsck.c b/fsck.c index b728646..dedad01 100644 --- a/fsck.c +++ b/fsck.c @@ -10,6 +10,7 @@ #include utf8.h #define FSCK_FATAL -1 +#define FSCK_INFO -2 #define FOREACH_MSG_ID(FUNC) \ /* fatal errors */ \ @@ -50,15 +51,16 @@ FUNC(ZERO_PADDED_DATE, ERROR) \ /* warnings */ \ FUNC(BAD_FILEMODE, WARN) \ - FUNC(BAD_TAG_NAME, WARN) \ FUNC(EMPTY_NAME, WARN) \ FUNC(FULL_PATHNAME, WARN) \ FUNC(HAS_DOT, WARN) \ FUNC(HAS_DOTDOT, WARN) \ FUNC(HAS_DOTGIT, WARN) \ - FUNC(MISSING_TAGGER_ENTRY, WARN) \ FUNC(NULL_SHA1, WARN) \ - FUNC(ZERO_PADDED_FILEMODE, WARN) + FUNC(ZERO_PADDED_FILEMODE, WARN) \ + /* infos (reported as warnings, but ignored by default) */ \ + FUNC(BAD_TAG_NAME, INFO) \ + FUNC(MISSING_TAGGER_ENTRY, INFO) #define MSG_ID(id, msg_type) FSCK_MSG_##id, enum fsck_msg_id { @@ -227,6 +229,8 @@ static int report(struct fsck_options *options, struct object *object, if (msg_type == FSCK_FATAL) msg_type = FSCK_ERROR; + else if (msg_type == FSCK_INFO) + msg_type = FSCK_WARN; append_msg_id(sb, msg_id_info[id].id_string); @@ -685,15 +689,21 @@ static int fsck_tag_buffer(struct tag *tag, const char *data, goto done; } strbuf_addf(sb, refs/tags/%.*s, (int)(eol - buffer), buffer); - if (check_refname_format(sb.buf, 0)) - report(options, tag-object, FSCK_MSG_BAD_TAG_NAME, + if (check_refname_format(sb.buf, 0)) { + ret = report(options, tag-object, FSCK_MSG_BAD_TAG_NAME, invalid 'tag' name: %.*s, (int)(eol - buffer), buffer); + if (ret) + goto done; + } buffer = eol + 1; - if (!skip_prefix(buffer, tagger , buffer)) + if (!skip_prefix(buffer, tagger , buffer)) { /* early tags do not contain 'tagger' lines; warn only */ - report(options, tag-object, FSCK_MSG_MISSING_TAGGER_ENTRY, invalid format - expected 'tagger' line); + ret = report(options, tag-object, FSCK_MSG_MISSING_TAGGER_ENTRY, invalid format - expected 'tagger' line); + if (ret) + goto done; + } else ret = fsck_ident(buffer, tag-object, options); diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh index 61bc8da..3dc5ec4 100755 --- a/t/t5302-pack-index.sh +++ b/t/t5302-pack-index.sh @@ -259,7 +259,7 @@ EOF thirtyeight=${tag#??} rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight git index-pack --strict tag-test-${pack1}.pack 2err -grep ^error:.* expected .tagger. line err +grep ^warning:.* expected .tagger. line err ' test_done -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v6 16/19] fsck: Support demoting errors to warnings
We already have support in `git receive-pack` to deal with some legacy repositories which have non-fatal issues. Let's make `git fsck` itself useful with such repositories, too, by allowing users to ignore known issues, or at least demote those issues to mere warnings. Example: `git -c fsck.missingemail=ignore fsck` would hide problems with missing emails in author, committer and tagger lines. In the same spirit that `git receive-pack`'s usage of the fsck machinery differs from `git fsck`'s – some of the non-fatal warnings in `git fsck` are fatal with `git receive-pack` when receive.fsckObjects = true, for example – we strictly separate the fsck.msg-id from the receive.fsck.msg-id settings. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- Documentation/config.txt | 11 +++ builtin/fsck.c | 12 t/t1450-fsck.sh | 11 +++ 3 files changed, 34 insertions(+) diff --git a/Documentation/config.txt b/Documentation/config.txt index 306ab7a..41fd460 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1250,6 +1250,17 @@ filter.driver.smudge:: object to a worktree file upon checkout. See linkgit:gitattributes[5] for details. +fsck.msg-id:: + Allows overriding the message type (error, warn or ignore) of a + specific message ID such as `missingemail`. ++ +For convenience, fsck prefixes the error/warning with the message ID, +e.g. missingemail: invalid author/committer line - missing email means +that setting `fsck.missingemail = ignore` will hide that issue. ++ +This feature is intended to support working with legacy repositories +which cannot be repaired without disruptive changes. + gc.aggressiveDepth:: The depth parameter used in the delta compression algorithm used by 'git gc --aggressive'. This defaults diff --git a/builtin/fsck.c b/builtin/fsck.c index fff38fe..6de9f3e 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -46,6 +46,16 @@ static int show_dangling = 1; #define DIRENT_SORT_HINT(de) ((de)-d_ino) #endif +static int fsck_config(const char *var, const char *value, void *cb) +{ + if (skip_prefix(var, fsck., var)) { + fsck_set_msg_type(fsck_obj_options, var, -1, value, -1); + return 0; + } + + return git_default_config(var, value, cb); +} + static void objreport(struct object *obj, const char *msg_type, const char *err) { @@ -646,6 +656,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) include_reflogs = 0; } + git_config(fsck_config, NULL); + fsck_head_link(); fsck_object_dir(get_object_directory()); diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index d6d3b13..922c346 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -287,6 +287,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' ' grep -q error: sha1 mismatch 63ff out ' +test_expect_success 'force fsck to ignore double author' ' + git cat-file commit HEAD basis + sed s/^author .*/,/ basis | tr , \\n multiple-authors + new=$(git hash-object -t commit -w --stdin multiple-authors) + test_when_finished remove_object $new + git update-ref refs/heads/bogus $new + test_when_finished git update-ref -d refs/heads/bogus + test_must_fail git fsck + git -c fsck.multipleauthors=ignore fsck +' + _bz='\0' _bz5=$_bz$_bz$_bz$_bz$_bz _bz20=$_bz5$_bz5$_bz5$_bz5 -- 2.3.1.windows.1.9.g8c01ab4 -- To unsubscribe from this list: send the line unsubscribe git in
Re: Bug report: Unhandled Exception
Hi Johannes, Will do. It's unlikely that I'll be able to reproduce the error. But, in the event that it happens again, I'll let you know. Thanks, Gary From: Johannes Schindelin johannes.schinde...@gmx.de Sent: Friday, June 19, 2015 6:58 AM To: Gary England Cc: git@vger.kernel.org Subject: Re: Bug report: Unhandled Exception Hi Gary, On 2015-06-11 21:21, Gary England wrote: Using git version 1.9.2-preview20140411, in Git Bash for Windows, performing a git pull --rebase, received an unhandled exception. Please note that the newest 1.9.x release is 1.9.5. Could you re-test with that version, please? Or maybe you want to give the 2nd release candidate of Git for Windows 2.x a try: https://git-for-windows.github.io/#download Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] strbuf: stop out-of-boundary warnings from Coverity
On Fri, Jun 19, 2015 at 5:50 PM, Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr wrote: Duy Nguyen pclo...@gmail.com writes: + sb.buf[0] = 'Z'; + printf(%c\n, strbuf_slopbuf[0]); + return 0; startup_info = git_startup_info; I might be wrong, but I definitely think that this printf and return 0 are some debug lines that you forgot to remove. Yes it's debug code. I hoped that if I made a mistake forcing the segfault, people would notice. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH 3/3] Add filter-objects command
On Fri, Jun 19, 2015 at 11:33:24AM +0100, Charles Bailey wrote: Obviously I've glossed over the how to get a list of objects part. If you truly want all objects (not just reachable ones), or if rev-list --objects is too slow [...] So, yes, performance is definitely an issue and I could have called this command git magically-generate-all-object-for-scripts but then, as it was so easy to provide exactly the filtering that I was looking for in the C code, I thought I would do that as well and then filter-objects (filter-all-objects?) seemed like a better name. Right, my point was only that it works for _your_ particular filter, but it would be nice to have something more general. And we already have cat-file --batch-check. IOW, I think I would prefer the magical form because it's a better scripting building block. As you note, filter-objects without any filters is exactly that. Your 10 extra lines of C code are not exactly bloat, but I just wonder if other people will find it all that useful. It's about an order of magnitude faster on the systems I've checked to do a parameterless filter-objects then rev-list --all --objects, although I understand they do different things. Right, it's the object-opening and hash lookups that kill you in rev-list, because it's actually walking the graph. I am also thinking about another piece that answers the question: which commits introduce any of (or the first of) this list of objects?. This can be done by parseing a diff --raw for commits but I think it should be possible to do this faster, too. If you care about introduce, I think you have to traverse and do the diffs. If you only care about contains (for example, because you want to know which path the blob is found at), you can find trees which mention it, then trees which mention that tree, and so on. I think that ends up slower in practice, though. I have patches that implement a rev-list --find=$sha1, which sets a bit on $sha1 and then traverses with --objects until we find it (or them; you can specify multiple). It's pretty straightforward, but it does cost as much as git rev-list --objects in the worst case. Let me know if you're interested and I can clean it up and post it. -Peff -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH 3/3] Add filter-objects command
On Fri, Jun 19, 2015 at 11:33:24AM +0100, Charles Bailey wrote: So, yes, performance is definitely an issue and I could have called this command git magically-generate-all-object-for-scripts but then, as it was so easy to provide exactly the filtering that I was looking for in the C code, I thought I would do that as well and then filter-objects (filter-all-objects?) seemed like a better name. By analogy with git filter-branch, I don't think filter-objects is a good name here. My preference would be ls-objects. -- To unsubscribe from this list: send the line unsubscribe git in
Re: Using clean/smudge filters with difftool
On Fri, Jun 19, 2015 at 10:57:55AM +0200, Michael J Gruber wrote: Junio C Hamano venit, vidit, dixit 19.06.2015 00:55: John Keeping j...@keeping.me.uk writes: I think the summary is that there are some scenarios where the external diff tool should see the smudged version and others where the clean version is more appropriate and Git should support both options. It seems this is a property of the filter, so I wonder if the best solution is a new filter.name.extdiff = [clean|smudge] configuration variable (there's probably a better name for the variable than extdiff). Not just the external diff, but the textconv filter obeys the same rule. The setting should be done the same way for both, if we are going to go in that direction. textconv is a one-way filter from blob to readable blob. External diffs may prefer to work on blob rather than readable blob, but the currect setup does not seem to produce surprises. clean and smudge are two-way filters: clean from worktree blob (aka file) to repo blob, smudge the other way round. Typically, the user perceives these as inverse to each other. But we only require clean to be a left-inverse of smudge, i.e. (cat-file then) smudge then clean should give the same repo blob (as cat-file). We don't require that the other way round, i.e. we don't require smudge to be a left-inverse of clean, and in most setups (like the current one) it is not: smudge does not recreate what clean has cleaned out. It is a no-op (the identity, while clean is a projection). Now, since external diff runs on smudged blobs, it appears as if we mixed cleaned and smudged blobs when feeding external diffs; whereas really, we mix worktree blobs and smudged repo blobs, which is okay as per our definition of clean/smudge: the difference is irrelevant by definition. I agree with this. But I was wrong that should diff clean/should diff smudged is a property of the filter. I can also imagine a situation where a more intelligent external diff tool wants to see the smudged version where a naïve tool would want the clean version. For example, some of the big file stores (e.g. git-lfs [1]) use clean/smudge filters and I can imagine a diff utility that avoids needing to fetch the data for large files and instead shows the diff on the server when both blobs are available there. In that case we generally want to use the smudged copy for external diff, so the filter would use that setting, but the diff utility knows better and would want to override that. [1] https://github.com/github/git-lfs -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Add filter-objects command
On Fri, Jun 19, 2015 at 06:10:10AM -0400, Jeff King wrote: On Fri, Jun 19, 2015 at 10:10:59AM +0100, Charles Bailey wrote: filter-objects is a command to scan all objects in the object database for the repository and print the ids of those which match the given criteria. The current supported criteria are object type and the minimum size of the object. The guiding use case is to scan repositories quickly for large objects which may cause performance issues for users. The list of objects can then be used to guide some future remediating action. I've had to perform this exact same task. You can already do the filtering part pretty easily and efficiently with cat-file and a perl script, like: magically_generate_all_objects | git cat-file --batch-check='%(objectsize) %(objectname)' | perl -alne 'print $F[1] if $F[0] 1234' That's not as friendly as your filter-objects, but it's a lot more flexible (since you can ask cat-file for all sorts of information). Obviously I've glossed over the how to get a list of objects part. If you truly want all objects (not just reachable ones), or if rev-list --objects is too slow [...] So, yes, performance is definitely an issue and I could have called this command git magically-generate-all-object-for-scripts but then, as it was so easy to provide exactly the filtering that I was looking for in the C code, I thought I would do that as well and then filter-objects (filter-all-objects?) seemed like a better name. It's about an order of magnitude faster on the systems I've checked to do a parameterless filter-objects then rev-list --all --objects, although I understand they do different things. I am also thinking about another piece that answers the question: which commits introduce any of (or the first of) this list of objects?. This can be done by parseing a diff --raw for commits but I think it should be possible to do this faster, too. Charles. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/3] Move unsigned long option parsing out of pack-objects.c
Charles Bailey cbaile...@bloomberg.net writes: test_expect_success 'long options' ' - test-parse-options --boolean --integer 1729 --boolean --string2=321 \ - --verbose --verbose --no-dry-run --abbrev=10 --file fi.le\ - --obsolete output 2 output.err + test-parse-options --boolean --integer 1729 --unsigned-long 16k \ + --boolean --string2=321 --verbose --verbose --no-dry-run \ + --abbrev=10 --file fi.le --obsolete \ + output 2 output.err test_must_be_empty output.err test_cmp expect output ' It's trivial matter but the line: + output 2 output.err should be written: + output 2output.err It was incorrectly written before but since you are modifying the line, it might be a good thing to change it now. Rémi -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH 2/3] Move unsigned long option parsing out of pack-objects.c
On Fri, Jun 19, 2015 at 01:03:25PM +0200, Remi Galan Alfonso wrote: It's trivial matter but the line: + output 2 output.err should be written: + output 2output.err It was incorrectly written before but since you are modifying the line, it might be a good thing to change it now. Yes, I can fold this in. I just changed the wrapping and didn't spot this style error. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism
On Fri, Jun 19, 2015 at 4:43 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: diff --git a/builtin/am.c b/builtin/am.c index dbc8836..af68c51 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -6,6 +6,158 @@ #include cache.h #include builtin.h #include exec_cmd.h +#include parse-options.h +#include dir.h + +struct am_state { + /* state directory path */ + struct strbuf dir; Is this a temporary variable you will append /patch, etc. to form a different string to use for fopen() etc., or do you use separate strbufs for things like that and this is only used to initialize them? - If the former then dir is a misnomer. - If the latter, then it probably does not have to be a strbuf; rather, it should probably be a const char *. Unless you pass this directly to functions that take a strbuf, such as remove_dir_recursively(), that is. It's the latter, and yes it does not need to be an strbuf since it will not usually change. However, I don't think it should be a const char*, as it means that the API user has to keep track of the lifetime of the dir string. Note that we will have to git_pathdup() as git_path() returns a static buffer: char *dir = git_pathdup(rebase-apply); struct am_state state; am_state_init(state); state.dir = dir; ...stuff... if (foo) { ... separate code path ... am_state_release(state); free(dir); return 0; } ... separate code path ... am_state_release(state); free(dir); return 0; Note the additional memory bookkeeping. Instead, I would rather the am_state struct take ownership of the dir string and free it in am_state_release(). So dir will be a char*: struct am_state state; am_state_init(state, git_path(rebase-apply)); ... stuff ... if (foo) { ... separate code path ... am_state_release(state); return 0; } ... separate code path ... am_state_release(state); return 0; +/** + * Release memory allocated by an am_state. + */ Everybody else in this file seems to say Initializes, Returns, Reads, etc. While I personally prefer to use imperative (i.e. give command to this function to release memory allocated), you would want to be consistent throughout the file; Releases memory would make it so. OK +/** + * Setup a new am session for applying patches + */ +static void am_setup(struct am_state *state) +{ + if (mkdir(state-dir.buf, 0777) 0 errno != EEXIST) + die_errno(_(failed to create directory '%s'), state-dir.buf); + + write_file(am_path(state, next), 1, %d, state-cur); + + write_file(am_path(state, last), 1, %d, state-last); These two lines are closely related pair; drop the blank in between. I am tno sure if write_file() is an appropriate thing to use, though. What happens when you get interrupted after opening the file but before you manage to write and close? Shouldn't we be doing the usual write to temp, close and then rename to final dance? This comment applies to all the other use of write_file(). We could, although I don't think it would help us much. The state directory is made up of many different files, so even if we made modifications to single files atomic, there's no point if we terminate unexpectedly in the middle of writing multiple files to the state directory as the state, as a whole, is corrupted anyway. So, we must be able to make updates to the entire state directory as a single transaction, which is a more difficult problem... Furthermore, while applying patches, we may face abnormal termination between e.g. the patch application and commit, so I think it is safe to say that if abnormal termination occurs, we will not be able to reliably --resume or --skip anyway, and the user should just run --abort to go back to the last known state. However, it would be an improvement if we wrote the next and last files last, as we use their presence to determine if we have an am session in progress or not, so if we terminate in am_setup() we will just have a stray directory. I will make this change in the next iteration. Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks
On Fri, Jun 19, 2015 at 03:34:55AM -0400, Jeff King wrote: And here's some more bad news. If you look at the diff for this patch itself, it's terribly unreadable (the regular diff already is pretty bad, but the highlights make it much worse). There are big chunks where we take away 5 or 10 lines from the old code, and replace them with totally unrelated lines. We end up highlighting almost the entire thing, except for spaces and punctuation. We might be able to solve this with a percentage heuristic similar to the one Patrick proposed. It's not really interesting to highlight unless we're doing it on probably 20% or less of the diff (where 20% is a number I just made up). That turned out to be pretty easy; patch is below (on top of what I sent earlier). I set the percentage at 50% based on eyeballing git log -p in git.git, and it seems to give good results. So I think the big remaining issue is improved tokenizing. Maybe Patrick will want to take a stab at it. --- diff --git a/contrib/diff-highlight/diff-highlight b/contrib/diff-highlight/diff-highlight index 1525ccc..9454446 100755 --- a/contrib/diff-highlight/diff-highlight +++ b/contrib/diff-highlight/diff-highlight @@ -114,12 +114,32 @@ sub show_hunk { if $bits 2; } + my $highlighted = count_highlight(@highlight_a) + + count_highlight(@highlight_b); + my $total = length($a) + length($b); + my $pct = $highlighted / $total; + + if ($pct 0.5) { + @highlight_a = (); + @highlight_b = (); + } + # And now show the output both with the original stripped annotations, # as well as our new highlights. show_image($a, [merge_annotations(\@stripped_a, \@highlight_a)]); show_image($b, [merge_annotations(\@stripped_b, \@highlight_b)]); } +sub count_highlight { + my $total = 0; + while (@_) { + my $from = shift; + my $to = shift; + $total += $to-[0] - $from-[0]; + } + return $total; +} + # Strip out any diff syntax (i.e., leading +/-), along with any ANSI color # codes from the pre- or post-image of a hunk. The result is a string of text # suitable for diffing against the other side of the hunk. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH/WIP v3 10/31] am: refresh the index at start
On Fri, Jun 19, 2015 at 5:28 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: If a file is unchanged but stat-dirty, git-apply may erroneously fail to apply patches, thinking that they conflict with a dirty working tree. As such, since 2a6f08a (am: refresh the index at start and --resolved, 2011-08-15), git-am will refresh the index before applying patches. Re-implement this behavior. Good. I would actually have expected to see this as part of 08/31, though. Ah right, makes sense since this patch is primarily about git-apply. I've squashed this patch into 08/31. Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git svn die signal 6
I even found Pull Request on GitHub with useful changes, but looks like there is too hard way to commit changes and author gave up. https://github.com/git/git/pull/62 Probably you can take a short look into it and include this changes in next release. 2015-06-15 16:54 GMT+03:00 Олег Кохтенко kohte...@gmail.com: Hi great git crew member :) I found small, but annoying bug in 'git svn'. If you would try to make any commit with new files (or delete/modify files) which contain in name @ sign - you'll see crash with minimum information about what's just happened. Such file names are very useful for iOS projects when you can add same pictures with different dimensions with suffix @2x or @3x. In guts, svn uses @ sign for pin version of file and it's probably confused if this sign presented in file name. For committing such changes thru svn you would use svn add 'im...@3x.png'@. This will prevent such problem. I hope you can easily find out how to reproduce and fix this issue. Please, feel free to contact me for any details about this. Best regards -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
On Fri, Jun 19, 2015 at 5:10 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: + /* commit message and metadata */ + struct strbuf author_name; + struct strbuf author_email; + struct strbuf author_date; + struct strbuf msg; Same comment as dir in the earlier patch applies to these. If the fields are read or computed and then kept constant, use a temporary variable that is a strbuf to read/compute the final value, and then detach to a const char * field. If they are constantly changing and in-place updates are vital, then they can and should be strbufs, but I do not think that is the case for these. I do think it is the case here. The commit message and metadata fields change for every patch we process, and we could be processing a large volume of patches, so we must be careful to not leak any memory. We could use a char* field, but then to prevent memory leaks we would then have to free() it and malloc() a new string for every patch we process, which will lead to lots of malloc()s and free()s over a large volume of patches that can lead to memory fragmentation. Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Visualizing merge conflicts after the fact (using kdiff3)
Junio C Hamano venit, vidit, dixit 18.06.2015 17:57: Michael J Gruber g...@drmicha.warpmail.net writes: This type of request comes up often (for a reason). I'm wondering whether we could support it more systematically, either by exposing the steps above as a command, or by storing the unresolved merge somewhere (leveraging stash or rerere). Perhaps 'tr/remerge-diff' (on 'pu') is of interest? Ingenious! To me, this seems to be the most useful view if you want to understand a merge just from the parents and the merge commit. Since you would use that for individual commits only, the cpu cycles are well spent. As and added benefit, tr/remerge-diff merges to current next with conflicts (oid...) so that you get to test it on its own merge! I haven't reviewed remerge-diff but merged it on top of my own local additions and ran the full test suite successfully. Any big blocker to watch out for? Michael -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] mergetool-lib: fix default tool selection
When no diff nor merge tool is specified (config, option), mergetool-lib is supposed to choose a default tool from a set of tools. That set is constructed dynamically depending on the environment (graphical, editor setting) as a space separated string of tool names. 719518f (mergetool--lib: set IFS for difftool and mergetool, 2015-05-20) introduced a newline as IFS which breaks the parsing of the space separated list into items, resulting in a failed search for an available tool. Set IFS to a space locally for the tool search. Signed-off-by: Michael J Gruber g...@drmicha.warpmail.net --- git-mergetool--lib.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/git-mergetool--lib.sh b/git-mergetool--lib.sh index 14b039d..54ac8e4 100644 --- a/git-mergetool--lib.sh +++ b/git-mergetool--lib.sh @@ -305,6 +305,7 @@ guess_merge_tool () { EOF # Loop over each candidate and stop when a valid merge tool is found. + IFS=' ' for tool in $tools do is_available $tool echo $tool return 0 -- 2.4.4.610.gea30ab1.dirty -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/3] Move unsigned long option parsing out of pack-objects.c
From: Charles Bailey cbaile...@bloomberg.net The unsigned long option parsing (including 'k'/'m'/'g' suffix parsing) is more widely applicable. Add support for OPT_ULONG to parse-options.h and change pack-objects.c use this support. Signed-off-by: Charles Bailey cbaile...@bloomberg.net --- builtin/pack-objects.c | 17 - parse-options.c | 15 +++ parse-options.h | 5 - t/t0040-parse-options.sh | 46 ++ test-parse-options.c | 3 +++ 5 files changed, 64 insertions(+), 22 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 80fe8c7..5de76db 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -2588,23 +2588,6 @@ static int option_parse_unpack_unreachable(const struct option *opt, return 0; } -static int option_parse_ulong(const struct option *opt, - const char *arg, int unset) -{ - if (unset) - die(_(option %s does not accept negative form), - opt-long_name); - - if (!git_parse_ulong(arg, opt-value)) - die(_(unable to parse value '%s' for option %s), - arg, opt-long_name); - return 0; -} - -#define OPT_ULONG(s, l, v, h) \ - { OPTION_CALLBACK, (s), (l), (v), n, (h), \ - PARSE_OPT_NONEG, option_parse_ulong } - int cmd_pack_objects(int argc, const char **argv, const char *prefix) { int use_internal_rev_list = 0; diff --git a/parse-options.c b/parse-options.c index 80106c0..76a5c3e 100644 --- a/parse-options.c +++ b/parse-options.c @@ -180,6 +180,21 @@ static int get_value(struct parse_opt_ctx_t *p, return opterror(opt, expects a numerical value, flags); return 0; + case OPTION_ULONG: + if (unset) { + *(unsigned long *)opt-value = 0; + return 0; + } + if (opt-flags PARSE_OPT_OPTARG !p-opt) { + *(unsigned long *)opt-value = opt-defval; + return 0; + } + if (get_arg(p, opt, flags, arg)) + return -1; + if (!git_parse_ulong(arg, opt-value)) + return opterror(opt, expects a numerical value, flags); + return 0; + default: die(should not happen, someone must be hit on the forehead); } diff --git a/parse-options.h b/parse-options.h index c71e9da..2ddb26f 100644 --- a/parse-options.h +++ b/parse-options.h @@ -18,7 +18,8 @@ enum parse_opt_type { OPTION_INTEGER, OPTION_CALLBACK, OPTION_LOWLEVEL_CALLBACK, - OPTION_FILENAME + OPTION_FILENAME, + OPTION_ULONG }; enum parse_opt_flags { @@ -129,6 +130,8 @@ struct option { #define OPT_CMDMODE(s, l, v, h, i) { OPTION_CMDMODE, (s), (l), (v), NULL, \ (h), PARSE_OPT_NOARG|PARSE_OPT_NONEG, NULL, (i) } #define OPT_INTEGER(s, l, v, h) { OPTION_INTEGER, (s), (l), (v), N_(n), (h) } +#define OPT_ULONG(s, l, v, h) { OPTION_ULONG, (s), (l), (v), N_(n), \ + (h), PARSE_OPT_NONEG } #define OPT_STRING(s, l, v, a, h) { OPTION_STRING, (s), (l), (v), (a), (h) } #define OPT_STRING_LIST(s, l, v, a, h) \ { OPTION_CALLBACK, (s), (l), (v), (a), \ diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh index ecb7417..55b3dba 100755 --- a/t/t0040-parse-options.sh +++ b/t/t0040-parse-options.sh @@ -19,6 +19,8 @@ usage: test-parse-options options -i, --integer n get a integer -j nget a integer, too +-u, --unsigned-long n + get an unsigned long --set23 set integer to 23 -t time get timestamp of time -L, --length strget length of str @@ -58,6 +60,7 @@ mv expect expect.err cat expect.template EOF boolean: 0 integer: 0 +unsigned long: 0 timestamp: 0 string: (not set) abbrev: 7 @@ -132,9 +135,32 @@ test_expect_success 'OPT_BOOL() no negation #2' 'check_unknown_i18n --no-no-fear test_expect_success 'OPT_BOOL() positivation' 'check boolean: 0 -D --doubt' +test_expect_success 'OPT_INT() negative' 'check integer: -2345 -i -2345' + +test_expect_success 'OPT_ULONG() simple' ' + check unsigned long: 2345678 -u 2345678 +' + +test_expect_success 'OPT_ULONG() kilo' ' + check unsigned long: 239616 -u 234k +' + +test_expect_success 'OPT_ULONG() mega' ' + check unsigned long: 104857600 -u 100m +' + +test_expect_success 'OPT_ULONG() giga' ' + check unsigned long: 1073741824 -u 1g +' + +test_expect_success 'OPT_ULONG() 3giga' ' + check unsigned long: 3221225472 -u 3g +' + cat expect EOF boolean: 2 integer: 1729 +unsigned long: 16384 timestamp: 0 string: 123 abbrev: 7 @@ -145,7 +171,7 @@ file:
Improvements to parse-options and a new filter-objects command
In my team we've been looking for a fast way to check a large number of repositories for large files, which are typically unintentionally checked in binaries, so that we can warn repository owners and help them tidy up as desired. There seem to be two main approaches to scripting this. The first is to do something revision-walk based such as `log --numstat` and the second is to scan pack files using `verify-pack -v` and either to ensure that everything is packed or scan loose objects separately. The revision walking tends to be slow and parsing verify-pack -v is awkward not only because of the need to take account of multiple packs and loose objects, but also because it is porcelainish. For example, at some point it gained a delta chain summary which needs to be snipped before the list of packed objects can be sorted and used. The third patch in this series adds a new built in which makes this simple and fast. While implementing it, I found a couple of other improvements which I think stand alone. [PATCH 1/3] Correct test-parse-options to handle negative ints I noticed that a printf in test-parse-options was using %u instead of %d for an int with the consequence that it wouldn't ever print a negative value correctly. I don't know that we do ever parse a negative integer as an option, but there's no reason that it shouldn't work so I fixed it and added a trivial test. [PATCH 2/3] Move unsigned long option parsing out of pack-objects.c I wanted to be able to parse options like --min-size=500k in my new command so I started to add OPT_ULONG, only to realise that it already existed but was private to pack-objects. I added OPT_ULONG support to parse-options based on the existing OPT_INTEGER code, added new tests and changed pack-objects to use this instead. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/3] Correct test-parse-options to handle negative ints
From: Charles Bailey cbaile...@bloomberg.net Fix the printf specification to treat 'integer' as the signed type that it is and add a test that checks that we parse negative option arguments. Signed-off-by: Charles Bailey cbaile...@bloomberg.net --- t/t0040-parse-options.sh | 2 ++ test-parse-options.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh index b044785..ecb7417 100755 --- a/t/t0040-parse-options.sh +++ b/t/t0040-parse-options.sh @@ -151,6 +151,8 @@ test_expect_success 'short options' ' test_must_be_empty output.err ' +test_expect_success 'OPT_INT() negative' 'check integer: -2345 -i -2345' + cat expect EOF boolean: 2 integer: 1729 diff --git a/test-parse-options.c b/test-parse-options.c index 5dabce6..7c492cf 100644 --- a/test-parse-options.c +++ b/test-parse-options.c @@ -82,7 +82,7 @@ int main(int argc, char **argv) argc = parse_options(argc, (const char **)argv, prefix, options, usage, 0); printf(boolean: %d\n, boolean); - printf(integer: %u\n, integer); + printf(integer: %d\n, integer); printf(timestamp: %lu\n, timestamp); printf(string: %s\n, string ? string : (not set)); printf(abbrev: %d\n, abbrev); -- 2.4.0.53.g8440f74 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Using clean/smudge filters with difftool
Junio C Hamano venit, vidit, dixit 19.06.2015 00:55: John Keeping j...@keeping.me.uk writes: I think the summary is that there are some scenarios where the external diff tool should see the smudged version and others where the clean version is more appropriate and Git should support both options. It seems this is a property of the filter, so I wonder if the best solution is a new filter.name.extdiff = [clean|smudge] configuration variable (there's probably a better name for the variable than extdiff). Not just the external diff, but the textconv filter obeys the same rule. The setting should be done the same way for both, if we are going to go in that direction. textconv is a one-way filter from blob to readable blob. External diffs may prefer to work on blob rather than readable blob, but the currect setup does not seem to produce surprises. clean and smudge are two-way filters: clean from worktree blob (aka file) to repo blob, smudge the other way round. Typically, the user perceives these as inverse to each other. But we only require clean to be a left-inverse of smudge, i.e. (cat-file then) smudge then clean should give the same repo blob (as cat-file). We don't require that the other way round, i.e. we don't require smudge to be a left-inverse of clean, and in most setups (like the current one) it is not: smudge does not recreate what clean has cleaned out. It is a no-op (the identity, while clean is a projection). Now, since external diff runs on smudged blobs, it appears as if we mixed cleaned and smudged blobs when feeding external diffs; whereas really, we mix worktree blobs and smudged repo blobs, which is okay as per our definition of clean/smudge: the difference is irrelevant by definition. I still think that feeding cleaned blobs to external diff would be less surprising (and should be the default, but maybe can't be changed any more) and feeding smudged blobs should be the special case requiring a special config. Because otherwise, the external diff would have to know which parts of the diff are irrelevant - if it display the complete (uncleaned) diff, it shows differences (what will be committed) that will not end up in the commit (because they will get cleaned out before). As a guiding principle, a worktree-HEAD diff and an index-HEAD diff should be previews of the result of commit -a resp. commit, and therefore should diff cleaned versions. textconv, on the other hand, is a setting by which you tell git: Don't show me the 'proper' diff/commit-preview but a readable version. Michael -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] strbuf: stop out-of-boundary warnings from Coverity
Duy Nguyen pclo...@gmail.com writes: Anyway it does not put strbuf_slopbuf in .rodata. That's sad. I wa hoping that it would behave the same as this, which does give me SEGV: #include stdio.h static const char x = '\0'; static char *y = (char *)x; int main (void) { *y = 1; } -- To unsubscribe from this list: send the line unsubscribe git in
Re: Using clean/smudge filters with difftool
2015-06-19 11:32 GMT+02:00 John Keeping j...@keeping.me.uk: On Fri, Jun 19, 2015 at 10:57:55AM +0200, Michael J Gruber wrote: Junio C Hamano venit, vidit, dixit 19.06.2015 00:55: John Keeping j...@keeping.me.uk writes: I think the summary is that there are some scenarios where the external diff tool should see the smudged version and others where the clean version is more appropriate and Git should support both options. It seems this is a property of the filter, so I wonder if the best solution is a new filter.name.extdiff = [clean|smudge] configuration variable (there's probably a better name for the variable than extdiff). Not just the external diff, but the textconv filter obeys the same rule. The setting should be done the same way for both, if we are going to go in that direction. textconv is a one-way filter from blob to readable blob. External diffs may prefer to work on blob rather than readable blob, but the currect setup does not seem to produce surprises. clean and smudge are two-way filters: clean from worktree blob (aka file) to repo blob, smudge the other way round. Typically, the user perceives these as inverse to each other. But we only require clean to be a left-inverse of smudge, i.e. (cat-file then) smudge then clean should give the same repo blob (as cat-file). We don't require that the other way round, i.e. we don't require smudge to be a left-inverse of clean, and in most setups (like the current one) it is not: smudge does not recreate what clean has cleaned out. It is a no-op (the identity, while clean is a projection). Now, since external diff runs on smudged blobs, it appears as if we mixed cleaned and smudged blobs when feeding external diffs; whereas really, we mix worktree blobs and smudged repo blobs, which is okay as per our definition of clean/smudge: the difference is irrelevant by definition. I agree with this. But I was wrong that should diff clean/should diff smudged is a property of the filter. I can also imagine a situation where a more intelligent external diff tool wants to see the smudged version where a naïve tool would want the clean version. For example, some of the big file stores (e.g. git-lfs [1]) use clean/smudge filters and I can imagine a diff utility that avoids needing to fetch the data for large files and instead shows the diff on the server when both blobs are available there. In that case we generally want to use the smudged copy for external diff, so the filter would use that setting, but the diff utility knows better and would want to override that. [1] https://github.com/github/git-lfs I can understand why they are not fed with the clean copy by default. Since some external diff tool enable modifying the working copy file, this would correspond to apply the cleaning filter to the working copy version. Nevertheless, in my case it would be really helpful if there were an option to feed the external diff tool with the cleaned version. Otherwise, I'll probably write a custom script which does this. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
On Fri, Jun 19, 2015 at 11:09 PM, Junio C Hamano gits...@pobox.com wrote: You do realize that strbuf internally does alloc/free so as a solution to fragmentation issue you are at the mercy of the same alloc/free, don't you? Yes, of course, but it has the alloc variable to keep track of the size of the allocated buffer, and provided that ALLOC_GROW() uses a sensible factor to grow the buffer, it won't be calling malloc/free that much to be a problem. Of course, we could do the same and just add a alloc variable as well and then use it with strbuf_attach()/strbuf_detach(), but at this point why not just store it in an strbuf then and avoid the extra alloc struct member? The primary thing I care about is to discourage callers of the API element am_state from touching these fields with strbuf functions. If it is char * then the would think twice before modifying (which would involve allocating the new buffer, forming the new value and assigning into it). With the fields left as strbuf after they are read or formed by the API (by reading by the API implementation from $GIT_DIR/rebase-apply/), it is too tempting to do strbuf_add(), strbuf_trim(), etc., without restoring the value to the original saying I'm the last user of it anyway--that is the sloppiness we can prevent by not leaving it as strbuf. I don't think this is a good deterrent. If the code wanted to, it could just use strbuf_attach()/strbuf_detach() as well, no? I believe this kind of issue would be better solved with a big WARNING comment and code review. Also, there may be use cases where the user may wish to modify the commit message/author fields. E.g., user may wish to modify the message of the commit that results from am --continue to take into account the conflicts that caused the am to fail. Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in
Re: Using clean/smudge filters with difftool
Michael J Gruber g...@drmicha.warpmail.net writes: Now, since external diff runs on smudged blobs, it appears as if we mixed cleaned and smudged blobs when feeding external diffs; whereas really, we mix worktree blobs and smudged repo blobs, which is okay as per our definition of clean/smudge: the difference is irrelevant by definition. It does not appear to mix cleaned and smudged to me (even though before Dscho's commit that John pointed out, we did mix by mistake) to me, but you arrived at the correct conclusion in the rest of your sentence. We treat worktree files and smudged repo blobs as comparable because by definition the latter is what you get if you did a checkout of the blob. Indeed, when we know a worktree file is an unmodified checkout from a blob and we want to have a read-only temporary file for a smudged repo blob, we allow that worktree file to be used as such. So in that sense, the commit by Dscho that John pointed out earlier was not something that changed the semantics; it merely made things consistent (before that commit, we used to use clean version if we do not have a usable worktree file). It is a separate question which of clean or smudged an external diff tool should be given to work on. I still think that feeding cleaned blobs to external diff would be less surprising (and should be the default, but maybe can't be changed any more) and feeding smudged blobs should be the special case requiring a special config. Go back six years and make a review comment before 4e218f54 (Smudge the files fed to external diff and textconv, 2009-03-21) was taken ;-). The argument against that commit may have gone like this: * The current (that is, current as of 4e218f54^) code is inconsistent, and your patch has a side effect of making it consistent by always feeding smudged version. * We however could make it consistent by always feeding clean version (i.e. disable borrow-from-working-tree codepath when driving external diff). And that gives us cleaner semantics; the internal diff and external diff will both work on clean, not smudged data. * Of course, going the clean way would not help your cause of allowing external diff to work on smudged version, so you would need a separate patch on top of that consistently feed 'clean' version fix to optionally allow consistently feed 'smudge' version mode to help msysGit issue 177. And I would have bought such an argument with 97% chance [*1*]. I do not think 6 years have changed things very much with respect to the above three-bullet point argument, except that it would be too late to set the default to 'clean' all of a sudden. So a plausible way forward would be to * introduce an option to feed 'clean' versions to external diff drivers, perhaps with --ext-diff-clean=driver command line option and GIT_EXTERNAL_DIFF_CLEAN environment variable, both of which take precedence over existing --ext-diff/GIT_EXTERNAL_DIFF * optionally add a configuration variable diff.feedCleanToExternal that makes --ext-diff/GIT_EXTERNAL_DIFF behave as if their 'clean' siblings were given. Default it to false. My gut feeling is that textconv should need a similar treatment for consistency (after all, it goes through the same prepare_temp_file() infrastructure). [Footnote] *1* The 3% reservation is that I am not entirely convinced that both internal and external get to work on the same 'clean' representation gives us cleaner semantics is always true. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] format-patch: introduce format.outputDirectory configuration
Alexander Kuleshov kuleshovm...@gmail.com writes: 2015-06-19 3:46 GMT+06:00 Junio C Hamano gits...@pobox.com: I agree with later -o should override an earlier one, but I do not necessarily agree with '-o -' should be --stdout, for a simple reason that -o foo is not --stdout foo. Perhaps something like this to replace builtin/ part of Alexander's patch? @@ -1337,6 +1342,9 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) die (_(--subject-prefix and -k are mutually exclusive.)); rev.preserve_subject = keep_subject; + if (!output_directory !use_stdout) + output_directory = config_output_directory; + But there is following condition above: if (!use_stdout) output_directory = set_outdir(prefix, output_directory); After which output_directory will be ./ everytime and + if (!output_directory !use_stdout) + output_directory = config_output_directory; + will not work here. I thought I made that if we did not see '-o dir' on the command line, initialize output_directory to what we read from the config before we make a call to set_outdir(). What I am missing? Puzzled... FWIW, IIRC, the patch you are responding to passed the test you added. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Paul Tan pyoka...@gmail.com writes: On Fri, Jun 19, 2015 at 5:10 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: + /* commit message and metadata */ + struct strbuf author_name; + struct strbuf author_email; + struct strbuf author_date; + struct strbuf msg; Same comment as dir in the earlier patch applies to these. If the fields are read or computed and then kept constant, use a temporary variable that is a strbuf to read/compute the final value, and then detach to a const char * field. If they are constantly changing and in-place updates are vital, then they can and should be strbufs, but I do not think that is the case for these. I do think it is the case here. The commit message and metadata fields change for every patch we process, and we could be processing a large volume of patches, so we must be careful to not leak any memory. With the above fields, it is clear that the above fields are per-message thing. So the loop to process multiple messages is conceptually: set up the entire am_state (for things like cur=1, last=N) for each message { update per-message fields like cur, author, msg, etc. process the single message clean up per-message fileds like cur++, free(msg), etc. } tear down the entire am_state Reusing the same strbuf and rely on them to clean up after themselves is simply a bad taste. More importantly, we'd want to make sure that people who will be touching the process the single message part in the above loop to think twice before mucking with read-only fields like author. If they are char *, they would need to allocate new storage themselves, format a new value into there, free the original, and replace the field with the new value. It takes a conscious effort and very much visible code and would be clear to reviewers what is going on, and gives them a chance to question if it is a good idea to update things in-place in the first place. If you left it in strbuf, that invites let's extend it temporarily with strbuf_add() and then return to the original once I am done with this single step, which is an entry to a slippery slope to let's extend it with strbuf_add() for my purpose, and I do not even bother to clean up because I know I am the last person to touch this. So, no, please don't leave a field that won't change during the bulk of the processing in strbuf, unless you have a compelling reason to do so (and compelling reasons are pretty much limited to after all, that field we originally thought it won't change is something we need to change very often). Thanks. -- To unsubscribe from this list: send the line unsubscribe git in
Re: Visualizing merge conflicts after the fact (using kdiff3)
Michael J Gruber g...@drmicha.warpmail.net writes: Junio C Hamano venit, vidit, dixit 18.06.2015 17:57: Perhaps 'tr/remerge-diff' (on 'pu') is of interest? I haven't reviewed remerge-diff but merged it on top of my own local additions and ran the full test suite successfully. Any big blocker to watch out for? What's cooking report marks it as Waiting for a reroll. ($gmane/256591). So http://thread.gmane.org/gmane.comp.version-control.git/256591 would be the first place to revisit. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] format-patch: introduce format.outputDirectory configuration
I thought I made that if we did not see '-o dir' on the command line, initialize output_directory to what we read from the config before we make a call to set_outdir(). What I am missing? Puzzled... FWIW, IIRC, the patch you are responding to passed the test you added. Ok, Now we have: if (!use_stdout) output_directory = set_outdir(prefix, output_directory); else setup_pager(); and if (output_directory) { // test that we did not pass use_stdout and mkdir than } If we didn't pass --stdout and -o the set_outdir will be called and there is static const char *set_outdir(const char *prefix, const char *output_directory) { //printf(is_absoulte_path %d\n, is_absolute_path(output_directory)); if (output_directory is_absolute_path(output_directory)) return output_directory; if (!prefix || !*prefix) { if (output_directory) return output_directory; return ./; } } So it returns ./, output_directory will not be null. After this + if (!output_directory !use_stdout) + output_directory = config_output_directory; clause will not be executed never. Or I've missed something? Thank you. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] Improve contrib/diff-highlight to highlight unevenly-sized hunks
Jeff King p...@peff.net writes: I do strip it off, so it is OK for it to be different in both the pre-image and post-image. But what I can't tolerate is the intermingling with actual data: +\t\t\x1b[32m;foo +\t\x1b[32m;bar I think that depends on the definition of strip it off ;-) What I meant was that whitespace coloring can appear anywhere on the line, e.g. echo COPYING whitespace=tab-in-indent,-space-before-tab .gitattributes printf \tHeh\n COPYING git diff will give you new + reset SP wserror HT reset new Heh reset LF Obviously, stripping + painted in new is not sufficient. Side note: hmm, shouldn't that SP painted in new, though? I think this array of spans is the only way to go. Otherwise whichever markup scheme processes the hunk first ruins the data for the next processor. Yes, I agree with that 100%. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] format-patch: introduce format.outputDirectory configuration
Alexander Kuleshov kuleshovm...@gmail.com writes: Ah, you mean to put this check before. I am fuzzy what you mean before (or after); the how about doing it this way instead? patch we are discussing is to replace the change you did in your original, so if you apply it you would know what addition goes to where ;-) -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH 3/3] Add filter-objects command
Jeff King p...@peff.net writes: Right, my point was only that it works for _your_ particular filter, but it would be nice to have something more general. And we already have cat-file --batch-check. IOW, I think I would prefer the magical form because it's a better scripting building block. As you note, filter-objects without any filters is exactly that. Your 10 extra lines of C code are not exactly bloat, but I just wonder if other people will find it all that useful. Yup. I do not mind a fast enumerate all objects but I suspect that making all fast may turn out to be not so great a trade-off after all, as you would need more work on the now we have all coming from our input, let's filter with this and that criteria downstream in general cases. Graph-based filtering e.g. Oops, here is our whole customer database committed by mistake--which branch should we rewrite to nuke? inherently is much more costly to do in the downstream that essentially has to reconstruct the graph around interesting parts of the history, and is better done by enumerate phase spending time to do the actual graph traversal. And filter-anything should not be the name for enumerate command that comes on the upstream of a pipe. You usually call what is downstream of a pipe a filter. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery
Johannes Schindelin johannes.schinde...@gmx.de writes: I basically made up names on the go, based on the messages. Some of the questionable groups are: BAD_DATE DATE_OVERFLOW I guess it should be BAD_DATE_OVERFLOW to be more consistent? I am not sure about consistency, but surely a common prefix would help readers to group things. But for this particular group, I was wondering if singling out integer overflow, zero stuffed timestamp, etc. into such a finer sub-errors of you have a bad timestamp was beneficial. BAD_TREE_SHA1 INVALID_OBJECT_SHA1 INVALID_TREE BAD_PARENT_SHA1 INVALID_OBJECT_SHA1 So how about s/INVALID_/BAD_/g? It is not just about distinction between INVAID and BAD. I was basically wondering what rule decides which one among BAD_TREE_SHA1, INVALID_OBJECT_SHA1 and INVALID_TREE I would get when I have a random non-hexdigit string in various places, e.g. after 'tree ' in the object header of a commit object, after 'tag ' in a tag object that says 'type tree', etc. Also it is unclear if NOT_SORTED is to be used ever for any error other than a tree object sorted incorrectly, or if we start noticing a new error that something is not sorted, we will reuse this one. s/NOT_SORTED/TREE_/ maybe? If that error is specific to tree sorting order, then that would be a definite improvement. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] mergetool-lib: fix default tool selection
Michael J Gruber g...@drmicha.warpmail.net writes: When no diff nor merge tool is specified (config, option), mergetool-lib is supposed to choose a default tool from a set of tools. That set is constructed dynamically depending on the environment (graphical, editor setting) as a space separated string of tool names. 719518f (mergetool--lib: set IFS for difftool and mergetool, 2015-05-20) introduced a newline as IFS which breaks the parsing of the space separated list into items, resulting in a failed search for an available tool. Set IFS to a space locally for the tool search. I wondered where this locally is ensured; it turns out that this shell function is supposed to be always called inside $() to return its result via its standard output ;-) So I think this change makes sense. Signed-off-by: Michael J Gruber g...@drmicha.warpmail.net --- git-mergetool--lib.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/git-mergetool--lib.sh b/git-mergetool--lib.sh index 14b039d..54ac8e4 100644 --- a/git-mergetool--lib.sh +++ b/git-mergetool--lib.sh @@ -305,6 +305,7 @@ guess_merge_tool () { EOF # Loop over each candidate and stop when a valid merge tool is found. + IFS=' ' for tool in $tools do is_available $tool echo $tool return 0 -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH] format-patch: introduce format.outputDirectory configuration
Ah, you mean to put this check before. Just tested it and many tests are broken. Will look on it now 2015-06-19 23:19 GMT+06:00 Alexander Kuleshov kuleshovm...@gmail.com: I thought I made that if we did not see '-o dir' on the command line, initialize output_directory to what we read from the config before we make a call to set_outdir(). What I am missing? Puzzled... FWIW, IIRC, the patch you are responding to passed the test you added. Ok, Now we have: if (!use_stdout) output_directory = set_outdir(prefix, output_directory); else setup_pager(); and if (output_directory) { // test that we did not pass use_stdout and mkdir than } If we didn't pass --stdout and -o the set_outdir will be called and there is static const char *set_outdir(const char *prefix, const char *output_directory) { //printf(is_absoulte_path %d\n, is_absolute_path(output_directory)); if (output_directory is_absolute_path(output_directory)) return output_directory; if (!prefix || !*prefix) { if (output_directory) return output_directory; return ./; } } So it returns ./, output_directory will not be null. After this + if (!output_directory !use_stdout) + output_directory = config_output_directory; clause will not be executed never. Or I've missed something? Thank you. -- To unsubscribe from this list: send the line unsubscribe git in
Re: co-authoring commits
On 2015-06-18 at 23:25, Tuncer Ayaz wrote: On Thu, Jun 18, 2015 at 12:52 AM, Theodore Ts'o wrote: On Wed, Jun 17, 2015 at 10:26:32PM +0200, Tuncer Ayaz wrote: [...] One could imagine some frankly, quite rare example where there is a team of people who votes on each commit before it gets sent out and where everyone is equal and there is no hierarchy. In that case, perhaps you could set the from field to a mailing list address. But honestly, how often is that *all* of the authors are completely equal[1]? For that case something like patchwork, phabricator, or gerrit seems to be the logical tool to use, and should ideally leave a trace of approvals and such in the resulting commit message(s). If the patch management tool takes care of merging the commit(s), it can be harder to misattribute signed-off/reviewed-by/etc, which is a good thing. Doesn't Gerrit (at least) use trailer-like structured *notes* in the 'reviews' category (i.e. refs/notes/reviews ref) to store information about review process? You could of course use multiple (everybody makes their own) commits, where you risk breaking bisectability and avoid the need for equal co-authorship support. In pair programming such intermediate commits will quite often be fixups, and when you attempt to squash the fixups for bisectability's sake, you may get a desire for co-authorship of the resulting commit. Hmmm... I didn't think about the problem of attributing authorship for squashed commits. Though here multiple 'author' headers, or multiline 'author' header would be a better match than 'coauthor' header (which itself doesn't need, I think, the date filed, or does it?) [This is sent from Thunderbird news, so it should be all right] -- Jakub Narębski -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH 1/3] Correct test-parse-options to handle negative ints
Charles Bailey char...@hashpling.org writes: From: Charles Bailey cbaile...@bloomberg.net Fix the printf specification to treat 'integer' as the signed type that it is and add a test that checks that we parse negative option arguments. Signed-off-by: Charles Bailey cbaile...@bloomberg.net --- Makes sense. Will queue. t/t0040-parse-options.sh | 2 ++ test-parse-options.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh index b044785..ecb7417 100755 --- a/t/t0040-parse-options.sh +++ b/t/t0040-parse-options.sh @@ -151,6 +151,8 @@ test_expect_success 'short options' ' test_must_be_empty output.err ' +test_expect_success 'OPT_INT() negative' 'check integer: -2345 -i -2345' + cat expect EOF boolean: 2 integer: 1729 diff --git a/test-parse-options.c b/test-parse-options.c index 5dabce6..7c492cf 100644 --- a/test-parse-options.c +++ b/test-parse-options.c @@ -82,7 +82,7 @@ int main(int argc, char **argv) argc = parse_options(argc, (const char **)argv, prefix, options, usage, 0); printf(boolean: %d\n, boolean); - printf(integer: %u\n, integer); + printf(integer: %d\n, integer); printf(timestamp: %lu\n, timestamp); printf(string: %s\n, string ? string : (not set)); printf(abbrev: %d\n, abbrev); -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH v2] format-patch: introduce format.outputDirectory configuration
We can pass -o/--output-directory to the format-patch command to store patches not in the working directory. This patch introduces format.outputDirectory configuration option for same purpose. The case of usage of this configuration option can be convinience to not pass everytime -o/--output-directory if an user has pattern to store all patches in the /patches directory for example. The format.outputDirectory has lower priority than command line option, so if user will set format.outputDirectory and pass the command line option, a result will be stored in a directory that passed to command line option. Signed-off-by: Alexander Kuleshov kuleshovm...@gmail.com --- Documentation/config.txt | 4 Documentation/git-format-patch.txt | 6 +- builtin/log.c | 8 t/t4014-format-patch.sh| 18 ++ 4 files changed, 35 insertions(+), 1 deletion(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index e159fe5..4f991b6 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1262,6 +1262,10 @@ format.coverLetter:: format-patch is invoked, but in addition can be set to auto, to generate a cover-letter only when there's more than one patch. +format.outputDirectory:: + Set a custom directory to store the resulting files instead of the + current working directory. + filter.driver.clean:: The command which is used to convert the content of a worktree file to a blob upon checkin. See linkgit:gitattributes[5] for diff --git a/Documentation/git-format-patch.txt b/Documentation/git-format-patch.txt index 0dac4e9..38ddd76 100644 --- a/Documentation/git-format-patch.txt +++ b/Documentation/git-format-patch.txt @@ -57,7 +57,11 @@ The names of the output files are printed to standard output, unless the `--stdout` option is specified. If `-o` is specified, output files are created in dir. Otherwise -they are created in the current working directory. +they are created in the current working directory. The default path +can be set with the seting 'format.outputDirectory' configuration option. +If `-o` is specified and 'format.outputDirectory' is set, output files +will be stored in a dir that passed to `-o`. When 'format.outputDirectory' +is set to get default behaviour back is to pass './' to the `-o`. By default, the subject of a single patch is [PATCH] followed by the concatenation of lines from the commit message up to the first blank diff --git a/builtin/log.c b/builtin/log.c index 78b3e2c..fc26360 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -688,6 +688,8 @@ enum { COVER_AUTO }; +static const char *config_output_directory = NULL; + static int git_format_config(const char *var, const char *value, void *cb) { if (!strcmp(var, format.headers)) { @@ -758,6 +760,9 @@ static int git_format_config(const char *var, const char *value, void *cb) config_cover_letter = git_config_bool(var, value) ? COVER_ON : COVER_OFF; return 0; } + if (!strcmp(var, format.outputdirectory)) { + return git_config_string(config_output_directory, var, value); + } return git_log_config(var, value, cb); } @@ -1368,6 +1373,9 @@ int cmd_format_patch(int argc, const char **argv, const char *prefix) if (rev.show_notes) init_display_notes(rev.notes_opt); + if (!output_directory !use_stdout) + output_directory = config_output_directory; + if (!use_stdout) output_directory = set_outdir(prefix, output_directory); else diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh index 890db11..613e2cc 100755 --- a/t/t4014-format-patch.sh +++ b/t/t4014-format-patch.sh @@ -40,6 +40,24 @@ test_expect_success setup ' ' +test_expect_success format-patch format.outputDirectory option ' + git config format.outputDirectory patches/ + git format-patch master..side + cnt=$(ls | wc -l) + test $cnt = 3 + test_config format.outputDirectory patches/ + git config --unset format.outputDirectory +' + +test_expect_success format-patch format.outputDirectory overwritten with -o ' + rm -rf patches + git config format.outputDirectory patches/ + git format-patch master..side -o . + test_must_fail ls patches/ + test_config format.outputDirectory patches/ + git config --unset format.outputDirectory +' + test_expect_success format-patch --ignore-if-in-upstream ' git format-patch --stdout master..side patch0 -- 2.4.4.727.g5c3049e.dirty -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 04/19] fsck: Offer a function to demote fsck errors to warnings
Johannes Schindelin johannes.schinde...@gmx.de writes: diff --git a/fsck.c b/fsck.c index da5717c..8c3caff 100644 --- a/fsck.c +++ b/fsck.c @@ -103,13 +103,85 @@ static int fsck_msg_type(enum fsck_msg_id msg_id, { int msg_type; - msg_type = msg_id_info[msg_id].msg_type; - if (options-strict msg_type == FSCK_WARN) - msg_type = FSCK_ERROR; + assert(msg_id = 0 msg_id FSCK_MSG_MAX); + + if (options-msg_type) + msg_type = options-msg_type[msg_id]; + else { + msg_type = msg_id_info[msg_id].msg_type; + if (options-strict msg_type == FSCK_WARN) + msg_type = FSCK_ERROR; + } return msg_type; } Nice. +static inline int substrcmp(const char *string, int len, const char *match) +{ + int match_len = strlen(match); + if (match_len != len) + return -1; + return memcmp(string, match, len); +} What this does looks suspiciously like starts_with(), but its name substrcmp() does not give any hint that this is about the beginnig part of string; if anything, it gives a wrong hint that it may be any substring. prefixcmp() might be a better name but that was the old name for !starts_with() so we cannot use it here. It is a mouthful, but starts_with_counted() may be. But the whole thing may be moot. If we take the why not upcase the end-user string upfront suggestion from the previous review, fsck_set_msg_types() would have an upcased copy of the end-user string that it can muck with; it can turn badfoo=error,poorbar=... into BADFOO=error,POORBAR=... that is stored in its own writable memory (possibly a strbuf), and at that point it can afford to NUL-terminate BADFOO=error after finding where one specification ends with strcspn() before calling fsck_set_msg_type(), which in turn calls parse_msg_type(). So all parse_msg_type() needs to do is just !strcmp(). + +static int parse_msg_type(const char *str, int len) +{ + if (len 0) + len = strlen(str); + + if (!substrcmp(str, len, error)) + return FSCK_ERROR; + else if (!substrcmp(str, len, warn)) + return FSCK_WARN; + else + die(Unknown fsck message type: '%.*s', + len, str); +} + +void fsck_set_msg_type(struct fsck_options *options, + const char *msg_id, int msg_id_len, + const char *msg_type, int msg_type_len) +{ + int id = parse_msg_id(msg_id, msg_id_len), type; + + if (id 0) + die(Unhandled message id: %.*s, msg_id_len, msg_id); + type = parse_msg_type(msg_type, msg_type_len); + + if (!options-msg_type) { + int i; + int *msg_type = xmalloc(sizeof(int) * FSCK_MSG_MAX); + for (i = 0; i FSCK_MSG_MAX; i++) + msg_type[i] = fsck_msg_type(i, options); + options-msg_type = msg_type; + } + + options-msg_type[id] = type; +} + +void fsck_set_msg_types(struct fsck_options *options, const char *values) +{ + while (*values) { + int len = strcspn(values, ,|), equal; + + if (!len) { + values++; + continue; + } + + for (equal = 0; equal len; equal++) + if (values[equal] == '=' || values[equal] == ':') + break; + + if (equal == len) + die(Missing '=': '%.*s', len, values); + + fsck_set_msg_type(options, values, equal, + values + equal + 1, len - equal - 1); + values += len; + } +} + __attribute__((format (printf, 4, 5))) static int report(struct fsck_options *options, struct object *object, enum fsck_msg_id id, const char *fmt, ...) @@ -599,6 +671,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size, int fsck_error_function(struct object *obj, int msg_type, const char *message) { + if (msg_type == FSCK_WARN) { + warning(object %s: %s, sha1_to_hex(obj-sha1), message); + return 0; + } error(object %s: %s, sha1_to_hex(obj-sha1), message); return 1; } diff --git a/fsck.h b/fsck.h index f6f268a..edb4540 100644 --- a/fsck.h +++ b/fsck.h @@ -6,6 +6,11 @@ struct fsck_options; +void fsck_set_msg_type(struct fsck_options *options, + const char *msg_id, int msg_id_len, + const char *msg_type, int msg_type_len); +void fsck_set_msg_types(struct fsck_options *options, const char *values); + /* * callback function for fsck_walk * type is the expected type of the object or OBJ_ANY @@ -25,10 +30,11 @@ struct fsck_options { fsck_walk_func walk; fsck_error error_func; unsigned strict:1; + int *msg_type; }; -#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0
Re: [PATCH v6 06/19] fsck: Report the ID of the error/warning
Johannes Schindelin johannes.schinde...@gmx.de writes: Some legacy code has objects with non-fatal fsck issues; To enable the user to ignore those issues, let's print out the ID (e.g. when encountering missingemail, the user might want to call `git config --add receive.fsck.missingemail=warn`). Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- fsck.c | 16 t/t1450-fsck.sh | 4 ++-- 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/fsck.c b/fsck.c index 8e6faa8..0b3e18f 100644 --- a/fsck.c +++ b/fsck.c @@ -190,6 +190,20 @@ void fsck_set_msg_types(struct fsck_options *options, const char *values) } } +static void append_msg_id(struct strbuf *sb, const char *msg_id) +{ + for (;;) { + char c = *(msg_id)++; + + if (!c) + break; + if (c != '_') + strbuf_addch(sb, tolower(c)); + } + + strbuf_addstr(sb, : ); +} + __attribute__((format (printf, 4, 5))) static int report(struct fsck_options *options, struct object *object, enum fsck_msg_id id, const char *fmt, ...) @@ -198,6 +212,8 @@ static int report(struct fsck_options *options, struct object *object, struct strbuf sb = STRBUF_INIT; int msg_type = fsck_msg_type(id, options), result; + append_msg_id(sb, msg_id_info[id].id_string); Nice. The append function can be made a bit more context sensitive to upcase a char immediately after _ to make it easier to cut and paste into git config and keep the result readable, I think. git config --add receive.fsck.missingEmail=warn va_start(ap, fmt); strbuf_vaddf(sb, fmt, ap); result = options-error_func(object, msg_type, sb.buf); diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index cfb32b6..d6d3b13 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -231,8 +231,8 @@ test_expect_success 'tag with incorrect tag name missing tagger' ' git fsck --tags 2out cat expect -EOF - warning in tag $tag: invalid '\''tag'\'' name: wrong name format - warning in tag $tag: invalid format - expected '\''tagger'\'' line + warning in tag $tag: badtagname: invalid '\''tag'\'' name: wrong name format + warning in tag $tag: missingtaggerentry: invalid format - expected '\''tagger'\'' line EOF test_cmp expect out ' -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v5 00/19] Introduce an internal API to interact with the fsck machinery
Hi Junio, first of all: the improvements discussed here are already part of v6. On 2015-06-19 19:33, Junio C Hamano wrote: Johannes Schindelin johannes.schinde...@gmx.de writes: I basically made up names on the go, based on the messages. Some of the questionable groups are: BAD_DATE DATE_OVERFLOW I guess it should be BAD_DATE_OVERFLOW to be more consistent? I am not sure about consistency, but surely a common prefix would help readers to group things. But for this particular group, I was wondering if singling out integer overflow, zero stuffed timestamp, etc. into such a finer sub-errors of you have a bad timestamp was beneficial. Well, someone thought it a good idea to print different error messages, and I took that as an indicator that there is merit in being able to distinguish these issues from one another. BAD_TREE_SHA1 INVALID_OBJECT_SHA1 INVALID_TREE BAD_PARENT_SHA1 INVALID_OBJECT_SHA1 So how about s/INVALID_/BAD_/g? It is not just about distinction between INVAID and BAD. I was basically wondering what rule decides which one among BAD_TREE_SHA1, INVALID_OBJECT_SHA1 and INVALID_TREE I would get when I have a random non-hexdigit string in various places, e.g. after 'tree ' in the object header of a commit object, after 'tag ' in a tag object that says 'type tree', etc. To be honest, I think the IDs do not really matter as much as your comment makes it sound: the IDs purpose is solely to be able to configure the message type (read: whether to error out, warn or ignore the issue). The real information is in the actual message (and I did not change any message, therefore I could not make things worse than they are right now). Example: you would never read BAD_TREE_SHA1, but instead: badtreesha1: invalid 'tree' line format - bad sha1. For BAD_OBJECT_SHA1 (as it is now called), there are actually two code paths generating the error: invalid 'object' line format - bad sha1 and no valid object to fsck. And for BAD_TREE it is: could not load commit's tree SHA1. Thus, from the error message it should be really clear what is going on. Also it is unclear if NOT_SORTED is to be used ever for any error other than a tree object sorted incorrectly, or if we start noticing a new error that something is not sorted, we will reuse this one. s/NOT_SORTED/TREE_/ maybe? If that error is specific to tree sorting order, then that would be a definite improvement. Yes, that is the case. Tree objects are assumed to list their contents in order, and this ID applies to the problem where a tree object's list is out of order. As I said, I already made both discussed changes part of v6. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH v6 07/19] fsck: Make fsck_ident() warn-friendly
Johannes Schindelin johannes.schinde...@gmx.de writes: When fsck_ident() identifies a problem with the ident, it should still advance the pointer to the next line so that fsck can continue in the case of a mere warning. Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de --- Makes sense. -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH 2/3] Move unsigned long option parsing out of pack-objects.c
Junio C Hamano gits...@pobox.com writes: Except for the minor nits above, I think this is a good change. Oh, I forgot to mention one thing. I am not sure if this should be called ULONG. unsigned long-ness is not the most important part of this thing from the end-user's point of view, and also from the point of view of the programmer who supports end-users by using this new feature. It is unlike OPT_INTEGER, the user can specify it as a human readble scaled quantity that is the reason to use this new thing. I think we discussed to introduce OPT_HUMINT (HUM stands for HUMAN, obviously) or some name like that a few years ago to do exactly this, but that is not a great name, either. I was tempted to suggest a name that has size in it, but because places that we may conceivably want to use it in the future would be to specify: - sizes, e.g. split the packfiles into 4.3G chunks. - counts, e.g. show me the most recent 2k commits. - bandwidth, e.g. limit the transfer to consume at most 2M bps. which is not limited to size, it is not a very good idea, either. OPT_SCALED_ULONG(), or something with scaled in its name, perhaps? -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH 2/3] Move unsigned long option parsing out of pack-objects.c
W dniu 2015-06-19 o 19:58, Junio C Hamano pisze: Charles Bailey char...@hashpling.org writes: [...] +if (!git_parse_ulong(arg, opt-value)) +return opterror(opt, expects a numerical value, flags); This used to be: -die(_(unable to parse value '%s' for option %s), -arg, opt-long_name); but opterror() talks about which option, so there is no information loss by losing for option %s from here. That means there is only one difference for pack-objects: $ git pack-objects --max-pack-size=1T fatal: unable to parse value '1T' for option max-pack-size $ ./git pack-objects --max-pack-size=1T error: option `max-pack-size' expects a numerical value usage: git pack-objects --stdout [options... ... 30 more lines omitted ... Eh, make that two: * We no longer say what value we did not like. The user presumably knows what he typed, so this is only a minor loss. Well, in this case this is not a problem, but for longer commandline invocation it might be hard to find the exact argument among all the options (though I don't think there is any integer-accepting option that can be repeated). * We used to stop without giving usage, as the error message was specific enough. We now spew descriptions on other options unrelated to the specific error the user may want to concentrate on. Perhaps this is a minor regression. I wonder if expects a numerical value is the best way to say this. Is expects numerical value easier to understand than unable to parse value? Ponder: - we do not take 4.8 - we won't take locale specific 4,8 (for some locales) - 4O is not numerical... 40 is - we do not take -4. - people may not realize, from numerical, that we take 5M. Except for the minor nits above, I think this is a good change. This is a totally unrelated tangent that does not have to be part of your series, but we probably should take 4.8M; I do not think we currently do. Oh, and perhaps 1T, too. -- To unsubscribe from this list: send the line unsubscribe git in