Re: [PATCH] userdiff: add built-in pattern for CSS
Am 03.06.2016 um 00:48 schrieb William Duclot: Logic behind the "pattern" regex is: The name of the macro parameter is "pattern", but the actual meaning is "function name" regex. diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index e3b1de8..81f60ad 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -525,6 +525,8 @@ patterns are available: - `csharp` suitable for source code in the C# language. +- `css` suitable for source code in the CSS language. CSS is not so much source code. How about "suitable for cascaded style sheets"? diff --git a/t/t4018/css-common b/t/t4018/css-common new file mode 100644 index 000..84ed754 --- /dev/null +++ b/t/t4018/css-common @@ -0,0 +1,4 @@ +RIGHT label.control-label { +margin-top: 10px!important; +border : 10px ChangeMe #C6C6C6; +} diff --git a/t/t4018/css-rule b/t/t4018/css-rule new file mode 100644 index 000..84ed754 --- /dev/null +++ b/t/t4018/css-rule @@ -0,0 +1,4 @@ +RIGHT label.control-label { +margin-top: 10px!important; +border : 10px ChangeMe #C6C6C6; +} These two are the same. Please pick only one. I propose the name "common" because it is how CSS rules are written most commonly. +IPATTERN("css", +"!^.*;\n" Is there a difference between this and "!;\n"? Is it necessary to anchor the pattern at the beginning of the line? In the commit message you talk about colon (':'), but you actually use a semicolon (';'). Thinking a bit more about it, rejecting lines with either one would be even better. Consider this case (without the indentation): h1 { color: red; } (New test case, hint, hint!) Therefore, it could be: "![:;]\n". +"^[_a-z0-9].*$", +/* -- */ +/* + * This regex comes from W3C CSS specs. Should theoretically also + * allow ISO 10646 characters U+00A0 and higher, + * but they are not handled in this regex. + */ +"-?[_a-zA-F][-_a-zA-F0-9]*" /* identifiers */ Drop A-F. +"|-?[0-9]+|\\#[0-9a-fA-F]+" /* numbers */ Here, too: it is an IPATTERN. -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Jun 2016, #01; Thu, 2)
On 06/03/2016 01:13 AM, Mike Hommey wrote: On Thu, Jun 02, 2016 at 03:52:41PM -0700, Junio C Hamano wrote: * mh/connect (2016-06-01) 9 commits - connect: move ssh command line preparation to a separate function - connect: actively reject git:// urls with a user part - connect: change the --diag-url output to separate user and host - connect: make parse_connect_url() return the user part of the url as a separate value - connect: group CONNECT_DIAG_URL handling code - connect: make parse_connect_url() return separated host and port - connect: re-derive a host:port string from the separate host and port variables - connect: call get_host_and_port() earlier - connect: document why we sometimes call get_port after get_host_and_port Needs review. I /think/ Torsten reviewed it all, and his last comments are in $gmane/295800. It's still not clear to me why he wants to remove the comment about []. There where 2 comments in the review. The most important thing is that now git://[example.com:123]/path/to/repo is valid, but it shouldn't. This patch fixes it: @@ -673,7 +669,7 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_user, * "host:port" and NULL. * To support this undocumented legacy we still need to split the port. */ - if (!port) + if (!port && protocol == PROTO_SSH) The other thing is that I asked for a test case for git://[example.com:123]/path/to/repo which shouldn't be hard to do. If nobody else things that this comment in the code is stale: - /* -* Don't do destructive transforms as protocol code does -* '[]' unwrapping in get_host_and_port() -*/ then just leave it as it is. Thanks -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t1308: do not get fooled by symbolic links to the source tree
On Thu, Jun 02, 2016 at 04:23:00PM -0700, Stefan Beller wrote: > > To prevent this in the future, I switched my default --root= to point to > > a symlink. I wonder if we could do something in the test suite, though, > > as we did long ago by introducing "trash directory" with a space to > > catch corner cases. > > > > I guess it would be something like: > > > > if test_have_prereq SYMLINKS > > IIUC this would need each test to be marked with SYMLINKS > when testing with symlinks is desired. Marking a test with that > is easily forgotten, so I'd rather do it by default as: > > if (system supports symlinks): > > then > > mkdir "$TRASH_DIRECTORY.real" && > > ln -s "$TRASH_DIRECTORY.real" "$TRASH_DIRECTORY" > > else > > mkdir "$TRASH_DIRECTORY" > > fi I'm not sure I understand. My intent was to say "does the system support symlinks" in test-lib.sh when setting up the trash directory. The test_have_prereq function asks that, AFAIK (the "test_" prefix is not "does _this_ test require it" but just "this function is in the test namespace"). > I like the idea of testing with symlinks. (Does it have performance issues > when everything goes through symlinks?) I didn't notice any on my system (running with --root pointed at a directory, and at a symlink to that directory). It would be extra work whenever we determine a canonical absolute path, but I suspect that is drowned out in the noise of the rest of the test suite. Plus some minor extra work in the `ln` and `rm` calls during test setup/teardown, but likewise that should be a small part of the total cost. > On the other hand if we do tests by default in a symlinked path, we could > introduce errors more easily in non-symlinked path, but that is what > developers > use for developing (I guess), so it's not as likely? True, but I'd be surprised if there are bugs that show up in a non-symlinked path that _do_ show up in a symlinked one. I'm not sure if we've seen a case where this would find an actual bug in git, though. The cases I remember were mostly about the test suite being picky (i.e., git canonicalizes but the expected output doesn't, or vice versa). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv3] pathspec: allow escaped query values
On Thu, Jun 2, 2016 at 4:23 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> However if we add a value restriction here, we need to be as strict in the >> .gitattributes parsing as well and put a warning there (similar to >> invalid_attr_name_message) I would think. > > Remember, the attribute system is used for many purposes other than > this new "further limit pathspec". Right, and in the past we followed a rigid pattern of non-crazy values. The crazy values will show up once users realize they can put anything they want into the .gitattributes to query for files. Before this labeling change, there was no need to put anything other the officially documented values into the attribute files. > So I do not think it is necessary or even beneficial to add such a > warning. Ok. > >> +static char *attr_value_unescape(const char *value) >> +{ >> + const char *src; >> + char *dst, *ret; >> + >> + ret = xmallocz(strlen(value)); >> + for (src = value, dst = ret; *src; src++, dst++) { >> + if (*src == '\\') { >> + if (!src[1]) >> + die(_("Escape character '\\' not allowed as " >> + "last character in attr value")); >> + src++; >> + } >> + if (*src && invalid_value_char(*src)) >> + die("cannot use '%c' for value matching", *src); >> + *dst = *src; >> + } >> + *dst = '\0'; >> + return ret; >> +} > > Please sanity-check me. Just like I said to your original "I doubt > *i could be NUL here", I now doubt *src could be NUL there where > invalid_value_char() gets called. > > If *src could be NUL there, then *dst gets NUL once, and then after > loop exits (presumably after incrementing dst), *dst gets another > NUL, which was the terminating NUL condition being iffy I mentioned, > but as you said, I do not think it would happen, so we can lose the > "*src && " before invalid_value_char() is called. Right, we can lose the *src check before invalid_value_char as that ought to be caught in if (!src[1]) die(...) before. Thanks, Stefan > > Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv3] pathspec: allow escaped query values
Stefan Bellerwrites: > However if we add a value restriction here, we need to be as strict in the > .gitattributes parsing as well and put a warning there (similar to > invalid_attr_name_message) I would think. Remember, the attribute system is used for many purposes other than this new "further limit pathspec". While the (hopefully temporary) limitation hurts here, users will limit the attributes they use for ":(attr:VAR=VAL)" pathspec magic, and it is unlikely that either the VAR part or its VALue are something the core Git currently uses anyway. So I do not think it is necessary or even beneficial to add such a warning. > +static char *attr_value_unescape(const char *value) > +{ > + const char *src; > + char *dst, *ret; > + > + ret = xmallocz(strlen(value)); > + for (src = value, dst = ret; *src; src++, dst++) { > + if (*src == '\\') { > + if (!src[1]) > + die(_("Escape character '\\' not allowed as " > + "last character in attr value")); > + src++; > + } > + if (*src && invalid_value_char(*src)) > + die("cannot use '%c' for value matching", *src); > + *dst = *src; > + } > + *dst = '\0'; > + return ret; > +} Please sanity-check me. Just like I said to your original "I doubt *i could be NUL here", I now doubt *src could be NUL there where invalid_value_char() gets called. If *src could be NUL there, then *dst gets NUL once, and then after loop exits (presumably after incrementing dst), *dst gets another NUL, which was the terminating NUL condition being iffy I mentioned, but as you said, I do not think it would happen, so we can lose the "*src && " before invalid_value_char() is called. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t1308: do not get fooled by symbolic links to the source tree
On Thu, Jun 2, 2016 at 4:16 PM, Jeff Kingwrote: > On Thu, Jun 02, 2016 at 03:15:39PM -0700, Junio C Hamano wrote: > >> When your $PWD does not match $(/bin/pwd), e.g. you have your copy >> of the git source tree in one place, point it with a symbolic link, >> and then "cd" to that symbolic link before running 'make test', one >> of the tests in t1308 expects that the per-user configuration was >> reported to have been read from the true path (i.e. relative to the >> target of such a symbolic link), but the test-config program reports >> a path relative to $PWD (i.e. the symbolic link). >> >> Instead, expect a path relative to $HOME (aka $TRASH_DIRECTORY), as >> per-user configuration is read from $HOME/.gitconfig and the test >> framework sets these shell variables up in such a way to avoid this >> problem. > > Looks good. > > Acked-by: Jeff King > > To prevent this in the future, I switched my default --root= to point to > a symlink. I wonder if we could do something in the test suite, though, > as we did long ago by introducing "trash directory" with a space to > catch corner cases. > > I guess it would be something like: > > if test_have_prereq SYMLINKS IIUC this would need each test to be marked with SYMLINKS when testing with symlinks is desired. Marking a test with that is easily forgotten, so I'd rather do it by default as: if (system supports symlinks): > then > mkdir "$TRASH_DIRECTORY.real" && > ln -s "$TRASH_DIRECTORY.real" "$TRASH_DIRECTORY" > else > mkdir "$TRASH_DIRECTORY" > fi > > but there may be some other tweaks required (e.g., for cleanup). I like the idea of testing with symlinks. (Does it have performance issues when everything goes through symlinks?) On the other hand if we do tests by default in a symlinked path, we could introduce errors more easily in non-symlinked path, but that is what developers use for developing (I guess), so it's not as likely? Thanks, Stefan > > -Peff > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] t1308: do not get fooled by symbolic links to the source tree
On Thu, Jun 02, 2016 at 03:15:39PM -0700, Junio C Hamano wrote: > When your $PWD does not match $(/bin/pwd), e.g. you have your copy > of the git source tree in one place, point it with a symbolic link, > and then "cd" to that symbolic link before running 'make test', one > of the tests in t1308 expects that the per-user configuration was > reported to have been read from the true path (i.e. relative to the > target of such a symbolic link), but the test-config program reports > a path relative to $PWD (i.e. the symbolic link). > > Instead, expect a path relative to $HOME (aka $TRASH_DIRECTORY), as > per-user configuration is read from $HOME/.gitconfig and the test > framework sets these shell variables up in such a way to avoid this > problem. Looks good. Acked-by: Jeff KingTo prevent this in the future, I switched my default --root= to point to a symlink. I wonder if we could do something in the test suite, though, as we did long ago by introducing "trash directory" with a space to catch corner cases. I guess it would be something like: if test_have_prereq SYMLINKS then mkdir "$TRASH_DIRECTORY.real" && ln -s "$TRASH_DIRECTORY.real" "$TRASH_DIRECTORY" else mkdir "$TRASH_DIRECTORY" fi but there may be some other tweaks required (e.g., for cleanup). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv3] pathspec: allow escaped query values
In our own .gitattributes file we have attributes such as: *.[ch] whitespace=indent,trail,space When querying for attributes we want to be able to ask for the exact value, i.e. git ls-files :(attr:whitespace=indent,trail,space) should work, but the commas are used in the attr magic to introduce the next attr, such that this query currently fails with fatal: Invalid pathspec magic 'trail' in ':(attr:whitespace=indent,trail,space)' This change allows escaping characters by a backslash, such that the query git ls-files :(attr:whitespace=indent\,trail\,space) will match all path that have the value "indent,trail,space" for the whitespace attribute. To accomplish this, we need to modify two places. First `eat_long_magic` needs to not stop early upon seeing a comma or closing paren that is escaped. As a second step we need to remove any escaping from the attr value. Helped-by: Junio C HamanoSigned-off-by: Stefan Beller --- > That would be true _only_ when "find next escape and copy up to that > byte" aka "scanning once with optimized strchr(), and copying once > with optimized memmove()" is faster than "scanning once and copying" > loop. Oh right. that would work for larger strings that don't fit into a cache very well, but in our expected case this will do. > I was merely reacting to your use of memmove() in a very different > loop, where if you unescape "a\b\c\defghijk", your memmove() would > move "efghijk" many times. Right, at the time of writing I didn't like it, but was ok with it as we only have a few escapes. > Also the handling of the terminating NUL may need to be > updated. I don't think so. > That is why I did not say "replace yours with this", but > merely "along the lines of this" ;-) This is pretty much your code modulo nits (xmalloz, semicolons) now, and I am convinced it works by the test. However if we add a value restriction here, we need to be as strict in the .gitattributes parsing as well and put a warning there (similar to invalid_attr_name_message) I would think. Thanks, Stefan pathspec.c | 52 + t/t6134-pathspec-with-labels.sh | 10 2 files changed, 58 insertions(+), 4 deletions(-) diff --git a/pathspec.c b/pathspec.c index 326863a..fe53ddf 100644 --- a/pathspec.c +++ b/pathspec.c @@ -89,6 +89,51 @@ static void prefix_short_magic(struct strbuf *sb, int prefixlen, strbuf_addf(sb, ",prefix:%d)", prefixlen); } +static size_t strcspn_escaped(const char *s, const char *stop) +{ + const char *i; + + for (i = s; *i; i++) { + /* skip the escaped character */ + if (i[0] == '\\' && i[1]) { + i++; + continue; + } + + if (strchr(stop, *i)) + break; + } + return i - s; +} + +static inline int invalid_value_char(const char ch) +{ + if (isalnum(ch) || strchr(",-_", ch)) + return 0; + return -1; +} + +static char *attr_value_unescape(const char *value) +{ + const char *src; + char *dst, *ret; + + ret = xmallocz(strlen(value)); + for (src = value, dst = ret; *src; src++, dst++) { + if (*src == '\\') { + if (!src[1]) + die(_("Escape character '\\' not allowed as " + "last character in attr value")); + src++; + } + if (*src && invalid_value_char(*src)) + die("cannot use '%c' for value matching", *src); + *dst = *src; + } + *dst = '\0'; + return ret; +} + static void parse_pathspec_attr_match(struct pathspec_item *item, const char *value) { struct string_list_item *si; @@ -131,10 +176,9 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va if (attr[attr_len] != '=') am->match_mode = MATCH_SET; else { + const char *v = [attr_len + 1]; am->match_mode = MATCH_VALUE; - am->value = xstrdup([attr_len + 1]); - if (strchr(am->value, '\\')) - die(_("attr spec values must not contain backslashes")); + am->value = attr_value_unescape(v); } break; } @@ -166,7 +210,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt, for (copyfrom = elt + 2; *copyfrom && *copyfrom != ')'; copyfrom = nextat) { - size_t len = strcspn(copyfrom, ",)"); + size_t len = strcspn_escaped(copyfrom, ",)"); if (copyfrom[len]
Re: What's cooking in git.git (Jun 2016, #01; Thu, 2)
On Thu, Jun 02, 2016 at 03:52:41PM -0700, Junio C Hamano wrote: > * mh/connect (2016-06-01) 9 commits > - connect: move ssh command line preparation to a separate function > - connect: actively reject git:// urls with a user part > - connect: change the --diag-url output to separate user and host > - connect: make parse_connect_url() return the user part of the url as a > separate value > - connect: group CONNECT_DIAG_URL handling code > - connect: make parse_connect_url() return separated host and port > - connect: re-derive a host:port string from the separate host and port > variables > - connect: call get_host_and_port() earlier > - connect: document why we sometimes call get_port after get_host_and_port > > Needs review. I /think/ Torsten reviewed it all, and his last comments are in $gmane/295800. It's still not clear to me why he wants to remove the comment about []. Mike -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] rev-list: disable bitmaps when "-n" is used with listing objects
You can ask rev-list to use bitmaps to speed up an --objects traversal, which should generally give you your answers much faster. Likewise, you can ask rev-list to limit such a traversal with `-n`, in which case we'll show only a limited set of commits (and only the tree and commit objects directly reachable from those commits). But if you do both together, the results are nonsensical. We end up limiting any fallback traversal we do to _find_ the bitmaps, but the actual set of objects we output will be picked arbitrarily from the union of any bitmaps we do find, and will involve the objects of many more commits. It's possible that somebody might want this as a "show me what you can, but limit the amount of work you do" flag. But as with the prior commit clamping "--count", the results are basically non-deterministic; you'll get the values from some commits between `n` and the total number, and you can't tell which. And unlike the `--count` case, we can't easily generate the "real" value from the bitmap values (you can't just walk back `-n` commits and subtract out the reachable objects from the boundary commits; the bitmaps for `X` record its total reachability, so you don't know which objects are directly from `X` itself, which from `X^`, and so on). So let's just fallback to the non-bitmap code path in this case, so we always give a sane answer. Signed-off-by: Jeff King--- builtin/rev-list.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index aaa79a3..9fb6608 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -365,7 +365,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) printf("%d\n", commit_count); return 0; } - } else if (revs.tag_objects && revs.tree_objects && revs.blob_objects) { + } else if (!revs.max_count && + revs.tag_objects && revs.tree_objects && revs.blob_objects) { if (!prepare_bitmap_walk()) { traverse_bitmap_commit_list(_object_fast); return 0; -- 2.9.0.rc1.132.g33c7f30 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] rev-list: "adjust" results of "--count --use-bitmap-index -n"
If you ask rev-list for: git rev-list --count --use-bitmap-index HEAD we optimize out the actual traversal and just give you the number of bits set in the commit bitmap. This is faster, which is good. But if you ask to limit the size of the traversal, like: git rev-list --count --use-bitmap-index -n 100 HEAD we'll still output the full bitmapped number we found. On the surface, that might even seem OK. You explicitly asked to use the bitmap index, and it was cheap to compute the real answer, so we gave it to you. But there's something much more complicated going on under the hood. If we don't have a bitmap directly for HEAD, then we have to actually traverse backwards, looking for a bitmapped commit. And _that_ traversal is bounded by our `-n` count. This is a good thing, because it bounds the work we have to do, which is probably what the user wanted by asking for `-n`. But now it makes the output quite confusing. You might get many values: - your `-n` value, if we walked back and never found a bitmap (or fewer if there weren't that many commits) - the actual full count, if we found a bitmap root for every path of our traversal with in the `-n` limit - any number in between! We might have walked back and found _some_ bitmaps, but then cut off the traversal early with some commits not accounted for in the result. So you cannot even see a value higher than your `-n` and say "OK, bitmaps kicked in, this must be the real full count". The only sane thing is for git to just clamp the value to a maximum of the `-n` value, which means we should output the exact same results whether bitmaps are in use or not. The test in t5310 demonstrates this by using `-n 1`. Without this patch we fail in the full-bitmap case (where we do not have to traverse at all) but _not_ in the partial-bitmap case (where we have to walk down to find an actual bitmap). With this patch, both cases just work. I didn't implement the crazy in-between case, just because it's complicated to set up, and is really a subset of the full-count case, which we do cover. Signed-off-by: Jeff King--- builtin/rev-list.c | 2 ++ t/t5310-pack-bitmaps.sh | 6 ++ 2 files changed, 8 insertions(+) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 275da0d..aaa79a3 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -360,6 +360,8 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) uint32_t commit_count; if (!prepare_bitmap_walk()) { count_bitmap_commit_list(_count, NULL, NULL, NULL); + if (revs.max_count && revs.max_count < commit_count) + commit_count = revs.max_count; printf("%d\n", commit_count); return 0; } diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh index d446706..3893afd 100755 --- a/t/t5310-pack-bitmaps.sh +++ b/t/t5310-pack-bitmaps.sh @@ -47,6 +47,12 @@ rev_list_tests() { test_cmp expect actual ' + test_expect_success "counting commits with limit ($state)" ' + git rev-list --count -n 1 HEAD >expect && + git rev-list --use-bitmap-index --count -n 1 HEAD >actual && + test_cmp expect actual + ' + test_expect_success "counting non-linear history ($state)" ' git rev-list --count other...master >expect && git rev-list --use-bitmap-index --count other...master >actual && -- 2.9.0.rc1.132.g33c7f30 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/2] corner cases with "rev-list --use-bitmap-index -n"
This series fixes two corner-cases I found when using "-n" along with bitmaps. The results do make _some_ sense if you interpret them correctly, but are sufficiently confusing that I think it's worth dealing with. See the commit messages for the gory details. The good news is that in the first case, we can produce a more sensible answer without spending any extra work. The bad news is that the second case cannot, and must fall back to a regular traversal. I doubt anybody even cares about this case in practice, though, as "--objects -n" is somewhat nonsensical already (I didn't run across it in practice; I only noticed while I fixing the first one, which I did see in practice). [1/2]: rev-list: "adjust" results of "--count --use-bitmap-index -n" [2/2]: rev-list: disable bitmaps when "-n" is used with listing objects -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] userdiff: add built-in pattern for CSS
William Duclotwrites: > diff --git a/userdiff.c b/userdiff.c > index 6bf2505..00fc3bf 100644 > --- a/userdiff.c > +++ b/userdiff.c > @@ -148,6 +148,18 @@ PATTERNS("csharp", >"[a-zA-Z_][a-zA-Z0-9_]*" >"|[-+0-9.e]+[fFlL]?|0[xXbB]?[0-9a-fA-F]+[lL]?" >"|[-+*/<>%&^|=!]=|--|\\+\\+|<<=?|>>=?|&&|\\|\\||::|->"), > +IPATTERN("css", > + "!^.*;\n" > + "^[_a-z0-9].*$", > + /* -- */ > + /* > + * This regex comes from W3C CSS specs. Should theoretically also > + * allow ISO 10646 characters U+00A0 and higher, > + * but they are not handled in this regex. > + */ > + "-?[_a-zA-F][-_a-zA-F0-9]*" /* identifiers */ > + "|-?[0-9]+|\\#[0-9a-fA-F]+" /* numbers */ You could now lose A-F from the above two lines. In fact, the first line "identifiers" has "a-zA-F", which probably would work correctly under IPATTERN() to include 'G-Z' as part of the legit letters for identifiers, but is indeed misleading. > +), > { "default", NULL, -1, { NULL, 0 } }, > }; > #undef PATTERNS -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] pathspec: allow escaped query values
Stefan Bellerwrites: > Thinking about efficiency, I have the believe that memmove can be faster > than a `*src=*dst` thing we do ourselves as it may have access to specialized > assembly instructions to move larger chunks of memory or such. > So I think ideally we would do a block copy between the escape characters > (sketched as:) > > last = input > while input not ended: > current = find next escape character in input > memcpy from input value in the range of last to current > last = current + 1 > copy remaining parts if no further escape is found That would be true _only_ when "find next escape and copy up to that byte" aka "scanning once with optimized strchr(), and copying once with optimized memmove()" is faster than "scanning once and copying" loop. I was merely reacting to your use of memmove() in a very different loop, where if you unescape "a\b\c\defghijk", your memmove() would move "efghijk" many times. >> ret = xmalloc(strlen(value)); > > xmallocz at least? Yes. Also the handling of the terminating NUL may need to be updated. That is why I did not say "replace yours with this", but merely "along the lines of this" ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] pathspec: allow escaped query values
On Thu, Jun 2, 2016 at 2:54 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> In our own .gitattributes file we have attributes such as: >> >> *.[ch] whitespace=indent,trail,space >> >> When querying for attributes we want to be able to ask for the exact >> value, i.e. >> >> git ls-files :(attr:whitespace=indent,trail,space) >> >> should work, but the commas are used in the attr magic to introduce >> the next attr, ... >> ... >> So here is the "escaping only, but escaping done right" version. >> (It goes on top of sb/pathspec-label) > > The phrase "should work" is probably a bit too strong (I'd have said > "it would be nice if this worked"), as we do not have to even > support comma for our immediately expected use cases. Allowing it > merely makes a casual test using our own .gitattributes easier. > >> +static size_t strcspn_escaped(const char *s, const char *reject) > > Perhaps s/reject/stop/? > >> +{ >> + const char *i, *j; >> + >> + for (i = s; *i; i++) { >> + /* skip escaped the character */ >> + if (i[0] == '\\' && i[1]) { >> + i++; >> + continue; >> + } >> + /* see if any of the chars matches the current character */ >> + for (j = reject; *j; j++) >> + if (!*i || *i == *j) >> + return i - s; > > I somehow doubt that *i can be NUL here. In any case, this looks > more like > > /* is *i is one of the stop codon? */ > if (strchr(stop, *i)) > break; > >> + } >> + return i - s; >> +} > >> +static char *attr_value_unescape(const char *value) >> +{ >> + char *i, *ret = xstrdup(value); >> + >> + for (i = ret; *i; i++) { >> + if (i[0] == '\\') { >> + if (!i[1]) >> + die(_("Escape character '\\' not allowed as " >> + "last character in attr value")); >> + >> + /* remove the backslash */ >> + memmove(i, i + 1, strlen(i)); >> + /* and ignore the character after that */ >> + i++; >> + } >> + } >> + return ret; >> +} >> + > > Repeated memmove() and strlen() somehow bothers me. Would there be > a more efficient and straight-forward way to do this, perhaps along > the lines of this instead? Thinking about efficiency, I have the believe that memmove can be faster than a `*src=*dst` thing we do ourselves as it may have access to specialized assembly instructions to move larger chunks of memory or such. So I think ideally we would do a block copy between the escape characters (sketched as:) last = input while input not ended: current = find next escape character in input memcpy from input value in the range of last to current last = current + 1 copy remaining parts if no further escape is found It doesn't seem worth the effort to get it right though. > > const char *src; > char *dst, *ret; > > ret = xmalloc(strlen(value)); xmallocz at least? > for (src = value, dst = ret; *src; src++, dst++) { > if (*src == '\\') { > if (!src[1]) > die(); > src++; > } > if (*src && invalid_value_char(*src)) > die("cannot use '%c' for value matching", *src) > *dst = *src; > } > *dst = '\0' > return ret; > Thanks, Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
What's cooking in git.git (Jun 2016, #01; Thu, 2)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. 2.9-rc2 is scheduled for early next week. A handful low-impact topics are expected to be in 'master' by then. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [New Topics] * et/add-chmod-x (2016-06-01) 2 commits - SQUASH??? add: add --chmod=+x / --chmod=-x options - add: add --chmod=+x / --chmod=-x options Waiting for a reroll (or ack for proposed fixup). * jc/t2300-setup (2016-06-01) 1 commit - t2300: run git-sh-setup in an environment that better mimics the real life A test fix. Will merge to 'next'. * jk/shell-portability (2016-06-01) 2 commits - t5500 & t7403: lose bash-ism "local" - test-lib: add in-shell "env" replacement test fixes. Will merge to 'next'. * mh/connect (2016-06-01) 9 commits - connect: move ssh command line preparation to a separate function - connect: actively reject git:// urls with a user part - connect: change the --diag-url output to separate user and host - connect: make parse_connect_url() return the user part of the url as a separate value - connect: group CONNECT_DIAG_URL handling code - connect: make parse_connect_url() return separated host and port - connect: re-derive a host:port string from the separate host and port variables - connect: call get_host_and_port() earlier - connect: document why we sometimes call get_port after get_host_and_port Needs review. * sb/submodule-helper-list-signal-unmatch-via-exit-status (2016-06-01) 1 commit (merged to 'next' on 2016-06-02 at 29064a2) + submodule--helper: offer a consistent API (this branch is used by sb/submodule-helper-relative-path.) The way how "submodule--helper list" signals unmatch error to its callers has been updated. Will merge to 'master'. * sb/submodule-helper-relative-path (2016-06-01) 1 commit (merged to 'next' on 2016-06-02 at 8a191e1) + submodule: remove bashism from shell script (this branch uses sb/submodule-helper-list-signal-unmatch-via-exit-status.) A bash-ism "local" has been removed from "git submodule" scripted Porcelain. Will merge to 'master'. * mr/send-email-doc-gmail-2fa (2016-06-01) 1 commit (merged to 'next' on 2016-06-02 at cd2ade8) + Documentation/git-send-email: fix typo in gmail 2FA section Typofix. Will merge to 'master'. * aq/upload-pack-use-parse-options (2016-05-31) 1 commit - upload-pack.c: use parse-options API "git upload-pack" command has been updated to use the parse-options API. Will merge to 'next'. * jc/clear-pathspec (2016-06-02) 1 commit - pathspec: rename free_pathspec() to clear_pathspec() We usually call a function that clears the contents a data structure X without freeing the structure itself clear_X(), and call a function that does clear_X() and also frees it free_X(). free_pathspec() function has been renamed to clear_pathspec() to avoid confusion. Will merge to 'next'. -- [Stalled] * sb/bisect (2016-04-15) 22 commits - SQUASH??? - bisect: get back halfway shortcut - bisect: compute best bisection in compute_relevant_weights() - bisect: use a bottom-up traversal to find relevant weights - bisect: prepare for different algorithms based on find_all - bisect: rename count_distance() to compute_weight() - bisect: make total number of commits global - bisect: introduce distance_direction() - bisect: extract get_distance() function from code duplication - bisect: use commit instead of commit list as arguments when appropriate - bisect: replace clear_distance() by unique markers - bisect: use struct node_data array instead of int array - bisect: get rid of recursion in count_distance() - bisect: make algorithm behavior independent of DEBUG_BISECT - bisect: make bisect compile if DEBUG_BISECT is set - bisect: plug the biggest memory leak - bisect: add test for the bisect algorithm - t6030: generalize test to not rely on current implementation - t: use test_cmp_rev() where appropriate - t/test-lib-functions.sh: generalize test_cmp_rev - bisect: allow 'bisect run' if no good commit is known - bisect: write about `bisect next` in documentation The internal algorithm used in "git bisect" to find the next commit to check has been optimized greatly. Expecting a reroll. ($gmane/291163) * nd/shallow-deepen (2016-04-13) 26 commits - fetch, upload-pack: --deepen=N extends shallow boundary by N commits - upload-pack: add get_reachable_list() - upload-pack: split check_unreachable() in two, prep for get_reachable_list() - t5500, t5539: tests for shallow depth excluding a ref - clone: define shallow
[PATCH] userdiff: add built-in pattern for CSS
CSS is widely used, motivating it being included as a built-in pattern. It must be noted that the word_regex for CSS (i.e. the regex defining what is a word in the language) does not consider '.' and '#' characters (in CSS selectors) to be part of the word. This behavior is documented by the test t/t4018/css-rule. The logic behind this behavior is the following: identifiers in CSS selectors are identifiers in a HTML/XML document. Therefore, the '.'/'#' character are not part of the identifier, but an indicator of the nature of the identifier in HTML/XML (class or id). Diffing ".class1" and ".class2" must show that the class name is changed, but we still are selecting a class. Logic behind the "pattern" regex is: 1. reject lines containing a colon (properties) 2. if a line begins with a name in column 1, pick the whole line Credits to Johannes Sixt (j...@kdbg.org) for the pattern regex and most of the tests. Signed-off-by: William DuclotSigned-off-by: Matthieu Moy --- Changes since V2: - pattern regex has changed - more tests Documentation/gitattributes.txt | 2 ++ t/t4018-diff-funcname.sh| 1 + t/t4018/css-brace-in-col-1 | 5 + t/t4018/css-common | 4 t/t4018/css-long-selector-list | 6 ++ t/t4018/css-prop-sans-indent| 5 + t/t4018/css-rule| 4 t/t4018/css-short-selector-list | 4 t/t4034-diff-words.sh | 1 + t/t4034/css/expect | 16 t/t4034/css/post| 10 ++ t/t4034/css/pre | 10 ++ userdiff.c | 12 13 files changed, 80 insertions(+) create mode 100644 t/t4018/css-brace-in-col-1 create mode 100644 t/t4018/css-common create mode 100644 t/t4018/css-long-selector-list create mode 100644 t/t4018/css-prop-sans-indent create mode 100644 t/t4018/css-rule create mode 100644 t/t4018/css-short-selector-list create mode 100644 t/t4034/css/expect create mode 100644 t/t4034/css/post create mode 100644 t/t4034/css/pre diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index e3b1de8..81f60ad 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -525,6 +525,8 @@ patterns are available: - `csharp` suitable for source code in the C# language. +- `css` suitable for source code in the CSS language. + - `fortran` suitable for source code in the Fortran language. - `fountain` suitable for Fountain documents. diff --git a/t/t4018-diff-funcname.sh b/t/t4018-diff-funcname.sh index 67373dc..1795ffc 100755 --- a/t/t4018-diff-funcname.sh +++ b/t/t4018-diff-funcname.sh @@ -30,6 +30,7 @@ diffpatterns=" bibtex cpp csharp + css fortran fountain html diff --git a/t/t4018/css-brace-in-col-1 b/t/t4018/css-brace-in-col-1 new file mode 100644 index 000..7831577 --- /dev/null +++ b/t/t4018/css-brace-in-col-1 @@ -0,0 +1,5 @@ +RIGHT label.control-label +{ +margin-top: 10px!important; +border : 10px ChangeMe #C6C6C6; +} diff --git a/t/t4018/css-common b/t/t4018/css-common new file mode 100644 index 000..84ed754 --- /dev/null +++ b/t/t4018/css-common @@ -0,0 +1,4 @@ +RIGHT label.control-label { +margin-top: 10px!important; +border : 10px ChangeMe #C6C6C6; +} diff --git a/t/t4018/css-long-selector-list b/t/t4018/css-long-selector-list new file mode 100644 index 000..7ccd25d --- /dev/null +++ b/t/t4018/css-long-selector-list @@ -0,0 +1,6 @@ +p.header, +label.control-label, +div ul#RIGHT { +margin-top: 10px!important; +border : 10px ChangeMe #C6C6C6; +} diff --git a/t/t4018/css-prop-sans-indent b/t/t4018/css-prop-sans-indent new file mode 100644 index 000..a9e3c86 --- /dev/null +++ b/t/t4018/css-prop-sans-indent @@ -0,0 +1,5 @@ +RIGHT, label.control-label { +margin-top: 10px!important; +padding: 0; +border : 10px ChangeMe #C6C6C6; +} diff --git a/t/t4018/css-rule b/t/t4018/css-rule new file mode 100644 index 000..84ed754 --- /dev/null +++ b/t/t4018/css-rule @@ -0,0 +1,4 @@ +RIGHT label.control-label { +margin-top: 10px!important; +border : 10px ChangeMe #C6C6C6; +} diff --git a/t/t4018/css-short-selector-list b/t/t4018/css-short-selector-list new file mode 100644 index 000..6a0bdee --- /dev/null +++ b/t/t4018/css-short-selector-list @@ -0,0 +1,4 @@ +label.control, div ul#RIGHT { +margin-top: 10px!important; +border : 10px ChangeMe #C6C6C6; +} diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh index f2f55fc..912df91 100755 --- a/t/t4034-diff-words.sh +++ b/t/t4034-diff-words.sh @@ -302,6 +302,7 @@ test_language_driver ada test_language_driver bibtex test_language_driver cpp test_language_driver csharp +test_language_driver css test_language_driver fortran test_language_driver html test_language_driver java diff --git a/t/t4034/css/expect
[PATCH] t1308: do not get fooled by symbolic links to the source tree
When your $PWD does not match $(/bin/pwd), e.g. you have your copy of the git source tree in one place, point it with a symbolic link, and then "cd" to that symbolic link before running 'make test', one of the tests in t1308 expects that the per-user configuration was reported to have been read from the true path (i.e. relative to the target of such a symbolic link), but the test-config program reports a path relative to $PWD (i.e. the symbolic link). Instead, expect a path relative to $HOME (aka $TRASH_DIRECTORY), as per-user configuration is read from $HOME/.gitconfig and the test framework sets these shell variables up in such a way to avoid this problem. Signed-off-by: Junio C Hamano--- > Yeah, I think it is the same issue. I think the most accurate value > there would probably be $HOME, though. Thanks. t/t1308-config-set.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh index 065d5eb..cf716b4 100755 --- a/t/t1308-config-set.sh +++ b/t/t1308-config-set.sh @@ -237,7 +237,7 @@ test_expect_success 'iteration shows correct origins' ' key=foo.bar value=from-home origin=file - name=$(pwd)/.gitconfig + name=$HOME/.gitconfig scope=global key=foo.bar -- 2.9.0-rc1-228-gd00d833 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] pathspec: allow escaped query values
On Thu, Jun 2, 2016 at 2:54 PM, Junio C Hamanowrote: > Stefan Beller writes: > >> In our own .gitattributes file we have attributes such as: >> >> *.[ch] whitespace=indent,trail,space >> >> When querying for attributes we want to be able to ask for the exact >> value, i.e. >> >> git ls-files :(attr:whitespace=indent,trail,space) >> >> should work, but the commas are used in the attr magic to introduce >> the next attr, ... >> ... >> So here is the "escaping only, but escaping done right" version. >> (It goes on top of sb/pathspec-label) > > The phrase "should work" is probably a bit too strong (I'd have said > "it would be nice if this worked"), as we do not have to even > support comma for our immediately expected use cases. Allowing it > merely makes a casual test using our own .gitattributes easier. > >> +static size_t strcspn_escaped(const char *s, const char *reject) > > Perhaps s/reject/stop/? I took the signature from the strcspn man page, and my version of the man page has `reject` there, `stop` certainly works too > >> +{ >> + const char *i, *j; >> + >> + for (i = s; *i; i++) { >> + /* skip escaped the character */ >> + if (i[0] == '\\' && i[1]) { >> + i++; >> + continue; >> + } >> + /* see if any of the chars matches the current character */ >> + for (j = reject; *j; j++) >> + if (!*i || *i == *j) >> + return i - s; > > I somehow doubt that *i can be NUL here. In any case, this looks > more like > > /* is *i is one of the stop codon? */ > if (strchr(stop, *i)) > break; right, that seems easier. > >> + } >> + return i - s; >> +} > >> +static char *attr_value_unescape(const char *value) >> +{ >> + char *i, *ret = xstrdup(value); >> + >> + for (i = ret; *i; i++) { >> + if (i[0] == '\\') { >> + if (!i[1]) >> + die(_("Escape character '\\' not allowed as " >> + "last character in attr value")); >> + >> + /* remove the backslash */ >> + memmove(i, i + 1, strlen(i)); >> + /* and ignore the character after that */ >> + i++; >> + } >> + } >> + return ret; >> +} >> + > > Repeated memmove() and strlen() somehow bothers me. Would there be > a more efficient and straight-forward way to do this, perhaps along > the lines of this instead? > > const char *src; > char *dst, *ret; > > ret = xmalloc(strlen(value)); > for (src = value, dst = ret; *src; src++, dst++) { > if (*src == '\\') { > if (!src[1]) > die(); > src++; > } > if (*src && invalid_value_char(*src)) > die("cannot use '%c' for value matching", *src) > *dst = *src; > } > *dst = '\0' > return ret; > > Even though I earlier said "Now we have an unquote mechanism, we can > open the floodgate by lifting the "no backslash in value" check, I > now realize that we do want to keep some escape hatch for us to > extend the quoting syntax even more later, so perhaps with something > like this: > > static inline int invalid_value_char(const char ch) > { > if (isalnum(ch) || strchr(",-_", ch)) > return 0; > return -1; > } Makes sense. Later on we could have : or ; for an and/or of the values and parens and such, so the invalid_value_char makes sense. > > Thanks. Thanks, Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] pathspec: rename free_pathspec() to clear_pathspec()
Jeff Kingwrites: > I think diff_filespec_clear() would not be quite right. It is freeing > only the allocated _data_, but leaving the other portions intact. You are (as usual) more right than I am ;-) Yes, the free_data() is designed to retain enough information about the filespec so that the data can be re-read by the next person who needs to access it after free_data() is called; i.e. it was a measure to reduce the memory pressure by trading it off with I/O cost. It is wrong to name it "clear", which is to "clear the slate, make it into pristine state" whose side effect is to release held resources. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Mark remote `gc --auto` error messages
Jeff Kingwrites: > So the greater question is not "should this output be marked" but > "should auto-gc data go over the sideband so that all clients see it > (and any server-side stderr does not)". And I think the answer is > probably yes. And that fixes the "remote: " thing as a side effect. Thanks for stating this a lot more clearly than I could, and I agree that sending this to the other side regardless of the protocol is the right thing. I somehow doubt that server operators would check Apache logs to decide when to do a proper GC, so I do not consider it a true loss ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] pathspec: allow escaped query values
Stefan Bellerwrites: > In our own .gitattributes file we have attributes such as: > > *.[ch] whitespace=indent,trail,space > > When querying for attributes we want to be able to ask for the exact > value, i.e. > > git ls-files :(attr:whitespace=indent,trail,space) > > should work, but the commas are used in the attr magic to introduce > the next attr, ... > ... > So here is the "escaping only, but escaping done right" version. > (It goes on top of sb/pathspec-label) The phrase "should work" is probably a bit too strong (I'd have said "it would be nice if this worked"), as we do not have to even support comma for our immediately expected use cases. Allowing it merely makes a casual test using our own .gitattributes easier. > +static size_t strcspn_escaped(const char *s, const char *reject) Perhaps s/reject/stop/? > +{ > + const char *i, *j; > + > + for (i = s; *i; i++) { > + /* skip escaped the character */ > + if (i[0] == '\\' && i[1]) { > + i++; > + continue; > + } > + /* see if any of the chars matches the current character */ > + for (j = reject; *j; j++) > + if (!*i || *i == *j) > + return i - s; I somehow doubt that *i can be NUL here. In any case, this looks more like /* is *i is one of the stop codon? */ if (strchr(stop, *i)) break; > + } > + return i - s; > +} > +static char *attr_value_unescape(const char *value) > +{ > + char *i, *ret = xstrdup(value); > + > + for (i = ret; *i; i++) { > + if (i[0] == '\\') { > + if (!i[1]) > + die(_("Escape character '\\' not allowed as " > + "last character in attr value")); > + > + /* remove the backslash */ > + memmove(i, i + 1, strlen(i)); > + /* and ignore the character after that */ > + i++; > + } > + } > + return ret; > +} > + Repeated memmove() and strlen() somehow bothers me. Would there be a more efficient and straight-forward way to do this, perhaps along the lines of this instead? const char *src; char *dst, *ret; ret = xmalloc(strlen(value)); for (src = value, dst = ret; *src; src++, dst++) { if (*src == '\\') { if (!src[1]) die(); src++; } if (*src && invalid_value_char(*src)) die("cannot use '%c' for value matching", *src) *dst = *src; } *dst = '\0' return ret; Even though I earlier said "Now we have an unquote mechanism, we can open the floodgate by lifting the "no backslash in value" check, I now realize that we do want to keep some escape hatch for us to extend the quoting syntax even more later, so perhaps with something like this: static inline int invalid_value_char(const char ch) { if (isalnum(ch) || strchr(",-_", ch)) return 0; return -1; } Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Mark remote `gc --auto` error messages
On Thu, Jun 02, 2016 at 01:14:02PM -0700, Junio C Hamano wrote: > > What exactly are you referring to (you only quoted the introduction)? > > Do you think we should fix the git-gc issue but keep the general > > behavior of printing messages unaltered? Do you think it would be > > worthwhile to make server messages distinguishable in general? > > The latter, which I think was what your implementation was attempting to do > if I read it correctly. And btw, I don't think this patch fixes the general case. E.g., if receive-pack hits any of its die("BUG") lines, they will not be prefixed. Most clients wouldn't see them, but ssh ones would. To fix that you'd have to do a whole async process wrapping `receive-pack` that just reads its stdout and stderr and muxes it back over the sideband. But I can think of two roadblocks there: - I think the original design of receive-pack was _not_ to share all of stderr with the user, because it might contain secret-ish server-side things. That's why we have rp_error() which copies to the sideband. I don't know how useful that is in practice. We copy the stderr wholesale from sub-processes like index-pack, so things like file paths are likely to get leaked there. - the implementation is a bit tricky, because the die() will take down the mux thread, too. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Mark remote `gc --auto` error messages
On Thu, Jun 02, 2016 at 01:14:02PM -0700, Junio C Hamano wrote: > On Thu, Jun 2, 2016 at 1:06 PM, Lukas Fleischerwrote: > > On Thu, 02 Jun 2016 at 21:33:33, Junio C Hamano wrote: > >> Lukas Fleischer writes: > >> > >> > When running `git push`, it might occur that error messages are > >> > transferred from the server to the client. While most messages (those > >> > explicitly sent on sideband 2) are prefixed with "remote:", it seems > >> > that error messages printed during the automatic householding performed > >> > by git-gc(1) are displayed without any additional decoration. Thus, such > >> > messages can easily be misinterpreted as git-gc failing locally, see [1] > >> > for an actual example of where that happened. > >> > >> Sounds like a sensible goal to me. > > > > What exactly are you referring to (you only quoted the introduction)? > > Do you think we should fix the git-gc issue but keep the general > > behavior of printing messages unaltered? Do you think it would be > > worthwhile to make server messages distinguishable in general? > > The latter, which I think was what your implementation was attempting to do > if I read it correctly. I think the implementation is doing much more, but it is probably a good thing. Right now we do not send auto-gc output over the sideband, and its stderr goes to receive-pack's stderr. But that is a different place for different protocols. For git-over-https, it is probably apache's error log, or /dev/null if the server admin configured it. For ssh, it may be back over the ssh stderr channel, or it may go to a log or nowhere if the server admin intercepts receive-pack and redirects it. So the greater question is not "should this output be marked" but "should auto-gc data go over the sideband so that all clients see it (and any server-side stderr does not)". And I think the answer is probably yes. And that fixes the "remote: " thing as a side effect. If it were no, then this is not the right solution, and the solution is to swap out copy_to_sideband() for something that copies to stderr with "remote: " prepended, or something. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (May 2016, #09; Tue, 31) t1308 broken
On Thu, Jun 02, 2016 at 02:31:50PM -0700, Junio C Hamano wrote: > Perhaps like this, taking hint from the log message of 6eafa6d0 > (submodules: don't stumble over symbolic links when cloning > recursively, 2012-07-12)? > > t/t1308-config-set.sh | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh > index 065d5eb..1ba9ecb 100755 > --- a/t/t1308-config-set.sh > +++ b/t/t1308-config-set.sh > @@ -237,7 +237,7 @@ test_expect_success 'iteration shows correct origins' ' > key=foo.bar > value=from-home > origin=file > - name=$(pwd)/.gitconfig > + name=$TRASH_DIRECTORY/.gitconfig > scope=global > > key=foo.bar Yeah, I think it is the same issue. I think the most accurate value there would probably be $HOME, though. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (May 2016, #09; Tue, 31) t1308 broken
Junio C Hamanowrites: > Torsten Bögershausen writes: > >> It seams as ./t1308-config-set.sh is broken, >> when the the directory is a soft link: >> >> -name=/home/tb/NoBackup/projects/git/git.pu/t/trash >> directory.t1308-config-set/.gitconfig >> +name=/home/tb/projects/git/git.pu/t/trash >> directory.t1308-config-set/.gitconfig >> scope=global > > It does seem that way. Somebody is affected by $PWD when we should > be consistently using the physical / real path. > >> >> key=foo.bar >> not ok 28 - iteration shows correct origins >> >> I havent't digged further, too many conflicts in the config code, may be >> somebody knows it directly ? Perhaps like this, taking hint from the log message of 6eafa6d0 (submodules: don't stumble over symbolic links when cloning recursively, 2012-07-12)? t/t1308-config-set.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh index 065d5eb..1ba9ecb 100755 --- a/t/t1308-config-set.sh +++ b/t/t1308-config-set.sh @@ -237,7 +237,7 @@ test_expect_success 'iteration shows correct origins' ' key=foo.bar value=from-home origin=file - name=$(pwd)/.gitconfig + name=$TRASH_DIRECTORY/.gitconfig scope=global key=foo.bar -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCHv2] pathspec: allow escaped query values
In our own .gitattributes file we have attributes such as: *.[ch] whitespace=indent,trail,space When querying for attributes we want to be able to ask for the exact value, i.e. git ls-files :(attr:whitespace=indent,trail,space) should work, but the commas are used in the attr magic to introduce the next attr, such that this query currently fails with fatal: Invalid pathspec magic 'trail' in ':(attr:whitespace=indent,trail,space)' This change allows escaping characters by a backslash, such that the query git ls-files :(attr:whitespace=indent\,trail\,space) will match all path that have the value "indent,trail,space" for the whitespace attribute. To accomplish this, we need to modify two places. First `eat_long_magic` needs to not stop early upon seeing a comma or closing paren that is escaped. As a second step we need to remove any escaping from the attr value. Signed-off-by: Stefan Beller--- So here is the "escaping only, but escaping done right" version. (It goes on top of sb/pathspec-label) Thanks, Stefan pathspec.c | 44 + t/t6134-pathspec-with-labels.sh | 23 + 2 files changed, 63 insertions(+), 4 deletions(-) diff --git a/pathspec.c b/pathspec.c index 326863a..45bd775 100644 --- a/pathspec.c +++ b/pathspec.c @@ -89,6 +89,43 @@ static void prefix_short_magic(struct strbuf *sb, int prefixlen, strbuf_addf(sb, ",prefix:%d)", prefixlen); } +static size_t strcspn_escaped(const char *s, const char *reject) +{ + const char *i, *j; + + for (i = s; *i; i++) { + /* skip escaped the character */ + if (i[0] == '\\' && i[1]) { + i++; + continue; + } + /* see if any of the chars matches the current character */ + for (j = reject; *j; j++) + if (!*i || *i == *j) + return i - s; + } + return i - s; +} + +static char *attr_value_unescape(const char *value) +{ + char *i, *ret = xstrdup(value); + + for (i = ret; *i; i++) { + if (i[0] == '\\') { + if (!i[1]) + die(_("Escape character '\\' not allowed as " + "last character in attr value")); + + /* remove the backslash */ + memmove(i, i + 1, strlen(i)); + /* and ignore the character after that */ + i++; + } + } + return ret; +} + static void parse_pathspec_attr_match(struct pathspec_item *item, const char *value) { struct string_list_item *si; @@ -131,10 +168,9 @@ static void parse_pathspec_attr_match(struct pathspec_item *item, const char *va if (attr[attr_len] != '=') am->match_mode = MATCH_SET; else { + const char *v = [attr_len + 1]; am->match_mode = MATCH_VALUE; - am->value = xstrdup([attr_len + 1]); - if (strchr(am->value, '\\')) - die(_("attr spec values must not contain backslashes")); + am->value = attr_value_unescape(v); } break; } @@ -166,7 +202,7 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt, for (copyfrom = elt + 2; *copyfrom && *copyfrom != ')'; copyfrom = nextat) { - size_t len = strcspn(copyfrom, ",)"); + size_t len = strcspn_escaped(copyfrom, ",)"); if (copyfrom[len] == ',') nextat = copyfrom + len + 1; else diff --git a/t/t6134-pathspec-with-labels.sh b/t/t6134-pathspec-with-labels.sh index a5c9632..cbea858 100755 --- a/t/t6134-pathspec-with-labels.sh +++ b/t/t6134-pathspec-with-labels.sh @@ -163,4 +163,27 @@ test_expect_success 'abort on asking for wrong magic' ' test_must_fail git ls-files . ":(attr:!label=foo)" ' +test_expect_success 'check attribute list' ' + cat <<-EOF >>.gitattributes && + * whitespace=indent,trail,space + EOF + cat .gitattributes && + git ls-files ":(attr:whitespace=indent\,trail\,space)" >actual && + git ls-files >expect && + test_cmp expect actual +' +test_expect_success 'wrong escaping caught' ' + # Pass one backslash to git to fail with a missing closing paren + test_must_fail git ls-files ":(attr:marked-with-backslash=\\)" 2>actual && + test_i18ngrep Missing actual +' +test_expect_success 'check escaped backslash' ' + cat <<-EOF >>.gitattributes && + /sub/* marked-with-backslash=\\ + EOF + git
Re: [PATCH] pathspec: rename free_pathspec() to clear_pathspec()
On Thu, Jun 02, 2016 at 02:18:47PM -0700, Junio C Hamano wrote: > The function takes a pointer to a pathspec structure, and releases > the resources held by it, but does not free() the structure itself. > Such a function should be called "clear", not "free". Hmm, makes sense, though... > * This is just something I noticed. Among the hits in > > $ git grep free_ \*.h > >I think free_notes() is also a candidate for such renaming, but >because we are not actively working on that subsystem, we may >want to leave that dog sleeping to avoid unnecessary code churn. >The same for diff_free_filespec_data(), for which a better name >would have been diff_filespec_clear(). I think diff_filespec_clear() would not be quite right. It is freeing only the allocated _data_, but leaving the other portions intact. Generally our _clear() functions reset the object back to an initial state, from which it can be reused. I don't see that as a big problem because there is an other object for the verb "free" here: "data". We are just freeing its data, but the rest of the object remains intact and we may fill in the data again later. But I think pathspec is in similar boat; it has not been cleared back to its initial state. But it is in a much _worse_ state than the filespec, which you can continue to use. It is in a totally broken state where "nr" does not correspond to the actual number of items, the has_wildcard flag is bogus, etc. So I think it would be OK to move it to "clear", but we should probably also zero the whole thing, too. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (May 2016, #09; Tue, 31) t1308 broken
Torsten Bögershausenwrites: > It seams as ./t1308-config-set.sh is broken, > when the the directory is a soft link: > > -name=/home/tb/NoBackup/projects/git/git.pu/t/trash > directory.t1308-config-set/.gitconfig > +name=/home/tb/projects/git/git.pu/t/trash > directory.t1308-config-set/.gitconfig > scope=global It does seem that way. Somebody is affected by $PWD when we should be consistently using the physical / real path. > > key=foo.bar > not ok 28 - iteration shows correct origins > > I havent't digged further, too many conflicts in the config code, may be > somebody knows it directly ? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] pathspec: rename free_pathspec() to clear_pathspec()
The function takes a pointer to a pathspec structure, and releases the resources held by it, but does not free() the structure itself. Such a function should be called "clear", not "free". Signed-off-by: Junio C Hamano--- * This is just something I noticed. Among the hits in $ git grep free_ \*.h I think free_notes() is also a candidate for such renaming, but because we are not actively working on that subsystem, we may want to leave that dog sleeping to avoid unnecessary code churn. The same for diff_free_filespec_data(), for which a better name would have been diff_filespec_clear(). archive.c | 2 +- builtin/blame.c| 6 +++--- builtin/reset.c| 2 +- builtin/update-index.c | 2 +- combine-diff.c | 2 +- notes-merge.c | 4 ++-- pathspec.c | 2 +- pathspec.h | 4 ++-- revision.c | 2 +- tree-diff.c| 4 ++-- 10 files changed, 15 insertions(+), 15 deletions(-) diff --git a/archive.c b/archive.c index 5d735ae..42df974 100644 --- a/archive.c +++ b/archive.c @@ -322,7 +322,7 @@ static int path_exists(struct tree *tree, const char *path) pathspec.recursive = 1; ret = read_tree_recursive(tree, "", 0, 0, , reject_entry, ); - free_pathspec(); + clear_pathspec(); return ret != 0; } diff --git a/builtin/blame.c b/builtin/blame.c index 21f42b0..759d84a 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -609,7 +609,7 @@ static struct origin *find_origin(struct scoreboard *sb, } } diff_flush(_opts); - free_pathspec(_opts.pathspec); + clear_pathspec(_opts.pathspec); return porigin; } @@ -651,7 +651,7 @@ static struct origin *find_rename(struct scoreboard *sb, } } diff_flush(_opts); - free_pathspec(_opts.pathspec); + clear_pathspec(_opts.pathspec); return porigin; } @@ -1343,7 +1343,7 @@ static void find_copy_in_parent(struct scoreboard *sb, } while (unblamed); target->suspects = reverse_blame(leftover, NULL); diff_flush(_opts); - free_pathspec(_opts.pathspec); + clear_pathspec(_opts.pathspec); } /* diff --git a/builtin/reset.c b/builtin/reset.c index 092c3a5..acd6278 100644 --- a/builtin/reset.c +++ b/builtin/reset.c @@ -158,7 +158,7 @@ static int read_from_tree(const struct pathspec *pathspec, return 1; diffcore_std(); diff_flush(); - free_pathspec(); + clear_pathspec(); return 0; } diff --git a/builtin/update-index.c b/builtin/update-index.c index b8b8522..6cdfd5f 100644 --- a/builtin/update-index.c +++ b/builtin/update-index.c @@ -759,7 +759,7 @@ static int do_reupdate(int ac, const char **av, if (save_nr != active_nr) goto redo; } - free_pathspec(); + clear_pathspec(); return 0; } diff --git a/combine-diff.c b/combine-diff.c index 8f2313d..5920df8 100644 --- a/combine-diff.c +++ b/combine-diff.c @@ -1525,7 +1525,7 @@ void diff_tree_combined(const unsigned char *sha1, free(tmp); } - free_pathspec(); + clear_pathspec(); } void diff_tree_combined_merge(const struct commit *commit, int dense, diff --git a/notes-merge.c b/notes-merge.c index 34bfac0..b7814c9 100644 --- a/notes-merge.c +++ b/notes-merge.c @@ -170,7 +170,7 @@ static struct notes_merge_pair *diff_tree_remote(struct notes_merge_options *o, sha1_to_hex(mp->remote)); } diff_flush(); - free_pathspec(); + clear_pathspec(); *num_changes = len; return changes; @@ -256,7 +256,7 @@ static void diff_tree_local(struct notes_merge_options *o, sha1_to_hex(mp->local)); } diff_flush(); - free_pathspec(); + clear_pathspec(); } static void check_notes_merge_worktree(struct notes_merge_options *o) diff --git a/pathspec.c b/pathspec.c index c9e9b6c..24e0dd5 100644 --- a/pathspec.c +++ b/pathspec.c @@ -489,7 +489,7 @@ void copy_pathspec(struct pathspec *dst, const struct pathspec *src) sizeof(struct pathspec_item) * dst->nr); } -void free_pathspec(struct pathspec *pathspec) +void clear_pathspec(struct pathspec *pathspec) { free(pathspec->items); pathspec->items = NULL; diff --git a/pathspec.h b/pathspec.h index 0c11262..4a80f6f 100644 --- a/pathspec.h +++ b/pathspec.h @@ -19,7 +19,7 @@ #define PATHSPEC_ONESTAR 1 /* the pathspec pattern satisfies GFNM_ONESTAR */ struct pathspec { - const char **_raw; /* get_pathspec() result, not freed by free_pathspec() */ + const char **_raw; /* get_pathspec() result, not freed by clear_pathspec() */ int nr; unsigned int has_wildcard:1; unsigned int recursive:1; @@ -74,7 +74,7 @@ extern void parse_pathspec(struct
Re: [PATCH/RFC 0/4] Add option to enable filters in git-svn
Matteo Bertiniwrote: > Il 2016-05-31 20:12 Eric Wong ha scritto: > >Matteo Bertini wrote: > >>Sorry to all, but I missed a Checksum mismatch error, I'll have a > >>look and submit an update. > > I've had a look, the problem is a missing smudge filter. > Unluckily the small svn revision range i was using was never updating > a filtered file. > > The code actually uses `cat-file --batch` to get the blobs, > but the stored blob is not what SVN::TxDelta::apply need. > What I need is the smudged blob, but cat-file does not supports it. > > I can modify cat-file with a new option to call > `convert_to_working_tree`, > and have the filters applied, for example --use-filters, that expects an > extra field, like $sha\t$path, in the stdin stream. > > I don't like a lot putting an high level feature in a low level command, > but --textconv seems very similar to this. > > Opinions or other suggestions? In the absense of other opinions or suggestions, I suggest you try it and see what others think :) I'm not that experienced at the C code of this project, but I can try to follow along as best as I can. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Mark remote `gc --auto` error messages
On Thu, Jun 2, 2016 at 1:06 PM, Lukas Fleischerwrote: > On Thu, 02 Jun 2016 at 21:33:33, Junio C Hamano wrote: >> Lukas Fleischer writes: >> >> > When running `git push`, it might occur that error messages are >> > transferred from the server to the client. While most messages (those >> > explicitly sent on sideband 2) are prefixed with "remote:", it seems >> > that error messages printed during the automatic householding performed >> > by git-gc(1) are displayed without any additional decoration. Thus, such >> > messages can easily be misinterpreted as git-gc failing locally, see [1] >> > for an actual example of where that happened. >> >> Sounds like a sensible goal to me. > > What exactly are you referring to (you only quoted the introduction)? > Do you think we should fix the git-gc issue but keep the general > behavior of printing messages unaltered? Do you think it would be > worthwhile to make server messages distinguishable in general? The latter, which I think was what your implementation was attempting to do if I read it correctly. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Mark remote `gc --auto` error messages
On Thu, 02 Jun 2016 at 21:33:33, Junio C Hamano wrote: > Lukas Fleischerwrites: > > > When running `git push`, it might occur that error messages are > > transferred from the server to the client. While most messages (those > > explicitly sent on sideband 2) are prefixed with "remote:", it seems > > that error messages printed during the automatic householding performed > > by git-gc(1) are displayed without any additional decoration. Thus, such > > messages can easily be misinterpreted as git-gc failing locally, see [1] > > for an actual example of where that happened. > > Sounds like a sensible goal to me. What exactly are you referring to (you only quoted the introduction)? Do you think we should fix the git-gc issue but keep the general behavior of printing messages unaltered? Do you think it would be worthwhile to make server messages distinguishable in general? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [WIP-PATCH 1/2] send-email: create email parser subroutine
Samuel GROOTwrote: > On 05/29/2016 01:33 AM, Eric Wong wrote: > >Matthieu Moy wrote: > >>Samuel GROOT writes: > >> > >>>Parsing and processing in send-email is done in the same loop. > >>> > >>>To make the code more maintainable, we create two subroutines: > >>>- `parse_email` to separate header and body > >>>- `parse_header` to retrieve data from header > >> > >>These routines are not specific to git send-email, nor to Git. > >> > >>Does it make sense to use an external library, like > >>http://search.cpan.org/~rjbs/Email-Simple-2.210/lib/Email/Simple.pm , > >>either by depending on it, or by copying it in Git's source tree ? > > > >That might be overkill and increase installation/maintenance > >burden. Bundling it would probably be problematic to distros, > >too. > > We have 5 solutions here: > > 1. Make a new dependence to Email::Simple. > > 2. Bundle Email::Simple in Git's source tree. > > 3. Use Email::Simple if installed, else use our library. > > 4. Making our own email parser library. > > 5. Duplicate parser loop as we did for our patch to implement > `--quote-email` as proposed in $gmane/295772 . > > Obviously, option (5) is the easiest one for us, but it leaves refactoring > for later, and option (1) is also easier but adds a new dependence which is > not that good. I would go with (5) for now and leave (4) for later (which might just be moving the function to a new file). > Since our project ends next week, we might not have enough time to finish > developing a custom parser API so (4) is not a viable option for now but > could be done in the future. > > We could consider bundling Email::Simple as the best option, as it's > developed since 2003 and might be safer to use than anything we could write > in several weeks. In an ideal world, (1) would be nice. But (IMHO) git-send-email should remain installable on non-ideal systems which do not provide Email::Simple as a package. (2) would probably be non-ideal for distro maintainers (+Cc: Jonathan for opinions), and (3) is the most complex and difficult-to-support. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
On 02/06/16 20:46, Junio C Hamano wrote: > Ramsay Joneswrites: > >> I think Junio wants to go with just " quoting (see other thread). > > No. I meant just \ quoting. Yes, sorry, I only just read your last email on the other thread. ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
On 02/06/16 20:29, Junio C Hamano wrote: > Junio C Hamanowrites: > >> On Thu, Jun 2, 2016 at 11:42 AM, Ramsay Jones >> wrote: That would be workable, I would think. Before attr:VAR=VAL extention, supported pathspec were only single lowercase-ascii alphabet tokens, so nobody would have used " as a part of magic. So quting with double-quote pair would work. >>> >>> I was thinking about both ' and ", so that you could do: >> >> Yes, I understood your suggestion as such. Quoting like shells would work >> without breaking backward compatibility for the same reason quoting with >> double-quote and backslash only without supporting single-quotes would >> work. > > Having said that, "It would work" does not have to mean "Hence we > must do it that way" at all. Quoting character pairs make the > parsing and unquoting significantly more complex. > > As you said, not many people used attributes and pathspec magic, and > I do not think those who want to use the new "further limits with > attributes" magic, envisioned primarily to be those who want to give > classes to submodules, have compelling reason to name their classes > with anything but lowercase-ascii-alphabet tokens. So for a practical > purposes, I'd rather see Stefan > > * just implement backquote-blindly-passes-the-next-byte and nothing >more elaborate; and > > * forbid [^-a-z0-9,_] from being used in the value part in the >attr:VAR=VAL magic. > > at least for now, and concentrate more on the other more important > parts of the submodule enhancement topic. OK, that reasonable. I didn't mean to derail Stefan's development! ATB, Ramsay Jones > > That way, those who will start using attr:VAR=VAL magic will stick > themselves to lowercase-ascii-alphabet tokens for now (simply > because an attribut that has the forbidden characters in its value > would not be usable with the magic), and we can later extend the > magic syntax parser in a backward compatible way to allow paired > quotes and other "more convenient" syntax. > > > [Footnote] > > *1* The reason I prefer to keep the initially allowed value > characters narrow is because I envision that something like > > :(attr:VAR=()) > > may become necessary, and do not want to promise users that open or > close parentheses will forever be taken literally. > > -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
Ramsay Joneswrites: > I think Junio wants to go with just " quoting (see other thread). No. I meant just \ quoting. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
On 02/06/16 20:04, Stefan Beller wrote: > On Thu, Jun 2, 2016 at 11:42 AM, Ramsay Jones >wrote: >> >> >> On 02/06/16 17:10, Junio C Hamano wrote: >>> Ramsay Jones writes: >>> So, at risk of annoying you, let me continue in my ignorance a little longer and ask: even if you have to protect all of this 'magic' from the shell with '/" quoting, could you not use (nested) quotes to protect the part of an ? For example: git ls-files ':(attr:whitespace="indent,trail,space",icase)' >>> >>> That would be workable, I would think. Before attr:VAR=VAL >>> extention, supported pathspec were only single lowercase-ascii >>> alphabet tokens, so nobody would have used " as a part of magic. So >>> quting with double-quote pair would work. >> >> I was thinking about both ' and ", so that you could do: >> >>$ ./args ':(attr:whitespace="indent,trail,space",icase)' >> 1::(attr:whitespace="indent,trail,space",icase) >> >>$ ./args ":(attr:whitespace='indent,trail,space',icase)" >> 1::(attr:whitespace='indent,trail,space',icase) >> >>$ p=':(attr:whitespace="indent,trail,space",icase)' >>$ ./args "$p" >> 1::(attr:whitespace="indent,trail,space",icase) >> >>$ p=":(attr:whitespace=\"indent,trail,space\",icase)" >>$ ./args "$p" >> 1::(attr:whitespace="indent,trail,space",icase) >> >> but limiting it to " would probably be OK too. >> >>> You'd need to come up with a way to quote a double quote that >>> happens to be a part of VAL somehow, though. >> >> Yes I was assuming \ quoting as well - I just want to reduce the >> need for such quoting (especially on windows). >> >>> I think attribute >>> value is limited to a string with non-whitespace letters; even >>> though the built-in attributes that have defined meaning to the Git >>> itself may not use values with letters beyond [-a-zA-Z0-9,], end >>> users and projects can add arbitrary values within the allowed >>> syntax, so it is not unconceivable that some project may have a >>> custom attribute that lists forbidden characters in a path with >>> >>> === .gitattributes === >>> *.txt forbidden=`" > > We restrict the 'forbidden' to follow [-a-zA-Z0-9,], so we could enforce > it for the values, too. > > >> >>$ ./args ":(attr:*.txt forbidden=\'\\\",icase)" >> 1::(attr:*.txt forbidden=\'\",icase) > > You should lose the *.txt in there, but put it at the back Ah, yes, just shows my ignorance of the attribute system! > >> $ ./args ":(attr:forbidden=\'\\\",icase)*.txt" > >> >>$ ./args ':(attr:*.txt forbidden=\'\''\",icase)' >> 1::(attr:*.txt forbidden=\'\",icase) > > I see, so quoting by " or ' is preferred. What if the user > wants to do a I think Junio wants to go with just " quoting (see other thread). > forbidden='," > > so we have to escape those in there, such as > > ./args ':(attr:"forbidden=\',\"")' No, that won't work (" is not terminated), try this: $ ./args ':(attr:"forbidden='\'',\"")' 1::(attr:"forbidden=',\"") $ $ ./args ":(attr:\"forbidden=',\\\"\")" 1::(attr:"forbidden=',\"") $ [half of the problem is just getting past the shell] ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Mark remote `gc --auto` error messages
Lukas Fleischerwrites: > When running `git push`, it might occur that error messages are > transferred from the server to the client. While most messages (those > explicitly sent on sideband 2) are prefixed with "remote:", it seems > that error messages printed during the automatic householding performed > by git-gc(1) are displayed without any additional decoration. Thus, such > messages can easily be misinterpreted as git-gc failing locally, see [1] > for an actual example of where that happened. Sounds like a sensible goal to me. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
Junio C Hamanowrites: > On Thu, Jun 2, 2016 at 11:42 AM, Ramsay Jones > wrote: >>> >>> That would be workable, I would think. Before attr:VAR=VAL >>> extention, supported pathspec were only single lowercase-ascii >>> alphabet tokens, so nobody would have used " as a part of magic. So >>> quting with double-quote pair would work. >> >> I was thinking about both ' and ", so that you could do: > > Yes, I understood your suggestion as such. Quoting like shells would work > without breaking backward compatibility for the same reason quoting with > double-quote and backslash only without supporting single-quotes would > work. Having said that, "It would work" does not have to mean "Hence we must do it that way" at all. Quoting character pairs make the parsing and unquoting significantly more complex. As you said, not many people used attributes and pathspec magic, and I do not think those who want to use the new "further limits with attributes" magic, envisioned primarily to be those who want to give classes to submodules, have compelling reason to name their classes with anything but lowercase-ascii-alphabet tokens. So for a practical purposes, I'd rather see Stefan * just implement backquote-blindly-passes-the-next-byte and nothing more elaborate; and * forbid [^-a-z0-9,_] from being used in the value part in the attr:VAR=VAL magic. at least for now, and concentrate more on the other more important parts of the submodule enhancement topic. That way, those who will start using attr:VAR=VAL magic will stick themselves to lowercase-ascii-alphabet tokens for now (simply because an attribut that has the forbidden characters in its value would not be usable with the magic), and we can later extend the magic syntax parser in a backward compatible way to allow paired quotes and other "more convenient" syntax. [Footnote] *1* The reason I prefer to keep the initially allowed value characters narrow is because I envision that something like :(attr:VAR=()) may become necessary, and do not want to promise users that open or close parentheses will forever be taken literally. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Mark remote `gc --auto` error messages
When running `git push`, it might occur that error messages are transferred from the server to the client. While most messages (those explicitly sent on sideband 2) are prefixed with "remote:", it seems that error messages printed during the automatic householding performed by git-gc(1) are displayed without any additional decoration. Thus, such messages can easily be misinterpreted as git-gc failing locally, see [1] for an actual example of where that happened. Do we want anything like the following patch (completely untested)? -- 8< -- diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index a744437..15c323a 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -1775,9 +1775,20 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix) const char *argv_gc_auto[] = { "gc", "--auto", "--quiet", NULL, }; - int opt = RUN_GIT_CMD | RUN_COMMAND_STDOUT_TO_STDERR; + struct child_process proc = CHILD_PROCESS_INIT; + + proc.no_stdin = 1; + proc.stdout_to_stderr = 1; + proc.err = use_sideband ? -1 : 0; + proc.git_cmd = 1; + proc.argv = argv_gc_auto; + close_all_packs(); - run_command_v_opt(argv_gc_auto, opt); + if (!start_command()) { + if (use_sideband) + copy_to_sideband(proc.err, -1, NULL); + finish_command(); + } } if (auto_update_server_info) update_server_info(0); -- 8< -- More generally, do we care about making *all* "remote" strings easily distinguishable from "local" strings? Even though it is unlikely to use this for an actual attack, it seems that a malicious server can currently trick a user into performing an action by printing a message that looks like something coming from "local" Git. Prefixing every server message by "remote:" might look a bit ugly but maybe we can simply use a different color instead and fall back to the prefix on terminals without color support. Opinions? [1] https://lists.archlinux.org/pipermail/aur-general/2016-June/032340.html -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
On Thu, Jun 2, 2016 at 11:42 AM, Ramsay Joneswrote: > > > On 02/06/16 17:10, Junio C Hamano wrote: >> Ramsay Jones writes: >> >>> So, at risk of annoying you, let me continue in my ignorance a little >>> longer and ask: even if you have to protect all of this 'magic' from >>> the shell with '/" quoting, could you not use (nested) quotes to >>> protect the part of an ? For example: >>> >>> git ls-files ':(attr:whitespace="indent,trail,space",icase)' >> >> That would be workable, I would think. Before attr:VAR=VAL >> extention, supported pathspec were only single lowercase-ascii >> alphabet tokens, so nobody would have used " as a part of magic. So >> quting with double-quote pair would work. > > I was thinking about both ' and ", so that you could do: > >$ ./args ':(attr:whitespace="indent,trail,space",icase)' > 1::(attr:whitespace="indent,trail,space",icase) > >$ ./args ":(attr:whitespace='indent,trail,space',icase)" > 1::(attr:whitespace='indent,trail,space',icase) > >$ p=':(attr:whitespace="indent,trail,space",icase)' >$ ./args "$p" > 1::(attr:whitespace="indent,trail,space",icase) > >$ p=":(attr:whitespace=\"indent,trail,space\",icase)" >$ ./args "$p" > 1::(attr:whitespace="indent,trail,space",icase) > > but limiting it to " would probably be OK too. > >> You'd need to come up with a way to quote a double quote that >> happens to be a part of VAL somehow, though. > > Yes I was assuming \ quoting as well - I just want to reduce the > need for such quoting (especially on windows). > >> I think attribute >> value is limited to a string with non-whitespace letters; even >> though the built-in attributes that have defined meaning to the Git >> itself may not use values with letters beyond [-a-zA-Z0-9,], end >> users and projects can add arbitrary values within the allowed >> syntax, so it is not unconceivable that some project may have a >> custom attribute that lists forbidden characters in a path with >> >> === .gitattributes === >> *.txt forbidden=`" We restrict the 'forbidden' to follow [-a-zA-Z0-9,], so we could enforce it for the values, too. > >$ ./args ":(attr:*.txt forbidden=\'\\\",icase)" > 1::(attr:*.txt forbidden=\'\",icase) You should lose the *.txt in there, but put it at the back > $ ./args ":(attr:forbidden=\'\\\",icase)*.txt" > >$ ./args ':(attr:*.txt forbidden=\'\''\",icase)' > 1::(attr:*.txt forbidden=\'\",icase) I see, so quoting by " or ' is preferred. What if the user wants to do a forbidden='," so we have to escape those in there, such as ./args ':(attr:"forbidden=\',\"")' We need to escape both ' and ", one for the outer shell and one for Git parsing. Does that seem ok? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
On Thu, Jun 2, 2016 at 11:42 AM, Ramsay Joneswrote: >> >> That would be workable, I would think. Before attr:VAR=VAL >> extention, supported pathspec were only single lowercase-ascii >> alphabet tokens, so nobody would have used " as a part of magic. So >> quting with double-quote pair would work. > > I was thinking about both ' and ", so that you could do: Yes, I understood your suggestion as such. Quoting like shells would work without breaking backward compatibility for the same reason quoting with double-quote and backslash only without supporting single-quotes would work. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
On 02/06/16 17:10, Junio C Hamano wrote: > Ramsay Joneswrites: > >> So, at risk of annoying you, let me continue in my ignorance a little >> longer and ask: even if you have to protect all of this 'magic' from >> the shell with '/" quoting, could you not use (nested) quotes to >> protect the part of an ? For example: >> >> git ls-files ':(attr:whitespace="indent,trail,space",icase)' > > That would be workable, I would think. Before attr:VAR=VAL > extention, supported pathspec were only single lowercase-ascii > alphabet tokens, so nobody would have used " as a part of magic. So > quting with double-quote pair would work. I was thinking about both ' and ", so that you could do: $ ./args ':(attr:whitespace="indent,trail,space",icase)' 1::(attr:whitespace="indent,trail,space",icase) $ ./args ":(attr:whitespace='indent,trail,space',icase)" 1::(attr:whitespace='indent,trail,space',icase) $ p=':(attr:whitespace="indent,trail,space",icase)' $ ./args "$p" 1::(attr:whitespace="indent,trail,space",icase) $ p=":(attr:whitespace=\"indent,trail,space\",icase)" $ ./args "$p" 1::(attr:whitespace="indent,trail,space",icase) but limiting it to " would probably be OK too. > You'd need to come up with a way to quote a double quote that > happens to be a part of VAL somehow, though. Yes I was assuming \ quoting as well - I just want to reduce the need for such quoting (especially on windows). > I think attribute > value is limited to a string with non-whitespace letters; even > though the built-in attributes that have defined meaning to the Git > itself may not use values with letters beyond [-a-zA-Z0-9,], end > users and projects can add arbitrary values within the allowed > syntax, so it is not unconceivable that some project may have a > custom attribute that lists forbidden characters in a path with > > === .gitattributes === > *.txt forbidden=`" > > that tells their documentation cannot have these letters in it, or > something like that. Heh, yeah, that gets ugly: $ ./args ":(attr:*.txt forbidden=\'\\\",icase)" 1::(attr:*.txt forbidden=\'\",icase) $ ./args ':(attr:*.txt forbidden=\'\''\",icase)' 1::(attr:*.txt forbidden=\'\",icase) [Note the initial ' 1:' above is output from args] ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 2/2] completion: add git status
Thomas Braunwrites: > + untracked_state="$(__git_find_on_cmdline "--untracked-files=no\ > + --untracked-files=normal --untracked-files=all")" Just wondering but does this help my use of the command like $ git status -uno or do I now have to spell it out like $ git status --untracked-files=no to take advantage of it? > + untracked_state=${untracked_state##--untracked-files=} > + > + if [ -z "$untracked_state" ]; then > + untracked_state="$(git --git-dir="$(__gitdir)" config > "status.showUntrackedFiles")" > + fi > + > + case "$untracked_state" in > + no) > + # --ignored option does not matter Style. I see existing case/esac statements that use this style, but our preference is not to indent case arms like this; rather: case "$untracked_state" in no) # --ignored ... which saves the indentation one level overall. > + complete_opt= > + ;; > + all|normal|*) > + complete_opt="--cached --directory --no-empty-directory > --others" > + > + if [ -n "$(__git_find_on_cmdline "--ignored")" ]; then Same question as the "--untracked-files=no vs -uno" applies here. > + complete_opt="$complete_opt --ignored > --exclude=*" > + fi > + ;; > + esac > + > + __git_complete_index_file "$complete_opt" > +} > + > __git_config_get_set_variables () > { > local prevword word config_file= c=$cword -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] Fix prune/gc problem with multiple worktrees
Michael Haggertywrites: > Fixing reachability via the index and detached HEADs feels relatively > important. > ... I agree with the order of importance above. But "relatively" is a very good keyword. Just like bisection refs, what is in the index and the commit detached HEAD points at are expected to be tentative. As a part of still-experimental feature, I'd rather see our bandwidth spent on fixing it the right way first time, instead of piling on an unproven quick-fix as a band aid, having to rip it off and fixing it properly later. > It's hard for me to predict when the ref-iterator stuff will be merged. > It is a big change, but so far the feedback seems pretty good. I can > tell you that pushing it and ref-stores forward is high on my priority list. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 28/39] i18n: config: unfold error messages marked for translation
Vasco Almeidawrites: > Às 17:43 de 01-06-2016, Junio C Hamano escreveu: >> Vasco Almeida writes: >> >>> Introduced in 473166b ("config: add 'origin_type' to config_source >>> struct", 2016-02-19), Git can inform the user about the origin of a >>> config error, but the implementation does not allow translators to >>> translate the keywords 'file', 'blob, 'standard input', and >>> 'submodule-blob'. Moreover, for the second message, a reason for the >>> error is appended to the message, not allowing translators to translate >>> that reason either. >> >> Good intentions. >> >>> @@ -417,6 +417,7 @@ static int git_parse_source(config_fn_t fn, void *data) >>> int comment = 0; >>> int baselen = 0; >>> struct strbuf *var = >var; >>> + char error_msg[128]; >>> >>> /* U+FEFF Byte Order Mark in UTF8 */ >>> const char *bomptr = utf8_bom; >>> @@ -471,10 +472,38 @@ static int git_parse_source(config_fn_t fn, void >>> *data) >>> if (get_value(fn, data, var) < 0) >>> break; >>> } >>> + >>> + switch (cf->origin_type) { >>> + case CFG_BLOB: >>> + xsnprintf(error_msg, sizeof(error_msg), >>> + _("bad config line %d in blob %s"), >>> + cf->linenr, cf->name); >> >> Use xstrfmt() intead, perhaps? That would be cleaner. >> > Wouldn't that create a memory leak? Yes, I didn't mean to suggest "use xstrfmt() instead of xsnprintf() without changing anything else". It was merely to suggest that you do not have to have 128-byte limit if you used xstrfmt(); having to free the result was too obvious that I left it unsaid, and as expected, you noticed the need to do so, which is good ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Segfault in the attr stack
On Thu, Jun 2, 2016 at 8:59 AM, Junio C Hamanowrote: > Junio C Hamano writes: > Gaah, of course. This is coming from the cache preload codepath, where multiple threads try to run ce_path_match(). It used to be OK because pathspec magic never looked at attributes, but now it does, and attribute system is not thread-safe. > > I'll look into teaching a threadble interface to the attribute > subsystem, but for now, this should get you unstuck, I think. Thanks for looking into this! (I was about to do that if you would not have stepped up, as the bug only surfaces when using the pathspec labels.) > > + /* Do not preload when pathspec uses non-threadable subsystems */ > + if (pathspec && pathspec_is_non_threadable(pathspec)) > + return; /* for now ... */ This fixes the problem for now; I'll look into the comma escaping then. Thanks, Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 2/6] worktree.c: find_worktree() learns to identify worktrees by basename
Duy Nguyenwrites: > On Thu, Jun 2, 2016 at 1:44 AM, Junio C Hamano wrote: >>> We would >>> need to convert or match both '/' and '\' in "to/foo" case because of >>> Windows, so it's not much easier than basename(). >> >> I never said "easier to implement". But can this codepath get >> backslashed paths in the first place? I somehow thought that >> normalization would happen a lot before the control reaches here. >> >> You'll be calling into fspathcmp() anyway; shouldn't the function >> know that '/' and '\' are equivalent on some platforms, or is it >> legal to only call fspathcmp() on a single path component without >> directory separator? > > We still need to calculate the length to compare, which could be > problematic when utf-8 is involved, or some other encoding. Hmph. I was unaware that fspathcmp() used here does more than byte-for-byte processing, which would cause problems due to encoding issues when you hand code the comparison. +static struct worktree *find_worktree_by_basename(struct worktree **list, + const char *base_name) +{ + struct worktree *found = NULL; + int nr_found = 0; + + for (; *list && nr_found < 2; list++) { + char *path = xstrdup((*list)->path); + if (!fspathcmp(base_name, basename(path))) { + found = *list; + nr_found++; + } + free(path); + } + return nr_found == 1 ? found : NULL; +} > If we always split at '/' boundary though (e.g. "abc/def/ghi", > "def/ghi" or "ghi" but never "ef/ghi") then it should be ok. Does "basename()" used here know '/' and '\' can both be a directory separator, or does worktree->path have a normalized representation of the path, i.e. '/' is the only directory separator? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 09/13] refs: introduce an iterator interface
On 06/02/2016 12:08 PM, Duy Nguyen wrote: > On Mon, May 30, 2016 at 2:55 PM, Michael Haggerty> wrote: >> Currently, the API for iterating over references is via a family of >> for_each_ref()-type functions that invoke a callback function for each >> selected reference. All of these eventually call do_for_each_ref(), >> which knows how to do one thing: iterate in parallel through two >> ref_caches, one for loose and one for packed refs, giving loose >> references precedence over packed refs. This is rather complicated code, >> and is quite specialized to the files backend. It also requires callers >> to encapsulate their work into a callback function, which often means >> that they have to define and use a "cb_data" struct to manage their >> context. >> >> The current design is already bursting at the seams, and will become >> even more awkward in the upcoming world of multiple reference storage >> backends: >> >> * Per-worktree vs. shared references are currently handled via a kludge >> in git_path() rather than iterating over each part of the reference >> namespace separately and merging the results. This kludge will cease >> to work when we have multiple reference storage backends. > > Question from a refs user. Right now worktree.c:get_worktrees() peeks > directly to "$GIT_DIR/worktrees/xxx/HEAD" and parses the content > itself, something that I promised to fix but never got around to do > it. Will we have an iterator to go through all worktrees' HEAD, or > will there be an API to say "resolve ref HEAD from worktree XYZ"? My preference is that there is a way to say "create a ref_store object representing the loose references stored physically under "$GIT_DIR/worktrees/xxx". Then that ref_store could be asked to read its `HEAD` (or iterate over all of the refs under that path or whatever). You could even write `HEAD` through the same ref_store. Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] fetch: better alignment in ref summary
Duy Nguyenwrites: > On Mon, May 23, 2016 at 7:58 AM, Junio C Hamano wrote: >> That is, I wonder if the above can become something like: >> >>> From github.com:pclouds/git >>> * [new branch] { -> pclouds/}2nd-index >>> * [new branch] { -> pclouds/}3nd-index >>> * [new branch] { -> pclouds/}file-watcher >>> ... > > for context of the following quote... > > On Thu, May 26, 2016 at 12:18 PM, Jeff King wrote: >> I do agree with Junio that we could probably improve the output quite a >> bit by not showing each refname twice in the common case. I don't quite >> find the "{ -> origin/}whatever" particularly pretty, but something like >> that which keeps the variable bits at the end would make this problem >> just go away. > > I tried it out. It's a bit hard to read at the "/}" boundary. Color > highlight might help. But it occurs to me, could we extend refspec a > bit to allow "foo/bar:base/..." be be equivalent of > "foo/bar:base/foo/bar". Then fetch output could become > >>> * [new branch] 2nd-index:pclouds/... >>> * [new branch] 3nd-index:pclouds/... >>> * [new branch] file-watcher:pclouds/... > > It might be a tiny bit better, and a forced update could be displayed > with a prefix '+'. Hmm? I do not find that "..." particularly more readable, but that probably is very subjective. It is much less copy, when compared to pasting "pclouds/}2nd-index" and then removing "}". -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] pathspec: allow escaped query values
Ramsay Joneswrites: > So, at risk of annoying you, let me continue in my ignorance a little > longer and ask: even if you have to protect all of this 'magic' from > the shell with '/" quoting, could you not use (nested) quotes to > protect the part of an ? For example: > > git ls-files ':(attr:whitespace="indent,trail,space",icase)' That would be workable, I would think. Before attr:VAR=VAL extention, supported pathspec were only single lowercase-ascii alphabet tokens, so nobody would have used " as a part of magic. So quting with double-quote pair would work. You'd need to come up with a way to quote a double quote that happens to be a part of VAL somehow, though. I think attribute value is limited to a string with non-whitespace letters; even though the built-in attributes that have defined meaning to the Git itself may not use values with letters beyond [-a-zA-Z0-9,], end users and projects can add arbitrary values within the allowed syntax, so it is not unconceivable that some project may have a custom attribute that lists forbidden characters in a path with === .gitattributes === *.txt forbidden=`" that tells their documentation cannot have these letters in it, or something like that. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Segfault in the attr stack
Junio C Hamanowrites: >>> Gaah, of course. >>> >>> This is coming from the cache preload codepath, where multiple threads >>> try to run ce_path_match(). >>> It used to be OK because pathspec magic never looked at attributes, >>> but now it does, and attribute system is not thread-safe. I'll look into teaching a threadble interface to the attribute subsystem, but for now, this should get you unstuck, I think. One of the ways preload codepath assures that multiple threads do not stomp on the shared datastructure is by copying things that are involved in the operation, and there is copy_pathspec() call. I think the "pathspec magic that limits with the attributes" topic, which adds to the pathspec structure, needs to update the function so that its shared data structure is properly copied. Before the series, I think the pathspec structure only has either pointers to borrowed strings (e.g. raw) or scalars, and it was sufficient to do a shallow copy with memcpy(). With the change, that is no longer the case. Which would mean we'd need to make sure that any codepath that calls copy_pathspec() needs to think about how to clean it up. Before the change, preload-index codepath didn't have to, but with the change, it will. We'd need pathspec_clear() or something like that. pathspec.c | 12 pathspec.h | 2 ++ preload-index.c | 4 3 files changed, 18 insertions(+) diff --git a/pathspec.c b/pathspec.c index 0a02255..326863a 100644 --- a/pathspec.c +++ b/pathspec.c @@ -208,6 +208,18 @@ static void eat_long_magic(struct pathspec_item *item, const char *elt, *copyfrom_ = copyfrom; } +int pathspec_is_non_threadable(const struct pathspec *pathspec) +{ + int i; + + for (i = 0; i < pathspec->nr; i++) { + const struct pathspec_item *item = >items[i]; + if (item->attr_check) + return 1; + } + return 0; +} + /* * Take an element of a pathspec and check for magic signatures. * Append the result to the prefix. Return the magic bitmap. diff --git a/pathspec.h b/pathspec.h index 5308137..07d21f0 100644 --- a/pathspec.h +++ b/pathspec.h @@ -115,4 +115,6 @@ extern void add_pathspec_matches_against_index(const struct pathspec *pathspec, extern const char *check_path_for_gitlink(const char *path); extern void die_if_path_beyond_symlink(const char *path, const char *prefix); +extern int pathspec_is_non_threadable(const struct pathspec *); + #endif /* PATHSPEC_H */ diff --git a/preload-index.c b/preload-index.c index c1fe3a3..af46878 100644 --- a/preload-index.c +++ b/preload-index.c @@ -76,6 +76,10 @@ static void preload_index(struct index_state *index, if (!core_preload_index) return; + /* Do not preload when pathspec uses non-threadable subsystems */ + if (pathspec && pathspec_is_non_threadable(pathspec)) + return; /* for now ... */ + threads = index->cache_nr / THREAD_COST; if (threads < 2) return; -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] push: deny policy to prevent pushes to unwanted remotes.
> On 30 May 2016, at 06:45, Antoine Queru >wrote: > > Currently, a user wanting to prevent accidental pushes to the wrong remote > has to create a pre-push hook. > The feature offers a configuration to allow users to prevent accidental pushes > to the wrong remote. The user may define a list of whitelisted remotes, a list > of blacklisted remotes and a default policy ("allow" or "deny"). A push > is denied if the remote is explicitely blacklisted or if it isn't > whitelisted and the default policy is "deny". > > This feature is intended as a safety net more than a real security, the user > will always be able to modify the config if he wants to. It is here for him to > consciously restrict his push possibilities. For example, it may be useful > for an unexperimented user fearing to push to the wrong remote, or for > companies wanting to avoid unintentionnal leaking of private code on public > repositories. Thanks for working on this feature. Unfortunately I won't be able to test and review it before June 14. I am traveling without laptop and only very sporadic internet access :) One thing that I noticed already: I think a custom warning/error message for rejected pushes would be important because, as you wrote above, this feature does not provide real security. That means if a push is rejected for someone in an organization then the user needs to understand what is going on. E.g. in my organization I would point the user to the open source contribution guidelines etc. Thanks, Lars > > By default the whitelist/blacklist feature is disabled since the default > policy > is "allow". > > Signed-off-by: Antoine Queru > Signed-off-by: Francois Beutin > Signed-off-by: Matthieu Moy > --- > This is the first implementation of the feature proposed in SoCG 2016. > The conversation about it can be found here: > http://thread.gmane.org/gmane.comp.version-control.git/295166 > and http://thread.gmane.org/gmane.comp.version-control.git/289749. > We have included in cc all the participants in the previous conversation. > Documentation/config.txt| 17 +++ > Documentation/git-push.txt | 32 + > builtin/push.c | 75 --- > t/t5544-push-whitelist-blacklist.sh | 90 + > 4 files changed, 209 insertions(+), 5 deletions(-) > create mode 100755 t/t5544-push-whitelist-blacklist.sh > > diff --git a/Documentation/config.txt b/Documentation/config.txt > index 53f00db..1478ce3 100644 > --- a/Documentation/config.txt > +++ b/Documentation/config.txt > @@ -2517,6 +2517,23 @@ remote.pushDefault:: >`branch..remote` for all branches, and is overridden by >`branch..pushRemote` for specific branches. > > +remote.pushBlacklisted:: > +The list of remotes the user is forbidden to push to. > +See linkgit:git-push[1] > + > +remote.pushWhitelisted:: > +The list of remotes the user is allowed to push to. > +See linkgit:git-push[1] > + > +remote.pushDefaultPolicy:: > +Can be set to either 'allow' or 'deny', defines the policy to adopt > +when the user tries to push to a remote not in the whitelist or > +the blacklist. Defaults to 'allow'. > + > +remote.pushDenyMessage:: > +The message printed when pushing to a forbidden remote. > +Defaults to "Pushing to this remote has been forbidden". > + > remote..url:: >The URL of a remote repository. See linkgit:git-fetch[1] or >linkgit:git-push[1]. > diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt > index cf6ee4a..644bfde 100644 > --- a/Documentation/git-push.txt > +++ b/Documentation/git-push.txt > @@ -368,6 +368,38 @@ reason:: >refs, no explanation is needed. For a failed ref, the reason for >failure is described. > > + > +Note about denying pushes to the wrong remotes > +-- > + > +If you fear accidental pushes to the wrong remotes, you can use the > +blacklist/whitelist feature. The goal is to catch pushes to unwanted > +remotes before they happen. > + > +The simplest way to forbid pushes to a remote is to add its url in the > +config file under the 'remote.pushBlacklisted' variable. > +A more restrictive way is to change 'remote.pushDefaultPolicy' to 'deny', > +thus denying pushes to every remote not whitelisted. You can then add > +the right remotes under 'remote.pushWhitelisted'. > + > +You can also configure more advanced acceptation/denial behavior following > +this rule: the more the url in the config prefixes the asked url the more > +priority it has. > + > +For example, if we set up the configuration variables like this: > +git config --add remote.pushBlacklisted repository.com > +git config --add remote.pushWhitelisted repository.com/Special_Folder > +Any push of this form will
Re: [RFC/PATCH] pathspec: allow escaped query values
On 02/06/16 06:46, Junio C Hamano wrote: > Ramsay Joneswrites: > >> Not having given this much thought at all, but the question which comes >> to mind is: can you use some other separator for the -s rather than >> a comma? That way you don't need to quote them in the part of the >> -spec. >> >> (I dunno, maybe use ; or : instead?) > > There are two kinds of comma involved in this discussion. > > * Multiple pathspec magic can be attached to augment the way > selects paths. ":(,,...)" is >the syntax, and are things like "icase" (select the path >that matches case-insensitively), "top" ( must >match from the top level of the working tree, even when you are >running the command from a subdirectory). We added a new kind of > whose syntax is "attr:VAR=VAL" recently, which says "not >only must match the path, in order to be selected, the >path must have the attribute VAR with value VAL". > >The comma that separates multiple magic is not something you can >change now; it has been with us since v1.7.6 (Jun 2011) > > * My example wanted to use the attr:VAR=VAL form to select those >paths that has one specific string as the value for whitespace >attribute, i.e. VAR in this case is "whitespace". The value for >whitespace attribute determines what kind of whitespace anomalies >are considered as errors by "git apply" and "git diff", and it is >formed by concatenating things like "indent-with-non-tab" (starts >a line with more than 8 consecutive SPs), "space-before-tab" (a >SP appears immediately before HT in the indent), etc., with a >comma in between. > >The comma that separates various kinds of whitespace errors is >not something you can change now; it has been with us since >v1.5.4 (Feb 2008). > > So using different separator is not a viable solution. Ah, OK, makes sense. Note that I have not used 'pathspec magic' or the attribute system in git (never felt/had the need)! ;-) So, at risk of annoying you, let me continue in my ignorance a little longer and ask: even if you have to protect all of this 'magic' from the shell with '/" quoting, could you not use (nested) quotes to protect the part of an ? For example: git ls-files ':(attr:whitespace="indent,trail,space",icase)' [Hmm, I don't seem to be able to find documentation on the syntax of these features (apart from the code, of course).] ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 1/2] completion: factor out untracked file modes into a variable
Signed-off-by: Thomas Braun--- contrib/completion/git-completion.bash | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index 3402475..addea89 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1098,6 +1098,8 @@ _git_clone () esac } +__git_untracked_file_modes="all no normal" + _git_commit () { case "$prev" in @@ -1119,7 +1121,7 @@ _git_commit () return ;; --untracked-files=*) - __gitcomp "all no normal" "" "${cur##--untracked-files=}" + __gitcomp "$__git_untracked_file_modes" "" "${cur##--untracked-files=}" return ;; --*) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 2/2] completion: add git status
Signed-off-by: Thomas Braun--- contrib/completion/git-completion.bash | 55 ++ 1 file changed, 55 insertions(+) diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash index addea89..fa7a03a 100644 --- a/contrib/completion/git-completion.bash +++ b/contrib/completion/git-completion.bash @@ -1782,6 +1782,61 @@ _git_stage () _git_add } +_git_status () +{ + local complete_opt + local untracked_state + + case "$cur" in + --ignore-submodules=*) + __gitcomp "none untracked dirty all" "" "${cur##--ignore-submodules=}" + return + ;; + --untracked-files=*) + __gitcomp "$__git_untracked_file_modes" "" "${cur##--untracked-files=}" + return + ;; + --column=*) + __gitcomp " + always never auto column row plain dense nodense + " "" "${cur##--column=}" + return + ;; + --*) + __gitcomp " + --short --branch --porcelain --long --verbose + --untracked-files= --ignore-submodules= --ignored + --column= --no-column + " + return + ;; + esac + + untracked_state="$(__git_find_on_cmdline "--untracked-files=no\ + --untracked-files=normal --untracked-files=all")" + untracked_state=${untracked_state##--untracked-files=} + + if [ -z "$untracked_state" ]; then + untracked_state="$(git --git-dir="$(__gitdir)" config "status.showUntrackedFiles")" + fi + + case "$untracked_state" in + no) + # --ignored option does not matter + complete_opt= + ;; + all|normal|*) + complete_opt="--cached --directory --no-empty-directory --others" + + if [ -n "$(__git_find_on_cmdline "--ignored")" ]; then + complete_opt="$complete_opt --ignored --exclude=*" + fi + ;; + esac + + __git_complete_index_file "$complete_opt" +} + __git_config_get_set_variables () { local prevword word config_file= c=$cword -- 2.8.3.windows.1 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 2/2] completion: add git status
Am 01.06.2016 um 14:15 schrieb SZEDER Gábor: > > Quoting Thomas Braun: > >> Signed-off-by: Thomas Braun >> --- >> contrib/completion/git-completion.bash | 29 >> + >> 1 file changed, 29 insertions(+) >> >> diff --git a/contrib/completion/git-completion.bash >> b/contrib/completion/git-completion.bash >> index addea89..77343da 100644 >> --- a/contrib/completion/git-completion.bash >> +++ b/contrib/completion/git-completion.bash >> @@ -1782,6 +1782,35 @@ _git_stage () >> _git_add >> } >> >> +_git_status () >> +{ >> +case "$cur" in >> +--ignore-submodules=*) >> +__gitcomp "none untracked dirty all" "" >> "${cur##--ignore-submodules=}" >> +return >> +;; >> +--untracked-files=*) >> +__gitcomp "$__git_untracked_file_modes" "" >> "${cur##--untracked-files=}" >> +return >> +;; >> +--column=*) >> +__gitcomp " >> +always never auto column row plain dense nodense >> +" "" "${cur##--column=}" >> +return >> +;; >> +--*) >> +__gitcomp " >> +--short --branch --porcelain --long --verbose >> +--untracked-files= --ignore-submodules= --ignored >> +--column= --no-column >> +" >> +return >> +;; >> +esac >> +__git_complete_file > > __git_complete_file()'s job is to complete the ':' notation, > e.g. 'master:Mak', which is not what we want here, because this > notation doesn't make sense for 'git status' and because 'git status > ' would then offer refs instead of files. Correct. I might have been mislead by the name ;) > I think there are two choices what to do instead: > > - Don't do anything :) Bash will then fall back to filename > completion, which is quite close to what we want here (and in this > case the return statements from the other case arms can go away as > well). The drawback is that all ignored files in the current > working directory will show up after 'git status '. > > - use __git_complete_index_file() with appropriate options, perhaps > '--cached --others', but I didn't think this through. For bonus > points pass additional options when certain 'git status' options are > already present on the command line, e.g. pass '--ignored', too, if > it is present. I went for the bonus points way. If that is too involved I can also go back to "Don't do anything". -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 1/2] completion: create variable for untracked file modes
Am 01.06.2016 um 13:59 schrieb SZEDER Gábor: > > This subject would perhaps read better: > > completion: factor out untracked file modes into a variable Yes, definitly. Will be included in reroll. > Quoting Thomas Braun: > >> Signed-off-by: Thomas Braun >> --- >> contrib/completion/git-completion.bash | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/contrib/completion/git-completion.bash >> b/contrib/completion/git-completion.bash >> index 3402475..addea89 100644 >> --- a/contrib/completion/git-completion.bash >> +++ b/contrib/completion/git-completion.bash >> @@ -1098,6 +1098,8 @@ _git_clone () >> esac >> } >> >> +__git_untracked_file_modes="all no normal" >> + >> _git_commit () >> { >> case "$prev" in >> @@ -1119,7 +1121,7 @@ _git_commit () >> return >> ;; >> --untracked-files=*) >> -__gitcomp "all no normal" "" "${cur##--untracked-files=}" >> +__gitcomp "$__git_untracked_file_modes" "" >> "${cur##--untracked-files=}" >> return >> ;; >> --*) > > > -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 28/39] i18n: config: unfold error messages marked for translation
Às 17:43 de 01-06-2016, Junio C Hamano escreveu: > Vasco Almeidawrites: > >> Introduced in 473166b ("config: add 'origin_type' to config_source >> struct", 2016-02-19), Git can inform the user about the origin of a >> config error, but the implementation does not allow translators to >> translate the keywords 'file', 'blob, 'standard input', and >> 'submodule-blob'. Moreover, for the second message, a reason for the >> error is appended to the message, not allowing translators to translate >> that reason either. > > Good intentions. > >> @@ -417,6 +417,7 @@ static int git_parse_source(config_fn_t fn, void *data) >> int comment = 0; >> int baselen = 0; >> struct strbuf *var = >var; >> +char error_msg[128]; >> >> /* U+FEFF Byte Order Mark in UTF8 */ >> const char *bomptr = utf8_bom; >> @@ -471,10 +472,38 @@ static int git_parse_source(config_fn_t fn, void *data) >> if (get_value(fn, data, var) < 0) >> break; >> } >> + >> +switch (cf->origin_type) { >> +case CFG_BLOB: >> +xsnprintf(error_msg, sizeof(error_msg), >> + _("bad config line %d in blob %s"), >> + cf->linenr, cf->name); > > Use xstrfmt() intead, perhaps? That would be cleaner. > Wouldn't that create a memory leak? Because, in this patch, error_msg is used in the following way if (cf->die_on_error) die(error_msg); else return error(error_msg); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 13/39] i18n: git-sh-setup.sh: mark strings for translation
Às 20:30 de 01-06-2016, Junio C Hamano escreveu: > Junio C Hamanowrites: > Would it allow you to lose the $(git --exec-path) prefix in the new > dot-source to have this patch before applying your patch? > > -- >8 -- > Subject: t2300: run git-sh-setup in an environment that better mimics the > real life > > When we run scripted Porcelains, "git" potty has set up the $PATH by > prepending $GIT_EXEC_PATH, the path given by "git --exec-path=$there > $cmd", etc. already. Because of this, scripted Porcelains can > dot-source shell script library like git-sh-setup with simple dot > without specifying any path. > > t2300 however dot-sources git-sh-setup without adjusting $PATH like > the real "git" potty does. This has not been a problem so far, but > once git-sh-setup wants to rely on the $PATH adjustment, just like > any scripted Porcelains already do, it would become one. It cannot > for example dot-source another shell library without specifying the > full path to it by prefixing $(git --exec-path). > > Signed-off-by: Junio C Hamano > --- > t/t2300-cd-to-toplevel.sh | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/t/t2300-cd-to-toplevel.sh b/t/t2300-cd-to-toplevel.sh > index 9965bc5..cccd7d9 100755 > --- a/t/t2300-cd-to-toplevel.sh > +++ b/t/t2300-cd-to-toplevel.sh > @@ -8,7 +8,8 @@ test_cd_to_toplevel () { > test_expect_success $3 "$2" ' > ( > cd '"'$1'"' && > - . "$(git --exec-path)"/git-sh-setup && > + PATH="$(git --exec-path):$PATH" && > + . git-sh-setup && > cd_to_toplevel && > [ "$(pwd -P)" = "$TOPLEVEL" ] > ) > Yes, this patch allows to have . git-sh-i18n instead of . "$(git --exec-path)"/git-sh-i18n in git-sh-setup.sh file. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 24/39] i18n: bisect: enable l10n of bisect terms in messages
Às 17:38 de 01-06-2016, Junio C Hamano escreveu: > Vasco Almeidawrites: > >> +enum term { BAD, GOOD, OLD, NEW }; >> +static const char *term_names[] = { >> +/* TRANSLATORS: in bisect.c source code file, the following terms are >> + used to describe a "bad commit", "good commit", "new revision", etc. >> + Please, if you can, check the source when you are not sure if a %s >> + would be replaced by one of the following terms. */ >> +N_("bad"), N_("good"), N_("old"), N_("new"), NULL >> +}; >> + >> /* Remember to update object flag allocation in object.h */ >> #define COUNTED (1u<<16) >> >> @@ -725,12 +734,12 @@ static void handle_bad_merge_base(void) >> if (is_expected_rev(current_bad_oid)) { >> char *bad_hex = oid_to_hex(current_bad_oid); >> char *good_hex = join_sha1_array_hex(_revs, ' '); >> -if (!strcmp(term_bad, "bad") && !strcmp(term_good, "good")) { >> +if (!strcmp(term_bad, term_names[BAD]) && !strcmp(term_good, >> term_names[GOOD])) { >> fprintf(stderr, _("The merge base %s is bad.\n" >> "This means the bug has been fixed " >> "between %s and [%s].\n"), >> bad_hex, bad_hex, good_hex); >> -} else if (!strcmp(term_bad, "new") && !strcmp(term_good, >> "old")) { >> +} else if (!strcmp(term_bad, term_names[NEW]) && >> !strcmp(term_good, term_names[OLD])) { >> fprintf(stderr, _("The merge base %s is new.\n" >> "The property has changed " >> "between %s and [%s].\n"), >> @@ -739,7 +748,7 @@ static void handle_bad_merge_base(void) >> fprintf(stderr, _("The merge base %s is %s.\n" >> "This means the first '%s' commit is " >> "between %s and [%s].\n"), >> -bad_hex, term_bad, term_good, bad_hex, >> good_hex); >> +bad_hex, _(term_bad), _(term_good), bad_hex, >> good_hex); > > These "bad" and "good" that are compared with term_bad and term_good > are the literal tokens the end user gives from the "git bisect" > command line. I do not think you would want to catch them with > > $ git bisect novo > $ git bisect velho > > unless the user has done > > $ git bisect --term-old=velho --term-new=novo > > previously. I may be misunderstanding you, but we do not "catch" those terms with this patch, although I'm not sure what you mean by "catch them". I think you forget that no-operation N_("good"), does not affect in any way the string "good", it only enables xgettext to extract it to .pot file, does not trigger translation. Overall, I don't understand what are you trying to tell me here. > > And that "custom bisect terms" case is covered by the last "else" > clause in this if/elseif cascade (outside the context we can see in > your message). > > The only thing you need to do around here is to mark the string as > translatable. I do not think we need "enum term", or term_names[]. This patch tries to make bisect output those term translated within the also translated message. To enable this, it is handy to have term_names[] in order to mark each term, although I could have mark them anywhere they appeared in the source. It was only for that I chose to have term_names[]. > >> @@ -747,7 +756,7 @@ static void handle_bad_merge_base(void) >> fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n" >> "git bisect cannot work properly in this case.\n" >> "Maybe you mistook %s and %s revs?\n"), >> -term_good, term_bad, term_good, term_bad); >> +_(term_good), _(term_bad), _(term_good), _(term_bad)); > > Likewise for all _(term_good), _(term_bad) and use of term_names[] > we see in this patch. > Indeed this was more of a PATCH/RFC to see what people would think of it. If nobody contest and there is no value in it, I'll happily drop this patch in the next re-roll. My motivation for this patch was that a user could feel embarrassment reading a message in her language with those terms untranslated. Although I do believe that no translating them is also a possibility. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] strbuf: improve API
On Thu, Jun 02, 2016 at 02:58:22PM +0200, Matthieu Moy wrote: > Michael Haggertywrites: > >> 1. The amount of added code complexity is small and quite >>encapsulated. > > Actually, STRBUF_OWNS_MEMORY can even be seen as a simplification if > done properly: we already have the case where the strbuf does not own > the memory with strbuf_slopbuf. I already pointed places in > strbuf_grow() which could be simplified after the patch. Re-reading the > code it seems at lesat the call to strbuf_grow(sb, 0); in strbuf_detach > becomes useless. The same in strbuf_attach() probably is, too. > > So, the final strbuf.[ch] code might not be "worse" that the previous. > > I'm unsure about the complexity of the future code using the new API. I > don't forsee cases where using the new API would lead to a high > maintenance cost, but I don't claim I considered all possible uses. > >> 2. The ability to use strbufs without having to allocate memory might >>make enough *psychological* difference that it encourages some >>devs to use strbufs where they would otherwise have done manual >>memory management. I think this would be a *big* win in terms of >>potential bugs and security vulnerabilities avoided. > > Note that this can also be seen as a counter-argument, since it > may psychologically encourage people to micro-optimize code and use > contributors/reviewers neurons to spend time on "shall this be on-stack > or malloced?". > > I think we already have a tendency to micro-optimize non-critical code > too much in Git's codebase, so it's not necessarily a step in the right > direction. > > In conclusion, I don't have a conclusion, sorry ;-). Thank you all for your input and your tests, those are very valuable! Me and Simon have to take a decision, as this contribution is part of a school project that comes to an end. We won't have the time to create tests that are representative of the use of strbuf in the Git codebase (partly because we lack knowledge of the codebase) to settle on whether this API improvement is useful or not. Jeff made very good points, and the tests we ran ourselves seem to agree with yours. That being said, the tests made by Michael, more detailed, suggest that there may be room for an improvement of the strbuf API (even thought that's to confront to the reality of the codebase). Having little time, we will refactor the patch and send it as a V2: this way, if it appears one day that improving the API with stack-allocated memory is indeed useful, the patch will be ready to merge :) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (May 2016, #09; Tue, 31) t1308 broken
It seams as ./t1308-config-set.sh is broken, when the the directory is a soft link: -name=/home/tb/NoBackup/projects/git/git.pu/t/trash directory.t1308-config-set/.gitconfig +name=/home/tb/projects/git/git.pu/t/trash directory.t1308-config-set/.gitconfig scope=global key=foo.bar not ok 28 - iteration shows correct origins I havent't digged further, too many conflicts in the config code, may be somebody knows it directly ? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] fetch: better alignment in ref summary
On Mon, May 23, 2016 at 7:58 AM, Junio C Hamanowrote: > That is, I wonder if the above can become something like: > >> From github.com:pclouds/git >> * [new branch] { -> pclouds/}2nd-index >> * [new branch] { -> pclouds/}3nd-index >> * [new branch] { -> pclouds/}file-watcher >> ... for context of the following quote... On Thu, May 26, 2016 at 12:18 PM, Jeff King wrote: > I do agree with Junio that we could probably improve the output quite a > bit by not showing each refname twice in the common case. I don't quite > find the "{ -> origin/}whatever" particularly pretty, but something like > that which keeps the variable bits at the end would make this problem > just go away. I tried it out. It's a bit hard to read at the "/}" boundary. Color highlight might help. But it occurs to me, could we extend refspec a bit to allow "foo/bar:base/..." be be equivalent of "foo/bar:base/foo/bar". Then fetch output could become >> * [new branch] 2nd-index:pclouds/... >> * [new branch] 3nd-index:pclouds/... >> * [new branch] file-watcher:pclouds/... It might be a tiny bit better, and a forced update could be displayed with a prefix '+'. Hmm? -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] strbuf: improve API
Michael Haggertywrites: > 1. The amount of added code complexity is small and quite >encapsulated. Actually, STRBUF_OWNS_MEMORY can even be seen as a simplification if done properly: we already have the case where the strbuf does not own the memory with strbuf_slopbuf. I already pointed places in strbuf_grow() which could be simplified after the patch. Re-reading the code it seems at lesat the call to strbuf_grow(sb, 0); in strbuf_detach becomes useless. The same in strbuf_attach() probably is, too. So, the final strbuf.[ch] code might not be "worse" that the previous. I'm unsure about the complexity of the future code using the new API. I don't forsee cases where using the new API would lead to a high maintenance cost, but I don't claim I considered all possible uses. > 2. The ability to use strbufs without having to allocate memory might >make enough *psychological* difference that it encourages some >devs to use strbufs where they would otherwise have done manual >memory management. I think this would be a *big* win in terms of >potential bugs and security vulnerabilities avoided. Note that this can also be seen as a counter-argument, since it may psychologically encourage people to micro-optimize code and use contributors/reviewers neurons to spend time on "shall this be on-stack or malloced?". I think we already have a tendency to micro-optimize non-critical code too much in Git's codebase, so it's not necessarily a step in the right direction. In conclusion, I don't have a conclusion, sorry ;-). -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH] Triangular Workflow UI improvement: Documentation
On 05/31/2016 04:33 PM, Matthieu Moy wrote: > Jordan DE GEAwrites: > [...] >> +DESCRIPTION >> +--- >> + >> +Triangular Workflow (or Asymmetric Workflow) is a workflow which gives >> +the possibility to: >> + >> +- fetch (or pull) from a repository >> +- push to another repository > > [...] > > I find Michael Haggerty's definition of triangular workflow much > clearer: > > https://github.com/blog/2042-git-2-5-including-multiple-worktrees-and-triangular-workflows#improved-support-for-triangular-workflows > > I don't see a licence on the GitHub blog, so I don't think it's legal to > copy-past directly to our docs, but Michael might allow us to do so? I'm glad you find that blog post useful! You are correct that the text of GitHub blog posts is copyrighted, so indeed you need to ask before using it. It is OK with me (and more importantly with GitHub, because I wrote this text in their employ) for you to use the text about triangular workflows from the blog post mentioned above under the Git project's license. I strongly suggest that you adapt it to make it fit better with the rest of the Git documentation, and because having verbatim copies of the same text in two places seems a little bit silly. But that's up to you. Signed-off-by: Michael Haggerty Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] Fix prune/gc problem with multiple worktrees
On 06/02/2016 11:53 AM, Duy Nguyen wrote: > (from patch 4/4 mail) > > On Wed, Jun 1, 2016 at 10:51 PM, Michael Haggerty> wrote: >>> + path = xstrdup(worktree_git_path(wt, "logs/refs/bisect")); >>> + if (file_exists(path)) >>> + handle_one_reflog(path, NULL, 0, ); >>> + free(path); >>> +} >> >> `refs/bisect` is not a single reference. It is a namespace that contains >> references with names like `refs/bisect/bad` and >> `refs/bisect/good-66106691a1b71e445fe5e4d6b8b043dffc7dfe4c`. > > Yeah I missed that. I'm not going to write another directory walker to > collect all logs/refs/bisect/*. I didn't add pending objects for > refs/bisect/* of other worktrees either. At that point waiting for the > new ref iterator makes more sense... > > On Wed, Jun 1, 2016 at 11:06 PM, Junio C Hamano wrote: >> Nguyễn Thái Ngọc Duy writes: >> >>> This series makes sure that objects referenced by all worktrees are >>> marked reachable so that we don't accidentally delete objects that are >>> being used. Previously per-worktree references in index, detached HEAD >>> or per-worktree reflogs come from current worktree only, not all >>> worktrees. >> >> I'll let this topic simmer on the list for now, instead of picking >> it up immediately to 'pu', as Michael in $gmane/296068 makes me >> wonder if we want to keep piling on the current "worktree ref >> iterations are bolted on" or if we want to first clean it up, whose >> natural fallout hopefully would eliminate the bug away. > > So what should be the way forward? My intention was having something > that can fix the problem for now, even if a bit hacky while waiting > for ref iterator to be ready, then convert to use it and clean things > up, because I don't how long ref-iterator would take and losing data > is serous enough that I'd like to fix it soon. If we go with "fix soon > then convert to ref-iterator later", then I will drop the > logs/bisect/* check, checking logs/HEAD alone should be good enough. > Otherwise I'll prepare a series on top of ref-iterator. Fixing reachability via the index and detached HEADs feels relatively important. Fixing refs/bisect/* feels a bit less urgent, because (1) usually bisections don't take very long, and (2) if the commits that one is bisecting are not otherwise reachable anymore, then the bisection is probably not interesting anymore. However, these are real references, so if they get broken, the worktree will be seen as corrupt and recovery is not super obvious (I guess most people would end up deleting the whole worktree). Fixing the reflogs for HEAD and (especially) refs/bisect/* in worktrees feels even less important, because reflogs are not nearly as important as current ref values, Git is relatively tolerant of broken reflog entries, and there are easy ways to prune them if breakage occurs. It's hard for me to predict when the ref-iterator stuff will be merged. It is a big change, but so far the feedback seems pretty good. I can tell you that pushing it and ref-stores forward is high on my priority list. Michael -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] strbuf: improve API
On 06/01/2016 11:07 PM, Jeff King wrote: > On Wed, Jun 01, 2016 at 03:42:18AM -0400, Jeff King wrote: > >> I have no idea if those ideas would work. But I wouldn't want to start >> looking into either of them without some idea of how much time we're >> actually spending on strbuf mallocs (or how much time we would spend if >> strbufs were used in some proposed sites). > > So I tried to come up with some numbers. > > Here's an utterly silly use of strbufs, but one that I think should > over-emphasize the effect of any improvements we make: > > diff --git a/Makefile b/Makefile > index 7a0551a..72b968a 100644 > --- a/Makefile > +++ b/Makefile > @@ -579,6 +579,7 @@ PROGRAM_OBJS += shell.o > PROGRAM_OBJS += show-index.o > PROGRAM_OBJS += upload-pack.o > PROGRAM_OBJS += remote-testsvn.o > +PROGRAM_OBJS += foo.o > > # Binary suffix, set to .exe for Windows builds > X = > diff --git a/foo.c b/foo.c > index e69de29..b62dd97 100644 > --- a/foo.c > +++ b/foo.c > @@ -0,0 +1,18 @@ > +#include "git-compat-util.h" > +#include "strbuf.h" > + > +int main(void) > +{ > + const char *str = "this is a string that we'll repeatedly insert"; > + size_t len = strlen(str); > + > + int i; > + for (i = 0; i < 100; i++) { > + struct strbuf buf = STRBUF_INIT; > + int j; > + for (j = 0; j < 500; j++) > + strbuf_add(, str, len); > + strbuf_release(); > + } > + return 0; > +} Thanks for generating actual data. Your benchmark could do two things to stress malloc()/free() even more. I'm not claiming that this makes the benchmark more typical of real-world use, but it maybe gets us closer to the theoretical upper limit on improvement. 1. Since strbuf_grow() allocates new space in geometrically increasing sizes, the number of mallocs needed to do N strbuf_add()s increases only like log(N). So decreasing the "500" would increase the fraction of the time spent on allocations. 2. Since the size of the string being appended increases the time spent copying bytes, without appreciably changing the number of allocations (also because of geometric growth), decreasing the length of the string would also increase the fraction of the time spent on allocations. I put those together as options, programmed several variants of the string-concatenating loop, and added a perf script to run them; you can see the full patch here: https://github.com/mhagger/git/commit/b417935a4425e0f2bf62e59a924dc652bb2eae0c The guts look like this: > int j; > if (variant == 0) { > /* Use buffer allocated a single time */ > char *buf = big_constant_lifetime_buf; > > for (j = 0; j < reps; j++) > strcpy(buf + j * len, str); > } else if (variant == 1) { > /* One correct-sized buffer malloc per iteration */ > char *buf = xmalloc(reps * len + 1); > > for (j = 0; j < reps; j++) > strcpy(buf + j * len, str); > > free(buf); > } else if (variant == 2) { > /* Conventional use of strbuf */ > struct strbuf buf = STRBUF_INIT; > > for (j = 0; j < reps; j++) > strbuf_add(, str, len); > > strbuf_release(); > } else if (variant == 3) { > /* strbuf initialized to correct size */ > struct strbuf buf; > strbuf_init(, reps * len); > > for (j = 0; j < reps; j++) > strbuf_add(, str, len); > > strbuf_release(); > } else if (variant == 4) { > /* >* Simulated fixed strbuf with correct size. >* This code only works because we know how >* strbuf works internally, namely that it >* will never realloc() or free() the buffer >* that we attach to it. >*/ > struct strbuf buf = STRBUF_INIT; > strbuf_attach(, big_constant_lifetime_buf, 0, reps * len + > 1); > > for (j = 0; j < reps; j++) > strbuf_add(, str, len); > > /* No strbuf_release() here! */ > } I ran this for a short string ("short") and a long string ("this is a string that we will repeatedly insert"), and also concatenating the string 5, 20, or 500 times. The number of loops around the whole program is normalized to make the total number of concatenations approximately constant. Here are the full results: time (s) Test 0 1 2 3 4 5 short strings 1.64 3.37 8.72 6.08 3.65 20 short strings1.72 2.12 5.43 4.01 3.39 500 short strings 1.62 1.61 3.36 3.26 3.10 5 long strings 2.08 6.64 13.09 8.50 3.79 20 long strings 2.16 3.33 7.03
Re: [PATCH 09/13] refs: introduce an iterator interface
On Mon, May 30, 2016 at 2:55 PM, Michael Haggertywrote: > Currently, the API for iterating over references is via a family of > for_each_ref()-type functions that invoke a callback function for each > selected reference. All of these eventually call do_for_each_ref(), > which knows how to do one thing: iterate in parallel through two > ref_caches, one for loose and one for packed refs, giving loose > references precedence over packed refs. This is rather complicated code, > and is quite specialized to the files backend. It also requires callers > to encapsulate their work into a callback function, which often means > that they have to define and use a "cb_data" struct to manage their > context. > > The current design is already bursting at the seams, and will become > even more awkward in the upcoming world of multiple reference storage > backends: > > * Per-worktree vs. shared references are currently handled via a kludge > in git_path() rather than iterating over each part of the reference > namespace separately and merging the results. This kludge will cease > to work when we have multiple reference storage backends. Question from a refs user. Right now worktree.c:get_worktrees() peeks directly to "$GIT_DIR/worktrees/xxx/HEAD" and parses the content itself, something that I promised to fix but never got around to do it. Will we have an iterator to go through all worktrees' HEAD, or will there be an API to say "resolve ref HEAD from worktree XYZ"? -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/4] Fix prune/gc problem with multiple worktrees
(from patch 4/4 mail) On Wed, Jun 1, 2016 at 10:51 PM, Michael Haggertywrote: >> + path = xstrdup(worktree_git_path(wt, "logs/refs/bisect")); >> + if (file_exists(path)) >> + handle_one_reflog(path, NULL, 0, ); >> + free(path); >> +} > > `refs/bisect` is not a single reference. It is a namespace that contains > references with names like `refs/bisect/bad` and > `refs/bisect/good-66106691a1b71e445fe5e4d6b8b043dffc7dfe4c`. Yeah I missed that. I'm not going to write another directory walker to collect all logs/refs/bisect/*. I didn't add pending objects for refs/bisect/* of other worktrees either. At that point waiting for the new ref iterator makes more sense... On Wed, Jun 1, 2016 at 11:06 PM, Junio C Hamano wrote: > Nguyễn Thái Ngọc Duy writes: > >> This series makes sure that objects referenced by all worktrees are >> marked reachable so that we don't accidentally delete objects that are >> being used. Previously per-worktree references in index, detached HEAD >> or per-worktree reflogs come from current worktree only, not all >> worktrees. > > I'll let this topic simmer on the list for now, instead of picking > it up immediately to 'pu', as Michael in $gmane/296068 makes me > wonder if we want to keep piling on the current "worktree ref > iterations are bolted on" or if we want to first clean it up, whose > natural fallout hopefully would eliminate the bug away. So what should be the way forward? My intention was having something that can fix the problem for now, even if a bit hacky while waiting for ref iterator to be ready, then convert to use it and clean things up, because I don't how long ref-iterator would take and losing data is serous enough that I'd like to fix it soon. If we go with "fix soon then convert to ref-iterator later", then I will drop the logs/bisect/* check, checking logs/HEAD alone should be good enough. Otherwise I'll prepare a series on top of ref-iterator. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 2/6] worktree.c: find_worktree() learns to identify worktrees by basename
On Thu, Jun 2, 2016 at 1:44 AM, Junio C Hamanowrote: >> We would >> need to convert or match both '/' and '\' in "to/foo" case because of >> Windows, so it's not much easier than basename(). > > I never said "easier to implement". But can this codepath get > backslashed paths in the first place? I somehow thought that > normalization would happen a lot before the control reaches here. > > You'll be calling into fspathcmp() anyway; shouldn't the function > know that '/' and '\' are equivalent on some platforms, or is it > legal to only call fspathcmp() on a single path component without > directory separator? We still need to calculate the length to compare, which could be problematic when utf-8 is involved, or some other encoding. If we always split at '/' boundary though (e.g. "abc/def/ghi", "def/ghi" or "ghi" but never "ef/ghi") then it should be ok. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/4] reachable.c: mark reachable objects in index from all worktrees
On Thu, Jun 2, 2016 at 1:57 AM, David Turnerwrote: >> + struct index_state istate; >> + >> + memset(, 0, sizeof(istate)); > > > Why not just struct index_state istate = {0}; ? > My first thought was.. "hmm.. C99?" but then there are 24 of them in the code base already. Will change. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/4] reachable.c: mark reachable objects in index from all worktrees
On Thu, Jun 2, 2016 at 1:13 AM, Eric Sunshinewrote: > On Wed, Jun 1, 2016 at 6:45 AM, Nguyễn Thái Ngọc Duy > wrote: >> Current mark_reachable_objects() only marks objects from index from >> _current_ worktree as reachable instead of all worktrees. Because this >> function is used for pruning, there is a chance that objects referenced >> by other worktrees may be deleted. Fix that. >> >> Small behavior change in "one worktree" case, the index is read again >> from file. In the current implementation, if the_index is already >> loaded, the index file will not be read from file again. This adds some >> more cost to this operation, hopefully insignificant because >> reachability test is usually very expensive already. > > Could this extra index read be avoided by taking advantage of 'struct > worktree::is_current'[1] and passing the already-loaded index to > add_index_objects_to_pending() if true? > > Or, am I misunderstanding the issue? > > [1]: http://article.gmane.org/gmane.comp.version-control.git/292194 Hah! I broke my worktree changes into many pieces and lost track of them. I thought that one was still in some unsent series. Will use it. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] pretty: support "mboxrd" output format
Eric Wongwrote: > Eric Sunshine wrote: > > On Tue, May 31, 2016 at 3:45 AM, Eric Wong wrote: > > > Eric Sunshine wrote: > > >> I wonder if hand-coding, rather than using a regex, could be an > > >> improvement: > > >> > > >> static int is_mboxrd_from(const char *s, size_t n) > > >> { > > >> size_t f = strlen("From "); > > >> const char *t = s + n; > > >> > > >> while (s < t && *s == '>') > > >> s++; > > >> return t - s >= f && !memcmp(s, "From ", f); > > >> } > > >> > > >> or something. > > > > > > Yikes. I mostly work in high-level languages and do my best to > > > avoid string parsing in C; so that scares me. A lot. > > > > The hand-coded is_mboxrd_from() above is pretty much idiomatic C and > > (I think) typical of how such a function would be coded in Git itself, > > so it looks normal and easy to grok to me (but, of course, I'm > > probably biased since I wrote it). For reference, here is the gfrom function from qmail (gfrom.c, source package netqmail-1.06 in Debian, reformatted git style) int gfrom(char *s, int len) { while ((len > 0) && (*s == '>')) { ++s; --len; } return (len >= 5) && !str_diffn(s, "From ", 5); } Similar to yours, but a several small things improves readability for me: * the avoidance of subtraction from the "return" conditional * s/n/len/ variable name * extra parentheses * removal of "t" variable (t for "terminal/termination"?) str_diffn is memcmp-like, I assume. My eyes glazed over when I saw that function implemented in str_diffn.c, too. Just thinking out loud, with sufficient tests I could go with either. Will reroll when/if I get the chance tomorrow. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html