On Thu, Jun 2, 2016 at 2:54 PM, Junio C Hamano <gits...@pobox.com> wrote: > Stefan Beller <sbel...@google.com> writes: > >> In our own .gitattributes file we have attributes such as: >> >> *.[ch] whitespace=indent,trail,space >> >> When querying for attributes we want to be able to ask for the exact >> value, i.e. >> >> git ls-files :(attr:whitespace=indent,trail,space) >> >> should work, but the commas are used in the attr magic to introduce >> the next attr, ... >> ... >> So here is the "escaping only, but escaping done right" version. >> (It goes on top of sb/pathspec-label) > > The phrase "should work" is probably a bit too strong (I'd have said > "it would be nice if this worked"), as we do not have to even > support comma for our immediately expected use cases. Allowing it > merely makes a casual test using our own .gitattributes easier. > >> +static size_t strcspn_escaped(const char *s, const char *reject) > > Perhaps s/reject/stop/?
I took the signature from the strcspn man page, and my version of the man page has `reject` there, `stop` certainly works too > >> +{ >> + const char *i, *j; >> + >> + for (i = s; *i; i++) { >> + /* skip escaped the character */ >> + if (i[0] == '\\' && i[1]) { >> + i++; >> + continue; >> + } >> + /* see if any of the chars matches the current character */ >> + for (j = reject; *j; j++) >> + if (!*i || *i == *j) >> + return i - s; > > I somehow doubt that *i can be NUL here. In any case, this looks > more like > > /* is *i is one of the stop codon? */ > if (strchr(stop, *i)) > break; right, that seems easier. > >> + } >> + return i - s; >> +} > >> +static char *attr_value_unescape(const char *value) >> +{ >> + char *i, *ret = xstrdup(value); >> + >> + for (i = ret; *i; i++) { >> + if (i[0] == '\\') { >> + if (!i[1]) >> + die(_("Escape character '\\' not allowed as " >> + "last character in attr value")); >> + >> + /* remove the backslash */ >> + memmove(i, i + 1, strlen(i)); >> + /* and ignore the character after that */ >> + i++; >> + } >> + } >> + return ret; >> +} >> + > > Repeated memmove() and strlen() somehow bothers me. Would there be > a more efficient and straight-forward way to do this, perhaps along > the lines of this instead? > > const char *src; > char *dst, *ret; > > ret = xmalloc(strlen(value)); > for (src = value, dst = ret; *src; src++, dst++) { > if (*src == '\\') { > if (!src[1]) > die(); > src++; > } > if (*src && invalid_value_char(*src)) > die("cannot use '%c' for value matching", *src) > *dst = *src; > } > *dst = '\0' > return ret; > > Even though I earlier said "Now we have an unquote mechanism, we can > open the floodgate by lifting the "no backslash in value" check, I > now realize that we do want to keep some escape hatch for us to > extend the quoting syntax even more later, so perhaps with something > like this: > > static inline int invalid_value_char(const char ch) > { > if (isalnum(ch) || strchr(",-_", ch)) > return 0; > return -1; > } Makes sense. Later on we could have : or ; for an and/or of the values and parens and such, so the invalid_value_char makes sense. > > Thanks. Thanks, Stefan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html