[PATCH] simplify unhex_and_quote

2012-12-23 Thread Mark Walters

On Sun, 23 Dec 2012, david at tethera.net wrote:
> From: David Bremner 
>
> the overgeneral definition of a prefix can be replaced by lower case
> alphabetic, and still work fine with current notmuch query syntax.
>
> token_len++ is moved to the end, and we restore the delimiter just so
> we can leave the string as as we found it.
> ---
>
> As always, Jani has a keen eye for muddle. Except he's wrong about 
> tok_len - prefix_len, and Mark and I are right. Hopefully ;).
>
> Restoring the delimiter at the end might be pointless (since the rest
> of the input line is modified), but it is one less surprise for somebody 
> repurposing the function.

I am now worried about side bit of Xapian syntax, in particular, what
about brackets. I think we could have

(tag:inbox or tag:tag%20with%20spaces) and 

In which case the first token is (tag:inbox which does not
match. Additionally the third token is tag:tag%20with%20spaces) which
presumably gets quoted to tag:"tag with spaces)" and I am guessing
Xapian  treats this differently than with bracket after the quote.

Finally, I don't know if a query can contain a : without being a prefix
query. If it can that could end up being misquoted.

One possible way round the first problem might be to require the Xapian
syntax to be space separated from the rest but that does mean we are
diverging from the command line syntax.

(I am not very familiar with Xapian syntax nor with quite where this
function is used so I may be worrying about nothing)

Best wishes

Mark





>
> Patches 5 and 6 can be ignored now.
>  tag-util.c |   12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/tag-util.c b/tag-util.c
> index b0a846b..ee28512 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -78,11 +78,13 @@ unhex_and_quote (void *ctx, char *encoded, const char 
> *line_for_error,
>   size_t prefix_len;
>   char delim = *(tok + tok_len);
>  
> - *(tok + tok_len++) = '\0';
> + *(tok + tok_len) = '\0';
>  
> - prefix_len = hex_invariant (tok, tok_len);
> + /* The following matches a superset of prefixes currently
> +  * used by notmuch */
> + prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");
>  
> - if ((strcmp (tok, "*") == 0) || prefix_len >= tok_len - 1) {
> + if ((strcmp (tok, "*") == 0) || prefix_len == tok_len) {
>  
>   /* pass some things through without quoting or decoding.
>* Note for '*' this is mandatory.
> @@ -98,7 +100,7 @@ unhex_and_quote (void *ctx, char *encoded, const char 
> *line_for_error,
>  
>   } else {
>   /* potential prefix: one for ':', then something after */
> - if ((tok_len - prefix_len > 2) && *(tok + prefix_len) == ':') {
> + if ((tok_len - prefix_len >= 2) && *(tok + prefix_len) == ':') {
>   if (! (*query_string = talloc_strndup_append (*query_string,
> tok,
> prefix_len + 1))) 
> {
> @@ -129,6 +131,8 @@ unhex_and_quote (void *ctx, char *encoded, const char 
> *line_for_error,
>   goto DONE;
>   }
>   }
> + /* restore the string and skip delimiter */
> + *(tok + tok_len++) = delim;
>  }
>  
>DONE:
> -- 
> 1.7.10.4
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH] simplify unhex_and_quote

2012-12-22 Thread da...@tethera.net
From: David Bremner 

the overgeneral definition of a prefix can be replaced by lower case
alphabetic, and still work fine with current notmuch query syntax.

token_len++ is moved to the end, and we restore the delimiter just so
we can leave the string as as we found it.
---

As always, Jani has a keen eye for muddle. Except he's wrong about 
tok_len - prefix_len, and Mark and I are right. Hopefully ;).

Restoring the delimiter at the end might be pointless (since the rest
of the input line is modified), but it is one less surprise for somebody 
repurposing the function.

Patches 5 and 6 can be ignored now.
 tag-util.c |   12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tag-util.c b/tag-util.c
index b0a846b..ee28512 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -78,11 +78,13 @@ unhex_and_quote (void *ctx, char *encoded, const char 
*line_for_error,
size_t prefix_len;
char delim = *(tok + tok_len);

-   *(tok + tok_len++) = '\0';
+   *(tok + tok_len) = '\0';

-   prefix_len = hex_invariant (tok, tok_len);
+   /* The following matches a superset of prefixes currently
+* used by notmuch */
+   prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");

-   if ((strcmp (tok, "*") == 0) || prefix_len >= tok_len - 1) {
+   if ((strcmp (tok, "*") == 0) || prefix_len == tok_len) {

/* pass some things through without quoting or decoding.
 * Note for '*' this is mandatory.
@@ -98,7 +100,7 @@ unhex_and_quote (void *ctx, char *encoded, const char 
*line_for_error,

} else {
/* potential prefix: one for ':', then something after */
-   if ((tok_len - prefix_len > 2) && *(tok + prefix_len) == ':') {
+   if ((tok_len - prefix_len >= 2) && *(tok + prefix_len) == ':') {
if (! (*query_string = talloc_strndup_append (*query_string,
  tok,
  prefix_len + 1))) 
{
@@ -129,6 +131,8 @@ unhex_and_quote (void *ctx, char *encoded, const char 
*line_for_error,
goto DONE;
}
}
+   /* restore the string and skip delimiter */
+   *(tok + tok_len++) = delim;
 }

   DONE:
-- 
1.7.10.4



[PATCH] simplify unhex_and_quote

2012-12-22 Thread david
From: David Bremner brem...@unb.ca

the overgeneral definition of a prefix can be replaced by lower case
alphabetic, and still work fine with current notmuch query syntax.

token_len++ is moved to the end, and we restore the delimiter just so
we can leave the string as as we found it.
---

As always, Jani has a keen eye for muddle. Except he's wrong about 
tok_len - prefix_len, and Mark and I are right. Hopefully ;).

Restoring the delimiter at the end might be pointless (since the rest
of the input line is modified), but it is one less surprise for somebody 
repurposing the function.

Patches 5 and 6 can be ignored now.
 tag-util.c |   12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tag-util.c b/tag-util.c
index b0a846b..ee28512 100644
--- a/tag-util.c
+++ b/tag-util.c
@@ -78,11 +78,13 @@ unhex_and_quote (void *ctx, char *encoded, const char 
*line_for_error,
size_t prefix_len;
char delim = *(tok + tok_len);
 
-   *(tok + tok_len++) = '\0';
+   *(tok + tok_len) = '\0';
 
-   prefix_len = hex_invariant (tok, tok_len);
+   /* The following matches a superset of prefixes currently
+* used by notmuch */
+   prefix_len = strspn (tok, abcdefghijklmnopqrstuvwxyz);
 
-   if ((strcmp (tok, *) == 0) || prefix_len = tok_len - 1) {
+   if ((strcmp (tok, *) == 0) || prefix_len == tok_len) {
 
/* pass some things through without quoting or decoding.
 * Note for '*' this is mandatory.
@@ -98,7 +100,7 @@ unhex_and_quote (void *ctx, char *encoded, const char 
*line_for_error,
 
} else {
/* potential prefix: one for ':', then something after */
-   if ((tok_len - prefix_len  2)  *(tok + prefix_len) == ':') {
+   if ((tok_len - prefix_len = 2)  *(tok + prefix_len) == ':') {
if (! (*query_string = talloc_strndup_append (*query_string,
  tok,
  prefix_len + 1))) 
{
@@ -129,6 +131,8 @@ unhex_and_quote (void *ctx, char *encoded, const char 
*line_for_error,
goto DONE;
}
}
+   /* restore the string and skip delimiter */
+   *(tok + tok_len++) = delim;
 }
 
   DONE:
-- 
1.7.10.4

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch