Re: [PATCH 1/2] wildmatch: add interface for precompiling wildmatch() patterns

2018-02-26 Thread Ævar Arnfjörð Bjarmason

On Mon, Feb 26 2018, Duy Nguyen jotted:

> On Mon, Feb 26, 2018 at 3:35 AM, Ævar Arnfjörð Bjarmason
>  wrote:
>> Add the scaffolding necessary for precompiling wildmatch()
>> patterns.
>>
>> There is currently no point in doing this with the wildmatch()
>> function we have now, since it can't make any use of precompiling the
>> pattern.
>>
>> But adding this interface and making use of it will make it easy to
>> refactor the wildmatch() function to parse the pattern into opcodes as
>> some glob() implementations do, or to drop an alternate wildmatch()
>> backend in which trades parsing slowness for faster matching, such as
>> the PCRE v2 conversion function that understands the wildmatch()
>> syntax.
>>
>> It's very unlikely that we'll remove the wildmatch() function as a
>> convenience wrapper even if we end up requiring a separate compilation
>> step in some future implementation. There are a lot of one-shot
>> wildmatches in the codebase, in that case most likely wildmatch() will
>> be kept around as a shorthand for wildmatch_{compile,match,free}().
>>
>> I modeled this interface on the PCRE v2 interface. I didn't go with a
>> glob(3) & globfree(3)-like interface because that would require every
>> wildmatch() user to pass a dummy parameter, which I got rid of in
>> 55d3426929 ("wildmatch: remove unused wildopts parameter",
>> 2017-06-22).
>>
>> Signed-off-by: Ævar Arnfjörð Bjarmason 
>> ---
>>  wildmatch.c | 25 +
>>  wildmatch.h | 11 +++
>>  2 files changed, 36 insertions(+)
>>
>> diff --git a/wildmatch.c b/wildmatch.c
>> index d074c1be10..032f339391 100644
>> --- a/wildmatch.c
>> +++ b/wildmatch.c
>> @@ -276,3 +276,28 @@ int wildmatch(const char *pattern, const char *text, 
>> unsigned int flags)
>>  {
>> return dowild((const uchar*)pattern, (const uchar*)text, flags);
>>  }
>> +
>> +struct wildmatch_compiled *wildmatch_compile(const char *pattern,
>> +unsigned int flags)
>> +{
>> +   struct wildmatch_compiled *wildmatch_compiled = xmalloc(
>> +   sizeof(struct wildmatch_compiled));
>
> struct wildmatch_compiled *data = xmalloc(sizeof(*data));
>
> ?
>
> It shortens the line a bit. We already use WM_ prefix for wildmatch
> flags, perhaps we can use it for wildmatch structs too (e.g.
> wm_compiled instead)

Thanks, that's better.

>> +   wildmatch_compiled->pattern = xstrdup(pattern);
>> +   wildmatch_compiled->flags = flags;
>> +
>> +   return wildmatch_compiled;
>> +}
>> +
>> +int wildmatch_match(struct wildmatch_compiled *wildmatch_compiled,
>> +   const char *text)
>> +{
>> +   return wildmatch(wildmatch_compiled->pattern, text,
>> +wildmatch_compiled->flags);
>> +}
>> +
>> +void wildmatch_free(struct wildmatch_compiled *wildmatch_compiled)
>> +{
>> +   if (wildmatch_compiled)
>> +   free((void *)wildmatch_compiled->pattern);
>
> Why do make pattern type "const char *" then remove "const" with
> typecast here? Why not just "char *" in wildmatch_compiled?
>
> If the reason is to avoid other users from peeking in and modifying
> it, then perhaps you can move struct wildmatch_compiled to wildmatch.c
> and keep it an opaque struct pointer.

Yes, it's to indicate that "pattern" won't ever be modified. I think
it's more readable / self documenting to have the compiler enforce that
via const, even though it requires the cast to free it.


Re: [PATCH 1/2] wildmatch: add interface for precompiling wildmatch() patterns

2018-02-26 Thread Duy Nguyen
On Mon, Feb 26, 2018 at 3:35 AM, Ævar Arnfjörð Bjarmason
 wrote:
> Add the scaffolding necessary for precompiling wildmatch()
> patterns.
>
> There is currently no point in doing this with the wildmatch()
> function we have now, since it can't make any use of precompiling the
> pattern.
>
> But adding this interface and making use of it will make it easy to
> refactor the wildmatch() function to parse the pattern into opcodes as
> some glob() implementations do, or to drop an alternate wildmatch()
> backend in which trades parsing slowness for faster matching, such as
> the PCRE v2 conversion function that understands the wildmatch()
> syntax.
>
> It's very unlikely that we'll remove the wildmatch() function as a
> convenience wrapper even if we end up requiring a separate compilation
> step in some future implementation. There are a lot of one-shot
> wildmatches in the codebase, in that case most likely wildmatch() will
> be kept around as a shorthand for wildmatch_{compile,match,free}().
>
> I modeled this interface on the PCRE v2 interface. I didn't go with a
> glob(3) & globfree(3)-like interface because that would require every
> wildmatch() user to pass a dummy parameter, which I got rid of in
> 55d3426929 ("wildmatch: remove unused wildopts parameter",
> 2017-06-22).
>
> Signed-off-by: Ævar Arnfjörð Bjarmason 
> ---
>  wildmatch.c | 25 +
>  wildmatch.h | 11 +++
>  2 files changed, 36 insertions(+)
>
> diff --git a/wildmatch.c b/wildmatch.c
> index d074c1be10..032f339391 100644
> --- a/wildmatch.c
> +++ b/wildmatch.c
> @@ -276,3 +276,28 @@ int wildmatch(const char *pattern, const char *text, 
> unsigned int flags)
>  {
> return dowild((const uchar*)pattern, (const uchar*)text, flags);
>  }
> +
> +struct wildmatch_compiled *wildmatch_compile(const char *pattern,
> +unsigned int flags)
> +{
> +   struct wildmatch_compiled *wildmatch_compiled = xmalloc(
> +   sizeof(struct wildmatch_compiled));

struct wildmatch_compiled *data = xmalloc(sizeof(*data));

?

It shortens the line a bit. We already use WM_ prefix for wildmatch
flags, perhaps we can use it for wildmatch structs too (e.g.
wm_compiled instead)

> +   wildmatch_compiled->pattern = xstrdup(pattern);
> +   wildmatch_compiled->flags = flags;
> +
> +   return wildmatch_compiled;
> +}
> +
> +int wildmatch_match(struct wildmatch_compiled *wildmatch_compiled,
> +   const char *text)
> +{
> +   return wildmatch(wildmatch_compiled->pattern, text,
> +wildmatch_compiled->flags);
> +}
> +
> +void wildmatch_free(struct wildmatch_compiled *wildmatch_compiled)
> +{
> +   if (wildmatch_compiled)
> +   free((void *)wildmatch_compiled->pattern);

Why do make pattern type "const char *" then remove "const" with
typecast here? Why not just "char *" in wildmatch_compiled?

If the reason is to avoid other users from peeking in and modifying
it, then perhaps you can move struct wildmatch_compiled to wildmatch.c
and keep it an opaque struct pointer.

> +   free(wildmatch_compiled);
> +}
> diff --git a/wildmatch.h b/wildmatch.h
> index b8c826aa68..2fc00e0ca0 100644
> --- a/wildmatch.h
> +++ b/wildmatch.h
> @@ -10,5 +10,16 @@
>  #define WM_ABORT_ALL -1
>  #define WM_ABORT_TO_STARSTAR -2
>
> +struct wildmatch_compiled {
> +   const char *pattern;
> +   unsigned int flags;
> +};
> +
>  int wildmatch(const char *pattern, const char *text, unsigned int flags);
> +struct wildmatch_compiled *wildmatch_compile(const char *pattern,
> +unsigned int flags);
> +int wildmatch_match(struct wildmatch_compiled *wildmatch_compiled,
> +   const char *text);
> +void wildmatch_free(struct wildmatch_compiled *wildmatch_compiled);
> +
>  #endif
> --
> 2.15.1.424.g9478a66081
>



-- 
Duy


[PATCH 1/2] wildmatch: add interface for precompiling wildmatch() patterns

2018-02-25 Thread Ævar Arnfjörð Bjarmason
Add the scaffolding necessary for precompiling wildmatch()
patterns.

There is currently no point in doing this with the wildmatch()
function we have now, since it can't make any use of precompiling the
pattern.

But adding this interface and making use of it will make it easy to
refactor the wildmatch() function to parse the pattern into opcodes as
some glob() implementations do, or to drop an alternate wildmatch()
backend in which trades parsing slowness for faster matching, such as
the PCRE v2 conversion function that understands the wildmatch()
syntax.

It's very unlikely that we'll remove the wildmatch() function as a
convenience wrapper even if we end up requiring a separate compilation
step in some future implementation. There are a lot of one-shot
wildmatches in the codebase, in that case most likely wildmatch() will
be kept around as a shorthand for wildmatch_{compile,match,free}().

I modeled this interface on the PCRE v2 interface. I didn't go with a
glob(3) & globfree(3)-like interface because that would require every
wildmatch() user to pass a dummy parameter, which I got rid of in
55d3426929 ("wildmatch: remove unused wildopts parameter",
2017-06-22).

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 wildmatch.c | 25 +
 wildmatch.h | 11 +++
 2 files changed, 36 insertions(+)

diff --git a/wildmatch.c b/wildmatch.c
index d074c1be10..032f339391 100644
--- a/wildmatch.c
+++ b/wildmatch.c
@@ -276,3 +276,28 @@ int wildmatch(const char *pattern, const char *text, 
unsigned int flags)
 {
return dowild((const uchar*)pattern, (const uchar*)text, flags);
 }
+
+struct wildmatch_compiled *wildmatch_compile(const char *pattern,
+unsigned int flags)
+{
+   struct wildmatch_compiled *wildmatch_compiled = xmalloc(
+   sizeof(struct wildmatch_compiled));
+   wildmatch_compiled->pattern = xstrdup(pattern);
+   wildmatch_compiled->flags = flags;
+
+   return wildmatch_compiled;
+}
+
+int wildmatch_match(struct wildmatch_compiled *wildmatch_compiled,
+   const char *text)
+{
+   return wildmatch(wildmatch_compiled->pattern, text,
+wildmatch_compiled->flags);
+}
+
+void wildmatch_free(struct wildmatch_compiled *wildmatch_compiled)
+{
+   if (wildmatch_compiled)
+   free((void *)wildmatch_compiled->pattern);
+   free(wildmatch_compiled);
+}
diff --git a/wildmatch.h b/wildmatch.h
index b8c826aa68..2fc00e0ca0 100644
--- a/wildmatch.h
+++ b/wildmatch.h
@@ -10,5 +10,16 @@
 #define WM_ABORT_ALL -1
 #define WM_ABORT_TO_STARSTAR -2
 
+struct wildmatch_compiled {
+   const char *pattern;
+   unsigned int flags;
+};
+
 int wildmatch(const char *pattern, const char *text, unsigned int flags);
+struct wildmatch_compiled *wildmatch_compile(const char *pattern,
+unsigned int flags);
+int wildmatch_match(struct wildmatch_compiled *wildmatch_compiled,
+   const char *text);
+void wildmatch_free(struct wildmatch_compiled *wildmatch_compiled);
+
 #endif
-- 
2.15.1.424.g9478a66081