2010/2/10 Dag-Erling Smørgrav <d...@des.no>:
> Garrett Cooper <yanef...@gmail.com> writes:
>> C-shell globs as some programming languages referring to it as,
>> i.e. perl (which this is a subset of the globs concept) allow for
>> expansion via `*' to be `anything'. Regexp style globs for what you're
>> looking for would be either .* (greedy) or .+ (non-greedy), with it
>> being most likely the latter case.
>
> Uh, not quite.
>
> Formally, a regular expression is a textual representation of a finite
> state machine that describes a context-free grammar.
>
> A glob pattern can be trivially translated to a regular expression, but
> not the other way around.  Basically, * in a glob pattern corresponds to
> [^/]*, ? corresponds to ., and [abcd] and [^abcd] have the same meaning

                                             ^^^^ ???? ^^^^

The former is a positive assertion, whereas the latter is a negative
assertion -- how can they have the same meaning?

> as in a regular expression.  The glob pattern syntax has no equivalent
> for +, ?, {m,n}, (foo|bar), etc.

+, {}, and () -- no... those are typically extensions handled by the
shell's own expansion (IIRC). ? however, is a supported glob special
character, matching any single character [from glob(3)]:

     The argument pattern is a pointer to a pathname pattern to be expanded.
     The glob() argument matches all accessible pathnames against the pattern
     and creates a list of the pathnames that match.  In order to have access
     to a pathname, glob() requires search permission on every component of a
     path except the last and read permission on each directory of any file-
     name component of pattern that contains any of the special characters
     `*', `?' or `['.
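
For reference, the glob-to-regexp correspondence described in the quoted
text can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: glob_to_regex and glob_match are
hypothetical helpers, not part of glob(3) or any library; `?' is mapped
to [^/] rather than . so that it never crosses a path separator; [!...]
is accepted as a spelling of negation alongside [^...]; and backslash
escapes and the leading-dot rule are ignored.

```python
import re


def glob_to_regex(pattern):
    """Translate a simple glob pattern into a Python regular expression.

    Hypothetical helper for illustration; meant to be used with re.fullmatch().
    """
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "*":
            out.append("[^/]*")      # any run of characters within one path component
        elif c == "?":
            out.append("[^/]")       # exactly one character, never '/'
        elif c == "[":
            j = i + 1
            if j < len(pattern) and pattern[j] in "!^":
                j += 1               # skip the negation marker
            if j < len(pattern) and pattern[j] == "]":
                j += 1               # a leading ']' is a literal member of the set
            j = pattern.find("]", j)
            if j == -1:
                out.append(re.escape(c))   # unterminated '[' is taken literally
            else:
                body = pattern[i + 1:j].replace("\\", "\\\\")
                if body[0] in "!^":
                    body = "^" + body[1:]  # regexp spelling of a negated set
                out.append("[" + body + "]")
                i = j                # jump to the closing ']'
        else:
            out.append(re.escape(c))  # everything else matches itself
        i += 1
    return "".join(out)


def glob_match(pattern, name):
    """Return True if name matches the glob pattern as a whole."""
    return re.fullmatch(glob_to_regex(pattern), name) is not None
```

With this sketch, glob_to_regex("*.c") yields [^/]*\.c, so
glob_match("*.c", "main.c") is true while glob_match("*.c", "src/main.c")
is not, because * does not cross a `/'.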

> Some shells implement something that resembles alternations, where
> {foo,bar} corresponds to (foo|bar), but these are expanded before the
> glob pattern.  For instance, /tmp/{*,*} is expanded to /tmp/* /tmp/*,
> which is then expanded to two complete copies of the list of files and
> directories in /tmp.
>
> There is no such thing as a "regexp style glob", and I have no idea what
> you mean by "a subset of the globs concept" or where Perl fits into the
> discussion.

This is what I was referring to:
http://perldoc.perl.org/functions/glob.html . Parts of my original
statement were semantically wrong, but I was hand-waving from a basic
`I don't know how globs vs regexps work' standpoint, without getting
into the technical lexical discussion.
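
The point in the quoted text about {foo,bar} being expanded before the
glob pattern can be illustrated with a rough Python sketch. The names
expand_braces and shell_like_glob are hypothetical, and the expansion is
deliberately simplified: no nested braces, no quoting, and an empty {}
is not treated specially the way a real shell would treat it.

```python
import glob
import re


def expand_braces(word):
    """Expand {a,b,...} groups into a list of words (simplified, no nesting)."""
    m = re.search(r"\{([^{}]*)\}", word)
    if m is None:
        return [word]                      # nothing left to expand
    head, tail = word[:m.start()], word[m.end():]
    results = []
    for alt in m.group(1).split(","):
        # Substitute one alternative, then expand any remaining groups.
        results.extend(expand_braces(head + alt + tail))
    return results


def shell_like_glob(word):
    """Brace-expand first, then glob each resulting pattern independently."""
    paths = []
    for pattern in expand_braces(word):
        paths.extend(sorted(glob.glob(pattern)))
    return paths
```

Here expand_braces("/tmp/{*,*}") returns ["/tmp/*", "/tmp/*"], so
shell_like_glob() globs /tmp/* twice and returns every match twice,
which is the behaviour described in the quoted paragraph.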

> Finally, .* and .+ are *both* greedy.  Perl's regular expression syntax
> includes non-greedy variants for both (.*? and .+? respectively).

Yes, but I didn't explicitly note those forms.
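
The greedy versus non-greedy distinction is easy to see with a small
test. The sketch below uses Python's re module, which follows the
Perl-style syntax for these quantifiers; first_match is a hypothetical
helper and the sample string is made up for illustration.

```python
import re


def first_match(pattern, text):
    """Return the text matched by pattern at the start of text, or None."""
    m = re.match(pattern, text)
    return m.group() if m else None


text = "<a><b>"

# Greedy forms: both take as much as possible, so they run to the last '>'.
greedy_star = first_match(r"<.*>", text)    # "<a><b>"
greedy_plus = first_match(r"<.+>", text)    # "<a><b>"

# Non-greedy forms: both stop at the first '>' that lets the match succeed.
lazy_star = first_match(r"<.*?>", text)     # "<a>"
lazy_plus = first_match(r"<.+?>", text)     # "<a>"
```

The only difference between * and + here is the minimum count (zero
versus one); greediness is controlled by the trailing ?.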

> Note that the [], +, ? and {m,n} notations are merely shorthand for
> expressions which can be expressed using only concatenation, alternation
> and the kleene star, which are the only operations available in formal
> regular expressions.
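
As a rough check of the shorthand claim above, the following Python
sketch compares a few shorthand patterns with rewrites that use only
concatenation, alternation and the star (plus the empty alternative).
The pattern pairs and the same_language helper are chosen here for
illustration, and the comparison is a bounded brute-force test over
short strings, not a proof of equivalence.

```python
import re
from itertools import product

# (shorthand, rewrite using only concatenation, alternation and star)
EQUIVALENTS = [
    (r"[abc]", r"(a|b|c)"),
    (r"a+", r"aa*"),
    (r"a?", r"(a|)"),
    (r"a{2,3}", r"(aa|aaa)"),
]


def same_language(p, q, alphabet="abc", max_len=4):
    """Check that p and q accept the same strings up to max_len characters."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(re.fullmatch(p, s)) != bool(re.fullmatch(q, s)):
                return False
    return True


all_equivalent = all(same_language(p, q) for p, q in EQUIVALENTS)
```
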
>
>> I'll see if I can whip up a quick patch in the next day or so -- but
>> before I do that, does it make more sense to do globs or regular
>> expressions? There are pluses and minuses to each version and would
>> require some degree of parsing (and potentially escaping).
>
> I think you'll find that, at least in this particular case, regular
> expressions are an order of magnitude harder to implement than glob
> patterns.

Yes... I wholeheartedly agree...

Thanks :),
-Garrett
_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"
