[EMAIL PROTECTED] wrote:
Experiments using bash indicate that either ^ or ! is accepted
as the negation of a character set. Hence,
ls -d [^tu]*
ls -d [!tu]*
both return the same thing - a list of all files and directories
in the current directory whose names do not begin with "t" or "u".
SQLite only supports ^, not !. I wonder if this is something I
should change? It would not be much trouble to get GLOB to support
both, much like the globber in bash.
Anybody have an old Bourne shell around? An authentic C-shell?
What do they do?
Richard,
I found the following info in a Jedit appendix.
  ?          matches any one character
  *          matches any number of characters
  {!glob}    matches anything that does not match glob
  {a,b,c}    matches any one of a, b or c
  [abc]      matches any character in the set a, b or c
  [^abc]     matches any character not in the set a, b or c
  [a-z]      matches any character in the range a to z, inclusive.
             A leading or trailing dash will be interpreted literally.
I noticed that SQLite doesn't implement any of the curly brace grouping
of globs. The Jedit list also shows ^ used for inversion within a
character set, and ! used for inversion of a complete glob.
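A quick way to check what a given SQLite build actually does with these patterns is to ask it directly. The sketch below uses Python's standard sqlite3 module; the helper name sqlite_glob is made up for illustration, and the answers depend on the SQLite version linked into your Python.

```python
import sqlite3


def sqlite_glob(pattern, text):
    """Return True if `text GLOB pattern` holds in the linked SQLite.

    sqlite_glob is a hypothetical helper for this example, not an SQLite API.
    """
    conn = sqlite3.connect(":memory:")
    try:
        (result,) = conn.execute("SELECT ? GLOB ?", (text, pattern)).fetchone()
        return bool(result)
    finally:
        conn.close()


# ^ negates a character set in SQLite's GLOB.
print("apple GLOB [^tu]* :", sqlite_glob("[^tu]*", "apple"))
print("tree  GLOB [^tu]* :", sqlite_glob("[^tu]*", "tree"))

# What ! does inside a set is the open question; print it rather than assume.
print("apple GLOB [!tu]* :", sqlite_glob("[!tu]*", "apple"))

# Curly braces are not a grouping operator in GLOB; they match literally.
print("a     GLOB {a,b}  :", sqlite_glob("{a,b}", "a"))
print("{a,b} GLOB {a,b}  :", sqlite_glob("{a,b}", "{a,b}"))
```

The [!tu]* line is printed without a stated expectation on purpose, since that is exactly the behaviour under discussion.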
The following is from the TCL documentation:
The pattern arguments may contain any of the following special
characters:
  ?          Matches any single character.
  *          Matches any sequence of zero or more characters.
  [chars]    Matches any single character in chars. If chars contains a
             sequence of the form a-b then any character between a and
             b (inclusive) will match.
  \x         Matches the character x.
  {a,b,...}  Matches any of the strings a, b, etc.
This doesn't mention inversion at all, but it does say a backslash can
be used to escape a character.
And the following is from the documentation of a glob compiler class.
  *      Matches zero or more instances of any character. If the
         STAR_CANNOT_MATCH_NULL_MASK option is used, * matches one or
         more instances of any character.
  ?      Matches one instance of any character. If the
         QUESTION_MATCHES_ZERO_OR_ONE_MASK option is used, ? matches
         zero or one instances of any character.
  [...]  Matches any of the characters enclosed by the brackets.
         * and ? lose their special meanings within a character
         class. Additionally, if the first character following the
         opening bracket is a ! or a ^, then any character not in the
         character class is matched. A - between two characters can be
         used to denote a range. A - at the beginning or end of the
         character class matches itself rather than referring to a range.
         A ] immediately following the opening [ matches itself
         rather than indicating the end of the character class, otherwise
         it must be escaped with a backslash to refer to itself.
  \      A backslash matches itself in most situations. But when a
         special character such as a * follows it, a backslash
         escapes the character, indicating that the special character
         should be interpreted as a normal character instead of its
         special meaning.
  All other characters match themselves.
This class explicitly mentions using either ^ or ! to invert a character
set. It also allows backslash escapes for special characters. It says *
and ? lose their special status in a character set, so that isn't really
an escape.
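As one more data point (not taken from any of the documents quoted here), Python's standard fnmatch module behaves like this class for * and ? inside a set, but differs on negation: it accepts ! and, in current CPython, treats a leading ^ as an ordinary set member. A small sketch, with the checks dictionary being just an illustrative name:

```python
from fnmatch import fnmatchcase

# fnmatchcase(name, pattern) is the case-sensitive matcher from the
# standard library; the dictionary is only a convenient way to show results.
checks = {
    # * and ? are ordinary characters inside a set.
    "star_in_set_matches_star": fnmatchcase("*", "[*]"),
    "star_in_set_matches_a": fnmatchcase("a", "[*]"),
    "qmark_in_set_matches_qmark": fnmatchcase("?", "[?]"),
    # ! negates the set.
    "bang_apple": fnmatchcase("apple", "[!tu]*"),
    "bang_tree": fnmatchcase("tree", "[!tu]*"),
    # A leading ^ is a literal member here, so the set is {^, t, u}.
    "caret_apple": fnmatchcase("apple", "[^tu]*"),
    "caret_tree": fnmatchcase("tree", "[^tu]*"),
}

for name, result in checks.items():
    print(f"{name}: {result}")
```

So the same pattern [^tu]* selects opposite sets of names in bash and in this module, which is a good argument for documenting the exact rules rather than pointing at a "standard".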
The following is from Apple's documentation:
  ?          Matches any single character.
  *          Matches any sequence of zero or more characters.
  [chars]    Matches any single character in chars. If chars contains a
             sequence of the form a-b then any character between a and
             b (inclusive) will match.
  \x         Matches the character x.
  {a,b,...}  Matches any of the strings a, b, etc.
And finally, from the GNU bash documentation:
3.5.8.1 Pattern Matching
Any character that appears in a pattern, other than the special
pattern characters described below, matches itself. The nul character
may not occur in a pattern. A backslash escapes the following
character; the escaping backslash is discarded when matching. The
special pattern characters must be quoted if they are to be matched
literally.
The special pattern characters have the following meanings:
*
Matches any string, including the null string.
?
Matches any single character.
[...]
Matches any one of the enclosed characters. A pair of characters
separated by a hyphen denotes a range expression; any character
that sorts between those two characters, inclusive, using the
current locale's collating sequence and character set, is matched.
If the first character following the ‘[’ is a ‘!’ or a ‘^’ then
any character not enclosed is matched. A ‘-’ may be matched by
including it as the first or last character in the set. A ‘]’ may
be matched by including it as the first character in the set. The
sorting order of characters in range expressions is determined by
the current locale and the value of the LC_COLLATE shell variable,
if set.
For example, in the default C locale, ‘[a-dx-z]’ is equivalent to
‘[abcdxyz]’. Many locales sort characters in dictionary order, and
in these locales ‘[a-dx-z]’ is typically not equivalent to
‘[abcdxyz]’; it might be equivalent to ‘[aBbCcDdxXyYz]’, for
example. To obtain the traditional interpretation of ranges in
bracket expressions, you can force the use of the C locale by
setting the LC_COLLATE or LC_ALL environment variable to the value
‘C’.
Within ‘[’ and ‘]’, character classes can be specified using the
syntax [:class:], where class is one of the following classes
defined in the POSIX standard:
alnum alpha ascii blank cntrl digit graph lower
print punct space upper word xdigit
A character class matches any character belonging to that class.
The word character class matches letters, digits, and the
character ‘_’.
Within ‘[’ and ‘]’, an equivalence class can be specified using
the syntax [=c=], which matches all characters with the same
collation weight (as defined by the current locale) as the
character c.
Within ‘[’ and ‘]’, the syntax [.symbol.] matches the
collating symbol symbol.
If the extglob shell option is enabled using the shopt builtin,
several extended pattern matching operators are recognized. In the
following description, a pattern-list is a list of one or more
patterns separated by a ‘|’. Composite patterns may be formed using
one or more of the following sub-patterns:
?(pattern-list)
Matches zero or one occurrence of the given patterns.
*(pattern-list)
Matches zero or more occurrences of the given patterns.
+(pattern-list)
Matches one or more occurrences of the given patterns.
@(pattern-list)
Matches one of the given patterns.
!(pattern-list)
Matches anything except one of the given patterns.
------------------------------------------------------------------------
This does say a backslash should be used to escape special characters,
and that ^ and ! are equivalent at the beginning of a character set.
It seems like there is some variation in GLOB syntax. :-)
Perhaps you should add support for a backslash escape and then simply
document what SQLite does (instead of saying it supports standard Unix
glob syntax, since there isn't a standard).
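To make the suggestion concrete, here is a minimal sketch of a matcher with those rules: * and ?, character sets negated by either ^ or !, ranges, a literal ] when it comes first in a set, and backslash escapes. It is written in Python for brevity and is not SQLite's implementation; glob_match and _parse_set are hypothetical names, and details such as treating an unterminated [ as a literal are my own assumptions.

```python
def _parse_set(pat):
    """Parse a set starting at pat[0] == '['.

    Returns (predicate, chars_consumed), or None if the set is unterminated.
    """
    i = 1
    negate = i < len(pat) and pat[i] in "^!"  # accept either negator
    if negate:
        i += 1
    members = []  # list of inclusive (low, high) ranges
    first = True
    while i < len(pat):
        ch = pat[i]
        if ch == "]" and not first:
            def matches(c, members=members, negate=negate):
                return any(lo <= c <= hi for lo, hi in members) != negate
            return matches, i + 1
        first = False
        if ch == "\\" and i + 1 < len(pat):  # escaped set member
            i += 1
            ch = pat[i]
        # a-b is a range unless the dash is the last character of the set
        if i + 2 < len(pat) and pat[i + 1] == "-" and pat[i + 2] != "]":
            members.append((ch, pat[i + 2]))
            i += 3
        else:
            members.append((ch, ch))
            i += 1
    return None


def glob_match(pat, s):
    """Return True if the whole string s matches the glob pattern pat."""
    if not pat:
        return not s
    c = pat[0]
    if c == "*":
        rest = pat.lstrip("*")  # consecutive stars act as one
        return any(glob_match(rest, s[i:]) for i in range(len(s) + 1))
    if not s:
        return False
    if c == "?":
        return glob_match(pat[1:], s[1:])
    if c == "[":
        parsed = _parse_set(pat)
        if parsed is not None:
            matches, consumed = parsed
            return matches(s[0]) and glob_match(pat[consumed:], s[1:])
        # assumption: an unterminated '[' is matched literally
        return s[0] == "[" and glob_match(pat[1:], s[1:])
    if c == "\\" and len(pat) > 1:  # backslash escapes the next character
        return s[0] == pat[1] and glob_match(pat[2:], s[1:])
    return s[0] == c and glob_match(pat[1:], s[1:])


print(glob_match("[^tu]*", "apple"), glob_match("[!tu]*", "apple"))
print(glob_match(r"\*.txt", "*.txt"), glob_match(r"\*.txt", "a.txt"))
```

The recursive handling of * is the simplest thing that works and can be slow on patterns with many stars; it is only meant to pin down the rules that would need documenting.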
HTH
Dennis Cote