Hi everybody,
I am developing some functions that use regular expressions with grepl
to check whether certain strings match a given pattern.

In the regular expressions I use some predefined character classes such
as [:alnum:].

In the documentation for regular expressions there is one passage that
is not entirely clear to me:

<< The only portable way to specify all ASCII letters is to list them all
as the character class
[ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz].
(The current implementation uses numerical order of the encoding.)

Certain named classes of characters are predefined. Their interpretation
depends on the locale (see locales); the interpretation below is that of
the POSIX locale.

[:alnum:]
        Alphanumeric characters: [:alpha:] and [:digit:]. >>

Does this mean that I can safely use [:alnum:] to check for letters and
digits?
Or is there a risk that the code won't work on a computer using a
different locale?
If so, can I tell grepl to use the POSIX locale when interpreting the
alphanumeric character class?
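
To make the question concrete, here is a minimal sketch of the two
approaches I am comparing. The helper names is_alnum_class and
is_alnum_ascii are made up for this example and are not part of R; the
test strings are also just illustrative.

```r
# Hypothetical helpers, for illustration only.

# Approach 1: the named class, written inside a bracket expression.
# According to the documentation, its interpretation depends on the locale.
is_alnum_class <- function(x) grepl("^[[:alnum:]]+$", x)

# Approach 2: list every ASCII letter and digit explicitly, which the
# documentation describes as the only portable way.
ascii_alnum <- paste0("^[", paste(c(LETTERS, letters, 0:9), collapse = ""), "]+$")
is_alnum_ascii <- function(x) grepl(ascii_alnum, x)

x <- c("abc123", "abc 123", "caf\u00e9")

is_alnum_class(x)  # the result for the accented string may vary by locale
is_alnum_ascii(x)  # TRUE FALSE FALSE
```

With the explicit listing, the accented string is always rejected, which
is what I want; my doubt is whether the first version behaves the same
way everywhere.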

Thanks a lot in advance for the help!
Best,
Luca

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
