On Thu, Dec 16, 2010 at 06:17:45AM -0800, Dieter Menne wrote:
> Petr Savicky wrote:
> >
> > One of the suggestions in this thread was to use an external program.
> > A possible solution without negation in Perl is
> >
> >   @a = ("AB15E9SDF654VKBN?dvb.65" =~ m/[0-9]/g);
> >   print @a, "\n";
> >
> >   15965465
> >
>
> Which is
>
>   gsub("[^0-9]", "", "AB15E9SDF654VKBN?dvb.65")
>
> as Henrique suggested.

I agree. The Perl code was a reply to the question of whether the same
can be done by describing the required elements rather than the ones
to be removed. This could be useful if we want to extract elements
described by a more complex regular expression.

A more accurate, although not complete and definitely not the best,
extraction of nonnegative numbers in Perl may be done as follows

  @a = ("abcde. 11 abc 5.31e+34, (1.45)" =~ m/[0-9]+\.[0-9]+e[+-][0-9]+|[0-9]+\.[0-9]+|[0-9]+/g);
  print join(" ", @a), "\n";

  11 5.31e+34 1.45

Can something similar be done in R, either specifically for numbers or
for a general regular expression?

Going back to the original question, the answer depends on the
complexity of extracting numbers in a concrete situation. If possible,
using functions within R is suggested (gsub(), strsplit(), ...). On the
other hand, there are cases where an external tool can be helpful.
See also R-intro, Chapter 7 "Reading data from files", which says

  There is a clear presumption by the designers of R that you will be
  able to modify your input files using other tools, such as file
  editors or Perl to fit in with the requirements of R.

Petr Savicky.
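
On the question of whether something similar can be done in R by
describing the wanted elements: a minimal sketch, assuming an R version
whose base package provides regmatches() (a function not mentioned in
this thread), is to locate all matches with gregexpr() and then extract
them. The names x, pattern and nums are only illustrative, and the
pattern is the same incomplete one as in the Perl example.

```r
# Input string and pattern taken from the Perl example above.
x <- "abcde. 11 abc 5.31e+34, (1.45)"

# Alternatives are ordered from most to least specific, as in the Perl code.
# Backslashes are doubled because this is an R string literal.
pattern <- "[0-9]+\\.[0-9]+e[+-][0-9]+|[0-9]+\\.[0-9]+|[0-9]+"

# gregexpr() returns the positions and lengths of all matches;
# perl = TRUE asks for Perl-compatible matching.
m <- gregexpr(pattern, x, perl = TRUE)

# regmatches() returns a list with one character vector per input string;
# take the first element because x has length one.
nums <- regmatches(x, m)[[1]]

print(nums)

# The matched substrings can then be converted if numeric values are needed.
vals <- as.numeric(nums)
```

For a character vector with several strings, regmatches(x, m) returns a
list with one vector of matches per string, so the [[1]] would be
dropped.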