Re: [R] R newbie: how to replace string/regular expression

Charles C. Berry Sun, 02 Nov 2008 13:58:45 -0800



Gabor,

Why not just this:

        expos <- list( B="e9", M="e6", m="e6", k="e3" )
        as.numeric( gsubfn("[[:alpha:]]", expos, d ) )

HTH,

Chuck

p.s. I am not sure why B goes with e6 or K with e-02 (below), but Krishna can adjust the values accordingly.
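
For readers without the gsubfn package, the same lookup idea can be sketched in base R. This is only an illustration, not the code posted in the thread: the exponent table reuses the values from the snippet above, and the names num, suffix, expo and result are made up for this example.

```r
# Sample input from the original question.
d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

# Exponent table with the same values as the list above; adjust as needed.
expos <- c(B = "e9", M = "e6", m = "e6", k = "e3")

# Split each string into its numeric part and its (possibly empty) suffix.
num    <- sub("[[:alpha:]]$", "", d)
suffix <- sub("^[^[:alpha:]]*", "", d)

# Look up the exponent for each suffix; entries without a suffix get "".
expo <- ifelse(suffix == "", "", expos[suffix])

# "120.0" and "e6" become "120.0e6", which as.numeric() can parse.
result <- as.numeric(paste0(num, expo))
```

An unknown suffix would yield NA here, so it may be worth checking anyNA(result) on real data.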



On Sun, 2 Nov 2008, Gabor Grothendieck wrote:

There was an error in your regexp which I did not correct. Here it is
again corrected to better illustrate the solution:

gsubfn("(.*)B", ~ as.numeric(x) * 10e6, d, ignore.case = TRUE)

[1] "120.0M"    "11.01m"    "2.097e+09" "100.00k"   "50"

On Sun, Nov 2, 2008 at 7:55 AM, Gabor Grothendieck
<[EMAIL PROTECTED]> wrote:

Your gsub example is almost exactly what gsubfn in the gsubfn package
does.  gsubfn is like gsub except that the replacement can be a function:

library(gsubfn)
gsubfn("(.*)B$", ~ as.numeric(x) * 10e6, d, ignore.case = TRUE)

[1] "120.0M"    "11.01m"    "2.097e+09" "100.00k"   "50"

Also there are examples very similar to this:

1. at the end of section 2 of
vignette("gsubfn")

2. in
demo("gsubfn-si")

Also see the gsubfn home page:
http://gsubfn.googlecode.com

Also note that if you want to return the values rather than
transform and reinsert them then strapply in the same package
can do that.
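
The "return the values rather than reinsert them" idea can also be sketched in base R with regexpr() and regmatches(). This is a substitute for strapply, not a demonstration of it, and the variable names m and values are made up for the illustration.

```r
# Sample input from the original question.
d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

# Locate the first run of digits and dots in each element.
m <- regexpr("[0-9.]+", d)

# Pull out the matched text and convert it to numbers.
values <- as.numeric(regmatches(d, m))
```

Note that regmatches() silently drops elements with no match, so values can be shorter than d when the input contains non-numeric entries.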

On Sun, Nov 2, 2008 at 3:43 AM, Krishna Dagli/Krushna Dagli
<[EMAIL PROTECTED]> wrote:

Hello;

I am an R newbie and would like to know the correct and efficient method
for doing string replacement.

I have a large data set in which I want to replace the suffixes "M", "B",
and "K" (currency in millions, billions and thousands) so that everything
is expressed in millions.  That is, 209.7B becomes (209.7 * 10e6),
100.00K becomes (100.00 * 1/100), and so on.

d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

This works, that is, it removes "b/B":

gsub ("(.*)(B$)", "\\1", d, ignore.case=T, perl=T)

but

gsub ("(.*)(B$)", as.numeric("\\1") * 10e6, d, ignore.case=T, perl=T)

does not work. I tried sprintf and other combinations with as.numeric, but
they all fail. How can I use \\1 and multiply it by 10e6?

The other solution is :

location <- grep ("M", d, ignore.case=T)
y <- sub("M", "", d, ignore.case=T)
y[location]<-y[location] * 10e6

Is the second solution faster, or is a combination of grep with
multiplication (if it works) faster? Or what is the most efficient method
to do something like this in R?

Thanks and Regards
Krishna
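
A short base-R sketch may help explain both attempts in the question. It assumes the sample vector d from the post; the names early, is_m and y are only illustrative. The multiplier 10e6 is kept from the post, although in R 10e6 equals 1e7, not one million.

```r
# Sample input from the question.
d <- c("120.0M", "11.01m", "209.7B", "100.00k", "50")

# R evaluates the replacement argument before gsub() looks at any match,
# so "\\1" is still a literal string here and the product is NA.
early <- suppressWarnings(as.numeric("\\1") * 10e6)

# Working variant of the grep/sub idea for the "M" suffix:
# convert to numeric before multiplying, and only where the suffix occurs.
is_m <- grepl("M$", d, ignore.case = TRUE)
y <- d
y[is_m] <- as.numeric(sub("M$", "", d[is_m], ignore.case = TRUE)) * 10e6
```

Because y is a character vector, the products are stored back as strings; the other suffixes would need the same treatment before the whole vector can be converted with as.numeric().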

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
