It's only "awfully" inefficient if it's a bottleneck.  You're not doing this 
more than once per item fetched from the network, and the time it takes is 
insignificant relative to the fetch.  If it were somehow in your inner loop, 
it would be worth worrying about, but the whole point of this step is to 
eliminate the Ms and Bs so that you never see them again.  If performance is 
a problem, look at your inner loop, not here.
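
To put a rough number on that, here is a small sketch that times the posted
two-gsub function on a made-up vector far larger than a single quote request
would return.  The input data and the vector size are my own assumptions, and
the timing will differ from machine to machine.

```r
# parse.num as posted in the original question
parse.num <- function(s) as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)))

# Made-up input: 30,000 strings (an assumption, only for timing purposes)
x <- rep(c("200.5B", "19.1M", "55.823B"), times = 10000)

# Time both substitutions plus the numeric conversion
timing <- system.time(v <- parse.num(x))
print(timing)
```

If the elapsed time printed here is small compared with the network round
trip, the two substitutions are not where the time goes.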


-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Mike Marchywka
Sent: Tuesday, February 15, 2011 9:01 PM
To: s...@gnu.org; r-h...@stat.math.ethz.ch
Subject: Re: [R] string parsing

----------------------------------------
> To: r-h...@stat.math.ethz.ch
> From: s...@gnu.org
> Date: Tue, 15 Feb 2011 17:20:11 -0500
> Subject: [R] string parsing
>
> I am trying to get stock metadata from Yahoo finance (or maybe there is
> a better source?)

Search the quantmod manual for "yahoo":

http://cran.r-project.org/web/packages/quantmod/quantmod.pdf

As a perennial page scraper, I was amazed this existed :)


> here is what I did so far:
>
> yahoo.url <- "http://finance.yahoo.com/d/quotes.csv?f=j1jka2&s=";
> stocks <- c("IBM","NOIZ","MSFT","LNN","C","BODY","F"); # just some samples
> socket <- url(paste(yahoo.url,sep="",paste(stocks,collapse="+")),open="r");
> data <- read.csv(socket, header = FALSE);
> close(socket);
> data is now:
>
>        V1     V2     V3        V4
> 1  200.5B 116.00 166.25   4965150
> 2   19.1M   3.75   5.47      8521
> 3  226.6B  22.73  31.58  57127000
> 4  886.4M  30.80  74.54    226690
> 5  142.4B   3.21   5.15 541804992
> 6  276.4M  11.98  21.30    149656
> 7 55.823B   9.75  18.97  89369000
>
> now I need to do this:
>
> --> convert 55.823B to 55.823e9 and 19.1M to 19.1e6
>
> parse.num <- function (s) { as.numeric(gsub("M$","e6",gsub("B$","e9",s))); }
> data[1]<-lapply(data[1],parse.num);
>
> this seems awfully inefficient (two regexp substitutions),
> is there a better way?
>
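
The two nested gsub calls work as posted.  If you would rather avoid regular
expressions, one alternative is to look up the trailing letter in a table of
multipliers.  This is only a sketch: parse.suffixed is a hypothetical name,
and the suffix set (K, M, B) is my assumption about what the feed can return.

```r
# Hypothetical helper: strip a trailing unit letter and scale the number.
parse.suffixed <- function(s) {
  s <- as.character(s)                    # read.csv may hand back factors
  mult <- c(K = 1e3, M = 1e6, B = 1e9)    # assumed set of suffixes
  last <- substring(s, nchar(s))          # last character of each string
  has.suffix <- last %in% names(mult)
  # Drop the suffix where there is one, then convert to numeric
  num <- as.numeric(ifelse(has.suffix, substring(s, 1, nchar(s) - 1), s))
  # Scale by the matching multiplier, or by 1 when there is no suffix
  num * ifelse(has.suffix, mult[last], 1)
}

parsed <- parse.suffixed(c("200.5B", "19.1M", "4965150"))
print(parsed)
```

Adding another unit is then a matter of extending mult rather than nesting a
further substitution.
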
> --> iterate over stocks & data at the same time and put the results into
> a hash table:
> for (i in 1:length(stocks)) cache[[stocks[i]]] <- data[i,];
>
> I do get the right results,
> but I am wondering if I am doing it "the right R way".
> E.g., the hash table value is a data frame.
> A structure (record?) seems more appropriate.
>
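
A few ways to get record-like lookup by ticker, as a sketch only.  The data
frame below is retyped from the first three rows of the table above with V1
already converted to numbers, and records and cache are names I picked for
illustration.

```r
stocks <- c("IBM", "NOIZ", "MSFT")
data <- data.frame(V1 = c(200.5e9, 19.1e6, 226.6e9),
                   V2 = c(116.00, 3.75, 22.73),
                   V3 = c(166.25, 5.47, 31.58),
                   V4 = c(4965150, 8521, 57127000))

# Option 1: label the rows by ticker and index the data frame directly
rownames(data) <- stocks
msft.v2 <- data["MSFT", "V2"]

# Option 2: a named list holding one plain list (record) per stock
records <- setNames(
  lapply(seq_len(nrow(data)), function(i) as.list(data[i, ])),
  stocks)
ibm.v3 <- records$IBM$V3

# Option 3: an environment, which is hashed by default, filled from the list
cache <- list2env(records, envir = new.env(hash = TRUE))
noiz.v4 <- cache[["NOIZ"]]$V4
```

Option 1 is often enough when all you need is lookup by name.  The
environment is mainly worth it when the cache has to be updated in place from
inside functions.
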
> thanks!
>
> --
> Sam Steingold (http://sds.podval.org/) on CentOS release 5.3 (Final)
> http://pmw.org.il http://ffii.org http://camera.org http://honestreporting.com
> http://iris.org.il http://mideasttruth.com http://thereligionofpeace.com
> I haven't lost my mind -- it's backed up on tape somewhere.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
