[R] extract tables as data.frames from HTML source
Hi,

I wonder whether there is any convenient function (or package) to extract tables from an HTML page, e.g. from http://www.google.com/finance/historical?q=SHE:002251

I know we can readLines('URL'), gsub('td...', '...', source), ... and eventually get the numbers; I'm writing to ask whether someone has already contributed a more general function (with the package XML or other packages).

Thanks!

Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
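The manual readLines/gsub route described in the question can be sketched roughly as follows. This is only an illustration in base R: the inline HTML is made up and stands in for the result of readLines() on a real page, and the names `src`, `cells_of`, and `prices` are hypothetical. The regular expressions assume a flat table with no nested tables and no thead element, so they are fragile on real-world markup.

```r
# Made-up HTML standing in for readLines(url); not the real Google Finance page
src <- c(
  "<table>",
  "<tr><th>Date</th><th>Close</th></tr>",
  "<tr><td>2009-04-01</td><td>10.5</td></tr>",
  "<tr><td>2009-04-02</td><td>10.8</td></tr>",
  "</table>"
)

# Collapse to one string so a row can never be split across lines
html <- paste(src, collapse = "")

# Extract every <tr>...</tr> block (lazy match, one row at a time)
rows <- regmatches(html, gregexpr("<tr[^>]*>.*?</tr>", html, perl = TRUE))[[1]]

# Hypothetical helper: split one row into the text of its th/td cells
cells_of <- function(row) {
  cells <- regmatches(row, gregexpr("<t[hd][^>]*>.*?</t[hd]>", row, perl = TRUE))[[1]]
  trimws(gsub("<[^>]+>", "", cells))  # strip the tags, keep the cell text
}
cell_list <- lapply(rows, cells_of)

# First row supplies the column names, the rest become the data
prices <- as.data.frame(do.call(rbind, cell_list[-1]), stringsAsFactors = FALSE)
names(prices) <- cell_list[[1]]
prices$Close <- as.numeric(prices$Close)

print(prices)
```

Every column comes back as character, so numeric columns have to be converted by hand, which is part of why a more general, parser-based function is attractive.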
Re: [R] extract tables as data.frames from HTML source
Yihui Xie wrote:
> I wonder whether there is any convenient function (or package) to extract tables from a HTML page? e.g. from http://www.google.com/finance/historical?q=SHE:002251

Try a search on R (I prefer markmail search):
http://r-project.markmail.org/search/?q=extract%20html

Dieter
Re: [R] extract tables as data.frames from HTML source
If the question is specific to getting stock data from Google Finance, then check out getSymbols.google in the quantmod package. Also note that there is an r-sig-finance list for questions pertaining to R and finance.

On Fri, Apr 3, 2009 at 2:18 AM, Yihui Xie xieyi...@gmail.com wrote:
> Hi, I wonder whether there is any convenient function (or package) to extract tables from a HTML page? e.g. from http://www.google.com/finance/historical?q=SHE:002251
> I know we can readLines('URL'), gsub('td...', '...', source), ... and at last get the numbers; I'm writing to ask whether someone has already contributed a more general function (with the package XML or other packages).
> Thanks!
> Regards, Yihui
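For the general case raised in the original post, the XML package mentioned there is one possible route. A minimal sketch, assuming the XML package is installed and that its htmlParse() and readHTMLTable() functions are available in the installed version; the inline HTML is made up for illustration and is not the Google Finance page, and `html_text` and `prices` are hypothetical names.

```r
library(XML)  # assumption: installed, e.g. via install.packages("XML")

# Made-up page content used in place of a real URL
html_text <- '<html><body>
<table>
<tr><th>Date</th><th>Close</th></tr>
<tr><td>2009-04-01</td><td>10.5</td></tr>
<tr><td>2009-04-02</td><td>10.8</td></tr>
</table>
</body></html>'

# Parse the string as HTML, then convert every <table> into a data.frame
doc <- htmlParse(html_text, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# readHTMLTable() returns a list with one element per table on the page
prices <- tables[[1]]
prices$Close <- as.numeric(prices$Close)  # cells are read as text

str(prices)
```

On a real page the list usually holds several tables (layout tables included), so some inspection is needed to pick the right element.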