Re: [R] Grap Element from Web Page

2013-08-15 Thread Sparks, John James
Thanks, the second approach worked fine on Windows. --JJS On Thu, August 15, 2013 8:38 am, Jeffrey Dick wrote: > Sorry, I can't generate an error when running those commands in R on Linux > 64-bit. But if I move to Windows (R version 3.0.1, XML_3.98-1.1), I get a > different error ... > >> requir

Re: [R] Grap Element from Web Page

2013-08-15 Thread Jeffrey Dick
Sorry, I can't generate an error when running those commands in R on Linux 64-bit. But if I move to Windows (R version 3.0.1, XML_3.98-1.1), I get a different error ... > require(XML) Loading required package: XML > doc <- htmlTreeParse(" http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Searc

Re: [R] Grap Element from Web Page

2013-08-14 Thread Sparks, John James
Thanks so much for looking into this for me. Unfortunately, I get an error when I execute your code. Is there a library that you loaded that I haven't? require(scrapeR) require(XML) require(RCurl) doc<-htmlTreeParse("http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&acti

Re: [R] Grap Element from Web Page

2013-08-14 Thread Jeffrey Dick
Hi, There are many occurrences of the CIK number in the page source. This pulls out the first node containing it: node <- getNodeSet(doc[[1]], "//link[@rel='alternate']" ) From there you can extract the number. Here's one way to do it. strsplit(strsplit(unlist(node)[[5]], "CIK=")[[1]][2], "&ty
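
For readers following along without network access, the string-splitting step can be tried on its own. This is only a sketch under assumptions: `href` is a hypothetical attribute value of the kind the `<link rel="alternate">` node might carry, the CIK in it is a placeholder rather than a value from this thread, and splitting on a bare "&" is a choice made here because the second delimiter in the preview above is cut off.

```r
# Hypothetical href string; the CIK value is a placeholder, not from the thread.
href <- "/cgi-bin/browse-edgar?action=getcompany&CIK=0000000000&type=&dateb="

# Split on "CIK=" and keep the part that follows it.
after_cik <- strsplit(href, "CIK=", fixed = TRUE)[[1]][2]

# Cut at the next "&" so only the number remains (assumed delimiter).
cik <- strsplit(after_cik, "&", fixed = TRUE)[[1]][1]

# Same extraction in one step with a regular expression.
cik_re <- sub(".*CIK=([0-9]+).*", "\\1", href)

print(cik)
print(cik_re)
```

Both approaches return the digits as a character string, which keeps any leading zeros; convert with as.numeric() only if the zeros do not matter.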

[R] Grap Element from Web Page

2013-08-13 Thread Sparks, John James
Dear R Helpers, I would like to pull the CIK number from the web page http://www.sec.gov/cgi-bin/browse-edgar?CIK=MSFT&Find=Search&owner=exclude&action=getcompany If you put this web page into your browser you will see the CIK number in red on the left side of the page near the top. When I try