subject:"[R] retrieve certain part from html"

Re: [R] retrieve certain part from html

2009-09-23 Thread Tony Breyal

Maybe you could modify the following to suit your situation (I use this XPath expression to get links from Google):

?htmlTreeParse
?getNodeSet

> library(XML)
> link <- 'http://www.google.co.uk/search?hl=en&client=firefox-a&rls=org.mozilla:en-GB:official&hs=2XR&ei=mxa6SojjOeaMjAfJkcDuBQ&sa=X&oi

Re: [R] retrieve certain part from html

2009-09-23 Thread Henrique Dallazuanna

Try using the XML package:

Lines <- "2005-012006-012007-012008-012009-01"
library(XML)
xpathApply(htmlParse(Lines), "//a", xmlAttrs)

On Wed, Sep 23, 2009 at 9:29 AM, Rene wrote:
> Dear All,
>
> Can someone please guide me how to get the certain part from a long html
> language?
>
> e.g.
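The XML-package approach in this reply can be sketched end to end as follows. The markup from the original post does not survive in this archive, so the `html` string below is an assumed stand-in (a table row of links), and the variable names `doc`, `hrefs` and `labels` are only illustrative. `htmlParse`, `xpathSApply`, `xmlGetAttr` and `xmlValue` are functions from the XML package.

```r
library(XML)

# Assumed sample input: the real markup from the original post is not available.
html <- paste0(
  "<table><tr>",
  "<td><a href=\"2005-01.html\">2005-01</a></td>",
  "<td><a href=\"2006-01.html\">2006-01</a></td>",
  "<td><a href=\"2007-01.html\">2007-01</a></td>",
  "</tr></table>"
)

# Parse the string as HTML (asText = TRUE: treat it as content, not a file name).
doc <- htmlParse(html, asText = TRUE)

# Select every <a> node and pull out its href attribute.
hrefs <- xpathSApply(doc, "//a", xmlGetAttr, "href")

# The visible link text can be extracted the same way.
labels <- xpathSApply(doc, "//a", xmlValue)

print(hrefs)
print(labels)
```

Using `xpathSApply` rather than `xpathApply` simplifies the result to a character vector instead of a list, which is usually easier to work with afterwards.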

Re: [R] retrieve certain part from html

2009-09-23 Thread Romain Francois

Hi,

The R4X package can help you. (I have wrapped your td's into one tr)

> x <- xml( "2005-012006-012007-012008-012009-01" )
> x["td/a/#"]
       td        td        td        td        td
"2005-01" "2006-01" "2007-01" "2008-01" "2009-01"
> x["td/a/@href"]
td td

Re: [R] retrieve certain part from html

2009-09-23 Thread Rene

Dear All,

Can someone please guide me on how to extract a certain part from a long piece of HTML?

e.g. "2005-012006-012007-012008-012009-01"

How can I get only the text "2005-01.html", "2006-01.html", "2007-01.html", "2008-01.html", "2009-01.html" from the above HTML code? I have tr
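For a quick one-off extraction like the one asked about here, a base-R regular expression may be enough. This is only a sketch and not a method proposed in the thread: the `html` string is an assumed example (the original markup is not preserved), and a regex of this kind is brittle compared with a real HTML parser, for instance when attributes use single quotes.

```r
# Assumed sample input standing in for the markup in the question.
html <- paste0(
  "<td><a href=\"2005-01.html\">2005-01</a></td>",
  "<td><a href=\"2006-01.html\">2006-01</a></td>"
)

# Find every href="..." attribute in the string.
m <- gregexpr("href=\"[^\"]*\"", html)
attrs <- regmatches(html, m)[[1]]

# Strip the leading href=" and the trailing quote to leave the file names.
files <- sub("\"$", "", sub("^href=\"", "", attrs))

print(files)
```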