Re: [R] Downloading data from the internet
Bogaso wrote: Thanks Duncan for your input. However, I could not install the package RHTMLForms; it is reported as not available:

install.packages("RHTMLForms", repos = "http://www.omegahat.org/R")
Warning in install.packages("RHTMLForms", repos = "http://www.omegahat.org/R") :
  argument 'lib' is missing: using 'C:\Users\Arrun's\Documents/R/win-library/2.9'
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package 'RHTMLForms' is not available

I found the package on the net: http://www.omegahat.org/RHTMLForms/ However, it is a .gz file, which I cannot use as I am a Windows user. Can you please provide an alternative source?

Hi Bogaso. Yes, I made the package available in source form with the expectation that people who were interested in using it would find out how to build it for themselves. I have now made a binary version of the package available for R-2.9.*, so install.packages() will work for you on Windows.

However, you can use the source form of the package as a Windows user; you just have to install it. That involves finding out how to do so (either with Uwe's Windows package-building service or by installing the tools that Brian Ripley and Duncan Murdoch have spent time making easy to use).

Generally (i.e. not pointing fingers at anyone in particular), I do wish Windows users would learn how to do these things for themselves and not put a further burden on the people who provide them with free software and free advice by also asking for binary versions of easily installed packages. It does take time for us to maintain different operating systems and to create binaries. Running Windows and not being able to install R packages from source is a choice, not a technical limitation.

D.
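A minimal sketch of the retry, assuming the Windows binary Duncan mentions is now on the Omegahat repository and R 2.9.x as in the warnings above:

# The original call, unchanged; with a binary now available it should
# install cleanly on Windows.
install.packages("RHTMLForms", repos = "http://www.omegahat.org/R")
library(RHTMLForms)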
Re: [R] Downloading data from the internet
Duncan Temple Lang wrote: However, you can use the source form of the package as a Windows user; you just have to install it. That involves finding out how to do so (either with Uwe's Windows package-building service or by installing the tools that Brian Ripley and Duncan Murdoch have spent time making easy to use).

As a footnote to this, the tools required to enable package building on Windows are available at:

http://www.murdoch-sutherland.com/Rtools/

Download and run the installer for your version of R. Make sure you allow the installer to modify your PATH. After installing the tools, you should be able to build and install most packages from within R via:

install.packages("packageName", type = "source")

Duncan Temple Lang wrote: Generally (i.e. not pointing fingers at anyone in particular), I do wish Windows users would learn how to do things for themselves and not put a further burden on the people who provide them with free software and free advice by also asking for binary versions of easily installed packages. It does take time for us to maintain different operating systems and to create binaries. Running Windows and not being able to install R packages from source is a choice, not a technical limitation. D.

I echo this sentiment as well -- but personally I believe this is mostly a symptom of Microsoft's decision to provide such a sorry excuse for a command line in Windows. Most Windows users never even consider building from source because it's not something their operating system is capable of doing out of the box. The problem is further exacerbated by the fact that most IT departments go to ridiculous lengths to lock their users out of Windows in an attempt to secure it. For example, I couldn't install Rtools on my workstation at the university even if I wanted to -- luckily all of our computers can dual-boot into Linux. The lack of a decent command line stocked with common tools, such as Perl and a C compiler, is the main reason I consider Windows an operating system of last resort. Here endeth the rant.

-Charlie

Charlie Sharpsteen
Undergraduate Environmental Resources Engineering
Humboldt State University
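Putting the two pieces together, a sketch of a source install of RHTMLForms once Rtools is installed and on the PATH (the repository URL is the one Duncan gave; type = "source" forces a build from the source archive):

# Requires Rtools on the PATH; restart R after installing Rtools.
install.packages("RHTMLForms", repos = "http://www.omegahat.org/R",
                 type = "source")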
Re: [R] Downloading data from the internet
Here are three different approaches.

1. Using the first link as an example: on Windows you can copy the data and headers from IE (this won't work in Firefox) to Excel, from there to the clipboard again, and then in R:

library(zoo)
DF <- read.delim("clipboard")
z <- zooreg(c(t(DF[5:1, 2:13])), start = as.yearmon("2005-01"), freq = 12)

2. On any platform, you can read it straight into R:

L <- readLines("http://www.rateinflation.com/consumer-price-index/usa-cpi.php")

and then use the character-manipulation functions (grep, sub, gsub, substr) and as.numeric to parse out the data; or

3. On any platform, use the XML package, adapting the code in this post:

https://stat.ethz.ch/pipermail/r-help/2009-July/203063.html

On Thu, Sep 24, 2009 at 9:34 AM, Bogaso bogaso.christo...@gmail.com wrote: Hi all, I want to download data from these two different sources directly into R: http://www.rateinflation.com/consumer-price-index/usa-cpi.php http://eaindustry.nic.in/asp2/list_d.asp The first is the CPI of the US and the second is the WPI of India. Can anyone please give me any clue how to download them directly into R? I want to make them zoo objects for further analysis. Thanks,
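To make approach 2 concrete, here is a hedged sketch of the character-manipulation step. The regular expressions are illustrative guesses about the page's markup -- inspect L first and adjust them to the actual HTML:

L <- readLines("http://www.rateinflation.com/consumer-price-index/usa-cpi.php")

# Hypothetical parsing: keep lines whose first cell is a four-digit year,
# strip the tags, split on whitespace, and coerce the tokens to numeric.
rows  <- grep("<td>(19|20)[0-9]{2}</td>", L, value = TRUE)
cells <- strsplit(gsub("<[^>]+>", " ", rows), "[[:space:]]+")
vals  <- lapply(cells, function(x) as.numeric(x[nzchar(x)]))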
Re: [R] Downloading data from the internet
Thank you so much for the help. However, I need a little more help. On the site http://www.rateinflation.com/consumer-price-index/usa-historical-cpi.php, if I scroll down there is an option "Historical CPI Index For USA". If I then click on "Get Data", another table pops up, but without any significant change in the address bar. This table holds more data, starting from 1999. Can you please help me get the values of this table?

Thanks

Duncan Temple Lang wrote: Thanks for explaining this, Charlie. Just for completeness, and to make things a little easier, the XML package has a function named readHTMLTable(); you can call it with a URL and it will attempt to read all the tables in the page.

tbls = readHTMLTable('http://www.rateinflation.com/consumer-price-index/usa-cpi.php')

yields a list with 10 elements, and the table of interest with the data is the 10th one:

tbls[[10]]

The function does the XPath voodoo and sapply() work for you and uses some heuristics. There are various controls one can specify, and also various methods for working with sub-parts of the HTML document directly. D.
Re: [R] Downloading data from the internet
Bogaso wrote: Thank you so much for the help. However, I need a little more help. On the site http://www.rateinflation.com/consumer-price-index/usa-historical-cpi.php, if I scroll down there is an option "Historical CPI Index For USA". If I then click on "Get Data", another table pops up, but without any significant change in the address bar. This table holds more data, starting from 1999. Can you please help me get the values of this table?

Hi again

Well, this is a little more involved, as this is an HTML form, and so we need to emulate submitting the form with values for the different parameters it expects, along with ensuring they are correct inputs. Ordinarily, this would involve looking at the source of the HTML document, finding the relevant form element, getting its action attribute and all its inputs, and figuring out the possible values. This is straightforward but involved. But we have an R package that does this reasonably well in an automated fashion. This is RHTMLForms, from the www.omegahat.org/R repository. We can use it with

install.packages("RHTMLForms", repos = "http://www.omegahat.org/R")

Then

library(RHTMLForms)
ff = getHTMLFormDescription("http://www.rateinflation.com/consumer-price-index/usa-historical-cpi.php")

# The form we want is the third one. We can determine this
# from the names of the parameters.
# So we request that this form description be turned into an R function:
g = createFunction(ff[[3]])

# Now we call this:
xx = g(2001, 2008)

# This returns the content of an HTML document, so we parse it and then
# pass it to readHTMLTable(). (This is why readHTMLTable() has methods
# for already-parsed documents.)
library(XML)
doc = htmlParse(xx, asText = TRUE)
tbls = readHTMLTable(doc)

# We want the last of the tables:
tbls[[length(tbls)]]

So hopefully that helps solve your problem and introduces another Omegahat package that we hope people find through Google. The RHTMLForms package is an approach to the poor man's Web services - HTML forms - rather than REST and SOAP, which are becoming more relevant each day. The RCurl and SSOAP packages address the latter.

D.
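Since the original goal was a zoo object, a sketch of the last step, combining Duncan's readHTMLTable() result with Gabor's zoo idiom from earlier in the thread. The column layout (year in column 1, twelve monthly values in columns 2:13, oldest row first) is an assumption about the returned table -- check str(tbl) before relying on it:

library(zoo)

tbl <- tbls[[length(tbls)]]

# readHTMLTable() returns factor columns, hence as.character() first.
first.year <- as.numeric(as.character(tbl[1, 1]))

# Row-major flattening: Jan..Dec within each year, year by year.
vals <- as.numeric(t(as.matrix(tbl[, 2:13])))

z <- zooreg(vals, start = as.yearmon(first.year), frequency = 12)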
Re: [R] Downloading data from the internet
Thanks Duncan for your input. However, I could not install the package RHTMLForms; it is reported as not available:

install.packages("RHTMLForms", repos = "http://www.omegahat.org/R")
Warning in install.packages("RHTMLForms", repos = "http://www.omegahat.org/R") :
  argument 'lib' is missing: using 'C:\Users\Arrun's\Documents/R/win-library/2.9'
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
  package 'RHTMLForms' is not available

I found the package on the net: http://www.omegahat.org/RHTMLForms/ However, it is a .gz file, which I cannot use as I am a Windows user. Can you please provide an alternative source?

Thanks,
Re: [R] Downloading data from the internet
Bogaso wrote: Hi all, I want to download data from these two different sources directly into R: http://www.rateinflation.com/consumer-price-index/usa-cpi.php http://eaindustry.nic.in/asp2/list_d.asp The first is the CPI of the US and the second is the WPI of India. Can anyone please give me any clue how to download them directly into R? I want to make them zoo objects for further analysis. Thanks,

The following site did not load for me:

http://eaindustry.nic.in/asp2/list_d.asp

But I was able to extract the table from the US CPI site using Duncan Temple Lang's XML package:

library(XML)

First, download the website into R:

html.raw <- readLines("http://www.rateinflation.com/consumer-price-index/usa-cpi.php")

Then convert it to an HTML object using the XML package:

html.data <- htmlTreeParse(html.raw, asText = TRUE, useInternalNodes = TRUE)

A quick scan of the page source in the browser reveals that the table you want is encased in a div with a class of "dynamicContent" -- we will use an XPath specification[1] to retrieve all rows in that table:

table.html <- getNodeSet(html.data, "//div[@class='dynamicContent']/table/tr")

Now the data values can be extracted from the cells in the rows using a little sapply() and xpathSApply() voodoo:

table.data <- t(sapply(table.html, function(row) {
  row.data <- xpathSApply(row, "./td", xmlValue)
  return(row.data)
}))

Good luck!

-Charlie

[1]: http://www.w3schools.com/XPath/xpath_syntax.asp

Charlie Sharpsteen
Undergraduate Environmental Resources Engineering
Humboldt State University
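The matrix Charlie builds is all character; a hedged sketch of one way to finish, assuming (verify by printing table.data[1, ]) that the first extracted row is the header:

# Assumed: row 1 holds the column headers, column 1 holds the year.
hdr <- table.data[1, ]
cpi <- as.data.frame(table.data[-1, ], stringsAsFactors = FALSE)
names(cpi) <- hdr

# Everything except the year/label column becomes numeric.
cpi[-1] <- lapply(cpi[-1], as.numeric)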
Re: [R] Downloading data from the internet
Thanks for explaining this, Charlie. Just for completeness, and to make things a little easier, the XML package has a function named readHTMLTable(); you can call it with a URL and it will attempt to read all the tables in the page.

tbls = readHTMLTable('http://www.rateinflation.com/consumer-price-index/usa-cpi.php')

yields a list with 10 elements, and the table of interest with the data is the 10th one:

tbls[[10]]

The function does the XPath voodoo and sapply() work for you and uses some heuristics. There are various controls one can specify, and also various methods for working with sub-parts of the HTML document directly.

D.
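To illustrate the "various controls", a sketch using two of readHTMLTable()'s arguments: 'which' to pull only the table of interest, and 'colClasses' to request numeric columns up front. The 13-column layout (a year column plus twelve monthly columns) is an assumption about this page -- see ?readHTMLTable:

library(XML)

u <- "http://www.rateinflation.com/consumer-price-index/usa-cpi.php"
cpi <- readHTMLTable(u, which = 10,
                     colClasses = c("character", rep("numeric", 12)))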