That website uses JavaScript to submit the form (and doesn't work in
Chrome). You could build a JavaScript interpreter in R, have it parse the
page, and then call the page's own JavaScript to submit the form, but R just
isn't the right tool for that type of interaction.

Performing the task as you describe it is possible, just not
reasonable with R. There are better tools for automating web pages, such
as Automato [1] or Sikuli [2].

Better still would be to query the site directly. Checking the source of the
page shows that each report type comes from a different URL. Passing
that URL arguments of the form:

beginDate=03012012&endDate=03032012&SelectFormat=CSV

returns the values for March 1st to 3rd, 2012 as a CSV. To find the
URLs of interest, view the page source and search for "Select a Report".
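
As a rough sketch of the direct-query approach in R: the parameter names
below come from the example above, and the MMDDYYYY date format is inferred
from "03012012" meaning March 1st, 2012. The helper name build_query and the
report_url placeholder are my own and hypothetical; substitute whichever
report URL you find in the page source. I'm also assuming the site accepts
these arguments as an ordinary GET query string, which I haven't verified.

```r
# Build the query string for one report request.
# Dates are formatted as MMDDYYYY (an assumption inferred from the example).
build_query <- function(begin, end, select_format = "CSV") {
  paste0(
    "beginDate=", format(as.Date(begin), "%m%d%Y"),
    "&endDate=", format(as.Date(end), "%m%d%Y"),
    "&SelectFormat=", select_format
  )
}

# Hypothetical placeholder: replace with the report URL from the page source.
report_url <- "REPORT_URL_FROM_PAGE_SOURCE"

query <- build_query("2012-03-01", "2012-03-03")
full_url <- paste0(report_url, "?", query)

# Once report_url points at a real report, something like this should read
# the data; whether header lines need skipping depends on the returned file.
# pool <- read.csv(full_url)
```

Keeping the dates as arguments means you can loop over weeks to build up the
database rather than editing the URL by hand each time.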

Easier still might be to contact AESO and ask them for the data.

[1] http://automa.to/
[2] http://sikuli.org/

-Tyler

On Mon, Mar 5, 2012 at 10:38 AM, Guang Dai <guang....@albertamsa.ca> wrote:

> hi all,
> I'm working on scraping some website data to build a database.
> In most cases, I can use the XML package to get the dataset.
> However, some websites don't give an explicit address for the
> downloaded tables.
>
> To be more specific, for example, I'm interested in the website
> http://ets.aeso.ca/
> The data we are scraping is the "Pool Weekly Summary" under the category
> of "Historical".
> However, after clicking "Historical" and choosing the "Pool Weekly Summary"
> item on the website,
> the address is always http://ets.aeso.ca/ and doesn't change.
>
> In this case, I guess I need to tell R to first click the "Historical" button
> and then choose the item before
> scraping the data. But the question is how?
>
> Any suggestions are welcome.
> Guang
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
