Re: [R-sig-Geo] Extract CRU data

2023-01-25 Thread Miluji Sb
That indeed was the most efficient solution. Thank you!

On Tue, Jan 24, 2023 at 1:45 PM Barry Rowlingson wrote:

> Are you asking if there's a way to automate the download of a list of
> links from that page? You could write an R script to get the HTML, then
> find all the HTML <a> tags, and then get the URLs in the link addresses,
> and there are packages for doing this kind of web scraping.
>
> But for this kind of thing it might be easier to use a web browser add-on
> - I have "Down Them All" set up on Firefox, and with a click or two I can
> get a list of all the link URLs and hit a button that downloads everything
> to a single folder. Once done, I can use standard R functions to list all
> the downloaded files and read them. Took about 20 seconds to do for this
> page, and now I have a folder of 292 .tmp.per files.
>
> Barry

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


Re: [R-sig-Geo] Extract CRU data

2023-01-24 Thread Grzegorz Sapijaszko
On Tue, 2023-01-24 at 12:13 +0100, Miluji Sb wrote:
> Greetings everyone,
> 
> I have a question on extracting country-level data from CRU (
> https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/
> ).


Something like:

To get all links/filenames in one table:

a <- rvest::read_html("https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/")

tbl <- a |>
  rvest::html_table() |>
  as.data.frame()

tbl <- tbl[-c(1,2),]

To download them all to a specific directory:

my_download_function <- function(myurl = "", output_dir = "data") {
  # create the output directory on first use
  if (!dir.exists(output_dir)) dir.create(output_dir)
  .destfile <- file.path(output_dir, myurl)
  .myurl <- paste0("https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/", myurl)
  # method = "wget" needs wget on the PATH; "-c" resumes partial downloads
  download.file(url = .myurl, destfile = .destfile, method = "wget",
                extra = "-c --progress=bar:force")
  NULL
}

invisible(lapply(seq_len(nrow(tbl)), function(i)
  my_download_function(tbl[i, 1], "data")))

Now, having the files locally, you can read them one by one with read.csv,
like:

f <- list.files(path = "data", pattern = "^crucy", full.names = TRUE)
read.csv(f[1], skip = 3, header = TRUE)  # first file; loop over f for the rest

The tables aren't much use until you add the country/territory to each one,
but at least you have a starting point.
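
One way to add the country/territory while merging is sketched below. It
is only a sketch under assumptions: the helper names are made up for
illustration, the country is assumed to be the last part of the file name
before ".tmp.per", and the files are read with the same
read.csv(skip = 3, header = TRUE) call as above. If the files turn out to
be whitespace-delimited, read.table would be the call to swap in.

```r
# Hypothetical helper: take the country label from the file name, assuming
# the name ends in "<Country>.tmp.per" (an assumption, not checked against
# every file on the server).
country_from_file <- function(path) {
  sub("^.*\\.([^.]+)\\.tmp\\.per$", "\\1", basename(path))
}

# Read one file and tag every row with its country.
read_country_file <- function(path, skip = 3) {
  d <- read.csv(path, skip = skip, header = TRUE)
  cbind(country = country_from_file(path), d)
}

# Read every downloaded file in a directory and stack the tables.
merge_country_files <- function(dir = "data") {
  f <- list.files(path = dir, pattern = "\\.tmp\\.per$", full.names = TRUE)
  do.call(rbind, lapply(f, read_country_file))
}
```

Calling merge_country_files("data") would then give one long table with a
leading country column, provided all files share the same columns.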

Regards,
Grzegorz



Re: [R-sig-Geo] Extract CRU data

2023-01-24 Thread Barry Rowlingson
Are you asking if there's a way to automate the download of a list of links
from that page? You could write an R script to get the HTML, then find all
the HTML <a> tags, and then get the URLs in the link addresses, and there
are packages for doing this kind of web scraping.
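
A minimal sketch of that script route, using base R only rather than a
scraping package: it pulls the href values out of the page text with a
regular expression, which is usually enough for a plain directory listing
but is not a general HTML parser. The function name extract_links is made
up for illustration, and it is an assumption that the links in the listing
are relative file names ending in ".tmp.per".

```r
# Pull the href targets out of a page's HTML with a regular expression.
extract_links <- function(html) {
  html <- paste(html, collapse = "\n")
  m <- regmatches(html, gregexpr('href="[^"]+"', html, ignore.case = TRUE))[[1]]
  # strip the trailing quote, then the leading href="
  sub('^href="', "", sub('"$', "", m), ignore.case = TRUE)
}

base_url <- "https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/"

# Not run here, since it needs network access:
# page  <- readLines(base_url, warn = FALSE)
# links <- extract_links(page)
# files <- grep("\\.tmp\\.per$", links, value = TRUE)  # keep data files only
# urls  <- paste0(base_url, files)                     # assumes relative links
```

The resulting urls vector can then be passed to download.file one element
at a time.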

But for this kind of thing it might be easier to use a web browser add-on -
I have "Down Them All" set up on Firefox, and with a click or two I can get
a list of all the link URLs and hit a button that downloads everything to a
single folder. Once done, I can use standard R functions to list all the
downloaded files and read them. Took about 20 seconds to do for this page,
and now I have a folder of 292 .tmp.per files.

Barry


On Tue, Jan 24, 2023 at 11:13 AM Miluji Sb wrote:

> Greetings everyone,
>
> I have a question on extracting country-level data from CRU (
>
> https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/
> ).
> The data for each variable are available for individual countries and I am
> struggling to download all of them. Can I extract all the files in R then
> merge? Thanks so much.
>
> Best,
>
> Milu