Re: [R-sig-Geo] Extract CRU data
That indeed was the most efficient solution. Thank you!

On Tue, Jan 24, 2023 at 1:45 PM Barry Rowlingson wrote:

> Are you asking if there's a way to automate the download of a list of
> links from that page? You could write an R script to get the HTML, then
> find all the HTML link (<a>) tags, and then get the URLs in the link
> addresses, and there are packages for doing this kind of web scraping.
>
> But for this kind of thing it might be easier to use a web browser add-on
> - I have "Down Them All" set up on Firefox, and with a click or two I can
> get a list of all the link URLs and hit a button that downloads everything
> to a single folder. Once done, I can use standard R functions to list all
> the downloaded files and read them. Took about 20 seconds to do for this
> page, and now I have a folder of 292 .tmp.per files.
>
> Barry
>
> On Tue, Jan 24, 2023 at 11:13 AM Miluji Sb wrote:
>
>> Greetings everyone,
>>
>> I have a question on extracting country-level data from CRU (
>> https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/
>> ).
>> The data for each variable are available for individual countries and I am
>> struggling to download all of them. Can I extract all the files in R then
>> merge? Thanks so much.
>>
>> Best,
>>
>> Milu

___
R-sig-Geo mailing list
R-sig-Geo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
Re: [R-sig-Geo] Extract CRU data
On Tue, 2023-01-24 at 12:13 +0100, Miluji Sb wrote:
> Greetings everyone,
>
> I have a question on extracting country-level data from CRU (
> https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/
> ).

Something like this. To get all links/filenames in one table:

    a <- rvest::read_html("https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/")
    tbl <- a |> rvest::html_table() |> as.data.frame()
    # drop the first two rows of the directory listing
    tbl <- tbl[-c(1, 2), ]

To download them all to a specific directory:

    my_download_function <- function(myurl = "", output_dir = "data") {
      # create the output directory if it does not exist yet
      if (!dir.exists(output_dir)) dir.create(output_dir)
      .destfile <- file.path(output_dir, myurl)
      .myurl <- paste0("https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/", myurl)
      # method = "wget" requires wget on the system; -c resumes partial downloads
      download.file(url = .myurl, destfile = .destfile,
                    method = "wget", extra = "-c --progress=bar:force")
      NULL
    }

    # the first column of tbl is taken to hold the file names
    invisible(lapply(seq_len(nrow(tbl)), function(i) my_download_function(tbl[i, 1], "data")))

Now, having the files locally, you can read them one by one with read.csv, like:

    f <- list.files(path = "data", pattern = "^crucy", full.names = TRUE)
    read.csv(f[1], skip = 3, header = TRUE)  # first file; loop over f for the rest

The tables don't mean much until you add the country/territory information to each one, but at least you have a starting point.

Regards,
Grzegorz
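As a follow-up to the "read them one by one" step and the original question about merging, here is a minimal sketch of stacking all downloaded files into one data frame while keeping track of which file each row came from. The helper name read_cru_dir and the source_file column are made up for illustration, and the sketch simply reuses the read.csv(skip = 3) call from the message above; it is worth opening one real file first to confirm that call parses it correctly (read.table may be needed instead). The demo files below are invented so the example runs without any download.

```r
# Sketch: read every file in a directory and stack the results,
# tagging each row with the file it came from.
# Assumptions (not from the thread): the helper name, the source_file
# column, and that all files share the same columns.
read_cru_dir <- function(dir = "data", pattern = "\\.per$", skip = 3) {
  files <- list.files(path = dir, pattern = pattern, full.names = TRUE)
  tables <- lapply(files, function(f) {
    x <- read.csv(f, skip = skip, header = TRUE)
    # keep the file name so rows can be traced back to a country/territory
    x$source_file <- basename(f)
    x
  })
  do.call(rbind, tables)
}

# Self-contained demo on two tiny made-up files (all values are invented)
demo_dir <- file.path(tempdir(), "cru_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("h1", "h2", "h3", "YEAR,JAN,FEB", "1901,1.0,2.0", "1902,1.5,2.5"),
           file.path(demo_dir, "crucy.demo.A.tmp.per"))
writeLines(c("h1", "h2", "h3", "YEAR,JAN,FEB", "1901,3.0,4.0"),
           file.path(demo_dir, "crucy.demo.B.tmp.per"))

merged <- read_cru_dir(demo_dir)
```

The country or territory name could then be derived from source_file once the naming scheme of the real files has been checked.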
Re: [R-sig-Geo] Extract CRU data
Are you asking if there's a way to automate the download of a list of links from that page? You could write an R script to get the HTML, then find all the HTML link (<a>) tags, and then get the URLs in the link addresses, and there are packages for doing this kind of web scraping.

But for this kind of thing it might be easier to use a web browser add-on - I have "Down Them All" set up on Firefox, and with a click or two I can get a list of all the link URLs and hit a button that downloads everything to a single folder. Once done, I can use standard R functions to list all the downloaded files and read them. Took about 20 seconds to do for this page, and now I have a folder of 292 .tmp.per files.

Barry

On Tue, Jan 24, 2023 at 11:13 AM Miluji Sb wrote:

> Greetings everyone,
>
> I have a question on extracting country-level data from CRU (
> https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_4.06/crucy.2205251923.v4.06/countries/tmp/
> ).
> The data for each variable are available for individual countries and I am
> struggling to download all of them. Can I extract all the files in R then
> merge? Thanks so much.
>
> Best,
>
> Milu
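The scripted route described above (get the HTML, find the link tags, collect their URLs) might look roughly like the following sketch. It uses rvest and xml2; the helper name list_links, the example.org base URL, and the demo listing are all invented for illustration, and the assumption that the wanted links end in ".per" is taken from the ".tmp.per" files mentioned above. The demo parses an inline HTML string, so nothing is fetched from the network.

```r
library(rvest)

# Hypothetical helper: return absolute URLs of all links on `page`
# whose href matches `pattern`.
list_links <- function(page, base_url, pattern = "\\.per$") {
  hrefs <- html_attr(html_elements(page, "a"), "href")
  # drop missing hrefs and anything that is not a data file
  # (sort links, parent-directory link, ...)
  hrefs <- hrefs[!is.na(hrefs) & grepl(pattern, hrefs)]
  xml2::url_absolute(hrefs, base_url)
}

# Offline demo on a made-up directory listing (file names are invented)
demo_html <- '<html><body><table>
  <tr><td><a href="?C=N;O=D">Name</a></td></tr>
  <tr><td><a href="/parent/">Parent Directory</a></td></tr>
  <tr><td><a href="a.tmp.per">a.tmp.per</a></td></tr>
  <tr><td><a href="b.tmp.per">b.tmp.per</a></td></tr>
</table></body></html>'
base_url <- "https://example.org/countries/tmp/"

urls <- list_links(read_html(demo_html), base_url)
```

For the real page, one would pass read_html() of the CRU URL together with that same URL as base_url, then loop over urls with download.file().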