Hi Jeff, thanks for your reply. I read about it in the documentation, and in the future I plan to write functions that download the CSV files and convert them to .rda. At the moment, however, I am looking for a quick and dirty solution to lazy-load the files I already have. (I know I could convert them straight to .rda files, and I will; I am taking the opportunity to learn another way to do it.)
For example, I have tried something like this:

    lazyLoadFun <- function(path) {
      ds <- NULL  # cache shared by every call to read.data()
      read.data <- function() {
        if (is.null(ds)) {
          message("importing dataset the first time")
          # `<<-` updates ds in the enclosing environment; a plain `<-`
          # would create a local ds and the file would be re-read every time
          ds <<- read.csv(path)
        } else {
          message("using cached copy of the dataset")
        }
        return(ds)
      }
      return(read.data)
    }

    # "mypackage" is a placeholder for the package that ships the file
    get.italian.cities <- lazyLoadFun(
      system.file("extdata/italian_cities.csv", package = "mypackage"))

In the functions where I need to use italian_cities I then do:

    italian_cities <- get.italian.cities()

The first time I call get.italian.cities() the file is loaded and cached in "ds"; on later calls the cached "ds" is returned.
The problem is that, as far as I understand, R doesn't pass variables by reference, so ds is copied into italian_cities, which makes things quite slow for big datasets and makes caching the data useless.
I am trying to figure out whether, for example, it is possible to use "assign" and avoid copying the data into italian_cities.

Thanks again for your reply, I will definitely look into .rda files!

Luca

2014-04-16 15:39 GMT+02:00 Jeff Newmiller <jdnew...@dcn.davis.ca.us>:

> The standard way to put data into a package is to convert it to RDA as
> described in the Writing R Extensions document. This is faster and more
> compact than CSV.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On April 16, 2014 5:10:03 AM PDT, Luca Cerone <luca.cer...@gmail.com> wrote:
> >Hi, in a package I am developing some functions need to use some external
> >data.
> >I have these data as a set of .csv files that I have placed in the
> >inst/extdata folder.
> >
> >At the moment I have a file "db-internal.r" where I load all the internal
> >databases that could be used by the functions in my package
> >and assign them to some global (to the package) variables (all with the
> >prefix db_ in front of them).
> >For example (I couldn't come up with a better name, sorry):
> >
> >db_italian_cities = read.csv(system.file("extdata/italian_cities.csv"))
> >
> >This way I can use db_italian_cities in my functions.
> >
> >Some of these datasets are quite big and really slow down loading the
> >package; moreover, for some of the tasks the package is meant to solve
> >they might not even be required.
> >I would like to be able to lazy-load these datasets only when needed. How
> >can I achieve this without creating special databases?
> >
> >Some of them could change, so I intend to download the most
> >recent ones through a function that ensures the package is using the most
> >recent version,
> >so I would really prefer to simply use CSV files.
> >
> >Thanks a lot in advance for the help!
> >
> >Cheers,
> >Luca
> >
> >       [[alternative HTML version deleted]]
> >
> >______________________________________________
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
>

--
*Luca Cerone*

Tel: +34 692 06 71 28
Skype: luca.cerone
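
One possible sketch of the lazy-loading idea discussed in this thread uses base R's delayedAssign(): the dataset name is bound to a promise, so the CSV is parsed only the first time the name is accessed. The temporary CSV file, the `db` environment and the `n_reads` counter are assumptions made for illustration and do not come from the original messages; in a package the path would come from system.file() instead.

```r
# Stand-in for inst/extdata/italian_cities.csv (hypothetical sample data)
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(city = c("Roma", "Milano"),
                     region = c("Lazio", "Lombardia")),
          csv_path, row.names = FALSE)

db <- new.env()      # hypothetical container for the package's datasets
db$n_reads <- 0L     # counts how many times the file is actually parsed

# Bind the name to a promise: the expression runs on first access only.
delayedAssign("italian_cities", {
  db$n_reads <- db$n_reads + 1L
  read.csv(csv_path, stringsAsFactors = FALSE)
}, assign.env = db)

reads_before <- db$n_reads       # nothing has been read yet
first  <- db$italian_cities      # forces the promise and reads the file
second <- db$italian_cities      # reuses the already-evaluated value
reads_after <- db$n_reads        # the file was parsed exactly once
```

After the first access, later reads of db$italian_cities return the already-evaluated value without touching the file. R also uses copy-on-modify semantics, so binding the result to another name does not by itself duplicate the data; a copy is made only when one of the bindings is modified.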