Hi Jeff,
thanks for your reply.

I read about it in the documentation and I plan to build the functions to
download the csv and convert to rda in the future.
At the moment, however, I am looking for a quick and dirty solution to lazy
load the files I already have.
(I know I could convert them straight to .rda files, and I will; I am
taking the opportunity to learn another way to do it.)

For example I have tried doing something like

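# Return a function that reads the CSV at `path` on its first call and
# serves the cached copy on later calls.
lazyLoadFun <- function(path){
    ds <- NULL   # cache, lives in the enclosing environment of read.data
    read.data <- function(){
       if (is.null(ds)){
          message("importing dataset the first time")
          # `<<-` updates ds in the enclosing environment; a plain `<-`
          # would create a local ds and the file would be re-read each time
          ds <<- read.csv(path)
       } else {
          message("using cached copy of the dataset")
       }
       return(ds)
    }
    return(read.data)
}

# system.file() needs the package argument to look inside an installed
# package; "mypkg" is only a placeholder for the real package name.
get.italian.cities <- lazyLoadFun(
    system.file("extdata/italian_cities.csv", package = "mypkg"))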


In the functions where I need to use italian_cities I then do:
italian_cities <- get.italian.cities()
The first time I call get.italian.cities the file is loaded and cached into
"ds"; on later calls the cached copy in "ds" is returned.

The problem is that R doesn't pass variables by reference, so that ds is
copied into italian_cities, which makes things quite slow for big datasets
and makes using cached data useless.

I am trying to figure out, for example, whether it is possible to use
"assign" and avoid copying the data into italian_cities.
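
A rough sketch of the "assign" idea, assuming a package-level environment is
used as the cache. The names .cache and get_cached are made up for
illustration, and a temporary CSV stands in for italian_cities.csv:

```r
# Package-level cache; environments have reference semantics in R.
.cache <- new.env(parent = emptyenv())

# Hypothetical helper: read the CSV at `path` once, store it under `name`,
# and return the stored object on every later call.
get_cached <- function(name, path) {
  if (!exists(name, envir = .cache, inherits = FALSE)) {
    message("importing dataset the first time")
    assign(name, read.csv(path), envir = .cache)
  }
  get(name, envir = .cache, inherits = FALSE)
}

# Demo with a temporary file standing in for the real dataset.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, name = c("a", "b", "c")), tmp,
          row.names = FALSE)

cities <- get_cached("italian_cities", tmp)   # reads the file
cities <- get_cached("italian_cities", tmp)   # served from the cache
```

As far as I understand, R duplicates an object only when it is modified, so
assigning the returned data frame to a local name should not by itself copy
the data; this may be worth checking before optimising further.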

Thanks again for your reply, I will definitely look into .rda files!

Luca


2014-04-16 15:39 GMT+02:00 Jeff Newmiller <jdnew...@dcn.davis.ca.us>:

> The standard way to put data into a package is to convert it to RDA as
> described in the Writing R Extensions document. This is faster and more
> compact than CSV.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On April 16, 2014 5:10:03 AM PDT, Luca Cerone <luca.cer...@gmail.com>
> wrote:
> >Hi, in a package I am developing some functions need to use some
> >external
> >data.
> >I have these data as a set of .csv files that I have placed in the
> >inst/extdata folder.
> >
> >At the moment I have a file "db-internal.r" where I  load all the
> >internal
> >databases that could be used by the functions in my package;
> >and assign them to some global (to the package) variables (all with the
> >prefix db_ in front of them)
> >For example (I couldn't come up with a better name, sorry)
> >
> >db_italian_cities = read.csv(system.file("extdata/italian_cities.csv"))
> >
> >like this I can use db_italian_cities in my functions.
> >
> >Some of these datasets are quite big and really slow down loading the
> >package, plus for some of the tasks the package is meant to solve they
> >might
> >not even be required.
> >I would like to be able to lazyload these datasets only when needed,
> >how
> >can I possibly achieve this without creating special databases?
> >
> >Some of them could change, so I intend to be able to download the most
> >recent ones through a function that ensures the package is using the
> >most
> >recent version,
> >so I would really prefer to simply use csv files.
> >
> >Thanks a lot in advance for the help!
> >
> >Cheers,
> >Luca
> >
> >       [[alternative HTML version deleted]]
> >
> >______________________________________________
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
*Luca Cerone*

Tel: +34 692 06 71 28
Skype: luca.cerone
