On 25 March 2024 at 11:12, Jairo Hidalgo Migueles wrote:
| I'm reaching out to seek some guidance regarding the storage of relatively
| large data, ranging from 10-40 MB, intended for use within an R package.
| Specifically, this data consists of regression and random forest models
| crucial for making predictions within our R package.
| 
| Initially, I attempted to save these models as internal data within the
| package. While this approach maintains functionality, it has led to a
| package size exceeding 20 MB. I'm concerned that this would complicate
| submitting the package to CRAN in the future.
| 
| I would greatly appreciate any suggestions or insights you may have on
| alternative methods or best practices for efficiently storing and accessing
| this data within our R package.

Brooke and I wrote a paper on one way of addressing this: put the large
objects in a separate 'data' package, made accessible via an
Additional_repositories: entry that points to a drat repo.

See https://journal.r-project.org/archive/2017/RJ-2017-026/index.html for the
paper which contains a nice slow walkthrough of all the details.
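
As a rough sketch of the idea (not a drop-in recipe), the DESCRIPTION of the
main package would carry two extra fields along these lines. The names
'mypkg' and 'mypkgdata' and the repository URL are placeholders made up for
illustration; substitute your own packages and drat location.

```
Package: mypkg
Suggests: mypkgdata
Additional_repositories: https://example.github.io/drat
```

Because the data package is only suggested, it may not be installed, so code
in the main package that needs the models should first check
requireNamespace("mypkgdata", quietly = TRUE) and fail gracefully if that
returns FALSE.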

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
