Re: [R-pkg-devel] RData files with identical objects in package
Thank you so much! Perhaps this could be mentioned in the official documentation on writing R extensions. If I read the English correctly, the default is to avoid "LazyData" loading, and "LazyData" loading is in some opposition to loading using data(). Yet if we use RStudio and make an R documentation file for data, it ends with:

\examples{
data(ddd)
## maybe str(ddd) ; plot(ddd) ...
}
\keyword{datasets}

while at the same time "LazyData" is set by default in DESCRIPTION?

1.1.6 Data in packages

The data subdirectory is for data files, either to be made available via lazy-loading or for loading using data(). (The choice is made by the 'LazyData' field in the DESCRIPTION file: the default is not to do so.) It should not be used for other data files needed by the package, and the convention has grown up to use directory inst/extdata for such files.

All best wishes
Troels

-----Original Message-----
From: peter dalgaard
Sent: 13 January 2019 22:00
To: Troels Ring
Cc: Michael Dewey; package-develop
Subject: Re: [R-pkg-devel] RData files with identical objects in package

I think it is illegal if you use the lazyload database, because that is indexed by name and contains every object that would be created by data(). This creates an obvious issue if two objects share a name.

Once you use the lazyload database, loading the package creates an environment which is initially full of promises, one for each object. Evaluating one of these makes the actual object appear in the environment. Using data() causes the corresponding promise(s) to be created in the global environment. IIRC, there is a registry that says which objects are created by which arguments to data(), but as they are still taken from the lazydata database, the last one created with a given name still wins.

-ps
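The manual excerpt quoted in this message says the choice between lazy-loading and data() is made by the 'LazyData' field in DESCRIPTION. As a minimal sketch of how one might check that field, the snippet below writes a made-up DESCRIPTION file to a temporary directory and reads it back with read.dcf(). The file contents and the variable names are assumptions for illustration only; the package name 'try' is borrowed from the thread.

```r
# Hypothetical minimal DESCRIPTION, written to a temp dir for illustration.
desc_file <- file.path(tempdir(), "DESCRIPTION")
writeLines(c(
  "Package: try",
  "Version: 0.1.0",
  "Title: Illustration Only",
  "LazyData: true"
), desc_file)

# read.dcf() parses the 'Field: value' format used by DESCRIPTION files.
fields <- read.dcf(desc_file, fields = c("Package", "LazyData"))
lazy <- fields[1, "LazyData"]

# A missing field comes back as NA, matching the documented default
# (no lazy-loading of data).
uses_lazy_data <- !is.na(lazy) && tolower(lazy) %in% c("true", "yes")
print(uses_lazy_data)
```

Pointing desc_file at a real package's DESCRIPTION would show whether a template (for example one generated by an IDE) has switched lazy-loading on.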
Re: [R-pkg-devel] Getting started with memory debugging
On 14 January 2019 at 09:48, Michael Chirico wrote:
| Hello all,
|
| I'm getting started doing some debugging of memory errors and got stuck
| trying to reproduce the errors found during my CRAN submission process:
|
| https://cran.r-project.org/web/checks/check_results_geohashTools.html
|
| Starting with the clang-ASAN issues, my approach was to try and use the
| rocker/r-devel-san image.

Good idea! I once set this up for a similar need, and then created the 'sanitizers' package to have a few "known to fail" test cases to make sure the container was still valid and identifying faults it was supposed to find. But being busy with a number of other things meant I did not keep up with this container. So there is no promise it currently reflects what CRAN tests for.

Winston built another (comprehensive !!) set of images over at this repo:

  https://github.com/wch/r-debug

These are more current -- but fundamentally they have the same exact flaw: CRAN does their thing, and someone has to catch up.

Deep down, that really is a silly game. We'd be (much) better off if the CRAN testbeds were truly reproducible, and I had meant to write something up suggesting something around that idea. It hasn't happened. So here you are.

| Launching with the package directory mounted via:
|
| docker run --rm -it -v
| /Users/michael.chirico/github/geohashTools/:/home/docker/geohashTools
| rocker/r-devel-san /bin/bash
|
| Building required libraries:
|
| apt-get update
| apt-get install libgdal-dev libudunits2-dev
|
| Then installing my Imports/Suggests:
|
| Rscript -e "install.packages(c('Rcpp', 'sp', 'sf', 'testthat', 'mockery'))"
|
| Now attempting to reproduce the memory errors:
|
| cd /home/docker/geohashTools
| RD CMD build .
| RD CMD check geohashTools_0.2.0.tar.gz
|
| But this check is successful (I was hoping it'd fail)... I assume the
| problem is from the last few steps. The manual says:
|
| >
| > It requires code to have been compiled and linked with -fsanitize=address
|
| But I'm not sure how to enforce this (I assumed it was being handled by how
| RD binary is built but I didn't notice any compilation output from R CMD
| build .)

I am a little out of sync with your package here. Maybe it "merely" requires that the library you reinstalled also rebuilds with -fsanitize=address which you could ensure, I'd hope, via PKG_CPPFLAGS and/or editing of its src/Makevars.

Because that is essentially it: you do need *all* of

 - R itself
 - the package you wanted
 - and its dependencies

to be built consistently for SAN/ASAN with the very settings CRAN uses. Which are "documented" in a README somewhere on some site.

I think it is worthwhile having a conversation about how we can do one step better than that. I would be happy to help, but a little constrained on free time and cannot lead this. Can you, or someone else?

Dirk

| Any help on getting started here would be appreciated :)
| Michael Chirico
|
| PS the source can be found at https://github.com/MichaelChirico/geohashTools

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
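One way to see whether a given R build would compile package code with -fsanitize=address is to ask it for its compiler settings. The sketch below runs `R CMD config` from inside the running R session and looks for the flag. The helper names r_config() and has_asan() are invented for this example, and where the flag appears (CC, CFLAGS, CXX or CXXFLAGS) depends on how that particular R was configured, so this is only a rough check and not a statement about what CRAN uses.

```r
# Ask the running R installation which compiler/flags it uses for packages.
r_config <- function(var) {
  r_bin <- file.path(R.home("bin"), "R")
  out <- suppressWarnings(
    system2(r_bin, c("CMD", "config", var), stdout = TRUE, stderr = TRUE)
  )
  paste(out, collapse = " ")
}

# has_asan() is a made-up helper: it only searches the text for the flag.
has_asan <- function(flags) grepl("-fsanitize=[^ ]*address", flags)

vars <- c("CC", "CFLAGS", "CXX", "CXXFLAGS")
settings <- vapply(vars, r_config, character(1))

print(settings)
print(vapply(settings, has_asan, logical(1)))
```

If none of the entries contain the flag, the package and its dependencies would be compiled without ASAN instrumentation, whatever the image is called.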
[R-pkg-devel] Getting started with memory debugging
Hello all,

I'm getting started doing some debugging of memory errors and got stuck trying to reproduce the errors found during my CRAN submission process:

https://cran.r-project.org/web/checks/check_results_geohashTools.html

Starting with the clang-ASAN issues, my approach was to try and use the rocker/r-devel-san image.

Launching with the package directory mounted via:

docker run --rm -it -v /Users/michael.chirico/github/geohashTools/:/home/docker/geohashTools rocker/r-devel-san /bin/bash

Building required libraries:

apt-get update
apt-get install libgdal-dev libudunits2-dev

Then installing my Imports/Suggests:

Rscript -e "install.packages(c('Rcpp', 'sp', 'sf', 'testthat', 'mockery'))"

Now attempting to reproduce the memory errors:

cd /home/docker/geohashTools
RD CMD build .
RD CMD check geohashTools_0.2.0.tar.gz

But this check is successful (I was hoping it'd fail)... I assume the problem is from the last few steps. The manual says:

> It requires code to have been compiled and linked with -fsanitize=address

But I'm not sure how to enforce this (I assumed it was being handled by how the RD binary is built, but I didn't notice any compilation output from R CMD build .).

Any help on getting started here would be appreciated :)
Michael Chirico

PS the source can be found at https://github.com/MichaelChirico/geohashTools
Re: [R-pkg-devel] RData files with identical objects in package
I think it is illegal if you use the lazyload database, because that is indexed by name and contains every object that would be created by data(). This creates an obvious issue if two objects share a name.

Once you use the lazyload database, loading the package creates an environment which is initially full of promises, one for each object. Evaluating one of these makes the actual object appear in the environment. Using data() causes the corresponding promise(s) to be created in the global environment. IIRC, there is a registry that says which objects are created by which arguments to data(), but as they are still taken from the lazydata database, the last one created with a given name still wins.

-ps

> On 13 Jan 2019, at 14:13, Troels Ring wrote:
>
> Thanks a lot - I'm sure you are right that I could just use different names
> but I cannot understand why it could cause a problem to have two different
> well-formatted .RData files in the /data directory both with an "x" - is
> that really illegal? I cannot see it stated in the official manual - but it
> is long (Writing R Extensions)
> -BW
> Troels

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com
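The "indexed by name, last one wins" behaviour described in this message can be imitated with ordinary R promises. The sketch below is not the real lazyload machinery. It uses a plain environment as a stand-in for the lazyload database and a made-up helper, add_promise(), built on delayedAssign(). The values mirror the two example files from the thread.

```r
# Stand-in for the lazyload database: one environment, one slot per name.
lazy_db <- new.env()

# add_promise() is a hypothetical helper for this sketch.
add_promise <- function(name, value, db) {
  force(value)  # fix the value now; only the lookup via `db` stays lazy
  # delayedAssign() creates a promise; assigning the same name again
  # simply replaces the earlier promise.
  delayedAssign(name, value, eval.env = environment(), assign.env = db)
}

register_dataset <- function(objects, db) {
  for (nm in names(objects)) add_promise(nm, objects[[nm]], db)
}

register_dataset(list(x = 100, y = 1000), lazy_db)  # plays the role of first.RData
register_dataset(list(x = 45,  y = 32),   lazy_db)  # plays the role of second.RData

print(ls(lazy_db))  # only the names "x" and "y" exist
print(lazy_db$x)    # 45: the dataset registered last wins
```

Because the store is keyed by object name and not by file name, there is no slot left in which the first file's x could survive.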
Re: [R-pkg-devel] RData files with identical objects in package
Thanks a lot - I'm sure you are right that I could just use different names, but I cannot understand why it could cause a problem to have two different well-formatted .RData files in the /data directory, both with an "x" - is that really illegal? I cannot see it stated in the official manual - but it is long (Writing R Extensions).

-BW
Troels

-----Original Message-----
From: Michael Dewey
Sent: 13 January 2019 12:56
To: Troels Ring; package-develop
Subject: Re: [R-pkg-devel] RData files with identical objects in package

Dear Troels

Perhaps I misunderstand what you are trying to do, but would it be possible to put each x and y into a list or a dataframe with different names and then modify your usage to pull them from there? Then there would be no danger of users getting the wrong x and y.

Michael
Re: [R-pkg-devel] RData files with identical objects in package
Thanks a lot - here is what I get:

A single object matching ‘x’ was found
It was found in the following places
  package:try
with value

[1] 100

Now put in the last "xfile.RData" - and "afile.RData" is still muted

Restarting R session...

> library(try)
> x
[1] 100
> getAnywhere("x")
A single object matching ‘x’ was found
It was found in the following places
  package:try
with value

[1] 100
> data(afile)
> x
[1] 100

Whereas we know x in afile.RData is 45.

So something is very wrong.

Sorry to be so helpless
BW Troels

-----Original Message-----
From: Duncan Murdoch
Sent: 13 January 2019 12:46
To: Troels Ring; package-develop
Subject: Re: [R-pkg-devel] RData files with identical objects in package

On 13/01/2019 3:38 a.m., Troels Ring wrote:
> Now and then when running ctrl-shift-L and - B we see
>
> Attaches package: 'try'
>
> The following objects are masked _by_ '.GlobalEnv':
> x, y

Does that ever happen when ls() returns character(0)? It seems likely that you have copies of them in the global workspace, and the message is correct.

In the earlier situation (ls() *does* return character(0), but x is still found), you can find where it was found using

getAnywhere("x")

For example,

> x <- 2
> getAnywhere("x")
A single object matching ‘x’ was found
It was found in the following places
  .GlobalEnv
with value

[1] 2

Duncan Murdoch
Re: [R-pkg-devel] RData files with identical objects in package
Dear Troels

Perhaps I misunderstand what you are trying to do, but would it be possible to put each x and y into a list or a dataframe with different names and then modify your usage to pull them from there? Then there would be no danger of users getting the wrong x and y.

Michael

On 13/01/2019 08:38, Troels Ring wrote:
> Dear friends - I have a package under creation making heavy
> calculations on chemical/clinical data and I plan to include as
> "examples" the use of some literature data used in my papers. To
> illustrate what then occurs, I made two RData files consisting only of
> x and y with different values for x and y

--
Michael
http://www.dewey.myzen.co.uk/home.html
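The suggestion in this reply, giving each dataset its own uniquely named container, might look roughly like the sketch below. The object names first and second follow the file names used in the thread. The temporary directory is only a stand-in for a package's data/ directory, and scratch is an illustrative environment and not part of any package API.

```r
# Each dataset becomes ONE object with a unique name, so nothing collides.
first  <- list(x = 100, y = 1000)
second <- list(x = 45,  y = 32)

# A temporary directory stands in for the package's data/ directory.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
save(first,  file = file.path(data_dir, "first.RData"))
save(second, file = file.path(data_dir, "second.RData"))

# Load both files into a scratch environment; both sets of values coexist.
scratch <- new.env()
load(file.path(data_dir, "first.RData"),  envir = scratch)
load(file.path(data_dir, "second.RData"), envir = scratch)

print(scratch$first$x)   # 100
print(scratch$second$x)  # 45
```

With this layout the object name matches the file name, so data(first) and data(second) in a package would each create a distinct object and the "created by more than one data call" warning should not arise.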
Re: [R-pkg-devel] RData files with identical objects in package
On 13/01/2019 3:38 a.m., Troels Ring wrote:
> Now and then when running ctrl-shift-L and - B we see
>
> Attaches package: 'try'
>
> The following objects are masked _by_ '.GlobalEnv':
> x, y

Does that ever happen when ls() returns character(0)? It seems likely that you have copies of them in the global workspace, and the message is correct.

In the earlier situation (ls() *does* return character(0), but x is still found), you can find where it was found using

getAnywhere("x")

For example,

> x <- 2
> getAnywhere("x")
A single object matching ‘x’ was found
It was found in the following places
  .GlobalEnv
with value

[1] 2

Duncan Murdoch
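Alongside getAnywhere(), two base functions can help tell a copy in the global workspace apart from one supplied by an attached package. The sketch below places an x in the global environment explicitly, so it behaves the same however the code is run. The variable names in_global and where_found are illustrative only.

```r
# Put a copy of x in the global workspace, as in the example above.
assign("x", 2, envir = globalenv())

# exists(..., inherits = FALSE) asks about the global environment only,
# ignoring attached packages further along the search path.
in_global <- exists("x", envir = globalenv(), inherits = FALSE)

# find() lists every search-path entry that holds an object of that name.
where_found <- find("x")

print(in_global)
print(where_found)

# Clean up the illustrative object again.
rm(list = "x", envir = globalenv())
```

If in_global is FALSE while x still prints a value, the value must come from something attached, such as the package's lazy-loaded data.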
[R-pkg-devel] RData files with identical objects in package
Dear friends - I have a package under creation making heavy calculations on chemical/clinical data and I plan to include as "examples" the use of some literature data used in my papers. To illustrate what then occurs, I made two RData files consisting only of x and y with different values for x and y like

x <- 100
y <- 1000
save(x,y,file="first.RData")

and then a new x and y in "second" with x <- 45 and y <- 32

When I put these in a "data" directory of a new package without further ado in RStudio

Ctrl-shift-L
Ctrl-shift-B

... there is a warning

* installing *source* package 'try' ...
** R
** data
*** moving datasets to lazyload DB
warning: objects 'x', 'y' are created by more than one data call
** byte-compile and prepare package for lazy loading
** help
converting help for package 'try'
*** installing help indices
finding HTML links ...hello html
done

Now, when I clear the workspace:

> ls()
character(0)
> devtools::load_all(".")
Loading try

Restarting R session...

> library(try)
> ls()
character(0)
> x #-- so even if workspace is empty x is still kept
[1] 45
> data(first) # and "first" is not seen
> x
[1] 45

x is still present - and y

I have been reading and searching in "Writing R Extensions" but so far didn't find the clue.

Seemingly it is the file with the last name that is assessed - when I rename first.RData to "xfile.RData" we get 100 and 1000.

Now and then when running ctrl-shift-L and - B we see

Attaches package: 'try'

The following objects are masked _by_ '.GlobalEnv':
x, y

Sorry for these problems -

BW
Troels