Re: [Rd] Large discrepancies in the same object being saved to .RData
On Sun, 11 Jul 2010, Tony Plate wrote:

Another way of seeing the environments referenced in an object is using str(), e.g.:

> f1 <- function() {
+   junk <- rnorm(1000)
+   x <- 1:3
+   y <- rnorm(3)
+   lm(y ~ x)
+ }
> v1 <- f1()
> object.size(f1)
1636 bytes
> grep("Environment", capture.output(str(v1)), value=TRUE)
[1] " .. ..- attr(*, \".Environment\")=<environment: 0x01f11a30> "
[2] " .. .. ..- attr(*, \".Environment\")=<environment: 0x01f11a30> "

'Some of the environments in a few cases': remember environments have environments (and so on), and that namespaces and packages are also environments. So we need to know about the environment of environment(v1$terms), which also gets saved (either as a reference or as an environment, depending on what it is).

And this approach does not work for many of the commonest cases:

> f <- function() {
+   x <- pi
+   g <- function() print(x)
+   return(g)
+ }
> g <- f()
> str(g)
function ()
 - attr(*, "source")= chr "function() print(x)"
> ls(environment(g))
[1] "g" "x"

In fact I think it works only for formulae.

-- Tony Plate

On 7/10/2010 10:10 PM, bill.venab...@csiro.au wrote:

Well, I have answered one of my questions below. The hidden environment is attached to the 'terms' component of v1.

Well, not really hidden. A terms component is a formula (see ?terms.object), and a formula has an environment just as a closure does. In neither case does the print() method tell you about it -- but ?formula does.

To see this

> lapply(v1, environment)
$coefficients
NULL
$residuals
NULL
$effects
NULL
$rank
NULL
$fitted.values
NULL
$assign
NULL
$qr
NULL
$df.residual
NULL
$xlevels
NULL
$call
NULL
$terms
<environment: 0x021b9e18>
$model
NULL

> rm(junk, envir = with(v1, environment(terms)))
> usedVcells()
[1] 96532

This is still a bit of a trap for young (and old!) players...

I think the main point in my mind is why is it that object.size() excludes enclosing environments in its reckonings?

Bill Venables.
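A minimal sketch, not taken from the thread, of one way to see the gap between what object.size() reports and what has to be written out when the object is saved. The helper name serialized_size() and the vector length 1e5 are assumptions chosen for illustration, and serialize() is used only as a rough stand-in for what save() writes before compression.

```r
# Hypothetical helper (not from the thread): number of bytes needed to
# serialize an object, including any environments it drags along.
serialized_size <- function(x) length(serialize(x, connection = NULL))

f0 <- function() {
  junk <- rnorm(1e5)  # large local that the formula's environment keeps alive
  y ~ x               # the returned formula captures this function's frame
}
v0 <- f0()

reported    <- object.size(v0)     # does not count the enclosing environment
size_before <- serialized_size(v0) # counts the environment, junk included

# Removing junk from the captured environment shrinks the serialized form.
rm(junk, envir = environment(v0))
size_after <- serialized_size(v0)

print(reported)
print(c(before = size_before, after = size_after))
```

Comparing the two serialized sizes before and after the rm() call is a quick way to check whether a captured environment is what makes a saved object large.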
-----Original Message-----
From: Venables, Bill (CMIS, Cleveland)
Sent: Sunday, 11 July 2010 11:40 AM
To: 'Duncan Murdoch'; 'Paul Johnson'
Cc: 'r-devel@r-project.org'; Taylor, Julian (CMIS, Waite Campus)
Subject: RE: [Rd] Large discrepancies in the same object being saved to .RData

I'm still a bit puzzled by the original question. I don't think it has much to do with .RData files and their sizes. For me the puzzle comes much earlier. Here is an example of what I mean using a little session

> usedVcells <- function() gc()["Vcells", "used"]
> usedVcells()        ### the base load
[1] 96345

### Now look at what happens when a function returns a formula as the
### value, with a big item floating around in the function closure:

> f0 <- function() {
+   junk <- rnorm(1000)
+   y ~ x
+ }
> v0 <- f0()
> usedVcells()        ### much bigger than base, why?
[1] 10096355
> v0                  ### no obvious environment
y ~ x
> object.size(v0)     ### so far, no clue given where
                      ### the extra Vcells are located.
372 bytes

### Does v0 have an enclosing environment?

> environment(v0)     ### yep.
<environment: 0x021cc538>
> ls(envir = environment(v0))  ### as expected, there's the junk
[1] "junk"
> rm(junk, envir = environment(v0))  ### this does the trick.
> usedVcells()
[1] 96355

### Now consider a second example where the object
### is not a formula, but contains one.

> f1 <- function() {
+   junk <- rnorm(1000)
+   x <- 1:3
+   y <- rnorm(3)
+   lm(y ~ x)
+ }
> v1 <- f1()
> usedVcells()        ### as might have been expected.
[1] 10096455

### in this case, though, there is no
### (obvious) enclosing environment

> environment(v1)
NULL
> object.size(v1)     ### so where are the junk Vcells located?
7744 bytes
> ls(envir = environment(v1))  ### clearly will not work
Error in ls(envir = environment(v1)) : invalid 'envir' argument
> rm(v1)              ### removing the object does clear out the junk.
> usedVcells()
[1] 96366

And in this second case, as noted by Julian Taylor, if you save() the object the .RData file is also huge.
There is an environment attached to the object somewhere, but it appears to be occluded and entirely inaccessible. (I have poked around the object components trying to find the thing but without success.)

Have I missed something?

Bill Venables.

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Duncan Murdoch
Sent: Sunday, 11 July 2010 10:36 AM
To: Paul Johnson
Cc: r-devel@r-project.org
Subject: Re: [Rd] Large discrepancies in the same object being saved to .RData

On 10/07/2010 2:33 PM, Paul Johnson wrote:

On Wed, Jul 7, 2010 at 7:12 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

On 06/07/2010 9:04 PM, julian.tay...@csiro.au wrote:

Hi developers,

After some investigation I have found there can be large discrepancies in the same object being saved as an external xx.RData file. The immediate repercussion of this is the possible increased size of your .RData workspace
Re: [Rd] LinkingTo and C++
While linking to package shared libs is not possible in general, as Simon points out, it is possible under Windows, provided Windows knows how to find the library linked to at runtime (this requires a customized Makefile.win).

One way this is done under Windows is simply to place the package/libs directories containing the package shared libs to be linked to in the Windows search path, but this may be a problem for packages released to CRAN since this would require updates to CRAN's PATH environment variable.

Another possibility is to load all packages containing shared libs to be linked to before using any shared lib that is dynamically linked to them. For example, if B.dll is dynamically linked to A.dll (and both A and B are packages), and if foo() is a function in B.dll that uses functions in A.dll, then this will work:

library(A)
library(B)
.Call('foo')

But this is not a very natural or convenient solution. It requires that library() commands be used with every instance of Rscript, for example. A better solution would be some variant of LinkingTo: A that somehow has the same effect as setting the Windows search path to include the libs directory containing A.dll and B.dll.

Of course, most of the issues disappear when linking to static libs instead of dynamic ones, and it is not clear that the extra effort needed to support dynamic libs will yield much benefit in this case.

Dominick

On Thu, Feb 11, 2010 at 12:55 PM, Simon Urbanek <simon.urba...@r-project.org> wrote:

Romain,

I think you're confusing two entirely different concepts here:

1) LinkingTo: allows a package to provide C-level functions to other packages (see R-ext 5.4). Let's say package A provides a function foo by calling R_RegisterCCallable for that function. If a package B wants to use that function, it uses LinkingTo: and calls R_GetCCallable to obtain the function pointer. It does not actually link to package A because that is in general not possible - it simply obtains the pointers through R.
In addition, LinkingTo: makes sure that you have access to the header files of package A, which help you to cast the functions and define any data structures you may need. Since C++ is a superset of C you can use this facility with C++ as long as you don't depend on anything outside of the header files.

2) linking directly to another package's shared object is in general not possible, because packages are not guaranteed to be dynamic libraries. They are usually shared objects which may or may not be compatible with a dynamic library on a given platform. Therefore R-ext describes another way in which you may provide some library independently of the package shared object to other packages (see R-ext 5.8). The issue is that you have to create a separate library (PKG/libs[/arch]/PKG.so won't work in general!) and provide this to other packages. As 5.8 says, this is in general not trivial because it is very platform dependent and the most portable way is to offer a static library.

To come back to your example, LinkingTo: A and B will work if you remove Makevars from B (you don't want to link) and put your hello method into the A.h header:

> library(B)
Loading required package: A
> .Call("say_hello", PACKAGE = "B")
[1] "hello"

However, you're not really using the LinkingTo: facilities for the functions so it's essentially just helping you to find the header file.

Cheers,
Simon

On Feb 11, 2010, at 4:08 AM, Romain Francois wrote:

Hello,

I've been trying to make LinkingTo work when the package linked to has C++ code. I've put dumb packages to illustrate this email here: http://addictedtor.free.fr/misc/linkingto

Package A defines this C++ class:

class A {
public:
    A() ;
    ~A() ;
    SEXP hello() ;
} ;

Package B has this function:

SEXP say_hello(){
    A a ;
    return a.hello() ;
}

Headers of package A are copied into inst/include so that package B can have LinkingTo: A in its DESCRIPTION file.
Also, package B has the R function:

g <- function(){
    .Call("say_hello", PACKAGE = "B")
}

With this I can compile A and B, but then I get:

$ Rscript -e "B::g()"
Error in dyn.load(file, DLLpath = DLLpath, ...) :
  unable to load shared library '/usr/local/lib/R/library/B/libs/B.so':
  /usr/local/lib/R/library/B/libs/B.so: undefined symbol: _ZN1AD1Ev
Calls: :: ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>

If I then add a Makevars in B with this:

# find the root directory where A is installed
ADIR=$(shell $(R_HOME)/bin/Rscript -e "cat(system.file(package='A'))" )
PKG_LIBS= $(ADIR)/libs/A$(DYLIB_EXT)

Then it works:

$ Rscript -e "B::g()"
[1] "hello"

So it appears that adding the -I flag, which is what LinkingTo does, is not enough when the package linking from (B) actually has to link to the linked-to package (A).

I've been looking at
Re: [Rd] Large discrepancies in the same object being saved to .RData
On 11/07/2010 1:30 PM, Prof Brian Ripley wrote:

On Sun, 11 Jul 2010, Tony Plate wrote:

Another way of seeing the environments referenced in an object is using str(), e.g.:

> f1 <- function() {
+   junk <- rnorm(1000)
+   x <- 1:3
+   y <- rnorm(3)
+   lm(y ~ x)
+ }
> v1 <- f1()
> object.size(f1)
1636 bytes
> grep("Environment", capture.output(str(v1)), value=TRUE)
[1] " .. ..- attr(*, \".Environment\")=<environment: 0x01f11a30> "
[2] " .. .. ..- attr(*, \".Environment\")=<environment: 0x01f11a30> "

'Some of the environments in a few cases': remember environments have environments (and so on), and that namespaces and packages are also environments. So we need to know about the environment of environment(v1$terms), which also gets saved (either as a reference or as an environment, depending on what it is).

And this approach does not work for many of the commonest cases:

> f <- function() {
+   x <- pi
+   g <- function() print(x)
+   return(g)
+ }
> g <- f()
> str(g)
function ()
 - attr(*, "source")= chr "function() print(x)"
> ls(environment(g))
[1] "g" "x"

In fact I think it works only for formulae.

-- Tony Plate

On 7/10/2010 10:10 PM, bill.venab...@csiro.au wrote:

Well, I have answered one of my questions below. The hidden environment is attached to the 'terms' component of v1.

Well, not really hidden. A terms component is a formula (see ?terms.object), and a formula has an environment just as a closure does. In neither case does the print() method tell you about it -- but ?formula does.

I've just changed the default print method for formulas to display the environment if it is not globalenv(), which is the rule used for closures as well. So now in R-devel:

> as.formula("y ~ x")
y ~ x

as before, but

> as.formula("y ~ x", env=new.env())
y ~ x
<environment: 01f83400>

Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
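The two cases discussed in this thread (a closure, and a fitted model whose 'terms' component is a formula) can be inspected along the following lines. This is only a sketch: the helper components_with_env() is a hypothetical name introduced here, not something from the thread or from base R, and it looks one level deep only, so environments of environments are not followed.

```r
# Closure case from the thread: g keeps the frame of f() alive.
f <- function() {
  x <- pi
  g <- function() print(x)
  g
}
g <- f()
captured <- ls(environment(g))  # names living in the captured frame
print(captured)

# Hypothetical helper: names of the top-level components of a list-like
# object that carry an environment (formulas, terms objects, closures).
components_with_env <- function(obj) {
  has_env <- vapply(
    obj,
    function(part) !is.null(part) && is.environment(environment(part)),
    logical(1)
  )
  names(obj)[has_env]
}

# Fitted-model case from the thread: the environment hides in one component.
f1 <- function() {
  junk <- rnorm(1000)
  x <- 1:3
  y <- rnorm(3)
  lm(y ~ x)
}
v1 <- f1()
found <- components_with_env(v1)
print(found)
print(ls(environment(v1[[found[1]]])))  # contents of the captured frame
```

The helper is the lapply(v1, environment) idea from the thread wrapped up so that only the component names are returned.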
[Rd] Saving an R program as C Code
Is there any way to save R procedures or packages as C code? Ideally, I want to run a logistic regression in R but have the code available in C, or Java, or whatever.

Thoughts?
[Rd] S4 class extends data.frame, getDataPart sees list
R-Devel:

When I get the data part of an S4 class that contains="data.frame", it gives me a list, even when the data.frame is the S4 version:

> d <- data.frame(x=1:3)
> isS4(d)
[1] FALSE  # of course
> dS4 <- new("data.frame", d)
> isS4(dS4)
[1] TRUE  # ok
> class(dS4)
[1] "data.frame"  # good
attr(,"package")
[1] "methods"
> setClass("A", representation(label="character"), contains="data.frame")
[1] "A"
> a <- new("A", dS4, label="myFrame")
> getDataPart(a)
[[1]]  # oh?
[1] 1 2 3

> class(a...@.data)
[1] "list"  # hmm
> names(a)
[1] "x"  # sure, that makes sense
> a
Object of class "A"
  x
1 1
2 2
3 3
Slot "label":
[1] "myFrame"

Was I wrong to have expected the data part of 'a' to be a data.frame?

Thanks.
Dan Murphy
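A short sketch of the behaviour described above, together with one possible way of getting an ordinary data frame back. It assumes the class definition from the message; constructing the object from the plain data.frame d and rebuilding via setNames() and as.data.frame() are choices made here for illustration, not a recommendation from the list. methods::S3Part() may be a more direct route, but it is not exercised in this sketch.

```r
library(methods)

# Class definition as in the message above.
setClass("A", representation(label = "character"), contains = "data.frame")

d <- data.frame(x = 1:3)
a <- new("A", d, label = "myFrame")

extends_df <- is(a, "data.frame")  # the object still extends data.frame
part <- getDataPart(a)             # the data part: the bare list of columns
print(is.list(part))
print(is.data.frame(part))

# Rebuild an ordinary data frame from the columns plus the names slot.
rebuilt <- as.data.frame(setNames(part, names(a)))
print(rebuilt)
```

The column names and row names live in separate slots of the S4 object, which is why the data part on its own is a plain list.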