Re: [Rd] making object.size() more meaningful on environments?

2015-09-29 Thread Hadley Wickham
You might like to try pryr::object_size():

``` r
library(pryr)
env1 <- new.env()
object_size(env1)
#> 328 B
env2 <- new.env(hash = TRUE, size = 7500L)
object_size(env2)
#> ~60 kB
env3 <- list2env(list(a = runif(2.5e+07), L = LETTERS))
object_size(env3)
#> 200 MB
```

It handles the issue that Gabe mentions:

``` r
a <- list2env(list(a = runif(1e+06)))
object_size(a)
#> 8 MB
b <- new.env()
b$a <- a
b$b <- runif(1e+06)
object_size(b)
#> 16 MB
object_size(a, b)
#> 16 MB
```

You just have to remember that object_size(a) + object_size(b) >=
object_size(a, b), because memory shared between a and b is only counted
once in the combined call.
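
For readers without pryr installed, a rough base-R approximation of the idea might look like the sketch below. `env_size()` is a hypothetical helper, not a function from base R or pryr. It only sums `object.size()` over the direct bindings of one environment, so it neither follows nested environments nor accounts for objects shared between environments the way `object_size()` does.

```r
# Hypothetical helper (not part of base R or pryr): sums object.size()
# over the bindings of a single environment. It does not recurse into
# nested environments and does not de-duplicate shared objects.
env_size <- function(env) {
  nms <- ls(env, all.names = TRUE)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sum(sizes)
}

env3 <- list2env(list(a = runif(2500), L = LETTERS))
object.size(env3)  # size of the environment object itself only
env_size(env3)     # adds up the sizes of `a` and `L`
```

Because nothing is de-duplicated, summing `env_size()` over several environments can overstate total memory use, which is exactly the caveat above.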

Hadley

On Tue, Sep 29, 2015 at 4:42 PM, Hervé Pagès  wrote:
> Hi,
>
> Currently object.size() is not very useful on environments as it always
> returns 56 bytes, no matter how big the environment is:
>
>   env1 <- new.env()
>   object.size(env1)  # 56 bytes
>
>   env2 <- new.env(hash=TRUE, size=7500L)
>   object.size(env2)  # 56 bytes
>
>   env3 <- list2env(list(a=runif(2500), L=LETTERS))
>   object.size(env3)  # 56 bytes
>
> This makes it pretty useless on reference class instances and other
> objects that use environments internally for caching or other purposes.
>
> What about changing this and making it return something more meaningful?
>
> Cheers,
> H.
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
http://had.co.nz/



Re: [Rd] making object.size() more meaningful on environments?

2015-09-29 Thread Gabriel Becker
Herve,

The problem then would be that for A a refClass whose fields take up N
bytes (in the sense that you mean), if we do

B <- A

A and B would look like they BOTH take up N bytes, for a total of 2N,
whereas AFAIK R would only be using ~ N + 2*56 bytes, right?
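
A small sketch of the sharing described here, assuming a made-up reference class `Holder` (hypothetical, used only for illustration): assigning `B <- A` copies the reference, so a change made through `B` is visible through `A`.

```r
library(methods)

# Hypothetical class used only for illustration.
Holder <- setRefClass("Holder", fields = list(data = "numeric"))

A <- Holder$new(data = runif(1e6))
B <- A                 # copies the reference, not the fields

B$data <- numeric(0)   # modify through B ...
length(A$data)         # ... and A sees the change: both names share one object

object.size(A)         # what object.size() currently reports for each name
object.size(B)
```

Any size function that counted the fields for both `A` and `B` would therefore report the same N bytes twice.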

~G






-- 
Gabriel Becker, PhD
Computational Biologist
Bioinformatics and Computational Biology
Genentech, Inc.



Re: [Rd] making object.size() more meaningful on environments?

2015-09-29 Thread Hervé Pagès

On 09/29/2015 03:18 PM, Hervé Pagès wrote:
> Hi Gabe,
>
> On 09/29/2015 02:51 PM, Gabriel Becker wrote:
>> Herve,
>>
>> The problem then would be that for A a refClass whose fields take up N
>> bytes (in the sense that you mean), if we do
>>
>> B <- A
>>
>> A and B would look like they BOTH take up N bytes, for a total of 2N,
>> whereas AFAIK R would only be using ~ N + 2*56 bytes, right?

Of course I should also add that this is actually the situation with
any object in R, not just refClass objects, because of the
copy-on-modification trick.

H.
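
To make the copy-on-modification point concrete, here is a minimal sketch using only base R. The variable names are arbitrary, and the comments describe the usual behaviour of ordinary vectors rather than a guarantee about R internals.

```r
x <- runif(1e6)
y <- x                       # no copy yet: x and y refer to the same vector

# object.size() reports the full size for each name, so summing the two
# overstates what R has actually allocated at this point.
as.numeric(object.size(x)) + as.numeric(object.size(y))

y[1] <- 0                    # modifying y is what triggers the copy
identical(x[-1], y[-1])      # the remaining elements are unchanged
```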



> Yes, but that's still a *much* better situation than the current one in
> my opinion. More generally speaking, counting shared memory for each
> object (or process) that uses it is a common, sensible, and accepted
> approach. No need to look far: a character vector is just a collection
> of pointers to stuff that is shared thru the global CHARSXP cache and
> AFAIK object.size() takes this stuff into account.
>
> H.





Re: [Rd] making object.size() more meaningful on environments?

2015-09-29 Thread Hervé Pagès

Hi Gabe,

On 09/29/2015 02:51 PM, Gabriel Becker wrote:
> Herve,
>
> The problem then would be that for A a refClass whose fields take up N
> bytes (in the sense that you mean), if we do
>
> B <- A
>
> A and B would look like they BOTH take up N bytes, for a total of 2N,
> whereas AFAIK R would only be using ~ N + 2*56 bytes, right?


Yes, but that's still a *much* better situation than the current one in
my opinion. More generally speaking, counting shared memory for each
object (or process) that uses it is a common, sensible, and accepted
approach. No need to look far: a character vector is just a collection
of pointers to stuff that is shared thru the global CHARSXP cache and
AFAIK object.size() takes this stuff into account.

H.
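
A quick way to see the character-vector case, as a sketch in base R. It assumes, as far as I can tell from how `object.size()` behaves, that each distinct string is counted once per vector; the vector names are arbitrary.

```r
# 1000 elements that all point to the same cached string
repeated <- rep("abc", 1000)
# 1000 elements pointing to 1000 distinct strings
distinct <- as.character(seq_len(1000))

object.size(repeated)  # pointer array plus one string
object.size(distinct)  # pointer array plus 1000 strings

# A second vector holding the same strings is reported at the same size,
# even though the strings themselves are shared through the cache.
other <- as.character(seq_len(1000))
object.size(other) == object.size(distinct)
```

So the strings are counted in full for every vector that uses them, which is the same convention being proposed for environments.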


