Re: [Rd] dget() much slower in recent R versions

2014-06-26 Thread Hervé Pagès



On 06/21/2014 12:56 AM, Prof Brian Ripley wrote:
> On 20/06/2014 15:37, Ista Zahn wrote:
>> Hello,
>>
>> I've noticed that dget() is much slower in the current and devel R
>> versions than in previous versions. In 2.15 reading a 1-row
>> data.frame takes less than half a second:
>>
>> > (which.r <- R.Version()$version.string)
>> [1] R version 2.15.2 (2012-10-26)
>> > x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
>> > dput(x, which.r)
>> > system.time(y <- dget(which.r))
>>    user  system elapsed
>>   0.546   0.033   0.586
>>
>> While in 3.1.0 and r-devel it takes around 7 seconds.
>>
>> > (which.r <- R.Version()$version.string)
>> [1] R version 3.1.0 (2014-04-10)
>> > x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
>> > dput(x, which.r)
>> > system.time(y <- dget(which.r))
>>    user  system elapsed
>>   6.920   0.060   7.074
>>
>> > (which.r <- R.Version()$version.string)
>> [1] R Under development (unstable) (2014-06-19 r65979)
>> > x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
>> > dput(x, which.r)
>> > system.time(y <- dget(which.r))
>>    user  system elapsed
>>   6.886   0.047   6.943
>>
>> I know dput/dget is probably not the right tool for this job:
>> nevertheless the slowdown is quite dramatic so I thought it was worth
>> calling attention to.
>
> This is completely the wrong way to do this. See ?dump.
>
> dget() basically calls eval(parse()).  parse() is much slower in R >=
> 3.0 mainly because it keeps more information.  Using keep.source=FALSE
> here speeds things up a lot.
>
> > system.time(y <- dget(which.r))
>    user  system elapsed
>   3.233   0.012   3.248
> > options(keep.source=FALSE)
> > system.time(y <- dget(which.r))
>    user  system elapsed
>   0.090   0.001   0.092

Nice. But why add the 'keep.source' arg to dget() in R-devel rev 65990:

  dget <- function(file, keep.source = FALSE)
      eval(parse(file = file, keep.source = FALSE))

(Note that the 'keep.source' arg is actually ignored.)

Why not just:

  dget <- function(file)
      eval(parse(file = file, keep.source = FALSE))
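
For comparison, here is a minimal sketch of a third option, one in which the argument is forwarded to parse() instead of being ignored. The name dget2 is hypothetical and this is not the R-devel source; it only illustrates a case where a caller might want keep.source = TRUE, namely when the file written by dput() contains a function and the source references should survive.

```r
# Hypothetical wrapper (not the R-devel definition): forward keep.source
# to parse() so that the caller's choice takes effect.
dget2 <- function(file, keep.source = FALSE)
    eval(parse(file = file, keep.source = keep.source))

f <- tempfile()
dput(function(a) a + 1, f)

g_plain <- dget2(f)                      # default: no source references kept
g_src   <- dget2(f, keep.source = TRUE)  # function keeps a "srcref" attribute

is.null(attr(g_plain, "srcref"))
is.null(attr(g_src, "srcref"))
```

The default stays FALSE, so the fast path discussed in this thread is what callers get unless they ask otherwise.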

Cheers,

H.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] dget() much slower in recent R versions

2014-06-21 Thread Prof Brian Ripley

On 20/06/2014 15:37, Ista Zahn wrote:
> Hello,
>
> I've noticed that dget() is much slower in the current and devel R
> versions than in previous versions. In 2.15 reading a 1-row
> data.frame takes less than half a second:
>
> > (which.r <- R.Version()$version.string)
> [1] R version 2.15.2 (2012-10-26)
> > x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
> > dput(x, which.r)
> > system.time(y <- dget(which.r))
>    user  system elapsed
>   0.546   0.033   0.586
>
> While in 3.1.0 and r-devel it takes around 7 seconds.
>
> > (which.r <- R.Version()$version.string)
> [1] R version 3.1.0 (2014-04-10)
> > x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
> > dput(x, which.r)
> > system.time(y <- dget(which.r))
>    user  system elapsed
>   6.920   0.060   7.074
>
> > (which.r <- R.Version()$version.string)
> [1] R Under development (unstable) (2014-06-19 r65979)
> > x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
> > dput(x, which.r)
> > system.time(y <- dget(which.r))
>    user  system elapsed
>   6.886   0.047   6.943
>
> I know dput/dget is probably not the right tool for this job:
> nevertheless the slowdown is quite dramatic so I thought it was worth
> calling attention to.


This is completely the wrong way to do this. See ?dump.
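
As a minimal sketch of the dump() route mentioned above, the following round-trips the same kind of data frame through dump() and source(). The temporary file and the environment named env are assumptions made for the example, not part of the original report.

```r
# Write an object as R code with dump(), then read it back with source().
x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))

dump_file <- tempfile(fileext = ".R")
dump("x", file = dump_file)   # writes an assignment of the form `x <- ...`

# Source into a separate environment so the original `x` is not overwritten;
# keep.source = FALSE avoids storing source references while parsing.
env <- new.env()
source(dump_file, local = env, keep.source = FALSE)

all.equal(env$x, x)
```

Unlike dput(), dump() writes the assignment itself, so source() recreates the object under its original name.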

dget() basically calls eval(parse()).  parse() is much slower in R >=
3.0 mainly because it keeps more information.  Using keep.source=FALSE
here speeds things up a lot.

> system.time(y <- dget(which.r))
   user  system elapsed
  3.233   0.012   3.248
> options(keep.source=FALSE)
> system.time(y <- dget(which.r))
   user  system elapsed
  0.090   0.001   0.092
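
The effect described above can be seen directly on the parsed expression. This is a small sketch, assuming a temporary file f and a helper has_srcref() introduced only for this example: with keep.source = TRUE, parse() attaches source references to its result, and with keep.source = FALSE it does not, while evaluation gives the same object either way.

```r
x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
f <- tempfile()
dput(x, f)

# dget(f) amounts to eval(parse(file = f)); only the parse step differs here.
with_src    <- parse(file = f, keep.source = TRUE)
without_src <- parse(file = f, keep.source = FALSE)

# Helper for this example only: does the parsed expression carry srcrefs?
has_srcref <- function(e) !is.null(attr(e, "srcref"))
has_srcref(with_src)
has_srcref(without_src)

# Both expressions evaluate to the same data frame.
all.equal(eval(with_src), eval(without_src))

# Timings depend on the machine and R version, so none are quoted here.
system.time(for (i in 1:100) parse(file = f, keep.source = TRUE))
system.time(for (i in 1:100) parse(file = f, keep.source = FALSE))
```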


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] dget() much slower in recent R versions

2014-06-21 Thread Ista Zahn
Makes sense, thanks for the explanation.

Best,
Ista




[Rd] dget() much slower in recent R versions

2014-06-20 Thread Ista Zahn
Hello,

I've noticed that dget() is much slower in the current and devel R
versions than in previous versions. In 2.15 reading a 1-row
data.frame takes less than half a second:

> (which.r <- R.Version()$version.string)
[1] R version 2.15.2 (2012-10-26)
> x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
> dput(x, which.r)
> system.time(y <- dget(which.r))
   user  system elapsed
  0.546   0.033   0.586

While in 3.1.0 and r-devel it takes around 7 seconds.

> (which.r <- R.Version()$version.string)
[1] R version 3.1.0 (2014-04-10)
> x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
> dput(x, which.r)
> system.time(y <- dget(which.r))
   user  system elapsed
  6.920   0.060   7.074

> (which.r <- R.Version()$version.string)
[1] R Under development (unstable) (2014-06-19 r65979)
> x <- data.frame(matrix(sample(letters, 10, replace = TRUE), ncol = 10))
> dput(x, which.r)
> system.time(y <- dget(which.r))
   user  system elapsed
  6.886   0.047   6.943


I know dput/dget is probably not the right tool for this job:
nevertheless the slowdown is quite dramatic so I thought it was worth
calling attention to.

Best,
Ista
