Re: [Rd] nlminb with constraints failing on some platforms

2019-01-31 Thread ProfJCNash
I'm not entirely sure what you are asking. However, optimx is really NOT
meant as a production tool. I intend it as a way to
1) try out a lot of optimizers quickly on a user's problem or problem
class to select a method or methods that suit well;
2) provide (in the source code of optimr()) an example of how to
call the particular optimizers. They all have a lot of different syntax
elements, which in fact are the biggest headache in building and
extending optimx.
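
A minimal sketch of use 1), not taken from the original message: it uses only
base R's optim() as a stand-in for opm(), runs several methods on the test
function from this thread without bounds (so that every method applies), and
tabulates the results. The names `methods`, `trial` and `best` are
illustrative assumptions.

```r
# Test function from this thread (global minimum at x = rep(1, 10)).
f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )

# Base-R methods tried here; opm(..., method="ALL") covers many more.
methods <- c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B")

# Run each method from the same start and keep a few summary numbers.
trial <- t(sapply(methods, function(m) {
  res <- optim(rep(0, 10), f, method = m)
  c(value = res$value,
    fevals = unname(res$counts["function"]),
    convergence = res$convergence)
}))

# Sort by objective value, best first, and note the winning method.
trial <- trial[order(trial[, "value"]), ]
best <- rownames(trial)[1]
print(trial)
```

Once a method has been selected this way, production code would call that
optimizer directly rather than going through the comparison wrapper.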

Best, JN

On 2019-01-31 9:26 a.m., Amit Mittal wrote:
> Prof Nash, Prof Galanos
> 
> Is it possible to use a generic code stub in front of packages that use
> optimx to improve optimx use or curtail it according to the requirements?
> 
> 
> Best Regards
> 
> Amit
> 
> +91 7899381263
>
> Please request Skype as available 
> 
> 5th Year FPM (Ph.D.) in Finance and Accounting Area
> 
> Indian Institute of Management, Lucknow, (U.P.) 226013 India
> 
> http://bit.ly/2A2PhD
> 
> AEA Job profile : http://bit.ly/AEAamit
> 
> FMA 2 page profile : http://bit.ly/FMApdf2p
> 
> SSRN top10% downloaded since July 2017: http://ssrn.com/author=2665511
> 
> 
> 
> On Thu, Jan 31, 2019 at 7:22 PM ProfJCNash wrote:
> 
> This is not about the failure on some platforms, which is an important
> issue. However, what is below may provide a temporary workaround until
> the source of the problem is uncovered.
> 
> FWIW, the problem seems fairly straightforward for most optimizers at my
> disposal in the R-forge (developmental) version of the optimx package at
> https://r-forge.r-project.org/projects/optimizer/
> 
> I used the code
> 
> ## KKristensen19nlminb.R
> f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )
> opt <- nlminb(rep(0, 10), f, lower=-1, upper=3)
> xhat <- rep(1, 10)
> abs( opt$objective - f(xhat) ) < 1e-4  ## Must be TRUE
> opt
> library(optimx)
> optx <- opm(rep(0,10), f, lower=-1, upper=3, method="ALL")
> summary(optx, order=value)
> optxc <- opm(rep(0,10), f, gr="grcentral", lower=-1, upper=3,
> method="ALL")
> summary(optxc, order=value)
> optxn <- opm(rep(0,10), f, gr="grnd", lower=-1, upper=3, method="ALL")
> summary(optxn, order=value)
> 
> It should not be too difficult to actually supply the gradient, which
> would give speedier and more reliable outcomes.
> 
> 
> JN
> 
> 
> 
> On 2019-01-28 3:56 a.m., Kasper Kristensen via R-devel wrote:
> > I've noticed unstable behavior of nlminb on some Linux systems.
> > The problem can be reproduced by compiling R-3.5.2 using gcc-8.2 and
> > running the following snippet:
> >
> > f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )
> > opt <- nlminb(rep(0, 10), f, lower=-1, upper=3)
> > xhat <- rep(1, 10)
> > abs( opt$objective - f(xhat) ) < 1e-4  ## Must be TRUE
> >
> > The example works perfectly when removing the bounds. However,
> > when bounds are added the snippet returns 'FALSE'.
> >
> > An older R version (3.4.4), compiled using the same gcc-8.2, did
> > not have the problem. Between the two versions R has changed the
> > flags to compile Fortran sources:
> >
> > < SAFE_FFLAGS = -O2 -fomit-frame-pointer -ffloat-store
> > ---
> >> SAFE_FFLAGS = -O2 -fomit-frame-pointer -msse2 -mfpmath=sse
> >
> > Reverting to the old SAFE_FFLAGS 'solves' the problem.
> >
> >> sessionInfo()
> > R version 3.5.2 (2018-12-20)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> > Running under: Scientific Linux release 6.4 (Carbon)
> >
> > Matrix products: default
> > BLAS/LAPACK:
> > /zdata/groups/nfsopt/intel/2018update3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so
> >
> > locale:
> > [1] C
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_3.5.2
> >
> >
> >
> >       [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org  mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> __
> R-devel@r-project.org  mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Object.size() should not visit every element for alt-rep strings, or there should be an altstring_objectsize_method

2019-01-31 Thread Travers Ching
Hi Luke,

Thanks for the response.  Oddly, the message you replied to is a
duplicate of a post I sent WEEKS ago that is only showing up now.
I initially thought it had been flagged as spam because of the
github link, so I rewrote the email (several times, in fact); you
can see the other thread.  Very weird.

Also, the good people at rstudio seem to have fixed the issue!

Thanks
Travers

On Thu, Jan 31, 2019 at 5:35 AM Tierney, Luke  wrote:
>
> You should really take this up with RStudio. Calling object.size on
> every top level assignment as they appear to do is a bad idea, even
> without ALTREP. object.size is only a cheap operation for simple
> atomic vectors. For anything with recursive structure it needs to walk
> the object, so the effort is proportional to object size:
>
> > x <- rep("A", 1e8)
> > system.time(object.size(x))
>    user  system elapsed
>   1.222   0.624   1.850
> > x <- rep(list(1), 1e8)
> > system.time(object.size(x))
>    user  system elapsed
>   1.247   0.022   1.273
>
> The current help for object.size says
>
>   Provides an estimate of the memory that is being used to store an
>   R object.
>
> If this is interpreted as the current memory use, which could change
> in the ALTREP context (or for environments, though there the changes
> are ignored), then we could define object.size for ALTREP objects to
> avoid any ALTREP-specific computation. I'm not convinced yet that this
> is a good idea, but even if we do change this at the R level,
> RStudio would still be well-advised to have another look at what they
> are doing.
>
> Best,
>
> luke
>
> On Tue, 15 Jan 2019, Travers Ching wrote:
>
> >
> > Below is a toy alt-rep string example, that generates N random strings:
> >
> > https://gist.github.com/traversc/a48a504eb062554f2d6ff8043ca16f9c
> >
> > example:
> > `x <- altrandomStrings(1e8)`
> > `head(x)`
> > [1] "2PN0bdwPY7CA8M06zVKEkhHgZVgtV1" "5PN2qmWqBlQ9wQj99nsQzldVI5ZuGX" ...
> > `object.size(x)`
> >
> > Object.size will call the `set_altstring_Elt_method` for every single
> > element, materializing (slowly) every element of the vector.  This is
> > a problem mostly in R-studio since object.size is called
> > automatically, defeating the purpose of alt-rep entirely.
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:  319-335-3386
> Department of Statistics and        Fax:    319-335-3017
> Actuarial Science
> 241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
> Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



Re: [Rd] Runnable R packages

2019-01-31 Thread David Lindelof
Would you care to share how your package installs its own dependencies? I
assume this is done during the call to `main()`? (Last time I checked, R
CMD INSTALL would not install a package's dependencies...)


On Thu, Jan 31, 2019 at 4:38 PM Barry Rowlingson <
b.rowling...@lancaster.ac.uk> wrote:

>
>
> On Thu, Jan 31, 2019 at 3:14 PM David Lindelof  wrote:
>
>>
>> In summary, I'm convinced R would benefit from something similar to Java's
>> `Main-Class` header or Python's `__main__()` function. A new R CMD command
>> would take a package, install its dependencies, and run its "main"
>> function.
>
>
>
> I just created and built a very boilerplate R package called "runme". I
> can install its dependencies and run its "main" function with:
>
>  $ R CMD INSTALL runme_0.0.0.9000.tar.gz
>  $ R -e 'runme::main()'
>
> No new R CMDs needed. Now my choice of "main" is arbitrary, whereas with
> python and java and C the entrypoint is more tightly specified (__name__ ==
> "__main__" in python, int main(..) in C and so on). But I don't think
> that's much of a problem.
>
> Does that not satisfy your requirements close enough? If you want it in
> one line then:
>
> R CMD INSTALL runme_0.0.0.9000.tar.gz && R -e 'runme::main()'
>
> will do the second if the first succeeds (Unix shells).
>
> You could write a script for $RHOME/bin/RUN which would be a two-liner and
> that could mandate the use of "main" as an entry point. But good luck
> getting anything into base R.
>
> Barry
>
>
>
>
>> If we have this machinery available, we could even consider
>> reaching out to Spark (and other tech stacks) developers and make it
>> easier
>> to develop R applications for those platforms.
>>
>>
>
>



Re: [Rd] Runnable R packages

2019-01-31 Thread Jan Gorecki
Quoting:

"In summary, I'm convinced R would benefit from something similar to Java's
`Main-Class` header or Python's `__main__()` function. A new R CMD command
would take a package, install its dependencies, and run its "main"
function."

This somewhat increases the scope of your idea. A new R CMD command
that redirects to "main" is an interesting idea. On the other hand, it
would be more limiting for the user than what can already be done
now: Rscript -e 'mypkg::mymain("myparam")' (or littler, which IMO
should be shipped with R).
For a production system one doesn't want to just "install its
dependencies". First the dependencies have to be mirrored and their
versions frozen. Then your package has to be tested against that set
of dependencies. Once that succeeds, the same set of packages should
be used for the production deployment. For those processes you might
find the tools4pkgs branch of base R useful (packages.dcf and
mirror.packages functions); unfortunately it has not been merged:
https://github.com/wch/r-source/compare/tools4pkgs

Jan Gorecki

On Thu, Jan 31, 2019 at 9:08 PM Barry Rowlingson
 wrote:
>
> On Thu, Jan 31, 2019 at 3:14 PM David Lindelof  wrote:
>
> >
> > In summary, I'm convinced R would benefit from something similar to Java's
> > `Main-Class` header or Python's `__main__()` function. A new R CMD command
> > would take a package, install its dependencies, and run its "main"
> > function.
>
>
>
> I just created and built a very boilerplate R package called "runme". I can
> install its dependencies and run its "main" function with:
>
>  $ R CMD INSTALL runme_0.0.0.9000.tar.gz
>  $ R -e 'runme::main()'
>
> No new R CMDs needed. Now my choice of "main" is arbitrary, whereas with
> python and java and C the entrypoint is more tightly specified (__name__ ==
> "__main__" in python, int main(..) in C and so on). But I don't think
> that's much of a problem.
>
> Does that not satisfy your requirements close enough? If you want it in one
> line then:
>
> R CMD INSTALL runme_0.0.0.9000.tar.gz && R -e 'runme::main()'
>
> will do the second if the first succeeds (Unix shells).
>
> You could write a script for $RHOME/bin/RUN which would be a two-liner and
> that could mandate the use of "main" as an entry point. But good luck
> getting anything into base R.
>
> Barry
>
>
>
>
> > If we have this machinery available, we could even consider
> > reaching out to Spark (and other tech stacks) developers and make it easier
> > to develop R applications for those platforms.
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] Runnable R packages

2019-01-31 Thread Barry Rowlingson
On Thu, Jan 31, 2019 at 3:14 PM David Lindelof  wrote:

>
> In summary, I'm convinced R would benefit from something similar to Java's
> `Main-Class` header or Python's `__main__()` function. A new R CMD command
> would take a package, install its dependencies, and run its "main"
> function.



I just created and built a very boilerplate R package called "runme". I can
install its dependencies and run its "main" function with:

 $ R CMD INSTALL runme_0.0.0.9000.tar.gz
 $ R -e 'runme::main()'

No new R CMDs needed. Now my choice of "main" is arbitrary, whereas with
python and java and C the entrypoint is more tightly specified (__name__ ==
"__main__" in python, int main(..) in C and so on). But I don't think
that's much of a problem.

Does that not satisfy your requirements close enough? If you want it in one
line then:

R CMD INSTALL runme_0.0.0.9000.tar.gz && R -e 'runme::main()'

will do the second if the first succeeds (Unix shells).

You could write a script for $RHOME/bin/RUN which would be a two-liner and
that could mandate the use of "main" as an entry point. But good luck
getting anything into base R.

Barry




> If we have this machinery available, we could even consider
> reaching out to Spark (and other tech stacks) developers and make it easier
> to develop R applications for those platforms.
>
>



Re: [Rd] Runnable R packages

2019-01-31 Thread Duncan Murdoch

On 31/01/2019 9:32 a.m., David Lindelof wrote:

> Belated thanks to all who replied to my initial query. In summary, three
> approaches have been mentioned to run R code "in production": 1)
> ShinyProxy, mentioned by Tobias, for deploying Shiny applications; 2)
> Docker-like solutions, mentioned by Gergely and Iñaki; and 3) Solutions
> based on Rscript or littler, mentioned by Dirk.
>
> I can't speak to 1) because I don't currently use Shiny. And it seems to me
> that Docker-like solutions will still need some "point of entry" for the R
> application, which will have to be Rscript or littler.
>
> In my first email, I observed that Rscript expects a single expression or a
> single script, which is probably why (in my experience) many data
> scientists tend to provide their code in a very limited number of files.
> Gergely disagreed, arguing to the contrary that data scientists are
> encouraged to provide their application as an R package called by a short
> script executed by Rscript. But this doesn't happen where I work for
> several reasons:
>
>    - it implies installing your package on the production machine(s),
>    including its dependencies, which must be done by hand
>    - some machine learning platforms will simply not accept code provided
>    as an R package
>    - we have some "big data" use cases for which we need Spark; Spark can
>    run R or Python code, but only when it is provided as a single file. (On
>    the other hand, Spark can run applications provided as JAR files)
>
> In summary, I'm convinced R would benefit from something similar to Java's
> `Main-Class` header or Python's `__main__()` function. A new R CMD command
> would take a package, install its dependencies, and run its "main"
> function. If we have this machinery available, we could even consider
> reaching out to Spark (and other tech stacks) developers and make it easier
> to develop R applications for those platforms.
>
> A candid comment from Dirk suggested that I should implement this myself,
> which I would be happy to do, provided this is the normal procedure. Or is
> there a more formal process I should follow?


You can't implement it to run under R CMD, but it should be 
straightforward to put this in an R package, to be run by Rscript using 
something like


  Rscript -e "yourpackage::run_main('somepackage')"

You can use the installation code from the `remotes` package, so 
run_main() could be a pretty simple function.
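
A minimal sketch of what such a run_main() might look like, under stated
assumptions: the function name, its `entry` argument and the convention that
the target package exports a function called "main" are all hypothetical, and
the package is assumed to be installable from a configured CRAN-like
repository via remotes::install_cran(), which also pulls in its dependencies.

```r
# Hypothetical helper: load a package, installing it first if needed,
# then call one of its exported functions as the entry point.
run_main <- function(pkg, ..., entry = "main") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Assumption: pkg is available from a configured CRAN-like repository.
    # remotes::install_cran() also installs the package's dependencies.
    remotes::install_cran(pkg)
  }
  # Same lookup that pkg::entry performs.
  main <- getExportedValue(pkg, entry)
  stopifnot(is.function(main))
  main(...)
}

# Demonstration with a package that ships with R, so nothing is installed:
run_main("tools", "report.tar.gz", entry = "file_ext")
```

For a package distributed as a local tarball rather than through a
repository, remotes::install_local() would be the analogous call.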


Duncan Murdoch



Re: [Rd] Runnable R packages

2019-01-31 Thread David Lindelof
Belated thanks to all who replied to my initial query. In summary, three
approaches have been mentioned to run R code "in production": 1)
ShinyProxy, mentioned by Tobias, for deploying Shiny applications; 2)
Docker-like solutions, mentioned by Gergely and Iñaki; and 3) Solutions
based on Rscript or littler, mentioned by Dirk.

I can't speak to 1) because I don't currently use Shiny. And it seems to me
that Docker-like solutions will still need some "point of entry" for the R
application, which will have to be Rscript or littler.

In my first email, I observed that Rscript expects a single expression or a
single script, which is probably why (in my experience) many data
scientists tend to provide their code in a very limited number of files.
Gergely disagreed, arguing to the contrary that data scientists are
encouraged to provide their application as an R package called by a short
script executed by Rscript. But this doesn't happen where I work for
several reasons:

   - it implies installing your package on the production machine(s),
   including its dependencies, which must be done by hand
   - some machine learning platforms will simply not accept code provided
   as an R package
   - we have some "big data" use cases for which we need Spark; Spark can
   run R or Python code, but only when it is provided as a single file. (On
   the other hand, Spark can run applications provided as JAR files)

In summary, I'm convinced R would benefit from something similar to Java's
`Main-Class` header or Python's `__main__()` function. A new R CMD command
would take a package, install its dependencies, and run its "main"
function. If we have this machinery available, we could even consider
reaching out to Spark (and other tech stacks) developers and make it easier
to develop R applications for those platforms.

A candid comment from Dirk suggested that I should implement this myself,
which I would be happy to do, provided this is the normal procedure. Or is
there a more formal process I should follow?

Kind regards,

David Lindelöf



Re: [Rd] nlminb with constraints failing on some platforms

2019-01-31 Thread Amit Mittal
Prof Nash, Prof Galanos

Is it possible to use a generic code stub in front of packages that use
optimx to improve optimx use or curtail it according to the requirements?


Best Regards

Amit

+91 7899381263











 

Please request Skype as available

5th Year FPM (Ph.D.) in Finance and Accounting Area

Indian Institute of Management, Lucknow, (U.P.) 226013 India

http://bit.ly/2A2PhD

AEA Job profile : http://bit.ly/AEAamit

FMA 2 page profile : http://bit.ly/FMApdf2p
SSRN top10% downloaded since July 2017: http://ssrn.com/author=2665511



On Thu, Jan 31, 2019 at 7:22 PM ProfJCNash  wrote:

> This is not about the failure on some platforms, which is an important
> issue. However, what is below may provide a temporary workaround until
> the source of the problem is uncovered.
>
> FWIW, the problem seems fairly straightforward for most optimizers at my
> disposal in the R-forge (developmental) version of the optimx package at
> https://r-forge.r-project.org/projects/optimizer/
>
> I used the code
>
> ## KKristensen19nlminb.R
> f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )
> opt <- nlminb(rep(0, 10), f, lower=-1, upper=3)
> xhat <- rep(1, 10)
> abs( opt$objective - f(xhat) ) < 1e-4  ## Must be TRUE
> opt
> library(optimx)
> optx <- opm(rep(0,10), f, lower=-1, upper=3, method="ALL")
> summary(optx, order=value)
> optxc <- opm(rep(0,10), f, gr="grcentral", lower=-1, upper=3, method="ALL")
> summary(optxc, order=value)
> optxn <- opm(rep(0,10), f, gr="grnd", lower=-1, upper=3, method="ALL")
> summary(optxn, order=value)
>
> It should not be too difficult to actually supply the gradient, which
> would give speedier and more reliable outcomes.
>
>
> JN
>
>
>
> On 2019-01-28 3:56 a.m., Kasper Kristensen via R-devel wrote:
> > I've noticed unstable behavior of nlminb on some Linux systems. The
> > problem can be reproduced by compiling R-3.5.2 using gcc-8.2 and running
> > the following snippet:
> >
> > f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )
> > opt <- nlminb(rep(0, 10), f, lower=-1, upper=3)
> > xhat <- rep(1, 10)
> > abs( opt$objective - f(xhat) ) < 1e-4  ## Must be TRUE
> >
> > The example works perfectly when removing the bounds. However, when
> > bounds are added the snippet returns 'FALSE'.
> >
> > An older R version (3.4.4), compiled using the same gcc-8.2, did not
> > have the problem. Between the two versions R has changed the flags to
> > compile Fortran sources:
> >
> > < SAFE_FFLAGS = -O2 -fomit-frame-pointer -ffloat-store
> > ---
> >> SAFE_FFLAGS = -O2 -fomit-frame-pointer -msse2 -mfpmath=sse
> >
> > Reverting to the old SAFE_FFLAGS 'solves' the problem.
> >
> >> sessionInfo()
> > R version 3.5.2 (2018-12-20)
> > Platform: x86_64-pc-linux-gnu (64-bit)
> > Running under: Scientific Linux release 6.4 (Carbon)
> >
> > Matrix products: default
> > BLAS/LAPACK:
> > /zdata/groups/nfsopt/intel/2018update3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so
> >
> > locale:
> > [1] C
> >
> > attached base packages:
> > [1] stats graphics  grDevices utils datasets  methods   base
> >
> > loaded via a namespace (and not attached):
> > [1] compiler_3.5.2
> >
> >
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



Re: [Rd] nlminb with constraints failing on some platforms

2019-01-31 Thread ProfJCNash
This is not about the failure on some platforms, which is an important
issue. However, what is below may provide a temporary workaround until
the source of the problem is uncovered.

FWIW, the problem seems fairly straightforward for most optimizers at my
disposal in the R-forge (developmental) version of the optimx package at
https://r-forge.r-project.org/projects/optimizer/

I used the code

## KKristensen19nlminb.R
f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )
opt <- nlminb(rep(0, 10), f, lower=-1, upper=3)
xhat <- rep(1, 10)
abs( opt$objective - f(xhat) ) < 1e-4  ## Must be TRUE
opt
library(optimx)
optx <- opm(rep(0,10), f, lower=-1, upper=3, method="ALL")
summary(optx, order=value)
optxc <- opm(rep(0,10), f, gr="grcentral", lower=-1, upper=3, method="ALL")
summary(optxc, order=value)
optxn <- opm(rep(0,10), f, gr="grnd", lower=-1, upper=3, method="ALL")
summary(optxn, order=value)

It should not be too difficult to actually supply the gradient, which
would give speedier and more reliable outcomes.
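
A sketch of one way to supply that gradient, derived by hand from f and not
part of the original script; the names `gr` and `opt_g` are illustrative.
With d = diff(x) and n = length(x), each log term contributes
2*d/(d^2 + .01) to the later of its two coordinates and the negative of that
to the earlier one, and the (x[1]-1)^2 term, which sum() counts n-1 times,
contributes 2*(n-1)*(x[1]-1) to the first coordinate.

```r
# Objective from this thread.
f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )

# Analytic gradient of f (hand-derived; an assumption, not from the thread).
gr <- function(x) {
  n <- length(x)
  d <- diff(x)
  w <- 2 * d / (d^2 + 0.01)            # derivative of log(d^2 + .01) w.r.t. d
  g <- c(0, w) - c(w, 0)               # +w on x[i+1], -w on x[i]
  g[1] <- g[1] + 2 * (n - 1) * (x[1] - 1)
  g
}

# Same bounded problem as above, now with the gradient supplied.
opt_g <- nlminb(rep(0, 10), f, gradient = gr, lower = -1, upper = 3)
opt_g$objective
```

The same function can be passed as the gr argument of opm() in place of
"grcentral" or "grnd".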


JN



On 2019-01-28 3:56 a.m., Kasper Kristensen via R-devel wrote:
> I've noticed unstable behavior of nlminb on some Linux systems. The problem 
> can be reproduced by compiling R-3.5.2 using gcc-8.2 and running the 
> following snippet:
> 
> f <- function(x) sum( log(diff(x)^2+.01) + (x[1]-1)^2 )
> opt <- nlminb(rep(0, 10), f, lower=-1, upper=3)
> xhat <- rep(1, 10)
> abs( opt$objective - f(xhat) ) < 1e-4  ## Must be TRUE
> 
> The example works perfectly when removing the bounds. However, when bounds 
> are added the snippet returns 'FALSE'.
> 
> An older R version (3.4.4), compiled using the same gcc-8.2, did not have the 
> problem. Between the two versions R has changed the flags to compile Fortran 
> sources:
> 
> < SAFE_FFLAGS = -O2 -fomit-frame-pointer -ffloat-store
> ---
>> SAFE_FFLAGS = -O2 -fomit-frame-pointer -msse2 -mfpmath=sse
> 
> Reverting to the old SAFE_FFLAGS 'solves' the problem.
> 
>> sessionInfo()
> R version 3.5.2 (2018-12-20)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Scientific Linux release 6.4 (Carbon)
> 
> Matrix products: default
> BLAS/LAPACK: 
> /zdata/groups/nfsopt/intel/2018update3/compilers_and_libraries_2018.3.222/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so
> 
> locale:
> [1] C
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> loaded via a namespace (and not attached):
> [1] compiler_3.5.2
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] Object.size() should not visit every element for alt-rep strings, or there should be an altstring_objectsize_method

2019-01-31 Thread Tierney, Luke
You should really take this up with RStudio. Calling object.size on
every top level assignment as they appear to do is a bad idea, even
without ALTREP. object.size is only a cheap operation for simple
atomic vectors. For anything with recursive structure it needs to walk
the object, so the effort is proportional to object size:

> x <- rep("A", 1e8)
> system.time(object.size(x))
user  system elapsed
   1.222   0.624   1.850 
> x <- rep(list(1), 1e8)
> system.time(object.size(x))
user  system elapsed
   1.247   0.022   1.273

The current help for object.size says

  Provides an estimate of the memory that is being used to store an
  R object.

If this is interpreted as the current memory use, which could change
in the ALTREP context (or for environments, though there the changes
are ignored), then we could define object.size for ALTREP objects to
avoid any ALTREP-specific computation. I'm not convinced yet that this
is a good idea, but even if we do change this at the R level,
RStudio would still be well-advised to have another look at what they
are doing.

Best,

luke

On Tue, 15 Jan 2019, Travers Ching wrote:

>
> Below is a toy alt-rep string example, that generates N random strings:
>
> https://gist.github.com/traversc/a48a504eb062554f2d6ff8043ca16f9c
>
> example:
> `x <- altrandomStrings(1e8)`
> `head(x)`
> [1] "2PN0bdwPY7CA8M06zVKEkhHgZVgtV1" "5PN2qmWqBlQ9wQj99nsQzldVI5ZuGX" ...
> `object.size(x)`
>
> Object.size will call the `set_altstring_Elt_method` for every single
> element, materializing (slowly) every element of the vector.  This is
> a problem mostly in R-studio since object.size is called
> automatically, defeating the purpose of alt-rep entirely.
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



[Rd] Object.size() should not visit every element for alt-rep strings, or there should be an altstring_objectsize_method

2019-01-31 Thread Travers Ching


Below is a toy alt-rep string example, that generates N random strings:

https://gist.github.com/traversc/a48a504eb062554f2d6ff8043ca16f9c

example:
`x <- altrandomStrings(1e8)`
`head(x)`
[1] "2PN0bdwPY7CA8M06zVKEkhHgZVgtV1" "5PN2qmWqBlQ9wQj99nsQzldVI5ZuGX" ...
`object.size(x)`

Object.size will call the `set_altstring_Elt_method` for every single
element, materializing (slowly) every element of the vector.  This is
a problem mostly in R-studio since object.size is called
automatically, defeating the purpose of alt-rep entirely.
