Re: [Rd] faster base::sequence
Hi Romain,

FWIW I see at least 2 small differences in the way sequence_c() behaves with respect to good old sequence(): zeros and names.

  > sequence(c(a=5, b=0, c=2))
  a1 a2 a3 a4 a5 c1 c2
   1  2  3  4  5  1  2

sequence_c() ignores the names and doesn't support zeros in the input.

Cheers,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
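A minimal sketch of the two behaviours mentioned in this message, written in plain C without the R API. It is not the sequence_c() code from the thread: `sequence_with_names`, `LABEL_LEN` and the buffer layout are hypothetical names chosen for illustration. A zero count contributes no elements because the inner loop never runs, and each label is the input name followed by the running index (or the bare name when the count is 1, which is how unlist() labels a length-one element).

```c
#include <stddef.h>
#include <stdio.h>

#define LABEL_LEN 32  /* hypothetical fixed label size for this sketch */

/* Fills values[] with 1..counts[0], 1..counts[1], ... and labels[] with the
 * matching names. Writes at most cap elements and returns how many were
 * written. Counts <= 0 contribute nothing. */
size_t sequence_with_names(const char *const *names, const int *counts,
                           size_t n, int *values,
                           char (*labels)[LABEL_LEN], size_t cap)
{
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        /* For a zero count the condition j <= counts[i] is false at once. */
        for (int j = 1; j <= counts[i] && k < cap; j++) {
            values[k] = j;
            if (counts[i] == 1)
                snprintf(labels[k], LABEL_LEN, "%s", names[i]);
            else
                snprintf(labels[k], LABEL_LEN, "%s%d", names[i], j);
            k++;
        }
    }
    return k;
}
```

The caller owns both buffers here, which keeps the sketch free of allocation; a version for R would instead allocate an integer vector and a names attribute.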
Re: [Rd] faster base::sequence
On 28/11/10 11:30, Prof Brian Ripley wrote:
> On Sun, 28 Nov 2010, Romain Francois wrote:
>
>> On 28/11/10 10:30, Prof Brian Ripley wrote:
>>> Is sequence used enough to warrant this? As the help page says
>>>
>>> Note that ‘sequence <- function(nvec) unlist(lapply(nvec, seq_len))’
>>> and it mainly exists in reverence to the very early history of R.
>>
>> I don't know. Would it be used more if it were more efficient?
>
> It is for you to make a compelling case for others to do work (maintain
> changed code) for your wish.

No trouble. The patch is there; if anyone finds it interesting or compelling, they will speak up, I suppose. Otherwise it is fine with me if it ends up in no man's land. I have the code; if I want to use it, I can squeeze it into a package.

>>> I regard it as unsafe to assume that NA_INTEGER will always be negative,
>>> and bear in mind that at some point not so far off R integers (or at
>>> least lengths) will need to be more than 32-bit.
>>
>> Sure. Updated and dressed up as a patch.
>>
>> I've made it a .Call because I'm not really comfortable with .Internal,
>> etc ...
>>
>> Do you mean that I should also use something else instead of "int" and
>> "int*"? Is there some future-proof typedef or macro for the type
>> associated with INTSXP?
>
> Not yet. I was explaining why NA_INTEGER might change.

Sure, thanks for the reminder.

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube
Re: [Rd] faster base::sequence
On Sun, 28 Nov 2010, Romain Francois wrote:

> On 28/11/10 10:30, Prof Brian Ripley wrote:
>> Is sequence used enough to warrant this? As the help page says
>>
>> Note that ‘sequence <- function(nvec) unlist(lapply(nvec, seq_len))’
>> and it mainly exists in reverence to the very early history of R.
>
> I don't know. Would it be used more if it were more efficient?

It is for you to make a compelling case for others to do work (maintain changed code) for your wish.

>> I regard it as unsafe to assume that NA_INTEGER will always be negative,
>> and bear in mind that at some point not so far off R integers (or at
>> least lengths) will need to be more than 32-bit.
>
> Sure. Updated and dressed up as a patch.
>
> I've made it a .Call because I'm not really comfortable with .Internal,
> etc ...
>
> Do you mean that I should also use something else instead of "int" and
> "int*"? Is there some future-proof typedef or macro for the type
> associated with INTSXP?

Not yet. I was explaining why NA_INTEGER might change.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [Rd] faster base::sequence
On 28/11/10 10:30, Prof Brian Ripley wrote:
> Is sequence used enough to warrant this? As the help page says
>
> Note that ‘sequence <- function(nvec) unlist(lapply(nvec, seq_len))’
> and it mainly exists in reverence to the very early history of R.

I don't know. Would it be used more if it were more efficient?

> I regard it as unsafe to assume that NA_INTEGER will always be negative,
> and bear in mind that at some point not so far off R integers (or at
> least lengths) will need to be more than 32-bit.

Sure. Updated and dressed up as a patch.

I've made it a .Call because I'm not really comfortable with .Internal, etc ...

Do you mean that I should also use something else instead of "int" and "int*"? Is there some future-proof typedef or macro for the type associated with INTSXP?

Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube

Index: src/library/base/R/seq.R
===
--- src/library/base/R/seq.R	(revision 53680)
+++ src/library/base/R/seq.R	(working copy)
@@ -85,4 +85,6 @@
 }
 
 ## In reverence to the very first versions of R which already had sequence():
-sequence <- function(nvec) unlist(lapply(nvec, seq_len))
+# sequence <- function(nvec) unlist(lapply(nvec, seq_len))
+sequence <- function(nvec) .Call( "sequence", as.integer(nvec), PACKAGE = "base" )
+
Index: src/main/registration.c
===
--- src/main/registration.c	(revision 53680)
+++ src/main/registration.c	(working copy)
@@ -245,6 +245,8 @@
     CALLDEF(bitwiseOr, 2),
     CALLDEF(bitwiseXor, 2),
 
+/* sequence */
+CALLDEF(sequence,1),
     {NULL, NULL, 0}
 };
Index: src/main/seq.c
===
--- src/main/seq.c	(revision 53680)
+++ src/main/seq.c	(working copy)
@@ -679,3 +679,28 @@
     return ans;
 }
 
+
+SEXP attribute_hidden sequence(SEXP x)
+{
+	R_len_t n = length(x), s = 0 ;
+	int *px = INTEGER(x) ;
+	int x_i ;
+	/* error checking */
+	for( int i=0; i
[...]
Re: [Rd] faster base::sequence
Is sequence used enough to warrant this? As the help page says

    Note that ‘sequence <- function(nvec) unlist(lapply(nvec, seq_len))’
    and it mainly exists in reverence to the very early history of R.

I regard it as unsafe to assume that NA_INTEGER will always be negative, and bear in mind that at some point not so far off R integers (or at least lengths) will need to be more than 32-bit.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
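One way to read the NA_INTEGER remark in this message is that the missing-value sentinel should be tested by equality before any sign test, so that the check keeps working whatever value the sentinel takes. The sketch below is plain C without the R headers: `NA_INT_SENTINEL` is a stand-in for R's NA_INTEGER, and `checked_total` and the status enum are hypothetical names. It also accumulates the total in a 64-bit integer, as one possible answer to the point about lengths outgrowing 32 bits.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for R's NA_INTEGER so the sketch compiles without R headers.
 * The logic below never relies on the sign of this value. */
#define NA_INT_SENTINEL INT_MIN

typedef enum { COUNTS_OK, COUNTS_HAS_NA, COUNTS_HAS_NEGATIVE } counts_status;

/* Validates counts[0..n-1] and stores their sum in *total on success.
 * *total is left untouched when an error status is returned. */
counts_status checked_total(const int *counts, size_t n, int64_t *total)
{
    int64_t sum = 0;  /* wider than int, so INT_MAX-sized counts can add up */
    for (size_t i = 0; i < n; i++) {
        /* Test the sentinel by equality first ... */
        if (counts[i] == NA_INT_SENTINEL)
            return COUNTS_HAS_NA;
        /* ... and only then test the sign of ordinary values. */
        if (counts[i] < 0)
            return COUNTS_HAS_NEGATIVE;
        sum += counts[i];
    }
    *total = sum;
    return COUNTS_OK;
}
```

Separating the two error cases also lets the caller report a missing value differently from a negative count.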
[Rd] faster base::sequence
Hello,

Based on yesterday's R-help thread (help: program efficiency), and following Bill's suggestions, it appeared that sequence:

    > sequence
    function (nvec)
    unlist(lapply(nvec, seq_len))

could benefit from being written in C to avoid unnecessary memory allocations.

I made this version using inline:

    require( inline )
    sequence_c <- local( {
        fx <- cfunction( signature( x = "integer"), '
            int n = length(x) ;
            int* px = INTEGER(x) ;
            int x_i, s = 0 ;
            /* error checking */
            for( int i=0; i
    [...]

    > x <- 1:1
    > system.time( a <- sequence(x ) )
       user  system elapsed
      0.191   0.108   0.298
    > system.time( b <- sequence_c(x ) )
       user  system elapsed
      0.060   0.063   0.122
    > identical( a, b )
    [1] TRUE

    > system.time( for( i in 1:1) sequence(1:10) )
       user  system elapsed
      0.119   0.000   0.119
    >
    > system.time( for( i in 1:1) sequence_c(1:10) )
       user  system elapsed
      0.019   0.000   0.019

I would write a proper patch if someone from R-core is willing to push it.

Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube
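A minimal sketch of the idea described in this message: compute the total length in a first pass, allocate the result once, then fill it in a second pass, instead of building one small vector per element and concatenating them. The code from the post is truncated in this archive, so this is not Romain's function. It is plain C without the R API, and `expand_sequence` and its parameters are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Expands counts[0..n-1] into 1..counts[0], 1..counts[1], ... using a single
 * allocation. Returns NULL on a negative count or allocation failure;
 * on success *out_len receives the total length and the caller frees
 * the returned buffer. */
int *expand_sequence(const int *counts, size_t n, size_t *out_len)
{
    size_t total = 0;

    /* Pass 1: error checking and total length. */
    for (size_t i = 0; i < n; i++) {
        if (counts[i] < 0)
            return NULL;
        total += (size_t)counts[i];
    }

    /* One allocation for the whole result. Request at least one element so
     * that an empty result is distinguishable from a failed malloc. */
    int *out = malloc((total ? total : 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    /* Pass 2: fill each run 1..counts[i] in place. */
    size_t k = 0;
    for (size_t i = 0; i < n; i++)
        for (int j = 1; j <= counts[i]; j++)
            out[k++] = j;

    *out_len = total;
    return out;
}
```

The sketch assumes a platform where size_t is wide enough to hold the sum of all counts; an R version would return an integer vector rather than a malloc'd buffer.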