Hi Dale, On 11 April 2013 at 21:12, Dale Smith wrote: | I had a problem this afternoon with an arma::mat constructor and the | mat::insert_col method.
That is not a method I've used, I think, so I will defer comments. Whenever I did what you do below (ie expand a N*k matrix with a standard 'iota' column for a constant in the regression) then I did so _before calling the C++ code_ as can be seen in all versions of fastLm we shipped. | Since I got the example from Dirk, I thought I would | point out the problem I had and a solution. See I am not sure I call that a "problem and a solution". It is worth a discussion on that page, possibly. It is a little like the clone() example I give in the Rcpp classes -- sometimes the side benefit of a copy is noticeable and even welcome -- as eg here where you want to modify the data structure. I don't do that each time as fastLm is for _pure speed_. | http://gallery.rcpp.org/articles/fast-linear-model-with-armadillo/ | | for background. Before I get to the example to reproduce the problem and the | fix, let me say I'm posting from home as I can't post code from my work email; | otherwise I'm fired. | | I do believe Rcpp::as should be used to marshall from NumericMatrix to That is the default. | arma::mat, not the arma::mat constructor that accesses raw memory, as Dirk | uses. a) By choice, and b) for faster performance speed, and c) after consulting on this with Conrad "way back when". Nobody got bitten yet. And we unit test this each and every time RcppArmadillo is built and tested. And note that that is _not_ the default constructor. In fact, I had contemplated replacing the default constructor but what we have now is fine. | I read the Armadillo documentation this afternoon and do believe the | problem with .insert_col is due to the handling of memory in the constructor | | mat(aux_mem*, n_rows, n_cols, copy_aux_mem = true, strict = true) | | see | | http://arma.sourceforge.net/docs.html So just create a copy. | Note that I've run similar examples on my Windows 7 box at work and on my OS X | laptop (Mountain Lion) at home to confirm this is not os-dependent. I think this just a subtle misunderstanding. _If_ one wants to muck with the data _then_ the "promise" made to R in the constructor arma::mat X(Xr.begin(), n, k, false); is not a good one. But for illustration of _why_ we do what we do, consider this simple example. We sum up a matrix -- a cheap operation -- and can compare the cost of instantiating the matrix: ------------------------------------------------------------------------------ #include <RcppArmadillo.h> // [[Rcpp::depends(RcppArmadillo)]] // [[Rcpp::export]] double sum1(Rcpp::NumericMatrix M) { arma::mat A(M.begin(), M.rows(), M.cols(), false); return sum(sum(A)); } // [[Rcpp::export]] double sum2(arma::mat M) { return sum(sum(M)); } /*** R set.seed(42) M <- matrix(rnorm(100*100),100,100) library(microbenchmark) print(sum1(M)) print(sum2(M)) microbenchmark(sum1(M), sum2(M), times=100) */ ------------------------------------------------------------------------------ which when running yields ------------------------------------------------------------------------------ R> sourceCpp("/tmp/dale.cpp") R> set.seed(42) R> M <- matrix(rnorm(100*100),100,100) R> library(microbenchmark) R> print(sum1(M)) [1] -113.094 R> print(sum2(M)) [1] -113.094 R> microbenchmark(sum1(M), sum2(M), times=100) Unit: microseconds expr min lq median uq max neval sum1(M) 8.256 8.4805 9.5750 9.9160 47.225 100 sum2(M) 21.852 22.2845 22.8535 23.5475 60.001 100 R> ------------------------------------------------------------------------------ There is nice pickup in speed when we forego the extra copy. You created yourself a need for the copy, and you're free to make one. We should possibly amend the Rcpp Gallery page to note this. Could you send a suggested patch or added paragraph? Thanks, Dirk | | Thanks, | Dale Smith | dtsm...@mindspring.com | | mat <- cbind(rnorm(50), rnorm(50), rnorm(50)) | y <- rnorm(50) | | cppFunction(' | List fastLm(NumericVector yr, NumericMatrix Xr) { | int n = Xr.nrow(), k = Xr.ncol(); | | arma::mat X(Xr.begin(), n, k, false); | arma::colvec y(yr.begin(), yr.size(), false); | | arma::colvec coef = arma::solve(X, y); | arma::colvec resid = y - X*coef; | | arma::colvec ones = arma::ones<arma::colvec>(X.n_rows); | X.insert_cols(1, ones); | | double sig2 = arma::as_scalar(arma::trans(resid)*resid/(n-k)); | arma::colvec stderrest = arma::sqrt( | sig2 * arma::diagvec( arma::inv(arma::trans(X)*X)) ); | | return List::create(Named("coefficients") = coef, | Named("stderr") = stderrest); | }', depends = c("RcppArmadillo")) | | | cppFunction(' | List fastLmMod(NumericVector yr, NumericMatrix Xr) { | int n = Xr.nrow(), k = Xr.ncol(); | | arma::mat X = Rcpp::as<arma::mat>(Xr); | arma::colvec y(yr.begin(), yr.size(), false); | | arma::colvec ones = arma::ones<arma::colvec>(X.n_rows); | X.insert_cols(0, ones); | | arma::colvec coef = arma::solve(X, y); | arma::colvec resid = y - X*coef; | | double sig2 = arma::as_scalar(arma::trans(resid)*resid/(n-k)); | arma::colvec stderrest = arma::sqrt( | sig2 * arma::diagvec( arma::inv(arma::trans(X)*X)) ); | | return List::create(Named("coefficients") = coef, | Named("stderr") = stderrest); | }', depends = c("RcppArmadillo")) | | fastLm(y, mat) # error is "error: Mat::init(): mismatch between size of | auxiliary memory and requested size" | fastLmMod(y, mat) # coefficients are -0.15785722, -0.05668155, 0.15409366, | -0.10079147 | lm(y ~ mat) # confirm fastLmMod via lm function in R | | | ---------------------------------------------------------------------- | _______________________________________________ | Rcpp-devel mailing list | Rcpp-devel@lists.r-forge.r-project.org | https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com _______________________________________________ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel