As Dirk suggested below, I'm working on a paragraph to point out the constructor used implies no modification of the arma::mat memory.
Dale On Apr 11, 2013, at 9:52 PM, Dirk Eddelbuettel <e...@debian.org> wrote: > > Hi Dale, > > On 11 April 2013 at 21:12, Dale Smith wrote: > | I had a problem this afternoon with an arma::mat constructor and the > | mat::insert_col method. > > That is not a method I've used, I think, so I will defer comments. > > Whenever I did what you do below (ie expand a N*k matrix with a standard > 'iota' column for a constant in the regression) then I did so _before calling > the C++ code_ as can be seen in all versions of fastLm we shipped. > > | Since I got the example from Dirk, I thought I would > | point out the problem I had and a solution. See > > I am not sure I call that a "problem and a solution". It is worth a > discussion on that page, possibly. > > It is a little like the clone() example I give in the Rcpp classes -- > sometimes the side benefit of a copy is noticeable and even welcome -- as eg > here where you want to modify the data structure. > > I don't do that each time as fastLm is for _pure speed_. > > | http://gallery.rcpp.org/articles/fast-linear-model-with-armadillo/ > | > | for background. Before I get to the example to reproduce the problem and the > | fix, let me say I'm posting from home as I can't post code from my work > email; > | otherwise I'm fired. > | > | I do believe Rcpp::as should be used to marshall from NumericMatrix to > > That is the default. > > | arma::mat, not the arma::mat constructor that accesses raw memory, as Dirk > | uses. > > a) By choice, and b) for faster performance speed, and c) after consulting on > this with Conrad "way back when". Nobody got bitten yet. And we unit test > this each and every time RcppArmadillo is built and tested. > > And note that that is _not_ the default constructor. In fact, I had > contemplated replacing the default constructor but what we have now is fine. > > | I read the Armadillo documentation this afternoon and do believe the > | problem with .insert_col is due to the handling of memory in the constructor > | > | mat(aux_mem*, n_rows, n_cols, copy_aux_mem = true, strict = true) > | > | see > | > | http://arma.sourceforge.net/docs.html > > So just create a copy. > > | Note that I've run similar examples on my Windows 7 box at work and on my > OS X > | laptop (Mountain Lion) at home to confirm this is not os-dependent. > > I think this just a subtle misunderstanding. _If_ one wants to muck with the > data _then_ the "promise" made to R in the constructor > > arma::mat X(Xr.begin(), n, k, false); > > is not a good one. > > But for illustration of _why_ we do what we do, consider this simple > example. We sum up a matrix -- a cheap operation -- and can compare the cost > of instantiating the matrix: > > ------------------------------------------------------------------------------ > > #include <RcppArmadillo.h> > > // [[Rcpp::depends(RcppArmadillo)]] > > // [[Rcpp::export]] > double sum1(Rcpp::NumericMatrix M) { > arma::mat A(M.begin(), M.rows(), M.cols(), false); > return sum(sum(A)); > } > > // [[Rcpp::export]] > double sum2(arma::mat M) { > return sum(sum(M)); > } > > > /*** R > set.seed(42) > M <- matrix(rnorm(100*100),100,100) > library(microbenchmark) > print(sum1(M)) > print(sum2(M)) > microbenchmark(sum1(M), sum2(M), times=100) > */ > > ------------------------------------------------------------------------------ > > which when running yields > > ------------------------------------------------------------------------------ > R> sourceCpp("/tmp/dale.cpp") > > R> set.seed(42) > > R> M <- matrix(rnorm(100*100),100,100) > > R> library(microbenchmark) > > R> print(sum1(M)) > [1] -113.094 > > R> print(sum2(M)) > [1] -113.094 > > R> microbenchmark(sum1(M), sum2(M), times=100) > Unit: microseconds > expr min lq median uq max neval > sum1(M) 8.256 8.4805 9.5750 9.9160 47.225 100 > sum2(M) 21.852 22.2845 22.8535 23.5475 60.001 100 > R> > ------------------------------------------------------------------------------ > > There is nice pickup in speed when we forego the extra copy. You created > yourself a need for the copy, and you're free to make one. > > We should possibly amend the Rcpp Gallery page to note this. Could you send > a suggested patch or added paragraph? > > Thanks, Dirk > > > | > | Thanks, > | Dale Smith > | dtsm...@mindspring.com > | > | mat <- cbind(rnorm(50), rnorm(50), rnorm(50)) > | y <- rnorm(50) > | > | cppFunction(' > | List fastLm(NumericVector yr, NumericMatrix Xr) { > | int n = Xr.nrow(), k = Xr.ncol(); > | > | arma::mat X(Xr.begin(), n, k, false); > | arma::colvec y(yr.begin(), yr.size(), false); > | > | arma::colvec coef = arma::solve(X, y); > | arma::colvec resid = y - X*coef; > | > | arma::colvec ones = arma::ones<arma::colvec>(X.n_rows); > | X.insert_cols(1, ones); > | > | double sig2 = arma::as_scalar(arma::trans(resid)*resid/(n-k)); > | arma::colvec stderrest = arma::sqrt( > | sig2 * arma::diagvec( arma::inv(arma::trans(X)*X)) ); > | > | return List::create(Named("coefficients") = coef, > | Named("stderr") = stderrest); > | }', depends = c("RcppArmadillo")) > | > | > | cppFunction(' > | List fastLmMod(NumericVector yr, NumericMatrix Xr) { > | int n = Xr.nrow(), k = Xr.ncol(); > | > | arma::mat X = Rcpp::as<arma::mat>(Xr); > | arma::colvec y(yr.begin(), yr.size(), false); > | > | arma::colvec ones = arma::ones<arma::colvec>(X.n_rows); > | X.insert_cols(0, ones); > | > | arma::colvec coef = arma::solve(X, y); > | arma::colvec resid = y - X*coef; > | > | double sig2 = arma::as_scalar(arma::trans(resid)*resid/(n-k)); > | arma::colvec stderrest = arma::sqrt( > | sig2 * arma::diagvec( arma::inv(arma::trans(X)*X)) ); > | > | return List::create(Named("coefficients") = coef, > | Named("stderr") = stderrest); > | }', depends = c("RcppArmadillo")) > | > | fastLm(y, mat) # error is "error: Mat::init(): mismatch between size of > | auxiliary memory and requested size" > | fastLmMod(y, mat) # coefficients are -0.15785722, -0.05668155, 0.15409366, > | -0.10079147 > | lm(y ~ mat) # confirm fastLmMod via lm function in R > | > | > | ---------------------------------------------------------------------- > | _______________________________________________ > | Rcpp-devel mailing list > | Rcpp-devel@lists.r-forge.r-project.org > | https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel > -- > Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com _______________________________________________ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel