I figured that getting the ncol(DF) information would be simpler than what I resorted to.
As it turned out, my impression of how long it took to convert the dataframe to a matrix came from running it through Excel, using RExcel. In the R console the conversion was momentary, even for the 43,000 line dataframe. No matter how I tried to approach this, the conversion was necessary and R could not be beaten. Then, with access to the matrix math functions in R and a few sapply calls, everything I wanted to do ended up best done in R.

Just goes to show: if you can avoid an explicit loop in R, there is a chance that even the interpreted language can do you many favors.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Douglas Bates
Sent: Monday, June 27, 2011 1:47 PM
To: Silkworth,David J.
Cc: [email protected]
Subject: Re: [Rcpp-devel] getting ncol(DF) in Rcpp

A data.frame in R is a curious object that is really a list of the columns. So myDF.size() returns the number of columns. Try the enclosed R source file.

On Mon, Jun 27, 2011 at 12:30 PM, Silkworth,David J. <[email protected]> wrote:
> You guys know I am here just to give you a chuckle.
>
> I wanted to build a function passing just a dataframe to Rcpp. In
> order to use this dataframe, I need to know how many columns it has at
> runtime. My attempts at getting this ncol information were thwarted
> on several counts. The Dimension class appears to only work on STL
> containers, which Rcpp::DataFrame is not. I resorted to the
> Environment facility to attempt a feeble-minded RInside, (since I
> can't understand RInside anyway).
>
> Environment base("package:base");
> Function ncol = base["ncol"];
> Rcpp::NumericVector test(1);
> test[0]=ncol(myDF);
>
> This fails to compile with the following error:
> error: cannot convert 'SEXPREC*' to
> 'Rcpp::traits::storage_type<14>::type'
>
> However, just short of sending another single element vector with this
> information as an argument to Rcpp I tried the following, AND IT WORKED!
>
> (My debug technique is to send items back to R for inspection. This
> is just some test code to show that an integer value of myNames.size()
> will be useful as a proxy for ncol(DF) in further code development.)
>
> src <- '
> Rcpp::DataFrame myDF=(arg1);
> Environment base("package:base");
> Function names = base["names"];
> Rcpp::CharacterVector myNames(names(myDF));
> Rcpp::NumericVector ncol(1);
> ncol[0]=myNames.size();
> return(ncol);
> '
>
> fun <- cxxfunction(signature(arg1 = "numeric"),
> src, plugin = "Rcpp")
>
> vec1<-rep(5,5)
> vec2<-c(1:5)
> DF<-data.frame(vec1,vec2)
> test<-fun(DF)
>
> Okay, how's that for a laugher.
>
> In my real case I am using the same dataframe that I needed to clean
> up in my 'redimension' chain. My solution there works quite fine.
> Now I have yet to decompose this dataframe back into vectors and a
> matrix to enable entries to be accessed in Rcpp. But at least I have
> the dimensions for the matrix now.
>
> It takes about 3 seconds for R to extract a matrix based on
> DF[,3:ncol(DF)] on a dataframe with 46,000 rows. I am counting on
> Rcpp code to execute this more efficiently. One could argue that I
> should never have left Rcpp in the first place. But that is another story.
>
> _______________________________________________
> Rcpp-devel mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
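
Douglas Bates's point in this thread, that a data.frame is really a list of its columns so the list length is the column count, can be sketched outside of R. What follows is a plain C++ model and not Rcpp code: Frame, Column, ncol, nrow and to_matrix are hypothetical names made up for this illustration, and the columns are assumed to be numeric and of equal length. It also mimics the DF[,3:ncol(DF)] extraction as a copy into a column-major buffer, which is the layout R uses for matrices.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a data frame: a list of equal-length numeric columns.
using Column = std::vector<double>;
using Frame  = std::vector<Column>;

// The list length is the column count, mirroring myDF.size() in the thread.
std::size_t ncol(const Frame& df) { return df.size(); }

// The row count is the length of any one column (0 for an empty frame).
std::size_t nrow(const Frame& df) { return df.empty() ? 0 : df.front().size(); }

// Copy columns [first, ncol) into a single column-major buffer.
// `first` is 0-based here, so R's DF[,3:ncol(DF)] corresponds to first = 2.
std::vector<double> to_matrix(const Frame& df, std::size_t first) {
    if (first > ncol(df)) throw std::out_of_range("first column is past the end");
    const std::size_t rows = nrow(df);
    std::vector<double> m;
    m.reserve(rows * (ncol(df) - first));
    for (std::size_t j = first; j < ncol(df); ++j) {
        if (df[j].size() != rows) throw std::invalid_argument("columns differ in length");
        m.insert(m.end(), df[j].begin(), df[j].end());  // append column j whole
    }
    return m;
}
```

In this model the two-column DF from the test code above would report an ncol of 2. Inside Rcpp the same role is played by myDF.size(), as the reply says, so the detour through names() is not needed.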
