> -----Original Message-----
> From: Dirk Eddelbuettel [mailto:e...@debian.org]
> Sent: Tuesday, February 19, 2013 5:02 PM
> To: Ken Williams
> Cc: rcpp-devel@lists.r-forge.r-project.org
> Subject: Re: [Rcpp-devel] Efficient DataFrame access by row & column
>
>
> Ken,
>
> On 19 February 2013 at 22:35, Ken Williams wrote:
> | I have a need to loop through all the entries of a DataFrame by row,
> | then column.  I know two different ways:
>
> There have been prior discussions of this topic, as well as example posts --
> even leading to a Rcpp Gallery article. Did you read any of these?  It wasn't
> clear from your post.

I looked, but I didn't find anything directly addressing it.  Most of what I 
found at http://search.gmane.org/?query=dataframe&group=gmane.comp.lang.r.rcpp 
seems to deal with creating DataFrame objects, not indexing into them.

In the Rcpp Gallery, I also see two articles on creating/modifying DataFrame 
objects, but nothing that demonstrates indexing differently from what I wrote.

The other place I looked was inst/unitTests/cpp/DataFrame.cpp in the repository.

If I missed something relevant, I'd be happy to be pointed to it.


>
> | I'm also curious why it's a syntax error in Case A to just write
> | `df[j][i]` or
>
> Eeeek.  I prefer the more C++-y way of writing df(j,i).

Attempting to do so, I get a compile-time error:

window.cpp:68:34: error: ambiguous overload for 'operator-' in 
'Rcpp::Vector<RTYPE>::operator()(const size_t&, const size_t&) [with int RTYPE 
= 19, Rcpp::Vector<RTYPE>::Proxy = Rcpp::internal::generic_proxy<19>, size_t = 
long long unsigned int]((* &((size_t)j)), (* &((size_t)i))) - 
Rcpp::Vector<RTYPE>::operator()(const size_t&, const size_t&) [with int RTYPE = 
19, Rcpp::Vector<RTYPE>::Proxy = Rcpp::internal::generic_proxy<19>, size_t = 
long long unsigned int]((* &((size_t)j)), (* &((size_t)last_i)))'
window.cpp:68:34: note: candidates are:
window.cpp:68:34: note: operator-(SEXP, SEXP) <built-in>
window.cpp:68:34: note: operator-(SEXP, long long int) <built-in>
window.cpp:68:34: note: operator-(int, int) <built-in>

For context, the line that's failing is:

     if(fabs(df(j,i)-df(j,last_i))>thresh) {
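
Reading the error, RTYPE = 19 is a generic list, so df(j,i) appears to yield a generic proxy rather than a double, and the compiler cannot pick one built-in operator- among the candidates it lists. Below is a minimal plain-C++ sketch of that kind of ambiguity. AnyProxy and exceeds are hypothetical names invented for illustration; this is a simplified model, not Rcpp's actual proxy class.

```cpp
#include <cmath>

// Hypothetical stand-in for a type-erased list element: it converts
// implicitly to more than one arithmetic type, much like a generic proxy.
struct AnyProxy {
    double value;
    operator int() const { return static_cast<int>(value); }
    operator double() const { return value; }
};

// Writing `a - b` directly would not compile: the built-in
// operator-(int, int) and operator-(double, double) are both viable
// through different conversion functions, so the call is ambiguous.
// Naming the target type explicitly removes the ambiguity.
inline bool exceeds(const AnyProxy& a, const AnyProxy& b, double thresh) {
    const double x = static_cast<double>(a);
    const double y = static_cast<double>(b);
    return std::fabs(x - y) > thresh;
}
```

The same idea should carry over to the failing line: give each operand a concrete numeric type before subtracting.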


> Overall, your premise may be wrong too.  "We all know" that a data.frame is
> not the fastest data structure in R, so by forcing ourselves to the same
> access, are we not handicapping ourselves?

I was operating under the premise that there "must be" a constant-time accessor 
for a List element (DataFrame column), and once I have that, a constant-time 
accessor for an element of that vector.  I know the latter is true, but is the 
former not true?  I assumed it was but that I just couldn't find it.
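
To make the two-step access concrete, here is a plain-C++ sketch of the pattern I have in mind: resolve each column to a typed view once, then loop by row and column with constant-time element access. Frame and count_jumps are hypothetical names, and std::vector stands in for the DataFrame so the example compiles without R. In Rcpp the analogous step would presumably be converting each column once with something like `NumericVector col = df[j];` before entering the row loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data frame: a list of equal-length
// numeric columns. Indexing the outer vector is constant time.
using Frame = std::vector<std::vector<double>>;

// Count the rows in which at least one column differs from the
// previous row by more than `thresh`.
std::size_t count_jumps(const Frame& df, double thresh) {
    const std::size_t ncol = df.size();
    if (ncol == 0) return 0;
    const std::size_t nrow = df[0].size();

    // Resolve each column to a typed view once, outside the row loop.
    std::vector<const double*> cols(ncol);
    for (std::size_t j = 0; j < ncol; ++j) {
        assert(df[j].size() == nrow);  // columns must have equal length
        cols[j] = df[j].data();
    }

    std::size_t jumps = 0;
    for (std::size_t i = 1; i < nrow; ++i) {        // by row...
        for (std::size_t j = 0; j < ncol; ++j) {    // ...then by column
            if (std::fabs(cols[j][i] - cols[j][i - 1]) > thresh) {
                ++jumps;
                break;  // count each row at most once
            }
        }
    }
    return jumps;
}
```

With the column lookup hoisted out of the inner loop, each cols[j][i] is a plain array read, which is the constant-time behaviour I was assuming.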

 -Ken
