On 12/12/2008 12:23 PM, hadley wickham wrote:
On Fri, Dec 12, 2008 at 11:18 AM, Duncan Murdoch <murd...@stats.uwo.ca> wrote:
On 12/12/2008 11:38 AM, hadley wickham wrote:

On Fri, Dec 12, 2008 at 8:41 AM, Duncan Murdoch <murd...@stats.uwo.ca>
wrote:

On 12/12/2008 8:25 AM, hadley wickham wrote:

From which you might conclude that I don't like the design of subset, and
you'd be right.  However, I don't think this is a counterexample to my
general rule.  In the subset function, the select argument is treated as an
unevaluated expression, and then there are rules about what to do with it.
(I.e. try to look up name `a` in the data frame, if that fails, ...)

For the requested behaviour to similarly fall within the general rule, we'd
have to treat all indices to all kinds of things (vectors, matrices,
dataframes, etc.) as unevaluated expressions, with special handling for the
particular symbol `end`.

Except you wouldn't have to necessarily change indexing - you could
change seq instead.  Then 5:end could produce some kind of special
data structure (maybe an iterator) that was recognised by the various
indexing functions.

Ummm, doesn't that require changes to *both* indexing and seq?

Oops, yes.  I meant it wouldn't require indexing to use unevaluated
expressions.

This would still be a lot of work for not a lot of payoff, but it would
be a logically consistent way of adding this behaviour to indexing, and
the basic work would make it possible to develop other sorts of indexing,
e.g. df[evens(), ] or df[last(5), last(3)].

I agree:  it would be a nice addition, but a fair bit of work.  I think it
would be quite doable for the indexable things in the base packages, but
there are a lot of contributed packages that define [ methods, and those
methods would all need to be modified too.

That's true, although I suspect many contributed [ methods eventually
delegate to base methods and might work without further modification.

(Just to be clear, when I say doable, I'm thinking that your iterators
return functions that compute subsets of index ranges.  For example,
evens() might be implemented as

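evens <- function() {
  # Return a function that keeps only the even entries of `indices`
  result <- function(indices) {
    indices[indices %% 2 == 0]
  }
  # Tag it so that `[` could recognize it as an iterator
  class(result) <- "iterator"
  return(result)
}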

and then `[` in v[evens()] would recognize that it had been passed an
iterator, and would pass 1:length(v) to the iterator to get the subset of
even indices.  Is that what you had in mind?)
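
As a rough sketch of the dispatch described above: base `[` knows nothing about iterators, so the example below uses a stand-in helper, pick(), to play the role of a modified `[`. The name pick() is an assumption made for illustration only and is not part of R.

```r
# evens() as described in the thread: a classed function over index vectors
evens <- function() {
  structure(function(indices) indices[indices %% 2 == 0],
            class = "iterator")
}

# pick() is a hypothetical stand-in for a `[` that understands iterators
pick <- function(v, i) {
  if (inherits(i, "iterator")) {
    # Hand the full index range to the iterator and let it choose
    i <- i(seq_along(v))
  }
  v[i]
}

v <- letters[1:10]
print(pick(v, evens()))  # "b" "d" "f" "h" "j"
print(pick(v, 2:3))      # ordinary indices still work: "b" "c"
```

seq_along(v) is used here rather than 1:length(v) so that a zero-length vector yields an empty index range.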

Yes, that's exactly what I was thinking, although you'd have to put
some thought into the conventions - would it be better to pass in the
length of the vector instead of a vector of indices?  Should all
iterators return logical vectors?  That way you could do x[evens() &
last(5)] to get the even indices out of the last 5, as opposed to
x[evens()][last(5)] which would return the last 5 even indices.
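
To make the difference between the two forms concrete, here is a small sketch using plain logical masks computed from the vector length. The helpers evens_mask() and last_mask() are hypothetical names chosen for this example; they only illustrate the "pass in the length, return a logical vector" convention.

```r
# Hypothetical mask helpers: each takes the length n of the vector
evens_mask <- function(n) seq_len(n) %% 2 == 0
last_mask  <- function(k, n) seq_len(n) > n - k

x <- 1:20

# Even indices among the last 5 positions
both <- x[evens_mask(length(x)) & last_mask(5, length(x))]

# Last 5 of the even-indexed elements
ev      <- x[evens_mask(length(x))]
chained <- ev[last_mask(5, length(ev))]

print(both)     # 16 18 20
print(chained)  # 12 14 16 18 20
```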

Actually, I don't think so.  "evens() & last(5)" would fail to evaluate,
because you're trying to do a logical combination of two functions, not of
two logical vectors.  Or are we going to extend the logical operators to
work on iterators/selectors too?

Oh yes, that's a good point.  But wouldn't the following do the job?

"&.selector" <- function(a, b) {
  function(n) a(n) & b(n)
}

or

"&.selector" <- function(a, b) {
  function(n) intersect(a(n), b(n))
}

depending on whether selectors return logical or numeric vectors.
Writing functions for | and ! would be similarly easy.  Or am I
missing something?
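
A slightly fuller sketch of the logical-vector variant, under a few assumptions: selectors are functions of the length n carrying class "selector", the combined result is itself given that class so that it can be combined again, and a hypothetical helper pick() stands in for a `[` that understands selectors. None of these names exist in base R.

```r
# Wrap a function of n so that S3 operator dispatch can see it
selector <- function(f) structure(f, class = "selector")

evens <- function() selector(function(n) seq_len(n) %% 2 == 0)
last  <- function(k) selector(function(n) seq_len(n) > n - k)

# Logical operators on selectors return new selectors
"&.selector" <- function(a, b) selector(function(n) a(n) & b(n))
"|.selector" <- function(a, b) selector(function(n) a(n) | b(n))
"!.selector" <- function(a) selector(function(n) !a(n))

# pick() is a hypothetical stand-in for a selector-aware `[`
pick <- function(x, s) x[s(length(x))]

x <- 1:20
print(pick(x, evens() & last(5)))     # 16 18 20
print(pick(x, (!evens()) & last(3)))  # 19
print(pick(x, evens() | last(3)))
```

Re-wrapping the result with selector() is the one change from the definitions above; without it, an expression such as evens() & last(5) & last(3) would have a plain function on the left of the second `&`.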

No, I think those definitions would be fine, but I'd be concerned about speed issues if we start messing with primitives.

While we're at it, we might as well do the same sort of thing for :, and define a selector named end, and then 3:end would give a selector from 3 to the end, which brings us back to the original question. So it's not nearly as intrusive as I thought it would be.
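
One way to experiment with the 3:end idea without touching the primitive `:` is a user-defined infix operator. The sketch below assumes a hypothetical operator `%:%`, a marker object named end, and a helper pick() standing in for a selector-aware `[`; it is an illustration only, not a proposal for how base R would do it.

```r
# Marker object; note that this name masks stats::end as a variable
end <- structure(list(), class = "end_marker")

# Hypothetical range operator: returns a selector when the upper
# bound is the end marker, and an ordinary sequence otherwise
`%:%` <- function(from, to) {
  force(from)
  if (inherits(to, "end_marker")) {
    structure(function(n) seq_len(n) >= from, class = "selector")
  } else {
    from:to
  }
}

# pick() is a hypothetical stand-in for a selector-aware `[`
pick <- function(x, i) {
  if (inherits(i, "selector")) i <- i(length(x))
  x[i]
}

x <- letters[1:6]
print(pick(x, 3 %:% end))  # "c" "d" "e" "f"
print(pick(x, 2 %:% 4))    # "b" "c" "d"
```

Because the selector only sees the length when it is applied, 3 %:% end on a vector shorter than 3 selects nothing.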

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
