On Fri, Dec 12, 2008 at 11:18 AM, Duncan Murdoch <murd...@stats.uwo.ca> wrote:
> On 12/12/2008 11:38 AM, hadley wickham wrote:
>>
>> On Fri, Dec 12, 2008 at 8:41 AM, Duncan Murdoch <murd...@stats.uwo.ca>
>> wrote:
>>>
>>> On 12/12/2008 8:25 AM, hadley wickham wrote:
>>>>>
>>>>> From which you might conclude that I don't like the design of subset,
>>>>> and you'd be right. However, I don't think this is a counterexample to
>>>>> my general rule. In the subset function, the select argument is treated
>>>>> as an unevaluated expression, and then there are rules about what to do
>>>>> with it. (I.e. try to look up name `a` in the data frame, if that
>>>>> fails, ...)
>>>>>
>>>>> For the requested behaviour to similarly fall within the general rule,
>>>>> we'd have to treat all indices to all kinds of things (vectors,
>>>>> matrices, dataframes, etc.) as unevaluated expressions, with special
>>>>> handling for the particular symbol `end`.
>>>>
>>>> Except you wouldn't have to necessarily change indexing - you could
>>>> change seq instead. Then 5:end could produce some kind of special
>>>> data structure (maybe an iterator) that was recognised by the various
>>>> indexing functions.
>>>
>>> Ummm, doesn't that require changes to *both* indexing and seq?
>>
>> Ooops, yes. I meant it wouldn't require indexing to use unevaluated
>> expression.
>>
>>>> This would still be a lot of work for not a lot
>>>> of payoff, but it would be a logically consistent way of adding this
>>>> behaviour to indexing, and the basic work would make it possible to
>>>> develop other sorts of indexing, eg df[evens(), ], or df[last(5),
>>>> last(3)].
>>>
>>> I agree: it would be a nice addition, but a fair bit of work. I think
>>> it would be quite doable for the indexable things in the base packages,
>>> but there are a lot of contributed packages that define [ methods, and
>>> those methods would all need to be modified too.
>>
>> That's true, although I suspect many contributed [.methods eventually
>> delegate to base methods and might work without further modification.
>>
>>> (Just to be clear, when I say doable, I'm thinking that your iterators
>>> return functions that compute subsets of index ranges. For example,
>>> evens() might be implemented as
>>>
>>>   evens <- function() {
>>>     result <- function(indices) {
>>>       indices[indices %% 2 == 0]
>>>     }
>>>     class(result) <- "iterator"
>>>     return(result)
>>>   }
>>>
>>> and then `[` in v[evens()] would recognize that it had been passed an
>>> iterator, and would pass 1:length(v) to the iterator to get the subset of
>>> even indices. Is that what you had in mind?)
>>
>> Yes, that's exactly what I was thinking, although you'd have to put
>> some thought into the conventions - would it be better to pass in the
>> length of the vector instead of a vector of indices? Should all
>> iterators return logical vectors? That way you could do x[evens() &
>> last(5)] to get the even indices out of the last 5, as opposed to
>> x[evens()][last(5)] which would return the last 5 even indices.
>
> Actually, I don't think so. "evens() & last(5)" would fail to evaluate,
> because you're trying to do a logical combination of two functions, not of
> two logical vectors. Or are we going to extend the logical operators to
> work on iterators/selectors too?

Oh yes, that's a good point. But wouldn't the following do the job?

  "&.selector" <- function(a, b) {
    function(n) a(n) & b(n)
  }

or

  "&.selector" <- function(a, b) {
    function(n) intersect(a(n), b(n))
  }

depending on whether selectors return logical or numeric vectors.
Writing functions for | and ! would be similarly easy. Or am I missing
something?

Hadley

--
http://had.co.nz/
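
To make the idea in this thread concrete, here is a minimal sketch in R. It assumes selectors take the vector length n and return a logical vector of length n. The helper names selector() and pick() are hypothetical: pick() only stands in for a selector-aware `[`, which base R does not provide, and the class name "selector" is used throughout rather than the "iterator" class from the quoted message. One detail the one-line definitions leave open is that the combined function has to carry the class itself, otherwise a second & in a chain would not dispatch.

```r
# Hypothetical helper: tag a function of n as a selector so that S3
# dispatch on &, | and ! can find the methods defined below.
selector <- function(f) structure(f, class = "selector")

# Assumed convention: a selector maps a length n to a logical vector of length n.
evens <- function() selector(function(n) seq_len(n) %% 2 == 0)
last  <- function(k) selector(function(n) seq_len(n) > n - k)

# Each operator returns a new selector, so combinations can be chained.
"&.selector" <- function(e1, e2) selector(function(n) e1(n) & e2(n))
"|.selector" <- function(e1, e2) selector(function(n) e1(n) | e2(n))
"!.selector" <- function(e1) selector(function(n) !e1(n))

# Stand-in for a selector-aware `[`: resolve the selector against length(x),
# then fall through to ordinary indexing.
pick <- function(x, i) {
  if (inherits(i, "selector")) i <- i(length(x))
  x[i]
}

x <- 1:20
pick(x, evens() & last(5))       # even positions among the last five
pick(pick(x, evens()), last(5))  # last five of the even positions
```

With numeric-index selectors instead, the same structure should work with intersect(), union() and setdiff() in place of &, | and !, at the cost of selectors no longer combining through plain logical operations.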