Re: [R] sweep() and recycling

2005-06-22 Thread Prof Brian Ripley
I think that as the proponents do not agree, we need to leave this as is.

BTW, R-devel is the place to discuss patched to R, rather than R-help.

On Tue, 21 Jun 2005, Heather Turner wrote:

> Agreed my examples may be trivial and I'm sure there are more efficient ways 
> to do the same thing, but I disagree that my examples are against the spirit 
> of sweep (). In the first example a vector is swept out from the rows, with 
> one value for odd rows and one value for even rows. In the second example an 
> array of values is swept out across the third dimension. In the third example 
> an array of values is swept out from the full array.
>
> The first example is a natural use of recycling. E.g.
>
> sweep(matrix(1:100, 50, 2), 1, c(1, 1000), "+", give.warning = TRUE)
>
> is a quicker way of writing
>
> sweep(matrix(1:100, 50, 2), 1, rep(c(1, 1000), 25), "+", give.warning = TRUE)
>
> but your code would give a warning in the first case, even though the intent 
> and the result are exactly the same as in the second case.
>
> As you say, it is only a warning, that can be ignored. However the warning 
> should at least reflect the warning condition used, i.e. warn that the length 
> of STATS does not equal the extent of MARGIN, rather warning that STATS does 
> not recycle exactly.
>
> Heather
>
 Robin Hankin <[EMAIL PROTECTED]> 06/21/05 02:47pm >>>
> Hi
>
> On Jun 21, 2005, at 02:33 pm, Heather Turner wrote:
>
>> I think the warning condition in Robin's patch is too harsh - the
>> following examples seem reasonable to me, but all produce warnings
>>
>> sweep(array(1:24, dim = c(4,3,2)), 1, 1:2, give.warning = TRUE)
>> sweep(array(1:24, dim = c(4,3,2)), 1, 1:12, give.warning = TRUE)
>> sweep(array(1:24, dim = c(4,3,2)), 1, 1:24, give.warning = TRUE)
>>
>
>
> The examples above do give warnings (as intended) but I think all three
> cases above
> are inimical to the spirit of sweep(): nothing is being "swept" out.
>
> So a warning is appropriate, IMO.
>
> In any case, one can always suppress (or ignore!) a warning if one knows
> what one is doing.  YMMV, but if I wanted to do the above operations I
> would
> replace
>
>
> sweep(array(0, dim = c(4,3,2)), c(1,3), 1:12, "+" , give.warning =
> FALSE)
>
> with
>
>  aperm(array(1:12,c(4,2,3)),c(1,3,2))
>
>
> best wishes
>
> rksh
>
>
>
>
>
>
>
>> I have written an alternative (given below) which does not give
>> warnings in the above cases, but does warn in the following case
>>
>>> sweep(array(1:24, dim = c(4,3,2)), 1:2, 1:3)
>> , , 1
>>
>>  [,1] [,2] [,3]
>> [1,]036
>> [2,]039
>> [3,]069
>> [4,]369
>>
>> , , 2
>>
>>  [,1] [,2] [,3]
>> [1,]   12   15   18
>> [2,]   12   15   21
>> [3,]   12   18   21
>> [4,]   15   18   21
>>
>> Warning message:
>> STATS does not recycle exactly across MARGIN
>>
>> The code could be easily modified to warn in other cases, e.g. when
>> length of STATS is a divisor of the corresponding array extent (as in
>> the first example above, with length(STATS) = 2).
>>
>> The code also includes Gabor's suggestion.
>>
>> Heather
>>
>> sweep <- function (x, MARGIN, STATS, FUN = "-", warn =
>> getOption("warn"), ...)
>> {
>> FUN <- match.fun(FUN)
>> dims <- dim(x)
>> perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>> if (warn >= 0) {
>> s <- length(STATS)
>> cumDim <- c(1, cumprod(dims[perm]))
>> if (s > max(cumDim))
>> warning("length of STATS greater than length of array",
>> call. = FALSE)
>> else {
>> upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
>> lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
>> if (any(upper %% s != 0, s %% lower != 0))
>> warning("STATS does not recycle exactly across MARGIN",
>> call. = FALSE)
>> }
>> }
>> FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
>> }
>>
> Gabor Grothendieck <[EMAIL PROTECTED]> 06/21/05 01:25pm >>>
>> \
>> Perhaps the signature should be:
>>
>>sweep(...other args go here..., warn=getOption("warn"))
>>
>> so that the name and value of the argument are consistent with
>> the R warn option.
>>
>> On 6/21/05, Robin Hankin <[EMAIL PROTECTED]> wrote:
>>>
>>> On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:
>>>
 The issue here is that the equivalent command array(1:5, c(6,6)) (to
 matrix(1:5,6,6)) gives no warning, and sweep uses array().

 I am not sure either should: fractional recycling was normally
 allowed
 in S3 (S4 tightened up a bit).

 Perhaps someone who thinks sweep() should warn could contribute a
 tested patch?

>>>
>>>
>>> OK,  modified R code and Rd file below (is this the best way to do
>>> this?)
>>>
>>>
>>>
>>>
>>> "sweep" <-
>>>   function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
>>> {
>>>   FUN <- match.fun(FUN)
>>>   dims <- dim(x)
>>>   if(give.warning & length(STATS

Re: [R] sweep() and recycling

2005-06-21 Thread Heather Turner
Agreed my examples may be trivial and I'm sure there are more efficient ways to 
do the same thing, but I disagree that my examples are against the spirit of 
sweep (). In the first example a vector is swept out from the rows, with one 
value for odd rows and one value for even rows. In the second example an array 
of values is swept out across the third dimension. In the third example an 
array of values is swept out from the full array.

The first example is a natural use of recycling. E.g. 

sweep(matrix(1:100, 50, 2), 1, c(1, 1000), "+", give.warning = TRUE)

is a quicker way of writing

sweep(matrix(1:100, 50, 2), 1, rep(c(1, 1000), 25), "+", give.warning = TRUE)

but your code would give a warning in the first case, even though the intent 
and the result are exactly the same as in the second case.

As you say, it is only a warning, that can be ignored. However the warning 
should at least reflect the warning condition used, i.e. warn that the length 
of STATS does not equal the extent of MARGIN, rather warning that STATS does 
not recycle exactly.

Heather

>>> Robin Hankin <[EMAIL PROTECTED]> 06/21/05 02:47pm >>>
Hi

On Jun 21, 2005, at 02:33 pm, Heather Turner wrote:

> I think the warning condition in Robin's patch is too harsh - the 
> following examples seem reasonable to me, but all produce warnings
>
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:2, give.warning = TRUE)
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:12, give.warning = TRUE)
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:24, give.warning = TRUE)
>


The examples above do give warnings (as intended) but I think all three 
cases above
are inimical to the spirit of sweep(): nothing is being "swept" out.

So a warning is appropriate, IMO.

In any case, one can always suppress (or ignore!) a warning if one knows
what one is doing.  YMMV, but if I wanted to do the above operations I 
would
replace


sweep(array(0, dim = c(4,3,2)), c(1,3), 1:12, "+" , give.warning = 
FALSE)

with

  aperm(array(1:12,c(4,2,3)),c(1,3,2))


best wishes

rksh







> I have written an alternative (given below) which does not give 
> warnings in the above cases, but does warn in the following case
>
>> sweep(array(1:24, dim = c(4,3,2)), 1:2, 1:3)
> , , 1
>
>  [,1] [,2] [,3]
> [1,]036
> [2,]039
> [3,]069
> [4,]369
>
> , , 2
>
>  [,1] [,2] [,3]
> [1,]   12   15   18
> [2,]   12   15   21
> [3,]   12   18   21
> [4,]   15   18   21
>
> Warning message:
> STATS does not recycle exactly across MARGIN
>
> The code could be easily modified to warn in other cases, e.g. when 
> length of STATS is a divisor of the corresponding array extent (as in 
> the first example above, with length(STATS) = 2).
>
> The code also includes Gabor's suggestion.
>
> Heather
>
> sweep <- function (x, MARGIN, STATS, FUN = "-", warn = 
> getOption("warn"), ...)
> {
> FUN <- match.fun(FUN)
> dims <- dim(x)
> perm <- c(MARGIN, (1:length(dims))[-MARGIN])
> if (warn >= 0) {
> s <- length(STATS)
> cumDim <- c(1, cumprod(dims[perm]))
> if (s > max(cumDim))
> warning("length of STATS greater than length of array",
> call. = FALSE)
> else {
> upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
> lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
> if (any(upper %% s != 0, s %% lower != 0))
> warning("STATS does not recycle exactly across MARGIN",
> call. = FALSE)
> }
> }
> FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
> }
>
 Gabor Grothendieck <[EMAIL PROTECTED]> 06/21/05 01:25pm >>>
> \
> Perhaps the signature should be:
>
>sweep(...other args go here..., warn=getOption("warn"))
>
> so that the name and value of the argument are consistent with
> the R warn option.
>
> On 6/21/05, Robin Hankin <[EMAIL PROTECTED]> wrote:
>>
>> On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:
>>
>>> The issue here is that the equivalent command array(1:5, c(6,6)) (to
>>> matrix(1:5,6,6)) gives no warning, and sweep uses array().
>>>
>>> I am not sure either should: fractional recycling was normally 
>>> allowed
>>> in S3 (S4 tightened up a bit).
>>>
>>> Perhaps someone who thinks sweep() should warn could contribute a
>>> tested patch?
>>>
>>
>>
>> OK,  modified R code and Rd file below (is this the best way to do
>> this?)
>>
>>
>>
>>
>> "sweep" <-
>>   function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
>> {
>>   FUN <- match.fun(FUN)
>>   dims <- dim(x)
>>   if(give.warning & length(STATS)>1 & any(dims[MARGIN] !=
>> dim(as.array(STATS{
>> warning("array extents do not recycle exactly")
>>   }
>>   perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>>   FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
>> }
>>
>>
>>
>>
>>
>>
>>
>> \name{sweep}
>> \alias{sweep}
>> \title{Sweep out Array Summaries}
>> \description{
>>   Return an array obtained fr

Re: [R] sweep() and recycling

2005-06-21 Thread Robin Hankin
Hi

On Jun 21, 2005, at 02:33 pm, Heather Turner wrote:

> I think the warning condition in Robin's patch is too harsh - the 
> following examples seem reasonable to me, but all produce warnings
>
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:2, give.warning = TRUE)
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:12, give.warning = TRUE)
> sweep(array(1:24, dim = c(4,3,2)), 1, 1:24, give.warning = TRUE)
>


The examples above do give warnings (as intended) but I think all three 
cases above
are inimical to the spirit of sweep(): nothing is being "swept" out.

So a warning is appropriate, IMO.

In any case, one can always suppress (or ignore!) a warning if one knows
what one is doing.  YMMV, but if I wanted to do the above operations I 
would
replace


sweep(array(0, dim = c(4,3,2)), c(1,3), 1:12, "+" , give.warning = 
FALSE)

with

  aperm(array(1:12,c(4,2,3)),c(1,3,2))


best wishes

rksh







> I have written an alternative (given below) which does not give 
> warnings in the above cases, but does warn in the following case
>
>> sweep(array(1:24, dim = c(4,3,2)), 1:2, 1:3)
> , , 1
>
>  [,1] [,2] [,3]
> [1,]036
> [2,]039
> [3,]069
> [4,]369
>
> , , 2
>
>  [,1] [,2] [,3]
> [1,]   12   15   18
> [2,]   12   15   21
> [3,]   12   18   21
> [4,]   15   18   21
>
> Warning message:
> STATS does not recycle exactly across MARGIN
>
> The code could be easily modified to warn in other cases, e.g. when 
> length of STATS is a divisor of the corresponding array extent (as in 
> the first example above, with length(STATS) = 2).
>
> The code also includes Gabor's suggestion.
>
> Heather
>
> sweep <- function (x, MARGIN, STATS, FUN = "-", warn = 
> getOption("warn"), ...)
> {
> FUN <- match.fun(FUN)
> dims <- dim(x)
> perm <- c(MARGIN, (1:length(dims))[-MARGIN])
> if (warn >= 0) {
> s <- length(STATS)
> cumDim <- c(1, cumprod(dims[perm]))
> if (s > max(cumDim))
> warning("length of STATS greater than length of array",
> call. = FALSE)
> else {
> upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
> lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
> if (any(upper %% s != 0, s %% lower != 0))
> warning("STATS does not recycle exactly across MARGIN",
> call. = FALSE)
> }
> }
> FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
> }
>
 Gabor Grothendieck <[EMAIL PROTECTED]> 06/21/05 01:25pm >>>
> \
> Perhaps the signature should be:
>
>sweep(...other args go here..., warn=getOption("warn"))
>
> so that the name and value of the argument are consistent with
> the R warn option.
>
> On 6/21/05, Robin Hankin <[EMAIL PROTECTED]> wrote:
>>
>> On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:
>>
>>> The issue here is that the equivalent command array(1:5, c(6,6)) (to
>>> matrix(1:5,6,6)) gives no warning, and sweep uses array().
>>>
>>> I am not sure either should: fractional recycling was normally 
>>> allowed
>>> in S3 (S4 tightened up a bit).
>>>
>>> Perhaps someone who thinks sweep() should warn could contribute a
>>> tested patch?
>>>
>>
>>
>> OK,  modified R code and Rd file below (is this the best way to do
>> this?)
>>
>>
>>
>>
>> "sweep" <-
>>   function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
>> {
>>   FUN <- match.fun(FUN)
>>   dims <- dim(x)
>>   if(give.warning & length(STATS)>1 & any(dims[MARGIN] !=
>> dim(as.array(STATS{
>> warning("array extents do not recycle exactly")
>>   }
>>   perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>>   FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
>> }
>>
>>
>>
>>
>>
>>
>>
>> \name{sweep}
>> \alias{sweep}
>> \title{Sweep out Array Summaries}
>> \description{
>>   Return an array obtained from an input array by sweeping out a 
>> summary
>>   statistic.
>> }
>> \usage{
>> sweep(x, MARGIN, STATS, FUN="-", give.warning = FALSE, \dots)
>> }
>> \arguments{
>>   \item{x}{an array.}
>>   \item{MARGIN}{a vector of indices giving the extents of \code{x}
>> which correspond to \code{STATS}.}
>>   \item{STATS}{the summary statistic which is to be swept out.}
>>   \item{FUN}{the function to be used to carry out the sweep.  In the
>> case of binary operators such as \code{"/"} etc., the function 
>> name
>> must be quoted.}
>>   \item{give.warning}{Boolean, with default \code{FALSE} meaning to
>>   give no warning, even if array extents do not match.  If
>>   \code{TRUE}, check for the correct dimensions and if a
>>   mismatch is detected, give a suitable warning.}
>>   \item{\dots}{optional arguments to \code{FUN}.}
>> }
>> \value{
>>   An array with the same shape as \code{x}, but with the summary
>>   statistics swept out.
>> }
>> \note{
>>   If \code{STATS} is of length 1, recycling is carried out with no
>>   warning irrespective of the value of \code{give.warning}.
>> }
>>
>> \references{
>>   Becker,

Re: [R] sweep() and recycling

2005-06-21 Thread Heather Turner
I think the warning condition in Robin's patch is too harsh - the following 
examples seem reasonable to me, but all produce warnings

sweep(array(1:24, dim = c(4,3,2)), 1, 1:2, give.warning = TRUE)
sweep(array(1:24, dim = c(4,3,2)), 1, 1:12, give.warning = TRUE)
sweep(array(1:24, dim = c(4,3,2)), 1, 1:24, give.warning = TRUE)

I have written an alternative (given below) which does not give warnings in the 
above cases, but does warn in the following case

> sweep(array(1:24, dim = c(4,3,2)), 1:2, 1:3)
, , 1

 [,1] [,2] [,3]
[1,]036
[2,]039
[3,]069
[4,]369

, , 2

 [,1] [,2] [,3]
[1,]   12   15   18
[2,]   12   15   21
[3,]   12   18   21
[4,]   15   18   21

Warning message:
STATS does not recycle exactly across MARGIN

The code could be easily modified to warn in other cases, e.g. when length of 
STATS is a divisor of the corresponding array extent (as in the first example 
above, with length(STATS) = 2).

The code also includes Gabor's suggestion.

Heather

sweep <- function (x, MARGIN, STATS, FUN = "-", warn = getOption("warn"), ...) 
{
FUN <- match.fun(FUN)
dims <- dim(x)
perm <- c(MARGIN, (1:length(dims))[-MARGIN])
if (warn >= 0) {
s <- length(STATS)
cumDim <- c(1, cumprod(dims[perm]))
if (s > max(cumDim))
warning("length of STATS greater than length of array",
call. = FALSE)
else {
upper <- min(ifelse(cumDim > s, cumDim, max(cumDim)))
lower <- max(ifelse(cumDim < s, cumDim, min(cumDim)))
if (any(upper %% s != 0, s %% lower != 0)) 
warning("STATS does not recycle exactly across MARGIN",
call. = FALSE)
}
}
FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
}

>>> Gabor Grothendieck <[EMAIL PROTECTED]> 06/21/05 01:25pm >>>
\
Perhaps the signature should be:

   sweep(...other args go here..., warn=getOption("warn"))

so that the name and value of the argument are consistent with
the R warn option.

On 6/21/05, Robin Hankin <[EMAIL PROTECTED]> wrote:
> 
> On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:
> 
> > The issue here is that the equivalent command array(1:5, c(6,6)) (to
> > matrix(1:5,6,6)) gives no warning, and sweep uses array().
> >
> > I am not sure either should: fractional recycling was normally allowed
> > in S3 (S4 tightened up a bit).
> >
> > Perhaps someone who thinks sweep() should warn could contribute a
> > tested patch?
> >
> 
> 
> OK,  modified R code and Rd file below (is this the best way to do
> this?)
> 
> 
> 
> 
> "sweep" <-
>   function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
> {
>   FUN <- match.fun(FUN)
>   dims <- dim(x)
>   if(give.warning & length(STATS)>1 & any(dims[MARGIN] !=
> dim(as.array(STATS{
> warning("array extents do not recycle exactly")
>   }
>   perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>   FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
> }
> 
> 
> 
> 
> 
> 
> 
> \name{sweep}
> \alias{sweep}
> \title{Sweep out Array Summaries}
> \description{
>   Return an array obtained from an input array by sweeping out a summary
>   statistic.
> }
> \usage{
> sweep(x, MARGIN, STATS, FUN="-", give.warning = FALSE, \dots)
> }
> \arguments{
>   \item{x}{an array.}
>   \item{MARGIN}{a vector of indices giving the extents of \code{x}
> which correspond to \code{STATS}.}
>   \item{STATS}{the summary statistic which is to be swept out.}
>   \item{FUN}{the function to be used to carry out the sweep.  In the
> case of binary operators such as \code{"/"} etc., the function name
> must be quoted.}
>   \item{give.warning}{Boolean, with default \code{FALSE} meaning to
>   give no warning, even if array extents do not match.  If
>   \code{TRUE}, check for the correct dimensions and if a
>   mismatch is detected, give a suitable warning.}
>   \item{\dots}{optional arguments to \code{FUN}.}
> }
> \value{
>   An array with the same shape as \code{x}, but with the summary
>   statistics swept out.
> }
> \note{
>   If \code{STATS} is of length 1, recycling is carried out with no
>   warning irrespective of the value of \code{give.warning}.
> }
> 
> \references{
>   Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
>   \emph{The New S Language}.
>   Wadsworth \& Brooks/Cole.
> }
> \seealso{
>   \code{\link{apply}} on which \code{sweep} used to be based;
>   \code{\link{scale}} for centering and scaling.
> }
> \examples{
> require(stats) # for median
> med.att <- apply(attitude, 2, median)
> sweep(data.matrix(attitude), 2, med.att)# subtract the column medians
> 
> a <- array(0, c(2, 3, 4))
> b <- matrix(1:8, c(2, 4))
> sweep(a, c(1, 3), b, "+", give.warning = TRUE)  # no warning:
> all(dim(a)[c(1,3)] == dim(b))
> sweep(a, c(1, 2), b, "+", give.warning = TRUE)  # warning given
> 
> }
> \keyword{array}
> \keyword{iteration}
> 
> 
> 
> 
> --
> Robin Hankin
> Uncertainty Analyst
> National Oceanography

Re: [R] sweep() and recycling

2005-06-21 Thread Gabor Grothendieck
\
Perhaps the signature should be:

   sweep(...other args go here..., warn=getOption("warn"))

so that the name and value of the argument are consistent with
the R warn option.

On 6/21/05, Robin Hankin <[EMAIL PROTECTED]> wrote:
> 
> On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:
> 
> > The issue here is that the equivalent command array(1:5, c(6,6)) (to
> > matrix(1:5,6,6)) gives no warning, and sweep uses array().
> >
> > I am not sure either should: fractional recycling was normally allowed
> > in S3 (S4 tightened up a bit).
> >
> > Perhaps someone who thinks sweep() should warn could contribute a
> > tested patch?
> >
> 
> 
> OK,  modified R code and Rd file below (is this the best way to do
> this?)
> 
> 
> 
> 
> "sweep" <-
>   function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
> {
>   FUN <- match.fun(FUN)
>   dims <- dim(x)
>   if(give.warning & length(STATS)>1 & any(dims[MARGIN] !=
> dim(as.array(STATS{
> warning("array extents do not recycle exactly")
>   }
>   perm <- c(MARGIN, (1:length(dims))[-MARGIN])
>   FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
> }
> 
> 
> 
> 
> 
> 
> 
> \name{sweep}
> \alias{sweep}
> \title{Sweep out Array Summaries}
> \description{
>   Return an array obtained from an input array by sweeping out a summary
>   statistic.
> }
> \usage{
> sweep(x, MARGIN, STATS, FUN="-", give.warning = FALSE, \dots)
> }
> \arguments{
>   \item{x}{an array.}
>   \item{MARGIN}{a vector of indices giving the extents of \code{x}
> which correspond to \code{STATS}.}
>   \item{STATS}{the summary statistic which is to be swept out.}
>   \item{FUN}{the function to be used to carry out the sweep.  In the
> case of binary operators such as \code{"/"} etc., the function name
> must be quoted.}
>   \item{give.warning}{Boolean, with default \code{FALSE} meaning to
>   give no warning, even if array extents do not match.  If
>   \code{TRUE}, check for the correct dimensions and if a
>   mismatch is detected, give a suitable warning.}
>   \item{\dots}{optional arguments to \code{FUN}.}
> }
> \value{
>   An array with the same shape as \code{x}, but with the summary
>   statistics swept out.
> }
> \note{
>   If \code{STATS} is of length 1, recycling is carried out with no
>   warning irrespective of the value of \code{give.warning}.
> }
> 
> \references{
>   Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
>   \emph{The New S Language}.
>   Wadsworth \& Brooks/Cole.
> }
> \seealso{
>   \code{\link{apply}} on which \code{sweep} used to be based;
>   \code{\link{scale}} for centering and scaling.
> }
> \examples{
> require(stats) # for median
> med.att <- apply(attitude, 2, median)
> sweep(data.matrix(attitude), 2, med.att)# subtract the column medians
> 
> a <- array(0, c(2, 3, 4))
> b <- matrix(1:8, c(2, 4))
> sweep(a, c(1, 3), b, "+", give.warning = TRUE)  # no warning:
> all(dim(a)[c(1,3)] == dim(b))
> sweep(a, c(1, 2), b, "+", give.warning = TRUE)  # warning given
> 
> }
> \keyword{array}
> \keyword{iteration}
> 
> 
> 
> 
> --
> Robin Hankin
> Uncertainty Analyst
> National Oceanography Centre, Southampton
> European Way, Southampton SO14 3ZH, UK
>  tel  023-8059-7743
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sweep() and recycling

2005-06-21 Thread Robin Hankin

On Jun 20, 2005, at 04:58 pm, Prof Brian Ripley wrote:

> The issue here is that the equivalent command array(1:5, c(6,6)) (to 
> matrix(1:5,6,6)) gives no warning, and sweep uses array().
>
> I am not sure either should: fractional recycling was normally allowed 
> in S3 (S4 tightened up a bit).
>
> Perhaps someone who thinks sweep() should warn could contribute a 
> tested patch?
>


OK,  modified R code and Rd file below (is this the best way to do 
this?)




"sweep" <-
   function (x, MARGIN, STATS, FUN = "-", give.warning = FALSE, ...)
{
   FUN <- match.fun(FUN)
   dims <- dim(x)
   if(give.warning & length(STATS)>1 & any(dims[MARGIN] != 
dim(as.array(STATS{
 warning("array extents do not recycle exactly")
   }
   perm <- c(MARGIN, (1:length(dims))[-MARGIN])
   FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
}







\name{sweep}
\alias{sweep}
\title{Sweep out Array Summaries}
\description{
   Return an array obtained from an input array by sweeping out a summary
   statistic.
}
\usage{
sweep(x, MARGIN, STATS, FUN="-", give.warning = FALSE, \dots)
}
\arguments{
   \item{x}{an array.}
   \item{MARGIN}{a vector of indices giving the extents of \code{x}
 which correspond to \code{STATS}.}
   \item{STATS}{the summary statistic which is to be swept out.}
   \item{FUN}{the function to be used to carry out the sweep.  In the
 case of binary operators such as \code{"/"} etc., the function name
 must be quoted.}
   \item{give.warning}{Boolean, with default \code{FALSE} meaning to
   give no warning, even if array extents do not match.  If
   \code{TRUE}, check for the correct dimensions and if a
   mismatch is detected, give a suitable warning.}
   \item{\dots}{optional arguments to \code{FUN}.}
}
\value{
   An array with the same shape as \code{x}, but with the summary
   statistics swept out.
}
\note{
   If \code{STATS} is of length 1, recycling is carried out with no
   warning irrespective of the value of \code{give.warning}.
}

\references{
   Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988)
   \emph{The New S Language}.
   Wadsworth \& Brooks/Cole.
}
\seealso{
   \code{\link{apply}} on which \code{sweep} used to be based;
   \code{\link{scale}} for centering and scaling.
}
\examples{
require(stats) # for median
med.att <- apply(attitude, 2, median)
sweep(data.matrix(attitude), 2, med.att)# subtract the column medians

a <- array(0, c(2, 3, 4))
b <- matrix(1:8, c(2, 4))
sweep(a, c(1, 3), b, "+", give.warning = TRUE)  # no warning: 
all(dim(a)[c(1,3)] == dim(b))
sweep(a, c(1, 2), b, "+", give.warning = TRUE)  # warning given

}
\keyword{array}
\keyword{iteration}




--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sweep() and recycling

2005-06-20 Thread Prof Brian Ripley
The issue here is that the equivalent command array(1:5, c(6,6)) (to 
matrix(1:5,6,6)) gives no warning, and sweep uses array().

I am not sure either should: fractional recycling was normally allowed in 
S3 (S4 tightened up a bit).

Perhaps someone who thinks sweep() should warn could contribute a tested 
patch?

On Mon, 20 Jun 2005, Heather Turner wrote:

>
>
> Dr H Turner
> Research Assistant
> Dept. of Statistics
> The University of Warwick
> Coventry
> CV4 7AL
>
> Tel: 024 76575870
>
 Robin Hankin <[EMAIL PROTECTED]> 06/20/05 10:32am >>>
> Hi
>
>
> I had a hard-to-find bug in some of my code the other day, which I
> eventually
> traced to my misusing of sweep().
>
> I would expect sweep() to give
> me a warning if the elements don't recycle nicely, but
>
>  X <- matrix(1:36,6,6)
>  sweep(X,1,1:5,"+")
>  [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]29   16   23   30   32
> [2,]4   11   18   25   27   34
> [3,]6   13   20   22   29   36
> [4,]8   15   17   24   31   38
> [5,]   10   12   19   26   33   40
> [6,]7   14   21   28   35   37
>
> gives no warning, even though (5,36)=1.
>
> Also,
>
>
> sweep(X,1,c(1,1000),"+")
>
> and
>
> sweep(X,2,c(1,1000),"+")
>
>
> behave as expected,   But
>
>
> sweep(X,1,1:5,"+")
>
> and
>
>  sweep(X,2,1:5,"+")
>
>
>
> are identical!
>
> ...which is also expected since the matrix is square. Perhaps the
> following will help you see why:
>> matrix(1:5,6,6)
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]123451
> [2,]234512
> [3,]345123
> [4,]451234
> [5,]512345
> [6,]123451
> Warning message:
> data length [5] is not a sub-multiple or multiple of the number of rows
> [6] in matrix
>> matrix(1:5,6,6, byrow = TRUE)
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]123451
> [2,]234512
> [3,]345123
> [4,]451234
> [5,]512345
> [6,]123451
> Warning message:
> data length [5] is not a sub-multiple or multiple of the number of rows
> [6] in matrix
> i.e. recycling a vector of length 5 down the columns gives the same
> result as recycling the same vector along the rows.
>
> I agree with your main point however, it would be useful if 'sweep'
> would give a warning when the length of the vector of statistics to be
> swept out was not a sub-multiple or multiple of the corresponding array
> dimension.
>
> Heather
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sweep() and recycling

2005-06-20 Thread Robin Hankin
Hi Heather

thanks for this.  (I have messed up quote level somewhere.  Sorry for 
that).


On Jun 20, 2005, at 02:01 pm, Heather Turner wrote:
>
 Robin Hankin <[EMAIL PROTECTED]> 06/20/05 10:32am >>>
>> Hi
>>
>> [snip]
>> I would expect sweep() to give
>> me a warning if the elements don't recycle nicely
>
[snip]
>
>> behave as expected,   But
>>
>>
>> sweep(X,1,1:5,"+")
>>
>> and
>>
>>   sweep(X,2,1:5,"+")
>>
>>
>>
>> are identical!
>>
> ...which is also expected since the matrix is square. Perhaps the
>> following will help you see why:
>

> matrix(1:5,6,6)
>  matrix(1:5,6,6, byrow = TRUE)



Yes,  after thinking about your example for a few minutes, it was a bit 
foolish of me
to use a square matrix to illustrate this problem.
Thinking about it, it is the coincidence that ncol(X) %% n == 
nrow(X)%%n == 1 that
caused the phenomenon I remarked on [here n=length(x)].

I should have known better, as I use this phenomenon deliberately in 
magic square
construction! (although not by using  sweep()).

> I agree with your main point however, it would be useful if 'sweep'
> would give a warning when the length of the vector of statistics to be
> swept out was not a sub-multiple or multiple of the corresponding array
> dimension.


yes, that would be good.   Can anyone think of  a situation in which 
one would want to ignore
such a warning?  Conversely, does anyone deliberately use sweep() with
   non-recycling vector?

>
--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sweep() and recycling

2005-06-20 Thread Heather Turner


Dr H Turner
Research Assistant
Dept. of Statistics
The University of Warwick
Coventry
CV4 7AL

Tel: 024 76575870

>>> Robin Hankin <[EMAIL PROTECTED]> 06/20/05 10:32am >>>
Hi


I had a hard-to-find bug in some of my code the other day, which I 
eventually
traced to my misusing of sweep().

I would expect sweep() to give
me a warning if the elements don't recycle nicely, but

  X <- matrix(1:36,6,6)
  sweep(X,1,1:5,"+")
  [,1] [,2] [,3] [,4] [,5] [,6]
[1,]29   16   23   30   32
[2,]4   11   18   25   27   34
[3,]6   13   20   22   29   36
[4,]8   15   17   24   31   38
[5,]   10   12   19   26   33   40
[6,]7   14   21   28   35   37

gives no warning, even though (5,36)=1.

Also,


sweep(X,1,c(1,1000),"+")

and

sweep(X,2,c(1,1000),"+")


behave as expected,   But


sweep(X,1,1:5,"+")

and

  sweep(X,2,1:5,"+")



are identical!

...which is also expected since the matrix is square. Perhaps the
following will help you see why:
> matrix(1:5,6,6)
 [,1] [,2] [,3] [,4] [,5] [,6]
[1,]123451
[2,]234512
[3,]345123
[4,]451234
[5,]512345
[6,]123451
Warning message:
data length [5] is not a sub-multiple or multiple of the number of rows
[6] in matrix 
> matrix(1:5,6,6, byrow = TRUE)
 [,1] [,2] [,3] [,4] [,5] [,6]
[1,]123451
[2,]234512
[3,]345123
[4,]451234
[5,]512345
[6,]123451
Warning message:
data length [5] is not a sub-multiple or multiple of the number of rows
[6] in matrix 
i.e. recycling a vector of length 5 down the columns gives the same
result as recycling the same vector along the rows.

I agree with your main point however, it would be useful if 'sweep'
would give a warning when the length of the vector of statistics to be
swept out was not a sub-multiple or multiple of the corresponding array
dimension.

Heather

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] sweep() and recycling

2005-06-20 Thread Robin Hankin
Hi


I had a hard-to-find bug in some of my code the other day, which I 
eventually
traced to my misusing of sweep().

I would expect sweep() to give
me a warning if the elements don't recycle nicely, but

  X <- matrix(1:36,6,6)
  sweep(X,1,1:5,"+")
  [,1] [,2] [,3] [,4] [,5] [,6]
[1,]29   16   23   30   32
[2,]4   11   18   25   27   34
[3,]6   13   20   22   29   36
[4,]8   15   17   24   31   38
[5,]   10   12   19   26   33   40
[6,]7   14   21   28   35   37

gives no warning, even though (5,36)=1.

Also,


sweep(X,1,c(1,1000),"+")

and

sweep(X,2,c(1,1000),"+")


behave as expected,   But


sweep(X,1,1:5,"+")

and

  sweep(X,2,1:5,"+")



are identical!


I find this confusing, because I interpret sweep()'s second
argument as the margin along which the third argument is fourth 
argumented
this interpretation breaks down in a non-obvious way if elements
are left over after recycling.

If sweep() had given a warning, I would have found my mistake much 
faster.

Could we  make sweep() issue a warning
under these circumstances or, failing that, add a warning to sweep.Rd 
saying
that no warning is given in cases like this?



--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html